Androma — The Home of Mathematics on the Internet

custom_env admin

[step:Bound each martingale increment by the corresponding bounded difference constant]For $i\in\{1,\dots,n\}$, define the martingale difference $D_i:\Omega\to\mathbb R$ by \begin{align*} D_i(\omega):=M_i(\omega)-M_{i-1}(\omega). \end{align*} We claim that $D_i$ is almost surely contained in an interval of length at most $c_i$ conditional on $\mathcal F_{i-1}$. Fix $i$. Let $\mu_j$ denote the law of $X_j$ on $(\mathcal X_j,\mathcal A_j)$. Let $\nu_i$ denote the product probability measure $\mu_{i+1}\otimes\cdots\otimes\mu_n$ on $\mathcal X_{i+1}\times\cdots\times\mathcal X_n$, with the convention that for $i=n$ this is the one-point probability measure on the empty tail. Independence means that the law of $(X_1,\dots,X_n)$ on the product measurable space is $\mu_1\otimes\cdots\otimes\mu_n$. We will use this product-law identity directly to verify conditional expectations, rather than invoking a regular conditional distribution of the unrevealed coordinates. Define a measurable version $G_i:\mathcal X_1\times\cdots\times\mathcal X_i\to\mathbb R$ of the conditional expectation section as follows. Since $Z\in L^1(\Omega,\mathcal F,\mathbb P)$, the Tonelli theorem applied to $|f|$ under the product law is valid because $|f|$ is nonnegative and measurable. Hence for $(\mu_1\otimes\cdots\otimes\mu_i)$-almost every prefix $(x_1,\dots,x_i)$ the integral of $|f(x_1,\dots,x_i,\cdot)|$ with respect to $\nu_i$ is finite. On this full-measure set define \begin{align*} G_i(x_1,\dots,x_i):=\int_{\mathcal X_{i+1}\times\cdots\times\mathcal X_n} f(x_1,\dots,x_i,y_{i+1},\dots,y_n)\, d\nu_i(y_{i+1},\dots,y_n), \end{align*} and define $G_i$ to be $0$ on the exceptional null set. For $i=n$, this formula reads $G_n=f$ outside a null set. The function $G_i$ is $\mathcal A_1\otimes\cdots\otimes\mathcal A_i$-measurable by the measurable-parameter form of Tonelli theorem applied to the positive and negative parts of the integrable function $f$. We verify the conditional-expectation identity by the defining property. Let $H:\Omega\to\mathbb R$ be a bounded $\mathcal F_i$-measurable random variable. By the Doob-Dynkin factorization lemma for $\sigma(X_1,\dots,X_i)$-measurable real random variables, there is a bounded $\mathcal A_1\otimes\cdots\otimes\mathcal A_i$-measurable function $h:\mathcal X_1\times\cdots\times\mathcal X_i\to\mathbb R$ such that \begin{align*} H=h(X_1,\dots,X_i) \end{align*} almost surely. Since $h$ is bounded and $f$ is integrable under the product law, [Fubini's theorem](/theorems/2961) applies to the integrable function \begin{align*} (x_1,\dots,x_n)\mapsto h(x_1,\dots,x_i)f(x_1,\dots,x_n). \end{align*} Using the product law of $(X_1,\dots,X_n)$, we have \begin{align*} \mathbb E[HZ] = \int_{\mathcal X_1\times\cdots\times\mathcal X_n} h(x_1,\dots,x_i)f(x_1,\dots,x_n)\,d(\mu_1\otimes\cdots\otimes\mu_n)(x_1,\dots,x_n). \end{align*} [Fubini's theorem](/theorems/2961) identifies the last integral as \begin{align*} \int_{\mathcal X_1\times\cdots\times\mathcal X_i} h(x_1,\dots,x_i)G_i(x_1,\dots,x_i)\,d(\mu_1\otimes\cdots\otimes\mu_i)(x_1,\dots,x_i). \end{align*} By the product law of $(X_1,\dots,X_i)$, this equals \begin{align*} \mathbb E[H G_i(X_1,\dots,X_i)]. \end{align*} Thus $G_i(X_1,\dots,X_i)$ is integrable, $\mathcal F_i$-measurable, and satisfies the defining identity for $\mathbb E[Z\mid\mathcal F_i]$. Hence \begin{align*} M_i=G_i(X_1,\dots,X_i) \end{align*} almost surely. By [Fubini's theorem](/theorems/2961) applied to the integrable function $|f|$ under the product law, there is a set $B_i\subset\mathcal X_1\times\cdots\times\mathcal X_{i-1}$ with $(\mu_1\otimes\cdots\otimes\mu_{i-1})(B_i)=1$ such that, for every prefix $x=(x_1,\dots,x_{i-1})\in B_i$, the section $u\mapsto G_i(x_1,\dots,x_{i-1},u)$ is finite for $\mu_i$-almost every $u\in\mathcal X_i$. For such a prefix $x$, define the finite-a.e. one-coordinate section $g_{i,x}:\mathcal X_i\to\mathbb R$ by $g_{i,x}(u):=G_i(x_1,\dots,x_{i-1},u)$ on its full-measure domain. If $u,v$ belong to this full-measure domain, then the bounded differences hypothesis gives, for every tail point outside a $\nu_i$-null set depending on $u$ and $v$, \begin{align*} \left| f(x_1,\dots,x_{i-1},u,y_{i+1},\dots,y_n) - f(x_1,\dots,x_{i-1},v,y_{i+1},\dots,y_n) \right| \le c_i. \end{align*} Integrating this pointwise inequality with respect to the probability measure $\nu_i$ gives \begin{align*} |g_{i,x}(u)-g_{i,x}(v)| \le c_i. \end{align*} Thus the essential range of $g_{i,x}(X_i)$ under $\mu_i$ is contained in an interval of length at most $c_i$. Conditionally on $\mathcal F_{i-1}$, $M_i$ has this essential range and \begin{align*} M_{i-1}=\mathbb E[M_i\mid\mathcal F_{i-1}]. \end{align*} Subtracting $M_{i-1}$ translates the interval, so conditionally on $\mathcal F_{i-1}$, $D_i=M_i-M_{i-1}$ has mean $0$ and is supported, in the essential-range sense, in an interval of length at most $c_i$.[/step]

custom_env admin

[guided]The point of the Doob martingale is that $M_i$ is the best prediction of $Z$ after seeing the first $i$ variables. We must show that revealing $X_i$ can change this prediction by at most $c_i$. Fix $i\in\{1,\dots,n\}$. Let $\mu_j$ be the distribution of $X_j$ on $(\mathcal X_j,\mathcal A_j)$ for each $j$, and let $\nu_i$ be the product measure $\mu_{i+1}\otimes\cdots\otimes\mu_n$ on the unrevealed coordinates, with the empty product interpreted as the one-point probability measure when $i=n$. Because $X_1,\dots,X_n$ are independent, the joint law of $(X_1,\dots,X_n)$ is the product measure $\mu_1\otimes\cdots\otimes\mu_n$. This product-law statement is what allows us to compute conditional expectations by testing against bounded functions of the revealed coordinates; no regular [conditional probability](/page/Conditional%20Probability) is needed. Since the theorem assumes $Z\in L^1(\Omega,\mathcal F,\mathbb P)$, the Tonelli theorem applies to $|f|$ under the product law because $|f|$ is nonnegative and measurable. It implies that for $(\mu_1\otimes\cdots\otimes\mu_i)$-almost every prefix $(x_1,\dots,x_i)$, the tail integral of $|f(x_1,\dots,x_i,\cdot)|$ with respect to $\nu_i$ is finite. On this full-measure set define $G_i:\mathcal X_1\times\cdots\times\mathcal X_i\to\mathbb R$ by \begin{align*} G_i(x_1,\dots,x_i):=\int_{\mathcal X_{i+1}\times\cdots\times\mathcal X_n} f(x_1,\dots,x_i,y_{i+1},\dots,y_n)\, d\nu_i(y_{i+1},\dots,y_n), \end{align*} and set $G_i$ equal to $0$ on the exceptional null set. For $i=n$, this is just the statement that $G_n=f$ outside a null set. The function $G_i$ is measurable with respect to $\mathcal A_1\otimes\cdots\otimes\mathcal A_i$ by the measurable-parameter form of Tonelli theorem, applied separately to the positive and negative parts of the integrable function $f$. We now verify that $G_i(X_1,\dots,X_i)$ is the conditional expectation of $Z$ given $\mathcal F_i$. Take any bounded $\mathcal F_i$-measurable random variable $H:\Omega\to\mathbb R$. Since $\mathcal F_i=\sigma(X_1,\dots,X_i)$, the Doob-Dynkin factorization lemma gives a bounded measurable function $h:\mathcal X_1\times\cdots\times\mathcal X_i\to\mathbb R$ such that \begin{align*} H=h(X_1,\dots,X_i) \end{align*} almost surely. Because $h$ is bounded and $f$ is integrable under the product law, [Fubini's theorem](/theorems/2961) applies to the integrable function $(x_1,\dots,x_n)\mapsto h(x_1,\dots,x_i)f(x_1,\dots,x_n)$. Therefore \begin{align*} \mathbb E[HZ] = \int_{\mathcal X_1\times\cdots\times\mathcal X_n} h(x_1,\dots,x_i)f(x_1,\dots,x_n)\,d(\mu_1\otimes\cdots\otimes\mu_n)(x_1,\dots,x_n). \end{align*} Fubini's theorem turns this into \begin{align*} \int_{\mathcal X_1\times\cdots\times\mathcal X_i} h(x_1,\dots,x_i)G_i(x_1,\dots,x_i)\,d(\mu_1\otimes\cdots\otimes\mu_i)(x_1,\dots,x_i). \end{align*} By the product law of $(X_1,\dots,X_i)$, the last display is \begin{align*} \mathbb E[H G_i(X_1,\dots,X_i)]. \end{align*} This is exactly the defining property of $\mathbb E[Z\mid\mathcal F_i]$, and $G_i(X_1,\dots,X_i)$ is $\mathcal F_i$-measurable by construction. Hence \begin{align*} M_i=G_i(X_1,\dots,X_i) \end{align*} almost surely. Now we must be careful about null sets. [Fubini's theorem](/theorems/2961) applied to the integrable function $|f|$ gives a full-measure set $B_i\subset\mathcal X_1\times\cdots\times\mathcal X_{i-1}$ such that, once $x=(x_1,\dots,x_{i-1})\in B_i$ is fixed, the one-coordinate section $u\mapsto G_i(x_1,\dots,x_{i-1},u)$ is finite for $\mu_i$-almost every $u$. Define $g_{i,x}:\mathcal X_i\to\mathbb R$ by $g_{i,x}(u):=G_i(x_1,\dots,x_{i-1},u)$ on this full-measure domain. Compare two values $u,v$ from that full-measure domain. The bounded differences condition says that changing only the $i$-th coordinate changes $f$ by at most $c_i$. Hence for $\nu_i$-almost every tail point, \begin{align*} \left| f(x_1,\dots,x_{i-1},u,y_{i+1},\dots,y_n) - f(x_1,\dots,x_{i-1},v,y_{i+1},\dots,y_n) \right| \le c_i. \end{align*} Since $\nu_i$ is a probability measure, integrating preserves this bound: \begin{align*} |g_{i,x}(u)-g_{i,x}(v)| \le c_i. \end{align*} Thus, conditional on the information $\mathcal F_{i-1}$, the essential range of $M_i$ under the remaining randomness of $X_i$ is contained in an interval whose length is at most $c_i$. Since \begin{align*} M_{i-1}=\mathbb E[M_i\mid\mathcal F_{i-1}], \end{align*} subtracting $M_{i-1}$ only translates that interval. Therefore $D_i=M_i-M_{i-1}$ has conditional mean zero and, conditional on $\mathcal F_{i-1}$, is supported in the essential-range sense in an interval of length at most $c_i$.[/guided]

custom_env admin

[step:Control the exponential moment of each martingale increment] We use the following conditional form of [Hoeffding's lemma](/theorems/1956). Let $\mathcal G\subset\mathcal F$ be a sub-$\sigma$-algebra. Let $A:\Omega\to\mathbb R$ and $B:\Omega\to\mathbb R$ be $\mathcal G$-measurable real-valued random variables satisfying $A\le B$ almost surely and $B-A\le c$ almost surely. If $Y:\Omega\to\mathbb R$ is an integrable real-valued random variable with $\mathbb E[Y\mid\mathcal G]=0$ and $A\le Y\le B$ almost surely, then for every $\lambda\in\mathbb R$, \begin{align*} \mathbb E[e^{\lambda Y}\mid\mathcal G]\le \exp\left(\frac{\lambda^2c^2}{8}\right). \end{align*} This follows by applying the scalar argument below on each conditioning fiber, with the $\mathcal G$-measurable endpoints $A$ and $B$. If $B=A$, then the support condition forces $Y=A$ conditionally on $\mathcal G$, and $\mathbb E[Y\mid\mathcal G]=0$ forces $A=0$ conditionally; hence $\mathbb E[e^{\lambda Y}\mid\mathcal G]=1$, which gives the estimate. Assume now that $B>A$. Since $A\le Y\le B$ and $\mathbb E[Y\mid\mathcal G]=0$, taking conditional expectations gives \begin{align*} A\le 0\le B \end{align*} almost surely. The graph of $y\mapsto e^{\lambda y}$ lies below the chord joining $(A,e^{\lambda A})$ and $(B,e^{\lambda B})$ on $[A,B]$. Taking conditional expectations and using $\mathbb E[Y\mid\mathcal G]=0$ gives \begin{align*} \mathbb E[e^{\lambda Y}\mid\mathcal G]\le \frac{-A}{B-A}e^{\lambda B}+\frac{B}{B-A}e^{\lambda A}. \end{align*} Writing $p:=-A/(B-A)$ and $h:=\lambda(B-A)$, the right-hand side equals $(1-p)e^{-ph}+p e^{(1-p)h}$. The remaining scalar estimate is exactly [Hoeffding's lemma](/theorems/1956) applied pointwise to the centered two-point random variable taking values $-ph$ and $(1-p)h$ with probabilities $1-p$ and $p$, respectively; its hypotheses hold because this variable has mean $0$ and range length $|h|$. Therefore \begin{align*} (1-p)e^{-ph}+p e^{(1-p)h}\le e^{h^2/8}. \end{align*} Since $|h|=|\lambda|(B-A)\le |\lambda|c$, the displayed conditional bound follows. In the preceding step, the interval containing the conditional essential range of $D_i$ may depend on the value of the prefix $(X_1,\dots,X_{i-1})$. This causes no change in the estimate: for each prefix in the full-measure set $B_i$, apply the deterministic interval argument above to the conditional law of $D_i$ on that fiber. The bound depends only on the interval length, which is at most $c_i$, and hence the resulting inequality is $\mathcal F_{i-1}$-almost sure. Applying this fiberwise estimate to $Y=D_i$, $\mathcal G=\mathcal F_{i-1}$, and $c=c_i$, we obtain \begin{align*} \mathbb E[e^{\lambda D_i}\mid\mathcal F_{i-1}] \le \exp\left(\frac{\lambda^2c_i^2}{8}\right) \end{align*} for every $\lambda\in\mathbb R$ and every $i\in\{1,\dots,n\}$. [/step]

custom_env admin

What brings you to Androma?

Start with a route through the knowledge graph.

Attributions & Verification

Proof

Verification Progress

Contributors

Who Can Verify

Quick Actions

Sign in to Androma

Check your inbox

One last step

Attributions & Verification

Proof

Verification Progress

Contributors

Who Can Verify

Quick Actions

Raw Attribution Data