Attributions & Verification

Track contributions and verify content correctness

Proof

custom_env admin

[guided]We want to show that the residual $X - \mathbb{E}[X \mid \mathcal{G}]$ is uncorrelated with every $\mathcal{G}$-measurable square-integrable random variable, that is, $\mathbb{E}[(X - \mathbb{E}[X \mid \mathcal{G}])\,W] = 0$ for all $W \in L^2(\Omega, \mathcal{G}, \mathbb{P})$. In Hilbert space language this says the error is orthogonal to the entire subspace $L^2(\Omega, \mathcal{G}, \mathbb{P})$, which is the defining property of an orthogonal projection. Intuitively, $\mathbb{E}[X \mid \mathcal{G}]$ has already "extracted" everything in $X$ that is visible through the $\sigma$-algebra $\mathcal{G}$; the leftover residual cannot be detected by any $\mathcal{G}$-measurable probe $W$.

Fix $W \in L^2(\Omega, \mathcal{G}, \mathbb{P})$.

**Step 1: Verify that $XW$ and $\mathbb{E}[X \mid \mathcal{G}] \cdot W$ are integrable.** This is needed before we can manipulate expectations. By the [Cauchy-Schwarz Inequality](/theorems/432) applied in $L^2(\Omega, \mathcal{F}, \mathbb{P})$:
\begin{align*} \mathbb{E}[|XW|] \leq \|X\|_{L^2(\Omega,\mathcal{F},\mathbb{P})}\,\|W\|_{L^2(\Omega,\mathcal{G},\mathbb{P})} < \infty, \end{align*}
so $XW \in L^1(\Omega, \mathcal{F}, \mathbb{P})$. Since $\mathbb{E}[X \mid \mathcal{G}] \in L^2(\Omega, \mathcal{G}, \mathbb{P})$ (established in the previous step), the same Cauchy-Schwarz bound with $\mathbb{E}[X \mid \mathcal{G}]$ in place of $X$ gives $\mathbb{E}[X \mid \mathcal{G}] \cdot W \in L^1(\Omega, \mathcal{G}, \mathbb{P})$.

**Step 2: Reduce to the defining identity of conditional expectation.** By linearity of expectation, which applies because both products are integrable by Step 1:
\begin{align*} \mathbb{E}\!\left[\bigl(X - \mathbb{E}[X \mid \mathcal{G}]\bigr)W\right] = \mathbb{E}[XW] - \mathbb{E}\!\left[\mathbb{E}[X \mid \mathcal{G}]\cdot W\right]. \end{align*}
It therefore suffices to show $\mathbb{E}[\mathbb{E}[X \mid \mathcal{G}] \cdot W] = \mathbb{E}[XW]$.
**Step 3: Apply "taking out what is known" and the tower property.** We invoke two properties, drawn from [Basic Properties of Conditional Expectation](/theorems/1148) and the [Tower Property of Conditional Expectation](/theorems/1150).

- **Taking out what is known**: since $W$ is $\mathcal{G}$-measurable and $XW \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, we have
\begin{align*} \mathbb{E}[XW \mid \mathcal{G}] = W\,\mathbb{E}[X \mid \mathcal{G}] \quad \mathbb{P}\text{-a.s.} \end{align*}
This holds because, from the perspective of $\mathcal{G}$, the value of $W$ is already determined, so it factors out of the conditional expectation.
- **Tower property**: for any $Y \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, $\mathbb{E}[\mathbb{E}[Y \mid \mathcal{G}]] = \mathbb{E}[Y]$.

Chaining these, with the tower property applied to $Y = XW$:
\begin{align*} \mathbb{E}\!\left[\mathbb{E}[X \mid \mathcal{G}]\cdot W\right] = \mathbb{E}\!\left[\mathbb{E}[XW \mid \mathcal{G}]\right] = \mathbb{E}[XW]. \end{align*}

**Conclusion.** Substituting back:
\begin{align*} \mathbb{E}\!\left[\bigl(X - \mathbb{E}[X \mid \mathcal{G}]\bigr)W\right] = \mathbb{E}[XW] - \mathbb{E}[XW] = 0. \end{align*}
This orthogonality is the key structural fact. It says that conditional expectation is not merely a good predictor: it is the unique best predictor, in the sense that its prediction error is invisible to every square-integrable $\mathcal{G}$-measurable function. The Pythagorean expansion in the next step converts this orthogonality into the minimization inequality.[/guided]
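As a sanity check on the orthogonality statement, the sketch below verifies it exactly on a small finite probability space. Everything in it is an illustrative assumption rather than part of the proof: the fair-die space, the partition generating $\mathcal{G}$, the choice $X(\omega) = \omega^2$, the values of $W$, and the helper names `expect` and `cond_exp`. On a finite space, $\mathbb{E}[\,\cdot \mid \mathcal{G}]$ is the probability-weighted average over each block of the partition, and a $\mathcal{G}$-measurable $W$ is any function that is constant on blocks.

```python
from fractions import Fraction

# Hypothetical toy model: a fair six-sided die, Omega = {1, ..., 6}.
OMEGA = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in OMEGA}

# Assumption: G is the sigma-algebra generated by this partition.
BLOCKS = [(1, 2), (3, 4), (5, 6)]


def expect(f):
    """E[f] on the finite space, computed exactly with rationals."""
    return sum(f(w) * P[w] for w in OMEGA)


def cond_exp(f):
    """E[f | G]: the P-weighted average of f over the block containing w."""
    table = {}
    for block in BLOCKS:
        mass = sum(P[w] for w in block)
        avg = sum(f(w) * P[w] for w in block) / mass
        for w in block:
            table[w] = avg
    return lambda w: table[w]


# Illustrative choice of X; any real-valued function on OMEGA would do.
X = lambda w: Fraction(w * w)
EX_G = cond_exp(X)

# A G-measurable W is constant on each block: one arbitrary value per block.
block_values = [Fraction(3), Fraction(-1), Fraction(5, 2)]
W_table = {w: v for block, v in zip(BLOCKS, block_values) for w in block}
W = lambda w: W_table[w]

# Orthogonality: E[(X - E[X|G]) W] should be exactly zero.
residual_inner = expect(lambda w: (X(w) - EX_G(w)) * W(w))

# "Taking out what is known": E[XW | G] = W * E[X | G] pointwise.
XW_G = cond_exp(lambda w: X(w) * W(w))
taking_out = all(XW_G(w) == W(w) * EX_G(w) for w in OMEGA)

print(residual_inner, taking_out)
```

Exact rational arithmetic is used so that the zero is an identity rather than a floating-point approximation. Changing `block_values` to any other per-block constants should leave `residual_inner` at zero, whereas a `W` that varies within a block is not $\mathcal{G}$-measurable and generally breaks the identity.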

custom_env admin

[guided]The strategy is to write $X - Z$ as a sum of two parts: the irreducible error $\varepsilon = X - \mathbb{E}[X \mid \mathcal{G}]$ (the error of the optimal predictor, which no $\mathcal{G}$-measurable estimate can improve) and the correction $\delta = \mathbb{E}[X \mid \mathcal{G}] - Z$ (the gap between the optimal predictor and the chosen estimator $Z$). The orthogonality established in the preceding step means these two parts are perpendicular in $L^2$, so the total squared error splits as the sum of the two squared errors, which is the Pythagorean theorem in the Hilbert space $L^2(\Omega, \mathcal{F}, \mathbb{P})$.

**Setting up the decomposition.** Define:
\begin{align*} \varepsilon &:= X - \mathbb{E}[X \mid \mathcal{G}], \\ \delta &:= \mathbb{E}[X \mid \mathcal{G}] - Z. \end{align*}
We check membership in the relevant $L^2$ spaces. Since $X \in L^2(\Omega, \mathcal{F}, \mathbb{P})$ and $\mathbb{E}[X \mid \mathcal{G}] \in L^2(\Omega, \mathcal{G}, \mathbb{P}) \subseteq L^2(\Omega, \mathcal{F}, \mathbb{P})$ (from the previous step), $\varepsilon$ belongs to $L^2(\Omega, \mathcal{F}, \mathbb{P})$. Since $\mathbb{E}[X \mid \mathcal{G}]$ and $Z$ both lie in $L^2(\Omega, \mathcal{G}, \mathbb{P})$, which is a vector space, $\delta \in L^2(\Omega, \mathcal{G}, \mathbb{P})$. In particular, $\delta$ is $\mathcal{G}$-measurable. By construction, $\varepsilon + \delta = X - Z$.

**Validity of the bilinear expansion.** To expand $\mathbb{E}[(\varepsilon + \delta)^2]$ we need the cross term $\mathbb{E}[\varepsilon\delta]$ to be finite.
By the [Cauchy-Schwarz Inequality](/theorems/432):
\begin{align*} \mathbb{E}[|\varepsilon\delta|] \leq \|\varepsilon\|_{L^2(\Omega,\mathcal{F},\mathbb{P})}\,\|\delta\|_{L^2(\Omega,\mathcal{G},\mathbb{P})} < \infty, \end{align*}
so $\varepsilon\delta \in L^1(\Omega, \mathcal{F}, \mathbb{P})$ and the expansion is justified:
\begin{align*} \mathbb{E}[(X - Z)^2] = \mathbb{E}[(\varepsilon + \delta)^2] = \mathbb{E}[\varepsilon^2] + 2\,\mathbb{E}[\varepsilon\delta] + \mathbb{E}[\delta^2]. \end{align*}

**Killing the cross term.** The cross term vanishes by orthogonality: since $\delta \in L^2(\Omega, \mathcal{G}, \mathbb{P})$, we apply the result of the preceding step with $W := \delta$ to obtain $\mathbb{E}[\varepsilon\delta] = \mathbb{E}[(X - \mathbb{E}[X \mid \mathcal{G}])\,\delta] = 0$. In Hilbert space terms, $\varepsilon$ and $\delta$ are orthogonal elements of $L^2(\Omega, \mathcal{F}, \mathbb{P})$, so the Pythagorean identity holds: $\|\varepsilon + \delta\|_{L^2}^2 = \|\varepsilon\|_{L^2}^2 + \|\delta\|_{L^2}^2$.

**Concluding the inequality.** Since $\mathbb{E}[\delta^2] \geq 0$:
\begin{align*} \mathbb{E}[(X - Z)^2] = \mathbb{E}[\varepsilon^2] + \mathbb{E}[\delta^2] \geq \mathbb{E}[\varepsilon^2] = \mathbb{E}\!\left[(X - \mathbb{E}[X \mid \mathcal{G}])^2\right]. \end{align*}
The mean-square error of any estimator $Z \in L^2(\Omega, \mathcal{G}, \mathbb{P})$ is therefore at least the mean-square error of the conditional expectation. The extra cost of using $Z$ rather than $\mathbb{E}[X \mid \mathcal{G}]$ is precisely $\mathbb{E}[\delta^2] = \mathbb{E}[(\mathbb{E}[X \mid \mathcal{G}] - Z)^2]$, the squared $L^2$-distance between $Z$ and the optimal predictor.[/guided]
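The Pythagorean split can also be checked exactly on a toy example. The sketch below is only an illustration under stated assumptions: a uniform six-point space, a $\sigma$-algebra $\mathcal{G}$ generated by three equal-size blocks, the choice $X(\omega) = \omega^2$, and an arbitrary block-constant competitor `Z`. None of these come from the proof itself, and the helper `expect` is hypothetical.

```python
from fractions import Fraction

# Hypothetical toy model: uniform measure on Omega = {1, ..., 6}.
OMEGA = [1, 2, 3, 4, 5, 6]

# Assumption: G is generated by this partition into equal-size blocks.
BLOCKS = [(1, 2), (3, 4), (5, 6)]


def expect(values):
    """E[f] under the uniform measure; `values` maps each w to f(w)."""
    return sum(values[w] for w in OMEGA) / len(OMEGA)


# Illustrative X, stored as exact rationals.
X = {w: Fraction(w * w) for w in OMEGA}

# E[X | G]: under the uniform measure this is the plain block average.
EX_G = {}
for block in BLOCKS:
    avg = sum(X[w] for w in block) / len(block)
    for w in block:
        EX_G[w] = avg

# An arbitrary G-measurable competitor Z (constant on each block).
Z = {1: Fraction(2), 2: Fraction(2),
     3: Fraction(10), 4: Fraction(10),
     5: Fraction(33), 6: Fraction(33)}

# eps is the irreducible error, delta the correction, as in the proof.
eps = {w: X[w] - EX_G[w] for w in OMEGA}
delta = {w: EX_G[w] - Z[w] for w in OMEGA}

mse_Z = expect({w: (X[w] - Z[w]) ** 2 for w in OMEGA})   # E[(X - Z)^2]
mse_opt = expect({w: eps[w] ** 2 for w in OMEGA})        # E[eps^2]
gap = expect({w: delta[w] ** 2 for w in OMEGA})          # E[delta^2]
cross = expect({w: eps[w] * delta[w] for w in OMEGA})    # E[eps * delta]

print(cross == 0, mse_Z == mse_opt + gap, mse_Z >= mse_opt)
```

Because `delta` is constant on each block and `eps` averages to zero on each block, the cross term cancels block by block, which is the finite-space counterpart of applying the orthogonality step with $W := \delta$.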

custom_env admin

Verification Progress

7 Total Blocks

0 Verified

0% verified

Contributors

admin 7 blocks (0 verified)

Who Can Verify

Areas: Analysis, Probability & Statistics
Subareas: Probability Theory, Functional Analysis

Viktor Miykov Admin

Max Vassiliev Global Reviewer

Horia Neagu Global Reviewer

강현욱 Global Reviewer

Demo Testing Global Reviewer

Archie Pennycook Global Reviewer
