[proofplan]
Only the Conditional MCT is proved directly from its unconditional counterpart, by verifying the defining conditions of conditional expectation; the remaining results are then deduced in sequence. The Conditional MCT follows by applying the (unconditional) [Monotone Convergence Theorem](/theorems/509) to the [sequence](/page/Sequence) $Y_n \mathbb{1}_A$, where $Y_n$ is a version of $\mathbb{E}[X_n \mid \mathcal{G}]$ and $A \in \mathcal{G}$. The Conditional [Fatou Lemma](/theorems/510) is deduced from the Conditional MCT applied to the increasing sequence $\inf_{k \geq n} X_k$. The Conditional DCT follows from two applications of Conditional Fatou, to $X_n + Y$ and to $Y - X_n$. The Conditional [Jensen Inequality](/theorems/515) uses the representation of a convex [function](/page/Function) as the supremum of countably many affine minorants.
[/proofplan]
[step:Prove the Conditional [Monotone Convergence Theorem](/theorems/509)]
**(i)** Let $Y_n$ be a version of $\mathbb{E}[X_n \mid \mathcal{G}]$. Since $0 \leq X_n \leq X_{n+1}$ a.s., the positivity of conditional expectation ([Basic Properties of Conditional Expectation](/theorems/1148), part (iv)) applied to $X_{n+1} - X_n \geq 0$ gives $Y_{n+1} - Y_n = \mathbb{E}[X_{n+1} - X_n \mid \mathcal{G}] \geq 0$ a.s. (using linearity, part (v)). Since $Y_n \geq 0$ a.s. and the sequence is increasing, the [limit](/page/Limit) $Y = \lim_{n \to \infty} Y_n$ exists a.s. in $[0, \infty]$ and is $\mathcal{G}$-measurable as a limit of $\mathcal{G}$-[measurable functions](/page/Measurable%20Functions).
For each $A \in \mathcal{G}$, the sequence $Y_n \mathbb{1}_A$ is non-negative and increasing with limit $Y \mathbb{1}_A$. The (unconditional) [Monotone Convergence Theorem](/theorems/509) gives:
\begin{align*}
\mathbb{E}[Y \mathbb{1}_A] = \lim_{n \to \infty} \mathbb{E}[Y_n \mathbb{1}_A] = \lim_{n \to \infty} \mathbb{E}[X_n \mathbb{1}_A] = \mathbb{E}[X \mathbb{1}_A],
\end{align*}
where the second equality uses the [integral](/page/Integral)-matching condition for each $Y_n$, and the third applies MCT to $X_n \mathbb{1}_A \uparrow X \mathbb{1}_A$. By uniqueness, $Y = \mathbb{E}[X \mid \mathcal{G}]$ a.s.
[guided]
The strategy is to verify the two defining conditions of conditional expectation for the candidate limit $Y$: $\mathcal{G}$-measurability, and integral matching on every $A \in \mathcal{G}$. The key observation is that the monotone convergence theorem is applied to the *unconditional* expectations $\mathbb{E}[Y_n \mathbb{1}_A]$, not to the conditional expectations themselves. This is possible because each $Y_n \mathbb{1}_A$ is a non-negative random variable with $Y_n \mathbb{1}_A \uparrow Y \mathbb{1}_A$.
The monotonicity $Y_n \leq Y_{n+1}$ follows from the positivity of conditional expectation: writing $X_{n+1} - X_n \geq 0$ and applying linearity and positivity,
\begin{align*}
Y_{n+1} - Y_n = \mathbb{E}[X_{n+1} \mid \mathcal{G}] - \mathbb{E}[X_n \mid \mathcal{G}] = \mathbb{E}[X_{n+1} - X_n \mid \mathcal{G}] \geq 0 \quad \text{a.s.}
\end{align*}
Without the positivity property, we would have no reason to expect the conditional expectations to be monotone, and the limit might not exist.
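As a concrete illustration, consider the special case (an assumption made only for this example) in which $\mathcal{G}$ is generated by a countable partition $(B_j)_{j \in \mathbb{N}}$ of $\Omega$ with $\mathbb{P}(B_j) > 0$ for every $j$. Conditional expectation is then an average over each cell, and the Conditional MCT reduces to the unconditional MCT applied cell by cell: on $B_j$,
\begin{align*}
\mathbb{E}[X_n \mid \mathcal{G}] = \frac{\mathbb{E}[X_n \mathbb{1}_{B_j}]}{\mathbb{P}(B_j)} \;\uparrow\; \frac{\mathbb{E}[X \mathbb{1}_{B_j}]}{\mathbb{P}(B_j)} = \mathbb{E}[X \mid \mathcal{G}].
\end{align*}
The general proof follows the same pattern, with the cells $B_j$ replaced by arbitrary sets $A \in \mathcal{G}$ and the explicit quotient replaced by the integral-matching condition.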
[/guided]
[/step]
[step:Deduce the Conditional Fatou Lemma from the Conditional MCT]
**(ii)** Define $Z_n = \inf_{k \geq n} X_k$. Since $X_k \geq 0$ for all $k$, we have $Z_n \geq 0$. The sequence $(Z_n)$ is increasing ($Z_n \leq Z_{n+1}$ since the infimum is taken over a shrinking [set](/page/Set)), and $Z_n \uparrow \liminf_{n \to \infty} X_n$ a.s. By part (i) (Conditional MCT):
\begin{align*}
\mathbb{E}[Z_n \mid \mathcal{G}] \uparrow \mathbb{E}\Bigl[\liminf_{n \to \infty} X_n \;\Big|\; \mathcal{G}\Bigr] \quad \text{a.s.}
\end{align*}
Since $Z_n \leq X_k$ for all $k \geq n$, monotonicity of conditional expectation gives $\mathbb{E}[Z_n \mid \mathcal{G}] \leq \mathbb{E}[X_k \mid \mathcal{G}]$ for all $k \geq n$, hence $\mathbb{E}[Z_n \mid \mathcal{G}] \leq \inf_{k \geq n} \mathbb{E}[X_k \mid \mathcal{G}]$. Passing to the limit:
\begin{align*}
\mathbb{E}\Bigl[\liminf_{n \to \infty} X_n \;\Big|\; \mathcal{G}\Bigr] = \lim_{n \to \infty} \mathbb{E}[Z_n \mid \mathcal{G}] \leq \lim_{n \to \infty} \inf_{k \geq n} \mathbb{E}[X_k \mid \mathcal{G}] = \liminf_{n \to \infty} \mathbb{E}[X_n \mid \mathcal{G}] \quad \text{a.s.}
\end{align*}
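The inequality can be strict. As a minimal example (the event $A$ and the notation $\mathbb{P}(A \mid \mathcal{G}) = \mathbb{E}[\mathbb{1}_A \mid \mathcal{G}]$ are introduced here only for illustration), fix an event $A$ and let $X_n = \mathbb{1}_A$ for even $n$ and $X_n = \mathbb{1}_{A^c}$ for odd $n$. Then $Z_n = \inf_{k \geq n} X_k = 0$ for every $n$, so
\begin{align*}
\mathbb{E}\Bigl[\liminf_{n \to \infty} X_n \;\Big|\; \mathcal{G}\Bigr] = 0 \leq \min\bigl(\mathbb{P}(A \mid \mathcal{G}),\, 1 - \mathbb{P}(A \mid \mathcal{G})\bigr) = \liminf_{n \to \infty} \mathbb{E}[X_n \mid \mathcal{G}] \quad \text{a.s.},
\end{align*}
with strict inequality on the event $\{0 < \mathbb{P}(A \mid \mathcal{G}) < 1\}$.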
[/step]
[step:Derive the Conditional DCT from two applications of Conditional Fatou]
**(iii)** Since $X_n \to X$ a.s. and $|X_n| \leq Y$ a.s. with $Y \in L^1$, both $X_n + Y \geq 0$ and $Y - X_n \geq 0$ a.s. Applying the Conditional Fatou Lemma (part (ii)) to $X_n + Y$:
\begin{align*}
\mathbb{E}[X + Y \mid \mathcal{G}] \leq \liminf_{n \to \infty} \mathbb{E}[X_n + Y \mid \mathcal{G}].
\end{align*}
Linearity of conditional expectation gives $\mathbb{E}[X \mid \mathcal{G}] + \mathbb{E}[Y \mid \mathcal{G}] \leq \liminf_{n \to \infty} \bigl(\mathbb{E}[X_n \mid \mathcal{G}] + \mathbb{E}[Y \mid \mathcal{G}]\bigr)$. Subtracting $\mathbb{E}[Y \mid \mathcal{G}]$ (which is finite a.s. since $Y \in L^1$):
\begin{align*}
\mathbb{E}[X \mid \mathcal{G}] \leq \liminf_{n \to \infty} \mathbb{E}[X_n \mid \mathcal{G}] \quad \text{a.s.}
\end{align*}
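Applying Conditional Fatou to $Y - X_n \geq 0$:
\begin{align*}
\mathbb{E}[Y - X \mid \mathcal{G}] \leq \liminf_{n \to \infty} \mathbb{E}[Y - X_n \mid \mathcal{G}].
\end{align*}
By linearity and the identity $\liminf_{n \to \infty} (-a_n) = -\limsup_{n \to \infty} a_n$, the right-hand side equals $\mathbb{E}[Y \mid \mathcal{G}] - \limsup_{n \to \infty} \mathbb{E}[X_n \mid \mathcal{G}]$. Cancelling the a.s. finite term $\mathbb{E}[Y \mid \mathcal{G}]$ as before gives $-\mathbb{E}[X \mid \mathcal{G}] \leq -\limsup_{n \to \infty} \mathbb{E}[X_n \mid \mathcal{G}]$, i.e., $\limsup_{n \to \infty} \mathbb{E}[X_n \mid \mathcal{G}] \leq \mathbb{E}[X \mid \mathcal{G}]$ a.s. Combining the two bounds: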
\begin{align*}
\mathbb{E}[X_n \mid \mathcal{G}] \to \mathbb{E}[X \mid \mathcal{G}] \quad \text{a.s.}
\end{align*}
[guided]
The structure exactly parallels the standard derivation of DCT from Fatou: one obtains the $\liminf$ bound and the $\limsup$ bound separately, then squeezes. The auxiliary variable $Y$ ensures that we can form non-negative sequences to which Fatou applies.
The subtraction of $\mathbb{E}[Y \mid \mathcal{G}]$ is justified because $Y \in L^1$ implies $\mathbb{E}[Y \mid \mathcal{G}] \in L^1$ (by the $L^1$ contraction property of conditional expectation), so in particular $\mathbb{E}[Y \mid \mathcal{G}]$ is a.s. finite.
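The domination hypothesis cannot be dropped. A standard counterexample, stated under assumptions chosen only for illustration: take $\Omega = (0,1)$ with Lebesgue measure, the trivial $\sigma$-algebra $\mathcal{G} = \{\emptyset, \Omega\}$ (so that conditional expectation is ordinary expectation), and $X_n = n \mathbb{1}_{(0, 1/n)}$. Then $X_n \to 0$ pointwise, but
\begin{align*}
\mathbb{E}[X_n \mid \mathcal{G}] = n \cdot \tfrac{1}{n} = 1 \not\to 0 = \mathbb{E}[0 \mid \mathcal{G}].
\end{align*}
No integrable dominating variable exists here: any $Y$ with $|X_n| \leq Y$ for all $n$ must satisfy $Y(\omega) \geq \sup_n X_n(\omega) \geq 1/\omega - 1$, and the right-hand side is not integrable on $(0,1)$.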
[/guided]
[/step]
[step:Prove the Conditional Jensen Inequality using affine minorants]
**(iv)** A convex function $\varphi : \mathbb{R} \to \mathbb{R}$ admits a representation as the pointwise supremum of its supporting affine functions. Since $\varphi$ is convex and hence continuous on the [interior](/page/Interior) of its domain, the supremum may be taken over a *countable* family: there exist sequences $(a_i)_{i \in \mathbb{N}}$ and $(b_i)_{i \in \mathbb{N}}$ in $\mathbb{R}$ such that:
\begin{align*}
\varphi(x) = \sup_{i \in \mathbb{N}} (a_i x + b_i) \quad \text{for all } x \in \mathbb{R}.
\end{align*}
For each $i$, the inequality $\varphi(X) \geq a_i X + b_i$ holds a.s. Applying the monotonicity and linearity of conditional expectation ([Basic Properties of Conditional Expectation](/theorems/1148), parts (iv) and (v)):
\begin{align*}
\mathbb{E}[\varphi(X) \mid \mathcal{G}] \geq \mathbb{E}[a_i X + b_i \mid \mathcal{G}] = a_i \, \mathbb{E}[X \mid \mathcal{G}] + b_i \quad \text{a.s.}
\end{align*}
Taking the supremum over the *countable* index set $i \in \mathbb{N}$ (which preserves measurability):
\begin{align*}
\mathbb{E}[\varphi(X) \mid \mathcal{G}] \geq \sup_{i \in \mathbb{N}} \bigl(a_i \, \mathbb{E}[X \mid \mathcal{G}] + b_i\bigr) = \varphi\bigl(\mathbb{E}[X \mid \mathcal{G}]\bigr) \quad \text{a.s.}
\end{align*}
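For the $L^p$ contraction, take $\varphi(x) = |x|^p$ (which is convex for $p \geq 1$), take expectations, and apply the averaging property $\mathbb{E}\bigl[\mathbb{E}[Z \mid \mathcal{G}]\bigr] = \mathbb{E}[Z]$ (the integral-matching condition with $A = \Omega$) to $Z = |X|^p$: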
\begin{align*}
\|\mathbb{E}[X \mid \mathcal{G}]\|_p^p = \mathbb{E}\bigl[|\mathbb{E}[X \mid \mathcal{G}]|^p\bigr] \leq \mathbb{E}\bigl[\mathbb{E}[|X|^p \mid \mathcal{G}]\bigr] = \mathbb{E}[|X|^p] = \|X\|_p^p.
\end{align*}
[guided]
Why is countability essential? The inequality $\mathbb{E}[\varphi(X) \mid \mathcal{G}] \geq a_i \, \mathbb{E}[X \mid \mathcal{G}] + b_i$ holds a.s. for each fixed $i$, meaning there is a null set $N_i$ outside of which the inequality holds. If we take the supremum over an uncountable family, the union $\bigcup_i N_i$ need not be a null set. With a countable family, $\mathbb{P}(\bigcup_{i \in \mathbb{N}} N_i) = 0$, so the supremum inequality holds a.s.
The countability of the family $(a_i, b_i)$ follows from the fact that a finite convex function on $\mathbb{R}$ is continuous, and the rationals are dense in $\mathbb{R}$: one may take a supporting line at each rational point. The supremum of these lines is a convex function that lies below $\varphi$ and agrees with $\varphi$ at every rational, hence, by continuity, everywhere.
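For a concrete instance, take $\varphi(x) = x^2$ (an example chosen here for illustration). The tangent line at a rational point $q$ is $x \mapsto 2qx - q^2$, and
\begin{align*}
\sup_{q \in \mathbb{Q}} \bigl(2qx - q^2\bigr) = x^2 - \inf_{q \in \mathbb{Q}} (x - q)^2 = x^2 \quad \text{for all } x \in \mathbb{R},
\end{align*}
since the rationals are dense in $\mathbb{R}$. Enumerating $\mathbb{Q}$ as $(q_i)_{i \in \mathbb{N}}$ gives the countable family $a_i = 2q_i$, $b_i = -q_i^2$.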
For the $L^p$ contraction, the chain uses three properties in turn. Jensen with $\varphi(x) = |x|^p$ gives $|\mathbb{E}[X \mid \mathcal{G}]|^p \leq \mathbb{E}[|X|^p \mid \mathcal{G}]$ a.s. Monotonicity of expectation preserves this inequality when expectations are taken of both sides. The averaging property then identifies the right-hand side, $\mathbb{E}\bigl[\mathbb{E}[|X|^p \mid \mathcal{G}]\bigr] = \mathbb{E}[|X|^p]$, so that $\mathbb{E}[|\mathbb{E}[X \mid \mathcal{G}]|^p] \leq \mathbb{E}[|X|^p]$.
[/guided]
[/step]