[proofplan]
The key ingredient is the [UI Martingale Convergence Theorem](/theorems/1163), which gives $X_n = \mathbb{E}[X_\infty \mid \mathcal{F}_n]$ a.s. for every $n$. This reduces the two-stopping-time conclusion $\mathbb{E}[X_T \mid \mathcal{F}_S] = X_S$ to proving the single-stopping-time identity $\mathbb{E}[X_\infty \mid \mathcal{F}_T] = X_T$ for an arbitrary stopping time $T$: once this is established, the [tower property of conditional expectation](/theorems/1150) recovers the general case. We verify this identity by checking the two defining [properties of conditional expectation](/theorems/1122) — $\mathcal{F}_T$-measurability and the [integral](/page/Integral)-matching condition — decomposing integrals over the partition $\{T = n\}_{n \geq 0} \cup \{T = \infty\}$ and replacing $X_n$ by $\mathbb{E}[X_\infty \mid \mathcal{F}_n]$ on each piece.
[/proofplan]
[step:Reduce to the single-stopping-time identity $\mathbb{E}[X_\infty \mid \mathcal{F}_T] = X_T$]
Since $X$ is uniformly integrable, the [UI Martingale Convergence Theorem](/theorems/1163) (implication (i) $\Rightarrow$ (iii)) provides a random variable $X_\infty \in L^1(\Omega, \mathcal{F}, \mathbb{P})$ such that $X_n \to X_\infty$ a.s. and
\begin{align*}
X_n = \mathbb{E}[X_\infty \mid \mathcal{F}_n] \quad \text{a.s. for every } n \geq 0.
\end{align*}
We claim that the full result $\mathbb{E}[X_T \mid \mathcal{F}_S] = X_S$ a.s. follows once we establish the identity
\begin{align*}
\mathbb{E}[X_\infty \mid \mathcal{F}_T] = X_T \quad \text{a.s.}
\end{align*}
for every stopping time $T$ (possibly infinite), where we set $X_T = X_\infty$ on $\{T = \infty\}$. Indeed, assuming this identity holds for both $S$ and $T$, and using $\mathcal{F}_S \subset \mathcal{F}_T$ (which holds since $S \leq T$), the [Tower Property of Conditional Expectation](/theorems/1150) gives
\begin{align*}
\mathbb{E}[X_T \mid \mathcal{F}_S] = \mathbb{E}\bigl[\mathbb{E}[X_\infty \mid \mathcal{F}_T] \bigm| \mathcal{F}_S\bigr] = \mathbb{E}[X_\infty \mid \mathcal{F}_S] = X_S \quad \text{a.s.}
\end{align*}
It therefore suffices to prove $\mathbb{E}[X_\infty \mid \mathcal{F}_T] = X_T$ a.s. for an arbitrary stopping time $T$. The remainder of the proof establishes this.
[guided]
The theorem asks us to prove $\mathbb{E}[X_T \mid \mathcal{F}_S] = X_S$ for two stopping times $S \leq T$. Rather than working with both stopping times simultaneously, we reduce to a simpler identity involving a single stopping time.
The [UI Martingale Convergence Theorem](/theorems/1163) (specifically the implication (i) $\Rightarrow$ (iii)) states that a uniformly integrable martingale is closed: there exists $X_\infty \in L^1(\Omega, \mathcal{F}, \mathbb{P})$ with $X_n \to X_\infty$ a.s. and $X_n = \mathbb{E}[X_\infty \mid \mathcal{F}_n]$ a.s. for every $n \geq 0$. The hypothesis that $X$ is uniformly integrable is consumed here: without it, there need not exist any $X_\infty \in L^1$ with $X_n = \mathbb{E}[X_\infty \mid \mathcal{F}_n]$, and the entire reduction fails. (Compare with the [Optional Stopping Theorem](/theorems/1153) for bounded stopping times, where no uniform integrability is needed but the stopping times must be bounded.)
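To see concretely what goes wrong without uniform integrability, here is a standard counterexample, offered as an illustrative sketch rather than as part of the proof. The walk $(W_n)$ and the hitting time $\tau$ are notation introduced only for this example. Let $(W_n)_{n \geq 0}$ be the simple symmetric random walk on $\mathbb{Z}$ started at $W_0 = 0$, with its natural filtration, and let $\tau = \inf\{n \geq 0 : W_n = 1\}$, which is finite a.s. by recurrence. Then $W$ is a martingale, but with $S = 0$ and $T = \tau$,
\begin{align*}
\mathbb{E}[W_\tau \mid \mathcal{F}_0] = 1 \neq 0 = W_0 \quad \text{a.s.},
\end{align*}
so the conclusion of the theorem fails. This is consistent with the statement: $W$ is not uniformly integrable (it does not even converge a.s.), so no closing variable $W_\infty$ is available.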
Why does the single-stopping-time identity suffice? Suppose we know $\mathbb{E}[X_\infty \mid \mathcal{F}_T] = X_T$ a.s. for every stopping time $T$. Apply it once to $T$ (for the first equality below) and once to $S$ (for the last). Since $S \leq T$, the stopped $\sigma$-algebras satisfy $\mathcal{F}_S \subset \mathcal{F}_T$. The [Tower Property](/theorems/1150), which requires the inclusion $\mathcal{F}_S \subset \mathcal{F}_T \subset \mathcal{F}$ and $X_\infty \in L^1$, then yields
\begin{align*}
\mathbb{E}[X_T \mid \mathcal{F}_S] = \mathbb{E}\bigl[\mathbb{E}[X_\infty \mid \mathcal{F}_T] \bigm| \mathcal{F}_S\bigr] = \mathbb{E}[X_\infty \mid \mathcal{F}_S] = X_S \quad \text{a.s.}
\end{align*}
[/guided]
[/step]
[step:Establish $X_T \in L^1$ via the conditional [Jensen inequality](/theorems/515)]
We decompose $|X_T|$ over the partition $\{T = n\}_{n \geq 0} \cup \{T = \infty\}$ of $\Omega$. Since $T$ is a stopping time, $\{T = n\} \in \mathcal{F}_n$ for each $n \geq 0$. By the [Conditional Jensen Inequality](/theorems/1149) (part (iv), applied with the convex [function](/page/Function) $\varphi(x) = |x|$ and $\mathcal{G} = \mathcal{F}_n$), we have $|X_n| = |\mathbb{E}[X_\infty \mid \mathcal{F}_n]| \leq \mathbb{E}[|X_\infty| \mid \mathcal{F}_n]$ a.s., so that for each $n$,
\begin{align*}
\mathbb{E}[|X_n| \mathbb{1}_{\{T = n\}}] \leq \mathbb{E}[\mathbb{E}[|X_\infty| \mid \mathcal{F}_n] \mathbb{1}_{\{T = n\}}] = \mathbb{E}[|X_\infty| \mathbb{1}_{\{T = n\}}],
\end{align*}
where the final equality uses the defining property of conditional expectation: $\{T = n\} \in \mathcal{F}_n$, so $\int_{\{T=n\}} \mathbb{E}[|X_\infty| \mid \mathcal{F}_n] \, d\mathbb{P} = \int_{\{T=n\}} |X_\infty| \, d\mathbb{P}$ by [Basic Properties of Conditional Expectation](/theorems/1148) (the averaging property applied to the $\mathcal{F}_n$-measurable [set](/page/Set) $\{T = n\}$).
Summing over all $n \geq 0$ and adding the contribution from $\{T = \infty\}$:
\begin{align*}
\mathbb{E}[|X_T|] &= \sum_{n=0}^\infty \mathbb{E}[|X_n| \mathbb{1}_{\{T = n\}}] + \mathbb{E}[|X_\infty| \mathbb{1}_{\{T = \infty\}}] \\
&\leq \sum_{n=0}^\infty \mathbb{E}[|X_\infty| \mathbb{1}_{\{T = n\}}] + \mathbb{E}[|X_\infty| \mathbb{1}_{\{T = \infty\}}] \\
&= \mathbb{E}\Bigl[|X_\infty| \sum_{n=0}^\infty \mathbb{1}_{\{T = n\}} + |X_\infty| \mathbb{1}_{\{T = \infty\}}\Bigr] \\
&= \mathbb{E}[|X_\infty|] < \infty,
\end{align*}
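where the third line exchanges the sum and the expectation (justified by monotone convergence, since every term is nonnegative), the final equality uses the fact that $\{T = n\}_{n \geq 0} \cup \{T = \infty\}$ partitions $\Omega$, and the finiteness holds since $X_\infty \in L^1$.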
[guided]
Before we can identify $X_T$ as a version of $\mathbb{E}[X_\infty \mid \mathcal{F}_T]$, or even speak of $\mathbb{E}[X_T \mid \mathcal{F}_S]$, we need $X_T \in L^1$. The natural strategy is to bound $\mathbb{E}[|X_T|]$ by $\mathbb{E}[|X_\infty|]$, which is finite since $X_\infty \in L^1$.
We decompose the expectation over the partition $\{T = n\}_{n \geq 0} \cup \{T = \infty\}$. On the event $\{T = n\}$, we have $X_T = X_n = \mathbb{E}[X_\infty \mid \mathcal{F}_n]$. To compare $|X_n|$ with $|X_\infty|$, we use the [Conditional Jensen Inequality](/theorems/1149) (part (iv)). Taking $\varphi(x) = |x|$, which is convex, and $\mathcal{G} = \mathcal{F}_n$, the conditional Jensen inequality yields
\begin{align*}
|X_n| = |\mathbb{E}[X_\infty \mid \mathcal{F}_n]| \leq \mathbb{E}[|X_\infty| \mid \mathcal{F}_n] \quad \text{a.s.}
\end{align*}
Note that the hypothesis $\varphi(X_\infty) = |X_\infty| \in L^1$ is satisfied since $X_\infty \in L^1$.
Multiplying both sides by $\mathbb{1}_{\{T = n\}}$ (which preserves the inequality since $\mathbb{1}_{\{T = n\}} \geq 0$) and taking expectations:
\begin{align*}
\mathbb{E}[|X_n| \mathbb{1}_{\{T = n\}}] \leq \mathbb{E}[\mathbb{E}[|X_\infty| \mid \mathcal{F}_n] \mathbb{1}_{\{T = n\}}].
\end{align*}
Since $\{T = n\} \in \mathcal{F}_n$ (by the definition of a stopping time), the right-hand side equals $\mathbb{E}[|X_\infty| \mathbb{1}_{\{T = n\}}]$ by the defining property of conditional expectation: for any $\mathcal{F}_n$-measurable set $A$, $\int_A \mathbb{E}[Y \mid \mathcal{F}_n] \, d\mathbb{P} = \int_A Y \, d\mathbb{P}$ (this is the averaging property from [Basic Properties of Conditional Expectation](/theorems/1148), part (i), localised to $A$).
Summing over $n$ and adding the $\{T = \infty\}$ term (which needs no bound since $X_T = X_\infty$ on that event):
\begin{align*}
\mathbb{E}[|X_T|] &= \sum_{n=0}^\infty \mathbb{E}[|X_n| \mathbb{1}_{\{T = n\}}] + \mathbb{E}[|X_\infty| \mathbb{1}_{\{T = \infty\}}] \\
&\leq \sum_{n=0}^\infty \mathbb{E}[|X_\infty| \mathbb{1}_{\{T = n\}}] + \mathbb{E}[|X_\infty| \mathbb{1}_{\{T = \infty\}}] \\
&= \mathbb{E}[|X_\infty|] < \infty.
\end{align*}
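The equality in the last line follows by exchanging the sum and the expectation (legitimate by monotone convergence, since all terms are nonnegative) and noting that $\{T = n\}_{n \geq 0} \cup \{T = \infty\}$ is a partition of $\Omega$, so the indicators sum to $\mathbb{1}_\Omega = 1$.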
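As a concrete illustration of the Jensen bound, consider the following sketch. The setting is an assumption made only for this example: $\Omega = [0,1)$ with Lebesgue measure $\lambda$, $\mathcal{F}_n$ generated by the dyadic intervals $I_{n,k} = [k 2^{-n}, (k+1) 2^{-n})$ for $0 \leq k < 2^n$, and $X_\infty = f$ for some $f \in L^1([0,1))$. Then $X_n = \mathbb{E}[f \mid \mathcal{F}_n]$ is the average of $f$ over the dyadic interval containing the point, and for $\omega \in I_{n,k}$,
\begin{align*}
|X_n(\omega)| = \Bigl| 2^n \int_{I_{n,k}} f \, d\lambda \Bigr| \leq 2^n \int_{I_{n,k}} |f| \, d\lambda = \mathbb{E}[|f| \mid \mathcal{F}_n](\omega).
\end{align*}
In this setting the conditional Jensen inequality with $\varphi(x) = |x|$ reduces to the triangle inequality for integrals.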
[/guided]
[/step]
[step:Verify $\mathcal{F}_T$-measurability of $X_T$]
We must verify that $X_T$ is $\mathcal{F}_T$-measurable. Recall that $\mathcal{F}_T = \{A \in \mathcal{F} : A \cap \{T = n\} \in \mathcal{F}_n \text{ for all } n \geq 0\}$. For any Borel set $C \subset \mathbb{R}$,
\begin{align*}
\{X_T \in C\} \cap \{T = n\} = \{X_n \in C\} \cap \{T = n\} \in \mathcal{F}_n,
\end{align*}
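since $X_n$ is $\mathcal{F}_n$-measurable and $\{T = n\} \in \mathcal{F}_n$. Moreover, $\{X_T \in C\} \in \mathcal{F}$, being the countable union of these sets together with $\{X_\infty \in C\} \cap \{T = \infty\}$. Hence $\{X_T \in C\} \in \mathcal{F}_T$ for every Borel set $C$, that is, $X_T$ is $\mathcal{F}_T$-measurable.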
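For a concrete instance of this computation, take a first-passage time; the level $a \in \mathbb{R}$ and the time $\tau_a$ are hypothetical choices made only for illustration. With $\tau_a = \inf\{n \geq 0 : X_n \geq a\}$ (and $\inf \emptyset = \infty$),
\begin{align*}
\{X_{\tau_a} \in C\} \cap \{\tau_a = n\} = \{X_0 < a, \dots, X_{n-1} < a\} \cap \{X_n \in C \cap [a, \infty)\},
\end{align*}
which is determined by $X_0, \dots, X_n$ alone and therefore lies in $\mathcal{F}_n$.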
[/step]
[step:Verify the integral-matching condition: $\int_B X_T \, d\mathbb{P} = \int_B X_\infty \, d\mathbb{P}$ for all $B \in \mathcal{F}_T$]
Let $B \in \mathcal{F}_T$. We decompose the integral over the partition $\{T = n\}_{n \geq 0} \cup \{T = \infty\}$:
\begin{align*}
\mathbb{E}[X_T \mathbb{1}_B] = \sum_{n=0}^\infty \mathbb{E}[X_n \mathbb{1}_{\{T = n\}} \mathbb{1}_B] + \mathbb{E}[X_\infty \mathbb{1}_{\{T = \infty\}} \mathbb{1}_B].
\end{align*}
(The interchange of sum and expectation is justified since the [series](/page/Series) is dominated by the convergent series $\sum_n \mathbb{E}[|X_\infty| \mathbb{1}_{\{T = n\}}]$, as established in the preceding step.)
For each $n \geq 0$, the set $B \cap \{T = n\}$ belongs to $\mathcal{F}_n$ (by the definition of $\mathcal{F}_T$, since $B \in \mathcal{F}_T$). Using $X_n = \mathbb{E}[X_\infty \mid \mathcal{F}_n]$ and the defining property of conditional expectation on the $\mathcal{F}_n$-measurable set $B \cap \{T = n\}$:
\begin{align*}
\mathbb{E}[X_n \mathbb{1}_{\{T = n\}} \mathbb{1}_B] = \mathbb{E}[\mathbb{E}[X_\infty \mid \mathcal{F}_n] \mathbb{1}_{B \cap \{T = n\}}] = \mathbb{E}[X_\infty \mathbb{1}_{B \cap \{T = n\}}] = \mathbb{E}[X_\infty \mathbb{1}_{\{T = n\}} \mathbb{1}_B].
\end{align*}
Substituting back:
\begin{align*}
\mathbb{E}[X_T \mathbb{1}_B] &= \sum_{n=0}^\infty \mathbb{E}[X_\infty \mathbb{1}_{\{T = n\}} \mathbb{1}_B] + \mathbb{E}[X_\infty \mathbb{1}_{\{T = \infty\}} \mathbb{1}_B] = \mathbb{E}[X_\infty \mathbb{1}_B].
\end{align*}
Since $X_T$ is $\mathcal{F}_T$-measurable, $X_T \in L^1$, and $\mathbb{E}[X_T \mathbb{1}_B] = \mathbb{E}[X_\infty \mathbb{1}_B]$ for every $B \in \mathcal{F}_T$, the uniqueness characterisation of conditional expectation gives
\begin{align*}
X_T = \mathbb{E}[X_\infty \mid \mathcal{F}_T] \quad \text{a.s.}
\end{align*}
[guided]
This is the core of the proof. We need to show that $X_T$ satisfies the two defining properties of $\mathbb{E}[X_\infty \mid \mathcal{F}_T]$:
1. $X_T$ is $\mathcal{F}_T$-measurable (verified in the preceding step).
2. For every $B \in \mathcal{F}_T$, $\int_B X_T \, d\mathbb{P} = \int_B X_\infty \, d\mathbb{P}$.
For property (2), fix $B \in \mathcal{F}_T$. We decompose $\mathbb{E}[X_T \mathbb{1}_B]$ over the partition $\{T = n\}_{n \geq 0} \cup \{T = \infty\}$:
\begin{align*}
\mathbb{E}[X_T \mathbb{1}_B] = \sum_{n=0}^\infty \mathbb{E}[X_n \mathbb{1}_{\{T = n\}} \mathbb{1}_B] + \mathbb{E}[X_\infty \mathbb{1}_{\{T = \infty\}} \mathbb{1}_B].
\end{align*}
To justify the interchange of sum and expectation: the partial sums $\sum_{n=0}^N \mathbb{E}[|X_n| \mathbb{1}_{\{T = n\}} \mathbb{1}_B]$ are bounded by $\sum_{n=0}^N \mathbb{E}[|X_\infty| \mathbb{1}_{\{T = n\}}] \leq \mathbb{E}[|X_\infty|] < \infty$ (by the integrability bound from the previous step), so the series converges absolutely and Fubini–Tonelli permits the interchange.
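One way to spell out the interchange, sketched here via dominated convergence as an alternative to Fubini–Tonelli: the partial sums satisfy
\begin{align*}
\sum_{n=0}^N X_n \mathbb{1}_{\{T = n\}} \mathbb{1}_B = X_T \mathbb{1}_{B \cap \{T \leq N\}}, \qquad \bigl| X_T \mathbb{1}_{B \cap \{T \leq N\}} \bigr| \leq |X_T| \in L^1,
\end{align*}
and converge pointwise to $X_T \mathbb{1}_{B \cap \{T < \infty\}}$ as $N \to \infty$. Dominated convergence then gives
\begin{align*}
\mathbb{E}[X_T \mathbb{1}_{B \cap \{T < \infty\}}] = \lim_{N \to \infty} \sum_{n=0}^N \mathbb{E}[X_n \mathbb{1}_{\{T = n\}} \mathbb{1}_B] = \sum_{n=0}^\infty \mathbb{E}[X_n \mathbb{1}_{\{T = n\}} \mathbb{1}_B],
\end{align*}
and adding $\mathbb{E}[X_\infty \mathbb{1}_{\{T = \infty\}} \mathbb{1}_B]$ to both sides recovers the decomposition.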
The key step is to replace $X_n$ by $X_\infty$ on each piece. Fix $n \geq 0$. Since $B \in \mathcal{F}_T$, we have $B \cap \{T = n\} \in \mathcal{F}_n$ by the definition of $\mathcal{F}_T$. The representation $X_n = \mathbb{E}[X_\infty \mid \mathcal{F}_n]$ and the defining property of conditional expectation — for any $\mathcal{F}_n$-measurable set $A$, $\int_A \mathbb{E}[Y \mid \mathcal{F}_n] \, d\mathbb{P} = \int_A Y \, d\mathbb{P}$ — applied to $A = B \cap \{T = n\}$ and $Y = X_\infty$ give
\begin{align*}
\mathbb{E}[X_n \mathbb{1}_{\{T = n\}} \mathbb{1}_B] = \int_{B \cap \{T = n\}} \mathbb{E}[X_\infty \mid \mathcal{F}_n] \, d\mathbb{P} = \int_{B \cap \{T = n\}} X_\infty \, d\mathbb{P} = \mathbb{E}[X_\infty \mathbb{1}_{\{T = n\}} \mathbb{1}_B].
\end{align*}
This is the moment where the representation $X_n = \mathbb{E}[X_\infty \mid \mathcal{F}_n]$ is essential. Without it, we would have no way to pass from $X_n$ to $X_\infty$ under the integral. And the membership $B \cap \{T = n\} \in \mathcal{F}_n$ is equally critical — it is what entitles us to use the integral-matching property.
Summing over $n$ and combining with the $\{T = \infty\}$ term (where $X_T = X_\infty$ by convention):
\begin{align*}
\mathbb{E}[X_T \mathbb{1}_B] = \sum_{n=0}^\infty \mathbb{E}[X_\infty \mathbb{1}_{\{T = n\}} \mathbb{1}_B] + \mathbb{E}[X_\infty \mathbb{1}_{\{T = \infty\}} \mathbb{1}_B] = \mathbb{E}[X_\infty \mathbb{1}_B].
\end{align*}
Since $X_T$ is $\mathcal{F}_T$-measurable and integrable, and its integral agrees with that of $X_\infty$ over every $B \in \mathcal{F}_T$, the uniqueness of conditional expectation (which identifies $\mathbb{E}[X_\infty \mid \mathcal{F}_T]$ as the a.s.-unique $\mathcal{F}_T$-measurable random variable satisfying this integral condition) gives $X_T = \mathbb{E}[X_\infty \mid \mathcal{F}_T]$ a.s.
[/guided]
[/step]
[step:Combine the reduction and the single-stopping-time identity to conclude]
Having established $\mathbb{E}[X_\infty \mid \mathcal{F}_T] = X_T$ a.s. for every stopping time $T$, the reduction in the first step gives the conclusion: for stopping times $S \leq T$,
\begin{align*}
\mathbb{E}[X_T \mid \mathcal{F}_S] = \mathbb{E}\bigl[\mathbb{E}[X_\infty \mid \mathcal{F}_T] \bigm| \mathcal{F}_S\bigr] = \mathbb{E}[X_\infty \mid \mathcal{F}_S] = X_S \quad \text{a.s.}
\end{align*}
by the [Tower Property](/theorems/1150) (applicable since $\mathcal{F}_S \subset \mathcal{F}_T \subset \mathcal{F}$ and $X_\infty \in L^1$).
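Two special cases serve as a quick sanity check on the statement (a sketch, using only the identities above). Taking $S = 0$ and then expectations gives the familiar "fair game" form, and taking $T \equiv \infty$ (a stopping time, since $\{T = n\} = \emptyset \in \mathcal{F}_n$ for every $n$) recovers the single-stopping-time identity:
\begin{align*}
\mathbb{E}[X_T] &= \mathbb{E}\bigl[\mathbb{E}[X_T \mid \mathcal{F}_0]\bigr] = \mathbb{E}[X_0], \\
\mathbb{E}[X_\infty \mid \mathcal{F}_S] &= X_S \quad \text{a.s.}
\end{align*}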
[/step]