Backwards Martingale Convergence Theorem (Theorem #1165)
Theorem
Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, let $(\mathcal{G}_n)_{n \leq 0}$ be a decreasing filtration ($\mathcal{G}_{n-1} \subset \mathcal{G}_n \subset \mathcal{F}$ for $n \leq 0$), and let $X = (X_n)_{n \leq 0}$ be a backwards martingale: each $X_n$ is integrable and $\mathcal{G}_n$-measurable, and $\mathbb{E}[X_{n+1} \mid \mathcal{G}_n] = X_n$ a.s. for all $n \leq -1$. If $X_0 \in L^p(\Omega, \mathcal{F}, \mathbb{P})$ for some $p \in [1, \infty)$, then:
\begin{align*}
X_n \to X_{-\infty} = \mathbb{E}[X_0 \mid \mathcal{G}_{-\infty}] \quad \text{a.s. and in } L^p \text{ as } n \to -\infty,
\end{align*}
where $\mathcal{G}_{-\infty} = \bigcap_{n \leq 0} \mathcal{G}_n$.
Discussion
The backwards martingale convergence theorem asserts that every backwards martingale (a martingale indexed by the non-positive integers with respect to a decreasing filtration) converges both a.s. and in $L^p$, for every $p \in [1, \infty)$ with $X_0 \in L^p$, to $\mathbb{E}[X_0 \mid \mathcal{G}_{-\infty}]$, the conditional expectation of $X_0$ given the tail $\sigma$-algebra $\mathcal{G}_{-\infty} = \bigcap_{n \leq 0} \mathcal{G}_n$. The conclusion is stronger than that of the forward [Almost Sure Convergence Theorem](/theorems/1157): backwards martingales always converge in $L^p$ as well as almost surely, without any additional uniform integrability or $L^p$-boundedness hypothesis.
The reason for this automatic strength is that backwards martingales are always uniformly integrable. Since $X_n = \mathbb{E}[X_0 \mid \mathcal{G}_n]$ for all $n \leq 0$, the family $\{X_n : n \leq 0\}$ consists of conditional expectations of the single integrable random variable $X_0$, and is therefore uniformly integrable by the [Conditional Expectations are UI](/theorems/1161) result. The [UI and $L^1$ Convergence](/theorems/1162) criterion then gives $L^1$ convergence for free. For $p > 1$ the same argument applies to $|X_0|^p$ after a conditional Jensen bound, as in the proof below; alternatively, [Doob's $L^p$ Inequality](/theorems/1159) supplies an $L^p$ dominating function.
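To make the identity $X_n = \mathbb{E}[X_0 \mid \mathcal{G}_n]$ concrete, the following Python sketch builds a toy backwards martingale on an eight-point uniform probability space. The sample values, the block-partition filtration, and the helper `cond_exp` are illustrative assumptions and not part of the theorem; the only fact used is that conditional expectation given a finite partition is the block average.

```python
from fractions import Fraction

def cond_exp(values, block_size):
    """Conditional expectation under the uniform measure on {0, ..., len(values)-1},
    given the sigma-algebra generated by consecutive blocks of `block_size` points."""
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        avg = sum(block, Fraction(0)) / len(block)
        out.extend([avg] * len(block))
    return out

# Terminal variable X_0 (hypothetical values chosen for illustration).
x0 = [Fraction(v) for v in (3, 1, 4, 1, 5, 9, 2, 6)]

# Block sizes 1, 2, 4, 8 generate G_0, G_{-1}, G_{-2}, G_{-3}, each coarser than
# the previous one; for n <= -3 the filtration stays trivial.
block_sizes = [1, 2, 4, 8]
path = [cond_exp(x0, b) for b in block_sizes]  # path[k] plays the role of X_{-k}

# Backwards martingale property: E[X_{-k} | G_{-(k+1)}] = X_{-(k+1)}.
is_martingale = all(
    cond_exp(path[k], block_sizes[k + 1]) == path[k + 1]
    for k in range(len(block_sizes) - 1)
)

# The path is constant from time -3 on, so this is the limit E[X_0 | G_{-infinity}].
limit = path[-1]
```

In this toy case the tail $\sigma$-algebra is trivial, so the limit is the constant $\mathbb{E}[X_0]$.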
The theorem is tied to two landmark results. First, it provides one of the cleanest proofs of the strong law of large numbers for i.i.d. integrable $X_1, X_2, \ldots$: read backwards in $n$, the empirical average $\bar{X}_n = S_n/n$ is a backwards martingale with respect to the exchangeable $\sigma$-algebras $\mathcal{E}_n$ (which decrease as $n$ grows), and its a.s. limit is $\mathbb{E}[X_1 \mid \mathcal{E}]$, the conditional expectation given the exchangeable $\sigma$-algebra $\mathcal{E} = \bigcap_n \mathcal{E}_n$, which equals $\mathbb{E}[X_1]$ a.s. by the Hewitt–Savage zero–one law. Second, the same circle of ideas appears in the probabilistic proof of the [Radon–Nikodym Theorem](/theorems/1167), where the limit of likelihood ratios on increasingly fine partitions constructs the Radon–Nikodym derivative; since refining partitions generate an increasing filtration, that argument is usually run with a forward uniformly integrable martingale rather than a backwards one.
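A sketch of why the running means form a backwards martingale, assuming $X_1, X_2, \ldots$ are i.i.d. and integrable. For simplicity the filtration $\mathcal{G}_{1-n} := \sigma(S_n, S_{n+1}, \ldots)$ is used here in place of the exchangeable $\sigma$-algebras, and the index shift $M_{1-n} := S_n/n$ is a choice made so that the terminal variable is $M_0 = X_1$. By symmetry, $\mathbb{E}[X_j \mid \mathcal{G}_{1-n}]$ is the same for every $1 \leq j \leq n$, so
\begin{align*}
n\, \mathbb{E}[X_1 \mid \mathcal{G}_{1-n}] = \sum_{j=1}^{n} \mathbb{E}[X_j \mid \mathcal{G}_{1-n}] = \mathbb{E}[S_n \mid \mathcal{G}_{1-n}] = S_n, \qquad \text{hence} \qquad M_{1-n} = \frac{S_n}{n} = \mathbb{E}[M_0 \mid \mathcal{G}_{1-n}].
\end{align*}
The tower property then gives $\mathbb{E}[M_{m+1} \mid \mathcal{G}_m] = M_m$ for $m \leq -1$, which is the backwards martingale property in the theorem's indexing.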
Proof
[proofplan]
The proof proceeds in three stages. First, we establish almost sure convergence by reversing time: for each fixed $n \leq 0$, the process $(X_{n+k})_{0 \leq k \leq -n}$ is a forward martingale, so [Doob's Upcrossing Inequality](/theorems/1156) bounds the expected number of upcrossings. The [Convergence Criterion via Upcrossings](/theorems/1155) then delivers a.s. convergence to a [limit](/page/Limit) $X_{-\infty}$. Second, we prove $L^p$ convergence by showing that the family $(|X_n - X_{-\infty}|^p)_{n \leq 0}$ is uniformly integrable, using [Conditional Jensen](/theorems/1149) and the fact that [conditional expectations of a fixed $L^1$ function form a uniformly integrable family](/theorems/1161). The [Uniform Integrability and $L^1$ Convergence theorem](/theorems/1162) then upgrades a.s. convergence to $L^p$ convergence. Third, we identify $X_{-\infty} = \mathbb{E}[X_0 \mid \mathcal{G}_{-\infty}]$ by verifying the defining property of conditional expectation on all [sets](/page/Set) in $\mathcal{G}_{-\infty}$.
[/proofplan]
[step:Reverse time and bound the expected number of upcrossings]
Fix rationals $a < b$. For each $n \leq 0$, define the time-reversed process
\begin{align*}
Y_k := X_{n+k}, \quad k = 0, 1, \ldots, -n,
\end{align*}
with the increasing filtration $\mathcal{H}_k := \mathcal{G}_{n+k}$ for $k = 0, 1, \ldots, -n$. The filtration $(\mathcal{H}_k)$ is indeed increasing: since $(\mathcal{G}_m)_{m \leq 0}$ is decreasing and $n + k \leq n + k + 1$, we have $\mathcal{H}_k = \mathcal{G}_{n+k} \subset \mathcal{G}_{n+k+1} = \mathcal{H}_{k+1}$. The backwards martingale property $\mathbb{E}[X_{m+1} \mid \mathcal{G}_m] = X_m$ for $m \leq -1$ translates to $\mathbb{E}[Y_{k+1} \mid \mathcal{H}_k] = Y_k$ for $0 \leq k \leq -n - 1$, so $(Y_k)_{0 \leq k \leq -n}$ is a forward martingale with respect to $(\mathcal{H}_k)$.
Let $N_{-n}([a,b], X)$ denote the number of upcrossings of $[a,b]$ by the [sequence](/page/Sequence) $X_n, X_{n+1}, \ldots, X_0$, which equals the number of upcrossings by $Y_0, Y_1, \ldots, Y_{-n}$. Since a martingale is a supermartingale, [Doob's Upcrossing Inequality](/theorems/1156) applied to the supermartingale $(Y_k)_{0 \leq k \leq -n}$ gives
\begin{align*}
(b - a)\, \mathbb{E}[N_{-n}([a,b], X)] &\leq \mathbb{E}[(Y_{-n} - a)^-] = \mathbb{E}[(X_0 - a)^-].
\end{align*}
The final equality holds because $Y_{-n} = X_{n + (-n)} = X_0$. Since $(X_0 - a)^- = \max(a - X_0, 0) \leq |X_0| + |a|$ and $X_0 \in L^p \subset L^1$, the right-hand side is finite and independent of $n$.
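As a concrete illustration of the quantity $N_{-n}([a,b], X)$, the following Python sketch counts completed upcrossings of $[a,b]$ along a finite path, that is, passages from a value $\leq a$ to a later value $\geq b$. The function name `count_upcrossings` and the sample paths are hypothetical and chosen only for illustration.

```python
def count_upcrossings(seq, a, b):
    """Number of completed upcrossings of [a, b] by seq: passages from <= a to >= b."""
    if not a < b:
        raise ValueError("need a < b")
    count = 0
    below = False  # True once the path has visited (-inf, a] since the last upcrossing
    for x in seq:
        if not below and x <= a:
            below = True
        elif below and x >= b:
            count += 1
            below = False
    return count

# Hypothetical path X_{-4}, ..., X_0, listed in increasing time order.
path = [0.0, 2.0, 0.5, 3.0, 1.0]
n_up = count_upcrossings(path, 1.0, 2.0)

# Extending the window two steps further into the past (X_{-6}, X_{-5} prepended).
longer = [0.2, 2.5] + path
n_up_longer = count_upcrossings(longer, 1.0, 2.0)
```

In this sample the longer window has more upcrossings than the shorter one; in general, extending the window into the past can never decrease the count.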
[guided]
The central idea is that a backwards martingale, when read in the reverse time direction, becomes a forward martingale. This allows us to import the entire forward convergence theory.
Fix rationals $a < b$. For each $n \leq 0$, we define the time-reversed process
\begin{align*}
Y_k := X_{n+k}, \quad k = 0, 1, \ldots, -n,
\end{align*}
with the filtration $\mathcal{H}_k := \mathcal{G}_{n+k}$ for $k = 0, 1, \ldots, -n$. Why is $(\mathcal{H}_k)$ increasing? Because $(\mathcal{G}_m)_{m \leq 0}$ is a *decreasing* filtration (meaning $\mathcal{G}_{m-1} \subset \mathcal{G}_m$), and the map $k \mapsto n + k$ is increasing, so $\mathcal{H}_k = \mathcal{G}_{n+k} \subset \mathcal{G}_{n+k+1} = \mathcal{H}_{k+1}$.
The backwards martingale property states $\mathbb{E}[X_{m+1} \mid \mathcal{G}_m] = X_m$ a.s. for all $m \leq -1$. Setting $m = n + k$ (which satisfies $m \leq -1$ when $k \leq -n - 1$), this becomes $\mathbb{E}[X_{n+k+1} \mid \mathcal{G}_{n+k}] = X_{n+k}$, i.e., $\mathbb{E}[Y_{k+1} \mid \mathcal{H}_k] = Y_k$ a.s. for $0 \leq k \leq -n - 1$. Thus $(Y_k)_{0 \leq k \leq -n}$ is a forward martingale.
Since every martingale is a supermartingale, we may apply [Doob's Upcrossing Inequality](/theorems/1156). This theorem requires a supermartingale with respect to a filtration on a probability space — all conditions are satisfied by $(Y_k, \mathcal{H}_k)$. The inequality states
\begin{align*}
(b - a)\, \mathbb{E}[N_{-n}([a,b], X)] \leq \mathbb{E}[(Y_{-n} - a)^-],
\end{align*}
where $N_{-n}([a,b], X)$ counts upcrossings of $[a,b]$ by $Y_0, Y_1, \ldots, Y_{-n}$ (equivalently, by $X_n, X_{n+1}, \ldots, X_0$). Now $Y_{-n} = X_{n+(-n)} = X_0$, so the right-hand side is $\mathbb{E}[(X_0 - a)^-]$. Since $(X_0 - a)^- \leq |X_0| + |a|$ and $X_0 \in L^p \subset L^1$, this bound is finite and — crucially — does not depend on $n$.
This uniformity in $n$ is what distinguishes backwards martingales from general forward martingales: for forward martingales, the upcrossing bound involves $\mathbb{E}[(X_n - a)^-]$, which requires a separate $L^1$-boundedness hypothesis. For backwards martingales, the bound always involves the *terminal* value $X_0$, so no additional integrability assumption is needed beyond $X_0 \in L^1$.
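For contrast, here is a standard example (not taken from this page) of a forward martingale for which the bound does grow. Let $S_n = \xi_1 + \cdots + \xi_n$ be the simple symmetric random walk with i.i.d. signs $\xi_i = \pm 1$. Using $(x - a)^- \geq x^- - |a|$ and the symmetry $\mathbb{E}[S_n^-] = \tfrac{1}{2}\mathbb{E}|S_n|$,
\begin{align*}
\mathbb{E}[(S_n - a)^-] \geq \tfrac{1}{2}\, \mathbb{E}|S_n| - |a| \to \infty \quad \text{as } n \to \infty,
\end{align*}
since $\mathbb{E}|S_n|$ grows on the order of $\sqrt{n}$. The upcrossing bound is then uninformative, consistent with the fact that the walk does not converge.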
[/guided]
[/step]
[step:Deduce almost sure convergence via the upcrossing criterion]
As $n \to -\infty$, the upcrossing counts $N_{-n}([a,b], X)$ increase to $N([a,b], X)$, the total number of upcrossings of $[a,b]$ by the full sequence $(X_n)_{n \leq 0}$. Applying the [Monotone Convergence Theorem](/theorems/509) to the non-negative increasing sequence $N_{-n}([a,b], X) \uparrow N([a,b], X)$:
\begin{align*}
(b - a)\, \mathbb{E}[N([a,b], X)] = \lim_{n \to -\infty} (b-a)\, \mathbb{E}[N_{-n}([a,b], X)] \leq \mathbb{E}[(X_0 - a)^-] < \infty.
\end{align*}
Therefore $N([a,b], X) < \infty$ a.s. for each pair of rationals $a < b$. The set $\{(a,b) \in \mathbb{Q}^2 : a < b\}$ is countable, so a countable intersection gives
\begin{align*}
\mathbb{P}\Bigl(N([a,b], X) < \infty \text{ for all } a < b \in \mathbb{Q}\Bigr) = 1.
\end{align*}
By the [Convergence Criterion via Upcrossings](/theorems/1155), a real sequence converges in $\overline{\mathbb{R}}$ if and only if it has finitely many upcrossings of every rational interval. Therefore $X_n(\omega)$ converges in $\overline{\mathbb{R}}$ for a.e. $\omega$. Denote the limit by $X_{-\infty}$.
To verify that $X_{-\infty}$ is finite a.s. and belongs to $L^p$: since $(X_n)_{n \leq 0}$ is a backwards martingale, the [Tower Property](/theorems/1150) gives $X_n = \mathbb{E}[X_0 \mid \mathcal{G}_n]$ for all $n \leq 0$. Applying [Conditional Jensen (part (iv) of theorem 1149)](/theorems/1149) with $\varphi(t) = |t|^p$ (convex for $p \geq 1$, non-negative):
\begin{align*}
|X_n|^p = |\mathbb{E}[X_0 \mid \mathcal{G}_n]|^p \leq \mathbb{E}[|X_0|^p \mid \mathcal{G}_n] \quad \text{a.s.}
\end{align*}
Taking expectations and using the [averaging property (theorem 1148, part (i))](/theorems/1148):
\begin{align*}
\mathbb{E}[|X_n|^p] \leq \mathbb{E}\bigl[\mathbb{E}[|X_0|^p \mid \mathcal{G}_n]\bigr] = \mathbb{E}[|X_0|^p] < \infty.
\end{align*}
Applying [Fatou's Lemma](/theorems/510) to the non-negative [functions](/page/Function) $|X_n|^p$ on $(\Omega, \mathcal{F}, \mathbb{P})$:
\begin{align*}
\mathbb{E}[|X_{-\infty}|^p] = \mathbb{E}\Bigl[\liminf_{n \to -\infty} |X_n|^p\Bigr] \leq \liminf_{n \to -\infty} \mathbb{E}[|X_n|^p] \leq \mathbb{E}[|X_0|^p] < \infty.
\end{align*}
In particular, $|X_{-\infty}| < \infty$ a.s. and $X_{-\infty} \in L^p(\Omega, \mathcal{F}, \mathbb{P})$.
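The almost sure convergence can be observed numerically in the strong-law setting mentioned in the Discussion, where the running means $S_n/n$ of i.i.d. integrable variables form a backwards martingale when read backwards in $n$. The following Python sketch is only an illustration under assumed choices (Uniform$(0,1)$ samples, a fixed seed, 20000 steps, and the helper name `running_means`); a simulation of one path is of course not a proof.

```python
import random

def running_means(samples):
    """Return [S_1/1, S_2/2, ..., S_n/n] for the given samples."""
    means, total = [], 0.0
    for n, x in enumerate(samples, start=1):
        total += x
        means.append(total / n)
    return means

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [rng.random() for _ in range(20000)]  # i.i.d. Uniform(0, 1), mean 1/2
means = running_means(samples)

# Larger n corresponds to going further back in the backwards time index.
# tail_gap is the largest deviation from the mean 1/2 over the second half of the path.
tail_gap = max(abs(m - 0.5) for m in means[10000:])
```

A small value of `tail_gap` reflects the path settling near $\mathbb{E}[X_1] = 1/2$, as the theorem predicts for this example.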
[guided]
We now pass from bounded upcrossings to convergence. As $n \to -\infty$ (i.e., as the backwards time index goes further into the past), the upcrossing count $N_{-n}([a,b], X)$ — which counts upcrossings of $[a,b]$ by $X_n, X_{n+1}, \ldots, X_0$ — increases monotonically, since a longer sequence can only have more upcrossings. The limit $N([a,b], X) = \lim_{n \to -\infty} N_{-n}([a,b], X)$ is the total number of upcrossings over the entire backwards trajectory.
Since $N_{-n}([a,b], X) \geq 0$ and $N_{-n} \uparrow N$, the [Monotone Convergence Theorem](/theorems/509) (applied to the non-negative, increasing sequence of random variables $N_{-n}([a,b], X)$) gives
\begin{align*}
(b - a)\, \mathbb{E}[N([a,b], X)] &= \lim_{n \to -\infty} (b - a)\, \mathbb{E}[N_{-n}([a,b], X)] \\
&\leq \mathbb{E}[(X_0 - a)^-] < \infty.
\end{align*}
Since $\mathbb{E}[N([a,b], X)] < \infty$, we conclude $N([a,b], X) < \infty$ a.s. for each fixed rational pair $a < b$.
There are countably many rational pairs $(a,b)$ with $a < b$. For each such pair, let $\Omega_{a,b} = \{N([a,b], X) < \infty\}$, which satisfies $\mathbb{P}(\Omega_{a,b}) = 1$ by the preceding argument. Applying [countable subadditivity](/theorems/1108) of probability to the null complements $\Omega_{a,b}^c$ shows that their countable union is null, so the countable intersection has full measure:
\begin{align*}
\mathbb{P}\Bigl(\bigcap_{a < b \in \mathbb{Q}} \Omega_{a,b}\Bigr) = 1.
\end{align*}
On this full-measure event, the sequence $(X_n(\omega))_{n \leq 0}$ has finitely many upcrossings of every rational interval $[a,b]$. By the [Convergence Criterion via Upcrossings](/theorems/1155), this is equivalent to the convergence of $X_n(\omega)$ in $\overline{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$. Denote the a.s. limit by $X_{-\infty}$.
It remains to show that this limit is finite a.s. and belongs to $L^p$. The backwards martingale property, applied repeatedly via the [Tower Property](/theorems/1150), gives $X_n = \mathbb{E}[X_0 \mid \mathcal{G}_n]$ for all $n \leq 0$. (To see this: for $n = -1$, the definition gives $X_{-1} = \mathbb{E}[X_0 \mid \mathcal{G}_{-1}]$. For $n = -2$, $X_{-2} = \mathbb{E}[X_{-1} \mid \mathcal{G}_{-2}] = \mathbb{E}[\mathbb{E}[X_0 \mid \mathcal{G}_{-1}] \mid \mathcal{G}_{-2}] = \mathbb{E}[X_0 \mid \mathcal{G}_{-2}]$ by the Tower Property, since $\mathcal{G}_{-2} \subset \mathcal{G}_{-1}$. Induction extends this to all $n \leq 0$.)
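The induction step can be written out in general. Assuming $X_n = \mathbb{E}[X_0 \mid \mathcal{G}_n]$ for some $n \leq 0$, the martingale property at time $n - 1$ and the Tower Property (valid because $\mathcal{G}_{n-1} \subset \mathcal{G}_n$) give
\begin{align*}
X_{n-1} = \mathbb{E}[X_n \mid \mathcal{G}_{n-1}] = \mathbb{E}\bigl[\mathbb{E}[X_0 \mid \mathcal{G}_n] \mid \mathcal{G}_{n-1}\bigr] = \mathbb{E}[X_0 \mid \mathcal{G}_{n-1}] \quad \text{a.s.}
\end{align*}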
[Conditional Jensen (part (iv) of theorem 1149)](/theorems/1149) states that for a convex function $\varphi$ with $\varphi(X) \in L^1$ or $\varphi \geq 0$, we have $\varphi(\mathbb{E}[X \mid \mathcal{G}]) \leq \mathbb{E}[\varphi(X) \mid \mathcal{G}]$ a.s. Applying this with $\varphi(t) = |t|^p$ (which is convex for $p \geq 1$ and satisfies $\varphi \geq 0$), $X = X_0$, and $\mathcal{G} = \mathcal{G}_n$:
\begin{align*}
|X_n|^p = |\mathbb{E}[X_0 \mid \mathcal{G}_n]|^p \leq \mathbb{E}[|X_0|^p \mid \mathcal{G}_n] \quad \text{a.s.}
\end{align*}
Taking expectations and using the [averaging property](/theorems/1148) ($\mathbb{E}[\mathbb{E}[Y \mid \mathcal{G}]] = \mathbb{E}[Y]$):
\begin{align*}
\mathbb{E}[|X_n|^p] \leq \mathbb{E}[|X_0|^p] < \infty \quad \text{for all } n \leq 0.
\end{align*}
This is the uniform $L^p$ bound. [Fatou's Lemma](/theorems/510), applied to the non-negative [measurable functions](/page/Measurable%20Functions) $|X_n|^p$ on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, gives
\begin{align*}
\mathbb{E}[|X_{-\infty}|^p] = \mathbb{E}\Bigl[\liminf_{n \to -\infty} |X_n|^p\Bigr] \leq \liminf_{n \to -\infty} \mathbb{E}[|X_n|^p] \leq \mathbb{E}[|X_0|^p] < \infty.
\end{align*}
Since $\mathbb{E}[|X_{-\infty}|^p] < \infty$, we have $|X_{-\infty}| < \infty$ a.s. and $X_{-\infty} \in L^p(\Omega, \mathcal{F}, \mathbb{P})$.
[/guided]
[/step]
[step:Establish $L^p$ convergence via uniform integrability]
We show that $\mathbb{E}[|X_n - X_{-\infty}|^p] \to 0$ as $n \to -\infty$. Since $X_n \to X_{-\infty}$ a.s. has already been established, by the [Uniform Integrability and $L^1$ Convergence theorem](/theorems/1162) it suffices to show that the family $(|X_n - X_{-\infty}|^p)_{n \leq 0}$ is uniformly integrable.
The convexity inequality $(s + t)^p \leq 2^{p-1}(s^p + t^p)$ for $s, t \geq 0$ and $p \geq 1$ gives
\begin{align*}
|X_n - X_{-\infty}|^p \leq 2^{p-1}(|X_n|^p + |X_{-\infty}|^p).
\end{align*}
By [Conditional Jensen (part (iv) of theorem 1149)](/theorems/1149) applied with $\varphi(t) = |t|^p$:
\begin{align*}
|X_n|^p = |\mathbb{E}[X_0 \mid \mathcal{G}_n]|^p \leq \mathbb{E}[|X_0|^p \mid \mathcal{G}_n] \quad \text{a.s.}
\end{align*}
Therefore
\begin{align*}
|X_n - X_{-\infty}|^p \leq 2^{p-1}\bigl(\mathbb{E}[|X_0|^p \mid \mathcal{G}_n] + |X_{-\infty}|^p\bigr).
\end{align*}
The term $|X_{-\infty}|^p$ is a fixed $L^1$ random variable, hence the singleton family $\{|X_{-\infty}|^p\}$ is uniformly integrable. For the first term: since $|X_0|^p \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, the family $\{\mathbb{E}[|X_0|^p \mid \mathcal{G}] : \mathcal{G} \subset \mathcal{F}\}$ is uniformly integrable by [theorem 1161 (Conditional Expectations are Uniformly Integrable)](/theorems/1161). In particular, the subfamily $\{\mathbb{E}[|X_0|^p \mid \mathcal{G}_n] : n \leq 0\}$ is uniformly integrable.
A family dominated by a uniformly integrable family is itself uniformly integrable: if $0 \leq Z_n \leq W_n$ a.s. and $(W_n)$ is UI, then for any $\alpha > 0$,
\begin{align*}
\mathbb{E}[Z_n \mathbb{1}_{Z_n > \alpha}] \leq \mathbb{E}[W_n \mathbb{1}_{W_n > \alpha}] \to 0 \quad \text{as } \alpha \to \infty,
\end{align*}
uniformly in $n$. Since $|X_n - X_{-\infty}|^p$ is dominated by a constant multiple of a sum of two uniformly integrable families, it is itself uniformly integrable.
Since $X_n \to X_{-\infty}$ a.s. and $(|X_n - X_{-\infty}|^p)_{n \leq 0}$ is uniformly integrable, the [Uniform Integrability and $L^1$ Convergence theorem](/theorems/1162) gives $|X_n - X_{-\infty}|^p \to 0$ in $L^1$, which is equivalent to $\mathbb{E}[|X_n - X_{-\infty}|^p] \to 0$, i.e., $X_n \to X_{-\infty}$ in $L^p$.
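For completeness, the convexity inequality used in this step follows in one line from the convexity of $u \mapsto u^p$ on $[0, \infty)$ for $p \geq 1$, applied to the midpoint of $s, t \geq 0$:
\begin{align*}
\Bigl(\frac{s + t}{2}\Bigr)^p \leq \frac{s^p + t^p}{2} \quad \Longrightarrow \quad (s + t)^p \leq 2^{p-1}(s^p + t^p).
\end{align*}
Taking $s = |X_n|$ and $t = |X_{-\infty}|$, together with the triangle inequality $|X_n - X_{-\infty}| \leq |X_n| + |X_{-\infty}|$, gives the bound on $|X_n - X_{-\infty}|^p$.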
[guided]
We have a.s. convergence and need to upgrade it to $L^p$. The standard tool for this upgrade is the [Uniform Integrability and $L^1$ Convergence theorem](/theorems/1162), which states: if $Y_n \to Y$ a.s. and $(Y_n)$ is UI, then $Y_n \to Y$ in $L^1$. We apply this to $Y_n := |X_n - X_{-\infty}|^p$, which converges to $0$ a.s. (since $X_n \to X_{-\infty}$ a.s. and $t \mapsto |t|^p$ is continuous). Thus $L^p$ convergence of $X_n$ reduces to verifying that $(|X_n - X_{-\infty}|^p)_{n \leq 0}$ is uniformly integrable.
To establish UI, we bound the family by a known UI family. By the convexity of $t \mapsto t^p$ for $p \geq 1$ and the inequality $(s+t)^p \leq 2^{p-1}(s^p + t^p)$ for $s,t \geq 0$:
\begin{align*}
|X_n - X_{-\infty}|^p \leq 2^{p-1}(|X_n|^p + |X_{-\infty}|^p).
\end{align*}
Now $X_n = \mathbb{E}[X_0 \mid \mathcal{G}_n]$, so [Conditional Jensen (part (iv) of theorem 1149)](/theorems/1149) with the convex function $\varphi(t) = |t|^p \geq 0$ yields $|X_n|^p \leq \mathbb{E}[|X_0|^p \mid \mathcal{G}_n]$ a.s. Therefore
\begin{align*}
|X_n - X_{-\infty}|^p \leq 2^{p-1}\bigl(\mathbb{E}[|X_0|^p \mid \mathcal{G}_n] + |X_{-\infty}|^p\bigr).
\end{align*}
Why is the right-hand side uniformly integrable? The term $|X_{-\infty}|^p$ is a single $L^1$ function (as shown in the previous step), so the constant family $\{|X_{-\infty}|^p\}$ is UI. For the conditional expectation term, we invoke [theorem 1161](/theorems/1161), which states: if $Z \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, then the family $\{\mathbb{E}[Z \mid \mathcal{G}] : \mathcal{G} \subset \mathcal{F} \text{ sub-}\sigma\text{-algebra}\}$ is uniformly integrable. Applying this with $Z = |X_0|^p \in L^1$, we conclude that $\{\mathbb{E}[|X_0|^p \mid \mathcal{G}_n] : n \leq 0\}$ is UI.
A finite sum of UI families, scaled by constants, is again UI. Moreover, if $0 \leq Z_n \leq W_n$ a.s. and $(W_n)$ is UI, then $(Z_n)$ is UI (since $\{Z_n > \alpha\} \subset \{W_n > \alpha\}$ and $Z_n \mathbb{1}_{Z_n > \alpha} \leq W_n \mathbb{1}_{W_n > \alpha}$). Combining these two facts, $(|X_n - X_{-\infty}|^p)_{n \leq 0}$ is UI.
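The first fact can be checked directly; the following is a sketch for two non-negative families $(U_n)$ and $(V_n)$ (names chosen here for illustration). On the event $\{U_n + V_n > 2\alpha\}$ at least one of $U_n, V_n$ exceeds $\alpha$, and $U_n + V_n \leq 2\max(U_n, V_n)$, so
\begin{align*}
(U_n + V_n)\, \mathbb{1}_{U_n + V_n > 2\alpha} \leq 2\, U_n \mathbb{1}_{U_n > \alpha} + 2\, V_n \mathbb{1}_{V_n > \alpha}.
\end{align*}
Taking expectations and then the supremum over $n$, the right-hand side tends to $0$ as $\alpha \to \infty$ when both families are UI.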
We can now apply the [Uniform Integrability and $L^1$ Convergence theorem](/theorems/1162). The hypotheses are: (i) $|X_n - X_{-\infty}|^p \to 0$ a.s. (from a.s. convergence and [continuity](/page/Continuity) of $|\cdot|^p$), and (ii) $(|X_n - X_{-\infty}|^p)$ is UI (just established). The conclusion is $|X_n - X_{-\infty}|^p \to 0$ in $L^1$, which means $\mathbb{E}[|X_n - X_{-\infty}|^p] \to 0$. This is precisely $L^p$ convergence: $X_n \to X_{-\infty}$ in $L^p$.
[/guided]
[/step]
[step:Identify the limit as $\mathbb{E}[X_0 \mid \mathcal{G}_{-\infty}]$]
It remains to show that $X_{-\infty} = \mathbb{E}[X_0 \mid \mathcal{G}_{-\infty}]$ a.s., where $\mathcal{G}_{-\infty} = \bigcap_{n \leq 0} \mathcal{G}_n$. By the [uniqueness of conditional expectation (theorem 1147)](/theorems/1147), it suffices to verify:
1. $X_{-\infty}$ is $\mathcal{G}_{-\infty}$-measurable, and
2. $\mathbb{E}[X_{-\infty} \mathbb{1}_A] = \mathbb{E}[X_0 \mathbb{1}_A]$ for all $A \in \mathcal{G}_{-\infty}$.
**Measurability.** For each fixed $N \leq 0$ and each $n \leq N$, we have $\mathcal{G}_n \subset \mathcal{G}_N$ (since the filtration is decreasing), so $X_n$ is $\mathcal{G}_N$-measurable. Therefore $\sup_{n \leq N} X_n$ is $\mathcal{G}_N$-measurable as a countable supremum of $\mathcal{G}_N$-measurable functions. Then
\begin{align*}
\limsup_{n \to -\infty} X_n = \inf_{N \leq 0} \sup_{n \leq N} X_n
\end{align*}
is measurable with respect to $\bigcap_{M \leq 0} \mathcal{G}_M = \mathcal{G}_{-\infty}$. Indeed, $\sup_{n \leq N} X_n$ is non-increasing as $N$ decreases, so for every $M \leq 0$ the infimum may be restricted to $N \leq M$; each such $\sup_{n \leq N} X_n$ is $\mathcal{G}_N$-measurable and hence $\mathcal{G}_M$-measurable, so the infimum is $\mathcal{G}_M$-measurable for every $M \leq 0$. Since $X_n \to X_{-\infty}$ a.s., we have $\limsup_{n \to -\infty} X_n = X_{-\infty}$ a.s. After redefining $X_{-\infty}$ on a $\mathbb{P}$-null set if necessary, we may take $X_{-\infty}$ to be $\mathcal{G}_{-\infty}$-measurable.
**[Integral](/page/Integral) identity.** Fix $A \in \mathcal{G}_{-\infty}$. Since $\mathcal{G}_{-\infty} = \bigcap_{n \leq 0} \mathcal{G}_n$, we have $A \in \mathcal{G}_n$ for every $n \leq 0$. The defining property of conditional expectation, applied to $X_n = \mathbb{E}[X_0 \mid \mathcal{G}_n]$ and the set $A \in \mathcal{G}_n$, gives
\begin{align*}
\mathbb{E}[X_n \mathbb{1}_A] = \mathbb{E}[X_0 \mathbb{1}_A] \quad \text{for all } n \leq 0.
\end{align*}
We pass to the limit $n \to -\infty$ on the left-hand side. Since $X_n \to X_{-\infty}$ in $L^p \subset L^1$ (as $p \geq 1$ and the underlying space is a probability space) and $\mathbb{1}_A \in L^\infty$ with $\|\mathbb{1}_A\|_\infty \leq 1$, Hölder's inequality with exponents $1$ and $\infty$ gives
\begin{align*}
|\mathbb{E}[X_n \mathbb{1}_A] - \mathbb{E}[X_{-\infty} \mathbb{1}_A]| = |\mathbb{E}[(X_n - X_{-\infty}) \mathbb{1}_A]| \leq \mathbb{E}[|X_n - X_{-\infty}|] \to 0.
\end{align*}
Therefore $\mathbb{E}[X_{-\infty} \mathbb{1}_A] = \mathbb{E}[X_0 \mathbb{1}_A]$ for all $A \in \mathcal{G}_{-\infty}$. Combined with $\mathcal{G}_{-\infty}$-measurability of $X_{-\infty}$, this establishes $X_{-\infty} = \mathbb{E}[X_0 \mid \mathcal{G}_{-\infty}]$ a.s. by [uniqueness of conditional expectation](/theorems/1147).
[guided]
The final step is to identify the a.s. and $L^p$ limit $X_{-\infty}$ as the conditional expectation $\mathbb{E}[X_0 \mid \mathcal{G}_{-\infty}]$. Conditional expectation is characterised uniquely (a.s.) by two properties: measurability with respect to the conditioning $\sigma$-algebra, and the integral identity on all sets in that $\sigma$-algebra. We verify each in turn.
**Measurability.** We need $X_{-\infty}$ to be $\mathcal{G}_{-\infty}$-measurable, where $\mathcal{G}_{-\infty} = \bigcap_{n \leq 0} \mathcal{G}_n$. The subtlety is that each $X_n$ is $\mathcal{G}_n$-measurable, but the $\sigma$-algebras $\mathcal{G}_n$ are *different* (and decreasing), so the pointwise limit is not automatically $\mathcal{G}_{-\infty}$-measurable from the definition alone.
The resolution uses lattice operations and the monotonicity of the filtration. Fix $N \leq 0$. For any $n \leq N$, we have $\mathcal{G}_n \subset \mathcal{G}_N$ (since the filtration is decreasing), so $X_n$ is $\mathcal{G}_N$-measurable. Therefore $\sup_{n \leq N} X_n$ is $\mathcal{G}_N$-measurable (as a countable supremum of $\mathcal{G}_N$-measurable functions). Now
\begin{align*}
\limsup_{n \to -\infty} X_n = \inf_{N \leq 0} \sup_{n \leq N} X_n.
\end{align*}
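For each $N \leq 0$, $\sup_{n \leq N} X_n$ is $\mathcal{G}_N$-measurable, but on its own this only makes the infimum $\mathcal{G}_0$-measurable. The extra ingredient is monotonicity: $\sup_{n \leq N} X_n$ is non-increasing as $N$ decreases, so for every $M \leq 0$ the infimum over $N \leq 0$ equals the infimum over $N \leq M$. Hence, for any $c \in \mathbb{R}$ and any $M \leq 0$,
\begin{align*}
\Bigl\{\inf_{N \leq 0} \sup_{n \leq N} X_n \geq c\Bigr\} = \bigcap_{N \leq M} \Bigl\{\sup_{n \leq N} X_n \geq c\Bigr\} \in \mathcal{G}_M,
\end{align*}
because $\mathcal{G}_N \subset \mathcal{G}_M$ for $N \leq M$. As $M \leq 0$ is arbitrary, the event lies in $\bigcap_{M \leq 0} \mathcal{G}_M = \mathcal{G}_{-\infty}$, and events of this form suffice to determine measurability.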
Since $X_n \to X_{-\infty}$ a.s., we have $\limsup_{n \to -\infty} X_n = X_{-\infty}$ a.s. After redefining $X_{-\infty}$ on a $\mathbb{P}$-null set (setting it equal to $\limsup_{n \to -\infty} X_n$ where this is finite and to $0$ elsewhere), we obtain a real-valued $\mathcal{G}_{-\infty}$-measurable version.
**Integral identity.** Fix $A \in \mathcal{G}_{-\infty}$. Then $A \in \mathcal{G}_n$ for every $n \leq 0$. Since $X_n = \mathbb{E}[X_0 \mid \mathcal{G}_n]$ is a version of the conditional expectation, the defining property gives
\begin{align*}
\mathbb{E}[X_n \mathbb{1}_A] = \mathbb{E}[X_0 \mathbb{1}_A] \quad \text{for all } n \leq 0.
\end{align*}
We pass to the limit $n \to -\infty$ on the left-hand side. Since $X_n \to X_{-\infty}$ in $L^1$ (as $L^p$ convergence implies $L^1$ convergence on a probability space, because $\|f\|_1 \leq \|f\|_p$ by [Jensen's inequality](/theorems/9) applied to the probability measure) and $\mathbb{1}_A \in L^\infty$ with $\|\mathbb{1}_A\|_\infty \leq 1$, we apply Hölder's inequality with exponents $1$ and $\infty$:
\begin{align*}
|\mathbb{E}[(X_n - X_{-\infty}) \mathbb{1}_A]| \leq \|X_n - X_{-\infty}\|_1 \cdot \|\mathbb{1}_A\|_\infty \leq \|X_n - X_{-\infty}\|_1 \to 0.
\end{align*}
Therefore
\begin{align*}
\mathbb{E}[X_{-\infty} \mathbb{1}_A] = \lim_{n \to -\infty} \mathbb{E}[X_n \mathbb{1}_A] = \mathbb{E}[X_0 \mathbb{1}_A].
\end{align*}
This holds for all $A \in \mathcal{G}_{-\infty}$. Together with $\mathcal{G}_{-\infty}$-measurability, this characterises $X_{-\infty}$ as a version of $\mathbb{E}[X_0 \mid \mathcal{G}_{-\infty}]$ by the [uniqueness of conditional expectation (theorem 1147)](/theorems/1147). This completes the proof that $X_n \to \mathbb{E}[X_0 \mid \mathcal{G}_{-\infty}]$ both a.s. and in $L^p$.
[/guided]
[/step]
Requires
Theorems
- Conditional Convergence Theorems
- Convergence Criterion via Upcrossings
- Doob's Upcrossing Inequality
- Conditional Expectations are Uniformly Integrable
- Uniform Integrability and $L^1$ Convergence