Limit Is Conditional Expectation onto Invariant Sigma-Algebra

Theorem

Proof

[proofplan] We view composition with $T$ as the Koopman isometry on $L^2(X,\mathcal B,\mu)$ and apply the mean ergodic theorem to identify the $L^2$ limit of the ergodic averages as the orthogonal projection onto the fixed-vector subspace. We then identify that fixed-vector subspace with $L^2(X,\mathcal I,\mu)$, the functions measurable with respect to the invariant $\sigma$-algebra. Finally, testing the orthogonal projection against indicators of invariant sets gives exactly the defining integral identity for conditional expectation onto $\mathcal I$. [/proofplan] [step:Apply the mean ergodic theorem to the Koopman isometry] Let $\mathcal H := L^2(X,\mathcal B,\mu)$, equipped with the inner product \begin{align*} \langle g,h\rangle_{\mathcal H} := \int_X g(x)\overline{h(x)}\,d\mu(x). \end{align*} Define the Koopman operator \begin{align*} U:\mathcal H &\to \mathcal H \\ g &\mapsto g\circ T . \end{align*} Since $T$ is measure-preserving, $U$ is well-defined on $L^2$ equivalence classes and \begin{align*} \|Ug\|_{\mathcal H}^2 &= \int_X |g(Tx)|^2\,d\mu(x) \\ &= \int_X |g(y)|^2\,d\mu(y) = \|g\|_{\mathcal H}^2 . \end{align*} Thus $U$ is a linear isometry. For each $N\in \mathbb N$, define the ergodic averaging operator \begin{align*} M_N:\mathcal H &\to \mathcal H \\ g &\mapsto \frac{1}{N}\sum_{n=0}^{N-1}U^n g . \end{align*} Let \begin{align*} \mathcal H_1:=\{g\in \mathcal H: Ug=g\}. \end{align*} By the [Von Neumann Mean Ergodic Theorem](/theorems/3448), applied to the Hilbert space $\mathcal H$ and the isometry $U$, there is an orthogonal projection \begin{align*} P:\mathcal H &\to \mathcal H_1 \end{align*} such that \begin{align*} \lim_{N\to\infty}\|M_N f-Pf\|_{\mathcal H}=0. \end{align*} [guided] The ergodic averages are averages of iterates of the composition operator induced by $T$, so we first put them into Hilbert-space form. 
Let $\mathcal H:=L^2(X,\mathcal B,\mu)$, with inner product \begin{align*} \langle g,h\rangle_{\mathcal H} := \int_X g(x)\overline{h(x)}\,d\mu(x). \end{align*} Define \begin{align*} U:\mathcal H &\to \mathcal H \\ g &\mapsto g\circ T . \end{align*} This operator is well-defined on $L^2$ classes: if two representatives agree outside a $\mu$-null set $Z\in\mathcal B$, then their compositions with $T$ agree outside $T^{-1}Z$, and $\mu(T^{-1}Z)=\mu(Z)=0$ because $T$ is measure-preserving. The same measure-preserving property gives the isometry identity \begin{align*} \|Ug\|_{\mathcal H}^2 &= \int_X |g(Tx)|^2\,d\mu(x) \\ &= \int_X |g(y)|^2\,d\mu(y) = \|g\|_{\mathcal H}^2 . \end{align*} Thus $U$ is a linear isometry of the Hilbert space $\mathcal H$. For $N\in\mathbb N$ with $N\geq 1$, define \begin{align*} M_N:\mathcal H &\to \mathcal H \\ g &\mapsto \frac{1}{N}\sum_{n=0}^{N-1}U^n g . \end{align*} The fixed-vector subspace of $U$ is \begin{align*} \mathcal H_1:=\{g\in \mathcal H: Ug=g\}. \end{align*} The [Von Neumann Mean Ergodic Theorem](/theorems/3448) applies because $\mathcal H$ is a Hilbert space and $U$ is an isometry, hence a contraction. It gives the orthogonal projection \begin{align*} P:\mathcal H &\to \mathcal H_1 \end{align*} and the convergence \begin{align*} \lim_{N\to\infty}\|M_N f-Pf\|_{\mathcal H}=0. \end{align*} So the problem is reduced to identifying the projection $Pf$ with the conditional expectation onto $\mathcal I$. [/guided] [/step] [step:Identify fixed vectors with functions measurable over $\mathcal I$] Let \begin{align*} \mathcal K:=L^2(X,\mathcal I,\mu) \end{align*} be the closed subspace of $\mathcal H$ consisting of $L^2$ classes admitting an $\mathcal I$-measurable representative. [claim:The fixed-vector subspace equals $L^2(X,\mathcal I,\mu)$] One has $\mathcal H_1=\mathcal K$. [/claim] [proof] Let $\mathscr Q$ be the countable basis of $\mathbb C$ consisting of open balls with centers in $\mathbb Q+i\mathbb Q$ and positive rational radii. 
First let $h:X\to\mathbb C$ be an $\mathcal I$-measurable representative of an element of $\mathcal K$. For every $Q\in\mathscr Q$, the set $h^{-1}(Q)$ belongs to $\mathcal I$, so \begin{align*} T^{-1}(h^{-1}(Q))=h^{-1}(Q). \end{align*} Equivalently, \begin{align*} (h\circ T)^{-1}(Q)=h^{-1}(Q) \end{align*} for every $Q\in\mathscr Q$. Since $\mathscr Q$ separates points of $\mathbb C$, it follows that $h\circ T=h$ pointwise. Hence $\mathcal K\subseteq \mathcal H_1$. Conversely, let $g:X\to\mathbb C$ be a finite $\mathcal B$-measurable representative of an element of $\mathcal H_1$. Since $Ug=g$ in $L^2$, the set \begin{align*} N:=\{x\in X:g(Tx)\neq g(x)\} \end{align*} satisfies $\mu(N)=0$. For $Q\in\mathscr Q$, define \begin{align*} E_Q:=g^{-1}(Q). \end{align*} Then $T^{-1}E_Q\triangle E_Q\subseteq N$, where $\triangle$ denotes symmetric difference, and hence \begin{align*} \mu(T^{-1}E_Q\triangle E_Q)=0. \end{align*} For $n\geq 0$, write $T^n:X\to X$ for the $n$-fold iterate, with $T^0=\operatorname{id}_X$, and write $T^{-n}E=(T^n)^{-1}(E)$. Since $T^n$ is measure-preserving for every $n\geq 0$, induction using \begin{align*} T^{-(n+1)}E_Q\triangle E_Q \subseteq T^{-n}(T^{-1}E_Q\triangle E_Q)\cup (T^{-n}E_Q\triangle E_Q) \end{align*} gives \begin{align*} \mu(T^{-n}E_Q\triangle E_Q)=0 \end{align*} for every $n\geq 0$. Define \begin{align*} A_Q:=\bigcap_{m=0}^{\infty}\bigcup_{n=m}^{\infty}T^{-n}E_Q . \end{align*} Then $A_Q\in\mathcal B$ and \begin{align*} T^{-1}A_Q &= \bigcap_{m=0}^{\infty}\bigcup_{n=m}^{\infty}T^{-(n+1)}E_Q \\ &= \bigcap_{m=1}^{\infty}\bigcup_{n=m}^{\infty}T^{-n}E_Q = A_Q. \end{align*} Thus $A_Q\in\mathcal I$. Moreover, with \begin{align*} N_Q:=\bigcup_{n=0}^{\infty}(T^{-n}E_Q\triangle E_Q), \end{align*} we have $\mu(N_Q)=0$ and $A_Q\triangle E_Q\subseteq N_Q$. Hence $g^{-1}(Q)$ differs by a null set from an element of $\mathcal I$ for every $Q\in\mathscr Q$. Let $\overline{\mathcal I}^{\mu}$ denote the $\mu$-completion of $\mathcal I$. 
The collection \begin{align*} \mathscr S:=\{C\in\mathcal B(\mathbb C):g^{-1}(C)\in \overline{\mathcal I}^{\mu}\} \end{align*} is a $\sigma$-algebra containing $\mathscr Q$, and therefore $\mathscr S=\mathcal B(\mathbb C)$. Thus $g$ is $\overline{\mathcal I}^{\mu}$-measurable. Since $\mathbb C$ with its Borel $\sigma$-algebra is a standard Borel space, the measurable modification lemma for completed sub-$\sigma$-algebras gives an $\mathcal I$-measurable map \begin{align*} g_{\mathcal I}:X&\to\mathbb C \end{align*} such that $g_{\mathcal I}=g$ $\mu$-a.e. Hence the $L^2$ class of $g$ belongs to $\mathcal K$, proving $\mathcal H_1\subseteq\mathcal K$. [/proof] Therefore $Pf\in L^2(X,\mathcal I,\mu)$. [guided] We must connect two notions of invariance. The Hilbert-space limit from the mean ergodic theorem lands in \begin{align*} \mathcal H_1=\{g\in L^2(X,\mathcal B,\mu):g\circ T=g \text{ in } L^2\}, \end{align*} whereas conditional expectation onto $\mathcal I$ is characterized among $\mathcal I$-measurable functions. Let \begin{align*} \mathcal K:=L^2(X,\mathcal I,\mu). \end{align*} We prove that $\mathcal H_1=\mathcal K$. [claim:The fixed-vector subspace equals $L^2(X,\mathcal I,\mu)$] One has $\mathcal H_1=\mathcal K$. [/claim] [proof] Let $\mathscr Q$ be the countable basis of $\mathbb C$ consisting of open balls with centers in $\mathbb Q+i\mathbb Q$ and positive rational radii. This basis is used because it is countable and separates points: if $z\neq w$ in $\mathbb C$, then some $Q\in\mathscr Q$ contains one of $z,w$ and not the other. First take an $\mathcal I$-measurable representative \begin{align*} h:X&\to\mathbb C \end{align*} of an element of $\mathcal K$. For each $Q\in\mathscr Q$, the preimage $h^{-1}(Q)$ belongs to $\mathcal I$. By the definition of $\mathcal I$, \begin{align*} T^{-1}(h^{-1}(Q))=h^{-1}(Q). \end{align*} Since $T^{-1}(h^{-1}(Q))=(h\circ T)^{-1}(Q)$, we get \begin{align*} (h\circ T)^{-1}(Q)=h^{-1}(Q) \end{align*} for every $Q\in\mathscr Q$. 
Because $\mathscr Q$ separates points, $h(Tx)=h(x)$ for every $x\in X$: otherwise some $Q\in\mathscr Q$ would contain exactly one of $h(Tx)$ and $h(x)$, so $x$ would lie in exactly one of $(h\circ T)^{-1}(Q)$ and $h^{-1}(Q)$. Thus the $L^2$ class of $h$ lies in $\mathcal H_1$, proving $\mathcal K\subseteq\mathcal H_1$. Now take an element of $\mathcal H_1$ and choose a finite $\mathcal B$-measurable representative \begin{align*} g:X&\to\mathbb C . \end{align*} The equality $Ug=g$ in $L^2$ means that \begin{align*} N:=\{x\in X:g(Tx)\neq g(x)\} \end{align*} has $\mu(N)=0$. For $Q\in\mathscr Q$, set \begin{align*} E_Q:=g^{-1}(Q). \end{align*} If $x\notin N$, then $g(Tx)=g(x)$, so $x\in T^{-1}E_Q$ exactly when $x\in E_Q$. Therefore \begin{align*} T^{-1}E_Q\triangle E_Q\subseteq N, \end{align*} and hence \begin{align*} \mu(T^{-1}E_Q\triangle E_Q)=0. \end{align*} For $n\geq 0$, let $T^n:X\to X$ be the $n$-fold iterate, with $T^0=\operatorname{id}_X$, and write $T^{-n}E=(T^n)^{-1}(E)$. Since $T$ is measure-preserving, every iterate $T^n$ is measure-preserving. We show by induction on $n$ that $T^{-n}E_Q\triangle E_Q$ is null; the base case $n=0$ is trivial because $T^0=\operatorname{id}_X$. The induction step is the inclusion \begin{align*} T^{-(n+1)}E_Q\triangle E_Q \subseteq T^{-n}(T^{-1}E_Q\triangle E_Q)\cup (T^{-n}E_Q\triangle E_Q). \end{align*} The first set on the right has measure zero because $T^n$ is measure-preserving and $T^{-1}E_Q\triangle E_Q$ has measure zero; the second has measure zero by the induction hypothesis. Hence \begin{align*} \mu(T^{-n}E_Q\triangle E_Q)=0 \end{align*} for every $n\geq 0$. We now replace the almost-invariant set $E_Q$ by an exactly invariant set. Define \begin{align*} A_Q:=\bigcap_{m=0}^{\infty}\bigcup_{n=m}^{\infty}T^{-n}E_Q . \end{align*} This is the set of points whose forward orbit visits $E_Q$ infinitely often. Since each $T^{-n}E_Q$ belongs to $\mathcal B$, so does $A_Q$. It is exactly invariant because \begin{align*} T^{-1}A_Q &= \bigcap_{m=0}^{\infty}\bigcup_{n=m}^{\infty}T^{-(n+1)}E_Q \\ &= \bigcap_{m=1}^{\infty}\bigcup_{n=m}^{\infty}T^{-n}E_Q \\ &= \bigcap_{m=0}^{\infty}\bigcup_{n=m}^{\infty}T^{-n}E_Q = A_Q. \end{align*} The first equality holds because preimages commute with unions and intersections, and the third holds because the sets $\bigcup_{n=m}^{\infty}T^{-n}E_Q$ decrease as $m$ increases, so adjoining the $m=0$ term does not change the intersection. Thus $A_Q\in\mathcal I$. It remains to check that $A_Q$ represents the same measurable set as $E_Q$ modulo null sets. 
Define \begin{align*} N_Q:=\bigcup_{n=0}^{\infty}(T^{-n}E_Q\triangle E_Q). \end{align*} Each set in this countable union is null by the induction above, so countable subadditivity gives $\mu(N_Q)=0$. If $x\notin N_Q$, then membership in every $T^{-n}E_Q$ agrees with membership in $E_Q$. Therefore $x\in A_Q$ exactly when $x\in E_Q$, so $A_Q\triangle E_Q\subseteq N_Q$. Hence $E_Q$ differs by a null set from the invariant set $A_Q$. Let $\overline{\mathcal I}^{\mu}$ be the $\mu$-completion of $\mathcal I$. The collection \begin{align*} \mathscr S:=\{C\in\mathcal B(\mathbb C):g^{-1}(C)\in \overline{\mathcal I}^{\mu}\} \end{align*} is a $\sigma$-algebra. Every basis set $Q\in\mathscr Q$ belongs to $\mathscr S$, because $g^{-1}(Q)=E_Q$ differs from $A_Q\in\mathcal I$ by a null set. Since $\mathscr Q$ generates $\mathcal B(\mathbb C)$, we have $\mathscr S=\mathcal B(\mathbb C)$. Thus $g$ is measurable with respect to the completed invariant $\sigma$-algebra. The measurable modification lemma for completed sub-$\sigma$-algebras applies because $\mathbb C$ is a standard Borel space; it gives an $\mathcal I$-measurable map \begin{align*} g_{\mathcal I}:X&\to\mathbb C \end{align*} with $g_{\mathcal I}=g$ $\mu$-a.e. Therefore the $L^2$ class of $g$ belongs to $\mathcal K$, proving $\mathcal H_1\subseteq\mathcal K$. [/proof] Since $Pf\in\mathcal H_1$ by construction of the orthogonal projection, the equality $\mathcal H_1=\mathcal K$ gives \begin{align*} Pf\in L^2(X,\mathcal I,\mu). \end{align*} [/guided] [/step] [step:Test the projection against indicators of invariant sets] Let $A\in\mathcal I$ be arbitrary, and define the indicator map \begin{align*} \mathbb 1_A:X&\to\{0,1\} \\ x&\mapsto \begin{cases} 1,&x\in A,\\ 0,&x\notin A. \end{cases} \end{align*} Since the measure space of a measure-preserving system is a probability space, $\mu(X)=1$, so $\mathbb 1_A\in\mathcal H$. Since $A\in\mathcal I$, \begin{align*} \mathbb 1_A\circ T = \mathbb 1_{T^{-1}A} = \mathbb 1_A, \end{align*} and hence $\mathbb 1_A\in\mathcal H_1$. Because $P$ is the orthogonal projection onto $\mathcal H_1$, one has $f-Pf\in\mathcal H_1^\perp$. 
Therefore \begin{align*} 0 &= \langle f-Pf,\mathbb 1_A\rangle_{\mathcal H} \\ &= \int_X (f(x)-Pf(x))\mathbb 1_A(x)\,d\mu(x) \\ &= \int_A (f(x)-Pf(x))\,d\mu(x). \end{align*} Thus \begin{align*} \int_A Pf(x)\,d\mu(x) = \int_A f(x)\,d\mu(x) \end{align*} for every $A\in\mathcal I$. [guided] To prove that $Pf$ is the conditional expectation, we must verify the defining integral identity on every invariant set. Fix $A\in\mathcal I$, and define \begin{align*} \mathbb 1_A:X&\to\{0,1\} \\ x&\mapsto \begin{cases} 1,&x\in A,\\ 0,&x\notin A. \end{cases} \end{align*} Since the measure space of a measure-preserving system is a probability space, $\mu(X)=1$, so \begin{align*} \int_X |\mathbb 1_A(x)|^2\,d\mu(x) = \mu(A) \leq 1. \end{align*} Hence $\mathbb 1_A\in\mathcal H$. The reason indicators of invariant sets are the right test functions is that they are fixed by $U$. Indeed, $A\in\mathcal I$ means $T^{-1}A=A$, and therefore \begin{align*} \mathbb 1_A\circ T = \mathbb 1_{T^{-1}A} = \mathbb 1_A. \end{align*} So $\mathbb 1_A\in\mathcal H_1$. Since $P$ is the orthogonal projection onto $\mathcal H_1$, the error $f-Pf$ is orthogonal to every vector in $\mathcal H_1$. Applying this to $\mathbb 1_A$ gives \begin{align*} 0 &= \langle f-Pf,\mathbb 1_A\rangle_{\mathcal H} \\ &= \int_X (f(x)-Pf(x))\overline{\mathbb 1_A(x)}\,d\mu(x). \end{align*} Because $\mathbb 1_A$ is real-valued and equals $1$ on $A$ and $0$ on $X\setminus A$, this becomes \begin{align*} 0 = \int_A (f(x)-Pf(x))\,d\mu(x). \end{align*} Therefore \begin{align*} \int_A Pf(x)\,d\mu(x) = \int_A f(x)\,d\mu(x) \end{align*} for every invariant set $A\in\mathcal I$. [/guided] [/step] [step:Use the uniqueness of conditional expectation to identify the limit] Since $\mu(X)=1$ and $f\in L^2(X,\mathcal B,\mu)$, the Cauchy-Schwarz inequality gives \begin{align*} \int_X |f(x)|\,d\mu(x) &\leq \left(\int_X |f(x)|^2\,d\mu(x)\right)^{1/2} \left(\int_X 1^2\,d\mu(x)\right)^{1/2} <\infty. 
\end{align*} Thus $f\in L^1(X,\mu)$, so $\mathbb E[f\mid\mathcal I]$ is defined. Let \begin{align*} e:X&\to\mathbb C \end{align*} be an $\mathcal I$-measurable representative of $\mathbb E[f\mid\mathcal I]$. By the defining characterization of conditional expectation, \begin{align*} \int_A e(x)\,d\mu(x) = \int_A f(x)\,d\mu(x) \end{align*} for every $A\in\mathcal I$. The previous two steps show that $Pf$ has an $\mathcal I$-measurable representative and satisfies the same integral identity. Since $Pf\in L^2(X,\mu)$ and $\mu(X)=1$, another application of Cauchy-Schwarz gives $Pf\in L^1(X,\mu)$. By the [Existence and Uniqueness of Conditional Expectation](/theorems/1147), $Pf=e$ $\mu$-a.e. Hence \begin{align*} Pf=\mathbb E[f\mid\mathcal I] \end{align*} as elements of $L^2(X,\mathcal B,\mu)$. Combining this identity with the mean ergodic convergence gives \begin{align*} \lim_{N\to\infty} \left\| \frac{1}{N}\sum_{n=0}^{N-1} f\circ T^n - \mathbb E[f\mid\mathcal I] \right\|_{L^2(X,\mu)} = 0. \end{align*} [guided] The final step is to identify the abstract Hilbert-space projection $Pf$ with the measure-theoretic conditional expectation. Since $\mu(X)=1$ and $f\in L^2(X,\mathcal B,\mu)$, Cauchy-Schwarz gives \begin{align*} \int_X |f(x)|\,d\mu(x) &\leq \left(\int_X |f(x)|^2\,d\mu(x)\right)^{1/2} \left(\int_X 1^2\,d\mu(x)\right)^{1/2} <\infty, \end{align*} so $f\in L^1(X,\mu)$ and $\mathbb E[f\mid\mathcal I]$ is defined. Let \begin{align*} e:X&\to\mathbb C \end{align*} be an $\mathcal I$-measurable representative of this conditional expectation. Its defining property is \begin{align*} \int_A e(x)\,d\mu(x)=\int_A f(x)\,d\mu(x) \end{align*} for every $A\in\mathcal I$. The third step proved exactly the same identity for $Pf$, and the second step proved that $Pf$ has an $\mathcal I$-measurable representative. Moreover $Pf\in L^2(X,\mu)$, so $\mu(X)=1$ and Cauchy-Schwarz imply $Pf\in L^1(X,\mu)$. 
The [Existence and Uniqueness of Conditional Expectation](/theorems/1147) therefore identifies the two: \begin{align*} Pf=\mathbb E[f\mid\mathcal I] \end{align*} in $L^2(X,\mathcal B,\mu)$. Substituting this identity into the mean ergodic convergence gives \begin{align*} \lim_{N\to\infty} \left\| \frac{1}{N}\sum_{n=0}^{N-1} f\circ T^n - \mathbb E[f\mid\mathcal I] \right\|_{L^2(X,\mu)} =0. \end{align*} [/guided] [/step]
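
As a numerical sanity check of the convergence just proved, the following minimal Python sketch works on a toy system that is not part of the proof: the six-point space $X=\{0,\dots,5\}$ with uniform probability measure and the shift $T(x)=x+2 \bmod 6$, which is a bijection and hence measure-preserving. Here the invariant sets are unions of the two orbits $\{0,2,4\}$ and $\{1,3,5\}$, so $\mathbb E[f\mid\mathcal I]$ should be the orbit average of $f$. All names (`SIZE`, `T`, `ergodic_average`, `conditional_expectation`, `l2_distance`) and the test function `f` are hypothetical choices made for illustration.

```python
from math import sqrt

SIZE = 6  # hypothetical finite state space X = {0, ..., 5}, uniform measure


def T(x):
    """Toy measure-preserving map: shift by 2 modulo 6 (a bijection of X)."""
    return (x + 2) % SIZE


def ergodic_average(f, N):
    """Return M_N f as a list: (1/N) * sum_{n=0}^{N-1} f(T^n x) for each x."""
    out = []
    for x in range(SIZE):
        total, y = 0.0, x
        for _ in range(N):
            total += f[y]
            y = T(y)
        out.append(total / N)
    return out


def conditional_expectation(f):
    """E[f | I] in this toy model: average f over the T-orbit of each point.

    This relies on the assumption that every invariant set here is a union
    of the orbits {0, 2, 4} and {1, 3, 5}.
    """
    out = []
    for x in range(SIZE):
        orbit, y = [], x
        while y not in orbit:
            orbit.append(y)
            y = T(y)
        out.append(sum(f[z] for z in orbit) / len(orbit))
    return out


def l2_distance(g, h):
    """L^2 norm of g - h with respect to the uniform probability measure."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(g, h)) / SIZE)


f = [3.0, -1.0, 0.5, 4.0, 2.5, 6.0]  # arbitrary real-valued test function
target = conditional_expectation(f)
errors = {N: l2_distance(ergodic_average(f, N), target)
          for N in (1, 10, 100, 1000)}

for N, err in errors.items():
    print(N, err)
```

The system is deliberately non-ergodic, so the limit is a genuinely non-constant function rather than the integral of $f$; the printed $L^2$ errors should shrink as $N$ grows, in line with the theorem.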
