Ergodic Measures Are Extreme Points — Statement & Proof

Theorem

Let $(X,\mathcal B)$ be a measurable space, let $T:X\to X$ be a measurable transformation, and let $\mathcal M_T$ denote the set of $T$-invariant probability measures on $(X,\mathcal B)$. Then $\mu\in\mathcal M_T$ is ergodic if and only if $\mu$ is an extreme point of $\mathcal M_T$.

Proof

[proofplan] We prove both directions. For the ergodic-to-extreme direction, a convex decomposition of $\mu$ forces each component measure to be absolutely continuous with respect to $\mu$; the Radon-Nikodym density of one component is bounded and, by invariance, is fixed by composition with $T$ in $L^2(\mu)$. Ergodicity then forces this density to be constant, and the total mass condition forces the constant to be $1$. Conversely, if $\mu$ is not ergodic, a proper invariant set lets us condition $\mu$ on the set and on its complement, producing two distinct invariant probability measures whose convex combination is $\mu$. [/proofplan] [step:Record the invariant-set formulation used throughout] Let $(X,\mathcal B)$ denote the measurable space and let \begin{align*} T:X&\to X \end{align*} be the measurable transformation underlying $\mathcal M_T$. For a probability measure $\nu:\mathcal B\to[0,1]$, the condition $\nu\in\mathcal M_T$ means \begin{align*} \nu(T^{-1}B)=\nu(B)\quad\text{for every }B\in\mathcal B. \end{align*} We use the measure-theoretic formulation of ergodicity: $\mu\in\mathcal M_T$ is ergodic if every $A\in\mathcal B$ satisfying \begin{align*} \mu(T^{-1}A\triangle A)=0 \end{align*} has $\mu(A)\in\{0,1\}$, where $\triangle$ denotes symmetric difference. [/step] [step:Prove that dominated invariant densities are constant under ergodicity] Assume that $\mu$ is ergodic. Let $C\in(0,\infty)$ and let $\nu:\mathcal B\to[0,1]$ be a probability measure such that $\nu\in\mathcal M_T$ and \begin{align*} \nu(B)\leq C\mu(B)\quad\text{for every }B\in\mathcal B. \end{align*} Then $\nu\ll\mu$. By the [Radon-Nikodym Theorem](/theorems/1247), there is a $\mathcal B$-measurable function \begin{align*} h:X&\to[0,\infty)\\ x&\mapsto h(x) \end{align*} such that \begin{align*} \nu(B)=\int_B h(x)\,d\mu(x)\quad\text{for every }B\in\mathcal B. \end{align*} For $\varepsilon\in(0,\infty)$, define $H_\varepsilon:=\{x\in X:h(x)>C+\varepsilon\}$. 
Since $h>C+\varepsilon$ on $H_\varepsilon$ and $\nu$ is dominated by $C\mu$, \begin{align*} (C+\varepsilon)\mu(H_\varepsilon) \leq \int_{H_\varepsilon}h(x)\,d\mu(x) =\nu(H_\varepsilon) \leq C\mu(H_\varepsilon), \end{align*} so $\varepsilon\mu(H_\varepsilon)\leq 0$ and hence $\mu(H_\varepsilon)=0$. Since $\{x\in X:h(x)>C\}$ is the countable union of the sets $H_\varepsilon$ over $\varepsilon\in\mathbb Q\cap(0,\infty)$, it follows that $h\leq C$ $\mu$-almost everywhere. After redefining $h$ on a $\mu$-null set, we may assume $0\leq h\leq C$ everywhere. For every bounded $\mathcal B$-measurable function $\varphi:X\to\mathbb R$, invariance of $\nu$ gives \begin{align*} \int_X \varphi(x)h(x)\,d\mu(x) =\int_X \varphi(x)\,d\nu(x) =\int_X \varphi(T(x))\,d\nu(x) =\int_X \varphi(T(x))h(x)\,d\mu(x), \end{align*} where the middle identity holds for indicator functions by the definition of invariance and extends to bounded measurable functions by the monotone class theorem. Taking $\varphi=h$, which is bounded, gives \begin{align*} \int_X h(x)^2\,d\mu(x)=\int_X h(T(x))h(x)\,d\mu(x). \end{align*} Since $\mu\in\mathcal M_T$ and $h^2:X\to[0,\infty)$ is bounded and measurable, \begin{align*} \int_X h(T(x))^2\,d\mu(x)=\int_X h(x)^2\,d\mu(x). \end{align*} By the two preceding identities, the three integrals on the right-hand side below coincide, so \begin{align*} \int_X\bigl(h(T(x))-h(x)\bigr)^2\,d\mu(x) &=\int_X h(T(x))^2\,d\mu(x)-2\int_X h(T(x))h(x)\,d\mu(x)+\int_X h(x)^2\,d\mu(x)\\ &=0. \end{align*} Thus $h\circ T=h$ $\mu$-almost everywhere. [claim:Invariant measurable functions are constant under an ergodic measure] Let $g:X\to\mathbb R$ be a bounded $\mathcal B$-measurable function satisfying $g\circ T=g$ $\mu$-almost everywhere. Then there is a constant $c\in\mathbb R$ such that $g=c$ $\mu$-almost everywhere. [/claim] [proof] For each $q\in\mathbb Q$, define \begin{align*} E_q:=\{x\in X:g(x)>q\}. \end{align*} Since $g\circ T=g$ $\mu$-almost everywhere, the sets $T^{-1}E_q$ and $E_q$ differ only on a $\mu$-null set. Hence \begin{align*} \mu(T^{-1}E_q\triangle E_q)=0. \end{align*} By ergodicity, $\mu(E_q)\in\{0,1\}$ for every $q\in\mathbb Q$. Choose $M\in(0,\infty)$ such that $|g(x)|\leq M$ for every $x\in X$. 
Define \begin{align*} c:=\inf\{q\in\mathbb Q:\mu(E_q)=0\}. \end{align*} This number is finite because $\mu(E_q)=0$ for every rational $q\geq M$ and $\mu(E_q)=1$ for every rational $q<-M$. If $q\in\mathbb Q$ and $q<c$, then $\mu(E_q)=1$. If $q\in\mathbb Q$ and $q>c$, then there is $r\in\mathbb Q$ with $r<q$ and $\mu(E_r)=0$, so $E_q\subseteq E_r$ and $\mu(E_q)=0$. Now \begin{align*} \{x\in X:g(x)>c\}=\bigcup_{\substack{q\in\mathbb Q\\ q>c}}E_q \end{align*} has $\mu$-measure $0$, and \begin{align*} \{x\in X:g(x)<c\}=\bigcup_{\substack{q\in\mathbb Q\\ q<c}}(X\setminus E_q) \end{align*} also has $\mu$-measure $0$. Therefore $\mu(\{x\in X:g(x)\neq c\})=0$. [/proof] Applying the claim to $g=h$ gives $h=c$ $\mu$-almost everywhere for some $c\in\mathbb R$. Since both $\nu$ and $\mu$ are probability measures, \begin{align*} 1=\nu(X)=\int_X h(x)\,d\mu(x)=\int_X c\,d\mu(x)=c\mu(X)=c. \end{align*} Thus $h=1$ $\mu$-almost everywhere, and consequently $\nu=\mu$. [guided] We prove a lemma that will be applied to one component of a convex decomposition. Assume $\mu$ is ergodic, let $C\in(0,\infty)$, and let $\nu:\mathcal B\to[0,1]$ be a probability measure satisfying $\nu\in\mathcal M_T$ and \begin{align*} \nu(B)\leq C\mu(B)\quad\text{for every }B\in\mathcal B. \end{align*} The domination implies absolute continuity: if $\mu(B)=0$, then $\nu(B)\leq C\mu(B)=0$, so $\nu(B)=0$. Hence $\nu\ll\mu$. By the [Radon-Nikodym Theorem](/theorems/1247), there is a $\mathcal B$-measurable function \begin{align*} h:X&\to[0,\infty)\\ x&\mapsto h(x) \end{align*} such that \begin{align*} \nu(B)=\int_B h(x)\,d\mu(x)\quad\text{for every }B\in\mathcal B. \end{align*} We also need $h$ to be bounded, because later we will use $h$ itself as a test function. For $\varepsilon\in(0,\infty)$, define \begin{align*} H_\varepsilon:=\{x\in X:h(x)>C+\varepsilon\}. \end{align*} Then \begin{align*} (C+\varepsilon)\mu(H_\varepsilon) \leq \int_{H_\varepsilon}h(x)\,d\mu(x) =\nu(H_\varepsilon) \leq C\mu(H_\varepsilon). 
\end{align*} Comparing the two ends of this chain gives $\varepsilon\mu(H_\varepsilon)\leq 0$, which is possible only if $\mu(H_\varepsilon)=0$. The set $\{x\in X:h(x)>C\}$ is the countable union of the sets $H_\varepsilon$ over rational $\varepsilon>0$, so $h\leq C$ $\mu$-almost everywhere. Redefining $h$ on a $\mu$-null set does not change the measure represented by $h\,d\mu$, so we may assume $0\leq h\leq C$ everywhere. The key point is to convert invariance of the measure $\nu$ into invariance of its density $h$. Let $\varphi:X\to\mathbb R$ be bounded and $\mathcal B$-measurable. Since $\nu\in\mathcal M_T$, the identity \begin{align*} \int_X \varphi(x)\,d\nu(x)=\int_X \varphi(T(x))\,d\nu(x) \end{align*} holds first for indicator functions $\varphi=\mathbb 1_B$ and then for bounded measurable $\varphi$ by the monotone class theorem. Substituting $d\nu=h\,d\mu$ gives \begin{align*} \int_X \varphi(x)h(x)\,d\mu(x)=\int_X \varphi(T(x))h(x)\,d\mu(x). \end{align*} Now choose $\varphi=h$. This is allowed because $h$ is bounded and measurable. We get \begin{align*} \int_X h(x)^2\,d\mu(x)=\int_X h(T(x))h(x)\,d\mu(x). \end{align*} Since $\mu\in\mathcal M_T$, applying invariance of $\mu$ to the bounded measurable function $h^2:X\to[0,\infty)$ gives \begin{align*} \int_X h(T(x))^2\,d\mu(x)=\int_X h(x)^2\,d\mu(x). \end{align*} These two identities show that the three integrals on the right-hand side below are equal, so \begin{align*} \int_X\bigl(h(T(x))-h(x)\bigr)^2\,d\mu(x) &=\int_X h(T(x))^2\,d\mu(x)-2\int_X h(T(x))h(x)\,d\mu(x)+\int_X h(x)^2\,d\mu(x)\\ &=0. \end{align*} A nonnegative function with integral $0$ is $0$ $\mu$-almost everywhere, so $h\circ T=h$ $\mu$-almost everywhere. It remains to explain why a bounded measurable function that is invariant $\mu$-almost everywhere must be $\mu$-almost everywhere constant when $\mu$ is ergodic. Let $g:X\to\mathbb R$ be a bounded $\mathcal B$-measurable function satisfying $g\circ T=g$ $\mu$-almost everywhere. For each rational number $q\in\mathbb Q$, define \begin{align*} E_q:=\{x\in X:g(x)>q\}. \end{align*} Because $g(T(x))=g(x)$ outside a $\mu$-null set, membership in $E_q$ and membership in $T^{-1}E_q$ agree outside that null set. 
Thus \begin{align*} \mu(T^{-1}E_q\triangle E_q)=0. \end{align*} Ergodicity gives $\mu(E_q)\in\{0,1\}$ for every rational $q$. Choose $M\in(0,\infty)$ such that $|g(x)|\leq M$ for every $x\in X$, and define \begin{align*} c:=\inf\{q\in\mathbb Q:\mu(E_q)=0\}. \end{align*} The set in the infimum is nonempty because $E_q=\varnothing$ for rational $q\geq M$, and it is bounded below because $E_q=X$ for rational $q<-M$. If $q<c$, then $\mu(E_q)$ cannot be $0$, so $\mu(E_q)=1$. If $q>c$, the definition of the infimum gives some rational $r<q$ with $\mu(E_r)=0$; since $E_q\subseteq E_r$, we get $\mu(E_q)=0$. Therefore \begin{align*} \{x\in X:g(x)>c\}=\bigcup_{\substack{q\in\mathbb Q\\q>c}}E_q \end{align*} has measure $0$, and \begin{align*} \{x\in X:g(x)<c\}=\bigcup_{\substack{q\in\mathbb Q\\q<c}}(X\setminus E_q) \end{align*} has measure $0$. Hence $g=c$ $\mu$-almost everywhere. Applying this to $g=h$ gives $h=c$ $\mu$-almost everywhere. Since $\nu(X)=1$ and $\mu(X)=1$, \begin{align*} 1=\nu(X)=\int_X h(x)\,d\mu(x)=\int_X c\,d\mu(x)=c. \end{align*} Thus $h=1$ $\mu$-almost everywhere, so $\nu=\mu$. [/guided] [/step] [step:Use the constant-density result to rule out proper decompositions of an ergodic measure] Assume $\mu$ is ergodic and suppose \begin{align*} \mu=t\mu_1+(1-t)\mu_2 \end{align*} for some $t\in(0,1)$ and some $\mu_1,\mu_2\in\mathcal M_T$. For every $B\in\mathcal B$, \begin{align*} t\mu_1(B)\leq t\mu_1(B)+(1-t)\mu_2(B)=\mu(B), \end{align*} so \begin{align*} \mu_1(B)\leq \frac{1}{t}\mu(B). \end{align*} Applying the previous step with $\nu=\mu_1$ and $C=1/t$ gives $\mu_1=\mu$. Then, for every $B\in\mathcal B$, \begin{align*} (1-t)\mu_2(B)=\mu(B)-t\mu_1(B)=\mu(B)-t\mu(B)=(1-t)\mu(B). \end{align*} Since $1-t>0$, $\mu_2(B)=\mu(B)$ for every $B\in\mathcal B$, so $\mu_2=\mu$. Hence every convex decomposition of $\mu$ inside $\mathcal M_T$ is forced to use only $\mu$ itself, and $\mu$ is an extreme point of $\mathcal M_T$. 
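As a finite sanity check of this step, the following sketch assumes a toy system that is not part of the proof: $X=\{0,1,2\}$ with $T$ the cyclic permutation $0\mapsto 1\mapsto 2\mapsto 0$, for which the uniform measure is ergodic. The helper names `pushforward`, `is_invariant`, and `invariant_grid_measures` are illustrative. The search only covers probability vectors with denominator 6, so it illustrates the conclusion rather than proving it.

```python
from fractions import Fraction
from itertools import product


def pushforward(measure, T):
    """Point masses of B -> measure(T^{-1}B) on the finite set {0, ..., n-1}."""
    out = [Fraction(0)] * len(measure)
    for x, mass in enumerate(measure):
        # The point T[x] collects the mass of every x in its preimage.
        out[T[x]] += mass
    return tuple(out)


def is_invariant(measure, T):
    """On a finite set, invariance can be checked on singletons."""
    return pushforward(measure, T) == tuple(measure)


def invariant_grid_measures(T, denom):
    """All invariant probability vectors with entries k/denom (hypothetical helper)."""
    found = []
    for ks in product(range(denom + 1), repeat=len(T)):
        if sum(ks) == denom:
            m = tuple(Fraction(k, denom) for k in ks)
            if is_invariant(m, T):
                found.append(m)
    return found


# Assumed example: a single 3-cycle, so the uniform measure is ergodic.
T_cycle = [1, 2, 0]
grid = invariant_grid_measures(T_cycle, 6)
print(grid)
```

On this grid the only invariant probability vector is the uniform one, so there are no two distinct invariant measures available to form a proper convex decomposition, in line with the extremality just proved.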
[/step] [step:Condition a nonergodic measure on a proper invariant set] Assume now that $\mu$ is not ergodic. Then there exists $A\in\mathcal B$ such that \begin{align*} 0<\mu(A)<1 \quad\text{and}\quad \mu(T^{-1}A\triangle A)=0. \end{align*} Define \begin{align*} a&:=\mu(A),\\ A^c&:=X\setminus A,\\ b&:=\mu(A^c)=1-a. \end{align*} Then $a,b\in(0,1)$. Define set functions \begin{align*} \mu_A:\mathcal B&\to[0,1]\\ B&\mapsto \frac{\mu(B\cap A)}{a} \end{align*} and \begin{align*} \mu_{A^c}:\mathcal B&\to[0,1]\\ B&\mapsto \frac{\mu(B\cap A^c)}{b}. \end{align*} Both are probability measures, because intersection with a fixed measurable set preserves countable disjoint unions, and $\mu_A(X)=\mu_{A^c}(X)=1$. For every $B\in\mathcal B$, the sets $T^{-1}B\cap A$ and $T^{-1}B\cap T^{-1}A$ differ by a subset of $A\triangle T^{-1}A$, which has $\mu$-measure $0$. Hence \begin{align*} \mu_A(T^{-1}B) &=\frac{\mu(T^{-1}B\cap A)}{a}\\ &=\frac{\mu(T^{-1}B\cap T^{-1}A)}{a}\\ &=\frac{\mu(T^{-1}(B\cap A))}{a}\\ &=\frac{\mu(B\cap A)}{a}\\ &=\mu_A(B), \end{align*} where the fourth equality uses $\mu\in\mathcal M_T$. Also $T^{-1}A^c\triangle A^c=T^{-1}A\triangle A$, so the same computation gives \begin{align*} \mu_{A^c}(T^{-1}B) &=\frac{\mu(T^{-1}B\cap A^c)}{b}\\ &=\frac{\mu(T^{-1}B\cap T^{-1}A^c)}{b}\\ &=\frac{\mu(T^{-1}(B\cap A^c))}{b}\\ &=\frac{\mu(B\cap A^c)}{b}\\ &=\mu_{A^c}(B). \end{align*} Therefore $\mu_A,\mu_{A^c}\in\mathcal M_T$. [guided] Because $\mu$ is not ergodic, there is a measurable set $A\in\mathcal B$ that is invariant up to a $\mu$-null set and has genuinely intermediate measure: \begin{align*} 0<\mu(A)<1 \quad\text{and}\quad \mu(T^{-1}A\triangle A)=0. \end{align*} Define \begin{align*} a&:=\mu(A),\\ A^c&:=X\setminus A,\\ b&:=\mu(A^c)=1-a. \end{align*} The inequalities above imply $a,b\in(0,1)$, so division by $a$ and by $b$ is valid. We condition $\mu$ on $A$ and on $A^c$. 
Define \begin{align*} \mu_A:\mathcal B&\to[0,1]\\ B&\mapsto \frac{\mu(B\cap A)}{a} \end{align*} and \begin{align*} \mu_{A^c}:\mathcal B&\to[0,1]\\ B&\mapsto \frac{\mu(B\cap A^c)}{b}. \end{align*} These are probability measures: countable additivity follows from countable additivity of $\mu$ because intersections with $A$ and $A^c$ preserve disjoint unions, and \begin{align*} \mu_A(X)=\frac{\mu(A)}{a}=1, \qquad \mu_{A^c}(X)=\frac{\mu(A^c)}{b}=1. \end{align*} We now verify that these conditional measures are $T$-invariant. Fix $B\in\mathcal B$. Since $\mu(T^{-1}A\triangle A)=0$, replacing $A$ by $T^{-1}A$ inside an intersection changes the $\mu$-measure by $0$. Thus \begin{align*} \mu(T^{-1}B\cap A)=\mu(T^{-1}B\cap T^{-1}A). \end{align*} Using this replacement, \begin{align*} \mu_A(T^{-1}B) &=\frac{\mu(T^{-1}B\cap A)}{a}\\ &=\frac{\mu(T^{-1}B\cap T^{-1}A)}{a}\\ &=\frac{\mu(T^{-1}(B\cap A))}{a}\\ &=\frac{\mu(B\cap A)}{a}\\ &=\mu_A(B). \end{align*} The fourth equality is exactly the $T$-invariance of $\mu$, applied to the measurable set $B\cap A$. The complement is invariant up to the same null set because \begin{align*} T^{-1}A^c\triangle A^c=T^{-1}A\triangle A. \end{align*} Therefore, for the same fixed $B\in\mathcal B$, \begin{align*} \mu_{A^c}(T^{-1}B) &=\frac{\mu(T^{-1}B\cap A^c)}{b}\\ &=\frac{\mu(T^{-1}B\cap T^{-1}A^c)}{b}\\ &=\frac{\mu(T^{-1}(B\cap A^c))}{b}\\ &=\frac{\mu(B\cap A^c)}{b}\\ &=\mu_{A^c}(B). \end{align*} Thus $\mu_A,\mu_{A^c}\in\mathcal M_T$. [/guided] [/step] [step:Assemble the conditional measures into a proper convex decomposition] For every $B\in\mathcal B$, \begin{align*} a\mu_A(B)+b\mu_{A^c}(B) &=a\frac{\mu(B\cap A)}{a}+b\frac{\mu(B\cap A^c)}{b}\\ &=\mu(B\cap A)+\mu(B\cap A^c)\\ &=\mu(B). \end{align*} Thus \begin{align*} \mu=a\mu_A+b\mu_{A^c}. \end{align*} Since $a,b\in(0,1)$, this is a convex decomposition inside $\mathcal M_T$. 
It is proper because \begin{align*} \mu_A(A)=1 \quad\text{and}\quad \mu_{A^c}(A)=0, \end{align*} so $\mu_A\neq\mu_{A^c}$. Therefore $\mu$ is not an extreme point of $\mathcal M_T$. We have shown that ergodicity implies extremality and that nonergodicity implies nonextremality. Hence $\mu\in\mathcal M_T$ is ergodic if and only if $\mu$ is an extreme point of $\mathcal M_T$. [/step]
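The conditioning construction in the nonergodic direction can be checked by hand on a small example. The sketch below assumes a toy system that does not appear in the proof: $X=\{0,1,2,3\}$, $T$ swapping $0\leftrightarrow 1$ and $2\leftrightarrow 3$, $\mu$ uniform, and $A=\{0,1\}$, which is exactly invariant with $\mu(A)=1/2$. The function names `pushforward` and `condition` are illustrative.

```python
from fractions import Fraction


def pushforward(measure, T):
    """Point masses of B -> measure(T^{-1}B) on the finite set {0, ..., n-1}."""
    out = [Fraction(0)] * len(measure)
    for x, mass in enumerate(measure):
        out[T[x]] += mass
    return out


def condition(measure, A):
    """Conditional measure B -> measure(B ∩ A) / measure(A); requires measure(A) > 0."""
    total = sum(measure[x] for x in A)
    return [measure[x] / total if x in A else Fraction(0)
            for x in range(len(measure))]


# Assumed example: two disjoint 2-cycles, so A = {0, 1} is a proper invariant set.
T = [1, 0, 3, 2]
mu = [Fraction(1, 4)] * 4
A = {0, 1}
A_c = set(range(4)) - A

a = sum(mu[x] for x in A)   # a = mu(A)
b = 1 - a                   # b = mu(A^c)

mu_A = condition(mu, A)
mu_Ac = condition(mu, A_c)

# Recombine: a * mu_A + b * mu_{A^c} should give back mu.
recombined = [a * p + b * q for p, q in zip(mu_A, mu_Ac)]

print(pushforward(mu_A, T) == mu_A, pushforward(mu_Ac, T) == mu_Ac)
print(recombined == mu, mu_A != mu_Ac)
```

Both conditional measures are invariant, they recombine to $\mu$ with weights $a$ and $b$, and they differ on $A$, matching the three properties used in the last two steps.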

Ergodic Measures Are Extreme Points (Theorem # 3452)
