[proofplan]
We use the continued-fraction Markov partition for the Gauss map. Each finite cylinder is mapped bijectively by an iterate of $G$ onto the full interval, and the inverse branches have uniformly bounded distortion with respect to the Gauss density. If $A$ is invariant and has positive measure, a density point of $A$ has shrinking cylinders on which $A$ occupies asymptotically full conditional measure. Full-branch bounded distortion transfers this local fact to a global lower bound for $\mu_G(A)$, forcing $\mu_G(A)=1$.
[/proofplan]
[step:Introduce the continued-fraction branches and cylinders]
Let $\rho:(0,1)\to(0,\infty)$ denote the Gauss density
\begin{align*}
\rho(x)=\frac{1}{\log 2}\frac{1}{1+x}.
\end{align*}
Let $\mathbb N=\{1,2,3,\dots\}$ denote the set of positive integers. For each $n\in\mathbb N$, define the interval
\begin{align*}
I_n=\left(\frac{1}{n+1},\frac{1}{n}\right)
\end{align*}
and define the inverse branch $h_n:(0,1)\to I_n$ by
\begin{align*}
h_n(y)=\frac{1}{n+y}.
\end{align*}
Then $G(h_n(y))=y$ for every $y\in(0,1)$.
For a word $a=(a_1,\dots,a_k)\in\mathbb N^k$, define the inverse branch $h_a:(0,1)\to(0,1)$ by
\begin{align*}
h_a=h_{a_1}\circ h_{a_2}\circ\cdots\circ h_{a_k}.
\end{align*}
Define the corresponding $k$-cylinder $C_a\subset(0,1)$ by
\begin{align*}
C_a=h_a((0,1)).
\end{align*}
The set $C_a$ is an open interval, and $G^k:C_a\to(0,1)$ is a bijection with inverse $h_a$. The endpoints of all cylinders form a [countable set](/page/Countable%20Set), so they have $\mu_G$-measure zero.
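For illustration, consider the sample word $a=(1,2)$, chosen here only as an example, and assume the standard definition $G(x)=1/x-\lfloor 1/x\rfloor$ of the Gauss map. Composing the branches gives
\begin{align*}
h_{(1,2)}(y)=h_1(h_2(y))=\frac{1}{1+\frac{1}{2+y}}=\frac{2+y}{3+y},
\end{align*}
so that
\begin{align*}
C_{(1,2)}=\left(h_{(1,2)}(0),h_{(1,2)}(1)\right)=\left(\frac23,\frac34\right).
\end{align*}
Every $x$ in this interval has $1/x\in(\tfrac43,\tfrac32)$, hence $G(x)=1/x-1\in(\tfrac13,\tfrac12)=I_2$, which is consistent with $G^2$ mapping $C_{(1,2)}$ bijectively onto $(0,1)$.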
[/step]
[step:Prove that the Gauss measure is invariant under the Gauss map]
We prove that $G$ is $\mu_G$-preserving. Let $B\in\mathcal B((0,1))$ be a Borel set. Since the intervals $I_n$ are pairwise disjoint and cover $(0,1)$ up to the countable endpoint set $\{1/n:n\in\mathbb N\}$, which has $\mu_G$-measure zero, we have
\begin{align*}
\mu_G(G^{-1}(B))=\sum_{n=1}^{\infty}\mu_G(h_n(B)).
\end{align*}
For each $n\in\mathbb N$, use the substitution $x=h_n(y)=1/(n+y)$. The one-dimensional change-of-variables formula gives
\begin{align*}
d\mathcal L^1(x)=|h_n'(y)|\,d\mathcal L^1(y)=\frac{1}{(n+y)^2}\,d\mathcal L^1(y).
\end{align*}
Therefore
\begin{align*}
\mu_G(h_n(B))=\frac{1}{\log 2}\int_B \frac{1}{1+h_n(y)}\frac{1}{(n+y)^2}\,d\mathcal L^1(y).
\end{align*}
Since $1+h_n(y)=1+1/(n+y)=(n+y+1)/(n+y)$, this becomes
\begin{align*}
\mu_G(h_n(B))=\frac{1}{\log 2}\int_B \frac{1}{(n+y)(n+y+1)}\,d\mathcal L^1(y).
\end{align*}
The integrand is non-negative, so Tonelli's theorem permits summing the branch integrals inside the integral. The telescoping identity
\begin{align*}
\sum_{n=1}^{\infty}\frac{1}{(n+y)(n+y+1)}=\sum_{n=1}^{\infty}\left(\frac{1}{n+y}-\frac{1}{n+y+1}\right)=\frac{1}{1+y}
\end{align*}
holds for every $y\in(0,1)$. Hence
\begin{align*}
\mu_G(G^{-1}(B))=\frac{1}{\log 2}\int_B \frac{1}{1+y}\,d\mathcal L^1(y)=\mu_G(B).
\end{align*}
Thus $G$ is $\mu_G$-preserving.
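As a sanity check of this computation, take the sample set $B=(0,\tfrac12)$, an illustrative choice. Then $h_n(B)=\left(\frac{2}{2n+1},\frac1n\right)$, and integrating $\rho$ directly gives
\begin{align*}
\mu_G(h_n(B))=\frac{1}{\log 2}\left[\log\left(1+\frac1n\right)-\log\left(1+\frac{2}{2n+1}\right)\right]=\frac{1}{\log 2}\log\left(\frac{n+1}{n}\cdot\frac{2n+1}{2n+3}\right).
\end{align*}
The partial products telescope:
\begin{align*}
\prod_{n=1}^{N}\frac{n+1}{n}\cdot\frac{2n+1}{2n+3}=(N+1)\cdot\frac{3}{2N+3}\longrightarrow\frac32\quad(N\to\infty).
\end{align*}
Hence $\sum_{n}\mu_G(h_n(B))=\log(3/2)/\log 2$, which equals $\mu_G(B)=\frac{1}{\log 2}\int_0^{1/2}\frac{1}{1+y}\,d\mathcal L^1(y)$, as the invariance requires.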
[/step]
[step:Record the bounded distortion estimate for inverse branches]
We prove a uniform distortion estimate for the densities obtained by pushing $\mu_G|_{C_a}$ forward under $G^k$.
[claim:Uniform distortion for continued-fraction cylinders]
The constant $D=16$ has the following property: for every $k\in\mathbb N$, every word $a\in\mathbb N^k$, and all $y,z\in(0,1)$,
\begin{align*}
\frac{\rho(h_a(y))|h_a'(y)|}{\rho(h_a(z))|h_a'(z)|}\le D.
\end{align*}
[/claim]
[proof]
For every word $a=(a_1,\dots,a_k)$, the map $h_a$ is a fractional [linear map](/page/Linear%20Map) of the form
\begin{align*}
h_a(y)=\frac{p_k+p_{k-1}y}{q_k+q_{k-1}y}
\end{align*}
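for integers $p_k,p_{k-1},q_k,q_{k-1}$ with $q_k\ge q_{k-1}\ge 0$, $q_k\ge1$, and $|p_{k-1}q_k-p_kq_{k-1}|=1$. The last identity holds because each $h_n(y)=(0\cdot y+1)/(y+n)$ corresponds to a matrix of determinant $-1$, and determinants multiply under composition. Hence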
\begin{align*}
|h_a'(y)|=\frac{1}{(q_k+q_{k-1}y)^2}.
\end{align*}
Thus, for $y,z\in(0,1)$,
\begin{align*}
\frac{|h_a'(y)|}{|h_a'(z)|}=\left(\frac{q_k+q_{k-1}z}{q_k+q_{k-1}y}\right)^2\le \left(\frac{q_k+q_{k-1}}{q_k}\right)^2\le4.
\end{align*}
Since $h_a(y),h_a(z)\in(0,1)$, the Gauss density satisfies
\begin{align*}
\frac{\rho(h_a(y))}{\rho(h_a(z))}=\frac{1+h_a(z)}{1+h_a(y)}\le2.
\end{align*}
Also $\rho(z)/\rho(y)\le2$ for all $y,z\in(0,1)$. Combining the first two estimates gives
\begin{align*}
\frac{\rho(h_a(y))|h_a'(y)|}{\rho(h_a(z))|h_a'(z)|}\le8.
\end{align*}
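In particular the claim holds with $D=16$. The extra factor $2$ is kept because, when the weight $\rho(h_a(y))|h_a'(y)|$ is divided by $\rho(y)$ to form a density with respect to $\mu_G$, the ratio $\rho(z)/\rho(y)\le2$ enters as well, and $8\cdot2=16$.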
[/proof]
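To see how much room the constant leaves, consider the sample word $a=(1,2)$, an illustrative choice, for which $h_a(y)=(2+y)/(3+y)$, so that $p_2=2$, $p_1=1$, $q_2=3$, $q_1=1$. Then
\begin{align*}
|h_a'(y)|=\frac{1}{(3+y)^2},\qquad \rho(h_a(y))=\frac{1}{\log 2}\cdot\frac{3+y}{5+2y},\qquad \rho(h_a(y))|h_a'(y)|=\frac{1}{\log 2}\cdot\frac{1}{(3+y)(5+2y)}.
\end{align*}
This weight is decreasing in $y$, so the largest ratio is approached as $y\to0$ and $z\to1$:
\begin{align*}
\sup_{y,z\in(0,1)}\frac{\rho(h_a(y))|h_a'(y)|}{\rho(h_a(z))|h_a'(z)|}=\frac{(3+1)(5+2)}{3\cdot5}=\frac{28}{15},
\end{align*}
well below the general bound $8$.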
[/step]
[step:Compare pushed-forward conditional measures with the Gauss measure]
Fix $k\in\mathbb N$ and $a\in\mathbb N^k$. Define a probability measure $\nu_a$ on $((0,1),\mathcal B((0,1)))$ by
\begin{align*}
\nu_a(B)=\frac{\mu_G(C_a\cap G^{-k}(B))}{\mu_G(C_a)}
\end{align*}
for every $B\in\mathcal B((0,1))$.
Since $G^k:C_a\to(0,1)$ is a bijection with inverse $h_a$, we have $C_a\cap G^{-k}(B)=h_a(B)$. Under the change of variables $x=h_a(y)$, Lebesgue measure transforms as
\begin{align*}
d\mathcal L^1(x)=|h_a'(y)|\,d\mathcal L^1(y).
\end{align*}
Therefore
\begin{align*}
\nu_a(B)=\frac{\int_B \rho(h_a(y))|h_a'(y)|\,d\mathcal L^1(y)}{\int_0^1 \rho(h_a(y))|h_a'(y)|\,d\mathcal L^1(y)}.
\end{align*}
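Since $d\mu_G(y)=\rho(y)\,d\mathcal L^1(y)$, the Radon-Nikodym derivative $g_a=d\nu_a/d\mu_G$ exists and equals $g_a(y)=\rho(h_a(y))|h_a'(y)|/(\rho(y)\mu_G(C_a))$. By the distortion estimate from the previous step together with $\rho(z)/\rho(y)\le2$, we get $g_a(y)/g_a(z)\le16$ for all $y,z\in(0,1)$. Because $\nu_a$ and $\mu_G$ are both probability measures, $\int_0^1 g_a\,d\mu_G=1$, so $\inf g_a\le1\le\sup g_a$. Consequently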
\begin{align*}
\frac{1}{16}\le \frac{d\nu_a}{d\mu_G}(y)\le16
\end{align*}
for $\mu_G$-almost every $y\in(0,1)$. In particular, for every Borel set $B\subset(0,1)$,
\begin{align*}
\nu_a(B)\ge \frac{1}{16}\mu_G(B).
\end{align*}
[/step]
[guided]The point of introducing $\nu_a$ is to measure what a set looks like after zooming out from the cylinder $C_a$ by the map $G^k$. Since $G^k:C_a\to(0,1)$ is bijective with inverse $h_a$, every integral over $C_a$ can be rewritten as an integral over $(0,1)$.
For a Borel set $B\subset(0,1)$, the set $C_a\cap G^{-k}(B)$ is exactly $h_a(B)$ up to endpoints, and those endpoints have $\mu_G$-measure zero. Using the substitution $x=h_a(y)$, the one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) transforms by
\begin{align*}
d\mathcal L^1(x)=|h_a'(y)|\,d\mathcal L^1(y).
\end{align*}
Hence the numerator of $\nu_a(B)$ is
\begin{align*}
\mu_G(C_a\cap G^{-k}(B))=\int_B \rho(h_a(y))|h_a'(y)|\,d\mathcal L^1(y).
\end{align*}
The denominator is obtained by taking $B=(0,1)$:
\begin{align*}
\mu_G(C_a)=\int_0^1 \rho(h_a(y))|h_a'(y)|\,d\mathcal L^1(y).
\end{align*}
Thus $\nu_a$ has density proportional to the weight $y\mapsto \rho(h_a(y))|h_a'(y)|$.
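As a worked instance, take the sample word $a=(1,2)$, an illustrative choice, for which $h_a(y)=(2+y)/(3+y)$ and the weight is $\rho(h_a(y))|h_a'(y)|=\frac{1}{\log 2}\cdot\frac{1}{(3+y)(5+2y)}$. Partial fractions give
\begin{align*}
\mu_G(C_a)=\frac{1}{\log 2}\int_0^1\left(\frac{2}{5+2y}-\frac{1}{3+y}\right)d\mathcal L^1(y)=\frac{\log(7/5)-\log(4/3)}{\log 2}=\frac{\log(21/20)}{\log 2},
\end{align*}
which agrees with evaluating $\mu_G\left(\left(\tfrac23,\tfrac34\right)\right)=\frac{\log(7/4)-\log(5/3)}{\log 2}$ directly. Dividing the weight by $\rho(y)\mu_G(C_a)$ yields
\begin{align*}
\frac{d\nu_a}{d\mu_G}(y)=\frac{\log 2}{\log(21/20)}\cdot\frac{1+y}{(3+y)(5+2y)},
\end{align*}
which is approximately $0.95$ near $y=0$ and approximately $1.01$ near $y=1$. For this particular cylinder the density therefore stays close to $1$.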
The bounded distortion estimate says that this weight cannot vary by more than a fixed multiplicative factor on the whole interval $(0,1)$, independently of the cylinder. Since $\rho(y)$ itself varies by at most a factor of $2$ on $(0,1)$, normalising the weight produces a density with respect to $\mu_G$ bounded above and below by a universal constant. With the explicit constant used here, this gives
\begin{align*}
\frac{1}{16}\le \frac{d\nu_a}{d\mu_G}(y)\le16
\end{align*}
for $\mu_G$-almost every $y\in(0,1)$. Therefore every Borel set $B$ satisfies
\begin{align*}
\nu_a(B)\ge \frac{1}{16}\mu_G(B).
\end{align*}
This lower bound is the mechanism that converts almost-full measure inside one small cylinder into almost-full measure globally.[/guided]
[step:Choose shrinking cylinders around a density point of an invariant set]
Let $A\in\mathcal B((0,1))$ satisfy
\begin{align*}
\mu_G(G^{-1}(A)\triangle A)=0.
\end{align*}
Assume first that $\mu_G(A)>0$. By the [Lebesgue density theorem](/theorems/894) for finite Borel measures on intervals, applied to the measure $\mu_G$ whose density is bounded above and below by positive constants with respect to $\mathcal L^1$, there exists a point $x\in A$ which is not a cylinder endpoint and such that, for every sequence of intervals $J_m\subset(0,1)$ containing $x$ with lengths tending to $0$,
\begin{align*}
\frac{\mu_G(A\cap J_m)}{\mu_G(J_m)}\to1.
\end{align*}
Here we are citing a result not yet in the wiki: the Lebesgue density theorem for finite Borel measures with locally bounded positive density.
For each $k\in\mathbb N$, let $C_k(x)$ be the unique $k$-cylinder containing $x$; it is well defined because $x$ is not a cylinder endpoint. The intervals $C_k(x)$ are nested, and since the continued-fraction cylinders separate all points outside the countable endpoint set, their lengths tend to $0$, so they shrink to $x$. Therefore
\begin{align*}
\frac{\mu_G(A\cap C_k(x))}{\mu_G(C_k(x))}\to1.
\end{align*}
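The shrinking can be quantified. The following sketch relies on the standard continued-fraction recurrence $q_k=a_kq_{k-1}+q_{k-2}$ with $q_0=1$, $q_{-1}=0$, which is assumed here rather than proved in this section. Writing $h_a(y)=(p_k+p_{k-1}y)/(q_k+q_{k-1}y)$ with $|p_{k-1}q_k-p_kq_{k-1}|=1$, one finds
\begin{align*}
\mathcal L^1(C_a)=|h_a(1)-h_a(0)|=\left|\frac{p_k+p_{k-1}}{q_k+q_{k-1}}-\frac{p_k}{q_k}\right|=\frac{1}{q_k(q_k+q_{k-1})}\le\frac{1}{F_{k+1}F_{k+2}},
\end{align*}
where $F_1=F_2=1$, $F_{j+2}=F_{j+1}+F_j$ are the Fibonacci numbers and the inequality uses $q_k\ge F_{k+1}$. For the sample point $x=(\sqrt5-1)/2=[1,1,1,\dots]$ the bound is attained:
\begin{align*}
C_1(x)=\left(\frac12,1\right),\qquad C_2(x)=\left(\frac12,\frac23\right),\qquad C_3(x)=\left(\frac35,\frac23\right),
\end{align*}
with lengths $\tfrac12$, $\tfrac16$, $\tfrac1{15}$.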
[/step]
[step:Use invariance to transfer local density to the whole interval]
For each $k\in\mathbb N$, write $C_k(x)=C_{a(k)}$ for the unique word $a(k)\in\mathbb N^k$ defining that cylinder. Since $A$ is invariant modulo $\mu_G$,
\begin{align*}
\mu_G(A\triangle G^{-1}(A))=0.
\end{align*}
The measure-preserving property proved above implies that preimages of $\mu_G$-null sets are $\mu_G$-null. We prove by induction that
\begin{align*}
\mu_G(A\triangle G^{-k}(A))=0
\end{align*}
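for every $k\in\mathbb N$. The case $k=1$ is the assumed invariance. If it holds for $k=m$, then the inclusion $A\triangle C\subset(A\triangle B)\cup(B\triangle C)$ with $B=G^{-1}(A)$ and $C=G^{-(m+1)}(A)$, together with the identity $G^{-1}(A)\triangle G^{-1}(E)=G^{-1}(A\triangle E)$, gives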
\begin{align*}
A\triangle G^{-(m+1)}(A)\subset (A\triangle G^{-1}(A))\cup G^{-1}(A\triangle G^{-m}(A)).
\end{align*}
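Both sets on the right have $\mu_G$-measure zero (the second because $G$ preserves $\mu_G$), so the induction closes. Hence, since $A$ and $G^{-k}(A)$ differ by a $\mu_G$-null set,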
\begin{align*}
\frac{\mu_G(A\cap C_{a(k)})}{\mu_G(C_{a(k)})}=\frac{\mu_G(C_{a(k)}\cap G^{-k}(A))}{\mu_G(C_{a(k)})}=\nu_{a(k)}(A).
\end{align*}
The previous step gives $\nu_{a(k)}(A)\to1$, so $\nu_{a(k)}(A^c)\to0$. The lower comparison estimate gives
\begin{align*}
\nu_{a(k)}(A^c)\ge \frac{1}{16}\mu_G(A^c).
\end{align*}
Letting $k\to\infty$ yields $\mu_G(A^c)=0$. Thus $\mu_G(A)=1$ whenever $\mu_G(A)>0$.
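To see the quantitative content of this step, suppose hypothetically that $\mu_G(A^c)=\tfrac12$. The comparison estimate would then force
\begin{align*}
\nu_{a(k)}(A)=1-\nu_{a(k)}(A^c)\le1-\frac{1}{16}\cdot\frac12=\frac{31}{32}
\end{align*}
for every $k\in\mathbb N$, which is incompatible with $\nu_{a(k)}(A)\to1$. The same computation with any positive value of $\mu_G(A^c)$ gives a bound on $\nu_{a(k)}(A)$ that is uniform in $k$ and strictly below $1$.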
[/step]
[step:Conclude that every invariant set has measure zero or one]
If $\mu_G(A)=0$, there is nothing to prove. If $\mu_G(A)>0$, the previous step proves $\mu_G(A)=1$. Therefore every Borel set $A\subset(0,1)$ satisfying $\mu_G(G^{-1}(A)\triangle A)=0$ has $\mu_G(A)\in\{0,1\}$. This is precisely ergodicity of the measure-preserving system $((0,1),\mathcal B((0,1)),\mu_G,G)$.
[/step]