[proofplan]
We first disintegrate $\mu$ over the invariant $\sigma$-algebra $\mathcal{I}$. This gives conditional measures $\mu_x$ whose integrals represent conditional expectations given $\mathcal{I}$. The barycenter identity is the case $A=X$ of the defining disintegration identity. Invariance of almost every component is checked on a countable generating $\pi$-system. For ergodicity, we apply Birkhoff's theorem to a countable family of bounded test functions that is dense in $L^1$ of every Borel probability measure; after disintegrating the exceptional null sets, almost every component has the property that the time averages of every test function converge to its component integral. Approximating the indicator of any component-invariant set by these test functions forces that indicator to be almost everywhere constant with respect to the component. Uniqueness follows as for regular conditional probabilities: two kernels agree almost everywhere on a countable generating $\pi$-system, hence as measures.
[/proofplan]
[step:Disintegrate over the invariant sigma-algebra]
By the [Existence of Regular Conditional Distributions](/theorems/972), applied to the probability space $(X,\mathcal{B},\mu)$ and the sub-$\sigma$-algebra $\mathcal{I}$, there is a regular conditional probability kernel
\begin{align*}
x\longmapsto \mu_x
\end{align*}
such that, for every $B\in\mathcal{B}$, the map
\begin{align*}
h_B:X&\to[0,1]\\
x&\mapsto \mu_x(B)
\end{align*}
is $\mathcal{I}$-measurable and satisfies
\begin{align*}
h_B=\mathbb{E}_{\mu}[\mathbb{1}_B\mid\mathcal{I}]
\quad\text{in }L^1(X,\mathcal{B},\mu).
\end{align*}
Equivalently, for every $A\in\mathcal{I}$ and every $B\in\mathcal{B}$,
\begin{align*}
\int_A \mu_x(B)\,d\mu(x)=\mu(A\cap B).
\end{align*}
By the monotone class theorem, the identity extends from indicators to bounded functions: for every bounded $\mathcal{B}$-measurable $f:X\to\mathbb{R}$, the map
\begin{align*}
x\mapsto\int_X f(y)\,d\mu_x(y)
\end{align*}
is an $\mathcal{I}$-measurable representative of $\mathbb{E}_{\mu}[f\mid\mathcal{I}]$.
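To make the extension concrete, here is a sketch of its first stage, for a simple function; the symbols $c_i$, $B_i$ and $m$ are introduced only for this illustration. If $f=\sum_{i=1}^{m}c_i\mathbb{1}_{B_i}$ with $c_i\in\mathbb{R}$ and $B_i\in\mathcal{B}$, then for every $A\in\mathcal{I}$, linearity gives
\begin{align*}
\int_A\left(\int_X f(y)\,d\mu_x(y)\right)d\mu(x)
&=\sum_{i=1}^{m}c_i\int_A\mu_x(B_i)\,d\mu(x)\\
&=\sum_{i=1}^{m}c_i\,\mu(A\cap B_i)\\
&=\int_A f\,d\mu.
\end{align*}
Bounded measurable functions are then reached through bounded monotone limits, which is the part handled by the monotone class theorem.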
[/step]
[step:Recover the original measure as the barycenter]
Taking $A=X$ in the preceding identity gives, for every $B\in\mathcal{B}$,
\begin{align*}
\int_X \mu_x(B)\,d\mu(x)=\mu(B).
\end{align*}
Equivalently, for every bounded $\mathcal{B}$-measurable $f:X\to\mathbb{R}$,
\begin{align*}
\int_X f(y)\,d\mu(y)
=
\int_X\left(\int_X f(y)\,d\mu_x(y)\right)d\mu(x).
\end{align*}
Thus $\mu$ is the barycenter of the conditional probability measures $\mu_x$.
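As a sanity check, consider a toy system that is not part of the argument; the space, the map and the parameter $p$ below are assumptions made only for illustration. Let $X=\{1,2,3,4\}$ with $T$ exchanging $1\leftrightarrow2$ and $3\leftrightarrow4$, and for $0<p<1$ let $\mu(\{1\})=\mu(\{2\})=p/2$ and $\mu(\{3\})=\mu(\{4\})=(1-p)/2$. Then $\mathcal{I}=\{\emptyset,\{1,2\},\{3,4\},X\}$, and one choice of kernel is
\begin{align*}
\mu_x=
\begin{cases}
\tfrac12(\delta_1+\delta_2), & x\in\{1,2\},\\
\tfrac12(\delta_3+\delta_4), & x\in\{3,4\}.
\end{cases}
\end{align*}
For every $B\subseteq X$,
\begin{align*}
\int_X\mu_x(B)\,d\mu(x)
&=p\cdot\frac{|B\cap\{1,2\}|}{2}+(1-p)\cdot\frac{|B\cap\{3,4\}|}{2}\\
&=\mu(B\cap\{1,2\})+\mu(B\cap\{3,4\})\\
&=\mu(B),
\end{align*}
so in this example $\mu$ is the mixture of the two orbit measures with weights $p$ and $1-p$.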
[/step]
[step:Show that almost every component is invariant]
Because $X$ is standard Borel, there is a countable $\pi$-system $\mathcal{C}$ generating $\mathcal{B}$. Fix $B\in\mathcal{C}$ and define
\begin{align*}
g_B:X&\to\mathbb{R}\\
y&\mapsto \mathbb{1}_B(Ty)-\mathbb{1}_B(y).
\end{align*}
For every $A\in\mathcal{I}$,
\begin{align*}
\int_A g_B\,d\mu
&=
\mu(A\cap T^{-1}B)-\mu(A\cap B)\\
&=
\mu(T^{-1}A\cap T^{-1}B)-\mu(A\cap B)\\
&=
\mu(T^{-1}(A\cap B))-\mu(A\cap B)\\
&=0,
\end{align*}
because $T^{-1}A=A$ and $\mu$ is $T$-invariant. Hence
\begin{align*}
\mathbb{E}_{\mu}[g_B\mid\mathcal{I}]=0.
\end{align*}
Using the conditional-measure representation of this conditional expectation,
\begin{align*}
0
=
\int_X g_B(y)\,d\mu_x(y)
=
\mu_x(T^{-1}B)-\mu_x(B)
\end{align*}
for $\mu$-almost every $x$.
Intersect the resulting full-measure sets over the countable family $\mathcal{C}$. For every $x$ in the intersection,
\begin{align*}
\mu_x(T^{-1}B)=\mu_x(B)
\end{align*}
for every $B\in\mathcal{C}$. Since $B\mapsto\mu_x(B)$ and $B\mapsto\mu_x(T^{-1}B)$ are both probability measures on $\mathcal{B}$ and $\mathcal{C}$ is a generating $\pi$-system, the equality extends to every $B\in\mathcal{B}$. Thus $\mu_x$ is $T$-invariant for $\mu$-almost every $x$.
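The extension from $\mathcal{C}$ to $\mathcal{B}$ can be spelled out as a Dynkin class argument; the notation $\mathcal{D}_x$ is introduced here only for this sketch. Fix $x$ in the intersection and put $\mathcal{D}_x=\{B\in\mathcal{B}:\mu_x(T^{-1}B)=\mu_x(B)\}$. Then $X\in\mathcal{D}_x$, and for $B\in\mathcal{D}_x$ and pairwise disjoint $B_1,B_2,\ldots\in\mathcal{D}_x$,
\begin{align*}
\mu_x(T^{-1}(X\setminus B))&=1-\mu_x(T^{-1}B)=1-\mu_x(B)=\mu_x(X\setminus B),\\
\mu_x\Bigl(T^{-1}\bigcup_{k}B_k\Bigr)&=\sum_{k}\mu_x(T^{-1}B_k)=\sum_{k}\mu_x(B_k)=\mu_x\Bigl(\bigcup_{k}B_k\Bigr).
\end{align*}
Hence $\mathcal{D}_x$ is a $\lambda$-system containing the $\pi$-system $\mathcal{C}$, and the $\pi$-$\lambda$ theorem gives $\mathcal{D}_x=\mathcal{B}$.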
[/step]
[step:Build a countable family of component-generic test functions]
Choose a countable Boolean algebra $\mathcal{A}$ generating $\mathcal{B}$, and let $\mathcal{S}$ be the countable set of all rational-valued simple functions whose level sets belong to $\mathcal{A}$; every such function is bounded. For every Borel probability measure $\nu$ on $X$, each set in $\mathcal{B}$ can be approximated in $\nu$-measure by sets in $\mathcal{A}$, so $\mathcal{S}$ is dense in $L^1(X,\mathcal{B},\nu)$.
Fix $s\in\mathcal{S}$. Since $s$ is bounded, and hence $\mu$-integrable, the [Birkhoff Ergodic Theorem](/theorems/518) gives $\mu$-almost-everywhere convergence of
\begin{align*}
A_Ns(y):=\frac{1}{N}\sum_{n=0}^{N-1}s(T^n y).
\end{align*}
The limit is the conditional expectation onto the invariant $\sigma$-algebra; this identification follows from the [Limit Is Conditional Expectation onto Invariant Sigma-Algebra](/theorems/3449), because $s\in L^2(X,\mathcal{B},\mu)$. Hence
\begin{align*}
A_Ns(y)\longrightarrow F_s(y)
\quad\text{for }\mu\text{-almost every }y,
\end{align*}
where $F_s$ is an $\mathcal{I}$-measurable representative of $\mathbb{E}_{\mu}[s\mid\mathcal{I}]$.
Let $N_s\in\mathcal{B}$ be a $\mu$-null set outside which this convergence holds. Since
\begin{align*}
0=\mu(N_s)=\int_X \mu_x(N_s)\,d\mu(x),
\end{align*}
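and the integrand is nonnegative, we have $\mu_x(N_s)=0$ for $\mu$-almost every $x$. Also, because $F_s$ is $\mathcal{I}$-measurable, its sublevel sets $\{F_s\leq q\}$ with $q\in\mathbb{Q}$ belong to $\mathcal{I}$; applying the conditional-measure identity to these countably many sets gives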
\begin{align*}
F_s(y)=F_s(x)
\quad\text{for }\mu_x\text{-almost every }y
\end{align*}
for $\mu$-almost every $x$. Finally,
\begin{align*}
F_s(x)=\int_X s(y)\,d\mu_x(y)
\end{align*}
for $\mu$-almost every $x$, by the conditional-expectation representation from the first step.
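One way to unpack the claim that $F_s$ is constant on almost every component is the following sketch; the sets $A_q$ are notation introduced only here. For $q\in\mathbb{Q}$ put $A_q=\{F_s\leq q\}\in\mathcal{I}$. Since $\mathbb{1}_{A_q}$ is $\mathcal{I}$-measurable, it is its own conditional expectation, so
\begin{align*}
\mu_x(A_q)=\mathbb{1}_{A_q}(x)
\quad\text{for }\mu\text{-almost every }x.
\end{align*}
Discarding a $\mu$-null set, this holds for all $q\in\mathbb{Q}$ simultaneously. For such $x$ and rationals $q<F_s(x)\leq q'$,
\begin{align*}
\mu_x(\{q<F_s\leq q'\})=\mu_x(A_{q'})-\mu_x(A_q)=1-0=1,
\end{align*}
and letting $q\uparrow F_s(x)$ and $q'\downarrow F_s(x)$ along rationals gives $\mu_x(\{F_s=F_s(x)\})=1$.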
Intersect these full-measure sets over the countable family $\mathcal{S}$. For every $x$ in the resulting set $X_{\mathrm{gen}}$, and every $s\in\mathcal{S}$,
\begin{align*}
A_Ns(y)\longrightarrow \int_X s\,d\mu_x
\quad\text{for }\mu_x\text{-almost every }y.
\end{align*}
Since $s$ is bounded, dominated convergence also gives
\begin{align*}
\left\|A_Ns-\int_X s\,d\mu_x\right\|_{L^1(\mu_x)}\longrightarrow 0.
\end{align*}
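The domination used here can be made explicit. Writing $\|s\|_\infty=\max_{y\in X}|s(y)|$, a notation introduced only for this remark and finite because $s$ takes finitely many values, one has for every $N\geq1$ and every $y\in X$
\begin{align*}
\left|A_Ns(y)-\int_X s\,d\mu_x\right|
\leq \frac{1}{N}\sum_{n=0}^{N-1}|s(T^ny)|+\int_X|s|\,d\mu_x
\leq 2\|s\|_\infty,
\end{align*}
and the constant $2\|s\|_\infty$ is $\mu_x$-integrable because $\mu_x$ is a probability measure.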
[/step]
[step:Use the component-generic test functions to prove ergodicity]
Let $x$ belong to the full-measure set where $\mu_x$ is invariant and where the conclusion of the preceding step holds for every $s\in\mathcal{S}$. We prove that $\mu_x$ is ergodic.
Let $B\in\mathcal{B}$ satisfy
\begin{align*}
\mu_x(T^{-1}B\triangle B)=0.
\end{align*}
Since $\mu_x$ is $T$-invariant, induction gives
\begin{align*}
\mu_x(T^{-n}B\triangle B)=0
\end{align*}
for every $n\geq0$. Therefore, outside a $\mu_x$-null set,
\begin{align*}
A_N\mathbb{1}_B(y)=\mathbb{1}_B(y)
\end{align*}
for every $N\geq1$.
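In more detail, the sketch below fills in the induction and the passage to averages; the set $E$ is notation introduced only for this purpose. For $n\geq0$,
\begin{align*}
T^{-(n+1)}B\triangle B
\subseteq
T^{-1}\bigl(T^{-n}B\triangle B\bigr)\cup\bigl(T^{-1}B\triangle B\bigr),
\end{align*}
and both sets on the right are $\mu_x$-null when $\mu_x(T^{-n}B\triangle B)=0$, the first by $T$-invariance of $\mu_x$. Put $E=\bigcup_{n\geq0}(T^{-n}B\triangle B)$, a $\mu_x$-null set. For $y\notin E$ we have $\mathbb{1}_B(T^ny)=\mathbb{1}_B(y)$ for every $n\geq0$, so
\begin{align*}
A_N\mathbb{1}_B(y)=\frac{1}{N}\sum_{n=0}^{N-1}\mathbb{1}_B(T^ny)=\mathbb{1}_B(y).
\end{align*}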
Let $\varepsilon>0$. By density of $\mathcal{S}$ in $L^1(\mu_x)$, choose $s\in\mathcal{S}$ such that
\begin{align*}
\|\mathbb{1}_B-s\|_{L^1(\mu_x)}<\varepsilon.
\end{align*}
For every $N$,
\begin{align*}
\|A_N\mathbb{1}_B-A_Ns\|_{L^1(\mu_x)}
&\leq
\frac{1}{N}\sum_{n=0}^{N-1}
\int_X |\mathbb{1}_B(T^n y)-s(T^n y)|\,d\mu_x(y)\\
&=
\|\mathbb{1}_B-s\|_{L^1(\mu_x)}\\
&<\varepsilon,
\end{align*}
where the equality uses $T$-invariance of $\mu_x$. Since $A_N\mathbb{1}_B=\mathbb{1}_B$ $\mu_x$-almost everywhere, letting $N\to\infty$ and using the $L^1(\mu_x)$ convergence for $s$ gives
\begin{align*}
\left\|\mathbb{1}_B-\int_X s\,d\mu_x\right\|_{L^1(\mu_x)}
\leq \varepsilon.
\end{align*}
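The limit step can be written out as a three-term estimate, sketched here under the standing assumptions of this step. For every $N\geq1$,
\begin{align*}
\left\|\mathbb{1}_B-\int_X s\,d\mu_x\right\|_{L^1(\mu_x)}
&\leq
\|\mathbb{1}_B-A_N\mathbb{1}_B\|_{L^1(\mu_x)}
+\|A_N\mathbb{1}_B-A_Ns\|_{L^1(\mu_x)}
+\left\|A_Ns-\int_X s\,d\mu_x\right\|_{L^1(\mu_x)}\\
&<
0+\varepsilon+\left\|A_Ns-\int_X s\,d\mu_x\right\|_{L^1(\mu_x)},
\end{align*}
where the first term vanishes because $A_N\mathbb{1}_B=\mathbb{1}_B$ outside a $\mu_x$-null set. The last term tends to $0$ as $N\to\infty$, which leaves the bound $\varepsilon$.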
Also
\begin{align*}
\left|\int_X s\,d\mu_x-\mu_x(B)\right|
\leq
\|\mathbb{1}_B-s\|_{L^1(\mu_x)}
<\varepsilon.
\end{align*}
Hence
\begin{align*}
\|\mathbb{1}_B-\mu_x(B)\|_{L^1(\mu_x)}\leq 2\varepsilon.
\end{align*}
Since $\varepsilon>0$ was arbitrary,
\begin{align*}
\mathbb{1}_B=\mu_x(B)
\quad\text{in }L^1(\mu_x).
\end{align*}
The left side takes only the values $0$ and $1$, so the constant $\mu_x(B)$ must be either $0$ or $1$. Therefore every Borel set that is $T$-invariant modulo $\mu_x$-null sets has $\mu_x$-measure $0$ or $1$, and $\mu_x$ is ergodic.
[/step]
[step:Prove uniqueness of the kernel]
Let $x\mapsto\nu_x$ be another kernel satisfying the first condition of the theorem. For every fixed $B\in\mathcal{B}$, both functions
\begin{align*}
x\mapsto \mu_x(B),
\qquad
x\mapsto \nu_x(B)
\end{align*}
represent $\mathbb{E}_{\mu}[\mathbb{1}_B\mid\mathcal{I}]$, so they agree for $\mu$-almost every $x$.
Choose a countable $\pi$-system $\mathcal{C}$ generating $\mathcal{B}$. Intersect the corresponding full-measure sets over $B\in\mathcal{C}$. On the resulting full-measure set,
\begin{align*}
\mu_x(B)=\nu_x(B)
\end{align*}
for every $B\in\mathcal{C}$. Since probability measures that agree on a generating $\pi$-system agree on the generated $\sigma$-algebra, $\mu_x=\nu_x$ for $\mu$-almost every $x$.
[/step]
[step:Record the compact metrizable version]
If $X$ is compact metrizable and $T$ is continuous, then the space of Borel probability measures on $X$, equipped with the weak-$*$ topology, has its Borel $\sigma$-algebra generated by the evaluation maps
\begin{align*}
\eta\mapsto \eta(B),
\qquad B\in\mathcal{B}.
\end{align*}
The defining measurability of the regular conditional probability kernel therefore makes $x\mapsto\mu_x$ measurable as a map into that space. The invariance and ergodicity steps show $\mu_x\in\mathcal{M}_T$ and $\mu_x$ is ergodic for $\mu$-almost every $x$. Applying the barycenter identity to continuous $f:X\to\mathbb{R}$ gives the stated compact metrizable form.
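Written out, and assuming the compact metrizable statement is the barycenter identity tested against continuous functions, this reads as follows. Every continuous $f:X\to\mathbb{R}$ on the compact space $X$ is bounded and Borel, so the identity from the barycenter step applies:
\begin{align*}
\int_X f\,d\mu
=
\int_X\left(\int_X f\,d\mu_x\right)d\mu(x)
\qquad\text{for every continuous }f:X\to\mathbb{R}.
\end{align*}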
[/step]