[proofplan]
The proof rewrites the correlation using the defining duality of the transfer operator, so that the problem becomes estimating $\mathcal L^n f-\Pi f$. For $n\ge 1$, the spectral gap decomposition gives $\mathcal L^n f=\Pi f+N^n f$, and the projection term, paired with $g$, is exactly the product of the two integrals. The spectral radius bound gives exponential decay of $N^n$ in the $\mathcal B_0$ norm, and the continuous embedding $\mathcal B_0\hookrightarrow L^1(X,\mu)$ converts this into an $L^1$ bound that can be paired with $g\in L^\infty(X,\mu)$. The case $n=0$ is handled separately by a direct estimate.
[/proofplan]
[step:Rewrite the correlation through the transfer operator]
Fix $f\in\mathcal B_0$, fix $g\in L^\infty(X,\mathcal B,\mu)$, and fix an integer $n\ge 0$. Define the measurable map
\begin{align*}
T^n:X\to X
\end{align*}
to be the $n$-fold iterate of $T$, with $T^0=\operatorname{id}_X$. Since $T$ is probability-preserving, $g\circ T^n\in L^\infty(X,\mathcal B,\mu)$ and
\begin{align*}
\|g\circ T^n\|_{L^\infty(X,\mu)}\le \|g\|_{L^\infty(X,\mu)}.
\end{align*}
We claim that
\begin{align*}
\int_X f(g\circ T^n)\,d\mu(x)=\int_X (\mathcal L^n f)g\,d\mu(x).
\end{align*}
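For $n=0$, both sides equal $\int_X fg\,d\mu(x)$, because $\mathcal L^0=\operatorname{Id}_{\mathcal B_0}$ and $T^0=\operatorname{id}_X$. For $n\ge 1$, start from the right-hand side and apply the transfer-operator duality $n$ times, first with $u=\mathcal L^{n-1}f$ and $h=g$, then with $u=\mathcal L^{n-2}f$ and $h=g\circ T$, and so on until $u=f$ and $h=g\circ T^{n-1}$. Each application removes one factor of $\mathcal L$ from $f$ and composes the observable with $T$ once more, which gives the displayed identity.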
[guided]
We first translate the dynamical expression into an operator expression. Fix $f\in\mathcal B_0$, $g\in L^\infty(X,\mathcal B,\mu)$, and an integer $n\ge 0$. The map
\begin{align*}
T^n:X\to X
\end{align*}
is the $n$-fold iterate of $T$, with $T^0=\operatorname{id}_X$. Because $T$ is measurable, $g\circ T^n$ is measurable. Because $T$ is probability-preserving, composition with $T^n$ does not increase the essential supremum: if $M=\|g\|_{L^\infty(X,\mu)}$, then $|g|\le M$ $\mu$-a.e.; the exceptional set has measure zero, and its preimage under $T^n$ also has measure zero. Hence
\begin{align*}
\|g\circ T^n\|_{L^\infty(X,\mu)}\le \|g\|_{L^\infty(X,\mu)}.
\end{align*}
The defining property of the transfer operator says that for every $u\in\mathcal B_0$ and every $h\in L^\infty(X,\mathcal B,\mu)$,
\begin{align*}
\int_X (\mathcal L u)h\,d\mu(x)=\int_X u(h\circ T)\,d\mu(x).
\end{align*}
We apply this identity repeatedly. For $n=0$, there is nothing to prove because $\mathcal L^0=\operatorname{Id}_{\mathcal B_0}$ and $T^0=\operatorname{id}_X$. Suppose $n\ge 1$. Applying the defining identity with $u=\mathcal L^{n-1}f$ and $h=g$ gives
\begin{align*}
\int_X (\mathcal L^n f)g\,d\mu(x)=\int_X (\mathcal L^{n-1}f)(g\circ T)\,d\mu(x).
\end{align*}
Applying the same identity again with $u=\mathcal L^{n-2}f$ and $h=g\circ T$ gives
\begin{align*}
\int_X (\mathcal L^{n-1}f)(g\circ T)\,d\mu(x)=\int_X (\mathcal L^{n-2}f)(g\circ T^2)\,d\mu(x).
\end{align*}
Iterating this argument $n$ times yields
\begin{align*}
\int_X (\mathcal L^n f)g\,d\mu(x)=\int_X f(g\circ T^n)\,d\mu(x).
\end{align*}
This is the key conversion: decay of correlations will follow once we can estimate $\mathcal L^n f-\Pi f$ in a norm that pairs with $g$.
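As an illustrative check, not part of the argument, consider the doubling map $T(x)=2x \bmod 1$ on $X=[0,1)$ with Lebesgue measure $\mu$. This example is an assumption introduced here for illustration only; its transfer operator is given by the standard formula $(\mathcal L u)(x)=\tfrac12\left(u(x/2)+u((x+1)/2)\right)$. Taking $f(x)=g(x)=x$ and $n=1$, we get $(\mathcal L f)(x)=x/2+1/4$, and both sides of the duality identity can be computed directly:
\begin{align*}
\int_X f(g\circ T)\,d\mu(x)&=\int_0^{1/2}2x^2\,dx+\int_{1/2}^{1}x(2x-1)\,dx=\frac{1}{12}+\frac{5}{24}=\frac{7}{24},\\
\int_X (\mathcal L f)g\,d\mu(x)&=\int_0^1\left(\frac{x^2}{2}+\frac{x}{4}\right)dx=\frac16+\frac18=\frac{7}{24}.
\end{align*}
The two values agree, as the identity predicts for $n=1$.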
[/guided]
[/step]
[step:Separate the invariant projection from the decaying remainder]
Since $\mathcal L=\Pi+N$ and $\Pi N=N\Pi=0$, every mixed product in the expansion of $(\Pi+N)^n$ contains a factor $\Pi N$ or $N\Pi$ and therefore vanishes. Only the two pure powers survive, so
\begin{align*}
\mathcal L^n=(\Pi+N)^n=\Pi^n+N^n
\end{align*}
for every integer $n\ge 1$. The operator $\Pi$ is a projection, since for every $u\in\mathcal B_0$,
\begin{align*}
\Pi^2u=\Pi\left(\left(\int_X u\,d\mu(x)\right)1_X\right)=\left(\int_X u\,d\mu(x)\right)\left(\int_X 1_X\,d\mu(x)\right)1_X=\Pi u,
\end{align*}
using $\mu(X)=1$. Hence $\Pi^n=\Pi$ for every $n\ge 1$, and therefore
\begin{align*}
\mathcal L^n f=\Pi f+N^n f
\end{align*}
for every $n\ge 1$. For $n=0$ this identity fails under the usual convention $N^0=\operatorname{Id}_{\mathcal B_0}$, since $\mathcal L^0 f=f$ while $\Pi f+N^0 f=\Pi f+f$. The case $n=0$ is therefore treated separately and absorbed into the constant in the final estimate.
Since $\Pi f=\left(\int_X f\,d\mu(x)\right)1_X$, the projection term satisfies
\begin{align*}
\int_X (\Pi f)g\,d\mu(x)=\left(\int_X f\,d\mu(x)\right)\left(\int_X g\,d\mu(x)\right).
\end{align*}
Thus
\begin{align*}
\int_X f(g\circ T^n)\,d\mu(x)-\left(\int_X f\,d\mu(x)\right)\left(\int_X g\,d\mu(x)\right)=\int_X (N^n f)g\,d\mu(x)
\end{align*}
for every $n\ge 1$.
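The identity $\mathcal L^n=\Pi+N^n$ used above can also be verified by a short induction, sketched here using only $\Pi^2=\Pi$ and $\Pi N=N\Pi=0$. The base case $n=1$ is the decomposition $\mathcal L=\Pi+N$. If the identity holds for some $n\ge 1$, then
\begin{align*}
\mathcal L^{n+1}=(\Pi+N^n)(\Pi+N)=\Pi^2+\Pi N+N^{n}\Pi+N^{n+1}=\Pi+N^{n+1},
\end{align*}
because $\Pi N=0$ and $N^{n}\Pi=N^{n-1}(N\Pi)=0$.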
[/step]
[step:Convert the spectral radius gap into an operator norm estimate]
Choose $\rho$ such that $r(N)<\rho<1$. By the spectral radius formula (Gelfand's formula),
\begin{align*}
\lim_{m\to\infty}\|N^m\|_{\mathcal L(\mathcal B_0)}^{1/m}=r(N).
\end{align*}
Since $r(N)<\rho$, the limit yields an integer $m_0\ge 1$ such that $\|N^m\|_{\mathcal L(\mathcal B_0)}^{1/m}\le\rho$, that is,
\begin{align*}
\|N^m\|_{\mathcal L(\mathcal B_0)}\le \rho^m
\end{align*}
for every integer $m\ge m_0$. Define
\begin{align*}
C_0=\max\left\{1,\rho^{-m}\|N^m\|_{\mathcal L(\mathcal B_0)}:0\le m<m_0\right\}.
\end{align*}
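Then $C_0\ge 1$ is finite, being a maximum of finitely many numbers. For $m<m_0$ the bound below holds by the definition of $C_0$, and for $m\ge m_0$ it follows from the previous display because $C_0\ge 1$. Hence, for every integer $m\ge 0$,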
\begin{align*}
\|N^m\|_{\mathcal L(\mathcal B_0)}\le C_0\rho^m.
\end{align*}
Consequently, for every $u\in\mathcal B_0$ and every integer $m\ge 0$,
\begin{align*}
\|N^m u\|_{\mathcal B_0}\le C_0\rho^m\|u\|_{\mathcal B_0}.
\end{align*}
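To see why $\rho$ must be chosen strictly larger than $r(N)$ and why a constant $C_0$ is needed, consider the following hypothetical finite-dimensional example, which is not taken from the setting of the theorem. On $\mathbb R^2$ with the maximum norm, let
\begin{align*}
N=\begin{pmatrix}\tfrac12 & 1\\ 0 & \tfrac12\end{pmatrix},\qquad N^m=\begin{pmatrix}2^{-m} & m\,2^{1-m}\\ 0 & 2^{-m}\end{pmatrix},\qquad \|N^m\|=2^{-m}(1+2m),
\end{align*}
where the operator norm is the maximal absolute row sum. Here $r(N)=\tfrac12$, but $2^{m}\|N^m\|=1+2m$ is unbounded, so no bound of the form $C_0\,r(N)^m$ holds. With $\rho=\tfrac34$ one gets instead
\begin{align*}
\rho^{-m}\|N^m\|=\left(\tfrac23\right)^m(1+2m),
\end{align*}
which takes the values $1,\ 2,\ \tfrac{20}{9},\ \tfrac{56}{27},\dots$ for $m=0,1,2,3,\dots$, is decreasing for $m\ge 2$, and drops below $1$ from $m=7$ onward. In the notation of this step, $m_0=7$ and $C_0=\tfrac{20}{9}$, so $\|N^m\|\le\tfrac{20}{9}\left(\tfrac34\right)^m$ for every $m\ge 0$.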
[/step]
[step:Estimate the remainder against the bounded observable]
For every integer $n\ge 1$, the $L^1$-$L^\infty$ inequality applied to $N^n f\in L^1(X,\mu)$ and $g\in L^\infty(X,\mu)$ gives
\begin{align*}
\left|\int_X (N^n f)g\,d\mu(x)\right|\le \|N^n f\|_{L^1(X,\mu)}\|g\|_{L^\infty(X,\mu)}.
\end{align*}
Using the continuous embedding $\mathcal B_0\hookrightarrow L^1(X,\mu)$ and the operator norm estimate for $N^n$, we obtain
\begin{align*}
\|N^n f\|_{L^1(X,\mu)}\le C_1\|N^n f\|_{\mathcal B_0}\le C_1C_0\rho^n\|f\|_{\mathcal B_0}.
\end{align*}
Therefore, by the remainder identity from the previous step, for every $n\ge 1$,
\begin{align*}
\left|\int_X f(g\circ T^n)\,d\mu(x)-\left(\int_X f\,d\mu(x)\right)\left(\int_X g\,d\mu(x)\right)\right|\le C_1C_0\rho^n\|f\|_{\mathcal B_0}\|g\|_{L^\infty(X,\mu)}.
\end{align*}
[guided]
At this point all dynamical information has been compressed into the estimate on $N^n$. We now pair $N^n f$ with the bounded observable $g$. The relevant elementary inequality is the $L^1$-$L^\infty$ estimate: if $a\in L^1(X,\mu)$ and $b\in L^\infty(X,\mu)$, then
\begin{align*}
\left|\int_X ab\,d\mu(x)\right|\le \|a\|_{L^1(X,\mu)}\|b\|_{L^\infty(X,\mu)}.
\end{align*}
Here $a=N^n f$ and $b=g$. The function $N^n f$ belongs to $\mathcal B_0$ because $N:\mathcal B_0\to\mathcal B_0$ is bounded, and it belongs to $L^1(X,\mu)$ because the embedding $\mathcal B_0\hookrightarrow L^1(X,\mu)$ is continuous. Hence
\begin{align*}
\left|\int_X (N^n f)g\,d\mu(x)\right|\le \|N^n f\|_{L^1(X,\mu)}\|g\|_{L^\infty(X,\mu)}.
\end{align*}
The continuous embedding gives the constant $C_1>0$ from the statement:
\begin{align*}
\|N^n f\|_{L^1(X,\mu)}\le C_1\|N^n f\|_{\mathcal B_0}.
\end{align*}
The spectral radius step gives the constant $C_0>0$ and the number $\rho\in(0,1)$:
\begin{align*}
\|N^n f\|_{\mathcal B_0}\le C_0\rho^n\|f\|_{\mathcal B_0}.
\end{align*}
Combining these two inequalities yields
\begin{align*}
\|N^n f\|_{L^1(X,\mu)}\le C_1C_0\rho^n\|f\|_{\mathcal B_0}.
\end{align*}
Substituting this into the $L^1$-$L^\infty$ pairing estimate gives
\begin{align*}
\left|\int_X (N^n f)g\,d\mu(x)\right|\le C_1C_0\rho^n\|f\|_{\mathcal B_0}\|g\|_{L^\infty(X,\mu)}.
\end{align*}
Finally, the previous step identified this integral with the correlation error for every $n\ge 1$, so the same bound holds for the correlation error.
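As an illustration under assumptions not stated in the source, take the doubling map $T(x)=2x \bmod 1$ on $[0,1)$ with Lebesgue measure, whose transfer operator is $(\mathcal L u)(x)=\tfrac12\left(u(x/2)+u((x+1)/2)\right)$, and suppose $f(x)=x$ lies in a space $\mathcal B_0$ for which the hypotheses hold. With $g(x)=x$ and $\Pi f=\left(\int_X f\,d\mu(x)\right)1_X=\tfrac12$, iterating $\mathcal L(ax+b)=\tfrac a2 x+\tfrac a4+b$ gives, for $n\ge 1$,
\begin{align*}
\mathcal L^n f=2^{-n}\left(x-\tfrac12\right)+\tfrac12,\qquad N^n f=\mathcal L^n f-\Pi f=2^{-n}\left(x-\tfrac12\right).
\end{align*}
The correlation error and the $L^1$-$L^\infty$ bound can then be compared explicitly:
\begin{align*}
\int_X (N^n f)g\,d\mu(x)=2^{-n}\int_0^1\left(x^2-\tfrac x2\right)dx=\frac{2^{-n}}{12},\qquad \|N^n f\|_{L^1(X,\mu)}\|g\|_{L^\infty(X,\mu)}=2^{-n}\int_0^1\left|x-\tfrac12\right|dx=\frac{2^{-n}}{4}.
\end{align*}
Both quantities decay like $2^{-n}$, and the pairing estimate holds with room to spare in this example.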
[/guided]
[/step]
[step:Absorb the initial time into the final constant]
It remains to include $n=0$. Since $\mu$ is a probability measure,
\begin{align*}
\left|\int_X f g\,d\mu(x)-\left(\int_X f\,d\mu(x)\right)\left(\int_X g\,d\mu(x)\right)\right|\le 2\|f\|_{L^1(X,\mu)}\|g\|_{L^\infty(X,\mu)}.
\end{align*}
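The factor $2$ in the bound above can be traced as follows. By the $L^1$-$L^\infty$ inequality, and because $\mu(X)=1$ implies $\left|\int_X g\,d\mu(x)\right|\le\|g\|_{L^\infty(X,\mu)}$,
\begin{align*}
\left|\int_X fg\,d\mu(x)\right|&\le\|f\|_{L^1(X,\mu)}\|g\|_{L^\infty(X,\mu)},\\
\left|\left(\int_X f\,d\mu(x)\right)\left(\int_X g\,d\mu(x)\right)\right|&\le\|f\|_{L^1(X,\mu)}\|g\|_{L^\infty(X,\mu)}.
\end{align*}
Adding the two bounds via the triangle inequality gives the stated estimate.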
Using the embedding bound,
\begin{align*}
2\|f\|_{L^1(X,\mu)}\|g\|_{L^\infty(X,\mu)}\le 2C_1\|f\|_{\mathcal B_0}\|g\|_{L^\infty(X,\mu)}.
\end{align*}
Define
\begin{align*}
C=\max\{C_1C_0,2C_1\}.
\end{align*}
Then $C>0$, $\rho\in(0,1)$, and the desired estimate holds for every integer $n\ge 0$:
\begin{align*}
\left|\int_X f(g\circ T^n)\,d\mu(x)-\left(\int_X f\,d\mu(x)\right)\left(\int_X g\,d\mu(x)\right)\right|\le C\rho^n\|f\|_{\mathcal B_0}\|g\|_{L^\infty(X,\mu)}.
\end{align*}
This proves exponential decay of correlations.
[/step]