[proofplan]
We prove ergodicity via the invariant-set characterization. Every shift-invariant event is measurable, modulo null sets, with respect to the future tail $\sigma$-algebra; conditioning such an event on the present state gives a bounded harmonic function for $P$. When $P$ is irreducible, every bounded harmonic function is constant, and martingale convergence then forces the invariant event to have measure $0$ or $1$. Conversely, if $P$ is reducible, a proper closed communicating class gives a shift-invariant event of measure strictly between $0$ and $1$.
[/proofplan]
[step:Set up the coordinate maps and the invariant-set criterion]
Let $A$ denote the finite state space of the Markov shift, with $X = A^{\mathbb{Z}}$. For each $n \in \mathbb{Z}$, define the coordinate map
\begin{align*}
\xi_n: X &\to A \\
x &\mapsto x_n .
\end{align*}
Write $\pi_i := \pi(\{i\})$ for $i \in A$, and, for $C \subseteq A$, write
\begin{align*}
\pi(C) := \sum_{i \in C} \pi_i .
\end{align*}
We take $A$ to be the support of the stationary distribution, so $\pi_i > 0$ for every $i \in A$.
For $m \geq 0$, define the future $\sigma$-algebra
\begin{align*}
\mathcal{F}_m^+ := \sigma(\xi_m,\xi_{m+1},\xi_{m+2},\ldots)
\end{align*}
and the finite-time forward filtration
\begin{align*}
\mathcal{G}_m := \sigma(\xi_0,\xi_1,\ldots,\xi_m).
\end{align*}
The future tail $\sigma$-algebra is
\begin{align*}
\mathcal{T}^+ := \bigcap_{m=0}^{\infty} \mathcal{F}_m^+ .
\end{align*}
By the [Equivalence of Ergodicity Conditions](/theorems/3444), since $\mu$ is $\sigma$-invariant by stationarity of $\pi$, the Markov shift is ergodic if and only if every $E \in \mathcal{B}$ satisfying
\begin{align*}
\mu(E \triangle \sigma^{-1}E)=0
\end{align*}
has $\mu(E) \in \{0,1\}$.
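As a concrete reference point, the following sketch records the conventions that the later steps rely on. It assumes the standard definitions, which are not restated in this proof: the shift acts by $(\sigma x)_n = x_{n+1}$, and $\mu$ is the Markov measure determined by $\pi$ and $P$. Under these assumptions, for $n \geq 0$ and $i_0,\ldots,i_n \in A$,
\begin{align*}
\xi_n \circ \sigma &= \xi_{n+1}, \\
\mu(\xi_0=i_0,\xi_1=i_1,\ldots,\xi_n=i_n) &= \pi_{i_0}P_{i_0 i_1}\cdots P_{i_{n-1} i_n}.
\end{align*}
For instance, with $n=1$ this reads $\mu(\xi_0=i,\xi_1=j)=\pi_i P_{ij}$, which is the form in which two-coordinate events are evaluated.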
[/step]
[step:Represent every invariant event as a future-tail event]
Let $E \in \mathcal{B}$ satisfy $\mu(E \triangle \sigma^{-1}E)=0$. Then, for every $n \geq 1$,
\begin{align*}
\mu(E \triangle \sigma^{-n}E)=0 .
\end{align*}
Because $\mathcal{B}$ is generated by finite coordinate cylinders, for every $\varepsilon > 0$ there exists a cylinder event $C_\varepsilon \in \sigma(\xi_{-r},\ldots,\xi_r)$, for some $r \geq 0$, such that
\begin{align*}
\mu(E \triangle C_\varepsilon) < \varepsilon .
\end{align*}
Fix $m \geq 0$ and choose $n > m+r$. Then $\sigma^{-n}C_\varepsilon \in \mathcal{F}_m^+$, and $\sigma$-invariance of $\mu$ gives
\begin{align*}
\mu(E \triangle \sigma^{-n}C_\varepsilon)
&\leq \mu(E \triangle \sigma^{-n}E) + \mu(\sigma^{-n}E \triangle \sigma^{-n}C_\varepsilon) \\
&= 0 + \mu(E \triangle C_\varepsilon) \\
&< \varepsilon .
\end{align*}
Thus $E$ belongs to the $\mu$-completion of $\mathcal{F}_m^+$ for every $m \geq 0$. Applying the [Backwards Martingale Convergence Theorem](/theorems/1165) to the decreasing family $(\mathcal{F}_m^+)_{m \geq 0}$ gives
\begin{align*}
\mathbb{E}_\mu[\mathbb{1}_E \mid \mathcal{T}^+] = \mathbb{1}_E
\end{align*}
$\mu$-almost everywhere. Hence $E$ is $\mathcal{T}^+$-measurable modulo $\mu$-null sets.
[guided]
The point of this step is to turn an arbitrary two-sided invariant event into a future-tail event. The event $E$ might originally depend on negative and positive coordinates. Invariance lets us shift any finite approximation of $E$ arbitrarily far into the future.
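Let $E \in \mathcal{B}$ satisfy $\mu(E \triangle \sigma^{-1}E)=0$. Since $E \triangle \sigma^{-n}E \subseteq \bigcup_{k=0}^{n-1}\sigma^{-k}(E \triangle \sigma^{-1}E)$ and $\mu$ is $\sigma$-invariant, every set in this union is $\mu$-null, so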
\begin{align*}
\mu(E \triangle \sigma^{-n}E)=0
\end{align*}
for every $n \geq 1$. Since finite coordinate cylinders generate $\mathcal{B}$, for every $\varepsilon > 0$ there is a cylinder event $C_\varepsilon$ depending only on coordinates $\xi_{-r},\ldots,\xi_r$ such that
\begin{align*}
\mu(E \triangle C_\varepsilon) < \varepsilon .
\end{align*}
Now fix a future starting time $m \geq 0$. If $n > m+r$, then the shifted cylinder $\sigma^{-n}C_\varepsilon$ depends only on coordinates
\begin{align*}
\xi_{n-r},\xi_{n-r+1},\ldots,\xi_{n+r},
\end{align*}
all of which have index at least $m$. Therefore $\sigma^{-n}C_\varepsilon \in \mathcal{F}_m^+$. Since $\mu$ is $\sigma$-invariant and $E$ is invariant modulo null sets,
\begin{align*}
\mu(E \triangle \sigma^{-n}C_\varepsilon)
&\leq \mu(E \triangle \sigma^{-n}E) + \mu(\sigma^{-n}E \triangle \sigma^{-n}C_\varepsilon) \\
&= 0 + \mu(E \triangle C_\varepsilon) \\
&< \varepsilon .
\end{align*}
Thus $E$ can be approximated in measure by events in $\mathcal{F}_m^+$ for every $m$. Equivalently, $\mathbb{1}_E$ is measurable with respect to the $\mu$-completion of each $\mathcal{F}_m^+$.
Because the $\sigma$-algebras $\mathcal{F}_m^+$ decrease as $m$ increases, the [Backwards Martingale Convergence Theorem](/theorems/1165) applies and yields
\begin{align*}
\mathbb{E}_\mu[\mathbb{1}_E \mid \mathcal{T}^+]
=
\lim_{m \to \infty}
\mathbb{E}_\mu[\mathbb{1}_E \mid \mathcal{F}_m^+]
=
\mathbb{1}_E
\end{align*}
$\mu$-almost everywhere. This proves that $E$ is a future-tail event modulo null sets.
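To make the index bookkeeping concrete, here is a small illustrative instance. The states $a,b \in A$ and the specific values of $r$, $m$, and $n$ are chosen only for illustration, and the shift convention $(\sigma x)_k = x_{k+1}$ is assumed. Take the cylinder $C = \{\xi_{-1}=a,\ \xi_1=b\}$, so that $r=1$, and let $m=3$. Choosing $n=5 > m+r$ gives
\begin{align*}
\sigma^{-5}C
&= \{x \in X : (\sigma^5 x)_{-1}=a,\ (\sigma^5 x)_1=b\} \\
&= \{\xi_4=a,\ \xi_6=b\} \in \mathcal{F}_3^+ .
\end{align*}
The negative coordinate of $C$ has been pushed past time $m$, while $\mu(\sigma^{-5}C)=\mu(C)$ by $\sigma$-invariance of $\mu$.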
[/guided]
[/step]
[step:Show that irreducibility forces bounded harmonic functions to be constant]
Assume that $P$ is irreducible. Let
\begin{align*}
h: A &\to \mathbb{R}
\end{align*}
be a bounded function satisfying
\begin{align*}
h(i)=\sum_{j \in A} P_{ij}h(j)
\end{align*}
for every $i \in A$.
Since $A$ is finite, choose $a\in A$ such that
\begin{align*}
M:=h(a)=\max_{i\in A}h(i).
\end{align*}
The harmonicity identity at $a$ gives
\begin{align*}
h(a)=\sum_{j\in A}P_{aj}h(j).
\end{align*}
Every term $h(j)$ is at most $M=h(a)$, and the coefficients $P_{aj}$ are non-negative with sum $1$. Therefore
\begin{align*}
h(j)=M
\end{align*}
for every $j$ with $P_{aj}>0$.
Each such $j$ is again a maximizer of $h$, so the argument can be repeated along paths: if
\begin{align*}
a=i_0\to i_1\to\cdots\to i_m
\end{align*}
and $P_{i_\ell i_{\ell+1}}>0$ for every $\ell$, then $h(i_m)=M$. Irreducibility says that every state is reachable from $a$ by such a path. Hence $h(i)=M$ for every $i\in A$, so $h$ is constant.
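As an illustration of the maximum-principle argument, consider a hypothetical two-state chain on $A=\{1,2\}$ with parameters $p,q \in [0,1]$; this example is not part of the theorem and is included only as a sanity check. Let
\begin{align*}
P = \begin{pmatrix} 1-p & p \\ q & 1-q \end{pmatrix}.
\end{align*}
Harmonicity at the two states reads
\begin{align*}
h(1) &= (1-p)h(1) + p\,h(2), \\
h(2) &= q\,h(1) + (1-q)h(2),
\end{align*}
which is equivalent to $p\,(h(1)-h(2))=0$ and $q\,(h(1)-h(2))=0$. If $P$ is irreducible, then $p>0$ and $q>0$, so $h(1)=h(2)$ and $h$ is constant. If instead $p=q=0$, then $P$ is the identity matrix and every function $h$ is harmonic, so the conclusion can fail without irreducibility.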
[/step]
[step:Use harmonicity and martingale convergence to prove ergodicity when $P$ is irreducible]
Assume that $P$ is irreducible. Let $E \in \mathcal{B}$ satisfy $\mu(E \triangle \sigma^{-1}E)=0$. Replacing $E$ by a $\mathcal{T}^+$-measurable representative, which exists by the tail-representation step and differs from $E$ only by a $\mu$-null set, define
\begin{align*}
h: A &\to [0,1] \\
i &\mapsto \mu(E \mid \xi_0=i).
\end{align*}
This conditional probability is well-defined because $\pi_i=\mu(\xi_0=i)>0$ for every $i \in A$.
We show that $h$ is harmonic. Since $E$ is invariant and future-tail measurable, the Markov property gives, for every $i \in A$,
\begin{align*}
h(i)
&= \mu(E \mid \xi_0=i) \\
&= \mu(\sigma^{-1}E \mid \xi_0=i) \\
&= \sum_{j \in A} P_{ij}\mu(E \mid \xi_0=j) \\
&= \sum_{j \in A} P_{ij}h(j).
\end{align*}
By the preceding step, $h$ is constant; write $h(i)=c$ for all $i \in A$.
For each $m \geq 0$, the Markov property and invariance of $E$ give
\begin{align*}
\mathbb{E}_\mu[\mathbb{1}_E \mid \mathcal{G}_m]
=
h(\xi_m)
=
c
\end{align*}
$\mu$-almost everywhere. Since $E$ is $\mathcal{F}_0^+$-measurable and $\mathcal{G}_m \uparrow \mathcal{F}_0^+$, the [Almost Sure Martingale Convergence Theorem](/theorems/1157) gives
\begin{align*}
\mathbb{1}_E
=
\lim_{m \to \infty}\mathbb{E}_\mu[\mathbb{1}_E \mid \mathcal{G}_m]
=
c
\end{align*}
$\mu$-almost everywhere. Because $\mathbb{1}_E$ takes only the values $0$ and $1$, we have $c \in \{0,1\}$. Hence $\mu(E) \in \{0,1\}$. By the invariant-set criterion, $(X,\mathcal{B},\mu,\sigma)$ is ergodic.
[guided]
Let $E$ be a shift-invariant event. The tail-representation step allows us to treat $E$ as a future-tail event, so, up to a $\mu$-null set, membership in $E$ is unaffected by changing any finite initial block of future coordinates. This is what makes the Markov property usable.
Define
\begin{align*}
h: A &\to [0,1] \\
i &\mapsto \mu(E \mid \xi_0=i).
\end{align*}
The definition is legitimate because $\mu(\xi_0=i)=\pi_i>0$ for every state $i \in A$.
We next prove that $h$ is harmonic for $P$. Since $E$ is invariant modulo null sets,
\begin{align*}
\mu(E \mid \xi_0=i)=\mu(\sigma^{-1}E \mid \xi_0=i).
\end{align*}
On the event $\{\xi_0=i,\xi_1=j\}$, the future process after time $1$ has transition matrix $P$ and starts from $j$. Since $E$ is a future-tail event, the conditional probability that the shifted sequence belongs to $E$ is exactly $\mu(E \mid \xi_0=j)=h(j)$. Therefore the Markov property yields
\begin{align*}
h(i)
&= \mu(\sigma^{-1}E \mid \xi_0=i) \\
&= \sum_{j \in A} \mu(\xi_1=j \mid \xi_0=i)\mu(E \mid \xi_0=j) \\
&= \sum_{j \in A} P_{ij}h(j).
\end{align*}
Thus $h$ is a bounded harmonic function. By the harmonic-function result already proved, irreducibility of $P$ forces $h$ to be constant. Write this constant as $c$.
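The Markov-property step above can be unpacked as follows. This is a sketch that uses only that $E \in \mathcal{F}_0^+$, so that $\sigma^{-1}E \in \mathcal{F}_1^+$, and that $\mu$ is $\sigma$-invariant. For $i,j \in A$ with $P_{ij}>0$,
\begin{align*}
\mu(\sigma^{-1}E \mid \xi_0=i,\xi_1=j)
&= \mu(\sigma^{-1}E \mid \xi_1=j) \\
&= \frac{\mu(\sigma^{-1}(E \cap \{\xi_0=j\}))}{\mu(\sigma^{-1}\{\xi_0=j\})} \\
&= \frac{\mu(E \cap \{\xi_0=j\})}{\mu(\xi_0=j)} \\
&= h(j).
\end{align*}
The first equality is the Markov property, the second uses $\{\xi_1=j\}=\sigma^{-1}\{\xi_0=j\}$, and the third is $\sigma$-invariance of $\mu$. Summing over $j$ with weights $\mu(\xi_1=j \mid \xi_0=i)=P_{ij}$ recovers the identity $h(i)=\sum_{j \in A}P_{ij}h(j)$; terms with $P_{ij}=0$ contribute nothing.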
Now condition on the finite history $\mathcal{G}_m=\sigma(\xi_0,\ldots,\xi_m)$. Because $E$ is invariant, the event $E$ can be shifted to begin at time $m$. Given the history up to time $m$, the Markov property says that the conditional probability of $E$ depends only on the current state $\xi_m$. Hence
\begin{align*}
\mathbb{E}_\mu[\mathbb{1}_E \mid \mathcal{G}_m]
=
h(\xi_m)
=
c
\end{align*}
$\mu$-almost everywhere.
Finally, the $\sigma$-algebras $\mathcal{G}_m$ increase to $\mathcal{F}_0^+$. Since $E$ is $\mathcal{F}_0^+$-measurable, the [Almost Sure Martingale Convergence Theorem](/theorems/1157) gives
\begin{align*}
\mathbb{1}_E
=
\lim_{m \to \infty}
\mathbb{E}_\mu[\mathbb{1}_E \mid \mathcal{G}_m]
=
c
\end{align*}
$\mu$-almost everywhere. An indicator function that is almost surely equal to a constant can only be almost surely $0$ or almost surely $1$. Therefore $\mu(E)\in\{0,1\}$, which is exactly ergodicity by the invariant-set criterion.
[/guided]
[/step]
[step:Construct a proper invariant event when $P$ is reducible]
Assume that $P$ is reducible. Since $A$ is finite, the directed graph with vertices $A$ and arrows $i \to j$ whenever $P_{ij}>0$ has at least one closed communicating class $C$, and $C \neq A$ because a closed communicating class equal to $A$ would make $P$ irreducible. Thus $C$ is nonempty and proper, and
\begin{align*}
P_{ij}=0
\end{align*}
for every $i \in C$ and every $j \in A \setminus C$.
Define
\begin{align*}
I := \{x \in X : \xi_0(x) \in C\}.
\end{align*}
Because $\pi_i>0$ for every $i \in A$, and $C$ is nonempty and proper,
\begin{align*}
0 < \mu(I)=\pi(C) < 1.
\end{align*}
We verify that $I$ is shift-invariant modulo $\mu$-null sets. Since $C$ is closed,
\begin{align*}
\mu(I \setminus \sigma^{-1}I)
&= \mu(\xi_0 \in C,\xi_1 \notin C) \\
&= \sum_{i \in C}\sum_{j \in A\setminus C} \pi_i P_{ij} \\
&= 0.
\end{align*}
Writing $P_{iC} := \sum_{j \in C} P_{ij}$ for $i \in A$, stationarity gives
\begin{align*}
\pi(C)
=
\sum_{i \in A}\pi_i P_{iC}
=
\sum_{i \in C}\pi_i P_{iC}+\sum_{i \in A\setminus C}\pi_i P_{iC}.
\end{align*}
Since $C$ is closed, $P_{iC}=1$ for $i \in C$, so
\begin{align*}
\sum_{i \in A\setminus C}\pi_i P_{iC}=0.
\end{align*}
Therefore
\begin{align*}
\mu(\sigma^{-1}I \setminus I)
&= \mu(\xi_0 \notin C,\xi_1 \in C) \\
&= \sum_{i \in A\setminus C}\pi_i P_{iC} \\
&= 0.
\end{align*}
Hence $\mu(I \triangle \sigma^{-1}I)=0$, while $0<\mu(I)<1$. By the invariant-set criterion, the Markov shift is not ergodic.
Combining this with the irreducible case proves that $(X,\mathcal{B},\mu,\sigma)$ is ergodic if and only if $P$ is irreducible.
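For a concrete instance of this construction, consider the following hypothetical three-state example, chosen only for illustration. Let $A=\{1,2,3\}$,
\begin{align*}
P = \begin{pmatrix} \tfrac12 & \tfrac12 & 0 \\ \tfrac12 & \tfrac12 & 0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
\pi = \left(\tfrac14,\tfrac14,\tfrac12\right).
\end{align*}
One checks directly that $\pi P=\pi$ and that $\pi_i>0$ for every $i$. The set $C=\{1,2\}$ is a proper closed communicating class, and with $I=\{\xi_0 \in C\}$,
\begin{align*}
\mu(I) &= \pi_1+\pi_2 = \tfrac12, \\
\mu(I \setminus \sigma^{-1}I) &= \pi_1 P_{13} + \pi_2 P_{23} = 0, \\
\mu(\sigma^{-1}I \setminus I) &= \pi_3\,(P_{31}+P_{32}) = 0.
\end{align*}
So $I$ is invariant modulo null sets with $0<\mu(I)<1$, and this Markov shift is not ergodic, in agreement with the general argument.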
[/step]