[proofplan]
We first translate the [Markov chain](/page/Markov%20Chain) into its directed graph of positive-probability transitions. Communication is exactly mutual reachability in this graph, so it partitions the finite state space into communicating classes. The finite directed acyclic graph obtained by collapsing communicating classes has at least one sink; these sinks are precisely the closed communicating classes. Finally, every non-sink class is exited almost surely by a uniform finite-time attempt argument, and finite descent in the class graph gives almost sure entrance into some closed class.
[/proofplan]
[step:Identify communicating classes as equivalence classes of mutual reachability]
For states $i,j \in S$, write $i \to j$ if there exist $n \ge 0$ such that
\begin{align*}
\mathbb P_i(X_n=j)>0.
\end{align*}
Equivalently, $i \to j$ means that there is a finite path $i=i_0,i_1,\ldots,i_n=j$ in $S$ with
\begin{align*}
p_{i_0i_1}p_{i_1i_2}\cdots p_{i_{n-1}i_n}>0.
\end{align*}
The equivalence follows from the finite-dimensional distribution formula [citetheorem:9527] applied to the chain started from $i$.
Define $i \leftrightarrow j$ if $i \to j$ and $j \to i$. Reflexivity holds by taking $n=0$. Symmetry is built into the definition. If $i \leftrightarrow j$ and $j \leftrightarrow k$, then concatenating a positive-probability path from $i$ to $j$ with one from $j$ to $k$ gives $i \to k$, and concatenating paths from $k$ to $j$ and $j$ to $i$ gives $k \to i$. Hence $i \leftrightarrow k$, so $\leftrightarrow$ is an [equivalence relation](/page/Equivalence%20Relation).
Its equivalence classes are exactly the communicating classes. Therefore $S$ is the disjoint union of its communicating classes.
[guided]
The goal of this first step is to make the word “class” precise. For states $i,j \in S$, define $i \to j$ to mean that $j$ can be reached from $i$ with positive probability in some finite number of steps:
\begin{align*}
\mathbb P_i(X_n=j)>0
\end{align*}
for at least one integer $n \ge 0$.
Because the chain is time-homogeneous with transition matrix $P=(p_{ij})_{i,j \in S}$, this is the same as saying that there is a finite sequence of states $i=i_0,i_1,\ldots,i_n=j$ whose transition probabilities along the path have positive product:
\begin{align*}
p_{i_0i_1}p_{i_1i_2}\cdots p_{i_{n-1}i_n}>0.
\end{align*}
Indeed, the finite-dimensional distribution formula [citetheorem:9527], applied with initial state $i$, says that the probability of following this exact path is precisely that product. Thus positive path probability and positive product along a directed path are the same condition.
Now define $i \leftrightarrow j$ if both $i \to j$ and $j \to i$. This relation is reflexive because $i$ reaches itself in zero steps. It is symmetric because the definition explicitly requires reachability in both directions. It is transitive because paths can be concatenated: if $i$ reaches $j$ with positive probability and $j$ reaches $k$ with positive probability, then the product of the two positive path probabilities is still positive, so $i$ reaches $k$. The same concatenation in the reverse direction shows that $k$ reaches $i$.
Therefore $\leftrightarrow$ is an equivalence relation on $S$. Its equivalence classes are, by definition, the communicating classes, and every equivalence relation partitions its underlying set. Hence $S$ is the disjoint union of its communicating classes.
[/guided]
[/step]
[step:Collapse communicating classes into a finite directed acyclic graph]
Let $\mathcal C$ denote the finite set of communicating classes. Define a directed graph $G_{\mathcal C}$ as follows: its vertex set is $\mathcal C$, and for two distinct classes $C,D \in \mathcal C$ there is a directed edge $C \to D$ if there exist $i \in C$ and $j \in D$ with $p_{ij}>0$.
The graph $G_{\mathcal C}$ has no directed cycle. Indeed, if
\begin{align*}
C_0 \to C_1 \to \cdots \to C_q=C_0
\end{align*}
were a directed cycle among distinct classes before returning to $C_0$, then every state in each $C_r$ would communicate with every state in every other $C_s$: one moves inside a class by communication and moves between consecutive classes using the positive-probability edge. This would merge all $C_r$ into one communicating class, contradicting their distinctness. Thus $G_{\mathcal C}$ is a finite directed acyclic graph.
[/step]
[step:Choose a sink class and identify sinks with closed communicating classes]
Since $G_{\mathcal C}$ is finite and acyclic, it has at least one sink vertex. To see this, start from any class $C \in \mathcal C$. If $C$ is not a sink, choose an outgoing edge from $C$. Repeating this process cannot continue forever without repeating a vertex, because $\mathcal C$ is finite; a repetition would create a directed cycle. Hence the process terminates at a sink.
A class $C \in \mathcal C$ is closed exactly when it is a sink in $G_{\mathcal C}$. If $C$ is not a sink, there are $i \in C$ and $j \notin C$ with $p_{ij}>0$, so the chain can leave $C$ in one step and $C$ is not closed. Conversely, if $C$ is not closed, then there exist $i \in C$ and $j \notin C$ with $p_{ij}>0$, and $j$ belongs to some communicating class $D \ne C$, giving an edge $C \to D$. Therefore $C$ is not a sink. Thus at least one communicating class is closed.
[/step]
[step:Show that every non-closed communicating class is exited almost surely]
Let $C \in \mathcal C$ be a non-closed communicating class. Define the exit time
\begin{align*}
\tau_C:\Omega \to \mathbb N \cup \{\infty\}
\end{align*}
by
\begin{align*}
\tau_C=\inf\{n \ge 0 : X_n \notin C\}.
\end{align*}
Since $C$ is not closed, there exist $a \in C$ and $b \notin C$ such that $p_{ab}>0$.
For each $x \in C$, because $x$ communicates with $a$, choose a finite path
\begin{align*}
x=x_0,x_1,\ldots,x_{\ell_x}=a
\end{align*}
inside $C$ with positive transition product. Appending the transition $a \to b$ gives a path from $x$ to $S \setminus C$ with positive probability. Let $\beta_x>0$ be the probability of following this chosen path, and let $m_x$ be its length. Since $C$ is finite, define
\begin{align*}
\beta=\min_{x \in C}\beta_x>0
\end{align*}
and
\begin{align*}
M=\max_{x \in C}m_x<\infty.
\end{align*}
For every $x \in C$,
\begin{align*}
\mathbb P_x(\tau_C \le M)\ge \beta.
\end{align*}
By the Markov property applied at times $M,2M,\ldots$, for every $k \ge 1$ and every $x \in C$,
\begin{align*}
\mathbb P_x(\tau_C>kM)\le (1-\beta)^k.
\end{align*}
Letting $k \to \infty$ gives
\begin{align*}
\mathbb P_x(\tau_C=\infty)=0.
\end{align*}
Thus the chain exits $C$ almost surely when started from any state in $C$.
[guided]
The delicate point is that non-closed only gives one possible exit somewhere in the class; it does not immediately say that the chain must exit. We turn the existence of one exit into a uniform positive chance of exit from every state of $C$.
Define the exit time from $C$ by
\begin{align*}
\tau_C=\inf\{n \ge 0 : X_n \notin C\}.
\end{align*}
This is the first time the chain is outside $C$, with the convention that $\tau_C=\infty$ if the chain never leaves.
Because $C$ is not closed, there are states $a \in C$ and $b \notin C$ such that $p_{ab}>0$. Now take any starting state $x \in C$. Since all states in $C$ communicate, $x$ can reach $a$ through a finite path of positive transition probability. Choose one such path and append the one-step transition from $a$ to $b$. The resulting finite path starts at $x$ and exits $C$ with positive probability. Denote its probability by $\beta_x>0$ and its length by $m_x$.
The finiteness of $C$ is used here in an essential way. Since there are only finitely many $x \in C$, the positive numbers $\beta_x$ have a positive minimum:
\begin{align*}
\beta=\min_{x \in C}\beta_x>0.
\end{align*}
The finitely many path lengths also have a finite maximum:
\begin{align*}
M=\max_{x \in C}m_x<\infty.
\end{align*}
Therefore, no matter where the chain currently is in $C$, it has probability at least $\beta$ of leaving $C$ within the next $M$ steps:
\begin{align*}
\mathbb P_x(\tau_C \le M)\ge \beta.
\end{align*}
Now split time into blocks of length $M$. If the chain has not left $C$ by time $rM$, then $X_{rM}\in C$. Starting from that state, the Markov property gives probability at least $\beta$ of exiting during the next block. Hence the [conditional probability](/page/Conditional%20Probability) of surviving one more block inside $C$ is at most $1-\beta$. Iterating this estimate gives
\begin{align*}
\mathbb P_x(\tau_C>kM)\le (1-\beta)^k.
\end{align*}
Since $0 \le 1-\beta < 1$, the right-hand side tends to $0$ as $k \to \infty$. Thus
\begin{align*}
\mathbb P_x(\tau_C=\infty)=0.
\end{align*}
So every non-closed communicating class is exited almost surely.
[/guided]
[/step]
[step:Prove that states in non-closed classes are transient]
Let $i \in S$ belong to a non-closed communicating class $C$. From the previous step,
\begin{align*}
\mathbb P_i(\tau_C<\infty)=1.
\end{align*}
Once the chain exits $C$, it cannot return to $C$. Indeed, if the chain can move from some $c \in C$ to some $y \notin C$ along a positive-probability path and $y$ could reach some $d \in C$, then $y$ would communicate with $C$: the state $d$ reaches $c$ inside $C$, while $c$ reaches $y$ along the exit path. This would force $y$ to belong to $C$, a contradiction.
Therefore, after time $\tau_C$, the state $i$ is never visited again. Hence the total number of visits to $i$ is almost surely finite, so $i$ is transient. Since $i$ was arbitrary in a non-closed communicating class, every state outside the union of closed communicating classes is transient.
[/step]
[step:Descend through the finite class graph until a closed class is reached]
Let $i \in S$, and let $C(i)$ be the communicating class containing $i$. If $C(i)$ is closed, then the desired event occurs at time $0$.
Suppose $C(i)$ is not closed. By the almost sure exit result, the chain exits $C(i)$ in finite time with probability one. When it exits, it enters a communicating class $D$ with an edge $C(i)\to D$ in $G_{\mathcal C}$. If $D$ is closed, the desired event has occurred. If not, the same argument applies to $D$.
Because $G_{\mathcal C}$ is finite and acyclic, every directed path in $G_{\mathcal C}$ has length at most $|\mathcal C|-1$. Thus this descent cannot pass through infinitely many non-closed classes. Combining the probability-one exit statement at each of the finitely many possible stages, the chain reaches a sink of $G_{\mathcal C}$ with probability one. Since sinks are exactly closed communicating classes, for every $i \in S$,
\begin{align*}
\mathbb P_i\left(\exists n \ge 0 \text{ and } r \in \{1,\ldots,m\} \text{ such that } X_n \in C_r\right)=1.
\end{align*}
This also makes explicit that “enters a closed communicating class” means hitting some state belonging to one of the closed communicating classes.
[/step]