[proofplan]
We derive the distribution of $V_i$ (the number of returns to $i$ after time $0$) using the [Strong Markov Property](/theorems/2208) applied at successive return times. The chain makes exactly $r$ returns if and only if each of the first $r$ return attempts succeeds (each with probability $f_{i,i}$) and the $(r+1)$-th attempt fails (probability $1 - f_{i,i}$). Independence of consecutive attempts follows from the strong Markov property. The recurrent and transient consequences follow from analysing the cases $f_{i,i} = 1$ and $f_{i,i} < 1$.
[/proofplan]
[step:Define successive return times and establish independence of return attempts]
Let $X_0 = i$. Define the successive return times to state $i$: let $T_i^{(1)} = \inf\{n \geq 1 : X_n = i\}$ be the first return time, and for $k \geq 2$,
\begin{align*}
T_i^{(k)} = \inf\{n > T_i^{(k-1)} : X_n = i\},
\end{align*}
with $T_i^{(k)} = \infty$ if $T_i^{(k-1)} = \infty$ or no further return occurs. Each $T_i^{(k)}$ is a stopping time for the chain.
Define the indicator of the $k$-th return attempt:
\begin{align*}
R_k = \mathbb{1}\{T_i^{(k)} < \infty\}, \quad k \geq 1.
\end{align*}
We claim that the events $\{R_k = 1\}$ (conditioned on $\{R_1 = \cdots = R_{k-1} = 1\}$) are independent Bernoulli trials, each with success probability $f_{i,i}$.
By the [Strong Markov Property](/theorems/2208) applied at $T_i^{(k-1)}$: on the event $\{T_i^{(k-1)} < \infty\}$, we have $X_{T_i^{(k-1)}} = i$, and the post-$T_i^{(k-1)}$ process $(X_{T_i^{(k-1)}+n})_{n \geq 0}$ is a Markov chain started at $i$ with the same transition matrix $P$, independent of $(X_0, \ldots, X_{T_i^{(k-1)}})$. The event $\{R_k = 1\}$ depends only on this post-$T_i^{(k-1)}$ process. Therefore:
\begin{align*}
\mathbb{P}_i(R_k = 1 \mid R_1 = \cdots = R_{k-1} = 1) = \mathbb{P}_i(T_i^+ < \infty) = f_{i,i}.
\end{align*}
[guided]
The strong Markov property is what makes the successive return attempts independent. Without it, we would need to worry that the history of the chain up to the $(k-1)$-th return could influence the probability of the $k$-th return. The strong Markov property guarantees that upon each arrival at state $i$ (at a stopping time), the chain "forgets" its past and behaves as a fresh chain from $i$.
To verify that the strong Markov property applies, we check that $T_i^{(k)}$ is a stopping time: the event $\{T_i^{(k)} = n\}$ requires that $X_n = i$ and that exactly $k-1$ of $X_1, \ldots, X_{n-1}$ equal $i$, which is determined by $(X_0, \ldots, X_n)$.
The conditional probability $\mathbb{P}_i(R_k = 1 \mid R_1 = \cdots = R_{k-1} = 1)$ equals $f_{i,i}$ because, on the event $\{T_i^{(k-1)} < \infty\}$, the chain is at state $i$ and the question "will it return to $i$?" is answered by the return probability $f_{i,i} = \mathbb{P}_i(T_i^+ < \infty)$ applied to the fresh chain.
[/guided]
[/step]
[step:Compute $\mathbb{P}_i(V_i = r)$ for finite $r \geq 0$]
The event $\{V_i = r\}$ means the chain returns to $i$ exactly $r$ times after time $0$, i.e., $R_1 = \cdots = R_r = 1$ (the first $r$ return attempts succeed) and $R_{r+1} = 0$ (the $(r+1)$-th attempt fails). For $r = 0$, this means the very first return attempt fails.
By the independence established above, these events multiply:
\begin{align*}
\mathbb{P}_i(V_i = r) &= \mathbb{P}_i(R_1 = 1, \ldots, R_r = 1, R_{r+1} = 0) \\
&= \mathbb{P}_i(R_1 = 1) \cdot \mathbb{P}_i(R_2 = 1 \mid R_1 = 1) \cdots \mathbb{P}_i(R_r = 1 \mid R_1 = \cdots = R_{r-1} = 1) \\
&\quad \times \mathbb{P}_i(R_{r+1} = 0 \mid R_1 = \cdots = R_r = 1) \\
&= f_{i,i}^r \cdot (1 - f_{i,i}).
\end{align*}
[/step]
[step:Verify the geometric distribution and derive the recurrent and transient consequences]
**Transient case** ($f_{i,i} < 1$): The formula $\mathbb{P}_i(V_i = r) = f_{i,i}^r(1 - f_{i,i})$ for $r \geq 0$ is a geometric distribution on $\{0, 1, 2, \ldots\}$ with parameter $1 - f_{i,i}$. We verify that the probabilities sum to one:
\begin{align*}
\sum_{r=0}^{\infty} \mathbb{P}_i(V_i = r) = (1 - f_{i,i}) \sum_{r=0}^{\infty} f_{i,i}^r = (1 - f_{i,i}) \cdot \frac{1}{1 - f_{i,i}} = 1.
\end{align*}
Since the geometric distribution is supported on finite values, $\mathbb{P}_i(V_i < \infty) = 1$.
**Recurrent case** ($f_{i,i} = 1$): The formula gives $\mathbb{P}_i(V_i = r) = 1^r \cdot (1 - 1) = 0$ for every finite $r \geq 0$. Since $V_i$ takes values in $\{0, 1, 2, \ldots\} \cup \{\infty\}$ and the probability of every finite value is zero:
\begin{align*}
\mathbb{P}_i(V_i = \infty) = 1 - \sum_{r=0}^{\infty} \mathbb{P}_i(V_i = r) = 1 - 0 = 1.
\end{align*}
The chain returns to $i$ infinitely often with probability one, confirming the [Recurrence and Infinitely Many Returns](/theorems/2219) characterisation.
[guided]
The two cases are exhaustive and complementary. When $f_{i,i} < 1$, the distribution of $V_i$ is a genuine geometric distribution: the chain flips a biased coin at each return, and eventually the coin comes up "fail." The expected number of returns is $\mathbb{E}_i[V_i] = f_{i,i}/(1 - f_{i,i})$, which is finite.
When $f_{i,i} = 1$, the "geometric distribution" degenerates: every finite value gets probability zero because the failure probability $1 - f_{i,i} = 0$ makes every term vanish. The total probability of all finite outcomes is zero, so all the probability mass concentrates at $V_i = \infty$. This is consistent with the definition of recurrence: the chain returns with certainty at each attempt, so there is no mechanism by which the sequence of returns could terminate.
One might ask: why does $\mathbb{P}_i(V_i = \infty)$ equal exactly $1$ and not some value in $(0,1)$? The reason is that $\{V_i = \infty\}$ is the intersection of the decreasing events $\{V_i \geq k\}$, and $\mathbb{P}_i(V_i \geq k) = f_{i,i}^{k-1}$ (as shown in the proof of [Recurrence and Infinitely Many Returns](/theorems/2219)). When $f_{i,i} = 1$, every term in the sequence is $1$, so the limit is $1$.
[/guided]
[/step]