[proofplan]
For a fixed past horizon $T$, we compare the coupling-from-the-past output with an auxiliary chain started at time $-T$ from the stationary distribution and then evolved using the same stored update maps $F_{-T},\dots,F_{-1}$. Stationarity implies that the auxiliary chain has law $\pi$ at time $0$ for every deterministic $T$. On the event that coalescence has occurred by time $-T$, the deterministic map from time $-T$ to time $0$ is constant, so the auxiliary time-$0$ value must equal the CFTP output regardless of its starting state. Since $\tau<\infty$ almost surely, this agreement probability tends to $1$, forcing the law of $Y$ to be the same as the constant law $\pi$.
[/proofplan]
[step:Construct a stationary auxiliary chain using the same update maps]
Fix $T\in\mathbb{N}$. Let $(S,2^S,\pi)$ denote the finite probability space assigning mass $\pi(x)$ to $x\in S$, and define the product extension $(\Omega_T,\mathcal{F}_T,\mathbb{P}_T)$ by $\Omega_T=\Omega\times S$, $\mathcal{F}_T=\mathcal{F}\otimes 2^S$, and $\mathbb{P}_T=\mathbb{P}\otimes\pi$. Let $q_T:\Omega_T\to\Omega$ be the projection $q_T(\omega,z)=\omega$, and regard each original random object as its pullback along $q_T$. Define
\begin{align*}
Z_T:\Omega_T \to S
\end{align*}
by $Z_T(\omega,z)=z$. Then $Z_T$ is independent of the pulled-back random maps $F_{-T},F_{-T+1},\dots,F_{-1}$ and satisfies
\begin{align*}
\mathbb{P}_T(Z_T=x)=\pi(x)
\end{align*}
for every $x\in S$. Define the auxiliary time-$0$ [random variable](/page/Random%20Variable)
\begin{align*}
X_{T,0}:\Omega_T \to S
\end{align*}
by
\begin{align*}
X_{T,0}=\Phi_{-T,0}(Z_T).
\end{align*}
More generally, define random variables
\begin{align*}
X_{T,k}:\Omega_T \to S
\end{align*}
for $k=-T,-T+1,\dots,0$ by $X_{T,-T}=Z_T$ and
\begin{align*}
X_{T,k+1}=F_k(X_{T,k})
\end{align*}
for $k=-T,\dots,-1$. We prove by induction on $k$ that $X_{T,k}$ has distribution $\pi$ under $\mathbb{P}_T$ for every $k=-T,\dots,0$.
The base case $k=-T$ is the definition of $Z_T$. Suppose $-T\leq k\leq -1$ and $X_{T,k}$ has distribution $\pi$ under $\mathbb{P}_T$. Since $X_{T,k}$ is a function of $Z_T,F_{-T},\dots,F_{k-1}$, it is independent of $F_k$. Therefore, for every $y\in S$,
\begin{align*}
\mathbb{P}_T(X_{T,k+1}=y)=\sum_{x\in S}\mathbb{P}_T(X_{T,k}=x)\mathbb{P}_T(F_k(x)=y).
\end{align*}
Using the induction hypothesis and the marginal condition for $F_k$, we obtain
\begin{align*}
\mathbb{P}_T(X_{T,k+1}=y)=\sum_{x\in S}\pi(x)P(x,y).
\end{align*}
By stationarity of $\pi$, this equals $\pi(y)$. Hence $X_{T,k+1}$ has distribution $\pi$, and induction gives
\begin{align*}
\mathbb{P}_T(X_{T,0}=y)=\pi(y)
\end{align*}
for every $y\in S$.
[guided]
Fix a deterministic horizon $T\in\mathbb{N}$. We introduce an auxiliary starting state at time $-T$ whose only purpose is to test what the stored update maps would do to a stationary chain. Let $(S,2^S,\pi)$ be the finite probability space with mass function $\pi$, and define $(\Omega_T,\mathcal{F}_T,\mathbb{P}_T)$ by $\Omega_T=\Omega\times S$, $\mathcal{F}_T=\mathcal{F}\otimes 2^S$, and $\mathbb{P}_T=\mathbb{P}\otimes\pi$. Let $q_T:\Omega_T\to\Omega$ be the projection $q_T(\omega,z)=\omega$; the original maps, $\tau$, and $Y$ are henceforth their pullbacks along $q_T$. Define
\begin{align*}
Z_T:\Omega_T \to S
\end{align*}
by $Z_T(\omega,z)=z$. Then $Z_T$ is independent of $F_{-T},F_{-T+1},\dots,F_{-1}$ and satisfies
\begin{align*}
\mathbb{P}_T(Z_T=x)=\pi(x)
\end{align*}
for every $x\in S$. The product extension does not change the law of the stored maps or of the CFTP output; it only supplies an independent stationary initial state.
Define
\begin{align*}
X_{T,k}:\Omega_T \to S
\end{align*}
for $k=-T,-T+1,\dots,0$ as follows. Set $X_{T,-T}=Z_T$, and for each $k=-T,\dots,-1$ set
\begin{align*}
X_{T,k+1}=F_k(X_{T,k}).
\end{align*}
Thus $X_{T,0}=\Phi_{-T,0}(Z_T)$.
We now verify that this auxiliary process has transition matrix $P$. Fix $k\in\{-T,\dots,-1\}$. The random variable $X_{T,k}$ depends only on $Z_T,F_{-T},\dots,F_{k-1}$, while the map $F_k$ is independent of all these random objects. Hence $X_{T,k}$ is independent of $F_k$. For every $y\in S$, partitioning over the possible values $x\in S$ of $X_{T,k}$ gives
\begin{align*}
\mathbb{P}_T(X_{T,k+1}=y)=\sum_{x\in S}\mathbb{P}_T(X_{T,k}=x)\mathbb{P}_T(F_k(x)=y).
\end{align*}
The hypothesis on the random maps says exactly that
\begin{align*}
\mathbb{P}_T(F_k(x)=y)=P(x,y)
\end{align*}
for every $x,y\in S$.
Now use induction. At time $-T$, the law of $X_{T,-T}$ is $\pi$ by construction. If $X_{T,k}$ has law $\pi$, then the preceding formula gives
\begin{align*}
\mathbb{P}_T(X_{T,k+1}=y)=\sum_{x\in S}\pi(x)P(x,y).
\end{align*}
The stationarity condition for $\pi$ says that this last sum is $\pi(y)$. Therefore $X_{T,k+1}$ also has law $\pi$. Induction from $k=-T$ to $k=0$ proves
\begin{align*}
\mathbb{P}_T(X_{T,0}=y)=\pi(y)
\end{align*}
for every $y\in S$.
[/guided]
[/step]
[step:Show that coalescence by time $-T$ forces agreement with the CFTP output]
Let
\begin{align*}
C_T=\{\omega_T\in\Omega_T:\tau(\omega_T)\leq T\}
\end{align*}
be the event that coalescence has occurred by horizon $T$ on the product extension. On $C_T$, put $t=\tau(\omega_T)$. Since $t\leq T$ and the formalized statement defines $\Phi_{a,a}=\operatorname{id}_S$, associativity of composition and the definition of $\Phi_{a,b}$ on adjacent time intervals give the factorization
\begin{align*}
\Phi_{-T,0}(\omega_T)=\Phi_{-t,0}(\omega_T)\circ \Phi_{-T,-t}(\omega_T).
\end{align*}
When $t=T$, this reads $\Phi_{-T,0}(\omega_T)=\Phi_{-T,0}(\omega_T)\circ\operatorname{id}_S$.
The map $\Phi_{-t,0}(\omega_T):S\to S$ is constant by the definition of $\tau$, with common value $Y(\omega_T)$. Therefore $\Phi_{-T,0}(\omega_T)$ is also constant with common value $Y(\omega_T)$. In particular,
\begin{align*}
X_{T,0}(\omega_T)=\Phi_{-T,0}(\omega_T)(Z_T(\omega_T))=Y(\omega_T)
\end{align*}
on $C_T$. Hence
\begin{align*}
\mathbb{P}_T(X_{T,0}\neq Y)\leq \mathbb{P}_T(\tau>T)=\mathbb{P}(\tau>T).
\end{align*}
[guided]
Let
\begin{align*}
C_T=\{\omega_T\in\Omega_T:\tau(\omega_T)\leq T\}
\end{align*}
be the event that all starting states have coalesced by the deterministic horizon $T$. The variables $\tau$ and $Y$ are the pullbacks from $\Omega$ to $\Omega_T$, so this event is measured in the product extension.
Take $\omega_T\in C_T$ and set $t=\tau(\omega_T)$. Then $t\leq T$. The composition from time $-T$ to time $0$ splits at the intermediate time $-t$: by associativity of composition and by the definition of $\Phi_{a,b}$ on adjacent time intervals,
\begin{align*}
\Phi_{-T,0}(\omega_T)=\Phi_{-t,0}(\omega_T)\circ \Phi_{-T,-t}(\omega_T).
\end{align*}
If $t=T$, then $\Phi_{-T,-t}=\Phi_{-T,-T}=\operatorname{id}_S$, so the same formula remains valid.
By the definition of $\tau$, the map $\Phi_{-t,0}(\omega_T):S\to S$ is constant on $S$. Its common value is exactly $Y(\omega_T)$ by the definition of the CFTP output. Therefore composing $\Phi_{-t,0}(\omega_T)$ with any map into $S$, in particular with $\Phi_{-T,-t}(\omega_T)$, still gives a constant map with common value $Y(\omega_T)$. Hence
\begin{align*}
X_{T,0}(\omega_T)=\Phi_{-T,0}(\omega_T)(Z_T(\omega_T))=Y(\omega_T)
\end{align*}
on $C_T$. Thus disagreement can occur only on $C_T^c=\{\tau>T\}$, and consequently
\begin{align*}
\mathbb{P}_T(X_{T,0}\neq Y)\leq \mathbb{P}_T(\tau>T)=\mathbb{P}(\tau>T).
\end{align*}
The last equality holds because $\tau$ is pulled back from the original probability space and $\mathbb{P}_T=\mathbb{P}\otimes\pi$.
[/guided]
[/step]
[step:Pass to the limit in the agreement probability]
Because $\tau<\infty$ almost surely, the events $\{\tau>T\}$ decrease to the null event $\{\tau=\infty\}$. By [continuity of probability](/theorems/1107) from above, applied to the decreasing sequence of events $\{\tau>T\}$ whose probabilities are finite because they are at most $1$, we have
\begin{align*}
\lim_{T\to\infty}\mathbb{P}(\tau>T)=\mathbb{P}(\tau=\infty)=0.
\end{align*}
Combining this with the previous step gives
\begin{align*}
\lim_{T\to\infty}\mathbb{P}_T(X_{T,0}\neq Y)=0.
\end{align*}
[guided]
The bound from the previous step reduces the problem to showing that the probability of missing coalescence by horizon $T$ tends to zero. Since $\tau<\infty$ almost surely, the events $\{\tau>T\}$ form a decreasing sequence whose intersection is $\{\tau=\infty\}$. Continuity of probability from above applies because each event has probability at most $1$, and gives
\begin{align*}
\lim_{T\to\infty}\mathbb{P}(\tau>T)=\mathbb{P}(\tau=\infty)=0.
\end{align*}
Combining this limit with
\begin{align*}
\mathbb{P}_T(X_{T,0}\neq Y)\leq \mathbb{P}(\tau>T)
\end{align*}
proves
\begin{align*}
\lim_{T\to\infty}\mathbb{P}_T(X_{T,0}\neq Y)=0.
\end{align*}
[/guided]
[/step]
[step:Identify the law of the CFTP output with the stationary distribution]
Fix $y\in S$. For every $T\in\mathbb{N}$, the pullback of $Y$ to $\Omega_T$ has the same law as $Y$ on $\Omega$, and the first step gives $\mathbb{P}_T(X_{T,0}=y)=\pi(y)$. Hence
\begin{align*}
|\mathbb{P}(Y=y)-\pi(y)|=|\mathbb{P}_T(Y=y)-\mathbb{P}_T(X_{T,0}=y)|.
\end{align*}
The two events $\{Y=y\}$ and $\{X_{T,0}=y\}$ can differ only when $Y\neq X_{T,0}$, so
\begin{align*}
|\mathbb{P}_T(Y=y)-\mathbb{P}_T(X_{T,0}=y)|\leq\mathbb{P}_T(Y\neq X_{T,0}).
\end{align*}
Taking $T\to\infty$ and using the previous step yields
\begin{align*}
|\mathbb{P}(Y=y)-\pi(y)|=0.
\end{align*}
Thus $\mathbb{P}(Y=y)=\pi(y)$ for every $y\in S$, which is exactly the assertion that the coupling-from-the-past output has distribution $\pi$.
[guided]
Fix $y\in S$. The random variable $Y$ on $\Omega_T=\Omega\times S$ is the pullback of the original $Y$ on $\Omega$, so it has the same law under $\mathbb{P}_T=\mathbb{P}\otimes\pi$ as under $\mathbb{P}$. The first step proved that the auxiliary time-$0$ state has stationary law, namely
\begin{align*}
\mathbb{P}_T(X_{T,0}=y)=\pi(y).
\end{align*}
Therefore
\begin{align*}
|\mathbb{P}(Y=y)-\pi(y)|=|\mathbb{P}_T(Y=y)-\mathbb{P}_T(X_{T,0}=y)|.
\end{align*}
The only way the indicators of the two events $\{Y=y\}$ and $\{X_{T,0}=y\}$ can differ is if the two random variables themselves differ. Hence
\begin{align*}
|\mathbb{P}_T(Y=y)-\mathbb{P}_T(X_{T,0}=y)|\leq\mathbb{P}_T(Y\neq X_{T,0}).
\end{align*}
The previous step shows that the right-hand side tends to $0$ as $T\to\infty$. Since the left-hand side does not depend on $T$, it must equal $0$. Thus
\begin{align*}
\mathbb{P}(Y=y)=\pi(y).
\end{align*}
Because $y\in S$ was arbitrary, $Y$ has distribution $\pi$ on the finite state space $S$.
[/guided]
[/step]