Monotone Coupling from the Past Correctness Theorem

Theorem

Edit Issues Pull Requests Attributions Admin

Let $(S,\preceq)$ be a finite partially ordered set with a least element $\hat{0}\in S$ and a greatest element $\hat{1}\in S$, meaning that $\hat{0}\preceq x\preceq \hat{1}$ for every $x\in S$. Equip $S$ and $S^S$ with their discrete sigma-algebras. Let $P:S\times S\to[0,1]$ be a Markov transition matrix. Let $\pi:S\to[0,1]$ be a stationary probability mass function for $P$, meaning that \begin{align*} \sum_{x\in S}\pi(x)=1 \end{align*} and \begin{align*} \sum_{x\in S}\pi(x)P(x,y)=\pi(y) \end{align*} for every $y\in S$. Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space, and let $(F_n)_{n\leq -1}$ be independent identically distributed random maps, meaning measurable maps \begin{align*} F_n:\Omega\to S^S \end{align*} such that, for every $n\leq -1$ and every $x,y\in S$, \begin{align*} \mathbb{P}(F_n(x)=y)=P(x,y). \end{align*} Assume that each $F_n$ is monotone almost surely, meaning that \begin{align*} \mathbb{P}\bigl(F_n(x)\preceq F_n(y)\text{ for all }x,y\in S\text{ with }x\preceq y\bigr)=1 \end{align*} for every $n\leq -1$. For integers $a\leq b\leq 0$, define the reused-randomness composition as the measurable map \begin{align*} \Phi_{a,b}:\Omega\to S^S. \end{align*} If $a<b$, then for every $\omega\in\Omega$, the map $\Phi_{a,b}(\omega):S\to S$ is given by \begin{align*} x\mapsto F_{b-1}(\omega)(F_{b-2}(\omega)(\cdots F_a(\omega)(x)\cdots)). \end{align*} If $a=b$, then $\Phi_{a,a}(\omega)=\operatorname{id}_S$ for every $\omega\in\Omega$. Define the extremal coalescence time $\tau:\Omega\to\mathbb{N}\cup\{\infty\}$ by \begin{align*} \tau=\inf\{t\geq 1:\Phi_{-t,0}(\hat{0})=\Phi_{-t,0}(\hat{1})\}. \end{align*} Assume $\tau<\infty$ almost surely. Define the monotone coupling-from-the-past output $Y:\Omega\to S$ by \begin{align*} Y(\omega)=\Phi_{-\tau(\omega),0}(\omega)(\hat{0}) \end{align*} when $\tau(\omega)<\infty$, and by $Y(\omega)=\hat{0}$ when $\tau(\omega)=\infty$. Then $Y$ has distribution $\pi$; that is, \begin{align*} \mathbb{P}(Y=y)=\pi(y) \end{align*} for every $y\in S$.

Discussion

Proof

[proofplan] The proof first shows that every finite composition of monotone update maps is monotone. Because $\hat{0}$ is a least element and $\hat{1}$ is a greatest element, every starting state is trapped between the lower and upper extremal trajectories. At the random time $\tau$, the two extremal images coincide, so antisymmetry of the partial order forces every starting state to have the same image. Finally, a direct stationarity argument with an independent $\pi$-distributed start shows that this common value has law $\pi$. [/proofplan] [step:Restrict to the probability-one event where all relevant update maps are monotone] For each integer $n\leq -1$, define the event \begin{align*} A_n=\{F_n(x)\preceq F_n(y)\text{ for all }x,y\in S\text{ with }x\preceq y\}. \end{align*} By hypothesis, $\mathbb{P}(A_n)=1$ for every $n\leq -1$. Since the set of negative integers is countable, the event \begin{align*} A=\bigcap_{n\leq -1}A_n \end{align*} satisfies $\mathbb{P}(A)=1$. Define also \begin{align*} B=\{\tau<\infty\}. \end{align*} By hypothesis, $\mathbb{P}(B)=1$. Hence $\mathbb{P}(A\cap B)=1$. It is therefore enough to prove the coalescence assertion on $A\cap B$, since modifying $Y$ outside this probability-one event does not change its distribution. [/step] [step:Show that each reused-randomness composition is monotone] Fix $\omega\in A$, and fix integers $a<b\leq 0$. We prove that the map \begin{align*} \Phi_{a,b}(\omega):S&\to S \end{align*} is monotone. Let $x,y\in S$ satisfy $x\preceq y$. Since $\omega\in A$, each map $F_n(\omega):S\to S$ is monotone for $a\leq n\leq b-1$. Applying these monotonicity relations successively gives \begin{align*} F_a(\omega)(x)\preceq F_a(\omega)(y). \end{align*} Applying $F_{a+1}(\omega)$ to both sides preserves the order, and iterating through $F_{b-1}(\omega)$ gives \begin{align*} \Phi_{a,b}(\omega)(x)\preceq \Phi_{a,b}(\omega)(y). \end{align*} Thus $\Phi_{a,b}(\omega)$ is monotone. [guided] Fix an outcome $\omega\in A$. The definition of $A$ means that, for every negative integer $n$, the deterministic map \begin{align*} F_n(\omega):S&\to S \end{align*} is order-preserving: whenever $u,v\in S$ satisfy $u\preceq v$, we have \begin{align*} F_n(\omega)(u)\preceq F_n(\omega)(v). \end{align*} Now fix integers $a<b\leq 0$. We want to show that the whole composition \begin{align*} \Phi_{a,b}(\omega):S&\to S \end{align*} \begin{align*} x&\mapsto F_{b-1}(\omega)(F_{b-2}(\omega)(\cdots F_a(\omega)(x)\cdots)) \end{align*} is monotone. Take arbitrary $x,y\in S$ with $x\preceq y$. Applying monotonicity of $F_a(\omega)$ first gives \begin{align*} F_a(\omega)(x)\preceq F_a(\omega)(y). \end{align*} The outputs on both sides are again elements of $S$, so monotonicity of $F_{a+1}(\omega)$ applies to this new ordered pair. Continuing one map at a time, every application preserves the order. After the last map $F_{b-1}(\omega)$ is applied, the resulting inequality is \begin{align*} \Phi_{a,b}(\omega)(x)\preceq \Phi_{a,b}(\omega)(y). \end{align*} Since $x$ and $y$ were arbitrary ordered states, $\Phi_{a,b}(\omega)$ is monotone. [/guided] [/step] [step:Use the extremal states to trap every trajectory] Fix $\omega\in A\cap B$, and let $T=\tau(\omega)\in\mathbb{N}$. Since $\hat{0}$ is least and $\hat{1}$ is greatest, every $x\in S$ satisfies \begin{align*} \hat{0}\preceq x\preceq \hat{1}. \end{align*} By the monotonicity of $\Phi_{-T,0}(\omega):S\to S$, proved in the previous step, we obtain \begin{align*} \Phi_{-T,0}(\omega)(\hat{0})\preceq \Phi_{-T,0}(\omega)(x) \end{align*} and \begin{align*} \Phi_{-T,0}(\omega)(x)\preceq \Phi_{-T,0}(\omega)(\hat{1}). \end{align*} By the definition of $\tau$, the two extremal images agree at time $T$: \begin{align*} \Phi_{-T,0}(\omega)(\hat{0})=\Phi_{-T,0}(\omega)(\hat{1}). \end{align*} Let $z_\omega\in S$ denote this common value. Then \begin{align*} z_\omega\preceq \Phi_{-T,0}(\omega)(x) \end{align*} and \begin{align*} \Phi_{-T,0}(\omega)(x)\preceq z_\omega. \end{align*} By antisymmetry of the partial order $\preceq$, $\Phi_{-T,0}(\omega)(x)=z_\omega$. Since $x\in S$ was arbitrary, every starting state coalesces at time $T=\tau(\omega)$. [/step] [step:Identify the output law by starting the coupled chain in stationarity] Let $(S,2^S,\pi)$ denote the finite probability space with mass function $\pi$, and work on the product probability space \begin{align*} (\widetilde{\Omega},\widetilde{\mathcal{F}},\widetilde{\mathbb{P}})=(\Omega\times S,\mathcal{F}\otimes 2^S,\mathbb{P}\otimes\pi). \end{align*} Define the auxiliary initial state \begin{align*} Z:\widetilde{\Omega}&\to S \end{align*} \begin{align*} (\omega,z)&\mapsto z. \end{align*} For each $t\in\mathbb{N}$, define the time-zero state obtained by applying the reused updates from time $-t$ by \begin{align*} W_t:\widetilde{\Omega}&\to S \end{align*} \begin{align*} (\omega,z)&\mapsto \Phi_{-t,0}(\omega)(z). \end{align*} Define also the pullback of the CFTP output to the product space by \begin{align*} \widetilde{Y}:\widetilde{\Omega}&\to S \end{align*} \begin{align*} (\omega,z)&\mapsto Y(\omega). \end{align*} The [random variable](/page/Random%20Variable) $Z$ has distribution $\pi$ and is independent of $F_{-t},\dots,F_{-1}$. For $0\leq k\leq t$, define \begin{align*} V_k:\widetilde{\Omega}&\to S \end{align*} \begin{align*} (\omega,z)&\mapsto \Phi_{-t,-t+k}(\omega)(z), \end{align*} where the case $k=0$ uses the identity composition $\Phi_{-t,-t}=\operatorname{id}_S$. Thus $V_0=Z$ and $V_t=W_t$. For $0\leq k<t$ and $u,v\in S$ with $\widetilde{\mathbb{P}}(V_k=u)>0$, conditioning on the value $V_k=u$ and using independence of $F_{-t+k}$ from the earlier maps $F_{-t},\dots,F_{-t+k-1}$ gives \begin{align*} \widetilde{\mathbb{P}}(V_{k+1}=v\mid V_k=u)=\mathbb{P}(F_{-t+k}(u)=v)=P(u,v). \end{align*} For states $u\in S$ with $\widetilde{\mathbb{P}}(V_k=u)=0$, their contribution to the total-probability formula is zero. Hence the law of $V_{k+1}$ is obtained from the law of $V_k$ by applying $P$. Since $V_0$ has law $\pi$ and $\pi$ is stationary, induction over $k$ gives that $V_t=W_t$ has law $\pi$. Therefore \begin{align*} \widetilde{\mathbb{P}}(W_t=y)=\pi(y) \end{align*} for every $y\in S$ and every $t\in\mathbb{N}$. Now fix $t\in\mathbb{N}$. On the event $\{\tau\leq t\}\cap A$, the previous step applied at time $\tau(\omega)$ shows that $\Phi_{-\tau(\omega),0}(\omega)$ maps every state of $S$ to $Y(\omega)$. For $z\in S$, the composition identity \begin{align*} \Phi_{-t,0}(\omega)(z)=\Phi_{-\tau(\omega),0}(\omega)(\Phi_{-t,-\tau(\omega)}(\omega)(z)) \end{align*} therefore implies $W_t(\omega,z)=\widetilde{Y}(\omega,z)$. Hence, for every $y\in S$, \begin{align*} |\mathbb{P}(Y=y)-\pi(y)|=|\widetilde{\mathbb{P}}(\widetilde{Y}=y)-\widetilde{\mathbb{P}}(W_t=y)|\leq \widetilde{\mathbb{P}}(W_t\neq \widetilde{Y})\leq \mathbb{P}(\tau>t)+\mathbb{P}(A^c). \end{align*} Since $\mathbb{P}(A^c)=0$ and $\tau<\infty$ almost surely, continuity from above gives \begin{align*} \lim_{t\to\infty}\mathbb{P}(\tau>t)=0. \end{align*} Letting $t\to\infty$ yields \begin{align*} \mathbb{P}(Y=y)=\pi(y) \end{align*} for every $y\in S$. This proves that the monotone coupling-from-the-past output has distribution $\pi$. [guided] The point of this step is to prove the CFTP output law directly, without appealing to an external correctness theorem. We introduce one extra independent starting state with the stationary distribution and then run it forward using the same update maps. Let $(S,2^S,\pi)$ be the finite probability space whose probability of the singleton $\{z\}$ is $\pi(z)$, and form \begin{align*} (\widetilde{\Omega},\widetilde{\mathcal{F}},\widetilde{\mathbb{P}})=(\Omega\times S,\mathcal{F}\otimes 2^S,\mathbb{P}\otimes\pi). \end{align*} Define \begin{align*} Z:\widetilde{\Omega}&\to S \end{align*} \begin{align*} (\omega,z)&\mapsto z. \end{align*} Then $Z$ has distribution $\pi$ and is independent of all update maps because the product measure separates the original randomness $\omega$ from the coordinate $z$. For a fixed $t\in\mathbb{N}$, define \begin{align*} W_t:\widetilde{\Omega}&\to S \end{align*} \begin{align*} (\omega,z)&\mapsto \Phi_{-t,0}(\omega)(z). \end{align*} Also define the product-space version of the output by \begin{align*} \widetilde{Y}:\widetilde{\Omega}&\to S \end{align*} \begin{align*} (\omega,z)&\mapsto Y(\omega). \end{align*} This declaration matters because $Y$ was originally a random variable on $\Omega$, while the comparison with $W_t$ takes place on $\widetilde{\Omega}$. For $0\leq k\leq t$, define \begin{align*} V_k:\widetilde{\Omega}&\to S \end{align*} \begin{align*} (\omega,z)&\mapsto \Phi_{-t,-t+k}(\omega)(z). \end{align*} The case $k=0$ is meaningful because the theorem statement defines the empty composition $\Phi_{-t,-t}$ to be $\operatorname{id}_S$, so $V_0=Z$. The case $k=t$ gives $V_t=W_t$. We now verify that $(V_k)_{k=0}^t$ evolves with transition matrix $P$. Fix $0\leq k<t$ and states $u,v\in S$ with $\widetilde{\mathbb{P}}(V_k=u)>0$. Conditional on $V_k=u$, the next state is obtained by applying $F_{-t+k}$ to $u$. The map $F_{-t+k}$ is independent of the earlier maps used to form $V_k$, namely $F_{-t},\dots,F_{-t+k-1}$, and its one-step marginal satisfies \begin{align*} \mathbb{P}(F_{-t+k}(u)=v)=P(u,v). \end{align*} Therefore \begin{align*} \widetilde{\mathbb{P}}(V_{k+1}=v\mid V_k=u)=P(u,v). \end{align*} If $\widetilde{\mathbb{P}}(V_k=u)=0$, then the state $u$ contributes zero to the total-probability expansion for the law of $V_{k+1}$. Hence each step updates the current law by the transition matrix $P$. Stationarity gives $\pi P=\pi$, so induction over $k$ yields that $V_t=W_t$ has distribution $\pi$. Therefore \begin{align*} \widetilde{\mathbb{P}}(W_t=y)=\pi(y) \end{align*} for every $y\in S$. It remains to compare $W_t$ with the CFTP output $Y$. If $\tau(\omega)\leq t$ and $\omega\in A$, then the previous step says that the map $\Phi_{-\tau(\omega),0}(\omega):S\to S$ sends every state to $Y(\omega)$. The composition from $-t$ to $0$ factors through time $-\tau(\omega)$: \begin{align*} \Phi_{-t,0}(\omega)(z)=\Phi_{-\tau(\omega),0}(\omega)(\Phi_{-t,-\tau(\omega)}(\omega)(z)). \end{align*} When $\tau(\omega)=t$, the inner composition is the identity map $\Phi_{-t,-t}(\omega)=\operatorname{id}_S$, so the same formula still applies. The inner value $\Phi_{-t,-\tau(\omega)}(\omega)(z)$ is an element of $S$, so the outer coalesced map sends it to $Y(\omega)=\widetilde{Y}(\omega,z)$. Thus $W_t(\omega,z)=\widetilde{Y}(\omega,z)$ whenever $\tau(\omega)\leq t$ and $\omega\in A$. For each $y\in S$, comparing the two events on the product space gives \begin{align*} |\mathbb{P}(Y=y)-\pi(y)|=|\widetilde{\mathbb{P}}(\widetilde{Y}=y)-\widetilde{\mathbb{P}}(W_t=y)|\leq \widetilde{\mathbb{P}}(W_t\neq \widetilde{Y}). \end{align*} The preceding paragraph shows that disagreement can occur only if $\tau>t$ or $A$ fails, so \begin{align*} \widetilde{\mathbb{P}}(W_t\neq \widetilde{Y})\leq \mathbb{P}(\tau>t)+\mathbb{P}(A^c). \end{align*} We already proved $\mathbb{P}(A^c)=0$, and the hypothesis $\tau<\infty$ almost surely implies, by continuity from above for the decreasing events $\{\tau>t\}$, that \begin{align*} \lim_{t\to\infty}\mathbb{P}(\tau>t)=0. \end{align*} Letting $t\to\infty$ gives \begin{align*} \mathbb{P}(Y=y)=\pi(y) \end{align*} for every $y\in S$. [/guided] [/step]

What brings you to Androma?

Start with a route through the knowledge graph.