Finite-Dimensional Distribution Formula for Markov Chains (Theorem # 9527)

Proof

[proofplan] We prove the formula by induction on the length of the prescribed path. The case $n=0$ is exactly the definition of the initial distribution. For the induction step, we compare the probability of a path of length $m+1$ with the probability of its length-$m$ prefix; if the prefix has probability zero, the longer path also has probability zero, while if the prefix has positive probability, the Markov property and time-homogeneity identify the final [conditional probability](/page/Conditional%20Probability) with the transition probability $p_{i_m i_{m+1}}$. [/proofplan] [step:Verify the formula for paths of length zero] Fix $i_0 \in S$. By definition of the initial distribution $\lambda$, \begin{align*} \mathbb P(X_0=i_0)=\lambda_{i_0}. \end{align*} For $n=0$, the product $\prod_{r=0}^{-1}p_{i_r i_{r+1}}$ is the empty product and is therefore equal to $1$. Hence \begin{align*} \mathbb P(X_0=i_0)=\lambda_{i_0}\prod_{r=0}^{-1}p_{i_r i_{r+1}}. \end{align*} This proves the asserted identity for $n=0$. [/step] [step:Extend the formula from a prefix path to a one-step longer path] Assume that the formula holds for some $m \ge 0$. Fix states $i_0,\ldots,i_m,i_{m+1} \in S$. Since each $X_r:(\Omega,\mathcal F)\to(S,\mathcal P(S))$ is measurable, the following cylinder events belong to $\mathcal F$. Define the prefix event $A_m \in \mathcal F$ by \begin{align*} A_m=\{X_0=i_0, X_1=i_1,\ldots,X_m=i_m\}, \end{align*} and define the extended path event $A_{m+1} \in \mathcal F$ by \begin{align*} A_{m+1}=\{X_0=i_0, X_1=i_1,\ldots,X_m=i_m,X_{m+1}=i_{m+1}\}. \end{align*} First suppose $\mathbb P(A_m)=0$. Since $A_{m+1}\subset A_m$, monotonicity of probability gives $\mathbb P(A_{m+1})=0$. By the induction hypothesis, \begin{align*} \mathbb P(A_m)=\lambda_{i_0}\prod_{r=0}^{m-1}p_{i_r i_{r+1}}, \end{align*} and therefore \begin{align*} \lambda_{i_0}\prod_{r=0}^{m}p_{i_r i_{r+1}}=\mathbb P(A_m)p_{i_m i_{m+1}}=0. \end{align*} Thus both sides of the desired identity are zero, and the identity holds in this case. Now suppose $\mathbb P(A_m)>0$. 
By the definition of conditional probability applied to the event $A_m$ and the event $\{X_{m+1}=i_{m+1}\}$, \begin{align*} \mathbb P(A_{m+1})=\mathbb P(A_m)\mathbb P(X_{m+1}=i_{m+1}\mid A_m). \end{align*} Since $A_m$ is the history event $\{X_0=i_0,\ldots,X_m=i_m\}$ and has positive probability, the Markov property in the time-homogeneous form stated in the theorem gives \begin{align*} \mathbb P(X_{m+1}=i_{m+1}\mid A_m)=p_{i_m i_{m+1}}. \end{align*} Moreover, because $A_m\subset\{X_m=i_m\}$ and $\mathbb P(A_m)>0$, the event $\{X_m=i_m\}$ also has positive probability, so conditioning on the present state alone is well defined and gives the same one-step transition probability from state $i_m$ to state $i_{m+1}$. Using the induction hypothesis for $\mathbb P(A_m)$, we obtain \begin{align*} \mathbb P(A_{m+1})=\left(\lambda_{i_0}\prod_{r=0}^{m-1}p_{i_r i_{r+1}}\right)p_{i_m i_{m+1}}. \end{align*} Hence \begin{align*} \mathbb P(A_{m+1})=\lambda_{i_0}\prod_{r=0}^{m}p_{i_r i_{r+1}}. \end{align*} [guided] The only subtle point in the induction step is that conditional probabilities given a history are usually stated only when the history has positive probability. We therefore split into two cases. Assume that the formula holds for some $m \ge 0$, and fix states $i_0,\ldots,i_m,i_{m+1} \in S$. Since each $X_r:(\Omega,\mathcal F)\to(S,\mathcal P(S))$ is measurable, the finite intersections defining the path events below belong to $\mathcal F$. Let $A_m \in \mathcal F$ denote the event that the chain follows the prescribed prefix up to time $m$: \begin{align*} A_m=\{X_0=i_0, X_1=i_1,\ldots,X_m=i_m\}. \end{align*} Let $A_{m+1} \in \mathcal F$ denote the event that the chain follows the same path for one additional step: \begin{align*} A_{m+1}=\{X_0=i_0, X_1=i_1,\ldots,X_m=i_m,X_{m+1}=i_{m+1}\}. \end{align*} First suppose $\mathbb P(A_m)=0$. The longer path event is always contained in the prefix event: $A_{m+1}\subset A_m$. Monotonicity of probability gives \begin{align*} 0 \le \mathbb P(A_{m+1}) \le \mathbb P(A_m)=0, \end{align*} so $\mathbb P(A_{m+1})=0$. 
The induction hypothesis says \begin{align*} \mathbb P(A_m)=\lambda_{i_0}\prod_{r=0}^{m-1}p_{i_r i_{r+1}}. \end{align*} Since $\mathbb P(A_m)=0$, multiplying by the transition probability $p_{i_m i_{m+1}}$ gives \begin{align*} \lambda_{i_0}\prod_{r=0}^{m}p_{i_r i_{r+1}}=\mathbb P(A_m)p_{i_m i_{m+1}}=0. \end{align*} Thus both sides of the desired identity for $A_{m+1}$ are zero. Now suppose $\mathbb P(A_m)>0$. This positivity is exactly what permits us to condition on $A_m$. By the definition of conditional probability, \begin{align*} \mathbb P(A_{m+1})=\mathbb P(A_m)\mathbb P(X_{m+1}=i_{m+1}\mid A_m). \end{align*} The event $A_m$ records the full history $X_0=i_0,\ldots,X_m=i_m$. The Markov property in the time-homogeneous form stated in the theorem says that, for every positive-probability history ending at $i_m$, the conditional probability of moving next to $i_{m+1}$ is the transition probability $p_{i_m i_{m+1}}$. Hence \begin{align*} \mathbb P(X_{m+1}=i_{m+1}\mid A_m)=p_{i_m i_{m+1}}. \end{align*} If one writes this transition probability as a conditional probability given the present state alone, this is legitimate here: since $A_m\subset\{X_m=i_m\}$ and $\mathbb P(A_m)>0$, monotonicity gives $\mathbb P(X_m=i_m)>0$. Substituting the Markov identity into the conditional-probability factorization gives \begin{align*} \mathbb P(A_{m+1})=\mathbb P(A_m)p_{i_m i_{m+1}}. \end{align*} Finally, the induction hypothesis identifies the prefix probability: \begin{align*} \mathbb P(A_m)=\lambda_{i_0}\prod_{r=0}^{m-1}p_{i_r i_{r+1}}. \end{align*} Combining the last two displayed identities yields \begin{align*} \mathbb P(A_{m+1})=\lambda_{i_0}\prod_{r=0}^{m}p_{i_r i_{r+1}}. \end{align*} This proves the induction step in both the zero-probability and positive-probability cases. [/guided] [/step] [step:Conclude by induction for every finite path] The base case has been verified, and the induction step shows that validity for paths ending at time $m$ implies validity for paths ending at time $m+1$. 
By induction on $n$, the identity holds for every $n \ge 0$ and every $i_0,\ldots,i_n \in S$: \begin{align*} \mathbb P(X_0=i_0, X_1=i_1,\ldots,X_n=i_n)=\lambda_{i_0}\prod_{r=0}^{n-1}p_{i_r i_{r+1}}. \end{align*} This is the finite-dimensional distribution formula for the [Markov chain](/page/Markov%20Chain). [/step]
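
As a numerical sanity check of the formula just proved, the following minimal sketch evaluates $\lambda_{i_0}\prod_{r=0}^{n-1}p_{i_r i_{r+1}}$ for a small chain. The two-state space, the numbers in `initial` and `transition`, and the helper names `path_probability` and `marginal` are all illustrative assumptions and do not come from the theorem page. The sketch checks two standard consequences of the formula: the path probabilities over $S^{n+1}$ sum to $1$, and summing over all paths that end in a given state recovers the distribution of $X_n$ computed by repeated vector-matrix multiplication.

```python
from itertools import product
from math import isclose, prod

# Hypothetical two-state chain on S = {0, 1}; the numbers are illustrative only.
initial = [0.25, 0.75]        # lambda_i = P(X_0 = i)
transition = [[0.9, 0.1],     # p_ij = one-step transition probability from i to j
              [0.5, 0.5]]


def path_probability(path, initial, transition):
    """Return lambda_{i_0} * prod_{r=0}^{n-1} p_{i_r i_{r+1}} for path = (i_0, ..., i_n)."""
    steps = zip(path, path[1:])   # consecutive pairs (i_r, i_{r+1}); empty when n = 0
    # math.prod of an empty iterable is 1, matching the empty product in the base case.
    return initial[path[0]] * prod(transition[i][j] for i, j in steps)


def marginal(n, initial, transition):
    """Distribution of X_n obtained by n vector-matrix multiplications."""
    dist = list(initial)
    for _ in range(n):
        dist = [sum(dist[i] * transition[i][j] for i in range(len(dist)))
                for j in range(len(dist))]
    return dist


n = 3
states = range(len(initial))
paths = list(product(states, repeat=n + 1))   # every (i_0, ..., i_n) in S^{n+1}

# The path probabilities should form a probability distribution on S^{n+1}.
total = sum(path_probability(p, initial, transition) for p in paths)

# Summing over all paths that end in j should recover P(X_n = j).
end_dist = [sum(path_probability(p, initial, transition) for p in paths if p[-1] == j)
            for j in states]

print(isclose(total, 1.0))
print(all(isclose(a, b) for a, b in zip(end_dist, marginal(n, initial, transition))))
```

Enumerating all $|S|^{n+1}$ paths is only feasible for very small chains; it is used here because it mirrors the statement of the formula directly rather than for efficiency.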
