[proofplan]
We prove an equivalence. The forward direction ($\Rightarrow$) proceeds by induction on the path length $n$: the base case $n = 0$ is immediate from the definition of $\lambda$, and the inductive step factors the joint probability using the multiplication rule for conditional probabilities, then applies the Markov property to collapse the conditioning to the most recent state. The reverse direction ($\Leftarrow$) assumes the product formula and recovers the Markov property by dividing the joint probability of a length-$n$ path by that of the length-$(n-1)$ prefix.
[/proofplan]
[step:Establish notation and set up the two directions]
Write $A_k$ for the event $\{X_k = i_k\}$. We prove each direction of the equivalence separately.
[/step]
[step:Prove the forward direction by induction on $n$]
Suppose $X$ is a Markov chain with initial distribution $\lambda$ and transition matrix $P$. We prove the product formula by induction on $n$.
**Base case** ($n = 0$). The formula asserts $\mathbb{P}(X_0 = i_0) = \lambda_{i_0}$, which holds by the definition of the initial distribution (the product of transition probabilities is empty and equals $1$ by convention).
**Inductive step.** Assume the formula holds for all paths of length $n - 1$. For a path of length $n$, we apply the multiplication rule for conditional probabilities:
\begin{align*}
\mathbb{P}(A_0 \cap \cdots \cap A_n) &= \mathbb{P}(A_0 \cap \cdots \cap A_{n-1}) \cdot \mathbb{P}(A_n \mid A_0 \cap \cdots \cap A_{n-1}).
\end{align*}
By the inductive hypothesis, $\mathbb{P}(A_0 \cap \cdots \cap A_{n-1}) = \lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-2},i_{n-1}}$. For the conditional probability, we use the Markov property: given $X_0 = i_0, \ldots, X_n = i_n$, the conditional distribution of $X_n$ depends only on $X_{n-1}$. Thus
\begin{align*}
\mathbb{P}(X_n = i_n \mid X_0 = i_0, \ldots, X_{n-1} = i_{n-1}) = \mathbb{P}(X_n = i_n \mid X_{n-1} = i_{n-1}) = p_{i_{n-1}, i_n},
\end{align*}
where the last equality uses homogeneity. Combining:
\begin{align*}
\mathbb{P}(A_0 \cap \cdots \cap A_n) = \lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-2},i_{n-1}} \cdot p_{i_{n-1},i_n} = \lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-1},i_n}.
\end{align*}
This completes the induction.
[guided]
We want to show that if $X$ is Markov$(\lambda, P)$, then the joint probability of any finite path factorises as $\lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-1},i_n}$. The natural approach is induction on the path length $n$.
**Base case** ($n = 0$). The product formula with $n = 0$ reads $\mathbb{P}(X_0 = i_0) = \lambda_{i_0}$. This is exactly the definition of the initial distribution — no transition probabilities appear because the chain has not yet taken a step. (The product $\prod_{k=0}^{-1} p_{i_k, i_{k+1}}$ is empty and equals $1$ by convention.)
**Inductive step.** Suppose the formula holds for paths of length $n - 1$, and consider a path of length $n$. The strategy is to peel off the last state and reduce to a path of length $n - 1$.
By the multiplication rule for conditional probabilities (which states $\mathbb{P}(A \cap B) = \mathbb{P}(A) \cdot \mathbb{P}(B \mid A)$ whenever $\mathbb{P}(A) > 0$):
\begin{align*}
\mathbb{P}(A_0 \cap \cdots \cap A_n) &= \mathbb{P}(A_0 \cap \cdots \cap A_{n-1}) \cdot \mathbb{P}(A_n \mid A_0 \cap \cdots \cap A_{n-1}).
\end{align*}
The first factor equals $\lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-2},i_{n-1}}$ by the inductive hypothesis. For the second factor, this is exactly where the Markov property enters: knowing the full history $X_0 = i_0, \ldots, X_{n-1} = i_{n-1}$ provides no more information about $X_n$ than knowing only $X_{n-1} = i_{n-1}$. Formally:
\begin{align*}
\mathbb{P}(X_n = i_n \mid X_0 = i_0, \ldots, X_{n-1} = i_{n-1}) = \mathbb{P}(X_n = i_n \mid X_{n-1} = i_{n-1}) = p_{i_{n-1}, i_n}.
\end{align*}
The first equality is the Markov property; the second is the definition of the transition probability (using homogeneity: the transition probability does not depend on the time step). Multiplying the two factors:
\begin{align*}
\mathbb{P}(A_0 \cap \cdots \cap A_n) = \lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-2},i_{n-1}} \cdot p_{i_{n-1},i_n} = \lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-1},i_n}.
\end{align*}
This completes the induction.
[/guided]
[/step]
[step:Prove the reverse direction by computing the conditional probability from the product formula]
Suppose the product formula holds. We must show $X$ is Markov with initial distribution $\lambda$ and transition matrix $P$.
**Initial distribution.** Setting $n = 0$, the formula gives $\mathbb{P}(X_0 = i_0) = \lambda_{i_0}$ for all $i_0 \in S$, so the initial distribution is $\lambda$.
**Markov property.** Fix $n \geq 1$ and states $i_0, \ldots, i_n \in S$ with $\mathbb{P}(A_0 \cap \cdots \cap A_{n-1}) > 0$. By the definition of conditional probability:
\begin{align*}
\mathbb{P}(X_n = i_n \mid X_0 = i_0, \ldots, X_{n-1} = i_{n-1}) &= \frac{\mathbb{P}(A_0 \cap \cdots \cap A_n)}{\mathbb{P}(A_0 \cap \cdots \cap A_{n-1})}.
\end{align*}
Substituting the product formula for both numerator and denominator:
\begin{align*}
\frac{\mathbb{P}(A_0 \cap \cdots \cap A_n)}{\mathbb{P}(A_0 \cap \cdots \cap A_{n-1})} &= \frac{\lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-1},i_n}}{\lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-2},i_{n-1}}} = p_{i_{n-1},i_n}.
\end{align*}
The cancellation is valid because $\mathbb{P}(A_0 \cap \cdots \cap A_{n-1}) > 0$ implies $\lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-2},i_{n-1}} > 0$, so all factors in the denominator are positive. The result $p_{i_{n-1},i_n}$ depends only on $i_{n-1}$ and $i_n$, not on the earlier states $i_0, \ldots, i_{n-2}$. This is exactly the Markov property with transition probabilities $p_{i,j}$, confirming that $X$ is a Markov chain with initial distribution $\lambda$ and transition matrix $P$.
[guided]
Now suppose we are given the product formula and must recover the Markov property. The strategy is direct: compute $\mathbb{P}(X_n = i_n \mid X_0 = i_0, \ldots, X_{n-1} = i_{n-1})$ and show it depends only on $i_{n-1}$.
**Initial distribution.** Setting $n = 0$ in the product formula gives $\mathbb{P}(X_0 = i_0) = \lambda_{i_0}$, which identifies $\lambda$ as the initial distribution.
**Markov property.** Fix $n \geq 1$ and any states $i_0, \ldots, i_n \in S$ such that the conditioning event has positive probability: $\mathbb{P}(X_0 = i_0, \ldots, X_{n-1} = i_{n-1}) > 0$. By the definition of conditional probability (the ratio definition, which applies since the conditioning event has positive probability):
\begin{align*}
\mathbb{P}(X_n = i_n \mid X_0 = i_0, \ldots, X_{n-1} = i_{n-1}) &= \frac{\mathbb{P}(X_0 = i_0, \ldots, X_n = i_n)}{\mathbb{P}(X_0 = i_0, \ldots, X_{n-1} = i_{n-1})}.
\end{align*}
Now apply the product formula to both numerator and denominator. The numerator is $\lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-1},i_n}$, and the denominator is $\lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-2},i_{n-1}}$. Dividing:
\begin{align*}
\frac{\lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-1},i_n}}{\lambda_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-2},i_{n-1}}} = p_{i_{n-1},i_n}.
\end{align*}
The cancellation is valid because $\mathbb{P}(A_0 \cap \cdots \cap A_{n-1}) > 0$ forces every factor in the denominator to be strictly positive (if any one were zero, the product would be zero).
The result $p_{i_{n-1},i_n}$ depends only on the current state $i_{n-1}$ and the proposed next state $i_n$ — it does not depend on the earlier states $i_0, \ldots, i_{n-2}$. This is precisely the Markov property. Since the transition probability $p_{i_{n-1},i_n}$ does not depend on $n$, the chain is also homogeneous.
[/guided]
[/step]