[proofplan]
Both properties follow from unpacking the definitions of $\lambda$ and $P$ using the axioms of probability. The initial distribution inherits non-negativity and summation to one from the probability measure $\mathbb{P}$, and the transition matrix inherits row-stochasticity from the fact that the conditional probabilities $\mathbb{P}(X_1 = j \mid X_0 = i)$ form a probability distribution over $j \in S$ for each fixed $i$.
[/proofplan]
[step:Show that $\lambda$ is a probability distribution]
Since $\lambda_i = \mathbb{P}(X_0 = i)$ is a probability, we have $\lambda_i \geq 0$ for every $i \in S$. The events $\{X_0 = i\}$ for $i \in S$ are pairwise disjoint (the chain occupies exactly one state at time $0$) and exhaust the sample space $\Omega$, since $X_0$ takes values in $S$. By countable additivity of $\mathbb{P}$,
\begin{align*}
\sum_{i \in S} \lambda_i = \sum_{i \in S} \mathbb{P}(X_0 = i) = \mathbb{P}\!\left(\bigcup_{i \in S} \{X_0 = i\}\right) = \mathbb{P}(\Omega) = 1.
\end{align*}
Hence $\lambda$ is a probability distribution on $S$.
[guided]
We need to verify two things: that every $\lambda_i$ is non-negative, and that the $\lambda_i$ sum to one.
For non-negativity: each $\lambda_i = \mathbb{P}(X_0 = i)$ is the probability of an event, so $\lambda_i \geq 0$ by the axioms of probability.
For summation: the key observation is that the events $\{X_0 = i\}$ for $i \in S$ form a **partition** of the sample space $\Omega$. Why? Because $X_0$ is a random variable taking values in $S$, so for every outcome $\omega \in \Omega$, there is exactly one state $i \in S$ with $X_0(\omega) = i$. The events are therefore pairwise disjoint and their union is all of $\Omega$. Applying the countable additivity axiom of $\mathbb{P}$:
\begin{align*}
\sum_{i \in S} \lambda_i = \sum_{i \in S} \mathbb{P}(X_0 = i) = \mathbb{P}\!\left(\bigcup_{i \in S} \{X_0 = i\}\right) = \mathbb{P}(\Omega) = 1.
\end{align*}
The interchange of the sum and $\mathbb{P}$ is valid because the events are pairwise disjoint and $S$ is countable.
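As a sanity check, the partition argument can be replayed on a small finite sample space. The sketch below is illustrative only: the state space `S`, the table `prob`, and the helper `initial_distribution` are hypothetical names and made-up numbers, not part of the proof. Each outcome $\omega$ is represented as a pair $(X_0(\omega), X_1(\omega))$, and exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

# Hypothetical state space and sample space (made-up numbers for illustration).
# Each outcome omega is identified with the pair (X_0(omega), X_1(omega)).
S = ["a", "b", "c"]
prob = {
    ("a", "a"): Fraction(1, 8),
    ("a", "b"): Fraction(1, 8),
    ("b", "a"): Fraction(1, 4),
    ("b", "c"): Fraction(1, 4),
    ("c", "c"): Fraction(1, 4),
}
assert sum(prob.values()) == 1  # P(Omega) = 1


def initial_distribution(prob, states):
    """Return lambda with lambda_i = P(X_0 = i).

    The event {X_0 = i} is the set of outcomes whose first coordinate is i,
    so its probability is the sum of the weights of those outcomes.
    """
    return {
        i: sum((p for (x0, _), p in prob.items() if x0 == i), Fraction(0))
        for i in states
    }


lam = initial_distribution(prob, S)

# Non-negativity and summation to one, mirroring the two parts of the proof.
assert all(value >= 0 for value in lam.values())
assert sum(lam.values()) == 1
```

Every outcome contributes its weight to exactly one $\lambda_i$, which is the partition property in computational form: no weight is counted twice (disjointness) and none is left out (the events cover $\Omega$).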
[/guided]
[/step]
[step:Show that $P$ is a stochastic matrix]
For non-negativity: $p_{i,j} = \mathbb{P}(X_1 = j \mid X_0 = i) \geq 0$ for all $i, j \in S$, since every conditional probability is non-negative.
For the row-sum condition: fix $i \in S$ with $\mathbb{P}(X_0 = i) > 0$, so that conditioning on $\{X_0 = i\}$ is well defined. The events $\{X_1 = j\}$ for $j \in S$ partition $\Omega$, since $X_1$ takes values in $S$. By countable additivity of the conditional probability $\mathbb{P}(\cdot \mid X_0 = i)$:
\begin{align*}
\sum_{j \in S} p_{i,j} = \sum_{j \in S} \mathbb{P}(X_1 = j \mid X_0 = i) = 1,
\end{align*}
because $\mathbb{P}(\cdot \mid X_0 = i)$ is a probability measure on $(\Omega, \mathcal{F})$, and $\{\{X_1 = j\} : j \in S\}$ is a partition of $\Omega$. Since this holds for every such $i$, the matrix $P$ is stochastic.
[guided]
We verify two properties of the matrix $P = (p_{i,j})_{i,j \in S}$: non-negativity of all entries, and the condition that each row sums to one.
**Non-negativity.** Each entry $p_{i,j} = \mathbb{P}(X_1 = j \mid X_0 = i)$ is a conditional probability, and conditional probabilities are non-negative by definition.
**Row sums.** Fix any state $i \in S$. We must show $\sum_{j \in S} p_{i,j} = 1$. The idea is the same as for $\lambda$: the events $\{X_1 = j\}$ for $j \in S$ partition $\Omega$, because $X_1$ takes values in $S$ and must equal exactly one element of $S$ at each outcome.
Now, provided $\mathbb{P}(X_0 = i) > 0$, the map $A \mapsto \mathbb{P}(A \mid X_0 = i)$ is itself a probability measure on $(\Omega, \mathcal{F})$. Applying its countable additivity to the partition $\{\{X_1 = j\} : j \in S\}$:
\begin{align*}
\sum_{j \in S} p_{i,j} = \sum_{j \in S} \mathbb{P}(X_1 = j \mid X_0 = i) = \mathbb{P}\!\left(\bigcup_{j \in S} \{X_1 = j\} \;\middle|\; X_0 = i\right) = \mathbb{P}(\Omega \mid X_0 = i) = 1.
\end{align*}
Why does this matter? The row-sum condition says that from any state $i$, the chain must go *somewhere*: the probabilities of all possible next states account for all the probability mass. This is precisely what makes $P$ a stochastic matrix.
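The row-sum argument can also be checked numerically. The following is a minimal sketch under stated assumptions: the two-state space `S`, the joint table `joint`, and the helper `transition_matrix` are hypothetical and chosen only for illustration. Conditional probabilities are computed from the elementary formula $\mathbb{P}(X_1 = j \mid X_0 = i) = \mathbb{P}(X_0 = i, X_1 = j) / \mathbb{P}(X_0 = i)$, and rows with $\mathbb{P}(X_0 = i) = 0$ are skipped because the conditioning is undefined there.

```python
from fractions import Fraction

# Hypothetical two-state example (made-up numbers for illustration).
# joint[(i, j)] = P(X_0 = i, X_1 = j).
S = ["a", "b"]
joint = {
    ("a", "a"): Fraction(1, 6),
    ("a", "b"): Fraction(1, 3),
    ("b", "a"): Fraction(1, 8),
    ("b", "b"): Fraction(3, 8),
}
assert sum(joint.values()) == 1  # the joint law is a probability distribution


def transition_matrix(joint, states):
    """Return P with P[i][j] = P(X_1 = j | X_0 = i) for rows where P(X_0 = i) > 0."""
    P = {}
    for i in states:
        # P(X_0 = i), obtained by summing the joint law over the partition {X_1 = j}.
        p_i = sum((joint.get((i, j), Fraction(0)) for j in states), Fraction(0))
        if p_i == 0:
            # Conditioning on a null event is undefined, so this row is omitted.
            continue
        P[i] = {j: joint.get((i, j), Fraction(0)) / p_i for j in states}
    return P


P = transition_matrix(joint, S)
row_sums = {i: sum(row.values()) for i, row in P.items()}

# Each entry is non-negative and each defined row sums to one.
assert all(entry >= 0 for row in P.values() for entry in row.values())
assert all(total == 1 for total in row_sums.values())
```

Dividing each row of the joint table by its own total is exactly the normalisation that turns $\mathbb{P}(\cdot \cap \{X_0 = i\})$ into the probability measure $\mathbb{P}(\cdot \mid X_0 = i)$, which is why the row sums come out to one.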
[/guided]
[/step]