[proofplan]
We decompose the interventional probability under $\mathbb P_x$ by conditioning on the values $Z=z$ of the covariate $Z$, which ranges over the countable set $\mathcal Z$. The graphical back-door assumptions have two consequences: first, intervening on $X$ does not change the law of the non-descendant covariates $Z$; second, the action $do(X=x)$ can be exchanged for the observation $X=x$ after conditioning on $Z=z$. Substituting these two identities into the total-probability decomposition gives the point-mass formula, and the event formula follows by summing over the elements of an event $A\subset\mathcal Y$, using that $\mathcal Y$ is countable.
[/proofplan]
[step:Decompose the interventional law over the countable values of $Z$]Fix $y\in\mathcal Y$. Since $\mathcal Z$ is countable, the events $\{Z=z\}$, for $z\in\mathcal Z$, form a countable measurable partition of the sample space under the interventional law $\mathbb P_x$. By the [law of total probability](/theorems/1113) for countable partitions,
\begin{align*}
\mathbb P_x(Y=y)=\sum_{z\in\mathcal Z}\mathbb P_x(Y=y\mid Z=z)\mathbb P_x(Z=z),
\end{align*}
where terms with $\mathbb P_x(Z=z)=0$ are interpreted as zero.[/step]
[guided]We begin under the interventional law, not under the observational law. Fix a value $y\in\mathcal Y$. The random element $Z$ takes values in the [countable set](/page/Countable%20Set) $\mathcal Z$, so the events $\{Z=z\}$ for $z\in\mathcal Z$ are measurable, pairwise disjoint, and their union is the whole sample space. Therefore the countable form of the law of total probability applies to the probability measure $\mathbb P_x$.
For each $z\in\mathcal Z$ with $\mathbb P_x(Z=z)>0$, the elementary [conditional probability](/page/Conditional%20Probability) is
\begin{align*}
\mathbb P_x(Y=y\mid Z=z)=\frac{\mathbb P_x(Y=y,Z=z)}{\mathbb P_x(Z=z)}.
\end{align*}
For $z$ with $\mathbb P_x(Z=z)=0$, the conditional probability is undefined, but the corresponding summand carries the factor $\mathbb P_x(Z=z)=0$ and is interpreted as zero, so no value needs to be assigned to it. Thus the countable partition formula gives
\begin{align*}
\mathbb P_x(Y=y)=\sum_{z\in\mathcal Z}\mathbb P_x(Y=y\mid Z=z)\mathbb P_x(Z=z).
\end{align*}
This is the place where discreteness and countability are used: the adjustment formula is a countable sum over covariate strata.[/guided]
[step:Use the non-descendant condition to keep the law of $Z$ unchanged]
The hypothesis that no vertex in $Z$ is a descendant of a vertex in $X$ implies, by [Pearl's rule three for deleting actions](/theorems/9680) [citetheorem:9680], that the intervention $do(X=x)$ does not change the marginal law of $Z$. Hence, for each $z\in\mathcal Z$,
\begin{align*}
\mathbb P_x(Z=z)=\mathbb P(Z=z).
\end{align*}
Consequently the previous decomposition becomes
\begin{align*}
\mathbb P_x(Y=y)=\sum_{z\in\mathcal Z}\mathbb P_x(Y=y\mid Z=z)\mathbb P(Z=z).
\end{align*}
All summands with $\mathbb P(Z=z)=0$ vanish, so the sum may be restricted to those $z\in\mathcal Z$ satisfying $\mathbb P(Z=z)>0$.
[/step]
[step:Exchange the intervention on $X$ for conditioning on $X=x$]Let $G_{\underline X}$ denote the graph obtained from $G$ by deleting every directed edge whose tail is a vertex of $X$, that is, every edge pointing out of $X$. The back-door blocking assumption says precisely that $Z$ blocks every path from $X$ to $Y$ in $G_{\underline X}$. Thus $Y$ and $X$ are d-separated by $Z$ in $G_{\underline X}$.
For every $z\in\mathcal Z$ with $\mathbb P(Z=z)>0$, the positivity assumption gives
\begin{align*}
\mathbb P(X=x,Z=z)>0,
\end{align*}
so the observational conditional probability $\mathbb P(Y=y\mid X=x,Z=z)$ is defined. Applying [Pearl's rule two](/theorems/9679) [citetheorem:9679] with conditioning set $Z$ and action variable $X$ yields
\begin{align*}
\mathbb P_x(Y=y\mid Z=z)=\mathbb P(Y=y\mid X=x,Z=z).
\end{align*}[/step]
[guided]The goal of this step is to replace the interventional conditional probability by an observational conditional probability. Define $G_{\underline X}$ to be the mutilated graph obtained from $G$ by deleting every outgoing directed edge from a vertex in $X$. In this graph, every path from $X$ to $Y$ is a back-door path in the sense relevant for exchanging action and observation: the causal arrows leaving $X$ have been removed, so any remaining open association between $X$ and $Y$ must pass through noncausal paths.
The hypothesis states that $Z$ blocks every path from a vertex of $X$ to a vertex of $Y$ whose first edge has an arrowhead into that vertex of $X$. Equivalently, after deleting the outgoing arrows from $X$, the set $Z$ d-separates $X$ from $Y$ in $G_{\underline X}$. Therefore the graphical hypothesis required by Pearl's rule two is satisfied.
Now fix $z\in\mathcal Z$ with $\mathbb P(Z=z)>0$. The theorem also assumes positivity:
\begin{align*}
\mathbb P(X=x,Z=z)>0.
\end{align*}
This matters because the right-hand side we want to write is the elementary conditional probability
\begin{align*}
\mathbb P(Y=y\mid X=x,Z=z)=\frac{\mathbb P(Y=y,X=x,Z=z)}{\mathbb P(X=x,Z=z)}.
\end{align*}
The denominator is positive by the positivity assumption, so this conditional probability is defined.
With the d-separation condition verified in $G_{\underline X}$ and the relevant conditional probabilities defined, Pearl's rule two [citetheorem:9679] exchanges the intervention $do(X=x)$ for the observation $X=x$ after conditioning on $Z=z$. Hence
\begin{align*}
\mathbb P_x(Y=y\mid Z=z)=\mathbb P(Y=y\mid X=x,Z=z).
\end{align*}[/guided]
[step:Substitute the graphical identities into the total-probability decomposition]
Combining the previous two steps, for every $y\in\mathcal Y$,
\begin{align*}
\mathbb P_x(Y=y)=\sum_{\substack{z\in\mathcal Z, \mathbb P(Z=z)>0}}\mathbb P(Y=y\mid X=x,Z=z)\mathbb P(Z=z).
\end{align*}
This proves the pointwise [back-door adjustment formula](/theorems/9695).
[/step]
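[guided]As a purely illustrative check of the pointwise formula, consider the following hypothetical numbers, which are assumptions made for this example and are not part of the theorem. Suppose $\mathcal Z=\{0,1\}$, the graph has edges $Z\to X$, $Z\to Y$ and $X\to Y$, and the observational law satisfies
\begin{align*}
\mathbb P(Z=0)&=0.6, & \mathbb P(Z=1)&=0.4,\\
\mathbb P(X=x\mid Z=0)&=0.2, & \mathbb P(X=x\mid Z=1)&=0.8,\\
\mathbb P(Y=y\mid X=x,Z=0)&=0.5, & \mathbb P(Y=y\mid X=x,Z=1)&=0.9.
\end{align*}
Both strata have positive probability and $\mathbb P(X=x,Z=z)>0$ for $z\in\{0,1\}$, so the adjustment sum has two terms:
\begin{align*}
\mathbb P_x(Y=y)=0.5\cdot 0.6+0.9\cdot 0.4=0.66.
\end{align*}
For comparison, the unadjusted observational conditional probability in this example is
\begin{align*}
\mathbb P(Y=y\mid X=x)=\frac{0.5\cdot 0.2\cdot 0.6+0.9\cdot 0.8\cdot 0.4}{0.2\cdot 0.6+0.8\cdot 0.4}=\frac{0.348}{0.44}\approx 0.791.
\end{align*}
The two values differ because the observational conditional weights the strata by $\mathbb P(Z=z\mid X=x)$, whereas the adjustment formula weights them by the marginal $\mathbb P(Z=z)$.[/guided]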
[step:Sum the pointwise formula over a countable event in the state space of $Y$]
Let $A\subset\mathcal Y$. Since $\mathcal Y$ is countable and the events $\{Y=y\}$ for $y\in A$ are pairwise disjoint,
\begin{align*}
\mathbb P_x(Y\in A)=\sum_{y\in A}\mathbb P_x(Y=y).
\end{align*}
Substituting the pointwise formula into each term gives
\begin{align*}
\mathbb P_x(Y\in A)=\sum_{y\in A}\sum_{\substack{z\in\mathcal Z, \mathbb P(Z=z)>0}}\mathbb P(Y=y\mid X=x,Z=z)\mathbb P(Z=z).
\end{align*}
Interchanging the countable sums is justified because every summand is nonnegative. Thus
\begin{align*}
\mathbb P_x(Y\in A)=\sum_{\substack{z\in\mathcal Z, \mathbb P(Z=z)>0}}\left(\sum_{y\in A}\mathbb P(Y=y\mid X=x,Z=z)\right)\mathbb P(Z=z).
\end{align*}
For each such $z$, positivity gives $\mathbb P(X=x,Z=z)>0$, and [countable additivity of conditional probability](/theorems/1112) gives
\begin{align*}
\sum_{y\in A}\mathbb P(Y=y\mid X=x,Z=z)=\mathbb P(Y\in A\mid X=x,Z=z).
\end{align*}
Therefore
\begin{align*}
\mathbb P_x(Y\in A)=\sum_{\substack{z\in\mathcal Z, \mathbb P(Z=z)>0}}\mathbb P(Y\in A\mid X=x,Z=z)\mathbb P(Z=z).
\end{align*}
This is the asserted event version of the back-door adjustment formula.
[/step]
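[guided]The passage from the pointwise formula to the event formula can be checked on a small example. The numbers below are hypothetical and chosen only for illustration; the hypotheses of the theorem are assumed to hold for them. Take $\mathcal Y=\{0,1,2\}$, $A=\{1,2\}$, $\mathcal Z=\{0,1\}$, with $\mathbb P(Z=0)=0.6$, $\mathbb P(Z=1)=0.4$ and
\begin{align*}
\mathbb P(Y=1\mid X=x,Z=0)&=0.3, & \mathbb P(Y=2\mid X=x,Z=0)&=0.2,\\
\mathbb P(Y=1\mid X=x,Z=1)&=0.5, & \mathbb P(Y=2\mid X=x,Z=1)&=0.4.
\end{align*}
Summing the pointwise formula over $y\in A$ gives
\begin{align*}
\mathbb P_x(Y=1)&=0.3\cdot 0.6+0.5\cdot 0.4=0.38,\\
\mathbb P_x(Y=2)&=0.2\cdot 0.6+0.4\cdot 0.4=0.28,\\
\mathbb P_x(Y\in A)&=0.38+0.28=0.66.
\end{align*}
Summing within each stratum first gives $\mathbb P(Y\in A\mid X=x,Z=0)=0.5$ and $\mathbb P(Y\in A\mid X=x,Z=1)=0.9$, so the event formula yields
\begin{align*}
\mathbb P_x(Y\in A)=0.5\cdot 0.6+0.9\cdot 0.4=0.66,
\end{align*}
in agreement with the first computation, as the interchange of the two sums requires.[/guided]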