[proofplan]
The proof unpacks the graphical instrumental-variable criterion clause by clause. The first two hypotheses give possible relevance and exclusion directly from the directed-path structure of $G$. The third hypothesis is converted into an independence statement by the [global Markov property for directed acyclic graphs](/theorems/9667), applied to the mutilated graph $G^{\operatorname{out}(Z)}$. The final assertion is exactly the definition of distributional relevance under interventions.
[/proofplan]
[step:Delete the outgoing arrows from $Z$ and record the relevant graph]
By definition,
\begin{align*}
E^{\operatorname{out}(Z)}:=E\setminus\{(Z,W)\in E:W\in V\}.
\end{align*}
Thus $G^{\operatorname{out}(Z)}=(V,E^{\operatorname{out}(Z)})$ has the same vertex set as $G$ and is obtained only by removing arrows whose tail is $Z$. Deleting arrows cannot create a directed cycle: since $E^{\operatorname{out}(Z)}\subseteq E$, any directed cycle in $G^{\operatorname{out}(Z)}$ would also be a directed cycle in $G$, which is acyclic. Hence $G^{\operatorname{out}(Z)}$ is also a directed acyclic graph.
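The construction can be illustrated on a small example. The following is a minimal sketch, assuming a hypothetical toy graph with vertices $Z,A,Y,U$ and arrows $Z\to A$, $A\to Y$, $U\to A$, $U\to Y$; the function names `remove_outgoing` and `is_acyclic` are illustrative and not part of the statement.

```python
# Hypothetical toy DAG: Z -> A -> Y, with a confounder U -> A and U -> Y.
V = {"Z", "A", "Y", "U"}
E = {("Z", "A"), ("A", "Y"), ("U", "A"), ("U", "Y")}


def remove_outgoing(edges, z):
    """Return E^{out(z)}: every arrow of `edges` whose tail is not z."""
    return {(tail, head) for (tail, head) in edges if tail != z}


def is_acyclic(vertices, edges):
    """Kahn's algorithm: the digraph is acyclic iff every vertex can be
    removed in turn once all of its incoming arrows have been removed."""
    indegree = {v: 0 for v in vertices}
    for _, head in edges:
        indegree[head] += 1
    queue = [v for v in vertices if indegree[v] == 0]
    removed = 0
    while queue:
        v = queue.pop()
        removed += 1
        for tail, head in edges:
            if tail == v:
                indegree[head] -= 1
                if indegree[head] == 0:
                    queue.append(head)
    return removed == len(vertices)


E_out = remove_outgoing(E, "Z")
print(sorted(E_out))  # [('A', 'Y'), ('U', 'A'), ('U', 'Y')]
print(is_acyclic(V, E), is_acyclic(V, E_out))  # True True
```

In this toy graph the only arrow with tail $Z$ is $Z\to A$, so the mutilated graph keeps the three remaining arrows and stays acyclic, as the argument above guarantees in general.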
[/step]
[step:Use the path from $Z$ to $A$ to verify graphical possible relevance]
The first hypothesis states that there is a directed path from $Z$ to $A$ in $G$. This is precisely the graphical possible-relevance part of the instrumental-variable criterion: the graph contains a directed causal route along which changes in $Z$ may affect $A$. No distributional relevance is inferred at this point; the conclusion here is only the graphical condition.
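As an illustration of the purely graphical nature of this condition, the sketch below checks for a directed path by depth-first search. The toy edge set and the helper name `has_directed_path` are assumptions introduced here for illustration only.

```python
# Hypothetical toy DAG: Z -> A -> Y, with a confounder U -> A and U -> Y.
E = {("Z", "A"), ("A", "Y"), ("U", "A"), ("U", "Y")}


def has_directed_path(edges, source, target):
    """Depth-first search for a directed path of length >= 1
    from `source` to `target`."""
    stack, seen = [source], set()
    while stack:
        v = stack.pop()
        for tail, head in edges:
            if tail == v and head not in seen:
                if head == target:
                    return True
                seen.add(head)
                stack.append(head)
    return False


print(has_directed_path(E, "Z", "A"))  # True: the arrow Z -> A
print(has_directed_path(E, "A", "Z"))  # False: no arrow points into Z
```

The check inspects only the arrows of the graph; it says nothing about whether the law of $X_A$ actually responds to $Z$.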
[/step]
[step:Use the directed-path condition to verify graphical exclusion]
The second hypothesis states that every directed path from $Z$ to $Y$ in $G$ contains $A$. Therefore there is no directed path from $Z$ to $Y$ in $G$ that avoids $A$. This is exactly the graphical exclusion condition for an instrument: any directed effect of $Z$ on $Y$ represented by the graph must pass through the treatment vertex $A$.
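One way to test this condition mechanically, assuming $A\neq Z$ and $A\neq Y$, is to delete the vertex $A$ and ask whether $Y$ is still reachable from $Z$: the directed paths avoiding $A$ are exactly the directed paths of the graph with $A$ removed. The sketch below uses a hypothetical toy graph and illustrative helper names.

```python
# Hypothetical toy DAG: Z -> A -> Y, with a confounder U -> A and U -> Y.
E = {("Z", "A"), ("A", "Y"), ("U", "A"), ("U", "Y")}


def reachable(edges, source, target):
    """True iff there is a directed path of length >= 1 from source to target."""
    stack, seen = [source], set()
    while stack:
        v = stack.pop()
        for tail, head in edges:
            if tail == v and head not in seen:
                if head == target:
                    return True
                seen.add(head)
                stack.append(head)
    return False


def satisfies_exclusion(edges, z, a, y):
    """Every directed path from z to y contains a, i.e. y is not
    reachable from z once all arrows touching a are deleted."""
    edges_without_a = {(t, h) for (t, h) in edges if a not in (t, h)}
    return not reachable(edges_without_a, z, y)


print(satisfies_exclusion(E, "Z", "A", "Y"))                 # True
print(satisfies_exclusion(E | {("Z", "Y")}, "Z", "A", "Y"))  # False
```

Adding the direct arrow $Z\to Y$ creates a directed path that bypasses $A$, so the second call reports a violation of exclusion.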
[/step]
[step:Apply the global Markov property in the graph with the outgoing arrows from $Z$ removed]
Let $(S_V,\mathcal S_V)$ be the product measurable space
\begin{align*}
(S_V,\mathcal S_V):=\left(\prod_{v\in V}S_v,\bigotimes_{v\in V}\mathcal S_v\right).
\end{align*}
Let $Q$ be a probability law on $(S_V,\mathcal S_V)$ that is Markov with respect to $G^{\operatorname{out}(Z)}$. Let $\pi_v:(S_V,\mathcal S_V)\to(S_v,\mathcal S_v)$ denote the coordinate projection corresponding to $v\in V$. The third hypothesis says that $\{Z\}$ and $\{Y\}$ are d-separated by the empty set in $G^{\operatorname{out}(Z)}$. Since $G^{\operatorname{out}(Z)}$ is a directed acyclic graph and $Q$ is Markov with respect to it, the [Global Markov Property for Directed Acyclic Graphs][citetheorem:9667] gives
\begin{align*}
\pi_Z \perp\!\!\!\perp \pi_Y \quad \text{under } Q.
\end{align*}
Equivalently, the coordinate projections $\pi_Z$ and $\pi_Y$ are independent under $Q$.
[/step]
[guided]We now prove the statistical independence consequence from the graphical d-separation hypothesis. The probability law $Q$ lives on the product measurable space
\begin{align*}
(S_V,\mathcal S_V):=\left(\prod_{v\in V}S_v,\bigotimes_{v\in V}\mathcal S_v\right).
\end{align*}
For each vertex $v\in V$, define the coordinate projection $\pi_v:(S_V,\mathcal S_V)\to(S_v,\mathcal S_v)$ by sending a point of the product space to its $v$-coordinate. These coordinate projections are the random variables whose joint law is $Q$.
The graph to which the Markov argument applies is not the original graph $G$, but the graph $G^{\operatorname{out}(Z)}$ obtained by deleting every arrow with tail $Z$. We have already observed that deleting arrows from a directed acyclic graph cannot create a directed cycle, so $G^{\operatorname{out}(Z)}$ is a directed acyclic graph. The law $Q$ is assumed to be Markov with respect to this graph.
The third hypothesis states that the vertex set $\{Z\}$ and the vertex set $\{Y\}$ are d-separated by the empty conditioning set in $G^{\operatorname{out}(Z)}$. The [Global Markov Property for Directed Acyclic Graphs][citetheorem:9667] says that, for a probability law Markov with respect to a directed acyclic graph, d-separation of vertex sets implies the corresponding conditional independence of the associated random variables. Here the conditioning set is empty, so conditional independence given the empty set is ordinary independence. Therefore
\begin{align*}
\pi_Z \perp\!\!\!\perp \pi_Y \quad \text{under } Q.
\end{align*}
Thus the coordinate projections $\pi_Z$ and $\pi_Y$ are independent under $Q$. This is the product-space version of the graphical independence assertion corresponding to the $Z$- and $Y$-coordinates.[/guided]
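The d-separation hypothesis itself can be checked on a concrete graph. The sketch below relies on a standard specialization to the empty conditioning set: a path is open given $\emptyset$ exactly when it contains no collider, so two distinct vertices are d-separated by $\emptyset$ if and only if their ancestor sets (each including the vertex itself) are disjoint. The toy graph and the helper names are assumptions made for illustration.

```python
# Hypothetical toy DAG: Z -> A -> Y, with a confounder U -> A and U -> Y.
E = {("Z", "A"), ("A", "Y"), ("U", "A"), ("U", "Y")}


def remove_outgoing(edges, z):
    """Return E^{out(z)}: every arrow of `edges` whose tail is not z."""
    return {(tail, head) for (tail, head) in edges if tail != z}


def ancestors(edges, v):
    """Return v together with every vertex having a directed path into v."""
    result, stack = {v}, [v]
    while stack:
        w = stack.pop()
        for tail, head in edges:
            if head == w and tail not in result:
                result.add(tail)
                stack.append(tail)
    return result


def d_separated_by_empty_set(edges, x, y):
    """Valid only for the empty conditioning set: x and y are d-separated
    iff they share no common ancestor (counting x and y themselves)."""
    return ancestors(edges, x).isdisjoint(ancestors(edges, y))


E_out = remove_outgoing(E, "Z")
print(d_separated_by_empty_set(E_out, "Z", "Y"))  # True

# If U also pointed into Z, the path Z <- U -> Y would stay open.
E_confounded = remove_outgoing(E | {("U", "Z")}, "Z")
print(d_separated_by_empty_set(E_confounded, "Z", "Y"))  # False
```

In the first case the third hypothesis holds and the Markov argument yields independence of $\pi_Z$ and $\pi_Y$; in the second case the hypothesis fails, so the argument does not apply.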
[step:Conclude the graphical instrumental-variable criterion and the distributional relevance assertion]
Combining the previous steps, the three clauses in the definition of the graphical instrumental-variable criterion are verified: the directed path from $Z$ to $A$ gives graphical possible relevance, the directed-path condition through $A$ gives graphical exclusion, and the d-separation condition in $G^{\operatorname{out}(Z)}$ gives the absence of an open noncausal path from $Z$ to $Y$.
It remains only to interpret the final hypothesis. If there exist $z,z'\in S_Z$ such that the interventional laws of $X_A$ under $\operatorname{do}(X_Z=z)$ and $\operatorname{do}(X_Z=z')$ are distinct, then changing the intervention value assigned to $Z$ changes the distribution of $X_A$. This is exactly the definition of distributional relevance of $Z$ for $A$ in the structural causal model $M$. This proves all asserted conclusions.
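As a purely illustrative instance (the linear Gaussian mechanism, the coefficient $\alpha\in\mathbb R$, and the noise variable $\varepsilon_A$ are assumptions made here, not part of the statement), suppose that $S_Z=S_A=\mathbb R$, that $Z$ is the only parent of $A$, and that the structural equation for $A$ reads $X_A=\alpha X_Z+\varepsilon_A$ with $\varepsilon_A\sim\mathcal N(0,1)$. Then, for $z,z'\in\mathbb R$,
\begin{align*}
X_A\mid\operatorname{do}(X_Z=z)\sim\mathcal N(\alpha z,1),\qquad X_A\mid\operatorname{do}(X_Z=z')\sim\mathcal N(\alpha z',1),
\end{align*}
and these two laws are distinct if and only if $\alpha(z-z')\neq 0$. In this example distributional relevance holds exactly when $\alpha\neq 0$. For $\alpha=0$ the arrow $Z\to A$ is still present, so graphical possible relevance holds while distributional relevance fails, which is why the latter is a separate hypothesis.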
[/step]