Back-Door Adjustment Formula — Statement & Proof

Back-Door Adjustment Formula (Theorem # 9695)

Theorem

Let $(\Omega,\mathcal F,\mathbb P)$ be a [probability space](/page/Probability%20Space). Let $(\mathcal A,2^{\mathcal A})$ be a countable treatment measurable space, and let $(\mathcal Y,\mathcal E_{\mathcal Y})$ and $(\mathcal L,\mathcal E_{\mathcal L})$ be standard Borel spaces. Let \begin{align*} A:(\Omega,\mathcal F)\to(\mathcal A,2^{\mathcal A}) \end{align*} be the observed treatment [random variable](/page/Random%20Variable). Let \begin{align*} Y:(\Omega,\mathcal F)\to(\mathcal Y,\mathcal E_{\mathcal Y}) \end{align*} be the observed outcome random variable. Let \begin{align*} L:(\Omega,\mathcal F)\to(\mathcal L,\mathcal E_{\mathcal L}) \end{align*} be the observed covariate random variable. Fix $a\in\mathcal A$, and let \begin{align*} Y_a:(\Omega,\mathcal F)\to(\mathcal Y,\mathcal E_{\mathcal Y}) \end{align*} be the potential outcome under intervention $A=a$. Assume that the variables are generated by a structural causal model whose causal directed acyclic graph is $G$, and that $L$ satisfies the back-door criterion relative to $(A,Y)$ in $G$. Assume, under the usual structural-causal intervention semantics, that this back-door condition implies conditional exchangeability: \begin{align*} Y_a \perp\!\!\!\perp A \mid L. \end{align*} Assume consistency: \begin{align*} Y=Y_a \quad \text{on the event } \{A=a\}. \end{align*} Assume positivity: for a version of the regular [conditional probability](/page/Conditional%20Probability) \begin{align*} \pi_a:\mathcal L\to[0,1],\qquad \ell\mapsto \mathbb P(A=a\mid L=\ell), \end{align*} one has $\pi_a(\ell)>0$ for $\mathbb P_L$-almost every $\ell\in\mathcal L$, where $\mathbb P_L:=\mathbb P\circ L^{-1}$. 
Then, for every $B\in\mathcal E_{\mathcal Y}$ and every regular conditional version \begin{align*} s_a:\mathcal L\to[0,1],\qquad \ell\mapsto \mathbb P(Y\in B\mid A=a,L=\ell), \end{align*} defined through a regular conditional law of $Y$ given $(A,L)$, one has \begin{align*} \mathbb P(Y_a\in B) = \int_{\mathcal L}s_a(\ell)\,d\mathbb P_L(\ell). \end{align*} Equivalently, \begin{align*} \mathbb P(Y_a\in B) = \int_{\mathcal L}\mathbb P(Y\in B\mid A=a,L=\ell)\,d\mathbb P_L(\ell). \end{align*}
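When $\mathcal L$, $\mathcal A$ and $\mathcal Y$ are all finite, the integral in the statement reduces to a finite sum over covariate values. The following is a minimal sketch of that special case, not part of the theorem itself: the function name `backdoor_adjust` and the toy model (binary $L$, $A$, $Y$ with hand-picked probabilities) are assumptions made purely for illustration. Exact rational arithmetic is used so the adjusted value can be compared with the unadjusted conditional probability $\mathbb P(Y\in B\mid A=a)$ without rounding.

```python
from fractions import Fraction as F
from itertools import product


def backdoor_adjust(joint, a, B):
    """Return sum over l of P(Y in B | A=a, L=l) * P(L=l) for a finite joint pmf.

    `joint` maps (l, a, y) triples to probabilities (hypothetical layout).
    """
    p_l = {}    # marginal law of L
    p_al = {}   # P(A=a, L=l)
    p_yal = {}  # P(Y in B, A=a, L=l)
    for (l, a_obs, y), p in joint.items():
        p_l[l] = p_l.get(l, 0) + p
        if a_obs == a:
            p_al[l] = p_al.get(l, 0) + p
            if y in B:
                p_yal[l] = p_yal.get(l, 0) + p
    total = 0
    for l, pl in p_l.items():
        if pl == 0:
            continue  # values outside the support of L contribute nothing
        if p_al.get(l, 0) == 0:
            raise ValueError(f"positivity fails at L={l!r}")
        total += p_yal.get(l, 0) / p_al[l] * pl
    return total


# Toy model (assumed): L confounds A and Y, and L is the only back-door path.
p_l1 = F(2, 5)                                   # P(L=1)
def p_a1(l): return F(1, 5) if l == 0 else F(7, 10)            # P(A=1 | L=l)
def p_y1(a, l): return F(1, 10) + F(3, 10) * a + F(2, 5) * l   # P(Y=1 | A=a, L=l)

joint = {}
for l, a, y in product((0, 1), repeat=3):
    pl = p_l1 if l else 1 - p_l1
    pa = p_a1(l) if a else 1 - p_a1(l)
    py = p_y1(a, l) if y else 1 - p_y1(a, l)
    joint[(l, a, y)] = pl * pa * py

adjusted = backdoor_adjust(joint, a=1, B={1})
p_a_is_1 = sum(p for (l, a, y), p in joint.items() if a == 1)
naive = sum(p for (l, a, y), p in joint.items() if a == 1 and y == 1) / p_a_is_1

print(adjusted, naive)  # 14/25 17/25
```

In this toy model the adjusted value 14/25 differs from the unadjusted conditional probability 17/25, because treated units are more likely to have $L=1$, which also raises the outcome probability.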

Proof

[proofplan] The proof is the standard adjustment argument. First disintegrate the law of the potential outcome $Y_a$ over the covariate $L$. The back-door criterion, through the assumed structural-causal semantics, gives conditional exchangeability, so conditioning additionally on $A=a$ does not change the conditional law of $Y_a$ given $L$. Positivity ensures that the conditional law given $(A=a,L=\ell)$ is determined for $\mathbb P_L$-almost every $\ell$, and consistency then replaces $Y_a$ by the observed outcome $Y$ among units with $A=a$. Integrating the resulting [conditional probability](/page/Conditional%20Probability) over the marginal law of $L$ gives the formula. [/proofplan] [step:Disintegrate the potential-outcome law over the covariate distribution] Fix $B\in\mathcal E_{\mathcal Y}$. Define the event \begin{align*} E_B:=\{\omega\in\Omega:Y_a(\omega)\in B\}. \end{align*} Since $Y_a$ is $(\mathcal F,\mathcal E_{\mathcal Y})$-measurable and $B\in\mathcal E_{\mathcal Y}$, the event $E_B$ belongs to $\mathcal F$. Because the outcome space $(\mathcal Y,\mathcal E_{\mathcal Y})$ is standard Borel, a regular conditional law of $Y_a$ given $L$ exists; choose one and denote the corresponding conditional probability by \begin{align*} q_a:\mathcal L\to[0,1],\qquad \ell\mapsto \mathbb P(Y_a\in B\mid L=\ell). \end{align*} By the defining property of regular conditional probability applied to the bounded measurable indicator $\mathbb 1_{E_B}:\Omega\to\{0,1\}$, one has \begin{align*} \mathbb P(Y_a\in B) = \int_{\mathcal L}q_a(\ell)\,d\mathbb P_L(\ell). \end{align*} [guided] We first separate the target probability into covariate-specific pieces. Fix a measurable outcome set $B\in\mathcal E_{\mathcal Y}$, and define \begin{align*} E_B:=\{\omega\in\Omega:Y_a(\omega)\in B\}. \end{align*} This is an event because $Y_a:(\Omega,\mathcal F)\to(\mathcal Y,\mathcal E_{\mathcal Y})$ is measurable and $B$ is measurable. 
The standard Borel hypothesis on the outcome space $\mathcal Y$ ensures that a regular conditional law of $Y_a$ given $L$ exists. Choose a version of this conditional law, and for the fixed set $B$ define \begin{align*} q_a:\mathcal L\to[0,1],\qquad \ell\mapsto \mathbb P(Y_a\in B\mid L=\ell). \end{align*} The function $q_a$ is $\mathcal E_{\mathcal L}$-measurable by the definition of a probability kernel. Applying the defining identity for conditional probability to the indicator function $\mathbb 1_{E_B}:\Omega\to\{0,1\}$ gives \begin{align*} \mathbb E[\mathbb 1_{E_B}] = \int_{\mathcal L}q_a(\ell)\,d\mathbb P_L(\ell). \end{align*} Since $\mathbb E[\mathbb 1_{E_B}]=\mathbb P(E_B)=\mathbb P(Y_a\in B)$, this becomes \begin{align*} \mathbb P(Y_a\in B) = \int_{\mathcal L}\mathbb P(Y_a\in B\mid L=\ell)\,d\mathbb P_L(\ell). \end{align*} This is the [law of total probability](/theorems/1113) written in kernel form, with the measure in the integral explicitly equal to the marginal law $\mathbb P_L$ of $L$. [/guided] [/step] [step:Use back-door exchangeability to condition on the treatment value] By the assumed structural-causal interpretation of the back-door criterion, we have conditional exchangeability: \begin{align*} Y_a \perp\!\!\!\perp A \mid L. \end{align*} Let \begin{align*} r_a:\mathcal L\to[0,1],\qquad \ell\mapsto \mathbb P(Y_a\in B\mid A=a,L=\ell) \end{align*} be obtained from a regular conditional law of $Y_a$ given $(A,L)$ by evaluating the treatment coordinate at $a$. Conditional exchangeability means that the conditional law of $Y_a$ given $(A,L)$ equals the conditional law of $Y_a$ given $L$ for $\mathbb P_{(A,L)}$-almost every pair $(a',\ell)\in\mathcal A\times\mathcal L$, where $\mathbb P_{(A,L)}:=\mathbb P\circ(A,L)^{-1}$. Therefore \begin{align*} q_a(\ell)=r_a(\ell) \end{align*} for $\mathbb P_L$-almost every $\ell$ with $\pi_a(\ell)>0$. 
Since positivity gives $\pi_a(\ell)>0$ for $\mathbb P_L$-almost every $\ell$, the equality $q_a=r_a$ holds $\mathbb P_L$-almost everywhere. Substituting into the disintegration formula gives \begin{align*} \mathbb P(Y_a\in B) = \int_{\mathcal L}r_a(\ell)\,d\mathbb P_L(\ell). \end{align*} [guided] We now justify precisely why exchangeability permits us to condition on the observed treatment value $A=a$. The conditional exchangeability hypothesis is \begin{align*} Y_a \perp\!\!\!\perp A \mid L. \end{align*} In kernel language, this says that, after conditioning on $L$, the additional information carried by $A$ does not change the conditional law of $Y_a$. Thus the conditional law of $Y_a$ given $(A,L)$ agrees with the conditional law of $Y_a$ given $L$ for $\mathbb P_{(A,L)}$-almost every pair $(a',\ell)\in\mathcal A\times\mathcal L$, where $\mathbb P_{(A,L)}:=\mathbb P\circ(A,L)^{-1}$. For the fixed measurable set $B\in\mathcal E_{\mathcal Y}$, define \begin{align*} r_a:\mathcal L\to[0,1],\qquad \ell\mapsto \mathbb P(Y_a\in B\mid A=a,L=\ell). \end{align*} The preceding almost-everywhere identity holds with respect to the joint law of $(A,L)$; restricted to the slice $A=a$, it does not by itself give an identity for $\mathbb P_L$-almost every $\ell$. Positivity is exactly the hypothesis that transfers the identity to the covariate law. Recall that \begin{align*} \pi_a:\mathcal L\to[0,1],\qquad \ell\mapsto \mathbb P(A=a\mid L=\ell) \end{align*} satisfies $\pi_a(\ell)>0$ for $\mathbb P_L$-almost every $\ell$. If $N\in\mathcal E_{\mathcal L}$ is a set whose slice $\{a\}\times N$ is $\mathbb P_{(A,L)}$-null, then \begin{align*} 0=\mathbb P(A=a,\,L\in N)=\int_N\pi_a(\ell)\,d\mathbb P_L(\ell), \end{align*} and since the integrand is strictly positive $\mathbb P_L$-almost everywhere, the integral can vanish only if $\mathbb P_L(N)=0$. Hence \begin{align*} q_a(\ell)=r_a(\ell) \end{align*} for $\mathbb P_L$-almost every $\ell\in\mathcal L$. Substituting this $\mathbb P_L$-almost-everywhere equality into the previous integral is valid because both $q_a$ and $r_a$ are [measurable functions](/page/Measurable%20Functions) with values in $[0,1]$. 
Therefore \begin{align*} \mathbb P(Y_a\in B) = \int_{\mathcal L}r_a(\ell)\,d\mathbb P_L(\ell). \end{align*} [/guided] [/step] [step:Apply consistency to replace the potential outcome by the observed outcome] Define \begin{align*} s_a:\mathcal L\to[0,1],\qquad \ell\mapsto \mathbb P(Y\in B\mid A=a,L=\ell) \end{align*} from a regular conditional law of $Y$ given $(A,L)$. We claim that \begin{align*} r_a(\ell)=s_a(\ell) \end{align*} for $\mathbb P_L$-almost every $\ell$. Consistency gives $Y=Y_a$ on $\{A=a\}$. Hence the events \begin{align*} \{Y_a\in B\}\cap\{A=a\} \end{align*} and \begin{align*} \{Y\in B\}\cap\{A=a\} \end{align*} are equal. It follows from the defining uniqueness of regular conditional probabilities that the two conditional probabilities given $(A,L)$ are equal for $\mathbb P_{(A,L)}$-almost every $(a',\ell)$ on the slice $a'=a$. Positivity transfers this equality to \begin{align*} r_a(\ell)=s_a(\ell) \end{align*} for $\mathbb P_L$-almost every $\ell$. [guided] The consistency hypothesis is the bridge from the counterfactual outcome $Y_a$ to the observed outcome $Y$. It states that \begin{align*} Y=Y_a \quad \text{on } \{A=a\}. \end{align*} Therefore, on the event where the observed treatment actually equals $a$, the event that the potential outcome lies in $B$ is the same event as the observed outcome lying in $B$. In set notation, \begin{align*} \{Y_a\in B\}\cap\{A=a\} = \{Y\in B\}\cap\{A=a\}. \end{align*} Define \begin{align*} s_a:\mathcal L\to[0,1],\qquad \ell\mapsto \mathbb P(Y\in B\mid A=a,L=\ell). \end{align*} Regular conditional probabilities are only determined up to the law of the conditioning variable. Hence the justified conclusion is not pointwise equality for every $\ell$, but equality for the relevant almost-everywhere class. Since the two events above are equal on $\{A=a\}$, their conditional probabilities given $(A,L)$ agree for $\mathbb P_{(A,L)}$-almost every pair on the slice $A=a$. 
By the same positivity transfer used in the exchangeability step, because $\pi_a(\ell)>0$ for $\mathbb P_L$-almost every $\ell$, this gives \begin{align*} r_a(\ell)=s_a(\ell) \end{align*} for $\mathbb P_L$-almost every $\ell\in\mathcal L$. [/guided] [/step] [step:Integrate the observed conditional law over the covariate law] Substituting the equality $r_a=s_a$ $\mathbb P_L$-almost everywhere into the preceding integral gives \begin{align*} \mathbb P(Y_a\in B) = \int_{\mathcal L}s_a(\ell)\,d\mathbb P_L(\ell). \end{align*} By the definition of $s_a$, this is precisely \begin{align*} \mathbb P(Y_a\in B) = \int_{\mathcal L}\mathbb P(Y\in B\mid A=a,L=\ell)\,d\mathbb P_L(\ell). \end{align*} This proves the adjustment formula for the fixed treatment value $a$ and measurable outcome set $B$. [guided] The previous step proved that the counterfactual conditional probability $r_a$ and the observed conditional probability $s_a$ agree outside a $\mathbb P_L$-null set. Since both functions take values in $[0,1]$, changing the integrand on a $\mathbb P_L$-null set does not change the [Lebesgue integral](/page/Lebesgue%20Integral) with respect to $\mathbb P_L$. Thus \begin{align*} \mathbb P(Y_a\in B) = \int_{\mathcal L}r_a(\ell)\,d\mathbb P_L(\ell) = \int_{\mathcal L}s_a(\ell)\,d\mathbb P_L(\ell). \end{align*} By the definition of $s_a$, the last integral is \begin{align*} \int_{\mathcal L}\mathbb P(Y\in B\mid A=a,L=\ell)\,d\mathbb P_L(\ell). \end{align*} Therefore \begin{align*} \mathbb P(Y_a\in B) = \int_{\mathcal L}\mathbb P(Y\in B\mid A=a,L=\ell)\,d\mathbb P_L(\ell), \end{align*} which is the back-door adjustment formula for the fixed treatment value $a$ and the measurable outcome event $B$. [/guided] [/step]
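The three hypotheses used in the proof can be checked numerically on a simulated population in which both potential outcomes are generated explicitly. The sketch below is only an illustration under assumed ingredients: the function `simulate`, the binary toy model and all probabilities in it are hypothetical. Treatment is drawn from $L$ alone, so $Y_a\perp\!\!\!\perp A\mid L$ holds by construction; the observed outcome is set to the potential outcome of the received treatment, which is consistency; and both treatment probabilities lie strictly between 0 and 1, which is positivity.

```python
import random


def simulate(n=200_000, seed=0):
    """Draw (l, a, y, y1) rows from an assumed binary toy model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = int(rng.random() < 0.4)                 # covariate L
        u = rng.random()                            # outcome noise shared by Y_0 and Y_1
        y0 = int(u < 0.1 + 0.4 * l)                 # potential outcome under A=0
        y1 = int(u < 0.1 + 0.3 + 0.4 * l)           # potential outcome under A=1
        a = int(rng.random() < (0.7 if l else 0.2))  # treatment depends on L only
        y = y1 if a == 1 else y0                    # consistency: Y = Y_A
        rows.append((l, a, y, y1))
    return rows


def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)


rows = simulate()

# P(Y_1 = 1): visible only because the simulation records the potential outcome.
truth = mean(y1 for _, _, _, y1 in rows)

# Unadjusted estimate of P(Y = 1 | A = 1).
naive = mean(y for _, a, y, _ in rows if a == 1)

# Adjusted estimate: sum over l of P(Y = 1 | A = 1, L = l) * P(L = l).
adjusted = sum(
    mean(y for l, a, y, _ in rows if a == 1 and l == lv)
    * mean(int(l == lv) for l, _, _, _ in rows)
    for lv in (0, 1)
)

print(f"truth={truth:.3f} adjusted={adjusted:.3f} naive={naive:.3f}")
```

With these assumed parameters the adjusted estimate should track the simulated value of $\mathbb P(Y_1=1)$ up to sampling error, while the unadjusted conditional frequency is noticeably larger, since treated units over-represent $L=1$.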
