[proofplan]
The proof is the standard adjustment argument. First, disintegrate the law of the potential outcome $Y_a$ over the covariate $L$. The back-door criterion, through the assumed structural-causal semantics, gives conditional exchangeability, so conditioning additionally on $A=a$ does not change the conditional law of $Y_a$ given $L$. Positivity ensures that the conditional law given $(A=a,L=\ell)$ is determined for $\mathbb P_L$-almost every $\ell$, and consistency then replaces $Y_a$ by the observed outcome $Y$ among units with $A=a$. Integrating the resulting [conditional probability](/page/Conditional%20Probability) over the marginal law of $L$ gives the adjustment formula.
[/proofplan]
[step:Disintegrate the potential-outcome law over the covariate distribution]
Fix $B\in\mathcal E_{\mathcal Y}$. Define the event
\begin{align*}
E_B:=\{\omega\in\Omega:Y_a(\omega)\in B\}.
\end{align*}
Since $Y_a$ is $(\mathcal F,\mathcal E_{\mathcal Y})$-measurable and $B\in\mathcal E_{\mathcal Y}$, the event $E_B$ belongs to $\mathcal F$.
Because $(\mathcal L,\mathcal E_{\mathcal L})$ is standard Borel, choose a regular conditional probability kernel for $Y_a$ given $L$ and denote the corresponding conditional probability by
\begin{align*}
q_a:\mathcal L\to[0,1],\qquad
\ell\mapsto \mathbb P(Y_a\in B\mid L=\ell).
\end{align*}
By the defining property of regular conditional probability applied to the bounded measurable indicator $\mathbb 1_{E_B}:\Omega\to\{0,1\}$, one has
\begin{align*}
\mathbb P(Y_a\in B)
=
\int_{\mathcal L}q_a(\ell)\,d\mathbb P_L(\ell).
\end{align*}
[guided]
We first separate the target probability into covariate-specific pieces. Fix a measurable outcome set $B\in\mathcal E_{\mathcal Y}$, and define
\begin{align*}
E_B:=\{\omega\in\Omega:Y_a(\omega)\in B\}.
\end{align*}
This is an event because $Y_a:(\Omega,\mathcal F)\to(\mathcal Y,\mathcal E_{\mathcal Y})$ is measurable and $B$ is measurable.
The role of the standard Borel hypothesis on $\mathcal L$ is to ensure that regular conditional probabilities may be used. Choose a version of the conditional law of $Y_a$ given $L$, and for this fixed set $B$ define
\begin{align*}
q_a:\mathcal L\to[0,1],\qquad
\ell\mapsto \mathbb P(Y_a\in B\mid L=\ell).
\end{align*}
The function $q_a$ is $\mathcal E_{\mathcal L}$-measurable by the definition of a probability kernel. Applying the defining identity for conditional probability to the indicator function $\mathbb 1_{E_B}:\Omega\to\{0,1\}$ gives
\begin{align*}
\mathbb E[\mathbb 1_{E_B}]
=
\int_{\mathcal L}q_a(\ell)\,d\mathbb P_L(\ell).
\end{align*}
Since $\mathbb E[\mathbb 1_{E_B}]=\mathbb P(E_B)=\mathbb P(Y_a\in B)$, this becomes
\begin{align*}
\mathbb P(Y_a\in B)
=
\int_{\mathcal L}\mathbb P(Y_a\in B\mid L=\ell)\,d\mathbb P_L(\ell).
\end{align*}
This is the [law of total probability](/theorems/1113) written in kernel form, with the measure in the integral explicitly equal to the marginal law $\mathbb P_L$ of $L$.
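As an illustration only, suppose that $L$ takes finitely many values $\ell_1,\dots,\ell_k$ with $\mathbb P(L=\ell_j)>0$ for each $j$. This finiteness is an assumption made for the example and is not among the hypotheses. Then $\mathbb P_L$ is a weighted sum of point masses, $q_a(\ell_j)$ is an elementary conditional probability, and the integral reduces to a finite sum:
\begin{align*}
\int_{\mathcal L}q_a(\ell)\,d\mathbb P_L(\ell)
=
\sum_{j=1}^{k}\frac{\mathbb P(Y_a\in B,\,L=\ell_j)}{\mathbb P(L=\ell_j)}\,\mathbb P(L=\ell_j)
=
\sum_{j=1}^{k}\mathbb P(Y_a\in B,\,L=\ell_j)
=
\mathbb P(Y_a\in B).
\end{align*}
The last equality holds because the events $\{L=\ell_j\}$ partition $\Omega$ up to a null set.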
[/guided]
[/step]
[step:Use back-door exchangeability to condition on the treatment value]
By the assumed structural-causal interpretation of the back-door criterion, we have conditional exchangeability:
\begin{align*}
Y_a \perp\!\!\!\perp A \mid L.
\end{align*}
Let
\begin{align*}
r_a:\mathcal L\to[0,1],\qquad
\ell\mapsto \mathbb P(Y_a\in B\mid A=a,L=\ell)
\end{align*}
be obtained from a regular conditional law of $Y_a$ given $(A,L)$ by evaluating the treatment coordinate at $a$. Conditional exchangeability means that the conditional law of $Y_a$ given $(A,L)$ equals the conditional law of $Y_a$ given $L$ for $\mathbb P_{(A,L)}$-almost every pair $(a',\ell)\in\mathcal A\times\mathcal L$, where $\mathbb P_{(A,L)}:=\mathbb P\circ(A,L)^{-1}$. Therefore
\begin{align*}
q_a(\ell)=r_a(\ell)
\end{align*}
for $\mathbb P_L$-almost every $\ell$ with $\pi_a(\ell)>0$, where $\pi_a(\ell):=\mathbb P(A=a\mid L=\ell)$. Since positivity gives $\pi_a(\ell)>0$ for $\mathbb P_L$-almost every $\ell$, the equality $q_a=r_a$ holds $\mathbb P_L$-almost everywhere. Substituting into the disintegration formula gives
\begin{align*}
\mathbb P(Y_a\in B)
=
\int_{\mathcal L}r_a(\ell)\,d\mathbb P_L(\ell).
\end{align*}
[guided]
We now justify precisely why exchangeability permits us to condition on the observed treatment value $A=a$. The conditional exchangeability hypothesis is
\begin{align*}
Y_a \perp\!\!\!\perp A \mid L.
\end{align*}
In kernel language, this says that, after conditioning on $L$, the additional information carried by $A$ does not change the conditional law of $Y_a$. Thus the conditional law of $Y_a$ given $(A,L)$ agrees with the conditional law of $Y_a$ given $L$ for $\mathbb P_{(A,L)}$-almost every pair $(a',\ell)\in\mathcal A\times\mathcal L$, where $\mathbb P_{(A,L)}:=\mathbb P\circ(A,L)^{-1}$.
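One way to write this statement in integrated form is sketched below; the test set $D$ is introduced only for this illustration. For every measurable set $D\subseteq\mathcal A\times\mathcal L$, the tower property followed by conditional exchangeability gives
\begin{align*}
\mathbb P\bigl(Y_a\in B,\,(A,L)\in D\bigr)
=
\mathbb E\bigl[\mathbb P(Y_a\in B\mid A,L)\,\mathbb 1_D(A,L)\bigr]
=
\mathbb E\bigl[q_a(L)\,\mathbb 1_D(A,L)\bigr]
=
\int_D q_a(\ell)\,d\mathbb P_{(A,L)}(a',\ell).
\end{align*}
In other words, the function $(a',\ell)\mapsto q_a(\ell)$, which ignores the treatment coordinate, is itself a version of the conditional probability of $\{Y_a\in B\}$ given $(A,L)$.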
For the fixed measurable set $B\in\mathcal E_{\mathcal Y}$, define
\begin{align*}
r_a:\mathcal L\to[0,1],\qquad
\ell\mapsto \mathbb P(Y_a\in B\mid A=a,L=\ell).
\end{align*}
The preceding almost-everywhere identity holds with respect to the joint law $\mathbb P_{(A,L)}$; by itself it does not yield an identity for $\mathbb P_L$-almost every $\ell$ on the slice $A=a$. Positivity is exactly the hypothesis that transfers the identity to the covariate law: since
\begin{align*}
\pi_a:\mathcal L\to[0,1],\qquad
\ell\mapsto \mathbb P(A=a\mid L=\ell)
\end{align*}
satisfies $\pi_a(\ell)>0$ for $\mathbb P_L$-almost every $\ell$, the slice at $A=a$ of any $\mathbb P_{(A,L)}$-null exceptional set is $\mathbb P_L$-null. Hence
\begin{align*}
q_a(\ell)=r_a(\ell)
\end{align*}
for $\mathbb P_L$-almost every $\ell\in\mathcal L$.
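The positivity transfer can be written out as follows. This is a sketch under the additional assumption, not stated above, that the singleton $\{a\}$ is a measurable subset of $\mathcal A$; the symbols $N$ and $N_a$ are introduced only for this illustration. Let $N\subseteq\mathcal A\times\mathcal L$ be a $\mathbb P_{(A,L)}$-null set outside of which the two conditional probabilities of $\{Y_a\in B\}$ agree, and let $N_a:=\{\ell\in\mathcal L:(a,\ell)\in N\}$ be its slice at $a$. Since $\{a\}\times N_a\subseteq N$,
\begin{align*}
\int_{N_a}\pi_a(\ell)\,d\mathbb P_L(\ell)
=
\mathbb P(A=a,\,L\in N_a)
=
\mathbb P_{(A,L)}(\{a\}\times N_a)
\le
\mathbb P_{(A,L)}(N)
=
0.
\end{align*}
A nonnegative function whose integral over $N_a$ is zero vanishes $\mathbb P_L$-almost everywhere on $N_a$, so $\pi_a=0$ $\mathbb P_L$-almost everywhere on $N_a$. Because positivity gives $\pi_a>0$ $\mathbb P_L$-almost everywhere, this forces $\mathbb P_L(N_a)=0$, and $q_a=r_a$ holds outside $N_a$.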
Substituting this $\mathbb P_L$-almost-everywhere equality into the previous integral is valid because both $q_a$ and $r_a$ are [measurable functions](/page/Measurable%20Functions) with values in $[0,1]$. Therefore
\begin{align*}
\mathbb P(Y_a\in B)
=
\int_{\mathcal L}r_a(\ell)\,d\mathbb P_L(\ell).
\end{align*}
[/guided]
[/step]
[step:Apply consistency to replace the potential outcome by the observed outcome]
Define
\begin{align*}
s_a:\mathcal L\to[0,1],\qquad
\ell\mapsto \mathbb P(Y\in B\mid A=a,L=\ell)
\end{align*}
from a regular conditional law of $Y$ given $(A,L)$. We claim that
\begin{align*}
r_a(\ell)=s_a(\ell)
\end{align*}
for $\mathbb P_L$-almost every $\ell$.
Consistency gives $Y=Y_a$ on $\{A=a\}$. Hence the events
\begin{align*}
\{Y_a\in B\}\cap\{A=a\}
\end{align*}
and
\begin{align*}
\{Y\in B\}\cap\{A=a\}
\end{align*}
are equal. By the almost-sure uniqueness of conditional probabilities, the conditional probabilities of these two events given $(A,L)$ agree for $\mathbb P_{(A,L)}$-almost every $(a',\ell)$ on the slice $a'=a$. Positivity transfers this equality to
\begin{align*}
r_a(\ell)=s_a(\ell)
\end{align*}
for $\mathbb P_L$-almost every $\ell$.
[guided]
The consistency hypothesis is the bridge from the counterfactual outcome $Y_a$ to the observed outcome $Y$. It states that
\begin{align*}
Y=Y_a \quad \text{on } \{A=a\}.
\end{align*}
Therefore, on the event where the observed treatment actually equals $a$, the event that the potential outcome lies in $B$ coincides with the event that the observed outcome lies in $B$. In set notation,
\begin{align*}
\{Y_a\in B\}\cap\{A=a\}
=
\{Y\in B\}\cap\{A=a\}.
\end{align*}
Define
\begin{align*}
s_a:\mathcal L\to[0,1],\qquad
\ell\mapsto \mathbb P(Y\in B\mid A=a,L=\ell).
\end{align*}
Regular conditional probabilities are determined only up to null sets of the law of the conditioning variable. Hence the justified conclusion is not pointwise equality for every $\ell$, but equality $\mathbb P_L$-almost everywhere. Since the two events above are equal on $\{A=a\}$, their conditional probabilities given $(A,L)$ agree for $\mathbb P_{(A,L)}$-almost every pair on the slice $A=a$. By the same positivity transfer used in the exchangeability step, because $\pi_a(\ell)>0$ for $\mathbb P_L$-almost every $\ell$, this gives
\begin{align*}
r_a(\ell)=s_a(\ell)
\end{align*}
for $\mathbb P_L$-almost every $\ell\in\mathcal L$.
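This conclusion can also be checked directly on test sets. The following sketch assumes, in addition to the stated hypotheses, that the singleton $\{a\}$ is measurable in $\mathcal A$; the set $C$ is introduced only for illustration. For every measurable $C\subseteq\mathcal L$,
\begin{align*}
\int_C r_a(\ell)\,\pi_a(\ell)\,d\mathbb P_L(\ell)
&=
\mathbb P(Y_a\in B,\,A=a,\,L\in C)\\
&=
\mathbb P(Y\in B,\,A=a,\,L\in C)\\
&=
\int_C s_a(\ell)\,\pi_a(\ell)\,d\mathbb P_L(\ell).
\end{align*}
The middle equality is the set identity above intersected with $\{L\in C\}$. Since $C$ is arbitrary, $(r_a-s_a)\,\pi_a=0$ holds $\mathbb P_L$-almost everywhere, and because $\pi_a>0$ $\mathbb P_L$-almost everywhere, it follows that $r_a=s_a$ $\mathbb P_L$-almost everywhere.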
[/guided]
[/step]
[step:Integrate the observed conditional law over the covariate law]
Substituting the equality $r_a=s_a$ $\mathbb P_L$-almost everywhere into the preceding integral gives
\begin{align*}
\mathbb P(Y_a\in B)
=
\int_{\mathcal L}s_a(\ell)\,d\mathbb P_L(\ell).
\end{align*}
By the definition of $s_a$, this is precisely
\begin{align*}
\mathbb P(Y_a\in B)
=
\int_{\mathcal L}\mathbb P(Y\in B\mid A=a,L=\ell)\,d\mathbb P_L(\ell).
\end{align*}
This proves the adjustment formula for the fixed treatment value $a$ and measurable outcome set $B$.
[guided]
The previous step proved that the counterfactual conditional probability $r_a$ and the observed conditional probability $s_a$ agree outside a $\mathbb P_L$-null set. Since both functions are measurable with values in $[0,1]$, they are $\mathbb P_L$-integrable, and changing the integrand on a $\mathbb P_L$-null set does not change the [Lebesgue integral](/page/Lebesgue%20Integral) with respect to $\mathbb P_L$. Thus
\begin{align*}
\mathbb P(Y_a\in B)
=
\int_{\mathcal L}r_a(\ell)\,d\mathbb P_L(\ell)
=
\int_{\mathcal L}s_a(\ell)\,d\mathbb P_L(\ell).
\end{align*}
By the definition of $s_a$, the last integral is
\begin{align*}
\int_{\mathcal L}\mathbb P(Y\in B\mid A=a,L=\ell)\,d\mathbb P_L(\ell).
\end{align*}
Therefore
\begin{align*}
\mathbb P(Y_a\in B)
=
\int_{\mathcal L}\mathbb P(Y\in B\mid A=a,L=\ell)\,d\mathbb P_L(\ell),
\end{align*}
which is the back-door adjustment formula for the fixed treatment value $a$ and the measurable outcome event $B$.
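For a concrete reading of the formula, consider a hypothetical example whose numbers are invented purely for illustration. Suppose $L$, $A$ and $Y$ are binary, take $a=1$ and $B=\{1\}$, and assume that the hypotheses above hold with $\mathbb P(L=1)=0.4$, $\pi_1(0)=0.25$, $\pi_1(1)=0.75$, $s_1(0)=0.2$ and $s_1(1)=0.7$. The integral over $\mathbb P_L$ is then a two-term sum:
\begin{align*}
\mathbb P(Y_1=1)
=
s_1(0)\,\mathbb P(L=0)+s_1(1)\,\mathbb P(L=1)
=
0.2\cdot 0.6+0.7\cdot 0.4
=
0.40.
\end{align*}
By contrast, the unadjusted conditional probability weights each stratum by its share among the treated units:
\begin{align*}
\mathbb P(Y=1\mid A=1)
=
\frac{0.2\cdot 0.25\cdot 0.6+0.7\cdot 0.75\cdot 0.4}{0.25\cdot 0.6+0.75\cdot 0.4}
=
\frac{0.24}{0.45}
\approx 0.533.
\end{align*}
The two quantities differ because, in this example, treated units over-represent the stratum $L=1$, whereas the adjustment formula averages over the marginal law $\mathbb P_L$.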
[/guided]
[/step]