[proofplan]
Fix a treatment value $a\in\{0,1\}$. The proof first writes the mean potential outcome as the integral of its conditional mean given the covariates against the marginal covariate law. Conditional exchangeability identifies this conditional mean with the conditional mean among units with $A=a$, and consistency then replaces the potential outcome by the observed outcome on that treatment arm. Positivity ensures that these equalities hold on a set of covariate values with full measure under the marginal covariate law, so integrating gives the adjustment formula; subtracting the formulas for $a=1$ and $a=0$ gives the ATE identity.
[/proofplan]
[step:Represent the mean potential outcome as an integral over the covariate law]
Fix $a\in\{0,1\}$. Define the marginal covariate law
\begin{align*}
\mu_Z:=\mathbb P\circ Z^{-1}
\end{align*}
on $(\mathcal Z,\mathcal E_{\mathcal Z})$. Since $Y(a)$ is integrable, the [conditional expectation](/page/Conditional%20Expectation) $\mathbb E[Y(a)\mid Z]$ exists and is integrable. For the version $m_a:\mathcal Z\to\mathbb R$ fixed in the statement, we have
\begin{align*}
\mathbb E[Y(a)\mid Z]=m_a\circ Z
\end{align*}
$\mathbb P$-a.s. The tower property for conditional expectation gives
\begin{align*}
\mathbb E[Y(a)]=\mathbb E[\mathbb E[Y(a)\mid Z]].
\end{align*}
Substituting the version $m_a\circ Z$ and using the definition of the pushforward measure $\mu_Z$ gives
\begin{align*}
\mathbb E[Y(a)]=\int_{\mathcal Z}m_a(z)\,d\mu_Z(z).
\end{align*}
[guided]
We fix one treatment level $a\in\{0,1\}$ because the same argument will be applied separately to $a=0$ and $a=1$. The outer distribution in the adjustment formula is the observed marginal law of the covariates, so we name it:
\begin{align*}
\mu_Z:=\mathbb P\circ Z^{-1}.
\end{align*}
This is a probability measure on $(\mathcal Z,\mathcal E_{\mathcal Z})$.
Let $\sigma(Z):=\{Z^{-1}(E):E\in\mathcal E_{\mathcal Z}\}$ denote the sub-$\sigma$-algebra of $\mathcal F$ generated by $Z$. Because $Y(a)$ is integrable, the conditional expectation $\mathbb E[Y(a)\mid Z]$ exists as an integrable $\sigma(Z)$-measurable [random variable](/page/Random%20Variable). By the version choice in the statement, there is an $\mathcal E_{\mathcal Z}$-measurable map
\begin{align*}
m_a:\mathcal Z\to\mathbb R
\end{align*}
such that
\begin{align*}
\mathbb E[Y(a)\mid Z]=m_a\circ Z
\end{align*}
$\mathbb P$-a.s. The [tower property of conditional expectation](/theorems/1150) applies because $Y(a)$ is integrable with respect to $\mathbb P$, and it yields
\begin{align*}
\mathbb E[Y(a)]=\mathbb E[\mathbb E[Y(a)\mid Z]].
\end{align*}
Replacing $\mathbb E[Y(a)\mid Z]$ by the version $m_a\circ Z$ gives
\begin{align*}
\mathbb E[Y(a)]=\mathbb E[m_a(Z)].
\end{align*}
Finally, the definition of the pushforward measure says that expectation of a [measurable function](/page/Measurable%20Function) of $Z$ is integration against $\mu_Z$. Therefore
\begin{align*}
\mathbb E[m_a(Z)]=\int_{\mathcal Z}m_a(z)\,d\mu_Z(z).
\end{align*}
Combining the last two displays gives
\begin{align*}
\mathbb E[Y(a)]=\int_{\mathcal Z}m_a(z)\,d\mu_Z(z).
\end{align*}
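As an illustration only, suppose that $Z$ takes finitely many values $z_1,\dots,z_k$, with each $\{z_j\}\in\mathcal E_{\mathcal Z}$ and $p_j:=\mathbb P(Z=z_j)>0$. This finiteness assumption and the symbols $z_j$, $p_j$, $k$ are not part of the statement. Then $\mu_Z=\sum_{j=1}^k p_j\,\delta_{z_j}$, and the pushforward identity reduces to a finite weighted sum:
\begin{align*}
\mathbb E[m_a(Z)]=\mathbb E\Bigl[\sum_{j=1}^k m_a(z_j)\mathbf 1_{\{Z=z_j\}}\Bigr]=\sum_{j=1}^k m_a(z_j)\,p_j=\int_{\mathcal Z}m_a(z)\,d\mu_Z(z).
\end{align*}
In this special case $\mathbb E[Y(a)]$ is the average of the conditional means $m_a(z_j)$ weighted by the covariate frequencies $p_j$.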
[/guided]
[/step]
[step:Identify the potential-outcome conditional mean on the positivity support]
By conditional exchangeability,
\begin{align*}
(Y(0),Y(1))\perp\!\!\!\perp A\mid Z.
\end{align*}
In particular, for the fixed treatment level $a$ and for every $z$ in the positivity set $S_a$, the conditional law of $Y(a)$ given $Z=z$ agrees with the conditional law of $Y(a)$ given $A=a$ and $Z=z$, for the selected versions of these conditional laws. Since $Y(a)$ is integrable and the statement fixes conditional-mean versions that agree on $S_a$, taking first moments gives
\begin{align*}
m_a(z)=n_a(z) \quad \text{for every } z\in S_a.
\end{align*}
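To see what this equality amounts to in the simplest setting, here is a sketch under assumptions that are not part of the statement: $Z$ is discrete, $z$ satisfies $\mathbb P(Z=z)>0$ and $\mathbb P(A=a\mid Z=z)>0$, and the conditional means are computed by elementary conditioning. Conditional independence of $Y(a)$ and $A$ given $Z=z$ lets the conditional expectation of the product factor:
\begin{align*}
\mathbb E[Y(a)\mid A=a,Z=z]
&=\frac{\mathbb E[Y(a)\mathbf 1_{\{A=a\}}\mid Z=z]}{\mathbb P(A=a\mid Z=z)}\\
&=\frac{\mathbb E[Y(a)\mid Z=z]\,\mathbb P(A=a\mid Z=z)}{\mathbb P(A=a\mid Z=z)}
=\mathbb E[Y(a)\mid Z=z].
\end{align*}
The division is legitimate only because $\mathbb P(A=a\mid Z=z)>0$, which is the discrete analogue of requiring $z\in S_a$.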
[/step]
[step:Use consistency to replace the potential outcome by the observed outcome]
Consistency states that
\begin{align*}
Y=Y(a) \quad \mathbb P\text{-a.s. on } \{A=a\}.
\end{align*}
Therefore, for every $z\in S_a$, the conditional law of $Y(a)$ given $A=a$ and $Z=z$ agrees with the conditional law of $Y$ given $A=a$ and $Z=z$, for the selected versions. Taking first moments, which is justified because $Y(a)$ and $Y$ are integrable and the statement fixes versions that agree on $S_a$, gives
\begin{align*}
n_a(z)=r_a(z) \quad \text{for every } z\in S_a.
\end{align*}
Combining this equality with the equality from conditional exchangeability yields
\begin{align*}
m_a(z)=r_a(z) \quad \text{for every } z\in S_a.
\end{align*}
[guided]
At this point we have reduced the problem to matching two conditional means on the set of covariate values that the outer integral actually sees. Conditional exchangeability gave
\begin{align*}
m_a(z)=n_a(z) \quad \text{for every } z\in S_a,
\end{align*}
where $m_a(z)$ is a version of $\mathbb E[Y(a)\mid Z=z]$ and $n_a(z)$ is a version of $\mathbb E[Y(a)\mid A=a,Z=z]$.
Now we use consistency. The consistency hypothesis says that, on the event where the received treatment equals $a$, the observed outcome equals the corresponding potential outcome:
\begin{align*}
Y=Y(a) \quad \mathbb P\text{-a.s. on } \{A=a\}.
\end{align*}
Thus, for $z\in S_a$, where conditioning on $A=a$ and $Z=z$ is meaningful, $Y$ and $Y(a)$ are equal almost surely under the conditional law given $A=a$ and $Z=z$. Since $Y$ and $Y(a)$ are integrable, this almost-sure equality implies equality of their conditional first moments. Here $r_a(z)$ is a version of $\mathbb E[Y\mid A=a,Z=z]$, and the statement has fixed versions whose conditional means agree on $S_a$, so
\begin{align*}
n_a(z)=r_a(z) \quad \text{for every } z\in S_a.
\end{align*}
Combining the two equalities on $S_a$ gives
\begin{align*}
m_a(z)=r_a(z) \quad \text{for every } z\in S_a.
\end{align*}
The role of positivity is exactly here: it guarantees that the covariate values outside $S_a$, where conditioning on the treatment arm $A=a$ is not meaningful, carry no $\mu_Z$-mass and so do not contribute to the outer covariate average.
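A minimal discrete sketch of the consistency step, under the illustrative assumption (not in the statement) that $z$ satisfies $\mathbb P(A=a,Z=z)>0$ so that elementary conditioning applies: consistency gives $Y\mathbf 1_{\{A=a\}}=Y(a)\mathbf 1_{\{A=a\}}$ $\mathbb P$-a.s., hence
\begin{align*}
\mathbb E[Y\mid A=a,Z=z]
=\frac{\mathbb E[Y\mathbf 1_{\{A=a,Z=z\}}]}{\mathbb P(A=a,Z=z)}
=\frac{\mathbb E[Y(a)\mathbf 1_{\{A=a,Z=z\}}]}{\mathbb P(A=a,Z=z)}
=\mathbb E[Y(a)\mid A=a,Z=z].
\end{align*}
Only the values of $Y$ on the event $\{A=a\}$ enter the left-hand side, and there $Y$ coincides with $Y(a)$.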
[/guided]
[/step]
[step:Integrate the identified conditional mean over the full covariate support]
The positivity hypothesis gives
\begin{align*}
\mu_Z(S_a)=1.
\end{align*}
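Since $m_a(z)=r_a(z)$ for every $z\in S_a$, the set where the two functions differ is contained in $\mathcal Z\setminus S_a$, which is $\mu_Z$-null. Hence $m_a=r_a$ $\mu_Z$-a.e., so $r_a$ is $\mu_Z$-integrable because $m_a$ is, and their integrals against $\mu_Z$ agree: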
\begin{align*}
\int_{\mathcal Z}m_a(z)\,d\mu_Z(z)=\int_{\mathcal Z}r_a(z)\,d\mu_Z(z).
\end{align*}
Combining this identity with the representation from the first step gives
\begin{align*}
\mathbb E[Y(a)]=\int_{\mathcal Z}r_a(z)\,d\mu_Z(z).
\end{align*}
By the definition of $r_a$, this is exactly
\begin{align*}
\mathbb E[Y(a)]=\mathbb E_Z[\mathbb E[Y\mid A=a,Z]].
\end{align*}
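For orientation, consider the special case, assumed here only for illustration, in which $Z$ takes finitely many values $z_1,\dots,z_k$ with $\mathbb P(Z=z_j)>0$. Because $\mu_Z(S_a)=1$, every such $z_j$ must lie in $S_a$, the integral is a finite sum, and the identity becomes
\begin{align*}
\mathbb E[Y(a)]=\sum_{j=1}^k \mathbb E[Y\mid A=a,Z=z_j]\,\mathbb P(Z=z_j).
\end{align*}
The weights are the marginal covariate probabilities $\mathbb P(Z=z_j)$, not the within-arm probabilities $\mathbb P(Z=z_j\mid A=a)$; using the latter would return $\mathbb E[Y\mid A=a]$ instead.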
[/step]
[step:Subtract the two treatment-level identities to obtain the average treatment effect]
Applying the preceding identity with $a=1$ and with $a=0$ gives
\begin{align*}
\mathbb E[Y(1)]=\int_{\mathcal Z}r_1(z)\,d\mu_Z(z)
\end{align*}
and
\begin{align*}
\mathbb E[Y(0)]=\int_{\mathcal Z}r_0(z)\,d\mu_Z(z).
\end{align*}
Since $Y(1)$ and $Y(0)$ are integrable, both displayed integrals are finite, so their difference is well defined. Using the definition
\begin{align*}
\operatorname{ATE}:=\mathbb E[Y(1)]-\mathbb E[Y(0)],
\end{align*}
and subtracting the two identities, with linearity of the integral, gives
\begin{align*}
\operatorname{ATE}=\int_{\mathcal Z}\bigl(r_1(z)-r_0(z)\bigr)\,d\mu_Z(z).
\end{align*}
Equivalently,
\begin{align*}
\operatorname{ATE}=\mathbb E_Z\bigl[\mathbb E[Y\mid A=1,Z]-\mathbb E[Y\mid A=0,Z]\bigr].
\end{align*}
This is the claimed basic adjustment formula.
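A small numerical check with hypothetical values (chosen for illustration, not taken from any data): let $Z\in\{0,1\}$ with $\mu_Z(\{0\})=0.6$ and $\mu_Z(\{1\})=0.4$, and suppose $r_1(0)=2$, $r_0(0)=1$, $r_1(1)=5$, $r_0(1)=3$. Then
\begin{align*}
\mathbb E[Y(1)]&=0.6\cdot 2+0.4\cdot 5=3.2,\\
\mathbb E[Y(0)]&=0.6\cdot 1+0.4\cdot 3=1.8,\\
\operatorname{ATE}&=0.6\,(2-1)+0.4\,(5-3)=1.4=3.2-1.8.
\end{align*}
Both routes, subtracting the arm-level means or integrating the stratum-level contrasts, give the same value.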
[/step]