[proofplan]
Write $S=e(X)$ and prove conditional independence through bounded test functions, so that all null-set issues are absorbed into conditional-expectation identities. First compute the conditional law of the binary treatment given $S$ and $X$: because $S$ is a version of $\mathbb P(T=1\mid X)$ and $S$ is $\sigma(X)$-measurable, conditioning further on $S$ changes nothing. This gives the balancing identity $T\perp X\mid S$. For the ignorability conclusion, test against bounded functions of the potential-outcome pair and use conditional ignorability given $X$ to factor the [conditional expectation](/page/Conditional%20Expectation); the remaining factor depends on treatment only through $S$, so the same calculation gives independence given $S$.
[/proofplan]
[step:Record the measurability relations generated by the propensity score]
Define the score [random variable](/page/Random%20Variable)
\begin{align*}
S:(\Omega,\mathcal F)\to([0,1],\mathcal B([0,1]))
\end{align*}
by
\begin{align*}
S=e(X).
\end{align*}
Since $e$ is $\mathcal E/\mathcal B([0,1])$-measurable and $X$ is $\mathcal F/\mathcal E$-measurable, $S$ is $\sigma(X)$-measurable. Hence
\begin{align*}
\sigma(S)\subset \sigma(X).
\end{align*}
The propensity-score hypothesis says that $S$ is a chosen version of the conditional expectation
\begin{align*}
\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(X)].
\end{align*}
Therefore
\begin{align*}
\mathbb E[\mathbb 1_{\{T=0\}}\mid \sigma(X)]
=
1-S
\end{align*}
almost surely, because $\mathbb 1_{\{T=0\}}=1-\mathbb 1_{\{T=1\}}$.
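As a purely illustrative instance (the three-point covariate space and the values of $e$ below are assumptions, not part of the theorem), suppose $E=\{a,b,c\}$, each value of $X$ has positive probability, and $e(a)=e(b)=\tfrac12$, $e(c)=\tfrac14$. Then
\begin{align*}
\sigma(S)=\bigl\{\emptyset,\ \{X\in\{a,b\}\},\ \{X=c\},\ \Omega\bigr\},
\end{align*}
which does not contain the event $\{X=a\}\in\sigma(X)$. The inclusion $\sigma(S)\subset\sigma(X)$ can therefore be strict: the score merges covariate values that share the same treatment probability.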
[/step]
[step:Compute the treatment law after conditioning on the score]Let $t\in\{0,1\}$. Define the score-level treatment probability
\begin{align*}
p_t:\Omega\to[0,1]
\end{align*}
by
\begin{align*}
p_t=\mathbb E[\mathbb 1_{\{T=t\}}\mid \sigma(S)].
\end{align*}
Using $\sigma(S)\subset\sigma(X)$ and the tower property of conditional expectation, we obtain
\begin{align*}
p_1
=
\mathbb E[\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(X)]\mid \sigma(S)].
\end{align*}
Substituting the propensity-score version gives
\begin{align*}
p_1
=
\mathbb E[S\mid \sigma(S)].
\end{align*}
Since $S$ is $\sigma(S)$-measurable,
\begin{align*}
p_1=S.
\end{align*}
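Similarly, starting from $\mathbb E[\mathbb 1_{\{T=0\}}\mid \sigma(X)]=1-S$ and noting that $1-S$ is $\sigma(S)$-measurable,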
\begin{align*}
p_0=1-S.
\end{align*}[/step]
[guided]We want the treatment probability inside a score stratum. For $t\in\{0,1\}$, define
\begin{align*}
p_t:\Omega\to[0,1]
\end{align*}
by
\begin{align*}
p_t=\mathbb E[\mathbb 1_{\{T=t\}}\mid \sigma(S)].
\end{align*}
The key point is that conditioning on $X$ is finer than conditioning on $S$, because $S=e(X)$ is a [measurable function](/page/Measurable%20Function) of $X$. Thus $\sigma(S)\subset\sigma(X)$.
For $t=1$, the defining property of the propensity score gives
\begin{align*}
\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(X)]=S
\end{align*}
almost surely. Taking conditional expectation of both sides with respect to $\sigma(S)$ gives
\begin{align*}
\mathbb E[\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(X)]\mid \sigma(S)]
=
\mathbb E[S\mid \sigma(S)].
\end{align*}
The left-hand side is $\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(S)]$ by iterated conditioning along the inclusion $\sigma(S)\subset\sigma(X)$. The right-hand side is $S$, because $S$ is $\sigma(S)$-measurable. Therefore
\begin{align*}
\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(S)]=S.
\end{align*}
Since $\mathbb 1_{\{T=0\}}=1-\mathbb 1_{\{T=1\}}$, linearity of conditional expectation gives
\begin{align*}
\mathbb E[\mathbb 1_{\{T=0\}}\mid \sigma(S)]=1-S.
\end{align*}
Thus, inside each score stratum, the conditional treatment law is Bernoulli with success probability $S$.[/guided]
[step:Prove that the treatment is conditionally independent of covariates given the score]
Let
\begin{align*}
h:(E,\mathcal E)\to(\mathbb R,\mathcal B(\mathbb R))
\end{align*}
be a bounded measurable map. For $t=1$, the product $h(X)\mathbb 1_{\{T=1\}}$ is bounded, hence integrable, and $\sigma(S)\subset\sigma(X)$, so the tower property gives
\begin{align*}
\mathbb E[h(X)\mathbb 1_{\{T=1\}}\mid \sigma(S)]
=
\mathbb E[\mathbb E[h(X)\mathbb 1_{\{T=1\}}\mid \sigma(X)]\mid \sigma(S)].
\end{align*}
Because $h(X)$ is bounded and $\sigma(X)$-measurable,
\begin{align*}
\mathbb E[h(X)\mathbb 1_{\{T=1\}}\mid \sigma(X)]
=
h(X)\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(X)].
\end{align*}
Substituting the propensity-score version $\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(X)]=S$ and combining the two preceding displays,
\begin{align*}
\mathbb E[h(X)\mathbb 1_{\{T=1\}}\mid \sigma(S)]
=
\mathbb E[h(X)S\mid \sigma(S)].
\end{align*}
Since $S$ is $\sigma(S)$-measurable,
\begin{align*}
\mathbb E[h(X)\mathbb 1_{\{T=1\}}\mid \sigma(S)]
=
S\mathbb E[h(X)\mid \sigma(S)].
\end{align*}
By the preceding step, $S=\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(S)]$, hence
\begin{align*}
\mathbb E[h(X)\mathbb 1_{\{T=1\}}\mid \sigma(S)]
=
\mathbb E[h(X)\mid \sigma(S)]\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(S)].
\end{align*}
For $t=0$, the same computation with $\mathbb 1_{\{T=0\}}$ and $1-S$ gives
\begin{align*}
\mathbb E[h(X)\mathbb 1_{\{T=0\}}\mid \sigma(S)]
=
\mathbb E[h(X)\mid \sigma(S)]\mathbb E[\mathbb 1_{\{T=0\}}\mid \sigma(S)].
\end{align*}
Every map $k:\{0,1\}\to\mathbb R$ is bounded, since its domain is finite, and has the representation
\begin{align*}
k(T)=k(1)\mathbb 1_{\{T=1\}}+k(0)\mathbb 1_{\{T=0\}}.
\end{align*}
Linearity of conditional expectation therefore yields
\begin{align*}
\mathbb E[h(X)k(T)\mid \sigma(S)]
=
\mathbb E[h(X)\mid \sigma(S)]\mathbb E[k(T)\mid \sigma(S)].
\end{align*}
This is precisely $T\perp X\mid S$ in the bounded-test-function formulation.
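As a discrete sanity check, under the added assumptions (not needed for the proof) that $X$ takes countably many values and that $s$ is a score value with $\mathbb P(S=s)>0$: for any $x$ with $e(x)=s$ and $\mathbb P(X=x)>0$, we have $\{X=x\}\subset\{S=s\}$, and hence
\begin{align*}
\mathbb P(X=x,T=1\mid S=s)
&=\frac{\mathbb P(X=x)\,e(x)}{\mathbb P(S=s)}
=s\,\mathbb P(X=x\mid S=s)\\
&=\mathbb P(T=1\mid S=s)\,\mathbb P(X=x\mid S=s).
\end{align*}
This is the factorization above with $h=\mathbb 1_{\{x\}}$ and $k=\mathbb 1_{\{1\}}$, read on the stratum $\{S=s\}$.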
[/step]
[step:Use ignorability given covariates to factor outcome and treatment tests]
Assume now that $(Y(1),Y(0))\perp T\mid X$. Define the potential-outcome pair
\begin{align*}
U:(\Omega,\mathcal F)\to(\mathbb R^2,\mathcal B(\mathbb R^2))
\end{align*}
by
\begin{align*}
U=(Y(1),Y(0)).
\end{align*}
Let
\begin{align*}
g:(\mathbb R^2,\mathcal B(\mathbb R^2))\to(\mathbb R,\mathcal B(\mathbb R))
\end{align*}
be bounded and measurable. Choose a bounded $\sigma(X)$-measurable version
\begin{align*}
M_g:\Omega\to\mathbb R
\end{align*}
of
\begin{align*}
\mathbb E[g(U)\mid \sigma(X)].
\end{align*}
Conditional ignorability gives, for each $t\in\{0,1\}$,
\begin{align*}
\mathbb E[g(U)\mathbb 1_{\{T=t\}}\mid \sigma(X)]
=
M_g\mathbb E[\mathbb 1_{\{T=t\}}\mid \sigma(X)]
\end{align*}
almost surely.
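For reference, this display is an instance of the bounded-test-function formulation of conditional independence, which we take here as the working convention (an assumption about the definition in force): $U\perp T\mid X$ means that for every bounded measurable $g$ and every map $k:\{0,1\}\to\mathbb R$,
\begin{align*}
\mathbb E[g(U)k(T)\mid \sigma(X)]
=
\mathbb E[g(U)\mid \sigma(X)]\,\mathbb E[k(T)\mid \sigma(X)]
\end{align*}
almost surely. Taking $k=\mathbb 1_{\{t\}}$, so that $k(T)=\mathbb 1_{\{T=t\}}$, and replacing $\mathbb E[g(U)\mid \sigma(X)]$ by its version $M_g$ recovers the factorization.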
[/step]
[step:Average the covariate-level factorization over score strata]
For $t=1$, the tower property along $\sigma(S)\subset\sigma(X)$, the factorization from the previous step, and the identity $\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(X)]=S$ give
\begin{align*}
\mathbb E[g(U)\mathbb 1_{\{T=1\}}\mid \sigma(S)]
=
\mathbb E[M_g S\mid \sigma(S)].
\end{align*}
Since $S$ is $\sigma(S)$-measurable,
\begin{align*}
\mathbb E[g(U)\mathbb 1_{\{T=1\}}\mid \sigma(S)]
=
S\mathbb E[M_g\mid \sigma(S)].
\end{align*}
Again by iterated conditioning,
\begin{align*}
\mathbb E[M_g\mid \sigma(S)]
=
\mathbb E[g(U)\mid \sigma(S)].
\end{align*}
Using $\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(S)]=S$, we obtain
\begin{align*}
\mathbb E[g(U)\mathbb 1_{\{T=1\}}\mid \sigma(S)]
=
\mathbb E[g(U)\mid \sigma(S)]\mathbb E[\mathbb 1_{\{T=1\}}\mid \sigma(S)].
\end{align*}
For $t=0$, replacing $S$ by $1-S$ gives
\begin{align*}
\mathbb E[g(U)\mathbb 1_{\{T=0\}}\mid \sigma(S)]
=
\mathbb E[g(U)\mid \sigma(S)]\mathbb E[\mathbb 1_{\{T=0\}}\mid \sigma(S)].
\end{align*}
Therefore, by linearity, for every map $k:\{0,1\}\to\mathbb R$,
\begin{align*}
\mathbb E[g(U)k(T)\mid \sigma(S)]
=
\mathbb E[g(U)\mid \sigma(S)]\mathbb E[k(T)\mid \sigma(S)].
\end{align*}
This is the bounded-test-function formulation of
\begin{align*}
U\perp T\mid S.
\end{align*}
Since $U=(Y(1),Y(0))$, this proves
\begin{align*}
(Y(1),Y(0))\perp T\mid e(X).
\end{align*}
Together with the balancing result, the theorem follows.
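As a discrete illustration, under the added assumptions (not needed for the proof) that $X$ and $U$ take countably many values and that $\mathbb P(S=s)>0$, summing over the values $x$ with $e(x)=s$ and $\mathbb P(X=x)>0$ and applying ignorability given $X$ inside each term gives
\begin{align*}
\mathbb P(U=u,T=1\mid S=s)
&=\frac{1}{\mathbb P(S=s)}\sum_{x:\,e(x)=s}\mathbb P(U=u\mid X=x)\,e(x)\,\mathbb P(X=x)\\
&=s\sum_{x:\,e(x)=s}\frac{\mathbb P(U=u,X=x)}{\mathbb P(S=s)}
=s\,\mathbb P(U=u\mid S=s)\\
&=\mathbb P(T=1\mid S=s)\,\mathbb P(U=u\mid S=s).
\end{align*}
The factor $e(x)$ is constant on the stratum, which is exactly why it can be pulled out of the sum.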
[/step]