Failure of Positivity Implies Nonidentification of the Stratum-Specific Potential-Outcome Law (Theorem # 9658)

Proof

[proofplan] We realize the given finite observational law $\mu$ on the canonical finite [probability space](/page/Probability%20Space) $\mathcal L\times\mathcal X\times\mathcal Y$. Since the cell $\{L=\ell,X=x\}$ has probability zero, consistency constrains the counterfactual variable $Y_x$ only on a null event inside the positive-probability stratum $\{L=\ell\}$. We therefore construct two models with identical observed variables and identical observational law, which assign two distinct constant values to $Y_x$ on the stratum $\{L=\ell\}$. These assignments leave the observed law unchanged and preserve consistency almost surely, while forcing different conditional laws of $Y_x$ given $L=\ell$. [/proofplan]

[step:Choose two distinct outcome values and realize the observational law canonically] Since $|\mathcal Y|\geq 2$, choose distinct values $y_0,y_1\in\mathcal Y$. Define the finite probability space \begin{align*} \Omega:=\mathcal L\times\mathcal X\times\mathcal Y, \qquad \mathcal F:=2^\Omega, \end{align*} and define the probability measure $\mathbb P:\mathcal F\to[0,1]$ by \begin{align*} \mathbb P(A):=\mu(A) \end{align*} for every $A\subset\Omega$. Define the coordinate maps \begin{align*} L:\Omega\to\mathcal L \end{align*} by $L(l,b,y)=l$, \begin{align*} X:\Omega\to\mathcal X \end{align*} by $X(l,b,y)=b$, and \begin{align*} Y:\Omega\to\mathcal Y \end{align*} by $Y(l,b,y)=y$. Since all spaces are finite and all $\sigma$-algebras are power-set $\sigma$-algebras, these maps are measurable. By construction, the joint law of $(L,X,Y)$ is exactly $\mu$. [/step]

[step:Construct two potential-outcome systems with the same observed variables] For $j\in\{0,1\}$ and $a\in\mathcal X$, define a map \begin{align*} Y_{a,j}:\Omega\to\mathcal Y \end{align*} as follows.
For $\omega=(l,b,y)\in\Omega$, set $Y_{a,j}(\omega)=y_j$ if $a=x$ and $l=\ell$; set $Y_{a,j}(\omega)=y$ if $a=b$ and not both $a=x$ and $l=\ell$; and set $Y_{a,j}(\omega)=y_0$ if $a\neq b$ and not both $a=x$ and $l=\ell$. This defines a $\mathcal Y$-valued [random variable](/page/Random%20Variable) because $\Omega$ is finite. Let $\mathcal M_j$ denote the causal model with probability space $(\Omega,\mathcal F,\mathbb P)$, observed variables $(L,X,Y)$, and potential outcomes $(Y_{a,j})_{a\in\mathcal X}$.

[guided] The observed variables are already fixed: both models will use the same maps $L$, $X$, and $Y$ on the same probability space. Therefore the only freedom is in the potential outcomes. Consistency forces the potential outcome corresponding to the actually observed treatment to agree with $Y$ almost surely, but it does not constrain potential outcomes corresponding to treatments that are never observed in a given stratum. For $j\in\{0,1\}$ and $a\in\mathcal X$, define \begin{align*} Y_{a,j}:\Omega\to\mathcal Y \end{align*} by the following rule. For $(l,b,y)\in\Omega$, set $Y_{a,j}(l,b,y)=y_j$ if $a=x$ and $l=\ell$; set $Y_{a,j}(l,b,y)=y$ if $a=b$ and not both $a=x$ and $l=\ell$; and set $Y_{a,j}(l,b,y)=y_0$ if $a\neq b$ and not both $a=x$ and $l=\ell$. The first case is the deliberate nonidentification move: inside the stratum $L=\ell$, we set the counterfactual outcome under treatment $x$ to be $y_0$ in the model $\mathcal M_0$ and $y_1$ in the model $\mathcal M_1$. The second case enforces consistency whenever the potential outcome corresponds to the observed treatment. The first case overrides the consistency requirement only at points with $l=\ell$ and $a=b=x$; but the event $\{L=\ell,X=x\}$ has probability zero under $\mathbb P$, so any failure of pointwise consistency there is irrelevant to almost sure consistency. Since $\Omega$ is finite and $\mathcal F=2^\Omega$, every map from $\Omega$ to $\mathcal Y$ is measurable.
Hence each $Y_{a,j}$ is a valid potential-outcome random variable. [/guided] [/step]

[step:Verify consistency for both constructed models] Fix $j\in\{0,1\}$. Define the selected potential-outcome random variable \begin{align*} V_j:\Omega\to\mathcal Y \end{align*} by $V_j(\omega):=Y_{X(\omega),j}(\omega)$ for every $\omega\in\Omega$. Define the null event \begin{align*} N:=\{\omega\in\Omega:L(\omega)=\ell \text{ and } X(\omega)=x\}. \end{align*} The hypothesis $\mu(\{\ell\}\times\{x\}\times\mathcal Y)=0$ and the definition of $\mathbb P$ give \begin{align*} \mathbb P(N)=0. \end{align*} Let $\omega=(l,b,y)\in\Omega\setminus N$, so that $X(\omega)=b$. Because $\omega\notin N$, it is not the case that both $b=x$ and $l=\ell$. Hence the first case in the definition of $Y_{b,j}$ does not apply, and the second (consistency) case gives \begin{align*} V_j(\omega)=Y_{X(\omega),j}(\omega)=Y_{b,j}(\omega)=y=Y(\omega). \end{align*} Thus \begin{align*} Y=V_j \quad \mathbb P\text{-a.s.} \end{align*} So both $\mathcal M_0$ and $\mathcal M_1$ satisfy consistency. [/step]

[step:Compute the two conditional laws on the positive-probability stratum] Define the stratum event \begin{align*} S:=\{\omega\in\Omega:L(\omega)=\ell\}. \end{align*} By hypothesis, \begin{align*} \mathbb P(S)=\mu(\{\ell\}\times\mathcal X\times\mathcal Y)>0, \end{align*} so conditioning on $S$ is well defined. For $j\in\{0,1\}$, the first case of the definition gives \begin{align*} Y_{x,j}(\omega)=y_j \end{align*} for every $\omega\in S$, regardless of the observed treatment. Hence \begin{align*} \mathbb P(Y_{x,j}=y_j\mid L=\ell)=1. \end{align*} Equivalently, for every subset $B\subset\mathcal Y$, \begin{align*} \mathbb P(Y_{x,j}\in B\mid L=\ell)=\mathbb 1_B(y_j). \end{align*} Because $y_0\neq y_1$, the conditional laws of $Y_{x,0}$ and $Y_{x,1}$ given $L=\ell$ are different.
[/step]

[step:Conclude nonidentification from the two compatible models] The two models $\mathcal M_0$ and $\mathcal M_1$ have the same probability space, the same observed random variables $L$, $X$, and $Y$, and hence the same observational law $\mu$ for $(L,X,Y)$. Both models satisfy consistency. However, under $\mathcal M_0$ the conditional law of $Y_x$ given $L=\ell$ is the point mass at $y_0$, while under $\mathcal M_1$ it is the point mass at $y_1$. Thus the observational law $\mu$ does not determine the stratum-specific potential-outcome law $\mathbb P(Y_x\in\cdot\mid L=\ell)$ over the stated class of causal models. This proves nonidentification. [/step]
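
The construction in the proof can be checked numerically on a small example. The following is a minimal sketch, not part of the theorem: the spaces `L_SPACE`, `X_SPACE`, `Y_SPACE`, the choice of stratum `ELL` and treatment `X_STAR`, and the uniform law `MU` on the cells outside $\{L=\ell,X=x\}$ are all hypothetical illustrative choices. It builds the two potential-outcome systems $Y_{a,0}$ and $Y_{a,1}$ by the three-case rule, confirms that consistency holds almost surely in both models, and shows that the conditional laws of $Y_x$ given $L=\ell$ are the two different point masses.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy spaces (illustrative assumptions, not from the theorem).
L_SPACE = ["l0", "l1"]
X_SPACE = [0, 1]
Y_SPACE = [0, 1]
ELL, X_STAR = "l0", 1   # stratum and treatment where positivity fails
Y0, Y1 = 0, 1           # two distinct outcome values y_0 != y_1

# Canonical sample space Omega = L x X x Y.
OMEGA = list(product(L_SPACE, X_SPACE, Y_SPACE))

# Assumed observational law mu: uniform off the cell {L=ELL, X=X_STAR},
# and zero on that cell (so positivity fails there).
_support = [w for w in OMEGA if not (w[0] == ELL and w[1] == X_STAR)]
MU = {w: Fraction(1, len(_support)) if w in _support else Fraction(0)
      for w in OMEGA}


def potential_outcome(a, j, omega):
    """Y_{a,j}(l, b, y), following the three-case rule of the proof."""
    l, b, y = omega
    if a == X_STAR and l == ELL:
        return (Y0, Y1)[j]   # case 1: free choice inside the stratum
    if a == b:
        return y             # case 2: consistency with the observed outcome
    return Y0                # case 3: arbitrary default value


def prob(event):
    """Probability of {omega : event(omega)} under MU."""
    return sum(MU[w] for w in OMEGA if event(w))


def consistency_holds(j):
    """True when P(Y != Y_{X,j}) = 0 in model M_j."""
    return prob(lambda w: potential_outcome(w[1], j, w) != w[2]) == 0


def conditional_law(j):
    """Law of Y_{x,j} given L = ELL in model M_j, as a dict over Y_SPACE."""
    p_stratum = prob(lambda w: w[0] == ELL)
    return {
        v: prob(lambda w: w[0] == ELL
                and potential_outcome(X_STAR, j, w) == v) / p_stratum
        for v in Y_SPACE
    }
```

Both models share the same `MU`, so they are observationally indistinguishable, yet `conditional_law(0)` and `conditional_law(1)` put all their mass on `Y0` and `Y1` respectively. Exact rational arithmetic via `Fraction` is used so that the almost-sure statements can be checked with equality rather than a floating-point tolerance.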
