Coupling Bound for Total Variation Distance — Statement & Proof

Coupling Bound for Total Variation Distance (Theorem # 7219)

Theorem

Let $(\Omega,\mathcal F,\mathbb P)$ be a probability space and let $(E,\mathcal E)$ be a measurable space whose diagonal $\Delta_E := \{(x,x) : x \in E\}$ belongs to $\mathcal E \otimes \mathcal E$. Let $Y, Z : (\Omega,\mathcal F) \to (E,\mathcal E)$ be measurable maps with laws $\mu = \mathbb P \circ Y^{-1}$ and $\nu = \mathbb P \circ Z^{-1}$. With the convention $\|\mu-\nu\|_{\mathrm{TV}} = \sup_{A \in \mathcal E} |\mu(A)-\nu(A)|$, \begin{align*} \|\mu-\nu\|_{\mathrm{TV}} \leq \mathbb P(Y \neq Z). \end{align*}

Discussion

Proof

[proofplan] We first verify that the disagreement event $\{Y \neq Z\}$ is measurable, using the assumed measurability of the diagonal in $E \times E$. Then we fix a measurable set $A \in \mathcal E$ and compare the two probabilities $\mu(A)$ and $\nu(A)$ through the indicator functions $\mathbb{1}_A \circ Y$ and $\mathbb{1}_A \circ Z$. These indicators can differ only on the disagreement event, so the integral of their difference is bounded by $\mathbb P(Y \neq Z)$. Taking the supremum over all measurable $A$ gives the total variation bound. [/proofplan]

[step:Verify that the disagreement event is measurable] Define the product [random variable](/page/Random%20Variable)
\begin{align*} W : (\Omega,\mathcal F) \to (E \times E,\mathcal E \otimes \mathcal E), \qquad \omega \mapsto (Y(\omega),Z(\omega)). \end{align*}
For every measurable rectangle $A \times B$ with $A,B \in \mathcal E$,
\begin{align*} W^{-1}(A \times B) = Y^{-1}(A) \cap Z^{-1}(B) \in \mathcal F, \end{align*}
so $W$ is measurable because $\mathcal E \otimes \mathcal E$ is generated by measurable rectangles. Let
\begin{align*} D := \{\omega \in \Omega : Y(\omega) \neq Z(\omega)\}. \end{align*}
By assumption, the diagonal $\Delta_E := \{(x,x) : x \in E\}$ belongs to $\mathcal E \otimes \mathcal E$, so its complement $(E \times E)\setminus \Delta_E$ also belongs to $\mathcal E \otimes \mathcal E$. Therefore
\begin{align*} D = W^{-1}((E \times E)\setminus \Delta_E) \in \mathcal F. \end{align*}
Thus $\mathbb P(D)$ is well-defined. [/step]

[step:Bound the difference on an arbitrary measurable set] Fix $A \in \mathcal E$. Define the indicator map
\begin{align*} \mathbb{1}_A : (E,\mathcal E) \to (\mathbb R,\mathcal B(\mathbb R)) \end{align*}
by setting $\mathbb{1}_A(x)=1$ when $x \in A$ and $\mathbb{1}_A(x)=0$ when $x \notin A$. Since $A \in \mathcal E$, the map $\mathbb{1}_A$ is measurable.
Then $\mathbb{1}_A \circ Y$ and $\mathbb{1}_A \circ Z$ are bounded real-valued [measurable functions](/page/Measurable%20Functions) on $(\Omega,\mathcal F)$, and by the definition of the laws $\mu$ and $\nu$,
\begin{align*} \mu(A) = \mathbb P(Y^{-1}(A)) = \int_\Omega \mathbb{1}_A(Y(\omega))\,d\mathbb P(\omega) \end{align*}
and
\begin{align*} \nu(A) = \mathbb P(Z^{-1}(A)) = \int_\Omega \mathbb{1}_A(Z(\omega))\,d\mathbb P(\omega). \end{align*}
Hence
\begin{align*} |\mu(A)-\nu(A)| = \left|\int_\Omega \bigl(\mathbb{1}_A(Y(\omega))-\mathbb{1}_A(Z(\omega))\bigr)\,d\mathbb P(\omega)\right|. \end{align*}
Applying the triangle inequality for the [Lebesgue integral](/page/Lebesgue%20Integral) with respect to $\mathbb P$ gives
\begin{align*} |\mu(A)-\nu(A)| \leq \int_\Omega |\mathbb{1}_A(Y(\omega))-\mathbb{1}_A(Z(\omega))|\,d\mathbb P(\omega). \end{align*}
If $\omega \in \Omega \setminus D$, then $Y(\omega)=Z(\omega)$, so
\begin{align*} \mathbb{1}_A(Y(\omega))-\mathbb{1}_A(Z(\omega)) = 0. \end{align*}
Since both indicators take values in $\{0,1\}$, for every $\omega \in \Omega$ one has
\begin{align*} |\mathbb{1}_A(Y(\omega))-\mathbb{1}_A(Z(\omega))| \leq \mathbb{1}_D(\omega). \end{align*}
Therefore
\begin{align*} |\mu(A)-\nu(A)| \leq \int_\Omega \mathbb{1}_D(\omega)\,d\mathbb P(\omega) = \mathbb P(D). \end{align*}

[guided] Fix a measurable set $A \in \mathcal E$. The goal is to prove a bound for the single quantity $|\mu(A)-\nu(A)|$; the total variation norm will then be obtained by taking the supremum over all such $A$. Define the indicator map
\begin{align*} \mathbb{1}_A : (E,\mathcal E) \to (\mathbb R,\mathcal B(\mathbb R)) \end{align*}
by setting $\mathbb{1}_A(x)=1$ when $x \in A$ and $\mathbb{1}_A(x)=0$ when $x \notin A$. Because $A \in \mathcal E$, this indicator map is measurable.
Since $Y$ and $Z$ are measurable maps from $(\Omega,\mathcal F)$ to $(E,\mathcal E)$, the compositions $\mathbb{1}_A \circ Y$ and $\mathbb{1}_A \circ Z$ are bounded real-valued measurable functions on $(\Omega,\mathcal F)$. The laws $\mu$ and $\nu$ are the pushforward measures $\mu=\mathbb P \circ Y^{-1}$ and $\nu=\mathbb P \circ Z^{-1}$. Therefore
\begin{align*} \mu(A) = \mathbb P(Y^{-1}(A)) = \int_\Omega \mathbb{1}_A(Y(\omega))\,d\mathbb P(\omega) \end{align*}
and
\begin{align*} \nu(A) = \mathbb P(Z^{-1}(A)) = \int_\Omega \mathbb{1}_A(Z(\omega))\,d\mathbb P(\omega). \end{align*}
Subtracting these identities gives
\begin{align*} |\mu(A)-\nu(A)| = \left|\int_\Omega \bigl(\mathbb{1}_A(Y(\omega))-\mathbb{1}_A(Z(\omega))\bigr)\,d\mathbb P(\omega)\right|. \end{align*}
The triangle inequality for integration with respect to the probability measure $\mathbb P$ yields
\begin{align*} |\mu(A)-\nu(A)| \leq \int_\Omega |\mathbb{1}_A(Y(\omega))-\mathbb{1}_A(Z(\omega))|\,d\mathbb P(\omega). \end{align*}
Now we identify where the integrand can be nonzero. Let
\begin{align*} D := \{\omega \in \Omega : Y(\omega) \neq Z(\omega)\}. \end{align*}
The preceding step proved that $D \in \mathcal F$, so the indicator $\mathbb{1}_D : (\Omega,\mathcal F) \to (\mathbb R,\mathcal B(\mathbb R))$ is measurable and $\int_\Omega \mathbb{1}_D(\omega)\,d\mathbb P(\omega)=\mathbb P(D)$. If $\omega \in \Omega \setminus D$, then $Y(\omega)=Z(\omega)$, and so the two indicator values are equal:
\begin{align*} \mathbb{1}_A(Y(\omega)) = \mathbb{1}_A(Z(\omega)). \end{align*}
Thus the absolute difference of the indicators vanishes on $\Omega \setminus D$. On $D$, the absolute difference is at most $1$, because both indicators take only the values $0$ and $1$. Combining these two cases gives the pointwise estimate
\begin{align*} |\mathbb{1}_A(Y(\omega))-\mathbb{1}_A(Z(\omega))| \leq \mathbb{1}_D(\omega) \end{align*}
for every $\omega \in \Omega$.
Integrating this pointwise inequality with respect to $\mathbb P$ gives
\begin{align*} |\mu(A)-\nu(A)| \leq \int_\Omega \mathbb{1}_D(\omega)\,d\mathbb P(\omega) = \mathbb P(D). \end{align*}
This proves that the probability discrepancy on the particular set $A$ can only be caused by outcomes on which the coupled variables $Y$ and $Z$ disagree. [/guided] [/step]

[step:Take the supremum over measurable sets] The previous step proves that for every $A \in \mathcal E$,
\begin{align*} |\mu(A)-\nu(A)| \leq \mathbb P(D). \end{align*}
Taking the supremum over all $A \in \mathcal E$ and using the stated convention for total variation distance gives
\begin{align*} \|\mu-\nu\|_{\mathrm{TV}} = \sup_{A \in \mathcal E} |\mu(A)-\nu(A)| \leq \mathbb P(D). \end{align*}
Since $D=\{\omega \in \Omega : Y(\omega) \neq Z(\omega)\}$, this is precisely
\begin{align*} \|\mu-\nu\|_{\mathrm{TV}} \leq \mathbb P(Y \neq Z). \end{align*}
[/step]
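
The bound can be checked numerically on a small finite example. The sketch below is only an illustration, not part of the proof: it assumes a three-point space $E=\{0,1,2\}$ with the power-set $\sigma$-algebra, and the joint distribution `joint` of $(Y,Z)$ is a hypothetical coupling chosen for the example. The helper names `marginals`, `total_variation` and `disagreement_probability` are likewise made up here. The total variation distance is computed by brute force as $\sup_A |\mu(A)-\nu(A)|$ over all subsets $A$, matching the convention used in the final step.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical coupling of (Y, Z) on E = {0, 1, 2}: joint[(y, z)] = P(Y = y, Z = z).
joint = {
    (0, 0): F(1, 4), (1, 1): F(1, 4), (2, 2): F(1, 8),  # outcomes with Y == Z
    (0, 1): F(1, 8), (1, 2): F(1, 4),                   # outcomes with Y != Z
}

def marginals(joint):
    """Return the laws mu of Y and nu of Z as dictionaries point -> mass."""
    mu, nu = {}, {}
    for (y, z), p in joint.items():
        mu[y] = mu.get(y, F(0)) + p
        nu[z] = nu.get(z, F(0)) + p
    return mu, nu

def total_variation(mu, nu):
    """Brute-force sup over all subsets A of |mu(A) - nu(A)| (small spaces only)."""
    points = sorted(set(mu) | set(nu))
    best = F(0)
    for size in range(len(points) + 1):
        for A in combinations(points, size):
            mass_mu = sum((mu.get(x, F(0)) for x in A), F(0))
            mass_nu = sum((nu.get(x, F(0)) for x in A), F(0))
            best = max(best, abs(mass_mu - mass_nu))
    return best

def disagreement_probability(joint):
    """P(Y != Z): total mass of the joint law off the diagonal."""
    return sum((p for (y, z), p in joint.items() if y != z), F(0))

mu, nu = marginals(joint)
tv = total_variation(mu, nu)
p_diff = disagreement_probability(joint)
print(tv, p_diff, tv <= p_diff)
```

For this particular coupling the marginals are $\mu = (3/8, 1/2, 1/8)$ and $\nu = (1/4, 3/8, 3/8)$, so the script reports a total variation distance of $1/4$ against a disagreement probability of $3/8$. The inequality is strict here, which shows that an arbitrary coupling need not attain the bound.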
