Talagrand $T_2(C)$ Implies Talagrand $T_1(C)$ — Statement & Proof

Talagrand $T_2(C)$ Implies Talagrand $T_1(C)$ (Theorem # 6796)

Theorem

Proof

[proofplan] We first compare the two Wasserstein distances on $\mathcal P_2(E)$. For any coupling of $\nu$ and $\mu$, the first moment of the transport distance is bounded by the square root of its second moment, because the variance of the transport distance under a probability measure is non-negative. Taking the infimum over couplings gives $W_1(\nu,\mu)\le W_2(\nu,\mu)$, and the assumed $T_2(C)$ inequality then gives the desired $T_1(C)$ bound on $\mathcal P_2(E)$. Finally, the stated approximation property extends the estimate from $\mathcal P_2(E)$ to finite-entropy measures in $\mathcal P_1(E)$ by passing to the limit in $W_1$ and using the upper entropy bound. [/proofplan]

[step:Compare $W_1$ and $W_2$ on $\mathcal P_2(E)$] Let $\nu \in \mathcal P_2(E)$, and let $\Pi(\nu,\mu)$ denote the set of Borel probability measures $\pi$ on $E \times E$ whose first marginal is $\nu$ and whose second marginal is $\mu$. For $\pi \in \Pi(\nu,\mu)$, define the measurable transport-distance function $D_\pi: E \times E \to [0,\infty)$ by $D_\pi(x,y)=d(x,y)$. Assume first that
\begin{align*} \int_{E \times E} D_\pi(x,y)^2 \, d\pi(x,y) <+\infty. \end{align*}
Define the number
\begin{align*} m_\pi := \int_{E \times E} D_\pi(x,y) \, d\pi(x,y). \end{align*}
It is finite because $D_\pi \le 1 + D_\pi^2$. Since $\pi$ is a probability measure and $(D_\pi-m_\pi)^2 \ge 0$, we have
\begin{align*} 0 \le \int_{E \times E} (D_\pi(x,y)-m_\pi)^2 \, d\pi(x,y). \end{align*}
Expanding the square and using $\pi(E\times E)=1$ gives
\begin{align*} 0 \le \int_{E \times E} D_\pi(x,y)^2 \, d\pi(x,y) - m_\pi^2. \end{align*}
Therefore
\begin{align*} \int_{E \times E} D_\pi(x,y) \, d\pi(x,y) \le \left(\int_{E \times E} D_\pi(x,y)^2 \, d\pi(x,y)\right)^{1/2}. \end{align*}
If the quadratic cost is infinite, the same inequality holds with right-hand side $+\infty$. Taking the infimum over all $\pi \in \Pi(\nu,\mu)$ and using that the square-root function is increasing and continuous on $[0,\infty]$, we obtain
\begin{align*} W_1(\nu,\mu) \le W_2(\nu,\mu). \end{align*}

[guided] The purpose of this step is to prove that a quadratic transport bound is stronger than a linear transport bound. We do this at the level of each coupling, before taking infima.

Let $\Pi(\nu,\mu)$ be the set of Borel probability measures $\pi$ on $E \times E$ with first marginal $\nu$ and second marginal $\mu$. For a fixed coupling $\pi \in \Pi(\nu,\mu)$, define the map $D_\pi: E \times E \to [0,\infty)$ by $D_\pi(x,y)=d(x,y)$. This function is Borel measurable because the metric $d$ of the [metric space](/page/Metric%20Space) $E$ is continuous on $E \times E$.

Suppose first that the quadratic transportation cost under $\pi$ is finite:
\begin{align*} \int_{E \times E} D_\pi(x,y)^2 \, d\pi(x,y) <+\infty. \end{align*}
We want to compare the first moment of $D_\pi$ to the second moment of $D_\pi$. Define
\begin{align*} m_\pi := \int_{E \times E} D_\pi(x,y) \, d\pi(x,y). \end{align*}
This number is finite because $D_\pi \le 1 + D_\pi^2$ and the second moment is finite.

The key elementary estimate is the non-negativity of variance. Since $\pi$ is a probability measure and $(D_\pi-m_\pi)^2$ is non-negative, we have
\begin{align*} 0 \le \int_{E \times E} (D_\pi(x,y)-m_\pi)^2 \, d\pi(x,y). \end{align*}
Expanding the square under the integral is justified by the finite second-moment assumption. Because $\pi(E\times E)=1$ and $m_\pi$ is constant, this gives
\begin{align*} 0 \le \int_{E \times E} D_\pi(x,y)^2 \, d\pi(x,y) - 2m_\pi \int_{E \times E} D_\pi(x,y) \, d\pi(x,y) + m_\pi^2. \end{align*}
By the definition of $m_\pi$, the middle integral equals $m_\pi$, so the last two terms combine to $-2m_\pi^2 + m_\pi^2 = -m_\pi^2$ and the inequality becomes
\begin{align*} 0 \le \int_{E \times E} D_\pi(x,y)^2 \, d\pi(x,y) - m_\pi^2. \end{align*}
Thus
\begin{align*} \left(\int_{E \times E} D_\pi(x,y) \, d\pi(x,y)\right)^2 \le \int_{E \times E} D_\pi(x,y)^2 \, d\pi(x,y). \end{align*}
Taking square roots, which preserves the inequality because both sides are non-negative, gives
\begin{align*} \int_{E \times E} D_\pi(x,y) \, d\pi(x,y) \le \left(\int_{E \times E} D_\pi(x,y)^2 \, d\pi(x,y)\right)^{1/2}. \end{align*}
If the quadratic cost is infinite, the same displayed inequality holds in the extended real sense.
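
As an illustrative special case (not part of the source argument), suppose $D_\pi$ takes only two values: $a$ with $\pi$-probability $p$ and $b$ with $\pi$-probability $1-p$. Here $a,b \ge 0$ and $p \in [0,1]$ are hypothetical parameters introduced only for this example. Then $m_\pi = pa + (1-p)b$, and the variance gap between the second moment and the squared first moment can be computed by hand:
\begin{align*} \int_{E \times E} D_\pi(x,y)^2 \, d\pi(x,y) - m_\pi^2 = p a^2 + (1-p) b^2 - \bigl(pa + (1-p)b\bigr)^2 = p(1-p)(a-b)^2 \ge 0. \end{align*}
In this special case the gap vanishes exactly when $p \in \{0,1\}$ or $a=b$, that is, when $D_\pi$ is $\pi$-almost surely constant.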

Now take the infimum over all couplings $\pi \in \Pi(\nu,\mu)$. The left-hand infimum is $W_1(\nu,\mu)$ by definition, while the infimum of the square roots of the quadratic costs equals the square root of their infimum, namely $W_2(\nu,\mu)$, because the square-root map is increasing and continuous on $[0,\infty]$. Therefore
\begin{align*} W_1(\nu,\mu) \le W_2(\nu,\mu). \end{align*}
[/guided] [/step]

[step:Apply the assumed $T_2(C)$ inequality to obtain $T_1(C)$ on $\mathcal P_2(E)$] Let $\nu \in \mathcal P_2(E)$. If $H(\nu\mid\mu)=+\infty$, then
\begin{align*} W_1^2(\nu,\mu) \le +\infty = 2C H(\nu\mid\mu) \end{align*}
in the extended real sense, so the desired assertion is automatic. Assume now that $H(\nu\mid\mu)<+\infty$. Since $\mu$ satisfies $T_2(C)$ on $\mathcal P_2(E)$, applying the hypothesis to $\rho=\nu$ gives
\begin{align*} W_2^2(\nu,\mu) \le 2C H(\nu\mid\mu). \end{align*}
From the previous step,
\begin{align*} W_1(\nu,\mu) \le W_2(\nu,\mu). \end{align*}
Both sides are non-negative, so squaring preserves this inequality. Combining it with the $T_2(C)$ estimate yields
\begin{align*} W_1^2(\nu,\mu) \le W_2^2(\nu,\mu) \le 2C H(\nu\mid\mu). \end{align*}
Thus the claimed $T_1(C)$ inequality holds for every $\nu \in \mathcal P_2(E)$. [/step]

[step:Pass from $\mathcal P_2(E)$ to $\mathcal P_1(E)$ using the approximation hypothesis] Assume the stated approximation property, and let $\nu \in \mathcal P_1(E)$. If $H(\nu\mid\mu)=+\infty$, the inequality
\begin{align*} W_1^2(\nu,\mu) \le 2C H(\nu\mid\mu) \end{align*}
is automatic in the extended real sense. Assume now that $H(\nu\mid\mu)<+\infty$. By the approximation hypothesis, there exists a sequence $(\nu_k)_{k=1}^{\infty}$ in $\mathcal P_2(E)$ such that $\nu_k \to \nu$ in $W_1$ and
\begin{align*} \limsup_{k \to \infty} H(\nu_k\mid\mu) \le H(\nu\mid\mu). \end{align*}
For each positive integer $k$, the already proved $\mathcal P_2(E)$ case gives
\begin{align*} W_1^2(\nu_k,\mu) \le 2C H(\nu_k\mid\mu). \end{align*}
Since $\nu_k \to \nu$ in $W_1$, we have
\begin{align*} \lim_{k \to \infty} W_1(\nu_k,\nu)=0. \end{align*}
The reverse triangle inequality for the metric $W_1$ gives
\begin{align*} |W_1(\nu_k,\mu)-W_1(\nu,\mu)| \le W_1(\nu_k,\nu), \end{align*}
and hence
\begin{align*} \lim_{k \to \infty} W_1(\nu_k,\mu)=W_1(\nu,\mu). \end{align*}
Taking the limit superior in the inequalities for $\nu_k$ yields
\begin{align*} W_1^2(\nu,\mu) = \lim_{k \to \infty} W_1^2(\nu_k,\mu) \le 2C \limsup_{k \to \infty} H(\nu_k\mid\mu). \end{align*}
Using the entropy approximation bound, we conclude that
\begin{align*} W_1^2(\nu,\mu) \le 2C H(\nu\mid\mu). \end{align*}
Therefore $\mu$ satisfies $T_1(C)$ on all of $\mathcal P_1(E)$. [/step]
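
The comparison $W_1 \le W_2$ used in the proof can be sanity-checked numerically. The sketch below is not part of the proof. It assumes empirical measures on the real line with equally many atoms, where pairing the sorted samples is an optimal coupling for the costs $|x-y|^p$ with $p \ge 1$. The helper names `wasserstein_1d` and `coupling_moments`, the sample sizes, and the Gaussian parameters are all illustrative choices, not taken from the source.

```python
import math
import random


def wasserstein_1d(xs, ys, p):
    """W_p between two empirical measures on the real line with equally many atoms.

    Assumption of this sketch: for the cost |x - y|**p with p >= 1 in one
    dimension, pairing the sorted samples (the monotone coupling) is optimal.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch needs equally many atoms in both samples")
    cost = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys))) / len(xs)
    return cost ** (1.0 / p)


def coupling_moments(xs, ys):
    """First and second moments of |x - y| under the coupling pairing xs[i] with ys[i]."""
    dists = [abs(x - y) for x, y in zip(xs, ys)]
    first = sum(dists) / len(dists)
    second = sum(d * d for d in dists) / len(dists)
    return first, second


rng = random.Random(0)  # fixed seed so the run is reproducible
nu_sample = [rng.gauss(1.0, 2.0) for _ in range(2000)]
mu_sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Per-coupling inequality: first moment <= sqrt(second moment), for an arbitrary pairing.
first, second = coupling_moments(nu_sample, mu_sample)
assert first <= math.sqrt(second) + 1e-12

# After optimising over couplings: W_1 <= W_2.
w1 = wasserstein_1d(nu_sample, mu_sample, 1)
w2 = wasserstein_1d(nu_sample, mu_sample, 2)
assert w1 <= w2 + 1e-12

# A pure shift moves every atom by the same distance, so W_1 = W_2 = shift here.
shift = 0.75
shifted = [y + shift for y in mu_sample]
assert math.isclose(wasserstein_1d(shifted, mu_sample, 1), shift, rel_tol=1e-9)
assert math.isclose(wasserstein_1d(shifted, mu_sample, 2), shift, rel_tol=1e-9)
```

The shift case illustrates that the inequality $W_1 \le W_2$ can be an equality: the transport distance is constant under the optimal coupling, so its variance is zero.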
