Smooth McCann Displacement Convexity Theorem — Statement & Proof

Theorem

Edit Issues Pull Requests Attributions Admin

Let $n\in\mathbb N$, let $\mathcal P_2(\mathbb R^n)$ denote the set of Borel probability measures on $\mathbb R^n$ with finite second moment, and let $U:[0,\infty)\to\mathbb R$ be a convex function in the McCann displacement convexity class $\mathcal{DC}_n$, meaning that the McCann transform \begin{align*} G_U:(0,\infty)\to\mathbb R,\qquad r\mapsto r^n U(r^{-n}) \end{align*} is convex and nonincreasing. Let $\rho_0,\rho_1\in C^\infty(\mathbb R^n;(0,\infty))$ be probability densities with finite second moments, and set \begin{align*} \mu_0:=\rho_0\,\mathcal L^n,\qquad \mu_1:=\rho_1\,\mathcal L^n. \end{align*} Assume that $T:\mathbb R^n\to\mathbb R^n$ is the Brenier optimal transport map pushing $\mu_0$ forward to $\mu_1$, that $T$ is continuously differentiable, and that its Jacobian matrix $JT_x$ is symmetric positive semidefinite for $\mathcal L^n$-a.e. $x\in\mathbb R^n$. For each $t\in[0,1]$, define \begin{align*} T_t:\mathbb R^n\to\mathbb R^n,\qquad x\mapsto (1-t)x+tT(x), \end{align*} and assume that $\mu_t:=(T_t)_\#\mu_0$ is absolutely continuous with density $\rho_t:\mathbb R^n\to[0,\infty)$. Assume also that there is a full-$\mathcal L^n$-measure Borel set $E\subset\mathbb R^n$ such that, for every $t\in[0,1]$, $JT_{t,x}:=(1-t)I_n+tJT_x$ exists on $E$, $\det JT_{t,x}>0$ on $E$, the Monge-Ampere change-of-variables identity \begin{align*} \rho_t(T_t(x))\det JT_{t,x}=\rho_0(x) \end{align*} holds for every $x\in E$, and the change-of-variables formula \begin{align*} \int_{\mathbb R^n} h(y)\,d\mathcal L^n(y)=\int_E h(T_t(x))\det JT_{t,x}\,d\mathcal L^n(x) \end{align*} holds for every nonnegative Borel function $h:\mathbb R^n\to[0,\infty]$ and every Borel function $h:\mathbb R^n\to\mathbb R$ for which one side is absolutely integrable. For an absolutely continuous probability measure $\mu=\rho\,\mathcal L^n$, define \begin{align*} \mathcal U[\mu]:=\int_{\mathbb R^n}U(\rho(x))\,d\mathcal L^n(x), \end{align*} with values in $(-\infty,+\infty]$ whenever the integral is well-defined. Assume that $\mathcal U[\mu_0]$ and $\mathcal U[\mu_1]$ are not both involved in an indeterminate extended-real expression on the right-hand side below. Then, for every $t\in[0,1]$, \begin{align*} \mathcal U[\mu_t]\le (1-t)\mathcal U[\mu_0]+t\mathcal U[\mu_1]. \end{align*}

Discussion

No discussion available for this theorem.

Proof

[proofplan] Under the stated smoothness and Monge-Ampere hypotheses, the geodesic is written as $\mu_t=(T_t)_\#\mu_0$, and the internal energy is pulled back to an integral over the initial measure. The determinant inequality for positive semidefinite matrices gives a lower bound on the interpolation Jacobian. Because the McCann transform $G_U(r)=r^nU(r^{-n})$ is convex and nonincreasing, this lower bound becomes the desired upper bound on the energy. [/proofplan] [step:Prove the inequality for smooth positive data with a smooth Brenier map] Assume first that $\rho_0,\rho_1\in C^\infty(\mathbb R^n;(0,\infty))$ are probability densities with finite second moment, that the Brenier map $T:\mathbb R^n\to\mathbb R^n$ is continuously differentiable, and that the Monge-Ampere change-of-variables identities below hold for $\mathcal L^n$-a.e. $x\in\mathbb R^n$. By hypothesis, the Jacobian matrix $JT_x$ is symmetric positive semidefinite for $\mathcal L^n$-a.e. $x\in\mathbb R^n$. For $t\in[0,1]$, define \begin{align*} T_t:\mathbb R^n\to\mathbb R^n,\qquad x\mapsto (1-t)x+tT(x). \end{align*} At points of differentiability of $T$, its Jacobian matrix is \begin{align*} JT_{t,x}=(1-t)I_n+tJT_x. \end{align*} Since $JT_x$ is positive semidefinite, $JT_{t,x}$ is also positive semidefinite. Let $\rho_t:\mathbb R^n\to[0,\infty)$ denote the density of $\mu_t=(T_t)_\#\mu_0$. The Monge-Ampere change-of-variables formula for the map $T_t$ gives, for $\mathcal L^n$-a.e. $x\in\mathbb R^n$, \begin{align*} \rho_t(T_t(x))\det JT_{t,x}=\rho_0(x). \end{align*} Define the determinant radii \begin{align*} r_t:\mathbb R^n\to[0,\infty),\qquad x\mapsto \bigl(\det JT_{t,x}\bigr)^{1/n}, \end{align*} and \begin{align*} r_1:\mathbb R^n\to[0,\infty),\qquad x\mapsto \bigl(\det JT_x\bigr)^{1/n}. \end{align*} Because $\rho_0$ is strictly positive and the transport is between smooth positive densities in the present reduction, these determinant radii are positive for $\mathcal L^n$-a.e. $x$ relevant to the formula. The function $h_t:\mathbb R^n\to(-\infty,+\infty]$ defined by $h_t(y)=U(\rho_t(y))$ is Borel because $\rho_t$ is a Borel density and $U$ is convex, hence continuous on $(0,\infty)$ and finite at $0$ by hypothesis. Applying the assumed change-of-variables formula for $T_t$ to the positive and negative parts of $h_t$ whenever the integral is finite, and otherwise in the extended nonnegative sense after decomposition, gives \begin{align*} \mathcal U[\mu_t]=\int_{\mathbb R^n}U(\rho_t(y))\,d\mathcal L^n(y). \end{align*} Using the substitution $y=T_t(x)$ on the full-measure set from the theorem statement and the Jacobian factor $\det JT_{t,x}=r_t(x)^n$, this becomes \begin{align*} \mathcal U[\mu_t]=\int_{\mathbb R^n}U(\rho_t(T_t(x)))r_t(x)^n\,d\mathcal L^n(x). \end{align*} The Monge-Ampere identity rewrites the density as \begin{align*} \rho_t(T_t(x))=\frac{\rho_0(x)}{r_t(x)^n}. \end{align*} Therefore \begin{align*} \mathcal U[\mu_t]=\int_{\mathbb R^n}r_t(x)^nU\left(\frac{\rho_0(x)}{r_t(x)^n}\right)\,d\mathcal L^n(x). \end{align*} Introduce the measurable function \begin{align*} a_0:\mathbb R^n\to(0,\infty),\qquad x\mapsto \rho_0(x)^{-1/n}. \end{align*} By the definition of $G_U$, \begin{align*} \rho_0(x)G_U(a_0(x)r_t(x))=r_t(x)^nU\left(\frac{\rho_0(x)}{r_t(x)^n}\right). \end{align*} Thus \begin{align*} \mathcal U[\mu_t]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x)r_t(x))\,d\mathcal L^n(x). \end{align*} Let $\operatorname{id}_{\mathbb R^n}:\mathbb R^n\to\mathbb R^n$ denote the identity map, $x\mapsto x$. At $t=0$, $T_0=\operatorname{id}_{\mathbb R^n}$ and $r_0=1$, so \begin{align*} \mathcal U[\mu_0]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x))\,d\mathcal L^n(x). \end{align*} At $t=1$, the same computation for $T$ gives \begin{align*} \mathcal U[\mu_1]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x)r_1(x))\,d\mathcal L^n(x). \end{align*} [guided] The reason for introducing $G_U$ is that the density transformation under $T_t$ naturally produces the expression $r_t^nU(\rho_0/r_t^n)$. This expression is not convex in $r_t$ in any transparent way, but after multiplying and dividing by $\rho_0$ it becomes exactly the McCann transform. We begin with the smooth positive setting. Let \begin{align*} T_t:\mathbb R^n\to\mathbb R^n,\qquad x\mapsto (1-t)x+tT(x). \end{align*} Since $T$ is the Brenier map, it is the gradient of a convex function at $\mu_0$-a.e. point. Therefore $JT_x$ is symmetric positive semidefinite at almost every differentiability point. The Jacobian matrix of $T_t$ is \begin{align*} JT_{t,x}=(1-t)I_n+tJT_x. \end{align*} This matrix is positive semidefinite because it is a convex combination of positive semidefinite matrices. Let $\rho_t:\mathbb R^n\to[0,\infty)$ be the density of $\mu_t=(T_t)_\#\mu_0$. The Monge-Ampere change-of-variables formula states that \begin{align*} \rho_t(T_t(x))\det JT_{t,x}=\rho_0(x) \end{align*} for $\mathcal L^n$-a.e. $x\in\mathbb R^n$. Define \begin{align*} r_t:\mathbb R^n\to[0,\infty),\qquad x\mapsto \bigl(\det JT_{t,x}\bigr)^{1/n}. \end{align*} Then $\det JT_{t,x}=r_t(x)^n$, and the Monge-Ampere identity becomes \begin{align*} \rho_t(T_t(x))=\frac{\rho_0(x)}{r_t(x)^n}. \end{align*} Now compute the energy by pulling it back through $T_t$. The change of variables $y=T_t(x)$ transforms the Lebesgue measure by the Jacobian factor $r_t(x)^n$, hence \begin{align*} \mathcal U[\mu_t]=\int_{\mathbb R^n}U(\rho_t(T_t(x)))r_t(x)^n\,d\mathcal L^n(x). \end{align*} Substituting the density identity gives \begin{align*} \mathcal U[\mu_t]=\int_{\mathbb R^n}r_t(x)^nU\left(\frac{\rho_0(x)}{r_t(x)^n}\right)\,d\mathcal L^n(x). \end{align*} This is where the McCann transform enters. Define \begin{align*} a_0:\mathbb R^n\to(0,\infty),\qquad x\mapsto \rho_0(x)^{-1/n}. \end{align*} Because $G_U(r)=r^nU(r^{-n})$, we have \begin{align*} \rho_0(x)G_U(a_0(x)r_t(x))=r_t(x)^nU\left(\frac{\rho_0(x)}{r_t(x)^n}\right). \end{align*} Therefore the energy has the form \begin{align*} \mathcal U[\mu_t]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x)r_t(x))\,d\mathcal L^n(x). \end{align*} Let $\operatorname{id}_{\mathbb R^n}:\mathbb R^n\to\mathbb R^n$ denote the identity map, $x\mapsto x$. The endpoints are the same formula with $T_0=\operatorname{id}_{\mathbb R^n}$, $r_0=1$, and $r_1=(\det JT)^{1/n}$: \begin{align*} \mathcal U[\mu_0]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x))\,d\mathcal L^n(x), \end{align*} and \begin{align*} \mathcal U[\mu_1]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x)r_1(x))\,d\mathcal L^n(x). \end{align*} This puts all three energies on the same reference measure $\rho_0\,\mathcal L^n$, which is what allows a pointwise convexity argument to prove the global inequality. We now make that pointwise argument inside the guided proof. Fix $x$ in the full-measure set on which the Jacobians, determinant identities, and positivity assertions hold. The matrix $JT_x$ is symmetric positive semidefinite by hypothesis, and $I_n$ is symmetric positive definite. Minkowski's determinant inequality for positive semidefinite matrices, applied to $(1-t)I_n$ and $tJT_x$, gives \begin{align*} r_t(x)=\bigl(\det JT_{t,x}\bigr)^{1/n}\ge (1-t)(\det I_n)^{1/n}+t(\det JT_x)^{1/n}. \end{align*} Since $\det I_n=1$ and $r_1(x)=(\det JT_x)^{1/n}$, this becomes \begin{align*} r_t(x)\ge (1-t)+tr_1(x). \end{align*} Multiplying by the positive number $a_0(x)=\rho_0(x)^{-1/n}$ preserves the inequality: \begin{align*} a_0(x)r_t(x)\ge (1-t)a_0(x)+t a_0(x)r_1(x). \end{align*} Now the direction of monotonicity matters. Since $G_U$ is nonincreasing, the lower bound on the argument gives an upper bound on the value: \begin{align*} G_U(a_0(x)r_t(x))\le G_U((1-t)a_0(x)+t a_0(x)r_1(x)). \end{align*} Since $G_U$ is convex on $(0,\infty)$ and both $a_0(x)$ and $a_0(x)r_1(x)$ are positive, convexity gives \begin{align*} G_U((1-t)a_0(x)+t a_0(x)r_1(x))\le (1-t)G_U(a_0(x))+tG_U(a_0(x)r_1(x)). \end{align*} Combining the two pointwise inequalities and multiplying by $\rho_0(x)\ge0$ yields \begin{align*} \rho_0(x)G_U(a_0(x)r_t(x))\le (1-t)\rho_0(x)G_U(a_0(x))+t\rho_0(x)G_U(a_0(x)r_1(x)). \end{align*} Integrating this inequality with respect to $\mathcal L^n$ and substituting the three energy representations already proved gives \begin{align*} \mathcal U[\mu_t]\le (1-t)\mathcal U[\mu_0]+t\mathcal U[\mu_1]. \end{align*} Thus the guided argument proves the same displacement convexity inequality as the exact proof. [/guided] [/step] [step:Apply the determinant inequality to the interpolated Jacobian] For each $\mathcal L^n$-a.e. $x\in\mathbb R^n$ at which $JT_x$ exists, apply Minkowski's determinant inequality for positive semidefinite matrices to $I_n$ and $JT_x$. Since \begin{align*} JT_{t,x}=(1-t)I_n+tJT_x, \end{align*} we obtain \begin{align*} \bigl(\det JT_{t,x}\bigr)^{1/n}\ge (1-t)(\det I_n)^{1/n}+t(\det JT_x)^{1/n}. \end{align*} Because $\det I_n=1$, this is \begin{align*} r_t(x)\ge (1-t)+tr_1(x). \end{align*} Multiplying by the positive scalar $a_0(x)=\rho_0(x)^{-1/n}$ gives \begin{align*} a_0(x)r_t(x)\ge (1-t)a_0(x)+t\,a_0(x)r_1(x). \end{align*} [/step] [step:Use monotonicity and convexity of the McCann transform] Since $G_U$ is nonincreasing and \begin{align*} a_0(x)r_t(x)\ge (1-t)a_0(x)+t\,a_0(x)r_1(x), \end{align*} we have, for $\mathcal L^n$-a.e. $x\in\mathbb R^n$, \begin{align*} G_U(a_0(x)r_t(x))\le G_U((1-t)a_0(x)+t\,a_0(x)r_1(x)). \end{align*} Since $G_U$ is convex on $(0,\infty)$, \begin{align*} G_U((1-t)a_0(x)+t\,a_0(x)r_1(x))\le (1-t)G_U(a_0(x))+tG_U(a_0(x)r_1(x)). \end{align*} Combining these inequalities and multiplying by the nonnegative density $\rho_0(x)$ gives \begin{align*} \rho_0(x)G_U(a_0(x)r_t(x))\le (1-t)\rho_0(x)G_U(a_0(x))+t\rho_0(x)G_U(a_0(x)r_1(x)). \end{align*} Integrating with respect to $\mathcal L^n$ and using the three energy representations from the first step yields \begin{align*} \mathcal U[\mu_t]\le (1-t)\mathcal U[\mu_0]+t\mathcal U[\mu_1]. \end{align*} Thus the theorem is proved in the smooth positive setting. [/step]

What brings you to Androma?

Start with a route through the knowledge graph.

Smooth McCann Displacement Convexity Theorem (Theorem # 9567)

Discussion

Proof

Explore Further

Sign in to Androma

Check your inbox

One last step

Smooth McCann Displacement Convexity Theorem (Theorem # 9567)

Discussion

Proof

Explore Further