Wasserstein Geodesics Induced by Dynamical Optimal Plans

Wasserstein Geodesics Induced by Dynamical Optimal Plans (Theorem # 7486)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We first check that the interpolated measures have finite $p$-moment, using the fact that every path in $\operatorname{Geo}(X)$ is controlled by its two endpoints. For two times $s,t$, the pushforward $(e_s,e_t)_\#\Pi$ gives an admissible transport plan between $\mu_s$ and $\mu_t$, and the constant-speed property of each path bounds its transport cost by $|t-s|^p$ times the endpoint cost. Since the endpoint plan is optimal, this gives an upper bound for every intermediate distance. The triangle inequality for $W_p$ along the chain $\mu_0,\mu_s,\mu_t,\mu_1$ forces all these upper bounds to be equalities. [/proofplan] [step:Verify that every interpolated measure has finite $p$-moment] Fix a point $x_\ast\in X$. For each $t\in[0,1]$, the map $e_t:\operatorname{Geo}(X)\to X$ is Borel, so $\mu_t=(e_t)_\#\Pi$ is a Borel probability measure on $X$. We prove that $\mu_t$ has finite $p$-moment. Since each $\gamma\in\operatorname{Geo}(X)$ is a constant-speed geodesic, \begin{align*} d(\gamma(0),\gamma(t))=t\,d(\gamma(0),\gamma(1)). \end{align*} By the triangle inequality in $X$, \begin{align*} d(x_\ast,\gamma(t))\le d(x_\ast,\gamma(0))+t\,d(\gamma(0),\gamma(1)). \end{align*} Again by the triangle inequality in $X$, \begin{align*} d(\gamma(0),\gamma(1))\le d(\gamma(0),x_\ast)+d(x_\ast,\gamma(1)). \end{align*} Therefore, since $0\le t\le 1$, \begin{align*} d(x_\ast,\gamma(t))\le 2d(x_\ast,\gamma(0))+d(x_\ast,\gamma(1)). \end{align*} Using the elementary inequality $(a+b+c)^p\le 3^{p-1}(a^p+b^p+c^p)$ for $a,b,c\ge0$, we get \begin{align*} d(x_\ast,\gamma(t))^p\le 3^{p-1}\left(2^p d(x_\ast,\gamma(0))^p+d(x_\ast,\gamma(1))^p\right). \end{align*} Integrating with respect to $\Pi$ and using $(e_0,e_1)_\#\Pi=\pi$, whose marginals are $\mu_0$ and $\mu_1$, gives \begin{align*} \int_X d(x_\ast,z)^p\,d\mu_t(z)\le 3^{p-1}\left(2^p\int_X d(x_\ast,x)^p\,d\mu_0(x)+\int_X d(x_\ast,y)^p\,d\mu_1(y)\right). \end{align*} The right-hand side is finite because $\mu_0,\mu_1\in\mathcal P_p(X)$. Hence $\mu_t\in\mathcal P_p(X)$. [/step] [step:Use the dynamical plan to bound the distance between two time slices] Let $s,t\in[0,1]$ with $s\le t$. Define the endpoint-at-times coupling \begin{align*} \pi_{s,t}:=(e_s,e_t)_\#\Pi. \end{align*} Then $\pi_{s,t}\in\Pi(\mu_s,\mu_t)$ because its first marginal is $(e_s)_\#\Pi=\mu_s$ and its second marginal is $(e_t)_\#\Pi=\mu_t$. By the definition of $W_p$ as the infimum of $p$th transport costs over couplings, \begin{align*} W_p(\mu_s,\mu_t)^p\le \int_{X\times X} d(x,y)^p\,d\pi_{s,t}(x,y). \end{align*} Using the pushforward identity for $\pi_{s,t}$, this becomes \begin{align*} W_p(\mu_s,\mu_t)^p\le \int_{\operatorname{Geo}(X)} d(\gamma(s),\gamma(t))^p\,d\Pi(\gamma). \end{align*} Since each $\gamma\in\operatorname{Geo}(X)$ is parametrized with constant speed on $[0,1]$, \begin{align*} d(\gamma(s),\gamma(t))=(t-s)d(\gamma(0),\gamma(1)). \end{align*} Therefore \begin{align*} W_p(\mu_s,\mu_t)^p\le (t-s)^p\int_{\operatorname{Geo}(X)} d(\gamma(0),\gamma(1))^p\,d\Pi(\gamma). \end{align*} Using $(e_0,e_1)_\#\Pi=\pi$, we identify the last integral with the endpoint transport cost: \begin{align*} \int_{\operatorname{Geo}(X)} d(\gamma(0),\gamma(1))^p\,d\Pi(\gamma)=\int_{X\times X} d(x,y)^p\,d\pi(x,y). \end{align*} Since $\pi$ is optimal, \begin{align*} \int_{X\times X} d(x,y)^p\,d\pi(x,y)=W_p(\mu_0,\mu_1)^p. \end{align*} Thus \begin{align*} W_p(\mu_s,\mu_t)\le (t-s)W_p(\mu_0,\mu_1). \end{align*} [guided] We want to estimate the Wasserstein distance between the two intermediate measures $\mu_s$ and $\mu_t$. The definition of $W_p$ asks us to minimize over all couplings of $\mu_s$ and $\mu_t$, so it is enough to exhibit one admissible coupling and compute its cost. The natural coupling is obtained by taking a random geodesic $\gamma$ distributed according to $\Pi$ and looking at its two positions $\gamma(s)$ and $\gamma(t)$. Formally, define \begin{align*} \pi_{s,t}:=(e_s,e_t)_\#\Pi. \end{align*} This is a probability measure on $X\times X$. Its first marginal is $(e_s)_\#\Pi=\mu_s$, and its second marginal is $(e_t)_\#\Pi=\mu_t$, so $\pi_{s,t}\in\Pi(\mu_s,\mu_t)$. Because $W_p(\mu_s,\mu_t)^p$ is the infimum of $\int_{X\times X}d(x,y)^p\,d\rho(x,y)$ over all $\rho\in\Pi(\mu_s,\mu_t)$, the particular admissible plan $\pi_{s,t}$ gives the upper bound \begin{align*} W_p(\mu_s,\mu_t)^p\le \int_{X\times X} d(x,y)^p\,d\pi_{s,t}(x,y). \end{align*} Now we rewrite this integral using the definition of pushforward. Since $\pi_{s,t}=(e_s,e_t)_\#\Pi$, we have \begin{align*} \int_{X\times X} d(x,y)^p\,d\pi_{s,t}(x,y)=\int_{\operatorname{Geo}(X)} d(\gamma(s),\gamma(t))^p\,d\Pi(\gamma). \end{align*} The key point is that $\Pi$ is supported on constant-speed geodesics. Therefore, for every $\gamma\in\operatorname{Geo}(X)$ and every $0\le s\le t\le1$, \begin{align*} d(\gamma(s),\gamma(t))=(t-s)d(\gamma(0),\gamma(1)). \end{align*} Substituting this identity into the previous estimate gives \begin{align*} W_p(\mu_s,\mu_t)^p\le (t-s)^p\int_{\operatorname{Geo}(X)} d(\gamma(0),\gamma(1))^p\,d\Pi(\gamma). \end{align*} Finally, the assumption $(e_0,e_1)_\#\Pi=\pi$ says that the distribution of the pair of endpoints $(\gamma(0),\gamma(1))$ is exactly $\pi$. Hence \begin{align*} \int_{\operatorname{Geo}(X)} d(\gamma(0),\gamma(1))^p\,d\Pi(\gamma)=\int_{X\times X} d(x,y)^p\,d\pi(x,y). \end{align*} The coupling $\pi$ is optimal for $W_p(\mu_0,\mu_1)$, so \begin{align*} \int_{X\times X} d(x,y)^p\,d\pi(x,y)=W_p(\mu_0,\mu_1)^p. \end{align*} Combining these identities yields \begin{align*} W_p(\mu_s,\mu_t)^p\le (t-s)^pW_p(\mu_0,\mu_1)^p. \end{align*} Taking $p$th roots is valid because both sides are non-negative, and we obtain \begin{align*} W_p(\mu_s,\mu_t)\le (t-s)W_p(\mu_0,\mu_1). \end{align*} [/guided] [/step] [step:Force equality by applying the triangle inequality along the time chain] Fix $0\le s\le t\le1$, and set \begin{align*} L:=W_p(\mu_0,\mu_1). \end{align*} Applying the previous step to the pairs $(0,s)$, $(s,t)$, and $(t,1)$ gives \begin{align*} W_p(\mu_0,\mu_s)\le sL. \end{align*} \begin{align*} W_p(\mu_s,\mu_t)\le (t-s)L. \end{align*} \begin{align*} W_p(\mu_t,\mu_1)\le (1-t)L. \end{align*} By the triangle inequality for the metric $W_p$ on $\mathcal P_p(X)$, \begin{align*} L=W_p(\mu_0,\mu_1)\le W_p(\mu_0,\mu_s)+W_p(\mu_s,\mu_t)+W_p(\mu_t,\mu_1). \end{align*} Combining this with the three upper bounds gives \begin{align*} L\le W_p(\mu_0,\mu_s)+W_p(\mu_s,\mu_t)+W_p(\mu_t,\mu_1)\le sL+(t-s)L+(1-t)L=L. \end{align*} Therefore both inequalities are equalities. Since each summand is bounded above by the corresponding summand in $sL+(t-s)L+(1-t)L$, equality of the total sum forces \begin{align*} W_p(\mu_s,\mu_t)=(t-s)L. \end{align*} Thus, for all $0\le s\le t\le1$, \begin{align*} W_p(\mu_s,\mu_t)=(t-s)W_p(\mu_0,\mu_1). \end{align*} If $s>t$, the same identity follows by symmetry of $W_p$. Hence \begin{align*} W_p(\mu_s,\mu_t)=|t-s|W_p(\mu_0,\mu_1) \end{align*} for all $s,t\in[0,1]$. [/step] [step:Conclude that the curve is a constant-speed Wasserstein geodesic] Taking $s=0$ and $t=1$ in the preceding identity gives the prescribed endpoints: \begin{align*} \mu_0=(e_0)_\#\Pi \end{align*} and \begin{align*} \mu_1=(e_1)_\#\Pi. \end{align*} For arbitrary $s,t\in[0,1]$, the identity \begin{align*} W_p(\mu_s,\mu_t)=|t-s|W_p(\mu_0,\mu_1) \end{align*} says exactly that $[0,1]\ni t\mapsto\mu_t\in\mathcal P_p(X)$ has constant speed $W_p(\mu_0,\mu_1)$ with respect to the metric $W_p$. Therefore $(\mu_t)_{t\in[0,1]}$ is a constant-speed geodesic in $(\mathcal P_p(X),W_p)$ from $\mu_0$ to $\mu_1$. [/step]

Prerequisites (0/1 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Definitions & Concepts

Separable

Explore Further

Separable Definition Algebra of Holomorphic Functions Complex Analysis Continuous Embedding of Inhomogeneous Into Homogeneous Sobolev Spaces Sobolev Spaces Duhamel's Principle For The Wave Equation Partial Differential Equations Littlewood-Paley Square Function Inequality Analysis Closed Map Lemma Topology Brouwer's Characterisation of the Cantor Set Real Analysis Courant-Fischer Min-Max Principle Functional Analysis Multivariable Chain Rule Multivariable Calculus Analysis Area

What brings you to Androma?

Start with a route through the knowledge graph.