Jordan-Kinderlehrer-Otto Convergence Theorem for the Fokker-Planck Equation (Theorem # 9575)
Theorem
Let $n\in\mathbb N$. Let $V\in C^2(\mathbb R^n;\mathbb R)$, let $\lambda\in\mathbb R$, and assume that, for every $x\in\mathbb R^n$ and every $\xi\in\mathbb R^n$,
\begin{align*}
\xi^\top J(\nabla V)_x\xi\ge \lambda |\xi|^2.
\end{align*}
Assume also that there are constants $a>0$ and $C>0$ such that, for every $x\in\mathbb R^n$,
\begin{align*}
a|x|^2-C\le V(x)\le C(1+|x|^2)
\end{align*}
and
\begin{align*}
|\nabla V(x)|\le C(1+|x|).
\end{align*}
Define the free energy functional $\mathcal F:\mathcal P_2(\mathbb R^n)\to(-\infty,+\infty]$ as follows. If $\rho=u\mathcal L^n$ for a Borel density $u:\mathbb R^n\to[0,\infty)$ satisfying $u\log u\in L^1(\mathbb R^n)$, then
\begin{align*}
\mathcal F[\rho]=\int_{\mathbb R^n} u(x)\log u(x)\,d\mathcal L^n(x)+\int_{\mathbb R^n} V(x)\,d\rho(x).
\end{align*}
For all other $\rho\in\mathcal P_2(\mathbb R^n)$, set $\mathcal F[\rho]=+\infty$.
Let $\rho_0\in\mathcal P_2(\mathbb R^n)$ satisfy $\mathcal F[\rho_0]<\infty$. For each $\tau>0$, define a sequence $(\rho_k^\tau)_{k\ge0}$ in $\mathcal P_2(\mathbb R^n)$ by $\rho_0^\tau=\rho_0$ and, for each $k\ge0$, by choosing a minimizer
\begin{align*}
\rho_{k+1}^\tau\in\operatorname*{argmin}_{\rho\in\mathcal P_2(\mathbb R^n)}\left\{\mathcal F[\rho]+\frac{1}{2\tau}W_2^2(\rho,\rho_k^\tau)\right\}.
\end{align*}
Define the piecewise-constant interpolation $\bar\rho_\tau:[0,\infty)\to\mathcal P_2(\mathbb R^n)$ by $\bar\rho_\tau(0)=\rho_0$ and
\begin{align*}
\bar\rho_\tau(t)=\rho_k^\tau
\end{align*}
for $k\in\mathbb N$ and $t\in((k-1)\tau,k\tau]$.
Assume the standard JKO compactness theorem applies to $\mathcal F$ under the preceding growth hypotheses: bounded energy sublevels have uniformly integrable second moments, and the moment-plus-increment bounds for the JKO interpolations imply subsequential pointwise $W_2$ convergence to a locally absolutely continuous curve. Then for every sequence $(\tau_j)_{j\ge1}$ in $(0,\infty)$ with $\tau_j\downarrow0$, there are a subsequence, still denoted $(\tau_j)_{j\ge1}$, and a locally absolutely continuous curve
\begin{align*}
\rho:[0,\infty)\to(\mathcal P_2(\mathbb R^n),W_2)
\end{align*}
such that
\begin{align*}
W_2(\bar\rho_{\tau_j}(t),\rho(t))\to0
\end{align*}
for every $t\ge0$.
For $\mathcal L^1$-a.e. $t\ge0$ with $\mathcal F[\rho(t)]<\infty$, write $\rho(t)=u(t,\cdot)\mathcal L^n$ for its Borel density. The curve $\rho$ is a distributional solution of the Fokker-Planck equation
\begin{align*}
\partial_t\rho=\Delta\rho+\nabla\cdot(\rho\nabla V)
\end{align*}
with initial datum $\rho_0$, in the sense that for every $\varphi\in C_c^\infty([0,\infty)\times\mathbb R^n)$,
\begin{align*}
\int_0^\infty\int_{\mathbb R^n}\left(\partial_t\varphi(t,x)+\Delta\varphi(t,x)-\nabla V(x)\cdot\nabla\varphi(t,x)\right)\,d\rho(t)(x)\,d\mathcal L^1(t)+\int_{\mathbb R^n}\varphi(0,x)\,d\rho_0(x)=0.
\end{align*}
Define the relative Fisher information $\mathcal I_V:\mathcal P_2(\mathbb R^n)\to[0,+\infty]$ as follows. If $\rho=u\mathcal L^n$, if the distributional vector field $\nabla u+u\nabla V$ is represented by a locally integrable Borel vector field on $\mathbb R^n$, and if
\begin{align*}
\int_{\mathbb R^n}\frac{|\nabla u(x)+u(x)\nabla V(x)|^2}{u(x)}\,d\mathcal L^n(x)<\infty,
\end{align*}
with the integrand interpreted as $0$ on the set where $u=0$ and $\nabla u+u\nabla V=0$, then
\begin{align*}
\mathcal I_V[\rho]=\int_{\mathbb R^n}\frac{|\nabla u(x)+u(x)\nabla V(x)|^2}{u(x)}\,d\mathcal L^n(x).
\end{align*}
For all other $\rho\in\mathcal P_2(\mathbb R^n)$, set $\mathcal I_V[\rho]=+\infty$.
Assume, as supplied by the JKO slope-identification input, that $\mathcal F$ is proper, lower semicontinuous, and $\lambda$-displacement convex, and that its metric slope $|\partial\mathcal F|$ is a strong upper gradient satisfying $|\partial\mathcal F|^2=\mathcal I_V$ along every limiting curve obtained above. Then for all $0\le s\le t<\infty$,
\begin{align*}
\mathcal F[\rho(t)]+\frac{1}{2}\int_s^t |\rho'|_{W_2}^2(r)\,d\mathcal L^1(r)+\frac{1}{2}\int_s^t\mathcal I_V[\rho(r)]\,d\mathcal L^1(r)\le \mathcal F[\rho(s)],
\end{align*}
where $|\rho'|_{W_2}$ denotes the metric derivative of $\rho$ as a curve in $(\mathcal P_2(\mathbb R^n),W_2)$.
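As an illustration of the convergence statement, the following is a minimal numerical sketch, not part of the theorem. It assumes $n=1$ and $V(x)=x^2/2$ (so $\lambda=1$), and it restricts the JKO minimization to Gaussian measures $N(m,s^2)$, for which $W_2^2$ is the squared Euclidean distance between the pairs $(m,s)$. The helper names `jko_gaussian_step`, `exact_solution`, and `jko_error`, and the initial values $m_0=2$, $s_0=0.5$, are hypothetical choices made for this sketch.

```python
import math

def jko_gaussian_step(m, s, tau):
    """One JKO step for V(x) = x^2/2 in 1D, minimizing over Gaussians N(m, s^2).

    Assumption of this sketch: on Gaussians, W_2^2 = (dm)^2 + (ds)^2 and
    F = -log(s) - 0.5*log(2*pi*e) + 0.5*(s^2 + m^2), so the first-order
    conditions are m + (m - m_k)/tau = 0 and -1/s + s + (s - s_k)/tau = 0.
    """
    m_new = m / (1.0 + tau)
    # Positive root of (1 + tau) s^2 - s_k s - tau = 0.
    s_new = (s + math.sqrt(s * s + 4.0 * tau * (1.0 + tau))) / (2.0 * (1.0 + tau))
    return m_new, s_new

def exact_solution(m0, s0, t):
    """Mean and standard deviation of the Fokker-Planck solution for V = x^2/2."""
    m = m0 * math.exp(-t)
    s = math.sqrt(1.0 + (s0 * s0 - 1.0) * math.exp(-2.0 * t))
    return m, s

def jko_error(tau, t_final=1.0, m0=2.0, s0=0.5):
    """W_2 distance at t_final between the JKO iterate and the exact solution."""
    m, s = m0, s0
    for _ in range(round(t_final / tau)):
        m, s = jko_gaussian_step(m, s, tau)
    m_ex, s_ex = exact_solution(m0, s0, t_final)
    return math.hypot(m - m_ex, s - s_ex)

errors = {tau: jko_error(tau) for tau in (0.1, 0.01, 0.001)}
for tau, err in errors.items():
    print(f"tau = {tau:<6} W_2 error at t = 1: {err:.2e}")
```

In this special case each step reduces to an implicit Euler step for the pair $(m,s)$, so the error at a fixed time should shrink roughly in proportion to $\tau$.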
Knowledge Status
Analysis
Proof
[proofplan]
The proof follows the Jordan-Kinderlehrer-Otto scheme. First the direct method gives a minimizer at every time step, and minimality gives a discrete energy inequality controlling both the free energy and the squared Wasserstein increments. These estimates give compactness of the piecewise-constant interpolations and produce a locally absolutely continuous limit curve. The Euler-Lagrange equation for each JKO minimizer gives a discrete weak formulation, which converges to the Fokker-Planck weak formulation. Finally, lower semicontinuity of the metric action and the relative Fisher information gives the energy dissipation inequality.
[/proofplan]
[step:Construct each JKO minimizer by the direct method]
Fix $\tau>0$ and $k\ge0$, and assume $\rho_k^\tau\in\mathcal P_2(\mathbb R^n)$ has already been constructed with $\mathcal F[\rho_k^\tau]<\infty$. Define the functional $\mathcal J_{\tau,k}:\mathcal P_2(\mathbb R^n)\to(-\infty,+\infty]$ by
\begin{align*}
\mathcal J_{\tau,k}[\rho]=\mathcal F[\rho]+\frac{1}{2\tau}W_2^2(\rho,\rho_k^\tau).
\end{align*}
Since $\mathcal J_{\tau,k}[\rho_k^\tau]=\mathcal F[\rho_k^\tau]<\infty$, the infimum of $\mathcal J_{\tau,k}$ is not $+\infty$.
Let $(\eta_m)_{m\ge1}$ be a minimizing sequence for $\mathcal J_{\tau,k}$. The lower growth bound on $V$ gives
\begin{align*}
\int_{\mathbb R^n}|x|^2\,d\eta_m(x)\le \frac{1}{a}\int_{\mathbb R^n}V(x)\,d\eta_m(x)+\frac{C}{a}.
\end{align*}
The entropy is controlled from below by the second moment with an arbitrarily small coefficient: for every $\varepsilon>0$ there is a constant $C_{n,\varepsilon}>0$, depending only on $n$ and $\varepsilon$, such that every probability density $u:\mathbb R^n\to[0,\infty)$ with finite second moment satisfies
\begin{align*}
\int_{\mathbb R^n} u(x)\log u(x)\,d\mathcal L^n(x)\ge -\varepsilon\int_{\mathbb R^n}|x|^2u(x)\,d\mathcal L^n(x)-C_{n,\varepsilon}.
\end{align*}
Choose $\varepsilon=a/2$. Combining this estimate with the coercive lower bound on $V$ and the boundedness of $\mathcal J_{\tau,k}[\eta_m]$ gives a uniform second-moment bound for $(\eta_m)_{m\ge1}$.
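One way to see the entropy-moment bound is sketched here, assuming that $u$ is a probability density with finite second moment. Compare $u$ with the Gaussian probability density $g_\varepsilon(x)=(\varepsilon/\pi)^{n/2}e^{-\varepsilon|x|^2}$, an auxiliary symbol introduced only for this sketch. Non-negativity of the relative entropy of $u$ with respect to $g_\varepsilon$ gives
\begin{align*}
\int_{\mathbb R^n}u(x)\log u(x)\,d\mathcal L^n(x)\ge\int_{\mathbb R^n}u(x)\log g_\varepsilon(x)\,d\mathcal L^n(x)=-\varepsilon\int_{\mathbb R^n}|x|^2u(x)\,d\mathcal L^n(x)-\frac{n}{2}\log\frac{\pi}{\varepsilon}.
\end{align*}
Under this sketch, any positive constant $C_{n,\varepsilon}\ge\frac{n}{2}\log(\pi/\varepsilon)$ is admissible.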
The uniform second-moment bound implies tightness of $(\eta_m)_{m\ge1}$. By Prokhorov compactness, after passing to a subsequence there is a probability measure $\eta\in\mathcal P(\mathbb R^n)$ such that $\eta_m$ converges narrowly to $\eta$. Since $x\mapsto|x|^2$ is continuous and non-negative, its integral is lower semicontinuous under narrow convergence, so the same uniform bound gives $\eta\in\mathcal P_2(\mathbb R^n)$. The entropy is lower semicontinuous under narrow convergence with moment control, the potential term is lower semicontinuous because $V$ is continuous and bounded from below by a quadratic function, and $\rho\mapsto W_2^2(\rho,\rho_k^\tau)$ is lower semicontinuous under narrow convergence with second-moment control. Hence
\begin{align*}
\mathcal J_{\tau,k}[\eta]\le \liminf_{m\to\infty}\mathcal J_{\tau,k}[\eta_m].
\end{align*}
Thus $\eta$ is a minimizer. We set $\rho_{k+1}^\tau=\eta$.
[guided]
We fix $\tau>0$ and $k\ge0$, and we suppose that the previous state $\rho_k^\tau$ has already been constructed with finite energy. The object to minimize is the map
\begin{align*}
\mathcal J_{\tau,k}:\mathcal P_2(\mathbb R^n)\to(-\infty,+\infty]
\end{align*}
defined by
\begin{align*}
\mathcal J_{\tau,k}[\rho]=\mathcal F[\rho]+\frac{1}{2\tau}W_2^2(\rho,\rho_k^\tau).
\end{align*}
The direct method requires three ingredients: a minimizing sequence, compactness of that sequence, and lower semicontinuity of the functional.
Because $\mathcal F[\rho_k^\tau]<\infty$, evaluating at $\rho=\rho_k^\tau$ gives
\begin{align*}
\mathcal J_{\tau,k}[\rho_k^\tau]=\mathcal F[\rho_k^\tau].
\end{align*}
Hence the infimum is not $+\infty$. Choose a minimizing sequence $(\eta_m)_{m\ge1}$ in $\mathcal P_2(\mathbb R^n)$, meaning
\begin{align*}
\mathcal J_{\tau,k}[\eta_m]\to\inf_{\rho\in\mathcal P_2(\mathbb R^n)}\mathcal J_{\tau,k}[\rho].
\end{align*}
The main compactness issue is to prevent mass from escaping to infinity. The coercive lower bound on $V$ is exactly what supplies this control. Since
\begin{align*}
a|x|^2-C\le V(x)
\end{align*}
for every $x\in\mathbb R^n$, integration with respect to $\eta_m$ gives
\begin{align*}
a\int_{\mathbb R^n}|x|^2\,d\eta_m(x)-C\le \int_{\mathbb R^n}V(x)\,d\eta_m(x).
\end{align*}
Equivalently,
\begin{align*}
\int_{\mathbb R^n}|x|^2\,d\eta_m(x)\le \frac{1}{a}\int_{\mathbb R^n}V(x)\,d\eta_m(x)+\frac{C}{a}.
\end{align*}
The entropy term may be negative, so one must use the entropy-moment lower bound with an arbitrarily small coefficient: for every $\varepsilon>0$ there is a constant $C_{n,\varepsilon}>0$, depending only on $n$ and $\varepsilon$, such that every probability density $u:\mathbb R^n\to[0,\infty)$ with finite second moment satisfies
\begin{align*}
\int_{\mathbb R^n}u(x)\log u(x)\,d\mathcal L^n(x)\ge -\varepsilon\int_{\mathbb R^n}|x|^2u(x)\,d\mathcal L^n(x)-C_{n,\varepsilon}.
\end{align*}
Choose $\varepsilon=a/2$. Combining this bound with $V(x)\ge a|x|^2-C$ gives
\begin{align*}
\mathcal F[\eta_m]\ge \frac{a}{2}\int_{\mathbb R^n}|x|^2\,d\eta_m(x)-C-C_{n,a/2}.
\end{align*}
Since $\mathcal J_{\tau,k}[\eta_m]$ is bounded from above along the minimizing sequence and the Wasserstein penalty is non-negative, we obtain a uniform bound
\begin{align*}
\sup_{m\ge1}\int_{\mathbb R^n}|x|^2\,d\eta_m(x)<\infty.
\end{align*}
This moment bound implies tightness. Therefore Prokhorov compactness gives a subsequence, still denoted $(\eta_m)_{m\ge1}$, and a Borel probability measure $\eta$ on $\mathbb R^n$ such that $\eta_m$ converges narrowly to $\eta$. The same second-moment bound implies that $\eta$ has finite second moment, so $\eta\in\mathcal P_2(\mathbb R^n)$.
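The tightness claim can be made quantitative by Chebyshev's inequality. Write $M=\sup_{m\ge1}\int_{\mathbb R^n}|x|^2\,d\eta_m(x)$, a symbol introduced only for this remark. For every $R>0$ and every $m\ge1$,
\begin{align*}
\eta_m\left(\{x\in\mathbb R^n:|x|>R\}\right)\le\frac{1}{R^2}\int_{\mathbb R^n}|x|^2\,d\eta_m(x)\le\frac{M}{R^2}.
\end{align*}
Given $\delta>0$, the closed ball of radius $R=(M/\delta)^{1/2}$ is therefore a compact set carrying mass at least $1-\delta$ under every $\eta_m$.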
We now pass to the limit in the functional. The entropy is lower semicontinuous under narrow convergence with the preceding moment control. The potential energy is lower semicontinuous because $V$ is continuous and bounded below by a quadratic function. The Wasserstein term is lower semicontinuous in its first variable under narrow convergence with moment control. Therefore
\begin{align*}
\mathcal F[\eta]+\frac{1}{2\tau}W_2^2(\eta,\rho_k^\tau)\le \liminf_{m\to\infty}\left(\mathcal F[\eta_m]+\frac{1}{2\tau}W_2^2(\eta_m,\rho_k^\tau)\right).
\end{align*}
In the notation of $\mathcal J_{\tau,k}$, this is
\begin{align*}
\mathcal J_{\tau,k}[\eta]\le \liminf_{m\to\infty}\mathcal J_{\tau,k}[\eta_m].
\end{align*}
Since $(\eta_m)_{m\ge1}$ was minimizing, $\eta$ attains the infimum. We define $\rho_{k+1}^\tau=\eta$, completing the construction of the next JKO step.
[/guided]
[/step]
[step:Derive the discrete energy and moment estimates]
By the minimality of $\rho_{k+1}^\tau$ and the admissibility of $\rho_k^\tau$,
\begin{align*}
\mathcal F[\rho_{k+1}^\tau]+\frac{1}{2\tau}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau)\le \mathcal F[\rho_k^\tau].
\end{align*}
Iterating this inequality gives, for every $N\ge1$,
\begin{align*}
\mathcal F[\rho_N^\tau]+\frac{1}{2\tau}\sum_{k=0}^{N-1}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau)\le \mathcal F[\rho_0].
\end{align*}
This is precisely the discrete energy dissipation estimate of [citetheorem:9576] applied to the JKO sequence for $\mathcal F$.
The coercive lower bound on $V$, together with the entropy-moment lower bound with $\varepsilon=a/2$, gives explicit constants $A=a/2$ and $B=C+C_{n,a/2}$ such that every $\rho\in\mathcal P_2(\mathbb R^n)$ with $\mathcal F[\rho]<\infty$ satisfies
\begin{align*}
\mathcal F[\rho]\ge A\int_{\mathbb R^n}|x|^2\,d\rho(x)-B.
\end{align*}
For $\mathcal F[\rho]=+\infty$ the same inequality is immediate. Since the iterated energy inequality gives $\mathcal F[\rho_k^\tau]\le \mathcal F[\rho_0]$ for every $k$, the preceding coercive estimate yields
\begin{align*}
\int_{\mathbb R^n}|x|^2\,d\rho_k^\tau(x)\le A^{-1}(\mathcal F[\rho_0]+B).
\end{align*}
Thus, for every $T>0$,
\begin{align*}
\sup_{\tau\in(0,1]}\sup_{0\le k\tau\le T}\int_{\mathbb R^n}|x|^2\,d\rho_k^\tau(x)<\infty.
\end{align*}
The same lower bound gives $\mathcal F[\rho_{k+1}^\tau]\ge -B$ for every $k$. Therefore the iterated energy inequality implies
\begin{align*}
\frac{1}{2\tau}\sum_{0\le k\tau\le T}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau)\le \mathcal F[\rho_0]-\inf_{0\le k\tau\le T}\mathcal F[\rho_{k+1}^\tau]\le \mathcal F[\rho_0]+B.
\end{align*}
Consequently
\begin{align*}
\sum_{0\le k\tau\le T}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau)\le C_T\tau,
\end{align*}
where one may take $C_T=2(\mathcal F[\rho_0]+B)$; this constant depends only on $\rho_0$, $n$, $a$, and $C$.
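The one-step inequality and the increment bound can be checked numerically in a simple special case. The sketch below assumes $n=1$, $V(x)=x^2/2$, and JKO steps restricted to Gaussian measures $N(m,s^2)$; the helper names and the initial values $m_0=2$, $s_0=0.5$ are hypothetical. In place of the constant $B$ it uses the exact infimum $\inf\mathcal F=-\frac12\log(2\pi)$, attained at $N(0,1)$ for this particular potential.

```python
import math

def free_energy(m, s):
    """F[N(m, s^2)] for V(x) = x^2/2: entropy term plus potential term."""
    return -0.5 * math.log(2.0 * math.pi * math.e * s * s) + 0.5 * (s * s + m * m)

def jko_gaussian_step(m, s, tau):
    """Minimizer of F + W_2^2/(2 tau) over Gaussians (assumption of this sketch)."""
    m_new = m / (1.0 + tau)
    s_new = (s + math.sqrt(s * s + 4.0 * tau * (1.0 + tau))) / (2.0 * (1.0 + tau))
    return m_new, s_new

def run(tau, steps, m0=2.0, s0=0.5):
    """Return per-step slacks F[rho_k] - F[rho_{k+1}] - W_2^2/(2 tau) and the sum of W_2^2."""
    m, s = m0, s0
    slacks, w2_sum = [], 0.0
    for _ in range(steps):
        m_new, s_new = jko_gaussian_step(m, s, tau)
        # On 1D Gaussians, W_2^2 = (m' - m)^2 + (s' - s)^2.
        w2_sq = (m_new - m) ** 2 + (s_new - s) ** 2
        slacks.append(free_energy(m, s) - free_energy(m_new, s_new) - w2_sq / (2.0 * tau))
        w2_sum += w2_sq
        m, s = m_new, s_new
    return slacks, w2_sum

tau = 0.05
slacks, w2_sum = run(tau, steps=40)
# The total energy drop is at most F[rho_0] - inf F, with inf F = -0.5*log(2*pi).
bound = 2.0 * tau * (free_energy(2.0, 0.5) + 0.5 * math.log(2.0 * math.pi))
print("one-step inequality holds:", min(slacks) >= -1e-12)
print("sum of squared increments within bound:", w2_sum <= bound)
```

The slack in each step is non-negative up to rounding because the Gaussian update is the exact minimizer of the penalized functional within the Gaussian family.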
[/step]
[step:Extract a Wasserstein-convergent locally absolutely continuous limit curve]
Fix $T>0$. The moment estimate and the discrete increment estimate verify the hypotheses of [citetheorem:9577] for the family of piecewise-constant interpolations $(\bar\rho_\tau)_\tau$ on $[0,T]$. Therefore, along a subsequence, $\bar\rho_\tau(t)$ converges narrowly for every $t$ outside a countable set, and the limiting curve has a representative in $AC^2([0,T];\mathcal P_2(\mathbb R^n))$.
The compactness input used here is stronger than bare boundedness of second moments: the energy sublevel control and the discrete increment estimate give uniform integrability of $|x|^2$ along the convergent subsequence. Narrow convergence together with this uniform integrability is equivalent to $W_2$ convergence at every continuity time of the representative. Applying the compactness argument on $T=1,2,3,\dots$ and using a diagonal extraction gives a sequence $(\tau_j)_{j\ge1}$ with $\tau_j\downarrow0$ and a curve
\begin{align*}
\rho:[0,\infty)\to\mathcal P_2(\mathbb R^n)
\end{align*}
such that $\rho\in AC^2([0,T];\mathcal P_2(\mathbb R^n))$ for every $T>0$ and
\begin{align*}
W_2(\bar\rho_{\tau_j}(t),\rho(t))\to0
\end{align*}
for every $t\ge0$. In particular, $\rho$ is locally absolutely continuous as a curve in $(\mathcal P_2(\mathbb R^n),W_2)$.
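The discrete energy inequality gives $\mathcal F[\bar\rho_{\tau_j}(t)]\le\mathcal F[\rho_0]$ for every $t\ge0$ and every $j$, so the lower semicontinuity of $\mathcal F$ along the $W_2$-convergent sequence gives $\mathcal F[\rho(t)]\le\mathcal F[\rho_0]<\infty$ for every $t\ge0$, and in particular for almost every $t\ge0$. Hence for almost every $t\ge0$ there is a Borel density $u(t,\cdot):\mathbb R^n\to[0,\infty)$ such that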
\begin{align*}
\rho(t)=u(t,\cdot)\mathcal L^n.
\end{align*}
[guided]
We now explain why the discrete estimates produce a genuine curve, not merely a collection of subsequential limits. Fix a finite time horizon $T>0$. From the previous step we know two facts. First, the measures $\rho_k^\tau$ have uniformly bounded second moments whenever $0\le k\tau\le T$. Second, their squared Wasserstein increments satisfy
\begin{align*}
\sum_{0\le k\tau\le T}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau)\le C_T\tau.
\end{align*}
These are exactly the hypotheses needed in the compactness theorem for JKO interpolations, [citetheorem:9577]. Applying that result to $(\bar\rho_\tau)_\tau$ on $[0,T]$ gives a subsequence and a limiting curve on $[0,T]$.
The theorem first gives narrow convergence at all times except possibly a countable set and an $AC^2$ representative of the limit curve. The discrete increment estimate gives the uniform modulus estimate
\begin{align*}
W_2(\bar\rho_\tau(s),\bar\rho_\tau(t))\le C_T^{1/2}(|t-s|+\tau)^{1/2}
\end{align*}
for $s,t\in[0,T]$, obtained by summing the increments between the two grid intervals and applying the Cauchy-Schwarz inequality. This equicontinuity transfers convergence from the dense set of compactness times to every $t\in[0,T]$. The moment information needed at this point is uniform integrability of $|x|^2$, not merely boundedness of the second moments. The compactness theorem for these JKO interpolations supplies that uniform integrability from the coercive energy sublevels and the discrete increment estimate. Hence the second moments converge along the selected subsequence, and in $\mathcal P_2(\mathbb R^n)$ narrow convergence plus convergence of second moments is equivalent to $W_2$ convergence.
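The modulus estimate can be spelled out as follows; the indices $i$ and $j$ are notation introduced for this sketch. Let $0\le s<t\le T$ with $\bar\rho_\tau(s)=\rho_i^\tau$ and $\bar\rho_\tau(t)=\rho_j^\tau$, so that $i\tau\ge s$ and $j\tau<t+\tau$, hence $(j-i)\tau<t-s+\tau$. The triangle inequality, the Cauchy-Schwarz inequality, and the increment bound give
\begin{align*}
W_2(\rho_i^\tau,\rho_j^\tau)\le\sum_{k=i}^{j-1}W_2(\rho_{k+1}^\tau,\rho_k^\tau)\le(j-i)^{1/2}\left(\sum_{k=i}^{j-1}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau)\right)^{1/2}\le\left(\frac{t-s+\tau}{\tau}\right)^{1/2}(C_T\tau)^{1/2}.
\end{align*}
The right-hand side equals $C_T^{1/2}(t-s+\tau)^{1/2}$.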
We need a single subsequence that works for all finite time intervals. To obtain it, apply the preceding argument on $[0,1]$, then on $[0,2]$ to the subsequence already chosen, then on $[0,3]$, and so on. The diagonal subsequence works on every interval $[0,T]$ with $T<\infty$. Denote this subsequence by $(\tau_j)_{j\ge1}$ and denote the resulting curve by
\begin{align*}
\rho:[0,\infty)\to\mathcal P_2(\mathbb R^n).
\end{align*}
For every $T>0$, the compactness theorem gives
\begin{align*}
\rho\in AC^2([0,T];\mathcal P_2(\mathbb R^n)).
\end{align*}
This is precisely local absolute continuity on $[0,\infty)$.
Finally, the finite-energy property passes to the limit at almost every time. Since each discrete state with finite free energy is absolutely continuous with respect to $\mathcal L^n$, and since entropy is lower semicontinuous under the convergence obtained above, the limit satisfies $\mathcal F[\rho(t)]<\infty$ for almost every $t$. Therefore, for almost every $t$, there is a Borel density
\begin{align*}
u(t,\cdot):\mathbb R^n\to[0,\infty)
\end{align*}
such that
\begin{align*}
\rho(t)=u(t,\cdot)\mathcal L^n.
\end{align*}
[/guided]
[/step]
[step:Write the Euler-Lagrange identity for one JKO step]
Fix $\tau>0$ and $k\ge0$. Let $\psi\in C_c^\infty(\mathbb R^n;\mathbb R)$, and define the smooth perturbation maps $T_\varepsilon:\mathbb R^n\to\mathbb R^n$ by
\begin{align*}
T_\varepsilon(x)=x+\varepsilon\nabla\psi(x)
\end{align*}
for $\varepsilon$ in a neighbourhood of $0$. Set $\rho_\varepsilon=(T_\varepsilon)_\#\rho_{k+1}^\tau$.
By the first variation of the entropy, the potential energy, and the Wasserstein penalty along smooth push-forward perturbations, equivalently by the Euler-Lagrange condition for a JKO step [citetheorem:9574], we have
\begin{align*}
\int_{\mathbb R^n}\left(\Delta\psi(x)-\nabla V(x)\cdot\nabla\psi(x)\right)\,d\rho_{k+1}^\tau(x)=\frac{1}{\tau}\int_{\mathbb R^n}\left(\psi(x)-\psi(T_k^\tau(x))\right)\,d\rho_{k+1}^\tau(x)+R_{k,\tau}[\psi].
\end{align*}
Here $T_k^\tau:\mathbb R^n\to\mathbb R^n$ is an optimal transport map from $\rho_{k+1}^\tau$ to $\rho_k^\tau$ when such a map exists, and otherwise the right-hand side is interpreted using an optimal plan $\pi_k^\tau\in\Pi(\rho_{k+1}^\tau,\rho_k^\tau)$ as
\begin{align*}
\frac{1}{\tau}\int_{\mathbb R^n\times\mathbb R^n}\left(\psi(x)-\psi(y)\right)\,d\pi_k^\tau(x,y)+R_{k,\tau}[\psi].
\end{align*}
The remainder $R_{k,\tau}[\psi]$ satisfies
\begin{align*}
|R_{k,\tau}[\psi]|\le \frac{1}{2\tau}\|D^2\psi\|_\infty W_2^2(\rho_{k+1}^\tau,\rho_k^\tau).
\end{align*}
The hypotheses of [citetheorem:9574] are satisfied here: the new minimizer has finite free energy and is therefore absolutely continuous with respect to $\mathcal L^n$; $V$ is $C^1$, bounded below, and has at most linear gradient growth; an optimal plan exists because both measures belong to $\mathcal P_2(\mathbb R^n)$; and the drift term is integrable because $\nabla\psi$ is bounded with compact support while $\nabla V$ is continuous. This is the discrete Euler-Lagrange identity used below.
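A formal sketch of the first-variation computation behind this identity follows; the rigorous justification is the content of [citetheorem:9574]. With $\rho_\varepsilon=(T_\varepsilon)_\#\rho_{k+1}^\tau$ and an optimal plan $\pi_k^\tau\in\Pi(\rho_{k+1}^\tau,\rho_k^\tau)$, the derivative of the free energy is
\begin{align*}
\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\mathcal F[\rho_\varepsilon]=\int_{\mathbb R^n}\left(\nabla V(x)\cdot\nabla\psi(x)-\Delta\psi(x)\right)\,d\rho_{k+1}^\tau(x).
\end{align*}
Pushing the first marginal of $\pi_k^\tau$ forward by $T_\varepsilon$ yields an admissible plan between $\rho_\varepsilon$ and $\rho_k^\tau$, so
\begin{align*}
W_2^2(\rho_\varepsilon,\rho_k^\tau)-W_2^2(\rho_{k+1}^\tau,\rho_k^\tau)\le2\varepsilon\int_{\mathbb R^n\times\mathbb R^n}\nabla\psi(x)\cdot(x-y)\,d\pi_k^\tau(x,y)+\varepsilon^2\int_{\mathbb R^n}|\nabla\psi(x)|^2\,d\rho_{k+1}^\tau(x).
\end{align*}
Minimality of $\rho_{k+1}^\tau$, applied with both signs of $\varepsilon$, then gives
\begin{align*}
\int_{\mathbb R^n}\left(\Delta\psi(x)-\nabla V(x)\cdot\nabla\psi(x)\right)\,d\rho_{k+1}^\tau(x)=\frac{1}{\tau}\int_{\mathbb R^n\times\mathbb R^n}\nabla\psi(x)\cdot(x-y)\,d\pi_k^\tau(x,y).
\end{align*}
A second-order Taylor expansion of $\psi$ around $x$ replaces $\nabla\psi(x)\cdot(x-y)$ by $\psi(x)-\psi(y)$ at the cost of a term bounded by $\frac12\|D^2\psi\|_\infty|x-y|^2$, which after integration against $\pi_k^\tau$ and division by $\tau$ is the remainder $R_{k,\tau}[\psi]$.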
[/step]
[step:Sum the Euler-Lagrange identities into a discrete weak formulation]
Let $\varphi\in C_c^\infty([0,\infty)\times\mathbb R^n)$. Choose $T_\varphi>0$ such that
\begin{align*}
\operatorname{supp}\varphi\subset[0,T_\varphi]\times K_\varphi
\end{align*}
for some compact set $K_\varphi\subset\mathbb R^n$. For each $k\ge0$, define the spatial test function $\varphi_k^\tau:\mathbb R^n\to\mathbb R$ by
\begin{align*}
\varphi_k^\tau(x)=\varphi(k\tau,x).
\end{align*}
This map belongs to $C_c^\infty(\mathbb R^n;\mathbb R)$.
Apply the previous step with $\psi=\varphi_{k+1}^\tau$ and choose, for each $k$, an optimal plan $\pi_k^\tau\in\Pi(\rho_{k+1}^\tau,\rho_k^\tau)$. Let $K_\tau$ be the largest integer such that $K_\tau\tau\le T_\varphi+\tau$. Summing the one-step identities for $0\le k\le K_\tau$ gives a transport sum
\begin{align*}
\frac{1}{\tau}\sum_{k=0}^{K_\tau}\int_{\mathbb R^n\times\mathbb R^n}\left(\varphi((k+1)\tau,x)-\varphi((k+1)\tau,y)\right)\,d\pi_k^\tau(x,y).
\end{align*}
By Taylor's theorem in the spatial variable,
\begin{align*}
\varphi((k+1)\tau,x)-\varphi((k+1)\tau,y)=\nabla\varphi((k+1)\tau,x)\cdot(x-y)+r_{k,\tau}(x,y),
\end{align*}
where
\begin{align*}
|r_{k,\tau}(x,y)|\le \frac{1}{2}\|D_x^2\varphi\|_\infty |x-y|^2.
\end{align*}
Since $\pi_k^\tau$ is optimal, the summed spatial remainder satisfies
\begin{align*}
\left|E_\tau^{\mathrm{space}}\right|\le \frac{1}{2\tau}\|D_x^2\varphi\|_\infty\sum_{k=0}^{K_\tau}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau).
\end{align*}
This is combined with the one-step remainder $R_{k,\tau}[\varphi_{k+1}^\tau]$ from the Euler-Lagrange identity; after multiplication by $\tau$ in the summed weak formulation, the total error is bounded by a constant multiple of $\sum_{k=0}^{K_\tau}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau)$ and therefore tends to $0$.
The zeroth-order transport terms telescope after adding and subtracting $\varphi(k\tau,\cdot)$ against $\rho_{k+1}^\tau$:
\begin{align*}
\sum_{k=0}^{K_\tau}\left(\int_{\mathbb R^n}\varphi((k+1)\tau,x)\,d\rho_{k+1}^\tau(x)-\int_{\mathbb R^n}\varphi((k+1)\tau,y)\,d\rho_k^\tau(y)\right)
\end{align*}
\begin{align*}
= -\int_{\mathbb R^n}\varphi(0,x)\,d\rho_0(x)-\sum_{k=0}^{K_\tau}\int_{\mathbb R^n}\left(\varphi((k+1)\tau,x)-\varphi(k\tau,x)\right)\,d\rho_{k+1}^\tau(x)+o(1).
\end{align*}
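The terminal term $\int_{\mathbb R^n}\varphi((K_\tau+1)\tau,x)\,d\rho_{K_\tau+1}^\tau(x)$ vanishes because $\varphi(t,\cdot)=0$ for $t>T_\varphi$ and $(K_\tau+1)\tau>T_\varphi$ by the choice of $K_\tau$; the $o(1)$ accounts for integrating the time increments against $\rho_{k+1}^\tau$ rather than $\rho_k^\tau$. The fundamental theorem of calculus in the time variable gives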
\begin{align*}
\varphi((k+1)\tau,x)-\varphi(k\tau,x)=\int_{k\tau}^{(k+1)\tau}\partial_t\varphi(r,x)\,d\mathcal L^1(r),
\end{align*}
and the uniform continuity of $\partial_t\varphi$ on its compact support converts the sum into
\begin{align*}
\int_0^\infty\int_{\mathbb R^n}\partial_t\varphi(t,x)\,d\bar\rho_\tau(t)(x)\,d\mathcal L^1(t)+o(1).
\end{align*}
Consequently the discrete weak formulation is
\begin{align*}
\int_0^\infty\int_{\mathbb R^n}\left(\partial_t\varphi(t,x)+\Delta\varphi(t,x)-\nabla V(x)\cdot\nabla\varphi(t,x)\right)\,d\bar\rho_\tau(t)(x)\,d\mathcal L^1(t)+\int_{\mathbb R^n}\varphi(0,x)\,d\rho_0(x)=o(1).
\end{align*}
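The telescoping step used in this derivation can be made explicit. Write $A_k=\int_{\mathbb R^n}\varphi(k\tau,x)\,d\rho_k^\tau(x)$, notation introduced only for this sketch. Adding and subtracting $\int_{\mathbb R^n}\varphi(k\tau,y)\,d\rho_k^\tau(y)$ in each summand gives the exact identity
\begin{align*}
\sum_{k=0}^{K_\tau}\left(\int_{\mathbb R^n}\varphi((k+1)\tau,x)\,d\rho_{k+1}^\tau(x)-\int_{\mathbb R^n}\varphi((k+1)\tau,y)\,d\rho_k^\tau(y)\right)=A_{K_\tau+1}-A_0-\sum_{k=0}^{K_\tau}\int_{\mathbb R^n}\left(\varphi((k+1)\tau,y)-\varphi(k\tau,y)\right)\,d\rho_k^\tau(y),
\end{align*}
with $A_{K_\tau+1}=0$ and $A_0=\int_{\mathbb R^n}\varphi(0,x)\,d\rho_0(x)$. Each time increment $\varphi((k+1)\tau,\cdot)-\varphi(k\tau,\cdot)$ is Lipschitz in space with constant at most $\tau\|\nabla_x\partial_t\varphi\|_\infty$, so replacing $\rho_k^\tau$ by $\rho_{k+1}^\tau$ in the last sum costs at most
\begin{align*}
\tau\|\nabla_x\partial_t\varphi\|_\infty\sum_{k=0}^{K_\tau}W_2(\rho_{k+1}^\tau,\rho_k^\tau)\le\tau\|\nabla_x\partial_t\varphi\|_\infty\left((K_\tau+1)\,C\,\tau\right)^{1/2},
\end{align*}
where $C$ is a constant with $\sum_{k=0}^{K_\tau}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau)\le C\tau$, as supplied by the discrete increment estimate. Since $(K_\tau+1)\tau\le T_\varphi+2\tau$, this error is $O(\tau)$.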
[guided]
We now convert the one-step Euler-Lagrange identity into the weak form of the PDE. Let
\begin{align*}
\varphi\in C_c^\infty([0,\infty)\times\mathbb R^n)
\end{align*}
be fixed. Since $\varphi$ has compact support, there are a finite time $T_\varphi>0$ and a compact set $K_\varphi\subset\mathbb R^n$ such that
\begin{align*}
\operatorname{supp}\varphi\subset[0,T_\varphi]\times K_\varphi.
\end{align*}
For each integer $k\ge0$, define
\begin{align*}
\varphi_k^\tau:\mathbb R^n\to\mathbb R
\end{align*}
by
\begin{align*}
\varphi_k^\tau(x)=\varphi(k\tau,x).
\end{align*}
This is an admissible spatial test function in the Euler-Lagrange identity because $\varphi$ is smooth and compactly supported in the spatial variable.
The reason for choosing $\psi=\varphi_{k+1}^\tau$ is that the JKO step connects $\rho_{k+1}^\tau$ backward to $\rho_k^\tau$. Applying the Euler-Lagrange identity at each step gives a relation involving
\begin{align*}
\int_{\mathbb R^n}\left(\Delta\varphi((k+1)\tau,x)-\nabla V(x)\cdot\nabla\varphi((k+1)\tau,x)\right)\,d\rho_{k+1}^\tau(x)
\end{align*}
and a transport difference between $\rho_{k+1}^\tau$ and $\rho_k^\tau$. Summing over all $k$ with $k\tau\le T_\varphi+\tau$ turns these one-step relations into a time-discrete weak formulation.
The transport difference contains the expression
\begin{align*}
\int_{\mathbb R^n\times\mathbb R^n}\left(\varphi((k+1)\tau,x)-\varphi((k+1)\tau,y)\right)\,d\pi_k^\tau(x,y),
\end{align*}
where $\pi_k^\tau$ is an optimal plan from $\rho_{k+1}^\tau$ to $\rho_k^\tau$. Taylor's theorem in the spatial variable controls the second-order error by
\begin{align*}
\frac{1}{2}\|D_x^2\varphi\|_\infty\int_{\mathbb R^n\times\mathbb R^n}|x-y|^2\,d\pi_k^\tau(x,y).
\end{align*}
Since $\pi_k^\tau$ is optimal, the last integral is
\begin{align*}
W_2^2(\rho_{k+1}^\tau,\rho_k^\tau).
\end{align*}
After summing in $k$, the total spatial error is bounded by
\begin{align*}
\left|E_\tau^{\mathrm{space}}\right|\le \frac{1}{2}\|D_x^2\varphi\|_\infty\sum_{k\tau\le T_\varphi+\tau}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau).
\end{align*}
The discrete energy estimate says that this sum is $O(\tau)$ on bounded time intervals, so $E_\tau^{\mathrm{space}}\to0$.
The remaining first-order time difference is
\begin{align*}
\sum_k\int_{\mathbb R^n}\left(\varphi((k+1)\tau,x)-\varphi(k\tau,x)\right)\,d\rho_{k+1}^\tau(x).
\end{align*}
The fundamental theorem of calculus in the time variable gives
\begin{align*}
\varphi((k+1)\tau,x)-\varphi(k\tau,x)=\int_{k\tau}^{(k+1)\tau}\partial_t\varphi(r,x)\,d\mathcal L^1(r).
\end{align*}
Integrating this identity against $\rho_{k+1}^\tau$ and summing over $k$ yields the Riemann-sum representation of
\begin{align*}
\int_0^\infty\int_{\mathbb R^n}\partial_t\varphi(t,x)\,d\bar\rho_\tau(t)(x)\,d\mathcal L^1(t),
\end{align*}
up to an error tending to $0$ because $\partial_t\varphi$ is uniformly continuous on its compact support and the interpolations have uniformly bounded mass.
Putting the time term, the diffusion term, and the drift term together gives
\begin{align*}
\int_0^\infty\int_{\mathbb R^n}\left(\partial_t\varphi(t,x)+\Delta\varphi(t,x)-\nabla V(x)\cdot\nabla\varphi(t,x)\right)\,d\bar\rho_\tau(t)(x)\,d\mathcal L^1(t)+\int_{\mathbb R^n}\varphi(0,x)\,d\rho_0(x)=o(1).
\end{align*}
This is the discrete weak formulation.
[/guided]
[/step]
[step:Pass from the discrete weak formulation to the Fokker-Planck equation]
Let $\varphi\in C_c^\infty([0,\infty)\times\mathbb R^n)$ and define
\begin{align*}
G_\varphi:[0,\infty)\times\mathbb R^n\to\mathbb R
\end{align*}
by
\begin{align*}
G_\varphi(t,x)=\partial_t\varphi(t,x)+\Delta\varphi(t,x)-\nabla V(x)\cdot\nabla\varphi(t,x).
\end{align*}
The function $G_\varphi$ is continuous and compactly supported in the spatial variable, because $\varphi$ is spatially compactly supported and $V\in C^2(\mathbb R^n;\mathbb R)$. The convergence
\begin{align*}
W_2(\bar\rho_{\tau_j}(t),\rho(t))\to0
\end{align*}
for every $t\ge0$ implies narrow convergence at every time. Therefore
\begin{align*}
\int_{\mathbb R^n}G_\varphi(t,x)\,d\bar\rho_{\tau_j}(t)(x)\to\int_{\mathbb R^n}G_\varphi(t,x)\,d\rho(t)(x)
\end{align*}
for every $t\ge0$.
Since $G_\varphi$ is bounded and supported in $[0,T_\varphi]\times K_\varphi$, the dominated convergence theorem with respect to $\mathcal L^1$ on $[0,T_\varphi]$ gives
\begin{align*}
\int_0^\infty\int_{\mathbb R^n}G_\varphi(t,x)\,d\bar\rho_{\tau_j}(t)(x)\,d\mathcal L^1(t)\to\int_0^\infty\int_{\mathbb R^n}G_\varphi(t,x)\,d\rho(t)(x)\,d\mathcal L^1(t).
\end{align*}
Passing to the limit in the discrete weak formulation yields
\begin{align*}
\int_0^\infty\int_{\mathbb R^n}\left(\partial_t\varphi(t,x)+\Delta\varphi(t,x)-\nabla V(x)\cdot\nabla\varphi(t,x)\right)\,d\rho(t)(x)\,d\mathcal L^1(t)+\int_{\mathbb R^n}\varphi(0,x)\,d\rho_0(x)=0.
\end{align*}
Thus $\rho$ is a distributional solution of the Fokker-Planck equation with initial datum $\rho_0$.
[/step]
[step:Pass the discrete dissipation inequality to the limit]
For each $\tau>0$, define the piecewise-constant metric speed $g_\tau:[0,\infty)\to[0,\infty)$ by
\begin{align*}
g_\tau(r)=\frac{1}{\tau}W_2(\rho_{k+1}^\tau,\rho_k^\tau)
\end{align*}
for $r\in(k\tau,(k+1)\tau]$. The discrete energy inequality gives the action bound on grid intervals. For arbitrary endpoints $0\le s\le t<\infty$, replacing $s$ and $t$ by adjacent grid times changes the displayed inequality by an endpoint error that vanishes along the selected sequence at energy-continuity points; the full all-time endpoint statement is supplied by the limiting energy-dissipation theorem invoked below, not by lower semicontinuity alone.
By the lower semicontinuity of metric action under $W_2$ convergence, and equivalently by the minimality property of metric derivatives in [citetheorem:9558],
\begin{align*}
\int_s^t |\rho'|_{W_2}^2(r)\,d\mathcal L^1(r)\le \liminf_{j\to\infty}\int_s^t g_{\tau_j}^2(r)\,d\mathcal L^1(r).
\end{align*}
We now invoke the limiting energy-dissipation input [citetheorem:9578] for the endpoint passage and the slope term. Its hypotheses are exactly the additional assumptions stated in the theorem: $\mathcal F$ is proper, lower semicontinuous, $\lambda$-displacement convex, and has a strong upper gradient $|\partial\mathcal F|$ along the limiting curve. Properness follows from $\mathcal F[\rho_0]<\infty$; lower semicontinuity follows from entropy lower semicontinuity and the continuous potential term with quadratic growth; $\lambda$-displacement convexity follows from [citetheorem:9568], whose hypotheses match the assumed lower Hessian bound on $V$; and the slope identity $|\partial\mathcal F|^2=\mathcal I_V$ is precisely the slope-identification assumption in the statement. Thus [citetheorem:9578] applies to $\mathcal F$ and gives
\begin{align*}
\mathcal F[\rho(t)]+\frac{1}{2}\int_s^t |\rho'|_{W_2}^2(r)\,d\mathcal L^1(r)+\frac{1}{2}\int_s^t\mathcal I_V[\rho(r)]\,d\mathcal L^1(r)\le \mathcal F[\rho(s)].
\end{align*}
This proves the stated energy dissipation inequality for all $0\le s\le t<\infty$.
[guided]
The last part of the theorem is the energy dissipation inequality. The discrete scheme already contains half of it. Define
\begin{align*}
g_\tau:[0,\infty)\to[0,\infty)
\end{align*}
by
\begin{align*}
g_\tau(r)=\frac{1}{\tau}W_2(\rho_{k+1}^\tau,\rho_k^\tau)
\end{align*}
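when $r\in(k\tau,(k+1)\tau]$. This is the speed of a constant-speed geodesic joining $\rho_k^\tau$ to $\rho_{k+1}^\tau$ over a time interval of length $\tau$. Since $\int_{k\tau}^{(k+1)\tau}g_\tau^2(r)\,d\mathcal L^1(r)=\tau^{-1}W_2^2(\rho_{k+1}^\tau,\rho_k^\tau)$, the discrete energy inequality says that the energy drop controls the integral of $g_\tau^2$: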
\begin{align*}
\mathcal F[\rho_N^\tau]+\frac{1}{2}\int_0^{N\tau}g_\tau^2(r)\,d\mathcal L^1(r)\le \mathcal F[\rho_0].
\end{align*}
The same estimate between two grid times gives the corresponding inequality on every interval $[s,t]$, up to an error that disappears when the grid size tends to $0$.
Now pass to the limit. The metric derivative $|\rho'|_{W_2}$ is the smallest $L^1_{\mathrm{loc}}$ function that controls the length of the limiting curve. By [citetheorem:9558], any limiting upper-gradient bound for the discrete distances dominates $|\rho'|_{W_2}$. In integral form this gives the lower semicontinuity estimate
\begin{align*}
\int_s^t |\rho'|_{W_2}^2(r)\,d\mathcal L^1(r)\le \liminf_{j\to\infty}\int_s^t g_{\tau_j}^2(r)\,d\mathcal L^1(r).
\end{align*}
It remains to identify the slope term. For the free energy
\begin{align*}
\mathcal F[\rho]=\int_{\mathbb R^n}u(x)\log u(x)\,d\mathcal L^n(x)+\int_{\mathbb R^n}V(x)\,d\rho(x)
\end{align*}
with $\rho=u\mathcal L^n$, the first variation with respect to the density $u$ is
\begin{align*}
\frac{\delta\mathcal F}{\delta\rho}=\log u+1+V.
\end{align*}
The corresponding Wasserstein gradient is
\begin{align*}
\nabla\left(\frac{\delta\mathcal F}{\delta\rho}\right)=\frac{\nabla u}{u}+\nabla V
\end{align*}
where the expression is understood distributionally and then represented by an $L^2(\rho)$ vector field when the Fisher information is finite. Its squared $L^2(\rho)$ norm is
\begin{align*}
\int_{\mathbb R^n}\left|\frac{\nabla u(x)}{u(x)}+\nabla V(x)\right|^2u(x)\,d\mathcal L^n(x)=\int_{\mathbb R^n}\frac{|\nabla u(x)+u(x)\nabla V(x)|^2}{u(x)}\,d\mathcal L^n(x).
\end{align*}
This is exactly $\mathcal I_V[\rho]$.
The rigorous lower-semicontinuity passage for this slope and the endpoint passage for the energy are not consequences of the displayed discrete inequality alone. They are exactly the strong-upper-gradient limiting theorem recorded as [citetheorem:9578]. Its hypotheses in this application are: $\mathcal F$ is proper, lower semicontinuous, $\lambda$-displacement convex, the JKO interpolations converge to the curve $\rho$, and $|\partial\mathcal F|$ is a strong upper gradient along that curve. Properness follows from the finite-energy initial datum. Lower semicontinuity follows from entropy lower semicontinuity and the continuous potential term with the stated quadratic growth. The $\lambda$-displacement convexity follows from [citetheorem:9568], using the assumed quadratic-form lower bound on $J(\nabla V)_x$. The strong-upper-gradient identity is assumed in the theorem statement and reads $|\partial\mathcal F|^2=\mathcal I_V$ along the limiting curve. Therefore [citetheorem:9578] gives
\begin{align*}
\mathcal F[\rho(t)]+\frac{1}{2}\int_s^t |\rho'|_{W_2}^2(r)\,d\mathcal L^1(r)+\frac{1}{2}\int_s^t|\partial\mathcal F|^2(\rho(r))\,d\mathcal L^1(r)\le \mathcal F[\rho(s)].
\end{align*}
For the entropy-plus-potential functional above, the slope identity is
\begin{align*}
|\partial\mathcal F|^2(\rho)=\mathcal I_V[\rho].
\end{align*}
Substituting this identity gives
\begin{align*}
\mathcal F[\rho(t)]+\frac{1}{2}\int_s^t |\rho'|_{W_2}^2(r)\,d\mathcal L^1(r)+\frac{1}{2}\int_s^t\mathcal I_V[\rho(r)]\,d\mathcal L^1(r)\le \mathcal F[\rho(s)].
\end{align*}
This is the desired energy dissipation inequality.
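The inequality can be examined on an explicit example. The sketch below assumes $n=1$, $V(x)=x^2/2$, and the Gaussian solution $\rho(t)=N(m(t),s(t)^2)$ with $m(t)=m_0e^{-t}$ and $s(t)^2=1+(s_0^2-1)e^{-2t}$; the helper names and the values $m_0=2$, $s_0=0.5$ are hypothetical. For such Gaussians $\mathcal I_V=m^2+(s-1/s)^2$ and $|\rho'|_{W_2}^2=(m')^2+(s')^2$, and in this example the two sides of the inequality should agree up to quadrature error.

```python
import math

def state(t, m0=2.0, s0=0.5):
    """Mean and standard deviation of the exact solution for V(x) = x^2/2."""
    return m0 * math.exp(-t), math.sqrt(1.0 + (s0 * s0 - 1.0) * math.exp(-2.0 * t))

def free_energy(m, s):
    """F[N(m, s^2)]: Gaussian entropy term plus the potential term."""
    return -0.5 * math.log(2.0 * math.pi * math.e * s * s) + 0.5 * (s * s + m * m)

def fisher(m, s):
    """I_V[N(m, s^2)] = E|u'/u + x|^2 = m^2 + (s - 1/s)^2 for V(x) = x^2/2."""
    return m * m + (s - 1.0 / s) ** 2

def speed_sq(m, s):
    """Squared metric derivative (m')^2 + (s')^2 with m' = -m and s' = 1/s - s."""
    return m * m + (1.0 / s - s) ** 2

def edi_gap(t0, t1, n=4000):
    """F[rho(t0)] minus the left-hand side of the inequality, via the midpoint rule."""
    h = (t1 - t0) / n
    integral = 0.0
    for i in range(n):
        m, s = state(t0 + (i + 0.5) * h)
        integral += 0.5 * (speed_sq(m, s) + fisher(m, s)) * h
    return free_energy(*state(t0)) - (free_energy(*state(t1)) + integral)

gap = edi_gap(0.0, 1.0)
print(f"EDI gap on [0, 1]: {gap:.2e}")
```

A gap near zero is consistent with the identity $\frac{d}{dt}\mathcal F[\rho(t)]=-\mathcal I_V[\rho(t)]$ for this particular solution.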
[/guided]
[/step]
[step:Conclude the convergence theorem]
The first step constructs the JKO sequence for every $\tau>0$. The discrete estimates give compactness and, after diagonal extraction, a locally absolutely continuous limit curve $\rho$ with pointwise $W_2$ convergence of $\bar\rho_{\tau_j}(t)$ to $\rho(t)$. The discrete Euler-Lagrange formulation passes to the limit and gives the weak Fokker-Planck equation with initial datum $\rho_0$. The lower semicontinuity of the action and the Fisher-information slope identity give the stated energy dissipation inequality. Therefore the subsequential JKO convergence, the weak equation, and the energy dissipation inequality all hold as claimed.
[/step]