[proofplan]
The proof splits the relative entropy into the Lebesgue entropy plus the potential energy, up to the additive constant $\log Z$. Since finite relative entropy with respect to $\mu$ implies absolute continuity with respect to [Lebesgue measure](/page/Lebesgue%20Measure), Brenier's theorem gives a unique optimal map $T$ from $\nu_0$ to $\nu_1$, and the given geodesic is the displacement interpolation generated by $T$. The Lebesgue entropy is displacement convex by McCann's theorem, while the uniform convexity of $V$ gives a strong convexity estimate for the potential energy along each transport segment. Adding the two inequalities, with the constant $\log Z$ absorbed via $(1-t)\log Z+t\log Z=\log Z$, gives the claimed $\rho$-convexity inequality.
[/proofplan]
[step:Identify the unique displacement interpolation generated by the Brenier map]
Because $e^{-V(x)} > 0$ for every $x \in \mathbb R^n$, the measures $\mu$ and $\mathcal L^n$ have the same null sets. Since $H(\nu_0\mid\mu) < \infty$ and $H(\nu_1\mid\mu) < \infty$, there exist Borel functions $f_0,f_1:\mathbb R^n \to [0,\infty)$ such that $\nu_i=f_i\mu$ for $i \in \{0,1\}$. Hence $\nu_0$ and $\nu_1$ are absolutely continuous with respect to $\mathcal L^n$.
By the Brenier optimal transport theorem, applied to the absolutely continuous source measure $\nu_0 \in \mathcal P_2(\mathbb R^n)$ and the target measure $\nu_1 \in \mathcal P_2(\mathbb R^n)$ for the quadratic cost, there exist a convex function $\varphi:\mathbb R^n \to (-\infty,\infty]$ and a Borel map
\begin{align*}
T:\mathbb R^n \to \mathbb R^n
\end{align*}
such that $T(x)=\nabla \varphi(x)$ for $\nu_0$-a.e. $x$. The same theorem gives that $T$ pushes $\nu_0$ forward to $\nu_1$ and is the unique optimal transport map from $\nu_0$ to $\nu_1$ for the cost $|x-y|^2$. Thus
\begin{align*}
T_{\#}\nu_0=\nu_1
\end{align*}
and
\begin{align*}
\int_{\mathbb R^n}|T(x)-x|^2\, d\nu_0(x)=W_2(\nu_0,\nu_1)^2.
\end{align*}
For each $t \in [0,1]$, define the interpolation map
\begin{align*}
T_t:\mathbb R^n \to \mathbb R^n
\end{align*}
by
\begin{align*}
T_t(x):=(1-t)x+tT(x).
\end{align*}
Since $\nu_0$ is absolutely continuous with respect to $\mathcal L^n$, optimal plans from $\nu_0$ for the quadratic cost are unique. The standard representation of constant-speed $W_2$-geodesics by optimal plans says that if $\pi$ is an optimal plan between $\nu_0$ and $\nu_1$, then the geodesic induced by $\pi$ is $((1-t)x+ty)_{\#}\pi$. Applying this representation to the unique optimal plan $\pi=(\operatorname{id}_{\mathbb R^n},T)_{\#}\nu_0$, the constant-speed $W_2$-geodesic from $\nu_0$ to $\nu_1$ is unique, and the prescribed geodesic satisfies
\begin{align*}
\nu_t=(T_t)_{\#}\nu_0
\end{align*}
for every $t \in [0,1]$.
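As a hedged illustration, consider the one-dimensional Gaussian case. This setting and the symbols $a_i$, $\sigma_i$, $a_t$, $\sigma_t$ are assumptions introduced only for this example and are not part of the statement. Take $n=1$ and let $\nu_i$ be the Gaussian measure with mean $a_i\in\mathbb R$ and standard deviation $\sigma_i>0$ for $i\in\{0,1\}$. The increasing affine map below is the gradient of a convex function and pushes $\nu_0$ forward to $\nu_1$, so it is the Brenier map in this case:
\begin{align*}
T(x)&=a_1+\frac{\sigma_1}{\sigma_0}(x-a_0), & \varphi(x)&=a_1x+\frac{\sigma_1}{2\sigma_0}(x-a_0)^2,\\
T_t(x)&=a_t+\frac{\sigma_t}{\sigma_0}(x-a_0), & a_t&:=(1-t)a_0+ta_1,\quad \sigma_t:=(1-t)\sigma_0+t\sigma_1.
\end{align*}
Under these assumptions $\nu_t=(T_t)_{\#}\nu_0$ is the Gaussian measure with mean $a_t$ and standard deviation $\sigma_t$, and the transport cost can be computed directly:
\begin{align*}
\int_{\mathbb R}|T(x)-x|^2\,d\nu_0(x)=(a_1-a_0)^2+(\sigma_1-\sigma_0)^2=W_2(\nu_0,\nu_1)^2.
\end{align*}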
[/step]
[step:Split relative entropy into Lebesgue entropy and potential energy]
For any $\nu \in \mathcal P_2(\mathbb R^n)$ with $H(\nu\mid\mu)<\infty$, write $\nu=\rho_\nu \mathcal L^n$, where $\rho_\nu:\mathbb R^n\to[0,\infty)$ is the Lebesgue density of $\nu$. Define the Lebesgue entropy functional $E_{\mathcal L}:\mathcal P_2(\mathbb R^n)\to(-\infty,\infty]$ by
\begin{align*}
E_{\mathcal L}(\nu):=\int_{\mathbb R^n}\rho_\nu(x)\log \rho_\nu(x)\, d\mathcal L^n(x)
\end{align*}
when $\nu=\rho_\nu\mathcal L^n$, and $E_{\mathcal L}(\nu):=+\infty$ otherwise. Define the potential energy functional $P_V:\mathcal P_2(\mathbb R^n)\to(-\infty,\infty]$ by
\begin{align*}
P_V(\nu):=\int_{\mathbb R^n}V(x)\, d\nu(x).
\end{align*}
Define the reference density $p:\mathbb R^n\to(0,\infty)$ by
\begin{align*}
p(x):=Z^{-1}e^{-V(x)}.
\end{align*}
Then $\mu=p\mathcal L^n$. Define the Radon-Nikodym derivative as the measurable map
\begin{align*}
r_\nu:\mathbb R^n\to[0,\infty]
\end{align*}
given $\mu$-a.e. by
\begin{align*}
r_\nu=\frac{d\nu}{d\mu}.
\end{align*}
Since $\nu=\rho_\nu\mathcal L^n$, this derivative satisfies
\begin{align*}
r_\nu(x)=\frac{\rho_\nu(x)}{p(x)}=Z e^{V(x)}\rho_\nu(x)
\end{align*}
for $\mathcal L^n$-a.e. $x$, equivalently for $\mu$-a.e. $x$. The change of reference measure in the definition of relative entropy gives
\begin{align*}
H(\nu\mid\mu)=\int_{\mathbb R^n}\rho_\nu(x)\log\frac{\rho_\nu(x)}{p(x)}\, d\mathcal L^n(x)
\end{align*}
as an extended integral, with the convention $0\log 0=0$. On the set where $\rho_\nu>0$, we have
\begin{align*}
\log\frac{\rho_\nu(x)}{p(x)}=\log\rho_\nu(x)+V(x)+\log Z.
\end{align*}
The set where $\rho_\nu=0$ contributes zero to the density-weighted integrals. Therefore, whenever $E_{\mathcal L}(\nu)$ and $P_V(\nu)$ are not combined in the indeterminate form $\infty-\infty$, the definition of the signed [Lebesgue integral](/page/Lebesgue%20Integral) gives
\begin{align*}
H(\nu\mid\mu)=E_{\mathcal L}(\nu)+P_V(\nu)+\log Z.
\end{align*}
For the endpoint measures, the remainder of this step proves that $E_{\mathcal L}(\nu_i)$ and $P_V(\nu_i)$ are finite [real numbers](/page/Real%20Numbers), so the displayed identity is an ordinary equality for $i\in\{0,1\}$.
We start with a quadratic lower bound on $V$. Taylor's formula with integral remainder and the Hessian lower bound give
\begin{align*}
V(x)\ge V(0)+\nabla V(0)\cdot x+\frac{\rho}{2}|x|^2.
\end{align*}
[Young's inequality](/theorems/244) applied to the Euclidean [inner product](/page/Inner%20Product) $\nabla V(0)\cdot x$ gives
\begin{align*}
\nabla V(0)\cdot x\ge -\frac{\rho}{4}|x|^2-\frac{|\nabla V(0)|^2}{\rho},
\end{align*}
so define the constant
\begin{align*}
m:=V(0)-\frac{|\nabla V(0)|^2}{\rho}.
\end{align*}
Then
\begin{align*}
V(x)\ge \frac{\rho}{4}|x|^2+m
\end{align*}
for every $x\in\mathbb R^n$. In particular, $V$ is bounded below by $m$. For a real-valued measurable function $F:\mathbb R^n\to\mathbb R$, define its positive and negative parts by
\begin{align*}
F^+(x):=\max\{F(x),0\}, \qquad F^-(x):=\max\{-F(x),0\}.
\end{align*}
Defining
\begin{align*}
m^-:=\max\{-m,0\},
\end{align*}
we have $V^-\le m^-$.
It remains to show that, at each endpoint, the positive part of $V$ is integrable and the Lebesgue entropy is finite. Fix $i\in\{0,1\}$, and define
\begin{align*}
M_i:=\int_{\mathbb R^n}|x|^2\,d\nu_i(x)<\infty.
\end{align*}
Choose $a>0$ and define the Gaussian probability density $q_a:\mathbb R^n\to(0,\infty)$ by
\begin{align*}
q_a(x):=c_a e^{-a|x|^2}.
\end{align*}
Here $c_a>0$ is the normalising constant satisfying
\begin{align*}
\int_{\mathbb R^n}q_a(x)\,d\mathcal L^n(x)=1.
\end{align*}
By the non-negativity of relative entropy, also called [Gibbs' inequality](/theorems/1629), the relative entropy of $\nu_i$ with respect to the probability measure $q_a\mathcal L^n$ is non-negative in the extended sense. Therefore
\begin{align*}
0\le \int_{\mathbb R^n}\rho_{\nu_i}(x)\log\frac{\rho_{\nu_i}(x)}{q_a(x)}\,d\mathcal L^n(x).
\end{align*}
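Because $\log q_a(x)=\log c_a-a|x|^2$ is $\nu_i$-integrable, the logarithm of the quotient can be split, and rearranging gives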
\begin{align*}
E_{\mathcal L}(\nu_i)\ge \int_{\mathbb R^n}\rho_{\nu_i}(x)\log q_a(x)\,d\mathcal L^n(x)=\log c_a-aM_i> -\infty.
\end{align*}
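For concreteness, the constants in this bound can be made explicit. The following is only an illustrative computation; the particular choice of $a$ at the end is an assumption made for the example, since any $a>0$ suffices for the argument. The Gaussian integral $\int_{\mathbb R^n}e^{-a|x|^2}\,d\mathcal L^n(x)=(\pi/a)^{n/2}$ gives
\begin{align*}
c_a=\left(\frac{a}{\pi}\right)^{n/2},\qquad E_{\mathcal L}(\nu_i)\ge \frac n2\log\frac{a}{\pi}-aM_i.
\end{align*}
Since $\nu_i$ is absolutely continuous, it is not a Dirac mass at the origin, so $M_i>0$. Maximising the right-hand side over $a>0$ leads to $a=n/(2M_i)$ and to the bound
\begin{align*}
E_{\mathcal L}(\nu_i)\ge -\frac n2\log\frac{2\pi e M_i}{n}.
\end{align*}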
The lower bound $V\ge m$ gives $P_V(\nu_i)>-\infty$. Hence the entropy splitting for $\nu_i$ is not an indeterminate expression. Using
\begin{align*}
H(\nu_i\mid\mu)=E_{\mathcal L}(\nu_i)+P_V(\nu_i)+\log Z
\end{align*}
and the hypothesis $H(\nu_i\mid\mu)<\infty$, the lower bound $E_{\mathcal L}(\nu_i)>-\infty$ forces $P_V(\nu_i)<\infty$. Since $V^-$ is bounded, this implies
\begin{align*}
\int_{\mathbb R^n}V^+(x)\,d\nu_i(x)<\infty,
\end{align*}
and hence $P_V(\nu_i)\in\mathbb R$. The same identity then gives $E_{\mathcal L}(\nu_i)\in\mathbb R$.
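To see the splitting in a case where every term is explicit, consider the following sketch. The quadratic potential and the Gaussian measure $\nu$ below, with the symbols $a$ and $\sigma$, are assumptions chosen for illustration only. Let $V(x)=\frac{\rho}{2}|x|^2$, which satisfies the Hessian lower bound with equality, and let $\nu$ be the Gaussian measure on $\mathbb R^n$ with mean $a\in\mathbb R^n$ and covariance $\sigma^2 I_n$, $\sigma>0$. Then
\begin{align*}
\log Z&=\frac n2\log\frac{2\pi}{\rho}, &
E_{\mathcal L}(\nu)&=-\frac n2\log(2\pi e\sigma^2), &
P_V(\nu)&=\frac{\rho}{2}\left(|a|^2+n\sigma^2\right).
\end{align*}
Adding the three terms gives
\begin{align*}
H(\nu\mid\mu)=\frac n2\left(\rho\sigma^2-1-\log(\rho\sigma^2)\right)+\frac{\rho}{2}|a|^2,
\end{align*}
which agrees with the usual closed form for the relative entropy between two Gaussian measures and vanishes exactly when $a=0$ and $\sigma^2=\rho^{-1}$, that is, when $\nu=\mu$.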
[/step]
[step:Apply displacement convexity to the Lebesgue entropy]
By the preceding step, $E_{\mathcal L}(\nu_0)$ and $E_{\mathcal L}(\nu_1)$ are finite real numbers. We use McCann's displacement convexity theorem in the following precise form: for the extended Boltzmann entropy functional $E_{\mathcal L}$, defined as above by the density formula on measures absolutely continuous with respect to $\mathcal L^n$ and as $+\infty$ otherwise, $E_{\mathcal L}$ is convex along every quadratic displacement interpolation in $\mathcal P_2(\mathbb R^n)$. The hypotheses apply because $\nu_0$ and $\nu_1$ are absolutely continuous with respect to $\mathcal L^n$, have finite second moments, have finite Lebesgue entropy, and $\nu_t=(T_t)_{\#}\nu_0$ is their quadratic displacement interpolation. Hence, for every $t\in[0,1]$,
\begin{align*}
E_{\mathcal L}(\nu_t)\le (1-t)E_{\mathcal L}(\nu_0)+tE_{\mathcal L}(\nu_1).
\end{align*}
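As a sanity check in a special case, the inequality can be verified by hand for one-dimensional Gaussian measures. The Gaussian setting and the symbols $\sigma_0,\sigma_1,\sigma_t$ are assumptions of this example. If $n=1$ and $\nu_0$, $\nu_1$ are Gaussian with standard deviations $\sigma_0,\sigma_1>0$, then the displacement interpolation $\nu_t$ is Gaussian with standard deviation $\sigma_t:=(1-t)\sigma_0+t\sigma_1$, and
\begin{align*}
E_{\mathcal L}(\nu_t)=-\frac12\log(2\pi e)-\log\sigma_t.
\end{align*}
Since $\sigma_t$ is affine in $t$ and $-\log$ is convex on $(0,\infty)$,
\begin{align*}
-\log\sigma_t\le -(1-t)\log\sigma_0-t\log\sigma_1,
\end{align*}
which is the displayed convexity inequality in this case. The means play no role because $E_{\mathcal L}$ is translation invariant.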
[/step]
[step:Use uniform convexity of $V$ along each transport segment]
Fix $x\in\mathbb R^n$ and define the one-dimensional function
\begin{align*}
g_x:[0,1]\to\mathbb R,\qquad g_x(s)=V((1-s)x+sT(x)).
\end{align*}
For $\nu_0$-a.e. $x$, the vector $T(x)$ is defined and finite. Since $V\in C^2(\mathbb R^n)$, the function $g_x$ belongs to $C^2([0,1])$, and
\begin{align*}
g_x''(s)=(T(x)-x)^\top D^2V((1-s)x+sT(x))(T(x)-x).
\end{align*}
The Hessian lower bound gives
\begin{align*}
g_x''(s)\ge \rho |T(x)-x|^2
\end{align*}
for every $s\in[0,1]$.
Applying the elementary one-dimensional strong convexity estimate to $g_x$ at $s=t$, for an arbitrary $t\in[0,1]$, gives
\begin{align*}
V(T_t(x))\le (1-t)V(x)+tV(T(x))-\frac{\rho}{2}t(1-t)|T(x)-x|^2
\end{align*}
for $\nu_0$-a.e. $x$.
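The quadratic potential shows that the correction term cannot be improved in general. This is an illustrative special case, and the choice $V(x)=\frac{\rho}{2}|x|^2$ is an assumption of the example. For all $x,y\in\mathbb R^n$ and $t\in[0,1]$, expanding the squares gives the identity
\begin{align*}
|(1-t)x+ty|^2=(1-t)|x|^2+t|y|^2-t(1-t)|x-y|^2.
\end{align*}
Multiplying by $\rho/2$ and taking $y=T(x)$ yields
\begin{align*}
V(T_t(x))=(1-t)V(x)+tV(T(x))-\frac{\rho}{2}t(1-t)|T(x)-x|^2,
\end{align*}
so the pointwise estimate holds with equality for this potential.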
[guided]
This guided expansion proves the pointwise strong convexity estimate used in this step; the remaining steps integrate this estimate and combine it with the entropy convexity estimate. We want to extract the curvature of $V$ along the exact straight line used by the Wasserstein geodesic. For a fixed starting point $x\in\mathbb R^n$ for which $T(x)$ is defined, the path followed by the transported particle is
\begin{align*}
s\mapsto (1-s)x+sT(x).
\end{align*}
This qualification is needed because the Brenier map is only specified $\nu_0$-a.e.; all pointwise estimates below are therefore asserted on that full $\nu_0$-measure set.
We encode the value of the potential along this segment by the function
\begin{align*}
g_x:[0,1]\to\mathbb R
\end{align*}
defined by
\begin{align*}
g_x(s):=V((1-s)x+sT(x)).
\end{align*}
Since $V\in C^2(\mathbb R^n)$ and the map $s\mapsto (1-s)x+sT(x)$ is affine, the chain rule gives $g_x\in C^2([0,1])$. Differentiating twice in the one-dimensional variable $s$ gives
\begin{align*}
g_x''(s)=(T(x)-x)^\top D^2V((1-s)x+sT(x))(T(x)-x).
\end{align*}
Now the hypothesis on $D^2V$ applies with the vector $\xi=T(x)-x$ and the point $(1-s)x+sT(x)$. Therefore
\begin{align*}
g_x''(s)\ge \rho |T(x)-x|^2
\end{align*}
for every $s\in[0,1]$.
This means that the function
\begin{align*}
h_x:[0,1]\to\mathbb R
\end{align*}
defined by
\begin{align*}
h_x(s):=g_x(s)-\frac{\rho}{2}|T(x)-x|^2s^2
\end{align*}
is convex, because $h_x''(s)=g_x''(s)-\rho |T(x)-x|^2\ge0$. Convexity of $h_x$ between $0$ and $1$ gives
\begin{align*}
h_x(t)\le (1-t)h_x(0)+t h_x(1).
\end{align*}
Substituting the definition of $h_x$ into this inequality yields
\begin{align*}
g_x(t)-\frac{\rho}{2}|T(x)-x|^2t^2\le (1-t)g_x(0)+t\left(g_x(1)-\frac{\rho}{2}|T(x)-x|^2\right).
\end{align*}
Rearranging gives
\begin{align*}
g_x(t)\le (1-t)g_x(0)+t g_x(1)-\frac{\rho}{2}t(1-t)|T(x)-x|^2.
\end{align*}
Finally, $g_x(0)=V(x)$, $g_x(1)=V(T(x))$, and $g_x(t)=V(T_t(x))$, so
\begin{align*}
V(T_t(x))\le (1-t)V(x)+tV(T(x))-\frac{\rho}{2}t(1-t)|T(x)-x|^2.
\end{align*}
This is the pointwise gain coming from uniform convexity. It is stronger than ordinary convexity exactly by the quadratic correction involving the transport distance $|T(x)-x|^2$.
[/guided]
[/step]
[step:Integrate the potential estimate and identify the transport cost]
Let $m:=V(0)-|\nabla V(0)|^2/\rho$ be the lower bound constant obtained above, and define the non-negative potential $\widetilde V:\mathbb R^n\to[0,\infty)$ by
\begin{align*}
\widetilde V(x):=V(x)-m.
\end{align*}
Since $(1-t)m+tm=m$, subtracting the constant $m$ from both sides of the pointwise strong convexity estimate leaves its form unchanged, so for $\nu_0$-a.e. $x$,
\begin{align*}
\widetilde V(T_t(x))\le (1-t)\widetilde V(x)+t\widetilde V(T(x))-\frac{\rho}{2}t(1-t)|T(x)-x|^2.
\end{align*}
The functions $\widetilde V$ and $\widetilde V\circ T$ are $\nu_0$-integrable because $P_V(\nu_0)$ and $P_V(\nu_1)$ are finite and $T_{\#}\nu_0=\nu_1$. The function $x\mapsto |T(x)-x|^2$ is $\nu_0$-integrable because $T$ is an optimal transport map between measures in $\mathcal P_2(\mathbb R^n)$. Hence the right-hand side is integrable, and the preceding pointwise inequality implies that $\widetilde V\circ T_t$ is also $\nu_0$-integrable. Integrating with respect to $\nu_0$ is therefore justified and gives
\begin{align*}
\int_{\mathbb R^n}\widetilde V(T_t(x))\,d\nu_0(x)\le (1-t)\int_{\mathbb R^n}\widetilde V(x)\,d\nu_0(x)+t\int_{\mathbb R^n}\widetilde V(T(x))\,d\nu_0(x)-\frac{\rho}{2}t(1-t)\int_{\mathbb R^n}|T(x)-x|^2\,d\nu_0(x).
\end{align*}
Because $\nu_t=(T_t)_{\#}\nu_0$ and $T_{\#}\nu_0=\nu_1$, the pushforward identity gives
\begin{align*}
\int_{\mathbb R^n}\widetilde V(T_t(x))\,d\nu_0(x)=P_V(\nu_t)-m
\end{align*}
and
\begin{align*}
\int_{\mathbb R^n}\widetilde V(T(x))\,d\nu_0(x)=P_V(\nu_1)-m.
\end{align*}
Also,
\begin{align*}
\int_{\mathbb R^n}\widetilde V(x)\,d\nu_0(x)=P_V(\nu_0)-m.
\end{align*}
Substituting these three identities into the integrated inequality cancels the constants because $(1-t)m+tm=m$. Thus
\begin{align*}
P_V(\nu_t)\le (1-t)P_V(\nu_0)+tP_V(\nu_1)-\frac{\rho}{2}t(1-t)\int_{\mathbb R^n}|T(x)-x|^2\,d\nu_0(x).
\end{align*}
Since $T$ is the optimal transport map from $\nu_0$ to $\nu_1$ for the quadratic cost,
\begin{align*}
\int_{\mathbb R^n}|T(x)-x|^2\,d\nu_0(x)=W_2(\nu_0,\nu_1)^2.
\end{align*}
Therefore
\begin{align*}
P_V(\nu_t)\le (1-t)P_V(\nu_0)+tP_V(\nu_1)-\frac{\rho}{2}t(1-t)W_2(\nu_0,\nu_1)^2.
\end{align*}
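The integrated estimate can be checked directly in a simple case. The following sketch assumes $n=1$, the quadratic potential $V(x)=\frac{\rho}{2}x^2$, and Gaussian endpoints $\nu_i$ with means $a_i$ and standard deviations $\sigma_i>0$; none of these choices is part of the statement. The interpolation $\nu_t$ is then Gaussian with mean $a_t:=(1-t)a_0+ta_1$ and standard deviation $\sigma_t:=(1-t)\sigma_0+t\sigma_1$, and $P_V(\nu_t)=\frac{\rho}{2}(a_t^2+\sigma_t^2)$. Using
\begin{align*}
a_t^2&=(1-t)a_0^2+ta_1^2-t(1-t)(a_1-a_0)^2,\\
\sigma_t^2&=(1-t)\sigma_0^2+t\sigma_1^2-t(1-t)(\sigma_1-\sigma_0)^2,
\end{align*}
together with $W_2(\nu_0,\nu_1)^2=(a_1-a_0)^2+(\sigma_1-\sigma_0)^2$, one finds
\begin{align*}
P_V(\nu_t)=(1-t)P_V(\nu_0)+tP_V(\nu_1)-\frac{\rho}{2}t(1-t)W_2(\nu_0,\nu_1)^2,
\end{align*}
so the inequality is an equality in this example.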
[guided]
The pointwise estimate from the previous step becomes useful only after integrating it against the starting measure $\nu_0$. Because $V$ may take negative values, we first subtract the lower bound constant $m$ and work with the non-negative function $\widetilde V=V-m$. This avoids any ambiguity in the integral while preserving the same convexity correction term.
For $\nu_0$-a.e. $x$, the previous step gives
\begin{align*}
\widetilde V(T_t(x))\le (1-t)\widetilde V(x)+t\widetilde V(T(x))-\frac{\rho}{2}t(1-t)|T(x)-x|^2.
\end{align*}
The functions on the right are integrable with respect to $\nu_0$: the endpoint potential energies are finite, $T_{\#}\nu_0=\nu_1$, and the quadratic transport cost is finite because $T$ is optimal between measures in $\mathcal P_2(\mathbb R^n)$. Thus integration with respect to $\nu_0$ is justified and gives
\begin{align*}
\int_{\mathbb R^n}\widetilde V(T_t(x))\,d\nu_0(x)\le (1-t)\int_{\mathbb R^n}\widetilde V(x)\,d\nu_0(x)+t\int_{\mathbb R^n}\widetilde V(T(x))\,d\nu_0(x)-\frac{\rho}{2}t(1-t)\int_{\mathbb R^n}|T(x)-x|^2\,d\nu_0(x).
\end{align*}
The pushforward identities identify these three potential terms as $P_V(\nu_t)-m$, $P_V(\nu_0)-m$, and $P_V(\nu_1)-m$. The constants cancel because $(1-t)m+tm=m$. Finally, the optimality of $T$ identifies the last integral with $W_2(\nu_0,\nu_1)^2$. Therefore
\begin{align*}
P_V(\nu_t)\le (1-t)P_V(\nu_0)+tP_V(\nu_1)-\frac{\rho}{2}t(1-t)W_2(\nu_0,\nu_1)^2.
\end{align*}
[/guided]
[/step]
[step:Add the entropy and potential inequalities]
Before using the entropy splitting at time $t$, we verify that it is an ordinary equality. The preceding step proves $P_V(\nu_t)\in\mathbb R$. The displacement convexity estimate gives $E_{\mathcal L}(\nu_t)<\infty$ because $E_{\mathcal L}(\nu_0)$ and $E_{\mathcal L}(\nu_1)$ are finite; in particular, $\nu_t$ is absolutely continuous with respect to $\mathcal L^n$. Since $\nu_t\in\mathcal P_2(\mathbb R^n)$, the same Gaussian comparison used for the endpoints then gives $E_{\mathcal L}(\nu_t)>-\infty$. Hence $E_{\mathcal L}(\nu_t)\in\mathbb R$, and the entropy splitting is legitimate for $\nu_t$ as well as for $\nu_0$ and $\nu_1$. Thus
\begin{align*}
H(\nu_t\mid\mu)=E_{\mathcal L}(\nu_t)+P_V(\nu_t)+\log Z.
\end{align*}
Combining the displacement convexity estimate for $E_{\mathcal L}$ with the strong convexity estimate for $P_V$ gives
\begin{align*}
H(\nu_t\mid\mu)\le (1-t)E_{\mathcal L}(\nu_0)+tE_{\mathcal L}(\nu_1)+(1-t)P_V(\nu_0)+tP_V(\nu_1)+\log Z-\frac{\rho}{2}t(1-t)W_2(\nu_0,\nu_1)^2.
\end{align*}
Since $(1-t)\log Z+t\log Z=\log Z$, the entropy splitting at the endpoints $\nu_0$ and $\nu_1$ shows that the right-hand side equals
\begin{align*}
(1-t)H(\nu_0\mid\mu)+tH(\nu_1\mid\mu)-\frac{\rho}{2}t(1-t)W_2(\nu_0,\nu_1)^2.
\end{align*}
Thus, for every $t\in[0,1]$,
\begin{align*}
H(\nu_t\mid\mu)
\le (1-t)H(\nu_0\mid\mu)+tH(\nu_1\mid\mu)-\frac{\rho}{2}t(1-t)W_2(\nu_0,\nu_1)^2.
\end{align*}
This is the desired displacement $\rho$-convexity inequality for the relative entropy.
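As a hedged check of the final inequality, consider again a fully explicit case. The setting is an assumption of this example: $n=1$, $V(x)=\frac{\rho}{2}x^2$, and Gaussian endpoints $\nu_i$ with means $a_i$ and standard deviations $\sigma_i>0$, so that $\nu_t$ is Gaussian with mean $a_t:=(1-t)a_0+ta_1$ and standard deviation $\sigma_t:=(1-t)\sigma_0+t\sigma_1$. In this case
\begin{align*}
H(\nu_t\mid\mu)=\frac12\left(\rho\sigma_t^2-1-\log(\rho\sigma_t^2)\right)+\frac{\rho}{2}a_t^2,
\end{align*}
and the quadratic terms reproduce the $W_2$ correction exactly, so the gap in the inequality reduces to the concavity of the logarithm:
\begin{align*}
H(\nu_t\mid\mu)-\left[(1-t)H(\nu_0\mid\mu)+tH(\nu_1\mid\mu)-\frac{\rho}{2}t(1-t)W_2(\nu_0,\nu_1)^2\right]
=(1-t)\log\sigma_0+t\log\sigma_1-\log\sigma_t\le 0.
\end{align*}
For instance, with $\rho=1$, $a_0=0$, $a_1=2$, $\sigma_0=\sigma_1=1$, and $t=\tfrac12$, one has $H(\nu_0\mid\mu)=0$, $H(\nu_1\mid\mu)=2$, $W_2(\nu_0,\nu_1)^2=4$, and $H(\nu_{1/2}\mid\mu)=\tfrac12=\tfrac12\cdot 0+\tfrac12\cdot 2-\tfrac12\cdot\tfrac14\cdot 4$, so equality holds; the inequality is strict for $t\in(0,1)$ as soon as $\sigma_0\ne\sigma_1$.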
[/step]