[proofplan]
Under the stated smoothness and Monge-Ampere hypotheses, the geodesic is written as $\mu_t=(T_t)_\#\mu_0$, and the internal energy is pulled back to an integral over the initial measure. The determinant inequality for positive semidefinite matrices gives a lower bound on the interpolation Jacobian. Because the McCann transform $G_U(r)=r^nU(r^{-n})$ is convex and nonincreasing, this lower bound becomes the desired upper bound on the energy.
[/proofplan]
[step:Prove the inequality for smooth positive data with a smooth Brenier map]
Assume first that $\rho_0,\rho_1\in C^\infty(\mathbb R^n;(0,\infty))$ are probability densities with finite second moment, that the Brenier map $T:\mathbb R^n\to\mathbb R^n$ is continuously differentiable, and that the Monge-Ampere change-of-variables identities below hold for $\mathcal L^n$-a.e. $x\in\mathbb R^n$.
By hypothesis, the Jacobian matrix $JT_x$ is symmetric positive semidefinite for $\mathcal L^n$-a.e. $x\in\mathbb R^n$.
For $t\in[0,1]$, define
\begin{align*}
T_t:\mathbb R^n\to\mathbb R^n,\qquad x\mapsto (1-t)x+tT(x).
\end{align*}
At points of differentiability of $T$, its Jacobian matrix is
\begin{align*}
JT_{t,x}=(1-t)I_n+tJT_x.
\end{align*}
Since $JT_x$ is positive semidefinite, $JT_{t,x}$ is also positive semidefinite.
Let $\rho_t:\mathbb R^n\to[0,\infty)$ denote the density of $\mu_t=(T_t)_\#\mu_0$. The Monge-Ampere change-of-variables formula for the map $T_t$ gives, for $\mathcal L^n$-a.e. $x\in\mathbb R^n$,
\begin{align*}
\rho_t(T_t(x))\det JT_{t,x}=\rho_0(x).
\end{align*}
Define the determinant radii
\begin{align*}
r_t:\mathbb R^n\to[0,\infty),\qquad x\mapsto \bigl(\det JT_{t,x}\bigr)^{1/n},
\end{align*}
and
\begin{align*}
r_1:\mathbb R^n\to[0,\infty),\qquad x\mapsto \bigl(\det JT_x\bigr)^{1/n}.
\end{align*}
Because $\rho_0$ is strictly positive and the transport is between smooth positive densities in the present reduction, these determinant radii are positive for $\mathcal L^n$-a.e. $x$ relevant to the formula.
The function $h_t:\mathbb R^n\to(-\infty,+\infty]$ defined by $h_t(y)=U(\rho_t(y))$ is Borel because $\rho_t$ is a Borel density and $U$ is convex, hence continuous on $(0,\infty)$ and finite at $0$ by hypothesis. Applying the assumed change-of-variables formula for $T_t$ to the positive and negative parts of $h_t$ whenever the integral is finite, and otherwise in the extended nonnegative sense after decomposition, gives
\begin{align*}
\mathcal U[\mu_t]=\int_{\mathbb R^n}U(\rho_t(y))\,d\mathcal L^n(y).
\end{align*}
Using the substitution $y=T_t(x)$ on the full-measure set from the theorem statement and the Jacobian factor $\det JT_{t,x}=r_t(x)^n$, this becomes
\begin{align*}
\mathcal U[\mu_t]=\int_{\mathbb R^n}U(\rho_t(T_t(x)))r_t(x)^n\,d\mathcal L^n(x).
\end{align*}
The Monge-Ampere identity rewrites the density as
\begin{align*}
\rho_t(T_t(x))=\frac{\rho_0(x)}{r_t(x)^n}.
\end{align*}
Therefore
\begin{align*}
\mathcal U[\mu_t]=\int_{\mathbb R^n}r_t(x)^nU\left(\frac{\rho_0(x)}{r_t(x)^n}\right)\,d\mathcal L^n(x).
\end{align*}
Introduce the measurable function
\begin{align*}
a_0:\mathbb R^n\to(0,\infty),\qquad x\mapsto \rho_0(x)^{-1/n}.
\end{align*}
By the definition of $G_U$,
\begin{align*}
\rho_0(x)G_U(a_0(x)r_t(x))=r_t(x)^nU\left(\frac{\rho_0(x)}{r_t(x)^n}\right).
\end{align*}
Thus
\begin{align*}
\mathcal U[\mu_t]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x)r_t(x))\,d\mathcal L^n(x).
\end{align*}
Let $\operatorname{id}_{\mathbb R^n}:\mathbb R^n\to\mathbb R^n$ denote the identity map, $x\mapsto x$. At $t=0$, $T_0=\operatorname{id}_{\mathbb R^n}$ and $r_0=1$, so
\begin{align*}
\mathcal U[\mu_0]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x))\,d\mathcal L^n(x).
\end{align*}
At $t=1$, the same computation for $T$ gives
\begin{align*}
\mathcal U[\mu_1]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x)r_1(x))\,d\mathcal L^n(x).
\end{align*}
[guided]
The reason for introducing $G_U$ is that the density transformation under $T_t$ naturally produces the expression $r_t^nU(\rho_0/r_t^n)$. This expression is not convex in $r_t$ in any transparent way, but after multiplying and dividing by $\rho_0$ it becomes exactly the McCann transform.
We begin with the smooth positive setting. Let
\begin{align*}
T_t:\mathbb R^n\to\mathbb R^n,\qquad x\mapsto (1-t)x+tT(x).
\end{align*}
Since $T$ is the Brenier map, it is the gradient of a convex function at $\mu_0$-a.e. point. Therefore $JT_x$ is symmetric positive semidefinite at almost every differentiability point. The Jacobian matrix of $T_t$ is
\begin{align*}
JT_{t,x}=(1-t)I_n+tJT_x.
\end{align*}
This matrix is positive semidefinite because it is a convex combination of positive semidefinite matrices.
Let $\rho_t:\mathbb R^n\to[0,\infty)$ be the density of $\mu_t=(T_t)_\#\mu_0$. The Monge-Ampere change-of-variables formula states that
\begin{align*}
\rho_t(T_t(x))\det JT_{t,x}=\rho_0(x)
\end{align*}
for $\mathcal L^n$-a.e. $x\in\mathbb R^n$. Define
\begin{align*}
r_t:\mathbb R^n\to[0,\infty),\qquad x\mapsto \bigl(\det JT_{t,x}\bigr)^{1/n}.
\end{align*}
Then $\det JT_{t,x}=r_t(x)^n$, and the Monge-Ampere identity becomes
\begin{align*}
\rho_t(T_t(x))=\frac{\rho_0(x)}{r_t(x)^n}.
\end{align*}
Now compute the energy by pulling it back through $T_t$. The change of variables $y=T_t(x)$ transforms the Lebesgue measure by the Jacobian factor $r_t(x)^n$, hence
\begin{align*}
\mathcal U[\mu_t]=\int_{\mathbb R^n}U(\rho_t(T_t(x)))r_t(x)^n\,d\mathcal L^n(x).
\end{align*}
Substituting the density identity gives
\begin{align*}
\mathcal U[\mu_t]=\int_{\mathbb R^n}r_t(x)^nU\left(\frac{\rho_0(x)}{r_t(x)^n}\right)\,d\mathcal L^n(x).
\end{align*}
This is where the McCann transform enters. Define
\begin{align*}
a_0:\mathbb R^n\to(0,\infty),\qquad x\mapsto \rho_0(x)^{-1/n}.
\end{align*}
Because $G_U(r)=r^nU(r^{-n})$, we have
\begin{align*}
\rho_0(x)G_U(a_0(x)r_t(x))=r_t(x)^nU\left(\frac{\rho_0(x)}{r_t(x)^n}\right).
\end{align*}
Therefore the energy has the form
\begin{align*}
\mathcal U[\mu_t]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x)r_t(x))\,d\mathcal L^n(x).
\end{align*}
Let $\operatorname{id}_{\mathbb R^n}:\mathbb R^n\to\mathbb R^n$ denote the identity map, $x\mapsto x$. The endpoints are the same formula with $T_0=\operatorname{id}_{\mathbb R^n}$, $r_0=1$, and $r_1=(\det JT)^{1/n}$:
\begin{align*}
\mathcal U[\mu_0]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x))\,d\mathcal L^n(x),
\end{align*}
and
\begin{align*}
\mathcal U[\mu_1]=\int_{\mathbb R^n}\rho_0(x)G_U(a_0(x)r_1(x))\,d\mathcal L^n(x).
\end{align*}
This puts all three energies on the same reference measure $\rho_0\,\mathcal L^n$, which is what allows a pointwise convexity argument to prove the global inequality.
We now make that pointwise argument inside the guided proof. Fix $x$ in the full-measure set on which the Jacobians, determinant identities, and positivity assertions hold. The matrix $JT_x$ is symmetric positive semidefinite by hypothesis, and $I_n$ is symmetric positive definite. Minkowski's determinant inequality for positive semidefinite matrices, applied to $(1-t)I_n$ and $tJT_x$, gives
\begin{align*}
r_t(x)=\bigl(\det JT_{t,x}\bigr)^{1/n}\ge (1-t)(\det I_n)^{1/n}+t(\det JT_x)^{1/n}.
\end{align*}
Since $\det I_n=1$ and $r_1(x)=(\det JT_x)^{1/n}$, this becomes
\begin{align*}
r_t(x)\ge (1-t)+tr_1(x).
\end{align*}
Multiplying by the positive number $a_0(x)=\rho_0(x)^{-1/n}$ preserves the inequality:
\begin{align*}
a_0(x)r_t(x)\ge (1-t)a_0(x)+t a_0(x)r_1(x).
\end{align*}
Now the direction of monotonicity matters. Since $G_U$ is nonincreasing, the lower bound on the argument gives an upper bound on the value:
\begin{align*}
G_U(a_0(x)r_t(x))\le G_U((1-t)a_0(x)+t a_0(x)r_1(x)).
\end{align*}
Since $G_U$ is convex on $(0,\infty)$ and both $a_0(x)$ and $a_0(x)r_1(x)$ are positive, convexity gives
\begin{align*}
G_U((1-t)a_0(x)+t a_0(x)r_1(x))\le (1-t)G_U(a_0(x))+tG_U(a_0(x)r_1(x)).
\end{align*}
Combining the two pointwise inequalities and multiplying by $\rho_0(x)\ge0$ yields
\begin{align*}
\rho_0(x)G_U(a_0(x)r_t(x))\le (1-t)\rho_0(x)G_U(a_0(x))+t\rho_0(x)G_U(a_0(x)r_1(x)).
\end{align*}
Integrating this inequality with respect to $\mathcal L^n$ and substituting the three energy representations already proved gives
\begin{align*}
\mathcal U[\mu_t]\le (1-t)\mathcal U[\mu_0]+t\mathcal U[\mu_1].
\end{align*}
Thus the guided argument proves the same displacement convexity inequality as the exact proof.
[/guided]
[/step]
[step:Apply the determinant inequality to the interpolated Jacobian]
For each $\mathcal L^n$-a.e. $x\in\mathbb R^n$ at which $JT_x$ exists, apply Minkowski's determinant inequality for positive semidefinite matrices to $I_n$ and $JT_x$. Since
\begin{align*}
JT_{t,x}=(1-t)I_n+tJT_x,
\end{align*}
we obtain
\begin{align*}
\bigl(\det JT_{t,x}\bigr)^{1/n}\ge (1-t)(\det I_n)^{1/n}+t(\det JT_x)^{1/n}.
\end{align*}
Because $\det I_n=1$, this is
\begin{align*}
r_t(x)\ge (1-t)+tr_1(x).
\end{align*}
Multiplying by the positive scalar $a_0(x)=\rho_0(x)^{-1/n}$ gives
\begin{align*}
a_0(x)r_t(x)\ge (1-t)a_0(x)+t\,a_0(x)r_1(x).
\end{align*}
[/step]
[step:Use monotonicity and convexity of the McCann transform]
Since $G_U$ is nonincreasing and
\begin{align*}
a_0(x)r_t(x)\ge (1-t)a_0(x)+t\,a_0(x)r_1(x),
\end{align*}
we have, for $\mathcal L^n$-a.e. $x\in\mathbb R^n$,
\begin{align*}
G_U(a_0(x)r_t(x))\le G_U((1-t)a_0(x)+t\,a_0(x)r_1(x)).
\end{align*}
Since $G_U$ is convex on $(0,\infty)$,
\begin{align*}
G_U((1-t)a_0(x)+t\,a_0(x)r_1(x))\le (1-t)G_U(a_0(x))+tG_U(a_0(x)r_1(x)).
\end{align*}
Combining these inequalities and multiplying by the nonnegative density $\rho_0(x)$ gives
\begin{align*}
\rho_0(x)G_U(a_0(x)r_t(x))\le (1-t)\rho_0(x)G_U(a_0(x))+t\rho_0(x)G_U(a_0(x)r_1(x)).
\end{align*}
Integrating with respect to $\mathcal L^n$ and using the three energy representations from the first step yields
\begin{align*}
\mathcal U[\mu_t]\le (1-t)\mathcal U[\mu_0]+t\mathcal U[\mu_1].
\end{align*}
Thus the theorem is proved in the smooth positive setting.
[/step]