[proofplan]
We compute the first variation of the internal energy $\mathcal U_m$ along smooth mass-preserving perturbations and obtain the variational derivative $\frac{m}{m-1}\rho^{m-1}$, up to an irrelevant additive constant. The finite-second-moment assumption places $\mu_t=\rho_t\mathcal L^n$ in $\mathcal P_2(\mathbb R^n)$, so the curve is admissible for the formal Wasserstein interpretation; the algebraic identification of the equation uses the smoothness, positivity, and decay assumptions. The formal Wasserstein gradient-flow rule identifies the velocity as the negative spatial gradient of this first variation. Substituting that velocity into the continuity equation reduces the gradient-flow equation to a pointwise divergence identity, which is exactly the porous medium equation. The converse is the same computation in reverse: the porous medium flux can be written as $\rho_t$ times the Wasserstein negative gradient velocity.
[/proofplan]
[step:Compute the first variation of the internal energy]
Fix a smooth positive probability density
\begin{align*}
\rho:\mathbb R^n\to(0,\infty)
\end{align*}
with sufficient decay at infinity. Let
\begin{align*}
\sigma:\mathbb R^n\to\mathbb R
\end{align*}
be a smooth rapidly decaying perturbation satisfying the mass-preserving condition
\begin{align*}
\int_{\mathbb R^n}\sigma(x)\,d\mathcal L^n(x)=0.
\end{align*}
For every real number $\varepsilon$ for which $\rho+\varepsilon\sigma$ remains positive, define the smooth map
\begin{align*}
\rho_\varepsilon:\mathbb R^n\to(0,\infty),\qquad x\mapsto \rho(x)+\varepsilon\sigma(x).
\end{align*}
Then $\rho_\varepsilon$ is a smooth positive perturbation of $\rho$ and satisfies the mass-preserving condition to first order. The decay hypothesis in the theorem is exactly the admissibility assumption needed here: on the chosen interval of $\varepsilon$, the derivative of $(\rho+\varepsilon\sigma)^m$ at $\varepsilon=0$ is integrable and dominated by an integrable rapidly decaying function.
By differentiating under the integral sign, justified by this domination together with the smoothness and decay hypotheses, we get
\begin{align*}
\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\mathcal U_m[\rho_\varepsilon]=\frac{1}{m-1}\int_{\mathbb R^n}m\rho(x)^{m-1}\sigma(x)\,d\mathcal L^n(x).
\end{align*}
Therefore the first variation of $\mathcal U_m$ at $\rho$ is represented by the smooth function
\begin{align*}
\frac{\delta\mathcal U_m}{\delta\rho}(\rho):\mathbb R^n\to\mathbb R,\qquad x\mapsto \frac{m}{m-1}\rho(x)^{m-1}.
\end{align*}
As usual for mass-preserving variations, this representative is determined only up to an additive constant, but its spatial gradient is uniquely determined.
[guided]
We first identify the function whose spatial gradient will generate the Wasserstein velocity field. Fix a smooth positive probability density
\begin{align*}
\rho:\mathbb R^n\to(0,\infty)
\end{align*}
with sufficient decay at infinity. To test the first variation, choose a smooth rapidly decaying function
\begin{align*}
\sigma:\mathbb R^n\to\mathbb R
\end{align*}
such that
\begin{align*}
\int_{\mathbb R^n}\sigma(x)\,d\mathcal L^n(x)=0.
\end{align*}
The zero-integral condition is the infinitesimal form of preserving total mass. For every real number $\varepsilon$ for which $\rho+\varepsilon\sigma$ remains positive, define
\begin{align*}
\rho_\varepsilon:\mathbb R^n\to(0,\infty),\qquad x\mapsto \rho(x)+\varepsilon\sigma(x).
\end{align*}
This restriction on $\varepsilon$ is needed because a positive density on $\mathbb R^n$ need not be bounded away from zero. On this admissible interval, $\rho_\varepsilon$ is a smooth positive perturbation for the formal calculation. The decay hypothesis in the theorem also supplies the analytic justification for the next step: the derivative of $(\rho+\varepsilon\sigma)^m$ at $\varepsilon=0$ is integrable and is dominated, for $\varepsilon$ in the chosen interval, by an integrable rapidly decaying function.
Now compute the derivative of the energy. The functional is
\begin{align*}
\mathcal U_m[\rho_\varepsilon]=\frac{1}{m-1}\int_{\mathbb R^n}(\rho(x)+\varepsilon\sigma(x))^m\,d\mathcal L^n(x).
\end{align*}
Differentiating under the integral sign gives
\begin{align*}
\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\mathcal U_m[\rho_\varepsilon]=\frac{1}{m-1}\int_{\mathbb R^n}m\rho(x)^{m-1}\sigma(x)\,d\mathcal L^n(x).
\end{align*}
This means that the first variation is represented by the map
\begin{align*}
\frac{\delta\mathcal U_m}{\delta\rho}(\rho):\mathbb R^n\to\mathbb R,\qquad x\mapsto \frac{m}{m-1}\rho(x)^{m-1}.
\end{align*}
Why is an additive constant irrelevant? Since all admissible perturbations $\sigma$ have zero integral, replacing this representative by the sum of it and a constant does not change the first variation. The Wasserstein velocity uses the spatial gradient of the representative, and the gradient of a constant is zero, so the velocity is unaffected.
[/guided]
[/step]
[step:Insert the first variation into the formal Wasserstein gradient-flow rule]
For each $t>0$, the hypothesis $\mu_t=\rho_t\mathcal L^n\in\mathcal P_2(\mathbb R^n)$ is the Wasserstein admissibility condition, while the following algebra uses only the smoothness, positivity, and decay assumptions. Apply the preceding computation to the density $\rho_t$. The formal Wasserstein negative gradient velocity associated with $\mathcal U_m$ is the smooth vector field
\begin{align*}
v_t:\mathbb R^n\to\mathbb R^n,\qquad x\mapsto -\nabla\left(\frac{m}{m-1}\rho_t(x)^{m-1}\right).
\end{align*}
Using the chain rule for the smooth map $s\mapsto s^{m-1}$ on $(0,\infty)$, this becomes
\begin{align*}
v_t(x)=-m\rho_t(x)^{m-2}\nabla\rho_t(x).
\end{align*}
The formal Wasserstein gradient-flow equation is the continuity equation
\begin{align*}
\partial_t\rho_t+\nabla\cdot(\rho_t v_t)=0.
\end{align*}
Substituting the displayed formula for $v_t$ gives
\begin{align*}
\partial_t\rho_t=\nabla\cdot\left(m\rho_t^{m-1}\nabla\rho_t\right).
\end{align*}
[/step]
[step:Identify the divergence form with the porous medium equation]
For each $t>0$, define the smooth function
\begin{align*}
q_t:\mathbb R^n\to(0,\infty),\qquad x\mapsto \rho_t(x)^m.
\end{align*}
By the chain rule, its gradient is
\begin{align*}
\nabla q_t(x)=m\rho_t(x)^{m-1}\nabla\rho_t(x).
\end{align*}
Taking divergence of both sides gives
\begin{align*}
\Delta q_t(x)=\nabla\cdot\left(m\rho_t^{m-1}\nabla\rho_t\right)(x).
\end{align*}
Since $q_t=\rho_t^m$, the equation obtained in the previous step is precisely
\begin{align*}
\partial_t\rho_t=\Delta(\rho_t^m).
\end{align*}
Thus every smooth formal Wasserstein gradient flow of $\mathcal U_m$ satisfies the porous medium equation.
[/step]
[step:Recover the Wasserstein gradient-flow equation from the porous medium equation]
Conversely, assume that $\rho$ is a smooth positive probability-density solution of
\begin{align*}
\partial_t\rho_t=\Delta(\rho_t^m)
\end{align*}
with finite second moment and sufficient decay at infinity. By the chain-rule identity from the previous step,
\begin{align*}
\Delta(\rho_t^m)=\nabla\cdot\left(m\rho_t^{m-1}\nabla\rho_t\right).
\end{align*}
Using the first-variation formula,
\begin{align*}
\nabla\left(\frac{\delta\mathcal U_m}{\delta\rho}(\rho_t)\right)=\nabla\left(\frac{m}{m-1}\rho_t^{m-1}\right)=m\rho_t^{m-2}\nabla\rho_t.
\end{align*}
Therefore
\begin{align*}
\rho_t\nabla\left(\frac{\delta\mathcal U_m}{\delta\rho}(\rho_t)\right)
=
m\rho_t^{m-1}\nabla\rho_t.
\end{align*}
The porous medium equation can hence be rewritten as
\begin{align*}
\partial_t\rho_t=\nabla\cdot\left(\rho_t\nabla\left(\frac{\delta\mathcal U_m}{\delta\rho}(\rho_t)\right)\right).
\end{align*}
Equivalently,
\begin{align*}
\partial_t\rho_t+\nabla\cdot\left(-\rho_t\nabla\left(\frac{\delta\mathcal U_m}{\delta\rho}(\rho_t)\right)\right)=0.
\end{align*}
Defining
\begin{align*}
v_t:\mathbb R^n\to\mathbb R^n,\qquad x\mapsto -\nabla\left(\frac{\delta\mathcal U_m}{\delta\rho}(\rho_t)(x)\right),
\end{align*}
we obtain
\begin{align*}
\partial_t\rho_t+\nabla\cdot(\rho_t v_t)=0.
\end{align*}
This is exactly the formal Wasserstein gradient-flow equation for $\mathcal U_m$, completing the equivalence.
[/step]