[proofplan]
We differentiate $\mathcal F[\rho_t]$ by applying the definition of first variation to the curve $t\mapsto\rho_t$. The continuity equation identifies the tangent vector $\partial_t\rho_t$ with the negative divergence of the transport flux $\rho_t\nabla\phi_t$. An [integration by parts](/theorems/210) on $\mathbb R^n$, justified by the compact-support or decay hypothesis, transfers the divergence onto the first-variation density and gives the weighted $L^2(\rho_t)$ pairing of gradients.
[/proofplan]
[step:Differentiate the functional using its first variation]Fix $t\in I$. Define the first-variation potential at time $t$ by
\begin{align*}
\psi_t:\mathbb R^n\to\mathbb R,\qquad \psi_t(x)=\frac{\delta\mathcal F}{\delta\rho}(\rho_t)(x).
\end{align*}
Since the curve $s\mapsto\rho_s$ takes values in $\mathcal A$ and is smooth, its velocity at time $t$ is the smooth function
\begin{align*}
\sigma_t:\mathbb R^n\to\mathbb R,\qquad \sigma_t(x)=\partial_t\rho_t(x).
\end{align*}
Applying the defining first-variation identity to the admissible curve $s\mapsto\rho_s$ at the parameter value $s=t$ gives
\begin{align*}
\frac{d}{dt}\mathcal F[\rho_t]=\int_{\mathbb R^n}\psi_t(x)\,\partial_t\rho_t(x)\,d\mathcal L^n(x).
\end{align*}[/step]
[guided]Fix $t\in I$. The first variation of $\mathcal F$ at $\rho_t$ is represented by the smooth function
\begin{align*}
\psi_t:\mathbb R^n\to\mathbb R,\qquad \psi_t(x)=\frac{\delta\mathcal F}{\delta\rho}(\rho_t)(x).
\end{align*}
The curve $s\mapsto\rho_s$ is an admissible smooth curve in $\mathcal A$, so its tangent vector at $s=t$ is the smooth function
\begin{align*}
\sigma_t:\mathbb R^n\to\mathbb R,\qquad \sigma_t(x)=\partial_t\rho_t(x).
\end{align*}
The definition of first variation says that the derivative of $\mathcal F$ along an admissible curve at any parameter value is obtained by integrating the first-variation potential at the corresponding density against the curve velocity. Applying that definition to $s\mapsto\rho_s$ at $s=t$ yields
\begin{align*}
\frac{d}{dt}\mathcal F[\rho_t]=\int_{\mathbb R^n}\psi_t(x)\,\sigma_t(x)\,d\mathcal L^n(x).
\end{align*}
Substituting the definition of $\sigma_t$ gives
\begin{align*}
\frac{d}{dt}\mathcal F[\rho_t]=\int_{\mathbb R^n}\psi_t(x)\,\partial_t\rho_t(x)\,d\mathcal L^n(x).
\end{align*}
This is the point at which the variational derivative converts a derivative in the infinite-dimensional density space into an ordinary spatial integral.[/guided]
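As an illustration only, consider the potential-energy functional $\mathcal F[\rho]=\int_{\mathbb R^n}V(x)\,\rho(x)\,d\mathcal L^n(x)$, where $V:\mathbb R^n\to\mathbb R$ is a fixed smooth function. Both $V$ and this choice of $\mathcal F$ are assumptions made for the example and are not part of the statement; we also assume that differentiation under the integral sign is permitted. In this case the first-variation potential does not depend on $t$, and the first-variation identity reduces to differentiating under the integral:
\begin{align*}
\psi_t(x)&=\frac{\delta\mathcal F}{\delta\rho}(\rho_t)(x)=V(x),\\
\frac{d}{dt}\mathcal F[\rho_t]&=\frac{d}{dt}\int_{\mathbb R^n}V(x)\,\rho_t(x)\,d\mathcal L^n(x)=\int_{\mathbb R^n}V(x)\,\partial_t\rho_t(x)\,d\mathcal L^n(x).
\end{align*}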
[step:Use the continuity equation to rewrite the density velocity]
The continuity equation gives, pointwise on $\mathbb R^n$ at the fixed time $t$,
\begin{align*}
\partial_t\rho_t(x)=-\operatorname{div}(\rho_t\nabla\phi_t)(x).
\end{align*}
Substituting this identity into the first-variation formula gives
\begin{align*}
\frac{d}{dt}\mathcal F[\rho_t]=-\int_{\mathbb R^n}\psi_t(x)\,\operatorname{div}(\rho_t\nabla\phi_t)(x)\,d\mathcal L^n(x).
\end{align*}
[/step]
[step:Integrate by parts to move the divergence onto the first variation]Define the smooth vector field
\begin{align*}
J_t:\mathbb R^n\to\mathbb R^n,\qquad J_t(x)=\rho_t(x)\nabla\phi_t(x).
\end{align*}
By the compact-support or decay hypothesis, the [integration by parts](/theorems/2098) identity on $\mathbb R^n$ applies to $\psi_t$ and $J_t$ with no boundary contribution at infinity. Hence
\begin{align*}
-\int_{\mathbb R^n}\psi_t(x)\,\operatorname{div}J_t(x)\,d\mathcal L^n(x)=\int_{\mathbb R^n}\nabla\psi_t(x)\cdot J_t(x)\,d\mathcal L^n(x).
\end{align*}
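The left-hand side is exactly the expression for $\frac{d}{dt}\mathcal F[\rho_t]$ obtained in the previous step, since $\operatorname{div}J_t=\operatorname{div}(\rho_t\nabla\phi_t)$. Substituting the definition of $J_t$ into the right-hand side therefore gives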
\begin{align*}
\frac{d}{dt}\mathcal F[\rho_t]=\int_{\mathbb R^n}\nabla\psi_t(x)\cdot\nabla\phi_t(x)\,\rho_t(x)\,d\mathcal L^n(x).
\end{align*}[/step]
[guided]The expression obtained from the continuity equation contains the divergence of the flux. We introduce that flux explicitly as the smooth vector field
\begin{align*}
J_t:\mathbb R^n\to\mathbb R^n,\qquad J_t(x)=\rho_t(x)\nabla\phi_t(x).
\end{align*}
With this notation,
\begin{align*}
\frac{d}{dt}\mathcal F[\rho_t]=-\int_{\mathbb R^n}\psi_t(x)\,\operatorname{div}J_t(x)\,d\mathcal L^n(x).
\end{align*}
The purpose of integration by parts is to transfer the derivative from the flux $J_t$ to the first-variation potential $\psi_t$. The compact-support or decay hypothesis guarantees that the boundary contribution at infinity vanishes. Therefore the integration by parts formula on $\mathbb R^n$ gives
\begin{align*}
-\int_{\mathbb R^n}\psi_t(x)\,\operatorname{div}J_t(x)\,d\mathcal L^n(x)=\int_{\mathbb R^n}\nabla\psi_t(x)\cdot J_t(x)\,d\mathcal L^n(x).
\end{align*}
Substituting $J_t(x)=\rho_t(x)\nabla\phi_t(x)$ into the right-hand side yields
\begin{align*}
\frac{d}{dt}\mathcal F[\rho_t]=\int_{\mathbb R^n}\nabla\psi_t(x)\cdot\nabla\phi_t(x)\,\rho_t(x)\,d\mathcal L^n(x).
\end{align*}
This is the desired transport-direction pairing: the tangent vector is represented by the potential $\phi_t$, and the differential of the functional is represented by the gradient of its first variation.[/guided]
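To see concretely which term the hypothesis removes, here is a sketch of the same computation in the one-dimensional case $n=1$, carried out on a bounded interval $[-R,R]$ before letting $R\to\infty$. The truncation radius $R$ is introduced only for this illustration, and we assume that $\psi_tJ_t$ tends to zero at $\pm\infty$ and that both integrands are integrable on $\mathbb R$:
\begin{align*}
-\int_{-R}^{R}\psi_t(x)\,J_t'(x)\,dx&=-\Big[\psi_t(x)\,J_t(x)\Big]_{x=-R}^{x=R}+\int_{-R}^{R}\psi_t'(x)\,J_t(x)\,dx,\\
\lim_{R\to\infty}\Big[\psi_t(x)\,J_t(x)\Big]_{x=-R}^{x=R}&=0.
\end{align*}
Passing to the limit $R\to\infty$ leaves only the integral term, which is the one-dimensional form of the identity used above.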
[step:Substitute the first-variation potential and conclude]
Finally, by the definition of $\psi_t$,
\begin{align*}
\nabla\psi_t(x)=\nabla\left(\frac{\delta\mathcal F}{\delta\rho}(\rho_t)\right)(x).
\end{align*}
Substituting this identity into the preceding formula gives
\begin{align*}
\frac{d}{dt}\mathcal F[\rho_t]=\int_{\mathbb R^n}\nabla\left(\frac{\delta\mathcal F}{\delta\rho}(\rho_t)\right)(x)\cdot\nabla\phi_t(x)\,\rho_t(x)\,d\mathcal L^n(x).
\end{align*}
Since $t\in I$ was arbitrary, the identity holds for every $t\in I$.
[/step]
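As a worked instance of the final formula, consider the entropy functional $\mathcal F[\rho]=\int_{\mathbb R^n}\rho\log\rho\,d\mathcal L^n$. This choice is an assumption made for illustration: it requires $\rho_t>0$ and enough decay for the hypotheses above to hold, which we do not verify here. Its first variation is $\log\rho+1$, so
\begin{align*}
\nabla\left(\frac{\delta\mathcal F}{\delta\rho}(\rho_t)\right)(x)&=\nabla\big(\log\rho_t(x)+1\big)=\frac{\nabla\rho_t(x)}{\rho_t(x)},\\
\frac{d}{dt}\mathcal F[\rho_t]&=\int_{\mathbb R^n}\frac{\nabla\rho_t(x)}{\rho_t(x)}\cdot\nabla\phi_t(x)\,\rho_t(x)\,d\mathcal L^n(x)=\int_{\mathbb R^n}\nabla\rho_t(x)\cdot\nabla\phi_t(x)\,d\mathcal L^n(x).
\end{align*}
If, in addition, one takes $\phi_t=-\log\rho_t$, the continuity equation becomes the heat equation $\partial_t\rho_t=\Delta\rho_t$, and the formula gives
\begin{align*}
\frac{d}{dt}\mathcal F[\rho_t]=-\int_{\mathbb R^n}\frac{|\nabla\rho_t(x)|^2}{\rho_t(x)}\,d\mathcal L^n(x)\le 0,
\end{align*}
so the entropy is nonincreasing along this particular curve.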