[proofplan]
The first assertion is the pushforward part of [Brenier's theorem](/theorems/7477) written in the language of transport Alexandrov solutions: the Brenier map agrees with the gradient of the convex potential outside a $\mu$-null set, and a pushforward is unchanged when the map is modified on a null set. In the smooth setting, the diffeomorphism hypothesis allows the classical change-of-variables formula to be applied to $\nabla\phi:U\to V$. This identifies the source density with the target density composed with $\nabla\phi$, times the absolute value of the Jacobian determinant; convexity makes the Hessian positive semidefinite, so this absolute value is just $\det D^2\phi$. Finally, continuity upgrades the almost-everywhere identity to a pointwise identity.
[/proofplan]
[step:Replace the Brenier map by the gradient representative]
Let $D_\phi\subset \mathbb{R}^n$ denote the set of differentiability points of $\phi$. Since $\phi$ is assumed to be finite and convex on all of $\mathbb{R}^n$, the almost-everywhere differentiability theorem for finite convex functions on open convex subsets of Euclidean space applies with domain $\mathbb{R}^n$. Hence $D_\phi$ has full $\mathcal{L}^n$-measure. Since $\mu\ll \mathcal{L}^n$, it also has full $\mu$-measure. Choose a measurable map
\begin{align*}
G:\mathbb{R}^n \to \mathbb{R}^n
\end{align*}
such that $G(x)=\nabla\phi(x)$ for every $x\in D_\phi$ and such that $G$ is defined arbitrarily on $\mathbb{R}^n\setminus D_\phi$.
By Brenier's theorem applied to the absolutely continuous source measure $\mu$ and target measure $\nu$, the Brenier map $T:\mathbb{R}^n\to\mathbb{R}^n$ satisfies
\begin{align*}
T_{\#}\mu=\nu
\end{align*}
and $T(x)=\nabla\phi(x)=G(x)$ for $\mu$-almost every $x\in\mathbb{R}^n$. The external result used here is Brenier's theorem for quadratic-cost optimal transport: for a Borel probability source measure absolutely continuous with respect to $\mathcal{L}^n$ and a Borel probability target measure on $\mathbb{R}^n$, there exists a convex Brenier potential whose gradient representative transports the source to the target. The hypotheses match because $\mu$ and $\nu$ are Borel probability measures and $\mu\ll\mathcal{L}^n$.
For every Borel set $A\subset\mathbb{R}^n$, the sets $T^{-1}(A)$ and $G^{-1}(A)$ differ by a subset of the $\mu$-null set $\{x\in\mathbb{R}^n:T(x)\ne G(x)\}$. Hence
\begin{align*}
G_{\#}\mu(A)=\mu(G^{-1}(A))=\mu(T^{-1}(A))=T_{\#}\mu(A)=\nu(A).
\end{align*}
Thus $G_{\#}\mu=\nu$. In the single-valued almost-everywhere convention, $G$ is denoted by $\nabla\phi$, so
\begin{align*}
(\nabla\phi)_{\#}\mu=\nu.
\end{align*}
This is exactly the transport Alexandrov solution condition in the single-valued almost-everywhere sense.
[guided]
Let us make precise why the first part is not an additional regularity statement. The external input is Brenier's theorem for quadratic-cost optimal transport: if the source and target are Borel probability measures on $\mathbb{R}^n$ and the source is absolutely continuous with respect to $\mathcal{L}^n$, then the optimal transport map exists, is represented almost everywhere by the gradient of a convex Brenier potential, and pushes the source measure to the target. These are exactly the hypotheses on $\mu$ and $\nu$ in the theorem statement.
Let $D_\phi\subset\mathbb{R}^n$ be the differentiability set of the convex function $\phi$. The differentiability input is the almost-everywhere differentiability theorem for finite convex functions on open convex subsets of Euclidean space. Its hypotheses are satisfied here because the statement assumes $\phi:\mathbb{R}^n\to\mathbb{R}$ is finite and convex, and $\mathbb{R}^n$ is open and convex. Therefore $D_\phi$ has full $\mathcal{L}^n$-measure. The hypothesis $\mu\ll\mathcal{L}^n$ then transfers this to $\mu(\mathbb{R}^n\setminus D_\phi)=0$. To speak about a pushforward, we choose a measurable representative
\begin{align*}
G:\mathbb{R}^n \to \mathbb{R}^n
\end{align*}
which agrees with $\nabla\phi$ on $D_\phi$ and is defined arbitrarily outside $D_\phi$. The arbitrary values do not matter, because that exceptional set has $\mu$-measure zero.
Brenier's theorem gives a transport map $T:\mathbb{R}^n\to\mathbb{R}^n$ satisfying
\begin{align*}
T_{\#}\mu=\nu
\end{align*}
and $T=G$ $\mu$-almost everywhere. We now check directly that changing a map on a null set does not change its pushforward measure. If $A\subset\mathbb{R}^n$ is Borel, then $T^{-1}(A)$ and $G^{-1}(A)$ can disagree only at points where $T$ and $G$ disagree. Therefore their symmetric difference is contained in a $\mu$-null set, and so
\begin{align*}
G_{\#}\mu(A)=\mu(G^{-1}(A))=\mu(T^{-1}(A))=T_{\#}\mu(A)=\nu(A).
\end{align*}
Since this holds for every Borel set $A$, we have $G_{\#}\mu=\nu$. The single-valued almost-everywhere definition of a transport Alexandrov solution records this identity using the notation $\nabla\phi$ for the chosen almost-everywhere gradient representative. Thus
\begin{align*}
(\nabla\phi)_{\#}\mu=\nu.
\end{align*}
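As a concrete illustration of why the values of $G$ outside $D_\phi$ are irrelevant, consider the following one-dimensional example. The specific $\phi$ and $\mu$ are assumptions chosen only for illustration and are not part of the theorem. Take $n=1$, $\phi(x)=|x|+\tfrac12 x^2$, and $\mu=\tfrac12\,\mathcal{L}^1\!\restriction_{[-1,1]}$. Then $\phi$ is finite and convex, $D_\phi=\mathbb{R}\setminus\{0\}$, and
\begin{align*}
G(x)=\nabla\phi(x)=x+\operatorname{sgn}(x)\qquad\text{for } x\ne 0,
\end{align*}
while $G(0)$ may be any real number. For a Borel set $A\subset\mathbb{R}$, since $\mu(\{0\})=0$ and $\mathcal{L}^1$ is translation invariant,
\begin{align*}
\mu(G^{-1}(A))&=\tfrac12\,\mathcal{L}^1(\{x\in(0,1]:x+1\in A\})+\tfrac12\,\mathcal{L}^1(\{x\in[-1,0):x-1\in A\})\\
&=\tfrac12\,\mathcal{L}^1(A\cap(1,2])+\tfrac12\,\mathcal{L}^1(A\cap[-2,-1)).
\end{align*}
Hence, in this example, $(\nabla\phi)_{\#}\mu=\tfrac12\,\mathcal{L}^1\!\restriction_{[-2,-1]\cup[1,2]}$ regardless of the value assigned to $G(0)$.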
[/guided]
[/step]
[step:Apply change of variables to the smooth transport map]
Assume now the additional smooth hypotheses. Define the map
\begin{align*}
F:U &\to V
\end{align*}
by $F(x)=\nabla\phi(x)$. By hypothesis, $F$ is a diffeomorphism from $U$ onto $V$, and the pushforward identity says
\begin{align*}
F_{\#}(\rho_0\,\mathcal{L}^n\!\restriction_U)=\rho_1\,\mathcal{L}^n\!\restriction_V.
\end{align*}
Because $F(U)=V$ and $\rho_1:V\to(0,\infty)$, the composition $\rho_1\circ F$ is defined on all of $U$ and is strictly positive there.
Let $a:U\to[0,\infty)$ be the measurable function
\begin{align*}
a(x)=\rho_1(F(x))\,|\det JF_x|,
\end{align*}
where $JF_x$ is the Jacobian matrix of $F$ at $x$. Let $\psi:V\to[0,\infty]$ be an arbitrary nonnegative Borel function. The classical change-of-variables formula for $C^1$ diffeomorphisms between open subsets of $\mathbb{R}^n$, applied to the diffeomorphism $F:U\to V$ and the nonnegative Borel function $y\mapsto \psi(y)\rho_1(y)$, gives
\begin{align*}
\int_V \psi(y)\rho_1(y)\,d\mathcal{L}^n(y)=\int_U \psi(F(x))\rho_1(F(x))|\det JF_x|\,d\mathcal{L}^n(x).
\end{align*}
On the other hand, the pushforward identity gives
\begin{align*}
\int_V \psi(y)\rho_1(y)\,d\mathcal{L}^n(y)=\int_U \psi(F(x))\rho_0(x)\,d\mathcal{L}^n(x).
\end{align*}
Comparing these identities yields
\begin{align*}
\int_U \psi(F(x))\rho_0(x)\,d\mathcal{L}^n(x)=\int_U \psi(F(x))a(x)\,d\mathcal{L}^n(x)
\end{align*}
for every nonnegative Borel function $\psi:V\to[0,\infty]$.
Now let $E\subset U$ be an arbitrary Borel set. Since $F^{-1}:V\to U$ is continuous, $F(E)=(F^{-1})^{-1}(E)$ is a Borel subset of $V$, so we may choose $\psi=\mathbb{1}_{F(E)}$. Since $F$ is injective, $\psi(F(x))=\mathbb{1}_E(x)$ for every $x\in U$. Hence
\begin{align*}
\int_E \rho_0(x)\,d\mathcal{L}^n(x)=\int_E a(x)\,d\mathcal{L}^n(x)
\end{align*}
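for every Borel set $E\subset U$. Since $\rho_0\,\mathcal{L}^n\!\restriction_U$ is a probability measure, $\rho_0$ is integrable on $U$, and taking $E=U$ shows that $a$ is integrable as well. Two integrable functions with equal integrals over every Borel set agree almost everywhere. Therefore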
\begin{align*}
\rho_0(x)=\rho_1(F(x))|\det JF_x|
\end{align*}
for $\mathcal{L}^n$-almost every $x\in U$.
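To see the density identity in a case where everything can be computed by hand, consider the following one-dimensional sketch. The choices of $\phi$, $U$, $V$, $\rho_0$, and $\rho_1$ are assumptions made only for this illustration. Take $n=1$, $\phi(x)=\tfrac13|x|^3$, and $U=V=(0,1)$, so that $\phi$ is finite and convex on $\mathbb{R}$ and $F(x)=\nabla\phi(x)=x^2$ is a diffeomorphism of $(0,1)$ onto itself with inverse $y\mapsto\sqrt{y}$. Take $\rho_1\equiv 1$ on $V$ and $\rho_0(x)=2x$ on $U$. The substitution $y=x^2$ gives, for every nonnegative Borel function $\psi$ on $V$,
\begin{align*}
\int_0^1 \psi(y)\,dy=\int_0^1 \psi(x^2)\,2x\,dx,
\end{align*}
which is the pushforward identity $F_{\#}(\rho_0\,\mathcal{L}^1\!\restriction_U)=\rho_1\,\mathcal{L}^1\!\restriction_V$ for these densities. The function $a$ from this step is then
\begin{align*}
a(x)=\rho_1(F(x))\,|\det JF_x|=1\cdot|2x|=2x=\rho_0(x),
\end{align*}
in agreement with the almost-everywhere identity just derived.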
[/step]
[step:Identify the Jacobian determinant with the Monge-Ampère determinant]
Because $\phi\in C^2(U)$ and $F=\nabla\phi$, the Jacobian matrix $JF_x$ is the Hessian matrix $D^2\phi(x)$ for every $x\in U$. Since $\phi$ is convex, $D^2\phi(x)$ is positive semidefinite for every $x\in U$. Hence
\begin{align*}
|\det JF_x|=|\det D^2\phi(x)|=\det D^2\phi(x).
\end{align*}
Substituting this into the almost-everywhere density identity from the previous step gives
\begin{align*}
\rho_0(x)=\rho_1(\nabla\phi(x))\det D^2\phi(x)
\end{align*}
for $\mathcal{L}^n$-almost every $x\in U$. Since $\rho_1$ is positive, division by $\rho_1(\nabla\phi(x))$ is valid, and we obtain
\begin{align*}
\det D^2\phi(x)=\frac{\rho_0(x)}{\rho_1(\nabla\phi(x))}
\end{align*}
for $\mathcal{L}^n$-almost every $x\in U$.
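A standard example in which both sides of this identity can be computed explicitly is a linear map between centered Gaussians. The matrix $A$ and the densities below are assumptions introduced only for this illustration. Let $A$ be a symmetric positive definite $n\times n$ matrix, let $U=V=\mathbb{R}^n$, and set
\begin{align*}
\phi(x)&=\tfrac12\,x^{\mathsf T}Ax, & \rho_0(x)&=(2\pi)^{-n/2}e^{-|x|^2/2}, & \rho_1(y)&=(2\pi)^{-n/2}(\det A)^{-1}e^{-|A^{-1}y|^2/2}.
\end{align*}
Then $\nabla\phi(x)=Ax$ is a linear diffeomorphism pushing the standard Gaussian to the centered Gaussian with covariance $A^2$, whose density is $\rho_1$, and $D^2\phi(x)=A$. Since $|A^{-1}(Ax)|=|x|$,
\begin{align*}
\frac{\rho_0(x)}{\rho_1(\nabla\phi(x))}=\frac{(2\pi)^{-n/2}e^{-|x|^2/2}}{(2\pi)^{-n/2}(\det A)^{-1}e^{-|x|^2/2}}=\det A=\det D^2\phi(x)
\end{align*}
for every $x\in\mathbb{R}^n$, as the identity predicts.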
[/step]
[step:Upgrade the identity from almost everywhere to everywhere under continuity]
Assume finally that $\rho_0$ and $\rho_1$ are continuous. Define
\begin{align*}
h:U &\to \mathbb{R}
\end{align*}
by
\begin{align*}
h(x)=\det D^2\phi(x)-\frac{\rho_0(x)}{\rho_1(\nabla\phi(x))}.
\end{align*}
Since $\phi\in C^2(U)$, the function $x\mapsto \det D^2\phi(x)$ is continuous. Since $\rho_0$ is continuous on $U$, the map $\nabla\phi:U\to V$ is continuous, and $\rho_1$ is continuous and positive on $V$, the quotient $x\mapsto \rho_0(x)/\rho_1(\nabla\phi(x))$ is continuous. Therefore $h$ is continuous on $U$.
The previous step gives $h=0$ $\mathcal{L}^n$-almost everywhere on $U$. If there were a point $x_0\in U$ with $h(x_0)\ne 0$, continuity would give an open ball $B(x_0,r)\subset U$ and a number $\varepsilon>0$ such that $|h(x)|\ge\varepsilon$ for every $x\in B(x_0,r)$. Since every nonempty Euclidean open ball has positive $\mathcal{L}^n$-measure, this contradicts $h=0$ $\mathcal{L}^n$-almost everywhere. Hence $h(x)=0$ for every $x\in U$, which is the desired pointwise identity.
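The continuity hypothesis cannot simply be dropped, as the following one-dimensional sketch suggests. All specific choices here are assumptions made only for illustration. Take $n=1$, $U=V=(0,1)$, $\phi(x)=\tfrac13|x|^3$, and $\rho_1\equiv 1$, so that $\nabla\phi(x)=x^2$ and $\det D^2\phi(x)=2x$ on $U$. Define
\begin{align*}
\tilde\rho_0(x)=\begin{cases}2x, & x\ne \tfrac12,\\ 5, & x=\tfrac12.\end{cases}
\end{align*}
Since $\tilde\rho_0$ differs from $x\mapsto 2x$ only on an $\mathcal{L}^1$-null set, the measure $\tilde\rho_0\,\mathcal{L}^1\!\restriction_U$ equals $2x\,\mathcal{L}^1\!\restriction_U$, and the substitution $y=x^2$ shows that $\nabla\phi$ pushes it to $\rho_1\,\mathcal{L}^1\!\restriction_V$. Nevertheless,
\begin{align*}
\det D^2\phi\bigl(\tfrac12\bigr)=1\ne 5=\frac{\tilde\rho_0\bigl(\tfrac12\bigr)}{\rho_1\bigl(\tfrac14\bigr)},
\end{align*}
so with this discontinuous density the identity holds almost everywhere but not at every point of $U$.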
[/step]