[proofplan]
We encode the variational expression as a $C^1$ function $\Phi(t,x,y)$ and use the first-order optimality condition at the unique minimizer $Y(t,x)$. Since $\nabla_y \Phi(t,x,Y(t,x))=0$, every chain-rule contribution coming from differentiating the minimizer map $Y$ vanishes. The remaining explicit derivatives give $\nabla_x u(t,x) = (x-Y(t,x))/t$ and $\partial_t u(t,x) = -|x-Y(t,x)|^2/(2t^2)$, and substituting these at $(t_0,x_0)$ proves the Hamilton-Jacobi equation at that point.
[/proofplan]
[step:Introduce the objective function and record the first-order optimality condition]
Define the function
\begin{align*}
\Phi: V \times \mathbb{R}^n \to \mathbb{R}
\end{align*}
by
\begin{align*}
\Phi(t,x,y) = g(y) + \frac{|x-y|^2}{2t}.
\end{align*}
Since $g \in C^1(\mathbb{R}^n)$ and $t>0$ on $V$, the map $\Phi$ is $C^1$ in all variables. For each $(t,x) \in V$, the point $Y(t,x)$ is a minimizer of the differentiable function $y \mapsto \Phi(t,x,y)$ on the [open set](/page/Open%20Set) $\mathbb{R}^n$. Therefore its gradient with respect to $y$ vanishes:
\begin{align*}
\nabla_y \Phi(t,x,Y(t,x)) = 0.
\end{align*}
Computing the $y$-gradient gives
\begin{align*}
\nabla g(Y(t,x)) - \frac{x-Y(t,x)}{t} = 0.
\end{align*}
Thus, for every $(t,x) \in V$,
\begin{align*}
\nabla g(Y(t,x)) = \frac{x-Y(t,x)}{t}.
\end{align*}
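As a quick illustration of this condition, assume for the sake of example (this choice is not part of the statement) that $g(y)=\tfrac12|y|^2$ and $V=(0,\infty)\times\mathbb{R}^n$. Then $\nabla g(y)=y$, and the optimality condition reads
\begin{align*}
Y(t,x) = \frac{x-Y(t,x)}{t}, \qquad\text{so}\qquad Y(t,x) = \frac{x}{1+t}.
\end{align*}
For this particular $g$ the function $y \mapsto \Phi(t,x,y)$ is strictly convex, so this critical point is the unique minimizer, and the resulting map $Y$ is $C^1$ on $V$.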
[/step]
[step:Differentiate the value function and cancel the minimizer derivatives]
Let $Y_i:V \to \mathbb{R}$ denote the $i$-th component of $Y$, and write
\begin{align*}
Y(t,x) = (Y_1(t,x),\dots,Y_n(t,x)).
\end{align*}
Since $Y \in C^1(V;\mathbb{R}^n)$ and $\Phi \in C^1(V \times \mathbb{R}^n)$, the composite function
\begin{align*}
u:V \to \mathbb{R}
\end{align*}
given by
\begin{align*}
u(t,x) = \Phi(t,x,Y(t,x))
\end{align*}
is $C^1$ on $V$.
Fix $(t,x) \in V$. For each spatial index $j \in \{1,\dots,n\}$, the chain rule gives
\begin{align*}
\partial_{x_j}u(t,x) = \partial_{x_j}\Phi(t,x,Y(t,x)) + \sum_{i=1}^n \partial_{y_i}\Phi(t,x,Y(t,x))\,\partial_{x_j}Y_i(t,x).
\end{align*}
The first-order optimality condition gives $\partial_{y_i}\Phi(t,x,Y(t,x))=0$ for every $i$, so the sum vanishes. Hence
\begin{align*}
\partial_{x_j}u(t,x) = \partial_{x_j}\Phi(t,x,Y(t,x)).
\end{align*}
Differentiating the explicit quadratic term with respect to $x_j$ gives
\begin{align*}
\partial_{x_j}u(t,x) = \frac{x_j-Y_j(t,x)}{t}.
\end{align*}
Therefore
\begin{align*}
\nabla_x u(t,x) = \frac{x-Y(t,x)}{t}.
\end{align*}
The same chain-rule cancellation applies to the time derivative:
\begin{align*}
\partial_t u(t,x) = \partial_t\Phi(t,x,Y(t,x)) + \sum_{i=1}^n \partial_{y_i}\Phi(t,x,Y(t,x))\,\partial_tY_i(t,x).
\end{align*}
Again the sum is zero, so $\partial_t u(t,x) = \partial_t\Phi(t,x,Y(t,x))$. Differentiating the explicit factor $1/(2t)$ with $x$ and $y$ held fixed gives
\begin{align*}
\partial_t u(t,x) = -\frac{|x-Y(t,x)|^2}{2t^2}.
\end{align*}
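These two formulas can be sanity-checked on a closed-form case. Assume, purely for illustration, that $g(y)=\tfrac12|y|^2$ and $V=(0,\infty)\times\mathbb{R}^n$, for which the minimizer is $Y(t,x)=x/(1+t)$. Substituting into $\Phi$ gives
\begin{align*}
u(t,x) = \frac{|x|^2}{2(1+t)^2} + \frac{t\,|x|^2}{2(1+t)^2} = \frac{|x|^2}{2(1+t)}.
\end{align*}
Differentiating this closed form directly, and using $x-Y(t,x) = tx/(1+t)$, we find
\begin{align*}
\nabla_x u(t,x) = \frac{x}{1+t} = \frac{x-Y(t,x)}{t}, \qquad \partial_t u(t,x) = -\frac{|x|^2}{2(1+t)^2} = -\frac{|x-Y(t,x)|^2}{2t^2},
\end{align*}
in agreement with the general identities.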
[guided]
The important point is that $u$ depends on $(t,x)$ in two ways: directly through the variables in $\Phi(t,x,y)$, and indirectly through the minimizer $y=Y(t,x)$. We therefore must account for both dependences and then use the minimizing condition to remove the indirect terms.
Let $Y_i:V \to \mathbb{R}$ be the $i$-th component of the $C^1$ map $Y:V \to \mathbb{R}^n$, so that
\begin{align*}
Y(t,x) = (Y_1(t,x),\dots,Y_n(t,x)).
\end{align*}
Since $\Phi$ is $C^1$ and $Y$ is $C^1$, the composite map
\begin{align*}
u:V \to \mathbb{R}
\end{align*}
defined by
\begin{align*}
u(t,x) = \Phi(t,x,Y(t,x))
\end{align*}
is $C^1$.
Fix $(t,x) \in V$ and a spatial index $j \in \{1,\dots,n\}$. The chain rule separates the direct $x_j$-dependence of $\Phi$ from the dependence through the minimizer:
\begin{align*}
\partial_{x_j}u(t,x) = \partial_{x_j}\Phi(t,x,Y(t,x)) + \sum_{i=1}^n \partial_{y_i}\Phi(t,x,Y(t,x))\,\partial_{x_j}Y_i(t,x).
\end{align*}
The reason the Hopf-Lax formula becomes simple is that the second term is multiplied by the $y$-gradient of $\Phi$ at the minimizing point. Because $Y(t,x)$ minimizes the differentiable function $y \mapsto \Phi(t,x,y)$ on all of $\mathbb{R}^n$, the first-order optimality condition gives
\begin{align*}
\nabla_y\Phi(t,x,Y(t,x)) = 0.
\end{align*}
Equivalently, $\partial_{y_i}\Phi(t,x,Y(t,x))=0$ for every $i \in \{1,\dots,n\}$. Thus the whole sum involving $\partial_{x_j}Y_i(t,x)$ vanishes, and only the explicit derivative remains:
\begin{align*}
\partial_{x_j}u(t,x) = \partial_{x_j}\Phi(t,x,Y(t,x)).
\end{align*}
Now $\Phi(t,x,y)=g(y)+|x-y|^2/(2t)$, and $g(y)$ has no direct dependence on $x_j$. Differentiating the quadratic term with respect to $x_j$ gives
\begin{align*}
\partial_{x_j}u(t,x) = \frac{x_j-Y_j(t,x)}{t}.
\end{align*}
Since this holds for every spatial component $j$, we obtain the vector identity
\begin{align*}
\nabla_x u(t,x) = \frac{x-Y(t,x)}{t}.
\end{align*}
The time derivative is handled in exactly the same chain-rule framework. Applying the chain rule in the $t$ variable gives
\begin{align*}
\partial_t u(t,x) = \partial_t\Phi(t,x,Y(t,x)) + \sum_{i=1}^n \partial_{y_i}\Phi(t,x,Y(t,x))\,\partial_tY_i(t,x).
\end{align*}
The first-order optimality condition again makes every coefficient $\partial_{y_i}\Phi(t,x,Y(t,x))$ equal to zero, so the derivative of the minimizer map $Y$ does not contribute. Therefore
\begin{align*}
\partial_t u(t,x) = \partial_t\Phi(t,x,Y(t,x)).
\end{align*}
When differentiating $\Phi(t,x,y)$ with respect to $t$, the variables $x$ and $y$ are held fixed. Since
\begin{align*}
\frac{\partial}{\partial t}\left(\frac{|x-y|^2}{2t}\right) = -\frac{|x-y|^2}{2t^2},
\end{align*}
we get
\begin{align*}
\partial_t u(t,x) = -\frac{|x-Y(t,x)|^2}{2t^2}.
\end{align*}
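To see the cancellation concretely, consider the hypothetical one-dimensional case $n=1$ with $g(y)=\tfrac12 y^2$ (an assumption made only for this illustration), where $Y(t,x)=x/(1+t)$. The derivatives of the minimizer map are not zero:
\begin{align*}
\partial_x Y(t,x) = \frac{1}{1+t}, \qquad \partial_t Y(t,x) = -\frac{x}{(1+t)^2}.
\end{align*}
However, the coefficient multiplying them in the chain rule is
\begin{align*}
\partial_y\Phi(t,x,Y(t,x)) = Y(t,x) - \frac{x-Y(t,x)}{t} = \frac{x}{1+t} - \frac{x}{1+t} = 0.
\end{align*}
So the indirect terms drop out not because $Y$ is constant, but because they are weighted by the vanishing $y$-derivative of $\Phi$ at the minimizer.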
[/guided]
[/step]
[step:Substitute the derivative formulas into the Hamilton-Jacobi expression]
Evaluating the two derivative identities at $(t_0,x_0)$ gives
\begin{align*}
\nabla_x u(t_0,x_0) = \frac{x_0-Y(t_0,x_0)}{t_0}
\end{align*}
and
\begin{align*}
\partial_t u(t_0,x_0) = -\frac{|x_0-Y(t_0,x_0)|^2}{2t_0^2}.
\end{align*}
Hence
\begin{align*}
\partial_t u(t_0,x_0) + \frac{1}{2}|\nabla_x u(t_0,x_0)|^2 = -\frac{|x_0-Y(t_0,x_0)|^2}{2t_0^2} + \frac{1}{2}\left|\frac{x_0-Y(t_0,x_0)}{t_0}\right|^2.
\end{align*}
Using homogeneity of the Euclidean norm,
\begin{align*}
\left|\frac{x_0-Y(t_0,x_0)}{t_0}\right|^2 = \frac{|x_0-Y(t_0,x_0)|^2}{t_0^2}.
\end{align*}
Therefore
\begin{align*}
\partial_t u(t_0,x_0) + \frac{1}{2}|\nabla_x u(t_0,x_0)|^2 = 0.
\end{align*}
This is precisely the Hamilton-Jacobi equation $\partial_t u + H(\nabla_x u) = 0$ with the quadratic Hamiltonian $H(p)=\tfrac12|p|^2$, evaluated at $(t_0,x_0)$.
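As a sanity check on an explicit case, assume (for illustration only; the vector $a \in \mathbb{R}^n$ and this choice of initial datum are not part of the statement) that $g(y)=a\cdot y$. The optimality condition $\nabla g(Y(t,x)) = (x-Y(t,x))/t$ then gives $Y(t,x)=x-ta$, which is the unique minimizer because $y \mapsto g(y)+|x-y|^2/(2t)$ is strictly convex. Hence
\begin{align*}
u(t,x) = a\cdot(x-ta) + \frac{|ta|^2}{2t} = a\cdot x - \frac{t}{2}|a|^2.
\end{align*}
Direct differentiation gives $\nabla_x u(t,x) = a$ and $\partial_t u(t,x) = -\tfrac12|a|^2$, so
\begin{align*}
\partial_t u(t,x) + \frac{1}{2}|\nabla_x u(t,x)|^2 = -\frac{1}{2}|a|^2 + \frac{1}{2}|a|^2 = 0
\end{align*}
at every point with $t>0$, consistent with the computation above.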
[/step]