[guided]The key point is that $u$ depends on $(t,x)$ in two ways: directly, through the first two arguments of $\Phi(t,x,y)$, and indirectly, through the minimizer $y=Y(t,x)$. We must therefore account for both dependencies and then use the first-order minimality condition to eliminate the indirect terms.
Let $Y_i:V \to \mathbb{R}$ be the $i$-th component of the $C^1$ map $Y:V \to \mathbb{R}^n$, so that
\begin{align*}
Y(t,x) = (Y_1(t,x),\dots,Y_n(t,x)).
\end{align*}
Since $\Phi$ is $C^1$ and $Y$ is $C^1$, the composite map $u:V \to \mathbb{R}$ defined by
\begin{align*}
u(t,x) = \Phi(t,x,Y(t,x))
\end{align*}
is $C^1$.
Fix $(t,x) \in V$ and a spatial index $j \in \{1,\dots,n\}$. The chain rule separates the direct $x_j$-dependence of $\Phi$ from the dependence through the minimizer:
\begin{align*}
\partial_{x_j}u(t,x) = \partial_{x_j}\Phi(t,x,Y(t,x)) + \sum_{i=1}^n \partial_{y_i}\Phi(t,x,Y(t,x))\,\partial_{x_j}Y_i(t,x).
\end{align*}
The Hopf--Lax formula simplifies because each summand in the second term carries a component of the $y$-gradient of $\Phi$ evaluated at the minimizer. Because $Y(t,x)$ minimizes the differentiable function $y \mapsto \Phi(t,x,y)$ over all of $\mathbb{R}^n$, with no constraint on $y$, the first-order optimality condition gives
\begin{align*}
\nabla_y\Phi(t,x,Y(t,x)) = 0.
\end{align*}
Equivalently, $\partial_{y_i}\Phi(t,x,Y(t,x))=0$ for every $i \in \{1,\dots,n\}$. Thus the whole sum involving $\partial_{x_j}Y_i(t,x)$ vanishes, and only the explicit derivative remains:
\begin{align*}
\partial_{x_j}u(t,x) = \partial_{x_j}\Phi(t,x,Y(t,x)).
\end{align*}
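Now $\Phi(t,x,y)=g(y)+|x-y|^2/(2t)$, and $g(y)$ does not depend on $x_j$. Since $|x-y|^2=\sum_{k=1}^n (x_k-y_k)^2$, differentiating the quadratic term with respect to $x_j$ at fixed $y$ gives $\partial_{x_j}\Phi(t,x,y)=(x_j-y_j)/t$. Evaluating at $y=Y(t,x)$ yields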
\begin{align*}
\partial_{x_j}u(t,x) = \frac{x_j-Y_j(t,x)}{t}.
\end{align*}
Since this holds for every spatial component $j$, we obtain the vector identity
\begin{align*}
\nabla_x u(t,x) = \frac{x-Y(t,x)}{t}.
\end{align*}
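As a sanity check, consider the illustrative choice $g(y)=\tfrac12|y|^2$ with $t>0$; this choice is an assumption made only for this example and is not part of the argument above. Then $\nabla_y\Phi(t,x,y)=y+(y-x)/t$, so the first-order condition determines the minimizer, and the gradient formula can be compared with direct differentiation of $u$:
\begin{align*}
Y(t,x) &= \frac{x}{1+t}, \\
u(t,x) &= \frac{|x|^2}{2(1+t)^2} + \frac{t\,|x|^2}{2(1+t)^2} = \frac{|x|^2}{2(1+t)}, \\
\frac{x-Y(t,x)}{t} &= \frac{x}{1+t} = \nabla_x u(t,x).
\end{align*}
In this example the derivative of $Y$ never has to be computed, which is exactly the simplification that the optimality condition provides.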
The time derivative is handled in the same way. Applying the chain rule in the $t$ variable at the same fixed point $(t,x) \in V$ gives
\begin{align*}
\partial_t u(t,x) = \partial_t\Phi(t,x,Y(t,x)) + \sum_{i=1}^n \partial_{y_i}\Phi(t,x,Y(t,x))\,\partial_tY_i(t,x).
\end{align*}
The first-order optimality condition again makes every coefficient $\partial_{y_i}\Phi(t,x,Y(t,x))$ equal to zero, so the derivative of the minimizer map $Y$ does not contribute. Therefore
\begin{align*}
\partial_t u(t,x) = \partial_t\Phi(t,x,Y(t,x)).
\end{align*}
When differentiating $\Phi(t,x,y)$ with respect to $t$, the variables $x$ and $y$ are held fixed, so $g(y)$ contributes nothing and only the quadratic term matters. Since
\begin{align*}
\frac{\partial}{\partial t}\left(\frac{|x-y|^2}{2t}\right) = -\frac{|x-y|^2}{2t^2},
\end{align*}
evaluating at $y=Y(t,x)$ gives
\begin{align*}
\partial_t u(t,x) = -\frac{|x-Y(t,x)|^2}{2t^2}.
\end{align*}[/guided]
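For instance, with the illustrative choice $g(y)=\tfrac12|y|^2$ and $t>0$ (an assumption made only for this check, not part of the argument), the minimizer is $Y(t,x)=x/(1+t)$ and $u(t,x)=|x|^2/(2(1+t))$. Then $x-Y(t,x)=tx/(1+t)$, and
\begin{align*}
-\frac{|x-Y(t,x)|^2}{2t^2} = -\frac{|x|^2}{2(1+t)^2} = \partial_t\!\left(\frac{|x|^2}{2(1+t)}\right),
\end{align*}
in agreement with the formula for $\partial_t u$.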