[guided]The obstruction in the weak setting is that the pointwise tangent projection $d\pi_{u(x)}(\phi(x))$ need not be a smooth tangent field along the Sobolev map $u$. We avoid testing directly against that projected field. Instead, we build an actual admissible variation through maps into $N$ and differentiate the energy with respect to the scalar variation parameter.
Since $N$ is compact and embedded, the [Tubular Neighbourhood Theorem](/theorems/???) gives an open neighbourhood $\mathcal{T}\subset\mathbb{R}^q$ of $N$ and a smooth nearest-point projection
\begin{align*}
\pi:\mathcal{T}\to N.
\end{align*}
For every $p\in N$, the differential $d\pi_p:\mathbb{R}^q\to T_pN$ is the orthogonal projection onto $T_pN$. Because $\phi$ has compact support and $N$ is compact, the set of points $u(x)+t\phi(x)$ remains in $\mathcal{T}$ for all sufficiently small $|t|$; choose $\varepsilon>0$ with this property.
For $t\in(-\varepsilon,\varepsilon)$ define
\begin{align*}
u_t:U&\to N
\end{align*}
by
\begin{align*}
u_t(x)&=\pi(u(x)+t\phi(x)).
\end{align*}
The map $u_t$ is a Sobolev map into $N$ because $u\in W^{1,2}(U;N)$, $\phi$ is smooth and compactly supported, and composition with the smooth map $\pi$ is valid on the compact subset of $\mathcal{T}$ reached by the variation. The Sobolev chain rule gives, for almost every $x\in U$,
\begin{align*}
\left.\frac{d}{dt}\right|_{t=0}u_t(x)=d\pi_{u(x)}(\phi(x))
\end{align*}
and
\begin{align*}
\left.\frac{d}{dt}\right|_{t=0}\partial_{x_\alpha}u_t(x)
=
d\pi_{u(x)}(\partial_{x_\alpha}\phi(x))
+
d^2\pi_{u(x)}(\partial_{x_\alpha}u(x),\phi(x)).
\end{align*}
This is the replacement for the invalid computation with second derivatives of $u$: all derivatives fall either on the smooth variation field $\phi$ or on the smooth projection $\pi$.
We must also justify moving the derivative in $t$ through the energy integral. Let $K\subset U$ be a compact set containing $\operatorname{supp}\phi$, and choose constants $M_1,M_2>0$ bounding $d\pi$ and $d^2\pi$ on the compact subset reached by $u(x)+t\phi(x)$. The chain-rule formula bounds the difference quotients of $\partial_{x_\alpha}u_t$ by a constant multiple of $|\partial_{x_\alpha}\phi|+|\partial_{x_\alpha}u|\,|\phi|$. After pairing with $\partial_{x_\alpha}u$, the first-variation integrand is dominated by a constant multiple of $|du|\,|d\phi|+|du|^2|\phi|$. This function lies in $L^1(K,d\operatorname{vol}_g)$ because $du\in L^2_{\mathrm{loc}}$ and $\phi,d\phi$ are bounded with compact support. Therefore the dominated convergence theorem permits differentiating the Dirichlet energy under the integral sign.
Weak harmonicity says that the first derivative at $t=0$ of the Dirichlet energy of this projected variation vanishes. Thus
\begin{align*}
0
=
\int_U g^{\alpha\beta}\,\partial_{x_\alpha}u\cdot\left(d\pi_u(\partial_{x_\beta}\phi)+d^2\pi_u(\partial_{x_\beta}u,\phi)\right)\,d\operatorname{vol}_g.
\end{align*}
Since $\partial_{x_\alpha}u(x)$ is tangent to $N$ at $u(x)$ for almost every $x$, and since $d\pi_{u(x)}$ is the orthogonal projection onto $T_{u(x)}N$, we may remove the projection from the first term after pairing with $\partial_{x_\alpha}u$:
\begin{align*}
\partial_{x_\alpha}u\cdot d\pi_u(\partial_{x_\beta}\phi)
=
\partial_{x_\alpha}u\cdot \partial_{x_\beta}\phi.
\end{align*}
The second derivative of the nearest-point projection records the curvature of the embedded submanifold. With the convention $A_p(X,Y)=(D_XY)^\perp$, fix $p\in N$, $X,Z\in T_pN$, and $V\in\mathbb{R}^q$. Differentiating the projection identity $d\pi_p(Z)=Z$ in the ambient direction $V$ and using the Weingarten relation for the normal component of $V$ gives
\begin{align*}
Z\cdot d^2\pi_p(X,V)=-A_p(Z,X)\cdot V.
\end{align*}
The tangent component of $V$ gives zero on the right because $A_p(Z,X)$ is normal, and it gives zero on the left after pairing with $Z$ because $d^2\pi_p(X,V^\top)$ is normal. Applying the identity with $p=u(x)$, $Z=\partial_{x_\alpha}u(x)$, $X=\partial_{x_\beta}u(x)$, and $V=\phi(x)$ gives
\begin{align*}
\partial_{x_\alpha}u\cdot d^2\pi_u(\partial_{x_\beta}u,\phi)
=
-A_u(\partial_{x_\alpha}u,\partial_{x_\beta}u)\cdot\phi.
\end{align*}
Substitution gives
\begin{align*}
0
=
\int_U g^{\alpha\beta}\,\partial_{x_\alpha}u\cdot\partial_{x_\beta}\phi\,d\operatorname{vol}_g
-
\int_U A_u(du,du)\cdot\phi\,d\operatorname{vol}_g.
\end{align*}
Using the definition
\begin{align*}
\Delta_g u(\phi)
=
-\int_U g^{\alpha\beta}\,\partial_{x_\alpha}u\cdot\partial_{x_\beta}\phi\,d\operatorname{vol}_g,
\end{align*}
we obtain
\begin{align*}
\Delta_g u(\phi)
=
-\int_U A_u(du,du)\cdot\phi\,d\operatorname{vol}_g.
\end{align*}
This is exactly the desired distributional identity when the term $A_u(du,du)$ is interpreted as the [regular distribution](/page/Regular%20Distribution) defined by integration against $d\operatorname{vol}_g$.[/guided]