[guided]The goal of this step is to move the spatial derivative off the flux $A\nabla u$ and onto the [test function](/page/Test%20Function) $u$. Because the test function is the solution $u$ itself, which vanishes on $\partial U$, this produces the interior energy term with no boundary contribution.
Fix $\tau\in(0,T)$, and let $\nu:\partial U\to\mathbb{R}^n$ be the outward unit normal field. Define the vector field $F_\tau:\overline{U}\to\mathbb{R}^n$ by
\begin{align*}
F_\tau(x)=u(x,\tau)A(x,\tau)\nabla u(x,\tau).
\end{align*}
This vector field is smooth enough for the classical [divergence theorem](/theorems/3614) because $u$ is twice continuously differentiable in the spatial variables, $A$ is $C^1$, and $\partial U$ is smooth. Applying the product rule to the scalar factor $u(\cdot,\tau)$ and the vector field $A(\cdot,\tau)\nabla u(\cdot,\tau)$ gives
\begin{align*}
\operatorname{div}F_\tau(x)=\nabla u(x,\tau)\cdot A(x,\tau)\nabla u(x,\tau)+u(x,\tau)\operatorname{div}(A(x,\tau)\nabla u(x,\tau)).
\end{align*}
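As a componentwise check of this product rule, write $G=A(\cdot,\tau)\nabla u(\cdot,\tau)$ with components $G_1,\dots,G_n$. The symbols $G$ and $G_i$ are shorthand introduced only for this illustration. Applying the one-variable product rule to each summand gives
\begin{align*}
\operatorname{div}(uG)=\sum_{i=1}^n\partial_{x_i}(uG_i)=\sum_{i=1}^n(\partial_{x_i}u)G_i+u\sum_{i=1}^n\partial_{x_i}G_i=\nabla u\cdot G+u\operatorname{div}G.
\end{align*}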
The Euclidean dot product is symmetric, so $\nabla u(x,\tau)\cdot A(x,\tau)\nabla u(x,\tau)=A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)$.
Let $\mathcal{H}^{n-1}$ denote the $(n-1)$-dimensional [Hausdorff measure](/page/Hausdorff%20Measure) on $\partial U$. Now apply the divergence theorem to $F_\tau$ on $U$. The boundary term is
\begin{align*}
\int_{\partial U} F_\tau(x)\cdot\nu(x)\,d\mathcal{H}^{n-1}(x).
\end{align*}
For every $x\in\partial U$, the Dirichlet boundary condition gives $u(x,\tau)=0$, and hence $F_\tau(x)\cdot\nu(x)=0$. Therefore the boundary integral vanishes, and since the divergence theorem equates it with the integral of $\operatorname{div}F_\tau$ over $U$,
\begin{align*}
\int_U \operatorname{div}F_\tau(x)\,d\mathcal{L}^n(x)=0.
\end{align*}
Substituting the product-rule formula for $\operatorname{div}F_\tau$, with its first term rewritten using the symmetry of the dot product, gives
\begin{align*}
\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x)+\int_U u(x,\tau)\operatorname{div}(A(x,\tau)\nabla u(x,\tau))\,d\mathcal{L}^n(x)=0.
\end{align*}
Rearranging this equality gives
\begin{align*}
\int_U u(x,\tau)\operatorname{div}(A(x,\tau)\nabla u(x,\tau))\,d\mathcal{L}^n(x)=-\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x).
\end{align*}
This is exactly the spatial integration-by-parts identity needed for the energy calculation.
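As a sanity check, consider a one-dimensional instance. The following choices are illustrative assumptions and are not part of the setting above: $n=1$, $U=(0,1)$, and, at the fixed time $\tau$, $u(x,\tau)=\sin(\pi x)$ and $A(x,\tau)=1+x$. Then $u$ vanishes at both endpoints, and, writing primes for derivatives in $x$ and $dx$ for $d\mathcal{L}^1(x)$, both sides of the identity can be computed directly:
\begin{align*}
\int_0^1 u\,(Au')'\,dx&=\int_0^1\sin(\pi x)\bigl(\pi\cos(\pi x)-\pi^2(1+x)\sin(\pi x)\bigr)\,dx=-\frac{3\pi^2}{4},\\
-\int_0^1 A\,(u')^2\,dx&=-\pi^2\int_0^1(1+x)\cos^2(\pi x)\,dx=-\frac{3\pi^2}{4}.
\end{align*}
The two values agree, as the identity predicts for this particular choice.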
The term "energy" is justified by uniform ellipticity. Let $\theta>0$ be an ellipticity constant for $A$, so that
\begin{align*}
A(x,\tau)\xi\cdot\xi\geq \theta |\xi|^2
\end{align*}
for every $x\in U$, every $\tau\in(0,T)$, and every $\xi\in\mathbb{R}^n$. Choosing $\xi=\nabla u(x,\tau)$ and integrating over $U$ with respect to $\mathcal{L}^n$ gives
\begin{align*}
\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x)\geq \theta\int_U |\nabla u(x,\tau)|^2\,d\mathcal{L}^n(x).
\end{align*}
Thus the spatial term controls the squared $L^2(U)$ norm of the gradient at time $\tau$.[/guided]