[proofplan]
We test the parabolic equation against the solution $u$ itself. Differentiating under the integral sign turns the time-derivative term into the derivative of half the squared $L^2$ norm. The divergence theorem turns the divergence term into the coercive spatial energy, with no boundary contribution because $u$ vanishes on $\partial U$. Integrating the resulting differential identity from $s$ to $t$ gives the asserted formula.
[/proofplan]
[step:Differentiate the squared $L^2$ norm in time]
Define the energy function $E:[0,T]\to\mathbb{R}$ by
\begin{align*}
E(\tau)=\frac{1}{2}\int_U u(x,\tau)^2\,d\mathcal{L}^n(x).
\end{align*}
Since $u$ and $\partial_t u$ are continuous on the compact set $\overline{U}\times[0,T]$, differentiating under the integral sign gives, for every $\tau\in(0,T)$,
\begin{align*}
E'(\tau)=\int_U u(x,\tau)\partial_t u(x,\tau)\,d\mathcal{L}^n(x).
\end{align*}
Equivalently,
\begin{align*}
E'(\tau)=(\partial_t u(\cdot,\tau),u(\cdot,\tau))_{L^2(U)}.
\end{align*}
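The interchange of derivative and integral can be spelled out as follows. This is a sketch in which $M$ is notation introduced here for the maximum of $|u|$ and $|\partial_t u|$ over the compact set $\overline{U}\times[0,T]$; it is not a constant from the statement. The pointwise chain rule and the resulting uniform bound are
\begin{align*}
\partial_\tau\left(\tfrac{1}{2}u(x,\tau)^2\right)&=u(x,\tau)\partial_t u(x,\tau),\\
\left|u(x,\tau)\partial_t u(x,\tau)\right|&\leq M^2.
\end{align*}
Since $U$ is bounded, the constant $M^2$ is integrable over $U$, which is the domination needed to differentiate under the integral sign. In terms of the norm, the identity reads
\begin{align*}
\frac{d}{d\tau}\|u(\cdot,\tau)\|_{L^2(U)}^2=2(\partial_t u(\cdot,\tau),u(\cdot,\tau))_{L^2(U)}.
\end{align*}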
[/step]
[step:Convert the divergence term into the spatial energy]
Fix $\tau\in(0,T)$. Let $\nu:\partial U\to\mathbb{R}^n$ denote the outward unit normal field on $\partial U$. Define the $C^1$ vector field $F_\tau:\overline{U}\to\mathbb{R}^n$ by
\begin{align*}
F_\tau(x)=u(x,\tau)A(x,\tau)\nabla u(x,\tau).
\end{align*}
By the product rule,
\begin{align*}
\operatorname{div}F_\tau(x)=A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)+u(x,\tau)\operatorname{div}(A(x,\tau)\nabla u(x,\tau)).
\end{align*}
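The product rule can be checked entrywise. As a sketch, write $A=(a_{ij})_{i,j=1}^n$ for the entries of the coefficient matrix (this index notation is assumed here and is not fixed in the text) and suppress the arguments $(x,\tau)$:
\begin{align*}
\operatorname{div}F_\tau&=\sum_{i=1}^n\partial_{x_i}\left(u\sum_{j=1}^n a_{ij}\,\partial_{x_j}u\right)\\
&=\sum_{i,j=1}^n a_{ij}\,\partial_{x_i}u\,\partial_{x_j}u+u\sum_{i=1}^n\partial_{x_i}\left(\sum_{j=1}^n a_{ij}\,\partial_{x_j}u\right).
\end{align*}
The first sum is $A\nabla u\cdot\nabla u$ and the second is $u\operatorname{div}(A\nabla u)$. No symmetry of $A$ is used in this computation.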
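The classical [divergence theorem](/theorems/2754) on the smooth bounded domain $U$ gives
\begin{align*}
\int_U \operatorname{div}F_\tau(x)\,d\mathcal{L}^n(x)=\int_{\partial U}F_\tau(x)\cdot\nu(x)\,d\mathcal{H}^{n-1}(x),
\end{align*}
where $\mathcal{H}^{n-1}$ denotes the $(n-1)$-dimensional Hausdorff measure on $\partial U$. Since $u(x,\tau)=0$ for $x\in\partial U$, the boundary flux satisfies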
\begin{align*}
F_\tau(x)\cdot\nu(x)=0
\end{align*}
for every $x\in\partial U$. Therefore
\begin{align*}
\int_U \operatorname{div}F_\tau(x)\,d\mathcal{L}^n(x)=0.
\end{align*}
Substituting the product-rule identity yields
\begin{align*}
\int_U u(x,\tau)\operatorname{div}(A(x,\tau)\nabla u(x,\tau))\,d\mathcal{L}^n(x)=-\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x).
\end{align*}
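In one space dimension this identity is ordinary integration by parts. As an illustration only, take $n=1$ and $U=(0,1)$, and write $a(x,\tau)$ for the scalar coefficient (these choices are assumptions made for the example, not part of the statement). Then
\begin{align*}
\int_0^1 u\,\partial_x\left(a\,\partial_x u\right)d\mathcal{L}^1(x)&=\Big[u\,a\,\partial_x u\Big]_{x=0}^{x=1}-\int_0^1 a\left(\partial_x u\right)^2 d\mathcal{L}^1(x)\\
&=-\int_0^1 a\left(\partial_x u\right)^2 d\mathcal{L}^1(x),
\end{align*}
where the bracket vanishes because $u(0,\tau)=u(1,\tau)=0$.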
Let $\theta>0$ be an ellipticity constant for $A$, meaning that
\begin{align*}
A(x,\tau)\xi\cdot\xi\geq \theta |\xi|^2
\end{align*}
for every $x\in U$, every $\tau\in(0,T)$, and every $\xi\in\mathbb{R}^n$. Applying this with $\xi=\nabla u(x,\tau)$ shows that the spatial term is coercive:
\begin{align*}
\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x)\geq \theta\int_U |\nabla u(x,\tau)|^2\,d\mathcal{L}^n(x).
\end{align*}
[guided]
The goal of this step is to move the spatial derivative off the coefficient-gradient expression and onto the [test function](/page/Test%20Function) $u$. Because the test function is exactly the solution and $u$ vanishes on the boundary, this produces the interior energy term without a boundary correction.
Fix $\tau\in(0,T)$, and let $\nu:\partial U\to\mathbb{R}^n$ be the outward unit normal field. Define the vector field $F_\tau:\overline{U}\to\mathbb{R}^n$ by
\begin{align*}
F_\tau(x)=u(x,\tau)A(x,\tau)\nabla u(x,\tau).
\end{align*}
This vector field is smooth enough for the classical [divergence theorem](/theorems/3614) because $u$ is twice continuously differentiable in the spatial variables, $A$ is $C^1$, and $\partial U$ is smooth. Applying the product rule to the scalar factor $u(\cdot,\tau)$ and the vector field $A(\cdot,\tau)\nabla u(\cdot,\tau)$ gives
\begin{align*}
\operatorname{div}F_\tau(x)=\nabla u(x,\tau)\cdot A(x,\tau)\nabla u(x,\tau)+u(x,\tau)\operatorname{div}(A(x,\tau)\nabla u(x,\tau)).
\end{align*}
Since the Euclidean dot product is symmetric, $\nabla u(x,\tau)\cdot A(x,\tau)\nabla u(x,\tau)=A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)$.
Let $\mathcal{H}^{n-1}$ denote the $(n-1)$-dimensional [Hausdorff measure](/page/Hausdorff%20Measure) on $\partial U$. Now apply the divergence theorem to $F_\tau$ on $U$. The boundary term is
\begin{align*}
\int_{\partial U} F_\tau(x)\cdot\nu(x)\,d\mathcal{H}^{n-1}(x).
\end{align*}
For every $x\in\partial U$, the Dirichlet boundary condition gives $u(x,\tau)=0$, and hence $F_\tau(x)\cdot\nu(x)=0$. Therefore the boundary integral vanishes and
\begin{align*}
\int_U \operatorname{div}F_\tau(x)\,d\mathcal{L}^n(x)=0.
\end{align*}
Substituting the product-rule formula for $\operatorname{div}F_\tau$ gives
\begin{align*}
\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x)+\int_U u(x,\tau)\operatorname{div}(A(x,\tau)\nabla u(x,\tau))\,d\mathcal{L}^n(x)=0.
\end{align*}
Rearranging this equality gives
\begin{align*}
\int_U u(x,\tau)\operatorname{div}(A(x,\tau)\nabla u(x,\tau))\,d\mathcal{L}^n(x)=-\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x).
\end{align*}
This is exactly the spatial integration-by-parts identity needed for the energy calculation.
The word energy is justified by uniform ellipticity. Let $\theta>0$ be an ellipticity constant for $A$, so that
\begin{align*}
A(x,\tau)\xi\cdot\xi\geq \theta |\xi|^2
\end{align*}
for every $x\in U$, every $\tau\in(0,T)$, and every $\xi\in\mathbb{R}^n$. Choosing $\xi=\nabla u(x,\tau)$ and integrating over $U$ with respect to $\mathcal{L}^n$ gives
\begin{align*}
\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x)\geq \theta\int_U |\nabla u(x,\tau)|^2\,d\mathcal{L}^n(x).
\end{align*}
Thus the spatial term controls the squared $L^2(U)$ norm of the gradient at time $\tau$.
[/guided]
[/step]
[step:Derive the differential energy identity]
For each $\tau\in(0,T)$, multiply the equation
\begin{align*}
\partial_t u(x,\tau)-\operatorname{div}(A(x,\tau)\nabla u(x,\tau))=f(x,\tau)
\end{align*}
by $u(x,\tau)$ and integrate over $U$ with respect to $\mathcal{L}^n$. This gives
\begin{align*}
\int_U \partial_t u(x,\tau)u(x,\tau)\,d\mathcal{L}^n(x)-\int_U u(x,\tau)\operatorname{div}(A(x,\tau)\nabla u(x,\tau))\,d\mathcal{L}^n(x)=\int_U f(x,\tau)u(x,\tau)\,d\mathcal{L}^n(x).
\end{align*}
Using the time-derivative identity from the first step and the integration-by-parts identity from the second step, we obtain
\begin{align*}
E'(\tau)+\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x)=(f(\cdot,\tau),u(\cdot,\tau))_{L^2(U)}.
\end{align*}
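As a sanity check, the differential identity can be verified by hand on a model case. The data below are illustrative assumptions, not taken from the statement: $n=1$, $U=(0,\pi)$, $A\equiv 1$, $f\equiv 0$, and $u(x,\tau)=e^{-\tau}\sin x$, which solves $\partial_t u-\partial_x^2 u=0$ and vanishes at $x=0$ and $x=\pi$. Then
\begin{align*}
E(\tau)&=\frac{1}{2}e^{-2\tau}\int_0^\pi \sin^2 x\,d\mathcal{L}^1(x)=\frac{\pi}{4}e^{-2\tau},\\
E'(\tau)&=-\frac{\pi}{2}e^{-2\tau},\\
\int_0^\pi \left|\partial_x u(x,\tau)\right|^2 d\mathcal{L}^1(x)&=e^{-2\tau}\int_0^\pi \cos^2 x\,d\mathcal{L}^1(x)=\frac{\pi}{2}e^{-2\tau}.
\end{align*}
The sum of the last two lines is $0$, which agrees with the right-hand side $(f(\cdot,\tau),u(\cdot,\tau))_{L^2(U)}=0$.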
[/step]
[step:Integrate the differential identity from $s$ to $t$]
Define $G:[0,T]\to\mathbb{R}$ by
\begin{align*}
G(\tau)=\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x).
\end{align*}
The functions $E'$ and $G$ are continuous on $(0,T)$. Moreover, by the differential identity from the previous step, the source-pairing function $\tau\mapsto(f(\cdot,\tau),u(\cdot,\tau))_{L^2(U)}$ equals $E'(\tau)+G(\tau)$ for every $\tau\in(0,T)$, and hence it is continuous on $(0,T)$. Integrating over $[s,t]$ with respect to $\mathcal{L}^1$ gives
\begin{align*}
\int_s^t E'(\tau)\,d\mathcal{L}^1(\tau)+\int_s^t G(\tau)\,d\mathcal{L}^1(\tau)=\int_s^t (f(\cdot,\tau),u(\cdot,\tau))_{L^2(U)}\,d\mathcal{L}^1(\tau).
\end{align*}
By the [fundamental theorem of calculus](/theorems/632) applied to $E$,
\begin{align*}
\int_s^t E'(\tau)\,d\mathcal{L}^1(\tau)=E(t)-E(s).
\end{align*}
Substituting the definitions of $E$ and $G$ gives
\begin{align*}
\frac{1}{2}\|u(\cdot,t)\|_{L^2(U)}^2-\frac{1}{2}\|u(\cdot,s)\|_{L^2(U)}^2+\int_s^t\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x)\,d\mathcal{L}^1(\tau)=\int_s^t (f(\cdot,\tau),u(\cdot,\tau))_{L^2(U)}\,d\mathcal{L}^1(\tau).
\end{align*}
Moving the initial energy term to the right-hand side gives the asserted identity.
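Written out, and assuming the asserted formula has the form obtained from this rearrangement (the statement itself is not reproduced here), the identity for $s\leq t$ reads
\begin{align*}
\frac{1}{2}\|u(\cdot,t)\|_{L^2(U)}^2+\int_s^t\int_U A(x,\tau)\nabla u(x,\tau)\cdot\nabla u(x,\tau)\,d\mathcal{L}^n(x)\,d\mathcal{L}^1(\tau)=\frac{1}{2}\|u(\cdot,s)\|_{L^2(U)}^2+\int_s^t (f(\cdot,\tau),u(\cdot,\tau))_{L^2(U)}\,d\mathcal{L}^1(\tau).
\end{align*}
Combining it with the coercivity bound from the second step, with the same ellipticity constant $\theta$, gives the following inequality as a direct consequence:
\begin{align*}
\frac{1}{2}\|u(\cdot,t)\|_{L^2(U)}^2+\theta\int_s^t\int_U |\nabla u(x,\tau)|^2\,d\mathcal{L}^n(x)\,d\mathcal{L}^1(\tau)\leq\frac{1}{2}\|u(\cdot,s)\|_{L^2(U)}^2+\int_s^t (f(\cdot,\tau),u(\cdot,\tau))_{L^2(U)}\,d\mathcal{L}^1(\tau).
\end{align*}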
[/step]