The proof is by induction on $m$, bootstrapping from the [Interior Parabolic Regularity](/theorems/611) result. The idea is that each application of the interior regularity theorem gains two spatial derivatives and one time derivative, and the process can be iterated as long as the data and coefficients are smooth enough.
**Step 1: Base case ($m = 0$).** The [Interior Parabolic Regularity theorem](/theorems/611) gives $u \in L^2(\delta, T; H^2(V))$ and $u_t \in L^2(\delta, T; L^2(V))$ for any $V \subset\subset U$ and $\delta > 0$, with the estimate $\|u\|_{L^2(\delta,T; H^2(V))} + \|u_t\|_{L^2(\delta,T; L^2(V))} \le C(\|f\|_{L^2(U_T)} + \|u\|_{L^2(U_T)})$. This is the $m = 0$ case.
**Step 2: [Differentiation](/page/Derivative) of the equation.** Suppose that $u \in L^2(\delta, T; H^{m+2}(V))$ locally for some $m \ge 0$, and that $f$ and the coefficients have $m+1$ derivatives. Differentiate the equation $u_t + Lu = f$ with respect to $x_j$. Since the coefficients are smooth, the Leibniz rule gives
\begin{align*}
(\partial_{x_j} u)_t + L(\partial_{x_j} u) = \partial_{x_j} f - [L, \partial_{x_j}] u =: \tilde{f},
\end{align*}
where $[L, \partial_{x_j}]$ is the commutator, which is a differential operator of order at most $2$ with coefficients involving derivatives of $a_{ij}, b_i, c$. By the inductive hypothesis, $u$ has $m+2$ spatial derivatives in $L^2$ locally, so $\tilde{f}$ has $m$ spatial derivatives in $L^2$ locally (one derivative is lost in the commutator, but the inductive hypothesis provides $m+2$ to start with).
**Step 3: Application of interior regularity to $\partial_{x_j} u$.** The [function](/page/Function) $w := \partial_{x_j} u$ solves $w_t + Lw = \tilde{f}$ with $\tilde{f} \in L^2$ locally. Applying the [Interior Parabolic Regularity theorem](/theorems/611) to $w$ gives $w \in L^2(\delta', T; H^2(V'))$ locally, i.e., $\partial_{x_j} u \in L^2(\delta', T; H^2(V'))$. Since this holds for every $j = 1, \ldots, n$, we gain $u \in L^2(\delta', T; H^{m+3}(V'))$ locally.
**Step 4: Time regularity.** Each spatial regularity gain also improves the time regularity via the equation: $u_t = f - Lu$, and gaining two more spatial derivatives of $u$ gives one more time derivative in $L^2$.
**Step 5: Induction.** Repeating Steps 2–4 a total of $m$ times, starting from the base case, yields $u \in H^{m+2}(U_T)$ locally (with the appropriate parabolic weighting of derivatives). The [Sobolev embedding theorem](/theorems/903) then gives $u \in C^\infty$ when $f$, $g$, and the coefficients are all $C^\infty$ with compatible [boundary](/page/Boundary) conditions (since $H^{m+2}$ embeds into $C^k$ for $m$ sufficiently large relative to $k$ and the dimension).