E(x,t) := \begin{cases} \Phi(x,t) & \text{if } t > 0, \\ 0 & \text{if } t \leq 0. \end{cases}
\end{align*}
Then $E \in L^1_{\mathrm{loc}}(\mathbb{R}^{n+1})$ and, as an element of $\mathcal{D}'(\mathbb{R}^{n+1})$, it satisfies $(\partial_t - \Delta)E = \delta$ where $\delta$ is the Dirac distribution at the origin in $\mathbb{R}^{n+1}$.
[/claim]
[proof]
First, $E \in L^1_{\mathrm{loc}}(\mathbb{R}^{n+1})$: for any compact set $K \subset \mathbb{R}^{n+1}$, the integral $\int_K |E(x,t)|\,d\mathcal{L}^{n+1}(x,t)$ is finite because $\Phi(x,t) \leq C\,t^{-n/2}$ and $t^{-n/2}$ is locally integrable in $t$ near $t = 0$ when integrated against the Gaussian spatial factor (in fact, $\int_{\mathbb{R}^n}\Phi(x,t)\,d\mathcal{L}^n(x) = 1$ for all $t > 0$, so the spatial integral is uniformly bounded).
To verify $(\partial_t - \Delta)E = \delta$, let $\psi \in C_c^\infty(\mathbb{R}^{n+1})$ be an arbitrary test function. We must show
\begin{align*}
\langle (\partial_t - \Delta)E, \psi \rangle = \psi(0,0).
\end{align*}
By definition of distributional derivatives, $\langle (\partial_t - \Delta)E, \psi \rangle = -\langle E, \partial_t \psi \rangle - \langle E, \Delta \psi \rangle$, but since $E$ is a regular distribution and $\partial_t$ acts on $E$ distributionally, we write
\begin{align*}
\langle (\partial_t - \Delta)E, \psi \rangle &= -\int_0^\infty\int_{\mathbb{R}^n}\Phi(x,t)\,\partial_t\psi(x,t)\,d\mathcal{L}^n(x)\,d\mathcal{L}^1(t) - \int_0^\infty\int_{\mathbb{R}^n}\Phi(x,t)\,\Delta\psi(x,t)\,d\mathcal{L}^n(x)\,d\mathcal{L}^1(t).
\end{align*}
For the second integral, since $\Phi(\cdot,t)$ is smooth and decays as a Gaussian for each $t > 0$, integration by parts in $x$ (with all boundary terms vanishing) gives $\int_{\mathbb{R}^n}\Phi\,\Delta\psi\,d\mathcal{L}^n = \int_{\mathbb{R}^n}(\Delta\Phi)\,\psi\,d\mathcal{L}^n$. For the first integral, integrate by parts in $t$ on $[\varepsilon, T]$ (where $T$ is large enough that $\psi$ vanishes for $t > T$):
\begin{align*}
-\int_\varepsilon^T\int_{\mathbb{R}^n}\Phi\,\partial_t\psi\,d\mathcal{L}^n\,d\mathcal{L}^1 &= \int_\varepsilon^T\int_{\mathbb{R}^n}(\partial_t\Phi)\,\psi\,d\mathcal{L}^n\,d\mathcal{L}^1 - \left[\int_{\mathbb{R}^n}\Phi(x,t)\,\psi(x,t)\,d\mathcal{L}^n(x)\right]_{t=\varepsilon}^{t=T}.
\end{align*}
The boundary term at $t = T$ vanishes because $\psi(\cdot, T) = 0$. The boundary term at $t = \varepsilon$ is $-\int_{\mathbb{R}^n}\Phi(x,\varepsilon)\,\psi(x,\varepsilon)\,d\mathcal{L}^n(x)$. Combining and using $\partial_t\Phi - \Delta\Phi = 0$ for $t > 0$ (property 2 of the [fundamental solution](/theorems/53)):
\begin{align*}
\langle (\partial_t - \Delta)E, \psi \rangle &= \lim_{\varepsilon \to 0^+}\left(\int_\varepsilon^T\int_{\mathbb{R}^n}\underbrace{(\partial_t\Phi - \Delta\Phi)}_{= 0}\,\psi\,d\mathcal{L}^n\,d\mathcal{L}^1 + \int_{\mathbb{R}^n}\Phi(x,\varepsilon)\,\psi(x,\varepsilon)\,d\mathcal{L}^n(x)\right).
\end{align*}
The first integral vanishes. For the second, since $\psi$ is continuous and bounded, and $\psi(x,\varepsilon) \to \psi(x,0)$ uniformly in $x$ as $\varepsilon \to 0$, the delta-limit property (property 4 of the [fundamental solution](/theorems/53)) gives
\begin{align*}
\lim_{\varepsilon \to 0^+}\int_{\mathbb{R}^n}\Phi(x,\varepsilon)\,\psi(x,\varepsilon)\,d\mathcal{L}^n(x) = \psi(0,0).
\end{align*}
Therefore $\langle (\partial_t - \Delta)E, \psi \rangle = \psi(0,0) = \langle \delta, \psi \rangle$.
[/proof]
**Step 2 (The Duhamel formula as space-time convolution).**
[claim:Duhamel Formula Equals Space Time Convolution]
Define $\tilde{f}: \mathbb{R}^{n+1} \to \mathbb{R}$ by $\tilde{f}(x,t) := f(x,t)$ for $t \geq 0$ and $\tilde{f}(x,t) := 0$ for $t < 0$. Then $\tilde{f}$ has compact support in $\mathbb{R}^{n+1}$ and
\begin{align*}
u(x,t) = (E * \tilde{f})(x,t)
\end{align*}
for all $(x,t) \in \mathbb{R}^n \times [0,\infty)$.
[/claim]
[proof]
Since $f$ has compact support in $\mathbb{R}^n \times [0,\infty)$, $\tilde{f}$ has compact support in $\mathbb{R}^{n+1}$. The space-time convolution is
\begin{align*}
(E * \tilde{f})(x,t) &= \int_{-\infty}^{\infty}\int_{\mathbb{R}^n}E(x - y, t - s)\,\tilde{f}(y,s)\,d\mathcal{L}^n(y)\,d\mathcal{L}^1(s).
\end{align*}
The integrand is nonzero only when both $E(x-y,t-s) \neq 0$ and $\tilde{f}(y,s) \neq 0$. The first condition requires $t - s > 0$, i.e. $s < t$. The second requires $s \geq 0$. Therefore the $s$-integral reduces to the interval $[0,t]$, and $E(x-y,t-s) = \Phi(x-y,t-s)$, $\tilde{f}(y,s) = f(y,s)$:
\begin{align*}
(E * \tilde{f})(x,t) = \int_0^t\int_{\mathbb{R}^n}\Phi(x - y, t - s)\,f(y,s)\,d\mathcal{L}^n(y)\,d\mathcal{L}^1(s) = u(x,t).
\end{align*}
[/proof]
**Step 3 (Applying the heat operator to the convolution).**
[claim:Heat Operator Passes Through Convolution]
$(\partial_t - \Delta)u = f$ in $\mathbb{R}^n \times (0,\infty)$ and $u(\cdot, 0) = 0$.
[/claim]
[proof]
Since $\tilde{f}$ has compact support and $E \in L^1_{\mathrm{loc}}(\mathbb{R}^{n+1})$, the convolution $E * \tilde{f}$ is well-defined and yields a $C^{2,1}$ function (this follows from the smoothing property of convolution: the regularity of $E$ for $t > 0$ combined with the compact support and $C^{2,1}$ regularity of $\tilde{f}$ ensures that $u$ inherits $C^{2,1}$ regularity).
For distributional derivatives of convolutions, the general identity states: if $T \in \mathcal{D}'(\mathbb{R}^{n+1})$ and $\varphi$ has compact support, then $D^\alpha(T * \varphi) = (D^\alpha T) * \varphi$ for any partial derivative $D^\alpha$. Applying this with $T = E$ and $\varphi = \tilde{f}$, and using the linearity of $\partial_t - \Delta$:
\begin{align*}
(\partial_t - \Delta)(E * \tilde{f}) = ((\partial_t - \Delta)E) * \tilde{f} = \delta * \tilde{f},
\end{align*}
where the last equality uses Claim 1. Since convolution with $\delta$ is the identity ($\delta * \tilde{f} = \tilde{f}$):
\begin{align*}
(\partial_t - \Delta)u = \tilde{f}.
\end{align*}
For $t > 0$, $\tilde{f}(x,t) = f(x,t)$, giving $\partial_t u - \Delta u = f$ in $\mathbb{R}^n \times (0,\infty)$.
For the initial condition: when $t \leq 0$, the convolution $(E * \tilde{f})(x,t)$ involves integrating over $s \in (-\infty, t] \cap [0,\infty)$. For $t \leq 0$, this set is at most $\{0\}$, which has measure zero, so $u(x,t) = 0$ for $t \leq 0$. In particular, $u(\cdot, 0) = 0$.
[/proof]