Solution Of The Cauchy Problem For The Heat Equation (Theorem # 54)
Theorem
Let $n \geq 1$, let $\Phi: \mathbb{R}^n \times (0,\infty) \to \mathbb{R}$ denote the [fundamental solution of the diffusion equation](/theorems/53), and let $g \in C(\mathbb{R}^n) \cap L^\infty(\mathbb{R}^n)$. Define $u: \mathbb{R}^n \times (0,\infty) \to \mathbb{R}$ by [convolution](/page/Convolution) with the initial data:
\begin{align*}
u(x,t) &:= (\Phi(\cdot, t) * g)(x) = \int_{\mathbb{R}^n} \Phi(x - y, t)\,g(y)\,d\mathcal{L}^n(y).
\end{align*}
Then:
1. **Smoothness:** $u \in C^\infty(\mathbb{R}^n \times (0,\infty))$.
2. **PDE satisfaction:** $u$ solves the homogeneous [heat equation](/page/Heat%20Equation):
\begin{align*}
\partial_t u(x,t) - \Delta u(x,t) = 0 \quad \text{for all } x \in \mathbb{R}^n,\; t > 0.
\end{align*}
3. **Initial condition:** $u$ attains the initial data pointwise:
\begin{align*}
\lim_{\substack{(x,t) \to (x_0, 0) \\ t > 0}} u(x,t) = g(x_0)
\end{align*}
for every $x_0 \in \mathbb{R}^n$.
Analysis
Partial Differential Equations
Discussion
This theorem uses the [fundamental solution of the diffusion equation](/theorems/53) to construct a classical solution to the Cauchy problem for the [heat equation](/page/Heat%20Equation) on all of $\mathbb{R}^n$ with bounded [continuous](/page/Continuity) initial data. The solution is given by [convolution](/page/Convolution): $u(\cdot, t) = \Phi(\cdot, t) * g$. Heuristically, the delta-[limit](/page/Limit) property $\Phi(\cdot, t) \to \delta_0$ in $\mathcal{D}'(\mathbb{R}^n)$ suggests $u(\cdot, t) \to \delta_0 * g = g$, and the theorem makes this precise in the pointwise sense.
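As a numerical sanity check of the convolution formula, the following Python sketch evaluates $u(x,t)$ in dimension $n = 1$, using the explicit kernel $\Phi(z,t) = (4\pi t)^{-1/2}e^{-z^2/(4t)}$. The function name `heat_solution_1d`, the truncation window, and the midpoint quadrature are illustrative assumptions and not part of the theorem. For $g = \cos$ the convolution evaluates to $e^{-t}\cos x$, which provides a reference value.

```python
import math

def heat_solution_1d(g, x, t, half_width=8.0, num_points=4000):
    """Approximate u(x, t) = integral of Phi(x - y, t) * g(y) dy for n = 1.

    Uses the substitution y = x - 2*sqrt(t)*w, which turns the kernel into
    exp(-w^2)/sqrt(pi), and a midpoint rule on [-half_width, half_width].
    The window and the number of points are illustrative choices.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    step = 2.0 * half_width / num_points
    scale = 2.0 * math.sqrt(t)
    total = 0.0
    for i in range(num_points):
        w = -half_width + (i + 0.5) * step  # midpoint of the i-th cell
        total += math.exp(-w * w) * g(x - scale * w)
    return total * step / math.sqrt(math.pi)

if __name__ == "__main__":
    x, t = 0.3, 0.5
    # Compare the quadrature with the closed form exp(-t) * cos(x).
    print(heat_solution_1d(math.cos, x, t), math.exp(-t) * math.cos(x))
```

Calling the function with a constant $g$ should return that constant up to quadrature error, which reflects the unit mass of $\Phi(\cdot, t)$.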
Proof
[proofplan]
Properties 1 and 2 follow from the properties of the [fundamental solution](/theorems/53): since $\Phi(\cdot, t)$ is $C^\infty$ for $t > 0$ with all derivatives decaying as Gaussians, and $g$ is bounded, differentiation under the integral sign is justified by the [Leibniz integral rule](/theorems/831) with a Gaussian dominating function, and the [heat equation](/page/Heat%20Equation) passes through the integral because $\Phi$ itself satisfies it. Property 3, the pointwise attainment of the initial data, is the most substantial step: it uses a near-field/far-field decomposition centered at $x_0$, with a triangle inequality to handle the simultaneous limits $x \to x_0$ and $t \to 0$.
[/proofplan]
[step:Justify differentiation under the integral sign and conclude $u \in C^\infty(\mathbb{R}^n \times (0,\infty))$]
Fix an arbitrary multi-index $\alpha$ and integer $k \geq 0$. We show that
\begin{align*}
\partial_t^k \partial_x^\alpha u(x,t) &= \int_{\mathbb{R}^n}\partial_t^k \partial_x^\alpha \Phi(x - y, t)\,g(y)\,d\mathcal{L}^n(y).
\end{align*}
Fix a compact set $K \times [\delta, T] \subset \mathbb{R}^n \times (0,\infty)$ with $0 < \delta < T$. For $(x,t) \in K \times [\delta, T]$ and $y \in \mathbb{R}^n$, the derivative $\partial_t^k \partial_x^\alpha \Phi(x - y, t)$ is a product of a polynomial in $(x-y, t^{-1})$ with $\Phi(x-y,t)$. There exist constants $C_{k,\alpha} > 0$ and $m \in \mathbb{N}_0$ (depending on $k, \alpha$) such that
\begin{align*}
\left|\partial_t^k \partial_x^\alpha \Phi(x - y, t)\right| &\leq C_{k,\alpha}\,t^{-n/2 - m}\,(1 + |x - y|^{2m}/t^m)\,\exp\!\left(-\frac{|x - y|^2}{4t}\right).
\end{align*}
Since $t \geq \delta > 0$, the factor $t^{-n/2 - m}$ is bounded by $\delta^{-n/2-m}$. The polynomial growth $(1 + |x-y|^{2m}/t^m)$ is absorbed by halving the Gaussian exponent: writing $s := |x-y|/\sqrt{t}$, the function $s \mapsto (1 + s^{2m})\,e^{-s^2/8}$ is bounded on $[0,\infty)$, so there exists $C'_{k,\alpha,\delta} > 0$ such that
\begin{align*}
(1 + |x-y|^{2m}/t^m)\,\exp\!\left(-\frac{|x - y|^2}{4t}\right) \leq C'_{k,\alpha,\delta}\,\exp\!\left(-\frac{|x - y|^2}{8t}\right)
\end{align*}
for all $t \geq \delta$. This bound still depends on $x$ through $|x - y|$, so we use the compactness of $K$: choose $R > 0$ with $K \subset B(0, R)$. For $x \in K$ and $y \in \mathbb{R}^n$, the elementary inequality $|x - y|^2 \geq \tfrac{1}{2}|y|^2 - |x|^2$ gives $|x - y|^2 \geq \tfrac{1}{2}|y|^2 - R^2$. Combining this with $\delta \leq t \leq T$, for $(x,t) \in K \times [\delta, T]$:
\begin{align*}
\left|\partial_t^k \partial_x^\alpha \Phi(x - y, t)\right| \cdot |g(y)| &\leq C''_{k,\alpha,\delta,T,K}\,\|g\|_{L^\infty}\,\exp\!\left(-\frac{|y|^2}{16T}\right),
\end{align*}
where the factor $\exp(R^2/(8T))$ has been absorbed into the constant. The right-hand side is independent of $(x,t) \in K \times [\delta, T]$ and integrable in $y$ over $\mathbb{R}^n$ (it is a centered Gaussian). By the [Leibniz integral rule](/theorems/831), justified by the [dominated convergence theorem](/theorems/4) with this integrable dominating function, differentiation under the integral sign is valid on $K \times [\delta, T]$. Since $k$, $\alpha$, and the compact set were arbitrary, $u \in C^\infty(\mathbb{R}^n \times (0,\infty))$.
[guided]
The key question is: why can we pass derivatives through the integral defining $u$? The answer is dominated convergence. For each derivative order, the derivative of the heat kernel $\partial_t^k\partial_x^\alpha\Phi(x-y,t)$ grows polynomially in $|x-y|/\sqrt{t}$ but is multiplied by the Gaussian $\exp(-|x-y|^2/(4t))$, which decays faster than any polynomial. The standard trick is to "sacrifice" half the Gaussian exponent to absorb the polynomial:
\begin{align*}
|x-y|^{2m}\,\exp\!\left(-\frac{|x-y|^2}{4t}\right) \leq C_m\, t^m\,\exp\!\left(-\frac{|x-y|^2}{8t}\right),
\end{align*}
which follows from the elementary inequality $s^{2m}e^{-s^2/4} \leq C_m e^{-s^2/8}$ applied with $s = |x-y|/\sqrt{t}$.
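The constant in this elementary inequality can be made explicit: maximizing $s^{2m}e^{-s^2/8}$ over $s \geq 0$ gives the value $(8m/e)^m$, attained at $s^2 = 8m$. The Python sketch below checks the inequality on a grid with this constant; the names `absorb_constant` and `inequality_holds`, the grid, and the tolerance are illustrative assumptions.

```python
import math

def absorb_constant(m):
    """Smallest C_m with s^(2m) * exp(-s^2/4) <= C_m * exp(-s^2/8) for s >= 0.

    Equals the supremum of s^(2m) * exp(-s^2/8), attained at s^2 = 8m.
    """
    if m == 0:
        return 1.0
    return (8.0 * m / math.e) ** m

def inequality_holds(m, s_max=20.0, num_points=2001, rel_tol=1e-9):
    """Check the inequality on a uniform grid of s in [0, s_max]."""
    c_m = absorb_constant(m)
    for i in range(num_points):
        s = s_max * i / (num_points - 1)
        lhs = s ** (2 * m) * math.exp(-s * s / 4.0)
        rhs = c_m * math.exp(-s * s / 8.0)
        # Small relative slack: equality holds at s^2 = 8m up to rounding.
        if lhs > rhs * (1.0 + rel_tol):
            return False
    return True

if __name__ == "__main__":
    for m in range(6):
        print(m, absorb_constant(m), inequality_holds(m))
```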
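Using $t \leq T$ then gives the bound $C\,\exp(-|x-y|^2/(8T))$: this is the widest Gaussian (slowest decay) in the family, which dominates all narrower ones. The bound still depends on $x$, so one more step is needed. Since $x$ ranges over the compact set $K \subset B(0,R)$, the inequality $|x-y|^2 \geq \tfrac{1}{2}|y|^2 - R^2$ yields the dominating function $C\,e^{R^2/(8T)}\exp(-|y|^2/(16T))$, which is a fixed Gaussian (independent of the varying $(x,t)$) and is integrable in $y$ over $\mathbb{R}^n$.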
Restricting to a compact set $K \times [\delta, T]$ with $\delta > 0$ is necessary because the estimates degenerate as $t \to 0^+$: the factor $t^{-n/2-m}$ blows up. This is why the smoothness conclusion is $u \in C^\infty(\mathbb{R}^n \times (0,\infty))$ and not $C^\infty(\mathbb{R}^n \times [0,\infty))$.
By the [Leibniz integral rule](/theorems/831), justified by the [dominated convergence theorem](/theorems/4) with the Gaussian dominator, the interchange is valid:
\begin{align*}
\partial_t^k \partial_x^\alpha u(x,t) = \int_{\mathbb{R}^n}\partial_t^k \partial_x^\alpha \Phi(x - y, t)\,g(y)\,d\mathcal{L}^n(y).
\end{align*}
Since $k$ and $\alpha$ are arbitrary, $u$ has continuous partial derivatives of all orders on $\mathbb{R}^n \times (0,\infty)$.
[/guided]
[/step]
[step:Pass the heat operator through the integral to verify $\partial_tu - \Delta u = 0$]
Apply the differentiation interchange from the previous step with $k = 1$, $\alpha = 0$, and with $k = 0$, $\alpha = 2e_i$ for $i = 1, \dots, n$ (summing over $i$ to obtain the Laplacian):
\begin{align*}
\partial_t u(x,t) - \Delta u(x,t) &= \int_{\mathbb{R}^n}\bigl(\partial_t \Phi(x - y, t) - \Delta_x \Phi(x - y, t)\bigr)\,g(y)\,d\mathcal{L}^n(y).
\end{align*}
Since $\Phi$ satisfies the heat equation $\partial_t \Phi - \Delta \Phi = 0$ for $t > 0$ (property 2 of the [fundamental solution](/theorems/53)), the integrand vanishes identically for each $y \in \mathbb{R}^n$. Therefore $\partial_t u - \Delta u = 0$ on $\mathbb{R}^n \times (0,\infty)$.
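The fact that the kernel itself satisfies the heat equation can be checked numerically. The Python sketch below, for $n = 1$ with $\Phi(x,t) = (4\pi t)^{-1/2}e^{-x^2/(4t)}$, approximates $\partial_t\Phi - \partial_x^2\Phi$ by central differences. The function names and the step size `h` are illustrative assumptions; the residual is only expected to be small up to discretization error, not exactly zero.

```python
import math

def phi(x, t):
    """Heat kernel in one space dimension: (4*pi*t)^(-1/2) * exp(-x^2/(4t))."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def heat_residual(x, t, h=1e-3):
    """Central-difference approximation of d/dt phi - d^2/dx^2 phi at (x, t).

    Requires t > h so that phi is evaluated only at positive times.
    """
    dt_phi = (phi(x, t + h) - phi(x, t - h)) / (2.0 * h)
    dxx_phi = (phi(x + h, t) - 2.0 * phi(x, t) + phi(x - h, t)) / (h * h)
    return dt_phi - dxx_phi

if __name__ == "__main__":
    # The two derivatives are individually of order 0.1 here; they cancel.
    print(heat_residual(0.7, 0.5))
```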
[/step]
[step:Prove pointwise attainment of initial data via near-field/far-field decomposition]
Fix $x_0 \in \mathbb{R}^n$. By the unit mass property (property 3 of the [fundamental solution](/theorems/53)):
\begin{align*}
\int_{\mathbb{R}^n}\Phi(x - y, t)\,d\mathcal{L}^n(y) = 1 \quad \text{for all } x \in \mathbb{R}^n,\; t > 0.
\end{align*}
Using this to write $g(x_0) = g(x_0)\int_{\mathbb{R}^n}\Phi(x-y,t)\,d\mathcal{L}^n(y)$:
\begin{align*}
u(x,t) - g(x_0) &= \int_{\mathbb{R}^n}\Phi(x - y, t)\bigl(g(y) - g(x_0)\bigr)\,d\mathcal{L}^n(y).
\end{align*}
Fix $\varepsilon > 0$. Since $g$ is [continuous](/page/Continuity) at $x_0$, there exists $\rho > 0$ such that $|y - x_0| < 2\rho$ implies $|g(y) - g(x_0)| < \varepsilon$. Restrict to $(x,t)$ with $|x - x_0| < \rho$ and $t > 0$. Split the integral:
\begin{align*}
|u(x,t) - g(x_0)| &\leq \underbrace{\int_{B(x_0, 2\rho)}\Phi(x - y, t)\,|g(y) - g(x_0)|\,d\mathcal{L}^n(y)}_{I} + \underbrace{\int_{\mathbb{R}^n \setminus B(x_0, 2\rho)}\Phi(x - y, t)\,|g(y) - g(x_0)|\,d\mathcal{L}^n(y)}_{J}.
\end{align*}
**Near-field estimate.** On $B(x_0, 2\rho)$, the continuity bound gives $|g(y) - g(x_0)| < \varepsilon$. Since $\Phi > 0$, enlarge the domain of integration from $B(x_0, 2\rho)$ to $\mathbb{R}^n$ and apply the unit mass property:
\begin{align*}
I &< \varepsilon\int_{B(x_0, 2\rho)}\Phi(x - y, t)\,d\mathcal{L}^n(y) \leq \varepsilon\int_{\mathbb{R}^n}\Phi(x - y, t)\,d\mathcal{L}^n(y) = \varepsilon.
\end{align*}
**Far-field estimate.** Set $M := 2\|g\|_{L^\infty(\mathbb{R}^n)}$, so that $|g(y) - g(x_0)| \leq M$ for all $y$. For $y \in \mathbb{R}^n \setminus B(x_0, 2\rho)$ and $|x - x_0| < \rho$, the triangle inequality gives
\begin{align*}
|x - y| &\geq |y - x_0| - |x - x_0| > 2\rho - \rho = \rho,
\end{align*}
so $\{y : |y - x_0| \geq 2\rho\} \subseteq \{y : |x - y| \geq \rho\}$. Substituting $z := x - y$ (so $d\mathcal{L}^n(y) = d\mathcal{L}^n(z)$, and $|x-y| \geq \rho$ becomes $|z| \geq \rho$):
\begin{align*}
J &\leq M\int_{|x - y| \geq \rho}\Phi(x - y, t)\,d\mathcal{L}^n(y) = M\int_{|z| \geq \rho}\Phi(z, t)\,d\mathcal{L}^n(z).
\end{align*}
Substitute $w := z/(2\sqrt{t})$, giving $d\mathcal{L}^n(z) = (2\sqrt{t})^n\,d\mathcal{L}^n(w)$ and $|z| \geq \rho$ becomes $|w| \geq \rho/(2\sqrt{t})$:
\begin{align*}
J &\leq M\,(4\pi t)^{-n/2}\,(2\sqrt{t})^n\int_{|w| \geq \rho/(2\sqrt{t})}e^{-|w|^2}\,d\mathcal{L}^n(w) = M\,\pi^{-n/2}\int_{|w| \geq \rho/(2\sqrt{t})}e^{-|w|^2}\,d\mathcal{L}^n(w).
\end{align*}
The function $w \mapsto e^{-|w|^2}$ is integrable on $\mathbb{R}^n$. As $t \downarrow 0$, the cutoff $\rho/(2\sqrt{t}) \to \infty$, so the Gaussian tail integral tends to $0$ by the [dominated convergence theorem](/theorems/4). The resulting bound depends only on $t$ and $\rho$, not on $x$. Therefore $J \to 0$ as $t \downarrow 0$, uniformly in $x$ with $|x - x_0| < \rho$.
**Conclusion.** For $(x,t)$ with $|x - x_0| < \rho$ and $t > 0$ small enough that $J < \varepsilon$:
\begin{align*}
|u(x,t) - g(x_0)| \leq I + J < \varepsilon + \varepsilon = 2\varepsilon.
\end{align*}
Since $\varepsilon > 0$ was arbitrary, $u(x,t) \to g(x_0)$ as $(x,t) \to (x_0, 0)$ with $t > 0$.
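In dimension $n = 1$ the far-field bound has a closed form, since $\pi^{-1/2}\int_{|w| \geq R} e^{-w^2}\,dw = \operatorname{erfc}(R)$, so that $J \leq M\,\operatorname{erfc}\bigl(\rho/(2\sqrt{t})\bigr)$. The Python sketch below evaluates this bound and searches for a time below which it is smaller than a given $\varepsilon$. The function names, the halving search, and the sample values are illustrative assumptions.

```python
import math

def far_field_bound(sup_g, rho, t):
    """Bound on J for n = 1: M * erfc(rho / (2*sqrt(t))) with M = 2 * sup|g|."""
    if t <= 0 or rho <= 0:
        raise ValueError("rho and t must be positive")
    return 2.0 * sup_g * math.erfc(rho / (2.0 * math.sqrt(t)))

def time_threshold(sup_g, rho, eps, t_start=1.0):
    """Halve t until the far-field bound drops below eps; return that t.

    The bound is increasing in t, so it stays below eps for all smaller t.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    t = t_start
    while far_field_bound(sup_g, rho, t) >= eps:
        t /= 2.0
    return t

if __name__ == "__main__":
    t_small = time_threshold(sup_g=1.0, rho=0.5, eps=1e-3)
    print(t_small, far_field_bound(1.0, 0.5, t_small))
```

The returned threshold depends only on $\|g\|_{L^\infty}$, $\rho$, and $\varepsilon$, which mirrors the uniformity in $x$ used in the proof.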
[guided]
This step is the heart of the theorem and the reason for the hypothesis $g \in C(\mathbb{R}^n) \cap L^\infty(\mathbb{R}^n)$. Continuity of $g$ at $x_0$ controls the near field, and boundedness of $g$ controls the far field. Using the unit mass property $\int_{\mathbb{R}^n}\Phi(x-y,t)\,d\mathcal{L}^n(y) = 1$, write
\begin{align*}
u(x,t) - g(x_0) = \int_{\mathbb{R}^n}\Phi(x-y,t)\bigl(g(y) - g(x_0)\bigr)\,d\mathcal{L}^n(y).
\end{align*}
The complication compared to the delta-limit property (property 4 of the [fundamental solution](/theorems/53)) is that both $x$ and $t$ vary simultaneously: we need $u(x,t) \to g(x_0)$ as $(x,t) \to (x_0, 0)$ with $t > 0$, not just as $t \to 0$ with $x$ fixed. This is why we split around $B(x_0, 2\rho)$ rather than $B(x_0, \rho)$, and restrict to $|x - x_0| < \rho$.
Fix $\varepsilon > 0$. Continuity of $g$ at $x_0$ provides $\rho > 0$ with $|g(y) - g(x_0)| < \varepsilon$ whenever $|y - x_0| < 2\rho$. Restrict attention to $(x,t)$ with $|x - x_0| < \rho$ and $t > 0$, and decompose $|u(x,t) - g(x_0)| \leq I + J$ where
\begin{align*}
I &:= \int_{B(x_0, 2\rho)}\Phi(x-y,t)\,|g(y) - g(x_0)|\,d\mathcal{L}^n(y), \\
J &:= \int_{\mathbb{R}^n \setminus B(x_0, 2\rho)}\Phi(x-y,t)\,|g(y) - g(x_0)|\,d\mathcal{L}^n(y).
\end{align*}
For the near-field term $I$: on $B(x_0, 2\rho)$ the continuity bound gives $|g(y) - g(x_0)| < \varepsilon$. Since $\Phi > 0$, enlarge the domain from $B(x_0, 2\rho)$ to $\mathbb{R}^n$ and apply the unit mass property:
\begin{align*}
I < \varepsilon \int_{B(x_0,2\rho)} \Phi(x-y,t)\,d\mathcal{L}^n(y) \leq \varepsilon \int_{\mathbb{R}^n} \Phi(x-y,t)\,d\mathcal{L}^n(y) = \varepsilon.
\end{align*}
For the far-field term $J$: set $M := 2\|g\|_{L^\infty}$ so that $|g(y) - g(x_0)| \leq M$ for all $y$. For $y \notin B(x_0, 2\rho)$ and $|x - x_0| < \rho$, the triangle inequality gives $|x - y| \geq |y - x_0| - |x - x_0| > 2\rho - \rho = \rho$. Substituting $z = x - y$ and then $w = z/(2\sqrt{t})$:
\begin{align*}
J &\leq M \int_{|x-y| \geq \rho} \Phi(x-y,t)\,d\mathcal{L}^n(y) = M \int_{|z| \geq \rho} \Phi(z,t)\,d\mathcal{L}^n(z) \\
&= M\,\pi^{-n/2} \int_{|w| \geq \rho/(2\sqrt{t})} e^{-|w|^2}\,d\mathcal{L}^n(w).
\end{align*}
As $t \downarrow 0$, the cutoff $\rho/(2\sqrt{t}) \to \infty$ and the Gaussian tail $\int_{|w| \geq R} e^{-|w|^2}\,d\mathcal{L}^n(w) \to 0$ as $R \to \infty$, so $J \to 0$. The bound on $J$ depends only on $t$ and $\rho$ (not on the specific $x$), which gives uniformity in $x$ for $|x - x_0| < \rho$. Combining:
\begin{align*}
|u(x,t) - g(x_0)| \leq I + J < \varepsilon + \varepsilon = 2\varepsilon
\end{align*}
for all $(x,t)$ with $|x - x_0| < \rho$ and $t > 0$ sufficiently small. Since $\varepsilon > 0$ was arbitrary, $u(x,t) \to g(x_0)$ as $(x,t) \to (x_0, 0)$ with $t > 0$.
[/guided]
[/step]