The Heat Equation (or diffusion equation) is the prototypical parabolic partial differential equation. It models processes where a quantity spreads from regions of high concentration to low concentration, smoothing irregularities as time evolves.
[motivation]
## Motivation
### Physical Origins
If $u(x, t)$ represents the temperature at position $x$ and time $t$, Fourier's law states that the heat flux is proportional to the negative temperature gradient: $q = -k \nabla u$. Conservation of energy requires $\partial_t u + \nabla \cdot q = 0$. Combining these gives $\partial_t u = k \Delta u$, which (after normalising $k = 1$) is the heat equation. Beyond thermodynamics, the same equation governs Brownian motion and chemical diffusion and, through its connection to [semigroup theory](/page/Semigroup%20Theory), serves as the foundation for parabolic PDE theory.
### Why Not Just Solve It Pointwise?
For the heat equation on all of $\mathbb{R}^n$, one might attempt a direct [Fourier transform](/page/Fourier%20Transform) approach: $\hat{u}_t = -|\xi|^2 \hat{u}$ gives $\hat{u}(\xi, t) = \hat{g}(\xi) e^{-|\xi|^2 t}$, and inverting produces a convolution $u = \Phi(\cdot, t) * g$. This works well for the Cauchy problem with smooth, rapidly decaying data — but it says nothing about bounded domains, [boundary](/page/Boundary) conditions, maximum principles, or the qualitative structure of solutions.
### The Role of Symmetry and Mean Values
The deeper approach, developed in this article, exploits the *structure* of the equation rather than explicit formulas. The heat equation has a parabolic scaling symmetry — invariance under $(x, t) \mapsto (\lambda x, \lambda^2 t)$ — which determines the form of the fundamental solution via a self-similar ansatz. Once the fundamental solution is in hand, a mean value formula (analogous to the elliptic mean value property) unlocks the qualitative theory: maximum principles, uniqueness, regularity, and derivative estimates all follow from it.
[/motivation]
## Formal Definition
[definition: Heat Equation]
The homogeneous **Heat Equation** for a function $u: \Omega \times (0, \infty) \to \mathbb{R}$ is the second-order linear PDE:
\begin{align*}
\partial_t u - \Delta u = 0,
\end{align*}
where $\Omega \subseteq \mathbb{R}^n$ is an [open set](/page/Open%20Set), $t > 0$ denotes time, and $\Delta = \sum_{i=1}^n \partial_{x_i}^2$ is the spatial Laplacian.
[/definition]
## Derivation of the Fundamental Solution
Our first goal is to find a specific solution on $\mathbb{R}^n \times (0, \infty)$ by exploiting the scaling symmetry of the equation.
### Scaling Invariance
If $u(x, t)$ solves the heat equation and $\lambda > 0$, the rescaled function $u_\lambda(x, t) := u(\lambda x, \lambda^2 t)$ also solves it, since both $\partial_t$ and $\Delta_x$ pick up a factor of $\lambda^2$ under this substitution. This invariance under the dilation $(x, t) \mapsto (\lambda x, \lambda^2 t)$ suggests looking for self-similar solutions of the form
\begin{align*}
u(x, t) = \frac{1}{t^{n/2}} v\!\left(\frac{x}{\sqrt{t}}\right),
\end{align*}
where the exponent $n/2$ is chosen to ensure conservation of total mass: $\int_{\mathbb{R}^n} u \, d\mathcal{L}^n$ is independent of $t$.
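Both claims in this subsection can be checked directly. The display below is a short verification using only the definitions above: the chain rule for $u_\lambda$, and the substitution $y = x/\sqrt{t}$ for the total mass.
\begin{align*}
(\partial_t - \Delta) u_\lambda(x, t) &= \lambda^2 \bigl[(\partial_t - \Delta) u\bigr](\lambda x, \lambda^2 t) = 0, \\
\int_{\mathbb{R}^n} \frac{1}{t^{n/2}} v\!\left(\frac{x}{\sqrt{t}}\right) d\mathcal{L}^n(x) &= \int_{\mathbb{R}^n} v(y) \, d\mathcal{L}^n(y).
\end{align*}
The second identity uses $d\mathcal{L}^n(x) = t^{n/2} \, d\mathcal{L}^n(y)$, so the prefactor $t^{-n/2}$ is exactly what cancels the Jacobian of the change of variables.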
### Reduction to an ODE
Substituting the ansatz $u(x, t) = t^{-n/2} v(y)$ with $y = x/\sqrt{t}$ into $\partial_t u = \Delta u$ and cancelling the common factor $t^{-(n+2)/2}$ yields the elliptic equation
\begin{align*}
\Delta v + \frac{1}{2} y \cdot \nabla v + \frac{n}{2} v = 0.
\end{align*}
Assuming radial symmetry $v(y) = w(|y|)$ and writing $r = |y|$, the radial Laplacian $\Delta v = w'' + \frac{n-1}{r} w'$ reduces this to
\begin{align*}
w'' + \left(\frac{n-1}{r} + \frac{r}{2}\right) w' + \frac{n}{2} w = 0.
\end{align*}
Multiplying by $r^{n-1}$ reveals a total derivative: $(r^{n-1} w')' + \frac{1}{2}(r^n w)' = 0$. Integrating once and imposing decay at infinity gives $w'/w = -r/2$, hence $w(r) = b \, e^{-r^2/4}$. The normalisation $\int_{\mathbb{R}^n} u \, d\mathcal{L}^n = 1$ fixes $b = (4\pi)^{-n/2}$.
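For completeness, the integration step and the normalisation can be written out as follows. This is a sketch under the assumption that $w$ and $w'$ decay fast enough for $r^{n-1} w'$ and $r^n w$ to vanish as $r \to \infty$, which forces the integration constant $a$ to be zero.
\begin{align*}
r^{n-1} w' + \tfrac{1}{2} r^n w = a = 0 \;\Longrightarrow\; w' = -\tfrac{r}{2} w \;\Longrightarrow\; w(r) = b \, e^{-r^2/4}.
\end{align*}
With $y = x/\sqrt{t}$, the mass of $u(x, t) = b \, t^{-n/2} e^{-|x|^2/(4t)}$ is
\begin{align*}
\int_{\mathbb{R}^n} \frac{b}{t^{n/2}} e^{-|x|^2/(4t)} \, d\mathcal{L}^n(x) = b \int_{\mathbb{R}^n} e^{-|y|^2/4} \, d\mathcal{L}^n(y) = b \left(\int_{\mathbb{R}} e^{-s^2/4} \, d\mathcal{L}^1(s)\right)^{n} = b \, (4\pi)^{n/2},
\end{align*}
so requiring unit mass gives $b = (4\pi)^{-n/2}$.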
[definition: Fundamental Solution]
The **fundamental solution** (or heat kernel) of the heat equation is the function $\Phi: \mathbb{R}^n \times (\mathbb{R} \setminus \{0\}) \to \mathbb{R}$ defined by:
\begin{align*}
\Phi(x, t) := \begin{cases} \dfrac{1}{(4\pi t)^{n/2}} \, e^{-|x|^2/(4t)} & \text{for } x \in \mathbb{R}^n, \, t > 0, \\ 0 & \text{for } x \in \mathbb{R}^n, \, t < 0. \end{cases}
\end{align*}
[/definition]
The fundamental solution is smooth, satisfies the heat equation, has unit mass for each $t > 0$, and converges to $\delta_0$ as $t \downarrow 0$.
[quotetheorem:53]
The delta-[limit](/page/Limit) property (point 4) is the key to the initial value problem: it means that convolving $\Phi(\cdot, t)$ with initial data $g$ should recover $g$ as $t \to 0$.
[example: The Heat Kernel In One Dimension]
For $n = 1$, the fundamental solution is $\Phi(x, t) = (4\pi t)^{-1/2} e^{-x^2/(4t)}$. At time $t = 0.01$, this is a narrow Gaussian of standard deviation $\sigma = \sqrt{2t} = \sqrt{0.02} \approx 0.14$, concentrated near the origin. By $t = 1$, the standard deviation has grown to $\sqrt{2} \approx 1.41$, and the peak has dropped by a factor of $10$. The total area remains $1$ throughout — mass is redistributed, not created or destroyed.
[/example]
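The numbers in this example can be reproduced with a few lines of Python. The sketch below is illustrative only: the helper names `heat_kernel`, `trapezoid`, and `mass` are hypothetical, and the integral over $\mathbb{R}$ is truncated at 12 standard deviations, which is an assumption of the sketch rather than part of the theory.

```python
import math

def heat_kernel(x: float, t: float) -> float:
    """One-dimensional fundamental solution Phi(x, t); zero for t < 0."""
    if t <= 0:
        # t = 0 lies outside the domain of Phi; returning 0 is a convenience.
        return 0.0
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def trapezoid(f, a: float, b: float, steps: int) -> float:
    """Composite trapezoidal rule for f on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

def mass(t: float) -> float:
    """Approximate integral of Phi(., t), truncated at 12 standard deviations."""
    cutoff = 12.0 * math.sqrt(2.0 * t)  # sigma = sqrt(2 t)
    return trapezoid(lambda x: heat_kernel(x, t), -cutoff, cutoff, 4000)

# Ratio of the peak heights at t = 0.01 and t = 1.
peak_ratio = heat_kernel(0.0, 0.01) / heat_kernel(0.0, 1.0)

print(mass(0.01), mass(1.0), peak_ratio)
```

Both masses should come out equal to $1$ up to quadrature error, and the peak ratio equals $\sqrt{1/0.01} = 10$.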
## Solution to the Initial-Value Problem
The linearity of the heat equation and the delta-limit property of $\Phi$ suggest that the solution to the Cauchy problem with initial data $g$ is the [convolution](/page/Convolution) $u(\cdot, t) = \Phi(\cdot, t) * g$.
[quotetheorem:54]
The hypotheses $g \in C(\mathbb{R}^n) \cap L^\infty(\mathbb{R}^n)$ can be relaxed (e.g., to $g \in L^p$ for $1 \leq p \leq \infty$), but the bounded [continuous](/page/Continuity) case captures the essential mechanism: the smoothness of $\Phi$ forces $u$ to be $C^\infty$ for $t > 0$ regardless of the regularity of $g$.
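As a sanity check on the convolution formula, take $n = 1$ and $g(x) = \cos x$, which is bounded and continuous. The Gaussian integral can then be evaluated in closed form, giving $u(x, t) = e^{-t} \cos x$. The sketch below compares that with a direct quadrature; the function name `solve_cauchy`, the truncation width, and the step count are illustrative assumptions.

```python
import math

def heat_kernel(z: float, t: float) -> float:
    """One-dimensional fundamental solution Phi(z, t) for t > 0."""
    return math.exp(-z * z / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def solve_cauchy(g, x: float, t: float, steps: int = 4000) -> float:
    """Approximate u(x, t) = integral of Phi(x - y, t) g(y) dy (trapezoidal rule)."""
    half_width = 12.0 * math.sqrt(2.0 * t)  # truncate the Gaussian tails
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = x - half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * heat_kernel(x - y, t) * g(y)
    return total * h

u_numeric = solve_cauchy(math.cos, 0.3, 0.5)
u_exact = math.exp(-0.5) * math.cos(0.3)
print(u_numeric, u_exact)
```

The two printed values should agree to many digits, since the trapezoidal rule is very accurate for smooth, rapidly decaying integrands.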
## Nonhomogeneous Problem
When an external source $f(x, t)$ drives the system ($\partial_t u - \Delta u = f$), the solution is constructed by superposing the responses to instantaneous pulses injected at each time $s \in (0, t)$, each of which then evolves under the homogeneous equation for the remaining duration $t - s$.
[quotetheorem:55]
To solve the general problem with both nonzero source $f$ and nonzero initial data $g$, linearity allows us to add the two contributions:
\begin{align*}
u(x, t) = \int_{\mathbb{R}^n} \Phi(x - y, t) g(y) \, d\mathcal{L}^n(y) + \int_0^t \int_{\mathbb{R}^n} \Phi(x - y, t - s) f(y, s) \, d\mathcal{L}^n(y) \, d\mathcal{L}^1(s).
\end{align*}
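The combined formula can be tested on a case with a known answer. For $n = 1$, $g(x) = \cos x$ and $f(x, s) = \cos x$, the function $u(x, t) = \cos x$ solves $\partial_t u - \Delta u = f$ with $u(\cdot, 0) = g$. The sketch below evaluates both integrals numerically; the names `kernel_average` and `solve_heat`, the midpoint rule in $s$, and the substitution $z = \sqrt{\tau}\, w$ are choices made for this illustration.

```python
import math

GAUSS_NORM = 1.0 / math.sqrt(4.0 * math.pi)

def kernel_average(h, tau: float, steps: int = 400) -> float:
    """Approximate the integral of Phi(z, tau) h(z) over z in R (n = 1).

    The substitution z = sqrt(tau) * w turns the kernel into the fixed
    Gaussian (4 pi)^(-1/2) exp(-w^2 / 4), which stays resolved as tau -> 0.
    """
    scale = math.sqrt(tau)
    width = 12.0  # exp(-width^2 / 4) is about 2e-16
    dw = 2.0 * width / steps
    total = 0.0
    for i in range(steps + 1):
        w = -width + i * dw
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * GAUSS_NORM * math.exp(-w * w / 4.0) * h(scale * w)
    return total * dw

def solve_heat(g, f, x: float, t: float, time_steps: int = 200) -> float:
    """Homogeneous part plus Duhamel integral, midpoint rule in s."""
    u = kernel_average(lambda z: g(x - z), t)
    ds = t / time_steps
    for j in range(time_steps):
        s = (j + 0.5) * ds  # midpoint of the j-th time slab
        u += ds * kernel_average(lambda z: f(x - z, s), t - s)
    return u

source = lambda y, s: math.cos(y)
u_numeric = solve_heat(math.cos, source, x=0.3, t=1.0)
print(u_numeric, math.cos(0.3))
```

With zero initial data the same source gives $(1 - e^{-t}) \cos x$, which is the Duhamel term on its own.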
## Mean-Value Formula
Harmonic functions satisfy the mean value property: the value at a point equals the average over any sphere centred at that point and contained in the domain. Solutions to the heat equation satisfy an analogous identity, but the averaging region must reflect the parabolic scaling: it is a "heat ball" defined by the level [sets](/page/Set) of the fundamental solution.
[definition: Heat Ball]
For $x \in \mathbb{R}^n$, $t \in \mathbb{R}$, and $r > 0$, the **heat ball** is
\begin{align*}
E(x, t; r) := \left\{(y, s) \in \mathbb{R}^{n+1} : s \leq t, \; \Phi(x - y, t - s) \geq \frac{1}{r^n}\right\}.
\end{align*}
[/definition]
The heat ball is an egg-shaped region in space-time lying below the point $(x, t)$, which sits at its top. It reaches back in time to $s = t - r^2/(4\pi)$, and its boundary is the level set $\Phi(x - y, t - s) = r^{-n}$.
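Taking logarithms in the defining inequality gives an explicit description of the heat ball. Writing $\tau = t - s$ (a shorthand introduced only for this computation), the condition $\Phi(x - y, \tau) \geq r^{-n}$ is equivalent to
\begin{align*}
|x - y|^2 \leq 2n\tau \log\!\left(\frac{r^2}{4\pi\tau}\right), \qquad 0 < \tau \leq \frac{r^2}{4\pi}.
\end{align*}
So the time slice of $E(x, t; r)$ at $s = t - \tau$ is a closed ball in $\mathbb{R}^n$ centred at $x$. Its radius tends to zero as $\tau \to 0$ and as $\tau \to r^2/(4\pi)$, and it is largest at $\tau = r^2/(4\pi e)$, where it equals $r \sqrt{n/(2\pi e)}$.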
[quotetheorem:559]
The weight $|x - y|^2 / (t - s)^2$ in the mean value formula compensates for the singularity of the kernel near $(x, t)$ and arises naturally from the self-similar structure of $\Phi$. Unlike the elliptic case, the averaging region is not symmetric in time: it extends only into the past, reflecting the causal structure of diffusion.
## Properties of Solutions
The mean value formula is the engine behind the qualitative theory. Just as for harmonic [functions](/page/Function), it implies maximum principles, uniqueness, regularity, and quantitative estimates on [derivatives](/page/Derivative).
### Parabolic Boundary and Cylinder
To state the maximum principle on bounded domains, we need the parabolic geometry.
[definition: Parabolic Cylinder]
Let $\Omega \subseteq \mathbb{R}^n$ be a bounded open set and $T > 0$. The **parabolic cylinder** is $\Omega_T := \Omega \times (0, T]$. The **parabolic boundary** is $\Gamma_T := \overline{\Omega}_T \setminus \Omega_T$, consisting of the bottom face $\Omega \times \{0\}$ and the lateral sides $\partial\Omega \times [0, T]$, but excluding the top cap $\Omega \times \{T\}$.
[/definition]
The asymmetry — the top is interior, the bottom is boundary — reflects the causal direction of diffusion: the future is determined by the past and the boundary, not the other way around.
### Maximum Principle
The mean value formula forces any interior maximum to propagate backwards in time through heat balls, ultimately reaching the parabolic boundary.
[quotetheorem:560]
[example: Why The Top Cap Is Excluded]
Consider $u(x, t) = t$ on $\Omega_T = (0, 1) \times (0, 1]$. This solves $\partial_t u - \Delta u = 1 \neq 0$, so it does not satisfy the heat equation — but it illustrates the geometry: $u$ attains its maximum $u = 1$ on the top cap $\{t = 1\}$, which is part of $\Omega_T$ but not of $\Gamma_T$. For actual solutions of the homogeneous equation, this cannot happen: the maximum is forced onto $\Gamma_T$.
[/example]
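A discrete analogue of the maximum principle can be observed numerically. The sketch below uses the explicit forward-time, centred-space finite difference scheme, which is not discussed in this article and is brought in purely for illustration. When $\Delta t / \Delta x^2 \leq 1/2$, each new interior value is a convex combination of three earlier values, so the grid maximum and minimum are attained on the discrete parabolic boundary. The function name `ftcs_heat`, the grid, and the boundary data are assumptions of the example.

```python
import math

def ftcs_heat(initial, left, right, dx, dt, steps):
    """Explicit forward-time, centred-space scheme for u_t = u_xx.

    initial: grid values at t = 0, including both endpoints
    left, right: functions of t giving the lateral boundary values
    Returns a list of time levels with levels[k][i] ~ u(i * dx, k * dt).
    """
    lam = dt / dx ** 2
    if lam > 0.5:
        raise ValueError("need dt / dx^2 <= 1/2 for the discrete maximum principle")
    levels = [list(initial)]
    for k in range(1, steps + 1):
        prev = levels[-1]
        new = [0.0] * len(prev)
        new[0], new[-1] = left(k * dt), right(k * dt)
        for i in range(1, len(prev) - 1):
            # Convex combination of three neighbours because lam <= 1/2.
            new[i] = (1.0 - 2.0 * lam) * prev[i] + lam * (prev[i - 1] + prev[i + 1])
        levels.append(new)
    return levels

nodes, dx, dt, steps = 21, 0.05, 0.001, 500
initial = [math.sin(math.pi * i * dx) for i in range(nodes)]
levels = ftcs_heat(initial, lambda t: 0.0, lambda t: 0.8 * t, dx, dt, steps)

# Discrete parabolic boundary: bottom face (t = 0) plus the two lateral sides.
boundary = list(levels[0]) + [lvl[j] for lvl in levels[1:] for j in (0, -1)]
# Everything else, including the top cap, counts as interior.
interior = [v for lvl in levels[1:] for v in lvl[1:-1]]

print(max(interior) <= max(boundary) + 1e-12, min(interior) >= min(boundary) - 1e-12)
```

Note that the top time level is counted as interior here, mirroring the definition of $\Omega_T$.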
### Uniqueness
[quotetheorem:561]
The proof reduces the problem to the maximum principle: the difference of two solutions satisfies the homogeneous equation with zero parabolic boundary data, so applying the maximum principle to the difference and to its negative bounds it above and below by $0$, and it vanishes identically.
### Regularity
The heat equation has a powerful smoothing effect: even rough initial data produces infinitely differentiable solutions for $t > 0$.
[quotetheorem:562]
The proof uses a space-time cutoff to localise the solution, applies [Duhamel's Principle](/theorems/55) to represent it as a convolution with the smooth kernel $\Phi$ against a compactly supported source, and differentiates under the [integral](/page/Integral) sign. A separate feature of the equation, which follows from the strict positivity of $\Phi$ in the convolution formula $u(\cdot, t) = \Phi(\cdot, t) * g$ rather than from smoothness, is **infinite speed of propagation**: if the initial data $g \geq 0$ is bounded, continuous, and not identically zero, then $u(x, t) > 0$ for every $x \in \mathbb{R}^n$ and every $t > 0$, though the values are exponentially small at large distances.
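Infinite speed of propagation can be seen in a closed-form example. For $n = 1$ and $g$ the indicator function of $[-1, 1]$, the convolution $\Phi(\cdot, t) * g$ reduces to a difference of error functions. This $g$ is bounded but not continuous, so it sits in the relaxed $L^p$ setting mentioned earlier rather than in the bounded continuous one. The sketch below evaluates it far from the support; the name `box_solution` is hypothetical, and the formula is written with `math.erfc` so that the tiny far-field values are not lost to cancellation.

```python
import math

def box_solution(x: float, t: float) -> float:
    """u(x, t) = (Phi(., t) * g)(x) for g = indicator of [-1, 1], n = 1, t > 0.

    Equivalent to 0.5 * (erf((x + 1) / s) - erf((x - 1) / s)) with s = 2 sqrt(t),
    but expressed through erfc to keep small values accurate.
    """
    x = abs(x)  # the solution is even in x; this avoids cancellation for x < 0
    s = 2.0 * math.sqrt(t)
    return 0.5 * (math.erfc((x - 1.0) / s) - math.erfc((x + 1.0) / s))

far_value = box_solution(10.0, 0.1)   # nine units outside the support
near_value = box_solution(0.0, 1e-4)  # centre of the support, very early time
print(far_value, near_value)
```

The far value is strictly positive but astronomically small, while the value at the centre is still essentially $1$ at this early time.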
### Derivative Estimates
The smoothness result is qualitative. The following quantitative estimate bounds the derivatives of a solution on an inner cylinder by the $L^1$ norm on a larger cylinder, with explicit dependence on the cylinder radius.
[quotetheorem:563]
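The exponent $n + 2 + k$ reflects the parabolic scaling: $n + 2$ is the scaling dimension of the cylinder, whose volume is proportional to $r^{n+2}$ ($r^n$ from the $n$ spatial dimensions and $r^2$ from time), and each of the $k$ spatial derivatives contributes one further factor of $r^{-1}$.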
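The exponent can be recovered by a scaling argument. The sketch below assumes the notation $C(x, t; r) = \{(y, s) : |x - y| \leq r, \; t - r^2 \leq s \leq t\}$ for the closed cylinder and an inner cylinder of radius $r/2$, as in Evans; the theorem's own conventions may differ. Set $v(y, s) := u(x + r y, t + r^2 s)$, which again solves the heat equation. Then
\begin{align*}
D_y^k v(y, s) = r^k (D^k u)(x + r y, t + r^2 s), \qquad \|v\|_{L^1(C(0, 0; 1))} = \frac{1}{r^{n+2}} \|u\|_{L^1(C(x, t; r))}.
\end{align*}
If the estimate holds on the unit cylinder with constant $C_k$, applying it to $v$ gives
\begin{align*}
\max_{C(x, t; r/2)} |D^k u| = \frac{1}{r^k} \max_{C(0, 0; 1/2)} |D^k v| \leq \frac{C_k}{r^k} \|v\|_{L^1(C(0, 0; 1))} = \frac{C_k}{r^{n+2+k}} \|u\|_{L^1(C(x, t; r))}.
\end{align*}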
## Energy Methods
The maximum principle controls solutions pointwise. Energy methods provide complementary $L^2$ control and extend naturally to settings (systems, higher-order equations) where pointwise methods fail.
[quotetheorem:564]
The identity $dE/dt = -\int |\nabla u|^2$ has a clear physical interpretation: energy is dissipated at a rate equal to the Dirichlet integral $\int |\nabla u|^2$, so steep temperature profiles dissipate faster. This immediately gives an alternative proof of uniqueness: if two solutions agree on the parabolic boundary, their difference vanishes on the lateral boundary, has zero energy at $t = 0$, and has non-increasing energy thereafter, forcing it to vanish identically.
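The identity can be checked by hand on a separable solution. The computation below assumes the normalisation $E(t) = \tfrac{1}{2} \int_\Omega u^2 \, d\mathcal{L}^n$ and zero boundary values; the theorem's own normalisation may differ by a constant factor. The domain $\Omega = (0, 1)$ and the solution $u(x, t) = e^{-\pi^2 t} \sin(\pi x)$ are chosen for illustration.
\begin{align*}
E(t) = \frac{1}{2} \int_0^1 e^{-2\pi^2 t} \sin^2(\pi x) \, d\mathcal{L}^1(x) = \frac{1}{4} e^{-2\pi^2 t}, \qquad \int_0^1 |\partial_x u|^2 \, d\mathcal{L}^1(x) = \frac{\pi^2}{2} e^{-2\pi^2 t},
\end{align*}
and indeed $dE/dt = -\tfrac{\pi^2}{2} e^{-2\pi^2 t} = -\int_0^1 |\partial_x u|^2 \, d\mathcal{L}^1$.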
### Backwards Uniqueness
The heat equation is irreversible: solving it backwards in time is ill-posed, since arbitrarily small perturbations of the final data can grow exponentially. Nevertheless, two solutions with the same lateral boundary data that agree at the *final* time must have agreed at all earlier times: distinct solutions cannot merge.
[quotetheorem:565]
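The proof establishes that the $L^2$ energy $e(t) = \int w^2 \, d\mathcal{L}^n$ of the difference $w = u - v$ is **log-convex**: $\log e(t)$ is a convex function of $t$ on any interval where $e > 0$. If $e > 0$ on $[t_1, t_2)$ and $e(t_2) = 0$, then convexity on $[t_1, t']$, followed by letting $t' \uparrow t_2$, gives $e((1 - \tau) t_1 + \tau t_2) \leq e(t_1)^{1 - \tau} e(t_2)^{\tau} = 0$ for $0 < \tau < 1$, a contradiction. Hence $e(T) = 0$ forces $e \equiv 0$ on $[0, T]$.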
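A family of separable solutions illustrates both the ill-posedness and the log-convexity. The domain $\Omega = (0, \pi)$ with zero boundary values and the functions $u_k$ (for integers $k \geq 1$) are illustrative choices, not part of the theorem.
\begin{align*}
u_k(x, t) = e^{-k^2 t} \sin(kx), \qquad e_k(t) = \int_0^\pi u_k^2 \, d\mathcal{L}^1 = \frac{\pi}{2} e^{-2k^2 t}, \qquad \log e_k(t) = \log\frac{\pi}{2} - 2k^2 t.
\end{align*}
Here $\log e_k$ is affine in $t$, hence convex, and $e_k(0)/e_k(T) = e^{2k^2 T}$. Recovering the initial state from the final state therefore amplifies the $k$-th mode by a factor $e^{k^2 T}$ in $L^2$ norm, which is unbounded in $k$.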
## References
- Evans, L. C. (2010). *Partial Differential Equations* (2nd ed.). American Mathematical Society.
- Folland, G. B. (1995). *Introduction to Partial Differential Equations*. Princeton University Press.
- John, F. (1982). *Partial Differential Equations*. Springer-Verlag.