Distribution

Classical analysis is built on functions: measurable maps from $\mathbb{R}^n$ (or an open subset) to $\mathbb{R}$. But many natural objects — the Dirac delta, derivatives of discontinuous functions, Green's functions of elliptic operators — are not functions in any reasonable sense. The theory of distributions, introduced by Laurent Schwartz in the 1940s and 1950s, resolves these difficulties by replacing pointwise evaluation with continuous linear functionals on a space of test functions. Instead of asking "what is the value of $u$ at $x$?", one asks "what is $T(\varphi)$ for every smooth, compactly supported $\varphi$?" This shift allows distributions to encompass delta masses, derivatives of discontinuous functions, and solutions to PDEs in the weakest possible sense.

Motivation

[motivation]

Why Functions Are Not Enough

Consider the one-dimensional wave equation $u_{tt} - u_{xx} = 0$ on $\mathbb{R} \times (0, \infty)$ with initial data $u(x, 0) = f(x)$ and $u_t(x, 0) = 0$. D'Alembert's formula gives $u(x,t) = \tfrac{1}{2}(f(x+t) + f(x-t))$. If $f \in C^2(\mathbb{R})$, this is a classical solution: $u_{tt}$ and $u_{xx}$ exist pointwise and are equal. But if $f$ is merely continuous — say, a triangular pulse with a corner — then $u$ is continuous but not $C^2$, and the equation $u_{tt} = u_{xx}$ has no pointwise meaning at the corner. Yet the formula still describes the physically correct propagation of the wave. We need a framework in which $u$ "solves" the wave equation without requiring pointwise derivatives to exist.

The situation is worse for nonlinear conservation laws. The inviscid Burgers equation $u_t + uu_x = 0$ with smooth initial data can develop discontinuities (shock waves) in finite time. After the shock forms, no classical solution exists, but the physics continues — the shock propagates according to the Rankine–Hugoniot conditions. The equation must be interpreted in a sense that allows discontinuous solutions and their "derivatives" to make sense.

The Integration-by-Parts Idea

The key observation is that testing against smooth functions can replace pointwise evaluation. Suppose $f \in C^{|\alpha|}(\Omega)$ and $\varphi \in C_c^\infty(\Omega)$. Integration by parts gives
\begin{align*} \int_\Omega (\partial^\alpha f)(x)\, \varphi(x) \, d\mathcal{L}^n(x) &= (-1)^{|\alpha|} \int_\Omega f(x)\, (\partial^\alpha \varphi)(x) \, d\mathcal{L}^n(x), \end{align*}
with no boundary terms because $\varphi$ has compact support in $\Omega$. The right-hand side makes sense even when $f$ is merely locally integrable — one never differentiates $f$, only the smooth test function $\varphi$. This suggests defining the "derivative" of $f$ as the rule that assigns to each $\varphi$ the number $(-1)^{|\alpha|} \int f \, \partial^\alpha \varphi \, d\mathcal{L}^n$. More generally, any linear functional on test functions that is continuous in a suitable sense can serve as a "generalised function" — a distribution.
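The identity can be checked numerically in a case where $f$ is not classically differentiable. The following Python sketch is an illustration only: the choice $f(x) = |x|$, the shifted bump test function, the helper names `bump`, `bump_prime`, `midpoint` and `sign`, and the midpoint quadrature are all assumptions made for this example. It compares $-\int f \varphi'$ with $\int \operatorname{sgn}(x)\,\varphi(x)$, where $\operatorname{sgn}$ is the classical derivative of $|x|$ away from the corner.

```python
import math

CENTER, RADIUS = 0.3, 1.0  # illustrative choice: bump supported in [-0.7, 1.3]

def bump(x):
    """Smooth test function exp(-1/(1 - s^2)) for |s| < 1, zero otherwise."""
    s = (x - CENTER) / RADIUS
    u = 1.0 - s * s
    return math.exp(-1.0 / u) if u > 0.0 else 0.0

def bump_prime(x):
    """Classical derivative of bump, computed by the chain rule."""
    s = (x - CENTER) / RADIUS
    u = 1.0 - s * s
    if u <= 0.0:
        return 0.0
    return math.exp(-1.0 / u) * (-2.0 * s / (u * u)) / RADIUS

def midpoint(g, a, b, n):
    """Composite midpoint rule for the integral of g over [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def sign(x):
    """Derivative of |x| away from the corner at 0."""
    return 1.0 if x > 0 else -1.0

# The interval [-1, 2] contains the support; with 3000 cells the corner x = 0
# falls on a cell boundary, which keeps the quadrature error small.
lhs = -midpoint(lambda x: abs(x) * bump_prime(x), -1.0, 2.0, 3000)
rhs = midpoint(lambda x: sign(x) * bump(x), -1.0, 2.0, 3000)
print(lhs, rhs)
```

The two printed values should agree up to quadrature error, even though $|x|$ has no classical derivative at the origin.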

From Weak Derivatives to Distributions

The weak derivative, central to Sobolev space theory, is a special case: $v \in L^p(\Omega)$ is the weak derivative $\partial^\alpha f$ if $\int v \varphi \, d\mathcal{L}^n = (-1)^{|\alpha|} \int f \, \partial^\alpha \varphi \, d\mathcal{L}^n$ for all $\varphi \in C_c^\infty(\Omega)$. But weak derivatives are required to be functions in $L^p$. Distributions remove this restriction: the distributional derivative of $T_f$ is the functional $\varphi \mapsto (-1)^{|\alpha|} \int f \, \partial^\alpha \varphi \, d\mathcal{L}^n$, which need not be representable by any locally integrable function. The Heaviside function $H$ has no weak derivative in any $L^p$ (its distributional derivative is the Dirac delta, which is not a function), but $T_H$ has a perfectly well-defined distributional derivative. The theory of distributions is the completion of the weak derivative idea: every locally integrable function generates a distribution whose distributional derivatives of all orders exist, regardless of the regularity of the original function.

[/motivation]

Test Functions

The definition of a distribution requires a space of "probe" functions against which distributions act. Smoothness ensures that integration by parts produces no error terms from non-differentiability; compact support ensures that boundary contributions vanish and all pairings are finite.

[definition: Space Of Test Functions]
Let $\Omega \subseteq \mathbb{R}^n$ be a non-empty open set. The space of test functions on $\Omega$ is
\begin{align*} \mathcal{D}(\Omega) &:= \{\varphi \in C^\infty(\Omega) \mid \mathrm{supp}(\varphi) \subset \Omega \text{ is compact}\}. \end{align*}
It carries the strict inductive limit topology: the finest locally convex topology making each inclusion $\mathcal{D}_K(\Omega) \hookrightarrow \mathcal{D}(\Omega)$ continuous, where $\mathcal{D}_K(\Omega)$ is the Fréchet space of smooth functions supported in a compact set $K \subset \Omega$.
[/definition]

This topology is Hausdorff and complete but not metrisable. Despite the non-metrisability, convergence of sequences has a concrete description: $\varphi_k \to \varphi$ in $\mathcal{D}(\Omega)$ if and only if the supports of all $\varphi_k$ lie in a single compact set $K \subset \Omega$ and $\partial^\alpha \varphi_k \to \partial^\alpha \varphi$ uniformly for every multi-index $\alpha$ (Theorem 448). The full construction (bump functions, mollifiers, partitions of unity, and density results) is developed on the Test Functions page.

The Space of Distributions

With the test function space and its topology in hand, a distribution is defined as a continuous linear functional — a linear map from $\mathcal{D}(\Omega)$ to the scalars that is continuous with respect to the strict inductive limit topology.

[definition: Distribution]
Let $\Omega \subseteq \mathbb{R}^n$ be a non-empty open set and equip $\mathcal{D}(\Omega)$ with the strict inductive limit topology. A distribution on $\Omega$ is a continuous linear map $T: \mathcal{D}(\Omega) \to \mathbb{R}$.
[/definition]

The definition is topological: continuity of $T$ means that the preimage $T^{-1}(U)$ of every open set $U \subseteq \mathbb{R}$ is open in $\mathcal{D}(\Omega)$. Since the strict inductive limit topology on $\mathcal{D}(\Omega)$ is abstract and non-metrisable, it is natural to ask whether there are more concrete ways to verify that a given linear functional is a distribution. The following result provides two equivalent characterisations.

[quotetheorem:449]

The equivalence (1) $\Leftrightarrow$ (2) is the reason one can work with sequences throughout distribution theory despite the non-metrisability of $\mathcal{D}(\Omega)$: a linear functional is continuous if and only if it is sequentially continuous. This equivalence is a consequence of the LF-space structure (strict inductive limit of Fréchet spaces) and can fail in general locally convex spaces. The equivalence (1) $\Leftrightarrow$ (3) gives a quantitative bound: the integer $N_K$ measures how many derivatives of the test function the distribution "uses" on the compact set $K$. When a single $N$ works for all compact sets, $T$ is said to have finite order (at most $N$). For instance, regular distributions generated by $L^1_\mathrm{loc}$ functions have order $0$ (only the supremum of $\varphi$ appears), as does the Dirac delta, whereas a derivative $\partial^\alpha \delta_{x_0}$ of the delta has order $|\alpha|$.

[definition: Space Of Distributions]
Let $\Omega \subseteq \mathbb{R}^n$ be a non-empty open set. The space of distributions on $\Omega$, denoted $\mathcal{D}'(\Omega)$, is the set of all distributions on $\Omega$. It is a vector space under pointwise operations:
\begin{align*} (T_1 + T_2)(\varphi) &:= T_1(\varphi) + T_2(\varphi), \\ (\lambda T)(\varphi) &:= \lambda \, T(\varphi) \end{align*}
for $T_1, T_2, T \in \mathcal{D}'(\Omega)$, $\lambda \in \mathbb{R}$, and $\varphi \in \mathcal{D}(\Omega)$.
[/definition]

The space $\mathcal{D}'(\Omega)$ is the continuous dual of the locally convex space $\mathcal{D}(\Omega)$, and as such it carries a natural topology: the weak* topology $\sigma(\mathcal{D}'(\Omega), \mathcal{D}(\Omega))$, which is the coarsest topology making every evaluation map $\operatorname{ev}_\varphi: T \mapsto T(\varphi)$ continuous. By the Pointwise Characterisation of Weak Star Convergence, a net (or sequence) $\{T_k\}$ converges to $T$ in $\sigma(\mathcal{D}'(\Omega), \mathcal{D}(\Omega))$ if and only if
\begin{align*} T_k(\varphi) \to T(\varphi) \quad \text{for every } \varphi \in \mathcal{D}(\Omega). \end{align*}
This is convergence "tested pointwise against all test functions," and it is the standard notion of convergence in distribution theory.
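Convergence in $\mathcal{D}'$ can hold even when the generating functions do not converge in any classical sense. A standard illustration is $f_k(x) = \sin(kx)$ on $\mathbb{R}$: the sequence does not converge in $L^1_\mathrm{loc}$, yet $T_{f_k} \to 0$ in $\mathcal{D}'(\mathbb{R})$ by the Riemann–Lebesgue lemma. The Python sketch below tests this against one fixed test function; the bump, the frequencies, and the helper names `bump` and `pair_with_sine` are assumptions chosen for the example, and a single test function illustrates the convergence but does not prove it.

```python
import math

def bump(x, center=0.3, radius=1.0):
    """Smooth test function supported in [center - radius, center + radius]."""
    s = (x - center) / radius
    u = 1.0 - s * s
    return math.exp(-1.0 / u) if u > 0.0 else 0.0

def pair_with_sine(k, n=20000):
    """Approximate T_k(phi) = integral of sin(k x) * phi(x) by the midpoint rule."""
    a, b = -1.0, 2.0  # contains the support of the bump
    h = (b - a) / n
    return h * sum(math.sin(k * (a + (i + 0.5) * h)) * bump(a + (i + 0.5) * h)
                   for i in range(n))

frequencies = [1, 4, 16, 64]
pairings = [pair_with_sine(k) for k in frequencies]
for k, value in zip(frequencies, pairings):
    print(k, value)
```

The printed pairings should shrink towards $0$ as $k$ grows, which is exactly the statement $T_{f_k}(\varphi) \to 0$ for this particular $\varphi$.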

Regular and Singular Distributions

Every locally integrable function gives rise to a distribution by integration, but not every distribution arises this way. The distinction between these two cases is fundamental.

[definition: Regular Distribution]
Let $f \in L^1_\mathrm{loc}(\Omega)$. The regular distribution generated by $f$ is the distribution $T_f \in \mathcal{D}'(\Omega)$ defined by
\begin{align*} T_f: \mathcal{D}(\Omega) &\to \mathbb{R} \\ \varphi &\mapsto \int_\Omega f(x)\, \varphi(x) \, d\mathcal{L}^n(x). \end{align*}
[/definition]

That $T_f$ is indeed a distribution must be verified. Linearity follows from linearity of the Lebesgue integral. For continuity, one checks the seminorm condition from the Characterisation of Distributions: if $\mathrm{supp}(\varphi) \subseteq K$, then $|T_f(\varphi)| \leq \left(\int_K |f| \, d\mathcal{L}^n\right) \sup_K |\varphi|$, which is a bound of the required form with $N_K = 0$ and $C_K = \int_K |f| \, d\mathcal{L}^n < \infty$ (the finiteness uses local integrability and compactness of $K$). The order-zero bound means that $T_f$ depends only on the supremum of $\varphi$ — not on any of its derivatives.

The map $f \mapsto T_f$ sends locally integrable functions to distributions. A natural question is whether distinct functions generate distinct distributions — i.e., whether $T_f$ retains all the information about $f$ (up to null sets). The answer is yes.

[quotetheorem:450]

The injectivity means that no information is lost in passing from $f$ to $T_f$: if $T_f = T_g$ as distributions, then $f = g$ as elements of $L^1_\mathrm{loc}(\Omega)$. The proof uses mollification with a standard mollifier $\rho_\varepsilon$ supported in $B(0, \varepsilon)$: if $x \in \Omega$ and $\varepsilon < \mathrm{dist}(x, \partial\Omega)$, then $y \mapsto \rho_\varepsilon(x - y)$ is a test function on $\Omega$, so the hypothesis $T_f = T_g$ gives $(f * \rho_\varepsilon)(x) = (g * \rho_\varepsilon)(x)$; letting $\varepsilon \to 0$ recovers $f = g$ a.e. via the Lebesgue differentiation theorem. Despite the injectivity, the objects $f$ and $T_f$ remain logically distinct: $f$ is an equivalence class of measurable functions, while $T_f$ is a continuous linear functional on $\mathcal{D}(\Omega)$. We maintain this distinction throughout.

A second compatibility question arises when $f \in L^1(\mathbb{R}^n)$ and $T_f$ is viewed as a tempered distribution: does the distributional Fourier transform $\widehat{T_f}$ (defined by duality as $\widehat{T_f}(\varphi) := T_f(\hat{\varphi})$) agree with the regular distribution $T_{\hat{f}}$ generated by the classical Fourier transform $\hat{f}(\xi) = \int f(x)\,e^{-ix\cdot\xi}\,d\mathcal{L}^n(x)$? The answer is yes — the two notions are compatible, with the proof reducing to Fubini's theorem.

[quotetheorem:718]

[definition: Singular Distribution]
A distribution $T \in \mathcal{D}'(\Omega)$ is singular if there is no $f \in L^1_\mathrm{loc}(\Omega)$ such that $T = T_f$.
[/definition]

The most important singular distribution is the Dirac delta, which evaluates a test function at a prescribed point.

[example: Dirac Delta]
For $x_0 \in \Omega$, the Dirac delta at $x_0$ is the map
\begin{align*} \delta_{x_0}: \mathcal{D}(\Omega) &\to \mathbb{R} \\ \varphi &\mapsto \varphi(x_0). \end{align*}
Linearity is immediate from the linearity of evaluation. For continuity, the seminorm estimate $|\delta_{x_0}(\varphi)| = |\varphi(x_0)| \leq \sup_K |\varphi|$ holds for any compact $K$ containing $x_0$, with $N_K = 0$ and $C_K = 1$. Thus $\delta_{x_0} \in \mathcal{D}'(\Omega)$, and it has order $0$.

The distribution $\delta_{x_0}$ is singular. To see this, suppose for contradiction that $\delta_{x_0} = T_f$ for some $f \in L^1_\mathrm{loc}(\Omega)$. Fix $\psi \in \mathcal{D}(\mathbb{R}^n)$ with $0 \leq \psi \leq 1$, $\psi(0) = 1$ and $\mathrm{supp}(\psi) \subseteq \overline{B(0,1)}$, and set $\varphi_k(x) := \psi(k(x - x_0))$. For $k$ large enough, $\varphi_k \in \mathcal{D}(\Omega)$ with $\mathrm{supp}(\varphi_k) \subseteq \overline{B(x_0, 1/k)}$, and $\delta_{x_0}(\varphi_k) = \varphi_k(x_0) = 1$ for every such $k$. On the other hand, $|T_f(\varphi_k)| \leq \int_{B(x_0, 1/k)} |f| \, d\mathcal{L}^n \to 0$ as $k \to \infty$ by the dominated convergence theorem, since $f$ is integrable on a fixed ball around $x_0$ and the balls $B(x_0, 1/k)$ shrink to the null set $\{x_0\}$. Thus $1 = \delta_{x_0}(\varphi_k) = T_f(\varphi_k) \to 0$, a contradiction.
[/example]

The Distributional Derivative

The central operation on distributions — and the primary reason the theory exists — is differentiation. Every distribution has derivatives of all orders, with no regularity requirements whatsoever. The definition is motivated by the integration-by-parts identity from the Motivation section: if $f \in C^{|\alpha|}(\Omega)$ and $\varphi \in \mathcal{D}(\Omega)$, then $\int (\partial^\alpha f)\varphi \, d\mathcal{L}^n = (-1)^{|\alpha|} \int f \, \partial^\alpha\varphi \, d\mathcal{L}^n$. The right-hand side makes sense for any distribution $T$ in place of $T_f$: one simply applies $T$ to $\partial^\alpha \varphi$.

[definition: Distributional Derivative]
Let $T \in \mathcal{D}'(\Omega)$ and let $\alpha \in \mathbb{N}_0^n$ be a multi-index. The distributional derivative of $T$ of order $\alpha$ is the distribution $\partial^\alpha T \in \mathcal{D}'(\Omega)$ defined by
\begin{align*} (\partial^\alpha T)(\varphi) &:= (-1)^{|\alpha|} T(\partial^\alpha \varphi) \quad \text{for every } \varphi \in \mathcal{D}(\Omega). \end{align*}
[/definition]

Three things must be checked. First, $\partial^\alpha \varphi \in \mathcal{D}(\Omega)$ whenever $\varphi \in \mathcal{D}(\Omega)$, since differentiating a smooth compactly supported function yields a smooth function whose support is contained in that of $\varphi$. Second, $\partial^\alpha T$ is linear because $T$ is linear and $\partial^\alpha$ is linear. Third, $\partial^\alpha T$ is continuous: if $\varphi_k \to \varphi$ in $\mathcal{D}(\Omega)$ (with all supports in a compact set $K$ and all derivatives converging uniformly, by the Sequential Characterisation), then $\partial^\alpha \varphi_k \to \partial^\alpha \varphi$ in $\mathcal{D}(\Omega)$ (the supports remain in $K$, and $\partial^\beta(\partial^\alpha \varphi_k) = \partial^{\alpha+\beta}\varphi_k \to \partial^{\alpha+\beta}\varphi$ uniformly for every $\beta$). Continuity of $T$ then gives $T(\partial^\alpha \varphi_k) \to T(\partial^\alpha \varphi)$, hence $(\partial^\alpha T)(\varphi_k) \to (\partial^\alpha T)(\varphi)$.

The sign $(-1)^{|\alpha|}$ is chosen so that the distributional derivative of $T_f$ agrees with the classical derivative when $f \in C^{|\alpha|}(\Omega)$: the integration-by-parts identity gives $(\partial^\alpha T_f)(\varphi) = (-1)^{|\alpha|} T_f(\partial^\alpha \varphi) = (-1)^{|\alpha|} \int f \, \partial^\alpha \varphi \, d\mathcal{L}^n = \int (\partial^\alpha f) \varphi \, d\mathcal{L}^n = T_{\partial^\alpha f}(\varphi)$. Consistency with the weak derivative used in Sobolev space theory is proved on the Distributional Derivative page: if $f \in W^{|\alpha|,p}(\Omega)$ with weak derivative $v$, then $\partial^\alpha T_f = T_v$. The three notions — classical, weak, distributional — form a strict hierarchy, each more general than the last.

A key consequence is that the distributional derivative of any distribution is again a distribution, so every element of $\mathcal{D}'(\Omega)$ is infinitely differentiable in the distributional sense — in sharp contrast to classical analysis, where differentiability is a restrictive regularity condition.
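As a worked instance of the definition (an illustration using only the definitions above), take $T = \delta_{x_0}$. For any multi-index $\alpha$ and any $\varphi \in \mathcal{D}(\Omega)$,
\begin{align*} (\partial^\alpha \delta_{x_0})(\varphi) &= (-1)^{|\alpha|}\, \delta_{x_0}(\partial^\alpha \varphi) = (-1)^{|\alpha|}\, (\partial^\alpha \varphi)(x_0). \end{align*}
In one dimension with $x_0 = 0$ this reads $(\partial \delta_0)(\varphi) = -\varphi'(0)$ and $(\partial^2 \delta_0)(\varphi) = \varphi''(0)$. The estimate $|(\partial^\alpha \delta_{x_0})(\varphi)| \leq \sup_K |\partial^\alpha \varphi|$ for $\mathrm{supp}(\varphi) \subseteq K$ shows that $\partial^\alpha \delta_{x_0}$ has order at most $|\alpha|$.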

Computing Distributional Derivatives

[example: Derivative Of The Heaviside Function]
The Heaviside step function $H: \mathbb{R} \to \mathbb{R}$ defined by $H(x) = \mathbb{1}_{[0,\infty)}(x)$ is locally integrable and generates a regular distribution $T_H$. For any $\varphi \in \mathcal{D}(\mathbb{R})$:
\begin{align*} (\partial T_H)(\varphi) &= -T_H(\varphi') = -\int_0^\infty \varphi'(x) \, d\mathcal{L}^1(x) = -[\varphi(x)]_0^\infty = \varphi(0) = \delta_0(\varphi), \end{align*}
where the boundary term at $+\infty$ vanishes because $\varphi$ has compact support. Therefore $\partial T_H = \delta_0$: the distributional derivative of $T_H$ is the Dirac delta. The unit jump discontinuity at $x = 0$ produces a delta mass of weight $1$.

More generally, if $g: \mathbb{R} \to \mathbb{R}$ is piecewise $C^1$ with jump discontinuities $[g]_{x_k} := g(x_k^+) - g(x_k^-)$ at finitely many points $\{x_k\}$, then
\begin{align*} \partial T_g &= T_{g'} + \sum_k [g]_{x_k}\, \delta_{x_k}, \end{align*}
where $g'$ denotes the classical derivative on the complement of $\{x_k\}$ (which is locally integrable). Each jump contributes a delta mass whose weight equals the size of the jump.
[/example]
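The jump formula can be checked numerically for a single jump. The Python sketch below is an illustration under stated assumptions: the function $g(x) = \sin x + 1.5 \cdot \mathbb{1}_{[0.25, \infty)}(x)$, the standard bump test function, the midpoint quadrature, and all helper names are choices made here for the example. It compares $(\partial T_g)(\varphi) = -\int g \varphi'$ with $T_{g'}(\varphi) + [g]_{0.25}\, \varphi(0.25)$.

```python
import math

def bump(x):
    """Smooth test function supported in [-1, 1]."""
    u = 1.0 - x * x
    return math.exp(-1.0 / u) if u > 0.0 else 0.0

def bump_prime(x):
    """Classical derivative of bump, computed by the chain rule."""
    u = 1.0 - x * x
    return math.exp(-1.0 / u) * (-2.0 * x / (u * u)) if u > 0.0 else 0.0

JUMP_AT, JUMP_SIZE = 0.25, 1.5  # illustrative jump location and height

def g(x):
    """Piecewise C^1 function: sin(x) plus a jump of size 1.5 at x = 0.25."""
    return math.sin(x) + (JUMP_SIZE if x >= JUMP_AT else 0.0)

def g_prime(x):
    """Classical derivative of g away from the jump."""
    return math.cos(x)

def midpoint(f, a, b, n):
    """Composite midpoint rule for the integral of f over [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# 4000 cells on [-1, 1] put the jump at x = 0.25 on a cell boundary.
lhs = -midpoint(lambda x: g(x) * bump_prime(x), -1.0, 1.0, 4000)
rhs = (midpoint(lambda x: g_prime(x) * bump(x), -1.0, 1.0, 4000)
       + JUMP_SIZE * bump(JUMP_AT))
print(lhs, rhs)
```

Dropping the term `JUMP_SIZE * bump(JUMP_AT)` makes the two sides disagree, which is the numerical trace of the delta mass at the jump.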

[example: Distributional Laplacian Of The Newton Potential]
In dimension $n = 3$, the Newton potential $\Phi: \mathbb{R}^3 \setminus \{0\} \to \mathbb{R}$ defined by $\Phi(x) = (4\pi|x|)^{-1}$ extends to a locally integrable function on $\mathbb{R}^3$ (the singularity is integrable because $\int_0^1 r^{-1} \cdot r^2 \, d\mathcal{L}^1(r) < \infty$ in spherical coordinates). The function $\Phi$ is smooth and harmonic on $\mathbb{R}^3 \setminus \{0\}$: $\Delta \Phi(x) = 0$ for $x \neq 0$. Yet the distributional Laplacian of $T_\Phi$ detects the singularity at the origin. For $\varphi \in \mathcal{D}(\mathbb{R}^3)$, the full computation (carried out on the Distributional Derivative page using Green's identity on $\mathbb{R}^3 \setminus B(0, \varepsilon)$ and taking $\varepsilon \to 0$) gives $\Delta T_\Phi = -\delta_0$ in $\mathcal{D}'(\mathbb{R}^3)$. This is a special case of the general result for the fundamental solution of the Laplacian in all dimensions $n \geq 2$, stated as Distributional Laplacian of the Fundamental Solution. The distributional identity $-\Delta T_\Phi = \delta_0$ underlies the Poisson equation: the solution to $-\Delta u = f$ on $\mathbb{R}^3$ (for suitable $f$) is given by the convolution $u = \Phi * f$.
[/example]
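The identity $\Delta T_\Phi = -\delta_0$ can be sanity-checked by hand on radial test functions. This is a sketch only, not a substitute for the full proof: it assumes $\varphi(x) = \psi(|x|)$ with $\psi$ smooth and compactly supported on $[0, \infty)$. For such $\varphi$ one has $\Delta \varphi = \psi'' + \tfrac{2}{r}\psi'$ with $r = |x|$, and spherical coordinates give
\begin{align*} (\Delta T_\Phi)(\varphi) = T_\Phi(\Delta \varphi) &= \int_0^\infty \frac{1}{4\pi r}\left(\psi''(r) + \frac{2}{r}\,\psi'(r)\right) 4\pi r^2 \, d\mathcal{L}^1(r) \\ &= \int_0^\infty \left(r\,\psi''(r) + 2\,\psi'(r)\right) d\mathcal{L}^1(r) \\ &= \big[r\,\psi'(r)\big]_0^\infty + \int_0^\infty \psi'(r) \, d\mathcal{L}^1(r) = -\psi(0) = -\varphi(0), \end{align*}
where the third line integrates $r\psi''$ by parts and the boundary term vanishes at both ends. The result is $-\delta_0(\varphi)$, as claimed.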

Operations on Distributions

Multiplication by Smooth Functions

One can multiply a distribution by a smooth function, but the definition requires care: since a distribution is a functional on test functions (not a pointwise-defined object), multiplication must be defined by transferring the smooth factor to the test function.

[definition: Smooth Multiplication]
Let $T \in \mathcal{D}'(\Omega)$ and $\psi \in C^\infty(\Omega)$. The product $\psi T \in \mathcal{D}'(\Omega)$ is defined by
\begin{align*} (\psi T)(\varphi) &:= T(\psi \varphi) \quad \text{for every } \varphi \in \mathcal{D}(\Omega). \end{align*}
[/definition]

This is well-defined because $\psi \varphi \in \mathcal{D}(\Omega)$ whenever $\varphi \in \mathcal{D}(\Omega)$: the product of a smooth function with a compactly supported smooth function is smooth and has $\mathrm{supp}(\psi\varphi) \subseteq \mathrm{supp}(\varphi)$, which is compact in $\Omega$. Moreover, the map $\varphi \mapsto \psi\varphi$ is continuous on $\mathcal{D}(\Omega)$ (it preserves the common support and, by the classical Leibniz rule, uniform convergence of all derivatives), so $\psi T$ is a distribution. For regular distributions, $\psi T_f = T_{\psi f}$: the distributional product reduces to pointwise multiplication.
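As a worked instance of the definition, multiplying the Dirac delta by $\psi \in C^\infty(\Omega)$ gives
\begin{align*} (\psi\, \delta_{x_0})(\varphi) &= \delta_{x_0}(\psi \varphi) = \psi(x_0)\, \varphi(x_0), \end{align*}
so $\psi\, \delta_{x_0} = \psi(x_0)\, \delta_{x_0}$. In particular, on $\Omega = \mathbb{R}$ with $\psi(x) = x$ one gets $x\, \delta_0 = 0$: a nonzero distribution can be annihilated by a smooth function that is not identically zero.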

The product of two arbitrary distributions cannot be defined within the theory. The expression $\delta_0 \cdot \delta_0$, for instance, has no consistent meaning: to evaluate it on a test function $\varphi$, one would need to evaluate $\delta_0$ at the "function" $\delta_0 \cdot \varphi$, but $\delta_0 \cdot \varphi$ is not a test function (it is not even a function). The space $\mathcal{D}'(\Omega)$ is a module over the ring $C^\infty(\Omega)$, but it is not an algebra. This limitation is the source of the renormalisation problem in quantum field theory and the need for paraproduct decompositions in nonlinear PDE.

The Leibniz rule extends to distributional products.

[quotetheorem:452]

This is verified by a direct computation from the definitions of the distributional derivative and smooth multiplication, using the classical Leibniz rule $\psi \, \partial_j \varphi = \partial_j(\psi\varphi) - (\partial_j\psi)\varphi$ to rearrange the argument of $T$. The result is used routinely in localisation arguments: multiplying a distribution by a smooth cutoff function restricts its behaviour to a compact set, and the Leibniz rule controls the error introduced by the cutoff.
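The product rule can be checked in one dimension on $T = \delta_0$, assuming it takes the usual form $\partial(\psi T) = (\partial \psi)\, T + \psi\, \partial T$ (the theorem itself is quoted above and not reproduced here). Directly from the definitions,
\begin{align*} (\psi\, \partial\delta_0)(\varphi) &= (\partial\delta_0)(\psi\varphi) = -(\psi\varphi)'(0) = -\psi'(0)\, \varphi(0) - \psi(0)\, \varphi'(0), \end{align*}
so $\psi\, \partial\delta_0 = \psi(0)\, \partial\delta_0 - \psi'(0)\, \delta_0$. On the other hand, $\psi\, \delta_0 = \psi(0)\, \delta_0$ gives $\partial(\psi\, \delta_0) = \psi(0)\, \partial\delta_0$, and $(\partial\psi)\, \delta_0 = \psi'(0)\, \delta_0$, so both sides of the product rule equal $\psi(0)\, \partial\delta_0$.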

Support of a Distribution

To define the support of a distribution, one first specifies what it means for a distribution to vanish on an open set.

[definition: Vanishing On An Open Set]
Let $T \in \mathcal{D}'(\Omega)$ and let $U \subseteq \Omega$ be open. The distribution $T$ vanishes on $U$ if $T(\varphi) = 0$ for every $\varphi \in \mathcal{D}(\Omega)$ with $\mathrm{supp}(\varphi) \subseteq U$.
[/definition]

Before defining the support, one must know that the collection of all open sets on which $T$ vanishes is closed under arbitrary unions — otherwise the "largest open vanishing set" might not exist, and the support would not be well-defined.

[quotetheorem:453]

The proof is a partition-of-unity argument: a test function supported in the union $\bigcup U_i$ has compact support, which is covered by finitely many $U_i$; a smooth partition of unity subordinate to this finite cover decomposes $\varphi$ into pieces, each supported in some $U_i$, on which $T$ vanishes by hypothesis. With this result in hand, the support is well-defined.

[definition: Support Of A Distribution]
Let $T \in \mathcal{D}'(\Omega)$. The support of $T$, denoted $\mathrm{supp}(T)$, is the complement in $\Omega$ of the union of all open subsets of $\Omega$ on which $T$ vanishes.
[/definition]

By the Localisation of Vanishing, this union is itself an open set on which $T$ vanishes, so $\mathrm{supp}(T)$ is a well-defined (relatively) closed subset of $\Omega$. Unwinding the definition, $x_0 \in \mathrm{supp}(T)$ precisely when for every open neighbourhood $U \subseteq \Omega$ of $x_0$ there is some $\varphi \in \mathcal{D}(\Omega)$ with $\mathrm{supp}(\varphi) \subseteq U$ and $T(\varphi) \neq 0$; that is, $T$ cannot be "killed" by localising near $x_0$. For a regular distribution $T_f$ with $f \in L^1_\mathrm{loc}(\Omega)$, $\mathrm{supp}(T_f)$ coincides with the essential support of $f$: the smallest closed subset of $\Omega$ outside of which $f = 0$ almost everywhere. The Dirac delta $\delta_{x_0}$ has $\mathrm{supp}(\delta_{x_0}) = \{x_0\}$: it is a distribution concentrated at a single point.

Distributions Supported at a Point

A natural question is: what are all the distributions concentrated at a single point? The following structure theorem, due to Schwartz, gives a complete answer.

[quotetheorem:451]

The theorem asserts that the only distributions supported at a single point are finite linear combinations of the delta and its derivatives at that point — no other behaviour is possible. This is remarkable: without any a priori regularity assumption, the support condition alone forces $T$ to be a finite-order differential operator applied to $\delta_{x_0}$. The key step in the proof is to show that $T$ annihilates every test function that vanishes to sufficiently high order at $x_0$ (using the seminorm estimate from the Characterisation of Distributions together with a Taylor expansion and cutoff argument), so the action of $T$ on an arbitrary test function depends only on the finitely many Taylor coefficients of $\varphi$ at $x_0$.

The Hierarchy of Function and Distribution Spaces

The various spaces of functions and distributions on $\mathbb{R}^n$ are connected by a chain of continuous linear embeddings:
\begin{align*} \mathcal{D}(\mathbb{R}^n) \hookrightarrow \mathcal{S}(\mathbb{R}^n) \hookrightarrow L^p(\mathbb{R}^n) \xrightarrow{\; f \,\mapsto\, T_f \;} \mathcal{S}'(\mathbb{R}^n) \hookrightarrow \mathcal{D}'(\mathbb{R}^n), \end{align*}
where $1 \leq p \leq \infty$. The first two arrows are set-theoretic inclusions. The third is the canonical embedding $f \mapsto T_f$: linear, injective (by the Injectivity of the Canonical Embedding), and continuous, but not a set-theoretic inclusion — it sends an equivalence class of measurable functions to a continuous linear functional. The fourth is restriction of functionals from $\mathcal{S}(\mathbb{R}^n)$ to $\mathcal{D}(\mathbb{R}^n)$.

Moving left to right, the spaces grow: $\mathcal{D}(\mathbb{R}^n)$ (smooth, compactly supported test functions, dense in $L^p$ for $p < \infty$), the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ (smooth, rapidly decaying; the natural domain of the Fourier transform), $L^p(\mathbb{R}^n)$ (integrability without smoothness), $\mathcal{S}'(\mathbb{R}^n)$ (tempered distributions, on which the Fourier transform is defined by duality and acts as an automorphism), and $\mathcal{D}'(\mathbb{R}^n)$ (all distributions, including those of super-polynomial growth). The dual pairing reverses the inclusion order: as the test function spaces shrink ($\mathcal{D} \hookrightarrow \mathcal{S}$), fewer continuity conditions are imposed, so the distribution spaces grow ($\mathcal{S}' \hookrightarrow \mathcal{D}'$).

[example: A Distribution That Is Not Tempered]
The function $g: \mathbb{R} \to \mathbb{R}$ defined by $g(x) = e^{e^x}$ is locally integrable and generates a regular distribution $T_g \in \mathcal{D}'(\mathbb{R})$. However, $T_g \notin \mathcal{S}'(\mathbb{R})$: the functional $\varphi \mapsto \int_{\mathbb{R}} e^{e^x}\varphi(x) \, d\mathcal{L}^1(x)$ is continuous on $\mathcal{D}(\mathbb{R})$ (where the compact support of $\varphi$ controls the integral) but not on $\mathcal{S}(\mathbb{R})$ (where the rapid decay of $\varphi$ cannot compensate for the super-exponential growth of $g$). The Fourier transform of $T_g$ is therefore not defined — this is the obstruction that the temperedness condition is designed to exclude.
[/example]
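One way to make the failure on $\mathcal{S}(\mathbb{R})$ concrete is to test $T_g$ against truncated Gaussians. This is a sketch; the cutoffs are an auxiliary choice made here. Let $\chi \in \mathcal{D}(\mathbb{R})$ satisfy $0 \leq \chi \leq 1$ and $\chi = 1$ on $[-1, 1]$, and set $\varphi_k(x) := \chi(x/k)\, e^{-x^2}$. Then $\varphi_k \in \mathcal{D}(\mathbb{R})$ and $\varphi_k \to e^{-x^2}$ in $\mathcal{S}(\mathbb{R})$, but
\begin{align*} T_g(\varphi_k) &= \int_{\mathbb{R}} e^{e^x - x^2}\, \chi(x/k) \, d\mathcal{L}^1(x) \geq \int_0^k e^{e^x - x^2} \, d\mathcal{L}^1(x) \to \infty \quad \text{as } k \to \infty, \end{align*}
since $e^x - x^2 \to \infty$ as $x \to \infty$. A functional that is continuous on $\mathcal{S}(\mathbb{R})$ would map the convergent sequence $(\varphi_k)$ to a convergent sequence of numbers, so $T_g$ admits no continuous extension to $\mathcal{S}(\mathbb{R})$.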

On a general open set $\Omega \subsetneq \mathbb{R}^n$, the Schwartz space is not defined (rapid decay at infinity is meaningful only on all of $\mathbb{R}^n$), and the chain reduces to $\mathcal{D}(\Omega) \hookrightarrow L^p(\Omega) \xrightarrow{f \mapsto T_f} \mathcal{D}'(\Omega)$.

Application to PDEs: Distributional Solutions

The Concept of a Distributional Solution

The distributional framework provides the weakest notion of "solution" to a PDE: a distributional solution is a distribution that satisfies the equation when tested against all test functions. This is weaker than a weak solution in the Sobolev sense (which requires the solution to belong to some $W^{k,p}$ space) and allows genuinely singular solutions.

The definition uses the formal adjoint of a differential operator. Given a linear differential operator $L = \sum_{|\alpha| \leq m} a_\alpha \partial^\alpha$ with coefficients $a_\alpha \in C^\infty(\Omega)$, its formal adjoint is $L^* = \sum_{|\alpha| \leq m} (-1)^{|\alpha|} \partial^\alpha(a_\alpha \, \cdot\,)$. It is the operator produced by integration by parts: $\int_\Omega (Lu)\, \varphi \, d\mathcal{L}^n = \int_\Omega u\, (L^*\varphi) \, d\mathcal{L}^n$ for $u \in C^m(\Omega)$ and $\varphi \in \mathcal{D}(\Omega)$. Since each $a_\alpha$ is smooth, $L^*$ maps $\mathcal{D}(\Omega)$ into $\mathcal{D}(\Omega)$: if $\varphi \in \mathcal{D}(\Omega)$, then $L^*\varphi$ is smooth (by the classical Leibniz rule) and $\mathrm{supp}(L^*\varphi) \subseteq \mathrm{supp}(\varphi)$ (since $\partial^\alpha(a_\alpha \varphi)$ vanishes on the open set $\Omega \setminus \mathrm{supp}(\varphi)$).

[definition: Distributional Solution]
Let $L = \sum_{|\alpha| \leq m} a_\alpha \partial^\alpha$ be a linear differential operator with coefficients $a_\alpha \in C^\infty(\Omega)$, let $L^* = \sum_{|\alpha| \leq m} (-1)^{|\alpha|} \partial^\alpha(a_\alpha \, \cdot\,)$ be its formal adjoint, and let $S \in \mathcal{D}'(\Omega)$. A distribution $T \in \mathcal{D}'(\Omega)$ is a distributional solution of $LT = S$ if
\begin{align*} T(L^* \varphi) &= S(\varphi) \quad \text{for every } \varphi \in \mathcal{D}(\Omega). \end{align*}
[/definition]

For the Laplacian $L = -\Delta$, the formal adjoint is $L^* = -\Delta$ (since $\Delta$ is formally self-adjoint), so a distributional solution of $-\Delta T = S$ satisfies $T(-\Delta \varphi) = S(\varphi)$ for all $\varphi \in \mathcal{D}(\Omega)$. When $T = T_u$ and $S = T_f$ are both regular distributions, this reads $-\int u \, \Delta\varphi \, d\mathcal{L}^n = \int f\varphi \, d\mathcal{L}^n$. If in addition $u$ has a locally integrable weak gradient, one further integration by parts gives the standard weak formulation $\int \nabla u \cdot \nabla \varphi \, d\mathcal{L}^n = \int f\varphi \, d\mathcal{L}^n$.
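The definition can be tested numerically in one dimension, where $L = -\tfrac{d^2}{dx^2}$ on $\Omega = \mathbb{R}$ and $L^* = L$. The Python sketch below checks that $u(x) = -\tfrac{1}{2}|x|$ satisfies $T_u(L^*\varphi) = \delta_0(\varphi)$ for one test function, that is, that $T_u$ behaves as a distributional solution of $-u'' = \delta_0$. The candidate $u$, the shifted bump, the midpoint quadrature, and the helper names are assumptions made for this illustration; checking one $\varphi$ is evidence, not a proof.

```python
import math

CENTER, RADIUS = 0.3, 1.0  # illustrative test function, supported in [-0.7, 1.3]

def bump(x):
    """Smooth test function exp(-1/(1 - s^2)) for |s| < 1, zero otherwise."""
    s = (x - CENTER) / RADIUS
    u = 1.0 - s * s
    return math.exp(-1.0 / u) if u > 0.0 else 0.0

def bump_second(x):
    """Classical second derivative of bump (chain rule applied twice)."""
    s = (x - CENTER) / RADIUS
    u = 1.0 - s * s
    if u <= 0.0:
        return 0.0
    return math.exp(-1.0 / u) * (6.0 * s**4 - 2.0) / u**4 / RADIUS**2

def candidate(x):
    """Candidate solution u(x) = -|x|/2 of -u'' = delta_0."""
    return -0.5 * abs(x)

def midpoint(f, a, b, n):
    """Composite midpoint rule for the integral of f over [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# T_u(L* phi) with L* = -d^2/dx^2; the corner x = 0 sits on a cell boundary.
lhs = midpoint(lambda x: candidate(x) * (-bump_second(x)), -1.0, 2.0, 3000)
rhs = bump(0.0)  # delta_0(phi) = phi(0)
print(lhs, rhs)
```

The function $u$ is not twice differentiable at the origin, so it is not a classical solution; the agreement of the two printed values reflects the distributional identity only.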

Shock Waves as Distributional Solutions

The most physically compelling application of distributional solutions is to conservation laws with discontinuous solutions.

[example: Burgers Equation Shock Wave]
The inviscid Burgers equation in one dimension, written in conservation form, is
\begin{align*} u_t + \partial_x\!\left(\tfrac{1}{2}u^2\right) &= 0 \quad \text{on } \mathbb{R} \times (0, \infty). \end{align*}
Consider the initial data $u_0: \mathbb{R} \to \mathbb{R}$ defined by $u_0(x) = 1$ for $x < 0$ and $u_0(x) = 0$ for $x > 0$. No classical solution exists for $t > 0$: characteristics from the left carry the value $1$ at speed $1$, while characteristics from the right carry the value $0$ at speed $0$, and they collide immediately. A distributional solution exists as a travelling discontinuity. Define $u: \mathbb{R} \times (0, \infty) \to \mathbb{R}$ by $u(x,t) = 1$ for $x < t/2$ and $u(x,t) = 0$ for $x > t/2$. This is a shock wave propagating at speed $s = 1/2$, which equals the Rankine–Hugoniot speed $s = [u^2/2]/[u] = (1/2 - 0)/(1 - 0) = 1/2$.

Verification as a distributional solution. The function $u$ is locally integrable on $\mathbb{R} \times (0, \infty)$ and generates a regular distribution $T_u$. The conservation law $u_t + \partial_x(u^2/2) = 0$ holds in the distributional sense if $\partial_t T_u + \partial_x T_{u^2/2} = 0$ in $\mathcal{D}'(\mathbb{R} \times (0,\infty))$, which means
\begin{align*} \int_0^\infty \int_{-\infty}^\infty \left(u \, \varphi_t + \tfrac{1}{2}u^2 \, \varphi_x\right) d\mathcal{L}^1(x) \, d\mathcal{L}^1(t) &= 0 \end{align*}
for every $\varphi \in \mathcal{D}(\mathbb{R} \times (0, \infty))$. Split the integral at the shock line $x = t/2$. On the region $\{x < t/2\}$, $u = 1$ and $u^2/2 = 1/2$, giving $\int\!\!\int_{x < t/2} (\varphi_t + \tfrac{1}{2}\varphi_x) \, d\mathcal{L}^1(x) \, d\mathcal{L}^1(t)$. On $\{x > t/2\}$, $u = 0$ and both terms vanish. Integrating by parts on the region $\{x < t/2\}$ (where $u$ is constant, so the classical equation holds trivially), the volume integral vanishes and the boundary contribution along $x = t/2$ is
\begin{align*} \int_0^\infty \varphi(t/2, t) \left(-s \cdot [u] + [u^2/2]\right) d\mathcal{L}^1(t), \end{align*}
where $[u] = 1 - 0 = 1$ and $[u^2/2] = 1/2 - 0 = 1/2$ are the jumps across the shock. Since $s = 1/2$, the factor $-s[u] + [u^2/2] = -1/2 + 1/2 = 0$, and the integral vanishes for every $\varphi$. The Rankine–Hugoniot condition is precisely the condition that makes the distributional equation hold across the shock.
[/example]
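The role of the Rankine–Hugoniot speed can also be seen numerically. The Python sketch below evaluates the weak-form integral $\int\!\!\int (u\, \varphi_t + \tfrac{1}{2}u^2\, \varphi_x)$ for the step profile with an arbitrary candidate shock speed. Everything specific here is an assumption made for illustration: the product-bump test function centred at $(x, t) = (0.5, 1)$, the quadrature scheme, and the names `weak_residual`, `residual_rh` and `residual_wrong`.

```python
import math

X0, T0, RADIUS = 0.5, 1.0, 0.8  # illustrative: support in [-0.3, 1.3] x [0.2, 1.8]

def bump(s):
    """Standard bump in the rescaled variable s, supported in [-1, 1]."""
    u = 1.0 - s * s
    return math.exp(-1.0 / u) if u > 0.0 else 0.0

def bump_prime(s):
    """Derivative of bump with respect to s."""
    u = 1.0 - s * s
    return math.exp(-1.0 / u) * (-2.0 * s / (u * u)) if u > 0.0 else 0.0

def weak_residual(speed, n_t=400, n_x=400):
    """Integral of u*phi_t + (u^2/2)*phi_x for u = 1 left of x = speed*t, 0 right.

    phi(x, t) = bump((x - X0)/RADIUS) * bump((t - T0)/RADIUS) is the test function.
    Each row is integrated over [x_lo, speed*t] so no cell straddles the shock.
    """
    x_lo, x_hi = X0 - RADIUS, X0 + RADIUS
    t_lo, t_hi = T0 - RADIUS, T0 + RADIUS
    h_t = (t_hi - t_lo) / n_t
    total = 0.0
    for j in range(n_t):
        t = t_lo + (j + 0.5) * h_t
        upper = min(speed * t, x_hi)
        if upper <= x_lo:
            continue  # the region u = 1 misses the support on this row
        b = bump((t - T0) / RADIUS)
        b_t = bump_prime((t - T0) / RADIUS) / RADIUS
        h_x = (upper - x_lo) / n_x
        row = 0.0
        for i in range(n_x):
            x = x_lo + (i + 0.5) * h_x
            a = bump((x - X0) / RADIUS)
            a_x = bump_prime((x - X0) / RADIUS) / RADIUS
            row += a * b_t + 0.5 * a_x * b  # u = 1 and u^2/2 = 1/2 on this region
        total += row * h_x * h_t
    return total

residual_rh = weak_residual(0.5)     # Rankine-Hugoniot speed
residual_wrong = weak_residual(1.0)  # a wrong shock speed, for comparison
print(residual_rh, residual_wrong)
```

With speed $1/2$ the residual should vanish up to quadrature error, while the wrong speed leaves a residual proportional to $\tfrac{1}{2} - s$, in line with the boundary factor $-s[u] + [u^2/2]$ computed above.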

The Heat Kernel as a Distributional Initial-Value Solution

[example: Heat Kernel Distributional Solution]
The heat equation on $\mathbb{R}^n \times (0, \infty)$ is $u_t - \Delta u = 0$. The heat kernel is the function $K: \mathbb{R}^n \times (0, \infty) \to \mathbb{R}$ defined by
\begin{align*} K(x, t) &:= \frac{1}{(4\pi t)^{n/2}} \exp\!\left(-\frac{|x|^2}{4t}\right). \end{align*}
For each fixed $t > 0$, $K(\cdot, t) \in \mathcal{S}(\mathbb{R}^n)$ (it is a Gaussian, hence smooth and rapidly decaying). A direct computation confirms that $K$ satisfies $K_t - \Delta K = 0$ classically for all $t > 0$.

The distributional content lies in the initial condition. As $t \to 0^+$, the Gaussian $K(\cdot, t)$ concentrates at the origin: $\int_{\mathbb{R}^n} K(x, t) \, d\mathcal{L}^n(x) = 1$ for every $t > 0$ (by the standard Gaussian integral), while for each fixed $r > 0$ the mass outside $B(0, r)$ tends to $0$. For any $\varphi \in \mathcal{D}(\mathbb{R}^n)$, the substitution $y = x/\sqrt{4t}$ gives
\begin{align*} T_{K(\cdot,t)}(\varphi) &= \int_{\mathbb{R}^n} K(x, t) \, \varphi(x) \, d\mathcal{L}^n(x) = \int_{\mathbb{R}^n} \pi^{-n/2} e^{-|y|^2} \varphi(\sqrt{4t}\, y) \, d\mathcal{L}^n(y). \end{align*}
As $t \to 0^+$, the integrand converges pointwise to $\pi^{-n/2} e^{-|y|^2} \varphi(0)$, and is dominated by $\pi^{-n/2} e^{-|y|^2} \|\varphi\|_{L^\infty}$, which is integrable. By the dominated convergence theorem,
\begin{align*} T_{K(\cdot,t)}(\varphi) &\to \varphi(0) \int_{\mathbb{R}^n} \pi^{-n/2} e^{-|y|^2} \, d\mathcal{L}^n(y) = \varphi(0) = \delta_0(\varphi). \end{align*}
Therefore $T_{K(\cdot,t)} \to \delta_0$ in $\mathcal{D}'(\mathbb{R}^n)$ as $t \to 0^+$: the heat kernel is the fundamental solution of the heat equation, a classical solution for $t > 0$ whose initial data in the distributional sense is the Dirac delta. A solution to $u_t - \Delta u = 0$ with initial data $u(\cdot, 0) = f$ (for $f \in L^p(\mathbb{R}^n)$, $1 \leq p \leq \infty$) is then given by convolution: $u(x, t) = (K(\cdot, t) * f)(x) = \int_{\mathbb{R}^n} K(x - y, t)\, f(y) \, d\mathcal{L}^n(y)$.
[/example]
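The convergence $T_{K(\cdot,t)} \to \delta_0$ can be observed numerically in dimension $n = 1$. The Python sketch below pairs the heat kernel with one fixed test function for decreasing $t$. The shifted bump, the list of times, the midpoint quadrature, and the helper names are assumptions made for this illustration.

```python
import math

def bump(x, center=0.3, radius=1.0):
    """Smooth test function supported in [center - radius, center + radius]."""
    s = (x - center) / radius
    u = 1.0 - s * s
    return math.exp(-1.0 / u) if u > 0.0 else 0.0

def heat_kernel(x, t):
    """One-dimensional heat kernel K(x, t) = (4 pi t)^(-1/2) exp(-x^2 / (4 t))."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def pair_with_kernel(t, n=30000):
    """Midpoint approximation of the integral of K(x, t) * phi(x) over [-1, 2]."""
    a, b = -1.0, 2.0  # contains the support of the bump
    h = (b - a) / n
    return h * sum(heat_kernel(a + (i + 0.5) * h, t) * bump(a + (i + 0.5) * h)
                   for i in range(n))

times = [1e-1, 1e-2, 1e-3, 1e-4]
pairings = [pair_with_kernel(t) for t in times]
target = bump(0.0)  # delta_0(phi) = phi(0)
for t, value in zip(times, pairings):
    print(t, value, abs(value - target))
```

The last column should decrease roughly in proportion to $t$, consistent with the expansion $\int K(x,t)\,\varphi(x)\,dx = \varphi(0) + t\,\varphi''(0) + O(t^2)$ for smooth $\varphi$.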

References

  1. L. Schwartz, Théorie des Distributions, 2nd ed. (1966).
  2. L. Hörmander, The Analysis of Linear Partial Differential Operators I (1983).
  3. L. C. Evans, Partial Differential Equations (1998).
  4. W. Rudin, Functional Analysis (1991).
  5. F. G. Friedlander and M. Joshi, Introduction to the Theory of Distributions, 2nd ed. (1998).


Last Modified: 2/27/2026