Classical [differentiation](/page/Derivative) requires a function to be locally linear: the difference quotient $(f(x+h) - f(x))/h$ must converge as $h \to 0$. This fails for any discontinuous function and for many [functions](/page/Function) that arise naturally in PDE theory — solutions to elliptic equations in rough media, velocity fields with shocks, Green's functions which are singular at a point. Yet these objects do possess meaningful "derivatives" in a weaker sense, and a complete theory of PDEs requires being able to differentiate them.
The key observation is that, for a smooth function $f$ and a test function $\varphi \in C_c^\infty(\Omega)$, the integration-by-parts formula:
\begin{align*}
\int_\Omega (\partial^\alpha f)(x)\varphi(x)\,d\mathcal{L}^n(x) &= (-1)^{|\alpha|}\int_\Omega f(x)(\partial^\alpha\varphi)(x)\,d\mathcal{L}^n(x)
\end{align*}
holds with no [boundary](/page/Boundary) terms, since $\varphi$ has compact support in $\Omega$. The right-hand side remains well-defined even if $f$ is merely locally [integrable](/page/Integral): we never differentiate $f$, only the smooth test function $\varphi$. This suggests defining the derivative of $f$ as the functional that acts on $\varphi$ and produces the value $(-1)^{|\alpha|}\int f\,\partial^\alpha\varphi\,d\mathcal{L}^n$.
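The identity can be checked numerically for $|\alpha| = 1$. The following is a minimal Python sketch, assuming $f = \sin$, the standard bump function on $\Omega = (-1,1)$ as the test function, and a composite midpoint rule. The helper names `bump`, `dbump`, and `integrate` are illustrative choices and do not come from the text above.

```python
import math

def bump(x):
    """Standard bump function: smooth, supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def dbump(x):
    """Classical derivative of bump, computed by the chain rule."""
    b = bump(x)
    if b == 0.0:          # outside the support, or underflow near the edge
        return 0.0
    u = 1.0 - x * x
    return b * (-2.0 * x / (u * u))

def integrate(g, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Left side: integral of f' * phi with f = sin, so f' = cos.
lhs = integrate(lambda x: math.cos(x) * bump(x), -1.0, 1.0)
# Right side: (-1)^1 times the integral of f * phi'.
rhs = -integrate(lambda x: math.sin(x) * dbump(x), -1.0, 1.0)

print(lhs, rhs)
```

The two printed values should agree up to quadrature error. Only the right-hand side would still make sense if `math.sin` were replaced by a merely locally integrable function.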
## [Test Functions](/page/Test%20Function) and Distributions
[definition: Test Function Space]
Let $\Omega \subseteq \mathbb{R}^n$ be a non-empty [open set](/page/Open%20Set). The **space of test functions** on $\Omega$, denoted $\mathcal{D}(\Omega)$, is the [set](/page/Set) of all infinitely differentiable functions with compact support contained in $\Omega$:
\begin{align*}
\mathcal{D}(\Omega) := \left\{\varphi \in C^\infty(\Omega) \;\middle|\; \mathrm{supp}(\varphi) \subset \Omega \text{ is compact}\right\}.
\end{align*}
[/definition]
[definition: Convergence in the Test Function Space]
A [sequence](/page/Sequence) $\{\varphi_k\}_{k=1}^\infty \subseteq \mathcal{D}(\Omega)$ **converges** to $\varphi \in \mathcal{D}(\Omega)$ if there exists a compact set $K \subset \Omega$ containing $\mathrm{supp}(\varphi_k)$ for every $k \in \mathbb{N}$, and $\partial^\alpha\varphi_k \to \partial^\alpha\varphi$ uniformly on $K$ for every multi-index $\alpha \in \mathbb{N}_0^n$.
[/definition]
[definition: Distribution]
A **distribution** on $\Omega$ is a continuous linear functional on $\mathcal{D}(\Omega)$. Explicitly, a map $T: \mathcal{D}(\Omega) \to \mathbb{R}$ is a distribution if it is linear and $T(\varphi_k) \to T(\varphi)$ whenever $\varphi_k \to \varphi$ in $\mathcal{D}(\Omega)$.
The **space of distributions** on $\Omega$, denoted $\mathcal{D}'(\Omega)$, is the set of all distributions on $\Omega$, equipped with the weak-$*$ [topology](/page/Topology): $T_k \to T$ in $\mathcal{D}'(\Omega)$ if $T_k(\varphi) \to T(\varphi)$ for every $\varphi \in \mathcal{D}(\Omega)$.
[/definition]
A locally integrable function $f \in L^1_{\mathrm{loc}}(\Omega)$ defines a distribution $T_f \in \mathcal{D}'(\Omega)$ via:
\begin{align*}
T_f: \mathcal{D}(\Omega) &\to \mathbb{R} \\
\varphi &\mapsto \int_\Omega f(x)\varphi(x)\,d\mathcal{L}^n(x).
\end{align*}
Distributions of this form are called **regular distributions**. The map $f \mapsto T_f$ is injective: if $T_f = T_g$, then $\int(f-g)\varphi\,d\mathcal{L}^n = 0$ for all $\varphi \in \mathcal{D}(\Omega)$, which forces $f = g$ almost everywhere. Distributions that cannot be represented by any locally integrable function — such as the Dirac delta $\delta_0(\varphi) = \varphi(0)$ — are called **singular distributions**.
When $\Omega = \mathbb{R}^n$, distributions that additionally extend continuously to the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ of rapidly decreasing smooth functions form the subspace $\mathcal{S}'(\mathbb{R}^n) \hookrightarrow \mathcal{D}'(\mathbb{R}^n)$ of **[tempered distributions](/page/Tempered%20Distributions)**, defined on the [Tempered Distributions](/pages/1053) page. Tempered distributions are the natural setting when one also requires the [Fourier transform](/page/Fourier%20Transform) to be well-defined.
## The Distributional Derivative
[definition: Distributional Derivative]
Let $\Omega \subseteq \mathbb{R}^n$ be a non-empty open set, let $T \in \mathcal{D}'(\Omega)$, and let $\alpha \in \mathbb{N}_0^n$ be a multi-index. The **distributional derivative** of $T$ of order $\alpha$ is the distribution $\partial^\alpha T \in \mathcal{D}'(\Omega)$ defined by:
\begin{align*}
\partial^\alpha T: \mathcal{D}(\Omega) &\to \mathbb{R} \\
\varphi &\mapsto (-1)^{|\alpha|}T(\partial^\alpha\varphi).
\end{align*}
[/definition]
The map $\varphi \mapsto \partial^\alpha\varphi$ is continuous on $\mathcal{D}(\Omega)$ — since differentiating a sequence converging in $\mathcal{D}$ produces a convergent sequence — so $\partial^\alpha T = T \circ ((-1)^{|\alpha|}\partial^\alpha)$ is a composition of continuous [linear maps](/page/Linear%20Map) and is therefore a distribution. Every distribution is infinitely differentiable in this sense: the distributional derivative of any order of any distribution is again a distribution.
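The definition translates almost literally into code if a distribution on $\mathbb{R}$ is modelled as a callable acting on test functions. The following is a rough Python sketch under that assumption. The names `diff`, `delta0`, and `derivative` are hypothetical, and the classical derivative $\varphi'$ is only approximated by a central difference, so the results are approximate.

```python
import math

def bump(x):
    """Shifted standard bump: smooth, supported in [-0.7, 1.3]."""
    y = x - 0.3
    if abs(y) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - y * y))

def diff(phi, h=1e-4):
    """Central-difference approximation of phi' (illustrative, not exact)."""
    return lambda x: (phi(x + h) - phi(x - h)) / (2.0 * h)

def delta0(phi):
    """Dirac delta at the origin: phi -> phi(0)."""
    return phi(0.0)

def derivative(T):
    """Distributional derivative in one variable: (dT)(phi) = -T(phi')."""
    return lambda phi: -T(diff(phi))

d_delta = derivative(delta0)      # phi -> -phi'(0)
d2_delta = derivative(d_delta)    # phi -> +phi''(0)

print(d_delta(bump), d2_delta(bump))
```

Applying `derivative` twice reproduces the sign $(-1)^{|\alpha|} = +1$ for $|\alpha| = 2$, and nothing about the functional `T` itself needs to be differentiable.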
[theorem: Consistency With Classical and Weak Derivatives]
Let $\Omega \subseteq \mathbb{R}^n$ be a non-empty open set, let $f \in L^1_{\mathrm{loc}}(\Omega)$, and let $T_f \in \mathcal{D}'(\Omega)$ be the associated regular distribution. Let $\alpha \in \mathbb{N}_0^n$.
If $f \in C^{|\alpha|}(\Omega)$, then the distributional derivative $\partial^\alpha T_f$ coincides with the [regular distribution](/page/Regular%20Distribution) generated by the classical derivative $\partial^\alpha f$: $\partial^\alpha T_f = T_{\partial^\alpha f}$.
If $f \in W^{|\alpha|,p}(\Omega)$ for some $p \in [1,\infty]$, and $v \in L^p(\Omega)$ is the [weak derivative](/page/Weak%20Derivative) of $f$ of order $\alpha$, then $\partial^\alpha T_f = T_v$.
[/theorem]
[proof]
**Step 1: Classical case.** Let $f \in C^{|\alpha|}(\Omega)$ and $\varphi \in \mathcal{D}(\Omega)$. Applying the classical integration-by-parts formula $|\alpha|$ times (boundary terms vanish since $\varphi$ has compact support in $\Omega$):
\begin{align*}
(\partial^\alpha T_f)(\varphi) &= (-1)^{|\alpha|}T_f(\partial^\alpha\varphi) = (-1)^{|\alpha|}\int_\Omega f(x)(\partial^\alpha\varphi)(x)\,d\mathcal{L}^n(x) \\
&= \int_\Omega (\partial^\alpha f)(x)\varphi(x)\,d\mathcal{L}^n(x) = T_{\partial^\alpha f}(\varphi).
\end{align*}
Since this holds for every $\varphi \in \mathcal{D}(\Omega)$, we have $\partial^\alpha T_f = T_{\partial^\alpha f}$.
**Step 2: Weak derivative case.** By definition, $v \in L^p(\Omega)$ is the weak derivative of $f$ of order $\alpha$ if:
\begin{align*}
\int_\Omega v(x)\varphi(x)\,d\mathcal{L}^n(x) &= (-1)^{|\alpha|}\int_\Omega f(x)(\partial^\alpha\varphi)(x)\,d\mathcal{L}^n(x) \quad \text{for every } \varphi \in \mathcal{D}(\Omega).
\end{align*}
Comparing with the definition of $\partial^\alpha T_f$:
\begin{align*}
(\partial^\alpha T_f)(\varphi) = (-1)^{|\alpha|}T_f(\partial^\alpha\varphi) = (-1)^{|\alpha|}\int_\Omega f\,\partial^\alpha\varphi\,d\mathcal{L}^n = \int_\Omega v\,\varphi\,d\mathcal{L}^n = T_v(\varphi).
\end{align*}
Since this holds for every $\varphi \in \mathcal{D}(\Omega)$, we have $\partial^\alpha T_f = T_v$.
[/proof]
## Examples
[example: Derivative of the Heaviside Function]
The Heaviside step function illustrates how a classical discontinuity has a perfectly well-defined distributional derivative. Define:
\begin{align*}
H: \mathbb{R} &\to \mathbb{R} \\
x &\mapsto \begin{cases} 1 & \text{if } x \ge 0, \\ 0 & \text{if } x < 0. \end{cases}
\end{align*}
Since $H \in L^1_{\mathrm{loc}}(\mathbb{R})$, it defines a regular distribution $T_H$. For any $\varphi \in \mathcal{D}(\mathbb{R})$:
\begin{align*}
(\partial T_H)(\varphi) &= -T_H(\partial\varphi) = -\int_\mathbb{R} H(x)\varphi'(x)\,d\mathcal{L}^1(x) = -\int_0^\infty \varphi'(x)\,d\mathcal{L}^1(x).
\end{align*}
By the [fundamental theorem of calculus](/theorems/632) and compact support of $\varphi$:
\begin{align*}
-\int_0^\infty \varphi'(x)\,d\mathcal{L}^1(x) &= -\lim_{R\to\infty}\varphi(R) + \varphi(0) = \varphi(0) = \delta_0(\varphi).
\end{align*}
Thus $\partial T_H = \delta_0$: the derivative of the Heaviside function is the Dirac delta. The jump discontinuity at $0$ of magnitude $1$ produces a delta mass of weight $1$.
More generally, if $f$ is a piecewise $C^1$ function on $\mathbb{R}$ with jump discontinuity $[f]_{x_0} = f(x_0^+) - f(x_0^-)$ at each point $x_0$ in a discrete set $S$, then:
\begin{align*}
\partial T_f &= T_{f'} + \sum_{x_0 \in S}[f]_{x_0}\delta_{x_0},
\end{align*}
where $f'$ denotes the classical derivative on $\mathbb{R} \setminus S$.
[/example]
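Both the Heaviside computation and the jump formula can be sanity-checked by quadrature. The following is a minimal Python sketch, assuming the standard bump function as $\varphi$ and, for the jump formula, the hypothetical example $f(x) = \sin x + 3H(x - 0.25)$ with a single jump of size $3$ at $x_0 = 0.25$. The helper names are illustrative.

```python
import math

def bump(x):
    """Standard bump function: smooth, supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def dbump(x):
    """Classical derivative of bump."""
    b = bump(x)
    if b == 0.0:
        return 0.0
    u = 1.0 - x * x
    return b * (-2.0 * x / (u * u))

def integrate(g, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Heaviside: (dT_H)(phi) = -(integral of phi' over [0, 1]); phi vanishes beyond 1.
dTH = -integrate(dbump, 0.0, 1.0)

# Jump formula for f = sin + jump * H(. - c); the integral is split at c.
c, jump = 0.25, 3.0
f = lambda x: math.sin(x) + (jump if x >= c else 0.0)
lhs = -(integrate(lambda x: f(x) * dbump(x), -1.0, c)
        + integrate(lambda x: f(x) * dbump(x), c, 1.0))   # (dT_f)(phi)
rhs = (integrate(lambda x: math.cos(x) * bump(x), -1.0, 1.0)
       + jump * bump(c))                                  # T_{f'}(phi) + [f]_c * phi(c)

print(dTH, bump(0.0))
print(lhs, rhs)
```

Each printed pair should agree up to quadrature error. Splitting the integral at the jump keeps the midpoint rule from straddling the discontinuity.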
[example: Second Distributional Derivative of the Absolute Value]
Let $f(x) = |x|$ on $\mathbb{R}$. This function is continuous but not differentiable at $0$. Its first distributional derivative is the sign function: $\partial T_f = T_{\mathrm{sgn}}$, where $\mathrm{sgn}(x) = \mathbb{1}_{(0,\infty)}(x) - \mathbb{1}_{(-\infty,0)}(x)$. Indeed, $|x|$ is piecewise $C^1$ with classical derivative $\mathrm{sgn}(x)$ on $\mathbb{R} \setminus \{0\}$ and has no jump at $0$, so the jump formula from the Heaviside example applies with no delta terms.
The sign function can be written as $\mathrm{sgn} = 2H - 1$ almost everywhere, where $H$ is the Heaviside function. Since the constant function $1$ has zero distributional derivative, linearity and the identity $\partial T_H = \delta_0$ from the Heaviside example give:
\begin{align*}
\partial^2 T_f &= \partial T_{\mathrm{sgn}} = \partial(2T_H - T_1) = 2\partial T_H - 0 = 2\delta_0.
\end{align*}
The second distributional derivative of $|x|$ is a delta mass of weight $2$ at the origin. This can be verified directly: for $\varphi \in \mathcal{D}(\mathbb{R})$,
\begin{align*}
(\partial^2 T_f)(\varphi) &= T_f(\varphi'') = \int_{-\infty}^\infty |x|\varphi''(x)\,d\mathcal{L}^1(x).
\end{align*}
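Integrating by parts once on each half-line (the boundary terms vanish because $|x|\varphi'(x)$ is zero at $x = 0$ and $\varphi$ has compact support), and then applying the fundamental theorem of calculus: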
\begin{align*}
\int_{-\infty}^\infty |x|\varphi''(x)\,d\mathcal{L}^1(x) &= -\int_{-\infty}^0(-1)\varphi'(x)\,d\mathcal{L}^1(x) - \int_0^\infty 1\cdot\varphi'(x)\,d\mathcal{L}^1(x) \\
&= \int_{-\infty}^0\varphi'(x)\,d\mathcal{L}^1(x) - \int_0^\infty\varphi'(x)\,d\mathcal{L}^1(x) \\
&= \varphi(0) - 0 - (0 - \varphi(0)) = 2\varphi(0) = 2\delta_0(\varphi).
\end{align*}
[/example]
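The identity $\int |x|\varphi''\,d\mathcal{L}^1 = 2\varphi(0)$ can also be checked numerically. The following is a minimal Python sketch, assuming the standard bump function as $\varphi$, with its second derivative written out by hand via the chain rule. The helper names `d2bump` and `integrate` are illustrative.

```python
import math

def bump(x):
    """Standard bump function exp(-1/(1 - x^2)), supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def d2bump(x):
    """Second classical derivative: bump * (g'^2 + g''), with g = -1/(1 - x^2)."""
    b = bump(x)
    if b == 0.0:          # outside the support, or underflow near the edge
        return 0.0
    u = 1.0 - x * x
    g1 = -2.0 * x / (u * u)
    g2 = -2.0 / (u * u) - 8.0 * x * x / u**3
    return b * (g1 * g1 + g2)

def integrate(g, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# (d^2 T_f)(phi) = T_f(phi'') for f = |x|, split at the kink x = 0.
value = (integrate(lambda x: abs(x) * d2bump(x), -1.0, 0.0)
         + integrate(lambda x: abs(x) * d2bump(x), 0.0, 1.0))

print(value, 2.0 * bump(0.0))
```

The two printed numbers should agree up to quadrature error, matching the weight $2$ of the delta mass.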
[example: Green's Function and the Distributional Laplacian]
The Newton potential in dimension $n = 3$:
\begin{align*}
\Phi: \mathbb{R}^3 \setminus \{0\} &\to \mathbb{R} \\
x &\mapsto \frac{1}{4\pi|x|}
\end{align*}
is locally integrable (since $|x|^{-1} \in L^1_{\mathrm{loc}}(\mathbb{R}^3)$ — the singularity at the origin is integrable in $\mathbb{R}^3$ because $\int_0^1 r^{-1} r^2\,dr = \int_0^1 r\,dr < \infty$), so $\Phi$ defines a regular distribution $T_\Phi \in \mathcal{D}'(\mathbb{R}^3)$. Away from the origin $\Phi$ is smooth and harmonic, so $\Delta\Phi = 0$ classically on $\mathbb{R}^3 \setminus \{0\}$. Yet the [distributional](/page/Distribution) Laplacian is non-trivial at the origin. For $\varphi \in \mathcal{D}(\mathbb{R}^3)$:
\begin{align*}
(\Delta T_\Phi)(\varphi) &= T_\Phi(\Delta\varphi) = \int_{\mathbb{R}^3}\Phi(x)\Delta\varphi(x)\,d\mathcal{L}^3(x).
\end{align*}
Split the integral at the ball $B_\varepsilon := B(0,\varepsilon)$ and apply Green's second identity on $\mathbb{R}^3 \setminus B_\varepsilon$ (where $\Phi$ is smooth and $\Delta\Phi = 0$):
\begin{align*}
\int_{\mathbb{R}^3 \setminus B_\varepsilon}\Phi\Delta\varphi\,d\mathcal{L}^3 &= \int_{\mathbb{R}^3 \setminus B_\varepsilon}(\Delta\Phi)\varphi\,d\mathcal{L}^3 + \int_{\partial B_\varepsilon}\left(\Phi\frac{\partial\varphi}{\partial\nu} - \varphi\frac{\partial\Phi}{\partial\nu}\right)d\mathcal{H}^2 \\
&= \int_{\partial B_\varepsilon}\left(\frac{1}{4\pi\varepsilon}\frac{\partial\varphi}{\partial\nu} - \varphi\frac{\partial\Phi}{\partial\nu}\right)d\mathcal{H}^2,
\end{align*}
where $\nu$ is the unit normal pointing out of $\mathbb{R}^3 \setminus B_\varepsilon$, that is, toward the origin on $\partial B_\varepsilon$, so that $\partial/\partial\nu = -\partial/\partial r$ there. Since $|\partial\varphi/\partial\nu| \le \|\nabla\varphi\|_\infty$ and $\mathcal{H}^2(\partial B_\varepsilon) = 4\pi\varepsilon^2$, the first boundary term is $O(\varepsilon)$ and vanishes as $\varepsilon \to 0$. For the second, $\partial\Phi/\partial r = -1/(4\pi r^2)$ gives $\partial\Phi/\partial\nu = 1/(4\pi\varepsilon^2)$ on $\partial B_\varepsilon$, so:
\begin{align*}
-\int_{\partial B_\varepsilon}\varphi\,\frac{\partial\Phi}{\partial\nu}\,d\mathcal{H}^2 = -\frac{1}{4\pi\varepsilon^2}\int_{\partial B_\varepsilon}\varphi\,d\mathcal{H}^2 \to -\varphi(0)
\end{align*}
by [continuity](/page/Continuity) of $\varphi$. The contribution from $B_\varepsilon$ itself vanishes as $\varepsilon \to 0$ since $\Phi \in L^1_{\mathrm{loc}}$ and $\Delta\varphi$ is bounded. Therefore $(\Delta T_\Phi)(\varphi) = -\varphi(0) = -\delta_0(\varphi)$ for every $\varphi \in \mathcal{D}(\mathbb{R}^3)$, giving $\Delta T_\Phi = -\delta_0$, or equivalently $-\Delta T_\Phi = \delta_0$. This is the distributional identity underlying the fact that $\Phi$ is the fundamental solution of $-\Delta$.
[/example]
## Problems
[problem]
Let $f_k: \mathbb{R} \to \mathbb{R}$ be defined for each $k \in \mathbb{N}$ by:
\begin{align*}
f_k: \mathbb{R} &\to \mathbb{R} \\
x &\mapsto \frac{k}{2}\mathbb{1}_{[-1/k,\,1/k]}(x).
\end{align*}
1. Compute $\int_\mathbb{R} f_k(x)\,d\mathcal{L}^1(x)$ for each $k$.
2. Show that $T_{f_k} \to \delta_0$ in $\mathcal{D}'(\mathbb{R})$ as $k \to \infty$.
3. Compute the distributional derivative $\partial T_{f_k}$ and describe its [limit](/page/Limit) in $\mathcal{D}'(\mathbb{R})$.
[/problem]
[solution]
**Part 1.** The integral is:
\begin{align*}
\int_\mathbb{R} f_k(x)\,d\mathcal{L}^1(x) &= \frac{k}{2}\int_{-1/k}^{1/k}d\mathcal{L}^1(x) = \frac{k}{2}\cdot\frac{2}{k} = 1 \quad \text{for every } k \in \mathbb{N}.
\end{align*}
**Part 2.** For any $\varphi \in \mathcal{D}(\mathbb{R})$, the definition of $T_{f_k}$ gives:
\begin{align*}
T_{f_k}(\varphi) &= \int_\mathbb{R} f_k(x)\varphi(x)\,d\mathcal{L}^1(x) = \frac{k}{2}\int_{-1/k}^{1/k}\varphi(x)\,d\mathcal{L}^1(x).
\end{align*}
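Since $\varphi$ is continuous at $0$, for any $\varepsilon > 0$ there exists $\delta > 0$ such that $|\varphi(x) - \varphi(0)| < \varepsilon$ whenever $|x| < \delta$. For $k > 1/\delta$, every $x \in [-1/k,1/k]$ satisfies $|x| \le 1/k < \delta$. Using $\int_\mathbb{R} f_k\,d\mathcal{L}^1 = 1$ from Part 1 to write $\varphi(0) = \frac{k}{2}\int_{-1/k}^{1/k}\varphi(0)\,d\mathcal{L}^1(x)$, we obtain: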
\begin{align*}
\left|T_{f_k}(\varphi) - \varphi(0)\right| &= \left|\frac{k}{2}\int_{-1/k}^{1/k}(\varphi(x) - \varphi(0))\,d\mathcal{L}^1(x)\right| \le \frac{k}{2}\cdot\frac{2}{k}\cdot\varepsilon = \varepsilon.
\end{align*}
Since $\varepsilon$ was arbitrary and $\delta_0(\varphi) = \varphi(0)$, we have $T_{f_k}(\varphi) \to \delta_0(\varphi)$ for every $\varphi \in \mathcal{D}(\mathbb{R})$, giving $T_{f_k} \to \delta_0$ in $\mathcal{D}'(\mathbb{R})$.
**Part 3.** The function $f_k$ is the indicator function of $[-1/k,\,1/k]$ scaled by $k/2$, so it jumps from $0$ to $k/2$ at $x = -1/k$ (a jump of $+k/2$) and from $k/2$ to $0$ at $x = 1/k$ (a jump of $-k/2$). By the jump formula:
\begin{align*}
\partial T_{f_k} &= T_{f_k'} + \frac{k}{2}\delta_{-1/k} - \frac{k}{2}\delta_{1/k}.
\end{align*}
Since $f_k' = 0$ almost everywhere on $\mathbb{R}$ (the function is constant on each piece), $T_{f_k'} = 0$, and:
\begin{align*}
\partial T_{f_k} &= \frac{k}{2}\delta_{-1/k} - \frac{k}{2}\delta_{1/k}.
\end{align*}
For any $\varphi \in \mathcal{D}(\mathbb{R})$:
\begin{align*}
(\partial T_{f_k})(\varphi) &= \frac{k}{2}\varphi(-1/k) - \frac{k}{2}\varphi(1/k) = \frac{k}{2}\left(\varphi(-1/k) - \varphi(1/k)\right).
\end{align*}
Since $\varphi$ is smooth, Taylor's theorem gives $\varphi(\pm 1/k) = \varphi(0) \pm \varphi'(0)/k + \varphi''(0)/(2k^2) + O(k^{-3})$, so $\varphi(-1/k) - \varphi(1/k) = -2\varphi'(0)/k + O(k^{-3})$, and therefore:
\begin{align*}
(\partial T_{f_k})(\varphi) \to -\varphi'(0) = -\delta_0(\varphi') = (\partial\delta_0)(\varphi).
\end{align*}
Hence $\partial T_{f_k} \to \partial\delta_0$ in $\mathcal{D}'(\mathbb{R})$. This agrees with differentiating the limit from Part 2, because $\partial: \mathcal{D}'(\mathbb{R}) \to \mathcal{D}'(\mathbb{R})$ is sequentially continuous: if $T_k \to T$ in $\mathcal{D}'(\mathbb{R})$, then $(\partial T_k)(\varphi) = -T_k(\varphi') \to -T(\varphi') = (\partial T)(\varphi)$ for every $\varphi \in \mathcal{D}(\mathbb{R})$.
[/solution]
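The limits in Parts 2 and 3 can be observed numerically. The following is a rough Python sketch, assuming a shifted bump function as the test function (chosen so that $\varphi'(0) \neq 0$) and a midpoint rule for the integral. The names `phi`, `dphi`, `T_fk`, and `dT_fk` are illustrative.

```python
import math

def phi(x):
    """Shifted standard bump, supported in [-0.7, 1.3], so phi'(0) != 0."""
    y = x - 0.3
    if abs(y) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - y * y))

def dphi(x):
    """Classical derivative of phi."""
    p = phi(x)
    if p == 0.0:
        return 0.0
    y = x - 0.3
    u = 1.0 - y * y
    return p * (-2.0 * y / (u * u))

def T_fk(k, n=2000):
    """T_{f_k}(phi) = (k/2) * integral of phi over [-1/k, 1/k], midpoint rule."""
    a, b = -1.0 / k, 1.0 / k
    h = (b - a) / n
    return 0.5 * k * h * sum(phi(a + (i + 0.5) * h) for i in range(n))

def dT_fk(k):
    """(dT_{f_k})(phi) = (k/2) * (phi(-1/k) - phi(1/k)), from the jump formula."""
    return 0.5 * k * (phi(-1.0 / k) - phi(1.0 / k))

for k in (1, 10, 100, 1000):
    # Both columns of differences should shrink toward 0 as k grows.
    print(k, T_fk(k) - phi(0.0), dT_fk(k) + dphi(0.0))
```

The first difference measures $T_{f_k}(\varphi) - \delta_0(\varphi)$ and the second measures $(\partial T_{f_k})(\varphi) - (\partial\delta_0)(\varphi)$, since $(\partial\delta_0)(\varphi) = -\varphi'(0)$.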
## References
1. L. C. Evans, *Partial Differential Equations* (1998).
2. L. Grafakos, *Classical Fourier Analysis* (2014).
3. L. Hörmander, *The Analysis of Linear Partial Differential Operators I* (1983).
4. W. Rudin, *Functional Analysis* (1991).