Convolution
Convolution is the operation that "smears" one function against another, producing a weighted average that inherits the best properties of both factors. It is the algebraic operation underlying three pillars of analysis: $L^p$ theory (where Young's inequality controls its integrability), Fourier analysis (where the Convolution Theorem converts it to pointwise multiplication), and PDE (where solutions are built by convolving with fundamental solutions). This page develops the operation itself; for its role in regularisation see the mollifier page, and for its interaction with the Fourier transform see the Fourier Transform page.
[motivation]
Motivation
What Convolution Does
Given two functions $f$ and $g$ on $\mathbb{R}^n$, the convolution $(f * g)(x) = \int f(x - y)g(y) \, d\mathcal{L}^n(y)$ evaluates $f$ at all points near $x$, weighted by $g$. If $g$ is a concentrated bump near the origin, $(f * g)(x)$ is a local average of $f$ around $x$ — a smoothing operation. If $g$ is itself rough, the convolution still makes sense (under integrability conditions) and is at least as regular as the smoother of the two factors.
Why Domains Matter
The integral $(f * g)(x) = \int f(x - y)g(y) \, dy$ requires two things: $y$ must lie in the domain of $g$, and $x - y$ must lie in the domain of $f$. If $f$ is supported on a set $A$ and $g$ on a set $B$, then the integrand can be nonzero only when $y \in B$ and $x - y \in A$, i.e., $x \in A + B$. The convolution's natural domain is therefore the Minkowski sum $A + B = \{a + b : a \in A, \, b \in B\}$, which is typically larger than either $A$ or $B$ individually. This domain growth is not a technicality: it is the reason convolution with a kernel of width $\varepsilon$ expands the support by at most $\varepsilon$ in every direction, and the same bookkeeping of domains governs kernel representations such as the Poisson integral for the Dirichlet problem on the ball.
The Three Algebraic Structures
Convolution gives $L^1(\mathbb{R}^n)$ the structure of a commutative Banach algebra (without identity). The Fourier transform is the Gelfand transform of this algebra, converting convolution to multiplication and making the spectral theory of translation-invariant operators algebraic. In PDE, a linear constant-coefficient equation $P(D)u = f$ on $\mathbb{R}^n$ is solved, for suitable $f$ (for instance compactly supported), by $u = E * f$, where $E$ is a fundamental solution satisfying $P(D)E = \delta_0$: the heat kernel, the Newtonian potential, and the Duhamel integral are all convolutions.
[/motivation]
Definition and Domain
The definition requires care about where the integral makes sense. The integrand involves $f$ evaluated at $x - y$ and $g$ at $y$, so both the integrability of $f$ and $g$ and the geometry of their supports determine the set of $x$ for which the convolution is finite.
[definition: Convolution]
Let $f, g: \mathbb{R}^n \to \mathbb{C}$ be measurable functions. The convolution of $f$ and $g$ is the function
\begin{align*} (f * g)(x) := \int_{\mathbb{R}^n} f(x - y) \, g(y) \, d\mathcal{L}^n(y), \end{align*}
defined for those $x \in \mathbb{R}^n$ for which the integral exists (either as a Lebesgue integral when the integrand is absolutely integrable, or as an improper integral in suitable senses).
[/definition]
The notation is deceptively simple. The integral involves $f$ evaluated at $x - y$ (a reflected and translated version of $f$) multiplied by $g(y)$. For the integral to be finite at a given $x$, one needs $y \mapsto f(x-y)g(y)$ to be integrable. This is where the integrability conditions (Young's inequality) and the support conditions (Minkowski sum) enter.
When Is the Convolution Well-Defined?
The most common sufficient conditions are:
$L^1 * L^1$: If $f, g \in L^1(\mathbb{R}^n)$, Fubini-Tonelli gives $\int\!\!\int |f(x-y)||g(y)| \, dy \, dx = \|f\|_{L^1}\|g\|_{L^1} < \infty$, so $f * g$ is defined a.e. and belongs to $L^1$.
$L^p * L^q$: More generally, Young's Convolution Inequality gives the complete $L^p$ picture.
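Compact support: If one of $f, g$ is bounded with compact support and the other is locally integrable, the integral reduces to a bounded region and $f * g$ is defined everywhere. If the compactly supported factor is merely integrable, $f * g$ is still defined almost everywhere and is locally integrable.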
[example: Domain Growth Under Convolution]
Let $f = \mathbb{1}_{[0,1]}$ and $g = \mathbb{1}_{[2,4]}$ on $\mathbb{R}$. The convolution is
\begin{align*} (f * g)(x) = \int_{\mathbb{R}} \mathbb{1}_{[0,1]}(x - y) \, \mathbb{1}_{[2,4]}(y) \, d\mathcal{L}^1(y) = \mathcal{L}^1\!\big([2,4] \cap [x-1, x]\big). \end{align*}
The intersection $[2,4] \cap [x-1, x]$ is nonempty exactly when $x - 1 \leq 4$ and $x \geq 2$, i.e., $x \in [2, 5]$. The convolution is: $0$ for $x < 2$; $x - 2$ for $2 \leq x \leq 3$; $1$ for $3 \leq x \leq 4$; $5 - x$ for $4 \leq x \leq 5$; $0$ for $x > 5$. The support of $f * g$ is $[2, 5] = [0, 1] + [2, 4]$ — the Minkowski sum of the two supports. Neither input is supported on $[2, 5]$; the convolution's domain has grown by adding the widths.
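Note that $f$ "lives" on $[0,1]$ and $g$ on $[2,4]$, which are disjoint, yet $f * g$ is nonzero on the interior of $[2, 5]$, an interval contained in neither support. Where the convolution lives is governed by sums of points of the two supports, not by their overlap.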
[/example]
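The piecewise formula in the example above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the grid spacing `h`, the helper `trapezoid`, and all variable names are illustrative choices rather than part of the page. It approximates the integral by a Riemann sum and compares the result with the closed form.

```python
import numpy as np

h = 0.001                                  # grid spacing (illustrative choice)
x = h * np.arange(-1000, 7000)             # grid on [-1, 7), containing both supports
f = ((x >= 0) & (x <= 1)).astype(float)    # indicator of [0, 1]
g = ((x >= 2) & (x <= 4)).astype(float)    # indicator of [2, 4]

# Riemann-sum approximation of the convolution integral. Entry k of the full
# discrete convolution collects pairs with x[i] + x[j] = 2 * x[0] + k * h.
conv = h * np.convolve(f, g)
x_conv = 2 * x[0] + h * np.arange(conv.size)

def trapezoid(t):
    """Closed form from the example: 0, t - 2, 1, 5 - t, 0."""
    return np.clip(np.minimum(t - 2.0, 5.0 - t), 0.0, 1.0)

max_error = np.max(np.abs(conv - trapezoid(x_conv)))
support = x_conv[conv > 1e-9]              # grid points where the result is nonzero

print("max deviation from closed form:", max_error)
print("numerical support: [%.3f, %.3f]" % (support.min(), support.max()))
```

The deviation should be of the order of the grid spacing, and the numerical support should match $[2, 5]$ up to the same resolution.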
Commutativity and Algebraic Properties
Convolution on $\mathbb{R}^n$ is commutative: $(f * g)(x) = (g * f)(x)$. The proof requires a global change of variables.
[remark: Commutativity Proof]
Substitute $z = x - y$ (so $y = x - z$, $dy = dz$):
\begin{align*} (f * g)(x) = \int f(x - y) g(y) \, dy = \int f(z) g(x - z) \, dz = (g * f)(x). \end{align*}
This substitution is valid on $\mathbb{R}^n$ because Lebesgue measure is invariant under translations and under the reflection $y \mapsto -y$. On other domains, for instance a non-abelian group, the substitution introduces the group inverse and convolution is not commutative in general.
[/remark]
[example: Commutativity And Asymmetric Supports]
Take $f = \mathbb{1}_{[0,1]}$ and $g = \rho_\varepsilon$, a smooth approximation of $\delta_0$ supported on $[-\varepsilon, \varepsilon]$. Then $f * g$ is a smooth function supported on $[-\varepsilon, 1 + \varepsilon]$: the mollification of $f$, which smooths the jumps at $0$ and $1$. Computing $g * f$ instead gives the same function, but the intuition is different. In $f * g$, we average $f$ against a narrow bump; in $g * f$, we smear the bump $g$ across the support of $f$. Commutativity says these produce the same output, despite the asymmetry between a rough function ($f$) and a smooth one ($g$).
[/example]
Beyond commutativity, convolution is associative — $(f * g) * h = f * (g * h)$ — and distributes over addition. These are verified by Fubini's theorem. The Dirac delta $\delta_0$ acts as a (distributional) identity: $f * \delta_0 = f$. However, no function in $L^1(\mathbb{R}^n)$ serves as a convolution identity: the Riemann-Lebesgue Lemma forces $\hat{f}(\xi) \to 0$ for $f \in L^1$, but a convolution identity would need $\hat{e}(\xi) = 1$ for all $\xi$, which is incompatible with decay.
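These algebraic identities have exact counterparts for finite sequences, which makes them easy to test. The sketch below assumes NumPy and uses random test sequences with hypothetical names. One caveat is built into the setting: on the integers the unit impulse is a genuine sequence, so the discrete algebra does have an identity, which is precisely what $L^1(\mathbb{R}^n)$ lacks.

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed, purely for reproducibility
f = rng.standard_normal(40)
g = rng.standard_normal(25)
k = rng.standard_normal(10)
delta = np.array([1.0])           # discrete unit impulse, standing in for delta_0

# f * g = g * f
commutative = np.allclose(np.convolve(f, g), np.convolve(g, f))

# (f * g) * k = f * (g * k)
associative = np.allclose(np.convolve(np.convolve(f, g), k),
                          np.convolve(f, np.convolve(g, k)))

# f * (a + k) = f * a + f * k, with a the first ten entries of g
a = g[:10]
distributive = np.allclose(np.convolve(f, a + k),
                           np.convolve(f, a) + np.convolve(f, k))

# f * delta = f
identity = np.array_equal(np.convolve(f, delta), f)

print(commutative, associative, distributive, identity)
```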
The Support of a Convolution
The domain example above illustrates a general principle: convolution adds supports. If $f$ vanishes outside a set $A$ and $g$ vanishes outside a set $B$, the integrand $f(x-y)g(y)$ can be nonzero only when $y \in B$ and $x - y \in A$, forcing $x \in A + B$. This makes precise the intuition that convolution "spreads out" a function by the width of the kernel.
[quotetheorem:588]
The support property is the mechanism behind the $\varepsilon$-enlargement in mollification: if $\operatorname{supp}(\rho_\varepsilon) = \overline{B}(0, \varepsilon)$ and $\operatorname{supp}(f) \subseteq K$, then $\operatorname{supp}(f * \rho_\varepsilon) \subseteq K + \overline{B}(0, \varepsilon)$ — the support expands by at most $\varepsilon$ in every direction. See Properties of Mollification for the full statement in the mollification context.
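For sequences the support inclusion can be verified by brute force. This is a small sketch assuming NumPy, with index sets `A` and `B` invented for illustration. Because the chosen entries are all positive, no cancellation occurs and the inclusion becomes an equality; with sign changes only the inclusion is guaranteed.

```python
import numpy as np

n = 64
f = np.zeros(n)
g = np.zeros(n)
f[[3, 4, 10]] = [1.0, 2.0, 0.5]    # f is nonzero exactly on A = {3, 4, 10}
g[[0, 7]] = [1.0, 3.0]             # g is nonzero exactly on B = {0, 7}

conv = np.convolve(f, g)

A = set(np.flatnonzero(f).tolist())
B = set(np.flatnonzero(g).tolist())
minkowski_sum = {a + b for a in A for b in B}
conv_support = set(np.flatnonzero(conv).tolist())

print("A + B          :", sorted(minkowski_sum))
print("support of f*g :", sorted(conv_support))
```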
$L^p$ Theory
Knowing that $f * g$ is well-defined is not enough — we need to control which $L^r$ space it lands in. For pointwise products, Hölder's inequality gives $\|fg\|_{L^r} \leq \|f\|_{L^p}\|g\|_{L^q}$ with $1/r = 1/p + 1/q$. Convolution does strictly better: the averaging effect gains one full unit of integrability, so the exponent relation shifts to $1/r = 1/p + 1/q - 1$. Without this gain, the approximate identity theory would not work — the $L^1 * L^p \to L^p$ bound that underlies mollification requires $r = p$ with $q = 1$.
[quotetheorem:463]
In the exponent relation $1/p + 1/q = 1 + 1/r$, two special cases matter most. The first is $L^1 * L^p \to L^p$: convolution with an $L^1$ kernel preserves $L^p$, with $\|f * g\|_p \leq \|f\|_p\|g\|_1$, which is the bound behind every approximate identity argument. The second is $L^1 * L^1 \to L^1$, with the multiplicative norm bound $\|f * g\|_1 \leq \|f\|_1 \|g\|_1$, which makes $L^1(\mathbb{R}^n)$ a Banach algebra under convolution.
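Young's inequality also holds on a uniform grid, with Riemann sums in place of integrals, so it can be tested directly. The sketch below assumes NumPy; the test functions, the grid, the helper `lp_norm`, and the list of exponent pairs are illustrative assumptions, and the pairs avoid the endpoint $r = \infty$.

```python
import numpy as np

def lp_norm(values, p, h):
    """Riemann-sum approximation of the L^p norm on a grid of spacing h."""
    return (h * np.sum(np.abs(values) ** p)) ** (1.0 / p)

h = 0.01
x = np.arange(-5.0, 5.0, h)
f = np.exp(-x**2) * np.sign(np.sin(7 * x))    # rough, sign-changing test function
g = 1.0 / (1.0 + x**2)                        # smooth test function

conv = h * np.convolve(f, g)                  # Riemann sum for f * g on the grid

checks = []
for p, q in [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (1.2, 3.0)]:
    r = 1.0 / (1.0 / p + 1.0 / q - 1.0)       # Young exponent: 1/r = 1/p + 1/q - 1
    lhs = lp_norm(conv, r, h)                 # norm of f * g in L^r
    rhs = lp_norm(f, p, h) * lp_norm(g, q, h) # product of the L^p and L^q norms
    checks.append((p, q, r, lhs, rhs))
    print(f"p={p}, q={q}, r={r:.3g}: {lhs:.4f} <= {rhs:.4f}")
```

Each printed line should show the left-hand side bounded by the right-hand side; the check illustrates the inequality on one pair of functions and is not a proof.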
Young's inequality controls the norm of $f * g$. But for convergence of approximate identities — showing $f * \varphi_\varepsilon \to f$ — we need to control the difference $f * \varphi_\varepsilon - f = \int \varphi_\varepsilon(y)(f(\cdot - y) - f(\cdot)) \, dy$. This requires pulling the $L^p_x$ norm inside the $dy$ integral, which is a job for Minkowski's integral inequality.
[quotetheorem:464]
Together, Young and Minkowski reduce $L^p$ convergence of convolutions to the $L^p$-continuity of translation: $\|f(\cdot - h) - f\|_{L^p} \to 0$ as $h \to 0$, which holds for all $f \in L^p$ with $1 \leq p < \infty$.
Regularity: Convolution Inherits the Best
Young's inequality and support control are quantitative but say nothing about differentiability. The following regularity principle is equally fundamental: if one factor is smooth, the convolution inherits that smoothness. The reason is that the derivative of $f * g$ can be computed by differentiating either factor — and one always chooses the smooth one. Without this principle, mollification could not produce smooth approximations.
[quotetheorem:35]
The principle generalises: if $f \in L^1_{\mathrm{loc}}$ and $g \in C_c^k$, then $f * g \in C^k$ and $D^\alpha(f * g) = f * D^\alpha g$ for $|\alpha| \leq k$. Intuitively, $f * g$ is at least as smooth as the smoother factor. This is the fundamental mechanism of mollification: convolving any rough function with $\rho_\varepsilon \in C_c^\infty$ produces a $C^\infty$ function. See the mollifier page for the full development.
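The rule that the derivative may be moved onto the smooth factor has a discrete analogue in which difference quotients replace derivatives. The sketch below assumes NumPy and uses the standard bump $e^{-1/(1-t^2)}$ as the smooth factor; the grid and the names are illustrative. The identity holds to rounding error here because the bump vanishes at both ends of the grid.

```python
import numpy as np

h = 0.01
x = h * np.arange(-300, 301)                   # grid on [-3, 3]
f = (np.abs(x) <= 1.0).astype(float)           # rough factor: indicator of [-1, 1]

# Smooth compactly supported factor: exp(-1 / (1 - t^2)) on (-1, 1), zero elsewhere.
inside = np.abs(x) < 1.0
g = np.zeros_like(x)
g[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))

conv = h * np.convolve(f, g)                   # Riemann sum for f * g
d_conv = np.diff(conv) / h                     # difference quotient of f * g
conv_dg = h * np.convolve(f, np.diff(g) / h)   # f * (difference quotient of g)

print("max discrepancy:", np.max(np.abs(d_conv - conv_dg)))
```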
Interaction with the Fourier Transform
The deepest structural property of convolution is that the Fourier transform diagonalises it: convolution in the spatial domain becomes pointwise multiplication in the frequency domain.
[quotetheorem:250]
The dual identity, in which multiplication becomes convolution, also holds: $\widehat{fg}(\xi) = (2\pi)^{-n}(\hat{f} * \hat{g})(\xi)$, where the constant corresponds to the normalisation $\hat{f}(\xi) = \int f(x) e^{-i x \cdot \xi} \, dx$. This duality is the reason the Fourier transform is so effective for PDE: a differential equation $P(D)u = f$ transforms to $P(\xi)\hat{u}(\xi) = \hat{f}(\xi)$, which is algebraic; inverting formally gives $\hat{u} = \hat{f}/P$, i.e., $u = \mathcal{F}^{-1}[1/P] * f$, expressing the solution as a convolution with the fundamental solution whenever $1/P$ defines a tempered distribution.
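The exchange of convolution and multiplication has an exact counterpart for the discrete Fourier transform, provided both sequences are zero-padded to the length of their linear convolution. A minimal sketch, assuming NumPy's `numpy.fft` module and random test sequences with hypothetical names:

```python
import numpy as np

rng = np.random.default_rng(1)       # fixed seed, purely for reproducibility
f = rng.standard_normal(50)
g = rng.standard_normal(30)

linear_conv = np.convolve(f, g)      # length 50 + 30 - 1 = 79
n = linear_conv.size

# Zero-padding both factors to length n makes circular convolution (what the
# DFT diagonalises) coincide with the linear convolution computed above.
product_of_transforms = np.fft.fft(f, n) * np.fft.fft(g, n)
transform_of_conv = np.fft.fft(linear_conv)

# Inverting the product recovers the convolution itself.
recovered = np.fft.ifft(product_of_transforms).real

print(np.allclose(transform_of_conv, product_of_transforms))
print(np.allclose(recovered, linear_conv))
```

Computing a convolution by transforming, multiplying, and inverting is the basis of fast convolution algorithms.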
The convolution theorem extends to distributions: when $u$ is a tempered distribution and $\varphi$ is a Schwartz function, the convolution $u * \varphi$ is a smooth function of at most polynomial growth, and the Fourier exchange $\widehat{u * \varphi} = \hat{u} \cdot \hat{\varphi}$ continues to hold. This is essential for PDE, where the fundamental solution $E$, characterised by $P(D)E = \delta_0$, is in general a distribution rather than a classical function, while the source term is a function.
[quotetheorem:458]
Approximate Identities
There is no convolution identity in $L^1$ (as shown above), but there are approximate identities: families of kernels $\varphi_\varepsilon$ that act increasingly like $\delta_0$ as $\varepsilon \to 0$. The conditions needed are surprisingly minimal: unit mass, a uniform $L^1$ bound, and concentration at the origin suffice to guarantee $f * \varphi_\varepsilon \to f$ in $L^p$ for any $f \in L^p$ with $1 \leq p < \infty$.
[definition: Approximate Identity]
A family $\{\varphi_\varepsilon\}_{\varepsilon > 0}$ in $L^1(\mathbb{R}^n)$ is an approximate identity if: (1) $\int_{\mathbb{R}^n} \varphi_\varepsilon \, d\mathcal{L}^n = 1$ for all $\varepsilon > 0$; (2) $\sup_\varepsilon \|\varphi_\varepsilon\|_{L^1} < \infty$; (3) for every $\delta > 0$, $\int_{|x| > \delta} |\varphi_\varepsilon(x)| \, d\mathcal{L}^n(x) \to 0$ as $\varepsilon \to 0$.
[/definition]
The three conditions say: unit mass, uniform $L^1$ bound, and concentration. The convergence $f * \varphi_\varepsilon \to f$ in $L^p$ (for $1 \leq p < \infty$) follows from the same argument used in part (3) of Properties of Mollification: Minkowski's integral inequality reduces the problem to $L^p$-continuity of translation, and the concentration property (3) localises the integral.
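The argument can be written out in two steps. The following is a sketch for $1 \leq p < \infty$, writing $M := \sup_\varepsilon \|\varphi_\varepsilon\|_{L^1}$ (notation introduced here for convenience). By unit mass (1) and Minkowski's integral inequality,
\begin{align*} \|f * \varphi_\varepsilon - f\|_{L^p} \leq \int_{\mathbb{R}^n} |\varphi_\varepsilon(y)| \, \|f(\cdot - y) - f\|_{L^p} \, d\mathcal{L}^n(y). \end{align*}
Given $\eta > 0$, continuity of translation provides $\delta > 0$ with $\|f(\cdot - y) - f\|_{L^p} \leq \eta$ for $|y| \leq \delta$. Splitting the integral at $|y| = \delta$ and using the crude bound $\|f(\cdot - y) - f\|_{L^p} \leq 2\|f\|_{L^p}$ on the outer region gives
\begin{align*} \|f * \varphi_\varepsilon - f\|_{L^p} \leq \eta M + 2\|f\|_{L^p} \int_{|y| > \delta} |\varphi_\varepsilon(y)| \, d\mathcal{L}^n(y). \end{align*}
The last integral tends to $0$ as $\varepsilon \to 0$ by (3), so $\limsup_{\varepsilon \to 0} \|f * \varphi_\varepsilon - f\|_{L^p} \leq \eta M$; since $\eta$ was arbitrary, the limit is $0$.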
Important examples: the standard mollifier $\rho_\varepsilon$ (compactly supported, gives $C^\infty$ approximations); the heat kernel $(4\pi\varepsilon)^{-n/2}e^{-|x|^2/(4\varepsilon)}$ (solves the heat equation, not compactly supported); the Poisson kernel (solves the Laplace equation on the half-space); the Fejér kernel (trigonometric approximate identity on $\mathbb{T}$). The standard mollifier is distinguished by compact support, which gives the support control that the others lack.
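The convergence can be observed numerically for the heat kernel in one dimension. The sketch below assumes NumPy; the grids, the values of $\varepsilon$, and the helper `heat_kernel` are illustrative choices. It convolves the indicator of $[0,1]$ with the kernel for decreasing $\varepsilon$ and records an approximate $L^1$ error.

```python
import numpy as np

h = 0.005
x = h * np.arange(-800, 1001)                   # grid on [-4, 5]
f = ((x >= 0) & (x <= 1)).astype(float)         # indicator of [0, 1]
y = h * np.arange(-600, 601)                    # symmetric kernel grid on [-3, 3]

def heat_kernel(y, eps):
    """One-dimensional heat kernel (4*pi*eps)^(-1/2) * exp(-y^2 / (4*eps))."""
    return np.exp(-y**2 / (4.0 * eps)) / np.sqrt(4.0 * np.pi * eps)

masses = []
errors = []
for eps in [0.1, 0.01, 0.001]:
    phi = heat_kernel(y, eps)
    masses.append(h * phi.sum())                # Riemann sum of the kernel, close to 1
    # mode="same" keeps the output aligned with the x grid because the kernel
    # grid is symmetric about 0 and has odd length.
    smoothed = h * np.convolve(f, phi, mode="same")
    errors.append(h * np.abs(smoothed - f).sum())   # approximate L^1 error

for eps, err in zip([0.1, 0.01, 0.001], errors):
    print(f"eps={eps}: approximate L1 error {err:.4f}")
```

The errors should decrease as $\varepsilon$ shrinks, consistent with $f * \varphi_\varepsilon \to f$ in $L^1$; the jumps of $f$ rule out uniform convergence.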
Convolution of Distributions
The definition extends beyond functions. For a distribution $T \in \mathcal{D}'(\mathbb{R}^n)$ and a test function $\varphi \in \mathcal{D}(\mathbb{R}^n)$, the convolution $(T * \varphi)(x) := T_y(\varphi(x - y))$ is always well-defined (the map $y \mapsto \varphi(x - y)$ is in $\mathcal{D}$ for each $x$) and produces a $C^\infty$ function. This is the distributional analogue of the regularity principle: convolving with a smooth function always smooths.
For two distributions $T, S \in \mathcal{D}'(\mathbb{R}^n)$, the convolution $T * S$ requires a support condition to ensure that the defining pairing makes sense; the standard sufficient condition is that at least one of $T, S$ is compactly supported. When this holds, $T * S \in \mathcal{D}'(\mathbb{R}^n)$ and satisfies:
\begin{align*} \partial^\alpha(T * S) = (\partial^\alpha T) * S = T * (\partial^\alpha S) \end{align*}
for every multi-index $\alpha$. The Dirac delta is the convolution identity: $T * \delta_0 = T$ for every distribution $T$. See the Distribution page for the full framework and Theorem 458 for the tempered distribution case.