Microlocal analysis provides a framework for understanding differential and pseudodifferential operators through the joint lens of position and frequency, enabling precise control over solutions to partial differential equations. This course develops the theory of pseudodifferential operators from first principles, beginning with Fourier multipliers as a model case and progressing through the symbolic calculus that underpins modern PDE theory. Pseudodifferential operators generalize classical differential operators by replacing polynomial symbol functions with more general smooth symbols, allowing for a vastly richer class of operators that nonetheless retain good analytic properties and a coherent composition algebra.
The course is organized around the construction and manipulation of this symbolic calculus. Chapters 1–7 build the foundational machinery: Fourier multipliers establish the basic intuition; symbol classes formalize the decay and smoothness conditions that make the calculus work; quantisation rules translate symbols into operators via integral kernels; and composition, adjoints, and commutator formulas show how the algebra behaves under operations. Asymptotic expansions emerge as a key theme, allowing us to track operator behavior not just qualitatively but through precise asymptotic formulas in the symbol.
The final chapters apply this machinery to fundamental problems in analysis and geometry. Chapters 8–10 exploit Sobolev regularity, ellipticity, and Fredholm theory to establish solvability and spectral properties of classical differential operators, while Chapter 11 extends the calculus to manifolds through coordinate localisation. Throughout, the microlocal perspective, which studies operators simultaneously in position and frequency, reveals structure hidden by traditional approaches: principal symbols encode the leading-order behavior, commutators lose an order relative to products, and the interplay between regularity and decay becomes transparent. Chapter 12 consolidates these threads through the lens of elliptic theory, showing how pseudodifferential calculus unifies the solution of boundary-value problems and the construction of parametrices.
# Introduction
This introductory chapter fixes the scope, prerequisites, and guiding examples for the course. The central question is how far the Fourier multiplier description of constant-coefficient differential operators can be extended to variable-coefficient operators while retaining an algebraic calculus. The answer developed in the course is the local pseudodifferential calculus on open subsets of $\mathbb R^n$, built from symbols, quantisation, composition, parametrices, and Sobolev estimates.
The course sits between Fourier analysis and elliptic PDE. Fourier analysis supplies the model operators, distributions supply the correct domain of action, and Sobolev spaces supply the scale on which estimates are measured. A later course on wave front sets and Fourier integral operators will attach singularities to directions in phase space; in this course the emphasis stays on the operator algebra needed before that refinement enters.
## What the Course Is Trying to Build
The first problem is that differential operators with variable coefficients are local in $x$ but not diagonal under the Fourier transform. Constant-coefficient operators become multiplication by a polynomial in the frequency variable $\xi$, so their mapping properties are visible from the size and zero set of that polynomial. Variable coefficients destroy exact diagonalisation, but the structure survives approximately: the coefficients vary in $x$ while differentiation still corresponds to powers of $\xi$.
[motivation]
### From Multipliers to Variable Coefficients
For a constant-coefficient operator
\begin{align*}
P(D)u = \sum_{|\alpha| \le m} c_\alpha D^\alpha u,
\end{align*}
the Fourier transform gives
\begin{align*}
\widehat{P(D)u}(\xi) = p(\xi)\hat{u}(\xi), \qquad p(\xi)=\sum_{|\alpha|\le m}c_\alpha \xi^\alpha,
\end{align*}
up to the chosen convention for powers of $D$. The operator is therefore controlled by a function $p$ on frequency space. This is the model for a pseudodifferential operator: replace $p(\xi)$ by a function $a(x,\xi)$ whose $x$-dependence records variable coefficients and whose $\xi$-growth records order.
### Why Estimates Replace Formulas
The calculus cannot rely only on closed-form expressions for $a(x,\xi)$. Composition, adjoints, coordinate cutoffs, and parametrices produce expansions rather than finite formulas. The course therefore organises operators by derivative estimates on symbols, since these estimates survive the operations needed in elliptic theory.
[/motivation]
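A small worked case shows how a correction term appears as soon as two operators are composed. It is only a sketch, assuming a hypothetical coefficient $a\in C^\infty(U)$ and the convention $D_j=-i\partial_{x_j}$. Composing multiplication by $a$ with $D_j$ and testing on the oscillation $e^{ix\cdot\xi}$ gives, by the Leibniz rule,
\begin{align*}
D_j\bigl(a(x)e^{ix\cdot\xi}\bigr)=a(x)\,D_je^{ix\cdot\xi}+(D_ja)(x)\,e^{ix\cdot\xi}=\bigl(\xi_ja(x)+(D_ja)(x)\bigr)e^{ix\cdot\xi}.
\end{align*}
Applying the Leibniz rule twice gives the second-order analogue
\begin{align*}
D_j^2\bigl(a(x)e^{ix\cdot\xi}\bigr)=\bigl(\xi_j^2a(x)+2\xi_j(D_ja)(x)+(D_j^2a)(x)\bigr)e^{ix\cdot\xi}.
\end{align*}
In both cases the leading term is the product of the two symbols, and the remaining terms are of lower degree in $\xi$ and involve derivatives of $a$. For differential operators the list of corrections is finite; for general symbols it becomes an expansion, which is why the calculus is organised by estimates.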
This motivation leads to the course's main object. The formal definition will come later after Fourier conventions and distribution spaces have been reviewed, but the guiding shape is already visible: an operator is assembled by taking the Fourier transform, multiplying by an amplitude depending on both $x$ and $\xi$, and inverting the transform.
[definition: Local Pseudodifferential Calculus]
Let $U \subset \mathbb R^n$ be open. The local pseudodifferential calculus on $U$ is the framework that starts with smooth functions $a(x,\xi)$ on position-frequency space and turns them into operators, usually written $\operatorname{Op}_U(a)$, acting first on compactly supported smooth test functions. The growth of $a$ as $|\xi|\to\infty$ records the order of the operator, while derivatives in $x$ and $\xi$ measure how stable the operator is under localization, composition, and adjoints.
The word local means that the construction is tested after inserting compactly supported cutoffs in $U$. Properly supported operators are the ones whose kernels have support controlled enough that they act on distributions without sending information infinitely far away. Smoothing operators are the error terms with smooth kernels; they are negligible for finite-order local Sobolev regularity.
[/definition]
This is not a single operator class in isolation. It is a package: symbol spaces, a way of turning symbols into operators, an [equivalence relation](/page/Equivalence%20Relation) modulo smoothing terms, and estimates strong enough to control Sobolev norms. The later chapters prove the structural properties deliberately kept out of the definition: composition and adjoint formulae modulo smoothing operators, elliptic parametrices, and continuous maps between local Sobolev spaces.
One notation from the later symbol theory is useful already. The class $S^{-\infty}$ denotes the intersection of all symbol classes of finite order: a symbol is in $S^{-\infty}$ when it satisfies symbol estimates of order $N$ for every $N \in \mathbb Z$. Operators with symbols in $S^{-\infty}$ are smoothing, so this notation is a symbolic way of recording that an error term has become negligible for the calculus.
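As a hedged illustration, assume that the symbol estimates of order $N$ for an $x$-independent symbol take the standard form $|\partial_\xi^\alpha a(\xi)|\le C_{\alpha,N}\langle\xi\rangle^{N-|\alpha|}$ with $\langle\xi\rangle=(1+|\xi|^2)^{1/2}$; the formal definition is given later in the course and may differ in detail. Let $\chi\in C_c^\infty(\mathbb R^n)$ be a hypothetical cutoff supported in $\{|\xi|\le R\}$. On the support of $\chi$ one has $\langle\xi\rangle\le\langle R\rangle$, so for every $M\ge 0$ and every multi-index $\alpha$,
\begin{align*}
|\partial_\xi^\alpha\chi(\xi)|\le\Bigl(\sup_{\eta\in\mathbb R^n}|\partial_\eta^\alpha\chi(\eta)|\Bigr)\langle R\rangle^{M}\langle\xi\rangle^{-M}.
\end{align*}
Given $N\in\mathbb Z$, choosing $M\ge\max(0,|\alpha|-N)$ gives $\langle\xi\rangle^{-M}\le\langle\xi\rangle^{N-|\alpha|}$, so $\chi$ satisfies the order-$N$ estimates for every $N$. Under the stated assumption, compactly supported smooth functions of $\xi$ are therefore the simplest elements of $S^{-\infty}$.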
[example: Differential Operators As Symbols]
Let $U \subset \mathbb R^n$ be open, and use the course convention $D_j=-i\partial_{x_j}$. For
\begin{align*}
L u(x)=\sum_{|\alpha|\le m} a_\alpha(x)D^\alpha u(x),
\end{align*}
with $a_\alpha\in C^\infty(U)$, the coefficient $a_\alpha(x)$ records the $x$-dependence and the factor $D^\alpha$ records the frequency dependence. Indeed, for the oscillation $e^{ix\cdot \xi}$ one has
\begin{align*}
D_j e^{ix\cdot \xi}=-i\partial_{x_j}e^{ix\cdot \xi}=-i(i\xi_j)e^{ix\cdot \xi}=\xi_j e^{ix\cdot \xi}.
\end{align*}
Applying this identity once for each derivative in the multi-index $\alpha$ gives
\begin{align*}
D^\alpha e^{ix\cdot \xi}=\xi^\alpha e^{ix\cdot \xi}.
\end{align*}
Therefore
\begin{align*}
L(e^{ix\cdot \xi})=\sum_{|\alpha|\le m}a_\alpha(x)\xi^\alpha e^{ix\cdot \xi}=p(x,\xi)e^{ix\cdot \xi},
\end{align*}
where
\begin{align*}
p(x,\xi)=\sum_{|\alpha|\le m}a_\alpha(x)\xi^\alpha.
\end{align*}
The same differential expression extends to distributions by defining $\langle Lu,\varphi\rangle=\langle u,L^t\varphi\rangle$ for $\varphi\in C_c^\infty(U)$, where $L^t$ is the transpose differential operator on test functions.
For example, if $L=-\Delta+V(x)$ and $D_j=-i\partial_{x_j}$, then
\begin{align*}
D_j^2=(-i\partial_{x_j})(-i\partial_{x_j})=-\partial_{x_j}^2.
\end{align*}
Hence
\begin{align*}
-\Delta=-\sum_{j=1}^n\partial_{x_j}^2=\sum_{j=1}^nD_j^2.
\end{align*}
On $e^{ix\cdot \xi}$ this gives
\begin{align*}
(-\Delta)e^{ix\cdot \xi}=\sum_{j=1}^nD_j^2e^{ix\cdot \xi}=\sum_{j=1}^n\xi_j^2e^{ix\cdot \xi}=|\xi|^2e^{ix\cdot \xi}.
\end{align*}
The multiplication term involves no differentiation, so it acts on $e^{ix\cdot \xi}$ simply as the factor $V(x)$ and contributes no power of $\xi$.
Thus
\begin{align*}
(-\Delta+V(x))e^{ix\cdot \xi}=(|\xi|^2+V(x))e^{ix\cdot \xi},
\end{align*}
so the symbol is $p(x,\xi)=|\xi|^2+V(x)$ and the principal symbol is the degree-$2$ part $|\xi|^2$. This is the basic separation used throughout the course: the highest powers of $\xi$ determine order and ellipticity, while lower powers and zeroth-order coefficients enter as lower-order corrections.
[/example]
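The transpose used in the example above to extend $L$ to distributions can be made explicit in the first-order case. The following is a sketch, assuming a hypothetical coefficient $a\in C^\infty(U)$, the operator $L=a(x)D_j$, and the bilinear pairing $\langle u,\varphi\rangle=\int_U u\varphi\,d\mathcal L^n$ for smooth $u$. For $u,\varphi\in C_c^\infty(U)$, integration by parts produces no boundary term, so
\begin{align*}
\langle Lu,\varphi\rangle=\int_U (D_ju)(x)\,a(x)\varphi(x)\,d\mathcal L^n(x)=-\int_U u(x)\,D_j(a\varphi)(x)\,d\mathcal L^n(x).
\end{align*}
Hence
\begin{align*}
L^t\varphi=-D_j(a\varphi)=-a\,D_j\varphi-(D_ja)\varphi.
\end{align*}
The transpose is again a first-order differential operator with smooth coefficients, so the formula $\langle Lu,\varphi\rangle=\langle u,L^t\varphi\rangle$ makes sense for every $u\in\mathcal D'(U)$.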
Differential operators give the entry point, but the calculus is larger because inverses of elliptic differential operators are seldom differential operators. The search for approximate inverses is the reason pseudodifferential operators appear naturally in PDE.
## The Guiding Elliptic Problem
The next question is how to solve an elliptic equation microlocally before introducing the full language of wave front sets. If $L$ is elliptic, its principal symbol does not vanish at high frequency, so the first approximation to an inverse should divide by that principal symbol. This is the parametrix idea.
[definition: Parametrix]
Let $U \subset \mathbb R^n$ be open and let $L: \mathcal D'(U) \to \mathcal D'(U)$ be a continuous linear operator. A parametrix for $L$ is a continuous linear operator $Q: \mathcal D'(U) \to \mathcal D'(U)$ such that
\begin{align*}
QL = I - R_1, \qquad LQ = I - R_2,
\end{align*}
where $R_1,R_2: \mathcal D'(U) \to C^\infty(U)$ are smoothing operators, regarded as maps into $\mathcal D'(U)$ by the canonical inclusion $C^\infty(U) \subset \mathcal D'(U)$.
[/definition]
A parametrix gives the right substitute for an inverse when exact inversion is obstructed by lower-order and compactly supported effects. The next result is the course-level form of the definition: ellipticity is the condition that permits inversion modulo smoothing errors, provided the operator is interpreted inside the properly supported local calculus.
[quotetheorem:7662]
[citeproof:7662]
The ellipticity hypothesis is essential because it prevents high-frequency directions from escaping the inverse construction. For example, the operator $\partial_{x_1}$ on $\mathbb R^2$ has principal symbol proportional to $\xi_1$, which vanishes for the high-frequency covectors $(0,\xi_2)$; singular oscillations depending only on $x_2$ cannot be recovered by dividing by $\xi_1$. The conclusion also does not assert that $Q$ is an exact inverse, solve boundary value problems, or control global topology on a noncompact domain. It gives an inverse modulo smoothing errors inside the local properly supported calculus, and those smoothing remainders can still carry finite-dimensional or global information in later Fredholm problems.
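In the constant-coefficient setting the parametrix idea can be carried out by hand. The following sketch works on $\mathcal S'(\mathbb R^n)$ with Fourier multipliers and ignores the proper-support modification of the local calculus; the cutoff $\chi$ and the notation $q(D)$ for the multiplier with symbol $q$ are assumptions introduced for illustration. Let $\chi\in C_c^\infty(\mathbb R^n)$ equal $1$ near $\xi=0$, and set
\begin{align*}
q(\xi)=\frac{1-\chi(\xi)}{|\xi|^2},
\end{align*}
which is smooth because the numerator vanishes near the origin. Since $-\Delta$ has multiplier $|\xi|^2$,
\begin{align*}
q(\xi)\,|\xi|^2=1-\chi(\xi), \qquad\text{so}\qquad q(D)(-\Delta)=(-\Delta)q(D)=I-\chi(D).
\end{align*}
The remainder $\chi(D)u=\mathcal F^{-1}(\chi\hat u)$ is the inverse Fourier transform of a compactly supported distribution and is therefore smooth for every $u\in\mathcal S'(\mathbb R^n)$. Dividing by the principal symbol works at high frequency, and the failure of $1/|\xi|^2$ to be smooth at $\xi=0$ is absorbed into a smoothing error rather than inverted exactly.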
[example: The Bessel Potential Model]
On $\mathbb R^n$, define the Bessel potential operator $\langle D\rangle^s=(I-\Delta)^{s/2}$ by its action on the Fourier transform:
\begin{align*}
\widehat{\langle D\rangle^s u}(\xi)=\langle \xi\rangle^s\hat u(\xi).
\end{align*}
Here
\begin{align*}
\langle \xi\rangle=(1+|\xi|^2)^{1/2}.
\end{align*}
Raising both sides to the power $s$ gives
\begin{align*}
\langle \xi\rangle^s=((1+|\xi|^2)^{1/2})^s=(1+|\xi|^2)^{s/2}.
\end{align*}
The inverse multiplier is obtained by multiplying by the reciprocal symbol. Since
\begin{align*}
\langle \xi\rangle^s\langle \xi\rangle^{-s}=\langle \xi\rangle^{s-s}=\langle \xi\rangle^0=1,
\end{align*}
we have
\begin{align*}
\widehat{\langle D\rangle^{-s}\langle D\rangle^s u}(\xi)=\langle \xi\rangle^{-s}\langle \xi\rangle^s\hat u(\xi)=\hat u(\xi).
\end{align*}
Thus $\langle D\rangle^{-s}$ is the inverse of $\langle D\rangle^s$ on the Fourier side.
In the Sobolev norm,
\begin{align*}
\|\langle D\rangle^s u\|_{L^2}=\|\langle \xi\rangle^s\hat u(\xi)\|_{L^2}=\|u\|_{H^s}.
\end{align*}
So applying $\langle D\rangle^s$ turns $H^s$ regularity into an $L^2$ quantity: positive order consumes $s$ derivatives, while the inverse multiplier $\langle D\rangle^{-s}$ restores those derivatives.
[/example]
The Bessel potential model also explains why Sobolev spaces are the natural scale for the course. Operators of order $m$ should act like $m$ derivatives, so the target exponent should drop by $m$.
## Prerequisites and Conventions
The course assumes that distributions are already familiar enough to be used as the default domain for differential operators. Smooth compactly supported functions are too small for elliptic inversion, while arbitrary functions are not stable under weak differentiation. Distribution theory provides a common language for kernels, smoothing operators, Fourier transforms, and weak solutions.
The special role of smoothing remainders is stronger than merely being compact or lower order. A lower-order operator can still preserve singularities, as a first-order derivative is lower order than a second-order elliptic operator but certainly does not regularise distributions. A compact operator on one Sobolev space also need not improve regularity by every order. Smoothing is the precise negligible class because it removes the singular behaviour that the calculus is designed to track.
[definition: Smoothing Operator]
Let $U \subset \mathbb R^n$ be open. A smoothing operator on $U$ is a continuous linear operator $R: \mathcal D'(U) \to C^\infty(U)$.
[/definition]
Smoothing operators are the negligible terms of this course. They are not zero, but they remove singularities and improve Sobolev regularity by every order on compact subsets. The calculus is therefore often most meaningful modulo smoothing operators.
[remark: Fourier Transform Convention]
The Fourier transform is first defined on $\mathcal S(\mathbb R^n)$ as a map $\mathcal F: \mathcal S(\mathbb R^n) \to \mathcal S(\mathbb R^n)$ by the symmetric normalisation
\begin{align*}
\hat{f}(\xi) = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb R^n} f(x)e^{-ix\cdot \xi}\,d\mathcal L^n(x).
\end{align*}
The inverse transform uses the same factor and the phase $e^{ix\cdot \xi}$. By duality, the same notation denotes the extension $\mathcal F: \mathcal S'(\mathbb R^n) \to \mathcal S'(\mathbb R^n)$.
[/remark]
This convention makes Plancherel's theorem take the clean form $\|\hat f\|_{L^2}=\|f\|_{L^2}$. The next step is to turn that identity into a family of norms that measure differentiability by frequency growth. Those norms are the Sobolev scale on which the order of a pseudodifferential operator becomes a mapping statement.
[definition: Sobolev Scale Used in the Course]
For $s \in \mathbb R$, the Sobolev space $H^s(\mathbb R^n)$ is defined by
\begin{align*}
H^s(\mathbb R^n)=\{u \in \mathcal S'(\mathbb R^n): \langle \xi\rangle^s \hat u(\xi) \in L^2(\mathbb R^n)\}.
\end{align*}
Its norm is
\begin{align*}
\|u\|_{H^s}=\|\langle \xi\rangle^s\hat u\|_{L^2}.
\end{align*}
[/definition]
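A quick test case, included as an illustration, is the Dirac mass $\delta$ at the origin. Under the symmetric convention its Fourier transform is the constant $(2\pi)^{-n/2}$, so in polar coordinates
\begin{align*}
\|\delta\|_{H^s}^2=\frac{1}{(2\pi)^{n}}\int_{\mathbb R^n}\langle\xi\rangle^{2s}\,d\mathcal L^n(\xi)=\frac{|S^{n-1}|}{(2\pi)^{n}}\int_0^\infty(1+r^2)^{s}r^{n-1}\,dr.
\end{align*}
The integrand behaves like $r^{2s+n-1}$ as $r\to\infty$, so the integral is finite exactly when $2s+n-1<-1$. Thus $\delta\in H^s(\mathbb R^n)$ if and only if $s<-n/2$: negative exponents on the scale are what accommodate singular distributions.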
The same scale is localised to open sets by cutoffs and coordinate patches. Once Sobolev regularity has been encoded by powers of $\langle \xi\rangle$, the main analytic question becomes whether symbol order always translates into loss or gain of the corresponding number of derivatives. This is the mapping principle that makes the calculus useful for PDE estimates.
We write $\langle D\rangle^t$ for the Bessel potential Fourier multiplier with symbol $\langle \xi\rangle^t$. Thus
\begin{align*}
\widehat{\langle D\rangle^t u}(\xi)=\langle \xi\rangle^t\hat u(\xi),
\end{align*}
initially for $u \in \mathcal S(\mathbb R^n)$ and then by extension on [tempered distributions](/page/Tempered%20Distributions) whenever the expression is defined.
Bessel potentials make the expected estimate testable: if an order-$m$ operator behaves like $m$ derivatives, then conjugating it by $\langle D\rangle^{-s}$ on the input and $\langle D\rangle^{s-m}$ on the output should leave an order-zero operator. The next theorem is therefore the analytic bridge between symbol bookkeeping and the Sobolev estimates used in elliptic regularity and Fredholm arguments.
[quotetheorem:7715]
[citeproof:7715]
This mapping principle is one of the main payoffs of the calculus. It turns symbol order into a precise regularity statement and allows PDE estimates to be proved by algebraic manipulations of symbols.
The hypothesis $a \in S^m_{1,0}$ is not cosmetic. The estimates in this class allow each $\xi$-derivative to improve decay without losing too much control in $x$, which is exactly what the integration-by-parts and almost-orthogonality arguments require. In rougher symbol classes, especially when $\delta$ is too large relative to $\rho$, order-zero operators need not be bounded on $L^2$; Ching-type examples in $S^0_{1,1}$ give model failures where high-frequency packets are arranged so that the dyadic pieces do not satisfy the square-summability needed for an $L^2$ estimate. The theorem is also local in nature: on open subsets one inserts cutoffs or assumes proper support, and boundary conditions require additional analysis beyond the interior pseudodifferential calculus.
## Structure of the Course
The final organisational question is what must be built before parametrices and elliptic regularity can be proved. The order is dictated by the dependencies: Fourier multipliers motivate the estimates, symbol classes provide the objects, quantisation turns symbols into operators, and composition gives the algebra.
[explanation: Chapter Roadmap]
The first lecture block reviews Fourier multipliers, tempered distributions, Bessel potentials, and constant-coefficient elliptic regularity. This block fixes the model examples and the Sobolev scale.
The second block introduces Hörmander symbol classes $S^m_{\rho,\delta}$, with the main emphasis on $S^m_{1,0}$. The core skills are reading symbol estimates, using seminorms, recognising smoothing symbols, and manipulating cutoffs.
The third block defines quantisation and studies kernels. Proper support, smoothing operators, and transpose or adjoint operations enter here because they are needed to compose local operators without losing control of supports.
The fourth block proves the symbolic composition formula and asymptotic summation. This is where the calculus becomes an algebra: products of operators correspond to asymptotic products of symbols.
The final block applies the calculus to elliptic operators. It constructs parametrices, proves elliptic regularity, establishes Sobolev mapping properties, and explains how parametrices feed into Fredholm theory on compact geometries. It also explains why the next course must refine support information into phase-space singular support, the language needed for wave propagation.
[/explanation]
By the end of the course, the reader should be able to pass between three descriptions of an operator: its oscillatory integral formula, its symbol estimates, and its action on Sobolev spaces. The course does not yet develop the full microlocal theory of wave front sets, but it prepares the operator calculus on which that theory rests.
The introduction outlined three descriptions of an operator: its oscillatory integral formula, its symbol estimates, and its action on Sobolev spaces. This chapter treats the most explicit setting for all three: the constant-coefficient case, where the Fourier transform fully diagonalises the operator. This model calculus sets the analytic conventions and computational style for the entire course.
# 1. Fourier Multipliers and the Model Calculus
Microlocal analysis begins with the most rigid operators: those whose action is diagonalised by the Fourier transform. This chapter fixes the analytic conventions used throughout the course and treats constant-coefficient operators as the model case for the later pseudodifferential calculus. The main point is that differentiation becomes multiplication by a polynomial in frequency, while Sobolev regularity is measured by weighted $L^2$ size of the Fourier transform.
The later variable-coefficient calculus will imitate the constructions here locally in $x$ and globally in $\xi$. For that reason we spend time on normalisations, multiplier domains, and parametrices: these are the features that survive when a polynomial multiplier is replaced by a general symbol.
## Fourier Transform on Schwartz Functions and Tempered Distributions
The first problem is to choose a class of functions on which all operations needed for Fourier analysis are stable. We want to differentiate, multiply by polynomials, integrate by parts without boundary terms, and pass to dual objects such as distributions. Schwartz functions provide this common domain.
[definition: Schwartz Space]
Let $\mathcal{S}(\mathbb R^n)$ be the [vector space](/page/Vector%20Space) of all functions $u \in C^\infty(\mathbb R^n)$ such that, for every pair of multi-indices $\alpha,\beta \in \mathbb N_0^n$,
\begin{align*}
\|u\|_{\alpha,\beta} := \sup_{x \in \mathbb R^n} |x^\alpha D^\beta u(x)| < \infty.
\end{align*}
[/definition]
These seminorms record rapid decay of $u$ and of every derivative. The space is designed so that polynomial weights and derivatives can be traded freely, which is exactly what [integration by parts](/theorems/210) in Fourier analysis requires. To see that the definition is not empty and to motivate the examples used later in inversion arguments, we record the basic Gaussian model.
[example: Gaussian Is Schwartz]
Let $u(x)=e^{-|x|^2}$ on $\mathbb R^n$. For each multi-index $\beta$, we show that there is a polynomial $p_\beta(x)$ such that $D^\beta u(x)=p_\beta(x)e^{-|x|^2}$. For $\beta=0$, take $p_0=1$. If $D^\beta u=p_\beta e^{-|x|^2}$, then for each coordinate $x_j$,
\begin{align*}
\partial_{x_j}(p_\beta(x)e^{-|x|^2})=(\partial_{x_j}p_\beta(x))e^{-|x|^2}+p_\beta(x)(-2x_j)e^{-|x|^2}=(\partial_{x_j}p_\beta(x)-2x_jp_\beta(x))e^{-|x|^2}.
\end{align*}
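The coefficient $\partial_{x_j}p_\beta-2x_jp_\beta$ is again a polynomial, and passing from $\partial_{x_j}$ to $D_j=-i\partial_{x_j}$ only multiplies it by the constant $-i$, so induction over $|\beta|$ proves the claim.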
Fix multi-indices $\alpha,\beta$. Since $x^\alpha p_\beta(x)$ is a polynomial, write it as a finite sum $\sum_\gamma c_\gamma x^\gamma$. For $r=|x|$, each monomial satisfies $|x^\gamma|\le r^{|\gamma|}$ when $r\ge 1$, and $|x^\gamma|\le 1$ when $r\le 1$. The one-variable function $r^k e^{-r^2}$ is bounded on $[0,\infty)$: for $k=0$ this is immediate, and for $k>0$ its derivative is
\begin{align*}
\frac{d}{dr}(r^k e^{-r^2})=r^{k-1}e^{-r^2}(k-2r^2),
\end{align*}
so its only positive critical point is $r=\sqrt{k/2}$, and the values at $0$, $\sqrt{k/2}$, and as $r\to\infty$ are finite. Therefore each $|x^\gamma|e^{-|x|^2}$ is bounded, hence
\begin{align*}
\sup_{x\in\mathbb R^n}|x^\alpha D^\beta u(x)|=\sup_{x\in\mathbb R^n}|x^\alpha p_\beta(x)e^{-|x|^2}|<\infty.
\end{align*}
Thus every Schwartz seminorm of $u$ is finite, so $e^{-|x|^2}\in\mathcal S(\mathbb R^n)$. The same Leibniz-rule argument applies to $q(x)e^{-|x|^2}$ for any polynomial $q$, so polynomial multiples of Gaussians form a stable family of Schwartz test functions for the Fourier inversion arguments used later.
[/example]
The Gaussian shows that the Schwartz class is rich, but kernels and fundamental solutions force us beyond functions. This motivates the following [dual space](/page/Dual%20Space), which keeps the same polynomial-growth scale while allowing singular objects.
[definition: Tempered Distribution]
A tempered distribution on $\mathbb R^n$ is a continuous linear functional $T: \mathcal{S}(\mathbb R^n) \to \mathbb C$. The space of tempered distributions is denoted $\mathcal{S}'(\mathbb R^n)$.
[/definition]
Continuity here is with respect to the Schwartz seminorm topology. This condition excludes distributions with growth faster than polynomial and is exactly the growth scale compatible with Fourier transform methods. We now need to fix the transform convention that will act on both $\mathcal S(\mathbb R^n)$ and $\mathcal S'(\mathbb R^n)$.
[definition: Symmetric Fourier Transform]
The Fourier transform is the map $\mathcal F:\mathcal{S}(\mathbb R^n)\to\mathcal{S}(\mathbb R^n)$ defined, for $u \in \mathcal{S}(\mathbb R^n)$, by
\begin{align*}
\hat{u}(\xi) = \mathcal{F}u(\xi) := \frac{1}{(2\pi)^{n/2}}\int_{\mathbb R^n} u(x)e^{-ix\cdot \xi}\,d\mathcal L^n(x).
\end{align*}
The inverse Fourier transform is the map $\mathcal F^{-1}:\mathcal{S}(\mathbb R^n)\to\mathcal{S}(\mathbb R^n)$ defined, for $v\in\mathcal S(\mathbb R^n)$, by
\begin{align*}
\check{v}(x) = \mathcal{F}^{-1}v(x) := \frac{1}{(2\pi)^{n/2}}\int_{\mathbb R^n} v(\xi)e^{ix\cdot \xi}\,d\mathcal L^n(\xi).
\end{align*}
[/definition]
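As a check on the normalisation, here is a worked computation for the Gaussian $g(x)=e^{-x^2/2}$ in dimension $n=1$; the choice of the factor $1/2$ in the exponent is made for convenience and differs from the Gaussian $e^{-|x|^2}$ used above. Differentiating under the integral sign, using $-xg(x)=g'(x)$, and integrating by parts gives
\begin{align*}
\hat g'(\xi)=\frac{1}{(2\pi)^{1/2}}\int_{\mathbb R}(-ix)g(x)e^{-ix\xi}\,dx=\frac{i}{(2\pi)^{1/2}}\int_{\mathbb R}g'(x)e^{-ix\xi}\,dx=-\xi\,\hat g(\xi).
\end{align*}
Moreover
\begin{align*}
\hat g(0)=\frac{1}{(2\pi)^{1/2}}\int_{\mathbb R}e^{-x^2/2}\,dx=1.
\end{align*}
The unique solution of $\hat g'=-\xi\hat g$ with $\hat g(0)=1$ is $\hat g(\xi)=e^{-\xi^2/2}$, so this Gaussian is fixed by the symmetric transform. Taking products over coordinates gives $\mathcal F(e^{-|x|^2/2})=e^{-|\xi|^2/2}$ on $\mathbb R^n$, and the scaling $x\mapsto\sqrt2\,x$ then gives $\mathcal F(e^{-|x|^2})(\xi)=2^{-n/2}e^{-|\xi|^2/4}$.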
The symmetric convention is chosen so that the same constant appears in the forward and inverse transforms. This makes Plancherel take its cleanest form and will keep adjoint formulae later in the pseudodifferential calculus uncluttered. Before using the transform on distributions or $L^2$, we need the structural fact that it preserves the [Schwartz space](/page/Schwartz%20Space).
[quotetheorem:228]
[citeproof:228]
The automorphism theorem provides the missing test-function stability. The Schwartz hypotheses are doing real work: differentiation under the integral and [integration by parts](/theorems/2098) use rapid decay to eliminate boundary terms, and multiplication by powers of $x$ must still leave a function whose transform is controlled. Specific failures mark the boundary. The constant function $1$ is smooth but not rapidly decreasing, and its Fourier transform is a multiple of the Dirac mass rather than a Schwartz function. The function $u(x)=(1+|x|^2)^{-(n+1)/2}$ is smooth and integrable, but it decays only like $|x|^{-(n+1)}$, so the weighted seminorms $\sup_x|x^\alpha u(x)|$ are infinite once $|\alpha|>n+1$; correspondingly, its Fourier transform is a constant multiple of $e^{-|\xi|}$, which is not smooth at $\xi=0$. The theorem therefore says that $\mathcal S(\mathbb R^n)$ is the correct invariant test-function space; it does not say that the pointwise integral formula already makes sense for arbitrary distributions or $L^2$ functions. This motivates the distributional Fourier transform, where the transform is defined by moving it onto the Schwartz [test function](/page/Test%20Function).
[definition: Fourier Transform of a Tempered Distribution]
The Fourier transform on tempered distributions is the map $\mathcal F:\mathcal{S}'(\mathbb R^n)\to\mathcal{S}'(\mathbb R^n)$ defined, for $T \in \mathcal{S}'(\mathbb R^n)$, by
\begin{align*}
(\mathcal F T)(\phi) := T(\mathcal F\phi)
\end{align*}
for all $\phi \in \mathcal{S}(\mathbb R^n)$.
[/definition]
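Two short computations illustrate the definition; both use only the symmetric convention and Fourier inversion on $\mathcal S(\mathbb R^n)$. For the Dirac mass $\delta(\phi)=\phi(0)$,
\begin{align*}
(\mathcal F\delta)(\phi)=\delta(\mathcal F\phi)=\hat\phi(0)=\frac{1}{(2\pi)^{n/2}}\int_{\mathbb R^n}\phi(x)\,d\mathcal L^n(x),
\end{align*}
so $\hat\delta$ is the constant function $(2\pi)^{-n/2}$. For the constant function $1$, regarded as the distribution $\phi\mapsto\int\phi\,d\mathcal L^n$, the inversion formula evaluated at $x=0$ gives
\begin{align*}
(\mathcal F1)(\phi)=\int_{\mathbb R^n}\hat\phi(\xi)\,d\mathcal L^n(\xi)=(2\pi)^{n/2}\phi(0),
\end{align*}
so $\hat 1=(2\pi)^{n/2}\delta$. Neither transform could be computed from the pointwise integral formula, which is the reason for passing to the dual definition.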
With this definition, distributions are transformed through their action on test functions, so functions of polynomial growth, derivatives of such functions, and Dirac masses all lie in a single Fourier-invariant framework. The next step is to make the same transform compatible with Hilbert-space estimates.
[quotetheorem:529]
[citeproof:529/microlocal-analysis-i-pseudodifferential-operators]
Plancherel is the bridge from oscillatory formulae to Hilbert-space estimates. The completion step is necessary because an arbitrary $L^2$ function need not be integrable, so the pointwise integral defining $\hat u(\xi)$ may not converge for any given $\xi$. The density of $\mathcal S(\mathbb R^n)$ in $L^2(\mathbb R^n)$ is what makes the extension both available and unique: two continuous $L^2$ operators agreeing on this dense core must agree everywhere. The symmetric normalisation is also part of the statement, since it makes $\mathcal F$ unitary without extra powers of $2\pi$ in the norm identity or [inner product](/page/Inner%20Product) identity. Plancherel defines the transform as the unique $L^2$ limit of transforms of Schwartz approximations. Thus the theorem gives a Hilbert-space operator and norm identity, not a pointwise formula for every representative of an $L^2$ equivalence class. From now on, many operator estimates are proved by moving to frequency space, applying a pointwise multiplier bound, and returning by Plancherel.
## Constant-Coefficient Operators as Polynomial Multipliers
The next question is what a differential operator looks like after Fourier transform. For constant coefficients, no $x$-dependence remains, so differentiation turns into multiplication by powers of $\xi$.
[definition: Constant-Coefficient Differential Operator]
A constant-coefficient differential operator of order at most $m$ on $\mathbb R^n$ is an operator $P(D):\mathcal S(\mathbb R^n)\to\mathcal S(\mathbb R^n)$ of the form
\begin{align*}
P(D)u := \sum_{|\alpha|\le m} a_\alpha D^\alpha u,
\end{align*}
where $a_\alpha\in\mathbb C$, $u\in\mathcal S(\mathbb R^n)$, $D_j=-i\partial_{x_j}$, and $D^\alpha=D_1^{\alpha_1}\cdots D_n^{\alpha_n}$.
[/definition]
The notation $P(D)$ emphasises that the operator is obtained from a polynomial by replacing frequency variables with the normalised differential operators $D_j=-i\partial_{x_j}$. With the Fourier transform convention used here, this normalisation removes the extra powers of $i$ from the multiplier formula.
The next definition fixes the exact polynomial that will be called the symbol of such an operator. This is needed before the multiplier theorem, because the theorem should compare an operator with a named function of $\xi$, not with an informal phrase such as “replace derivatives by frequencies.” The same bookkeeping will later become the principal-symbol convention for variable-coefficient and pseudodifferential operators, so the constant-coefficient case is where the sign and normalisation choices must be made unambiguous.
[definition: Full Symbol of a Constant-Coefficient Operator]
For $P(D)=\sum_{|\alpha|\le m}a_\alpha D^\alpha$, its full symbol is the polynomial map $P:\mathbb R^n\to\mathbb C$ given by
\begin{align*}
P:\mathbb R^n&\longrightarrow\mathbb C, &
\xi&\longmapsto P(\xi):=\sum_{|\alpha|\le m}a_\alpha \xi^\alpha.
\end{align*}
[/definition]
With this convention, the full symbol is the exact frequency-side multiplier. The principal part, introduced later for variable coefficients, is the homogeneous degree $m$ part of this polynomial when such terms are present. We now verify that the definition has the intended operator meaning.
[quotetheorem:7716]
[citeproof:7716]
This result is the prototype of quantisation: an operator is encoded by a function of frequency. The hypotheses are restrictive. Constant coefficients are needed because multiplication by a variable coefficient does not commute with the Fourier transform; for instance, if $A=x_j\partial_{x_j}$, then Fourier transform turns the factor $x_j$ into differentiation in $\xi_j$, so the Fourier-side operator is not multiplication by a single function of $\xi$. The Schwartz assumption prevents boundary terms in the integration-by-parts step; on a rough or slowly decaying function, the displayed calculation may only make sense after distributional interpretation. The theorem therefore does not cover variable-coefficient operators, boundary value problems, or operators whose Fourier-side action differentiates in frequency. The special feature here is that the symbol is a polynomial, so growth at infinity corresponds exactly to loss of derivatives. The Laplacian is the central example because it is the model elliptic operator.
[example: The Laplacian as a Multiplier]
Let $u\in\mathcal S(\mathbb R^n)$ and $\Delta=\sum_{j=1}^n\partial_{x_j}^2$. For each coordinate $x_j$, integration by parts in the $x_j$ variable gives
\begin{align*}
\widehat{\partial_{x_j}u}(\xi)=\frac{1}{(2\pi)^{n/2}}\int_{\mathbb R^n}\partial_{x_j}u(x)e^{-ix\cdot\xi}\,d\mathcal L^n(x)=i\xi_j\hat u(\xi),
\end{align*}
because $u$ is Schwartz, so the boundary term at infinity vanishes. Applying the same identity to $\partial_{x_j}u$ gives
\begin{align*}
\widehat{\partial_{x_j}^2u}(\xi)=i\xi_j\widehat{\partial_{x_j}u}(\xi)=i\xi_j(i\xi_j\hat u(\xi))=-\xi_j^2\hat u(\xi).
\end{align*}
Therefore, by linearity of the Fourier transform,
\begin{align*}
\widehat{\Delta u}(\xi)=\sum_{j=1}^n\widehat{\partial_{x_j}^2u}(\xi)=\sum_{j=1}^n(-\xi_j^2\hat u(\xi))=-\left(\sum_{j=1}^n\xi_j^2\right)\hat u(\xi)=-|\xi|^2\hat u(\xi).
\end{align*}
Thus $-\Delta$ has non-negative multiplier $|\xi|^2$. Since $\widehat{Iu}(\xi)=\hat u(\xi)$, the operator $I-\Delta$ has multiplier
\begin{align*}
1-(-|\xi|^2)=1+|\xi|^2.
\end{align*}
This is the sign convention behind the frequent use of $I-\Delta$ in elliptic estimates.
[/example]
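The sign convention can be sanity-checked numerically. The following is a minimal sketch, assuming one space dimension: it applies a central second difference to the plane wave $e^{ix\xi}$, which is an eigenfunction of $\partial_x^2$ with eigenvalue $-\xi^2$, and compares the quotient with the multiplier computed above. The helper names `second_difference` and `laplacian_multiplier_estimate`, the base point `x`, and the step size `h` are illustrative choices, not part of the course's notation.

```python
import cmath

def second_difference(f, x, h=1e-3):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def laplacian_multiplier_estimate(xi, x=0.7, h=1e-3):
    """Estimate the multiplier of d^2/dx^2 at frequency xi.

    Applies the difference quotient to the plane wave e^{i x xi}
    and divides by the plane wave itself.
    """
    wave = lambda y: cmath.exp(1j * y * xi)
    return second_difference(wave, x, h) / wave(x)

for xi in (0.5, 1.0, 3.0):
    est = laplacian_multiplier_estimate(xi)
    # est should be close to -xi**2, so 1 - est approximates 1 + xi**2,
    # the multiplier of I - Delta in one dimension.
    print(xi, est.real, 1 - est.real)
```

The finite-difference error is of size $\xi^4h^2/12$, so the agreement degrades at high frequency unless `h` is reduced.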
The multiplier viewpoint also gives an immediate way to define operators that are not differential operators. This motivates a wider class of frequency multipliers, where controlled growth replaces polynomial structure.
[definition: Polynomial Growth Multiplier]
A measurable function $a:\mathbb R^n\to\mathbb C$ is a multiplier of polynomial growth of order $m\in\mathbb R$ if there exists $C>0$ such that
\begin{align*}
|a(\xi)|\le C\langle\xi\rangle^m
\end{align*}
for all $\xi\in\mathbb R^n$, where $\langle\xi\rangle=(1+|\xi|^2)^{1/2}$.
[/definition]
The multiplier $a(\xi)$ defines an operator by multiplying the Fourier transform and applying the inverse transform. The natural Sobolev scale is built so that this definition has a direct mapping theorem.
## Bessel Potentials and Sobolev Spaces
Differential operators measure regularity in integer numbers of derivatives, but Fourier multipliers allow fractional orders. The weight $\langle\xi\rangle^s$ measures high-frequency growth like $|\xi|^s$ while staying nonzero at $\xi=0$.
[definition: Bessel Potential Operator]
For $s\in\mathbb R$, the Bessel potential operator is the map $\langle D\rangle^s:\mathcal S(\mathbb R^n)\to\mathcal S(\mathbb R^n)$ defined by
\begin{align*}
\langle D\rangle^s u=\mathcal F^{-1}(\langle\xi\rangle^s\hat u),
\qquad
\widehat{\langle D\rangle^s u}(\xi)=\langle\xi\rangle^s\hat u(\xi),
\qquad \langle\xi\rangle=(1+|\xi|^2)^{1/2}.
\end{align*}
[/definition]
This notation packages the operator $I-\Delta$ into a functional calculus. Since $I-\Delta$ has multiplier $1+|\xi|^2$, its fractional powers are exactly Bessel potentials.
[example: Fractional Powers of I Minus Delta]
For $u\in\mathcal S(\mathbb R^n)$, the operator $I-\Delta$ has Fourier multiplier $1+|\xi|^2$: indeed, $\widehat{Iu}(\xi)=\hat u(\xi)$ and $\widehat{-\Delta u}(\xi)=|\xi|^2\hat u(\xi)$, so by linearity,
\begin{align*}
\widehat{(I-\Delta)u}(\xi)=\hat u(\xi)+|\xi|^2\hat u(\xi)=(1+|\xi|^2)\hat u(\xi).
\end{align*}
Defining the fractional power by the corresponding Fourier multiplier gives
\begin{align*}
\widehat{(I-\Delta)^{s/2}u}(\xi)=(1+|\xi|^2)^{s/2}\hat u(\xi).
\end{align*}
Since $\langle\xi\rangle=(1+|\xi|^2)^{1/2}$, raising both sides to the power $s$ gives $\langle\xi\rangle^s=(1+|\xi|^2)^{s/2}$, and hence
\begin{align*}
\widehat{(I-\Delta)^{s/2}u}(\xi)=\langle\xi\rangle^s\hat u(\xi).
\end{align*}
This is exactly the defining Fourier-side formula for $\langle D\rangle^s u$, so $(I-\Delta)^{s/2}=\langle D\rangle^s$ on Schwartz functions. When $s=2$, the multiplier is $\langle\xi\rangle^2=1+|\xi|^2$, recovering $I-\Delta$; when $s<0$, the multiplier $\langle\xi\rangle^s$ behaves like $|\xi|^s=|\xi|^{-|s|}$ for large $|\xi|$, so high frequencies are damped by order $|s|$.
[/example]
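Because every Bessel potential is diagonal on the Fourier side, identities between the operators $\langle D\rangle^s$ reduce to identities between the scalar weights $\langle\xi\rangle^s$. The sketch below checks two of them at a single sample frequency in $\mathbb R^3$; the function names `bracket` and `bessel_multiplier` and the sample values are assumptions made for illustration.

```python
import math

def bracket(xi):
    """Japanese bracket <xi> of a frequency vector xi given as a tuple."""
    return math.sqrt(1.0 + sum(c * c for c in xi))

def bessel_multiplier(s):
    """Return the Fourier multiplier xi -> <xi>^s of the operator <D>^s."""
    return lambda xi: bracket(xi) ** s

xi = (0.3, -1.2, 2.0)                  # arbitrary sample frequency in R^3
norm_sq = sum(c * c for c in xi)

# s = 2 recovers the multiplier 1 + |xi|^2 of I - Delta.
print(bessel_multiplier(2.0)(xi), 1.0 + norm_sq)

# Composition on the Fourier side is multiplication of multipliers,
# so <D>^s <D>^t has multiplier <xi>^(s + t).
s, t = 1.5, -0.75
print(bessel_multiplier(s)(xi) * bessel_multiplier(t)(xi),
      bessel_multiplier(s + t)(xi))
```

Taking $t=-s$ in the second check gives the multiplier $1$, which is the Fourier-side statement that $\langle D\rangle^{-s}$ inverts $\langle D\rangle^{s}$ on Schwartz functions.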
The example shows why $\langle\xi\rangle^s$ is the right regularity weight: it treats low frequencies gently and high frequencies like differentiation of order $s$. This motivates the Sobolev scale used throughout the course, where regularity is encoded by weighted square integrability of the Fourier transform.
[definition: Sobolev Space on Euclidean Space]
For $s\in\mathbb R$, define
\begin{align*}
H^s(\mathbb R^n):=\{u\in\mathcal S'(\mathbb R^n):\langle\xi\rangle^s\hat u(\xi)\in L^2(\mathbb R^n)\}.
\end{align*}
Its norm is
\begin{align*}
\|u\|_{H^s}:=\|\langle D\rangle^s u\|_{L^2}=\|\langle\xi\rangle^s\hat u\|_{L^2}.
\end{align*}
[/definition]
For integer $s\ge 0$, this is equivalent to the classical Sobolev norm involving weak derivatives up to order $s$. The Fourier definition is better suited to microlocal analysis because it makes order bookkeeping algebraic. This motivates the central mapping estimate for multipliers of prescribed order.
[quotetheorem:7717]
[citeproof:7717]
The theorem says that a multiplier of order $m$ loses $m$ derivatives. Each hypothesis has a role. Measurability is needed so that $a\hat u$ is a measurable function and its weighted $L^2$ norm is meaningful; a non-measurable multiplier cannot define the displayed integral estimate in the usual Lebesgue framework. The target space $H^{s-m}$ is chosen so that the factor $\langle\xi\rangle^{-m}a(\xi)$ becomes bounded after the source weight $\langle\xi\rangle^s$ is extracted. This is also where uniqueness enters: once the weighted Fourier representative $a\hat u$ is in the target weighted $L^2$ space, Plancherel identifies a unique Sobolev distribution with that representative. The growth hypothesis is not cosmetic: if $a(\xi)$ grows faster than every polynomial, multiplication by $a$ can send a weighted $L^2$ Fourier transform outside every Sobolev space. For example, $a(\xi)=e^{|\xi|^2}$ overwhelms the polynomial weights defining $H^s$, so no estimate of the displayed form can hold on the Sobolev scale. The theorem is also specifically an $L^2$ weighted estimate; rough order-zero multipliers may require additional hypotheses to act boundedly on other function spaces. Negative-order multipliers gain regularity, and order-zero multipliers act boundedly on every $H^s$. The inverse of $1+|\xi|^2$ is the basic example where this gain can be read exactly.
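The weight bookkeeping can be illustrated on a finite frequency grid. The sketch below assumes the estimate takes the form $\|a(D)u\|_{H^{s-m}}\le C\|u\|_{H^s}$ with $C$ the constant from the growth bound $|a(\xi)|\le C\langle\xi\rangle^m$, and it replaces the $L^2$ integrals by Riemann sums in one dimension. The multiplier $a(\xi)=\xi^2+3\xi$, the Gaussian $\hat u$, and the function names are hypothetical test data, not taken from the theorem.

```python
import math

def bracket(xi):
    """Japanese bracket <xi> = (1 + xi^2)^(1/2) in one dimension."""
    return math.sqrt(1.0 + xi * xi)

def weighted_l2_norm(values, grid, weight_exponent, step):
    """Riemann-sum approximation of || <xi>^t * values ||_{L^2}."""
    total = sum((bracket(xi) ** weight_exponent * abs(v)) ** 2
                for xi, v in zip(grid, values))
    return math.sqrt(total * step)

# Hypothetical data: an order-2 multiplier and a Gaussian Fourier transform.
m, C = 2.0, 4.0                        # |xi^2 + 3 xi| <= 4 <xi>^2
a = lambda xi: xi * xi + 3.0 * xi
step = 0.01
grid = [-20.0 + step * k for k in range(4001)]
u_hat = [math.exp(-xi * xi) for xi in grid]
au_hat = [a(xi) * v for xi, v in zip(grid, u_hat)]

for s in (-1.0, 0.0, 2.5):
    lhs = weighted_l2_norm(au_hat, grid, s - m, step)   # ~ ||a(D)u||_{H^{s-m}}
    rhs = weighted_l2_norm(u_hat, grid, s, step)        # ~ ||u||_{H^s}
    print(s, lhs, C * rhs, lhs <= C * rhs)
```

The inequality holds on the grid for the same reason as in the continuous setting: it is already true pointwise in $\xi$ before summation.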
[example: Parametrix of One Plus Xi Squared]
Let $a(\xi)=1+|\xi|^2=\langle\xi\rangle^2$ and $b(\xi)=(1+|\xi|^2)^{-1}=\langle\xi\rangle^{-2}$. The multiplier $a$ is the Fourier multiplier of $I-\Delta$, because
\begin{align*}
\widehat{(I-\Delta)u}(\xi)=\widehat{Iu}(\xi)-\widehat{\Delta u}(\xi)=\hat u(\xi)-(-|\xi|^2\hat u(\xi))=(1+|\xi|^2)\hat u(\xi).
\end{align*}
Define $B$ by $\widehat{Bu}(\xi)=b(\xi)\hat u(\xi)$. Then, for $u\in\mathcal S(\mathbb R^n)$,
\begin{align*}
\widehat{B(I-\Delta)u}(\xi)=b(\xi)a(\xi)\hat u(\xi)=(1+|\xi|^2)^{-1}(1+|\xi|^2)\hat u(\xi)=\hat u(\xi).
\end{align*}
Since the Fourier transform is injective on $\mathcal S(\mathbb R^n)$, this gives $B(I-\Delta)u=u$. The same multiplication in the opposite order gives
\begin{align*}
\widehat{(I-\Delta)Bu}(\xi)=a(\xi)b(\xi)\hat u(\xi)=(1+|\xi|^2)(1+|\xi|^2)^{-1}\hat u(\xi)=\hat u(\xi),
\end{align*}
so $B$ is an exact inverse on the Schwartz core.
For $u\in H^s(\mathbb R^n)$, the Sobolev norm of $Bu$ in order $s+2$ is
\begin{align*}
\|Bu\|_{H^{s+2}}=\|\langle\xi\rangle^{s+2}b(\xi)\hat u(\xi)\|_{L^2}=\|\langle\xi\rangle^{s+2}\langle\xi\rangle^{-2}\hat u(\xi)\|_{L^2}=\|\langle\xi\rangle^s\hat u(\xi)\|_{L^2}=\|u\|_{H^s}.
\end{align*}
Thus the inverse multiplier extends continuously as $B:H^s(\mathbb R^n)\to H^{s+2}(\mathbb R^n)$ with norm $1$. This is the model parametrix mechanism: dividing by the elliptic symbol $\langle\xi\rangle^2$ gains exactly two Sobolev derivatives.
[/example]
The parametrix example shows the mechanism behind elliptic regularity. This motivates isolating the high-frequency lower bound that makes a multiplier invertible up to lower-order errors.
[definition: Elliptic Polynomial Multiplier]
A polynomial multiplier of degree $m\in\mathbb N_0$ is a polynomial map $P:\mathbb R^n\to\mathbb C$ of degree $m$. It is elliptic if there exist $R>0$ and $c>0$ such that
\begin{align*}
|P(\xi)|\ge c|\xi|^m
\end{align*}
for all $|\xi|\ge R$.
[/definition]
This definition controls only the high-frequency region. Low frequencies can be handled by adding a compactly supported correction, which is the first appearance of the parametrix idea. We now use the multiplier mapping theorem to convert ellipticity into regularity gain.
[quotetheorem:7663]
[citeproof:7663]
This theorem is the first version of elliptic regularity in the course. Ellipticity is the decisive hypothesis: for the non-elliptic operator $\partial_{x_1}$ on $\mathbb R^n$ with $n\ge2$, the multiplier $i\xi_1$ vanishes on the hyperplane $\xi_1=0$, so the equation $\partial_{x_1}u=f$ gives no control of oscillations in the transverse variables. Constant coefficients are also essential for this proof, because the Fourier transform no longer diagonalises $a(x)D^\alpha$ when $a$ depends on $x$; the missing commutation is exactly what later symbolic calculus repairs. The cutoff assumptions keep the commutator term inside the region where $\chi_1u$ is controlled. If $\chi_1$ did not equal $1$ near $\operatorname{supp}\chi$, differentiating $\chi$ could create terms outside the assumed Sobolev control. The theorem does not say that non-elliptic operators are regularising at characteristic frequencies, nor does it give [boundary regularity](/theorems/99) on domains with boundary. Later chapters replace $P(\xi)$ by symbols $a(x,\xi)$ and replace exact Fourier diagonalisation by asymptotic symbolic composition, but the strategy is the same: invert the leading high-frequency part and treat the rest as lower order.
[illustration:elliptic-parametrix-frequency-cutoff]
The Laplacian illustrates this strategy in its most familiar form.
[example: Elliptic Regularity for the Laplacian]
For $P(D)=-\Delta$, the Fourier multiplier is $|\xi|^2$: by *Constant-Coefficient Operators Are Fourier Multipliers*,
\begin{align*}
\widehat{-\Delta u}(\xi)=-\widehat{\Delta u}(\xi)=-(-|\xi|^2\hat u(\xi))=|\xi|^2\hat u(\xi).
\end{align*}
This multiplier is elliptic of order $2$, with constants $c=1$ and $R=1$, since for every $|\xi|\ge 1$,
\begin{align*}
\bigl||\xi|^2\bigr|=|\xi|^2.
\end{align*}
Assume $-\Delta u=f$ in $\mathcal S'(\mathbb R^n)$ and $f\in H^s(\mathbb R^n)$. Choose a smooth cutoff $\psi(\xi)$ with $0\le\psi\le1$, $\psi(\xi)=0$ for $|\xi|\le 1$, and $\psi(\xi)=1$ for $|\xi|\ge 2$, and define
\begin{align*}
q(\xi)=\psi(\xi)|\xi|^{-2}.
\end{align*}
On the support of $\psi$, one has $|\xi|\ge 1$ and hence $\langle\xi\rangle^2=1+|\xi|^2\le 2|\xi|^2$, so
\begin{align*}
|q(\xi)|=\psi(\xi)|\xi|^{-2}\le |\xi|^{-2}\le 2\langle\xi\rangle^{-2}.
\end{align*}
Thus $q$ is a multiplier of order $-2$, and by the *[Fourier Multiplier Mapping Theorem](/theorems/7717)* the corresponding operator $Q$ maps $H^s(\mathbb R^n)$ boundedly into $H^{s+2}(\mathbb R^n)$.
Taking Fourier transforms in $-\Delta u=f$ gives
\begin{align*}
|\xi|^2\hat u(\xi)=\hat f(\xi).
\end{align*}
Multiplying by $q(\xi)$ gives
\begin{align*}
q(\xi)\hat f(\xi)=\psi(\xi)|\xi|^{-2}|\xi|^2\hat u(\xi)=\psi(\xi)\hat u(\xi).
\end{align*}
Hence the high-frequency part $\mathcal F^{-1}(\psi\hat u)$ equals $Qf$, so it lies in $H^{s+2}(\mathbb R^n)$.
The remaining multiplier $1-\psi$ is supported in $|\xi|\le 2$. After localising $u$ in $x$ by a cutoff $\chi\in C_c^\infty(\mathbb R^n)$, the compactly supported distribution $\chi u$ has some finite Sobolev order, say $\chi u\in H^{-N}$; for any $t\in\mathbb R$,
\begin{align*}
\|\mathcal F^{-1}((1-\psi)\widehat{\chi u})\|_{H^t}=\|\langle\xi\rangle^t(1-\psi(\xi))\widehat{\chi u}(\xi)\|_{L^2}\le C_{t,N}\|\chi u\|_{H^{-N}},
\end{align*}
because $\langle\xi\rangle^{t+N}(1-\psi(\xi))$ is bounded on the compact set $|\xi|\le 2$. Thus the low-frequency part is locally in every Sobolev space, while the high-frequency part is in $H^{s+2}$. Therefore $u\in H^{s+2}_{\mathrm{loc}}(\mathbb R^n)$, which is the Laplacian case of elliptic regularity.
[/example]
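The order $-2$ bound on the parametrix multiplier can be checked numerically for one concrete cutoff. The sketch below builds a radial $\psi$ from the standard $e^{-1/t}$ transition function, which takes values in $[0,1]$, and evaluates $\sup|q(\xi)|\langle\xi\rangle^2$ on a grid of radii. This particular cutoff and the names `smooth_step`, `psi`, `q` are assumptions; the example in the text allows any smooth cutoff with the stated support properties.

```python
import math

def smooth_step(t):
    """C-infinity function equal to 0 for t <= 0 and 1 for t >= 1."""
    def tail(y):
        return math.exp(-1.0 / y) if y > 0 else 0.0
    return tail(t) / (tail(t) + tail(1.0 - t))

def psi(r):
    """Radial cutoff: 0 for r <= 1, 1 for r >= 2, values in [0, 1]."""
    return smooth_step(r - 1.0)

def q(r):
    """Parametrix multiplier psi(|xi|) |xi|^{-2} as a function of r = |xi|."""
    return psi(r) / (r * r) if r > 1.0 else 0.0

def bracket(r):
    """Japanese bracket as a function of the radius r = |xi|."""
    return math.sqrt(1.0 + r * r)

radii = [0.01 * k for k in range(0, 1001)]        # r in [0, 10]
worst = max(q(r) * bracket(r) ** 2 for r in radii)
print("grid sup of |q| <xi>^2:", worst)           # the text's bound is 2
```

For $r\ge2$ the product $q(r)\,r^2$ equals $\psi(r)=1$, which is the identity $q(\xi)|\xi|^2=\psi(\xi)$ used to isolate the high-frequency part of $u$.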
The model calculus developed in this chapter can now be summarised in one line: Fourier transform converts constant-coefficient operators into multiplication by symbols, and Sobolev mapping properties become weighted $L^2$ inequalities. The rest of the course asks how much of this line remains true when the multiplier depends on $x$ as well as $\xi$.
Chapter 1 showed that when multipliers depend only on the frequency variable $\xi$, the Fourier transform gives complete control through $L^2$ estimates. Once the multiplier gains $x$-dependence, global diagonalisation fails, but symbol classes provide systematic bookkeeping for position-dependent operators. This chapter introduces the functional-analytic framework that lets the Fourier calculus survive the transition to pseudodifferential operators.
# 2. Symbol Classes
Symbol classes are the bookkeeping device that lets the Fourier multiplier calculus from Chapter 1 survive when the multiplier depends on position. There, polynomial and elliptic Fourier multipliers were controlled by estimates in the frequency variable alone. This chapter introduces functions $a(x,\xi)$ whose derivatives in $x$ and $\xi$ have prescribed growth, then organises them by order, by smoothing remainders, and finally by asymptotic expansions at high frequency.
## Measuring Symbols by Derivatives
The first question is how much regularity and growth control a variable-coefficient multiplier should have. If $a(x,\xi)$ is to be inserted into oscillatory integrals, derivatives in $\xi$ must improve decay, while derivatives in $x$ are allowed to cost some growth. The Hörmander classes encode this balance through two parameters $\rho$ and $\delta$.
[definition: Japanese Bracket]
The Japanese bracket is the function $\langle\cdot\rangle:\mathbb R^n\to[1,\infty)$ defined by
\begin{align*}
\langle \xi \rangle := (1 + |\xi|^2)^{1/2}.
\end{align*}
[/definition]
The Japanese bracket is comparable to $|\xi|$ for large $|\xi|$ but stays positive near the origin, so estimates written with $\langle \xi \rangle$ do not need a separate low-frequency clause. It is the standard weight for symbol bounds.
[definition: Hormander Symbol Class]
Let $U \subseteq \mathbb R^n$ be open, let $m \in \mathbb R$, and let $0 \le \delta < \rho \le 1$. The class $S^m_{\rho,\delta}(U \times \mathbb R^n)$ consists of all $a \in C^\infty(U \times \mathbb R^n)$ such that, for every compact $K \subset U$ and every pair of multi-indices $\alpha,\beta \in \mathbb N_0^n$, there exists $C_{K,\alpha,\beta} > 0$ with
\begin{align*}
|\partial_x^\alpha \partial_\xi^\beta a(x,\xi)| \le C_{K,\alpha,\beta}\langle \xi \rangle^{m - \rho|\beta| + \delta|\alpha|}
\end{align*}
for all $x \in K$ and $\xi \in \mathbb R^n$.
[/definition]
The condition is local in $x$ because pseudodifferential operators on $U$ are local objects. The inequality says that differentiating in $\xi$ lowers order by $\rho$, while differentiating in $x$ raises order by at most $\delta$.
[definition: Standard Symbol Class]
The standard symbol class of order $m$ on $U$ is
\begin{align*}
S^m(U \times \mathbb R^n) := S^m_{1,0}(U \times \mathbb R^n).
\end{align*}
[/definition]
In this course the standard class is the default. For $S^m$, every $\xi$-derivative improves decay by one power and $x$-derivatives preserve the same order, matching the behaviour of differential operators with smooth coefficients.
[example: Quadratic Differential Symbol]
Let $g_{ij}\in C^\infty(U)$ for $1\le i,j\le n$ and $V\in C^\infty(U)$, and set
\begin{align*}
a(x,\xi)=\sum_{i,j=1}^n g_{ij}(x)\xi_i\xi_j+V(x).
\end{align*}
We show that $a\in S^2(U\times\mathbb R^n)$. For multi-indices $\alpha,\beta$, differentiating first in $x$ and then in $\xi$ gives
\begin{align*}
\partial_x^\alpha\partial_\xi^\beta a(x,\xi)=\sum_{i,j=1}^n (\partial_x^\alpha g_{ij})(x)\partial_\xi^\beta(\xi_i\xi_j)+\partial_x^\alpha V(x)\partial_\xi^\beta(1).
\end{align*}
Here $\partial_\xi^\beta(1)=1$ if $\beta=0$ and $\partial_\xi^\beta(1)=0$ if $|\beta|>0$.
Fix a compact set $K\subset U$. Since each $\partial_x^\alpha g_{ij}$ and $\partial_x^\alpha V$ is continuous, there is a constant $M_{\alpha,K}$ such that
\begin{align*}
|\partial_x^\alpha g_{ij}(x)|\le M_{\alpha,K}\quad\text{and}\quad |\partial_x^\alpha V(x)|\le M_{\alpha,K}
\end{align*}
for all $x\in K$ and all $i,j$. If $|\beta|>2$, then $\partial_\xi^\beta(\xi_i\xi_j)=0$ because $\xi_i\xi_j$ has degree $2$. If $|\beta|\le 2$, then $\partial_\xi^\beta(\xi_i\xi_j)$ is a polynomial of degree at most $2-|\beta|$, so for some constant $C_\beta$,
\begin{align*}
|\partial_\xi^\beta(\xi_i\xi_j)|\le C_\beta\langle\xi\rangle^{2-|\beta|}.
\end{align*}
Combining these estimates gives
\begin{align*}
|\partial_x^\alpha\partial_\xi^\beta a(x,\xi)|\le C_{K,\alpha,\beta}\langle\xi\rangle^{2-|\beta|}
\end{align*}
for all $x\in K$ and $\xi\in\mathbb R^n$, where the $V$ term is harmless when $\beta=0$ because $\langle\xi\rangle^0\le\langle\xi\rangle^2$. This is exactly the defining estimate for $S^2(U\times\mathbb R^n)$, so quadratic differential symbols have order $2$.
[/example]
This example is the local model for second-order differential operators. More generally, a differential operator of order $k$ with smooth coefficients has a polynomial symbol of order $k$.
[example: Weight Symbol]
For any $m\in\mathbb R$, set $a(x,\xi)=\langle\xi\rangle^m=(1+|\xi|^2)^{m/2}$, independent of $x$. We show that $a\in S^m(\mathbb R^n\times\mathbb R^n)$ by proving that every $\xi$-derivative lowers the order by the number of differentiations.
For a multi-index $\beta$, the derivative $\partial_\xi^\beta \langle\xi\rangle^m$ is a finite sum of terms of the form
\begin{align*}
c_{\beta,\gamma,k}\xi^\gamma(1+|\xi|^2)^{m/2-k}
\end{align*}
with $|\gamma|-2k\le -|\beta|$. This follows by induction on $|\beta|$. For $\beta=0$ the only term has $\gamma=0$ and $k=0$. If a term has the displayed form, then differentiating in $\xi_\ell$ gives
\begin{align*}
\partial_{\xi_\ell}\bigl(\xi^\gamma(1+|\xi|^2)^{m/2-k}\bigr)=\gamma_\ell\xi^{\gamma-e_\ell}(1+|\xi|^2)^{m/2-k}+(m-2k)\xi^{\gamma+e_\ell}(1+|\xi|^2)^{m/2-k-1}
\end{align*}
where the first term is absent when $\gamma_\ell=0$. For the first new term,
\begin{align*}
|\gamma-e_\ell|-2k=|\gamma|-1-2k\le -|\beta|-1
\end{align*}
and for the second,
\begin{align*}
|\gamma+e_\ell|-2(k+1)=|\gamma|-2k-1\le -|\beta|-1.
\end{align*}
Thus the same form holds after one more $\xi$-derivative.
Since $|\xi^\gamma|\le \langle\xi\rangle^{|\gamma|}$ and $(1+|\xi|^2)^{m/2-k}=\langle\xi\rangle^{m-2k}$, each summand satisfies
\begin{align*}
|\xi^\gamma(1+|\xi|^2)^{m/2-k}|\le \langle\xi\rangle^{m+|\gamma|-2k}\le \langle\xi\rangle^{m-|\beta|}.
\end{align*}
Because there are only finitely many summands, for each $\beta$ there is a constant $C_\beta$ such that
\begin{align*}
|\partial_\xi^\beta a(x,\xi)|\le C_\beta\langle\xi\rangle^{m-|\beta|}
\end{align*}
for all $x,\xi\in\mathbb R^n$. Finally, $\partial_x^\alpha a=0$ whenever $|\alpha|>0$, while the case $\alpha=0$ is exactly the estimate above. Hence $a\in S^m(\mathbb R^n\times\mathbb R^n)$, so the Japanese bracket powers are model symbols of their stated order.
[/example]
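In one dimension the first two derivatives of $\langle\xi\rangle^m$ have closed forms, so the order-lowering estimate can be tested directly. The sketch below evaluates the weighted suprema $\sup_\xi\langle\xi\rangle^{-(m-k)}|\partial_\xi^k\langle\xi\rangle^m|$ for $k=1,2$ on a finite grid; the grid, the sample orders, and the function names `d1`, `d2`, `weighted_sup` are illustrative assumptions, and a finite grid can only suggest, not prove, a global bound.

```python
import math

def bracket(xi):
    """Japanese bracket <xi> = (1 + xi^2)^(1/2) in one dimension."""
    return math.sqrt(1.0 + xi * xi)

def d1(xi, m):
    """First xi-derivative of <xi>^m in one dimension."""
    return m * xi * bracket(xi) ** (m - 2)

def d2(xi, m):
    """Second xi-derivative of <xi>^m in one dimension."""
    return (m * bracket(xi) ** (m - 2)
            + m * (m - 2) * xi * xi * bracket(xi) ** (m - 4))

def weighted_sup(deriv, order, m, grid):
    """Grid approximation of sup_xi <xi>^{-(m - order)} |deriv(xi)|."""
    return max(abs(deriv(xi, m)) * bracket(xi) ** (order - m) for xi in grid)

grid = [0.05 * k for k in range(-4000, 4001)]      # xi in [-200, 200]
for m in (-1.5, 0.5, 3.0):
    c1 = weighted_sup(d1, 1, m, grid)
    c2 = weighted_sup(d2, 2, m, grid)
    print(m, c1, c2)
```

The two formulas are instances of the terms $c_{\beta,\gamma,k}\xi^\gamma(1+|\xi|^2)^{m/2-k}$ from the induction: $(\gamma,k)=(1,1)$ for the first derivative, and $(0,1)$ and $(2,2)$ for the second. They give the explicit constants $|m|$ and $|m|+|m(m-2)|$.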
The two examples show that the definition captures both differential operators and Sobolev weights. To make approximation arguments inside these classes, we also need a topology that remembers every constant in every differentiated estimate.
[definition: Symbol Seminorm]
For compact $K \subset U$ and $N \in \mathbb N_0$, the symbol seminorm is the map
\begin{align*}
p_{K,N}^{(m,\rho,\delta)}:S^m_{\rho,\delta}(U \times \mathbb R^n)\to[0,\infty)
\end{align*}
defined by
\begin{align*}
p_{K,N}^{(m,\rho,\delta)}(a)
:= \max_{|\alpha|+|\beta|\le N}\sup_{x\in K,\xi\in\mathbb R^n}
\langle \xi \rangle^{-m+\rho|\beta|-\delta|\alpha|}
|\partial_x^\alpha\partial_\xi^\beta a(x,\xi)|.
\end{align*}
[/definition]
Choosing an exhaustion of $U$ by compact subsets gives a countable family of seminorms. The next issue is whether Cauchy constructions, such as asymptotic sums and regularised limits, stay inside the symbol class rather than merely producing formal derivative bounds.
[quotetheorem:7718]
[citeproof:7718]
The completeness statement itself works in the wider borderline range $0\le\delta\le\rho\le1$. The stricter condition $\delta<\rho$ is the analytic setting used later for composition and asymptotic expansion, where differentiating in frequency must improve estimates faster than differentiating in position can damage them. Completeness is needed because asymptotic summation will build symbols as limits of cut-off series rather than as finite expressions. A typical failure without completeness would be a sequence whose every differentiated weighted estimate is Cauchy but whose limit is not known to remain smooth with the same bounds. This Fréchet structure is also what lets the later calculus pass from formal expansions to honest operators acting on Sobolev spaces.
## Order Filtration and Smoothing Symbols
Once symbols have an order, the next question is how orders interact. Lower order terms are not discarded; they form a filtration that records how much decay has been gained. The intersection of all orders consists of symbols so rapidly decreasing in frequency that their operators have smoothing kernels.
[definition: Order Filtration]
For fixed $\rho,\delta$, the order filtration is the nested family
\begin{align*}
\cdots \subset S^{m-1}_{\rho,\delta}(U\times\mathbb R^n)
\subset S^m_{\rho,\delta}(U\times\mathbb R^n)
\subset S^{m+1}_{\rho,\delta}(U\times\mathbb R^n) \subset \cdots .
\end{align*}
[/definition]
The inclusion follows from $\langle \xi \rangle^{m'} \le \langle \xi \rangle^m$ when $m' \le m$. The smallest remainders in this filtration are those that fall below every finite order, and those are exactly the remainders that the later operator calculus treats as smoothing errors.
[definition: Smoothing Symbol]
The smoothing symbol class on $U$ is
\begin{align*}
S^{-\infty}(U\times\mathbb R^n) := \bigcap_{m\in\mathbb R} S^m(U\times\mathbb R^n).
\end{align*}
[/definition]
A smoothing symbol decays faster than any prescribed power of $\langle\xi\rangle$ after all derivatives, and later quantising such a symbol produces an operator with a smooth Schwartz kernel. This is the precise sense in which very low-order remainders become harmless for Sobolev mapping estimates and elliptic regularity.
[example: Compact Frequency Cutoff]
Let $\chi\in C_c^\infty(\mathbb R^n)$ and $b\in C^\infty(U\times\mathbb R^n)$, set $L=\operatorname{supp}\chi$, and define $a(x,\xi)=b(x,\xi)\chi(\xi)$. We show that $a\in S^r(U\times\mathbb R^n)$ for every $r\in\mathbb R$, which is exactly $a\in S^{-\infty}(U\times\mathbb R^n)$.
Fix a compact set $K\subset U$ and multi-indices $\alpha,\beta$. Since $\chi$ is independent of $x$, the Leibniz rule in the $\xi$ variables gives
\begin{align*}
\partial_x^\alpha\partial_\xi^\beta a(x,\xi)=\sum_{\gamma\le \beta}\binom{\beta}{\gamma}(\partial_x^\alpha\partial_\xi^{\beta-\gamma}b)(x,\xi)(\partial_\xi^\gamma\chi)(\xi).
\end{align*}
For each $\gamma\le\beta$, the function $\partial_\xi^\gamma\chi$ is supported in $L$. Since the derivatives of $b$ are continuous and $K\times L$ is compact, there is a constant $B_{K,\alpha,\beta}$ such that
\begin{align*}
|(\partial_x^\alpha\partial_\xi^{\beta-\gamma}b)(x,\xi)|\le B_{K,\alpha,\beta}
\end{align*}
for all $x\in K$, all $\xi\in L$, and all $\gamma\le\beta$. Since each $\partial_\xi^\gamma\chi$ is continuous with compact support, there is also a constant $D_\beta$ such that
\begin{align*}
|\partial_\xi^\gamma\chi(\xi)|\le D_\beta
\end{align*}
for all $\xi\in\mathbb R^n$ and all $\gamma\le\beta$. Hence
\begin{align*}
|\partial_x^\alpha\partial_\xi^\beta a(x,\xi)|\le \sum_{\gamma\le\beta}\binom{\beta}{\gamma}B_{K,\alpha,\beta}D_\beta
\end{align*}
whenever $x\in K$, and the left-hand side is $0$ when $\xi\notin L$.
Now fix $r\in\mathbb R$. Because $L$ is compact and $\langle\xi\rangle^{r-|\beta|}>0$ on $L$, the number
\begin{align*}
\mu_{r,\beta}:=\inf_{\xi\in L}\langle\xi\rangle^{r-|\beta|}
\end{align*}
is positive. Therefore, with
\begin{align*}
C_{K,\alpha,\beta,r}:=\mu_{r,\beta}^{-1}\sum_{\gamma\le\beta}\binom{\beta}{\gamma}B_{K,\alpha,\beta}D_\beta,
\end{align*}
we have
\begin{align*}
|\partial_x^\alpha\partial_\xi^\beta a(x,\xi)|\le C_{K,\alpha,\beta,r}\langle\xi\rangle^{r-|\beta|}
\end{align*}
for all $x\in K$ and all $\xi\in\mathbb R^n$. Since $r$ was arbitrary, $a$ lies in every standard symbol class $S^r$, so the cutoff symbol is smoothing.
[/example]
This compactly supported example shows that low-frequency modifications are invisible to high-frequency order estimates. The next structural question is whether the operations needed in quantisation, including differentiation, products, and cutoffs, preserve the filtration with predictable order changes.
[quotetheorem:7664]
[citeproof:7664]
The product estimate depends on having uniform control of all differentiated factors; without the Leibniz bounds, even multiplying two apparently mild amplitudes could create derivatives whose growth is not controlled by the expected combined order. The cutoff statement also has a limitation: a rough cutoff would introduce distributional or unbounded derivatives, so smoothness and symbol estimates of order $0$ are essential. These estimates explain why the order filtration behaves like a graded algebra up to lower-order terms, which is exactly the bookkeeping later used to prove Sobolev mapping properties and to separate principal symbols from lower-order errors. The next example shows how a smooth cutoff can alter low frequencies without changing the high-frequency order.
[example: Low Frequency Regularisation]
Let $m\in\mathbb R$ and choose $\chi\in C^\infty(\mathbb R^n)$ with $\chi(\xi)=0$ for $|\xi|\le 1$ and $\chi(\xi)=1$ for $|\xi|\ge 2$. Define
\begin{align*}
a(x,\xi)=a(\xi):=\chi(\xi)|\xi|^m.
\end{align*}
Since $\chi$ vanishes on a neighbourhood of $\xi=0$, the product is smooth at the origin; since it is independent of $x$, $\partial_x^\alpha a=0$ whenever $|\alpha|>0$.
We prove the defining estimates for $S^m$. Fix a multi-index $\beta$. By the Leibniz rule,
\begin{align*}
\partial_\xi^\beta a(\xi)=\sum_{\gamma\le \beta}\binom{\beta}{\gamma}(\partial_\xi^\gamma\chi)(\xi)\partial_\xi^{\beta-\gamma}|\xi|^m.
\end{align*}
For every multi-index $\eta$, repeated differentiation of $|\xi|^m$ on $\mathbb R^n\setminus\{0\}$ gives a finite sum of terms
\begin{align*}
c_{\eta,\nu,k}\xi^\nu|\xi|^{m-2k}
\end{align*}
with $|\nu|-2k=-|\eta|$. Indeed, differentiating one such term in $\xi_\ell$ gives
\begin{align*}
\partial_{\xi_\ell}(\xi^\nu|\xi|^{m-2k})=\nu_\ell\xi^{\nu-e_\ell}|\xi|^{m-2k}+(m-2k)\xi^{\nu+e_\ell}|\xi|^{m-2k-2},
\end{align*}
where the first term is absent if $\nu_\ell=0$, and the two new exponents satisfy
\begin{align*}
|\nu-e_\ell|-2k=|\nu|-2k-1
\end{align*}
and
\begin{align*}
|\nu+e_\ell|-2(k+1)=|\nu|-2k-1.
\end{align*}
Hence, for $|\xi|\ge 1$,
\begin{align*}
|\xi^\nu|\xi|^{m-2k}|\le |\xi|^{m+|\nu|-2k}=|\xi|^{m-|\eta|}\le C_{\eta,m}\langle\xi\rangle^{m-|\eta|}.
\end{align*}
Now split the Leibniz sum. If $\gamma=0$, then $\chi$ is bounded and the preceding estimate with $\eta=\beta$ gives
\begin{align*}
|\chi(\xi)\partial_\xi^\beta|\xi|^m|\le C_\beta\langle\xi\rangle^{m-|\beta|}.
\end{align*}
If $\gamma\ne0$, then $\partial_\xi^\gamma\chi$ is supported in the compact annulus $1\le |\xi|\le 2$. On this annulus, $\partial_\xi^\gamma\chi$ and $\partial_\xi^{\beta-\gamma}|\xi|^m$ are bounded, and $\langle\xi\rangle^{m-|\beta|}$ is bounded below by a positive constant. Therefore each $\gamma\ne0$ summand is also bounded by a constant times $\langle\xi\rangle^{m-|\beta|}$. Summing the finitely many terms gives
\begin{align*}
|\partial_\xi^\beta a(\xi)|\le C_\beta\langle\xi\rangle^{m-|\beta|}
\end{align*}
for all $\xi\in\mathbb R^n$. Together with the vanishing of positive $x$-derivatives, this is exactly $a\in S^m$. Thus the cutoff keeps the high-frequency model $|\xi|^m$ unchanged while replacing its possible singular behaviour at $\xi=0$ by a smooth low-frequency zero.
[/example]
## Classical Symbols and Asymptotic Expansions
The final question in this chapter is how to formalise the idea that a symbol has a leading homogeneous part, followed by successively lower-order corrections. This is essential for ellipticity, principal symbols, parametrices, and coordinate changes. Classical symbols are those admitting such a polyhomogeneous expansion at infinity.
[definition: Homogeneous Symbol Component]
Let $r\in\mathbb R$. A function $a_r:U\times(\mathbb R^n\setminus\{0\})\to\mathbb C$ is a homogeneous symbol component of degree $r$ in $\xi$ if $a_r\in C^\infty(U\times(\mathbb R^n\setminus\{0\}))$ and
\begin{align*}
a_r(x,t\xi)=t^r a_r(x,\xi)
\end{align*}
for all $x\in U$, $\xi\ne 0$, and $t>0$.
[/definition]
Homogeneous functions need not be smooth at $\xi=0$, so they are used only away from the zero section. A cutoff is inserted when turning homogeneous pieces into genuine symbols on all of $U\times\mathbb R^n$.
[definition: Classical Symbol]
A symbol $a\in S^m(U\times\mathbb R^n)$ is classical of order $m$ if there exist functions $a_{m-j}\in C^\infty(U\times(\mathbb R^n\setminus\{0\}))$, homogeneous of degree $m-j$ in $\xi$, such that for every $N\in\mathbb N$ and every cutoff $\chi\in C^\infty(\mathbb R^n)$ with $\chi(\xi)=0$ near $0$ and $\chi(\xi)=1$ for $|\xi|$ sufficiently large,
\begin{align*}
a(x,\xi)-\sum_{j=0}^{N-1}\chi(\xi)a_{m-j}(x,\xi) \in S^{m-N}(U\times\mathbb R^n).
\end{align*}
[/definition]
We write $a\sim\sum_{j\ge0}a_{m-j}$ for this expansion. The notation records successive high-frequency approximations, not pointwise convergence of the infinite series.
[example: Classical Differential Symbol]
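Let $a(x,\xi)=\sum_{i,j=1}^n g_{ij}(x)\xi_i\xi_j+V(x)$ be the quadratic differential symbol considered earlier, with $g_{ij},V\in C^\infty(U)$. Define, for $\xi\ne0$,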
\begin{align*}
a_2(x,\xi):=\sum_{i,j=1}^n g_{ij}(x)\xi_i\xi_j
\end{align*}
and
\begin{align*}
a_1(x,\xi):=0,\qquad a_0(x,\xi):=V(x),\qquad a_{2-j}(x,\xi):=0\ \text{for }j\ge3.
\end{align*}
Then $a_2(x,t\xi)=t^2a_2(x,\xi)$ because each factor $\xi_i\xi_j$ becomes $(t\xi_i)(t\xi_j)=t^2\xi_i\xi_j$, while $a_0(x,t\xi)=V(x)=t^0a_0(x,\xi)$. The zero components are homogeneous of every degree.
Let $\chi\in C^\infty(\mathbb R^n)$ vanish near $0$ and equal $1$ for $|\xi|$ sufficiently large. For $N=1$ or $N=2$, the partial classical sum is just $\chi(\xi)a_2(x,\xi)$, since $a_1=0$. Hence
\begin{align*}
a(x,\xi)-\chi(\xi)a_2(x,\xi)=(1-\chi(\xi))a_2(x,\xi)+V(x).
\end{align*}
The factor $1-\chi$ is supported where $|\xi|$ is bounded. On every compact $K\subset U$, all $x$-derivatives of the coefficients $g_{ij}$ and $V$ are bounded, and all $\xi$-derivatives of $(1-\chi)\xi_i\xi_j$ are bounded because their support is compact in $\xi$. Thus $(1-\chi)a_2$ lies in $S^r(U\times\mathbb R^n)$ for every $r\in\mathbb R$, while $V(x)$ lies in $S^0(U\times\mathbb R^n)$ because its positive $\xi$-derivatives vanish. Therefore the remainder lies in $S^0\subset S^1$, which gives the required class $S^{2-N}$ for both $N=1$ and $N=2$.
For $N\ge3$, the partial classical sum includes $\chi a_2$, $\chi a_1=0$, and $\chi a_0=\chi V$, so
\begin{align*}
a(x,\xi)-\sum_{j=0}^{N-1}\chi(\xi)a_{2-j}(x,\xi)=(1-\chi(\xi))a_2(x,\xi)+(1-\chi(\xi))V(x).
\end{align*}
Both terms have compact $\xi$-support and smooth bounded derivatives on $K\times\mathbb R^n$, so the same compact-support estimate puts the remainder in every $S^r$, in particular in $S^{2-N}$. Hence
\begin{align*}
a\sim a_2+a_0,
\end{align*}
with the missing degree $1$ and lower components equal to $0$. Thus the quadratic differential symbol is classical of order $2$.
[/example]
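The remainder computation can be made concrete in one dimension. The sketch below takes hypothetical coefficients $g(x)=2+\sin x$ and $V(x)=\cos x$, a cutoff built from the standard $e^{-1/t}$ transition function, and evaluates the remainder $a-\sum_{j<N}\chi a_{2-j}$ at a frequency where $\chi=1$. All names and coefficient choices are assumptions made for illustration.

```python
import math

def smooth_step(t):
    """C-infinity function equal to 0 for t <= 0 and 1 for t >= 1."""
    def tail(y):
        return math.exp(-1.0 / y) if y > 0 else 0.0
    return tail(t) / (tail(t) + tail(1.0 - t))

def chi(xi):
    """Cutoff vanishing for |xi| <= 1 and equal to 1 for |xi| >= 2."""
    return smooth_step(abs(xi) - 1.0)

# Hypothetical smooth coefficients on U = R.
g = lambda x: 2.0 + math.sin(x)
V = lambda x: math.cos(x)

def a(x, xi):
    """Full symbol g(x) xi^2 + V(x)."""
    return g(x) * xi * xi + V(x)

def a2(x, xi):
    """Homogeneous component of degree 2."""
    return g(x) * xi * xi

def a0(x, xi):
    """Homogeneous component of degree 0."""
    return V(x)

def remainder(x, xi, N):
    """a - sum_{j<N} chi * a_{2-j}, using a_1 = 0 and a_{2-j} = 0 for j >= 3."""
    partial = chi(xi) * a2(x, xi)
    if N >= 3:
        partial += chi(xi) * a0(x, xi)
    return a(x, xi) - partial

x, xi, t = 0.3, 1.7, 3.0
print(a2(x, t * xi), t**2 * a2(x, xi))   # homogeneity of degree 2
print(remainder(x, 5.0, 2), V(x))        # N = 2: remainder is V(x) at high frequency
print(remainder(x, 5.0, 3))              # N >= 3: remainder vanishes where chi = 1
```

For $N\ge3$ the remainder is supported in $|\xi|\le2$, which is the compact-support mechanism that places it in every $S^r$.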
The differential example has only finitely many nonzero homogeneous pieces. For parametrices and coordinate changes, however, the expansion is usually infinite, so the next problem is whether a prescribed formal sequence of homogeneous terms comes from an actual symbol.
[quotetheorem:7665]
[citeproof:7665]
The shrinking parameters are essential: using a fixed cutoff scale for every homogeneous term need not make the differentiated weighted seminorms summable, because infinitely many lower-order pieces can still contribute on the same frequency region. The theorem gives existence but not uniqueness, and different choices of cutoffs and shrinking parameters can produce different realised symbols. This non-uniqueness is harmless only if it is confined to smoothing terms, since smoothing operators are invisible to the microlocal principal geometry and improve Sobolev regularity. The remaining question is whether these choices affect any finite homogeneous component, or whether they change only the smoothing remainder.
[quotetheorem:7666]
[citeproof:7666]
The hypothesis that the homogeneous components agree at every degree is necessary: if the first disagreement occurs at degree $m-j$, then the difference generally has order $m-j$, not order $-\infty$. For example, two classical symbols with leading parts differing by $|\xi|^m$ away from the origin cannot have smoothing difference, because their difference still has high-frequency growth of order $m$. The converse identifies the exact limitation of formal expansions: they determine a classical symbol only modulo smoothing symbols, not as a unique function on phase space. This is the right equivalence relation for microlocal elliptic regularity, where the leading component becomes the principal symbol and smoothing ambiguity improves Sobolev regularity rather than changing singularities.
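A minimal illustration of the smoothing ambiguity, under assumptions made only for this sketch: $a$ is any classical symbol of order $m$ on $U\times\mathbb R^n$ and $\varphi\in C^\infty(U)$ is arbitrary. Set
\begin{align*}
\tilde a(x,\xi)=a(x,\xi)+\varphi(x)e^{-|\xi|^2}.
\end{align*}
Every $\xi$-derivative of the Gaussian is a polynomial times $e^{-|\xi|^2}$, so for each compact $K\subset U$, all multi-indices $\alpha,\gamma$, and every $N$,
\begin{align*}
\sup_{x\in K}\bigl|\partial_x^\alpha\partial_\xi^\gamma\bigl(\varphi(x)e^{-|\xi|^2}\bigr)\bigr|\le C_{K\alpha\gamma N}(1+|\xi|)^{-N}.
\end{align*}
Thus $\tilde a-a$ lies in every $S^r$, and $a$ and $\tilde a$ have the same homogeneous components at every degree even though they differ as functions wherever $\varphi\ne0$.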
Chapters 1–2 have established symbol classes as controlled functions on phase space, with equivalence relations that respect elliptic regularity. To extract estimates, composition laws, and regularity theorems, these symbols must be realised as operators through a quantisation procedure. The next chapter constructs the oscillatory integral quantisation and examines what information the kernel carries about the symbol.
# 3. Quantisation and Kernels
This chapter turns the symbol classes of Chapter 2 into operators. Those classes supplied controlled functions on phase space; the present chapter explains how such a symbol acts on a function or distribution. The main questions are: how should the formal expression $a(x,D)$ be interpreted, what kernel does it define, and why does the resulting operator preserve singular support away from the diagonal?
## From Symbols to Operators
The Fourier multiplier model suggests replacing a constant-coefficient multiplier $p(\xi)$ by a variable-coefficient symbol $a(x,\xi)$. The issue is that $a$ now depends on the observation point $x$, so the operator is no longer diagonalised by the Fourier transform. Kohn--Nirenberg quantisation is the convention used in this course to make this replacement precise.
[definition: Kohn-Nirenberg Quantisation]
Let $a \in S^m_{1,0}(\mathbb R^n \times \mathbb R^n)$ and let $u \in \mathcal{S}(\mathbb R^n)$. The Kohn--Nirenberg quantisation of $a$ is the operator $\operatorname{Op}(a):\mathcal{S}(\mathbb R^n) \to \mathcal{S}'(\mathbb R^n)$ defined by
\begin{align*}
\operatorname{Op}(a)u(x)
= (2\pi)^{-n}\int_{\mathbb R^n}\int_{\mathbb R^n} e^{i(x-y)\cdot\xi}a(x,\xi)u(y)\,d\mathcal L^n(y)d\mathcal L^n(\xi).
\end{align*}
[/definition]
For symbols of order $m<-n$ the double integral converges absolutely, since $|a(x,\xi)u(y)|\le C(1+|\xi|)^m|u(y)|$ is integrable in $(y,\xi)$ for each fixed $x$. For general symbols the formula is first understood by carrying out the $y$-integration, which produces the Fourier transform of $u$:
\begin{align*}
\operatorname{Op}(a)u(x)
= (2\pi)^{-n/2}\int_{\mathbb R^n} e^{ix\cdot \xi}a(x,\xi)\hat{u}(\xi)\,d\mathcal L^n(\xi),
\end{align*}
which is meaningful because $\hat u$ is rapidly decreasing and $a$ has at most polynomial growth in $\xi$.
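The passage between the two displays can be checked directly. The sketch below assumes the unitary normalisation $\hat u(\xi)=(2\pi)^{-n/2}\int_{\mathbb R^n}e^{-iy\cdot\xi}u(y)\,d\mathcal L^n(y)$, which is the convention consistent with the constants in both formulas; if a different normalisation is fixed elsewhere in the course, the constants shift accordingly. Splitting the phase as $e^{i(x-y)\cdot\xi}=e^{ix\cdot\xi}e^{-iy\cdot\xi}$ and integrating in $y$ first,
\begin{align*}
(2\pi)^{-n}\int_{\mathbb R^n}\int_{\mathbb R^n} e^{i(x-y)\cdot\xi}a(x,\xi)u(y)\,d\mathcal L^n(y)d\mathcal L^n(\xi)
&=(2\pi)^{-n/2}\int_{\mathbb R^n}e^{ix\cdot\xi}a(x,\xi)\left((2\pi)^{-n/2}\int_{\mathbb R^n}e^{-iy\cdot\xi}u(y)\,d\mathcal L^n(y)\right)d\mathcal L^n(\xi)\\
&=(2\pi)^{-n/2}\int_{\mathbb R^n}e^{ix\cdot\xi}a(x,\xi)\hat u(\xi)\,d\mathcal L^n(\xi).
\end{align*}
The inner integral converges absolutely because $u\in\mathcal S(\mathbb R^n)$, so only the outer $\xi$-integral relies on the decay of $\hat u$.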
The definition should recover familiar operators. This is the first check on the convention and fixes the placement of $x$, $y$, and $\xi$ in the phase.
[example: Differential Operators]
Let
\begin{align*}
P=\sum_{|\alpha|\le m} c_\alpha(x)D^\alpha,\qquad D^\alpha=(-i)^{|\alpha|}\partial_x^\alpha,
\end{align*}
with $c_\alpha\in C^\infty_b(\mathbb R^n)$ for simplicity. We show that $P$ is the Kohn--Nirenberg operator with polynomial symbol
\begin{align*}
p(x,\xi)=\sum_{|\alpha|\le m}c_\alpha(x)\xi^\alpha.
\end{align*}
For $u\in\mathcal S(\mathbb R^n)$, Fourier inversion gives
\begin{align*}
u(x)=(2\pi)^{-n/2}\int_{\mathbb R^n}e^{ix\cdot\xi}\hat u(\xi)\,d\mathcal L^n(\xi).
\end{align*}
Differentiating under the integral is justified by the rapid decay of $\hat u$. Since $\partial_{x_j}e^{ix\cdot\xi}=i\xi_j e^{ix\cdot\xi}$, for a multi-index $\alpha$ we have
\begin{align*}
\partial_x^\alpha e^{ix\cdot\xi}=i^{|\alpha|}\xi^\alpha e^{ix\cdot\xi}.
\end{align*}
Therefore
\begin{align*}
D^\alpha e^{ix\cdot\xi}=(-i)^{|\alpha|}i^{|\alpha|}\xi^\alpha e^{ix\cdot\xi}=\xi^\alpha e^{ix\cdot\xi}.
\end{align*}
Applying $P$ to the inverse Fourier representation of $u$ gives
\begin{align*}
Pu(x)=\sum_{|\alpha|\le m}c_\alpha(x)(2\pi)^{-n/2}\int_{\mathbb R^n}\xi^\alpha e^{ix\cdot\xi}\hat u(\xi)\,d\mathcal L^n(\xi).
\end{align*}
Because the sum over $\alpha$ is finite, this is
\begin{align*}
Pu(x)=(2\pi)^{-n/2}\int_{\mathbb R^n}e^{ix\cdot\xi}\left(\sum_{|\alpha|\le m}c_\alpha(x)\xi^\alpha\right)\hat u(\xi)\,d\mathcal L^n(\xi).
\end{align*}
Substituting the definition of $p$ yields
\begin{align*}
Pu(x)=(2\pi)^{-n/2}\int_{\mathbb R^n}e^{ix\cdot\xi}p(x,\xi)\hat u(\xi)\,d\mathcal L^n(\xi)=\operatorname{Op}(p)u(x).
\end{align*}
Thus differential operators are exactly the polynomial-in-$\xi$ part of the Kohn--Nirenberg calculus, with each factor $\xi_j$ encoding the operator $D_{x_j}=-i\partial_{x_j}$.
[/example]
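A one-dimensional instance shows why the position of the coefficient matters. The operators $P_1$ and $P_2$ below are chosen here purely for illustration. Take $c(x)=\sin x\in C^\infty_b(\mathbb R)$ and compare $P_1=\sin(x)D_x$ with $P_2=D_x\circ\sin(x)$. The first is already in the form $\sum c_\alpha(x)D^\alpha$, so its symbol is $\sin(x)\,\xi$. For the second, the Leibniz rule gives
\begin{align*}
P_2u(x)=-i\partial_x\bigl(\sin(x)u(x)\bigr)=\sin(x)D_xu(x)-i\cos(x)u(x),
\end{align*}
so that
\begin{align*}
P_2=\operatorname{Op}\bigl(\sin(x)\,\xi-i\cos(x)\bigr).
\end{align*}
The two operators share the order $1$ part $\sin(x)\,\xi$ and differ by an order $0$ term, which is the cost of moving the coefficient to the left of the derivative.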
Multiplication operators give the other elementary endpoint. Differential operators showed what happens when the frequency variable appears polynomially, so powers of $\xi$ encode differentiation. If the symbol ignores $\xi$ altogether, the oscillatory formula should collapse to a pointwise operation in the base variable. This second check confirms that the calculus contains both differentiation and smooth coefficient multiplication before it combines them into more flexible operators.
[example: Multiplication Operators]
Let $f\in C^\infty_b(\mathbb R^n)$ and set $a(x,\xi)=f(x)$. Since $a$ is independent of $\xi$, the Fourier-side form of Kohn--Nirenberg quantisation gives, for $u\in\mathcal S(\mathbb R^n)$,
\begin{align*}
\operatorname{Op}(a)u(x)=(2\pi)^{-n/2}\int_{\mathbb R^n}e^{ix\cdot\xi}a(x,\xi)\hat u(\xi)\,d\mathcal L^n(\xi).
\end{align*}
Substituting $a(x,\xi)=f(x)$ gives
\begin{align*}
\operatorname{Op}(a)u(x)=(2\pi)^{-n/2}\int_{\mathbb R^n}e^{ix\cdot\xi}f(x)\hat u(\xi)\,d\mathcal L^n(\xi).
\end{align*}
For fixed $x$, the factor $f(x)$ is independent of $\xi$, so by linearity of the integral,
\begin{align*}
\operatorname{Op}(a)u(x)=f(x)(2\pi)^{-n/2}\int_{\mathbb R^n}e^{ix\cdot\xi}\hat u(\xi)\,d\mathcal L^n(\xi).
\end{align*}
By Fourier inversion on $\mathcal S(\mathbb R^n)$,
\begin{align*}
(2\pi)^{-n/2}\int_{\mathbb R^n}e^{ix\cdot\xi}\hat u(\xi)\,d\mathcal L^n(\xi)=u(x).
\end{align*}
Therefore
\begin{align*}
\operatorname{Op}(a)u(x)=f(x)u(x).
\end{align*}
Thus a symbol depending only on $x$ acts by pointwise multiplication, so smooth coefficient multiplication is the order $0$ endpoint of the Kohn--Nirenberg calculus.
[/example]
These examples also explain why the quantisation is not symmetric in $x$ and $y$. The coefficient is sampled at the output point $x$, which is the defining feature of the Kohn--Nirenberg convention. The next problem is whether the same formula has stable mapping properties on the test-function space used to define distributions.
[quotetheorem:7719]
[citeproof:7719]
The point of this theorem is not only that the formula is legal on test functions. It gives the first version of the operator topology: estimates on finitely many derivatives of the symbol control finitely many seminorms of the output.
The symbol-class hypotheses are doing real work here. The estimates in $S^m_{1,0}$ give polynomial control in $\xi$ after every $x$-derivative and gain one power of decay after every $\xi$-derivative, so the integration-by-parts argument can always be paid for by Schwartz decay of $\hat u$. The $x$-control cannot be dropped: if $a(x,\xi)=e^{|x|^2}$, then $\operatorname{Op}(a)u=e^{|x|^2}u(x)$, and the Gaussian $u(x)=e^{-|x|^2/2}$ is sent to $e^{|x|^2/2}$, which is not a Schwartz function. The $\xi$-derivative estimates are also structural: if repeated $\xi$-derivatives of the amplitude grow faster than any fixed polynomial, the commutations used to handle powers of $x$ do not close on finitely many Schwartz seminorms. Thus the theorem is a continuity statement for the standard symbol calculus, not a statement about arbitrary smooth amplitudes of polynomial size. This limitation is important later: mapping results on distributions and Sobolev spaces will always depend on the same finite families of symbol seminorms.
## Oscillatory Integrals and Localisation
The double integral defining $\operatorname{Op}(a)$ is rarely absolutely convergent in the frequency variable. To use the same formula on open sets and eventually on distributions, we need a notion of integral in which the oscillation of $e^{i(x-y)\cdot\xi}$ supplies convergence after integration by parts.
[definition: Oscillatory Integral Regularisation]
Let $W\subseteq \mathbb R^n\times\mathbb R^n$ be open, and let $A\in C^\infty(W\times\mathbb R^n)$ satisfy symbol estimates in $\xi$ locally uniformly in $(x,y)$. The oscillatory integral regularisation is the assignment, whenever the limit below exists in $\mathcal D'(W)$,
\begin{align*}
A \longmapsto \operatorname{Os}_\xi(A)\in \mathcal D'(W),
\end{align*}
defined by
\begin{align*}
\operatorname{Os}_\xi(A)(x,y)
=\operatorname{Os}\!\int_{\mathbb R^n} e^{i(x-y)\cdot\xi}A(x,y,\xi)\,d\mathcal L^n(\xi).
\end{align*}
For each cutoff $\chi\in C_c^\infty(\mathbb R^n)$ with $\chi(0)=1$, this distribution is the limit
\begin{align*}
\lim_{\varepsilon\downarrow 0}\int e^{i(x-y)\cdot\xi}\chi(\varepsilon \xi)A(x,y,\xi)\,d\mathcal L^n(\xi),
\end{align*}
understood as a limit in $\mathcal D'(W)$.
[/definition]
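As a sanity check on the definition, the simplest case can be worked out by hand; the test function $\Phi$ and the auxiliary function $g$ are notation introduced only for this sketch. Take $W=\mathbb R^n\times\mathbb R^n$ and $A\equiv1$. Writing $z=x-y$ and substituting $\eta=\varepsilon\xi$,
\begin{align*}
\int_{\mathbb R^n}e^{iz\cdot\xi}\chi(\varepsilon\xi)\,d\mathcal L^n(\xi)=\varepsilon^{-n}g(z/\varepsilon),
\qquad
g(w)=\int_{\mathbb R^n}e^{iw\cdot\eta}\chi(\eta)\,d\mathcal L^n(\eta).
\end{align*}
The function $g$ is Schwartz, and Fourier inversion gives $\int_{\mathbb R^n}g\,d\mathcal L^n=(2\pi)^n\chi(0)=(2\pi)^n$. Substituting $x=y+\varepsilon w$ and using dominated convergence, for every $\Phi\in C_c^\infty(W)$,
\begin{align*}
\lim_{\varepsilon\downarrow0}\int_{\mathbb R^n}\int_{\mathbb R^n}\varepsilon^{-n}g\bigl((x-y)/\varepsilon\bigr)\Phi(x,y)\,d\mathcal L^n(x)d\mathcal L^n(y)
=(2\pi)^n\int_{\mathbb R^n}\Phi(y,y)\,d\mathcal L^n(y).
\end{align*}
Hence $\operatorname{Os}_\xi(1)=(2\pi)^n\delta(x-y)$ for every admissible cutoff, and the prefactor $(2\pi)^{-n}$ in the quantisation formula turns this into the kernel $\delta(x-y)$ of the identity.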
The regularisation is independent of the cutoff once enough integrations by parts are available. Away from $x=y$, the vector field differentiating the phase in the $\xi$ direction produces factors of $|x-y|^{-1}$ and improves decay in $\xi$. This raises the basic question needed for kernels: does the regularised integral become a smooth function whenever the variables are kept away from the diagonal?
[quotetheorem:7667]
[citeproof:7667]
This principle is the analytic mechanism behind the phrase "smooth off the diagonal." The separation condition $|x-y|\ge r$ is essential: it keeps the $\xi$-gradient of the phase, $\nabla_\xi\bigl((x-y)\cdot\xi\bigr)=x-y$, bounded away from zero, and so keeps the coefficients in the integration-by-parts operator bounded on the region under consideration. On the diagonal this mechanism degenerates. For instance, with $A\equiv 1$ the regularised integral is $(2\pi)^n\delta(x-y)$, a constant multiple of the kernel of the identity operator, which is not a smooth function at $x=y$.
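One standard way to realise the integration by parts, which may differ in normalisation from the operator used in the cited proof, is through the first-order operator
\begin{align*}
L=\frac{1}{|x-y|^2}\sum_{j=1}^n(x_j-y_j)D_{\xi_j},\qquad D_{\xi_j}=-i\partial_{\xi_j},
\end{align*}
defined for $x\ne y$. Since $D_{\xi_j}e^{i(x-y)\cdot\xi}=(x_j-y_j)e^{i(x-y)\cdot\xi}$, we have $Le^{i(x-y)\cdot\xi}=e^{i(x-y)\cdot\xi}$. Integrating by parts $k$ times in the cutoff integral, with no boundary terms because $\chi(\varepsilon\xi)$ has compact support, gives
\begin{align*}
\int_{\mathbb R^n}e^{i(x-y)\cdot\xi}\chi(\varepsilon\xi)A(x,y,\xi)\,d\mathcal L^n(\xi)
=\int_{\mathbb R^n}e^{i(x-y)\cdot\xi}(-L)^k\bigl(\chi(\varepsilon\xi)A(x,y,\xi)\bigr)\,d\mathcal L^n(\xi).
\end{align*}
If $A$ satisfies symbol estimates of order $m$, then for $(x,y)$ in a compact set with $|x-y|\ge r$,
\begin{align*}
\bigl|(-L)^k\bigl(\chi(\varepsilon\xi)A(x,y,\xi)\bigr)\bigr|\le C_k\,r^{-k}(1+|\xi|)^{m-k},
\end{align*}
uniformly in $\varepsilon\in(0,1]$, because the family $\chi(\varepsilon\,\cdot)$ is bounded in $S^0$. For $k>m+n$ the right-hand side is integrable in $\xi$, so dominated convergence identifies the limit $\varepsilon\downarrow0$ with an absolutely convergent integral.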
The symbol estimates are essential as well. They ensure that each integration by parts turns the amplitude into another controlled amplitude with better decay in $\xi$. If, away from the diagonal, one replaces a symbol by $A(x,y,\xi)=e^{|\xi|^2}$, the cutoff integrals
\begin{align*}
\int_{\mathbb R^n} e^{i(x-y)\cdot\xi}\chi(\varepsilon\xi)e^{|\xi|^2}\,d\mathcal L^n(\xi)
\end{align*}
do not have a distributional limit as $\varepsilon\downarrow 0$: the exponential growth overwhelms all oscillatory integrations by parts. Thus the theorem gives no smoothing across the diagonal, and it gives no conclusion for arbitrary smooth amplitudes. It says that once the output and input variables are separated, standard symbol estimates let frequency oscillation remove the apparent divergence. This is exactly the form needed for kernel estimates and, later, for microlocal statements that refine singular support by keeping track of directions in frequency space.
On an [open set](/page/Open%20Set) $U\subseteq\mathbb R^n$, quantisation is defined locally by cutting off inside coordinate space. The operator acts first on compactly supported smooth functions, because then the $y$-variable remains in a compact subset of $U$.
[definition: Pseudodifferential Operator on an Open Set]
Let $U\subseteq\mathbb R^n$ be open. A [linear map](/page/Linear%20Map) $P:C_c^\infty(U)\to C^\infty(U)$ is a pseudodifferential operator of order $m$ if, locally in $U$, it can be written as
\begin{align*}
Pu(x)=(2\pi)^{-n}\operatorname{Os}\!\int\int e^{i(x-y)\cdot\xi}a(x,\xi)u(y)\,d\mathcal L^n(y)d\mathcal L^n(\xi)
\end{align*}
with $a\in S^m_{1,0}(V\times\mathbb R^n)$ on coordinate neighbourhoods $V\subset U$.
[/definition]
After localisation, the operator can be paired against a test function in $x$. To define the same operator on distributions, we must know what test function should be fed into the input distribution. The obstruction is support: a transpose may send a compactly supported test function to a smooth function whose support is not compact, and a general distribution cannot be evaluated on such an object. This leads to the transpose construction and explains where support restrictions enter.
[definition: Distributional Action]
Let $P:C_c^\infty(U)\to C^\infty(U)$ be continuous. Its transpose $P^t:C_c^\infty(U)\to \mathcal D'(U)$ is defined by
\begin{align*}
(P^t\phi)(u)=\int_U \phi(x)Pu(x)\,d\mathcal L^n(x),
\end{align*}
for $\phi,u\in C_c^\infty(U)$.
[/definition]
If $P^t\phi$ is represented by a compactly supported test function for every $\phi\in C_c^\infty(U)$, then the formula $(Pu)(\phi)=u(P^t\phi)$ defines the distributional extension. This extension statement is a consequence of the support property, not part of the definition of the transpose itself.
The support condition in this extension criterion is not cosmetic. Without it, applying an operator to a compactly supported test function can create output whose interaction with a distribution is not defined by a compactly supported pairing.
[example: Non-Properly Supported Operator]
On $\mathbb R$, fix a nonzero $\psi\in C_c^\infty(\mathbb R)$ and define
\begin{align*}
Tu(x)=\int_{\mathbb R}\psi(y)u(y)\,d\mathcal L^1(y)
\end{align*}
for $u\in C_c^\infty(\mathbb R)$. The right-hand side has no remaining dependence on $x$, so if
\begin{align*}
c_u=\int_{\mathbb R}\psi(y)u(y)\,d\mathcal L^1(y),
\end{align*}
then
\begin{align*}
Tu(x)=c_u
\end{align*}
for every $x\in\mathbb R$. Since $\psi\ne 0$, choose $u\in C_c^\infty(\mathbb R)$ with $c_u\ne 0$; for instance, take $u$ supported where $\psi$ is not identically zero and with the same sign locally. Then $Tu$ is a nonzero constant function, hence
\begin{align*}
\operatorname{supp}(Tu)=\mathbb R.
\end{align*}
For test functions $\phi,u\in C_c^\infty(\mathbb R)$, the pairing with the output is
\begin{align*}
\int_{\mathbb R}\phi(x)Tu(x)\,d\mathcal L^1(x)
=\int_{\mathbb R}\phi(x)\left(\int_{\mathbb R}\psi(y)u(y)\,d\mathcal L^1(y)\right)d\mathcal L^1(x).
\end{align*}
Because $\phi$ and $u$ are compactly supported, [Fubini's theorem](/theorems/2961) applies and gives
\begin{align*}
\int_{\mathbb R}\phi(x)Tu(x)\,d\mathcal L^1(x)
=\int_{\mathbb R}\int_{\mathbb R}\phi(x)\psi(y)u(y)\,d\mathcal L^1(y)d\mathcal L^1(x).
\end{align*}
Thus the Schwartz kernel is the smooth function
\begin{align*}
K_T(x,y)=\psi(y).
\end{align*}
Its support is
\begin{align*}
\operatorname{supp}K_T=\mathbb R\times \operatorname{supp}\psi.
\end{align*}
Let $K\subset\mathbb R$ be any compact set meeting $\operatorname{supp}\psi$. Then
\begin{align*}
\pi_2^{-1}(K)\cap \operatorname{supp}K_T=\mathbb R\times (K\cap \operatorname{supp}\psi),
\end{align*}
which is not compact because it is unbounded in the $x$-direction. Hence the second projection from $\operatorname{supp}K_T$ is not proper. This operator has a smooth kernel, but it is not properly supported; smoothness of the kernel alone does not prevent compactly supported input from producing output supported everywhere.
[/example]
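Reading this example against the extension criterion stated before it is instructive; the short computation below is a sketch of that comparison and is not part of the example itself. Exchanging the order of integration in the pairing,
\begin{align*}
(T^t\phi)(u)=\int_{\mathbb R}u(y)\,\psi(y)\left(\int_{\mathbb R}\phi(x)\,d\mathcal L^1(x)\right)d\mathcal L^1(y),
\qquad\text{so}\qquad
T^t\phi=\left(\int_{\mathbb R}\phi\,d\mathcal L^1\right)\psi\in C_c^\infty(\mathbb R).
\end{align*}
Thus $T^t$ does send test functions to test functions, and the criterion extends $T$ to $\mathcal D'(\mathbb R)$ by $Tu=u(\psi)$, which is still a constant function. The failure of proper support appears on the other side: $T^t$ has kernel $K_{T^t}(x,y)=\psi(x)$, its transpose is $T$, and $T\phi$ is a nonzero constant whenever $c_\phi\ne0$, so it is not a test function. The criterion therefore does not extend $T^t$, which matches the fact that the formal expression
\begin{align*}
T^tv(x)=\psi(x)\int_{\mathbb R}v(y)\,d\mathcal L^1(y)
\end{align*}
has no meaning for a general $v\in\mathcal D'(\mathbb R)$.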
This example shows that smoothing behaviour and support behaviour are separate. Proper support is the condition that lets the local calculus act globally on distributions without imposing artificial support restrictions.
## Schwartz Kernels and Proper Support
Operators become geometric objects once we record their Schwartz kernels. The guiding problem is to identify which part of the kernel is responsible for singularities and which part is harmless smoothing.
[definition: Schwartz Kernel]
Let $U\subseteq\mathbb R^n$ be open and let $P:C_c^\infty(U)\to \mathcal D'(U)$ be continuous. A distribution $K_P\in\mathcal D'(U\times U)$ is the Schwartz kernel of $P$ if
\begin{align*}
(Pu)(\phi)=K_P(\phi\otimes u)
\end{align*}
for all $\phi,u\in C_c^\infty(U)$.
[/definition]
The kernel theorem says that this distribution exists and is unique for continuous linear maps on test functions. For a pseudodifferential operator, the natural question is stronger: what is the kernel in terms of the symbol, and where can that kernel fail to be smooth?
[quotetheorem:7668]
[citeproof:7668]
The singular support of the kernel is therefore contained in the diagonal. This observation is the kernel-level form of locality: pseudodifferential operators may be nonlocal, but their singular part is concentrated where input and output points coincide. The theorem has separated the analytic difficulty, namely singular behaviour near the diagonal, from a second difficulty that is geometric rather than microlocal.
The hypotheses explain why this conclusion belongs to the symbol calculus rather than to arbitrary oscillatory formulas. If the amplitude is replaced by $e^{|\xi|^2}$, the regularised frequency integral may fail to define a distribution even away from the diagonal, so there is no kernel to restrict to $(U\times U)\setminus\operatorname{diag}$. If the operator is not locally given by a Kohn--Nirenberg symbol in coordinate space, the displayed phase $e^{i(x-y)\cdot\xi}$ and the symbol estimates do not describe its kernel; for example, a Fourier integral operator has a different phase and can carry singularities along a canonical relation rather than concentrating its kernel singularity only on the diagonal. The theorem is therefore both analytic, through symbol estimates, and local, through the quantisation model.
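Three explicit kernels illustrate the statement. They are computed from the formula $K_P(x,y)=(2\pi)^{-n}\operatorname{Os}\!\int_{\mathbb R^n}e^{i(x-y)\cdot\xi}a(x,\xi)\,d\mathcal L^n(\xi)$, which is what the definition of $\operatorname{Op}(a)$ suggests and which we assume agrees with the formula in the theorem. The third uses the classical one-dimensional integral $\int_{\mathbb R}e^{it\xi}(1+\xi^2)^{-1}\,d\mathcal L^1(\xi)=\pi e^{-|t|}$.
\begin{align*}
a(x,\xi)=f(x),\ f\in C^\infty_b(\mathbb R^n):&\qquad K_P(x,y)=f(x)\,\delta(x-y),\\
a(x,\xi)=\xi_j:&\qquad K_P(x,y)=(D_j\delta)(x-y),\\
n=1,\ a(x,\xi)=(1+\xi^2)^{-1}:&\qquad K_P(x,y)=\tfrac12e^{-|x-y|}.
\end{align*}
The first two kernels are supported on the diagonal, matching the locality of multiplication and differentiation. The third is nonzero everywhere and smooth for $x\ne y$, but it is not differentiable across $x=y$: the operator is nonlocal, yet its kernel is singular only on the diagonal.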
That second difficulty appears when we try to let the operator act on distributions or compare it on different compact subsets. A kernel may be singular only on the diagonal and still spread compactly supported input across a noncompact set, or require a test function in one variable to be paired against a noncompact region in the other. The symbol estimates control oscillation in the frequency variable, but they do not by themselves say that the two spatial projections of the kernel support behave well. The next definition isolates exactly the support condition needed to make the kernel calculus local in both variables.
[definition: Properly Supported Operator]
Let $U\subseteq\mathbb R^n$ be open and let $P:C_c^\infty(U)\to \mathcal D'(U)$ have Schwartz kernel $K_P$. The operator $P$ is properly supported if the coordinate projections
\begin{align*}
\pi_1:\operatorname{supp}K_P\to U,
\qquad
\pi_2:\operatorname{supp}K_P\to U
\end{align*}
are proper maps.
[/definition]
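Proper support means that the preimage of every compact set under each projection of the kernel support is compact: for compact $C\subset U$, the sets $\{(x,y)\in\operatorname{supp}K_P:x\in C\}$ and $\{(x,y)\in\operatorname{supp}K_P:y\in C\}$ are compact. Consequently $P$ sends compactly supported distributions to compactly supported distributions, and the transpose has the same property.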
[remark: Proper Support by Cutoff]
Every pseudodifferential operator equals a properly supported one modulo an operator with smooth kernel. Choose a cutoff $\chi\in C^\infty(U\times U)$ whose support is proper under both projections and which equals $1$ on a neighbourhood of the diagonal. Since $K_P$ is smooth off the diagonal, $(1-\chi)K_P$ is a smooth kernel, so replacing $K_P$ by $\chi K_P$ changes the operator only by a smooth-kernel term and leaves the kernel unchanged where $\chi=1$.
[/remark]
This replacement is used throughout the local calculus. It allows statements about distributions to be proved after inserting a proper support cutoff, while all singular information near a chosen point is unchanged.
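A concrete one-dimensional instance of the cutoff, offered as a sketch: the kernel below is the standard Green's function of $1-\partial_x^2$ on $\mathbb R$, and $\chi_0$ and $R$ are names introduced only here. Let $K(x,y)=\tfrac12e^{-|x-y|}$, and let $\chi_0\in C_c^\infty(\mathbb R)$ equal $1$ on $[-1,1]$ with $\operatorname{supp}\chi_0\subset[-R,R]$. With $\chi(x,y)=\chi_0(x-y)$,
\begin{align*}
K(x,y)=\chi_0(x-y)K(x,y)+\bigl(1-\chi_0(x-y)\bigr)K(x,y).
\end{align*}
The first term is supported in $\{|x-y|\le R\}$. For compact $C\subset\mathbb R$ the sets $\{x\in C,\ |x-y|\le R\}$ and $\{y\in C,\ |x-y|\le R\}$ are closed and bounded in $\mathbb R^2$, so both projections are proper on this support. The second term vanishes for $|x-y|\le1$ and is smooth elsewhere, hence is a smooth kernel. The kink of $e^{-|x-y|}$ at $x=y$ is carried entirely by the properly supported term.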
## Smoothing Operators and Pseudolocality
The next question is what happens when the kernel has no singularity at all. Such operators are regularising: they turn distributions into smooth functions, and they are invisible to singular support.
[definition: Smoothing Operator]
Let $U\subseteq\mathbb R^n$ be open. A continuous linear map $R:\mathcal D'(U)\to C^\infty(U)$ is a smoothing operator if its Schwartz kernel is a smooth function $K_R\in C^\infty(U\times U)$ and the action is given locally by
\begin{align*}
Ru(x)=u(K_R(x,\cdot))
\end{align*}
whenever the pairing is defined.
[/definition]
Smooth kernels include familiar averaging procedures. They are the order $-\infty$ part of the calculus.
[example: Regularising Convolution Kernel]
Let $\rho\in\mathcal S(\mathbb R^n)$. For each fixed $x\in\mathbb R^n$, the function
\begin{align*}
y\mapsto \rho(x-y)
\end{align*}
belongs to $\mathcal S(\mathbb R^n)$, so every tempered distribution $u\in\mathcal S'(\mathbb R^n)$ can be paired with it. Define
\begin{align*}
Ru(x)=u(\rho(x-\cdot)).
\end{align*}
If $u$ is represented by an integrable function, this pairing is exactly
\begin{align*}
Ru(x)=\int_{\mathbb R^n}\rho(x-y)u(y)\,d\mathcal L^n(y).
\end{align*}
The associated kernel is
\begin{align*}
K_R(x,y)=\rho(x-y).
\end{align*}
Since the map $(x,y)\mapsto x-y$ is smooth and $\rho$ is smooth, $K_R$ is smooth on $\mathbb R^n\times\mathbb R^n$. For the first $x_j$-derivative,
\begin{align*}
\partial_{x_j}K_R(x,y)=(\partial_j\rho)(x-y).
\end{align*}
By induction on $|\alpha|$,
\begin{align*}
\partial_x^\alpha K_R(x,y)=(\partial^\alpha\rho)(x-y).
\end{align*}
For fixed $x$, the difference quotient
\begin{align*}
\frac{\rho(x+he_j-\cdot)-\rho(x-\cdot)}{h}
\end{align*}
converges to $(\partial_j\rho)(x-\cdot)$ in $\mathcal S(\mathbb R^n)$ as $h\to0$, because translations and derivatives are continuous in the Schwartz topology. Since $u$ is continuous on $\mathcal S(\mathbb R^n)$, applying $u$ to this convergence gives
\begin{align*}
\partial_{x_j}Ru(x)=u((\partial_j\rho)(x-\cdot)).
\end{align*}
Repeating the same argument for a multi-index $\alpha$ gives
\begin{align*}
\partial_x^\alpha Ru(x)=u((\partial^\alpha\rho)(x-\cdot)).
\end{align*}
Hence $Ru$ is smooth, and each derivative is obtained by pairing $u$ with the corresponding derivative of the convolution kernel.
[/example]
The convolution example is a model for a general principle. What needs to be proved is that smoothness of the kernel is exactly the condition that turns every distributional input into a smooth output, provided support is controlled.
[quotetheorem:7669]
[citeproof:7669]
This theorem identifies smoothing operators with the harmless part of the kernel. Pseudodifferential operators are not generally smoothing, but away from the diagonal their kernels are smooth, and that is enough to control singular support. The theorem does not say that every kernel with a diagonal singularity is pseudodifferential, nor that a finite-order pseudodifferential kernel regularises all distributions. It separates the special case of globally smooth kernels from the finite-order kernels whose singular behaviour remains concentrated on the diagonal.
Proper support is not a technical decoration in this theorem. A smooth kernel may define a perfectly good operator on compactly supported test functions while failing to act on all distributions, because $K_R(x,\cdot)$ need not have compact support and a distribution is not defined on arbitrary smooth functions. The constant-output example above shows that the two projections must be controlled separately: there $K_T(x,\cdot)=\psi$ has compact support, so $T$ itself pairs with every $u\in\mathcal D'(\mathbb R)$, but the transpose has kernel $\psi(x)$, which is constant in $y$, and so it lacks the support control needed for a global distributional action on $\mathcal D'(\mathbb R)$. Thus smoothness of the kernel is the regularity condition, while proper support is the domain condition. In non-proper settings the same kernel may still regularise compactly supported distributions or tempered distributions under extra decay assumptions, but the theorem as stated is the local $\mathcal D'$ version used in pseudolocality.
[definition: Singular Support]
Let $u\in\mathcal D'(U)$. The singular support $\operatorname{sing\,supp}u$ is the complement of the largest open subset $V\subset U$ such that the restriction $u|_V$ is induced by a smooth function on $V$.
[/definition]
With this definition, the final question of the chapter can be stated precisely. If the kernel is smooth away from the diagonal, can applying $P$ create a singularity at a point where the input distribution was already smooth?
[quotetheorem:7670]
[citeproof:7670]
Pseudolocality is the first singularity theorem of the course. It does not say that singularities are improved; elliptic parametrices later give that stronger conclusion under ellipticity. Here the conclusion is only that the calculus respects the location of singular support.
Proper support is again part of the mechanism. In the proof, the term $(1-\psi)u$ is paired with kernels over regions separated from $x_0$; proper support ensures that this pairing only sees a compact part of the input distribution when $x$ stays in a compact neighbourhood of $x_0$. Without such control, a smooth off-diagonal kernel can collect contributions from infinitely far away, and the expression near $x_0$ may fail to be defined on arbitrary distributions or may depend on extra decay assumptions not present in $\mathcal D'(U)$. The theorem also has a built-in limitation: it forbids new singular locations, but it does not reduce the strength or direction of an existing singularity. That stronger information belongs to elliptic regularity, parametrices, and later propagation-of-singularities arguments.
[example: Pseudolocal but Nonlocal Behaviour]
Let $P$ be properly supported, and suppose its kernel $K_P$ is supported in a small neighbourhood of the diagonal but has a nonzero off-diagonal smooth value: choose $x_1\ne y_1$ with $K_P(x_1,y_1)\ne0$. Since $K_P$ is smooth off the diagonal, there are neighbourhoods $V$ of $x_1$ and $W$ of $y_1$, with $V\cap W=\varnothing$, such that $K_P$ is a smooth function on $V\times W$. For simplicity assume $K_P$ is real-valued there; otherwise apply the argument to its real or imaginary part, whichever is nonzero at $(x_1,y_1)$. After shrinking $W$, continuity lets us assume that $K_P(x_1,y)$ is nonzero and has the same sign as $K_P(x_1,y_1)$ for all $y\in W$.
Choose a nonnegative bump $u\in C_c^\infty(W)$ with $u\not\equiv0$. For $x=x_1$, the kernel formula gives
\begin{align*}
Pu(x_1)=\int_{\mathbb R^n}K_P(x_1,y)u(y)\,d\mathcal L^n(y).
\end{align*}
The integrand has one fixed sign on $W$, is not identically zero, and vanishes outside $\operatorname{supp}u\subset W$, so
\begin{align*}
\int_{\mathbb R^n}K_P(x_1,y)u(y)\,d\mathcal L^n(y)=\int_W K_P(x_1,y)u(y)\,d\mathcal L^n(y)\ne0.
\end{align*}
Thus a smooth input supported near $y_1$ can affect the value of $Pu$ at the different point $x_1$; pseudodifferential operators are not local in the pointwise sense.
Now let $u\in\mathcal D'(U)$ have a singularity at $y_0$ but be smooth near a different point $x_0$. Since $x_0\notin\operatorname{sing\,supp}u$, pseudolocality gives
\begin{align*}
x_0\notin\operatorname{sing\,supp}(Pu).
\end{align*}
Equivalently, the singularity at $y_0$ may contribute smooth off-diagonal terms near $x_0$, but it cannot create a new singularity there. If $x_0$ is already in $\operatorname{sing\,supp}u$, the inclusion $\operatorname{sing\,supp}(Pu)\subseteq\operatorname{sing\,supp}u$ gives no reason for that singularity to disappear.
[/example]
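A one-dimensional instance makes both halves of the example explicit. It is a sketch under the assumption that the kernel below belongs to a properly supported pseudodifferential operator on $\mathbb R$; it differs by a smooth kernel from the Green's function $\tfrac12e^{-|x-y|}$ of $1-\partial_x^2$, and $\chi_0$ is a name introduced here. Let $\chi_0\in C_c^\infty(\mathbb R)$ equal $1$ near $0$, and let
\begin{align*}
K_P(x,y)=\tfrac12\chi_0(x-y)e^{-|x-y|}.
\end{align*}
For the input $u=\delta_0$, whose singular support is $\{0\}$,
\begin{align*}
Pu(x)=K_P(x,0)=\tfrac12\chi_0(x)e^{-|x|}.
\end{align*}
This output is nonzero at points $x\ne0$ near the origin, so the point mass at $0$ influences $Pu$ elsewhere. It is smooth for $x\ne0$ and not differentiable at $x=0$, so $\operatorname{sing\,supp}(Pu)=\{0\}=\operatorname{sing\,supp}u$: no new singular location is created, and the existing one does not disappear.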
The chapter has moved from the oscillatory formula for $\operatorname{Op}(a)$ to the kernel description of its singular behaviour. The next step in the course is to compare different quantisations and compose operators, where the same oscillatory-integral method produces asymptotic expansions of symbols.
The oscillatory integral formula for pseudodifferential operators confines the singularities of their kernels to the diagonal. Composing operators and deriving asymptotic corrections requires the same oscillatory machinery to produce symbol expansions systematically. The next chapter develops principal symbols, changes of quantisation, and the formal calculus needed for operator composition.
# 4. Amplitudes, Changes of Quantisation, and Principal Symbols
Having constructed local pseudodifferential operators from symbol classes and oscillatory integrals, we now examine how stable that construction is under the changes that occur in practice. The chapter assumes the preceding material on Fourier transform conventions, smoothing operators, asymptotic symbol expansions, and the kernel description of $\Psi^m(U)$. The operator was previously written with a left symbol $a(x,\xi)$, so the coefficient was evaluated at the output point $x$; coordinate changes, cutoffs, and adjoints naturally produce formulas where the coefficient depends on both $x$ and $y$. We therefore introduce amplitudes, compare left, right, and Weyl quantisation, and isolate the part of the symbol that survives these choices: the principal symbol.
## Amplitude Operators and Reduction to Left Symbols
The first problem is that many natural kernels are not presented with coefficients depending only on the output variable. When localising an operator by cutoffs, changing coordinates, or composing with multiplication operators, the oscillatory integral often contains a factor depending on both the source point $y$ and the target point $x$. The calculus would be unusable if each such formula represented a new class of operators, so the main question is whether this extra dependence can be converted back into an ordinary left symbol.
[definition: Amplitude of Order m]
Let $U \subset \mathbb R^n$ be open. An amplitude of order $m$ on $U$ is a smooth map
\begin{align*}
A:U \times U \times \mathbb R^n \longrightarrow \mathbb C
\end{align*}
such that for every compact $K \subset U \times U$ and all multi-indices $\alpha, \beta, \gamma$, there is a constant $C_{K\alpha\beta\gamma}>0$ with
\begin{align*}
|\partial_x^\alpha \partial_y^\beta \partial_\xi^\gamma A(x,y,\xi)| \le C_{K\alpha\beta\gamma}(1+|\xi|)^{m-|\gamma|}
\end{align*}
for all $(x,y) \in K$ and $\xi \in \mathbb R^n$.
[/definition]
The point of this definition is that the estimates are symbolic only in the covariable $\xi$, while $x$ and $y$ are treated locally and smoothly. To turn such an object into an operator, we keep the same Fourier phase as before and allow the coefficient to see both endpoints of the kernel.
[definition: Amplitude Operator]
Let $A$ be an amplitude of order $m$ on $U$. The amplitude operator $I_A:C_c^\infty(U) \to \mathcal{D}^{\prime}(U)$ is defined by
\begin{align*}
I_Au(x)=(2\pi)^{-n}\int_{\mathbb R^n}\int_U e^{i(x-y)\cdot\xi}A(x,y,\xi)u(y)\,dy\,d\xi,
\end{align*}
where the integral is understood as an oscillatory integral.
[/definition]
This formula generalises the left quantisation formula by replacing $a(x,\xi)$ with $A(x,y,\xi)$. The obstruction is that $y$-dependence prevents us from reading off an ordinary symbol: the kernel coefficient now sees both endpoints, while a left symbol is allowed to see only the output point $x$ and the covariable $\xi$. For the calculus to be closed under the operations that naturally create such kernels, the off-diagonal $y$-dependence must be removable up to a smoothing error. The needed mechanism is a Taylor expansion along the diagonal $y=x$, with integration by parts converting each off-diagonal factor into a lower-order $\xi$-derivative.
[quotetheorem:7671]
[citeproof:7671]
The theorem says that amplitudes do not enlarge the pseudodifferential calculus modulo smoothing operators. The symbolic estimates in $\xi$ are essential: they make the Taylor remainder lose order after each integration by parts, so the error can be forced into $S^{m-N}$ for arbitrary $N$. Smooth local dependence on $(x,y)$ is also needed because the proof differentiates the amplitude repeatedly along the diagonal $y=x$; a merely continuous kernel coefficient would not produce the displayed derivative expansion. Thus the theorem is not a statement about arbitrary oscillatory kernels, but about kernels whose off-diagonal dependence is controlled in exactly the symbolic way needed by the calculus. Amplitudes are still indispensable as an intermediate language, because the operations that produce new operators often produce amplitudes before the reduction theorem returns them to symbols.
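The first steps of the reduction can be written out explicitly. The following is a sketch of the standard argument and is not meant to replace the cited proof. Taylor expansion in $y$ about $y=x$ gives
\begin{align*}
A(x,y,\xi)=\sum_{|\alpha|<N}\frac{1}{\alpha!}(y-x)^\alpha(\partial_y^\alpha A)(x,x,\xi)+R_N(x,y,\xi),
\end{align*}
where $R_N$ vanishes to order $N$ on the diagonal. With $D_\xi=-i\partial_\xi$ we have $D_{\xi_j}e^{i(x-y)\cdot\xi}=(x_j-y_j)e^{i(x-y)\cdot\xi}$, hence
\begin{align*}
(y-x)^\alpha e^{i(x-y)\cdot\xi}=(-D_\xi)^\alpha e^{i(x-y)\cdot\xi}.
\end{align*}
Integrating by parts in $\xi$ moves $(-D_\xi)^\alpha$ onto the amplitude as $D_\xi^\alpha$, so the term of index $\alpha$ contributes the left symbol
\begin{align*}
\frac{1}{\alpha!}D_\xi^\alpha\partial_y^\alpha A(x,y,\xi)\big|_{y=x}
=\frac{1}{\alpha!}\partial_\xi^\alpha D_y^\alpha A(x,y,\xi)\big|_{y=x},
\end{align*}
using $D_\xi^\alpha\partial_y^\alpha=(-i)^{|\alpha|}\partial_\xi^\alpha\partial_y^\alpha=\partial_\xi^\alpha D_y^\alpha$. Each $\xi$-derivative lowers the order by one, so this term has order $m-|\alpha|$.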
[example: Multiplication on Both Sides]
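Let $P$ be the operator with left symbol $a(x,\xi)$ of order $m$ on $U$, and let $b,c\in C^\infty(U)$. We compute the left symbol of the operator $bPc:u\mapsto b\,P(cu)$. Let $u\in C_c^\infty(U)$. Since multiplication by $c$ sends $u$ to $cu$, the left quantisation formula gives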
\begin{align*}
P(cu)(x)=(2\pi)^{-n}\int_{\mathbb R^n}\int_U e^{i(x-y)\cdot\xi}a(x,\xi)c(y)u(y)\,dy\,d\xi.
\end{align*}
Multiplying the output by $b(x)$ gives
\begin{align*}
(bPc)u(x)=(2\pi)^{-n}\int_{\mathbb R^n}\int_U e^{i(x-y)\cdot\xi}b(x)a(x,\xi)c(y)u(y)\,dy\,d\xi.
\end{align*}
Thus $bPc$ is the amplitude operator with
\begin{align*}
A(x,y,\xi)=b(x)a(x,\xi)c(y).
\end{align*}
Applying the *[Amplitude Reduction Theorem](/theorems/7671)* to this amplitude, a left symbol for $bPc$ is asymptotic to
\begin{align*}
\sum_{|\alpha|\ge 0}\frac{1}{\alpha!}\partial_\xi^\alpha D_y^\alpha\bigl(b(x)a(x,\xi)c(y)\bigr)\big|_{y=x}.
\end{align*}
Because $b(x)$ and $a(x,\xi)$ are independent of $y$,
\begin{align*}
D_y^\alpha\bigl(b(x)a(x,\xi)c(y)\bigr)=b(x)a(x,\xi)D_y^\alpha c(y).
\end{align*}
Because $D_y^\alpha c(y)$ is independent of $\xi$,
\begin{align*}
\partial_\xi^\alpha\bigl(b(x)a(x,\xi)D_y^\alpha c(y)\bigr)=\partial_\xi^\alpha\bigl(b(x)a(x,\xi)\bigr)D_y^\alpha c(y).
\end{align*}
Therefore the reduced left symbol has expansion
\begin{align*}
\sum_{|\alpha|\ge 0}\frac{1}{\alpha!}\partial_\xi^\alpha\bigl(b(x)a(x,\xi)\bigr)D_y^\alpha c(y)\big|_{y=x}.
\end{align*}
The term with $\alpha=0$ is
\begin{align*}
\partial_\xi^0\bigl(b(x)a(x,\xi)\bigr)D_y^0c(y)\big|_{y=x}=b(x)a(x,\xi)c(x).
\end{align*}
If $|\alpha|\ge 1$, then $\partial_\xi^\alpha a\in S^{m-|\alpha|}$, while multiplication by the smooth factors $b(x)$ and $D_y^\alpha c(x)$ does not change symbolic order locally. Hence every nonzero-$\alpha$ term has order at most $m-1$. The top-order symbol of $bPc$ is therefore $b(x)c(x)a(x,\xi)$, and the lower-order terms measure the failure of the right cutoff $c(y)$ to behave exactly like $c(x)$ inside the oscillatory kernel.
[/example]
This example is the basic local mechanism behind cutoff manipulations. The top-order part behaves as if multiplication by smooth functions commuted with the operator, while lower-order terms record the failure of exact commutation.
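As a sanity check on the expansion, consider the hypothetical special case $b\equiv 1$ and $a(x,\xi)=\xi_1$, so that $P=D_{x_1}$; this choice is an illustration and is not part of the example above. Only $\alpha=0$ and $\alpha=e_1$ contribute, because $\partial_\xi^\alpha\xi_1=0$ for every other multi-index:
\begin{align*}
\sum_{|\alpha|\ge 0}\frac{1}{\alpha!}\partial_\xi^\alpha(\xi_1)\,D_y^\alpha c(y)\big|_{y=x}=c(x)\xi_1+D_{x_1}c(x).
\end{align*}
The product rule gives the same answer directly,
\begin{align*}
D_{x_1}(cu)=c\,D_{x_1}u+(D_{x_1}c)\,u,
\end{align*}
so in this case the expansion terminates and is exact: the principal part is $c(x)\xi_1$, and the order-zero term $D_{x_1}c$ is precisely the failure of $c$ to commute with $D_{x_1}$.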
## Right, Left, and Weyl Quantisation
The next question is how much of a pseudodifferential operator depends on the convention used to place the coefficient. Left quantisation evaluates the symbol at $x$, right quantisation evaluates it at $y$, and Weyl quantisation evaluates it at the midpoint. Since differential operators and quantum-mechanical observables naturally lead to different conventions, the calculus needs a systematic translation between them.
[definition: Left Quantisation]
For $a \in S^m(U\times\mathbb R^n)$, the left quantisation of $a$ is the operator
\begin{align*}
\operatorname{Op}_L(a):C_c^\infty(U)\longrightarrow \mathcal{D}'(U)
\end{align*}
defined by
\begin{align*}
\operatorname{Op}_L(a)u(x)=(2\pi)^{-n}\int_{\mathbb R^n}\int_U e^{i(x-y)\cdot\xi}a(x,\xi)u(y)\,dy\,d\xi,
\end{align*}
where the integral is understood as an oscillatory integral.
[/definition]
The left convention is the one used for most of the local calculus because it keeps the output variable visible in the symbol. To compare it with the convention where coefficients act at the input point, we keep the same phase and shift the symbol from $x$ to $y$.
[definition: Right Quantisation]
For $a \in S^m(U\times\mathbb R^n)$, the right quantisation of $a$ is the operator
\begin{align*}
\operatorname{Op}_R(a):C_c^\infty(U)\longrightarrow \mathcal{D}'(U)
\end{align*}
defined by
\begin{align*}
\operatorname{Op}_R(a)u(x)=(2\pi)^{-n}\int_{\mathbb R^n}\int_U e^{i(x-y)\cdot\xi}a(y,\xi)u(y)\,dy\,d\xi,
\end{align*}
where the integral is understood as an oscillatory integral.
[/definition]
Right quantisation has the same kernel phase as left quantisation but places the coefficient at the input point. It still privileges one endpoint of the kernel, so it does not address the symmetry question that arises when adjoints and self-adjoint model operators are central; that question motivates placing the coefficient at the midpoint.
[definition: Weyl Quantisation]
For $a \in S^m(U\times\mathbb R^n)$, the Weyl quantisation of $a$ is the operator
\begin{align*}
\operatorname{Op}_W(a):C_c^\infty(U)\longrightarrow \mathcal{D}'(U)
\end{align*}
defined by
\begin{align*}
\operatorname{Op}_W(a)u(x)=(2\pi)^{-n}\int_{\mathbb R^n}\int_U e^{i(x-y)\cdot\xi}a\left(\frac{x+y}{2},\xi\right)u(y)\,dy\,d\xi,
\end{align*}
where the integral is understood as an oscillatory integral.
[/definition]
The Weyl convention is symmetric in $x$ and $y$, which is why it interacts well with formal adjoints. The cost is that it is no longer a left symbol formula, so we need a single translation theorem that includes left, right, and Weyl conventions at once.
[quotetheorem:7672]
[citeproof:7672]
This result separates the operator from the bookkeeping convention. The fixed parameter $t$ matters because the chain-rule factors $t^{|\alpha|}$ are then harmless constants in the symbolic expansion; allowing $t$ to vary with $(x,\xi)$ would introduce new derivatives and is not covered by the formula. The symbol estimates are also essential: without decay of order under $\partial_\xi$, the correction terms need not fall from order $m$ to order $m-1$, so the principal class could depend on the convention. For example, if one tried to use the oscillatory coefficient $a(x,\xi)=e^{ix\cdot \xi}$, then $\partial_{\xi_j}a=ix_j e^{ix\cdot\xi}$ has no lower symbolic order than $a$ on a compact set with $x_j\ne 0$. The first correction term in the change-of-quantisation expansion is then the same size as the supposed leading term, so the argument that left and Weyl conventions agree modulo one lower order breaks down. The theorem therefore identifies only the leading class, not a canonical full symbol shared by all quantisations. Quantisation choices affect subprincipal and lower-order information, but the top-order class is shared.
[example: Weyl Quantisation of $x_j\xi_k$]
On $\mathbb R^n$, take the Weyl symbol $a(x,\xi)=x_j\xi_k$. For $u\in C_c^\infty(\mathbb R^n)$, the Weyl formula gives
\begin{align*}
\operatorname{Op}_W(x_j\xi_k)u(x)=(2\pi)^{-n}\int_{\mathbb R^n}\int_{\mathbb R^n} e^{i(x-y)\cdot\xi}\frac{x_j+y_j}{2}\xi_k u(y)\,dy\,d\xi.
\end{align*}
Since $D_{x_k}=-i\partial_{x_k}$, we have
\begin{align*}
D_{x_k}e^{i(x-y)\cdot\xi}=\xi_k e^{i(x-y)\cdot\xi}.
\end{align*}
Splitting the midpoint factor into its two terms, the factor $x_j$ comes outside the integral, and pulling $D_{x_k}$ out of the remaining integral leaves the [Fourier inversion formula](/theorems/528) for $u$. The $x_j$ contribution is therefore
\begin{align*}
\frac{x_j}{2}(2\pi)^{-n}\int_{\mathbb R^n}\int_{\mathbb R^n} e^{i(x-y)\cdot\xi}\xi_k u(y)\,dy\,d\xi=\frac{1}{2}x_jD_{x_k}u(x).
\end{align*}
The $y_j$ contribution is handled in the same way, with Fourier inversion applied to the test function $y\mapsto y_j u(y)$:
\begin{align*}
\frac{1}{2}(2\pi)^{-n}\int_{\mathbb R^n}\int_{\mathbb R^n} e^{i(x-y)\cdot\xi}\xi_k y_j u(y)\,dy\,d\xi=\frac{1}{2}D_{x_k}(x_j u)(x).
\end{align*}
Therefore
\begin{align*}
\operatorname{Op}_W(x_j\xi_k)u(x)=\frac{1}{2}\bigl(x_jD_{x_k}u(x)+D_{x_k}(x_j u)(x)\bigr).
\end{align*}
Expanding the second term by the product rule,
\begin{align*}
D_{x_k}(x_j u)=-i\partial_{x_k}(x_j u)=-i\delta_{jk}u+x_jD_{x_k}u.
\end{align*}
Hence
\begin{align*}
\operatorname{Op}_W(x_j\xi_k)u=x_jD_{x_k}u-\frac{i}{2}\delta_{jk}u.
\end{align*}
The Weyl rule symmetrises multiplication by $x_j$ and differentiation by $D_{x_k}$, and the extra scalar term is order zero, so the principal symbol remains $x_j\xi_k$.
[/example]
The correction term in the example is not an error; it is the signature of a convention designed to preserve symmetry.
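For comparison, the same symbol can be run through the other two conventions. The following short check uses only the definitions of left and right quantisation and the Fourier inversion step from the example:
\begin{align*}
\operatorname{Op}_L(x_j\xi_k)u&=x_jD_{x_k}u,\\
\operatorname{Op}_R(x_j\xi_k)u&=D_{x_k}(x_ju)=x_jD_{x_k}u-i\delta_{jk}u,\\
\operatorname{Op}_W(x_j\xi_k)u&=\tfrac12\bigl(\operatorname{Op}_L(x_j\xi_k)+\operatorname{Op}_R(x_j\xi_k)\bigr)u=x_jD_{x_k}u-\tfrac{i}{2}\delta_{jk}u.
\end{align*}
The averaging identity in the last line is special to symbols that are affine in $x$, since then $a\bigl(\tfrac{x+y}{2},\xi\bigr)=\tfrac12\bigl(a(x,\xi)+a(y,\xi)\bigr)$; it should not be expected for general symbols. The three operators differ only by constant multiples of the identity, which have order zero.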
[example: First-Order Vector Fields]
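Let $V=\sum_{j=1}^n a_j(x)D_{x_j}+b(x)$ be a first-order differential operator with smooth coefficients on $U$, and let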
\begin{align*}
w(x,\xi)=\sum_{j=1}^n a_j(x)\xi_j+w_0(x)
\end{align*}
be a Weyl symbol whose quantisation equals $V$. To compare with the known left symbol
\begin{align*}
v_L(x,\xi)=\sum_{j=1}^n a_j(x)\xi_j+b(x),
\end{align*}
use the change-of-quantisation formula with $t=\frac12$. Since $w$ is affine in $\xi$, all terms with $|\alpha|\ge 2$ vanish, so its left symbol is
\begin{align*}
w_L(x,\xi)=w(x,\xi)+\frac12\sum_{\ell=1}^n \partial_{\xi_\ell}D_{x_\ell}w(x,\xi).
\end{align*}
For each $\ell$,
\begin{align*}
D_{x_\ell}w(x,\xi)=\sum_{j=1}^n D_{x_\ell}a_j(x)\xi_j+D_{x_\ell}w_0(x).
\end{align*}
Taking $\partial_{\xi_\ell}$ gives
\begin{align*}
\partial_{\xi_\ell}D_{x_\ell}w(x,\xi)=D_{x_\ell}a_\ell(x),
\end{align*}
because $D_{x_\ell}w_0(x)$ is independent of $\xi$ and $\partial_{\xi_\ell}\xi_j=\delta_{\ell j}$. Therefore
\begin{align*}
w_L(x,\xi)=\sum_{j=1}^n a_j(x)\xi_j+w_0(x)+\frac12\sum_{\ell=1}^n D_{x_\ell}a_\ell(x).
\end{align*}
Equating this with $v_L$ forces
\begin{align*}
w_0(x)=b(x)-\frac12\sum_{\ell=1}^n D_{x_\ell}a_\ell(x).
\end{align*}
Since $D_{x_\ell}=-i\partial_{x_\ell}$, this may also be written as
\begin{align*}
w_0(x)=b(x)+\frac{i}{2}\sum_{\ell=1}^n \partial_{x_\ell}a_\ell(x).
\end{align*}
Thus the Weyl symbol of $V$ is
\begin{align*}
w(x,\xi)=\sum_{j=1}^n a_j(x)\xi_j+b(x)-\frac12\sum_{\ell=1}^n D_{x_\ell}a_\ell(x).
\end{align*}
The order-one part is still $\sum_j a_j(x)\xi_j$, while changing from left to Weyl quantisation changes only the order-zero scalar term.
[/example]
First-order operators already show the pattern that will dominate the rest of the course. The leading homogeneous expression is geometric, while lower-order terms depend on choices of coordinates, densities, and quantisation.
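A one-dimensional illustration, with the symmetrised operator chosen here as a hypothetical test case: let $a\in C^\infty(\mathbb R)$ be real-valued and set $V=\tfrac12\bigl(a\,D_x+D_x\circ a\bigr)$, where $D_x\circ a$ denotes the composition $u\mapsto D_x(au)$ and $(D_xa)$ denotes the function $-ia'$. The product rule gives
\begin{align*}
Vu=\tfrac12\bigl(a\,D_xu+D_x(au)\bigr)=a\,D_xu+\tfrac12(D_xa)\,u,
\end{align*}
so the left symbol is $a(x)\xi+b(x)$ with $b=\tfrac12(D_xa)$. The formula for $w_0$ from the example then yields
\begin{align*}
w_0=b-\tfrac12(D_xa)=0,\qquad w(x,\xi)=a(x)\xi.
\end{align*}
Thus the Weyl symbol of the symmetrised operator is the real function $a(x)\xi$, while its left symbol carries the imaginary correction $-\tfrac{i}{2}a'(x)$.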
## Principal Symbols and the Top-Order Quotient
The final question in this chapter is what part of a pseudodifferential operator survives all of these choices. Since changing quantisation alters only lower-order terms, the order-$m$ component should be an invariant of the operator modulo $\Psi^{m-1}$. The principal symbol formalises this invariant and turns the filtered algebra of operators into a graded object at its top layer.
[definition: Principal Symbol Class]
Let $a\in S^m(U\times\mathbb R^n)$. The principal symbol class of $a$ is its equivalence class
\begin{align*}
[a] \in S^m(U\times\mathbb R^n)/S^{m-1}(U\times\mathbb R^n).
\end{align*}
The principal symbol map is
\begin{align*}
\sigma_m:\Psi^m(U)\longrightarrow S^m(U\times\mathbb R^n)/S^{m-1}(U\times\mathbb R^n).
\end{align*}
For $P\in \Psi^m(U)$ represented by $P=\operatorname{Op}_L(a)$ modulo smoothing operators, it is defined by
\begin{align*}
\sigma_m(P)=[a].
\end{align*}
[/definition]
Here $\Psi^m(U)$ is used with the convention established in the local calculus: two full symbol representatives define the same order-$m$ operator when their difference is smoothing. Thus the map records the top-order class of an operator modulo smoothing remainders before it forgets the additional $S^{m-1}$ part. This definition is meaningful only if different symbols representing the same operator give the same class, and the next result gives both well-definedness and the exact algebraic statement.
[quotetheorem:7673]
[citeproof:7673]
Exactness is the algebraic reason that principal symbols can be manipulated without committing to a full symbol. The hypotheses encode the chosen filtered pseudodifferential calculus: operators are identified modulo smoothing remainders, symbols are read through left representatives, and the uniqueness of the top-order symbol is measured only up to $S^{m-1}$. Each part is needed. If the quotient by $S^{m-1}$ were omitted, the same principal behaviour would be split into incompatible full-symbol data: $D_{x_1}$ and $D_{x_1}+r(x)$, with $r\in C^\infty(U)$, have full left symbols $\xi_1$ and $\xi_1+r(x)$ but the same order-one class. If smoothing remainders were not ignored, a rapidly decaying symbol such as $e^{-|\xi|^2}$ would give a smoothing operator whose full symbol is not literally zero, so the zero operator in the quotient calculus would fail to have a unique representative. If the operators were not filtered by order, the kernel statement would also fail: multiplication by a nonzero smooth function belongs to $\Psi^0(U)$, but it cannot lie in the kernel of an order-one principal symbol map unless it is first regarded as an element of the lower filtered piece $\Psi^0(U)=\Psi^{1-1}(U)$. Finally, fixing left representatives is a local bookkeeping convention; changing to Weyl or right representatives changes lower-order terms, so the theorem would be underdetermined without specifying which representative is used before passing to the quotient. These examples show why the quotients and filtration are part of the statement rather than technical decoration. The quotient also explains why ellipticity is a leading-order condition: invertibility of the top-order class is the obstruction that a parametrix must remove first.
[remark: Homogeneous Principal Symbols]
For classical symbols with an expansion in homogeneous components, the quotient class $S^m/S^{m-1}$ is represented by the homogeneous degree-$m$ term $a_m(x,\xi)$ for $|\xi|\ge 1$. In the non-classical $S^m_{1,0}$ calculus, the quotient class still exists, but it need not be represented by a homogeneous function. The quotient formulation is therefore the more general definition.
[/remark]
The homogeneous picture is useful for intuition, but the quotient picture is what survives in the full calculus. It also gives the coordinate-free interpretation of the principal symbol.
[explanation: Coordinate-Free Meaning]
Under a change of coordinates $x=\kappa(\tilde x)$, the oscillatory phase transforms by
\begin{align*}
(x-y)\cdot\xi = (\tilde x-\tilde y)\cdot \tilde\xi + O(|\tilde x-\tilde y|^2|\xi|)
\end{align*}
near the diagonal $\tilde x=\tilde y$, where $\tilde\xi=(D\kappa_{\tilde x})^\top\xi$. Thus the leading symbol transforms as a function on cotangent variables. Lower-order terms absorb the Jacobian factors, Taylor remainders, and derivatives of the coordinate map. Hence the principal symbol of an order-$m$ operator on a manifold is a well-defined function on $T^*M\setminus 0$, or in the non-homogeneous setting a class modulo order $m-1$ symbols on the cotangent bundle.
[/explanation]
[illustration:cotangent-coordinate-change-near-diagonal]
This coordinate-free meaning is the bridge from the local Euclidean construction to the global pseudodifferential calculus. Operators are still defined by charts and local oscillatory integrals, but their leading symbols patch as geometric objects on the cotangent bundle.
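In the simplest case the remainder in the phase vanishes identically. As a sketch, assume a linear change of variables $x=A\tilde x$ with $A$ an invertible constant matrix (a hypothetical special case, not the general situation treated above):
\begin{align*}
(x-y)\cdot\xi=A(\tilde x-\tilde y)\cdot\xi=(\tilde x-\tilde y)\cdot A^\top\xi,\qquad \tilde\xi=A^\top\xi.
\end{align*}
For the flat Laplacian $-\Delta=\sum_jD_{x_j}^2$, with symbol $|\xi|^2$, the chain rule gives $D_{x_j}=\sum_k(A^{-1})_{kj}D_{\tilde x_k}$, so in the new coordinates the symbol is
\begin{align*}
\sum_{j=1}^n\Bigl(\sum_{k=1}^n(A^{-1})_{kj}\tilde\xi_k\Bigr)^2=\bigl|A^{-\top}\tilde\xi\bigr|^2,
\end{align*}
which is exactly $|\xi|^2$ rewritten through $\xi=A^{-\top}\tilde\xi$. The symbol is therefore the same function on cotangent variables in both coordinate systems, and in this constant-coefficient case no lower-order corrections appear.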
[example: Principal Symbol of a Laplace-Type Operator]
Let $U\subset\mathbb R^n$ and let
\begin{align*}
L=-\sum_{i,j=1}^n \partial_{x_i}\bigl(g^{ij}(x)\partial_{x_j}\bigr)+\sum_{j=1}^n b_j(x)\partial_{x_j}+c(x),
\end{align*}
where $(g^{ij}(x))$ is smooth positive definite and all coefficients are smooth. Since $D_{x_j}=-i\partial_{x_j}$, we have $\partial_{x_j}=iD_{x_j}$. Applying the product rule to the second-order part gives, for $u\in C_c^\infty(U)$,
\begin{align*}
-\partial_{x_i}\bigl(g^{ij}\partial_{x_j}u\bigr)=-(\partial_{x_i}g^{ij})\partial_{x_j}u-g^{ij}\partial_{x_i}\partial_{x_j}u.
\end{align*}
Substituting $\partial_{x_j}=iD_{x_j}$ into the first term gives
\begin{align*}
-(\partial_{x_i}g^{ij})\partial_{x_j}u=-i(\partial_{x_i}g^{ij})D_{x_j}u.
\end{align*}
For the second term,
\begin{align*}
-g^{ij}\partial_{x_i}\partial_{x_j}u=-g^{ij}(iD_{x_i})(iD_{x_j})u=g^{ij}D_{x_i}D_{x_j}u.
\end{align*}
Thus
\begin{align*}
Lu=\sum_{i,j=1}^n g^{ij}D_{x_i}D_{x_j}u-i\sum_{i,j=1}^n(\partial_{x_i}g^{ij})D_{x_j}u+i\sum_{j=1}^n b_jD_{x_j}u+cu.
\end{align*}
In left quantisation, $D_{x_i}D_{x_j}$ has symbol $\xi_i\xi_j$, multiplication by $g^{ij}(x)$ places the coefficient at the output point $x$, and each single $D_{x_j}$ contributes only one power of $\xi$. Hence a left symbol for $L$ is
\begin{align*}
\ell(x,\xi)=\sum_{i,j=1}^n g^{ij}(x)\xi_i\xi_j-i\sum_{i,j=1}^n(\partial_{x_i}g^{ij})(x)\xi_j+i\sum_{j=1}^n b_j(x)\xi_j+c(x).
\end{align*}
The last three terms lie in $S^1$, so in the quotient $S^2/S^1$ only the quadratic part remains:
\begin{align*}
\sigma_2(L)(x,\xi)=\sum_{i,j=1}^n g^{ij}(x)\xi_i\xi_j \quad \operatorname{mod} S^1.
\end{align*}
If $K\subset U$ is compact, the function $(x,\eta)\mapsto \sum_{i,j}g^{ij}(x)\eta_i\eta_j$ is continuous and positive on $K\times S^{n-1}$, so it has a positive minimum $\lambda_K$. Therefore
\begin{align*}
\sum_{i,j=1}^n g^{ij}(x)\xi_i\xi_j\ge \lambda_K|\xi|^2
\end{align*}
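for all $x\in K$ and $\xi\in\mathbb R^n$, by applying the minimum to $\eta=\xi/|\xi|$ when $\xi\ne 0$ and using homogeneity of degree two. This is the elliptic lower bound for the principal symbol.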
[/example]
The Laplace-type example illustrates the role of cotangent variables in PDE and gives a test case for any proposed symbol rule. To use principal symbols as part of an algebra, we next need to know how the leading classes behave under composition.
[quotetheorem:7674]
[citeproof:7674]
This multiplicativity is the computational payoff of the principal-symbol sequence. Its proof depends on the full symbolic composition expansion, not just on the existence of kernels: differentiating one symbol in $\xi$ is what forces every non-leading term to lose at least one order. The order and symbol hypotheses are necessary for the displayed target order. If a coefficient is not a symbol, the product can create leading terms outside the predicted class; for instance the oscillatory coefficient $a(x,\xi)=e^{ix\cdot\xi}$ satisfies neither the symbol estimates nor the order-lowering rule, since $\partial_{\xi_j}a=ix_j e^{ix\cdot\xi}$ remains the same size on compact sets with $x_j\ne 0$. The composition expansion would then have infinitely many correction terms of the same apparent order, so there is no well-defined principal product in $S^{m+m'}/S^{m+m'-1}$. If the quotient by one lower order is dropped even within the calculus, the product rule is false as an equality of full operators: take $P=\operatorname{Op}_L(\xi_1)$ and let $Q$ be multiplication by $x_1$. Then $PQ=D_{x_1}x_1=x_1D_{x_1}-i$, while $QP=x_1D_{x_1}$. Both products have the same order-one principal symbol $x_1\xi_1$, but they differ by the order-zero operator $-i$. Thus replacing equality of operators by equality of principal symbols, and quotienting by one lower order, is what makes the product rule true. The statement deliberately discards all lower-order information, so it does not determine the subprincipal symbol of $PQ$ or the leading term of a commutator. It makes the highest-order part of the calculus commutative even when the operators themselves fail to commute, since the commutator loses one order.
[remark: Commutators Lose One Order]
For $P\in\Psi^m(U)$ and $Q\in\Psi^{m'}(U)$, the principal symbols of $PQ$ and $QP$ agree in order $m+m'$. Hence $[P,Q]=PQ-QP$ belongs to $\Psi^{m+m'-1}(U)$. The next chapter refines this statement by identifying the leading symbol of the commutator with the Poisson bracket.
[/remark]
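A one-dimensional check, with the two first-order operators chosen here purely as an illustration: let $P=a(x)D_x$ and $Q=b(x)D_x$ with $a,b\in C^\infty(U)$ and $U\subset\mathbb R$. The product rule gives
\begin{align*}
PQu&=ab\,D_x^2u+a\,(D_xb)\,D_xu,\\
QPu&=ab\,D_x^2u+b\,(D_xa)\,D_xu,\\
[P,Q]u&=\bigl(a\,(D_xb)-b\,(D_xa)\bigr)D_xu.
\end{align*}
Both products have principal symbol $a(x)b(x)\xi^2$ in order $2$, and the commutator is a first-order operator, of order $m+m'-1=1$ as the remark predicts.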
The preceding chapters have defined symbols, quantisation, and principal symbols, all with the understanding that infinitely many lower-order corrections remain. Asymptotic expansions formalise this: an infinite formal series encodes all lower-order terms without requiring pointwise convergence. This asymptotic bookkeeping is the foundation that makes symbolic algebra closed under composition, adjoints, and other operations.
# 5. Asymptotic Expansions and Symbolic Algebra
Asymptotic expansions are the bookkeeping mechanism that lets the pseudodifferential calculus keep infinitely many lower-order corrections without requiring an actual convergent series. The preceding chapters defined symbols and quantisation; this chapter explains how formal symbolic expressions are converted into genuine symbols, how Taylor expansion in oscillatory integrals produces these expressions, and how principal and subprincipal terms record the first pieces of the calculus. The guiding principle is that equality at every symbolic order is weaker than pointwise convergence but strong enough to control operators modulo smoothing errors.
## Formal Asymptotic Expansions
When a symbolic calculation produces a sequence of terms of decreasing orders, the first question is what it means to add them. In ordinary analysis an infinite sum asks for convergence in a fixed topology; in symbolic analysis the relevant requirement is that after subtracting finitely many terms, the error drops to the next prescribed order.
[definition: Asymptotic Expansion of Symbols]
Let $U \subset \mathbb R^n$ be open, let $m_0 > m_1 > m_2 > \cdots$ with $m_j \to -\infty$, and let $a_j \in S^{m_j}_{1,0}(U \times \mathbb R^n)$. For $a \in S^{m_0}_{1,0}(U \times \mathbb R^n)$, we write
\begin{align*}
a \sim \sum_{j=0}^\infty a_j
\end{align*}
if, for every $N \ge 1$,
\begin{align*}
a - \sum_{j=0}^{N-1} a_j \in S^{m_N}_{1,0}(U \times \mathbb R^n).
\end{align*}
[/definition]
The definition says that the first $N$ terms determine $a$ up to order $m_N$. The series need not converge as functions of $(x,\xi)$; only the filtered remainders matter.
[example: Geometric Symbol of Negative Orders]
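Let $t=\langle\xi\rangle^{-1}$ and choose $\chi\in C^\infty(\mathbb R^n)$ with $\chi(\xi)=0$ for small $|\xi|$ and $\chi(\xi)=1$ for large $|\xi|$. The formal series under consideration is $\sum_{j\ge 0}\langle\xi\rangle^{-j-1}$, whose terms have orders $-1,-2,\ldots$. On the region where $\chi=1$, the proposed asymptotic sum is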
\begin{align*}
a(\xi)=\frac{t}{1-t}=\langle \xi\rangle^{-1}\bigl(1-\langle \xi\rangle^{-1}\bigr)^{-1}.
\end{align*}
For every $N\ge 1$, the finite geometric identity follows from
\begin{align*}
(1-t)\sum_{j=0}^{N-1}t^{j+1}=t-t^{N+1}.
\end{align*}
Dividing by $1-t$ gives
\begin{align*}
\frac{t}{1-t}-\sum_{j=0}^{N-1}t^{j+1}=\frac{t^{N+1}}{1-t}.
\end{align*}
Substituting $t=\langle\xi\rangle^{-1}$, the large-frequency remainder is
\begin{align*}
\langle\xi\rangle^{-1}\bigl(1-\langle\xi\rangle^{-1}\bigr)^{-1}-\sum_{j=0}^{N-1}\langle\xi\rangle^{-j-1}=\frac{\langle\xi\rangle^{-N-1}}{1-\langle\xi\rangle^{-1}}.
\end{align*}
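After multiplying by the high-frequency cutoff, the factor $\chi(\xi)\bigl(1-\langle\xi\rangle^{-1}\bigr)^{-1}$ is smooth and lies in $S^0_{1,0}$: on the support of $\chi$ the denominator $1-\langle\xi\rangle^{-1}$ is bounded away from zero, and each $\xi$-derivative of $\langle\xi\rangle^{-1}$ gains one order of decay. Hence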
\begin{align*}
\chi(\xi)\frac{\langle\xi\rangle^{-N-1}}{1-\langle\xi\rangle^{-1}}\in S^{-N-1}_{1,0}(\mathbb R^n\times\mathbb R^n).
\end{align*}
Thus subtracting the first $N$ powers leaves exactly the next symbolic order, while any low-frequency modification introduced by $\chi$ is compactly supported in $\xi$ and therefore smoothing.
[/example]
This example also shows why low frequencies are treated separately. Symbol order is an estimate as $|\xi| \to \infty$, so modifying a candidate asymptotic sum on a compact set in $\xi$ only changes it by a smoothing symbol. The remaining issue is existence: a formal list of decreasing-order symbols should always be realisable by an actual symbol, otherwise formal parametrix calculations would have no analytic representative.
[quotetheorem:7675]
[citeproof:7675]
The theorem is the symbolic analogue of Borel's theorem: arbitrary formal data of decreasing orders can be realised. The strict decrease to $-\infty$ is essential, because if infinitely many terms all had the same order then no finite truncation would force the tail into lower symbol classes; for instance, repeatedly adding a nonzero order-zero symbol would not define an asymptotic expansion with improving remainders. The theorem does not assert pointwise convergence of $\sum_j a_j$, nor does it give a canonical preferred sum; the cutoffs used in the construction make choices at low frequency. The uniqueness statement is exactly the amount of uniqueness needed for operators, since smoothing differences are usually negligible in microlocal arguments.
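A standard illustration of realisation without convergence, sketched here with the hypothetical data $a_j(\xi)=j!\,\langle\xi\rangle^{-j}\in S^{-j}_{1,0}$: the orders decrease to $-\infty$, but for every fixed $\xi$ the ratio of consecutive terms is
\begin{align*}
\frac{a_{j+1}(\xi)}{a_j(\xi)}=\frac{j+1}{\langle\xi\rangle}\longrightarrow\infty\qquad (j\to\infty),
\end{align*}
so $\sum_ja_j(\xi)$ diverges at every point. A common way to realise such data, which may differ in detail from the construction used in the proof above, is to delay the $j$th term to frequencies $|\xi|\gtrsim R_j$:
\begin{align*}
a(\xi)=\sum_{j=0}^\infty\chi\!\left(\frac{\xi}{R_j}\right)a_j(\xi),\qquad \chi=0\text{ near }0,\quad \chi=1\text{ near }\infty,\quad R_j\to\infty\text{ sufficiently fast}.
\end{align*}
For each fixed $\xi$ only finitely many terms are nonzero, so the sum is locally finite even though the formal series diverges.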
[remark: Nonuniqueness of the Sum]
The notation $a \sim \sum_j a_j$ does not name a canonical function. It names an equivalence class modulo $S^{-\infty}$ together with specified representatives for successive homogeneous or filtered pieces.
[/remark]
This nonuniqueness is harmless only if every representative of the asymptotic class behaves the same way under finite truncation. In applications one never uses the whole formal series at once: a parametrix construction stops after finitely many correction terms and then needs the discarded tail to have precisely the lower order predicted by the symbolic filtration. The next estimate isolates this finite-truncation control, including the derivative seminorm bounds needed for the operator calculus.
[quotetheorem:7676]
[citeproof:7676]
The estimate explains how asymptotic expansions enter parametrices. The assumptions that the terms lie in symbol classes and that the orders tend to $-\infty$ cannot be dropped: a sequence of smooth functions with uncontrolled $\xi$-derivatives may have small pointwise size but fail every symbol seminorm estimate needed for composition. Likewise, if the orders do not improve, truncating after $N$ terms gives no reason for the remainder to be of lower operator order. The conclusion is only a finite-truncation seminorm bound: for each fixed $N$ it controls the error after $N$ symbolic terms, but it does not assert convergence of the infinite asymptotic series in any function-space topology. If an elliptic inverse is built order by order, stopping after $N$ corrections gives an error whose operator order is the next order in the symbolic filtration.
[example: Truncation Error in a Parametrix]
Let $p(x,\xi)\in S^m_{1,0}(U\times\mathbb R^n)$ be elliptic, and suppose the inverse symbol has been built as an asymptotic expansion $q\sim\sum_{j=0}^{\infty}q_j$ with $q_j\in S^{-m-j}_{1,0}$. For a fixed $N\ge 1$, write the partial sum as
\begin{align*}
q^{(N)}=\sum_{j=0}^{N-1}q_j,
\end{align*}
with a superscript to distinguish it from the single term $q_N$. By the left composition expansion, the symbol of $\operatorname{Op}(p)\operatorname{Op}(q^{(N)})$ has the form
\begin{align*}
p\# q^{(N)} \sim \sum_{\alpha}\frac{1}{\alpha!}\partial_\xi^\alpha p\,D_x^\alpha q^{(N)}.
\end{align*}
Since $q^{(N)}=\sum_{j=0}^{N-1}q_j$, the displayed expansion becomes
\begin{align*}
p\# q^{(N)} \sim \sum_{\alpha}\sum_{j=0}^{N-1}\frac{1}{\alpha!}\partial_\xi^\alpha p\,D_x^\alpha q_j.
\end{align*}
Here $\partial_\xi^\alpha p\in S^{m-|\alpha|}_{1,0}$ and $D_x^\alpha q_j\in S^{-m-j}_{1,0}$, so each product lies in
\begin{align*}
S^{m-|\alpha|-m-j}_{1,0}=S^{-j-|\alpha|}_{1,0}.
\end{align*}
Matching the parametrix coefficients through order $-N+1$ means that the terms with $j+|\alpha|<N$ add to $1$ modulo $S^{-N}_{1,0}$, while every remaining term has $j+|\alpha|\ge N$ and so already lies in $S^{-N}_{1,0}$. Therefore
\begin{align*}
p\# q^{(N)}-1\in S^{-N}_{1,0}.
\end{align*}
Thus stopping after $N$ symbolic corrections leaves an error of order at most $-N$, while taking the full asymptotic sum makes the error belong to every $S^{-N}_{1,0}$ and hence gives an inverse modulo $S^{-\infty}$.
[/example]
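To make the order-by-order matching concrete, here is a sketch of the first two coefficients, valid on the region of large $|\xi|$ where the elliptic symbol $p$ is nonvanishing (a low-frequency cutoff, omitted here, is needed to define the $q_j$ globally). Collecting the terms of order $0$ and $-1$ in the left composition expansion and setting them equal to $1$ and $0$ respectively gives
\begin{align*}
p\,q_0&=1, & q_0&=\frac1p,\\
p\,q_1+\sum_{k=1}^n\partial_{\xi_k}p\,D_{x_k}q_0&=0, & q_1&=-\frac1p\sum_{k=1}^n\partial_{\xi_k}p\,D_{x_k}q_0=\frac{1}{p^3}\sum_{k=1}^n\partial_{\xi_k}p\,D_{x_k}p.
\end{align*}
The last equality uses $D_{x_k}(1/p)=-p^{-2}D_{x_k}p$. The orders are consistent with the example: $q_0\in S^{-m}_{1,0}$, and $q_1$ has order $(m-1)+m-3m=-m-1$.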
## Taylor Expansion Inside Oscillatory Integrals
The symbolic algebra of compositions comes from a concrete analytic manoeuvre: expand the slowly varying factor in an oscillatory integral and convert powers of the spatial variable into derivatives in frequency. This section records the mechanism before it is used systematically in the composition theorem.
Suppose a composition or adjoint calculation produces an integral with phase $(y-x)\cdot \eta$ and amplitude depending on $y$. The dependence on $y$ is moved back to $x$ by Taylor expansion, while the monomials $(y-x)^\alpha$ are transferred to $\eta$-derivatives by integration by parts.
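In one dimension and at first order, the transfer can be sketched as follows, assuming for simplicity that $a$ is a Schwartz function of $\eta$, so that the integral converges absolutely and no boundary terms appear (an assumption made only for this sketch; the oscillatory case needs the regularisation arguments of the calculus). Since $D_\eta e^{i(y-x)\eta}=(y-x)e^{i(y-x)\eta}$ with $D_\eta=-i\partial_\eta$,
\begin{align*}
\int_{\mathbb R}(y-x)e^{i(y-x)\eta}a(\eta)\,d\eta=\int_{\mathbb R}\bigl(D_\eta e^{i(y-x)\eta}\bigr)a(\eta)\,d\eta=-\int_{\mathbb R}e^{i(y-x)\eta}D_\eta a(\eta)\,d\eta.
\end{align*}
Each factor of $y-x$ therefore costs one $\eta$-derivative of $a$, which lowers the symbolic order by one when $a$ is a symbol.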
[quotetheorem:7677]
[citeproof:7677]
The compactness and segment hypotheses are not cosmetic. They ensure that all $x$-derivatives of $b$ are estimated on one compact subset of $U$; if the segment from $x$ to $y$ leaves $U$, the expression $b(x+t(y-x),\xi)$ may not even be defined. The formula also does not improve the $\xi$-order by itself: Taylor expansion only reorganises the $x$-dependence, and the order gain appears later when powers of $y-x$ are traded for $\eta$-derivatives. To turn those powers into symbolic order gains, we need the following identity: it is the exact mechanism that moves derivatives from the phase onto the amplitude during integration by parts.
[quotetheorem:7678]
[citeproof:7678]
After moving $D_\eta^\alpha$ from the exponential onto the other factor in the amplitude, signs and powers of $i$ are absorbed into the standard differential notation. The identity is purely algebraic and does not by itself justify integration by parts in a non-absolutely convergent oscillatory integral; that justification comes from the cutoff regularisation or oscillatory integral continuity estimates used in the calculus. It also depends on the linear phase $(y-x)\cdot\eta$. For example, with phase $(y-x)\cdot\eta+|\eta|^2$, applying $D_{\eta_j}$ to the exponential produces $(y_j-x_j)+2\eta_j$ rather than only $y_j-x_j$, so the monomial in $y-x$ cannot be replaced by an $\eta$-derivative without extra terms. The next theorem packages the resulting finite Taylor calculations into the asymptotic formula that drives composition, adjoints, commutators, and parametrices.
[quotetheorem:7679]
[citeproof:7679]
The result is the prototype for all later symbolic formulae: adjoints, commutators, and parametrices differ in their phases and conventions, but the same Taylor plus integration-by-parts mechanism produces decreasing symbolic orders. The localisation is part of the theorem, not an afterthought. On an arbitrary open set, $\operatorname{Op}(b)u$ need not have compact support, so applying $\operatorname{Op}(a)$ to it is not justified from the initial definition on $C_c^\infty(U)$ unless proper support or cutoffs have been inserted. The hypotheses also matter because the formula is an asymptotic statement inside the $S_{1,0}$ filtration; without symbol estimates, differentiating the amplitude may not produce controlled order drops. The displayed series is not asserted to converge as a function of $(x,\xi)$, and changing the quantisation convention changes the lower-order terms even though the leading product remains the same.
[example: First Terms of a Composition]
Let $a\in S^m_{1,0}(U\times\mathbb R^n)$ and $b\in S^{m'}_{1,0}(U\times\mathbb R^n)$, and let $c$ denote the local left-quantised composition symbol of $\operatorname{Op}(a)\operatorname{Op}(b)$. By *[Symbolic Expansion from Taylor Expansion](/theorems/7679)*, its expansion begins with the terms indexed by $|\alpha|<2$:
\begin{align*}
c(x,\xi)-\sum_{|\alpha|<2}\frac{1}{\alpha!}\partial_\xi^\alpha a(x,\xi)D_x^\alpha b(x,\xi)\in S^{m+m'-2}_{1,0}.
\end{align*}
The term with $\alpha=0$ is
\begin{align*}
\frac{1}{0!}\partial_\xi^0a(x,\xi)D_x^0b(x,\xi)=a(x,\xi)b(x,\xi).
\end{align*}
The multi-indices with $|\alpha|=1$ are $e_1,\ldots,e_n$, where $e_j$ has a $1$ in the $j$th coordinate and $0$ elsewhere. Since $e_j!=1$, their contribution is
\begin{align*}
\sum_{|\alpha|=1}\frac{1}{\alpha!}\partial_\xi^\alpha a(x,\xi)D_x^\alpha b(x,\xi)=\sum_{j=1}^n \partial_{\xi_j}a(x,\xi)D_{x_j}b(x,\xi).
\end{align*}
Therefore, if
\begin{align*}
r_2(x,\xi)=c(x,\xi)-a(x,\xi)b(x,\xi)-\sum_{j=1}^n \partial_{\xi_j}a(x,\xi)D_{x_j}b(x,\xi),
\end{align*}
then $r_2\in S^{m+m'-2}_{1,0}$, and hence
\begin{align*}
c(x,\xi)=a(x,\xi)b(x,\xi)+\sum_{j=1}^n \partial_{\xi_j}a(x,\xi)D_{x_j}b(x,\xi)+r_2(x,\xi).
\end{align*}
Thus ordinary multiplication gives the principal part, while the first noncommutative correction is exactly the order $m+m'-1$ term involving one $\xi$-derivative of the left symbol and one $x$-derivative of the right symbol.
[/example]
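As a check in one variable, with symbols chosen here only for illustration, take $a(x,\xi)=\xi^2$ and $b(x,\xi)=f(x)$ with $f\in C^\infty(U)$, so that $m=2$, $m'=0$, and $\operatorname{Op}(a)\operatorname{Op}(b)u=D_x^2(fu)$. The expansion terminates because $\partial_\xi^\alpha\xi^2=0$ for $\alpha\ge 3$:
\begin{align*}
c(x,\xi)=f(x)\xi^2+2\xi\,D_xf(x)+\tfrac{1}{2!}\cdot 2\cdot D_x^2f(x)=f\xi^2+2(D_xf)\xi+D_x^2f.
\end{align*}
The product rule gives the same operator,
\begin{align*}
D_x^2(fu)=f\,D_x^2u+2(D_xf)\,D_xu+(D_x^2f)\,u,
\end{align*}
so here $r_2=D_x^2f\in S^{0}_{1,0}=S^{m+m'-2}_{1,0}$, in agreement with the remainder estimate.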
## Principal and Subprincipal Bookkeeping
Full asymptotic expansions are powerful, but many arguments only need the first one or two orders. Principal and subprincipal terms are bookkeeping devices that let us state ellipticity, commutator identities, and parametrix corrections without carrying the entire expansion.
The leading term lives in a quotient: changing a symbol by a lower-order term should not alter its principal information. This quotient viewpoint is what makes the principal symbol independent of lower-order choices.
[definition: Principal Symbol Class]
Let $a\in S^m_{1,0}(U\times\mathbb R^n)$. Its principal symbol class is the image of $a$ in the quotient
\begin{align*}
S^m_{1,0}(U\times\mathbb R^n)/S^{m-1}_{1,0}(U\times\mathbb R^n).
\end{align*}
[/definition]
For classical symbols, this quotient is often represented by the homogeneous component of degree $m$ in $\xi$ away from the origin. For non-classical symbols it is better to remember the quotient formulation, since it only uses the filtration.
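A concrete instance, sketched with the standard weight $\langle\xi\rangle=(1+|\xi|^2)^{1/2}$ as an illustrative symbol: for $|\xi|>1$ the binomial series gives
\begin{align*}
\langle\xi\rangle=|\xi|\bigl(1+|\xi|^{-2}\bigr)^{1/2}=|\xi|+\tfrac12|\xi|^{-1}-\tfrac18|\xi|^{-3}+\cdots,
\end{align*}
with homogeneous terms of degrees $1,-1,-3,\ldots$. The principal class of $\langle\xi\rangle$ in $S^1_{1,0}/S^0_{1,0}$ is therefore represented by $|\xi|$ away from the origin, and the error is explicit:
\begin{align*}
\langle\xi\rangle-|\xi|=\frac{1}{\langle\xi\rangle+|\xi|},
\end{align*}
which is of order $-1$ for large $|\xi|$. Since $|\xi|$ is not smooth at $\xi=0$, a cutoff $\chi(\xi)|\xi|$ vanishing near the origin is needed to make it a genuine symbol, and then $\langle\xi\rangle-\chi(\xi)|\xi|\in S^{-1}_{1,0}$.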
[example: Principal Symbol of a Differential Operator]
Let
\begin{align*}
P:C_c^\infty(U)\to C^\infty(U),\qquad Pu=\sum_{|\alpha|\le m}a_\alpha(x)D_x^\alpha u
\end{align*}
with $a_\alpha\in C^\infty(U)$ and $D_x^\alpha=(-i)^{|\alpha|}\partial_x^\alpha$. Since
\begin{align*}
D_x^\alpha e^{ix\cdot \xi}=\xi^\alpha e^{ix\cdot \xi},
\end{align*}
the left-quantised symbol corresponding to $P$ is
\begin{align*}
p(x,\xi)=\sum_{|\alpha|\le m}a_\alpha(x)\xi^\alpha.
\end{align*}
Separate the degree $m$ part from the lower-degree part:
\begin{align*}
p(x,\xi)=\sum_{|\alpha|=m}a_\alpha(x)\xi^\alpha+\sum_{|\alpha|\le m-1}a_\alpha(x)\xi^\alpha.
\end{align*}
For a lower-degree term with $|\gamma|\le m-1$, any compact $K\subset U$, and multi-indices $\rho,\beta$, the coefficient derivative $\partial_x^\rho a_\gamma$ is bounded on $K$, while $\partial_\xi^\beta \xi^\gamma$ is either $0$ or a polynomial of degree $|\gamma|-|\beta|$. Hence
\begin{align*}
\left|\partial_x^\rho\partial_\xi^\beta\bigl(a_\gamma(x)\xi^\gamma\bigr)\right|\le C_{K,\rho,\beta,\gamma}\langle\xi\rangle^{|\gamma|-|\beta|}.
\end{align*}
Because $|\gamma|\le m-1$, this gives
\begin{align*}
\left|\partial_x^\rho\partial_\xi^\beta\bigl(a_\gamma(x)\xi^\gamma\bigr)\right|\le C_{K,\rho,\beta,\gamma}\langle\xi\rangle^{m-1-|\beta|}.
\end{align*}
Thus every term with $|\gamma|\le m-1$ lies in $S^{m-1}_{1,0}$, and the principal symbol class of $P$ is represented by
\begin{align*}
\sum_{|\alpha|=m}a_\alpha(x)\xi^\alpha.
\end{align*}
The quotient keeps exactly the top-degree part of the differential symbol and discards the lower-order differential terms.
[/example]
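As a concrete instance, added here for illustration and assuming that $U\subset\mathbb R$ is an interval and $g\in C^\infty(U)$ is a hypothetical coefficient, consider the divergence-form operator $Pu=-\partial_x(g\,\partial_x u)$. With $D_x=-i\partial_x$, so that $\partial_x=iD_x$ and $\partial_x^2=-D_x^2$,
\begin{align*}
Pu=-g\,\partial_x^2u-(\partial_xg)\,\partial_xu=g\,D_x^2u-i(\partial_xg)\,D_xu,
\end{align*}
so the left-quantised symbol is
\begin{align*}
p(x,\xi)=g(x)\xi^2-i\,\partial_xg(x)\,\xi.
\end{align*}
The term $-i\,\partial_xg\,\xi$ lies in $S^{1}_{1,0}$, so the principal symbol class of $P$ is represented by $g(x)\xi^2$; the first-order term created by differentiating the coefficient is invisible in the quotient.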
The next layer is useful when the leading term has already been fixed and the first correction affects the calculation. In local coordinates the exact subprincipal convention depends on the quantisation; the important point for this chapter is that it tracks the order $m-1$ contribution together with the correction forced by the chosen quantisation rule.
[definition: Subprincipal Symbol in Left Quantisation]
Let $A=\operatorname{Op}(a):C_c^\infty(U)\to C^\infty(U)$ be a local left-quantised pseudodifferential operator with $a\in S^m_{1,0}(U\times\mathbb R^n)$ admitting an expansion $a\sim a_m+a_{m-1}+a_{m-2}+\cdots$ by decreasing integer orders. The left subprincipal bookkeeping term of $A$ is the class of $a_{m-1}$ modulo $S^{m-2}_{1,0}(U\times\mathbb R^n)$, together with the convention that composition corrections are computed using the left-quantised product formula.
[/definition]
This definition is deliberately tied to a convention and is not the coordinate-invariant geometric subprincipal symbol used in invariant treatments of pseudodifferential operators. A change of quantisation or coordinates can alter this bookkeeping term by lower-order correction terms. In these notes the role is computational, so the operator setting and quantisation convention must be stated whenever subprincipal terms are used.
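The convention dependence can be seen in a one-dimensional sketch. It assumes $U=\mathbb R$ and uses a right quantisation, in which coefficients are applied before derivatives; that convention is introduced here only for the comparison. Consider $Au=D_x(xu)$ with $D_x=-i\partial_x$. The product rule gives
\begin{align*}
Au=x\,D_xu+(D_xx)\,u=x\,D_xu-iu,
\end{align*}
so the left-quantised symbol is $a(x,\xi)=x\xi-i$, with $a_1=x\xi$ and $a_0=-i$. Written with the coefficient to the right of the derivative, the same operator is $D_x\circ x$, whose right symbol is $x\xi$ with no order $0$ term. The principal part $x\xi$ agrees in both conventions, while the order $m-1=0$ bookkeeping term is $-i$ in one and $0$ in the other.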
[quotetheorem:7680]
[citeproof:7680]
The theorem is the reason leading-order symbolic calculations look commutative while the next order remembers operator ordering. The quotient statement is important: it does not say that $c=ab$, only that the difference has lower order, and ignoring the quotient would lose exactly the correction terms used in commutator calculations. The symbol-class hypotheses are essential to the conclusion: if $a$ and $b$ are merely smooth functions with no uniform $\xi$-derivative bounds, the products $\partial_\xi^\alpha a\,D_x^\alpha b$ need not drop in order as $|\alpha|$ grows, and the remainder need not lie in $S^{m+m'-2}_{1,0}$.
The scalar-symbol assumption is part of the simple commutator picture as well; for matrix-valued symbols, the leading products need not commute. For example, take constant matrix symbols $A$ and $B$ with entries $A_{12}=1$, $B_{21}=1$, and all other entries equal to $0$. These already satisfy $AB\ne BA$ at principal order, so the leading term of a matrix-valued commutator can remain of order $m+m'$ rather than dropping to $m+m'-1$. In the scalar case, by contrast, commutators are one order lower than products, and their first term is governed by the familiar Poisson bracket expression.
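For the matrix symbols $A$ and $B$ above, written out in the smallest case (the size $2\times 2$ is an assumption made here only for illustration), the products are
\begin{align*}
A=\begin{pmatrix}0&1\\0&0\end{pmatrix},\quad B=\begin{pmatrix}0&0\\1&0\end{pmatrix},\quad AB=\begin{pmatrix}1&0\\0&0\end{pmatrix},\quad BA=\begin{pmatrix}0&0\\0&1\end{pmatrix},
\end{align*}
so
\begin{align*}
AB-BA=\begin{pmatrix}1&0\\0&-1\end{pmatrix}\neq 0.
\end{align*}
Both symbols have order $0$ and are independent of $(x,\xi)$, so every derivative term in the expansion vanishes and the commutator symbol is exactly this order $0$ matrix: no order is gained.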
[example: Commutator Drops One Order]
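Let $A=\operatorname{Op}(a)$ and $B=\operatorname{Op}(b)$ with scalar symbols $a\in S^m_{1,0}$ and $b\in S^{m'}_{1,0}$, and let $c_{AB}$ and $c_{BA}$ be the left-quantised composition symbols of $AB$ and $BA$ on the localised patch. Applying the first two terms of the left composition expansion to $AB$ gives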
\begin{align*}
c_{AB}(x,\xi)=a(x,\xi)b(x,\xi)+\sum_{j=1}^n \partial_{\xi_j}a(x,\xi)D_{x_j}b(x,\xi)+r_{AB}(x,\xi),
\end{align*}
with $r_{AB}\in S^{m+m'-2}_{1,0}$. Applying the same formula with $a$ and $b$ interchanged gives
\begin{align*}
c_{BA}(x,\xi)=b(x,\xi)a(x,\xi)+\sum_{j=1}^n \partial_{\xi_j}b(x,\xi)D_{x_j}a(x,\xi)+r_{BA}(x,\xi),
\end{align*}
with $r_{BA}\in S^{m+m'-2}_{1,0}$.
In the same left-quantised convention, the symbol of the commutator $[A,B]=AB-BA$ is $c_{AB}-c_{BA}$, so
\begin{align*}
c_{AB}(x,\xi)-c_{BA}(x,\xi)=a(x,\xi)b(x,\xi)-b(x,\xi)a(x,\xi)+\sum_{j=1}^n\left(\partial_{\xi_j}a(x,\xi)D_{x_j}b(x,\xi)-\partial_{\xi_j}b(x,\xi)D_{x_j}a(x,\xi)\right)+r_{AB}(x,\xi)-r_{BA}(x,\xi).
\end{align*}
Since the symbols are scalar-valued, $a(x,\xi)b(x,\xi)=b(x,\xi)a(x,\xi)$, and therefore the order $m+m'$ term cancels. Also, $r_{AB}-r_{BA}\in S^{m+m'-2}_{1,0}\subset S^{m+m'-1}_{1,0}$. For each $j$, $\partial_{\xi_j}a\in S^{m-1}_{1,0}$ and $D_{x_j}b\in S^{m'}_{1,0}$, so $\partial_{\xi_j}aD_{x_j}b\in S^{m+m'-1}_{1,0}$; similarly $\partial_{\xi_j}bD_{x_j}a\in S^{m+m'-1}_{1,0}$. Hence $[A,B]$ has symbolic order at most $m+m'-1$, represented at that order by
\begin{align*}
\sum_{j=1}^n\left(\partial_{\xi_j}aD_{x_j}b-\partial_{\xi_j}bD_{x_j}a\right).
\end{align*}
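With $D_{x_j}=-i\partial_{x_j}$, this leading term is $-i\sum_j(\partial_{\xi_j}a\,\partial_{x_j}b-\partial_{\xi_j}b\,\partial_{x_j}a)$, which equals $\frac{1}{i}\{a,b\}$ for the Poisson bracket $\{a,b\}=\sum_j(\partial_{\xi_j}a\,\partial_{x_j}b-\partial_{x_j}a\,\partial_{\xi_j}b)$. It may itself vanish, in which case the commutator drops further in order.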
[/example]
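A one-dimensional check of this formula, added as an illustration under the assumptions $U=\mathbb R$, $a(x,\xi)=x\xi$ and $b(x,\xi)=\xi$, so that $A=xD_x$ and $B=D_x$ with $m=m'=1$. On the operator side, using $D_xx=-i$,
\begin{align*}
[A,B]u=xD_x^2u-D_x(xD_xu)=xD_x^2u-\bigl((D_xx)D_xu+xD_x^2u\bigr)=iD_xu.
\end{align*}
On the symbol side, $\partial_\xi a=x$, $D_xb=0$, $\partial_\xi b=1$ and $D_xa=-i\xi$, so
\begin{align*}
\partial_\xi a\,D_xb-\partial_\xi b\,D_xa=0-(-i\xi)=i\xi.
\end{align*}
Both computations give the symbol $i\xi$ of order $m+m'-1=1$, and the order $2$ terms $x\xi^2$ cancel as predicted.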
## Smoothing and Formal Vanishing
The final question in this chapter is how to recognise when a symbolic construction has removed every order. This is the [symbolic smoothing criterion](/theorems/7681): if all pieces in an asymptotic expansion vanish, what remains is smoothing.
Smoothing symbols are the intersection of all finite orders. In operator language they produce kernels that are smooth, but the symbolic statement is already enough for the algebraic arguments in this part of the course.
[definition: Smoothing Symbol]
A symbol $r\in C^\infty(U\times\mathbb R^n)$ is smoothing if
\begin{align*}
r\in S^{-N}_{1,0}(U\times\mathbb R^n)
\end{align*}
for every $N\ge 1$. The space of smoothing symbols is denoted $S^{-\infty}(U\times\mathbb R^n)$.
[/definition]
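A standard example, added here for illustration, is the Gaussian $r(x,\xi)=e^{-|\xi|^2}$, which does not depend on $x$. Each $\xi$-derivative has the form
\begin{align*}
\partial_\xi^\beta e^{-|\xi|^2}=P_\beta(\xi)\,e^{-|\xi|^2}
\end{align*}
with $P_\beta$ a polynomial of degree $|\beta|$, and for every $N\ge 1$ the function $\langle\xi\rangle^{N+|\beta|}P_\beta(\xi)e^{-|\xi|^2}$ is bounded on $\mathbb R^n$. Hence
\begin{align*}
\left|\partial_\xi^\beta r(x,\xi)\right|\le C_{N,\beta}\langle\xi\rangle^{-N-|\beta|}
\end{align*}
for all $N$ and $\beta$, while every derivative involving $x$ vanishes, so $r\in S^{-\infty}$. By contrast, $\langle\xi\rangle^{-1}$ lies in $S^{-1}_{1,0}$ but not in $S^{-2}_{1,0}$, so it is not smoothing.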
This definition is compatible with the asymptotic uniqueness statement above. If two constructions have identical terms at every order, their difference is invisible to the symbolic filtration.
[quotetheorem:7681]
[citeproof:7681]
The criterion is used constantly in parametrix constructions. The condition $m_N\to -\infty$ is essential: knowing only that $r\in S^{-1}_{1,0}$, or even that many leading homogeneous terms vanish while the order remains bounded below, does not make $r$ smoothing. Formal vanishing also differs from literal equality to zero: a nonzero element of $S^{-\infty}$ quantises to an operator with smooth kernel, and that operator need not be zero. After solving away the error to successively lower orders, the remaining error is not zero as a function, but its symbol is smoothing and hence negligible for microlocal elliptic regularity.
[example: Formal Parametrix Error]
Let
\begin{align*}
r(x,\xi)=p\# q(x,\xi)-1.
\end{align*}
The assumption that $p\# q$ has expansion $1+0+0+\cdots$ means precisely that, after subtracting the leading term $1$, every remaining symbolic coefficient is zero. Hence, for each $N\ge 1$, the definition of asymptotic expansion gives
\begin{align*}
r-\sum_{j=0}^{N-1}0\in S^{-N}_{1,0}.
\end{align*}
Since the finite sum of zero terms is $0$, this is
\begin{align*}
r\in S^{-N}_{1,0}
\end{align*}
for every $N\ge 1$. Therefore $r\sim0$, and by the *[Symbolic Smoothing Criterion](/theorems/7681)*,
\begin{align*}
p\# q-1=r\in S^{-\infty}.
\end{align*}
Thus the composition satisfies
\begin{align*}
\operatorname{Op}(p)\operatorname{Op}(q)=\operatorname{Op}(1)+\operatorname{Op}(r),
\end{align*}
with $\operatorname{Op}(r)$ smoothing. Since $\operatorname{Op}(1)$ is the identity operator in the chosen quantisation, $\operatorname{Op}(q)$ is a right parametrix for $\operatorname{Op}(p)$ modulo a smoothing operator.
[/example]
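A nonzero smoothing error already appears for constant coefficients. The following sketch works globally on $\mathbb R^n$, ignores proper-support cutoffs, and assumes $p(\xi)=|\xi|^2$ together with a hypothetical cutoff $\chi\in C_c^\infty(\mathbb R^n)$ equal to $1$ near $\xi=0$. Set $q(\xi)=(1-\chi(\xi))|\xi|^{-2}$, which is smooth and lies in $S^{-2}_{1,0}$. Since $q$ does not depend on $x$, every term with $|\alpha|\ge 1$ in the composition expansion vanishes, and because both operators are Fourier multipliers the composition is exact:
\begin{align*}
p\# q(\xi)=p(\xi)q(\xi)=1-\chi(\xi),\qquad r(\xi)=p\# q(\xi)-1=-\chi(\xi).
\end{align*}
The symbol $r=-\chi$ is compactly supported in $\xi$, hence lies in $S^{-N}_{1,0}$ for every $N$, but it is not zero. So $\operatorname{Op}(q)$ inverts $\operatorname{Op}(p)$ only modulo the smoothing operator $\operatorname{Op}(-\chi)$.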
This completes the algebraic foundation for the pseudodifferential calculus. Chapter 6 uses these expansions as the main engine behind composition and commutator formulae; the later chapters then apply the same bookkeeping to adjoints, elliptic parametrices, and Sobolev mapping estimates. The same bookkeeping also explains why commutators measure propagation phenomena in microlocal analysis: the principal cancellation exposes the first-order Poisson bracket term, which is the symbolic shadow of Hamiltonian flow.
Asymptotic expansions have supplied the machinery to manipulate symbols formally; the central question is whether pseudodifferential operators close under composition and whether the result respects symbol order. The answer is yes: the product of two pseudodifferential operators yields another, with principal symbols multiplying and commutators generating Poisson-bracket terms at lower order. This chapter builds the composition law and shows how principal symbols control both products and commutators.
# 6. Composition and Commutators
Chapters 2 and 3 introduced symbol classes and quantisation, and Chapter 5 supplied the asymptotic expansions needed to manipulate them. Thus a single symbol $a(x,\xi)$ now gives an operator $\operatorname{Op}(a)$ through an oscillatory integral. The next question is whether these operators form a usable calculus: after applying one pseudodifferential operator and then another, can the result be described by a new symbol? This chapter gives the symbolic composition formula, explains how the asymptotic expansion is tied to the Fourier transform convention, and extracts the leading term in commutators. The payoff is that pseudodifferential operators become a filtered algebra, with order bookkeeping controlled by derivatives in $x$ and $\xi$.
## The Composition Problem
Suppose $a \in S^m(U \times \mathbb R^n)$ and $b \in S^{m'}(U \times \mathbb R^n)$ are symbols whose quantisations are properly supported; here and below $S^m$ abbreviates $S^m_{1,0}$. On test functions the Kohn--Nirenberg quantisation is the operator
\begin{align*}
\operatorname{Op}(a):C_c^\infty(U)\to C^\infty(U)
\end{align*}
given by
\begin{align*}
\operatorname{Op}(a)u(x) = (2\pi)^{-n}\int_{\mathbb R^n}\int_U e^{i(x-y)\cdot \xi}a(x,\xi)u(y)\,dy\,d\xi.
\end{align*}
Later mapping theorems refine this to Sobolev maps such as $H^s_{\mathrm{comp}}(U)\to H^{s-m}_{\mathrm{loc}}(U)$. In this chapter $\partial_x^\alpha$ and $\partial_\xi^\alpha$ denote ordinary partial derivatives; the notation $D_j=-i\partial_{x_j}$ is reserved for the differential operator with symbol $\xi_j$.
If $\operatorname{Op}(a)$ and $\operatorname{Op}(b)$ are to behave like variable-coefficient Fourier multipliers, their composition should again be a pseudodifferential operator. The issue is that the first operator differentiates and translates the oscillation created by the second, so the new symbol must combine $\xi$-derivatives of $a$ with $x$-derivatives of $b$.
[definition: Symbolic Product]
Let $a \in S^m(U \times \mathbb R^n)$ and $b \in S^{m'}(U \times \mathbb R^n)$. A symbol $c \in S^{m+m'}(U \times \mathbb R^n)$ is called a symbolic product of $a$ and $b$, written $c = a \# b$, if
\begin{align*}
\operatorname{Op}(a)\operatorname{Op}(b) - \operatorname{Op}(c)
\end{align*}
is an operator from $C_c^\infty(U)$ to $C^\infty(U)$ whose Schwartz kernel belongs to $C^\infty(U\times U)$.
[/definition]
The definition records the equality at the level relevant to the calculus: smoothing remainders are negligible for principal symbols, ellipticity, and Sobolev order. What remains to prove is that such a symbolic product exists for the operators produced by the quantisation map, and that it has a computable expansion rather than only an abstract existence statement.
[quotetheorem:7682]
[proofunderconstruction:7682]
The theorem says that the leading behaviour of the composition is multiplication of symbols, while each further order of correction costs one more $\xi$-derivative on the left symbol and one more $x$-derivative on the right symbol. Proper support is not cosmetic: without it the composed kernel need not be proper over $U \times U$, so the formula may fail to define a local operator on compactly supported inputs without extra cutoffs. The $S^m_{1,0}$ estimates are also essential because each $\xi$-derivative lowers symbolic order while $x$-derivatives do not raise it. Finally, the coefficient $i^{- |\alpha|}$ is tied to the Kohn--Nirenberg phase $e^{i(x-y)\cdot \xi}$; changing quantisation changes lower-order representatives, even though the principal product remains $ab$.
[example: Multiplication Followed By Differentiation]
Let $f \in C^\infty(U)$ act by multiplication, so its symbol is independent of $\xi$, and let $D_j=-i\partial_{x_j}$ have symbol $\xi_j$. Using the composition expansion from the theorem above, the product with left symbol $f$ and right symbol $\xi_j$ is
\begin{align*}
f\# \xi_j \sim \sum_{\alpha}\frac{i^{-|\alpha|}}{\alpha!}\partial_\xi^\alpha f\,\partial_x^\alpha \xi_j.
\end{align*}
For $\alpha=0$ this gives $f\xi_j$. If $|\alpha|\ge 1$, then $\partial_\xi^\alpha f=0$ because $f$ is independent of $\xi$, and also $\partial_x^\alpha \xi_j=0$ because $\xi_j$ is independent of $x$. Hence
\begin{align*}
f\# \xi_j=f\xi_j.
\end{align*}
For the reverse order, the left symbol is $\xi_j$ and the right symbol is $f$, so
\begin{align*}
\xi_j\# f \sim \sum_{\alpha}\frac{i^{-|\alpha|}}{\alpha!}\partial_\xi^\alpha \xi_j\,\partial_x^\alpha f.
\end{align*}
The term $\alpha=0$ is $\xi_j f$. If $|\alpha|=1$, write $\alpha=e_k$; then
\begin{align*}
\partial_{\xi_k}\xi_j=\delta_{jk}.
\end{align*}
Therefore the first-order contribution is
\begin{align*}
\sum_{k=1}^n i^{-1}\partial_{\xi_k}\xi_j\,\partial_{x_k}f=i^{-1}\sum_{k=1}^n \delta_{jk}\partial_{x_k}f.
\end{align*}
Since $\sum_{k=1}^n \delta_{jk}\partial_{x_k}f=\partial_{x_j}f$ and $i^{-1}=1/i=-i$, this becomes
\begin{align*}
i^{-1}\partial_{x_j}f=-i\partial_{x_j}f.
\end{align*}
All terms with $|\alpha|\ge 2$ vanish because $\xi_j$ is linear in $\xi$, so
\begin{align*}
\xi_j\# f=\xi_j f-i\partial_{x_j}f.
\end{align*}
Thus placing $D_j$ on the other side of multiplication by $f$ changes the symbol by the order-zero term $-i\partial_{x_j}f$, exactly measuring the failure of differentiation to commute with multiplication.
[/example]
## Asymptotic Expansion and Order Bookkeeping
The composition formula is useful only if the infinite series has a precise meaning. Symbols are not usually analytic in $\xi$, and the expansion is not meant to converge term by term. Instead, it is an asymptotic expansion in the decreasing order filtration, where each additional derivative pair lowers the order by one in the $S^m_{1,0}$ calculus.
[definition: Asymptotic Sum of Symbols]
Let $c_j \in S^{M-j}(U \times \mathbb R^n)$ for $j \ge 0$. A symbol $c \in S^M(U \times \mathbb R^n)$ is an asymptotic sum of the sequence $(c_j)_{j\ge 0}$, written
\begin{align*}
c \sim \sum_{j=0}^{\infty}c_j,
\end{align*}
if the displayed condition holds for every integer $N \ge 1$:
\begin{align*}
c - \sum_{j=0}^{N-1}c_j \in S^{M-N}(U \times \mathbb R^n).
\end{align*}
[/definition]
This definition explains why the expression for $a\# b$ is finite at any prescribed symbolic precision. The next theorem turns that interpretation into a usable estimate: after retaining all terms below a chosen multi-index length, the discarded part has the exact lower order predicted by the calculus.
[quotetheorem:7683]
[citeproof:7683]
The finite-order statement is often the version used in computations because it is an estimate, not a convergence assertion. The series $\sum_\alpha i^{- |\alpha|}\partial_\xi^\alpha a\,\partial_x^\alpha b/\alpha!$ need not converge as a function or symbol; it only determines the product modulo arbitrarily low order remainders. The $S^m_{1,0}$ mechanism is what makes this possible: each additional $\partial_\xi$ lowers the order by one, while the matching $\partial_x$ does not increase it.
A model failure explains the hypothesis. On $U=\mathbb R$, the oscillatory factor $b(x,\xi)=e^{ix\xi}$ has $\partial_x^N b=(i\xi)^N e^{ix\xi}$, so $x$-differentiation raises the $\xi$-growth by $N$ instead of preserving symbolic order. Such an amplitude is not in $S^0_{1,0}(\mathbb R\times\mathbb R)$, and the finite expansion no longer gives a remainder of order $m+m'-N$. Proper support is a separate local condition: the smooth kernel $K(x,y)=1$ defines $Tu(x)=\int_{\mathbb R}u(y)\,dy$, which maps $C_c^\infty(\mathbb R)$ to smooth functions but has support all of $\mathbb R^2$ and is not proper over either variable. Composing such global kernels with local pseudodifferential operators is possible only after adding extra support restrictions, so it is not the intrinsic local calculus described by the theorem.
In summary, composition adds orders, while lower-order errors can be ignored only at a fixed symbolic precision, such as when calculating principal symbols.
[quotetheorem:7684]
[citeproof:7684]
This closure result is more than a formal convenience: it lets us insert cutoffs, parametrices, and lower-order corrections without leaving the pseudodifferential framework. Proper support is still part of the statement because composition of non-proper kernels can lose the compactness control needed for the oscillatory integrals to act locally on $C_c^\infty(U)$. Without proper support, an operator may send compactly supported input to a function whose support is not locally controlled, so a second local operator may not have the same pseudodifferential description without adding cutoffs. The theorem also does not assert Sobolev boundedness or ellipticity; those require separate estimates and lower bounds on principal symbols. A basic local computation is to see what happens when a cutoff is placed on either side of an elliptic operator.
[example: Cutoff Composed With An Elliptic Operator]
Let $P=\operatorname{Op}(p)$ with $p\in S^2(U\times\mathbb R^n)$ elliptic for large $|\xi|$, and let $\chi\in C_c^\infty(U)$ be viewed as a symbol independent of $\xi$. For the left cutoff, the composition expansion gives
\begin{align*}
\chi\# p \sim \sum_{\alpha}\frac{i^{-|\alpha|}}{\alpha!}\partial_\xi^\alpha \chi\,\partial_x^\alpha p.
\end{align*}
The term $\alpha=0$ is
\begin{align*}
\frac{i^0}{0!}\partial_\xi^0\chi\,\partial_x^0p=\chi p.
\end{align*}
If $|\alpha|\ge 1$, then at least one $\xi$-derivative falls on $\chi$, and since $\chi=\chi(x)$,
\begin{align*}
\partial_\xi^\alpha\chi=0.
\end{align*}
Thus every term with $|\alpha|\ge 1$ vanishes, so
\begin{align*}
\chi\# p=\chi p.
\end{align*}
Hence $\operatorname{Op}(\chi)P$ has principal symbol $\chi p$: multiplying by the cutoff on the left simply localises the principal symbol.
For the reverse composition, the expansion is
\begin{align*}
p\#\chi \sim \sum_{\alpha}\frac{i^{-|\alpha|}}{\alpha!}\partial_\xi^\alpha p\,\partial_x^\alpha\chi.
\end{align*}
The zeroth term is
\begin{align*}
p\chi.
\end{align*}
For $|\alpha|=1$, write $\alpha=e_k$; the contribution is
\begin{align*}
\sum_{k=1}^n i^{-1}\partial_{\xi_k}p\,\partial_{x_k}\chi.
\end{align*}
Since $p\in S^2$, each $\partial_{\xi_k}p$ lies in $S^1$, while $\partial_{x_k}\chi$ is smooth and compactly supported, so this whole first-order correction lies in $S^1$. More generally, the term with multi-index $\alpha$ lies in $S^{2-|\alpha|}$ because $\partial_\xi^\alpha p\in S^{2-|\alpha|}$ and $\partial_x^\alpha\chi$ is a compactly supported smooth factor. Therefore
\begin{align*}
p\#\chi \sim p\chi+i^{-1}\sum_{k=1}^n \partial_{\xi_k}p\,\partial_{x_k}\chi+\text{terms in lower symbol orders}.
\end{align*}
So $P\operatorname{Op}(\chi)$ has the same principal symbol $p\chi$, but its lower-order corrections are supported where derivatives of $\chi$ are nonzero; placing the cutoff on the right records the spatial variation of the cutoff in the symbolic product.
[/example]
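For a concrete check, assume $p(x,\xi)=|\xi|^2$, so that $P=-\Delta$; this choice of $p$ is an illustration and is not part of the example above. The only nonzero $\xi$-derivatives of positive order are $\partial_{\xi_k}p=2\xi_k$ and $\partial_{\xi_k}^2p=2$, so the expansion terminates:
\begin{align*}
p\#\chi=|\xi|^2\chi+i^{-1}\sum_{k=1}^n2\xi_k\,\partial_{x_k}\chi+\frac{i^{-2}}{2!}\sum_{k=1}^n2\,\partial_{x_k}^2\chi=|\xi|^2\chi-2i\,\xi\cdot\nabla\chi-\Delta\chi.
\end{align*}
This agrees with the product rule
\begin{align*}
-\Delta(\chi u)=-\chi\,\Delta u-2\nabla\chi\cdot\nabla u-(\Delta\chi)u,
\end{align*}
because $-\Delta$ has symbol $|\xi|^2$ and $\partial_{x_k}=iD_k$ has symbol $i\xi_k$. The corrections $-2i\,\xi\cdot\nabla\chi\in S^1$ and $-\Delta\chi\in S^0$ vanish wherever $\chi$ is locally constant.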
## Principal Symbols and Commutators
Composition is noncommutative because the correction terms are asymmetric: $\partial_\xi a\cdot \partial_x b$ appears in $a\# b$, while $\partial_\xi b\cdot \partial_x a$ appears in $b\# a$. The leading multiplication terms cancel in a commutator, so commutators usually have lower order than either raw composition suggests. The first surviving term is measured by the Poisson bracket.
[definition: Poisson Bracket]
The Poisson bracket is the bilinear map
\begin{align*}
\{\cdot,\cdot\}:C^\infty(U\times\mathbb R^n)\times C^\infty(U\times\mathbb R^n)\to C^\infty(U\times\mathbb R^n)
\end{align*}
that sends smooth functions $a,b:U\times \mathbb R^n\to \mathbb C$ to the function $\{a,b\}:U\times\mathbb R^n\to\mathbb C$ given by
\begin{align*}
\{a,b\}=\sum_{j=1}^{n}\left(\partial_{\xi_j}a\,\partial_{x_j}b-\partial_{x_j}a\,\partial_{\xi_j}b\right).
\end{align*}
[/definition]
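Two quick evaluations fix the sign convention. They are added as an illustration, and the vector-field notation $H_a$ is introduced here only for convenience. For the coordinate functions,
\begin{align*}
\{\xi_j,x_k\}=\sum_{l=1}^n\left(\partial_{\xi_l}\xi_j\,\partial_{x_l}x_k-\partial_{x_l}\xi_j\,\partial_{\xi_l}x_k\right)=\delta_{jk},\qquad \{x_k,\xi_j\}=-\delta_{jk}.
\end{align*}
More generally $\{b,a\}=-\{a,b\}$, and $\{a,b\}=H_ab$ for the vector field
\begin{align*}
H_a=\sum_{j=1}^n\left(\partial_{\xi_j}a\,\partial_{x_j}-\partial_{x_j}a\,\partial_{\xi_j}\right).
\end{align*}
With this sign, the flow of $H_a$ moves $x$ with velocity $\partial_\xi a$ and $\xi$ with velocity $-\partial_x a$, which is the usual Hamiltonian flow of $a$.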
The bracket is the canonical first-order measure of noncommutation on phase space. For operators, however, noncommutation is filtered by order: the top product terms in the two compositions look like $ab$ and $ba$, so in the scalar case they cancel in the commutator. The real question is what remains as the first nonzero principal contribution. The composition formula predicts that the first surviving term should be built from one $x$-derivative and one $\xi$-derivative, exactly the data assembled by the Poisson bracket.
[quotetheorem:7685]
[citeproof:7685]
This theorem is the bridge from operator algebra to Hamiltonian geometry. The scalar nature of the principal symbols is used when the order $m+m'$ terms cancel; for systems or matrix-valued symbols, the matrix commutator contributes at top order unless extra hypotheses are imposed. The proper-support assumption also has content. On $U=\mathbb R$, let $T$ have smooth kernel $K(x,y)=1$, so $Tu(x)=\int_{\mathbb R}u(y)\,dy$ for $u\in C_c^\infty(\mathbb R)$; this is smoothing as a map to $C^\infty(\mathbb R)$ but it is not properly supported. If $M_\chi$ is multiplication by a nonzero cutoff $\chi\in C_c^\infty(\mathbb R)$, then $[M_\chi,T]$ has smooth kernel $\chi(x)-\chi(y)$, whose support is not proper over either variable. Thus a commutator can be smoothing while still lying outside the properly supported pseudodifferential calculus used by the theorem.
The quotient formulation is also important because lower-order changes to $a$ or $b$ change the displayed Poisson bracket only by an element of $S^{m+m'-2}$. At the level of principal symbols, the commutator corresponds to the Poisson bracket divided by $i$, equivalently $i[\operatorname{Op}(a),\operatorname{Op}(b)]$ has principal symbol $\{a,b\}$, so the phase-space Hamiltonian flow of a symbol governs leading-order noncommutation. A vanishing Poisson bracket only removes this first possible principal term; it does not imply the operators commute exactly, since lower-order terms may still survive. This is the form used later in propagation arguments and in elliptic regularity, where cutoffs are moved through operators at the cost of controlled lower-order errors.
[example: The Basic Commutator With A Function]
Let $D_j=-i\partial_{x_j}$ and let $f\in C^\infty(U)$. For $u\in C_c^\infty(U)$, the commutator acts by
\begin{align*}
[D_j,f]u=D_j(fu)-fD_j u.
\end{align*}
Substituting $D_j=-i\partial_{x_j}$ into each term gives
\begin{align*}
D_j(fu)=-i\partial_{x_j}(fu).
\end{align*}
By the product rule,
\begin{align*}
\partial_{x_j}(fu)=(\partial_{x_j}f)u+f\partial_{x_j}u.
\end{align*}
Therefore
\begin{align*}
D_j(fu)=-i(\partial_{x_j}f)u-if\partial_{x_j}u.
\end{align*}
The second term in the commutator is
\begin{align*}
fD_j u=f(-i\partial_{x_j}u)=-if\partial_{x_j}u.
\end{align*}
Subtracting the two displayed expressions,
\begin{align*}
[D_j,f]u=\left(-i(\partial_{x_j}f)u-if\partial_{x_j}u\right)-\left(-if\partial_{x_j}u\right).
\end{align*}
The two terms $-if\partial_{x_j}u$ and $+if\partial_{x_j}u$ cancel, so
\begin{align*}
[D_j,f]u=-i(\partial_{x_j}f)u.
\end{align*}
Thus $[D_j,f]$ is multiplication by $-i\partial_{x_j}f$, equivalently $[D_j,f]=\operatorname{Op}(-i\partial_{x_j}f)$, which is an operator of order $0$.
The Poisson bracket definition gives, with $a(x,\xi)=\xi_j$ and $b(x,\xi)=f(x)$,
\begin{align*}
\{\xi_j,f\}=\sum_{k=1}^n\left(\partial_{\xi_k}\xi_j\,\partial_{x_k}f-\partial_{x_k}\xi_j\,\partial_{\xi_k}f\right).
\end{align*}
Since $\partial_{\xi_k}\xi_j=\delta_{jk}$, $\partial_{x_k}\xi_j=0$, and $\partial_{\xi_k}f=0$, this becomes
\begin{align*}
\{\xi_j,f\}=\sum_{k=1}^n\delta_{jk}\partial_{x_k}f.
\end{align*}
The Kronecker delta selects the $k=j$ term, hence
\begin{align*}
\{\xi_j,f\}=\partial_{x_j}f.
\end{align*}
Multiplying by $1/i=-i$ gives
\begin{align*}
\frac{1}{i}\{\xi_j,f\}=-i\partial_{x_j}f.
\end{align*}
The commutator symbol obtained from the Poisson bracket is therefore the same order-zero symbol found by applying the operators to a test function.
[/example]
The same calculation persists for general symbols, with the first-order phase-space bracket replacing the ordinary derivative of a coefficient. This is why commutator estimates in microlocal analysis often gain one order.
[example: Commutator Of Two Pseudodifferential Operators]
Let $a\in S^m(U\times\mathbb R^n)$ and $b\in S^{m'}(U\times\mathbb R^n)$ have principal parts $a_m$ and $b_{m'}$. By *[Commutator Order Drop](/theorems/7685)*, the commutator satisfies
\begin{align*}
[\operatorname{Op}(a),\operatorname{Op}(b)]\in \Psi^{m+m'-1}(U),
\end{align*}
and its principal symbol in order $m+m'-1$ is represented by
\begin{align*}
\frac{1}{i}\{a_m,b_{m'}\}.
\end{align*}
Suppose first that the relevant principal symbols have the separated form $a_m=a_m(\xi)$ and $b_{m'}=b_{m'}(x)$. The Poisson bracket is
\begin{align*}
\{a_m,b_{m'}\}=\sum_{j=1}^{n}\left(\partial_{\xi_j}a_m\,\partial_{x_j}b_{m'}-\partial_{x_j}a_m\,\partial_{\xi_j}b_{m'}\right).
\end{align*}
Since $a_m$ is independent of $x$, we have $\partial_{x_j}a_m=0$ for every $j$. Since $b_{m'}$ is independent of $\xi$, we have $\partial_{\xi_j}b_{m'}=0$ for every $j$. Therefore each second term in the summand vanishes:
\begin{align*}
\partial_{x_j}a_m\,\partial_{\xi_j}b_{m'}=0\cdot 0=0.
\end{align*}
Thus
\begin{align*}
\{a_m,b_{m'}\}=\sum_{j=1}^{n}\partial_{\xi_j}a_m(\xi)\,\partial_{x_j}b_{m'}(x).
\end{align*}
Multiplying by $1/i$ gives the order $m+m'-1$ principal symbol of the commutator:
\begin{align*}
\sigma_{m+m'-1}([\operatorname{Op}(a),\operatorname{Op}(b)])=\frac{1}{i}\sum_{j=1}^{n}\partial_{\xi_j}a_m(\xi)\,\partial_{x_j}b_{m'}(x).
\end{align*}
So the leading noncommuting term differentiates the left multiplier in frequency and the right coefficient in space.
If both principal symbols depend only on $\xi$, then $\partial_{x_j}a_m=0$ and $\partial_{x_j}b_{m'}=0$ for every $j$. The two pieces in each Poisson-bracket summand are therefore
\begin{align*}
\partial_{\xi_j}a_m\,\partial_{x_j}b_{m'}=\partial_{\xi_j}a_m\cdot 0=0
\end{align*}
and
\begin{align*}
\partial_{x_j}a_m\,\partial_{\xi_j}b_{m'}=0\cdot \partial_{\xi_j}b_{m'}=0.
\end{align*}
Hence
\begin{align*}
\{a_m,b_{m'}\}=0.
\end{align*}
The order $m+m'-1$ principal symbol of the commutator vanishes, so the commutator begins at order at most $m+m'-2$ rather than at the first possible commutator order.
[/example]
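A worked instance of the separated case, added as an illustration with $U=\mathbb R$, $a(\xi)=\xi^2$ and $b(x)=x^2$, so that $m=2$ and $m'=0$ in the local symbol classes. The bracket is
\begin{align*}
\{\xi^2,x^2\}=\partial_\xi(\xi^2)\,\partial_x(x^2)=4x\xi,\qquad \frac{1}{i}\{\xi^2,x^2\}=-4ix\xi.
\end{align*}
Directly, with $D_x=-i\partial_x$ and the Leibniz rule $D_x^2(fu)=fD_x^2u+2(D_xf)D_xu+(D_x^2f)u$,
\begin{align*}
[D_x^2,x^2]u=2(D_xx^2)\,D_xu+(D_x^2x^2)\,u=-4ix\,D_xu-2u,
\end{align*}
whose symbol is $-4ix\xi-2$. The order $m+m'-1=1$ part matches $\frac{1}{i}\{a,b\}$, and the leftover constant $-2$ has order $m+m'-2=0$.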
## Consequences for the Calculus
The composition and commutator formulae are the working rules used in later chapters. Parametrix constructions use composition to solve $a\# b\sim 1$ term by term, while elliptic regularity uses commutators to move cutoffs through elliptic operators at the cost of lower-order errors. The algebra property of $\Psi^0$ also explains why order-zero operators behave like bounded symbolic multipliers before Sobolev mapping estimates are proved.
[remark: Dependence On Quantisation]
The coefficient $i^{- |\alpha|}$ in the expansion belongs to the Kohn--Nirenberg convention with phase $e^{i(x-y)\cdot \xi}$ and the Fourier transform convention used in the course. Other quantisations, such as Weyl quantisation, redistribute the lower-order terms while preserving the principal product and the principal Poisson bracket term in the commutator.
[/remark]
This dependence is a reminder to separate invariant information from convention-dependent representatives. Principal symbols and the order drop for commutators are stable features of the calculus, while the precise lower-order symbolic product changes with the chosen quantisation.
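A standard one-dimensional illustration, stated here without proof and assuming the usual Weyl quantisation, written $\operatorname{Op}^w$ (a notation not otherwise used in these notes): the symmetrised operator $\tfrac12(xD_x+D_xx)$ equals $\operatorname{Op}^w(x\xi)$, while in Kohn--Nirenberg form
\begin{align*}
\tfrac12(xD_x+D_xx)u=xD_xu+\tfrac12(D_xx)u=xD_xu-\tfrac{i}{2}u,
\end{align*}
so its left symbol is $x\xi-\tfrac{i}{2}$. The principal part $x\xi$ is the same in both conventions; only the order $0$ representative moves.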
Composition showed how pseudodifferential operators multiply forward through the algebra. To treat integration by parts, duality, and positivity constraints, we must understand adjoints and transposes. The principal symbol still governs the leading behavior, but now in the framework of dual pairings and self-adjoint perturbations.
# 7. Adjoints, Transposes, and Positivity
This chapter adds the involutive side of the pseudodifferential calculus. Composition told us how operators multiply; adjoints and transposes tell us how the calculus behaves under duality, integration by parts, and positivity. The main point is that the adjoint of a pseudodifferential operator is again pseudodifferential, with a symbol obtained from the original one by conjugating and then correcting by an asymptotic expansion. This prepares the positivity estimates used in energy methods and elliptic regularity.
## Formal Adjoints in the Kohn-Nirenberg Calculus
The first question is: if
\begin{align*}
A=\operatorname{Op}(a)
\end{align*}
is defined by oscillatory integration, what operator satisfies
\begin{align*}
(Au,v)_{L^2}=(u,A^*v)_{L^2}
\end{align*}
on test functions? For differential operators this is integration by parts. For pseudodifferential operators, the same principle survives, but the answer is no longer given only by conjugating the coefficients.
[definition: Formal Adjoint]
Let $A:C_c^\infty(U)\to \mathcal D'(U)$ be a linear operator. A linear operator $A^*:C_c^\infty(U)\to \mathcal D'(U)$ is a formal adjoint of $A$ if
\begin{align*}
(Au)(v)=\overline{(A^*v)(u)}
\end{align*}
for all $u,v\in C_c^\infty(U)$, with the pairing interpreted through the $L^2$ inner product when both sides are functions.
[/definition]
The formal adjoint is a local distributional notion; it does not by itself assert that an unbounded operator on $L^2$ has a Hilbert-space adjoint with a specified domain. In this chapter all adjoints are first understood on test functions or Schwartz functions, and operator-domain issues enter only after Sobolev mapping estimates have been applied.
[example: Differential Operator Adjoint]
Let $P=\sum_{|\alpha|\le m}a_\alpha(x)D^\alpha$ with $D_j=-i\partial_{x_j}$, and take $u,v\in C_c^\infty(\mathbb R^n)$. For one derivative and any $w\in C_c^\infty(\mathbb R^n)$, compact support removes the boundary term and gives
\begin{align*}
(D_j u,w)_{L^2}=\int_{\mathbb R^n}(-i\partial_{x_j}u)\overline w\,dx=\int_{\mathbb R^n}u\,\overline{(-i\partial_{x_j}w)}\,dx=(u,D_jw)_{L^2}.
\end{align*}
Applying this identity successively to the multi-index derivative $D^\alpha$ gives
\begin{align*}
(a_\alpha D^\alpha u,v)_{L^2}=(D^\alpha u,\overline{a_\alpha}v)_{L^2}=(u,D^\alpha(\overline{a_\alpha}v))_{L^2}.
\end{align*}
Summing over $|\alpha|\le m$, the formal adjoint is therefore
\begin{align*}
P^*v=\sum_{|\alpha|\le m}D^\alpha(\overline{a_\alpha}v).
\end{align*}
The lower-order terms are exactly the terms in the Leibniz expansion. For each $\alpha$,
\begin{align*}
D^\alpha(\overline{a_\alpha}v)=\sum_{\beta\le \alpha}\binom{\alpha}{\beta}(D^{\alpha-\beta}\overline{a_\alpha})D^\beta v.
\end{align*}
The term with $\beta=\alpha$ is $\overline{a_\alpha}D^\alpha v$, while every term with $\beta<\alpha$ contains fewer derivatives of $v$. Thus the top-order part is obtained by conjugating the coefficients, and the extra terms come precisely from derivatives landing on the coefficients.
[/example]
The example already contains the pattern for pseudodifferential operators: conjugation controls the top order, and differentiation in $x$ and $\xi$ produces the lower-order expansion. The question is whether this pattern remains valid when the operator is given by an oscillatory integral rather than a finite sum of derivatives.
[quotetheorem:7686]
[citeproof:7686]
The hypotheses in the adjoint theorem are doing real work. The symbol estimates ensure that every integration by parts in $\xi$ improves the order in a controlled way; without them, the Taylor remainders need not define a pseudodifferential symbol, and the adjoint may fall outside the same calculus. A concrete failure is given by the amplitude
\begin{align*}
a(x,\xi)=e^{|\xi|^2}\chi(x)
\end{align*}
with $\chi\in C_c^\infty(\mathbb R^n)$ not identically zero: the amplitude and all of its $\xi$-derivatives grow faster than any polynomial in $\xi$ wherever $\chi\ne 0$, so it is not in any $S^m_{1,0}$ class and the Taylor-integration-by-parts procedure does not produce an asymptotic symbol expansion. The theorem also does not say that $A$ is a self-adjoint unbounded operator on $L^2$; it identifies the formal adjoint on test functions or Schwartz functions, while Hilbert-space domains require separate Sobolev mapping and closure arguments.
The adjoint formula gives the full lower-order expansion, but many arguments only need the top-order consequence. Extracting that consequence is useful because it tells us which features of an operator are intrinsic at the principal-symbol level and which depend on the chosen quantisation.
[quotetheorem:7687]
[citeproof:7687]
The global Schwartz-space setting in the statement is part of the theorem, not decoration. It ensures that the formal adjoint is defined by the oscillatory kernel without support escaping the test-function category; on an open set, the analogous statement is made for properly supported operators so that adjoints and compositions remain local. If one starts instead with an arbitrary continuous map $C_c^\infty\to \mathcal D'$, there is still a distributional adjoint when the pairings make sense, but there need not be a symbol and hence no principal-symbol class to conjugate.
The result is also only a principal-order statement. It does not claim $A^*=\operatorname{Op}(\overline a)$, because every term $\partial_\xi^\alpha D_x^\alpha\overline a$ with $|\alpha|\ge 1$ contributes to lower orders in Kohn-Nirenberg quantisation. For example, a variable coefficient differential operator has derivatives falling on the coefficients after integration by parts, so the leading coefficients conjugate while the lower-order part changes. This is exactly the distinction needed later: positivity and symmetry arguments often begin at the principal symbol, but the lower-order remainders must be estimated rather than ignored.
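A one-dimensional check makes the lower-order term explicit. The sketch below assumes the standard Kohn-Nirenberg normalisation of the adjoint expansion, $a^*\sim\sum_\alpha\frac{1}{\alpha!}\partial_\xi^\alpha D_x^\alpha\overline a$, and uses an illustrative coefficient $c\in C_b^\infty(\mathbb R)$ that does not appear in the statements above. Take $n=1$ and $a(x,\xi)=c(x)\xi$, so that $\operatorname{Op}(a)=c(x)D$. Only the terms $\alpha=0$ and $\alpha=1$ are nonzero:
\begin{align*}
a^*(x,\xi)=\overline{c(x)}\,\xi+\partial_\xi D_x\bigl(\overline{c(x)}\,\xi\bigr)=\overline{c(x)}\,\xi+D_x\overline{c(x)}.
\end{align*}
Direct integration by parts for $u,v\in C_c^\infty(\mathbb R)$ gives the same operator:
\begin{align*}
(cDu,v)_{L^2}=(Du,\overline c\,v)_{L^2}=(u,D(\overline c\,v))_{L^2}=(u,\overline c\,Dv+(D\overline c)v)_{L^2}.
\end{align*}
The principal part is $\overline c\,\xi$, and in this case $A^*$ differs from $\operatorname{Op}(\overline a)$ by the order-zero multiplication operator $D_x\overline c=-i\,\overline c{}'$.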
The principal-symbol statement is exact for operators whose symbols do not depend on $x$, because all correction terms disappear. This special case is the cleanest model for the positivity results later in the chapter.
[example: Fourier Multiplier Adjoint]
Let $a(x,\xi)=m(\xi)$, so $D_x^\alpha \overline{a(x,\xi)}=0$ for every multi-index $\alpha\ne 0$, while the $\alpha=0$ term is $\overline{m(\xi)}$. Thus the adjoint symbol expansion has only one nonzero term, namely $\overline{m(\xi)}$. Equivalently, for $u,v\in \mathcal S(\mathbb R^n)$, Plancherel's identity gives
\begin{align*}(Au,v)_{L^2}=\int_{\mathbb R^n} m(\xi)\hat u(\xi)\overline{\hat v(\xi)}\,d\xi\end{align*}
and the integrand may be rewritten as
\begin{align*}m(\xi)\hat u(\xi)\overline{\hat v(\xi)}=\hat u(\xi)\overline{\overline{m(\xi)}\hat v(\xi)}.\end{align*}
Therefore
\begin{align*}(Au,v)_{L^2}=\int_{\mathbb R^n}\hat u(\xi)\overline{\overline{m(\xi)}\hat v(\xi)}\,d\xi=(u,\overline m(D)v)_{L^2}.\end{align*}
So $A^*=\overline m(D)$ as a formal adjoint on $\mathcal S(\mathbb R^n)$. If $m$ is real-valued, then $\overline m=m$ and $A^*=A$. If moreover $m(\xi)\ge 0$ for all $\xi$, then
\begin{align*}(Au,u)_{L^2}=\int_{\mathbb R^n}m(\xi)|\hat u(\xi)|^2\,d\xi\ge 0,\end{align*}
so the multiplier is exactly positive, with no lower-order correction term.
[/example]
This multiplier case is the model against which the variable-coefficient situation is measured. Variable coefficients introduce lower-order commutators, so real principal symbols give self-adjointness only modulo operators of smaller order.
## Transposes and Distributional Duality
Adjoints use complex conjugation and the Hilbert-space structure of $L^2$. The transpose answers a nearby but different question: how does an operator act on distributions by duality with test functions? This distinction matters because pseudodifferential operators are often applied to distributions before any $L^2$ domain has been chosen.
[definition: Formal Transpose]
Let $A:C_c^\infty(U)\to C^\infty(U)$ be continuous in the test-function topology. The formal transpose ${}^tA:C_c^\infty(U)\to \mathcal D'(U)$ is characterised by
\begin{align*}
({}^tA\phi)(u)=\phi(Au)
\end{align*}
for $u,\phi\in C_c^\infty(U)$, using the bilinear distributional pairing.
[/definition]
The transpose is bilinear rather than sesquilinear. When ${}^tA$ maps test functions to test functions, or when proper support supplies the same effect locally, it extends $A$ to distributions by the rule $(AT)(\phi)=T({}^tA\phi)$. To use this extension inside the pseudodifferential calculus, we need to know that this bilinear dual operation keeps us inside the same symbol classes and to identify the symbol it produces.
[quotetheorem:7688]
[citeproof:7688]
The transpose formula depends on both the Kohn-Nirenberg placement of the amplitude and the symbol estimates. The sign $\xi\mapsto -\xi$ comes from rewriting the transposed kernel back into the same left-quantised form; for a different quantisation convention, such as Weyl quantisation, the displayed formula is replaced by the corresponding convention-specific rule. The estimates in $S^m_{1,0}$ are also needed to keep the Taylor expansion asymptotic inside the calculus. For instance, replacing a symbol by
\begin{align*}
e^{|\xi|^2}\chi(x)
\end{align*}
with $\chi\in C_c^\infty(\mathbb R^n)$ destroys the polynomial derivative bounds in $\xi$, so the transposed kernel may still define a bilinear distributional pairing on test functions but has no asymptotic expansion in the $S^m_{1,0}$ scale. A general continuous operator $C_c^\infty\to C^\infty$ has a transpose on distributions, but it need not be pseudodifferential, and its transpose need not have any symbol expansion at all.
The sign change in the transpose is a reminder that the covariable records oscillation direction. A first-order constant-coefficient operator shows the distinction between transpose and adjoint without any lower-order terms obscuring the computation.
[example: Derivative and Transpose]
Regard $D_j=-i\partial_{x_j}$ as a continuous map $\mathcal S(\mathbb R^n)\to \mathcal S(\mathbb R^n)$, and also as a test-function operator $C_c^\infty(\mathbb R^n)\to C_c^\infty(\mathbb R^n)$. Since multiplication by $\xi_j$ in Fourier variables gives
\begin{align*}\operatorname{Op}(\xi_j)u=D_j u,\end{align*}
the Kohn-Nirenberg symbol of $D_j$ is $\xi_j$. The transpose formula sends this symbol to $\xi_j$ evaluated at $-\xi$, and there are no $x$-derivative correction terms because $\xi_j$ is independent of $x$. Hence the transpose symbol is $-\xi_j$.
The same conclusion is visible from the bilinear pairing. For $u,\phi\in C_c^\infty(\mathbb R^n)$, integration by parts gives
\begin{align*}\int_{\mathbb R^n} (D_j u)(x)\phi(x)\,dx=\int_{\mathbb R^n} (-i\partial_{x_j}u)(x)\phi(x)\,dx.\end{align*}
The boundary term vanishes because $u$ and $\phi$ have compact support, so
\begin{align*}\int_{\mathbb R^n} (-i\partial_{x_j}u)\phi\,dx=\int_{\mathbb R^n} u(i\partial_{x_j}\phi)\,dx.\end{align*}
Since $i\partial_{x_j}\phi=-D_j\phi$, this becomes
\begin{align*}\int_{\mathbb R^n} (D_j u)\phi\,dx=\int_{\mathbb R^n}u(-D_j\phi)\,dx.\end{align*}
Thus ${}^tD_j=-D_j$.
For the formal adjoint, the adjoint symbol formula gives $\xi_j$ again: $\xi_j$ is real-valued and independent of $x$, so conjugation leaves it unchanged and all correction terms with $D_x^\alpha$ for $\alpha\ne 0$ vanish. Equivalently, for $u,v\in\mathcal S(\mathbb R^n)$,
\begin{align*}(D_j u,v)_{L^2}=\int_{\mathbb R^n}(-i\partial_{x_j}u)\overline v\,dx.\end{align*}
Integrating by parts and using rapid decay to remove the boundary term gives
\begin{align*}\int_{\mathbb R^n}(-i\partial_{x_j}u)\overline v\,dx=\int_{\mathbb R^n}u(i\partial_{x_j}\overline v)\,dx.\end{align*}
Because $i\partial_{x_j}\overline v=\overline{-i\partial_{x_j}v}=\overline{D_jv}$, we get
\begin{align*}(D_j u,v)_{L^2}=(u,D_jv)_{L^2}.\end{align*}
Therefore $D_j^*=D_j$ on $\mathcal S(\mathbb R^n)$, while the transpose is ${}^tD_j=-D_j$; the sign difference comes from bilinear duality versus sesquilinear $L^2$ duality.
[/example]
This example separates the two dualities in the smallest possible case. In microlocal arguments, the transpose is used to define $Au$ for distributions $u$, while the adjoint is used to move operators across $L^2$ inner products.
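With a variable coefficient the two dualities also differ in their lower-order terms. As a hedged check, assume the transpose expansion has the standard Kohn-Nirenberg form ${}^ta(x,\xi)\sim\sum_\alpha\frac{(-1)^{|\alpha|}}{\alpha!}(\partial_\xi^\alpha D_x^\alpha a)(x,-\xi)$, and take the illustrative symbol $a(x,\xi)=c(x)\xi$ on $\mathbb R$ with $c\in C_b^\infty(\mathbb R)$, a coefficient introduced only for this computation. The terms $\alpha=0$ and $\alpha=1$ give
\begin{align*}
{}^ta(x,\xi)=c(x)(-\xi)-D_xc(x)=-c(x)\xi-D_xc(x).
\end{align*}
The bilinear pairing confirms this for $u,\phi\in C_c^\infty(\mathbb R)$:
\begin{align*}
\int_{\mathbb R}(cDu)\phi\,dx=-\int_{\mathbb R}u\,D(c\phi)\,dx=\int_{\mathbb R}u\bigl(-c\,D\phi-(Dc)\phi\bigr)\,dx.
\end{align*}
For comparison, the formal adjoint of the same operator is $\overline c\,D+(D\overline c)$, so for real $c$ the transpose is the negative of the adjoint, as in the constant-coefficient case.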
## Real Principal Symbols and Self-Adjointness up to Lower Order
The next problem is to decide how much symmetry is forced by the leading symbol. A real differential operator need not be exactly self-adjoint, because derivatives can strike the coefficients, but its highest-order behaviour is symmetric when the leading coefficients are real in the correct sense. The pseudodifferential version is identical at the principal level.
[definition: Formal Self-Adjointness Modulo Lower Order]
Let $A:C_c^\infty(\mathbb R^n)\to \mathcal D'(\mathbb R^n)$ be a properly supported operator belonging to $\Psi^m_{1,0}(\mathbb R^n)$, and let $A^*:C_c^\infty(\mathbb R^n)\to \mathcal D'(\mathbb R^n)$ denote its formal adjoint. We say that $A$ is formally self-adjoint modulo lower order if
\begin{align*}
A-A^*\in \Psi^{m-1}_{1,0}(\mathbb R^n).
\end{align*}
[/definition]
This definition measures symmetry at the level seen by the principal symbol. The obstruction to checking it directly is that the adjoint changes the full symbol by lower-order correction terms coming from differentiating coefficients and from the quantisation convention. At principal order, though, taking the adjoint should only conjugate the leading symbol. Thus the useful criterion is whether the leading symbol is real, because that is precisely the part of $A-A^*$ that cannot be hidden in $\Psi^{m-1}_{1,0}$.
[quotetheorem:7689]
[citeproof:7689]
The theorem does not assert exact symmetry. It says that any failure of symmetry has lower order, and a familiar second-order elliptic operator shows how this separates the leading quadratic form from drift and potential terms.
The reality hypothesis cannot be dropped. If $A=i\langle D\rangle^m$ is the Fourier multiplier with symbol $i\langle\xi\rangle^m$, then $A^*=-i\langle D\rangle^m$ and $A-A^*=2i\langle D\rangle^m$, which still has order $m$ rather than order $m-1$. The conclusion also uses the order filtration of the pseudodifferential calculus: outside $\Psi^m_{1,0}$ there may be no principal-symbol quotient in which the cancellation $a_m-\overline{a_m}=0$ can be interpreted. Thus the theorem isolates a leading-order obstruction to symmetry; it does not decide whether lower-order drift, potential, or subprincipal terms make the full operator formally self-adjoint.
[example: Symmetric Second-Order Elliptic Operators]
Let
\begin{align*}
Lu=-\sum_{i,j=1}^n\partial_{x_i}(a_{ij}(x)\partial_{x_j}u)+\sum_{j=1}^n b_j(x)\partial_{x_j}u+c(x)u
\end{align*}
on $C_c^\infty(\mathbb R^n)$, where $a_{ij}=a_{ji}$ are real and smooth. Since $D_j=-i\partial_{x_j}$, we have $\partial_{x_j}=iD_j$, and the second-order part expands as
\begin{align*}
-\partial_{x_i}(a_{ij}\partial_{x_j}u)=-(\partial_{x_i}a_{ij})\partial_{x_j}u-a_{ij}\partial_{x_i}\partial_{x_j}u
\end{align*}
and therefore
\begin{align*}
-\partial_{x_i}(a_{ij}\partial_{x_j}u)=-i(\partial_{x_i}a_{ij})D_j u+a_{ij}D_iD_j u.
\end{align*}
The terms $-i(\partial_{x_i}a_{ij})D_j u$, $b_j\partial_{x_j}u=i b_jD_j u$, and $c u$ have order at most $1$, so the order-$2$ symbol is
\begin{align*}
l_2(x,\xi)=\sum_{i,j=1}^n a_{ij}(x)\xi_i\xi_j.
\end{align*}
Because each $a_{ij}$ is real, $l_2(x,\xi)$ is real-valued. By *Real Principal Symbol Implies Lower-Order Skew Part*, this gives
\begin{align*}
L-L^*\in \Psi^1_{1,0}(\mathbb R^n).
\end{align*}
The divergence-form second-order part is in fact formally self-adjoint. If
\begin{align*}
L_0u=-\sum_{i,j=1}^n\partial_{x_i}(a_{ij}\partial_{x_j}u),
\end{align*}
then integration by parts and compact support give
\begin{align*}
(L_0u,v)_{L^2}=\sum_{i,j=1}^n\int_{\mathbb R^n}a_{ij}\partial_{x_j}u\,\overline{\partial_{x_i}v}\,dx.
\end{align*}
Renaming $i$ and $j$ and using $a_{ij}=a_{ji}$ gives
\begin{align*}
(L_0u,v)_{L^2}=\sum_{i,j=1}^n\int_{\mathbb R^n}a_{ij}\partial_{x_i}u\,\overline{\partial_{x_j}v}\,dx.
\end{align*}
Integrating by parts in the reverse direction,
\begin{align*}
\sum_{i,j=1}^n\int_{\mathbb R^n}a_{ij}\partial_{x_i}u\,\overline{\partial_{x_j}v}\,dx=(u,L_0v)_{L^2}.
\end{align*}
For the lower-order part $Bu=\sum_j b_j\partial_{x_j}u+cu$, the same integration-by-parts calculation gives
\begin{align*}
B^*v=\sum_{j=1}^n\left(-\overline{b_j}\partial_{x_j}v-(\partial_{x_j}\overline{b_j})v\right)+\overline c\,v.
\end{align*}
Thus $L$ is formally self-adjoint exactly when the lower-order coefficients satisfy
\begin{align*}
b_j=-\overline{b_j}\quad\text{for every }j
\end{align*}
and
\begin{align*}
c=\overline c-\sum_{j=1}^n\partial_{x_j}\overline{b_j}.
\end{align*}
The real quadratic principal symbol controls symmetry only at order $2$; exact formal self-adjointness is decided by these first-order and zeroth-order identities.
[/example]
The second-order example is the PDE prototype: principal symmetry comes from the quadratic form in the highest derivatives, while lower-order drift terms determine exact self-adjointness. Pseudodifferential notation isolates this separation cleanly, and the same separation becomes even sharper in Weyl quantisation.
[example: Weyl Quantisation of Real Symbols]
Let $a\in S^m_{1,0}(\mathbb R^n\times\mathbb R^n)$, and define the Weyl operator by
\begin{align*}
\operatorname{Op}^w(a)u(x)=(2\pi)^{-n}\int_{\mathbb R^n}\int_{\mathbb R^n}e^{i(x-y)\cdot\xi}a\left(\frac{x+y}{2},\xi\right)u(y)\,dy\,d\xi.
\end{align*}
Its distribution kernel is
\begin{align*}
K_a(x,y)=(2\pi)^{-n}\int_{\mathbb R^n}e^{i(x-y)\cdot\xi}a\left(\frac{x+y}{2},\xi\right)\,d\xi.
\end{align*}
The kernel of the formal adjoint is $\overline{K_a(y,x)}$. Since
\begin{align*}
K_a(y,x)=(2\pi)^{-n}\int_{\mathbb R^n}e^{i(y-x)\cdot\xi}a\left(\frac{x+y}{2},\xi\right)\,d\xi,
\end{align*}
complex conjugation gives
\begin{align*}
\overline{K_a(y,x)}=(2\pi)^{-n}\int_{\mathbb R^n}e^{-i(y-x)\cdot\xi}\overline{a\left(\frac{x+y}{2},\xi\right)}\,d\xi.
\end{align*}
Because $-i(y-x)\cdot\xi=i(x-y)\cdot\xi$, this becomes
\begin{align*}
\overline{K_a(y,x)}=(2\pi)^{-n}\int_{\mathbb R^n}e^{i(x-y)\cdot\xi}\overline{a\left(\frac{x+y}{2},\xi\right)}\,d\xi.
\end{align*}
If $a$ is real-valued, then $\overline{a((x+y)/2,\xi)}=a((x+y)/2,\xi)$, so $\overline{K_a(y,x)}=K_a(x,y)$.
Thus the adjoint kernel equals the original kernel, and therefore $(\operatorname{Op}^w(a))^*=\operatorname{Op}^w(a)$ exactly as a formal identity on $\mathcal S(\mathbb R^n)$. The symmetry is exact because Weyl quantisation places the spatial variable at $(x+y)/2$, so interchanging $x$ and $y$ leaves the amplitude unchanged.
[/example]
Kohn-Nirenberg quantisation is computationally convenient for composition, while Weyl quantisation aligns better with symmetry. The calculus permits movement between the two conventions at the cost of lower-order symbol corrections.
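The size of the correction can be seen in the simplest variable-coefficient case. The following is a sketch for the illustrative symbol $a(x,\xi)=c(x)\xi$ on $\mathbb R$, with $c\in C_b^\infty(\mathbb R)$ real-valued; the coefficient $c$ is an assumption made only for this computation. Since $\xi e^{i(x-y)\xi}=-D_ye^{i(x-y)\xi}$, integrating by parts in $y$ in the Weyl formula gives
\begin{align*}
\operatorname{Op}^w(a)u(x)=(2\pi)^{-1}\int_{\mathbb R}\int_{\mathbb R}e^{i(x-y)\xi}D_y\left[c\left(\frac{x+y}{2}\right)u(y)\right]dy\,d\xi=c(x)Du(x)+\tfrac12(Dc)(x)u(x).
\end{align*}
Hence
\begin{align*}
\operatorname{Op}^w(c\,\xi)=\operatorname{Op}\bigl(c\,\xi+\tfrac12D_xc\bigr)=\tfrac12\bigl(c\,D+D\,c\bigr),
\end{align*}
where $D\,c$ denotes the composition $u\mapsto D(cu)$. The Kohn-Nirenberg symbol of the Weyl operator differs from $c\,\xi$ by the order-zero term $\tfrac12D_xc$, and the symmetrised form on the right is visibly formally self-adjoint when $c$ is real.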
## Positivity and the Sharp Garding Inequality
The final question is more delicate: if the symbol is nonnegative, must the operator have nonnegative quadratic form? For Fourier multipliers the answer is yes by Plancherel. For variable-coefficient pseudodifferential operators, noncommutativity produces lower-order errors, and the correct statement is a lower bound rather than exact positivity.
[definition: Semibounded Quadratic Form]
Let $A:C_c^\infty(\mathbb R^n)\to \mathcal D'(\mathbb R^n)$ be formally self-adjoint. We say that $A$ is semibounded from below by $-C$ on $L^2$ if
\begin{align*}
\operatorname{Re}(Au,u)_{L^2}\ge -C\|u\|_{L^2}^2
\end{align*}
for all $u\in C_c^\infty(\mathbb R^n)$.
[/definition]
This notion is weaker than positivity but strong enough for energy estimates. The basic problem is now to turn a pointwise nonnegative phase-space symbol into this analytic lower bound for the associated operator.
[quotetheorem:7690]
[citeproof:7690]
[illustration:sharp-garding-phase-space-boxes]
Each hypothesis in sharp Garding has a distinct role. Real-valuedness is a [symmetry condition](/theorems/1360): if the symbol has an imaginary part, the real part of the quadratic form only sees the symmetric component of the operator, not the full symbol. For example, the purely imaginary constant symbol
\begin{align*}
a(\xi)=i
\end{align*}
gives
\begin{align*}
\operatorname{Re}(\operatorname{Op}(a)u,u)_{L^2}=0,
\end{align*}
so it is neither a counterexample to the lower bound nor evidence for positivity of the operator: the real part of the symbol vanishes identically, and the imaginary part is invisible to $\operatorname{Re}(\operatorname{Op}(a)u,u)_{L^2}$. Nonnegativity of the real symbol is the separate coercive input: if $a(x,\xi)=-1$, then
\begin{align*}
(\operatorname{Op}(a)u,u)_{L^2}=-\|u\|_{L^2}^2,
\end{align*}
and no lower bound better than the negative constant built into the symbol is possible. The order-zero assumption fixes the scale of the error term; for a nonnegative symbol of order $m$, the corresponding sharp lower bound has an error measured by $\|u\|_{H^{(m-1)/2}}^2$, one full order below the norm $\|u\|_{H^{m/2}}^2$ that bounds the quadratic form itself.
The word sharp refers precisely to this one-derivative gap between the size of the quadratic form and the size of the admissible negative error. For order $0$ the general scale gives the $H^{-1/2}$ norm, which is dominated by the $L^2$ norm, so the negative contribution is in particular $L^2$-bounded. Fourier multipliers show the limiting case where no variable-coefficient error is present.
[example: Positive Elliptic Multipliers]
Let $a(\xi)=\langle\xi\rangle^{-2s}$ with $s\ge 0$, where $\langle\xi\rangle=(1+|\xi|^2)^{1/2}$. Since $a$ is independent of $x$, $\operatorname{Op}(a)$ is the Fourier multiplier with
\begin{align*}
\widehat{\operatorname{Op}(a)u}(\xi)=\langle\xi\rangle^{-2s}\hat u(\xi).
\end{align*}
For $u\in\mathcal S(\mathbb R^n)$, unitarity of the Fourier transform under the symmetric normalisation gives
\begin{align*}
(\operatorname{Op}(a)u,u)_{L^2}=\int_{\mathbb R^n}\widehat{\operatorname{Op}(a)u}(\xi)\overline{\hat u(\xi)}\,d\xi.
\end{align*}
Substituting the multiplier formula gives
\begin{align*}
(\operatorname{Op}(a)u,u)_{L^2}=\int_{\mathbb R^n}\langle\xi\rangle^{-2s}\hat u(\xi)\overline{\hat u(\xi)}\,d\xi.
\end{align*}
Since $\hat u(\xi)\overline{\hat u(\xi)}=|\hat u(\xi)|^2$, this becomes
\begin{align*}
(\operatorname{Op}(a)u,u)_{L^2}=\int_{\mathbb R^n}\langle\xi\rangle^{-2s}|\hat u(\xi)|^2\,d\xi.
\end{align*}
For every $\xi$, $\langle\xi\rangle=(1+|\xi|^2)^{1/2}\ge 1$ and $s\ge 0$, so $0<\langle\xi\rangle^{-2s}\le 1$, and $|\hat u(\xi)|^2\ge 0$. Hence
\begin{align*}
(\operatorname{Op}(a)u,u)_{L^2}\ge 0.
\end{align*}
The positivity is exact because the symbol depends only on $\xi$: multiplication by the nonnegative function $\langle\xi\rangle^{-2s}$ in Fourier variables introduces no $x$-dependent commutator or lower-order correction.
[/example]
The multiplier example shows why the [Garding inequality](/theorems/92) is the correct variable-coefficient analogue of Plancherel positivity. The lower bound is the price paid for allowing the symbol to depend on both $x$ and $\xi$, as the next localised example illustrates.
[example: Nonnegative Cutoff Symbol]
Let $\chi\in C_c^\infty(\mathbb R^n)$ satisfy $\chi\ge 0$, and set
\begin{align*}
a(x,\xi)=\chi(x)\langle\xi\rangle^{-1}.
\end{align*}
For every $(x,\xi)$, both factors are nonnegative, since $\chi(x)\ge 0$ and $\langle\xi\rangle^{-1}=(1+|\xi|^2)^{-1/2}>0$. Hence
\begin{align*}
a(x,\xi)\ge 0.
\end{align*}
Also, for multi-indices $\alpha,\beta$,
\begin{align*}
\partial_x^\beta\partial_\xi^\alpha a(x,\xi)=(\partial_x^\beta\chi)(x)\partial_\xi^\alpha\langle\xi\rangle^{-1}.
\end{align*}
The derivatives of $\chi$ are bounded because $\chi\in C_c^\infty$, and each $\xi$-derivative of $\langle\xi\rangle^{-1}$ gains one power of decay, so
\begin{align*}
|\partial_x^\beta\partial_\xi^\alpha a(x,\xi)|\le C_{\alpha\beta}\langle\xi\rangle^{-1-|\alpha|}.
\end{align*}
Thus $a\in S^{-1}_{1,0}$. Since $\langle\xi\rangle^{-1-|\alpha|}\le \langle\xi\rangle^{-|\alpha|}$, the same estimate also gives $a\in S^0_{1,0}$.
The symbol is real-valued and nonnegative, so *Sharp Garding Inequality for Order-Zero Symbols* applies to $a$ as an order-zero symbol. Therefore there is a constant $C\ge 0$ such that, for every $u\in\mathcal S(\mathbb R^n)$,
\begin{align*}
\operatorname{Re}(\operatorname{Op}(a)u,u)_{L^2}\ge -C\|u\|_{L^2}^2.
\end{align*}
The stronger fact $a\in S^{-1}_{1,0}$ says that the operator gains one derivative in the pseudodifferential scale. After inserting compact spatial cutoffs, this gain combines with Rellich compactness, so the possible negative contribution is a bounded localized error rather than a high-frequency obstruction.
[/example]
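The claim in the example that each $\xi$-derivative of $\langle\xi\rangle^{-1}$ gains one power of decay can be checked by hand for the first two derivatives. This is a sketch with explicit but non-optimal constants. Since $\partial_{\xi_k}\langle\xi\rangle=\xi_k\langle\xi\rangle^{-1}$,
\begin{align*}
\partial_{\xi_k}\langle\xi\rangle^{-1}=-\xi_k\langle\xi\rangle^{-3},\qquad |\partial_{\xi_k}\langle\xi\rangle^{-1}|\le\langle\xi\rangle^{-2},
\end{align*}
because $|\xi_k|\le\langle\xi\rangle$. Differentiating once more,
\begin{align*}
\partial_{\xi_l}\partial_{\xi_k}\langle\xi\rangle^{-1}=-\delta_{kl}\langle\xi\rangle^{-3}+3\xi_k\xi_l\langle\xi\rangle^{-5},\qquad |\partial_{\xi_l}\partial_{\xi_k}\langle\xi\rangle^{-1}|\le 4\langle\xi\rangle^{-3}.
\end{align*}
Multiplying by the bounded factor $(\partial_x^\beta\chi)(x)$ gives the estimates of order $-1-|\alpha|$ for $|\alpha|\le 2$; higher derivatives follow the same pattern by induction.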
This example is typical in microlocal cutoffs: a symbol may be pointwise nonnegative in phase space without producing an exactly positive operator. The inequality still prevents uncontrolled negative energy from accumulating at high frequency.
## Elementary Positivity Mechanisms
Garding's inequality is a theorem of the calculus, but many positivity estimates in applications come from more concrete mechanisms. The goal is to recognise when positivity is exact, when it is a square plus lower order, and when only semiboundedness is available.
[quotetheorem:7691]
[citeproof:7691]
This theorem explains the square-root philosophy behind Garding: if a nonnegative symbol could be quantised as an exact square, positivity would be immediate. The same mechanism also appears in elliptic estimates, where parametrices convert control of an equation into a square estimate for Sobolev norms.
The assumptions again matter. Proper support, or the global Schwartz-space substitute, keeps $B^*$ and $B^*B$ in the operator category where the formal adjoint and composition theorem apply. A concrete failure mode is an operator on an open set whose kernel is supported along pairs $(x,y)$ with $y$ escaping every compact set while $x$ remains in a fixed compact set. Such an operator can send compactly supported test functions into smooth functions with nonlocal support behaviour, so applying a second local operator need not produce a properly supported kernel or a symbol in the intended calculus. Closure of the calculus is also essential: the principal symbol $|\sigma_r(B)|^2$ comes from applying the product theorem to $B^*$ and $B$, not from a pointwise manipulation alone.
The theorem gives a sufficient mechanism for positivity, not a converse. It does not say that every nonnegative principal symbol has an exact square root inside the same quantisation with no lower-order error, and it does not turn pointwise positivity of an arbitrary Kohn-Nirenberg symbol into exact positivity of the operator. Symbols with zeros may have square roots that fail to be smooth at the zero set, and even when a smooth microlocal square root exists, noncommutation between $x$ and $D$ produces lower-order remainders. This limitation is precisely why sharp Garding gives a lower bound rather than an exact factorisation for general nonnegative symbols.
[example: Elliptic Energy from a Parametrix]
Let $P\in\Psi^m_{1,0}(\mathbb R^n)$ be elliptic on an open conic region of phase space. Choose an order-zero microlocal cutoff $A=\operatorname{Op}(\psi)$ whose symbol is supported in this elliptic region, and choose a slightly larger cutoff $A_1=\operatorname{Op}(\psi_1)$ with $\psi_1=1$ on $\operatorname{supp}\psi$. Ellipticity gives a microlocal parametrix $Q\in\Psi^{-m}_{1,0}$ such that
\begin{align*}
AQP=A+S
\end{align*}
on this region, with $S\in\Psi^{-\infty}$. Applying both sides to $u$ gives
\begin{align*}
Au=AQPu-Su.
\end{align*}
Let $\Lambda^s=\langle D\rangle^s$. Then
\begin{align*}
\Lambda^s Au=\Lambda^s AQPu-\Lambda^s Su.
\end{align*}
Taking $L^2$ norms and using the triangle inequality,
\begin{align*}
\|\Lambda^s Au\|_{L^2}\le \|\Lambda^s AQPu\|_{L^2}+\|\Lambda^s Su\|_{L^2}.
\end{align*}
The first operator $\Lambda^s AQ$ has order $s-m$, since $\Lambda^s\in\Psi^s_{1,0}$, $A\in\Psi^0_{1,0}$, and $Q\in\Psi^{-m}_{1,0}$. Hence the Sobolev mapping estimate for pseudodifferential operators gives
\begin{align*}
\|\Lambda^s AQPu\|_{L^2}\le C\|Pu\|_{H^{s-m}_{\mathrm{mic}}}.
\end{align*}
For the smoothing term, $S\in\Psi^{-\infty}$ means that for every $N$ the operator $\Lambda^s S$ maps $H^{-N}$ continuously into $L^2$, so
\begin{align*}
\|\Lambda^s Su\|_{L^2}\le C_N\|u\|_{H^{-N}}.
\end{align*}
Combining the two estimates gives
\begin{align*}
\|u\|_{H^s_{\mathrm{mic}}}=\|\Lambda^s Au\|_{L^2}\le C\|Pu\|_{H^{s-m}_{\mathrm{mic}}}+C_N\|u\|_{H^{-N}}.
\end{align*}
The square mechanism enters through the identity
\begin{align*}
\|\Lambda^s Au\|_{L^2}^2=((\Lambda^s A)^*(\Lambda^s A)u,u)_{L^2}\ge 0.
\end{align*}
Thus the parametrix converts ellipticity of $P$ into control of the microlocal Sobolev square, while the smoothing remainder and the low-frequency part are exactly what is measured by the harmless term $C_N\|u\|_{H^{-N}}$.
[/example]
The square mechanism is also the bridge to the next part of the course. Once parametrices and positivity are combined, elliptic regularity becomes an energy statement: control of $Pu$ controls the microlocal Sobolev size of $u$ away from characteristic directions.
[remark: What Positivity Does Not Say]
Pointwise positivity of a Kohn-Nirenberg symbol is not the same as positivity of the operator. The symbol records the leading phase-space behaviour, while the operator also contains ordering effects from the noncommutation of $x$ and $D$. Sharp Garding converts pointwise nonnegativity into a stable lower bound, which is the form of positivity used in most pseudodifferential energy estimates.
[/remark]
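A concrete computation shows the ordering effect described in the remark. The example below is only a sketch: the symbol $a(x,\xi)=x^2\xi^2$ on $\mathbb R\times\mathbb R$ has a coefficient that is unbounded in $x$, so it lies outside the uniform classes $S^m_{1,0}$ used in this chapter and is chosen purely because the integrals are explicit. The symbol is pointwise nonnegative, and its Kohn-Nirenberg quantisation is $\operatorname{Op}(a)=x^2D^2=-x^2\partial_x^2$. For the Gaussian $u(x)=e^{-x^2/2}$ one has $u''=(x^2-1)u$, hence
\begin{align*}
(\operatorname{Op}(a)u,u)_{L^2}=\int_{\mathbb R}x^2(1-x^2)e^{-x^2}\,dx=\frac{\sqrt\pi}{2}-\frac{3\sqrt\pi}{4}=-\frac{\sqrt\pi}{4}<0.
\end{align*}
The quadratic form is negative on this $u$ even though $a\ge 0$ everywhere, because placing $x^2$ to the left of $D^2$ is not a symmetric ordering.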
Chapters 2–7 have developed pseudodifferential operators as a symbolic algebra with composition, adjoints, and principal symbols controlling the leading-order behavior, with lower-order terms tracked by asymptotic expansions. The remaining analytic step is to translate symbol order into concrete Sobolev mapping properties, that is, bounds between Sobolev spaces whose indices differ by the symbol order. The next chapter makes that translation and quantifies the precise gain in regularity from lower symbol order.
# 8. Sobolev Mapping Properties
This chapter turns the symbolic calculus into estimates on Sobolev spaces. Earlier chapters showed that pseudodifferential operators behave algebraically like functions of $(x,\xi)$, with composition and parametrices controlled by symbol order. The analytic question is now whether that symbolic order is exactly the loss of Sobolev regularity: an operator of order $m$ should map $H^s$ into $H^{s-m}$, first globally on $\mathbb R^n$ and then locally on open sets.
## Operators of Order $m$ Between Sobolev Spaces
The model case is a Fourier multiplier. If $p(\theta)$ has size comparable to $\langle \theta\rangle^m$, then applying $p(D)$ should consume $m$ derivatives. The first task is to formulate this statement in a way that survives the passage from multipliers to variable-coefficient symbols.
[definition: Bessel Potential Sobolev Norm]
Let $s \in \mathbb R$. The Sobolev space $H^s(\mathbb R^n)$ consists of all tempered distributions $u \in \mathcal S'(\mathbb R^n)$ such that $\langle \xi\rangle^s \hat u(\xi) \in L^2(\mathbb R^n)$, with norm
\begin{align*}
\|u\|_{H^s} = \|\langle \xi\rangle^s \hat u\|_{L^2}.
\end{align*}
[/definition]
This definition is designed so that the operator $\langle D\rangle^s$ is an isometry from $H^s(\mathbb R^n)$ onto $L^2(\mathbb R^n)$. Therefore, to prove a Sobolev estimate for an operator $A$, it is natural to conjugate by Bessel potentials and reduce the statement to an $L^2$ boundedness problem.
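The reduction can be written out in one line. In the sketch below, $\Lambda^t=\langle D\rangle^t$ and $B=\Lambda^{s-m}A\Lambda^{-s}$ is notation introduced only for this computation; the assumption is that $B$ extends to a bounded operator on $L^2(\mathbb R^n)$. For $u\in\mathcal S(\mathbb R^n)$,
\begin{align*}
\|Au\|_{H^{s-m}}=\|\Lambda^{s-m}Au\|_{L^2}=\|B\Lambda^su\|_{L^2}\le\|B\|_{L^2\to L^2}\|\Lambda^su\|_{L^2}=\|B\|_{L^2\to L^2}\|u\|_{H^s}.
\end{align*}
Thus the estimate $A:H^s\to H^{s-m}$ follows as soon as the conjugated operator $B$ is bounded on $L^2$. When $A$ has order $m$, the composition calculus suggests that $B$ has order $0$, which is the case an order-zero $L^2$ boundedness theorem is designed to handle.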
[example: Bessel Potential Comparison]
Let $A=\langle D\rangle^m$, so by definition of the Fourier multiplier,
\begin{align*}
\widehat{Au}(\theta)=\langle\theta\rangle^m\hat u(\theta).
\end{align*}
For $u\in\mathcal S(\mathbb R^n)$ and $s\in\mathbb R$, the Bessel potential Sobolev norm gives
\begin{align*}
\|Au\|_{H^{s-m}}^2=\int_{\mathbb R^n}\langle\theta\rangle^{2(s-m)}|\widehat{Au}(\theta)|^2\,d\mathcal L^n(\theta).
\end{align*}
Substituting the multiplier formula,
\begin{align*}
\|Au\|_{H^{s-m}}^2=\int_{\mathbb R^n}\langle\theta\rangle^{2(s-m)}|\langle\theta\rangle^m\hat u(\theta)|^2\,d\mathcal L^n(\theta).
\end{align*}
Since $\langle\theta\rangle^m>0$, we have $|\langle\theta\rangle^m\hat u(\theta)|^2=\langle\theta\rangle^{2m}|\hat u(\theta)|^2$, hence
\begin{align*}
\|Au\|_{H^{s-m}}^2=\int_{\mathbb R^n}\langle\theta\rangle^{2(s-m)}\langle\theta\rangle^{2m}|\hat u(\theta)|^2\,d\mathcal L^n(\theta).
\end{align*}
The exponents add as $2(s-m)+2m=2s$, so
\begin{align*}
\|Au\|_{H^{s-m}}^2=\int_{\mathbb R^n}\langle\theta\rangle^{2s}|\hat u(\theta)|^2\,d\mathcal L^n(\theta).
\end{align*}
Therefore
\begin{align*}
\|Au\|_{H^{s-m}}^2=\|u\|_{H^s}^2.
\end{align*}
Taking square roots gives $\|\langle D\rangle^m u\|_{H^{s-m}}=\|u\|_{H^s}$. The inverse multiplier is $\langle D\rangle^{-m}$, so $\langle D\rangle^m:H^s(\mathbb R^n)\to H^{s-m}(\mathbb R^n)$ is an isometric isomorphism. This fixes the sign convention: positive order lowers Sobolev regularity.
[/example]
The multiplier computation leaves the main problem for the variable-coefficient calculus: after $x$-dependent symbols are introduced, the Fourier transform no longer diagonalises the operator. The next theorem is needed to show that the order filtration still has the same Sobolev meaning as in the Bessel potential example.
[quotetheorem:7692]
[citeproof:7692]
This theorem is the basic analytic meaning of symbol order. Operators of negative order improve Sobolev regularity, operators of order zero preserve it, and differential operators fit the same scale because their polynomial symbols lie in the appropriate symbol classes.
The hypotheses on $a$ are not cosmetic. The estimates defining $S^m_{1,0}$ say not only that $a(x,\theta)$ grows like $\langle \theta\rangle^m$, but also that differentiating in $\theta$ improves decay while differentiating in $x$ does not worsen the order; this is exactly the balance used by composition and by Calderon-Vaillancourt after Bessel conjugation. If the symbol is merely pointwise bounded, the associated oscillatory integral can move $L^2$ mass between small spatial and frequency regions in an uncontrolled way, so boundedness on Sobolev spaces may fail. The theorem also does not assert compactness, elliptic invertibility, or any boundary regularity: it is a continuity statement on the full space $\mathbb R^n$, and additional hypotheses are needed for parametrices, Fredholm properties, or estimates on domains with boundary.
[example: First-Order Differential Operators]
Let
\begin{align*}
P=\sum_{j=1}^n a_j(x)\partial_{x_j}+b(x),
\end{align*}
where $a_j,b\in C_b^\infty(\mathbb R^n)$. For $u\in\mathcal S(\mathbb R^n)$, the Fourier transform identity $\widehat{\partial_{x_j}u}(\xi)=i\xi_j\hat u(\xi)$ shows that the Kohn-Nirenberg symbol of $P$ is
\begin{align*}
p(x,\xi)=i\sum_{j=1}^n a_j(x)\xi_j+b(x).
\end{align*}
We verify that $p\in S^1_{1,0}(\mathbb R^n\times\mathbb R^n)$. If $\beta=0$, then
\begin{align*}
\partial_x^\alpha p(x,\xi)=i\sum_{j=1}^n (\partial_x^\alpha a_j)(x)\xi_j+(\partial_x^\alpha b)(x).
\end{align*}
Since all derivatives of $a_j$ and $b$ are bounded, there is a constant $C_\alpha$ such that
\begin{align*}
|\partial_x^\alpha p(x,\xi)|\le C_\alpha(1+|\xi|)\le 2C_\alpha\langle \xi\rangle.
\end{align*}
If $\beta=e_k$, then
\begin{align*}
\partial_x^\alpha\partial_\xi^{e_k}p(x,\xi)=i(\partial_x^\alpha a_k)(x),
\end{align*}
so for some $C_{\alpha,k}$,
\begin{align*}
|\partial_x^\alpha\partial_\xi^{e_k}p(x,\xi)|\le C_{\alpha,k}=C_{\alpha,k}\langle\xi\rangle^{1-1}.
\end{align*}
If $|\beta|\ge 2$, then every term in $p$ is at most linear in $\xi$, hence
\begin{align*}
\partial_x^\alpha\partial_\xi^\beta p(x,\xi)=0.
\end{align*}
Thus each symbol seminorm required for order $1$ is finite, so $p\in S^1_{1,0}$.
By the *Sobolev [Boundedness Theorem](/theorems/181) for Pseudodifferential Operators*, $\operatorname{Op}(p)$ extends continuously as
\begin{align*}
\operatorname{Op}(p):H^s(\mathbb R^n)\to H^{s-1}(\mathbb R^n)
\end{align*}
for every $s\in\mathbb R$. Since $\operatorname{Op}(p)=P$ on $\mathcal S(\mathbb R^n)$, this gives
\begin{align*}
P:H^s(\mathbb R^n)\to H^{s-1}(\mathbb R^n).
\end{align*}
The estimate says that one differentiation costs exactly one Sobolev order, while multiplication by smooth bounded coefficients stays inside the same pseudodifferential order-one framework.
[/example]
## Calderon-Vaillancourt and Order-Zero Boundedness
The preceding theorem rests on one analytic estimate that is not purely symbolic. The question is why an order-zero operator, whose symbol is uniformly bounded together with enough derivatives, should be bounded on $L^2$ even though it is not a multiplier.
[quotetheorem:7693]
[proofunderconstruction:7693]
The theorem says that order zero is the analytic threshold for $L^2$ boundedness, but it is the symbol seminorm control, not pointwise boundedness alone, that makes the threshold valid. A typical failure mode is a bounded amplitude with rapid uncontrolled oscillation in $x$ or $\theta$: the kernel may then concentrate along thin regions of phase space, and Schur or almost-orthogonality estimates no longer give a uniform $L^2$ bound. Thus Calderon-Vaillancourt is not a theorem about arbitrary bounded measurable symbols; it is a theorem about symbols with enough controlled derivatives. Its role in the calculus is precisely to supply the order-zero endpoint needed after Bessel potential conjugation, so that higher-order Sobolev estimates reduce to this finite-seminorm $L^2$ estimate.
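The reduction to order zero can be sketched in one line. The notation $\langle D\rangle^{t}$ for the Bessel potential with multiplier $\langle\theta\rangle^{t}$, and the normalisation $\|u\|_{H^t}=\|\langle D\rangle^{t}u\|_{L^2}$, are assumptions made here for illustration; the course may fix them with different constants. For $a\in S^m_{1,0}$ and $s\in\mathbb R$, set $B=\langle D\rangle^{s-m}\operatorname{Op}(a)\langle D\rangle^{-s}$, which has order $(s-m)+m-s=0$ by composition. Then
\begin{align*}
\|\operatorname{Op}(a)u\|_{H^{s-m}}=\|\langle D\rangle^{s-m}\operatorname{Op}(a)u\|_{L^2}=\|B\langle D\rangle^{s}u\|_{L^2}\le \|B\|_{L^2\to L^2}\|\langle D\rangle^{s}u\|_{L^2}=\|B\|_{L^2\to L^2}\|u\|_{H^s}.
\end{align*}
Only the inequality is analytic, and it is the order-zero $L^2$ bound applied to $B$; every other step is a definition.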
[remark: Finite Seminorm Dependence]
The exact value of $N(n)$ is not important for the calculus developed here. What matters is that the $L^2$ norm of $\operatorname{Op}(a)$ is controlled by a finite list of symbol seminorms, because this finite control is stable under composition with fixed Bessel potential operators.
[/remark]
The simplest order-zero operators are multipliers and cutoffs. They show that Calderon-Vaillancourt contains both Fourier multiplier estimates and multiplication estimates as special cases.
[example: Order-Zero Cutoffs]
Let $\chi\in C_b^\infty(\mathbb R^n)$ and define $M_\chi u=\chi u$. For the Kohn-Nirenberg symbol $a(x,\theta)=\chi(x)$, the operator formula gives
\begin{align*}
\operatorname{Op}(a)u(x)=(2\pi)^{-n}\int_{\mathbb R^n}e^{ix\cdot\theta}\chi(x)\hat u(\theta)\,d\theta.
\end{align*}
Since $\chi(x)$ is independent of $\theta$, it factors out of the integral, and Fourier inversion gives
\begin{align*}
\operatorname{Op}(a)u(x)=\chi(x)(2\pi)^{-n}\int_{\mathbb R^n}e^{ix\cdot\theta}\hat u(\theta)\,d\theta=\chi(x)u(x)=M_\chi u(x).
\end{align*}
We check that $a\in S^0_{1,0}(\mathbb R^n\times\mathbb R^n)$. If $\beta=0$, then
\begin{align*}
\partial_x^\alpha\partial_\theta^\beta a(x,\theta)=\partial_x^\alpha\chi(x).
\end{align*}
Because $\chi\in C_b^\infty(\mathbb R^n)$, the constant $C_\alpha=\sup_{x\in\mathbb R^n}|\partial_x^\alpha\chi(x)|$ is finite, so
\begin{align*}
|\partial_x^\alpha a(x,\theta)|\le C_\alpha=C_\alpha\langle\theta\rangle^0.
\end{align*}
If $|\beta|\ge 1$, then $a$ has no $\theta$-dependence, hence
\begin{align*}
\partial_x^\alpha\partial_\theta^\beta a(x,\theta)=0.
\end{align*}
Thus for every $\alpha,\beta$,
\begin{align*}
|\partial_x^\alpha\partial_\theta^\beta a(x,\theta)|\le C_{\alpha,\beta}\langle\theta\rangle^{-|\beta|},
\end{align*}
which is exactly the order-zero symbol estimate.
For $s=0$, the boundedness can also be seen directly:
\begin{align*}
\|\chi u\|_{L^2}^2=\int_{\mathbb R^n}|\chi(x)u(x)|^2\,d\mathcal L^n(x).
\end{align*}
Since $|\chi(x)u(x)|^2=|\chi(x)|^2|u(x)|^2$ and $|\chi(x)|\le \|\chi\|_{L^\infty}$,
\begin{align*}
\|\chi u\|_{L^2}^2\le \|\chi\|_{L^\infty}^2\int_{\mathbb R^n}|u(x)|^2\,d\mathcal L^n(x)=\|\chi\|_{L^\infty}^2\|u\|_{L^2}^2.
\end{align*}
Taking square roots gives $\|\chi u\|_{L^2}\le \|\chi\|_{L^\infty}\|u\|_{L^2}$. Since $M_\chi=\operatorname{Op}(a)$ with $a\in S^0_{1,0}$, the *Sobolev Boundedness Theorem for Pseudodifferential Operators* gives
\begin{align*}
M_\chi:H^s(\mathbb R^n)\to H^s(\mathbb R^n)
\end{align*}
for every $s\in\mathbb R$. Smooth bounded cutoffs therefore preserve Sobolev order: they localize a function without costing derivatives.
[/example]
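The Fourier multiplier case admits an equally short check. The following is a sketch under the assumption that $\hat u(\theta)=\int_{\mathbb R^n}e^{-ix\cdot\theta}u(x)\,dx$, so that Plancherel reads $\|\hat u\|_{L^2}^2=(2\pi)^n\|u\|_{L^2}^2$; the symbol name $\mu$ is introduced here only for illustration. If $a(x,\theta)=\mu(\theta)$ with $\mu\in S^0_{1,0}$ independent of $x$, then $\widehat{\operatorname{Op}(a)u}=\mu\hat u$ and
\begin{align*}
\|\operatorname{Op}(a)u\|_{L^2}^2=(2\pi)^{-n}\int_{\mathbb R^n}|\mu(\theta)|^2|\hat u(\theta)|^2\,d\theta\le \|\mu\|_{L^\infty}^2(2\pi)^{-n}\|\hat u\|_{L^2}^2=\|\mu\|_{L^\infty}^2\|u\|_{L^2}^2.
\end{align*}
In both model cases only the supremum of the symbol enters. Derivative seminorms become necessary once the symbol depends on $x$ and $\theta$ together.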
## Local Sobolev Estimates on Open Sets
The global theorem is not yet the form needed for differential equations on an open set $U\subset \mathbb R^n$. A local operator may be defined only on $U$, and a local estimate should compare $u$ near one compact set with $Au$ near another, allowing cutoffs to separate the region being estimated from boundary and support issues.
[definition: Local Sobolev Space]
Let $U\subset \mathbb R^n$ be open and let $s\in\mathbb R$. A distribution $u\in\mathcal D'(U)$ belongs to $H^s_{\mathrm{loc}}(U)$ if $\chi u$, extended by zero to $\mathbb R^n$, belongs to $H^s(\mathbb R^n)$ for every $\chi\in C_c^\infty(U)$.
[/definition]
The definition uses cutoffs because Sobolev regularity is measured after localisation. To apply a pseudodifferential operator locally, however, the operator must not send information from far away into the compact set being studied.
[definition: Properly Supported Operator]
Let $U\subset \mathbb R^n$ be open. A continuous linear map $A:C_c^\infty(U)\to C^\infty(U)$ with distribution kernel $K_A\in\mathcal D'(U\times U)$ is properly supported if both coordinate projections from $\operatorname{supp}K_A\subset U\times U$ to $U$ are proper maps.
[/definition]
Proper support is the support condition that makes composition with cutoffs manageable. Without such a condition, locality can fail even for a smoothing-looking kernel: for instance, an operator of the form
\begin{align*}
Tu(x)=\rho(x)\int_U \sigma(y)u(y)\,d\mathcal L^n(y)
\end{align*}
can make the value of $Tu$ on $\operatorname{supp}\rho$ depend on $u$ arbitrarily far away whenever $\sigma$ is not confined to a compact region controlled by $\rho$. Then no estimate of $\chi Tu$ in terms of a fixed slightly larger cutoff $\psi u$ can hold for all compactly supported inputs. With proper support in place, the local problem becomes precise: after cutting the output by $\chi$, can the result be estimated using only a slightly larger cutoff of the input, with the same loss of $m$ derivatives as in the global theorem?
[quotetheorem:7694]
[citeproof:7694]
The theorem is the form used in elliptic regularity arguments. It permits estimates to be written with nested cutoffs: control the Sobolev norm of $Au$ in a small region by the Sobolev norm of $u$ in a slightly larger region, plus smoothing terms when the cutoffs are separated.
Proper support is essential for this formulation because the cutoff $\psi$ must capture all input points that can influence $\operatorname{supp}\chi$. If the kernel has a non-proper projection, a compact output region may receive contributions from a non-compact input region, and no compactly supported $\psi$ can give the displayed estimate. The theorem is also local rather than global: it does not control behaviour at infinity, does not impose boundary conditions, and does not by itself give estimates up to $\partial U$. Boundary regularity requires additional structure, such as extension operators, boundary pseudodifferential calculi, or explicit boundary conditions.
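For the rank-one operator $T$ displayed before the theorem, the failure of proper support can be read off from the kernel. The following sketch assumes that $\rho,\sigma\in C^\infty(U)$ are not identically zero and takes supports relative to $U$. The kernel is $K_T(x,y)=\rho(x)\sigma(y)$, so
\begin{align*}
\operatorname{supp}K_T=\operatorname{supp}\rho\times\operatorname{supp}\sigma,
\end{align*}
and for a compact set $K\subset U$ the part of the support lying over $K$ in the output variable is
\begin{align*}
\{(x,y)\in\operatorname{supp}K_T: x\in K\}=(K\cap\operatorname{supp}\rho)\times\operatorname{supp}\sigma.
\end{align*}
Whenever $K$ meets $\operatorname{supp}\rho$, this set is compact in $U\times U$ exactly when $\operatorname{supp}\sigma$ is compact in $U$. Exchanging the roles of the two variables, $T$ is properly supported exactly when both $\operatorname{supp}\rho$ and $\operatorname{supp}\sigma$ are compact in $U$.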
[example: Local Estimate for a Differential Operator]
Let $P=\sum_{j=1}^n a_j(x)\partial_{x_j}+b(x)$ on an open set $U$, with $a_j,b\in C^\infty(U)$. Choose $\chi,\psi\in C_c^\infty(U)$ such that $\psi=1$ on an open neighbourhood of $\operatorname{supp}\chi$, and let $u$ be a distribution with $\psi u\in H^s(\mathbb R^n)$ after extension by zero.
First compute how the cutoff interacts with $P$. By the definition of $P$,
\begin{align*}
P(\psi u)=\sum_{j=1}^n a_j(x)\partial_{x_j}(\psi u)+b(x)\psi u,
\end{align*}
and the product rule expands each derivative:
\begin{align*}
P(\psi u)=\sum_{j=1}^n a_j(x)\big((\partial_{x_j}\psi)u+\psi\partial_{x_j}u\big)+b(x)\psi u.
\end{align*}
Grouping the terms gives
\begin{align*}
P(\psi u)=\psi Pu+\sum_{j=1}^n a_j(x)(\partial_{x_j}\psi)u.
\end{align*}
Multiplying by $\chi$,
\begin{align*}
\chi P(\psi u)=\chi\psi Pu+\sum_{j=1}^n \chi a_j(x)(\partial_{x_j}\psi)u.
\end{align*}
Since $\psi=1$ on a neighbourhood of $\operatorname{supp}\chi$, we have $\chi\psi=\chi$ and $\chi\,\partial_{x_j}\psi=0$ for every $j$. Therefore
\begin{align*}
\chi P(\psi u)=\chi Pu.
\end{align*}
Choose $\eta\in C_c^\infty(U)$ with $\eta=1$ on a neighbourhood of $\operatorname{supp}\chi$, and extend $\eta a_j$ and $\eta b$ by zero to smooth bounded functions on $\mathbb R^n$. Define the global first-order operator
\begin{align*}
\widetilde P=\sum_{j=1}^n (\eta a_j)(x)\partial_{x_j}+(\eta b)(x).
\end{align*}
On $\operatorname{supp}\chi$, $\eta=1$, so
\begin{align*}
\chi P(\psi u)=\chi\widetilde P(\psi u).
\end{align*}
The coefficients of $\widetilde P$ belong to $C_b^\infty(\mathbb R^n)$, so the first-order differential-operator case of the *Sobolev Boundedness Theorem for Pseudodifferential Operators* gives
\begin{align*}
\|\widetilde P(\psi u)\|_{H^{s-1}(\mathbb R^n)}\le C_1\|\psi u\|_{H^s(\mathbb R^n)}.
\end{align*}
Multiplication by the smooth bounded cutoff $\chi$ is order zero, hence bounded on $H^{s-1}(\mathbb R^n)$, so
\begin{align*}
\|\chi\widetilde P(\psi u)\|_{H^{s-1}(\mathbb R^n)}\le C_2\|\widetilde P(\psi u)\|_{H^{s-1}(\mathbb R^n)}.
\end{align*}
Combining the two inequalities yields
\begin{align*}
\|\chi Pu\|_{H^{s-1}(\mathbb R^n)}\le C\|\psi u\|_{H^s(\mathbb R^n)}.
\end{align*}
The only possible commutator contribution is $\sum_j a_j(\partial_{x_j}\psi)u$, and it disappears after multiplication by $\chi$ because the derivative of $\psi$ is supported away from $\operatorname{supp}\chi$. Thus the local estimate loses exactly one Sobolev order, matching the order of the differential operator.
[/example]
## Consequences for the Calculus
The Sobolev mapping theorem converts symbolic statements into regularity statements. If $A\in\Psi^m$ and $B\in\Psi^{m'}$, then the composition $AB\in\Psi^{m+m'}$ maps $H^s$ to $H^{s-m-m'}$, matching the algebraic addition of orders. If $R\in\Psi^{-\infty}$, then $R$ maps every $H^s$ continuously into every $H^t$ locally, because its kernel is smooth after localisation.
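The order count can be made explicit by chaining two applications of the mapping theorem. The constants $C_A$ and $C_B$ are names introduced for this sketch, and the statement is meant for globally defined operators on $\mathbb R^n$ with symbols in the uniform classes:
\begin{align*}
\|ABu\|_{H^{s-m-m'}}\le C_A\|Bu\|_{H^{s-m'}}\le C_AC_B\|u\|_{H^s}.
\end{align*}
The first inequality uses $A:H^{s-m'}\to H^{s-m'-m}$ and the second uses $B:H^s\to H^{s-m'}$, so this particular bound follows without computing the symbol of $AB$.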
[remark: Smoothing Operators Improve All Sobolev Orders]
A properly supported smoothing operator $R\in\Psi^{-\infty}(U)$ maps $H^s_{\mathrm{loc}}(U)$ continuously into $H^t_{\mathrm{loc}}(U)$ for every $s,t\in\mathbb R$. Locally, this follows by cutting off the kernel to a compact subset of $U\times U$ and differentiating the smooth kernel under the integral pairing. In the pseudodifferential calculus, smoothing remainders are therefore harmless for Sobolev regularity, although they remain important for support and compactness questions.
[/remark]
This completes the analytic bridge needed for parametrices. Once an elliptic operator has a parametrix modulo smoothing remainders, the mapping properties proved here allow the symbolic inverse to be read as a gain of derivatives in Sobolev spaces. These estimates also connect the calculus with the energy estimates used for PDEs: order-zero boundedness plays the role of a stable lower-order perturbation, while negative-order remainders behave like compact or regularising errors after localisation. The next use of these estimates is elliptic regularity: if $Au$ is smoother and $A$ is elliptic, then $u$ is smoother microlocally in the region where the parametrix exists.
The Sobolev mapping estimates quantify how the order of a symbol controls the action of its operator on function spaces. They are the analytic input for elliptic regularity: once an elliptic operator is shown to have a parametrix, regularity of $Au$ propagates to regularity of $u$. The next chapter constructs parametrices by symbolic inversion and combines them with the Sobolev calculus to prove elliptic regularity theorems.
# 9. Ellipticity and Parametrices
This chapter is the point in the course where the symbolic calculus becomes an analytic [regularity theorem](/theorems/2750). The prerequisites are the symbol classes $S^m(U \times \mathbb R^n)$, the composition formula for pseudodifferential operators, asymptotic summation, and Sobolev mapping estimates. The goal is to identify the high-frequency lower bound that permits symbolic inversion, construct parametrices as inverses modulo smoothing errors, and use them to prove local Sobolev regularity for elliptic equations. This is the bridge from formal symbol algebra to the standard elliptic PDE principle that regular right-hand sides force regular solutions.
## Elliptic Symbols and Lower Bounds
The central question is when a symbol can be inverted without leaving the symbol calculus. A symbol may vanish for bounded frequencies without damaging high-frequency regularity, so ellipticity is a condition at fibre infinity rather than at every point of $U \times \mathbb R^n$.
[definition: Elliptic Scalar Symbol]
Let $U \subset \mathbb R^n$ be open, let $m \in \mathbb R$, and let $a \in S^m(U \times \mathbb R^n;\mathbb C)$ be a scalar symbol, so $a:U \times \mathbb R^n \to \mathbb C$. The symbol $a$ is elliptic of order $m$ on $U$ if for every compact $K \subset U$ there exist constants $C_K > 0$ and $R_K > 0$ such that
\begin{align*}
|a(x,\xi)| \ge C_K \langle \xi \rangle^m
\end{align*}
for all $x \in K$ and all $|\xi| \ge R_K$.
[/definition]
The compact-set formulation is local in $x$, matching the local nature of the pseudodifferential calculus. It permits coefficients to vary on $U$ while requiring uniform invertibility on each compact coordinate patch.
[example: Constant Coefficient Elliptic Polynomial]
Let $p(\xi)$ be a polynomial of degree $m$ and write
\begin{align*}
p(\xi)=p_m(\xi)+r(\xi),
\end{align*}
where $p_m$ is homogeneous of degree $m$ and $r$ has degree at most $m-1$. Assume that $p_m(\xi)\neq 0$ for every $\xi\neq 0$. Since $p$ is independent of $x$, every $x$-derivative vanishes, and each $\xi$-derivative $\partial_\xi^\beta p$ is a polynomial of degree at most $m-|\beta|$ when $|\beta|\le m$ and is zero when $|\beta|>m$. Hence $|\partial_\xi^\beta p(\xi)|\le C_\beta \langle \xi\rangle^{m-|\beta|}$, so $p\in S^m(\mathbb R^n\times \mathbb R^n)$.
Because $p_m$ is continuous and nonzero on the compact unit sphere $S^{n-1}$, the number
\begin{align*}
c_0=\min_{\omega\in S^{n-1}} |p_m(\omega)|
\end{align*}
is positive. For $\xi\neq 0$, homogeneity gives
\begin{align*}
|p_m(\xi)|=|\xi|^m |p_m(\xi/|\xi|)|\ge c_0|\xi|^m.
\end{align*}
Since $r$ has degree at most $m-1$, there is $C_1>0$ such that, for $|\xi|\ge 1$,
\begin{align*}
|r(\xi)|\le C_1|\xi|^{m-1}.
\end{align*}
Choose $R\ge 1$ with $C_1|\xi|^{m-1}\le \frac{c_0}{2}|\xi|^m$ whenever $|\xi|\ge R$. Then, for $|\xi|\ge R$,
\begin{align*}
|p(\xi)|\ge |p_m(\xi)|-|r(\xi)|\ge \frac{c_0}{2}|\xi|^m.
\end{align*}
For $|\xi|\ge 1$, $\langle \xi\rangle\le \sqrt 2|\xi|$, so
\begin{align*}
|p(\xi)|\ge \frac{c_0}{2}2^{-m/2}\langle \xi\rangle^m.
\end{align*}
Thus the elliptic lower bound holds uniformly on every compact $K\subset \mathbb R^n$, since the symbol has no $x$-dependence. The leading homogeneous part controls all large frequencies, while the lower-degree terms are eventually too small to destroy the bound.
[/example]
This example reduces ellipticity to a familiar leading-term lower bound, but it also shows what can go wrong when the leading part vanishes in a direction. For instance, $p(\xi)=\xi_1^2$ is a symbol of order $2$, but it is not elliptic on $\mathbb R^n$ when $n\ge 2$ because $p(0,\xi_2,\dots,\xi_n)=0$ along an unbounded set of frequencies. The operator $-\partial_{x_1}^2$ cannot recover derivatives in the transverse variables, so no order $-2$ reciprocal can control all Sobolev directions.
[example: Directional Failure of Ellipticity]
Choose nonzero $\phi,\psi\in C_c^\infty(\mathbb R)$ and set $u_k(x_1,x_2)=\phi(x_1)e^{ikx_2}\psi(x_2)$. The symbol of $P=-\partial_{x_1}^2$ is $p(\xi)=\xi_1^2$, and along the unbounded frequency ray $\xi=(0,k)$ one has
\begin{align*}
|p(0,k)|=0
\end{align*}
while
\begin{align*}
\langle(0,k)\rangle^2=(1+k^2)>0.
\end{align*}
Thus no positive constant $C$ can make $|p(\xi)|\ge C\langle\xi\rangle^2$ for all large $|\xi|$.
The same failure is visible on functions. Since $P$ differentiates only in $x_1$,
\begin{align*}
Pu_k(x_1,x_2)=-\phi''(x_1)e^{ikx_2}\psi(x_2).
\end{align*}
Taking absolute values removes the phase $e^{ikx_2}$, so
\begin{align*}
\|Pu_k\|_{L^2(\mathbb R^2)}=\|\phi''\|_{L^2(\mathbb R)}\|\psi\|_{L^2(\mathbb R)}.
\end{align*}
This quantity is independent of $k$. On the other hand,
\begin{align*}
\partial_{x_2}^2u_k=\phi(x_1)e^{ikx_2}\left(-k^2\psi(x_2)+2ik\psi'(x_2)+\psi''(x_2)\right).
\end{align*}
By the [reverse triangle inequality](/theorems/2300) in $L^2(\mathbb R)$,
\begin{align*}
\|\partial_{x_2}^2u_k\|_{L^2(\mathbb R^2)}\ge \|\phi\|_{L^2}\left(k^2\|\psi\|_{L^2}-2k\|\psi'\|_{L^2}-\|\psi''\|_{L^2}\right).
\end{align*}
The right-hand side grows like $k^2$ because $\psi\neq 0$. Hence $\|u_k\|_{H^2}$ grows like $k^2$, while $\|Pu_k\|_{L^2}$ stays bounded. The operator detects $x_1$-oscillation but misses high frequency in the transverse $x_2$ direction, which is exactly why ellipticity requires a lower bound in every large-frequency direction.
[/example]
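The computation in the preceding example can be turned into a quantitative statement. As a sketch, assume the norm $\|u\|_{H^2}^2=\int\langle\xi\rangle^4|\hat u(\xi)|^2\,d\xi$ up to a normalising constant, so that $\|\partial_{x_2}^2u\|_{L^2}\le C\|u\|_{H^2}$ for some $C>0$. Since $\|u_k\|_{L^2}=\|\phi\|_{L^2}\|\psi\|_{L^2}$ is also independent of $k$,
\begin{align*}
\frac{\|u_k\|_{H^2}}{\|Pu_k\|_{L^2}+\|u_k\|_{L^2}}\ge \frac{C^{-1}\|\phi\|_{L^2}\left(k^2\|\psi\|_{L^2}-2k\|\psi'\|_{L^2}-\|\psi''\|_{L^2}\right)}{\|\phi''\|_{L^2}\|\psi\|_{L^2}+\|\phi\|_{L^2}\|\psi\|_{L^2}}\to\infty
\end{align*}
as $k\to\infty$. Hence no estimate of the form $\|u\|_{H^2}\le C'(\|Pu\|_{L^2}+\|u\|_{L^2})$ can hold for all $u\in C_c^\infty(\mathbb R^2)$, even with a lower-order term on the right.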
The next question is whether the reciprocal of an elliptic symbol has the derivative bounds of a symbol of order $-m$. This is not automatic from pointwise nonvanishing: the symbolic calculus also needs estimates for all $x$- and $\xi$-derivatives of the reciprocal.
[quotetheorem:7695]
[citeproof:7695]
This theorem explains why ellipticity is the exact hypothesis needed for inversion at the level of principal symbols. The lower bound cannot be weakened to mere nonvanishing on each fixed compact frequency set: $a(\xi)=e^{-|\xi|^2}$ never vanishes, but its reciprocal grows faster than any polynomial and is not a symbol of finite order. The cutoff is also essential because ellipticity allows low-frequency zeros; for example $|\xi|^2$ is elliptic of order $2$ but has no reciprocal at $\xi=0$. The theorem therefore says only that high-frequency inversion is symbolic, and this is precisely what the parametrix construction will use.
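Both phenomena can be written out explicitly. The cutoff $\chi$ is a hypothetical choice made for this sketch: $\chi\in C_c^\infty(\mathbb R^n)$ with $\chi=1$ near $\xi=0$. For $a(\xi)=|\xi|^2$, define
\begin{align*}
q(\xi)=(1-\chi(\xi))|\xi|^{-2},\qquad q(\xi)a(\xi)=1-\chi(\xi).
\end{align*}
Outside $\operatorname{supp}\chi$ the function $q$ coincides with $|\xi|^{-2}$, which is smooth and homogeneous of degree $-2$ away from the origin, so $q\in S^{-2}$; the error $\chi$ is compactly supported in $\xi$ and hence lies in $S^{-\infty}$. By contrast, for the Gaussian $a(\xi)=e^{-|\xi|^2}$,
\begin{align*}
\frac{1}{a(\xi)}=e^{|\xi|^2}\ge \frac{|\xi|^{2N}}{N!}\qquad\text{for every } N\in\mathbb N,
\end{align*}
so no bound of the form $|1/a(\xi)|\le C\langle\xi\rangle^{M}$ can hold with finite $M$.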
[remark: Ellipticity Is a Principal Symbol Condition]
For a classical symbol $a \sim \sum_{j=0}^{\infty} a_{m-j}$, ellipticity is equivalent to invertibility of the leading homogeneous term $a_m(x,\xi)$ for $\xi \neq 0$, locally uniformly in $x$. The non-classical definition above avoids homogeneity and works directly in $S^m$.
[/remark]
After scalar operators, the same question arises for systems. A system is elliptic when its matrix symbol is invertible at high frequency with the expected growth of the inverse.
[definition: Elliptic Matrix Symbol]
Let $a \in S^m(U \times \mathbb R^n;\mathbb C^{N \times N})$. The matrix symbol $a$ is elliptic of order $m$ if for every compact $K \subset U$ there exist constants $C_K>0$ and $R_K>0$ such that $a(x,\xi)$ is invertible for $x \in K$, $|\xi| \ge R_K$, and
\begin{align*}
\|a(x,\xi)^{-1}\|_{\mathrm{op}} \le C_K \langle \xi \rangle^{-m}.
\end{align*}
[/definition]
Matrix ellipticity differs from nonvanishing of entries: it is invertibility of the whole principal matrix. This matters for coupled PDE systems, where cancellation between components can destroy scalar-looking lower bounds.
[example: Elliptic First Order System]
Let $A(\xi)=\sum_{j=1}^n A_j\xi_j$, where the $A_j$ are fixed complex $N\times N$ matrices. Since there is no $x$-dependence, every $x$-derivative vanishes. For $|\beta|=0$,
\begin{align*}
\|A(\xi)\|_{\mathrm{op}}\le \sum_{j=1}^n \|A_j\|_{\mathrm{op}}|\xi_j|\le \left(\sum_{j=1}^n \|A_j\|_{\mathrm{op}}\right)|\xi|\le C\langle \xi\rangle .
\end{align*}
If $|\beta|=1$, then $\partial_\xi^\beta A(\xi)$ is one of the constant matrices $A_j$, so $\|\partial_\xi^\beta A(\xi)\|_{\mathrm{op}}\le C_\beta\langle\xi\rangle^{0}$. If $|\beta|\ge 2$, then $\partial_\xi^\beta A(\xi)=0$. Thus $A\in S^1(\mathbb R^n\times \mathbb R^n;\mathbb C^{N\times N})$.
Assume now that $A(\xi)$ is invertible for every $\xi\neq 0$. For $\xi\neq 0$, write $\omega=\xi/|\xi|\in S^{n-1}$. By linearity in $\xi$,
\begin{align*}
A(\xi)=\sum_{j=1}^n A_j|\xi|\omega_j=|\xi|\sum_{j=1}^n A_j\omega_j=|\xi|A(\omega).
\end{align*}
Taking inverses of both sides gives
\begin{align*}
A(\xi)^{-1}=\left(|\xi|A(\omega)\right)^{-1}=|\xi|^{-1}A(\omega)^{-1}.
\end{align*}
The map $\omega\mapsto A(\omega)^{-1}$ is continuous on the compact sphere $S^{n-1}$, because $A(\omega)$ is invertible there and matrix inversion is continuous on invertible matrices. Hence
\begin{align*}
M=\max_{\omega\in S^{n-1}}\|A(\omega)^{-1}\|_{\mathrm{op}}<\infty .
\end{align*}
Therefore, for $\xi\neq 0$,
\begin{align*}
\|A(\xi)^{-1}\|_{\mathrm{op}}\le M|\xi|^{-1}.
\end{align*}
For $|\xi|\ge 1$, $\langle\xi\rangle\le \sqrt 2|\xi|$, so $|\xi|^{-1}\le \sqrt 2\langle\xi\rangle^{-1}$. Thus
\begin{align*}
\|A(\xi)^{-1}\|_{\mathrm{op}}\le \sqrt 2 M\langle\xi\rangle^{-1}.
\end{align*}
This is the elliptic matrix estimate of order $1$, which holds uniformly on every compact set in $x$ because the symbol is independent of $x$. For symbols of this linear form, ellipticity is exactly invertibility of the whole homogeneous matrix $A(\omega)$ in every frequency direction $\omega\in S^{n-1}$.
[/example]
This example is the matrix analogue of a scalar homogeneous elliptic polynomial: the whole leading matrix must be invertible on the sphere. The next example shows why it is not enough for each component to look first order when viewed separately; the determinant can still vanish along fibre directions that escape to infinity.
[example: A Near-Miss System]
In two variables, consider the diagonal matrix symbol
\begin{align*}
A(\xi)=\operatorname{diag}(\xi_1,\xi_2).
\end{align*}
The entries $\xi_1$ and $\xi_2$ are homogeneous polynomials of degree $1$, so each scalar entry satisfies the symbol bounds of order $1$. However, matrix ellipticity requires invertibility of the whole matrix for every sufficiently large frequency direction, not merely first-order growth of its entries.
For $\xi=(t,0)$ with $t\neq 0$,
\begin{align*}
A(t,0)=\operatorname{diag}(t,0).
\end{align*}
This matrix is singular because its second diagonal entry is $0$. Equivalently,
\begin{align*}
\det A(\xi)=\xi_1\xi_2,
\end{align*}
so
\begin{align*}
\det A(t,0)=t\cdot 0=0.
\end{align*}
The frequencies $(t,0)$ satisfy $|(t,0)|=|t|$, which becomes arbitrarily large as $|t|\to\infty$. Hence no radius $R$ can make $A(\xi)$ invertible for all $|\xi|\ge R$.
The same obstruction occurs on the other coordinate ray: for $\xi=(0,t)$,
\begin{align*}
A(0,t)=\operatorname{diag}(0,t)
\end{align*}
and
\begin{align*}
\det A(0,t)=0\cdot t=0.
\end{align*}
Thus the determinant vanishes along unbounded frequency directions, so the system cannot recover both components uniformly. This separates ellipticity of a matrix symbol from the weaker fact that individual entries have order $1$.
[/example]
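For contrast, here is a sketch of a system in which every entry vanishes along some coordinate ray and yet the matrix is elliptic. The symbol $B$ is introduced here only for illustration; up to a constant factor it is the symbol of a Cauchy-Riemann type system in two variables.
\begin{align*}
B(\xi)=\begin{pmatrix}\xi_1&-\xi_2\\ \xi_2&\xi_1\end{pmatrix},\qquad \det B(\xi)=\xi_1^2+\xi_2^2=|\xi|^2.
\end{align*}
For $\xi\neq 0$,
\begin{align*}
B(\xi)^{\mathsf T}B(\xi)=|\xi|^2I,\qquad B(\xi)^{-1}=|\xi|^{-2}\begin{pmatrix}\xi_1&\xi_2\\ -\xi_2&\xi_1\end{pmatrix},\qquad \|B(\xi)^{-1}\|_{\mathrm{op}}=|\xi|^{-1}.
\end{align*}
Since $|\xi|^{-1}\le\sqrt 2\langle\xi\rangle^{-1}$ for $|\xi|\ge 1$, the matrix ellipticity estimate of order $1$ holds, although no single entry is an elliptic scalar symbol. Ellipticity of a system is a property of the determinant and the inverse, not of the individual entries in either direction.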
## Recursive Construction of the Parametrix
Once the leading inverse exists, the problem is that operator composition is not pointwise multiplication. The symbolic inverse must be corrected term by term so that the composition symbol becomes $1$ modulo $S^{-\infty}$.
[definition: Parametrix]
Let $P:C_c^\infty(U)\to \mathcal D'(U)$ be a properly supported pseudodifferential operator in $\Psi^m(U)$. A parametrix for $P$ is a properly supported operator $Q:C_c^\infty(U)\to \mathcal D'(U)$ in $\Psi^{-m}(U)$ such that
\begin{align*}
QP-I \in \Psi^{-\infty}(U), \qquad PQ-I \in \Psi^{-\infty}(U).
\end{align*}
[/definition]
A parametrix is an inverse for all microlocal purposes visible to finite Sobolev order. The problem now is to prove that the symbolic reciprocal can be corrected through every lower order so that the remainder becomes smoothing.
[quotetheorem:7696]
[citeproof:7696]
This theorem is the main payoff from the asymptotic expansion developed earlier. Each hypothesis has a distinct role: ellipticity supplies the high-frequency reciprocal, membership in $\Psi^m(U)$ supplies the composition expansion and asymptotic summation, and proper support keeps both products $QP$ and $PQ$ inside the local operator algebra. Exact inverses are too much to expect locally: even for elliptic differential operators, global kernels, boundary conditions, and low-frequency behaviour may prevent invertibility. Smoothing remainders are the natural endpoint because the symbolic calculus controls high-frequency singularities, not finite-dimensional or global obstructions. The next explanation records how the recursive cancellation looks in the classical expansion.
[explanation: Recursive Formula in the Classical Case]
Suppose $a \sim a_m+a_{m-1}+a_{m-2}+\cdots$ is classical and scalar. We seek $q \sim q_{-m}+q_{-m-1}+q_{-m-2}+\cdots$ with $q\# a \sim 1$. The leading equation is
\begin{align*}
q_{-m}a_m=1,
\end{align*}
so $q_{-m}=a_m^{-1}$. For $j\ge 1$, the term of degree $-j$ in $q\# a-1$ is $q_{-m-j}a_m$ plus a finite expression involving $q_{-m},\dots,q_{-m-j+1}$, the components of $a$, and derivatives from the composition formula. Setting this term to zero and multiplying by $a_m^{-1}$ determines $q_{-m-j}$ recursively.
[/explanation]
This recursion is often more important than its closed form. It says that the parametrix is local in the full symbol of $P$, and that changing $P$ by a lower-order operator changes $Q$ first at the corresponding lower symbolic order.
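Under one common convention the recursion can be written out. The sketch assumes the composition expansion $q\# a\sim\sum_\alpha \frac{1}{\alpha!}\partial_\xi^\alpha q\,D_x^\alpha a$ with $D_x=-i\partial_x$, which matches Kohn-Nirenberg quantisation; if the composition formula of this course uses a different ordering or sign convention, the formulas change accordingly. Collecting the terms of degree $-j$ for $j\ge 1$ gives
\begin{align*}
q_{-m-j}=-a_m^{-1}\sum_{\substack{k+l+|\alpha|=j\\ k<j}}\frac{1}{\alpha!}\,\partial_\xi^\alpha q_{-m-k}\,D_x^\alpha a_{m-l}.
\end{align*}
For $j=1$ the sum consists of the terms $(k,l,\alpha)=(0,1,0)$ and $(0,0,e_i)$. Using $\partial_{\xi_i}q_{-m}=-a_m^{-2}\partial_{\xi_i}a_m$,
\begin{align*}
q_{-m-1}=-a_m^{-2}a_{m-1}+a_m^{-3}\sum_{i=1}^n\partial_{\xi_i}a_m\,D_{x_i}a_m.
\end{align*}
As a consistency check, for the constant-coefficient symbol $a(\xi)=|\xi|^2+1$ with $a_2=|\xi|^2$, $a_1=0$, $a_0=1$, the recursion gives $q_{-2}=|\xi|^{-2}$, $q_{-3}=0$, and $q_{-4}=-|\xi|^{-4}$, which is the start of the expansion of $(1+|\xi|^2)^{-1}$ in powers of $|\xi|^{-2}$ for large $|\xi|$.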
[example: Parametrix for Minus Laplacian Plus One]
On $\mathbb R^n$, the operator $P=-\Delta+1$ acts on a Schwartz function by
\begin{align*}
\widehat{Pu}(\xi)=\left(|\xi|^2+1\right)\hat u(\xi),
\end{align*}
because $\widehat{-\partial_{x_j}^2u}(\xi)=\xi_j^2\hat u(\xi)$ and $\sum_{j=1}^n \xi_j^2=|\xi|^2$. Thus its Fourier multiplier symbol is $a(\xi)=|\xi|^2+1$.
Set
\begin{align*}
q(\xi)=\left(1+|\xi|^2\right)^{-1}.
\end{align*}
Then
\begin{align*}
q(\xi)a(\xi)=\left(1+|\xi|^2\right)^{-1}\left(1+|\xi|^2\right)=1,
\end{align*}
so the multiplier $Q$ with symbol $q$ is an exact inverse on $\mathcal S(\mathbb R^n)$. To see that $q\in S^{-2}$, note that differentiating $q=(1+|\xi|^2)^{-1}$ repeatedly gives finite sums of terms
\begin{align*}
C_{\beta,\gamma}\xi^\gamma\left(1+|\xi|^2\right)^{-1-|\beta|}
\end{align*}
with $|\gamma|\le |\beta|$. Hence
\begin{align*}
\left|\xi^\gamma\left(1+|\xi|^2\right)^{-1-|\beta|}\right|\le \langle\xi\rangle^{|\beta|}\langle\xi\rangle^{-2-2|\beta|}=\langle\xi\rangle^{-2-|\beta|},
\end{align*}
so $|\partial_\xi^\beta q(\xi)|\le C_\beta\langle\xi\rangle^{-2-|\beta|}$ for every multiindex $\beta$.
Finally, for $f\in H^s(\mathbb R^n)$ and $u=Qf$, one has $\hat u(\xi)=q(\xi)\hat f(\xi)$, and therefore
\begin{align*}
\|u\|_{H^{s+2}}^2=\int_{\mathbb R^n}\langle\xi\rangle^{2s+4}\left(1+|\xi|^2\right)^{-2}|\hat f(\xi)|^2\,d\xi.
\end{align*}
Since $\langle\xi\rangle^2=1+|\xi|^2$, this becomes
\begin{align*}
\|u\|_{H^{s+2}}^2=\int_{\mathbb R^n}\langle\xi\rangle^{2s}|\hat f(\xi)|^2\,d\xi=\|f\|_{H^s}^2.
\end{align*}
Thus the parametrix is not merely an inverse modulo smoothing errors in this constant-coefficient case: it is the actual Fourier inverse, carrying $H^s(\mathbb R^n)$ isometrically onto $H^{s+2}(\mathbb R^n)$ under the standard Sobolev norm.
[/example]
Variable coefficients introduce derivative terms in the symbolic product, but the same construction applies whenever the principal part is uniformly elliptic on compact sets.
[example: Variable Coefficient Uniformly Elliptic Operator]
Let
\begin{align*}
Pu=-\sum_{i,j=1}^n \partial_{x_i}(a_{ij}(x)\partial_{x_j}u)+c(x)u
\end{align*}
with smooth coefficients. Expanding the derivative gives
\begin{align*}
Pu=-\sum_{i,j=1}^n a_{ij}(x)\partial_{x_i}\partial_{x_j}u-\sum_{i,j=1}^n (\partial_{x_i}a_{ij})(x)\partial_{x_j}u+c(x)u.
\end{align*}
With the convention that $D_{x_j}=-i\partial_{x_j}$ has symbol $\xi_j$, the second-order part $-\sum_{i,j}a_{ij}\partial_{x_i}\partial_{x_j}$ has principal symbol
\begin{align*}
p_2(x,\xi)=\sum_{i,j=1}^n a_{ij}(x)\xi_i\xi_j.
\end{align*}
The terms involving $(\partial_{x_i}a_{ij})\partial_{x_j}$ have order $1$, and $c(x)u$ has order $0$, so the full symbol has the form
\begin{align*}
p(x,\xi)=p_2(x,\xi)+p_1(x,\xi)+p_0(x,\xi)
\end{align*}
with $p_1\in S^1$ and $p_0\in S^0$ locally in $x$.
Fix a compact $K\subset U$, and suppose that for $x\in K$ and all $\xi\in\mathbb R^n$,
\begin{align*}
\sum_{i,j=1}^n a_{ij}(x)\xi_i\xi_j\ge \theta_K|\xi|^2
\end{align*}
for some $\theta_K>0$. For $|\xi|\ge 1$, the identity $\langle\xi\rangle^2=1+|\xi|^2$ gives
\begin{align*}
\langle\xi\rangle^2\le 2|\xi|^2.
\end{align*}
Hence
\begin{align*}
p_2(x,\xi)\ge \theta_K|\xi|^2\ge \frac{\theta_K}{2}\langle\xi\rangle^2.
\end{align*}
Since $p_1+p_0$ has order at most $1$ on $K$, there is $C_K>0$ such that
\begin{align*}
|p_1(x,\xi)+p_0(x,\xi)|\le C_K\langle\xi\rangle.
\end{align*}
Choose $R_K\ge 1$ so that $C_K\langle\xi\rangle\le \frac{\theta_K}{4}\langle\xi\rangle^2$ whenever $|\xi|\ge R_K$. Then for $x\in K$ and $|\xi|\ge R_K$,
\begin{align*}
|p(x,\xi)|\ge |p_2(x,\xi)|-|p_1(x,\xi)+p_0(x,\xi)|.
\end{align*}
Using the two preceding bounds,
\begin{align*}
|p(x,\xi)|\ge \frac{\theta_K}{2}\langle\xi\rangle^2-\frac{\theta_K}{4}\langle\xi\rangle^2=\frac{\theta_K}{4}\langle\xi\rangle^2.
\end{align*}
Thus $P$ is elliptic of order $2$ on compact subsets of $U$.
At the leading symbolic level, the parametrix symbol begins with
\begin{align*}
q_{-2}(x,\xi)=p_2(x,\xi)^{-1}
\end{align*}
for large $|\xi|$, because
\begin{align*}
q_{-2}(x,\xi)p_2(x,\xi)=p_2(x,\xi)^{-1}p_2(x,\xi)=1.
\end{align*}
The lower-order terms in the parametrix then cancel the order $-1,-2,\ldots$ errors coming from $p_1$, $p_0$, and derivatives of the coefficients in the symbolic composition formula. The example shows that the uniform [quadratic lower bound](/theorems/2052) on the principal part is exactly what starts the elliptic parametrix construction.
[/example]
## Smoothing Remainders and Uniqueness
The parametrix theorem leaves remainders in $\Psi^{-\infty}$. The next question is how much information is lost by ignoring such operators, and whether the parametrix depends on the choices made during construction. Proper support remains part of the local calculus here: without it, composing operators or applying them to compactly supported test functions can create nonlocal expressions that are not controlled on compact subsets. For example, on any nonempty open set $U$, the integral operator with smooth kernel $K(x,y)=1$ is smoothing as a kernel, but it sends a test function $u$ to the constant function $\int_U u(y)\,dy$ and is not properly supported; local cutoffs are needed to keep the operator algebra compatible with support.
[definition: Smoothing Operator]
An operator $R:C_c^\infty(U)\to \mathcal D'(U)$ is smoothing if its Schwartz kernel is $C^\infty$ on $U\times U$. The class of properly supported smoothing pseudodifferential operators is denoted $\Psi^{-\infty}(U)$.
[/definition]
Smoothing operators are negligible in symbolic algebra because their full symbols lie in $S^{-N}$ for every $N$. The analytic question is how this symbolic smallness translates into Sobolev mapping, since elliptic regularity will need remainders that improve arbitrary orders.
[quotetheorem:7697]
[citeproof:7697]
Thus parametrices are genuine inverses after passing to the quotient algebra $\Psi^*(U)/\Psi^{-\infty}(U)$. The proper support assumption is doing work: it lets smoothing remainders compose with pseudodifferential operators without leaving the local calculus. The theorem does not say that a smoothing remainder is analytically irrelevant in every global problem, since compact smoothing operators may still affect kernels and eigenvalues. It says that smoothing remainders do not affect finite-order local Sobolev regularity, which is the level at which elliptic parametrices operate. The next theorem records that the ambiguity in the construction is exactly smoothing.
[quotetheorem:7698]
[citeproof:7698]
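The mechanism behind this uniqueness statement can be sketched in one line. As an assumption made only for this illustration, write the two parametrix identities as $QP=I+R_1$ and $PQ'=I+R_2$ with $R_1,R_2\in\Psi^{-\infty}(U)$; the names $R_1,R_2$ are not taken from the theorem. Associativity of composition for properly supported operators then gives
\begin{align*}
Q=Q(PQ'-R_2)=(QP)Q'-QR_2=(I+R_1)Q'-QR_2=Q'+R_1Q'-QR_2.
\end{align*}
Both $R_1Q'$ and $QR_2$ are smoothing because $\Psi^{-\infty}(U)$ is a two-sided ideal in the properly supported calculus, so $Q-Q'\in\Psi^{-\infty}(U)$.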
The uniqueness theorem removes dependence on choices of cutoffs and asymptotic representatives. The hypotheses cannot be dropped: if $P=0$, then every candidate $Q$ fails to satisfy $QP-I\in\Psi^{-\infty}(U)$ because $-I$ is not smoothing. Even for elliptic $P$, uniqueness holds only modulo smoothing, since adding any properly supported smoothing operator to a parametrix leaves both parametrix identities valid modulo $\Psi^{-\infty}(U)$. The remaining question is whether these two-sided identities can be packaged as a true inverse statement inside the quotient algebra where smoothing operators are set to zero.
[quotetheorem:7699]
[citeproof:7699]
This quotient statement is algebraic, not a Fredholm theorem. It does not assert that $P$ has a genuine inverse on a chosen Hilbert or Sobolev space, because kernels and cokernels survive outside the symbolic quotient. The proper support condition ensures that the quotient multiplication is defined locally; without a composition framework the displayed inverse identities would not be meaningful as operator equalities. The analytic consequence of the quotient inverse is therefore regularity rather than global solvability, which is the subject of the next section.
## Elliptic Regularity in Sobolev Scales
The analytic purpose of a parametrix is to transfer regularity from $Pu$ back to $u$. The problem is local: if $Pu$ has $s-m$ Sobolev derivatives near a point, ellipticity should force $u$ to have $s$ Sobolev derivatives there.
[quotetheorem:7700]
[citeproof:7700]
The lower-order norm in the estimate is the price of localising and of starting with an arbitrary distribution. Ellipticity is necessary: for $P=-\partial_{x_1}^2$ on $\mathbb R^2$, control of $Pu$ gives no control of high oscillations in $x_2$, so the conclusion fails in the transverse direction. Proper support and cutoffs are also part of the statement because the argument is local and must keep every composition inside a compact region where the symbol estimates apply. The theorem gives regularity, not existence or uniqueness; in boundary-value or global Fredholm settings the lower-order term is controlled by compactness or an independent a priori estimate.
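The failure in the transverse direction can be made explicit. As a sketch, take the hypothetical test distribution $u(x_1,x_2)=H(x_2)$, where $H$ is the Heaviside function; this choice is an illustration and is not part of the theorem. Then
\begin{align*}
Pu=-\partial_{x_1}^2 H(x_2)=0\in C^\infty(\mathbb R^2),
\qquad
\partial_{x_2}u=\delta(x_2)\notin L^2_{\mathrm{loc}}(\mathbb R^2).
\end{align*}
So $Pu\in H^{s-2}_{\mathrm{loc}}(\mathbb R^2)$ for every $s$, while $u\notin H^1_{\mathrm{loc}}(\mathbb R^2)$: no amount of regularity of $Pu$ removes the jump across $x_2=0$.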
[example: Regularity for the Bessel Operator]
Let $P=I-\Delta$ on $\mathbb R^n$, and write $\langle \xi\rangle^2=1+|\xi|^2$. For a Schwartz function $v$,
\begin{align*}
\widehat{Pv}(\xi)=\widehat v(\xi)+\widehat{-\Delta v}(\xi)=(1+|\xi|^2)\widehat v(\xi)=\langle\xi\rangle^2\widehat v(\xi).
\end{align*}
Thus the Fourier multiplier with symbol
\begin{align*}
q(\xi)=(1+|\xi|^2)^{-1}=\langle\xi\rangle^{-2}
\end{align*}
is an exact inverse on the Fourier side, because
\begin{align*}
q(\xi)(1+|\xi|^2)=\langle\xi\rangle^{-2}\langle\xi\rangle^2=1.
\end{align*}
Now suppose $u\in\mathcal S'(\mathbb R^n)$ and $f=(I-\Delta)u\in H^{s-2}(\mathbb R^n)$. Taking Fourier transforms in the tempered-distribution sense gives
\begin{align*}
\widehat f(\xi)=\langle\xi\rangle^2\widehat u(\xi).
\end{align*}
Multiplying both sides by $\langle\xi\rangle^{-2}$ gives
\begin{align*}
\widehat u(\xi)=\langle\xi\rangle^{-2}\widehat f(\xi).
\end{align*}
By definition, the $H^s$ norm of $u$ is
\begin{align*}
\|u\|_{H^s}^2=\int_{\mathbb R^n}\langle\xi\rangle^{2s}|\widehat u(\xi)|^2\,d\xi.
\end{align*}
Substituting the Fourier-side formula for $\widehat u$ gives
\begin{align*}
\|u\|_{H^s}^2=\int_{\mathbb R^n}\langle\xi\rangle^{2s}\left|\langle\xi\rangle^{-2}\widehat f(\xi)\right|^2\,d\xi.
\end{align*}
Since $\left|\langle\xi\rangle^{-2}\widehat f(\xi)\right|^2=\langle\xi\rangle^{-4}|\widehat f(\xi)|^2$, this becomes
\begin{align*}
\|u\|_{H^s}^2=\int_{\mathbb R^n}\langle\xi\rangle^{2s-4}|\widehat f(\xi)|^2\,d\xi.
\end{align*}
Because $2s-4=2(s-2)$, the last integral is exactly
\begin{align*}
\|u\|_{H^s}^2=\|f\|_{H^{s-2}}^2=\|(I-\Delta)u\|_{H^{s-2}}^2.
\end{align*}
Hence $(I-\Delta)u\in H^{s-2}(\mathbb R^n)$ forces $u\in H^s(\mathbb R^n)$, and in this constant-coefficient case the elliptic regularity estimate is an equality of Fourier-weighted norms.
[/example]
Elliptic regularity also applies to systems, provided the matrix symbol is elliptic. The parametrix then has a matrix-valued symbol and acts on vector-valued Sobolev spaces, whose norms are taken componentwise.
[example: Elliptic Systems]
[claim]If $P\in \Psi^m(U;\mathbb C^N)$ has elliptic matrix symbol $p(x,\xi)$ and $Pu\in H^{s-m}_{\mathrm{loc}}(U;\mathbb C^N)$, then $u\in H^s_{\mathrm{loc}}(U;\mathbb C^N)$.[/claim]
[proof]Fix a compact $K\subset U$. Matrix ellipticity gives constants $C_K,R_K>0$ such that $p(x,\xi)$ is invertible for $x\in K$, $|\xi|\ge R_K$, and
\begin{align*}
\|p(x,\xi)^{-1}\|_{\mathrm{op}}\le C_K\langle\xi\rangle^{-m}.
\end{align*}
Choose a cutoff $\chi(\xi)$ with $\chi(\xi)=0$ for $|\xi|\le R_K$ and $\chi(\xi)=1$ for $|\xi|\ge 2R_K$, and set
\begin{align*}
q_0(x,\xi)=\chi(\xi)p(x,\xi)^{-1}.
\end{align*}
On the region $|\xi|\ge 2R_K$, this gives
\begin{align*}
q_0(x,\xi)p(x,\xi)=p(x,\xi)^{-1}p(x,\xi)=I_N.
\end{align*}
The symbolic product therefore has the form
\begin{align*}
q_0\# p=I_N-r_1
\end{align*}
with $r_1\in S^{-1}$, because the zeroth term is $q_0p$ and every nonzero derivative term in the composition formula lowers the symbolic order.
Suppose inductively that matrix symbols $q_0,\dots,q_k$ have been chosen with $q_j\in S^{-m-j}$ and
\begin{align*}
(q_0+\cdots+q_k)\# p=I_N-r_{k+1}
\end{align*}
where $r_{k+1}\in S^{-k-1}$. Let $\rho_{-k-1}$ be the homogeneous leading term of $r_{k+1}$ in order $-k-1$, and let $p_m$ be the principal part of $p$. To cancel $\rho_{-k-1}$ in the left parametrix equation, choose the next term so that
\begin{align*}
q_{k+1}(x,\xi)p_m(x,\xi)=\rho_{-k-1}(x,\xi).
\end{align*}
Since $p_m(x,\xi)$ is invertible for large $|\xi|$, this is solved by the ordered matrix product
\begin{align*}
q_{k+1}(x,\xi)=\rho_{-k-1}(x,\xi)p_m(x,\xi)^{-1}.
\end{align*}
Then the leading order $-k-1$ part of $(q_0+\cdots+q_k+q_{k+1})\#p-I_N$ is
\begin{align*}
-\rho_{-k-1}+q_{k+1}p_m=-\rho_{-k-1}+\rho_{-k-1}p_m^{-1}p_m=0.
\end{align*}
Thus the new remainder lies in $S^{-k-2}$. Asymptotic summation gives a matrix symbol $q\sim \sum_{j=0}^\infty q_j\in S^{-m}$ with
\begin{align*}
q\#p-I_N\in S^{-\infty}.
\end{align*}
The corresponding operator $Q\in\Psi^{-m}(U;\mathbb C^N)$ satisfies
\begin{align*}
QP=I+R
\end{align*}
with $R\in\Psi^{-\infty}(U;\mathbb C^N)$, by the matrix version of the *Elliptic Parametrix Theorem*.
Now let $\eta,\psi\in C_c^\infty(U)$ with $\psi=1$ near $\operatorname{supp}\eta$. From $QP=I+R$ we get
\begin{align*}
u=QPu-Ru.
\end{align*}
Multiplying by $\eta$ and inserting $\psi$ near the relevant support gives
\begin{align*}
\eta u=\eta Q\psi Pu-\eta R\psi u+S u
\end{align*}
where $S$ is smoothing, coming from the support-separated cutoff errors. The pseudodifferential mapping theorem gives
\begin{align*}
\|\eta Q\psi Pu\|_{H^s}\le C\|\psi Pu\|_{H^{s-m}},
\end{align*}
because $Q$ has order $-m$. The smoothing terms satisfy, for any $N_0>0$,
\begin{align*}
\|\eta R\psi u\|_{H^s}+\|S u\|_{H^s}\le C_{N_0}\|\psi u\|_{H^{s-N_0}}.
\end{align*}
Since $Pu\in H^{s-m}_{\mathrm{loc}}(U;\mathbb C^N)$ and every distribution has some finite local Sobolev order on compact subsets, choosing $N_0$ large enough makes the right-hand side finite. Hence $\eta u\in H^s(\mathbb R^n;\mathbb C^N)$ for every compactly supported cutoff $\eta$, so $u\in H^s_{\mathrm{loc}}(U;\mathbb C^N)$.[/proof]
The only new feature compared with the scalar case is that the correction terms must be multiplied in the correct matrix order; ellipticity is invertibility of the whole principal matrix, not componentwise nonvanishing.
[/example]
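To illustrate the last remark, consider the first-order system on $\mathbb R^2$ with $N=2$ given by the Cauchy-Riemann type matrix below; it is chosen here as an illustration and does not come from the theorem. With the convention that $\partial_{x_j}$ has symbol $i\xi_j$,
\begin{align*}
P=\begin{pmatrix}\partial_{x_1}&-\partial_{x_2}\\ \partial_{x_2}&\partial_{x_1}\end{pmatrix},
\qquad
p(\xi)=i\begin{pmatrix}\xi_1&-\xi_2\\ \xi_2&\xi_1\end{pmatrix},
\qquad
\det p(\xi)=-(\xi_1^2+\xi_2^2).
\end{align*}
Each entry vanishes on a coordinate axis, yet the matrix is invertible for every $\xi\ne 0$, with
\begin{align*}
p(\xi)^{-1}=-\frac{i}{|\xi|^2}\begin{pmatrix}\xi_1&\xi_2\\ -\xi_2&\xi_1\end{pmatrix},
\qquad
\|p(\xi)^{-1}\|_{\mathrm{op}}=|\xi|^{-1}.
\end{align*}
Conversely, the $2\times 2$ matrix with all four entries equal to $|\xi|$ has nonvanishing entries but determinant zero, so it is not elliptic.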
The system example marks the boundary of what local elliptic theory provides: it upgrades regularity, but it does not by itself solve global invertibility questions. The remaining obstruction is not symbolic order; it lies in kernels, cokernels, support conditions, and boundary conditions.
[remark: What Ellipticity Does Not See]
Ellipticity is a high-frequency condition. It does not decide finite-dimensional kernels, cokernels, boundary conditions, or global solvability on noncompact domains. The parametrix says that all failures of exact inversion are smoothing from the local pseudodifferential viewpoint.
[/remark]
This chapter completes the basic elliptic part of the calculus. The next step in a microlocal course is to replace global ellipticity by ellipticity only on a conic region of phase space, leading to microlocal parametrices and wave front set regularity.
Elliptic parametrices and regularity estimates have so far been local and order-dependent. To extract global Fredholm statements, meaning finite-dimensional kernels and cokernels, compactness is the final analytic input needed. Negative-order operators become compact on compact manifolds, so the global index is a finite integer determined solely by the principal symbol.
# 10. Compactness, Fredholm Theory, and Applications
This chapter turns the local elliptic regularity of Chapter 9 into global Fredholm statements. The preceding chapters constructed parametrices and Sobolev estimates for elliptic operators; here the remaining analytic input is compactness. The central mechanism is that negative-order operators gain derivatives, and on bounded or compact spaces this gain becomes a compact map between Sobolev spaces.
## Negative-Order Operators and Compactness After Localization
The first question is why an operator of negative order should behave like a compact perturbation rather than merely a bounded lower-order term. On all of $\mathbb R^n$, translation prevents compactness: a bounded sequence can drift to infinity without gaining a convergent subsequence. Compactness appears after localization, where the gain in Sobolev order can be combined with Rellich's theorem on bounded sets.
[quotetheorem:7720]
[citeproof:7720]
The hypotheses in Rellich are exactly the two ways compactness can fail. If the boundedness or localization is removed, the translated sequence $u_j(x)=u(x-je_1)$ is bounded in every $H^s(\mathbb R^n)$ but has no strongly convergent subsequence in $H^t(\mathbb R^n)$ for nonzero $u\in C_c^\infty(\mathbb R^n)$. If the strict inequality $s>t$ is replaced by $s=t$, high-frequency oscillations such as suitably normalized functions $e^{ijx_1}\phi(x)$ on a bounded coordinate patch stay bounded in $H^s$ but have no subsequence converging in $H^s$. Rellich therefore proves compactness only when both a gain of derivatives and a bounded spatial region are present; it does not say that bounded sets in one Sobolev space are compact in the same norm, nor does it give any global compactness on noncompact spaces.
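The translation obstruction can be checked on the Fourier side; the following is a sketch under the convention $\widehat v(\xi)=\int e^{-ix\cdot\xi}v(x)\,dx$, which is assumed here. Translation multiplies the Fourier transform by a unimodular factor, so for every $t\in\mathbb R$,
\begin{align*}
\widehat{u_j}(\xi)=e^{-ij\xi_1}\widehat u(\xi),
\qquad
\|u_j\|_{H^t}^2=\int_{\mathbb R^n}\langle\xi\rangle^{2t}|\widehat u(\xi)|^2\,d\xi=\|u\|_{H^t}^2.
\end{align*}
On the other hand $u_j\to 0$ in $\mathcal D'(\mathbb R^n)$, because the supports leave every compact set. A subsequence converging in $H^t$ would therefore have limit $0$, which contradicts $\|u_j\|_{H^t}=\|u\|_{H^t}>0$.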
Rellich by itself is a [compactness theorem](/theorems/2748) for Sobolev inclusions, not yet a compactness theorem for pseudodifferential operators. To use it in the calculus, the Sobolev gain must come from an operator and the bounded region must come from localization. The next theorem packages these two ingredients into the form used later in parametrix identities: a negative-order operator becomes compact on a fixed Sobolev space once the input and output are confined to a common compact coordinate region.
[quotetheorem:7701]
[citeproof:7701]
Each hypothesis in the local compactness theorem has a distinct role. The condition $m<0$ is the source of extra derivatives; for an order-zero operator, even multiplication by a nonzero cutoff is not compact on $H^s$, as high-frequency oscillations can remain in a fixed compact set while avoiding strong convergence. Compact support of the cutoffs rules out the other obstruction, namely translation to infinity: the unlocalized operator $(I-\Delta)^{-1/2}$ on $\mathbb R^n$ gains one derivative but is not compact $H^s(\mathbb R^n)\to H^s(\mathbb R^n)$ because translated bumps remain separated. The condition $\psi=1$ near $\operatorname{supp}\chi$ ensures that the part of the input seen by the output cutoff is genuinely localized; without this buffer, the pseudolocal argument would have to track off-support kernel terms rather than reducing to a bounded coordinate patch.
This result explains why remainders of order $-1$ or lower are harmless in many elliptic estimates on compact manifolds: after [partition of unity](/page/Partition%20of%20Unity), each piece is compact on a fixed Sobolev space. Smoothing operators are a stronger special case, since they improve by every finite number of derivatives, but smoothing alone still needs compact support or compact ambient geometry to become compact as an operator on a fixed Sobolev space.
[example: Compactness of a Localized Bessel Potential]
Let $B=(I-\Delta)^{-1/2}$, fix $\chi\in C_c^\infty(\mathbb R^n)$, and set $A=\chi B\chi$. For $v\in H^s(\mathbb R^n)$, the Fourier transform of $Bv$ is
\begin{align*}
\widehat{Bv}(\xi)=(1+|\xi|^2)^{-1/2}\widehat v(\xi).
\end{align*}
Therefore
\begin{align*}
\|Bv\|_{H^{s+1}}^2=\int_{\mathbb R^n}(1+|\xi|^2)^{s+1}(1+|\xi|^2)^{-1}|\widehat v(\xi)|^2\,d\xi=\|v\|_{H^s}^2.
\end{align*}
Thus $B:H^s(\mathbb R^n)\to H^{s+1}(\mathbb R^n)$ is bounded, and multiplication by $\chi\in C_c^\infty(\mathbb R^n)$ is bounded on each Sobolev space, so
\begin{align*}
\|Au\|_{H^{s+1}}\le C_\chi\|B(\chi u)\|_{H^{s+1}}=C_\chi\|\chi u\|_{H^s}\le C_\chi'\|u\|_{H^s}.
\end{align*}
Moreover, $\operatorname{supp}(Au)\subseteq \operatorname{supp}\chi$. Choose a bounded open set $W$ with smooth boundary and $\operatorname{supp}\chi\Subset W$. If $(u_j)$ is bounded in $H^s(\mathbb R^n)$, then $(Au_j)$ is bounded in $H^{s+1}(W)$ and supported in the fixed compact set $\operatorname{supp}\chi$. Since $s+1>s$, *Rellich Compactness Input* gives a subsequence converging in $H^s(W)$. Because all functions in the subsequence are supported in the same compact subset of $W$, this convergence is the same as convergence in $H^s(\mathbb R^n)$ after extending by zero outside $W$. Hence $A:H^s(\mathbb R^n)\to H^s(\mathbb R^n)$ is compact. This is the localized model for negative-order parametrix remainders: the multiplier supplies one derivative, and the two cutoffs place the output in a bounded region where Rellich applies.
[/example]
## Elliptic Estimates with Lower-Order Error Terms
The next problem is to convert a parametrix identity into an estimate for $u$ in terms of $Pu$. A parametrix gives control of the leading Sobolev norm, but the identity also produces a smoothing or lower-order remainder. On noncompact spaces such remainders cannot always be discarded; on compact manifolds they become compact or can be placed in a weaker norm.
[quotetheorem:7702]
[citeproof:7702]
The lower-order error is not a defect in the estimate; it records the information that ellipticity alone cannot remove. If $P$ has a nonzero kernel, then $Pu=0$ gives no control of the kernel component of $u$, so a weaker norm or an orthogonality condition must remain. If ellipticity is dropped, the estimate fails in a more basic way: for $P=\partial_{x_1}$ on $\mathbb R^2$, functions depending only on $x_2$ have $Pu=0$ while their full first-order Sobolev norm can be arbitrarily large. If localization is dropped, singularities or mass away from the coordinate patch can enter the left-hand side without being measured by the localized right-hand side.
[remark: Why the Weak Norm Appears]
The term $\|\chi u\|_{H^{-N}}$ prevents the estimate from claiming invertibility from ellipticity alone. A solution lying in the kernel of $P$ has $Pu=0$, so some information about $u$ must remain on the right unless the kernel has already been controlled. On compact manifolds this term becomes a compact perturbation of the main estimate. The estimate is also local: it gives regularity and bounds inside the region selected by the cutoffs, not solvability of $Pu=f$, not uniqueness, and not a global Fredholm statement on a noncompact space.
[/remark]
The estimate is most useful when applied repeatedly. Once it gives one improvement in Sobolev regularity, the same argument can be restarted with a stronger right-hand side, which is the standard elliptic bootstrap mechanism.
[example: Regularity Bootstrap for Pu Equals f]
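Let $P\in\Psi^m(U)$ be elliptic and properly supported, and suppose $u\in H^t_{\operatorname{loc}}(U)$ solves $Pu=f$ with $f\in H^s_{\operatorname{loc}}(U)$. Let $K\Subset U$ be fixed, and choose cutoffs $\psi,\chi\in C_c^\infty(U)$ with $\psi=1$ near $K$ and $\chi=1$ near $\operatorname{supp}\psi$. Since $u\in H^t_{\operatorname{loc}}(U)$, choose $N>0$ with $t\ge -N$. Then $\chi u\in H^t(U)$, and Sobolev monotonicity gives $\chi u\in H^{-N}(U)$.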
Applying *Elliptic A Priori Estimate with Lower-Order Error* to $Pu=f$ gives
\begin{align*}
\|\psi u\|_{H^{s+m}} \le C\bigl(\|\chi Pu\|_{H^s}+\|\chi u\|_{H^{-N}}\bigr).
\end{align*}
Because $Pu=f$ on $U$,
\begin{align*}
\|\chi Pu\|_{H^s}=\|\chi f\|_{H^s}.
\end{align*}
The assumption $f\in H^s_{\operatorname{loc}}(U)$ gives $\chi f\in H^s(U)$, and the previous paragraph gives $\chi u\in H^{-N}(U)$, so the right-hand side is finite. Hence
\begin{align*}
\psi u\in H^{s+m}(U).
\end{align*}
Since $\psi=1$ near $K$, this means $u\in H^{s+m}$ on a neighbourhood of $K$. As $K\Subset U$ was arbitrary, $u\in H^{s+m}_{\operatorname{loc}}(U)$. Thus the equation transfers local $H^s$ regularity of $f$ into local $H^{s+m}$ regularity of $u$; if $f$ is known in higher Sobolev orders, the same argument applied at each higher order gives the usual elliptic bootstrap.
[/example]
The bootstrap example handles finite Sobolev orders. The limiting case asks what happens when the right-hand side lies in every local Sobolev space, that is, when it is smooth, and the answer is the smooth elliptic regularity theorem.
[quotetheorem:7703]
[citeproof:7703]
Smooth elliptic regularity is a regularity theorem, not a solvability theorem. It says that a distributional solution has no hidden roughness once $Pu$ is smooth, but it does not say that a solution exists for a given smooth right-hand side, nor that it is unique. Ellipticity is essential: for $P=\partial_{x_1}$ on $\mathbb R^2$, the equation $Pu=0$ permits arbitrary distributions in the $x_2$ variable, so smoothness of $Pu$ gives no smoothness in the missing direction. The statement is also local; boundary behavior, growth at infinity, and global kernel or cokernel questions require extra hypotheses. Those global finite-dimensional questions are the subject of Fredholm theory.
## Fredholm Consequences on Compact Manifolds
The final question is what ellipticity gives when there is no boundary and no escape to infinity. On a compact manifold, every local compactness statement can be patched together by finitely many charts. The parametrix identity then says that an elliptic operator is invertible modulo compact operators, which is exactly the Fredholm situation.
[definition: Fredholm Operator]
Let $X$ and $Y$ be Banach spaces. A bounded linear map $T:X\to Y$ is Fredholm if $\ker T$ is finite-dimensional, $\operatorname{Range}(T)$ is closed in $Y$, and $Y/\operatorname{Range}(T)$ is finite-dimensional.
[/definition]
The quotient dimension measures the number of compatibility conditions required to solve $Tu=f$. To prove that elliptic operators have these finite-dimensional defects, we need a functional-analytic criterion that recognizes Fredholm operators from compact remainders.
[quotetheorem:6430]
[citeproof:6430]
Compactness is the decisive hypothesis. If $K=I$ on an infinite-dimensional Banach space, then $K$ is bounded but not compact, and $I-K=0$ has infinite-dimensional kernel and cokernel, so the conclusion fails. There are also noncompact perturbations with trivial kernel but nonclosed range: on $\ell^2(\mathbb N)$, let $K$ be the diagonal operator $K e_j=(1-1/j)e_j$. Then $I-K$ is the diagonal operator $e_j\mapsto j^{-1}e_j$, which is injective. Its range is dense because it contains every finitely supported vector, but it is not closed: the inverse multiplies the $j$th coefficient by $j$ and is therefore unbounded, which cannot happen for a bounded injective operator with closed range. The theorem therefore does not say that every bounded perturbation of the identity is Fredholm; it says that compact perturbations preserve enough finite-dimensional structure for the failure of invertibility to be algebraic rather than infinite-dimensional.
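The nonclosed range in the diagonal example can be exhibited by one explicit vector; the names $f$ and $f_n$ below are introduced only for this sketch. Set
\begin{align*}
f=\sum_{j=1}^\infty j^{-1}e_j,
\qquad
f_n=\sum_{j=1}^n j^{-1}e_j=(I-K)\sum_{j=1}^n e_j.
\end{align*}
Then $f\in\ell^2(\mathbb N)$ because $\sum_j j^{-2}<\infty$, each $f_n$ lies in the range of $I-K$, and $f_n\to f$ in $\ell^2$. The only candidate preimage of $f$ has every coefficient equal to $1$, which is not square summable, so $f$ is in the closure of the range but not in the range.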
This theorem is the analytic bridge from parametrix identities to elliptic Fredholm theory. The pseudodifferential construction supplies compact remainders, and the [Fredholm alternative](/theorems/72) supplies finite-dimensional algebraic consequences.
[quotetheorem:7704]
[citeproof:7704]
The theorem says that elliptic operators on compact manifolds fail to be invertible only by finite-dimensional data, and each global hypothesis is doing work. Compactness of $M$ rules out escape to infinity; for instance, $\Delta:H^2(\mathbb R^n)\to L^2(\mathbb R^n)$ is elliptic but not Fredholm because low-frequency concentration prevents closed range. The absence of boundary avoids boundary data: on a compact domain with boundary in dimension at least $2$, the Laplacian without boundary conditions has an infinite-dimensional kernel of harmonic functions, so ellipticity alone does not give Fredholmness. Ellipticity is also necessary; on the torus $\mathbb T^n$ with $n\ge 2$, $\partial_{x_1}$ has an infinite-dimensional kernel consisting of functions of the remaining variables.
Fredholmness itself is weaker than invertibility. It gives finite-dimensional kernel, closed range, and finite-dimensional cokernel, but it does not say that the kernel vanishes, that every right-hand side is solvable, or that the index is zero. Those further conclusions require extra information, such as positivity, an adjoint compatibility condition, or an index theorem.
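A minimal illustration, chosen here and not taken from the theorem, is $P=-i\,\partial_\theta$ on the circle $S^1=\mathbb R/2\pi\mathbb Z$, which is elliptic of order $1$. On Fourier modes,
\begin{align*}
P e^{ik\theta}=k\,e^{ik\theta},\qquad k\in\mathbb Z,
\end{align*}
so for $P:H^{s+1}(S^1)\to H^s(S^1)$ one finds
\begin{align*}
\ker P=\operatorname{span}\{1\},
\qquad
\operatorname{Range}(P)=\Bigl\{f\in H^s(S^1):\widehat f(0)=0\Bigr\}.
\end{align*}
The kernel and cokernel are both one-dimensional, so $P$ is Fredholm of index $0$ but is neither injective nor surjective; $Pu=f$ is solvable exactly when the zeroth Fourier coefficient of $f$ vanishes.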
[example: Fredholmness of an Elliptic Second-Order Operator]
Let $M$ be compact without boundary and let $P=-\Delta_g+V$, where $\Delta_g$ is the Laplace-Beltrami operator of a smooth Riemannian metric and $V\in C^\infty(M)$ is real-valued. In local coordinates, the second-order part of $-\Delta_g$ has principal symbol $g^{ij}(x)\xi_i\xi_j$, while multiplication by $V$ has order $0$ and therefore contributes nothing to the order-$2$ principal symbol. Thus
\begin{align*}
\sigma_2(P)(x,\xi)=g^{ij}(x)\xi_i\xi_j=|\xi|_g^2.
\end{align*}
Since $g$ is positive definite, $|\xi|_g^2>0$ whenever $\xi\ne 0$, so $P$ is elliptic of order $2$. By *[Fredholm Theorem for Elliptic Pseudodifferential Operators](/theorems/7704)*, for every $s\in\mathbb R$ the map
\begin{align*}
P:H^{s+2}(M)\longrightarrow H^s(M)
\end{align*}
is Fredholm.
If $u\in\ker P$ and $V\ge 0$, then $u$ is smooth by *Smooth Elliptic Regularity* and $Pu=0$, so pairing with $u$ in $L^2(M)$ gives
\begin{align*}
0=\langle Pu,u\rangle_{L^2}=\langle -\Delta_g u,u\rangle_{L^2}+\langle Vu,u\rangle_{L^2}.
\end{align*}
Integration by parts on the compact boundaryless manifold gives
\begin{align*}
\langle -\Delta_g u,u\rangle_{L^2}=\int_M |\nabla u|_g^2\,d\operatorname{vol}_g.
\end{align*}
The potential term is
\begin{align*}
\langle Vu,u\rangle_{L^2}=\int_M V|u|^2\,d\operatorname{vol}_g.
\end{align*}
Hence
\begin{align*}
0=\int_M |\nabla u|_g^2\,d\operatorname{vol}_g+\int_M V|u|^2\,d\operatorname{vol}_g.
\end{align*}
Both integrands are nonnegative, so $\nabla u=0$ and $V|u|^2=0$. Thus $u$ is constant on each connected component, and if every connected component meets the set where $V>0$, that constant must be $0$. In that case $\ker P=0$, so the Fredholm map is injective and therefore invertible onto its closed range; the remaining obstruction to solving $Pu=f$ for every $f\in H^s(M)$ is exactly the finite-dimensional cokernel.
[/example]
Fredholmness gives finite-dimensional kernel and cokernel at the Sobolev level. Elliptic regularity then strengthens this information by identifying the Sobolev kernel with a space of smooth solutions.
[example: Finite-Dimensional Kernel of an Elliptic Operator]
Let $P\in\Psi^m(M)$ be elliptic on a compact manifold without boundary, and fix $s\in\mathbb R$. By *Fredholm Theorem for Elliptic Pseudodifferential Operators*, the bounded map
\begin{align*}
P:H^{s+m}(M)\longrightarrow H^s(M)
\end{align*}
is Fredholm, so
\begin{align*}
\dim\ker(P:H^{s+m}(M)\longrightarrow H^s(M))<\infty.
\end{align*}
We now identify this Sobolev kernel with the smooth solution space. If
\begin{align*}
u\in\ker(P:H^{s+m}(M)\longrightarrow H^s(M)),
\end{align*}
then $u\in H^{s+m}(M)\subset\mathcal D'(M)$ and
\begin{align*}
Pu=0.
\end{align*}
Since $0\in C^\infty(M)$, *Smooth Elliptic Regularity* gives
\begin{align*}
u\in C^\infty(M).
\end{align*}
Thus
\begin{align*}
\ker(P:H^{s+m}(M)\longrightarrow H^s(M))\subseteq \{u\in C^\infty(M):Pu=0\}.
\end{align*}
Conversely, if $u\in C^\infty(M)$ and $Pu=0$, then smooth functions on compact manifolds lie in every Sobolev space, so $u\in H^{s+m}(M)$ and therefore
\begin{align*}
u\in\ker(P:H^{s+m}(M)\longrightarrow H^s(M)).
\end{align*}
Hence
\begin{align*}
\ker(P:H^{s+m}(M)\longrightarrow H^s(M))=\{u\in C^\infty(M):Pu=0\}.
\end{align*}
The right-hand side does not depend on $s$, so the elliptic nullspace is a single finite-dimensional space of smooth modes, not a family of Sobolev-dependent kernels.
[/example]
The Fredholm theorem is the endpoint of the basic pseudodifferential calculus developed in this course and the starting point for several later theories. In spectral theory, it explains why elliptic [self-adjoint operators](/page/Self-Adjoint%20Operators) on compact manifolds have discrete spectral features rather than continuous escape phenomena. In Hodge theory, it is the analytic input behind finite-dimensional spaces of harmonic forms. In index theory, the remaining finite-dimensional obstruction is organized into the integer $\dim\ker P-\dim\operatorname{coker}P$, whose computation depends on the symbol rather than on lower-order analytic details.
The Fredholm index is a global topological integer, yet it depends only on the principal symbol, which lives naturally on the cotangent bundle. To move from Euclidean coordinates to smooth manifolds, we must verify that principal symbols, ellipticity, and parametrices are coordinate-independent. The next chapter places the entire calculus invariantly on the cotangent bundle of a manifold.
# 11. Coordinate Localisation and Operators on Manifolds
This chapter returns to the geometric setting implicit in the compact-manifold Fredholm discussion and explains how the Euclidean pseudodifferential calculus is placed on smooth manifolds. It assumes the Euclidean symbol calculus, oscillatory-integral kernels, smoothing remainders, Sobolev mapping properties, and the basic language of smooth manifolds, charts, partitions of unity, densities, and the cotangent bundle. The main difficulty is that the formula for an operator uses coordinates and Fourier variables, while the final object should not depend on the auxiliary choices. We first isolate the coordinate-invariant part of a symbol, then use cutoffs and partitions of unity to assemble properly supported operators, and finally apply the resulting calculus to elliptic operators on compact manifolds.
## Coordinate Symbols and the Cotangent Bundle
The first question is how a symbol written in local coordinates can define an invariant object on a manifold. Fourier variables do not transform as tangent vectors; they transform as covectors. This is why the principal symbol naturally lives on the cotangent bundle rather than on the product of a coordinate patch with $\mathbb R^n$.
Let $M$ be a smooth $n$-manifold and let $(U,\varphi)$ be a chart with coordinates $x=(x_1,\dots,x_n)$. A covector $\xi \in T_x^*M$ is represented in this chart as
\begin{align*}
\xi = \sum_{j=1}^n \xi_j\, dx_j.
\end{align*}
Under a change of coordinates $y=\kappa(x)$, the same covector has components $\eta$ satisfying
\begin{align*}
\sum_{j=1}^n \xi_j\, dx_j = \sum_{k=1}^n \eta_k\, dy_k,
\qquad
\xi_j = \sum_{k=1}^n \eta_k\,\frac{\partial y_k}{\partial x_j}.
\end{align*}
Thus the correct transition rule is the cotangent lift of the coordinate change.
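One way to check this rule is through the pairing with tangent vectors, which must not depend on coordinates. As a sketch, let $v=\sum_j v_j\,\partial_{x_j}$ be a tangent vector (the symbols $v$ and $w$ are introduced only here), whose components in the $y$ coordinates are $w_k=\sum_j \frac{\partial y_k}{\partial x_j}v_j$. Then
\begin{align*}
\sum_{k=1}^n \eta_k w_k=\sum_{k=1}^n\sum_{j=1}^n \eta_k\,\frac{\partial y_k}{\partial x_j}\,v_j=\sum_{j=1}^n \xi_j v_j.
\end{align*}
Tangent components are pushed forward by the Jacobian, while covector components are pulled back by its transpose, and these two rules together are what keep the pairing invariant.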
[definition: Classical Symbol in a Chart]
Let $(U,\varphi)$ be a coordinate chart on a smooth $n$-manifold $M$. A function $a \in C^\infty(\varphi(U)\times \mathbb R^n;\mathbb C)$ is a classical symbol of order $m$ in the chart if
\begin{align*}
a(x,\xi) \sim \sum_{j=0}^{\infty} a_{m-j}(x,\xi)
\end{align*}
in $S^m_{1,0}(\varphi(U)\times \mathbb R^n)$, where each $a_{m-j}$ is smooth for $\xi \ne 0$ and positively homogeneous of degree $m-j$ in $\xi$ for $|\xi|\ge 1$.
[/definition]
This definition records the Euclidean data available inside a chart, but it still depends on the chart. To compare two coordinate descriptions of the same operator, we need to separate the invariant leading term from the lower-order terms created by the change of variables. That need motivates the global principal symbol.
[definition: Principal Symbol on a Manifold]
Let $A:C_c^\infty(M)\to \mathcal D'(M)$ be a properly supported scalar classical pseudodifferential operator of order $m$ which, in each coordinate chart $(U,\varphi)$, is represented modulo smoothing operators by a classical symbol $a_U \in S^m_{\mathrm{cl}}(\varphi(U)\times \mathbb R^n)$. The principal symbol is the map $\sigma_m(A):T^*M\setminus 0\to \mathbb C$ whose coordinate representative over $U$ is the homogeneous degree $m$ leading term $(a_U)_m(x,\xi)$.
[/definition]
The definition is meaningful only if the coordinate representatives glue. A change of coordinates rewrites both the phase and the amplitude, and this can create many lower-order correction terms in the local full symbol. The possible obstruction is that even the top homogeneous term might depend on the chosen chart; if that happened, there would be no global principal symbol on $T^*M\setminus 0$. What must be checked is that the leading term transforms by the cotangent lift, while the coordinate-dependent errors are pushed into order $m-1$.
[quotetheorem:7705]
[citeproof:7705]
The hypotheses matter in concrete ways. If two coordinate formulae describe different operators rather than the same operator modulo smoothing terms, equality of their leading pieces is not forced. The covector transformation rule matters just as much. For instance, on $\mathbb R$ the change of variable $y=e^x$ transforms covector components by $\xi=e^x\eta$. The operator $-\partial_x^2$ has principal symbol $\xi^2$ in the $x$-coordinate, so the correct transformed principal symbol is
\begin{align*}
e^{2x}\eta^2=y^2\eta^2.
\end{align*}
If the Fourier variable were incorrectly treated by the tangent rule $\eta=e^x\xi$, the transformed expression would instead be
\begin{align*}
e^{-2x}\eta^2=y^{-2}\eta^2,
\end{align*}
which is the reciprocal weight. Proper support is not part of the algebraic transformation law itself, but it is what lets the local representatives define the same global operator on distributions.
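The covector rule can be confirmed by changing variables in the operator itself. As a sketch, the chain rule with $y=e^x$ gives $\partial_x=y\,\partial_y$, so
\begin{align*}
-\partial_x^2=-(y\,\partial_y)(y\,\partial_y)=-y^2\partial_y^2-y\,\partial_y .
\end{align*}
With the convention that $\partial_y$ has symbol $i\eta$, the order-$2$ part has symbol $y^2\eta^2$, in agreement with the cotangent computation, and the first-order term $-y\,\partial_y$ is a lower-order correction produced by the change of variables.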
The theorem does not say that the full symbol is coordinate invariant. Lower-order terms change with the quantisation convention, the density used to write kernels, and the choice of coordinates. What survives canonically is the top homogeneous term, and that is exactly the piece used next to define ellipticity and to patch local constructions.
[example: Laplace Beltrami Principal Symbol]
Let $(M,g)$ be a Riemannian manifold, and write the Laplace-Beltrami operator in coordinates $x=(x_1,\dots,x_n)$ as
\begin{align*}
\Delta_g u=|g|^{-1/2}\sum_{i,j=1}^n \partial_{x_i}\left(|g|^{1/2}g^{ij}\partial_{x_j}u\right).
\end{align*}
Expanding the derivative by the product rule gives
\begin{align*}
\Delta_g u=\sum_{i,j=1}^n g^{ij}\partial_{x_i}\partial_{x_j}u+\sum_{i,j=1}^n |g|^{-1/2}\partial_{x_i}\left(|g|^{1/2}g^{ij}\right)\partial_{x_j}u.
\end{align*}
The second summation contains only first derivatives of $u$, so the order-$2$ part of $-\Delta_g$ is
\begin{align*}
-\sum_{i,j=1}^n g^{ij}(x)\partial_{x_i}\partial_{x_j}.
\end{align*}
Applying derivatives to the exponential $e^{ix\cdot \xi}$, we have $\partial_{x_j}e^{ix\cdot \xi}=i\xi_j e^{ix\cdot \xi}$ and therefore
\begin{align*}
-\partial_{x_i}\partial_{x_j}e^{ix\cdot \xi}=\xi_i\xi_j e^{ix\cdot \xi}.
\end{align*}
Thus the principal symbol is
\begin{align*}
\sigma_2(-\Delta_g)(x,\xi)=\sum_{i,j=1}^n g^{ij}(x)\xi_i\xi_j.
\end{align*}
Since $(g^{ij}(x))$ is the inverse of the positive definite metric matrix $(g_{ij}(x))$, this quadratic form is the dual metric norm
\begin{align*}
\sum_{i,j=1}^n g^{ij}(x)\xi_i\xi_j=|\xi|_{g^{-1}}^2.
\end{align*}
For every nonzero covector $\xi\in T_x^*M$, the dual metric is positive definite, so $|\xi|_{g^{-1}}^2>0$; hence $-\Delta_g$ is elliptic of order $2$.
[/example]
The example also shows the role of geometry: the metric identifies the leading part of the operator with the dual metric on $T^*M$. The lower-order terms of $\Delta_g$ change under coordinates, but the quadratic form above is intrinsic.
## Partitions of Unity and Proper Support
The next problem is global construction. A symbol in one chart gives an operator only on that chart, while a global operator must accept functions on all of $M$ and produce a well-defined distribution on all of $M$. Partitions of unity localise the input and output, and proper support prevents unwanted behaviour at infinity or across infinitely many coordinate patches.
[definition: Locally Finite Partition of Unity]
Let $M$ be a smooth manifold and let $\{U_i\}_{i\in I}$ be an open cover. A locally finite partition of unity subordinate to the cover is a family of functions $\chi_i:M\to[0,1]$ with $\chi_i\in C^\infty(M;\mathbb R)$ such that $\operatorname{supp}\chi_i\subset U_i$, the family of supports is locally finite, and
\begin{align*}
\sum_{i\in I}\chi_i(x)=1
\end{align*}
for every $x\in M$.
[/definition]
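As a hedged illustration of how such a family is usually produced, suppose (this is an assumption, not part of the definition) that we already have smooth functions $\varphi_i\ge 0$ with $\operatorname{supp}\varphi_i\subset U_i$, locally finite supports, and $\sum_{j\in I}\varphi_j(x)>0$ at every point. Normalising gives a partition of unity:
\begin{align*}
\chi_i(x)=\frac{\varphi_i(x)}{\sum_{j\in I}\varphi_j(x)},\qquad \sum_{i\in I}\chi_i(x)=\frac{\sum_{i\in I}\varphi_i(x)}{\sum_{j\in I}\varphi_j(x)}=1.
\end{align*}
Local finiteness makes the denominator a finite sum of smooth functions near each point, so each $\chi_i$ is smooth, and $\operatorname{supp}\chi_i=\operatorname{supp}\varphi_i\subset U_i$. Any function then splits into chartwise pieces, $u=\sum_{i\in I}\chi_i u$, with only finitely many nonzero terms near each point.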
The partition separates global questions into chartwise questions, but localisation alone does not control how far the kernel moves support. To compose operators and to extend their action to distributions, we need a support condition on the Schwartz kernel. This leads to the notion of proper support.
[definition: Properly Supported Operator]
Let $A:C_c^\infty(M)\to \mathcal D'(M)$ be continuous with Schwartz kernel $K_A\in \mathcal D'(M\times M)$. The operator $A$ is properly supported if both coordinate projections
\begin{align*}
\operatorname{supp}K_A \to M
\end{align*}
are proper maps.
[/definition]
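A minimal sketch of the definition on $M=\mathbb R$, using convolution operators as an illustrative assumption: let $Au=k*u$ for a distribution $k$ on $\mathbb R$, so that $K_A(x,y)=k(x-y)$. Then
\begin{align*}
\operatorname{supp}K_A=\{(x,y)\in\mathbb R^2: x-y\in \operatorname{supp}k\},
\end{align*}
and the part of this set lying over a compact interval $|x|\le a$ is
\begin{align*}
\{(x,y): |x|\le a,\ x-y\in\operatorname{supp}k\}\cong [-a,a]\times \operatorname{supp}k
\end{align*}
via the map $(x,y)\mapsto(x,x-y)$. This set is compact exactly when $\operatorname{supp}k$ is compact, and the same holds for the projection to $y$. So the identity ($k=\delta_0$) and $d/dx$ ($k=\delta_0'$) are properly supported, while convolution with the Gaussian $k(z)=e^{-z^2}$ is not.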
Proper support means that compactly supported inputs produce compactly supported outputs, and that the transpose has the same property. With this support control in place, the local Euclidean quantisations can be patched into a global operator. The next theorem states the construction that turns compatible local symbols into the manifold calculus.
[quotetheorem:7706]
[citeproof:7706]
Each hypothesis prevents a specific failure. Without local finiteness, the displayed sum need not define a distributional kernel near a point; without cutoffs near the diagonal, a local coordinate formula can try to compare points lying in no common coordinate patch; without proper support, compactly supported inputs may produce outputs whose support escapes every compact set. The theorem also does not produce a canonical operator from a principal symbol alone: lower-order choices can change mapping properties, adjoints, and spectra. Its role is to build a usable global calculus, so the next issues are how kernels are integrated invariantly and how adjoints behave once a density has been chosen.
[example: Operator on the Circle]
Let $M=S^1$ and let $\theta$ be an angular coordinate on a coordinate arc. For
\begin{align*}
P=-\frac{d^2}{d\theta^2}+V(\theta),
\end{align*}
with $V\in C^\infty(S^1)$, evaluate the two pieces on the local Fourier mode $e^{i\theta\xi}$. First,
\begin{align*}
\frac{d}{d\theta}e^{i\theta\xi}=i\xi e^{i\theta\xi}.
\end{align*}
Differentiating once more gives
\begin{align*}
\frac{d^2}{d\theta^2}e^{i\theta\xi}=i\xi\cdot i\xi e^{i\theta\xi}=-\xi^2e^{i\theta\xi}.
\end{align*}
Therefore
\begin{align*}
-\frac{d^2}{d\theta^2}e^{i\theta\xi}=\xi^2e^{i\theta\xi}.
\end{align*}
The zeroth-order term acts by multiplication and contributes no power of $\xi$:
\begin{align*}
V(\theta)e^{i\theta\xi}=V(\theta)e^{i\theta\xi}.
\end{align*}
Thus
\begin{align*}
P(e^{i\theta\xi})=(\xi^2+V(\theta))e^{i\theta\xi},
\end{align*}
so the complete differential-operator symbol in this coordinate is $\xi^2+V(\theta)$.
The term $\xi^2$ is homogeneous of degree $2$ in the cotangent variable, while $V(\theta)$ is homogeneous of degree $0$. Hence the principal symbol is
\begin{align*}
\sigma_2(P)(\theta,\xi)=\xi^2.
\end{align*}
Since $T_\theta^*S^1$ is one-dimensional, a nonzero covector has coordinate $\xi\ne 0$, and then
\begin{align*}
\xi^2>0.
\end{align*}
Therefore $P$ is elliptic of order $2$.
If $W\in C^\infty(S^1)$ is another potential and
\begin{align*}
Q=-\frac{d^2}{d\theta^2}+W(\theta),
\end{align*}
then the same calculation gives
\begin{align*}
\sigma_2(Q)(\theta,\xi)=\xi^2=\sigma_2(P)(\theta,\xi).
\end{align*}
But
\begin{align*}
(P-Q)u=(V(\theta)-W(\theta))u,
\end{align*}
so $P$ and $Q$ differ by a zeroth-order multiplication operator unless $V=W$. Thus a partition-of-unity construction and a Fourier-series construction can represent the same fixed differential operator $P$, but agreement of principal symbols alone records only the order-$2$ part and does not determine the lower-order potential.
[/example]
On compact manifolds, proper support is automatic: the support of any kernel is a closed subset of the compact space $M\times M$, so both projections are proper. It remains useful to keep the condition in the definition, because it makes the noncompact case behave like the compact case for local arguments.
## Densities, Adjoints, and Coordinate-Free Kernels
The next issue is integration. A coordinate formula for an operator involves $dy$ or $d\mathcal L^n(y)$, but a manifold has no canonical [Lebesgue measure](/page/Lebesgue%20Measure). Densities provide the invariant object needed to integrate kernels, define adjoints, and compare local quantisations.
[definition: Density]
Let $M$ be a smooth $n$-manifold. A density on $M$ is a smooth section of the density bundle $|\Lambda|T^*M$, whose transition functions in coordinates are multiplied by the absolute value of the Jacobian determinant.
[/definition]
A density can be integrated without choosing an orientation. Once a positive smooth density $\mu$ is fixed, the scalar representative of a kernel can be written in a coordinate-free integral formula. This motivates recording how kernels are represented relative to $\mu$.
[definition: Schwartz Kernel with Respect to a Density]
Let $\mu$ be a positive smooth density on $M$. A continuous operator $A:C_c^\infty(M)\to \mathcal D'(M)$ has scalar kernel representative $K_A^\mu\in \mathcal D'(M\times M)$ with respect to $\mu$ if
\begin{align*}
(Au)(x)=\int_M K_A^\mu(x,y)u(y)\,d\mu(y)
\end{align*}
whenever $K_A^\mu$ is represented by a smooth function and the integral expression is meaningful.
[/definition]
For singular kernels this formula is interpreted distributionally, with $K_A^\mu$ acting on test functions on $M\times M$. The density is part of the bookkeeping: changing $\mu$ changes the scalar representative of the kernel but not the underlying operator.
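The dependence on the density can be made explicit in a short computation. As a sketch, let $f\in C^\infty(M;\mathbb R)$ be a hypothetical weight and compare $\mu$ with $e^f\mu$, assuming the kernel is smooth so that the integral formula applies literally:
\begin{align*}
(Au)(x)=\int_M K_A^\mu(x,y)u(y)\,d\mu(y)=\int_M \left(e^{-f(y)}K_A^\mu(x,y)\right)u(y)\,e^{f(y)}d\mu(y).
\end{align*}
Comparing with the definition for the density $e^f\mu$ gives
\begin{align*}
K_A^{e^f\mu}(x,y)=e^{-f(y)}K_A^\mu(x,y).
\end{align*}
The operator $A$ is the same on both sides; only its scalar kernel representative has been rescaled in the $y$ variable.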
[remark: Half Densities]
Many invariant formulations let pseudodifferential operators act on half densities rather than on functions. In that convention the adjoint and principal symbol are independent of a chosen positive density. For this first course, it is enough to fix a density when writing integral kernels and to remember which local formulas depend on that choice.
[/remark]
The density issue becomes visible when taking adjoints. A formal adjoint depends on the pairing, so we need to know which part of the adjoint symbol is intrinsic after the pairing has been fixed. The next theorem gives the principal-symbol rule used in energy estimates and self-adjointness questions.
[quotetheorem:7707]
[citeproof:7707]
The density and proper-support hypotheses are not cosmetic. Changing the density from $\mu$ to $e^f\mu$ conjugates the scalar kernel description by a smooth weight and changes lower-order adjoint terms; dropping proper support can make the formal transpose fail to act continuously on compactly supported test functions. The theorem also does not say that $A$ is self-adjoint when its principal symbol is real: skew lower-order terms may remain. What it gives is the leading symbolic input needed for energy estimates, and it prepares the elliptic Fredholm theory where lower-order terms are controlled but not ignored.
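A one-dimensional computation illustrates how lower-order adjoint terms depend on the density. As a sketch under the assumption that the pairing is $\langle u,v\rangle_\mu=\int u\,\overline v\,d\mu$, take $M=\mathbb R$, $A=d/dx$, and the density $e^{f}dx$ with a hypothetical real-valued $f\in C^\infty(\mathbb R)$. For $u,v\in C_c^\infty(\mathbb R)$, integration by parts gives
\begin{align*}
\int_{\mathbb R}u'\,\overline v\,e^{f}\,dx=-\int_{\mathbb R}u\,\overline{\left(v'+f'v\right)}\,e^{f}\,dx,
\end{align*}
so the formal adjoint with respect to $e^f dx$ is
\begin{align*}
A^*=-\frac{d}{dx}-f'(x).
\end{align*}
Testing on $e^{ix\xi}$, the principal symbols are $\sigma_1(A)=i\xi$ and $\sigma_1(A^*)=-i\xi=\overline{\sigma_1(A)}$ for every choice of $f$, while the zeroth-order term $-f'$ changes with the density.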
[example: Scalar Operators on Vector Bundles]
Let $E\to M$ be a complex vector bundle of rank $r$, and choose a local frame $e_1,\dots,e_r$ over $U\subset M$. A section has the coordinate form $u=\sum_{\beta=1}^r u^\beta e_\beta$, and a local representative of $A$ is a matrix $(A^\alpha_{\ \beta})$ of scalar pseudodifferential operators defined by
\begin{align*}
Au=\sum_{\alpha=1}^r\left(\sum_{\beta=1}^r A^\alpha_{\ \beta}u^\beta\right)e_\alpha.
\end{align*}
If each entry has order at most $m$, then the principal symbol in this frame is the matrix
\begin{align*}
S_e(x,\xi)=\left(\sigma_m(A^\alpha_{\ \beta})(x,\xi)\right)_{\alpha,\beta=1}^r.
\end{align*}
Thus, for $v=\sum_{\beta=1}^r v^\beta e_\beta\in E_x$, the induced principal-symbol map is
\begin{align*}
\sigma_m(A)(x,\xi)v=\sum_{\alpha=1}^r\left(\sum_{\beta=1}^r \sigma_m(A^\alpha_{\ \beta})(x,\xi)v^\beta\right)e_\alpha.
\end{align*}
Now change to another local frame $\widetilde e_1,\dots,\widetilde e_r$ on the same open set, and write
\begin{align*}
\widetilde e_\gamma=\sum_{\alpha=1}^r e_\alpha G^\alpha_{\ \gamma}
\end{align*}
for a smooth invertible matrix $G(x)$. If $u=\sum_\gamma \widetilde u^\gamma\widetilde e_\gamma$, then its old components are
\begin{align*}
u^\alpha=\sum_{\gamma=1}^r G^\alpha_{\ \gamma}\widetilde u^\gamma.
\end{align*}
Applying $A$ in the old frame gives
\begin{align*}
(Au)^\alpha_e=\sum_{\beta=1}^r A^\alpha_{\ \beta}\left(\sum_{\gamma=1}^r G^\beta_{\ \gamma}\widetilde u^\gamma\right).
\end{align*}
To convert the output back to the new frame, use $(Au)_{\widetilde e}=G^{-1}(Au)_e$, so
\begin{align*}
(Au)^\delta_{\widetilde e}=\sum_{\alpha=1}^r (G^{-1})^\delta_{\ \alpha}\sum_{\beta=1}^r A^\alpha_{\ \beta}\left(\sum_{\gamma=1}^r G^\beta_{\ \gamma}\widetilde u^\gamma\right).
\end{align*}
Multiplication by the smooth matrices $G$ and $G^{-1}$ has order $0$, and derivatives falling on these smooth coefficients contribute only lower-order terms in the scalar symbol product formula. Hence the order-$m$ matrix symbol in the new frame is
\begin{align*}
S_{\widetilde e}(x,\xi)=G(x)^{-1}S_e(x,\xi)G(x).
\end{align*}
Conjugate matrices have the same invertibility status, because
\begin{align*}
S_{\widetilde e}(x,\xi)^{-1}=G(x)^{-1}S_e(x,\xi)^{-1}G(x)
\end{align*}
whenever $S_e(x,\xi)$ is invertible, and the same identity with the two frames interchanged proves the converse. Thus the condition that $\sigma_m(A)(x,\xi):E_x\to E_x$ be invertible is independent of the chosen local frame.
[/example]
Vector bundles add linear algebra to the same localisation principle. The coordinate chart controls the cotangent variable, while the local frame controls the fibre variable.
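A concrete instance may help fix the matrix-symbol bookkeeping. As an illustrative assumption, take the trivial bundle $E=\mathbb R^2\times\mathbb C^2$ with its standard frame and the first-order system
\begin{align*}
A=\begin{pmatrix}0&\partial_{x_1}-i\partial_{x_2}\\ \partial_{x_1}+i\partial_{x_2}&0\end{pmatrix}.
\end{align*}
Testing each entry on $e^{ix\cdot\xi}$ as in the scalar case gives the matrix symbol
\begin{align*}
S_e(x,\xi)=\begin{pmatrix}0&i\xi_1+\xi_2\\ i\xi_1-\xi_2&0\end{pmatrix},\qquad \det S_e(x,\xi)=-(i\xi_1+\xi_2)(i\xi_1-\xi_2)=\xi_1^2+\xi_2^2.
\end{align*}
The determinant is nonzero for $\xi\ne 0$, so the symbol is invertible on each fibre away from the zero section even though the diagonal entries vanish. Since $\det(G^{-1}S_eG)=\det S_e$, the same conclusion holds in any other frame.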
## Ellipticity and Parametrices on Compact Manifolds
The final question is whether ellipticity still produces inverses modulo smoothing errors once no global Fourier transform is available. The answer is yes: construct the inverse symbol locally, patch it by the manifold calculus, and use compactness to remove support issues from the final Fredholm consequences.
[definition: Elliptic Classical Operator on a Manifold]
Let $M$ be a smooth manifold and let $A:C_c^\infty(M)\to \mathcal D'(M)$ be a scalar operator in $\Psi^m_{\mathrm{cl}}(M)$. The operator $A$ is elliptic if its principal symbol satisfies
\begin{align*}
\sigma_m(A)(x,\xi)\ne 0
\end{align*}
for every $(x,\xi)\in T^*M\setminus 0$.
[/definition]
For operators on vector bundles, the scalar nonvanishing condition is replaced by invertibility of the principal symbol as a linear map on each fibre. Ellipticity says that the leading symbol can be inverted away from the zero section, and the calculus turns that symbolic inverse into an operator. The next theorem is the resulting global parametrix statement.
[quotetheorem:7708]
[citeproof:7708]
Compactness, ellipticity, and classicality each have a role. If $M$ is noncompact, a local parametrix may still exist, but the compact-manifold proof no longer supplies a finite patching argument or global Sobolev control. On $\mathbb R^n$, the operator $1-\Delta$ has a well-behaved Fourier multiplier inverse because the coefficients and geometry are uniform; an arbitrary noncompact manifold can have coordinate patches, injectivity radii, or coefficients degenerating at infinity, so the same finite global argument is unavailable. If ellipticity fails at a nonzero covector, the leading symbol cannot be inverted in that direction; for example, a vector field differentiating in only one direction cannot recover derivatives transverse to its flow. Classicality is what makes the inverse symbol polyhomogeneous: starting from $b_{-m}=(a_m)^{-1}$, the recursive cancellation produces homogeneous terms of degrees $-m,-m-1,\dots$. For a nonclassical elliptic symbol in a broader Hörmander class, a parametrix may belong to that broader class, but this theorem does not identify it as a classical operator. The theorem also does not claim that $A$ is genuinely invertible, since smoothing remainders may leave a finite-dimensional kernel or cokernel.
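The recursive cancellation can be followed by hand for the circle operator $P=-d^2/d\theta^2+V(\theta)$ from the earlier example. The following is a sketch under assumptions not fixed in this passage: the left quantisation with $D_\theta=-i\partial_\theta$, the composition expansion $b\#a\sim\sum_{k\ge 0}\frac{1}{k!}\partial_\xi^k b\,D_\theta^k a$, and $|\xi|$ large so that division by $\xi$ is harmless. Write $a=a_2+a_0$ with $a_2=\xi^2$ and $a_0=V(\theta)$, and seek $b\sim b_{-2}+b_{-3}+b_{-4}+\cdots$ with $b\#a\sim 1$. Collecting terms by homogeneity,
\begin{align*}
\text{degree }0:&\quad b_{-2}a_2=1 &&\Longrightarrow\quad b_{-2}=\xi^{-2},\\
\text{degree }-1:&\quad b_{-3}a_2+\partial_\xi b_{-2}\,D_\theta a_2=0 &&\Longrightarrow\quad b_{-3}=0,\\
\text{degree }-2:&\quad b_{-4}a_2+b_{-2}a_0=0 &&\Longrightarrow\quad b_{-4}=-V(\theta)\xi^{-4}.
\end{align*}
The derivative terms drop out at these orders because $a_2=\xi^2$ does not depend on $\theta$ and $b_{-3}=0$; derivatives of $V$ first enter at degree $-3$ through $\partial_\xi b_{-2}\,D_\theta a_0$. Each step divides by $a_2$, which is exactly where ellipticity is used.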
The parametrix is the manifold version of dividing by an elliptic symbol. Since smoothing remainders improve regularity by any number of derivatives, the parametrix should convert regularity of $Au$ into regularity of $u$. The next theorem records that analytic consequence in Sobolev spaces and is the entry point to spectral theory, index theory, and geometric analysis.
[quotetheorem:7709]
[citeproof:7709]
The assumptions mark the boundary of the conclusion. Compactness lets the Sobolev spaces and smoothing remainders be handled globally without adding weights or support conditions; on a noncompact space, even smoothing kernels need support or decay hypotheses to give global Sobolev estimates. Ellipticity is essential because nonelliptic operators can control only some directions: on the torus $\mathbb T^2$, $A=\partial_{x_1}$ satisfies $Au=0$ for every distribution $u(x_1,x_2)=v(x_2)$, so smoothness of $Au$ gives no regularity in the $x_2$ direction. The distributional hypothesis on $u$ is the natural starting point because the parametrix acts on distributions. The theorem does not say that $A$ has a unique solution or that $u$ is smoother than $s+m$ unless $Au$ is smoother as well.
The phrase "gain $m$ derivatives" should be read relative to the order of the operator. For elliptic differential operators of positive order, such as $-\Delta_g$ with $m=2$, this is the usual regularity gain. For order-zero operators it gives no Sobolev-order gain, and for elliptic pseudodifferential operators of negative order the conclusion restates the mapping relation between $A$ and its parametrix rather than asserting extra differentiability.
This is the main analytic payoff of the chapter. Once the pseudodifferential calculus is localised to manifolds, elliptic regularity becomes a symbolic statement rather than a coordinate-by-coordinate PDE estimate. The same mechanism underlies discreteness of spectra for elliptic operators on compact manifolds and the Fredholm framework used later in index theory.
There is also a geometric reason for insisting on the cotangent-bundle formulation. In Hodge theory, the symbol of the Hodge Laplacian controls [harmonic representatives](/theorems/2747) of de Rham cohomology; in geometric quantisation and propagation problems, the principal symbol is the function whose Hamiltonian flow describes high-frequency motion. The coordinate localisation developed here is therefore not only a technical device for elliptic PDE, but also the bridge from local Fourier analysis to global geometric invariants.
[example: Elliptic Regularity for the Laplace Beltrami Operator]
Let $(M,g)$ be compact and let $u\in \mathcal D'(M)$ satisfy $-\Delta_g u=f$ with $f\in H^s(M)$. In coordinates, the order-$2$ part of $-\Delta_g$ has symbol
\begin{align*}
\sigma_2(-\Delta_g)(x,\xi)=\sum_{i,j=1}^n g^{ij}(x)\xi_i\xi_j=|\xi|_{g^{-1}}^2.
\end{align*}
For $\xi\ne 0$, the inverse metric $(g^{ij}(x))$ is positive definite, so
\begin{align*}
|\xi|_{g^{-1}}^2>0.
\end{align*}
Thus $-\Delta_g$ is elliptic of order $2$. Applying *Elliptic Regularity on a Compact Manifold* to $A=-\Delta_g$ gives
\begin{align*}
Au=f\in H^s(M) \quad \Longrightarrow \quad u\in H^{s+2}(M).
\end{align*}
The constants explain the difference between regularity and solvability. If $c$ is constant, then $\partial_{x_j}c=0$ in every coordinate chart, so
\begin{align*}
-\Delta_g c=0.
\end{align*}
Hence constants may obstruct uniqueness for $-\Delta_g u=f$, and solvability may impose a compatibility condition on $f$. But once a distributional solution $u$ is already given, the parametrix argument still writes
\begin{align*}
u=Bf-Ru
\end{align*}
with $B\in \Psi^{-2}_{\mathrm{cl}}(M)$ and $R\in \Psi^{-\infty}(M)$, so $Bf\in H^{s+2}(M)$ and $Ru\in C^\infty(M)\subset H^{s+2}(M)$.
For $I-\Delta_g$, the order-$2$ part is unchanged because the identity operator has order $0$, so
\begin{align*}
\sigma_2(I-\Delta_g)(x,\xi)=|\xi|_{g^{-1}}^2.
\end{align*}
Also, if $c$ is constant, then
\begin{align*}
(I-\Delta_g)c=c-0=c,
\end{align*}
so constants no longer lie in the kernel unless $c=0$. Thus $I-\Delta_g$ has the same elliptic two-derivative regularity gain, while removing the constant-kernel obstruction present for $-\Delta_g$.
[/example]
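On the circle the regularity gain and the compatibility condition can both be read off from Fourier coefficients. As a sketch, assume $M=S^1$ with the flat metric and the Sobolev norm $\|u\|_{H^s}^2=\sum_{k\in\mathbb Z}(1+k^2)^s|\widehat u(k)|^2$ (a normalisation chosen here for illustration). The equation $-u''=f$ reads
\begin{align*}
k^2\,\widehat u(k)=\widehat f(k),\qquad k\in\mathbb Z,
\end{align*}
so $\widehat f(0)=0$ is necessary for solvability and $\widehat u(0)$ is left undetermined. For $k\ne 0$ we have $1+k^2\le 2k^2$, hence
\begin{align*}
\sum_{k\ne 0}(1+k^2)^{s+2}|\widehat u(k)|^2=\sum_{k\ne 0}\frac{(1+k^2)^{2}}{k^4}(1+k^2)^{s}|\widehat f(k)|^2\le 4\,\|f\|_{H^s}^2.
\end{align*}
Thus $u\in H^{s+2}(S^1)$ whenever $f\in H^s(S^1)$, with the constant mode playing the role of the kernel of $-\Delta_g$.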
The chapter closes the passage from Euclidean formulas to invariant analysis on manifolds. Local coordinates remain essential for constructing operators and proving estimates, but the principal symbol, ellipticity, parametrices, and regularity statements all live naturally on the cotangent bundle.
Chapters 8–11 have built the complete elliptic calculus: Sobolev mapping, regularity, Fredholm theory, and manifold invariance. The key principle underlying all of it is that ellipticity is a microlocal property: it is a condition on the principal symbol that can be tested in conic regions of the cotangent bundle. The final chapter consolidates the elliptic theory and prepares the extension of this microlocal perspective to singularity analysis and wave front sets.
# 12. Consolidation: The Elliptic Calculus in Practice
This final chapter consolidates the elliptic calculus developed in Chapters 8 through 11 and prepares the transition to microlocal singularity analysis. We do not introduce wave front sets here, but we already use the guiding idea: an elliptic operator cannot create or hide local singularities except through smoothing errors. The main lesson is that the calculus is not a collection of formal identities; it is a practical machine for regularity estimates.
## Building Parametrices for Estimates
The basic problem is this: given an elliptic operator $P \in \Psi^m(U)$ and a distribution $u$, how can the symbolic inverse of $P$ be turned into a local estimate for $u$ from information about $Pu$? The answer requires two localisations. We first make the operator properly supported, so that it acts on distributions without support pathologies, and then insert cutoffs so that the parametrix is only used where ellipticity is available.
[definition: Local Ellipticity]
Let $P \in \Psi^m(U)$ be a pseudodifferential operator acting continuously as
\begin{align*}
P:C_c^\infty(U)\to C^\infty(U)
\end{align*}
and extended by proper localisation to distributions as a local map $P:\mathcal D'(U)\to \mathcal D'(U)$ whenever this action is invoked. Let $p_m(x,\xi)$ be its principal symbol. The operator $P$ is elliptic on an open set $V \subset U$ if for every compact $K \subset V$ there exist constants $C_K>0$ and $R_K>0$ such that
\begin{align*}
|p_m(x,\xi)| \ge C_K |\xi|^m
\end{align*}
for all $x \in K$ and $|\xi| \ge R_K$.
[/definition]
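To see how the constants $C_K$ can depend on $K$, consider the following illustrative operator, which is an assumption introduced here and not taken from the text: on $U=\mathbb R^2$ with coordinates $(x,y)$ and dual variables $(\xi,\eta)$, let
\begin{align*}
P=-\partial_x^2-x^2\partial_y^2,\qquad p_2(x,y,\xi,\eta)=\xi^2+x^2\eta^2.
\end{align*}
If $K\subset V=\{x\ne 0\}$ is compact, then $\delta_K=\min_K|x|>0$ and
\begin{align*}
|p_2(x,y,\xi,\eta)|\ge \min(1,\delta_K^2)\,(\xi^2+\eta^2)\qquad\text{for }(x,y)\in K,
\end{align*}
so $P$ is elliptic on $V$ with $C_K=\min(1,\delta_K^2)$ and any $R_K>0$. The constant degenerates as $K$ approaches the line $x=0$, where $p_2(0,y,0,\eta)=0$ for all $\eta$, so $P$ is not elliptic on any open set meeting that line.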
This formulation is deliberately local in $x$, which is the form that survives after multiplying the equation by cutoffs. The next step is to turn this lower bound for the principal symbol into an actual inverse modulo errors that do not affect local Sobolev estimates.
[quotetheorem:7710]
[citeproof:7710]
Proper support is part of the theorem because a kernel with uncontrolled projection to either factor need not define a continuous operator on arbitrary distributions after localisation. Ellipticity on $V$ is also essential: if the principal symbol vanishes in a covector direction over $K$, no symbolic inverse of order $-m$ can be constructed there, and the first-order example below shows that transverse singularities may remain invisible to $P$. The conclusion is local, not global; it gives smoothing identities only between cutoffs supported inside the elliptic region, so it says nothing about points where $P$ is characteristic or about boundary effects outside the chosen neighbourhood. The theorem is most useful after multiplying an equation by a test cutoff. The cutoff creates commutators, but the commutators have lower order, and the parametrix turns those lower-order terms into controlled contributions.
[example: Interior Parametrix Estimate]
Let $P \in \Psi^m(U)$ be elliptic on $V$, let $u \in \mathcal D'(U)$, and choose $\chi,\psi \in C_c^\infty(V)$ with $\psi=1$ on a neighbourhood of $\operatorname{supp}\chi$. By the local parametrix construction, after shrinking to the region controlled by these cutoffs there is $Q\in \Psi^{-m}(U)$ and a smoothing remainder $R\in \Psi^{-\infty}(U)$ such that, on distributions supported where $\psi$ is active,
\begin{align*}
I=QP+R
\end{align*}
after absorbing the sign of the smoothing error into $R$. Since $\psi=1$ near $\operatorname{supp}\chi$, multiplication by $\chi$ gives
\begin{align*}
\chi u=\chi\psi u.
\end{align*}
Applying the parametrix identity to $\psi u$ gives
\begin{align*}
\chi\psi u=\chi QP(\psi u)+\chi R(\psi u).
\end{align*}
Using the commutator convention $[P,\psi]=P\psi-\psi P$, we have
\begin{align*}
P(\psi u)=\psi Pu+[P,\psi]u.
\end{align*}
Substituting this into the previous display yields
\begin{align*}
\chi u=\chi Q\psi Pu+\chi Q[P,\psi]u+\chi R\psi u.
\end{align*}
The first term has the regularity of $Pu$ shifted up by $m$ derivatives because $Q\in\Psi^{-m}(U)$. The commutator satisfies $[P,\psi]\in\Psi^{m-1}(U)$, so $Q[P,\psi]\in\Psi^{-1}(U)$ by the symbolic order rule, and hence this term gains one derivative relative to $u$ by *Negative Order Sobolev Gain*. The final term is smooth by the *Smoothing Improvement Theorem*. Thus the identity separates the elliptic estimate into a controlled main term, a lower-order commutator term, and a smoothing error.
[/example]
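For a differential operator the commutator can be written out explicitly. As a sketch, take the model case $P=-\Delta$ on an open subset of $\mathbb R^n$ (an assumption made only for illustration). The product rule gives
\begin{align*}
-\Delta(\psi u)=-\psi\,\Delta u-2\nabla\psi\cdot\nabla u-(\Delta\psi)u,
\end{align*}
and therefore
\begin{align*}
[P,\psi]u=P(\psi u)-\psi Pu=-2\nabla\psi\cdot\nabla u-(\Delta\psi)u.
\end{align*}
This is a first-order operator, in agreement with $[P,\psi]\in\Psi^{m-1}(U)$ for $m=2$. Its coefficients involve only derivatives of $\psi$, so they vanish wherever $\psi$ is locally constant, in particular on the neighbourhood of $\operatorname{supp}\chi$ where $\psi=1$.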
## Differential, Smoothing, and Pseudodifferential Remainders
When using parametrices, the word remainder hides several different behaviours. A differential remainder has finite order and local support behaviour, a smoothing remainder improves regularity without limit, and a negative-order pseudodifferential remainder improves regularity by a finite amount. Estimates depend on distinguishing these cases.
[definition: Smoothing Operator]
An operator $R: \mathcal D'(U) \to C^\infty(U)$ is smoothing if its Schwartz kernel belongs to $C^\infty(U \times U)$.
[/definition]
A smoothing operator is stronger than an operator of order $-N$ for every fixed $N$, because its kernel regularity gives unlimited Sobolev improvement on compact subsets. To justify discarding smoothing remainders in elliptic estimates, we need the corresponding mapping theorem on every Sobolev scale.
[quotetheorem:7711]
[citeproof:7711]
This theorem explains why smoothing remainders disappear from finite-order regularity arguments: they can always be placed in a better Sobolev space than the estimate demands. Proper support prevents a smooth kernel from importing uncontrolled behaviour from spatial infinity; on a noncompact domain, an operator with smooth kernel but nonproper support can integrate against distant parts of a distribution in a way that is not a local Sobolev estimate. The smoothing hypothesis is also stronger than negative order: $\langle D\rangle^{-1}$ improves one derivative but does not send every finite-order distribution into $C^\infty$. The cutoff condition records exactly which input points can influence the output on $\operatorname{supp}\chi$, and without it the factor $\psi$ may remove the part of the input that $R$ actually uses. A different issue remains for remainders that have negative order but are not smoothing; these produce only a specified finite gain, and the next result quantifies that gain.
[quotetheorem:7712]
[citeproof:7712]
The number $r>0$ is the source of the gain; if $r=0$, the same conjugation argument yields only $H^s\to H^s$ boundedness. Proper support and the displayed cutoff condition keep the estimate local in the same sense as the smoothing theorem, since the output on $\operatorname{supp}\chi$ is allowed to use only input from the region where $\psi=1$. The theorem does not say that $A$ is harmless at every Sobolev level: it gives a finite shift by $r$, so an order $-1$ remainder cannot be discarded in an estimate that needs two additional derivatives. The distinction between these remainders is visible even in simple equations. A smoothing error may be ignored for every finite Sobolev target, while a $\Psi^{-1}$ error buys exactly one derivative and must often be iterated or absorbed.
[example: A Finite Gain Is Not Smoothing]
On $\mathbb R^n$, let $A=\langle D\rangle^{-1}$, so
\begin{align*}
\widehat{Au}(\xi)=(1+|\xi|^2)^{-1/2}\widehat u(\xi).
\end{align*}
For every $s\in\mathbb R$,
\begin{align*}
\|Au\|_{H^{s+1}}^2
=\int_{\mathbb R^n}(1+|\xi|^2)^{s+1}\left|(1+|\xi|^2)^{-1/2}\widehat u(\xi)\right|^2\,d\xi
\end{align*}
and the powers combine as
\begin{align*}
(1+|\xi|^2)^{s+1}(1+|\xi|^2)^{-1}=(1+|\xi|^2)^s.
\end{align*}
Hence
\begin{align*}
\|Au\|_{H^{s+1}}^2=\int_{\mathbb R^n}(1+|\xi|^2)^s|\widehat u(\xi)|^2\,d\xi=\|u\|_{H^s}^2.
\end{align*}
This is exactly a one-derivative gain, not unlimited smoothing. For example, take $u=\delta_0$. Then $\widehat u(\xi)=1$, so
\begin{align*}
\widehat{A\delta_0}(\xi)=(1+|\xi|^2)^{-1/2}.
\end{align*}
Its $H^t$ norm would require
\begin{align*}
\int_{\mathbb R^n}(1+|\xi|^2)^t(1+|\xi|^2)^{-1}\,d\xi
=\int_{\mathbb R^n}(1+|\xi|^2)^{t-1}\,d\xi<\infty.
\end{align*}
For large $|\xi|$, the integrand behaves like $|\xi|^{2t-2}$, and the radial integral behaves like
\begin{align*}
\int_1^\infty r^{2t-2}r^{n-1}\,dr=\int_1^\infty r^{2t+n-3}\,dr.
\end{align*}
This integral diverges whenever $2t+n-3\ge -1$, equivalently $t\ge 1-\frac n2$. Thus $A\delta_0$ does not lie in every Sobolev space, so $A$ is not smoothing; it improves Sobolev order by one derivative and no more.
[/example]
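The same radial computation locates $\delta_0$ itself on the Sobolev scale, which makes the one-derivative gain visible as a shift of thresholds. As a check using only the Fourier description of $H^t(\mathbb R^n)$ from the example,
\begin{align*}
\|\delta_0\|_{H^t}^2=\int_{\mathbb R^n}(1+|\xi|^2)^{t}\,d\xi<\infty
\quad\Longleftrightarrow\quad 2t+n-1<-1
\quad\Longleftrightarrow\quad t<-\frac n2,
\end{align*}
while the computation in the example gives
\begin{align*}
A\delta_0\in H^{t}(\mathbb R^n)\quad\Longleftrightarrow\quad t<1-\frac n2.
\end{align*}
The admissible range moves up by exactly $1$, the order gained by $A=\langle D\rangle^{-1}$.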
## Elliptic Regularity with Compact Support
The central estimate asks whether local Sobolev regularity of $Pu$ forces local Sobolev regularity of $u$. Ellipticity says that the principal symbol of $P$ is invertible at high frequency, so the parametrix should recover $u$ from $Pu$ up to harmless remainders.
[quotetheorem:7713]
[citeproof:7713]
This is the Sobolev form of elliptic regularity. The proper-support assumption ensures that the parametrix and commutator terms can be applied to distributions after cutoffs without uncontrolled dependence on faraway values. Ellipticity is the hypothesis that supplies the inverse symbol; without it the conclusion can fail: $\partial_{x_1}$ on $\mathbb R^2$ annihilates every distribution depending only on $x_2$, so $\partial_{x_1}u=0$ is smooth while $u$ itself may be singular. The result is local in the interior of $V$: it does not impose boundary regularity, it does not recover regularity at characteristic covectors, and it requires the hypothesis $Pu\in H^s_{\mathrm{loc}}(V)$ only where the conclusion is sought. In a classical PDE course the same conclusion is often proved by energy estimates or Fourier methods; here it appears as a direct consequence of symbolic inversion.
[example: Local Regularity for a Poisson Equation]
Let $\Omega \subset \mathbb R^n$ be open and let $u \in \mathcal D'(\Omega)$ satisfy
\begin{align*}
-\Delta u=f
\end{align*}
in $\Omega$. Since
\begin{align*}
-\Delta=-\sum_{j=1}^n \partial_{x_j}^2,
\end{align*}
the principal symbol is
\begin{align*}
p_2(x,\xi)=\sum_{j=1}^n \xi_j^2=|\xi|^2.
\end{align*}
Thus for every compact $K\subset \Omega$ and every $\xi\in \mathbb R^n$,
\begin{align*}
|p_2(x,\xi)|=|\xi|^2
\end{align*}
for all $x\in K$, so the local ellipticity lower bound holds with $C_K=1$ and any $R_K>0$. Hence $-\Delta$ is elliptic of order $2$.
If $f\in H^s_{\mathrm{loc}}(\Omega)$, then $-\Delta u=f\in H^s_{\mathrm{loc}}(\Omega)$, and *Elliptic Regularity With Compact Support* applied to the elliptic operator $-\Delta\in \Psi^2(\Omega)$ gives
\begin{align*}
u\in H^{s+2}_{\mathrm{loc}}(\Omega).
\end{align*}
If $f\in C^\infty(\Omega)$, then $f\in H^s_{\mathrm{loc}}(\Omega)$ for every $s\in\mathbb R$, so the same argument gives $u\in H^{s+2}_{\mathrm{loc}}(\Omega)$ for every $s$. On each compact subset $K\Subset\Omega$, choosing $s$ large enough so that $s+2>k+n/2$ and applying *Sobolev Embedding* gives $u\in C^k(K)$; since $k$ and $K$ were arbitrary, $u\in C^\infty(\Omega)$.
[/example]
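A standard explicit solution gives a consistency check. As an illustration (the specific choice is an assumption made here), take $n=3$ and $u(x)=\frac{1}{4\pi|x|}$, which satisfies
\begin{align*}
-\Delta u=\delta_0\quad\text{in }\mathcal D'(\mathbb R^3).
\end{align*}
On $\Omega=\mathbb R^3\setminus\{0\}$ the right-hand side is $f=0\in C^\infty(\Omega)$, and indeed $u\in C^\infty(\Omega)$. On all of $\mathbb R^3$, the radial computation $\int_1^\infty r^{2s}r^{2}\,dr<\infty\iff s<-\tfrac32$ shows that $\delta_0\in H^s(\mathbb R^3)$ exactly for $s<-\tfrac32$, so the regularity theorem gives
\begin{align*}
u\in H^{s+2}_{\mathrm{loc}}(\mathbb R^3)\quad\text{for every } s<-\frac32,\quad\text{that is, } u\in H^{t}_{\mathrm{loc}}(\mathbb R^3)\ \text{for every } t<\frac12 .
\end{align*}
So $u$ is two Sobolev orders more regular than $\delta_0$ near the origin, and it is smooth away from the single point where $f$ fails to be smooth.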
## Symbolic Proof of Sobolev Regularity Gain
The previous theorem used several pieces of the calculus at once. It is useful to isolate the symbolic mechanism: order $m$ ellipticity converts an equation into an operator of order $-m$ applied to the right-hand side, and order $-m$ means gain of $m$ Sobolev derivatives.
[quotetheorem:7714]
[citeproof:7714]
The formula is short, but it contains the main operational meaning of ellipticity. Compact support is included to avoid a separate discussion of how the supports of $Q$, $P$, and $R$ project under their kernels; without compact support or proper localisation, the displayed identity need not be a local estimate on the region being tested. The smoothing remainder matters only because it is genuinely smoothing: replacing $R$ by an order $-1$ operator would give a finite gain rather than membership in every $H^t_{\mathrm{loc}}$. The statement also does not assert a global inverse for $P$, since lower-frequency behaviour, topology, boundary conditions, and kernels can obstruct exact solvability. The inverse is not an inverse in the algebra of exact operators; it is an inverse modulo errors that are better than any finite Sobolev scale can detect.
[example: Recovering Classical Smoothness from Sobolev Regularity]
Let $P \in \Psi^m(U)$ be properly supported and elliptic on $U$, and let $u \in \mathcal D'(U)$ satisfy $Pu \in C^\infty(U)$. Fix a compact set $K\Subset U$ and an integer $k\ge 0$. Choose a real number $r$ with
\begin{align*}
r>k+\frac n2.
\end{align*}
Set $s=r-m$. Since $Pu\in C^\infty(U)$, every cutoff of $Pu$ is smooth with compact support, hence belongs to $H^s$; equivalently,
\begin{align*}
Pu\in H^s_{\mathrm{loc}}(U).
\end{align*}
By *Elliptic Regularity With Compact Support*, applied with this value of $s$, we get
\begin{align*}
u\in H^{s+m}_{\mathrm{loc}}(U).
\end{align*}
Because $s+m=(r-m)+m=r$, this is
\begin{align*}
u\in H^r_{\mathrm{loc}}(U).
\end{align*}
On a compact neighbourhood of $K$, the inequality $r>k+n/2$ lets us apply *Sobolev Embedding*, so the local membership $u\in H^r_{\mathrm{loc}}(U)$ implies
\begin{align*}
u\in C^k(K).
\end{align*}
Since $k$ and $K\Subset U$ were arbitrary, $u$ has derivatives of every order on every compact subset of $U$, and therefore
\begin{align*}
u\in C^\infty(U).
\end{align*}
[/example]
## What Fails Without Ellipticity
The parametrix argument depends on high-frequency invertibility of the principal symbol. If the symbol vanishes in some directions, then the operator may control derivatives only in selected directions, and a Sobolev gain for all variables is no longer available from this calculus alone.
[example: Missing Direction for a First-Order Operator]
On $\mathbb R^2$, take $P=\partial_{x_1}$, whose principal symbol is $p_1(x,\xi)=i\xi_1$. Given $v\in \mathcal D'(\mathbb R)$, define the distribution $u(x_1,x_2)=v(x_2)$ by
\begin{align*}
\langle u,\varphi\rangle=\left\langle v,\int_{\mathbb R}\varphi(x_1,\cdot)\,dx_1\right\rangle
\end{align*}
for every $\varphi\in C_c^\infty(\mathbb R^2)$. Then
\begin{align*}
\langle Pu,\varphi\rangle=\langle \partial_{x_1}u,\varphi\rangle=-\langle u,\partial_{x_1}\varphi\rangle
\end{align*}
by the definition of [distributional derivative](/page/Distributional%20Derivative). Using the definition of $u$,
\begin{align*}
-\langle u,\partial_{x_1}\varphi\rangle=-\left\langle v,\int_{\mathbb R}\partial_{x_1}\varphi(x_1,\cdot)\,dx_1\right\rangle.
\end{align*}
For each fixed $x_2$, compact support of $\varphi$ in the $x_1$ variable gives
\begin{align*}
\int_{\mathbb R}\partial_{x_1}\varphi(x_1,x_2)\,dx_1=0,
\end{align*}
so $\langle Pu,\varphi\rangle=0$ for every test function $\varphi$. Hence $Pu=0$, which is smooth.
Now fix $s$ and choose $v\notin H^s_{\mathrm{loc}}(\mathbb R)$. If $u$ belonged to $H^s_{\mathrm{loc}}(\mathbb R^2)$, then localising $u$ by a product cutoff $\rho(x_1)\eta(x_2)$ with $\rho,\eta\in C_c^\infty(\mathbb R)$ and $\int_{\mathbb R}\rho(x_1)\,dx_1=1$ would imply that the $x_1$-average
\begin{align*}
\int_{\mathbb R}\rho(x_1)\eta(x_2)u(x_1,x_2)\,dx_1=\eta(x_2)v(x_2)
\end{align*}
belongs to $H^s(\mathbb R)$ for every $\eta\in C_c^\infty(\mathbb R)$, contradicting $v\notin H^s_{\mathrm{loc}}(\mathbb R)$. Thus $Pu$ may be smooth while $u$ remains singular in the $x_2$ direction. The obstruction is exactly the characteristic set $\{\xi_1=0,\ \xi\neq 0\}$, where the principal symbol $i\xi_1$ vanishes and therefore gives no control over covectors pointing in the $\xi_2$ direction.
[/example]
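A rough numerical picture of this example, assuming the illustrative choice of $v$ as the Heaviside step (which fails to lie in $H^s_{\mathrm{loc}}(\mathbb R)$ for $s\ge 1/2$). Difference quotients are only a heuristic stand-in for Sobolev norms here, and the function names are hypothetical.

```python
def u(x1: float, x2: float) -> float:
    """u(x1, x2) = v(x2), with v the Heaviside step (illustrative choice)."""
    return 1.0 if x2 >= 0 else 0.0


def diff_quotient_x1(x1: float, x2: float, h: float) -> float:
    """Centred difference quotient of u in the x1 direction."""
    return (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)


def diff_quotient_x2(x1: float, x2: float, h: float) -> float:
    """Centred difference quotient of u in the x2 direction."""
    return (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)


steps = [10.0 ** (-j) for j in range(1, 6)]
# Across the jump at x2 = 0: the x1 quotients vanish, the x2 quotients blow up.
dx1 = [diff_quotient_x1(0.3, 0.0, h) for h in steps]
dx2 = [diff_quotient_x2(0.3, 0.0, h) for h in steps]

for h, a, b in zip(steps, dx1, dx2):
    print(f"h={h:.0e}  d/dx1 ~ {a:.1f}  d/dx2 ~ {b:.1f}")
```

The $x_1$ quotients are identically zero, matching $Pu=0$, while the $x_2$ quotients grow like $1/(2h)$: the equation sees nothing of the jump carried by $v$.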
This example looks like a violation of elliptic regularity only if one ignores the symbol. The equation differentiates in one variable and says nothing about singular oscillation in the transverse covariable.
[example: Uniformly Elliptic Divergence Form Operator]
Let
\begin{align*}
Lu=-\sum_{i,j=1}^n \partial_{x_i}(a_{ij}(x)\partial_{x_j}u)
\end{align*}
where $a_{ij}\in C^\infty(\Omega)$, and assume that for some $\theta>0$,
\begin{align*}
\sum_{i,j=1}^n a_{ij}(x)\xi_i\xi_j \ge \theta |\xi|^2
\end{align*}
for all $x\in\Omega$ and all $\xi\in\mathbb R^n$, with $\theta$ independent of $x$. Expanding one derivative by the product rule gives
\begin{align*}
\partial_{x_i}(a_{ij}(x)\partial_{x_j}u)=(\partial_{x_i}a_{ij})(x)\partial_{x_j}u+a_{ij}(x)\partial_{x_i}\partial_{x_j}u.
\end{align*}
Hence
\begin{align*}
Lu=-\sum_{i,j=1}^n a_{ij}(x)\partial_{x_i}\partial_{x_j}u-\sum_{i,j=1}^n(\partial_{x_i}a_{ij})(x)\partial_{x_j}u.
\end{align*}
The second sum is a differential operator of order at most $1$, so the order-$2$ principal part is
\begin{align*}
-\sum_{i,j=1}^n a_{ij}(x)\partial_{x_i}\partial_{x_j}.
\end{align*}
With the convention that $\partial_{x_i}$ contributes $i\xi_i$ to the principal symbol, the term $-a_{ij}(x)\partial_{x_i}\partial_{x_j}$ contributes
\begin{align*}
-a_{ij}(x)(i\xi_i)(i\xi_j)=a_{ij}(x)\xi_i\xi_j.
\end{align*}
Therefore the principal symbol of $L$ is
\begin{align*}
l_2(x,\xi)=\sum_{i,j=1}^n a_{ij}(x)\xi_i\xi_j.
\end{align*}
For every compact $K\Subset\Omega$ and every $\xi\in\mathbb R^n$, the assumed coercive bound gives
\begin{align*}
|l_2(x,\xi)|=l_2(x,\xi)\ge \theta|\xi|^2
\end{align*}
for $x\in K$. Thus the local ellipticity lower bound holds with $C_K=\theta$ and any $R_K>0$, so $L$ is elliptic of order $2$. Applying *Elliptic Regularity With Compact Support* with $m=2$, the hypothesis $Lu\in H^s_{\mathrm{loc}}(\Omega)$ implies
\begin{align*}
u\in H^{s+2}_{\mathrm{loc}}(\Omega).
\end{align*}
The uniform positivity of the quadratic form in $\xi$ is exactly what turns the divergence-form equation into a two-derivative local Sobolev gain.
[/example]
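The coercivity hypothesis can be probed numerically for a concrete coefficient matrix. The sketch below assumes $n=2$ and a hypothetical smooth symmetric matrix with $a_{11}=2+\sin x_1$, $a_{12}=a_{21}=\tfrac12$, $a_{22}=2+\cos x_2$; none of these choices come from the course. Sampling a grid estimates $\theta$ but does not prove the bound.

```python
import math


def coeff(x1: float, x2: float):
    """Hypothetical coefficients (a11, a12, a22) of a symmetric 2x2 matrix."""
    return 2.0 + math.sin(x1), 0.5, 2.0 + math.cos(x2)


def principal_symbol(x1, x2, xi1, xi2):
    """l_2(x, xi) = sum_{i,j} a_ij(x) xi_i xi_j for the matrix above."""
    a11, a12, a22 = coeff(x1, x2)
    return a11 * xi1 * xi1 + 2.0 * a12 * xi1 * xi2 + a22 * xi2 * xi2


def smallest_eigenvalue(x1, x2):
    """Closed-form smallest eigenvalue of the symmetric 2x2 matrix a(x)."""
    a11, a12, a22 = coeff(x1, x2)
    return 0.5 * (a11 + a22) - math.hypot(0.5 * (a11 - a22), a12)


# One period in each variable; the coefficients are 2*pi-periodic.
grid = [-math.pi + 2.0 * math.pi * i / 40 for i in range(41)]
theta = min(smallest_eigenvalue(x1, x2) for x1 in grid for x2 in grid)

# Spot-check l_2(x, xi) >= theta * |xi|^2 on unit covectors.
angles = [2.0 * math.pi * k / 64 for k in range(64)]
coercive = all(
    principal_symbol(x1, x2, math.cos(t), math.sin(t)) >= theta - 1e-9
    for x1 in grid for x2 in grid for t in angles
)
print(theta, coercive)
```

For this sample matrix the smallest eigenvalue is least where both diagonal entries equal $1$, which gives $\theta = 1-\tfrac12 = \tfrac12$, so the operator built from these coefficients satisfies the hypothesis of the example.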
The contrast between the two examples is the practical diagnostic for the whole chapter. Before seeking estimates, inspect the principal symbol: ellipticity predicts full regularity recovery, while characteristic directions signal the need for genuinely microlocal tools.
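As a rough illustration of this diagnostic, the sketch below samples constant-coefficient principal symbols on the unit circle in $\mathbb R^2$, using the convention that $\partial_{x_j}$ contributes $i\xi_j$. The function name, the sample count, and the tolerance are assumptions of the sketch. Sampling can only flag a characteristic direction that lies on or near a sample point, so a positive answer is suggestive rather than a proof of ellipticity.

```python
import math


def min_symbol_modulus(symbol, samples: int = 360) -> float:
    """Smallest |symbol(xi)| over equally spaced unit covectors in R^2."""
    best = math.inf
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        best = min(best, abs(symbol(math.cos(t), math.sin(t))))
    return best


# Principal symbols with the convention d/dx_j -> i * xi_j.
symbols = {
    "d/dx1": lambda xi1, xi2: 1j * xi1,                 # vanishes at xi1 = 0
    "Laplacian": lambda xi1, xi2: -(xi1**2 + xi2**2),   # never vanishes
    "wave": lambda xi1, xi2: -(xi1**2) + xi2**2,        # vanishes on xi1 = +-xi2
}

TOL = 1e-9  # illustrative threshold for "numerically zero"
report = {name: min_symbol_modulus(p) > TOL for name, p in symbols.items()}

for name, looks_elliptic in report.items():
    print(f"{name}: looks elliptic on samples = {looks_elliptic}")
```

Only the Laplacian passes the check; the other two symbols vanish at sampled covectors, which marks exactly the directions where the regularity argument of this chapter gives no information.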
## Preparing for Microlocal Singularities
Local elliptic regularity answers a position-space question, but the non-elliptic example exposes a sharper obstruction: a distribution may be singular only in covector directions that the operator fails to control. Cutoffs can isolate where the singularity is tested, yet they cannot distinguish whether the bad oscillation points in $\xi_1$, $\xi_2$, or another conic direction. The next problem is therefore to refine regularity so that it records both the point $x$ and the high-frequency direction $\xi$.
[remark: From Local to Directional Regularity]
The support of $\chi u$ records where $u$ is being tested, while the high-frequency behaviour of the symbol records which covectors are controlled. Elliptic parametrices work where the symbol is invertible in all nonzero covector directions. If invertibility holds only in some conic region of frequency space, the same symbolic construction suggests a directional version of regularity.
[/remark]
This is the conceptual bridge to microlocal analysis. Pseudodifferential operators are built to localise simultaneously in $x$ and $\xi$, and elliptic regularity is the first instance of the principle that singularities are constrained by the characteristic geometry of the principal symbol.
## Beyond and Connected Topics
The next layer of the subject replaces local [Sobolev regularity](/page/Sobolev%20Space) by directional regularity. The natural object is the wavefront set: instead of asking only whether $u$ is smooth near $x$, one asks which nonzero covectors $\xi$ still carry high-frequency singularity. The parametrix construction in this chapter already contains the mechanism. Wherever a symbol is elliptic in a conic region, it can be inverted there modulo smoothing errors; wherever the principal symbol vanishes, singularities may persist and must be propagated rather than removed. The Fourier-side viewpoint developed here is the same one behind the [Fourier Transform](/page/Fourier%20Transform) and its $L^2$ form, [Fourier Transform on $L^2$](/page/Fourier%20Transform%20on%20L%C2%B2).
This calculus also feeds directly into elliptic Fredholm theory on compact manifolds. Negative-order remainders become compact after Sobolev localization, so an [elliptic operator](/page/Elliptic%20Operator) is invertible up to compact error and therefore has finite-dimensional kernel and cokernel. The compact-error mechanism connects to [Compact Operator](/page/Compact%20Operator), [Linear Operators on Banach Spaces](/page/Linear%20Operators%20on%20Banach%20Spaces), and [The Fredholm Alternative](/page/The%20Fredholm%20Alternative). In geometric analysis, the same package underlies regularity for elliptic differential operators, Hodge-type decompositions, index problems, and the construction of parametrices for geometric equations; for second-order PDE, see [Second-Order Elliptic Equations](/page/Second-Order%20Elliptic%20Equations).
Several refinements start from the choices suppressed here. [Semiclassical Analysis I: Symbols, Quantization, and Microlocal Foundations](/page/Semiclassical%20Analysis%20I%3A%20Symbols%2C%20Quantization%2C%20and%20Microlocal%20Foundations) adds a small parameter $h$ and studies the high-frequency limit at a finer scale; Fourier integral operators replace pseudodifferential kernels by oscillatory kernels associated to canonical transformations; boundary value problems require transmission conditions and boundary parametrices; and nonsmooth or noncompact settings force one to track which estimates survive without the full classical symbol expansion. The common thread is the one developed here: symbols encode the geometry of differentiation, while kernels and parametrices turn that geometry into estimates on [Hilbert Space](/page/Hilbert%20Space) and [Banach Space](/page/Banach%20Space) scales.
Contents
- Introduction
- What the Course Is Trying to Build
- From Multipliers to Variable Coefficients
- Why Estimates Replace Formulas
- The Guiding Elliptic Problem
- Prerequisites and Conventions
- Structure of the Course
- 1. Fourier Multipliers and the Model Calculus
- Fourier Transform on Schwartz Functions and Tempered Distributions
- Constant-Coefficient Operators as Polynomial Multipliers
- Bessel Potentials and Sobolev Spaces
- 2. Symbol Classes
- Measuring Symbols by Derivatives
- Order Filtration and Smoothing Symbols
- Classical Symbols and Asymptotic Expansions
- 3. Quantisation and Kernels
- From Symbols to Operators
- Oscillatory Integrals and Localisation
- Schwartz Kernels and Proper Support
- Smoothing Operators and Pseudolocality
- 4. Amplitudes, Changes of Quantisation, and Principal Symbols
- Amplitude Operators and Reduction to Left Symbols
- Right, Left, and Weyl Quantisation
- Principal Symbols and the Top-Order Quotient
- 5. Asymptotic Expansions and Symbolic Algebra
- Formal Asymptotic Expansions
- Taylor Expansion Inside Oscillatory Integrals
- Principal and Subprincipal Bookkeeping
- Smoothing and Formal Vanishing
- 6. Composition and Commutators
- The Composition Problem
- Asymptotic Expansion and Order Bookkeeping
- Principal Symbols and Commutators
- Consequences for the Calculus
- 7. Adjoints, Transposes, and Positivity
- Formal Adjoints in the Kohn-Nirenberg Calculus
- Transposes and Distributional Duality
- Real Principal Symbols and Self-Adjointness up to Lower Order
- Positivity and the Sharp Garding Inequality
- Elementary Positivity Mechanisms
- 8. Sobolev Mapping Properties
- Operators of Order $m$ Between Sobolev Spaces
- Calderon-Vaillancourt and Order-Zero Boundedness
- Local Sobolev Estimates on Open Sets
- Consequences for the Calculus
- 9. Ellipticity and Parametrices
- Elliptic Symbols and Lower Bounds
- Recursive Construction of the Parametrix
- Smoothing Remainders and Uniqueness
- Elliptic Regularity in Sobolev Scales
- 10. Compactness, Fredholm Theory, and Applications
- Negative-Order Operators and Compactness After Localization
- Elliptic Estimates with Lower-Order Error Terms
- Fredholm Consequences on Compact Manifolds
- 11. Coordinate Localisation and Operators on Manifolds
- Coordinate Symbols and the Cotangent Bundle
- Partitions of Unity and Proper Support
- Densities, Adjoints, and Coordinate-Free Kernels
- Ellipticity and Parametrices on Compact Manifolds
- 12. Consolidation: The Elliptic Calculus in Practice
- Building Parametrices for Estimates
- Differential, Smoothing, and Pseudodifferential Remainders
- Elliptic Regularity with Compact Support
- Symbolic Proof of Sobolev Regularity Gain
- What Fails Without Ellipticity
- Preparing for Microlocal Singularities
- Beyond and Connected Topics
Microlocal Analysis I: Pseudodifferential Operators
Created by admin on 6/19/2026 | Last updated on 6/19/2026