Mathematical Physics II: Quantum Mechanics develops the mathematical structure behind nonrelativistic quantum theory, emphasizing how physical principles are encoded in Hilbert spaces, operators, spectra, and symmetry. The course begins from the basic language of states and observables, then treats the analytic subtleties that distinguish quantum mechanics from finite-dimensional linear algebra: unbounded operators, domains, self-adjointness, and the spectral theorem. These ideas provide the foundation for understanding measurement, canonical commutation relations, and the Schrödinger representation.
The central theme is the interaction between geometry, analysis, and physics. Time evolution is described by unitary groups and the Schrödinger equation; concrete one-dimensional systems and the harmonic oscillator show how abstract operator methods produce explicit spectra and wavefunctions. Angular momentum and rotations introduce representation-theoretic ideas, which then organize central potentials and the hydrogen atom. Later chapters develop perturbation theory, conservation laws, scattering, and semiclassical approximations, showing how exact structures guide practical computation when systems cannot be solved directly.
The chapters build from foundations to applications. First they establish the formal setting of quantum states, observables, and measurement. Next they introduce the canonical examples and dynamical laws that turn the formalism into a working theory. The middle of the course studies solvable models and symmetry, while the final chapters broaden the toolkit to approximation and limiting methods, connecting rigorous operator-theoretic foundations with the calculations used throughout mathematical physics.
# Introduction
This introductory chapter fixes the mathematical viewpoint of the course. Quantum mechanics will be treated as a theory of states in complex Hilbert spaces, observables represented by self-adjoint operators, and time evolution implemented by unitary groups. The examples are the standard physical systems, but the organizing questions are mathematical: what must be specified before an operator is an observable, how does measurement arise from spectral theory, and why does [approximation theory](/page/Approximation%20Theory) enter even in elementary models?
The course begins from finite-dimensional intuition and then moves toward the analytic complications that appear for particles on continuous spaces. In finite dimension, matrices already display superposition, non-commuting observables, spectral projections, spin, and uncertainty. Infinite-dimensional systems add domains, unbounded operators, continuous spectrum, distributions, and limiting procedures, so the same formal symbols require additional hypotheses.
## The Mathematical Problem of Quantum Theory
Classical mechanics often starts with a phase space, functions on it, and differential equations for trajectories. Quantum mechanics replaces trajectories by states and replaces numerical quantities by operators. The first problem is therefore to identify which pieces of the classical picture survive and what mathematical structure replaces the missing parts.
[motivation]
### From Points to States
A classical particle has a point $(q,p)$ in phase space at each instant, and observables such as energy or angular momentum are functions of that point. A quantum system instead has a state vector, or more precisely a ray, in a complex Hilbert space. The ray matters because multiplying a vector by a non-zero complex scalar does not change the physical state it represents.
### From Functions to Operators
A classical observable assigns a number to each point of phase space. A quantum observable is represented by a self-adjoint operator $A$ on a Hilbert space $H$, and its numerical content is obtained through expectation values, spectral projections, and probability distributions. This replacement forces us to study non-commutativity: two observables need not admit simultaneous sharp values.
### From Dynamics to Unitary Evolution
Hamiltonian dynamics moves phase-space points. Quantum dynamics evolves states by a family of unitary operators $(U(t))_{t \in \mathbb R}$, usually written formally as $U(t)=e^{-itH}$ for a Hamiltonian operator $H$. Making this expression rigorous is one of the central reasons for developing self-adjoint unbounded operators and Stone's theorem.
[/motivation]
The motivation separates three mathematical tasks: describe states, describe observables, and describe dynamics. To discuss those tasks without committing to a particular particle or spin model, we need a compact name for the package of data that defines a quantum model.
[definition: Quantum System]
A quantum system in this course consists of a complex Hilbert space $H$, a specified class of admissible states on $H$, a specified class of self-adjoint operators $A:\mathcal D(A)\subset H\to H$ called observables, and a distinguished self-adjoint Hamiltonian operator $H_{\rm op}:\mathcal D(H_{\rm op})\subset H\to H$ generating time evolution.
[/definition]
The same letter $H$ is often used both for a Hilbert space and for a Hamiltonian in physics notation. These notes use $H$ for the Hilbert space when no confusion is possible, and write $H_{\rm op}$ for the Hamiltonian operator in introductory passages where both objects appear. The smallest non-spatial model already contains states, observables, and dynamics, so it is the right first test case for this definition.
[example: Two-Level System]
A spin-$1/2$ system with no spatial degree of freedom is modeled by $H=\mathbb C^2$. Write
\begin{align*}
\sigma_z(\alpha,\beta)=(\alpha,-\beta)
\end{align*}
so that, for the standard basis vectors $e_1=(1,0)$ and $e_2=(0,1)$,
\begin{align*}
\sigma_z e_1=e_1
\end{align*}
and
\begin{align*}
\sigma_z e_2=-e_2.
\end{align*}
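Take the Hamiltonian $H_{\rm op}=\frac{\omega}{2}\sigma_z$ with a fixed frequency $\omega\in\mathbb R$. Then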
\begin{align*}
H_{\rm op}e_1=\frac{\omega}{2}e_1
\end{align*}
and
\begin{align*}
H_{\rm op}e_2=-\frac{\omega}{2}e_2.
\end{align*}
For a normalized vector $\psi=(\alpha,\beta)=\alpha e_1+\beta e_2$, the time-evolution operator is the exponential $U(t)=e^{-itH_{\rm op}}$. Since $e_1$ and $e_2$ are eigenvectors of $H_{\rm op}$, applying the scalar exponential on each eigenspace gives
\begin{align*}
U(t)e_1=e^{-it\omega/2}e_1
\end{align*}
and
\begin{align*}
U(t)e_2=e^{it\omega/2}e_2.
\end{align*}
By linearity,
\begin{align*}
U(t)\psi=U(t)(\alpha e_1+\beta e_2)=\alpha e^{-it\omega/2}e_1+\beta e^{it\omega/2}e_2.
\end{align*}
Equivalently,
\begin{align*}
U(t)(\alpha,\beta)=\left(e^{-it\omega/2}\alpha,e^{it\omega/2}\beta\right).
\end{align*}
The two energy components therefore keep the same squared amplitudes while acquiring opposite phases, which is the finite-dimensional prototype for unitary quantum dynamics.
[/example]
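The evolution in this example can be checked numerically. The following is a minimal sketch in plain Python, assuming the Hamiltonian acts as $\pm\omega/2$ on $e_1$ and $e_2$ as above; the function name `evolve_two_level` and the numerical values are illustrative assumptions, not part of the notes.

```python
import cmath

def evolve_two_level(alpha, beta, omega, t):
    """Apply U(t) to (alpha, beta), where H_op e_1 = (omega/2) e_1 and
    H_op e_2 = -(omega/2) e_2 with omega real."""
    phase = cmath.exp(-1j * t * omega / 2)
    # e_1 picks up exp(-i t omega/2); e_2 picks up the conjugate phase.
    return phase * alpha, phase.conjugate() * beta

# Illustrative values (assumptions, not taken from the notes).
alpha, beta = 0.6, 0.8j
omega, t = 2.0, 1.3
a_t, b_t = evolve_two_level(alpha, beta, omega, t)

# The squared amplitudes are unchanged, so the norm is preserved.
print(abs(a_t) ** 2, abs(b_t) ** 2)
```

Evolving for time $s$ and then for time $t$ with this function agrees with evolving once for time $s+t$, which is the group law in this diagonal model.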
This example is finite-dimensional, so every linear operator is bounded and defined on all of $\mathbb C^2$. The main analytic work of the course begins when the Hilbert space is a function space and operators such as position, momentum, and energy are not bounded on that space.
## States, Probabilities, and Superposition
If a quantum state is not a point of phase space, what does it predict? The answer is probabilistic: a state assigns probability distributions to observables. This section states the basic language that will be refined later by the spectral theorem.
[definition: Pure State]
Let $H$ be a complex Hilbert space. A pure state is a one-dimensional subspace of $H$, represented by a vector $\psi \in H$ with $\|\psi\|_H=1$.
[/definition]
A normalized representative is convenient for computations, but the physical pure state is the ray $\{\lambda\psi : \lambda \in \mathbb C,\lambda\ne 0\}$. Once a state has been represented by a unit vector, the next question is how an observable produces a number that can be compared with repeated measurements.
[definition: Expectation of a Bounded Observable]
Let $H$ be a complex Hilbert space and let $\psi \in H$ satisfy $\|\psi\|_H=1$. The expectation functional in the state $\psi$ is the map
\begin{align*}
\mathbb E_\psi:\{A\in \mathcal L(H):A=A^*\}\to \mathbb R, \qquad A\mapsto (A\psi,\psi)_H.
\end{align*}
[/definition]
For bounded observables, this definition is algebraically stable: the operator applies to every vector in $H$, and the [inner product](/page/Inner%20Product) is finite. Later, when $A$ is unbounded, the expression $(A\psi,\psi)_H$ only makes sense for states in the domain of $A$. The variance, written as $\|A\psi\|_H^2-(A\psi,\psi)_H^2$, is finite on the same domain, whereas the form $(A^2\psi,\psi)_H-(A\psi,\psi)_H^2$ requires the stronger condition $\psi \in \mathcal D(A^2)$; both are instances of a spectral integrability condition. A spin measurement gives the simplest computation of an expectation value from amplitudes.
[example: Spin Measurement Along an Axis]
Let $H=\mathbb C^2$, let $\sigma_z(\alpha,\beta)=(\alpha,-\beta)$, and let $\psi=(\alpha,\beta)$ satisfy $|\alpha|^2+|\beta|^2=1$. Using the standard inner product on $\mathbb C^2$, linear in the first variable, we compute the expectation value from the definition of $\mathbb E_\psi$:
\begin{align*}
\sigma_z\psi=\sigma_z(\alpha,\beta)=(\alpha,-\beta).
\end{align*}
Therefore
\begin{align*}
\mathbb E_\psi[\sigma_z]=(\sigma_z\psi,\psi)_{\mathbb C^2}=((\alpha,-\beta),(\alpha,\beta))_{\mathbb C^2}.
\end{align*}
Expanding the inner product coordinate by coordinate gives
\begin{align*}
((\alpha,-\beta),(\alpha,\beta))_{\mathbb C^2}=\alpha\overline{\alpha}+(-\beta)\overline{\beta}.
\end{align*}
Since $\alpha\overline{\alpha}=|\alpha|^2$ and $\beta\overline{\beta}=|\beta|^2$, this becomes
\begin{align*}
\mathbb E_\psi[\sigma_z]=|\alpha|^2-|\beta|^2.
\end{align*}
Thus the expectation is the weighted average $1\cdot |\alpha|^2+(-1)\cdot |\beta|^2$ of the possible outcomes $1$ and $-1$, with weights given by the squared amplitudes in the eigenbasis.
[/example]
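The computation above can be mirrored in a few lines of code. This is a small sketch, assuming the same convention of an inner product linear in the first variable; the helper names `inner`, `sigma_z`, and `expectation_sigma_z` and the sample state are hypothetical choices for illustration.

```python
def inner(u, v):
    """Standard inner product on C^2, linear in the first variable."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def sigma_z(psi):
    """sigma_z(alpha, beta) = (alpha, -beta)."""
    alpha, beta = psi
    return (alpha, -beta)

def expectation_sigma_z(psi):
    # (sigma_z psi, psi) is real because sigma_z is self-adjoint.
    return inner(sigma_z(psi), psi).real

# Illustrative unit vector (an assumption): |alpha|^2 = 0.36, |beta|^2 = 0.64.
psi = (complex(0.6), complex(0, 0.8))
print(expectation_sigma_z(psi))  # approximates |alpha|^2 - |beta|^2
```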
Superposition is visible in this example because the same vector can be expanded in different eigenbases. This leads to the first conceptual obstruction: probability assignments for different observables are not usually obtained from a single underlying list of pre-existing values.
[remark: Non-Commutativity as a Structural Feature]
When two self-adjoint operators $A$ and $B$ do not commute, the order of applying them can matter. In quantum mechanics this algebraic fact is not a technical inconvenience; it controls compatibility of measurements, uncertainty inequalities, and the representation theory of symmetries.
[/remark]
The course will return to this point in several forms. At the beginning it appears as matrix algebra; in later chapters it becomes the canonical commutation relation between position and momentum, and then the representation theory behind angular momentum.
## Operators, Domains, and Observables
Why is a self-adjoint matrix a satisfactory observable, while the same phrase for a differential operator needs extra care? The difference is domain. Differential operators on Hilbert spaces of functions are often unbounded, so they are defined only on a dense subspace rather than on the whole Hilbert space, and the choice of that subspace is part of the operator.
[definition: Symmetric Operator]
Let $H$ be a complex Hilbert space. A densely defined linear operator $A:\mathcal D(A)\subset H\to H$ is symmetric if
\begin{align*}
(Au,v)_H=(u,Av)_H
\end{align*}
for all $u,v\in \mathcal D(A)$.
[/definition]
Symmetry is the integration-by-parts condition suggested by formal physics notation. It is weaker than self-adjointness because it does not by itself determine the correct boundary conditions or guarantee a spectral resolution. We therefore need the stronger condition that the operator and its adjoint have the same domain.
[definition: Self-Adjoint Operator]
Let $H$ be a complex Hilbert space. A densely defined linear operator $A:\mathcal D(A)\subset H\to H$ is self-adjoint if $A=A^*$, meaning $\mathcal D(A)=\mathcal D(A^*)$ and $Au=A^*u$ for all $u\in \mathcal D(A)$.
[/definition]
The equality of domains is the essential point. Many formally Hermitian differential expressions define several different self-adjoint operators depending on boundary conditions, and these different choices correspond to different quantum systems. A first-order differential operator on an interval shows how boundary data enter the operator itself.
[example: Momentum on an Interval]
Let $P_0=-i\frac{d}{dx}$ with domain $C_c^\infty(0,1)\subset L^2(0,1)$. For $u,v\in C_c^\infty(0,1)$, using the $L^2$ inner product linear in the first variable,
\begin{align*}
(P_0u,v)_{L^2}=\int_0^1(-iu'(x))\overline{v(x)}\,dx.
\end{align*}
[Integration by parts](/theorems/210) gives
\begin{align*}
\int_0^1 u'(x)\overline{v(x)}\,dx=u(1)\overline{v(1)}-u(0)\overline{v(0)}-\int_0^1u(x)\overline{v'(x)}\,dx.
\end{align*}
Because $u$ and $v$ have compact support in $(0,1)$, $u(0)=u(1)=v(0)=v(1)=0$, so
\begin{align*}
(P_0u,v)_{L^2}=i\int_0^1u(x)\overline{v'(x)}\,dx.
\end{align*}
Also
\begin{align*}
(u,P_0v)_{L^2}=\int_0^1u(x)\overline{-iv'(x)}\,dx=i\int_0^1u(x)\overline{v'(x)}\,dx.
\end{align*}
Hence $(P_0u,v)_{L^2}=(u,P_0v)_{L^2}$, so $P_0$ is symmetric on $C_c^\infty(0,1)$.
For each $\theta\in\mathbb R$, define $P_\theta=-i\frac{d}{dx}$ on the domain
\begin{align*}
\mathcal D(P_\theta)=\{u\in H^1(0,1):u(1)=e^{i\theta}u(0)\}.
\end{align*}
If $u,v\in\mathcal D(P_\theta)$, then
\begin{align*}
(P_\theta u,v)_{L^2}-(u,P_\theta v)_{L^2}=-i(u(1)\overline{v(1)}-u(0)\overline{v(0)}).
\end{align*}
Using $u(1)=e^{i\theta}u(0)$ and $v(1)=e^{i\theta}v(0)$,
\begin{align*}
u(1)\overline{v(1)}=e^{i\theta}u(0)e^{-i\theta}\overline{v(0)}=u(0)\overline{v(0)}.
\end{align*}
Therefore the boundary term is zero, and $P_\theta$ is symmetric. These domains are precisely the self-adjoint boundary conditions for the first-order momentum operator on the interval.
The spectrum also depends on $\theta$. If $P_\theta u=\lambda u$, then
\begin{align*}
-i u'(x)=\lambda u(x).
\end{align*}
Equivalently,
\begin{align*}
u'(x)=i\lambda u(x),
\end{align*}
so the non-zero solutions are
\begin{align*}
u(x)=Ce^{i\lambda x}.
\end{align*}
The boundary condition gives
\begin{align*}
Ce^{i\lambda}=e^{i\theta}C.
\end{align*}
For $C\ne 0$, this is $e^{i\lambda}=e^{i\theta}$, hence
\begin{align*}
\lambda=\theta+2\pi k,\qquad k\in\mathbb Z.
\end{align*}
Thus different choices of $\theta$ give different self-adjoint observables, with spectra shifted by the boundary phase.
[/example]
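The dependence of the spectrum on the boundary phase can be illustrated numerically. The sketch below, in plain Python, checks that $u(x)=e^{i\lambda x}$ with $\lambda=\theta+2\pi k$ satisfies the boundary condition and that two such eigenfunctions are orthogonal; the midpoint quadrature, the helper names, and the value of `theta` are assumptions made for illustration.

```python
import cmath
import math

def eigenvalue(theta, k):
    """Eigenvalue theta + 2*pi*k of P_theta from the example."""
    return theta + 2 * math.pi * k

def eigenfunction(lam):
    """u(x) = exp(i*lam*x) on [0, 1]; it has unit L^2 norm."""
    return lambda x: cmath.exp(1j * lam * x)

def boundary_defect(u, theta):
    """|u(1) - exp(i*theta) u(0)|, which vanishes when u is in D(P_theta)."""
    return abs(u(1.0) - cmath.exp(1j * theta) * u(0.0))

def l2_inner(u, v, n=2000):
    """Midpoint-rule approximation of the L^2(0,1) inner product."""
    h = 1.0 / n
    return h * sum(u((j + 0.5) * h) * v((j + 0.5) * h).conjugate() for j in range(n))

theta = 0.7  # illustrative boundary phase (an assumption)
u0 = eigenfunction(eigenvalue(theta, 0))
u1 = eigenfunction(eigenvalue(theta, 1))
print(boundary_defect(u0, theta), boundary_defect(u1, theta))
print(abs(l2_inner(u0, u1)), abs(l2_inner(u0, u0)))
```

Passing a different phase to `boundary_defect` gives a non-zero defect for the same function, which reflects the fact that each $\theta$ defines a different domain.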
This example explains why the course spends a full chapter on unbounded operators before relying on the spectral theorem in physical applications. The next structural question is what self-adjointness buys that symmetry alone does not: it supplies the projection-valued spectral resolution needed to turn an operator into a measurement rule.
[quotetheorem:6911]
[citeproof:6911]
This theorem is the mathematical reason that self-adjointness, rather than formal symmetry, is the definition of an observable. A symmetric operator need not have a projection-valued spectral resolution; the minimal momentum operator $-i\frac{d}{dx}$ on $C_c^\infty(0,1)\subset L^2(0,1)$ is symmetric but not self-adjoint, and its self-adjoint realizations appear only after imposing boundary conditions $u(1)=e^{i\theta}u(0)$. The statement also does not say that every spectral value is an eigenvalue: continuous spectrum appears naturally for position and momentum on $L^2(\mathbb R)$. Its purpose here is to explain the order of the course: operators and domains come before measurement theory because spectral measures require self-adjointness.
## Time Evolution and the Schrödinger Equation
Once observables are operators, what replaces Newton's equation? The Hamiltonian operator generates time evolution. The mathematical question is how a possibly unbounded operator produces a well-defined family of unitary transformations.
[definition: Unitary Time Evolution]
Let $H$ be a complex Hilbert space. A unitary time evolution is a family $(U(t))_{t\in\mathbb R}$ of maps
\begin{align*}
U(t):H\to H
\end{align*}
such that each $U(t)$ is unitary, $U(0)=I$, $U(t+s)=U(t)U(s)$ for all $s,t\in\mathbb R$, and $U(t)\psi\to \psi$ in $H$ as $t\to 0$ for every $\psi\in H$.
[/definition]
The continuity condition rules out algebraic group actions with no analytic dependence on time. The next question is whether every acceptable Hamiltonian gives such an evolution, and whether every such evolution comes from a Hamiltonian. Stone's theorem gives the exact equivalence.
[quotetheorem:6912]
Stone's theorem is the bridge between the operator-theoretic and dynamical parts of the subject. Strong continuity is essential: if $H=\mathbb C$ and $f:\mathbb R\to\mathbb R$ is a discontinuous additive function, then $U(t)z=e^{if(t)}z$ is a unitary group with no strongly continuous dependence on $t$, so the derivative at $t=0$ does not define a Hamiltonian. Self-adjointness is also essential: the symmetric minimal momentum operator $-i\frac{d}{dx}$ on $C_c^\infty(0,1)\subset L^2(0,1)$ generates no unitary group until a self-adjoint boundary condition is chosen. The theorem does not say that every vector satisfies the differential equation below, since differentiability in time requires the initial state to lie in $\mathcal D(H_{\rm op})$. In physical notation, vectors satisfying sufficient domain hypotheses solve the Schrödinger equation
\begin{align*}
i\frac{d}{dt}\psi(t)=H_{\rm op}\psi(t), \qquad \psi(t)=U(t)\psi(0).
\end{align*}
A free particle is the first example where this abstract theorem becomes a calculable formula through Fourier analysis.
[example: Free Particle on the Line]
For a free particle on $\mathbb R$ with mass $m>0$, start on the dense test domain $\mathcal S(\mathbb R)\subset L^2(\mathbb R)$ and write
\begin{align*}
H_{\rm op}\psi=-\frac{1}{2m}\psi''.
\end{align*}
Use the Fourier transform convention
\begin{align*}
\widehat{\psi}(\xi)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}e^{-ix\xi}\psi(x)\,dx.
\end{align*}
For $\psi\in\mathcal S(\mathbb R)$, [integration by parts](/theorems/2098) gives
\begin{align*}
\widehat{\psi'}(\xi)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}e^{-ix\xi}\psi'(x)\,dx=i\xi\widehat{\psi}(\xi),
\end{align*}
because the boundary term $e^{-ix\xi}\psi(x)$ tends to $0$ at both ends. Applying the same identity to $\psi'$ gives
\begin{align*}
\widehat{\psi''}(\xi)=i\xi\widehat{\psi'}(\xi)=i\xi(i\xi\widehat{\psi}(\xi))=-\xi^2\widehat{\psi}(\xi).
\end{align*}
Therefore
\begin{align*}
\widehat{H_{\rm op}\psi}(\xi)=-\frac{1}{2m}\widehat{\psi''}(\xi)=-\frac{1}{2m}(-\xi^2\widehat{\psi}(\xi))=\frac{\xi^2}{2m}\widehat{\psi}(\xi).
\end{align*}
Thus, in Fourier variables, the Hamiltonian is multiplication by the real function $\xi^2/(2m)$. The time-evolution operator $U(t)=e^{-itH_{\rm op}}$ is therefore multiplication by the scalar exponential of this multiplier:
\begin{align*}
\widehat{U(t)\psi}(\xi)=e^{-it\xi^2/(2m)}\widehat{\psi}(\xi).
\end{align*}
Since
\begin{align*}
|e^{-it\xi^2/(2m)}|=1
\end{align*}
for every $\xi\in\mathbb R$, this formula preserves the $L^2$ norm in Fourier space, and hence preserves the norm of $\psi$ by Plancherel's theorem. Fourier analysis has turned the differential Hamiltonian into a multiplication operator, so the abstract spectral calculus becomes an explicit phase factor on each frequency component.
[/example]
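The Fourier-multiplier formula suggests a simple numerical scheme. The sketch below is a discrete analogue only: it assumes a periodic grid on a finite box in place of the real line and a unitary discrete Fourier transform in place of the continuous one. The names `dft`, `free_evolve`, and `norm_sq` and all numerical parameters are illustrative assumptions.

```python
import cmath
import math

def dft(values, sign):
    """Unitary discrete Fourier transform; sign=-1 forward, sign=+1 inverse."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(values[p] * cmath.exp(sign * 2j * math.pi * p * k / n)
                    for p in range(n))
        out.append(total / math.sqrt(n))
    return out

def free_evolve(psi, box_length, mass, t):
    """Multiply each discrete Fourier mode by exp(-i t xi^2 / (2 m))."""
    n = len(psi)
    psi_hat = dft(psi, -1)
    evolved_hat = []
    for k, c in enumerate(psi_hat):
        signed_k = k if k < n // 2 else k - n  # map index to a signed frequency
        xi = 2 * math.pi * signed_k / box_length
        evolved_hat.append(cmath.exp(-1j * t * xi ** 2 / (2 * mass)) * c)
    return dft(evolved_hat, +1)

def norm_sq(psi, dx):
    """Riemann-sum approximation of the squared L^2 norm."""
    return dx * sum(abs(c) ** 2 for c in psi)

n, box_length = 64, 20.0
dx = box_length / n
grid = [-box_length / 2 + j * dx for j in range(n)]
psi0 = [math.exp(-x * x / 2) for x in grid]  # unnormalized Gaussian packet
psi_t = free_evolve(psi0, box_length, mass=1.0, t=0.5)
print(norm_sq(psi0, dx), norm_sq(psi_t, dx))
```

Because every multiplier has modulus one and the discrete transform is unitary, the two printed norms agree up to rounding, which is the discrete counterpart of the Plancherel argument.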
The same mechanism reappears for wave packets, scattering intuition, and approximation by exactly solvable systems. The formula also warns that differentiating in time is more restrictive than applying $U(t)$: the unitary group acts on all of $L^2(\mathbb R)$, while the differential equation requires initial data in the domain of the Hamiltonian.
## Models Running Through the Course
A theory becomes useful only when it explains standard systems in a controlled way. The course uses a small set of recurring models to test each new mathematical tool: finite spin systems, the particle on a line, the particle in a box, the harmonic oscillator, the hydrogen atom, and perturbations of solvable Hamiltonians.
[example: Particle in a Box]
Let $H=L^2(0,L)$ and take the Dirichlet Hamiltonian
\begin{align*}
H_{\rm op}u=-u''
\end{align*}
with domain $\mathcal D(H_{\rm op})=H^2(0,L)\cap H_0^1(0,L)$, so vectors in the domain satisfy $u(0)=u(L)=0$. To find the energy levels, solve
\begin{align*}
-u''=\lambda u.
\end{align*}
If $\lambda=0$, then $u(x)=ax+b$, and the boundary conditions give $b=0$ and $aL=0$, hence $u=0$. If $\lambda=-\mu^2<0$ with $\mu>0$, then
\begin{align*}
u(x)=Ae^{\mu x}+Be^{-\mu x}.
\end{align*}
The condition $u(0)=0$ gives $A+B=0$, so $B=-A$. Then $u(L)=0$ gives
\begin{align*}
A(e^{\mu L}-e^{-\mu L})=0.
\end{align*}
Since $e^{\mu L}-e^{-\mu L}\ne 0$, we get $A=0$ and hence $B=0$. Thus no negative eigenvalue occurs.
For $\lambda=\mu^2>0$, the solutions are
\begin{align*}
u(x)=A\cos(\mu x)+B\sin(\mu x).
\end{align*}
The condition $u(0)=0$ gives $A=0$, so
\begin{align*}
u(x)=B\sin(\mu x).
\end{align*}
The condition $u(L)=0$ gives
\begin{align*}
B\sin(\mu L)=0.
\end{align*}
For a non-zero eigenfunction we need $B\ne 0$, hence $\sin(\mu L)=0$, so
\begin{align*}
\mu L=n\pi
\end{align*}
for some integer $n\ge 1$, since $\mu>0$ and $L>0$. Therefore
\begin{align*}
\lambda_n=\left(\frac{n\pi}{L}\right)^2
\end{align*}
and a corresponding eigenfunction is
\begin{align*}
\phi_n(x)=\sqrt{\frac{2}{L}}\sin\left(\frac{n\pi x}{L}\right).
\end{align*}
The normalization follows from $\sin^2 y=(1-\cos(2y))/2$:
\begin{align*}
\int_0^L\left|\phi_n(x)\right|^2\,dx=\frac{2}{L}\int_0^L\sin^2\left(\frac{n\pi x}{L}\right)\,dx=1.
\end{align*}
For $m\ne n$, the product identity for sines gives
\begin{align*}
\int_0^L\phi_n(x)\phi_m(x)\,dx=\frac{1}{L}\int_0^L\left[\cos\left(\frac{(n-m)\pi x}{L}\right)-\cos\left(\frac{(n+m)\pi x}{L}\right)\right]\,dx=0.
\end{align*}
By the *Fourier sine basis theorem*, these normalized eigenfunctions form an [orthonormal basis](/page/Orthonormal%20Basis) of $L^2(0,L)$. Thus a state can be expanded into energy levels, and the boundary conditions force the discrete spectrum with energies proportional to $n^2$.
[/example]
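The orthonormality relations and energy levels can be checked with a short quadrature. This is a minimal sketch in plain Python, assuming a midpoint rule and an arbitrary box length; the helper names `phi`, `energy`, and `inner` are hypothetical.

```python
import math

def phi(n, length):
    """Normalized Dirichlet eigenfunction sqrt(2/L) sin(n pi x / L)."""
    return lambda x: math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)

def energy(n, length):
    """Eigenvalue (n pi / L)^2 of -u'' with u(0) = u(L) = 0."""
    return (n * math.pi / length) ** 2

def inner(u, v, length, steps=4000):
    """Midpoint-rule approximation of the real L^2(0, L) inner product."""
    h = length / steps
    return h * sum(u((j + 0.5) * h) * v((j + 0.5) * h) for j in range(steps))

length = 2.0  # illustrative box length (an assumption)
for n in (1, 2, 3):
    row = [round(inner(phi(n, length), phi(m, length), length), 6) for m in (1, 2, 3)]
    print(n, energy(n, length), row)
```

Each printed row approximates a row of the identity matrix, and the energies grow like $n^2$.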
The particle in a box is the first model where boundary conditions are not cosmetic. They are part of the definition of the operator, and hence part of the definition of the quantum system.
[example: Harmonic Oscillator]
For the one-dimensional oscillator in units with mass, angular frequency, and Planck's constant equal to $1$, take $Q\psi(x)=x\psi(x)$ and $P\psi(x)=-i\psi'(x)$ on the [Schwartz space](/page/Schwartz%20Space) $\mathcal S(\mathbb R)$. Define
\begin{align*}
a=\frac{1}{\sqrt2}(Q+iP)
\end{align*}
and
\begin{align*}
a^*=\frac{1}{\sqrt2}(Q-iP).
\end{align*}
Since $QP\psi(x)=-ix\psi'(x)$ and $PQ\psi(x)=-i(x\psi(x))'=-i\psi(x)-ix\psi'(x)$, we have
\begin{align*}
(QP-PQ)\psi=i\psi.
\end{align*}
Thus $[Q,P]=iI$ on this common test domain.
Now compute the factorization:
\begin{align*}
a^*a=\frac{1}{2}(Q-iP)(Q+iP)=\frac{1}{2}(Q^2+iQP-iPQ+P^2).
\end{align*}
Because $iQP-iPQ=i(QP-PQ)=i(iI)=-I$, this gives
\begin{align*}
a^*a=\frac{1}{2}(P^2+Q^2-I).
\end{align*}
Therefore
\begin{align*}
H_{\rm op}=\frac{1}{2}(P^2+Q^2)=a^*a+\frac{1}{2}I.
\end{align*}
The ground state is found from $a\phi_0=0$. In coordinates this equation is
\begin{align*}
0=a\phi_0=\frac{1}{\sqrt2}(x\phi_0+\phi_0'),
\end{align*}
so
\begin{align*}
\phi_0'(x)=-x\phi_0(x).
\end{align*}
The solution is
\begin{align*}
\phi_0(x)=C e^{-x^2/2}.
\end{align*}
Choosing $C=\pi^{-1/4}$ makes $\|\phi_0\|_{L^2}=1$, since $\int_{\mathbb R}e^{-x^2}\,dx=\sqrt{\pi}$. Because $a\phi_0=0$, we get
\begin{align*}
H_{\rm op}\phi_0=\left(a^*a+\frac{1}{2}I\right)\phi_0=\frac{1}{2}\phi_0.
\end{align*}
The commutator of the ladder operators is
\begin{align*}
[a,a^*]=aa^*-a^*a=I.
\end{align*}
Using $H_{\rm op}=a^*a+\frac{1}{2}I$, this implies
\begin{align*}
H_{\rm op}a^*=a^*aa^*+\frac{1}{2}a^*=a^*(a^*a+I)+\frac{1}{2}a^*=a^*H_{\rm op}+a^*.
\end{align*}
Hence, if $H_{\rm op}\phi=E\phi$, then
\begin{align*}
H_{\rm op}(a^*\phi)=a^*H_{\rm op}\phi+a^*\phi=(E+1)a^*\phi.
\end{align*}
Starting from $\phi_0$, the vectors obtained by repeatedly applying $a^*$ therefore have energies
\begin{align*}
E_n=n+\frac{1}{2},\qquad n\ge 0.
\end{align*}
The oscillator is the model case where commutation relations turn a differential-operator spectral problem into an algebraic ladder of energy levels.
[/example]
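The first two rungs of the ladder can be verified numerically. The sketch below approximates derivatives by central differences, so it is a consistency check rather than a proof; the step sizes, the sample points, and the names `phi0`, `raise_once`, and `hamiltonian` are assumptions for illustration.

```python
import math

def phi0(x):
    """Normalized Gaussian ground state pi^(-1/4) exp(-x^2/2)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2)

def raise_once(f, h=1e-4):
    """a* f = (x f - f') / sqrt(2), with f' from a central difference."""
    return lambda x: (x * f(x) - (f(x + h) - f(x - h)) / (2 * h)) / math.sqrt(2)

def hamiltonian(f, h=1e-3):
    """H f = (-f'' + x^2 f) / 2, with f'' from a central difference."""
    return lambda x: (-(f(x + h) - 2 * f(x) + f(x - h)) / h ** 2 + x * x * f(x)) / 2

phi1 = raise_once(phi0)
for x in (0.3, 1.0):
    # The ratios approximate the ladder energies 1/2 and 3/2.
    print(hamiltonian(phi0)(x) / phi0(x), hamiltonian(phi1)(x) / phi1(x))
```

The sample points avoid $x=0$, where $a^*\phi_0$ vanishes and the ratio is undefined.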
The oscillator is a meeting point for several themes: unbounded operators, dense domains, ladder operators, Gaussian ground states, and the role of the Fourier transform. It also provides the local model behind many approximation methods.
[example: Hydrogen Atom]
For a hydrogenic model with nuclear charge encoded by $\gamma>0$, take $H=L^2(\mathbb R^3)$ and begin formally with the Coulomb Hamiltonian
\begin{align*}
H_{\rm op}\psi=-\frac{1}{2m}\Delta\psi-\frac{\gamma}{r}\psi,\qquad r=|x|.
\end{align*}
Using the spherical-coordinate formula
\begin{align*}
\Delta=\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right)+\frac{1}{r^2}\Delta_{S^2},
\end{align*}
look for an eigenfunction of the form $\psi(r,\theta,\varphi)=R(r)Y(\theta,\varphi)$, where $Y$ is a spherical harmonic satisfying
\begin{align*}
-\Delta_{S^2}Y=\ell(\ell+1)Y.
\end{align*}
Since $\partial_r(RY)=R'Y$, we have
\begin{align*}
\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}(RY)\right)=\frac{1}{r^2}\frac{\partial}{\partial r}(r^2R'Y).
\end{align*}
Because $Y$ is independent of $r$,
\begin{align*}
\frac{1}{r^2}\frac{\partial}{\partial r}(r^2R'Y)=\frac{2rR'Y+r^2R''Y}{r^2}.
\end{align*}
Thus
\begin{align*}
\frac{1}{r^2}\frac{\partial}{\partial r}(r^2R'Y)=\left(R''+\frac{2}{r}R'\right)Y.
\end{align*}
The angular part is
\begin{align*}
\frac{1}{r^2}\Delta_{S^2}(RY)=\frac{R}{r^2}\Delta_{S^2}Y=-\frac{\ell(\ell+1)}{r^2}RY.
\end{align*}
Therefore
\begin{align*}
\Delta(RY)=\left(R''+\frac{2}{r}R'-\frac{\ell(\ell+1)}{r^2}R\right)Y.
\end{align*}
Substituting this into $H_{\rm op}\psi=E\psi$ gives
\begin{align*}
-\frac{1}{2m}\left(R''+\frac{2}{r}R'-\frac{\ell(\ell+1)}{r^2}R\right)Y-\frac{\gamma}{r}RY=ERY.
\end{align*}
On points where $Y\ne 0$, division by $Y$ leaves the radial equation
\begin{align*}
-\frac{1}{2m}\left(R''+\frac{2}{r}R'-\frac{\ell(\ell+1)}{r^2}R\right)-\frac{\gamma}{r}R=ER.
\end{align*}
Equivalently, multiplying by $2m$ and moving all terms to the left gives
\begin{align*}
-R''-\frac{2}{r}R'+\frac{\ell(\ell+1)}{r^2}R-\frac{2m\gamma}{r}R-2mER=0.
\end{align*}
Thus rotational symmetry separates the three-dimensional partial differential eigenvalue problem into one radial ordinary differential equation for each angular momentum number $\ell$, with the spherical harmonics carrying the angular dependence.
[/example]
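As a sanity check on the radial equation, one can substitute a trial function. The sketch below assumes the standard $\ell=0$ ground-state candidate $R(r)=e^{-m\gamma r}$ with energy $E=-m\gamma^2/2$ in units where Planck's constant is $1$, and evaluates the left-hand side of the final radial equation using exact derivatives. The function name `radial_residual` and the unit choices are illustrative assumptions.

```python
import math

def radial_residual(R, dR, d2R, r, ell, m, gamma, E):
    """Left-hand side of the radial equation:
    -R'' - (2/r) R' + ell(ell+1)/r^2 R - (2 m gamma / r) R - 2 m E R."""
    return (-d2R(r) - 2.0 / r * dR(r) + ell * (ell + 1) / r ** 2 * R(r)
            - 2.0 * m * gamma / r * R(r) - 2.0 * m * E * R(r))

m, gamma = 1.0, 1.0          # illustrative units (assumptions)
kappa = m * gamma
E = -m * gamma ** 2 / 2      # candidate energy for the ell = 0 trial function

R = lambda r: math.exp(-kappa * r)
dR = lambda r: -kappa * math.exp(-kappa * r)
d2R = lambda r: kappa ** 2 * math.exp(-kappa * r)

for r in (0.5, 1.0, 3.0):
    print(r, radial_residual(R, dR, d2R, r, 0, m, gamma, E))
```

The residual vanishes up to rounding because the $1/r$ terms cancel when $\kappa=m\gamma$ and the remaining constant terms cancel for this value of $E$; a different energy leaves a non-zero residual.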
The hydrogen atom is mathematically more demanding than the oscillator because the potential is singular and the symmetry group is richer. It is included to show how operator theory, special functions, and representation-theoretic ideas interact in a central physical model.
## Symmetry, Approximation, and the Shape of the Course
Why do symmetry and approximation appear in a course whose foundations are Hilbert spaces and self-adjoint operators? Exact solutions are rare, and symmetries explain both conserved quantities and degeneracies of spectra. Approximation methods then extract predictions when exact diagonalization is unavailable.
[definition: Quantum Symmetry]
Let $H$ be a complex Hilbert space and let $\mathbb P(H)$ denote the set of one-dimensional subspaces of $H$. A quantum symmetry is a map $S:\mathbb P(H)\to \mathbb P(H)$ for which there exists a unitary or antiunitary operator $U:H\to H$ such that $S([\psi])=[U\psi]$ for every non-zero $\psi\in H$, and such that transition probabilities between pure states are preserved.
[/definition]
This definition points toward Wigner's theorem, which explains why symmetries act through unitary or antiunitary maps. Continuous symmetries are then linked to self-adjoint generators, so the same operator theory used for Hamiltonians also describes momentum and angular momentum. The next question is how such a symmetry becomes a conservation law.
[quotetheorem:6913]
[citeproof:6913]
This conservation principle explains why commutators are not merely algebraic notation. They encode invariance, selection rules, and conserved quantities, and they guide the reduction of spectral problems into smaller pieces. The hypothesis is stronger than writing a formal commutator $[A,H_{\rm op}]=0$ on a convenient core; for unbounded operators, a formal commutator identity may fail to imply commutation of the corresponding unitary groups or spectral projections. A concrete finite-dimensional warning is obtained on $H=\mathbb C^2$ by taking $A=\sigma_z$ and $H_{\rm op}=\sigma_x$: the groups fail to commute, and a state initially in the $+1$ eigenspace of $\sigma_z$ develops a time-dependent $\sigma_z$ measurement distribution under $e^{-it\sigma_x}$. Conservation of the spectral distribution also does not mean that the state is stationary, since the state may still acquire phases or move inside a degenerate spectral subspace. Since most Hamiltonians cannot be diagonalized exactly, the final organizing question is how known spectral data change under a controlled modification of the operator.
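The finite-dimensional warning with $A=\sigma_z$ and $H_{\rm op}=\sigma_x$ can be made explicit. The sketch below uses the identity $e^{-it\sigma_x}=\cos(t)I-i\sin(t)\sigma_x$, which follows from $\sigma_x^2=I$; the function names and sample times are illustrative assumptions.

```python
import math

def evolve_sigma_x(psi, t):
    """Apply exp(-i t sigma_x) = cos(t) I - i sin(t) sigma_x to psi in C^2."""
    a, b = psi
    c, s = math.cos(t), math.sin(t)
    return (c * a - 1j * s * b, -1j * s * a + c * b)

def prob_up(psi):
    """Probability of the outcome +1 for sigma_z in the unit vector psi."""
    return abs(psi[0]) ** 2

e1 = (1.0, 0.0)
for t in (0.0, 0.4, math.pi / 4, math.pi / 2):
    print(t, prob_up(evolve_sigma_x(e1, t)))  # follows cos(t)^2
```

The probability of the outcome $+1$ changes with $t$, so the $\sigma_z$ distribution is not conserved under this evolution.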
[definition: Perturbative Question]
Let $H$ be a complex Hilbert space, let $H_{\rm op}:\mathcal D(H_{\rm op})\subset H\to H$ be a self-adjoint operator, let $V:\mathcal D(V)\subset H\to H$ be a symmetric or self-adjoint perturbing operator, and let $\lambda\in\mathbb R$ be a small parameter. A perturbative question asks how spectral data or time evolution for an operator realization of $H_{\rm op}+\lambda V$ on a specified domain in $H$ depends on $\lambda$, assuming the unperturbed operator $H_{\rm op}$ is already understood.
[/definition]
Perturbation theory is not a replacement for spectral theory; it is an application of it under extra hypotheses. The course will treat approximation methods as controlled mathematical procedures rather than formal symbol manipulation. The first formula students meet is the energy correction for a non-degenerate eigenvalue.
[example: First-Order Energy Shift]
Assume there is an eigenpair branch for the perturbed operator of the form
\begin{align*}
\psi(\lambda)=\psi_0+\lambda\psi_1+O(\lambda^2)
\end{align*}
and
\begin{align*}
E(\lambda)=E_0+\lambda E_1+O(\lambda^2),
\end{align*}
with $\psi_0\in\mathcal D(H_{\rm op})\cap\mathcal D(V)$, $\psi_1\in\mathcal D(H_{\rm op})$, $\|\psi_0\|_H=1$, and $H_{\rm op}\psi_0=E_0\psi_0$. The perturbed eigenvalue equation is
\begin{align*}
(H_{\rm op}+\lambda V)\psi(\lambda)=E(\lambda)\psi(\lambda).
\end{align*}
Substituting the two expansions into the left-hand side gives, by linearity,
\begin{align*}
(H_{\rm op}+\lambda V)(\psi_0+\lambda\psi_1+O(\lambda^2))=H_{\rm op}\psi_0+\lambda H_{\rm op}\psi_1+\lambda V\psi_0+O(\lambda^2).
\end{align*}
Since $H_{\rm op}\psi_0=E_0\psi_0$, this becomes
\begin{align*}
(H_{\rm op}+\lambda V)\psi(\lambda)=E_0\psi_0+\lambda(H_{\rm op}\psi_1+V\psi_0)+O(\lambda^2).
\end{align*}
The right-hand side expands as
\begin{align*}
(E_0+\lambda E_1+O(\lambda^2))(\psi_0+\lambda\psi_1+O(\lambda^2))=E_0\psi_0+\lambda(E_0\psi_1+E_1\psi_0)+O(\lambda^2).
\end{align*}
Equating the coefficients of $\lambda$ gives
\begin{align*}
H_{\rm op}\psi_1+V\psi_0=E_0\psi_1+E_1\psi_0.
\end{align*}
Taking the inner product with $\psi_0$ in the second slot gives
\begin{align*}
(H_{\rm op}\psi_1,\psi_0)_H+(V\psi_0,\psi_0)_H=(E_0\psi_1,\psi_0)_H+(E_1\psi_0,\psi_0)_H.
\end{align*}
Because $H_{\rm op}$ is self-adjoint and $H_{\rm op}\psi_0=E_0\psi_0$,
\begin{align*}
(H_{\rm op}\psi_1,\psi_0)_H=(\psi_1,H_{\rm op}\psi_0)_H=(\psi_1,E_0\psi_0)_H=E_0(\psi_1,\psi_0)_H.
\end{align*}
Also,
\begin{align*}
(E_0\psi_1,\psi_0)_H=E_0(\psi_1,\psi_0)_H
\end{align*}
and, since $\|\psi_0\|_H=1$,
\begin{align*}
(E_1\psi_0,\psi_0)_H=E_1.
\end{align*}
Therefore the coefficient equation reduces to
\begin{align*}
E_0(\psi_1,\psi_0)_H+(V\psi_0,\psi_0)_H=E_0(\psi_1,\psi_0)_H+E_1.
\end{align*}
Subtracting $E_0(\psi_1,\psi_0)_H$ from both sides yields
\begin{align*}
E_1=(V\psi_0,\psi_0)_H.
\end{align*}
Thus the first-order change in the energy is the expectation value of the perturbing observable in the unperturbed state.
[/example]
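The formula $E_1=(V\psi_0,\psi_0)_H$ can be checked numerically in a finite-dimensional toy model. The following is a minimal sketch, not part of the course's formal development: the matrices `H0 = diag(1, 3)` and `V` are hypothetical choices made only for illustration, and the exact eigenvalue is obtained from the closed formula for a real symmetric $2\times2$ matrix.

```python
import math

# Hypothetical 2x2 model (an assumption, not from the text):
# H0 = diag(E0, E0_OTHER), psi0 = e_1, and a real symmetric perturbation V.
E0, E0_OTHER = 1.0, 3.0
V = [[0.5, 0.2], [0.2, -0.4]]

def lowest_eigenvalue(lam):
    """Exact lower eigenvalue of H0 + lam*V from the 2x2 symmetric formula."""
    a = E0 + lam * V[0][0]
    d = E0_OTHER + lam * V[1][1]
    b = lam * V[0][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean - radius

# First-order coefficient E1 = (V psi0, psi0) with psi0 = (1, 0).
E1 = V[0][0]

for lam in (1e-1, 1e-2, 1e-3):
    exact = lowest_eigenvalue(lam)
    first_order = E0 + lam * E1
    # The discrepancy should shrink roughly like lam**2.
    print(lam, exact, first_order, abs(exact - first_order))
```

In this model the discrepancy between the exact eigenvalue and $E_0+\lambda E_1$ decreases by about a factor of $100$ each time $\lambda$ is divided by $10$, consistent with an $O(\lambda^2)$ remainder.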
The introduction closes with the course map. First we build states, observables, and finite-dimensional measurement. Then we develop unbounded self-adjoint operators and spectral measures. After that, unitary dynamics, canonical examples, angular momentum, spin, and approximation methods are treated as applications of the same Hilbert-space framework.
With the course map in place, we begin at the foundation: what a quantum state is, how observables act on it, and how measurement probabilities arise.
# 1. States, Observables, and Hilbert Space
Quantum mechanics begins by replacing the classical phase space of positions and momenta with a complex Hilbert space of states. The first task is to understand which vectors represent physical states, how probabilities are extracted from them, and how measurement outcomes are encoded by operators. This chapter sets up the finite- and infinite-dimensional language needed later for unbounded operators, spectral measures, and Schrödinger evolution.
## Complex Hilbert Spaces as State Spaces
What mathematical structure is needed if states are allowed to interfere, probabilities are computed from amplitudes, and limits of physically meaningful approximations must remain inside the theory? The answer is not merely a complex [vector space](/page/Vector%20Space): we need an inner product to form transition amplitudes and a completeness condition so that convergent approximation schemes have limits.
[definition: Complex Hilbert Space]
A complex Hilbert space is a complex vector space $H$ equipped with an inner product $(\cdot,\cdot)_H: H \times H \to \mathbb C$ such that:
1. $(\alpha u + \beta v,w)_H = \alpha (u,w)_H + \beta (v,w)_H$ for all $u,v,w \in H$ and $\alpha,\beta \in \mathbb C$;
2. $(u,v)_H = \overline{(v,u)_H}$ for all $u,v \in H$;
3. $(u,u)_H \ge 0$ for all $u \in H$, and $(u,u)_H = 0$ iff $u=0$;
4. $H$ is complete with respect to the norm $\|u\|_H = (u,u)_H^{1/2}$.
[/definition]
The convention here is that the inner product is linear in the first argument and conjugate-linear in the second. Completeness matters because sequences of finite approximations, wave packets, or basis expansions should converge to states still belonging to $H$.
[example: Finite Spin State Space]
For a spin-$1/2$ degree of freedom take $H=\mathbb C^2$ with inner product
\begin{align*}
(u,v)_H = u_1\overline{v_1}+u_2\overline{v_2}.
\end{align*}
The standard basis vectors
\begin{align*}
e_+=(1,0), \qquad e_-=(0,1)
\end{align*}
are orthonormal, since
\begin{align*}
(e_+,e_+)_H=1\cdot \overline{1}+0\cdot \overline{0}=1,
\end{align*}
\begin{align*}
(e_-,e_-)_H=0\cdot \overline{0}+1\cdot \overline{1}=1,
\end{align*}
and
\begin{align*}
(e_+,e_-)_H=1\cdot \overline{0}+0\cdot \overline{1}=0.
\end{align*}
If $u=\alpha e_+ + \beta e_-$, then $u=(\alpha,\beta)$, so its squared norm is
\begin{align*}
\|u\|_H^2=(u,u)_H=\alpha\overline{\alpha}+\beta\overline{\beta}=|\alpha|^2+|\beta|^2.
\end{align*}
When $u\ne 0$, the normalised vector is
\begin{align*}
\frac{u}{\|u\|_H}=\frac{\alpha}{\sqrt{|\alpha|^2+|\beta|^2}}e_+ + \frac{\beta}{\sqrt{|\alpha|^2+|\beta|^2}}e_-.
\end{align*}
Thus the two squared coefficient moduli in the normalised state are
\begin{align*}
\left|\frac{\alpha}{\sqrt{|\alpha|^2+|\beta|^2}}\right|^2=\frac{|\alpha|^2}{|\alpha|^2+|\beta|^2},
\end{align*}
and
\begin{align*}
\left|\frac{\beta}{\sqrt{|\alpha|^2+|\beta|^2}}\right|^2=\frac{|\beta|^2}{|\alpha|^2+|\beta|^2}.
\end{align*}
Their sum is $1$, so after normalisation they form the probabilities of obtaining the two spin outcomes represented by $e_+$ and $e_-$.
[/example]
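The normalisation step in the example can be sketched in a few lines of Python. The helper names `inner` and `normalise` and the sample coefficients $\alpha=3$, $\beta=4i$ are illustrative assumptions; the inner product follows the convention above, linear in the first slot and conjugate-linear in the second.

```python
import math

def inner(u, v):
    """Inner product on C^n: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def normalise(u):
    """Return u / ||u||; the zero vector represents no state."""
    norm = math.sqrt(inner(u, u).real)
    if norm == 0:
        raise ValueError("the zero vector does not represent a state")
    return [a / norm for a in u]

u = [3 + 0j, 4j]                 # illustrative choice: alpha = 3, beta = 4i
psi = normalise(u)
probabilities = [abs(c) ** 2 for c in psi]
print(probabilities)             # squared coefficient moduli, summing to 1
```

For these coefficients $\|u\|_H=5$, so the two probabilities are $9/25$ and $16/25$ up to rounding.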
The finite spin example turns Hilbert space geometry into a rule for probabilities, but that rule would fail unless inner products were controlled by norms. The next result supplies exactly that control: it bounds every transition amplitude and later becomes the proof mechanism behind the uncertainty inequality.
[quotetheorem:432]
[citeproof:432]
This inequality is the analytic source of the bound that every transition probability lies between $0$ and $1$. The positivity and definiteness hypotheses are essential: for the indefinite Hermitian form on $\mathbb C^2$ given by $(u,v)=u_1\overline{v_1}-u_2\overline{v_2}$, the vectors $u=(1,1)$ and $v=(1,-1)$ both have squared length $0$, while $(u,v)=2$. The result also does not say that a large transition amplitude makes two states equal; equality in the inequality occurs exactly when the two vectors are linearly dependent, that is, when they lie on the same complex line. If $u$ and $v$ are unit vectors, then $|(u,v)_H|^2$ lies in $[0,1]$ and can therefore serve as a probability; this is the form used later in both projection probabilities and the uncertainty inequality.
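The counterexample with the indefinite form can be verified directly. The sketch below uses two hypothetical helper functions, `inner` for the standard inner product on $\mathbb C^2$ and `indefinite` for the form $u_1\overline{v_1}-u_2\overline{v_2}$, and evaluates both on the vectors named in the text.

```python
def inner(u, v):
    """Standard inner product on C^2 (conjugate-linear in the second slot)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def indefinite(u, v):
    """The indefinite Hermitian form u1*conj(v1) - u2*conj(v2)."""
    return (u[0] * complex(v[0]).conjugate()
            - u[1] * complex(v[1]).conjugate())

u, v = (1, 1), (1, -1)

# Genuine inner product: |(u,v)|^2 <= (u,u)(v,v) holds for this pair.
cs_holds = abs(inner(u, v)) ** 2 <= inner(u, u).real * inner(v, v).real

# Indefinite form: both "squared lengths" vanish, yet the pairing equals 2,
# so no bound of the form |(u,v)|^2 <= (u,u)(v,v) can hold.
lengths = (indefinite(u, u), indefinite(v, v))
pairing = indefinite(u, v)
print(cs_holds, lengths, pairing)
```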
[example: Gaussian Wave Packet in $L^2(\mathbb R)$]
Let $H=L^2(\mathbb R)$ and let
\begin{align*}
\psi(x)= (\pi \sigma^2)^{-1/4}\exp\left(-\frac{(x-x_0)^2}{2\sigma^2}+ip_0x\right),
\end{align*}
where $\sigma>0$, $x_0\in\mathbb R$, and $p_0\in\mathbb R$. Since $|e^{ip_0x}|=1$ and the first exponential factor is real and positive,
\begin{align*}
|\psi(x)|^2=(\pi\sigma^2)^{-1/2}\exp\left(-\frac{(x-x_0)^2}{\sigma^2}\right).
\end{align*}
Therefore
\begin{align*}
\|\psi\|_{L^2}^2=\int_{\mathbb R}|\psi(x)|^2\,dx=(\pi\sigma^2)^{-1/2}\int_{\mathbb R}\exp\left(-\frac{(x-x_0)^2}{\sigma^2}\right)\,dx.
\end{align*}
With the substitution $y=(x-x_0)/\sigma$, so that $dx=\sigma\,dy$, this becomes
\begin{align*}
\|\psi\|_{L^2}^2=(\pi\sigma^2)^{-1/2}\sigma\int_{\mathbb R}e^{-y^2}\,dy.
\end{align*}
Using the standard [Gaussian integral](/theorems/1140) $\int_{\mathbb R}e^{-y^2}\,dy=\sqrt\pi$, we get
\begin{align*}
\|\psi\|_{L^2}^2=(\pi\sigma^2)^{-1/2}\sigma\sqrt\pi=1.
\end{align*}
Thus $\|\psi\|_{L^2}=1$, so $\psi$ is a normalised state.
The density
\begin{align*}
|\psi(x)|^2=(\pi\sigma^2)^{-1/2}\exp\left(-\frac{(x-x_0)^2}{\sigma^2}\right)
\end{align*}
depends only on $(x-x_0)^2$, so it is symmetric about $x_0$. The phase factor $e^{ip_0x}$ has spatial frequency $p_0$, since
\begin{align*}
-i\frac{d}{dx}e^{ip_0x}=p_0e^{ip_0x}.
\end{align*}
The packet is therefore localised around $x_0$ in position, while its oscillatory carrier phase selects the momentum scale $p_0$.
[/example]
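As a sanity check on the normalisation, one can integrate $|\psi|^2$ numerically. The sketch below is an illustration under stated assumptions: the parameter values $\sigma=0.7$, $x_0=1.5$, $p_0=2$ are arbitrary, the midpoint rule `integrate` is a hypothetical helper, and the real line is truncated to an interval on which the Gaussian tails are negligible.

```python
import cmath
import math

def psi(x, sigma=0.7, x0=1.5, p0=2.0):
    """Gaussian wave packet from the example; parameter values are illustrative."""
    prefactor = (math.pi * sigma ** 2) ** -0.25
    return prefactor * cmath.exp(-(x - x0) ** 2 / (2 * sigma ** 2) + 1j * p0 * x)

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

# Truncate R to [-10, 13], roughly 16 standard widths either side of x0.
norm_sq = integrate(lambda x: abs(psi(x)) ** 2, -10.0, 13.0)
mean_x = integrate(lambda x: x * abs(psi(x)) ** 2, -10.0, 13.0)
print(norm_sq, mean_x)   # close to 1 and to x0 respectively
```

The second integral is the mean of the position density, which equals $x_0$ by the symmetry noted in the example; the phase parameter $p_0$ drops out of both integrals because it does not affect $|\psi|^2$.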
The Gaussian example previews the two main state spaces of the course: finite-dimensional spaces for spin and matrix models, and $L^2$ spaces for wave functions. Later chapters add domain restrictions for differential operators acting on $L^2(\mathbb R)$.
## Rays, Superposition, and Pure States
If a vector represents a state, which changes of the vector alter physical predictions? Multiplication by a nonzero complex scalar changes neither the relative amplitudes nor the probabilities after normalisation. The physical object associated with a nonzero vector is therefore a one-dimensional complex subspace.
[definition: Ray]
Let $H$ be a complex Hilbert space. A ray in $H$ is a one-dimensional complex subspace of $H$.
[/definition]
For a nonzero vector $\psi\in H$, the corresponding ray is
\begin{align*}
[\psi] = \{\lambda \psi : \lambda \in \mathbb C\}.
\end{align*}
When $\|\psi\|_H=1$, the unit vectors in $[\psi]$ are exactly the vectors $e^{i\theta}\psi$ with $\theta\in\mathbb R$, and they all represent the same ray. This motivates taking the ray, represented by any of its unit vectors, as the basic pure preparation: passing from the vector to the ray discards only the physically irrelevant global scalar.
[definition: Pure State]
A pure state on a complex Hilbert space $H$ is a ray $[\psi]$ represented by a unit vector $\psi\in H$.
[/definition]
Pure states are the extremal states of the theory: they contain maximal preparation information. The next feature, superposition, is a vector-space operation before passing to rays, and it is the main structural distinction between quantum and classical state spaces.
[definition: Superposition]
Let $H$ be a complex Hilbert space and let $\psi,\varphi\in H$ be nonzero vectors. A superposition of $\psi$ and $\varphi$ is a nonzero vector of the form
\begin{align*}
\alpha \psi + \beta \varphi,
\end{align*}
where $\alpha,\beta\in\mathbb C$.
[/definition]
The relative phase between $\alpha$ and $\beta$ can affect later measurement probabilities, even though a common phase multiplying the whole vector cannot. This is the algebraic origin of interference.
[example: Spin-$1/2$ Stern-Gerlach Preparation]
Take $H=\mathbb C^2$ with orthonormal $z$-spin basis $e_+,e_-$. The vector
\begin{align*}
\psi_x=\frac{1}{\sqrt2}(e_+ + e_-)
\end{align*}
is normalised because
\begin{align*}
(\psi_x,\psi_x)_H=\frac{1}{2}(e_+ + e_-,e_+ + e_-)_H=\frac{1}{2}(1+0+0+1)=1.
\end{align*}
Relative to the $z$-basis, its coefficients are both $1/\sqrt2$, so the squared coefficient moduli are
\begin{align*}
\left|\frac{1}{\sqrt2}\right|^2=\frac12
\end{align*}
and
\begin{align*}
\left|\frac{1}{\sqrt2}\right|^2=\frac12.
\end{align*}
Thus a Stern-Gerlach apparatus aligned with the $z$-axis gives the outcomes $+$ and $-$ with probabilities $1/2$ and $1/2$.
For the $x$-axis measurement, use the orthonormal $x$-spin basis
\begin{align*}
e_x^+=\frac{1}{\sqrt2}(e_+ + e_-), \qquad e_x^-=\frac{1}{\sqrt2}(e_+ - e_-).
\end{align*}
Here $\psi_x=e_x^+$, so its coefficient of $e_x^+$ is $1$ and its coefficient of $e_x^-$ is $0$. The corresponding probabilities are
\begin{align*}
|1|^2=1
\end{align*}
and
\begin{align*}
|0|^2=0.
\end{align*}
So the same vector is a superposition relative to the $z$-spin basis but a definite $+$ state relative to the $x$-spin basis.
[/example]
The example also warns against interpreting superposition as ignorance about a hidden alternative. A vector such as $(e_+ + e_-)/\sqrt2$ is not the same object as a probability mixture that selects $e_+$ or $e_-$ with equal probabilities.
[remark: Global and Relative Phase]
The vectors $\psi$ and $e^{i\theta}\psi$ determine the same pure state. By contrast, $e_+ + e_-$ and $e_+ - e_-$ represent different rays in $\mathbb C^2$, because the relative phase between the two basis components has changed. Measurements in a rotated basis can distinguish these two rays.
[/remark]
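The difference between a global and a relative phase can be made concrete by computing outcome probabilities in the $z$- and $x$-spin bases. This is a minimal sketch; the helper `probabilities` and the variable names are assumptions introduced for illustration.

```python
import math

def inner(u, v):
    """Inner product on C^2: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def probabilities(psi, basis):
    """Squared moduli of the coefficients (psi, e) for e in an orthonormal basis."""
    return [abs(inner(psi, e)) ** 2 for e in basis]

s = 1 / math.sqrt(2)
z_basis = [(1, 0), (0, 1)]           # e_+, e_-
x_basis = [(s, s), (s, -s)]          # e_x^+, e_x^-

psi_sum = (s, s)                     # (e_+ + e_-)/sqrt(2)
psi_diff = (s, -s)                   # (e_+ - e_-)/sqrt(2): relative phase changed
psi_phase = (1j * s, 1j * s)         # i * psi_sum: only a global phase

for name, psi in [("sum", psi_sum), ("diff", psi_diff), ("phase", psi_phase)]:
    print(name, probabilities(psi, z_basis), probabilities(psi, x_basis))
```

All three vectors give probabilities $1/2,1/2$ in the $z$-basis. In the $x$-basis, `psi_sum` and `psi_phase` agree, while `psi_diff` gives the opposite outcome with certainty.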
This distinction between coherent superposition and probabilistic mixing motivates the density operator formalism. It gives a single language for pure preparations, mixed preparations, and statistical ensembles.
## Mixed States and Density Operators
How should the theory represent a laboratory preparation that produces different pure states with specified classical probabilities? Vectors alone do not distinguish coherent superposition from classical randomisation. Density operators encode both cases and also give the correct formula for expectations.
A bounded operator $A$ on $H$ is called trace class if its absolute trace is finite: writing $|A|=(A^*A)^{1/2}$ and choosing an orthonormal basis $(e_n)$, this means
\begin{align*}
\sum_n (|A|e_n,e_n)_H<\infty.
\end{align*}
For such operators the trace $\operatorname{tr}(A)=\sum_n(Ae_n,e_n)_H$ is well defined and independent of the orthonormal basis. An operator $\rho$ is positive when $(\rho u,u)_H\ge0$ for every $u\in H$.
[definition: Density Operator]
Let $H$ be a complex Hilbert space. A density operator on $H$ is an element $\rho$ of the trace-class operators $\mathcal B_1(H)$ such that $\rho:H\to H$ is positive and
\begin{align*}
\operatorname{tr}(\rho)=1.
\end{align*}
[/definition]
In finite dimensions, a density operator is a positive semidefinite Hermitian matrix with trace $1$. The trace condition normalises total probability, while positivity ensures that the expected value of every projection is nonnegative. To connect the ray picture with this operator picture, we first need the projection associated with a unit vector.
[definition: Pure-State Projection]
Let $\psi\in H$ be a unit vector. The pure-state projection associated with $\psi$ is the operator $P_\psi:H\to H$ defined by
\begin{align*}
P_\psi u=(u,\psi)_H\psi \qquad \text{for all } u\in H.
\end{align*}
[/definition]
The operator $P_\psi$ depends only on the ray $[\psi]$, since replacing $\psi$ by $e^{i\theta}\psi$ gives the same map. Thus rays and rank-one orthogonal projections are equivalent descriptions of pure states.
[example: Coherent State Versus Equal Mixture]
In $H=\mathbb C^2$, set
\begin{align*}
\eta=\frac{1}{\sqrt2}(e_+ + e_-).
\end{align*}
We compare the pure coherent density operator $\rho_{\mathrm{coh}}=P_\eta$ with the equal mixture
\begin{align*}
\rho_{\mathrm{mix}}=\frac12P_{e_+}+\frac12P_{e_-}.
\end{align*}
By the definition of a pure-state projection, $P_\eta u=(u,\eta)_H\eta$. Since $(e_+,\eta)_H=1/\sqrt2$ and $(e_-,\eta)_H=1/\sqrt2$, we get
\begin{align*}
\rho_{\mathrm{coh}}e_+=P_\eta e_+=\frac{1}{\sqrt2}\eta=\frac12e_+ + \frac12e_-.
\end{align*}
Similarly,
\begin{align*}
\rho_{\mathrm{coh}}e_-=P_\eta e_-=\frac{1}{\sqrt2}\eta=\frac12e_+ + \frac12e_-.
\end{align*}
Thus, with respect to the ordered basis $e_+,e_-$, each basis vector is sent to a vector whose two coefficients are both $1/2$, so all four matrix entries of $\rho_{\mathrm{coh}}$ are $1/2$.
For the mixture, $P_{e_+}e_+=e_+$, $P_{e_+}e_-=0$, $P_{e_-}e_+=0$, and $P_{e_-}e_-=e_-$. Hence
\begin{align*}
\rho_{\mathrm{mix}}e_+=\frac12e_+.
\end{align*}
Also,
\begin{align*}
\rho_{\mathrm{mix}}e_-=\frac12e_-.
\end{align*}
So $\rho_{\mathrm{mix}}$ has diagonal entries $1/2,1/2$ and off-diagonal entries $0,0$ in the $e_+,e_-$ basis.
For the $z$-spin measurement, the projections are $P_{e_+}$ and $P_{e_-}$. In the coherent state,
\begin{align*}
\|P_{e_+}\eta\|_H^2=\left\|\frac{1}{\sqrt2}e_+\right\|_H^2=\frac12.
\end{align*}
Likewise,
\begin{align*}
\|P_{e_-}\eta\|_H^2=\left\|\frac{1}{\sqrt2}e_-\right\|_H^2=\frac12.
\end{align*}
For the mixture, the trace probabilities are
\begin{align*}
\operatorname{tr}(\rho_{\mathrm{mix}}P_{e_+})=\operatorname{tr}\left(\frac12P_{e_+}\right)=\frac12.
\end{align*}
Similarly,
\begin{align*}
\operatorname{tr}(\rho_{\mathrm{mix}}P_{e_-})=\operatorname{tr}\left(\frac12P_{e_-}\right)=\frac12.
\end{align*}
Thus the two preparations give the same $z$-spin probabilities.
Now let
\begin{align*}
e_x^+=\frac{1}{\sqrt2}(e_+ + e_-), \qquad e_x^-=\frac{1}{\sqrt2}(e_+ - e_-).
\end{align*}
Since $\eta=e_x^+$, the coherent state gives
\begin{align*}
\|P_{e_x^+}\eta\|_H^2=\|\eta\|_H^2=1.
\end{align*}
It also gives
\begin{align*}
\|P_{e_x^-}\eta\|_H^2=0.
\end{align*}
For the mixture, $(e_+,e_x^+)_H=1/\sqrt2$ and $(e_-,e_x^+)_H=1/\sqrt2$, so
\begin{align*}
(P_{e_x^+}e_+,e_+)_H=\left(\frac{1}{\sqrt2}e_x^+,e_+\right)_H=\frac12.
\end{align*}
Also,
\begin{align*}
(P_{e_x^+}e_-,e_-)_H=\left(\frac{1}{\sqrt2}e_x^+,e_-\right)_H=\frac12.
\end{align*}
Therefore
\begin{align*}
\operatorname{tr}(\rho_{\mathrm{mix}}P_{e_x^+})=\frac12\cdot\frac12+\frac12\cdot\frac12=\frac12.
\end{align*}
The same calculation with $e_x^-$ gives $\operatorname{tr}(\rho_{\mathrm{mix}}P_{e_x^-})=1/2$. Thus $\rho_{\mathrm{coh}}$ and $\rho_{\mathrm{mix}}$ agree on the $z$-basis populations but differ in the $x$-basis; the off-diagonal entries of $\rho_{\mathrm{coh}}$ are exactly the phase-coherence data lost in the equal mixture.
[/example]
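The comparison in the example can be reproduced with explicit $2\times2$ matrices. In the sketch below the helpers `projector`, `matmul`, `trace`, and `prob` are hypothetical names; `projector` uses the fact that the matrix of $P_\psi$ in the standard basis has entries $\psi_i\overline{\psi_j}$, which follows from $P_\psi u=(u,\psi)_H\psi$.

```python
import math

def projector(psi):
    """Matrix of P_psi u = (u, psi) psi, with entries psi_i * conj(psi_j)."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

def prob(rho, psi):
    """Outcome probability tr(rho P_psi) for the rank-one projection P_psi."""
    return trace(matmul(rho, projector(psi))).real

s = 1 / math.sqrt(2)
e_plus, e_minus = (1.0, 0.0), (0.0, 1.0)
ex_plus, ex_minus = (s, s), (s, -s)

rho_coh = projector(ex_plus)         # eta = e_x^+, all entries 1/2
rho_mix = [[0.5 * p + 0.5 * q for p, q in zip(r1, r2)]
           for r1, r2 in zip(projector(e_plus), projector(e_minus))]

for label, rho in [("coherent", rho_coh), ("mixture", rho_mix)]:
    print(label,
          [prob(rho, e) for e in (e_plus, e_minus)],     # z-basis populations
          [prob(rho, e) for e in (ex_plus, ex_minus)])   # x-basis populations
```

Both operators give $1/2,1/2$ in the $z$-basis, while only `rho_coh` gives $1,0$ in the $x$-basis.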
The matrix comparison shows the purpose of density operators: diagonal entries encode populations in a chosen basis, while off-diagonal entries encode coherences relative to that basis. It remains to check that the usual statistical construction from a random ensemble really produces an operator satisfying the density conditions.
[quotetheorem:6914]
[citeproof:6914]
An ensemble decomposition of a density operator is generally not unique, but the operator itself determines all measurement statistics. The hypotheses explain why this construction is safe: unit vectors make each $P_{\psi_j}$ have trace $1$, nonnegative probabilities preserve positivity, and $\sum_j p_j=1$ gives the correct trace normalisation. If a vector is not normalised, then the trace contribution becomes $p_j\|\psi_j\|_H^2$; if a coefficient is negative, positivity can fail even in $\mathbb C^2$; if the coefficients sum to a number different from $1$, the operator has the wrong total probability. Finite-dimensionality lets the proof use ordinary matrix trace without convergence questions; in infinite dimension the analogous statement requires trace-class operators and summability assumptions. For this reason the density operator, rather than a chosen list of weighted vectors, is the intrinsic mixed state.
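Both the non-uniqueness of ensembles and the role of nonnegative weights can be seen in $\mathbb C^2$. The sketch below is an illustration with hypothetical helpers `projector` and `mix`: it builds the equal mixture of the $z$-basis states and the equal mixture of the $x$-basis states, and then an "ensemble" with a negative weight.

```python
import math

def projector(psi):
    """Matrix of P_psi u = (u, psi) psi, with entries psi_i * conj(psi_j)."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def mix(ensemble):
    """Return sum_j p_j P_{psi_j} for a list of (p_j, psi_j) pairs in C^2."""
    rho = [[0, 0], [0, 0]]
    for p, psi in ensemble:
        P = projector(psi)
        for i in range(2):
            for j in range(2):
                rho[i][j] += p * P[i][j]
    return rho

s = 1 / math.sqrt(2)
rho_z = mix([(0.5, (1.0, 0.0)), (0.5, (0.0, 1.0))])   # equal mixture of e_+, e_-
rho_x = mix([(0.5, (s, s)), (0.5, (s, -s))])          # equal mixture of e_x^+, e_x^-

# Weights sum to 1, but one is negative: trace 1 holds, positivity fails.
rho_bad = mix([(1.5, (1.0, 0.0)), (-0.5, (0.0, 1.0))])

print(rho_z, rho_x, rho_bad)
```

The two equal mixtures produce the same matrix, half the identity, so no measurement can tell the two preparations apart. For `rho_bad` the diagonal entry $(\rho e_-,e_-)_H=-1/2$ is negative, so it is not a density operator.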
## Observables as Self-Adjoint Operators
What kind of operator should represent a measurable quantity? Measurement outcomes must be real, and the expectation of the observable in any state must be real. In Hilbert space, the operator condition enforcing this is self-adjointness.
[definition: Self-Adjoint Operator in Finite Dimension]
Let $H$ be a finite-dimensional complex Hilbert space. A linear operator $A:H\to H$ is self-adjoint if
\begin{align*}
(Au,v)_H = (u,Av)_H
\end{align*}
for all $u,v\in H$.
[/definition]
In finite dimension this is the same as saying that the matrix of $A$ in an orthonormal basis is Hermitian. Later, for unbounded operators such as momentum and Hamiltonians on $L^2(\mathbb R)$, self-adjointness will require a precise domain condition.
[definition: Observable]
In the finite-dimensional model, an observable on a complex Hilbert space $H$ is a self-adjoint linear operator $A:H\to H$.
[/definition]
An observable is useful only if it gives a concrete list of possible measurement values and the subspaces associated with them. This need motivates the finite-dimensional spectral theorem, which decomposes a self-adjoint operator into real eigenvalues and orthogonal projections.
[quotetheorem:6915]
The projections $P_k$ are the mathematical representatives of the possible measurement alternatives. The number $a_k$ is the outcome, and $P_k$ extracts the part of the state belonging to that outcome. Self-adjointness is essential: for example, the non-normal operator on $\mathbb C^2$ sending $e_1$ to $e_1$ and $e_2$ to $e_1+e_2$ has no orthonormal eigenbasis and cannot be written as a sum of real eigenvalues times mutually orthogonal projections. Finite-dimensionality is also part of the statement; in infinite dimension a self-adjoint operator may have continuous spectrum, and an unbounded operator is not even defined on all of $H$. The finite theorem is therefore the discrete model for the spectral-measure language developed later.
[example: Two-Level Hamiltonian]
Let $H=\mathbb C^2$ with standard orthonormal basis $e_1=(1,0)$ and $e_2=(0,1)$, and define $H_0$ by
\begin{align*}
H_0e_1=E_1e_1,\qquad H_0e_2=E_2e_2.
\end{align*}
For $u=ae_1+be_2$ and $v=ce_1+de_2$, linearity gives
\begin{align*}
H_0u=aE_1e_1+bE_2e_2.
\end{align*}
Hence
\begin{align*}
(H_0u,v)_H=aE_1\overline c+bE_2\overline d.
\end{align*}
Also,
\begin{align*}
H_0v=cE_1e_1+dE_2e_2.
\end{align*}
Since $E_1,E_2\in\mathbb R$,
\begin{align*}
(u,H_0v)_H=a\overline{cE_1}+b\overline{dE_2}=aE_1\overline c+bE_2\overline d.
\end{align*}
Thus $(H_0u,v)_H=(u,H_0v)_H$ for all $u,v\in H$, so $H_0$ is self-adjoint and is therefore a finite-dimensional observable.
The vectors $e_1$ and $e_2$ are eigenvectors because
\begin{align*}
H_0e_1=E_1e_1
\end{align*}
and
\begin{align*}
H_0e_2=E_2e_2.
\end{align*}
If $\psi=\alpha e_1+\beta e_2$ is normalised, then
\begin{align*}
1=\|\psi\|_H^2=(\psi,\psi)_H=|\alpha|^2+|\beta|^2.
\end{align*}
Let $P_1$ and $P_2$ be the orthogonal projections onto $\mathbb C e_1$ and $\mathbb C e_2$. Then
\begin{align*}
P_1\psi=\alpha e_1.
\end{align*}
Therefore
\begin{align*}
\|P_1\psi\|_H^2=(\alpha e_1,\alpha e_1)_H=\alpha\overline\alpha(e_1,e_1)_H=|\alpha|^2.
\end{align*}
Similarly,
\begin{align*}
P_2\psi=\beta e_2.
\end{align*}
Thus
\begin{align*}
\|P_2\psi\|_H^2=(\beta e_2,\beta e_2)_H=\beta\overline\beta(e_2,e_2)_H=|\beta|^2.
\end{align*}
When $E_1$ and $E_2$ are distinct, the projection postulate therefore assigns probability $|\alpha|^2$ to the energy $E_1$ and probability $|\beta|^2$ to the energy $E_2$. If $E_1=E_2$, both eigenspaces correspond to the same energy value, and the total probability of that single outcome is $|\alpha|^2+|\beta|^2=1$.
[/example]
The two-level model shows how spectral projections convert an operator into actual measurement statistics. The next postulate turns that conversion into the finite-spectrum measurement rule and records how the state changes after an outcome is observed.
[explanation: Projection Postulate for Finite Spectra]
Let $A=\sum_{k=1}^m a_kP_k$ be a finite-dimensional observable with distinct eigenvalues $a_k$ and spectral projections $P_k$. If the system is in the pure state represented by a unit vector $\psi\in H$, then the probability of obtaining the outcome $a_k$ is
\begin{align*}
\mathbb P(a_k)=\|P_k\psi\|_H^2.
\end{align*}
If this outcome is obtained and $P_k\psi\ne 0$, the post-measurement pure state is represented by
\begin{align*}
\frac{P_k\psi}{\|P_k\psi\|_H}.
\end{align*}
For a density operator $\rho$, the probability of obtaining $a_k$ is $\operatorname{tr}(\rho P_k)$, and the corresponding post-measurement state is
\begin{align*}
\frac{P_k\rho P_k}{\operatorname{tr}(\rho P_k)}
\end{align*}
whenever $\operatorname{tr}(\rho P_k)>0$.
[/explanation]
This postulate is stated as part of the quantum formalism rather than proved from the Hilbert space axioms. Its scope is the ideal finite-spectrum projective measurement model: the projections must be the orthogonal spectral projections of a self-adjoint observable, the vector state must be normalised, and the displayed conditional state is defined only when the observed outcome has positive probability. If arbitrary nonorthogonal projections were substituted, the squared lengths would not generally form an additive probability distribution; if $P_k\psi=0$, conditioning on the outcome has no state to normalise. The postulate does not cover generalised measurements, finite detector resolution, or the domain questions that appear for unbounded observables. It will nevertheless be the rule used throughout the finite-dimensional parts of the course, and later spectral measures extend the same idea beyond a finite list of outcomes.
[example: Stern-Gerlach Measurement Probabilities]
In the $z$-basis of $\mathbb C^2$, define the two orthogonal projections by
\begin{align*}
P_+e_+=e_+,\quad P_+e_-=0,\quad P_-e_+=0,\quad P_-e_-=e_-.
\end{align*}
For the state
\begin{align*}
\psi_x=\frac{1}{\sqrt2}(e_+ + e_-),
\end{align*}
linearity of $P_+$ gives
\begin{align*}
P_+\psi_x=P_+\left(\frac{1}{\sqrt2}e_+ + \frac{1}{\sqrt2}e_-\right)=\frac{1}{\sqrt2}P_+e_+ + \frac{1}{\sqrt2}P_+e_-.
\end{align*}
Using the defining values of $P_+$,
\begin{align*}
P_+\psi_x=\frac{1}{\sqrt2}e_+ + \frac{1}{\sqrt2}0=\frac{1}{\sqrt2}e_+.
\end{align*}
Since $(e_+,e_+)_H=1$,
\begin{align*}
\|P_+\psi_x\|_H^2=\left(\frac{1}{\sqrt2}e_+,\frac{1}{\sqrt2}e_+\right)_H=\frac{1}{\sqrt2}\overline{\frac{1}{\sqrt2}}(e_+,e_+)_H=\frac12.
\end{align*}
Similarly, linearity of $P_-$ gives
\begin{align*}
P_-\psi_x=P_-\left(\frac{1}{\sqrt2}e_+ + \frac{1}{\sqrt2}e_-\right)=\frac{1}{\sqrt2}P_-e_+ + \frac{1}{\sqrt2}P_-e_-.
\end{align*}
Using the defining values of $P_-$,
\begin{align*}
P_-\psi_x=\frac{1}{\sqrt2}0 + \frac{1}{\sqrt2}e_-=\frac{1}{\sqrt2}e_-.
\end{align*}
Since $(e_-,e_-)_H=1$,
\begin{align*}
\|P_-\psi_x\|_H^2=\left(\frac{1}{\sqrt2}e_-,\frac{1}{\sqrt2}e_-\right)_H=\frac{1}{\sqrt2}\overline{\frac{1}{\sqrt2}}(e_-,e_-)_H=\frac12.
\end{align*}
Thus a $z$-spin measurement in the state $\psi_x$ gives the $+$ and $-$ outcomes with probabilities $1/2$ and $1/2$. If the $+$ outcome is obtained, the projection postulate represents the new state by
\begin{align*}
\frac{P_+\psi_x}{\|P_+\psi_x\|_H}=\frac{(1/\sqrt2)e_+}{\sqrt{1/2}}=e_+.
\end{align*}
A repeated $z$-measurement then has $+$ probability
\begin{align*}
\|P_+e_+\|_H^2=\|e_+\|_H^2=(e_+,e_+)_H=1.
\end{align*}
The first measurement therefore turns the superposed state $\psi_x$ into the sharp $z$-spin state corresponding to the observed outcome.
[/example]
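The projection postulate for a pure state amounts to a short procedure: project, read off the squared norm, and renormalise when the outcome has positive probability. The following sketch implements it for the Stern-Gerlach example; the function name `measure` and its return convention are assumptions made for illustration.

```python
import math

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def norm(v):
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def measure(P, psi):
    """Return (probability, post-measurement unit vector) for the projection P.

    The state is None when the outcome has probability zero, since there is
    then nothing to normalise.
    """
    projected = apply(P, psi)
    length = norm(projected)
    if length == 0:
        return 0.0, None
    return length ** 2, [c / length for c in projected]

s = 1 / math.sqrt(2)
P_plus = [[1, 0], [0, 0]]
P_minus = [[0, 0], [0, 1]]
psi_x = [s, s]

p_first, post = measure(P_plus, psi_x)     # first z-measurement, outcome +
p_repeat, _ = measure(P_plus, post)        # repeated z-measurement, outcome +
p_flip, no_state = measure(P_minus, post)  # repeated z-measurement, outcome -
print(p_first, post, p_repeat, p_flip, no_state)
```

The first measurement yields $+$ with probability $1/2$ and leaves the state $e_+$; a repetition then yields $+$ with probability $1$ and $-$ with probability $0$.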
Repeated measurement is therefore not passive observation: the first measurement changes the state unless the state was already in the measured eigenspace. This prepares the ground for the distinction between compatible and incompatible observables.
## Expectation, Variance, and Compatible Observables
How do we summarise the statistics of an observable without listing every outcome? As in probability theory, the first two summary quantities are expectation and variance. In quantum mechanics they are computed from the operator and the state.
[definition: Expectation of an Observable]
Let $A$ be a finite-dimensional observable on $H$. For a unit vector $\psi\in H$, the expectation of $A$ in the pure state $[\psi]$ is
\begin{align*}
\mathbb E_\psi[A] = (A\psi,\psi)_H.
\end{align*}
For a density operator $\rho$, the expectation of $A$ in the mixed state $\rho$ is
\begin{align*}
\mathbb E_\rho[A]=\operatorname{tr}(\rho A).
\end{align*}
[/definition]
Self-adjointness ensures these expectations are real, but the formula should also match the ordinary average of all measurement outcomes. This motivates deriving the [expectation from spectral probabilities](/theorems/6916), connecting the operator expression with the probabilities from the projection postulate.
[quotetheorem:6916]
The hypotheses in this calculation are doing real work. Self-adjointness makes the eigenvalues real and turns the spectral coefficients into ordinary real-valued measurement outcomes; for a non-self-adjoint matrix, the number $(A\psi,\psi)_H$ can be complex and has no direct interpretation as an average of real outcomes. Orthogonal spectral projections are also needed, since the identity $(P_k\psi,\psi)_H=\|P_k\psi\|_H^2$ uses $P_k=P_k^*=P_k^2$. Finite-dimensionality avoids convergence and trace-domain issues; the formula does not assert that every operator expression is a measurable observable in infinite-dimensional quantum mechanics. Expectation gives the centre of the outcome distribution but says nothing about whether repeated measurements cluster near that centre or spread across several eigenvalues. This motivates variance, defined through the deviation operator obtained by subtracting the expectation from the observable.
[definition: Variance of an Observable]
Let $A$ be a finite-dimensional observable and let $\psi\in H$ be a unit vector. The variance of $A$ in the state $\psi$ is
\begin{align*}
(\Delta_\psi A)^2 = \|(A-\mathbb E_\psi[A]I)\psi\|_H^2.
\end{align*}
For a density operator $\rho$, the variance is
\begin{align*}
(\Delta_\rho A)^2 = \operatorname{tr}(\rho A^2)-\bigl(\operatorname{tr}(\rho A)\bigr)^2.
\end{align*}
[/definition]
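The two definitions can be compared with the outcome statistics in a concrete case. The sketch below assumes a hypothetical diagonal observable with eigenvalues $a_1=1$, $a_2=3$ and the unit vector $\psi=(0.6,\,0.8i)$; it computes the expectation and variance from the operator formulas and again from the probabilities $\|P_k\psi\|_H^2$.

```python
def inner(u, v):
    """Inner product on C^2: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

a1, a2 = 1.0, 3.0                    # illustrative eigenvalues
A = [[a1, 0], [0, a2]]               # observable, diagonal in e_1, e_2
psi = [0.6, 0.8j]                    # unit vector: 0.36 + 0.64 = 1

# Operator formulas from the definitions.
A_psi = apply(A, psi)
expectation = inner(A_psi, psi).real
deviation = [x - expectation * c for x, c in zip(A_psi, psi)]
variance = inner(deviation, deviation).real

# Same quantities from the outcome probabilities ||P_k psi||^2.
p1, p2 = abs(psi[0]) ** 2, abs(psi[1]) ** 2
spectral_mean = a1 * p1 + a2 * p2
spectral_variance = (a1 ** 2 * p1 + a2 ** 2 * p2) - spectral_mean ** 2

print(expectation, spectral_mean, variance, spectral_variance)
```

For this state both routes give expectation $2.28$ and variance $0.9216$, up to rounding.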
A state has zero variance for $A$ precisely when it lies in an eigenspace of $A$. This raises the central joint-measurement question and motivates the definition of compatibility: when can two observables have a common basis of states in which both have sharp values?
[definition: Compatible Observables]
Let $A$ and $B$ be finite-dimensional observables on $H$. They are compatible if
\begin{align*}
AB=BA.
\end{align*}
[/definition]
Compatibility is expressed by an algebraic equation, but its physical meaning is simultaneous diagonalisation. The key step is that $AB=BA$ forces each eigenspace of one observable to be invariant under the other; without this, the two measurements would still compete for different bases. What is needed is a basis-level statement: commuting finite-dimensional self-adjoint operators have enough common eigenvectors to support a joint measurement interpretation.
[quotetheorem:2403]
[citeproof:2403]
This theorem makes compatibility operational: measuring one observable does not force us out of the eigenspaces relevant to the other. The commutation hypothesis is necessary: the Pauli matrices $\sigma_x$ and $\sigma_z$ have no common eigenvector, reflecting the fact that sharp $x$-spin and sharp $z$-spin cannot be assigned in the same basis. Self-adjointness, or at least diagonalisability, is also necessary; a nontrivial Jordan block commutes with the identity but has no eigenbasis, orthonormal or otherwise. The theorem is finite-dimensional, while in infinite dimension simultaneous diagonalisation is replaced by a stronger spectral-measure condition, often called strong commutation. The need to quantify the opposite case motivates the following definition, which measures the failure of two observables to commute.
[definition: Commutator]
For linear operators $A,B:H\to H$, their commutator is
\begin{align*}
[A,B]:H\to H,\qquad [A,B]=AB-BA.
\end{align*}
[/definition]
The commutator vanishes exactly when the two observables are compatible in the finite-dimensional setting. A nonzero commutator motivates an uncertainty estimate: the variances of $A$ and $B$ cannot both be arbitrarily small in a state where the commutator has nonzero expectation.
[quotetheorem:6917]
[citeproof:6917]
This is the finite-dimensional prototype of the position-momentum uncertainty principle. The self-adjointness assumptions ensure that the variances are real nonnegative quantities and that the commutator expectation has the skew-adjoint form used in the proof; for non-self-adjoint operators the same expression need not describe measurement uncertainty. The common-domain issue is hidden in finite dimension: in infinite dimension, $A\psi$, $B\psi$, $AB\psi$, and $BA\psi$ may not all exist for the same vector $\psi$. The inequality also does not say that noncommuting observables always have large uncertainty in every state, since the lower bound can vanish when the commutator expectation vanishes in that state. In later chapters the same argument must be revisited with domains, since products of unbounded operators are not automatically defined on the same vectors.
[example: Pauli Matrices and Spin Uncertainty]
Let $e_+,e_-$ be the standard orthonormal basis of $\mathbb C^2$, and let $\sigma_x$, $\sigma_y$, and $\sigma_z$ be defined by
\begin{align*}
\sigma_x e_+ = e_-, \quad \sigma_x e_- = e_+, \quad \sigma_y e_+ = i e_-, \quad \sigma_y e_- = -i e_+, \quad \sigma_z e_+ = e_+, \quad \sigma_z e_- = -e_-.
\end{align*}
We first compute the commutator on the basis vectors. On $e_+$,
\begin{align*}
\sigma_x\sigma_y e_+=\sigma_x(i e_-)=i\sigma_x e_-=i e_+.
\end{align*}
Also,
\begin{align*}
\sigma_y\sigma_x e_+=\sigma_y e_-=-i e_+.
\end{align*}
Hence
\begin{align*}
[\sigma_x,\sigma_y]e_+=\sigma_x\sigma_y e_+-\sigma_y\sigma_x e_+=i e_+-(-i e_+)=2i e_+=2i\sigma_z e_+.
\end{align*}
On $e_-$,
\begin{align*}
\sigma_x\sigma_y e_-=\sigma_x(-i e_+)=-i\sigma_x e_+=-i e_-.
\end{align*}
Also,
\begin{align*}
\sigma_y\sigma_x e_-=\sigma_y e_+=i e_-.
\end{align*}
Thus
\begin{align*}
[\sigma_x,\sigma_y]e_-=-i e_- - i e_-=-2i e_-=2i\sigma_z e_-.
\end{align*}
Since the two linear operators agree on the basis $e_+,e_-$,
\begin{align*}
[\sigma_x,\sigma_y]=2i\sigma_z.
\end{align*}
In the state $e_+$, the $z$-spin is sharp because
\begin{align*}
\sigma_z e_+=e_+.
\end{align*}
Therefore
\begin{align*}
\mathbb E_{e_+}[\sigma_z]=(\sigma_z e_+,e_+)_H=(e_+,e_+)_H=1,
\end{align*}
and
\begin{align*}
(\Delta_{e_+}\sigma_z)^2=\|(\sigma_z-I)e_+\|_H^2=\|e_+-e_+\|_H^2=0.
\end{align*}
For $\sigma_x$,
\begin{align*}
\mathbb E_{e_+}[\sigma_x]=(\sigma_x e_+,e_+)_H=(e_-,e_+)_H=0.
\end{align*}
Hence
\begin{align*}
(\Delta_{e_+}\sigma_x)^2=\|(\sigma_x-0I)e_+\|_H^2=\|e_-\|_H^2=(e_-,e_-)_H=1.
\end{align*}
For $\sigma_y$,
\begin{align*}
\mathbb E_{e_+}[\sigma_y]=(\sigma_y e_+,e_+)_H=(i e_-,e_+)_H=i(e_-,e_+)_H=0.
\end{align*}
Thus
\begin{align*}
(\Delta_{e_+}\sigma_y)^2=\|(\sigma_y-0I)e_+\|_H^2=\|i e_-\|_H^2=(i e_-,i e_-)_H=i\overline{i}(e_-,e_-)_H=1.
\end{align*}
So $\Delta_{e_+}\sigma_x=1$ and $\Delta_{e_+}\sigma_y=1$.
The *[Robertson Uncertainty Inequality](/theorems/6917)* gives
\begin{align*}
\Delta_{e_+}\sigma_x\,\Delta_{e_+}\sigma_y \ge \frac12\left|\big(([\sigma_x,\sigma_y]e_+),e_+\big)_H\right|.
\end{align*}
Using $[\sigma_x,\sigma_y]=2i\sigma_z$ and $\sigma_z e_+=e_+$,
\begin{align*}
\big(([\sigma_x,\sigma_y]e_+),e_+\big)_H=(2i e_+,e_+)_H=2i(e_+,e_+)_H=2i.
\end{align*}
Therefore
\begin{align*}
\frac12\left|\big(([\sigma_x,\sigma_y]e_+),e_+\big)_H\right|=\frac12|2i|=1.
\end{align*}
The left side is also
\begin{align*}
\Delta_{e_+}\sigma_x\,\Delta_{e_+}\sigma_y=1\cdot 1=1.
\end{align*}
Thus equality holds: the state $e_+$ has sharp $z$-spin, while its $x$- and $y$-spin uncertainties exactly saturate the Robertson bound.
[/example]
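The computation in the Pauli example can be replayed numerically. The following is a minimal Python sketch, assuming the identification $e_+=(1,0)$, $e_-=(0,1)$ and the standard matrix forms of $\sigma_x,\sigma_y,\sigma_z$; the helper names (`matmul`, `apply`, `inner`, `commutator`, `uncertainty`) are illustrative and not part of the course text.

```python
# 2x2 complex matrices as nested lists; e_plus = (1, 0), e_minus = (0, 1).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(a, v):
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]

def inner(u, v):
    # Linear in the first slot, conjugate-linear in the second, as in the text.
    return sum(x * complex(y).conjugate() for x, y in zip(u, v))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def uncertainty(a, psi):
    # Norm of (A - <A> I) psi for a unit vector psi.
    mean = inner(apply(a, psi), psi)
    shifted = [x - mean * p for x, p in zip(apply(a, psi), psi)]
    return abs(inner(shifted, shifted)) ** 0.5

e_plus = [1, 0]
comm = commutator(SX, SY)
lhs = uncertainty(SX, e_plus) * uncertainty(SY, e_plus)
rhs = 0.5 * abs(inner(apply(comm, e_plus), e_plus))
print(comm, lhs, rhs)
```

Under these assumptions `comm` agrees entrywise with $2i\sigma_z$, and `lhs` and `rhs` coincide, which is the saturation of the Robertson bound found by hand.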
The Pauli example closes the first chapter's circle: Hilbert spaces provide states, rays remove irrelevant global scale, density operators encode mixed preparations, self-adjoint operators encode observables, and the commutator controls incompatibility. The rest of the course develops the analytic machinery needed when the same structures are infinite-dimensional and the most important observables are unbounded.
Finite-dimensional quantum mechanics shows the algebraic shape of the theory, but it hides the analytic difficulties of the observables that matter most. The next chapter explains why position, momentum, and Hamiltonians require domains, closedness, and self-adjointness as part of their definition.
# 2. Unbounded Operators and Domains
The first chapter treated finite-dimensional observables as self-adjoint operators on all of a Hilbert space; this chapter explains what must be added when quantum observables are not bounded. Position, momentum, and Hamiltonians for differential equations are defined only on selected subspaces of the Hilbert space, and their physical meaning depends on those domains. This chapter develops the operator-theoretic language needed to decide when a formal differential expression gives a genuine observable and how boundary conditions enter that decision.
## Why Domains Matter for Quantum Observables
The guiding problem is that the same formula can describe different quantum systems when it is assigned different domains. For instance, the differential expression
\begin{align*}
-i\frac{d}{dx}
\end{align*}
has a different status on $L^2(\mathbb R)$ from its status on $L^2(0,1)$, even though the formal rule is identical. The distinction is not cosmetic: self-adjointness is the condition that produces real spectra, spectral measures, and unitary time evolution.
We begin by separating the formula for an operator from the subspace on which the formula is allowed to act.
[definition: Unbounded Operator]
Let $H$ be a complex Hilbert space. An unbounded operator on $H$ is a pair $(A,\mathcal D(A))$, where $\mathcal D(A)\subset H$ is a linear subspace and $A:\mathcal D(A)\to H$ is a [linear map](/page/Linear%20Map).
[/definition]
The domain is part of the data. If $A$ is a differential operator, then $\mathcal D(A)$ encodes differentiability, square-integrability, and boundary conditions.
[example: Multiplication by Position]
On $H=L^2(\mathbb R)$, define
\begin{align*}
Q:\mathcal D(Q)\subset L^2(\mathbb R)\to L^2(\mathbb R),\qquad (Q\psi)(x)=x\psi(x),
\end{align*}
with
\begin{align*}
\mathcal D(Q)=\{\psi\in L^2(\mathbb R):x\psi(x)\in L^2(\mathbb R)\}.
\end{align*}
The condition in the definition of $\mathcal D(Q)$ is exactly what makes $Q\psi$ an element of $L^2(\mathbb R)$.
To see that this domain is not all of $L^2(\mathbb R)$, take
\begin{align*}
\psi(x)=\frac{1}{1+|x|}.
\end{align*}
Then
\begin{align*}
\|\psi\|_{L^2}^2=\int_{\mathbb R}\frac{1}{(1+|x|)^2}\,dx=2\int_0^\infty \frac{1}{(1+x)^2}\,dx=2.
\end{align*}
Thus $\psi\in L^2(\mathbb R)$. But
\begin{align*}
\|x\psi\|_{L^2}^2=\int_{\mathbb R}\frac{x^2}{(1+|x|)^2}\,dx=2\int_0^\infty \frac{x^2}{(1+x)^2}\,dx.
\end{align*}
For $x\geq 1$ we have $1+x\leq 2x$, hence
\begin{align*}
\frac{x^2}{(1+x)^2}\geq \frac{x^2}{(2x)^2}=\frac14.
\end{align*}
Therefore
\begin{align*}
2\int_0^\infty \frac{x^2}{(1+x)^2}\,dx\geq 2\int_1^\infty \frac14\,dx=\infty.
\end{align*}
So $x\psi\notin L^2(\mathbb R)$, even though $\psi\in L^2(\mathbb R)$. The position formula $(Q\psi)(x)=x\psi(x)$ therefore defines an operator only after the domain is restricted to wave functions with enough decay at infinity.
[/example]
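The divergence can also be seen by truncating both integrals at a cutoff $R$. The sketch below uses closed forms for the truncated integrals, obtained from the elementary identity $x^2/(1+x)^2=1-2/(1+x)+1/(1+x)^2$; the function names and the cutoff values are illustrative assumptions.

```python
import math

def norm_sq_psi(R):
    # Closed form of 2 * integral_0^R dx / (1 + x)^2.
    return 2.0 * (1.0 - 1.0 / (1.0 + R))

def norm_sq_xpsi(R):
    # Closed form of 2 * integral_0^R x^2 / (1 + x)^2 dx, using
    # x^2 / (1 + x)^2 = 1 - 2 / (1 + x) + 1 / (1 + x)^2.
    return 2.0 * (R - 2.0 * math.log(1.0 + R) + 1.0 - 1.0 / (1.0 + R))

for R in (10.0, 1e3, 1e5):
    print(R, norm_sq_psi(R), norm_sq_xpsi(R))
```

The first column of values approaches $2$, while the second grows roughly like $2R$, consistent with $\psi\in L^2(\mathbb R)$ and $x\psi\notin L^2(\mathbb R)$.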
This example shows why boundedness cannot be the first test for an observable. Instead, expectation values $(A\psi,\psi)_H$ should at least be real on the domain where the observable is defined, since measured values are [real numbers](/page/Real%20Numbers). We need the following definition to turn that expectation-value requirement into an operator condition.
[definition: Symmetric Operator]
Let $H$ be a complex Hilbert space, and let $A:\mathcal D(A)\to H$ be a densely defined linear operator. The operator $A$ is symmetric if
\begin{align*}
(Au,v)_H=(u,Av)_H
\end{align*}
for all $u,v\in\mathcal D(A)$.
[/definition]
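Taking $v=u$ in the definition gives $(Au,u)_H=(u,Au)_H=\overline{(Au,u)_H}$, so a symmetric operator has real expectation values on its domain. Symmetry is necessary for being an observable, but it does not by itself give a spectral theorem. The missing issue is whether the domain is maximal for the symmetry relation.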
[example: Momentum on a Finite Interval]
Let $H=L^2(0,1)$ and define
\begin{align*}
P_0:C_c^\infty(0,1)\subset L^2(0,1)\to L^2(0,1),\qquad P_0u=-i\frac{du}{dx}.
\end{align*}
We show from the definition of symmetry that $P_0$ is symmetric on this test-function domain. For $u,v\in C_c^\infty(0,1)$, using the $L^2$ inner product $(f,g)_{L^2}=\int_0^1 f(x)\overline{g(x)}\,dx$, we have
\begin{align*}
(P_0u,v)_{L^2}=\int_0^1 -i u'(x)\overline{v(x)}\,dx.
\end{align*}
Integration by parts gives
\begin{align*}
\int_0^1 -i u'(x)\overline{v(x)}\,dx=-i u(1)\overline{v(1)}+i u(0)\overline{v(0)}+\int_0^1 i u(x)\overline{v'(x)}\,dx.
\end{align*}
On the other hand,
\begin{align*}
(u,P_0v)_{L^2}=\int_0^1 u(x)\overline{-i v'(x)}\,dx=\int_0^1 i u(x)\overline{v'(x)}\,dx.
\end{align*}
Subtracting these two displayed identities gives
\begin{align*}
(P_0u,v)_{L^2}-(u,P_0v)_{L^2}=-i u(1)\overline{v(1)}+i u(0)\overline{v(0)}.
\end{align*}
Since $u\in C_c^\infty(0,1)$, it vanishes in a neighbourhood of both endpoints, so $u(0)=u(1)=0$. Hence
\begin{align*}
(P_0u,v)_{L^2}-(u,P_0v)_{L^2}=0.
\end{align*}
Therefore $(P_0u,v)_{L^2}=(u,P_0v)_{L^2}$ for all $u,v\in C_c^\infty(0,1)$, so $P_0$ is symmetric. The calculation also shows what can fail on a larger domain: endpoint values may survive integration by parts, and then symmetry requires boundary conditions that make the displayed boundary expression vanish.
[/example]
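The surviving endpoint term can be checked numerically on functions that do not vanish at the boundary. The sketch below is a hedged illustration: the test functions $u(x)=e^{ix}$ and $v(x)=1+x^2$ and the composite Simpson rule are assumptions chosen for convenience, not part of the example.

```python
import cmath

def simpson(f, a, b, n=2000):
    # Composite Simpson rule for a complex-valued integrand; n must be even.
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

# Hypothetical smooth functions that do NOT vanish at the endpoints.
u = lambda x: cmath.exp(1j * x)
du = lambda x: 1j * cmath.exp(1j * x)
v = lambda x: 1 + x * x + 0j
dv = lambda x: 2 * x + 0j

def inner(f, g):
    # L^2(0,1) inner product, conjugate-linear in the second slot.
    return simpson(lambda x: f(x) * g(x).conjugate(), 0.0, 1.0)

Pu = lambda x: -1j * du(x)   # formal rule -i d/dx applied to u
Pv = lambda x: -1j * dv(x)

defect = inner(Pu, v) - inner(u, Pv)
boundary = -1j * u(1.0) * v(1.0).conjugate() + 1j * u(0.0) * v(0.0).conjugate()
print(defect, boundary)
```

The two printed numbers agree up to quadrature error and are nonzero, so the formal rule fails to be symmetric on this pair; restricting to $C_c^\infty(0,1)$ is what removes the endpoint term.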
To formulate maximality, we need the adjoint. For a symmetric operator the adjoint is usually a proper extension of the original operator, because a vector $v$ can satisfy the integration-by-parts identity against every $u$ in the original domain without itself satisfying the original boundary conditions.
[definition: Adjoint of a Densely Defined Operator]
Let $A:\mathcal D(A)\to H$ be a densely defined linear operator. The domain $\mathcal D(A^*)$ consists of those $v\in H$ for which there exists $w\in H$ satisfying
\begin{align*}
(Au,v)_H=(u,w)_H
\end{align*}
for all $u\in\mathcal D(A)$. For such $v$, define $A^*v=w$.
[/definition]
The density of $\mathcal D(A)$ makes $w$ unique, so the adjoint is well-defined. The main domain problem is that a symmetric differential expression may still admit extra adjoint vectors coming from missing boundary conditions. The maximal case is when no such extra vectors remain: the original domain already equals the full adjoint domain, so the formal equality with the adjoint is an operator equality rather than only an integration-by-parts identity.
[definition: Self-Adjoint Operator]
Let $A:\mathcal D(A)\to H$ be a densely defined linear operator. The operator $A$ is self-adjoint if $\mathcal D(A)=\mathcal D(A^*)$ and $Au=A^*u$ for all $u\in\mathcal D(A)$.
[/definition]
Self-adjointness is stronger than symmetry. A symmetric operator satisfies $A\subset A^*$, meaning $\mathcal D(A)\subset\mathcal D(A^*)$ and $A^*u=Au$ on $\mathcal D(A)$; self-adjointness says there is no remaining room in the adjoint domain.
[remark: Symmetric Is Not Self-Adjoint]
A symmetric operator may have many self-adjoint extensions, exactly one self-adjoint extension, or no self-adjoint extension. These alternatives correspond physically to whether boundary conditions are still missing, already forced, or incompatible with self-adjoint quantum dynamics.
[/remark]
The next issue is stability under limits. Quantum states are approximated in Hilbert-space norm, and domains of differential operators should behave well under graph convergence.
[definition: Graph Norm and Closed Operator]
Let $A:\mathcal D(A)\to H$ be a linear operator. The graph norm on $\mathcal D(A)$ is
\begin{align*}
\|u\|_A=\bigl(\|u\|_H^2+\|Au\|_H^2\bigr)^{1/2}.
\end{align*}
The operator $A$ is closed if $\mathcal D(A)$ is complete with respect to $\|\cdot\|_A$.
[/definition]
Equivalently, $A$ is closed when $u_n\to u$ in $H$ and $Au_n\to w$ in $H$ with $u_n\in\mathcal D(A)$ imply $u\in\mathcal D(A)$ and $Au=w$. This is the operator version of saying that no graph-[limit point](/page/Limit%20Point) has been omitted. Many differential operators start on a small test domain that is not closed, so the next question is whether the graph has a meaningful operator closure.
[definition: Closable Operator and Closure]
Let $A:\mathcal D(A)\to H$ be a linear operator. The operator $A$ is closable if the closure of its graph in $H\oplus H$ is the graph of a linear operator. That operator is denoted $\overline A$.
[/definition]
Closures allow us to begin with a small test domain such as $C_c^\infty(\mathbb R)$ and then recover the largest operator forced by graph limits. For quantum mechanics, the next question is whether this closure is already self-adjoint, because then the initial test-domain formula determines a unique observable without extra boundary choices. This motivates the definition below.
[definition: Essentially Self-Adjoint Operator]
Let $A:\mathcal D(A)\to H$ be a densely defined symmetric operator. The operator $A$ is essentially self-adjoint if its closure $\overline A$ is self-adjoint.
[/definition]
Essential self-adjointness means that the initial domain may be small, but it already determines a unique self-adjoint observable. The model example is momentum on the whole line: unlike an interval, the line has no finite endpoints where boundary conditions can be imposed. The following theorem makes that intuition precise and gives the standard free-particle momentum operator.
[quotetheorem:6918]
[citeproof:6918]
This theorem is the operator-theoretic version of the physical statement that a free particle on the whole line has a canonical momentum observable. The hypothesis that the initial domain is $C_c^\infty(\mathbb R)$ matters because it says we begin from a local test-function formula, not from a domain already chosen to make the answer work. The use of the whole line is also essential: there are no finite endpoints at which boundary values can survive integration by parts. The theorem does not say that every symmetric first-order differential operator is essentially self-adjoint; it says that this translation-invariant momentum expression has a unique self-adjoint closure. This prepares the contrast with interval operators, where boundary conditions become extra data.
[example: Gaussian Wave Packet in the Momentum Domain]
Using the unitary Fourier transform
\begin{align*}
\widehat f(\xi)=(2\pi)^{-1/2}\int_{\mathbb R}e^{-ix\xi}f(x)\,dx,
\end{align*}
let $\psi(x)=\pi^{-1/4}e^{-x^2/2}$. The Gaussian integral formula gives
\begin{align*}
\widehat\psi(\xi)=\pi^{-1/4}e^{-\xi^2/2}.
\end{align*}
Hence
\begin{align*}
\|\psi\|_{L^2}^2=\int_{\mathbb R}\pi^{-1/2}e^{-x^2}\,dx=\pi^{-1/2}\sqrt{\pi}=1,
\end{align*}
so $\psi\in L^2(\mathbb R)$.
For the momentum operator $P$ obtained by Fourier diagonalisation, the domain condition is $\xi\widehat\psi(\xi)\in L^2(\mathbb R)$. Here
\begin{align*}
\|\xi\widehat\psi\|_{L^2}^2=\int_{\mathbb R}\xi^2\pi^{-1/2}e^{-\xi^2}\,d\xi.
\end{align*}
Since the integrand is even,
\begin{align*}
\int_{\mathbb R}\xi^2e^{-\xi^2}\,d\xi=2\int_0^\infty \xi^2e^{-\xi^2}\,d\xi.
\end{align*}
Integrating by parts on $[0,\infty)$ with $a(\xi)=\xi$ and $db=\xi e^{-\xi^2}d\xi$, so $b(\xi)=-\frac12e^{-\xi^2}$, gives
\begin{align*}
\int_0^\infty \xi^2e^{-\xi^2}\,d\xi=\left[-\frac12\xi e^{-\xi^2}\right]_0^\infty+\frac12\int_0^\infty e^{-\xi^2}\,d\xi=\frac{\sqrt{\pi}}{4}.
\end{align*}
Therefore
\begin{align*}
\|\xi\widehat\psi\|_{L^2}^2=\pi^{-1/2}\cdot\frac{\sqrt{\pi}}{2}=\frac12<\infty,
\end{align*}
so $\psi\in\mathcal D(P)$.
The expectation value is computed in Fourier variables:
\begin{align*}
(P\psi,\psi)_{L^2}=\int_{\mathbb R}\xi\,\widehat\psi(\xi)\overline{\widehat\psi(\xi)}\,d\xi=\pi^{-1/2}\int_{\mathbb R}\xi e^{-\xi^2}\,d\xi.
\end{align*}
The function $\xi e^{-\xi^2}$ is odd, and it is integrable because
\begin{align*}
\int_{\mathbb R}|\xi|e^{-\xi^2}\,d\xi=2\int_0^\infty \xi e^{-\xi^2}\,d\xi=1.
\end{align*}
Thus its integral over $\mathbb R$ is $0$, and
\begin{align*}
(P\psi,\psi)_{L^2}=0.
\end{align*}
This Gaussian packet lies in the momentum domain, and its mean momentum is zero because its Fourier-space probability density is symmetric about $\xi=0$.
[/example]
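The three integrals in this example are easy to confirm by quadrature. The following is a minimal sketch, assuming a composite Simpson rule and a cutoff at $|\xi|=12$ (beyond which the Gaussian tails are negligible in double precision); the names `simpson`, `psi_hat`, and `CUT` are illustrative.

```python
import math

def simpson(f, a, b, n=4000):
    # Composite Simpson rule; n must be even.
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

def psi_hat(xi):
    # Fourier transform of the normalised Gaussian, as stated in the example.
    return math.pi ** -0.25 * math.exp(-xi * xi / 2)

CUT = 12.0  # assumed cutoff for the numerical integrals
norm_sq = simpson(lambda t: psi_hat(t) ** 2, -CUT, CUT)
domain_sq = simpson(lambda t: (t * psi_hat(t)) ** 2, -CUT, CUT)
mean_p = simpson(lambda t: t * psi_hat(t) ** 2, -CUT, CUT)
print(norm_sq, domain_sq, mean_p)
```

The values approximate $1$, $\tfrac12$, and $0$ respectively, matching the normalisation, the domain condition, and the vanishing mean momentum.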
## Deficiency Indices and Boundary Conditions
The next question is what happens when a symmetric operator is not already determined by its initial domain. For differential operators on intervals, the adjoint domain permits boundary values, and self-adjoint extensions arise by imposing boundary conditions that cancel the boundary form. Von Neumann's deficiency-index theory packages this into a finite-dimensional problem in many examples.
[definition: Deficiency Subspaces and Deficiency Indices]
Let $A:\mathcal D(A)\to H$ be a densely defined symmetric operator. The deficiency subspaces of $A$ are
\begin{align*}
N_+(A)&=\ker(A^*-iI), & N_-(A)&=\ker(A^*+iI).
\end{align*}
The deficiency indices of $A$ are
\begin{align*}
n_+(A)&=\dim N_+(A), & n_-(A)&=\dim N_-(A).
\end{align*}
[/definition]
The vectors in $N_+(A)$ and $N_-(A)$ solve the equations $A^*u=iu$ and $A^*u=-iu$. They measure the directions in which the adjoint domain exceeds the original symmetric domain. The next theorem explains why the dimensions of these two spaces determine whether the missing domain directions can be paired into a self-adjoint extension.
[quotetheorem:6919]
[citeproof:6919]
The theorem turns the search for self-adjoint boundary conditions into linear algebra on the deficiency spaces. Closedness is part of the hypothesis because the extension problem is being asked for an operator whose graph has already been completed; otherwise the closure must first be taken. The equality $n_+(A)=n_-(A)$ is the balance condition that lets the two defect directions be paired by a unitary map. The theorem does not identify the boundary conditions themselves; it identifies the abstract parameter space of possible self-adjoint domains. In one-dimensional quantum mechanics these spaces are often spanned by exponential solutions of simple ODEs, and the next examples translate the unitary parameter into endpoint conditions.
[example: Momentum on an Interval]
Let $P_0=-i\frac{d}{dx}$ on $L^2(0,1)$ with domain $C_c^\infty(0,1)$. Its adjoint is the maximal first-derivative operator
\begin{align*}
P_0^*u=-iu',\qquad \mathcal D(P_0^*)=H^1(0,1),
\end{align*}
because integration by parts against compactly supported test functions identifies the [distributional derivative](/page/Distributional%20Derivative) $u'$ with an $L^2$ function.
The positive deficiency equation is
\begin{align*}
(P_0^*-iI)u=0.
\end{align*}
Substituting $P_0^*u=-iu'$ gives
\begin{align*}
-iu'-iu=0.
\end{align*}
Multiplying by $i$ gives
\begin{align*}
u'+u=0.
\end{align*}
Hence
\begin{align*}
(e^x u(x))'=e^x u(x)+e^x u'(x)=e^x(u(x)+u'(x))=0,
\end{align*}
so $e^x u(x)=C$ and therefore
\begin{align*}
u(x)=Ce^{-x}.
\end{align*}
Moreover
\begin{align*}
\int_0^1 |Ce^{-x}|^2\,dx=|C|^2\int_0^1 e^{-2x}\,dx=|C|^2\frac{1-e^{-2}}{2}<\infty.
\end{align*}
Thus $N_+(P_0)=\operatorname{span}\{e^{-x}\}$ and $n_+(P_0)=1$.
The negative deficiency equation is
\begin{align*}
(P_0^*+iI)u=0.
\end{align*}
Substituting $P_0^*u=-iu'$ gives
\begin{align*}
-iu'+iu=0.
\end{align*}
Multiplying by $i$ gives
\begin{align*}
u'-u=0.
\end{align*}
Hence
\begin{align*}
(e^{-x}u(x))'=e^{-x}u'(x)-e^{-x}u(x)=e^{-x}(u'(x)-u(x))=0,
\end{align*}
so $e^{-x}u(x)=C$ and therefore
\begin{align*}
u(x)=Ce^x.
\end{align*}
Also
\begin{align*}
\int_0^1 |Ce^x|^2\,dx=|C|^2\int_0^1 e^{2x}\,dx=|C|^2\frac{e^2-1}{2}<\infty.
\end{align*}
Thus $N_-(P_0)=\operatorname{span}\{e^x\}$ and $n_-(P_0)=1$.
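The deficiency indices are therefore $(1,1)$. Since the closure $\overline{P_0}$ has the same adjoint as $P_0$, it is a closed symmetric operator with the same deficiency spaces. By the *[Von Neumann Deficiency Index Theorem](/theorems/6919)*, applied to $\overline{P_0}$, the self-adjoint extensions are parametrised by unitary maps $N_+(P_0)\to N_-(P_0)$, and since both spaces are one-dimensional, this parameter space is $U(1)$.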
The same parameter appears as a boundary phase. For $u,v\in H^1(0,1)$, integration by parts gives
\begin{align*}
(P_0^*u,v)_{L^2}-(u,P_0^*v)_{L^2}=-iu(1)\overline{v(1)}+iu(0)\overline{v(0)}.
\end{align*}
If a domain satisfies
\begin{align*}
u(1)=e^{i\theta}u(0),
\end{align*}
and the same condition holds for $v$, then
\begin{align*}
-iu(1)\overline{v(1)}+iu(0)\overline{v(0)}=-i e^{i\theta}u(0)\overline{e^{i\theta}v(0)}+iu(0)\overline{v(0)}.
\end{align*}
Since $\overline{e^{i\theta}v(0)}=e^{-i\theta}\overline{v(0)}$, the right-hand side is
\begin{align*}
-i e^{i\theta}e^{-i\theta}u(0)\overline{v(0)}+iu(0)\overline{v(0)}=0.
\end{align*}
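This shows that each phase condition cancels the boundary form. Together with the $U(1)$ parametrisation above, the self-adjoint momentum extensions on the interval are exactly the phase boundary conditions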
\begin{align*}
u(1)=e^{i\theta}u(0),\qquad \theta\in[0,2\pi).
\end{align*}
The finite interval has one boundary degree of freedom at each deficiency sign, and choosing the phase fixes how the two endpoints are identified.
[/example]
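Both halves of this example can be spot-checked in a few lines. The sketch below is illustrative only: it uses a central finite difference as a stand-in for the derivative, arbitrary endpoint values, and an arbitrary phase $\theta=0.7$, none of which come from the text.

```python
import cmath

def d(f, x, h=1e-5):
    # Central finite difference; a numerical stand-in for the derivative.
    return (f(x + h) - f(x - h)) / (2 * h)

n_plus = lambda x: cmath.exp(-x)   # candidate element of N_+
n_minus = lambda x: cmath.exp(x)   # candidate element of N_-

def residual(f, sign, x):
    # Residual of (P_0^* - sign * i) f = 0 with P_0^* f = -i f'.
    return -1j * d(f, x) - sign * 1j * f(x)

def boundary_form(u0, u1, v0, v1):
    # -i u(1) conj(v(1)) + i u(0) conj(v(0)), as in the example.
    return -1j * u1 * v1.conjugate() + 1j * u0 * v0.conjugate()

theta = 0.7                               # arbitrary illustrative phase
phase = cmath.exp(1j * theta)
u0, v0 = 1.3 - 0.4j, -0.2 + 2.0j          # arbitrary values at x = 0
form_with_phase = boundary_form(u0, phase * u0, v0, phase * v0)
form_without = boundary_form(u0, 2 * u0, v0, 2 * v0)  # u(1) = 2 u(0): not a phase

print(abs(residual(n_plus, +1, 0.3)), abs(residual(n_minus, -1, 0.3)))
print(abs(form_with_phase), abs(form_without))
```

The deficiency residuals vanish up to finite-difference error, the boundary form vanishes for a unimodular endpoint relation, and it does not vanish when the endpoint factor has modulus different from $1$.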
The interval case differs sharply from the line: endpoints create boundary degrees of freedom. Deficiency indices count these degrees of freedom abstractly, but for concrete differential operators we need the boundary conditions themselves. The next theorem gives the boundary-form criterion that converts integration by parts into a self-adjointness test.
[quotetheorem:6920]
This criterion is the practical companion to the von Neumann deficiency-index theorem above. The closed symmetric starting operator ensures that the remaining freedom is genuinely boundary data rather than an uncompleted graph. Surjectivity of the trace map is what allows a statement about $E\oplus E$ to control the whole adjoint domain. The criterion does not say that every vanishing boundary condition is self-adjoint; it says maximal vanishing boundary conditions are the self-adjoint ones. Deficiency indices say whether self-adjoint extensions exist and how many parameters they have; the boundary form writes the same information in endpoint data.
## The Particle in a Box
The main Hamiltonian for a one-dimensional particle in a box is the second derivative expression
\begin{align*}
-\frac{d^2}{dx^2}
\end{align*}
on an interval. The physical question is which boundary conditions make it a self-adjoint energy observable. Different choices correspond to different walls, phase identifications, or endpoint couplings.
We first define the minimal operator, where no endpoint values survive.
[definition: Minimal Laplacian on an Interval]
Let $H=L^2(0,L)$ with $L>0$. The minimal Laplacian is the operator
\begin{align*}
S_0:C_c^\infty(0,L)\subset L^2(0,L)\to L^2(0,L),\qquad S_0u=-u''.
\end{align*}
[/definition]
This minimal domain is too small to represent the usual box Hamiltonian directly, but it is a useful starting point because every self-adjoint realisation extends it. To find the extensions, we need the adjoint and the boundary form that records endpoint leakage of probability current. The theorem below performs that computation.
[quotetheorem:6921]
[citeproof:6921]
The boundary form shows exactly where the endpoint information enters the operator. The hypothesis $u\in H^2(0,L)$ is needed because the traces $u(0),u(L),u'(0),u'(L)$ must be meaningful and because two integrations by parts are being used. The formula does not by itself choose a Hamiltonian; it lists the boundary leakage that every candidate domain must cancel. It also connects the analytic operator problem to a finite-dimensional symplectic boundary space. A self-adjoint domain must impose two independent linear boundary conditions on the four endpoint values, or equivalently a unitary relation between two endpoint vectors. Since the endpoint data for $u$ and $u'$ form a four-dimensional complex boundary space with a symplectic-type pairing, the next theorem classifies the maximal vanishing subspaces by unitary $2\times2$ matrices.
[quotetheorem:6922]
[citeproof:6922]
The $U(2)$ parametrisation is compact but can hide familiar cases. The requirement $U\in U(2)$ is not decorative: it is the finite-dimensional form of maximal isotropy for the boundary pairing. The theorem classifies domains for the fixed differential expression $-d^2/dx^2$ on a finite interval; it does not classify Hamiltonians with interior singularities or potentials. Its practical value is that each physical endpoint rule can be checked by identifying the corresponding unitary boundary relation. The next examples translate the parametrisation into boundary conditions used in quantum mechanics.
[example: Dirichlet Particle in a Box]
Define
\begin{align*}
S_D:\mathcal D(S_D)\subset L^2(0,L)\to L^2(0,L),\qquad S_Du=-u'',
\end{align*}
where
\begin{align*}
\mathcal D(S_D)=\{u\in H^2(0,L):u(0)=u(L)=0\}.
\end{align*}
For $u,v\in\mathcal D(S_D)$, the interval Laplacian boundary form is
\begin{align*}
(S_Du,v)_{L^2}-(u,S_Dv)_{L^2}= -u'(L)\overline{v(L)}+u(L)\overline{v'(L)}+u'(0)\overline{v(0)}-u(0)\overline{v'(0)}.
\end{align*}
Since $u(0)=u(L)=v(0)=v(L)=0$, each endpoint term contains one zero factor:
\begin{align*}
-u'(L)\overline{v(L)}+u(L)\overline{v'(L)}+u'(0)\overline{v(0)}-u(0)\overline{v'(0)}= -u'(L)\cdot 0+0\cdot\overline{v'(L)}+u'(0)\cdot 0-0\cdot\overline{v'(0)}=0.
\end{align*}
Thus the Dirichlet domain cancels the boundary form.
Now suppose $u\neq 0$ is an eigenfunction, so $-u''=\lambda u$ and $u(0)=u(L)=0$. Taking the inner product with $u$ and integrating by parts gives
\begin{align*}
\lambda\|u\|_{L^2}^2=(S_Du,u)_{L^2}=\int_0^L -u''(x)\overline{u(x)}\,dx=\int_0^L |u'(x)|^2\,dx-u'(L)\overline{u(L)}+u'(0)\overline{u(0)}=\int_0^L |u'(x)|^2\,dx.
\end{align*}
Hence $\lambda\geq 0$. If $\lambda=0$, then $u''=0$, so $u(x)=ax+b$. The conditions $u(0)=0$ and $u(L)=0$ give $b=0$ and $aL+b=0$, hence $a=0$, contradicting $u\neq 0$. Therefore every eigenvalue has the form $\lambda=k^2$ with $k>0$.
For $\lambda=k^2$, the equation $-u''=k^2u$ is equivalent to
\begin{align*}
u''+k^2u=0.
\end{align*}
Its solutions are
\begin{align*}
u(x)=A\cos(kx)+B\sin(kx).
\end{align*}
The condition $u(0)=0$ gives
\begin{align*}
0=u(0)=A\cos(0)+B\sin(0)=A,
\end{align*}
so $u(x)=B\sin(kx)$. The condition $u(L)=0$ gives
\begin{align*}
0=u(L)=B\sin(kL).
\end{align*}
Since $u\neq 0$, we have $B\neq 0$, so $\sin(kL)=0$. Thus $kL=n\pi$ for some $n\in\mathbb N$, and
\begin{align*}
k=\frac{n\pi}{L},\qquad \lambda=k^2=\left(\frac{n\pi}{L}\right)^2.
\end{align*}
The corresponding eigenfunctions are scalar multiples of
\begin{align*}
\sin\left(\frac{n\pi x}{L}\right),\qquad n\in\mathbb N.
\end{align*}
This is the standard infinite square well: imposing zero boundary values at both walls produces the discrete energy levels $\left(\frac{n\pi}{L}\right)^2$.
[/example]
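A discretised version of the Dirichlet problem gives a quick consistency check on the levels $\left(\frac{n\pi}{L}\right)^2$. The sketch below assumes a uniform grid and the standard second-difference approximation of $-u''$; the function name `dirichlet_check` and the grid size are illustrative choices, and the finite-difference scheme is a stand-in, not the operator $S_D$ itself.

```python
import math

def dirichlet_check(n, L=1.0, N=400):
    # Grid x_j = j * h, j = 0..N, with u_0 = u_N = 0 imposed (Dirichlet walls).
    h = L / N
    u = [math.sin(n * math.pi * j * h / L) for j in range(N + 1)]
    u[0] = u[N] = 0.0
    # Second-difference approximation of -u'' at the interior points.
    Su = [-(u[j - 1] - 2 * u[j] + u[j + 1]) / h**2 for j in range(1, N)]
    # Rayleigh quotient (Su, u) / (u, u) on the interior grid.
    num = sum(s * w for s, w in zip(Su, u[1:N]))
    den = sum(w * w for w in u[1:N])
    return num / den

L = 2.0
for n in (1, 2, 3):
    print(n, dirichlet_check(n, L), (n * math.pi / L) ** 2)
```

For each $n$ the Rayleigh quotient of the sampled sine function is close to $\left(\frac{n\pi}{L}\right)^2$, with a small discretisation error that shrinks as the grid is refined.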
The Dirichlet condition models impenetrable walls. Other self-adjoint domains describe different endpoint physics while retaining real energy and unitary time evolution.
[example: Periodic Boundary Conditions]
Define
\begin{align*}
S_{\mathrm{per}}:\mathcal D(S_{\mathrm{per}})\subset L^2(0,L)\to L^2(0,L),\qquad S_{\mathrm{per}}u=-u'',
\end{align*}
where
\begin{align*}
\mathcal D(S_{\mathrm{per}})=\{u\in H^2(0,L):u(0)=u(L),\ u'(0)=u'(L)\}.
\end{align*}
For $u,v\in\mathcal D(S_{\mathrm{per}})$, the interval Laplacian boundary form is
\begin{align*}
(S_{\mathrm{per}}u,v)_{L^2}-(u,S_{\mathrm{per}}v)_{L^2}= -u'(L)\overline{v(L)}+u(L)\overline{v'(L)}+u'(0)\overline{v(0)}-u(0)\overline{v'(0)}.
\end{align*}
Using $u(L)=u(0)$, $u'(L)=u'(0)$, $v(L)=v(0)$, and $v'(L)=v'(0)$, the right-hand side becomes
\begin{align*}
-u'(0)\overline{v(0)}+u(0)\overline{v'(0)}+u'(0)\overline{v(0)}-u(0)\overline{v'(0)}=0.
\end{align*}
Thus the periodic domain cancels the boundary form.
Now compute the eigenvalues. If $u\neq 0$ satisfies
\begin{align*}
-u''=\lambda u,
\end{align*}
then taking the inner product with $u$ and integrating by parts gives
\begin{align*}
\lambda\|u\|_{L^2}^2=\int_0^L -u''(x)\overline{u(x)}\,dx=\int_0^L |u'(x)|^2\,dx-u'(L)\overline{u(L)}+u'(0)\overline{u(0)}.
\end{align*}
The periodic conditions give $u'(L)\overline{u(L)}=u'(0)\overline{u(0)}$, so
\begin{align*}
\lambda\|u\|_{L^2}^2=\int_0^L |u'(x)|^2\,dx\geq 0.
\end{align*}
Hence $\lambda\geq 0$.
For $\lambda=0$, the equation is $u''=0$, so $u(x)=Ax+B$. The condition $u(0)=u(L)$ gives
\begin{align*}
B=AL+B,
\end{align*}
so $A=0$ and the eigenfunctions are the nonzero constant functions.
For $\lambda>0$, write $\lambda=k^2$ with $k>0$. Then
\begin{align*}
u(x)=A\cos(kx)+B\sin(kx).
\end{align*}
The condition $u(0)=u(L)$ gives
\begin{align*}
A=A\cos(kL)+B\sin(kL),
\end{align*}
or equivalently
\begin{align*}
A(\cos(kL)-1)+B\sin(kL)=0.
\end{align*}
Since
\begin{align*}
u'(x)=-Ak\sin(kx)+Bk\cos(kx),
\end{align*}
the condition $u'(0)=u'(L)$ gives
\begin{align*}
Bk=-Ak\sin(kL)+Bk\cos(kL).
\end{align*}
Dividing by $k>0$ gives
\begin{align*}
-A\sin(kL)+B(\cos(kL)-1)=0.
\end{align*}
The coefficient matrix for $A,B$ has determinant
\begin{align*}
(\cos(kL)-1)^2+\sin^2(kL)=\cos^2(kL)-2\cos(kL)+1+\sin^2(kL)=2-2\cos(kL).
\end{align*}
A nonzero eigenfunction requires this determinant to vanish, so
\begin{align*}
2-2\cos(kL)=0.
\end{align*}
Thus $\cos(kL)=1$, hence $kL=2\pi n$ for some $n\in\mathbb N$, and
\begin{align*}
\lambda=k^2=\left(\frac{2\pi n}{L}\right)^2.
\end{align*}
Equivalently, combining the positive and zero cases, the eigenfunctions may be written
\begin{align*}
e^{2\pi i n x/L},\qquad n\in\mathbb Z,
\end{align*}
because
\begin{align*}
-\frac{d^2}{dx^2}e^{2\pi i n x/L}=\left(\frac{2\pi n}{L}\right)^2e^{2\pi i n x/L}
\end{align*}
and
\begin{align*}
e^{2\pi i n L/L}=e^{2\pi i n}=1.
\end{align*}
The periodic domain identifies the two endpoints, so the model is a particle on a circle of circumference $L$ rather than a particle trapped between two separate walls.
[/example]
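The determinant condition and the endpoint identification can be checked directly. The following is a small sketch under the assumption $L=3$, chosen arbitrarily; `periodic_det` and `mode` are illustrative names.

```python
import cmath
import math

def periodic_det(k, L):
    # Determinant of the 2x2 system for (A, B) derived in the example.
    c, s = math.cos(k * L), math.sin(k * L)
    return (c - 1) * (c - 1) + s * s

def mode(n, L):
    # The exponential eigenfunction e^{2 pi i n x / L}.
    return lambda x: cmath.exp(2j * math.pi * n * x / L)

L = 3.0
on_lattice = [periodic_det(2 * math.pi * n / L, L) for n in (1, 2, 3)]
off_lattice = periodic_det(math.pi / L, L)   # a Dirichlet wavenumber, k L = pi
endpoint_gap = [abs(mode(n, L)(L) - mode(n, L)(0.0)) for n in (-2, -1, 0, 1, 2)]
print(on_lattice, off_lattice, endpoint_gap)
```

The determinant vanishes at $k=2\pi n/L$ up to rounding, equals $4$ at the Dirichlet wavenumber $k=\pi/L$, and the exponential modes take the same value at both endpoints.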
The same boundary-form method also accommodates singular interactions in one dimension. A point interaction is not introduced by multiplying by an ordinary potential function; it is encoded by matching conditions at the point.
[example: Delta Point Interaction on the Line]
Let $\alpha\in\mathbb R$ and define $H_\alpha u=-u''$ on functions
\begin{align*}
u\in H^2((-\infty,0))\oplus H^2((0,\infty))
\end{align*}
whose one-sided traces satisfy
\begin{align*}
u(0+)=u(0-)
\end{align*}
and
\begin{align*}
u'(0+)-u'(0-)=\alpha u(0).
\end{align*}
The common value in the first condition is denoted by $u(0)$.
For $u,v$ in this domain, integration by parts on the two half-lines gives the boundary contribution at the cut point
\begin{align*}
\mathcal B(u,v)=-u'(0-)\overline{v(0-)}+u(0-)\overline{v'(0-)}+u'(0+)\overline{v(0+)}-u(0+)\overline{v'(0+)}.
\end{align*}
Using $u(0-)=u(0+)=u(0)$ and $v(0-)=v(0+)=v(0)$, this becomes
\begin{align*}
\mathcal B(u,v)=\bigl(u'(0+)-u'(0-)\bigr)\overline{v(0)}+u(0)\bigl(\overline{v'(0-)}-\overline{v'(0+)}\bigr).
\end{align*}
The derivative jump condition for $u$ gives
\begin{align*}
u'(0+)-u'(0-)=\alpha u(0).
\end{align*}
The derivative jump condition for $v$ gives
\begin{align*}
v'(0+)-v'(0-)=\alpha v(0).
\end{align*}
Taking complex conjugates, and using $\alpha\in\mathbb R$, gives
\begin{align*}
\overline{v'(0+)}-\overline{v'(0-)}=\alpha\overline{v(0)}.
\end{align*}
Hence
\begin{align*}
\overline{v'(0-)}-\overline{v'(0+)}=-\alpha\overline{v(0)}.
\end{align*}
Substituting the two jump identities into the boundary form gives
\begin{align*}
\mathcal B(u,v)=\alpha u(0)\overline{v(0)}-\alpha u(0)\overline{v(0)}=0.
\end{align*}
Thus the matching conditions cancel exactly the boundary leakage at the interaction point. The continuity condition preserves the wave function across $0$, while the derivative jump encodes the strength $\alpha$ of the delta interaction; by the boundary-form criterion, these maximal matching conditions define the self-adjoint one-dimensional delta Hamiltonian.
[/example]
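Because the boundary form depends only on one-sided traces at $0$, the cancellation can be tested on arbitrary trace data. In the sketch below the numerical values, the coupling $\alpha=-1.5$, and the helper names are illustrative assumptions; the second case uses a non-real coupling purely to show where the hypothesis $\alpha\in\mathbb R$ is used.

```python
def boundary_form(u_minus, du_minus, u_plus, du_plus,
                  v_minus, dv_minus, v_plus, dv_plus):
    # One-sided traces at 0; this is the expression B(u, v) from the example.
    return (-du_minus * v_minus.conjugate() + u_minus * dv_minus.conjugate()
            + du_plus * v_plus.conjugate() - u_plus * dv_plus.conjugate())

def delta_traces(value, left_slope, alpha):
    # Traces obeying continuity and the jump u'(0+) - u'(0-) = alpha * u(0).
    return value, left_slope, value, left_slope + alpha * value

alpha = -1.5  # illustrative real coupling
u = delta_traces(0.8 + 0.3j, -1.1 + 2.0j, alpha)
v = delta_traces(-0.5 + 1.2j, 0.4 - 0.7j, alpha)
form_real = boundary_form(*u, *v)

# With a non-real coupling the same algebra no longer cancels.
beta = 1j
u_c = delta_traces(0.8 + 0.3j, -1.1 + 2.0j, beta)
v_c = delta_traces(-0.5 + 1.2j, 0.4 - 0.7j, beta)
form_complex = boundary_form(*u_c, *v_c)
print(form_real, form_complex)
```

For real $\alpha$ the form vanishes up to rounding, while for the non-real coupling it equals $(\beta-\overline\beta)\,u(0)\overline{v(0)}\neq 0$, so the matching conditions are no longer symmetric.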
## Physical Consequences of Self-Adjointness
The operator theory above is not only a classification exercise. Quantum dynamics, spectra, and conservation of probability depend on self-adjointness rather than on formal symmetry alone. The course uses this point repeatedly when Hamiltonians are introduced by differential expressions.
[quotetheorem:6923]
Stone's theorem belongs to the spectral theorem material of the next chapter, so here it is used as a structural result rather than developed in detail. Its message for this chapter is that selecting a self-adjoint extension is selecting the quantum dynamics.
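A finite-dimensional stand-in makes the link between self-adjoint generators and unitary dynamics concrete. The sketch below assumes the generator $\sigma_x$ on $\mathbb C^2$, for which $e^{-it\sigma_x}=\cos(t)I-i\sin(t)\sigma_x$ because $\sigma_x^2=I$; it illustrates the group law and norm preservation only, not the unbounded case treated by the theorem.

```python
import math

def U(t):
    # exp(-i t sigma_x) = cos(t) I - i sin(t) sigma_x, valid because sigma_x^2 = I.
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(a, v):
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]

def norm(v):
    return math.sqrt(sum(abs(x) ** 2 for x in v))

t, s = 0.4, 1.3
# Group law U(t) U(s) = U(t + s), measured entrywise.
group_gap = max(abs(matmul(U(t), U(s))[i][j] - U(t + s)[i][j])
                for i in range(2) for j in range(2))
# Norm preservation on an arbitrary unit vector.
psi0 = [0.6, 0.8j]
norm_gap = abs(norm(apply(U(t), psi0)) - norm(psi0))
print(group_gap, norm_gap)
```

Both gaps are zero up to rounding, which is the finite-dimensional shadow of conservation of probability under the dynamics generated by a self-adjoint operator.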
[example: Boundary Conditions Change the Spectrum]
Consider the same differential expression $-d^2/dx^2$ on $(0,L)$ with $L>0$, first with Dirichlet boundary conditions $u(0)=u(L)=0$. If $u\neq 0$ is an eigenfunction with eigenvalue $\lambda$, then
\begin{align*}
-u''=\lambda u.
\end{align*}
As in the Dirichlet particle-in-a-box example above, every Dirichlet eigenvalue is strictly positive, so write $\lambda=k^2$ with $k>0$. The equation becomes
\begin{align*}
u''+k^2u=0,
\end{align*}
so
\begin{align*}
u(x)=A\cos(kx)+B\sin(kx).
\end{align*}
The first boundary condition gives
\begin{align*}
0=u(0)=A\cos(0)+B\sin(0)=A.
\end{align*}
Thus $u(x)=B\sin(kx)$. The second boundary condition gives
\begin{align*}
0=u(L)=B\sin(kL).
\end{align*}
Since $u\neq 0$, we have $B\neq 0$, hence $\sin(kL)=0$. Therefore $kL=n\pi$ for some $n\in\mathbb N$, and
\begin{align*}
\lambda=k^2=\left(\frac{n\pi}{L}\right)^2.
\end{align*}
With periodic boundary conditions $u(0)=u(L)$ and $u'(0)=u'(L)$, the zero mode already appears: if $u(x)=C$ with $C\neq 0$, then
\begin{align*}
-u''=0=0\cdot u,
\end{align*}
so $\lambda=0$ is an eigenvalue. For $\lambda=k^2>0$, the same general solution gives
\begin{align*}
u(x)=A\cos(kx)+B\sin(kx),
\end{align*}
and
\begin{align*}
u'(x)=-Ak\sin(kx)+Bk\cos(kx).
\end{align*}
The condition $u(0)=u(L)$ gives
\begin{align*}
A=A\cos(kL)+B\sin(kL),
\end{align*}
equivalently
\begin{align*}
A(\cos(kL)-1)+B\sin(kL)=0.
\end{align*}
The condition $u'(0)=u'(L)$ gives
\begin{align*}
Bk=-Ak\sin(kL)+Bk\cos(kL).
\end{align*}
Since $k>0$, this is equivalent to
\begin{align*}
-A\sin(kL)+B(\cos(kL)-1)=0.
\end{align*}
A nonzero pair $(A,B)$ exists exactly when the determinant of this two-by-two system is zero:
\begin{align*}
(\cos(kL)-1)^2+\sin^2(kL)=\cos^2(kL)-2\cos(kL)+1+\sin^2(kL)=2-2\cos(kL).
\end{align*}
Thus $\cos(kL)=1$, so $kL=2\pi n$ for some $n\in\mathbb N$, and
\begin{align*}
\lambda=k^2=\left(\frac{2\pi n}{L}\right)^2.
\end{align*}
Combining the zero mode with the positive modes, the periodic eigenvalues are
\begin{align*}
\left(\frac{2\pi n}{L}\right)^2,\qquad n\in\mathbb Z.
\end{align*}
Thus the same formal expression $-d^2/dx^2$ gives different energy levels after the domain is changed: the Dirichlet domain has no zero mode and its wavenumbers are the positive multiples of $\pi/L$, while the periodic domain includes constants and its wavenumbers are the multiples of $2\pi/L$.
[/example]
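The two spectra can be tabulated side by side. This is a minimal sketch; the function names and the choice $L=1$ are illustrative, and `periodic_levels` lists distinct eigenvalues only (each positive periodic level arises from both $n$ and $-n$).

```python
import math

def dirichlet_levels(L, count):
    # (n pi / L)^2 for n = 1, ..., count.
    return [(n * math.pi / L) ** 2 for n in range(1, count + 1)]

def periodic_levels(L, count):
    # Distinct values (2 pi n / L)^2 for n = 0, ..., count - 1.
    return [(2 * math.pi * n / L) ** 2 for n in range(0, count)]

L = 1.0
print(dirichlet_levels(L, 4))
print(periodic_levels(L, 4))
```

The periodic list starts at $0$, which never appears in the Dirichlet list, and the lowest positive periodic level is four times the lowest Dirichlet level.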
The chapter's main lesson is that unbounded quantum observables are not formulas alone. A symmetric differential expression becomes a physical observable only after its domain has been chosen so that the operator is self-adjoint, or after a small initial domain has been shown to be essentially self-adjoint. Deficiency indices and boundary forms are the two working tools for making that decision in the examples used throughout quantum mechanics.
Once an unbounded operator has been made into a genuine self-adjoint observable, the next task is to extract physical predictions from it. The spectral theorem provides exactly this bridge, turning self-adjoint operators into measurement rules and probability measures.
# 3. Spectral Theorem and Measurement
The preceding chapter explained why quantum observables cannot be treated as arbitrary Hermitian matrices once the Hilbert space is infinite-dimensional: domains, closedness, and self-adjointness become part of the data. This chapter answers the next question: once a self-adjoint operator has been identified, how do we read physical outcomes from it? The prerequisites are complex Hilbert spaces, orthogonal projections, closed and self-adjoint unbounded operators, Borel measures on $\mathbb R$, and the basic $L^2$ model of wavefunctions. The spectral theorem supplies the answer by replacing diagonalisation by a projection-valued measure, and measurement is formulated by applying that measure to Borel sets of possible outcomes.
## Projection-Valued Measures as Generalised Diagonalisation
The finite-dimensional spectral theorem says that a Hermitian matrix decomposes into eigenspaces and that every vector splits orthogonally across those eigenspaces. For position, momentum, and Hamiltonians with continuous spectrum, there may be no honest eigenbasis in the Hilbert space. The right replacement is not a list of eigenvectors but a rule assigning an [orthogonal projection](/theorems/437) to each measurable subset of the real line.
[definition: Projection-Valued Measure]
Let $H$ be a complex Hilbert space. A projection-valued measure on $\mathbb R$ is a map $E: \mathcal B(\mathbb R) \to \mathcal L(H)$ such that:
1. $E(B)$ is an orthogonal projection for every $B \in \mathcal B(\mathbb R)$.
2. $E(\varnothing)=0$ and $E(\mathbb R)=I$.
3. If $(B_n)_{n \ge 1}$ are pairwise disjoint Borel sets, then
\begin{align*}
E\left(\bigcup_{n=1}^{\infty} B_n\right)u = \sum_{n=1}^{\infty} E(B_n)u.
\end{align*}
The convergence is in $H$ for every $u \in H$.
4. $E(B \cap C)=E(B)E(C)$ for all Borel sets $B,C \subset \mathbb R$.
[/definition]
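The projection $E(B)$ is interpreted as the part of the state whose measured value lies in $B$. Countable additivity is required in the strong operator topology, that is, after applying the projections to a fixed vector, rather than in operator norm: every nonzero projection has norm $1$, so the series $\sum_n E(B_n)$ cannot converge in norm when infinitely many terms are nonzero. Applying the projection to a vector is also the physically meaningful operation.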
[example: Finite-Dimensional Spectral Measure]
Let $A$ be a Hermitian matrix on $H=\mathbb C^n$ with distinct eigenvalues $\lambda_1,\dots,\lambda_m$ and orthogonal eigenspace projections $P_1,\dots,P_m$. The finite-dimensional eigenspace decomposition gives
\begin{align*}
P_jP_k=0 \text{ for } j\ne k,\qquad P_j^*=P_j,\qquad P_j^2=P_j,\qquad \sum_{j=1}^m P_j=I.
\end{align*}
For each Borel set $B\subset\mathbb R$, define
\begin{align*}
E(B)=\sum_{\lambda_j\in B}P_j.
\end{align*}
If $J_B=\{j:\lambda_j\in B\}$, then
\begin{align*}
E(B)^*=\left(\sum_{j\in J_B}P_j\right)^*=\sum_{j\in J_B}P_j^*= \sum_{j\in J_B}P_j=E(B).
\end{align*}
Also, using $P_jP_k=0$ for $j\ne k$ and $P_j^2=P_j$,
\begin{align*}
E(B)^2=\left(\sum_{j\in J_B}P_j\right)\left(\sum_{k\in J_B}P_k\right)=\sum_{j\in J_B}\sum_{k\in J_B}P_jP_k=\sum_{j\in J_B}P_j^2=E(B).
\end{align*}
Thus $E(B)$ is an orthogonal projection. Moreover $E(\varnothing)=0$ because no eigenvalue lies in $\varnothing$, and $E(\mathbb R)=\sum_{j=1}^mP_j=I$.
If $B$ and $C$ are Borel sets, then
\begin{align*}
E(B)E(C)=\left(\sum_{\lambda_j\in B}P_j\right)\left(\sum_{\lambda_k\in C}P_k\right)=\sum_{\lambda_j\in B,\ \lambda_j\in C}P_j=E(B\cap C).
\end{align*}
If $(B_r)_{r\ge 1}$ are pairwise disjoint Borel sets, then each eigenvalue $\lambda_j$ belongs to at most one $B_r$, so for every $u\in H$,
\begin{align*}
E\left(\bigcup_{r=1}^{\infty}B_r\right)u=\sum_{\lambda_j\in \cup_r B_r}P_ju=\sum_{r=1}^{\infty}\sum_{\lambda_j\in B_r}P_ju=\sum_{r=1}^{\infty}E(B_r)u.
\end{align*}
The last series has only finitely many nonzero eigenspace contributions, so it converges in $H$. Hence $E$ is a projection-valued measure.
For a unit state $u$, orthogonality of the ranges of the $P_j$ gives
\begin{align*}
\|E(B)u\|_H^2=\left(\sum_{\lambda_j\in B}P_ju,\sum_{\lambda_k\in B}P_ku\right)_H=\sum_{\lambda_j\in B}\sum_{\lambda_k\in B}(P_ju,P_ku)_H.
\end{align*}
When $j\ne k$, $(P_ju,P_ku)_H=(P_kP_ju,u)_H=0$, while for $j=k$ the term is $\|P_ju\|_H^2$. Therefore
\begin{align*}
\|E(B)u\|_H^2=\sum_{\lambda_j\in B}\|P_ju\|_H^2.
\end{align*}
So the spectral measure records both which eigenvalues lie in the outcome set $B$ and how much of the state lies in the corresponding orthogonal eigenspaces.
[/example]
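The construction in the example above can be made concrete with a small computation. The sketch below is only an illustration under stated assumptions: the matrix $A=\begin{pmatrix}2&1\\1&2\end{pmatrix}$, with eigenvalues $1$ and $3$, is a hypothetical choice, Borel sets are represented by membership tests, and all helper names are invented for this sketch.

```python
# Hypothetical example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
EIGENVALUES = [1.0, 3.0]
PROJECTIONS = [
    [[0.5, -0.5], [-0.5, 0.5]],  # P_1: projection onto span{(1, -1)}
    [[0.5, 0.5], [0.5, 0.5]],    # P_2: projection onto span{(1, 1)}
]

def mat_add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_vec(X, u):
    return [sum(X[i][j] * u[j] for j in range(2)) for i in range(2)]

def norm_sq(u):
    return sum(abs(c) ** 2 for c in u)

def E(in_B):
    """Spectral projection E(B); the Borel set B is passed as a membership test."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for lam, P in zip(EIGENVALUES, PROJECTIONS):
        if in_B(lam):
            total = mat_add(total, P)
    return total

def in_B(lam):           # B = (-inf, 2)
    return lam < 2.0

def in_C(lam):           # C = [0, inf)
    return lam >= 0.0

def in_B_and_C(lam):     # B intersected with C
    return in_B(lam) and in_C(lam)

u = [0.6, 0.8]  # unit vector
print(mat_mul(E(in_B), E(in_B)) == E(in_B))        # idempotence of E(B)
print(mat_mul(E(in_B), E(in_C)) == E(in_B_and_C))  # E(B)E(C) = E(B ∩ C)
print(norm_sq(mat_vec(E(in_B), u)))                # weight of u in the outcome set B
```

Representing $B$ by a predicate mirrors the definition $E(B)=\sum_{\lambda_j\in B}P_j$: only the question "which eigenvalues lie in $B$" matters.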
This example shows that a projection-valued measure packages both the possible values and the [orthogonal decomposition](/theorems/436) of the Hilbert space. To use this package for probabilities and expectations, we need ordinary scalar measures obtained by testing the projections against vectors.
[definition: Scalar Spectral Measure]
Let $E$ be a projection-valued measure on $H$ and let $u,v \in H$. The complex measure $E_{u,v}: \mathcal B(\mathbb R)\to \mathbb C$ is defined by
\begin{align*}
E_{u,v}(B)=(E(B)u,v)_H.
\end{align*}
For $u=v$, write $E_u:=E_{u,u}$.
[/definition]
For a unit vector $u$, the measure $E_u$ is a probability measure, since $E_u(B)=(E(B)u,u)_H=\|E(B)u\|_H^2\ge 0$ and $E_u(\mathbb R)=\|u\|_H^2=1$. The remaining problem is existence: self-adjointness should produce such a measure canonically, since otherwise the measurement rule would depend on extra choices.
[quotetheorem:6911]
[citeproof:6911]
The theorem is the infinite-dimensional version of diagonalising a Hermitian matrix. The formula
\begin{align*}
A=\int_{\mathbb R}\lambda\,dE_A(\lambda)
\end{align*}
means that $A$ multiplies each spectral slice by its spectral value, even when the slices are infinitesimal rather than eigenspaces. Self-adjointness is essential: a merely symmetric operator need not have a projection-valued spectral resolution; for instance, $-i\frac{d}{dx}$ on $C_c^\infty(0,1) \subset L^2(0,1)$ is symmetric but not self-adjoint and has no such real spectral measure until a self-adjoint boundary condition is chosen. Different self-adjoint boundary conditions, such as periodic and quasiperiodic boundary conditions on $(0,1)$, give different self-adjoint extensions and therefore different spectral measures. The theorem does not say that $A$ has an orthonormal basis of eigenvectors, since multiplication by $x$ on $L^2(\mathbb R)$ has purely continuous spectrum. What it gives instead is the measure-theoretic replacement for that basis, which is exactly the structure needed for functional calculus and measurement probabilities.
[remark: Resolution of Identity]
For a self-adjoint operator $A$, the family $E_A(( -\infty,\lambda])$ is called the resolution of identity of $A$. It is increasing in $\lambda$, right-continuous in the strong operator topology, has strong limit $0$ as $\lambda \to -\infty$, and has strong limit $I$ as $\lambda \to \infty$.
[/remark]
The resolution of identity is often how the spectral theorem is written in physics. It is the operator-valued cumulative distribution function for the observable.
## Functional Calculus and Spectral Mapping
Once $A$ has a spectral measure, every measurable function of the measurement outcome should define a new observable when its domain is chosen correctly. This is the rigorous form of writing $f(A)$ for functions such as $A^2$, $e^{-itA}$, or $\mathbb{1}_B(A)$.
[definition: Borel Functional Calculus]
Let $A$ be self-adjoint with spectral measure $E_A$, and let $f: \mathbb R \to \mathbb C$ be Borel measurable. Define
\begin{align*}
f(A):\mathcal D(f(A))\subset H\to H,\qquad f(A)=\int_{\mathbb R} f(\lambda)\,dE_A(\lambda)
\end{align*}
on the domain
\begin{align*}
\mathcal D(f(A))=\left\{u \in H : \int_{\mathbb R}|f(\lambda)|^2\,dE_{A,u}(\lambda)<\infty\right\}.
\end{align*}
[/definition]
If $f$ is bounded, then $f(A) \in \mathcal L(H)$ and
\begin{align*}
\|f(A)\|_{\mathcal L(H)} \le \|f\|_\infty.
\end{align*}
If $f$ is real-valued, then $f(A)$ is self-adjoint on its natural domain.
[example: Spectral Projections as Functions of an Observable]
For a Borel set $B\subset\mathbb R$, let $f=\mathbb{1}_B$. Since $|\mathbb{1}_B(\lambda)|\le 1$ for every $\lambda\in\mathbb R$, the [Borel functional calculus](/theorems/2696) gives a bounded operator
\begin{align*}
\mathbb{1}_B(A)=\int_{\mathbb R}\mathbb{1}_B(\lambda)\,dE_A(\lambda).
\end{align*}
By the defining property of integration against a projection-valued measure, integrating the indicator of $B$ selects exactly the projection assigned to $B$:
\begin{align*}
\int_{\mathbb R}\mathbb{1}_B(\lambda)\,dE_A(\lambda)=E_A(B).
\end{align*}
Therefore
\begin{align*}
\mathbb{1}_B(A)=E_A(B).
\end{align*}
Because $E_A(B)$ is an orthogonal projection, the yes-no measurement asking whether the value of $A$ lies in $B$ is itself one of the projections produced by the same functional calculus.
[/example]
This example shows that the calculus recovers projections as well as powers and exponentials. We also need to know how the possible values of $f(A)$ relate to the possible values of $A$, since physical reparametrisation should not create unrelated spectral values.
[quotetheorem:6924]
[citeproof:6924]
Spectral mapping justifies the operational rule that applying a function to an observable applies the same function to its possible measured values. The bounded complex-valued case stays inside $\mathcal L(H)$, so invertibility can be tested in the algebra of bounded operators without adding domain conditions. If this boundedness is removed for complex-valued $f$, the multiplication model generally produces an unbounded normal operator with a proper maximal domain, and the statement belongs to the normal-operator functional calculus rather than to the bounded-operator clause above. The unbounded real-valued case stays in the self-adjoint observable framework because the natural domain makes $f(A)$ self-adjoint. Real-valuedness is what keeps the output observable-valued; a genuinely complex function of a real measurement outcome has complex spectral values and no longer represents a real-valued measurement. The closure in the continuous real case matters when the image approaches a limit value not attained by $f$ on the spectrum.
Continuity cannot be dropped from the pointwise image formula. Let $A=M_x$ on $L^2(\mathbb R)$ and define the Borel function $f$ by $f(0)=1$ and $f(x)=0$ for $x\ne 0$. Since $\mathcal L^1(\{0\})=0$, the functional calculus gives $f(A)=0$, so $\sigma(f(A))=\{0\}$, while $f(\sigma(A))=\{0,1\}$. The value $1$ is present in the pointwise image but invisible to the operator because it occurs only on a spectral-null set. Self-adjointness is also a real restriction: the unilateral shift $S$ on $\ell^2(\mathbb N)$ is bounded but not self-adjoint, and $\sigma(S)$ is the closed unit disc rather than a subset of $\mathbb R$ governed by a real projection-valued measure. Even the identity function cannot turn $S$ into a real observable with spectral projections over subsets of $\mathbb R$. This is the parallel failure to the measurable-function example: dropping the hypothesis changes the object that spectral mapping is describing, not merely the proof. The theorem also does not assert that every value in $f(\sigma(A))$ is an eigenvalue of $f(A)$; for multiplication by $x$ on $L^2(\mathbb R)$, the whole real line is spectral but no point is an eigenvalue. This distinction is what makes the essential-range formulation necessary for [measurable functions](/page/Measurable%20Functions) and prepares the use of unitary propagators below.
[example: Energy Propagator]
Let $H_0:\mathcal D(H_0)\subset H\to H$ be self-adjoint, and fix $t\in\mathbb R$. Put $f_t(\lambda)=e^{-it\lambda}$. Since $|f_t(\lambda)|=1$ for every $\lambda\in\mathbb R$, $f_t$ is bounded, so the Borel functional calculus defines
\begin{align*}
U(t)=f_t(H_0)=\int_{\mathbb R}e^{-it\lambda}\,dE_{H_0}(\lambda).
\end{align*}
Using that the bounded Borel functional calculus respects adjoints and products,
\begin{align*}
U(t)^*=\overline{f_t}(H_0)=\left(\lambda\mapsto e^{it\lambda}\right)(H_0).
\end{align*}
Hence
\begin{align*}
U(t)^*U(t)=(\overline{f_t}f_t)(H_0).
\end{align*}
For each $\lambda$,
\begin{align*}
\overline{f_t(\lambda)}f_t(\lambda)=e^{it\lambda}e^{-it\lambda}=e^0=1.
\end{align*}
Therefore
\begin{align*}
U(t)^*U(t)=1(H_0)=I.
\end{align*}
The same calculation gives
\begin{align*}
U(t)U(t)^*=(f_t\overline{f_t})(H_0)=I,
\end{align*}
so $U(t)$ is unitary.
If $u\ne 0$ is an eigenvector with $H_0u=\varepsilon u$ for an energy $\varepsilon\in\mathbb R$, then the functional calculus acts on the eigenspace by evaluating the function at the eigenvalue. Thus
\begin{align*}
U(t)u=f_t(H_0)u=f_t(\varepsilon)u=e^{-it\varepsilon}u.
\end{align*}
More generally, for a Borel set $B$, the spectral component $E_{H_0}(B)v$ is transformed by
\begin{align*}
U(t)E_{H_0}(B)v=\int_B e^{-it\lambda}\,dE_{H_0}(\lambda)v.
\end{align*}
Thus time evolution does not change the size of each spectral component; it multiplies the component at spectral value $\lambda$ by the phase $e^{-it\lambda}$.
[/example]
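In finite dimensions the functional calculus reduces to $f(H_0)=\sum_j f(\lambda_j)P_j$, so the unitarity argument above can be tested directly. The following Python sketch assumes a hypothetical two-level model with energies $1$ and $3$; the names `apply_function` and `propagator` are illustrative and not taken from the text.

```python
import cmath

# Hypothetical two-level model: energies and their orthogonal eigenprojections.
ENERGIES = [1.0, 3.0]
PROJECTIONS = [
    [[0.5, -0.5], [-0.5, 0.5]],  # projection onto span{(1, -1)}
    [[0.5, 0.5], [0.5, 0.5]],    # projection onto span{(1, 1)}
]

def apply_function(f):
    """Return f(H_0) = sum_j f(lambda_j) P_j as a 2x2 matrix."""
    return [[sum(f(lam) * P[i][j] for lam, P in zip(ENERGIES, PROJECTIONS))
             for j in range(2)] for i in range(2)]

def adjoint(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def propagator(t):
    """U(t) = exp(-i t H_0), built from the phases exp(-i t lambda_j)."""
    return apply_function(lambda lam: cmath.exp(-1j * t * lam))

t = 0.7
U = propagator(t)
UstarU = mat_mul(adjoint(U), U)
unitarity_error = max(abs(UstarU[i][j] - (1.0 if i == j else 0.0))
                      for i in range(2) for j in range(2))
# (1, 1) is an eigenvector with energy 3, so U(t)(1, 1) should equal exp(-3it)(1, 1).
phase_error = abs(U[0][0] + U[0][1] - cmath.exp(-3j * t))
print(unitarity_error, phase_error)  # both should be at rounding level
```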
This example anticipates Chapter 5: the spectral theorem will solve time-independent Schrödinger evolution by multiplying by phases in the spectral representation.
## Multiplication Operators and Physical Observables
The most transparent spectral theorem is the one in which the operator is already diagonal. Multiplication operators provide that model, and many quantum observables become multiplication operators after the right transform.
[definition: Multiplication Operator]
Let $(X,\mathcal E,\mu)$ be a [measure space](/page/Measure%20Space) and let $m:X\to\mathbb R$ be measurable. The multiplication operator $M_m$ on $L^2(X,\mathcal E,\mu)$ is defined by
\begin{align*}
M_m:\mathcal D(M_m)\subset L^2(X,\mathcal E,\mu)\to L^2(X,\mathcal E,\mu),\qquad
(M_m f)(x)=m(x)f(x)
\end{align*}
on the domain
\begin{align*}
\mathcal D(M_m)=\{f\in L^2(X,\mathcal E,\mu): mf\in L^2(X,\mathcal E,\mu)\}.
\end{align*}
[/definition]
Multiplication by a real measurable function is the prototype of a self-adjoint operator. To connect this model with the abstract theorem, we need to identify its spectral projections and verify that the integral reconstruction gives back multiplication by $m$.
[quotetheorem:6925]
[citeproof:6925]
This theorem converts an abstract statement into a concrete computation: measuring $M_m$ means observing the [random variable](/page/Random%20Variable) $m(x)$ when $x$ is distributed according to $|f(x)|^2d\mu(x)$. The reality of $m$ is essential for self-adjointness; multiplication by a genuinely complex-valued function is normal but not self-adjoint, so it cannot represent a real-valued observable. The theorem does not say that $M_m$ has eigenvectors for each value of $m$: if $m(x)=x$ on $(\mathbb R,\mathcal B(\mathbb R),\mathcal L^1)$, then every singleton has measure zero and there are no $L^2$ eigenfunctions. This model calculation is the bridge to position and momentum, where diagonalisation means changing to a representation in which the observable is multiplication by a real variable.
[example: Position Observable]
On $L^2(\mathbb R)$, the position observable is multiplication by the coordinate function $x$. Thus
\begin{align*}
(Q\psi)(x)=x\psi(x).
\end{align*}
The natural domain is exactly the set of square-integrable wavefunctions for which this product is still square-integrable:
\begin{align*}
\mathcal D(Q)=\{\psi\in L^2(\mathbb R):x\psi(x)\in L^2(\mathbb R)\}.
\end{align*}
By *Spectral Measure of a Multiplication Operator* with $m(x)=x$, the spectral projection attached to a Borel set $B\subset\mathbb R$ is multiplication by the indicator of $B$:
\begin{align*}
(E_Q(B)\psi)(x)=\mathbb{1}_B(x)\psi(x).
\end{align*}
Indeed, applying the projection twice gives
\begin{align*}
(E_Q(B)^2\psi)(x)=\mathbb{1}_B(x)(E_Q(B)\psi)(x)=\mathbb{1}_B(x)^2\psi(x)=\mathbb{1}_B(x)\psi(x).
\end{align*}
Since $\mathbb{1}_B$ is real-valued, multiplication by $\mathbb{1}_B$ is self-adjoint on $L^2(\mathbb R)$, so $E_Q(B)$ is an orthogonal projection.
If $\|\psi\|_{L^2}=1$, then the measurement distribution of $Q$ in the state $\psi$ satisfies
\begin{align*}
\mu_\psi^Q(B)=\|E_Q(B)\psi\|_{L^2}^2.
\end{align*}
Substituting the formula for $E_Q(B)$ gives
\begin{align*}
\|E_Q(B)\psi\|_{L^2}^2=\int_{\mathbb R}|\mathbb{1}_B(x)\psi(x)|^2\,d\mathcal L^1(x).
\end{align*}
Because $\mathbb{1}_B(x)^2=\mathbb{1}_B(x)$, this becomes
\begin{align*}
\|E_Q(B)\psi\|_{L^2}^2=\int_{\mathbb R}\mathbb{1}_B(x)|\psi(x)|^2\,d\mathcal L^1(x)=\int_B|\psi(x)|^2\,d\mathcal L^1(x).
\end{align*}
Thus position measurement in the position representation has probability density $|\psi(x)|^2$ with respect to [Lebesgue measure](/page/Lebesgue%20Measure).
[/example]
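As a numerical sanity check of the formula $\mu_\psi^Q(B)=\int_B|\psi|^2\,d\mathcal L^1$, the sketch below takes a hypothetical normalized Gaussian state and $B=[-1,1]$. The midpoint rule and the function names are choices made for this illustration; for this particular state the exact value is $\operatorname{erf}(1)$.

```python
import math

def psi(x):
    """Normalized Gaussian wavefunction pi^(-1/4) exp(-x^2/2) (hypothetical test state)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)

def position_probability(a, b, steps=20000):
    """Midpoint-rule approximation of the integral of |psi|^2 over B = [a, b]."""
    h = (b - a) / steps
    return sum(abs(psi(a + (i + 0.5) * h)) ** 2 for i in range(steps)) * h

p = position_probability(-1.0, 1.0)
print(p, math.erf(1.0))                   # the two values should agree closely
print(position_probability(-10.0, 10.0))  # total probability, close to 1
```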
Position is diagonal in the position representation. Momentum is not diagonal there, but the Fourier transform changes representation and turns differentiation into multiplication.
[example: Fourier Diagonalisation of Momentum]
Let $P:H^1(\mathbb R)\subset L^2(\mathbb R)\to L^2(\mathbb R)$ be the self-adjoint momentum operator $P\psi=-i\psi'$ in the [weak derivative](/page/Weak%20Derivative) sense. The symmetric Fourier transform $\mathcal F:L^2(\mathbb R)\to L^2(\mathbb R)$ is the unitary extension of
\begin{align*}
\hat{\psi}(\xi)=\frac{1}{(2\pi)^{1/2}}\int_{\mathbb R}\psi(x)e^{-i\xi x}\,d\mathcal L^1(x)
\end{align*}
initially defined on $L^1(\mathbb R)\cap L^2(\mathbb R)$.
For $\varphi\in C_c^\infty(\mathbb R)$, the defining integral gives
\begin{align*}
\widehat{P\varphi}(\xi)=\frac{1}{(2\pi)^{1/2}}\int_{\mathbb R}(-i\varphi'(x))e^{-i\xi x}\,d\mathcal L^1(x).
\end{align*}
Integrating by parts,
\begin{align*}
\int_{\mathbb R}\varphi'(x)e^{-i\xi x}\,d\mathcal L^1(x)=\varphi(x)e^{-i\xi x}\big|_{-\infty}^{\infty}-\int_{\mathbb R}\varphi(x)(-i\xi)e^{-i\xi x}\,d\mathcal L^1(x).
\end{align*}
The boundary term vanishes because $\varphi$ has compact support, so
\begin{align*}
\int_{\mathbb R}\varphi'(x)e^{-i\xi x}\,d\mathcal L^1(x)=i\xi\int_{\mathbb R}\varphi(x)e^{-i\xi x}\,d\mathcal L^1(x).
\end{align*}
Multiplying by $-i(2\pi)^{-1/2}$ gives
\begin{align*}
\widehat{P\varphi}(\xi)=\xi\hat{\varphi}(\xi).
\end{align*}
Now let $\psi\in H^1(\mathbb R)$. Choose $\varphi_n\in C_c^\infty(\mathbb R)$ with $\varphi_n\to\psi$ in $H^1(\mathbb R)$. Then $\varphi_n\to\psi$ in $L^2$ and $\varphi_n'\to\psi'$ in $L^2$, so $P\varphi_n\to P\psi$ in $L^2$. Since $\mathcal F$ is unitary,
\begin{align*}
\widehat{P\varphi_n}\to\widehat{P\psi}\quad\text{in }L^2(\mathbb R).
\end{align*}
Also $\hat{\varphi}_n\to\hat{\psi}$ in $L^2$, and the identities $\widehat{P\varphi_n}=\xi\hat{\varphi}_n$ show that $\xi\hat{\varphi}_n$ converges in $L^2$ to $\widehat{P\psi}$. Since multiplication by $\xi$ is a closed operator, $\hat{\psi}$ belongs to its domain, and
\begin{align*}
\widehat{P\psi}(\xi)=\xi\hat{\psi}(\xi)
\end{align*}
as an identity in $L^2(\mathbb R)$.
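Thus $\mathcal F P\mathcal F^{-1}\subset M_\xi$. Both operators are self-adjoint, and a self-adjoint operator has no proper symmetric extension, so $\mathcal F P\mathcal F^{-1}=M_\xi$, equivalently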
\begin{align*}
P=\mathcal F^{-1}M_\xi\mathcal F.
\end{align*}
By *Spectral Measure of a Multiplication Operator*, the spectral projection of $M_\xi$ for a Borel set $B$ is multiplication by $\mathbb{1}_B(\xi)$. Hence the spectral projection of $P$ is
\begin{align*}
E_P(B)=\mathcal F^{-1}E_{M_\xi}(B)\mathcal F.
\end{align*}
For a normalized state $\psi$, this gives
\begin{align*}
\|E_P(B)\psi\|_{L^2}^2=\|E_{M_\xi}(B)\hat{\psi}\|_{L^2}^2
\end{align*}
by unitarity of $\mathcal F$, and therefore
\begin{align*}
\|E_P(B)\psi\|_{L^2}^2=\int_{\mathbb R}|\mathbb{1}_B(\xi)\hat{\psi}(\xi)|^2\,d\mathcal L^1(\xi)=\int_B|\hat{\psi}(\xi)|^2\,d\mathcal L^1(\xi).
\end{align*}
So momentum is diagonal in the Fourier representation, and measuring momentum in the state $\psi$ gives the probability measure $|\hat{\psi}(\xi)|^2\,d\mathcal L^1(\xi)$.
[/example]
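The identity $\widehat{P\psi}(\xi)=\xi\hat\psi(\xi)$ can be checked numerically. The sketch below is a rough illustration: it uses the Gaussian $\varphi(x)=e^{-x^2/2}$, which is not compactly supported but lies in $H^1(\mathbb R)$ and decays fast enough that truncating the integral to $[-R,R]$ is harmless. The quadrature rule, the cutoff `R`, and the function names are assumptions of this sketch.

```python
import cmath
import math

def fourier(g, xi, R=12.0, steps=4000):
    """Midpoint-rule approximation of (2 pi)^(-1/2) * int_{-R}^{R} g(x) exp(-i xi x) dx."""
    h = 2.0 * R / steps
    total = 0j
    for i in range(steps):
        x = -R + (i + 0.5) * h
        total += g(x) * cmath.exp(-1j * xi * x)
    return total * h / math.sqrt(2.0 * math.pi)

def phi(x):
    """Test function phi(x) = exp(-x^2/2)."""
    return math.exp(-0.5 * x * x)

def P_phi(x):
    """(P phi)(x) = -i phi'(x) = i x exp(-x^2/2), with the derivative taken by hand."""
    return 1j * x * math.exp(-0.5 * x * x)

xi = 1.5
lhs = fourier(P_phi, xi)     # transform of P phi
rhs = xi * fourier(phi, xi)  # xi times the transform of phi
print(lhs, rhs)              # should agree up to quadrature error
```

With the symmetric convention the Gaussian is its own transform, so both numbers should also be close to $\xi e^{-\xi^2/2}$.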
The Fourier transform therefore plays the role of a change of basis from position space to momentum space. General spectral representations are abstract versions of this same manoeuvre.
## Types of Spectrum and Generalized Eigenfunctions
The spectral theorem assigns projections to all Borel sets, but different parts of the spectrum behave differently in computations. Bound states correspond to genuine eigenvectors, scattering states are usually represented by non-square-integrable generalized eigenfunctions, and singular spectral pieces are neither of these in a simple way.
[definition: Point Continuous and Residual Spectrum]
Let $H$ be a complex Hilbert space, and let $A:\mathcal D(A)\subset H\to H$ be a closed densely defined operator. The point spectrum $\sigma_p(A)$ is the set of $\lambda\in\mathbb C$ such that $A-\lambda I:\mathcal D(A)\subset H\to H$ is not injective. The continuous spectrum $\sigma_c(A)$ is the set of $\lambda$ such that $A-\lambda I:\mathcal D(A)\subset H\to H$ is injective with dense range but has no bounded inverse on its range. The residual spectrum $\sigma_r(A)$ is the set of $\lambda$ such that $A-\lambda I:\mathcal D(A)\subset H\to H$ is injective and its range is not dense.
[/definition]
For self-adjoint operators the residual spectrum is empty. We need this result because it separates observables from merely closed operators: spectral mass for an observable cannot hide in a non-dense range obstruction.
[quotetheorem:6926]
[citeproof:6926]
The self-adjointness hypothesis cannot be replaced by closedness or boundedness alone. Let $S:\ell^2(\mathbb N)\to\ell^2(\mathbb N)$ be the unilateral shift $S e_n=e_{n+1}$. Then $S$ is bounded and injective, but $\operatorname{Range}(S)=\{x\in\ell^2(\mathbb N):x_1=0\}$ is not dense, so $0\in\sigma_r(S)$. This example shows the exact obstruction excluded by self-adjointness: a spectral value can appear through a missing range direction rather than through eigenvectors or approximate eigenvectors.
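The missing range direction can be seen on finitely supported sequences. The following Python sketch is a small illustration: sequences are modelled as finite lists padded by zeros, only real entries are used, and the helper names are hypothetical.

```python
def shift(x):
    """Unilateral shift S: (x_1, x_2, ...) -> (0, x_1, x_2, ...)."""
    return [0.0] + list(x)

def shift_adjoint(x):
    """Adjoint S*: (x_1, x_2, ...) -> (x_2, x_3, ...)."""
    return list(x[1:])

def inner(x, y):
    """Inner product of two finitely supported real sequences (zero-padded)."""
    return sum(a * b for a, b in zip(x, y))

e1 = [1.0]
x = [2.0, -1.0, 0.5]
print(inner(e1, shift(x)))           # e_1 is orthogonal to every vector in Range(S)
print(shift_adjoint(shift(x)) == x)  # S*S = I, so S is injective
print(shift(shift_adjoint(e1)))      # SS* e_1 = 0, so SS* is not the identity
```

Since $e_1$ is orthogonal to the range, it stays at distance $1$ from $\operatorname{Range}(S)$, which is the non-density used above.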
The point and continuous spectra still need finer language. The spectral measure of a vector may have atoms, a density with respect to Lebesgue measure, or a singular continuous component. The obstruction is that the label "continuous spectrum" only records failure of a bounded inverse, not the kind of measure present there: multiplication by $x$ on $L^2(\mathbb R)$ has absolutely continuous spectrum, while multiplication by $x$ on $L^2(C,\nu)$ for the [Cantor set](/page/Cantor%20Set) $C$ with Cantor measure $\nu$ has singular continuous spectral behaviour. These cases require different expansion methods even though both lack ordinary eigenvectors.
[definition: Spectral Type]
Let $A$ be self-adjoint with spectral measure $E_A$. The pure point subspace $H_{\mathrm{pp}}$ is the closed span of all eigenvectors of $A$. The absolutely continuous subspace $H_{\mathrm{ac}}$ consists of vectors $u$ such that $E_{A,u}$ is absolutely continuous with respect to Lebesgue measure. The singular continuous subspace $H_{\mathrm{sc}}$ consists of vectors $u$ such that $E_{A,u}$ has no atoms and is singular with respect to Lebesgue measure.
[/definition]
The notation separates three physical behaviours: bound states, scattering states with density in the spectral parameter, and singular continuous states that occur in more delicate models. We need a theorem saying these behaviours are not mixed ambiguously, but form orthogonal reducing pieces of the Hilbert space.
[quotetheorem:6927]
This decomposition is not merely terminology: it dictates how solutions are expanded. The distinction depends on self-adjointness and the projection-valued measure; for a general closed operator, root subspaces and nonorthogonal spectral behaviour can replace this clean orthogonal splitting. A concrete finite-dimensional obstruction is the nilpotent operator $N$ on $\mathbb C^2$ determined by $Ne_1=0$ and $Ne_2=e_1$. It has spectrum $\{0\}$, but it is not diagonalizable and cannot be reconstructed from orthogonal spectral projections, because a projection-valued spectral resolution with only the spectral value $0$ would force $N=0$. Thus even before measure-theoretic spectral types enter, dropping the projection-valued measure destroys the orthogonal decomposition mechanism. The theorem also does not identify which part is present for a given Hamiltonian, since a free particle, a confining potential, and a singular potential may have very different spectral types. Its role is structural: once later analysis determines the measure class, the spectral theorem tells us how to expand states and interpret the corresponding measurement statistics. Discrete parts are sums over eigenvectors, while absolutely continuous parts require integral transforms, so the next concept records the formal eigenfunctions that appear in those transforms.
[definition: Generalized Eigenfunction]
Let $\Omega\subseteq\mathbb R^n$ be open, let $H$ be a closed subspace of $L^2(\Omega)$, and let $A:\mathcal D(A)\subset H\to H$ be a self-adjoint differential operator. A generalized eigenfunction of $A$ with spectral parameter $\lambda\in\mathbb R$ is a nonzero $u\in\mathcal D'(\Omega)$ such that the distributional expression $Au$ is defined and
\begin{align*}
Au=\lambda u
\end{align*}
in $\mathcal D'(\Omega)$.
[/definition]
Generalized eigenfunctions need not belong to the Hilbert space, and those that do not are not eigenvectors in the sense of the point spectrum. Their value is that they provide kernels for spectral transforms.
[example: Plane Waves for Momentum]
Fix $\xi\in\mathbb R$ and set $u_\xi(x)=e^{i\xi x}$. Since $u_\xi$ is smooth, its distributional derivative agrees with its classical derivative. For every $x\in\mathbb R$,
\begin{align*}
\frac{d}{dx}u_\xi(x)=\frac{d}{dx}e^{i\xi x}=i\xi e^{i\xi x}.
\end{align*}
Multiplying by $-i$ gives
\begin{align*}
-i\frac{d}{dx}u_\xi(x)=(-i)(i\xi)e^{i\xi x}=\xi e^{i\xi x}=\xi u_\xi(x).
\end{align*}
Thus $u_\xi$ solves the eigenvalue equation $Pu_\xi=\xi u_\xi$ in the distributional sense.
However $u_\xi$ is not an eigenvector in $L^2(\mathbb R)$, because $|e^{i\xi x}|=1$ for every $x$, so
\begin{align*}
\int_{\mathbb R}|u_\xi(x)|^2\,d\mathcal L^1(x)=\int_{\mathbb R}1\,d\mathcal L^1(x)=\infty.
\end{align*}
The Fourier transform still uses these functions as its oscillatory kernel:
\begin{align*}
\hat{\psi}(\xi)=\frac{1}{(2\pi)^{1/2}}\int_{\mathbb R}\psi(x)e^{-i\xi x}\,d\mathcal L^1(x)=\frac{1}{(2\pi)^{1/2}}\int_{\mathbb R}\psi(x)\overline{u_\xi(x)}\,d\mathcal L^1(x).
\end{align*}
So the momentum operator has no normalizable plane-wave eigenvectors, but its continuous spectral representation is obtained by expanding states against the generalized eigenfunctions $u_\xi$, one for each momentum value $\xi\in\mathbb R$.
[/example]
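Both halves of the plane-wave example can be illustrated numerically: the eigenvalue equation holds pointwise, while the truncated $L^2$ norm grows without bound. The sketch below uses a central difference for the derivative and a midpoint rule for the integral. The step sizes and function names are assumptions of this sketch.

```python
import cmath

def u(xi, x):
    """Plane wave u_xi(x) = exp(i xi x)."""
    return cmath.exp(1j * xi * x)

def momentum_applied(xi, x, h=1e-5):
    """Central-difference approximation of (-i d/dx u_xi)(x)."""
    return -1j * (u(xi, x + h) - u(xi, x - h)) / (2.0 * h)

def truncated_norm_sq(xi, R, steps=1000):
    """Midpoint-rule value of the integral of |u_xi|^2 over [-R, R]."""
    h = 2.0 * R / steps
    return sum(abs(u(xi, -R + (i + 0.5) * h)) ** 2 for i in range(steps)) * h

xi, x0 = 2.0, 0.3
print(momentum_applied(xi, x0), xi * u(xi, x0))                 # approximately equal
print([truncated_norm_sq(xi, R) for R in (1.0, 10.0, 100.0)])   # grows like 2R
```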
The same pattern appears for Schrödinger operators: negative eigenvalues often correspond to square-integrable bound states, while positive energies are represented by oscillatory generalized eigenfunctions. The spectral theorem is what makes both descriptions part of one operator-theoretic framework.
## Measurement and the Born Rule
The spectral theorem turns the slogan "observables are self-adjoint operators" into a rule for experimental statistics. A self-adjoint operator determines not only expected values but a probability distribution for every state.
[definition: Measurement Distribution]
Let $A$ be self-adjoint on $H$ with spectral measure $E_A$, and let $u\in H$ satisfy $\|u\|_H=1$. The measurement distribution of $A$ in state $u$ is the probability measure $\mu_u^A$ on $\mathbb R$ defined by
\begin{align*}
\mu_u^A(B)=\|E_A(B)u\|_H^2=(E_A(B)u,u)_H.
\end{align*}
[/definition]
The expectation and variance from the first chapter are recovered as moments of this measure whenever the corresponding integrals are finite. We need the moment formula to verify that the spectral measurement rule agrees with the operator formulas already used for observables.
[quotetheorem:6928]
[citeproof:6928]
This theorem completes the link between operator theory and measurement. An observable is not a device producing a single number before a state is chosen; it is a self-adjoint operator whose spectral measure produces a probability law for each normalized state. The domain assumptions are essential: for an unbounded observable, a normalized vector outside $\mathcal D(A)$ still has a measurement distribution, but its expectation may fail to be finite; for position on $L^2(\mathbb R)$, a state with sufficiently heavy tails can have no finite first moment. The theorem also does not describe state update after measurement, only the distribution and moments of outcomes. That remaining post-measurement rule is illustrated in the finite-outcome example and then generalized by spectral projections.
[example: Two Outcomes and Spectral Projection]
Suppose the spectral measure of $A$ is concentrated at two values $a_+$ and $a_-$, with corresponding orthogonal projections $P_+$ and $P_-$. Thus $P_+P_-=0$, $P_-P_+=0$, $P_+^2=P_+$, $P_-^2=P_-$, and $P_++P_-=I$. For a normalized state $u$, the measurement distribution assigns
\begin{align*}
\mu_u^A(\{a_+\})=\|P_+u\|_H^2
\end{align*}
and
\begin{align*}
\mu_u^A(\{a_-\})=\|P_-u\|_H^2.
\end{align*}
These two probabilities add to $1$ because
\begin{align*}
u=(P_++P_-)u=P_+u+P_-u
\end{align*}
and the two summands are orthogonal:
\begin{align*}
(P_+u,P_-u)_H=(P_-P_+u,u)_H=(0,u)_H=0.
\end{align*}
Therefore
\begin{align*}
1=\|u\|_H^2=\|P_+u+P_-u\|_H^2=\|P_+u\|_H^2+\|P_-u\|_H^2.
\end{align*}
Since $A$ acts as multiplication by $a_+$ on $\operatorname{Range}(P_+)$ and by $a_-$ on $\operatorname{Range}(P_-)$, we have
\begin{align*}
Au=a_+P_+u+a_-P_-u.
\end{align*}
Taking the inner product with $u=P_+u+P_-u$ gives
\begin{align*}
(Au,u)_H=(a_+P_+u+a_-P_-u,P_+u+P_-u)_H.
\end{align*}
Expanding by linearity in the first variable and conjugate-linearity in the second gives
\begin{align*}
(Au,u)_H=a_+\|P_+u\|_H^2+a_+(P_+u,P_-u)_H+a_-(P_-u,P_+u)_H+a_-\|P_-u\|_H^2.
\end{align*}
The two mixed terms vanish by orthogonality, so
\begin{align*}
(Au,u)_H=a_+\|P_+u\|_H^2+a_-\|P_-u\|_H^2.
\end{align*}
If the outcome $a_+$ is observed and $P_+u\ne 0$, the conditioned state is
\begin{align*}
u_+=\frac{P_+u}{\|P_+u\|_H}.
\end{align*}
It is normalized because
\begin{align*}
\|u_+\|_H^2=\left\|\frac{P_+u}{\|P_+u\|_H}\right\|_H^2=\frac{\|P_+u\|_H^2}{\|P_+u\|_H^2}=1.
\end{align*}
Thus the two-outcome case reduces measurement to orthogonal decomposition: probabilities are squared projection lengths, expectation is their weighted average, and conditioning means normalizing the observed spectral component.
[/example]
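The two-outcome rules can be traced on a concrete pair of projections. The Python sketch below assumes a hypothetical observable on $\mathbb C^2$ with outcomes $a_\pm=\pm 1$ and projections onto the spans of $(1,1)$ and $(1,-1)$. The function `measure` is an illustrative helper, not notation from the text.

```python
import math

# Hypothetical two-outcome observable A = a_plus * P_plus + a_minus * P_minus on C^2.
A_PLUS, A_MINUS = 1.0, -1.0
P_PLUS = [[0.5, 0.5], [0.5, 0.5]]      # projection onto span{(1, 1)}
P_MINUS = [[0.5, -0.5], [-0.5, 0.5]]   # projection onto span{(1, -1)}

def mat_vec(X, u):
    return [sum(X[i][j] * u[j] for j in range(2)) for i in range(2)]

def norm(u):
    return math.sqrt(sum(abs(c) ** 2 for c in u))

def measure(u):
    """Return (p_plus, p_minus, expectation, state conditioned on a_plus)."""
    plus, minus = mat_vec(P_PLUS, u), mat_vec(P_MINUS, u)
    p_plus, p_minus = norm(plus) ** 2, norm(minus) ** 2
    expectation = A_PLUS * p_plus + A_MINUS * p_minus
    # Conditioning is only defined when the observed component is nonzero.
    conditioned = [c / norm(plus) for c in plus] if p_plus > 0 else None
    return p_plus, p_minus, expectation, conditioned

u = [0.6, 0.8]  # normalized state
p_plus, p_minus, expectation, u_plus = measure(u)
print(p_plus, p_minus, expectation, u_plus)
```

For this state the observable is $A=\begin{pmatrix}0&1\\1&0\end{pmatrix}$, so the expectation can be cross-checked against $(Au,u)_H=2\cdot 0.6\cdot 0.8$.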
The finite-outcome formula is the old projection postulate from Chapter 1. The spectral theorem extends it from matrices and finite sums to arbitrary self-adjoint observables, replacing sums by spectral integrals and eigenspace projections by the full resolution of identity.
The spectral theorem gives a general language for measuring arbitrary self-adjoint observables. We now apply that language to the canonical pair of position and momentum, where the commutation relation forces a distinctive infinite-dimensional representation.
# 4. Canonical Commutation and the Schrödinger Representation
The preceding chapters made observables precise as self-adjoint operators and explained how the spectral theorem turns them into measurement rules. This chapter specializes those ideas to the pair of observables that encode motion on the line: position and momentum. The central question is how to represent the canonical commutation relation in a Hilbert space without hiding the domain issues caused by unbounded operators.
The answer is the Schrödinger representation on $L^2(\mathbb R)$, supported by Fourier analysis and by a controlled use of generalized eigenvectors. We first formulate the canonical commutation relation in its Weyl form, where exponentials of position and momentum are bounded unitary operators. We then connect this representation to uncertainty, to the Fourier transform, and to the rigged Hilbert space notation used throughout physics.
## Position and Momentum on the Line
Building on the position and momentum examples from Chapters 2 and 3, the problem is to turn the classical coordinates $q$ and $p$ on phase space into operators acting on wave functions. Multiplication by $x$ is natural for position, while differentiation is natural for momentum, but both operators are unbounded and therefore must be accompanied by domains.
[definition: Schrödinger Position Operator]
The Schrödinger position operator is the unbounded operator
\begin{align*}
Q &: \mathcal D(Q) \subset L^2(\mathbb R) \to L^2(\mathbb R),
\end{align*}
where
\begin{align*}
\mathcal D(Q) = \{\psi \in L^2(\mathbb R) : x\psi(x) \in L^2(\mathbb R)\},
\end{align*}
and
\begin{align*}
Q &: \psi \mapsto Q\psi, & (Q\psi)(x) &= x\psi(x).
\end{align*}
[/definition]
This operator records the spatial distribution of a state. The next observable should generate spatial translations, so its infinitesimal action is differentiation.
[definition: Schrödinger Momentum Operator]
In this chapter we keep the physical factor $\hbar$ explicit, so the Schrödinger momentum operator is the unbounded operator
\begin{align*}
P &: H^1(\mathbb R) \subset L^2(\mathbb R) \to L^2(\mathbb R),
\end{align*}
given by
\begin{align*}
P &: \psi \mapsto P\psi, & (P\psi)(x) &= -i\hbar\frac{d\psi}{dx}(x).
\end{align*}
[/definition]
The next problem is to test whether these two operators reproduce the canonical bracket expected from classical phase space. This test is needed because merely naming an operator as position or momentum does not guarantee the right algebraic relation between them. Since $Q$ and $P$ are unbounded, the verification is made on the common invariant test domain $\mathcal S(\mathbb R)$, where both products $QP$ and $PQ$ are meaningful.
[quotetheorem:6929]
[citeproof:6929]
This calculation is the familiar canonical commutation relation, but the hypothesis $\psi\in\mathcal S(\mathbb R)$ matters because both $QP\psi$ and $PQ\psi$ must exist. It does not assert that $QP-PQ=i\hbar I$ as an operator identity on all of $L^2(\mathbb R)$, since a generic $L^2$ function need not be differentiable and need not remain in the required domains after applying $Q$ or $P$. For example, the indicator function $\mathbf 1_{[0,1]}$ belongs to $L^2(\mathbb R)$ but not to $H^1(\mathbb R)$, because its distributional derivative $\delta_0-\delta_1$ is not an $L^2$ function; for such $\psi$ the vector $P\psi$ is undefined and the commutator calculation cannot be applied. This is why exponentiating the operators is preferable: translations and phase multiplications are defined everywhere on $L^2(\mathbb R)$, and the next example introduces the translations.
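The commutator identity on a Schwartz function can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it uses units with $\hbar=1$, a Gaussian as the sample function, and a central finite difference in place of the exact derivative; the helper names `momentum`, `commutator`, and `max_defect` are hypothetical.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1 for the numerical check


def psi(x):
    """Sample Schwartz function: an unnormalised Gaussian."""
    return math.exp(-x * x / 2.0)


def momentum(f, x, h=1e-4):
    """Approximate (P f)(x) = -i*hbar*f'(x) by a central difference."""
    return -1j * HBAR * (f(x + h) - f(x - h)) / (2.0 * h)


def commutator(f, x):
    """Approximate ((QP - PQ) f)(x)."""
    qp = x * momentum(f, x)                 # multiply by x after differentiating
    pq = momentum(lambda y: y * f(y), x)    # differentiate after multiplying by x
    return qp - pq


def max_defect(points):
    """Largest deviation of (QP - PQ)psi from i*hbar*psi over the sample points."""
    return max(abs(commutator(psi, x) - 1j * HBAR * psi(x)) for x in points)


print(max_defect([-2.0, -0.5, 0.0, 0.7, 3.0]))
```

The printed defect is of the size of the finite-difference error. The check says nothing about vectors outside the common domain, which is exactly the limitation discussed above.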
[example: Translation Generated by Momentum]
For $a\in\mathbb R$, define $(T(a)\psi)(x)=\psi(x-a)$. For $\psi\in L^2(\mathbb R)$, the change of variables $y=x-a$ gives
\begin{align*}
\|T(a)\psi\|_{L^2}^2=\int_{\mathbb R}|\psi(x-a)|^2\,d\mathcal L^1(x)=\int_{\mathbb R}|\psi(y)|^2\,d\mathcal L^1(y)=\|\psi\|_{L^2}^2.
\end{align*}
Also $(T(a)T(b)\psi)(x)=(T(b)\psi)(x-a)=\psi(x-a-b)=(T(a+b)\psi)(x)$ and $T(0)\psi=\psi$, so $T(a)$ is invertible with $T(a)^{-1}=T(-a)$. An invertible isometry is unitary, hence each $T(a)$ is unitary.
For $\psi\in\mathcal S(\mathbb R)$, differentiating the translated function with respect to $a$ gives
\begin{align*}
\frac{d}{da}T(a)\psi(x)=\frac{d}{da}\psi(x-a)=-\psi'(x-a).
\end{align*}
At $a=0$ this becomes
\begin{align*}
\left.\frac{d}{da}\right|_{a=0}T(a)\psi(x)=-\psi'(x).
\end{align*}
Since $P\psi=-i\hbar\psi'$, we have
\begin{align*}
-\frac{i}{\hbar}P\psi=-\frac{i}{\hbar}(-i\hbar\psi')=-\psi'.
\end{align*}
Thus the infinitesimal generator of $T(a)$ on $\mathcal S(\mathbb R)$ is $-\frac{i}{\hbar}P$, so by *Stone's theorem for strongly continuous unitary groups*,
\begin{align*}
T(a)=\exp\left(-\frac{iaP}{\hbar}\right).
\end{align*}
This makes precise the physical statement that momentum generates spatial translations: applying $P$ measures the first-order change of a wave function under displacement.
[/example]
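For wave functions that extend to entire functions, such as Gaussians, the exponential can also be read as a convergent Taylor series. The following display is a sketch under that analyticity assumption and not a definition valid on all of $L^2(\mathbb R)$. Since $-\frac{ia}{\hbar}P\psi=-a\psi'$,
\begin{align*}
\sum_{n=0}^\infty\frac{1}{n!}\left(-\frac{iaP}{\hbar}\right)^n\psi(x)=\sum_{n=0}^\infty\frac{(-a)^n}{n!}\psi^{(n)}(x)=\psi(x-a).
\end{align*}
For a general Schwartz function the series can fail to reproduce the translate: a nonzero smooth function with compact support has vanishing Taylor series at every point outside its support, while $\psi(x-a)$ may be nonzero there. This is one reason the exponential is defined through Stone's theorem rather than through a power series.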
## Weyl Relations and Uniqueness of the Schrödinger Representation
The commutator identity alone is too fragile for a general representation theory, because different choices of domains can change the meaning of $QP-PQ$. The better question is how the unitary groups generated by $Q$ and $P$ interact.
[definition: Weyl Operators]
For $a,b\in\mathbb R$, the Schrödinger Weyl operators are the unitary maps
\begin{align*}
U(a),V(b) &: L^2(\mathbb R) \to L^2(\mathbb R),
\end{align*}
defined by
\begin{align*}
U(a) &: \psi \mapsto U(a)\psi, & (U(a)\psi)(x) &= e^{iax}\psi(x),
\end{align*}
and
\begin{align*}
V(b) &: \psi \mapsto V(b)\psi, & (V(b)\psi)(x) &= \psi(x-b).
\end{align*}
[/definition]
Here $U(a)=e^{iaQ}$ is generated by $Q$, while $V(b)=e^{-ibP/\hbar}$ is the translation group generated by $P$ after rescaling by $\hbar$. Since both operators are unitary on the whole Hilbert space, their product can be formed without any domain restriction. The key question is whether the infinitesimal commutator has a bounded counterpart that still remembers the non-commutativity of position and momentum.
[quotetheorem:6930]
[citeproof:6930]
The Weyl relation shows that translating first and multiplying by a phase first differ by a scalar phase. The hypotheses that $a,b\in\mathbb R$ and that $U(a),V(b)$ act on all of $L^2(\mathbb R)$ matter because the proof compares two everywhere-defined unitary operators, not two partially defined differential expressions. The theorem does not say that $U(a)$ and $V(b)$ commute; when $ab\notin 2\pi\mathbb Z$, the scalar $e^{iab}$ is not $1$, so the order of the operations is detected by a phase. This phase is the analytic shadow of the symplectic geometry of classical phase space, with its sign fixed by the convention $V(b)\psi(x)=\psi(x-b)$. Once the relation has been stated in terms of strongly continuous unitary groups, it becomes meaningful to ask whether some other irreducible Hilbert space model of the same relation could exist.
[quotetheorem:6931]
Separability and strong continuity exclude pathological Hilbert space representations in which Stone's theorem cannot recover self-adjoint generators from the unitary groups. Irreducibility is also essential: a direct sum of two Schrödinger representations satisfies the same Weyl relation but is not unitarily equivalent to a single irreducible copy on $L^2(\mathbb R)$. The condition $\lambda\ne 0$ separates genuine quantum commutation from the commuting case $\lambda=0$, where simultaneous multiplication models are possible. The structural proof belongs to the representation theory of the Heisenberg group and uses the spectral theorem together with irreducibility. The message for quantum mechanics is that the standard position-momentum representation is not merely a convenient model: under the stated hypotheses, it is the unique irreducible model.
[remark: Why the Weyl Form Is Used]
The unbounded operators $Q$ and $P$ cannot be multiplied on arbitrary vectors, and their commutator depends on the intersection of several domains. By contrast, $U(a)$ and $V(b)$ are unitary on the whole Hilbert space. The Weyl relation therefore gives a stable formulation of the canonical commutation relations.
[/remark]
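With the definitions of $U(a)$ and $V(b)$ given above, a direct computation gives $(U(a)V(b)\psi)(x)=e^{iax}\psi(x-b)$ and $(V(b)U(a)\psi)(x)=e^{ia(x-b)}\psi(x-b)$, so the two orders differ by the phase $e^{iab}$. The following sketch checks this pointwise for one sample function. The sample function and the helper names `weyl_defect` and `commutation_defect` are hypothetical choices made for illustration.

```python
import cmath


def U(a, f):
    """Phase multiplication: (U(a) f)(x) = exp(i a x) f(x)."""
    return lambda x: cmath.exp(1j * a * x) * f(x)


def V(b, f):
    """Translation: (V(b) f)(x) = f(x - b)."""
    return lambda x: f(x - b)


def psi(x):
    """Sample wave function (hypothetical choice): a shifted Gaussian with a phase."""
    return cmath.exp(-(x - 0.3) ** 2 + 0.5j * x)


def weyl_defect(a, b, points):
    """Max of |U(a)V(b)psi - exp(i a b) V(b)U(a)psi| over the sample points."""
    lhs, rhs = U(a, V(b, psi)), V(b, U(a, psi))
    phase = cmath.exp(1j * a * b)
    return max(abs(lhs(x) - phase * rhs(x)) for x in points)


def commutation_defect(a, b, points):
    """Max of |U(a)V(b)psi - V(b)U(a)psi|: nonzero when a*b is not in 2*pi*Z."""
    lhs, rhs = U(a, V(b, psi)), V(b, U(a, psi))
    return max(abs(lhs(x) - rhs(x)) for x in points)


pts = [-1.0, 0.0, 1.0, 2.5]
print(weyl_defect(1.3, 0.7, pts), commutation_defect(1.3, 0.7, pts))
```

The first number is at rounding level, while the second is visibly nonzero, which illustrates that the operators fail to commute only by a scalar phase.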
## The Heisenberg Uncertainty Principle
The commutation relation has a measurable consequence: no normalized state can make both position and momentum sharply concentrated. The question is how to turn the algebraic relation $[Q,P]=i\hbar I$ into a quantitative lower bound on variances.
[definition: Variance of an Observable in a State]
Let $A:\mathcal D(A)\subset H\to H$ be a self-adjoint operator on a Hilbert space $H$. For each normalized state $\psi\in\mathcal D(A)$, the expectation assignment is
\begin{align*}
\mathbb E_\psi &: \{A\} \to \mathbb R, & A &\mapsto \mathbb E_\psi[A] := (A\psi,\psi)_H.
\end{align*}
For the same normalized domain vector $\psi\in\mathcal D(A)$, the variance assignment is
\begin{align*}
\Delta_\psi^2 &: \{A\} \to [0,\infty), & A &\mapsto (\Delta_\psi A)^2 := \|(A-\mathbb E_\psi[A]I)\psi\|_H^2.
\end{align*}
[/definition]
Variance measures spread around the expected measurement value, so small $\Delta_\psi Q$ means spatial localization and small $\Delta_\psi P$ means momentum localization. For a general pair of observables there need not be a universal lower bound on the product of spreads; for instance, a common eigenvector of two commuting observables makes both variances vanish. For $Q$ and $P$, the commutator supplies exactly the missing lower bound.
[quotetheorem:6932]
The inequality is sharp, so it is not merely a qualitative obstruction. The normalization hypothesis is needed because rescaling $\psi$ rescales the inner products used to define expectation and variance; the statement is about physical states, which are unit vectors. The Schwartz hypothesis keeps all terms in the proof inside the domains of $Q$, $P$, and their commutator, and the theorem does not claim that every normalized vector of $L^2(\mathbb R)$ has finite position and momentum variance. For instance, $\psi(x)=\pi^{-1/2}(1+x^2)^{-1/2}$ is normalized, but $x^2|\psi(x)|^2\to 1/\pi$ as $|x|\to\infty$, so $x\psi\notin L^2(\mathbb R)$, $\psi\notin\mathcal D(Q)$, and $\Delta_\psi Q$ is not finite. The states that attain equality are Gaussian wave packets, which is one reason Gaussians are central in free quantum evolution and the harmonic oscillator.
[example: Minimum-Uncertainty Gaussian]
Let
\begin{align*}
\psi(x)=\frac{1}{(2\pi\sigma^2)^{1/4}}\exp\left(-\frac{(x-x_0)^2}{4\sigma^2}+\frac{ip_0x}{\hbar}\right),
\end{align*}
where $\sigma>0$ and $x_0,p_0\in\mathbb R$. Since the phase has modulus $1$,
\begin{align*}
|\psi(x)|^2=\frac{1}{(2\pi\sigma^2)^{1/2}}\exp\left(-\frac{(x-x_0)^2}{2\sigma^2}\right).
\end{align*}
With $u=(x-x_0)/\sigma$, so that $dx=\sigma\,du$, the normalization is
\begin{align*}
\|\psi\|_{L^2}^2=\frac{1}{(2\pi\sigma^2)^{1/2}}\int_{\mathbb R}\exp\left(-\frac{(x-x_0)^2}{2\sigma^2}\right)\,d\mathcal L^1(x)=\frac{1}{(2\pi)^{1/2}}\int_{\mathbb R}e^{-u^2/2}\,d\mathcal L^1(u)=1.
\end{align*}
The same substitution gives the position expectation:
\begin{align*}
\mathbb E_\psi[Q]=\int_{\mathbb R}x|\psi(x)|^2\,d\mathcal L^1(x)=\frac{1}{(2\pi)^{1/2}}\int_{\mathbb R}(x_0+\sigma u)e^{-u^2/2}\,d\mathcal L^1(u)=x_0.
\end{align*}
Here the $u$-term vanishes because $u e^{-u^2/2}$ is odd. Therefore
\begin{align*}
(\Delta_\psi Q)^2=\int_{\mathbb R}(x-x_0)^2|\psi(x)|^2\,d\mathcal L^1(x)=\frac{\sigma^2}{(2\pi)^{1/2}}\int_{\mathbb R}u^2e^{-u^2/2}\,d\mathcal L^1(u)=\sigma^2,
\end{align*}
where $\int_{\mathbb R}u^2e^{-u^2/2}\,d\mathcal L^1(u)=\int_{\mathbb R}e^{-u^2/2}\,d\mathcal L^1(u)=(2\pi)^{1/2}$ follows by integration by parts.
Now differentiate the wave function:
\begin{align*}
\psi'(x)=\left(-\frac{x-x_0}{2\sigma^2}+\frac{ip_0}{\hbar}\right)\psi(x).
\end{align*}
Thus
\begin{align*}
(P\psi)(x)=-i\hbar\psi'(x)=\left(p_0+\frac{i\hbar(x-x_0)}{2\sigma^2}\right)\psi(x).
\end{align*}
Hence
\begin{align*}
\mathbb E_\psi[P]=\int_{\mathbb R}\left(p_0+\frac{i\hbar(x-x_0)}{2\sigma^2}\right)|\psi(x)|^2\,d\mathcal L^1(x)=p_0,
\end{align*}
because $\int_{\mathbb R}(x-x_0)|\psi(x)|^2\,d\mathcal L^1(x)=0$. Subtracting the expectation gives
\begin{align*}
((P-p_0I)\psi)(x)=\frac{i\hbar(x-x_0)}{2\sigma^2}\psi(x).
\end{align*}
Therefore
\begin{align*}
(\Delta_\psi P)^2=\|(P-p_0I)\psi\|_{L^2}^2=\frac{\hbar^2}{4\sigma^4}\int_{\mathbb R}(x-x_0)^2|\psi(x)|^2\,d\mathcal L^1(x)=\frac{\hbar^2}{4\sigma^2}.
\end{align*}
Since $\sigma>0$, this gives $\Delta_\psi Q=\sigma$ and $\Delta_\psi P=\hbar/(2\sigma)$, so
\begin{align*}
\Delta_\psi Q\,\Delta_\psi P=\sigma\cdot\frac{\hbar}{2\sigma}=\frac{\hbar}{2}.
\end{align*}
This example shows that equality in the Heisenberg inequality is physically attained by localized wave packets with Gaussian profile.
[/example]
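The variances computed in the Gaussian example can be reproduced numerically. The sketch below assumes units with $\hbar=1$, replaces integrals by Riemann sums on a truncated grid, and approximates the derivative by a central difference; the function `uncertainty_product` and the comparison state `first_excited` are hypothetical choices, not part of the text above.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1


def uncertainty_product(psi, lo, hi, n=4001, h=1e-5):
    """Approximate Delta Q * Delta P for an (unnormalised) function psi on [lo, hi]."""
    dx = (hi - lo) / (n - 1)
    xs = [lo + j * dx for j in range(n)]
    vals = [complex(psi(x)) for x in xs]
    # (P psi)(x) = -i*hbar*psi'(x), with psi' from a central difference.
    pvals = [-1j * HBAR * (psi(x + h) - psi(x - h)) / (2.0 * h) for x in xs]
    norm2 = sum(abs(v) ** 2 for v in vals) * dx
    mean_q = sum(x * abs(v) ** 2 for x, v in zip(xs, vals)) * dx / norm2
    # Expectation in the convention (A psi, psi) used in the definition above.
    mean_p = (sum(pv * v.conjugate() for pv, v in zip(pvals, vals)) * dx / norm2).real
    var_q = sum((x - mean_q) ** 2 * abs(v) ** 2 for x, v in zip(xs, vals)) * dx / norm2
    var_p = sum(abs(pv - mean_p * v) ** 2 for pv, v in zip(pvals, vals)) * dx / norm2
    return math.sqrt(var_q * var_p)


def gaussian(x, sigma=0.7, x0=1.0, p0=2.0):
    """Gaussian packet from the example, up to normalisation."""
    return cmath.exp(-(x - x0) ** 2 / (4.0 * sigma ** 2) + 1j * p0 * x / HBAR)


def first_excited(x):
    """A non-Gaussian comparison state: x * exp(-x^2 / 2)."""
    return x * math.exp(-x * x / 2.0)


print(uncertainty_product(gaussian, -11.0, 13.0))       # close to HBAR / 2
print(uncertainty_product(first_excited, -12.0, 12.0))  # close to 3 * HBAR / 2
```

The Gaussian sits on the lower bound, while the comparison state gives a strictly larger product, consistent with the inequality.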
## Fourier Transform and Momentum Space
The position representation makes $Q$ diagonal, since it acts by multiplication by $x$. The next problem is to find a representation in which $P$ is diagonal, and the Fourier transform provides exactly that change of representation.
[definition: Fourier Transform on the Schwartz Space]
The Fourier transform on the Schwartz space is the map
\begin{align*}
\mathcal F &: \mathcal S(\mathbb R) \to \mathcal S(\mathbb R),
\end{align*}
defined by
\begin{align*}
\mathcal F &: \psi \mapsto \hat\psi, & \hat\psi(k) &= \frac{1}{(2\pi)^{1/2}}\int_{\mathbb R}\psi(x)e^{-ikx}\,d\mathcal L^1(x).
\end{align*}
The inverse Fourier transform on the Schwartz space is the map
\begin{align*}
\mathcal F^{-1} &: \mathcal S(\mathbb R) \to \mathcal S(\mathbb R),
\end{align*}
defined by
\begin{align*}
\mathcal F^{-1} &: \phi \mapsto \check\phi, & \check\phi(x) &= \frac{1}{(2\pi)^{1/2}}\int_{\mathbb R}\phi(k)e^{ikx}\,d\mathcal L^1(k).
\end{align*}
[/definition]
With this convention, the Fourier transform extends uniquely by density to a unitary map
\begin{align*}
\mathcal F &: L^2(\mathbb R) \to L^2(\mathbb R).
\end{align*}
Its operational importance comes from the way it exchanges differentiation and multiplication. Since momentum is a differential operator in the position variable, the next result identifies the momentum observable in the transformed representation.
[quotetheorem:6933]
[citeproof:6933]
This result lets us rewrite differential equations involving momentum as multiplication equations in momentum space. The hypothesis $\psi\in\mathcal S(\mathbb R)$ is used twice: it justifies integration by parts on the whole line and removes boundary terms at infinity. The theorem does not say that every $L^2$ function has a pointwise derivative whose Fourier transform can be computed by this formula; outside the domain of $P$, the expression $P\psi$ is not defined as an $L^2$ vector. After taking the self-adjoint closure, the result says that the spectral representation of $P$ is multiplication by $\hbar k$, which is often the simplest way to solve translation-invariant quantum systems.
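The identity $\mathcal F(P\psi)(k)=\hbar k\hat\psi(k)$ can be checked numerically for a single Schwartz function. The sketch below assumes $\hbar=1$, uses the Fourier convention of this chapter, and replaces the integral by a Riemann sum on $[-12,12]$; the Gaussian sample function and the helper names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1


def fourier(f, k, lo=-12.0, hi=12.0, n=2401):
    """Riemann sum for (2*pi)^(-1/2) * integral of f(x) * exp(-i k x) dx."""
    dx = (hi - lo) / (n - 1)
    xs = [lo + j * dx for j in range(n)]
    total = sum(f(x) * cmath.exp(-1j * k * x) for x in xs)
    return total * dx / math.sqrt(2.0 * math.pi)


def psi(x):
    """Sample Schwartz function: exp(-x^2 / 2)."""
    return math.exp(-x * x / 2.0)


def p_psi(x):
    """(P psi)(x) = -i*hbar*psi'(x) = i*hbar*x*psi(x) for this Gaussian."""
    return 1j * HBAR * x * psi(x)


def diagonalisation_defect(ks):
    """Max of |F(P psi)(k) - hbar*k*F(psi)(k)| over the sample wave numbers."""
    return max(abs(fourier(p_psi, k) - HBAR * k * fourier(psi, k)) for k in ks)


print(diagonalisation_defect([-2.0, 0.0, 0.5, 3.0]))
```

For this Gaussian the transform is again $e^{-k^2/2}$, so the check also confirms the normalisation of the Fourier convention.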
[example: Momentum-Space Schrödinger Equation]
For $m>0$, let $H=P^2/(2m)$, and consider a differentiable time-dependent state $t\mapsto \psi(t)$ with $\psi(t)\in\mathcal D(P^2)$. The free Schrödinger equation is
\begin{align*}
i\hbar\frac{\partial\psi}{\partial t}(t)=\frac{1}{2m}P^2\psi(t).
\end{align*}
Apply the Fourier transform in the spatial variable $x$. Since $\mathcal F$ is linear and independent of $t$,
\begin{align*}
\mathcal F\left(i\hbar\frac{\partial\psi}{\partial t}(t)\right)(k)=i\hbar\frac{\partial}{\partial t}\hat\psi(t,k).
\end{align*}
By *Fourier Diagonalisation of Momentum*, $\mathcal F(P\phi)(k)=\hbar k\hat\phi(k)$ for suitable domain vectors $\phi$. Applying this first to $\phi=P\psi(t)$ gives
\begin{align*}
\mathcal F(P^2\psi(t))(k)=\hbar k\,\mathcal F(P\psi(t))(k).
\end{align*}
Applying the same identity again to $\phi=\psi(t)$ gives
\begin{align*}
\mathcal F(P^2\psi(t))(k)=\hbar k(\hbar k\hat\psi(t,k))=\hbar^2k^2\hat\psi(t,k).
\end{align*}
Therefore the Fourier-transformed Schrödinger equation is
\begin{align*}
i\hbar\frac{\partial}{\partial t}\hat\psi(t,k)=\frac{\hbar^2k^2}{2m}\hat\psi(t,k).
\end{align*}
For each fixed $k$, divide by $i\hbar$:
\begin{align*}
\frac{\partial}{\partial t}\hat\psi(t,k)=\frac{\hbar^2k^2}{2m\,i\hbar}\hat\psi(t,k).
\end{align*}
Since $1/i=-i$, this becomes
\begin{align*}
\frac{\partial}{\partial t}\hat\psi(t,k)=-\frac{i\hbar k^2}{2m}\hat\psi(t,k).
\end{align*}
Solving this scalar equation gives
\begin{align*}
\hat\psi(t,k)=\exp\left(-\frac{i\hbar k^2t}{2m}\right)\hat\psi(0,k).
\end{align*}
Thus the Fourier transform converts free quantum evolution into multiplication by a phase factor at each momentum value $p=\hbar k$.
[/example]
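The momentum-space solution can be used to watch a Gaussian packet spread. For initial data proportional to $e^{-x^2/(4\sigma^2)}$ the standard closed form for the position variance is $\sigma^2+(\hbar t/(2m\sigma))^2$; the sketch below compares this with a numerical inversion of $\hat\psi(t,k)$. It assumes $\hbar=m=\sigma=1$, truncated grids, and Riemann sums, and all function names are hypothetical.

```python
import cmath
import math

HBAR, MASS, SIGMA = 1.0, 1.0, 1.0  # assumptions: illustrative units and width


def psi_hat0(k):
    """Fourier transform of exp(-x^2 / (4 sigma^2)), up to a constant factor."""
    return math.exp(-(SIGMA * k) ** 2)


def psi_hat(t, k):
    """Momentum-space solution: phase factor times the initial transform."""
    return cmath.exp(-1j * HBAR * k * k * t / (2.0 * MASS)) * psi_hat0(k)


def psi(t, x, kmax=8.0, n=401):
    """Inverse Fourier transform by a Riemann sum over k (constants dropped)."""
    dk = 2.0 * kmax / (n - 1)
    ks = [-kmax + j * dk for j in range(n)]
    return sum(psi_hat(t, k) * cmath.exp(1j * k * x) for k in ks) * dk


def position_variance(t, xmax=12.0, n=241):
    """Variance of |psi(t, x)|^2, normalised numerically."""
    dx = 2.0 * xmax / (n - 1)
    xs = [-xmax + j * dx for j in range(n)]
    dens = [abs(psi(t, x)) ** 2 for x in xs]
    norm = sum(dens) * dx
    mean = sum(x * d for x, d in zip(xs, dens)) * dx / norm
    return sum((x - mean) ** 2 * d for x, d in zip(xs, dens)) * dx / norm


t = 2.0
print(position_variance(t), SIGMA ** 2 + (HBAR * t / (2.0 * MASS * SIGMA)) ** 2)
```

The momentum distribution $|\hat\psi(t,k)|^2$ does not change in time, because the evolution only multiplies by a phase; the growth of the position variance comes entirely from the $k$-dependence of that phase.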
## Schwartz Space and Rigged Hilbert Space Notation
Physicists often write expressions such as $|x\rangle$, $|p\rangle$, and $\langle x|\psi\rangle=\psi(x)$, although $|x\rangle$ is not a vector in $L^2(\mathbb R)$. The mathematical question is how to keep the convenience of this notation while maintaining a correct Hilbert space framework.
[definition: Schwartz Space]
The Schwartz space $\mathcal S(\mathbb R)$ is the vector space of all smooth functions $\phi:\mathbb R\to\mathbb C$ such that, for every pair of nonnegative integers $m,n$,
\begin{align*}
\sup_{x\in\mathbb R}|x^m\phi^{(n)}(x)|<\infty.
\end{align*}
[/definition]
Schwartz functions are stable under differentiation, multiplication by polynomials, and Fourier transform. They provide a common test domain for $Q$, $P$, and the formal manipulations used in quantum mechanics. To interpret position and momentum eigenvectors, however, we need a space larger than $L^2(\mathbb R)$ in which evaluation functionals and plane waves can live.
[definition: Rigged Hilbert Space for the Line]
The standard rigged Hilbert space for the Schrödinger representation is the chain
\begin{align*}
\mathcal S(\mathbb R)\subset L^2(\mathbb R)\subset \mathcal S'(\mathbb R),
\end{align*}
where $\mathcal S'(\mathbb R)$ is the space of [tempered distributions](/page/Tempered%20Distributions).
[/definition]
The inclusion on the right identifies an $L^2$ function with the tempered distribution obtained by integration against test functions. This framework explains generalized eigenvectors as distributions rather than Hilbert space vectors.
[example: Position Eigenkets as Distributions]
For $x_0\in\mathbb R$, the formal ket $|x_0\rangle$ is represented in the rigged Hilbert space by the evaluation distribution $\delta_{x_0}\in\mathcal S'(\mathbb R)$, defined on test functions by
\begin{align*}
\delta_{x_0}(\phi)=\phi(x_0)
\end{align*}
for every $\phi\in\mathcal S(\mathbb R)$. To verify the generalized eigenvalue relation, use the dual action of multiplication by $x$: for $\phi\in\mathcal S(\mathbb R)$,
\begin{align*}
(Q\delta_{x_0})(\phi)=\delta_{x_0}(Q\phi)
\end{align*}
and the definition of $Q$ on test functions gives
\begin{align*}
(Q\phi)(x)=x\phi(x).
\end{align*}
Evaluating at $x_0$ therefore gives
\begin{align*}
(Q\delta_{x_0})(\phi)=\delta_{x_0}(Q\phi)=(Q\phi)(x_0)=x_0\phi(x_0).
\end{align*}
Since scalar multiplication of a distribution is defined by $(x_0\delta_{x_0})(\phi)=x_0\delta_{x_0}(\phi)$, we also have
\begin{align*}
(x_0\delta_{x_0})(\phi)=x_0\phi(x_0).
\end{align*}
Thus $(Q\delta_{x_0})(\phi)=(x_0\delta_{x_0})(\phi)$ for every $\phi\in\mathcal S(\mathbb R)$, so
\begin{align*}
Q\delta_{x_0}=x_0\delta_{x_0}
\end{align*}
as tempered distributions. Consequently $\langle x_0|\psi\rangle=\psi(x_0)$ means the distributional pairing $\delta_{x_0}(\psi)$ for $\psi\in\mathcal S(\mathbb R)$; it is not an $L^2$ inner product with a Hilbert-space vector $|x_0\rangle$.
[/example]
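The distributional identity $Q\delta_{x_0}=x_0\delta_{x_0}$ can be illustrated by replacing $\delta_{x_0}$ with narrow normalised Gaussians, which are honest $L^2$ functions. This is a sketch under stated assumptions: the approximating family, the test function, and the names `nascent_delta`, `pair`, and `q_pairing` are hypothetical choices, and the pairing is computed by a Riemann sum.

```python
import math


def nascent_delta(x, x0, eps):
    """Normalised Gaussian of width eps centred at x0 (approximates delta_{x0})."""
    return math.exp(-((x - x0) / eps) ** 2 / 2.0) / (eps * math.sqrt(2.0 * math.pi))


def pair(dist, phi, lo, hi, n=20001):
    """Riemann sum for the pairing: integral of dist(x) * phi(x) dx."""
    dx = (hi - lo) / (n - 1)
    return sum(dist(lo + j * dx) * phi(lo + j * dx) for j in range(n)) * dx


def phi(x):
    """Test function (hypothetical choice): a polynomial times a Gaussian."""
    return (1.0 + x) * math.exp(-x * x)


def q_pairing(x0, eps):
    """Approximates (Q delta_{x0})(phi) = delta_{x0}(Q phi)."""
    return pair(lambda x: nascent_delta(x, x0, eps),
                lambda x: x * phi(x), x0 - 1.0, x0 + 1.0)


x0 = 0.5
for eps in (0.1, 0.03, 0.01):
    print(eps, q_pairing(x0, eps), x0 * phi(x0))
```

As the width shrinks, the pairing approaches $x_0\phi(x_0)$, while the $L^2$ norm of the approximating Gaussians grows without bound, which is why the limit lives in $\mathcal S'(\mathbb R)$ and not in $L^2(\mathbb R)$.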
The same interpretation applies to momentum eigenkets. Plane waves are not square-integrable, but they act naturally on Schwartz functions and become eigenvectors after passing to the distributional extension.
[example: Momentum Eigenstates]
For $p\in\mathbb R$, define the plane-wave distribution $E_p\in\mathcal S'(\mathbb R)$ by
\begin{align*}
E_p(\phi)=\int_{\mathbb R}e^{ipx/\hbar}\phi(x)\,d\mathcal L^1(x)
\end{align*}
for $\phi\in\mathcal S(\mathbb R)$. The function $x\mapsto e^{ipx/\hbar}$ is not in $L^2(\mathbb R)$, because $|e^{ipx/\hbar}|^2=1$ for every $x$ and therefore $\int_{\mathbb R}1\,d\mathcal L^1(x)=\infty$.
We verify the eigenvalue equation in $\mathcal S'(\mathbb R)$. The distributional derivative is defined by $(D T)(\phi)=-T(\phi')$, so
\begin{align*}
(P E_p)(\phi)=(-i\hbar D E_p)(\phi)=i\hbar E_p(\phi').
\end{align*}
Substituting the definition of $E_p$ gives
\begin{align*}
(P E_p)(\phi)=i\hbar\int_{\mathbb R}e^{ipx/\hbar}\phi'(x)\,d\mathcal L^1(x).
\end{align*}
Since $\phi$ is rapidly decreasing, the boundary term in integration by parts is zero, and
\begin{align*}
\int_{\mathbb R}e^{ipx/\hbar}\phi'(x)\,d\mathcal L^1(x)=-\int_{\mathbb R}\frac{ip}{\hbar}e^{ipx/\hbar}\phi(x)\,d\mathcal L^1(x).
\end{align*}
Therefore
\begin{align*}
(P E_p)(\phi)=i\hbar\left(-\frac{ip}{\hbar}\right)\int_{\mathbb R}e^{ipx/\hbar}\phi(x)\,d\mathcal L^1(x)=pE_p(\phi).
\end{align*}
Since this holds for every $\phi\in\mathcal S(\mathbb R)$, we have $P E_p=pE_p$ as tempered distributions.
In momentum space, the same generalized eigenstate is concentrated at $k=p/\hbar$: with the Fourier convention of this chapter, $\mathcal F E_p=(2\pi)^{1/2}\delta_{p/\hbar}$ as a distribution. Thus momentum eigenstates are not Hilbert-space vectors, but in the rigged Hilbert space they become point masses for the multiplication operator $\hbar k$.
[/example]
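The identity $\mathcal F E_p=(2\pi)^{1/2}\delta_{p/\hbar}$ was stated without computation. A short check runs as follows, assuming the usual definition $(\mathcal F T)(\phi)=T(\hat\phi)$ of the Fourier transform on tempered distributions, a convention not spelled out above. For $\phi\in\mathcal S(\mathbb R)$, the inversion formula of this chapter gives
\begin{align*}
(\mathcal F E_p)(\phi)=E_p(\hat\phi)=\int_{\mathbb R}e^{ipx/\hbar}\hat\phi(x)\,d\mathcal L^1(x)=(2\pi)^{1/2}\,\mathcal F^{-1}(\hat\phi)(p/\hbar)=(2\pi)^{1/2}\phi(p/\hbar)=(2\pi)^{1/2}\delta_{p/\hbar}(\phi).
\end{align*}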
Rigged Hilbert spaces do not replace Hilbert spaces; they organize a useful layer of notation around them. Physical states remain normalized elements of $L^2(\mathbb R)$, observables remain self-adjoint operators, and generalized eigenvectors are bookkeeping devices inside $\mathcal S'(\mathbb R)$.
The Schrödinger representation identifies the basic kinematic observables for a particle on the line. The next step is dynamical: given a Hamiltonian, determine how states evolve in time and why that evolution must be unitary.
# 5. Time Evolution and the Schrödinger Equation
Time evolution is where the Hilbert-space formulation becomes a dynamical theory. Earlier chapters identified states with rays or density operators, observables with self-adjoint operators, and measurement with the spectral theorem; this chapter asks how a state changes between measurements. The main prerequisites are the Hilbert-space formalism for quantum states from Chapter 1, the domain discipline for unbounded operators from Chapter 2, the spectral calculus from Chapter 3, and the interpretation of inner products as transition amplitudes. The main answer is that reversible time evolution is encoded by a strongly continuous unitary group, and Stone's theorem identifies its infinitesimal generator as the Hamiltonian.
## Strongly Continuous Unitary Groups
The first question is what structure should replace the finite-dimensional formula $\psi(t)=e^{-it\mathcal H}\psi(0)$ when the Hamiltonian $\mathcal H$ may be unbounded. We want a family of maps preserving probabilities, compatible with composition in time, and continuous enough that differentiating in time makes sense on a suitable domain. The formulas in this chapter are written in units with $\hbar=1$.
[definition: Strongly Continuous Unitary Group]
Let $H$ be a complex Hilbert space. A strongly continuous unitary group on $H$ is a family $(U(t))_{t\in\mathbb R}$ of unitary operators $U(t)\in\mathcal L(H)$ such that:
1. $U(0)=I$.
2. $U(t+s)=U(t)U(s)$ for all $s,t\in\mathbb R$.
3. For every $\psi\in H$, $U(t)\psi\to \psi$ in $H$ as $t\to 0$.
[/definition]
The group law expresses time translation invariance: evolving by $s$ and then by $t$ is the same as evolving by $s+t$. Strong continuity is weaker than norm continuity of $t\mapsto U(t)$ as an operator-valued map, and that distinction is essential for differential operators such as momentum and Schrödinger Hamiltonians.
[example: Translation Group on the Line]
Let $H=L^2(\mathbb R)$ and define $(U(t)\psi)(x)=\psi(x-t)$. For $\psi\in L^2(\mathbb R)$, [translation invariance of Lebesgue measure](/theorems/4911) gives
\begin{align*}
\|U(t)\psi\|_2^2=\int_{\mathbb R}|\psi(x-t)|^2\,d\mathcal L^1(x)=\int_{\mathbb R}|\psi(y)|^2\,d\mathcal L^1(y)=\|\psi\|_2^2.
\end{align*}
Thus $U(t)$ is an isometry, and $U(-t)$ is its inverse because
\begin{align*}
(U(-t)U(t)\psi)(x)=(U(t)\psi)(x+t)=\psi(x).
\end{align*}
So each $U(t)$ is unitary. The group law is obtained pointwise:
\begin{align*}
(U(t)U(s)\psi)(x)=(U(s)\psi)(x-t)=\psi(x-t-s)=(U(t+s)\psi)(x).
\end{align*}
We show that this unitary group is strongly continuous. First let $\varphi\in C_c^\infty(\mathbb R)$, and choose a compact set $K$ containing $\operatorname{supp}\varphi$. For $|t|\le 1$, the function $x\mapsto \varphi(x-t)-\varphi(x)$ is supported in the compact set $K+[-1,1]$. Hence
\begin{align*}
\|U(t)\varphi-\varphi\|_2^2=\int_{K+[-1,1]}|\varphi(x-t)-\varphi(x)|^2\,d\mathcal L^1(x).
\end{align*}
Since $\varphi$ is continuous with compact support, it is uniformly continuous on $\mathbb R$, so
\begin{align*}
\sup_{x\in K+[-1,1]}|\varphi(x-t)-\varphi(x)|\to 0\qquad\text{as }t\to 0.
\end{align*}
Therefore
\begin{align*}
\|U(t)\varphi-\varphi\|_2^2\le \mathcal L^1(K+[-1,1])\sup_{x\in K+[-1,1]}|\varphi(x-t)-\varphi(x)|^2\to 0.
\end{align*}
For an arbitrary $\psi\in L^2(\mathbb R)$ and $\varepsilon>0$, choose $\varphi\in C_c^\infty(\mathbb R)$ with $\|\psi-\varphi\|_2<\varepsilon/3$. Since $U(t)$ is unitary,
\begin{align*}
\|U(t)\psi-\psi\|_2\le \|U(t)(\psi-\varphi)\|_2+\|U(t)\varphi-\varphi\|_2+\|\varphi-\psi\|_2=2\|\psi-\varphi\|_2+\|U(t)\varphi-\varphi\|_2.
\end{align*}
For $t$ sufficiently close to $0$, the last term is less than $\varepsilon/3$, so $\|U(t)\psi-\psi\|_2<\varepsilon$. This proves strong continuity.
The same group is not norm continuous at $0$. Fix $t\ne 0$, choose a bounded interval $I$ of positive length smaller than $|t|$, and set $\psi=\mathcal L^1(I)^{-1/2}\mathbf 1_I$. Then $\|\psi\|_2=1$, while $U(t)\psi$ is supported on $I+t$, which is disjoint from $I$. Hence
\begin{align*}
\|U(t)\psi-\psi\|_2^2=\|U(t)\psi\|_2^2+\|\psi\|_2^2=1+1=2.
\end{align*}
Thus $\|U(t)-I\|\ge \sqrt 2$ for every $t\ne 0$, so $\|U(t)-I\|$ cannot tend to $0$ as $t\to 0$. The example separates strong continuity of each orbit from norm continuity of the operator-valued map.
[/example]
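The contrast between strong continuity and norm continuity can be seen numerically. In the sketch below a fixed Gaussian is compared with a normalised indicator whose interval is chosen after $t$, as in the example; the half-length interval $[0,|t|/2)$, the midpoint rule, and the helper names are hypothetical choices for illustration.

```python
import math


def shift_distance(psi, t, lo, hi, n):
    """Midpoint-rule approximation of the L^2 norm of U(t)psi - psi on [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0
    for j in range(n):
        x = lo + (j + 0.5) * dx
        total += (psi(x - t) - psi(x)) ** 2
    return math.sqrt(total * dx)


def gaussian(x):
    """A fixed normalised Gaussian."""
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)


def bump_for(t):
    """Normalised indicator of [0, |t|/2), an interval shorter than |t|."""
    length = abs(t) / 2.0
    return lambda x: (1.0 / math.sqrt(length)) if 0.0 <= x < length else 0.0


for t in (0.4, 0.2, 0.1):
    fixed = shift_distance(gaussian, t, -10.0, 10.0, 20000)
    adapted = shift_distance(bump_for(t), t, -1.0, 1.0, 2000)
    print(t, fixed, adapted)
```

For the fixed Gaussian the distance shrinks with $t$, while for the adapted indicator it stays at $\sqrt 2$, matching the lower bound $\|U(t)-I\|\ge\sqrt 2$ derived in the example.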
The translation example shows that a physically natural unitary group may fail to be norm differentiable as an operator-valued map. This motivates defining the generator through differentiation on individual vectors, with the domain recording exactly which states have an instantaneous velocity.
[definition: Infinitesimal Generator]
Let $(U(t))_{t\in\mathbb R}$ be a strongly continuous unitary group on a complex Hilbert space $H$. Its infinitesimal generator is the operator $A:D(A)\subset H\to H$ with domain
\begin{align*}
D(A)=\left\{\psi\in H: \lim_{t\to 0}\frac{U(t)\psi-\psi}{t}\text{ exists in }H\right\}
\end{align*}
and action
\begin{align*}
A\psi= \lim_{t\to 0}\frac{U(t)\psi-\psi}{t},\qquad \psi\in D(A).
\end{align*}
[/definition]
The generator defined above is adapted to arbitrary strongly continuous groups, but quantum mechanics uses unitary groups and therefore expects skew-adjoint generators. This motivates rewriting the generator as $A=-i\mathcal H$, where the new operator $\mathcal H$ is the observable measuring energy.
[definition: Hamiltonian Generated Evolution]
Let $\mathcal H:D(\mathcal H)\subset H\to H$ be a self-adjoint operator on a complex Hilbert space $H$. The unitary group generated by $\mathcal H$ is the map
\begin{align*}
U:\mathbb R\to\mathcal L(H),\qquad U(t)=e^{-it\mathcal H}.
\end{align*}
[/definition]
Here $e^{-it\mathcal H}$ is defined by the functional calculus for self-adjoint operators. When $\mathcal H$ is unbounded, the notation should not be read as a [power series](/page/Power%20Series): $\mathcal H^n\psi$ is undefined for vectors outside $D(\mathcal H^n)$, and even where every term is defined the series need not converge. The remaining question is whether every acceptable unitary time evolution arises in this way from a unique self-adjoint Hamiltonian; Stone's theorem answers that question.
[quotetheorem:6934]
[citeproof:6934]
Stone's theorem is the bridge between the kinematic postulate that time evolution preserves inner products and the dynamical postulate that a self-adjoint Hamiltonian drives the evolution. The hypotheses are doing real work. Strong continuity excludes pathological unitary representations for which each $U(t)$ preserves norm but no densely defined derivative exists at $t=0$, so there is no Hamiltonian in the usual operator sense. Self-adjointness is also stronger than symmetry: a symmetric differential operator may have several self-adjoint extensions, or none compatible with the intended boundary conditions, and then it does not determine a unique unitary dynamics. Thus Stone's theorem does not say that every formally symmetric expression is a Hamiltonian; it says that reversible strongly continuous dynamics are exactly those generated by self-adjoint operators.
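In finite dimensions every Hermitian matrix is bounded and self-adjoint, so the correspondence in Stone's theorem can be checked by hand. The sketch below is a toy illustration only: it assumes the Hamiltonian is the Pauli matrix $\sigma_x$ on $\mathbb C^2$, for which $e^{-it\sigma_x}=\cos(t)I-i\sin(t)\sigma_x$, and the helper names are hypothetical. It does not touch the domain questions that make the infinite-dimensional theorem nontrivial.

```python
import math

# Toy Hamiltonian (assumption): the Pauli matrix sigma_x, with eigenvalues +1 and -1.
H = [[0.0, 1.0], [1.0, 0.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def U(t):
    """exp(-i t H) for H = sigma_x, using H^2 = I: cos(t) I - i sin(t) H."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def adjoint(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]


def dist(A, B):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))


def generator_estimate(h=1e-6):
    """Difference quotient (U(h) - I) / h, which should approximate -i H."""
    Uh = U(h)
    return [[(Uh[i][j] - IDENTITY[i][j]) / h for j in range(2)] for i in range(2)]


minus_i_H = [[-1j * H[i][j] for j in range(2)] for i in range(2)]
print(dist(matmul(U(0.3), U(1.1)), U(1.4)))          # group law defect
print(dist(matmul(U(0.7), adjoint(U(0.7))), IDENTITY))  # unitarity defect
print(dist(generator_estimate(), minus_i_H))          # generator defect
```

All three defects are small, reflecting the group law, unitarity, and the identification of the generator with $-i\mathcal H$ in this bounded toy case.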
[example: Momentum as Generator of Translations]
For the translation group $(U(t)\psi)(x)=\psi(x-t)$ on $L^2(\mathbb R)$, first take $\psi\in C_c^\infty(\mathbb R)$. For each $x\in\mathbb R$ and $t\ne 0$, the [fundamental theorem of calculus](/theorems/632) gives
\begin{align*}
\psi(x-t)-\psi(x)=-\int_0^t \psi'(x-r)\,dr.
\end{align*}
After the change of variables $r=st$, this becomes
\begin{align*}
\frac{(U(t)\psi)(x)-\psi(x)}{t}=-\int_0^1 \psi'(x-st)\,ds.
\end{align*}
Subtracting $-\psi'(x)$ gives
\begin{align*}
\frac{(U(t)\psi)(x)-\psi(x)}{t}+\psi'(x)=-\int_0^1\big(\psi'(x-st)-\psi'(x)\big)\,ds.
\end{align*}
If $|t|\le 1$, all functions in the last display are supported in the fixed compact set $\operatorname{supp}\psi+[-1,1]$. Since $\psi'$ is continuous with compact support, it is uniformly continuous on $\mathbb R$, so the right-hand side tends to $0$ uniformly in $x$ as $t\to 0$; because the supports stay in one compact set, it also tends to $0$ in $L^2(\mathbb R)$. Hence
\begin{align*}
\lim_{t\to 0}\frac{U(t)\psi-\psi}{t}=-\psi'
\end{align*}
for every $\psi\in C_c^\infty(\mathbb R)$.
Thus the infinitesimal generator $A$ of translations acts on smooth compactly supported vectors by
\begin{align*}
A\psi=-\psi'.
\end{align*}
With the Hamiltonian convention $A=-iP$, this means
\begin{align*}
-iP\psi=-\psi'.
\end{align*}
Multiplying both sides by $i$ gives
\begin{align*}
P\psi=-i\psi'.
\end{align*}
The self-adjoint momentum operator extending this action is
\begin{align*}
P:H^1(\mathbb R)\subset L^2(\mathbb R)\to L^2(\mathbb R),\qquad P\psi=-i\psi'.
\end{align*}
The Sobolev domain is exactly the natural one for this first derivative expression: $\psi\in H^1(\mathbb R)$ precisely means that the weak derivative $\psi'$ belongs to $L^2(\mathbb R)$, so $-i\psi'$ is again an $L^2$ vector. The example shows that the geometric operation of translating a wavefunction has a first-order differential generator, and that the domain records exactly which $L^2$ states have an $L^2$ instantaneous velocity under translations.
[/example]
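The convergence of the difference quotient to $-\psi'$ in $L^2(\mathbb R)$ can be observed numerically. The sketch below uses a Gaussian, which lies in $H^1(\mathbb R)$ but is not compactly supported, so it illustrates the limit on the larger domain and not the exact setting of the computation above; the grid, the Riemann sum, and the name `quotient_error` are assumptions made for illustration.

```python
import math


def psi(x):
    """Sample vector in H^1: exp(-x^2 / 2)."""
    return math.exp(-x * x / 2.0)


def dpsi(x):
    """Exact derivative of psi."""
    return -x * math.exp(-x * x / 2.0)


def quotient_error(t, lo=-10.0, hi=10.0, n=4001):
    """L^2 norm (Riemann sum) of (U(t)psi - psi)/t + psi'."""
    dx = (hi - lo) / (n - 1)
    total = 0.0
    for j in range(n):
        x = lo + j * dx
        total += ((psi(x - t) - psi(x)) / t + dpsi(x)) ** 2
    return math.sqrt(total * dx)


for t in (0.1, 0.01, 0.001):
    print(t, quotient_error(t))
```

The error decreases roughly in proportion to $t$, consistent with a first-order Taylor remainder controlled by $\|\psi''\|_2$.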
## The Schrödinger Equation for Time-Independent Hamiltonians
The next problem is how to interpret the differential equation
\begin{align*}
i\frac{d}{dt}\psi(t)=\mathcal H\psi(t)
\end{align*}
when $\mathcal H$ is unbounded. The equation cannot hold for every initial vector in $H$, because $\mathcal H\psi$ is defined only on $D(\mathcal H)$; however, the unitary group still gives a global evolution for every state.
[definition: Strong Solution of the Schrödinger Equation]
Let $\mathcal H$ be a self-adjoint operator on a complex Hilbert space $H$. A strong solution of the Schrödinger equation with initial state $\psi_0\in D(\mathcal H)$ is a map $\psi:\mathbb R\to H$ such that:
1. $\psi(t)\in D(\mathcal H)$ for every $t\in\mathbb R$.
2. $t\mapsto \psi(t)$ is differentiable as an $H$-valued function.
3. $i\psi'(t)=\mathcal H\psi(t)$ for every $t\in\mathbb R$.
4. $\psi(0)=\psi_0$.
[/definition]
For initial data outside $D(\mathcal H)$, $e^{-it\mathcal H}\psi_0$ is still a well-defined continuous curve in $H$, but it need not satisfy the differential equation in the strong sense. This distinction lets the theory evolve all states while reserving the pointwise Hamiltonian equation for vectors with enough regularity.
[quotetheorem:6935]
[citeproof:6935]
The theorem separates two levels of dynamics. Every vector evolves by a unitary orbit, while vectors in $D(\mathcal H)$ also have an instantaneous energy vector $\mathcal H\psi(t)$ and satisfy the differential equation in norm. This distinction cannot be removed: for the free Hamiltonian on $L^2(\mathbb R)$, an arbitrary $L^2$ wavefunction need not have two weak derivatives in $L^2$, so $\mathcal H\psi_0$ may not exist even though $e^{-it\mathcal H}\psi_0$ is defined for all $t$. The theorem also relies on the Hamiltonian being time-independent. If $\mathcal H$ is replaced by a time-dependent family $\mathcal H(t)$, the group identity $U(t+s)=U(t)U(s)$ no longer represents the dynamics, and existence must be formulated using a two-parameter propagator with additional regularity assumptions on $t\mapsto\mathcal H(t)$.
[example: Free Particle Propagator]
For a free particle on the line, take $H=L^2(\mathbb R)$ and
\begin{align*}
\mathcal H:H^2(\mathbb R)\subset L^2(\mathbb R)\to L^2(\mathbb R),\qquad \mathcal H\psi=-\frac{1}{2m}\psi''
\end{align*}
with $m>0$. Using the unitary Fourier transform convention
\begin{align*}
\hat f(\xi)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}e^{-ix\xi}f(x)\,d\mathcal L^1(x),
\end{align*}
integration by parts for Schwartz functions gives
\begin{align*}
\widehat{\psi''}(\xi)=-\xi^2\hat\psi(\xi).
\end{align*}
Hence
\begin{align*}
\widehat{\mathcal H\psi}(\xi)=\frac{\xi^2}{2m}\hat\psi(\xi),
\end{align*}
so the Fourier transform turns the free Schrödinger evolution into multiplication by the phase
\begin{align*}
\widehat{\psi(t)}(\xi)=e^{-it\xi^2/(2m)}\hat{\psi}_0(\xi).
\end{align*}
Assume first that $\psi_0$ is sufficiently regular and rapidly decaying, for instance $\psi_0\in\mathcal S(\mathbb R)$. Fourier inversion gives
\begin{align*}
\psi(t,x)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}e^{ix\xi}e^{-it\xi^2/(2m)}\hat\psi_0(\xi)\,d\mathcal L^1(\xi).
\end{align*}
Substituting the formula for $\hat\psi_0$ yields
\begin{align*}
\psi(t,x)=\frac{1}{2\pi}\int_{\mathbb R}\int_{\mathbb R}e^{i(x-y)\xi}e^{-it\xi^2/(2m)}\psi_0(y)\,d\mathcal L^1(y)\,d\mathcal L^1(\xi).
\end{align*}
For fixed $z=x-y$, the exponent satisfies
\begin{align*}
iz\xi-\frac{it}{2m}\xi^2=-\frac{it}{2m}\left(\xi-\frac{mz}{t}\right)^2+\frac{imz^2}{2t}.
\end{align*}
The Fresnel integral
\begin{align*}
\int_{\mathbb R}\exp\left(-\frac{it}{2m}\eta^2\right)\,d\mathcal L^1(\eta)=\left(\frac{2\pi m}{it}\right)^{1/2}
\end{align*}
therefore gives, for $t\ne 0$,
\begin{align*}
\frac{1}{2\pi}\int_{\mathbb R}e^{i(x-y)\xi}e^{-it\xi^2/(2m)}\,d\mathcal L^1(\xi)=\left(\frac{m}{2\pi it}\right)^{1/2}e^{im(x-y)^2/(2t)}.
\end{align*}
Thus the evolved wavefunction has the integral-kernel representation
\begin{align*}
\psi(t,x)=\left(\frac{m}{2\pi it}\right)^{1/2}\int_{\mathbb R}e^{im(x-y)^2/(2t)}\psi_0(y)\,d\mathcal L^1(y),\qquad t\ne 0.
\end{align*}
This kernel displays dispersion: different spatial offsets $x-y$ acquire different oscillatory phases, so localized initial data need not remain spatially localized even though the $L^2$ norm is preserved.
[/example]
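The kernel formula can be sanity-checked numerically. The Python sketch below is illustrative only: it assumes units with $m=1$, chooses a normalised Gaussian as initial state, approximates the $y$-integral by the trapezoid rule on a truncated interval, and compares the result with the standard closed-form free evolution of a Gaussian. The names `psi0`, `evolve_by_kernel`, `closed_form` and all grid parameters are assumptions made for this example, and the square root of $m/(2\pi it)$ is taken on the principal branch.

```python
import cmath
import math

def psi0(y, sigma=1.0):
    """Normalised Gaussian initial state (an illustrative choice)."""
    return (2 * math.pi * sigma**2) ** -0.25 * math.exp(-y**2 / (4 * sigma**2))

def evolve_by_kernel(x, t, m=1.0, sigma=1.0, half_width=12.0, n=4800):
    """Trapezoid-rule approximation of the kernel integral at one point (t != 0)."""
    h = 2 * half_width / n
    total = 0j
    for j in range(n + 1):
        y = -half_width + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(1j * m * (x - y) ** 2 / (2 * t)) * psi0(y, sigma)
    # Principal branch of (m / (2 pi i t))^(1/2).
    prefactor = cmath.sqrt(m / (2j * math.pi * t))
    return prefactor * total * h

def closed_form(x, t, m=1.0, sigma=1.0):
    """Closed-form free evolution of the Gaussian psi0, used as a reference."""
    a_t = sigma**2 + 1j * t / (2 * m)
    prefactor = (2 * sigma**2 / math.pi) ** 0.25 / math.sqrt(2)
    return prefactor / cmath.sqrt(a_t) * cmath.exp(-x**2 / (4 * a_t))
```

Evaluating both functions at the same $(t,x)$ with $t\ne0$ should give values that agree to well within the quadrature error. The comparison also covers negative times.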
The free particle is the basic model where the spectral theorem is implemented by Fourier analysis. The harmonic oscillator gives the complementary model with discrete spectrum and recurrent motion.
[example: Harmonic Oscillator Evolution]
Let $H=L^2(\mathbb R)$, with $m>0$ and $\omega>0$. Define
\begin{align*}
P:H^1(\mathbb R)\subset L^2(\mathbb R)\to L^2(\mathbb R),\qquad P\psi=-i\psi',
\end{align*}
and
\begin{align*}
Q:D(Q)\subset L^2(\mathbb R)\to L^2(\mathbb R),\qquad (Q\psi)(x)=x\psi(x),
\end{align*}
where
\begin{align*}
D(Q)=\{\psi\in L^2(\mathbb R):x\psi(x)\in L^2(\mathbb R)\}.
\end{align*}
On functions for which the displayed expressions are defined,
\begin{align*}
P^2\psi=P(P\psi)=-i(-i\psi')'=-\psi'',
\end{align*}
and
\begin{align*}
(Q^2\psi)(x)=Q(Q\psi)(x)=x(x\psi(x))=x^2\psi(x).
\end{align*}
Thus the harmonic oscillator Hamiltonian is
\begin{align*}
\mathcal H\psi=\frac{1}{2m}P^2\psi+\frac{m\omega^2}{2}Q^2\psi=-\frac{1}{2m}\psi''+\frac{m\omega^2}{2}x^2\psi(x),
\end{align*}
with self-adjoint domain
\begin{align*}
D(\mathcal H)=\{\psi\in L^2(\mathbb R):\psi''\in L^2(\mathbb R),\,x^2\psi(x)\in L^2(\mathbb R)\}.
\end{align*}
The Hermite functions $(h_n)_{n\ge 0}$ form an orthonormal basis of eigenvectors of $\mathcal H$, and their eigenvalue equation is
\begin{align*}
\mathcal Hh_n=\omega\left(n+\frac{1}{2}\right)h_n.
\end{align*}
Write an initial state as
\begin{align*}
\psi_0=\sum_{n=0}^\infty c_nh_n,\qquad c_n=(\psi_0,h_n)_{L^2}.
\end{align*}
Since the basis is orthonormal,
\begin{align*}
\|\psi_0\|_2^2=\sum_{n=0}^\infty |c_n|^2.
\end{align*}
The spectral expansion of the unitary group generated by $\mathcal H$ multiplies each eigenvector component by the phase $e^{-itE_n}$, where $E_n=\omega(n+1/2)$. Therefore
\begin{align*}
\psi(t)=e^{-it\mathcal H}\psi_0=\sum_{n=0}^\infty c_ne^{-i\omega(n+1/2)t}h_n.
\end{align*}
For $\psi_0\in D(\mathcal H)$, the domain condition is
\begin{align*}
\sum_{n=0}^\infty \omega^2\left(n+\frac{1}{2}\right)^2|c_n|^2<\infty.
\end{align*}
Differentiating the series in $L^2$ gives
\begin{align*}
\psi'(t)=\sum_{n=0}^\infty c_n\left(-i\omega\left(n+\frac{1}{2}\right)\right)e^{-i\omega(n+1/2)t}h_n.
\end{align*}
Multiplying by $i$ yields
\begin{align*}
i\psi'(t)=\sum_{n=0}^\infty c_n\omega\left(n+\frac{1}{2}\right)e^{-i\omega(n+1/2)t}h_n.
\end{align*}
Using $\mathcal Hh_n=\omega(n+1/2)h_n$, the right-hand side is exactly
\begin{align*}
\mathcal H\psi(t)=\sum_{n=0}^\infty c_n\omega\left(n+\frac{1}{2}\right)e^{-i\omega(n+1/2)t}h_n.
\end{align*}
Thus the spectral expansion turns the Schrödinger equation into independent scalar rotations of the coefficients.
The motion is periodic up to a global phase. If $T=2\pi/\omega$, then
\begin{align*}
e^{-i\omega(n+1/2)(t+T)}=e^{-i\omega(n+1/2)t}e^{-i2\pi(n+1/2)}=-e^{-i\omega(n+1/2)t}.
\end{align*}
Hence
\begin{align*}
\psi(t+T)=-\psi(t),
\end{align*}
so the physical ray is periodic with period $2\pi/\omega$, while the vector itself has period $4\pi/\omega$.
[/example]
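The coefficient picture is easy to experiment with. The following Python sketch assumes a state with only finitely many non-zero Hermite coefficients (a truncation chosen for illustration) and represents it by the list of its $c_n$. The function names and the sample data are hypothetical.

```python
import cmath
import math

def evolve_coefficients(coeffs, t, omega=1.0):
    """Multiply the n-th Hermite coefficient by exp(-i * omega * (n + 1/2) * t)."""
    return [c * cmath.exp(-1j * omega * (n + 0.5) * t) for n, c in enumerate(coeffs)]

def norm_squared(coeffs):
    """Squared L^2 norm, computed via orthonormality of the Hermite basis."""
    return sum(abs(c) ** 2 for c in coeffs)

# Illustrative data: a normalised superposition of h_0 and h_2.
coeffs = [0.6, 0.0, 0.8j]
omega = 2.0
period = 2 * math.pi / omega

at_t = evolve_coefficients(coeffs, 0.37, omega)
after_one_period = evolve_coefficients(coeffs, 0.37 + period, omega)
after_two_periods = evolve_coefficients(coeffs, 0.37 + 2 * period, omega)
```

Here `after_one_period` is the negative of `at_t` coefficient by coefficient, `after_two_periods` agrees with `at_t`, and `norm_squared` is unchanged by the evolution, up to floating-point rounding.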
## Conservation Laws
Once a unitary evolution exists, the next question is which quantities remain constant. Conservation laws arise when an observable is compatible with the Hamiltonian, either because it is the Hamiltonian itself or because its commutator with the Hamiltonian vanishes on a suitable common domain.
[quotetheorem:6936]
[citeproof:6936]
Norm conservation is the mathematical expression of conservation of total probability. Its hypothesis is unitarity, not merely the existence of an evolution equation. If the generator is not self-adjoint, or if one studies a dissipative semigroup such as the heat semigroup $e^{t\Delta}$ on $L^2$, the norm typically decreases rather than remaining constant. Since probability is only the zeroth conserved quantity for unitary quantum dynamics, the next natural question is whether the observable that generates the dynamics, namely energy, is also constant along the same evolution.
[quotetheorem:6937]
[citeproof:6937]
Energy conservation depends on time independence and on the state having enough regularity for the energy expectation to be defined. If $\psi_0\notin D(\mathcal H)$, the unitary orbit still exists, but $(\mathcal H\psi(t),\psi(t))_H$ is not an available expression. If the Hamiltonian itself varies with time, no single spectral measure diagonalises the whole evolution; for instance, a driven Hamiltonian $\mathcal H(t)=\mathcal H_0+V(t)$ can exchange energy with the external field, so the expectation of the instantaneous Hamiltonian need not be constant. Extra hypotheses are then needed even to state existence of a two-parameter propagator.
[definition: Conserved Observable]
Let $\mathcal H:D(\mathcal H)\subset H\to H$ be self-adjoint on $H$, and let $A:D(A)\subset H\to H$ be a self-adjoint observable. We say that $A$ is conserved along the evolution generated by $\mathcal H$ on a class of states $\mathcal D\subset D(A)$ if
\begin{align*}
(Ae^{-it\mathcal H}\psi,e^{-it\mathcal H}\psi)_H=(A\psi,\psi)_H
\end{align*}
for every $\psi\in\mathcal D$ and every $t\in\mathbb R$.
[/definition]
The definition is deliberately stated in terms of expectations, because products and commutators of unbounded operators may have restricted domains. When the algebraic manipulations are justified, the commutator gives the rate of change.
[quotetheorem:6938]
[citeproof:6938]
Ehrenfest's theorem explains how classical-looking equations emerge from quantum dynamics. Its domain assumptions are not cosmetic. For unbounded observables such as position and momentum, the products $A\mathcal H\psi(t)$ and $\mathcal H A\psi(t)$ may fail to exist even when $A\psi(t)$ and $\mathcal H\psi(t)$ separately exist, and the curve $t\mapsto A\psi(t)$ need not be differentiable. Without a common invariant domain, the formal commutator can produce a plausible formula whose terms are not defined on the state under discussion. When the commutator vanishes on a domain where these operations are justified, the expectation of $A$ is constant; when the commutator is another observable on that domain, the expectation evolves according to a closed differential relation.
[example: Position and Momentum for the Free Particle]
On $L^2(\mathbb R)$, let $P=-i\,d/dx$, let $(Q\psi)(x)=x\psi(x)$, and take the free Hamiltonian
\begin{align*}\mathcal H=\frac{P^2}{2m}\end{align*}
with $m>0$. Work first on the Schwartz space $\mathcal S(\mathbb R)$, where all products below are defined and the free evolution preserves the space. For $\psi\in\mathcal S(\mathbb R)$,
\begin{align*}(QP\psi)(x)=x(-i\psi'(x))=-ix\psi'(x)\end{align*}
and
\begin{align*}(PQ\psi)(x)=-i(x\psi(x))'=-i(\psi(x)+x\psi'(x))=-i\psi(x)-ix\psi'(x).\end{align*}
Therefore
\begin{align*}((QP-PQ)\psi)(x)=-ix\psi'(x)-(-i\psi(x)-ix\psi'(x))=i\psi(x),\end{align*}
so $[Q,P]=iI$ on this domain.
Since $[P,Q]=-[Q,P]=-iI$, we compute
\begin{align*}[P^2,Q]\psi=P(PQ\psi)-Q(P^2\psi).\end{align*}
Insert and subtract $PQP\psi$:
\begin{align*}[P^2,Q]\psi=P(PQ\psi)-P(QP\psi)+P(QP\psi)-Q(P^2\psi).\end{align*}
Thus
\begin{align*}[P^2,Q]\psi=P([P,Q]\psi)+[P,Q]P\psi.\end{align*}
Using $[P,Q]=-iI$ gives
\begin{align*}[P^2,Q]\psi=P(-i\psi)-iP\psi=-iP\psi-iP\psi=-2iP\psi.\end{align*}
Hence
\begin{align*}[\mathcal H,Q]\psi=\frac{1}{2m}[P^2,Q]\psi=-\frac{i}{m}P\psi.\end{align*}
Also
\begin{align*}[\mathcal H,P]\psi=\frac{1}{2m}(P^2P\psi-PP^2\psi)=\frac{1}{2m}(P^3\psi-P^3\psi)=0.\end{align*}
Let $\psi(t)=e^{-it\mathcal H}\psi_0$ and write
\begin{align*}\mathbb E[Q]_t=(Q\psi(t),\psi(t))_{L^2}\end{align*}
and
\begin{align*}\mathbb E[P]_t=(P\psi(t),\psi(t))_{L^2}.\end{align*}
By *[Ehrenfest Theorem](/theorems/6938)*,
\begin{align*}\frac{d}{dt}\mathbb E[Q]_t=i((\mathcal H Q-Q\mathcal H)\psi(t),\psi(t))_{L^2}.\end{align*}
Substituting $[\mathcal H,Q]=-(i/m)P$ gives
\begin{align*}\frac{d}{dt}\mathbb E[Q]_t=i\left(-\frac{i}{m}P\psi(t),\psi(t)\right)_{L^2}=\frac{1}{m}(P\psi(t),\psi(t))_{L^2}=\frac{1}{m}\mathbb E[P]_t.\end{align*}
Similarly,
\begin{align*}\frac{d}{dt}\mathbb E[P]_t=i((\mathcal H P-P\mathcal H)\psi(t),\psi(t))_{L^2}=i(0,\psi(t))_{L^2}=0.\end{align*}
Therefore $\mathbb E[P]_t=\mathbb E[P]_0$, and integrating the first differential equation from $0$ to $t$ gives
\begin{align*}\mathbb E[Q]_t=\mathbb E[Q]_0+\frac{t}{m}\mathbb E[P]_0.\end{align*}
The expected momentum is conserved, and the expected position moves at the constant velocity $\mathbb E[P]_0/m$.
[/example]
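The two commutator identities used above can be checked pointwise with finite differences. The sketch below is a numerical illustration, not part of the proof: it assumes the Gaussian test function $\psi(x)=e^{-x^2/2}$, central difference quotients with hand-picked step sizes, and hypothetical helper names `P`, `Q`, `free_H`.

```python
import math

def derivative(f, x, h=1e-4):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-3):
    """Central difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def psi(x):
    """Schwartz test function chosen for illustration."""
    return math.exp(-x * x / 2)

def P(f):
    """Momentum operator, (P f)(x) = -i f'(x)."""
    return lambda x: -1j * derivative(f, x)

def Q(f):
    """Position operator, (Q f)(x) = x f(x)."""
    return lambda x: x * f(x)

def free_H(f, m=1.0):
    """Free Hamiltonian, (H f)(x) = -f''(x) / (2 m)."""
    return lambda x: -second_derivative(f, x) / (2 * m)

def commutator_QP(f, x):
    """((QP - PQ) f)(x); the text gives i f(x)."""
    return Q(P(f))(x) - P(Q(f))(x)

def commutator_HQ(f, x, m=1.0):
    """((HQ - QH) f)(x); the text gives -(i/m) (P f)(x)."""
    return free_H(Q(f), m)(x) - Q(free_H(f, m))(x)
```

For example, `commutator_QP(psi, 0.3)` should be close to `1j * psi(0.3)`, and `commutator_HQ(psi, 0.3, 2.0)` close to `(-1j / 2.0) * P(psi)(0.3)`, with discrepancies of the size of the finite-difference error.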
## Wave Packet Spreading
The final question in this chapter is what the abstract unitary formula predicts for an initially localized state. Free evolution preserves the $L^2$ norm and energy, but it does not preserve spatial localization; the kinetic Hamiltonian converts sharp localization into phase oscillation and then into spreading.
[example: Spreading of a Gaussian Wave Packet]
Let $m>0$, $\sigma>0$, and
\begin{align*}
\psi_0(x)=(2\pi\sigma^2)^{-1/4}\exp\left(-\frac{x^2}{4\sigma^2}\right).
\end{align*}
With the unitary Fourier transform convention
\begin{align*}
\hat f(\xi)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb R}e^{-ix\xi}f(x)\,d\mathcal L^1(x),
\end{align*}
we first compute $\hat\psi_0$. Using the Gaussian identity
\begin{align*}
\int_{\mathbb R}\exp(-ax^2+bx)\,d\mathcal L^1(x)=\left(\frac{\pi}{a}\right)^{1/2}\exp\left(\frac{b^2}{4a}\right)
\end{align*}
for $\operatorname{Re}a>0$, with $a=1/(4\sigma^2)$ and $b=-i\xi$, gives
\begin{align*}
\int_{\mathbb R}\exp\left(-\frac{x^2}{4\sigma^2}-ix\xi\right)\,d\mathcal L^1(x)=2\sigma\sqrt{\pi}\exp(-\sigma^2\xi^2).
\end{align*}
Therefore
\begin{align*}
\hat\psi_0(\xi)=\frac{1}{\sqrt{2\pi}}(2\pi\sigma^2)^{-1/4}2\sigma\sqrt{\pi}\exp(-\sigma^2\xi^2).
\end{align*}
The constant equals
\begin{align*}
\frac{2\sigma\sqrt{\pi}}{\sqrt{2\pi}(2\pi\sigma^2)^{1/4}}=\frac{\sqrt{2}\sigma}{(2\pi\sigma^2)^{1/4}}=\left(\frac{2\sigma^2}{\pi}\right)^{1/4},
\end{align*}
so
\begin{align*}
\hat\psi_0(\xi)=\left(\frac{2\sigma^2}{\pi}\right)^{1/4}\exp(-\sigma^2\xi^2).
\end{align*}
Under the free Hamiltonian
\begin{align*}
\mathcal H=\frac{P^2}{2m},
\end{align*}
the Fourier-side evolution is
\begin{align*}
\widehat{\psi(t)}(\xi)=e^{-it\xi^2/(2m)}\hat\psi_0(\xi).
\end{align*}
Put
\begin{align*}
a_t=\sigma^2+\frac{it}{2m}.
\end{align*}
Since $\operatorname{Re}a_t=\sigma^2>0$, the integrand below is absolutely integrable, and Fourier inversion gives
\begin{align*}
\psi(t,x)=\frac{1}{\sqrt{2\pi}}\left(\frac{2\sigma^2}{\pi}\right)^{1/4}\int_{\mathbb R}\exp(ix\xi-a_t\xi^2)\,d\mathcal L^1(\xi).
\end{align*}
Taking $a=a_t$ and $b=ix$ in the Gaussian identity gives
\begin{align*}
\int_{\mathbb R}\exp(ix\xi-a_t\xi^2)\,d\mathcal L^1(\xi)=\left(\frac{\pi}{a_t}\right)^{1/2}\exp\left(-\frac{x^2}{4a_t}\right).
\end{align*}
Hence
\begin{align*}
\psi(t,x)=\frac{1}{\sqrt{2}}\left(\frac{2\sigma^2}{\pi}\right)^{1/4}a_t^{-1/2}\exp\left(-\frac{x^2}{4a_t}\right).
\end{align*}
Now compute the modulus. Since
\begin{align*}
|a_t|^2=\sigma^4+\frac{t^2}{4m^2},
\end{align*}
we have
\begin{align*}
|a_t^{-1/2}|^2=|a_t|^{-1}.
\end{align*}
Also,
\begin{align*}
\left|\exp\left(-\frac{x^2}{4a_t}\right)\right|^2=\exp\left(-\frac{x^2}{4a_t}-\frac{x^2}{4\overline{a_t}}\right).
\end{align*}
Because
\begin{align*}
\frac{1}{a_t}+\frac{1}{\overline{a_t}}=\frac{a_t+\overline{a_t}}{|a_t|^2}=\frac{2\sigma^2}{|a_t|^2},
\end{align*}
this becomes
\begin{align*}
\left|\exp\left(-\frac{x^2}{4a_t}\right)\right|^2=\exp\left(-\frac{\sigma^2x^2}{2|a_t|^2}\right).
\end{align*}
Therefore
\begin{align*}
|\psi(t,x)|^2=\frac{1}{2}\left(\frac{2\sigma^2}{\pi}\right)^{1/2}\frac{1}{|a_t|}\exp\left(-\frac{\sigma^2x^2}{2|a_t|^2}\right).
\end{align*}
Writing this in the standard Gaussian form
\begin{align*}
|\psi(t,x)|^2=\frac{1}{\sqrt{2\pi}\,\sigma(t)}\exp\left(-\frac{x^2}{2\sigma(t)^2}\right)
\end{align*}
requires
\begin{align*}
\frac{1}{2\sigma(t)^2}=\frac{\sigma^2}{2|a_t|^2}.
\end{align*}
Thus
\begin{align*}
\sigma(t)^2=\frac{|a_t|^2}{\sigma^2}=\frac{\sigma^4+t^2/(4m^2)}{\sigma^2}=\sigma^2+\frac{t^2}{4m^2\sigma^2}.
\end{align*}
The packet is therefore narrowest at $t=0$ and spreads symmetrically as $|t|$ increases, while the normalization factor above keeps the total probability equal to $1$.
[/example]
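The variance formula can be confirmed by direct numerical integration of $|\psi(t,x)|^2$. The following Python sketch evaluates the closed form derived in the example and estimates the total probability and the second moment with a midpoint rule. The function names, the integration window, and the step count are assumptions made for illustration.

```python
import cmath
import math

def psi(t, x, m=1.0, sigma=1.0):
    """Closed-form evolved Gaussian from the example (principal square root)."""
    a_t = sigma**2 + 1j * t / (2 * m)
    prefactor = (2 * sigma**2 / math.pi) ** 0.25 / math.sqrt(2)
    return prefactor / cmath.sqrt(a_t) * cmath.exp(-x**2 / (4 * a_t))

def moments(t, m=1.0, sigma=1.0, half_width=40.0, n=8000):
    """Midpoint-rule estimates of total probability and of the integral of x^2 |psi|^2."""
    h = 2 * half_width / n
    total = 0.0
    second = 0.0
    for j in range(n):
        x = -half_width + (j + 0.5) * h
        density = abs(psi(t, x, m, sigma)) ** 2
        total += density * h
        second += x * x * density * h
    return total, second

def sigma_squared(t, m=1.0, sigma=1.0):
    """Variance predicted by the example: sigma^2 + t^2 / (4 m^2 sigma^2)."""
    return sigma**2 + t**2 / (4 * m**2 * sigma**2)
```

Because the density is even in $x$, its mean is zero and the second moment returned by `moments` is the variance, which should match `sigma_squared` for the same parameters.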
This example is the prototype for dispersive quantum motion. Unitary evolution preserves the Hilbert-space norm, but different spectral components acquire different phases; when transformed back to position space, those phase differences change the shape of the wavefunction.
[remark: What Time Independence Buys]
For a time-independent Hamiltonian, the entire dynamics is controlled by one self-adjoint operator and its spectral measure. This gives existence, uniqueness, conservation of norm, [conservation of energy](/theorems/1335), and a functional calculus formula for $U(t)$. Time-dependent Hamiltonians require a different theory: instead of a one-parameter group $U(t)$, one seeks propagators $U(t,s)$ satisfying $U(t,r)U(r,s)=U(t,s)$ and a non-autonomous Schrödinger equation.
[/remark]
The abstract theory of unitary evolution explains what it means for a Hamiltonian to generate motion. We now turn to one-dimensional Schrödinger operators, where these ideas become explicit through bound states, scattering states, barriers, wells, and currents.
# 6. One-Dimensional Quantum Systems
This chapter turns the abstract spectral theory of earlier chapters into concrete one-dimensional calculations. It assumes the Hilbert-space formulation of quantum mechanics, self-adjoint Hamiltonians, eigenvalues and continuous spectrum, and the basic separation of variables leading from the time-dependent Schrödinger equation to the stationary equation. The central problem is to solve the stationary Schrödinger equation on a line or interval, classify its spectrum, and interpret the solutions as bound states or scattering states. One-dimensional systems are special because the eigenvalue equation is an ordinary differential equation, so qualitative ODE methods, matching conditions, and conservation laws give strong information before any explicit formula is found.
## Schrödinger Operators in One Dimension
The first question is how the spectral problem for a quantum Hamiltonian becomes an ODE with boundary and integrability conditions. In one dimension the formal Hamiltonian has the form kinetic energy plus multiplication by a potential, but the physical problem is not determined until we specify the domain and the condition imposed at infinity or at endpoints.
[definition: One-Dimensional Schrödinger Operator]
Let $I \subseteq \mathbb R$ be an interval and let $V:I\to\mathbb R$ be a real-valued potential. A one-dimensional Schrödinger operator is a self-adjoint operator
\begin{align*}
H:D(H)\subset L^2(I)\to L^2(I)
\end{align*}
with domain $D(H)$ determined by boundary or integrability conditions, such that for each $\psi\in D(H)$,
\begin{align*}
H\psi=-\frac{\hbar^2}{2m}\psi''+V\psi
\end{align*}
in the weak sense. A real number $E$ and a non-zero function $\psi\in D(H)$ solve the stationary Schrödinger equation on $I$ if
\begin{align*}
H\psi=E\psi.
\end{align*}
[/definition]
When $V$ and $\psi$ are regular enough, the weak equation agrees with the corresponding classical second-order ODE on the interior of $I$.
The word "formal" is important: the differential expression alone does not determine an operator, and different domains can give different self-adjoint operators, hence different spectra. On the full line the standard bound-state domain asks for square-integrable wave functions; on a finite interval, endpoint boundary conditions such as Dirichlet, Neumann, or Robin conditions encode the physical walls.
[definition: Bound State]
For a self-adjoint one-dimensional Schrödinger operator $H$ on $L^2(I)$, a bound state is an eigenvector of $H$, that is, a non-zero $\psi\in D(H)$ with $H\psi=E\psi$ for some $E\in\mathbb R$.
[/definition]
Bound states are normalisable stationary states. Their probability density $|\psi(x)|^2$ remains spatially localised, and their energies appear as isolated spectral points in the simplest confining and short-range examples.
[example: Infinite Square Well]
Take $I=(0,L)$, $V=0$, and Dirichlet boundary conditions $\psi(0)=\psi(L)=0$. The stationary equation is
\begin{align*}
-\frac{\hbar^2}{2m}\psi''=E\psi.
\end{align*}
If $E=0$, then $\psi''=0$, so $\psi(x)=Ax+B$. The condition $\psi(0)=0$ gives $B=0$, and $\psi(L)=0$ gives $AL=0$, hence $A=0$; this is not an eigenfunction. If $E<0$, write $\alpha=\sqrt{-2mE}/\hbar>0$. Then $\psi''=\alpha^2\psi$, so
\begin{align*}
\psi(x)=Ae^{\alpha x}+Be^{-\alpha x}.
\end{align*}
The condition $\psi(0)=0$ gives $A+B=0$, so $B=-A$. Then $\psi(L)=0$ gives
\begin{align*}
Ae^{\alpha L}-Ae^{-\alpha L}=A(e^{\alpha L}-e^{-\alpha L})=0.
\end{align*}
Since $\alpha L>0$, we have $e^{\alpha L}-e^{-\alpha L}\ne0$, so $A=0$ and again $\psi=0$.
Thus a non-zero solution must have $E>0$. Put $q=\sqrt{2mE}/\hbar$. The equation becomes $\psi''+q^2\psi=0$, so
\begin{align*}
\psi(x)=A\sin(qx)+B\cos(qx).
\end{align*}
The condition $\psi(0)=0$ gives $B=0$, and then $\psi(L)=0$ gives
\begin{align*}
A\sin(qL)=0.
\end{align*}
For a non-zero solution, $A\ne0$, so $\sin(qL)=0$. Since $q>0$ and $L>0$, this means $qL=n\pi$ for some integer $n\ge1$, and therefore
\begin{align*}
q=\frac{n\pi}{L}.
\end{align*}
Since $q^2=2mE/\hbar^2$, the corresponding energy is
\begin{align*}
E_n=\frac{\hbar^2q^2}{2m}=\frac{\hbar^2\pi^2n^2}{2mL^2}.
\end{align*}
The unnormalised eigenfunction is $A\sin(n\pi x/L)$. To choose $A$ with $\|\psi_n\|_{L^2(0,L)}=1$, use $\sin^2 u=(1-\cos(2u))/2$:
\begin{align*}
\int_0^L \sin^2\left(\frac{n\pi x}{L}\right)\,dx=\int_0^L \frac{1-\cos(2n\pi x/L)}{2}\,dx.
\end{align*}
The right-hand side equals
\begin{align*}
\frac{L}{2}-\frac{1}{2}\left[\frac{L}{2n\pi}\sin\left(\frac{2n\pi x}{L}\right)\right]_{0}^{L}=\frac{L}{2}.
\end{align*}
Thus $|A|^2(L/2)=1$, and taking the positive real normalisation gives
\begin{align*}
\psi_n(x)=\sqrt{\frac{2}{L}}\sin\left(\frac{n\pi x}{L}\right).
\end{align*}
The allowed energies are discrete because the two endpoint conditions force the wave number to be an integer multiple of $\pi/L$, even though the potential is zero inside the interval.
[/example]
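The spectrum and the normalisation can be checked with a few lines of Python. This is a minimal sketch in units where $\hbar=m=1$ by default. The function names and the midpoint quadrature are assumptions made for illustration.

```python
import math

def energy(n, L=1.0, hbar=1.0, m=1.0):
    """E_n = hbar^2 pi^2 n^2 / (2 m L^2) for n = 1, 2, ..."""
    return (hbar * math.pi * n) ** 2 / (2 * m * L**2)

def psi(n, x, L=1.0):
    """Normalised Dirichlet eigenfunction on (0, L)."""
    return math.sqrt(2 / L) * math.sin(n * math.pi * x / L)

def inner_product(n1, n2, L=1.0, steps=2000):
    """Midpoint-rule approximation of the L^2(0, L) inner product of psi_n1 and psi_n2."""
    h = L / steps
    return sum(
        psi(n1, (j + 0.5) * h, L) * psi(n2, (j + 0.5) * h, L) for j in range(steps)
    ) * h
```

One expects `inner_product(n, n, L)` to be close to $1$ and `inner_product(n1, n2, L)` to be close to $0$ for distinct indices, while `energy(n) / energy(1)` equals $n^2$.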
After seeing a fully discrete example, the next issue is how one recognises discrete eigenvalues without solving the equation exactly. The Sturm-Liouville viewpoint supplies comparison and oscillation principles, but it needs a way to record how often a real eigenfunction crosses zero.
[definition: Node]
Let $\psi:I\to\mathbb R$ be a non-zero real-valued solution of a second-order Schrödinger equation. A node of $\psi$ is a point $x_0\in I$ such that $\psi(x_0)=0$.
[/definition]
Nodes measure oscillation. For one-dimensional real potentials, eigenfunctions can be chosen real, and the number of zeros should record where the corresponding eigenvalue sits in the ordered discrete spectrum. The theorem below makes this diagnostic precise and turns pictures of wave functions into spectral information.
[quotetheorem:6939]
This theorem is one of the main reasons one-dimensional quantum mechanics is so rigid. The hypotheses are doing real work: boundedness and regular endpoints ensure a discrete sequence of eigenvalues, while separated self-adjoint boundary conditions give real eigenvalues, orthogonal eigenfunctions, and the Sturm separation structure. On the full line, or for singular endpoints, the spectrum can contain continuous parts and the phrase "the $n$th eigenfunction" may no longer make sense without extra assumptions. Non-self-adjoint endpoint conditions can also destroy the variational ordering, so the nodal count is a theorem about self-adjoint Sturm-Liouville problems, not about arbitrary second-order ODEs.
The ground state has no nodes, the first excited state has one, and so on; this gives both a qualitative diagnostic for numerical solutions and a way to check whether an explicitly matched solution has been assigned the correct energy level.
[example: Node Count in the Infinite Well]
For the infinite square well, the $n$th normalised eigenfunction is
\begin{align*}
\psi_n(x)=\sqrt{\frac{2}{L}}\sin\left(\frac{n\pi x}{L}\right).
\end{align*}
A point $x\in(0,L)$ is a node exactly when $\psi_n(x)=0$. Since $\sqrt{2/L}\ne0$, this is equivalent to
\begin{align*}
\sin\left(\frac{n\pi x}{L}\right)=0.
\end{align*}
The sine function vanishes exactly at integer multiples of $\pi$, so
\begin{align*}
\frac{n\pi x}{L}=k\pi
\end{align*}
for some $k\in\mathbb Z$. Dividing by $\pi$ and solving for $x$ gives
\begin{align*}
x=\frac{kL}{n}.
\end{align*}
The condition $0<x<L$ becomes
\begin{align*}
0<\frac{kL}{n}<L.
\end{align*}
Because $L>0$ and $n>0$, this is equivalent to
\begin{align*}
0<k<n.
\end{align*}
Thus the interior zeros are precisely
\begin{align*}
x=\frac{kL}{n}\quad\text{for }k=1,\dots,n-1.
\end{align*}
There are therefore $n-1$ interior nodes, exactly as predicted by the *[Oscillation Theorem for One-Dimensional Bound States](/theorems/6939)*.
[/example]
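Counting sign changes on a grid is the numerical counterpart of this node count. The sketch below is illustrative: the sample count $997$ is chosen prime so that, for $n<997$, no interior grid point $jL/997$ coincides with a node $kL/n$. The function names are hypothetical.

```python
import math

def psi(n, x, L=1.0):
    """Normalised infinite-well eigenfunction."""
    return math.sqrt(2 / L) * math.sin(n * math.pi * x / L)

def exact_nodes(n, L=1.0):
    """Interior zeros x = k L / n for k = 1, ..., n - 1."""
    return [k * L / n for k in range(1, n)]

def count_sign_changes(n, L=1.0, samples=997):
    """Count sign changes of psi_n between consecutive interior grid points."""
    values = [psi(n, j * L / samples, L) for j in range(1, samples)]
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)
```

For small $n$ the grid count `count_sign_changes(n)` agrees with `len(exact_nodes(n))`, namely $n-1$. The same sign-change test is a common diagnostic for numerically computed eigenfunctions whose zeros are not known in closed form.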
## Bound-State Counting and Finite Wells
The next problem is to estimate how many bound states a potential supports before solving the transcendental matching equations. In one dimension the answer is governed by the depth and width of the classically allowed region, where $E>V(x)$ and the local wave number is real.
[definition: Classical Turning Point]
For an energy $E$ and potential $V:I\to\mathbb R$, a classical turning point is a point $x_0\in I$ such that $V(x_0)=E$.
[/definition]
Turning points separate oscillatory and exponential behaviour in the stationary equation. Away from turning points, the sign of $E-V(x)$ predicts whether a local solution resembles a sinusoid or an exponential.
[explanation: Bound-State Counting Heuristic]
For a potential well on the line, a rough count of bound states comes from fitting half-wavelengths inside the classically allowed region. If $E>V(x)$, the local wave number is
\begin{align*}
k(x)=\frac{\sqrt{2m(E-V(x))}}{\hbar}.
\end{align*}
The WKB quantisation rule for two turning points $a(E)<b(E)$ is
\begin{align*}
\int_{a(E)}^{b(E)} k(x)\,dx \approx \pi\left(n+\frac{1}{2}\right).
\end{align*}
Thus wider and deeper wells support more bound states, while shallow or narrow wells may support only a ground state. The formula is a semiclassical guide rather than an exact spectral theorem in this chapter.
[/explanation]
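As an illustration of the quantisation rule, the following Python sketch applies it to the harmonic potential $V(x)=x^2/2$ in units $\hbar=m=\omega=1$, where the turning points are $\pm\sqrt{2E}$. These units, the midpoint quadrature, the bisection bracket, and the function names are all assumptions of the example. For this particular potential the phase integral equals $\pi E$, so the rule returns $E=n+\tfrac12$, which coincides with the exact oscillator levels.

```python
import math

def wkb_phase(E, V, a, b, m=1.0, hbar=1.0, steps=4000):
    """Midpoint-rule approximation of the integral of k(x) between turning points a < b."""
    h = (b - a) / steps
    total = 0.0
    for j in range(steps):
        x = a + (j + 0.5) * h
        # max(..., 0) guards against tiny negative values from rounding.
        total += math.sqrt(max(2 * m * (E - V(x)), 0.0)) / hbar
    return total * h

def oscillator_potential(x):
    """V(x) = x^2 / 2, i.e. m = omega = 1 (an illustrative choice)."""
    return 0.5 * x * x

def wkb_energy(n, E_low=1e-6, E_high=50.0, iterations=40):
    """Solve wkb_phase(E) = pi (n + 1/2) for the oscillator by bisection in E."""
    target = math.pi * (n + 0.5)
    for _ in range(iterations):
        E_mid = 0.5 * (E_low + E_high)
        turning = math.sqrt(2 * E_mid)
        if wkb_phase(E_mid, oscillator_potential, -turning, turning) < target:
            E_low = E_mid
        else:
            E_high = E_mid
    return 0.5 * (E_low + E_high)
```

The bisection relies on the phase integral being increasing in $E$, which holds here. For a general well the turning points would have to be located numerically as well.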
This heuristic becomes concrete in the finite square well. It is the first example where the wave is oscillatory inside the well but decays outside, and where the allowed energies are determined by matching both the value and derivative of the wave function.
[example: Finite Square Well]
Let $V(x)=-V_0$ for $|x|<a$ and $V(x)=0$ for $|x|\ge a$, where $V_0>0$. For a bound state with $-V_0<E<0$, define
\begin{align*}
k=\frac{\sqrt{2m(E+V_0)}}{\hbar},\qquad \kappa=\frac{\sqrt{-2mE}}{\hbar}.
\end{align*}
Inside the well, the stationary equation becomes
\begin{align*}
-\frac{\hbar^2}{2m}\psi''-V_0\psi=E\psi,
\end{align*}
so
\begin{align*}
\psi''+\frac{2m(E+V_0)}{\hbar^2}\psi=0,
\end{align*}
that is,
\begin{align*}
\psi''+k^2\psi=0.
\end{align*}
Outside the well, where $V=0$, the equation is
\begin{align*}
-\frac{\hbar^2}{2m}\psi''=E\psi,
\end{align*}
so
\begin{align*}
\psi''=\frac{-2mE}{\hbar^2}\psi=\kappa^2\psi.
\end{align*}
The square-integrable outside solutions must decay as $|x|\to\infty$.
Because the potential is even, solutions can be chosen even or odd. For an even bound state, take
\begin{align*}
\psi(x)=A\cos(kx)\quad\text{for }|x|<a,
\end{align*}
and
\begin{align*}
\psi(x)=Be^{-\kappa |x|}\quad\text{for }|x|\ge a.
\end{align*}
Matching the value at $x=a$ gives
\begin{align*}
A\cos(ka)=Be^{-\kappa a}.
\end{align*}
Differentiating the two pieces gives
\begin{align*}
\psi'(x)=-Ak\sin(kx)\quad\text{inside the well}
\end{align*}
and, on the right exterior,
\begin{align*}
\psi'(x)=-\kappa Be^{-\kappa x}.
\end{align*}
Matching derivatives at $x=a$ gives
\begin{align*}
-Ak\sin(ka)=-\kappa Be^{-\kappa a}.
\end{align*}
Using $Be^{-\kappa a}=A\cos(ka)$ from value matching, this becomes
\begin{align*}
Ak\sin(ka)=\kappa A\cos(ka).
\end{align*}
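For a non-zero even state, $A\ne0$. Moreover $\cos(ka)\ne0$, since otherwise the previous equation would force $k\sin(ka)=0$ with $\sin(ka)=\pm1$ and $k>0$. Division by $A\cos(ka)$ gives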
\begin{align*}
k\tan(ka)=\kappa.
\end{align*}
For an odd bound state, take
\begin{align*}
\psi(x)=A\sin(kx)\quad\text{for }|x|<a.
\end{align*}
On the right exterior write
\begin{align*}
\psi(x)=Be^{-\kappa x}\quad\text{for }x\ge a.
\end{align*}
Matching the value at $x=a$ gives
\begin{align*}
A\sin(ka)=Be^{-\kappa a}.
\end{align*}
The derivatives at $x=a$ are
\begin{align*}
Ak\cos(ka)
\end{align*}
from the inside and
\begin{align*}
-\kappa Be^{-\kappa a}
\end{align*}
from the outside, so derivative matching gives
\begin{align*}
Ak\cos(ka)=-\kappa Be^{-\kappa a}.
\end{align*}
Using $Be^{-\kappa a}=A\sin(ka)$, we get
\begin{align*}
Ak\cos(ka)=-\kappa A\sin(ka).
\end{align*}
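For a non-zero odd state, $A\ne0$, and $\sin(ka)\ne0$ because otherwise the previous equation would force $k\cos(ka)=0$ with $\cos(ka)=\pm1$ and $k>0$. Division by $A\sin(ka)$ gives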
\begin{align*}
k\cot(ka)=-\kappa,
\end{align*}
equivalently
\begin{align*}
-k\cot(ka)=\kappa.
\end{align*}
Finally,
\begin{align*}
k^2+\kappa^2=\frac{2m(E+V_0)}{\hbar^2}+\frac{-2mE}{\hbar^2}.
\end{align*}
The $E$ terms cancel, leaving
\begin{align*}
k^2+\kappa^2=\frac{2mV_0}{\hbar^2}.
\end{align*}
Thus the allowed bound-state energies are determined by the intersections, in the quadrant $k>0$, $\kappa>0$ of the $(k,\kappa)$-plane, of this circle with the even-state curve $k\tan(ka)=\kappa$ and the odd-state curve $-k\cot(ka)=\kappa$; each intersection gives one matched decaying eigenfunction.
[/example]
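The matching equations have no closed-form solution, but they are easy to solve numerically. The Python sketch below works with the dimensionless variables $z=ka$ and $z_0=a\sqrt{2mV_0}/\hbar$, which are introduced here for convenience and are not notation from the text. In these variables the circle reads $(\kappa a)^2=z_0^2-z^2$ and $E/V_0=z^2/z_0^2-1$. To avoid the poles of $\tan$ and $\cot$, the equations are multiplied through by $\cos z$ and $\sin z$ respectively, and each branch interval of length $\pi/2$ is searched by bisection. The function names are hypothetical, and the sketch assumes $z_0$ is not an exact multiple of $\pi/2$.

```python
import math

def bisect(g, lo, hi, iterations=100):
    """Bisection for a sign change of g on [lo, hi]."""
    g_lo = g(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def finite_well_levels(z0):
    """Return (parity, z, E / V0) for every bound state of the well with parameter z0."""
    def kappa_a(z):
        return math.sqrt(max(z0 * z0 - z * z, 0.0))

    def even(z):  # z tan z = kappa a, multiplied by cos z
        return z * math.sin(z) - kappa_a(z) * math.cos(z)

    def odd(z):  # -z cot z = kappa a, multiplied by sin z
        return z * math.cos(z) + kappa_a(z) * math.sin(z)

    levels = []
    j = 0
    while j * math.pi / 2 < z0:
        lo = j * math.pi / 2
        hi = min(lo + math.pi / 2, z0)
        parity = "even" if j % 2 == 0 else "odd"
        z = bisect(even if j % 2 == 0 else odd, lo, hi)
        levels.append((parity, z, (z / z0) ** 2 - 1))
        j += 1
    return levels
```

Calling `finite_well_levels(z0)` lists the levels in increasing energy with alternating parity, starting from an even ground state. A shallow or narrow well with $z_0<\pi/2$ yields a single even state.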
The finite well also illustrates a one-dimensional feature that contrasts with higher-dimensional intuition. Even an attractive potential that is very narrow can produce a bound state, as the delta potential shows in its limiting form.
[example: Attractive Delta Potential]
Consider the formal potential $V(x)=-\alpha\delta(x)$ with $\alpha>0$ on the line. Away from $x=0$ the potential is zero, so a bound state with energy $E<0$ satisfies the free equation
\begin{align*}
-\frac{\hbar^2}{2m}\psi''=E\psi
\end{align*}
on each half-line. Put
\begin{align*}
\kappa=\frac{\sqrt{-2mE}}{\hbar}>0.
\end{align*}
Then
\begin{align*}
\psi''=\kappa^2\psi.
\end{align*}
The square-integrable solution on $x>0$ is a multiple of $e^{-\kappa x}$, and the square-integrable solution on $x<0$ is a multiple of $e^{\kappa x}$. Continuity at the point interaction gives a common value at $0$, so the even bound-state ansatz is
\begin{align*}
\psi(x)=Ce^{-\kappa |x|}
\end{align*}
with $C\ne0$.
For $x>0$,
\begin{align*}
\psi'(x)=-\kappa Ce^{-\kappa x}.
\end{align*}
Thus
\begin{align*}
\psi'(0^+)=-\kappa C.
\end{align*}
For $x<0$, since $|x|=-x$,
\begin{align*}
\psi(x)=Ce^{\kappa x}.
\end{align*}
Therefore
\begin{align*}
\psi'(x)=\kappa Ce^{\kappa x}.
\end{align*}
Thus
\begin{align*}
\psi'(0^-)=\kappa C.
\end{align*}
The jump in the derivative is
\begin{align*}
\psi'(0^+)-\psi'(0^-)=-\kappa C-\kappa C=-2\kappa C.
\end{align*}
The point interaction imposes
\begin{align*}
\psi'(0^+)-\psi'(0^-)=-\frac{2m\alpha}{\hbar^2}\psi(0).
\end{align*}
Since $\psi(0)=C$, this becomes
\begin{align*}
-2\kappa C=-\frac{2m\alpha}{\hbar^2}C.
\end{align*}
Because $C\ne0$, division by $-2C$ gives
\begin{align*}
\kappa=\frac{m\alpha}{\hbar^2}.
\end{align*}
Using $\kappa^2=-2mE/\hbar^2$, we get
\begin{align*}
E=-\frac{\hbar^2\kappa^2}{2m}.
\end{align*}
Substituting $\kappa=m\alpha/\hbar^2$ gives
\begin{align*}
E=-\frac{\hbar^2}{2m}\frac{m^2\alpha^2}{\hbar^4}=-\frac{m\alpha^2}{2\hbar^2}.
\end{align*}
There is only one positive value of $\kappa$ satisfying the jump condition, so the attractive delta potential has exactly one bound-state energy; its wave function is even and decays exponentially away from the origin.
[/example]
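A short numerical check of the jump condition may help fix the signs. The sketch below is illustrative: it assumes the normalisation $C=\sqrt{\kappa}$ (which makes $\|\psi\|_{L^2}=1$ and is not fixed in the text), uses one-sided difference quotients at the origin, and relies on hypothetical function names.

```python
import math

def delta_bound_state(alpha, m=1.0, hbar=1.0):
    """Decay rate kappa = m alpha / hbar^2 and energy E = -m alpha^2 / (2 hbar^2)."""
    kappa = m * alpha / hbar**2
    energy = -m * alpha**2 / (2 * hbar**2)
    return kappa, energy

def psi(x, kappa):
    """Bound state with the assumed normalisation C = sqrt(kappa)."""
    return math.sqrt(kappa) * math.exp(-kappa * abs(x))

def derivative_jump(kappa, h=1e-6):
    """One-sided difference approximation of psi'(0+) - psi'(0-)."""
    right = (psi(h, kappa) - psi(0.0, kappa)) / h
    left = (psi(0.0, kappa) - psi(-h, kappa)) / h
    return right - left

def norm_squared(kappa, half_width=10.0, steps=20000):
    """Midpoint-rule integral of |psi|^2; an even step count keeps the kink on a cell edge."""
    h = 2 * half_width / steps
    return sum(psi(-half_width + (j + 0.5) * h, kappa) ** 2 for j in range(steps)) * h
```

With `kappa` taken from `delta_bound_state`, the value of `derivative_jump(kappa)` should be close to $-(2m\alpha/\hbar^2)\psi(0)$, and `norm_squared(kappa)` close to $1$.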
## Scattering States and Probability Current
Bound states are only part of the spectral picture. The next question is how to describe non-normalisable stationary solutions representing particles incoming from infinity, interacting with a localised potential, and emerging as reflected and transmitted waves.
[definition: Scattering State]
Let $V:\mathbb R\to\mathbb R$ be a real-valued potential that decays sufficiently fast as $|x|\to\infty$, and let $E>0$. A scattering state at energy $E$ is a non-zero function $\phi:\mathbb R\to\mathbb C$ with $\phi\in H^2_{\mathrm{loc}}(\mathbb R)$ such that
\begin{align*}
-\frac{\hbar^2}{2m}\phi''+V\phi=E\phi
\end{align*}
in the distributional sense and $\phi$ has asymptotic plane-wave behaviour as $x\to\pm\infty$.
[/definition]
Scattering states are not elements of $L^2(\mathbb R)$, but they encode the continuous spectrum and the physical scattering amplitudes. The difficulty is that a non-normalisable solution $\phi$ cannot be assigned a probability by integrating $|\phi|^2$ over the line.
To extract measurable data from such a solution, one replaces spatial probabilities by asymptotic wave coefficients. For a wave sent in from the left there are two unknown coefficients in the asymptotic form: the amplitude of the wave returning toward $-\infty$ and the amplitude of the wave transmitted toward $+\infty$. The next definition names them.
[definition: Reflection and Transmission Amplitudes]
For a short-range potential and an incoming plane wave from the left at energy $E>0$, set
\begin{align*}
k=\frac{\sqrt{2mE}}{\hbar}.
\end{align*}
The reflection and transmission amplitudes $R,T\in\mathbb C$ are defined by the asymptotics
\begin{align*}
\psi(x)&\sim e^{ikx}+Re^{-ikx}\quad \text{as }x\to-\infty,\qquad \psi(x)\sim Te^{ikx}\quad \text{as }x\to+\infty.
\end{align*}
[/definition]
The amplitudes are complex because phase matters, and their squared moduli can be read as probabilities only after the incoming and outgoing fluxes have been compared. This raises the conservation question: what quantity remains constant along a stationary scattering solution and justifies the flux interpretation?
[definition: Probability Current]
For a wave function $\psi:\mathbb R\times\mathbb R\to\mathbb C$ solving the time-dependent Schrödinger equation with real potential and with enough regularity for the derivatives below to be defined pointwise, the probability current associated to $\psi$ is the map
\begin{align*}
j_\psi:\mathbb R\times\mathbb R\to\mathbb R
\end{align*}
defined by
\begin{align*}
j_\psi(x,t)=\frac{\hbar}{m}\operatorname{Im}\big(\overline{\psi(x,t)}\,\partial_x\psi(x,t)\big).
\end{align*}
[/definition]
This current is the one-dimensional flux of probability density. For the scattering amplitudes to have a probability interpretation, flux cannot appear or disappear inside a region where the potential is real.
The definition of $j_\psi$ only names the flux; it does not yet show that the flux is conserved or tied to the probability density. The next theorem supplies the local conservation law: it identifies exactly how changes in $|\psi|^2$ are balanced by the spatial flow measured by $j_\psi$, so that stationary scattering currents can later be compared at the two ends of the line.
[quotetheorem:6940]
[citeproof:6940]
The real-valued potential hypothesis is essential because it is what makes the potential terms cancel in the continuity equation. If $V$ has an imaginary part, as in an absorbing optical potential, probability can be lost or gained locally and the current need not be constant. The theorem also does not say that scattering states are normalisable; it only gives a flux conservation law for stationary solutions whose plane-wave asymptotics carry finite current.
For a real potential, any loss of current to the reflected wave is exactly balanced by the current continuing to the right. To obtain the standard probability identity, the next step is to compute this conserved current in the two asymptotic regions.
[quotetheorem:6941]
[citeproof:6941]
This identity is a unitarity statement for the one-dimensional scattering matrix. The equal limiting potential assumption means the incoming and transmitted plane waves have the same group velocity, so the flux ratios reduce to $|R|^2$ and $|T|^2$. If the limiting potentials differ, the transmitted wave number changes and the identity contains a velocity factor instead of the bare term $|T|^2$; if the potential is complex, flux conservation can fail altogether. Thus $R$ and $T$ should not be interpreted as probabilities until they have been converted into flux ratios.
## Tunneling Through Barriers
The next question is what happens when the incoming energy is below the height of a barrier. Classical mechanics forbids passage through a region where $V(x)>E$, but the Schrödinger equation permits an exponentially decaying wave inside the barrier and a non-zero transmitted wave beyond it.
[definition: Tunneling]
Tunneling is the occurrence of a non-zero transmission probability through a region where $V(x)>E$ for the stationary Schrödinger equation at energy $E$.
[/definition]
Inside such a region the local wave number becomes imaginary, so the wave no longer oscillates. The finite rectangular barrier gives the standard calculation and shows the exponential dependence on barrier width.
[example: Rectangular Potential Barrier]
Let $V(x)=V_0$ for $0<x<a$ and let $V(x)=0$ for $x<0$ and $x>a$, with $0<E<V_0$. Put
\begin{align*}
k=\frac{\sqrt{2mE}}{\hbar},\qquad \kappa=\frac{\sqrt{2m(V_0-E)}}{\hbar}.
\end{align*}
For incidence from the left, write the three pieces as
\begin{align*}
\psi(x)=e^{ikx}+Re^{-ikx}\quad (x<0),
\end{align*}
\begin{align*}
\psi(x)=Ae^{\kappa x}+Be^{-\kappa x}\quad (0<x<a),
\end{align*}
and
\begin{align*}
\psi(x)=Te^{ikx}\quad (x>a).
\end{align*}
Inside the barrier the exponentials occur because the stationary equation is
\begin{align*}
-\frac{\hbar^2}{2m}\psi''+V_0\psi=E\psi,
\end{align*}
hence
\begin{align*}
\psi''=\frac{2m(V_0-E)}{\hbar^2}\psi=\kappa^2\psi.
\end{align*}
First match $\psi$ and $\psi'$ at $x=a$. The value condition is
\begin{align*}
Ae^{\kappa a}+Be^{-\kappa a}=Te^{ika}.
\end{align*}
The derivative condition is
\begin{align*}
\kappa Ae^{\kappa a}-\kappa Be^{-\kappa a}=ikTe^{ika}.
\end{align*}
Set $U=Ae^{\kappa a}$ and $W=Be^{-\kappa a}$. Then
\begin{align*}
U+W=Te^{ika}.
\end{align*}
Also,
\begin{align*}
U-W=\frac{ik}{\kappa}Te^{ika}.
\end{align*}
Adding these two equations gives
\begin{align*}
2U=\left(1+\frac{ik}{\kappa}\right)Te^{ika},
\end{align*}
so
\begin{align*}
A=\frac{T e^{ika}e^{-\kappa a}}{2}\left(1+\frac{ik}{\kappa}\right).
\end{align*}
Subtracting the second equation from the first gives
\begin{align*}
2W=\left(1-\frac{ik}{\kappa}\right)Te^{ika},
\end{align*}
so
\begin{align*}
B=\frac{T e^{ika}e^{\kappa a}}{2}\left(1-\frac{ik}{\kappa}\right).
\end{align*}
Now match at $x=0$. The value condition is
\begin{align*}
1+R=A+B.
\end{align*}
The derivative condition is
\begin{align*}
ik(1-R)=\kappa(A-B).
\end{align*}
From $1+R=A+B$, we have $R=A+B-1$, so
\begin{align*}
ik(1-R)=ik(2-A-B).
\end{align*}
Thus
\begin{align*}
ik(2-A-B)=\kappa(A-B).
\end{align*}
Moving the $A$ and $B$ terms to the right gives
\begin{align*}
2ik=(ik+\kappa)A+(ik-\kappa)B.
\end{align*}
Substituting the formulas for $A$ and $B$ gives
\begin{align*}
2ik=\frac{T e^{ika}}{2}\left((ik+\kappa)\left(1+\frac{ik}{\kappa}\right)e^{-\kappa a}+(ik-\kappa)\left(1-\frac{ik}{\kappa}\right)e^{\kappa a}\right).
\end{align*}
The two products are
\begin{align*}
(ik+\kappa)\left(1+\frac{ik}{\kappa}\right)=\frac{(\kappa+ik)^2}{\kappa}=\frac{\kappa^2-k^2+2i\kappa k}{\kappa}
\end{align*}
and
\begin{align*}
(ik-\kappa)\left(1-\frac{ik}{\kappa}\right)=\frac{-\kappa^2+k^2+2i\kappa k}{\kappa}.
\end{align*}
Therefore the bracket equals
\begin{align*}
\frac{\kappa^2-k^2+2i\kappa k}{\kappa}e^{-\kappa a}+\frac{-\kappa^2+k^2+2i\kappa k}{\kappa}e^{\kappa a}.
\end{align*}
Using $e^{\kappa a}+e^{-\kappa a}=2\cosh(\kappa a)$ and $e^{\kappa a}-e^{-\kappa a}=2\sinh(\kappa a)$, this becomes
\begin{align*}
\frac{2}{\kappa}\left((k^2-\kappa^2)\sinh(\kappa a)+2i\kappa k\cosh(\kappa a)\right).
\end{align*}
Hence
\begin{align*}
2ik=\frac{T e^{ika}}{\kappa}\left((k^2-\kappa^2)\sinh(\kappa a)+2i\kappa k\cosh(\kappa a)\right).
\end{align*}
Solving for $T$ gives
\begin{align*}
T=\frac{2i\kappa k e^{-ika}}{(k^2-\kappa^2)\sinh(\kappa a)+2i\kappa k\cosh(\kappa a)}.
\end{align*}
Since $|e^{-ika}|=1$, taking squared moduli gives
\begin{align*}
|T|^2=\frac{4\kappa^2k^2}{(k^2-\kappa^2)^2\sinh^2(\kappa a)+4\kappa^2k^2\cosh^2(\kappa a)}.
\end{align*}
Using $\cosh^2 u=1+\sinh^2 u$, the denominator is
\begin{align*}
4\kappa^2k^2+\left((k^2-\kappa^2)^2+4\kappa^2k^2\right)\sinh^2(\kappa a).
\end{align*}
The coefficient of $\sinh^2(\kappa a)$ satisfies
\begin{align*}
(k^2-\kappa^2)^2+4\kappa^2k^2=k^4+2k^2\kappa^2+\kappa^4=(k^2+\kappa^2)^2.
\end{align*}
Therefore
\begin{align*}
|T|^2=\left(1+\frac{(k^2+\kappa^2)^2}{4\kappa^2k^2}\sinh^2(\kappa a)\right)^{-1}.
\end{align*}
Now
\begin{align*}
k^2+\kappa^2=\frac{2mE}{\hbar^2}+\frac{2m(V_0-E)}{\hbar^2}=\frac{2mV_0}{\hbar^2}
\end{align*}
and
\begin{align*}
4\kappa^2k^2=4\frac{2m(V_0-E)}{\hbar^2}\frac{2mE}{\hbar^2}=\frac{16m^2E(V_0-E)}{\hbar^4}.
\end{align*}
Thus
\begin{align*}
\frac{(k^2+\kappa^2)^2}{4\kappa^2k^2}=\frac{4m^2V_0^2/\hbar^4}{16m^2E(V_0-E)/\hbar^4}=\frac{V_0^2}{4E(V_0-E)}.
\end{align*}
So the transmission probability is
\begin{align*}
|T|^2=\left(1+\frac{V_0^2\sinh^2(\kappa a)}{4E(V_0-E)}\right)^{-1}.
\end{align*}
For $\kappa a$ large,
\begin{align*}
\sinh(\kappa a)=\frac{e^{\kappa a}-e^{-\kappa a}}{2}=\frac{e^{\kappa a}}{2}(1-e^{-2\kappa a}),
\end{align*}
so $\sinh^2(\kappa a)$ is asymptotic to $e^{2\kappa a}/4$. Hence $|T|^2$ is asymptotic to
\begin{align*}
\frac{16E(V_0-E)}{V_0^2}e^{-2\kappa a}.
\end{align*}
The transmitted wave is non-zero for every finite $a$, but its flux is exponentially suppressed as the forbidden width $a$ increases.
[/example]
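The formulas of this example can be checked numerically. The Python sketch below, with hypothetical function names and default units $\hbar=m=1$, evaluates $T$ from the closed form, recovers $A$, $B$, and $R$ from the matching conditions, and compares $|T|^2$ with both the exact expression and the large-$\kappa a$ asymptotics.

```python
import cmath
import math

def wave_numbers(E, V0, m=1.0, hbar=1.0):
    """Return (k, kappa) for 0 < E < V0."""
    k = math.sqrt(2.0 * m * E) / hbar
    kappa = math.sqrt(2.0 * m * (V0 - E)) / hbar
    return k, kappa

def barrier_amplitudes(E, V0, a, m=1.0, hbar=1.0):
    """Reflection and transmission amplitudes (R, T) of the rectangular barrier."""
    k, kappa = wave_numbers(E, V0, m, hbar)
    s, c = math.sinh(kappa * a), math.cosh(kappa * a)
    T = 2j * kappa * k * cmath.exp(-1j * k * a) / ((k**2 - kappa**2) * s + 2j * kappa * k * c)
    edge = T * cmath.exp(1j * k * a)                       # value of psi at x = a
    A = 0.5 * edge * math.exp(-kappa * a) * (1 + 1j * k / kappa)
    B = 0.5 * edge * math.exp(kappa * a) * (1 - 1j * k / kappa)
    R = A + B - 1                                          # value condition at x = 0
    return R, T

def transmission_probability(E, V0, a, m=1.0, hbar=1.0):
    """Closed form |T|^2 = (1 + V0^2 sinh^2(kappa a) / (4 E (V0 - E)))^(-1)."""
    _, kappa = wave_numbers(E, V0, m, hbar)
    return 1.0 / (1.0 + V0**2 * math.sinh(kappa * a) ** 2 / (4.0 * E * (V0 - E)))

def transmission_asymptotic(E, V0, a, m=1.0, hbar=1.0):
    """Leading behaviour 16 E (V0 - E) / V0^2 * exp(-2 kappa a) for large kappa a."""
    _, kappa = wave_numbers(E, V0, m, hbar)
    return 16.0 * E * (V0 - E) / V0**2 * math.exp(-2.0 * kappa * a)

R, T = barrier_amplitudes(0.8, 2.0, 1.0)
flux_sum = abs(R) ** 2 + abs(T) ** 2          # should equal 1 for a real potential
ratio = transmission_probability(0.8, 2.0, 5.0) / transmission_asymptotic(0.8, 2.0, 5.0)
```

Here `flux_sum` tests the identity $|R|^2+|T|^2=1$ for this barrier, and `ratio` measures how close the exact transmission probability is to its exponential approximation when $\kappa a$ is large.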
Tunneling is not only a barrier effect; it also explains the splitting of nearly degenerate states in double-well potentials. The overlap of exponentially decaying tails through the classically forbidden region controls how much two localised well states hybridise.
[remark: Exponential Sensitivity]
Small changes in barrier width or height can produce large changes in transmission. This is why semiclassical tunneling factors are often written in the form
\begin{align*}
\exp\left(-2\int_a^b \frac{\sqrt{2m(V(x)-E)}}{\hbar}\,dx\right),
\end{align*}
where $(a,b)$ is the forbidden region.
[/remark]
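The semiclassical factor in the remark can be evaluated by elementary quadrature. The following Python sketch uses a midpoint rule; the function name `wkb_factor`, the endpoint names `x1`, `x2` (standing for the ends of the forbidden region), the parabolic test barrier, and the units $\hbar=m=1$ are all illustrative assumptions.

```python
import math

def wkb_factor(V, E, x1, x2, m=1.0, hbar=1.0, steps=20000):
    """exp(-2 * integral over (x1, x2) of sqrt(2m(V - E)) / hbar), midpoint rule."""
    h = (x2 - x1) / steps
    total = 0.0
    for i in range(steps):
        x = x1 + (i + 0.5) * h
        # max(..., 0) guards against tiny negative values near the turning points
        total += math.sqrt(max(2.0 * m * (V(x) - E), 0.0)) / hbar
    return math.exp(-2.0 * total * h)

E, V0 = 0.8, 2.0
kappa = math.sqrt(2.0 * (V0 - E))            # decay rate for a flat barrier, m = hbar = 1

def rectangular(x):
    return V0

narrow = wkb_factor(rectangular, E, 0.0, 5.0)
wide = wkb_factor(rectangular, E, 0.0, 5.5)  # the same barrier, ten percent wider

w = 3.0                                      # half-width parameter of a smooth barrier

def parabolic(x):
    return V0 * (1.0 - (x / w) ** 2)

b = w * math.sqrt(1.0 - E / V0)              # turning points at x = -b and x = b
smooth = wkb_factor(parabolic, E, -b, b)
smooth_exact = math.exp(-math.pi * math.sqrt(2.0 * V0) * b * b / w)
```

For the flat barrier the factor reduces to $e^{-2\kappa a}$, so widening by $\Delta a$ multiplies it by $e^{-2\kappa\Delta a}$; the parabolic case is compared with the elementary integral $\int_{-b}^{b}\sqrt{b^2-x^2}\,dx=\pi b^2/2$.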
The rectangular barrier calculation relies on exact piecewise solutions, and the tunneling factor suggests what should remain true for smooth barriers. The next result records the local rule needed to pass through a turning point, where neither the oscillatory nor the exponential WKB formula is valid by itself.
[quotetheorem:6942]
The simple turning point hypothesis is essential because it is what produces the Airy scaling and the universal phase shift $\pi/4$. At a multiple turning point, the linear approximation $V(x)-E\approx V'(x_0)(x-x_0)$ vanishes, so the Airy model no longer captures the local equation; for example, near a quadratic minimum with $V(x)-E\approx c(x-x_0)^2$, the local model is closer to the parabolic-cylinder equation. When two simple turning points approach within the Airy transition scale, treating them separately double-counts the local transition region and gives the wrong phase and amplitude. In a shallow well near the threshold where the two turning points coalesce, the usual two-turning-point quantisation rule must therefore be replaced by a uniform approximation adapted to the combined pair. The connection formula is the bridge from local WKB waves to global semiclassical rules: it supplies the phase correction in bound-state quantisation and the exponential factor in tunneling estimates.
## Resonances and Quasi-Bound States
The final issue in this chapter is how a barrier can temporarily trap probability without producing a genuine $L^2$ bound state. Such states behave like bound states for a long time but leak through tunneling, producing sharp peaks in transmission.
[definition: Resonance]
In one-dimensional scattering, a resonance is an energy at which the transmission probability is close to one, or a scattering feature associated with a nearby pole after [analytic continuation](/page/Analytic%20Continuation) in the complex energy or wave-number variable.
[/definition]
The pole formulation is the analytic viewpoint used in more advanced scattering theory. In elementary barrier problems, resonances occur when the wave inside a well or barrier region fits an approximately standing-wave condition.
[example: Resonant Transmission Through a Well]
Consider a well enclosed between two barriers. Model the two barriers as identical, lossless scatterers with single-barrier transmission amplitude $t$ and reflection amplitude $r$, and write
\begin{align*}
\tau=|t|^2,\qquad |r|^2=1-\tau.
\end{align*}
Let $\Phi(E)$ be the total phase gained by a wave after one round trip inside the well, including propagation across the well and the two reflection phases, and let $\theta(E)$ be the real phase accumulated on the direct path across the well. The transmitted wave is the sum of all paths that transmit through the first barrier, make $j$ complete round trips, and then transmit through the second barrier:
\begin{align*}
\mathcal T(E)=t^2 e^{i\theta(E)}\sum_{j=0}^{\infty}\left(|r|^2 e^{i\Phi(E)}\right)^j.
\end{align*}
Since $|r|^2<1$ when the barriers have non-zero transmission, the geometric series gives
\begin{align*}
\mathcal T(E)=\frac{t^2 e^{i\theta(E)}}{1-|r|^2 e^{i\Phi(E)}}.
\end{align*}
Taking squared moduli and using $|e^{i\theta(E)}|=1$ gives
\begin{align*}
|\mathcal T(E)|^2=\frac{|t|^4}{|1-|r|^2 e^{i\Phi(E)}|^2}.
\end{align*}
Now
\begin{align*}
|1-|r|^2 e^{i\Phi}|^2=(1-|r|^2\cos\Phi)^2+(|r|^2\sin\Phi)^2.
\end{align*}
Expanding the squares gives
\begin{align*}
|1-|r|^2 e^{i\Phi}|^2=1-2|r|^2\cos\Phi+|r|^4\cos^2\Phi+|r|^4\sin^2\Phi.
\end{align*}
Since $\cos^2\Phi+\sin^2\Phi=1$, this becomes
\begin{align*}
|1-|r|^2 e^{i\Phi}|^2=1-2|r|^2\cos\Phi+|r|^4.
\end{align*}
Using $|r|^2=1-\tau$ and $|t|^4=\tau^2$, we get
\begin{align*}
|\mathcal T(E)|^2=\frac{\tau^2}{1-2(1-\tau)\cos\Phi(E)+(1-\tau)^2}.
\end{align*}
The denominator can be regrouped as
\begin{align*}
1-2(1-\tau)\cos\Phi+(1-\tau)^2=\tau^2+2(1-\tau)(1-\cos\Phi).
\end{align*}
At a resonant energy, $\Phi(E)=2\pi n$ for some $n\in\mathbb Z$, so $\cos\Phi(E)=1$ and the denominator is $\tau^2$. Hence
\begin{align*}
|\mathcal T(E)|^2=\frac{\tau^2}{\tau^2}=1.
\end{align*}
Away from resonance, $1-\cos\Phi(E)>0$, so for $0<\tau<1$ the denominator is strictly larger than $\tau^2$ and the transmission probability is strictly smaller than one. Thus repeated internal reflections create perfect constructive interference in transmission at the round-trip phase condition, producing sharp peaks of $|\mathcal T(E)|^2$ even when each individual barrier transmits only weakly.
[/example]
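A short Python sketch makes the sharpness of the peak concrete. It evaluates the final formula of the example for a weakly transmitting barrier; the function name and the value $\tau=0.05$ are illustrative assumptions.

```python
import math

def double_barrier_transmission(tau, phi):
    """|T(E)|^2 = tau^2 / (1 - 2(1 - tau) cos(phi) + (1 - tau)^2)."""
    rho = 1.0 - tau                       # |r|^2 for a lossless barrier
    return tau**2 / (1.0 - 2.0 * rho * math.cos(phi) + rho**2)

tau = 0.05                                # each barrier alone transmits five percent
on_resonance = double_barrier_transmission(tau, 2.0 * math.pi)   # round-trip phase 2*pi*n
off_resonance = double_barrier_transmission(tau, math.pi)        # furthest from resonance
minimum_formula = (tau / (2.0 - tau)) ** 2                       # value of the formula at cos(phi) = -1
```

At $\cos\Phi=-1$ the denominator becomes $(2-\tau)^2$, so the transmission drops to $\tau^2/(2-\tau)^2$, which is smaller than the single-barrier value $\tau$, while on resonance the pair of barriers is perfectly transparent.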
Resonances connect the bound-state and scattering parts of the chapter. Bound states are square-integrable eigenfunctions with real energies, while resonances are scattering features associated with complex energies whose imaginary parts encode decay rates.
[remark: Bound States Versus Resonances]
A bound state has $\psi\in L^2(\mathbb R)$ and evolves by a pure phase $e^{-iEt/\hbar}$. A resonance is not an $L^2$ eigenstate of the self-adjoint Hamiltonian; it is detected through scattering data and produces metastable behaviour over a long but finite time scale.
[/remark]
The same language reappears outside one-dimensional quantum mechanics. Sturm-Liouville oscillation theory is the prototype for mode counting in vibrating strings and radial reductions of higher-dimensional wave equations. Probability current is the quantum analogue of a conserved flux in electromagnetism and fluid mechanics, while tunneling and resonances supply the semiclassical mechanism behind alpha decay, molecular inversion, and sharp spectral lines in open systems.
One-dimensional systems show how spectral theory becomes concrete in differential equations and boundary conditions. The harmonic oscillator is the next canonical example, where operator algebra replaces direct integration and reveals the spectrum almost immediately.
# 7. Harmonic Oscillator and Ladder Methods
The harmonic oscillator is the first infinite-dimensional quantum system in the course where the algebra of operators gives the whole solution. Instead of solving a differential equation from scratch, we factor the Hamiltonian into creation and annihilation operators and read off the spectrum from commutation relations. This chapter also introduces coherent states, which behave as quantum wave packets whose centres follow the classical oscillator motion.
## Factoring the Oscillator Hamiltonian
The basic question is how to solve the eigenvalue problem for the one-dimensional oscillator without repeatedly differentiating special functions. Work in $L^2(\mathbb R)$ with the common invariant dense domain $\mathcal S(\mathbb R)$. On this core the position operator, denoted $X$ here rather than $Q$ to match oscillator notation, is
\begin{align*}
X&:\mathcal S(\mathbb R)\to \mathcal S(\mathbb R), & X\psi(x)&=x\psi(x).
\end{align*}
The momentum operator is
\begin{align*}
P&:\mathcal S(\mathbb R)\to \mathcal S(\mathbb R), & P\psi(x)&=-i\hbar\psi'(x).
\end{align*}
For mass $m>0$ and frequency $\omega>0$, the oscillator Hamiltonian is initially the symmetric operator
\begin{align*}
H&:\mathcal S(\mathbb R)\to L^2(\mathbb R), & H&=\frac{1}{2m}P^2+\frac{1}{2}m\omega^2X^2,
\end{align*}
and its self-adjoint realisation is the closure of this operator on the usual oscillator domain $\mathcal D(H)\subset L^2(\mathbb R)$.
The coefficients in the next definition are chosen so that the quadratic expression in $X$ and $P$ becomes a product plus a scalar correction.
[definition: Creation And Annihilation Operators]
The annihilation operator is the map
\begin{align*}
a&:\mathcal S(\mathbb R)\to\mathcal S(\mathbb R), & a\psi&=\left(\sqrt{\frac{m\omega}{2\hbar}}\,X+\frac{i}{\sqrt{2m\hbar\omega}}\,P\right)\psi.
\end{align*}
The creation operator is the map
\begin{align*}
a^*&:\mathcal S(\mathbb R)\to\mathcal S(\mathbb R), & a^*\psi&=\left(\sqrt{\frac{m\omega}{2\hbar}}\,X-\frac{i}{\sqrt{2m\hbar\omega}}\,P\right)\psi.
\end{align*}
[/definition]
Here $a^*$ denotes the Hilbert-space adjoint on the natural oscillator domain; on $\mathcal S(\mathbb R)$ it is obtained by formal integration by parts.
The terms "creation" and "annihilation" refer to the action these operators have on energy eigenvectors: one raises the oscillator quantum number and the other lowers it. Before any spectral argument can begin, we need the exact algebra satisfied by these two operators, because small scalar errors in the commutator would change the spacing of every energy level.
[quotetheorem:6943]
[citeproof:6943]
This result turns the oscillator into a positivity problem. On the common core, the identity depends essentially on the canonical commutator $[X,P]=i\hbar I$; changing that scalar would change the level spacing and the ground-state offset. The domain convention is also essential: if $X$, $P$, $a$, and $a^*$ are multiplied formally on unrelated domains, the displayed products may fail to define the same operator. For instance, the function $\psi(x)=e^{-|x|}$ lies in $L^2(\mathbb R)$ and has a weak first derivative in $L^2(\mathbb R)$, but applying $P^2$ produces a distributional term at $0$, so treating $P^2\psi$ as an ordinary $L^2$ function would give a false operator identity. The theorem proves the commutator and factorisation on the stated core; it does not by itself prove that the closures are self-adjoint or that the resulting eigenvectors form a complete basis. Since $a^*a$ has expectation $\|a\psi\|_{L^2}^2$ on its domain, the energy lower bound is
\begin{align*}
(\psi,H\psi)_{L^2}\ge \frac{\hbar\omega}{2}\|\psi\|_{L^2}^2.
\end{align*}
This lower bound is the reason the ladder construction must have a bottom rung rather than descending indefinitely.
[definition: Number Operator]
The number operator is the operator
\begin{align*}
N&:\mathcal D(N)\subset L^2(\mathbb R)\to L^2(\mathbb R), & N\psi&=a^*a\psi,
\end{align*}
where
\begin{align*}
\mathcal D(N)=\{\psi\in\mathcal D(a):a\psi\in\mathcal D(a^*)\}.
\end{align*}
[/definition]
The number operator isolates the integer part of the energy. With the self-adjoint oscillator realisation fixed above,
\begin{align*}
H=\hbar\omega\left(N+\frac12 I\right),
\end{align*}
so diagonalising $H$ is equivalent to diagonalising $N$.
[example: Oscillator Partition Of Hilbert Space]
Let $\psi\in\mathcal H_n=\ker(N-nI)$ be normalised, so $N\psi=n\psi$ and $\|\psi\|_{L^2}=1$. Using $H=\hbar\omega(N+\frac12 I)$, we get
\begin{align*}
H\psi=\hbar\omega\left(N\psi+\frac12\psi\right)=\hbar\omega\left(n+\frac12\right)\psi.
\end{align*}
Thus every vector in $\mathcal H_n$ has energy $\hbar\omega(n+\frac12)$.
Now let $\phi\in\mathcal H_n$. Since $N=a^*a$ and $aa^*=a^*a+I=N+I$, we have
\begin{align*}
N(a^*\phi)=a^*aa^*\phi=a^*(aa^*)\phi=a^*(N+I)\phi=a^*((n+1)\phi)=(n+1)a^*\phi.
\end{align*}
Therefore $a^*\phi\in\mathcal H_{n+1}$; the inclusion holds trivially if $a^*\phi=0$, which by $\|a^*\phi\|_{L^2}^2=(\phi,aa^*\phi)_{L^2}=(n+1)\|\phi\|_{L^2}^2$ happens only for $\phi=0$. For the lowering operator,
\begin{align*}
aN\phi=aa^*a\phi=(a^*a+I)a\phi=Na\phi+a\phi.
\end{align*}
Since also $aN\phi=a(n\phi)=n\,a\phi$, this gives
\begin{align*}
Na\phi=(n-1)a\phi.
\end{align*}
Hence $a\phi\in\mathcal H_{n-1}$ when $a\phi\ne0$. If $n=0$, then
\begin{align*}
\|a\phi\|_{L^2}^2=(\phi,a^*a\phi)_{L^2}=(\phi,N\phi)_{L^2}=0,
\end{align*}
so $a\phi=0$, and the target space is interpreted as $\{0\}$. The oscillator Hilbert space is therefore organised into eigenspaces whose energies differ by exactly $\hbar\omega$, with $a^*$ moving one level up and $a$ moving one level down.
[/example]
## The Spectrum And Hermite Function Basis
The next problem is to prove that the ladder construction gives all states, not merely a formal list of possible energies. The lowest state must be killed by $a$, and repeatedly applying $a^*$ should generate the whole basis.
[quotetheorem:6944]
[citeproof:6944]
This result contains the analytic seed of the oscillator: the ground state equation is first order, and all higher stationary states are produced from this seed. Positivity alone would not identify the spectrum without the norm identities for the [ladder coefficients](/theorems/6953); those identities are what rule out a formal ladder starting at a non-integer value. The theorem gives the possible energies and their one-dimensional eigenspaces for the self-adjoint oscillator, but the ladder algebra alone still does not prove that the corresponding eigenvectors span all of $L^2(\mathbb R)$. That completeness enters through the Hermite basis argument below. To state the basis theorem cleanly, we first name the normalised ladder states generated from the ground state.
[definition: Hermite Functions]
Let $\psi_0\in L^2(\mathbb R)$ be the unique positive normalised solution of $a\psi_0=0$. The oscillator Hermite function assignment is the map
\begin{align*}
\psi&:\mathbb N\cup\{0\}\to L^2(\mathbb R), & n&\mapsto \psi_n=\frac{1}{\sqrt{n!}}(a^*)^n\psi_0.
\end{align*}
[/definition]
In position space the ground state equation is
\begin{align*}
\left(\sqrt{\frac{m\omega}{2\hbar}}x+\sqrt{\frac{\hbar}{2m\omega}}\frac{d}{dx}\right)\psi_0(x)=0,
\end{align*}
so the normalised solution is
\begin{align*}
\psi_0(x)=\left(\frac{m\omega}{\pi\hbar}\right)^{1/4}e^{-m\omega x^2/(2\hbar)}.
\end{align*}
The functions $\psi_n$ coincide, up to a normalisation constant, with a Gaussian multiplied by the physicists' Hermite polynomial after rescaling $x$ by $(m\omega/\hbar)^{1/2}$. The operator definition is usually the cleanest form for spectral calculations.
[quotetheorem:6945]
[citeproof:6945]
This theorem is the oscillator analogue of a Fourier basis: every square-integrable wave function can be expanded in stationary states, and time evolution only changes the phases of the coefficients. The completeness clause is the analytic part that a purely algebraic ladder computation would miss; without it, the construction would only describe an invariant subspace generated from the Gaussian. Self-adjointness is not cosmetic here: a symmetric differential expression with the wrong domain can have missing boundary conditions, non-real spectrum when its closure fails to be self-adjoint, or non-orthogonal root vectors, so the spectral expansion would no longer follow from the same argument. The normalisation and phase convention are harmless for the eigenspaces but necessary for the exact formulas for ladder coefficients and coherent-state expansions below; replacing $\psi_0$ by $e^{i\theta}\psi_0$ leaves the eigenspaces unchanged but multiplies every $\psi_n$ by $e^{i\theta}$, changing the displayed coherent-state coefficient convention. The theorem also explains why computations may be reduced to finite ladder moves on basis vectors before passing to limits in $L^2(\mathbb R)$.
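Orthonormality of the first few Hermite functions can be checked numerically. The Python sketch below assumes units $m=\omega=\hbar=1$ and generates $\psi_n$ from the Gaussian $\psi_0$ by the three-term recurrence $\psi_{n+1}=\sqrt{2/(n+1)}\,x\psi_n-\sqrt{n/(n+1)}\,\psi_{n-1}$, which is the position-space form of the ladder relations. The function names and grid parameters are illustrative choices, and a finite check of this kind says nothing about completeness.

```python
import math

def hermite_functions(x, n_max):
    """Values psi_0(x), ..., psi_{n_max}(x) in units with m = omega = hbar = 1."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]     # normalised Gaussian ground state
    if n_max >= 1:
        psi.append(math.sqrt(2.0) * x * psi[0])           # psi_1 = a^* psi_0
    for n in range(1, n_max):
        psi.append(math.sqrt(2.0 / (n + 1)) * x * psi[n]
                   - math.sqrt(n / (n + 1)) * psi[n - 1])
    return psi

def gram_matrix(n_max, half_width=12.0, step=0.01):
    """Riemann-sum approximation of the inner products (psi_j, psi_k)."""
    count = int(round(2.0 * half_width / step)) + 1
    gram = [[0.0] * (n_max + 1) for _ in range(n_max + 1)]
    for i in range(count):
        x = -half_width + i * step
        values = hermite_functions(x, n_max)
        for j in range(n_max + 1):
            for k in range(n_max + 1):
                gram[j][k] += step * values[j] * values[k]
    return gram

gram = gram_matrix(4)   # should be close to the 5 x 5 identity matrix
```

Because the integrands are polynomials times a Gaussian, the equally spaced sum converges very quickly, and the computed Gram matrix agrees with the identity to high accuracy.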
[example: Matrix Elements Of Position And Momentum]
For $j,k\ge0$, use the ladder identities $a\psi_k=\sqrt{k}\,\psi_{k-1}$, with the convention that the term is $0$ when $k=0$, and $a^*\psi_k=\sqrt{k+1}\,\psi_{k+1}$. From
\begin{align*}
X=\sqrt{\frac{\hbar}{2m\omega}}(a+a^*)
\end{align*}
we get
\begin{align*}
X\psi_k=\sqrt{\frac{\hbar}{2m\omega}}\left(a\psi_k+a^*\psi_k\right)
\end{align*}
and hence
\begin{align*}
X\psi_k=\sqrt{\frac{\hbar}{2m\omega}}\left(\sqrt{k}\,\psi_{k-1}+\sqrt{k+1}\,\psi_{k+1}\right).
\end{align*}
Taking the inner product with $\psi_j$ and using orthonormality of the Hermite basis gives
\begin{align*}
(\psi_j,X\psi_k)_{L^2}=\sqrt{\frac{\hbar}{2m\omega}}\left(\sqrt{k}\,(\psi_j,\psi_{k-1})_{L^2}+\sqrt{k+1}\,(\psi_j,\psi_{k+1})_{L^2}\right).
\end{align*}
Since $(\psi_j,\psi_\ell)_{L^2}=\delta_{j,\ell}$, this becomes
\begin{align*}
(\psi_j,X\psi_k)_{L^2}=\sqrt{\frac{\hbar}{2m\omega}}\left(\sqrt{k}\,\delta_{j,k-1}+\sqrt{k+1}\,\delta_{j,k+1}\right).
\end{align*}
The same calculation for
\begin{align*}
P=-i\sqrt{\frac{m\hbar\omega}{2}}(a-a^*)
\end{align*}
starts with
\begin{align*}
P\psi_k=-i\sqrt{\frac{m\hbar\omega}{2}}\left(a\psi_k-a^*\psi_k\right).
\end{align*}
Substituting the ladder actions gives
\begin{align*}
P\psi_k=-i\sqrt{\frac{m\hbar\omega}{2}}\left(\sqrt{k}\,\psi_{k-1}-\sqrt{k+1}\,\psi_{k+1}\right).
\end{align*}
Therefore
\begin{align*}
(\psi_j,P\psi_k)_{L^2}=-i\sqrt{\frac{m\hbar\omega}{2}}\left(\sqrt{k}\,\delta_{j,k-1}-\sqrt{k+1}\,\delta_{j,k+1}\right).
\end{align*}
Thus both $X$ and $P$ have matrix elements only between neighbouring oscillator levels: the entry vanishes unless $j=k-1$ or $j=k+1$.
[/example]
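These matrix elements can be assembled into finite matrices for a numerical check. The Python sketch below is an illustration under stated assumptions: the basis is truncated to $\psi_0,\dots,\psi_{\mathrm{dim}-1}$, the units are $m=\omega=\hbar=1$, and the helper names are hypothetical. Truncation corrupts only the last row and column of products, so the identities are tested on the remaining block.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)] for i in range(n)]

def oscillator_matrices(dim, m=1.0, omega=1.0, hbar=1.0):
    """Truncated matrices of X and P in the basis psi_0, ..., psi_{dim-1}."""
    a = [[0.0] * dim for _ in range(dim)]
    for k in range(1, dim):
        a[k - 1][k] = math.sqrt(k)                        # (psi_{k-1}, a psi_k) = sqrt(k)
    adag = [[a[j][i] for j in range(dim)] for i in range(dim)]   # transpose; entries are real
    cx = math.sqrt(hbar / (2.0 * m * omega))
    cp = math.sqrt(m * hbar * omega / 2.0)
    X = [[cx * (a[i][j] + adag[i][j]) for j in range(dim)] for i in range(dim)]
    P = [[-1j * cp * (a[i][j] - adag[i][j]) for j in range(dim)] for i in range(dim)]
    return X, P

dim, m, omega, hbar = 8, 1.0, 1.0, 1.0
X, P = oscillator_matrices(dim, m, omega, hbar)
XP, PX = matmul(X, P), matmul(P, X)
commutator = [[XP[i][j] - PX[i][j] for j in range(dim)] for i in range(dim)]
X2, P2 = matmul(X, X), matmul(P, P)
H = [[P2[i][j] / (2.0 * m) + 0.5 * m * omega**2 * X2[i][j] for j in range(dim)]
     for i in range(dim)]
```

On the block with indices $0,\dots,\mathrm{dim}-2$ the commutator matrix equals $i\hbar$ times the identity and `H` is diagonal with entries $\hbar\omega(n+\tfrac12)$. The last diagonal entry of the commutator is necessarily different, since a commutator of finite matrices has trace zero.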
## Coherent States And Generating Functions
Having diagonalised the Hamiltonian, we now ask for states that look as much as possible like classical oscillator states. Energy eigenstates are stationary, but a classical oscillator has a moving centre in phase space. Coherent states answer this by being eigenvectors of the annihilation operator rather than of the Hamiltonian.
[definition: Coherent State]
For $\alpha\in\mathbb C$, the coherent state is the vector $\psi_\alpha\in\mathcal D(a)\subset L^2(\mathbb R)$ satisfying
\begin{align*}
a\psi_\alpha=\alpha\psi_\alpha,\qquad \|\psi_\alpha\|_{L^2}=1,
\end{align*}
with phase chosen by $(\psi_0,\psi_\alpha)_{L^2}>0$.
[/definition]
The eigenvalue $\alpha$ encodes both the expected position and expected momentum. To use coherent states in calculations, we need an explicit expansion in the known Hermite basis, and the annihilation equation turns into a recurrence for the expansion coefficients.
[quotetheorem:6946]
This formula is the generating-function viewpoint: the coherent state packages all Hermite functions into a single analytic family. The phase convention matters because multiplying an eigenvector of $a$ by a non-unit scalar would either destroy normalisation or change the coefficient $c_0$ in the expansion. The convergence statement is also essential: the displayed series is not just formal, but an actual vector of $L^2(\mathbb R)$ on which the annihilation equation is meaningful. The theorem gives existence and uniqueness for the chosen normalisation and phase; it does not assert uniqueness after changing the phase convention, prove the overcompleteness relation for coherent states, or quantify phase-space localisation beyond the expectation and variance formulas that follow. The expansion makes it possible to compute expectations without first writing the wave function in position space. The next question is whether these states are genuinely the closest quantum analogue of classical phase-space points, which is measured by their uncertainty product.
[quotetheorem:6947]
[citeproof:6947]
The equality case connects the oscillator to the Robertson uncertainty inequality from the first chapter. The eigenvector condition for $a$ is the decisive hypothesis: a general superposition of Hermite states lives in the same Hilbert space but need not produce the cancellation that makes the variance computation saturate the bound. A concrete failure mode is the first excited state $\psi_1$, for which the standard oscillator formulas give
\begin{align*}
(\Delta_{\psi_1}X)^2=\frac{3\hbar}{2m\omega},\qquad (\Delta_{\psi_1}P)^2=\frac{3m\hbar\omega}{2},
\end{align*}
so $(\Delta_{\psi_1}X)(\Delta_{\psi_1}P)=3\hbar/2$, strictly above $\hbar/2$. The formula also depends on the normalisation of $X$ and $P$ in the definitions of $a$ and $a^*$; rescaling either operator would change the numerical variances though not the underlying commutator principle. Coherent states do not remove uncertainty; they distribute it in the balanced way forced by the commutator.
[example: Coherent State Time Evolution]
Let $U(t)=e^{-itH/\hbar}$ and fix $\alpha\in\mathbb C$. By the Hermite expansion of the coherent state,
\begin{align*}
\psi_\alpha=e^{-|\alpha|^2/2}\sum_{n=0}^{\infty}\frac{\alpha^n}{\sqrt{n!}}\psi_n.
\end{align*}
Since $H\psi_n=\hbar\omega(n+\frac12)\psi_n$, each basis vector evolves as
\begin{align*}
U(t)\psi_n=e^{-it\hbar\omega(n+\frac12)/\hbar}\psi_n=e^{-i\omega t(n+\frac12)}\psi_n.
\end{align*}
Therefore
\begin{align*}
U(t)\psi_\alpha=e^{-|\alpha|^2/2}\sum_{n=0}^{\infty}\frac{\alpha^n e^{-i\omega t(n+\frac12)}}{\sqrt{n!}}\psi_n.
\end{align*}
Factoring the $n$-independent phase and collecting the remaining powers gives
\begin{align*}
U(t)\psi_\alpha=e^{-i\omega t/2}e^{-|\alpha|^2/2}\sum_{n=0}^{\infty}\frac{(\alpha e^{-i\omega t})^n}{\sqrt{n!}}\psi_n.
\end{align*}
Because $|\alpha e^{-i\omega t}|=|\alpha|$, the final series is exactly the coherent-state expansion with label $\alpha e^{-i\omega t}$:
\begin{align*}
U(t)\psi_\alpha=e^{-i\omega t/2}\psi_{\alpha e^{-i\omega t}}.
\end{align*}
The expectation formulas for coherent states then give
\begin{align*}
\mathbb E_{U(t)\psi_\alpha}[X]=\sqrt{\frac{2\hbar}{m\omega}}\operatorname{Re}(\alpha e^{-i\omega t}).
\end{align*}
Writing $\alpha=\operatorname{Re}\alpha+i\operatorname{Im}\alpha$, this is
\begin{align*}
\mathbb E_{U(t)\psi_\alpha}[X]=\sqrt{\frac{2\hbar}{m\omega}}\left((\operatorname{Re}\alpha)\cos(\omega t)+(\operatorname{Im}\alpha)\sin(\omega t)\right).
\end{align*}
Similarly,
\begin{align*}
\mathbb E_{U(t)\psi_\alpha}[P]=\sqrt{2m\hbar\omega}\operatorname{Im}(\alpha e^{-i\omega t}).
\end{align*}
Expanding the imaginary part gives
\begin{align*}
\mathbb E_{U(t)\psi_\alpha}[P]=\sqrt{2m\hbar\omega}\left((\operatorname{Im}\alpha)\cos(\omega t)-(\operatorname{Re}\alpha)\sin(\omega t)\right).
\end{align*}
Thus the coherent-state label rotates in the complex plane at angular speed $\omega$, while the corresponding expected position and momentum trace the classical oscillator orbit. The scalar factor $e^{-i\omega t/2}$ has modulus $1$, so replacing a state by this multiple leaves all probabilities and expectation values unchanged.
[/example]
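The rotation of the label can be reproduced from the Hermite coefficients alone. In the Python sketch below, the truncation order `n_max`, the sample values of $\alpha$ and $t$, the units $m=\omega=\hbar=1$, and the function names are illustrative assumptions; expectations are computed from $(\psi,a\psi)=\sum_n\overline{c_n}\,c_{n+1}\sqrt{n+1}$ together with $X=\sqrt{\hbar/(2m\omega)}\,(a+a^*)$ and $P=-i\sqrt{m\hbar\omega/2}\,(a-a^*)$.

```python
import cmath
import math

def coherent_coefficients(alpha, n_max):
    """Coefficients c_n = exp(-|alpha|^2 / 2) alpha^n / sqrt(n!) for n = 0, ..., n_max."""
    c = [cmath.exp(-abs(alpha) ** 2 / 2.0)]
    for n in range(n_max):
        c.append(c[n] * alpha / math.sqrt(n + 1))         # c_{n+1} = c_n alpha / sqrt(n+1)
    return c

def evolve(coeffs, t, omega=1.0):
    """Multiply each coefficient by the phase exp(-i omega t (n + 1/2))."""
    return [c * cmath.exp(-1j * omega * t * (n + 0.5)) for n, c in enumerate(coeffs)]

def expectation_a(coeffs):
    """(psi, a psi) for psi = sum_n c_n psi_n, within the truncation."""
    return sum(coeffs[n].conjugate() * coeffs[n + 1] * math.sqrt(n + 1)
               for n in range(len(coeffs) - 1))

def mean_position_momentum(coeffs, m=1.0, omega=1.0, hbar=1.0):
    mean_a = expectation_a(coeffs)
    x = math.sqrt(2.0 * hbar / (m * omega)) * mean_a.real
    p = math.sqrt(2.0 * m * hbar * omega) * mean_a.imag
    return x, p

alpha, t = 1.2 + 0.5j, 0.7
x_t, p_t = mean_position_momentum(evolve(coherent_coefficients(alpha, 60), t))
label_t = alpha * cmath.exp(-1j * t)                      # rotated label, omega = 1
```

The computed pair `(x_t, p_t)` should agree with $\sqrt{2}\operatorname{Re}(\alpha e^{-it})$ and $\sqrt{2}\operatorname{Im}(\alpha e^{-it})$ in these units. The truncation error is negligible here because the coefficients decay factorially.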
## Operator Calculus For The Ladder Method
The ladder method now faces its main technical obstruction: the operators that make the algebra useful are unbounded, so products that look identical on symbols may have different domains. The reliable procedure is to choose a dense invariant domain where every product below is defined, prove the commutator identities there, and only then pass to closed operators or spectral calculus when the domain has been checked.
[quotetheorem:6948]
[citeproof:6948]
These commutator rules are the algebraic form of energy raising and lowering. The restriction to the finite Hermite span is part of the statement, not a cosmetic detail: on larger domains, unbounded products such as $a q(N)$ and $q(N+I)a$ require separate domain checks before they can be identified as operators. For example, take coefficients $c_n=n^{-2}$ for $n\ge1$ and $c_0=0$. Then $\psi=\sum_{n\ge0}c_n\psi_n$ lies in $L^2(\mathbb R)$ and in $\mathcal D(a)$, but the series defining $Na\psi$ has squared coefficients comparable to $n^{-1}$ and therefore diverges. The formal identity $Na=aN-a$ has no operator meaning on this vector even though it holds on every finite partial sum. The polynomial hypothesis is similarly deliberate, since general functions of $N$ require spectral calculus rather than induction on monomials. These identities also make perturbative computations efficient, because matrix elements of polynomials in $X$ and $P$ reduce to finite sums of ladder moves.
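The domain failure in the example with $c_n=n^{-2}$ can be seen in partial sums. The following Python sketch, with a hypothetical function name, accumulates the squared norms of the truncations of $a\psi$ and $Na\psi$, using $a\psi_n=\sqrt{n}\,\psi_{n-1}$ and $N\psi_{n-1}=(n-1)\psi_{n-1}$.

```python
def partial_norms(cutoff):
    """Squared norms of the truncations of a*psi and N*a*psi for c_n = n**-2, n <= cutoff."""
    norm_a = 0.0
    norm_Na = 0.0
    for n in range(1, cutoff + 1):
        c = n ** -2.0
        coeff_a = c * n ** 0.5            # coefficient of psi_{n-1} in a psi
        coeff_Na = (n - 1) * coeff_a      # N multiplies psi_{n-1} by n - 1
        norm_a += coeff_a ** 2            # terms of size n**-3: summable
        norm_Na += coeff_Na ** 2          # terms of size about 1/n: not summable
    return norm_a, norm_Na

a_small, Na_small = partial_norms(1000)
a_large, Na_large = partial_norms(100000)
```

Raising the cutoff from $10^3$ to $10^5$ barely changes the first sum, while the second grows by roughly $\ln 100$, which is the numerical signature of a harmonic-type divergence.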
[remark: Domain Discipline]
The formal manipulations in this chapter are first carried out on $\mathcal S(\mathbb R)$ or on the finite span of the Hermite functions. The self-adjoint Hamiltonian is then obtained by closure, and the spectral statements hold on the resulting operator domain. This separation between algebra on a common core and spectral conclusions for the closed operator is the standard way to keep the ladder method rigorous.
[/remark]
The oscillator shows how commutation relations and ladder operators can solve a quantum system once the right algebra is found. Angular momentum extends this idea from one Hamiltonian to a full symmetry group, making rotations a source of observables, quantum numbers, and selection rules.
# 8. Angular Momentum and Rotations
Angular momentum is the part of quantum mechanics where symmetry becomes computational. In earlier chapters, symmetries appeared through unitary operators commuting with a Hamiltonian; here the rotation group supplies a family of such symmetries whose infinitesimal generators are themselves observables. The chapter has two main goals: first to pass from rotations to the Lie algebras $\mathfrak{so}(3)$ and $\mathfrak{su}(2)$, and then to use representation theory to organize orbital angular momentum, spin, and coupled systems.
## Rotations as Unitary Symmetries
What does it mean for a quantum system to be rotationally invariant? The physical answer is that rotating the apparatus should not change transition probabilities, so rotations must act by unitary or antiunitary transformations on rays. Since $SO(3)$ is connected, the relevant transformations are represented by unitary operators up to phase, and the double cover $SU(2) \to SO(3)$ removes the projective ambiguity for finite-dimensional spin systems.
[definition: Unitary Representation]
Let $G$ be a Lie group and let $H$ be a complex Hilbert space. Write $\mathcal{L}(H)$ for the algebra of bounded linear operators $H\to H$. A unitary representation of $G$ on $H$ is a map $U: G \to \mathcal{L}(H)$ such that:
1. $U(g)$ is unitary for every $g \in G$.
2. $U(gh)=U(g)U(h)$ for all $g,h\in G$.
3. The map $g \mapsto U(g)\psi$ is continuous for every $\psi \in H$.
[/definition]
This definition turns a geometric transformation into an operator acting on states. To extract an observable from a continuous family of symmetries, we need a theorem that identifies the infinitesimal generator of a one-parameter unitary group.
[quotetheorem:6949]
[citeproof:6949]
Strong continuity is essential here: without it, there need not be a densely defined infinitesimal generator obtained from differentiating the group at the identity. For example, a discontinuous additive map $\alpha:\mathbb R\to\mathbb R$ gives a one-parameter unitary group on $\mathbb C$ by $U(t)z=e^{i\alpha(t)}z$, but the derivative at $t=0$ fails to exist. The theorem does not say that every vector is differentiable in $t$; the generator is usually unbounded and has a proper dense domain. This is exactly the domain issue that will recur for angular momentum components.
Stone's theorem supplies a self-adjoint generator for every rotation axis. This motivates the definition of angular momentum operators as the three coordinate-axis generators whose commutators encode the infinitesimal composition law of rotations.
[definition: Angular Momentum Operators]
Let $H$ be a complex Hilbert space and let $D\subset H$ be a dense linear subspace. A triple of linear maps
\begin{align*}
J_i:D\to H, \qquad i\in\{1,2,3\},
\end{align*}
is a system of angular momentum operators on $D$ if each $J_i$ is the restriction to $D$ of a self-adjoint operator on $H$, the domain $D$ is invariant under each $J_i$, and
\begin{align*}
[J_i,J_j]\psi=i\hbar\sum_{k=1}^3\varepsilon_{ijk}J_k\psi
\end{align*}
for all $\psi\in D$ and all $i,j\in\{1,2,3\}$, where $\varepsilon_{ijk}$ is the Levi-Civita symbol.
[/definition]
The commutation relations explain why the three components cannot be simultaneously measured. From here on we write $J_x,J_y,J_z$ interchangeably for $J_1,J_2,J_3$. To find compatible labels for states, we need a rotational scalar built from all three components rather than a second component competing with $J_z$.
[definition: Total Angular Momentum Squared]
Let $J_1,J_2,J_3:D\to H$ be angular momentum operators on a common dense domain $D\subset H$ that is invariant, meaning $J_i(D)\subset D$ for each $i$. The total angular momentum squared operator is the linear map
\begin{align*}
J^2:D\to H, \qquad J^2\psi:=J_1^2\psi+J_2^2\psi+J_3^2\psi.
\end{align*}
[/definition]
The quantity $J^2$ is designed to be independent of the chosen coordinate axis. Since the individual components fail to commute with one another, simultaneous angular-momentum labels would be impossible unless this rotational scalar avoids the same obstruction. The algebraic question is whether $J^2$ commutes with each component, especially with the chosen component $J_z$ used for magnetic labels.
[quotetheorem:6950]
[citeproof:6950]
The invariant-domain hypothesis is not bookkeeping: if $J_i\psi$ leaves the domain of $J^2$, the expression $J^2J_i\psi$ is not defined. The angular momentum commutation relations are also essential; for instance, on $\mathbb C^2$ take $A$ to be the projection onto the second coordinate, $B$ to be the operator swapping the two standard basis vectors, and $C=0$. Then $A^2+B^2+C^2=A+I$, which does not commute with $B$. The theorem also does not assert that all three components commute with each other; it singles out the rotational scalar $J^2$ as compatible with any chosen component. This is why the simultaneous labels use $J^2$ and one component, not $J_x,J_y,J_z$ all at once.
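The counterexample above can be checked by direct matrix arithmetic. The following is a minimal sketch in plain Python, with $\hbar=1$ assumed and with helper names (`matmul`, `matadd`, `commutator`, `is_zero`) chosen here for illustration only; it also confirms, for the spin one half matrices $J_i=\sigma_i/2$, that $J^2$ commutes with each component.

```python
# Minimal check with 2x2 matrices stored as nested lists (hbar = 1 is an assumption).

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matadd(*ms):
    n = len(ms[0])
    return [[sum(m[i][j] for m in ms) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def is_zero(m, tol=1e-12):
    return all(abs(x) < tol for row in m for x in row)

# Counterexample from the text: A projects onto the second coordinate,
# B swaps the two standard basis vectors, and C = 0.
A = [[0, 0], [0, 1]]
B = [[0, 1], [1, 0]]
C = [[0, 0], [0, 0]]
S = matadd(matmul(A, A), matmul(B, B), matmul(C, C))  # should equal A + I
counterexample_commutator = commutator(S, B)          # nonzero: no commutation

# Spin one half: J_i = sigma_i / 2 does satisfy the angular momentum relations.
Jx = [[0, 0.5], [0.5, 0]]
Jy = [[0, -0.5j], [0.5j, 0]]
Jz = [[0.5, 0], [0, -0.5]]
J2 = matadd(matmul(Jx, Jx), matmul(Jy, Jy), matmul(Jz, Jz))
casimir_commutes = all(is_zero(commutator(J2, J)) for J in (Jx, Jy, Jz))

print("A^2 + B^2 + C^2 =", S)
print("[A^2 + B^2 + C^2, B] =", counterexample_commutator)
print("J^2 commutes with every J_i:", casimir_commutes)
```

The contrast is the point of the sketch: the sum of squares is a scalar for the spin one half triple, while for the triple $(A,B,C)$ it is not, because $(A,B,C)$ does not satisfy the commutation relations.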
The commutation theorem justifies simultaneous eigenvectors of $J^2$ and $J_z$ whenever the spectral problem is finite-dimensional. Before classifying all possibilities, it helps to see a representation in which the double cover $SU(2)\to SO(3)$ is visible in a direct calculation.
[example: Rotations of a Spin One Half State]
Let $e_+=(1,0)$ and $e_-=(0,1)$ be the standard basis of $\mathbb C^2$. Define the Pauli operators by
\begin{align*}
\sigma_x e_+=e_-,\quad \sigma_x e_-=e_+,\quad \sigma_y e_+=i e_-,\quad \sigma_y e_-=-i e_+,\quad \sigma_z e_+=e_+,\quad \sigma_z e_-=-e_-.
\end{align*}
Set $J_i=(\hbar/2)\sigma_i$. For example,
\begin{align*}
[\sigma_x,\sigma_y]e_+=\sigma_x(i e_-)-\sigma_y(e_-)=i e_+-(-i e_+)=2i e_+=2i\sigma_z e_+.
\end{align*}
\begin{align*}
[\sigma_x,\sigma_y]e_-=\sigma_x(-i e_+)-\sigma_y(e_+)=-i e_- - i e_-=-2i e_-=2i\sigma_z e_-.
\end{align*}
Since $e_+,e_-$ form a basis, $[\sigma_x,\sigma_y]=2i\sigma_z$. The same basis calculation gives $[\sigma_y,\sigma_z]=2i\sigma_x$ and $[\sigma_z,\sigma_x]=2i\sigma_y$, so
\begin{align*}
[J_i,J_j]=\frac{\hbar^2}{4}[\sigma_i,\sigma_j]=i\hbar\sum_{k=1}^3\varepsilon_{ijk}J_k.
\end{align*}
A rotation by angle $\theta$ about the $z$-axis is generated by $J_z$, so
\begin{align*}
U_z(\theta)=\exp(-i\theta J_z/\hbar)=\exp(-i\theta\sigma_z/2).
\end{align*}
Because $\sigma_z e_+=e_+$ and $\sigma_z e_-=-e_-$, the exponential acts by
\begin{align*}
U_z(\theta)e_+=e^{-i\theta/2}e_+.
\end{align*}
\begin{align*}
U_z(\theta)e_-=e^{i\theta/2}e_-.
\end{align*}
Thus $U_z(2\pi)e_\pm=-e_\pm$, so $U_z(2\pi)=-I$, while $U_z(4\pi)e_\pm=e_\pm$, so $U_z(4\pi)=I$. The spin one half representation therefore remembers the double cover: one full spatial turn changes the sign of the spinor, and two full turns return it to itself.
[/example]
The spin one half example cannot be a genuine representation of $SO(3)$, faithful or otherwise: a rotation by $2\pi$ is the identity element of $SO(3)$, yet $U_z(2\pi)=-I$. It is instead a representation of $SU(2)$, and only a projective representation of $SO(3)$, which is why spin is most naturally classified using $\mathfrak{su}(2)$ rather than the global topology of $SO(3)$.
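The sign change under a full turn can also be seen numerically. The sketch below is an illustration under stated assumptions: it uses only the Python standard library, sets $\hbar=1$, and approximates the matrix exponential by a truncated power series in a hypothetical helper `expm` (adequate for these small $2\times2$ inputs, not a general-purpose routine).

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def expm(a, terms=80):
    """Matrix exponential by a truncated power series (illustrative helper)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))  # term = a^k / k!
        result = add(result, term)
    return result

# Pauli matrices in the basis (e_+, e_-), matching the action given in the example.
I2 = [[1, 0], [0, 1]]
sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]

# [sigma_x, sigma_y] = 2i sigma_z
comm_xy = add(matmul(sx, sy), scale(-1, matmul(sy, sx)))

def U_z(theta):
    """Rotation about the z-axis: exp(-i theta sigma_z / 2)."""
    return expm(scale(-1j * theta / 2, sz))

print("commutator matches 2i sigma_z:", close(comm_xy, scale(2j, sz)))
print("U_z(2 pi) = -I:", close(U_z(2 * math.pi), scale(-1, I2)))
print("U_z(4 pi) = +I:", close(U_z(4 * math.pi), I2))
```

Using a series for the exponential, rather than the closed form $\operatorname{diag}(e^{-i\theta/2},e^{i\theta/2})$, keeps the check independent of the diagonal calculation carried out in the example.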
## The Lie Algebra of Angular Momentum
How much of angular momentum follows from the commutation relations alone? The answer is that the finite-dimensional irreducible theory is determined by the algebra. The operators $J_z$ and $J^2$ can be diagonalized together, while the raising and lowering operators move between adjacent $J_z$ eigenstates.
[definition: Ladder Operators]
Let $J_x,J_y,J_z:D\to H$ be angular momentum operators on a common dense invariant domain $D\subset H$. The ladder operators are the linear maps
\begin{align*}
J_+:D\to H, \qquad J_+\psi := J_x\psi+iJ_y\psi,
\end{align*}
and
\begin{align*}
J_-:D\to H, \qquad J_-\psi := J_x\psi-iJ_y\psi.
\end{align*}
[/definition]
The ladder operators are not self-adjoint, but they encode how rotations around the $x$- and $y$-axes change the measured $z$-component. We therefore need their commutators with $J_z$, because those commutators determine the step size in the $J_z$ spectrum.
[quotetheorem:6951]
[citeproof:6951]
The invariant-domain assumption ensures that applying a ladder operator and then taking another commutator is meaningful. The angular momentum algebra is also needed for the step-size conclusion: if $J_z=0$ and $J_+$ is any nonzero operator, then $[J_z,J_+]=0$ rather than $\hbar J_+$. These identities do not imply that $J_+$ or $J_-$ is unitary; at the top or bottom of a finite ladder the relevant vector is sent to zero. Their role is spectral rather than norm-preserving, and that is what turns the problem into a highest-weight calculation.
The ladder relations convert the spectral problem into a finite string problem. Starting from a highest weight state and applying $J_-$ should produce all states in an irreducible representation, and the endpoint conditions determine the allowed quantum numbers.
[quotetheorem:6952]
[citeproof:6952]
[Finite dimensionality](/theorems/1534) is the hypothesis that forces the ladder to stop; for example, the shift-type ladder on a basis indexed by all integers has no top or bottom weight. Irreducibility is also needed, because a direct sum such as $V_0\oplus V_1$ satisfies the same commutation relations but has more than one value of $j$. The physics normalization matters as well: rescaling the generators changes the numerical eigenvalues unless the factor $\hbar$ in the commutator is fixed. The theorem therefore classifies the elementary rotational building blocks, and the next task is to compute the generator matrices inside one block.
The classification theorem gives the possible labels but not yet the numerical matrix entries of the generators. The notation $j$ is the total angular momentum quantum number, while $m$ is the magnetic quantum number; the next formula is what makes these labels usable in calculations.
[quotetheorem:6953]
Normalization is essential in this statement: rescaling the basis vectors would change the displayed coefficients while preserving the same abstract representation. The theorem does not choose the overall phase of the highest vector, only the relative phases along the ladder. Once those conventions are fixed, the formulas become the practical matrix entries used in spin computations. They also drive coupled-spin calculations: the same raising and lowering equations become linear recurrence relations for Clebsch-Gordan coefficients, and the resulting matrices implement the change from the uncoupled tensor-product basis to the total-angular-momentum basis.
The coefficient formula gives the explicit finite-dimensional matrices for each spin. For $j=1$, it produces a three-state system that should be compared with ordinary vectors in $\mathbb R^3$ after complexification.
[example: Spin One Matrices]
For $j=1$, use the ordered basis $|1,1\rangle,|1,0\rangle,|1,-1\rangle$. The $J_z$ eigenvalue equation gives
\begin{align*}
J_z|1,1\rangle=\hbar|1,1\rangle,\quad J_z|1,0\rangle=0,\quad J_z|1,-1\rangle=-\hbar|1,-1\rangle.
\end{align*}
For $J_+$, the coefficient formula gives
\begin{align*}
J_+|1,-1\rangle=\hbar\sqrt{1(1+1)-(-1)(-1+1)}\,|1,0\rangle=\hbar\sqrt2\,|1,0\rangle.
\end{align*}
\begin{align*}
J_+|1,0\rangle=\hbar\sqrt{1(1+1)-0(0+1)}\,|1,1\rangle=\hbar\sqrt2\,|1,1\rangle.
\end{align*}
\begin{align*}
J_+|1,1\rangle=\hbar\sqrt{1(1+1)-1(1+1)}\,|1,2\rangle=0.
\end{align*}
Similarly,
\begin{align*}
J_-|1,1\rangle=\hbar\sqrt{1(1+1)-1(1-1)}\,|1,0\rangle=\hbar\sqrt2\,|1,0\rangle.
\end{align*}
\begin{align*}
J_-|1,0\rangle=\hbar\sqrt{1(1+1)-0(0-1)}\,|1,-1\rangle=\hbar\sqrt2\,|1,-1\rangle.
\end{align*}
\begin{align*}
J_-|1,-1\rangle=\hbar\sqrt{1(1+1)-(-1)(-1-1)}\,|1,-2\rangle=0.
\end{align*}
Since $J_x=(J_++J_-)/2$ and $J_y=(J_+-J_-)/(2i)$, this determines the spin-one matrices by their action on the basis:
\begin{align*}
J_x|1,1\rangle=\frac{\hbar}{\sqrt2}|1,0\rangle,\quad J_x|1,0\rangle=\frac{\hbar}{\sqrt2}(|1,1\rangle+|1,-1\rangle),\quad J_x|1,-1\rangle=\frac{\hbar}{\sqrt2}|1,0\rangle.
\end{align*}
\begin{align*}
J_y|1,1\rangle=\frac{i\hbar}{\sqrt2}|1,0\rangle,\quad J_y|1,0\rangle=-\frac{i\hbar}{\sqrt2}|1,1\rangle+\frac{i\hbar}{\sqrt2}|1,-1\rangle,\quad J_y|1,-1\rangle=-\frac{i\hbar}{\sqrt2}|1,0\rangle.
\end{align*}
Thus the three basis states form the $3$-dimensional irreducible spin-one representation, with magnetic labels $m=1,0,-1$ and total angular momentum eigenvalue $\hbar^2 1(1+1)=2\hbar^2$.
[/example]
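The same construction works for any spin. The sketch below, assuming $\hbar=1$ and the ordered basis $m=j,j-1,\dots,-j$, builds $J_x,J_y,J_z$ from the ladder coefficient $\sqrt{j(j+1)-m(m+1)}$ used in the example and then tests the commutation relation and the value of $J^2$. The function names `spin_matrices` and `satisfies_algebra` are illustrative and not part of the text.

```python
from fractions import Fraction
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def spin_matrices(j):
    """Return (Jx, Jy, Jz) for spin j with hbar = 1, basis ordered m = j, ..., -j."""
    j = Fraction(j)
    dim = int(2 * j) + 1
    ms = [j - k for k in range(dim)]
    # J_+ |j,m> = sqrt(j(j+1) - m(m+1)) |j,m+1>; the state m+1 sits one index earlier.
    Jp = [[0.0] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = ms[col]
        Jp[col - 1][col] = math.sqrt(j * (j + 1) - m * (m + 1))
    # J_- is the adjoint of J_+; the entries are real, so this is the transpose.
    Jm = [[Jp[c][r] for c in range(dim)] for r in range(dim)]
    Jx = [[(Jp[r][c] + Jm[r][c]) / 2 for c in range(dim)] for r in range(dim)]
    Jy = [[(Jp[r][c] - Jm[r][c]) / 2j for c in range(dim)] for r in range(dim)]
    Jz = [[float(ms[r]) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    return Jx, Jy, Jz

def satisfies_algebra(j):
    """Check [Jx, Jy] = i Jz and Jx^2 + Jy^2 + Jz^2 = j(j+1) I."""
    Jx, Jy, Jz = spin_matrices(j)
    dim = len(Jz)
    xy, yx = matmul(Jx, Jy), matmul(Jy, Jx)
    comm = [[xy[r][c] - yx[r][c] for c in range(dim)] for r in range(dim)]
    i_Jz = [[1j * Jz[r][c] for c in range(dim)] for r in range(dim)]
    sq = [matmul(J, J) for J in (Jx, Jy, Jz)]
    J2 = [[sum(s[r][c] for s in sq) for c in range(dim)] for r in range(dim)]
    jj = float(Fraction(j) * (Fraction(j) + 1))
    scalar = [[jj if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    return close(comm, i_Jz) and close(J2, scalar)

for j in ("1/2", 1, "3/2", 2):
    print("spin", j, "satisfies the algebra:", satisfies_algebra(j))
```

For $j=1$ the matrices returned by `spin_matrices(1)` have the same entries as the basis actions listed in the example, for instance the coefficient $1/\sqrt2$ of $|1,0\rangle$ in $J_x|1,1\rangle$.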
## Orbital Angular Momentum and Spherical Harmonics
Where does angular momentum appear for a particle moving in ordinary space? For a wavefunction $\psi\in L^2(\mathbb R^3)$, rotations act by rotating the argument of the wavefunction, and the corresponding infinitesimal generators are differential operators. These are the orbital angular momentum operators.
[definition: Orbital Angular Momentum]
Let $D=C_c^\infty(\mathbb R^3\setminus\{0\})\subset L^2(\mathbb R^3)$. The orbital angular momentum operators are the linear maps
\begin{align*}
L_i:D\to L^2(\mathbb R^3), \qquad L_i\psi:=-i\hbar\sum_{j,k=1}^3\varepsilon_{ijk}x_j\partial_{x_k}\psi,
\end{align*}
for $i\in\{1,2,3\}$.
[/definition]
Equivalently, these three components assemble into the vector-valued operator
\begin{align*}
L:D\to L^2(\mathbb R^3;\mathbb C^3), \qquad L\psi:=-i\hbar\,x\times \nabla\psi.
\end{align*}
The definition is the quantization of the classical angular momentum $x\times p$, with $p=-i\hbar\nabla$. Because it is generated by spatial rotations, it should satisfy the abstract angular momentum algebra and commute with Hamiltonians that are invariant under rotations.
[quotetheorem:6954]
[citeproof:6954]
The rotation-invariance hypothesis is doing real work: a potential such as $V(x)=x_3$ singles out an axis and does not commute with all components of $L$. The theorem also does not solve the radial part of the Schrödinger equation; it only separates the angular representation content from whatever radial spectral problem remains. This separation is the reason spherical harmonics enter central-potential calculations.
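To make the obstruction for $V(x)=x_3$ explicit, here is a short worked computation on the domain $D=C_c^\infty(\mathbb R^3\setminus\{0\})$, using only the definition of $L_i$; note that $x_3\psi\in D$ whenever $\psi\in D$. For $\psi\in D$, the product rule gives
\begin{align*}
L_1(x_3\psi)=-i\hbar\,(x_2\partial_{x_3}-x_3\partial_{x_2})(x_3\psi)=-i\hbar\,x_2\psi+x_3L_1\psi,
\end{align*}
so $[L_1,x_3]\psi=-i\hbar\,x_2\psi$, which is nonzero whenever $x_2\psi\ne 0$. By contrast,
\begin{align*}
L_3(x_3\psi)=-i\hbar\,(x_1\partial_{x_2}-x_2\partial_{x_1})(x_3\psi)=x_3L_3\psi,
\end{align*}
so $[L_3,x_3]\psi=0$. The potential $V(x)=x_3$ therefore retains only the rotational symmetry about the third axis.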
The commutator theorem explains why central potentials can be separated into angular sectors. The commuting observables $L^2$ and $L_z$ lead to a basis on the sphere, which is the angular part of separation of variables for the hydrogen atom and other central potentials.
[definition: Spherical Harmonics]
Let $\mathcal H_{S^2}=L^2(S^2;\mathbb C)$ and let $C^\infty(S^2)\subset \mathcal H_{S^2}$ be the smooth domain for the angular operators induced by the orbital angular momentum action on the unit sphere. Let
\begin{align*}
L^2:C^\infty(S^2)\to \mathcal H_{S^2}, \qquad L_z:C^\infty(S^2)\to \mathcal H_{S^2}
\end{align*}
denote the corresponding total orbital angular momentum squared operator and $z$-component. For $\ell\in\mathbb Z_{\ge 0}$ and $m=-\ell,-\ell+1,\dots,\ell$, the spherical harmonic $Y_{\ell,m}:S^2\to\mathbb C$ is a function $Y_{\ell,m}\in C^\infty(S^2)$ satisfying $\|Y_{\ell,m}\|_{L^2(S^2)}=1$ and
\begin{align*}
L^2Y_{\ell,m}=\hbar^2\ell(\ell+1)Y_{\ell,m}, \qquad L_zY_{\ell,m}=\hbar mY_{\ell,m}.
\end{align*}
[/definition]
Here $L^2=-\hbar^2\Delta_{S^2}$ on $C^\infty(S^2)$, and $L_z=-i\hbar\,\partial_\phi$ in the usual spherical coordinate chart away from the polar axis, with the global operator defined by the rotation action. The orbital quantum number is always an integer because scalar wavefunctions on physical space must return to themselves after a $2\pi$ spatial rotation. Eigenvalue equations alone would not be enough for separation of variables: a central-potential wavefunction must be recoverable from its angular modes. This raises the completeness question for the whole Hilbert space $L^2(S^2;\mathbb C)$, not only for the smooth eigenfunctions already written down.
The precise spectral statement is that these angular modes are not merely examples of eigenfunctions. They exhaust the Hilbert space of square-integrable angular dependence, so each central-potential problem can be split into independent $\ell$-sectors before the radial equation is solved.
[quotetheorem:6955]
Compactness of $S^2$ is the spectral input behind completeness; on $\mathbb R^2$, for comparison, the Laplacian has continuous spectrum and plane waves are not a countable orthonormal basis of $L^2(\mathbb R^2)$. The theorem also depends on using the sphere's standard rotation-invariant geometry: changing the angular operator changes both the eigenvalues and the eigenfunctions. It does not say that every three-dimensional wavefunction is only angular; the radial variable remains in a separate $L^2$ factor after separation of variables. What it gives is a complete angular bookkeeping system for central potentials.
Completeness means that angular dependence can be expanded into irreducible rotational pieces. The first two pieces already show the pattern: constants are rotation-invariant, while the next sector transforms like ordinary three-dimensional space.
[example: The Lowest Spherical Harmonics]
Write $n=(x/r,y/r,z/r)\in S^2$. The normalized $\ell=0$ spherical harmonic is the constant function
\begin{align*}
Y_{0,0}(n)=\frac{1}{\sqrt{4\pi}},
\end{align*}
and its defining eigenvalue is
\begin{align*}
L^2Y_{0,0}=\hbar^2\cdot 0\cdot(0+1)Y_{0,0}=0.
\end{align*}
Thus the $\ell=0$ sector is one-dimensional and rotation-invariant.
For $\ell=1$, using the standard spherical-coordinate convention,
\begin{align*}
Y_{1,0}=\sqrt{\frac{3}{4\pi}}\frac{z}{r}.
\end{align*}
\begin{align*}
Y_{1,1}=-\sqrt{\frac{3}{8\pi}}\frac{x+iy}{r}.
\end{align*}
\begin{align*}
Y_{1,-1}=\sqrt{\frac{3}{8\pi}}\frac{x-iy}{r}.
\end{align*}
Solving these three equations for the coordinate functions gives
\begin{align*}
\frac{z}{r}=\sqrt{\frac{4\pi}{3}}Y_{1,0}.
\end{align*}
\begin{align*}
Y_{1,-1}-Y_{1,1}=\sqrt{\frac{3}{8\pi}}\frac{x-iy}{r}+\sqrt{\frac{3}{8\pi}}\frac{x+iy}{r}=\sqrt{\frac{3}{2\pi}}\frac{x}{r}.
\end{align*}
\begin{align*}
\frac{x}{r}=\sqrt{\frac{2\pi}{3}}(Y_{1,-1}-Y_{1,1}).
\end{align*}
\begin{align*}
Y_{1,-1}+Y_{1,1}=\sqrt{\frac{3}{8\pi}}\frac{x-iy}{r}-\sqrt{\frac{3}{8\pi}}\frac{x+iy}{r}=-i\sqrt{\frac{3}{2\pi}}\frac{y}{r}.
\end{align*}
\begin{align*}
\frac{y}{r}=i\sqrt{\frac{2\pi}{3}}(Y_{1,-1}+Y_{1,1}).
\end{align*}
Therefore $\operatorname{span}\{Y_{1,-1},Y_{1,0},Y_{1,1}\}=\operatorname{span}\{x/r,y/r,z/r\}$. If $f_a(n)=a\cdot n$ for $a\in\mathbb C^3$, then a rotation acts by
\begin{align*}
(U(R)f_a)(n)=f_a(R^{-1}n)=a\cdot R^{-1}n=(Ra)\cdot n=f_{Ra}(n),
\end{align*}
so the $\ell=1$ sector, the first orbital sector with nonzero angular momentum, transforms exactly like the ordinary three-dimensional vector representation of rotations.
[/example]
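The normalization constants and the inversion formulas in this example can be sanity-checked numerically. The sketch below is illustrative only: it assumes the usual chart $n=(\sin\theta\cos\phi,\sin\theta\sin\phi,\cos\theta)$, and it approximates the $L^2(S^2)$ inner product with a simple midpoint rule in a hypothetical helper `sphere_inner`, so the Gram matrix is only accurate to the quadrature error.

```python
import cmath
import math

# l = 1 spherical harmonics in the chart n = (sin t cos p, sin t sin p, cos t).
def Y10(t, p):
    return math.sqrt(3 / (4 * math.pi)) * math.cos(t)

def Y11(t, p):
    return -math.sqrt(3 / (8 * math.pi)) * math.sin(t) * cmath.exp(1j * p)

def Y1m1(t, p):
    return math.sqrt(3 / (8 * math.pi)) * math.sin(t) * cmath.exp(-1j * p)

def sphere_inner(f, g, n_theta=200, n_phi=64):
    """Midpoint-rule approximation of the integral of conj(f) g sin(t) dt dp."""
    dt, dp = math.pi / n_theta, 2 * math.pi / n_phi
    total = 0j
    for a in range(n_theta):
        t = (a + 0.5) * dt
        for b in range(n_phi):
            p = (b + 0.5) * dp
            total += complex(f(t, p)).conjugate() * g(t, p) * math.sin(t)
    return total * dt * dp

harmonics = [Y1m1, Y10, Y11]
gram = [[sphere_inner(f, g) for g in harmonics] for f in harmonics]

# Recover the coordinate functions at one sample point from the inversion formulas.
t0, p0 = 0.7, 1.9
c = math.sqrt(2 * math.pi / 3)
x_rec = c * (Y1m1(t0, p0) - Y11(t0, p0))
y_rec = 1j * c * (Y1m1(t0, p0) + Y11(t0, p0))
z_rec = math.sqrt(4 * math.pi / 3) * Y10(t0, p0)

print("Gram matrix (should be close to the identity):")
for row in gram:
    print([round(abs(v), 4) for v in row])
print("x/r, y/r, z/r recovered:", x_rec, y_rec, z_rec)
```

A Gram matrix close to the identity is consistent with the three functions being orthonormal, and the recovered values should agree with $\sin\theta\cos\phi$, $\sin\theta\sin\phi$, and $\cos\theta$ at the sample point.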
## Spin and Tensor Product Systems
Why is spin not just another kind of orbital motion? Spin is an internal degree of freedom: it transforms under rotations but is not produced by differentiating a wavefunction in physical position space. Mathematically, the Hilbert space gains a finite-dimensional representation factor.
[definition: Spin j System]
A spin $j$ system has internal Hilbert space $V_j\cong \mathbb C^{2j+1}$ carrying the irreducible representation of $\mathfrak{su}(2)$ with angular momentum operators $S_x,S_y,S_z:V_j\to V_j$ satisfying
\begin{align*}
S^2|j,m\rangle=\hbar^2j(j+1)|j,m\rangle, \qquad S_z|j,m\rangle=\hbar m|j,m\rangle.
\end{align*}
[/definition]
The spin definition describes an internal representation, while orbital angular momentum describes spatial rotations of wavefunctions. For a particle with both position and spin, we need a single operator that rotates both factors at once.
[definition: Total Angular Momentum]
Let $D_{\mathrm{orb}}\subset H_{\mathrm{orb}}$ be a common dense invariant domain for $L_x,L_y,L_z$, and let $H_{\mathrm{spin}}$ be finite-dimensional with spin operators $S_x,S_y,S_z:H_{\mathrm{spin}}\to H_{\mathrm{spin}}$. On
\begin{align*}
D:=D_{\mathrm{orb}}\otimes H_{\mathrm{spin}}\subset H_{\mathrm{orb}}\otimes H_{\mathrm{spin}},
\end{align*}
the total angular momentum operators are the linear maps
\begin{align*}
J_i:D\to H_{\mathrm{orb}}\otimes H_{\mathrm{spin}}, \qquad J_i\psi:=(L_i\otimes I+I\otimes S_i)\psi
\end{align*}
for $i\in\{x,y,z\}$.
[/definition]
The [tensor product](/page/Tensor%20Product) formula reflects the rule that both the spatial wavefunction and the internal spinor are rotated. Adding operators is harmless only if the orbital and spin actions live on independent tensor factors, so their cross commutators vanish.
Before total angular momentum can be used as an observable, one must check that the summed operators still form a legitimate angular momentum triple. The next result supplies that algebraic check: independence of the tensor factors should make the mixed commutators disappear while preserving the usual relations for the total operators.
[quotetheorem:6956]
[citeproof:6956]
The different-factor hypothesis is essential: if two angular momenta act on the same degrees of freedom, the cross commutators need not vanish. The theorem also does not decompose the tensor product into irreducibles; it only proves that the sum is again a valid angular momentum action. That decomposition is the next problem, and its answer is recorded by the coupled basis.
The theorem permits two different bases for the same Hilbert space: one basis labels the two angular momenta separately, and the other labels their total. This distinction is already useful for a spin one half particle moving in a central potential.
[example: Spin Orbit Labels]
For a spin one half particle in a central potential, the uncoupled basis first records orbital and spin labels separately:
\begin{align*}
|\ell,m_\ell\rangle\otimes|\tfrac12,m_s\rangle
\end{align*}
with $m_\ell=-\ell,-\ell+1,\dots,\ell$ and $m_s=\pm\tfrac12$. Each such product vector is an eigenvector of the total operator $J_z=L_z\otimes I+I\otimes S_z$:
\begin{align*}
J_z\bigl(|\ell,m_\ell\rangle\otimes|\tfrac12,m_s\rangle\bigr)=\hbar(m_\ell+m_s)\,|\ell,m_\ell\rangle\otimes|\tfrac12,m_s\rangle.
\end{align*}
Thus the coupled magnetic quantum number is $M=m_\ell+m_s$.
By the *Clebsch-Gordan Decomposition* applied to $V_\ell\otimes V_{1/2}$, the possible total angular momentum sectors are
\begin{align*}
V_\ell\otimes V_{1/2}\cong \bigoplus_{j=|\ell-1/2|}^{\ell+1/2}V_j,
\end{align*}
where $j$ increases in integer steps. If $\ell\ge 1$, this gives exactly two sectors,
\begin{align*}
V_\ell\otimes V_{1/2}\cong V_{\ell+1/2}\oplus V_{\ell-1/2}.
\end{align*}
The dimensions match the original tensor product because
\begin{align*}
\dim(V_\ell\otimes V_{1/2})=(2\ell+1)\cdot 2=4\ell+2
\end{align*}
and
\begin{align*}
\dim V_{\ell+1/2}+\dim V_{\ell-1/2}=(2\ell+2)+(2\ell)=4\ell+2.
\end{align*}
When $\ell=0$, the lower and upper bounds coincide:
\begin{align*}
|\ell-\tfrac12|=\tfrac12
\end{align*}
and
\begin{align*}
\ell+\tfrac12=\tfrac12,
\end{align*}
so only $j=\tfrac12$ occurs. The spin-orbit labels therefore come from replacing the uncoupled labels $(\ell,m_\ell;\tfrac12,m_s)$ by the coupled labels $(j,M)$, with $j=\ell\pm\tfrac12$ except in the $\ell=0$ case.
[/example]
## Addition of Angular Momenta and Clebsch-Gordan Coefficients
Given two subsystems with angular momentum, how does the combined system decompose into irreducible rotational sectors? The tensor product basis records separate measurements, while the coupled basis records total angular momentum. Clebsch-Gordan coefficients are the entries of the unitary change-of-basis matrix between these two descriptions.
[definition: Clebsch-Gordan Coefficient]
Let $V_{j_1}$ and $V_{j_2}$ be irreducible $\mathfrak{su}(2)$ representations. The Clebsch-Gordan coefficients are the complex numbers $\langle j_1m_1,j_2m_2\mid JM\rangle$ defined by the expansion
\begin{align*}
|J,M\rangle=\sum_{m_1,m_2}\langle j_1m_1,j_2m_2\mid JM\rangle |j_1,m_1\rangle\otimes |j_2,m_2\rangle,
\end{align*}
where the sum is over $m_1+m_2=M$.
[/definition]
The condition $m_1+m_2=M$ follows from the fact that $J_z=J_{1,z}+J_{2,z}$. This selection rule filters the possible coefficients, but it does not say which values of the total spin $J$ actually exist in the tensor product. A coefficient table is only meaningful once those allowed total-spin sectors have been identified.
The tensor product has many vectors with the same value of $M$, so the condition $m_1+m_2=M$ cannot by itself identify total angular momentum. What is needed next is a structural theorem that splits the full tensor product into irreducible strings for the total operators $J_i=J_{1,i}+J_{2,i}$. The Clebsch-Gordan decomposition supplies exactly this list of total-spin sectors; after it is known, the coefficients are computed by choosing normalized vectors inside those sectors.
There is one notational translation to keep explicit. The theorem card below uses the standard algebraic $SU(2)$ convention in which $V_n$ denotes the irreducible representation of highest weight $n$ and dimension $n+1$, with $n\in\mathbb Z_{\ge0}$. The physics notation in this chapter writes $V_j$ for the spin $j$ representation of dimension $2j+1$. Thus the same irreducible is written
\begin{align*}
V_j^{\mathrm{spin}}=V_{2j}^{\mathrm{alg}}.
\end{align*}
Applying the theorem with $n=2j_1$ and $m=2j_2$ gives the spin-labelled version
\begin{align*}
V_{j_1}^{\mathrm{spin}}\otimes V_{j_2}^{\mathrm{spin}}\cong \bigoplus_{J=|j_1-j_2|}^{j_1+j_2} V_J^{\mathrm{spin}},
\end{align*}
where $J$ increases in steps of $1$.
[quotetheorem:2479]
[citeproof:2479]
In the examples below, $V_{1/2}$, $V_1$, and $V_{3/2}$ are spin-labelled representations, so they correspond respectively to the algebraic modules $V_1$, $V_2$, and $V_3$ in the quoted theorem. Complete reducibility is the structural hypothesis behind the subtraction process; without invariant orthogonal complements, a weight count alone would not produce a direct-sum decomposition. A two-dimensional non-unitary representation of a solvable Lie algebra can have two one-dimensional composition factors while still being an indecomposable Jordan block, which illustrates the failure that complete reducibility prevents. The theorem also relies on both factors being irreducible; if $V_{j_1}$ were replaced by $V_0\oplus V_1$, the tensor product would decompose by distributing over both summands first. The theorem does not determine the Clebsch-Gordan coefficients themselves, only which irreducible summands can appear. Those allowed summands impose immediate selection rules before any table is calculated.
The decomposition theorem gives selection rules before any coefficient is computed. A Clebsch-Gordan coefficient can be nonzero only when $M=m_1+m_2$ and $|j_1-j_2|\le J\le j_1+j_2$.
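The list of allowed $J$ can be recovered by the weight-subtraction process mentioned above: count how often each $M=m_1+m_2$ occurs, remove one full string $J,J-1,\dots,-J$ for the largest remaining $M$, and repeat. The sketch below is a minimal illustration of that bookkeeping in spin-labelled notation; the function names `weights` and `decompose` are assumptions made here, and the procedure presupposes complete reducibility rather than proving it.

```python
from collections import Counter
from fractions import Fraction

def weights(j):
    """Magnetic labels m = j, j-1, ..., -j of the spin j representation."""
    j = Fraction(j)
    return [j - k for k in range(int(2 * j) + 1)]

def decompose(j1, j2):
    """List the total J occurring in V_{j1} (x) V_{j2} by peeling off highest weights."""
    # Multiplicity of each M = m1 + m2 in the uncoupled basis.
    mult = Counter(m1 + m2 for m1 in weights(j1) for m2 in weights(j2))
    sectors = []
    while any(count > 0 for count in mult.values()):
        J = max(M for M, count in mult.items() if count > 0)
        sectors.append(J)
        for M in weights(J):      # remove one complete string J, J-1, ..., -J
            mult[M] -= 1
    return sectors

for pair in [("1/2", "1/2"), (1, "1/2"), (1, 1), (2, "1/2")]:
    found = decompose(*pair)
    dims = sum(int(2 * J) + 1 for J in found)
    print(pair, "->", [str(J) for J in found], "total dimension", dims)
```

The total dimension printed for each pair should equal $(2j_1+1)(2j_2+1)$, which is the same consistency check carried out by hand for $V_\ell\otimes V_{1/2}$.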
[example: Two Spin One Half Particles]
For two spin one half systems, *Clebsch-Gordan Decomposition* gives
\begin{align*}
V_{1/2}\otimes V_{1/2}\cong V_1\oplus V_0.
\end{align*}
Write $|+\rangle=|\tfrac12,\tfrac12\rangle$ and $|-\rangle=|\tfrac12,-\tfrac12\rangle$. For a single spin one half factor, the lowering formula gives
\begin{align*}
J_-|+\rangle=\hbar\sqrt{\tfrac12(\tfrac12+1)-\tfrac12(\tfrac12-1)}\,|-\rangle=\hbar|-\rangle.
\end{align*}
Also,
\begin{align*}
J_-|-\rangle=\hbar\sqrt{\tfrac12(\tfrac12+1)-(-\tfrac12)(-\tfrac12-1)}\,|\tfrac12,-\tfrac32\rangle=0,
\end{align*}
because there is no state with $m=-\tfrac32$ in $V_{1/2}$.
For the total operators on the tensor product, use
\begin{align*}
J_-=J_{1,-}\otimes I+I\otimes J_{2,-}.
\end{align*}
The highest vector has $M=1$:
\begin{align*}
|1,1\rangle=|+\rangle|+\rangle.
\end{align*}
Lowering it gives
\begin{align*}
J_-|+\rangle|+\rangle=(J_{1,-}|+\rangle)|+\rangle+|+\rangle(J_{2,-}|+\rangle)=\hbar|-\rangle|+\rangle+\hbar|+\rangle|-\rangle.
\end{align*}
In the spin-one representation, the ladder coefficient formula applied to the same vector gives
\begin{align*}
J_-|1,1\rangle=\hbar\sqrt{1(1+1)-1(1-1)}\,|1,0\rangle=\hbar\sqrt2\,|1,0\rangle.
\end{align*}
Comparing the two expressions for $J_-|1,1\rangle$ gives
\begin{align*}
|1,0\rangle=\frac{1}{\sqrt2}\bigl(|+\rangle|-\rangle+|-\rangle|+\rangle\bigr).
\end{align*}
Lowering once more,
\begin{align*}
J_-|1,0\rangle=\frac{1}{\sqrt2}\bigl(\hbar|-\rangle|-\rangle+\hbar|-\rangle|-\rangle\bigr)=\hbar\sqrt2\,|-\rangle|-\rangle.
\end{align*}
Since
\begin{align*}
J_-|1,0\rangle=\hbar\sqrt{1(1+1)-0(0-1)}\,|1,-1\rangle=\hbar\sqrt2\,|1,-1\rangle,
\end{align*}
we obtain
\begin{align*}
|1,-1\rangle=|-\rangle|-\rangle.
\end{align*}
The remaining $M=0$ vector must be orthogonal to $|1,0\rangle$ inside $\operatorname{span}\{|+\rangle|-\rangle,|-\rangle|+\rangle\}$. With the tensor-product basis orthonormal,
\begin{align*}
|0,0\rangle=\frac{1}{\sqrt2}\bigl(|+\rangle|-\rangle-|-\rangle|+\rangle\bigr)
\end{align*}
has norm $1$ and satisfies
\begin{align*}
\langle 1,0|0,0\rangle=\frac12(1-1)=0.
\end{align*}
It has $J_z$ eigenvalue $0$ because the two terms both have $m_1+m_2=0$. Moreover,
\begin{align*}
J_+|0,0\rangle=\frac{1}{\sqrt2}\bigl(\hbar|+\rangle|+\rangle-\hbar|+\rangle|+\rangle\bigr)=0.
\end{align*}
Similarly,
\begin{align*}
J_-|0,0\rangle=\frac{1}{\sqrt2}\bigl(\hbar|-\rangle|-\rangle-\hbar|-\rangle|-\rangle\bigr)=0.
\end{align*}
Thus the vector is annihilated by $J_+$, $J_-$, and $J_z$, hence by every component of the total angular momentum, and belongs to the spin-zero summand.
If $P$ denotes particle interchange, then
\begin{align*}
P|+\rangle|+\rangle=|+\rangle|+\rangle,
\end{align*}
\begin{align*}
P|-\rangle|-\rangle=|-\rangle|-\rangle,
\end{align*}
and
\begin{align*}
P\frac{|+\rangle|-\rangle+|-\rangle|+\rangle}{\sqrt2}=\frac{|-\rangle|+\rangle+|+\rangle|-\rangle}{\sqrt2}=|1,0\rangle.
\end{align*}
So the triplet is symmetric. For the singlet,
\begin{align*}
P|0,0\rangle=\frac{|-\rangle|+\rangle-|+\rangle|-\rangle}{\sqrt2}=-|0,0\rangle,
\end{align*}
so the singlet is antisymmetric and has total angular momentum zero.
[/example]
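The triplet and singlet vectors can be verified against the total operators directly. The sketch below is illustrative: it assumes $\hbar=1$, orders the tensor basis as $|+\rangle|+\rangle,|+\rangle|-\rangle,|-\rangle|+\rangle,|-\rangle|-\rangle$, and uses a hand-written Kronecker product `kron`; none of these names come from the text.

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(*ms):
    n = len(ms[0])
    return [[sum(m[i][j] for m in ms) for j in range(n)] for i in range(n)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def close_vec(u, v, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(u, v))

I2 = [[1, 0], [0, 1]]
s = {"x": [[0, 0.5], [0.5, 0]], "y": [[0, -0.5j], [0.5j, 0]], "z": [[0.5, 0], [0, -0.5]]}

# Total operators J_i = S_i (x) I + I (x) S_i and the total J^2.
J = {k: add(kron(m, I2), kron(I2, m)) for k, m in s.items()}
J2 = add(*(matmul(J[k], J[k]) for k in "xyz"))

r = 1 / math.sqrt(2)
triplet = {1: [1, 0, 0, 0], 0: [0, r, r, 0], -1: [0, 0, 0, 1]}
singlet = [0, r, -r, 0]

# Particle interchange swaps the two middle basis vectors.
P = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

for M, v in triplet.items():
    ok = close_vec(matvec(J2, v), [2 * x for x in v]) and close_vec(matvec(J["z"], v), [M * x for x in v])
    print("triplet M =", M, "has J^2 = 2 and J_z = M:", ok)
print("singlet has J^2 = 0:", close_vec(matvec(J2, singlet), [0, 0, 0, 0]))
print("singlet is antisymmetric:", close_vec(matvec(P, singlet), [-x for x in singlet]))
```

The eigenvalue $2=1\cdot(1+1)$ identifies the spin-one summand and the eigenvalue $0$ identifies the spin-zero summand, in units where $\hbar=1$.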
The spin one half coupling is small enough to write in full, but the same ladder method works in larger tensor products. Coupling spin one with spin one half gives the next case where different total-spin sectors share intermediate $M$ values and must be separated by orthogonality.
[example: Coupling Spin One with Spin One Half]
Write $|+\rangle=|\tfrac12,\tfrac12\rangle$ and $|-\rangle=|\tfrac12,-\tfrac12\rangle$. By *Clebsch-Gordan Decomposition*,
\begin{align*}
V_1\otimes V_{1/2}\cong V_{3/2}\oplus V_{1/2}.
\end{align*}
We compute the coupled basis from the total lowering operator
\begin{align*}
J_-=J_{1,-}\otimes I+I\otimes J_{2,-}.
\end{align*}
By *Ladder Coefficients*,
\begin{align*}
J_{1,-}|1,1\rangle=\hbar\sqrt{1(1+1)-1(1-1)}|1,0\rangle=\hbar\sqrt2|1,0\rangle.
\end{align*}
\begin{align*}
J_{1,-}|1,0\rangle=\hbar\sqrt{1(1+1)-0(0-1)}|1,-1\rangle=\hbar\sqrt2|1,-1\rangle.
\end{align*}
\begin{align*}
J_{1,-}|1,-1\rangle=\hbar\sqrt{1(1+1)-(-1)(-1-1)}|1,-2\rangle=0.
\end{align*}
Also,
\begin{align*}
J_{2,-}|+\rangle=\hbar\sqrt{\tfrac12(\tfrac12+1)-\tfrac12(\tfrac12-1)}|-\rangle=\hbar|-\rangle.
\end{align*}
\begin{align*}
J_{2,-}|-\rangle=\hbar\sqrt{\tfrac12(\tfrac12+1)-(-\tfrac12)(-\tfrac12-1)}|\tfrac12,-\tfrac32\rangle=0.
\end{align*}
The highest state in the $V_{3/2}$ summand is
\begin{align*}
|\tfrac32,\tfrac32\rangle=|1,1\rangle|+\rangle.
\end{align*}
Lowering it in the tensor product gives
\begin{align*}
J_-|1,1\rangle|+\rangle=(\hbar\sqrt2|1,0\rangle)|+\rangle+|1,1\rangle(\hbar|-\rangle)=\hbar\sqrt2|1,0\rangle|+\rangle+\hbar|1,1\rangle|-\rangle.
\end{align*}
In the spin $\tfrac32$ representation,
\begin{align*}
J_-|\tfrac32,\tfrac32\rangle=\hbar\sqrt{\tfrac32(\tfrac32+1)-\tfrac32(\tfrac32-1)}|\tfrac32,\tfrac12\rangle=\hbar\sqrt3|\tfrac32,\tfrac12\rangle.
\end{align*}
Comparing the two expressions gives
\begin{align*}
|\tfrac32,\tfrac12\rangle=\sqrt{\frac23}|1,0\rangle|+\rangle+\frac{1}{\sqrt3}|1,1\rangle|-\rangle.
\end{align*}
Lower this state once more:
\begin{align*}
J_-|\tfrac32,\tfrac12\rangle=\sqrt{\frac23}\bigl(\hbar\sqrt2|1,-1\rangle|+\rangle+\hbar|1,0\rangle|-\rangle\bigr)+\frac{1}{\sqrt3}\hbar\sqrt2|1,0\rangle|-\rangle.
\end{align*}
Combining the displayed coefficients,
\begin{align*}
J_-|\tfrac32,\tfrac12\rangle=\frac{2\hbar}{\sqrt3}|1,-1\rangle|+\rangle+2\hbar\sqrt{\frac23}|1,0\rangle|-\rangle.
\end{align*}
The spin $\tfrac32$ lowering coefficient at $m=\tfrac12$ is
\begin{align*}
\hbar\sqrt{\tfrac32(\tfrac32+1)-\tfrac12(\tfrac12-1)}=2\hbar,
\end{align*}
so
\begin{align*}
|\tfrac32,-\tfrac12\rangle=\frac{1}{\sqrt3}|1,-1\rangle|+\rangle+\sqrt{\frac23}|1,0\rangle|-\rangle.
\end{align*}
Lowering this state gives
\begin{align*}
J_-|\tfrac32,-\tfrac12\rangle=\frac{1}{\sqrt3}\hbar|1,-1\rangle|-\rangle+\sqrt{\frac23}\hbar\sqrt2|1,-1\rangle|-\rangle=\hbar\sqrt3|1,-1\rangle|-\rangle.
\end{align*}
Since
\begin{align*}
J_-|\tfrac32,-\tfrac12\rangle=\hbar\sqrt{\tfrac32(\tfrac32+1)-(-\tfrac12)(-\tfrac12-1)}|\tfrac32,-\tfrac32\rangle=\hbar\sqrt3|\tfrac32,-\tfrac32\rangle,
\end{align*}
we obtain
\begin{align*}
|\tfrac32,-\tfrac32\rangle=|1,-1\rangle|-\rangle.
\end{align*}
It remains to find the $V_{1/2}$ summand. In the $M=\tfrac12$ subspace, the orthonormal basis vectors are $|1,0\rangle|+\rangle$ and $|1,1\rangle|-\rangle$. A unit vector orthogonal to $|\tfrac32,\tfrac12\rangle$ is
\begin{align*}
|\tfrac12,\tfrac12\rangle=\frac{1}{\sqrt3}|1,0\rangle|+\rangle-\sqrt{\frac23}|1,1\rangle|-\rangle,
\end{align*}
because its inner product with $\sqrt{\frac23}|1,0\rangle|+\rangle+\frac{1}{\sqrt3}|1,1\rangle|-\rangle$ is
\begin{align*}
\frac{1}{\sqrt3}\sqrt{\frac23}-\sqrt{\frac23}\frac{1}{\sqrt3}=0.
\end{align*}
Lowering gives
\begin{align*}
J_-|\tfrac12,\tfrac12\rangle=\frac{1}{\sqrt3}\bigl(\hbar\sqrt2|1,-1\rangle|+\rangle+\hbar|1,0\rangle|-\rangle\bigr)-\sqrt{\frac23}\hbar\sqrt2|1,0\rangle|-\rangle.
\end{align*}
Thus
\begin{align*}
|\tfrac12,-\tfrac12\rangle=\sqrt{\frac23}|1,-1\rangle|+\rangle-\frac{1}{\sqrt3}|1,0\rangle|-\rangle,
\end{align*}
because the spin $\tfrac12$ lowering coefficient from $m=\tfrac12$ to $m=-\tfrac12$ is $\hbar$. This vector is orthogonal to $|\tfrac32,-\tfrac12\rangle$ since
\begin{align*}
\sqrt{\frac23}\frac{1}{\sqrt3}-\frac{1}{\sqrt3}\sqrt{\frac23}=0.
\end{align*}
The coupled basis is therefore obtained entirely by ladder coefficients and orthogonality inside fixed $M$ subspaces; no differential equation is involved.
[/example]
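The ladder computation in the example can be replayed numerically. The following Python sketch works in units with $\hbar=1$ and stores a product state as a dictionary keyed by $(m_l,m_s)$; the helper names (`lower_coeff`, `lower_product_state`, `normalise`, `inner`) are illustrative choices, not notation from the text.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1


def lower_coeff(j, m):
    """Coefficient c in J_-|j,m> = c|j,m-1>."""
    return HBAR * math.sqrt(j * (j + 1) - m * (m - 1))


def lower_product_state(state, l=1, s=0.5):
    """Apply J_- = L_- + S_- to a state stored as {(m_l, m_s): amplitude}."""
    out = {}
    for (ml, ms), amp in state.items():
        if ml > -l:  # L_- lowers the orbital label
            key = (ml - 1, ms)
            out[key] = out.get(key, 0.0) + amp * lower_coeff(l, ml)
        if ms > -s:  # S_- lowers the spin label
            key = (ml, ms - 1)
            out[key] = out.get(key, 0.0) + amp * lower_coeff(s, ms)
    return out


def normalise(state):
    norm = math.sqrt(sum(a * a for a in state.values()))
    return {key: a / norm for key, a in state.items()}


def inner(a, b):
    """Inner product for real amplitudes."""
    return sum(amp * b.get(key, 0.0) for key, amp in a.items())


# Quartet: start from the highest-weight vector |1,1>|+> and lower three times.
q_32 = {(1, 0.5): 1.0}
q_12 = normalise(lower_product_state(q_32))
q_m12 = normalise(lower_product_state(q_12))
q_m32 = normalise(lower_product_state(q_m12))

# Doublet: the M = 1/2 vector orthogonal to q_12, then one lowering step.
d_12 = {(0, 0.5): 1 / math.sqrt(3), (1, -0.5): -math.sqrt(2 / 3)}
d_m12 = normalise(lower_product_state(d_12))
```

Normalising after each lowering step plays the role of dividing by the ladder coefficient of the coupled state, so the dictionaries `q_12`, `q_m12`, `q_m32`, and `d_m12` should reproduce the coefficients displayed in the example.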
## Rotationally Invariant Hamiltonians
How do these representation-theoretic labels affect dynamics? If the Hamiltonian commutes with the rotation action, it preserves each irreducible angular momentum sector. This turns symmetry into a block diagonalization principle for spectral problems.
[quotetheorem:6957]
[citeproof:6957]
Commutation with the full $SU(2)$ action is essential: the Zeeman Hamiltonian $H_0+\omega J_z$ commutes with $J_z$ but assigns different shifts $\omega\hbar m$ to different magnetic labels, so it splits a spin $j$ multiplet when $\omega\ne 0$. The domain condition is also essential for unbounded Hamiltonians, because a formal commutator on a small core does not by itself imply that the spectral eigenspaces are preserved by every rotation. The theorem gives a lower bound on multiplicity, not an exact count, because other symmetries or accidental degeneracies may add further states at the same energy. This distinction is what lets perturbations split some degeneracies while preserving others.
This is the formal explanation for magnetic degeneracy in central potentials. Additional perturbations, such as magnetic fields or spin-orbit terms, may break part of the symmetry and split these multiplets according to the remaining commuting observables.
[example: Central Potential Multiplets]
For a central Hamiltonian
\begin{align*}
H=-\frac{\hbar^2}{2m}\Delta+V(r)
\end{align*}
on $L^2(\mathbb R^3)$, write a separated eigenstate as
\begin{align*}
\psi_{n,\ell,m}(r,\theta,\phi)=R_{n,\ell}(r)Y_{\ell,m}(\theta,\phi).
\end{align*}
The angular labels satisfy $m=-\ell,-\ell+1,\dots,\ell$, so the number of allowed $m$-values is
\begin{align*}
\ell-(-\ell)+1=2\ell+1.
\end{align*}
If the radial equation gives an energy $E_{n,\ell}$ independent of $m$, then each state in the same angular sector satisfies
\begin{align*}
H\psi_{n,\ell,m}=E_{n,\ell}\psi_{n,\ell,m}.
\end{align*}
Thus the fixed $(n,\ell)$ energy level contains the $2\ell+1$ linearly independent states $\psi_{n,\ell,-\ell},\dots,\psi_{n,\ell,\ell}$.
Now perturb by a term proportional to $L_z$, say
\begin{align*}
H_\lambda=H+\lambda L_z.
\end{align*}
Since $L_zY_{\ell,m}=\hbar mY_{\ell,m}$, the product form gives
\begin{align*}
L_z\psi_{n,\ell,m}=R_{n,\ell}(r)L_zY_{\ell,m}=\hbar m\psi_{n,\ell,m}.
\end{align*}
Therefore
\begin{align*}
H_\lambda\psi_{n,\ell,m}=H\psi_{n,\ell,m}+\lambda L_z\psi_{n,\ell,m}.
\end{align*}
Substituting the two eigenvalue equations,
\begin{align*}
H_\lambda\psi_{n,\ell,m}=E_{n,\ell}\psi_{n,\ell,m}+\lambda\hbar m\psi_{n,\ell,m}.
\end{align*}
Factoring out the common vector,
\begin{align*}
H_\lambda\psi_{n,\ell,m}=(E_{n,\ell}+\lambda\hbar m)\psi_{n,\ell,m}.
\end{align*}
The perturbation preserves the magnetic label $m$, but it splits the former $2\ell+1$ degeneracy into levels shifted by $\lambda\hbar m$.
[/example]
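The splitting in the example is simple enough to tabulate directly. The sketch below is only an illustration: the function name `zeeman_levels` and the numerical values passed to it are hypothetical, and $\hbar=1$ is assumed by default.

```python
def zeeman_levels(energy_nl, ell, lam, hbar=1.0):
    """Return {m: E_{n,l} + lam*hbar*m} for m = -ell, ..., ell."""
    return {m: energy_nl + lam * hbar * m for m in range(-ell, ell + 1)}


# Hypothetical numbers: an l = 2 level at energy -1.0 with coupling lam = 0.1.
unsplit = zeeman_levels(-1.0, 2, 0.0)
split = zeeman_levels(-1.0, 2, 0.1)

# With lam = 0 all 2l+1 states share one energy; with lam != 0 they separate.
distinct_unsplit = len(set(unsplit.values()))
distinct_split = len(set(split.values()))
```

For $\lambda=0$ the five magnetic labels share a single energy, while for $\lambda\neq 0$ they give five equally spaced levels.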
Angular momentum is therefore both a symmetry principle and a computational method. The passage from $SO(3)$ to $SU(2)$ explains half-integer spin, the classification of $\mathfrak{su}(2)$ representations gives the allowed spectra of $J^2$ and $J_z$, and the Clebsch-Gordan decomposition controls how composite quantum systems organize into total angular momentum sectors.
Angular momentum supplies the representation theory needed to organize rotationally invariant systems. The next chapter applies that machinery to central potentials, where symmetry reduces the three-dimensional Schrödinger equation to radial spectral problems.
# 9. Central Potentials and the Hydrogen Atom
Central potentials are the main class of three-dimensional quantum systems where symmetry reduces the spectral problem to ordinary differential equations. The question is how rotational invariance turns the Hamiltonian on $L^2(\mathbb R^3)$ into independent radial problems, and how the Coulomb potential then produces the discrete hydrogen spectrum. This chapter connects the angular momentum representation theory from the previous chapter with the first physically important spectral calculation in three dimensions.
## Rotational Symmetry and Angular Momentum Decomposition
The starting problem is that the Schrödinger operator on $\mathbb R^3$ involves three spatial variables, while a central potential depends only on the radius. Rotational symmetry should mean that the angular variables can be separated and labelled by angular momentum quantum numbers. The mathematical content is an orthogonal decomposition of $L^2(\mathbb R^3)$ into spherical harmonic sectors.
A central potential depends only on the distance from the origin; these are exactly the potentials for which rotations commute with the Hamiltonian.
[definition: Central Potential]
Let $V:(0,\infty)\to \mathbb R$ be a real-valued function. The associated central potential is the multiplication function $V_c:\mathbb R^3\setminus\{0\}\to\mathbb R$ given by
\begin{align*}
V_c:x\mapsto V(|x|).
\end{align*}
[/definition]
The definition singles out potentials that are constant on spheres centred at the origin. To exploit this symmetry in the Schrödinger equation, the differential operator itself must be rewritten into radial and angular parts. This motivates the following theorem because it identifies the exact angular operator that rotations diagonalise.
[quotetheorem:6958]
[citeproof:6958]
The hypotheses in this theorem are doing real work. The condition $r>0$ avoids the coordinate singularity at the origin, where spherical coordinates do not provide a smooth chart and the displayed formula cannot be read as a pointwise identity. For example, the radial function $u(x)=1/|x|$ satisfies the displayed differential expression away from $0$, but as a distribution on $\mathbb R^3$ its Laplacian contains the point mass $-4\pi\delta_0$. Thus applying the spherical-coordinate formula across the origin would miss a source term. The $C^2$ assumption is what permits the second derivatives in both the radial and angular directions to be interpreted classically; for rough $L^2$ wavefunctions the same decomposition must be understood through operator domains or weak derivatives. The statement is also only a differential identity: by itself it does not specify a self-adjoint Laplacian, a boundary condition at $r=0$, or a domain for a quantum Hamiltonian.
The decomposition shows that the angular dependence is controlled by a single self-adjoint operator on the sphere. To make separation of variables into a basis expansion, we need the eigenfunctions of that angular operator. This motivates the following definition because those eigenfunctions become the angular quantum states.
[definition: Spherical Harmonics]
For $l\in\mathbb N\cup\{0\}$ and $m\in\mathbb Z$ with $|m|\le l$, a spherical harmonic $Y_{l,m}:S^2\to\mathbb C$ is an eigenfunction of $-\Delta_{S^2}$ satisfying
\begin{align*}
-\Delta_{S^2}Y_{l,m} = l(l+1)Y_{l,m}.
\end{align*}
The family is chosen orthonormal in $L^2(S^2)$.
[/definition]
The indices have physical meaning: $l$ is the same orbital angular momentum quantum number denoted $\ell$ in Chapter 8, and $m$ is the magnetic quantum number. For fixed $l$, there are $2l+1$ independent angular states. If the angular functions were not a complete orthonormal basis of $L^2(S^2)$, separation would miss part of the Hilbert space. For instance, keeping only the $m=0$ spherical harmonics would describe axially symmetric angular dependence but would exclude states such as $Y_{1,1}$. This motivates the following theorem because a central Hamiltonian can be reduced only after the whole Hilbert space has been decomposed into these angular sectors.
[quotetheorem:6959]
[citeproof:6959]
This theorem is the functional-analytic justification for separating variables, but it is an $L^2$ statement rather than a pointwise [Fourier series](/page/Fourier%20Series) theorem. The expansion converges in the Hilbert-space norm, so additional regularity is needed before differentiating the series term by term or reading off pointwise values. The completeness and orthonormality of the spherical harmonics on $S^2$ are essential: without them the [Parseval identity](/theorems/248) and the sector-by-sector norm formula would fail. A concrete failure occurs if the angular basis is replaced by only the zonal harmonics $Y_{l,0}$: the state $R(r)Y_{1,1}(\theta,\phi)$ has positive norm and is orthogonal to every retained angular function, so the proposed expansion would reconstruct it as zero. A central Hamiltonian preserves each fixed $l,m$ sector, while a non-central perturbation generally couples different spherical harmonic sectors and destroys the radial reduction.
## Radial Schrödinger Operators
The next problem is to turn separation of variables into an operator statement. The radial equation in the variable $R(r)$ contains a first derivative and the measure $r^2dr$, which is inconvenient for self-adjoint operator theory. A standard unitary substitution removes both features.
Suppose
\begin{align*}
H=-\frac{\hbar^2}{2\mu}\Delta+V(r)
\end{align*}
acts formally on $L^2(\mathbb R^3)$, where $\mu>0$ is the reduced mass. If $\psi(r,\theta,\phi)=R(r)Y_{l,m}(\theta,\phi)$, then the eigenvalue equation $H\psi=E\psi$ becomes
\begin{align*}
-\frac{\hbar^2}{2\mu}\left(R''(r)+\frac{2}{r}R'(r)-\frac{l(l+1)}{r^2}R(r)\right)+V(r)R(r)=ER(r).
\end{align*}
The term
\begin{align*}
\frac{l(l+1)}{r^2}
\end{align*}
is repulsive and prevents higher angular momentum states from concentrating as strongly near the origin.
[definition: Reduced Radial Wavefunction]
The reduced radial transform is the unitary map
\begin{align*}
U_l:L^2((0,\infty),r^2\,dr)\to L^2(0,\infty),\qquad U_l(R)=u,
\end{align*}
where for a representative radial factor $R:(0,\infty)\to\mathbb C$,
\begin{align*}
u:(0,\infty)\to\mathbb C,\qquad u:r\mapsto rR(r).
\end{align*}
[/definition]
The substitution changes the radial norm by
\begin{align*}
\int_0^\infty |R(r)|^2r^2\,dr=\int_0^\infty |u(r)|^2\,dr.
\end{align*}
It therefore moves the radial sector into the ordinary Hilbert space $L^2(0,\infty)$. After this unitary change of variables, the remaining issue is not just the differential expression but also the boundary behaviour at $r=0$. Each angular momentum sector should become a precise half-line eigenvalue problem with the centrifugal term and the regularity condition inherited from three-dimensional wavefunctions.
[quotetheorem:6960]
The reduced equation is a one-dimensional Schrödinger equation on the half-line with an effective potential. The boundary condition is not a cosmetic addition: it records which self-adjoint radial operator is obtained from regular three-dimensional wavefunctions. Square-integrability of $u$ alone does not always exclude singular local behaviour at the origin, especially for singular potentials or for alternative self-adjoint extensions in the $l=0$ channel. For the free $l=0$ half-line equation, a boundary condition such as $u'(0)=\gamma u(0)$ with finite $\gamma$ gives a point-interaction extension at the origin rather than the regular three-dimensional radial operator; the constant local behaviour $u(r)\sim 1$ then corresponds to $R(r)\sim 1/r$, which is not a regular wavefunction at the origin. The formal ODE is therefore only part of the operator problem; the domain and boundary condition determine which radial solutions correspond to physical states of the original Hamiltonian.
[example: Radial Equation for $l=0$]
For $l=0$, we have $l(l+1)=0\cdot 1=0$, so the centrifugal contribution in the reduced radial equation is
\begin{align*}
\frac{\hbar^2 l(l+1)}{2\mu r^2}u(r)=\frac{\hbar^2\cdot 0}{2\mu r^2}u(r)=0.
\end{align*}
Thus the reduced half-line equation becomes
\begin{align*}
-\frac{\hbar^2}{2\mu}u''(r)+V(r)u(r)=Eu(r),
\end{align*}
with the regular radial boundary condition
\begin{align*}
u(0)=0.
\end{align*}
The associated spherical harmonic is $Y_{0,0}$, so the angular factor is constant on $S^2$ and the states in this sector are spherically symmetric. For the Coulomb potential $V(r)=-\kappa/r$ with $\kappa>0$, the effective potential in the $l=0$ sector is exactly
\begin{align*}
V_{\mathrm{eff},0}(r)=-\frac{\kappa}{r}+0=-\frac{\kappa}{r}.
\end{align*}
There is therefore no positive inverse-square angular barrier opposing the attractive singularity at the origin, which is why the lowest Coulomb bound state occurs in the spherically symmetric $l=0$ sector.
[/example]
For $l>0$, the same attractive potential competes with a positive inverse-square term. This changes the behaviour near $r=0$: the radial factor is forced to vanish at the origin, to an order that grows with $l$.
[example: Radial Equation for $l>0$]
For $l\ge 1$, the effective potential in the reduced radial equation is
\begin{align*}
V_{\mathrm{eff},l}(r)=V(r)+\frac{\hbar^2l(l+1)}{2\mu r^2}.
\end{align*}
For the Coulomb potential $V(r)=-\kappa/r$, this becomes
\begin{align*}
V_{\mathrm{eff},l}(r)=-\frac{\kappa}{r}+\frac{\hbar^2l(l+1)}{2\mu r^2}.
\end{align*}
To compare the two singular terms near $r=0$, divide the inverse-square term by the Coulomb term in absolute value:
\begin{align*}
\frac{\frac{\hbar^2l(l+1)}{2\mu r^2}}{\frac{\kappa}{r}}=\frac{\hbar^2l(l+1)}{2\mu\kappa}\frac{1}{r}.
\end{align*}
Since $l\ge 1$, the constant $\hbar^2l(l+1)/(2\mu\kappa)$ is positive, and the ratio tends to $+\infty$ as $r\to 0^+$. Thus the inverse-square centrifugal term controls the leading behaviour at the origin.
Keeping only the leading singular terms gives the leading-order equation near the origin, from which the indicial exponents are read off:
\begin{align*}
-\frac{\hbar^2}{2\mu}u''(r)+\frac{\hbar^2l(l+1)}{2\mu r^2}u(r)=0.
\end{align*}
Multiply by $2\mu/\hbar^2$:
\begin{align*}
-u''(r)+\frac{l(l+1)}{r^2}u(r)=0.
\end{align*}
Try a power $u(r)=r^\alpha$. Then
\begin{align*}
u''(r)=\alpha(\alpha-1)r^{\alpha-2}.
\end{align*}
Substitution gives
\begin{align*}
-\alpha(\alpha-1)r^{\alpha-2}+l(l+1)r^{\alpha-2}=0.
\end{align*}
For $r>0$, divide by $r^{\alpha-2}$:
\begin{align*}
\alpha(\alpha-1)=l(l+1).
\end{align*}
The two roots are $\alpha=l+1$ and $\alpha=-l$, because
\begin{align*}
(l+1)l=l(l+1)
\end{align*}
and
\begin{align*}
(-l)(-l-1)=l(l+1).
\end{align*}
Regularity at the origin selects $u(r)\sim r^{l+1}$ rather than $u(r)\sim r^{-l}$. Since $u(r)=rR(r)$, this gives
\begin{align*}
R(r)=\frac{u(r)}{r}\sim r^l.
\end{align*}
Thus higher angular momentum forces the radial wavefunction to vanish to higher order at the origin.
[/example]
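The exponent calculation in the example can be checked mechanically. In this Python sketch the function names are hypothetical; `square_integrable_near_zero` encodes the elementary fact that $r^{2\alpha}$ is integrable near $0$ exactly when $2\alpha>-1$, which is one way to see why the root $\alpha=-l$ is excluded for $l\ge 1$.

```python
import math


def indicial_roots(l):
    """Roots of alpha*(alpha - 1) = l*(l + 1) for a nonnegative integer l."""
    disc = 1 + 4 * l * (l + 1)  # equals (2l + 1)^2
    root = math.isqrt(disc)
    return (1 + root) // 2, (1 - root) // 2


def leading_residual(alpha, l, r):
    """Residual of -u'' + l(l+1) u / r^2 for u(r) = r^alpha, using the exact u''."""
    u = r ** alpha
    u_second = alpha * (alpha - 1) * r ** (alpha - 2)
    return -u_second + l * (l + 1) * u / r ** 2


def square_integrable_near_zero(alpha):
    """|r^alpha|^2 is integrable on (0, 1) exactly when 2*alpha > -1."""
    return 2 * alpha > -1


roots = {l: indicial_roots(l) for l in range(4)}
```

For $l=0$ the second root is $\alpha=0$, which is square-integrable near the origin; in that channel it is the boundary condition $u(0)=0$, not integrability, that removes the second solution.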
The central hypothesis is also necessary for the reduction into independent $l,m$ sectors. If a perturbation such as $\varepsilon x_3$ is added to the potential, the multiplication operator is no longer constant on spheres and it mixes spherical harmonics with different angular momenta. The radial equations then become coupled equations rather than separate half-line eigenvalue problems.
## The Coulomb Hamiltonian
The hydrogen atom is the central potential problem with $V(r)=-\kappa/r$. The spectral question is not only to solve the differential equation, but also to know which self-adjoint operator the formal expression defines. This matters because singular potentials and unbounded differential operators are not specified by formulas alone.
[definition: Coulomb Hamiltonian]
For $\kappa>0$ and $\mu,\hbar>0$, the Coulomb Hamiltonian is the self-adjoint operator
\begin{align*}
H_C:\mathcal D(H_C)\subset L^2(\mathbb R^3)\to L^2(\mathbb R^3)
\end{align*}
associated with the closure of the operator initially defined on $C_c^\infty(\mathbb R^3)$ by
\begin{align*}
H_C\psi=-\frac{\hbar^2}{2\mu}\Delta\psi-\frac{\kappa}{|x|}\psi.
\end{align*}
[/definition]
By Kato-Rellich, this operator may be realised on the Sobolev domain $\mathcal D(H_C)=H^2(\mathbb R^3)$ with the same differential expression. The definition gives both the expression and the Hilbert-space setting, because a quantum observable must be a self-adjoint operator with a domain. The singularity at the origin is the only possible obstruction. This motivates the following theorem because it states that the formal Coulomb expression determines a unique physical Hamiltonian.
[quotetheorem:6961]
The operator-theoretic input is the relative smallness of the Coulomb singularity compared with $-\Delta$. Hardy-type estimates show that multiplication by $|x|^{-1}$ is relatively bounded with respect to $-\Delta$ with relative bound zero, and the Kato-Rellich theorem then gives self-adjointness on the domain of $-\Delta$. The distinction between $C_c^\infty(\mathbb R^3)$ and a punctured test-function domain matters because removing the origin can introduce additional extension questions that are not part of the hydrogen Hamiltonian used here.
The self-adjoint operator commutes with rotations, so its spectral subspaces can be studied through the reduced radial operators. Bound states correspond to negative eigenvalues and square-integrable radial solutions.
## Bound States and the Hydrogen Energy Formula
The central calculation is to solve the Coulomb radial equation for $E<0$. The normalisability condition at infinity and regularity condition at the origin together force a quantisation rule. This is where the principal quantum number appears.
For the Coulomb potential, the reduced radial equation is
\begin{align*}
-\frac{\hbar^2}{2\mu}u''(r)
+\left(\frac{\hbar^2l(l+1)}{2\mu r^2}-\frac{\kappa}{r}\right)u(r)
=Eu(r).
\end{align*}
For $E<0$, write
\begin{align*}
\rho=\frac{2r}{na_0},\qquad a_0=\frac{\hbar^2}{\mu\kappa},
\end{align*}
where $a_0$ is the Bohr radius and the parameter $n>0$ is defined by $E=-\kappa/(2a_0n^2)$; the boundary conditions will force $n$ to be a positive integer.
[quotetheorem:6962]
The restrictions in the theorem come from the two endpoints of the radial problem and from the operator being solved. The assumption $E<0$ is essential because it produces exponential decay at infinity; for $E\ge 0$ the solutions belong to the continuous spectral regime rather than to normalisable bound states. The constants $\kappa>0$, $\mu>0$, and $\hbar>0$ fix an attractive Coulomb problem with a positive kinetic-energy coefficient and a nonzero quantum scale; if $\kappa\le 0$ there is no attractive hydrogenic bound-state series, while changing the sign of $\mu$ or allowing $\hbar=0$ would no longer define the same quantum Hamiltonian. Self-adjointness and the regular radial domain are also part of the hypothesis, since different point-interaction extensions at the origin can add spectral data not described by the displayed formula. The pure Coulomb form is essential as well: adding spin, relativistic fine structure, finite nuclear size, or any non-Coulomb perturbation changes the operator and can shift or split these levels.

The energy formula has introduced three labels whose ranges are tied together by normalisability. To use these labels consistently in degeneracy counts and orbital notation, we name them as quantum numbers. This motivates the following definition because it packages the permitted indices for every bound state.
[definition: Hydrogen Quantum Numbers]
A hydrogen bound state is labelled by integers $(n,l,m)$ satisfying
\begin{align*}
n\ge 1,
\qquad 0\le l\le n-1,
\qquad -l\le m\le l.
\end{align*}
[/definition]
Here $n$ is the principal quantum number, $l$ is the orbital angular momentum quantum number, and $m$ is the magnetic quantum number.
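The quantisation can also be observed numerically. The sketch below is a rough check, not a production solver: it assumes atomic units $\hbar=\mu=\kappa=1$ (so $a_0=1$), assumes that the energy formula in the theorem is the standard $E_n=-\kappa/(2a_0n^2)$, and uses hypothetical function names. It integrates the reduced radial equation outward with the Numerov scheme from the regular series $u\approx r^{l+1}(1-r/(l+1))$ and bisects on the sign of $u(r_{\max})$.

```python
def u_end(energy, l, h=0.005, r_max=30.0):
    """Shoot u'' = g(r) u outward from the origin and return u(r_max).

    Atomic units hbar = mu = kappa = 1 are an assumption of this sketch.
    """
    def g(r):
        return l * (l + 1) / (r * r) - 2.0 / r - 2.0 * energy

    def start(r):
        # regular behaviour near the origin: r^(l+1) * (1 - r/(l+1))
        return r ** (l + 1) * (1.0 - r / (l + 1))

    steps = int(round(r_max / h))
    u_prev, u_cur = start(h), start(2 * h)
    w_prev = 1.0 - h * h * g(h) / 12.0
    w_cur = 1.0 - h * h * g(2 * h) / 12.0
    for k in range(3, steps + 1):
        w_next = 1.0 - h * h * g(k * h) / 12.0
        # Numerov step: w_next*u_next = (12 - 10*w_cur)*u_cur - w_prev*u_prev
        u_next = ((12.0 - 10.0 * w_cur) * u_cur - w_prev * u_prev) / w_next
        u_prev, u_cur = u_cur, u_next
        w_prev, w_cur = w_cur, w_next
    return u_cur


def bound_energy(l, e_lo, e_hi, iterations=40):
    """Bisect on the sign of u(r_max); the bracket must contain one eigenvalue."""
    f_lo = u_end(e_lo, l)
    for _ in range(iterations):
        e_mid = 0.5 * (e_lo + e_hi)
        f_mid = u_end(e_mid, l)
        if f_lo * f_mid <= 0.0:
            e_hi = e_mid
        else:
            e_lo, f_lo = e_mid, f_mid
    return 0.5 * (e_lo + e_hi)


# Keys are (n, l); the brackets are chosen by hand around the expected levels.
computed = {
    (1, 0): bound_energy(0, -0.6, -0.4),
    (2, 0): bound_energy(0, -0.2, -0.1),
    (2, 1): bound_energy(1, -0.2, -0.1),
}
```

Under these assumptions the computed values should lie close to $-1/2$ for $(n,l)=(1,0)$ and close to $-1/8$ for both $(2,0)$ and $(2,1)$, so the $l$-independence of the Coulomb levels is visible in the numerics.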
These labels also organise the eigenspace dimension. The remaining counting problem is that the energy depends on $n$ while the admissible values of $l$ and $m$ vary inside that fixed level. Summing those angular-momentum multiplicities determines how many independent spatial bound states share one Coulomb energy and separates ordinary rotational degeneracy from the additional Coulomb degeneracy.
[quotetheorem:6963]
[citeproof:6963]
This degeneracy is larger than the degeneracy forced by rotations alone. Rotational symmetry accounts for the $2l+1$ states with fixed $l$, while the equality of energies across different $l$ values is a special Coulomb phenomenon. The $n^2$ count assumes a spinless particle, the regular self-adjoint Coulomb Hamiltonian, and the exact $-\kappa/r$ potential. These hypotheses cannot be dropped without changing the count: a central harmonic oscillator or a finite-size nuclear potential is still rotationally invariant, but its radial spectrum depends on $l$ in a different way, so only the fixed-$l$ magnetic degeneracy is forced. Spin doubles the spatial count before spin-orbit effects, fine structure separates some levels, external electric or magnetic fields split the $m$ labels, and a general central potential usually leaves only the rotational $2l+1$ degeneracy.
[example: First Hydrogen Levels]
For $n=1$, the allowed angular momenta satisfy $0\le l\le n-1=0$, so $l=0$. For this value of $l$, the magnetic label satisfies $-0\le m\le 0$, hence $m=0$. Thus the $n=1$ eigenspace has exactly one spatial state:
\begin{align*}
2l+1=2\cdot 0+1=1.
\end{align*}
Since the angular factor is $Y_{0,0}$, which is constant on $S^2$, the ground state is spherically symmetric.
For $n=2$, the allowed angular momenta satisfy $0\le l\le n-1=1$, so $l=0$ or $l=1$. When $l=0$, the magnetic labels again give only $m=0$, contributing
\begin{align*}
2\cdot 0+1=1
\end{align*}
state. When $l=1$, the magnetic labels are $m=-1,0,1$, contributing
\begin{align*}
2\cdot 1+1=3
\end{align*}
states. Therefore the total spatial degeneracy at $n=2$ is
\begin{align*}
1+3=4.
\end{align*}
The $l=0$ state is the $2s$ orbital, and the three $l=1$ states are the $2p$ orbitals. Rotational symmetry accounts for the three magnetic states inside the $2p$ sector, while the fact that the $2s$ and $2p$ orbitals have the same energy is the additional Coulomb degeneracy, before spin is included.
[/example]
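The counting in the example extends to any level by enumerating the admissible labels from the definition of hydrogen quantum numbers. The function name `hydrogen_labels` in this short Python sketch is a hypothetical choice.

```python
def hydrogen_labels(n):
    """All (n, l, m) with 0 <= l <= n-1 and -l <= m <= l."""
    return [(n, l, m) for l in range(n) for m in range(-l, l + 1)]


# Spatial degeneracy of the first few levels, to compare with n^2.
degeneracy = {n: len(hydrogen_labels(n)) for n in range(1, 6)}

# Multiplicities 2l+1 inside the n = 2 level: one 2s state and three 2p states.
n2_by_l = {l: sum(1 for (_, ll, _) in hydrogen_labels(2) if ll == l) for l in range(2)}
```

The dictionary `degeneracy` should agree with $n^2$ for each listed $n$, and `n2_by_l` separates the $2s$ contribution from the three $2p$ states.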
## Hydrogen Orbitals and Spectral Structure
The next question is what the eigenfunctions look like and how the bound states sit inside the full spectrum. The radial equation gives discrete normalisable states at negative energy, but the Hamiltonian also has scattering states at nonnegative energy.
[definition: Hydrogen Orbital]
A hydrogen orbital is an element
\begin{align*}
\psi_{n,l,m}\in\mathcal D(H_C)\subset L^2(\mathbb R^3)
\end{align*}
satisfying
\begin{align*}
H_C\psi_{n,l,m}=E_n\psi_{n,l,m}
\end{align*}
and admitting the spherical-coordinate representation
\begin{align*}
\psi_{n,l,m}(r,\theta,\phi)=R_{n,l}(r)Y_{l,m}(\theta,\phi),
\end{align*}
where $(n,l,m)$ are hydrogen quantum numbers and $R_{n,l}$ is the normalised radial factor.
[/definition]
The angular part fixes nodal planes and magnetic behaviour, while the radial part fixes radial nodes and decay. The number of radial nodes is $n-l-1$, matching the Laguerre polynomial degree in the proof of the energy formula.
[example: Shapes of Low-Lying Orbitals]
Using the standard hydrogen radial factors, written up to positive normalisation constants $C_{nl}$, the three lowest radial shapes can be read off directly. For the $1s$ orbital,
\begin{align*}
R_{1,0}(r)=C_{10}e^{-r/a_0}.
\end{align*}
Since $C_{10}>0$ and $e^{-r/a_0}>0$ for every $r>0$, the $1s$ radial factor has no positive zero. Its angular factor is $Y_{0,0}=1/\sqrt{4\pi}$, so $\psi_{1,0,0}(r,\theta,\phi)=R_{1,0}(r)Y_{0,0}$ is independent of $\theta$ and $\phi$, hence spherically symmetric.
For the $2s$ orbital,
\begin{align*}
R_{2,0}(r)=C_{20}\left(2-\frac{r}{a_0}\right)e^{-r/(2a_0)}.
\end{align*}
The exponential factor is positive for $r>0$, so the zeros come exactly from
\begin{align*}
2-\frac{r}{a_0}=0.
\end{align*}
Solving gives
\begin{align*}
r=2a_0.
\end{align*}
Thus the $2s$ orbital has one radial node, the sphere of radius $2a_0$.
For the $2p$ orbitals, $n=2$ and $l=1$, so the allowed magnetic labels are $m=-1,0,1$. Their radial factor has the form
\begin{align*}
R_{2,1}(r)=C_{21}\frac{r}{a_0}e^{-r/(2a_0)}.
\end{align*}
Since
\begin{align*}
\frac{R_{2,1}(r)}{r}=\frac{C_{21}}{a_0}e^{-r/(2a_0)},
\end{align*}
the limit as $r\to 0^+$ is $C_{21}/a_0$, so $R_{2,1}(r)$ vanishes linearly at the origin. The three different $2p$ shapes come from the three angular factors $Y_{1,-1}$, $Y_{1,0}$, and $Y_{1,1}$; their nodes are angular rather than radial.
[/example]
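The radial factors in the example can be checked exactly using $\int_0^\infty r^k e^{-ar}\,dr=k!/a^{k+1}$. The Python sketch below assumes units with $a_0=1$ and drops the constants $C_{nl}$; the encoding of each factor as a coefficient list plus a decay rate, and the name `radial_overlap`, are illustrative choices.

```python
from fractions import Fraction
from math import factorial

# Radial factors in units a_0 = 1, without normalisation constants:
# (polynomial coefficients in r, exponential decay rate).
R10 = ([1], Fraction(1))         # e^{-r}
R20 = ([2, -1], Fraction(1, 2))  # (2 - r) e^{-r/2}
R21 = ([0, 1], Fraction(1, 2))   # r e^{-r/2}


def radial_overlap(a, b):
    """Exact integral of R_a(r) R_b(r) r^2 dr over (0, infinity)."""
    (pa, da), (pb, db) = a, b
    rate = da + db
    total = Fraction(0)
    for i, ca in enumerate(pa):
        for j, cb in enumerate(pb):
            k = i + j + 2  # power of r, including the r^2 weight
            total += ca * cb * Fraction(factorial(k)) / rate ** (k + 1)
    return total


same_l_overlap = radial_overlap(R10, R20)   # both factors have l = 0
norms = {name: radial_overlap(R, R) for name, R in
         [("1s", R10), ("2s", R20), ("2p", R21)]}
different_l_overlap = radial_overlap(R20, R21)
```

The $1s$ and $2s$ factors are orthogonal in $L^2((0,\infty),r^2\,dr)$ because they belong to the same $l=0$ radial operator. The $2s$ and $2p$ radial factors need not be orthogonal, since the corresponding orbitals are already separated by their angular factors.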
The spectral picture combines a discrete negative part with a continuous nonnegative part. The discrete levels accumulate only at zero from below, which is the ionisation threshold.
[quotetheorem:6964]
The spectral statement depends on both the Coulomb decay at infinity and the self-adjoint domain chosen above. Short-range perturbations can shift the negative eigenvalues, and confining potentials can replace the continuous part with a purely discrete spectrum. Conversely, changing the behaviour at the origin or adding point interactions changes the low angular momentum sector and can create extra bound states. The accumulation at zero has physical meaning: highly excited bound states require less energy to ionise. Positive energy states are not normalisable bound states; they describe scattering states in the continuous spectral subspace.
## Selection Rules from Angular Momentum
The final question in this chapter is how angular momentum restricts transitions between hydrogen orbitals. In perturbative quantum mechanics, transition amplitudes often contain matrix elements of position, and the angular part of those integrals imposes selection rules.
For electric dipole transitions, the relevant operator is multiplication by a component of $x$. Since $x$ transforms as an angular momentum $l=1$ object under rotations, it can only connect spherical harmonic sectors according to the angular momentum addition rules.
[quotetheorem:6965]
These rules are constraints, not guarantees. They rely on the electric dipole approximation, where the perturbing operator transforms as a vector under rotations and has odd parity. If higher multipole terms are retained, or if the perturbation does not transform as a vector, different angular momentum channels can appear. For instance, the quadrupole angular factor transforms like an $l=2$ spherical tensor, so the triangle rule first allows $l'$ between $|l-2|$ and $l+2$, while even parity then removes the opposite-parity channels. Thus quadrupole angular factors can allow changes such as $\Delta l=0$ or $\Delta l=\pm2$ that are absent for electric dipole transitions, subject also to the magnetic quantum number restrictions. Even when the angular integral is allowed, the radial integral or additional physical assumptions may make a particular transition vanish; for example, an allowed angular transition can still have a zero radial overlap for a special pair of radial states.
[example: Allowed and Forbidden Dipole Transitions]
For a $2p\to 1s$ dipole transition, the initial state has $n=2$ and $l=1$, while the final state has $n'=1$ and $l'=0$. Hence
\begin{align*}
\Delta l=l'-l=0-1=-1.
\end{align*}
The electric dipole angular selection rule allows exactly $l'=l\pm 1$. Since
\begin{align*}
l-1=1-1=0,
\end{align*}
the value $l'=0$ is allowed. The magnetic labels are also compatible: a $2p$ state has $m=-1,0,1$, while a $1s$ state has $m'=0$; for $m=-1$ one has $m'=m+1$, for $m=0$ one has $m'=m$, and for $m=1$ one has $m'=m-1$. Thus angular momentum does not force the $2p\to 1s$ dipole matrix element to vanish.
For a $2s\to 1s$ transition, both states have $l=0$. Therefore
\begin{align*}
\Delta l=l'-l=0-0=0.
\end{align*}
But the electric dipole rule requires $l'=l+1$ or $l'=l-1$. When $l=0$, these two possibilities are
\begin{align*}
l+1=1
\end{align*}
and
\begin{align*}
l-1=-1.
\end{align*}
The final value $l'=0$ is neither $1$ nor $-1$, so the dipole angular selection rule forbids the $2s\to 1s$ transition. Thus angular momentum alone separates the allowed $2p\to 1s$ channel from the forbidden $2s\to 1s$ channel, even before evaluating any radial integral.
[/example]
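The bookkeeping in the example can be written as a small predicate. This Python sketch assumes that the angular rule in the theorem is $\Delta l=\pm1$ together with $\Delta m\in\{-1,0,1\}$, as used in the example; the function names are hypothetical, and the check says nothing about the radial integral.

```python
def valid_label(l, m):
    """Angular labels must satisfy l >= 0 and -l <= m <= l."""
    return l >= 0 and -l <= m <= l


def dipole_angular_allowed(l, m, l_final, m_final):
    """Angular part of the electric dipole rule: |dl| = 1 and |dm| <= 1."""
    if not (valid_label(l, m) and valid_label(l_final, m_final)):
        return False
    return abs(l_final - l) == 1 and abs(m_final - m) <= 1


# 2p -> 1s for each magnetic label of the 2p state, and 2s -> 1s.
two_p_to_one_s = [dipole_angular_allowed(1, m, 0, 0) for m in (-1, 0, 1)]
two_s_to_one_s = dipole_angular_allowed(0, 0, 0, 0)
```

A `True` value means only that angular momentum does not force the matrix element to vanish; the radial overlap must still be evaluated separately.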
Central potentials turn rotational symmetry into a practical method: decompose into spherical harmonics, solve a half-line radial equation, and then rebuild the three-dimensional eigenfunctions. For the Coulomb potential, the exact solvability of the radial equation gives the hydrogen energy levels, their $n^2$ spatial degeneracy, and the orbital labels used throughout atomic quantum mechanics. The same separation method is also a recurring theme beyond quantum mechanics: in electrostatics it underlies multipole expansions for solutions of [Laplace's equation](/page/Laplace's%20Equation), in heat and wave equations on balls it separates radial Sturm-Liouville problems from angular harmonics, and in classical mechanics it mirrors the reduction of Kepler motion by conserved angular momentum. The hydrogen atom is therefore both a spectral calculation and a model example of how symmetry converts a multidimensional differential problem into structured one-dimensional pieces.
The hydrogen atom is a rare exactly solvable model, made tractable by symmetry and separation of variables. Perturbation theory begins where exact solvability ends, asking how spectra and states change when a Hamiltonian is modified slightly.
# 10. Perturbation Theory
Perturbation theory asks how the spectrum and eigenvectors of a Hamiltonian change when the system is slightly modified. In quantum mechanics this is the bridge between solvable models and the Hamiltonians that actually occur after external fields, weak interactions, or anharmonic terms are included. The previous chapters supplied the spectral and variational language for self-adjoint operators; here we use that language to compute approximate eigenvalues and to control those approximations.
The guiding situation is a self-adjoint operator $H_0:D(H_0)\subset\mathcal H\to\mathcal H$ with known spectral data and a perturbed operator
\begin{align*}
H(\lambda):D(H_0)\subset\mathcal H\to\mathcal H,\qquad H(\lambda)=H_0+\lambda V,
\end{align*}
where $\lambda\in\mathbb R$ is small and $V:D(V)\subset\mathcal H\to\mathcal H$ is another symmetric or self-adjoint operator. The main distinction is whether the eigenvalue of $H_0$ under study is simple or has multiplicity greater than one. Simple eigenvalues move like scalar quantities, while degenerate eigenspaces require diagonalising the perturbation inside the eigenspace before individual first-order shifts can be named.
## Nondegenerate Rayleigh-Schrödinger Theory
Suppose an eigenvalue of $H_0$ is isolated and simple. The first problem is to turn the formal expansion of an eigenvalue into equations that determine its coefficients. This section develops the nondegenerate perturbation series and explains why the first-order coefficient is an expectation value of the perturbing operator.
We begin with the local spectral setting in which the computation is meant to take place. The assumptions isolate one bound state from the rest of the spectrum, so that nearby perturbed eigenvalues can be followed as $\lambda$ varies.
[definition: Isolated Simple Eigenvalue]
Let $H_0:D(H_0)\subset\mathcal H\to\mathcal H$ be a self-adjoint operator on a Hilbert space $\mathcal H$. A number $E_0\in\mathbb R$ is an isolated simple eigenvalue of $H_0$ if $\ker(H_0-E_0 I)$ is one-dimensional and there exists $\delta>0$ such that
\begin{align*}
\sigma(H_0)\cap (E_0-\delta,E_0+\delta)=\{E_0\}.
\end{align*}
[/definition]
The isolation condition separates the chosen state from all competing spectral values. This raises the computational question of how the isolated eigenvalue begins to move when $V$ is switched on, and the first theorem gives the coefficient that appears in every later perturbation calculation.
[quotetheorem:6966]
The formula says that the leading energy shift is the expectation value of the perturbation in the unperturbed state. The second-order term measures virtual coupling to the other unperturbed states, weighted by the inverse spectral gaps. The domain and relative-boundedness hypotheses are not cosmetic: if $V\psi_0$ is not defined, the expression $(V\psi_0,\psi_0)_{\mathcal H}$ has no meaning, and if the perturbed operators do not share a controlled self-adjoint realisation, the formal series may fail to describe any spectral branch. The displayed sum also does not cover embedded eigenvalues or continuous-spectrum couplings without replacing the sum by a resolvent or reduced Green operator.
[example: Anharmonic Oscillator First Correction]
Consider the one-dimensional harmonic oscillator
\begin{align*}
H_0:D(H_0)\subset L^2(\mathbb R)\longrightarrow L^2(\mathbb R),\qquad H_0=-\frac{1}{2}\frac{d^2}{dx^2}+\frac{x^2}{2},
\end{align*}
with $D(H_0)=\{\psi\in L^2(\mathbb R):-\psi''+x^2\psi\in L^2(\mathbb R)\}$, where derivatives are understood in the distributional sense. Perturb it by the multiplication operator
\begin{align*}
V:D(V)\subset L^2(\mathbb R)\longrightarrow L^2(\mathbb R),\qquad (V\psi)(x)=x^4\psi(x),
\end{align*}
with $D(V)=\{\psi\in L^2(\mathbb R):x^4\psi\in L^2(\mathbb R)\}$. The ground state is $\psi_0(x)=\pi^{-1/4}e^{-x^2/2}$ with $E_0=1/2$, and $\psi_0\in D(H_0)\cap D(V)$ because Gaussian decay makes $-\psi_0''+x^2\psi_0$ and $x^4\psi_0$ square-integrable.
The first-order coefficient is the expectation value of $V$ in the unperturbed ground state:
\begin{align*}
E_1=(V\psi_0,\psi_0)_{L^2}.
\end{align*}
Since $(V\psi_0)(x)=x^4\pi^{-1/4}e^{-x^2/2}$ and $\overline{\psi_0(x)}=\pi^{-1/4}e^{-x^2/2}$, the integrand is
\begin{align*}
(V\psi_0)(x)\overline{\psi_0(x)}=\pi^{-1/2}x^4e^{-x^2}.
\end{align*}
Thus, writing $\mathcal L^1$ for Lebesgue measure on $\mathbb R$,
\begin{align*}
E_1=\frac{1}{\sqrt{\pi}}\int_{\mathbb R}x^4e^{-x^2}\,d\mathcal L^1(x).
\end{align*}
Let $I_0=\int_{\mathbb R}e^{-x^2}\,d\mathcal L^1(x)=\sqrt{\pi}$, the standard Gaussian integral. Integration by parts gives
\begin{align*}
\int_{\mathbb R}x^2e^{-x^2}\,d\mathcal L^1(x)=\frac{1}{2}\int_{\mathbb R}e^{-x^2}\,d\mathcal L^1(x)=\frac{\sqrt{\pi}}{2}.
\end{align*}
Applying integration by parts once more, with $d(e^{-x^2})=-2xe^{-x^2}\,dx$ and with the boundary term $x^3e^{-x^2}$ vanishing at $\pm\infty$, gives
\begin{align*}
\int_{\mathbb R}x^4e^{-x^2}\,d\mathcal L^1(x)=\frac{3}{2}\int_{\mathbb R}x^2e^{-x^2}\,d\mathcal L^1(x)=\frac{3\sqrt{\pi}}{4}.
\end{align*}
Therefore
\begin{align*}
E_1=\frac{1}{\sqrt{\pi}}\cdot\frac{3\sqrt{\pi}}{4}=\frac{3}{4}.
\end{align*}
The quartic anharmonic oscillator therefore has the formal ground-energy expansion $E(\lambda)=1/2+3\lambda/4+O(\lambda^2)$, so the first correction is obtained by evaluating one fourth moment of the known Gaussian ground-state density.
[/example]
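The value $E_1=3/4$ can be checked numerically. The following is a minimal sketch, assuming that truncating the real line to $[-12,12]$ and using a composite Simpson rule is accurate enough for a Gaussian-weighted integrand; the helper names `psi0` and `simpson` are illustrative and not part of the course notation.

```python
import math

def psi0(x):
    """Unperturbed ground state pi^{-1/4} exp(-x^2/2)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

# Assumption of this sketch: the interval [-12, 12] replaces the real line.
# The Gaussian factor makes the neglected tails negligible.
norm = simpson(lambda x: psi0(x) ** 2, -12.0, 12.0)
e1 = simpson(lambda x: x ** 4 * psi0(x) ** 2, -12.0, 12.0)

print("norm of psi0 squared:", norm)
print("first-order coefficient E_1:", e1)
```

The first printed number tests the normalisation of $\psi_0$, and the second is the fourth moment that defines $E_1$.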
The preceding derivation was formal, so we next record the analytic theorem that justifies power series expansions near isolated eigenvalues. It is the mathematical foundation behind the Rayleigh-Schrödinger procedure in the regular case.
[quotetheorem:6967]
For this course, the theorem is mostly used as a licence to compute coefficients by formal expansion. It also warns that when $m>1$, individual eigenvalue branches are not determined by a single vector expectation value; the whole spectral subspace must be treated together. Isolation is essential here: for a Schrödinger operator with an eigenvalue embedded in continuous spectrum, no contour can separate that value from the rest of $\sigma(H(0))$, and a small perturbation may turn the eigenvalue into a resonance rather than a nearby eigenvalue. Finite multiplicity is also essential: if an eigenvalue has infinite-dimensional eigenspace, the spectral projection has infinite rank and the local reduction is not a finite matrix problem. The type A analyticity assumption rules out domain motion and non-analytic dependence. A concrete model is the Laplacian $-\frac{d^2}{dx^2}$ on $L^2(0,1)$ with boundary condition $\psi'(0)=\alpha\psi(0)$ and $\psi(1)=0$: changing $\alpha$ changes the operator domain, although the differential expression is unchanged. Such families can often be handled by quadratic forms, but they are not analytic type A families on a fixed operator domain. The theorem also gives local information only; it does not prevent two analytic branches from meeting at a later value of $\lambda$ outside the neighbourhood where the contour argument applies.
## Degenerate Perturbation Theory
What changes when $E_0$ has multiplicity greater than one? The perturbation can split the eigenspace into different first-order energies, so the first task is to find the correct zeroth-order basis. That basis is not arbitrary: it is chosen by diagonalising the perturbation after projecting it onto the degenerate eigenspace.
Let $P_0$ denote the orthogonal projection onto the eigenspace $\mathcal E_0=\ker(H_0-E_0I)$. The operator $P_0VP_0$ is the effective first-order Hamiltonian on $\mathcal E_0$.
[quotetheorem:6968]
[citeproof:6968]
The result replaces a difficult infinite-dimensional problem by a finite-dimensional one. Its physical content is that a perturbation chooses preferred linear combinations inside a degenerate energy level. The finite multiplicity and isolation assumptions prevent leakage into nearby spectral values; without them, the projected matrix may miss first-order mixing with a continuum or with an accumulating sequence of eigenvalues. For instance, if $H_0$ is diagonal on $\ell^2(\mathbb N)$ with eigenvalues $E_0+1/n$ accumulating at $E_0$, then no contour isolates $E_0$ from the nearby levels, and a perturbation can mix the chosen vector with infinitely many almost resonant states. If $P_0VP_0$ is a scalar multiple of the identity on a two-dimensional eigenspace, the first-order shifts remain repeated and the splitting is postponed to higher order. Thus the theorem determines first-order slopes, but not every later separation of branches.
[example: Fine Splitting From A Matrix Perturbation]
Let $\mathcal E_0$ be a two-dimensional degenerate eigenspace with orthonormal basis $e_1,e_2$, and write $A=P_0VP_0$. Assume
\begin{align*}
Ae_1=ae_1+\bar b e_2
\end{align*}
and
\begin{align*}
Ae_2=be_1+de_2,
\end{align*}
where $a,d\in\mathbb R$ and $b\in\mathbb C$. Thus, in the ordered basis $(e_1,e_2)$, the compressed perturbation has entries $A_{11}=a$, $A_{21}=\bar b$, $A_{12}=b$, and $A_{22}=d$.
The first-order shifts are the eigenvalues of $A$ on $\mathcal E_0$. If $\mu$ is such an eigenvalue, the characteristic equation is
\begin{align*}
0=(a-\mu)(d-\mu)-b\bar b.
\end{align*}
Since $b\bar b=|b|^2$, this becomes
\begin{align*}
0=(a-\mu)(d-\mu)-|b|^2.
\end{align*}
Expanding the product gives
\begin{align*}
0=ad-a\mu-d\mu+\mu^2-|b|^2.
\end{align*}
Reordering powers of $\mu$ gives
\begin{align*}
0=\mu^2-(a+d)\mu+ad-|b|^2.
\end{align*}
By the [quadratic formula](/theorems/1301),
\begin{align*}
\mu=\frac{a+d\pm\sqrt{(a+d)^2-4(ad-|b|^2)}}{2}.
\end{align*}
The discriminant is
\begin{align*}
(a+d)^2-4(ad-|b|^2)=a^2+2ad+d^2-4ad+4|b|^2.
\end{align*}
Combining the middle terms gives
\begin{align*}
(a+d)^2-4(ad-|b|^2)=(a-d)^2+4|b|^2.
\end{align*}
Therefore
\begin{align*}
\mu=\frac{a+d}{2}\pm\sqrt{\left(\frac{a-d}{2}\right)^2+|b|^2}.
\end{align*}
Hence the two first-order shifts are
\begin{align*}
E_{1,+}=\frac{a+d}{2}+\sqrt{\left(\frac{a-d}{2}\right)^2+|b|^2}
\end{align*}
and
\begin{align*}
E_{1,-}=\frac{a+d}{2}-\sqrt{\left(\frac{a-d}{2}\right)^2+|b|^2}.
\end{align*}
For the corresponding zeroth-order states, let $u=xe_1+ye_2$. The equation $Au=\mu u$ is equivalent to
\begin{align*}
(a-\mu)x+by=0
\end{align*}
and
\begin{align*}
\bar b x+(d-\mu)y=0.
\end{align*}
When $b\ne0$, choose $x=b$ and $y=\mu-a$. The first equation is then satisfied because
\begin{align*}
(a-\mu)b+b(\mu-a)=0.
\end{align*}
For the second equation, substitute the same $x$ and $y$:
\begin{align*}
\bar b b+(d-\mu)(\mu-a)=|b|^2+(d-\mu)(\mu-a).
\end{align*}
Since $(\mu-a)=-(a-\mu)$, this is
\begin{align*}
|b|^2-(d-\mu)(a-\mu).
\end{align*}
The characteristic equation says $(a-\mu)(d-\mu)=|b|^2$, so the expression is $0$. Thus, for $b\ne0$, one may take
\begin{align*}
u_\mu=be_1+(\mu-a)e_2
\end{align*}
and then divide by $\|u_\mu\|$ to obtain a normalised eigenvector. If $b=0$, the compressed perturbation is already diagonal in the basis $e_1,e_2$, with shifts $a$ and $d$.
The perturbation therefore selects the eigenvectors of $P_0VP_0$ as the correct zeroth-order states. This is fine splitting: an initially repeated spectral line separates according to the eigenvalues of the perturbation restricted to the degenerate eigenspace.
[/example]
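The closed formulas of the example are easy to test on sample data. The sketch below assumes hypothetical entries $a=0.3$, $d=-0.5$, $b=0.4+0.2i$, computes the two first-order shifts, and checks that $u_\mu=be_1+(\mu-a)e_2$ satisfies $Au_\mu=\mu u_\mu$ up to rounding. The function names are illustrative.

```python
import math

def first_order_shifts(a, d, b):
    """Eigenvalues of the compressed perturbation with A11=a, A12=b, A21=conj(b), A22=d."""
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    return mean - radius, mean + radius

def apply_A(a, d, b, x, y):
    """Coordinates of A(x e1 + y e2), using A e1 = a e1 + conj(b) e2 and A e2 = b e1 + d e2."""
    return a * x + b * y, b.conjugate() * x + d * y

# Hypothetical sample entries, chosen only for illustration.
a, d, b = 0.3, -0.5, 0.4 + 0.2j

shifts = first_order_shifts(a, d, b)
residuals = []
for mu in shifts:
    x, y = b, mu - a  # unnormalised eigenvector u_mu = b e1 + (mu - a) e2
    ax, ay = apply_A(a, d, b, x, y)
    residuals.append(max(abs(ax - mu * x), abs(ay - mu * y)))

print("first-order shifts:", shifts)
print("eigenvector residuals:", residuals)
```

The sum of the two shifts equals the trace $a+d$, which gives a quick consistency check independent of the eigenvectors.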
Degenerate perturbation theory also explains finite-dimensional versions of the Stark effect, where an electric field adds a dipole-type matrix to a previously symmetric Hamiltonian. The next example keeps only a small spectral subspace, which is often how the calculation first appears in applications.
[example: Stark Effect In A Finite-Dimensional Model]
Let
\begin{align*}
H_0:\mathbb C^2\longrightarrow\mathbb C^2,\qquad H_0=E_0I_{\mathbb C^2},
\end{align*}
with standard orthonormal basis $e_1,e_2$, and define $V$ by $Ve_1=\bar\alpha e_2$ and $Ve_2=\alpha e_1$. We compute the spectrum of $H(\lambda)=H_0+\lambda V$ by first diagonalising $V$ on the degenerate eigenspace $\mathbb C^2$.
Let $u=xe_1+ye_2$. By linearity,
\begin{align*}
Vu=xVe_1+yVe_2.
\end{align*}
Substituting the defining values of $V$ gives
\begin{align*}
Vu=x\bar\alpha e_2+y\alpha e_1.
\end{align*}
Thus the eigenvalue equation $Vu=\mu u$ is equivalent to the two scalar equations
\begin{align*}
\alpha y=\mu x
\end{align*}
and
\begin{align*}
\bar\alpha x=\mu y.
\end{align*}
If $\alpha=0$, then $V=0$, so both eigenvalues of $H(\lambda)$ are $E_0$. Assume now that $\alpha\ne0$. Multiplying the first scalar equation by $\bar\alpha$ gives
\begin{align*}
|\alpha|^2y=\mu\bar\alpha x.
\end{align*}
Using $\bar\alpha x=\mu y$ in the right-hand side gives
\begin{align*}
|\alpha|^2y=\mu^2y.
\end{align*}
For a nonzero eigenvector with $y\ne0$, this implies
\begin{align*}
\mu^2=|\alpha|^2.
\end{align*}
Hence
\begin{align*}
\mu=|\alpha|
\end{align*}
or
\begin{align*}
\mu=-|\alpha|.
\end{align*}
If $y=0$, then $\alpha y=\mu x$ gives $\mu x=0$, while $\bar\alpha x=\mu y$ gives $\bar\alpha x=0$; since $\alpha\ne0$, this forces $x=0$, contradicting $u\ne0$. Thus no eigenvalue has been missed.
Because $H_0=E_0I_{\mathbb C^2}$, an eigenvector of $V$ with eigenvalue $\mu$ satisfies
\begin{align*}
H(\lambda)u=E_0u+\lambda\mu u.
\end{align*}
Therefore
\begin{align*}
H(\lambda)u=(E_0+\lambda\mu)u.
\end{align*}
The two eigenvalues of $H(\lambda)$ are consequently
\begin{align*}
E_0+\lambda|\alpha|
\end{align*}
and
\begin{align*}
E_0-\lambda|\alpha|.
\end{align*}
To see the eigenvectors explicitly, write $\alpha=|\alpha|e^{i\theta}$. For
\begin{align*}
u_+=e^{i\theta}e_1+e_2,
\end{align*}
linearity gives
\begin{align*}
Vu_+=e^{i\theta}Ve_1+Ve_2.
\end{align*}
Substituting $Ve_1=\bar\alpha e_2$ and $Ve_2=\alpha e_1$ gives
\begin{align*}
Vu_+=e^{i\theta}\bar\alpha e_2+\alpha e_1.
\end{align*}
Since $\bar\alpha=|\alpha|e^{-i\theta}$ and $\alpha=|\alpha|e^{i\theta}$, this becomes
\begin{align*}
Vu_+=|\alpha|e_2+|\alpha|e^{i\theta}e_1.
\end{align*}
Hence
\begin{align*}
Vu_+=|\alpha|u_+.
\end{align*}
Similarly, for
\begin{align*}
u_-=e^{i\theta}e_1-e_2,
\end{align*}
we have
\begin{align*}
Vu_-=e^{i\theta}Ve_1-Ve_2.
\end{align*}
Substitution gives
\begin{align*}
Vu_-=e^{i\theta}\bar\alpha e_2-\alpha e_1.
\end{align*}
Using the same phase identities,
\begin{align*}
Vu_-=|\alpha|e_2-|\alpha|e^{i\theta}e_1.
\end{align*}
Therefore
\begin{align*}
Vu_-=-|\alpha|u_-.
\end{align*}
The normalised eigenvectors are $2^{-1/2}u_+$ and $2^{-1/2}u_-$. Thus the degenerate level $E_0$ splits linearly into two branches separated by $2|\lambda||\alpha|$, and the preferred zeroth-order states are the phase-dependent symmetric and antisymmetric combinations selected by the perturbing dipole matrix.
[/example]
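As a numerical illustration of the phase-dependent combinations $u_\pm$, the following sketch applies $H(\lambda)=E_0I+\lambda V$ to the normalised vectors $2^{-1/2}u_\pm$ and measures the eigenvalue residual. The values of `E0`, `lam`, and `alpha` are hypothetical sample data.

```python
import cmath
import math

def apply_V(alpha, x, y):
    """Coordinates of V(x e1 + y e2), using V e1 = conj(alpha) e2 and V e2 = alpha e1."""
    return alpha * y, alpha.conjugate() * x

# Hypothetical sample values, chosen only for illustration.
E0, lam, alpha = 1.0, 0.1, 0.6 - 0.8j

theta = cmath.phase(alpha)          # alpha = |alpha| exp(i theta)
phase = cmath.exp(1j * theta)

branches = []
for sign in (+1, -1):
    # Normalised u_plus or u_minus in coordinates with respect to (e1, e2).
    x, y = phase / math.sqrt(2), sign / math.sqrt(2)
    vx, vy = apply_V(alpha, x, y)
    hx, hy = E0 * x + lam * vx, E0 * y + lam * vy
    energy = E0 + sign * lam * abs(alpha)
    residual = max(abs(hx - energy * x), abs(hy - energy * y))
    branches.append((energy, residual))

gap = branches[0][0] - branches[1][0]
print("branches (energy, residual):", branches)
print("gap:", gap)
```

The gap printed at the end is the separation $2\lambda|\alpha|$ of the two branches for the chosen positive $\lambda$.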
The finite-dimensional examples display the whole first-order story. In infinite-dimensional systems the same projection argument applies to an isolated finite-multiplicity eigenvalue, but care is needed with domains and with whether the perturbation is relatively bounded with respect to $H_0$.
## Hellmann-Feynman Differentiation
Once an eigenvalue branch is known to vary differentiably, the next question is how to differentiate it without differentiating the eigenvector in detail. The [Hellmann-Feynman theorem](/theorems/6969) gives a direct formula: the derivative of the energy is the expectation value of the derivative of the Hamiltonian.
[quotetheorem:6969]
[citeproof:6969]
This theorem packages the first-order perturbation formula into a differential identity. When
\begin{align*}
H(\lambda)=H_0+\lambda V
\end{align*}
on a common operator domain, evaluating at $\lambda=0$ recovers, for a simple eigenvalue,
\begin{align*}
E'(0)=(V\psi_0,\psi_0)_{\mathcal H}.
\end{align*}
The common-domain hypothesis is what permits differentiating the operator equation itself; in form perturbation problems the same idea must be rewritten using quadratic forms. A particle in a box with a moving wall is a typical failure mode: after rescaling to a fixed interval the differential expression acquires parameter-dependent coefficients and boundary terms, and differentiating the untransformed operator domain is not a legitimate operation. Differentiability of the eigenbranch is also a real hypothesis: at a level crossing in a two-state model, the ordered eigenvalues may have a corner even though the matrix entries are smooth. Simplicity is used to avoid this ambiguity; for a multiple eigenvalue, differentiating a chosen vector branch depends on the basis inside the eigenspace, and the invariant first-order data are the eigenvalues of the compressed perturbation.
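In finite dimensions the identity $E'(0)=(V\psi_0,\psi_0)_{\mathcal H}$ can be tested by a finite difference. The sketch below assumes a hypothetical real symmetric two-level model with $H_0e_1=0$, $H_0e_2=e_2$ and perturbation entries `v11`, `v12`, `v22`, so that $\psi_0=e_1$ and the predicted slope is `v11`.

```python
import math

def lower_eigenvalue(lam, v11=0.7, v12=0.3, v22=-0.2):
    """Lower eigenvalue of H0 + lam*V with H0 = diag(0, 1) and V real symmetric."""
    a = lam * v11          # (1,1) entry of H(lam)
    d = 1.0 + lam * v22    # (2,2) entry of H(lam)
    b = lam * v12          # off-diagonal entry of H(lam)
    return 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b * b)

# Central difference approximation of E'(0); the step size is an assumption.
h = 1e-5
slope = (lower_eigenvalue(h) - lower_eigenvalue(-h)) / (2 * h)

print("finite-difference slope:", slope)
print("expectation value (V e1, e1):", 0.7)
```

The unperturbed level $0$ is simple and separated from the level $1$, so the branch is differentiable at $\lambda=0$ and the comparison is legitimate.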
[example: Parameter-Dependent Oscillator]
For $\omega>0$, consider
\begin{align*}
H(\omega):D(H(\omega))\subset L^2(\mathbb R)\longrightarrow L^2(\mathbb R),\qquad H(\omega)=-\frac{1}{2}\frac{d^2}{dx^2}+\frac{\omega^2x^2}{2}.
\end{align*}
Its closed quadratic form is
\begin{align*}
q_\omega[\psi]=\frac{1}{2}\|\psi'\|_{L^2}^2+\frac{\omega^2}{2}\|x\psi\|_{L^2}^2,\qquad Q=\{\psi\in H^1(\mathbb R):x\psi\in L^2(\mathbb R)\}.
\end{align*}
The form domain $Q$ is independent of $\omega$, so differentiating the form is legitimate on this fixed energy space.
For $\psi\in Q$, compute the difference quotient:
\begin{align*}
\frac{q_{\omega+h}[\psi]-q_\omega[\psi]}{h}=\frac{1}{h}\left(\frac{(\omega+h)^2}{2}-\frac{\omega^2}{2}\right)\|x\psi\|_{L^2}^2.
\end{align*}
Expanding the numerator gives
\begin{align*}
\frac{(\omega+h)^2-\omega^2}{2h}=\frac{\omega^2+2\omega h+h^2-\omega^2}{2h}.
\end{align*}
Cancelling the $\omega^2$ terms and dividing by $h$ gives
\begin{align*}
\frac{2\omega h+h^2}{2h}=\omega+\frac{h}{2}.
\end{align*}
Letting $h\to0$, we obtain
\begin{align*}
q_\omega'[\psi]=\omega\|x\psi\|_{L^2}^2.
\end{align*}
The oscillator spectrum is
\begin{align*}
E_n(\omega)=\omega\left(n+\frac{1}{2}\right).
\end{align*}
Differentiating this scalar expression gives
\begin{align*}
\frac{dE_n}{d\omega}=n+\frac{1}{2}.
\end{align*}
Applying the form version of the *Hellmann-Feynman Theorem* to a normalised eigenfunction $\psi_{n,\omega}$ gives
\begin{align*}
\frac{dE_n}{d\omega}=q_\omega'[\psi_{n,\omega}].
\end{align*}
Substituting the form derivative gives
\begin{align*}
\frac{dE_n}{d\omega}=\omega\|x\psi_{n,\omega}\|_{L^2}^2.
\end{align*}
Combining this with $\frac{dE_n}{d\omega}=n+\frac{1}{2}$ yields
\begin{align*}
\omega\|x\psi_{n,\omega}\|_{L^2}^2=n+\frac{1}{2}.
\end{align*}
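The norm on the left is a second moment of the density $|\psi_{n,\omega}|^2$, which is a probability density because $\psi_{n,\omega}$ is normalised: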
\begin{align*}
\|x\psi_{n,\omega}\|_{L^2}^2=\int_{\mathbb R}x^2|\psi_{n,\omega}(x)|^2\,d\mathcal L^1(x).
\end{align*}
Thus $\omega\mathbb E_n[x^2]=n+\frac{1}{2}$ in the $n$th stationary state. The example shows that differentiating an eigenvalue with respect to a parameter can recover an expectation value of an observable.
[/example]
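The identity $\omega\mathbb E_n[x^2]=n+\frac12$ can be checked directly for the two lowest states. The sketch below assumes the standard explicit eigenfunctions $\psi_{0,\omega}(x)=(\omega/\pi)^{1/4}e^{-\omega x^2/2}$ and $\psi_{1,\omega}(x)=\sqrt{2\omega}\,x\,\psi_{0,\omega}(x)$, a hypothetical frequency $\omega=2.5$, and a truncation of the real line to $[-10,10]$.

```python
import math

def psi(n, omega, x):
    """Normalised oscillator eigenfunctions for n = 0 and n = 1 (standard formulas)."""
    g = (omega / math.pi) ** 0.25 * math.exp(-0.5 * omega * x * x)
    if n == 0:
        return g
    if n == 1:
        return math.sqrt(2.0 * omega) * x * g
    raise ValueError("only n = 0 and n = 1 are implemented in this sketch")

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

omega = 2.5  # hypothetical sample frequency
results = {}
for n in (0, 1):
    second_moment = simpson(lambda x: x * x * psi(n, omega, x) ** 2, -10.0, 10.0)
    results[n] = omega * second_moment

print(results)
```

Each entry of `results` should be close to $n+\frac12$, the derivative of $E_n(\omega)$ with respect to $\omega$.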
The same idea is used in molecular physics, where differentiating energy levels with respect to nuclear positions gives forces. The mathematical hypothesis to remember is differentiability of the eigenbranch and sufficient control of the domains.
## Variational Bounds And Rayleigh-Ritz Approximation
Perturbation series are local and algebraic, but they do not by themselves give global bounds on eigenvalues. Variational methods answer a different question: how can we approximate or bound low-lying eigenvalues using trial subspaces? The Rayleigh quotient turns the spectral problem into an optimisation problem over vectors.
[definition: Rayleigh Quotient]
Let $H:D(H)\subset\mathcal H\to\mathcal H$ be a self-adjoint operator bounded below on a Hilbert space $\mathcal H$, and let $q_H:Q(H)\to\mathbb R$ be its closed quadratic form. The Rayleigh quotient is the functional
\begin{align*}
R_H:Q(H)\setminus\{0\}\longrightarrow\mathbb R,\qquad
\psi\longmapsto R_H[\psi]=\frac{q_H[\psi]}{\|\psi\|_{\mathcal H}^2}.
\end{align*}
[/definition]
When $\psi\in D(H)$, the form value is $q_H[\psi]=(H\psi,\psi)_{\mathcal H}$. The definition is written on the form domain because variational trial functions often have finite energy even when $H\psi$ is not an $L^2$ vector. For the one-dimensional Dirichlet Laplacian on an interval, a piecewise linear hat function lies in $H_0^1$ and has finite kinetic energy, but its second derivative contains distributional point masses, so it is not in the operator domain $H^2\cap H_0^1$.
The Rayleigh quotient is the expected energy of the normalised state in direction $\psi$. The natural question is whether this energy expectation detects the true lowest energy when all admissible directions are allowed, and the Rayleigh-Ritz principle gives the exact answer together with finite-dimensional upper bounds.
[quotetheorem:6970]
[citeproof:6970]
The variational principle is useful because any trial vector gives certified information in the correct direction. A carefully chosen finite basis can approximate the ground state well even when exact perturbation coefficients are inaccessible. The bounded-below hypothesis is needed so that the infimum is finite, and the discreteness hypothesis is what turns the bottom of the spectrum into an attained ground-state eigenvalue. If the bottom of the spectrum belongs only to the essential spectrum, the same infimum formula may hold with $E_0=\inf\sigma(H)$, but there need not be a normalised vector attaining it.
[example: Gaussian Trial State For An Anharmonic Oscillator]
Let $H:D(H)\subset L^2(\mathbb R)\to L^2(\mathbb R)$ be the self-adjoint operator associated with the closed quadratic form
\begin{align*}
q_H[\psi]=\frac{1}{2}\|\psi'\|_{L^2}^2+\frac{1}{2}\|x\psi\|_{L^2}^2+\lambda\|x^2\psi\|_{L^2}^2,
\end{align*}
on
\begin{align*}
Q(H)=\{\psi\in H^1(\mathbb R):x^2\psi\in L^2(\mathbb R)\},
\end{align*}
where $\lambda>0$. Take the Gaussian trial states
\begin{align*}
\psi_a(x)=\left(\frac{a}{\pi}\right)^{1/4}e^{-ax^2/2},\qquad a>0.
\end{align*}
Then
\begin{align*}
|\psi_a(x)|^2=\left(\frac{a}{\pi}\right)^{1/2}e^{-ax^2}.
\end{align*}
Using the change of variables $u=\sqrt a\,x$ in the standard Gaussian integral,
\begin{align*}
\int_{\mathbb R}e^{-ax^2}\,d\mathcal L^1(x)=\sqrt{\frac{\pi}{a}}.
\end{align*}
Therefore
\begin{align*}
\|\psi_a\|_{L^2}^2=\left(\frac{a}{\pi}\right)^{1/2}\sqrt{\frac{\pi}{a}}=1.
\end{align*}
We now compute the three terms in $q_H[\psi_a]$. Since
\begin{align*}
\psi_a'(x)=-ax\left(\frac{a}{\pi}\right)^{1/4}e^{-ax^2/2}=-ax\psi_a(x),
\end{align*}
we have
\begin{align*}
\|\psi_a'\|_{L^2}^2=a^2\int_{\mathbb R}x^2|\psi_a(x)|^2\,d\mathcal L^1(x).
\end{align*}
To evaluate this moment, integrate by parts on $[-R,R]$ and let $R\to\infty$; the boundary term $x e^{-ax^2}$ tends to $0$ at both ends. Since $d(e^{-ax^2})=-2ax e^{-ax^2}\,dx$,
\begin{align*}
\int_{\mathbb R}x^2e^{-ax^2}\,d\mathcal L^1(x)=\frac{1}{2a}\int_{\mathbb R}e^{-ax^2}\,d\mathcal L^1(x)=\frac{1}{2a}\sqrt{\frac{\pi}{a}}.
\end{align*}
Thus
\begin{align*}
\|x\psi_a\|_{L^2}^2=\left(\frac{a}{\pi}\right)^{1/2}\frac{1}{2a}\sqrt{\frac{\pi}{a}}=\frac{1}{2a}.
\end{align*}
Consequently
\begin{align*}
\|\psi_a'\|_{L^2}^2=a^2\cdot\frac{1}{2a}=\frac{a}{2}.
\end{align*}
For the quartic term, integrate by parts again, now using the boundary term $x^3e^{-ax^2}\to0$ at $\pm\infty$:
\begin{align*}
\int_{\mathbb R}x^4e^{-ax^2}\,d\mathcal L^1(x)=\frac{3}{2a}\int_{\mathbb R}x^2e^{-ax^2}\,d\mathcal L^1(x).
\end{align*}
Substituting the previous moment gives
\begin{align*}
\int_{\mathbb R}x^4e^{-ax^2}\,d\mathcal L^1(x)=\frac{3}{2a}\cdot\frac{1}{2a}\sqrt{\frac{\pi}{a}}=\frac{3}{4a^2}\sqrt{\frac{\pi}{a}}.
\end{align*}
Hence
\begin{align*}
\|x^2\psi_a\|_{L^2}^2=\left(\frac{a}{\pi}\right)^{1/2}\frac{3}{4a^2}\sqrt{\frac{\pi}{a}}=\frac{3}{4a^2}.
\end{align*}
Since $\|\psi_a\|_{L^2}=1$, the Rayleigh quotient equals $q_H[\psi_a]$. Substituting the three computed terms gives
\begin{align*}
R_H[\psi_a]=\frac{1}{2}\cdot\frac{a}{2}+\frac{1}{2}\cdot\frac{1}{2a}+\lambda\cdot\frac{3}{4a^2}.
\end{align*}
Therefore
\begin{align*}
R_H[\psi_a]=\frac{a}{4}+\frac{1}{4a}+\frac{3\lambda}{4a^2}.
\end{align*}
Every $a>0$ gives a variational upper bound on the true ground-state energy, and minimising this one-variable expression over $a>0$ gives the best bound available within the Gaussian family. If
\begin{align*}
F(a)=\frac{a}{4}+\frac{1}{4a}+\frac{3\lambda}{4a^2},
\end{align*}
then
\begin{align*}
F'(a)=\frac{1}{4}-\frac{1}{4a^2}-\frac{3\lambda}{2a^3}.
\end{align*}
The stationary equation $F'(a)=0$ is therefore
\begin{align*}
a^3-a-6\lambda=0.
\end{align*}
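When $\lambda=0$, the positive solution is $a=1$. For small positive $\lambda$, substituting $a=1+\delta$ into the stationary equation gives $2\delta+O(\delta^2)=6\lambda$, so the minimising Gaussian parameter is $a=1+3\lambda+O(\lambda^2)$, a perturbation of the harmonic-oscillator value. Because both the shift in $a$ and $F'(1)=-3\lambda/2$ are $O(\lambda)$, the minimum value is $F(1)+O(\lambda^2)=\frac{1}{2}+\frac{3\lambda}{4}+O(\lambda^2)$. The variational Gaussian bound therefore agrees to first order with perturbation theory around the unperturbed oscillator.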
[/example]
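The stationary equation has no convenient closed form for general $\lambda$, but it is easy to solve numerically. The following sketch uses bisection on the bracket $[1,1+6\lambda]$, on which $a^3-a-6\lambda$ changes sign for $\lambda>0$, and then evaluates the Gaussian bound. The sample coupling $\lambda=0.1$ and the function names are assumptions made for illustration.

```python
def F(a, lam):
    """Rayleigh quotient of the Gaussian trial state with parameter a."""
    return a / 4 + 1 / (4 * a) + 3 * lam / (4 * a * a)

def best_gaussian_parameter(lam, iterations=200):
    """Root of a^3 - a - 6*lam = 0 in [1, 1 + 6*lam], found by bisection (lam > 0)."""
    def g(a):
        return a ** 3 - a - 6 * lam
    lo, hi = 1.0, 1.0 + 6 * lam   # g(lo) < 0 < g(hi) when lam > 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

lam = 0.1  # hypothetical sample coupling
a_star = best_gaussian_parameter(lam)
bound = F(a_star, lam)
harmonic_width_bound = F(1.0, lam)  # equals 1/2 + 3*lam/4

print("minimising parameter:", a_star)
print("optimised Gaussian bound:", bound)
print("bound with a = 1:", harmonic_width_bound)
```

The optimised value cannot exceed the value at $a=1$, which is the first-order perturbative energy $\frac12+\frac{3\lambda}{4}$; both are upper bounds for the true ground-state energy.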
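Rayleigh-Ritz approximation becomes a matrix problem once a trial basis is chosen. If $L=\operatorname{span}\{\phi_1,\dots,\phi_N\}$ with $\phi_1,\dots,\phi_N$ linearly independent, then stationary values of $R_H$ on $L$ are found from the generalised eigenvalue problem
\begin{align*}
A c=\mu Gc,
\end{align*}
where $A_{ij}=(H\phi_j,\phi_i)_{\mathcal H}$ and $G_{ij}=(\phi_j,\phi_i)_{\mathcal H}$. When the trial functions lie in $Q(H)$ but not in $D(H)$, the entries $A_{ij}$ are computed instead from the sesquilinear form obtained from $q_H$ by polarisation. Linear independence makes the Gram matrix $G$ positive definite, so the problem has $N$ real eigenvalues. This is the computational form of the method used throughout quantum chemistry and numerical spectral theory.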
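A two-function instance shows the matrix problem at work. The sketch below is an illustration under stated assumptions: the operator is the Dirichlet realisation of $-\frac{d^2}{dx^2}$ on $L^2(0,1)$, whose lowest eigenvalue is $\pi^2$, and the hypothetical trial basis is $\phi_1=x(1-x)$, $\phi_2=x^2(1-x)^2$. The entries of $A$ are computed in form sense as $\int_0^1\phi_j'\phi_i'\,dx$, which agrees with $(H\phi_j,\phi_i)$ after integration by parts because the trial functions vanish at the endpoints.

```python
from fractions import Fraction
import math

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_diff(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:]

def integrate01(p):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

# Hypothetical trial basis: phi1 = x(1 - x) and phi2 = phi1^2.
phi1 = [Fraction(0), Fraction(1), Fraction(-1)]
phi2 = poly_mul(phi1, phi1)
basis = [phi1, phi2]

# Form matrix A_ij = integral of phi_j' phi_i' and Gram matrix G_ij = integral of phi_j phi_i.
A = [[integrate01(poly_mul(poly_diff(p), poly_diff(q))) for q in basis] for p in basis]
G = [[integrate01(poly_mul(p, q)) for q in basis] for p in basis]

# For N = 2, det(A - mu G) = c2 mu^2 + c1 mu + c0 is a quadratic in mu.
c2 = G[0][0] * G[1][1] - G[0][1] ** 2
c1 = -(A[0][0] * G[1][1] + A[1][1] * G[0][0] - 2 * A[0][1] * G[0][1])
c0 = A[0][0] * A[1][1] - A[0][1] ** 2
disc = math.sqrt(float(c1 * c1 - 4 * c2 * c0))
mu_low = (-float(c1) - disc) / (2 * float(c2))

single_function_bound = float(A[0][0] / G[0][0])  # Rayleigh quotient of phi1 alone
print("two-function Ritz value:", mu_low)
print("one-function bound:", single_function_bound)
print("exact lowest eigenvalue pi^2:", math.pi ** 2)
```

For larger $N$ one would pass $A$ and $G$ to a numerical generalised eigenvalue solver; the explicit quadratic is used here only to keep the sketch free of external libraries.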
## Min-Max Principles And First-Order Spectral Shifts
The final issue is how to characterise excited discrete eigenvalues variationally. The ground state is found by minimising over all states, but the second eigenvalue must avoid collapsing back onto the ground state. The min-max principle enforces this by optimising over subspaces of prescribed dimension or by imposing orthogonality to lower modes.
[quotetheorem:6971]
[citeproof:6971]
This principle is the rigorous variational counterpart of the perturbation calculations. It also gives monotonicity statements: if the Hamiltonian is increased as a quadratic form, no discrete eigenvalue below the essential spectrum decreases. The restriction to eigenvalues below the essential spectrum matters because the dimension-counting argument relies on genuine finite-dimensional eigenspaces separated from the continuum. For the free Laplacian $-\Delta:L^2(\mathbb R)\supset H^2(\mathbb R)\to L^2(\mathbb R)$, the spectrum is $[0,\infty)$ and there is no first normalised eigenvector to list; trial functions can drive the Rayleigh quotient toward $0$, but no $L^2$ state attains that value. The bounded-below hypothesis prevents the variational quantities from running to $-\infty$: for an operator whose quadratic form is unbounded below, the infimum over trial subspaces may fail to represent a finite spectral level. Once the index reaches the essential spectrum, the formula must be interpreted through spectral thresholds rather than a listed sequence of isolated eigenvalues.
[quotetheorem:6972]
[citeproof:6972]
The theorem unifies the computational rules of the chapter. Simple levels use expectation values or form values, while degenerate levels use the eigenvalues of the perturbation form on the unperturbed eigenspace. The form formulation is needed for singular perturbations, such as changing a potential in a way that preserves finite energy but not a common operator domain. The result still depends on isolation and differentiability of the local spectral branches; it does not describe resonance motion inside continuous spectrum or higher-order splitting after the first-order restricted form has repeated eigenvalues.
[example: Two-Level Avoided Crossing Model]
Let $e_1,e_2$ be the standard orthonormal basis of $\mathbb C^2$. Consider the Hamiltonian determined by
\begin{align*}
H(\lambda):\mathbb C^2\longrightarrow\mathbb C^2,\qquad H(\lambda)e_1=(E+a\lambda)e_1+\varepsilon e_2,
\end{align*}
and
\begin{align*}
H(\lambda)e_2=\varepsilon e_1+(E+b\lambda)e_2,
\end{align*}
where $a,b,E,\varepsilon\in\mathbb R$ and $\varepsilon\ne0$. We compute its eigenvalues by solving the two-dimensional eigenvalue equation. If $u=xe_1+ye_2$, then linearity gives
\begin{align*}
H(\lambda)u=xH(\lambda)e_1+yH(\lambda)e_2.
\end{align*}
Substituting the definitions of $H(\lambda)e_1$ and $H(\lambda)e_2$ gives
\begin{align*}
H(\lambda)u=\bigl((E+a\lambda)x+\varepsilon y\bigr)e_1+\bigl(\varepsilon x+(E+b\lambda)y\bigr)e_2.
\end{align*}
Thus $H(\lambda)u=\mu u$ is equivalent to
\begin{align*}
(E+a\lambda-\mu)x+\varepsilon y=0
\end{align*}
and
\begin{align*}
\varepsilon x+(E+b\lambda-\mu)y=0.
\end{align*}
A nonzero solution exists precisely when the two scalar equations are linearly dependent, so the determinant of their coefficient matrix vanishes:
\begin{align*}
0=(E+a\lambda-\mu)(E+b\lambda-\mu)-\varepsilon^2.
\end{align*}
Set $\nu=\mu-E$. Then $E+a\lambda-\mu=a\lambda-\nu$ and $E+b\lambda-\mu=b\lambda-\nu$, so the characteristic equation becomes
\begin{align*}
0=(a\lambda-\nu)(b\lambda-\nu)-\varepsilon^2.
\end{align*}
Expanding the product gives
\begin{align*}
0=ab\lambda^2-a\lambda\nu-b\lambda\nu+\nu^2-\varepsilon^2.
\end{align*}
Combining the two middle terms gives
\begin{align*}
0=\nu^2-(a+b)\lambda\nu+ab\lambda^2-\varepsilon^2.
\end{align*}
By the quadratic formula,
\begin{align*}
\nu=\frac{(a+b)\lambda\pm\sqrt{(a+b)^2\lambda^2-4(ab\lambda^2-\varepsilon^2)}}{2}.
\end{align*}
The discriminant is
\begin{align*}
(a+b)^2\lambda^2-4(ab\lambda^2-\varepsilon^2)=(a^2+2ab+b^2)\lambda^2-4ab\lambda^2+4\varepsilon^2.
\end{align*}
Combining the $\lambda^2$ terms gives
\begin{align*}
(a^2+2ab+b^2)\lambda^2-4ab\lambda^2+4\varepsilon^2=(a-b)^2\lambda^2+4\varepsilon^2.
\end{align*}
Since
\begin{align*}
(a-b)^2\lambda^2+4\varepsilon^2=4\left(\varepsilon^2+\left(\frac{a-b}{2}\lambda\right)^2\right),
\end{align*}
we obtain
\begin{align*}
\nu=\frac{a+b}{2}\lambda\pm\sqrt{\varepsilon^2+\left(\frac{a-b}{2}\lambda\right)^2}.
\end{align*}
Returning to $\mu=E+\nu$, the two eigenvalues are
\begin{align*}
E+\frac{a+b}{2}\lambda\pm \sqrt{\varepsilon^2+\left(\frac{a-b}{2}\lambda\right)^2}.
\end{align*}
For comparison, in the excluded case $\varepsilon=0$ the defining equations decouple and the eigenvalues reduce to
\begin{align*}
E+a\lambda
\end{align*}
and
\begin{align*}
E+b\lambda.
\end{align*}
These two lines cross at $\lambda=0$, and when $a\ne b$ that is their only common point. For $\varepsilon\ne0$, evaluating the two eigenvalues at $\lambda=0$ gives
\begin{align*}
E\pm\sqrt{\varepsilon^2}=E\pm|\varepsilon|.
\end{align*}
The separation at $\lambda=0$ is therefore
\begin{align*}
(E+|\varepsilon|)-(E-|\varepsilon|)=2|\varepsilon|.
\end{align*}
Thus the off-diagonal coupling opens a gap of size $2|\varepsilon|$, which is why perturbation theory near nearly degenerate levels must keep the full small matrix rather than treating the coupling as a secondary detail.
[/example]
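The avoided crossing is easy to see by tabulating the two branches. The sketch below evaluates the closed formula on a grid of $\lambda$ values and locates the minimal separation. The sample parameters $E=0$, $a=1$, $b=-1$, $\varepsilon=0.05$ and the grid are hypothetical choices.

```python
import math

def levels(lam, E=0.0, a=1.0, b=-1.0, eps=0.05):
    """Lower and upper eigenvalues of the two-level Hamiltonian H(lam)."""
    mean = E + 0.5 * (a + b) * lam
    radius = math.sqrt(eps ** 2 + (0.5 * (a - b) * lam) ** 2)
    return mean - radius, mean + radius

# Hypothetical grid of coupling values in [-0.5, 0.5], including lam = 0.
grid = [k / 100 for k in range(-50, 51)]
gaps = [levels(lam)[1] - levels(lam)[0] for lam in grid]

min_gap = min(gaps)
lam_at_min = grid[gaps.index(min_gap)]

print("minimal gap:", min_gap, "at lambda =", lam_at_min)
print("gap at the edge of the grid:", gaps[-1])
```

Away from $\lambda=0$ the gap approaches $|a-b||\lambda|$, the separation of the uncoupled lines, while near $\lambda=0$ it stays bounded below by $2|\varepsilon|$.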
Perturbation theory and variational theory serve complementary roles. Perturbation expansions give local formulas and physical interpretation of spectral shifts; variational principles give bounds, convergence tests, and stable numerical procedures. Together they form the standard toolkit for extracting quantitative predictions from quantum Hamiltonians that are too complicated to solve exactly.
We now step back to the structural reason such tools work so broadly: symmetries, conservation laws, and the unitary implementation of quantum kinematics.
# 11. Symmetry, Conservation, and Quantum Kinematics
Symmetry enters quantum mechanics through a tension between two requirements: physical states are rays rather than vectors, while transition probabilities are numerical predictions that must be preserved. This chapter explains how that tension forces symmetries to be implemented by unitary or antiunitary maps, then uses continuous symmetries to identify conserved observables. The final section records how commuting observables let us organize quantum systems by simultaneous quantum numbers.
## Symmetries of Quantum States
A physical symmetry should carry possible states to possible states without changing the probability of transition between any two states. Since pure states are rays in a complex Hilbert space $H$, this requirement is not initially a statement about linear operators on $H$. The first task is to recover operator-level structure from projective data.
[definition: Ray]
Let $H$ be a complex Hilbert space. A ray in $H$ is a one-dimensional complex subspace of $H$.
[/definition]
For a nonzero vector $\psi\in H$, the ray represented by $\psi$ is
\begin{align*}
[\psi]=\{\lambda\psi:\lambda\in\mathbb C\},
\end{align*}
and its nonzero elements $\lambda\psi$, $\lambda\ne0$, are the representatives of the ray. When a unit representative $\psi$ is fixed, the remaining unit representatives are the phase multiples $e^{i\theta}\psi$.
Rays identify vectors that differ only by phase, since such vectors give the same expectation values and transition probabilities. This raises the next question: which maps on rays deserve to be called physical symmetries, rather than arbitrary relabellings of the projective Hilbert space? The answer is that a symmetry must preserve the transition probabilities that experiments can measure.
[definition: Projective Symmetry]
Let $H$ be a complex Hilbert space, and let $\mathbb P(H)$ denote the set of rays in $H$. A projective symmetry is a bijection
\begin{align*}
S:\mathbb P(H)\to\mathbb P(H)
\end{align*}
such that
\begin{align*}
| (\psi, \phi)_H |^2 = |(\psi', \phi')_H|^2
\end{align*}
whenever $S([\psi]) = [\psi']$ and $S([\phi]) = [\phi']$, with all representatives chosen to have norm $1$.
[/definition]
Projective symmetries are formulated exactly where the physics lives, but calculations with Hamiltonians and observables require maps on the Hilbert space itself. The obstruction is phase: a ray map does not specify a unique vector representative. Before stating the reconstruction theorem, we need the second kind of Hilbert-space map that preserves transition probabilities but conjugates complex scalars.
[definition: Antiunitary Operator]
Let $H$ be a complex Hilbert space. An antiunitary operator on $H$ is a bijection $A:H\to H$ such that
\begin{align*}
A(\alpha \psi + \beta \phi) = \bar\alpha A\psi + \bar\beta A\phi
\end{align*}
and
\begin{align*}
(A\psi,A\phi)_H = \overline{(\psi,\phi)_H}
\end{align*}
for all $\psi,\phi\in H$ and $\alpha,\beta\in\mathbb C$.
[/definition]
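The simplest antiunitary operator is componentwise complex conjugation in a fixed orthonormal basis. The following sketch checks the two defining identities on $\mathbb C^2$ for hypothetical sample vectors and scalars; the inner product is taken linear in the first argument, matching the convention $(\psi,\phi)=\int\psi\bar\phi$ used in these notes.

```python
def inner(psi, phi):
    """Inner product (psi, phi) = sum of psi_k * conj(phi_k), linear in the first slot."""
    return sum(p * q.conjugate() for p, q in zip(psi, phi))

def K(psi):
    """Componentwise complex conjugation in the standard basis of C^2."""
    return [p.conjugate() for p in psi]

# Hypothetical sample unit vectors and scalars.
psi = [0.6 + 0.0j, 0.8j]
phi = [1 / 2 ** 0.5 + 0.0j, (1 - 1j) / 2]
alpha, beta = 2 - 1j, 0.5 + 3j

# Conjugate-linearity: K(alpha psi + beta phi) = conj(alpha) K psi + conj(beta) K phi.
lhs = K([alpha * p + beta * q for p, q in zip(psi, phi)])
rhs = [alpha.conjugate() * p + beta.conjugate() * q for p, q in zip(K(psi), K(phi))]

# Inner products are conjugated, so transition probabilities are unchanged.
before = inner(psi, phi)
after = inner(K(psi), K(phi))

print("conjugate-linearity defect:", max(abs(l - r) for l, r in zip(lhs, rhs)))
print("inner product before and after:", before, after)
print("transition probabilities:", abs(before) ** 2, abs(after) ** 2)
```

The inner product changes to its complex conjugate while its modulus is unchanged, which is why such a map still induces a bijection of rays preserving transition probabilities.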
Antiunitary maps are conjugate-linear isometries. They preserve absolute values of inner products, but they reverse the complex phase convention in the inner product. Wigner's theorem now answers the reconstruction problem: every projective symmetry comes from either a unitary or antiunitary operator, so there are no further hidden kinds of quantum symmetry.
[quotetheorem:6973]
This theorem is structural rather than dynamical: it tells us what a symmetry can be before saying whether it is a symmetry of a particular Hamiltonian. The hypotheses are restrictive. If a ray bijection fails to preserve transition probabilities, for instance by sending two nonorthogonal rays in $\mathbb C^2$ to orthogonal rays, no unitary or antiunitary implementer can exist. The theorem also does not say that every unitary or antiunitary operator is a symmetry of a given system; it must still preserve the Hamiltonian or the relevant family of observables. This distinction matters because parity and time reversal have different linearity types, and those types determine how position, momentum, and spin transform. We first examine parity, where the symmetry is unitary and spatial reflection is the main operation.
[example: Parity On The Line]
Let $H=L^2(\mathbb R)$ and define $(\Pi\psi)(x)=\psi(-x)$. For $\psi,\phi\in L^2(\mathbb R)$, the change of variables $y=-x$ gives
\begin{align*}
(\Pi\psi,\Pi\phi)_{L^2}=\int_{\mathbb R}\psi(-x)\overline{\phi(-x)}\,dx=\int_{\mathbb R}\psi(y)\overline{\phi(y)}\,dy=(\psi,\phi)_{L^2}.
\end{align*}
Also
\begin{align*}
(\Pi^2\psi)(x)=(\Pi\psi)(-x)=\psi(x),
\end{align*}
so $\Pi^2=I$, $\Pi^{-1}=\Pi$, and $\Pi$ is unitary.
Let $X$ be multiplication by $x$ on $D(X)=\{\psi\in L^2(\mathbb R):x\psi(x)\in L^2(\mathbb R)\}$. If $\psi\in D(X)$, then
\begin{align*}
\int_{\mathbb R}|x(\Pi\psi)(x)|^2\,dx=\int_{\mathbb R}|x\psi(-x)|^2\,dx=\int_{\mathbb R}|y\psi(y)|^2\,dy<\infty,
\end{align*}
so $\Pi D(X)=D(X)$. Hence, for $\psi\in D(X)$,
\begin{align*}
(\Pi X\Pi^{-1}\psi)(x)=(X\Pi\psi)(-x)=(-x)(\Pi\psi)(-x)=-x\psi(x)=(-X\psi)(x).
\end{align*}
Thus $\Pi X\Pi^{-1}=-X$ on $D(X)$.
Let $M=-i\frac{d}{dx}$ on $D(M)=H^1(\mathbb R)$. Reflection preserves $H^1(\mathbb R)$, because the weak derivative of $\Pi\psi$ is
\begin{align*}
(\Pi\psi)'(x)=-\psi'(-x),
\end{align*}
and $\|\psi'(-\,\cdot)\|_{L^2}=\|\psi'\|_{L^2}$. Therefore $\Pi D(M)=D(M)$. For $\psi\in H^1(\mathbb R)$,
\begin{align*}
(M\Pi\psi)(y)=-i(\Pi\psi)'(y)=-i(-\psi'(-y))=i\psi'(-y).
\end{align*}
Evaluating at $y=-x$ gives
\begin{align*}
(\Pi M\Pi^{-1}\psi)(x)=(M\Pi\psi)(-x)=i\psi'(x)=(-M\psi)(x),
\end{align*}
so $\Pi M\Pi^{-1}=-M$ on $H^1(\mathbb R)$.
If $V:\mathbb R\to\mathbb R$ is measurable and even, then $V(X)$ is multiplication by $V(x)$ and, on its multiplication domain,
\begin{align*}
(\Pi V(X)\Pi^{-1}\psi)(x)=V(-x)\psi(x)=V(x)\psi(x)=(V(X)\psi)(x).
\end{align*}
Also, on the transformed domain of $M^2$,
\begin{align*}
\Pi M^2\Pi^{-1}=(\Pi M\Pi^{-1})(\Pi M\Pi^{-1})=(-M)(-M)=M^2.
\end{align*}
Therefore, when $H_0=\frac{1}{2m}M^2+V(X)$ is self-adjoint on an operator domain invariant under $\Pi$, we have
\begin{align*}
\Pi H_0\Pi^{-1}=\frac{1}{2m}M^2+V(X)=H_0.
\end{align*}
Parity reverses position and momentum, but an even potential and the kinetic energy are unchanged, so parity is a symmetry of the dynamics.
[/example]
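The sign rules $\Pi X\Pi^{-1}=-X$ and $\Pi M\Pi^{-1}=-M$ can be reproduced on a symmetric grid. The following sketch is a discretized stand-in, not the operators on $D(X)$ and $H^1(\mathbb R)$: the grid size `N`, spacing `h`, the periodic central difference used for the derivative, and the test vector are all assumptions made for illustration.

```python
import cmath

N = 201            # odd, so the grid is symmetric about 0
h = 0.1            # grid spacing (illustrative)
xs = [(j - (N - 1) // 2) * h for j in range(N)]

def parity(psi):
    """(Pi psi)(x) = psi(-x): reverse the samples on the symmetric grid."""
    return psi[::-1]

def position(psi):
    """(X psi)(x) = x psi(x)."""
    return [x * v for x, v in zip(xs, psi)]

def momentum(psi):
    """(M psi)(x) = -i psi'(x), with a periodic central difference for psi'."""
    n = len(psi)
    return [-1j * (psi[(j + 1) % n] - psi[j - 1]) / (2 * h) for j in range(n)]

def max_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

# A test vector with no parity symmetry: shifted Gaussian with a plane-wave phase.
psi = [cmath.exp(-(x - 0.7) ** 2 + 1.3j * x) for x in xs]

# Pi X Pi^{-1} psi versus -X psi, and Pi M Pi^{-1} psi versus -M psi (Pi^{-1} = Pi).
err_X = max_diff(parity(position(parity(psi))), [-v for v in position(psi)])
err_M = max_diff(parity(momentum(parity(psi))), [-v for v in momentum(psi)])
```

Both errors are at the level of floating-point rounding, because reversing the grid exchanges the forward and backward neighbours in the central difference, which is the discrete counterpart of $(\Pi\psi)'(x)=-\psi'(-x)$.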
Parity is unitary because it changes the spatial argument without conjugating complex scalars. Time reversal must reverse momentum while leaving position fixed, and the factor $i$ in $M_j=-i\partial_{x_j}$ forces complex conjugation to enter. This makes time reversal the basic antiunitary symmetry in nonrelativistic quantum mechanics.
[example: Spinless Time Reversal]
Let $H=L^2(\mathbb R^n)$ and define $T:H\to H$ by $(T\psi)(x)=\overline{\psi(x)}$. For $\alpha,\beta\in\mathbb C$ and $\psi,\phi\in H$,
\begin{align*}
T(\alpha\psi+\beta\phi)(x)=\overline{\alpha\psi(x)+\beta\phi(x)}=\bar\alpha(T\psi)(x)+\bar\beta(T\phi)(x).
\end{align*}
Moreover,
\begin{align*}
(T\psi,T\phi)_{L^2}=\int_{\mathbb R^n}\overline{\psi(x)}\,\overline{\overline{\phi(x)}}\,dx=\int_{\mathbb R^n}\overline{\psi(x)}\phi(x)\,dx=\overline{\int_{\mathbb R^n}\psi(x)\overline{\phi(x)}\,dx}=\overline{(\psi,\phi)_{L^2}}.
\end{align*}
Finally,
\begin{align*}
(T^2\psi)(x)=T(\overline{\psi})(x)=\overline{\overline{\psi(x)}}=\psi(x),
\end{align*}
so $T$ is antiunitary and $T^2=I$.
For each $j$, let $X_j$ be coordinate multiplication by $x_j$ on $D(X_j)=\{\psi\in H:x_j\psi\in H\}$. If $\psi\in D(X_j)$, then
\begin{align*}
\int_{\mathbb R^n}|x_j(T\psi)(x)|^2\,dx=\int_{\mathbb R^n}|x_j\overline{\psi(x)}|^2\,dx=\int_{\mathbb R^n}|x_j\psi(x)|^2\,dx<\infty,
\end{align*}
so $TD(X_j)=D(X_j)$. Since $T^{-1}=T$, for $\psi\in D(X_j)$ we get
\begin{align*}
(TX_jT^{-1}\psi)(x)=\overline{(X_jT\psi)(x)}=\overline{x_j\overline{\psi(x)}}=x_j\psi(x)=(X_j\psi)(x).
\end{align*}
Thus $TX_jT^{-1}=X_j$.
For each $j$, let $M_j=-i\partial_{x_j}$ on $D(M_j)=\{\psi\in L^2(\mathbb R^n):\partial_{x_j}\psi\in L^2(\mathbb R^n)\}$, with derivative understood weakly. If $\psi\in D(M_j)$, then the weak derivative of $T\psi$ is $\partial_{x_j}(T\psi)=\overline{\partial_{x_j}\psi}$, so $T\psi\in D(M_j)$. Hence $TD(M_j)=D(M_j)$. For $\psi\in D(M_j)$,
\begin{align*}
(M_jT\psi)(x)=-i\,\partial_{x_j}\overline{\psi(x)}=-i\,\overline{\partial_{x_j}\psi(x)}.
\end{align*}
Applying $T$ gives
\begin{align*}
(TM_jT^{-1}\psi)(x)=\overline{-i\,\overline{\partial_{x_j}\psi(x)}}=i\,\partial_{x_j}\psi(x)=(-M_j\psi)(x).
\end{align*}
Therefore $TM_jT^{-1}=-M_j$: spinless time reversal leaves position unchanged and reverses momentum.
On the corresponding second-derivative domains,
\begin{align*}
TM_j^2T^{-1}=(TM_jT^{-1})(TM_jT^{-1})=(-M_j)(-M_j)=M_j^2.
\end{align*}
If $V:\mathbb R^n\to\mathbb R$ is real-valued, then $V(X)$ is multiplication by $V(x)$ and, on its multiplication domain,
\begin{align*}
(TV(X)T^{-1}\psi)(x)=\overline{V(x)\overline{\psi(x)}}=V(x)\psi(x)=(V(X)\psi)(x).
\end{align*}
Thus, if the differential Hamiltonian
\begin{align*}
H_0=\frac{1}{2m}\sum_{j=1}^n M_j^2+V(X)
\end{align*}
is self-adjoint on a domain satisfying $TD(H_0)=D(H_0)$, then
\begin{align*}
TH_0T^{-1}=\frac{1}{2m}\sum_{j=1}^n TM_j^2T^{-1}+TV(X)T^{-1}=\frac{1}{2m}\sum_{j=1}^n M_j^2+V(X)=H_0.
\end{align*}
The conjugation in $T$ is exactly what changes the sign of the factor $-i$ in momentum while leaving real multiplication operators unchanged.
[/example]
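A one-dimensional grid version of this example shows the same pattern. The sketch below assumes a periodic central difference in place of the weak derivative and uses two hypothetical Gaussian packets; it checks that conjugation fixes position, reverses momentum, and conjugates the inner product.

```python
import cmath

N, h = 200, 0.05   # illustrative grid size and spacing
xs = [(j - N // 2) * h for j in range(N)]

def T(psi):
    """Spinless time reversal: pointwise complex conjugation."""
    return [v.conjugate() for v in psi]

def X(psi):
    """Multiplication by x."""
    return [x * v for x, v in zip(xs, psi)]

def M(psi):
    """-i d/dx, with a periodic central difference standing in for d/dx."""
    n = len(psi)
    return [-1j * (psi[(j + 1) % n] - psi[j - 1]) / (2 * h) for j in range(n)]

def inner(u, v):
    """Riemann-sum version of the L^2 inner product, linear in the first slot."""
    return h * sum(a * b.conjugate() for a, b in zip(u, v))

def max_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

def mean_momentum(u):
    """Normalized expectation (u, M u) / (u, u)."""
    return (inner(u, M(u)) / inner(u, u)).real

psi = [cmath.exp(-x * x + 2.0j * x) for x in xs]            # packet moving to the right
phi = [cmath.exp(-(x - 0.5) ** 2 - 1.0j * x) for x in xs]

err_X = max_diff(T(X(T(psi))), X(psi))                       # T X T^{-1} = X
err_M = max_diff(T(M(T(psi))), [-v for v in M(psi)])         # T M T^{-1} = -M
err_inner = abs(inner(T(psi), T(phi)) - inner(psi, phi).conjugate())
```

For these packets `mean_momentum(psi)` is close to $2$ and `mean_momentum(T(psi))` is close to $-2$, which is the grid version of the statement that time reversal leaves position unchanged and reverses momentum.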
The antiunitarity of time reversal is not a technical detail: it changes degeneracy phenomena. For half-integer spin systems, time reversal satisfies $T^2=-I$ rather than $T^2=I$, which prevents an energy eigenvector from being invariant under time reversal up to phase. The next theorem isolates the algebraic reason for the resulting twofold degeneracy.
[quotetheorem:6974]
[citeproof:6974]
Kramers degeneracy is a symmetry statement with spectral consequences. It is especially useful because it gives degeneracy without solving the eigenvalue equation. The antiunitary hypothesis is what distinguishes it from an ordinary commuting unitary symmetry: when $T^2=-I$, no nonzero vector can be a time-reversal eigenvector, so an energy eigenstate must be accompanied by a linearly independent partner at the same energy. This is why half-integer spin systems retain at least a twofold degeneracy whenever time-reversal symmetry is present and the relevant eigenvalue is isolated. The result is also limited in scope: breaking time reversal, for example by a magnetic field, removes the algebraic protection, and continuous spectrum requires a scattering version rather than this discrete-eigenvalue statement.
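The algebraic mechanism can be seen in the smallest case. The sketch below assumes the standard spin-$\tfrac12$ choice $T=i\sigma_yK$ on $\mathbb C^2$, where $K$ is componentwise conjugation; this concrete form is an assumption for illustration and is not fixed by the text above. It checks that $T^2=-I$ and that $T\psi$ is orthogonal to $\psi$ for a few sample spinors.

```python
def inner(psi, phi):
    """Inner product on C^2, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(psi, phi))

def T(psi):
    """Spin-1/2 time reversal T = i sigma_y K: (a, b) -> (conj(b), -conj(a))."""
    a, b = psi
    return [b.conjugate(), -a.conjugate()]

# Hypothetical sample spinors.
samples = [[1 + 0j, 0j], [0.6 + 0.2j, -0.3 + 0.7j], [1j, 2 - 1j]]

# T^2 psi + psi should vanish for every psi.
T_squared_is_minus_identity = all(
    max(abs(t + v) for t, v in zip(T(T(psi)), psi)) < 1e-12 for psi in samples
)

# (psi, T psi) = a*b - b*a = 0, so psi and T psi are orthogonal.
partner_is_orthogonal = all(abs(inner(psi, T(psi))) < 1e-12 for psi in samples)
```

Since $T\psi$ has the same norm as $\psi$ and is orthogonal to it, the pair $\{\psi,T\psi\}$ spans a two-dimensional subspace, which is the partner state referred to in the paragraph above.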
## Continuous Symmetries And Conserved Observables
The next question is how a continuous family of symmetries appears infinitesimally. In quantum mechanics, continuous unitary symmetries are governed by self-adjoint generators, and conservation laws are expressed by commutators with the Hamiltonian.
[definition: Strongly Continuous One Parameter Unitary Group]
Let $H$ be a complex Hilbert space. A strongly continuous one parameter unitary group is a map
\begin{align*}
\mathbb R &\to \mathcal L(H), & t&\mapsto U_t
\end{align*}
such that each $U_t:H\to H$ is unitary, $U_0=I$, $U_{s+t}=U_sU_t$, and $\|U_t\psi-\psi\|_H\to 0$ as $t\to 0$ for every $\psi\in H$.
[/definition]
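A finite-dimensional instance makes the definition concrete. The sketch below assumes the convention $U_t=e^{-itA}$ with the hypothetical generator $A=\omega\sigma_x$ on $\mathbb C^2$, for which $e^{-itA}=\cos(\omega t)I-i\sin(\omega t)\sigma_x$. It checks the group law and recovers $A\psi$ from the difference quotient $i(U_\varepsilon\psi-\psi)/\varepsilon$.

```python
import math

OMEGA = 0.8  # illustrative scale; the generator is A = OMEGA * sigma_x

A = [[0.0, OMEGA], [OMEGA, 0.0]]

def U(t):
    """Closed form of exp(-i t A) for A = OMEGA * sigma_x."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return [[c, -1j * s], [-1j * s, c]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def apply(P, v):
    return [sum(P[i][k] * v[k] for k in range(2)) for i in range(2)]

def max_entry_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

# Group law U_{s+t} = U_s U_t.
s, t = 0.37, -1.2
group_law_error = max_entry_diff(U(s + t), matmul(U(s), U(t)))

# Difference quotient i (U_eps psi - psi) / eps approximates A psi.
psi = [0.6 + 0.0j, 0.8j]
eps = 1e-6
quotient = [1j * (u - v) / eps for u, v in zip(apply(U(eps), psi), psi)]
generator_error = max(abs(q - a) for q, a in zip(quotient, apply(A, psi)))
```

In finite dimensions every vector lies in the domain of the generator, so the sketch cannot display the domain questions that arise for unbounded generators such as momentum.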
Such a group describes a symmetry depending continuously on a real parameter, such as time translations, spatial translations, or rotations about an axis. To use such symmetries in spectral theory, we need to replace the whole family $(U_t)$ by a single observable-like object. Stone's theorem provides this infinitesimal generator and explains why self-adjoint operators are the natural quantum counterparts of classical conserved quantities.
[quotetheorem:6975]
Stone's theorem turns continuous symmetry into an operator, but it requires strong continuity. A discontinuous homomorphism $\mathbb R\to U(1)$ can be built using a noncontinuous additive function $\mathbb R\to\mathbb R$, and such a family has no densely defined self-adjoint generator obtained by differentiating at $0$. The conclusion is also not a bounded-operator statement: the generator of translations on $L^2(\mathbb R)$ is momentum, which is unbounded and defined only on a proper dense domain. Stone's theorem by itself produces generators, not conservation laws; conservation is a separate statement about time evolution under a Hamiltonian. The next definition fixes the Schrödinger-picture meaning of being conserved: all expectations of the observable stay unchanged along the unitary flow generated by $H_0$.
[definition: Constant Of Motion]
Let $H$ be a complex Hilbert space, let $H_0:D(H_0)\subset H\to H$ be a self-adjoint Hamiltonian, and let $A:D(A)\subset H\to H$ be a self-adjoint observable. The observable $A$ is a constant of motion for $H_0$ if
\begin{align*}
(e^{-itH_0}\psi,Ae^{-itH_0}\psi)_H = (\psi,A\psi)_H
\end{align*}
for all $t\in\mathbb R$ and all $\psi$ in a common invariant domain on which the displayed expression is defined.
[/definition]
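A two-level toy model shows what the definition distinguishes. The sketch below assumes the hypothetical Hamiltonian $H_0=\omega\sigma_z$ on $\mathbb C^2$ and compares the expectation of $\sigma_z$, which commutes with $H_0$, with that of $\sigma_x$, which does not. All names and numerical values are illustrative.

```python
import cmath

OMEGA = 1.5  # illustrative; H0 = OMEGA * sigma_z = diag(OMEGA, -OMEGA)

SIGMA_Z = [[1, 0], [0, -1]]
SIGMA_X = [[0, 1], [1, 0]]

def evolve(psi, t):
    """Apply exp(-i t H0) for the diagonal H0 = diag(OMEGA, -OMEGA)."""
    return [cmath.exp(-1j * OMEGA * t) * psi[0], cmath.exp(1j * OMEGA * t) * psi[1]]

def expectation(A, psi):
    """(psi, A psi), which is real for Hermitian A."""
    A_psi = [sum(A[i][k] * psi[k] for k in range(2)) for i in range(2)]
    return sum(p * q.conjugate() for p, q in zip(psi, A_psi)).real

psi0 = [0.6 + 0j, 0.8 + 0j]
times = [0.0, 0.3, 1.0, 2.5]

# sigma_z commutes with H0: the expectation stays at 0.36 - 0.64 = -0.28.
sz_values = [expectation(SIGMA_Z, evolve(psi0, t)) for t in times]

# sigma_x does not commute with H0: the expectation is 0.96 * cos(2 * OMEGA * t).
sx_values = [expectation(SIGMA_X, evolve(psi0, t)) for t in times]
```

Here `sz_values` is constant while `sx_values` oscillates, so $\sigma_z$ is a constant of motion for this $H_0$ and $\sigma_x$ is not. Because both matrices are bounded, the common invariant domain in the definition is all of $\mathbb C^2$.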
The domain clause is part of the mathematical content: for unbounded operators, commutators are meaningful only on vectors where both compositions make sense. We now want a test for conservation that does not require computing the full time evolution. The commutator criterion gives that test by reducing conservation to the vanishing of the infinitesimal change of the observable.
[quotetheorem:6976]
[citeproof:6976]
This criterion is the quantum analogue of the Poisson bracket condition $\{H,a\}=0$ from Hamiltonian mechanics. Its limitation is the same place where unbounded operators enter: a formal commutator calculation on a small set of test functions need not determine a self-adjoint operator identity. For example, position and momentum have a well-defined commutator on the Schwartz space, but neither product is defined on all of $L^2(\mathbb R)$. A dense invariant core is therefore part of the criterion, not bookkeeping. With that domain control in place, conserved observables are precisely those commuting with the Hamiltonian. Rotationally invariant Hamiltonians give the central example, because their continuous symmetry group has angular momentum as its infinitesimal generator.
[example: Rotational Invariance Of A Central Hamiltonian]
Let $H=L^2(\mathbb R^3)$ and let the central Hamiltonian be
\begin{align*}
H_0=-\frac{1}{2m}\Delta+V(|x|)
\end{align*}
on a self-adjoint domain $D(H_0)$. For $R\in SO(3)$ define $(U_R\psi)(x)=\psi(R^{-1}x)$. Since $R$ preserves Lebesgue measure,
\begin{align*}
(U_R\psi,U_R\phi)_{L^2}=\int_{\mathbb R^3}\psi(R^{-1}x)\overline{\phi(R^{-1}x)}\,dx=\int_{\mathbb R^3}\psi(y)\overline{\phi(y)}\,dy=(\psi,\phi)_{L^2}.
\end{align*}
Also,
\begin{align*}
(U_RU_S\psi)(x)=(U_S\psi)(R^{-1}x)=\psi(S^{-1}R^{-1}x)=\psi((RS)^{-1}x)=(U_{RS}\psi)(x),
\end{align*}
so $U_R^{-1}=U_{R^{-1}}$ and $U_R$ is unitary.
We verify the operator identity first on a smooth compactly supported core on which all displayed derivatives are defined. Put $\chi=U_R^{-1}\psi=U_{R^{-1}}\psi$, so $\chi(y)=\psi(Ry)$. For each coordinate $a$,
\begin{align*}
\partial_{y_a}\chi(y)=\sum_{b=1}^3 R_{ba}(\partial_b\psi)(Ry).
\end{align*}
Differentiating once more gives
\begin{align*}
\partial_{y_a}^2\chi(y)=\sum_{b=1}^3\sum_{c=1}^3 R_{ba}R_{ca}(\partial_b\partial_c\psi)(Ry).
\end{align*}
Hence, using $\sum_{a=1}^3R_{ba}R_{ca}=\delta_{bc}$ because $RR^\top=I$,
\begin{align*}
(\Delta\chi)(y)=\sum_{b=1}^3\sum_{c=1}^3\delta_{bc}(\partial_b\partial_c\psi)(Ry)=(\Delta\psi)(Ry).
\end{align*}
Evaluating at $y=R^{-1}x$ yields
\begin{align*}
(U_R\Delta U_R^{-1}\psi)(x)=(\Delta\chi)(R^{-1}x)=(\Delta\psi)(x).
\end{align*}
For the potential term, orthogonality of $R$ gives $|R^{-1}x|=|x|$, so
\begin{align*}
(U_RV(|x|)U_R^{-1}\psi)(x)=V(|R^{-1}x|)\psi(x)=V(|x|)\psi(x).
\end{align*}
Therefore, on the core,
\begin{align*}
(U_RH_0U_R^{-1}\psi)(x)=-\frac{1}{2m}(\Delta\psi)(x)+V(|x|)\psi(x)=(H_0\psi)(x).
\end{align*}
If $U_RD(H_0)=D(H_0)$, this core identity extends to $U_RH_0U_R^{-1}=H_0$ on $D(H_0)$.
For rotations about the coordinate axes, Stone's theorem identifies the infinitesimal generators as the angular momentum operators
\begin{align*}
L_j=-i(x\times\nabla)_j
\end{align*}
on their self-adjoint domains. Since $U_{R_j(t)}H_0U_{R_j(t)}^{-1}=H_0$ for the rotation group $R_j(t)$, differentiating at $t=0$ on the common smooth core gives
\begin{align*}
0=\frac{d}{dt}\bigg|_{t=0}\bigl(U_{R_j(t)}H_0U_{R_j(t)}^{-1}\psi\bigr)=-iL_jH_0\psi+iH_0L_j\psi=i[H_0,L_j]\psi.
\end{align*}
Thus $[H_0,L_j]\psi=0$ on that core. By the [commutator criterion for constants of motion](/theorems/6976), each $L_j$ is conserved on the corresponding invariant domain. A central Hamiltonian is unchanged by rotations, and its infinitesimal conserved quantities are precisely the angular momentum components.
[/example]
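The core identity $(\Delta\chi)(y)=(\Delta\psi)(Ry)$ for $\chi(y)=\psi(Ry)$ can be tested with finite differences. The sketch below is a numerical illustration under stated assumptions: the non-radial Gaussian `psi`, the rotation angle `THETA` about the $x_3$-axis, the evaluation point, and the step size are all hypothetical choices.

```python
import math

def rotate_z(theta, x):
    """Apply the rotation R about the x3-axis by angle theta to the point x."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1], x[2])

def psi(x):
    """Smooth, rapidly decaying, non-radial test function (illustrative)."""
    return math.exp(-(x[0] ** 2 + 2.0 * x[1] ** 2 + 3.0 * x[2] ** 2))

def laplacian_psi_exact(x):
    """Closed-form Laplacian of psi, used as an independent reference."""
    return (-12.0 + 4.0 * x[0] ** 2 + 16.0 * x[1] ** 2 + 36.0 * x[2] ** 2) * psi(x)

def laplacian(f, x, h=1e-4):
    """Second-order central-difference approximation of (Delta f)(x)."""
    total = 0.0
    for a in range(3):
        plus, minus = list(x), list(x)
        plus[a] += h
        minus[a] -= h
        total += (f(plus) - 2.0 * f(x) + f(minus)) / h ** 2
    return total

THETA = 0.9

def chi(y):
    """chi(y) = psi(R y), that is, chi = U_R^{-1} psi."""
    return psi(rotate_z(THETA, y))

y = (0.3, -0.2, 0.5)
lhs = laplacian(chi, y)                      # (Delta chi)(y)
rhs = laplacian(psi, rotate_z(THETA, y))     # (Delta psi)(R y)
reference = laplacian_psi_exact(rotate_z(THETA, y))
```

The values `lhs`, `rhs`, and `reference` agree up to discretization error even though `psi` itself is not rotation invariant: only the Laplacian and the radial potential need to be invariant for $U_RH_0U_R^{-1}=H_0$.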
Rotational invariance therefore supplies conserved angular momentum. Because the components of angular momentum do not commute with one another, the next organizational problem is to choose maximal compatible collections, such as $H_0$, $L^2$, and $L_3$ in central potentials.
## Commuting Observables And Simultaneous Quantum Numbers
A single self-adjoint operator gives one spectral decomposition. Many quantum systems require several labels at once, such as energy, total angular momentum, and one component of angular momentum. The mathematical question is when several observables can be diagonalized in the same basis.
[definition: Compatible Observables]
Let $H$ be a complex Hilbert space. Two self-adjoint observables $A:D(A)\subset H\to H$ and $B:D(B)\subset H\to H$ are compatible in the pure point setting if there exists an orthonormal basis of $H$ consisting of common eigenvectors of $A$ and $B$.
[/definition]
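A three-dimensional matrix example illustrates the definition. In the sketch below the Hermitian matrices `A` and `B` are hypothetical: `A` has a degenerate eigenvalue, so the standard basis diagonalizes `A` but not `B`, while a rotated basis carries both labels at once.

```python
A = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]   # eigenvalue 1 is twofold degenerate
B = [[0, 1, 0], [1, 0, 0], [0, 0, 5]]   # acts nontrivially inside that eigenspace

def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def is_eigenvector(M, v, lam, tol=1e-12):
    """Check M v = lam v componentwise."""
    return all(abs(w - lam * x) < tol for w, x in zip(matvec(M, v), v))

r = 2 ** -0.5
# Each entry: (vector, eigenvalue of A, eigenvalue of B).
common_basis = [
    ([r, r, 0.0], 1, 1),
    ([r, -r, 0.0], 1, -1),
    ([0.0, 0.0, 1.0], 2, 5),
]

commute = matmul(A, B) == matmul(B, A)
labels_ok = all(
    is_eigenvector(A, v, a) and is_eigenvector(B, v, b) for v, a, b in common_basis
)

# e1 is an eigenvector of A but of no eigenvalue of B.
e1 = [1.0, 0.0, 0.0]
e1_fails_for_B = not any(is_eigenvector(B, e1, lam) for lam in (1, -1, 5))
```

The pairs $(1,1)$, $(1,-1)$, $(2,5)$ label the common basis uniquely, although the eigenvalue of `A` alone does not.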
Compatibility is stronger than the existence of separate eigenbases, because the same vector must carry both labels. The practical sufficient condition used throughout elementary quantum mechanics is commutation, together with enough eigenvectors to span the Hilbert space. The next theorem explains why commuting observables can be used as simultaneous quantum numbers in the pure point case.
[quotetheorem:6977]
The role of spectral projections is to make the unbounded-operator statement precise. Mere commutation on a convenient dense domain is too weak for unbounded operators: it need not imply that the spectral projections commute, and it need not imply that the eigenspaces of one operator reduce the other. The pure point hypothesis is also a limitation; neither position nor momentum on $L^2(\mathbb R)$ has any eigenvector in $H$, because their spectral representations are continuous rather than discrete. In finite-dimensional Hilbert spaces, the theorem reduces to the familiar fact that commuting Hermitian matrices are simultaneously unitarily diagonalizable. Even potentials give a basic infinite-dimensional example in which energy and parity can be used together.
[example: Energy And Parity In An Even Potential]
Let $H=L^2(\mathbb R)$ and let $V:\mathbb R\to\mathbb R$ be even. Suppose
\begin{align*}
H_0=-\frac{1}{2m}\frac{d^2}{dx^2}+V(x)
\end{align*}
defines a self-adjoint Hamiltonian on a domain $D(H_0)$ satisfying $\Pi D(H_0)=D(H_0)$ for the parity operator $(\Pi\psi)(x)=\psi(-x)$. Since
\begin{align*}
(\Pi^2\psi)(x)=(\Pi\psi)(-x)=\psi(x),
\end{align*}
we have $\Pi^2=I$ and hence $\Pi^{-1}=\Pi$. For $\psi,\phi\in L^2(\mathbb R)$, the substitution $y=-x$ gives
\begin{align*}
(\Pi\psi,\Pi\phi)_{L^2}=\int_{\mathbb R}\psi(-x)\overline{\phi(-x)}\,dx=\int_{\mathbb R}\psi(y)\overline{\phi(y)}\,dy=(\psi,\phi)_{L^2},
\end{align*}
so $\Pi$ is unitary.
On vectors for which the displayed derivatives are defined, put $\chi=\Pi^{-1}\psi=\Pi\psi$, so $\chi(x)=\psi(-x)$. Then
\begin{align*}
\chi'(x)=-\psi'(-x).
\end{align*}
Differentiating once more,
\begin{align*}
\chi''(x)=\psi''(-x).
\end{align*}
Therefore
\begin{align*}
(\Pi H_0\Pi^{-1}\psi)(x)=(H_0\chi)(-x)=-\frac{1}{2m}\chi''(-x)+V(-x)\chi(-x).
\end{align*}
Substituting $\chi''(-x)=\psi''(x)$ and $\chi(-x)=\psi(x)$ gives
\begin{align*}
(\Pi H_0\Pi^{-1}\psi)(x)=-\frac{1}{2m}\psi''(x)+V(-x)\psi(x).
\end{align*}
Since $V$ is even, $V(-x)=V(x)$, hence
\begin{align*}
(\Pi H_0\Pi^{-1}\psi)(x)=-\frac{1}{2m}\psi''(x)+V(x)\psi(x)=(H_0\psi)(x).
\end{align*}
Thus $\Pi H_0\Pi^{-1}=H_0$ on $D(H_0)$, equivalently $\Pi H_0=H_0\Pi$.
Now let $\psi\ne0$ satisfy $H_0\psi=E\psi$. Commutation gives
\begin{align*}
H_0(\Pi\psi)=\Pi H_0\psi=\Pi(E\psi)=E\Pi\psi,
\end{align*}
so $\Pi\psi$ lies in the same energy eigenspace. If that eigenspace is one-dimensional, then $\Pi\psi=c\psi$ for some $c\in\mathbb C$. Since $\Pi$ is unitary, $\|\Pi\psi\|=\|\psi\|$, so $|c|=1$. Applying $\Pi$ again gives
\begin{align*}
\psi=\Pi^2\psi=\Pi(c\psi)=c\Pi\psi=c^2\psi.
\end{align*}
Because $\psi\ne0$, $c^2=1$, hence $c=1$ or $c=-1$. In the first case $\psi(-x)=\psi(x)$, so $\psi$ is even; in the second case $\psi(-x)=-\psi(x)$, so $\psi$ is odd.
If an energy eigenspace has dimension greater than one, $\Pi$ maps that eigenspace to itself and satisfies $\Pi^2=I$ there. The projections
\begin{align*}
P_+=\frac{1}{2}(I+\Pi)
\end{align*}
and
\begin{align*}
P_-=\frac{1}{2}(I-\Pi)
\end{align*}
split the eigenspace into the $\Pi=1$ and $\Pi=-1$ subspaces, since $\Pi P_+=P_+$ and $\Pi P_-=-P_-$. Equivalently, by *[Simultaneous Diagonalization For Commuting Self-Adjoint Operators](/theorems/6977)*, the commuting observables $H_0$ and $\Pi$ admit an energy basis chosen from parity eigenstates. Thus an even potential lets the spectral problem separate into even and odd sectors.
[/example]
This example shows how symmetry reduces the complexity of solving the spectral problem. Instead of diagonalizing $H_0$ on all of $L^2(\mathbb R)$ at once, one may work separately in the even and odd subspaces.
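The reduction to even and odd sectors can be sketched on a grid. The code below assumes a second-order finite-difference Hamiltonian with zero boundary values, a hypothetical even potential, and an arbitrary test vector; none of these choices comes from the text. It checks that the discrete $H_0$ commutes with $\Pi$ and that $P_\pm$ produce even and odd vectors preserved by $H_0$.

```python
import math

N = 101            # odd, so the grid is symmetric about 0
h = 0.1            # grid spacing (illustrative)
MASS = 1.0
xs = [(j - (N - 1) // 2) * h for j in range(N)]

def V(x):
    """Even potential (illustrative choice)."""
    return 0.5 * x * x + 0.1 * x ** 4

def parity(psi):
    return psi[::-1]

def hamiltonian(psi):
    """-(1/2m) psi'' + V psi, second-order difference, psi = 0 outside the grid."""
    n = len(psi)
    out = []
    for j in range(n):
        left = psi[j - 1] if j > 0 else 0.0
        right = psi[j + 1] if j < n - 1 else 0.0
        second = (left - 2.0 * psi[j] + right) / h ** 2
        out.append(-second / (2.0 * MASS) + V(xs[j]) * psi[j])
    return out

def project(psi, sign):
    """P_+ for sign = +1 and P_- for sign = -1: (I + sign * Pi) / 2."""
    return [0.5 * (a + sign * b) for a, b in zip(psi, parity(psi))]

def max_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

psi = [math.exp(-(x - 0.4) ** 2) * (1.0 + 0.3 * x) for x in xs]   # no parity symmetry

commutator_error = max_diff(parity(hamiltonian(psi)), hamiltonian(parity(psi)))
even_part, odd_part = project(psi, +1), project(psi, -1)

# H0 maps the even sector to itself and the odd sector to itself.
even_sector_error = max_diff(parity(hamiltonian(even_part)), hamiltonian(even_part))
odd_sector_error = max_diff(parity(hamiltonian(odd_part)), [-v for v in hamiltonian(odd_part)])
```

Because the discrete Hamiltonian preserves each sector, an eigenvalue solver could be run on the even and odd halves separately, each of roughly half the original size.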
[remark: Degeneracy And Symmetry Labels]
Commuting conserved observables provide quantum numbers, but they need not remove all degeneracy. Remaining degeneracy often signals a larger symmetry algebra or an antiunitary constraint such as Kramers degeneracy. In applications, the art is to find a complete commuting family large enough to label states without demanding commutation from observables, such as different angular momentum components, that cannot be simultaneously measured.
[/remark]
The chapter's central pattern is now visible. Wigner's theorem restricts the possible forms of symmetry, Stone's theorem turns continuous unitary symmetries into self-adjoint generators, the commutator criterion identifies conserved observables, and simultaneous diagonalization converts commuting conservation laws into usable spectral labels.
Symmetry theory identifies the conserved quantities and spectral labels that organize quantum motion. The final chapter uses that structure together with spectral and approximation methods to study scattering, effective descriptions, and the passage toward classical mechanics.
# 12. Approximation, Scattering, and Semiclassical Limits
This chapter completes the course by bringing together spectral theory, approximation methods, and the classical limit of quantum mechanics. The prerequisites are the self-adjointness of Schrödinger operators, unitary time evolution, the distinction between point and continuous spectrum, and the basic Fourier representation of the free particle. The main questions are how scattering states are compared with free states, how scattering amplitudes are approximated, and how small-$\hbar$ asymptotics recover classical motion.
The organizing theme is approximation under control. Scattering theory compares exact dynamics with free dynamics at large times, perturbation theory replaces an exact integral equation by computable leading terms, and semiclassical analysis explains why rapidly oscillating quantum phases concentrate near classical trajectories. These topics also prepare the bridge from Hilbert-space spectral theory to more geometric subjects such as Hamilton-Jacobi theory and microlocal analysis.
## Scattering by Short-Range Potentials
The basic scattering question asks what remains observable when a particle is sent in from far away, interacts with a localized potential, and is detected far away again. Bound states describe localized energy levels, but scattering states belong to the continuous spectrum, and the associated fixed-energy solutions cannot be normalized as vectors in $L^2(\mathbb R^3)$. The mathematical replacement is to compare the interacting dynamics with the free dynamics as $t \to \pm \infty$.
We work first with the Schrödinger operator
\begin{align*}
H &= H_0 + V, & H_0 &= -\frac{\hbar^2}{2m}\Delta
\end{align*}
on $L^2(\mathbb R^3)$, where $V: \mathbb R^3 \to \mathbb R$ is a real-valued potential decaying sufficiently fast at infinity. The exact hypotheses needed for full scattering theory are technical; for this introductory treatment, a smooth compactly supported potential is the clean model.
[definition: Short-Range Potential]
A real-valued measurable function $V: \mathbb R^3 \to \mathbb R$ is called short-range for this chapter if $V$ is bounded and there exist constants $C>0$ and $\varepsilon>0$ such that
\begin{align*}
|V(x)| \le C(1+|x|)^{-1-\varepsilon}
\end{align*}
for all $x \in \mathbb R^3$.
[/definition]
This condition says that the interaction is localized enough for incoming and outgoing particles to resemble free particles at large times. Compactly supported potentials and rapidly decaying potentials are the guiding examples, while Coulomb scattering requires a modified long-range theory.
[example: Compactly Supported Scattering Potential]
Let $V\in C_c^\infty(\mathbb R^3;\mathbb R)$ and set $H=H_0+V$ on $L^2(\mathbb R^3)$. Since $V$ is continuous with compact support, there is $M<\infty$ such that $|V(x)|\le M$ for all $x$, and multiplication by $V$ satisfies
\begin{align*}
\|V\psi\|_{L^2}^2=\int_{\mathbb R^3}|V(x)|^2|\psi(x)|^2\,d\mathcal L^3(x)\le M^2\|\psi\|_{L^2}^2.
\end{align*}
Because $V$ is real-valued, this bounded multiplication operator is self-adjoint, so by the *bounded perturbation theorem for self-adjoint operators*, $H=H_0+V$ is self-adjoint on $D(H_0)=H^2(\mathbb R^3)$.
The compact support gives a fixed radius $R>0$ with $\operatorname{supp}V\subseteq B_R(0)$. For every $\psi\in H^2(\mathbb R^3)$ and every $x\notin B_R(0)$,
\begin{align*}
(H\psi)(x)=(H_0\psi)(x)+V(x)\psi(x)=(H_0\psi)(x).
\end{align*}
Thus the only part of the wave affected by the interaction is the part lying in the fixed ball where $V$ is nonzero. For a free wave packet whose momentum is concentrated away from $0$, propagation carries most of its $L^2$ mass outside this ball in the remote past and remote future, and the interaction estimate
\begin{align*}
\|V e^{-itH_0/\hbar}\psi\|_{L^2}\le M\|\mathbf 1_{B_R(0)}e^{-itH_0/\hbar}\psi\|_{L^2}
\end{align*}
makes precise why the particle is asymptotically free before and after it crosses the interaction region. This compactly supported case is the clean model for the incoming-free, interaction, outgoing-free picture of scattering.
[/example]
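For a Gaussian packet the escape from the interaction region can be made quantitative. The sketch below rests on assumptions not stated in the example: the initial state is a product of one-dimensional Gaussians of position width `SIGMA0` with mean wave number `K0` along the first axis, and the standard free-spreading formula $\sigma(t)=\sigma_0\sqrt{1+(\hbar t/(2m\sigma_0^2))^2}$ is used in each coordinate. Since $B_R(0)\subseteq[-R,R]^3$, the mass in the cube bounds $\|\mathbf 1_{B_R(0)}e^{-itH_0/\hbar}\psi\|_{L^2}^2$ from above.

```python
import math

HBAR, MASS = 1.0, 1.0   # units chosen for illustration
SIGMA0 = 1.0            # initial position width of the packet
K0 = 2.0                # mean wave number along the first axis
R = 2.0                 # radius of the ball containing supp V

def width(t):
    """Position width of a free Gaussian packet at time t."""
    return SIGMA0 * math.sqrt(1.0 + (HBAR * t / (2.0 * MASS * SIGMA0 ** 2)) ** 2)

def interval_mass(t, k):
    """Probability of finding a free 1D Gaussian packet in [-R, R] at time t.
    The centre moves with velocity HBAR * k / MASS."""
    mu = HBAR * k * t / MASS
    s = math.sqrt(2.0) * width(t)
    return 0.5 * (math.erf((R - mu) / s) - math.erf((-R - mu) / s))

def cube_mass(t):
    """Probability in the cube [-R, R]^3, an upper bound for the mass in B_R(0)."""
    return interval_mass(t, K0) * interval_mass(t, 0.0) ** 2

masses = {t: cube_mass(t) for t in (0.0, 5.0, 50.0)}
```

The bound decreases rapidly with $|t|$ in both time directions, which is the qualitative content of the incoming-free and outgoing-free picture. This Gaussian has some momentum weight near $0$, so the sketch illustrates the picture rather than reproducing the hypotheses of a precise propagation estimate.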
The example shows the physical picture, but it still describes packets informally. To turn this into an operator statement, we need a limit that says interacting evolution becomes free evolution in the remote past and future.
[definition: Wave Operators]
Let $H_0$ and $H$ be self-adjoint operators on a Hilbert space $\mathcal H$. Let $\mathcal H_{\mathrm{ac}}(H_0)=P_{\mathrm{ac}}(H_0)\mathcal H$. The incoming and outgoing wave operators are the operators
\begin{align*}
\Omega_\pm &: \mathcal H_{\mathrm{ac}}(H_0) \to \mathcal H
\end{align*}
defined by the strong limits
\begin{align*}
\Omega_\pm \psi = \lim_{t \to \mp\infty} e^{itH/\hbar}e^{-itH_0/\hbar}\psi,
\end{align*}
whenever these limits exist for every $\psi\in\mathcal H_{\mathrm{ac}}(H_0)$.
[/definition]
The sign convention records whether a free state is matched to the interacting state in the remote past or remote future. The next issue is existence: without an existence theorem, the definition would only name the desired comparison, not justify that scattering states are available for the potentials used in the course.
[quotetheorem:6978]
This theorem supplies the asymptotic identification between free and interacting states, but its hypotheses are doing real work. Compact support and smoothness remove the long tails and singularities that would otherwise complicate the Cook estimate; the packet must spend only a finite effective time in the interaction region. The result also avoids threshold phenomena near zero energy, where slow packets may interact for too long for the same proof to apply directly. Coulomb potentials are the standard warning: their $1/|x|$ tail changes the asymptotic free dynamics, so the wave operators need long-range phase corrections rather than the unmodified comparison above. The theorem as stated gives existence and the basic range information needed in this course; full asymptotic completeness is a stronger assertion identifying the entire absolutely continuous subspace of $H$ with scattering states.
The wave operators separately answer the past and future comparison problems, so they still leave the observable input-output question unresolved. An experiment does not measure the intermediate interacting state directly; it prepares incoming free data and records outgoing free data. To obtain that observable scattering transformation, we compose the two asymptotic identifications in the correct order: start with incoming free data, build the interacting state that has that incoming asymptote, and then read off the outgoing free data determined by the same interacting evolution. This construction removes the unobservable middle of the motion and keeps only the input-output map on the free absolutely continuous subspace.
[definition: Scattering Operator]
Assume the wave operators $\Omega_\pm$ exist and are isometries on $\mathcal H_{\mathrm{ac}}(H_0)=P_{\mathrm{ac}}(H_0)\mathcal H$. With the convention above, $\Omega_+$ is the limit $t\to-\infty$ (incoming) and $\Omega_-$ is the limit $t\to+\infty$ (outgoing). The scattering operator is the map
\begin{align*}
S &: \mathcal H_{\mathrm{ac}}(H_0) \to \mathcal H_{\mathrm{ac}}(H_0), & S&=\Omega_-^*\Omega_+.
\end{align*}
[/definition]
The operator $S$ commutes with the free Hamiltonian $H_0$, so in the spectral representation of $H_0$ it decomposes into fixed-energy scattering matrices. In three dimensions, those fixed-energy objects encode scattering amplitudes and cross sections.
## The Lippmann-Schwinger Equation and the Born Approximation
The next problem is how to compute the scattering matrix, even approximately. Plane waves are not elements of $L^2(\mathbb R^3)$, but they give the correct generalized eigenfunctions for the continuous spectrum. The [Lippmann-Schwinger equation](/theorems/6979) rewrites the eigenvalue equation for $H_0+V$ as an integral equation with outgoing or incoming boundary conditions.
Fix an incoming wave vector $k \in \mathbb R^3$ with $k \ne 0$ and energy
\begin{align*}
E_k = \frac{\hbar^2 |k|^2}{2m}.
\end{align*}
The free incoming plane wave is $e^{ik\cdot x}$. The outgoing scattered solution should solve $(H-E_k)\psi=0$ and resemble a plane wave plus an outgoing spherical wave at infinity.
[quotetheorem:6979]
The equation is exact within the outgoing resolvent framework, but it is not a harmless algebraic rearrangement. The boundary value prescription selects the radiation condition and excludes the incoming solution with the same formal eigenvalue equation. Each hypothesis rules out a specific failure mode. If compact support is replaced by the Coulomb potential $V(x)=\gamma/|x|$, the asymptotic wave contains a logarithmic long-range phase, so the free outgoing kernel no longer gives the correct large-distance comparison. If smoothness is replaced by a point interaction $V=\alpha\delta_0$, the product $V\psi$ is a distribution supported at the origin and the equation must be interpreted through boundary conditions at $0$, not by the displayed [Lebesgue integral](/page/Lebesgue%20Integral). If the boundary value $(H_0-E_k-i0)^{-1}$ is not available at the energy under discussion, as can occur at a zero-energy resonance or at an embedded positive eigenvalue for a slowly decaying oscillatory potential, the limiting absorption step fails and the radiation condition does not select a bounded integral operator on the required source class. The compactly supported smooth model keeps these issues out of the course calculations, while the full limiting absorption principle belongs to a more advanced scattering course.
The equation is still implicit, because the unknown wave appears inside the interaction term. This motivates a controlled first approximation: preserve the outgoing Green function while replacing the interior wave by data we already know.
[definition: Born Approximation]
Let $\mathcal V=C_c^\infty(\mathbb R^3;\mathbb R)$ and let
\begin{align*}
\mathcal A_{\mathrm{Born}}:\{(V,k',k)\in \mathcal V\times\mathbb R^3\times(\mathbb R^3\setminus\{0\}): |k'|=|k|\}\to\mathbb C
\end{align*}
be the map obtained from the outgoing Lippmann-Schwinger equation by replacing $\psi_k^{(+)}(y)$ under the interaction integral by the incident plane wave $e^{ik\cdot y}$. The complex number $\mathcal A_{\mathrm{Born}}(V,k',k)$ is called the first Born approximation to the elastic scattering amplitude from incoming momentum $k$ to outgoing momentum $k'$.
[/definition]
This substitution turns the scattering problem into a Fourier integral. The coefficient of the outgoing spherical wave is the observable scattering amplitude, so the calculation must identify exactly which Fourier mode of $V$ appears when the incoming and outgoing wave vectors have the same length. The issue is not merely evaluation of an integral, but the conversion from the Born replacement inside the Lippmann-Schwinger equation to the far-field amplitude measured at infinity.
[quotetheorem:6980]
This result explains why scattering is experimentally a Fourier probe of the interaction: the first-order amplitude samples the Fourier transform of $V$ at the momentum transfer $k'-k$. The hypotheses also mark the boundary of the formula. Compact support may be weakened to sufficient integrability, but a long-range potential such as $V(x)=\gamma/|x|$ produces a divergent ordinary Fourier integral and requires Coulomb scattering states with modified phases. If $V$ is not integrable at infinity, for instance $V(x)=(1+|x|)^{-1}$ in three dimensions, the displayed integral is not an absolutely convergent Lebesgue integral and cannot be used as written. If the potential is strong enough to create a near-threshold bound state or resonance, repeated interactions dominate and the first Born term can miss the large scattering length. The formula is therefore best read as the linear response of the scattering amplitude to a short-range integrable potential, not as a uniformly valid scattering law. Small-angle scattering measures low-frequency features of $V$, while large momentum transfer is sensitive to sharper spatial variation.
[example: Born Approximation for a Gaussian Potential]
Let $V(x)=V_0e^{-|x|^2/a^2}$ with $V_0\in\mathbb R$ and $a>0$, and write the momentum transfer as $q=k'-k$. By *[First Born Scattering Amplitude](/theorems/6980)*,
\begin{align*}
f_{\mathrm{Born}}(k',k)=-\frac{m}{2\pi\hbar^2}\int_{\mathbb R^3}e^{-iq\cdot x}V_0e^{-|x|^2/a^2}\,d\mathcal L^3(x).
\end{align*}
Since $|x|^2=x_1^2+x_2^2+x_3^2$ and $q\cdot x=q_1x_1+q_2x_2+q_3x_3$, the integral factors by [Fubini's theorem](/theorems/2961):
\begin{align*}
\int_{\mathbb R^3}e^{-iq\cdot x}e^{-|x|^2/a^2}\,d\mathcal L^3(x)=\prod_{j=1}^3\int_{\mathbb R}e^{-x_j^2/a^2}e^{-iq_jx_j}\,dx_j.
\end{align*}
For each component, completing the square gives
\begin{align*}
-\frac{s^2}{a^2}-iq_js=-\frac{(s+ia^2q_j/2)^2}{a^2}-\frac{a^2q_j^2}{4}.
\end{align*}
Using the standard Gaussian Fourier integral obtained from this completion of the square,
\begin{align*}
\int_{\mathbb R}e^{-s^2/a^2}e^{-iq_js}\,ds=\sqrt{\pi}\,a\,e^{-a^2q_j^2/4}.
\end{align*}
Multiplying the three one-dimensional factors gives
\begin{align*}
\int_{\mathbb R^3}e^{-iq\cdot x}e^{-|x|^2/a^2}\,d\mathcal L^3(x)=\pi^{3/2}a^3e^{-a^2(q_1^2+q_2^2+q_3^2)/4}.
\end{align*}
Because $q_1^2+q_2^2+q_3^2=|q|^2=|k'-k|^2$, we obtain
\begin{align*}
f_{\mathrm{Born}}(k',k)=-\frac{mV_0}{2\pi\hbar^2}\pi^{3/2}a^3e^{-a^2|k'-k|^2/4}.
\end{align*}
Thus the amplitude is a Gaussian in the momentum transfer $k'-k$, with width of order $1/a$. The approximate differential cross section is therefore
\begin{align*}
\frac{d\sigma_{\mathrm{Born}}}{d\Omega}(\omega,k)=|f_{\mathrm{Born}}(|k|\omega,k)|^2=\frac{m^2V_0^2\pi a^6}{4\hbar^4}e^{-a^2||k|\omega-k|^2/2}.
\end{align*}
A wider Gaussian in position space, meaning larger $a$, produces a more concentrated Gaussian in momentum transfer.
[/example]
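The closed form above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names, the quadrature parameters, and the unit choice $m=\hbar=V_0=1$ are illustrative assumptions. It compares a midpoint-rule estimate of the one-dimensional Gaussian Fourier integral with $\sqrt{\pi}\,a\,e^{-a^2q_j^2/4}$ and assembles the Born amplitude from the closed form.

```python
import cmath
import math

def gaussian_fourier_1d(q, a, half_width=12.0, n=20000):
    """Midpoint estimate of the integral of exp(-s^2/a^2) exp(-i q s) over the real line."""
    L = half_width * a  # assumption: the integrand is negligible beyond |s| = 12 a
    h = 2.0 * L / n
    total = 0.0 + 0.0j
    for j in range(n):
        s = -L + (j + 0.5) * h
        total += math.exp(-(s / a) ** 2) * cmath.exp(-1j * q * s)
    return total * h

def gaussian_fourier_closed(q, a):
    """Closed form sqrt(pi) a exp(-a^2 q^2 / 4) from completing the square."""
    return math.sqrt(math.pi) * a * math.exp(-(a * q) ** 2 / 4.0)

def born_amplitude(qvec, a, V0=1.0, m=1.0, hbar=1.0):
    """First Born amplitude for the Gaussian potential at momentum transfer qvec."""
    q2 = sum(c * c for c in qvec)
    return -(m * V0 / (2.0 * math.pi * hbar ** 2)) * math.pi ** 1.5 * a ** 3 * math.exp(-a * a * q2 / 4.0)

if __name__ == "__main__":
    print(gaussian_fourier_1d(0.7, 1.3), gaussian_fourier_closed(0.7, 1.3))
    print(born_amplitude((0.3, -0.5, 0.2), 1.1))
```

The numerical integral has a vanishing imaginary part up to round-off, reflecting that the Gaussian is even in each coordinate.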
## Cross Sections, Unitarity, and the Optical Theorem
Approximation alone is not enough; the scattering matrix is unitary, and unitarity imposes constraints on any physically acceptable approximation. The most important introductory constraint is the [optical theorem](/theorems/6981), which relates the total cross section to the imaginary part of the forward scattering amplitude.
The Born example produces an amplitude, but experiments measure rates into solid angles. This is the three-dimensional analogue of the flux interpretation of reflection and transmission from Chapter 6. To connect the amplitude with measured scattering intensity, we package its squared modulus as a cross section.
[definition: Differential Cross Section]
For elastic scattering in $\mathbb R^3$ with fixed incoming wave vector $k\in\mathbb R^3\setminus\{0\}$, the differential cross section is the function
\begin{align*}
\frac{d\sigma}{d\Omega}(\cdot,k) &: S^2 \to [0,\infty)
\end{align*}
defined by
\begin{align*}
\frac{d\sigma}{d\Omega}(\omega,k)=|f(|k|\omega,k)|^2.
\end{align*}
[/definition]
Integrating the differential cross section over outgoing directions gives the total cross section. A unitary scattering process has only one probability budget, so the flux removed from the incoming beam cannot be independent of the wave that continues in the forward direction.
The remaining question is how this global loss from the incident beam is detected in the scattering amplitude itself. The relevant identity must connect the integral over all outgoing directions with the special forward value of the amplitude, where interference with the incoming wave records the total removed flux.
[quotetheorem:6981]
The optical theorem is a useful warning about perturbation theory, and its assumptions explain the exact form of the identity. Real-valuedness of $V$ gives a self-adjoint Hamiltonian and hence a unitary scattering operator; if $V=V_{\mathrm{re}}-iW$ with $W\ge0$ nonzero, the imaginary part models absorption and probability is lost from the elastic channel, so the integrated elastic cross section alone falls short of $(4\pi/|k|)\operatorname{Im}f(k,k)$ and an absorption cross section must be added to it. The word elastic means that incoming and outgoing particles have the same energy; if an internal excitation channel is present, integrating only over $S^2$ at the original energy omits probability carried by inelastic channels. The short-range assumption is also structural: for the Coulomb potential, the scattering amplitude has a forward singularity and the usual total cross section diverges, so the displayed formula is replaced by a long-range version with modified asymptotics. The exclusion of thresholds and embedded eigenvalues prevents singular fixed-energy limits; at zero energy for a potential with a resonance, the scattering length can blow up and the forward amplitude need not have the regular behaviour used in the derivation. The normalization hypothesis fixes the constant $4\pi/|k|$: changing the outgoing term to $c\,f e^{i|k||x|}/|x|$ rescales both the total cross section and the forward term, producing a different numerical factor. The first Born amplitude for a real potential is real in the forward direction, so it cannot by itself satisfy the theorem except in the zero-scattering limit; the imaginary contribution appears at the next order.
[example: Unitarity Check on the Born Approximation]
For the Gaussian potential $V(x)=V_0e^{-|x|^2/a^2}$ with $V_0>0$, the first Born amplitude computed above is
\begin{align*}
f_{\mathrm{Born}}(k',k)=-\frac{mV_0}{2\pi\hbar^2}\pi^{3/2}a^3e^{-a^2|k'-k|^2/4}.
\end{align*}
In the forward direction $k'=k$, so $|k-k|^2=0$ and $e^{-a^2|k-k|^2/4}=e^0=1$. Hence
\begin{align*}
f_{\mathrm{Born}}(k,k)=-\frac{mV_0}{2\pi\hbar^2}\pi^{3/2}a^3=-\frac{mV_0\sqrt{\pi}a^3}{2\hbar^2}.
\end{align*}
All factors $m,V_0,\sqrt{\pi},a^3,\hbar^2$ are positive, so $f_{\mathrm{Born}}(k,k)<0$ and is real. Therefore
\begin{align*}
\operatorname{Im} f_{\mathrm{Born}}(k,k)=0.
\end{align*}
On the other hand, the Born differential cross section is
\begin{align*}
|f_{\mathrm{Born}}(|k|\omega,k)|^2=\left(\frac{mV_0\sqrt{\pi}a^3}{2\hbar^2}\right)^2 e^{-a^2||k|\omega-k|^2/2}.
\end{align*}
The constant factor is positive because $V_0>0$, and the exponential factor is positive for every $\omega\in S^2$. Thus
\begin{align*}
\int_{S^2}|f_{\mathrm{Born}}(|k|\omega,k)|^2\,d\omega>0.
\end{align*}
The optical theorem would require the same total cross section to equal $(4\pi/|k|)\operatorname{Im}f(k,k)$ for the exact unitary amplitude, but the first Born amplitude gives a zero right-hand side and a positive left-hand side. This does not make the Born approximation useless: it shows that the retained amplitude is first order in $V_0$, while its squared cross section is second order in $V_0$ and the matching forward imaginary term only appears at the next order.
[/example]
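The mismatch can be made quantitative. Using $||k|\omega-k|^2=2|k|^2(1-\cos\theta)$ and $d\omega=2\pi\,d(\cos\theta)$ for an integrand that depends only on the angle $\theta$ between $\omega$ and $k$, the Born total cross section for the Gaussian has the closed form $2\pi C^2(1-e^{-2a^2|k|^2})/(a^2|k|^2)$ with $C=mV_0\sqrt{\pi}a^3/(2\hbar^2)$. The sketch below compares this with a direct quadrature and with the optical-theorem right-hand side evaluated on the Born amplitude; the function names and default units $m=\hbar=V_0=1$ are assumptions made for illustration.

```python
import math

def born_prefactor(a, V0=1.0, m=1.0, hbar=1.0):
    """Magnitude of the forward first Born amplitude, m V0 sqrt(pi) a^3 / (2 hbar^2)."""
    return m * V0 * math.sqrt(math.pi) * a ** 3 / (2.0 * hbar ** 2)

def born_total_cross_section_numeric(k, a, n=20000, **params):
    """Midpoint integration of |f_Born|^2 over the sphere, with u = cos(theta)."""
    c2 = born_prefactor(a, **params) ** 2
    h = 2.0 / n
    total = 0.0
    for j in range(n):
        u = -1.0 + (j + 0.5) * h
        # |k omega - k|^2 = 2 k^2 (1 - u), so |f|^2 carries exp(-a^2 k^2 (1 - u))
        total += math.exp(-a * a * k * k * (1.0 - u))
    return 2.0 * math.pi * c2 * total * h

def born_total_cross_section_closed(k, a, **params):
    """Closed form 2 pi C^2 (1 - exp(-2 a^2 k^2)) / (a^2 k^2)."""
    c2 = born_prefactor(a, **params) ** 2
    x = (a * k) ** 2
    return 2.0 * math.pi * c2 * (1.0 - math.exp(-2.0 * x)) / x

def optical_rhs_born(k, a, **params):
    """(4 pi / k) Im f_Born(k, k); the forward Born amplitude is real, so this is zero."""
    f_forward = complex(-born_prefactor(a, **params), 0.0)
    return 4.0 * math.pi / k * f_forward.imag

if __name__ == "__main__":
    k, a = 1.5, 0.8
    print(born_total_cross_section_numeric(k, a), born_total_cross_section_closed(k, a))
    print(optical_rhs_born(k, a))
```

The left-hand side is of order $V_0^2$ while the right-hand side vanishes, which is the order-counting statement made in the example.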
The unitarity check shows that scattering data has internal structure not visible from a single Fourier transform. For radial potentials, rotational symmetry motivates splitting the problem into angular momentum channels.
[definition: Phase Shift]
Let $V(x)=V(r)$ be radial, where $r=|x|$. For angular momentum $\ell\ge0$, the phase shift is the function
\begin{align*}
\delta_\ell &: (0,\infty) \to \mathbb R/\pi\mathbb Z
\end{align*}
determined by the large-$r$ asymptotic form of the radial scattering solution
\begin{align*}
u_\ell(r) \sim \sin\left(kr-\frac{\ell\pi}{2}+\delta_\ell(k)\right).
\end{align*}
[/definition]
The phase shift records the effect of the potential in a single angular momentum channel. This motivates the [partial wave expansion](/theorems/6982): since a radial potential conserves angular momentum, the full amplitude should be recoverable by summing the channel contributions with the correct angular basis.
[quotetheorem:6982]
This formula is often the most efficient method for central potentials at low energy, because only the first few angular momentum channels contribute significantly. It also gives direct tests of unitarity channel by channel through the factors $e^{i\delta_\ell}$. Its scope is limited by the symmetry and convergence assumptions behind the expansion. For a nonradial potential, angular momentum channels are coupled, so a single sequence of phase shifts no longer determines the amplitude. Even for radial potentials, the infinite sum is an asymptotic or conditionally convergent object in many physical regimes, and the displayed formula by itself does not prove existence of the scattering amplitude or justify interchanging large-$r$ limits with the angular momentum sum.
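The channel-by-channel unitarity test can be illustrated numerically. The sketch below assumes the partial wave amplitude has the form $f(\theta)=\frac{1}{k}\sum_\ell(2\ell+1)e^{i\delta_\ell}\sin(\delta_\ell)P_\ell(\cos\theta)$, truncated to finitely many channels, and uses hypothetical phase shifts chosen only for illustration. It integrates $|f|^2$ over the sphere and compares the result with $(4\pi/k)\operatorname{Im}f(0)$.

```python
import cmath
import math

def legendre(l_max, u):
    """Return [P_0(u), ..., P_l_max(u)] via (l+1) P_{l+1} = (2l+1) u P_l - l P_{l-1}."""
    p = [1.0, u]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * u * p[l] - l * p[l - 1]) / (l + 1))
    return p[: l_max + 1]

def amplitude(k, deltas, u):
    """Truncated partial wave amplitude at u = cos(theta)."""
    p = legendre(len(deltas) - 1, u)
    terms = ((2 * l + 1) * cmath.exp(1j * d) * math.sin(d) * p[l] for l, d in enumerate(deltas))
    return sum(terms) / k

def sigma_integrated(k, deltas, n=4000):
    """Simpson integration of |f|^2 over the sphere in u = cos(theta); n must be even."""
    h = 2.0 / n
    total = 0.0
    for j in range(n + 1):
        w = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += w * abs(amplitude(k, deltas, -1.0 + j * h)) ** 2
    return 2.0 * math.pi * total * h / 3.0

def sigma_optical(k, deltas):
    """Optical-theorem value (4 pi / k) Im f at the forward direction u = 1."""
    return 4.0 * math.pi / k * amplitude(k, deltas, 1.0).imag

if __name__ == "__main__":
    k, deltas = 0.9, [0.8, -0.3, 0.1]  # hypothetical phase shifts for l = 0, 1, 2
    print(sigma_integrated(k, deltas), sigma_optical(k, deltas))
```

Both numbers agree with $\frac{4\pi}{k^2}\sum_\ell(2\ell+1)\sin^2\delta_\ell$ for any real phase shifts, because $\operatorname{Im}(e^{i\delta}\sin\delta)=\sin^2\delta$ and the Legendre polynomials are orthogonal on $[-1,1]$.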
[example: Low-Energy Radial Scattering]
Suppose $V$ is radial, short-range, and attractive. For a radial potential, the effective radial equation in angular momentum channel $\ell$ contains the centrifugal term
\begin{align*}
\frac{\hbar^2\ell(\ell+1)}{2mr^2}.
\end{align*}
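For $\ell=0$ this term vanishes, while for every $\ell\ge 1$ it is a positive barrier that grows like $r^{-2}$ near the origin. At small wave number $k$ a slow particle cannot penetrate this barrier far enough to feel the short-range potential, so the higher channels are suppressed and the $\ell=0$ channel gives the leading low-energy contribution.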
Using the *Partial Wave Expansion* and keeping only the $\ell=0$ term gives
\begin{align*}
f(\theta)\approx \frac{1}{k}(2\cdot 0+1)e^{i\delta_0(k)}\sin(\delta_0(k))P_0(\cos\theta).
\end{align*}
Since $2\cdot0+1=1$ and $P_0(\cos\theta)=1$, this becomes
\begin{align*}
f(\theta)\approx \frac{1}{k}e^{i\delta_0(k)}\sin(\delta_0(k)).
\end{align*}
Taking the squared modulus and using $|e^{i\delta_0(k)}|=1$ gives
\begin{align*}
|f(\theta)|^2\approx \frac{1}{k^2}\sin^2(\delta_0(k)).
\end{align*}
This leading approximation is independent of $\theta$, so the total cross section is
\begin{align*}
\sigma_{\mathrm{tot}}(k)\approx \int_{S^2}\frac{1}{k^2}\sin^2(\delta_0(k))\,d\omega.
\end{align*}
Because $\int_{S^2}1\,d\omega=4\pi$, the integral equals
\begin{align*}
\sigma_{\mathrm{tot}}(k)\approx \frac{4\pi}{k^2}\sin^2(\delta_0(k)).
\end{align*}
If the scattering length $a$ is defined by the low-energy behaviour $\delta_0(k)\sim -ak$ as $k\to0^+$, then $\sin(\delta_0(k))\sim \delta_0(k)\sim -ak$. Substituting this into the preceding formula gives
\begin{align*}
\sigma_{\mathrm{tot}}(k)\approx \frac{4\pi}{k^2}a^2k^2=4\pi a^2.
\end{align*}
Thus low-energy radial scattering is governed, to leading order, by the $s$-wave phase shift, and the scattering length is minus the slope of $\delta_0(k)$ at $k=0$.
[/example]
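As a concrete check of the limit $\sigma_{\mathrm{tot}}\to4\pi a^2$, one can take a model in which the $s$-wave phase shift is known exactly. The sketch below assumes a hard sphere of radius $R$, which is not the attractive potential of the example and is introduced only for illustration: the $\ell=0$ radial solution is $\sin(k(r-R))$ for $r\ge R$, so $\delta_0(k)=-kR$ and the scattering length is $R$. The function names are hypothetical.

```python
import math

def s_wave_cross_section(k, delta0):
    """Leading low-energy cross section (4 pi / k^2) sin^2(delta_0)."""
    return 4.0 * math.pi * math.sin(delta0) ** 2 / k ** 2

def hard_sphere_delta0(k, R):
    """Assumed model: hard sphere of radius R, where u_0(r) = sin(k (r - R)) for r >= R."""
    return -k * R

if __name__ == "__main__":
    R = 2.0
    for k in (1.0, 0.1, 0.001):
        sigma = s_wave_cross_section(k, hard_sphere_delta0(k, R))
        print(k, sigma, 4.0 * math.pi * R ** 2)
```

As $k$ decreases, the printed cross section approaches $4\pi R^2$, four times the geometric cross-sectional area $\pi R^2$.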
## Semiclassical Scaling and WKB States
The second half of the chapter asks what quantum mechanics looks like when $\hbar$ is small compared with the action scale of the problem. In this regime the wave function oscillates rapidly, and the phase should follow classical mechanics. The WKB method turns this intuition into an asymptotic expansion.
Consider the one-dimensional stationary Schrödinger equation
\begin{align*}
-\frac{\hbar^2}{2m}\psi''(x)+V(x)\psi(x)=E\psi(x).
\end{align*}
Where $E>V(x)$, the classical momentum is
\begin{align*}
p(x)=\sqrt{2m(E-V(x))}.
\end{align*}
The oscillatory ansatz has phase derivative approximately equal to this momentum.
[definition: WKB Ansatz]
Let $I\subset\mathbb R$ be an open interval, let $V\in C^\infty(I;\mathbb R)$, and suppose $E>V(x)$ for all $x\in I$. Define $p\in C^\infty(I;(0,\infty))$ by $p(x)=\sqrt{2m(E-V(x))}$. A leading-order WKB solution on $I$ is the function $\psi_{\mathrm{WKB}}:I\to\mathbb C$ given by
\begin{align*}
\psi_{\mathrm{WKB}}(x)
= \frac{C_+}{\sqrt{p(x)}}\exp\left(\frac{i}{\hbar}\int^x p(s)\,ds\right)
+ \frac{C_-}{\sqrt{p(x)}}\exp\left(-\frac{i}{\hbar}\int^x p(s)\,ds\right),
\end{align*}
where $C_+,C_-\in\mathbb C$.
[/definition]
The amplitude factor $p(x)^{-1/2}$ is not decorative; it is what balances the transport equation at the next order. To justify both the phase and amplitude, we insert a general oscillatory ansatz and sort the resulting equation by powers of $\hbar$.
[quotetheorem:6983]
[citeproof:6983]
The calculation explains the WKB formula on intervals where $p(x)>0$, and it also states the main limitation of the ansatz. The transport equation divides by $S'$, so the derivation breaks down at points where $S'=0$, exactly the classical turning points. For example, in the linear turning-point model $V(x)=E+Fx$ with $F>0$, the allowed side has $p(x)=\sqrt{-2mFx}$ for $x<0$, and the WKB amplitude is proportional to $(-x)^{-1/4}$ as $x\to0^-$. The exact local solutions are Airy functions, which remain finite after the correct rescaling and connect the oscillatory and exponential regions. Smoothness of $V$ is also being used when the phase and amplitude are differentiated and expanded; nonsmooth potentials require separate matching conditions or weak formulations. Finally, the theorem assumes that the leading $a$ and $S$ are independent of $\hbar$; if the amplitude or potential contains $\hbar$-scale oscillations, the power counting changes. The obstruction to a global formula is therefore not cosmetic: the amplitude becomes singular where the classical momentum vanishes.
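The breakdown near a turning point can be seen numerically in the linear model. The sketch below assumes units with $2m=F=1$, so that $p(x)=\sqrt{-x}$ on $x<0$ and the equation reads $-\hbar^2\psi''+x\psi=0$; the function names and the finite-difference step are illustrative choices. For the leading WKB wave $\psi=p^{-1/2}e^{iS/\hbar}$ the eikonal and transport terms cancel exactly, leaving the residual $-\hbar^2(p^{-1/2})''e^{iS/\hbar}$, whose size relative to $|\psi|$ is $5\hbar^2/(16x^2)$ in this model.

```python
import cmath
import math

def wkb_linear(x, hbar):
    """Leading-order WKB wave for V(x) = E + F x with 2m = F = 1, on the allowed side x < 0."""
    p = math.sqrt(-x)                   # classical momentum p(x) = sqrt(-x)
    S = -(2.0 / 3.0) * (-x) ** 1.5      # antiderivative of p, so S'(x) = p(x)
    return cmath.exp(1j * S / hbar) / math.sqrt(p)

def relative_residual(x, hbar, h=1e-4):
    """|(-hbar^2 psi'' + x psi)| / |psi|, with psi'' from a central difference of step h."""
    psi_m, psi_0, psi_p = (wkb_linear(x + d, hbar) for d in (-h, 0.0, h))
    second = (psi_p - 2.0 * psi_0 + psi_m) / h ** 2
    return abs(-hbar ** 2 * second + x * psi_0) / abs(psi_0)

if __name__ == "__main__":
    hbar = 0.05
    for x in (-1.0, -0.5, -0.2):
        print(x, relative_residual(x, hbar), 5.0 * hbar ** 2 / (16.0 * x ** 2))
```

The residual is of order $\hbar^2$ away from the turning point and grows like $x^{-2}$ as $x\to0^-$, which is the quantitative form of the statement that the ansatz fails where the classical momentum vanishes.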
[definition: Classical Turning Point]
For a one-dimensional potential $V: \mathbb R\to\mathbb R$ and energy $E$, a point $x_0$ is a classical turning point if
\begin{align*}
V(x_0)=E.
\end{align*}
[/definition]
At a turning point the local model changes from oscillatory to exponential behaviour. This motivates the quantization rule: periodicity of the WKB phase must be supplemented by the phase shifts created when solutions are matched across the two turning points.
[quotetheorem:6984]
The rule converts a spectral problem into an action integral, but its hypotheses are tailored to the simplest bound-state geometry. A single allowed interval means the classical motion oscillates between two turning points without tunnelling between separate wells. Simple turning points are needed because the Airy equation is the correct local model only when $E-V(x)$ has a nonzero first derivative at the endpoint. Multiple wells introduce exponentially small splitting and connection matrices, while degenerate turning points lead to different special-function models and different phase losses. The harmonic oscillator is the best calibration example because the leading WKB condition reproduces the exact eigenvalues, so the meaning of the half-integer shift can be checked against a solved model.
[example: Harmonic Oscillator from Bohr-Sommerfeld]
For $V(x)=\frac{1}{2}m\omega^2x^2$ with $m>0$ and $\omega>0$, the turning points at energy $E>0$ are found by solving
\begin{align*}
E=\frac{1}{2}m\omega^2x^2.
\end{align*}
Multiplying by $2/(m\omega^2)$ gives
\begin{align*}
x^2=\frac{2E}{m\omega^2},
\end{align*}
so
\begin{align*}
a(E)=-\sqrt{\frac{2E}{m\omega^2}},\qquad b(E)=\sqrt{\frac{2E}{m\omega^2}}.
\end{align*}
Set
\begin{align*}
A=\sqrt{\frac{2E}{m\omega^2}}.
\end{align*}
Then the action integral is
\begin{align*}
\int_{-A}^{A}\sqrt{2m\left(E-\frac{1}{2}m\omega^2x^2\right)}\,dx.
\end{align*}
Use the substitution $x=A\sin\theta$, with $\theta\in[-\pi/2,\pi/2]$. Since $dx=A\cos\theta\,d\theta$ and $\cos\theta\ge0$ on this interval, the expression under the square root becomes
\begin{align*}
2m\left(E-\frac{1}{2}m\omega^2A^2\sin^2\theta\right).
\end{align*}
Because $A^2=2E/(m\omega^2)$, we have
\begin{align*}
\frac{1}{2}m\omega^2A^2\sin^2\theta=E\sin^2\theta.
\end{align*}
Thus
\begin{align*}
2m\left(E-\frac{1}{2}m\omega^2A^2\sin^2\theta\right)=2mE(1-\sin^2\theta)=2mE\cos^2\theta.
\end{align*}
Taking the positive square root gives
\begin{align*}
\sqrt{2mE\cos^2\theta}=\sqrt{2mE}\cos\theta.
\end{align*}
Therefore
\begin{align*}
\int_{-A}^{A}\sqrt{2m\left(E-\frac{1}{2}m\omega^2x^2\right)}\,dx=\int_{-\pi/2}^{\pi/2}\sqrt{2mE}\cos\theta\cdot A\cos\theta\,d\theta.
\end{align*}
Since $A=\sqrt{2E/(m\omega^2)}$, the constant factor is
\begin{align*}
\sqrt{2mE}\,A=\sqrt{2mE}\sqrt{\frac{2E}{m\omega^2}}=\frac{2E}{\omega}.
\end{align*}
Hence
\begin{align*}
\int_{a(E)}^{b(E)}\sqrt{2m(E-V(x))}\,dx=\frac{2E}{\omega}\int_{-\pi/2}^{\pi/2}\cos^2\theta\,d\theta.
\end{align*}
Using $\cos^2\theta=(1+\cos 2\theta)/2$, we get
\begin{align*}
\int_{-\pi/2}^{\pi/2}\cos^2\theta\,d\theta=\int_{-\pi/2}^{\pi/2}\frac{1+\cos 2\theta}{2}\,d\theta=\frac{\pi}{2}.
\end{align*}
Therefore
\begin{align*}
\int_{a(E)}^{b(E)}\sqrt{2m(E-V(x))}\,dx=\frac{2E}{\omega}\cdot\frac{\pi}{2}=\frac{\pi E}{\omega}.
\end{align*}
By the *[Bohr-Sommerfeld Quantization Rule](/theorems/6984)*, the leading semiclassical condition is
\begin{align*}
\frac{\pi E}{\omega}=\pi\hbar\left(n+\frac{1}{2}\right).
\end{align*}
Dividing by $\pi$ and multiplying by $\omega$ gives
\begin{align*}
E_n=\hbar\omega\left(n+\frac{1}{2}\right).
\end{align*}
For the harmonic oscillator, the leading Bohr-Sommerfeld rule reproduces the exact equally spaced energy levels, including the half-quantum ground-state shift.
[/example]
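The same computation can be carried out numerically, which is useful as a template for potentials whose action integral has no closed form. The sketch below is an illustration under stated assumptions: symmetric turning points $\pm A$, the substitution $x=A\sin\theta$ from the example, a midpoint rule, and bisection on the energy. The function names and default units $m=\omega=\hbar=1$ are hypothetical.

```python
import math

def action_integral(E, V, A, m=1.0, n=2000):
    """Midpoint estimate of the integral of sqrt(2m(E - V(x))) over [-A, A], using x = A sin(theta)."""
    h = math.pi / n
    total = 0.0
    for j in range(n):
        theta = -math.pi / 2.0 + (j + 0.5) * h
        x = A * math.sin(theta)
        radicand = max(2.0 * m * (E - V(x)), 0.0)  # guard against tiny negative round-off
        total += math.sqrt(radicand) * A * math.cos(theta)
    return total * h

def harmonic_level(n, omega=1.0, m=1.0, hbar=1.0):
    """Solve action(E) = pi hbar (n + 1/2) for the harmonic oscillator by bisection."""
    V = lambda x: 0.5 * m * omega ** 2 * x ** 2
    target = math.pi * hbar * (n + 0.5)
    lo, hi = 1e-9, 10.0 * hbar * omega * (n + 1)  # bracket chosen generously for this model
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        A = math.sqrt(2.0 * mid / (m * omega ** 2))
        if action_integral(mid, V, A, m) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for n in range(4):
        print(n, harmonic_level(n))
```

With the default units the computed levels agree with $n+\tfrac12$ to within the bisection tolerance, matching $E_n=\hbar\omega(n+\tfrac12)$.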
## Stationary Phase and Semiclassical Estimates
The last approximation principle explains why classical paths dominate oscillatory quantum integrals. When $\hbar$ is small, phases cancel except near critical points of the phase function. This is the stationary phase mechanism behind path integrals, WKB propagation, and semiclassical trace formulae.
The Bohr-Sommerfeld rule came from matching local asymptotic solutions. For propagators and integral formulae, the corresponding tool is stationary phase, which estimates oscillatory integrals directly.
[quotetheorem:6985]
Stationary phase is the analytic reason for the classical variational principle in the semiclassical limit, but the theorem is deliberately local and nondegenerate. Each assumption has a concrete role. If uniqueness fails, for example $\phi(x)=x^4-x^2$ with $a$ supported near both nondegenerate critical points $\pm 1/\sqrt{2}$, the leading term is a sum of two oscillatory contributions, one from each critical point. In this symmetric example both critical values equal $-1/4$, so the two terms carry the same phase factor and combine through $a(1/\sqrt{2})+a(-1/\sqrt{2})$; when the critical values $\phi(x_1)$ and $\phi(x_2)$ differ, the relative phase $e^{i(\phi(x_1)-\phi(x_2))/\hbar}$ makes the contributions interfere and can change the size of the integral. If nondegeneracy fails, for example $\phi(x)=x^3$ near $0$, the correct scale is $|x|\sim\hbar^{1/3}$ and the leading size is governed by an Airy-type integral rather than a Gaussian $\hbar^{1/2}$ term. If compact support is removed without decay, for example $a(x)=1$ on $\mathbb R$ and $\phi(x)=x$, the integral is not an ordinary convergent oscillatory integral, so boundary or distributional interpretations are needed. The forward connection is that semiclassical propagators are built from exactly these oscillatory integrals: their critical points are classical trajectories, their Hessians measure local focusing, and degenerate critical points signal caustics where a single WKB chart must be replaced by a different local model. The next example reads the free propagator in this language.
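The $\hbar^{1/3}$ scale in the degenerate case can be read off from a change of variables. The following is a heuristic sketch, assuming $a$ is smooth with compact support and that the rescaled integral is interpreted as a conditionally convergent oscillatory integral; it is not a substitute for a proof. Substituting $x=\hbar^{1/3}s$ in the cubic-phase integral gives
\begin{align*}
\int_{\mathbb R}a(x)e^{ix^3/\hbar}\,dx=\hbar^{1/3}\int_{\mathbb R}a(\hbar^{1/3}s)e^{is^3}\,ds\approx\hbar^{1/3}a(0)\int_{\mathbb R}e^{is^3}\,ds.
\end{align*}
The same substitution with $x=\hbar^{1/2}s$ applied to the nondegenerate model phase $\phi(x)=x^2/2$ gives
\begin{align*}
\int_{\mathbb R}a(x)e^{ix^2/(2\hbar)}\,dx=\hbar^{1/2}\int_{\mathbb R}a(\hbar^{1/2}s)e^{is^2/2}\,ds\approx\hbar^{1/2}a(0)\int_{\mathbb R}e^{is^2/2}\,ds,
\end{align*}
so for small $\hbar$ the degenerate critical point produces a contribution larger by a factor of order $\hbar^{-1/6}$.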
[example: Free Propagator Phase]
For $t\ne0$, the one-dimensional free Schrödinger propagator has kernel
\begin{align*}
K_t(x,y)=\left(\frac{m}{2\pi i\hbar t}\right)^{1/2}\exp\left(\frac{im(x-y)^2}{2\hbar t}\right).
\end{align*}
The phase in the exponential is the free classical action. Indeed, the straight path from $y$ to $x$ in time $t$ is $q(s)=y+(x-y)s/t$, so $q'(s)=(x-y)/t$. Since the free Lagrangian is $L(q,q')=\frac{1}{2}m(q')^2$, its action is
\begin{align*}
\int_0^t \frac{1}{2}m(q'(s))^2\,ds=\int_0^t \frac{1}{2}m\left(\frac{x-y}{t}\right)^2\,ds.
\end{align*}
The integrand is constant in $s$, hence
\begin{align*}
\int_0^t \frac{1}{2}m\left(\frac{x-y}{t}\right)^2\,ds=t\cdot\frac{m(x-y)^2}{2t^2}=\frac{m(x-y)^2}{2t}.
\end{align*}
Now compose the kernel with an oscillatory initial state $u_0(y)=a(y)e^{iS_0(y)/\hbar}$. The propagated wave is the oscillatory integral
\begin{align*}
u(t,x)=\left(\frac{m}{2\pi i\hbar t}\right)^{1/2}\int_{\mathbb R}a(y)\exp\left(\frac{i}{\hbar}\left(\frac{m(x-y)^2}{2t}+S_0(y)\right)\right)\,dy.
\end{align*}
Thus the total phase as a function of $y$ is
\begin{align*}
\Phi_x(y)=\frac{m(x-y)^2}{2t}+S_0(y).
\end{align*}
Differentiating with respect to $y$ gives
\begin{align*}
\Phi_x'(y)=-\frac{m(x-y)}{t}+S_0'(y).
\end{align*}
By *[One-Dimensional Stationary Phase](/theorems/6985)*, the leading contribution as $\hbar\to0$ comes from the critical points of $\Phi_x$, assumed nondegenerate, that is, from the solutions of $\Phi_x'(y)=0$. This condition is
\begin{align*}
-\frac{m(x-y)}{t}+S_0'(y)=0.
\end{align*}
Equivalently,
\begin{align*}
S_0'(y)=\frac{m(x-y)}{t}.
\end{align*}
If $p_0(y)=S_0'(y)$ is the initial WKB momentum, then this equation becomes
\begin{align*}
x=y+\frac{p_0(y)}{m}t.
\end{align*}
So stationary phase selects exactly those starting points $y$ whose free classical trajectory reaches $x$ at time $t$.
[/example]
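The selection of classical starting points can be tested numerically. The sketch below makes illustrative assumptions that are not in the example: a Gaussian amplitude $a(y)=e^{-y^2}$, a linear initial phase $S_0(y)=p_0y$, units $m=1$, and a trapezoid rule on a truncated interval. In that case $\Phi_x''=m/t$, the stationary-phase factor cancels the kernel prefactor, and the leading prediction is $|u(t,x)|\approx a(y_*)$ with $y_*=x-p_0t/m$.

```python
import cmath
import math

def propagate_free(x, t, a, S0, hbar, m=1.0, L=8.0, n=40000):
    """Trapezoid estimate of u(t, x) for the free kernel applied to a(y) exp(i S0(y) / hbar)."""
    h = 2.0 * L / n  # assumption: a(y) is negligible outside [-L, L]
    total = 0.0 + 0.0j
    for j in range(n + 1):
        y = -L + j * h
        w = 0.5 if j in (0, n) else 1.0
        phase = (m * (x - y) ** 2 / (2.0 * t) + S0(y)) / hbar
        total += w * a(y) * cmath.exp(1j * phase)
    prefactor = cmath.sqrt(m / (2j * math.pi * hbar * t))
    return prefactor * total * h

def classical_start(x, t, p0, m=1.0):
    """Starting point y with x = y + p0 t / m, valid when S0'(y) = p0 is constant."""
    return x - p0 * t / m

if __name__ == "__main__":
    p0, hbar = 1.0, 0.01
    a = lambda y: math.exp(-y * y)
    S0 = lambda y: p0 * y
    u = propagate_free(1.5, 1.0, a, S0, hbar)
    print(abs(u), a(classical_start(1.5, 1.0, p0)))
```

For small $\hbar$ the two printed values are close: the amplitude is carried along the straight classical trajectory, with corrections that vanish as $\hbar\to0$.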
We close by relating the three viewpoints. The Born approximation linearizes scattering in the potential, the scattering matrix packages the exact asymptotic comparison, and WKB methods approximate high-frequency solutions by classical phases. All three rely on the same spectral foundation from earlier chapters: self-adjointness gives unitary dynamics, the continuous spectrum requires generalized eigenfunction methods, and asymptotic estimates turn those formal solutions into computable predictions.
## Beyond and Connections
The main internal thread of these notes runs from [Hilbert space](/page/Hilbert%20Space) and [self-adjoint operator](/page/Self-Adjoint%20Operators) theory to spectral decompositions, unitary dynamics, and concrete approximation schemes. For a first return path through the material, the spectral theorem and Stone-type evolution results explain why observables, measurements, and time evolution are all expressed through the same functional calculus. The uncertainty and compatibility results then show how that operator language constrains simultaneous measurement before any particular Hamiltonian is chosen.
Several onward directions branch from this core. Angular momentum and Clebsch-Gordan theory connect the course to [representation theory](/page/Cambridge%20II%20Representation%20Theory), especially the representation theory of $\mathfrak{su}(2)$ and compact groups. Perturbation theory and min-max methods lead into spectral approximation, variational numerical methods, and stability of eigenvalues under model changes. Scattering and the Born approximation point toward [Fourier transform](/page/Fourier%20Transform), resolvent estimates, and inverse problems, while WKB and stationary phase connect quantum mechanics with asymptotic analysis, Hamilton-Jacobi theory, caustics, and semiclassical trace formulae. These are the natural next topics once the present page has established the operator-theoretic and computational toolkit.
## References
Internal Androma reference paths for continuing through the material:
- [Hilbert Space](/page/Hilbert%20Space), [Self-Adjoint Operator](/page/Self-Adjoint%20Operators), and spectral theorem material for the operator-theoretic foundation of observables.
- Unitary operator theory, Stone's theorem, and functional calculus for the connection between generators, dynamics, and spectral measures.
- [Fourier Transform](/page/Fourier%20Transform), [Representation Theory](/page/Cambridge%20II%20Representation%20Theory), and asymptotic analysis for the main analytic and algebraic tools used in scattering, angular momentum, and semiclassical limits.
Created by admin on 6/12/2026 | Last updated on 6/12/2026