Compactness is the part of functional analysis where bounded sequences begin to behave as if they lived in finite-dimensional space. In a finite-dimensional normed space, closed bounded sets are compact, so every bounded sequence has a convergent subsequence. This single fact powers existence proofs, spectral arguments, and approximation methods. The first shock in infinite dimensions is that boundedness alone is too weak: a sequence in the unit ball of an infinite-dimensional [Banach space](/page/Banach%20Space) need not have any norm-convergent subsequence at all.
The theory of compact operators isolates a class of linear maps that restore this missing finite-dimensional behavior. A compact operator may act on an infinite-dimensional space, but it sends bounded sets into sets whose closure is compact. From the point of view of sequences, it turns bounded input sequences into output sequences with convergent subsequences. This is why compact operators appear wherever an infinite-dimensional problem has a hidden smoothing, averaging, or approximation mechanism.
[example: The Shift That Refuses Compactness]
Let $H=\ell^2$ and define the right shift $S:H\to H$ by
\begin{align*}
S(a_1,a_2,a_3,\ldots)=(0,a_1,a_2,a_3,\ldots).
\end{align*}
For $x=(a_k)_{k=1}^\infty$ and $y=(b_k)_{k=1}^\infty$ in $\ell^2$, and scalars $\alpha,\beta$, we have
\begin{align*}
S(\alpha x+\beta y)
&=S(\alpha a_1+\beta b_1,\alpha a_2+\beta b_2,\ldots)\\
&=(0,\alpha a_1+\beta b_1,\alpha a_2+\beta b_2,\ldots)\\
&=\alpha(0,a_1,a_2,\ldots)+\beta(0,b_1,b_2,\ldots)\\
&=\alpha Sx+\beta Sy,
\end{align*}
so $S$ is linear. Also,
\begin{align*}
\|Sx\|_{\ell^2}^2
&=|0|^2+|a_1|^2+|a_2|^2+|a_3|^2+\cdots\\
&=\sum_{k=1}^\infty |a_k|^2\\
&=\|x\|_{\ell^2}^2,
\end{align*}
so $\|Sx\|_{\ell^2}=\|x\|_{\ell^2}$ for every $x\in \ell^2$, and $S$ is bounded.
Let $e_k$ be the sequence with $1$ in the $k$-th coordinate and $0$ elsewhere. Then
\begin{align*}
\|e_k\|_{\ell^2}^2=\sum_{n=1}^\infty |(e_k)_n|^2=1,
\end{align*}
so $(e_k)_{k=1}^\infty$ is bounded. Moreover,
\begin{align*}
Se_k=e_{k+1},
\end{align*}
because the single nonzero coordinate of $e_k$ is shifted from position $k$ to position $k+1$.
For $j\ne k$, the sequence $e_j-e_k$ has coordinate $1$ in position $j$, coordinate $-1$ in position $k$, and $0$ in every other position. Hence
\begin{align*}
\|e_j-e_k\|_{\ell^2}^2
&=\sum_{n=1}^\infty |(e_j-e_k)_n|^2\\
&=|1|^2+|-1|^2\\
&=2.
\end{align*}
Therefore, for $j\ne k$,
\begin{align*}
\|Se_j-Se_k\|_{\ell^2}
&=\|e_{j+1}-e_{k+1}\|_{\ell^2}\\
&=\sqrt{2}.
\end{align*}
Every subsequence of $(Se_k)_{k=1}^\infty$ still has pairwise distances $\sqrt{2}$ between distinct terms, so no subsequence is Cauchy. Since every norm-convergent sequence is Cauchy, no subsequence of $(Se_k)_{k=1}^\infty$ converges in norm.
Thus $S$ sends the bounded set $\{e_k:k\in\mathbb{N}\}$ to a set containing a sequence with no norm-convergent subsequence. Its image is not relatively compact, so the bounded linear right shift is not compact.
[/example]
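The computation in the example can be checked numerically on finitely supported sequences. The following is a minimal sketch, assuming sequences are stored as Python lists padded with zeros; the helper names `right_shift`, `basis_vector`, and `l2_distance` are illustrative and not part of any library.

```python
import math

def right_shift(x):
    """Right shift S on a finitely supported sequence stored as a list."""
    return [0.0] + list(x)

def basis_vector(k, length):
    """e_k (1-indexed), truncated to the first `length` coordinates."""
    return [1.0 if n == k else 0.0 for n in range(1, length + 1)]

def l2_distance(x, y):
    """l^2 distance between two finitely supported sequences."""
    n = max(len(x), len(y))
    xp = list(x) + [0.0] * (n - len(x))
    yp = list(y) + [0.0] * (n - len(y))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xp, yp)))

K = 6  # number of basis vectors inspected (arbitrary illustrative choice)
images = [right_shift(basis_vector(k, K)) for k in range(1, K + 1)]

# Pairwise distances between the shifted basis vectors S e_j and S e_k, j < k.
distances = [
    l2_distance(images[j], images[k])
    for j in range(K)
    for k in range(j + 1, K)
]
print(distances)
```

Every printed distance agrees with $\sqrt{2}$ up to rounding, which is the finite shadow of the statement that no subsequence of $(Se_k)$ is Cauchy. A finite check of this kind illustrates the obstruction but does not prove it.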
The shift example is the basic obstruction. Bounded linearity controls distances uniformly, but it does not create convergence. To state compactness without losing the distinction, we first need the ambient class of operators whose size can be measured uniformly on the whole unit ball. This operator norm is the language in which compact operators are compared, approximated, and composed.
The notation below is therefore more than bookkeeping. Compactness will later imply boundedness for linear maps, but most stability and approximation statements live inside the normed space of bounded operators, where operator-norm convergence and composition are available.
[definition: Bounded Linear Operator Space]
Let $X$ and $Y$ be normed spaces. The space $\mathcal{L}(X,Y)$ consists of all bounded linear operators $T:X \to Y$, equipped with the operator norm
\begin{align*}
\|T\|_{\mathcal{L}(X,Y)}=\sup_{\|x\|_X \le 1}\|Tx\|_Y.
\end{align*}
When $X=Y$, write $\mathcal{L}(X)=\mathcal{L}(X,X)$.
[/definition]
## Definition
The right definition must be phrased in terms of bounded sets rather than pointwise behavior. A [linear map](/page/Linear%20Map) can send each individual vector somewhere harmless while still moving an infinite bounded family around without creating any convergent subsequence. Compactness asks for a uniform sequential improvement on every bounded family.
[definition: Compact Operator]
Let $X$ and $Y$ be normed spaces. A linear map $T: X \to Y$ is a compact operator if for every bounded set $B \subset X$, the set $T(B) \subset Y$ has compact closure in $Y$.
[/definition]
With this notation in place, the geometric definition can be converted into the form used in most arguments. When we solve variational problems or extract limits from approximate solutions, we usually start with a bounded sequence rather than an arbitrary bounded set. The next result is needed because it turns compactness into the subsequence criterion used in proofs.
[quotetheorem:4919]
This sequential form explains the word compact: the operator converts boundedness in the domain into relative compactness in the codomain. It also shows why compactness is stronger than boundedness in infinite dimensions. Because compactness is defined before boundedness has been proved, the next theorem is needed to rule out a possible ambiguity: for linear maps, compactness already supplies the uniform estimate required for boundedness.
[quotetheorem:4890]
The theorem tells us that compactness is not an alternative to boundedness; for linear maps, it is a stronger regularity condition. The next examples establish the two extremes: finite-dimensional behavior is always compact, while the identity in infinite dimensions is not.
[example: Finite-Rank Operators Are Compact]
Let $X$ and $Y$ be normed spaces, and suppose $T \in \mathcal{L}(X,Y)$ has finite-dimensional range. Let $B \subset X$ be bounded. Then there is $M \ge 0$ such that $\|x\|_X \le M$ for every $x \in B$. By the definition of the operator norm,
\begin{align*}
\|Tx\|_Y
&\le \|T\|_{\mathcal{L}(X,Y)}\|x\|_X\\
&\le \|T\|_{\mathcal{L}(X,Y)}M
\end{align*}
for every $x \in B$. Hence $T(B)$ is bounded as a subset of the finite-dimensional normed space $\operatorname{Range}(T)$.
The closure of $T(B)$ taken inside $\operatorname{Range}(T)$ is compact by *Heine-Borel for finite-dimensional normed spaces*. Since $\operatorname{Range}(T)$ is a finite-dimensional subspace of a normed space, it is closed in $Y$, so this same closure is also the closure of $T(B)$ in $Y$. Therefore $T(B)$ has compact closure in $Y$ for every bounded set $B \subset X$, and $T$ is compact.
[/example]
Finite-rank operators are the model compact operators: they discard all but finitely many degrees of freedom. The opposite test case is the identity map, which discards no directions at all. The next theorem is needed because it separates genuine compactness from the finite-dimensional compactness of closed bounded sets.
[quotetheorem:4920]
This theorem is often the quickest way to detect a false compactness claim. If an argument would imply that every bounded sequence in an infinite-dimensional Banach space has a norm-convergent subsequence, it has imported finite-dimensional reasoning.
## Images of Bounded Sets
### The Unit Ball Test
Compact operators are best understood by watching what they do to the unit ball. The unit ball is the universal bounded set: a linear operator is bounded precisely when it maps the unit ball to a bounded set, and it is compact precisely when it maps the unit ball to a relatively compact set.
The image of a bounded set under a linear map need not be closed, so it is not enough to ask that the image itself be compact. What matters for subsequence extraction is that the closure of the image, which adds every limit of a sequence from the image, is compact in the codomain.
[definition: Relatively Compact Set]
Let $(Y,d)$ be a [metric space](/page/Metric%20Space). A subset $A \subset Y$ is relatively compact if its closure $\overline{A}$ is compact in $Y$.
[/definition]
Relative compactness is the condition that a compact operator imposes on the image of each bounded set. The next theorem is needed to reduce the definition from all bounded sets to one canonical bounded set, making compactness checkable in practice.
[quotetheorem:4890]
The unit ball test turns compactness into a geometric statement. In applications, one proves estimates on $T(B_X)$: equicontinuity, decay of tails, smoothing, or approximation by finite-dimensional pieces.
### Equicontinuity and Integral Operators
For spaces of continuous functions, boundedness in the sup norm controls vertical size but not horizontal oscillation. A sequence of bounded continuous functions can oscillate faster and faster, preventing [uniform convergence](/page/Uniform%20Convergence). The missing condition is a uniform modulus of continuity across the whole family.
[definition: Equicontinuous Family]
Let $(M,d)$ be a metric space and let $C(M,\mathbb{C})$ denote the space of continuous functions $f:M \to \mathbb{C}$. A family $\mathcal{F} \subset C(M,\mathbb{C})$ is equicontinuous if for every $\varepsilon>0$ there exists $\delta>0$ such that for all $f \in \mathcal{F}$ and all $x,y \in M$,
\begin{align*}
d(x,y)<\delta \implies |f(x)-f(y)|<\varepsilon.
\end{align*}
[/definition]
Equicontinuity prevents bounded functions from hiding variation at smaller and smaller scales, which is exactly the failure that prevents bounded families in $C(M)$ from having uniformly convergent subsequences.
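A small numerical sketch can make the failure of equicontinuity concrete. It assumes the illustrative family $f_n(x)=\sin(2\pi n x)$ on $[0,1]$, which is not taken from the text above, and measures the largest change of $f_n$ between grid points at most $0.01$ apart. The function names `oscillation` and `wave` are hypothetical helpers.

```python
import math

def oscillation(f, lag_steps, grid_steps=1000):
    """Largest |f(x) - f(y)| over grid points of [0, 1] at most lag_steps grid cells apart."""
    values = [f(i / grid_steps) for i in range(grid_steps + 1)]
    worst = 0.0
    for i in range(len(values)):
        for j in range(i + 1, min(i + lag_steps, grid_steps) + 1):
            worst = max(worst, abs(values[i] - values[j]))
    return worst

def wave(n):
    """Assumed test family f_n(x) = sin(2*pi*n*x), bounded by 1 in the sup norm."""
    return lambda x: math.sin(2 * math.pi * n * x)

# With 1000 grid cells, lag_steps=10 corresponds to |x - y| <= 0.01.
slow = oscillation(wave(1), lag_steps=10)
fast = oscillation(wave(50), lag_steps=10)
print(slow, fast)
```

For the same spacing, the slow wave changes by roughly $0.06$ while the fast wave swings by nearly $2$. No single $\delta$ can therefore serve every member of this bounded family, which is the behavior the definition excludes.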
For compactness questions in $C(M)$, the useful next step is to convert this qualitative control into a [sequential compactness](/page/Sequential%20Compactness) criterion in the sup norm. Pointwise boundedness supplies control of values, while equicontinuity supplies control of oscillation; the Arzela-Ascoli theorem packages these two requirements into the compactness test needed for operator images.
[quotetheorem:885]
The theorem gives a standard route to compactness for integral operators with continuous kernels: bounded inputs produce uniformly bounded and equicontinuous outputs.
[example: A Continuous Kernel Gives a Compact Integral Operator]
Let $K \in C([0,1]\times [0,1])$ and define $T:C([0,1])\to C([0,1])$ by
\begin{align*}
(Tf)(x)=\int_0^1 K(x,y)f(y)\,d\mathcal{L}^1(y).
\end{align*}
For $f,g\in C([0,1])$ and scalars $\alpha,\beta$,
\begin{align*}
(T(\alpha f+\beta g))(x)
&=\int_0^1 K(x,y)(\alpha f(y)+\beta g(y))\,d\mathcal{L}^1(y)\\
&=\alpha\int_0^1 K(x,y)f(y)\,d\mathcal{L}^1(y)
+\beta\int_0^1 K(x,y)g(y)\,d\mathcal{L}^1(y)\\
&=\alpha(Tf)(x)+\beta(Tg)(x),
\end{align*}
so $T$ is linear. Since $K$ is continuous on the compact set $[0,1]\times[0,1]$, it is bounded; write
\begin{align*}
\|K\|_\infty=\sup_{(x,y)\in[0,1]^2}|K(x,y)|.
\end{align*}
If $\|f\|_\infty\le 1$, then for every $x\in[0,1]$,
\begin{align*}
|(Tf)(x)|
&=\left|\int_0^1 K(x,y)f(y)\,d\mathcal{L}^1(y)\right|\\
&\le \int_0^1 |K(x,y)f(y)|\,d\mathcal{L}^1(y)\\
&=\int_0^1 |K(x,y)|\,|f(y)|\,d\mathcal{L}^1(y)\\
&\le \int_0^1 \|K\|_\infty\,d\mathcal{L}^1(y)\\
&=\|K\|_\infty.
\end{align*}
Thus $T(B_{C([0,1])})$ is bounded in the supremum norm.
It remains to check equicontinuity of the same image of the unit ball. Because $K$ is continuous on the compact square, it is uniformly continuous. Hence, for every $\varepsilon>0$, there exists $\delta>0$ such that whenever $|x-z|<\delta$ and $y\in[0,1]$,
\begin{align*}
|K(x,y)-K(z,y)|<\varepsilon.
\end{align*}
For any $f\in C([0,1])$ with $\|f\|_\infty\le 1$ and any $x,z\in[0,1]$ with $|x-z|<\delta$,
\begin{align*}
|(Tf)(x)-(Tf)(z)|
&=\left|\int_0^1 K(x,y)f(y)\,d\mathcal{L}^1(y)
-\int_0^1 K(z,y)f(y)\,d\mathcal{L}^1(y)\right|\\
&=\left|\int_0^1 (K(x,y)-K(z,y))f(y)\,d\mathcal{L}^1(y)\right|\\
&\le \int_0^1 |K(x,y)-K(z,y)|\,|f(y)|\,d\mathcal{L}^1(y)\\
&\le \int_0^1 \varepsilon\,d\mathcal{L}^1(y)\\
&=\varepsilon.
\end{align*}
Therefore $T(B_{C([0,1])})$ is bounded and equicontinuous, so it is relatively compact in $C([0,1])$ by *Arzela-Ascoli Criterion*. By the *Unit Ball Test for Compactness*, $T$ is compact.
[/example]
The integral operator is compact because integration averages. It smooths the input enough that the output family cannot oscillate independently at infinitely many scales.
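The averaging effect can be observed numerically. The sketch below assumes the illustrative kernel $K(x,y)=\min(x,y)$ and the oscillating inputs $f_n(y)=\sin(2\pi n y)$, neither of which comes from the example above, and approximates the integral by a midpoint rule. The names `apply_kernel`, `sup_norm_of_image`, and `wave` are hypothetical.

```python
import math

def kernel(x, y):
    """Assumed continuous kernel K(x, y) = min(x, y) on the unit square."""
    return min(x, y)

def apply_kernel(f, x, quad_points=2000):
    """Midpoint-rule approximation of (Tf)(x), the integral of K(x, y) f(y) over [0, 1]."""
    h = 1.0 / quad_points
    return h * sum(
        kernel(x, (i + 0.5) * h) * f((i + 0.5) * h) for i in range(quad_points)
    )

def sup_norm_of_image(f, grid_steps=50):
    """Approximate sup norm of Tf, sampled on a uniform grid of [0, 1]."""
    return max(abs(apply_kernel(f, i / grid_steps)) for i in range(grid_steps + 1))

def wave(n):
    """Input of sup norm 1 that oscillates faster as n grows."""
    return lambda y: math.sin(2 * math.pi * n * y)

frequencies = [1, 2, 4, 8, 16]
sup_norms = [sup_norm_of_image(wave(n)) for n in frequencies]
print(sup_norms)
```

For this particular kernel, a direct integration gives $(Tf_n)(x)=\sin(2\pi n x)/(2\pi n)^2-x/(2\pi n)$, so the sup norm of $Tf_n$ is $1/(2\pi n)$. The printed values should follow that decay up to quadrature error, even though every input has sup norm $1$.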
## Approximation by Finite-Rank Operators
### Finite-Dimensional Models
[Finite-rank operators are compact](/theorems/4891) for finite-dimensional reasons. A major theme is that many compact operators are limits of finite-rank operators, so compactness can be viewed as approximate finite-dimensionality. The approximation must be in operator norm, because pointwise approximation on each vector is too weak to control the whole unit ball.
To say this precisely, we first isolate the finite-dimensional model. Finite-rank operators are the maps whose outputs live in a fixed finite-dimensional subspace, no matter how large the domain is.
[definition: Finite-Rank Operator]
Let $X$ and $Y$ be normed spaces. An operator $T \in \mathcal{L}(X,Y)$ has finite rank if $\operatorname{Range}(T)$ is finite-dimensional.
[/definition]
Finite-rank operators are the algebraic core of compactness, but compactness is usually analytic: it permits infinitely many directions as long as their influence becomes uniformly negligible. To pass from finite-dimensional models to genuinely infinite-dimensional compact operators, we need compactness to survive limits in the operator norm.
[quotetheorem:4892]
Completeness of $Y$ appears because the limiting subsequence must converge in the codomain. Without completeness, a Cauchy subsequence may converge only after completing the space.
A useful sufficient condition now follows: any operator-norm limit of finite-rank operators is compact. Diagonal operators on sequence spaces give the cleanest picture of this approximation principle.
[example: Diagonal Compact Operators on $\ell^2$]
Let $(a_k)_{k=1}^\infty$ be a scalar sequence with $a_k \to 0$. Convergent sequences are bounded, so choose $M\ge 0$ such that $|a_k|\le M$ for every $k$, and define
\begin{align*}
T(x_1,x_2,x_3,\ldots)=(a_1x_1,a_2x_2,a_3x_3,\ldots).
\end{align*}
If $x=(x_k)_{k=1}^\infty\in \ell^2$, then
\begin{align*}
\|Tx\|_{\ell^2}^2
&=\sum_{k=1}^\infty |a_kx_k|^2\\
&=\sum_{k=1}^\infty |a_k|^2|x_k|^2\\
&\le \sum_{k=1}^\infty M^2|x_k|^2\\
&=M^2\|x\|_{\ell^2}^2,
\end{align*}
so $Tx\in \ell^2$ and $\|Tx\|_{\ell^2}\le M\|x\|_{\ell^2}$. Also, for $x=(x_k)$, $y=(y_k)$, and scalars $\alpha,\beta$,
\begin{align*}
T(\alpha x+\beta y)
&=T(\alpha x_1+\beta y_1,\alpha x_2+\beta y_2,\ldots)\\
&=(a_1(\alpha x_1+\beta y_1),a_2(\alpha x_2+\beta y_2),\ldots)\\
&=\alpha(a_1x_1,a_2x_2,\ldots)+\beta(a_1y_1,a_2y_2,\ldots)\\
&=\alpha Tx+\beta Ty,
\end{align*}
so $T\in \mathcal{L}(\ell^2)$.
For $N\in\mathbb{N}$, define the finite-rank truncation
\begin{align*}
T_N(x_1,x_2,x_3,\ldots)=(a_1x_1,\ldots,a_Nx_N,0,0,\ldots).
\end{align*}
Its range is contained in $\operatorname{span}\{e_1,\ldots,e_N\}$, so $T_N$ has finite rank. For $x=(x_k)\in\ell^2$ with $\|x\|_{\ell^2}\le 1$,
\begin{align*}
\|(T-T_N)x\|_{\ell^2}^2
&=\sum_{k=N+1}^\infty |a_kx_k|^2\\
&=\sum_{k=N+1}^\infty |a_k|^2|x_k|^2\\
&\le \left(\sup_{k>N}|a_k|\right)^2\sum_{k=N+1}^\infty |x_k|^2\\
&\le \left(\sup_{k>N}|a_k|\right)^2.
\end{align*}
Hence
\begin{align*}
\|T-T_N\|_{\mathcal{L}(\ell^2)}
\le \sup_{k>N}|a_k|.
\end{align*}
Conversely, if $m>N$, then $\|e_m\|_{\ell^2}=1$ and
\begin{align*}
\|(T-T_N)e_m\|_{\ell^2}
&=\|a_me_m\|_{\ell^2}\\
&=|a_m|\|e_m\|_{\ell^2}\\
&=|a_m|.
\end{align*}
Taking the supremum over all $m>N$ gives
\begin{align*}
\|T-T_N\|_{\mathcal{L}(\ell^2)}
\ge \sup_{m>N}|a_m|.
\end{align*}
Therefore
\begin{align*}
\|T-T_N\|_{\mathcal{L}(\ell^2)}
=\sup_{k>N}|a_k|.
\end{align*}
Since $a_k\to 0$, for every $\varepsilon>0$ there exists $N_0$ such that $|a_k|<\varepsilon$ whenever $k>N_0$, and hence
\begin{align*}
N\ge N_0
\implies
\|T-T_N\|_{\mathcal{L}(\ell^2)}
=\sup_{k>N}|a_k|
\le \varepsilon.
\end{align*}
Thus $\|T-T_N\|_{\mathcal{L}(\ell^2)}\to 0$. Each $T_N$ is compact by *Finite-Rank Operators Are Compact*, and $\ell^2$ is a Banach space, so $T$ is compact by *Operator-Norm Limits of Compact Operators*. This shows that a diagonal operator on $\ell^2$ is compact whenever its coordinate weights tend to zero.
[/example]
This example also shows what compactness means spectrally: infinitely many coordinate directions may remain, but their weights must tend to zero. The identity corresponds to $a_k=1$, so the tail never disappears.
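The tail identity $\|T-T_N\|_{\mathcal{L}(\ell^2)}=\sup_{k>N}|a_k|$ can be sanity-checked on a finite truncation. The sketch below assumes the weights $a_k=1/k$ and keeps only the first `DIM` coordinates, so it illustrates the estimate without proving it. All helper names are hypothetical.

```python
import math

DIM = 200  # number of coordinates kept in the truncation (illustrative)
weights = [1.0 / k for k in range(1, DIM + 1)]  # assumed weights a_k = 1/k

def l2_norm(x):
    return math.sqrt(sum(abs(t) ** 2 for t in x))

def apply_tail(N, x):
    """Apply T - T_N: coordinates k <= N become 0, the others are scaled by a_k."""
    return [
        0.0 if k <= N else a * xk
        for k, (a, xk) in enumerate(zip(weights, x), start=1)
    ]

def tail_sup(N):
    """sup of |a_k| over the stored coordinates k > N."""
    return max(abs(a) for a in weights[N:])

def basis_vector(k):
    return [1.0 if n == k else 0.0 for n in range(1, DIM + 1)]

N = 9
# Lower bound: the unit vector e_{N+1} is mapped to a vector of norm |a_{N+1}|.
witness_norm = l2_norm(apply_tail(N, basis_vector(N + 1)))
# Upper bound: a unit vector spread evenly over all coordinates.
flat = [1.0 / math.sqrt(DIM)] * DIM
flat_norm = l2_norm(apply_tail(N, flat))
print(tail_sup(N), witness_norm, flat_norm)
```

With $N=9$ the tail supremum is $a_{10}=1/10$. The basis vector $e_{10}$ attains it, while the evenly spread unit vector stays below it, matching the two inequalities in the example.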
### Approximation in Hilbert Spaces
The converse approximation statement requires extra structure on the codomain and is not true in every Banach space. Hilbert spaces have enough [orthogonal projection](/theorems/437) structure to approximate relatively compact sets by finite-dimensional subspaces uniformly on the unit ball. That geometric fact turns every compact operator into an operator-norm limit of finite-rank maps.
[quotetheorem:4921]
After choosing an [orthonormal basis](/page/Orthonormal%20Basis), the approximants $F_k$ may be viewed as finite-dimensional matrix blocks, or equivalently as compressions through finite-dimensional orthogonal projections. In that precise sense, compact operators on Hilbert spaces behave like infinite matrices whose tails can be made uniformly small. This is also the bridge from compactness to spectral decompositions.
## Algebraic Stability
A useful class of operators must survive the operations that occur in analysis: sums, scalar multiples, and composition. Compact operators do survive these operations, and composition is especially forgiving: only one factor needs to be compact, provided the other is bounded.
To state these stability properties cleanly, we need notation for the class of all compact maps between two spaces. This notation matters because compact terms often appear as finite sums of lower-order or smoothing contributions, and we need the whole perturbation to remain compact rather than checking each combined expression from scratch.
[definition: Space of Compact Operators]
Let $X$ and $Y$ be normed spaces. The space of compact operators from $X$ to $Y$ is
\begin{align*}
\mathcal{K}(X,Y) = \{T \in \mathcal{L}(X,Y) : T \text{ is compact}\}.
\end{align*}
When $X=Y$, write $\mathcal{K}(X)=\mathcal{K}(X,X)$.
[/definition]
This notation places compact operators inside the bounded-operator space, but applications rarely produce a single compact term in isolation. Error terms, lower-order pieces, and finite-dimensional corrections are usually combined before the final operator is studied. The next theorem is needed to ensure that compactness survives these linear combinations, so a sum of compact contributions can still be treated as one compact perturbation.
[quotetheorem:4901]
Linearity is not merely bookkeeping. In applications, compact terms often appear as perturbations, and finite sums of compact errors remain compact.
The more powerful stability property is behavior under composition. A bounded operator after a compact operator cannot destroy relative compactness, because continuous images of compact sets are compact. A bounded operator before a compact operator sends bounded sets to bounded sets, giving the compact operator the input control it needs.
[quotetheorem:4900]
This is why $\mathcal{K}(X)$ is called an operator ideal in $\mathcal{L}(X)$. Compactness is stable under bounded changes of variables, embeddings, and projections.
[example: Compactness from a Compact Embedding]
Let $j:X\hookrightarrow Y$ be compact, meaning that $j$ is a compact linear map from $X$ to $Y$. Let $Z$ be a normed space and let $A\in\mathcal{L}(Y,Z)$. We first show that $Aj:X\to Z$ is compact. Let $(x_k)_{k=1}^\infty$ be a bounded sequence in $X$. Since $j$ is compact, there is a subsequence $(x_{k_m})_{m=1}^\infty$ and some $y\in Y$ such that
\begin{align*}
\|jx_{k_m}-y\|_Y \to 0.
\end{align*}
Because $A\in\mathcal{L}(Y,Z)$, for every $m$ we have
\begin{align*}
\|Ajx_{k_m}-Ay\|_Z
&=\|A(jx_{k_m}-y)\|_Z\\
&\le \|A\|_{\mathcal{L}(Y,Z)}\|jx_{k_m}-y\|_Y.
\end{align*}
The right-hand side tends to $0$, so $(Ajx_{k_m})_{m=1}^\infty$ converges in $Z$. Hence $Aj$ sends every bounded sequence in $X$ to a sequence with a convergent subsequence, and therefore $Aj$ is compact by *[Sequential Characterisation of Compact Operators](/theorems/4919)*.
Now assume $B\in\mathcal{L}(Z,X)$, so that
\begin{align*}
Y \xrightarrow{A} Z \xrightarrow{B} X \xrightarrow{j} Y
\end{align*}
is defined. Let $(y_k)_{k=1}^\infty$ be bounded in $Y$, say $\|y_k\|_Y\le M$ for all $k$. Then
\begin{align*}
\|BAy_k\|_X
&\le \|B\|_{\mathcal{L}(Z,X)}\|Ay_k\|_Z\\
&\le \|B\|_{\mathcal{L}(Z,X)}\|A\|_{\mathcal{L}(Y,Z)}\|y_k\|_Y\\
&\le \|B\|_{\mathcal{L}(Z,X)}\|A\|_{\mathcal{L}(Y,Z)}M,
\end{align*}
so $(BAy_k)_{k=1}^\infty$ is bounded in $X$. Since $j:X\to Y$ is compact, the sequence
\begin{align*}
(jBAy_k)_{k=1}^\infty
\end{align*}
has a convergent subsequence in $Y$. By *Sequential Characterisation of Compact Operators*, $jBA:Y\to Y$ is compact. Thus compact embeddings turn bounded control in the stronger space $X$ into norm-convergent subsequences in the weaker space $Y$, even after bounded operators are composed before or after the embedding.
[/example]
The ideal property is the abstract form of a common analytic pattern: obtain boundedness in a stronger space, use a compact embedding into a weaker space, then pass to a convergent subsequence.
## Compactness and Weak Convergence
### Weak Limits Become Strong Limits
Compactness is often detected through weak convergence. In a reflexive Banach space, bounded sequences have weakly convergent subsequences, but weak convergence is weaker than norm convergence. Compact operators can convert weak convergence in the domain into strong convergence in the codomain.
Weak convergence is designed for situations where norm convergence is unavailable but all bounded linear measurements converge. Before compact operators can improve weak convergence, we need the precise meaning of those measurements.
[definition: Weak Convergence]
Let $X$ be a Banach space over the scalar field $\mathbb{F}$, where $\mathbb{F}$ is either $\mathbb{R}$ or $\mathbb{C}$. A sequence $(x_k)_{k=1}^\infty$ in $X$ converges weakly to $x \in X$, written $x_k \rightharpoonup x$, if
\begin{align*}
f(x_k) \to f(x)
\end{align*}
for every bounded linear functional $f:X \to \mathbb{F}$, that is, for every $f \in X^*$.
[/definition]
Weak convergence records convergence against all bounded linear functionals, so it can miss norm-size oscillations that no single functional detects.
The next compactness question is how an operator can force this weaker convergence to become norm convergence after applying it. Relative compactness of the image leaves subsequences with strong limits, and compatibility with the weak limit then rules out competing norm limits. This is the mechanism behind the weak-to-strong convergence criterion for compact operators.
[quotetheorem:4895]
This theorem is one of the most useful working tests for compactness. It says a compact operator improves the mode of convergence.
[example: Weak but Not Strong Convergence in $\ell^2$]
Let $e_k$ denote the sequence whose $k$-th coordinate is $1$ and whose other coordinates are $0$. In $\ell^2$, with [inner product](/page/Inner%20Product) linear in the first argument,
\begin{align*}
(x,y)_{\ell^2}
=\sum_{n=1}^\infty x_n\overline{y_n}.
\end{align*}
We show that $e_k \rightharpoonup 0$, but not in norm.
Let $y=(y_n)_{n=1}^\infty\in\ell^2$. Since
\begin{align*}
\sum_{n=1}^\infty |y_n|^2<\infty,
\end{align*}
the scalar sequence $(y_n)_{n=1}^\infty$ tends to $0$: indeed, for every $\varepsilon>0$ there is $N$ such that
\begin{align*}
\sum_{n=N+1}^\infty |y_n|^2<\varepsilon^2,
\end{align*}
and hence, whenever $k>N$,
\begin{align*}
|y_k|^2
\le \sum_{n=N+1}^\infty |y_n|^2
<\varepsilon^2,
\end{align*}
so $|y_k|<\varepsilon$. Therefore $y_k\to 0$.
For the bounded linear functional represented by $y$, namely $x\mapsto (x,y)_{\ell^2}$, we have
\begin{align*}
(e_k,y)_{\ell^2}
&=\sum_{n=1}^\infty (e_k)_n\overline{y_n}\\
&=0\cdot \overline{y_1}+\cdots+0\cdot \overline{y_{k-1}}+1\cdot \overline{y_k}+0\cdot \overline{y_{k+1}}+\cdots\\
&=\overline{y_k}\\
&\to 0.
\end{align*}
By the Hilbert-space representation of bounded linear functionals, this verifies $f(e_k)\to f(0)$ for every $f\in(\ell^2)^*$, so $e_k\rightharpoonup 0$.
On the other hand,
\begin{align*}
\|e_k-0\|_{\ell^2}^2
&=\sum_{n=1}^\infty |(e_k)_n|^2\\
&=|1|^2\\
&=1,
\end{align*}
so $\|e_k-0\|_{\ell^2}=1$ for every $k$. Thus $e_k$ does not converge strongly to $0$. If $T\in\mathcal{K}(\ell^2,Y)$, then $e_k\rightharpoonup 0$ implies
\begin{align*}
\|Te_k-T0\|_Y\to 0
\end{align*}
by *Compact Operators Send Weak Convergence to Strong Convergence*. Since $T$ is linear, $T0=0$, and therefore
\begin{align*}
\|Te_k\|_Y\to 0.
\end{align*}
Compact operators turn this weakly null, norm-separated sequence into a norm-null sequence in the output space.
[/example]
The example shows the compactness mechanism in its most economical form. A weakly invisible sequence must become norm-small after applying a compact operator.
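The three behaviors in the example can be placed side by side numerically. The sketch below works on a finite truncation with real entries and assumes the test vector $y=(1/n)_n$ and the compact diagonal operator with weights $a_k=1/k$ from the earlier diagonal example. The helper names are hypothetical.

```python
import math

DIM = 500  # number of coordinates kept in the truncation (illustrative)
y = [1.0 / n for n in range(1, DIM + 1)]        # fixed vector in l^2 (assumed)
weights = [1.0 / k for k in range(1, DIM + 1)]  # diagonal weights a_k = 1/k

def basis_vector(k):
    return [1.0 if n == k else 0.0 for n in range(1, DIM + 1)]

def inner(x, z):
    """Real l^2 inner product on the truncation."""
    return sum(a * b for a, b in zip(x, z))

def l2_norm(x):
    return math.sqrt(inner(x, x))

def apply_diagonal(x):
    """Apply the diagonal operator with weights a_k."""
    return [a * t for a, t in zip(weights, x)]

indices = [1, 10, 100]
pairings = [inner(basis_vector(k), y) for k in indices]        # (e_k, y) = y_k
norms = [l2_norm(basis_vector(k)) for k in indices]            # stays equal to 1
image_norms = [l2_norm(apply_diagonal(basis_vector(k))) for k in indices]
print(pairings, norms, image_norms)
```

The pairings against the fixed vector shrink like $1/k$, the norms of $e_k$ stay at $1$, and the norms of the images under the diagonal operator shrink like $1/k$. Testing a single $y$ does not establish weak convergence, which requires every functional, but it shows the pattern the theorem describes.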
### Reflexive Domains
The converse requires care. Weak-to-strong continuity on sequences characterizes compactness when the domain is reflexive, because bounded sequences then have weakly convergent subsequences. Reflexivity is the condition that makes bounded sets relatively weakly sequentially compact, giving compact operators a weak subsequence to upgrade.
[definition: Reflexive Banach Space]
Let $X$ be a Banach space, and let $J:X \to X^{**}$ be the canonical map defined by
\begin{align*}
Jx(f)=f(x), \qquad f \in X^*.
\end{align*}
The space $X$ is reflexive if $J$ is surjective.
[/definition]
Reflexivity supplies the weak compactness needed here: every bounded sequence has a weakly convergent subsequence. Without that weak subsequence, weak-to-strong continuity would have no reason to control arbitrary bounded sequences. On reflexive domains, this removes the obstruction and turns weak-to-strong sequential behavior into a criterion for compactness.
[quotetheorem:4922]
The theorem explains why compactness is so prominent in variational methods: bounded minimizing sequences often have weak limits, while compactness recovers the strong convergence needed for nonlinear terms.
## Spectrum of Compact Operators
### Nonzero Spectral Values
Compact operators act on infinite-dimensional spaces, but their spectra resemble finite-dimensional spectra more than those of general bounded operators. The key phenomenon is that nonzero spectral values behave like eigenvalues of finite multiplicity, and possible accumulation can occur only at $0$.
Spectral theory asks where an operator fails to be invertible after subtracting a scalar multiple of the identity. Compactness does not make this question finite-dimensional, but it forces every nonzero obstruction to have finite-dimensional structure.
[definition: Spectrum]
Let $X$ be a complex Banach space and let $T \in \mathcal{L}(X)$. The spectrum of $T$ is
\begin{align*}
\sigma(T)=\{\lambda \in \mathbb{C}: T-\lambda I_X \text{ is not bijective with bounded inverse}\}.
\end{align*}
[/definition]
The spectrum measures where the resolvent fails. For compact operators, the nonzero part of this failure is almost entirely algebraic.
To describe that algebraic part, we need to distinguish spectral values that arise from actual vectors scaled by the operator. These are the spectral values visible through finite-dimensional invariant directions, and compactness forces all nonzero spectral values into this category.
[definition: Eigenvalue]
Let $X$ be a complex [vector space](/page/Vector%20Space) and let $T:X \to X$ be linear. A scalar $\lambda \in \mathbb{C}$ is an eigenvalue of $T$ if there exists $x \in X\setminus\{0\}$ such that
\begin{align*}
Tx=\lambda x.
\end{align*}
The set $\ker(T-\lambda I_X)$ is the eigenspace associated to $\lambda$.
[/definition]
Eigenvalues are visible directions where the operator acts by scaling. A general bounded operator can have spectral values with no eigenvectors, because failure of invertibility need not produce an actual kernel vector. Compactness rules out that pathology away from $0$: a nonzero spectral obstruction cannot spread through infinitely many independent directions without producing a genuine finite-dimensional eigenspace.
[quotetheorem:4923]
This theorem is the compact-operator analogue of finite-dimensional spectral theory. The new feature is the unavoidable presence of $0$, reflecting that a compact operator on an infinite-dimensional space cannot be boundedly invertible.
[example: Spectrum of a Diagonal Compact Operator]
Let $T:\ell^2\to\ell^2$ be defined by
\begin{align*}
T(x_1,x_2,x_3,\ldots)=\left(x_1,\frac{x_2}{2},\frac{x_3}{3},\ldots\right).
\end{align*}
For the standard basis vector $e_k$, the $n$-th coordinate of $Te_k$ is
\begin{align*}
(Te_k)_n=\frac{(e_k)_n}{n}
=
\begin{cases}
1/k, & n=k,\\
0, & n\ne k,
\end{cases}
\end{align*}
so
\begin{align*}
Te_k=\frac{1}{k}e_k.
\end{align*}
Thus every $1/k$ is an eigenvalue of $T$.
Conversely, suppose $\lambda\ne 0$ and $\lambda\ne 1/k$ for every $k\in\mathbb{N}$. For $x=(x_k)\in\ell^2$,
\begin{align*}
((T-\lambda I)x)_k
&=\frac{x_k}{k}-\lambda x_k\\
&=\left(\frac{1}{k}-\lambda\right)x_k.
\end{align*}
Since $1/k\to 0$ and $\lambda\ne 0$, there is $N$ such that $1/k<|\lambda|/2$ for all $k>N$. Hence, for $k>N$,
\begin{align*}
\left|\frac{1}{k}-\lambda\right|
\ge |\lambda|-\frac{1}{k}
> \frac{|\lambda|}{2}.
\end{align*}
For the remaining finitely many indices, the numbers $\left|1/k-\lambda\right|$ are positive, so
\begin{align*}
c=\min\left\{\left|1-\lambda\right|,\left|\frac{1}{2}-\lambda\right|,\ldots,\left|\frac{1}{N}-\lambda\right|,\frac{|\lambda|}{2}\right\}>0.
\end{align*}
Therefore $\left|1/k-\lambda\right|\ge c$ for every $k$.
Given $y=(y_k)\in\ell^2$, define
\begin{align*}
x_k=\frac{y_k}{1/k-\lambda}.
\end{align*}
Then
\begin{align*}
\|x\|_{\ell^2}^2
&=\sum_{k=1}^\infty \left|\frac{y_k}{1/k-\lambda}\right|^2\\
&\le \frac{1}{c^2}\sum_{k=1}^\infty |y_k|^2\\
&=\frac{1}{c^2}\|y\|_{\ell^2}^2,
\end{align*}
so $x\in\ell^2$, and
\begin{align*}
((T-\lambda I)x)_k
&=\left(\frac{1}{k}-\lambda\right)\frac{y_k}{1/k-\lambda}\\
&=y_k.
\end{align*}
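Thus $T-\lambda I$ is surjective, and the estimate above bounds the solution by $\|x\|_{\ell^2}\le c^{-1}\|y\|_{\ell^2}$. It is also injective: if $(T-\lambda I)x=0$, then $(1/k-\lambda)x_k=0$ with $1/k-\lambda\ne 0$, so $x_k=0$ for every $k$. Therefore $T-\lambda I$ is bijective with bounded inverse whenever $\lambda\ne 0$ and $\lambda\notin\{1/k:k\in\mathbb{N}\}$. Hence the nonzero spectral values are exactly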
\begin{align*}
\left\{\frac{1}{k}:k\in\mathbb{N}\right\}.
\end{align*}
Finally, $0$ lies in the spectrum because $T$ is not onto $\ell^2$. Indeed, let
\begin{align*}
y=\left(1,\frac{1}{2},\frac{1}{3},\ldots\right).
\end{align*}
Then
\begin{align*}
\|y\|_{\ell^2}^2
=\sum_{k=1}^\infty \frac{1}{k^2}<\infty,
\end{align*}
so $y\in\ell^2$. If $Tx=y$, then for every $k$,
\begin{align*}
\frac{x_k}{k}=\frac{1}{k},
\end{align*}
and therefore $x_k=1$. But
\begin{align*}
\sum_{k=1}^\infty |x_k|^2
=\sum_{k=1}^\infty 1
=\infty,
\end{align*}
so no such $x$ belongs to $\ell^2$. Thus $T$ is not surjective, $0\in\sigma(T)$, and the spectrum is
\begin{align*}
\sigma(T)=\{0\}\cup\left\{\frac{1}{k}:k\in\mathbb{N}\right\}.
\end{align*}
The nonzero spectral values accumulate only at $0$, exactly matching the compact-operator spectral picture.
[/example]
The example is a perfect model for the general theorem: compactness allows infinitely many spectral values, but they must disappear toward zero.
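The coordinatewise inversion in the example can be replayed on a finite truncation. The sketch below assumes the illustrative value $\lambda=0.3$, which is neither $0$ nor of the form $1/k$, and the right-hand side $y=(1/k)_k$. It checks the bound $\|x\|\le c^{-1}\|y\|$ and then shows how the would-be preimage at $\lambda=0$ grows with the truncation size.

```python
import math

DIM = 400   # number of coordinates kept in the truncation (illustrative)
lam = 0.3   # assumed test value: nonzero and not of the form 1/k

def l2_norm(x):
    return math.sqrt(sum(abs(t) ** 2 for t in x))

# Diagonal entries of T - lam*I, namely 1/k - lam.
multipliers = [1.0 / k - lam for k in range(1, DIM + 1)]
c = min(abs(m) for m in multipliers)  # distance from lam to the stored eigenvalues

y = [1.0 / k for k in range(1, DIM + 1)]
x = [yk / m for yk, m in zip(y, multipliers)]  # solves (T - lam*I) x = y coordinatewise
residual = l2_norm([m * xk - yk for m, xk, yk in zip(multipliers, x, y)])

# At lam = 0 the equation T x = y forces x_k = k * y_k = 1 in every coordinate.
preimage_at_zero = [k * yk for k, yk in enumerate(y, start=1)]
print(c, l2_norm(x), l2_norm(y) / c, residual, l2_norm(preimage_at_zero))
```

Here $c=|1/3-0.3|=1/30$, attained at $k=3$, and the computed solution respects the bound from the example. The norm of the preimage at $\lambda=0$ equals $\sqrt{\text{DIM}}$ on the truncation, so it is unbounded as the truncation grows, consistent with $T$ failing to be surjective on $\ell^2$.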
### Self-Adjoint Compact Operators
In Hilbert spaces, the inner product lets us ask whether an operator is compatible with orthogonality. This compatibility is essential for diagonalization: without it, eigenspaces need not be orthogonal and the spectral picture can contain Jordan-type behavior. The right [symmetry condition](/theorems/1360) is self-adjointness.
[definition: Self-Adjoint Operator]
Let $H$ be a complex Hilbert space. An operator $T \in \mathcal{L}(H)$ is self-adjoint if
\begin{align*}
(Tx,y)_H=(x,Ty)_H
\end{align*}
for all $x,y \in H$.
[/definition]
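As a quick check of the definition, consider the diagonal operator $(Tx)_k=x_k/k$ on $\ell^2$ from the preceding example. The computation below assumes the inner product $(x,y)_{\ell^2}=\sum_{k=1}^\infty x_k\overline{y_k}$, linear in the first slot; with the opposite convention the same argument applies with the conjugates moved.
\begin{align*}
(Tx,y)_{\ell^2}
&=\sum_{k=1}^\infty \frac{x_k}{k}\,\overline{y_k}\\
&=\sum_{k=1}^\infty x_k\,\overline{\left(\frac{y_k}{k}\right)}\\
&=(x,Ty)_{\ell^2},
\end{align*}
where the middle step uses that $1/k$ is real. By contrast, for the illustrative operator $(Rx)_k=\tfrac{i}{k}x_k$ (a name introduced here), the same computation gives $(Rx,y)_{\ell^2}=-(x,Ry)_{\ell^2}$, so $R$ is not self-adjoint.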
Self-adjointness brings orthogonality into the spectral picture. Compactness alone controls where nonzero spectral values can occur, but it does not by itself provide an orthonormal system in which the operator is diagonal. The inner-product symmetry removes this obstruction and allows a compact self-adjoint operator to be represented through orthonormal eigenvectors.
[quotetheorem:538]
This theorem is the foundation for Fourier-type expansions attached to compact [self-adjoint operators](/page/Self-Adjoint%20Operators), including many boundary value problems and integral equations.
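For orientation, here is a sketch of what such an expansion looks like in the one case computed earlier, the diagonal operator $(Tx)_k=x_k/k$ on $\ell^2$. It assumes the inner product is linear in the first slot, so that $(x,e_k)_{\ell^2}=x_k$; it illustrates the shape of the expansion and is not a restatement of the quoted theorem. The coordinate vectors $e_k$ are orthonormal eigenvectors with eigenvalues $1/k$, and
\begin{align*}
Tx=\sum_{k=1}^\infty \frac{1}{k}\,(x,e_k)_{\ell^2}\,e_k .
\end{align*}
The series converges in norm, with an explicit tail estimate:
\begin{align*}
\left\|Tx-\sum_{k=1}^n \frac{1}{k}\,(x,e_k)_{\ell^2}\,e_k\right\|_{\ell^2}^2
&=\sum_{k=n+1}^\infty \frac{|x_k|^2}{k^2}\\
&\le \frac{1}{(n+1)^2}\,\|x\|_{\ell^2}^2 .
\end{align*}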
## Fredholm Equations and Compact Perturbations
Compact operators are often important because they are small relative to the identity in an infinite-dimensional sense. The equation $Tx=y$ may be ill-behaved for compact $T$, but the equation $(I-T)x=y$ behaves like a finite-dimensional linear system up to a finite-dimensional obstruction.
The guiding problem is the Fredholm equation. It isolates the case where an unknown appears both directly and through a compact transformation, as happens in many integral equations and reformulations of differential equations.
[definition: Fredholm Equation of the Second Kind]
Let $X$ be a Banach space and let $T \in \mathcal{K}(X)$. A Fredholm equation of the second kind is an equation of the form
\begin{align*}
x-Tx=y,
\end{align*}
where $y \in X$ is given and $x \in X$ is unknown.
[/definition]
This equation appears when the identity part controls the main behavior and the compact part contributes a smoothing or lower-order effect. The solvability theory is governed by a finite-dimensional nullspace.
The central question is whether failure of uniqueness destroys the existence theory. For compact perturbations of the identity, the answer is a dichotomy: either the equation is uniquely solvable for every right-hand side, or the homogeneous equation $x-Tx=0$ has a nonzero solution, in which case its solution space is finite-dimensional. In the solvability condition below, $T^*:X^* \to X^*$ denotes the Banach-space adjoint, defined by $T^*f=f \circ T$ for $f \in X^*$; hence $I_{X^*}-T^*$ is the adjoint of $I_X-T$.
[quotetheorem:219]
[The Fredholm alternative](/page/The%20Fredholm%20Alternative) says that failure of invertibility is not chaotic. It is explained by genuine solutions of the homogeneous equation $x=Tx$, and the only forbidden right-hand sides are those detected by the finite-dimensional adjoint nullspace.
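A finite-dimensional toy case may help fix the picture. Assume $X=\mathbb{C}^2$ and take the matrix $T$ below, a hypothetical choice made only for illustration. The equation $x-Tx=y$ becomes
\begin{align*}
\begin{pmatrix}1&0\\0&1\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}
-\begin{pmatrix}1&0\\0&0\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}
=\begin{pmatrix}0\\x_2\end{pmatrix}
=\begin{pmatrix}y_1\\y_2\end{pmatrix}.
\end{align*}
The homogeneous equation has the nonzero solution $x=(1,0)$, so unique solvability fails. A solution exists exactly when $y_1=0$, and in that case the solutions are $x=(a,y_2)$ with $a\in\mathbb{C}$ arbitrary: a one-dimensional nullspace paired with a single scalar condition on the right-hand side.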
[example: A Rank-One Fredholm Equation]
Let $X=C([0,1])$ and define $T:X\to X$ by
\begin{align*}
(Tf)(x)=\int_0^1 f(y)\,d\mathcal{L}^1(y).
\end{align*}
For each $f\in X$, the number
\begin{align*}
c_f=\int_0^1 f(y)\,d\mathcal{L}^1(y)
\end{align*}
is independent of $x$, so $Tf$ is the constant function $x\mapsto c_f$. Hence
\begin{align*}
\operatorname{Range}(T)\subset \operatorname{span}\{\mathbf{1}\},
\end{align*}
where $\mathbf{1}(x)=1$ for all $x\in[0,1]$. Thus $T$ has finite rank. It is also bounded, since $|c_f|$ is at most the supremum of $|f|$ on $[0,1]$, and therefore $T$ is compact by *Finite-Rank Operators Are Compact*.
We solve
\begin{align*}
f-Tf=g.
\end{align*}
Writing
\begin{align*}
c=\int_0^1 f(y)\,d\mathcal{L}^1(y),
\end{align*}
the equation becomes, for every $x\in[0,1]$,
\begin{align*}
f(x)-c=g(x).
\end{align*}
Integrating both sides over $[0,1]$ gives
\begin{align*}
\int_0^1 f(x)\,d\mathcal{L}^1(x)-\int_0^1 c\,d\mathcal{L}^1(x)
&=\int_0^1 g(x)\,d\mathcal{L}^1(x)\\
c-c\,\mathcal{L}^1([0,1])
&=\int_0^1 g(x)\,d\mathcal{L}^1(x)\\
c-c
&=\int_0^1 g(x)\,d\mathcal{L}^1(x)\\
0&=\int_0^1 g(x)\,d\mathcal{L}^1(x).
\end{align*}
So a solution can exist only if
\begin{align*}
\int_0^1 g(x)\,d\mathcal{L}^1(x)=0.
\end{align*}
Conversely, assume
\begin{align*}
\int_0^1 g(x)\,d\mathcal{L}^1(x)=0.
\end{align*}
For any constant $a\in\mathbb{C}$, define
\begin{align*}
f(x)=g(x)+a.
\end{align*}
Then
\begin{align*}
\int_0^1 f(y)\,d\mathcal{L}^1(y)
&=\int_0^1 (g(y)+a)\,d\mathcal{L}^1(y)\\
&=\int_0^1 g(y)\,d\mathcal{L}^1(y)+\int_0^1 a\,d\mathcal{L}^1(y)\\
&=0+a\,\mathcal{L}^1([0,1])\\
&=a,
\end{align*}
and therefore
\begin{align*}
(f-Tf)(x)
&=(g(x)+a)-a\\
&=g(x).
\end{align*}
Thus every constant $a$ gives a solution $f=g+a$.
Finally, if $f$ is any solution and $c=\int_0^1 f\,d\mathcal{L}^1$, then $f(x)-c=g(x)$ for every $x$, so
\begin{align*}
f(x)=g(x)+c.
\end{align*}
Hence all solutions are exactly
\begin{align*}
f=g+a,\qquad a\in\mathbb{C},
\end{align*}
and the only obstruction to solvability is the single scalar condition that $g$ have integral zero.
[/example]
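The two ingredients of the Fredholm alternative can be read off directly from this example. The functional $\varphi$ below is a name introduced here; it is the integration functional on $X=C([0,1])$. If $f-Tf=0$, then $f=c_f\mathbf{1}$ is constant, and every constant $a\mathbf{1}$ satisfies $T(a\mathbf{1})=a\mathbf{1}$, so
\begin{align*}
\ker(I-T)=\operatorname{span}\{\mathbf{1}\}.
\end{align*}
On the adjoint side, set $\varphi(g)=\int_0^1 g(y)\,d\mathcal{L}^1(y)$. Then, for every $f\in X$,
\begin{align*}
(T^*\varphi)(f)
&=\varphi(Tf)\\
&=\int_0^1 c_f\,d\mathcal{L}^1(y)\\
&=c_f\\
&=\varphi(f),
\end{align*}
so $\varphi\in\ker(I-T^*)$. Conversely, if $\psi\in X^*$ satisfies $\psi=T^*\psi$, then $\psi(f)=\psi(Tf)=c_f\,\psi(\mathbf{1})=\psi(\mathbf{1})\,\varphi(f)$, so $\psi$ is a multiple of $\varphi$. Both nullspaces are one-dimensional, and the solvability condition found above is exactly $\varphi(g)=0$.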
Rank-one examples show the structure without technical overhead. General compact operators behave similarly, except the obstruction space can have any finite dimension.
## Beyond and Connected Topics
Compact operators sit at the intersection of topology, linear analysis, and spectral theory. The next natural topic is [Hilbert Space](/page/Hilbert%20Space), where orthogonality turns compactness into a diagonal theory for self-adjoint operators. The [spectral theorem for compact self-adjoint operators](/theorems/538) is the prototype for many expansions in analysis.
Another continuation is [Weak Convergence](/page/Weak%20Convergence). Compactness is frequently used to upgrade weak convergence to strong convergence, especially in variational problems and PDE. The compact embedding notation $X \hookrightarrow\hookrightarrow Y$ belongs to this same circle of ideas.
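A minimal illustration of this upgrade, using the coordinate vectors $e_k$, the diagonal operator $(Tx)_k=x_k/k$, and the right shift $S$ from the earlier examples: for every $y\in\ell^2$,
\begin{align*}
|(e_k,y)_{\ell^2}|=|y_k|\to 0\quad\text{as }k\to\infty,
\end{align*}
so $e_k$ converges weakly to $0$ although $\|e_k\|_{\ell^2}=1$ for all $k$. The compact operator $T$ turns this into norm convergence,
\begin{align*}
\|Te_k\|_{\ell^2}=\left\|\frac{1}{k}e_k\right\|_{\ell^2}=\frac{1}{k}\to 0,
\end{align*}
whereas the non-compact shift does not: for $j\ne k$,
\begin{align*}
\|Se_j-Se_k\|_{\ell^2}^2=\|e_{j+1}-e_{k+1}\|_{\ell^2}^2=2,
\end{align*}
so no subsequence of $(Se_k)$ converges in norm.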
A third direction is [Sobolev Space](/page/Sobolev%20Space). Rellich-Kondrachov compactness says that certain Sobolev embeddings are compact on bounded domains, and this is the mechanism behind many existence proofs for elliptic equations.
Compact operators also lead into Fredholm theory. Operators of the form identity plus a compact operator have closed range and finite-dimensional kernels and cokernels, making them central in elliptic boundary value problems and integral equations.
Finally, compactness should be contrasted with general bounded operators on Banach spaces. The shift on $\ell^2$ and multiplication operators on $L^2$ show that bounded operators can retain infinite-dimensional degrees of freedom in ways compact operators cannot.
## References
Androma, [Cambridge III Functional Analysis](/page/Cambridge%20III%20Functional%20Analysis).
Androma, [Hilbert Space](/page/Hilbert%20Space).
Androma, [Weak Convergence](/page/Weak%20Convergence).
Androma, [Sobolev Space](/page/Sobolev%20Space).
Androma, *Spectral Theory*.
Reed and Simon, *Methods of Modern Mathematical Physics I: Functional Analysis* (1980).
Conway, *A Course in Functional Analysis* (1990).
Kreyszig, *Introductory Functional Analysis with Applications* (1978).