Many of the most fundamental results in analysis — the Extreme Value Theorem, the existence of minimizers for variational problems, the convergence of approximation schemes — depend on the ability to extract convergent subsequences from bounded collections of objects. In finite dimensions, the [Bolzano-Weierstrass theorem](/theorems/628) guarantees this: every bounded sequence in $\mathbb{R}^n$ has a convergent subsequence. But in the broader settings of topology and infinite-dimensional analysis, boundedness alone is insufficient. The question of *which* spaces permit the extraction of convergent subsequences — and which do not — is the central concern of the theory of compactness.
The concept is best appreciated by examining what goes wrong without it.
[example: Failure of the Extreme Value Theorem on Non-Compact Domains]
Consider the function $f: (0, 1) \to \mathbb{R}$ defined by $f(x) = 1/x$. This function is [continuous](/page/Continuity), and its [supremum](/page/Supremum%20and%20Infimum) satisfies
\begin{align*}
\sup_{x \in (0,1)} f(x) = +\infty.
\end{align*}
The supremum exists in the extended reals $\overline{\mathbb{R}}$ but is not finite, so $f$ does not attain a maximum value in $(0,1)$. The domain $(0,1)$ is bounded but not closed in $\mathbb{R}$, hence not compact by the [Heine-Borel theorem](/theorems/309).
Now consider the function $g: (0, 1) \to \mathbb{R}$ defined by $g(x) = x$. This function is bounded (with $\sup g = 1$), but the supremum is never attained — there is no $x_0 \in (0,1)$ with $g(x_0) = 1$. The [infimum](/page/Supremum%20and%20Infimum) $\inf g = 0$ is likewise not attained. The issue is the same: the domain is not compact, and a maximizing sequence $x_k = 1 - 1/k$ converges to the point $1 \notin (0,1)$.
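By contrast, the compact domain $[0, 1]$ repairs the second example: $g(x) = x$ extends continuously to $[0, 1]$, where it attains its maximum at $x = 1$ and its minimum at $x = 0$. The first example cannot be repaired this way, since $f(x) = 1/x$ is unbounded near $0$ and has no continuous extension to $[0, 1]$; on any compact subinterval $[a, 1]$ with $a > 0$, however, $f$ attains its maximum $1/a$. A closed interval traps the limit of every convergent sequence inside the domain, preventing the "escape" that occurs in the open interval.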
[/example]
This pattern — a maximizing or minimizing sequence that "escapes" the domain, either by diverging to infinity or by converging to a boundary point that is not included — is the fundamental obstruction that compactness removes. In the [calculus of variations](/page/Calculus%20of%20Variations), the same phenomenon arises in infinite dimensions: a minimizing sequence of functions may oscillate ever more rapidly, or concentrate its mass at a point, or spread out to infinity, all while maintaining a bounded energy. Compactness, in its various incarnations, is the tool that prevents these pathologies.
## Definition
In finite-dimensional Euclidean space, the Heine-Borel theorem tells us that "closed and bounded" is the right condition to prevent the escape of sequences. But this characterisation depends on the metric structure of $\mathbb{R}^n$ — it uses the distance function to define boundedness and the completeness of $\mathbb{R}^n$ to ensure limits stay in the space. In a general [topological space](/page/Topology), where there may be no metric, no notion of "bounded," and no sequences powerful enough to detect the topology, we need a condition stated entirely in terms of open sets. The challenge is: *what purely topological condition captures the same "nothing escapes" phenomenon?*
The answer is the open-cover formulation. The idea is that if a space can be "watched" by a collection of open sets, then compactness demands that finitely many of those sets already suffice. This finiteness is what makes local-to-global arguments work — it converts infinitely many local observations into a single global conclusion.
[definition: Compact Space]
A [topological space](/page/Topology) $X$ is **compact** if every open cover of $X$ has a finite subcover. That is, whenever $\{U_\alpha\}_{\alpha \in A}$ is a collection of [open sets](/page/Open%20Set) in $X$ satisfying
\begin{align*}
X = \bigcup_{\alpha \in A} U_\alpha,
\end{align*}
there exist finitely many indices $\alpha_1, \ldots, \alpha_N \in A$ such that
\begin{align*}
X = U_{\alpha_1} \cup U_{\alpha_2} \cup \cdots \cup U_{\alpha_N}.
\end{align*}
A subset $K$ of a topological space $X$ is **compact** if it is compact in the subspace topology — equivalently, if every cover of $K$ by open sets in $X$ has a finite subcover.
[/definition]
Both the word *finite* and the quantifier *every* are essential. Every topological space is covered by itself (a single open set), so some open cover always has a finite subcover. The force of the definition comes from requiring that *every* open cover, no matter how inefficiently chosen, can be reduced to a finite one.
[remark: Compactness vs. Closedness and Boundedness]
In general [topological spaces](/page/Topology), compactness is equivalent neither to closedness nor to boundedness, and boundedness is a metric notion that need not even make sense. An infinite set with the discrete topology is closed in itself and bounded in the discrete metric (where all distances are at most $1$), yet it is not compact: the cover by singletons $\{x\}$ has no finite subcover. Conversely, any set with the indiscrete topology is compact (the only open sets are $\varnothing$ and $X$, so every open cover is already finite), and so is every subset of it, although the only closed sets are $\varnothing$ and $X$ itself. Compact subsets therefore need not be closed.
The equivalence of compactness with "closed and bounded" is specific to $\mathbb{R}^n$ (the Heine-Borel theorem) and does not extend even to infinite-dimensional [normed spaces](/page/Normed%20Vector%20Space).
[/remark]
## Compactness in Metric Spaces
For general topological spaces, the open-cover definition is the basic formulation of compactness. But in [metric spaces](/page/Metric%20Space), where we have a distance function and therefore a well-behaved notion of convergent sequences, compactness admits several equivalent characterisations that are often more practical to verify and to apply. The central question is: when can we guarantee that every sequence has a convergent subsequence?
### Sequential Compactness
In analysis, we most often encounter compactness through sequences — constructing a bounded sequence and then extracting a convergent subsequence. This leads to a natural alternative formulation.
[definition: Sequential Compactness]
A topological space $X$ is **sequentially compact** if every [sequence](/page/Sequence) in $X$ has a convergent subsequence. That is, for every sequence $\{x_k\}_{k=1}^\infty \subset X$, there exists a subsequence $\{x_{k_j}\}_{j=1}^\infty$ and a point $x \in X$ such that $x_{k_j} \to x$ as $j \to \infty$.
[/definition]
In general topological spaces, sequential compactness and compactness are independent properties — neither implies the other. However, in metric spaces, the two notions coincide, and they are furthermore equivalent to a third property involving approximation by finite sets.
[remark: Nets and Compactness in General Topological Spaces]
The failure of sequences to detect compactness in general topological spaces is resolved by **nets** (also called Moore-Smith sequences). A net is a function from a directed set into a topological space, generalising sequences (which are nets indexed by $\mathbb{N}$). A topological space $X$ is compact if and only if every net in $X$ has a convergent subnet. This characterisation is the correct generalisation of sequential compactness beyond metric spaces and is essential in functional analysis, where weak and [weak-*](/page/Weak*%20Topology) topologies on nonseparable spaces are typically not [metrizable](/page/Metrizable%20Space). See [Nets and Filters](/page/Nets%20and%20Filters) for details.
[/remark]
In the metric setting, compactness admits a more concrete characterisation through the notion of total boundedness — a quantitative condition that controls how efficiently the space can be covered by finitely many balls.
[definition: Total Boundedness]
A [metric space](/page/Metric%20Space) $(M, d)$ is **totally bounded** if for every $\varepsilon > 0$, there exists a finite set $\{x_1, \ldots, x_N\} \subset M$ such that
\begin{align*}
M = \bigcup_{i=1}^N B(x_i, \varepsilon).
\end{align*}
The set $\{x_1, \ldots, x_N\}$ is called an **$\varepsilon$-net** for $M$.
[/definition]
Total boundedness says that the space can be approximated, to any desired accuracy, by a finite set of points. This is stronger than boundedness: a bounded set in an infinite-dimensional [normed space](/page/Normed%20Vector%20Space) need not be totally bounded (indeed, this failure characterises infinite dimensionality). But it is weaker than compactness on its own — the open interval $(0,1)$ with the standard metric is totally bounded but not compact, because it is not complete.
The following theorem shows that completeness is precisely the missing ingredient.
[quotetheorem:316]
The equivalence of (1) and (2) is the reason that, in metric space settings, one can freely move between the open-cover definition and the subsequence-extraction definition without comment. The equivalence with (3) is particularly useful for *verifying* compactness: checking that a metric space is complete is often straightforward (e.g., closed subsets of complete spaces are complete), and total boundedness can be verified by constructing explicit $\varepsilon$-nets.
The equivalence also illuminates what can go wrong. A metric space fails to be compact if and only if it is incomplete (Cauchy sequences can head toward a point "outside" the space) or not totally bounded (the space is "too spread out" to be covered by finitely many balls of uniform radius). In $\mathbb{R}$, the open interval $(0,1)$ fails for the first reason; the real line $\mathbb{R}$ itself fails for the second.
### The Lebesgue Number Lemma
A useful companion to the equivalence theorem is the **Lebesgue number lemma**, which quantifies how "thickly" an open cover of a compact metric space overlaps.
[quotetheorem:952]
The Lebesgue number lemma fails without compactness. On $(0, 1)$ with the open cover $\{(1/n, 1)\}_{n \ge 2}$, no single $\delta$ works: for any $\delta > 0$, the set $(0, \delta/2) \cap (0, 1)$ has diameter less than $\delta$, but it is not contained in any $(1/n, 1)$, because it contains points smaller than $1/n$. The lemma is used in algebraic topology to subdivide paths and simplices finely enough that each piece lies in a single set of a cover, and in analysis to pass between covers and uniform estimates.
[example: Total Boundedness of Closed Bounded Sets in $\mathbb{R}^n$]
Let $K \subset \mathbb{R}^n$ be bounded, meaning $K \subset B(0, R)$ for some $R > 0$. We verify that $K$ is totally bounded.
Fix $\varepsilon > 0$. The cube $[-R, R]^n$ can be partitioned into sub-cubes of side length $\varepsilon / \sqrt{n}$. The number of such sub-cubes is
\begin{align*}
N = \left\lceil \frac{2R}{\varepsilon / \sqrt{n}} \right\rceil^n = \left\lceil \frac{2R\sqrt{n}}{\varepsilon} \right\rceil^n < \infty.
\end{align*}
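Each sub-cube has diameter at most $\varepsilon$ (since the diagonal of a cube with side $s$ in $\mathbb{R}^n$ has length $s\sqrt{n} = \varepsilon$), so every point of a sub-cube lies within $\varepsilon/2$ of its centre. The balls of radius $\varepsilon$ centred at the centres of the sub-cubes that intersect $K$ therefore cover $K$. The centres need not belong to $K$; replacing each centre by a point of $K$ in the same sub-cube, and running the construction with $\varepsilon/2$ in place of $\varepsilon$, gives a finite $\varepsilon$-net contained in $K$.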
Since $\mathbb{R}^n$ is complete and closed subsets of complete spaces are complete, any closed bounded subset of $\mathbb{R}^n$ is both complete and totally bounded, hence compact. This is one direction of the Heine-Borel theorem.
[/example]
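The grid construction in the example can be tried numerically. The following is a minimal sketch, assuming $K$ is represented by a finite random sample in $[-R, R]^2$; the function names `grid_epsilon_net` and `covered` are illustrative and not part of the text. It uses sub-cube centres as ball centres, which need not lie in $K$.

```python
import math
import random

def grid_epsilon_net(points, radius, eps):
    """Return the centres of the sub-cubes of [-radius, radius]^n that contain a point.

    Each sub-cube has side eps / sqrt(n), so its centre is within eps / 2
    of every point inside it.
    """
    n = len(points[0])
    side = eps / math.sqrt(n)
    occupied = set()
    for p in points:
        # Integer index of the sub-cube containing p along each axis.
        occupied.add(tuple(math.floor((c + radius) / side) for c in p))
    return [tuple(-radius + (i + 0.5) * side for i in idx) for idx in occupied]

def covered(points, centres, eps):
    """Check that every point lies within distance eps of some centre."""
    return all(any(math.dist(p, c) < eps for c in centres) for p in points)

random.seed(0)
R, eps = 1.0, 0.25
sample = [(random.uniform(-R, R), random.uniform(-R, R)) for _ in range(500)]
net = grid_epsilon_net(sample, R, eps)
print(len(net), covered(sample, net, eps))
```

The number of centres never exceeds the bound $\lceil 2R\sqrt{n}/\varepsilon \rceil^n$ from the example, which equals $144$ for these parameters.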
### The Heine-Borel Theorem
The most concrete and frequently used compactness criterion is the characterisation of compact subsets of Euclidean space.
[quotetheorem:309]
The "only if" direction holds in any metric space: compact subsets are always closed (in Hausdorff spaces) and bounded. The "if" direction is specific to $\mathbb{R}^n$ — it relies on the completeness of $\mathbb{R}^n$ and on the fact that bounded sets in $\mathbb{R}^n$ are totally bounded (as verified in the example above). Both properties fail in general metric spaces.
The Heine-Borel theorem is the reason that, in undergraduate analysis, one can work with "closed and bounded" as a synonym for "compact." But this equivalence breaks down catastrophically in infinite-dimensional spaces.
[example: Failure of Heine-Borel in Infinite Dimensions]
Consider the Hilbert space $\ell^2 = \ell^2(\mathbb{N})$ of square-summable real sequences, equipped with the norm
\begin{align*}
\|x\|_{\ell^2} = \left(\sum_{k=1}^\infty x_k^2\right)^{1/2}.
\end{align*}
The closed unit ball $\overline{B}_1(0) = \{x \in \ell^2 : \|x\|_{\ell^2} \le 1\}$ is closed and bounded. We show it is not compact by constructing a sequence with no convergent subsequence.
Define the sequence of standard basis vectors $e_k = (0, \ldots, 0, 1, 0, \ldots)$ with a $1$ in position $k$. Each $e_k$ lies in $\overline{B}_1(0)$ since $\|e_k\|_{\ell^2} = 1$. For $j \neq k$, the vectors $e_j$ and $e_k$ are orthogonal: $(e_j, e_k)_{\ell^2} = 0$. Computing the distance directly,
\begin{align*}
\|e_j - e_k\|_{\ell^2}^2 &= \sum_{i=1}^\infty |(e_j)_i - (e_k)_i|^2 = |(e_j)_j - (e_k)_j|^2 + |(e_j)_k - (e_k)_k|^2 = 1^2 + (-1)^2 = 2,
\end{align*}
so $\|e_j - e_k\|_{\ell^2} = \sqrt{2}$. Every pair of distinct terms is separated by a distance of $\sqrt{2}$. No subsequence can be Cauchy, hence no subsequence converges. The closed unit ball in $\ell^2$ is not sequentially compact, hence not compact.
[/example]
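The distance computation can be checked on finite truncations. This is a small sketch, assuming we keep only the first `dim` coordinates of each basis vector (which loses nothing for $e_1, \ldots, e_{\text{dim}}$); the helper names are illustrative.

```python
import math

def basis_vector(k, dim):
    """First `dim` coordinates of the standard basis vector e_k (1-indexed)."""
    return [1.0 if i == k else 0.0 for i in range(1, dim + 1)]

def l2_distance(x, y):
    """Euclidean distance between two finite coordinate lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

dim = 8
vectors = [basis_vector(k, dim) for k in range(1, dim + 1)]
distances = {
    (j, k): l2_distance(vectors[j - 1], vectors[k - 1])
    for j in range(1, dim + 1)
    for k in range(j + 1, dim + 1)
}
# Every pair of distinct basis vectors is sqrt(2) apart.
print(all(math.isclose(d, math.sqrt(2)) for d in distances.values()))
```

The separation does not shrink as `dim` grows, which is the numerical shadow of the fact that no subsequence is Cauchy.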
This failure is not an accident of the particular space $\ell^2$; it is a fundamental feature of infinite-dimensional analysis.
[quotetheorem:878]
This result, a consequence of [Riesz's lemma](/page/Riesz's%20Lemma), explains why compactness arguments in PDE theory and functional analysis require fundamentally different tools than in $\mathbb{R}^n$. The Bolzano-Weierstrass theorem — extract a convergent subsequence from any bounded sequence — is a theorem about $\mathbb{R}^n$, not about Banach spaces. In infinite dimensions, one must either work with weaker notions of convergence (weak convergence, weak-* convergence) or impose additional structure (equicontinuity, control of derivatives) to recover compactness.
## Continuous Maps on Compact Spaces
One of the most important features of compact spaces is that they interact well with [continuous maps](/page/Continuity). Many results that are difficult or impossible to establish on general spaces become automatic when the domain is compact. The underlying mechanism is always the same: continuity converts a problem about the range into a problem about the domain, and compactness (of the domain) provides the finiteness needed to close the argument.
### Preservation of Compactness
The simplest and most frequently used property is that continuous images of compact sets are compact.
[quotetheorem:305]
This theorem is the engine behind the Extreme Value Theorem: if $f: X \to \mathbb{R}$ is continuous and $X$ is compact and nonempty, then $f(X)$ is a compact subset of $\mathbb{R}$, hence closed and bounded by Heine-Borel. Being bounded, $f(X)$ has a finite supremum and infimum; being closed, it contains them.
[quotetheorem:304]
This result is the foundation of the **direct method in the calculus of variations**: to find a minimizer of an energy functional $E: X \to \mathbb{R}$, one constructs a minimizing sequence $\{u_k\} \subset X$ with $E(u_k) \to \inf E$. If $X$ is compact (or if a suitable compactness argument produces a convergent subsequence) and $E$ is continuous (or lower semicontinuous), then the limit of the subsequence is a minimizer. The failure examples from the opening, where the supremum is not attained, show precisely what goes wrong when $X$ is not compact: the extremizing sequence converges to a point outside the domain.
### Uniform Continuity
On non-compact domains, a [continuous function](/page/Continuity%20(Metric%20Spaces)) can have its modulus of continuity deteriorate arbitrarily at different points. For example, the function $f: (0, \infty) \to \mathbb{R}$ defined by $f(x) = \sin(1/x)$ is continuous on $(0, \infty)$, but it oscillates more and more rapidly near $x = 0$. The function is continuous but not uniformly continuous. Compactness prevents this deterioration.
[quotetheorem:954]
The crucial point is that $\delta$ depends only on $\varepsilon$, not on the point $x_1$. On compact sets, every continuous function is uniformly continuous. This result is used constantly in approximation theory (to justify that [mollification](/page/Standard%20Mollifier) converges uniformly on compact subsets), in numerical analysis (to bound truncation errors uniformly), and in PDE theory (to pass limits through integrals).
### The Closed Map Lemma
The interaction between compactness and the Hausdorff property produces a powerful rigidity for [continuous maps](/page/Continuity%20(Metric%20Spaces)).
[quotetheorem:317]
An immediate consequence is that a continuous bijection from a compact space to a Hausdorff space is automatically a homeomorphism. In a general topological setting, a continuous bijection need not have a continuous inverse — the identity map from $(\mathbb{R}, \text{discrete})$ to $(\mathbb{R}, \text{Euclidean})$ is continuous and bijective, but its inverse is not continuous. Compactness of the domain eliminates this pathology.
[example: Compactness Produces Homeomorphisms]
Consider the map $f: [0, 2\pi) \to S^1$ defined by $f(t) = (\cos t, \sin t)$, where $S^1 = \{(x,y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$ is the unit circle. This map is continuous and bijective. However, $f$ is *not* a homeomorphism: the domain $[0, 2\pi)$ is not compact (it is not closed in $\mathbb{R}$), and correspondingly the inverse $f^{-1}$ is not continuous — it has a discontinuity at the point $(1, 0) = f(0)$, where nearby points on the circle come from $t$ near $0$ and $t$ near $2\pi$.
By contrast, the restriction $g: [0, \pi] \to S^1_+$ where $S^1_+ = \{(\cos t, \sin t) : 0 \le t \le \pi\}$ is a continuous bijection from a compact space to a Hausdorff space. The Closed Map Lemma guarantees that $g$ is a homeomorphism, and indeed $g^{-1}(x,y) = \arccos(x)$ is continuous.
[/example]
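The discontinuity of $f^{-1}$ at $(1, 0)$ can be seen numerically. The sketch below assumes the inverse is computed with `math.atan2` reduced modulo $2\pi$; the step size `h` is an arbitrary illustrative choice.

```python
import math

def f(t):
    """The map t -> (cos t, sin t) from [0, 2*pi) onto the unit circle."""
    return (math.cos(t), math.sin(t))

def f_inverse(x, y):
    """Angle in [0, 2*pi) of the point (x, y) on the unit circle."""
    return math.atan2(y, x) % (2 * math.pi)

h = 1e-3
above = f(h)                 # point on the circle just above (1, 0)
below = f(2 * math.pi - h)   # point on the circle just below (1, 0)

gap_on_circle = math.dist(above, below)
gap_in_domain = abs(f_inverse(*above) - f_inverse(*below))
print(gap_on_circle, gap_in_domain)
```

The two points are close on the circle, yet their preimages sit at opposite ends of $[0, 2\pi)$, so no choice of $\delta$ controls $f^{-1}$ near $(1, 0)$.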
## Products and Subspaces
In practice, the spaces we work with are rarely given directly — they are constructed from simpler pieces via products, subspaces, and quotients. If compactness were not preserved by these constructions, the Heine-Borel theorem (which relies on compact intervals and their products) would be an isolated fact rather than a useful tool. The question is: *which constructions preserve compactness, and which can destroy it?*
### Compact Subspaces and Hausdorff Spaces
The following theorem describes the interplay between compactness and closedness. In a compact space, closed subsets inherit compactness. In a Hausdorff space, compact subsets are automatically closed.
[quotetheorem:307]
Part (1) is used constantly: to verify that a subset of $\mathbb{R}^n$ is compact, it suffices to check that it is closed and contained in a compact set (for instance, a large closed ball). Part (2) explains why compact sets in $\mathbb{R}^n$ are always closed — but note that this relies on the Hausdorff property. In a non-Hausdorff space, compact subsets need not be closed.
The combination of parts (1) and (2) in a compact Hausdorff space yields a particularly clean picture: the compact subsets are *exactly* the closed subsets.
### Products of Compact Spaces
One of the most important stability properties of compactness is that it is preserved under products.
[quotetheorem:308]
By induction, any finite product $X_1 \times \cdots \times X_N$ of compact spaces is compact. This immediately implies the Heine-Borel theorem in $\mathbb{R}^n$: a closed, bounded subset $K \subset \mathbb{R}^n$ is contained in a product $[-M, M]^n = [-M, M] \times \cdots \times [-M, M]$, which is compact as a product of compact intervals. Since $K$ is closed and a closed subset of a compact space is compact, $K$ is compact.
The truly deep result is the extension to *arbitrary* products, which requires the [Axiom of Choice](/page/Axiom%20of%20Choice).
[quotetheorem:953]
Tychonoff's theorem is equivalent to the Axiom of Choice. The reverse implication — that Tychonoff's theorem implies the Axiom of Choice — was established by Kelley (1950) and is considerably less straightforward than the forward direction. The theorem's primary application in analysis is through the [Banach-Alaoglu theorem](/theorems/212), which establishes weak-* compactness of the dual unit ball — a result that underpins much of functional analysis and PDE theory.
[explanation: Why the Product Topology]
The choice of topology on the product is critical. Tychonoff's theorem holds for the **product topology** (the coarsest topology making all projection maps continuous), not for the box topology (which has a basis of products of open sets, without the restriction that all but finitely many be the full space).
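In the box topology, even a [countable](/page/Countable%20Set) product of compact spaces can fail to be compact. Consider $\prod_{j=1}^\infty [0, 1]$ with the box topology, and the sequence $x_k$ defined by $(x_k)_j = 0$ for $j \neq k$ and $(x_k)_k = 1$. Every point $y$ of the product has a box neighbourhood containing at most one term of the sequence: if $y = 0$, the box $\prod_{j=1}^\infty [0, 1/2)$ contains no $x_k$, because the $k$-th coordinate of $x_k$ "jumps" to $1$; if $y_j \neq 0$ for some $j$, the set of points whose $j$-th coordinate exceeds $y_j / 2$ is an open neighbourhood of $y$ that contains only $x_j$. These neighbourhoods form an open cover of the product, and any finite subfamily contains only finitely many of the infinitely many distinct points $x_k$, so there is no finite subcover. In particular, the sequence has no convergent subsequence in the box topology, whereas in the product topology $x_k \to 0$.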
This example shows that Tychonoff's theorem is not merely a statement about sets; it depends essentially on the product topology.
[/explanation]
## Failure of Compactness in Infinite Dimensions
The results of the previous sections might suggest that compactness is a broadly available property, at least for "reasonable" subsets of "reasonable" spaces. In finite dimensions, this is largely true: any closed, bounded set is compact. But this picture collapses in infinite-dimensional [Banach spaces](/page/Banach%20Space) and [Hilbert spaces](/page/Hilbert%20Space), which are the natural domains for partial differential equations and the calculus of variations.
The failure is not merely a curiosity — it is the primary obstruction to existence proofs in infinite-dimensional optimisation and PDE theory. Understanding *why* compactness fails, and *how* it fails, is essential for developing the tools (weak convergence, compact embeddings, compact operators) that serve as substitutes.
### Why the Unit Ball is Not Compact
We have already seen in the example above that the closed unit ball $\overline{B}_1(0)$ in $\ell^2$ is not compact: the sequence of standard basis vectors $\{e_k\}_{k=1}^\infty$ satisfies $\|e_j - e_k\|_{\ell^2} = \sqrt{2}$ for $j \neq k$, so no subsequence is Cauchy.
The root cause is that $\overline{B}_1(0)$ is *not* totally bounded in the norm topology. For $\varepsilon < \sqrt{2}/2$, no finite collection of balls $B(x_i, \varepsilon)$ can cover $\overline{B}_1(0)$, because the unit ball contains infinitely many points that are mutually separated by distance $\sqrt{2}$.
This phenomenon manifests in three distinct pathologies for bounded sequences in infinite-dimensional spaces. Each represents a mode of "non-compactness" — a way that a bounded sequence can fail to have a norm-convergent subsequence.
[example: Three Modes of Non-Compactness]
Let $U = (0, 1) \subset \mathbb{R}$ and consider the space $L^2(U)$.
**Oscillation.** Define $f_k: (0,1) \to \mathbb{R}$ by $f_k(x) = \sin(2\pi k x)$. Then $\|f_k\|_{L^2} = 1/\sqrt{2}$ for all $k$, so the sequence is bounded. However, $f_k$ converges weakly to $0$ (by the Riemann-Lebesgue lemma) but not strongly:
\begin{align*}
\|f_k - 0\|_{L^2}^2 = \int_0^1 \sin^2(2\pi k x) \, d\mathcal{L}^1(x) = \frac{1}{2} \not\to 0.
\end{align*}
The oscillations become infinitely rapid, and the "energy" spreads across ever higher frequencies. No subsequence converges in $L^2$.
**Concentration.** Define $g_k: (0,1) \to \mathbb{R}$ by $g_k(x) = \sqrt{k} \, \mathbb{1}_{(0, 1/k)}(x)$. Then $\|g_k\|_{L^2}^2 = k \cdot (1/k) = 1$, so the sequence is bounded. As $k \to \infty$, the mass of $g_k$ concentrates at the origin: $g_k \to 0$ pointwise almost everywhere, but $\|g_k\|_{L^2} = 1$. No subsequence converges strongly in $L^2$.
**Escape to infinity (on unbounded domains).** On $U = \mathbb{R}$, fix $\varphi \in C^\infty_c(\mathbb{R})$ with $\|\varphi\|_{L^2} = 1$ and define $h_k(x) = \varphi(x - k)$. Then $\|h_k\|_{L^2} = 1$ and $h_k \to 0$ pointwise for every fixed $x$. The "bump" slides to the right and escapes any compact set. Again, $\|h_k\|_{L^2} = 1$, so no subsequence converges strongly to $0$.
[/example]
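The norms in the oscillation and concentration examples can be approximated by quadrature. The following is a rough numerical sketch, assuming a midpoint rule with a fixed number of steps; the function names are illustrative.

```python
import math

def l2_norm(func, a, b, steps=20000):
    """Midpoint-rule approximation of the L^2 norm of func on (a, b)."""
    width = (b - a) / steps
    total = sum(func(a + (i + 0.5) * width) ** 2 for i in range(steps))
    return math.sqrt(total * width)

def oscillation(k):
    """f_k(x) = sin(2 pi k x)."""
    return lambda x: math.sin(2 * math.pi * k * x)

def concentration(k):
    """g_k(x) = sqrt(k) on (0, 1/k), zero elsewhere."""
    return lambda x: math.sqrt(k) if 0 < x < 1 / k else 0.0

norms = {
    k: (l2_norm(oscillation(k), 0.0, 1.0), l2_norm(concentration(k), 0.0, 1.0))
    for k in (1, 4, 16)
}
# Distance between two different oscillating terms.
gap = l2_norm(lambda x: oscillation(1)(x) - oscillation(4)(x), 0.0, 1.0)
print(norms, gap)
```

The norms stay near $1/\sqrt{2}$ and $1$ respectively as $k$ grows, and the distance between $f_1$ and $f_4$ stays near $1$, consistent with the absence of a norm-convergent subsequence.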
These three modes of failure — oscillation, concentration, and escape — recur throughout PDE theory and the calculus of variations. Different compactness tools address different modes:
- **Compact Sobolev embeddings** (Rellich-Kondrachov) prevent oscillation by controlling derivatives, and prevent escape by requiring bounded domains.
- **Concentration compactness** (Lions) addresses the concentration mode by tracking where the mass goes.
- **Weak and weak-* compactness** (Banach-Alaoglu) provide a weaker notion of convergence in which bounded sequences *always* have convergent subsequences, at the cost of losing norm convergence.
## Compactness in Function Spaces
The failure of norm compactness for bounded subsets of infinite-dimensional spaces raises a fundamental question: under what additional conditions does a bounded family of functions become (pre)compact? The answer involves controlling not just the size of the functions but also their *regularity*, that is, how rapidly they can vary. This is the content of the Arzelà-Ascoli theorem and its descendants.
### The Arzelà-Ascoli Theorem
Consider the space $C(K)$ of [continuous functions](/page/Continuity%20(Metric%20Spaces)) on a compact metric space $(K, d)$, equipped with the supremum norm. A bounded subset of $C(K)$ — that is, a uniformly bounded family of continuous functions — need not be precompact. The oscillation example above (with $f_k(x) = \sin(2\pi k x)$ on $[0,1]$) shows this: all functions satisfy $\|f_k\|_\infty \le 1$, but no subsequence converges uniformly.
The missing ingredient is **equicontinuity**: the requirement that the oscillation of all functions in the family is controlled by a single modulus of continuity.
[definition: Equicontinuity]
Let $(K, d)$ be a metric space. A family $\mathcal{F} \subset C(K)$ of continuous functions $f: K \to \mathbb{R}$ is **equicontinuous** if for every $\varepsilon > 0$, there exists $\delta > 0$ such that for all $f \in \mathcal{F}$ and all $x, y \in K$:
\begin{align*}
d(x, y) < \delta \implies |f(x) - f(y)| < \varepsilon.
\end{align*}
[/definition]
Equicontinuity is a uniform condition in two senses: the $\delta$ is independent of both the point $x$ and the function $f$. A single continuous function on a compact set is uniformly continuous (by the Heine-Cantor theorem), but a family of functions may have moduli of continuity that deteriorate as you move through the family. Equicontinuity prevents this.
[quotetheorem:66]
The "if" direction is the more useful one: it provides a sufficient condition for extracting [uniformly convergent](/page/Uniform%20Convergence) subsequences from a family of functions. The "only if" direction is a consistency check — any precompact family must satisfy these conditions.
The Arzelà-Ascoli theorem explains why the oscillation mode of non-compactness (from the previous section) does not arise when derivatives are controlled. If $\{f_k\} \subset C^1([0,1])$ with $\|f_k\|_\infty \le M$ and $\|f_k'\|_\infty \le M$, then the Mean Value Theorem gives $|f_k(x) - f_k(y)| \le M |x - y|$ for all $k$. The family is equicontinuous (with $\delta = \varepsilon / M$), and Arzelà-Ascoli produces a uniformly convergent subsequence.
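The choice $\delta = \varepsilon / M$ can be tested on a concrete family. This is a small sketch, assuming the hypothetical family $f_k(x) = \sin(kx)/k$ on $[0, 1]$ (not taken from the text), for which $\|f_k\|_\infty \le 1$ and $\|f_k'\|_\infty \le 1$, so $M = 1$.

```python
import math

def f(k, x):
    # Hypothetical family: f_k(x) = sin(k x) / k, so |f_k'(x)| = |cos(k x)| <= 1.
    return math.sin(k * x) / k

M = 1.0
eps = 0.05
delta = eps / M   # the single delta that should work for every k

grid = [i / 200 for i in range(201)]
worst = max(
    abs(f(k, x) - f(k, y))
    for k in (1, 10, 100, 1000)
    for x in grid
    for y in grid
    if abs(x - y) < delta
)
# The largest variation over delta-close pairs stays below eps for every k tested.
print(worst < eps)
```

A grid check of this kind illustrates the uniform bound but does not replace the Mean Value Theorem argument.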
[example: Arzelà-Ascoli Fails Without Equicontinuity]
On $K = [0, 1]$, define $f_k: [0, 1] \to \mathbb{R}$ by $f_k(x) = x^k$. These functions are uniformly bounded ($\|f_k\|_\infty = 1$), but the family is *not* equicontinuous near $x = 1$. To see this, observe that
\begin{align*}
f_k(1) - f_k(1 - 1/k) = 1 - (1 - 1/k)^k \to 1 - 1/e > 0
\end{align*}
as $k \to \infty$, while the gap $|1 - (1 - 1/k)| = 1/k \to 0$. For any fixed $\delta > 0$, the variation of $f_k$ on the interval $(1 - \delta, 1)$ does not become small as $k \to \infty$.
The pointwise limit of $f_k$ is the discontinuous function $f(x) = \mathbb{1}_{\{1\}}(x)$. No subsequence of $\{f_k\}$ converges *uniformly* — the convergence to the discontinuous limit is only pointwise. The failure of equicontinuity near $x = 1$ is precisely what prevents uniform convergence.
[/example]
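Both the jump near $x = 1$ and the lack of uniform convergence can be observed numerically. The sketch below assumes a uniform grid on $[0, 1]$ to approximate the supremum norm; `sup_distance` is an illustrative helper.

```python
import math

def f(k, x):
    """f_k(x) = x^k."""
    return x ** k

def sup_distance(j, k, grid_size=10000):
    """Approximate sup-norm distance between f_j and f_k on a grid of [0, 1]."""
    return max(abs(f(j, i / grid_size) - f(k, i / grid_size))
               for i in range(grid_size + 1))

jumps = {k: f(k, 1.0) - f(k, 1.0 - 1.0 / k) for k in (10, 100, 1000)}
gaps = {k: sup_distance(k, 2 * k) for k in (10, 100, 1000)}
print(jumps, gaps, 1 - 1 / math.e)
```

Writing $u = x^k$, the difference $f_k - f_{2k} = u - u^2$ has maximum $1/4$, so $\|f_k - f_{2k}\|_\infty = 1/4$ for every $k$ and the sequence is not uniformly Cauchy.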
### Compact Embeddings: From Sobolev to Lebesgue
The Arzelà-Ascoli theorem characterises compactness in $C(K)$: control the values and the modulus of continuity. In PDE theory, the analogous question is: when is a bounded sequence in a [Sobolev space](/page/Sobolev%20Space) $W^{1,p}(U)$ precompact in $L^q(U)$? The answer is the [Rellich-Kondrachov theorem](/theorems/64), which is the infinite-dimensional workhorse of existence theory for [elliptic PDE](/page/Second-Order%20Elliptic%20Equations).
The mechanism is the same as in Arzelà-Ascoli: controlling derivatives (the Sobolev norm controls $\|\nabla u\|_{L^p}$) prevents oscillation, and requiring the domain $U$ to be bounded prevents escape to infinity. The price is that compactness holds only in a *weaker* norm: we gain compactness in $L^q$ but not in $W^{1,p}$ itself.
[quotetheorem:64]
The compactness fails at the critical exponent $q = p^*$: the embedding $W^{1,p}(U) \hookrightarrow L^{p^*}(U)$ is continuous but not compact. It also fails on unbounded domains, as the escape-to-infinity example demonstrates: translations of a fixed bump function form a bounded sequence in $W^{1,p}(\mathbb{R}^n)$ with no convergent subsequence in $L^p(\mathbb{R}^n)$.
The Rellich-Kondrachov theorem is used in virtually every existence proof for elliptic PDE via the direct method. The typical argument proceeds as follows:
1. Construct a minimizing sequence $\{u_k\} \subset W^{1,p}(U)$ for an energy functional, with $\|u_k\|_{W^{1,p}} \le C$.
2. Extract a subsequence converging weakly in $W^{1,p}(U)$ (by [reflexivity](/page/Reflexive%20Space) and the Eberlein-Šmulian theorem; see below).
3. By Rellich-Kondrachov, this subsequence converges *strongly* in $L^q(U)$.
4. Use the strong $L^q$ convergence and weak lower semicontinuity of the energy to identify the limit as a minimizer.
Step 3 is where the compact embedding is essential — without it, one cannot pass from weak convergence (which does not preserve nonlinear functionals) to strong convergence (which does).
### Weak and Weak-* Compactness
When norm compactness is unavailable, weaker topologies provide a substitute. The Banach-Alaoglu theorem guarantees that the dual unit ball is always compact in the [weak-* topology](/page/Weak*%20Topology), regardless of the dimension of the space.
[quotetheorem:212]
A crucial subtlety: the Banach-Alaoglu theorem provides compactness in the weak-* topology, which is *not* in general metrizable for nonseparable spaces. Consequently, the theorem does not directly yield sequentially convergent subsequences — it guarantees the existence of convergent *subnets*, but these may be indexed by uncountable directed sets rather than by $\mathbb{N}$. The passage from weak-* compactness to *sequential* weak compactness requires additional work.
For [reflexive](/page/Reflexive%20Space) Banach spaces (such as $L^p$ for $1 < p < \infty$, or Sobolev spaces $W^{k,p}$ for $1 < p < \infty$), the relevant tool is the **Eberlein-Šmulian theorem**, which establishes the equivalence of weak compactness and weak sequential compactness.
[quotetheorem:214]
The logical chain is: reflexivity identifies $X$ with $X^{**}$, so the weak topology on $X$ coincides with the weak-* topology on $X^{**}$; Banach-Alaoglu provides weak-* compactness of the closed unit ball in $X^{**}$, hence weak compactness of the closed unit ball in $X$; and the Eberlein-Šmulian theorem (which holds in any Banach space, not just reflexive ones) guarantees that weak compactness of a set is equivalent to weak sequential compactness. Together, these results yield the statement above.
The distinction between weak and strong convergence is critical. The sequence $f_k(x) = \sin(2\pi k x)$ in $L^2(0,1)$ converges weakly to $0$ (by the Riemann-Lebesgue lemma) but not strongly, since $\|f_k\|_{L^2}^2 = 1/2$ for every $k$. For continuous linear functionals, weak convergence suffices by definition; for continuous convex functionals, it suffices for minimisation because such functionals are weakly lower semicontinuous. For general nonlinear (nonconvex) functionals, it does not, and this is precisely where the strong convergence provided by compact embeddings (Rellich-Kondrachov) becomes indispensable.
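A quick numerical check makes the distinction concrete. The sketch below is illustrative only: the test function $g(x) = x$, the midpoint rule, and the helper names `inner_with_g` and `norm_squared` are choices made here, not part of the theory above.

```python
import math

def midpoint_integral(func, n=20000):
    """Approximate the integral of func over (0, 1) with the midpoint rule."""
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))

def f(k, x):
    """The oscillating function f_k(x) = sin(2*pi*k*x)."""
    return math.sin(2.0 * math.pi * k * x)

def inner_with_g(k):
    """L^2(0,1) pairing of f_k with the fixed test function g(x) = x."""
    return midpoint_integral(lambda x: x * f(k, x))

def norm_squared(k):
    """Squared L^2(0,1) norm of f_k."""
    return midpoint_integral(lambda x: f(k, x) ** 2)

if __name__ == "__main__":
    for k in (1, 5, 25, 125):
        print(k, inner_with_g(k), norm_squared(k))
```

The exact pairing is $\int_0^1 x \sin(2\pi k x)\,dx = -1/(2\pi k)$, which tends to $0$, while the squared norm stays at $1/2$. A single test function does not prove weak convergence, but it shows the mechanism: oscillation averages out against any fixed $g$ without the functions themselves becoming small.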
## Compact Operators
In infinite-dimensional spaces, the equation $Tx = y$ may fail to have well-behaved solutions even when $T$ is bounded and injective: the inverse $T^{-1}$ may be unbounded, and spectral decompositions of the kind familiar from finite-dimensional linear algebra may not exist. Which operators retain enough finite-dimensional structure to admit a clean spectral theory and guarantee that bounded input sequences produce convergent output subsequences?
The answer comes from the theory of *compact operators*, which map bounded sets to precompact sets. The motivation is most natural in the context of integral equations. Consider the Fredholm integral operator $T: L^2([0,1]) \to L^2([0,1])$ defined by
\begin{align*}
(Tf)(x) = \int_0^1 K(x, y) f(y) \, d\mathcal{L}^1(y),
\end{align*}
where $K: [0,1] \times [0,1] \to \mathbb{R}$ is a continuous kernel. If $\{f_k\}$ is a bounded sequence in $L^2$, then the images $\{Tf_k\}$ are equicontinuous (because $K$ is uniformly continuous on the compact set $[0,1]^2$) and uniformly bounded. By the Arzelà-Ascoli theorem, $\{Tf_k\}$ has a subsequence converging uniformly, hence in $L^2$. The operator $T$ is therefore compact.
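To see this smoothing effect numerically, one can apply a discretised version of $T$ to the oscillating functions $f_k(y) = \sin(2\pi k y)$, which are bounded in $L^2$ but have no norm-convergent subsequence. The sketch below assumes the hypothetical kernel $K(x,y) = e^{-(x-y)^2}$ and a midpoint-rule discretisation with 400 nodes; both are illustrative choices, not taken from the discussion above.

```python
import math

N = 400                      # number of midpoint nodes (illustrative choice)
H = 1.0 / N
NODES = [(i + 0.5) * H for i in range(N)]

def kernel(x, y):
    """Hypothetical continuous kernel K(x, y) = exp(-(x - y)^2)."""
    return math.exp(-(x - y) ** 2)

def apply_T(f_values):
    """Midpoint-rule approximation of (Tf)(x) at every node x."""
    return [H * sum(kernel(x, y) * fy for y, fy in zip(NODES, f_values))
            for x in NODES]

def l2_norm(values):
    """Discrete L^2(0,1) norm of a function sampled at the nodes."""
    return math.sqrt(H * sum(v * v for v in values))

def image_norm(k):
    """Norm of T f_k for f_k(y) = sin(2*pi*k*y)."""
    f_values = [math.sin(2.0 * math.pi * k * y) for y in NODES]
    return l2_norm(apply_T(f_values))

if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        print(k, image_norm(k))
```

The inputs all have norm $1/\sqrt{2}$, yet the image norms shrink as $k$ grows (roughly like $1/k$ for this smooth kernel). This is consistent with the general fact that a compact operator maps weakly convergent sequences to norm-convergent ones.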
[definition: Compact Operator]
Let $X$ and $Y$ be [Banach spaces](/page/Banach%20Space). A [bounded linear operator](/page/Linear%20Operators%20on%20Banach%20Spaces) $T: X \to Y$ is **compact** if for every bounded sequence $\{x_k\}_{k=1}^\infty \subset X$, the image sequence $\{Tx_k\}_{k=1}^\infty$ has a convergent subsequence in $Y$.
Equivalently, $T$ is compact if the image $T(\overline{B}_1(0))$ of the closed unit ball is precompact (has compact closure) in $Y$. The space of compact operators from $X$ to $Y$ is denoted $\mathcal{K}(X, Y)$.
[/definition]
Compact operators form a closed two-sided ideal in the algebra of bounded operators $\mathcal{L}(X)$: the composition of a compact operator with a bounded operator (in either order) is compact, and the norm limit of compact operators is compact. Every finite-rank operator (one whose range is finite-dimensional) is compact, and on a Hilbert space, every compact operator is the norm limit of finite-rank operators.
### Spectral Theory of Compact Operators
The spectrum of a bounded operator on an infinite-dimensional space can be extremely complicated — it can be any nonempty compact subset of $\mathbb{C}$, and it can contain continuous spectrum and residual spectrum in addition to [eigenvalues](/page/Eigenvalue%20and%20Eigenvector). For compact operators, the spectral picture simplifies dramatically.
We write $\sigma(T)$ for the spectrum of $T$ and $\sigma_p(T)$ for the **point spectrum** (the set of eigenvalues of $T$, i.e., those $\lambda \in \mathbb{C}$ for which $T - \lambda I$ is not injective).
[quotetheorem:220]
This result means that compact operators behave, spectrally, like "infinite matrices with eigenvalues decaying to zero." The spectrum is countable, the eigenspace of each nonzero eigenvalue is finite-dimensional, and the only possible accumulation point is $0$. Every nonzero point of the spectrum is an eigenvalue, so away from $0$ there is no continuous spectrum and no residual spectrum.
When the compact operator is additionally self-adjoint on a Hilbert space, the spectral decomposition becomes as complete as the diagonalisation of a symmetric matrix.
[quotetheorem:538]
This theorem is the foundation of the spectral theory of [second-order elliptic equations](/page/Second%20Order%20Elliptic%20Equations). The resolvent operator $K = L^{-1}$ of an invertible elliptic operator $L$ on a bounded domain is compact (by Rellich-Kondrachov), and if $L$ is self-adjoint, the spectral theorem applied to $K$ produces an orthonormal basis of eigenfunctions of $L$: the eigenvalues $\mu_k$ of $K$ tend to $0$, so the eigenvalues $\lambda_k = 1/\mu_k$ of $L$ satisfy $|\lambda_k| \to \infty$. This underpins the Fourier method for solving evolution equations, the separation of variables in mathematical physics, and the [Fredholm alternative](/page/The%20Fredholm%20Alternative) for elliptic boundary value problems.
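A one-dimensional sketch of this mechanism, under assumptions chosen here for illustration: take $L = -d^2/dx^2$ on $(0,1)$ with $u(0) = 0$ and $u'(1) = 0$, whose inverse is the integral operator with kernel $K(x,y) = \min(x,y)$. Its eigenvalues are $\mu_k = 1/((k - \tfrac12)^2 \pi^2)$, so the eigenvalues of $L$ are $(k - \tfrac12)^2 \pi^2 \to \infty$. The code discretises $K$ by the midpoint rule and estimates $\mu_1$ and $\mu_2$ by power iteration with deflation; the grid size, iteration count, and function names are hypothetical.

```python
import math

def build_matrix(n):
    """Midpoint-rule discretisation of (Kf)(x) = integral of min(x, y) f(y) dy."""
    h = 1.0 / n
    nodes = [(i + 0.5) * h for i in range(n)]
    return [[h * min(x, y) for y in nodes] for x in nodes]

def mat_vec(matrix, vec):
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalise(vec):
    length = math.sqrt(dot(vec, vec))
    return [v / length for v in vec]

def power_iteration(matrix, orthogonal_to=(), iterations=60):
    """Largest eigenpair of a symmetric matrix, restricted to the
    orthogonal complement of the vectors in `orthogonal_to`."""
    n = len(matrix)
    vec = normalise([1.0 + i for i in range(n)])   # generic start vector
    for _ in range(iterations):
        for u in orthogonal_to:                    # deflation step
            c = dot(vec, u)
            vec = [v - c * ui for v, ui in zip(vec, u)]
        vec = normalise(mat_vec(matrix, vec))
    # Rayleigh quotient of the (unit) iterate approximates the eigenvalue.
    return dot(vec, mat_vec(matrix, vec)), vec

def top_two_eigenvalues(n=120):
    matrix = build_matrix(n)
    mu1, v1 = power_iteration(matrix)
    mu2, _ = power_iteration(matrix, orthogonal_to=[v1])
    return mu1, mu2

if __name__ == "__main__":
    mu1, mu2 = top_two_eigenvalues()
    print(mu1, 4.0 / math.pi ** 2)          # compare with 1 / ((1/2)^2 pi^2)
    print(mu2, 4.0 / (9.0 * math.pi ** 2))  # compare with 1 / ((3/2)^2 pi^2)
```

The computed values should sit close to $4/\pi^2$ and $4/(9\pi^2)$, and their reciprocals approximate the first two eigenvalues of $L$. Power iteration is used only because it needs no external library; any symmetric eigenvalue solver would serve equally well.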
## Local Compactness
Not every space of interest is compact, but many important spaces — $\mathbb{R}^n$, locally compact groups, manifolds — enjoy a weaker property that is still powerful enough to support a rich theory.
[definition: Locally Compact Space]
A topological space $X$ is **locally compact** if every point $x \in X$ has a compact neighbourhood: there exist an open set $U$ and a compact set $K$ with $x \in U \subset K$.
[/definition]
Every compact space is locally compact (take $K = X$). The space $\mathbb{R}^n$ is locally compact but not compact: every point $x$ is contained in the open ball $B(x, 1)$, which sits inside the compact set $\overline{B}(x, 1)$. By contrast, infinite-dimensional Banach spaces are *never* locally compact: by Riesz's lemma the closed unit ball is not compact, hence neither is any closed ball, and a compact neighbourhood of a point would contain such a ball as a closed subset. No point therefore has a compact neighbourhood in the norm topology.
Local compactness is the natural setting for several foundational results: the Riesz representation theorem (which identifies positive linear functionals on $C_c(X)$ with Radon measures), the existence of Haar measure on locally compact groups, and the one-point compactification $X^+ = X \cup \{\infty\}$ (which is compact Hausdorff if and only if $X$ is locally compact Hausdorff). When the full force of compactness is not available, local compactness often provides just enough control to carry the argument.
## Standard Arguments Using Compactness
Compactness enters mathematical arguments through a small number of recurring patterns. Recognising these patterns is essential for both reading and writing proofs in analysis and topology. This section describes the most common techniques and illustrates each with a representative application.
### The Covering Argument
The most direct use of compactness: cover the space with open sets having a desired property, extract a finite subcover, and use the finiteness to obtain a uniform bound or a global conclusion from local hypotheses.
[example: The Lebesgue Number Lemma via a Covering Argument]
Let $(M, d)$ be a compact metric space and let $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ be an open cover of $M$. We construct the Lebesgue number $\delta > 0$ explicitly.
For each $x \in M$, there exists $U_{\alpha(x)} \in \mathcal{U}$ containing $x$. Since $U_{\alpha(x)}$ is open, there exists $r_x > 0$ with $B(x, r_x) \subset U_{\alpha(x)}$. The collection $\{B(x, r_x/2)\}_{x \in M}$ is an open cover of $M$. By compactness, extract a finite subcover $\{B(x_i, r_{x_i}/2)\}_{i=1}^N$.
Set $\delta = \min_{1 \le i \le N} r_{x_i}/2 > 0$ (this is a minimum of finitely many positive numbers, hence positive; the finiteness from compactness is essential here). Now let $S \subset M$ be nonempty with $\operatorname{diam}(S) < \delta$. Pick any $s \in S$. Then $s \in B(x_i, r_{x_i}/2)$ for some $i$. For every $t \in S$, we have
\begin{align*}
d(t, x_i) \le d(t, s) + d(s, x_i) < \delta + r_{x_i}/2 \le r_{x_i}/2 + r_{x_i}/2 = r_{x_i},
\end{align*}
so $S \subset B(x_i, r_{x_i}) \subset U_{\alpha(x_i)}$. The set $S$ is contained in a single member of $\mathcal{U}$.
The argument fails without compactness because the infimum of infinitely many positive numbers can be zero.
[/example]
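For a concrete cover the Lebesgue number can be estimated numerically. The sketch below does not reproduce the finite-subcover construction of the example; it uses the closely related distance-to-complement formula $\delta = \inf_x \max_\alpha d(x, M \setminus U_\alpha)$, evaluated on a grid. The cover of $[0,1]$ by three open intervals, the grid size, and the function names are hypothetical choices.

```python
def lebesgue_number(cover, grid_size=1000):
    """Estimate a Lebesgue number for a cover of [0, 1] by open intervals.

    For each grid point x, r(x) is the largest radius such that the ball
    B(x, r(x)) lies inside a single interval of the cover; the estimate is
    the minimum of r over the grid.
    """
    best = float("inf")
    for i in range(grid_size + 1):
        x = i / grid_size
        # Distance from x to the complement of (a, b) is min(x - a, b - x).
        r = max(min(x - a, b - x) for a, b in cover)
        if r <= 0:
            raise ValueError(f"{x} is not covered")
        best = min(best, r)
    return best

def in_single_member(points, cover):
    """True if some interval of the cover contains all the given points."""
    return any(all(a < p < b for p in points) for a, b in cover)

COVER = [(-0.1, 0.4), (0.3, 0.7), (0.6, 1.1)]   # hypothetical example cover
DELTA = lebesgue_number(COVER)                   # approximately 0.05

if __name__ == "__main__":
    print(DELTA)
    print(in_single_member([0.33, 0.37], COVER))  # diameter 0.04 < DELTA
```

Because $r$ is $1$-Lipschitz, the grid minimum overestimates the true infimum by at most half the grid spacing. The value obtained this way is a valid Lebesgue number but not the largest one: for this cover, any set of diameter less than the overlap length $0.1$ already fits in a single interval.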
### The Subsequence-Extraction Argument
In metric spaces, compactness is most often invoked through sequential compactness: given a sequence in a compact set (or a bounded sequence in a space with a suitable compact embedding), extract a convergent subsequence and identify its limit. This is the pattern behind the direct method in the calculus of variations and most existence proofs in PDE theory.
The structure is:
1. **Construct a bounded sequence** $\{x_k\} \subset K$ satisfying an approximate property (e.g., $E(x_k) \to \inf E$).
2. **Extract a convergent subsequence** $x_{k_j} \to x$ (by compactness of $K$, or by a compact embedding).
3. **Pass the approximate property to the limit** to show $x$ has the desired exact property (e.g., $E(x) = \inf E$, using continuity or lower semicontinuity of $E$).
The key subtlety is step 3: the approximate property must be preserved under the type of convergence available. If $x_{k_j} \to x$ in norm and $E$ is continuous, this is automatic. If $x_{k_j} \rightharpoonup x$ weakly, one needs $E$ to be (weakly) lower semicontinuous.
### Diagonal Extraction
When one needs to extract convergent subsequences with respect to infinitely many criteria simultaneously, the **diagonal argument** (or Cantor diagonal extraction) applies. The key observation is that a subsequence of a convergent sequence still converges to the same limit.
[example: Diagonal Extraction for Pointwise Convergence]
Let $\{f_k\}_{k=1}^\infty$ be a uniformly bounded sequence of continuous functions on $[0,1]$, and let $\{q_j\}_{j=1}^\infty$ be an enumeration of the rationals in $[0,1]$. We extract a subsequence converging at every rational point.
**Step 1.** Since $\{f_k(q_1)\}_{k=1}^\infty$ is a bounded sequence in $\mathbb{R}$, the Bolzano-Weierstrass theorem produces a subsequence $\{f_{k,1}\}$ (the subsequence extracted at stage $1$) with $f_{k,1}(q_1) \to L_1$ for some $L_1 \in \mathbb{R}$.
**Step 2.** The sequence $\{f_{k,1}(q_2)\}$ is still bounded, so extract a further subsequence $\{f_{k,2}\}$ of $\{f_{k,1}\}$ with $f_{k,2}(q_2) \to L_2$. Since $\{f_{k,2}\}$ is a subsequence of $\{f_{k,1}\}$, we still have $f_{k,2}(q_1) \to L_1$.
**Step $j$.** At stage $j$, extract $\{f_{k,j}\}$ as a subsequence of $\{f_{k,j-1}\}$ such that $f_{k,j}(q_j) \to L_j$. By construction, $\{f_{k,j}\}$ converges at $q_1, \ldots, q_j$.
**Diagonal.** Define the diagonal subsequence $g_k = f_{k,k}$. For each fixed $j$, the sequence $\{g_k\}_{k \ge j}$ is a subsequence of $\{f_{k,j}\}$, hence $g_k(q_j) \to L_j$ as $k \to \infty$. The single subsequence $\{g_k\}$ converges at every rational point simultaneously.
If the original sequence is additionally equicontinuous, the Arzelà-Ascoli theorem guarantees that $\{g_k\}$ in fact converges uniformly on $[0,1]$ (since pointwise convergence on a dense set plus equicontinuity implies uniform convergence). This diagonal-plus-equicontinuity argument is the standard proof of the Arzelà-Ascoli theorem.
[/example]
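The nesting of subsequences can be mimicked on a computer, with the caveat that a genuine diagonal argument needs infinitely many stages and infinitely many terms. The sketch below is a finite analogue only: it takes $f_k(x) = \sin(kx)$ for $k \le 20000$, three sample points, and replaces Bolzano-Weierstrass by a fixed number of bisections of $[-1,1]$. All of these choices, and the function names, are assumptions made for illustration.

```python
import math

def refine(indices, value_at, depth):
    """Finite analogue of Bolzano-Weierstrass: bisect [-1, 1] `depth` times,
    each time keeping the half that contains more of the values."""
    lo, hi = -1.0, 1.0
    for _ in range(depth):
        mid = 0.5 * (lo + hi)
        left = [k for k in indices if value_at(k) <= mid]
        right = [k for k in indices if value_at(k) > mid]
        if len(left) >= len(right):
            indices, hi = left, mid
        else:
            indices, lo = right, mid
    return indices

def nested_subsequences(points, num_terms, depth):
    """Stage j refines the stage j-1 index list using the values at points[j]."""
    stages = []
    indices = list(range(1, num_terms + 1))
    for q in points:
        indices = refine(indices, lambda k, q=q: math.sin(k * q), depth)
        stages.append(indices)
    return stages

def diagonal(stages):
    """Pick the j-th surviving index of stage j (a finite 'diagonal')."""
    return [stage[j] for j, stage in enumerate(stages)]

if __name__ == "__main__":
    points = [0.25, 0.5, 1.0]
    stages = nested_subsequences(points, num_terms=20000, depth=3)
    print([len(stage) for stage in stages])
    print(diagonal(stages))
```

Each stage is a sublist of the previous one, so the indices surviving the last stage have values within $2/2^{\text{depth}}$ of each other at every sample point simultaneously. This is the finite shadow of the statement that the diagonal subsequence converges at every $q_j$.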
### Compactness-and-Contradiction
A powerful indirect technique: assume a statement fails, construct a sequence of counterexamples, extract a convergent subsequence by compactness, and derive a contradiction from the properties of the limit.
[example: Uniform Continuity via Compactness-and-Contradiction]
We give the standard proof that a continuous function $f: K \to \mathbb{R}$ on a compact metric space $(K, d)$ is uniformly continuous, using the compactness-and-contradiction pattern.
Suppose $f$ is *not* uniformly continuous. Then there exist $\varepsilon_0 > 0$ and sequences $\{x_k\}$, $\{y_k\}$ in $K$ such that $d(x_k, y_k) < 1/k$ but $|f(x_k) - f(y_k)| \ge \varepsilon_0$ for all $k$.
Since $K$ is compact, $\{x_k\}$ has a convergent subsequence $x_{k_j} \to x_0 \in K$. Since $d(x_{k_j}, y_{k_j}) < 1/k_j \to 0$, we also have $y_{k_j} \to x_0$.
By continuity of $f$ at $x_0$: $f(x_{k_j}) \to f(x_0)$ and $f(y_{k_j}) \to f(x_0)$. Therefore $|f(x_{k_j}) - f(y_{k_j})| \to 0$, contradicting $|f(x_{k_j}) - f(y_{k_j})| \ge \varepsilon_0 > 0$.
The pattern is: failure of the statement produces a sequence; compactness produces a limit; continuity at the limit produces a contradiction. The result just proved is the Heine-Cantor theorem, and the same structure appears in sequential proofs of the Lebesgue number lemma and in many results in differential geometry and algebraic topology.
[/example]
### Epsilon-Net Arguments
Total boundedness provides a different angle on compactness: rather than extracting convergent subsequences, one *covers* the space by finitely many balls and reduces an infinite problem to a finite one.
[example: Compactness of the Image of a Compact Operator]
Let $T: X \to Y$ be a compact operator between Banach spaces. We verify directly that $T(\overline{B}_1(0))$ is totally bounded (hence precompact).
Fix $\varepsilon > 0$. By the definition of a compact operator, the closure $C = \overline{T(\overline{B}_1(0))}$ is compact, and every compact metric space is totally bounded. Hence there exist $y_1, \ldots, y_N \in C$ such that $C \subset \bigcup_{i=1}^N B(y_i, \varepsilon)$. In particular, for every $x \in \overline{B}_1(0)$, the point $Tx$ lies within distance $\varepsilon$ of some $y_i$. (Completeness of $Y$ is what makes the converse work: a totally bounded subset of a complete metric space has compact closure.)
This finite $\varepsilon$-net controls the "complexity" of the image: although $\overline{B}_1(0)$ is infinite-dimensional, its image under $T$ lies within $\varepsilon$ of the finite set $\{y_1, \ldots, y_N\}$, and hence of the finite-dimensional subspace $\operatorname{span}\{y_1, \ldots, y_N\}$, for any desired accuracy $\varepsilon$. This is the sense in which compact operators are "almost finite-rank."
[/example]
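On a finite sample, an $\varepsilon$-net can be built greedily. The sketch below assumes a hypothetical compact operator, the diagonal operator $T e_n = e_n / n$ on $\ell^2$ truncated to 20 coordinates, and random samples from the image of the unit ball; the net it returns covers the sample, not the whole image, and the function names are illustrative. It relies on `math.dist`, available in Python 3.8 and later.

```python
import math
import random

def greedy_epsilon_net(points, eps):
    """Return centres such that every point lies within eps of some centre.

    Greedy rule: scan the points and promote a point to a centre whenever
    it is at distance >= eps from all centres chosen so far.
    """
    centres = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centres):
            centres.append(p)
    return centres

def sample_image(num_samples, dim, seed=0):
    """Sample points T x with ||x|| <= 1 for the hypothetical diagonal
    operator T e_n = e_n / n, truncated to the first `dim` coordinates."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        scale = rng.random() / math.sqrt(sum(v * v for v in x))
        # Coordinate with 0-based index n is divided by n + 1.
        samples.append(tuple(scale * v / (n + 1) for n, v in enumerate(x)))
    return samples

if __name__ == "__main__":
    pts = sample_image(500, 20)
    net = greedy_epsilon_net(pts, 0.3)
    print(len(pts), "sample points covered by", len(net), "centres")
```

The greedy rule guarantees two things at once: every sample point is within $\varepsilon$ of a centre, and the centres are pairwise at least $\varepsilon$ apart. The second property is why the procedure must terminate with finitely many centres on any totally bounded set.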
## References
1. Munkres, J. R., *[Topology](/page/Topology)* (2000).
2. Rudin, W., *Principles of Mathematical Analysis* (1976).
3. Brezis, H., *Functional Analysis, [Sobolev Spaces](/page/Sobolev%20Space) and Partial Differential Equations* (2011).
4. Evans, L. C., *Partial Differential Equations* (2010).
5. Conway, J. B., *A Course in Functional Analysis* (1990).
6. Kelley, J. L., *General Topology* (1955).
7. Dunford, N. and Schwartz, J. T., *Linear Operators, Part I* (1958).