Throughout mathematics, constructions that feel routine in the finite case become deeply problematic in the infinite case. Given a finite collection of nonempty sets, one can simply pick an element from each --- no principle beyond the logical rules of set theory is needed. But given an *infinite* family of nonempty sets, the act of simultaneously choosing one element from each is no longer guaranteed by the other axioms of Zermelo-Fraenkel set theory (ZF). The **Axiom of Choice** asserts that such simultaneous selections are always possible: every family of nonempty sets admits a *choice function*.
This axiom is so natural that mathematicians used it implicitly for decades before Zermelo isolated it in 1904. Yet it is also so powerful that it produces consequences many find deeply counterintuitive: sets of real numbers that are not Lebesgue measurable, decompositions of a ball into finitely many pieces that can be reassembled into two balls, each the same size as the original, and well-orderings of $\mathbb{R}$ that no one can explicitly describe. Both the relative consistency of the Axiom of Choice and its independence from the remaining axioms have been established: Gödel showed in 1938 that it cannot be disproved from ZF, and Cohen showed in 1963 that it cannot be proved from ZF either (in both cases assuming that ZF is consistent). It is in this sense optional, and yet large parts of modern mathematics depend on it.
[example: A Choice That Cannot Be Made Explicitly]
Consider the quotient group $\mathbb{R}/\mathbb{Q}$, where two real numbers are identified if their difference is rational. Each equivalence class $[x] = x + \mathbb{Q} = \{x + q : q \in \mathbb{Q}\}$ is a countable dense subset of $\mathbb{R}$. The quotient $\mathbb{R}/\mathbb{Q}$ is an uncountable collection of nonempty sets, and these sets partition $\mathbb{R}$.
A **Vitali set** is a subset $V \subset \mathbb{R}$ that contains exactly one representative from each equivalence class $[x]$. To construct $V$, one must choose, for each of the uncountably many classes $[x] \in \mathbb{R}/\mathbb{Q}$, a single element $v_{[x]} \in [x]$. No explicit rule for making this selection is known --- the equivalence classes are all "alike" in the sense that no measurable or topological property distinguishes any particular element of $[x]$ from any other.
The Axiom of Choice guarantees that $V$ exists. But as we shall see, any such $V$ is necessarily non-measurable with respect to Lebesgue measure. Without the Axiom of Choice, it is consistent with ZF that every subset of $\mathbb{R}$ is Lebesgue measurable (Solovay, 1970, assuming the consistency of an inaccessible cardinal). The Vitali set is thus a pure creature of Choice: it exists only because the axiom permits a simultaneous selection that no constructive procedure can replicate.
[/example]
## Definition
The fundamental problem the Axiom of Choice addresses is the gap between *existence of elements* and *existence of a selection*. In ZF, one can prove that each set in a family is nonempty --- meaning each contains at least one element --- without having any mechanism to simultaneously designate one element from each set. For a single nonempty set $A$, the statement $\exists\, x \in A$ is a theorem of logic. For two nonempty sets $A$ and $B$, one obtains an element of $A \times B$ by two applications of existential instantiation. For finitely many sets, induction suffices. But for an arbitrary (possibly uncountable) family, no finite sequence of logical steps produces the required selection, and the other axioms of ZF do not fill this gap.
The Axiom of Choice closes this gap by fiat.
[definition: Axiom of Choice]
Let $\{A_i\}_{i \in I}$ be a family of nonempty sets indexed by a set $I$. The **Axiom of Choice** (AC) asserts the existence of a **choice function**: a function
\begin{align*}
f: I &\to \bigcup_{i \in I} A_i
\end{align*}
satisfying $f(i) \in A_i$ for every $i \in I$.
Equivalently, AC states that the Cartesian product $\prod_{i \in I} A_i$ is nonempty whenever each $A_i$ is nonempty.
[/definition]
The equivalence of the two formulations is immediate: a choice function $f: I \to \bigcup_{i \in I} A_i$ with $f(i) \in A_i$ is the same data as an element $(f(i))_{i \in I}$ of the product $\prod_{i \in I} A_i$. The product formulation makes the connection to topology transparent --- it is the starting point for Tychonoff's theorem, which asserts that an arbitrary product of [compact spaces](/page/Compact%20Space) is compact and is, remarkably, *equivalent* to the Axiom of Choice in ZF.
[remark: When Choice Is Not Needed]
The Axiom of Choice is not invoked in every argument involving selection from sets. The following situations do *not* require AC:
1. **Finite families.** If $I$ is finite, a choice function can be constructed by finitely many applications of existential instantiation within ZF.
2. **Families with a definable selection rule.** If each $A_i$ has a *canonical* or *distinguished* element that can be specified by a formula, no appeal to Choice is necessary. For instance, one can always select the least element from a family of nonempty subsets of $\mathbb{N}$ (using the well-ordering of $\mathbb{N}$, which is a theorem of ZF).
3. **A single arbitrary choice.** Given one nonempty set $A$, the statement "let $a \in A$" is an application of existential instantiation, not of the Axiom of Choice.
The Axiom of Choice is needed precisely when one must make *infinitely many simultaneous selections* with *no definable rule* for choosing. The paradigmatic case is the Vitali set construction: each equivalence class $[x] \in \mathbb{R}/\mathbb{Q}$ is nonempty, but ZF does not prove of any formula in the language of set theory that it picks out a canonical representative from each class.
[/remark]
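As a small worked illustration of item 2 of the remark (the two families below are illustrative and are not taken from the text above), a choice function can be written down explicitly whenever each set carries a distinguished element. For a family $\{A_i\}_{i \in I}$ of nonempty subsets of $\mathbb{N}$ and a family $\{(a_i, b_i)\}_{i \in I}$ of nonempty bounded open intervals in $\mathbb{R}$, one may set
\begin{align*}
f(i) &:= \min A_i, \\
g(i) &:= \tfrac{1}{2}(a_i + b_i).
\end{align*}
Each is given by a single formula, uniformly in $i$, so both functions exist in ZF however large the index set $I$ is. By contrast, a class $x + \mathbb{Q}$ offers no evident distinguished element: it is dense in $\mathbb{R}$, has no least element, and is not an interval with a midpoint.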
## Equivalent Formulations
The power of the Axiom of Choice is illuminated by its equivalence with several statements that, on their face, appear to have nothing to do with selecting elements from sets. The three most important equivalents --- Zorn's Lemma, the Well-Ordering Theorem, and Tychonoff's theorem --- each provide a different operational handle on the same underlying principle.
### Zorn's Lemma
In algebra and functional analysis, one frequently needs to extend partial structures to maximal ones: a linearly independent set to a basis, a proper ideal to a maximal ideal, a partial order to a total order. The difficulty is that the extension process may require transfinitely many steps, and at each step one must choose which element to add. Zorn's Lemma packages this transfinite construction into a clean sufficient condition for the existence of maximal elements.
[quotetheorem:1226]
Zorn's Lemma is equivalent to the Axiom of Choice in ZF: each implies the other using only the remaining ZF axioms. The proof that AC implies Zorn's Lemma proceeds by transfinite recursion. One constructs an increasing chain in $P$ by repeatedly choosing, at each successor step, an element strictly above the current one (using a choice function on the set of strict upper bounds). At limit steps, one takes an upper bound of the chain constructed so far (which exists by hypothesis). If the process never terminates, the chain, indexed by all ordinals, would inject the proper class of ordinals into the set $P$, violating the axiom of replacement. Therefore the process must terminate, and the terminal element is maximal.
The converse, that Zorn's Lemma implies AC, is more direct. Given a family $\{A_i\}_{i \in I}$ of nonempty sets, one considers the partially ordered set of partial choice functions: functions $f: J \to \bigcup_{i \in I} A_i$ defined on some subset $J \subset I$ with $f(i) \in A_i$ for all $i \in J$, ordered by extension. This set is nonempty (it contains the empty function), and every chain of partial choice functions has an upper bound (their union), so Zorn's Lemma produces a maximal partial choice function. If its domain $J$ were a proper subset of $I$, one could extend it by picking some $j \in I \setminus J$ and a single element of $A_j$, contradicting maximality. Therefore the maximal element is a total choice function.
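The partially ordered set used in this argument can be written out explicitly. The symbols $\mathcal{P}$ and $\preceq$ are notation introduced here for illustration:
\begin{align*}
\mathcal{P} &:= \Big\{(J, f) : J \subset I,\ f: J \to \bigcup_{i \in I} A_i,\ f(i) \in A_i \text{ for all } i \in J\Big\}, \\
(J, f) \preceq (J', f') &\iff J \subset J' \text{ and } f'|_{J} = f.
\end{align*}
If $\{(J_\alpha, f_\alpha)\}_\alpha$ is a chain in $\mathcal{P}$, then $\big(\bigcup_\alpha J_\alpha, \bigcup_\alpha f_\alpha\big)$ is again a partial choice function, because any two members of the chain agree on their common domain. If $(J, f)$ is maximal and $j \in I \setminus J$, then for any $a \in A_j$,
\begin{align*}
(J, f) \prec \big(J \cup \{j\},\ f \cup \{(j, a)\}\big) \in \mathcal{P},
\end{align*}
which is the contradiction with maximality. Only one element $a$ is selected here, so this last step is an instance of existential instantiation and not a further use of Choice.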
Zorn's Lemma is the form of Choice most frequently invoked in algebra. It is used to prove that every vector space has a [basis](/page/Basis), every ring has a maximal ideal, every field has an algebraic closure, and every filter can be extended to an ultrafilter. In each case, the argument follows the same template: define the "partial objects" one seeks to extend, verify the chain condition, and apply Zorn.
### The Well-Ordering Theorem
The Well-Ordering Theorem addresses a different structural question: can every set be equipped with a well-ordering?
[definition: Well-Ordering]
A **well-ordering** on a set $S$ is a total order $\leq$ on $S$ such that every nonempty subset of $S$ has a least element.
[/definition]
The natural numbers $\mathbb{N}$ are well-ordered by the usual $\leq$, and it is this well-ordering that makes induction possible. The Well-Ordering Theorem asserts that *every* set, no matter how large or structurally featureless, admits such an ordering.
[quotetheorem:1227]
The Well-Ordering Theorem is equivalent to the Axiom of Choice (Zermelo, 1904). This equivalence was, in fact, the original motivation for isolating AC as a separate axiom: Zermelo formulated the Axiom of Choice explicitly in order to prove the Well-Ordering Theorem.
The theorem is startling when applied to $\mathbb{R}$. The usual ordering of $\mathbb{R}$ is *not* a well-ordering: the open interval $(0, 1)$ has no least element. The Well-Ordering Theorem asserts that there exists some other total ordering of $\mathbb{R}$ under which every nonempty subset has a least element. ZFC does not prove of any particular formula that it defines such a well-ordering; the existence of one is obtained only as a consequence of the Axiom of Choice. Moreover, the shortest possible order type of a well-ordering of $\mathbb{R}$ is the cardinal $2^{\aleph_0}$ of the continuum, and the assertion that this ordinal equals $\omega_1$ (the first uncountable ordinal) is precisely the Continuum Hypothesis, another statement independent of ZFC.
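The easy half of the equivalence, that the Well-Ordering Theorem implies AC, takes one line. As a sketch (the symbols $U$ and $\preceq$ are notation introduced here): given a family $\{A_i\}_{i \in I}$ of nonempty sets, let $\preceq$ be a well-ordering of $U := \bigcup_{i \in I} A_i$ and define
\begin{align*}
f(i) := \min\nolimits_{\preceq} A_i \qquad (i \in I).
\end{align*}
Each $A_i$ is a nonempty subset of $U$, so the minimum exists and $f(i) \in A_i$. The well-ordering $\preceq$ is fixed once, by a single existential instantiation; after that, $f$ is given by a definable rule, exactly as in the case of subsets of $\mathbb{N}$.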
### Tychonoff's Theorem
The equivalence between the Axiom of Choice and [Tychonoff's theorem](/page/Tychonoff's%20Theorem) is perhaps the most surprising of the three, because Tychonoff's theorem is a statement about topological spaces rather than about sets or orderings. The theorem asserts that an arbitrary product of [compact spaces](/page/Compact%20Space) is compact in the product topology. That this topological statement is logically equivalent to an axiom about selecting elements from sets reveals a deep structural connection between compactness and choice.
[quotetheorem:953]
The forward direction (each $X_\alpha$ compact implies the product is compact) uses the Axiom of Choice in every known proof --- either through Zorn's Lemma (in the Alexander subbasis proof) or through the existence of ultrafilter extensions (in the ultrafilter proof). The reverse direction is elementary: each projection $\pi_\alpha$ is continuous and surjective, so $X_\alpha = \pi_\alpha(\prod_{\beta} X_\beta)$ is the continuous image of a compact space, hence compact.
The deep content is the converse equivalence: Tychonoff's theorem implies the Axiom of Choice. This was proved by Kelley (1950) using a clever topological construction, which in one standard version runs as follows. Given a family $\{A_i\}_{i \in I}$ of nonempty sets, form for each $i$ the space $X_i = A_i \cup \{p_i\}$ (where $p_i \notin A_i$ is a new point) with the topology whose open sets are $\varnothing$, $\{p_i\}$, and $X_i$ itself. Each $X_i$ is compact, because its topology has only finitely many open sets, and $A_i$ is closed in $X_i$, because its complement $\{p_i\}$ is open. By Tychonoff, $\prod_{i \in I} X_i$ is compact. The sets $C_i := \pi_i^{-1}(A_i)$ are closed, and they have the finite intersection property: for finitely many indices $i_1, \ldots, i_n$, choose $a_{i_k} \in A_{i_k}$ (finitely many choices, so no appeal to AC) and take the point whose $i_k$-th coordinate is $a_{i_k}$ and whose $j$-th coordinate is $p_j$ for every other $j$. Compactness forces $\bigcap_{i \in I} C_i$ to be nonempty, and this intersection is exactly $\prod_{i \in I} A_i$, so any of its elements is a choice function.
This equivalence means that any proof of Tychonoff's theorem, no matter how it is formulated, must use the full strength of the Axiom of Choice --- the theorem cannot be proved in ZF alone.
## Weaker Forms of Choice
Not all applications of the Axiom of Choice require its full strength. Many results in analysis and measure theory need only the ability to make *countably many* choices, and some algebraic results require a principle intermediate between countable choice and full AC. Understanding these weaker forms clarifies exactly how much "choice" is needed for a given theorem and reveals a rich hierarchy of set-theoretic principles.
### Countable Choice and Dependent Choice
The most commonly encountered weak form of Choice in analysis is the Axiom of Countable Choice (AC$_\omega$), which restricts the index set to be countable. Similar in appearance but strictly stronger is the Axiom of Dependent Choice (DC), which allows the construction of sequences by making choices that depend on previous selections.
[definition: Axiom of Countable Choice]
The **Axiom of Countable Choice** (AC$_\omega$) states: if $\{A_k\}_{k=1}^\infty$ is a countable family of nonempty sets, then there exists a choice function $f: \mathbb{N} \to \bigcup_{k=1}^\infty A_k$ with $f(k) \in A_k$ for all $k \in \mathbb{N}$.
[/definition]
A stronger principle, which additionally permits each choice to depend on the outcome of the previous one, is used even more frequently in analysis.
[definition: Axiom of Dependent Choice]
The **Axiom of Dependent Choice** (DC) states: if $X$ is a nonempty set and $R \subset X \times X$ is a binary relation such that for every $x \in X$ there exists $y \in X$ with $(x, y) \in R$, then there exists a sequence $(x_k)_{k=1}^\infty$ in $X$ such that $(x_k, x_{k+1}) \in R$ for all $k \in \mathbb{N}$.
[/definition]
DC is strictly stronger than AC$_\omega$: it implies countable choice. To see this, given $\{A_k\}_{k=1}^\infty$, define
\begin{align*}
X &:= \{(k, a) : k \in \mathbb{N},\, a \in A_k\}, \\
(k, a) &\mathrel{R} (l, b) \quad \text{if and only if} \quad l = k + 1.
\end{align*}
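The relation $R$ satisfies the hypothesis of DC because every $A_{k+1}$ is nonempty. DC therefore produces a sequence $((k_j, a_j))_{j=1}^\infty$ with $k_{j+1} = k_j + 1$, so that $a_j \in A_{k_j}$ and the indices $k_j$ run through every integer $k \geq k_1$. Selecting elements of the finitely many remaining sets $A_1, \ldots, A_{k_1 - 1}$ by finite choice then yields a choice function on all of $\mathbb{N}$. But AC$_\omega$ does not imply DC in ZF.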
DC is the principle that makes sequential arguments in analysis possible. It is used every time one constructs a sequence inductively with "choose $x_{k+1}$ such that $d(x_{k+1}, x_k) < 1/k$" or "choose $x_{k+1} \in A$ with $\|x_{k+1} - x_k\| < \varepsilon_k$." Any argument that constructs a Cauchy sequence by iteratively selecting better approximations implicitly invokes DC.
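The phrase "choose $x_{k+1}$ depending on $x_k$" can be matched against the definition of DC as follows. This is a sketch under assumed notation: $S_k(x) \subset M$ denotes a hypothetical set of admissible successors of $x$ at step $k$, assumed nonempty for every $k$ and every $x$, and $x_1 \in M$ is a prescribed starting point. Since the admissible set changes with $k$, the step counter is built into the state:
\begin{align*}
X &:= \big\{(k, x) \in \mathbb{N} \times M : \text{there exist } y_1 = x_1, y_2, \ldots, y_k = x \text{ with } y_{j+1} \in S_j(y_j) \text{ for } 1 \leq j < k\big\}, \\
(k, x) \mathrel{R} (l, y) &\iff l = k + 1 \text{ and } y \in S_k(x).
\end{align*}
The set $X$ is nonempty since $(1, x_1) \in X$, and every $(k, x) \in X$ has an $R$-successor in $X$: any $y \in S_k(x)$ gives $(k+1, y) \in X$. DC then yields an $R$-chain in $X$. Its first term is reachable from $(1, x_1)$ by a finite chain (by the definition of $X$), and prepending one such finite chain produces a sequence $(x_k)_{k=1}^\infty$ with $x_{k+1} \in S_k(x_k)$ for all $k$. In the first example of the paragraph above, $S_k(x)$ would be the set of points $y$ with $d(y, x) < 1/k$ that also satisfy whatever further constraint the argument at hand imposes.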
[example: Dependent Choice in the Proof That Sequentially Compact Implies Compact]
Consider the proof that, in a metric space $(M, d)$, sequential compactness implies compactness. The standard argument proceeds by contradiction: suppose $\mathcal{U}$ is an open cover with no finite subcover. Then there is no Lebesgue number for $\mathcal{U}$ (otherwise, a finite $\delta/2$-net, which exists by total boundedness from sequential compactness, would produce a finite subcover). The failure of the Lebesgue number lemma means: for every $n \in \mathbb{N}$, there exists a point $x_n \in M$ such that $B(x_n, 1/n)$ is not contained in any single member of $\mathcal{U}$.
The construction of the sequence $(x_n)_{n=1}^\infty$ requires choosing, for each $n$, a point $x_n$ from the nonempty set $S_n := \{x \in M : B(x, 1/n) \not\subset U \text{ for all } U \in \mathcal{U}\}$. This is a countable sequence of choices, so AC$_\omega$ suffices. Once the sequence is constructed, sequential compactness produces a convergent subsequence, which leads to a contradiction.
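Notably, the extraction of the convergent subsequence requires no further choice principle. Sequential compactness supplies the subsequence by hypothesis, and even when a subsequence converging to a cluster point $x$ is built by hand, the index $n_{k+1}$ can be taken to be the *least* $m > n_k$ with $d(x_m, x) < 1/k$, which is a definable rule. Genuinely dependent choices appear elsewhere in the usual argument, namely in the proof of total boundedness, where each new point is chosen at distance at least $\varepsilon$ from all points chosen so far.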
[/example]
The following results from analysis require only DC (and sometimes only AC$_\omega$), not full AC:
- Every sequentially compact metric space is compact (AC$_\omega$).
- The Baire Category Theorem for complete metric spaces (DC).
- The equivalence of $\varepsilon$-$\delta$ continuity and sequential continuity for functions on metric spaces (AC$_\omega$).
- A countable union of countable sets is countable (AC$_\omega$).
By contrast, the following require full AC, or at least principles that do not follow from DC:
- Every vector space has a basis (equivalent to full AC).
- Every commutative ring with unity has a maximal ideal (equivalent to full AC).
- The Hahn-Banach theorem for nonseparable spaces (follows from the Boolean Prime Ideal Theorem, which is strictly weaker than AC; neither BPI nor the Hahn-Banach theorem follows from DC).
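To see where AC$_\omega$ enters in the countable-union entry of the first list, here is a sketch of the argument (the symbols $E_k$, $e_k$ and $F$ are notation introduced here). Let $A_1, A_2, \ldots$ be nonempty countable sets, and let $E_k$ be the set of all surjections $\mathbb{N} \to A_k$. Each $E_k$ is nonempty because $A_k$ is countable, but nothing singles out one enumeration of $A_k$ over another. Applying AC$_\omega$ to the family $\{E_k\}_{k=1}^\infty$ gives a sequence $(e_k)_{k=1}^\infty$ with $e_k \in E_k$, and then
\begin{align*}
F: \mathbb{N} \times \mathbb{N} &\to \bigcup_{k=1}^\infty A_k, \\
F(k, n) &:= e_k(n)
\end{align*}
is surjective. Composing $F$ with an explicit bijection $\mathbb{N} \to \mathbb{N} \times \mathbb{N}$, which exists in ZF, exhibits the union as a surjective image of $\mathbb{N}$, hence as a countable set. The only non-constructive step is the simultaneous selection of the enumerations $e_k$.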
### The Boolean Prime Ideal Theorem
Strictly weaker than full AC, yet not comparable in strength with DC, is a principle of particular importance in functional analysis and algebra: the Boolean Prime Ideal Theorem (BPI), also known as the Ultrafilter Lemma.
[definition: Boolean Prime Ideal Theorem]
The **Boolean Prime Ideal Theorem** (BPI) states: every proper ideal in a Boolean algebra is contained in a prime ideal.
Equivalently (the **Ultrafilter Lemma**): every proper filter on a set can be extended to an ultrafilter.
[/definition]
BPI is strictly weaker than AC (Halpern and Lévy, 1971): it does not imply that every vector space has a basis, nor does it imply the Well-Ordering Theorem. It is also independent of DC: neither principle implies the other in ZF. Its significance is that it suffices for several major results that are often attributed to the full Axiom of Choice:
- The Hahn-Banach theorem (both the dominated extension form and the separation form).
- [Tychonoff's theorem for Hausdorff spaces](/page/Tychonoff's%20Theorem) (the restriction to Hausdorff spaces weakens the choice principle needed from full AC to BPI).
- The compactness theorem for first-order logic.
- The existence of non-principal ultrafilters on $\mathbb{N}$.
- The Stone representation theorem for Boolean algebras.
The distinction between BPI and full AC is mathematically significant, but BPI is far from tame. A non-principal ultrafilter on $\mathbb{N}$ already yields a subset of $\mathbb{R}$ that is not Lebesgue measurable (Sierpiński), and the Hahn-Banach theorem alone implies both the existence of a non-measurable set (Foreman and Wehrung, 1991) and the Banach-Tarski paradox (Pawlikowski, 1991). Thus BPI represents a "moderate" amount of choice only in the sense that it does not yield well-orderings of arbitrary sets or bases of arbitrary vector spaces: it is enough for the Hahn-Banach theorem and first-order logic, and also enough to produce non-measurable sets.
## Consequences in Analysis
Many foundational results in analysis --- the existence of non-measurable sets, the Hahn-Banach extension theorem, the existence of Hamel bases --- are frequently invoked without any mention of the Axiom of Choice, as though they were inescapable features of the mathematical landscape. In fact, each depends on Choice in an essential way, and each reveals a different facet of what the axiom makes possible: the first shows that Choice produces objects that defy measure-theoretic intuition, the second demonstrates that extending local linear functionals to global ones requires a transfinite selection principle, and the third shows that the algebraic structure of infinite-dimensional spaces is, in a precise sense, an artifact of the axiom.
### Existence of Non-Measurable Sets
The construction of a non-measurable set was the first indication that the Axiom of Choice has consequences that conflict with naive geometric intuition. The Vitali construction, outlined in the opening example, produces a subset of $\mathbb{R}$ that cannot be assigned a Lebesgue measure in a translation-invariant, countably additive way.
[quotetheorem:1228]
The argument proceeds as follows. Let $V \subset [0, 1]$ be a Vitali set: a set containing exactly one representative from each equivalence class of $\mathbb{R}/\mathbb{Q}$. For each rational $q \in \mathbb{Q} \cap [-1, 1]$, define the translate $V_q := \{v + q : v \in V\}$. Since $V$ contains exactly one element from each class, the translates $\{V_q\}_{q \in \mathbb{Q} \cap [-1,1]}$ are pairwise disjoint (if $v_1 + q_1 = v_2 + q_2$ with $v_1, v_2 \in V$ and $q_1 \neq q_2$, then $v_1 - v_2 = q_2 - q_1 \in \mathbb{Q}$, so $v_1$ and $v_2$ are in the same equivalence class, forcing $v_1 = v_2$ and hence $q_1 = q_2$, a contradiction). Moreover,
\begin{align*}
[0, 1] \subset \bigcup_{q \in \mathbb{Q} \cap [-1, 1]} V_q \subset [-1, 2].
\end{align*}
The first inclusion holds because every $x \in [0,1]$ has a representative $v \in V$ with $x - v \in \mathbb{Q}$, and $|x - v| \leq 1$ since both $x$ and $v$ lie in $[0,1]$.
Now suppose $V$ were measurable with Lebesgue measure $\mathcal{L}^1(V) = \lambda$. By translation invariance, $\mathcal{L}^1(V_q) = \lambda$ for every $q$. By countable additivity (the rationals in $[-1,1]$ are countable),
\begin{align*}
\mathcal{L}^1\!\left(\bigcup_{q \in \mathbb{Q} \cap [-1,1]} V_q\right) = \sum_{q \in \mathbb{Q} \cap [-1,1]} \lambda.
\end{align*}
The inclusion $[0,1] \subset \bigcup V_q \subset [-1,2]$ forces
\begin{align*}
1 \leq \sum_{q \in \mathbb{Q} \cap [-1,1]} \lambda \leq 3.
\end{align*}
But the sum $\sum_q \lambda$ is either $0$ (if $\lambda = 0$, since $0 \cdot \aleph_0 = 0$) or $+\infty$ (if $\lambda > 0$, since $\lambda$ is summed countably many times). Neither $0$ nor $+\infty$ lies in $[1, 3]$. This contradiction shows that $V$ is not measurable.
The key observation is that *every* step of this argument except the existence of $V$ is provable in ZF. The only use of Choice is in selecting one representative from each of uncountably many equivalence classes. Without Choice, the argument has no starting point: assuming the consistency of an inaccessible cardinal, it is consistent with ZF that every subset of $\mathbb{R}$ is Lebesgue measurable (Solovay, 1970).
### The Hahn-Banach Theorem
In functional analysis, the Hahn-Banach theorem is indispensable: it guarantees the existence of bounded linear extensions and the separation of convex sets by hyperplanes. The theorem's dependence on Choice is more subtle than that of the Vitali construction: it follows already from the Boolean Prime Ideal Theorem and does not need full AC.
[quotetheorem:879]
When $X$ is separable (has a countable dense subset), the Hahn-Banach theorem can be proved without Zorn's Lemma: one extends $\varphi$ one dimension at a time along a countable sequence of vectors whose span is dense, and then extends to the closure by continuity. This is a sequential construction and needs at most DC; if the value of the extension on each new vector is fixed by a rule (for example, the largest admissible value), no choice is needed at all once the dense sequence is given.
For nonseparable spaces, the one-dimension-at-a-time argument requires transfinitely many extension steps, and the standard proof invokes Zorn's Lemma (a different argument shows that the weaker Boolean Prime Ideal Theorem already suffices). The partially ordered set is the collection of all dominated linear extensions of $\varphi$ to subspaces of $X$, ordered by extension. The chain condition is verified because the union of a chain of compatible linear functionals is again a linear functional. Zorn's Lemma produces a maximal extension, which must be defined on all of $X$ (otherwise one could extend by one more dimension).
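The parenthetical claim that one can always extend by one more dimension rests on a one-step estimate, sketched here under assumed notation: real scalars, a sublinear functional $p$ on $X$, and a linear functional $\psi$ on a proper subspace $Y \subset X$ with $\psi \leq p$ on $Y$ (if the quoted theorem is stated for norms, take $p = \|\varphi\| \, \|\cdot\|$). Fix $x_0 \in X \setminus Y$. For all $y, z \in Y$,
\begin{align*}
\psi(y) + \psi(z) = \psi(y + z) \leq p(y + z) \leq p(y - x_0) + p(z + x_0),
\end{align*}
and rearranging gives
\begin{align*}
\sup_{y \in Y} \big[\psi(y) - p(y - x_0)\big] \leq \inf_{z \in Y} \big[p(z + x_0) - \psi(z)\big].
\end{align*}
For any real $c$ between these two bounds, the functional
\begin{align*}
\tilde{\psi}(y + t x_0) := \psi(y) + t c \qquad (y \in Y,\ t \in \mathbb{R})
\end{align*}
is linear on $Y \oplus \mathbb{R} x_0$ and satisfies $\tilde{\psi} \leq p$ there; the cases $t > 0$ and $t < 0$ reduce to the right-hand and left-hand bounds after dividing by $|t|$. Taking $c$ to be, say, the left-hand supremum is a definable rule, which is why a single extension step uses no choice. The non-constructive part lies in iterating the step transfinitely.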
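The Hahn-Banach theorem cannot be proved in ZF alone: it implies the existence of a set of reals that is not Lebesgue measurable (Foreman and Wehrung, 1991), so it fails in any model of ZF in which every set of reals is measurable, such as Solovay's. However, it does *not* require full AC: BPI suffices, and there are models of ZF in which BPI holds and full AC fails.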
### Every Vector Space Has a Basis
One of the most striking consequences of the Axiom of Choice is that every vector space --- regardless of dimension --- possesses a basis (a maximal linearly independent set, also called a Hamel basis). This result is equivalent to the full Axiom of Choice.
[quotetheorem:1229]
The proof is a direct application of Zorn's Lemma. Let $X$ be a vector space over a field $\mathbb{F}$. Consider the partially ordered set $\mathcal{P}$ of all linearly independent subsets of $X$, ordered by inclusion. The union of a chain of linearly independent sets is linearly independent (any finite subset of the union is contained in some member of the chain, hence linearly independent). By Zorn's Lemma, $\mathcal{P}$ has a maximal element $\mathcal{B}$. If $\operatorname{span}(\mathcal{B}) \neq X$, then there exists $v \in X \setminus \operatorname{span}(\mathcal{B})$, and $\mathcal{B} \cup \{v\}$ is linearly independent --- contradicting the maximality of $\mathcal{B}$. Therefore $\mathcal{B}$ is a basis.
The converse --- that the assertion "every vector space has a basis" implies the Axiom of Choice --- was proved by Blass (1984). The proof is nonconstructive and proceeds by encoding the problem of making choices from a family of sets into the problem of finding a basis for a suitable vector space.
The theorem has consequences that defy geometric intuition in infinite dimensions. A Hamel basis for $\mathbb{R}$ as a vector space over $\mathbb{Q}$ is an uncountable set $\mathcal{B} \subset \mathbb{R}$ such that every real number is a unique finite linear combination of elements of $\mathcal{B}$ with rational coefficients. Such a basis cannot be explicitly described in general: there are models of ZF without Choice in which no Hamel basis for $\mathbb{R}$ over $\mathbb{Q}$ exists (for example, any model in which every set of reals is Lebesgue measurable, since a Hamel basis yields a non-measurable set), and even in models with Choice a basis is obtained only through a non-constructive argument and not from an explicit formula.
## The Banach-Tarski Paradox
Of all the consequences of the Axiom of Choice, the Banach-Tarski paradox is the most visually striking and the one most frequently cited as evidence that AC leads to pathological results. It asserts that a solid ball in $\mathbb{R}^3$ can be decomposed into finitely many pieces that, using only rigid motions (rotations and translations), can be reassembled into *two* solid balls of the same radius as the original.
The paradox depends on two ingredients: the Axiom of Choice (to construct the pieces) and the existence of free groups of rotations in $\operatorname{SO}(3)$ (to provide the geometric mechanism).
[quotetheorem:1230]
The paradox does *not* violate any law of physics, because the pieces $A_1, \ldots, A_5$ cannot all be Lebesgue measurable: at least some of them cannot be assigned a volume in any way that is additive and invariant under rigid motions. The decomposition is impossible with measurable pieces: Lebesgue measure is finitely additive and isometry-invariant, so if all pieces were measurable, the two reassembled balls together would have the same total measure as the original ball, giving $2 \cdot \mathcal{L}^3(B) = \mathcal{L}^3(B)$ and hence $\mathcal{L}^3(B) = 0$, a contradiction.
The geometric mechanism behind the paradox is the existence of a **free group** on two generators inside the rotation group $\operatorname{SO}(3)$. Let $\rho$ and $\sigma$ be rotations about two axes that generate a free subgroup $F_2 \subset \operatorname{SO}(3)$ (such pairs exist; for instance, rotations by $\arccos(1/3)$ about two orthogonal axes generate a free group, as can be verified by the ping-pong lemma applied to suitable subsets of $S^2$). The free group $F_2$ has the algebraic property that it is *paradoxical*: it can be partitioned into finitely many pieces in such a way that a proper subcollection of the pieces, after translation (in the group-theoretic sense), already covers all of $F_2$. More precisely,
\begin{align*}
F_2 = W(\rho) \cup W(\rho^{-1}) \cup W(\sigma) \cup W(\sigma^{-1}) \cup \{e\},
\end{align*}
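where $W(g)$ denotes the set of reduced words beginning with $g$, the union is disjoint, and one has $\rho^{-1} \cdot W(\rho) = F_2 \setminus W(\rho^{-1})$ (left-multiplying by $\rho^{-1}$ maps $W(\rho)$ bijectively onto $F_2 \setminus W(\rho^{-1})$). Hence the two pieces $W(\rho)$ and $W(\rho^{-1})$ alone reassemble into $F_2$ once the first is translated, and likewise for $W(\sigma)$ and $W(\sigma^{-1})$. This algebraic paradoxicality of $F_2$ is transferred to the unit sphere $S^2$, and from there to the ball, via the action of $\operatorname{SO}(3)$: after removing the countable set of points fixed by some nontrivial element of $F_2$, the Axiom of Choice is used to select one representative from each orbit of $F_2$ acting on the rest of $S^2$.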
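The transfer from the group to the sphere can be written out as follows. The symbols $D$, $M$ and $W(g)M$ are notation introduced here for illustration: $D \subset S^2$ is the countable set of points fixed by some nontrivial element of $F_2$, the set $M$ contains exactly one point from each $F_2$-orbit in $S^2 \setminus D$ (this is the step that uses AC), and $W(g)M := \{w \cdot m : w \in W(g),\ m \in M\}$. Because $F_2$ acts freely on $S^2 \setminus D$, every point there is $w \cdot m$ for a unique pair $(w, m)$, so the word decomposition of $F_2$ carries over:
\begin{align*}
S^2 \setminus D &= M \sqcup W(\rho)M \sqcup W(\rho^{-1})M \sqcup W(\sigma)M \sqcup W(\sigma^{-1})M, \\
\rho^{-1} \big(W(\rho)M\big) &= (S^2 \setminus D) \setminus W(\rho^{-1})M, \\
\sigma^{-1} \big(W(\sigma)M\big) &= (S^2 \setminus D) \setminus W(\sigma^{-1})M.
\end{align*}
Thus $W(\rho^{-1})M$ together with a rotated copy of $W(\rho)M$ fills $S^2 \setminus D$, and so does $W(\sigma^{-1})M$ together with a rotated copy of $W(\sigma)M$: four of the five pieces yield two copies of $S^2 \setminus D$. Absorbing the countable set $D$ and the leftover piece $M$, and passing from the sphere to the solid ball, require further standard arguments that are not shown in this sketch.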
The role of dimension is essential. The Banach-Tarski paradox fails in $\mathbb{R}^1$ and $\mathbb{R}^2$: in these dimensions, Lebesgue measure extends to a finitely additive, isometry-invariant measure defined on *all* subsets (Banach, 1923), which rules out a paradoxical decomposition of any set of positive finite measure. The group of isometries of $\mathbb{R}^1$ and $\mathbb{R}^2$ is *amenable* (admits a left-invariant finitely additive probability measure), whereas $\operatorname{SO}(3)$, regarded as a discrete group, is non-amenable because it contains the free group $F_2$. Amenability is the precise dividing line at the level of groups: a group admits a paradoxical decomposition of itself if and only if it is non-amenable (Tarski, 1929).
## Independence from ZF
The Axiom of Choice occupies a distinctive position among the standard axioms of set theory: both its consistency with and its independence from the remaining axioms have been established. This means that neither AC nor its negation can be proved from ZF: the axiom is genuinely "optional" in the sense that both ZF + AC and ZF + $\neg$AC are consistent theories (assuming ZF itself is consistent).
### Gödel's Constructible Universe
The consistency of AC with ZF was established by Gödel in 1938. Gödel constructed an inner model of set theory, the **constructible universe** $L$, in which every set is "built up" from below by definable operations. The key properties of $L$ are:
1. $L$ is a model of ZF: all the ZF axioms are satisfied when the quantifiers range over constructible sets.
2. $L$ satisfies the Axiom of Choice: the constructible sets are naturally well-ordered by the order in which they are constructed, and this well-ordering provides a definable choice function for any family of nonempty constructible sets.
3. $L$ satisfies the Generalized Continuum Hypothesis (GCH): for every infinite cardinal $\kappa$, $2^\kappa = \kappa^+$ in $L$.
The construction of $L$ proceeds by transfinite recursion. One defines $L_0 = \varnothing$, $L_{\alpha+1} = \mathcal{D}(L_\alpha)$ (the set of all subsets of $L_\alpha$ that are definable by a first-order formula with parameters from $L_\alpha$), and $L_\lambda = \bigcup_{\alpha < \lambda} L_\alpha$ for limit ordinals $\lambda$. The constructible universe is $L = \bigcup_{\alpha \in \mathrm{Ord}} L_\alpha$.
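The first few levels of the hierarchy can be computed by hand. In the sketch below, $V_\alpha$ denotes the usual cumulative hierarchy ($V_0 = \varnothing$, $V_{\alpha+1} = \mathcal{P}(V_\alpha)$), a notation not used in the text above:
\begin{align*}
L_0 &= \varnothing, \\
L_1 &= \mathcal{D}(\varnothing) = \{\varnothing\}, \\
L_2 &= \mathcal{D}(\{\varnothing\}) = \{\varnothing, \{\varnothing\}\}, \\
L_n &= V_n \quad \text{for every finite } n, \\
L_\omega &= \bigcup_{n < \omega} L_n = V_\omega.
\end{align*}
At finite stages $\mathcal{D}$ coincides with the full power set, because every subset $\{a_1, \ldots, a_m\}$ of a finite set is defined by the formula $x = a_1 \lor \cdots \lor x = a_m$ with parameters $a_1, \ldots, a_m$. The two hierarchies first separate at stage $\omega + 1$: there are only countably many formulas with parameters from the countable set $L_\omega$, so $L_{\omega+1}$ is countable, whereas $V_{\omega+1} = \mathcal{P}(V_\omega)$ is uncountable.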
Gödel's result shows that AC cannot lead to a contradiction with ZF: if ZF is consistent, then so is ZFC (= ZF + AC). This does not mean that AC is "true", only that it is safe to use, in the sense that any contradiction derivable from ZFC would already be derivable from ZF alone.
### Cohen's Forcing
The independence of AC from ZF, that is, the fact that AC cannot be *proved* from ZF, was established by Cohen in 1963 using the method of **forcing**. Cohen constructed a model of ZF in which the Axiom of Choice fails. One such model contains a countable family of two-element sets $\{a_k, b_k\}$ for which no choice function exists: there is no sequence $(c_k)_{k=1}^\infty$ with $c_k \in \{a_k, b_k\}$ for all $k$. The elements $a_k$ and $b_k$ here are sets of real numbers, not real numbers themselves; for pairs of real numbers, taking the smaller element would be a definable choice.
The method of forcing extends a countable transitive model $M$ of ZFC (a "ground model") to a larger model $M[G]$ by adjoining a "generic" object $G$ tailored to the statement whose independence is sought. The generic object is built by specifying a partially ordered set of "conditions" (partial approximations to $G$, finite in the simplest cases) and taking a filter that intersects every dense set belonging to the ground model. The resulting model $M[G]$ again satisfies ZFC. To violate AC, one passes to an intermediate *symmetric submodel* $N$ with $M \subset N \subset M[G]$, consisting of the sets that have sufficiently symmetric names; $N$ satisfies ZF but, for a suitable choice of the partial order and of the symmetries, fails AC.
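The simplest instance of this set-up adds a single new real to $M$. The following is a standard textbook example and is not drawn from the text above; the names $\mathbb{P}$, $D_n$ and $E_x$ are introduced here for illustration:
\begin{align*}
\mathbb{P} &:= \{p : p \text{ is a function},\ \operatorname{dom}(p) \subset \mathbb{N} \text{ is finite},\ \operatorname{ran}(p) \subset \{0, 1\}\}, \qquad q \leq p \iff q \supseteq p, \\
D_n &:= \{p \in \mathbb{P} : n \in \operatorname{dom}(p)\} \qquad (n \in \mathbb{N}), \\
E_x &:= \{p \in \mathbb{P} : p(n) \neq x(n) \text{ for some } n \in \operatorname{dom}(p)\} \qquad (x \in M,\ x: \mathbb{N} \to \{0, 1\}).
\end{align*}
Each $D_n$ and each $E_x$ is dense in $\mathbb{P}$ (any condition can be extended to include $n$ in its domain, or to disagree with $x$ at a point outside its domain) and belongs to $M$. If $G$ is a filter meeting all of them, then $g := \bigcup G$ is a function, because the members of a filter are pairwise compatible; it is defined on all of $\mathbb{N}$, because $G$ meets every $D_n$; and it differs from every $x \in M$, because $G$ meets every $E_x$. Hence $g \in M[G] \setminus M$. The extension obtained in this particular example still satisfies the Axiom of Choice; it illustrates only the genericity mechanism.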
Cohen's achievement was not merely the construction of a single model: forcing turned out to be a universal method for proving independence results. Nearly all independence results in set theory since 1963, including the independence of the Continuum Hypothesis from ZFC, the consistency of many statements relative to large cardinal axioms, and the relative consistency of many weak choice principles, have been established using forcing or its descendants.
The combined results of Godel and Cohen establish that the Axiom of Choice is *independent* of ZF:
\begin{align*}
\text{Con}(\text{ZF}) \implies \text{Con}(\text{ZFC}) \quad &\text{(G\"{o}del, 1938)} \\
\text{Con}(\text{ZF}) \implies \text{Con}(\text{ZF} + \neg\text{AC}) \quad &\text{(Cohen, 1963)}.
\end{align*}
Thus the status of AC is analogous to that of the parallel postulate in Euclidean geometry: one can do mathematics with it or without it, and neither choice leads to a contradiction.
## Constructive Mathematics and the Status of Choice
The Axiom of Choice is rejected in several foundational frameworks, most notably in constructive mathematics and in certain forms of topos theory. The reasons for rejection illuminate what the axiom actually asserts and why it is philosophically contentious.
### The Constructive Objection
In constructive mathematics (following Brouwer, Bishop, and Martin-Lof), a statement $\exists\, x \in A\, \varphi(x)$ is considered proved only when one exhibits a *specific* element $a \in A$ together with a proof of $\varphi(a)$. Under this interpretation, the Axiom of Choice becomes either automatically true or deeply problematic, depending on exactly how the quantifiers are interpreted.
For constructive type theories (such as Martin-Lof type theory), AC is provable: a proof that every $A_i$ is inhabited already supplies, for each $i \in I$, a witness $a_i \in A_i$, and the function $i \mapsto a_i$ is the choice function. The subtlety is that in constructive set theories (such as IZF or CZF), the statement "$A_i$ is inhabited" ($\exists\, x\, (x \in A_i)$) is a bare proposition that does not carry its witness as data, and a choice function must respect extensional equality: equal sets must receive equal choices. Under this interpretation, AC is not provable, and it implies the law of excluded middle, which constructivists do not accept.
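In type-theoretic notation the provable form of choice can be written out explicitly. The following is a sketch in the standard propositions-as-types reading, where $\Pi$ and $\Sigma$ are the dependent product and sum, $\pi_1$ and $\pi_2$ are the projections of a dependent pair, and $R$ is an arbitrary relation; all of these symbols, and the name $\mathrm{AC}_{\mathrm{type}}$, are introduced here for illustration:
\begin{align*}
\mathrm{AC}_{\mathrm{type}} &: \Bigl(\Pi_{i : I}\, \Sigma_{a : A_i}\, R(i, a)\Bigr) \to \Sigma_{f : \Pi_{i : I} A_i}\, \Pi_{i : I}\, R(i, f(i)), \\
\mathrm{AC}_{\mathrm{type}}(h) &:= \bigl(\lambda i.\, \pi_1(h(i)),\ \lambda i.\, \pi_2(h(i))\bigr).
\end{align*}
The hypothesis $h$ is itself a function returning a witness together with its proof, so the choice function is obtained merely by projecting; no selection is actually performed.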
The Diaconescu-Goodman-Myhill theorem makes the tension precise.
[quotetheorem:1231]
Since constructive mathematics rejects the law of excluded middle, this theorem shows that the full Axiom of Choice is incompatible with the constructive programme. The proof constructs, from any proposition $P$, a two-element family of sets whose choice function encodes the truth value of $P$. Specifically, define
\begin{align*}
A_0 &:= \{x \in \{0, 1\} : x = 0 \lor P\}, \\
A_1 &:= \{x \in \{0, 1\} : x = 1 \lor P\}.
\end{align*}
Both $A_0$ and $A_1$ are inhabited ($0 \in A_0$ and $1 \in A_1$). A choice function $f$ on the family $\{A_0, A_1\}$ selects $f(A_0) \in A_0$ and $f(A_1) \in A_1$. By the definitions, $f(A_0) \in A_0$ means $f(A_0) = 0 \lor P$, and $f(A_1) \in A_1$ means $f(A_1) = 1 \lor P$. Distributing, either $P$ holds, or $f(A_0) = 0$ and $f(A_1) = 1$. In the second case $f(A_0) \neq f(A_1)$. But if $P$ held, then $A_0 = A_1 = \{0, 1\}$ by extensionality, and since $f$ is a function this would give $f(A_0) = f(A_1)$; hence $\neg P$. In either case, $P \lor \neg P$. The essential point is that $f$ is defined on the sets $A_0$ and $A_1$ themselves rather than on the labels $0$ and $1$, so it cannot distinguish them when they are equal.
### Solovay's Model
The question of what mathematics looks like without the Axiom of Choice has a dramatic answer in the context of real analysis. Solovay (1970) showed that, assuming the consistency of ZF together with an inaccessible cardinal, there is a model of ZF + DC (Dependent Choice) in which:
1. Every subset of $\mathbb{R}$ is Lebesgue measurable.
2. Every subset of $\mathbb{R}$ has the Baire property.
3. Every uncountable subset of $\mathbb{R}$ contains a nonempty perfect subset (and hence has the cardinality of the continuum).
This model demonstrates that the pathologies of AC --- non-measurable sets, sets without the Baire property, uncountable sets with no perfect subset --- are genuine consequences of the axiom, not inevitable features of set theory. In Solovay's model, real analysis behaves "better" in many respects: the Lebesgue measure is total, the Baire category theorem still holds (DC suffices), and the continuum has no "wild" subsets.
The price is that full AC fails in this model, so some results of classical mathematics are lost: $\mathbb{R}$ has no basis as a vector space over $\mathbb{Q}$, some surjections have no right inverse, and Tychonoff's theorem for arbitrary products fails. Whether this price is acceptable depends on one's foundational commitments.
### The Axiom of Determinacy
A more radical alternative to AC, studied extensively in descriptive set theory, is the Axiom of Determinacy.
[definition: Axiom of Determinacy]
The **Axiom of Determinacy** (AD) asserts that for any subset $A \subset \mathbb{N}^{\mathbb{N}}$ (viewed as the payoff set of an infinite two-player game in which the players alternately choose natural numbers, with the first player winning exactly when the resulting sequence lies in $A$), one of the two players has a winning strategy.
[/definition]
AD directly contradicts full AC (a well-ordering of $\mathbb{R}$ would produce undetermined games), but it is consistent with DC.
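As a concrete illustration of the definition, consider a toy game that is not taken from the source; the notation $\sigma$ is introduced here only for this sketch. Write a play as $x = (x_0, x_1, x_2, \dots)$, where player I chooses $x_0, x_2, \dots$ and player II chooses $x_1, x_3, \dots$, and player I wins exactly when $x \in A$. For the payoff set below, player II has an explicit winning strategy $\sigma$, so this particular game is determined without any appeal to AD:
\begin{align*}
A &:= \{x \in \mathbb{N}^{\mathbb{N}} : x_1 \leq x_0\}, \\
\sigma(x_0) &:= x_0 + 1, \qquad \sigma(x_0, x_1, \dots, x_{2k}) := 0 \quad (k \geq 1), \\
x_1 = x_0 + 1 > x_0 &\implies x \notin A.
\end{align*}
AD is the assertion that a winning strategy exists for one of the players for *every* payoff set $A$, including those that admit no simple description.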
Under ZF + AD, the regularity properties that fail under AC are restored in a strong and uniform way: every subset of $\mathbb{R}$ is Lebesgue measurable, every subset has the Baire property, and every uncountable subset contains a nonempty perfect subset. Restricted forms of determinacy are compatible with AC and settle questions that ZFC alone leaves open: for instance, AD implies projective determinacy (every projective set of reals is determined), which in turn implies the Lebesgue measurability of all projective sets. Unlike full AD, projective determinacy can be obtained in ZFC from large cardinal hypotheses (Martin and Steel, 1989).
The modern perspective, following Woodin and others, is that large cardinals and determinacy are deeply intertwined: if there exist infinitely many Woodin cardinals with a measurable cardinal above them, then the Axiom of Determinacy holds in the inner model $L(\mathbb{R})$, and conversely ZF + AD is equiconsistent with ZFC together with the existence of infinitely many Woodin cardinals. This connection places AD not as a whimsical alternative to AC, but as a natural consequence of a rich large cardinal structure, one that is consistent with full AC holding in the ambient universe $V$ while AD holds in the restricted universe $L(\mathbb{R})$.
## Standard Techniques Using the Axiom of Choice
The Axiom of Choice is not merely a foundational curiosity --- it is a working tool used throughout mathematics. The following patterns of argument recur so frequently that recognising them is essential for reading and writing proofs.
### The Zorn's Lemma Template
The majority of applications of Choice in algebra and analysis follow a single template:
1. **Define the partially ordered set.** Identify the collection $\mathcal{P}$ of "partial objects" you wish to extend, and order $\mathcal{P}$ by extension (e.g., inclusion, restriction).
2. **Verify the chain condition.** Show that every totally ordered subset (chain) of $\mathcal{P}$ has an upper bound in $\mathcal{P}$; applied to the empty chain, this includes checking that $\mathcal{P}$ is nonempty. For a nonempty chain, this typically requires taking the union of the chain and verifying that it belongs to $\mathcal{P}$.
3. **Apply Zorn's Lemma.** Conclude that $\mathcal{P}$ has a maximal element.
4. **Show that maximality implies totality.** Argue that any element of $\mathcal{P}$ that is not "total" (not defined on the full domain, not maximal in the algebraic sense, etc.) can be extended, contradicting maximality.
[example: Extending a Filter to an Ultrafilter]
Let $X$ be a nonempty set and $\mathcal{F}$ a proper filter on $X$ (a nonempty collection of subsets of $X$ that is closed under finite intersections and supersets, and does not contain $\varnothing$). We show that $\mathcal{F}$ is contained in an ultrafilter.
**Step 1: Define $\mathcal{P}$.** Let $\mathcal{P}$ be the collection of all proper filters on $X$ that contain $\mathcal{F}$, ordered by inclusion: $\mathcal{G}_1 \leq \mathcal{G}_2$ if $\mathcal{G}_1 \subset \mathcal{G}_2$.
**Step 2: Chain condition.** The empty chain has $\mathcal{F}$ itself as an upper bound. Let $\{\mathcal{G}_\alpha\}_{\alpha \in J}$ be a nonempty chain in $\mathcal{P}$, and define $\mathcal{G} = \bigcup_{\alpha \in J} \mathcal{G}_\alpha$. Then $\mathcal{G}$ contains $\mathcal{F}$, and it is a filter: if $B_1, B_2 \in \mathcal{G}$, then $B_1 \in \mathcal{G}_{\alpha_1}$ and $B_2 \in \mathcal{G}_{\alpha_2}$ for some $\alpha_1, \alpha_2$; since the chain is totally ordered, one of $\mathcal{G}_{\alpha_1} \subset \mathcal{G}_{\alpha_2}$ or $\mathcal{G}_{\alpha_2} \subset \mathcal{G}_{\alpha_1}$ holds, so both $B_1$ and $B_2$ belong to the larger filter, and $B_1 \cap B_2$ belongs to it as well. Since no $\mathcal{G}_\alpha$ contains $\varnothing$, neither does $\mathcal{G}$. Upward closure is immediate, since each $\mathcal{G}_\alpha$ is closed under supersets. Thus $\mathcal{G} \in \mathcal{P}$, and $\mathcal{G}$ is an upper bound for the chain.
**Step 3: Apply Zorn.** By Zorn's Lemma, $\mathcal{P}$ has a maximal element $\mathcal{U}$.
**Step 4: Maximality implies ultrafilter.** Suppose $\mathcal{U}$ is not an ultrafilter. Then there exists $A \subset X$ with $A \notin \mathcal{U}$ and $X \setminus A \notin \mathcal{U}$. The collection $\mathcal{U}' = \{B \subset X : B \supset U \cap A \text{ for some } U \in \mathcal{U}\}$ is a proper filter (properness follows from $X \setminus A \notin \mathcal{U}$, which ensures $\varnothing \notin \mathcal{U}'$), and $\mathcal{U} \subsetneq \mathcal{U}'$ (since $A \in \mathcal{U}'$ but $A \notin \mathcal{U}$). This contradicts the maximality of $\mathcal{U}$.
[/example]
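The filter properties of $\mathcal{U}'$ asserted in Step 4 of the example can be verified directly. The computation below is a routine check added for completeness, with $U, U_1, U_2 \in \mathcal{U}$ and $B_1, B_2 \in \mathcal{U}'$ chosen so that $B_j \supset U_j \cap A$:
\begin{align*}
B_1 \cap B_2 &\supset (U_1 \cap A) \cap (U_2 \cap A) = (U_1 \cap U_2) \cap A, \qquad U_1 \cap U_2 \in \mathcal{U}, \\
U \cap A = \varnothing &\implies U \subset X \setminus A \implies X \setminus A \in \mathcal{U}, \\
U \supset U \cap A &\implies U \in \mathcal{U}' \quad \text{for every } U \in \mathcal{U}.
\end{align*}
The first line gives closure under finite intersections. The second shows, by contraposition and the assumption $X \setminus A \notin \mathcal{U}$, that $\varnothing \notin \mathcal{U}'$. The third gives $\mathcal{U} \subset \mathcal{U}'$.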
### The Transfinite Selection Pattern
Some applications of Choice do not fit the Zorn template because the construction requires making choices at each step of a transfinite recursion, rather than simply invoking a maximal element. The Well-Ordering Theorem is itself the paradigm: one fixes a choice function on the nonempty subsets of a set and uses it, at each ordinal stage, to select an element not yet selected. The construction continues until the set is exhausted, and the order in which the elements were selected is the desired well-ordering.
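Written out, the recursion for the Well-Ordering Theorem looks as follows. This is a sketch of the standard argument, with the names $c$, $x_\alpha$, and $\prec$ introduced here: $X$ is the set to be well-ordered and $c$ is a choice function on its nonempty subsets, which is the only place where AC enters.
\begin{align*}
c &: \{S \subset X : S \neq \varnothing\} \to X, \qquad c(S) \in S, \\
x_\alpha &:= c\bigl(X \setminus \{x_\beta : \beta < \alpha\}\bigr) \quad \text{as long as } X \setminus \{x_\beta : \beta < \alpha\} \neq \varnothing, \\
x_\alpha \prec x_\beta &\iff \alpha < \beta.
\end{align*}
The map $\alpha \mapsto x_\alpha$ is injective, so by Hartogs' theorem (which is provable in ZF) it cannot be defined on every ordinal. The recursion therefore stops at some ordinal $\theta$ with $X = \{x_\alpha : \alpha < \theta\}$, and $\prec$ well-orders $X$. All of the "choices at each stage" are encoded in the single function $c$, fixed before the recursion begins.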
This pattern appears in:
- The proof that every infinite set contains a countably infinite subset (DC suffices).
- The construction of a Bernstein set (a set $B \subset \mathbb{R}$ such that neither $B$ nor $\mathbb{R} \setminus B$ contains a nonempty perfect subset), which requires a well-ordering of the nonempty perfect subsets of $\mathbb{R}$ and a transfinite selection at each stage.
- The proof of the downward Lowenheim-Skolem theorem, where one builds an elementary submodel by selecting witnesses for existential statements at each stage of an iterative closure construction.
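For the Bernstein set, the selection can be made explicit. The following sketch uses the standard facts that there are exactly $\mathfrak{c} = 2^{\aleph_0}$ nonempty perfect subsets of $\mathbb{R}$ and that each has cardinality $\mathfrak{c}$. The enumeration $(P_\alpha)$ and the points $a_\alpha, b_\alpha$ are names introduced here, and $\mathfrak{c}$ is identified with its initial ordinal, which already presupposes a well-ordering of $\mathbb{R}$.
\begin{align*}
\{P_\alpha : \alpha < \mathfrak{c}\} &= \text{the nonempty perfect subsets of } \mathbb{R}, \\
a_\alpha, b_\alpha &\in P_\alpha \setminus \{a_\beta, b_\beta : \beta < \alpha\}, \qquad a_\alpha \neq b_\alpha, \\
B &:= \{a_\alpha : \alpha < \mathfrak{c}\}.
\end{align*}
At stage $\alpha$ fewer than $\mathfrak{c}$ points have been used while $|P_\alpha| = \mathfrak{c}$, so the selection is always possible. Every $P_\alpha$ contains the point $a_\alpha \in B$ and the point $b_\alpha \notin B$, so neither $B$ nor $\mathbb{R} \setminus B$ contains a nonempty perfect set.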
### Recognising Hidden Uses of Choice
Many standard arguments use the Axiom of Choice implicitly. The following is a diagnostic checklist for detecting hidden applications:
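1. **"For each $n$, choose..."** — If a proof constructs a sequence $(x_n)_{n=1}^\infty$ by making a choice at each step and no explicit rule for the choice is available, it uses AC$_\omega$ when the choices are independent of one another, and DC when each choice depends on the previous ones.
2. **"There exists a basis..."** — The statement that every vector space has a basis is equivalent to full AC (Blass, 1984). Finite-dimensional spaces have bases in ZF, so the hidden use arises for infinite-dimensional spaces.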
3. **"By Zorn's Lemma..."** — This is AC, explicitly.
4. **"Take a maximal ideal..."** — Maximal ideals in arbitrary rings require Zorn's Lemma.
5. **"Extend to an ultrafilter..."** — This requires BPI, which is strictly weaker than AC but not provable in ZF.
6. **"A countable union of countable sets is countable"** — This requires AC$_\omega$.
7. **"Every surjection has a right inverse"** — This is equivalent to full AC.
8. **"Every infinite set has a countably infinite subset"** — This requires a weak form of Choice (a countable selection from nonempty sets, which follows from DC).
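Item 6 shows how easily a choice can hide in a familiar argument. In the sketch below, the sets $A_n$ are countable and nonempty, and $E_n$, $e_n$, and $\Phi$ are names introduced here. The only step that goes beyond ZF is the simultaneous selection of one enumeration for each $n$.
\begin{align*}
E_n &:= \{e : \mathbb{N} \to A_n \mid e \text{ is surjective}\} \neq \varnothing \quad (n \in \mathbb{N}), \\
(e_n)_{n \in \mathbb{N}} &\in \prod_{n \in \mathbb{N}} E_n \quad \text{(by AC}_\omega\text{)}, \\
\Phi : \mathbb{N} \times \mathbb{N} &\to \bigcup_{n \in \mathbb{N}} A_n, \qquad \Phi(n, m) := e_n(m).
\end{align*}
The map $\Phi$ is surjective and $\mathbb{N} \times \mathbb{N}$ is countable, so the union is countable. Each $E_n$ is nonempty by hypothesis, but it typically has many elements and no canonical one, which is why picking the whole sequence $(e_n)$ at once requires AC$_\omega$.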
## References
- Jech, T., *The Axiom of Choice* (1973). North-Holland.
- Jech, T., *Set Theory: The Third Millennium Edition* (2003). Springer.
- Herrlich, H., *Axiom of Choice* (2006). Springer Lecture Notes in Mathematics, Vol. 1876.
- Kunen, K., *Set Theory: An Introduction to Independence Proofs* (1980). North-Holland.
- Moore, G. H., *Zermelo's Axiom of Choice: Its Origins, Development, and Influence* (1982). Springer.
- Solovay, R. M., A model of set theory in which every set of reals is Lebesgue measurable. *Annals of Mathematics* **92** (1970), 1-56.
- Cohen, P. J., *Set Theory and the Continuum Hypothesis* (1966). W. A. Benjamin.
- Kelley, J. L., The Tychonoff product theorem implies the axiom of choice. *Fundamenta Mathematicae* **37** (1950), 75-76.
- Blass, A., Existence of bases implies the Axiom of Choice. *Contemporary Mathematics* **31** (1984), 31-33.
- Wagon, S., *The Banach-Tarski Paradox* (1985). Cambridge University Press.