How large can a subset of $[0,1]$ be while containing no interval? At first glance, removing open intervals from $[0,1]$ should eventually leave behind a set that either is countable (and hence negligible) or retains some interval of positive length. The **Cantor set** demolishes both intuitions: it is uncountable (as large as $\mathbb{R}$ in the sense of cardinality), yet it contains no interval whatsoever, and its [Lebesgue measure](/page/Measure%20Space) is zero. This single construction serves as the canonical counterexample across analysis, topology, and measure theory. It shows that uncountability does not imply positive measure, that [compactness](/page/Compact%20Space) and [total disconnectedness](/page/Connectedness) can coexist in an uncountable set, and that a [nowhere dense](/page/Nowhere%20Dense%20Set) [closed](/page/Closure) set can still be topologically rich.
The construction itself is elementary (one removes middle thirds iteratively), but the resulting object has a surprisingly deep structure. The Cantor set is homeomorphic to the product space $\{0,1\}^{\mathbb{N}}$, making it a universal object: every nonempty compact metrizable space is a continuous image of it. Its companion, the **Cantor function** (the "devil's staircase"), provides a continuous surjection from $[0,1]$ onto $[0,1]$ that is locally constant on an open set of full measure yet manages to increase from $0$ to $1$, a phenomenon that is impossible for absolutely continuous functions. Variants called **fat Cantor sets** show that even the measure-zero property is an artifact of the particular construction, not of the topological type: one can build sets homeomorphic to the Cantor set that have any prescribed Lebesgue measure in $(0,1)$.
[example: Naive Interval Removal]
Consider the simplest strategy for building a "small" subset of $[0,1]$: remove a single open interval, say $(1/4, 3/4)$. The remainder $[0, 1/4] \cup [3/4, 1]$ consists of two closed intervals and has Lebesgue measure $1/2$. Remove the middle half of each remaining interval to get four intervals of total measure $1/4$. After $n$ steps, the remaining set $R_n$ consists of $2^n$ closed intervals, each of length $(1/4)^n$, so
\begin{align*}
\mathcal{L}^1(R_n) = 2^n \cdot \left(\frac{1}{4}\right)^n = \left(\frac{1}{2}\right)^n \to 0.
\end{align*}
The intersection $R = \bigcap_{n=0}^{\infty} R_n$ has measure zero. But is $R$ countable or uncountable? The measure computation alone cannot decide this, and the answer is far from obvious. In fact every construction of this type leaves an uncountable set, whatever the removal ratio. The Cantor set is the standard instance, obtained by removing middle *thirds*, and its uncountability is established by a careful argument via ternary expansions.
[/example]
## Definition
The standard Cantor set is built by iterating a single geometric operation: remove the open middle third of every remaining closed interval.
[definition: Cantor Set]
Define a decreasing sequence of compact sets $C_0 \supset C_1 \supset C_2 \supset \cdots$ in $[0,1]$ as follows. Set $C_0 := [0,1]$. Given $C_n$, which is a disjoint union of $2^n$ closed intervals each of length $3^{-n}$, define $C_{n+1}$ by removing the open middle third of each constituent interval. That is, if $[a, b]$ is a constituent interval of $C_n$ (so $b - a = 3^{-n}$), replace it by
\begin{align*}
[a, b] \;\longmapsto\; \left[a,\; a + \frac{b-a}{3}\right] \cup \left[a + \frac{2(b-a)}{3},\; b\right].
\end{align*}
The **Cantor set** (also called the **Cantor ternary set** or **Cantor middle-thirds set**) is
\begin{align*}
\mathcal{C} := \bigcap_{n=0}^{\infty} C_n.
\end{align*}
[/definition]
The first few stages make the construction concrete:
\begin{align*}
C_0 &= [0, 1], \\
C_1 &= \left[0, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, 1\right], \\
C_2 &= \left[0, \tfrac{1}{9}\right] \cup \left[\tfrac{2}{9}, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, \tfrac{7}{9}\right] \cup \left[\tfrac{8}{9}, 1\right], \\
C_3 &= \text{eight closed intervals, each of length } \tfrac{1}{27}.
\end{align*}
[illustration:cantor-set-construction-stages]
At stage $n$, the set $C_n$ consists of $2^n$ closed intervals, each of length $3^{-n}$. Since each $C_n$ is a finite union of closed intervals, it is compact. The Cantor set $\mathcal{C}$, being the intersection of a nested sequence of nonempty compact subsets of $\mathbb{R}$, is itself nonempty and compact by the [finite intersection property](/page/Compact%20Space).
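The stages can be generated mechanically. The following is a minimal sketch using exact rational arithmetic; the function name `cantor_stage` is an illustrative choice, not notation from the article.

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals of C_n as a sorted list of (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))   # left closed third
            refined.append((b - third, b))   # right closed third
        intervals = refined
    return intervals

# C_5 should consist of 2^5 intervals, each of length 3^-5.
stage5 = cantor_stage(5)
assert len(stage5) == 2 ** 5
assert all(b - a == Fraction(1, 3 ** 5) for a, b in stage5)
```

Working with `Fraction` rather than floats keeps the endpoints exact, so the interval counts and lengths can be compared with equality.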
### Characterisation by ternary expansions
The iterative construction, while vivid, does not lend itself to direct computation. A far more useful description identifies $\mathcal{C}$ with those real numbers whose ternary (base-$3$) expansion avoids the digit $1$.
[quotetheorem:1196]
The connection between the two descriptions becomes transparent once one tracks the removal process digit by digit. At stage $1$, the interval $(1/3, 2/3)$ is removed. In base $3$, this interval consists of those numbers whose first ternary digit is necessarily $1$. The endpoints are excluded: $1/3 = 0.1000\ldots_3$ and $2/3 = 0.1222\ldots_3$ have the alternative representations $0.0222\ldots_3$ and $0.2000\ldots_3$, which use only digits $\{0,2\}$. At stage $2$, we remove numbers whose second ternary digit is $1$ (among those surviving stage $1$), and so on. After all stages, precisely those numbers with no digit $1$ in any position survive.
This characterisation is the primary computational tool for the Cantor set. Rather than reasoning about nested intersections of intervals, one reasons about sequences in $\{0,2\}^{\mathbb{N}}$, which is far more tractable.
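For rational numbers the digit criterion can be checked exactly. The sketch below is one possible implementation under the assumption that the input is rational; the name `in_cantor` is hypothetical. It shifts off one ternary digit at a time and stops when the remainder repeats, which must happen because the remainders share a bounded denominator.

```python
from fractions import Fraction

def in_cantor(x):
    """Decide whether a rational x lies in the Cantor set, using exact arithmetic."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if x <= Fraction(1, 3):
            x = 3 * x          # leading digit 0: shift it off
        elif x >= Fraction(2, 3):
            x = 3 * x - 2      # leading digit 2: shift it off
        else:
            return False       # strictly inside a removed middle third
    return True                # remainder repeated, so the digits cycle within {0, 2}

assert in_cantor(Fraction(1, 4))       # 0.020202..._3
assert in_cantor(Fraction(1, 3))       # 0.0222..._3
assert not in_cantor(Fraction(1, 2))   # 0.111..._3
```

The closed comparisons `<=` and `>=` treat the endpoints $1/3$ and $2/3$ through their expansions in $\{0,2\}$, matching the convention in the characterisation.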
## Measure and Cardinality
Two properties of the Cantor set stand in stark tension: it has Lebesgue measure zero, yet it is uncountable. Understanding both facts — and why they do not contradict each other — is essential for developing correct intuition about the relationship between measure and cardinality.
### Measure zero
The total length removed at stage $n$ is straightforward to compute. At stage $1$, we remove one interval of length $1/3$. At stage $2$, we remove two intervals each of length $1/9$. At stage $n$, we remove $2^{n-1}$ intervals each of length $3^{-n}$. The total measure of all removed intervals is
\begin{align*}
\sum_{n=1}^{\infty} 2^{n-1} \cdot \frac{1}{3^n} = \frac{1}{3} \sum_{n=0}^{\infty} \left(\frac{2}{3}\right)^n = \frac{1}{3} \cdot \frac{1}{1 - 2/3} = 1.
\end{align*}
Since the removed intervals are pairwise disjoint and their union lies in $[0,1]$, we obtain
\begin{align*}
\mathcal{L}^1(\mathcal{C}) = \mathcal{L}^1([0,1]) - \sum_{n=1}^{\infty} 2^{n-1} \cdot 3^{-n} = 1 - 1 = 0.
\end{align*}
[quotetheorem:1197]
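The geometric series can be checked stage by stage: the length remaining after $n$ stages should equal $(2/3)^n$, the measure of $C_n$. A short sketch, with `removed_length` as an illustrative helper name:

```python
from fractions import Fraction

def removed_length(n):
    """Total length removed in stages 1..n: the sum of 2^(k-1) / 3^k."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n + 1))

for n in (1, 2, 5, 20):
    remaining = 1 - removed_length(n)
    assert remaining == Fraction(2, 3) ** n   # measure of C_n
    print(n, float(remaining))
```

The remaining length decays geometrically, which is the finite-stage version of the statement that the Cantor set has measure zero.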
One might suspect that any set of Lebesgue measure zero must be "small" in every sense — perhaps countable, or at least topologically trivial. The next result shows this is false.
### Uncountability
[quotetheorem:1198]
The ternary characterisation provides a direct argument. Each element of $\mathcal{C}$ corresponds to a sequence $(a_k)_{k \in \mathbb{N}}$ with $a_k \in \{0,2\}$. The map
\begin{align*}
\Phi: \{0,2\}^{\mathbb{N}} &\to \mathcal{C} \\
(a_k)_{k \in \mathbb{N}} &\mapsto \sum_{k=1}^{\infty} \frac{a_k}{3^k}
\end{align*}
is surjective by the ternary characterisation theorem. It is also injective: if two distinct sequences $(a_k)$ and $(b_k)$ in $\{0,2\}^{\mathbb{N}}$ are given, let $m$ be the first index where they differ. Then $|a_m - b_m| = 2$, while the contribution of all subsequent terms satisfies
\begin{align*}
\sum_{k=m+1}^{\infty} \frac{|a_k - b_k|}{3^k} \le \sum_{k=m+1}^{\infty} \frac{2}{3^k} = \frac{2}{3^{m+1}} \cdot \frac{1}{1 - 1/3} = \frac{1}{3^m} < \frac{2}{3^m},
\end{align*}
so the images are distinct. Since $|\{0,2\}^{\mathbb{N}}| = 2^{\aleph_0} = |\mathbb{R}|$, the bijection $\Phi$ establishes $|\mathcal{C}| = |\mathbb{R}|$.
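The injectivity estimate can be illustrated on finite digit prefixes. The sketch below enumerates all prefixes of length $n$ and checks that their partial sums are pairwise distinct and separated; `phi_prefix` is a hypothetical name for the truncated version of $\Phi$, and the check is a finite sanity test, not a proof.

```python
from fractions import Fraction
from itertools import product

def phi_prefix(digits):
    """Partial sum of Phi: the sum of a_k / 3^k over a finite digit prefix."""
    return sum(Fraction(a, 3 ** k) for k, a in enumerate(digits, start=1))

n = 6
values = sorted(phi_prefix(d) for d in product((0, 2), repeat=n))
gaps = [b - a for a, b in zip(values, values[1:])]

assert len(set(values)) == 2 ** n          # distinct prefixes give distinct sums
assert min(gaps) == Fraction(2, 3 ** n)    # closest pair differs only in the last digit
```

The partial sums are exactly the left endpoints of the $2^n$ intervals of $C_n$, which is why they never collide.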
[explanation: Measure Zero Does Not Imply Countable]
The coexistence of measure zero and uncountability in $\mathcal{C}$ reveals a fundamental point: **Lebesgue measure and cardinality are independent notions of size.** Countable sets always have measure zero (a countable set $\{x_1, x_2, \ldots\}$ can be covered by intervals of total length $\sum_{k=1}^{\infty} \varepsilon/2^k = \varepsilon$ for any $\varepsilon > 0$), but the converse fails spectacularly. The Cantor set is as large as $\mathbb{R}$ in cardinality while being invisible to Lebesgue measure. Later, in the section on fat Cantor sets, we will see that topology and measure can be decoupled even further: sets homeomorphic to $\mathcal{C}$ can have any prescribed measure in $[0,1)$.
[/explanation]
## Topological Structure
The Cantor set is remarkable not for any single topological property but for the combination of properties it achieves simultaneously. It is compact, perfect (no isolated points), and totally disconnected — properties that individually are routine but in combination characterise the Cantor set up to homeomorphism.
### Compactness and closedness
Since each $C_n$ is a closed subset of $[0,1]$ and $\mathcal{C} = \bigcap_{n=0}^{\infty} C_n$, the Cantor set is closed. As a closed subset of the compact space $[0,1]$, it is [compact](/page/Compact%20Space).
### Perfectness
The question of whether $\mathcal{C}$ contains isolated points leads to a topological property that provides an alternative route to uncountability — one that does not rely on ternary expansions at all.
[definition: Perfect Set]
A closed set is **perfect** if it has no isolated points — every point is a limit point.
[/definition]
Why should perfectness matter? Because nonempty perfect subsets of a complete metric space are necessarily uncountable, so establishing that $\mathcal{C}$ is perfect gives a second, independent proof of its uncountability.
[quotetheorem:1199]
To see why no point of $\mathcal{C}$ is isolated, consider $x \in \mathcal{C}$ with ternary expansion $x = \sum_{k=1}^{\infty} a_k / 3^k$ where each $a_k \in \{0, 2\}$. For each $n \in \mathbb{N}$, define $y_n$ by changing only the $n$-th digit:
\begin{align*}
y_n := \sum_{k=1}^{\infty} \frac{b_k}{3^k}, \quad \text{where } b_k = \begin{cases} a_k & \text{if } k \ne n, \\ 2 - a_n & \text{if } k = n. \end{cases}
\end{align*}
Since each $b_k \in \{0,2\}$, we have $y_n \in \mathcal{C}$. Moreover, $y_n \ne x$ and
\begin{align*}
|x - y_n| = \frac{|a_n - (2 - a_n)|}{3^n} = \frac{2}{3^n} \to 0 \quad \text{as } n \to \infty.
\end{align*}
So every neighbourhood of $x$ contains another point of $\mathcal{C}$.
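The digit-flipping argument can be run on a concrete point. The sketch below takes $x = 1/4$, flips its $n$-th digit, and confirms that the result stays in $\mathcal{C}$ at distance exactly $2/3^n$. All function names are illustrative, and the membership test assumes rational input.

```python
from fractions import Fraction

THIRD, TWO_THIRDS = Fraction(1, 3), Fraction(2, 3)

def in_cantor(x):
    """Exact membership test for a rational x, by shifting off ternary digits."""
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if x <= THIRD:
            x = 3 * x
        elif x >= TWO_THIRDS:
            x = 3 * x - 2
        else:
            return False
    return True

def digit(x, n):
    """n-th digit (n >= 1) of the {0,2}-expansion of a point x of the Cantor set."""
    for _ in range(n - 1):
        x = 3 * x if x <= THIRD else 3 * x - 2
    return 0 if x <= THIRD else 2

def flip_digit(x, n):
    """Return y_n: the point x with its n-th digit a_n replaced by 2 - a_n."""
    a_n = digit(x, n)
    return x + Fraction(2 - 2 * a_n, 3 ** n)

x = Fraction(1, 4)
for n in range(1, 9):
    y = flip_digit(x, n)
    assert y != x and in_cantor(y)
    assert abs(x - y) == Fraction(2, 3 ** n)
```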
### Total disconnectedness
To capture the sense in which $\mathcal{C}$ is "maximally fragmented," we need a notion that goes beyond having no isolated points.
[definition: Totally Disconnected Space]
A topological space is **totally disconnected** if its only connected subsets are singletons.
[/definition]
For subsets of $\mathbb{R}$, connected subsets are intervals, so total disconnectedness means the set contains no interval of positive length.
[quotetheorem:1200]
The argument uses the complement of $\mathcal{C}$ directly. Given $x, y \in \mathcal{C}$ with $x < y$, there exists a point $z \in (x,y) \setminus \mathcal{C}$, because $(x,y)$ has positive length and $\mathcal{C}$ contains no interval. (Every subinterval of $[0,1]$ of positive length intersects the complement $[0,1] \setminus \mathcal{C}$, since the union of the removed middle thirds is dense in $[0,1]$.) Then $U := \mathcal{C} \cap (-\infty, z)$ and $V := \mathcal{C} \cap (z, \infty)$ are disjoint, relatively open subsets of $\mathcal{C}$ that cover $\mathcal{C}$, with $x \in U$ and $y \in V$, which is the desired separation.
An equivalent viewpoint: $\mathcal{C}$ is totally disconnected because for any $x, y \in \mathcal{C}$ with $x < y$, the ternary expansions of $x$ and $y$ first differ at some digit $m$, and the removed middle third at level $m$ provides a gap in $\mathcal{C}$ separating $x$ from $y$.
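The separating gap can be located explicitly by descending through the construction until the two points fall into different thirds. This is a sketch under the assumption that both inputs lie in $\mathcal{C}$ and satisfy $x < y$; the name `separating_gap` is hypothetical.

```python
from fractions import Fraction

def separating_gap(x, y):
    """For x < y in the Cantor set, return the removed middle third lying between them."""
    a, b = Fraction(0), Fraction(1)
    while True:
        third = (b - a) / 3
        left_end, right_start = a + third, b - third
        if x <= left_end and y >= right_start:
            return left_end, right_start   # x and y fall in different thirds
        if y <= left_end:
            b = left_end                   # both points lie in the left third
        else:
            a = right_start                # both points lie in the right third

gap = separating_gap(Fraction(2, 9), Fraction(2, 3))
assert gap == (Fraction(1, 3), Fraction(2, 3))
```

Any point $z$ of the returned open interval, for instance its midpoint, can serve as the cut defining $U$ and $V$. The loop terminates because the intervals of $C_n$ eventually become shorter than $y - x$.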
### Nowhere density
The Cantor set is [nowhere dense](/page/Nowhere%20Dense%20Set): its [closure](/page/Closure) (which is $\mathcal{C}$ itself, since $\mathcal{C}$ is closed) has empty interior. This is a direct consequence of total disconnectedness: since $\mathcal{C}$ contains no interval of positive length, it contains no nonempty open subset of $\mathbb{R}$, so $\operatorname{int}(\mathcal{C}) = \varnothing$.
Despite being nowhere dense, $\mathcal{C}$ is uncountable. This does not conflict with the [Baire Category Theorem](/page/Compact%20Space), which guarantees that a nonempty complete metric space cannot be written as a countable union of nowhere dense sets. The theorem constrains countable unions of nowhere dense sets, not the cardinality of any single one, and the Cantor set is a single nowhere dense set that is already uncountable.
### Topological characterisation
The combination of compactness, perfectness, and total disconnectedness in a metrizable space is not just a list of properties — it is a *complete* characterisation.
[quotetheorem:1201]
This theorem, due to L. E. J. Brouwer (1910), is one of the earliest and most striking results in descriptive set theory. It says that the Cantor set is, up to homeomorphism, the *unique* compact metrizable space that is both perfect and totally disconnected. Any construction that produces such a space — regardless of how different it looks geometrically — yields a space homeomorphic to $\mathcal{C}$. For instance, the fat Cantor sets constructed later in this article are all homeomorphic to $\mathcal{C}$, even though they have positive Lebesgue measure. The theorem also implies that the product space $\{0,1\}^{\mathbb{N}}$ (with the product topology, where $\{0,1\}$ carries the discrete topology) is homeomorphic to $\mathcal{C}$, since $\{0,1\}^{\mathbb{N}}$ is readily verified to be compact, metrizable, perfect, and totally disconnected.
What the theorem does *not* say is equally important: it gives no information about metric properties. Two spaces can be homeomorphic (hence topologically identical) yet have vastly different measures, Hausdorff dimensions, or Hölder regularity properties. The Cantor set and a fat Cantor set of measure $1/2$ are homeomorphic, but they are not bi-Lipschitz equivalent: a bi-Lipschitz map preserves Lebesgue null sets, so any homeomorphism between the two must distort distances without bound.
## The Cantor Set as a Product Space
The ternary characterisation already suggests a bijection between $\mathcal{C}$ and $\{0,2\}^{\mathbb{N}}$. A natural question is whether this bijection is merely set-theoretic or whether it respects topology. Since $\{0,2\}$ is a two-point discrete space and $\{0,2\}^{\mathbb{N}}$ carries the product topology, this amounts to asking: is the natural encoding of Cantor set elements by their digit sequences a homeomorphism?
[quotetheorem:1202]
The key observation is that $\Phi$ is a continuous bijection from a compact space to a Hausdorff space, and any such map is automatically a homeomorphism. Continuity of $\Phi$ follows from the fact that if two sequences agree in the first $n$ coordinates, then their images under $\Phi$ differ by at most $\sum_{k=n+1}^{\infty} 2 \cdot 3^{-k} = 3^{-n}$, so $\Phi$ maps the basic open set $\{(a_k) : a_1 = c_1, \ldots, a_n = c_n\}$ into a set of diameter at most $3^{-n}$.
Since $\{0,2\}$ is homeomorphic to $\{0,1\}$, this result can equivalently be stated as $\mathcal{C} \cong \{0,1\}^{\mathbb{N}}$, the space of all binary sequences. This identification is the gateway to the universality property of the Cantor set.
[explanation: Why the Product Topology Matters]
The product topology on $\{0,1\}^{\mathbb{N}}$ is the topology of "coordinate-wise convergence" — a sequence of binary strings converges if and only if it converges in each coordinate (which, for a discrete factor, means it is eventually constant in each coordinate). Since $\Phi$ is a homeomorphism, convergence of a sequence $(x_n)$ in $\mathcal{C}$ (equivalently, convergence in $\mathbb{R}$ to a limit in $\mathcal{C}$) is the same as coordinate-wise convergence of the corresponding digit sequences. This gives a powerful computational criterion: to check whether a sequence in $\mathcal{C}$ converges, it suffices to check that each ternary digit eventually stabilises. The homeomorphism also transfers other topological properties: the clopen sets of $\mathcal{C}$ correspond to cylinder sets in $\{0,2\}^{\mathbb{N}}$ (sets determined by finitely many coordinates), making the Borel structure of $\mathcal{C}$ fully explicit.
[/explanation]
## Universality
The Cantor set is not merely an interesting example — it is a universal object in the category of compact metrizable spaces. Every such space, no matter how complicated, can be obtained as a continuous image of $\mathcal{C}$. This is surprising: the Cantor set is totally disconnected and has measure zero, yet it surjects continuously onto spaces like $[0,1]$, $[0,1]^2$, the Hilbert cube, or any compact metric space.
[quotetheorem:1203]
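The theorem follows from the identification $\mathcal{C} \cong \{0,1\}^{\mathbb{N}}$ together with three facts: every compact metrizable space embeds into the Hilbert cube $[0,1]^{\mathbb{N}}$; there is a continuous surjection $f$ from $\{0,1\}^{\mathbb{N}}$ onto $[0,1]^{\mathbb{N}}$, built coordinate-wise from binary expansions; and every nonempty closed subset of $\mathcal{C}$ is a retract of $\mathcal{C}$. Given a nonempty compact metrizable space $X$, regarded as a subset of $[0,1]^{\mathbb{N}}$, the preimage $f^{-1}(X)$ is a nonempty closed subset of $\mathcal{C}$, so composing a retraction $\mathcal{C} \to f^{-1}(X)$ with $f$ yields a continuous surjection $\mathcal{C} \to X$.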
This universality property has striking consequences. For example, there exists a continuous surjection from $\mathcal{C}$ onto $[0,1]^n$ for every $n \in \mathbb{N}$: the Cantor set, a totally disconnected dust of topological dimension zero, maps continuously onto spaces of arbitrarily high dimension. This is consistent because continuous surjections need not preserve dimension; they can "fold" a totally disconnected space onto a connected one.
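What the universality theorem does *not* provide is any control over the regularity of the surjection. The map $f: \mathcal{C} \to X$ is continuous but need not be Hölder continuous for any positive exponent: an $\alpha$-Hölder map can increase Hausdorff dimension by at most the factor $1/\alpha$, so no Hölder surjection exists when $X$ has infinite Hausdorff dimension. Moreover, $f$ cannot be injective when $X$ is connected and has more than one point, since a continuous injection from a compact space into a Hausdorff space is a homeomorphism onto its image, and the image would then be totally disconnected.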
## The Cantor Function
The construction of the Cantor set removes open intervals whose total length sums to $1$. On each removed interval, we can assign a constant value that reflects its "position" in the construction. Extending this assignment to all of $[0,1]$ produces a function with remarkable properties: it is continuous, nondecreasing, surjects $[0,1]$ onto $[0,1]$, yet has zero derivative almost everywhere. This is the **Cantor function**, also known as the **devil's staircase**.
The difficulty that the Cantor function resolves is this: can a continuous nondecreasing function increase from $0$ to $1$ while having zero derivative almost everywhere? For absolutely continuous functions, the Fundamental Theorem of Calculus would force
\begin{align*}
f(1) - f(0) = \int_0^1 f'(x) \, d\mathcal{L}^1(x),
\end{align*}
so $f' = 0$ a.e. would imply $f(1) = f(0)$. The Cantor function shows that this identity fails for merely continuous functions — the hypothesis of absolute continuity in the Fundamental Theorem of Calculus is not a technicality but a genuine necessity.
[definition: Cantor Function]
The **Cantor function** (or **Cantor-Lebesgue function**) is the function $\varphi: [0,1] \to [0,1]$ defined as follows. For $x \in \mathcal{C}$ with ternary expansion $x = \sum_{k=1}^{\infty} a_k / 3^k$ (where each $a_k \in \{0,2\}$), set
\begin{align*}
\varphi(x) := \sum_{k=1}^{\infty} \frac{a_k / 2}{2^k} = \sum_{k=1}^{\infty} \frac{b_k}{2^k}, \quad \text{where } b_k := \frac{a_k}{2} \in \{0, 1\}.
\end{align*}
For $x \in [0,1] \setminus \mathcal{C}$, the point $x$ lies in some removed open interval $(a, b)$ with $a, b \in \mathcal{C}$; define $\varphi(x) := \varphi(a) = \varphi(b)$.
[/definition]
[illustration:cantor-function-devils-staircase]
The map $\varphi$ replaces each ternary digit $a_k \in \{0,2\}$ by $a_k/2 \in \{0,1\}$ and reinterprets the result as a binary expansion. Since every number in $[0,1]$ has a binary expansion, $\varphi$ is surjective.
[example: Computing the Cantor Function]
Consider $x = 1/4$. In base $3$, we have $1/4 = 0.020202\ldots_3$, so $a_k$ cycles as $0, 2, 0, 2, \ldots$ . Then $b_k = a_k/2$ cycles as $0, 1, 0, 1, \ldots$, giving
\begin{align*}
\varphi(1/4) = \sum_{k=1}^{\infty} \frac{b_k}{2^k} = \frac{0}{2} + \frac{1}{4} + \frac{0}{8} + \frac{1}{16} + \cdots = \sum_{j=1}^{\infty} \frac{1}{4^j} = \frac{1/4}{1 - 1/4} = \frac{1}{3}.
\end{align*}
Now consider $x = 1/2$, which lies in the first removed interval $(1/3, 2/3)$. The left endpoint $1/3 = 0.0222\ldots_3$ maps to $\varphi(1/3) = 0.0111\ldots_2 = 1/2$. The right endpoint $2/3 = 0.2000\ldots_3$ maps to $\varphi(2/3) = 0.1000\ldots_2 = 1/2$. So $\varphi(1/2) = 1/2$, and indeed $\varphi$ is constant on the entire interval $(1/3, 2/3)$.
[/example]
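The digit substitution translates directly into a short procedure. The following is a sketch that approximates $\varphi$ to within $2^{-\text{depth}}$ using exact rationals; the name `cantor_function` and the parameter `depth` are illustrative assumptions, and the routine returns early (with an exact value) as soon as the rescaled point lands in a closed middle third.

```python
from fractions import Fraction

def cantor_function(x, depth=60):
    """Approximate the Cantor function at x in [0, 1] to within 2**-depth."""
    x = Fraction(x)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        if x < Fraction(1, 3):
            x = 3 * x                # ternary digit 0 becomes binary digit 0
        elif x > Fraction(2, 3):
            value += weight          # ternary digit 2 becomes binary digit 1
            x = 3 * x - 2
        else:
            return value + weight    # x sits on a plateau at this scale
        weight /= 2
    return value

assert cantor_function(Fraction(1, 2)) == Fraction(1, 2)
assert abs(cantor_function(Fraction(1, 4)) - Fraction(1, 3)) < Fraction(1, 2 ** 50)
```

Both values agree with the hand computation in the example above: $\varphi(1/2) = 1/2$ exactly, and $\varphi(1/4) = 1/3$ up to the truncation error.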
[quotetheorem:1204]
Properties (3) and (4) are closely related: $[0,1] \setminus \mathcal{C}$ is an open set of full measure ($\mathcal{L}^1([0,1] \setminus \mathcal{C}) = 1$), and $\varphi$ is constant on each connected component of this open set, so $\varphi' = 0$ on $[0,1] \setminus \mathcal{C}$, which gives (4).
Property (6) is the most consequential. If $\varphi$ were absolutely continuous, the Fundamental Theorem of Calculus would yield $\varphi(1) - \varphi(0) = \int_0^1 \varphi'(x) \, d\mathcal{L}^1(x) = 0$, contradicting $\varphi(1) - \varphi(0) = 1$. Thus $\varphi$ is a concrete witness that the Fundamental Theorem of Calculus fails without absolute continuity — uniform continuity alone (which $\varphi$ satisfies, being continuous on a compact set) is not sufficient.
Property (5) merits attention: $\varphi$ maps a set of measure zero ($\mathcal{C}$) onto all of $[0,1]$. This shows that continuous images of measure-zero sets can have full measure, so the property of having measure zero is not preserved under continuous maps (though it is preserved under Lipschitz maps).
[remark: The Cantor Function and Singular Measures]
The Cantor function $\varphi$ generates a [Borel measure](/page/Measure%20Space) $\mu_\varphi$ on $[0,1]$ via $\mu_\varphi([a,b]) := \varphi(b) - \varphi(a)$. This is the **Cantor measure**. Since $\varphi' = 0$ a.e., the measure $\mu_\varphi$ is singular with respect to Lebesgue measure: $\mu_\varphi \perp \mathcal{L}^1$. Yet $\mu_\varphi$ is continuous (assigns zero mass to every singleton), so it is neither absolutely continuous nor purely atomic. The Cantor measure is the standard example of a **singular continuous measure** — it is supported on the measure-zero set $\mathcal{C}$ while assigning zero mass to each individual point of $\mathcal{C}$.
[/remark]
## Fat Cantor Sets
The Cantor set has Lebesgue measure zero, but this is an artifact of removing intervals whose lengths sum to $1$. By removing smaller intervals at each stage, we can construct compact, perfect, totally disconnected, nowhere dense subsets of $[0,1]$ with any prescribed measure in $(0,1)$. These are called **fat Cantor sets** (or **Smith-Volterra-Cantor sets**).
The motivation for studying fat Cantor sets goes beyond mere curiosity. They demonstrate that topological "smallness" (nowhere density) and measure-theoretic "smallness" (measure zero) are completely independent. A set can be topologically negligible (nowhere dense, and hence a first-category set) while being measure-theoretically substantial. This independence is crucial in real analysis: it shows why the Baire category theorem and Lebesgue measure theory give fundamentally different notions of "most" functions or "most" points.
[definition: Fat Cantor Set]
Let $(\alpha_n)_{n \in \mathbb{N}}$ be a sequence with $0 < \alpha_n < 1$ for all $n$, chosen so that the total length removed is strictly less than $1$. Define a decreasing sequence of compact sets $F_0 \supset F_1 \supset F_2 \supset \cdots$ as follows. Set $F_0 := [0,1]$. Given $F_n$ (a union of $2^n$ closed intervals), define $F_{n+1}$ by removing from each constituent interval of $F_n$ a centrally placed open interval of length $\alpha_{n+1}$ times the length of that constituent interval. The **fat Cantor set** with removal sequence $(\alpha_n)$ is
\begin{align*}
F := \bigcap_{n=0}^{\infty} F_n.
\end{align*}
Its Lebesgue measure is $\mathcal{L}^1(F) = \prod_{n=1}^{\infty}(1 - \alpha_n)$. Since $0 < \alpha_n < 1$ for all $n$, this product is positive if and only if $\sum_{n=1}^{\infty} \alpha_n < \infty$, which is precisely the condition that the total length removed is strictly less than $1$.
[/definition]
[example: Fat Cantor Set of Measure One Half]
Fix a target measure of $1/2$. At stage $n$, remove from each of the $2^{n-1}$ constituent intervals of $F_{n-1}$ a central open interval of length $\ell_n$, where we choose $\ell_n$ so that the total removed length is $1/2$. The simplest choice: at stage $n$, remove a central interval of length $1/4^n$ from each of the $2^{n-1}$ constituent intervals. The total length removed is
\begin{align*}
\sum_{n=1}^{\infty} \frac{2^{n-1}}{4^n} = \frac{1}{4} \sum_{n=0}^{\infty} \left(\frac{1}{2}\right)^n = \frac{1}{4} \cdot 2 = \frac{1}{2}.
\end{align*}
So the resulting fat Cantor set $F$ has $\mathcal{L}^1(F) = 1 - 1/2 = 1/2$. By Brouwer's characterisation theorem, $F$ is homeomorphic to $\mathcal{C}$ (it is readily verified to be compact, metrizable, perfect, and totally disconnected). Yet $\mathcal{L}^1(F) = 1/2$, while $\mathcal{L}^1(\mathcal{C}) = 0$.
[/example]
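The stages of this particular fat Cantor set can be built exactly, and the measure of $F_n$ compared with the closed form $1/2 + 2^{-(n+1)}$ that follows from the partial sums of the series above. A minimal sketch; the names `svc_stage` and `measure` are illustrative.

```python
from fractions import Fraction

def svc_stage(n):
    """Intervals of F_n when stage k removes a central open interval of length 4**-k."""
    intervals = [(Fraction(0), Fraction(1))]
    for k in range(1, n + 1):
        gap = Fraction(1, 4 ** k)
        refined = []
        for a, b in intervals:
            mid = (a + b) / 2
            refined.append((a, mid - gap / 2))   # piece left of the removed gap
            refined.append((mid + gap / 2, b))   # piece right of the removed gap
        intervals = refined
    return intervals

def measure(intervals):
    """Total length of a list of disjoint intervals."""
    return sum(b - a for a, b in intervals)

for n in range(9):
    assert measure(svc_stage(n)) == Fraction(1, 2) + Fraction(1, 2 ** (n + 1))
```

The measures decrease to $1/2$ and never reach zero, in contrast with the $(2/3)^n$ decay for the middle-thirds construction.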
This example shows that homeomorphic subsets of $\mathbb{R}$ can have entirely different Lebesgue measures. In particular, Lebesgue measure is not a topological invariant — it depends on the metric embedding, not just the topology.
[example: Nowhere Dense Yet Positive Measure]
The fat Cantor set $F$ of the previous example has $\mathcal{L}^1(F) = 1/2$ but $\operatorname{int}(F) = \varnothing$. To verify that $F$ has empty interior, note that $F$ is totally disconnected (it contains no interval), so it contains no open set of $\mathbb{R}$. Thus $F$ is a [nowhere dense](/page/Nowhere%20Dense%20Set) set of positive measure. Its complement $[0,1] \setminus F$ is an open dense subset of $[0,1]$ with $\mathcal{L}^1([0,1] \setminus F) = 1/2$ — it is topologically "large" (dense and open) but measure-theoretically only half the interval.
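This produces a striking counterexample: define $g := \mathbb{1}_F$, the indicator function of $F$. Then $g$ is Lebesgue integrable (with $\int_0^1 g \, d\mathcal{L}^1 = 1/2$), and its set of discontinuities is exactly $F$. Indeed, $g$ vanishes identically on the open set $[0,1] \setminus F$ and so is continuous there, while every neighbourhood of a point of $F$ meets the complement because $F$ has empty interior. Since $\mathcal{L}^1(F) = 1/2 > 0$, Lebesgue's criterion for Riemann integrability shows that $g$ is not Riemann integrable.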
[/example]
## Working with the Cantor Set
This section collects the standard techniques and arguments that arise repeatedly when using the Cantor set in analysis and topology.
### Technique 1: Ternary expansion arguments
The most common technique for proving properties of $\mathcal{C}$ is to translate the problem into a statement about sequences in $\{0,2\}^{\mathbb{N}}$. This works because the homeomorphism $\Phi: \{0,2\}^{\mathbb{N}} \to \mathcal{C}$ converts topological properties (convergence, compactness, connectedness) into combinatorial properties of digit sequences.
**Pattern.** To show a property holds for all $x \in \mathcal{C}$: write $x = \sum_{k=1}^{\infty} a_k/3^k$ with $a_k \in \{0,2\}$, then argue using the sequence $(a_k)$. To show two points are separated: find the first digit where they differ and estimate the resulting gap.
[example: Midpoints Outside the Cantor Set]
We show that if $x, y \in \mathcal{C}$ with $x \ne y$, the midpoint $(x+y)/2$ need not belong to $\mathcal{C}$. That is, $\mathcal{C}$ is not closed under the midpoint operation, and in particular it is not **convex**. Some midpoints do survive, so the counterexample has to be chosen with care.
Take $x = 0 = 0.000\ldots_3$ and $y = 2/3 = 0.200\ldots_3$. Both lie in $\mathcal{C}$. Their midpoint is
\begin{align*}
\frac{x + y}{2} = \frac{1}{3} = 0.1000\ldots_3.
\end{align*}
The ternary expansion $0.1000\ldots_3$ has a digit $1$. The alternative expansion is $0.0222\ldots_3$, which avoids $1$, so $1/3 \in \mathcal{C}$. This particular midpoint survives. Now try $x = 0 = 0.000\ldots_3$ and $y = 2/9 = 0.020\ldots_3$. Both lie in $\mathcal{C}$, but
\begin{align*}
\frac{x + y}{2} = \frac{1}{9} = 0.010\ldots_3 = 0.002222\ldots_3.
\end{align*}
The second expansion uses only $\{0,2\}$, so $1/9 \in \mathcal{C}$. For a genuine failure, take $x = 2/9 = 0.020\ldots_3$ and $y = 2/3 = 0.200\ldots_3$:
\begin{align*}
\frac{x + y}{2} = \frac{2/9 + 2/3}{2} = \frac{2/9 + 6/9}{2} = \frac{8/9}{2} = \frac{4}{9} = 0.110\ldots_3.
\end{align*}
Both ternary representations of $4/9$ contain the digit $1$ (indeed $4/9 = 0.1100\ldots_3 = 0.1022\ldots_3$), so $4/9 \notin \mathcal{C}$. The midpoint of two Cantor set elements can lie outside $\mathcal{C}$.
[/example]
### Technique 2: Cantor set as a source of counterexamples
The Cantor set and its variants appear throughout analysis as counterexamples. The standard method is:
1. Identify the property to be disproved (e.g., "continuous images preserve measure zero").
2. Select the appropriate Cantor construction (standard, fat, or the Cantor function).
3. Verify the construction satisfies the hypotheses of the false claim and violates its conclusion.
[example: A Continuous Image of a Measure-Zero Set with Positive Measure]
The Cantor function $\varphi: [0,1] \to [0,1]$ maps $\mathcal{C}$ (measure zero) onto all of $[0,1]$ (measure one). But we can be more precise. Define $g: [0,1] \to [0,2]$ by $g(x) := \varphi(x) + x$. Then $g$ is continuous and strictly increasing (hence injective), so $g$ maps Borel sets to Borel sets. We claim that $g(\mathcal{C})$ has positive Lebesgue measure.
Since $\varphi$ is constant on each connected component of $[0,1] \setminus \mathcal{C}$, the function $g(x) = \varphi(x) + x$ maps each such component (an interval of length $\ell$) to an interval of the same length $\ell$ (because $\varphi$ is constant, the increase comes entirely from the $+x$ term). The total measure of $g([0,1] \setminus \mathcal{C})$ therefore equals $\mathcal{L}^1([0,1] \setminus \mathcal{C}) = 1$. Since $g([0,1])$ is an interval of length $g(1) - g(0) = (\varphi(1) + 1) - (\varphi(0) + 0) = 2$, we find
\begin{align*}
\mathcal{L}^1(g(\mathcal{C})) = \mathcal{L}^1(g([0,1])) - \mathcal{L}^1(g([0,1] \setminus \mathcal{C})) = 2 - 1 = 1.
\end{align*}
So $g$ maps the measure-zero set $\mathcal{C}$ to a set of measure $1$. Since $g$ is a homeomorphism onto its image, this also shows that homeomorphisms can fail to preserve measure-zero sets (in contrast to Lipschitz maps, which do preserve measure zero by the estimate $\mathcal{L}^1(f(E)) \le \operatorname{Lip}(f) \cdot \mathcal{L}^1(E)$).
[/example]
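The measure of $g(\mathcal{C})$ can be approached through the finite stages: since $g$ is increasing and continuous, it maps each interval $[a,b]$ of $C_n$ onto $[g(a), g(b)]$, so $\mathcal{L}^1(g(C_n)) = \sum (g(b) - g(a)) = 1 + (2/3)^n$, which decreases to $1$. The sketch below checks this numerically; all names are illustrative, and the Cantor function is approximated to within $2^{-60}$.

```python
from fractions import Fraction

def cantor_function(x, depth=60):
    """Approximate the Cantor function at x in [0, 1] to within 2**-depth."""
    x = Fraction(x)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        if x < Fraction(1, 3):
            x = 3 * x
        elif x > Fraction(2, 3):
            value += weight
            x = 3 * x - 2
        else:
            return value + weight
        weight /= 2
    return value

def cantor_stage(n):
    """Closed intervals of C_n as (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = [piece
                     for a, b in intervals
                     for piece in ((a, a + (b - a) / 3), (b - (b - a) / 3, b))]
    return intervals

def g(x):
    return cantor_function(x) + x

def image_measure(n):
    """Measure of g(C_n): each interval [a, b] maps onto [g(a), g(b)]."""
    return sum(g(b) - g(a) for a, b in cantor_stage(n))

tolerance = Fraction(1, 2 ** 50)
for n in range(8):
    assert abs(image_measure(n) - (1 + Fraction(2, 3) ** n)) < tolerance
```

The term $(2/3)^n$ is the contribution of the $+x$ part on $C_n$ and vanishes in the limit, while the contribution of $\varphi$ stays equal to $1$ at every stage.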
### Technique 3: Using Brouwer's theorem to identify Cantor sets
Whenever a construction produces a compact, perfect, totally disconnected metrizable space, Brouwer's characterisation theorem guarantees it is homeomorphic to $\mathcal{C}$. This technique is especially useful in dynamics, where intersections of nested families of "strips" naturally produce Cantor sets.
**Pattern.** To show $X \cong \mathcal{C}$: verify (i) $X$ is nonempty, compact, and metrizable (usually as a closed subset of a compact metric space, or a closed bounded subset of $\mathbb{R}^n$), (ii) $X$ has no isolated points (perfectness), and (iii) $X$ is totally disconnected (no connected component has more than one point). Then invoke Brouwer's theorem.
### Technique 4: Arithmetic operations on the Cantor set
The ternary expansion characterisation makes it possible to determine which arithmetic combinations of Cantor set elements belong to particular sets. A striking result of this type is that $\mathcal{C} + \mathcal{C} = [0,2]$ — the sumset of the Cantor set with itself covers the entire interval $[0,2]$.
[quotetheorem:1205]
This result is initially surprising: a set of measure zero, summed with itself, produces an interval. The proof uses ternary expansions directly. The inclusion $\mathcal{C} + \mathcal{C} \subseteq [0,2]$ is immediate. Conversely, given $z \in [0,2]$, write $z/2 \in [0,1]$ in base $3$ as $z/2 = \sum_{k=1}^{\infty} d_k / 3^k$ with $d_k \in \{0, 1, 2\}$. Then $z = \sum_{k=1}^{\infty} c_k / 3^k$ with $c_k := 2 d_k \in \{0, 2, 4\}$. For each $k$, decompose $c_k = a_k + b_k$ with $a_k, b_k \in \{0, 2\}$:
\begin{align*}
c_k = 0 &\implies a_k = 0,\; b_k = 0, \\
c_k = 2 &\implies a_k = 0,\; b_k = 2 \;\text{ or }\; a_k = 2,\; b_k = 0, \\
c_k = 4 &\implies a_k = 2,\; b_k = 2.
\end{align*}
Setting $x := \sum_{k=1}^{\infty} a_k / 3^k$ and $y := \sum_{k=1}^{\infty} b_k / 3^k$ gives $x, y \in \mathcal{C}$ with $x + y = z$. Expanding $z/2$ rather than $z$ is what avoids the odd digits $c_k \in \{1, 3\}$, which cannot be split into two digits from $\{0,2\}$ and would otherwise require carrying.
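One way to sidestep carrying altogether is to expand $z/2$ in base $3$ and double its digits, which produces only the even digits $0$, $2$, $4$. The sketch below applies this to a rational $z$ and returns truncated summands; the name `decompose` and the parameter `n_digits` are assumptions made for illustration.

```python
from fractions import Fraction

def decompose(z, n_digits=40):
    """Split z in [0, 2] as x + y, with x and y built from ternary digits in {0, 2}.

    Returns truncations (x_n, y_n) with 0 <= z - (x_n + y_n) <= 2 * 3**-n_digits.
    """
    t = Fraction(z) / 2                 # t lies in [0, 1]
    x = y = Fraction(0)
    for k in range(1, n_digits + 1):
        d = min(2, int(3 * t))          # next ternary digit of t (t = 1 yields 0.222...)
        t = 3 * t - d
        # 2*d is 0, 2 or 4: x receives a digit 2 when d >= 1, y when d == 2
        a = 2 if d >= 1 else 0
        b = 2 if d == 2 else 0
        x += Fraction(a, 3 ** k)
        y += Fraction(b, 3 ** k)
    return x, y

x, y = decompose(Fraction(7, 5))
assert 0 <= Fraction(7, 5) - (x + y) <= 2 * Fraction(1, 3 ** 40)
```

Because the truncations have finite expansions with digits in $\{0,2\}$, they are themselves points of $\mathcal{C}$, and letting the number of digits grow recovers the exact decomposition.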
This result is relevant to the theory of convolutions: the convolution $\mu_\varphi * \mu_\varphi$ of the Cantor measure with itself has support $\mathcal{C} + \mathcal{C} = [0,2]$, a full interval. Full support does not make the convolution absolutely continuous, however: $\mu_\varphi * \mu_\varphi$ is still singular with respect to Lebesgue measure.
## Hausdorff Dimension
The Cantor set has Lebesgue measure zero, so it is "smaller than one-dimensional" in a measure-theoretic sense. But how much smaller? The Cantor set is uncountable, so it is "larger than zero-dimensional" in a cardinality sense (finite and countable sets have Hausdorff dimension $0$). The Hausdorff dimension provides a fractional notion of dimension that quantifies exactly where $\mathcal{C}$ falls between dimension $0$ and dimension $1$.
[quotetheorem:1206]
The value $\log 2 / \log 3$ arises from the self-similarity of $\mathcal{C}$. The Cantor set is the union of two copies of itself, each scaled by the factor $1/3$:
\begin{align*}
\mathcal{C} = \frac{1}{3}\mathcal{C} \cup \left(\frac{2}{3} + \frac{1}{3}\mathcal{C}\right).
\end{align*}
If we write $d = \dim_H(\mathcal{C})$ and use the heuristic that $\mathcal{H}^d$ scales as the $d$-th power of the scaling factor, then $\mathcal{H}^d(\mathcal{C}) = 2 \cdot (1/3)^d \cdot \mathcal{H}^d(\mathcal{C})$. Provided $0 < \mathcal{H}^d(\mathcal{C}) < \infty$, we may cancel this factor to obtain $1 = 2 \cdot 3^{-d}$ and hence $d = \log 2 / \log 3$. Making this rigorous requires the theory of self-similar sets and the associated dimension formulas (the **Moran-Hutchinson theorem**), which justify the heuristic for iterated function systems satisfying the open set condition.
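The scaling equation $\sum_i r_i^d = 1$ can be solved numerically for any list of contraction ratios. The sketch below uses bisection; the function name `similarity_dimension` is illustrative, and the value it returns equals the Hausdorff dimension only under hypotheses such as the open set condition.

```python
import math

def similarity_dimension(ratios, tol=1e-12):
    """Solve sum(r**d for r in ratios) == 1 for d by bisection."""
    def excess(d):
        return sum(r ** d for r in ratios) - 1.0

    lo, hi = 0.0, 1.0
    while excess(hi) > 0:      # widen the bracket until the root is enclosed
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            lo = mid           # sum still exceeds 1, so the root is larger
        else:
            hi = mid
    return (lo + hi) / 2

d = similarity_dimension([1 / 3, 1 / 3])
assert abs(d - math.log(2) / math.log(3)) < 1e-9
print(round(d, 4))
```

For two maps of ratio $1/3$ the solver reproduces $\log 2 / \log 3 \approx 0.6309$; for two maps of ratio $1/2$ it returns $1$, the dimension of the interval.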
The Hausdorff dimension $\log 2/\log 3$ lies strictly between $0$ and $1$. This places $\mathcal{C}$ in the realm of **fractals** — sets whose Hausdorff dimension is not an integer. The Cantor set is often cited as the simplest example of a fractal, though the term "fractal" has no universally accepted formal definition.
## References
- Folland, G. B., *Real Analysis: Modern Techniques and Their Applications*, 2nd edition (1999).
- Royden, H. L. and Fitzpatrick, P. M., *Real Analysis*, 4th edition (2010).
- Falconer, K. J., *Fractal Geometry: Mathematical Foundations and Applications*, 3rd edition (2014).
- Munkres, J. R., *Topology*, 2nd edition (2000).
- Oxtoby, J. C., *Measure and Category*, 2nd edition (1980).
- Mattila, P., *Geometry of Sets and Measures in Euclidean Spaces* (1995).