[proofplan]
We construct a binary tree of non-empty closed subsets $(A_s)_{s \in \{0,1\}^*}$ of $K$ with $A_\varnothing = K$, $A_s = A_{s0} \cup A_{s1}$, and diameters that tend to zero along every infinite branch. Total boundedness of the compact metric $K$ provides, at each level, a finite cover of every closed subset by small-diameter closed pieces; we package these pieces into a binary tree by inserting a finite-depth padding subtree at each node. The map $\varphi : \Delta \to K$ sends an infinite binary string $\varepsilon$ to the limit of $(x_{\varepsilon|_n})$ where $x_s \in A_s$ are chosen representatives; the limit exists by the Cauchy property forced by shrinking diameters and completeness of $K$. Continuity follows because two strings agreeing on a long prefix produce limits in the same small set $A_{\varepsilon|_n}$. Surjectivity follows by inductively descending the tree to track any target point $x \in K$.
[/proofplan]
[step:Build a binary tree of non-empty closed subsets of $K$ with diameters shrinking along every branch]
Let $\{0,1\}^* := \bigcup_{n=0}^\infty \{0,1\}^n$ denote the set of finite binary strings, including the empty string $\varnothing \in \{0,1\}^0$. For $s \in \{0,1\}^n$, write $|s| := n$, and for $i \in \{0,1\}$, write $si \in \{0,1\}^{n+1}$ for the concatenation. For $\varepsilon \in \Delta$ and $n \ge 0$, write $\varepsilon|_n := (\varepsilon_1, \ldots, \varepsilon_n) \in \{0,1\}^n$ for the truncation; for $n = 0$, $\varepsilon|_0 = \varnothing$.
We construct, by induction on $n$, a family $(A_s)_{s \in \{0,1\}^*}$ of non-empty closed subsets of $K$ satisfying:
(P1) $A_\varnothing = K$.
(P2) $A_s = A_{s0} \cup A_{s1}$ for every $s \in \{0,1\}^*$ (so $A_{si} \subseteq A_s$ for $i \in \{0,1\}$).
(P3) For every $\varepsilon \in \Delta$, $\operatorname{diam}(A_{\varepsilon|_n}) \to 0$ as $n \to \infty$.
The construction proceeds in **blocks**. We define a strictly increasing sequence of "block depths" $0 = n_0 < n_1 < n_2 < \cdots$ along each branch, such that at each block depth $n_k$ a fresh application of total boundedness is performed and the resulting cover is distributed across a finite-depth binary subtree. Between block depths, the tree merely encodes which leaf of the current padding subtree we are heading toward.
*Notation for the construction.* Since $K$ is a compact metric space, by the [Equivalence of Compactness and Total Boundedness with Completeness in Metric Spaces](/theorems/???), $K$ is totally bounded; subsets of totally bounded spaces are totally bounded; hence every non-empty closed (and so compact) subset $A \subseteq K$ admits, for every $r > 0$, a finite cover $A \subseteq \bigcup_{j=1}^N \overline{B}(y_j, r)$ with $y_j \in A$.
*Block construction.* Suppose, by recursion on $k \ge 0$, that we have defined $A_t$ for all $t \in \{0,1\}^{n_k}$ as non-empty closed subsets of $K$, with $\operatorname{diam}(A_t) \le 2^{-n_k}$ for every $t \in \{0,1\}^{n_k}$ (with the convention that the bound $2^{-0} = 1$ at $k = 0$ is vacuous). For the base $k = 0$: set $n_0 := 0$ and $A_\varnothing := K$. (No diameter bound is required.)
Fix $t \in \{0,1\}^{n_k}$. Apply total boundedness to $A_t$ at radius $r := 2^{-(n_k + 2)}$: there exist $y_1^{(t)}, \ldots, y_{N_t}^{(t)} \in A_t$ with
\begin{align*}
A_t = \bigcup_{j=1}^{N_t} D_j^{(t)}, \qquad D_j^{(t)} := A_t \cap \overline{B}(y_j^{(t)}, 2^{-(n_k + 2)}).
\end{align*}
Each $D_j^{(t)}$ is closed in $K$ (intersection of closed sets); each is non-empty since $y_j^{(t)} \in D_j^{(t)}$; the union covers $A_t$ since the open balls $B(y_j^{(t)}, 2^{-(n_k + 2)})$ cover $A_t$ and $B(y_j^{(t)}, r) \subseteq \overline{B}(y_j^{(t)}, r)$. Each piece has
\begin{align*}
\operatorname{diam}(D_j^{(t)}) \le \operatorname{diam}(\overline{B}(y_j^{(t)}, 2^{-(n_k + 2)})) \le 2 \cdot 2^{-(n_k + 2)} = 2^{-(n_k + 1)}.
\end{align*}
Now choose any integer $\kappa_t \ge 1$ with $2^{\kappa_t} \ge N_t$ (concretely, $\kappa_t := \max(1, \lceil \log_2 N_t \rceil)$; the lower bound $\kappa_t \ge 1$ ensures strict block-depth increase even when $N_t = 1$). Pad the cover to length $2^{\kappa_t}$ by repetition: for $1 \le j \le N_t$, set $\tilde{D}_j^{(t)} := D_j^{(t)}$, and for $N_t < j \le 2^{\kappa_t}$, set $\tilde{D}_j^{(t)} := D_1^{(t)}$. Then $A_t = \bigcup_{j=1}^{2^{\kappa_t}} \tilde{D}_j^{(t)}$ still holds, each $\tilde{D}_j^{(t)}$ is non-empty closed with $\operatorname{diam}(\tilde{D}_j^{(t)}) \le 2^{-(n_k + 1)}$.
Index the $2^{\kappa_t}$ leaves of a depth-$\kappa_t$ binary subtree rooted at $t$ by $\{0,1\}^{\kappa_t}$. For each $u = (u_1, \ldots, u_{\kappa_t}) \in \{0,1\}^{\kappa_t}$, set
\begin{align*}
A_{tu} := \tilde{D}_{1 + \mathrm{bin}(u)}^{(t)},
\end{align*}
where $\mathrm{bin}(u) := \sum_{i=1}^{\kappa_t} u_i\, 2^{\kappa_t - i} \in \{0, 1, \ldots, 2^{\kappa_t} - 1\}$ is the integer encoded in binary by $u$ (most significant bit first).
For internal nodes of the subtree, that is, for $u \in \{0,1\}^\ell$ with $0 < \ell < \kappa_t$, define
\begin{align*}
A_{tu} := A_{tu0} \cup A_{tu1}.
\end{align*}
By induction on $\ell$ from $\ell = \kappa_t - 1$ down to $\ell = 1$, each $A_{tu}$ is a non-empty finite union of closed sets, hence non-empty and closed. Moreover, by induction
\begin{align*}
A_{tu} = \bigcup_{v \in \{0,1\}^{\kappa_t - \ell}} A_{tuv} = \bigcup_{v \in \{0,1\}^{\kappa_t - \ell}} \tilde{D}_{1 + \mathrm{bin}(uv)}^{(t)}.
\end{align*}
At $\ell = 0$ this gives $A_t = \bigcup_{j=1}^{2^{\kappa_t}} \tilde{D}_j^{(t)}$, consistent with the cover we started from. The decomposition (P2) holds at every node of the padding subtree by construction.
Define the next block depth $n_{k+1} := n_k + \kappa_t$. (Note: $\kappa_t$ depends on $t$, hence so does $n_{k+1}$; the block depths form a sequence indexed by branches, not a single sequence. We address this below.) The leaves $A_{tu}$ for $u \in \{0,1\}^{\kappa_t}$ have diameter $\le 2^{-(n_k + 1)} \le 2^{-n_{k+1}}$ (since $\kappa_t \ge 1$). This completes the construction of the block of depth $\kappa_t$ rooted at $t$, and the inductive hypothesis is preserved at each leaf of this subtree.
*Branch-dependent block depths.* For each $\varepsilon \in \Delta$, the block depths $0 = n_0(\varepsilon) < n_1(\varepsilon) < n_2(\varepsilon) < \cdots$ are determined by the construction: $n_{k+1}(\varepsilon) = n_k(\varepsilon) + \kappa_{\varepsilon|_{n_k(\varepsilon)}}$. Since each $\kappa_t \ge 1$, this is a strictly increasing sequence of non-negative integers, hence $n_k(\varepsilon) \to \infty$ as $k \to \infty$ (every strictly increasing sequence of integers is unbounded).
This completes the recursive construction of $(A_s)_{s \in \{0,1\}^*}$. Properties (P1) and (P2) are immediate from the construction. We verify (P3) as a separate claim.
[claim:Diameters shrink along every infinite branch]
For every $\varepsilon \in \Delta$, $\operatorname{diam}(A_{\varepsilon|_n}) \to 0$ as $n \to \infty$.
[/claim]
[proof]
Fix $\varepsilon \in \Delta$. Let $0 = n_0 < n_1 < n_2 < \cdots$ be the block depths along $\varepsilon$ (so $n_k = n_k(\varepsilon)$). For each $k \ge 0$, $A_{\varepsilon|_{n_{k+1}}}$ is a leaf of the padding subtree rooted at $\varepsilon|_{n_k}$, and we showed
\begin{align*}
\operatorname{diam}(A_{\varepsilon|_{n_{k+1}}}) \le 2^{-(n_k + 1)}.
\end{align*}
By (P2), $A_{\varepsilon|_{n+1}} \subseteq A_{\varepsilon|_n}$ for every $n \ge 0$, so $\operatorname{diam}(A_{\varepsilon|_n})$ is non-increasing in $n$. Given $\delta > 0$, choose $K$ with $2^{-(n_K + 1)} < \delta$ (possible since $n_K \to \infty$). For all $n \ge n_{K+1}$, monotonicity gives
\begin{align*}
\operatorname{diam}(A_{\varepsilon|_n}) \le \operatorname{diam}(A_{\varepsilon|_{n_{K+1}}}) \le 2^{-(n_K + 1)} < \delta.
\end{align*}
Hence $\operatorname{diam}(A_{\varepsilon|_n}) \to 0$.
[/proof]
For each $s \in \{0,1\}^*$, choose a point $x_s \in A_s$. The index set $\{0,1\}^*$ is countable and each $A_s$ is non-empty, so this selection exists by the [Axiom of Countable Choice](/theorems/???); the choices are independent across the countable index set, so dependent choice is not needed.
[guided]
The construction realises $K$ as the leaves-at-infinity of an infinitely branching tree of small closed subsets. The analytical content is total boundedness, which provides finite covers; the bookkeeping converts the finite covers into a binary tree.
**Why a binary tree.** The Cantor set $\Delta = \{0,1\}^{\mathbb{N}}$ is itself the leaves-at-infinity of the complete binary tree $\{0,1\}^*$. So a continuous surjection $\Delta \to K$ corresponds to a "binary tree decomposition" of $K$ where each node $A_s$ is the union of its two children $A_{s0}, A_{s1}$, and diameters shrink along every branch.
**Why a padding subtree.** Total boundedness gives a cover of $A_s$ by $N_s$ small pieces, where $N_s$ depends on $A_s$ and may be any positive integer. To package $N_s$ pieces into a binary tree, we pad the cover to length $2^{\kappa_s}$ (by repeating $D_1^{(s)}$ if necessary) and assign one piece to each leaf of a depth-$\kappa_s$ binary subtree. Internal nodes of this subtree are unions of leaf pieces.
**Why we settle for "diameter shrinks along every branch" rather than "diameter shrinks at every node."** The internal nodes of the padding subtree at depths $n_k + 1, \ldots, n_{k+1} - 1$ are unions of small pieces but the unions themselves can have large diameter. However, every infinite branch of the tree passes from block depth $n_k$ to block depth $n_{k+1}$ in $\kappa_{\varepsilon|_{n_k}}$ steps, and at the new block depth lands in a leaf of diameter $\le 2^{-(n_k + 1)}$. Since block depths $n_k \to \infty$ along every branch, the diameter goes to zero.
**Verification of (P3).** Fix $\varepsilon$ and let $n_k = n_k(\varepsilon)$. We have $\operatorname{diam}(A_{\varepsilon|_{n_{k+1}}}) \le 2^{-(n_k + 1)}$ by the leaf bound, and $n_k \to \infty$ since the block-depth sequence is strictly increasing in $\mathbb{Z}_{\ge 0}$. Together with monotonicity ($A_{\varepsilon|_{n+1}} \subseteq A_{\varepsilon|_n}$ from (P2)), this gives $\operatorname{diam}(A_{\varepsilon|_n}) \to 0$.
**Choice principle.** The selection $x_s \in A_s$ for $s \in \{0,1\}^*$ is a choice from a countable family of non-empty sets, hence valid by the [Axiom of Countable Choice](/theorems/???). The choices are independent across the index set, so dependent choice is not needed.
[/guided]
[/step]
[step:Define $\varphi : \Delta \to K$ as the limit along branches]
For $\varepsilon \in \Delta$, consider the sequence $(x_{\varepsilon|_n})_{n=0}^\infty$ in $K$.
[claim:$(x_{\varepsilon|_n})_n$ is Cauchy in $K$]
For every $\varepsilon \in \Delta$, $(x_{\varepsilon|_n})_n$ is Cauchy in the metric of $K$.
[/claim]
[proof]
By (P2), $A_{si} \subseteq A_s$ for each $s \in \{0,1\}^*$ and $i \in \{0,1\}$. Iterating, $A_{\varepsilon|_m} \subseteq A_{\varepsilon|_n}$ for all $m \ge n$. Hence $x_{\varepsilon|_m} \in A_{\varepsilon|_m} \subseteq A_{\varepsilon|_n}$ for $m \ge n$, and $x_{\varepsilon|_n} \in A_{\varepsilon|_n}$, so
\begin{align*}
d(x_{\varepsilon|_n}, x_{\varepsilon|_m}) \le \operatorname{diam}(A_{\varepsilon|_n}) \quad \text{for all } m \ge n.
\end{align*}
By (P3), $\operatorname{diam}(A_{\varepsilon|_n}) \to 0$. Given $\delta > 0$, choose $N$ with $\operatorname{diam}(A_{\varepsilon|_n}) < \delta$ for all $n \ge N$. Then $d(x_{\varepsilon|_n}, x_{\varepsilon|_m}) < \delta$ for $m \ge n \ge N$. Thus $(x_{\varepsilon|_n})_n$ is Cauchy.
[/proof]
By the [Equivalence of Compactness and Total Boundedness with Completeness in Metric Spaces](/theorems/???), $K$ is complete. Hence the Cauchy sequence $(x_{\varepsilon|_n})_n$ converges in $K$. Define
\begin{align*}
\varphi : \Delta &\to K, \\
\varepsilon &\mapsto \lim_{n \to \infty} x_{\varepsilon|_n}.
\end{align*}
The limit is independent of the choice of representatives $x_s$: if $(x_s')_{s \in \{0,1\}^*}$ is another selection with $x_s' \in A_s$, then $d(x_{\varepsilon|_n}, x_{\varepsilon|_n}') \le \operatorname{diam}(A_{\varepsilon|_n}) \to 0$, so $\lim_n x_{\varepsilon|_n} = \lim_n x_{\varepsilon|_n}'$. Thus $\varphi$ depends only on the tree $(A_s)$, not on the chosen representatives.
[/step]
[step:Verify $\varphi(\varepsilon) \in A_{\varepsilon|_n}$ for every $n \ge 0$]
Fix $\varepsilon \in \Delta$ and $n \ge 0$. For all $m \ge n$, $x_{\varepsilon|_m} \in A_{\varepsilon|_m} \subseteq A_{\varepsilon|_n}$ by (P2) and induction. Since $A_{\varepsilon|_n}$ is closed in $K$ and the tail sequence $(x_{\varepsilon|_m})_{m \ge n}$ lies in $A_{\varepsilon|_n}$ with limit $\varphi(\varepsilon)$, we conclude $\varphi(\varepsilon) \in A_{\varepsilon|_n}$.
In particular, $\varphi(\varepsilon) \in \bigcap_{n=0}^\infty A_{\varepsilon|_n}$.
[/step]
[step:Prove $\varphi$ is continuous]
The product topology on $\Delta = \{0,1\}^{\mathbb{N}}$ has a basis of cylinder sets
\begin{align*}
[\eta] := \{\varepsilon \in \Delta : \varepsilon|_{|\eta|} = \eta\}, \qquad \eta \in \{0,1\}^*.
\end{align*}
Equivalently, the metric
\begin{align*}
d_\Delta(\varepsilon, \delta) := \begin{cases} 0 & \text{if } \varepsilon = \delta, \\ 2^{-\inf\{n \ge 1 : \varepsilon_n \neq \delta_n\}} & \text{if } \varepsilon \neq \delta, \end{cases}
\end{align*}
generates the same topology.
Fix $\varepsilon \in \Delta$ and $\delta > 0$. By (P3), choose $n \ge 1$ with $\operatorname{diam}(A_{\varepsilon|_n}) < \delta$. The cylinder $[\varepsilon|_n]$ is an open neighbourhood of $\varepsilon$ in $\Delta$. For any $\eta \in [\varepsilon|_n]$, $\eta|_n = \varepsilon|_n$, so $A_{\eta|_n} = A_{\varepsilon|_n}$. By the previous step, $\varphi(\eta) \in A_{\eta|_n} = A_{\varepsilon|_n}$ and $\varphi(\varepsilon) \in A_{\varepsilon|_n}$. Therefore
\begin{align*}
d(\varphi(\varepsilon), \varphi(\eta)) \le \operatorname{diam}(A_{\varepsilon|_n}) < \delta.
\end{align*}
This is continuity of $\varphi$ at $\varepsilon$. Since $\varepsilon \in \Delta$ was arbitrary, $\varphi$ is continuous on $\Delta$.
[guided]
The product topology on $\Delta$ identifies "close strings" with "strings agreeing on a long initial segment." A basic open neighbourhood of $\varepsilon$ at scale $2^{-n}$ in the metric $d_\Delta$ is exactly the cylinder $[\varepsilon|_n] = \{\eta \in \Delta : \eta|_n = \varepsilon|_n\}$.
To prove continuity at $\varepsilon$, given $\delta > 0$, we want a cylinder $[\varepsilon|_n]$ whose $\varphi$-image has diameter $< \delta$. Property (P3) guarantees such an $n$: pick $n \ge 1$ with $\operatorname{diam}(A_{\varepsilon|_n}) < \delta$.
For any $\eta \in [\varepsilon|_n]$, $\eta|_n = \varepsilon|_n$, hence $A_{\eta|_n} = A_{\varepsilon|_n}$. By the previous step (applied to both $\varepsilon$ and $\eta$), $\varphi(\varepsilon), \varphi(\eta) \in A_{\varepsilon|_n}$, so the distance between them is bounded by the diameter of this common containing set:
\begin{align*}
d(\varphi(\varepsilon), \varphi(\eta)) \le \operatorname{diam}(A_{\varepsilon|_n}) < \delta.
\end{align*}
The cylinder $[\varepsilon|_n]$ is the required neighbourhood. Since $\varepsilon$ was arbitrary, $\varphi$ is continuous on all of $\Delta$. The argument is uniform in the sense that the modulus of continuity at $\varepsilon$ depends on $\varepsilon$ only through the diameters $\operatorname{diam}(A_{\varepsilon|_n})$.
[/guided]
[/step]
[step:Prove $\varphi$ is surjective]
Fix $x \in K$. We construct $\varepsilon = (\varepsilon_n) \in \Delta$ with $\varphi(\varepsilon) = x$ by inductively choosing $\varepsilon_n$ so that $x \in A_{\varepsilon|_n}$ for every $n \ge 0$.
*Base case.* $x \in A_\varnothing = K$ by (P1) and the hypothesis $x \in K$.
*Inductive step.* Suppose $\varepsilon_1, \ldots, \varepsilon_n$ have been chosen with $x \in A_{\varepsilon|_n}$. By (P2), $A_{\varepsilon|_n} = A_{\varepsilon|_n 0} \cup A_{\varepsilon|_n 1}$, so $x \in A_{\varepsilon|_n 0}$ or $x \in A_{\varepsilon|_n 1}$ (or both). Define $\varepsilon_{n+1} := 0$ if $x \in A_{\varepsilon|_n 0}$, and $\varepsilon_{n+1} := 1$ otherwise. Then $x \in A_{\varepsilon|_n \varepsilon_{n+1}} = A_{\varepsilon|_{n+1}}$. The rule is deterministic, so no choice axiom is invoked.
This produces $\varepsilon \in \Delta$ with $x \in A_{\varepsilon|_n}$ for every $n \ge 0$. By the third step, $\varphi(\varepsilon) \in A_{\varepsilon|_n}$ for every $n \ge 0$. Hence both $x$ and $\varphi(\varepsilon)$ lie in $A_{\varepsilon|_n}$, so
\begin{align*}
d(x, \varphi(\varepsilon)) \le \operatorname{diam}(A_{\varepsilon|_n}).
\end{align*}
By (P3), $\operatorname{diam}(A_{\varepsilon|_n}) \to 0$, so $d(x, \varphi(\varepsilon)) = 0$, i.e., $\varphi(\varepsilon) = x$.
Since $x \in K$ was arbitrary, $\varphi$ is surjective.
[/step]
[step:Conclude]
The map $\varphi : \Delta \to K$ defined in the second step is continuous (by the fourth step) and surjective (by the fifth step). This produces the continuous surjection asserted in the theorem statement.
[/step]