[proofplan]
We prove both implications separately. For $(1) \Rightarrow (2)$, given a sequence in a totally bounded set, we apply a pigeonhole-and-refine argument: at each stage $m$, we use a finite $1/m$-net to extract a subsequence confined to a single ball, then take a diagonal subsequence. For $(2) \Rightarrow (1)$, we argue by contraposition: if $A$ is not totally bounded, a greedy construction produces a sequence with pairwise distances bounded below, from which no Cauchy subsequence can be extracted.
[/proofplan]
[step:$(1) \Rightarrow (2)$: Extract successive subsequences using finite $1/m$-nets]
Assume $A$ is totally bounded. Let $\{a_k\}_{k=1}^\infty$ be a sequence in $A$. We construct, for each $m \in \mathbb{N}$, a subsequence $\{a_k^{(m)}\}_{k=1}^\infty$ of $\{a_k^{(m-1)}\}_{k=1}^\infty$ (where $\{a_k^{(0)}\}_{k=1}^\infty := \{a_k\}_{k=1}^\infty$) such that all terms of $\{a_k^{(m)}\}_{k=1}^\infty$ lie within a single ball of radius $1/m$.
**Construction at stage $m$.** Since $A$ is totally bounded, there exist finitely many points $x_1, \ldots, x_{N_m} \in M$ with $A \subset \bigcup_{i=1}^{N_m} B(x_i, 1/m)$. The sequence $\{a_k^{(m-1)}\}_{k=1}^\infty$ is infinite and takes values in $A \subset \bigcup_{i=1}^{N_m} B(x_i, 1/m)$. Since the union is finite, by the pigeonhole principle, at least one ball $B(x_{i_m}, 1/m)$ contains infinitely many terms. Define $\{a_k^{(m)}\}_{k=1}^\infty$ to be the subsequence of $\{a_k^{(m-1)}\}_{k=1}^\infty$ consisting of those terms lying in $B(x_{i_m}, 1/m)$.
By construction, for any two indices $j, k$, the triangle inequality gives
\begin{align*}
d(a_j^{(m)}, a_k^{(m)}) \le d(a_j^{(m)}, x_{i_m}) + d(x_{i_m}, a_k^{(m)}) < \frac{1}{m} + \frac{1}{m} = \frac{2}{m}.
\end{align*}
[guided]
The idea is to "zoom in" at finer and finer scales. At scale $1/m$, total boundedness guarantees a finite covering; since an infinite sequence meets finitely many sets, some set must capture infinitely many terms. We pass to the subsequence living in that single ball, then repeat at scale $1/(m+1)$.
Formally, assume $A$ is totally bounded. Let $\{a_k\}_{k=1}^\infty$ be any sequence in $A$. We recursively define a nested family of infinite subsequences: set $\{a_k^{(0)}\}_{k=1}^\infty := \{a_k\}_{k=1}^\infty$.
At stage $m \ge 1$, since $A$ is totally bounded, there exists a finite set $\{x_1, \ldots, x_{N_m}\} \subset M$ with $A \subset \bigcup_{i=1}^{N_m} B(x_i, 1/m)$. Every term of the infinite sequence $\{a_k^{(m-1)}\}_{k=1}^\infty$ lies in $A$, hence in at least one of the $N_m$ balls. Since there are infinitely many terms and only $N_m$ balls, the pigeonhole principle guarantees at least one ball --- call it $B(x_{i_m}, 1/m)$ --- contains infinitely many terms. We define $\{a_k^{(m)}\}_{k=1}^\infty$ as the subsequence of $\{a_k^{(m-1)}\}_{k=1}^\infty$ consisting of precisely those terms in $B(x_{i_m}, 1/m)$.
Two properties are immediate. First, $\{a_k^{(m)}\}_{k=1}^\infty$ is a subsequence of $\{a_k^{(m-1)}\}_{k=1}^\infty$ and hence of the original sequence. Second, for any indices $j, k$, both $a_j^{(m)}$ and $a_k^{(m)}$ lie in $B(x_{i_m}, 1/m)$, so the triangle inequality gives
\begin{align*}
d(a_j^{(m)}, a_k^{(m)}) \le d(a_j^{(m)}, x_{i_m}) + d(x_{i_m}, a_k^{(m)}) < \frac{1}{m} + \frac{1}{m} = \frac{2}{m}.
\end{align*}
Why do we need this nested construction rather than a single application of the pigeonhole principle? Because a single application at scale $\varepsilon$ only guarantees that the subsequence terms are within $2\varepsilon$ of each other --- not that they form a Cauchy sequence. We need the diameter to shrink as the scale refines, which requires iterating over all $m$.
[/guided]
[/step]
[step:Extract a diagonal subsequence that is Cauchy]
Define the diagonal subsequence $b_m := a_m^{(m)}$ for each $m \in \mathbb{N}$. Since $\{a_k^{(m)}\}_{k=1}^\infty$ is a subsequence of $\{a_k^{(m-1)}\}_{k=1}^\infty$ for every $m$, the tail $\{b_m, b_{m+1}, b_{m+2}, \ldots\}$ is a subsequence of $\{a_k^{(m)}\}_{k=1}^\infty$: for $\ell \ge m$, the term $b_\ell = a_\ell^{(\ell)}$ belongs to $\{a_k^{(\ell)}\}_{k=1}^\infty$, which is a subsequence of $\{a_k^{(m)}\}_{k=1}^\infty$. In particular, $b_\ell \in B(x_{i_m}, 1/m)$ for all $\ell \ge m$.
We verify the Cauchy property. Let $\varepsilon > 0$. Choose $m_0 \in \mathbb{N}$ with $2/m_0 < \varepsilon$. For all $j, \ell \ge m_0$, both $b_j$ and $b_\ell$ lie in $B(x_{i_{m_0}}, 1/m_0)$, so
\begin{align*}
d(b_j, b_\ell) < \frac{2}{m_0} < \varepsilon.
\end{align*}
Hence $\{b_m\}_{m=1}^\infty$ is a Cauchy subsequence of $\{a_k\}_{k=1}^\infty$.
[guided]
The diagonal trick is a standard device in analysis: from a doubly indexed family of subsequences $\{a_k^{(m)}\}$, we extract a single subsequence that eventually sits inside each of them.
Define $b_m := a_m^{(m)}$. We must verify two things: that $\{b_m\}$ is indeed a subsequence of the original sequence $\{a_k\}$, and that it is Cauchy.
**Subsequence property.** By construction, $\{a_k^{(m)}\}$ is a subsequence of $\{a_k^{(m-1)}\}$ for each $m$. The nesting means that the sequence $\{a_k^{(\ell)}\}$ is a subsequence of $\{a_k^{(m)}\}$ whenever $\ell \ge m$. Since $b_\ell = a_\ell^{(\ell)}$ is the $\ell$-th term of $\{a_k^{(\ell)}\}$, it is a term of $\{a_k^{(m)}\}$ for all $m \le \ell$. In particular, each $b_m$ is a term of the original sequence, and the indices can be verified to be strictly increasing by tracking the enumeration through the nested subsequences. Hence $\{b_m\}$ is a subsequence of $\{a_k\}$.
**Cauchy property.** The key observation is that for $\ell \ge m$, the term $b_\ell$ lies in $B(x_{i_m}, 1/m)$ (since $b_\ell$ belongs to $\{a_k^{(m)}\}$, all of whose terms lie in that ball). Fix $\varepsilon > 0$ and choose $m_0 \in \mathbb{N}$ with $2/m_0 < \varepsilon$ (possible by the Archimedean property of $\mathbb{R}$). For all $j, \ell \ge m_0$, both $b_j$ and $b_\ell$ lie in $B(x_{i_{m_0}}, 1/m_0)$, so
\begin{align*}
d(b_j, b_\ell) \le d(b_j, x_{i_{m_0}}) + d(x_{i_{m_0}}, b_\ell) < \frac{1}{m_0} + \frac{1}{m_0} = \frac{2}{m_0} < \varepsilon.
\end{align*}
This establishes that $\{b_m\}$ is Cauchy, completing the proof that $(1) \Rightarrow (2)$.
[/guided]
[/step]
[step:$(2) \Rightarrow (1)$: Construct an $\varepsilon_0$-separated sequence when total boundedness fails]
We prove the contrapositive: if $A$ is not totally bounded, then there exists a sequence in $A$ with no Cauchy subsequence.
Since $A$ is not totally bounded, there exists $\varepsilon_0 > 0$ such that no finite subset of $M$ forms an $\varepsilon_0$-net for $A$. We construct a sequence $\{a_k\}_{k=1}^\infty$ in $A$ by induction.
**Base case.** Choose any $a_1 \in A$ (which is non-empty, since the empty set is totally bounded).
**Inductive step.** Suppose $a_1, \ldots, a_k$ have been chosen. The finite set $\{a_1, \ldots, a_k\}$ is not an $\varepsilon_0$-net for $A$ (since no finite set is). Therefore there exists $a_{k+1} \in A$ with $a_{k+1} \notin \bigcup_{i=1}^k B(a_i, \varepsilon_0)$, which means $d(a_{k+1}, a_i) \ge \varepsilon_0$ for all $i = 1, \ldots, k$.
By induction, we obtain a sequence $\{a_k\}_{k=1}^\infty$ in $A$ with $d(a_j, a_k) \ge \varepsilon_0$ for all $j \neq k$. Any subsequence $\{a_{k_\ell}\}_{\ell=1}^\infty$ inherits this separation: $d(a_{k_j}, a_{k_\ell}) \ge \varepsilon_0$ for $j \neq \ell$. Such a subsequence cannot be Cauchy, since no tail can have diameter less than $\varepsilon_0$.
[guided]
We prove the contrapositive: assume $A$ is not totally bounded and produce a sequence in $A$ with no Cauchy subsequence.
The negation of total boundedness says: there exists $\varepsilon_0 > 0$ such that for every finite subset $F \subset M$, the set $A$ is not contained in $\bigcup_{x \in F} B(x, \varepsilon_0)$. The strategy is to exploit this failure greedily --- at each step, the point we add to our sequence must escape all previously chosen points.
**Greedy construction.** Pick $a_1 \in A$ arbitrarily. Given $a_1, \ldots, a_k$, the set $\{a_1, \ldots, a_k\}$ cannot be an $\varepsilon_0$-net for $A$, so there exists a point
\begin{align*}
a_{k+1} \in A \setminus \bigcup_{i=1}^k B(a_i, \varepsilon_0).
\end{align*}
By definition, $d(a_{k+1}, a_i) \ge \varepsilon_0$ for every $i \le k$.
**No Cauchy subsequence.** The resulting sequence satisfies $d(a_j, a_k) \ge \varepsilon_0$ for all $j \neq k$. Any subsequence inherits this pairwise separation. A Cauchy sequence must eventually have all terms within distance $\varepsilon_0/2$ of each other (taking $\varepsilon = \varepsilon_0/2$ in the Cauchy condition), but the separation bound $d(a_j, a_k) \ge \varepsilon_0$ prevents any two distinct terms from being that close. Therefore no subsequence is Cauchy.
This argument also reveals the geometric content of total boundedness: $A$ is totally bounded if and only if it contains no infinite $\varepsilon$-separated subset for any $\varepsilon > 0$. The greedy construction produces a maximal $\varepsilon_0$-separated set when it terminates in finitely many steps, which happens precisely when $A$ is totally bounded at scale $\varepsilon_0$.
[/guided]
[/step]