[proofplan]
The proof rests on two ingredients, carried out over five steps. First, we use compactness of the geodesic sphere $S_\delta(p)$ and continuity of $d(\cdot, q)$ to extract a minimiser $p_0 \in S_\delta(p)$ of $d(\cdot, q)$. Second, we show that this $p_0$ achieves the triangle equality. The key is that any path $\gamma$ from $p$ to $q$ must cross $S_\delta(p)$, by the intermediate value theorem applied to $t \mapsto d(p, \gamma(t))$, and the crossing point $x$ splits the path into a piece of length at least $\delta$ and a piece of length at least $d(x, q) \geq d(p_0, q)$. Taking the infimum over paths gives $d(p, q) \geq \delta + d(p_0, q)$, and the triangle inequality gives the reverse bound, so $p_0$ saturates the inequality.
[/proofplan]
[step:Extract a minimiser of $d(\cdot, q)$ on the geodesic sphere $S_\delta(p)$]
Choose $\delta > 0$ small enough that $\exp_p$ is a diffeomorphism on the closed ball $\overline{B}(0, \delta) \subseteq T_p M$ — possible by the [Exponential Map as a Local Diffeomorphism](/theorems/2712). Then
\begin{align*}
S_\delta(p) = \exp_p(\{v \in T_p M : |v|_g = \delta\})
\end{align*}
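is the diffeomorphic image of a Euclidean sphere in $T_p M$, hence compact. For such $\delta$ it coincides with the metric sphere $\{x \in M : d(p, x) = \delta\}$, since radial geodesics in a normal neighbourhood of $p$ are minimising. The distance function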
\begin{align*}
d(\cdot, q) : M &\to \mathbb{R} \\
x &\mapsto d(x, q)
\end{align*}
is continuous (in fact $1$-Lipschitz) by the triangle inequality. By the extreme value theorem on the compact set $S_\delta(p)$, the restriction $d(\cdot, q)|_{S_\delta(p)}$ attains its infimum: there exists $p_0 \in S_\delta(p)$ with
\begin{align*}
d(p_0, q) = \inf_{x \in S_\delta(p)} d(x, q).
\end{align*}
We also assume $\delta < d(p, q)$, which is possible whenever $p \neq q$. Together with the diffeomorphism condition above, this is what "$\delta$ sufficiently small" means in the hypothesis.
[guided]
The candidate point $p_0$ is forced by compactness. We need a point on the small sphere $S_\delta(p)$ that is closest to $q$, and the standard recipe is: a continuous function on a compact space attains its minimum. We must verify both ingredients.
**Compactness of $S_\delta(p)$.** By the [Exponential Map as a Local Diffeomorphism](/theorems/2712), there exists $\varepsilon > 0$ such that $\exp_p|_{B(0, \varepsilon)}$ is a diffeomorphism onto its image in $M$. Choose $\delta \in (0, \varepsilon)$ so that the closed ball $\overline{B}(0, \delta) \subseteq T_p M$ lies inside the domain of the diffeomorphism. Then the geodesic sphere is exactly the diffeomorphic image of a Euclidean sphere:
\begin{align*}
S_\delta(p) = \exp_p(\{v \in T_p M : |v|_g = \delta\}).
\end{align*}
The set $\{v \in T_p M : |v|_g = \delta\}$ is closed and bounded in the finite-dimensional inner product space $(T_p M, g_p)$, hence compact by Heine-Borel. A diffeomorphism is in particular a homeomorphism, so it carries compact sets to compact sets, and $S_\delta(p)$ is compact.
**Continuity of $d(\cdot, q)$.** Consider the distance function as a map:
\begin{align*}
d(\cdot, q) : M &\to \mathbb{R} \\
x &\mapsto d(x, q).
\end{align*}
For any $x_1, x_2 \in M$, the triangle inequality gives $d(x_1, q) \leq d(x_1, x_2) + d(x_2, q)$ and symmetrically $d(x_2, q) \leq d(x_1, x_2) + d(x_1, q)$, so $|d(x_1, q) - d(x_2, q)| \leq d(x_1, x_2)$. Thus $d(\cdot, q)$ is $1$-Lipschitz with respect to $d$, and in particular continuous.
**Applying the extreme value theorem.** A continuous real-valued function on a non-empty compact set attains its infimum. Since $S_\delta(p)$ is compact and non-empty (the Euclidean sphere is non-empty), and $d(\cdot, q)|_{S_\delta(p)}$ is continuous, there exists $p_0 \in S_\delta(p)$ with
\begin{align*}
d(p_0, q) = \inf_{x \in S_\delta(p)} d(x, q).
\end{align*}
**The auxiliary assumption $\delta < d(p, q)$.** We additionally assume $\delta < d(p, q)$, on top of $\delta < \varepsilon$. This is what "$\delta$ sufficiently small" means in the theorem hypothesis. Why do we need it? Because in the next step we want any path from $p$ to $q$ to genuinely cross $S_\delta(p)$ — to do this we need the path to start strictly inside the ball of radius $\delta$ (at $p$ itself) and end strictly outside it (at $q$, which is at distance more than $\delta$). Without $\delta < d(p, q)$, the point $q$ could be inside $\overline{B}(p, \delta)$ and the intermediate value argument would fail.
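**A worked example in the Euclidean plane.** As an illustration only (the choice of $M$, $p$, $q$ and $\delta$ here is an assumption made for the example, not part of the theorem), take $M = \mathbb{R}^2$ with the flat metric, $p = (0, 0)$, $q = (3, 0)$ and $\delta = 1 < d(p, q) = 3$. Here $\exp_p(v) = v$, so $S_1(p)$ is the unit circle, and for $x = (\cos\theta, \sin\theta)$ we can compute
\begin{align*}
d(x, q)^2 &= (\cos\theta - 3)^2 + \sin^2\theta \\
&= 10 - 6\cos\theta.
\end{align*}
This is smallest when $\cos\theta = 1$, so in this example the minimiser is $p_0 = (1, 0)$, with $d(p_0, q) = 2$.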
[/guided]
[/step]
[step:Every path from $p$ to $q$ crosses $S_\delta(p)$ at a point whose distance to $q$ is at least $d(p_0, q)$]
Let $\gamma : [a, b] \to M$ be any piecewise $C^1$ path with $\gamma(a) = p$ and $\gamma(b) = q$. Define
\begin{align*}
h : [a, b] &\to \mathbb{R} \\
t &\mapsto d(\gamma(t), p).
\end{align*}
This is continuous (as the composition of continuous $\gamma$ and $1$-Lipschitz $d(\cdot, p)$) with $h(a) = 0$ and $h(b) = d(p, q) > \delta$. By the intermediate value theorem applied to $h$ on $[a, b]$, there exists $t_0 \in (a, b)$ with $h(t_0) = \delta$, i.e., $\gamma(t_0) \in S_\delta(p)$.
Let $x := \gamma(t_0) \in S_\delta(p)$. The path $\gamma$ decomposes as $\gamma|_{[a, t_0]}$ (a path from $p$ to $x$) followed by $\gamma|_{[t_0, b]}$ (a path from $x$ to $q$). Therefore
\begin{align*}
\ell(\gamma) = \ell(\gamma|_{[a, t_0]}) + \ell(\gamma|_{[t_0, b]}) \geq d(p, x) + d(x, q),
\end{align*}
where the last inequality uses that the length of any path between two points is at least their distance.
Since $x \in S_\delta(p)$, $d(p, x) = \delta$, and $d(x, q) \geq d(p_0, q)$ by the choice of $p_0$. Hence
\begin{align*}
\ell(\gamma) \geq \delta + d(p_0, q).
\end{align*}
[guided]
Fix an arbitrary piecewise $C^1$ path $\gamma : [a, b] \to M$ with $\gamma(a) = p$ and $\gamma(b) = q$. We want to show $\ell(\gamma) \geq \delta + d(p_0, q)$, where $p_0$ is the minimiser from Step 1.
**Why must $\gamma$ cross the geodesic sphere $S_\delta(p)$?** The path starts at $p$ (distance $0$ from $p$) and ends at $q$ (distance $d(p, q) > \delta$ from $p$, by our auxiliary assumption $\delta < d(p, q)$). Continuity should force the path to pass through every intermediate distance, including $\delta$. We make this precise by introducing the auxiliary function
\begin{align*}
h : [a, b] &\to \mathbb{R} \\
t &\mapsto d(\gamma(t), p).
\end{align*}
This is continuous: $\gamma$ is continuous (piecewise $C^1$ implies continuous), and $d(\cdot, p)$ is $1$-Lipschitz by the triangle inequality (same argument as in Step 1, with $p$ in place of $q$). The composition of continuous functions is continuous. We compute the boundary values: $h(a) = d(p, p) = 0$ and $h(b) = d(q, p) = d(p, q) > \delta$. The [intermediate value theorem](/theorems/???) applied to $h$ on $[a, b]$ — its hypotheses are continuity of $h$ (verified) and the value $\delta$ lying strictly between $h(a) = 0$ and $h(b) = d(p, q)$ (verified) — produces $t_0 \in (a, b)$ with $h(t_0) = \delta$. Setting $x := \gamma(t_0)$, this says $d(x, p) = \delta$, i.e., $x \in S_\delta(p)$.
**Why is the length of $\gamma$ at least $\delta + d(p_0, q)$?** Split $\gamma$ at the crossing time $t_0$. The first piece $\gamma|_{[a, t_0]}$ is a piecewise $C^1$ path from $p$ to $x$; the second piece $\gamma|_{[t_0, b]}$ is a piecewise $C^1$ path from $x$ to $q$. Length is additive under concatenation, and the length of any piecewise $C^1$ path between two points is at least their distance (immediate from the definition of $d$ as the infimum over path lengths). Therefore
\begin{align*}
\ell(\gamma) = \ell(\gamma|_{[a, t_0]}) + \ell(\gamma|_{[t_0, b]}) \geq d(p, x) + d(x, q).
\end{align*}
We now substitute the value $d(p, x) = \delta$ (since $x \in S_\delta(p)$):
\begin{align*}
\ell(\gamma) \geq d(p, x) + d(x, q) = \delta + d(x, q).
\end{align*}
**Replacing $d(x, q)$ by $d(p_0, q)$.** The crossing point $x$ depends on the chosen path $\gamma$ — different paths cross the sphere at different places. But $p_0$ was defined in Step 1 as a global minimiser of $d(\cdot, q)$ over the entire sphere $S_\delta(p)$, so for any $x \in S_\delta(p)$ we have $d(x, q) \geq d(p_0, q)$. Substituting this $\gamma$-independent lower bound:
\begin{align*}
\ell(\gamma) \geq \delta + d(p_0, q).
\end{align*}
This is the uniform bound across all paths that the next step needs.
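**A worked example of the bound.** As a sanity check under illustrative assumptions (the flat plane $M = \mathbb{R}^2$, $p = (0, 0)$, $q = (3, 0)$, $\delta = 1$, for which the minimiser on the unit circle is $p_0 = (1, 0)$ with $d(p_0, q) = 2$), consider the hypothetical polygonal path $\gamma$ running from $p$ straight to $(0, 4)$ and then straight to $q$. It first meets $S_1(p)$ at $x = (0, 1)$, and
\begin{align*}
\ell(\gamma) &= 4 + 5 = 9, \\
\delta + d(x, q) &= 1 + \sqrt{10} \approx 4.16, \\
\delta + d(p_0, q) &= 1 + 2 = 3,
\end{align*}
so $\ell(\gamma) \geq \delta + d(x, q) \geq \delta + d(p_0, q)$, as the argument predicts.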
[/guided]
[/step]
[step:Take the infimum over paths to obtain $d(p, q) \geq \delta + d(p_0, q)$]
The bound $\ell(\gamma) \geq \delta + d(p_0, q)$ holds for every piecewise $C^1$ path $\gamma$ from $p$ to $q$. Taking the infimum on the left:
\begin{align*}
d(p, q) = \inf_\gamma \ell(\gamma) \geq \delta + d(p_0, q),
\end{align*}
where the infimum is over all piecewise $C^1$ paths from $p$ to $q$ (this is the definition of $d$).
[/step]
[step:Establish the reverse inequality via the triangle inequality]
By the triangle inequality applied to the metric $d$ on $M$,
\begin{align*}
d(p, q) \leq d(p, p_0) + d(p_0, q) = \delta + d(p_0, q),
\end{align*}
since $p_0 \in S_\delta(p)$ gives $d(p, p_0) = \delta$.
[/step]
[step:Conclude the triangle equality]
Combining the inequalities from the previous two steps,
\begin{align*}
d(p, q) \leq \delta + d(p_0, q) \leq d(p, q),
\end{align*}
so all inequalities are equalities. In particular,
\begin{align*}
d(p, p_0) + d(p_0, q) = \delta + d(p_0, q) = d(p, q).
\end{align*}
This proves the claim with the witness $p_0 \in S_\delta(p)$ chosen in Step 1.
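As a concrete check, under the illustrative assumption that $M = \mathbb{R}^2$ is the flat plane with $p = (0, 0)$, $q = (3, 0)$ and $\delta = 1$, the point $p_0 = (1, 0)$ satisfies the equality, while every other point $x = (\cos\theta, \sin\theta)$ of the unit circle gives a strict inequality:
\begin{align*}
d(p, p_0) + d(p_0, q) &= 1 + 2 = 3 = d(p, q), \\
d(p, x) + d(x, q) &= 1 + \sqrt{10 - 6\cos\theta} > 3 \quad \text{whenever } \cos\theta < 1.
\end{align*}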
[/step]