[proofplan]
The proof has two parts. First, the Cauchy–Schwarz inequality in $L^2(0, T)$ produces the universal estimate $E(\gamma) \geq \ell(\gamma)^2 / (2T)$, with equality if and only if $|\dot\gamma|_g$ is constant. Reparametrising regular paths in $\Omega(p, q)$ to constant speed, and approximating arbitrary paths by regular ones, shows that the infimum of $E$ over $\Omega(p, q)$ equals $d(p, q)^2 / (2T)$ and can be attained only by constant-speed paths of length $d(p, q)$. So $\gamma_0$ has constant speed and minimises length. Second, by [Minimal Geodesics Are Smooth Geodesics](/theorems/2721), a constant-speed length minimiser is a smooth geodesic.
[/proofplan]
[step:Apply Cauchy–Schwarz to bound the energy of any path by its length squared]
Let $\gamma : [0, T] \to M$ be a piecewise $C^1$ path. The Cauchy–Schwarz inequality in $L^2([0, T], \mathcal{L}^1)$ applied to the functions $1$ and $|\dot\gamma|_g$ yields
\begin{align*}
\left(\int_0^T 1 \cdot |\dot\gamma(t)|_g \, d\mathcal{L}^1(t)\right)^2 \leq \left(\int_0^T 1^2 \, d\mathcal{L}^1(t)\right) \left(\int_0^T |\dot\gamma(t)|_g^2 \, d\mathcal{L}^1(t)\right).
\end{align*}
The left-hand side equals $\ell(\gamma)^2$ and the right-hand side equals $T \cdot 2 E(\gamma)$, where the energy is $E(\gamma) = \frac{1}{2} \int_0^T |\dot\gamma|_g^2 \, d\mathcal{L}^1(t)$. Hence
\begin{align*}
\ell(\gamma)^2 \leq 2 T \, E(\gamma), \qquad \text{i.e.,} \qquad E(\gamma) \geq \frac{\ell(\gamma)^2}{2T}.
\end{align*}
Equality in Cauchy–Schwarz holds if and only if $|\dot\gamma|_g$ is proportional to the constant function $1$, i.e., $|\dot\gamma|_g$ is constant on $[0, T]$.
[guided]
We want a universal inequality between length and energy that, crucially, identifies the case of equality. Cauchy–Schwarz on $L^2$ is exactly such a tool: the standard form is
\begin{align*}
\left(\int_0^T f h \, d\mathcal{L}^1(t)\right)^2 \leq \left(\int_0^T f^2 \, d\mathcal{L}^1(t)\right)\left(\int_0^T h^2 \, d\mathcal{L}^1(t)\right),
\end{align*}
with equality if and only if $f$ and $h$ are linearly dependent in $L^2$. The trick is choosing $f$ and $h$ so that the left-hand integral reproduces $\ell$ and one of the right-hand integrals reproduces $E$.
Take $f \equiv 1$ and $h(t) := |\dot\gamma(t)|_g$ — both lie in $L^2([0, T], \mathcal{L}^1)$ since $\gamma$ is piecewise $C^1$ on the compact interval $[0, T]$. Substituting:
\begin{align*}
\left(\int_0^T 1 \cdot |\dot\gamma(t)|_g \, d\mathcal{L}^1(t)\right)^2 \leq \left(\int_0^T 1^2 \, d\mathcal{L}^1(t)\right) \left(\int_0^T |\dot\gamma(t)|_g^2 \, d\mathcal{L}^1(t)\right).
\end{align*}
Now identify the three integrals. The left-hand integral is the length by definition, $\int_0^T |\dot\gamma|_g \, d\mathcal{L}^1 = \ell(\gamma)$. The first right-hand integral is $T$. The second is $2 E(\gamma)$, where we use the convention $E(\gamma) = \tfrac{1}{2}\int_0^T |\dot\gamma|_g^2 \, d\mathcal{L}^1$. (The factor of $\tfrac12$ is conventional; with the alternative $E = \int |\dot\gamma|_g^2$ the inequality reads $\ell^2 \leq T E$ and the rest of the argument runs identically.) Substituting these identifications back:
\begin{align*}
\ell(\gamma)^2 \leq 2 T \, E(\gamma), \qquad \text{i.e.,} \qquad E(\gamma) \geq \frac{\ell(\gamma)^2}{2T}.
\end{align*}
The equality case is what makes this lemma useful. Cauchy–Schwarz attains equality iff $1$ and $|\dot\gamma|_g$ are linearly dependent in $L^2$ — equivalently, iff $|\dot\gamma|_g$ is a constant multiple of $1$, i.e., $|\dot\gamma|_g$ is constant on $[0, T]$. This equivalence — saturating the energy bound iff the speed is constant — is the lever we will use in Step 4 to force $\gamma_0$ to have constant speed.
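As an illustrative sanity check, not part of the proof, assume $M = \mathbb{R}$ with the Euclidean metric, $T = 1$, $p = 0$, $q = 1$, and take the hypothetical path $\gamma(t) = \tfrac12 (t + t^2)$, whose speed $|\dot\gamma(t)| = \tfrac12 + t$ is not constant. Under these assumptions
\begin{align*}
\ell(\gamma) = \int_0^1 \left(\tfrac12 + t\right) d\mathcal{L}^1(t) = 1, \qquad E(\gamma) = \frac12 \int_0^1 \left(\tfrac12 + t\right)^2 d\mathcal{L}^1(t) = \frac{13}{24} > \frac12 = \frac{\ell(\gamma)^2}{2T},
\end{align*}
so the inequality is strict, as the equality clause predicts for a path of non-constant speed.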
[/guided]
[/step]
[step:Reparametrise regular paths to constant speed and approximate arbitrary paths in length by regular ones]
The arc-length reparametrisation requires $|\dot\gamma_1|_g > 0$ to invert the arc-length function as a piecewise $C^1$ map. We restrict the construction to **regular paths** and recover the general case by a density argument.
**Definition.** Let $\Omega^*(p, q) := \{\gamma \in \Omega(p, q) : |\dot\gamma(t)|_g > 0 \text{ at every } t \text{ where } \dot\gamma \text{ exists}\}$.
**Constant-speed reparametrisation on $\Omega^*$.** Fix $\gamma_1 \in \Omega^*(p, q)$ with $\ell := \ell(\gamma_1) > 0$. Define $s : [0, T] \to [0, \ell]$ by $s(t) = \int_0^t |\dot\gamma_1(\tau)|_g \, d\mathcal{L}^1(\tau)$. Let $0 = t_0 < t_1 < \cdots < t_N = T$ be a partition adapted to $\gamma_1$. On each piece $[t_{i-1}, t_i]$, $s$ is $C^1$ with $s'(t) = |\dot\gamma_1(t)|_g > 0$ by regularity, hence strictly increasing. By the inverse function theorem applied piecewise, the inverse $s^{-1} : [0, \ell] \to [0, T]$ is well-defined, continuous, strictly increasing, and piecewise $C^1$ with $(s^{-1})'(u) = 1/|\dot\gamma_1(s^{-1}(u))|_g$. Define
\begin{align*}
\tilde\gamma_1(t) := \gamma_1\!\left(s^{-1}\!\left(\tfrac{\ell}{T}\, t\right)\right).
\end{align*}
This is a composition of piecewise $C^1$ maps, hence piecewise $C^1$, with $\tilde\gamma_1(0) = p$, $\tilde\gamma_1(T) = q$, and the chain rule gives $|\dot{\tilde\gamma}_1|_g \equiv \ell/T$. Hence $\ell(\tilde\gamma_1) = \ell$ and the equality case of Step 1 yields $E(\tilde\gamma_1) = \ell(\gamma_1)^2/(2T)$. (A constant path is not regular; it occurs only when $p = q$, and then $E = 0 = \ell^2/(2T)$ holds trivially.)
**Approximation of $\Omega(p, q)$ by $\Omega^*(p, q)$ in length.** We claim
\begin{align*}
\inf_{\gamma \in \Omega^*(p, q)} \ell(\gamma) = \inf_{\gamma \in \Omega(p, q)} \ell(\gamma) = d(p, q).
\end{align*}
The inequality $\inf_{\Omega^*} \ell \geq \inf_\Omega \ell$ is immediate since $\Omega^* \subseteq \Omega$. For the reverse, fix $\gamma_1 \in \Omega(p, q)$ and $\varepsilon > 0$; we construct $\gamma_1^\varepsilon \in \Omega^*(p, q)$ with $\ell(\gamma_1^\varepsilon) \leq \ell(\gamma_1) + \varepsilon$.
Fix any unit vector $v \in T_p M$ and let $V$ be the parallel transport of $v$ along $\gamma_1$, a continuous unit vector field along $\gamma_1$. By smoothness of $\exp$, there exist $r_0 > 0$ and $C \ge 1$ such that for piecewise $C^1$ functions $\eta : [0,T] \to [-r_0, r_0]$ with $\eta(0) = \eta(T) = 0$, the curve $\gamma_1^\eta(t) := \exp_{\gamma_1(t)}(\eta(t) V(t))$ is well-defined, piecewise $C^1$, and satisfies the bound
\begin{align*}
\bigl| |\dot{\gamma_1^\eta}(t)|_g - |\dot\gamma_1(t) + \eta'(t) V(t)|_g \bigr| \leq C\, |\eta(t)|\bigl(|\dot\gamma_1(t)|_g + |\eta'(t)|\bigr).
\end{align*}
Choose $\eta(t) := \delta\, \rho(t) (\sin(2\pi t/T + \theta) - \sin\theta)$ where $\rho$ is a smooth cutoff equal to $1$ except in tiny neighbourhoods of $0$ and $T$, and $\theta$ is a generic phase chosen so that no zero of $\cos(2\pi t/T + \theta)$ in $[0, T]$ coincides with a zero of $|\dot\gamma_1|_g$ (possible because the latter has measure $\le T$ and the locations of the former depend continuously on $\theta$, so a set of $\theta$ of full measure works). Then $\eta(0) = \eta(T) = 0$ and $|\eta'(t)| > 0$ except at finitely many points, none coinciding with zeros of $|\dot\gamma_1|_g$. Hence $|\dot{\gamma_1^\eta}(t)|_g > 0$ for all $t$ where $\dot\gamma_1$ exists, so $\gamma_1^\eta \in \Omega^*(p, q)$.
The length increase $\ell(\gamma_1^\eta) - \ell(\gamma_1)$ is bounded above by $\int_0^T |\eta'(t)| \, d\mathcal{L}^1(t) + C \delta(\ell(\gamma_1) + 2\pi\delta)$. The first term is $O(\delta)$ because one period of sine has bounded total variation; the second is $O(\delta\, \ell(\gamma_1) + \delta^2)$. Choosing $\delta$ small enough makes the total $\leq \varepsilon$. Set $\gamma_1^\varepsilon := \gamma_1^\eta$ for this choice.
This proves the density claim.
[guided]
The strategy of this step is to upgrade the inequality of Step 1 to an *equality on a sufficiently rich subset*: for each path $\gamma_1$ in a suitable class, we want another curve in $\Omega(p, q)$ with the same length but constant speed, which therefore saturates Cauchy–Schwarz. This is needed because only constant-speed curves saturate Cauchy–Schwarz, whereas a generic $\gamma_1 \in \Omega(p, q)$ may have widely varying speed.
The standard tool is the arc-length reparametrisation. The arc-length function $s(t) = \int_0^t |\dot\gamma_1|_g \, d\mathcal{L}^1$ measures distance covered by time $t$. To invert it as a piecewise $C^1$ map, we need $s'(t) = |\dot\gamma_1(t)|_g > 0$ — otherwise $s$ may be flat on intervals and not invertible. So we restrict the construction to **regular paths** and recover the general case by approximation.
**Definition.** Let $\Omega^*(p, q) := \{\gamma \in \Omega(p, q) : |\dot\gamma(t)|_g > 0 \text{ at every } t \text{ where } \dot\gamma \text{ exists}\}$.
**Constant-speed reparametrisation on $\Omega^*$.** Fix $\gamma_1 \in \Omega^*(p, q)$ with $\ell := \ell(\gamma_1) > 0$. Define the arc-length $s : [0, T] \to [0, \ell]$ by $s(t) = \int_0^t |\dot\gamma_1(\tau)|_g \, d\mathcal{L}^1(\tau)$, and let $0 = t_0 < t_1 < \cdots < t_N = T$ be a partition adapted to the piecewise-$C^1$ structure of $\gamma_1$. On each piece $[t_{i-1}, t_i]$, $s$ is $C^1$ with $s'(t) = |\dot\gamma_1(t)|_g > 0$ by regularity, hence strictly increasing. Applying the inverse function theorem piecewise, the inverse $s^{-1} : [0, \ell] \to [0, T]$ is well-defined, continuous, strictly increasing, and piecewise $C^1$ with $(s^{-1})'(u) = 1/|\dot\gamma_1(s^{-1}(u))|_g$. Now the rescaling factor $\ell/T$ converts the natural parameter on $[0, \ell]$ into the prescribed parameter on $[0, T]$:
\begin{align*}
\tilde\gamma_1(t) := \gamma_1\!\left(s^{-1}\!\left(\tfrac{\ell}{T}\, t\right)\right).
\end{align*}
This is a composition of piecewise $C^1$ maps, hence itself piecewise $C^1$, and $\tilde\gamma_1(0) = \gamma_1(s^{-1}(0)) = \gamma_1(0) = p$, $\tilde\gamma_1(T) = \gamma_1(s^{-1}(\ell)) = \gamma_1(T) = q$. The chain rule gives $|\dot{\tilde\gamma}_1(t)|_g = |\dot\gamma_1(s^{-1}(\tfrac{\ell}{T} t))|_g \cdot \tfrac{\ell}{T} \cdot \tfrac{1}{|\dot\gamma_1(s^{-1}(\tfrac{\ell}{T} t))|_g} \equiv \ell/T$, which is constant. Hence $\ell(\tilde\gamma_1) = \int_0^T \ell/T \, d\mathcal{L}^1 = \ell$ and the equality case of Step 1 yields $E(\tilde\gamma_1) = \ell(\gamma_1)^2/(2T)$. (A constant path is not regular; it occurs only when $p = q$, and then $E = 0 = \ell^2/(2T)$ holds trivially.)
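To see the construction in a case where everything can be computed by hand, assume (purely for illustration) $M = \mathbb{R}$ with the Euclidean metric, $T = 1$, $p = 0$, $q = 1$, and the regular path $\gamma_1(t) = \tfrac12(t + t^2)$, which has speed $\tfrac12 + t > 0$ and length $\ell = 1$. Then
\begin{align*}
s(t) = \tfrac12 (t + t^2), \qquad s^{-1}(u) = \tfrac12\left(-1 + \sqrt{1 + 8u}\right), \qquad \tilde\gamma_1(t) = \gamma_1\!\left(s^{-1}(t)\right) = t.
\end{align*}
The reparametrised path has speed $1 = \ell/T$ and energy $E(\tilde\gamma_1) = \tfrac12 = \ell(\gamma_1)^2/(2T)$, whereas a direct computation gives $E(\gamma_1) = \tfrac12 \int_0^1 (\tfrac12 + t)^2 \, d\mathcal{L}^1(t) = \tfrac{13}{24}$ for the original parametrisation.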
**Why a density argument?** The constant-speed reparametrisation is restricted to $\Omega^*$. To transfer the conclusion to all of $\Omega(p, q)$, we show that $\Omega^*$ is dense in $\Omega(p, q)$ in length — i.e., the infima agree:
\begin{align*}
\inf_{\gamma \in \Omega^*(p, q)} \ell(\gamma) = \inf_{\gamma \in \Omega(p, q)} \ell(\gamma) = d(p, q).
\end{align*}
The inequality $\inf_{\Omega^*} \ell \geq \inf_\Omega \ell$ is immediate since $\Omega^* \subseteq \Omega$. For the reverse, fix $\gamma_1 \in \Omega(p, q)$ and $\varepsilon > 0$; we must construct $\gamma_1^\varepsilon \in \Omega^*(p, q)$ with $\ell(\gamma_1^\varepsilon) \leq \ell(\gamma_1) + \varepsilon$.
The idea is to perturb $\gamma_1$ in a transverse direction by a small amplitude oscillation, which knocks the speed away from zero everywhere while costing arbitrarily little length. Fix any unit vector $v \in T_p M$ and let $V$ be its parallel transport along $\gamma_1$ — a continuous unit vector field along $\gamma_1$. By smoothness of $\exp$ on a neighbourhood of the zero section, there exist $r_0 > 0$ and $C \ge 1$ such that for piecewise $C^1$ functions $\eta : [0, T] \to [-r_0, r_0]$ with $\eta(0) = \eta(T) = 0$, the curve $\gamma_1^\eta(t) := \exp_{\gamma_1(t)}(\eta(t) V(t))$ is well-defined, piecewise $C^1$, agrees with $\gamma_1$ at the endpoints, and satisfies the speed comparison
\begin{align*}
\bigl| |\dot{\gamma_1^\eta}(t)|_g - |\dot\gamma_1(t) + \eta'(t) V(t)|_g \bigr| \leq C\, |\eta(t)|\bigl(|\dot\gamma_1(t)|_g + |\eta'(t)|\bigr).
\end{align*}
(This is the standard $C^1$-comparison between $\exp$ and its differential at the origin, which is the identity in normal coordinates.)
Now the choice of $\eta$. We pick $\eta(t) := \delta\, \rho(t)\bigl(\sin(2\pi t/T + \theta) - \sin\theta\bigr)$ where $\rho$ is a smooth cutoff equal to $1$ except in tiny neighbourhoods of $0$ and $T$ (forcing $\eta(0) = \eta(T) = 0$), and $\theta$ is a generic phase chosen so that no zero of $\cos(2\pi t/T + \theta)$ in $[0, T]$ coincides with a zero of $|\dot\gamma_1|_g$. This last condition is achievable because the set of zeros of $|\dot\gamma_1|_g$ in $[0, T]$ is closed and the locations of the cosine zeros depend continuously on $\theta$, so a measure-positive set of $\theta$ avoids the obstruction. Then $\eta(0) = \eta(T) = 0$ and $|\eta'(t)| > 0$ except at finitely many points, none of which coincide with zeros of $|\dot\gamma_1|_g$. Hence at every $t$ where $\dot\gamma_1$ exists, the right-hand side $|\dot\gamma_1 + \eta' V|_g$ is bounded below by either $|\dot\gamma_1|_g > 0$ (when $\eta'(t) = 0$ but speed is nonzero) or $|\eta'(t)| > 0$ (when $|\dot\gamma_1| = 0$); the speed comparison then keeps $|\dot{\gamma_1^\eta}|_g > 0$ for $\delta$ small enough. So $\gamma_1^\eta \in \Omega^*(p, q)$.
Finally, the length cost. Integrating the speed comparison and using the triangle inequality $|\dot\gamma_1 + \eta' V|_g \leq |\dot\gamma_1|_g + |\eta'|$, the length increase $\ell(\gamma_1^\eta) - \ell(\gamma_1)$ is bounded above by $\int_0^T |\eta'(t)| \, d\mathcal{L}^1(t) + C \delta(\ell(\gamma_1) + 2\pi\delta)$. The first term is $O(\delta)$ (the total variation of $\delta \sin$ over one period is $4\delta$, up to the modulation by $\rho$); the second is $O(\delta\, \ell(\gamma_1) + \delta^2)$. Choosing $\delta$ small enough makes the total $\leq \varepsilon$, and we set $\gamma_1^\varepsilon := \gamma_1^\eta$.
This completes the density argument. The lever for the next step: the infimum of $E$ over $\Omega(p, q)$ is at most $\ell(\gamma_1)^2 / (2T)$ for every $\gamma_1 \in \Omega^*(p, q)$, hence at most $d(p, q)^2 / (2T)$ after passing to the infimum over $\Omega^*(p, q)$.
[/guided]
[/step]
[step:Identify the minimum value of $E$ on $\Omega(p, q)$ as $d(p, q)^2 / (2T)$]
By Step 1 and the definition of $d$, for any $\gamma \in \Omega(p, q)$,
\begin{align*}
E(\gamma) \geq \frac{\ell(\gamma)^2}{2T} \geq \frac{d(p, q)^2}{2T},
\end{align*}
where the second inequality uses $\ell(\gamma) \geq d(p, q)$ (the distance $d(p, q)$ is defined as the infimum of lengths over $\Omega(p, q)$).
Conversely, Step 2 produces, for every regular path $\gamma_1 \in \Omega^*(p, q)$, a constant-speed reparametrisation $\tilde\gamma_1 \in \Omega(p, q)$ with $E(\tilde\gamma_1) = \ell(\gamma_1)^2/(2T)$. Hence
\begin{align*}
\inf_{\gamma \in \Omega(p, q)} E(\gamma) \leq \inf_{\gamma_1 \in \Omega^*(p, q)} \frac{\ell(\gamma_1)^2}{2T} = \frac{1}{2T}\left(\inf_{\gamma_1 \in \Omega^*(p, q)} \ell(\gamma_1)\right)^2 = \frac{d(p, q)^2}{2T},
\end{align*}
where the first equality uses that $u \mapsto u^2/(2T)$ is continuous and non-decreasing on $[0, \infty)$, and the second uses the density claim $\inf_{\Omega^*} \ell = d(p, q)$ proved in Step 2.
(Note: the infimum of $E$ need not be attained in general; the hypothesis is that $\gamma_0$ attains it.)
Combining,
\begin{align*}
\inf_{\gamma \in \Omega(p, q)} E(\gamma) = \frac{d(p, q)^2}{2T}.
\end{align*}
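For orientation, assume $M = \mathbb{R}^n$ with the Euclidean metric (an illustrative special case, not needed for the proof). Then $d(p, q) = |q - p|$, and the straight-line path $\gamma(t) = p + \tfrac{t}{T}(q - p)$ has constant speed $|q - p|/T$, so
\begin{align*}
E(\gamma) = \frac12 \int_0^T \frac{|q - p|^2}{T^2} \, d\mathcal{L}^1(t) = \frac{|q - p|^2}{2T} = \frac{d(p, q)^2}{2T}.
\end{align*}
This matches the value of the infimum, and in this special case the infimum is attained.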
[/step]
[step:Deduce that $\gamma_0$ has constant speed and length $d(p, q)$]
By hypothesis, $\gamma_0$ minimises $E$ on $\Omega(p, q)$, so $E(\gamma_0) = d(p, q)^2/(2T)$. The chain of inequalities
\begin{align*}
\frac{d(p, q)^2}{2T} = E(\gamma_0) \geq \frac{\ell(\gamma_0)^2}{2T} \geq \frac{d(p, q)^2}{2T}
\end{align*}
forces both inequalities to be equalities. The first equality, $E(\gamma_0) = \ell(\gamma_0)^2/(2T)$, is the Cauchy–Schwarz equality case from Step 1: $|\dot{\gamma_0}|_g$ is constant on $[0, T]$. The second equality, $\ell(\gamma_0) = d(p, q)$, says $\gamma_0$ is a length minimiser in $\Omega(p, q)$.
So $\gamma_0$ has constant speed and is a length-minimising piecewise $C^1$ path from $p$ to $q$.
[guided]
By hypothesis $\gamma_0$ minimises $E$ on $\Omega(p, q)$, so by Step 3 we have $E(\gamma_0) = \inf_{\gamma \in \Omega(p,q)} E(\gamma) = d(p, q)^2/(2T)$. The strategy now is to assemble Steps 1–3 into a chain of inequalities that begins and ends at this same value, forcing every link in the chain to be an equality. Each forced equality then gives a structural property of $\gamma_0$.
We have two inequalities at our disposal: Step 1 gave $E(\gamma_0) \geq \ell(\gamma_0)^2/(2T)$ (Cauchy–Schwarz), and the definition of the Riemannian distance gives $\ell(\gamma_0) \geq d(p, q)$ since $\gamma_0 \in \Omega(p, q)$ and $d(p,q) = \inf_{\Omega} \ell$. Squaring the latter and dividing by $2T$ (both monotone on $[0, \infty)$) gives $\ell(\gamma_0)^2/(2T) \geq d(p, q)^2/(2T)$. Chaining:
\begin{align*}
\frac{d(p, q)^2}{2T} = E(\gamma_0) \geq \frac{\ell(\gamma_0)^2}{2T} \geq \frac{d(p, q)^2}{2T}.
\end{align*}
The first and last terms are identical, so both inequalities collapse to equalities. We extract a structural conclusion from each.
**Equality in the first link**, $E(\gamma_0) = \ell(\gamma_0)^2/(2T)$, is exactly the saturation case of Step 1's Cauchy–Schwarz lemma. By the equality clause established there, this forces $|\dot{\gamma_0}|_g$ to be constant on $[0, T]$. So $\gamma_0$ has constant speed.
**Equality in the second link**, $\ell(\gamma_0)^2/(2T) = d(p, q)^2/(2T)$, gives $\ell(\gamma_0)^2 = d(p, q)^2$, hence $\ell(\gamma_0) = d(p, q)$ (both quantities are non-negative). So $\gamma_0$ is a length-minimiser in $\Omega(p, q)$.
In summary, $\gamma_0$ has constant speed and is a length-minimising piecewise $C^1$ path from $p$ to $q$. Both pieces of information feed into the next step: a length-minimising curve with constant speed is precisely the type of object that the regularity theorem [Minimal Geodesics Are Smooth Geodesics](/theorems/2721) handles.
[/guided]
[/step]
[step:Conclude that $\gamma_0$ is a geodesic via the regularity theorem]
We apply [Minimal Geodesics Are Smooth Geodesics](/theorems/2721). That theorem requires the curve to be a piecewise $C^1$ path (given), to have constant speed (verified in Step 4), and to be length-minimising in $\Omega(p, q)$ (verified in Step 4). Under these hypotheses, the theorem concludes that the curve is a smooth geodesic, meaning $\gamma_0$ is smooth on $[0, T]$ and satisfies $\nabla_{\dot{\gamma_0}}\dot{\gamma_0} = 0$.
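For reference, in a local chart with coordinates $(x^1, \dots, x^n)$ and Christoffel symbols $\Gamma^k_{ij}$ (notation introduced here only for illustration and not used elsewhere in this proof), the geodesic equation $\nabla_{\dot{\gamma_0}}\dot{\gamma_0} = 0$ reads
\begin{align*}
\ddot\gamma_0^k(t) + \Gamma^k_{ij}(\gamma_0(t)) \, \dot\gamma_0^i(t) \, \dot\gamma_0^j(t) = 0, \qquad k = 1, \dots, n,
\end{align*}
with summation over the repeated indices $i, j$. In Euclidean space, where all $\Gamma^k_{ij}$ vanish, this reduces to $\ddot\gamma_0 = 0$, that is, a straight line traversed at constant speed.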
Therefore $\gamma_0$ is a geodesic, completing the proof.
[/step]