[proofplan]
We expand the squared norm $f(t) := g_{\gamma(t)}(J(t), J(t))$ in a Taylor series at $t = 0$, then take a square root. The successive derivatives $f^{(k)}(0)$ are computed from the initial data $J(0) = 0$, $J'(0) = w$ (so $|J'(0)| = 1$) and the Jacobi equation $J'' + R(\dot\gamma, J)\dot\gamma = 0$ — which under the chapter sign convention $R = -\nabla \circ \nabla$ comes from [Jacobi Fields are Geodesic Variations](/theorems/2716). Crucially, $J''(0) = -R(\dot\gamma(0), J(0))\dot\gamma(0) = 0$ because $J(0) = 0$, killing the cubic term. Differentiating the Jacobi equation once more produces $J'''(0) = -R(a, w)a$, whose pairing $g(J'''(0), w) = -g(R(a, w)a, w)$ — combined with the formula $K(\sigma) = g(R(a, w)a, w)$ for orthonormal $a, w$ under the chapter convention — supplies the quartic coefficient. Assembling: $f(t) = t^2 - \frac{1}{3}K(\sigma) t^4 + o(t^4)$, and the square root gives $|J(t)| = t - \frac{1}{6} K(\sigma) t^3 + o(t^3)$. Throughout we use the [Gauss Lemma — Covariant Form](/theorems/2714) only implicitly via the identification $\dot\gamma(t) = (d\exp_p)_{ta}(a)$ and the formula for $J(t)$ from [Jacobi Fields via the Exponential Map](/theorems/2717); the heart of the computation is the fourth-order Taylor expansion of $g(J, J)$ along the geodesic.
[/proofplan]
[step:Set up the squared-norm function $f(t) = g(J, J)$ and reduce the proof to computing $f^{(k)}(0)$ for $k \le 4$]
Define
\begin{align*}
f: [0, T] &\to \mathbb{R} \\
t &\mapsto g_{\gamma(t)}(J(t), J(t))
\end{align*}
on a small interval $[0, T]$ on which $\gamma$ and $J$ are defined. The function $f$ is smooth on $[0, T]$ because $J$, $\gamma$, and $g$ are smooth. Moreover $f \ge 0$, with $f(0) = 0$ since $J(0) = 0$. Since $J(0) = 0$ and $J'(0) = w \ne 0$, the leading non-zero behaviour of $J$ near $t = 0$ is $J(t) = tw + O(t^2)$ (as we will see explicitly), so $|J(t)| \to 0$ from positive values as $t \to 0^+$, and we may take square roots.
The strategy is to compute the Taylor expansion
\begin{align*}
f(t) = f(0) + f'(0)\, t + \frac{f''(0)}{2!}\, t^2 + \frac{f'''(0)}{3!}\, t^3 + \frac{f^{(4)}(0)}{4!}\, t^4 + o(t^4)
\end{align*}
to fourth order, then extract $|J(t)| = \sqrt{f(t)}$. Smoothness of $f$ on $[0, T]$ guarantees Taylor's theorem with Peano remainder applies.
Differentiate $f$ using metric compatibility $\frac{d}{dt} g(V, W) = g(\nabla_{dt} V, W) + g(V, \nabla_{dt} W)$ for vector fields $V, W$ along $\gamma$:
\begin{align*}
f'(t) &= 2\, g(J'(t), J(t)), \\
f''(t) &= 2\, g(J''(t), J(t)) + 2\, g(J'(t), J'(t)) = 2\, g(J''(t), J(t)) + 2\, |J'(t)|^2, \\
f'''(t) &= 2\, g(J'''(t), J(t)) + 6\, g(J''(t), J'(t)), \\
f^{(4)}(t) &= 2\, g(J^{(4)}(t), J(t)) + 8\, g(J'''(t), J'(t)) + 6\, |J''(t)|^2.
\end{align*}
(Each line follows from the previous one by applying metric compatibility to every pairing and collecting like terms via the symmetry of $g$; here $J^{(k)}$ denotes the $k$-fold covariant derivative $\nabla_{dt}^k J$ along $\gamma$.)
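As a cross-check on the coefficients, the same formulas can be read off from the general Leibniz rule for a symmetric pairing. This is a standard identity, recorded here as a sketch and not quoted from the chapter:
\begin{align*}
f^{(n)}(t) &= \sum_{k=0}^{n} \binom{n}{k}\, g\bigl(J^{(k)}(t), J^{(n-k)}(t)\bigr), \\
n = 3: \quad & \tbinom{3}{0} + \tbinom{3}{3} = 2, \qquad \tbinom{3}{1} + \tbinom{3}{2} = 6, \\
n = 4: \quad & \tbinom{4}{0} + \tbinom{4}{4} = 2, \qquad \tbinom{4}{1} + \tbinom{4}{3} = 8, \qquad \tbinom{4}{2} = 6.
\end{align*}
Grouping the terms $k$ and $n - k$, which coincide by symmetry of $g$, reproduces the coefficients $2, 6$ in $f'''$ and $2, 8, 6$ in $f^{(4)}$.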
[/step]
[step:Establish the Jacobi equation $J'' = -R(\dot\gamma, J)\dot\gamma$ and the constancy of $|\dot\gamma| = 1$ along $\gamma$]
By hypothesis, $|a| = 1$, so $\gamma$ is a unit-speed geodesic by [Geodesics Have Constant Speed](/theorems/2709). Hence $|\dot\gamma(t)| = 1$ for all $t$ in the domain of $\gamma$, and $\nabla_{dt} \dot\gamma = 0$.
The Jacobi field $J$ satisfies the Jacobi equation, which under the chapter sign convention $R = -\nabla \circ \nabla$ takes the form
\begin{align*}
J''(t) + R(\dot\gamma(t), J(t))\, \dot\gamma(t) = 0
\end{align*}
by [Jacobi Fields are Geodesic Variations](/theorems/2716). That is,
\begin{align*}
J''(t) = -R(\dot\gamma(t), J(t))\, \dot\gamma(t). \tag{J}
\end{align*}
[/step]
[step:Evaluate $f(0), f'(0), f''(0), f'''(0)$ using $J(0) = 0, J'(0) = w$, and $J''(0) = 0$]
We use the initial data $J(0) = 0$, $J'(0) = w$ (with $|w| = 1$) and the Jacobi equation (J).
**$f(0)$ and $f'(0)$.** Since $J(0) = 0$,
\begin{align*}
f(0) = g_p(0, 0) = 0, \qquad f'(0) = 2\, g_p(w, 0) = 0.
\end{align*}
**$f''(0)$.** The crucial observation is that $J''(0) = 0$: substituting $J(0) = 0$ into (J),
\begin{align*}
J''(0) = -R(\dot\gamma(0), J(0))\, \dot\gamma(0) = -R(a, 0)\, a = 0
\end{align*}
by $\mathbb{R}$-linearity of $R$ in the second slot. Hence
\begin{align*}
f''(0) = 2\, g_p(J''(0), J(0)) + 2\, |J'(0)|^2 = 0 + 2\, |w|^2 = 2.
\end{align*}
**$f'''(0)$.** Using $J(0) = 0$ and $J''(0) = 0$,
\begin{align*}
f'''(0) = 2\, g_p(J'''(0), J(0)) + 6\, g_p(J''(0), J'(0)) = 0 + 0 = 0.
\end{align*}
[/step]
[step:Compute $J'''(0)$ by differentiating the Jacobi equation, then evaluate $f^{(4)}(0)$]
We compute $J'''(t)$ from (J) by applying $\nabla_{dt}$. Since $R$ is the curvature tensor of $\nabla$ on $M$ and $J, \dot\gamma$ are vector fields along $\gamma$, the Leibniz rule for $\nabla_{dt}$ acting on the tensor expression $R(\dot\gamma, J)\dot\gamma$ gives
\begin{align*}
\nabla_{dt}\bigl[R(\dot\gamma, J)\dot\gamma\bigr] = (\nabla_{\dot\gamma} R)(\dot\gamma, J)\dot\gamma + R(\nabla_{dt}\dot\gamma, J)\dot\gamma + R(\dot\gamma, \nabla_{dt} J)\dot\gamma + R(\dot\gamma, J)\nabla_{dt}\dot\gamma.
\end{align*}
Using $\nabla_{dt}\dot\gamma = 0$ (geodesic equation) and $\nabla_{dt} J = J'$,
\begin{align*}
\nabla_{dt}\bigl[R(\dot\gamma, J)\dot\gamma\bigr] = (\nabla_{\dot\gamma} R)(\dot\gamma, J)\dot\gamma + R(\dot\gamma, J')\dot\gamma.
\end{align*}
Therefore from (J),
\begin{align*}
J'''(t) = -(\nabla_{\dot\gamma} R)(\dot\gamma, J)\dot\gamma - R(\dot\gamma, J')\dot\gamma.
\end{align*}
At $t = 0$, with $J(0) = 0$, $J'(0) = w$, $\dot\gamma(0) = a$, the first term vanishes by $\mathbb{R}$-linearity of $\nabla_a R$ in the second slot:
\begin{align*}
J'''(0) = -(\nabla_a R)(a, 0)\, a - R(a, w)\, a = 0 - R(a, w)\, a = -R(a, w)\, a. \tag{$\star$}
\end{align*}
**Evaluating $f^{(4)}(0)$.** Substituting $J(0) = 0$ and $J''(0) = 0$ into the formula for $f^{(4)}(t)$ at $t = 0$,
\begin{align*}
f^{(4)}(0) = 2\, g_p(J^{(4)}(0), J(0)) + 8\, g_p(J'''(0), J'(0)) + 6\, |J''(0)|^2 = 0 + 8\, g_p(J'''(0), w) + 0.
\end{align*}
Using $(\star)$,
\begin{align*}
g_p(J'''(0), w) = g_p(-R(a, w)a, w) = -g_p(R(a, w)a, w).
\end{align*}
Hence
\begin{align*}
f^{(4)}(0) = -8\, g_p(R(a, w)a, w). \tag{$\dagger$}
\end{align*}
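As an illustration of $(\star)$ and $(\dagger)$, consider a model case that is an assumption of this sketch and not part of the hypotheses: constant curvature $\kappa$, where under the chapter convention the curvature endomorphism is taken to be $R(X, Y)Z = \kappa\,\bigl(g(X, Z)\,Y - g(Y, Z)\,X\bigr)$. With $|a| = |w| = 1$ and $g_p(a, w) = 0$:
\begin{align*}
R(a, w)\,a &= \kappa\,\bigl(g_p(a, a)\, w - g_p(w, a)\, a\bigr) = \kappa\, w, \\
J'''(0) &= -R(a, w)\, a = -\kappa\, w, \\
f^{(4)}(0) &= 8\, g_p(J'''(0), w) = -8\,\kappa\, |w|^2 = -8\,\kappa.
\end{align*}
In this model the quartic Taylor coefficient of $f$ is therefore $-8\kappa / 4! = -\kappa / 3$.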
[guided]
We have arrived at the first place where curvature actually enters the computation. The Jacobi equation (J) only used $J(0) = 0$ to force $J''(0) = 0$ — a "second-order vanishing" that made $f''(0)$ depend solely on $|J'(0)|^2$ and made $f'''(0)$ collapse entirely. To extract the leading curvature contribution we must reach into the third derivative of $J$, which requires differentiating (J) once more along $\gamma$.
How does $\nabla_{dt}$ act on the tensor expression $R(\dot\gamma, J)\dot\gamma$? The Riemann tensor $R$ is a $(1, 3)$-tensor field on $M$, and the expression $R(\dot\gamma, J)\dot\gamma$ is the vector field along $\gamma$ obtained by feeding the three vector arguments $\dot\gamma, J, \dot\gamma$ into $R$ and contracting. The covariant derivative along $\gamma$ is a derivation, so each "factor" — including the tensor $R$ itself — gets differentiated and the results are summed. This produces four terms:
\begin{align*}
\nabla_{dt}\bigl[R(\dot\gamma, J)\dot\gamma\bigr] = \underbrace{(\nabla_{\dot\gamma} R)(\dot\gamma, J)\dot\gamma}_{\text{differentiate }R} + \underbrace{R(\nabla_{dt}\dot\gamma, J)\dot\gamma}_{\text{first slot}} + \underbrace{R(\dot\gamma, \nabla_{dt} J)\dot\gamma}_{\text{second slot}} + \underbrace{R(\dot\gamma, J)\nabla_{dt}\dot\gamma}_{\text{third slot}}.
\end{align*}
Here $\nabla_{\dot\gamma} R$ is the covariant derivative of the tensor field $R$ in the direction $\dot\gamma$, evaluated at $\gamma(t)$.
**Cancellations along the geodesic.** Because $\gamma$ is a geodesic, $\nabla_{dt}\dot\gamma = 0$, killing the first-slot and third-slot terms. The second-slot term simplifies via $\nabla_{dt} J = J'$. Thus
\begin{align*}
\nabla_{dt}\bigl[R(\dot\gamma, J)\dot\gamma\bigr] = (\nabla_{\dot\gamma} R)(\dot\gamma, J)\dot\gamma + R(\dot\gamma, J')\dot\gamma.
\end{align*}
Plugging into $\nabla_{dt}$ of the Jacobi equation $J'' = -R(\dot\gamma, J)\dot\gamma$ gives
\begin{align*}
J'''(t) = -(\nabla_{\dot\gamma} R)(\dot\gamma, J)\dot\gamma - R(\dot\gamma, J')\dot\gamma.
\end{align*}
**Evaluating at $t = 0$.** Substitute $J(0) = 0$, $J'(0) = w$, $\dot\gamma(0) = a$. The first term contains $J(0) = 0$ in the second slot of $(\nabla_a R)(a, \cdot)\,a$ and vanishes by $\mathbb{R}$-linearity — *not* because $\nabla R$ is itself zero (on a generic Riemannian manifold $\nabla R \ne 0$; it vanishes only on locally symmetric spaces), but because the second argument is the zero vector. The second term is $R(a, w)\,a$, which survives. With the minus sign,
\begin{align*}
J'''(0) = -(\nabla_a R)(a, 0)\, a - R(a, w)\, a = 0 - R(a, w)\, a = -R(a, w)\, a. \tag{$\star$}
\end{align*}
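To see what $(\star)$ says about $J$ itself, one can expand $J$ in a parallel orthonormal frame $E_1, \dots, E_n$ along $\gamma$ (a hypothetical auxiliary frame, introduced only for this sketch). Writing $J(t) = \sum_i J^i(t)\, E_i(t)$, covariant derivatives along $\gamma$ become ordinary derivatives of the components, so Taylor's theorem applies componentwise:
\begin{align*}
J^i(t) &= J^i(0) + (J^i)'(0)\, t + \tfrac{1}{2}\, (J^i)''(0)\, t^2 + \tfrac{1}{6}\, (J^i)'''(0)\, t^3 + O(t^4) \\
&= t\, w^i - \tfrac{1}{6}\, \bigl(R(a, w)\, a\bigr)^i\, t^3 + O(t^4), \\
|J(t)|^2 &= \sum_i \bigl(J^i(t)\bigr)^2 = t^2 - \tfrac{1}{3}\, g_p(R(a, w)a, w)\, t^4 + O(t^5).
\end{align*}
Here $w^i$ and $(R(a, w)a)^i$ are components with respect to the frame at $p$. The $t^4$ coefficient $-\tfrac{1}{3}\, g_p(R(a, w)a, w)$ corresponds to a fourth derivative of $-8\, g_p(R(a, w)a, w)$ for $g(J, J)$ at $t = 0$.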
**Quartic coefficient $f^{(4)}(0)$.** Now we substitute into the fourth-derivative formula
\begin{align*}
f^{(4)}(t) = 2\, g(J^{(4)}(t), J(t)) + 8\, g(J'''(t), J'(t)) + 6\, |J''(t)|^2.
\end{align*}
At $t = 0$, the $J^{(4)}$-term is killed by $J(0) = 0$, and the $|J''|^2$-term is killed by $J''(0) = 0$. Only the middle term survives, giving $f^{(4)}(0) = 8\, g_p(J'''(0), w)$. Using $(\star)$,
\begin{align*}
g_p(J'''(0), w) = g_p(-R(a, w)a, w) = -g_p(R(a, w)a, w),
\end{align*}
so
\begin{align*}
f^{(4)}(0) = -8\, g_p(R(a, w)a, w). \tag{$\dagger$}
\end{align*}
Up to order four, this is the only place curvature enters the Taylor coefficients of $f$, and the next step turns $g_p(R(a, w)a, w)$ into the sectional curvature $K(\sigma)$.
[/guided]
[/step]
[step:Express the curvature pairing $g(R(a, w)a, w)$ in terms of the sectional curvature $K(\sigma)$ under the chapter sign convention]
The sectional curvature of the two-plane $\sigma = \operatorname{span}(a, w) \subseteq T_p M$ is, by definition,
\begin{align*}
K(\sigma) = \frac{g_p(R(a, w)a, w)}{|a|^2_{g_p} |w|^2_{g_p} - g_p(a, w)^2},
\end{align*}
under the chapter sign convention $R = -\nabla \circ \nabla$. (That this formula agrees with the geometric sectional curvature is verified in the guided discussion below.) With $|a|_{g_p} = |w|_{g_p} = 1$ and $g_p(a, w) = 0$ by hypothesis, the denominator equals $1 \cdot 1 - 0^2 = 1$, so
\begin{align*}
K(\sigma) = g_p(R(a, w)a, w).
\end{align*}
Substituting into $(\dagger)$,
\begin{align*}
f^{(4)}(0) = -8\, K(\sigma).
\end{align*}
[guided]
We need to convert the curvature pairing $g_p(R(a, w)a, w)$ that appeared in $(\dagger)$ into the sectional curvature $K(\sigma)$. Sign conventions are the central subtlety, so we proceed carefully.
The chapter convention $R = -\nabla \circ \nabla$ defines the curvature endomorphism as
\begin{align*}
R(X, Y) Z = -\nabla_X \nabla_Y Z + \nabla_Y \nabla_X Z + \nabla_{[X, Y]} Z,
\end{align*}
which is the negative of the "standard" convention $R^{\text{std}}(X, Y) Z = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X, Y]} Z$ used in many references. So $R^{\text{chap}} = -R^{\text{std}}$. The standard sectional-curvature formula reads $K(\sigma) = g(R^{\text{std}}(X, Y) Y, X) / (|X|^2 |Y|^2 - g(X, Y)^2)$ for $X, Y$ spanning $\sigma$; substituting $R^{\text{std}} = -R^{\text{chap}}$ flips the sign:
\begin{align*}
K(\sigma) = \frac{-g_p(R^{\text{chap}}(X, Y)Y, X)}{|X|^2 |Y|^2 - g_p(X, Y)^2} = \frac{g_p(R^{\text{chap}}(X, Y)X, Y)}{|X|^2 |Y|^2 - g_p(X, Y)^2},
\end{align*}
where the second equality uses the antisymmetry of $R(X, Y, Z, W) := g(R(X, Y)Z, W)$ in its last two arguments, one of the [Symmetries of the Riemann Curvature Tensor](/theorems/2704). Concretely, $-g(R(X, Y)Y, X) = -R(X, Y, Y, X) = R(X, Y, X, Y) = g(R(X, Y)X, Y)$.
**Plugging in $X = a$, $Y = w$.** By hypothesis $|a|_{g_p} = 1$, $|w|_{g_p} = 1$, $g_p(a, w) = 0$, so the denominator is $1 \cdot 1 - 0 = 1$ and
\begin{align*}
K(\sigma) = g_p(R(a, w)a, w).
\end{align*}
This is exactly the curvature pairing that appeared in $(\dagger)$, so substituting,
\begin{align*}
f^{(4)}(0) = -8\, g_p(R(a, w)a, w) = -8\, K(\sigma).
\end{align*}
**Sanity check on $S^2$.** On the unit sphere with $K \equiv 1$, the normal Jacobi field with $J(0) = 0$, $J'(0) = w$ is well known to be $J(t) = \sin(t)\, E(t)$, where $E$ is parallel along $\gamma$ with $E(0) = w$. Then $|J(t)| = \sin(t) = t - \frac{1}{6} t^3 + o(t^3)$, matching our target formula $|J(t)| = t - \frac{1}{6} K(\sigma) t^3 + o(t^3)$ at $K = 1$. The chapter-sign Jacobi equation reads $J'' = -R(\dot\gamma, J)\dot\gamma$; on $S^2$ we have $J'' = -\sin(t) E = -J$, hence $R(\dot\gamma, J)\dot\gamma = J$. Since $J(0) = 0$, this identity is vacuous at $t = 0$, so we divide by $\sin(t)$ for small $t > 0$ to get $R(\dot\gamma, E)\dot\gamma = E$ and let $t \to 0^+$: this gives $R(a, w)a = w$, and therefore $g_p(R(a, w)a, w) = |w|^2 = 1 = K$.
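A companion check with the opposite sign, under the assumption (not stated in the source) that on the hyperbolic plane with $K \equiv -1$ the corresponding normal Jacobi field is $J(t) = \sinh(t)\, E(t)$, with $E$ parallel along $\gamma$ and $E(0) = w$:
\begin{align*}
J''(t) &= \sinh(t)\, E(t) = J(t), \quad \text{so (J) gives } R(\dot\gamma, J)\dot\gamma = -J, \\
|J(t)| &= \sinh(t) = t + \tfrac{1}{6}\, t^3 + o(t^3).
\end{align*}
This agrees with the target formula $|J(t)| = t - \frac{1}{6} K(\sigma) t^3 + o(t^3)$ at $K = -1$.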
**Why the chapter sign is convenient.** With $R = -\nabla \circ \nabla$, the Jacobi equation reads $J'' + R(\dot\gamma, J)\dot\gamma = 0$, which has the same shape as the standard-convention equation $J'' + R^{\text{std}}(J, \dot\gamma)\dot\gamma = 0$ with the first two slots exchanged. The sectional-curvature formula changes accordingly: chapter $K = g(R(X, Y) X, Y)$ versus standard $K = g(R^{\text{std}}(X, Y) Y, X)$ for orthonormal $X, Y$. The geometric sectional curvature itself is convention-independent; only the bookkeeping changes.
[/guided]
[/step]
[step:Assemble the Taylor expansion of $f$ and extract $|J(t)| = \sqrt{f(t)}$]
Substituting the values $f(0) = f'(0) = 0$, $f''(0) = 2$, $f'''(0) = 0$, $f^{(4)}(0) = -8 K(\sigma)$ into the Taylor formula,
\begin{align*}
f(t) = \frac{2}{2!}\, t^2 + \frac{0}{3!}\, t^3 + \frac{-8 K(\sigma)}{4!}\, t^4 + o(t^4) = t^2 - \frac{1}{3} K(\sigma)\, t^4 + o(t^4).
\end{align*}
Equivalently,
\begin{align*}
f(t) = t^2 \left(1 - \frac{1}{3} K(\sigma)\, t^2 + o(t^2)\right) \quad \text{as } t \to 0^+.
\end{align*}
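For a concrete instance, take the unit-sphere Jacobi field $J(t) = \sin(t)\, E(t)$ with $E$ parallel and $|E| = 1$ (the standard example with $K \equiv 1$), so that $f(t) = \sin^2 t$. Differentiating directly:
\begin{align*}
f'(t) &= \sin 2t, & f''(t) &= 2\cos 2t, & f'''(t) &= -4\sin 2t, & f^{(4)}(t) &= -8\cos 2t, \\
f'(0) &= 0, & f''(0) &= 2, & f'''(0) &= 0, & f^{(4)}(0) &= -8.
\end{align*}
Hence $\sin^2 t = t^2 - \frac{1}{3}\, t^4 + o(t^4)$, which is the general expansion of $f$ specialised to $K(\sigma) = 1$.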
For small $t > 0$, $f(t) > 0$ (the bracketed factor is $1 + o(1)$). Take square roots, using $\sqrt{1 + u} = 1 + \frac{u}{2} + O(u^2)$ as $u \to 0$ with $u = -\frac{1}{3} K(\sigma) t^2 + o(t^2)$:
\begin{align*}
|J(t)| = \sqrt{f(t)} = t\, \sqrt{1 - \frac{1}{3} K(\sigma)\, t^2 + o(t^2)} = t\left(1 - \frac{1}{6} K(\sigma)\, t^2 + o(t^2)\right) = t - \frac{1}{6}\, K(\sigma)\, t^3 + o(t^3),
\end{align*}
which is the claimed expansion. The proof is complete.
[guided]
The square-root step deserves a moment's care because we are dealing with an $o$-asymptotic and need to multiply through by $t$ correctly.
We have $f(t) = t^2 h(t)$, where $h(t) := f(t)/t^2 = 1 - \frac{1}{3} K(\sigma) t^2 + o(t^2)$ as $t \to 0^+$. Since $h(t) \to 1 > 0$, we have $h(t) > 0$ on a small interval $(0, T_0)$. On this interval,
\begin{align*}
|J(t)| = \sqrt{f(t)} = t\, \sqrt{h(t)},
\end{align*}
where $\sqrt{t^2} = t$ because $t > 0$.
Now $\sqrt{1 + u} = 1 + \frac{u}{2} - \frac{u^2}{8} + O(u^3)$ as $u \to 0$. With $u = -\frac{1}{3} K(\sigma) t^2 + o(t^2)$, we have $u = O(t^2)$ and $u^2 = O(t^4) = o(t^2)$ as $t \to 0$. Hence
\begin{align*}
\sqrt{h(t)} = \sqrt{1 + u} = 1 + \frac{u}{2} + O(u^2) = 1 - \frac{1}{6} K(\sigma) t^2 + o(t^2).
\end{align*}
Multiplying by $t$:
\begin{align*}
|J(t)| = t\, \sqrt{h(t)} = t - \frac{1}{6} K(\sigma) t^3 + t \cdot o(t^2) = t - \frac{1}{6} K(\sigma) t^3 + o(t^3),
\end{align*}
since $t \cdot o(t^2) = o(t^3)$.
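As a consistency check, squaring the truncated expansion of $|J(t)|$ should reproduce the expansion of $f$ through order four:
\begin{align*}
\left(t - \tfrac{1}{6} K(\sigma)\, t^3\right)^2 = t^2 - \tfrac{1}{3} K(\sigma)\, t^4 + \tfrac{1}{36} K(\sigma)^2\, t^6 = t^2 - \tfrac{1}{3} K(\sigma)\, t^4 + o(t^4).
\end{align*}
The $t^6$ term is absorbed into the remainder, so the two expansions agree to the order at which they are claimed.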
**Why the cubic is special.** The cubic coefficient $-\frac{1}{6} K(\sigma)$ encodes the leading-order deviation of the geodesic spread from Euclidean behaviour. In flat space ($K \equiv 0$), $|J(t)| = t$ exactly: geodesics fan out linearly. Under positive curvature ($K(\sigma) > 0$) geodesics spread more slowly, $|J(t)| < t$ for small $t > 0$, consistent with the "geodesic convergence" picture on a sphere. Under negative curvature ($K(\sigma) < 0$) they spread faster, $|J(t)| > t$ for small $t > 0$, consistent with geodesics diverging on hyperbolic space.
**Connection to Gauss' Lemma.** By [Jacobi Fields via the Exponential Map](/theorems/2717), $J(t) = (d\exp_p)_{ta}(tw)$. So our expansion can be read off from the second-order Taylor expansion of the metric $g$ in normal coordinates $x$ at $p$,
\begin{align*}
g_{ij}(x) = \delta_{ij} - \frac{1}{3} R_{ikjl}(p)\, x_k x_l + O(|x|^3),
\end{align*}
evaluated at the point $x = ta$ and applied to the direction $w$.
The radial direction is rigid by the [Gauss Lemma — Covariant Form](/theorems/2714) — i.e., the radial component of the metric is exactly Euclidean — and the deviation appears only in the angular component, which is what our expansion captures.
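A sketch of how the angular deviation can be read off, under assumptions that the source does not fix: the metric expansion is evaluated at the point $x = ta$, repeated indices are summed, and $R_{ikjl} = g_p(R(\partial_i, \partial_k)\partial_j, \partial_l)$ (an index convention chosen here to match the chapter sign). In normal coordinates $\exp_p$ is the identity map, so $J(t) = t\, w^i \partial_i$ at $x = ta$, and
\begin{align*}
|J(t)|^2 &= t^2\, g_{ij}(ta)\, w^i w^j \\
&= t^2 \left(|w|^2 - \tfrac{1}{3}\, R_{ikjl}(p)\, w^i a^k w^j a^l\, t^2 + O(t^3)\right) \\
&= t^2 - \tfrac{1}{3}\, g_p(R(w, a)w, a)\, t^4 + O(t^5) = t^2 - \tfrac{1}{3}\, K(\sigma)\, t^4 + O(t^5),
\end{align*}
using $K(\sigma) = g_p(R(X, Y)X, Y)$ with the orthonormal pair $X = w$, $Y = a$. Replacing $w$ by $a$ in the same computation gives $g_p(R(a, a)a, a) = 0$, in line with the radial rigidity described above.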
[/guided]
[/step]