[step:Differentiate $\exp_p$ along the straight-line curve $t \mapsto tv$]
Fix $v \in T_pM$ and let $\alpha_v(t) = tv$ be the straight-line curve through $0 \in T_pM$. By definition, the differential $(d\exp_p)_0: T_0(T_pM) \to T_p M$ sends the velocity $\dot\alpha_v(0)$ to the velocity at $t = 0$ of the composed curve $\exp_p \circ \alpha_v$:
\begin{align*}
(d\exp_p)_0(v) &= \frac{d}{dt}\bigg|_{t=0} \exp_p(tv).
\end{align*}
Here we have used the identification $\iota(v) = \dot\alpha_v(0)$ from the previous step, so $(d\exp_p)_0(v)$ stands for $(d\exp_p)_0(\iota(v))$.
By the [definition of the exponential map](/page/Exponential%20Map), $\exp_p(w) = \gamma_p(1, w)$ for all $w$ in the domain of $\exp_p$, where $\gamma_p(\cdot, w)$ is the geodesic with $\gamma_p(0, w) = p$ and $\dot\gamma_p(0, w) = w$. Substituting $w = tv$:
\begin{align*}
\exp_p(tv) &= \gamma_p(1, tv).
\end{align*}
Now apply the [Geodesic Rescaling](/theorems/2710) lemma, which states $\gamma_p(\lambda s, a) = \gamma_p(s, \lambda a)$. Taking $s = 1$, $\lambda = t$, $a = v$ gives
\begin{align*}
\gamma_p(1, tv) &= \gamma_p(t, v).
\end{align*}
The rescaling lemma applies because, for $|t|$ sufficiently small, $t$ lies in the maximal domain of $\gamma_p(\cdot, v)$, which is an open interval containing $0$; hence both $\gamma_p(1, tv)$ and $\gamma_p(t, v)$ are defined for $t$ in some open interval around $0$. Combining,
\begin{align*}
\exp_p(tv) &= \gamma_p(t, v).
\end{align*}
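As a sanity check of this identity, here is a worked example under assumptions that are not part of the argument above: take $M$ to be the unit sphere $S^n \subset \mathbb{R}^{n+1}$ with the round metric, let $v \neq 0$, and write $|v|$ for the Euclidean norm of $v$. Geodesics of the round sphere are great circles, so
\begin{align*}
\gamma_p(t, v) &= \cos(t|v|)\, p + \sin(t|v|)\, \frac{v}{|v|}.
\end{align*}
Evaluating the same formula at time $1$ with initial velocity $tv$, for $t \neq 0$:
\begin{align*}
\exp_p(tv) = \gamma_p(1, tv) &= \cos(|t||v|)\, p + \sin(|t||v|)\, \frac{tv}{|t||v|} \\
&= \cos(t|v|)\, p + \sin(t|v|)\, \frac{v}{|v|} = \gamma_p(t, v).
\end{align*}
The second equality uses that $\cos$ is even and $\sin$ is odd, so $\sin(|t||v|)\, t/|t| = \sin(t|v|)$. For $t = 0$ both sides equal $p$, so in this illustrative case the identity holds for every $t$.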
[/step]