[guided]We need to compute the variation field, which by definition is the $\partial_s$-derivative of $f$ along the central curve. Unpacking the definition:
\begin{align*}
\tilde J(t) := \frac{\partial f}{\partial s}(t, 0) = df_{(t, 0)}(\partial_s).
\end{align*}
How do we compute $df$? The trick is to factor $f$ through the vector space $T_p M$. Define $\Phi(t, s) := t(a + sw)$; then $f$ is the composition
\begin{align*}
[0, L] \times (-\varepsilon, \varepsilon) \xrightarrow{\Phi} \mathcal{D}_p \xrightarrow{\exp_p} M, \qquad \Phi(t, s) := t(a + sw).
\end{align*}
This factorisation is useful because $\Phi$ lands in the vector space $T_p M$, where derivatives reduce to ordinary partial derivatives in the affine chart, while the only place we have to do "manifold work" is the single application of $d\exp_p$.
Applying the chain rule to a smooth map between manifolds — with $T_p M$ regarded as a smooth manifold and the canonical identification $T_v(T_p M) \cong T_p M$ for any $v \in T_p M$, which is canonical because $T_p M$ is itself a vector space — we get
\begin{align*}
df_{(t, s)}(\partial_s) = (d\exp_p)_{\Phi(t, s)}\bigl(d\Phi_{(t, s)}(\partial_s)\bigr).
\end{align*}
Now we compute the inner factor $d\Phi_{(t, s)}(\partial_s)$. Since $\Phi$ takes values in a vector space, its derivative is just the partial derivative computed in the affine chart. Differentiating $\Phi(t, s) = ta + tsw$ with respect to $s$ at fixed $t$:
\begin{align*}
d\Phi_{(t, s)}(\partial_s) = \frac{\partial}{\partial s}\big|_{s} \bigl(t(a + sw)\bigr) = tw \in T_{t(a + sw)}(T_p M) \cong T_p M.
\end{align*}
The output is a tangent vector at the point $t(a + sw) \in T_p M$, which we identify with $T_p M$ itself via the canonical identification $T_v V \cong V$ for vector spaces $V$.
Substituting back into the chain-rule expression:
\begin{align*}
df_{(t, s)}(\partial_s) = (d\exp_p)_{t(a + sw)}(tw).
\end{align*}
Finally, setting $s = 0$ to evaluate at the central curve:
\begin{align*}
\tilde J(t) = df_{(t, 0)}(\partial_s) = (d\exp_p)_{ta}(tw). \tag{$\ast$}
\end{align*}
We mark this as $(\ast)$ because we will refer back to it in the next step when we verify that $\tilde J$ has the correct initial data.
This is a coordinate-free version of the standard "differentiate the exponential of a curve in the parameter" computation, expressed via the chain rule on the manifold $T_p M$ and pulled through $\exp_p$.
**Why the formula is morally simple.** $\exp_p(t(a + sw))$ is the geodesic from $p$ in direction $a + sw$, traversed for unit time and reparametrised by $t$. As we increase $s$, we tilt the initial velocity by $sw$, and the variation field at parameter $t$ is the difference between the tilted and untilted geodesics at time $t$ — to first order, this is $(d\exp_p)_{ta}$ applied to $tw$ because the input to $\exp_p$ is $\Phi(t, s) = ta + s(tw)$, and the linear part in $s$ is $tw$.[/guided]