[proofplan]
We write the curve increment as $h(t)=\gamma(t)-a$ and use differentiability of $\gamma$ at $0$ to obtain $h(t)/t \to v$. Differentiability of $f$ at $a$ gives a linear approximation whose remainder is $o(|h|)$ as $h \to 0$. Substituting $h=h(t)$, dividing by $t$, and using the boundedness of $|h(t)|/|t|$ near $0$ shows that the remainder contributes zero in the limit. The remaining term converges by linearity and continuity of $Df_a$, giving $(f \circ \gamma)'(0)=Df_a(v)=D_v f(a)$.
[/proofplan]
[step:Choose a neighbourhood on which the linear approximation of $f$ is available]
For $x_0 \in \mathbb{R}^n$ and $r > 0$, let $B(x_0,r) = \{x \in \mathbb{R}^n : |x-x_0| < r\}$ denote the open ball. Since $U$ is open and $a \in U$, choose $\delta > 0$ such that $B(a,\delta) \subset U$. Let
\begin{align*}
L: \mathbb{R}^n \to \mathbb{R}
\end{align*}
denote the [linear map](/page/Linear%20Map) $L = Df_a$. Define the remainder map
\begin{align*}
\rho: B(0,\delta) \to \mathbb{R}
\end{align*}
by
\begin{align*}
\rho(h) = f(a+h)-f(a)-L(h).
\end{align*}
By differentiability of $f$ at $a$,
\begin{align*}
\lim_{h \to 0} \frac{\rho(h)}{|h|} = 0,
\end{align*}
with the quotient understood for $h \ne 0$.
For later use, define
\begin{align*}
\alpha: B(0,\delta) \to \mathbb{R}
\end{align*}
by $\alpha(0)=0$ and, for $h \ne 0$,
\begin{align*}
\alpha(h)=\frac{\rho(h)}{|h|}.
\end{align*}
Then $\alpha(h)\to 0$ as $h\to 0$, so $\alpha$ is continuous at $0$, and for every $h \in B(0,\delta)$ (including $h=0$, where $\rho(0)=0$),
\begin{align*}
\rho(h)=\alpha(h)|h|.
\end{align*}
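As a concrete illustration (the function and base point are assumptions chosen for this example, not part of the statement), take $n=2$, $U=\mathbb{R}^2$, $f(x,y)=xy$ and $a=(1,2)$, so that $L(h)=2h_1+h_2$ for $h=(h_1,h_2)$. A direct expansion then gives
\begin{align*}
\rho(h) &= (1+h_1)(2+h_2)-2-(2h_1+h_2) = h_1h_2, \\
|\alpha(h)| &= \frac{|h_1h_2|}{|h|} \le \frac{|h|}{2} \quad (h \ne 0).
\end{align*}
In this example the remainder is quadratic in $h$, which is one way to see that $\alpha(h)\to 0$ as $h\to 0$.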
[/step]
[step:Translate differentiability of the curve into a first order estimate]
Define the curve increment map
\begin{align*}
h: (-\varepsilon,\varepsilon) \to \mathbb{R}^n
\end{align*}
by
\begin{align*}
h(t)=\gamma(t)-a.
\end{align*}
Since $\gamma(0)=a$, we have $h(0)=0$. Since $\gamma$ is differentiable at $0$ and $\gamma'(0)=v$,
\begin{align*}
\lim_{t \to 0} \frac{h(t)}{t}=v.
\end{align*}
In particular, since $h(t)=t\cdot\bigl(h(t)/t\bigr)$ for $t\ne 0$ and $h(0)=0$, we have $h(t)\to 0$ as $t\to 0$. Hence $h(t)\in B(0,\delta)$ for all sufficiently small $|t|$.
The same convergence also gives a linear bound for $h(t)$. There exists $\eta \in (0,\varepsilon)$ such that $h(t)\in B(0,\delta)$ whenever $|t|<\eta$ and, for $0<|t|<\eta$,
\begin{align*}
\left|\frac{h(t)}{t}-v\right| \le 1.
\end{align*}
Therefore, for $0<|t|<\eta$,
\begin{align*}
\frac{|h(t)|}{|t|} \le |v|+1.
\end{align*}
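To see the estimate in a concrete case (an assumed example, not part of the statement), take $n=2$, $a=(1,2)$, $\varepsilon=1$ and $\gamma(t)=(1+t,\,2+t+t^2)$, so that $v=\gamma'(0)=(1,1)$. Then, for $0<|t|<1$,
\begin{align*}
h(t) &= (t,\,t+t^2), \\
\left|\frac{h(t)}{t}-v\right| &= |(0,\,t)| = |t| \le 1, \\
\frac{|h(t)|}{|t|} &= \sqrt{1+(1+t)^2} \le \sqrt{2}+1 = |v|+1,
\end{align*}
so in this example any $\eta\in(0,1)$ satisfies the two displayed bounds.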
[guided]
The point of introducing $h(t)=\gamma(t)-a$ is to convert the curve statement into the standard increment notation used in differentiability of $f$ at $a$. Formally, define
\begin{align*}
h: (-\varepsilon,\varepsilon) \to \mathbb{R}^n
\end{align*}
by
\begin{align*}
h(t)=\gamma(t)-a.
\end{align*}
Because $\gamma(0)=a$, this gives $h(0)=0$. The hypothesis $\gamma'(0)=v$ means exactly that
\begin{align*}
\lim_{t \to 0} \frac{\gamma(t)-\gamma(0)}{t}=v.
\end{align*}
Since $\gamma(0)=a$, this is the same as
\begin{align*}
\lim_{t \to 0} \frac{h(t)}{t}=v.
\end{align*}
This convergence has two consequences. First, multiplying by $t$ shows $h(t)\to 0$ as $t\to 0$, so for all sufficiently small $|t|$ the point $a+h(t)=\gamma(t)$ lies in the ball $B(a,\delta)$ where the differentiability expansion of $f$ is valid. Second, the ratio $|h(t)|/|t|$ is bounded near $0$. Indeed, choose $\eta \in (0,\varepsilon)$ such that, whenever $0<|t|<\eta$,
\begin{align*}
\left|\frac{h(t)}{t}-v\right| \le 1.
\end{align*}
By the triangle inequality in $\mathbb{R}^n$, for $0<|t|<\eta$,
\begin{align*}
\frac{|h(t)|}{|t|} = \left|\frac{h(t)}{t}\right| \le \left|\frac{h(t)}{t}-v\right|+|v| \le |v|+1.
\end{align*}
This boundedness is what will make the remainder term from the differentiability of $f$ vanish after division by $t$.
[/guided]
[/step]
[step:Substitute the curve increment into the differentiability expansion]
For $0<|t|<\eta$ with $|t|$ small enough that $h(t)\in B(0,\delta)$, the definition of $\rho$ gives
\begin{align*}
f(\gamma(t))-f(a)=f(a+h(t))-f(a)=L(h(t))+\rho(h(t)).
\end{align*}
Since $f(\gamma(0))=f(a)$, this is
\begin{align*}
(f\circ\gamma)(t)-(f\circ\gamma)(0)=L(h(t))+\rho(h(t)).
\end{align*}
Dividing by $t$ gives
\begin{align*}
\frac{(f\circ\gamma)(t)-(f\circ\gamma)(0)}{t}
=
L\left(\frac{h(t)}{t}\right)+\frac{\rho(h(t))}{t},
\end{align*}
where the first term uses linearity of $L$ in the form $L(h(t))/t = L\bigl(h(t)/t\bigr)$.
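As a sanity check on this identity, consider the assumed example $f(x,y)=xy$, $a=(1,2)$, $\gamma(t)=(1+t,\,2+t+t^2)$ (chosen here for illustration only). In that case $L(h)=2h_1+h_2$, $h(t)=(t,\,t+t^2)$ and $\rho(h)=f(a+h)-f(a)-L(h)=h_1h_2$, and both sides can be computed directly:
\begin{align*}
f(\gamma(t))-f(a) &= (1+t)(2+t+t^2)-2 = 3t+2t^2+t^3, \\
L(h(t))+\rho(h(t)) &= (3t+t^2)+(t^2+t^3) = 3t+2t^2+t^3, \\
\frac{(f\circ\gamma)(t)-(f\circ\gamma)(0)}{t} &= (3+t)+(t+t^2) = L\left(\frac{h(t)}{t}\right)+\frac{\rho(h(t))}{t} \quad (t\ne 0).
\end{align*}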
[/step]
[step:Show the remainder term vanishes after division by $t$]
For $0<|t|<\eta$ with $h(t)\in B(0,\delta)$, applying $\rho(h)=\alpha(h)|h|$ at $h=h(t)$ gives
\begin{align*}
\left|\frac{\rho(h(t))}{t}\right|
=
|\alpha(h(t))|\frac{|h(t)|}{|t|}.
\end{align*}
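Since $h(t)\to 0$ as $t\to 0$ and $\alpha(h)\to 0=\alpha(0)$ as $h\to 0$, the map $\alpha$ is continuous at $0$, and therefore $\alpha(h(t))\to 0$ as $t\to 0$ (this also covers those $t$ with $h(t)=0$). Since $|h(t)|/|t|\le |v|+1$ for $0<|t|<\eta$, the product is bounded by $(|v|+1)\,|\alpha(h(t))|$, which tends to $0$. Hence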
\begin{align*}
\lim_{t\to 0}\frac{\rho(h(t))}{t}=0.
\end{align*}
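For a concrete instance of this limit (an assumed example, not part of the statement), take $f(x,y)=xy$, $a=(1,2)$ and $\gamma(t)=(1+t,\,2+t+t^2)$, for which $h(t)=(t,\,t+t^2)$ and $\rho(h)=h_1h_2$. Then, for $0<|t|<1$,
\begin{align*}
\left|\frac{\rho(h(t))}{t}\right| = \left|\frac{t\,(t+t^2)}{t}\right| = |t|\,|1+t| \le 2|t| \to 0 \quad (t\to 0).
\end{align*}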
[/step]
[step:Take the limit of the difference quotient and identify the directional derivative]
Because $L:\mathbb{R}^n\to\mathbb{R}$ is a linear map on a finite-dimensional space, it is continuous. Since $h(t)/t \to v$,
\begin{align*}
\lim_{t\to 0} L\left(\frac{h(t)}{t}\right)=L(v).
\end{align*}
Combining this with the vanishing of the remainder term,
\begin{align*}
\lim_{t\to 0}\frac{(f\circ\gamma)(t)-(f\circ\gamma)(0)}{t}=L(v).
\end{align*}
Thus $f\circ\gamma$ is differentiable at $0$ and
\begin{align*}
(f\circ\gamma)'(0)=L(v)=Df_a(v)=D_v f(a).
\end{align*}
This is the desired formula.
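The formula can be checked by hand on a small example (the function, point, and curve below are assumptions chosen for illustration). With $f(x,y)=xy$, $a=(1,2)$ and $\gamma(t)=(1+t,\,2+t+t^2)$, we have $v=\gamma'(0)=(1,1)$ and $Df_a(h)=2h_1+h_2$, so
\begin{align*}
(f\circ\gamma)(t) &= (1+t)(2+t+t^2) = 2+3t+2t^2+t^3, \\
(f\circ\gamma)'(0) &= 3, \\
Df_a(v) &= 2\cdot 1+1\cdot 1 = 3.
\end{align*}
The two values agree, as the formula predicts.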
[/step]