[proofplan]
Represent tangent vectors by equivalence classes of smooth curves through $p$, where equivalence is equality of coordinate velocities in a chart. The key point is that applying $F$ to two equivalent curves produces two curves through $F(p)$ whose coordinate velocities agree, because the coordinate expressions are related by the Euclidean chain rule. Once this is established, the same computation gives the coordinate formula for $dF_p$, and linearity follows by transporting the Euclidean [vector space](/page/Vector%20Space) structure through the coordinate identifications.
[/proofplan]
[step:Choose charts and express the pushed forward curves in coordinates]
Let $(V,\psi)$ be a chart about $F(p)$. Since $F$ is continuous, $F^{-1}(V)$ is an open neighborhood of $p$, so we may choose a chart $(U,\varphi)$ about $p$ with $F(U) \subset V$. Define the coordinate representative of $F$ on these charts as the smooth map
\begin{align*}
\widehat{F}: \varphi(U) &\to \psi(V) \\
x &\mapsto \psi(F(\varphi^{-1}(x))).
\end{align*}
Thus $\widehat{F} = \psi \circ F \circ \varphi^{-1}$.
Let $\gamma: I \to M$ be a smooth curve, where $I \subset \mathbb{R}$ is an open interval with $0 \in I$, $\gamma(0)=p$, and, after replacing $I$ by a smaller open interval containing $0$ if necessary, $\gamma(I) \subset U$. Define its coordinate velocity by
\begin{align*}
v_\gamma := \frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma)(t) \in \mathbb{R}^{\dim M}.
\end{align*}
The pushed forward curve is the smooth curve
\begin{align*}
F \circ \gamma: I &\to N \\
t &\mapsto F(\gamma(t)),
\end{align*}
and its coordinate expression in the chart $(V,\psi)$ is
\begin{align*}
\psi \circ F \circ \gamma
= \widehat{F} \circ \varphi \circ \gamma.
\end{align*}
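As an illustration only (the specific choices here are assumptions, not part of the statement being proved), take $M = N = \mathbb{R}^2$ with the global identity charts $\varphi = \psi = \mathrm{id}_{\mathbb{R}^2}$, the map $F(x,y) = (x^2 - y^2,\, 2xy)$, the point $p = (1,0)$, and the curve $\gamma(t) = (1+t,\, t)$. Then $\widehat{F} = F$ and $v_\gamma = (1,1)$, and the coordinate expression of the pushed forward curve is
\begin{align*}
(\psi \circ F \circ \gamma)(t)
&= \big((1+t)^2 - t^2,\; 2(1+t)t\big) \\
&= \big(1 + 2t,\; 2t + 2t^2\big),
\end{align*}
whose velocity at $t=0$ is $(2,2)$.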
[/step]
[step:Show that equivalent curves have equivalent pushed forward curves]
Suppose $\gamma_1: I_1 \to M$ and $\gamma_2: I_2 \to M$ are smooth curves through $p$, where $I_1,I_2 \subset \mathbb{R}$ are open intervals containing $0$, and suppose that they represent the same tangent vector in $T_pM$. By the curve definition of the tangent space, this means that their coordinate velocities in the chart $(U,\varphi)$ agree:
\begin{align*}
\frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma_1)(t)
=
\frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma_2)(t).
\end{align*}
Let this common vector be denoted by $v \in \mathbb{R}^{\dim M}$.
The coordinate expressions of the pushed forward curves are
\begin{align*}
\psi \circ F \circ \gamma_i
=
\widehat{F} \circ \varphi \circ \gamma_i,
\qquad i \in \{1,2\}.
\end{align*}
Applying the Euclidean chain rule to the smooth maps $\widehat{F}$ and $\varphi \circ \gamma_i$ gives
\begin{align*}
\frac{d}{dt}\Big|_{t=0}(\psi \circ F \circ \gamma_i)(t)
=
D\widehat{F}_{\varphi(p)}
\left(
\frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma_i)(t)
\right),
\qquad i \in \{1,2\}.
\end{align*}
Since the coordinate velocities of $\gamma_1$ and $\gamma_2$ both equal $v$, we obtain
\begin{align*}
\frac{d}{dt}\Big|_{t=0}(\psi \circ F \circ \gamma_1)(t)
=
D\widehat{F}_{\varphi(p)}(v)
=
\frac{d}{dt}\Big|_{t=0}(\psi \circ F \circ \gamma_2)(t).
\end{align*}
Thus $F \circ \gamma_1$ and $F \circ \gamma_2$ represent the same tangent vector in $T_{F(p)}N$. Therefore the rule $dF_p([\gamma]) := [F \circ \gamma]$ is independent of the chosen representative $\gamma$.
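A small sanity check, under assumptions chosen only for illustration: let $M = N = \mathbb{R}^2$ with identity charts, $F(x,y) = (x^2 - y^2,\, 2xy)$, and $p = (1,0)$. The curves $\gamma_1(t) = (1+t,\, t)$ and $\gamma_2(t) = (1 + t + t^2,\, \sin t)$ are different, but both pass through $p$ at $t=0$ with coordinate velocity $v = (1,1)$, so they represent the same tangent vector. Differentiating the pushed forward curves at $t=0$ gives
\begin{align*}
\frac{d}{dt}\Big|_{t=0}(F \circ \gamma_1)(t)
&= \frac{d}{dt}\Big|_{t=0}\big(1 + 2t,\; 2t + 2t^2\big)
= (2,2), \\
\frac{d}{dt}\Big|_{t=0}(F \circ \gamma_2)(t)
&= \frac{d}{dt}\Big|_{t=0}\big((1+t+t^2)^2 - \sin^2 t,\; 2(1+t+t^2)\sin t\big) \\
&= \big(2(1+t+t^2)(1+2t) - 2\sin t\cos t,\; 2(1+2t)\sin t + 2(1+t+t^2)\cos t\big)\Big|_{t=0} \\
&= (2,2).
\end{align*}
The two velocities agree, as the argument above predicts.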
[guided]
We need to prove that $dF_p([\gamma]) := [F \circ \gamma]$ does not depend on the curve chosen to represent the tangent vector. Let $\gamma_1: I_1 \to M$ and $\gamma_2: I_2 \to M$ be smooth curves through $p$, with $0 \in I_1 \cap I_2$, and suppose $[\gamma_1] = [\gamma_2]$ in $T_pM$. By the curve model of tangent vectors, equality means equality of coordinate velocities in any chart about $p$; using $(U,\varphi)$, this is
\begin{align*}
\frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma_1)(t)
=
\frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma_2)(t).
\end{align*}
Denote this common vector by $v \in \mathbb{R}^{\dim M}$.
Now apply $F$ to both curves and write the result in the chart $(V,\psi)$ about $F(p)$. Since $F(U) \subset V$, the coordinate expression is legitimate near $t=0$, and we have
\begin{align*}
\psi \circ F \circ \gamma_i
=
(\psi \circ F \circ \varphi^{-1}) \circ (\varphi \circ \gamma_i)
=
\widehat{F} \circ \varphi \circ \gamma_i,
\qquad i \in \{1,2\}.
\end{align*}
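The map $\widehat{F}: \varphi(U) \to \psi(V)$ is a smooth map between open subsets of Euclidean spaces; this is precisely what smoothness of $F$ means in the charts $(U,\varphi)$ and $(V,\psi)$. Likewise, $\varphi \circ \gamma_i$ is a smooth Euclidean curve near $t=0$. Hence the Euclidean chain rule applies to $\widehat{F} \circ (\varphi \circ \gamma_i)$ and gives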
\begin{align*}
\frac{d}{dt}\Big|_{t=0}(\psi \circ F \circ \gamma_i)(t)
=
D\widehat{F}_{\varphi(p)}
\left(
\frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma_i)(t)
\right).
\end{align*}
Substituting the equality of the two original coordinate velocities gives
\begin{align*}
\frac{d}{dt}\Big|_{t=0}(\psi \circ F \circ \gamma_1)(t)
=
D\widehat{F}_{\varphi(p)}(v)
=
\frac{d}{dt}\Big|_{t=0}(\psi \circ F \circ \gamma_2)(t).
\end{align*}
This is exactly the [equivalence relation](/page/Equivalence%20Relation) defining equality of tangent vectors in $T_{F(p)}N$. Therefore $[F \circ \gamma_1]=[F \circ \gamma_2]$, so $dF_p$ is well-defined.
[/guided]
[/step]
[step:Compute the coordinate representative of the differential]
Define the coordinate identification at $p$ induced by $\varphi$ as the map
\begin{align*}
\Theta_\varphi: T_pM &\to \mathbb{R}^{\dim M} \\
[\gamma] &\mapsto \frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma)(t),
\end{align*}
and define the coordinate identification at $F(p)$ induced by $\psi$ as the map
\begin{align*}
\Theta_\psi: T_{F(p)}N &\to \mathbb{R}^{\dim N} \\
[\eta] &\mapsto \frac{d}{dt}\Big|_{t=0}(\psi \circ \eta)(t).
\end{align*}
For any $[\gamma] \in T_pM$, the previous computation gives
\begin{align*}
\Theta_\psi(dF_p([\gamma]))
&=
\Theta_\psi([F \circ \gamma]) \\
&=
\frac{d}{dt}\Big|_{t=0}(\psi \circ F \circ \gamma)(t) \\
&=
D\widehat{F}_{\varphi(p)}
\left(
\frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma)(t)
\right) \\
&=
D\widehat{F}_{\varphi(p)}(\Theta_\varphi([\gamma])).
\end{align*}
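Since $\Theta_\varphi$ is a bijection (it is injective by the definition of the equivalence relation, and each $v \in \mathbb{R}^{\dim M}$ is the coordinate velocity of the curve $t \mapsto \varphi^{-1}(\varphi(p) + tv)$), composing with $\Theta_\varphi^{-1}$ gives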
\begin{align*}
\Theta_\psi \circ dF_p \circ \Theta_\varphi^{-1}
=
D\widehat{F}_{\varphi(p)}
=
D(\psi \circ F \circ \varphi^{-1})_{\varphi(p)}.
\end{align*}
Thus the coordinate representative of $dF_p$ is the total derivative of $\psi \circ F \circ \varphi^{-1}$ at $\varphi(p)$, and its matrix in the standard bases is the Jacobian matrix
\begin{align*}
J(\psi \circ F \circ \varphi^{-1})_{\varphi(p)}.
\end{align*}
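For a concrete instance, assume for illustration that $M = N = \mathbb{R}^2$ with identity charts, $F(x,y) = (x^2 - y^2,\, 2xy)$, and $p = (1,0)$. Then $\psi \circ F \circ \varphi^{-1} = F$ and
\begin{align*}
J(F)_{(x,y)}
=
\begin{pmatrix}
2x & -2y \\
2y & 2x
\end{pmatrix},
\qquad
J(F)_{(1,0)}
=
\begin{pmatrix}
2 & 0 \\
0 & 2
\end{pmatrix}.
\end{align*}
The tangent vector with coordinate velocity $(1,1)$, represented for example by $\gamma(t) = (1+t,\, t)$, is therefore sent to the vector with coordinate velocity $(2,2)$. This matches the direct computation $\frac{d}{dt}\big|_{t=0}(F \circ \gamma)(t) = \frac{d}{dt}\big|_{t=0}(1 + 2t,\, 2t + 2t^2) = (2,2)$.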
[guided]
The coordinate representative of a map between tangent spaces is obtained by translating tangent vectors into Euclidean coordinate velocities before and after applying the map. The chart $(U,\varphi)$ gives
\begin{align*}
\Theta_\varphi: T_pM &\to \mathbb{R}^{\dim M} \\
[\gamma] &\mapsto \frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma)(t),
\end{align*}
and the chart $(V,\psi)$ gives
\begin{align*}
\Theta_\psi: T_{F(p)}N &\to \mathbb{R}^{\dim N} \\
[\eta] &\mapsto \frac{d}{dt}\Big|_{t=0}(\psi \circ \eta)(t).
\end{align*}
These are the coordinate identifications of the tangent spaces with Euclidean spaces.
Let $[\gamma] \in T_pM$. Applying $dF_p$ sends it to $[F \circ \gamma]$. Applying $\Theta_\psi$ then means taking the coordinate velocity of $F \circ \gamma$ in the target chart:
\begin{align*}
\Theta_\psi(dF_p([\gamma]))
=
\frac{d}{dt}\Big|_{t=0}(\psi \circ F \circ \gamma)(t).
\end{align*}
Because $\psi \circ F \circ \gamma = \widehat{F} \circ \varphi \circ \gamma$, the Euclidean chain rule gives
\begin{align*}
\Theta_\psi(dF_p([\gamma]))
=
D\widehat{F}_{\varphi(p)}
\left(
\frac{d}{dt}\Big|_{t=0}(\varphi \circ \gamma)(t)
\right)
=
D\widehat{F}_{\varphi(p)}(\Theta_\varphi([\gamma])).
\end{align*}
Since this holds for every tangent vector $[\gamma] \in T_pM$ and $\Theta_\varphi$ is a bijection, the map represented in coordinates is precisely
\begin{align*}
\Theta_\psi \circ dF_p \circ \Theta_\varphi^{-1}
=
D\widehat{F}_{\varphi(p)}
=
D(\psi \circ F \circ \varphi^{-1})_{\varphi(p)}.
\end{align*}
The total derivative is a [linear map](/page/Linear%20Map) between Euclidean spaces, and its standard matrix is the Jacobian matrix
\begin{align*}
J(\psi \circ F \circ \varphi^{-1})_{\varphi(p)}.
\end{align*}
This proves the claimed coordinate formula.
[/guided]
[/step]
[step:Deduce linearity from the coordinate formula]
The map
\begin{align*}
D(\psi \circ F \circ \varphi^{-1})_{\varphi(p)}: \mathbb{R}^{\dim M} \to \mathbb{R}^{\dim N}
\end{align*}
is linear by definition of the total derivative in Euclidean space. The coordinate identifications $\Theta_\varphi$ and $\Theta_\psi$ are linear isomorphisms, because the vector space structures on $T_pM$ and $T_{F(p)}N$ are defined by transporting the Euclidean structures through these bijections. Since
\begin{align*}
dF_p
=
\Theta_\psi^{-1}
\circ
D(\psi \circ F \circ \varphi^{-1})_{\varphi(p)}
\circ
\Theta_\varphi,
\end{align*}
the map $dF_p: T_pM \to T_{F(p)}N$ is a composition of linear maps. Hence $dF_p$ is linear. Combining this with well-definedness and the coordinate formula proves the theorem.
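Linearity can be checked by hand in a simple case, under assumptions made only for illustration: $M = N = \mathbb{R}^2$ with identity charts, $F(x,y) = (x^2 - y^2,\, 2xy)$, and $p = (1,0)$, so that $DF_p$ is multiplication by $2$. Let $v = (1,1)$, $w = (0,1)$, and $a,b \in \mathbb{R}$. The vector $av + bw = (a,\, a+b)$ is the coordinate velocity of the curve $\sigma(t) = (1 + at,\, (a+b)t)$, and
\begin{align*}
\frac{d}{dt}\Big|_{t=0}(F \circ \sigma)(t)
&= \frac{d}{dt}\Big|_{t=0}\big((1+at)^2 - (a+b)^2t^2,\; 2(1+at)(a+b)t\big) \\
&= \big(2a,\; 2(a+b)\big) \\
&= a\,(2,2) + b\,(0,2) \\
&= a\,DF_p(v) + b\,DF_p(w).
\end{align*}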
[/step]