[proofplan]
We use the curve-class definition of tangent vectors. A tangent vector $v \in T_pM$ is represented by a smooth curve $\gamma$ with $\gamma(0)=p$, and the differential of a smooth map sends the class $[\gamma]$ to the class of the composed curve. Applying this definition to $G \circ F$ directly, and then to $F$ followed by $G$, yields the same curve class $[G\circ F\circ \gamma]$ in $T_{G(F(p))}P$. The identity statement follows from the same definition, because composing a curve with $\operatorname{id}_M$ leaves the curve unchanged.
[/proofplan]
[step:Evaluate both sides on an arbitrary tangent vector represented by a curve]
Fix $p \in M$. Let $v \in T_pM$ be arbitrary, and choose $\varepsilon > 0$ and a smooth curve
\begin{align*}
\gamma : (-\varepsilon,\varepsilon) \to M
\end{align*}
such that $\gamma(0)=p$ and $v=[\gamma]$ in the curve-class model of $T_pM$.
Define the smooth composition
\begin{align*}
H : M &\to P \\
x &\mapsto G(F(x)).
\end{align*}
By the definition of the differential using curve classes,
\begin{align*}
dH_p(v)=d(G\circ F)_p([\gamma])=[H\circ \gamma]=[G\circ F\circ \gamma].
\end{align*}
On the other hand, the differential of $F$ at $p$ sends
\begin{align*}
dF_p([\gamma])=[F\circ \gamma]\in T_{F(p)}N,
\end{align*}
because $F\circ \gamma$ is a smooth curve in $N$ with $(F\circ \gamma)(0)=F(\gamma(0))=F(p)$. Applying the differential of $G$ at $F(p)$ to this class gives
\begin{align*}
dG_{F(p)}(dF_p([\gamma]))
= dG_{F(p)}([F\circ \gamma])
= [G\circ F\circ \gamma].
\end{align*}
Thus
\begin{align*}
d(G\circ F)_p(v)=\bigl(dG_{F(p)}\circ dF_p\bigr)(v).
\end{align*}
[guided]
We prove equality of two linear maps
\begin{align*}
d(G\circ F)_p,\quad dG_{F(p)}\circ dF_p : T_pM \to T_{G(F(p))}P
\end{align*}
by evaluating them on an arbitrary tangent vector. In the curve-class model, a vector $v \in T_pM$ is represented by a smooth curve through $p$. Thus choose $\varepsilon > 0$ and a smooth curve
\begin{align*}
\gamma : (-\varepsilon,\varepsilon) \to M
\end{align*}
with $\gamma(0)=p$ and $v=[\gamma]$.
Let
\begin{align*}
H : M &\to P \\
x &\mapsto G(F(x))
\end{align*}
denote the composition $G\circ F$. By definition, the differential of $H$ sends the curve class $[\gamma]$ to the curve class of $H\circ\gamma$. Therefore
\begin{align*}
d(G\circ F)_p(v)
= dH_p([\gamma])
= [H\circ \gamma]
= [G\circ F\circ \gamma].
\end{align*}
Now evaluate the right-hand side. First, the differential of $F$ sends $[\gamma]$ to the tangent vector represented by the curve $F\circ\gamma$:
\begin{align*}
dF_p([\gamma])=[F\circ \gamma].
\end{align*}
This class lies in $T_{F(p)}N$ because
\begin{align*}
(F\circ\gamma)(0)=F(\gamma(0))=F(p).
\end{align*}
Next, applying $dG_{F(p)}$ sends this class to the class of the further composition with $G$:
\begin{align*}
dG_{F(p)}\bigl(dF_p([\gamma])\bigr)
= dG_{F(p)}([F\circ \gamma])
= [G\circ (F\circ \gamma)]
= [G\circ F\circ \gamma].
\end{align*}
Both sides therefore send the same arbitrary vector $v=[\gamma]$ to the same element $[G\circ F\circ \gamma]$ of $T_{G(F(p))}P$.
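As a sanity check on this computation, consider a minimal worked instance under assumptions that are not part of the argument above: take $M=\mathbb{R}$, $N=\mathbb{R}^2$, $P=\mathbb{R}$, identify each curve class with the velocity of a representative at $0$, and use the hypothetical maps $F(t)=(t,t^2)$ and $G(x,y)=xy$, chosen only for illustration. For $p,a\in\mathbb{R}$ let $\gamma(s)=p+as$, so that $\gamma(0)=p$ and $[\gamma]$ corresponds to the velocity $a$. The left-hand side is read off from
\begin{align*}
(G\circ F\circ \gamma)(s) &= (p+as)\,(p+as)^2 = (p+as)^3, \\
\left.\frac{d}{ds}\right|_{s=0}(G\circ F\circ \gamma)(s) &= 3p^2a.
\end{align*}
For the right-hand side, the curve $F\circ\gamma$ has velocity
\begin{align*}
\left.\frac{d}{ds}\right|_{s=0}(F\circ \gamma)(s)
= \left.\frac{d}{ds}\right|_{s=0}\bigl(p+as,\,(p+as)^2\bigr)
= (a,\,2pa)
\end{align*}
at $F(p)=(p,p^2)$, and under the same identification $dG_{(x,y)}$ sends a velocity $(u,w)$ to $yu+xw$. Hence
\begin{align*}
dG_{F(p)}(a,\,2pa) = p^2\cdot a + p\cdot 2pa = 3p^2a.
\end{align*}
Both computations give the same velocity $3p^2a$, as the curve-class argument predicts in this special case.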
[/guided]
[/step]
[step:Conclude equality of the two differentials]
Since the tangent vector $v \in T_pM$ was arbitrary, the two linear maps
\begin{align*}
d(G\circ F)_p,\quad dG_{F(p)}\circ dF_p : T_pM \to T_{G(F(p))}P
\end{align*}
agree on every element of $T_pM$. Hence
\begin{align*}
d(G\circ F)_p=dG_{F(p)}\circ dF_p.
\end{align*}
[/step]
[step:Apply the curve-class definition to the identity map]
Let $M$ be a smooth manifold, let $p \in M$, and let $v \in T_pM$ be arbitrary. Choose $\varepsilon > 0$ and a smooth curve
\begin{align*}
\gamma : (-\varepsilon,\varepsilon) \to M
\end{align*}
with $\gamma(0)=p$ and $v=[\gamma]$. By the curve-class definition of the differential,
\begin{align*}
d(\operatorname{id}_M)_p(v)
= d(\operatorname{id}_M)_p([\gamma])
= [\operatorname{id}_M\circ \gamma]
= [\gamma]
= v.
\end{align*}
Thus $d(\operatorname{id}_M)_p$ agrees with $\operatorname{id}_{T_pM}$ on every $v \in T_pM$, so
\begin{align*}
d(\operatorname{id}_M)_p=\operatorname{id}_{T_pM}.
\end{align*}
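The two statements are often used together. As a brief sketch, under the extra assumption (not made in the statement above) that $F : M \to N$ is a diffeomorphism with smooth inverse $F^{-1} : N \to M$, the composition rule and the identity rule combine to give
\begin{align*}
d(F^{-1})_{F(p)}\circ dF_p &= d(F^{-1}\circ F)_p = d(\operatorname{id}_M)_p = \operatorname{id}_{T_pM}, \\
dF_p\circ d(F^{-1})_{F(p)} &= d(F\circ F^{-1})_{F(p)} = d(\operatorname{id}_N)_{F(p)} = \operatorname{id}_{T_{F(p)}N}.
\end{align*}
In that situation $dF_p$ is a linear isomorphism with inverse $d(F^{-1})_{F(p)}$.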
[/step]