[proofplan]
We first establish three preliminary facts about any derivation $l \in \mathrm{Der}_p(M)$: it annihilates constants, it is local (determined by the germ at $p$), and it passes through a chart at $p$. Localisation reduces the problem to computing $\dim \mathrm{Der}_0(\mathbb{R}^n)$. There we exhibit the partial derivatives $\partial_{x_i}|_0$ as $n$ linearly independent derivations, and we prove they span by writing any smooth function as a Taylor-like expansion $f(x) = f(0) + \sum_i x_i g_i(x)$ valid on a convex neighbourhood of $0$ and applying the Leibniz rule to each summand.
[/proofplan]
[step:Show that every derivation annihilates constant functions]
Let $l \in \mathrm{Der}_p(M)$. Write $\mathbf{1} \in C^\infty(M)$ for the constant function with value $1$. Applying the Leibniz rule to $\mathbf{1} \cdot \mathbf{1} = \mathbf{1}$:
\begin{align*}
l(\mathbf{1}) = l(\mathbf{1} \cdot \mathbf{1}) = \mathbf{1}(p) \cdot l(\mathbf{1}) + \mathbf{1}(p) \cdot l(\mathbf{1}) = 2\, l(\mathbf{1}),
\end{align*}
so $l(\mathbf{1}) = 0$. For a general constant function $c \cdot \mathbf{1}$ with $c \in \mathbb{R}$, $\mathbb{R}$-linearity of $l$ gives $l(c \cdot \mathbf{1}) = c\, l(\mathbf{1}) = 0$.
[/step]
[step:Prove locality: $l$ depends only on the germ of $f$ at $p$]
[claim:If $f, g \in C^\infty(M)$ agree on an open neighbourhood of $p$, then $l(f) = l(g)$]
[proof]
By linearity of $l$ it suffices to show that $l(h) = 0$ whenever $h \in C^\infty(M)$ vanishes on an open neighbourhood $O \ni p$.
Let $h \in C^\infty(M)$ with $h \equiv 0$ on an open $O \ni p$. Apply the [Bump Functions Exist](/theorems/1513) theorem to the point $p$ and the open set $O$: there exist an open set $V$ with $p \in V \subseteq \overline{V} \subseteq O$ and a function $B \in C^\infty(M, \mathbb{R})$ with $B \equiv 1$ on $V$, $B \equiv 0$ on $M \setminus O$, and $0 \le B \le 1$.
We compute $l(h \cdot B)$ in two ways. First, since $h \equiv 0$ on $O \supseteq \operatorname{supp}(B)$, the product $h \cdot B$ is identically zero on $M$: on $O$ this holds because $h = 0$, and on $M \setminus O$ this holds because $B = 0$. So $h \cdot B \equiv 0$ and $l(h \cdot B) = l(0) = 0$ by linearity.
Second, by the Leibniz rule at $p$:
\begin{align*}
l(h \cdot B) = h(p)\, l(B) + B(p)\, l(h).
\end{align*}
Because $p \in O$ and $h \equiv 0$ on $O$, $h(p) = 0$. Because $p \in V$ and $B \equiv 1$ on $V$, $B(p) = 1$. Therefore $l(h \cdot B) = 0 + 1 \cdot l(h) = l(h)$.
Equating the two expressions gives $l(h) = 0$, as claimed.
[/proof]
[/claim]
[/step]
[step:Reduce the dimension computation to $\mathrm{Der}_0(\mathbb{R}^n)$ via a chart at $p$]
Choose a smooth chart $(U, \varphi)$ on $M$ with $p \in U$ and $\varphi(p) = 0$. The pullback by $\varphi^{-1}$ induces an $\mathbb{R}$-linear map between derivation spaces:
\begin{align*}
\Phi: \mathrm{Der}_0(\mathbb{R}^n) &\to \mathrm{Der}_p(M) \\
\delta &\mapsto \bigl( f \mapsto \delta(\widetilde f\,) \bigr)
\end{align*}
where, for $f \in C^\infty(M)$, $\widetilde f \in C^\infty(\varphi(U))$ is any smooth function on an open neighbourhood of $0 \in \mathbb{R}^n$ whose germ at $0$ coincides with $f \circ \varphi^{-1}$; extend this germ arbitrarily to a globally defined smooth function on $\mathbb{R}^n$ using a bump function adapted to $\varphi(U)$ (this is possible by the [Bump Functions Exist](/theorems/1513) theorem). By the locality established in the previous step, $\delta(\widetilde f\,)$ is independent of the choice of extension, so $\Phi$ is well-defined.
That $\Phi(\delta)$ satisfies $\mathbb{R}$-linearity and the Leibniz rule at $p$ follows directly from the corresponding properties of $\delta$ at $0$, together with the fact that $\widetilde{fg} - \widetilde f \cdot \widetilde g$ vanishes on a neighbourhood of $0$ and hence is annihilated by $\delta$.
The inverse map $\Psi: \mathrm{Der}_p(M) \to \mathrm{Der}_0(\mathbb{R}^n)$ sends $l$ to $g \mapsto l(g \circ \varphi)$, where $g \circ \varphi$ is first defined on $U$ and then extended smoothly to all of $M$ by zero times a bump; locality again ensures well-definedness. One checks $\Psi \circ \Phi = \mathrm{id}$ and $\Phi \circ \Psi = \mathrm{id}$, so $\Phi$ is a linear isomorphism. Hence $\dim \mathrm{Der}_p(M) = \dim \mathrm{Der}_0(\mathbb{R}^n)$, and it remains to show this latter dimension equals $n$.
[/step]
[step:Exhibit $n$ linearly independent derivations in $\mathrm{Der}_0(\mathbb{R}^n)$]
For each $i \in \{1, \dots, n\}$ define
\begin{align*}
\partial_{x_i}|_0: C^\infty(\mathbb{R}^n) &\to \mathbb{R} \\
f &\mapsto \frac{\partial f}{\partial x_i}(0).
\end{align*}
$\mathbb{R}$-linearity is immediate, and the Leibniz rule at $0$ follows from the product rule for partial derivatives:
\begin{align*}
\partial_{x_i}(fg)(0) = \frac{\partial f}{\partial x_i}(0)\, g(0) + f(0)\, \frac{\partial g}{\partial x_i}(0).
\end{align*}
So $\partial_{x_i}|_0 \in \mathrm{Der}_0(\mathbb{R}^n)$.
Linear independence: suppose $\sum_{i=1}^{n} c_i\, \partial_{x_i}|_0 = 0$ in $\mathrm{Der}_0(\mathbb{R}^n)$. Apply both sides to the coordinate function $x_j \in C^\infty(\mathbb{R}^n)$:
\begin{align*}
0 = \sum_{i=1}^{n} c_i\, \partial_{x_i}|_0(x_j) = \sum_{i=1}^{n} c_i\, \delta_{ij} = c_j.
\end{align*}
This holds for every $j \in \{1, \dots, n\}$, so all $c_j = 0$. Thus the $n$ derivations $\partial_{x_1}|_0, \dots, \partial_{x_n}|_0$ are linearly independent in $\mathrm{Der}_0(\mathbb{R}^n)$.
[/step]
[step:Write any smooth function near $0$ as $f(x) = f(0) + \sum_i x_i g_i(x)$ with $g_i \in C^\infty$]
[claim:For every $f \in C^\infty(\mathbb{R}^n)$ there exist $g_1, \dots, g_n \in C^\infty(\mathbb{R}^n)$ such that $f(x) = f(0) + \sum_{i=1}^{n} x_i g_i(x)$ on the open star-shaped neighbourhood $\{x : tx \in \operatorname{dom}(f) \text{ for all } t \in [0,1]\}$ of $0$, and $g_i(0) = \partial_{x_i} f(0)$]
[proof]
By the [Fundamental Theorem of Calculus](/theorems/???) applied to the smooth function $t \mapsto f(tx)$ on $[0, 1]$:
\begin{align*}
f(x) - f(0) = \int_0^1 \frac{d}{dt} f(tx) \, d\mathcal{L}^1(t) = \int_0^1 \sum_{i=1}^{n} x_i\, \frac{\partial f}{\partial x_i}(tx) \, d\mathcal{L}^1(t).
\end{align*}
We interchange the finite sum and the integral (this is justified because both operations are linear and the sum is finite):
\begin{align*}
f(x) = f(0) + \sum_{i=1}^{n} x_i \int_0^1 \frac{\partial f}{\partial x_i}(tx) \, d\mathcal{L}^1(t).
\end{align*}
Define
\begin{align*}
g_i: \mathbb{R}^n &\to \mathbb{R} \\
x &\mapsto \int_0^1 \frac{\partial f}{\partial x_i}(tx) \, d\mathcal{L}^1(t).
\end{align*}
Smoothness of $g_i$: the integrand $(x, t) \mapsto \partial_{x_i} f(tx)$ is smooth in $(x, t)$, and by smoothness of parameter integrals for smooth functions on compact parameter sets (differentiation under the integral sign, justified because all partial derivatives of the integrand are continuous in $(x, t)$ and $t$ ranges over the compact interval $[0, 1]$), $g_i \in C^\infty(\mathbb{R}^n)$. Evaluation at $x = 0$:
\begin{align*}
g_i(0) = \int_0^1 \frac{\partial f}{\partial x_i}(0) \, d\mathcal{L}^1(t) = \frac{\partial f}{\partial x_i}(0).
\end{align*}
[/proof]
[/claim]
[/step]
[step:Apply the Leibniz rule to conclude that $\{\partial_{x_i}|_0\}$ spans $\mathrm{Der}_0(\mathbb{R}^n)$]
Let $l \in \mathrm{Der}_0(\mathbb{R}^n)$ and $f \in C^\infty(\mathbb{R}^n)$. By the previous step we may write, on a neighbourhood of $0$,
\begin{align*}
f(x) = f(0) + \sum_{i=1}^{n} x_i\, g_i(x)
\end{align*}
for smooth $g_i$ with $g_i(0) = \partial_{x_i} f(0)$. By locality (second step of the proof), $l$ applied to $f$ equals $l$ applied to the right-hand side. Using linearity and the fact that $l$ annihilates constants (first step):
\begin{align*}
l(f) = l(f(0)) + \sum_{i=1}^{n} l(x_i\, g_i) = 0 + \sum_{i=1}^{n} l(x_i\, g_i).
\end{align*}
For each $i$, apply the Leibniz rule at $0$ to the product $x_i \cdot g_i$:
\begin{align*}
l(x_i\, g_i) = x_i(0)\, l(g_i) + g_i(0)\, l(x_i) = 0 \cdot l(g_i) + g_i(0)\, l(x_i) = \frac{\partial f}{\partial x_i}(0) \cdot l(x_i).
\end{align*}
Setting $a_i := l(x_i) \in \mathbb{R}$, we obtain
\begin{align*}
l(f) = \sum_{i=1}^{n} a_i\, \frac{\partial f}{\partial x_i}(0) = \left( \sum_{i=1}^{n} a_i\, \partial_{x_i}|_0 \right)(f).
\end{align*}
Since $f \in C^\infty(\mathbb{R}^n)$ was arbitrary, $l = \sum_i a_i\, \partial_{x_i}|_0$. Thus $\{\partial_{x_i}|_0\}_{i=1}^{n}$ spans $\mathrm{Der}_0(\mathbb{R}^n)$.
[/step]
[step:Conclude that $\dim \mathrm{Der}_p(M) = n$]
Combining the previous steps: $\{\partial_{x_i}|_0\}_{i=1}^{n}$ is a linearly independent spanning set in $\mathrm{Der}_0(\mathbb{R}^n)$, so $\dim \mathrm{Der}_0(\mathbb{R}^n) = n$. The chart-induced isomorphism $\Phi: \mathrm{Der}_0(\mathbb{R}^n) \to \mathrm{Der}_p(M)$ therefore gives $\dim \mathrm{Der}_p(M) = n$, completing the proof.
[/step]