[proofplan]
The proof identifies the linear functional $Df_a$ by its values on the standard basis of $\mathbb{R}^m$. Differentiability gives a first-order expansion, and restricting that expansion to the coordinate line in the $e_i$ direction shows that $Df_a(e_i)$ is the $i$th [partial derivative](/page/Partial%20Derivative) of $f$ at $a$. Then every vector $h$ is expanded in the standard basis, linearity of $Df_a$ is applied, and the resulting coordinate sum is exactly the Euclidean dot product with $\nabla f(a)$.
[/proofplan]
[step:Recover each partial derivative by evaluating the differential on a basis vector]
Let $(e_1,\ldots,e_m)$ denote the standard basis of $\mathbb{R}^m$, and let $|\cdot|$ denote the Euclidean norm on $\mathbb{R}^m$. Because $U$ is open and $a\in U$, choose $\rho>0$ such that $B(a,\rho)\subset U$. Since $f$ is differentiable at $a$, there exists a [linear map](/page/Linear%20Map)
\begin{align*}
Df_a: \mathbb{R}^m \to \mathbb{R}
\end{align*}
and a remainder function $r:B(0,\rho)\to\mathbb{R}$ such that, for every $h\in B(0,\rho)$,
\begin{align*}
f(a+h)=f(a)+Df_a(h)+r(h),
\end{align*}
and
\begin{align*}
\lim_{h\to 0}\frac{r(h)}{|h|}=0.
\end{align*}
Fix $i\in\{1,\ldots,m\}$. Set $\delta_i:=\rho$. Then $t e_i\in B(0,\rho)$ and $a+t e_i\in U$ whenever $|t|<\delta_i$. For every $t\in\mathbb{R}$ with $0<|t|<\delta_i$, substituting $h=t e_i$ into the differentiability expansion gives
\begin{align*}
\frac{f(a+t e_i)-f(a)}{t}=Df_a(e_i)+\frac{r(t e_i)}{t}.
\end{align*}
Since $|t e_i|=|t|$, the remainder condition implies
\begin{align*}
\lim_{t\to 0}\frac{r(t e_i)}{t}=0.
\end{align*}
Therefore the partial derivative $\partial_{x_i}f(a)$ exists and satisfies
\begin{align*}
\partial_{x_i}f(a)=Df_a(e_i).
\end{align*}
[guided]
Let $(e_1,\ldots,e_m)$ be the standard basis of $\mathbb{R}^m$, and let $|\cdot|$ denote the Euclidean norm on $\mathbb{R}^m$. Since $U$ is open and $a\in U$, choose $\rho>0$ such that $B(a,\rho)\subset U$. The differentiability hypothesis means that the first-order change of $f$ at $a$ is controlled by a linear map
\begin{align*}
Df_a: \mathbb{R}^m \to \mathbb{R}.
\end{align*}
More precisely, there is a remainder function $r:B(0,\rho)\to\mathbb{R}$ such that, for every $h\in B(0,\rho)$,
\begin{align*}
f(a+h)=f(a)+Df_a(h)+r(h)
\end{align*}
and
\begin{align*}
\lim_{h\to 0}\frac{r(h)}{|h|}=0.
\end{align*}
To compare this total derivative with a partial derivative, we restrict the increment $h$ to one coordinate direction. Fix $i\in\{1,\ldots,m\}$ and set $\delta_i:=\rho$. If $|t|<\delta_i$, then $t e_i\in B(0,\rho)$ and $a+t e_i\in U$, so the differentiability expansion is valid for the increment $h=t e_i$.
For $0<|t|<\delta_i$, put $h=t e_i$ in the differentiability expansion. Since $Df_a$ is linear,
\begin{align*}
Df_a(t e_i)=tDf_a(e_i).
\end{align*}
Thus
\begin{align*}
f(a+t e_i)-f(a)=tDf_a(e_i)+r(t e_i).
\end{align*}
Dividing by $t$ gives
\begin{align*}
\frac{f(a+t e_i)-f(a)}{t}=Df_a(e_i)+\frac{r(t e_i)}{t}.
\end{align*}
The second term tends to $0$. Indeed, $|t e_i|=|t|$, so
\begin{align*}
\left|\frac{r(t e_i)}{t}\right|=\frac{|r(t e_i)|}{|t e_i|},
\end{align*}
and the right-hand side tends to $0$ as $t\to 0$ by the defining remainder condition for differentiability. Hence
\begin{align*}
\lim_{t\to 0}\frac{f(a+t e_i)-f(a)}{t}=Df_a(e_i).
\end{align*}
By the definition of the $i$th partial derivative, this proves that $\partial_{x_i}f(a)$ exists and
\begin{align*}
\partial_{x_i}f(a)=Df_a(e_i).
\end{align*}
[/guided]
[/step]
[step:Expand an arbitrary vector and use linearity of the differential]
Let $h=(h_1,\ldots,h_m)\in\mathbb{R}^m$. The standard basis expansion in $\mathbb{R}^m$ is
\begin{align*}
h=\sum_{i=1}^m h_i e_i.
\end{align*}
Since $Df_a:\mathbb{R}^m\to\mathbb{R}$ is linear,
\begin{align*}
Df_a(h)=\sum_{i=1}^m h_i Df_a(e_i).
\end{align*}
Using the identity $Df_a(e_i)=\partial_{x_i}f(a)$ from the previous step, we obtain
\begin{align*}
Df_a(h)=\sum_{i=1}^m h_i\,\partial_{x_i}f(a).
\end{align*}
[/step]
[step:Recognize the coordinate sum as the Euclidean dot product with the gradient]
By definition, the gradient of $f$ at $a$ is the vector
\begin{align*}
\nabla f(a)=\left(\partial_{x_1}f(a),\ldots,\partial_{x_m}f(a)\right)\in\mathbb{R}^m.
\end{align*}
The Euclidean dot product of $\nabla f(a)$ with $h=(h_1,\ldots,h_m)$ is
\begin{align*}
\nabla f(a)\cdot h=\sum_{i=1}^m \partial_{x_i}f(a)\,h_i.
\end{align*}
Combining this with the formula from the previous step gives
\begin{align*}
Df_a(h)=\nabla f(a)\cdot h.
\end{align*}
Since $h\in\mathbb{R}^m$ was arbitrary, the identity holds for every $h\in\mathbb{R}^m$.
[/step]