[proofplan]
We prove $\partial f(x) = \{\nabla f(x)\}$ by double inclusion. The first-order condition for convex functions ([Properties of Convex Functions](/theorems/1976), part (v)) immediately gives $\nabla f(x) \in \partial f(x)$. For the reverse inclusion, we take an arbitrary $g \in \partial f(x)$, apply the subgradient inequality along every direction $z \in \mathbb{R}^d$, and pass to the limit via differentiability to obtain $g^\top z \leq \nabla f(x)^\top z$ for all $z$. Choosing $z = g - \nabla f(x)$ then forces $g = \nabla f(x)$.
[/proofplan]
[step:Show that $\nabla f(x) \in \partial f(x)$ via the first-order condition]
Since $f$ is convex and differentiable at $x \in \mathbb{R}^d$, the first-order condition ([Properties of Convex Functions](/theorems/1976), part (v)) states that for all $y \in \mathbb{R}^d$:
\begin{align*}
f(y) \geq f(x) + \nabla f(x)^\top(y - x).
\end{align*}
This is precisely the subgradient inequality for $\nabla f(x)$, so $\nabla f(x) \in \partial f(x)$.
[/step]
[step:Show that every $g \in \partial f(x)$ satisfies $g^\top z \leq \nabla f(x)^\top z$ for all $z \in \mathbb{R}^d$]
Suppose $g \in \partial f(x)$. By the definition of the subdifferential, for all $y \in \mathbb{R}^d$:
\begin{align*}
f(y) \geq f(x) + g^\top(y - x).
\end{align*}
Fix an arbitrary direction $z \in \mathbb{R}^d$ and set $y = x + tz$ for $t > 0$. The subgradient inequality becomes
\begin{align*}
f(x + tz) \geq f(x) + tg^\top z.
\end{align*}
Rearranging and dividing by $t > 0$:
\begin{align*}
g^\top z \leq \frac{f(x + tz) - f(x)}{t}.
\end{align*}
Since $f$ is differentiable at $x$, the right-hand side converges to the directional derivative $\nabla f(x)^\top z$ as $t \downarrow 0$. The left-hand side does not depend on $t$, so the inequality is preserved in the limit. Therefore
\begin{align*}
g^\top z \leq \nabla f(x)^\top z.
\end{align*}
Since $z \in \mathbb{R}^d$ was arbitrary, this holds for every direction.
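As an illustrative check (the specific function is an assumption chosen for this example, not part of the statement), take $f(x) = \tfrac{1}{2}\|x\|_2^2$, which is convex and differentiable with $\nabla f(x) = x$. The difference quotient can then be computed in closed form, and the limiting argument above reads:
\begin{align*}
\frac{f(x + tz) - f(x)}{t} &= \frac{t\, x^\top z + \tfrac{t^2}{2}\|z\|_2^2}{t} = x^\top z + \frac{t}{2}\|z\|_2^2, \\
g^\top z &\leq x^\top z + \frac{t}{2}\|z\|_2^2 \quad \text{for all } t > 0, \\
g^\top z &\leq x^\top z = \nabla f(x)^\top z \quad \text{as } t \downarrow 0.
\end{align*}
In this example the slack term $\tfrac{t}{2}\|z\|_2^2$ is exactly what vanishes in the limit.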
[guided]
The key idea is to extract information about $g$ from the subgradient inequality by probing along rays $x + tz$ and letting $t \downarrow 0$. For each $t > 0$, the subgradient inequality bounds the difference quotient $(f(x + tz) - f(x))/t$ from below by $g^\top z$, and differentiability at $x$ identifies the limit of these quotients as $\nabla f(x)^\top z$. Because $g^\top z$ does not depend on $t$ and is bounded above by the difference quotient for every $t > 0$, it is also bounded above by the limit, yielding $g^\top z \leq \nabla f(x)^\top z$.
Why is differentiability essential here? Without it, the one-sided directional derivative $\lim_{t \downarrow 0}(f(x + tz) - f(x))/t$ still exists (a convex function that is finite near $x$ has one-sided directional derivatives in every direction), but it need not be a linear function of $z$. Differentiability forces the directional derivative to equal $\nabla f(x)^\top z$, which is linear in $z$, and this linearity is what allows the final step below to pin down $g$ uniquely.
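A standard one-dimensional example, included here only as an illustration, shows what can go wrong without differentiability. Take $d = 1$, $f(x) = |x|$, and the point $x = 0$, where $f$ is convex but not differentiable. The same limiting argument gives
\begin{align*}
\lim_{t \downarrow 0} \frac{f(0 + tz) - f(0)}{t} &= \lim_{t \downarrow 0} \frac{|tz|}{t} = |z|, \\
g z &\leq |z| \quad \text{for all } z \in \mathbb{R} \iff g \in [-1, 1].
\end{align*}
The limit $|z|$ is not linear in $z$, and the resulting bound leaves a whole interval of admissible $g$, consistent with $\partial f(0) = [-1, 1]$.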
[/guided]
[/step]
[step:Choose $z = g - \nabla f(x)$ to conclude $g = \nabla f(x)$]
Taking $z = g - \nabla f(x)$ in the inequality $g^\top z \leq \nabla f(x)^\top z$:
\begin{align*}
g^\top(g - \nabla f(x)) \leq \nabla f(x)^\top(g - \nabla f(x)).
\end{align*}
Subtracting the right-hand side from both sides:
\begin{align*}
(g - \nabla f(x))^\top(g - \nabla f(x)) \leq 0,
\end{align*}
i.e., $\|g - \nabla f(x)\|_2^2 \leq 0$. Since the squared norm is non-negative, we conclude $\|g - \nabla f(x)\|_2^2 = 0$ and hence $g = \nabla f(x)$.
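Hence every element of $\partial f(x)$ equals $\nabla f(x)$, that is, $\partial f(x) \subseteq \{\nabla f(x)\}$. Combined with the first step, which gives $\nabla f(x) \in \partial f(x)$, we conclude $\partial f(x) = \{\nabla f(x)\}$.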
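As a cross-check, and not as part of the argument above, the same conclusion can be reached by applying the inequality $g^\top z \leq \nabla f(x)^\top z$ to both $z$ and $-z$. This sketch uses the standard basis vectors $e_i$ of $\mathbb{R}^d$, a notation introduced here for illustration:
\begin{align*}
g^\top(-z) \leq \nabla f(x)^\top(-z) &\implies g^\top z \geq \nabla f(x)^\top z, \\
g^\top z &= \nabla f(x)^\top z \quad \text{for all } z \in \mathbb{R}^d, \\
g_i = g^\top e_i &= \nabla f(x)^\top e_i = \frac{\partial f}{\partial x_i}(x) \quad \text{for } i = 1, \dots, d.
\end{align*}
Both routes rely on the fact that the bound holds for every direction and that the right-hand side is linear in $z$.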
[/step]