[proofplan]
The normal cone condition is exactly a variational inequality: the gradient at $x_*$ has nonnegative [inner product](/page/Inner%20Product) with every feasible displacement $y - x_*$, $y \in C$. For a differentiable convex function, convexity along each line segment gives the first-order lower bound $f(y) \ge f(x_*) + \nabla f(x_*) \cdot (y - x_*)$. Combined with the variational inequality, this lower bound proves sufficiency. Conversely, if $x_*$ minimizes $f$ over $C$, then the restriction of $f$ to every feasible segment starting at $x_*$ has nonnegative right derivative at $0$, which gives the same variational inequality and hence the normal cone condition.
[/proofplan]
[step:Translate the normal cone condition into a variational inequality]
By the definition of $N_C(x_*)$,
\begin{align*}
-\nabla f(x_*) \in N_C(x_*)
\end{align*}
is equivalent to
\begin{align*}
(-\nabla f(x_*)) \cdot (y - x_*) \le 0 \text{ for every } y \in C.
\end{align*}
Multiplying by $-1$, this is equivalent to the variational inequality
\begin{align*}
\nabla f(x_*) \cdot (y - x_*) \ge 0 \text{ for every } y \in C.
\end{align*}
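As a quick sanity check, consider the illustrative one-dimensional case $C = [0,1] \subset \mathbb{R}$ and $x_* = 0$ (this choice is an assumption made for concreteness and is not part of the statement). Then $y - x_* = y \ge 0$ for every $y \in C$, so
\begin{align*}
N_C(0) = \{v \in \mathbb{R} : v\,y \le 0 \text{ for every } y \in [0,1]\} = (-\infty, 0].
\end{align*}
Hence $-f'(0) \in N_C(0)$ holds exactly when $f'(0) \ge 0$, which is the variational inequality $f'(0)\,y \ge 0$ for every $y \in [0,1]$.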
[/step]
[step:Derive the convex first-order inequality along line segments]
We first record the first-order inequality needed for the sufficiency direction. Let $x \in U$ and $y \in U$. Because $U$ is convex, the segment $\{x + t(y - x) : 0 \le t \le 1\}$ is contained in $U$. Define
\begin{align*}
\phi_{x,y}: [0,1] \to \mathbb{R}, \qquad t \mapsto f(x + t(y - x)).
\end{align*}
Convexity of $f$ implies convexity of $\phi_{x,y}$. For every $t \in (0,1]$, convexity gives
\begin{align*}
\phi_{x,y}(t) \le (1 - t)\phi_{x,y}(0) + t\phi_{x,y}(1).
\end{align*}
Rearranging,
\begin{align*}
\frac{\phi_{x,y}(t) - \phi_{x,y}(0)}{t} \le \phi_{x,y}(1) - \phi_{x,y}(0).
\end{align*}
Since $f$ is differentiable at $x$, the right derivative of $\phi_{x,y}$ at $0$ exists and equals
\begin{align*}
\phi_{x,y}'(0) = \nabla f(x) \cdot (y - x).
\end{align*}
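Letting $t \downarrow 0$ in the difference-quotient inequality gives $\phi_{x,y}'(0) \le \phi_{x,y}(1) - \phi_{x,y}(0)$. Since $\phi_{x,y}(0) = f(x)$ and $\phi_{x,y}(1) = f(y)$, this yields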
\begin{align*}
f(y) \ge f(x) + \nabla f(x) \cdot (y - x).
\end{align*}
[guided]
We need a lower bound for $f(y)$ in terms of the value and gradient at $x$. The natural way to get it is to restrict the convex function $f$ to the line segment from $x$ to $y$. Since $U$ is convex and $x,y \in U$, every point $x + t(y - x)$ with $0 \le t \le 1$ lies in $U$, so the following map is well-defined:
\begin{align*}
\phi_{x,y}: [0,1] \to \mathbb{R}, \qquad t \mapsto f(x + t(y - x)).
\end{align*}
Convexity of $f$ implies that $\phi_{x,y}$ is convex. Indeed, for $s,t \in [0,1]$ and $\lambda \in [0,1]$, the point
\begin{align*}
x + ((1-\lambda)s + \lambda t)(y - x)
\end{align*}
is the convex combination
\begin{align*}
(1-\lambda)(x + s(y - x)) + \lambda(x + t(y - x)).
\end{align*}
Applying convexity of $f$ to these two points gives convexity of $\phi_{x,y}$.
Now fix $t \in (0,1]$. Convexity of $\phi_{x,y}$ at the point $t = (1-t) \cdot 0 + t \cdot 1$ gives
\begin{align*}
\phi_{x,y}(t) \le (1 - t)\phi_{x,y}(0) + t\phi_{x,y}(1).
\end{align*}
Subtracting $\phi_{x,y}(0)$ and dividing by the positive number $t$ gives
\begin{align*}
\frac{\phi_{x,y}(t) - \phi_{x,y}(0)}{t} \le \phi_{x,y}(1) - \phi_{x,y}(0).
\end{align*}
Because $f$ is differentiable at $x$, the directional derivative of $f$ at $x$ in the direction $y - x$ exists and is given by the Euclidean inner product with the gradient:
\begin{align*}
\lim_{t \downarrow 0}\frac{f(x + t(y - x)) - f(x)}{t} = \nabla f(x) \cdot (y - x).
\end{align*}
This is exactly the right derivative of $\phi_{x,y}$ at $0$. Taking the limit $t \downarrow 0$ in the previous inequality therefore yields
\begin{align*}
\nabla f(x) \cdot (y - x) \le f(y) - f(x).
\end{align*}
Equivalently,
\begin{align*}
f(y) \ge f(x) + \nabla f(x) \cdot (y - x).
\end{align*}
This is the first-order inequality for differentiable convex functions.
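To see each step in a concrete case, here is a worked check under the illustrative assumption $U = \mathbb{R}^n$ and $f(x) = x \cdot x$ (a choice made only for this example), so that $\nabla f(x) = 2x$. Expanding the inner product gives
\begin{align*}
\phi_{x,y}(t) &= x \cdot x + 2t\, x \cdot (y - x) + t^2 (y - x) \cdot (y - x), \\
\frac{\phi_{x,y}(t) - \phi_{x,y}(0)}{t} &= 2x \cdot (y - x) + t\,(y - x) \cdot (y - x), \\
\phi_{x,y}(1) - \phi_{x,y}(0) &= 2x \cdot (y - x) + (y - x) \cdot (y - x).
\end{align*}
The difference quotient is bounded by $\phi_{x,y}(1) - \phi_{x,y}(0)$ because $t \le 1$, and its limit as $t \downarrow 0$ is $2x \cdot (y - x) = \nabla f(x) \cdot (y - x)$. In this example the gap in the first-order inequality is explicit:
\begin{align*}
f(y) - f(x) - \nabla f(x) \cdot (y - x) = (y - x) \cdot (y - x) \ge 0.
\end{align*}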
[/guided]
[/step]
[step:Use the variational inequality to prove minimality]
Assume
\begin{align*}
-\nabla f(x_*) \in N_C(x_*).
\end{align*}
By the first step, this means
\begin{align*}
\nabla f(x_*) \cdot (y - x_*) \ge 0 \text{ for every } y \in C.
\end{align*}
Let $y \in C$ be arbitrary. Since $C \subset U$, both $x_*$ and $y$ belong to $U$. Applying the first-order convexity inequality from the previous step with $x = x_*$ gives
\begin{align*}
f(y) \ge f(x_*) + \nabla f(x_*) \cdot (y - x_*).
\end{align*}
The variational inequality makes the second term nonnegative, so
\begin{align*}
f(y) \ge f(x_*).
\end{align*}
Because $y \in C$ was arbitrary, $x_*$ is a global minimizer of $f$ over $C$.
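For illustration, take the assumed example $U = \mathbb{R}$, $C = [0,1]$, $f(x) = (x - 2)^2$, and $x_* = 1$ (none of these choices come from the statement). Here $f'(1) = -2$ is nonzero, yet
\begin{align*}
f'(1)\,(y - 1) = 2(1 - y) \ge 0 \text{ for every } y \in [0,1],
\end{align*}
so the variational inequality holds at the boundary point $x_* = 1$. The first-order inequality then reads
\begin{align*}
f(y) \ge f(1) + 2(1 - y) \ge f(1) = 1,
\end{align*}
which agrees with the direct computation $f(y) - 1 - 2(1 - y) = (y - 1)^2 \ge 0$.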
[/step]
[step:Use minimality along feasible segments to recover the normal cone condition]
Assume that $x_*$ is a global minimizer of $f$ over $C$. Let $y \in C$ be arbitrary. Since $C$ is convex and $x_*, y \in C$,
\begin{align*}
x_* + t(y - x_*) \in C \text{ for every } t \in [0,1].
\end{align*}
Define
\begin{align*}
\psi_y: [0,1] \to \mathbb{R}, \qquad t \mapsto f(x_* + t(y - x_*)).
\end{align*}
For every $t \in (0,1]$, minimality of $x_*$ over $C$ gives
\begin{align*}
\psi_y(t) = f(x_* + t(y - x_*)) \ge f(x_*) = \psi_y(0).
\end{align*}
Hence
\begin{align*}
\frac{\psi_y(t) - \psi_y(0)}{t} \ge 0.
\end{align*}
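Since $f$ is differentiable at $x_*$, the difference quotient converges to $\nabla f(x_*) \cdot (y - x_*)$ as $t \downarrow 0$. Weak inequalities are preserved in the limit, so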
\begin{align*}
\nabla f(x_*) \cdot (y - x_*) \ge 0.
\end{align*}
Because $y \in C$ was arbitrary,
\begin{align*}
(-\nabla f(x_*)) \cdot (y - x_*) \le 0 \text{ for every } y \in C.
\end{align*}
Thus
\begin{align*}
-\nabla f(x_*) \in N_C(x_*).
\end{align*}
This proves the converse implication and completes the equivalence.
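The limiting argument can be traced in a small assumed example: $U = \mathbb{R}$, $C = [0,1]$, $f(x) = (x + 1)^2$, with minimizer $x_* = 0$ over $C$ (an illustration only, not part of the statement). For $y \in [0,1]$ and $t \in (0,1]$,
\begin{align*}
\psi_y(t) = (1 + ty)^2, \qquad \frac{\psi_y(t) - \psi_y(0)}{t} = 2y + t y^2 \ge 0.
\end{align*}
Letting $t \downarrow 0$ gives $f'(0)\,y = 2y \ge 0$ for every $y \in [0,1]$. Since $v\,y \le 0$ for every $y \in [0,1]$ exactly when $v \le 0$, the normal cone is $N_C(0) = (-\infty, 0]$, and indeed $-f'(0) = -2 \in N_C(0)$.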
[/step]