[proofplan]
We compare the values of $f$ at two points $x,y \in K$ by restricting $f$ to the line segment from $x$ to $y$. Convexity keeps that whole segment inside $K$, while compactness and continuity of the derivative make the derivative norm uniformly bounded on $K$. Applying the one-dimensional [fundamental theorem of calculus](/theorems/632) componentwise to the curve $t \mapsto f(x+t(y-x))$ gives an integral formula, and the operator norm bound estimates the integrand uniformly.
[/proofplan]
[step:Show that the derivative norm has a finite maximum on $K$]
Define the function
\begin{align*}
M_K:K \to [0,\infty)
\end{align*}
by
\begin{align*}
M_K(z)=\|Df_z\|_{\mathrm{op}}.
\end{align*}
Let $\mathcal L(\mathbb R^n,\mathbb R^m)$ denote the [normed vector space](/page/Normed%20Vector%20Space) of linear maps from $\mathbb R^n$ to $\mathbb R^m$, equipped with the operator norm induced by the Euclidean norms. Since $f \in C^1(U;\mathbb R^m)$, the derivative map
\begin{align*}
Df:U \to \mathcal L(\mathbb R^n,\mathbb R^m)
\end{align*}
is continuous. Define the operator norm map
\begin{align*}
N:\mathcal L(\mathbb R^n,\mathbb R^m) \to [0,\infty)
\end{align*}
by $N(T)=\|T\|_{\mathrm{op}}$. The map $N$ is continuous, so $M_K=N\circ Df|_K$ is continuous on $K$. Because $K$ is compact and nonempty, $M_K$ attains a finite maximum. Set
\begin{align*}
M=\sup_{z \in K}\|Df_z\|_{\mathrm{op}}=\max_{z \in K}M_K(z)<\infty.
\end{align*}
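The continuity of $N$ used above can be checked directly. As a brief sketch (the symbols $S,T$ are introduced only for this remark), the reverse triangle inequality gives, for all $S,T \in \mathcal L(\mathbb R^n,\mathbb R^m)$,
\begin{align*}
\bigl|N(T)-N(S)\bigr|=\bigl|\|T\|_{\mathrm{op}}-\|S\|_{\mathrm{op}}\bigr| \leq \|T-S\|_{\mathrm{op}},
\end{align*}
so $N$ is Lipschitz with constant $1$, and in particular continuous.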
[/step]
[step:Parametrize the segment from $x$ to $y$ inside $K$]Fix $x,y \in K$. Define the affine path
\begin{align*}
\gamma:[0,1]\to K
\end{align*}
by
\begin{align*}
\gamma(t)=x+t(y-x).
\end{align*}
This is well-defined because $K$ is convex: for each $t \in [0,1]$,
\begin{align*}
\gamma(t)=(1-t)x+ty \in K.
\end{align*}
Since $K \subset U$, the composition
\begin{align*}
g:[0,1]\to \mathbb R^m
\end{align*}
defined by
\begin{align*}
g(t)=f(\gamma(t))=f(x+t(y-x))
\end{align*}
is a $C^1$ map. By the chain rule, for every $t \in (0,1)$,
\begin{align*}
g'(t)=Df_{\gamma(t)}(y-x).
\end{align*}
The right-hand side $t \mapsto Df_{\gamma(t)}(y-x)$ is defined and continuous on all of $[0,1]$, so $g'$ extends continuously to the endpoints.[/step]
[guided]Fix two arbitrary points $x,y \in K$. The goal is to estimate $|f(y)-f(x)|$ using derivative information only on $K$, so the natural path between the two points is the straight line segment. Define
\begin{align*}
\gamma:[0,1]\to K
\end{align*}
by
\begin{align*}
\gamma(t)=x+t(y-x).
\end{align*}
The codomain is really $K$, not merely $U$, because convexity gives
\begin{align*}
\gamma(t)=(1-t)x+ty \in K
\end{align*}
for every $t \in [0,1]$. This is the point where convexity is used: without it, the segment might leave $K$, and the supremum of $\|Df_z\|_{\mathrm{op}}$ over $K$ would not control the derivative along the path.
Now define
\begin{align*}
g:[0,1]\to \mathbb R^m
\end{align*}
by
\begin{align*}
g(t)=f(\gamma(t))=f(x+t(y-x)).
\end{align*}
Since $f \in C^1(U;\mathbb R^m)$, since $\gamma([0,1]) \subset K \subset U$, and since $\gamma$ is a $C^1$ map, the composition $g=f\circ\gamma$ is $C^1$. The derivative of $\gamma$ is the constant vector $y-x \in \mathbb R^n$. Therefore the chain rule gives, for every $t \in (0,1)$,
\begin{align*}
g'(t)=Df_{\gamma(t)}(y-x).
\end{align*}
The function $t \mapsto Df_{\gamma(t)}(y-x)$ is continuous on $[0,1]$, so $g'$ extends continuously to the endpoints, which is what the integral formula requires. This converts the multivariable estimate into a one-dimensional estimate along a curve.[/guided]
[step:Apply the integral formula along the segment and estimate the integrand]
Let $\mathcal L^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $[0,1]$. For each component index $i \in \{1,\dots,m\}$, let
\begin{align*}
g_i:[0,1]\to \mathbb R
\end{align*}
denote the $i$-th component of $g$. Since $g_i$ is continuous on $[0,1]$ and differentiable on $(0,1)$ with a derivative that extends continuously to $[0,1]$ (we write $g_i'$ for this extension), the one-dimensional fundamental theorem of calculus gives
\begin{align*}
g_i(1)-g_i(0)=\int_0^1 g_i'(t)\,d\mathcal L^1(t).
\end{align*}
Applying this componentwise yields the vector identity
\begin{align*}
f(y)-f(x)=g(1)-g(0)=\int_0^1 Df_{\gamma(t)}(y-x)\,d\mathcal L^1(t).
\end{align*}
Taking Euclidean norms and using the triangle inequality for vector-valued integrals,
\begin{align*}
|f(y)-f(x)| \leq \int_0^1 |Df_{\gamma(t)}(y-x)|\,d\mathcal L^1(t).
\end{align*}
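One standard way to justify this norm inequality is sketched here; the symbols $v$, $u$, and $h$ are introduced only for this remark. Write $h(t)=Df_{\gamma(t)}(y-x)$ and $v=\int_0^1 h(t)\,d\mathcal L^1(t)$. If $v=0$, there is nothing to prove. Otherwise set $u=v/|v|$, so that $|u|=1$, and use linearity of the componentwise integral together with the Cauchy--Schwarz inequality:
\begin{align*}
|v|=\langle u,v\rangle=\int_0^1 \langle u,h(t)\rangle\,d\mathcal L^1(t) \leq \int_0^1 |u|\,|h(t)|\,d\mathcal L^1(t)=\int_0^1 |h(t)|\,d\mathcal L^1(t).
\end{align*}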
For every $t \in [0,1]$, the point $\gamma(t)$ lies in $K$, and the definition of the operator norm gives
\begin{align*}
|Df_{\gamma(t)}(y-x)| \leq \|Df_{\gamma(t)}\|_{\mathrm{op}}|y-x| \leq M|y-x|.
\end{align*}
Therefore
\begin{align*}
|f(y)-f(x)| \leq \int_0^1 M|y-x|\,d\mathcal L^1(t)=M|y-x|.
\end{align*}
Since $M=\sup_{z \in K}\|Df_z\|_{\mathrm{op}}$ and $|y-x|=|x-y|$, this is exactly
\begin{align*}
|f(x)-f(y)|\le \left(\sup_{z \in K}\|Df_z\|_{\mathrm{op}}\right)|x-y|.
\end{align*}
The points $x,y \in K$ were arbitrary, so the estimate holds for all $x,y \in K$.
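As an illustrative sanity check, not part of the proof, consider the hypothetical one-dimensional case $n=m=1$, $U=\mathbb R$, $f(z)=z^2$, and $K=[a,b]$ with $a \leq b$. Here $Df_z$ is multiplication by $2z$, so $\|Df_z\|_{\mathrm{op}}=2|z|$ and
\begin{align*}
M=\max_{z \in [a,b]}2|z|=2\max(|a|,|b|).
\end{align*}
For $x,y \in K$, a direct computation gives
\begin{align*}
|f(x)-f(y)|=|x+y|\,|x-y| \leq (|x|+|y|)\,|x-y| \leq 2\max(|a|,|b|)\,|x-y|=M|x-y|,
\end{align*}
in agreement with the general estimate.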
[/step]