[proofplan]
The argument joins two arbitrary points of the convex domain by the line segment between them. Convexity keeps the entire segment inside the domain, so composing the given map with this segment gives a $C^1$ curve in the target space. The endpoint difference is estimated first in every unit scalar direction by the one-dimensional fundamental theorem of calculus, and then the operator norm bound and Euclidean duality convert those scalar estimates into the desired vector norm estimate.
[/proofplan]
[step:Parametrize the segment inside the convex domain]
Fix $x,y\in U$. Define the affine map
\begin{align*}
\gamma:[0,1]\to U,\qquad t\mapsto y+t(x-y).
\end{align*}
This map is well-defined because $U$ is convex: for every $t\in[0,1]$, the point
\begin{align*}
\gamma(t)=(1-t)y+tx
\end{align*}
belongs to $U$. The map $\gamma$ is $C^1$ as a map from $[0,1]$ to $\mathbb R^n$, and its derivative is
\begin{align*}
\gamma'(t)=x-y
\end{align*}
for every $t\in[0,1]$.
[/step]
[step:Apply the scalar fundamental theorem of calculus to every direction]
Let
\begin{align*}
S^{m-1}:=\{a\in\mathbb R^m:|a|=1\}
\end{align*}
denote the Euclidean unit sphere in $\mathbb R^m$. Let $\mathcal L^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb R$. Fix $a\in S^{m-1}$ and define
\begin{align*}
\varphi_a:[0,1]\to\mathbb R,\qquad t\mapsto a\cdot f(\gamma(t)).
\end{align*}
Since $f\in C^1(U;\mathbb R^m)$ and $\gamma$ is $C^1$, the chain rule gives $\varphi_a\in C^1([0,1];\mathbb R)$ and
\begin{align*}
\varphi_a'(t)=a\cdot Df_{\gamma(t)}(x-y)
\end{align*}
for every $t\in[0,1]$.
By the [Fundamental Theorem of Calculus](/theorems/632) for $C^1$ scalar functions on $[0,1]$,
\begin{align*}
a\cdot(f(x)-f(y))=\int_0^1 a\cdot Df_{\gamma(t)}(x-y)\,d\mathcal L^1(t).
\end{align*}
Taking absolute values, applying the triangle inequality for the [Lebesgue integral](/page/Lebesgue%20Integral), using $|a|=1$, and then using the definition of the operator norm, we obtain
\begin{align*}
|a\cdot(f(x)-f(y))|\le \int_0^1 |Df_{\gamma(t)}(x-y)|\,d\mathcal L^1(t).
\end{align*}
For every $t\in[0,1]$, the point $\gamma(t)$ belongs to $U$, so the definition of $M$ gives
\begin{align*}
|Df_{\gamma(t)}(x-y)|\le \|Df_{\gamma(t)}\|_{\mathrm{op}}|x-y|\le M|x-y|.
\end{align*}
Therefore, since $\mathcal L^1([0,1])=1$,
\begin{align*}
|a\cdot(f(x)-f(y))|\le M|x-y|.
\end{align*}
[guided]
The goal is to estimate the vector $f(x)-f(y)\in\mathbb R^m$. A convenient way to do this is to first estimate all of its scalar projections. For that reason, define the Euclidean unit sphere
\begin{align*}
S^{m-1}:=\{a\in\mathbb R^m:|a|=1\}.
\end{align*}
Let $\mathcal L^1$ denote one-dimensional Lebesgue measure on $\mathbb R$. Fix $a\in S^{m-1}$ and define the scalar function
\begin{align*}
\varphi_a:[0,1]\to\mathbb R,\qquad t\mapsto a\cdot f(\gamma(t)).
\end{align*}
This function is $C^1$ because $f\in C^1(U;\mathbb R^m)$, the curve $\gamma:[0,1]\to U$ is $C^1$, and the Euclidean dot product with the fixed vector $a$ is a [linear map](/page/Linear%20Map) from $\mathbb R^m$ to $\mathbb R$. Applying the chain rule to the composition $t\mapsto a\cdot f(\gamma(t))$ gives
\begin{align*}
\varphi_a'(t)=a\cdot Df_{\gamma(t)}(\gamma'(t)).
\end{align*}
Since $\gamma'(t)=x-y$, this becomes
\begin{align*}
\varphi_a'(t)=a\cdot Df_{\gamma(t)}(x-y).
\end{align*}
Now we apply the [Fundamental Theorem of Calculus](/theorems/632) for $C^1$ scalar functions on $[0,1]$. Its hypotheses are satisfied because $\varphi_a\in C^1([0,1];\mathbb R)$. Hence
\begin{align*}
\varphi_a(1)-\varphi_a(0)=\int_0^1 \varphi_a'(t)\,d\mathcal L^1(t).
\end{align*}
Using $\gamma(1)=x$ and $\gamma(0)=y$, this identity reads
\begin{align*}
a\cdot(f(x)-f(y))=\int_0^1 a\cdot Df_{\gamma(t)}(x-y)\,d\mathcal L^1(t).
\end{align*}
We estimate the right-hand side. First, the triangle inequality for the Lebesgue integral gives
\begin{align*}
|a\cdot(f(x)-f(y))|\le \int_0^1 |a\cdot Df_{\gamma(t)}(x-y)|\,d\mathcal L^1(t).
\end{align*}
Since $|a|=1$, the [Cauchy-Schwarz inequality](/theorems/432) in $\mathbb R^m$ gives
\begin{align*}
|a\cdot Df_{\gamma(t)}(x-y)|\le |Df_{\gamma(t)}(x-y)|.
\end{align*}
The operator norm of the linear map $Df_{\gamma(t)}:\mathbb R^n\to\mathbb R^m$ is defined so that
\begin{align*}
|Df_{\gamma(t)}(x-y)|\le \|Df_{\gamma(t)}\|_{\mathrm{op}}|x-y|.
\end{align*}
Because $\gamma(t)\in U$ for all $t\in[0,1]$, the definition
\begin{align*}
M=\sup_{z\in U}\|Df_z\|_{\mathrm{op}}
\end{align*}
implies
\begin{align*}
\|Df_{\gamma(t)}\|_{\mathrm{op}}\le M.
\end{align*}
Combining these estimates yields
\begin{align*}
|a\cdot(f(x)-f(y))|\le \int_0^1 M|x-y|\,d\mathcal L^1(t).
\end{align*}
Since $\mathcal L^1([0,1])=1$, we conclude
\begin{align*}
|a\cdot(f(x)-f(y))|\le M|x-y|.
\end{align*}
This estimate holds for every unit vector $a\in S^{m-1}$.
[/guided]
[/step]
[step:Recover the vector estimate from the scalar projections]
Let
\begin{align*}
v:=f(x)-f(y)\in\mathbb R^m.
\end{align*}
We use the Euclidean dual norm identity
\begin{align*}
|v|=\sup_{a\in S^{m-1}}|a\cdot v|.
\end{align*}
Indeed, the Cauchy-Schwarz inequality gives $|a\cdot v|\le |v|$ for every $a\in S^{m-1}$, while if $v\ne0$, the choice $a=v/|v|$ gives $|a\cdot v|=|v|$; if $v=0$, both sides are $0$.
Applying the estimate from the previous step to every $a\in S^{m-1}$ gives
\begin{align*}
|f(x)-f(y)|=\sup_{a\in S^{m-1}}|a\cdot(f(x)-f(y))|\le M|x-y|.
\end{align*}
Since $x,y\in U$ were arbitrary, this estimate holds for all $x,y\in U$. Thus $f$ is Lipschitz continuous on $U$ with Lipschitz constant $M$.
[/step]