[proofplan]
Use the positive distance from $K$ to the complement of $U$ to ensure that all sufficiently small mollifier balls centred near $K$ remain inside $U$. The convergence of $f_\varepsilon$ to $f$ then follows by averaging the uniform modulus of continuity of $f$ over a compact neighbourhood of $K$. For the derivative, differentiate the mollified function under the integral sign to identify $D(f_\varepsilon)$ with the mollification of $Df$; the same uniform-continuity argument, applied to the continuous derivative map, gives convergence in operator norm.
[/proofplan]
[step:Choose a compact neighbourhood on which the local convolutions are defined]
Because $K \subset\subset U$, the set $K$ is compact and contained in $U$. Since $U$ is open and $K$ is compact, there exists $\rho>0$ such that the closed $\rho$-neighbourhood
\begin{align*}
K_\rho:=\{z\in\mathbb{R}^n:\operatorname{dist}(z,K)\leq \rho\}
\end{align*}
is compact and satisfies $K_\rho\subset U$.
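One admissible choice of $\rho$ can be written explicitly. The following is a sketch under the assumptions that $K$ is nonempty and $U\neq\mathbb{R}^n$ (if $U=\mathbb{R}^n$, any $\rho>0$ works); the symbol $d$ is introduced here only for illustration.
\begin{align*}
d:=\operatorname{dist}(K,\mathbb{R}^n\setminus U)=\inf\{|x-z|:x\in K,\ z\in\mathbb{R}^n\setminus U\}>0,\qquad \rho:=\frac{d}{2}.
\end{align*}
Here $d>0$ because $K$ is compact, $\mathbb{R}^n\setminus U$ is closed, and the two sets are disjoint. The set $K_\rho$ is closed and bounded, hence compact, and every $z\notin U$ satisfies $\operatorname{dist}(z,K)\geq d>\rho$, so $K_\rho\subset U$.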
Define the open neighbourhood
\begin{align*}
V:=\{x\in\mathbb{R}^n:\operatorname{dist}(x,K)<\rho/2\}.
\end{align*}
If $0<\varepsilon<\rho/2$, $x\in V$, and $y\in\operatorname{supp}\eta_\varepsilon$, then $|y|\leq \varepsilon<\rho/2$, so
\begin{align*}
\operatorname{dist}(x-y,K)\leq \operatorname{dist}(x,K)+|y|<\rho.
\end{align*}
Thus $x-y\in K_\rho\subset U$. Therefore $f_\varepsilon$ is well-defined on $V$ for every $0<\varepsilon<\rho/2$. Since $f$ is continuous on $U$, [citetheorem:8529] applied componentwise implies that $f_\varepsilon$ is smooth on a neighbourhood of $K$ for all sufficiently small $\varepsilon>0$.
[/step]
[step:Average the uniform modulus of continuity of $f$]
The restriction $f|_{K_\rho}:K_\rho\to\mathbb{R}^m$ is continuous on the compact set $K_\rho$. Hence it is uniformly continuous: for every $\delta>0$ there exists $\alpha>0$ such that whenever $z,w\in K_\rho$ and $|z-w|<\alpha$, one has $|f(z)-f(w)|<\delta$.
Let $0<\varepsilon<\min\{\rho/2,\alpha\}$. For every $x\in K$ and every $y\in\operatorname{supp}\eta_\varepsilon$, the points $x$ and $x-y$ belong to $K_\rho$, and $|y|\leq\varepsilon<\alpha$. Using the normalization of $\eta_\varepsilon$ and the non-negativity of $\eta_\varepsilon$,
\begin{align*}
|f_\varepsilon(x)-f(x)|\leq \int_{\mathbb{R}^n}\eta_\varepsilon(y)|f(x-y)-f(x)|\,d\mathcal{L}^n(y)\leq \delta\int_{\mathbb{R}^n}\eta_\varepsilon(y)\,d\mathcal{L}^n(y)=\delta.
\end{align*}
Taking the supremum over $x\in K$ gives
\begin{align*}
\sup_{x\in K}|f_\varepsilon(x)-f(x)|\leq\delta.
\end{align*}
Since $\delta>0$ was arbitrary and the bound holds for every $0<\varepsilon<\min\{\rho/2,\alpha\}$, this proves
\begin{align*}
\sup_{x\in K}|f_\varepsilon(x)-f(x)|\to 0
\end{align*}
as $\varepsilon\to 0$.
[guided]
The point of choosing $K_\rho$ is that all translated points used by the mollifier remain inside a single compact subset of $U$. This lets us use [uniform continuity](/page/Uniform%20Continuity) rather than pointwise continuity.
Because $f|_{K_\rho}:K_\rho\to\mathbb{R}^m$ is continuous and $K_\rho$ is compact, $f$ is uniformly continuous on $K_\rho$. Thus, given $\delta>0$, there is $\alpha>0$ such that for all $z,w\in K_\rho$,
\begin{align*}
|z-w|<\alpha \implies |f(z)-f(w)|<\delta.
\end{align*}
Now choose $0<\varepsilon<\min\{\rho/2,\alpha\}$. If $x\in K$ and $y\in\operatorname{supp}\eta_\varepsilon$, then $|y|\leq\varepsilon$, so $x-y\in K_\rho$ and $|x-(x-y)|=|y|<\alpha$. Therefore
\begin{align*}
|f(x-y)-f(x)|<\delta.
\end{align*}
The mollification is an average with the non-negative kernel $\eta_\varepsilon$, whose integral is $1$. Hence
\begin{align*}
|f_\varepsilon(x)-f(x)|\leq \int_{\mathbb{R}^n}\eta_\varepsilon(y)|f(x-y)-f(x)|\,d\mathcal{L}^n(y)\leq \delta.
\end{align*}
This bound is independent of $x\in K$, so
\begin{align*}
\sup_{x\in K}|f_\varepsilon(x)-f(x)|\leq\delta.
\end{align*}
Since every $\delta>0$ is eventually an upper bound, the supremum tends to $0$ as $\varepsilon\to 0$.
[/guided]
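As a sanity check, the estimate can be made explicit in a simple case. The following sketch assumes $n=m=1$, $U=\mathbb{R}$, $f(x)=x^2$, and $K\subset[-R,R]$ for some $R>0$ (all chosen here purely for illustration), together with the facts used above that $\eta_\varepsilon\geq 0$, $\int\eta_\varepsilon=1$, and $|y|\leq\varepsilon$ on $\operatorname{supp}\eta_\varepsilon$.
\begin{align*}
f_\varepsilon(x)-f(x)&=\int_{\mathbb{R}}\eta_\varepsilon(y)\bigl((x-y)^2-x^2\bigr)\,d\mathcal{L}^1(y)=-2x\int_{\mathbb{R}}y\,\eta_\varepsilon(y)\,d\mathcal{L}^1(y)+\int_{\mathbb{R}}y^2\,\eta_\varepsilon(y)\,d\mathcal{L}^1(y),\\
\sup_{x\in K}|f_\varepsilon(x)-f(x)|&\leq 2R\varepsilon+\varepsilon^2.
\end{align*}
The right-hand side tends to $0$ as $\varepsilon\to 0$, in agreement with the general argument.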
[/step]
[step:Identify the derivative of the mollification with the mollification of the derivative]
For $i\in\{1,\dots,m\}$, let $f_i:U\to\mathbb{R}$ denote the $i$-th component of $f$. For $j\in\{1,\dots,n\}$, let $\partial_{x_j}f_i:U\to\mathbb{R}$ denote its $j$-th [partial derivative](/page/Partial%20Derivative). Since $f\in C^1(U;\mathbb{R}^m)$, each $\partial_{x_j}f_i$ is continuous on $U$.
Fix $0<\varepsilon<\rho/2$, $x\in V$, and a coordinate vector $e_j\in\mathbb{R}^n$. Since $\operatorname{dist}(x,K)+\varepsilon<\rho$, for $|t|$ sufficiently small the points $x+se_j-y$ lie in $K_\rho$ for all $|s|\leq|t|$ and all $y\in\operatorname{supp}\eta_\varepsilon$. The one-variable mean value theorem applied to
\begin{align*}
s\mapsto f_i(x+se_j-y)
\end{align*}
gives
\begin{align*}
\frac{f_i(x+te_j-y)-f_i(x-y)}{t}\to \partial_{x_j}f_i(x-y)
\end{align*}
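as $t\to 0$, uniformly for $y\in\operatorname{supp}\eta_\varepsilon$: the quotient equals $\partial_{x_j}f_i(x+\theta te_j-y)$ for some $\theta\in(0,1)$ depending on $t$ and $y$, and $\partial_{x_j}f_i$ is uniformly continuous on the compact set $K_\rho$. Because $\eta_\varepsilon$ is non-negative with integral $1$, this uniform convergence permits passing the limit through the integral, which yields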
\begin{align*}
\partial_{x_j}(f_\varepsilon)_i(x)=\int_{\mathbb{R}^n}\eta_\varepsilon(y)\partial_{x_j}f_i(x-y)\,d\mathcal{L}^n(y).
\end{align*}
Thus, for every $x\in K$ and $h=(h_1,\dots,h_n)\in\mathbb{R}^n$,
\begin{align*}
D(f_\varepsilon)_x(h)-Df_x(h)=\int_{\mathbb{R}^n}\eta_\varepsilon(y)\bigl(Df_{x-y}(h)-Df_x(h)\bigr)\,d\mathcal{L}^n(y).
\end{align*}
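The passage from the partial-derivative identity to the displayed formula can be spelled out componentwise. This is a sketch using only linearity of the integral and the normalization $\int_{\mathbb{R}^n}\eta_\varepsilon\,d\mathcal{L}^n=1$; the subscript $i$ on a vector in $\mathbb{R}^m$ denotes its $i$-th component.
\begin{align*}
\bigl(D(f_\varepsilon)_x(h)\bigr)_i&=\sum_{j=1}^n h_j\,\partial_{x_j}(f_\varepsilon)_i(x)=\int_{\mathbb{R}^n}\eta_\varepsilon(y)\sum_{j=1}^n h_j\,\partial_{x_j}f_i(x-y)\,d\mathcal{L}^n(y)=\int_{\mathbb{R}^n}\eta_\varepsilon(y)\bigl(Df_{x-y}(h)\bigr)_i\,d\mathcal{L}^n(y),\\
\bigl(Df_x(h)\bigr)_i&=\int_{\mathbb{R}^n}\eta_\varepsilon(y)\bigl(Df_x(h)\bigr)_i\,d\mathcal{L}^n(y).
\end{align*}
Subtracting the second line from the first for each $i\in\{1,\dots,m\}$ gives the identity above.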
[/step]
[step:Apply the same averaging argument to the derivative map]
The derivative map
\begin{align*}
Df:U\to\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)
\end{align*}
is continuous, so its restriction to $K_\rho$ is uniformly continuous. Given $\delta>0$, choose $\alpha>0$ such that whenever $z,w\in K_\rho$ and $|z-w|<\alpha$, one has
\begin{align*}
\|Df_z-Df_w\|_{\mathrm{op}}<\delta.
\end{align*}
For $0<\varepsilon<\min\{\rho/2,\alpha\}$, $x\in K$, and $h\in\mathbb{R}^n$ with $|h|\leq 1$, the derivative identity gives
\begin{align*}
|D(f_\varepsilon)_x(h)-Df_x(h)|\leq \int_{\mathbb{R}^n}\eta_\varepsilon(y)\|Df_{x-y}-Df_x\|_{\mathrm{op}}|h|\,d\mathcal{L}^n(y)\leq \delta.
\end{align*}
Taking the supremum over all $h\in\mathbb{R}^n$ with $|h|\leq 1$ gives
\begin{align*}
\|D(f_\varepsilon)_x-Df_x\|_{\mathrm{op}}\leq\delta.
\end{align*}
Taking the supremum over $x\in K$ gives
\begin{align*}
\sup_{x\in K}\|D(f_\varepsilon)_x-Df_x\|_{\mathrm{op}}\leq\delta.
\end{align*}
Since $\delta>0$ was arbitrary and the bound holds for every $0<\varepsilon<\min\{\rho/2,\alpha\}$,
\begin{align*}
\sup_{x\in K}\|D(f_\varepsilon)_x-Df_x\|_{\mathrm{op}}\to 0
\end{align*}
as $\varepsilon\to 0$.
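For a concrete instance of this derivative estimate, consider the following sketch, which assumes $n=m=1$, $U=\mathbb{R}$, and $f(x)=x^2$ (chosen here purely for illustration). In one dimension the operator norm of the derivative is the absolute value of $f'$, and $|y|\leq\varepsilon$ on $\operatorname{supp}\eta_\varepsilon$.
\begin{align*}
(f_\varepsilon)'(x)&=\int_{\mathbb{R}}\eta_\varepsilon(y)\,2(x-y)\,d\mathcal{L}^1(y)=2x-2\int_{\mathbb{R}}y\,\eta_\varepsilon(y)\,d\mathcal{L}^1(y),\\
|(f_\varepsilon)'(x)-f'(x)|&\leq 2\int_{\mathbb{R}}|y|\,\eta_\varepsilon(y)\,d\mathcal{L}^1(y)\leq 2\varepsilon.
\end{align*}
The bound $2\varepsilon$ is independent of $x$, so in this example the convergence of the derivatives is uniform on every compact set.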
[/step]
[step:Combine the two uniform convergence estimates]
We have proved that, for all sufficiently small $\varepsilon>0$, the local mollification $f_\varepsilon$ is well-defined on the fixed neighbourhood $V$ of $K$ and smooth on a neighbourhood of $K$. We have also shown
\begin{align*}
\sup_{x\in K}|f_\varepsilon(x)-f(x)|\to 0
\end{align*}
and
\begin{align*}
\sup_{x\in K}\|D(f_\varepsilon)_x-Df_x\|_{\mathrm{op}}\to 0.
\end{align*}
Adding these two non-negative quantities gives, as $\varepsilon\to 0$,
\begin{align*}
\sup_{x \in K}|f_\varepsilon(x)-f(x)|+\sup_{x \in K}\|D(f_\varepsilon)_x-Df_x\|_{\mathrm{op}}\to 0.
\end{align*}
This is the asserted local $C^1$ convergence of mollifications.
[/step]