[proofplan]
The proof has two parts. First, compactly supported Sobolev functions inside $U$ are approximated by convolution with a [standard mollifier](/page/Standard%20Mollifier), and convergence is checked derivative by derivative in $L^p$. Second, a locally finite smooth [partition of unity](/page/Partition%20of%20Unity) decomposes an arbitrary $u \in W^{k,p}(U)$ into compactly supported pieces, each piece is mollified at a scale small enough to remain inside $U$, and the pieces are summed with controlled error. For bounded Lipschitz domains, one uses the bounded Lipschitz Sobolev extension theorem, mollifies the extension on $\mathbb{R}^n$, cuts off near $\overline{U}$, and restricts back to $U$.
[/proofplan]
[step:Approximate compactly supported Sobolev functions by mollification]
Let $\mathcal{L}^n$ denote $n$-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb{R}^n$. Let $\rho \in C_c^\infty(B(0,1))$ be a standard non-negative mollifier satisfying
\begin{align*}
\int_{\mathbb{R}^n} \rho(x)\,d\mathcal{L}^n(x)=1.
\end{align*}
For $\varepsilon > 0$, define the mollifier
\begin{align*}
\rho_\varepsilon:\mathbb{R}^n \to [0,\infty), \qquad x \mapsto \varepsilon^{-n}\rho(x/\varepsilon).
\end{align*}
Let $v \in W^{k,p}(U)$ have a representative, still denoted $v$, whose distributional support is the compact set $K \subset U$. Choose $\varepsilon_0 > 0$ such that
\begin{align*}
\varepsilon_0 < \operatorname{dist}(K,\mathbb{R}^n \setminus U).
\end{align*}
Choose $\eta \in C_c^\infty(U)$ such that $\eta=1$ on an open neighbourhood of $K$. Define the zero extension $\tilde v:\mathbb{R}^n \to \mathbb{R}$ by setting $\tilde v=v$ on $U$ and $\tilde v=0$ on $\mathbb{R}^n\setminus U$. For each multi-index $\alpha$ with $|\alpha|\le k$, let $\widetilde{D^\alpha v}:\mathbb{R}^n\to\mathbb{R}$ denote the zero extension of the weak derivative $D^\alpha v$ from $U$ to $\mathbb{R}^n$. Since $\eta v=v$ in $W^{k,p}(U)$ and $\eta v$ has compact support in $U$, the extension agrees with the compactly supported Sobolev function $\eta v$ on $U$ and has no boundary contribution across $\partial U$. Equivalently, testing against any $\phi \in C_c^\infty(\mathbb{R}^n)$ and using the [Basic Properties of the Weak Derivative](/theorems/77) gives $D^\alpha \tilde v=\widetilde{D^\alpha v}$ for every multi-index $\alpha$ with $|\alpha|\le k$, so $\tilde v \in W^{k,p}(\mathbb{R}^n)$.
For $0 < \varepsilon < \varepsilon_0$, define
\begin{align*}
v_\varepsilon:U \to \mathbb{R}, \qquad x \mapsto \int_{\mathbb{R}^n} \rho_\varepsilon(x-y)\tilde v(y)\,d\mathcal{L}^n(y).
\end{align*}
Then $v_\varepsilon \in C^\infty(U)$, and the support condition on $\rho_\varepsilon$ gives $\operatorname{supp} v_\varepsilon \subset U$.
For every multi-index $\alpha \in \mathbb{N}_0^n$ with $|\alpha| \leq k$, the [weak derivative](/page/Weak%20Derivative) satisfies
\begin{align*}
D^\alpha v_\varepsilon = \rho_\varepsilon * D^\alpha \tilde v
\end{align*}
on $U$. Set $g_\alpha:=D^\alpha \tilde v\in L^p(\mathbb{R}^n)$. Since $1\le p<\infty$, translation is continuous in $L^p(\mathbb{R}^n)$. Using the substitution $y=x-\varepsilon z$ in the convolution integral with respect to $\mathcal{L}^n$, [Minkowski's integral inequality](/theorems/464) gives
\begin{align*}
\|\rho_\varepsilon*g_\alpha-g_\alpha\|_{L^p(\mathbb{R}^n)}\le \int_{B(0,1)} \rho(z)\|g_\alpha(\cdot-\varepsilon z)-g_\alpha\|_{L^p(\mathbb{R}^n)}\,d\mathcal{L}^n(z).
\end{align*}
For each fixed $z\in B(0,1)$, the norm inside the integral tends to $0$ by translation continuity. It is bounded by $2\|g_\alpha\|_{L^p(\mathbb{R}^n)}$, and $\rho$ is integrable with respect to $\mathcal{L}^n$, so the [dominated convergence theorem](/theorems/4) yields
\begin{align*}
\|D^\alpha v_\varepsilon - D^\alpha \tilde v\|_{L^p(\mathbb{R}^n)}=\|\rho_\varepsilon*g_\alpha-g_\alpha\|_{L^p(\mathbb{R}^n)} \to 0
\end{align*}
as $\varepsilon \downarrow 0$. Restricting the integral norm from $\mathbb{R}^n$ to $U$ gives
\begin{align*}
\|v_\varepsilon - v\|_{W^{k,p}(U)} \to 0.
\end{align*}
[guided]
The compact support assumption is the place where convolution is easiest to use. We use a representative of $v$ whose distributional support is the compact set $K \subset U$. Because $K$ is compact and lies inside $U$, the distance from $K$ to the complement of $U$ is positive after choosing a smaller positive number:
\begin{align*}
0 < \varepsilon_0 < \operatorname{dist}(K,\mathbb{R}^n \setminus U).
\end{align*}
This guarantees that when $0 < \varepsilon < \varepsilon_0$, the convolution kernel $\rho_\varepsilon(x-y)$ only samples values of the zero extension $\tilde v$ from points that remain inside $U$ whenever $x$ is near $K$. Thus no boundary values outside $U$ contaminate the approximation.
We define
\begin{align*}
v_\varepsilon:U \to \mathbb{R}, \qquad x \mapsto \int_{\mathbb{R}^n} \rho_\varepsilon(x-y)\tilde v(y)\,d\mathcal{L}^n(y).
\end{align*}
The integral is with respect to $n$-dimensional Lebesgue measure, and the integrand is integrable because $\rho_\varepsilon$ is smooth with compact support and $\tilde v \in L^p(\mathbb{R}^n) \subset L^1_{\mathrm{loc}}(\mathbb{R}^n)$. Differentiating convolution in the weak sense gives, for each multi-index $\alpha$ with $|\alpha| \leq k$,
\begin{align*}
D^\alpha v_\varepsilon = \rho_\varepsilon * D^\alpha \tilde v.
\end{align*}
This identity is the reason Sobolev convergence follows from ordinary $L^p$ convergence: every derivative of order at most $k$ is approximated by mollifying the corresponding weak derivative.
Set $g_\alpha:=D^\alpha \tilde v$. Since $g_\alpha\in L^p(\mathbb{R}^n)$ and $1\le p<\infty$, the global approximate identity argument applies on the whole space. The key estimate is obtained by writing the convolution with the substitution $y=x-\varepsilon z$ and then applying Minkowski's integral inequality:
\begin{align*}
\|\rho_\varepsilon*g_\alpha-g_\alpha\|_{L^p(\mathbb{R}^n)}\le \int_{B(0,1)} \rho(z)\|g_\alpha(\cdot-\varepsilon z)-g_\alpha\|_{L^p(\mathbb{R}^n)}\,d\mathcal{L}^n(z).
\end{align*}
For each fixed $z$, translation continuity in $L^p(\mathbb{R}^n)$ gives $\|g_\alpha(\cdot-\varepsilon z)-g_\alpha\|_{L^p(\mathbb{R}^n)}\to0$. The integrand is bounded by $2\rho(z)\|g_\alpha\|_{L^p(\mathbb{R}^n)}$, which is integrable over $B(0,1)$ with respect to $\mathcal{L}^n$. The dominated convergence theorem therefore gives
\begin{align*}
\|\rho_\varepsilon * D^\alpha \tilde v - D^\alpha \tilde v\|_{L^p(\mathbb{R}^n)}=\|\rho_\varepsilon*g_\alpha-g_\alpha\|_{L^p(\mathbb{R}^n)} \to 0.
\end{align*}
Applying this for all finitely many multi-indices $\alpha$ with $|\alpha| \leq k$ and summing the resulting $L^p$ errors yields
\begin{align*}
\|v_\varepsilon - v\|_{W^{k,p}(U)} \to 0.
\end{align*}
Thus compactly supported Sobolev functions inside $U$ can be approximated in the full Sobolev norm by smooth functions.
[/guided]
[/step]
[step:Choose an exhaustion partition whose pieces stay away from the boundary]
For each $j\in\mathbb{N}$, define the open exhaustion set
\begin{align*}
U_j=\{x\in U: |x|<j\text{ and }\operatorname{dist}(x,\mathbb{R}^n\setminus U)>1/j\}.
\end{align*}
Set $U_0=U_{-1}=\varnothing$ and define the open ring
\begin{align*}
V_j=U_{j+1}\setminus \overline{U_{j-1}}.
\end{align*}
The family $(V_j)_{j=1}^\infty$ is an open cover of $U$: if $x\in U$, then $x\in U_m$ for all sufficiently large $m$, and choosing the least such $m$ gives $x\in U_m\setminus\overline{U_{m-2}}=V_{m-1}$. It is locally finite because, for each $x\in U$, there exists $m$ with $x\in U_m$, and then a neighbourhood of $x$ contained in $U_m$ meets no $V_j$ with $j\ge m+2$ and meets only finitely many lower-indexed sets. Also $\overline{V_j}\subset\overline{U_{j+1}}$, and $\overline{U_{j+1}}\subset\{x\in\mathbb{R}^n: |x|\le j+1,\operatorname{dist}(x,\mathbb{R}^n\setminus U)\ge 1/(j+1)\}$ is compactly contained in $U$.
By the [Existence of Partitions of Unity](/theorems/57), applied to this locally finite precompact cover, choose a locally finite smooth refinement $(G_j)_{j=1}^\infty$ with $\overline{G_j}\Subset V_j$ and a smooth partition of unity $(\psi_j)_{j=1}^\infty$ satisfying $\operatorname{supp}\psi_j\subset \overline{G_j}\Subset V_j$, $0\le \psi_j\le 1$, and $\sum_{j=1}^\infty \psi_j(x)=1$ for every $x\in U$. In particular, $\psi_j\in C_c^\infty(V_j)$ for every $j\in\mathbb{N}$.
For $u\in W^{k,p}(U)$, define
\begin{align*}
u_j:U\to \mathbb{R}, \qquad x\mapsto \psi_j(x)u(x).
\end{align*}
By the Leibniz formula in the [Basic Properties of the Weak Derivative](/theorems/77), each $u_j$ belongs to $W^{k,p}(U)$ and has compact support in $V_j$. Let $W^{k,p}_{\mathrm{loc}}(U)$ denote the space of functions whose restriction to every relatively compact open subset of $U$ belongs to $W^{k,p}$ on that subset. Since the partition is locally finite, the identity $u=\sum_{j=1}^\infty u_j$ holds locally as an equality of distributions and hence in $W^{k,p}_{\mathrm{loc}}(U)$.
[guided]
The reason for using the particular rings $V_j$ is that each local piece must have room to be mollified without touching the boundary. The sets $U_j$ exhaust $U$ from the inside: if $x\in U$, then $|x|<m$ and $\operatorname{dist}(x,\mathbb{R}^n\setminus U)>1/m$ for all sufficiently large $m$. Hence $x$ belongs to at least one ring $V_j=U_{j+1}\setminus \overline{U_{j-1}}$, so the family covers $U$.
We also need local finiteness. Choose $m$ with $x\in U_m$. Since $U_m$ is open, a neighbourhood of $x$ lies in $U_m$; this neighbourhood cannot meet $V_j$ for $j\ge m+2$, because those rings avoid $\overline{U_{j-1}}\supset U_m$. It meets only finitely many lower-indexed rings, so the cover is locally finite. Finally, $\overline{V_j}\subset\overline{U_{j+1}}$, and $\overline{U_{j+1}}$ is bounded and has distance at least $1/(j+1)$ from $\mathbb{R}^n\setminus U$. Thus $\overline{V_j}\Subset U$.
We apply the [Existence of Partitions of Unity](/theorems/57) to this locally finite precompact open cover. The compact-support conclusion comes from the locally finite smooth refinement in the partition construction: choose open sets $G_j$ with $\overline{G_j}\Subset V_j$, then build the partition on those shrinkings. Therefore we may take $\operatorname{supp}\psi_j\subset\overline{G_j}\Subset V_j$, $0\le \psi_j\le 1$, and $\sum_{j=1}^\infty\psi_j(x)=1$ for every $x\in U$, with only finitely many non-zero terms near each point. In particular, $\psi_j\in C_c^\infty(V_j)$.
Now define $u_j:U\to\mathbb{R}$ by $u_j(x)=\psi_j(x)u(x)$. The multiplier $\psi_j$ is smooth with compact support, and $u\in W^{k,p}(U)$, so the Leibniz formula from the [Basic Properties of the Weak Derivative](/theorems/77) implies $u_j\in W^{k,p}(U)$. Since $\operatorname{supp}\psi_j\subset V_j$, the support of $u_j$ is compactly contained in $V_j$. Local finiteness is essential here: near each point, the sum $\sum_j \psi_j u$ is a finite sum, so distributional derivatives may be computed term by term. Let $W^{k,p}_{\mathrm{loc}}(U)$ denote the space of functions whose restriction to every relatively compact open subset of $U$ belongs to $W^{k,p}$ on that subset. Therefore $u=\sum_j u_j$ locally in $W^{k,p}_{\mathrm{loc}}(U)$.
[/guided]
[/step]
[step:Approximate every local piece with summable Sobolev error]
Fix $\delta>0$. For each $j\in\mathbb{N}$, the compact support of $u_j$ lies in $V_j\Subset U$. Applying the compactly supported mollification step with mollification radius smaller than $\operatorname{dist}(\operatorname{supp}u_j,\mathbb{R}^n\setminus V_j)$, choose $w_j\in C_c^\infty(V_j)$ such that
\begin{align*}
\|w_j-u_j\|_{W^{k,p}(U)}<\delta 2^{-j}.
\end{align*}
Define
\begin{align*}
w:U\to\mathbb{R}, \qquad x\mapsto \sum_{j=1}^\infty w_j(x).
\end{align*}
The sum is locally finite because $\operatorname{supp}w_j\subset V_j$ and the cover $(V_j)$ is locally finite, so $w\in C^\infty(U)$. Moreover, the error series converges absolutely in the [Banach space](/page/Banach%20Space) $W^{k,p}(U)$, since
\begin{align*}
\sum_{j=1}^\infty \|w_j-u_j\|_{W^{k,p}(U)}<\delta.
\end{align*}
The absolutely convergent series $\sum_{j=1}^\infty (w_j-u_j)$ defines an element of the Banach space $W^{k,p}(U)$. Because the sums defining $u=\sum_{j=1}^\infty u_j$ and $w=\sum_{j=1}^\infty w_j$ are locally finite, this Sobolev element agrees distributionally, hence $\mathcal{L}^n$-a.e., with the locally finite function $w-u$. Thus $w-u=\sum_{j=1}^\infty (w_j-u_j)$ in $W^{k,p}(U)$, and the triangle inequality gives
\begin{align*}
\|w-u\|_{W^{k,p}(U)}\le \sum_{j=1}^\infty \|w_j-u_j\|_{W^{k,p}(U)}<\delta.
\end{align*}
Since $u\in W^{k,p}(U)$ and $\delta>0$ were arbitrary, $C^\infty(U)\cap W^{k,p}(U)$ is dense in $W^{k,p}(U)$.
[guided]
The finite truncation argument is not enough here: the derivatives of partition functions enter the Sobolev norm, so one cannot simply discard the tail of an arbitrary partition without proving convergence. Instead, we approximate every piece and force the errors to be summable.
For each $j$, the function $u_j=\psi_j u$ has compact support inside $V_j\Subset U$. The compactly supported mollification step applies because the support is a positive distance from $\mathbb{R}^n\setminus V_j$. Therefore we can choose $w_j\in C_c^\infty(V_j)$ with $\|w_j-u_j\|_{W^{k,p}(U)}<\delta 2^{-j}$. The geometric weights are chosen so that the total error is bounded by a convergent series.
Define $w(x)=\sum_{j=1}^\infty w_j(x)$. This is a smooth function on $U$ because the family $(V_j)$ is locally finite, so only finitely many $w_j$ are non-zero near any fixed $x\in U$. In the Sobolev norm, the series of errors is absolutely summable:
\begin{align*}
\sum_{j=1}^\infty \|w_j-u_j\|_{W^{k,p}(U)}<\sum_{j=1}^\infty \delta 2^{-j}=\delta.
\end{align*}
Since $W^{k,p}(U)$ is complete, $\sum_j(w_j-u_j)$ converges in $W^{k,p}(U)$. This limit agrees distributionally with the locally finite difference $w-u$, because near every point only finitely many terms from the partitions are present. Equality as distributions between Sobolev functions implies equality $\mathcal{L}^n$-a.e., so $w-u=\sum_j(w_j-u_j)$ in $W^{k,p}(U)$. Hence
\begin{align*}
\|w-u\|_{W^{k,p}(U)}\le \sum_{j=1}^\infty \|w_j-u_j\|_{W^{k,p}(U)}<\delta.
\end{align*}
This proves density for an arbitrary Sobolev function, not only for compactly supported functions or finite sums of local pieces.
[/guided]
[/step]
[step:Use the bounded Lipschitz extension theorem to obtain global compactly supported smooth approximants]
Assume now that $U$ is a bounded Lipschitz domain. We use the [Sobolev Extension Theorem for Bounded Lipschitz Domains](/theorems/59): for every bounded Lipschitz domain $U\subset\mathbb{R}^n$, every $k\in\mathbb{N}_0$, and every $1\le p<\infty$, there exists a bounded linear extension operator
\begin{align*}
E:W^{k,p}(U) \to W^{k,p}(\mathbb{R}^n)
\end{align*}
such that $(Eu)|_U=u$ for every $u \in W^{k,p}(U)$.
Let $u \in W^{k,p}(U)$ and define the extension $u_{\mathrm{ext}}\in W^{k,p}(\mathbb{R}^n)$ by
\begin{align*}
u_{\mathrm{ext}} = Eu.
\end{align*}
For $\varepsilon > 0$, define
\begin{align*}
u_{\varepsilon,\mathrm{ext}}:\mathbb{R}^n \to \mathbb{R}, \qquad x \mapsto \int_{\mathbb{R}^n}\rho_\varepsilon(x-y)u_{\mathrm{ext}}(y)\,d\mathcal{L}^n(y).
\end{align*}
For each multi-index $\alpha\in\mathbb{N}_0^n$ with $|\alpha|\le k$, set $g_\alpha:=D^\alpha u_{\mathrm{ext}}\in L^p(\mathbb{R}^n)$. The global approximate identity argument, using translation continuity in $L^p(\mathbb{R}^n)$ and Minkowski's integral inequality, gives
\begin{align*}
\|\rho_\varepsilon*g_\alpha-g_\alpha\|_{L^p(\mathbb{R}^n)}\to 0
\end{align*}
as $\varepsilon\downarrow 0$. Applying this to all finitely many $\alpha$ with $|\alpha|\le k$ yields
\begin{align*}
\|u_{\varepsilon,\mathrm{ext}}-u_{\mathrm{ext}}\|_{W^{k,p}(\mathbb{R}^n)} \to 0.
\end{align*}
Since $U$ is bounded, choose $\zeta \in C_c^\infty(\mathbb{R}^n)$ such that $\zeta=1$ on an open neighbourhood of $\overline{U}$. Define
\begin{align*}
\varphi_\varepsilon:\mathbb{R}^n \to \mathbb{R}, \qquad x \mapsto \zeta(x)u_{\varepsilon,\mathrm{ext}}(x).
\end{align*}
Then $\varphi_\varepsilon \in C_c^\infty(\mathbb{R}^n)$, and $\varphi_\varepsilon|_U=u_{\varepsilon,\mathrm{ext}}|_U$. Hence
\begin{align*}
\|\varphi_\varepsilon|_U-u\|_{W^{k,p}(U)}=\|u_{\varepsilon,\mathrm{ext}}|_U-u_{\mathrm{ext}}|_U\|_{W^{k,p}(U)} \leq \|u_{\varepsilon,\mathrm{ext}}-u_{\mathrm{ext}}\|_{W^{k,p}(\mathbb{R}^n)}.
\end{align*}
The right-hand side tends to $0$, so restrictions of functions in $C_c^\infty(\mathbb{R}^n)$ are dense in $W^{k,p}(U)$.
[guided]
The bounded Lipschitz hypothesis is used only to obtain an extension operator. The [Sobolev Extension Theorem for Bounded Lipschitz Domains](/theorems/59) applies because $U$ is a bounded Lipschitz domain, $k\in\mathbb{N}_0$, and $1\le p<\infty$. It gives a bounded [linear map](/page/Linear%20Map) $E:W^{k,p}(U)\to W^{k,p}(\mathbb{R}^n)$ such that $(Eu)|_U=u$.
Define $u_{\mathrm{ext}}=Eu$. For $\varepsilon>0$, define $u_{\varepsilon,\mathrm{ext}}=\rho_\varepsilon*u_{\mathrm{ext}}$. The same derivative identity used earlier gives $D^\alpha u_{\varepsilon,\mathrm{ext}}=\rho_\varepsilon*D^\alpha u_{\mathrm{ext}}$ for every multi-index $\alpha$ with $|\alpha|\le k$. For each such $\alpha$, the function $g_\alpha:=D^\alpha u_{\mathrm{ext}}$ belongs to $L^p(\mathbb{R}^n)$. Since $1\le p<\infty$, translations are continuous in $L^p(\mathbb{R}^n)$, and Minkowski's integral inequality turns this translation continuity into the global approximate identity estimate
\begin{align*}
\|\rho_\varepsilon*g_\alpha-g_\alpha\|_{L^p(\mathbb{R}^n)}\to 0.
\end{align*}
Summing over the finitely many $\alpha$ with $|\alpha|\le k$ proves
\begin{align*}
\|u_{\varepsilon,\mathrm{ext}}-u_{\mathrm{ext}}\|_{W^{k,p}(\mathbb{R}^n)}\to 0.
\end{align*}
Because $U$ is bounded, $\overline U$ is compact, so we may choose $\zeta\in C_c^\infty(\mathbb{R}^n)$ with $\zeta=1$ on an open neighbourhood of $\overline U$. Then $\varphi_\varepsilon=\zeta u_{\varepsilon,\mathrm{ext}}$ lies in $C_c^\infty(\mathbb{R}^n)$ and agrees with $u_{\varepsilon,\mathrm{ext}}$ on $U$.
Restricting to $U$ cannot increase the Sobolev norm, so
\begin{align*}
\|\varphi_\varepsilon|_U-u\|_{W^{k,p}(U)}\le \|u_{\varepsilon,\mathrm{ext}}-u_{\mathrm{ext}}\|_{W^{k,p}(\mathbb{R}^n)}.
\end{align*}
The right-hand side tends to $0$, proving density by restrictions of globally compactly supported smooth functions.
[/guided]
[/step]