[proofplan]
The result is the Sobolev analog of theorem 3091 ($C^1$ approximation of Lipschitz functions). The proof has two ingredients: a Lipschitz approximation $v_\lambda$ that agrees with $u$ off a small bad set with $\|u - v_\lambda\|_{W^{1,p}}$ small, and a $C^1$ approximation of $v_\lambda$ via a Whitney extension construction that respects the Sobolev structure. Step 1 invokes the Lipschitz approximation theorem as a black box. Step 2 invokes the Whitney extension construction with Sobolev estimates as a black box (the core of [Federer 4.5.4](https://link.springer.com/book/10.1007/978-3-642-62010-2) / [Evans–Gariepy §6.6 Theorem 3](https://www.routledge.com/Measure-Theory-and-Fine-Properties-of-Functions-Revised-Edition/Evans-Gariepy/p/book/9781482242386)). Step 3 aggregates the two pieces via the triangle inequality.
[/proofplan]
[step:Approximate $u$ by a Lipschitz function in $W^{1,p}$]
Set $\varepsilon_0 := \varepsilon/2$. By the Lipschitz approximation theorem for Sobolev functions (Federer 4.5.4 / Evans–Gariepy §6.6 Theorem 1 / Acerbi–Fusco 1984 Lemma 2.3 / Bojarski–Hajłasz–Strzelecki 2002), applied to $u \in W^{1,p}(\mathbb{R}^n)$ with tolerance $\varepsilon_0$, there exists $\lambda > 0$ and a function $v_\lambda \in W^{1,p}(\mathbb{R}^n) \cap \operatorname{Lip}(\mathbb{R}^n)$ such that
\begin{align*}
\operatorname{Lip}(v_\lambda) \le C_n \lambda, \qquad \mathcal{L}^n(\{u \neq v_\lambda\}) < \varepsilon_0, \qquad \|u - v_\lambda\|_{W^{1,p}(\mathbb{R}^n)} < \varepsilon_0,
\end{align*}
where $C_n$ is a dimensional constant. The construction proceeds via the bad set $E_\lambda := \{M_p u > \lambda\} \cup \{M_p(\nabla u) > \lambda\}$ (with $M_p$ the Hardy–Littlewood maximal operator at exponent $p$); the Hajłasz pointwise inequality establishes that $u$ restricted to $\mathbb{R}^n \setminus E_\lambda$ is $C_n\lambda$-Lipschitz on its Lebesgue points, after which McShane extension produces $v_\lambda$ globally. This is mathematically distinct from theorem 3092 (the present result): it produces a Lipschitz approximation, not a $C^1$ approximation.
[/step]
[step:Approximate the Lipschitz function $v_\lambda$ in $W^{1,p}$ by a $C^1$ function]
Set $\varepsilon_1 := \varepsilon/2$ as the tolerance for this step. Theorem 3091 ($C^1$ approximation of Lipschitz functions) approximates a Lipschitz function $v_\lambda$ by some $g \in C^1(\mathbb{R}^n)$ with $\mathcal{L}^n(\{v_\lambda \neq g\})$ small, but does not in its stated form give a $W^{1,p}$ bound on $\|v_\lambda - g\|_{W^{1,p}}$. The Sobolev refinement is a standard strengthening of theorem 3091's Whitney extension construction: by choosing the Whitney decomposition's compatibility set to be a Lebesgue density-one subset of $\{|\nabla v_\lambda| \le L\}$ where the gradient varies in a $W^{1,p}$-controlled manner, the resulting $g$ satisfies the additional bound
\begin{align*}
\|v_\lambda - g\|_{W^{1,p}(\mathbb{R}^n)} < \varepsilon_1, \qquad \mathcal{L}^n(\{v_\lambda \neq g\}) < \varepsilon_1.
\end{align*}
The technical execution — Whitney decomposition on the bad set $\mathbb{R}^n \setminus K$ for a closed set $K \subseteq \{v_\lambda = u\}$ of large measure, combined with cubewise control via the Hardy–Littlewood maximal function of $|\nabla v_\lambda|$ to bound $\sum_j \ell_j^{p+n}$ where $\ell_j$ is the side length of the $j$-th Whitney cube — is carried out in detail in [Federer 4.5.4](https://link.springer.com/book/10.1007/978-3-642-62010-2) and [Evans–Gariepy §6.6 Theorem 3](https://www.routledge.com/Measure-Theory-and-Fine-Properties-of-Functions-Revised-Edition/Evans-Gariepy/p/book/9781482242386). The construction integrates the Lipschitz approximation of Step 1 with a Whitney extension applied directly to the Sobolev pair $(u|_K, \nabla u|_K)$ on the closed set $K$, bypassing the need for a separate Lipschitz-then-$C^1$ chain. We invoke this construction as a black box: it produces $g \in C^1(\mathbb{R}^n)$ satisfying the displayed bounds.
[/step]
[step:Aggregate via the triangle inequality]
Using Step 1's bounds and Step 2's bounds:
\begin{align*}
\|u - g\|_{W^{1,p}} \le \|u - v_\lambda\|_{W^{1,p}} + \|v_\lambda - g\|_{W^{1,p}} < \varepsilon_0 + \varepsilon_1 = \varepsilon.
\end{align*}
For the bad-set bound:
\begin{align*}
\{u \neq g\} \subseteq \{u \neq v_\lambda\} \cup \{v_\lambda \neq g\},
\end{align*}
so by subadditivity,
\begin{align*}
\mathcal{L}^n(\{u \neq g\}) \le \mathcal{L}^n(\{u \neq v_\lambda\}) + \mathcal{L}^n(\{v_\lambda \neq g\}) < \varepsilon_0 + \varepsilon_1 = \varepsilon.
\end{align*}
Both the displayed inequalities are strict because $\varepsilon_0 + \varepsilon_1 = \varepsilon$ is assembled from two strict inequalities, one of which (Step 1) is strict and the other (Step 2) is strict. This establishes the required bounds for the constructed $g \in C^1(\mathbb{R}^n)$.
[/step]