[proofplan]
We approximate $f \in L^p_{\mathrm{loc}}(U)$ near a compact set $K$ by a continuous function $g \in C_c(V)$, where $V$ is an intermediate open set with $K \Subset V \Subset U$. Minkowski's inequality then splits the error into three terms: $\|f_\varepsilon - g_\varepsilon\|_{L^p(K)}$, $\|g_\varepsilon - g\|_{L^p(K)}$, and $\|g - f\|_{L^p(K)}$. The third term is small by the density of $C_c(V)$ in $L^p(V)$. The second term tends to zero as $\varepsilon \to 0$ by uniform convergence of mollification for continuous functions. The first term is controlled by proving that mollification is an $L^p$-contraction, using Jensen's inequality and the Fubini-Tonelli theorem.
[/proofplan]
[step:Set up the compact geometry and fix the approximation target]
Fix a compact subset $K \Subset U$. Choose an intermediate [open set](/page/Open%20Set) $V$ with $K \Subset V \Subset U$, and let $\delta := \operatorname{dist}(K, \partial V)$. Then $\delta > 0$, because $K$ is compact, $\partial V$ is closed, and the two sets are disjoint. For $0 < \varepsilon < \delta$, if $x \in K$ and $y \in B(0,\varepsilon)$, then $x - y \in V$, so $f_\varepsilon(x)$ depends only on values of $f$ in $V$. Since $\overline{V}$ is a compact subset of $U$ and $f \in L^p_{\mathrm{loc}}(U)$, we have $f \in L^p(V)$.
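Let $\sigma > 0$ be arbitrary, and let $g \in C_c(V)$, extended by zero outside $V$, be a continuous approximation of $f$ to be chosen in the next step. Writing $f_\varepsilon - f = (f_\varepsilon - g_\varepsilon) + (g_\varepsilon - g) + (g - f)$ and applying Minkowski's inequality gives
\begin{align*}
\|f_\varepsilon - f\|_{L^p(K)} \leq \|f_\varepsilon - g_\varepsilon\|_{L^p(K)} + \|g_\varepsilon - g\|_{L^p(K)} + \|g - f\|_{L^p(K)}.
\end{align*}
The remaining steps bound each of the three terms by $\sigma/3$.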
[/step]
[step:Approximate $f$ by a continuous function $g$ in $L^p(V)$]
For $1 \leq p < \infty$, the space $C_c(V)$ is dense in $L^p(V)$ (see [L^p](/page/L%5Ep%20Spaces)). Since $f \in L^p(V)$, we can choose $g \in C_c(V)$ such that
\begin{align*}
\|f - g\|_{L^p(V)} < \frac{\sigma}{3}.
\end{align*}
Since $K \subset V$, the restriction to $K$ satisfies $\|g - f\|_{L^p(K)} \leq \|g - f\|_{L^p(V)} < \sigma/3$. This controls the third term.
[/step]
[step:Control the second term via uniform convergence of $g_\varepsilon \to g$ on $K$]
Extended by zero outside $V$, the function $g \in C_c(V)$ is continuous on $U$, so the [Uniform Convergence of Mollification on Compact Sets](/theorems/47) theorem gives $g_\varepsilon \to g$ uniformly on $K$ as $\varepsilon \to 0$. Therefore
\begin{align*}
\|g_\varepsilon - g\|_{L^p(K)}^p = \int_K |g_\varepsilon(x) - g(x)|^p \, d\mathcal{L}^n(x) \leq \mathcal{L}^n(K) \cdot \Bigl(\sup_{x \in K} |g_\varepsilon(x) - g(x)|\Bigr)^p \to 0.
\end{align*}
Choose $\varepsilon_0 \in (0, \delta)$ such that $\|g_\varepsilon - g\|_{L^p(K)} < \sigma/3$ for all $\varepsilon < \varepsilon_0$.
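To make the choice of $\varepsilon_0$ concrete, here is a sketch of an explicit bound. The modulus of continuity $\omega_g(\varepsilon) := \sup\{|g(z) - g(z')| : |z - z'| \leq \varepsilon\}$ is notation introduced only for this illustration. It tends to $0$ as $\varepsilon \to 0$ because $g$, extended by zero, is continuous with compact support and hence uniformly continuous. Using $\eta_\varepsilon \geq 0$ and $\int_{B(0,\varepsilon)} \eta_\varepsilon \, d\mathcal{L}^n = 1$, for every $x \in K$:
\begin{align*}
|g_\varepsilon(x) - g(x)| = \left|\int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, \bigl(g(x - y) - g(x)\bigr) \, d\mathcal{L}^n(y)\right| \leq \int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, |g(x - y) - g(x)| \, d\mathcal{L}^n(y) \leq \omega_g(\varepsilon).
\end{align*}
Hence $\|g_\varepsilon - g\|_{L^p(K)} \leq \mathcal{L}^n(K)^{1/p} \, \omega_g(\varepsilon)$. If $\mathcal{L}^n(K) > 0$, any $\varepsilon_0 \in (0, \delta)$ with $\omega_g(\varepsilon_0) < \sigma / \bigl(3 \, \mathcal{L}^n(K)^{1/p}\bigr)$ would work, since $\omega_g$ is non-decreasing. If $\mathcal{L}^n(K) = 0$, the second term is zero for every $\varepsilon$.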
[/step]
[step:Control the first term by proving mollification is an $L^p$-contraction]
By linearity of [convolution](/page/Convolution), $f_\varepsilon - g_\varepsilon = (f - g)_\varepsilon$. Define $h := f - g \in L^p(V)$. We show that $\|h_\varepsilon\|_{L^p(K)} \leq \|h\|_{L^p(V)}$ for every $\varepsilon \in (0, \delta)$.
For $x \in K$, since $\eta_\varepsilon \geq 0$ and $\int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, d\mathcal{L}^n(y) = 1$, the measure $\eta_\varepsilon(y) \, d\mathcal{L}^n(y)$ is a probability measure on $B(0,\varepsilon)$. Applying Jensen's inequality to the convex function $t \mapsto |t|^p$:
\begin{align*}
|h_\varepsilon(x)|^p = \left|\int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, h(x - y) \, d\mathcal{L}^n(y)\right|^p \leq \int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, |h(x - y)|^p \, d\mathcal{L}^n(y).
\end{align*}
Integrating over $x \in K$:
\begin{align*}
\int_K |h_\varepsilon(x)|^p \, d\mathcal{L}^n(x) \leq \int_K \int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, |h(x - y)|^p \, d\mathcal{L}^n(y) \, d\mathcal{L}^n(x).
\end{align*}
We now exchange the order of integration. The integrand $(x, y) \mapsto \eta_\varepsilon(y) \, |h(x - y)|^p$ is non-negative and measurable on $K \times B(0,\varepsilon)$, and $x - y \in V$ for $x \in K$, $y \in B(0,\varepsilon)$, so Tonelli's theorem applies without any integrability hypothesis:
\begin{align*}
\|h_\varepsilon\|_{L^p(K)}^p \leq \int_{B(0,\varepsilon)} \eta_\varepsilon(y) \left(\int_K |h(x - y)|^p \, d\mathcal{L}^n(x)\right) d\mathcal{L}^n(y).
\end{align*}
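For each fixed $y \in B(0,\varepsilon)$ with $\varepsilon < \delta$, the translation $x \mapsto x - y$ maps $K$ into $V$. By translation invariance of Lebesgue measure, $\int_K |h(x - y)|^p \, d\mathcal{L}^n(x) = \int_{K - y} |h(w)|^p \, d\mathcal{L}^n(w) \leq \|h\|_{L^p(V)}^p$. Substituting: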
\begin{align*}
\|h_\varepsilon\|_{L^p(K)}^p \leq \|h\|_{L^p(V)}^p \int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, d\mathcal{L}^n(y) = \|h\|_{L^p(V)}^p.
\end{align*}
Taking $p$-th roots: $\|(f - g)_\varepsilon\|_{L^p(K)} \leq \|f - g\|_{L^p(V)} < \sigma/3$.
[guided]
The goal is to show that mollification does not amplify the $L^p$ norm: it is an $L^p$-contraction. This is the step where both Jensen's inequality and the Fubini-Tonelli theorem are used.
By linearity, $f_\varepsilon - g_\varepsilon = (f - g)_\varepsilon$, so it suffices to bound $\|h_\varepsilon\|_{L^p(K)}$ for $h = f - g$. The key observation is that $\eta_\varepsilon \, d\mathcal{L}^n$ is a probability measure on $B(0,\varepsilon)$, since $\eta_\varepsilon \geq 0$ and $\int \eta_\varepsilon \, d\mathcal{L}^n = 1$. This allows us to apply Jensen's inequality with the convex function $\phi(t) = |t|^p$ (convex for $p \geq 1$):
\begin{align*}
|h_\varepsilon(x)|^p = \left|\int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, h(x - y) \, d\mathcal{L}^n(y)\right|^p \leq \int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, |h(x - y)|^p \, d\mathcal{L}^n(y).
\end{align*}
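As a cross-check on the Jensen step, the same pointwise bound can be sketched with Hölder's inequality. The conjugate exponent $q$, defined by $1/p + 1/q = 1$ for $p > 1$, is notation introduced only for this remark. Splitting $\eta_\varepsilon = \eta_\varepsilon^{1/q} \, \eta_\varepsilon^{1/p}$ and using $\int_{B(0,\varepsilon)} \eta_\varepsilon \, d\mathcal{L}^n = 1$:
\begin{align*}
|h_\varepsilon(x)| &\leq \int_{B(0,\varepsilon)} \eta_\varepsilon(y)^{1/q} \, \eta_\varepsilon(y)^{1/p} \, |h(x - y)| \, d\mathcal{L}^n(y) \\
&\leq \left(\int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, d\mathcal{L}^n(y)\right)^{1/q} \left(\int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, |h(x - y)|^p \, d\mathcal{L}^n(y)\right)^{1/p} \\
&= \left(\int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, |h(x - y)|^p \, d\mathcal{L}^n(y)\right)^{1/p}.
\end{align*}
Raising both sides to the power $p$ recovers the inequality above. For $p = 1$ no splitting is needed: the bound is the triangle inequality for integrals.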
Now integrate both sides over $x \in K$ with respect to $\mathcal{L}^n$:
\begin{align*}
\int_K |h_\varepsilon(x)|^p \, d\mathcal{L}^n(x) \leq \int_K \int_{B(0,\varepsilon)} \eta_\varepsilon(y) \, |h(x - y)|^p \, d\mathcal{L}^n(y) \, d\mathcal{L}^n(x).
\end{align*}
Why can we apply Fubini (Tonelli)? After replacing $h$ by a Borel-measurable representative, which changes neither $h_\varepsilon$ nor any of the norms involved, the integrand $\eta_\varepsilon(y) |h(x-y)|^p$ is non-negative and measurable on $K \times B(0,\varepsilon) \subset \mathbb{R}^n \times \mathbb{R}^n$ with respect to the product $\sigma$-algebra $\mathcal{B}(\mathbb{R}^n) \otimes \mathcal{B}(\mathbb{R}^n)$. Tonelli's theorem (which requires only non-negativity and measurability, not integrability) permits exchanging the order:
\begin{align*}
\|h_\varepsilon\|_{L^p(K)}^p \leq \int_{B(0,\varepsilon)} \eta_\varepsilon(y) \left(\int_K |h(x - y)|^p \, d\mathcal{L}^n(x)\right) d\mathcal{L}^n(y).
\end{align*}
For the inner integral: when $x \in K$ and $y \in B(0,\varepsilon)$ with $\varepsilon < \delta$, the point $x - y$ satisfies $\operatorname{dist}(x - y, K) \leq |y| < \delta = \operatorname{dist}(K, \partial V)$, so $x - y \in V$. Indeed, otherwise the segment from $x \in V$ to $x - y \notin V$ would meet $\partial V$ at a point within distance $|y| < \delta$ of $K$. Therefore the translated set $K - y$ is contained in $V$, and the change of variables $w = x - y$ together with translation invariance of Lebesgue measure gives $\int_K |h(x - y)|^p \, d\mathcal{L}^n(x) = \int_{K - y} |h(w)|^p \, d\mathcal{L}^n(w) \leq \int_V |h(w)|^p \, d\mathcal{L}^n(w) = \|h\|_{L^p(V)}^p$. Substituting and using the unit-mass property:
\begin{align*}
\|h_\varepsilon\|_{L^p(K)}^p \leq \|h\|_{L^p(V)}^p \cdot 1 = \|h\|_{L^p(V)}^p.
\end{align*}
Taking $p$-th roots gives $\|(f-g)_\varepsilon\|_{L^p(K)} \leq \|f - g\|_{L^p(V)} < \sigma/3$.
[/guided]
[/step]
[step:Combine the three estimates to conclude $L^p_{\mathrm{loc}}$ convergence]
For $\varepsilon < \varepsilon_0$, combining the three bounds:
\begin{align*}
\|f_\varepsilon - f\|_{L^p(K)} < \frac{\sigma}{3} + \frac{\sigma}{3} + \frac{\sigma}{3} = \sigma.
\end{align*}
Since $\sigma > 0$ was arbitrary, $\|f_\varepsilon - f\|_{L^p(K)} \to 0$ as $\varepsilon \to 0$. Since $K \Subset U$ was an arbitrary compact subset, this proves $f_\varepsilon \to f$ in $L^p_{\mathrm{loc}}(U)$.
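As a numerical sanity check of the statement (not part of the proof), the following minimal sketch discretises the one-dimensional case. Everything in it is an assumption chosen for illustration: $U = \mathbb{R}$, $V = (-2, 3)$, $K = [-1, 2]$ (so $\delta = 1$), $p = 2$, $f$ the indicator of $[0, 1)$, the standard bump profile for $\eta$, and the helper names `bump`, `weights`, `mollify_on_K`, and `lp_norm`. The discrete weights are normalised to sum to one, mirroring the unit-mass property used in the Jensen step.

```python
import math

STEPS_PER_UNIT = 200                     # grid resolution (illustrative assumption)
DX = 1.0 / STEPS_PER_UNIT
N = 5 * STEPS_PER_UNIT                   # V = (-2, 3) sampled at x_i = -2 + i*DX, i = 0..N
K_IDX = range(STEPS_PER_UNIT, 4 * STEPS_PER_UNIT + 1)   # grid indices of K = [-1, 2]


def bump(t):
    """Unnormalised mollifier profile exp(-1/(1 - t^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0


def weights(m):
    """Discrete analogue of eta_eps with eps = m*DX: non-negative weights summing to 1."""
    raw = [bump(j / m) for j in range(-m, m + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def mollify_on_K(values, m):
    """Return {i: sum_j w_j * values[i - j]} for every grid index i in K."""
    w = weights(m)
    return {i: sum(w[j + m] * values[i - j] for j in range(-m, m + 1)) for i in K_IDX}


def lp_norm(samples, p):
    """Riemann-sum approximation of an L^p norm on the grid."""
    return (sum(abs(v) ** p for v in samples) * DX) ** (1.0 / p)


# f = indicator of [0, 1): locally p-integrable but discontinuous.
f = [1.0 if 2 * STEPS_PER_UNIT <= i < 3 * STEPS_PER_UNIT else 0.0 for i in range(N + 1)]

P = 2
RADII = [80, 40, 20, 10]                 # eps = 0.4, 0.2, 0.1, 0.05, all below delta = 1
norm_f_V = lp_norm(f, P)                 # discrete ||f||_{L^p(V)}

errors, mollified_norms = [], []
for m in RADII:
    f_eps = mollify_on_K(f, m)
    errors.append(lp_norm([f_eps[i] - f[i] for i in K_IDX], P))
    mollified_norms.append(lp_norm(f_eps.values(), P))

for m, err, nrm in zip(RADII, errors, mollified_norms):
    print(f"eps={m * DX:.3f}  ||f_eps - f||_Lp(K)={err:.4f}  ||f_eps||_Lp(K)={nrm:.4f}")
```

In this sketch the printed error should shrink as $\varepsilon$ decreases, and the discrete norm of $f_\varepsilon$ on $K$ should never exceed the discrete norm of $f$ on $V$, in line with the contraction estimate. Because the shifts are whole grid steps, the discrete version of the Jensen and Tonelli argument applies exactly, up to floating-point rounding.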
[/step]