[proofplan]
We express the difference $f_\varepsilon(x) - f(x)$ as a single integral against the mollifier kernel by exploiting the unit-mass property $\int \eta_\varepsilon \, d\mathcal{L}^n = 1$. Taking absolute values and bounding the kernel by $C/\varepsilon^n$, we reduce the problem to showing that the ball average $\frac{1}{\mathcal{L}^n(B(x,\varepsilon))} \int_{B(x,\varepsilon)} |f(y) - f(x)| \, d\mathcal{L}^n(y) \to 0$. The [Lebesgue Differentiation Theorem](/theorems/74) guarantees this limit at every Lebesgue point of $f$, and $\mathcal{L}^n$-almost every point of $U$ is a Lebesgue point for $f \in L^1_{\mathrm{loc}}(U)$.
[/proofplan]
[step:Express $f_\varepsilon(x) - f(x)$ as a single integral against the mollifier kernel]
Fix $x \in U$ and $\varepsilon > 0$ small enough that $\overline{B}(x, \varepsilon) \subset U$, so that $x \in U_\varepsilon$ and $f$ is integrable on $B(x,\varepsilon)$. The [standard mollifier](/page/Standard%20Mollifier) $\eta_\varepsilon$ satisfies $\operatorname{supp}(\eta_\varepsilon) \subset \overline{B}(0,\varepsilon)$ and $\int_{\mathbb{R}^n} \eta_\varepsilon(z) \, d\mathcal{L}^n(z) = 1$. Substituting $z = x - y$ and using the support condition, the unit-mass property becomes
\begin{align*}
\int_{B(x,\varepsilon)} \eta_\varepsilon(x - y) \, d\mathcal{L}^n(y) = 1.
\end{align*}
Multiplying through by $f(x)$ and subtracting from the definition $f_\varepsilon(x) = \int_{B(x,\varepsilon)} \eta_\varepsilon(x - y) f(y) \, d\mathcal{L}^n(y)$ gives
\begin{align*}
f_\varepsilon(x) - f(x) = \int_{B(x,\varepsilon)} \eta_\varepsilon(x - y) \bigl(f(y) - f(x)\bigr) \, d\mathcal{L}^n(y).
\end{align*}
[guided]
The idea is to represent $f(x)$ as an integral against the same kernel so that the difference becomes a single integral. Since $\eta_\varepsilon$ is a probability kernel (non-negative with total mass $1$), we can write
\begin{align*}
f(x) = f(x) \cdot 1 = f(x) \int_{\mathbb{R}^n} \eta_\varepsilon(z) \, d\mathcal{L}^n(z).
\end{align*}
Substituting $z = x - y$ (which maps $B(0,\varepsilon)$ to $B(x,\varepsilon)$ and preserves Lebesgue measure) and using $\operatorname{supp}(\eta_\varepsilon) \subset \overline{B}(0,\varepsilon)$, this becomes
\begin{align*}
f(x) = \int_{B(x,\varepsilon)} \eta_\varepsilon(x - y) f(x) \, d\mathcal{L}^n(y).
\end{align*}
Now subtract this identity from the definition $f_\varepsilon(x) = \int_{B(x,\varepsilon)} \eta_\varepsilon(x - y) f(y) \, d\mathcal{L}^n(y)$:
\begin{align*}
f_\varepsilon(x) - f(x) = \int_{B(x,\varepsilon)} \eta_\varepsilon(x - y) \bigl(f(y) - f(x)\bigr) \, d\mathcal{L}^n(y).
\end{align*}
This representation localises the problem: the convergence $f_\varepsilon(x) \to f(x)$ reduces to controlling the oscillation of $f$ near $x$.
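As a quick check of the unit-mass identity used above, the following sketch assumes the scaling $\eta_\varepsilon(z) = \varepsilon^{-n}\eta(z/\varepsilon)$ and that $\eta$ itself has unit mass; the variable $w$ is introduced here only for illustration. Setting $w = z/\varepsilon$, so that $d\mathcal{L}^n(z) = \varepsilon^n \, d\mathcal{L}^n(w)$,
\begin{align*}
\int_{\mathbb{R}^n} \eta_\varepsilon(z) \, d\mathcal{L}^n(z) = \int_{\mathbb{R}^n} \varepsilon^{-n} \eta(z/\varepsilon) \, d\mathcal{L}^n(z) = \int_{\mathbb{R}^n} \eta(w) \, d\mathcal{L}^n(w) = 1.
\end{align*}
The factor $\varepsilon^{-n}$ is what keeps the total mass independent of $\varepsilon$ while the support shrinks to $\overline{B}(0,\varepsilon)$.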
[/guided]
[/step]
[step:Bound the kernel and reduce to a ball average]
Taking absolute values and applying the triangle inequality for integrals:
\begin{align*}
|f_\varepsilon(x) - f(x)| \leq \int_{B(x,\varepsilon)} |\eta_\varepsilon(x - y)| \, |f(y) - f(x)| \, d\mathcal{L}^n(y).
\end{align*}
The rescaled mollifier satisfies $\eta_\varepsilon(z) = \varepsilon^{-n} \eta(z/\varepsilon)$. Since $\eta \in C_c^\infty(\mathbb{R}^n)$ with $\operatorname{supp}(\eta) \subset \overline{B}(0,1)$, it is bounded, and we define the constant $C := \|\eta\|_{L^\infty(\mathbb{R}^n)} < \infty$. Then $|\eta_\varepsilon(x - y)| \leq C \varepsilon^{-n}$ for all $y$, so
\begin{align*}
|f_\varepsilon(x) - f(x)| \leq \frac{C}{\varepsilon^n} \int_{B(x,\varepsilon)} |f(y) - f(x)| \, d\mathcal{L}^n(y).
\end{align*}
Let $\alpha_n := \mathcal{L}^n(B(0,1))$ denote the volume of the unit ball in $\mathbb{R}^n$, so that $\mathcal{L}^n(B(x,\varepsilon)) = \alpha_n \varepsilon^n$. Substituting $\varepsilon^{-n} = \alpha_n / \mathcal{L}^n(B(x,\varepsilon))$:
\begin{align*}
|f_\varepsilon(x) - f(x)| \leq C \alpha_n \cdot \frac{1}{\mathcal{L}^n(B(x,\varepsilon))} \int_{B(x,\varepsilon)} |f(y) - f(x)| \, d\mathcal{L}^n(y).
\end{align*}
[guided]
Why bound $\eta_\varepsilon$ rather than use it exactly? Because the precise shape of $\eta$ is irrelevant for the a.e. convergence argument: what matters is that $\eta_\varepsilon(x - \cdot)$ is supported in $\overline{B}(x,\varepsilon)$ and is bounded there by $C \varepsilon^{-n}$, which is comparable to $1/\mathcal{L}^n(B(x,\varepsilon))$. The upper bound $|\eta_\varepsilon(x - y)| \leq C \varepsilon^{-n}$ replaces the mollifier by a constant multiple of the normalised indicator $\frac{1}{\mathcal{L}^n(B(x,\varepsilon))} \mathbb{1}_{B(x,\varepsilon)}(y)$, which is exactly the averaging kernel that appears in the Lebesgue Differentiation Theorem.
Concretely, from $\eta_\varepsilon(z) = \varepsilon^{-n} \eta(z/\varepsilon)$ and $\|\eta\|_{L^\infty} = C$, we obtain
\begin{align*}
|\eta_\varepsilon(x - y)| \leq C \varepsilon^{-n} \quad \text{for all } y \in \mathbb{R}^n.
\end{align*}
Substituting into the integral bound and writing $\varepsilon^{-n} = \alpha_n / \mathcal{L}^n(B(x,\varepsilon))$ where $\alpha_n = \mathcal{L}^n(B(0,1))$:
\begin{align*}
|f_\varepsilon(x) - f(x)| \leq C \alpha_n \cdot \frac{1}{\mathcal{L}^n(B(x,\varepsilon))} \int_{B(x,\varepsilon)} |f(y) - f(x)| \, d\mathcal{L}^n(y).
\end{align*}
The constant $C\alpha_n$ depends only on $n$ and the choice of standard mollifier $\eta$, not on $f$ or $\varepsilon$.
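A sanity check on the size of this constant, assuming only that $\eta \geq 0$ has unit mass and support in $\overline{B}(0,1)$:
\begin{align*}
1 = \int_{B(0,1)} \eta(z) \, d\mathcal{L}^n(z) \leq \|\eta\|_{L^\infty(\mathbb{R}^n)} \, \mathcal{L}^n(B(0,1)) = C \alpha_n.
\end{align*}
So $C\alpha_n \geq 1$ for any such $\eta$, and the estimate can never be better than a fixed multiple of the ball average. For illustration, if the standard mollifier is taken to be the usual bump $\eta(z) = c_n \exp\bigl(1/(|z|^2 - 1)\bigr)$ for $|z| < 1$ and $\eta(z) = 0$ otherwise (an assumption here, with $c_n$ the normalising constant), the maximum is attained at $z = 0$, which gives $C = c_n e^{-1}$.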
[/guided]
[/step]
[step:Apply the Lebesgue Differentiation Theorem to conclude a.e. convergence]
The [Lebesgue Differentiation Theorem](/theorems/74) states that for any $f \in L^1_{\mathrm{loc}}(U)$,
\begin{align*}
\lim_{\varepsilon \to 0} \frac{1}{\mathcal{L}^n(B(x,\varepsilon))} \int_{B(x,\varepsilon)} |f(y) - f(x)| \, d\mathcal{L}^n(y) = 0
\end{align*}
at every [Lebesgue point](/page/Lebesgue%20Point) $x$ of $f$, and that $\mathcal{L}^n$-almost every point of $U$ is a Lebesgue point of $f$. Since $f \in L^1_{\mathrm{loc}}(U)$ by hypothesis, the theorem applies. At each Lebesgue point $x$, the ball average appearing in the bound from the previous step therefore vanishes as $\varepsilon \to 0$, and that bound gives
\begin{align*}
\limsup_{\varepsilon \to 0} |f_\varepsilon(x) - f(x)| \leq C \alpha_n \cdot \lim_{\varepsilon \to 0} \frac{1}{\mathcal{L}^n(B(x,\varepsilon))} \int_{B(x,\varepsilon)} |f(y) - f(x)| \, d\mathcal{L}^n(y) = 0
\end{align*}
for $\mathcal{L}^n$-a.e. $x \in U$. This proves $f_\varepsilon(x) \to f(x)$ as $\varepsilon \to 0$ for $\mathcal{L}^n$-almost every $x \in U$.
[guided]
The Lebesgue Differentiation Theorem is the engine of the entire proof. It says that the average oscillation of a locally integrable function on small balls centred at $x$ vanishes at $\mathcal{L}^n$-almost every $x$; these are precisely the points called Lebesgue points. The only hypothesis to verify is $f \in L^1_{\mathrm{loc}}(U)$, which ensures that the ball averages $\frac{1}{\mathcal{L}^n(B(x,\varepsilon))} \int_{B(x,\varepsilon)} |f(y) - f(x)| \, d\mathcal{L}^n(y)$ are finite whenever $\overline{B}(x,\varepsilon) \subset U$. This hypothesis is part of the theorem statement.
Applying the [Lebesgue Differentiation Theorem](/theorems/74) and combining with the bound $|f_\varepsilon(x) - f(x)| \leq C\alpha_n \cdot \frac{1}{\mathcal{L}^n(B(x,\varepsilon))} \int_{B(x,\varepsilon)} |f(y) - f(x)| \, d\mathcal{L}^n(y)$:
\begin{align*}
0 \leq \limsup_{\varepsilon \to 0} |f_\varepsilon(x) - f(x)| \leq C\alpha_n \cdot \lim_{\varepsilon \to 0} \frac{1}{\mathcal{L}^n(B(x,\varepsilon))} \int_{B(x,\varepsilon)} |f(y) - f(x)| \, d\mathcal{L}^n(y) = 0
\end{align*}
for $\mathcal{L}^n$-a.e. $x \in U$. By the squeeze theorem, $\lim_{\varepsilon \to 0} |f_\varepsilon(x) - f(x)| = 0$, which is precisely the claimed almost everywhere convergence $f_\varepsilon \to f$.
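The exceptional null set can be non-empty. As an illustrative sketch (not part of the proof), take $n = 1$, $U = \mathbb{R}$, and the step function $f = \mathbb{1}_{[0,\infty)}$, and assume that $\eta$ is even, as it is for a radial mollifier. For $x \neq 0$ and $\varepsilon < |x|$, $f$ is constant on $B(x,\varepsilon)$, so $f_\varepsilon(x) = f(x)$. At $x = 0$, where $f(0) = 1$,
\begin{align*}
f_\varepsilon(0) &= \int_{-\varepsilon}^{\varepsilon} \eta_\varepsilon(-y) f(y) \, dy = \int_{0}^{\varepsilon} \eta_\varepsilon(y) \, dy = \frac{1}{2}, \\
\frac{1}{2\varepsilon} \int_{-\varepsilon}^{\varepsilon} |f(y) - f(0)| \, dy &= \frac{1}{2\varepsilon} \int_{-\varepsilon}^{0} 1 \, dy = \frac{1}{2}.
\end{align*}
So $f_\varepsilon(0) = \tfrac{1}{2}$ for every $\varepsilon > 0$, which does not converge to $f(0) = 1$, and the ball average does not tend to $0$: the origin is not a Lebesgue point of $f$. Convergence still holds on $\mathbb{R} \setminus \{0\}$, consistent with the almost everywhere statement.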
[/guided]
[/step]