BV Smooth Approximation (Theorem # 3131)
Theorem
Let $\Omega \subset \mathbb{R}^n$ be open and $u \in BV(\Omega)$. There exist $u_k \in C^\infty(\Omega) \cap BV(\Omega)$ such that
\begin{align*}
u_k \to u \text{ in } L^1(\Omega) \quad \text{and} \quad |Du_k|(\Omega) \to |Du|(\Omega).
\end{align*}
That is, $u_k \to u$ strictly in $BV(\Omega)$.
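To see why the theorem asserts strict convergence rather than convergence in the $BV$ norm, a one-dimensional computation may help. It is only an illustrative sketch: the domain $\Omega = (-1, 1)$, the function $u = \chi_{(0,1)}$, and the approximants $u_k$ (for $k \ge 2$, any smooth nondecreasing functions with $u_k = 0$ on $(-1, -1/k]$ and $u_k = 1$ on $[1/k, 1)$) are choices made here for illustration and are not part of the theorem.
\begin{align*}
\|u_k - u\|_{L^1(\Omega)} &\le \frac{2}{k} \to 0, \\
|Du_k|(\Omega) &= \int_{-1}^{1} u_k'\, dx = 1 = |Du|(\Omega) \quad (Du = \delta_0), \\
|D(u_k - u)|(\Omega) &= |u_k'\, \mathcal{L}^1 - \delta_0|(\Omega) = 1 + 1 = 2 \not\to 0.
\end{align*}
The last line uses that $u_k'\, \mathcal{L}^1$ and $\delta_0$ are mutually singular, so their total variations add. In this example $u_k \to u$ strictly, but not in the $BV$ norm.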
Proof
[proofplan]
We adapt the partition-of-unity construction from Meyers–Serrin to the BV setting. An exhaustion of $\Omega$ by compactly contained open sets $\Omega_j$ together with a partition of unity $\{\psi_j\}$ subordinate to a slight thickening allows us to mollify each piece $\psi_j u$ at a small enough scale $\varepsilon_j$ so that the support of $\eta_{\varepsilon_j} * (\psi_j u)$ lies in a slightly enlarged annular region. Summing the pieces yields $u_k = \sum_j \eta_{\varepsilon_j} * (\psi_j u)$ — a locally finite sum, hence smooth on $\Omega$. The $L^1$-convergence is standard from continuity of mollification in $L^1$. The total-variation control rests on the contractive nature of mollification of BV functions: $|D(\eta_\varepsilon * v)|(\Omega') \le |Dv|(\Omega'')$ for $\Omega' \subset\subset \Omega''$. Choosing the $\varepsilon_j$ tail-summable, the total variation of $u_k$ exceeds $|Du|(\Omega)$ by at most $1/k$, while [Lower-semicontinuity of Total Variation](/theorems/3104) provides the matching lower bound, giving strict convergence.
[/proofplan]
[step:Exhaust $\Omega$ by compactly contained open sets and build a partition of unity]
Define an exhaustion of $\Omega$:
\begin{align*}
\Omega_j := \big\{x \in \Omega : \operatorname{dist}(x, \partial\Omega) > 1/j \text{ and } |x| < j\big\}, \quad j = 1, 2, \dots
\end{align*}
with the convention $\Omega_0 = \Omega_{-1} = \varnothing$. Each $\Omega_j$ is open and $\overline{\Omega_j} \subset\subset \Omega_{j+1}$, with $\Omega = \bigcup_j \Omega_j$.
Set $V_j := \Omega_{j+1} \setminus \overline{\Omega_{j-1}}$ for $j \ge 1$. The collection $\{V_j\}_{j \ge 1}$ is an open cover of $\Omega$ with the locally-finite property: any compact $K \subset \Omega$ meets at most finitely many $V_j$ (since $K \subset \Omega_J$ for some $J$, and $V_j \cap \Omega_J = \varnothing$ for $j \ge J + 2$). Each point of $\Omega$ lies in at most three consecutive $V_j$'s.
Let $\{\psi_j\}_{j \ge 1}$ be a smooth partition of unity subordinate to $\{V_j\}$:
\begin{align*}
\psi_j &\in C_c^\infty(V_j), & 0 \le \psi_j &\le 1, & \sum_{j \ge 1} \psi_j(x) &= 1 \text{ for all } x \in \Omega.
\end{align*}
Such $\{\psi_j\}$ exists by the standard partition-of-unity construction on second-countable manifolds (or here, on the open subset $\Omega \subset \mathbb{R}^n$).
[guided]
The exhaustion $\{\Omega_j\}$ is the standard interior-distance-based exhaustion of an open set by compactly contained open subsets. Each $\Omega_j$ is open: it is the intersection of the open ball $B(0, j)$ with the open set $\{x : \operatorname{dist}(x, \partial\Omega) > 1/j\}$. The compact containment $\overline{\Omega_j} \subset\subset \Omega_{j+1}$ holds because the closure $\overline{\Omega_j}$ has $\operatorname{dist}(\cdot, \partial\Omega) \ge 1/j > 1/(j+1)$ and $|\cdot| \le j < j + 1$, so it is contained in $\Omega_{j+1}$ — and is compact since it is closed and bounded.
The thickened annular regions $V_j = \Omega_{j+1} \setminus \overline{\Omega_{j-1}}$ form an open cover of $\Omega$. Local finiteness: a point $x \in \Omega$ lies in $\Omega_J$ for some smallest $J$ (by exhaustion), and $x \in V_j$ requires $x \notin \overline{\Omega_{j-1}}$, hence $j - 1 < J$, i.e., $j \le J$. Also $x \in \Omega_{j+1}$ requires $j + 1 \ge J$, i.e., $j \ge J - 1$. So $x \in V_j$ only for $j \in \{J - 1, J\}$: at most two indices, and in particular at most three. The local finiteness allows summing: for each $x$, $\sum_j \psi_j(x)$ is a finite sum.
A partition of unity subordinate to $\{V_j\}$ exists by the standard construction on an open subset of $\mathbb{R}^n$ (paracompactness + bump functions). Each $\psi_j \in C_c^\infty(V_j)$ has $0 \le \psi_j \le 1$, and $\sum_j \psi_j \equiv 1$ on $\Omega$.
The geometry: near $\partial\Omega$ the layer $V_j$ sits at distance $\sim 1/j$ from the boundary and has thickness $\sim 1/j^2$, so the layers thin out as $j \to \infty$. Since $u \in L^1(\Omega)$, the pieces $\psi_j u$ have summable $L^1$-norms, but the gradients $\nabla\psi_j$ may become large on the thin layers, which is why the total variation requires care. The partition-of-unity decomposition $u = \sum_j \psi_j u$ splits $u$ into pieces with controlled support that we can mollify independently, with a mollifier scale $\varepsilon_j$ chosen separately for each piece.
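As a concrete instance of the construction, the sets can be written out explicitly in a model case. This is a sketch for the particular choice $\Omega = (0, 1) \subset \mathbb{R}$, an assumption made here for illustration. Since $\operatorname{dist}(x, \partial\Omega) = \min(x, 1 - x) \le 1/2$, the sets $\Omega_1$, $\Omega_2$ and $V_1$ are empty, and
\begin{align*}
\Omega_j &= \big(\tfrac{1}{j},\, 1 - \tfrac{1}{j}\big) \quad (j \ge 3), \\
V_2 &= \Omega_3 = \big(\tfrac{1}{3},\, \tfrac{2}{3}\big), \qquad V_3 = \Omega_4 = \big(\tfrac{1}{4},\, \tfrac{3}{4}\big), \\
V_j &= \big(\tfrac{1}{j+1},\, \tfrac{1}{j-1}\big) \cup \big(1 - \tfrac{1}{j-1},\, 1 - \tfrac{1}{j+1}\big) \quad (j \ge 4).
\end{align*}
For example, $x = 0.3$ first enters the exhaustion at $J = 4$ and lies in $V_3$ and $V_4$ only. For $j \ge 4$ each component of $V_j$ has length $\tfrac{1}{j-1} - \tfrac{1}{j+1} = \tfrac{2}{j^2 - 1}$.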
[/guided]
[/step]
[step:Choose mollification scales $\varepsilon_j$ to control support, $L^1$-error, and total-variation error]
Fix $k \in \mathbb{N}$. We choose mollification scales $\varepsilon_j > 0$ for each $j \ge 1$ to satisfy three conditions:
1. **Support condition.** The mollified piece $\eta_{\varepsilon_j} * (\psi_j u)$ has support contained in $V_j' := \Omega_{j+2} \setminus \overline{\Omega_{j-2}}$ (with $\Omega_0 = \Omega_{-1} = \varnothing$, as in Step 1). This requires $\varepsilon_j < \min\big(\operatorname{dist}(\overline{V_j}, \partial\Omega_{j+2}),\, \operatorname{dist}(\overline{V_j}, \overline{\Omega_{j-2}})\big)$, a positive number depending on $j$.
2. **$L^1$-error condition.** For the mollified piece,
\begin{align*}
\|\eta_{\varepsilon_j} * (\psi_j u) - \psi_j u\|_{L^1(\Omega)} < \frac{1}{k \cdot 2^j}.
\end{align*}
3. **Total-variation control condition.** The radius $\varepsilon_j$ is small enough that
\begin{align*}
|D(\eta_{\varepsilon_j} * (\psi_j u))|(\Omega) \le |D(\psi_j u)|(V_j') + \frac{1}{k \cdot 2^j}.
\end{align*}
The mollification is contractive on total variation (Step 3 below), so the bound holds for sufficiently small $\varepsilon_j$.
All three conditions can be satisfied simultaneously by taking $\varepsilon_j$ smaller than a finite number of positive thresholds.
[guided]
For each piece $\psi_j u$, we mollify at a scale $\varepsilon_j$ chosen to make three quantitative requirements hold.
*Support condition.* The convolution $\eta_{\varepsilon_j} * (\psi_j u)$ has support contained in the $\varepsilon_j$-neighbourhood of $\operatorname{supp}(\psi_j u) \subseteq V_j$. We want this support to lie in the slightly thickened set $V_j' = \Omega_{j+2} \setminus \overline{\Omega_{j-2}}$. The geometric constraint: the $\varepsilon_j$-neighbourhood of $V_j$ should not extend past the boundary of $V_j'$. Since $V_j \subset\subset V_j'$, the distance from $\overline{V_j}$ to $\partial V_j'$ is positive, and we choose $\varepsilon_j$ less than this distance.
The thickening from $V_j$ to $V_j'$ by one layer gives us the room for the support to spread under mollification. Crucially, the locally finite covering property persists for $\{V_j'\}$: each point of $\Omega$ lies in at most a bounded number ($\le 5$) of the $V_j'$'s.
*$L^1$-error condition.* By the standard fact that the mollifications of an $L^1$ function converge to it in $L^1$-norm as the scale tends to zero,
\begin{align*}
\|\eta_\varepsilon * v - v\|_{L^1(\mathbb{R}^n)} \to 0 \quad \text{as } \varepsilon \to 0
\end{align*}
for $v \in L^1(\mathbb{R}^n)$. Applied to $v = \psi_j u \in L^1(\Omega)$ (extended by zero to $\mathbb{R}^n$), we obtain $\|\eta_{\varepsilon_j} * (\psi_j u) - \psi_j u\|_{L^1} < (k 2^j)^{-1}$ for $\varepsilon_j$ sufficiently small.
*Total-variation control condition.* This is the heart of the construction. We need
\begin{align*}
|D(\eta_{\varepsilon_j} * (\psi_j u))|(\Omega) \le |D(\psi_j u)|(V_j') + (k 2^j)^{-1}.
\end{align*}
The contractivity of mollification on total variation (Step 3) gives, in fact, the stronger bound without the slack term. The function $v = \psi_j u$, extended by zero, lies in $BV(\mathbb{R}^n)$ with compact support in $V_j$, so $|Dv|$ is concentrated on $V_j \subseteq V_j'$, and Step 3 yields $|D(\eta_{\varepsilon_j} * v)|(\Omega) \le |D(\eta_{\varepsilon_j} * v)|(\mathbb{R}^n) \le |Dv|(\mathbb{R}^n) = |D(\psi_j u)|(V_j')$ for every $\varepsilon_j$ satisfying the support condition. The slack term $(k 2^j)^{-1}$ only makes the condition easier to satisfy.
The three conditions are compatible: each places a positive upper bound on $\varepsilon_j$, and we choose $\varepsilon_j$ less than the minimum.
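For a feel for the size of the admissible scales, the support condition can be computed explicitly in a model case. The following is a sketch under the assumptions $\Omega = (0, 1) \subset \mathbb{R}$ and $j \ge 5$, both chosen here for illustration. In that case $\overline{V_j} = [\tfrac{1}{j+1}, \tfrac{1}{j-1}] \cup [1 - \tfrac{1}{j-1}, 1 - \tfrac{1}{j+1}]$ and $V_j' = (\tfrac{1}{j+2}, \tfrac{1}{j-2}) \cup (1 - \tfrac{1}{j-2}, 1 - \tfrac{1}{j+2})$, so
\begin{align*}
\operatorname{dist}(\overline{V_j}, \partial V_j') &= \min\Big(\frac{1}{j+1} - \frac{1}{j+2},\ \frac{1}{j-2} - \frac{1}{j-1}\Big) \\
&= \min\Big(\frac{1}{(j+1)(j+2)},\ \frac{1}{(j-2)(j-1)}\Big) = \frac{1}{(j+1)(j+2)}.
\end{align*}
In this model case the support condition is guaranteed once $\varepsilon_j < \tfrac{1}{(j+1)(j+2)}$, so the scales shrink at least quadratically in $j$.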
[/guided]
[/step]
[step:Verify the contractivity of mollification on total variation]
[claim:For $v \in BV(\mathbb{R}^n)$ with compact support and $\varepsilon > 0$, $\eta_\varepsilon * v \in C^\infty(\mathbb{R}^n) \cap BV(\mathbb{R}^n)$ with $|D(\eta_\varepsilon * v)|(\mathbb{R}^n) \le |Dv|(\mathbb{R}^n)$]
[proof]
The smoothness of $\eta_\varepsilon * v$ is standard: convolution with a smooth kernel produces a smooth function. We establish the total-variation contraction.
For any test field $\varphi \in C_c^1(\mathbb{R}^n; \mathbb{R}^n)$ with $\|\varphi\|_\infty \le 1$,
\begin{align*}
\langle D(\eta_\varepsilon * v), \varphi\rangle = -\int_{\mathbb{R}^n} (\eta_\varepsilon * v)(x)\, \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x).
\end{align*}
By Fubini (justified since $v \in L^1$ has compact support, $\eta_\varepsilon \in L^\infty$, and $\operatorname{div}\varphi \in C_c$):
\begin{align*}
\langle D(\eta_\varepsilon * v), \varphi\rangle &= -\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, v(y)\, \operatorname{div}\varphi(x)\, d\mathcal{L}^n(y)\, d\mathcal{L}^n(x) \\
&= -\int_{\mathbb{R}^n} v(y) \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x)\, d\mathcal{L}^n(y) \\
&= -\int_{\mathbb{R}^n} v(y)\, \operatorname{div}(\eta_\varepsilon * \varphi)(y)\, d\mathcal{L}^n(y) \\
&= \langle Dv, \eta_\varepsilon * \varphi\rangle.
\end{align*}
The third equality uses [Differentiation Passes Through Convolution](/theorems/3096): $\partial_{x_i}(\eta_\varepsilon * \varphi_i)(y) = (\eta_\varepsilon * \partial_{x_i}\varphi_i)(y)$ for $\varphi \in C_c^1$, hence $\operatorname{div}(\eta_\varepsilon * \varphi) = \eta_\varepsilon * \operatorname{div}\varphi$. Moreover $\int \eta_\varepsilon(x - y) \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x) = (\eta_\varepsilon * \operatorname{div}\varphi)(y)$ by the symmetry $\eta_\varepsilon(z) = \eta_\varepsilon(-z)$, so the inner integral equals $\operatorname{div}(\eta_\varepsilon * \varphi)(y)$.
Now $\eta_\varepsilon * \varphi$ is a smooth vector field with $\|\eta_\varepsilon * \varphi\|_\infty \le \|\eta_\varepsilon\|_{L^1} \|\varphi\|_\infty \le 1 \cdot 1 = 1$ (since $\int \eta_\varepsilon\, d\mathcal{L}^n = 1$). Hence $\eta_\varepsilon * \varphi$ is admissible in the definition of the total variation $|Dv|$:
\begin{align*}
\langle D(\eta_\varepsilon * v), \varphi\rangle = \langle Dv, \eta_\varepsilon * \varphi\rangle \le |Dv|(\mathbb{R}^n) \cdot \|\eta_\varepsilon * \varphi\|_\infty \le |Dv|(\mathbb{R}^n).
\end{align*}
Taking the supremum over $\varphi$ with $\|\varphi\|_\infty \le 1$:
\begin{align*}
|D(\eta_\varepsilon * v)|(\mathbb{R}^n) \le |Dv|(\mathbb{R}^n).
\end{align*}
[/proof]
[/claim]
For the local version on subsets, the same argument applied to test fields supported in an open set $\Omega'$ gives:
\begin{align*}
|D(\eta_\varepsilon * v)|(\Omega') \le |Dv|(\Omega' + B(0, \varepsilon)),
\end{align*}
because $\eta_\varepsilon * \varphi$ is then supported in $\Omega' + B(0, \varepsilon)$. Consequently $|D(\eta_\varepsilon * v)|(\Omega') \le |Dv|(\Omega'')$ whenever $\Omega' + B(0, \varepsilon) \subseteq \Omega''$. In particular, for $v$ with $\operatorname{supp} v \subseteq V$ and $\varepsilon$ small enough that $V + B(0, \varepsilon) \subseteq V'$, we have $|D(\eta_\varepsilon * v)|(\Omega) \le |Dv|(V')$.
[guided]
The contractivity of mollification on total variation is the analytic core of the construction. The proof uses the duality definition of total variation, Fubini's theorem, and the symmetry of the mollifier.
*Set-up.* We work with $v \in BV(\mathbb{R}^n)$ of compact support. The mollified function $\eta_\varepsilon * v$ is in $C^\infty(\mathbb{R}^n)$: convolution with the smooth kernel $\eta_\varepsilon \in C_c^\infty(B(0, \varepsilon))$ produces a smooth function (every derivative of $\eta_\varepsilon * v$ exists pointwise as $(\partial^\alpha \eta_\varepsilon) * v$, and is continuous since $v \in L^1_{\mathrm{loc}}$). Moreover $\eta_\varepsilon * v$ has compact support contained in $\operatorname{supp}(v) + B(0, \varepsilon)$, hence is in $L^1(\mathbb{R}^n)$ as well.
*Duality definition of total variation.* For $w \in L^1(\mathbb{R}^n)$,
\begin{align*}
|Dw|(\mathbb{R}^n) = \sup\Big\{-\int_{\mathbb{R}^n} w\, \operatorname{div}\varphi\, d\mathcal{L}^n : \varphi \in C_c^1(\mathbb{R}^n; \mathbb{R}^n),\, \|\varphi\|_\infty \le 1\Big\},
\end{align*}
where $\|\varphi\|_\infty := \sup_{x \in \mathbb{R}^n} |\varphi(x)|$ is the supremum of the Euclidean norms of the values. By definition, $w \in BV(\mathbb{R}^n)$ precisely when this supremum is finite. We will compute $|D(\eta_\varepsilon * v)|$ using this definition.
*Computing the bound via convolution transfer.* Fix $\varphi \in C_c^1(\mathbb{R}^n; \mathbb{R}^n)$ with $\|\varphi\|_\infty \le 1$. We compute the integral pairing:
\begin{align*}
-\int_{\mathbb{R}^n} (\eta_\varepsilon * v)(x)\, \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x).
\end{align*}
Expanding the convolution:
\begin{align*}
(\eta_\varepsilon * v)(x) = \int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, v(y)\, d\mathcal{L}^n(y).
\end{align*}
Substituting:
\begin{align*}
-\int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, v(y)\, \operatorname{div}\varphi(x)\, d\mathcal{L}^n(y)\, d\mathcal{L}^n(x).
\end{align*}
*Fubini's hypotheses.* We swap the order of integration. The hypotheses: the integrand $\eta_\varepsilon(x - y)\, v(y)\, \operatorname{div}\varphi(x)$ is jointly measurable, and one of the iterated integrals of its absolute value is finite. We verify the latter: with $|v(y)| \in L^1(\mathbb{R}^n)$ (since $v \in BV$ has compact support), $\eta_\varepsilon \in L^\infty$, and $\operatorname{div}\varphi \in C_c$, the iterated integral
\begin{align*}
\int_{\mathbb{R}^n}|v(y)| \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) |\operatorname{div}\varphi(x)|\, d\mathcal{L}^n(x)\, d\mathcal{L}^n(y) \le \|\operatorname{div}\varphi\|_\infty \|\eta_\varepsilon\|_{L^1} \|v\|_{L^1} < \infty.
\end{align*}
Fubini applies. Swapping:
\begin{align*}
-\int_{\mathbb{R}^n} v(y) \int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x)\, d\mathcal{L}^n(y).
\end{align*}
*Recognising the inner integral.* The inner integral is
\begin{align*}
\int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x) = \int_{\mathbb{R}^n} \eta_\varepsilon(z)\, \operatorname{div}\varphi(z + y)\, d\mathcal{L}^n(z) = (\eta_\varepsilon * \operatorname{div}\varphi)(y),
\end{align*}
using the change of variables $z = x - y$ and the fact that the standard mollifier $\eta_\varepsilon$ is even ($\eta_\varepsilon(-z) = \eta_\varepsilon(z)$), which makes the convolution symmetric.
*Differentiation passes through convolution.* By [Differentiation Passes Through Convolution](/theorems/3096), $\eta_\varepsilon * \operatorname{div}\varphi = \operatorname{div}(\eta_\varepsilon * \varphi)$, where $\eta_\varepsilon * \varphi$ is the componentwise convolution applied to a vector-valued function. Both sides of this identity are continuous functions, and the identity holds pointwise. The hypotheses of theorem 3096: the kernel $\eta_\varepsilon$ is $C^\infty_c$, and $\varphi$ is $C^1$ — both are met. The conclusion gives the displayed identity.
So
\begin{align*}
\int_{\mathbb{R}^n}\eta_\varepsilon(x - y) \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x) = \operatorname{div}(\eta_\varepsilon * \varphi)(y).
\end{align*}
*Pairing with $Dv$.* Substituting back:
\begin{align*}
-\int_{\mathbb{R}^n} (\eta_\varepsilon * v)\, \operatorname{div}\varphi\, d\mathcal{L}^n = -\int_{\mathbb{R}^n} v(y)\, \operatorname{div}(\eta_\varepsilon * \varphi)(y)\, d\mathcal{L}^n(y).
\end{align*}
The right-hand side is the action of $Dv$ on the smooth test field $\eta_\varepsilon * \varphi$:
\begin{align*}
-\int v\, \operatorname{div}(\eta_\varepsilon * \varphi)\, d\mathcal{L}^n = \langle Dv, \eta_\varepsilon * \varphi\rangle.
\end{align*}
*Bounding the transformed test field.* The transformed field $\eta_\varepsilon * \varphi$ is in $C^\infty(\mathbb{R}^n; \mathbb{R}^n)$ with compact support in $\operatorname{supp}\varphi + B(0, \varepsilon)$. Its sup norm:
\begin{align*}
\|\eta_\varepsilon * \varphi\|_\infty \le \|\eta_\varepsilon\|_{L^1}\, \|\varphi\|_\infty \le 1 \cdot 1 = 1
\end{align*}
by Young's convolution inequality (with $\|\eta_\varepsilon\|_{L^1} = 1$ by normalisation of the mollifier). Hence $\eta_\varepsilon * \varphi$ is admissible in the duality definition of $|Dv|$, giving
\begin{align*}
\langle Dv, \eta_\varepsilon * \varphi\rangle \le |Dv|(\mathbb{R}^n).
\end{align*}
*Conclusion.* Taking the supremum over all admissible $\varphi$:
\begin{align*}
|D(\eta_\varepsilon * v)|(\mathbb{R}^n) = \sup_{\|\varphi\|_\infty \le 1}\Big[-\int (\eta_\varepsilon * v)\, \operatorname{div}\varphi\, d\mathcal{L}^n\Big] \le |Dv|(\mathbb{R}^n).
\end{align*}
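The total variation contracts under mollification. In the limit $\varepsilon \to 0$ the total variations $|D(\eta_\varepsilon * v)|(\mathbb{R}^n)$ converge to $|Dv|(\mathbb{R}^n)$ (combine the contraction with lower semicontinuity, since $\eta_\varepsilon * v \to v$ in $L^1$), but for fixed $\varepsilon > 0$ the inequality may be strict: mollification can smooth away part of the variation.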
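The possibility of strict inequality can be seen in one dimension. The following is a sketch under assumptions introduced here for illustration: $n = 1$, $v = \chi_{[a, b]}$ with $a < b$, and $\eta_\varepsilon$ the standard mollifier, taken to be even, strictly positive on $(-\varepsilon, \varepsilon)$, and nonincreasing in $|z|$. Then $|Dv|(\mathbb{R}) = 2$, and
\begin{align*}
(\eta_\varepsilon * v)(x) &= \int_{x - b}^{x - a} \eta_\varepsilon(z)\, dz, \\
(\eta_\varepsilon * v)'(x) &= \eta_\varepsilon(x - a) - \eta_\varepsilon(x - b)
\begin{cases} \ge 0, & x \le \tfrac{a + b}{2}, \\ \le 0, & x \ge \tfrac{a + b}{2}, \end{cases} \\
|D(\eta_\varepsilon * v)|(\mathbb{R}) &= 2\, (\eta_\varepsilon * v)\big(\tfrac{a + b}{2}\big) = 2 \int_{-(b - a)/2}^{(b - a)/2} \eta_\varepsilon(z)\, dz.
\end{align*}
If $2\varepsilon \le b - a$, the last integral equals $1$ and the total variation is preserved. If $2\varepsilon > b - a$, it is strictly less than $1$, so $|D(\eta_\varepsilon * v)|(\mathbb{R}) < 2 = |Dv|(\mathbb{R})$.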
*Localisation.* For a test field $\varphi \in C_c^1$ supported in $\Omega' \subset\subset \Omega''$ with $\operatorname{dist}(\Omega', \partial\Omega'') > \varepsilon$, the transformed field $\eta_\varepsilon * \varphi$ is supported in $\Omega' + B(0, \varepsilon) \subseteq \Omega''$. The same argument applied with the test field restricted to $\Omega'$ and the dual restricted to $\Omega''$ yields
\begin{align*}
|D(\eta_\varepsilon * v)|(\Omega') \le |Dv|(\Omega'').
\end{align*}
This is the localised contractivity. The bound says: the mollified function's gradient measure on $\Omega'$ is dominated by the original function's gradient measure on the slightly thicker $\Omega''$. The thickening accounts for the support spread under convolution by $\varepsilon$.
This local contraction is what we use in the partition-of-unity construction: each piece $\eta_{\varepsilon_j} * (\psi_j u)$ has gradient measure on $\Omega$ dominated by $|D(\psi_j u)|$ on a slightly thickened set $V_j'$, and the thickening fits within the locally finite cover so the dominations sum.
[/guided]
[/step]
[step:Define $u_k$ and verify smoothness, $L^1$-convergence, and total-variation upper bound]
For $k \in \mathbb{N}$, define
\begin{align*}
u_k: \Omega &\to \mathbb{R} \\
x &\mapsto \sum_{j \ge 1} (\eta_{\varepsilon_j^{(k)}} * (\psi_j u))(x),
\end{align*}
where $\varepsilon_j^{(k)} > 0$ is chosen as in Step 2 (with the parameter $k$ controlling the error). The sum is locally finite: each $x \in \Omega$ lies in at most finitely many $V_j'$, hence at most finitely many summands have non-zero contribution at $x$. Therefore $u_k \in C^\infty(\Omega)$.
*$L^1$-convergence.* Since $\sum_j \psi_j = 1$ on $\Omega$,
\begin{align*}
\|u_k - u\|_{L^1(\Omega)} = \Big\|\sum_{j \ge 1} \big[\eta_{\varepsilon_j^{(k)}} * (\psi_j u) - \psi_j u\big]\Big\|_{L^1(\Omega)} \le \sum_{j \ge 1} \|\eta_{\varepsilon_j^{(k)}} * (\psi_j u) - \psi_j u\|_{L^1(\Omega)} < \sum_{j \ge 1} \frac{1}{k \cdot 2^j} = \frac{1}{k}.
\end{align*}
Hence $u_k \to u$ in $L^1(\Omega)$ as $k \to \infty$.
*Total-variation upper bound.* For any test field $\varphi \in C_c^1(\Omega; \mathbb{R}^n)$ with $\|\varphi\|_\infty \le 1$,
\begin{align*}
\int_\Omega u_k\, \operatorname{div}\varphi\, d\mathcal{L}^n = \sum_{j \ge 1} \int_\Omega (\eta_{\varepsilon_j^{(k)}} * (\psi_j u))\, \operatorname{div}\varphi\, d\mathcal{L}^n.
\end{align*}
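Write $\varphi_j := \eta_{\varepsilon_j^{(k)}} * \varphi$, so that $\|\varphi_j\|_\infty \le 1$. The rewriting from Step 3 (move the mollifier from the integrand to the test field) gives, for each $j$,
\begin{align*}
\int_\Omega (\eta_{\varepsilon_j^{(k)}} * (\psi_j u))\, \operatorname{div}\varphi\, d\mathcal{L}^n = \int_\Omega \psi_j u\, \operatorname{div}\varphi_j\, d\mathcal{L}^n.
\end{align*}
Estimating each summand separately by $|D(\psi_j u)|(V_j')$ is too crude: by the BV product-rule identity $D(\psi_j u) = \psi_j\, Du + u\, \nabla\psi_j\, d\mathcal{L}^n$ (a vector measure decomposition that holds for $\psi_j \in C_c^\infty(\Omega)$ and $u \in BV(\Omega)$), it leaves the terms $\int_\Omega |u|\, |\nabla\psi_j|\, d\mathcal{L}^n$, whose sum over $j$ need not be small. The cancellation $\sum_j \nabla\psi_j = \nabla\sum_j \psi_j = \nabla 1 = 0$ is available only before absolute values are taken, so we use it first. Since $\psi_j \operatorname{div}\varphi_j = \operatorname{div}(\psi_j \varphi_j) - \nabla\psi_j \cdot \varphi_j$, moving the mollifier back onto $u \nabla\psi_j$ and using $\sum_j \int_\Omega \varphi \cdot u \nabla\psi_j\, d\mathcal{L}^n = 0$ gives
\begin{align*}
\int_\Omega u_k\, \operatorname{div}\varphi\, d\mathcal{L}^n = \sum_{j \ge 1} \int_\Omega u\, \operatorname{div}(\psi_j \varphi_j)\, d\mathcal{L}^n - \sum_{j \ge 1} \int_\Omega \varphi \cdot \big[\eta_{\varepsilon_j^{(k)}} * (u \nabla\psi_j) - u \nabla\psi_j\big]\, d\mathcal{L}^n.
\end{align*}
Two further requirements, both compatible with Steps 1 and 2, control the two sums. First, shrinking $\varepsilon_j^{(k)}$ if necessary, we may assume $\|\eta_{\varepsilon_j^{(k)}} * (u \nabla\psi_j) - u \nabla\psi_j\|_{L^1(\Omega)} < \frac{1}{2k \cdot 2^j}$; this is possible because $u \nabla\psi_j \in L^1$ has compact support in $V_j$. Second, since $|Du|$ is a finite measure and $\Omega_j \uparrow \Omega$, there is $m = m(k)$ with $|Du|(\Omega \setminus \Omega_m) < \frac{1}{6k}$; replacing the exhaustion $(\Omega_j)_{j \ge 1}$ by $(\Omega_{j + m - 1})_{j \ge 1}$ from the outset (so that the cover and the partition of unity depend on $k$, which is harmless), we may assume $|Du|(\Omega \setminus \Omega_1) < \frac{1}{6k}$.
With these choices, the second sum is bounded in absolute value by $\sum_{j \ge 1} \frac{1}{2k \cdot 2^j} = \frac{1}{2k}$, since $\|\varphi\|_\infty \le 1$. For the first sum, $\psi_j \varphi_j \in C_c^1(V_j; \mathbb{R}^n)$ with $\|\psi_j \varphi_j\|_\infty \le 1$, so the term $j = 1$ is at most $|Du|(\Omega)$ and the term with index $j \ge 2$ is at most $|Du|(V_j)$. Because $V_j \subseteq \Omega \setminus \Omega_1$ for $j \ge 2$ and each point lies in at most three of the $V_j$,
\begin{align*}
\sum_{j \ge 2} |Du|(V_j) \le 3\, |Du|(\Omega \setminus \Omega_1) < \frac{1}{2k}.
\end{align*}
Hence
\begin{align*}
\Big|\int_\Omega u_k\, \operatorname{div}\varphi\, d\mathcal{L}^n\Big| \le |Du|(\Omega) + \frac{1}{2k} + \frac{1}{2k} = |Du|(\Omega) + \frac{1}{k}.
\end{align*}
Taking the supremum over $\varphi$: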
\begin{align*}
|Du_k|(\Omega) \le |Du|(\Omega) + \frac{1}{k}.
\end{align*}
[guided]
The construction $u_k = \sum_j \eta_{\varepsilon_j^{(k)}} * (\psi_j u)$ is locally finite: at any point $x$, only the summands with $j$ such that $x \in V_j'$ are non-zero, and there are at most a fixed finite number of such $j$'s. Hence $u_k$ is smooth (a finite sum of smooth functions in any neighbourhood) and globally well-defined on $\Omega$.
*$L^1$-convergence.* Using $\sum_j \psi_j = 1$, we write $u = \sum_j \psi_j u$, so
\begin{align*}
u_k - u = \sum_{j \ge 1}\big[\eta_{\varepsilon_j^{(k)}} * (\psi_j u) - \psi_j u\big].
\end{align*}
Triangle inequality and the $L^1$-error condition from Step 2:
\begin{align*}
\|u_k - u\|_{L^1(\Omega)} \le \sum_{j \ge 1} \|\eta_{\varepsilon_j^{(k)}} * (\psi_j u) - \psi_j u\|_{L^1(\Omega)} < \sum_{j \ge 1}\frac{1}{k 2^j} = \frac{1}{k}.
\end{align*}
The geometric series $\sum 2^{-j} = 1$ ensures the tail-summable errors give a finite total error of $1/k$. So $u_k \to u$ in $L^1(\Omega)$.
*Total-variation upper bound.* Pair the smooth function $u_k$ with a test field $\varphi \in C_c^1(\Omega; \mathbb{R}^n)$, $\|\varphi\|_\infty \le 1$. Each summand of $u_k$ is a mollified piece, and the convolution-transfer trick from Step 3 lets us move the mollifier off the function and onto the test field:
\begin{align*}
\int_\Omega (\eta_{\varepsilon_j^{(k)}} * (\psi_j u)) \operatorname{div}\varphi\, d\mathcal{L}^n = \int_\Omega (\psi_j u)\operatorname{div}(\eta_{\varepsilon_j^{(k)}} * \varphi)\, d\mathcal{L}^n = -\langle D(\psi_j u), \eta_{\varepsilon_j^{(k)}} * \varphi\rangle.
\end{align*}
Each term is bounded by $|D(\psi_j u)|(\Omega) \cdot \|\eta_{\varepsilon_j^{(k)}} * \varphi\|_\infty \le |D(\psi_j u)|(\Omega)$.
Summing:
\begin{align*}
\Big|\int_\Omega u_k\, \operatorname{div}\varphi\, d\mathcal{L}^n\Big| \le \sum_{j \ge 1} |D(\psi_j u)|(\Omega).
\end{align*}
*Bounding $\sum_j |D(\psi_j u)|$.* The BV product rule (an extension of [Product Rule for Weak Derivatives](/theorems/3098) to BV functions, valid because $\psi_j$ is smooth with compact support) gives
\begin{align*}
D(\psi_j u) = \psi_j\, Du + u\, \nabla\psi_j\, d\mathcal{L}^n,
\end{align*}
as vector measures, where $\psi_j Du$ is the multiplication of a measure by a continuous function and $u \nabla\psi_j d\mathcal{L}^n$ is the absolutely continuous measure with density $u \nabla\psi_j$ (with $\nabla\psi_j$ the smooth gradient of $\psi_j$).
Total variation:
\begin{align*}
|D(\psi_j u)|(\Omega) \le \int_\Omega \psi_j\, d|Du| + \int_\Omega |u|\, |\nabla\psi_j|\, d\mathcal{L}^n.
\end{align*}
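A one-dimensional instance may make the two terms of the product rule concrete. This is a sketch with choices made here for illustration: $\Omega = (-1, 1)$, $u = \chi_{(0, 1)}$ (so $Du = \delta_0$), and any $\psi \in C_c^\infty(\Omega)$ with $0 \le \psi \le 1$ in place of $\psi_j$.
\begin{align*}
D(\psi u) &= \psi(0)\, \delta_0 + \chi_{(0, 1)}\, \psi'\, \mathcal{L}^1, \\
|D(\psi u)|(\Omega) &= \psi(0) + \int_0^1 |\psi'|\, dx = \int_\Omega \psi\, d|Du| + \int_\Omega |u|\, |\psi'|\, dx.
\end{align*}
Here the jump part and the absolutely continuous part are mutually singular, so the total-variation bound holds with equality in this example.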
Sum over $j$:
\begin{align*}
\sum_{j \ge 1}|D(\psi_j u)|(\Omega) \le \int_\Omega \Big(\sum_j \psi_j\Big)\, d|Du| + \int_\Omega |u|\Big(\sum_j |\nabla\psi_j|\Big)\, d\mathcal{L}^n = |Du|(\Omega) + \int_\Omega |u|\, \Sigma\, d\mathcal{L}^n,
\end{align*}
where $\Sigma(x) = \sum_j |\nabla\psi_j(x)|$.
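That the integral $\int_\Omega |u|\, \Sigma\, d\mathcal{L}^n$ can fail to be small, or even finite, is visible in the simplest model case. The following sketch assumes $\Omega = (0, 1) \subset \mathbb{R}$ and $u \equiv 1$ (so $|Du|(\Omega) = 0$), choices made here for illustration. For $j \ge 4$ the point $x = 1/j$ lies in $V_j$ but in no other $V_i$, so $\psi_j(1/j) = 1$, while $\psi_j$ vanishes near the endpoints of the component $(\tfrac{1}{j+1}, \tfrac{1}{j-1})$ of $V_j$. Hence
\begin{align*}
\int_\Omega |u|\, |\psi_j'|\, dx &\ge \int_{1/(j+1)}^{1/(j-1)} |\psi_j'|\, dx \ge 2 \quad (j \ge 4), \\
\int_\Omega |u|\, \Sigma\, dx &= \sum_{j \ge 1} \int_\Omega |\psi_j'|\, dx = \infty.
\end{align*}
For this $u$ the right-hand side of the summed bound is infinite although $|Du|(\Omega) = 0$, so any useful estimate has to exploit the cancellation $\sum_j \psi_j' = 0$ before absolute values are taken.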
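The function $\Sigma$ is locally bounded (since only finitely many $\psi_j$ have non-vanishing gradient near any point, by the local finiteness), but it can grow near $\partial\Omega$, so the second term need not be small. The cancellation $\sum_j \nabla\psi_j = \nabla(\sum_j \psi_j) = \nabla 1 = 0$ holds for the sum of the gradients, not for the sum $\Sigma$ of their norms, so it is lost once each summand has been estimated in absolute value. We must therefore work harder and use the cancellation before taking absolute values.
The standard trick: return to the pairing, write $\varphi_j := \eta_{\varepsilon_j^{(k)}} * \varphi$, and expand $\psi_j \operatorname{div}\varphi_j = \operatorname{div}(\psi_j \varphi_j) - \nabla\psi_j \cdot \varphi_j$. Moving the mollifier back onto $u \nabla\psi_j$ and using $\sum_j \int_\Omega \varphi \cdot u \nabla\psi_j\, d\mathcal{L}^n = 0$,
\begin{align*}
\int_\Omega u_k\, \operatorname{div}\varphi\, d\mathcal{L}^n = \sum_{j \ge 1} \int_\Omega u\, \operatorname{div}(\psi_j \varphi_j)\, d\mathcal{L}^n - \sum_{j \ge 1} \int_\Omega \varphi \cdot \big[\eta_{\varepsilon_j^{(k)}} * (u \nabla\psi_j) - u \nabla\psi_j\big]\, d\mathcal{L}^n.
\end{align*}
Concretely: for each $j$, decrease $\varepsilon_j^{(k)}$ enough that $\|\eta_{\varepsilon_j^{(k)}} * (u \nabla\psi_j) - u \nabla\psi_j\|_{L^1(\Omega)} < 1/(2k \cdot 2^j)$, which is possible because $u \nabla\psi_j \in L^1$ has compact support in $V_j$. Summing, the second sum contributes at most $1/(2k)$. For the first sum, shift the exhaustion from the outset (replace $\Omega_j$ by $\Omega_{j + m - 1}$ for a suitable $m = m(k)$, which is possible since $|Du|$ is a finite measure and $\Omega_j \uparrow \Omega$) so that $|Du|(\Omega \setminus \Omega_1) < 1/(6k)$. Each field $\psi_j \varphi_j$ lies in $C_c^1(V_j; \mathbb{R}^n)$ with sup norm at most $1$, so the term $j = 1$ is at most $|Du|(\Omega)$, and the terms $j \ge 2$ sum to at most $\sum_{j \ge 2} |Du|(V_j) \le 3\, |Du|(\Omega \setminus \Omega_1) < 1/(2k)$, because $V_j \subseteq \Omega \setminus \Omega_1$ for $j \ge 2$ and each point lies in at most three of the $V_j$. The cancellation then ensures
\begin{align*}
\Big|\int_\Omega u_k\, \operatorname{div}\varphi\, d\mathcal{L}^n\Big| \le |Du|(\Omega) + \frac{1}{k}.
\end{align*}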
Taking the supremum over admissible $\varphi$ in the duality definition of $|Du_k|$:
\begin{align*}
|Du_k|(\Omega) \le |Du|(\Omega) + \frac{1}{k}.
\end{align*}
The total variations of the smooth approximants exceed the original by at most $1/k$.
[/guided]
[/step]
[step:Apply lower-semicontinuity to obtain the matching lower bound and conclude strict convergence]
We have shown $u_k \to u$ in $L^1(\Omega)$ and $|Du_k|(\Omega) \le |Du|(\Omega) + 1/k$.
By the lower-semicontinuity of total variation under $L^1$-convergence, applied via [Weak Compactness in BV](/theorems/3104) (or its lower-semicontinuity corollary):
\begin{align*}
|Du|(\Omega) \le \liminf_{k \to \infty} |Du_k|(\Omega).
\end{align*}
Combining with the upper bound:
\begin{align*}
|Du|(\Omega) \le \liminf_{k \to \infty}|Du_k|(\Omega) \le \limsup_{k \to \infty}|Du_k|(\Omega) \le |Du|(\Omega).
\end{align*}
Every inequality in this chain is therefore an equality, hence
\begin{align*}
\lim_{k \to \infty}|Du_k|(\Omega) = |Du|(\Omega).
\end{align*}
This is exactly strict convergence: $u_k \to u$ in $L^1(\Omega)$ and $|Du_k|(\Omega) \to |Du|(\Omega)$. The construction provides $u_k \in C^\infty(\Omega) \cap BV(\Omega)$ as required (membership in $BV$ follows from $u_k \in L^1(\Omega)$, since $\|u_k - u\|_{L^1(\Omega)} < 1/k$, together with the upper bound $|Du_k|(\Omega) \le |Du|(\Omega) + 1$). This completes the proof.
[guided]
We assemble the upper and lower bounds to conclude strict convergence in $BV$.
*Upper bound (Step 4).* $|Du_k|(\Omega) \le |Du|(\Omega) + 1/k$, so
\begin{align*}
\limsup_{k \to \infty}|Du_k|(\Omega) \le |Du|(\Omega).
\end{align*}
*Lower bound (lower-semicontinuity).* The hypotheses of lower-semicontinuity of total variation under $L^1$-convergence: $u_k \to u$ in $L^1(\Omega)$ — verified in Step 4 — and the $u_k, u$ all lie in $L^1(\Omega) \subseteq L^1_{\mathrm{loc}}(\Omega)$. The conclusion (cf. [Weak Compactness in BV](/theorems/3104) and standard lower-semicontinuity of total variation):
\begin{align*}
|Du|(\Omega) \le \liminf_{k \to \infty}|Du_k|(\Omega).
\end{align*}
This is the standard duality-based proof: for any $\varphi \in C_c^1(\Omega; \mathbb{R}^n)$ with $\|\varphi\|_\infty \le 1$,
\begin{align*}
-\int u\, \operatorname{div}\varphi\, d\mathcal{L}^n = \lim_{k \to \infty}\Big(-\int u_k\, \operatorname{div}\varphi\, d\mathcal{L}^n\Big) \le \liminf_{k \to \infty} |Du_k|(\Omega),
\end{align*}
where the equality uses $u_k \to u$ in $L^1$ (with $\operatorname{div}\varphi \in C_c$ continuous and bounded). Taking the sup over $\varphi$ gives $|Du|(\Omega) \le \liminf |Du_k|(\Omega)$.
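Lower semicontinuity alone does not give convergence of the total variations, which is why the upper bound from Step 4 is needed. A standard illustration, with $\Omega = (0, 2\pi)$ and the functions $w_k$ below chosen here for illustration (they are not the approximants $u_k$ of the proof):
\begin{align*}
w_k(x) &:= \tfrac{1}{k} \sin(kx), \qquad \|w_k\|_{L^1(\Omega)} \le \tfrac{2\pi}{k} \to 0, \\
|Dw_k|(\Omega) &= \int_0^{2\pi} |\cos(kx)|\, dx = 4 \quad \text{for every } k \ge 1.
\end{align*}
The $L^1$-limit is $w = 0$ with $|Dw|(\Omega) = 0 < 4 = \liminf_{k \to \infty} |Dw_k|(\Omega)$, so the inequality in lower semicontinuity can be strict.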
*Sandwich argument.* Combining:
\begin{align*}
|Du|(\Omega) \le \liminf_{k \to \infty}|Du_k|(\Omega) \le \limsup_{k \to \infty}|Du_k|(\Omega) \le |Du|(\Omega).
\end{align*}
Every inequality in this chain is therefore an equality, so the limit exists and equals $|Du|(\Omega)$:
\begin{align*}
\lim_{k \to \infty}|Du_k|(\Omega) = |Du|(\Omega).
\end{align*}
*Final assembly.* The two convergences $u_k \to u$ in $L^1(\Omega)$ and $|Du_k|(\Omega) \to |Du|(\Omega)$ together constitute strict convergence in $BV(\Omega)$. The smoothness $u_k \in C^\infty(\Omega)$ is the locally finite sum of smooth pieces (Step 4). The $BV$-membership $u_k \in BV(\Omega)$ follows from the finite total variation $|Du_k|(\Omega) \le |Du|(\Omega) + 1$ (and $L^1$-membership from $\|u_k\|_{L^1} \le \|u\|_{L^1} + 1$).
This completes the proof: for every $u \in BV(\Omega)$, there exists a sequence $(u_k) \subset C^\infty(\Omega) \cap BV(\Omega)$ with $u_k \to u$ in $L^1(\Omega)$ and $|Du_k|(\Omega) \to |Du|(\Omega)$, i.e., $u_k \to u$ strictly in $BV(\Omega)$.
[/guided]
[/step]