[guided]The contractivity of mollification on total variation is the analytic core of the construction. The proof uses the duality definition of total variation, Fubini's theorem, and the symmetry of the mollifier.
*Set-up.* We work with $v \in BV(\mathbb{R}^n)$ of compact support. The mollified function $\eta_\varepsilon * v$ is in $C^\infty(\mathbb{R}^n)$: convolution with the smooth kernel $\eta_\varepsilon \in C_c^\infty(B(0, \varepsilon))$ produces a smooth function (every derivative of $\eta_\varepsilon * v$ exists pointwise as $(\partial^\alpha \eta_\varepsilon) * v$, and is continuous since $v \in L^1_{\mathrm{loc}}$). Moreover $\eta_\varepsilon * v$ has compact support contained in $\operatorname{supp}(v) + B(0, \varepsilon)$, hence is in $L^1(\mathbb{R}^n)$ as well.
*Duality definition of total variation.* For $w \in L^1(\mathbb{R}^n)$,
\begin{align*}
|Dw|(\mathbb{R}^n) = \sup\Big\{-\int_{\mathbb{R}^n} w\, \operatorname{div}\varphi\, d\mathcal{L}^n : \varphi \in C_c^1(\mathbb{R}^n; \mathbb{R}^n),\, \|\varphi\|_\infty \le 1\Big\},
\end{align*}
where $\|\varphi\|_\infty := \sup_{x \in \mathbb{R}^n} |\varphi(x)|$ is the supremum of the Euclidean norms of the values. By definition, $w \in BV(\mathbb{R}^n)$ exactly when this supremum is finite. We will bound $|D(\eta_\varepsilon * v)|(\mathbb{R}^n)$ using this definition.
*Computing the bound via convolution transfer.* Fix $\varphi \in C_c^1(\mathbb{R}^n; \mathbb{R}^n)$ with $\|\varphi\|_\infty \le 1$. We compute the integral pairing:
\begin{align*}
-\int_{\mathbb{R}^n} (\eta_\varepsilon * v)(x)\, \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x).
\end{align*}
Expanding the convolution:
\begin{align*}
(\eta_\varepsilon * v)(x) = \int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, v(y)\, d\mathcal{L}^n(y).
\end{align*}
Substituting:
\begin{align*}
-\int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, v(y)\, \operatorname{div}\varphi(x)\, d\mathcal{L}^n(y)\, d\mathcal{L}^n(x).
\end{align*}
*Fubini's hypotheses.* We swap the order of integration. The hypotheses: the integrand $\eta_\varepsilon(x - y)\, v(y)\, \operatorname{div}\varphi(x)$ is jointly measurable, and one of the iterated integrals of its absolute value is finite. We verify the latter. Since $v \in L^1(\mathbb{R}^n)$ (as $v \in BV(\mathbb{R}^n)$), $\eta_\varepsilon \ge 0$ with $\eta_\varepsilon \in L^1(\mathbb{R}^n)$, and $\operatorname{div}\varphi \in C_c(\mathbb{R}^n)$ is bounded, the iterated integral of the absolute value satisfies
\begin{align*}
\int_{\mathbb{R}^n}|v(y)| \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) |\operatorname{div}\varphi(x)|\, d\mathcal{L}^n(x)\, d\mathcal{L}^n(y) \le \|\operatorname{div}\varphi\|_\infty \|\eta_\varepsilon\|_{L^1} \|v\|_{L^1} < \infty.
\end{align*}
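The inner integral in this bound can be estimated explicitly. A brief sketch, assuming the mollifier is nonnegative (as the standard mollifier is): for each fixed $y$,
\begin{align*}
\int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, |\operatorname{div}\varphi(x)|\, d\mathcal{L}^n(x) \le \|\operatorname{div}\varphi\|_\infty \int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, d\mathcal{L}^n(x) = \|\operatorname{div}\varphi\|_\infty\, \|\eta_\varepsilon\|_{L^1},
\end{align*}
where the last equality is translation invariance of Lebesgue measure. Integrating this constant against $|v(y)|$ gives the product of norms displayed above.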
Fubini applies. Swapping:
\begin{align*}
-\int_{\mathbb{R}^n} v(y) \int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x)\, d\mathcal{L}^n(y).
\end{align*}
*Recognising the inner integral.* The inner integral is
\begin{align*}
\int_{\mathbb{R}^n} \eta_\varepsilon(x - y)\, \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x) = \int_{\mathbb{R}^n} \eta_\varepsilon(z)\, \operatorname{div}\varphi(z + y)\, d\mathcal{L}^n(z) = (\eta_\varepsilon * \operatorname{div}\varphi)(y),
\end{align*}
using the change of variables $z = x - y$ and the fact that the standard mollifier $\eta_\varepsilon$ is even ($\eta_\varepsilon(-z) = \eta_\varepsilon(z)$), which makes the convolution symmetric.
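The role of evenness can be made explicit. A sketch, using only $\eta_\varepsilon(-z) = \eta_\varepsilon(z)$ and the usual convention $(f * g)(y) = \int f(z)\, g(y - z)\, d\mathcal{L}^n(z)$: substituting $z \mapsto -z$ in the middle integral,
\begin{align*}
\int_{\mathbb{R}^n} \eta_\varepsilon(z)\, \operatorname{div}\varphi(y + z)\, d\mathcal{L}^n(z) = \int_{\mathbb{R}^n} \eta_\varepsilon(-z)\, \operatorname{div}\varphi(y - z)\, d\mathcal{L}^n(z) = \int_{\mathbb{R}^n} \eta_\varepsilon(z)\, \operatorname{div}\varphi(y - z)\, d\mathcal{L}^n(z) = (\eta_\varepsilon * \operatorname{div}\varphi)(y).
\end{align*}
Without evenness, the inner integral would instead be the convolution of $\operatorname{div}\varphi$ with the reflected kernel $z \mapsto \eta_\varepsilon(-z)$.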
*Differentiation passes through convolution.* By [Differentiation Passes Through Convolution](/theorems/3096), $\eta_\varepsilon * \operatorname{div}\varphi = \operatorname{div}(\eta_\varepsilon * \varphi)$, where $\eta_\varepsilon * \varphi$ is the componentwise convolution applied to a vector-valued function. The hypotheses of theorem 3096 are met: the kernel $\eta_\varepsilon$ is in $C_c^\infty$ and $\varphi$ is in $C_c^1$. Both sides of the identity are continuous functions, and the identity holds pointwise.
So
\begin{align*}
\int_{\mathbb{R}^n}\eta_\varepsilon(x - y) \operatorname{div}\varphi(x)\, d\mathcal{L}^n(x) = \operatorname{div}(\eta_\varepsilon * \varphi)(y).
\end{align*}
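Written out componentwise, with $\varphi = (\varphi_1, \dots, \varphi_n)$, the identity reads as follows. This is only a sketch: the interchange of $\partial_i$ with the convolution in the second equality is exactly what the cited theorem provides.
\begin{align*}
\operatorname{div}(\eta_\varepsilon * \varphi)(y) = \sum_{i=1}^n \partial_i (\eta_\varepsilon * \varphi_i)(y) = \sum_{i=1}^n (\eta_\varepsilon * \partial_i \varphi_i)(y) = \Big(\eta_\varepsilon * \sum_{i=1}^n \partial_i \varphi_i\Big)(y) = (\eta_\varepsilon * \operatorname{div}\varphi)(y).
\end{align*}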
*Pairing with $Dv$.* Substituting back:
\begin{align*}
-\int_{\mathbb{R}^n} (\eta_\varepsilon * v)\, \operatorname{div}\varphi\, d\mathcal{L}^n = -\int_{\mathbb{R}^n} v(y)\, \operatorname{div}(\eta_\varepsilon * \varphi)(y)\, d\mathcal{L}^n(y).
\end{align*}
The right-hand side is the action of $Dv$ on the smooth test field $\eta_\varepsilon * \varphi$:
\begin{align*}
-\int v\, \operatorname{div}(\eta_\varepsilon * \varphi)\, d\mathcal{L}^n = \langle Dv, \eta_\varepsilon * \varphi\rangle.
\end{align*}
*Bounding the transformed test field.* The transformed field $\eta_\varepsilon * \varphi$ is in $C^\infty(\mathbb{R}^n; \mathbb{R}^n)$ with compact support in $\operatorname{supp}\varphi + B(0, \varepsilon)$. Its sup norm:
\begin{align*}
\|\eta_\varepsilon * \varphi\|_\infty \le \|\eta_\varepsilon\|_{L^1}\, \|\varphi\|_\infty = 1 \cdot 1 = 1
\end{align*}
by Young's convolution inequality (with $\|\eta_\varepsilon\|_{L^1} = 1$ by normalisation of the mollifier). Hence $\eta_\varepsilon * \varphi$ is admissible in the duality definition of $|Dv|$, giving
\begin{align*}
\langle Dv, \eta_\varepsilon * \varphi\rangle \le |Dv|(\mathbb{R}^n).
\end{align*}
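Because $\|\cdot\|_\infty$ here is the supremum of Euclidean norms of vector values, the bound $\|\eta_\varepsilon * \varphi\|_\infty \le 1$ can also be checked directly rather than component by component. A sketch, assuming $\eta_\varepsilon \ge 0$ with $\int \eta_\varepsilon\, d\mathcal{L}^n = 1$: for every $x \in \mathbb{R}^n$,
\begin{align*}
|(\eta_\varepsilon * \varphi)(x)| = \Big|\int_{\mathbb{R}^n} \eta_\varepsilon(z)\, \varphi(x - z)\, d\mathcal{L}^n(z)\Big| \le \int_{\mathbb{R}^n} \eta_\varepsilon(z)\, |\varphi(x - z)|\, d\mathcal{L}^n(z) \le \|\varphi\|_\infty \int_{\mathbb{R}^n} \eta_\varepsilon(z)\, d\mathcal{L}^n(z) = \|\varphi\|_\infty \le 1.
\end{align*}
The first inequality is the triangle inequality for vector-valued integrals; nonnegativity of the kernel is what keeps the constant equal to $1$.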
*Conclusion.* Taking the supremum over all admissible $\varphi$:
\begin{align*}
|D(\eta_\varepsilon * v)|(\mathbb{R}^n) = \sup_{\|\varphi\|_\infty \le 1}\Big[-\int (\eta_\varepsilon * v)\, \operatorname{div}\varphi\, d\mathcal{L}^n\Big] \le |Dv|(\mathbb{R}^n).
\end{align*}
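The total variation contracts under mollification. For fixed $\varepsilon > 0$ the inequality may be strict, since mollification can flatten the function and reduce its total variation. In the limit $\varepsilon \to 0$ the bound is sharp for every $v \in BV(\mathbb{R}^n)$, in particular for $v \in W^{1, 1}$: since $\eta_\varepsilon * v \to v$ in $L^1(\mathbb{R}^n)$, lower semicontinuity of the total variation gives $|Dv|(\mathbb{R}^n) \le \liminf_{\varepsilon \to 0} |D(\eta_\varepsilon * v)|(\mathbb{R}^n)$, and together with the contraction this yields $|D(\eta_\varepsilon * v)|(\mathbb{R}^n) \to |Dv|(\mathbb{R}^n)$.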
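A one-dimensional example illustrates when the inequality is an equality and when it is strict. This is a sketch under the assumption that $\eta_\varepsilon$ is the standard mollifier, so that it is even, strictly positive on $(-\varepsilon, \varepsilon)$, and non-increasing in $|z|$. Take $n = 1$ and $v = \chi_{[0, 1]}$, so that $|Dv|(\mathbb{R}) = 2$. Then
\begin{align*}
(\eta_\varepsilon * v)(x) = \int_{x - 1}^{x} \eta_\varepsilon(z)\, dz, \qquad (\eta_\varepsilon * v)'(x) = \eta_\varepsilon(x) - \eta_\varepsilon(x - 1).
\end{align*}
Since $|x| \le |x - 1|$ exactly when $x \le 1/2$, the derivative is nonnegative for $x \le 1/2$ and nonpositive for $x \ge 1/2$. The mollified function therefore rises monotonically to its maximum at $x = 1/2$ and then falls monotonically back to $0$, so
\begin{align*}
|D(\eta_\varepsilon * v)|(\mathbb{R}) = 2\, (\eta_\varepsilon * v)(1/2) = 2 \int_{-1/2}^{1/2} \eta_\varepsilon(z)\, dz.
\end{align*}
For $\varepsilon \le 1/2$ the integral equals $1$ and the total variation is preserved. For $\varepsilon > 1/2$ part of the mass of $\eta_\varepsilon$ lies outside $[-1/2, 1/2]$, the integral is strictly less than $1$, and $|D(\eta_\varepsilon * v)|(\mathbb{R}) < 2 = |Dv|(\mathbb{R})$.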
*Localisation.* Let $\Omega' \subset\subset \Omega''$ be open sets with $\operatorname{dist}(\Omega', \partial\Omega'') > \varepsilon$. For a test field $\varphi \in C_c^1(\Omega'; \mathbb{R}^n)$ with $\|\varphi\|_\infty \le 1$, the transformed field $\eta_\varepsilon * \varphi$ has compact support in $\operatorname{supp}\varphi + B(0, \varepsilon) \subseteq \Omega''$, so it is admissible in the duality definition of $|Dv|(\Omega'')$. Running the same argument with the supremum taken over test fields compactly supported in $\Omega'$ yields
\begin{align*}
|D(\eta_\varepsilon * v)|(\Omega') \le |Dv|(\Omega'').
\end{align*}
This is the localised contractivity. The bound says: the mollified function's gradient measure on $\Omega'$ is dominated by the original function's gradient measure on the slightly thicker $\Omega''$. The thickening accounts for the support spread under convolution by $\varepsilon$.
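The localised estimate can be traced through the same chain of identities. A sketch, for a test field $\varphi \in C_c^1(\Omega'; \mathbb{R}^n)$ with $\|\varphi\|_\infty \le 1$, using the pairing notation from the global argument:
\begin{align*}
-\int_{\Omega'} (\eta_\varepsilon * v)\, \operatorname{div}\varphi\, d\mathcal{L}^n = -\int_{\Omega''} v\, \operatorname{div}(\eta_\varepsilon * \varphi)\, d\mathcal{L}^n = \langle Dv, \eta_\varepsilon * \varphi\rangle \le |Dv|(\Omega'').
\end{align*}
The first equality is the Fubini step together with the identity $\eta_\varepsilon * \operatorname{div}\varphi = \operatorname{div}(\eta_\varepsilon * \varphi)$. The final inequality uses that $\eta_\varepsilon * \varphi$ is compactly supported in $\Omega''$ with sup norm at most $1$. Taking the supremum over all such $\varphi$ turns the left-hand side into $|D(\eta_\varepsilon * v)|(\Omega')$.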
This local contraction is what we use in the partition-of-unity construction: each piece $\eta_{\varepsilon_j} * (\psi_j u)$ has gradient measure on $\Omega$ dominated by $|D(\psi_j u)|$ on a slightly thickened set $V_j'$, and the thickening fits within the locally finite cover so the dominations sum.[/guided]