Approximate Gradient Identifies $D^a u$ — Statement & Proof

Approximate Gradient Identifies $D^a u$ (Theorem # 3130)

Theorem


Proof

[proofplan] The argument proceeds by Lebesgue–Besicovitch differentiation of the vector measure $Du$. The Radon–Nikodym decomposition writes $Du = D^a u + D^s u$ with $D^a u = g \cdot \mathcal{L}^n$, where the density $g \in L^1_{\mathrm{loc}}(\Omega; \mathbb{R}^n)$ is recovered as a symmetric derivative of $Du$ at $\mathcal{L}^n$-a.e. point and the singular part has zero $\mathcal{L}^n$-density there. Translating these density statements via the rescaled BV functions $v_r(y) := (u(x_0 + ry) - c_r)/r$ on the unit ball, where $c_r$ is the average of $u$ on $B(x_0, r)$, the BV scaling yields $|Dv_r|(B(0, 1)) \to \omega_n |g(x_0)|$ together with weak* convergence of $Dv_r$ to $g(x_0)\, \mathcal{L}^n|_{B(0, 1)}$. The BV–compact embedding then gives $L^1$-convergence of $v_r$ to the affine function $y \mapsto g(x_0)\cdot y$, which unwinds into the approximate-differentiability statement at $x_0$ with approximate gradient $g(x_0)$. The agreement of $\widetilde{u}(x_0)$ with $u(x_0)$ at $\mathcal{L}^n$-a.e. $x_0$ is the Lebesgue differentiation theorem. [/proofplan] [step:Decompose $Du$ into absolutely continuous and singular parts and identify the density via differentiation] By the [Radon–Nikodym theorem](/theorems/radon-nikodym) applied to the Radon vector measure $Du$ on $\Omega$ relative to $\mathcal{L}^n|_\Omega$, there is a unique decomposition \begin{align*} Du &= D^a u + D^s u, & D^a u &\ll \mathcal{L}^n, & D^s u &\perp \mathcal{L}^n. \end{align*} The Radon–Nikodym density $g \in L^1_{\mathrm{loc}}(\Omega; \mathbb{R}^n)$ is defined by $D^a u = g \cdot \mathcal{L}^n$. By the [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch) applied to the vector measure $Du$ with the doubling reference measure $\mathcal{L}^n$, the symmetric derivative \begin{align*} \lim_{r \to 0} \frac{Du(B(x_0, r))}{\mathcal{L}^n(B(x_0, r))} = g(x_0) \in \mathbb{R}^n \end{align*} exists at $\mathcal{L}^n$-a.e. $x_0 \in \Omega$. 
The singular part has vanishing $\mathcal{L}^n$-density: \begin{align*} \lim_{r \to 0} \frac{|D^s u|(B(x_0, r))}{\mathcal{L}^n(B(x_0, r))} = 0 \quad \text{for $\mathcal{L}^n$-a.e. } x_0 \in \Omega. \end{align*} [guided] We are given $u \in BV(\Omega)$, meaning the distributional derivative $Du$ is a finite Radon vector measure on $\Omega$ with values in $\mathbb{R}^n$. The total variation $|Du|$ is a non-negative Radon measure. The hypotheses of the [Radon–Nikodym theorem](/theorems/radon-nikodym) for vector-valued measures: $Du$ is a Radon vector measure with finite total variation on $\Omega$, and $\mathcal{L}^n$ is a $\sigma$-finite Borel measure on $\Omega$. Both are met. The conclusion provides the unique decomposition \begin{align*} Du &= D^a u + D^s u, & D^a u &\ll \mathcal{L}^n, & D^s u &\perp \mathcal{L}^n, \end{align*} together with a density $g \in L^1_{\mathrm{loc}}(\Omega; \mathbb{R}^n)$ such that $D^a u = g \cdot \mathcal{L}^n$. The hypotheses of the [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch): the reference measure $\mathcal{L}^n$ is doubling on $\mathbb{R}^n$ (a stronger property than required), and $Du$ is a Radon vector measure. The conclusion gives, at $\mathcal{L}^n$-a.e. $x_0$, \begin{align*} \lim_{r \to 0} \frac{Du(B(x_0, r))}{\mathcal{L}^n(B(x_0, r))} = g(x_0). \end{align*} This is the symmetric-derivative recovery of the Radon–Nikodym density. Applied to the singular part separately, the symmetric derivative of $|D^s u|$ relative to $\mathcal{L}^n$ vanishes $\mathcal{L}^n$-a.e., because $|D^s u| \perp \mathcal{L}^n$ and the differentiation theorem applied to a singular measure gives zero density at typical points. The set of $x_0$ where both density limits hold has full $\mathcal{L}^n$-measure in $\Omega$; we restrict our attention to such $x_0$ throughout the rest of the proof. 
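As a hedged illustration of the two density statements (a hypothetical one-dimensional example, not part of the original argument), take $n = 1$, $\Omega = (-1, 1)$ and $u(x) = a x + \mathbf{1}_{(0, 1)}(x)$ for a constant $a \in \mathbb{R}$. The sketch below assumes $0 < r < \min(|x_0|, 1 - |x_0|)$ in its second line and $0 < r < 1$ in its third:

```latex
\begin{align*}
Du &= a\, \mathcal{L}^1 + \delta_0, \qquad D^a u = a\, \mathcal{L}^1, \qquad D^s u = \delta_0, \qquad g \equiv a, \\
\frac{Du(B(x_0, r))}{\mathcal{L}^1(B(x_0, r))} &= \frac{2 a r}{2 r} = a = g(x_0), \qquad
\frac{|D^s u|(B(x_0, r))}{\mathcal{L}^1(B(x_0, r))} = 0 \qquad (x_0 \neq 0), \\
\frac{Du(B(0, r))}{\mathcal{L}^1(B(0, r))} &= \frac{2 a r + 1}{2 r} = a + \frac{1}{2 r} \to \infty \qquad (x_0 = 0,\ r \to 0).
\end{align*}
```

In this example both density limits hold at every $x_0 \neq 0$, and the exceptional set $\{0\}$ is $\mathcal{L}^1$-null.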
Our task is to show that the approximate gradient $\nabla u(x_0)$ — defined intrinsically by an approximate differentiability condition — coincides with the Radon–Nikodym density $g(x_0)$. [/guided] [/step] [step:Define rescaled BV functions and compute the BV scaling identity] Fix $x_0 \in \Omega$ at which the density limits in Step 1 exist; the set of such $x_0$ has full $\mathcal{L}^n$-measure. Choose $\rho > 0$ with $B(x_0, \rho) \subseteq \Omega$. For $r \in (0, \rho)$, set \begin{align*} c_r := \fint_{B(x_0, r)} u(x)\, d\mathcal{L}^n(x), \end{align*} and define \begin{align*} v_r: B(0, 1) &\to \mathbb{R} \\ y &\mapsto \frac{u(x_0 + r y) - c_r}{r}. \end{align*} Note $\fint_{B(0, 1)} v_r\, d\mathcal{L}^n = 0$ by the definition of $c_r$ and the change of variables. For any test field $\varphi \in C_c^1(B(0, 1); \mathbb{R}^n)$, the change of variables $x = x_0 + ry$, $d\mathcal{L}^n(y) = r^{-n}\, d\mathcal{L}^n(x)$, with $\Phi(x) := \varphi((x - x_0)/r) \in C_c^1(B(x_0, r); \mathbb{R}^n)$ and the chain-rule identity $\operatorname{div}_y \varphi(y) = r\, \operatorname{div}_x \Phi(x)$, yields \begin{align*} \langle Dv_r, \varphi\rangle &= -\int_{B(0, 1)} v_r(y)\, \operatorname{div}_y \varphi(y)\, d\mathcal{L}^n(y) \\ &= -\frac{1}{r}\int_{B(x_0, r)} u(x)\, r\, \operatorname{div}_x \Phi(x)\, r^{-n}\, d\mathcal{L}^n(x) \\ &= r^{-n}\int_{B(x_0, r)} u(x)\, (-\operatorname{div}_x \Phi(x))\, d\mathcal{L}^n(x) \\ &= r^{-n}\, \langle Du, \Phi\rangle, \end{align*} where the integral of the constant $c_r/r$ against $\operatorname{div}_y \varphi$ vanishes by the divergence theorem on the unit ball with $\varphi$ compactly supported. As $\varphi$ ranges over $C_c^1(B(0, 1); \mathbb{R}^n)$ with $\|\varphi\|_\infty \le 1$, $\Phi$ ranges over $C_c^1(B(x_0, r); \mathbb{R}^n)$ with $\|\Phi\|_\infty \le 1$. Taking suprema: \begin{align*} |Dv_r|(B(0, 1)) = r^{-n}\, |Du|(B(x_0, r)). 
\end{align*} Since $\mathcal{L}^n(B(x_0, r)) = \omega_n r^n$, this rewrites as \begin{align*} |Dv_r|(B(0, 1)) = \omega_n \cdot \frac{|Du|(B(x_0, r))}{\mathcal{L}^n(B(x_0, r))}. \end{align*} The right-hand side tends to $\omega_n |g(x_0)|$ as $r \to 0$ by Step 1. [guided] The natural rescaling for $BV$ functions is dictated by the fact that the gradient should remain $O(1)$ as $r \to 0$. For a smooth test case $u(x) = g \cdot (x - x_0)$ with constant gradient $g$, the rescaled function $u(x_0 + ry) = g \cdot ry$ has gradient $rg$ in the $y$-coordinates; dividing by $r$ recovers a finite gradient. This motivates the centred-and-divided form $v_r(y) = (u(x_0 + ry) - c_r)/r$. We choose $c_r$ as the average of $u$ on $B(x_0, r)$ so that $v_r$ has zero average on $B(0, 1)$ — this is the Poincaré-friendly normalisation that makes $L^1$-bounds available without an additive ambiguity. Computing $|Dv_r|$. The distributional derivative pairs against test fields $\varphi \in C_c^1(B(0, 1); \mathbb{R}^n)$ via integration by parts. After change of variables $x = x_0 + ry$ (so $d\mathcal{L}^n(y) = r^{-n} d\mathcal{L}^n(x)$) and the chain-rule identity for the divergence ($\operatorname{div}_y \varphi(y) = r\, \operatorname{div}_x \Phi(x)$, where the factor of $r$ comes from differentiating the inverse change of variables), the additive constant $c_r/r$ contributes zero (integral of a divergence on the unit ball with $\varphi$ compactly supported), and the remaining integral becomes the pairing of $Du$ on $B(x_0, r)$ with the test field $\Phi(x) = \varphi((x - x_0)/r)$, scaled by $r^{-n}$: \begin{align*} \langle Dv_r, \varphi\rangle = r^{-n} \langle Du, \Phi\rangle. 
\end{align*} The supremum identity for total variation, \begin{align*} |Dv_r|(B(0, 1)) = \sup_{\varphi: \|\varphi\|_\infty \le 1} \langle Dv_r, \varphi\rangle, \end{align*} together with the bijection $\varphi \leftrightarrow \Phi$ preserving sup-norms (since $\Phi(x) = \varphi(y)$ takes the same values, just at rescaled points), gives \begin{align*} |Dv_r|(B(0, 1)) = r^{-n}\, |Du|(B(x_0, r)). \end{align*} Sanity check via the linear test case: for $u(x) = g \cdot (x - x_0)$, $|Du|$ has density $|g|$, so $|Du|(B(x_0, r)) = |g| \omega_n r^n$, giving $|Dv_r|(B(0, 1)) = |g|\omega_n$ — independent of $r$, matching the constant-gradient unit-ball total variation of $v(y) = g \cdot y$. Multiplying and dividing by $\mathcal{L}^n(B(x_0, r)) = \omega_n r^n$: \begin{align*} |Dv_r|(B(0, 1)) = \omega_n \cdot \frac{|Du|(B(x_0, r))}{\mathcal{L}^n(B(x_0, r))}. \end{align*} By Step 1 (specifically the symmetric derivative of $|Du|$ relative to $\mathcal{L}^n$, which equals $|g(x_0)|$ at points where the singular density vanishes — i.e., at $\mathcal{L}^n$-a.e. $x_0$), \begin{align*} |Dv_r|(B(0, 1)) \to \omega_n |g(x_0)| \quad \text{as } r \to 0. \end{align*} The total variations remain uniformly bounded as $r \to 0$. [/guided] [/step] [step:Extract a limit of $v_r$ via BV-compactness and identify it via weak* convergence of $Dv_r$] By Poincaré's inequality on $B(0, 1)$ for zero-mean BV functions (a consequence of [Poincaré in Balls](/theorems/3103) extended to BV via approximation), \begin{align*} \|v_r\|_{L^1(B(0, 1))} \le C_n\, |Dv_r|(B(0, 1)) \le C_n \omega_n |g(x_0)| + o(1), \end{align*} so $\{v_r\}_{r \in (0, \rho)}$ is bounded in $BV(B(0, 1))$. By the BV variant of [Rellich–Kondrachov](/theorems/64) (the unit ball $B(0, 1)$ is a bounded Lipschitz domain), bounded sequences in $BV(B(0, 1))$ are precompact in $L^1(B(0, 1))$. Hence every sequence $r_k \to 0$ has a subsequence along which $v_{r_k}$ converges in $L^1(B(0, 1))$ to a limit $v \in L^1(B(0, 1))$. 
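To see the rescaled functions $v_r$ and their total variations concretely, a minimal one-dimensional sketch may help. The function is an illustrative assumption, not taken from the proof: let $n = 1$ (so $\omega_1 = 2$), $\Omega = (-1, 1)$ and $u(x) = a x + \mathbf{1}_{(0, 1)}(x)$, for which $Du = a\, \mathcal{L}^1 + \delta_0$ and $g \equiv a$.

```latex
\begin{align*}
&\text{Case } x_0 \neq 0,\ 0 < r < \min(|x_0|, 1 - |x_0|): \\
&\qquad c_r = a x_0 + \mathbf{1}_{(0, 1)}(x_0), \qquad v_r(y) = a y, \qquad
|Dv_r|(B(0, 1)) = 2|a| = \omega_1 |g(x_0)|. \\
&\text{Case } x_0 = 0,\ 0 < r < 1: \\
&\qquad c_r = \tfrac{1}{2}, \qquad v_r(y) = a y + \frac{\mathbf{1}_{(0, 1)}(y) - \tfrac{1}{2}}{r}, \qquad
|Dv_r|(B(0, 1)) = 2|a| + \frac{1}{r}.
\end{align*}
```

At the regular points $v_r$ already coincides with the affine function $y \mapsto g(x_0) \cdot y$ for small $r$, while at the jump point the total variations are unbounded, in agreement with $r^{-n} |Du|(B(x_0, r)) = r^{-1}(2|a| r + 1)$.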
The limit $v$ has finite total variation by lower-semicontinuity: \begin{align*} |Dv|(B(0, 1)) \le \liminf_{k \to \infty} |Dv_{r_k}|(B(0, 1)) = \omega_n |g(x_0)|. \end{align*} We identify $v$. For any open ball $B(z, \delta) \subseteq B(0, 1)$, the rescaled vector measure satisfies \begin{align*} Dv_r(B(z, \delta)) = r^{-n} Du(x_0 + r\, B(z, \delta)) = r^{-n} Du(B(x_0 + rz, r\delta)). \end{align*} Dividing by $\mathcal{L}^n(B(z, \delta)) = \omega_n \delta^n$: \begin{align*} \frac{Dv_r(B(z, \delta))}{\mathcal{L}^n(B(z, \delta))} = \frac{Du(B(x_0 + rz, r\delta))}{\mathcal{L}^n(B(x_0 + rz, r\delta))}. \end{align*} For $|z| < 1$, $x_0 + rz \to x_0$ as $r \to 0$ and $r \delta \to 0$. We must check that the family of balls $\{B(x_0 + rz, r\delta)\}_{r \to 0}$ shrinks nicely to $x_0$ in the sense of the [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch): each ball $B(x_0 + rz, r\delta)$ is contained in the centred ball $B(x_0, r(|z| + \delta))$, and the ratio of measures is \begin{align*} \frac{\mathcal{L}^n(B(x_0, r(|z| + \delta)))}{\mathcal{L}^n(B(x_0 + rz, r\delta))} = \frac{(|z| + \delta)^n}{\delta^n} \le \left(\frac{1 + \delta}{\delta}\right)^n, \end{align*} a uniform bound (independent of $r$) since $|z| < 1$. Hence the family shrinks nicely to $x_0$, and the strong form of the Lebesgue–Besicovitch theorem applies: the right-hand side converges to $g(x_0)$ as $r \to 0$. Therefore \begin{align*} Dv_r(B(z, \delta)) \to g(x_0) \cdot \mathcal{L}^n(B(z, \delta)) \quad \text{as } r \to 0, \end{align*} for every open ball $B(z, \delta) \subseteq B(0, 1)$. We now extract a weak* limit and identify it. The space $\mathcal{M}(B(0, 1); \mathbb{R}^n)$ of finite $\mathbb{R}^n$-valued Radon measures on $B(0, 1)$ is the topological dual of $C_0(B(0, 1); \mathbb{R}^n)$. The uniform bound $|Dv_r|(B(0, 1)) \le \omega_n |g(x_0)| + 1$ for small $r$ places the family $\{Dv_r\}$ in a bounded subset of this dual space. 
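By the Banach–Alaoglu theorem (together with the separability of $C_0(B(0, 1); \mathbb{R}^n)$), the family is sequentially weak* precompact: for any $r_k \to 0$, there exist a subsequence $r_{k_j}$ and a finite vector measure $\mu$ on $B(0, 1)$ such that $Dv_{r_{k_j}} \overset{*}{\rightharpoonup} \mu$, i.e., \begin{align*} \int_{B(0, 1)} \psi \, dDv_{r_{k_j}} \to \int_{B(0, 1)} \psi \, d\mu \quad \text{for every } \psi \in C_0(B(0, 1); \mathbb{R}^n). \end{align*} To identify $\mu$, we show $\mu = g(x_0) \mathcal{L}^n|_{B(0, 1)}$. Fix any open ball $B(z, \delta) \subseteq B(0, 1)$ and radii $\delta_m \uparrow \delta$, and approximate the indicator $\mathbf{1}_{B(z, \delta)}$ from inside by a non-decreasing sequence of functions $\eta_m \in C_c(B(z, \delta))$ with $0 \le \eta_m \le 1$ and $\eta_m = 1$ on $B(z, \delta_m)$, so that $\eta_m \uparrow \mathbf{1}_{B(z, \delta)}$ pointwise on $B(0, 1)$. Weak* convergence applied to $\eta_m e_i$ for each coordinate vector $e_i$ gives, for every fixed $m$, \begin{align*} \int \eta_m \, d\mu = \lim_j \int \eta_m \, dDv_{r_{k_j}}. \end{align*} To exchange the limits in $m$ and $j$, note that for every $r$, \begin{align*} \left| Dv_r(B(z, \delta)) - \int \eta_m \, dDv_r \right| \le |Dv_r|\big(B(z, \delta) \setminus B(z, \delta_m)\big) = |Dv_r|(B(z, \delta)) - |Dv_r|(B(z, \delta_m)). \end{align*} The shrinks-nicely argument above, applied to the measure $|Du|$ (whose $\mathcal{L}^n$-density at $x_0$ is $|g(x_0)|$, as used in Step 2), gives $|Dv_r|(B(z, \delta')) \to |g(x_0)|\, \mathcal{L}^n(B(z, \delta'))$ for each $\delta' \in (0, \delta]$, so the right-hand side tends to $\omega_n |g(x_0)| (\delta^n - \delta_m^n)$ as $r \to 0$. Letting $j \to \infty$ and then $m \to \infty$, with dominated convergence for $\int \eta_m\, d\mu$ (using that $|\mu|$ is finite on $B(0, 1)$) and the ball-convergence shown above, yields \begin{align*} \mu(B(z, \delta)) = \lim_j Dv_{r_{k_j}}(B(z, \delta)) = g(x_0) \mathcal{L}^n(B(z, \delta)). \end{align*} Hence $\mu$ and $g(x_0) \mathcal{L}^n|_{B(0, 1)}$ agree on every open ball $B(z, \delta) \subseteq B(0, 1)$. Since intersections of balls need not be balls, a $\pi$-system argument is not directly available; uniqueness follows by differentiation instead. Each coordinate of $\mu - g(x_0) \mathcal{L}^n|_{B(0, 1)}$ is a finite signed Radon measure $\nu_i$ that vanishes on every open ball contained in $B(0, 1)$. By the [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch) applied to $\nu_i$ relative to $|\nu_i|$, the density $d\nu_i / d|\nu_i|$, which has modulus $1$ at $|\nu_i|$-a.e. point, equals $\lim_{\rho \to 0} \nu_i(B(x, \rho)) / |\nu_i|(B(x, \rho)) = 0$ at $|\nu_i|$-a.e. $x$. Hence $|\nu_i| = 0$ for each $i$, and $\mu = g(x_0) \mathcal{L}^n|_{B(0, 1)}$. Since the limit is the same for every weak* convergent subsequence, the full family converges: \begin{align*} Dv_r \overset{*}{\rightharpoonup} g(x_0)\, \mathcal{L}^n|_{B(0, 1)} \quad \text{as } r \to 0. 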
\end{align*} The $L^1$-limit $v$ then has gradient measure $Dv = g(x_0)\, \mathcal{L}^n|_{B(0, 1)}$: for any $\varphi \in C_c^1(B(0, 1); \mathbb{R}^n)$, \begin{align*} \langle Dv, \varphi\rangle = -\int_{B(0, 1)} v\, \operatorname{div}\varphi\, d\mathcal{L}^n = -\lim_k \int_{B(0, 1)} v_{r_k} \operatorname{div}\varphi\, d\mathcal{L}^n = \lim_k \langle Dv_{r_k}, \varphi\rangle = \int_{B(0, 1)} g(x_0) \cdot \varphi\, d\mathcal{L}^n. \end{align*} Hence $v \in W^{1, \infty}(B(0, 1))$ with $\nabla v = g(x_0)$ $\mathcal{L}^n$-a.e., and the zero-mean normalisation forces $v(y) = g(x_0) \cdot y$ on $B(0, 1)$. By uniqueness of the limit, the full family converges: \begin{align*} v_r \to v_*\quad \text{in } L^1(B(0, 1)) \text{ as } r \to 0, \quad v_*(y) := g(x_0) \cdot y. \end{align*} [guided] We extract a limit of the rescaled functions and identify it as an affine function with gradient $g(x_0)$. *BV-compactness.* The hypotheses of the BV Rellich–Kondrachov theorem on $B(0, 1)$: bounded Lipschitz domain (the unit ball is smooth, hence Lipschitz) and a bounded family in $BV$. The total variations $|Dv_r|(B(0, 1))$ tend to $\omega_n|g(x_0)|$, so they are uniformly bounded for small $r$. The $L^1$ norms are bounded by Poincaré–Wirtinger on the unit ball (with zero mean): $\|v_r\|_{L^1} \le C_n |Dv_r|(B(0, 1)) \le C_n \omega_n |g(x_0)| + o(1)$. So $\{v_r\}$ is uniformly $BV$-bounded for small $r$. The BV-Rellich theorem on $B(0, 1)$ states: bounded sequences in $BV(B(0, 1))$ are relatively compact in $L^1(B(0, 1))$. Applied to our family, every sequence $r_k \to 0$ has a subsequence $r_{k_j}$ with $v_{r_{k_j}} \to v$ in $L^1(B(0, 1))$ for some $v \in L^1(B(0, 1))$. *Identifying the limit via weak* convergence of gradients.* The vector measure $Dv_r$ should converge weakly* to a constant-density measure $g(x_0)\, \mathcal{L}^n$. 
To verify, evaluate $Dv_r$ on a test ball $B(z, \delta) \subseteq B(0, 1)$: \begin{align*} Dv_r(B(z, \delta)) = r^{-n} (Du)(x_0 + r B(z, \delta)) = r^{-n}\, Du(B(x_0 + rz, r\delta)), \end{align*} using the rescaling identity from Step 2 promoted from compactly supported test fields to Borel sets via Radon-measure regularity (open-set agreement plus inner regularity). The right-hand side is $r^{-n}$ times the $Du$-measure of a small ball centred at $x_0 + rz$, which moves toward $x_0$ as $r \to 0$. Dividing by $\mathcal{L}^n(B(z, \delta)) = \omega_n \delta^n$ on both sides: \begin{align*} \frac{Dv_r(B(z, \delta))}{\mathcal{L}^n(B(z, \delta))} = \frac{Du(B(x_0 + rz, r\delta))}{\mathcal{L}^n(B(x_0 + rz, r\delta))}. \end{align*} The right-hand side is the symmetric-derivative quotient for $Du$ taken along the family $\{B(x_0 + rz, r\delta)\}_{r \to 0}$. To apply the [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch) to this off-centre family, we verify the *shrinks-nicely* hypothesis: each ball $B(x_0 + rz, r\delta)$ sits inside the centred ball $B(x_0, r(|z| + \delta))$, with measure ratio \begin{align*} \frac{\mathcal{L}^n(B(x_0, r(|z| + \delta)))}{\mathcal{L}^n(B(x_0 + rz, r\delta))} = \left(\frac{|z| + \delta}{\delta}\right)^n \le \left(\frac{1 + \delta}{\delta}\right)^n, \end{align*} which is a constant independent of $r$. Hence the family shrinks nicely to $x_0$, the radii $r(|z| + \delta) \to 0$, and the strong form of the differentiation theorem yields \begin{align*} \frac{Du(B(x_0 + rz, r\delta))}{\mathcal{L}^n(B(x_0 + rz, r\delta))} \to g(x_0) \quad \text{as } r \to 0. \end{align*} This gives $Dv_r(B(z, \delta)) \to g(x_0) \mathcal{L}^n(B(z, \delta))$ for each fixed open ball $B(z, \delta) \subseteq B(0, 1)$. Now we promote ball-convergence to weak* convergence in the standard way. 
By the Banach–Alaoglu theorem, the bounded family $\{Dv_r\}$ in the dual space $C_0(B(0, 1); \mathbb{R}^n)^* = \mathcal{M}(B(0, 1); \mathbb{R}^n)$ is sequentially weak* precompact: every sequence $r_k \to 0$ admits a subsequence $r_{k_j}$ along which $Dv_{r_{k_j}} \overset{*}{\rightharpoonup} \mu$ for some finite vector measure $\mu$. To identify $\mu$ with $g(x_0) \mathcal{L}^n|_{B(0, 1)}$, approximate the indicator of any open ball $B(z, \delta) \subseteq B(0, 1)$ from below by $C_c$-functions $\eta_m \uparrow \mathbf{1}_{B(z, \delta)}$ with $\eta_m = 1$ on $B(z, \delta_m)$, where $\delta_m \uparrow \delta$. Weak* convergence handles each fixed $\eta_m$; the error $|Dv_r(B(z, \delta)) - \int \eta_m\, dDv_r|$ is at most $|Dv_r|(B(z, \delta) \setminus B(z, \delta_m))$, whose limit as $r \to 0$ is $\omega_n |g(x_0)| (\delta^n - \delta_m^n)$ by the ball-convergence argument applied to $|Du|$; dominated convergence then handles $m \to \infty$. This recovers $\mu(B(z, \delta)) = g(x_0) \mathcal{L}^n(B(z, \delta))$ for every such ball. Since intersections of balls need not be balls, a $\pi$-system argument is not directly available; instead, each coordinate of $\mu - g(x_0) \mathcal{L}^n|_{B(0, 1)}$ is a finite signed Radon measure $\nu_i$ vanishing on every open ball in $B(0, 1)$, and the [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch) applied to $\nu_i$ relative to $|\nu_i|$ shows that the density $d\nu_i / d|\nu_i|$, of modulus $1$ at $|\nu_i|$-a.e. point, is a limit of vanishing ball ratios. Hence $|\nu_i| = 0$, which determines $\mu$ entirely: $\mu = g(x_0) \mathcal{L}^n|_{B(0, 1)}$. As every weak* convergent subsequence has the same limit, the full family converges: $Dv_r \overset{*}{\rightharpoonup} g(x_0)\, \mathcal{L}^n|_{B(0, 1)}$. *The $L^1$-limit has gradient $g(x_0)$.* The $L^1$-limit $v$ satisfies, for any $\varphi \in C_c^1(B(0, 1); \mathbb{R}^n)$, \begin{align*} \langle Dv, \varphi\rangle = -\int v\, \operatorname{div}\varphi\, d\mathcal{L}^n = -\lim_{j \to \infty} \int v_{r_{k_j}} \operatorname{div}\varphi\, d\mathcal{L}^n = \lim_{j \to \infty} \langle Dv_{r_{k_j}}, \varphi\rangle = \int g(x_0) \cdot \varphi\, d\mathcal{L}^n, \end{align*} where the first limit uses the $L^1$-convergence of $v_{r_{k_j}}$ to $v$ and the boundedness of $\operatorname{div}\varphi$, and the last uses weak* convergence of the gradient measures applied to $\varphi$. This identifies $Dv = g(x_0)\, \mathcal{L}^n|_{B(0, 1)}$ in the distributional sense, i.e., $\nabla v = g(x_0)$ a.e. So $v$ is affine: $v(y) = g(x_0) \cdot y + b$ for a constant $b$. 
The zero-mean condition $\int_{B(0, 1)} v\, d\mathcal{L}^n = 0$ (inherited from the $v_r$ under $L^1$-convergence) forces $b = 0$: since $\int_{B(0, 1)} y\, d\mathcal{L}^n(y) = 0$ by symmetry, $\int_{B(0, 1)} v\, d\mathcal{L}^n = b\, \omega_n$. Hence $v(y) = g(x_0) \cdot y$. *Full-family convergence.* Since the limit $v_*(y) = g(x_0) \cdot y$ is unique and every subsequence has a sub-subsequence converging to $v_*$, the full family $\{v_r\}_{r > 0}$ converges to $v_*$ in $L^1(B(0, 1))$ as $r \to 0$. [/guided] [/step] [step:Translate $L^1$-convergence to approximate differentiability of $u$ at $x_0$] The convergence $v_r \to v_*$ in $L^1(B(0, 1))$ unwraps via the change of variables $x = x_0 + ry$: \begin{align*} \int_{B(0, 1)} |v_r(y) - g(x_0) \cdot y|\, d\mathcal{L}^n(y) = r^{-n} \int_{B(x_0, r)} \frac{|u(x) - c_r - g(x_0) \cdot (x - x_0)|}{r}\, d\mathcal{L}^n(x). \end{align*} Dividing both sides by $\omega_n$ and using $\omega_n r^n = \mathcal{L}^n(B(x_0, r))$: \begin{align*} \frac{1}{r}\fint_{B(x_0, r)} |u(x) - c_r - g(x_0) \cdot (x - x_0)|\, d\mathcal{L}^n(x) = \omega_n^{-1} \int_{B(0, 1)} |v_r(y) - g(x_0) \cdot y|\, d\mathcal{L}^n(y) \to 0 \end{align*} as $r \to 0$. By the triangle inequality, \begin{align*} &\frac{1}{r}\fint_{B(x_0, r)} |u(x) - u(x_0) - g(x_0)\cdot(x - x_0)|\, d\mathcal{L}^n(x) \\ &\quad \le \frac{1}{r}\fint_{B(x_0, r)} |u(x) - c_r - g(x_0)\cdot(x - x_0)|\, d\mathcal{L}^n(x) + \frac{|c_r - u(x_0)|}{r}. \end{align*} The first term tends to $0$ as just shown. The crux is to show that the second term vanishes: \begin{align*} \frac{|c_r - u(x_0)|}{r} \to 0 \quad \text{as } r \to 0. \end{align*} We establish this rate-$r$ statement at $\mathcal{L}^n$-a.e. $x_0$ by combining three facts. **(a) The Lebesgue-point property at $x_0$**, supplied by the [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch) applied to $u \in L^1_{\mathrm{loc}}(\Omega)$: at $\mathcal{L}^n$-a.e. $x_0 \in \Omega$, \begin{align*} \fint_{B(x_0, r)} |u(x) - u(x_0)|\, d\mathcal{L}^n(x) \to 0 \quad \text{as } r \to 0. \end{align*} By Jensen's inequality, $|c_r - u(x_0)| \le \fint_{B(x_0, r)} |u(x) - u(x_0)|\, d\mathcal{L}^n(x)$, so $|c_r - u(x_0)| = o(1)$ at every Lebesgue point. This alone gives no rate. (By the [BV Lebesgue-point theorem](/theorems/bv-lebesgue-point), the precise representative $\widetilde{u}(x_0)$ exists at $\mathcal{L}^n$-a.e. $x_0$, and $\widetilde{u}(x_0) = u(x_0)$ at Lebesgue points by Step 5.) **(b) The BV Poincaré inequality**, supplied by [Poincaré in Balls](/theorems/3103) extended to BV: for every ball $B(x_0, r) \subseteq \Omega$ and every $w \in BV(B(x_0, r))$ with mean $\bar{w}_r := \fint_{B(x_0, r)} w\, d\mathcal{L}^n$, \begin{align*} \fint_{B(x_0, r)} |w(x) - \bar{w}_r|\, d\mathcal{L}^n(x) \le C_n \cdot r \cdot \frac{|Dw|(B(x_0, r))}{\mathcal{L}^n(B(x_0, r))}. \end{align*} The factor of $r$ on the right-hand side is the rate-$r$ scaling that BV Poincaré provides and that is unavailable for arbitrary $L^1_{\mathrm{loc}}$ functions. **(c) Vanishing density of the linearisation error.** Set $w(x) := u(x) - g(x_0) \cdot (x - x_0)$. Then $Dw = (g - g(x_0))\, \mathcal{L}^n + D^s u$, so \begin{align*} \varepsilon(r) := \frac{|Dw|(B(x_0, r))}{\mathcal{L}^n(B(x_0, r))} \le \fint_{B(x_0, r)} |g - g(x_0)|\, d\mathcal{L}^n + \frac{|D^s u|(B(x_0, r))}{\mathcal{L}^n(B(x_0, r))} \to 0 \quad \text{as } r \to 0 \end{align*} at every $x_0$ that is a Lebesgue point of $g$ and at which the singular density of Step 1 vanishes, hence at $\mathcal{L}^n$-a.e. $x_0$. Combining (a), (b) and (c). Since $\fint_{B(x_0, r)} g(x_0) \cdot (x - x_0)\, d\mathcal{L}^n(x) = 0$ by symmetry, the mean of $w$ on $B(x_0, r)$ is again $c_r$, and (b) applied to $w$ gives $\fint_{B(x_0, r)} |w - c_r|\, d\mathcal{L}^n \le C_n\, r\, \varepsilon(r)$. Comparing means on consecutive dyadic balls, \begin{align*} |c_{r/2} - c_r| \le \fint_{B(x_0, r/2)} |w - c_r|\, d\mathcal{L}^n \le 2^n \fint_{B(x_0, r)} |w - c_r|\, d\mathcal{L}^n \le 2^n C_n\, r\, \varepsilon(r). \end{align*} By (a), $c_{2^{-k} r} \to u(x_0)$ as $k \to \infty$, so telescoping over the scales $2^{-k} r$ gives \begin{align*} |c_r - u(x_0)| \le \sum_{k = 0}^{\infty} |c_{2^{-k-1} r} - c_{2^{-k} r}| \le 2^n C_n \sum_{k = 0}^{\infty} 2^{-k} r\, \varepsilon(2^{-k} r) \le 2^{n+1} C_n\, r \sup_{0 < s \le r} \varepsilon(s). \end{align*} Dividing by $r$, \begin{align*} \frac{|c_r - u(x_0)|}{r} \le 2^{n+1} C_n \sup_{0 < s \le r} \varepsilon(s) \to 0 \quad \text{as } r \to 0. \end{align*} This is the absorption needed. Returning to the triangle-inequality decomposition, \begin{align*} \frac{1}{r}\fint_{B(x_0, r)} |u(x) - u(x_0) - g(x_0) \cdot (x - x_0)|\, d\mathcal{L}^n(x) \to 0 \quad \text{as } r \to 0. \end{align*} This is the defining property of approximate differentiability of $u$ at $x_0$ with approximate gradient $\nabla u(x_0) = g(x_0)$. [guided] We unwrap the $L^1$-convergence of the rescaled functions into a statement about $u$ near $x_0$. *Change of variables in the integral.* The substitution $x = x_0 + ry$, $d\mathcal{L}^n(y) = r^{-n}\, d\mathcal{L}^n(x)$, transforms the unit-ball $L^1$-norm of $v_r - v_*$ into an $L^1$-norm on $B(x_0, r)$: \begin{align*} \int_{B(0, 1)} \big|v_r(y) - g(x_0) \cdot y\big|\, d\mathcal{L}^n(y) &= \int_{B(0, 1)}\Big|\frac{u(x_0 + ry) - c_r}{r} - g(x_0) \cdot y\Big|\, d\mathcal{L}^n(y) \\ &= r^{-n}\int_{B(x_0, r)} \Big|\frac{u(x) - c_r}{r} - g(x_0) \cdot \frac{x - x_0}{r}\Big|\, d\mathcal{L}^n(x) \\ &= r^{-n - 1}\int_{B(x_0, r)} |u(x) - c_r - g(x_0) \cdot (x - x_0)|\, d\mathcal{L}^n(x). \end{align*} Dividing by $\omega_n$ and recognising $\omega_n r^n = \mathcal{L}^n(B(x_0, r))$: \begin{align*} \frac{1}{r}\fint_{B(x_0, r)} |u(x) - c_r - g(x_0) \cdot (x - x_0)|\, d\mathcal{L}^n(x) = \omega_n^{-1}\int_{B(0, 1)} |v_r(y) - g(x_0) \cdot y|\, d\mathcal{L}^n(y) \to 0 \end{align*} as $r \to 0$, by Step 3. 
*Replacing $c_r$ with $u(x_0)$.* The triangle inequality gives \begin{align*} \frac{1}{r}\fint |u - u(x_0) - g(x_0)\cdot(\cdot - x_0)|\, d\mathcal{L}^n \le \frac{1}{r}\fint |u - c_r - g(x_0)\cdot(\cdot - x_0)|\, d\mathcal{L}^n + \frac{|c_r - u(x_0)|}{r}. \end{align*} The first term tends to $0$ as just shown. The crux is to verify that the second term vanishes: \begin{align*} \frac{|c_r - u(x_0)|}{r} \to 0 \quad \text{as } r \to 0. \end{align*} This rate-$r$ absorption is established by combining the Lebesgue-point property at $x_0$ with the BV Poincaré inequality applied to the linearisation error. *The Lebesgue-point step.* The [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch), applied to the $L^1_{\mathrm{loc}}$ function $u$ (since $u \in BV(\Omega) \subseteq L^1(\Omega) \subseteq L^1_{\mathrm{loc}}(\Omega)$), yields at $\mathcal{L}^n$-a.e. $x_0$, \begin{align*} \fint_{B(x_0, r)} |u(x) - u(x_0)|\, d\mathcal{L}^n(x) \to 0, \end{align*} and Jensen's inequality bounds $|c_r - u(x_0)| \le \fint_{B(x_0, r)}|u - u(x_0)|\, d\mathcal{L}^n$. This gives only $|c_r - u(x_0)| = o(1)$, *not* $o(r)$. *The BV Poincaré step.* The rate-$r$ improvement comes from the [Poincaré in Balls](/theorems/3103) inequality applied at the scale $r$ to $w(x) := u(x) - g(x_0) \cdot (x - x_0)$, whose mean on $B(x_0, r)$ is again $c_r$ because the linear term has zero mean on balls centred at $x_0$: \begin{align*} \fint_{B(x_0, r)} |w(x) - c_r|\, d\mathcal{L}^n(x) \le C_n \cdot r \cdot \frac{|Dw|(B(x_0, r))}{\mathcal{L}^n(B(x_0, r))} =: C_n\, r\, \varepsilon(r). \end{align*} The factor of $r$ here is the rate that BV provides over generic $L^1_{\mathrm{loc}}$. Since $Dw = (g - g(x_0))\, \mathcal{L}^n + D^s u$, we have $\varepsilon(r) \to 0$ whenever $x_0$ is a Lebesgue point of $g$ and the singular density of Step 1 vanishes at $x_0$, which holds at $\mathcal{L}^n$-a.e. $x_0 \in \Omega$. (Note that the linear term must be subtracted: the quantity $\frac{1}{r}\fint_{B(x_0, r)} |u - u(x_0)|\, d\mathcal{L}^n$ itself does not tend to $0$ when $g(x_0) \neq 0$.) *Dyadic telescoping.* Halving the radius costs a factor $2^n$ in the average, so \begin{align*} |c_{r/2} - c_r| \le \fint_{B(x_0, r/2)} |w - c_r|\, d\mathcal{L}^n \le 2^n \fint_{B(x_0, r)} |w - c_r|\, d\mathcal{L}^n \le 2^n C_n\, r\, \varepsilon(r). \end{align*} At Lebesgue points of $u$, $c_{2^{-k} r} \to u(x_0) = \widetilde{u}(x_0)$ as $k \to \infty$ (Step 5). Summing the geometric series over the scales $2^{-k} r$, \begin{align*} \frac{|c_r - u(x_0)|}{r} \le \frac{1}{r} \sum_{k = 0}^{\infty} |c_{2^{-k-1} r} - c_{2^{-k} r}| \le 2^{n+1} C_n \sup_{0 < s \le r} \varepsilon(s) \to 0, \end{align*} which is the rate-$r$ absorption. *Conclusion.* At $\mathcal{L}^n$-a.e. $x_0$, the integral \begin{align*} \fint_{B(x_0, r)}\frac{|u(x) - u(x_0) - g(x_0)\cdot(x - x_0)|}{r}\, d\mathcal{L}^n(x) \to 0 \quad \text{as } r \to 0. \end{align*} This is the definition of $u$ being approximately differentiable at $x_0$ with approximate gradient $\nabla u(x_0) = g(x_0)$. Since $g$ is the Radon–Nikodym density of $D^a u$ relative to $\mathcal{L}^n$, the identity $D^a u = \nabla u \cdot \mathcal{L}^n$ follows. [/guided] [/step] [step:Identify $\widetilde{u}(x_0) = u(x_0)$ at $\mathcal{L}^n$-a.e. $x_0$] The approximate limit $\widetilde{u}(x_0)$ is, by definition, the unique value $\alpha \in \mathbb{R}$ (when it exists) such that \begin{align*} \lim_{r \to 0}\fint_{B(x_0, r)} |u(x) - \alpha|\, d\mathcal{L}^n(x) = 0. \end{align*} Apply the [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch) to $u \in L^1_{\mathrm{loc}}(\Omega)$ (which holds since $u \in BV(\Omega) \subseteq L^1(\Omega)$). At $\mathcal{L}^n$-a.e. $x_0 \in \Omega$ (the Lebesgue points of $u$), \begin{align*} \lim_{r \to 0}\fint_{B(x_0, r)} |u(x) - u(x_0)|\, d\mathcal{L}^n(x) = 0. \end{align*} At such $x_0$, the value $\alpha = u(x_0)$ realises the approximate-limit condition, so $\widetilde{u}(x_0) = u(x_0)$. Since the set of Lebesgue points has full $\mathcal{L}^n$-measure in $\Omega$, $\widetilde{u}(x_0) = u(x_0)$ for $\mathcal{L}^n$-a.e. $x_0 \in \Omega$. This completes the proof of all three claims: the existence of $\nabla u$ a.e., the identity $D^a u = \nabla u \cdot \mathcal{L}^n$, and $\widetilde{u} = u$ a.e. 
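The role of the exceptional null set in this step can be seen in a small worked example. This is a sketch under the assumption $n = 1$, $\Omega = (-1, 1)$, $u(x) = a x + \mathbf{1}_{(0, 1)}(x)$ with $a \in \mathbb{R}$, none of which appears in the original argument:

```latex
\begin{align*}
&x_0 \neq 0,\ 0 < r < \min(|x_0|, 1 - |x_0|): &
\fint_{B(x_0, r)} |u(x) - u(x_0)|\, dx &= \fint_{B(x_0, r)} |a|\, |x - x_0|\, dx = \frac{|a|\, r}{2} \to 0, \\
&x_0 = 0,\ 0 < r < 1,\ \alpha \in \mathbb{R}: &
\fint_{B(0, r)} |u(x) - \alpha|\, dx &\ge \frac{|\alpha| + |1 - \alpha|}{2} - \frac{|a|\, r}{2} \ge \frac{1}{2} - \frac{|a|\, r}{2}.
\end{align*}
```

So every $x_0 \neq 0$ is a Lebesgue point with $\widetilde{u}(x_0) = u(x_0)$, while no value $\alpha$ can serve as an approximate limit at the jump point $x_0 = 0$, which is a set of $\mathcal{L}^1$-measure zero.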
[guided] The final assertion is the $\mathcal{L}^n$-a.e. identification of the approximate limit and the precise representative. The defining property of the approximate limit at $x_0$: $\widetilde{u}(x_0) = \alpha$ when \begin{align*} \lim_{r \to 0}\fint_{B(x_0, r)}|u(x) - \alpha|\, d\mathcal{L}^n(x) = 0, \end{align*} and $\alpha$ is the unique such value when it exists. The hypotheses of the [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch): $u \in L^1_{\mathrm{loc}}(\Omega)$. Verification: $u \in BV(\Omega)$ implies $u \in L^1(\Omega) \subseteq L^1_{\mathrm{loc}}(\Omega)$ (the $L^1$-component of the $BV$-norm guarantees integrability on $\Omega$, and locally so on any compact subset). The conclusion of the differentiation theorem is the strong form \begin{align*} \lim_{r \to 0}\fint_{B(x_0, r)} |u(x) - u(x_0)|\, d\mathcal{L}^n(x) = 0 \quad \text{at $\mathcal{L}^n$-a.e. } x_0. \end{align*} At such $x_0$ — the *Lebesgue points* of $u$ — the value $\alpha = u(x_0)$ realises the approximate-limit condition. By uniqueness of the approximate limit (when it exists), $\widetilde{u}(x_0) = u(x_0)$. Since the Lebesgue points form a set of full $\mathcal{L}^n$-measure (a standard consequence of the [Lebesgue–Besicovitch differentiation theorem](/theorems/lebesgue-besicovitch), holding for any $L^1_{\mathrm{loc}}$ function), the identity $\widetilde{u}(x_0) = u(x_0)$ holds at $\mathcal{L}^n$-a.e. $x_0 \in \Omega$. This completes the proof of all three claims of the theorem: (i) the approximate gradient $\nabla u(x_0)$ exists for $\mathcal{L}^n$-a.e. $x_0$ (Step 4), (ii) the identity $D^a u = \nabla u \cdot \mathcal{L}^n$ holds (Step 4 via the Radon–Nikodym density identification), and (iii) $\widetilde{u}(x_0) = u(x_0)$ at $\mathcal{L}^n$-a.e. $x_0$ (this step). [/guided] [/step]
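
A classical example shows why the theorem identifies $\nabla u$ with the density of $D^a u$ only, and not with $Du$ itself. The following sketch assumes $u$ is the Cantor–Vitali function on $\Omega = (0, 1)$ and $C$ is the middle-thirds Cantor set; both are standard objects introduced here for illustration and are not part of the proof above:

```latex
\begin{align*}
&u \text{ continuous and non-decreasing},\quad u(0^+) = 0,\quad u(1^-) = 1
  &&\Longrightarrow\quad u \in BV(\Omega),\quad |Du|(\Omega) = 1, \\
&u \text{ locally constant on } \Omega \setminus C,\quad \mathcal{L}^1(C) = 0
  &&\Longrightarrow\quad |Du|(\Omega \setminus C) = 0,\quad Du \perp \mathcal{L}^1, \\
&D^a u = 0,\quad D^s u = Du,\quad g = 0 \ \ \mathcal{L}^1\text{-a.e.}
  &&\Longrightarrow\quad \nabla u(x_0) = 0 = g(x_0) \ \text{ for every } x_0 \in \Omega \setminus C.
\end{align*}
```

Here $D^a u = \nabla u \cdot \mathcal{L}^1 = 0$ even though $Du \neq 0$: the approximate gradient carries no information about the singular part.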
