Lebesgue Differentiation Theorem — Statement & Proof

Theorem

Proof

[proofplan] We prove that almost every point in $\mathbb{R}^n$ is a Lebesgue point of $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$. First, we reduce to $f \in L^1(\mathbb{R}^n)$ by a localisation argument. Then we approximate $f$ by a continuous compactly supported function $g$, decompose the oscillation $T_r(f)(x)$ via the triangle inequality into a term involving $g$ (which vanishes as $r \to 0$ by continuity) and terms involving the error $h = f - g$ controlled by the Hardy-Littlewood maximal function. The weak-type $(1,1)$ bound for the maximal operator and Chebyshev's inequality show the "bad set" has measure at most $C\varepsilon/\delta$, and since $\varepsilon$ is arbitrary, the bad set is null for each threshold $\delta$. [/proofplan] [step:Reduce to $f \in L^1(\mathbb{R}^n)$ by localisation] It suffices to prove the theorem for $f \in L^1(\mathbb{R}^n)$. For general $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$, define $f_k := f \cdot \mathbb{1}_{B(0,k)}$ for each $k \in \mathbb{N}$. Each $f_k \in L^1(\mathbb{R}^n)$, and for $x \in B(0, k)$ and $r < \operatorname{dist}(x, \partial B(0,k))$, we have $f_k = f$ on $B(x,r)$, so $x$ is a Lebesgue point of $f$ if and only if it is a Lebesgue point of $f_k$. If the theorem holds for each $f_k$, the set of non-Lebesgue points of $f$ inside $B(0,k)$ is a null set for every $k$, and the full set of non-Lebesgue points is $\bigcup_{k=1}^\infty (\text{non-Lebesgue points of } f \text{ in } B(0,k))$, a countable union of null sets, hence null. We assume $f \in L^1(\mathbb{R}^n)$ for the remainder. [/step] [step:Define the oscillation functional and decompose via a continuous approximation] For $x \in \mathbb{R}^n$ and $r > 0$, define the average oscillation: \begin{align*} T_r(f)(x) := \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |f(y) - f(x)| \, d\mathcal{L}^n(y). \end{align*} We must show $\limsup_{r \to 0^+} T_r(f)(x) = 0$ for $\mathcal{L}^n$-a.e. $x$. Fix $\varepsilon > 0$. 
Since $C_c(\mathbb{R}^n)$ is dense in $L^1(\mathbb{R}^n)$, there exists $g \in C_c(\mathbb{R}^n)$ with $\|f - g\|_{L^1} < \varepsilon$. Write $h := f - g$, so $f = g + h$ and $\|h\|_{L^1} < \varepsilon$. By the triangle inequality: \begin{align*} T_r(f)(x) &\le \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |g(y) - g(x)| \, d\mathcal{L}^n(y) \\ &\quad + \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |h(y)| \, d\mathcal{L}^n(y) + |h(x)|. \end{align*} The second term on the right is bounded above by the Hardy-Littlewood maximal function $M(h)(x)$, defined by: \begin{align*} M(h)(x) := \sup_{r > 0} \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |h(y)| \, d\mathcal{L}^n(y). \end{align*} Therefore: \begin{align*} T_r(f)(x) \le \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |g(y) - g(x)| \, d\mathcal{L}^n(y) + M(h)(x) + |h(x)|. \end{align*} [guided] The strategy is to split $f$ into a "nice" part $g$ (continuous, compactly supported) and a "small" error $h = f - g$. For the nice part, the oscillation vanishes as $r \to 0$ by continuity. For the error, we bound the oscillation using the maximal function, which we can control in measure via the weak-type inequality. Fix $\varepsilon > 0$. Density of $C_c(\mathbb{R}^n)$ in $L^1(\mathbb{R}^n)$ provides $g \in C_c(\mathbb{R}^n)$ with $\|f - g\|_{L^1} < \varepsilon$. Set $h = f - g$. We decompose the oscillation using $f(y) - f(x) = (g(y) - g(x)) + (h(y) - h(x))$ and the triangle inequality: \begin{align*} |f(y) - f(x)| \le |g(y) - g(x)| + |h(y)| + |h(x)|. \end{align*} Averaging over $B(x,r)$: \begin{align*} T_r(f)(x) \le \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |g(y) - g(x)| \, d\mathcal{L}^n(y) + \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |h(y)| \, d\mathcal{L}^n(y) + |h(x)|. \end{align*} The second term is at most $M(h)(x)$ by definition of the Hardy-Littlewood maximal function. The third term $|h(x)|$ arises because $|h(x)|$ is constant in $y$ and factors out of the average. 
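As a concrete reference point for the oscillation $T_r(f)(x)$ being decomposed here, the following worked example evaluates it directly. It is an illustrative sketch and not part of the proof: the choices $n = 1$ and $f = \mathbb{1}_{[0,1]}$ (with the representative taking the value $1$ at both endpoints) are assumptions made for the example.

```latex
% Assumed example: n = 1, f = 1_{[0,1]}, so B(x,r) = (x-r, x+r) and L^1(B(x,r)) = 2r.
\begin{align*}
% Interior point: f is identically 1 on the whole ball, so there is no oscillation.
0 < x < 1,\ r < \min(x, 1 - x): \quad
  T_r(f)(x) &= \frac{1}{2r} \int_{x-r}^{x+r} |1 - 1| \, dy = 0, \\
% Endpoint x = 0 with f(0) = 1: the integrand equals 1 on (-r, 0) and 0 on [0, r).
x = 0,\ 0 < r \le 1: \quad
  T_r(f)(0) &= \frac{1}{2r} \int_{-r}^{r} |f(y) - 1| \, dy = \frac{r}{2r} = \frac{1}{2}.
\end{align*}
```

In this example $\limsup_{r \to 0^+} T_r(f)(0) = \tfrac{1}{2} \neq 0$, so $0$ is not a Lebesgue point, and by symmetry neither is $1$. For $x \notin [0,1]$ and $r < \operatorname{dist}(x, [0,1])$ the function vanishes on the ball, so $T_r(f)(x) = 0$. The exceptional set is therefore $\{0, 1\}$, which is null, consistent with the theorem.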
[/guided] [/step] [step:Take $\limsup$ as $r \to 0$ and eliminate the continuous part] Since $g$ is continuous, for every $x \in \mathbb{R}^n$: \begin{align*} \lim_{r \to 0^+} \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |g(y) - g(x)| \, d\mathcal{L}^n(y) = 0. \end{align*} Taking $\limsup_{r \to 0^+}$ in the decomposition inequality: \begin{align*} \Omega(f)(x) := \limsup_{r \to 0^+} T_r(f)(x) \le M(h)(x) + |h(x)|. \end{align*} [guided] Why does the continuous term vanish? Fix $x$ and $\varepsilon' > 0$. By continuity of $g$ at $x$, there exists $\rho > 0$ such that $|g(y) - g(x)| < \varepsilon'$ whenever $|y - x| < \rho$. For $r < \rho$, every $y \in B(x,r)$ satisfies $|y - x| < r < \rho$, so: \begin{align*} \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |g(y) - g(x)| \, d\mathcal{L}^n(y) < \varepsilon'. \end{align*} Since $\varepsilon' > 0$ was arbitrary, the limit is $0$. The remaining terms $M(h)(x) + |h(x)|$ do not depend on $r$, so they pass through the $\limsup$ unchanged, which gives the bound $\Omega(f)(x) \le M(h)(x) + |h(x)|$. [/guided] [/step] [step:Bound the measure of the singular set using the weak-type $(1,1)$ inequality] For $\delta > 0$, define the singular set $E_\delta := \{x \in \mathbb{R}^n : \Omega(f)(x) > \delta\}$. (Should the measurability of $E_\delta$ be in doubt, read $\mathcal{L}^n(E_\delta)$ below as Lebesgue outer measure: the estimates are unchanged, and a set of outer measure zero is Lebesgue measurable.) From the bound $\Omega(f)(x) \le M(h)(x) + |h(x)|$: \begin{align*} E_\delta \subseteq \{x : M(h)(x) > \delta/2\} \cup \{x : |h(x)| > \delta/2\}. \end{align*} The Hardy-Littlewood maximal inequality (weak-type $(1,1)$) provides a dimensional constant $C = C(n) > 0$ with: \begin{align*} \mathcal{L}^n(\{M(h) > \delta/2\}) \le \frac{2C}{\delta} \|h\|_{L^1}. \end{align*} Chebyshev's inequality (Markov's inequality) gives: \begin{align*} \mathcal{L}^n(\{|h| > \delta/2\}) \le \frac{2}{\delta} \|h\|_{L^1}. \end{align*} Combining the two estimates by subadditivity: \begin{align*} \mathcal{L}^n(E_\delta) \le \frac{2(C + 1)}{\delta} \|h\|_{L^1} < \frac{2(C + 1)}{\delta} \varepsilon. 
\end{align*} [guided] We need to show the set where $\Omega(f) > \delta$ has measure zero. The bound $\Omega(f)(x) \le M(h)(x) + |h(x)|$ means that if $\Omega(f)(x) > \delta$, then at least one of $M(h)(x)$ or $|h(x)|$ exceeds $\delta/2$ (otherwise their sum would be at most $\delta$). Therefore: \begin{align*} E_\delta \subseteq \{M(h) > \delta/2\} \cup \{|h| > \delta/2\}. \end{align*} For the first set, we use the weak-type $(1,1)$ bound for the Hardy-Littlewood maximal operator. This is a fundamental result in harmonic analysis: there exists a constant $C > 0$ depending only on the dimension $n$ such that for any $u \in L^1(\mathbb{R}^n)$ and $\lambda > 0$, $\mathcal{L}^n(\{M(u) > \lambda\}) \le \frac{C}{\lambda} \|u\|_{L^1}$. Applying this with $u = h$ and $\lambda = \delta/2$: \begin{align*} \mathcal{L}^n(\{M(h) > \delta/2\}) \le \frac{2C}{\delta} \|h\|_{L^1}. \end{align*} For the second set, Chebyshev's inequality gives $\mathcal{L}^n(\{|h| > \delta/2\}) \le \frac{2}{\delta} \|h\|_{L^1}$. Adding both estimates and using $\|h\|_{L^1} = \|f - g\|_{L^1} < \varepsilon$: \begin{align*} \mathcal{L}^n(E_\delta) \le \frac{2C}{\delta} \|h\|_{L^1} + \frac{2}{\delta} \|h\|_{L^1} = \frac{2(C+1)}{\delta} \|h\|_{L^1} < \frac{2(C+1)}{\delta} \varepsilon. \end{align*} [/guided] [/step] [step:Send $\varepsilon \to 0$ and take a countable union to conclude] Since $\varepsilon > 0$ was arbitrary and $\mathcal{L}^n(E_\delta) < \frac{2(C+1)}{\delta} \varepsilon$ for every $\varepsilon > 0$, it follows that $\mathcal{L}^n(E_\delta) = 0$ for each $\delta > 0$. The set of points where $\Omega(f)(x) > 0$ is: \begin{align*} \{x : \Omega(f)(x) > 0\} = \bigcup_{k=1}^{\infty} E_{1/k}. \end{align*} As a countable union of null sets, this set has $\mathcal{L}^n$-measure zero. Therefore $\limsup_{r \to 0^+} T_r(f)(x) = 0$ for $\mathcal{L}^n$-a.e. $x$, which means $\lim_{r \to 0^+} T_r(f)(x) = 0$ a.e. (since $T_r(f)(x) \ge 0$, the $\limsup$ being zero forces the limit to exist and equal zero). That is, almost every $x \in \mathbb{R}^n$ is a Lebesgue point of $f$. 
[guided] The final step ties the argument together. The bound $\mathcal{L}^n(E_\delta) < \frac{2(C+1)}{\delta} \varepsilon$ holds for every $\varepsilon > 0$ (we can always find a better continuous approximation $g$). Since the left-hand side $\mathcal{L}^n(E_\delta)$ does not depend on $\varepsilon$, we must have $\mathcal{L}^n(E_\delta) = 0$. This holds for every $\delta > 0$. The set where $\Omega(f)(x) > 0$ -- the set of non-Lebesgue points -- equals $\bigcup_{k=1}^{\infty} E_{1/k}$: a point $x$ with $\Omega(f)(x) > 0$ satisfies $\Omega(f)(x) > 1/k$ for some $k$, so $x \in E_{1/k}$. Conversely, every $x \in E_{1/k}$ has $\Omega(f)(x) > 1/k > 0$. Since each $E_{1/k}$ has measure zero and the union is countable: \begin{align*} \mathcal{L}^n(\{x : \Omega(f)(x) > 0\}) \le \sum_{k=1}^{\infty} \mathcal{L}^n(E_{1/k}) = 0. \end{align*} Therefore $\Omega(f)(x) = 0$ for $\mathcal{L}^n$-a.e. $x$. Since $T_r(f)(x) \ge 0$ for all $r$ and $\limsup_{r \to 0^+} T_r(f)(x) = 0$, the limit exists and equals $0$ at almost every point. [/guided] [/step]
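The weak-type $(1,1)$ inequality used above can be checked by hand in a simple case. The following is an illustrative sketch only: the choices $n = 1$ and $u = \mathbb{1}_{[0,1]}$ are assumptions made for the example, and the constant $1$ that appears is specific to this $u$, not a claim about the general constant $C(n)$.

```latex
% Assumed example: n = 1, u = 1_{[0,1]}, so \|u\|_{L^1} = 1 and B(x,r) = (x-r, x+r).
\begin{align*}
% For x >= 1 the average over (x-r, x+r) is largest at r = x, where the ball
% first contains all of [0,1]; the case x <= 0 follows by the symmetry x -> 1 - x.
M(u)(x) &=
  \begin{cases}
    1, & 0 < x < 1, \\
    \dfrac{1}{2x}, & x \ge 1, \\
    \dfrac{1}{2(1 - x)}, & x \le 0,
  \end{cases} \\
% Level sets for 0 < \lambda < 1/2: solve 1/(2x) > \lambda and 1/(2(1-x)) > \lambda.
\{M(u) > \lambda\} &= \Bigl(1 - \frac{1}{2\lambda},\ \frac{1}{2\lambda}\Bigr), \\
\mathcal{L}^1(\{M(u) > \lambda\}) &= \frac{1}{\lambda} - 1 \le \frac{1}{\lambda} \|u\|_{L^1}.
\end{align*}
```

The tails $\tfrac{1}{2x}$ are not integrable, so $M(u) \notin L^1(\mathbb{R})$ even though $u \in L^1(\mathbb{R})$. This is why the proof relies on a weak-type bound on the measure of the level sets rather than on an estimate for $\|M(h)\|_{L^1}$.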
