A powerful approach in modern analysis is to view a partial differential equation not as a pointwise equality, but as an operator equation:
\begin{align*}
L(u) = f.
\end{align*}
Here, $L$ is a differential operator (e.g., the Laplacian $\Delta$ or a derivative $d/dx$). To understand the solvability of this equation using functional analysis, we must treat $L$ as a mapping between specific vector spaces, $L: X \to Y$.
The classical choice of spaces, such as $X = C^2(U)$ and $Y=C^0(U)$, is often inadequate. These spaces are not complete with respect to integral norms such as the $L^2$ norm, which prevents us from using powerful tools like the Spectral Theorem or the Riesz Representation Theorem. Furthermore, classical spaces exclude physically valid "rough" solutions.
[example: The Failure of Classical Spaces]
Consider the problem of finding $u$ on $U=(-1,1)$ such that $u'(x) = \text{sgn}(x)$.
The natural candidate is $u(x) = |x|$. However, strictly speaking, $u \notin C^1(U)$ because it is not differentiable at $x=0$. From the perspective of classical operators, $u$ is not in the domain of the differentiation operator.
However, if we test $u(x)=|x|$ against a test function $\phi \in C^\infty_c(U)$ using integration by parts:
\begin{align*}
\int_{-1}^1 |x| \phi'(x) \, dx &= \int_{-1}^0 (-x)\phi'(x) \, dx + \int_{0}^1 x\phi'(x) \, dx \\
&= \int_{-1}^0 \phi(x) \, dx - \int_{0}^1 \phi(x) \, dx \\
&= - \int_{-1}^1 \text{sgn}(x) \phi(x) \, dx
\end{align*}
The boundary terms vanish because $\phi$ has compact support, and the integral identity holds exactly: $u$ solves the equation in this weak, integrated sense, with $\text{sgn}(x)$ playing the role of its derivative.
[/example]
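As a numerical sanity check (illustrative only; the quadrature routine and the particular test function are our own choices, not part of the argument), we can verify the integral identity above for a sample asymmetric bump $\phi(x) = (x + 0.3)\, e^{1/(x^2-1)} \in C^\infty_c((-1,1))$:

```python
import math

# Check \int_{-1}^1 |x| phi'(x) dx = -\int_{-1}^1 sgn(x) phi(x) dx for a
# sample test function phi(x) = (x + 0.3) * exp(1/(x^2 - 1)) on (-1, 1).

def bump(x):
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    # d/dx exp(1/(x^2 - 1)) = exp(1/(x^2 - 1)) * (-2x)/(x^2 - 1)^2
    return bump(x) * (-2.0 * x) / (x * x - 1.0) ** 2 if abs(x) < 1.0 else 0.0

def phi(x):
    return (x + 0.3) * bump(x)

def phi_prime(x):
    return bump(x) + (x + 0.3) * bump_prime(x)

def midpoint(f, a, b, n=200_000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

lhs = midpoint(lambda x: abs(x) * phi_prime(x), -1.0, 1.0)
rhs = -midpoint(lambda x: math.copysign(1.0, x) * phi(x), -1.0, 1.0)
print(lhs, rhs)  # the two integrals agree to quadrature accuracy
```

The asymmetry of $\phi$ matters: for an even test function both sides vanish by symmetry, and the check would be vacuous.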
**Constructing the Correct Space**
To resolve this, we do not discard the candidate solution; we strictly enlarge the space of admissible functions. We want a Banach space $X$ that:
1. Includes functions with "weak" derivatives (like our example).
2. Is complete, allowing us to take limits of sequences of functions.
This leads us to **Sobolev spaces**, denoted $W^{k,p}(U)$. These are essentially $L^p$ spaces that carry enough structure to support differential operators up to order $k$.
## Definition
[definition: Sobolev Space]
The **Sobolev space** $W^{k,p}(U)$ consists of all locally summable functions $u: U \to \mathbb{R}$ such that for each multi-index $\alpha$ with $|\alpha| \leq k$, the weak derivative $D^{\alpha}u$ exists and belongs to $L^p(U)$.
[/definition]
If $p=2$, we usually write $H^k(U) = W^{k,2}(U)$ for $k=0, 1, 2, \dots$.
To do analysis on this space (taking limits, measuring approximation errors), we must equip it with a norm. Let us define the Sobolev norm.
[definition: Sobolev Norm]
For a function $u \in W^{k,p}(U)$, the **Sobolev norm** is defined as follows:
- For $1 \leq p < \infty$:
\begin{align*} \|u\|_{W^{k,p}(U)} := \left( \sum_{|\alpha| \leq k} \|D^{\alpha}u\|_{L^p(U)}^p \right)^{1/p}\end{align*}
- For $p = \infty$:
\begin{align*} \|u\|_{W^{k,\infty}(U)} := \max_{|\alpha| \leq k} \|D^{\alpha}u\|_{L^\infty(U)} \end{align*}
[/definition]
Equipped with this norm, $W^{k,p}(U)$ is a Banach space. Spaces with $p=2$, denoted $H^k(U)$, are Hilbert spaces.
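A concrete instance of the norm (a hypothetical worked example, not from the text): for $u(x) = |x|$ on $U = (-1,1)$, whose weak derivative is $\operatorname{sgn}(x)$, we have $\|u\|_{H^1}^2 = \int_{-1}^1 x^2 \, dx + \int_{-1}^1 1 \, dx = 2/3 + 2 = 8/3$. A quick numeric check:

```python
import math

# H^1 = W^{1,2} norm of u(x) = |x| on (-1, 1), with weak derivative sgn(x):
#   ||u||_{H^1}^2 = ||u||_{L^2}^2 + ||u'||_{L^2}^2 = 2/3 + 2 = 8/3.

def midpoint(f, a, b, n=100_000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

u_sq = midpoint(lambda x: abs(x) ** 2, -1.0, 1.0)   # ||u||_{L^2}^2 = 2/3
du_sq = midpoint(lambda x: 1.0, -1.0, 1.0)          # ||u'||_{L^2}^2: |sgn(x)| = 1 a.e.
h1_norm = math.sqrt(u_sq + du_sq)
print(h1_norm)  # sqrt(8/3) ≈ 1.63299
```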
[citetheorem:77]
## Approximation by Smooth Functions
A central difficulty in the theory of Sobolev spaces is that weak derivatives are defined via integration by parts, not by pointwise limits of difference quotients. Consequently, standard calculus operations—such as the Chain Rule, the Product Rule, or change of variables—cannot be applied directly using classical definitions.
To overcome this, we rely on **density arguments**. The strategy is threefold:
1. **Approximate:** Construct a sequence of smooth functions $\{u_m\}_{m=1}^\infty \subset C^\infty(U)$ that converges to $u$ in the Sobolev norm $W^{k,p}(U)$.
2. **Verify:** Prove the desired property (e.g., an inequality or identity) for the smooth functions $u_m$ using classical calculus.
3. **Pass to Limit:** Use the convergence in the norm to show that the property holds for $u$ in the limit.
[example:TheChainRuleProblem]
Consider the problem of justifying the Chain Rule for a Sobolev function. Let $F: \mathbb{R} \to \mathbb{R}$ be a smooth function with bounded derivative, and let $u \in W^{1,p}(U)$. We expect that $v(x) := F(u(x))$ is also in $W^{1,p}(U)$ and that its gradient is $\nabla v = F'(u) \nabla u$.
If we attempt to prove this directly using the definition of the weak derivative, we must show:
\begin{align*}
\int_U F(u) \partial_{x_i} \phi \, \mathrm{d}\mathcal{L}^n = - \int_U (F'(u) \partial_{x_i} u) \phi \, \mathrm{d}\mathcal{L}^n \quad \forall \phi \in C_c^\infty(U).
\end{align*}
Using integration by parts on the left side is invalid because $u$ is not necessarily $C^1$, so the classical Chain Rule does not apply pointwise to allow us to "move" the derivative. However, if we can find smooth $u_m \to u$, we can write $\nabla (F(u_m)) = F'(u_m) \nabla u_m$ legitimately, and then investigate the convergence of each term.
[/example]
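Although the general proof requires the density argument, the expected identity can be sanity-checked numerically for a specific case (our own illustrative choices: $F = \tanh$, which is smooth with $|F'| \le 1$, $u(x) = |x|$, and a sample asymmetric bump test function):

```python
import math

# Check \int F(u) phi' = -\int F'(u) u' phi with F = tanh, u(x) = |x|,
# so F'(u) u' = (1 - tanh^2(|x|)) * sgn(x) a.e. on (-1, 1).

def bump(x):
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    return bump(x) * (-2.0 * x) / (x * x - 1.0) ** 2 if abs(x) < 1.0 else 0.0

def phi(x):        # phi(x) = (x + 0.3) * bump(x): smooth, compactly supported
    return (x + 0.3) * bump(x)

def phi_prime(x):
    return bump(x) + (x + 0.3) * bump_prime(x)

def midpoint(f, a, b, n=200_000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

lhs = midpoint(lambda x: math.tanh(abs(x)) * phi_prime(x), -1.0, 1.0)
rhs = -midpoint(lambda x: (1.0 - math.tanh(abs(x)) ** 2)
                * math.copysign(1.0, x) * phi(x), -1.0, 1.0)
print(lhs, rhs)  # agree to quadrature accuracy
```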
### Local Approximation
The most basic tool for smoothing a function is **mollification** (convolution with a smooth kernel). This provides smooth approximations that converge to the original function on strict subdomains.
[definition:Standard Mollifier]
Define the function $\eta: \mathbb{R}^n \to \mathbb{R}$ by:
\begin{align*}
\eta(x) := \begin{cases}
C \exp\left(\frac{1}{|x|^2 - 1}\right) & \text{if } |x| < 1 \\
0 & \text{if } |x| \ge 1
\end{cases}
\end{align*}
where the constant $C$ is chosen such that $\int_{\mathbb{R}^n} \eta \, \mathrm{d}\mathcal{L}^n = 1$.
For $\varepsilon > 0$, we define the **mollifier** $\eta_\varepsilon: \mathbb{R}^n \to \mathbb{R}$ by rescaling:
\begin{align*}
\eta_\varepsilon(x) := \frac{1}{\varepsilon^n} \eta\left(\frac{x}{\varepsilon}\right).
\end{align*}
For a locally integrable function $u \in L^1_{\text{loc}}(U)$, we define its **mollification** $u_\varepsilon$ on the set $U_\varepsilon := \{x \in U : \operatorname{dist}(x, \partial U) > \varepsilon\}$ by:
\begin{align*}
u_\varepsilon(x) := (\eta_\varepsilon * u)(x) = \int_{U} \eta_\varepsilon(x-y) u(y) \, \mathrm{d}\mathcal{L}^n(y).
\end{align*}
[/definition]
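The definition above translates directly into code. The following one-dimensional sketch (quadrature-based, with the normalisation constant computed numerically) shows the two characteristic behaviours: mollification reproduces $u$ exactly where $u$ is affine, and smooths the corner of $u(x) = |x|$ at the origin:

```python
import math

# 1-D standard mollifier and mollification u_eps = eta_eps * u.

def eta(x):
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def midpoint(f, a, b, n=20_000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

C = 1.0 / midpoint(eta, -1.0, 1.0)          # normalisation: \int eta = 1

def eta_eps(x, eps):
    return C * eta(x / eps) / eps

def mollify(u, x, eps):
    # (eta_eps * u)(x) = \int eta_eps(x - y) u(y) dy; support is |x - y| < eps
    return midpoint(lambda y: eta_eps(x - y, eps) * u(y), x - eps, x + eps)

print(mollify(lambda y: 1.0, 0.0, 0.1))  # ≈ 1: total mass of eta_eps
print(mollify(abs, 0.5, 0.1))            # ≈ 0.5: |x| is affine near 0.5
print(mollify(abs, 0.0, 0.1))            # > 0: the corner at 0 is smoothed upward
```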
[citetheorem:56]
### Global Approximation
Mollification only works locally because the convolution kernel $\eta_\varepsilon(x-y)$ "reaches out" a distance $\varepsilon$. Near the boundary $\partial U$, this kernel might exit the domain $U$, making the integral undefined.
To prove that smooth functions are dense in $W^{k,p}(U)$ *globally*, we need a more sophisticated approach employing a partition of unity. This result is known as the **Meyers-Serrin Theorem**. Note that this theorem establishes the density of $C^\infty(U) \cap W^{k,p}(U)$, whose elements are smooth *inside* $U$ but may behave wildly near the boundary. It does **not** guarantee density of functions smooth up to the boundary ($C^\infty(\bar{U})$) without further assumptions on the regularity of $\partial U$.
[citetheorem:58]
## Extensions and Traces
Standard Sobolev functions are defined only up to a set of measure zero. This creates two immediate difficulties for boundary value problems:
1. **Boundary Values:** Since the boundary $\partial U$ has Lebesgue measure zero ($\mathcal{L}^n(\partial U) = 0$), the restriction $u|_{\partial U}$ is mathematically undefined. We cannot simply "evaluate" a measurable function on a null set.
2. **Global Analysis:** Many powerful tools (like the Fourier transform or convolution with global mollifiers) work best on all of $\mathbb{R}^n$. We often need to "extend" a function $u \in W^{1,p}(U)$ to a function $\bar{u} \in W^{1,p}(\mathbb{R}^n)$.
### Extension Theorems
In the theory of Sobolev spaces, functions are often defined on a bounded domain $U$. However, many analytical tools—such as convolution with standard mollifiers or Fourier analysis—are naturally defined on the entire space $\mathbb{R}^n$. An **Extension Operator** allows us to bridge this gap by extending a function $u \in W^{1,p}(U)$ to a function $Eu \in W^{1,p}(\mathbb{R}^n)$ while preserving the Sobolev norm and regularity properties.
[citetheorem:59]
### The Trace Operator
We now address the problem of boundary values. While pointwise evaluation is impossible, we can define boundary values via a density argument. If $u$ is smooth ($u \in C^\infty(\bar{U})$), its boundary values are perfectly defined. Since smooth functions are dense in $W^{1,p}(U)$, we can extend the "restriction map" continuously.
[definition:Trace Operator]
Let $U$ be bounded with $C^1$ boundary. The **Trace Operator** $T$ is the unique bounded linear operator
\begin{align*}
T: W^{1,p}(U) \to L^p(\partial U)
\end{align*}
such that for any $u \in W^{1,p}(U) \cap C(\bar{U})$, we have $Tu = u|_{\partial U}$.
[/definition]
[citetheorem:60]
## Sobolev Inequalities
The definition of a Sobolev space $W^{k,p}(U)$ controls the $L^p$ norms of a function and its derivatives. A fundamental question in the theory is: **Does controlling the derivatives in $L^p$ gain us better properties for the function itself?**
The answer depends on the relationship between the integrability exponent $p$ and the spatial dimension $n$. There are three distinct regimes:
1. **$p < n$:** We gain **integrability**. The function belongs to a "better" Lebesgue space $L^{p^*}$ where $p^* > p$.
2. **$p > n$:** We gain **regularity**. The function is actually Hölder continuous ($C^{0, \gamma}$).
3. **$p = n$:** This is the critical case (often leading to BMO/VMO spaces), which we treat only briefly below.
### Gagliardo-Nirenberg-Sobolev Inequality ($p < n$)
When $p < n$, the function is not necessarily continuous (it can have singularities, like $|x|^{-\alpha}$ for small $\alpha$), but the singularities cannot be too severe. The Sobolev conjugate exponent $p^*$ quantifies exactly how much "better" the integrability becomes.
[definition:Sobolev Conjugate]
For $1 \le p < n$, the **Sobolev conjugate** $p^*$ is defined by:
\begin{align*}
\frac{1}{p^*} = \frac{1}{p} - \frac{1}{n} \quad \implies \quad p^* := \frac{np}{n-p}.
\end{align*}
[/definition]
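The conjugate exponent is a one-line computation; a small helper (for illustration) makes the two standard examples explicit, and shows that $p^* > p$ always, with $p^* \to \infty$ as $p \to n$:

```python
# Sobolev conjugate p* = np/(n - p), defined for 1 <= p < n.
def sobolev_conjugate(p: float, n: int) -> float:
    if not 1 <= p < n:
        raise ValueError("p* is defined only for 1 <= p < n")
    return n * p / (n - p)

print(sobolev_conjugate(2, 3))  # 6.0: H^1 in R^3 embeds into L^6
print(sobolev_conjugate(1, 2))  # 2.0
print(sobolev_conjugate(2.9, 3))  # large: p* blows up as p -> n
```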
[citetheorem:61]
### Morrey's Inequality ($p > n$)
When $p > n$, the function possesses "more than $n$ dimensions worth" of integrability. This forces the function to be continuous.
[citetheorem:62]
### The Critical Case ($p = n$)
When the integrability exponent $p$ equals the dimension $n$, the formula for the Sobolev conjugate yields $p^* = \infty$. This suggests a potential embedding into $L^\infty$. However, the limiting behavior of the Gagliardo-Nirenberg-Sobolev inequality breaks down.
[example:Failure of $L^\infty$ Embedding]
Let $U = B(0, 1)$ be the open unit ball in $\mathbb{R}^n$ with $n \ge 2$. We construct a function that is in $W^{1,n}(U)$ but is unbounded near the origin.
Consider the function $u: U \to \mathbb{R}$ defined by:
\begin{align*}
u(x) = \log\left( \log \left( \frac{e}{|x|} \right) \right).
\end{align*}
For $|x| < 1$, we have $\frac{e}{|x|} > e$, implying $\log(e/|x|) > 1$, so the outer logarithm is well-defined and positive.
As $|x| \to 0$, $u(x) \to \infty$, which proves that $u \notin L^\infty(U)$.
**Verification of Sobolev Regularity:**
We compute the gradient of $u$. Let $r = |x|$. By the Chain Rule:
\begin{align*}
\nabla u(x) &= \frac{1}{\log(e/r)} \nabla \left( \log \frac{e}{r} \right) \\
&= \frac{1}{\log(e/r)} \cdot \frac{1}{e/r} \cdot \nabla \left( \frac{e}{r} \right) \\
&= \frac{1}{\log(e/r)} \cdot \frac{r}{e} \cdot \left( -\frac{e}{r^2} \frac{x}{r} \right) \\
&= -\frac{1}{r \log(e/r)} \frac{x}{r}.
\end{align*}
The magnitude of the gradient is:
\begin{align*}
|\nabla u(x)| = \frac{1}{r \log(e/r)}.
\end{align*}
We check the $L^n$ integrability of the gradient using polar coordinates. Note that $\log(e/r) = 1 - \log r$.
\begin{align*}
\|\nabla u\|_{L^n(U)}^n &= \int_{B(0,1)} \left( \frac{1}{r (1 - \log r)} \right)^n \, \mathrm{d}\mathcal{L}^n(x) \\
&= n \omega_n \int_0^1 \frac{1}{r^n (1 - \log r)^n} r^{n-1} \, \mathrm{d}r \\
&= n \omega_n \int_0^1 \frac{1}{r (1 - \log r)^n} \, \mathrm{d}r.
\end{align*}
We perform the substitution $t = 1 - \log r$. Then $\mathrm{d}t = -\frac{1}{r} \mathrm{d}r$.
The limits of integration change: as $r \to 0^+$, $t \to \infty$; as $r \to 1^-$, $t \to 1$.
\begin{align*}
\int_0^1 \frac{1}{r (1 - \log r)^n} \, \mathrm{d}r = \int_1^\infty \frac{1}{t^n} \, \mathrm{d}t.
\end{align*}
Since $n \ge 2$, this integral converges ($\int_1^\infty t^{-n} \, \mathrm{d}t = \frac{1}{n-1}$).
Thus $\nabla u \in L^n(U)$, and since $u$ itself lies in $L^n(U)$ (its logarithmic growth is integrable against $r^{n-1}\,\mathrm{d}r$), we conclude $u \in W^{1,n}(U)$. This proves that $W^{1,n}(U) \not\subset L^\infty(U)$.
[/example]
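A numeric check of the radial computation (illustrative only): after the substitution $t = 1 - \log r$, the gradient integral is $\int_1^\infty t^{-n}\,\mathrm{d}t = \frac{1}{n-1}$, finite for $n \ge 2$, while $u$ itself grows without bound, but only just:

```python
import math

# Truncated tail integral \int_1^T t^{-n} dt ≈ 1/(n-1) for n = 2, 3,
# plus the (very slow) growth of u(r) = log(log(e/r)) as r -> 0.

def midpoint(f, a, b, n=200_000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

for n_dim in (2, 3):
    tail = midpoint(lambda t: t ** (-n_dim), 1.0, 1.0e4)
    print(n_dim, tail, 1.0 / (n_dim - 1))  # tail ≈ 1/(n-1)

for r in (1e-2, 1e-10, 1e-100):
    print(r, math.log(math.log(math.e / r)))  # unbounded, but barely growing
```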
Although boundedness fails, $W^{1,n}$ functions are "almost" bounded in the sense that they belong to every finite $L^q$ space.
[citetheorem:63]
## Compactness
In finite-dimensional spaces (like $\mathbb{R}^n$), the Bolzano-Weierstrass theorem guarantees that every bounded sequence has a convergent subsequence. This property is vital for existence proofs, particularly in the calculus of variations and PDE theory, where we often minimize functionals over a class of functions.
In infinite-dimensional spaces like $L^p(U)$, the closed unit ball is **not** compact. A bounded sequence may oscillate infinitely fast or "drift away" to the boundary without ever converging to a limit function in the norm.
The **Rellich-Kondrachov Theorem** provides the crucial remedy: if we control both a function and its derivatives (i.e., we stay inside a bounded set in a Sobolev space), we regain compactness, provided we are willing to accept convergence in a slightly "worse" norm (a lower $L^q$ space).
### Definitions
[definition:Compact Embedding]
Let $X$ and $Y$ be Banach spaces with $X \subset Y$. We say $X$ is **compactly embedded** in $Y$, denoted $X \subset \subset Y$, if:
1. **Continuous Injection:** The identity map $I: X \to Y$ is continuous. That is, there exists $C$ such that $\|u\|_Y \le C \|u\|_X$ for all $u \in X$.
2. **Compactness:** The identity map is a compact operator. This means that any bounded sequence $\{u_m\}_{m=1}^\infty$ in $X$ is precompact in $Y$; it contains a subsequence $\{u_{m_j}\}$ that converges in $Y$.
[/definition]
The theorem states that we essentially "trade" one derivative for compactness in $L^q$.
[citetheorem:64]
### Rellich-Kondrachov: The Case $p > n$
When $p > n$, Morrey's Inequality tells us that $W^{1,p}(U)$ embeds continuously into the Hölder space $C^{0, 1-n/p}(\bar{U})$. The compactness result here is even stronger: the embedding into the space of continuous functions is compact.
[citetheorem:213]
### General Rellich-Kondrachov Theorem
We can summarize the compactness results for $W^{1,p}(U)$ (for bounded $U$ with $C^1$ boundary) as follows:
* **If $p < n$:** $W^{1,p}(U) \subset \subset L^q(U)$ for all $1 \le q < p^*$.
* **If $p = n$:** $W^{1,n}(U) \subset \subset L^q(U)$ for all $1 \le q < \infty$.
* **If $p > n$:** $W^{1,p}(U) \subset \subset C(\bar{U})$.
### Counterexample: Failure on Unbounded Domains
Compactness relies heavily on the domain $U$ being bounded. If the domain is unbounded (e.g., $U = \mathbb{R}^n$), the embedding is **not** compact. Mass can "escape to infinity."
[example:EscapeToInfinity]
Consider $U = \mathbb{R}^n$ and $u \in C_c^\infty(\mathbb{R}^n)$ with $\|u\|_{W^{1,p}} = 1$.
Define the sequence of translations $u_m(x) := u(x + m e_1)$, where $e_1 = (1, 0, \dots, 0)$.
1. **Boundedness:** The Sobolev norm is translation invariant, so $\|u_m\|_{W^{1,p}} = \|u\|_{W^{1,p}} = 1$. The sequence is bounded.
2. **Pointwise Limit:** For any fixed $x$, $u_m(x) \to 0$ as $m \to \infty$ (since the support moves away). Thus, the only possible limit is the zero function.
3. **Non-Convergence:** However, $\|u_m - 0\|_{L^p(\mathbb{R}^n)} = \|u\|_{L^p(\mathbb{R}^n)} \neq 0$.
The sequence does not converge to $0$ in the $L^p$ norm. No subsequence can converge strongly, so the embedding is not compact.
[/example]
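The escape-to-infinity mechanism can be made concrete with a quadrature sketch (our own illustrative bump; the translation is written $u_m(x) = u(x - m)$, equivalent to the $u(x + m e_1)$ convention above):

```python
import math

# Translates of a fixed bump: constant L^2 norm, pointwise limit 0,
# hence no strongly convergent subsequence in L^2(R).

def bump(x):
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def l2_norm(f, a, b, n=100_000):
    h = (b - a) / n
    return math.sqrt(h * sum(f(a + (k + 0.5) * h) ** 2 for k in range(n)))

base = l2_norm(bump, -1.0, 1.0)
norms = []
for m in (1, 5, 25):
    u_m = lambda x, m=m: bump(x - m)            # support is (m - 1, m + 1)
    norms.append(l2_norm(u_m, m - 1.0, m + 1.0))
    print(m, norms[-1], u_m(0.0))               # norm constant, u_m(0) = 0
print(base)
```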
### Poincaré Inequalities
Poincaré inequalities relate the $L^p$ norm of a function to the $L^p$ norm of its gradient. These estimates are essential for establishing **coercivity** in variational problems (e.g., proving the Lax-Milgram conditions). Roughly speaking, if a function is "pinned down" in some way (for example, its values on the boundary are zero, or its average value is zero), the gradient controls the total size of the function.
### The General Poincaré Inequality
This inequality asserts that a function cannot drift too far from its average value unless its gradient is large.
[citetheorem:75]
## Poincaré Inequality for $W_0^{1,p}(U)$
If a function vanishes on the boundary, we do not need to subtract the average value. The "anchor" at the boundary is sufficient to control the norm.
[citetheorem:76]
## Difference Quotients
So far, we have established the properties of functions that *already* possess weak derivatives. However, in the regularity theory of partial differential equations, we often face the reverse problem: we construct a candidate solution (usually via energy minimization or Galerkin methods) and must prove that it possesses higher regularity (e.g., that a $W^{1,2}$ solution is actually in $W^{2,2}$).
Since we cannot yet take derivatives, we cannot check the differential equation directly. Instead, we return to the definition of the derivative as a limit of finite differences.
### Definition and Properties
[definition: Difference Quotient]
Let $u: U \to \mathbb{R}$ be a locally summable function, and let $V \subset \subset U$ (meaning $V$ is precompact in $U$). For $0 < |h| < \operatorname{dist}(V, \partial U)$ and $i \in \{1, \dots, n\}$, the **$i$-th difference quotient** of size $h$ is defined as:
\begin{align*}
D_i^h u(x) := \frac{u(x + h e_i) - u(x)}{h} \quad \text{for } x \in V,
\end{align*}
where $e_i$ is the standard basis vector in the $i$-th direction.
[/definition]
A vector-valued difference quotient $D^h u$ is the vector $(D_1^h u, \dots, D_n^h u)$.
We rely on two fundamental theorems regarding difference quotients. The first states that weak derivatives control the size of difference quotients. The second (and more powerful) states that bounded difference quotients imply the existence of weak derivatives.
[citetheorem:78]
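The first theorem can be checked on a concrete case (our own illustrative computation): for $u(x) = |x|$ on $V = (-\tfrac12, \tfrac12) \subset\subset U = (-1,1)$, a direct calculation gives $\|D^h u\|_{L^2(V)}^2 = 1 - \tfrac{2h}{3}$, which is bounded uniformly in $h$ by $\|u'\|_{L^2} $ on a slightly larger window:

```python
import math

# Difference quotients of u(x) = |x| on V = (-1/2, 1/2): the L^2 norm
# stays bounded uniformly in h; exact value 1 - 2h/3 for 0 < h < 1/2.

def l2_norm_sq(f, a, b, n=200_000):
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) ** 2 for k in range(n))

def dq(u, x, h):
    return (u(x + h) - u(x)) / h   # forward difference quotient D^h u

for h in (0.2, 0.05, 0.01):
    nrm2 = l2_norm_sq(lambda x: dq(abs, x, h), -0.5, 0.5)
    print(h, nrm2, 1.0 - 2.0 * h / 3.0)  # numeric vs exact, both <= 1
```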
## Common Techniques
Working with Sobolev spaces requires a toolkit of techniques that appear repeatedly across the theory — from basic computations to advanced regularity arguments. This section collects the most important methods and explains when each is appropriate. Most of these techniques are not unique to Sobolev spaces, but the specific forms they take here reflect the interplay between weak differentiability, $L^p$ integrability, and compactness.
### Mollification
Mollification is the most basic smoothing technique: convolve a rough function with a smooth, compactly supported kernel to produce a smooth approximation. The standard mollifier $\eta_\varepsilon$ (defined earlier) produces $u_\varepsilon = \eta_\varepsilon * u \in C^\infty(U_\varepsilon)$ on the shrunken domain $U_\varepsilon = \{x \in U : \operatorname{dist}(x, \partial U) > \varepsilon\}$.
The key properties that make mollification useful in the Sobolev context are:
1. **Derivative commutation:** For $u \in W^{k,p}(U)$, the mollification commutes with weak differentiation: $D^\alpha(u_\varepsilon) = (D^\alpha u)_\varepsilon$ on $U_\varepsilon$ for all $|\alpha| \le k$. This means the smooth approximation inherits the derivative structure of the original function, not just its values.
2. **Local convergence:** $u_\varepsilon \to u$ in $W^{k,p}(V)$ for every $V \subset \subset U$ as $\varepsilon \to 0$.
3. **Norm control:** $\|u_\varepsilon\|_{W^{k,p}(V)} \le \|u\|_{W^{k,p}(U)}$ for $V \subset \subset U$ and $\varepsilon$ small enough.
The limitation of mollification is that it is *local*: because $\eta_\varepsilon$ has support of radius $\varepsilon$, the convolution $u_\varepsilon$ is only defined on $U_\varepsilon$, which shrinks away from the boundary. To obtain global approximations on all of $U$, mollification must be combined with either a partition of unity (as in the [Meyers-Serrin Theorem](/theorems/58)) or an extension operator.
[example: Derivative Commutation Under Mollification]
We prove the fundamental identity: if $u \in W^{1,p}(U)$, then $\partial_i(u_\varepsilon) = (\partial_i u)_\varepsilon$ on $U_\varepsilon$.
Fix $V \subset \subset U$ with $\varepsilon < \operatorname{dist}(V, \partial U)$. For any $x \in V$, the classical derivative of the mollification is computed by differentiating under the integral sign (justified because $\eta_\varepsilon$ is smooth and compactly supported):
\begin{align*}
\partial_i(u_\varepsilon)(x) = \partial_{x_i} \int_U \eta_\varepsilon(x - y) u(y) \, d\mathcal{L}^n(y) = \int_U \partial_{x_i}[\eta_\varepsilon(x - y)] \, u(y) \, d\mathcal{L}^n(y).
\end{align*}
The key observation is the sign flip: $\partial_{x_i}[\eta_\varepsilon(x - y)] = -\partial_{y_i}[\eta_\varepsilon(x - y)]$. Substituting:
\begin{align*}
\partial_i(u_\varepsilon)(x) = -\int_U \partial_{y_i}[\eta_\varepsilon(x - y)] \, u(y) \, d\mathcal{L}^n(y).
\end{align*}
For fixed $x \in V$, the function $y \mapsto \eta_\varepsilon(x - y)$ is smooth and compactly supported in $U$ (its support is $B(x, \varepsilon) \subset U$). It is therefore a valid test function for the weak derivative of $u$. The definition of the weak derivative $\partial_i u$ gives:
\begin{align*}
-\int_U \partial_{y_i}[\eta_\varepsilon(x - y)] \, u(y) \, d\mathcal{L}^n(y) = \int_U \eta_\varepsilon(x - y) \, \partial_i u(y) \, d\mathcal{L}^n(y) = (\partial_i u)_\varepsilon(x).
\end{align*}
Combining: $\partial_i(u_\varepsilon)(x) = (\partial_i u)_\varepsilon(x)$ for all $x \in V$. Since $V \subset \subset U$ was arbitrary with $\varepsilon < \operatorname{dist}(V, \partial U)$, the identity holds on all of $U_\varepsilon$.
As a consequence, $L^p$ convergence of mollifications extends to the Sobolev norm. Standard approximation theory gives $f_\varepsilon \to f$ in $L^p(V)$ for any $f \in L^p(U)$. Applying this to both $u$ and $\partial_i u$:
\begin{align*}
\|u_\varepsilon - u\|_{W^{1,p}(V)}^p &= \|u_\varepsilon - u\|_{L^p(V)}^p + \sum_{i=1}^n \|\partial_i(u_\varepsilon) - \partial_i u\|_{L^p(V)}^p \\
&= \|u_\varepsilon - u\|_{L^p(V)}^p + \sum_{i=1}^n \|(\partial_i u)_\varepsilon - \partial_i u\|_{L^p(V)}^p \to 0.
\end{align*}
[/example]
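The commutation identity can also be observed numerically (a 1-D sketch with our own quadrature; $u(x) = |x|$, whose weak derivative is $\operatorname{sgn}(x)$): the classical derivative of the mollification, approximated by a central difference, matches the mollification of $\operatorname{sgn}$:

```python
import math

# Check d/dx (u_eps) = (u')_eps for u(x) = |x|, u' = sgn(x) a.e.

def eta(x):
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def midpoint(f, a, b, n=20_000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

C = 1.0 / midpoint(eta, -1.0, 1.0)

def mollify(u, x, eps):
    return midpoint(lambda y: C / eps * eta((x - y) / eps) * u(y),
                    x - eps, x + eps)

eps = 0.1
for x in (-0.3, 0.05, 0.4):
    lhs = (mollify(abs, x + 1e-5, eps) - mollify(abs, x - 1e-5, eps)) / 2e-5
    rhs = mollify(lambda y: math.copysign(1.0, y), x, eps)
    print(x, lhs, rhs)  # the two sides agree
```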
### Cut-Off Functions and Localisation
A **cut-off function** is a smooth function $\zeta \in C^\infty_c(U)$ satisfying $0 \le \zeta \le 1$ that equals $1$ on a chosen subdomain and vanishes outside a slightly larger region. Cut-offs serve two purposes: they localise global problems to compact subsets, and they manufacture test functions with prescribed support.
The standard construction for a cut-off between $V \subset \subset W \subset \subset U$ is: choose $\delta = \operatorname{dist}(V, \partial W) / 2$ and set $\zeta = \eta_\delta * \mathbb{1}_{V_\delta}$, where $V_\delta = \{x : \operatorname{dist}(x, V) < \delta\}$. Then $\zeta \equiv 1$ on $V$, $\operatorname{supp}(\zeta) \subset W$, $\zeta \in C^\infty_c(W)$, and $|\nabla \zeta| \le C/\delta$.
Cut-off functions are essential in three recurring situations:
**Localising estimates.** If we know an estimate holds on compactly contained subdomains (e.g., an interior regularity bound), we multiply by a cut-off to extend it to the full domain while controlling the error terms. The gradient $\nabla \zeta$ introduces lower-order terms that must be absorbed — this is the source of many "absorption" arguments in PDE regularity theory.
**Constructing test functions.** In weak formulations, we need test functions in $W^{1,2}_0(U)$ or $C^\infty_c(U)$ with specific properties. The product $\phi = \zeta^2 v$ (with $v$ a Sobolev function and $\zeta$ a cut-off) is a standard choice: the square ensures $\nabla \phi = 2\zeta v \nabla \zeta + \zeta^2 \nabla v$, which separates cleanly into a "good" term ($\zeta^2 \nabla v$, supported where $\zeta = 1$) and an "error" term ($2\zeta v \nabla \zeta$, controlled by $\|\nabla \zeta\|_{L^\infty}$).
**Excising singularities.** To handle a function with a point singularity at the origin, define $\zeta_m \in C^\infty_c(U)$ with $\zeta_m \equiv 0$ on $B(0, 1/m)$ and $\zeta_m \equiv 1$ outside $B(0, 2/m)$, with $|\nabla \zeta_m| \le Cm$. The product $u_m := \zeta_m u$ is smooth near the singularity (it vanishes there), so $u_m \in W^{1,p}(U)$. Whether $u_m \to u$ in $W^{1,p}$ depends on the capacity of the singularity — see the discussion of approximation arguments below.
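The standard construction $\zeta = \eta_\delta * \mathbb{1}_{V_\delta}$ can be realised directly in one dimension (a sketch with illustrative choices $V = (-\tfrac12, \tfrac12)$, $W = (-0.9, 0.9)$, hence $\delta = 0.2$ and $V_\delta = (-0.7, 0.7)$):

```python
import math

# 1-D cut-off zeta = eta_delta * 1_{V_delta}: equals 1 on V, vanishes
# outside W, with transition slope of order 1/delta.

def eta(x):
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def midpoint(f, a, b, n=20_000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

delta = 0.2
C = 1.0 / midpoint(eta, -1.0, 1.0)      # normalises \int eta_delta = 1

def zeta(x):
    a, b = max(x - delta, -0.7), min(x + delta, 0.7)
    if a >= b:
        return 0.0
    return midpoint(lambda y: C / delta * eta((x - y) / delta), a, b)

print(zeta(0.0), zeta(0.45))             # == 1 on V
print(zeta(0.95))                        # == 0 outside W
print((zeta(0.69) - zeta(0.71)) / 0.02)  # transition slope, O(1/delta)
```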
[example: Caccioppoli's Inequality via Cut-Off Test Functions]
We derive the classical **Caccioppoli inequality** (reverse Poincaré inequality) for weak solutions of the Laplace equation, illustrating the $\zeta^2 u$ test function trick.
Let $u \in W^{1,2}(U)$ be a weak solution of $-\Delta u = 0$, meaning:
\begin{align*}
\int_U \nabla u \cdot \nabla \phi \, d\mathcal{L}^n = 0 \quad \text{for all } \phi \in W^{1,2}_0(U).
\end{align*}
Fix $V \subset \subset W \subset \subset U$ and let $\zeta$ be a cut-off function with $\zeta \equiv 1$ on $V$, $\operatorname{supp}(\zeta) \subset W$, $0 \le \zeta \le 1$, and $|\nabla \zeta| \le C / \operatorname{dist}(V, \partial W)$.
Insert the test function $\phi = \zeta^2 u \in W^{1,2}_0(U)$ into the weak formulation. Computing its gradient:
\begin{align*}
\nabla \phi = 2\zeta u \nabla \zeta + \zeta^2 \nabla u.
\end{align*}
Substituting:
\begin{align*}
0 = \int_U \nabla u \cdot (2\zeta u \nabla \zeta + \zeta^2 \nabla u) \, d\mathcal{L}^n = \int_U \zeta^2 |\nabla u|^2 \, d\mathcal{L}^n + 2\int_U \zeta u \, \nabla u \cdot \nabla \zeta \, d\mathcal{L}^n.
\end{align*}
Rearranging and applying the Cauchy-Schwarz inequality to the cross term:
\begin{align*}
\int_U \zeta^2 |\nabla u|^2 \, d\mathcal{L}^n &= -2\int_U \zeta u \, \nabla u \cdot \nabla \zeta \, d\mathcal{L}^n \\
&\le 2\int_U |\zeta \nabla u| \, |u \nabla \zeta| \, d\mathcal{L}^n \\
&\le 2\left(\int_U \zeta^2 |\nabla u|^2 \, d\mathcal{L}^n\right)^{1/2} \left(\int_U u^2 |\nabla \zeta|^2 \, d\mathcal{L}^n\right)^{1/2}.
\end{align*}
Set $A = \left(\int_U \zeta^2 |\nabla u|^2\right)^{1/2}$ and $B = \left(\int_U u^2 |\nabla \zeta|^2\right)^{1/2}$. The inequality reads $A^2 \le 2AB$, so $A \le 2B$, i.e. $A^2 \le 4B^2$. Since $\zeta \equiv 1$ on $V$:
\begin{align*}
\int_V |\nabla u|^2 \, d\mathcal{L}^n \le \int_U \zeta^2 |\nabla u|^2 \, d\mathcal{L}^n \le \frac{4C^2}{\operatorname{dist}(V, \partial W)^2} \int_W u^2 \, d\mathcal{L}^n.
\end{align*}
This is the Caccioppoli inequality: the gradient on the inner domain $V$ is controlled by the function values on the larger domain $W$. The key structural features were: the $\zeta^2$ (not $\zeta$) ensured the absorption step $A^2 \le 2AB \implies A \le 2B$ worked cleanly, and the cut-off localised the estimate to $V$ while the error term $|\nabla \zeta|$ contributed only a geometric constant.
[/example]
### Direct Verification by Excision
This is the most elementary method for proving that a specific function with an isolated singularity has a weak derivative. The idea is to verify the integration by parts identity directly by excising a neighbourhood of the singularity and controlling the boundary term.
**Setup.** Let $u \in L^p(U)$ with $u \in C^1(U \setminus \{0\})$. Define the candidate weak gradient $v := \nabla u$ (the classical gradient, defined a.e.). To verify that $v_i$ is the weak derivative $\partial_i u$, fix $\phi \in C^\infty_c(U)$ and apply the divergence theorem on the excised domain $U_\varepsilon := U \setminus \bar{B}(0, \varepsilon)$:
\begin{align*}
\int_{U_\varepsilon} u \, \partial_i \phi \, d\mathcal{L}^n = -\int_{U_\varepsilon} v_i \, \phi \, d\mathcal{L}^n + \int_{\partial B(0, \varepsilon)} u \, \phi \, \nu^i \, d\mathcal{H}^{n-1},
\end{align*}
where $\nu$ is the outward normal to $U_\varepsilon$ (pointing into $B(0, \varepsilon)$). If $v \in L^p(U)$, the volume integrals converge by dominated convergence as $\varepsilon \to 0$. The method succeeds if and only if the boundary integral vanishes:
\begin{align*}
\lim_{\varepsilon \to 0} \int_{\partial B(0, \varepsilon)} |u| \, d\mathcal{H}^{n-1} = 0.
\end{align*}
For a function with power-law singularity $|u(x)| \le C|x|^{-\gamma}$, this integral scales as $\varepsilon^{n-1-\gamma}$, which vanishes provided $\gamma < n - 1$. This is precisely what was verified in the worked problem for $u(x) = |x|^{-\gamma}$ earlier in this article.
[example: The Logarithmic Singularity in Two Dimensions]
Let $n = 2$, $U = B(0, 1) \subset \mathbb{R}^2$, and define $u(x) = \log|x|$. We prove that $u \in W^{1,p}(U)$ for all $1 \le p < 2$ by the method of excision.
**Step 1: $L^p$ integrability of $u$.** Using polar coordinates with $r = |x|$:
\begin{align*}
\int_{B(0,1)} |\log|x||^p \, d\mathcal{L}^2 = 2\pi \int_0^1 |\log r|^p \, r \, dr.
\end{align*}
Near $r = 0$, $|\log r|^p$ grows more slowly than any negative power of $r$, so the factor $r$ forces integrability. Substituting $t = -\log r$ (so $r = e^{-t}$, $dr = -e^{-t} dt$):
\begin{align*}
2\pi \int_0^1 |\log r|^p r \, dr = 2\pi \int_0^\infty t^p e^{-2t} \, dt = 2\pi \cdot \frac{\Gamma(p+1)}{2^{p+1}} < \infty
\end{align*}
for all $p \ge 1$. So $u \in L^p(U)$ for every $p$.
**Step 2: Candidate weak derivative.** For $x \neq 0$, the classical gradient is:
\begin{align*}
\nabla u(x) = \frac{x}{|x|^2}, \quad |\nabla u(x)| = \frac{1}{|x|}.
\end{align*}
We check $\nabla u \in L^p(U)$:
\begin{align*}
\int_{B(0,1)} \frac{1}{|x|^p} \, d\mathcal{L}^2 = 2\pi \int_0^1 r^{-p} \cdot r \, dr = 2\pi \int_0^1 r^{1 - p} \, dr.
\end{align*}
This converges if and only if $1 - p > -1$, i.e. $p < 2$. So $v := \nabla u \in L^p(U; \mathbb{R}^2)$ for $p < 2$.
**Step 3: Excision and boundary estimate.** Fix $\phi \in C^\infty_c(U)$ and $\varepsilon > 0$. On $U_\varepsilon = B(0,1) \setminus \bar{B}(0, \varepsilon)$, the divergence theorem gives:
\begin{align*}
\int_{U_\varepsilon} u \, \partial_i \phi \, d\mathcal{L}^2 = -\int_{U_\varepsilon} v_i \, \phi \, d\mathcal{L}^2 + \int_{\partial B(0, \varepsilon)} u \, \phi \, \nu^i \, d\mathcal{H}^1.
\end{align*}
We estimate the boundary integral. On $\partial B(0, \varepsilon)$: $|u| = |\log \varepsilon|$ and $\mathcal{H}^1(\partial B(0, \varepsilon)) = 2\pi\varepsilon$. Thus:
\begin{align*}
\left|\int_{\partial B(0, \varepsilon)} u \, \phi \, \nu^i \, d\mathcal{H}^1\right| \le \|\phi\|_{L^\infty} \cdot |\log \varepsilon| \cdot 2\pi\varepsilon = C \varepsilon |\log \varepsilon| \to 0 \quad \text{as } \varepsilon \to 0.
\end{align*}
The volume integrals converge by dominated convergence (since $u \partial_i \phi \in L^1(U)$ and $v_i \phi \in L^1(U)$). Taking $\varepsilon \to 0$:
\begin{align*}
\int_U u \, \partial_i \phi \, d\mathcal{L}^2 = -\int_U v_i \, \phi \, d\mathcal{L}^2,
\end{align*}
confirming $v = \nabla u$ weakly. Therefore $u = \log|x| \in W^{1,p}(B(0,1))$ for all $1 \le p < 2$.
This example is prototypical of the **critical exponent** phenomenon: in dimension $n$, the fundamental solution of the Laplacian has gradient of order $|x|^{1-n}$ and therefore lies in $W^{1,p}_{\text{loc}}$ precisely for $p < n/(n-1)$; for $n = 2$ this threshold is $p = 2 = n$, exactly as computed above. Note also that the boundary term $\varepsilon|\log \varepsilon| \to 0$: the logarithmic singularity is just barely removable in dimension $2$.
[/example]
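Two of the example's computations can be verified numerically (illustrative quadrature only): the closed form $2\pi \int_0^1 |\log r|^p \, r \, dr = 2\pi\,\Gamma(p+1)/2^{p+1}$ from Step 1, and the decay of the excision boundary term $\varepsilon|\log\varepsilon| \to 0$ from Step 3:

```python
import math

# Step 1: 2*pi * \int_0^1 |log r|^p r dr vs 2*pi * Gamma(p+1) / 2^{p+1}.
# Step 3: the boundary term eps * |log eps| tends to 0.

def midpoint(f, a, b, n=400_000):
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

for p in (1.0, 2.0, 3.5):
    numeric = 2 * math.pi * midpoint(lambda r: (-math.log(r)) ** p * r, 0.0, 1.0)
    closed = 2 * math.pi * math.gamma(p + 1) / 2 ** (p + 1)
    print(p, numeric, closed)  # agree to quadrature accuracy

for eps in (1e-1, 1e-3, 1e-6):
    print(eps, eps * abs(math.log(eps)))  # boundary term -> 0
```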
### Approximation and Completeness Arguments
Rather than verifying the integration by parts identity directly, we can exploit the fact that $W^{1,p}(U)$ is a Banach space (hence complete) and that the weak derivative is a closed operator.
**The principle.** Suppose we have $u \in L^p(U)$ and a candidate derivative $v \in L^p(U; \mathbb{R}^n)$. If we can construct a sequence $\{u_m\} \subset W^{1,p}(U)$ satisfying:
\begin{align*}
u_m \to u \quad \text{in } L^p(U), \qquad \nabla u_m \to v \quad \text{in } L^p(U; \mathbb{R}^n),
\end{align*}
then $u \in W^{1,p}(U)$ with $\nabla u = v$. This follows from the closedness of the graph of the gradient operator: if $(u_m, \nabla u_m) \to (u, v)$ in $L^p \times L^p$, then $(u, v)$ lies in the graph.
This is often more flexible than direct verification because we do not need to compute boundary integrals. Instead, we transfer the problem to estimating error terms in $L^p$.
**Implementation via cut-offs.** For a function with a point singularity at the origin, take $u_m = \zeta_m u$ as above. Then:
\begin{align*}
\nabla u_m - v = (\zeta_m - 1) \nabla u + u \nabla \zeta_m.
\end{align*}
The first term $(\zeta_m - 1)\nabla u \to 0$ in $L^p$ by dominated convergence (since $\zeta_m \to 1$ pointwise a.e. and $|\zeta_m - 1| \le 1$). The second term is the "error" from the cut-off. Since $|\nabla \zeta_m| \le Cm$ and $\operatorname{supp}(\nabla \zeta_m) \subset B(0, 2/m) \setminus B(0, 1/m)$:
\begin{align*}
\|u \nabla \zeta_m\|_{L^p}^p \le C^p m^p \int_{B(0, 2/m)} |u|^p \, d\mathcal{L}^n.
\end{align*}
For $u(x) = |x|^{-\gamma}$, this integral scales as $m^p \cdot m^{-(n - \gamma p)} = m^{p + \gamma p - n}$, which vanishes as $m \to \infty$ if and only if $p + \gamma p < n$, i.e., $\gamma < (n-p)/p$. This recovers the same threshold as the direct method — the two approaches are equivalent, but the approximation argument avoids boundary integrals entirely.
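The scaling computation can be checked directly. A minimal numerical sketch (the helper `cutoff_error` and the parameter choices are illustrative assumptions), dropping the dimensional surface-area constant:

```python
# Cut-off error m^p ∫_{B(0,2/m)} |u|^p for u(x) = |x|^{-γ}: radially this is
# (const) · m^p · (2/m)^(n-γp) / (n-γp) = C · m^(p+γp-n), assuming γp < n.
def cutoff_error(m, gamma, p, n):
    a = n - gamma * p          # exponent from the radial integral; need a > 0
    return m ** p * (2 / m) ** a / a

# n = 3, p = 2: the threshold is γ < (n-p)/p = 1/2
below = [cutoff_error(m, 0.3, 2, 3) for m in (10, 100, 1000)]  # γ below: error → 0
above = [cutoff_error(m, 0.7, 2, 3) for m in (10, 100, 1000)]  # γ above: error → ∞
```

With $\gamma = 0.3$ the error decays like $m^{-0.4}$ per the formula; with $\gamma = 0.7$ it grows like $m^{0.4}$, so the cut-off approximations fail to converge in $W^{1,2}$.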
**Implementation via mollification.** If $u$ is already in $L^p$ and we suspect $u \in W^{1,p}$, we can mollify: set $u_\varepsilon = \eta_\varepsilon * u$. If the mollified gradients $\nabla u_\varepsilon$ converge in $L^p$ (or at least remain bounded), the Banach-Alaoglu theorem (for $1 < p < \infty$) extracts a weak limit, and closedness of the gradient identifies it as $\nabla u$.
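As a concrete (and deliberately simplified) illustration of the mollification route, take $u(x) = |x|$ on the line and a box kernel $\eta_\varepsilon = \frac{1}{2\varepsilon}\mathbb{1}_{[-\varepsilon,\varepsilon]}$ in place of a smooth bump: the convolved derivative then has the closed form $\min(\max(x/\varepsilon, -1), 1)$, and its $L^1(-1,1)$ distance to $\operatorname{sgn}$ is exactly $\varepsilon$. A minimal sketch:

```python
import math

def du_eps(x, eps):
    # derivative of (η_ε * u) for u = |x| and the box kernel: clamp(x/ε, -1, 1)
    return max(-1.0, min(1.0, x / eps))

def l1_error(eps, N=100_000):
    # midpoint Riemann sum of |(η_ε * u)' - sgn| over (-1, 1); exact value is ε
    h = 2.0 / N
    return sum(abs(du_eps(-1 + (i + 0.5) * h, eps)
                   - math.copysign(1.0, -1 + (i + 0.5) * h)) * h
               for i in range(N))

errors = [l1_error(eps) for eps in (0.1, 0.01, 0.001)]
```

The mollified gradients converge in $L^1$ to $\operatorname{sgn}$, and closedness of the gradient then identifies $\operatorname{sgn}$ as the weak derivative of $|x|$.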
[example: The Absolute Value Function via Smooth Approximation]
Let $n = 1$, $U = (-1, 1)$, and $u(x) = |x|$. We prove $u \in W^{1,p}(U)$ for all $1 \le p \le \infty$ with $u' = \operatorname{sgn}(x)$ using the approximation method.
**Step 1: Construct the approximating sequence.** One could follow the cut-off recipe above, patching $|x|$ together with a smooth "rounding" of the corner on $(-2/m, 2/m)$ via a cut-off $\zeta_m$. A single global formula is simpler: define
\begin{align*}
u_m(x) := \sqrt{x^2 + 1/m^2}, \qquad u_m'(x) = \frac{x}{\sqrt{x^2 + 1/m^2}}.
\end{align*}
Each $u_m$ is $C^\infty$ on all of $U$.
**Step 2: Verify $L^p$ convergence of $u_m$.** For every $x \in U$:
\begin{align*}
|u_m(x) - |x|| = \sqrt{x^2 + 1/m^2} - |x| = \frac{1/m^2}{\sqrt{x^2 + 1/m^2} + |x|} \le \frac{1}{m},
\end{align*}
so $\|u_m - u\|_{L^\infty(U)} \le 1/m \to 0$. In particular, $u_m \to u$ in $L^p(U)$ for all $p$.
**Step 3: Verify $L^p$ convergence of $u_m'$.** The candidate weak derivative is $v(x) = \operatorname{sgn}(x)$ (i.e., $v(x) = x/|x|$ for $x \neq 0$, defined arbitrarily at $0$ since $\{0\}$ has measure zero). We compute:
\begin{align*}
|u_m'(x) - v(x)| = \left|\frac{x}{\sqrt{x^2 + 1/m^2}} - \frac{x}{|x|}\right| = |x| \left|\frac{1}{\sqrt{x^2 + 1/m^2}} - \frac{1}{|x|}\right|
\end{align*}
for $x \neq 0$. Since $\sqrt{x^2 + 1/m^2} \ge |x|$, the expression in the absolute value is non-positive, giving:
\begin{align*}
|u_m'(x) - v(x)| = |x| \cdot \frac{1}{|x|} - |x| \cdot \frac{1}{\sqrt{x^2 + 1/m^2}} = 1 - \frac{|x|}{\sqrt{x^2 + 1/m^2}}.
\end{align*}
This is bounded by $1$ and converges to $0$ pointwise for every $x \neq 0$. Hence $|u_m'(x) - v(x)|^p \le 1$, which is integrable on the bounded interval $U$. By the Dominated Convergence Theorem:
\begin{align*}
\|u_m' - v\|_{L^p(U)}^p = \int_{-1}^1 |u_m'(x) - \operatorname{sgn}(x)|^p \, dx \to 0 \quad \text{as } m \to \infty.
\end{align*}
**Step 4: Conclude by closedness.** Since $u_m \in C^\infty(U) \subset W^{1,p}(U)$, $u_m \to u$ in $L^p$, and $u_m' \to v$ in $L^p$, the closedness of the weak derivative gives $u \in W^{1,p}(U)$ with $u' = \operatorname{sgn}(x)$.
The advantage of this method over direct excision is clear: we never needed to integrate by parts across the non-differentiable point $x = 0$. The smooth approximations $u_m = \sqrt{x^2 + 1/m^2}$ "round the corner" and the convergence is handled entirely by the Dominated Convergence Theorem.
[/example]
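The convergence claims in Steps 2 and 3 are easy to check numerically. A minimal sketch (grid sizes and tolerances are arbitrary choices):

```python
import math

def u_m(x, m):
    return math.sqrt(x * x + 1.0 / (m * m))

def du_m(x, m):
    return x / math.sqrt(x * x + 1.0 / (m * m))

# Step 2: sup-norm bound |u_m(x) - |x|| ≤ 1/m, worst case at x = 0
grid = [-1 + 2 * i / 10_000 for i in range(10_001)]
sup_err = max(abs(u_m(x, 100) - abs(x)) for x in grid)

# Step 3: L^2 distance from u_m' to sgn shrinks as m grows (midpoint rule)
def l2_deriv_error(m, N=100_000):
    h = 2.0 / N
    s = 0.0
    for i in range(N):
        x = -1 + (i + 0.5) * h
        s += (du_m(x, m) - math.copysign(1.0, x)) ** 2 * h
    return math.sqrt(s)

errs = [l2_deriv_error(m) for m in (10, 100, 1000)]
```

For $m = 100$ the sup-norm gap is exactly $1/m = 0.01$ (attained at $x = 0$), and the $L^2$ errors of the derivatives decay as $m$ grows, as the Dominated Convergence argument predicts.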
### Difference Quotients (Nirenberg's Method)
The techniques above require a candidate for the weak derivative — we must "guess" $v$ from classical calculus or from the structure of the problem. In regularity theory, this is often impossible: we have a weak solution $u \in W^{1,p}(U)$ and want to prove $u \in W^{2,p}(U)$, but we have no formula for the second derivatives.
Nirenberg's method bypasses the need for a candidate by using the [Difference Quotient Characterisation](/theorems/78): a function $u \in L^p(U)$ with $1 < p < \infty$ belongs to $W^{1,p}_{loc}(U)$ if and only if its difference quotients $D^h u$ are uniformly bounded in $L^p$ on compactly contained subdomains. The strategy is:
1. **Construct a test function involving difference quotients.** In the weak formulation $B[u, \phi] = (f, \phi)$, insert $\phi = -D^{-h}(\zeta^2 D^h u)$ (where $\zeta$ is a cut-off function). This is a valid test function because it is constructed from $u$ using only algebraic operations and translations, without assuming any additional differentiability.
2. **Derive an $L^p$ bound on $D^h \nabla u$.** The bilinear form $B$ and the test function interact to produce terms involving $D^h \nabla u$, which are controlled by the coercivity of $B$ on one side and the boundedness of $f$ on the other. After absorption of lower-order terms, this yields a uniform estimate $\|D^h \nabla u\|_{L^p(V)} \le C$ independent of $h$.
3. **Invoke the reverse direction of the difference quotient theorem.** The uniform bound implies $\nabla u \in W^{1,p}_{loc}(U)$, i.e., $u \in W^{2,p}_{loc}(U)$.
This method is the standard tool for proving interior regularity — see the [Interior $H^2$ Regularity Theorem](/theorems/95) for the full implementation. The restriction to $1 < p < \infty$ is essential: at $p = 1$, bounded difference quotients only guarantee that $u$ is of [bounded variation](/pages/1085), not that it has an $L^1$ weak derivative.
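The dichotomy in the characterisation shows up numerically: difference quotients of the Sobolev function $|x|$ stay bounded in $L^2$, while those of $\operatorname{sgn}$ (which lies in $L^2$ but not in $W^{1,2}$) blow up like $h^{-1/2}$. A minimal sketch (the window $V = (-\tfrac{1}{2}, \tfrac{1}{2})$ and the grid size are arbitrary choices):

```python
import math

def dq_l2_norm(u, h, a=-0.5, b=0.5, N=50_000):
    # L^2(a, b) norm of the difference quotient D^h u, midpoint Riemann sum
    step = (b - a) / N
    s = 0.0
    for i in range(N):
        x = a + (i + 0.5) * step
        s += ((u(x + h) - u(x)) / h) ** 2 * step
    return math.sqrt(s)

sgn = lambda x: math.copysign(1.0, x)
abs_norms = [dq_l2_norm(abs, h) for h in (0.1, 0.01, 0.001)]  # ≤ Lipschitz const 1
sgn_norms = [dq_l2_norm(sgn, h) for h in (0.1, 0.01, 0.001)]  # ≈ sqrt(4/h) → ∞
```

The quotients of $\operatorname{sgn}$ equal $2/h$ on an interval of length $h$, giving $\|D^h \operatorname{sgn}\|_{L^2}^2 \approx 4/h$, which is exactly the failure mode the characterisation detects.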
[example: Interior $H^2$ Regularity via Difference Quotients]
We illustrate Nirenberg's method for the model problem $-\Delta u = f$ in $U$ with $f \in L^2(U)$. Suppose $u \in W^{1,2}(U)$ is a weak solution:
\begin{align*}
\int_U \nabla u \cdot \nabla \phi \, d\mathcal{L}^n = \int_U f \phi \, d\mathcal{L}^n \quad \text{for all } \phi \in W^{1,2}_0(U).
\end{align*}
We prove $u \in W^{2,2}_{loc}(U)$.
**Step 1: Construct the test function.** Fix $V \subset \subset W \subset \subset U$ and let $\zeta$ be a cut-off with $\zeta \equiv 1$ on $V$, $\operatorname{supp}(\zeta) \subset W$. For $|h| < \operatorname{dist}(W, \partial U)$ and direction $e_k$, define:
\begin{align*}
\phi := -D_k^{-h}(\zeta^2 D_k^h u).
\end{align*}
This belongs to $W^{1,2}_0(U)$ because: $D_k^h u \in W^{1,2}_{loc}(U)$ (translation preserves Sobolev regularity), $\zeta^2 D_k^h u$ has compact support in $W$, and $D_k^{-h}$ translates the support by at most $|h|$ while remaining inside $U$.
**Step 2: Insert and use the discrete integration by parts identity.** The key algebraic identity for difference quotients is $\int D_k^h f \cdot g = -\int f \cdot D_k^{-h} g$ (discrete analogue of integration by parts). Substituting $\phi$ into the weak formulation:
\begin{align*}
-\int_U \nabla u \cdot \nabla[D_k^{-h}(\zeta^2 D_k^h u)] \, d\mathcal{L}^n = -\int_U f \, D_k^{-h}(\zeta^2 D_k^h u) \, d\mathcal{L}^n.
\end{align*}
Applying discrete integration by parts to the left side (moving $D_k^{-h}$ to $\nabla u$):
\begin{align*}
\int_U D_k^h(\nabla u) \cdot \nabla(\zeta^2 D_k^h u) \, d\mathcal{L}^n = -\int_U f \, D_k^{-h}(\zeta^2 D_k^h u) \, d\mathcal{L}^n.
\end{align*}
**Step 3: Expand and estimate.** Set $w = D_k^h u$ for brevity. The left side expands as:
\begin{align*}
\int_U D_k^h(\nabla u) \cdot (\zeta^2 \nabla w + 2\zeta w \nabla \zeta) \, d\mathcal{L}^n = \int_U \zeta^2 |D_k^h(\nabla u)|^2 \, d\mathcal{L}^n + 2\int_U \zeta w \, D_k^h(\nabla u) \cdot \nabla \zeta \, d\mathcal{L}^n,
\end{align*}
where we used $D_k^h(\nabla u) = \nabla(D_k^h u) = \nabla w$ (difference quotients commute with spatial derivatives).
For the cross term, Young's inequality gives for any $\delta > 0$:
\begin{align*}
\left|2\int_U \zeta w \, \nabla w \cdot \nabla \zeta\right| \le \delta \int_U \zeta^2 |\nabla w|^2 + \frac{C}{\delta} \int_U w^2 |\nabla \zeta|^2.
\end{align*}
For the right side, the Cauchy-Schwarz inequality yields:
\begin{align*}
\left|\int_U f \, D_k^{-h}(\zeta^2 w)\right| \le \|f\|_{L^2(W)} \|D_k^{-h}(\zeta^2 w)\|_{L^2} \le \|f\|_{L^2(W)} \|\nabla(\zeta^2 w)\|_{L^2},
\end{align*}
using the forward direction of the difference quotient characterisation. This last norm is bounded by $C(\|\zeta \nabla w\|_{L^2} + \|w \nabla \zeta\|_{L^2})$.
**Step 4: Absorb and conclude.** Apply Young's inequality to the right side as well ($C\|f\|_{L^2(W)} \|\zeta \nabla w\|_{L^2} \le \frac{1}{4}\|\zeta \nabla w\|_{L^2}^2 + C^2\|f\|_{L^2(W)}^2$, and similarly for the $\|w \nabla \zeta\|_{L^2}$ term). Choosing $\delta = 1/4$ in the cross-term estimate, the multiples of $\int \zeta^2 |\nabla w|^2$ appearing on the right total at most $\frac{1}{2}\int \zeta^2 |\nabla w|^2$ and can be absorbed into the left:
\begin{align*}
\frac{1}{2}\int_U \zeta^2 |\nabla w|^2 \, d\mathcal{L}^n \le C\left(\|f\|_{L^2(W)}^2 + \|\nabla \zeta\|_{L^\infty}^2 \|w\|_{L^2(W)}^2\right).
\end{align*}
The right side is bounded independently of $h$: $\|w\|_{L^2(W)} = \|D_k^h u\|_{L^2(W)} \le \|\nabla u\|_{L^2(U)}$ by the forward difference quotient bound. Since $\zeta \equiv 1$ on $V$:
\begin{align*}
\|D_k^h(\nabla u)\|_{L^2(V)}^2 \le C\left(\|f\|_{L^2(U)}^2 + \|\nabla u\|_{L^2(U)}^2\right)
\end{align*}
for all small $|h|$ and all directions $k$. By the reverse direction of the [Difference Quotient Characterisation](/theorems/78), this uniform bound implies $\nabla u \in W^{1,2}(V)$, hence $u \in W^{2,2}(V)$. Since $V \subset \subset U$ was arbitrary, $u \in W^{2,2}_{loc}(U)$ with the estimate:
\begin{align*}
\|D^2 u\|_{L^2(V)} \le C\left(\|f\|_{L^2(U)} + \|\nabla u\|_{L^2(U)}\right).
\end{align*}
[/example]
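The discrete integration by parts identity that drives Step 2 can be verified on a grid. A minimal sketch (grid length, shift, and random data are arbitrary; $h$ is taken as an integer number of grid steps so the identity holds exactly for compactly supported grid functions):

```python
import random

# Σ (D^h f)·g = -Σ f·(D^{-h} g), where on the grid
# D^h f(i) = (f[i+k] - f[i])/h and D^{-h} g(i) = (g[i] - g[i-k])/h.
random.seed(0)
n, k = 64, 3                      # interior grid size; shift h = k grid steps
pad = [0.0] * (2 * k)             # zero-padding keeps the supports compact
f = pad + [random.random() for _ in range(n)] + pad
g = pad + [random.random() for _ in range(n)] + pad
h = float(k)

lhs = sum((f[i + k] - f[i]) / h * g[i] for i in range(len(f) - k))
rhs = -sum(f[i] * (g[i] - g[i - k]) / h for i in range(k, len(f)))
```

The two sums agree to rounding error: the proof is the same index shift $j = i + k$ that proves the continuous identity by change of variables, with the zero-padding playing the role of compact support (no boundary terms).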
### Density Arguments for Calculus Rules
Many properties of Sobolev functions (chain rules, product rules, change of variables) are proved by a three-step density argument:
1. **Approximate.** By the Meyers-Serrin Theorem ([Theorem 58](/theorems/58)), choose $u_m \in C^\infty(U) \cap W^{k,p}(U)$ with $u_m \to u$ in $W^{k,p}(U)$.
2. **Verify for smooth functions.** Prove the desired identity for $u_m$ using classical calculus (the chain rule, product rule, or divergence theorem all apply pointwise to smooth functions).
3. **Pass to the limit.** Use the $W^{k,p}$ convergence to show that each term in the identity converges, yielding the identity for $u$.
[example: Weak Derivative of the Positive Part]
Let $u \in W^{1,p}(U)$ with $1 \le p < \infty$. Define $u^+(x) = \max(u(x), 0)$. We prove that $u^+ \in W^{1,p}(U)$ with:
\begin{align*}
\nabla u^+ = \begin{cases} \nabla u & \text{a.e. on } \{u > 0\}, \\ 0 & \text{a.e. on } \{u \le 0\}. \end{cases}
\end{align*}
**Step 1: Smooth approximation of the $\max$ operation.** We cannot apply the density argument to $u$ directly (approximating $u$ by smooth functions does not help, because $\max$ applied to a smooth function is not smooth). Instead, we approximate the function $t \mapsto t^+ = \max(t, 0)$ by a family of smooth functions.
For $\delta > 0$, define $F_\delta: \mathbb{R} \to \mathbb{R}$ by:
\begin{align*}
F_\delta(t) = \begin{cases} t & \text{if } t > \delta, \\ \frac{(t + \delta)^2}{4\delta} & \text{if } |t| \le \delta, \\ 0 & \text{if } t < -\delta. \end{cases}
\end{align*}
One verifies: $F_\delta \in C^1(\mathbb{R})$ with $F_\delta'(t) \in [0, 1]$ for all $t$, $F_\delta(t) \to t^+$ as $\delta \to 0$ for all $t$, and $F_\delta'(t) \to \mathbb{1}_{\{t > 0\}}$ for all $t \neq 0$.
**Step 2: Apply the Sobolev chain rule to $F_\delta \circ u$.** Since $F_\delta$ is $C^1$ with bounded derivative ($\|F_\delta'\|_{L^\infty} \le 1$), the Sobolev chain rule (proved below) gives $F_\delta(u) \in W^{1,p}(U)$ with:
\begin{align*}
\nabla(F_\delta(u)) = F_\delta'(u) \nabla u \quad \text{a.e.}
\end{align*}
**Step 3: Pass $\delta \to 0$.** We verify the hypotheses of the closedness argument:
*$L^p$ convergence of $F_\delta(u) \to u^+$:* Since $|F_\delta(t) - t^+| \le \delta$ for all $t$ (direct verification in each piece), we have $\|F_\delta(u) - u^+\|_{L^p}^p \le \delta^p \cdot \mathcal{L}^n(U) \to 0$.
*$L^p$ convergence of the gradients:* Set $v(x) = \mathbb{1}_{\{u > 0\}}(x) \nabla u(x)$. We must show $F_\delta'(u) \nabla u \to v$ in $L^p(U)$.
\begin{align*}
\|F_\delta'(u) \nabla u - v\|_{L^p}^p = \int_U |F_\delta'(u(x)) - \mathbb{1}_{\{u > 0\}}(x)|^p \, |\nabla u(x)|^p \, d\mathcal{L}^n.
\end{align*}
The integrand converges to zero a.e.: if $u(x) > 0$ then $F_\delta'(u(x)) = 1$ once $\delta < u(x)$, and if $u(x) < 0$ then $F_\delta'(u(x)) = 0$ once $\delta < -u(x)$. On the level set $\{u = 0\}$ we have $\nabla u = 0$ a.e. (a standard fact: for $u \in W^{1,p}$ and any constant $c$, $\nabla u = 0$ a.e. on $\{u = c\}$, provable from the ACL characterisation), so the integrand vanishes a.e. there regardless of the value of $F_\delta'$.
Since $|F_\delta'(u)|^p |\nabla u|^p \le |\nabla u|^p \in L^1(U)$, the Dominated Convergence Theorem gives convergence.
**Step 4: Conclude.** By closedness of the gradient operator: $u^+ \in W^{1,p}(U)$ with $\nabla u^+ = \mathbb{1}_{\{u > 0\}} \nabla u$. An identical argument applied to $u^- = (-u)^+$ gives $\nabla u^- = -\mathbb{1}_{\{u < 0\}} \nabla u$, which is consistent with $u = u^+ - u^-$.
This result has an important corollary: if $u \in W^{1,p}(U)$, then $|u| = u^+ + u^- \in W^{1,p}(U)$ with $\nabla|u| = \operatorname{sgn}(u) \nabla u$ a.e. More generally, Lipschitz functions preserve Sobolev regularity — a fact that underpins the theory of truncation arguments in nonlinear PDE.
[/example]
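The claimed properties of $F_\delta$ from Step 1 can be checked numerically. A minimal sketch (grid resolution and tolerances are arbitrary); note the bound $|F_\delta(t) - t^+| \le \delta$ actually holds with the sharper constant $\delta/4$, attained at $t = 0$:

```python
def F(t, d):
    # C^1 approximation of t ↦ max(t, 0) from the example
    if t > d:
        return t
    if t < -d:
        return 0.0
    return (t + d) ** 2 / (4 * d)

def dF(t, d):
    if t > d:
        return 1.0
    if t < -d:
        return 0.0
    return (t + d) / (2 * d)

d = 0.01
grid = [-1 + 2 * i / 100_000 for i in range(100_001)]
max_gap = max(abs(F(t, d) - max(t, 0.0)) for t in grid)   # should equal δ/4
deriv_ok = all(0.0 <= dF(t, d) <= 1.0 for t in grid)      # F' ∈ [0, 1]
# C^1 matching at the gluing points t = ±δ
c1_match = (abs(F(d, d) - d) < 1e-12 and abs(F(-d, d)) < 1e-12
            and abs(dF(d, d) - 1.0) < 1e-12 and abs(dF(-d, d)) < 1e-12)
```

The quadratic bridge $(t+\delta)^2/(4\delta)$ is the unique parabola matching both the value and the slope of $0$ at $t = -\delta$ and of $t$ at $t = \delta$, which is why the gluing is $C^1$.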
The same density pattern establishes the product rule ($uv \in W^{1,p}$ for $u \in W^{1,p} \cap L^\infty$ and $v \in W^{1,p} \cap L^\infty$), the chain rule with $C^1$ functions of bounded derivative, and the composition with Lipschitz functions. In each case, the identity is obvious for smooth functions and extends to Sobolev functions by density and the Dominated Convergence Theorem.