[motivation]
### Why $W^{1,1}$ Is Too Small
Consider a family of piecewise-linear approximations to the Heaviside step function on $U = (-1, 1)$. For $0 < \epsilon < 1$, define:
\begin{align*}
u_\epsilon(x) := \begin{cases} 0 & \text{if } x < -\epsilon, \\ \frac{1}{2} + \frac{x}{2\epsilon} & \text{if } -\epsilon \le x \le \epsilon, \\ 1 & \text{if } x > \epsilon. \end{cases}
\end{align*}
Each $u_\epsilon$ is Lipschitz on the bounded interval $U$ (hence in $W^{1,1}(U)$), and its weak derivative is $u_\epsilon'(x) = \frac{1}{2\epsilon} \mathbb{1}_{(-\epsilon, \epsilon)}(x)$ for almost every $x$. The $L^1$ norm of the derivative is:
\begin{align*}
\|u_\epsilon'\|_{L^1(U)} = \int_{-\epsilon}^{\epsilon} \frac{1}{2\epsilon} \, d\mathcal{L}^1(x) = 1
\end{align*}
for every $\epsilon \in (0, 1)$. As $\epsilon \to 0$, the functions $u_\epsilon$ converge in $L^1(U)$ to the Heaviside function $H(x) = \mathbb{1}_{(0,1)}(x)$. The derivatives $u_\epsilon'$ have uniformly bounded $L^1$ norm, yet they converge (in the sense of distributions) not to an $L^1$ function but to the **Dirac measure** $\delta_0$:
\begin{align*}
\int_{-1}^1 u_\epsilon'(x) \phi(x) \, d\mathcal{L}^1(x) \to \phi(0) = \int_{-1}^1 \phi \, d\delta_0 \qquad \text{for all } \phi \in C^\infty_c(U).
\end{align*}
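This convergence can be checked directly. The following estimate is a sketch that uses only the continuity of $\phi$ at $0$ and the fact that $u_\epsilon'$ integrates to $1$ over $(-\epsilon, \epsilon)$:
\begin{align*}
\left| \int_{-1}^1 u_\epsilon'(x) \phi(x) \, d\mathcal{L}^1(x) - \phi(0) \right| = \left| \frac{1}{2\epsilon} \int_{-\epsilon}^{\epsilon} \big( \phi(x) - \phi(0) \big) \, d\mathcal{L}^1(x) \right| \le \sup_{|x| \le \epsilon} |\phi(x) - \phi(0)| \to 0 \qquad \text{as } \epsilon \to 0.
\end{align*}
Only continuity of $\phi$ is used, so the same limit holds for every continuous $\phi$ with compact support in $U$.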
Thus $H \notin W^{1,1}(U)$: its distributional derivative is the measure $\delta_0$, not an integrable function. Yet $H$ is the $L^1$-limit of $W^{1,1}$ functions whose derivatives are uniformly bounded in $L^1(U)$.
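For completeness, the $L^1$ convergence $u_\epsilon \to H$ can be made quantitative. The difference $u_\epsilon - H$ vanishes outside $(-\epsilon, \epsilon)$, and $|u_\epsilon - H|$ is even in $x$ (up to the single point $x = 0$), so a short computation gives:
\begin{align*}
\|u_\epsilon - H\|_{L^1(U)} = 2 \int_0^{\epsilon} \left( \frac{1}{2} - \frac{x}{2\epsilon} \right) d\mathcal{L}^1(x) = 2 \left( \frac{\epsilon}{2} - \frac{\epsilon}{4} \right) = \frac{\epsilon}{2} \to 0 \qquad \text{as } \epsilon \to 0.
\end{align*}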
### The Failure of Weak Compactness at $p = 1$
The root cause is that $L^1(U)$ is not reflexive. In a reflexive Banach space, every bounded sequence has a weakly convergent subsequence (a consequence of the Banach-Alaoglu theorem combined with reflexivity), and this is exactly the mechanism used in the [Difference Quotient Characterisation](/theorems/78) to extract weak derivatives from bounded difference quotients when $p > 1$.
At $p = 1$, the bounded family $\{u_\epsilon'\}$ has no weakly convergent subsequence in $L^1(U)$; it *escapes* to a measure. More precisely, $L^1(U)$ embeds isometrically into the space of finite signed Radon measures $\mathcal{M}(U)$ via $f \mapsto f \, \mathcal{L}^1$. Since $\mathcal{M}(U)$ is the dual of the separable space $C_0(U)$, the Banach-Alaoglu theorem applied in $\mathcal{M}(U)$ yields a subsequence that converges weak-$*$ to a measure. Weak-$*$ convergence of measures is therefore the correct topology for extracting limits.
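To see why no weak $L^1$ limit can exist, here is a sketch of the standard contradiction argument. Suppose some subsequence satisfied $u_{\epsilon_k}' \rightharpoonup g$ weakly in $L^1(U)$ with $\epsilon_k \to 0$. Since every $\phi \in C^\infty_c(U)$ lies in $L^\infty(U)$, the distributional limit computed above would force:
\begin{align*}
\int_{-1}^1 g(x) \phi(x) \, d\mathcal{L}^1(x) = \lim_{k \to \infty} \int_{-1}^1 u_{\epsilon_k}'(x) \phi(x) \, d\mathcal{L}^1(x) = \phi(0) \qquad \text{for all } \phi \in C^\infty_c(U).
\end{align*}
Taking $\phi$ supported in $U \setminus \{0\}$ shows $g = 0$ almost everywhere on $U \setminus \{0\}$, hence almost everywhere on $U$. But then the left-hand side is $0$ for every $\phi$, which contradicts the identity for any $\phi$ with $\phi(0) \neq 0$.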
### The Resolution: Allow Measure-Valued Derivatives
Instead of insisting that the distributional derivative $Du$ be an $L^1$ function, we enlarge the target space and require only that $Du$ be a **finite Radon measure**. This gives the space $BV(U)$. It is strictly larger than $W^{1,1}(U)$ (since it contains the Heaviside function and, more generally, characteristic functions of sufficiently regular [sets](/page/Set)), but it retains enough structure to support compactness theorems that play the role of the Rellich-Kondrachov theorem at the endpoint $p = 1$.
[/motivation]