Coarea Formula (General) (Theorem # 3078)
Theorem
Let $n \ge k \ge 1$ be integers, and let $f : \mathbb{R}^n \to \mathbb{R}^k$ be Lipschitz. For every $\mathcal{L}^n$-measurable set $A \subseteq \mathbb{R}^n$,
\begin{align*}
\int_A J_k f(x) \, d\mathcal{L}^n(x) = \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A) \, d\mathcal{L}^k(t).
\end{align*}
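As a sanity check of the statement (an illustrative special case, not taken from the source), let $n = 2$, $k = 1$, $f(x) = |x|$ and $A = B(0, r)$ for some $r > 0$. Here $f$ is $1$-Lipschitz with $J_1 f = |\nabla f| = 1$ away from the origin, and $f^{-1}(t)$ is the circle of radius $t$ for $t > 0$ and empty for $t < 0$, so both sides reduce to the area of the disc:
\begin{align*}
\int_{B(0,r)} J_1 f \, d\mathcal{L}^2 &= \mathcal{L}^2(B(0,r)) = \pi r^2, \\
\int_{\mathbb{R}} \mathcal{H}^{1}(f^{-1}(t) \cap B(0,r)) \, d\mathcal{L}^1(t) &= \int_0^r 2\pi t \, dt = \pi r^2.
\end{align*}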
Analysis
Measure Theory
Proof
[proofplan]
The proof reduces the general $k$-dimensional coarea formula to a local statement on the regular set, where $f$ factors as $f = \pi_k \circ \Phi$ for a bi-Lipschitz coordinate change $\Phi$ and the projection $\pi_k : \mathbb{R}^k \times \mathbb{R}^{n-k} \to \mathbb{R}^k$. After a countable-additivity reduction to bounded sets $A$ and the weight $g \equiv 1$, we split $\mathbb{R}^n = R \sqcup C$ into the regular set $R$, where $Df$ exists and $J_k f > 0$, and the critical set $C$, where $Df$ fails to exist or $J_k f$ vanishes. The critical set contributes nothing on either side: the left side vanishes because $J_k f = 0$ a.e. on $C$, and on the right Step 2 shows that $\mathcal{H}^{n-k}(f^{-1}(t) \cap A \cap C) = 0$ for $\mathcal{L}^k$-a.e. $t$. On the regular set, Lusin's theorem gives a closed set $F_\varepsilon$ on which $Df$ is continuous and $J_k f$ is bounded below; a Whitney-type $C^1$ extension $g_\varepsilon$ of $f|_{F_\varepsilon}$ (Federer 3.1.16) and the inverse function theorem, applied over a finite cover of $F_\varepsilon$ by small neighbourhoods, produce the coordinate changes $\Phi$, and the [Area Formula](/theorems/3075) applied to $\Phi^{-1}$ on slices converts the level-set integrals into a Tonelli integral on a Euclidean product. Letting $\varepsilon \to 0$ closes the argument.
[/proofplan]
[step:Reduce to bounded measurable $A$ and the weight $g \equiv 1$]
The hypotheses of the theorem give $f: \mathbb{R}^n \to \mathbb{R}^k$ Lipschitz with $n \ge k \ge 1$, and an $\mathcal{L}^n$-measurable set $A \subseteq \mathbb{R}^n$. Both
\begin{align*}
\Phi_L(A) &:= \int_A J_k f(x) \, d\mathcal{L}^n(x), \\
\Phi_R(A) &:= \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A) \, d\mathcal{L}^k(t),
\end{align*}
are $\sigma$-additive non-negative set functions on the $\mathcal{L}^n$-measurable subsets of $\mathbb{R}^n$. For $\Phi_L$ this is countable additivity of the integral. For $\Phi_R$, given a disjoint decomposition $A = \bigsqcup_{j} A_j$ into Borel sets, we have $\mathcal{H}^{n-k}(f^{-1}(t) \cap A) = \sum_j \mathcal{H}^{n-k}(f^{-1}(t) \cap A_j)$ for every $t$ by countable additivity of $\mathcal{H}^{n-k}$ on disjoint Borel sets, and the Monotone Convergence Theorem applied to the partial sums of these non-negative measurable functions transfers the identity to the integrals. Measurability of $t \mapsto \mathcal{H}^{n-k}(f^{-1}(t) \cap A)$ for Borel $A$ is the standard slicing-measurability theorem of Federer (Federer 2.10.25); in our setting it rests on the facts that $(t, x) \mapsto \mathbb{1}_A(x) \mathbb{1}_{f^{-1}(t)}(x)$ is jointly Borel and that $\mathcal{H}^{n-k}$ is a Borel-regular outer measure. A general $\mathcal{L}^n$-measurable $A$ differs from a Borel subset by an $\mathcal{L}^n$-null set (inner regularity of $\mathcal{L}^n$), and both the measurability and the additivity statements extend to this case. We accept these statements as prerequisites.
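By $\sigma$-additivity in $A$, it suffices to prove the formula when $A$ is contained in a fixed ball $B(0, R)$, $R > 0$; the general case follows by writing $A = \bigsqcup_{m=1}^\infty A \cap \big(B(0, m) \setminus B(0, m-1)\big)$, with the convention $B(0, 0) := \varnothing$. Henceforth fix $R > 0$ and assume $A \subseteq B(0, R)$.
Decompose $\mathbb{R}^n = R \sqcup C$ into the regular set $R$ and the critical set $C$ (the symbol $R$ for the regular set is distinct from the radius $R$ of $B(0, R)$; context distinguishes the two), where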
\begin{align*}
R &:= \{x \in \mathbb{R}^n : Df_x \text{ exists and } J_k f(x) > 0\}, \\
C &:= \mathbb{R}^n \setminus R = \{x : Df_x \text{ does not exist}\} \cup \{x : Df_x \text{ exists, } J_k f(x) = 0\}.
\end{align*}
By [Rademacher's Theorem](/theorems/3069), $f$ is differentiable $\mathcal{L}^n$-a.e., so the subset $\{x : Df_x \text{ does not exist}\}$ has $\mathcal{L}^n$-measure zero. The Jacobian
\begin{align*}
J_k f(x) := \sqrt{\det(Jf_x \, (Jf_x)^\top)}
\end{align*}
is then defined $\mathcal{L}^n$-a.e., and $R$, $C$ are both $\mathcal{L}^n$-measurable. We prove the formula separately for $A \cap C$ (Step 2) and $A \cap R$ (Steps 3-5), then combine.
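To make the Jacobian and the decomposition concrete, here is a small worked example (hypothetical, not from the source). Take $n = 3$, $k = 2$ and the $1$-Lipschitz map $f(x) = (x_1, \rho(x))$ with $\rho(x) := \sqrt{x_2^2 + x_3^2}$. For $\rho(x) > 0$,
\begin{align*}
Jf_x = \begin{pmatrix} 1 & 0 & 0 \\ 0 & x_2/\rho & x_3/\rho \end{pmatrix}, \qquad Jf_x \, (Jf_x)^\top = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad J_2 f(x) = 1,
\end{align*}
while $f$ is not differentiable on the $x_1$-axis $\{\rho = 0\}$, which is $\mathcal{L}^3$-null. So the regular set is the complement of the axis and the critical set is the axis itself. For the cylinder $A = [0,1] \times \{\rho < a\}$, the level set $f^{-1}(t_1, t_2) \cap A$ is a circle of radius $t_2$ when $0 \le t_1 \le 1$ and $0 < t_2 < a$, and both sides of the formula agree:
\begin{align*}
\int_A J_2 f \, d\mathcal{L}^3 = \mathcal{L}^3(A) = \pi a^2, \qquad \int_0^1 \int_0^a 2\pi t_2 \, dt_2 \, dt_1 = \pi a^2.
\end{align*}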
[/step]
[step:Show the critical set $C$ contributes nothing on either side]
We prove
\begin{align*}
\int_{A \cap C} J_k f \, d\mathcal{L}^n = 0, \qquad \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A \cap C) \, d\mathcal{L}^k(t) = 0.
\end{align*}
The first equality is immediate: $J_k f = 0$ at every $x \in C$ where the Jacobian is defined, and the remaining set in $C$ has $\mathcal{L}^n$-measure zero, so the integral vanishes.
For the second equality, note that the image $f(C)$ need not be $\mathcal{L}^k$-null when $n > k$: Sard's theorem requires $C^{n-k+1}$ regularity and can fail for $C^1$ maps (Whitney's example), and a Lipschitz map into $\mathbb{R}^k$ can send an $\mathcal{L}^n$-null set onto a set of positive $\mathcal{L}^k$-measure (the projection $(x_1, x_2) \mapsto x_1$ maps the null set $\mathbb{R} \times \{0\}$ onto $\mathbb{R}$). We therefore bound the level-set integral directly, using the coarea (Eilenberg) inequality (Federer 2.10.25; Evans-Gariepy, Section 3.4): for every Lipschitz map $h : \mathbb{R}^N \to \mathbb{R}^k$, every real $m \ge 0$ and every set $S \subseteq \mathbb{R}^N$,
\begin{align*}
\int^*_{\mathbb{R}^k} \mathcal{H}^{m}(h^{-1}(t) \cap S) \, d\mathcal{L}^k(t) \le c(m, k) \, \operatorname{Lip}(h)^k \, \mathcal{H}^{m+k}(S),
\end{align*}
where $\int^*$ denotes the upper integral and $c(m, k)$ depends only on $m$ and $k$. We accept this inequality as a prerequisite. In particular, since $\mathcal{H}^n = \mathcal{L}^n$ on $\mathbb{R}^n$, every $\mathcal{L}^n$-null set $S \subseteq \mathbb{R}^n$ satisfies $\mathcal{H}^{n-k}(f^{-1}(t) \cap S) = 0$ for $\mathcal{L}^k$-a.e. $t$.
Write $A \cap C = (A \cap N) \sqcup E$, where $N := \{x : Df_x \text{ does not exist}\}$ is $\mathcal{L}^n$-null and $E := \{x \in A : Df_x \text{ exists and } J_k f(x) = 0\}$. The null-set consequence above disposes of $A \cap N$:
\begin{align*}
\int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A \cap N) \, d\mathcal{L}^k(t) \le c(n-k, k) \, \operatorname{Lip}(f)^k \, \mathcal{L}^n(N) = 0.
\end{align*}
For $E$ we use a perturbation argument.
[claim:$\int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap E) \, d\mathcal{L}^k(t) = 0$]
[proof]
Fix $0 < \delta \le 1$ and define the Lipschitz maps
\begin{align*}
h_\delta : \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}^k, \quad h_\delta(x, y) := f(x) + \delta y, \qquad p : \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}^k, \quad p(x, y) := y.
\end{align*}
The map $h_\delta$ is differentiable at $(x, y)$ exactly when $f$ is differentiable at $x$, with Jacobian matrix $(Jf_x \mid \delta I_k)$. Writing $\sigma_1(x), \dots, \sigma_k(x)$ for the singular values of $Jf_x$, each bounded by $L = \operatorname{Lip}(f)$,
\begin{align*}
J_k h_\delta(x, y)^2 = \det\big(Jf_x \, (Jf_x)^\top + \delta^2 I_k\big) = \prod_{i=1}^k \big(\sigma_i(x)^2 + \delta^2\big).
\end{align*}
Hence $J_k h_\delta \ge \delta^k > 0$ wherever $h_\delta$ is differentiable, and for $x \in E$, where at least one $\sigma_i(x)$ vanishes,
\begin{align*}
J_k h_\delta(x, y) \le \delta \, (L^2 + 1)^{(k-1)/2}.
\end{align*}
Let $B^k$ be the open unit ball of $\mathbb{R}^k$, $\omega_k := \mathcal{L}^k(B^k)$, and $\widehat{E} := E \times B^k \subseteq \mathbb{R}^{n+k}$. By translation invariance of $\mathcal{L}^k$ the integral in the claim is unchanged if $t$ is replaced by $t - \delta w$; averaging over $w \in B^k$ and using Tonelli's theorem (joint measurability in $(t, w)$ is the measurability prerequisite of Step 1 applied to the Lipschitz map $(h_\delta, p)$),
\begin{align*}
\int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap E) \, d\mathcal{L}^k(t) = \frac{1}{\omega_k} \int_{\mathbb{R}^k} \int_{B^k} \mathcal{H}^{n-k}(f^{-1}(t - \delta w) \cap E) \, d\mathcal{L}^k(w) \, d\mathcal{L}^k(t).
\end{align*}
For $w \in B^k$ the set $\widehat{E} \cap h_\delta^{-1}(t) \cap p^{-1}(w) = (f^{-1}(t - \delta w) \cap E) \times \{w\}$ is an isometric copy of $f^{-1}(t - \delta w) \cap E$, and for $w \notin B^k$ it is empty. The coarea inequality applied to the $1$-Lipschitz map $p$ with $m = n - k$ on the set $\widehat{E} \cap h_\delta^{-1}(t)$ therefore gives
\begin{align*}
\int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap E) \, d\mathcal{L}^k(t) \le \frac{c(n-k, k)}{\omega_k} \int_{\mathbb{R}^k} \mathcal{H}^{n}(\widehat{E} \cap h_\delta^{-1}(t)) \, d\mathcal{L}^k(t).
\end{align*}
The right-hand integral is evaluated by the coarea formula for $h_\delta : \mathbb{R}^{n+k} \to \mathbb{R}^k$. Since $J_k h_\delta > 0$ wherever $h_\delta$ is differentiable, the critical set of $h_\delta$ consists only of its $\mathcal{L}^{n+k}$-null set of non-differentiability points, which contributes nothing by the coarea inequality; so the formula for $h_\delta$ follows from the regular-set argument of Steps 3-5 alone (with $n + k$ in place of $n$), which does not use the present claim. Thus
\begin{align*}
\int_{\mathbb{R}^k} \mathcal{H}^{n}(\widehat{E} \cap h_\delta^{-1}(t)) \, d\mathcal{L}^k(t) = \int_{\widehat{E}} J_k h_\delta \, d\mathcal{L}^{n+k} \le \delta \, (L^2 + 1)^{(k-1)/2} \, \omega_k \, \mathcal{L}^n(E).
\end{align*}
Combining the last two displays,
\begin{align*}
\int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap E) \, d\mathcal{L}^k(t) \le c(n-k, k) \, (L^2 + 1)^{(k-1)/2} \, \mathcal{L}^n(E) \, \delta.
\end{align*}
Since $\mathcal{L}^n(E) \le \mathcal{L}^n(B(0, R)) < \infty$, letting $\delta \to 0$ proves the claim.
[/proof]
[/claim]
Adding the contributions of $A \cap N$ and $E$,
\begin{align*}
\int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A \cap C) \, d\mathcal{L}^k(t) = 0,
\end{align*}
that is, $\mathcal{H}^{n-k}(f^{-1}(t) \cap A \cap C) = 0$ for $\mathcal{L}^k$-a.e. $t$.
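The following worked example (hypothetical, not from the source) illustrates how a critical set of positive measure can still contribute nothing to the right-hand side. Take $n = 2$, $k = 1$, $f(x_1, x_2) = \max(|x_1| - 1, 0)$ and $A = [-2, 2] \times [0, 1]$. Then $J_1 f = |\nabla f|$ equals $1$ for $|x_1| > 1$ and $0$ for $|x_1| < 1$, so the critical set contains the strip $\{|x_1| < 1\}$, which has positive area but is mapped to the single value $t = 0$. For $0 < t < 1$ the level set $f^{-1}(t) \cap A$ consists of the two unit segments $\{x_1 = \pm(1 + t)\} \times [0, 1]$, and
\begin{align*}
\int_A J_1 f \, d\mathcal{L}^2 &= \mathcal{L}^2\big(\{1 < |x_1| \le 2\} \times [0, 1]\big) = 2, \\
\int_{\mathbb{R}} \mathcal{H}^1(f^{-1}(t) \cap A) \, d\mathcal{L}^1(t) &= \int_0^1 2 \, dt = 2.
\end{align*}
The level $t = 0$, where $\mathcal{H}^1(f^{-1}(0) \cap A) = \infty$, is a single point of $\mathbb{R}$ and does not affect the integral.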
[/step]
[step:On the regular set, choose a Lusin-good closed subset where $Df$ is continuous]
We now work with $A_R := A \cap R \subseteq B(0, R)$. The Jacobian matrix $x \mapsto Jf_x$ is an $\mathcal{L}^n$-measurable map from $B(0, R)$ to $\mathbb{R}^{k \times n}$, defined a.e. (by [Rademacher's Theorem](/theorems/3069)), and bounded by $L = \operatorname{Lip}(f)$.
Fix $\varepsilon > 0$. By Lusin's theorem applied to the measurable map $Jf$, there exists a closed set $F_\varepsilon \subseteq B(0, R) \cap R$ such that
\begin{align*}
\mathcal{L}^n((R \cap B(0, R)) \setminus F_\varepsilon) &< \varepsilon, \\
x \mapsto Jf_x \text{ is continuous on } F_\varepsilon, \quad &J_k f \text{ continuous on } F_\varepsilon.
\end{align*}
Since $J_k f > 0$ on $R$ and is continuous on the closed set $F_\varepsilon$, after intersecting with $\{x : J_k f(x) \ge \eta\}$ for small $\eta > 0$ and noting that $\mathcal{L}^n(\{x \in F_\varepsilon : J_k f(x) < \eta\}) \to 0$ as $\eta \to 0$, we may further assume (after relabelling $\varepsilon$) that $J_k f \ge \eta_\varepsilon > 0$ on $F_\varepsilon$ for some $\eta_\varepsilon > 0$.
By a Whitney-Lusin extension argument (standard, e.g. [Federer 3.1.16] or Evans-Gariepy Theorem 6.13), after shrinking $F_\varepsilon$ slightly and relabelling $\varepsilon$ if necessary, there exists a $C^1$ map $g_\varepsilon: \mathbb{R}^n \to \mathbb{R}^k$ with $g_\varepsilon|_{F_\varepsilon} = f|_{F_\varepsilon}$ and $Dg_{\varepsilon}|_{F_\varepsilon} = Df|_{F_\varepsilon}$. Hence on $F_\varepsilon$, $J_k g_\varepsilon = J_k f \ge \eta_\varepsilon$. The set $\{J_k g_\varepsilon > \eta_\varepsilon / 2\}$ is open in $\mathbb{R}^n$ (by continuity of $Dg_\varepsilon$) and contains $F_\varepsilon$.
For each $x_0 \in F_\varepsilon$, since $g_\varepsilon$ is $C^1$ and $J_k g_{\varepsilon}(x_0) > 0$, the linear map $Dg_{\varepsilon, x_0}: \mathbb{R}^n \to \mathbb{R}^k$ has rank $k$. Let $K(x_0) := \ker(Dg_{\varepsilon, x_0}) \subseteq \mathbb{R}^n$, a subspace of dimension $n - k$, and $K(x_0)^\perp$ its orthogonal complement of dimension $k$. Define the auxiliary $C^1$ map
\begin{align*}
\Phi_{x_0}: \mathbb{R}^n &\to \mathbb{R}^k \times \mathbb{R}^{n-k} \\
x &\mapsto \big(g_\varepsilon(x),\, \pi_{K(x_0)}(x - x_0)\big),
\end{align*}
where $\pi_{K(x_0)}: \mathbb{R}^n \to K(x_0) \cong \mathbb{R}^{n-k}$ is orthogonal projection (we identify $K(x_0)$ with $\mathbb{R}^{n-k}$ via an orthonormal basis fixed once and for all). After the orthogonal change of basis $\mathbb{R}^n = K(x_0)^\perp \oplus K(x_0)$, the differential at $x_0$ has the block form
\begin{align*}
D\Phi_{x_0,\, x_0} = \begin{pmatrix} T & 0 \\ 0 & I_{n-k} \end{pmatrix},
\end{align*}
where $T: K(x_0)^\perp \to \mathbb{R}^k$ is the restriction of $Dg_{\varepsilon, x_0}$ to $K(x_0)^\perp$ (a linear isomorphism, since $Dg_{\varepsilon, x_0}$ has rank $k$ and kernel $K(x_0)$). The top-right block vanishes because $Dg_{\varepsilon, x_0}$ annihilates $K(x_0)$, and the bottom-left block vanishes because $\pi_{K(x_0)}$ annihilates $K(x_0)^\perp$. The block-diagonal determinant gives
\begin{align*}
|\det D\Phi_{x_0,\, x_0}| = |\det T| \cdot 1 = J_k g_\varepsilon(x_0) = J_k f(x_0),
\end{align*}
where we used the standard identity $J_k g_\varepsilon(x_0) = |\det(Dg_{\varepsilon, x_0}|_{K(x_0)^\perp})|$: the $k$-Jacobian of a rank-$k$ map equals the absolute determinant of its restriction to the orthogonal complement of the kernel. Indeed, in the adapted orthonormal basis the Jacobian matrix is $(T \mid 0)$, so $Jg \, Jg^\top = T T^\top$ and $J_k g = \sqrt{\det(T T^\top)} = |\det T|$; see also [Polar Decomposition](/theorems/3074).
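A small numerical instance of this determinant identity may help (a hypothetical example, not from the source). Take $n = 2$, $k = 1$, $x_0 = 0$ and the linear map $g(x) = 3x_1 + 4x_2$, so that $Dg = (3 \;\; 4)$, the kernel is spanned by the unit vector $u = \tfrac{1}{5}(4, -3)$ and its orthogonal complement by $v = \tfrac{1}{5}(3, 4)$. Then
\begin{align*}
J_1 g &= \sqrt{3^2 + 4^2} = 5, \qquad Dg(v) = \tfrac{1}{5}(9 + 16) = 5, \\
\Phi(x) &= \big(3x_1 + 4x_2,\; \tfrac{1}{5}(4x_1 - 3x_2)\big), \qquad
|\det D\Phi| = \left| \det \begin{pmatrix} 3 & 4 \\ 4/5 & -3/5 \end{pmatrix} \right| = \left| -\tfrac{9}{5} - \tfrac{16}{5} \right| = 5,
\end{align*}
so the absolute determinant of the coordinate change agrees with the Jacobian $J_1 g$, as the block computation predicts.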
By the classical $C^1$ inverse function theorem applied to $\Phi_{x_0}$ at $x_0$, there exist open neighbourhoods $U_{x_0} \ni x_0$ and $V_{x_0} \ni \Phi_{x_0}(x_0)$ such that $\Phi_{x_0}: U_{x_0} \to V_{x_0}$ is a $C^1$ diffeomorphism with $C^1$ inverse $\Psi_{x_0}: V_{x_0} \to U_{x_0}$. By compactness of $F_\varepsilon$ (closed and bounded in $\mathbb{R}^n$), we may choose a finite subcover $\{U_{x_j}\}_{j=1}^M$ of $F_\varepsilon$ and corresponding $\Phi_j := \Phi_{x_j}$, $\Psi_j := \Psi_{x_j}$ on $V_j := \Phi_j(U_j)$.
After replacing $U_j$ by $U_j' := U_j \setminus \bigcup_{i < j} U_i$ (a Borel, not necessarily open, refinement) we obtain a pairwise disjoint family $\{U_j'\}_{j=1}^M$ with union $\bigcup_j U_j \supseteq F_\varepsilon$, and each $\Phi_j$ restricts to an injective $C^1$ map on $U_j'$. Shrinking each neighbourhood $U_{x_0}$ beforehand to a ball on which $\|D\Phi_{x_0,\, x} - D\Phi_{x_0,\, x_0}\| \le \tfrac{1}{2} \|(D\Phi_{x_0,\, x_0})^{-1}\|^{-1}$ (possible by continuity of $Dg_\varepsilon$), the mean value inequality shows that $\Phi_j$ and $\Psi_j$ are Lipschitz on $U_j$ and $V_j$ respectively, with constants $L_j^+, L_j^- < \infty$ depending on $\varepsilon$ and $j$; in particular each $\Phi_j$ is bi-Lipschitz.
[/step]
[step:Apply the Area Formula on each piece to convert into a Fubini integral]
Fix one piece $U_j' \subseteq B(0, R)$ from Step 3 with diffeomorphism $\Phi_j: U_j' \to V_j' := \Phi_j(U_j')$ and inverse $\Psi_j$. Write $W_j := A_R \cap F_\varepsilon \cap U_j'$ for the regular-set portion of $A$ landing in this piece. We claim
\begin{align*}
\int_{W_j} J_k f(x) \, d\mathcal{L}^n(x) = \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap W_j) \, d\mathcal{L}^k(t).
\end{align*}
**Change of variables.** Since $\Phi_j$ is a $C^1$-diffeomorphism on a neighbourhood of $U_j'$, we apply the classical $C^1$ change-of-variables formula (or equivalently the [Area Formula](/theorems/3075) for the Lipschitz map $\Phi_j$ with $J_n \Phi_j = |\det D\Phi_j|$ and multiplicity one because $\Phi_j$ is injective):
\begin{align*}
\int_{W_j} J_k f(x) \, d\mathcal{L}^n(x)
= \int_{\Phi_j(W_j)} J_k f(\Psi_j(t, y)) \cdot |\det D\Psi_{j, (t,y)}| \, d\mathcal{L}^n(t, y).
\end{align*}
Here $(t, y) \in \mathbb{R}^k \times \mathbb{R}^{n-k}$ are coordinates on $V_j'$ given by $\Phi_j$. Since $\Phi_j$ is a $C^1$-diffeomorphism, $|\det D\Psi_{j, (t,y)}| = 1 / |\det D\Phi_{j, \Psi_j(t,y)}|$.
**Identification of the Jacobian factor.** At any point $x = \Psi_j(t, y) \in U_j'$, the differential $D\Phi_{j, x}: \mathbb{R}^n \to \mathbb{R}^k \times \mathbb{R}^{n-k}$ has the form
\begin{align*}
D\Phi_{j, x}(h) = \big(Dg_{\varepsilon, x}(h),\, \pi_{K(x_j)}(h)\big), \qquad h \in \mathbb{R}^n.
\end{align*}
This is the linear map $\mathbb{R}^n \to \mathbb{R}^k \oplus \mathbb{R}^{n-k}$ whose first component is $Dg_{\varepsilon, x}$ and whose second component is the *fixed* projection onto $K(x_j) = \ker(Dg_{\varepsilon, x_j})$ (independent of $x$). Note $K(x_j)$ is fixed for the piece $U_j'$, while $\ker(Dg_{\varepsilon, x})$ varies with $x$.
We compute $|\det D\Phi_{j, x}|$ using the orthogonal decomposition $\mathbb{R}^n = K(x_j)^\perp \oplus K(x_j)$ of the domain. In this decomposition the matrix of $D\Phi_{j, x}$ has the block form
\begin{align*}
D\Phi_{j, x} = \begin{pmatrix} Dg_{\varepsilon, x}|_{K(x_j)^\perp} & Dg_{\varepsilon, x}|_{K(x_j)} \\ 0 & I_{n-k} \end{pmatrix},
\end{align*}
where the bottom-left block is zero because $\pi_{K(x_j)}$ vanishes on $K(x_j)^\perp$ and the bottom-right is the identity on $K(x_j)$. By block-triangular determinant,
\begin{align*}
|\det D\Phi_{j, x}| = |\det(Dg_{\varepsilon, x}|_{K(x_j)^\perp})| \cdot |\det I_{n-k}| = |\det(Dg_{\varepsilon, x}|_{K(x_j)^\perp})|.
\end{align*}
For $x = x_j$, this equals $J_k f(x_j)$ (Step 3 computation). For general $x \in U_j'$ it differs from $J_k g_\varepsilon(x)$ by an "obliquity factor" measuring the angle between $\ker(Dg_{\varepsilon, x})$ and $K(x_j)$, which tends to $1$ as $x \to x_j$ by continuity of $Dg_\varepsilon$. Rather than estimating this factor, we account for it exactly by computing the level-set measures with the [Area Formula](/theorems/3075).
**Direct Area Formula route.** For fixed $t \in \mathbb{R}^k$, apply the [Area Formula](/theorems/3075) to the injective Lipschitz map $y \mapsto \Psi_j(t, y)$ on the slice $Y_j(t) := \{y \in \mathbb{R}^{n-k} : (t, y) \in V_j', \ \Psi_j(t, y) \in W_j\}$:
\begin{align*}
\mathcal{H}^{n-k}\big(\Psi_j(\{t\} \times Y_j(t))\big) = \int_{Y_j(t)} J_{n-k}(\Psi_j(t, \cdot))(y) \, d\mathcal{L}^{n-k}(y).
\end{align*}
The image $\Psi_j(\{t\} \times Y_j(t))$ is precisely $f^{-1}(t) \cap W_j$: indeed, $\Phi_j(x) = (g_\varepsilon(x), \pi_{K(x_j)}(x - x_j)) = (f(x), \pi_{K(x_j)}(x - x_j))$ for $x \in F_\varepsilon \cap U_j'$ (since $f = g_\varepsilon$ on $F_\varepsilon$), so $\Phi_j$ maps $f^{-1}(t) \cap W_j$ bijectively onto $\{t\} \times Y_j(t)$, and $\Psi_j$ inverts this.
Therefore
\begin{align*}
\int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap W_j) \, d\mathcal{L}^k(t)
&= \int_{\mathbb{R}^k} \int_{Y_j(t)} J_{n-k}(\Psi_j(t, \cdot))(y) \, d\mathcal{L}^{n-k}(y) \, d\mathcal{L}^k(t) \\
&= \int_{\Phi_j(W_j)} J_{n-k}(\Psi_j(t, \cdot))(y) \, d\mathcal{L}^n(t, y),
\end{align*}
using Tonelli's theorem: the integrand is non-negative and measurable on the $\sigma$-finite product space $\mathbb{R}^k \times \mathbb{R}^{n-k}$, and $Y_j(t)$ is the $t$-slice of $\Phi_j(W_j)$, which is empty when no point of $\Phi_j(W_j)$ has first coordinate $t$. On the other hand, the change-of-variables computation above yields
\begin{align*}
\int_{W_j} J_k f(x) \, d\mathcal{L}^n(x) = \int_{\Phi_j(W_j)} J_k f(\Psi_j(t, y)) \cdot |\det D\Psi_{j, (t,y)}| \, d\mathcal{L}^n(t, y).
\end{align*}
**The pointwise Jacobian identity.** It remains to show
\begin{align*}
J_k f(\Psi_j(t, y)) \cdot |\det D\Psi_{j, (t,y)}| = J_{n-k}(\Psi_j(t, \cdot))(y)
\end{align*}
for $\mathcal{L}^n$-a.e. $(t, y) \in \Phi_j(W_j)$. Set $x = \Psi_j(t, y)$. The matrix $D\Psi_{j, (t, y)}$ is the inverse of $D\Phi_{j, x}$. The differential $D(\Psi_j(t, \cdot))_y : \mathbb{R}^{n-k} \to \mathbb{R}^n$ is the restriction of $D\Psi_{j, (t,y)}$ to the second factor $\{0\} \times \mathbb{R}^{n-k}$, i.e. its last $n - k$ columns. Since $g_\varepsilon(\Psi_j(t, y)) = t$ for all $y$, its image is the tangent space at $x$ to the level set $g_\varepsilon^{-1}(t)$, namely $\ker(Dg_{\varepsilon, x})$, which equals $\ker(Df_x)$ for $x \in F_\varepsilon$ (where $f = g_\varepsilon$ and the differentials match). By the definition of the Jacobian of an injective linear map $\mathbb{R}^{n-k} \to \mathbb{R}^n$,
\begin{align*}
J_{n-k}(\Psi_j(t, \cdot))(y) = \big|\det\big(D(\Psi_j(t, \cdot))_y^\top D(\Psi_j(t, \cdot))_y\big)\big|^{1/2}.
\end{align*}
Algebraically, write $M := D\Phi_{j, x} \in \mathbb{R}^{n \times n}$, viewed as a block $\begin{pmatrix} P & Q \\ 0 & I \end{pmatrix}$ in the basis $K(x_j)^\perp \oplus K(x_j)$ of the domain and $\mathbb{R}^k \oplus \mathbb{R}^{n-k}$ of the codomain, where $P = Dg_{\varepsilon, x}|_{K(x_j)^\perp} \in \mathbb{R}^{k \times k}$ is invertible (by transversality on $U_j'$, after possibly shrinking) and $Q = Dg_{\varepsilon, x}|_{K(x_j)} \in \mathbb{R}^{k \times (n-k)}$. Then
\begin{align*}
M^{-1} = \begin{pmatrix} P^{-1} & -P^{-1} Q \\ 0 & I \end{pmatrix},
\end{align*}
and $D(\Psi_j(t, \cdot))_y$ corresponds to the last $n-k$ columns of $M^{-1}$:
\begin{align*}
D(\Psi_j(t, \cdot))_y = \begin{pmatrix} -P^{-1} Q \\ I \end{pmatrix} \in \mathbb{R}^{n \times (n-k)}.
\end{align*}
Computing the Gramian:
\begin{align*}
D(\Psi_j(t, \cdot))_y^\top D(\Psi_j(t, \cdot))_y = Q^\top P^{-\top} P^{-1} Q + I = I + Q^\top (P P^\top)^{-1} Q.
\end{align*}
Hence
\begin{align*}
J_{n-k}(\Psi_j(t, \cdot))(y)^2 = \det(I + Q^\top (PP^\top)^{-1} Q).
\end{align*}
On the left side, $|\det M| = |\det P|$ (block triangular), so $|\det M^{-1}| = 1/|\det P|$. And $J_k f(x)^2 = \det(Jf_x \, Jf_x^\top)$ where in the block decomposition $Jf_x = (P \mid Q)$, giving
\begin{align*}
Jf_x \, Jf_x^\top = P P^\top + Q Q^\top.
\end{align*}
Therefore
\begin{align*}
\big(J_k f(x) \cdot |\det M^{-1}|\big)^2 = \frac{\det(PP^\top + Q Q^\top)}{(\det P)^2} = \det\big(P^{-1} (PP^\top + Q Q^\top) (P^\top)^{-1}\big) = \det(I_k + (P^{-1} Q)(P^{-1} Q)^\top).
\end{align*}
By the Sylvester determinant identity $\det(I_k + UU^\top) = \det(I_{n-k} + U^\top U)$ for $U = P^{-1} Q \in \mathbb{R}^{k \times (n-k)}$,
\begin{align*}
\det(I_k + (P^{-1} Q)(P^{-1} Q)^\top) = \det(I_{n-k} + Q^\top (P^\top)^{-1} P^{-1} Q) = \det(I_{n-k} + Q^\top (PP^\top)^{-1} Q),
\end{align*}
where the final step uses $(P^\top)^{-1} P^{-1} = (P P^\top)^{-1}$ for square invertible $P$. This equals $J_{n-k}(\Psi_j(t, \cdot))(y)^2$, proving
\begin{align*}
J_k f(x) \cdot |\det D\Psi_{j, (t,y)}| = J_{n-k}(\Psi_j(t, \cdot))(y).
\end{align*}
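As a quick check of this identity in the smallest case (an illustrative computation; the values $p = 3$, $q = 4$ are arbitrary choices, not from the source), let $n = 2$ and $k = 1$, so that $P = p$ and $Q = q$ are scalars with $p \ne 0$. Then
\begin{align*}
J_1 f(x) \cdot |\det M^{-1}| &= \frac{\sqrt{p^2 + q^2}}{|p|} = \frac{5}{3}, \\
J_{1}(\Psi_j(t, \cdot))(y) &= \left| \begin{pmatrix} -q/p \\ 1 \end{pmatrix} \right| = \sqrt{1 + \frac{q^2}{p^2}} = \sqrt{1 + \frac{16}{9}} = \frac{5}{3},
\end{align*}
and the two sides agree. When $q = 0$, that is, when $\ker(Dg_{\varepsilon, x}) = K(x_j)$, both sides equal $1$.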
Combining,
\begin{align*}
\int_{W_j} J_k f(x) \, d\mathcal{L}^n(x)
&= \int_{\Phi_j(W_j)} J_{n-k}(\Psi_j(t, \cdot))(y) \, d\mathcal{L}^n(t, y) \\
&= \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap W_j) \, d\mathcal{L}^k(t),
\end{align*}
which is the local coarea formula on $W_j$.
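The local formula can be verified by hand for a linear map (a hypothetical example, not from the source). Take $n = 2$, $k = 1$, $f(x) = 3x_1 + 4x_2$ and $W = [0, 1]^2$, so $J_1 f = 5$ and the left-hand side is $5 \, \mathcal{L}^2(W) = 5$. The level set $f^{-1}(t) \cap W$ is the segment $\{x_2 = (t - 3x_1)/4\}$ over those $x_1 \in [0, 1]$ with $0 \le (t - 3x_1)/4 \le 1$, and its length is $\tfrac{5}{4}$ times the length of that $x_1$-interval:
\begin{align*}
\mathcal{H}^1(f^{-1}(t) \cap W) =
\begin{cases}
\tfrac{5}{12} \, t, & 0 \le t \le 3, \\
\tfrac{5}{4}, & 3 \le t \le 4, \\
\tfrac{5}{12} \, (7 - t), & 4 \le t \le 7,
\end{cases}
\end{align*}
and zero otherwise. Integrating,
\begin{align*}
\int_{\mathbb{R}} \mathcal{H}^1(f^{-1}(t) \cap W) \, d\mathcal{L}^1(t) = \frac{15}{8} + \frac{5}{4} + \frac{15}{8} = 5,
\end{align*}
which matches the left-hand side.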
[/step]
[step:Globalise via the finite cover and let $\varepsilon \to 0$]
The pieces $W_j = A_R \cap F_\varepsilon \cap U_j'$ for $j = 1, \dots, M$ are pairwise disjoint and their union is $A_R \cap F_\varepsilon$, since the sets $U_j' = U_j \setminus \bigcup_{i<j} U_i$ are pairwise disjoint and cover $F_\varepsilon$. Summing the local formula over $j$:
\begin{align*}
\int_{A_R \cap F_\varepsilon} J_k f \, d\mathcal{L}^n = \sum_{j=1}^M \int_{W_j} J_k f \, d\mathcal{L}^n = \sum_{j=1}^M \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap W_j) \, d\mathcal{L}^k(t).
\end{align*}
The right side equals
\begin{align*}
\int_{\mathbb{R}^k} \sum_{j=1}^M \mathcal{H}^{n-k}(f^{-1}(t) \cap W_j) \, d\mathcal{L}^k(t) = \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A_R \cap F_\varepsilon) \, d\mathcal{L}^k(t),
\end{align*}
by the additivity of $\mathcal{H}^{n-k}$ on the disjoint sets $f^{-1}(t) \cap W_j$ and linearity of the integral (the sum is finite and every term is non-negative, so no integrability issue arises).
**The leftover set $A_R \setminus F_\varepsilon$.** By construction $\mathcal{L}^n(A_R \setminus F_\varepsilon) < \varepsilon$, hence
\begin{align*}
\int_{A_R \setminus F_\varepsilon} J_k f \, d\mathcal{L}^n \le L^k \cdot \varepsilon \to 0 \quad \text{as } \varepsilon \to 0,
\end{align*}
using the bound $J_k f \le L^k$ (which follows from $|Jf_x| \le L$ and the Hadamard inequality applied to the rows of $Jf_x$, equivalently from $J_k f = \sqrt{\det(Jf \, Jf^\top)}$ being a product of singular values each bounded by $L$).
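For a concrete instance of the bound $J_k f \le L^k$ (an illustrative matrix, not from the source), take $k = 2$, $n = 3$ and
\begin{align*}
Jf_x = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad Jf_x \, (Jf_x)^\top = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}, \qquad J_2 f(x) = \sqrt{8} = 2\sqrt{2}.
\end{align*}
The singular values are $2$ and $\sqrt{2}$, so the operator norm is $2$ and the bound reads $2\sqrt{2} \le 2^2 = 4$.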
For the right-hand side, the matching statement is
\begin{align*}
\int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap (A_R \setminus F_\varepsilon)) \, d\mathcal{L}^k(t) \to 0 \quad \text{as } \varepsilon \to 0.
\end{align*}
Rather than proving this separately, we pass to the limit in the main terms by monotone convergence: choose $\varepsilon = 1/m$ and arrange the sets to be nested, $F_{1/m} \subseteq F_{1/(m+1)} \subseteq \cdots$ (always achievable by replacing $F_{1/m}$ by the finite union $\bigcup_{m' \le m} F_{1/m'}$, which is again closed and on which $J_k f$ is still continuous and bounded below), so that $A_R \setminus F_{1/m} \searrow A_R \setminus \bigcup_m F_{1/m}$, the limit set having $\mathcal{L}^n$-measure zero. Then the local formula applied to each $A_R \cap F_{1/m}$ shows the LHS satisfies
\begin{align*}
\int_{A_R \cap F_{1/m}} J_k f \, d\mathcal{L}^n \nearrow \int_{A_R} J_k f \, d\mathcal{L}^n
\end{align*}
by Monotone Convergence. The corresponding RHS sequence
\begin{align*}
I_m := \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A_R \cap F_{1/m}) \, d\mathcal{L}^k(t)
\end{align*}
is monotone increasing (since $A_R \cap F_{1/m}$ increases) and bounded by the LHS limit. The Monotone Convergence Theorem applied to the integrand $t \mapsto \mathcal{H}^{n-k}(f^{-1}(t) \cap A_R \cap F_{1/m})$ (increasing in $m$) gives
\begin{align*}
I_m \to \int_{\mathbb{R}^k} \mathcal{H}^{n-k}\Big(f^{-1}(t) \cap A_R \cap \bigcup_m F_{1/m}\Big) \, d\mathcal{L}^k(t).
\end{align*}
Since $N_\infty := A_R \setminus \bigcup_m F_{1/m}$ has $\mathcal{L}^n$-measure zero, the coarea (Eilenberg) inequality $\int^*_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap N_\infty) \, d\mathcal{L}^k(t) \le c \, \operatorname{Lip}(f)^k \, \mathcal{L}^n(N_\infty) = 0$, with $c$ depending only on $n$ and $k$ (Federer 2.10.25; Evans-Gariepy, Section 3.4), gives $\mathcal{H}^{n-k}(f^{-1}(t) \cap N_\infty) = 0$ for $\mathcal{L}^k$-a.e. $t$, so
\begin{align*}
\int_{\mathbb{R}^k} \mathcal{H}^{n-k}\Big(f^{-1}(t) \cap A_R \cap \bigcup_m F_{1/m}\Big) \, d\mathcal{L}^k(t) = \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A_R) \, d\mathcal{L}^k(t).
\end{align*}
Combining the two limits:
\begin{align*}
\int_{A_R} J_k f \, d\mathcal{L}^n = \lim_m I_m = \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A_R) \, d\mathcal{L}^k(t).
\end{align*}
This is the coarea formula on the regular set.
[/step]
[step:Combine the regular and critical contributions for the weight $g \equiv 1$, then pass to general $g$]
For $A \subseteq B(0, R)$ measurable, write $A = A_R \sqcup (A \cap C)$. By Step 5,
\begin{align*}
\int_{A_R} J_k f \, d\mathcal{L}^n = \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A_R) \, d\mathcal{L}^k(t).
\end{align*}
By Step 2, $\int_{A \cap C} J_k f \, d\mathcal{L}^n = 0$ and $\int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A \cap C) \, d\mathcal{L}^k(t) = 0$. Adding the two contributions, by additivity of both sides on the disjoint union $A = A_R \sqcup (A \cap C)$ (Step 1):
\begin{align*}
\int_A J_k f \, d\mathcal{L}^n = \int_{\mathbb{R}^k} \mathcal{H}^{n-k}(f^{-1}(t) \cap A) \, d\mathcal{L}^k(t).
\end{align*}
Removing the restriction $A \subseteq B(0, R)$ by Step 1's countable-additivity reduction yields the formula for arbitrary $\mathcal{L}^n$-measurable $A \subseteq \mathbb{R}^n$.
**Passage to general $g \ge 0$ measurable.** The coarea formula with arbitrary non-negative measurable weight $g: \mathbb{R}^n \to [0, \infty]$,
\begin{align*}
\int_{\mathbb{R}^n} g(x) \, J_k f(x) \, d\mathcal{L}^n(x) = \int_{\mathbb{R}^k} \int_{f^{-1}(t)} g \, d\mathcal{H}^{n-k} \, d\mathcal{L}^k(t),
\end{align*}
follows from the indicator case $g = \mathbb{1}_A$ just established by the standard extension procedure of Lebesgue integration: linearity gives the formula for non-negative simple functions $g = \sum_{i=1}^N c_i \mathbb{1}_{A_i}$; the Monotone Convergence Theorem, applied on both sides to an increasing sequence of simple functions $g_m \nearrow g$, extends it to non-negative measurable $g$; and a signed $g$ with $g \, J_k f \in L^1(\mathbb{R}^n)$ is handled by splitting $g = g^+ - g^-$. Every application of the Monotone Convergence Theorem involves only non-negative measurable integrands, so the interchange of limit and integral needs no further hypothesis.
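A classical use of the weighted formula (an illustrative example, not from the source) takes $n = 2$, $k = 1$, $f(x) = |x|$ and $g(x) = e^{-|x|^2}$. Since $J_1 f = 1$ a.e. and $f^{-1}(t)$ is the circle of radius $t$ for $t > 0$, on which $g \equiv e^{-t^2}$,
\begin{align*}
\int_{\mathbb{R}^2} e^{-|x|^2} \, d\mathcal{L}^2(x) = \int_0^\infty e^{-t^2} \cdot 2\pi t \, dt = \pi \big[ -e^{-t^2} \big]_0^\infty = \pi,
\end{align*}
which is the usual polar-coordinates evaluation of the Gaussian integral.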
This completes the proof.
[/step]