[proofplan]
The strategy is to control the Taylor remainder $v_x(y) := u(y) - u(x) - \nabla u(x) \cdot (y - x)$ in a power-weighted average and show that for $\mathcal{L}^n$-almost every $x$, the quantity $\fint_{B(x,r)} (|v_x(y)|/|y-x|)^{p^*} \, d\mathcal{L}^n(y)$ tends to $0$ as $r \to 0$, where $p^* := np/(n-p)$. We work at points $x$ that are simultaneously Lebesgue points of $u$ and of $\nabla u$, and where the Hardy-Littlewood maximal function $M(|\nabla u - \nabla u(x)|^p)(x)$ is finite — this set is of full measure. We decompose $B(x, r)$ into dyadic shells $A_k = B(x, 2^{-k} r) \setminus B(x, 2^{-k-1} r)$, on each of which $|y - x| \asymp 2^{-k} r$, then apply Sobolev-Poincaré on the smaller ball $B(x, 2^{-k} r)$ after re-centering $v_x$ at its mean on that ball. The technical heart of the argument is a telescoping bound on the local mean $m_k(r) := (v_x)_{B(x, 2^{-k} r)}$ in terms of the gradient-deviation averages $G_j(x, r) := (\fint_{B(x, 2^{-j} r)} |\nabla u - \nabla u(x)|^p)^{1/p}$, derived by applying Sobolev-Poincaré to $\tilde u(y) := u(y) - \nabla u(x) \cdot (y - x)$ on each dyadic ball and exploiting the linear-mean-zero observation that $\tilde u_{B(x, \rho)} = u_{B(x, \rho)} - u(x) + (m_k\text{-correction})$. The shell sum is controlled by Lebesgue-Differentiation pointwise convergence, with the maximal function providing a summable dominator for the dominated convergence step. Approximate differentiability follows from $L^{p^*}$-differentiability via Markov's inequality.
[/proofplan]
[step:Set up: Sobolev conjugate exponent, precise representatives, and a full-measure good set]
Throughout, we work under the standing hypothesis $u \in W^{1,p}_{\mathrm{loc}}(\Omega)$ with $\Omega \subseteq \mathbb{R}^n$ open and $1 \le p < n$. Define the Sobolev conjugate exponent
\begin{align*}
p^* := \frac{np}{n - p},
\end{align*}
which is finite and satisfies $p^* > p \ge 1$ under $1 \le p < n$.
We define the precise (Lebesgue) representatives of $u$ and $\nabla u$. For $u \in L^1_{\mathrm{loc}}(\Omega)$, set
\begin{align*}
u^*: \Omega &\to \mathbb{R} \cup \{*\} \\
x &\mapsto \begin{cases} \displaystyle \lim_{r \to 0^+} \fint_{B(x, r)} u(y) \, d\mathcal{L}^n(y) & \text{if the limit exists} \\ 0 & \text{otherwise.} \end{cases}
\end{align*}
For $\nabla u \in L^p_{\mathrm{loc}}(\Omega; \mathbb{R}^n) \subseteq L^1_{\mathrm{loc}}$, define analogously the componentwise precise representative
\begin{align*}
(\nabla u)^*: \Omega &\to \mathbb{R}^n \\
x &\mapsto \begin{cases} \displaystyle \lim_{r \to 0^+} \fint_{B(x, r)} \nabla u(y) \, d\mathcal{L}^n(y) & \text{if the (componentwise) limit exists} \\ 0 & \text{otherwise.} \end{cases}
\end{align*}
By the Lebesgue Differentiation Theorem, $u^*(x) = u(x)$ and $(\nabla u)^*(x) = \nabla u(x)$ for $\mathcal{L}^n$-a.e. $x \in \Omega$. We henceforth replace $u$ by $u^*$ and $\nabla u$ by $(\nabla u)^*$ and write $u(x)$, $\nabla u(x)$ for the precise pointwise values without further comment.
Define the Hardy-Littlewood maximal function for a locally integrable $f: \Omega \to \mathbb{R}$ by
\begin{align*}
M f: \Omega &\to [0, \infty] \\
x &\mapsto \sup_{r > 0 : B(x, r) \subseteq \Omega} \fint_{B(x, r)} |f(y)| \, d\mathcal{L}^n(y).
\end{align*}
Fix $x_0 \in \Omega$ and $r_0 > 0$ with $\overline{B}(x_0, 2 r_0) \subseteq \Omega$. We will establish the desired conclusion at $\mathcal{L}^n$-a.e. point of $B(x_0, r_0)$; since $\Omega$ admits a countable cover by such balls, this yields the conclusion on all of $\Omega$. Set
\begin{align*}
\Omega_0 := \big\{ x \in B(x_0, r_0) : (i), (ii), (iii) \text{ hold} \big\},
\end{align*}
where:
- (i) $x$ is a Lebesgue point of $u$, so $\lim_{\rho \to 0} \fint_{B(x, \rho)} u \, d\mathcal{L}^n = u(x)$;
- (ii) $x$ is a Lebesgue point of $\nabla u$ (componentwise), so $\lim_{\rho \to 0} \fint_{B(x, \rho)} |\nabla u - \nabla u(x)|^p \, d\mathcal{L}^n = 0$ (this is the stronger $L^p$-Lebesgue condition, equivalent for $|\nabla u|^p \in L^1_{\mathrm{loc}}$);
- (iii) $M\big(|\nabla u - \nabla u(x)|^p \mathbb{1}_{B(x_0, 2r_0)}\big)(x) < \infty$.
Each of (i), (ii), (iii) holds at $\mathcal{L}^n$-a.e. $x \in B(x_0, r_0)$: (i) and (ii) by the Lebesgue Differentiation Theorem (applied to $u \in L^1_{\mathrm{loc}}$ and to $|\nabla u - \nabla u(x)|^p \in L^1_{\mathrm{loc}}$ pointwise in $x$ — note this is the strong $L^p$ Lebesgue point condition); (iii) by the weak (1,1) inequality of the Hardy-Littlewood maximal theorem applied to $|\nabla u - \nabla u(x)|^p \mathbb{1}_{B(x_0, 2r_0)} \in L^1$ (note: the strong $(p,p)$ inequality is irrelevant here since we are in the $L^1$ class for the integrand $|\nabla u - \nabla u(x)|^p$, and the relevant statement is precisely that the maximal function of an $L^1$ function is finite a.e.). Hence $\mathcal{L}^n(B(x_0, r_0) \setminus \Omega_0) = 0$. We work with $x \in \Omega_0$ and $0 < r \le r_0$ so that $\overline{B}(x, r) \subseteq B(x_0, 2 r_0) \subseteq \Omega$.
[/step]
[step:Define the Taylor remainder $v_x$ and reduce $\Phi(x, r)$ to dyadic-ball averages]
For $x \in \Omega_0$ and $0 < r \le r_0$, define the Taylor remainder
\begin{align*}
v_x: B(x, r) &\to \mathbb{R} \\
y &\mapsto u(y) - u(x) - \nabla u(x) \cdot (y - x).
\end{align*}
Since the affine map $y \mapsto u(x) + \nabla u(x) \cdot (y - x)$ is smooth, $v_x \in W^{1,p}(B(x, r))$ with weak gradient
\begin{align*}
\nabla v_x(y) = \nabla u(y) - \nabla u(x) \qquad \text{for $\mathcal{L}^n$-a.e. } y \in B(x, r).
\end{align*}
Write $\omega_n := \mathcal{L}^n(B(0, 1))$ so $\mathcal{L}^n(B(x, \rho)) = \omega_n \rho^n$. For a measurable set $E$ with $0 < \mathcal{L}^n(E) < \infty$ and a measurable $f$ on $E$, write $f_E := \fint_E f \, d\mathcal{L}^n$. The goal is:
\begin{align*}
\Phi(x, r) := \fint_{B(x, r)} \left( \frac{|v_x(y)|}{|y - x|} \right)^{p^*} d\mathcal{L}^n(y) \to 0 \qquad (r \to 0^+) \text{ for every } x \in \Omega_0.
\end{align*}
For $k \in \mathbb{N}_0 = \{0, 1, 2, \dots\}$, let $A_k := B(x, 2^{-k} r) \setminus B(x, 2^{-k-1} r)$ be the dyadic shell. Then $B(x, r) \setminus \{x\} = \bigsqcup_{k \ge 0} A_k$ (disjoint), and $\mathcal{L}^n(\{x\}) = 0$. For $y \in A_k$ we have $|y - x| \ge 2^{-k-1} r$, so $|y - x|^{-p^*} \le 2^{(k+1) p^*} r^{-p^*}$. Splitting $\Phi(x, r)$ over shells:
\begin{align*}
\Phi(x, r) &= \frac{1}{\omega_n r^n} \sum_{k = 0}^\infty \int_{A_k} \frac{|v_x(y)|^{p^*}}{|y - x|^{p^*}} \, d\mathcal{L}^n(y) \\
&\le \frac{1}{\omega_n r^n} \sum_{k = 0}^\infty \frac{2^{(k+1) p^*}}{r^{p^*}} \int_{A_k} |v_x|^{p^*} d\mathcal{L}^n.
\end{align*}
Since $A_k \subseteq B(x, 2^{-k} r)$ and $\mathcal{L}^n(B(x, 2^{-k} r)) = \omega_n (2^{-k} r)^n$,
\begin{align*}
\int_{A_k} |v_x|^{p^*} d\mathcal{L}^n \le \omega_n (2^{-k} r)^n \fint_{B(x, 2^{-k} r)} |v_x|^{p^*} d\mathcal{L}^n.
\end{align*}
Combining,
\begin{align*}
\Phi(x, r) \le \frac{2^{p^*}}{r^{p^*}} \sum_{k = 0}^\infty 2^{k(p^* - n)} \fint_{B(x, 2^{-k} r)} |v_x|^{p^*} d\mathcal{L}^n.
\end{align*}
This reduces the problem to bounding the dyadic-ball averages $\fint_{B(x, 2^{-k} r)} |v_x|^{p^*} \, d\mathcal{L}^n$.
We will need the linear-mean-zero observation. By the change of variables $y = x + z$ on $B(x, \rho)$, and using that the Lebesgue measure on $B(0, \rho)$ is invariant under $z \mapsto -z$,
\begin{align*}
\fint_{B(x, \rho)} \nabla u(x) \cdot (y - x) \, d\mathcal{L}^n(y) = \nabla u(x) \cdot \fint_{B(0, \rho)} z \, d\mathcal{L}^n(z) = \nabla u(x) \cdot 0 = 0.
\end{align*}
Hence for the auxiliary function
\begin{align*}
\tilde u_x: B(x, r) &\to \mathbb{R} \\
y &\mapsto u(y) - \nabla u(x) \cdot (y - x),
\end{align*}
we have $\tilde u_x \in W^{1,p}(B(x, r))$ with $\nabla \tilde u_x = \nabla u - \nabla u(x)$ a.e., and the linear-mean-zero observation gives
\begin{align*}
(\tilde u_x)_{B(x, \rho)} = u_{B(x, \rho)} - \nabla u(x) \cdot \fint_{B(x, \rho)}(y - x) \, d\mathcal{L}^n(y) = u_{B(x, \rho)}.
\end{align*}
We use this auxiliary function in the next step.
[/step]
[step:Key lemma: telescoping bound on the local mean $m_k(r)$ in terms of $G_j(x, r)$]
Define
\begin{align*}
m_k(r) &:= (v_x)_{B(x, 2^{-k} r)} = \fint_{B(x, 2^{-k} r)} v_x \, d\mathcal{L}^n, \\
G_j(x, r) &:= \left( \fint_{B(x, 2^{-j} r)} |\nabla u(y) - \nabla u(x)|^p \, d\mathcal{L}^n(y) \right)^{1/p}.
\end{align*}
By the linear-mean-zero observation of Step 2, $m_k(r) = u_{B(x, 2^{-k} r)} - u(x)$.
[claim:Telescoping bound on $|m_k(r)|$]
There exists a constant $C = C(n, p)$ such that for every $x \in \Omega_0$, every $0 < r \le r_0$, and every $k \in \mathbb{N}_0$,
\begin{align*}
|m_k(r)| \le C \cdot r \sum_{j = k}^\infty 2^{-j} G_j(x, r).
\end{align*}
[/claim]
*Proof of claim.* Recall the Sobolev-Poincaré inequality on balls (a standard result in the theory of Sobolev functions): there is a constant $C_{\mathrm{SP}} = C_{\mathrm{SP}}(n, p)$ such that for every $w \in W^{1,p}(B(z, \rho))$ with $1 \le p < n$,
\begin{align*}
\left( \fint_{B(z, \rho)} |w - w_{B(z, \rho)}|^{p^*} d\mathcal{L}^n \right)^{1/p^*} \le C_{\mathrm{SP}} \cdot \rho \cdot \left( \fint_{B(z, \rho)} |\nabla w|^p \, d\mathcal{L}^n \right)^{1/p}.
\end{align*}
Apply this to $\tilde u_x \in W^{1,p}(B(x, r))$ on the ball $B(x, 2^{-j} r)$, using $\nabla \tilde u_x = \nabla u - \nabla u(x)$ and $(\tilde u_x)_{B(x, 2^{-j} r)} = u_{B(x, 2^{-j} r)}$:
\begin{align*}
\left( \fint_{B(x, 2^{-j} r)} \big| \tilde u_x(y) - u_{B(x, 2^{-j} r)} \big|^{p^*} d\mathcal{L}^n(y) \right)^{1/p^*} \le C_{\mathrm{SP}} \cdot 2^{-j} r \cdot G_j(x, r).
\end{align*}
For each $j \ge 0$, since $B(x, 2^{-j-1} r) \subseteq B(x, 2^{-j} r)$ with volume ratio $2^{-n}$, by Jensen's inequality (which gives $|\fint f| \le \fint |f| \le (\fint |f|^{p^*})^{1/p^*}$ for $p^* \ge 1$),
\begin{align*}
\big| u_{B(x, 2^{-j-1} r)} - u_{B(x, 2^{-j} r)} \big| &= \big| (\tilde u_x)_{B(x, 2^{-j-1} r)} - (\tilde u_x)_{B(x, 2^{-j} r)} \big| \\
&\le \fint_{B(x, 2^{-j-1} r)} \big| \tilde u_x - (\tilde u_x)_{B(x, 2^{-j} r)} \big| \, d\mathcal{L}^n \\
&\le 2^n \fint_{B(x, 2^{-j} r)} \big| \tilde u_x - (\tilde u_x)_{B(x, 2^{-j} r)} \big| \, d\mathcal{L}^n \\
&\le 2^n \left( \fint_{B(x, 2^{-j} r)} \big| \tilde u_x - (\tilde u_x)_{B(x, 2^{-j} r)} \big|^{p^*} d\mathcal{L}^n \right)^{1/p^*} \\
&\le 2^n C_{\mathrm{SP}} \cdot 2^{-j} r \cdot G_j(x, r).
\end{align*}
The first equality used $(\tilde u_x)_{B(x, \rho)} = u_{B(x, \rho)}$ for both balls.
Now apply the Lebesgue Differentiation Theorem at the Lebesgue point $x$ of $u$: $u_{B(x, 2^{-j} r)} \to u(x)$ as $j \to \infty$. Hence the telescoping series
\begin{align*}
u(x) - u_{B(x, 2^{-k} r)} = \sum_{j = k}^\infty \big( u_{B(x, 2^{-j-1} r)} - u_{B(x, 2^{-j} r)} \big)
\end{align*}
converges absolutely (granted the bound just derived; we verify this in the next display), and the triangle inequality gives
\begin{align*}
|m_k(r)| = \big| u_{B(x, 2^{-k} r)} - u(x) \big| \le \sum_{j = k}^\infty \big| u_{B(x, 2^{-j-1} r)} - u_{B(x, 2^{-j} r)} \big| \le 2^n C_{\mathrm{SP}} \cdot r \sum_{j = k}^\infty 2^{-j} G_j(x, r).
\end{align*}
Absolute convergence of the right side follows from $G_j(x, r) \le M_*(x)^{1/p}$ (where $M_*(x) := M(|\nabla u - \nabla u(x)|^p \mathbb{1}_{B(x_0, 2r_0)})(x) < \infty$ by condition (iii)) and $\sum_j 2^{-j} < \infty$. The constant is $C := 2^n C_{\mathrm{SP}}$. This proves the claim. $\square$
[/step]
[step:Apply Sobolev-Poincaré on each dyadic ball and combine with the telescoping bound]
Apply Sobolev-Poincaré to $v_x \in W^{1,p}(B(x, r))$ on the ball $B(x, 2^{-k} r)$, using $\nabla v_x = \nabla u - \nabla u(x)$:
\begin{align*}
\left( \fint_{B(x, 2^{-k} r)} |v_x - m_k(r)|^{p^*} d\mathcal{L}^n \right)^{1/p^*} \le C_{\mathrm{SP}} \cdot 2^{-k} r \cdot G_k(x, r).
\end{align*}
By the triangle inequality in $L^{p^*}(B(x, 2^{-k} r))$ for the decomposition $v_x = (v_x - m_k(r)) + m_k(r)$,
\begin{align*}
\left( \fint_{B(x, 2^{-k} r)} |v_x|^{p^*} d\mathcal{L}^n \right)^{1/p^*} \le C_{\mathrm{SP}} \cdot 2^{-k} r \cdot G_k(x, r) + |m_k(r)|.
\end{align*}
Raising to the $p^*$-th power (a pure algebraic manipulation; not Hölder) and using $(a + b)^{p^*} \le 2^{p^*}(a^{p^*} + b^{p^*})$ for $a, b \ge 0$:
\begin{align*}
\fint_{B(x, 2^{-k} r)} |v_x|^{p^*} d\mathcal{L}^n \le 2^{p^*} \big( C_{\mathrm{SP}}^{p^*} (2^{-k} r)^{p^*} G_k(x, r)^{p^*} + |m_k(r)|^{p^*} \big).
\end{align*}
[/step]
[step:Sum the dyadic shell estimates and conclude $\Phi(x, r) \to 0$ via dominated convergence]
Substitute the Step 4 dyadic-ball bound into the Step 2 reduction:
\begin{align*}
\Phi(x, r) &\le \frac{2^{p^*}}{r^{p^*}} \sum_{k = 0}^\infty 2^{k(p^* - n)} \fint_{B(x, 2^{-k} r)} |v_x|^{p^*} d\mathcal{L}^n \\
&\le \frac{2^{p^*}}{r^{p^*}} \sum_{k = 0}^\infty 2^{k(p^* - n)} \cdot 2^{p^*} \big( C_{\mathrm{SP}}^{p^*} (2^{-k} r)^{p^*} G_k(x, r)^{p^*} + |m_k(r)|^{p^*} \big) \\
&= 2^{2 p^*} C_{\mathrm{SP}}^{p^*} \underbrace{\sum_{k = 0}^\infty 2^{-k n} G_k(x, r)^{p^*}}_{=: S_I(r)} + \underbrace{\frac{2^{2 p^*}}{r^{p^*}} \sum_{k = 0}^\infty 2^{k(p^* - n)} |m_k(r)|^{p^*}}_{=: S_{II}(r)},
\end{align*}
where in the first sum we used $2^{k(p^* - n)} (2^{-k} r)^{p^*} / r^{p^*} = 2^{-k n}$. We treat $S_I(r)$ and $S_{II}(r)$ separately. Both arguments rely on the dominator $M_*(x) := M(|\nabla u - \nabla u(x)|^p \mathbb{1}_{B(x_0, 2 r_0)})(x) < \infty$, finite by condition (iii). Note: throughout this step the constants depend on $x$ via $M_*(x)$; we are proving pointwise (in $x$) convergence, not a uniform bound.
**Sum I.** For each fixed $k$, condition (ii) gives $G_k(x, r)^p = \fint_{B(x, 2^{-k} r)} |\nabla u - \nabla u(x)|^p \, d\mathcal{L}^n \to 0$ as $r \to 0^+$, hence $G_k(x, r)^{p^*} \to 0$. The maximal-function dominator gives, whenever $2^{-k} r \le r_0$,
\begin{align*}
G_k(x, r)^p \le M_*(x), \qquad \text{so} \qquad 2^{-k n} G_k(x, r)^{p^*} \le 2^{-k n} M_*(x)^{p^*/p}.
\end{align*}
The bound $\sum_{k} 2^{-k n} M_*(x)^{p^*/p} = (1 - 2^{-n})^{-1} M_*(x)^{p^*/p} < \infty$ provides a summable dominator.
By the Dominated Convergence Theorem applied to the counting measure on $\mathbb{N}_0$ weighted by $2^{-k n}$ with the parameter $r \to 0^+$ (the parameter version: for every sequence $r_\ell \to 0^+$, $G_k(x, r_\ell)^{p^*} \to 0$ pointwise in $k$ and is dominated by $M_*(x)^{p^*/p}$, so the weighted sum converges to $0$; this is the standard sequential reformulation of DCT for parameter limits),
\begin{align*}
S_I(r) \to 0 \qquad (r \to 0^+).
\end{align*}
**Sum II.** Apply the telescoping bound from the Step 3 claim:
\begin{align*}
|m_k(r)|^{p^*} \le (2^n C_{\mathrm{SP}})^{p^*} r^{p^*} \left( \sum_{j = k}^\infty 2^{-j} G_j(x, r) \right)^{p^*}.
\end{align*}
Using $G_j(x, r) \le M_*(x)^{1/p}$, the inner sum is bounded uniformly by $\sum_{j \ge k} 2^{-j} M_*(x)^{1/p} = 2^{1 - k} M_*(x)^{1/p}$, so
\begin{align*}
|m_k(r)|^{p^*} \le (2^n C_{\mathrm{SP}})^{p^*} r^{p^*} \cdot 2^{(1 - k) p^*} M_*(x)^{p^*/p}.
\end{align*}
Multiplying by $2^{k(p^* - n)} / r^{p^*}$ and using $2^{k(p^* - n)} \cdot 2^{(1 - k) p^*} = 2^{p^*} \cdot 2^{-k n}$:
\begin{align*}
2^{k(p^* - n)} \frac{|m_k(r)|^{p^*}}{r^{p^*}} \le (2^n C_{\mathrm{SP}})^{p^*} \cdot 2^{p^*} \cdot 2^{-k n} M_*(x)^{p^*/p}.
\end{align*}
This is the summable dominator (in $k$) for $S_{II}(r)$, independent of $r$.
For pointwise convergence in $k$ as $r \to 0^+$: for each fixed $k$ and each fixed $j \ge k$, condition (ii) gives $G_j(x, r) \to 0$ as $r \to 0^+$; the dominator $2^{-j} M_*(x)^{1/p}$ is summable in $j$. By DCT (parameter version, on the counting measure on $\{j : j \ge k\}$),
\begin{align*}
\sum_{j = k}^\infty 2^{-j} G_j(x, r) \to 0 \qquad (r \to 0^+) \text{ for each fixed } k.
\end{align*}
Hence $|m_k(r)|^{p^*} / r^{p^*} \to 0$ as $r \to 0^+$ for each fixed $k$, and consequently $2^{k(p^* - n)} |m_k(r)|^{p^*} / r^{p^*} \to 0$.
By DCT once more (now on the counting measure on $\mathbb{N}_0$ with the dominator $(2^n C_{\mathrm{SP}})^{p^*} \cdot 2^{p^*} \cdot 2^{-k n} M_*(x)^{p^*/p}$),
\begin{align*}
S_{II}(r) \to 0 \qquad (r \to 0^+).
\end{align*}
**Conclusion.** $\Phi(x, r) \le 2^{2 p^*} C_{\mathrm{SP}}^{p^*} S_I(r) + S_{II}(r) \to 0$ as $r \to 0^+$ for every $x \in \Omega_0$. This is the $L^{p^*}$-differentiability statement: for $\mathcal{L}^n$-a.e. $x \in \Omega$,
\begin{align*}
\left( \fint_{B(x, r)} \left( \frac{|u(y) - u(x) - \nabla u(x) \cdot (y - x)|}{|y - x|} \right)^{p^*} d\mathcal{L}^n(y) \right)^{1/p^*} \to 0 \qquad (r \to 0^+).
\end{align*}
[/step]
[step:Approximate differentiability via Markov's inequality]
For $\varepsilon > 0$ and $r > 0$, define
\begin{align*}
E_\varepsilon(r) := \left\{ y \in B(x, r) \setminus \{x\} : \frac{|u(y) - u(x) - \nabla u(x) \cdot (y - x)|}{|y - x|} > \varepsilon \right\}
\end{align*}
(the singleton $\{x\}$ has measure zero, so excluding it is harmless).
By Markov's inequality (the elementary bound $\mu(\{|f| > \lambda\}) \le \lambda^{-q} \int |f|^q \, d\mu$ for $q \ge 1$, $\lambda > 0$, valid because $\mathbb{1}_{\{|f| > \lambda\}} \le |f|^q / \lambda^q$ pointwise) with $q = p^*$, $\lambda = \varepsilon^{p^*}$, $\mu = \mathcal{L}^n$, applied to $y \mapsto |v_x(y)|^{p^*} / |y - x|^{p^*}$ on $B(x, r)$:
\begin{align*}
\mathcal{L}^n(E_\varepsilon(r)) \le \frac{1}{\varepsilon^{p^*}} \int_{B(x, r)} \frac{|v_x(y)|^{p^*}}{|y - x|^{p^*}} \, d\mathcal{L}^n(y).
\end{align*}
Dividing by $\mathcal{L}^n(B(x, r)) = \omega_n r^n$:
\begin{align*}
\frac{\mathcal{L}^n(E_\varepsilon(r))}{\mathcal{L}^n(B(x, r))} \le \frac{\Phi(x, r)}{\varepsilon^{p^*}}.
\end{align*}
By Step 5, $\Phi(x, r) \to 0$ as $r \to 0^+$ for every $x \in \Omega_0$, so
\begin{align*}
\lim_{r \to 0^+} \frac{\mathcal{L}^n(E_\varepsilon(r))}{\mathcal{L}^n(B(x, r))} = 0 \qquad \text{for every } \varepsilon > 0.
\end{align*}
This is precisely the definition of approximate differentiability of $u$ at $x$ with approximate derivative $\nabla u(x)$. The conclusion holds for $\mathcal{L}^n$-a.e. $x \in \Omega$, completing the proof.
(This is a Sobolev-function refinement of [Rademacher's Theorem](/theorems/3069), which gives classical differentiability $\mathcal{L}^n$-a.e. for Lipschitz $u$. For $u$ merely in $W^{1,p}_{\mathrm{loc}}$ with $1 \le p < n$, the gradient need not be locally bounded, and one obtains approximate (rather than classical) differentiability.)
[/step]