[proofplan]
**This proof is a declared sketch.** We follow Bourgain, "Besicovitch type maximal operators and applications to Fourier analysis," *Geometric and Functional Analysis* **1** (1991), 147--187, §2 (the wave-packet decomposition for restriction). The argument has three ingredients: (a) discretise the Kakeya maximal function on a $\delta$-net of directions in $S^{n-1}$, reducing to a sum over $N \asymp \delta^{-(n-1)}$ caps; (b) test the restriction estimate against a randomised cap-indicator superposition with Rademacher signs, expanding via Khintchine to a square-function over wave-packet plates; (c) close via parabolic rescaling to convert dual plates to Kakeya tubes plus a per-tube Hölder pairing. The wave-packet identification (passing from Schwartz-tail decay to a clean tube indicator) is the technical content of Bourgain 1991 and is invoked as a cited fact. The exponent $(n-1)/p$ in the conclusion is the one that emerges from the Hölder cap-sum closure with $\mathcal{L}^n(T_j) \asymp \delta^{n-1}$.
[/proofplan]
[step:Discretise the Kakeya maximal function on a $\delta$-net of directions]
Fix $\delta \in (0, 1)$ and $\varepsilon > 0$. Choose a maximal $\delta$-separated set $\{\omega_j\}_{j=1}^N \subset S^{n-1}$ (that is, $|\omega_j - \omega_k| \ge \delta$ for $j \ne k$, and no further point of $S^{n-1}$ can be added without violating this). A packing argument on $S^{n-1}$ gives $N \asymp \delta^{-(n-1)}$. Let $\Omega_j \subset S^{n-1}$ be the spherical $\delta$-cap centred at $\omega_j$, so $\sigma(\Omega_j) \asymp \delta^{n-1}$, the caps $\{\Omega_j\}$ have bounded overlap, and, by maximality, $\bigcup_j \Omega_j = S^{n-1}$.
For each $j$, let $T_j^* \subset \mathbb{R}^n$ be a tube of length $\delta^{-2}$ and radius $\delta^{-1}$ aligned with $\omega_j$, centred at the origin (this is the dual-tube of the cap $\Omega_j$ under the parabolic uncertainty principle). The dual tube has volume $\mathcal{L}^n(T_j^*) \asymp \delta^{-2} \cdot \delta^{-(n-1)} = \delta^{-(n+1)}$.
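The dimensions $\delta^{-1}$ and $\delta^{-2}$ can be motivated by a short heuristic computation. This is a sketch in local coordinates chosen here for illustration (the normalisation $\omega_j = e_n$ and the splitting $\omega = (\bar\omega, \omega_n)$ are assumptions, not notation from the source). For $\omega \in \Omega_j$ we have $|\bar\omega| \le |\omega - \omega_j| \le \delta$, and the sphere is locally the graph $\omega_n = \sqrt{1 - |\bar\omega|^2}$, so
\begin{align*}
0 \le 1 - \omega_n = 1 - \sqrt{1 - |\bar\omega|^2} \le |\bar\omega|^2 \le \delta^2.
\end{align*}
Thus $\Omega_j$ sits inside a slab of dimensions $\delta \times \cdots \times \delta \times \delta^2$, with the short side along $\omega_j$. The uncertainty principle then suggests that $\widehat{\mathbb{1}_{\Omega_j} d\sigma}$ varies on the reciprocal scales $\delta^{-1} \times \cdots \times \delta^{-1} \times \delta^{-2}$, which are the dimensions of $T_j^*$.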
Every $\delta$-tube with axis $\omega \in \Omega_j$ is covered by a bounded number (depending only on $n$) of $\delta$-tubes with axis $\omega_j$, and conversely. Hence $f^*_\delta(\omega) \asymp f^*_\delta(\omega_j)$ for $\omega \in \Omega_j$, with a multiplicative constant that is absorbed into the eventual $\delta^{-\varepsilon}$ slack. Hence
\begin{align*}
\|f^*_\delta\|_{L^q(S^{n-1}, d\sigma)}^q \le C \sum_{j=1}^N \sigma(\Omega_j)\, \bigl(f^*_\delta(\omega_j)\bigr)^q \le C\, \delta^{n-1} \sum_{j=1}^N (f^*_\delta(\omega_j))^q.
\end{align*}
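For completeness, the first inequality above can be unpacked as follows. This is a sketch that uses only the covering $\bigcup_j \Omega_j = S^{n-1}$ and the pointwise comparison $f^*_\delta(\omega) \le C f^*_\delta(\omega_j)$ on $\Omega_j$:
\begin{align*}
\int_{S^{n-1}} \bigl(f^*_\delta(\omega)\bigr)^q\, d\sigma(\omega) \le \sum_{j=1}^N \int_{\Omega_j} \bigl(f^*_\delta(\omega)\bigr)^q\, d\sigma(\omega) \le C^q \sum_{j=1}^N \sigma(\Omega_j)\, \bigl(f^*_\delta(\omega_j)\bigr)^q.
\end{align*}
The second inequality of the display in the text then follows from $\sigma(\Omega_j) \asymp \delta^{n-1}$.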
[/step]
[step:Test the restriction estimate on a randomised cap-indicator superposition]
Let $\{\varepsilon_j\}_{j=1}^N$ be independent Rademacher random variables on a probability space $(\Omega_{\mathrm{Rad}}, \mathbb{P}_{\mathrm{Rad}})$, each taking values $\pm 1$ with probability $1/2$. For each centre $x \in \mathbb{R}^n$ and each Rademacher sample $\eta \in \Omega_{\mathrm{Rad}}$, define the *cap-indicator superposition*
\begin{align*}
F_x : S^{n-1} &\to \mathbb{C}, \\
\omega' &\mapsto \sum_{j=1}^N \varepsilon_j(\eta)\, e^{-i x \cdot \omega'}\, \mathbb{1}_{\Omega_j}(\omega').
\end{align*}
Apply the restriction estimate $R^*(q \to p)$ to $F_x$:
\begin{align*}
\|\mathcal{E}(F_x)\|_{L^p(\mathbb{R}^n)} \le C_q\, \|F_x\|_{L^q(S^{n-1}, d\sigma)}.
\end{align*}
The right-hand side: the caps $\{\Omega_j\}$ are essentially disjoint, so $|F_x|^q = \sum_j \mathbb{1}_{\Omega_j}$ pointwise (Rademacher signs have $|\varepsilon_j| = 1$ and the modulating phase $e^{-ix\cdot\omega'}$ has unit modulus), giving
\begin{align*}
\|F_x\|_{L^q(d\sigma)}^q = \sum_{j=1}^N \sigma(\Omega_j) \asymp N\, \delta^{n-1} \asymp 1.
\end{align*}
Hence $\|F_x\|_{L^q(d\sigma)} \asymp 1$ uniformly in $x$ and in the Rademacher sample $\eta$.
The left-hand side: by linearity of the extension operator,
\begin{align*}
\mathcal{E}(F_x)(y) = \sum_{j=1}^N \varepsilon_j(\eta)\, \int_{\Omega_j} e^{-i x \cdot \omega'}\, e^{-i y \cdot \omega'}\, d\sigma(\omega') = \sum_{j=1}^N \varepsilon_j(\eta)\, \widehat{\mathbb{1}_{\Omega_j} d\sigma}(x + y).
\end{align*}
Each summand $\widehat{\mathbb{1}_{\Omega_j} d\sigma}$ is, by the **Bourgain wave-packet description** of restriction (Bourgain, *Geom. Funct. Anal.* **1** (1991), §2; also Stein, *Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals*, Princeton University Press 1993, §IX.6), a single wave packet localised on the dual tube $T_j^*$ centred at the origin, in the sense that its modulus is bounded above by a Schwartz-tailed bump on $T_j^*$ of amplitude $\asymp \sigma(\Omega_j) \asymp \delta^{n-1}$. We adopt this description as a cited fact.
[/step]
[step:Apply Khintchine to extract a square-function estimate, then rescale to Kakeya tubes]
Take the $p$-th power of $|\mathcal{E}(F_x)(y)|$ and average over the Rademacher sample $\eta$. By the [Khintchine inequality](/theorems/3177) for Rademacher sums (which asserts, for every $0 < p < \infty$ and with constants depending only on $p$, that $\mathbb{E}|\sum_j \varepsilon_j a_j|^p \asymp (\sum_j |a_j|^2)^{p/2}$ for scalars $a_j$; applied pointwise in $y$),
\begin{align*}
\mathbb{E}_\eta |\mathcal{E}(F_x)(y)|^p \asymp \Bigl(\sum_{j=1}^N |\widehat{\mathbb{1}_{\Omega_j} d\sigma}(x + y)|^2\Bigr)^{p/2}.
\end{align*}
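As a sanity check on the square-function form, the case $p = 2$ is an exact identity rather than a two-sided bound. The following computation is a standard fact included for orientation; $a_j$ denotes arbitrary complex scalars, and only the relations $\mathbb{E}[\varepsilon_j \varepsilon_k] = 0$ for $j \ne k$ and $\varepsilon_j^2 = 1$ are used:
\begin{align*}
\mathbb{E}\Bigl|\sum_{j=1}^N \varepsilon_j a_j\Bigr|^2 = \sum_{j,k=1}^N a_j\, \overline{a_k}\; \mathbb{E}[\varepsilon_j \varepsilon_k] = \sum_{j=1}^N |a_j|^2.
\end{align*}
For general $p$ the cross terms no longer cancel exactly, which is why Khintchine gives only comparability up to constants depending on $p$.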
By Tonelli's theorem (the integrand is non-negative measurable on $\Omega_{\mathrm{Rad}} \times \mathbb{R}^n$), integrating over $y \in \mathbb{R}^n$ and using the upper bound $\|\mathcal{E}(F_x)\|_{L^p}^p \le C_q^p\,\|F_x\|_{L^q}^p \asymp C_q^p$:
\begin{align*}
\int_{\mathbb{R}^n} \Bigl(\sum_{j=1}^N |\widehat{\mathbb{1}_{\Omega_j} d\sigma}(x + y)|^2\Bigr)^{p/2} d\mathcal{L}^n(y) \le C\, C_q^p.
\end{align*}
By the wave-packet description from Step 2, in an averaged $L^{p/2}$ sense (this is the cited content of Bourgain 1991: passing from the Schwartz-tail decay of $\widehat{\mathbb{1}_{\Omega_j}d\sigma}$ to a clean indicator of $T_j^*$, modulo an absorbable $\delta^{-\varepsilon}$ loss),
\begin{align*}
\int_{\mathbb{R}^n} \Bigl(\sum_{j=1}^N \mathbb{1}_{T_j^*(x)}(y)\Bigr)^{p/2} d\mathcal{L}^n(y) \le C\, \delta^{-(n-1)p}\, \delta^{-\varepsilon p / 2},
\end{align*}
where $T_j^*(x)$ denotes the translate $T_j^* - x$, centred at $-x$, since $x + y \in T_j^*$ exactly when $y \in T_j^* - x$. This is the cited tube-sum estimate of Bourgain.
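The power $\delta^{-(n-1)p}$ can be traced by a heuristic substitution. The following is a sketch under an idealisation of the cited wave-packet description, namely $|\widehat{\mathbb{1}_{\Omega_j} d\sigma}(x + y)| \approx \delta^{n-1}\, \mathbb{1}_{T_j^*(x)}(y)$; this idealisation is an assumption made for bookkeeping and is not proved here. Since $\mathbb{1}_{T_j^*(x)}^2 = \mathbb{1}_{T_j^*(x)}$,
\begin{align*}
\Bigl(\sum_{j=1}^N |\widehat{\mathbb{1}_{\Omega_j} d\sigma}(x + y)|^2\Bigr)^{p/2} \approx \Bigl(\delta^{2(n-1)} \sum_{j=1}^N \mathbb{1}_{T_j^*(x)}(y)\Bigr)^{p/2} = \delta^{(n-1)p}\, \Bigl(\sum_{j=1}^N \mathbb{1}_{T_j^*(x)}(y)\Bigr)^{p/2}.
\end{align*}
Integrating in $y$ and dividing the square-function bound by $\delta^{(n-1)p}$ accounts for the factor $\delta^{-(n-1)p}$; the remaining $\delta^{-\varepsilon p/2}$ is the loss attributed to the Schwartz tails.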
**Parabolic rescaling.** The dilation $y \mapsto \delta^2 y$ maps each $\delta^{-1} \times \cdots \times \delta^{-1} \times \delta^{-2}$ tube $T_j^*$ to a $\delta \times \cdots \times \delta \times 1$ tube $T_j$, which is exactly the Kakeya scale. Writing $y' = \delta^2 y$ and $x' = \delta^2 x$, we have $d\mathcal{L}^n(y) = \delta^{-2n}\, d\mathcal{L}^n(y')$, so the integral rescales:
\begin{align*}
\int_{\mathbb{R}^n} \Bigl(\sum_{j=1}^N \mathbb{1}_{T_j(x')}(y')\Bigr)^{p/2} d\mathcal{L}^n(y') \le C\, \delta^{-(n-1)p + 2n - \varepsilon p / 2}.
\end{align*}
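The change of variables can be written out explicitly. This is a sketch with $y' = \delta^2 y$ and $x' = \delta^2 x$ denoting the dilated point and centre, and $T_j(x')$ the image of $T_j^*(x)$ under the dilation:
\begin{align*}
\int_{\mathbb{R}^n} \Bigl(\sum_{j=1}^N \mathbb{1}_{T_j^*(x)}(y)\Bigr)^{p/2} d\mathcal{L}^n(y) &= \delta^{-2n} \int_{\mathbb{R}^n} \Bigl(\sum_{j=1}^N \mathbb{1}_{T_j(x')}(y')\Bigr)^{p/2} d\mathcal{L}^n(y'), \\
\mathcal{L}^n\bigl(T_j(x')\bigr) = \delta^{2n}\, \mathcal{L}^n(T_j^*) &\asymp \delta^{2n} \cdot \delta^{-(n+1)} = \delta^{n-1}.
\end{align*}
Multiplying the first line by $\delta^{2n}$ and inserting the tube-sum bound yields the exponent $-(n-1)p + 2n - \varepsilon p/2$. The second line confirms that the rescaled tubes have the volume $\delta^{n-1}$ expected of a $\delta$-tube of unit length.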
[/step]
[step:Close via Hölder per-tube and the cap-volume sum]
Let $f \in L^p(\mathbb{R}^n)$ with $f \ge 0$ (the general case follows by replacing $f$ with $|f|$). For each $j$, let $T_j \subset \mathbb{R}^n$ be a $\delta$-tube of unit length aligned with $\omega_j$ achieving (up to a factor of $2$) the supremum in $f^*_\delta(\omega_j)$:
\begin{align*}
f^*_\delta(\omega_j) \asymp \frac{1}{\mathcal{L}^n(T_j)} \int_{T_j} f\, d\mathcal{L}^n.
\end{align*}
With $\mathcal{L}^n(T_j) \asymp \delta^{n-1}$,
\begin{align*}
\sum_{j=1}^N (f^*_\delta(\omega_j))^q \asymp \delta^{-(n-1)q} \sum_{j=1}^N \Bigl(\int_{T_j} f\, d\mathcal{L}^n\Bigr)^q.
\end{align*}
By [Hölder's inequality](/theorems/3178) applied to the inner integral with conjugate exponents $(p, p')$ where $1/p + 1/p' = 1$, applied to the pair $(f, \mathbb{1}_{T_j})$,
\begin{align*}
\int_{T_j} f\, d\mathcal{L}^n = \int f\, \mathbb{1}_{T_j}\, d\mathcal{L}^n \le \|f\|_{L^p(\mathbb{R}^n)}\, \mathcal{L}^n(T_j)^{1/p'} \asymp \|f\|_{L^p}\, \delta^{(n-1)/p'}.
\end{align*}
Therefore
\begin{align*}
\sum_{j=1}^N (f^*_\delta(\omega_j))^q \le C\, \delta^{-(n-1)q} \sum_{j=1}^N \Bigl(\|f\|_{L^p}\, \delta^{(n-1)/p'}\Bigr)^q = C\, \|f\|_{L^p}^q\, N\, \delta^{-(n-1)q}\, \delta^{(n-1)q/p'}.
\end{align*}
Using $N \asymp \delta^{-(n-1)}$,
\begin{align*}
\sum_{j=1}^N (f^*_\delta(\omega_j))^q \le C\, \|f\|_{L^p}^q\, \delta^{-(n-1)}\, \delta^{(n-1)q(1/p' - 1)} = C\, \|f\|_{L^p}^q\, \delta^{-(n-1) - (n-1)q/p}.
\end{align*}
The exponent simplification uses $1/p' - 1 = -1/p$.
Multiplying by $\delta^{n-1}$ to convert the cap sum to an integral (Step 1), $\|f^*_\delta\|_{L^q(d\sigma)}^q \le C\,\|f\|_{L^p}^q\, \delta^{-(n-1)q/p}$, and taking $q$-th roots:
\begin{align*}
\|f^*_\delta\|_{L^q(S^{n-1}, d\sigma)} \le C\, \delta^{-(n-1)/p}\, \|f\|_{L^p(\mathbb{R}^n)}.
\end{align*}
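The bookkeeping of powers of $\delta$ across Steps 1 and 4 can be collected in a single line. The factors are, in order, the cap measure $\delta^{n-1}$, the count $N \asymp \delta^{-(n-1)}$, the normalisation $\delta^{-(n-1)q}$ and the Hölder gain $\delta^{(n-1)q/p'}$:
\begin{align*}
\delta^{n-1} \cdot \delta^{-(n-1)} \cdot \delta^{-(n-1)q} \cdot \delta^{(n-1)q/p'} = \delta^{-(n-1)q\,(1 - 1/p')} = \delta^{-(n-1)q/p}.
\end{align*}
As an illustration with values chosen here and not taken from the source, $n = 2$ and $p = q = 2$ give the loss factor $\delta^{-(n-1)/p} = \delta^{-1/2}$ after taking $q$-th roots.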
Allowing the $\delta^{-\varepsilon}$ slack from the wave-packet Schwartz tails of Step 3 and the cap-discretisation of Step 1, we obtain
\begin{align*}
\|f^*_\delta\|_{L^q(S^{n-1}, d\sigma)} \le C_\varepsilon\, \delta^{-(n-1)/p - \varepsilon}\, \|f\|_{L^p(\mathbb{R}^n)},
\end{align*}
which is the conclusion of the theorem. The Bourgain wave-packet identification of Steps 2--3 (the passage from Schwartz-tail decay of $\widehat{\mathbb{1}_{\Omega_j}d\sigma}$ to the tube-sum bound) is the cited input from Bourgain, *Geom. Funct. Anal.* **1** (1991), 147--187, §2.
[/step]