[proofplan]
We construct an auxiliary multiplier $\hat{\chi} \in C_c^\infty(\mathbb{R}^n)$ that equals $1$ on the unit ball, so that $\hat{\chi}(R^{-1}\xi)$ equals $1$ on the Fourier support of $f$ and hence $\hat{f} = \hat{\chi}(R^{-1}\cdot)\hat{f}$. Inverting the Fourier transform converts this into a convolution $f = \chi_R * f$ where $\chi_R(x) = R^n \chi(Rx)$. The derivative bound is obtained by differentiating the convolution: $D^\alpha f = (D^\alpha \chi_R) * f$. The $L^p \to L^q$ improvement and the derivative gain both follow from a single application of [Young's convolution inequality](/theorems/???) with appropriate exponents, combined with the dimensional scaling of $\|D^\alpha \chi_R\|_{L^r}$ where $1/r = 1 - 1/p + 1/q$.
[/proofplan]
[step:Construct an auxiliary cutoff and reproducing identity]
Pick a function $\hat{\chi} \in C_c^\infty(\mathbb{R}^n)$ satisfying $\hat{\chi}(\eta) = 1$ for $|\eta| \le 1$ and $\hat{\chi}(\eta) = 0$ for $|\eta| \ge 2$, with $0 \le \hat{\chi} \le 1$. (For instance, $\hat{\chi}$ can be taken as the function $\hat{\varphi}$ from the [Littlewood–Paley decomposition](/theorems/3186).) Define
\begin{align*}
\chi: \mathbb{R}^n &\to \mathbb{R}, \\
x &\mapsto \mathcal{F}^{-1}(\hat{\chi})(x) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat{\chi}(\eta) e^{i\eta \cdot x} \, d\mathcal{L}^n(\eta),
\end{align*}
which lies in $\mathcal{S}(\mathbb{R}^n)$ because $\hat{\chi} \in C_c^\infty(\mathbb{R}^n) \subseteq \mathcal{S}(\mathbb{R}^n)$ and the [Schwartz space is invariant under Fourier transform](/theorems/???). For $R > 0$, define the rescaled function
\begin{align*}
\chi_R: \mathbb{R}^n &\to \mathbb{R}, \\
x &\mapsto R^n \chi(R x).
\end{align*}
Its Fourier transform is $\widehat{\chi_R}(\xi) = \hat{\chi}(R^{-1}\xi)$, by the standard scaling identity for the Fourier transform.
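For completeness, the scaling identity can be checked directly. The computation below is a sketch that assumes the forward transform $\hat{g}(\xi) = \int_{\mathbb{R}^n} g(x) e^{-i\xi \cdot x} \, d\mathcal{L}^n(x)$, the convention that pairs with the inversion formula used to define $\chi$; under a different normalization the constants would shift accordingly. Substituting $u = Rx$, so that $d\mathcal{L}^n(x) = R^{-n} \, d\mathcal{L}^n(u)$:
\begin{align*}
\widehat{\chi_R}(\xi) &= \int_{\mathbb{R}^n} R^n \chi(Rx) e^{-i\xi \cdot x} \, d\mathcal{L}^n(x) \\
&= \int_{\mathbb{R}^n} \chi(u) e^{-i(R^{-1}\xi) \cdot u} \, d\mathcal{L}^n(u) \\
&= \hat{\chi}(R^{-1}\xi).
\end{align*}
The prefactor $R^n$ in $\chi_R$ cancels exactly against the Jacobian $R^{-n}$, which is why $\widehat{\chi_R}$ carries no extra power of $R$.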
Since $\operatorname{supp}(\hat{f}) \subseteq \{|\xi| \le R\}$ and $\hat{\chi}(R^{-1}\xi) = 1$ for $|\xi| \le R$, the multiplication identity holds:
\begin{align*}
\hat{f}(\xi) = \hat{\chi}(R^{-1}\xi) \, \hat{f}(\xi) \quad \text{for all } \xi \in \mathbb{R}^n.
\end{align*}
Inverting the Fourier transform, this becomes the **reproducing identity**
\begin{align*}
f = \chi_R * f \quad \text{in } \mathcal{S}'(\mathbb{R}^n),
\end{align*}
where the convolution is initially defined in the distributional sense and is shown below to coincide with the classical convolution because $f \in L^p$ and $\chi_R \in \mathcal{S} \subseteq L^1 \cap L^\infty$.
[/step]
[step:Reduce $D^\alpha f$ to a convolution with $D^\alpha \chi_R$]
Since $\chi_R \in \mathcal{S}(\mathbb{R}^n) \subseteq L^1(\mathbb{R}^n)$ and $f \in L^p(\mathbb{R}^n)$, the convolution $\chi_R * f$ is defined in the classical sense by [Young's convolution inequality](/theorems/???) (taking $r = 1$ and $q = p$, so that $1 + 1/p = 1/p + 1/1$). Pointwise,
\begin{align*}
(\chi_R * f)(x) = \int_{\mathbb{R}^n} \chi_R(x - y) f(y) \, d\mathcal{L}^n(y),
\end{align*}
and this defines a function in $L^p(\mathbb{R}^n)$. The distributional identity $f = \chi_R * f$ from the previous step, combined with the classical interpretation on $L^p$, yields $f(x) = (\chi_R * f)(x)$ for almost every $x \in \mathbb{R}^n$.
Next we differentiate under the integral sign. For any multi-index $\alpha$, the derivative $D^\alpha \chi_R$ of the Schwartz function $\chi_R$ is again a Schwartz function, so $D^\alpha \chi_R \in \mathcal{S}(\mathbb{R}^n) \subseteq L^1(\mathbb{R}^n)$, and the derivative of the convolution $\chi_R * f$ is given by
\begin{align*}
D^\alpha (\chi_R * f) = (D^\alpha \chi_R) * f.
\end{align*}
This is the [smoothing-by-convolution identity](/theorems/???): differentiation of a convolution falls on the smooth factor when the smooth factor is in $L^1$ together with all derivatives. Since the right-hand side is again a convolution of $L^1$ with $L^p$, the derivative is in $L^p$ at least, and we will refine this using Young's inequality with sharper exponents.
Hence $D^\alpha f = (D^\alpha \chi_R) * f$ in the sense of distributions and almost everywhere. This also shows that $f$ has a representative in $C^\infty(\mathbb{R}^n)$: since $\chi_R \in \mathcal{S}$, every $D^\alpha \chi_R$ lies in $C^\infty \cap L^1$, and its convolution with $f \in L^p \subseteq L^1_{\mathrm{loc}}$ is a smooth function.
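As a cross-check, the same identity can be read off on the Fourier side. The sketch below assumes that $D^\alpha$ denotes the plain partial derivative $\partial^\alpha$ and that the forward transform is $\hat{g}(\xi) = \int_{\mathbb{R}^n} g(x) e^{-i\xi \cdot x} \, d\mathcal{L}^n(x)$, so that $\widehat{D^\alpha g}(\xi) = (i\xi)^\alpha \hat{g}(\xi)$ and $\widehat{g * h} = \hat{g} \, \hat{h}$; with a different convention for $D^\alpha$ the factor $(i\xi)^\alpha$ changes but the conclusion does not. All equalities are understood in $\mathcal{S}'(\mathbb{R}^n)$:
\begin{align*}
\widehat{D^\alpha f}(\xi) &= (i\xi)^\alpha \hat{f}(\xi) \\
&= (i\xi)^\alpha \hat{\chi}(R^{-1}\xi) \, \hat{f}(\xi) \\
&= \widehat{D^\alpha \chi_R}(\xi) \, \hat{f}(\xi) \\
&= \widehat{(D^\alpha \chi_R) * f}(\xi).
\end{align*}
The second line uses the multiplication identity $\hat{f} = \hat{\chi}(R^{-1}\cdot)\hat{f}$, and the third uses $\widehat{\chi_R}(\xi) = \hat{\chi}(R^{-1}\xi)$.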
[/step]
[step:Compute the $L^r$ norm of $D^\alpha \chi_R$ via dimensional scaling]
We compute $\|D^\alpha \chi_R\|_{L^r(\mathbb{R}^n)}$ for any $1 \le r \le \infty$ and any multi-index $\alpha$, in terms of $\|D^\alpha \chi\|_{L^r(\mathbb{R}^n)}$.
By the chain rule applied to $\chi_R(x) = R^n \chi(Rx)$:
\begin{align*}
D^\alpha \chi_R(x) = R^n \cdot R^{|\alpha|} \cdot (D^\alpha \chi)(Rx) = R^{n + |\alpha|} (D^\alpha \chi)(Rx).
\end{align*}
The first $R^n$ comes from the prefactor $R^n$ in the definition of $\chi_R$; the additional $R^{|\alpha|}$ comes from the chain rule $\partial_{x_i}[\chi(Rx)] = R \cdot (\partial_{x_i}\chi)(Rx)$ applied $|\alpha|$ times.
Now compute the $L^r$ norm. For $1 \le r < \infty$, change variables $u := R x$, $d\mathcal{L}^n(u) = R^n \, d\mathcal{L}^n(x)$, so $d\mathcal{L}^n(x) = R^{-n} \, d\mathcal{L}^n(u)$:
\begin{align*}
\|D^\alpha \chi_R\|_{L^r(\mathbb{R}^n)}^r &= \int_{\mathbb{R}^n} |D^\alpha \chi_R(x)|^r \, d\mathcal{L}^n(x) \\
&= R^{r(n + |\alpha|)} \int_{\mathbb{R}^n} |(D^\alpha \chi)(Rx)|^r \, d\mathcal{L}^n(x) \\
&= R^{r(n + |\alpha|)} \cdot R^{-n} \int_{\mathbb{R}^n} |(D^\alpha \chi)(u)|^r \, d\mathcal{L}^n(u) \\
&= R^{r(n + |\alpha|) - n} \|D^\alpha \chi\|_{L^r(\mathbb{R}^n)}^r.
\end{align*}
The first equality is the definition of the $L^r$ norm. The second uses the formula for $D^\alpha \chi_R$ from above. The third applies the change of variables $u = Rx$, which is a bijection $\mathbb{R}^n \to \mathbb{R}^n$ with Jacobian determinant $R^n$, so the Jacobian formula for [Lebesgue measure pushforward](/theorems/???) gives $d\mathcal{L}^n(x) = R^{-n} d\mathcal{L}^n(u)$. The fourth combines the powers of $R$.
Taking $r$-th roots:
\begin{align*}
\|D^\alpha \chi_R\|_{L^r(\mathbb{R}^n)} = R^{n + |\alpha| - n/r} \|D^\alpha \chi\|_{L^r(\mathbb{R}^n)} = R^{|\alpha| + n(1 - 1/r)} \|D^\alpha \chi\|_{L^r(\mathbb{R}^n)}.
\end{align*}
For $r = \infty$, we directly compute $\|D^\alpha \chi_R\|_{L^\infty} = R^{n + |\alpha|} \|D^\alpha \chi\|_{L^\infty}$, which agrees with the formula above upon setting $1/r = 0$, that is, $n(1 - 1/r) = n$.
In all cases, with $C_{n, \alpha, \chi} := \|D^\alpha \chi\|_{L^r(\mathbb{R}^n)}$ (which also depends on $r$, and is finite because $\chi \in \mathcal{S}(\mathbb{R}^n)$ and every derivative of a Schwartz function lies in $L^r$ for all $1 \le r \le \infty$),
\begin{align*}
\|D^\alpha \chi_R\|_{L^r(\mathbb{R}^n)} = R^{|\alpha| + n(1 - 1/r)} \cdot C_{n, \alpha, \chi}, \quad 1 \le r \le \infty.
\end{align*}
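To make the exponent concrete, here are three instances of the scaling formula, obtained by substituting $r = 1$, $r = 2$, and $r = \infty$ (illustrative choices, not singled out by the argument):
\begin{align*}
\|D^\alpha \chi_R\|_{L^1(\mathbb{R}^n)} &= R^{|\alpha|} \, \|D^\alpha \chi\|_{L^1(\mathbb{R}^n)}, \\
\|D^\alpha \chi_R\|_{L^2(\mathbb{R}^n)} &= R^{|\alpha| + n/2} \, \|D^\alpha \chi\|_{L^2(\mathbb{R}^n)}, \\
\|D^\alpha \chi_R\|_{L^\infty(\mathbb{R}^n)} &= R^{|\alpha| + n} \, \|D^\alpha \chi\|_{L^\infty(\mathbb{R}^n)}.
\end{align*}
In particular, for $\alpha = 0$ and $r = 1$ the norm $\|\chi_R\|_{L^1} = \|\chi\|_{L^1}$ does not depend on $R$, reflecting the $L^1$-normalized scaling $\chi_R(x) = R^n \chi(Rx)$.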
[/step]
[step:Pick the right exponent $r$ via Young's inequality and combine]
The classical [Young's convolution inequality](/theorems/???) states: for $1 \le p, q, r \le \infty$ with $1 + 1/q = 1/p + 1/r$,
\begin{align*}
\|g * h\|_{L^q(\mathbb{R}^n)} \le \|g\|_{L^r(\mathbb{R}^n)} \|h\|_{L^p(\mathbb{R}^n)}, \quad g \in L^r, \, h \in L^p.
\end{align*}
We solve $1 + 1/q = 1/p + 1/r$ for $r$:
\begin{align*}
\frac{1}{r} = 1 - \frac{1}{p} + \frac{1}{q}.
\end{align*}
For the assumed range $1 \le p \le q \le \infty$, we have $1/q \le 1/p \le 1$, so $1/r = 1 + 1/q - 1/p \le 1$ (since $1/q \le 1/p$) and $1/r \ge 1 - 1/p + 0 = 1 - 1/p \ge 0$ (since $1/p \le 1$). Therefore $1 \le r \le \infty$ and Young's inequality applies.
Apply Young's inequality with $g = D^\alpha \chi_R$ and $h = f$, valid because $D^\alpha \chi_R \in \mathcal{S}(\mathbb{R}^n) \subseteq L^r(\mathbb{R}^n)$ for all $r \in [1, \infty]$ and $f \in L^p(\mathbb{R}^n)$:
\begin{align*}
\|D^\alpha f\|_{L^q(\mathbb{R}^n)} &= \|(D^\alpha \chi_R) * f\|_{L^q(\mathbb{R}^n)} \\
&\le \|D^\alpha \chi_R\|_{L^r(\mathbb{R}^n)} \|f\|_{L^p(\mathbb{R}^n)} \\
&= C_{n, \alpha, \chi} \cdot R^{|\alpha| + n(1 - 1/r)} \cdot \|f\|_{L^p(\mathbb{R}^n)},
\end{align*}
where the third line uses the formula from the previous step.
Substitute $1 - 1/r = 1 - (1 + 1/q - 1/p) = 1/p - 1/q$:
\begin{align*}
\|D^\alpha f\|_{L^q(\mathbb{R}^n)} \le C_{n, \alpha, \chi} \cdot R^{|\alpha| + n(1/p - 1/q)} \cdot \|f\|_{L^p(\mathbb{R}^n)}.
\end{align*}
Setting $C_{n, \alpha} := C_{n, \alpha, \chi} = \|D^\alpha \chi\|_{L^r(\mathbb{R}^n)}$, where $\chi$ is the fixed cutoff from Step 1 and $r$ is determined by $1/r = 1 - 1/p + 1/q$, gives the asserted Bernstein inequality.
The constant $C_{n, \alpha}$ depends on $n$ (through $\chi$), on $|\alpha|$ (through the differential order applied to $\chi$), and on $r$ (which is a function of $p$ and $q$ only). Since $\chi$ is a fixed Schwartz function depending only on $n$, these dependencies reduce to the stated $n, |\alpha|$ dependence, plus an additional dependence on $p, q$ through $r$. To eliminate the $p, q$ dependence, one can replace the constant by $\sup_{1 \le r \le \infty} \|D^\alpha \chi\|_{L^r(\mathbb{R}^n)}$, which is finite: for $g := D^\alpha \chi \in \mathcal{S}(\mathbb{R}^n)$, the interpolation bound $\|g\|_{L^r} \le \|g\|_{L^1}^{1/r} \|g\|_{L^\infty}^{1 - 1/r} \le \max\{\|g\|_{L^1}, \|g\|_{L^\infty}\}$ holds for every $1 \le r \le \infty$.
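Two commonly used special cases illustrate how the exponent $r$ is selected; the specific choices of $p$, $q$, and $\alpha$ below are for illustration only. For $p = q$, the relation $1/r = 1 - 1/p + 1/q$ gives $r = 1$, and the inequality reduces to a pure derivative bound:
\begin{align*}
\|D^\alpha f\|_{L^p(\mathbb{R}^n)} \le \|D^\alpha \chi\|_{L^1(\mathbb{R}^n)} \, R^{|\alpha|} \, \|f\|_{L^p(\mathbb{R}^n)}.
\end{align*}
For $p = 2$, $q = \infty$, and $\alpha = 0$, the relation gives $1/r = 1 - 1/2 + 0 = 1/2$, so $r = 2$ and
\begin{align*}
\|f\|_{L^\infty(\mathbb{R}^n)} \le \|\chi\|_{L^2(\mathbb{R}^n)} \, R^{n/2} \, \|f\|_{L^2(\mathbb{R}^n)}.
\end{align*}
In both cases the power of $R$ matches $|\alpha| + n(1/p - 1/q)$.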
[/step]
[step:Conclude the Bernstein inequality and the smoothness of $f$]
Combining Steps 1–4:
- The reproducing identity $f = \chi_R * f$ (Step 1) holds whenever $\operatorname{supp}(\hat{f}) \subseteq \{|\xi| \le R\}$.
- The convolution differentiation identity $D^\alpha f = (D^\alpha \chi_R) * f$ (Step 2) places $f$ in $C^\infty(\mathbb{R}^n)$ since $\chi_R \in \mathcal{S}$.
- Young's inequality combined with the scaling $\|D^\alpha \chi_R\|_{L^r} = C_{n, \alpha, \chi} R^{|\alpha| + n(1 - 1/r)}$ (Steps 3–4) gives
\begin{align*}
\|D^\alpha f\|_{L^q(\mathbb{R}^n)} \le C_{n, \alpha} R^{|\alpha| + n(1/p - 1/q)} \|f\|_{L^p(\mathbb{R}^n)}
\end{align*}
for any $1 \le p \le q \le \infty$ and any multi-index $\alpha$, where $C_{n, \alpha} = \|D^\alpha \chi\|_{L^r(\mathbb{R}^n)}$ with $1/r = 1 - 1/p + 1/q$.
This is the Bernstein inequality. The membership $f \in L^q(\mathbb{R}^n)$ for $q \ge p$ is contained in the case $\alpha = 0$:
\begin{align*}
\|f\|_{L^q(\mathbb{R}^n)} \le C_{n, 0} R^{n(1/p - 1/q)} \|f\|_{L^p(\mathbb{R}^n)} < \infty,
\end{align*}
so $f \in L^p \cap L^q \cap C^\infty$ as claimed.
[/step]