[proofplan]
We first prove the estimate for $f \in C_c^\infty(\mathbb{R}^n)$ by applying the [Fourier transform](/page/Fourier%20Transform) and splitting frequency space into the ball $B(0,R)$ and its complement. The low-frequency part is controlled by the $L^1$ norm through the pointwise Fourier bound, while the high-frequency part is controlled by the $L^2$ norm of the gradient through Plancherel's theorem. Choosing $R$ to balance these two contributions gives the desired exponent. Finally, we pass from smooth compactly supported functions to arbitrary $f \in L^1(\mathbb{R}^n) \cap H^1(\mathbb{R}^n)$ by density.
[/proofplan]
[step:Prove the estimate first for smooth compactly supported functions]
Assume first that $f \in C_c^\infty(\mathbb{R}^n)$, where $C_c^\infty(\mathbb{R}^n)$ denotes the space of smooth functions $u: \mathbb{R}^n \to \mathbb{R}$ whose support $\operatorname{supp} u$ is compact in $\mathbb{R}^n$. Let $\mathcal{L}^n$ denote [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb{R}^n$, defined on the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^n)$. Define the Fourier transform of $f$ as the map
\begin{align*}
\hat f: \mathbb{R}^n \to \mathbb{C}
\end{align*}
whose value at $\xi \in \mathbb{R}^n$ is
\begin{align*}
\hat f(\xi) := \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n} f(x)e^{-i\xi \cdot x}\,d\mathcal{L}^n(x).
\end{align*}
Since $f \in C_c^\infty(\mathbb{R}^n)$, we have $f \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$ and $\partial_{x_j}f \in L^2(\mathbb{R}^n)$ for every $j \in \{1,\dots,n\}$, so the integral defining $\hat f$ converges absolutely, and both the [Plancherel theorem](/theorems/247) and the Fourier derivative identity $\widehat{\partial_{x_j}f}(\xi)=i\xi_j\hat f(\xi)$ apply.
For every $\xi \in \mathbb{R}^n$, the triangle inequality gives
\begin{align*}
|\hat f(\xi)| \leq \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}|f(x)|\,d\mathcal{L}^n(x) = \frac{1}{(2\pi)^{n/2}}\|f\|_{L^1(\mathbb{R}^n)}.
\end{align*}
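As a sanity check on the normalization, consider the Gaussian $g(x) := e^{-|x|^2/2}$ (the symbol $g$ is introduced here only for illustration). It is not compactly supported, so it is not covered by the assumption of this step, but the pointwise bound uses nothing beyond $g \in L^1(\mathbb{R}^n)$. With the convention above,
\begin{align*}
\hat g(\xi) = e^{-|\xi|^2/2}, \qquad \|g\|_{L^1(\mathbb{R}^n)} = (2\pi)^{n/2}, \qquad \frac{1}{(2\pi)^{n/2}}\|g\|_{L^1(\mathbb{R}^n)} = 1 = \hat g(0).
\end{align*}
So the bound $|\hat g(\xi)| \leq 1$ holds for every $\xi$ and is attained at $\xi = 0$, which shows that the factor $(2\pi)^{-n/2}$ cannot be improved in general.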
[/step]
[step:Split the Fourier mass into low and high frequencies]
Let $R > 0$ be a frequency radius, and let $\alpha_n := \mathcal{L}^n(B(0,1))$ denote the Lebesgue measure of the unit ball in $\mathbb{R}^n$. By the Plancherel theorem for the Fourier transform on $L^2(\mathbb{R}^n)$,
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^2 = \int_{\mathbb{R}^n} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi).
\end{align*}
Splitting the domain of integration into $B(0,R)$ and $\mathbb{R}^n \setminus B(0,R)$ gives
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^2 = \int_{B(0,R)} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) + \int_{\mathbb{R}^n \setminus B(0,R)} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi).
\end{align*}
On $B(0,R)$, the pointwise Fourier bound yields
\begin{align*}
\int_{B(0,R)} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) \leq \frac{1}{(2\pi)^n}\|f\|_{L^1(\mathbb{R}^n)}^2 \mathcal{L}^n(B(0,R)).
\end{align*}
Since $\mathcal{L}^n(B(0,R)) = \alpha_n R^n$, this becomes
\begin{align*}
\int_{B(0,R)} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) \leq \frac{\alpha_n}{(2\pi)^n}R^n\|f\|_{L^1(\mathbb{R}^n)}^2.
\end{align*}
On $\mathbb{R}^n \setminus B(0,R)$, we have $|\xi| \geq R$, hence $1 \leq R^{-2}|\xi|^2$. Therefore
\begin{align*}
\int_{\mathbb{R}^n \setminus B(0,R)} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) \leq R^{-2}\int_{\mathbb{R}^n}|\xi|^2|\hat f(\xi)|^2\,d\mathcal{L}^n(\xi).
\end{align*}
Using the Fourier derivative identity $\widehat{\partial_{x_j}f}(\xi)=i\xi_j\hat f(\xi)$ and the Plancherel theorem for each $\partial_{x_j}f$, we get
\begin{align*}
\int_{\mathbb{R}^n}|\xi|^2|\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) = \sum_{j=1}^n \|\partial_{x_j}f\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
By the definition of the $L^2$ norm of the gradient,
\begin{align*}
\sum_{j=1}^n \|\partial_{x_j}f\|_{L^2(\mathbb{R}^n)}^2 = \|\nabla f\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
Combining the two estimates gives, for every $R > 0$,
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^2 \leq \frac{\alpha_n}{(2\pi)^n}R^n\|f\|_{L^1(\mathbb{R}^n)}^2 + R^{-2}\|\nabla f\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
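To make the constants concrete, here is the same estimate specialized to dimension $n = 1$, offered only as an illustration. The unit ball is the interval $(-1,1)$, so $\alpha_1 = 2$ and $\alpha_1/(2\pi) = 1/\pi$, and the estimate reads
\begin{align*}
\|f\|_{L^2(\mathbb{R})}^2 \leq \frac{R}{\pi}\|f\|_{L^1(\mathbb{R})}^2 + R^{-2}\|f'\|_{L^2(\mathbb{R})}^2 \qquad \text{for every } R > 0.
\end{align*}
A small radius shrinks the first term but inflates the second, and a large radius does the opposite, so neither extreme is useful on its own.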
[guided]
The Fourier-side estimate works because the two frequency regions are controlled by different information about $f$. At low frequencies we control the size of $\hat f$ itself, and the only general pointwise bound available is the $L^1$ estimate
\begin{align*}
|\hat f(\xi)| \leq \frac{1}{(2\pi)^{n/2}}\|f\|_{L^1(\mathbb{R}^n)}.
\end{align*}
This estimate is valid because the Fourier transform was defined by an absolutely convergent integral and the triangle inequality applies to that integral.
We now introduce a radius $R > 0$ and decompose $\mathbb{R}^n$ as the disjoint union of $B(0,R)$ and $\mathbb{R}^n \setminus B(0,R)$. Define $\alpha_n := \mathcal{L}^n(B(0,1))$, the Lebesgue measure of the unit ball in $\mathbb{R}^n$. The Plancherel theorem for the Fourier transform on $L^2(\mathbb{R}^n)$ applies because $f \in C_c^\infty(\mathbb{R}^n) \subset L^2(\mathbb{R}^n)$; combined with the additivity of the integral over this disjoint union, it gives
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^2 = \int_{B(0,R)} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) + \int_{\mathbb{R}^n \setminus B(0,R)} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi).
\end{align*}
For the low-frequency part, we use the pointwise bound uniformly on $B(0,R)$:
\begin{align*}
\int_{B(0,R)} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) \leq \frac{1}{(2\pi)^n}\|f\|_{L^1(\mathbb{R}^n)}^2\mathcal{L}^n(B(0,R)).
\end{align*}
The scaling of Lebesgue measure under the dilation $\xi = R\eta$ gives $\mathcal{L}^n(B(0,R)) = R^n\mathcal{L}^n(B(0,1)) = \alpha_n R^n$, so
\begin{align*}
\int_{B(0,R)} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) \leq \frac{\alpha_n}{(2\pi)^n}R^n\|f\|_{L^1(\mathbb{R}^n)}^2.
\end{align*}
For the high-frequency part, the point is that large frequencies can be paid for by derivatives. On $\mathbb{R}^n \setminus B(0,R)$, the inequality $|\xi| \geq R$ implies $|\hat f(\xi)|^2 \leq R^{-2}|\xi|^2|\hat f(\xi)|^2$. Hence
\begin{align*}
\int_{\mathbb{R}^n \setminus B(0,R)} |\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) \leq R^{-2}\int_{\mathbb{R}^n}|\xi|^2|\hat f(\xi)|^2\,d\mathcal{L}^n(\xi).
\end{align*}
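The Fourier derivative identity $\widehat{\partial_{x_j}f}(\xi)=i\xi_j\hat f(\xi)$ applies because $f$ is smooth and compactly supported, so integration by parts produces no boundary terms; moreover, each $\partial_{x_j}f$ lies in $C_c^\infty(\mathbb{R}^n) \subset L^2(\mathbb{R}^n)$. Expanding $|\xi|^2 = \sum_{j=1}^n \xi_j^2$ gives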
\begin{align*}
\int_{\mathbb{R}^n}|\xi|^2|\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) = \sum_{j=1}^n \int_{\mathbb{R}^n}|\xi_j|^2|\hat f(\xi)|^2\,d\mathcal{L}^n(\xi).
\end{align*}
For each $j \in \{1,\dots,n\}$, the derivative identity gives $|\widehat{\partial_{x_j}f}(\xi)|^2 = |\xi_j|^2|\hat f(\xi)|^2$, so Plancherel applied to $\partial_{x_j}f$ gives
\begin{align*}
\int_{\mathbb{R}^n}|\xi_j|^2|\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) = \|\partial_{x_j}f\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
Therefore
\begin{align*}
\int_{\mathbb{R}^n}|\xi|^2|\hat f(\xi)|^2\,d\mathcal{L}^n(\xi) = \sum_{j=1}^n\|\partial_{x_j}f\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
By the definition of the $L^2$ norm of the gradient, the last sum is $\|\nabla f\|_{L^2(\mathbb{R}^n)}^2$. Therefore, for every $R > 0$,
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^2 \leq \frac{\alpha_n}{(2\pi)^n}R^n\|f\|_{L^1(\mathbb{R}^n)}^2 + R^{-2}\|\nabla f\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
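One way to see why the powers $R^n$ and $R^{-2}$ are the natural ones is a dilation check. The following sketch assumes the notation $f_\lambda(x) := f(\lambda x)$ for $\lambda > 0$, which is introduced here only for illustration. A change of variables gives
\begin{align*}
\|f_\lambda\|_{L^1(\mathbb{R}^n)}^2 &= \lambda^{-2n}\|f\|_{L^1(\mathbb{R}^n)}^2, \\
\|f_\lambda\|_{L^2(\mathbb{R}^n)}^2 &= \lambda^{-n}\|f\|_{L^2(\mathbb{R}^n)}^2, \\
\|\nabla f_\lambda\|_{L^2(\mathbb{R}^n)}^2 &= \lambda^{2-n}\|\nabla f\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
Applying the estimate to $f_\lambda$ with radius $R$ and multiplying through by $\lambda^n$ yields
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^2 \leq \frac{\alpha_n}{(2\pi)^n}\left(\frac{R}{\lambda}\right)^n\|f\|_{L^1(\mathbb{R}^n)}^2 + \left(\frac{R}{\lambda}\right)^{-2}\|\nabla f\|_{L^2(\mathbb{R}^n)}^2,
\end{align*}
which is the same estimate for $f$ with radius $R/\lambda$. The family of estimates indexed by $R$ is therefore invariant under dilation, consistent with the fact that frequency scales inversely to space.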
[/guided]
[/step]
[step:Choose the frequency radius that balances the two bounds]
Set
\begin{align*}
A := \|f\|_{L^1(\mathbb{R}^n)}^2.
\end{align*}
Define also
\begin{align*}
B := \|\nabla f\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
Finally set
\begin{align*}
a_n := \frac{\alpha_n}{(2\pi)^n}.
\end{align*}
The preceding estimate is
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^2 \leq a_n A R^n + B R^{-2}.
\end{align*}
If $A = 0$, then $f = 0$ $\mathcal{L}^n$-a.e., so both sides of the desired inequality vanish and it holds trivially. If $B = 0$, then $\nabla f = 0$ in $L^2(\mathbb{R}^n)$, so $f$ is $\mathcal{L}^n$-a.e. equal to a constant; since $f \in L^1(\mathbb{R}^n)$, this constant must be $0$, and the desired inequality again holds because both sides vanish.
Assume now that $A > 0$ and $B > 0$. Choose
\begin{align*}
R := \left(\frac{B}{A}\right)^{1/(n+2)}.
\end{align*}
Then
\begin{align*}
A R^n = A^{2/(n+2)}B^{n/(n+2)}
\end{align*}
and
\begin{align*}
B R^{-2} = A^{2/(n+2)}B^{n/(n+2)}.
\end{align*}
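The exponent bookkeeping behind these two identities can be written out as follows, using only the definition of $R$:
\begin{align*}
A R^n &= A\left(\frac{B}{A}\right)^{n/(n+2)} = A^{1 - n/(n+2)}B^{n/(n+2)} = A^{2/(n+2)}B^{n/(n+2)}, \\
B R^{-2} &= B\left(\frac{B}{A}\right)^{-2/(n+2)} = A^{2/(n+2)}B^{1 - 2/(n+2)} = A^{2/(n+2)}B^{n/(n+2)}.
\end{align*}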
Thus
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^2 \leq (a_n + 1)A^{2/(n+2)}B^{n/(n+2)}.
\end{align*}
Raising both sides to the power $(n+2)/n$ gives
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^{2+4/n} \leq (a_n + 1)^{(n+2)/n}A^{2/n}B.
\end{align*}
Substituting the definitions of $A$ and $B$, we obtain
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^{2+4/n} \leq (a_n + 1)^{(n+2)/n}\|f\|_{L^1(\mathbb{R}^n)}^{4/n}\|\nabla f\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
So the Nash inequality holds for every $f \in C_c^\infty(\mathbb{R}^n)$ with
\begin{align*}
C_n := \left(1 + \frac{\alpha_n}{(2\pi)^n}\right)^{(n+2)/n}.
\end{align*}
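For orientation, the constant can be evaluated in low dimensions using the standard values $\alpha_1 = 2$, $\alpha_2 = \pi$, and $\alpha_3 = 4\pi/3$:
\begin{align*}
C_1 = \left(1 + \frac{1}{\pi}\right)^{3}, \qquad C_2 = \left(1 + \frac{1}{4\pi}\right)^{2}, \qquad C_3 = \left(1 + \frac{1}{6\pi^2}\right)^{5/3}.
\end{align*}
These values are not claimed to be sharp. The radius chosen above equalizes $AR^n$ and $BR^{-2}$ but ignores the factor $a_n$. As a side remark that is not needed for the proof, minimizing $a_nAR^n + BR^{-2}$ over $R > 0$ (with the minimizer denoted $R_*$, a symbol introduced here only for illustration) gives
\begin{align*}
R_*^{n+2} = \frac{2B}{n a_n A}, \qquad a_nAR_*^n + BR_*^{-2} = \frac{n+2}{n}\left(\frac{n a_n}{2}\right)^{2/(n+2)}A^{2/(n+2)}B^{n/(n+2)},
\end{align*}
which leads to the same exponents with a constant no larger than $C_n$.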
[/step]
[step:Pass to general functions by density]
Let $f \in L^1(\mathbb{R}^n) \cap H^1(\mathbb{R}^n)$. We use the standard fact that $C_c^\infty(\mathbb{R}^n)$ is dense in $L^1(\mathbb{R}^n)\cap H^1(\mathbb{R}^n)$ with respect to the norm
\begin{align*}
\|u\|_{L^1(\mathbb{R}^n)}+\|u\|_{H^1(\mathbb{R}^n)}.
\end{align*}
For completeness, we sketch why this density statement holds. First multiply $f$ by smooth cutoff functions $\zeta_m: \mathbb{R}^n \to [0,1]$ with $\zeta_m = 1$ on $B(0,m)$, $\operatorname{supp}\zeta_m \subset B(0,2m)$, and $|\nabla \zeta_m| \leq C m^{-1}$. Then $\zeta_m f \to f$ in both $L^1(\mathbb{R}^n)$ and $H^1(\mathbb{R}^n)$ as $m \to \infty$, by dominated convergence applied to the tails of $|f|$, $|f|^2$, and $|\nabla f|^2$ together with the gradient bound on $\zeta_m$. Next mollify each compactly supported function $\zeta_m f$ by a [standard mollifier](/page/Standard%20Mollifier) at a sufficiently small scale; the mollified functions belong to $C_c^\infty(\mathbb{R}^n)$ and converge to $\zeta_m f$ in $L^1(\mathbb{R}^n)$ and in $H^1(\mathbb{R}^n)$ by the standard approximate-identity property of mollifiers. A diagonal choice of cutoff radius and mollification scale then produces a sequence $(f_k)_{k=1}^\infty$ with $f_k \in C_c^\infty(\mathbb{R}^n)$ such that
\begin{align*}
\|f_k - f\|_{L^1(\mathbb{R}^n)} + \|f_k - f\|_{H^1(\mathbb{R}^n)} \to 0.
\end{align*}
In particular,
\begin{align*}
\|f_k\|_{L^1(\mathbb{R}^n)} \to \|f\|_{L^1(\mathbb{R}^n)}.
\end{align*}
Also,
\begin{align*}
\|f_k\|_{L^2(\mathbb{R}^n)} \to \|f\|_{L^2(\mathbb{R}^n)}.
\end{align*}
Finally,
\begin{align*}
\|\nabla f_k\|_{L^2(\mathbb{R}^n)} \to \|\nabla f\|_{L^2(\mathbb{R}^n)}.
\end{align*}
For each $k$, the smooth case gives
\begin{align*}
\|f_k\|_{L^2(\mathbb{R}^n)}^{2+4/n} \leq C_n\|f_k\|_{L^1(\mathbb{R}^n)}^{4/n}\|\nabla f_k\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
Taking the limit as $k \to \infty$ and using the continuity of powers and products on $[0,\infty)$ yields
\begin{align*}
\|f\|_{L^2(\mathbb{R}^n)}^{2+4/n} \leq C_n\|f\|_{L^1(\mathbb{R}^n)}^{4/n}\|\nabla f\|_{L^2(\mathbb{R}^n)}^2.
\end{align*}
This proves the stated inequality for every $f \in L^1(\mathbb{R}^n) \cap H^1(\mathbb{R}^n)$.
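The cutoff step in the density argument can be made explicit. The following sketch assumes the cutoffs $\zeta_m$ are chosen as described above, with $|\nabla \zeta_m| \leq C m^{-1}$ for a constant $C$ independent of $m$. By the product rule, $\nabla(\zeta_m f) = \zeta_m \nabla f + f\nabla\zeta_m$, and since $0 \leq 1 - \zeta_m \leq 1$ vanishes on $B(0,m)$,
\begin{align*}
\|\zeta_m f - f\|_{L^p(\mathbb{R}^n)} &\leq \|f\|_{L^p(\mathbb{R}^n \setminus B(0,m))} \qquad (p \in \{1,2\}), \\
\|\nabla(\zeta_m f) - \nabla f\|_{L^2(\mathbb{R}^n)} &\leq \|\nabla f\|_{L^2(\mathbb{R}^n \setminus B(0,m))} + \frac{C}{m}\|f\|_{L^2(\mathbb{R}^n)}.
\end{align*}
Each right-hand side tends to $0$ as $m \to \infty$: the tail norms by dominated convergence, and the last term because $\|f\|_{L^2(\mathbb{R}^n)}$ is finite.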
[/step]