[proofplan]
We prove $\|M_\varphi a\|_{L^1(\mathbb{R}^n)} \le C_n$ for the grand maximal function $M_\varphi a$ associated to a fixed Schwartz function $\varphi$ with $\int \varphi\,d\mathcal{L}^n = 1$. The integration domain $\mathbb{R}^n$ splits into the dilated ball $2B$, where we use $L^2$ control via the Hardy--Littlewood maximal inequality, and the complement $(2B)^c$, where the cancellation $\int a\,d\mathcal{L}^n = 0$ allows us to expand the kernel difference $\varphi_t(x - y) - \varphi_t(x - x_B)$ and exploit Schwartz decay. Both regions yield bounds depending only on $n$ and $\varphi$.
[/proofplan]
[step:Set up the grand maximal function and reduce to estimating its $L^1$ norm]
Fix once and for all a Schwartz function $\varphi: \mathbb{R}^n \to \mathbb{R}$ with $\int_{\mathbb{R}^n} \varphi\,d\mathcal{L}^n = 1$ and $\varphi \ge 0$. (The non-negativity is a free choice: a normalised Gaussian works, for instance, and the [independence of approximation kernel theorem](/theorems/???) ensures that the resulting Hardy norm is equivalent to the one defined by any other admissible kernel.) Define the dilates
\begin{align*}
\varphi_t : \mathbb{R}^n &\to \mathbb{R} \\
x &\mapsto t^{-n}\varphi(x/t), \qquad t > 0,
\end{align*}
which satisfy $\int \varphi_t\,d\mathcal{L}^n = 1$ for every $t > 0$. The grand maximal function of $a$ is
\begin{align*}
M_\varphi a : \mathbb{R}^n &\to [0, \infty] \\
x &\mapsto \sup_{t > 0}|(\varphi_t * a)(x)|,
\end{align*}
where $(\varphi_t * a)(x) = \int_{\mathbb{R}^n}\varphi_t(x - y)a(y)\,d\mathcal{L}^n(y)$.
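As a quick check of the normalisation $\int \varphi_t\,d\mathcal{L}^n = 1$ stated above, one can substitute $u = x/t$, so that $d\mathcal{L}^n(x) = t^n\,d\mathcal{L}^n(u)$; the variable name $u$ is introduced here only for illustration:
\begin{align*}
\int_{\mathbb{R}^n}\varphi_t(x)\,d\mathcal{L}^n(x) = \int_{\mathbb{R}^n}t^{-n}\varphi(x/t)\,d\mathcal{L}^n(x) = \int_{\mathbb{R}^n}t^{-n}\varphi(u)\,t^n\,d\mathcal{L}^n(u) = \int_{\mathbb{R}^n}\varphi(u)\,d\mathcal{L}^n(u) = 1.
\end{align*}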
Let $r_B > 0$ denote the radius of $B$ and $x_B \in \mathbb{R}^n$ its centre, so $|B| = \alpha_n r_B^n$ where $\alpha_n = \mathcal{L}^n(B(0,1))$. The dilated ball $2B := B(x_B, 2r_B)$ has $|2B| = 2^n |B|$. The Hardy norm of $a$ is
\begin{align*}
\|a\|_{H^1(\mathbb{R}^n)} = \|M_\varphi a\|_{L^1(\mathbb{R}^n)} = \int_{2B}M_\varphi a\,d\mathcal{L}^n + \int_{(2B)^c}M_\varphi a\,d\mathcal{L}^n.
\end{align*}
We bound each summand by a constant depending only on $n$ and $\varphi$.
[/step]
[step:Bound the integral over $2B$ using the Hardy--Littlewood $L^2$ estimate]
We show $\int_{2B}M_\varphi a\,d\mathcal{L}^n \le C_n^{(1)}$.
Pointwise, since $\varphi \ge 0$ and $\varphi$ has a radially decreasing majorant in $L^1$ (an immediate consequence of $\varphi \in \mathcal{S}(\mathbb{R}^n)$), the [pointwise domination of grand maximal by Hardy--Littlewood maximal](/theorems/???) gives a constant $C_\varphi > 0$ depending only on $\varphi$ such that
\begin{align*}
M_\varphi a(x) \le C_\varphi\,M a(x), \qquad x \in \mathbb{R}^n,
\end{align*}
where $M a(x) = \sup_{r > 0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|a(y)|\,d\mathcal{L}^n(y)$ is the [Hardy--Littlewood maximal function](/page/Hardy--Littlewood%20Maximal%20Function).
By the [strong $(2,2)$ bound for the Hardy--Littlewood maximal function](/theorems/???), there is a constant $A_n > 0$ depending only on $n$ such that
\begin{align*}
\|M a\|_{L^2(\mathbb{R}^n)} \le A_n\,\|a\|_{L^2(\mathbb{R}^n)}.
\end{align*}
We compute $\|a\|_{L^2}$. Since $a$ is supported in $B$ and $\|a\|_{L^\infty} \le |B|^{-1}$,
\begin{align*}
\|a\|_{L^2(\mathbb{R}^n)}^2 = \int_B |a(y)|^2\,d\mathcal{L}^n(y) \le |B| \cdot |B|^{-2} = |B|^{-1},
\end{align*}
so $\|a\|_{L^2} \le |B|^{-1/2}$.
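The following illustrative example, which is not part of the original argument, suggests that this bound cannot be improved in general. Assume $n = 1$ and $B = (x_B - r_B, x_B + r_B)$, and consider the hypothetical atom $a = |B|^{-1}\bigl(\mathbb{1}_{(x_B,\,x_B + r_B)} - \mathbb{1}_{(x_B - r_B,\,x_B)}\bigr)$. It is supported in $B$, satisfies $\|a\|_{L^\infty} = |B|^{-1}$ and $\int a\,d\mathcal{L}^1 = |B|^{-1}(r_B - r_B) = 0$, and
\begin{align*}
\|a\|_{L^2(\mathbb{R})}^2 = \int_B |B|^{-2}\,d\mathcal{L}^1(y) = |B| \cdot |B|^{-2} = |B|^{-1},
\end{align*}
so equality holds in the $L^2$ estimate for this choice of $a$.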
Applying the [Cauchy--Schwarz inequality](/theorems/???) with weight $\mathbb{1}_{2B}$,
\begin{align*}
\int_{2B}M a\,d\mathcal{L}^n = \int_{\mathbb{R}^n}\mathbb{1}_{2B} \cdot M a\,d\mathcal{L}^n \le \|\mathbb{1}_{2B}\|_{L^2}\,\|M a\|_{L^2} \le |2B|^{1/2} \cdot A_n\,\|a\|_{L^2}.
\end{align*}
Substituting $|2B|^{1/2} = 2^{n/2}|B|^{1/2}$ and $\|a\|_{L^2} \le |B|^{-1/2}$,
\begin{align*}
\int_{2B}M a\,d\mathcal{L}^n \le 2^{n/2}|B|^{1/2} \cdot A_n \cdot |B|^{-1/2} = 2^{n/2}A_n.
\end{align*}
Combining with the pointwise bound,
\begin{align*}
\int_{2B}M_\varphi a\,d\mathcal{L}^n \le C_\varphi\int_{2B}M a\,d\mathcal{L}^n \le 2^{n/2}A_n C_\varphi =: C_n^{(1)}.
\end{align*}
[/step]
[step:Bound the integral over $(2B)^c$ using the cancellation $\int a = 0$ and Schwartz decay]
We show $\int_{(2B)^c}M_\varphi a\,d\mathcal{L}^n \le C_n^{(2)}$.
Fix $x \in (2B)^c$ and $t > 0$. Since $\int_{\mathbb{R}^n}a\,d\mathcal{L}^n = 0$, the term $\varphi_t(x - x_B)\int_{\mathbb{R}^n}a\,d\mathcal{L}^n$ vanishes (its first factor is independent of the integration variable), so we may subtract it from $(\varphi_t * a)(x)$:
\begin{align*}
(\varphi_t * a)(x) &= \int_{\mathbb{R}^n}\varphi_t(x - y)\,a(y)\,d\mathcal{L}^n(y) \\
&= \int_{B}[\varphi_t(x - y) - \varphi_t(x - x_B)]\,a(y)\,d\mathcal{L}^n(y),
\end{align*}
the second equality using that $a$ is supported in $B$.
Apply the [mean value theorem](/theorems/???) along the segment from $x - x_B$ to $x - y$: there exists $\xi(x,y,t)$ on the segment such that
\begin{align*}
\varphi_t(x - y) - \varphi_t(x - x_B) = \nabla\varphi_t(\xi(x,y,t)) \cdot (x_B - y).
\end{align*}
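One way to justify this identity, sketched here with an auxiliary function $g$ and parameter $s_*$ that do not appear in the original text, is to apply the one-variable mean value theorem to the real-valued function
\begin{align*}
g : [0,1] &\to \mathbb{R} \\
s &\mapsto \varphi_t\bigl(x - x_B + s(x_B - y)\bigr),
\end{align*}
for which $g(0) = \varphi_t(x - x_B)$ and $g(1) = \varphi_t(x - y)$. There is some $s_* \in (0,1)$ with
\begin{align*}
g(1) - g(0) = g'(s_*) = \nabla\varphi_t\bigl(x - x_B + s_*(x_B - y)\bigr) \cdot (x_B - y),
\end{align*}
so one may take $\xi(x,y,t) = x - x_B + s_*(x_B - y)$.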
By the chain rule, $\nabla \varphi_t(z) = t^{-n-1}(\nabla\varphi)(z/t)$. Since $\varphi \in \mathcal{S}(\mathbb{R}^n)$, the gradient $\nabla\varphi$ has rapid decay; in particular, there is a constant $C_{\varphi,1} > 0$ depending only on $\varphi$ and $n$ such that
\begin{align*}
|\nabla\varphi(w)| \le C_{\varphi,1}(1 + |w|)^{-n-1}, \qquad w \in \mathbb{R}^n.
\end{align*}
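One admissible choice of constant, given here as an illustration since the text does not specify $C_{\varphi,1}$, is the weighted supremum
\begin{align*}
C_{\varphi,1} := \sup_{w \in \mathbb{R}^n}(1 + |w|)^{n+1}\,|\nabla\varphi(w)|,
\end{align*}
which is finite because every polynomially weighted derivative of a Schwartz function is bounded. With this choice the displayed decay estimate holds by definition.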
Now estimate the gradient term. For $y \in B$ and $x \in (2B)^c$, the segment from $x - x_B$ to $x - y$ lies in the set $\{z \in \mathbb{R}^n : |z| \ge |x - x_B| - r_B \ge \tfrac{1}{2}|x - x_B|\}$, since $|x - x_B| \ge 2r_B$ implies $|x - x_B| - r_B \ge \tfrac{1}{2}|x - x_B|$. Therefore $|\xi(x,y,t)| \ge \tfrac{1}{2}|x - x_B|$, and
\begin{align*}
|\nabla\varphi_t(\xi)| = t^{-n-1}|\nabla\varphi(\xi/t)| \le C_{\varphi,1}\,t^{-n-1}(1 + |\xi|/t)^{-n-1} = C_{\varphi,1}(t + |\xi|)^{-n-1}.
\end{align*}
Using $|\xi| \ge \tfrac{1}{2}|x - x_B|$ and discarding the positive term $t$ in $t + |\xi|$, which only increases $(t + |\xi|)^{-n-1}$,
\begin{align*}
|\nabla\varphi_t(\xi)| \le C_{\varphi,1}\,2^{n+1}|x - x_B|^{-n-1}.
\end{align*}
This bound is uniform in $t > 0$.
Combine with $|x_B - y| \le r_B$ for $y \in B$:
\begin{align*}
|\varphi_t(x - y) - \varphi_t(x - x_B)| \le C_{\varphi,1}\,2^{n+1}\,\frac{r_B}{|x - x_B|^{n+1}}.
\end{align*}
Substituting into the integral,
\begin{align*}
|(\varphi_t * a)(x)| &\le \int_B C_{\varphi,1}\,2^{n+1}\,\frac{r_B}{|x - x_B|^{n+1}}\,|a(y)|\,d\mathcal{L}^n(y) \\
&= C_{\varphi,1}\,2^{n+1}\,\frac{r_B}{|x - x_B|^{n+1}}\,\|a\|_{L^1(\mathbb{R}^n)}.
\end{align*}
We bound $\|a\|_{L^1(\mathbb{R}^n)} \le |B|\cdot|B|^{-1} = 1$ using support in $B$ and the size condition $\|a\|_{L^\infty} \le |B|^{-1}$.
Taking the supremum over $t > 0$,
\begin{align*}
M_\varphi a(x) \le C_{\varphi,1}\,2^{n+1}\,\frac{r_B}{|x - x_B|^{n+1}}, \qquad x \in (2B)^c.
\end{align*}
Integrating over $(2B)^c$, we apply the [polar coordinates formula for Lebesgue measure](/theorems/???), $d\mathcal{L}^n(x) = \rho^{n-1}\,d\sigma(\omega)\,d\mathcal{L}^1(\rho)$ under the substitution $x = x_B + \rho\omega$ with $\omega \in S^{n-1}$ and $\rho > 0$. The condition $x \in (2B)^c$ becomes $\rho > 2r_B$, and $|x - x_B| = \rho$:
\begin{align*}
\int_{(2B)^c}\frac{r_B}{|x - x_B|^{n+1}}\,d\mathcal{L}^n(x) &= r_B\int_{S^{n-1}}d\sigma(\omega)\int_{2r_B}^\infty \rho^{-n-1}\rho^{n-1}\,d\mathcal{L}^1(\rho) \\
&= r_B\,n\alpha_n\int_{2r_B}^\infty\rho^{-2}\,d\mathcal{L}^1(\rho) \\
&= r_B\,n\alpha_n \cdot \frac{1}{2r_B} = \frac{n\alpha_n}{2},
\end{align*}
using $\sigma(S^{n-1}) = n\alpha_n$. Therefore
\begin{align*}
\int_{(2B)^c}M_\varphi a\,d\mathcal{L}^n \le C_{\varphi,1}\,2^{n+1}\cdot\frac{n\alpha_n}{2} = n\alpha_n\,2^n\,C_{\varphi,1} =: C_n^{(2)}.
\end{align*}
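As a sanity check on the constant $\tfrac{n\alpha_n}{2}$, consider the special case $n = 1$ (an illustrative case, not needed for the proof), where $\alpha_1 = 2$ and $(2B)^c = \{x \in \mathbb{R} : |x - x_B| \ge 2r_B\}$ consists of two half-lines. Computing directly,
\begin{align*}
\int_{(2B)^c}\frac{r_B}{|x - x_B|^{2}}\,d\mathcal{L}^1(x) = 2r_B\int_{2r_B}^\infty\rho^{-2}\,d\mathcal{L}^1(\rho) = 2r_B \cdot \frac{1}{2r_B} = 1 = \frac{1 \cdot \alpha_1}{2},
\end{align*}
which agrees with the general formula.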
[/step]
[step:Combine the two regions to obtain the dimension-only bound]
Adding the bounds from the previous two steps,
\begin{align*}
\|a\|_{H^1(\mathbb{R}^n)} = \int_{2B}M_\varphi a\,d\mathcal{L}^n + \int_{(2B)^c}M_\varphi a\,d\mathcal{L}^n \le C_n^{(1)} + C_n^{(2)} =: C_n.
\end{align*}
The constant $C_n = 2^{n/2}A_n C_\varphi + n\alpha_n 2^n C_{\varphi,1}$ depends only on the dimension $n$ and the fixed Schwartz function $\varphi$; any other admissible $\varphi$ yields an equivalent norm by the [independence of approximation kernel theorem](/theorems/???). This proves $a \in H^1(\mathbb{R}^n)$ with $\|a\|_{H^1} \le C_n$, completing the proof.
[/step]