[proofplan]
We use the [Coifman--Latter Atomic Decomposition](/theorems/3177) to reduce the global $H^1\to H^1$ bound to a uniform atomic estimate $\|Ta\|_{H^1}\le C_0$ for every $H^1$-atom $a$. The atomic cancellation $\int a = 0$, combined with the Hörmander smoothness, produces a summable annular decay of $Ta$ outside the doubled ball $2B$. The hypothesis $T^*(1) = 0$ is used to show $\int Ta = 0$ via duality. We decompose $Ta$ into an interior $L^2$-piece on $2B$ and an exterior sum of pieces on dyadic annuli, recentre each to be mean-zero, and assemble the residual constants into a single mean-zero combination $\Psi$ of nested-ball indicators. Each piece is a multiple of a $(1,2)$-atom; the $\ell^1$-bookkeeping of coefficients is summable by the Hörmander annular masses $\theta_k$.
[/proofplan]
[step:Reduce to a uniform atomic bound on $Ta$]
Let $f\in H^1(\mathbb{R}^n)$. By the [Coifman--Latter Atomic Decomposition](/theorems/3177), there exist $H^1$-atoms $\{a_j\}$ and scalars $\{\lambda_j\}\subseteq\mathbb{C}$ with
\begin{align*}
f = \sum_j\lambda_j a_j\quad\text{in } \mathcal{S}'(\mathbb{R}^n),\qquad \sum_j|\lambda_j|\le C_n^{\mathrm{at}}\,\|f\|_{H^1}.
\end{align*}
Suppose for every $H^1$-atom $a$,
\begin{align*}
\|Ta\|_{H^1(\mathbb{R}^n)}\le C_0 \tag{$\star\star$}
\end{align*}
with $C_0 = C_0(A,\|T\|_{\mathcal{L}(L^2)},n)$. Then by linearity and the converse direction of [Coifman--Latter](/theorems/3177),
\begin{align*}
\|Tf\|_{H^1(\mathbb{R}^n)}\le\sum_j|\lambda_j|\,\|Ta_j\|_{H^1}\le C_0\,C_n^{\mathrm{at}}\,\|f\|_{H^1},
\end{align*}
which is the conclusion. It remains to establish $(\star\star)$.
[/step]
[step:State the hypotheses and exploit atomic cancellation via Hörmander smoothness]
Fix an $H^1$-atom $a:\mathbb{R}^n\to\mathbb{C}$ supported in a ball $B = B(x_B,r_B)$ with $\|a\|_{L^\infty}\le|B|^{-1}$ and $\int_{\mathbb{R}^n}a\,d\mathcal{L}^n = 0$. By the [$H^1\to L^1$ boundedness theorem](/theorems/3178), $Ta\in L^1(\mathbb{R}^n)$ with $\|Ta\|_{L^1}\le C_1$ where $C_1 = C_1(A,\|T\|_{\mathcal{L}(L^2)},n)$.
**Hörmander smoothness condition.** The hypothesis (H) on $K$ states
\begin{align*}
\sup_{y,y'\in\mathbb{R}^n}\int_{|x-y|>2|y-y'|}|K(x,y) - K(x,y')|\,d\mathcal{L}^n(x)\le C_H. \tag{H}
\end{align*}
**Cancellation expansion from $\int a = 0$.** For $x\notin 2B$, the size of $K$ near $y\in B$ ensures $K(x,x_B)$ is finite, and the cancellation $\int a = 0$ gives
\begin{align*}
Ta(x) = \int_B[K(x,y) - K(x,x_B)]\,a(y)\,d\mathcal{L}^n(y)\quad\text{for } x\notin 2B. \tag{C}
\end{align*}
**Annular Hörmander mass.** For $k\ge 1$, define the dyadic annulus $A_k := \{x : 2^k r_B\le|x-x_B| < 2^{k+1}r_B\}$. The condition (H) applied with $y\in B$ and $y' := x_B$ (so $|y - y'|\le r_B$, and the integration region $\{|x-y| > 2|y-x_B|\}$ contains $\{|x-y|>2r_B\}$ which contains $\{|x-x_B|>3r_B\}\supseteq\bigcup_{k\ge 2}A_k$ — hence on $\bigcup_{k\ge 2}A_k$ the local Hörmander integral is bounded by $C_H$) gives
\begin{align*}
\theta_k := \sup_{y\in B}\int_{A_k}|K(x,y) - K(x,x_B)|\,d\mathcal{L}^n(x)
\end{align*}
satisfies $\sum_{k\ge 1}\theta_k\le C_H$. (For $k = 1$ — the inner annulus $\{2r_B\le|x-x_B|<4r_B\}$ — the bound on $\theta_1$ follows from the size condition $|K(x,y)|\le A|x-y|^{-n}$ on the bounded annular volume, yielding $\theta_1\le C(A,n)$, which is absorbed into $C_H$.) We do **not** assume any quantitative power-rate decay $\theta_k\le C\,2^{-k\delta}$; only summability is used.
[/step]
[step:Use $T^*(1) = 0$ to derive $\int Ta\,d\mathcal{L}^n = 0$]
The hypothesis $T^*(1) = 0$ (column integrals of $K$ vanish) is interpreted as follows: for every $\phi\in L^2(\mathbb{R}^n)$ with bounded support and $\int\phi\,d\mathcal{L}^n = 0$, the duality pairing
\begin{align*}
\langle T\phi, 1\rangle = \langle \phi, T^*(1)\rangle = 0,
\end{align*}
where the left-hand side is interpreted as the integral $\int T\phi\,d\mathcal{L}^n$ (which is well-defined because $T\phi\in L^1$ by the [$H^1\to L^1$ boundedness theorem](/theorems/3178), since $\phi$ is itself a multiple of a $(1,2)$-atom). The pairing $\langle\phi,T^*(1)\rangle$ vanishes by the column-integral hypothesis: for $\phi\in L^2_c$ with $\int\phi = 0$, the BMO duality $\langle\phi,T^*(1)\rangle$ reduces to $\int T^*(1)(y)\phi(y)\,d\mathcal{L}^n(y) = 0$ since $T^*(1) = 0$ in BMO modulo constants and $\int\phi = 0$ kills any constant offset.
Applied to $\phi := a$ (an $H^1$-atom, in particular $L^2$ with bounded support and mean zero),
\begin{align*}
\int_{\mathbb{R}^n}Ta\,d\mathcal{L}^n = 0. \tag{Z}
\end{align*}
This is the genuine use of $T^*(1) = 0$: it ensures the global cancellation needed for the redistribution of constants in Step 5.
(The hypothesis $T(1) = 0$, the row-integral condition, is not used directly in this proof — it is assumed in the theorem statement for symmetry with $T^*(1) = 0$ and because the joint condition is the form in which $T$ is most naturally defined as a CZO; see the remark after the theorem statement.)
[/step]
[step:Decompose $Ta$ into annular pieces and estimate each in $L^2$]
Set $A_0 := 2B$. Then $\mathbb{R}^n = \bigsqcup_{k\ge 0}A_k$. Define $m_k:\mathbb{R}^n\to\mathbb{C}$, $x\mapsto Ta(x)\,\mathbb{1}_{A_k}(x)$. Then $\sum_{k\ge 0}m_k = Ta$ pointwise.
**Case $k = 0$.** By the $L^2$-boundedness of $T$ and the $(1,2)$-atom bound $\|a\|_{L^2}\le|B|^{-1/2}$ (every $(1,\infty)$-atom is a $(1,2)$-atom; the $L^2$ bound follows from $\|a\|_{L^2}^2\le\|a\|_{L^\infty}^2|B|\le|B|^{-1}$),
\begin{align*}
\|m_0\|_{L^2}\le\|Ta\|_{L^2}\le\|T\|_{\mathcal{L}(L^2)}\,\|a\|_{L^2}\le\|T\|_{\mathcal{L}(L^2)}\,|B|^{-1/2}.
\end{align*}
**Case $k\ge 1$.** Use the cancellation expansion (C). Apply [Cauchy--Schwarz](/theorems/432) to the $y$-integral:
\begin{align*}
|Ta(x)|^2\le\biggl(\int_B|K(x,y) - K(x,x_B)|^2\,d\mathcal{L}^n(y)\biggr)\cdot\|a\|_{L^2(B)}^2\le|B|^{-1}\int_B|K(x,y) - K(x,x_B)|^2\,d\mathcal{L}^n(y).
\end{align*}
Integrate over $A_k$ and apply [Tonelli](/theorems/2956):
\begin{align*}
\|m_k\|_{L^2}^2\le|B|^{-1}\int_B\int_{A_k}|K(x,y) - K(x,x_B)|^2\,d\mathcal{L}^n(x)\,d\mathcal{L}^n(y).
\end{align*}
The size bound on $K$ gives $\|K(\cdot,y) - K(\cdot,x_B)\|_{L^\infty(A_k)}\le 2A\,(2^{k-1}r_B)^{-n} = 2^{1+n}A\,(2^k r_B)^{-n}$ (since for $y\in B$, $x\in A_k$ with $k\ge 1$, $|x-y|\ge 2^k r_B - r_B\ge 2^{k-1}r_B$). The $L^1$-norm on $A_k$ is bounded by $\theta_k$. The interpolation $\int|g|^2\le\|g\|_{L^\infty}\int|g|$ gives
\begin{align*}
\int_{A_k}|K(x,y) - K(x,x_B)|^2\,d\mathcal{L}^n(x)\le\theta_k\cdot 2^{1+n}A\,(2^k r_B)^{-n}.
\end{align*}
Substituting and using $|B| = \alpha_n r_B^n$,
\begin{align*}
\|m_k\|_{L^2}^2\le|B|^{-1}\cdot|B|\cdot 2^{1+n}A\,\theta_k\,2^{-kn}r_B^{-n} = 2^{1+n}A\alpha_n\,\theta_k\,2^{-kn}\,|B|^{-1},
\end{align*}
hence $\|m_k\|_{L^2}\le C(A,n)\,\theta_k^{1/2}\,2^{-kn/2}\,|B|^{-1/2}$.
[/step]
[step:Split into an interior $(1,2)$-atom on $2B$ plus an exterior sum of $(1,2)$-atoms]
Define the **interior piece** $M_{\mathrm{in}} := Ta\,\mathbb{1}_{2B} = m_0$ and the **exterior piece** $M_{\mathrm{ex}} := Ta\,\mathbb{1}_{(2B)^c} = \sum_{k\ge 1}m_k$. Cancellation (Z) gives $\int M_{\mathrm{in}}\,d\mathcal{L}^n + \int M_{\mathrm{ex}}\,d\mathcal{L}^n = 0$.
**Interior $(1,2)$-atom $\tilde M_{\mathrm{in}}$.** Set
\begin{align*}
\tilde M_{\mathrm{in}}(x) := M_{\mathrm{in}}(x) - c_{\mathrm{in}}\frac{\mathbb{1}_{2B}(x)}{|2B|},\qquad c_{\mathrm{in}} := \int M_{\mathrm{in}}\,d\mathcal{L}^n.
\end{align*}
Then $\operatorname{supp}\tilde M_{\mathrm{in}}\subseteq 2B$, $\int\tilde M_{\mathrm{in}}\,d\mathcal{L}^n = 0$. By Cauchy--Schwarz, $|c_{\mathrm{in}}|\le|2B|^{1/2}\|M_{\mathrm{in}}\|_{L^2}\le 2^{n/2}|B|^{1/2}\cdot\|T\|_{\mathcal{L}(L^2)}|B|^{-1/2} = 2^{n/2}\|T\|_{\mathcal{L}(L^2)}$. The $L^2$-norm
\begin{align*}
\|\tilde M_{\mathrm{in}}\|_{L^2}\le\|M_{\mathrm{in}}\|_{L^2} + |c_{\mathrm{in}}|\,|2B|^{-1/2}\le\|T\|_{\mathcal{L}(L^2)}|B|^{-1/2} + 2^{n/2}\|T\|_{\mathcal{L}(L^2)}\cdot 2^{-n/2}|B|^{-1/2} = 2\|T\|_{\mathcal{L}(L^2)}\,|B|^{-1/2}.
\end{align*}
Comparing $|B|^{-1/2}$ to $|2B|^{-1/2} = 2^{-n/2}|B|^{-1/2}$, the function $\tilde M_{\mathrm{in}}/(2^{1+n/2}\|T\|_{\mathcal{L}(L^2)})$ is a $(1,2)$-atom adapted to $2B$. Hence
\begin{align*}
\|\tilde M_{\mathrm{in}}\|_{H^1_{\mathrm{at}}}\le C_n^{(1)}\cdot 2^{1+n/2}\|T\|_{\mathcal{L}(L^2)} =: C_{\mathrm{in}},
\end{align*}
where $C_n^{(1)}$ is the dimensional constant from the equivalence of $(1,2)$- and $(1,\infty)$-atomic norms (via the Calderón--Zygmund decomposition of the $L^2$-bounded $\tilde M_{\mathrm{in}}$ into $(1,\infty)$-atoms; see Stein, op. cit., Ch. III §2.4 or Grafakos, *Modern Fourier Analysis*, 3rd ed., Theorem 2.4.6).
**Exterior $(1,2)$-atoms $\tilde m_k$.** For $k\ge 1$, let $\tilde B_k := B(x_B, 2^{k+1}r_B)$, with $|\tilde B_k| = 2^{(k+1)n}|B|$. Define
\begin{align*}
\tilde m_k(x) := m_k(x) - c_k\,\frac{\mathbb{1}_{\tilde B_k}(x)}{|\tilde B_k|},\qquad c_k := \int m_k\,d\mathcal{L}^n.
\end{align*}
Then $\operatorname{supp}\tilde m_k\subseteq\tilde B_k$, $\int\tilde m_k\,d\mathcal{L}^n = 0$. From Step 4 and Cauchy--Schwarz on $A_k$,
\begin{align*}
\|m_k\|_{L^1}\le|A_k|^{1/2}\|m_k\|_{L^2}\le(2^{(k+1)n}|B|)^{1/2}\cdot C(A,n)\theta_k^{1/2}2^{-kn/2}|B|^{-1/2} = 2^{n/2}C(A,n)\,\theta_k^{1/2}.
\end{align*}
By the cancellation expansion (C) and Tonelli,
\begin{align*}
\|m_k\|_{L^1}\le\int_B|a(y)|\int_{A_k}|K(x,y) - K(x,x_B)|\,d\mathcal{L}^n(x)\,d\mathcal{L}^n(y)\le\theta_k\,\|a\|_{L^1},
\end{align*}
and $\|a\|_{L^1}\le|B|^{1/2}\|a\|_{L^2}\le 1$, hence $\|m_k\|_{L^1}\le\theta_k$, and $|c_k|\le\theta_k$. Therefore $\|\tilde m_k\|_{L^1}\le\|m_k\|_{L^1} + |c_k|\le 2\theta_k$.
**Atomic-norm bound on $\tilde m_k$.** The function $\tilde m_k$ is mean-zero, supported in $\tilde B_k$, with $\|\tilde m_k\|_{L^1}\le 2\theta_k$ and $L^2$-bound
\begin{align*}
\|\tilde m_k\|_{L^2}\le\|m_k\|_{L^2} + |c_k|\,|\tilde B_k|^{-1/2}\le C(A,n)\,\theta_k^{1/2}\,2^{-kn/2}|B|^{-1/2} + \theta_k\,2^{-(k+1)n/2}|B|^{-1/2}.
\end{align*}
Combining and using $\theta_k\le C_H$ to bound $\theta_k\le C_H^{1/2}\theta_k^{1/2}$,
\begin{align*}
\|\tilde m_k\|_{L^2}\le C(A,C_H,n)\,\theta_k^{1/2}\,2^{-kn/2}|B|^{-1/2}.
\end{align*}
The Calderón--Zygmund decomposition of $\tilde m_k$ at level $\alpha_k := 2\theta_k/|\tilde B_k|$ (the value of the $L^1$-mean of $|\tilde m_k|$ on its support) produces a representation of $\tilde m_k$ as an absolutely-summable combination of $(1,\infty)$-atoms with $\ell^1$-coefficient sum bounded by $C_n\,\|\tilde m_k\|_{L^1}\le 2C_n\,\theta_k$. This is the standard $L^1\cap(\text{mean zero, bounded support})\hookrightarrow H^1$ embedding (Stein, op. cit., Ch. III §2.4, Proposition 2; Grafakos, *Modern Fourier Analysis*, 3rd ed., Theorem 2.4.6). Hence
\begin{align*}
\|\tilde m_k\|_{H^1_{\mathrm{at}}}\le C_n\cdot 2\theta_k = C(C_H,n)\,\theta_k.
\end{align*}
Summing using $\sum_{k\ge 1}\theta_k\le C_H$,
\begin{align*}
\biggl\|\sum_{k\ge 1}\tilde m_k\biggr\|_{H^1_{\mathrm{at}}}\le C(C_H,n)\,\sum_{k\ge 1}\theta_k\le C(C_H,n)\,C_H.
\end{align*}
[/step]
[step:Assemble the residual constants into a single mean-zero combination $\Psi$]
Set $\rho(x) := \sum_{k\ge 1}c_k\,\mathbb{1}_{\tilde B_k}(x)/|\tilde B_k|$. Then
\begin{align*}
M_{\mathrm{ex}}(x) = \sum_{k\ge 1}m_k(x) = \sum_{k\ge 1}\tilde m_k(x) + \rho(x),
\end{align*>
and $\sum_{k\ge 1}c_k = \int M_{\mathrm{ex}}\,d\mathcal{L}^n = -c_{\mathrm{in}}$ by the global cancellation (Z). The series for $\rho$ converges absolutely in $L^1$ since $\sum_k|c_k|\le\sum_k\theta_k\le C_H$.
Define
\begin{align*}
\Psi(x) := \rho(x) + c_{\mathrm{in}}\,\frac{\mathbb{1}_{2B}(x)}{|2B|}.
\end{align*}
Then $\int\Psi\,d\mathcal{L}^n = \sum_{k\ge 1}c_k + c_{\mathrm{in}} = 0$. With $b_0 := \mathbb{1}_{2B}/|2B| = \mathbb{1}_{\tilde B_0}/|\tilde B_0|$ and $b_k := \mathbb{1}_{\tilde B_k}/|\tilde B_k|$ for $k\ge 1$,
\begin{align*}
\Psi = c_{\mathrm{in}}\,b_0 + \sum_{k\ge 1}c_k\,b_k.
\end{align*}
Using $c_{\mathrm{in}} = -\sum_{k\ge 1}c_k$, write $\Psi = \sum_{k\ge 1}c_k\,(b_k - b_0)$. Each difference $d_k := b_k - b_0$ satisfies $\int d_k\,d\mathcal{L}^n = 1 - 1 = 0$, $\operatorname{supp} d_k\subseteq\tilde B_k$ (since $\tilde B_0 = 2B\subseteq\tilde B_k$ for $k\ge 0$), and $\|d_k\|_{L^1}\le\|b_k\|_{L^1} + \|b_0\|_{L^1} = 2$. Moreover $\|d_k\|_{L^2}\le\|b_k\|_{L^2} + \|b_0\|_{L^2} = |\tilde B_k|^{-1/2} + |2B|^{-1/2}\le 2|2B|^{-1/2}$ (using $|\tilde B_k|\ge|2B|$). Hence each $d_k/(2\cdot 2^{n/2})$ is a $(1,2)$-atom adapted to $\tilde B_k$ (by the same atomic-norm equivalence used in the previous step), giving $\|d_k\|_{H^1_{\mathrm{at}}}\le C_n^{(1)}\cdot 2^{1+n/2}$. Summing,
\begin{align*}
\|\Psi\|_{H^1_{\mathrm{at}}}\le\sum_{k\ge 1}|c_k|\,\|d_k\|_{H^1_{\mathrm{at}}}\le C_n^{(1)}\,2^{1+n/2}\sum_{k\ge 1}|c_k|\le C_n^{(1)}\,2^{1+n/2}\,C_H.
\end{align*}
[/step]
[step:Combine the bounds and conclude]
From the three pieces $\tilde M_{\mathrm{in}}$, $\sum_{k\ge 1}\tilde m_k$, and $\Psi$, with
\begin{align*}
Ta = \tilde M_{\mathrm{in}} + \sum_{k\ge 1}\tilde m_k + \Psi,
\end{align*}
the triangle inequality in the atomic norm gives
\begin{align*}
\|Ta\|_{H^1_{\mathrm{at}}}\le C_{\mathrm{in}} + C(C_H,n)\,C_H + C_n^{(1)}\,2^{1+n/2}\,C_H =: C_0,
\end{align*}
with $C_0 = C_0(A,\|T\|_{\mathcal{L}(L^2)},n)$. By the [Coifman--Latter equivalence](/theorems/3177), $\|Ta\|_{H^1(\mathbb{R}^n)}\asymp\|Ta\|_{H^1_{\mathrm{at}}}$ with constants depending only on $n$, hence $\|Ta\|_{H^1(\mathbb{R}^n)}\le C_0$, the bound $(\star\star)$ required in Step 1.
By Step 1,
\begin{align*}
\|Tf\|_{H^1(\mathbb{R}^n)}\le C_0\,C_n^{\mathrm{at}}\,\|f\|_{H^1(\mathbb{R}^n)} =: C_{A,n}\,\|f\|_{H^1(\mathbb{R}^n)},
\end{align*}
with $C_{A,n}$ depending only on $A$, $\|T\|_{\mathcal{L}(L^2)}$, and $n$. The series $\sum_j\lambda_j Ta_j$ converges in $H^1$ by completeness, with sum equal to $Tf$ by uniqueness of limits in $L^1$ (since $H^1\hookrightarrow L^1$ and the series also converges in $L^1$ to $Tf$ via the [$H^1\to L^1$ theorem](/theorems/3178)). This completes the proof.
[/step]