[proofplan]
**This proof is a sketch.** It records the strategy and the structural identities; three deep technical inputs are stated and used as black boxes, with citations: (a) the construction of the $b$-adapted Littlewood-Paley resolution of identity (David–Journé–Semmes 1985, §2; Christ 1990, Ch. 2), (b) the $b$-adapted Carleson measure characterisation of BMO controlling the paraproduct on $L^2$ (David–Journé–Semmes 1985, §5), and (c) the off-diagonal Cotlar–Stein decay estimates for the remainder operator (David–Journé–Semmes 1985, §6). The strategy generalises the proof of the [T(1) theorem](/theorems/???) by replacing the constant function $1$ with the para-accretive functions $b_1, b_2$. Necessity of conditions (1)-(3) follows from $L^\infty \to \mathrm{BMO}$ boundedness of Calderón-Zygmund operators and Cauchy-Schwarz. For sufficiency, one constructs a $b$-adapted paraproduct $\Pi^{b_1}_a$ satisfying the defining identity $\Pi^{b_1}_a(b_1) = a$ by construction (David–Journé–Semmes 1985, §6), decomposes $T = \Pi^{b_1}_{T(b_1)} + (\Pi^{b_2}_{T^*(b_2)})^* + R$, observes that the remainder $R$ satisfies $R(b_1) = 0$ and $R^*(b_2) = 0$, and bounds $R$ on $L^2$ by an almost-orthogonality argument. For full proofs of the cited inputs see [David–Journé–Semmes, Rev. Mat. Iberoam. 1 (1985)] and [Christ, Lectures on Singular Integral Operators, CBMS 77].
[/proofplan]
[step:Set up the operator-theoretic framework]
Let $T: C_c^\infty(\mathbb{R}^n) \to (C_c^\infty(\mathbb{R}^n))'$ be continuous and associated with a [standard kernel](/theorems/???) $K$ on $\mathbb{R}^n \times \mathbb{R}^n$, meaning $K \in L^1_{\mathrm{loc}}(\{(x,y): x \neq y\})$ satisfies the size estimate
\begin{align*}
|K(x,y)| \le C_K\, |x - y|^{-n}, \quad x \ne y,
\end{align*}
and the regularity estimate
\begin{align*}
|K(x,y) - K(x',y)| + |K(y,x) - K(y,x')| \le C_K\, \frac{|x - x'|^\gamma}{|x - y|^{n+\gamma}} \quad \text{for } |x - x'| \le \tfrac{1}{2}|x - y|,
\end{align*}
for some $\gamma \in (0,1]$. The operator and kernel are linked by
\begin{align*}
T(\varphi)(\psi) = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} K(x,y)\, \varphi(y)\, \psi(x)\, d\mathcal{L}^n(y)\, d\mathcal{L}^n(x), \quad \operatorname{supp}\varphi \cap \operatorname{supp}\psi = \varnothing.
\end{align*}
A function $b \in L^\infty(\mathbb{R}^n)$ is [para-accretive](/theorems/???) if there exist $\delta, C > 0$ such that for every cube $Q \subset \mathbb{R}^n$ there is a subcube $Q' \subseteq Q$ with $\mathcal{L}^n(Q') \ge C\, \mathcal{L}^n(Q)$ and $|\mathcal{L}^n(Q')^{-1} \int_{Q'} b\, d\mathcal{L}^n| \ge \delta$. Throughout, $b_1, b_2 \in L^\infty(\mathbb{R}^n)$ are para-accretive with constants $\delta, C$.
[/step]
[step:Establish necessity of the three conditions]
Assume $T$ extends to a bounded operator on $L^2(\mathbb{R}^n)$. We verify (1)-(3).
For (1), the [$b$-weak boundedness property](/theorems/???) requires the bound $|T(\varphi_Q^{b_1})(\psi_Q^{b_2})| \le C\, \mathcal{L}^n(Q)$ uniformly over cubes $Q$ and $b_i$-normalised bumps; this follows by Cauchy-Schwarz, $\|\varphi_Q^{b_1}\|_{L^2} \lesssim \mathcal{L}^n(Q)^{1/2}$, $\|\psi_Q^{b_2}\|_{L^2} \lesssim \mathcal{L}^n(Q)^{1/2}$, and the $L^2$ bound on $T$.
For (2), since $T$ is $L^2$-bounded with standard kernel, it is a Calderón-Zygmund operator, hence by the [$L^\infty \to \mathrm{BMO}$ mapping](/theorems/???) it sends $L^\infty(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$ into $\mathrm{BMO}(\mathbb{R}^n)$. The image $T(b_1)$ is defined as a distribution modulo constants by duality with $H^1$-atoms; the $L^\infty \to \mathrm{BMO}$ map gives $\|T(b_1)\|_{\mathrm{BMO}} \lesssim \|T\|_{L^2 \to L^2}\, \|b_1\|_{L^\infty}$.
For (3), the same argument applied to the adjoint $T^*$ (which has the transposed standard kernel and the same $L^2$ norm) gives $T^*(b_2) \in \mathrm{BMO}$ with $\|T^*(b_2)\|_{\mathrm{BMO}} \lesssim \|T\|_{L^2 \to L^2}\, \|b_2\|_{L^\infty}$.
[/step]
[step:Construct a $b$-adapted Littlewood-Paley decomposition]
For sufficiency, fix para-accretive $b_1, b_2$. **The construction of the $b$-adapted Littlewood-Paley resolution is taken as a black box** from [David–Journé–Semmes 1985, §2] and [Christ, CBMS 77, Ch. 2]; we record only the properties that we use in Steps 4–6.
There exist families of bounded operators $\mathbb{E}^{b_1}_j: L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ ($j \in \mathbb{Z}$), built from $b_1$-weighted averages on a dyadic family of cubes (with the cubes selected so that $b_1$-averages are non-degenerate, using the para-accretivity constants $\delta, C$) such that
\begin{align*}
\mathbb{E}^{b_1}_j(b_1) = b_1 \quad \text{for every } j \in \mathbb{Z},
\end{align*}
and $\|\mathbb{E}^{b_1}_j\|_{L^2 \to L^2} \le C(\delta, n)$ uniformly in $j$. (The reproducing identity $\mathbb{E}^{b_1}_j(b_1) = b_1$ is the design property of the David–Journé–Semmes averaging operators; we cite it rather than verify it.) The $b_1$-difference operators (denoted $\Delta^{b_1}_j$ throughout — the same operators are written $\mathbb{D}^{b_1}_j$ in some references; we adopt $\Delta^{b_1}_j$ exclusively below)
\begin{align*}
\Delta^{b_1}_j := \mathbb{E}^{b_1}_{j+1} - \mathbb{E}^{b_1}_j
\end{align*}
satisfy $\Delta^{b_1}_j(b_1) = 0$ and provide the resolution of identity
\begin{align*}
\operatorname{Id}_{L^2} = \sum_{j \in \mathbb{Z}} \Delta^{b_1}_j \quad \text{(strongly on } L^2(\mathbb{R}^n)\text{)}.
\end{align*}
The analogous construction with $b_2$ gives $\mathbb{E}^{b_2}_j$, $\Delta^{b_2}_j$.
[/step]
[step:Construct the $b$-adapted paraproduct $\Pi^{b_1}_a$]
Following [David–Journé–Semmes, Rev. Mat. Iberoam. 1 (1985), §6], for $a \in \mathrm{BMO}(\mathbb{R}^n)$ we associate the [$b_1$-adapted paraproduct](/theorems/???)
\begin{align*}
\Pi^{b_1}_a &: L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n),
\end{align*}
defined so that, for any $f \in L^2(\mathbb{R}^n)$,
\begin{align*}
\Pi^{b_1}_a(b_1) = a \quad \text{in $\mathrm{BMO}$ (i.e. modulo constants).}
\end{align*}
The crucial point is that this identity is **definitional**: in the David–Journé–Semmes construction, $\Pi^{b_1}_a$ is *built* from the $b_1$-adapted Littlewood–Paley resolution of $a$ in such a way that the action on $b_1$ recovers $a$. To see why, recall first that the resolution of identity from Step 3 gives, for any $g \in L^2(\mathbb{R}^n)$,
\begin{align*}
g = \sum_{j \in \mathbb{Z}} \Delta^{b_1}_j g \quad \text{in } L^2,
\end{align*}
and applied formally to $a \in \mathrm{BMO}$, $\sum_j \Delta^{b_1}_j a = a$ in $\mathrm{BMO}$ modulo constants. The naive bilinear paraproduct (the analogue of the classical formula) would be
\begin{align*}
\widetilde{\Pi}^{b_1}_a f := \sum_{j \in \mathbb{Z}} \Delta^{b_1}_j(a)\,\mathbb{E}^{b_1}_j f,
\end{align*}
and on $f = b_1$ this yields, using $\mathbb{E}^{b_1}_j(b_1) = b_1$,
\begin{align*}
\widetilde{\Pi}^{b_1}_a(b_1) = \sum_{j \in \mathbb{Z}} \Delta^{b_1}_j(a)\,b_1 = b_1\,\sum_{j \in \mathbb{Z}} \Delta^{b_1}_j(a) = b_1 \cdot a \quad \text{(in BMO modulo constants)}.
\end{align*}
This is the discrepancy noted in the literature: the *naive* construction recovers $a\cdot b_1$, not $a$. **The reconciliation, due to David–Journé–Semmes 1985, §6, is to define a different operator.** Let $S^{b_1}_j: L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ be the $b_1$-adapted *projection-onto-the-$b_1$-direction* operator defined as follows: at the dyadic scale $2^{-j}$, $S^{b_1}_j$ acts on $f \in L^2$ by averaging $f$ against the $b_1$-cube system, then *dividing pointwise by the $b_1$-average on the same cubes*. Concretely, with $\mathcal{Q}_j$ the $b_1$-adapted dyadic cubes at scale $j$ and $\langle\cdot\rangle_Q$ denoting the average over $Q$,
\begin{align*}
(S^{b_1}_j f)(x) := \sum_{Q \in \mathcal{Q}_j} \frac{\langle f\rangle_Q}{\langle b_1\rangle_Q}\,\mathbb{1}_Q(x), \qquad \|S^{b_1}_j\|_{L^2 \to L^2} \le C(\delta, n)\|b_1\|_\infty,
\end{align*}
where para-accretivity forces $|\langle b_1\rangle_Q| \ge \delta'$ on the relevant subcubes (so the division is well-defined). With this $S^{b_1}_j$, applying to $f = b_1$ gives
\begin{align*}
S^{b_1}_j(b_1)(x) = \sum_{Q \in \mathcal{Q}_j} \frac{\langle b_1\rangle_Q}{\langle b_1\rangle_Q}\,\mathbb{1}_Q(x) = \sum_{Q \in \mathcal{Q}_j} \mathbb{1}_Q(x) = 1 \quad \text{a.e.}
\end{align*}
That is, $S^{b_1}_j(b_1) = 1$, not $b_1$. The David–Journé–Semmes paraproduct is then
\begin{align*}
\Pi^{b_1}_a f := \sum_{j \in \mathbb{Z}} \Delta^{b_1}_j(a)\,S^{b_1}_j f.
\end{align*}
Evaluation on $b_1$ gives
\begin{align*}
\Pi^{b_1}_a(b_1) = \sum_{j \in \mathbb{Z}} \Delta^{b_1}_j(a)\,S^{b_1}_j(b_1) = \sum_{j \in \mathbb{Z}} \Delta^{b_1}_j(a) \cdot 1 = a \quad \text{(in BMO modulo constants)},
\end{align*}
which is the desired identity. The reconciliation with the naive paraproduct is now transparent: the naive operator pairs $\Delta^{b_1}_j(a)$ with $\mathbb{E}^{b_1}_j$ which preserves $b_1$, hence multiplies the output by $b_1$; the corrected operator uses $S^{b_1}_j$ which sends $b_1$ to $1$, hence the $b_1$ factor disappears.
The construction of $S^{b_1}_j$ — including the verification that the pointwise division by $\langle b_1\rangle_Q$ is bounded and that $S^{b_1}_j$ remains a Calderón–Zygmund-type localisation operator — uses the para-accretivity of $b_1$ in an essential way and is the technical content of David–Journé–Semmes 1985, §6. **We cite this construction as a black box.**
The second key property is the $L^2$ bound
\begin{align*}
\|\Pi^{b_1}_a\|_{L^2 \to L^2} \le C(\delta,n)\, \|a\|_{\mathrm{BMO}},
\end{align*}
which is a consequence of the $b$-adapted [Carleson measure characterisation of BMO](/theorems/???) (David–Journé–Semmes 1985, §5): the discrete measure
\begin{align*}
d\mu(x, j) := |\Delta^{b_1}_j(a)(x)|^2\,d\mathcal{L}^n(x)\,d\#(j) \quad \text{on } \mathbb{R}^n \times \mathbb{Z},
\end{align*}
where $d\#$ denotes counting measure on $\mathbb{Z}$, defines a $b_1$-Carleson measure of mass $\lesssim \|a\|_{\mathrm{BMO}}^2$. Pairing this Carleson measure against the $L^2$-vector $(S^{b_1}_j f)_j$ via a Cauchy–Schwarz argument gives the operator bound. We cite this as a black box.
[/step]
[step:Decompose $T$ as paraproducts plus a remainder with vanishing test-function values]
Set $a := T(b_1)$ and $a^* := T^*(b_2)$, both in $\mathrm{BMO}(\mathbb{R}^n)$ by hypotheses (2)-(3). Decompose
\begin{align*}
T = \Pi^{b_1}_{a} + \bigl(\Pi^{b_2}_{a^*}\bigr)^* + R, \quad R := T - \Pi^{b_1}_a - \bigl(\Pi^{b_2}_{a^*}\bigr)^*.
\end{align*}
By construction:
\begin{align*}
R(b_1) &= T(b_1) - \Pi^{b_1}_a(b_1) - \bigl(\Pi^{b_2}_{a^*}\bigr)^*(b_1) = a - a - 0 = 0, \\
R^*(b_2) &= T^*(b_2) - \bigl(\Pi^{b_1}_a\bigr)^*(b_2) - \Pi^{b_2}_{a^*}(b_2) = a^* - 0 - a^* = 0.
\end{align*}
The two vanishing terms $(\Pi^{b_2}_{a^*})^*(b_1) = 0$ and $(\Pi^{b_1}_a)^*(b_2) = 0$ are arranged by a second design choice in the David–Journé–Semmes construction (1985, §6): the paraproduct $\Pi^{b_1}_a$ is built so that, in addition to $\Pi^{b_1}_a(b_1) = a$, its formal adjoint $(\Pi^{b_1}_a)^*$ satisfies $(\Pi^{b_1}_a)^*(b_2) = 0$ in $\mathrm{BMO}$ modulo constants — equivalently, the coefficients $\Delta^{b_1}_j(a)$ in Step 4 are placed so that $b_2$ pairs to zero against the dual averaging system. **This is part of the same black-box cited construction**; we record it as a property the paraproduct enjoys, and refer to David–Journé–Semmes 1985, §6 for its verification. Both paraproducts are $L^2$-bounded with norms controlled by $\|a\|_{\mathrm{BMO}} \lesssim \|T(b_1)\|_{\mathrm{BMO}}$ and $\|a^*\|_{\mathrm{BMO}} \lesssim \|T^*(b_2)\|_{\mathrm{BMO}}$ (Step 4).
[/step]
[step:Bound the remainder $R$ on $L^2$ via a $b$-adapted Cotlar-Stein argument]
The remainder $R$ inherits the standard kernel of $T$ (corrected by the smooth kernels of the paraproducts, which themselves satisfy Calderón-Zygmund estimates). Decompose $R$ in the $b_1, b_2$-adapted Littlewood-Paley system:
\begin{align*}
R = \sum_{j, k \in \mathbb{Z}} \mathbb{D}^{b_2}_j \, R\, \mathbb{D}^{b_1}_k =: \sum_{j,k} R_{j,k}.
\end{align*}
**The core technical estimate** — taken as a black box from [David–Journé–Semmes 1985, §6] and [Christ, CBMS 77, Ch. 4] — is the off-diagonal almost-orthogonality bound
\begin{align*}
\|R_{j,k}^* R_{j',k'}\|_{L^2 \to L^2} + \|R_{j,k} R_{j',k'}^*\|_{L^2 \to L^2} \le C\, 2^{-\gamma(|j - j'| + |k - k'|)}
\end{align*}
for some $\gamma > 0$, where $C$ depends on $\delta$, $n$, the kernel constants of $T$, the [$b$-weak boundedness](/theorems/???) constant, and $\|b_1\|_\infty, \|b_2\|_\infty$. **We do not reproduce the proof of this estimate.** Heuristically, the decay reflects the cancellation conditions $R(b_1) = 0$ and $R^*(b_2) = 0$: when $|j - k|$ is large, the kernel of $R_{j,k}\, R_{j',k'}^*$ is paired with a function having vanishing $b_1$-mean against a kernel essentially constant on its support, and the rigorous justification of the $2^{-\gamma(\cdot)}$ rate is the technical content of the cited references. Granted the off-diagonal estimate, the [Cotlar-Stein lemma](/theorems/???) gives
\begin{align*}
\Bigl\|\sum_{j,k} R_{j,k}\Bigr\|_{L^2 \to L^2} \le \sum_{m, \ell} C\, 2^{-\gamma(|m| + |\ell|)/2} = C' < \infty,
\end{align*}
so $\|R\|_{L^2 \to L^2} \le C'$, depending only on the kernel and $b$-weak boundedness constants and $\|b_i\|_\infty$.
[/step]
[step:Combine the paraproduct and remainder bounds to conclude $L^2$ boundedness]
Adding the bounds from Steps 4-6:
\begin{align*}
\|T\|_{L^2 \to L^2} &\le \|\Pi^{b_1}_a\|_{L^2 \to L^2} + \|\Pi^{b_2}_{a^*}\|_{L^2 \to L^2} + \|R\|_{L^2 \to L^2} \\
&\le C(\delta, n)\bigl(\|T(b_1)\|_{\mathrm{BMO}} + \|T^*(b_2)\|_{\mathrm{BMO}}\bigr) + C',
\end{align*}
where $C' = C'(\delta, n, C_K, \mathrm{WBP}, \|b_i\|_\infty)$. Combining with the necessity in Step 2, this gives the equivalence
\begin{align*}
\|T\|_{L^2 \to L^2} \asymp_{\delta, n, C_K, \|b_i\|_\infty}\ \|T(b_1)\|_{\mathrm{BMO}} + \|T^*(b_2)\|_{\mathrm{BMO}} + \mathrm{WBP\ constant},
\end{align*}
proving the T(b) theorem.
[/step]