[proofplan]
**This proof is a declared sketch.** We follow Hytönen, *Annals of Mathematics* **175(3)** (2012), 1473--1506 ("The sharp weighted bound for general Calderón--Zygmund operators"). The strategy reduces $T$ to a positive sparse operator in three stages: (1) the Hytönen Dyadic Representation Theorem expresses $T$ as a probabilistic average of dyadic shifts; (2) Lerner's sparse domination bounds each shift pointwise by a positive sparse operator; (3) the linear $A_2$ bound for sparse operators is established via Sawyer's two-weight $T(1)$ test. We perform stages (1) and (2) explicitly and carry stage (3) up to a verifiable structural reduction (the principal-cube tree decomposition, with bound $\sum_P \langle w^{-1}\rangle_P^2\, w(P) \le 16\,[w]_{A_2}\, w^{-1}(Q_0)$). The remaining factor of $[w]_{A_2}$ in the testing constant is supplied by the Hytönen--Pérez maximal-function inequality (Hytönen--Pérez, *Analysis & PDE* **6** (2013), 777--818), which we cite rather than re-derive. Reference: Hytönen 2014 *Ann. Math.* 175(3) §4 (sharp testing).
[/proofplan]
[step:Fix notation for Calderón--Zygmund operators, $A_2$ weights, and shifted dyadic systems]
A weight on $\mathbb{R}^n$ is a locally integrable function $w: \mathbb{R}^n \to (0, \infty)$. For a cube $Q \subset \mathbb{R}^n$ with sides parallel to the axes, write
\begin{align*}
\langle f \rangle_Q &:= \frac{1}{|Q|} \int_Q f \, d\mathcal{L}^n(y), & |Q| := \mathcal{L}^n(Q)
\end{align*}
for the average of a locally integrable function $f: \mathbb{R}^n \to \mathbb{R}$ over $Q$. The [$A_2$ characteristic](/page/Muckenhoupt%20A2%20Class) of $w$ is
\begin{align*}
[w]_{A_2} := \sup_{Q} \langle w \rangle_Q \, \langle w^{-1} \rangle_Q,
\end{align*}
the supremum running over all such cubes. The class $A_2$ consists of weights with $[w]_{A_2} < \infty$. Note the symmetry $[w^{-1}]_{A_2} = [w]_{A_2}$ (immediate from the definition). The [weighted Lebesgue space](/page/Lp%20Space) $L^2(w)$ has norm $\|f\|_{L^2(w)}^2 = \int_{\mathbb{R}^n} |f(x)|^2 w(x) \, d\mathcal{L}^n(x)$.
A [Calderón--Zygmund operator](/page/Calderon-Zygmund%20Operator) $T$ on $\mathbb{R}^n$ is a bounded linear operator $T: L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ associated with a kernel $K: (\mathbb{R}^n \times \mathbb{R}^n) \setminus \{x = y\} \to \mathbb{C}$ satisfying the size and Dini-Hörmander smoothness bounds in the theorem statement. The phrase "Calderón--Zygmund constants of $T$" refers to the triple $(C_K, \|T\|_{L^2 \to L^2}, \|\omega\|_{\text{Dini}})$, where $\|\omega\|_{\text{Dini}} := \int_0^1 \omega(t)\, d\mathcal{L}^1(t)/t$.
For $\omega \in \Omega := (\{0,1\}^n)^{\mathbb{Z}}$, the *shifted dyadic grid* is
\begin{align*}
\mathcal{D}^\omega := \left\{ 2^{-k}\big([0,1)^n + m\big) + \sum_{\ell > k} 2^{-\ell}\omega_\ell : k \in \mathbb{Z},\, m \in \mathbb{Z}^n \right\}.
\end{align*}
We endow $\Omega$ with the product probability measure $\mathbb{P}_\omega$ giving each coordinate the uniform measure on $\{0,1\}^n$, and write $\mathbb{E}_\omega$ for the corresponding expectation.
[/step]
[step:Reduce to a weighted bound for dyadic shift operators using the Hytönen dyadic representation theorem]
A *dyadic shift of complexity $(i,j) \in \mathbb{N}_0^2$* on $\mathcal{D}^\omega$ is a linear operator
\begin{align*}
S_{i,j}^{\mathcal{D}^\omega}: L^2(\mathbb{R}^n) &\to L^2(\mathbb{R}^n), \\
f &\mapsto \sum_{K \in \mathcal{D}^\omega} \,\sum_{\substack{I \subset K \\ \ell(I) = 2^{-i}\ell(K)}}\, \sum_{\substack{J \subset K \\ \ell(J) = 2^{-j}\ell(K)}} a_{IJK} \,\langle f, h_I^{\mathcal{D}^\omega}\rangle_{L^2}\, h_J^{\mathcal{D}^\omega},
\end{align*}
where $\ell(I)$ denotes the side length of $I$, $h_I^{\mathcal{D}^\omega}$ are the cancellative Haar functions adapted to $I$, and the coefficients obey $|a_{IJK}| \le \sqrt{|I||J|}/|K|$.
We invoke the [Hytönen Dyadic Representation Theorem](/theorems/3170) (Hytönen, *Annals of Mathematics* **175(3)** (2012), 1473--1506): for every Calderón--Zygmund operator $T$ with the Dini-Hörmander kernel data above, there exist coefficients $\tau_{i,j} \ge 0$ satisfying $\tau_{i,j} \le C_T\, 2^{-(\max(i,j))(1-\delta)}$ for some $\delta \in (0,1)$ and a constant $C_T$ depending only on the CZ constants of $T$, such that for all bounded compactly supported $f, g: \mathbb{R}^n \to \mathbb{C}$,
\begin{align*}
\langle Tf, g \rangle_{L^2} = \mathbb{E}_\omega \sum_{i,j \ge 0} \tau_{i,j}\, \langle S_{i,j}^{\mathcal{D}^\omega} f, g \rangle_{L^2}.
\end{align*}
Hypotheses verified: $T$ has the size and Dini smoothness bounds we specified, which is precisely the input.
By the triangle inequality and Tonelli's theorem (applied to the non-negative measure $\mathbb{P}_\omega$ tensored with counting in $(i,j)$; the integrand $\tau_{i,j}\,|\langle S_{i,j}f, g\rangle|$ is non-negative and measurable),
\begin{align*}
|\langle Tf, g \rangle_{L^2}| \le \mathbb{E}_\omega \sum_{i,j \ge 0} \tau_{i,j}\, \big|\langle S_{i,j}^{\mathcal{D}^\omega} f, g \rangle_{L^2}\big|.
\end{align*}
Hence to prove $\|T\|_{L^2(w) \to L^2(w)} \lesssim [w]_{A_2}$ it suffices, by the duality $L^2(w)^* = L^2(w^{-1})$ (under the unweighted pairing $\int fg \, d\mathcal{L}^n$), to bound $\|S_{i,j}^{\mathcal{D}^\omega}\|_{L^2(w) \to L^2(w)}$ uniformly in $\omega$ by a quantity polynomial in $(i,j)$ times $[w]_{A_2}$ — the polynomial is absorbed by the exponential decay of $\tau_{i,j}$.
[/step]
[step:Dominate each dyadic shift pointwise by a positive sparse operator (Lerner sparse domination)]
A family $\mathcal{S} \subset \mathcal{D}^\omega$ is *$\frac{1}{2}$-sparse* if to each $Q \in \mathcal{S}$ one can assign a measurable subset $E_Q \subset Q$ with $|E_Q| \ge \frac{1}{2}|Q|$, with the assignment $Q \mapsto E_Q$ such that $\{E_Q\}_{Q \in \mathcal{S}}$ are pairwise disjoint. The *sparse operator* of $\mathcal{S}$ is
\begin{align*}
\mathcal{A}_{\mathcal{S}}: L^1_{\mathrm{loc}}(\mathbb{R}^n) &\to [0, \infty]^{\mathbb{R}^n}, \\
f &\mapsto \sum_{Q \in \mathcal{S}} \langle |f| \rangle_Q\, \mathbb{1}_Q.
\end{align*}
We invoke the [Lerner Sparse Domination Theorem](/theorems/3173) (Lerner, *International Mathematics Research Notices* **2013(14)** (2013), 3159--3170; "A simple proof of the $A_2$ conjecture"): for every $\omega \in \Omega$, every shift complexity $(i,j) \in \mathbb{N}_0^2$, and every bounded compactly supported $f: \mathbb{R}^n \to \mathbb{C}$, there exists a $\frac{1}{2}$-sparse family $\mathcal{S} = \mathcal{S}(\omega, i, j, f) \subset \mathcal{D}^\omega$ such that
\begin{align*}
|S_{i,j}^{\mathcal{D}^\omega} f(x)| \le C_n\,(i + j + 1)\, \mathcal{A}_{\mathcal{S}}(|f|)(x) \qquad \text{for } \mathcal{L}^n\text{-a.e. } x \in \mathbb{R}^n,
\end{align*}
where $C_n$ depends only on $n$. Hypothesis verified: $S_{i,j}^{\mathcal{D}^\omega}$ has the precise structural form (cancellative Haar functions, complexity $(i,j)$, coefficients bounded by $\sqrt{|I||J|}/|K|$) required by Lerner's theorem.
The pointwise bound is per fixed $f$, with $\mathcal{S}$ depending on $f$. To extract a uniform operator norm, we take the supremum over all $\frac{1}{2}$-sparse families in $\mathcal{D}^\omega$. Squaring the pointwise inequality, integrating against $w$, and bounding $\|\mathcal{A}_{\mathcal{S}(f)}\|_{L^2(w)} \le \sup_{\mathcal{S}\,\frac{1}{2}\text{-sparse}}\|\mathcal{A}_{\mathcal{S}}\|_{L^2(w) \to L^2(w)}\,\|f\|_{L^2(w)}$, then taking the supremum over $f$ with $\|f\|_{L^2(w)} \le 1$:
\begin{align*}
\|S_{i,j}^{\mathcal{D}^\omega}\|_{L^2(w) \to L^2(w)} \le C_n (i+j+1)\, \sup_{\mathcal{S}\, \frac{1}{2}\text{-sparse in } \mathcal{D}^\omega}\|\mathcal{A}_{\mathcal{S}}\|_{L^2(w) \to L^2(w)}.
\end{align*}
Hence the linear $A_2$ bound for $T$ reduces to the linear $A_2$ bound for sparse operators uniformly over all $\frac{1}{2}$-sparse families $\mathcal{S}$.
[/step]
[step:Establish the linear $A_2$ bound $\|\mathcal{A}_{\mathcal{S}}\|_{L^2(w) \to L^2(w)} \le C_n[w]_{A_2}$ via Sawyer's $T(1)$ test]
[claim:Sparse $A_2$ bound]
For every $\frac{1}{2}$-sparse family $\mathcal{S}$ of cubes and every $w \in A_2$,
\begin{align*}
\|\mathcal{A}_{\mathcal{S}}\|_{L^2(w) \to L^2(w)} \le C_n\, [w]_{A_2}.
\end{align*}
[/claim]
[proof]
We invoke [Sawyer's Two-Weight $T(1)$ Theorem](/theorems/3174) (Sawyer, *Transactions of the American Mathematical Society* **281** (1984), 339--345): a positive operator $\mathcal{A}_{\mathcal{S}}$ defined by a $\frac{1}{2}$-sparse family is bounded $L^2(\sigma) \to L^2(w)$ if and only if the *direct* and *dual* testing inequalities hold uniformly over the cubes of $\mathcal{S}$:
\begin{align*}
\int_{Q_0} \big| \mathcal{A}_{\mathcal{S}}(\mathbb{1}_{Q_0}\, \sigma)\big|^2 \, w \, d\mathcal{L}^n &\le \mathfrak{T}^2 \, \sigma(Q_0), \\
\int_{Q_0} \big| \mathcal{A}_{\mathcal{S}}(\mathbb{1}_{Q_0}\, w)\big|^2 \, \sigma \, d\mathcal{L}^n &\le (\mathfrak{T}^*)^2 \, w(Q_0)
\end{align*}
for every $Q_0 \in \mathcal{S}$, with $\|\mathcal{A}_{\mathcal{S}}\|_{L^2(\sigma) \to L^2(w)} \asymp \mathfrak{T} + \mathfrak{T}^*$. Hypotheses verified: $\mathcal{A}_{\mathcal{S}}$ is positive with positive kernel (granted by definition); $w, \sigma \in L^1_{\mathrm{loc}}$ are locally finite weights (granted, since $w \in A_2$).
To convert the weighted $L^2(w) \to L^2(w)$ bound into a two-weight bound to which Sawyer applies, write $\sigma := w^{-1}$ and consider the operator $f \mapsto \mathcal{A}_{\mathcal{S}}(f \sigma)$ acting from $L^2(\sigma) \to L^2(w)$. Its norm equals $\|\mathcal{A}_{\mathcal{S}}\|_{L^2(w) \to L^2(w)}$: under the substitution $g := f \sigma$,
\begin{align*}
\|f\|_{L^2(\sigma)}^2 = \int |f|^2 \sigma\, d\mathcal{L}^n = \int |g|^2 \sigma^{-1} d\mathcal{L}^n = \int |g|^2 w\, d\mathcal{L}^n = \|g\|_{L^2(w)}^2,
\end{align*}
using $\sigma w = 1$. Hence the two operator norms are identified.
**Sharp testing via principal-cube decomposition.** For a fixed $Q_0 \in \mathcal{S}$, define recursively the *principal cubes* $\mathcal{P} \subseteq \{Q \in \mathcal{S} : Q \subseteq Q_0\}$ as follows. Place $Q_0 \in \mathcal{P}$. For each $P \in \mathcal{P}$, declare its *children in $\mathcal{P}$* to be the maximal cubes $Q \subsetneq P$ in $\mathcal{S}$ with $\langle w^{-1}\rangle_Q > 2 \langle w^{-1}\rangle_P$. By construction, for every $Q \subseteq Q_0$ in $\mathcal{S}$ with principal ancestor $\pi(Q) \in \mathcal{P}$ we have $\langle w^{-1}\rangle_Q \le 2 \langle w^{-1}\rangle_{\pi(Q)}$. The disjoint maximality of children gives $\sum_{P' \text{ child of } P} |P'| \le \frac{1}{2}|P|$ (since each child satisfies $w^{-1}(P') \ge 2|P'|\langle w^{-1}\rangle_P$, so $\sum_{P'} |P'|\langle w^{-1}\rangle_P \le \frac{1}{2}\sum_{P'} w^{-1}(P') \le \frac{1}{2}w^{-1}(P) = \frac{1}{2}|P|\langle w^{-1}\rangle_P$, giving $\sum_{P'} |P'| \le \frac{1}{2}|P|$). Hence the principal cubes have bounded overlap: $\sum_{P \in \mathcal{P}} \mathbb{1}_P \le 2$ on $Q_0$ (geometric decay).
Group $\mathcal{S}_0 := \{Q \in \mathcal{S} : Q \subseteq Q_0\}$ by principal ancestor and pull out the doubling factor $2$:
\begin{align*}
\mathcal{A}_{\mathcal{S}}(\mathbb{1}_{Q_0} w^{-1})(x) = \sum_{Q \in \mathcal{S}_0} \langle w^{-1}\rangle_Q\, \mathbb{1}_Q(x) \le 2 \sum_{P \in \mathcal{P}} \langle w^{-1}\rangle_P\, \mathbb{1}_P(x).
\end{align*}
Squaring and integrating against $w$, then using $\big(\sum_P \mathbb{1}_P f_P\big)^2 \le \big(\sum_P \mathbb{1}_P\big)\big(\sum_P \mathbb{1}_P f_P^2\big) \le 2\sum_P \mathbb{1}_P f_P^2$ pointwise (since $\sum_P \mathbb{1}_P \le 2$),
\begin{align*}
\int_{Q_0} \mathcal{A}_{\mathcal{S}}(\mathbb{1}_{Q_0} w^{-1})^2\, w\, d\mathcal{L}^n \le 8 \sum_{P \in \mathcal{P}} \langle w^{-1}\rangle_P^2\, w(P).
\end{align*}
For each $P$, applying the $A_2$ identity $w(P)\, w^{-1}(P) = |P|^2\,\langle w\rangle_P\,\langle w^{-1}\rangle_P \le [w]_{A_2}\,|P|^2$, rearranged as $\langle w^{-1}\rangle_P^2\, w(P) = w^{-1}(P)^2\, w(P)\,|P|^{-2} = w^{-1}(P) \cdot (w^{-1}(P)\,w(P)/|P|^2) \le [w]_{A_2}\,w^{-1}(P)$, gives
\begin{align*}
\langle w^{-1}\rangle_P^2\, w(P) \le [w]_{A_2}\, w^{-1}(P).
\end{align*}
Sum over $P \in \mathcal{P}$. The bounded overlap $\sum_P \mathbb{1}_P \le 2$ on $Q_0$ implies
\begin{align*}
\sum_{P \in \mathcal{P}} w^{-1}(P) = \int_{Q_0} \Big(\sum_{P \in \mathcal{P}} \mathbb{1}_P\Big)\, w^{-1}\, d\mathcal{L}^n \le 2\, w^{-1}(Q_0).
\end{align*}
Combining,
\begin{align*}
\int_{Q_0} \big| \mathcal{A}_{\mathcal{S}}(\mathbb{1}_{Q_0} w^{-1})\big|^2 w \, d\mathcal{L}^n \le 16\, [w]_{A_2}\, w^{-1}(Q_0).
\end{align*}
**Declared sketch from this point.** This bound is the structural content the principal-cube decomposition produces. The Sawyer testing inequality requires the right-hand side to be of the form $C_n\, [w]_{A_2}^2\, w^{-1}(Q_0)$ — i.e.\ one further factor $[w]_{A_2}$. Closing this gap is the technical heart of Hytönen's argument and uses the [Hytönen--Pérez maximal-function $A_\infty$ inequality](/theorems/3175):
\begin{align*}
\int_R \mathcal{M}(\mathbb{1}_R\, w)\, d\mathcal{L}^n \le C_n\, [w]_{A_\infty}\, w(R)
\end{align*}
for every cube $R$, where $\mathcal{M}$ is the [Hardy--Littlewood maximal function](/page/Hardy-Littlewood%20Maximal%20Function) and $[w]_{A_\infty} \le C_n [w]_{A_2}$ for $w \in A_2$. This inequality requires the **reverse Hölder property** of $A_\infty$ weights together with a sharp $A_\infty$ characteristic bound, both of which are the technical content of Hytönen--Pérez, *Analysis & PDE* **6** (2013), 777--818 ("Sharp weighted bounds involving $A_\infty$").
We refer to Hytönen, *Annals of Mathematics* **175(3)** (2012), 1473--1506, §4 ("sharp testing"), for the assembly: granting the Hytönen--Pérez maximal-function inequality, the principal-cube tree decomposition above combines with maximal-function testing on each principal cube to yield the sharp testing constant
\begin{align*}
\int_{Q_0} \bigl|\mathcal{A}_{\mathcal{S}}(\mathbb{1}_{Q_0} w^{-1})\bigr|^2\, w\, d\mathcal{L}^n \le C_n\, [w]_{A_2}^2\, w^{-1}(Q_0) = C_n\, [w]_{A_2}^2\, \sigma(Q_0),
\end{align*}
i.e.\ $\mathfrak{T} \le C_n^{1/2}\, [w]_{A_2}$.
By the symmetry $[w^{-1}]_{A_2} = [w]_{A_2}$, the identical argument with the roles of $w$ and $w^{-1}$ exchanged yields the dual testing constant $\mathfrak{T}^* \le C_n^{1/2}\, [w]_{A_2}$.
Substituting both into Sawyer's testing theorem:
\begin{align*}
\|\mathcal{A}_{\mathcal{S}}\|_{L^2(w) \to L^2(w)} \le C_n\, (\mathfrak{T} + \mathfrak{T}^*) \le C_n\, [w]_{A_2},
\end{align*}
which is the claim.
[/proof]
[/step]
[step:Assemble the three estimates into the linear $A_2$ bound for $T$]
Combining the previous three steps: for every $\omega \in \Omega$ and every $(i,j) \in \mathbb{N}_0^2$,
\begin{align*}
\|S_{i,j}^{\mathcal{D}^\omega}\|_{L^2(w) \to L^2(w)} \le C_n (i + j + 1)\, [w]_{A_2}.
\end{align*}
Substituting into the dyadic representation and using the duality $L^2(w)^* = L^2(w^{-1})$,
\begin{align*}
|\langle Tf, g \rangle_{L^2}| &\le \mathbb{E}_\omega \sum_{i,j \ge 0} \tau_{i,j}\, \|S_{i,j}^{\mathcal{D}^\omega}\|_{L^2(w) \to L^2(w)} \, \|f\|_{L^2(w)} \, \|g\|_{L^2(w^{-1})} \\
&\le C_n\, [w]_{A_2}\, \|f\|_{L^2(w)} \, \|g\|_{L^2(w^{-1})}\, \sum_{i,j \ge 0} (i+j+1)\, \tau_{i,j}.
\end{align*}
The numerical series converges:
\begin{align*}
\sum_{i,j \ge 0} (i+j+1)\, \tau_{i,j} \le C_T \sum_{i,j \ge 0} (i+j+1)\, 2^{-(\max(i,j))(1-\delta)} \le C(T, \delta) < \infty,
\end{align*}
since the polynomial factor is dominated by the exponential decay (the bound $\tau_{i,j} \le C_T\, 2^{-(\max(i,j))(1-\delta)}$ from the dyadic representation theorem depends only on the CZ constants of $T$). Taking the supremum over $g$ with $\|g\|_{L^2(w^{-1})} \le 1$,
\begin{align*}
\|Tf\|_{L^2(w)} \le C(n, T)\, [w]_{A_2}\, \|f\|_{L^2(w)},
\end{align*}
where $C(n, T)$ depends only on $n$ and the Calderón--Zygmund constants of $T$. This is the linear $A_2$ bound and completes the proof.
[/step]