[proofplan]
The strategy is to iterate a [Calderón–Zygmund decomposition](/theorems/???) of $|f - f_Q|$ inside $Q_0 := Q$. Normalising so that $\|f\|_{\mathrm{BMO}} = 1$, we fix a stopping level $\alpha > 2^n$ and select maximal dyadic subcubes of $Q_0$ where the dyadic mean of $|f - f_{Q_0}|$ exceeds $\alpha$. On each selected cube the average of $|f - f_{Q_0}|$ is between $\alpha$ and $2^n\alpha$ by the stopping criterion, and the total measure of the selected cubes is at most $|Q_0|/\alpha$. Outside the selected cubes, $|f - f_{Q_0}| \le \alpha$ a.e. by the Lebesgue differentiation theorem. Iterating the construction inside each selected cube and tracking the geometric decay of the total measure as a function of the iteration depth $m$ produces an exponential bound on $\mathcal{L}^n(\{|f - f_{Q_0}| > m\alpha\})$. Choosing the constants in the exponent then yields the inequality for all $\lambda > 0$.
[/proofplan]
[step:Reduce to the case $\|f\|_{\mathrm{BMO}} = 1$ and a fixed cube $Q_0$]
If $\|f\|_{\mathrm{BMO}} = 0$, then $f$ is a.e. constant, $|f - f_Q| = 0$ a.e., and the left-hand side vanishes for every $\lambda > 0$. So the inequality is automatic and we may assume $\|f\|_{\mathrm{BMO}} > 0$.
Replacing $f$ by $\tilde{f} := f/\|f\|_{\mathrm{BMO}}$, we have $\tilde{f} \in \mathrm{BMO}(\mathbb{R}^n)$ with $\|\tilde{f}\|_{\mathrm{BMO}} = 1$, and the level sets transform by
\begin{align*}
\{x \in Q : |\tilde{f}(x) - \tilde{f}_Q| > \mu\} = \{x \in Q : |f(x) - f_Q| > \mu\,\|f\|_{\mathrm{BMO}}\}.
\end{align*}
Thus the original inequality with parameter $\lambda$ is equivalent to the same inequality applied to $\tilde{f}$ with parameter $\mu := \lambda/\|f\|_{\mathrm{BMO}}$. We henceforth assume $\|f\|_{\mathrm{BMO}} = 1$ and fix an arbitrary cube $Q_0 \subset \mathbb{R}^n$ with sides parallel to the coordinate axes.
[/step]
[step:Apply the Calderón–Zygmund stopping decomposition at level $\alpha$ inside $Q_0$]
Choose a stopping level $\alpha$ to be fixed later, satisfying $\alpha > 2^n$. We perform the dyadic stopping-time decomposition of $|f - f_{Q_0}|$ on $Q_0$ at level $\alpha$.
Bisect $Q_0$ along each coordinate axis to produce $2^n$ congruent dyadic children, and continue bisecting recursively. Call a dyadic subcube $Q' \subseteq Q_0$ **selected at level $1$** if it is maximal among dyadic subcubes with
\begin{align*}
\frac{1}{|Q'|}\int_{Q'} |f(x) - f_{Q_0}|\, d\mathcal{L}^n(x) > \alpha.
\end{align*}
Maximality means $Q'$ has this property but its dyadic parent $\widehat{Q'} \subseteq Q_0$ does not. Let $\{Q_j^1\}_{j}$ be the (countable) collection of all level-$1$ selected cubes; they are pairwise disjoint by maximality.
For the level-$0$ cube $Q_0$ itself, the BMO normalisation gives $\frac{1}{|Q_0|}\int_{Q_0} |f - f_{Q_0}|\, d\mathcal{L}^n \le \|f\|_{\mathrm{BMO}} = 1 < \alpha$, so $Q_0$ is not selected. In particular every selected $Q_j^1$ has a dyadic parent $\widehat{Q_j^1} \subseteq Q_0$ with average at most $\alpha$.
We collect three properties of $\{Q_j^1\}$.
(P1) **Total measure bound.** Since the selected cubes are pairwise disjoint and each satisfies $\frac{1}{|Q_j^1|}\int_{Q_j^1} |f - f_{Q_0}|\, d\mathcal{L}^n > \alpha$,
\begin{align*}
\sum_j |Q_j^1| < \frac{1}{\alpha} \sum_j \int_{Q_j^1} |f - f_{Q_0}|\, d\mathcal{L}^n \le \frac{1}{\alpha}\int_{Q_0} |f - f_{Q_0}|\, d\mathcal{L}^n \le \frac{|Q_0|}{\alpha},
\end{align*}
where the last inequality uses $\|f\|_{\mathrm{BMO}} = 1$ applied to the cube $Q_0$.
(P2) **Mean transition bound.** For each $j$, let $\widehat{Q_j^1}$ denote the dyadic parent of $Q_j^1$. Since $\widehat{Q_j^1}$ is not selected, its mean satisfies $\frac{1}{|\widehat{Q_j^1}|}\int_{\widehat{Q_j^1}} |f - f_{Q_0}|\, d\mathcal{L}^n \le \alpha$. Because dyadic bisection multiplies volume by $2^n$, $|\widehat{Q_j^1}| = 2^n|Q_j^1|$, and integration is monotone in the domain:
\begin{align*}
|f_{Q_j^1} - f_{Q_0}|
&= \left|\frac{1}{|Q_j^1|}\int_{Q_j^1} (f - f_{Q_0})\, d\mathcal{L}^n\right| \le \frac{1}{|Q_j^1|}\int_{Q_j^1} |f - f_{Q_0}|\, d\mathcal{L}^n \\
&\le \frac{2^n}{|\widehat{Q_j^1}|}\int_{\widehat{Q_j^1}} |f - f_{Q_0}|\, d\mathcal{L}^n \le 2^n\alpha.
\end{align*}
(P3) **Pointwise bound off the selected cubes.** Set $E_1 := \bigcup_j Q_j^1 \subseteq Q_0$. For $\mathcal{L}^n$-almost every $x \in Q_0 \setminus E_1$, every dyadic subcube $Q' \subseteq Q_0$ containing $x$ is **not** selected, so
\begin{align*}
\frac{1}{|Q'|}\int_{Q'} |f(y) - f_{Q_0}|\, d\mathcal{L}^n(y) \le \alpha.
\end{align*}
By the [Lebesgue Differentiation Theorem](/theorems/???) applied to $|f - f_{Q_0}| \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ along the dyadic basis (which forms a regular differentiation basis), the dyadic averages converge to $|f(x) - f_{Q_0}|$ for $\mathcal{L}^n$-a.e. $x$. Hence
\begin{align*}
|f(x) - f_{Q_0}| \le \alpha \qquad \text{for } \mathcal{L}^n\text{-a.e. } x \in Q_0 \setminus E_1.
\end{align*}
[/step]
[step:Iterate the decomposition inside each selected cube to produce a tower of generations]
We construct, inductively in $m \ge 0$, a family $\mathcal{F}_m$ of pairwise disjoint dyadic subcubes of $Q_0$ called **generation-$m$ selected cubes**, with the following properties:
(I1) $\mathcal{F}_0 = \{Q_0\}$ and $\mathcal{F}_1 = \{Q_j^1\}$ from the previous step.
(I2) For each $Q \in \mathcal{F}_m$ with $m \ge 1$, the parent generation contains a unique $\widetilde{Q} \in \mathcal{F}_{m-1}$ with $Q \subseteq \widetilde{Q}$.
(I3) $|f_Q - f_{\widetilde{Q}}| \le 2^n \alpha$ for every $Q \in \mathcal{F}_m$ with parent $\widetilde{Q} \in \mathcal{F}_{m-1}$.
(I4) The total measure satisfies $\sum_{Q \in \mathcal{F}_m} |Q| \le \alpha^{-m}|Q_0|$.
(I5) For $\mathcal{L}^n$-a.e. $x \in \widetilde{Q} \setminus \bigcup_{Q \in \mathcal{F}_m,\, Q \subseteq \widetilde{Q}} Q$ with $\widetilde{Q} \in \mathcal{F}_{m-1}$, $|f(x) - f_{\widetilde{Q}}| \le \alpha$.
The base case $m = 0, 1$ is the previous step. For the inductive step, suppose $\mathcal{F}_m$ has been constructed. For each $\widetilde{Q} \in \mathcal{F}_m$, perform the **same** dyadic stopping-time decomposition of $|f - f_{\widetilde{Q}}|$ inside $\widetilde{Q}$ at level $\alpha$. The BMO-normalisation gives $\frac{1}{|\widetilde{Q}|}\int_{\widetilde{Q}} |f - f_{\widetilde{Q}}|\, d\mathcal{L}^n \le \|f\|_{\mathrm{BMO}} = 1 < \alpha$, so $\widetilde{Q}$ is itself not selected at the new stage. Properties (P1)–(P3) of the previous step, applied with $Q_{0}$ replaced by $\widetilde{Q}$, produce a family $\{Q_k^{m+1, \widetilde{Q}}\}_k$ of new selected cubes with
\begin{align*}
\sum_k |Q_k^{m+1, \widetilde{Q}}| &\le \frac{|\widetilde{Q}|}{\alpha}, \\
|f_{Q_k^{m+1,\widetilde{Q}}} - f_{\widetilde{Q}}| &\le 2^n\alpha, \\
|f(x) - f_{\widetilde{Q}}| &\le \alpha \quad \text{for } \mathcal{L}^n\text{-a.e. } x \in \widetilde{Q} \setminus \textstyle\bigcup_k Q_k^{m+1,\widetilde{Q}}.
\end{align*}
Define $\mathcal{F}_{m+1} := \bigcup_{\widetilde{Q} \in \mathcal{F}_m} \{Q_k^{m+1, \widetilde{Q}}\}_k$. The disjointness across distinct $\widetilde{Q} \in \mathcal{F}_m$ follows from the disjointness of the generation-$m$ cubes themselves.
We verify (I4) for generation $m+1$:
\begin{align*}
\sum_{Q \in \mathcal{F}_{m+1}} |Q| = \sum_{\widetilde{Q} \in \mathcal{F}_m} \sum_k |Q_k^{m+1,\widetilde{Q}}| \le \sum_{\widetilde{Q} \in \mathcal{F}_m} \frac{|\widetilde{Q}|}{\alpha} = \frac{1}{\alpha}\sum_{\widetilde{Q} \in \mathcal{F}_m} |\widetilde{Q}| \le \frac{1}{\alpha} \cdot \alpha^{-m}|Q_0| = \alpha^{-(m+1)}|Q_0|.
\end{align*}
Properties (I3) and (I5) for generation $m+1$ are immediate from the corresponding (P2) and (P3) applied to each $\widetilde{Q} \in \mathcal{F}_m$.
[/step]
[step:Bound the level set $\{|f - f_{Q_0}| > m \cdot 2^n \alpha + \alpha\}$ by the generation-$m$ measure]
Set $\beta := 2^n\alpha + \alpha$ and $\gamma := 2^n\alpha$, so that after passing through $m$ generations the cumulative drift in means is at most $m\gamma$ and the residual oscillation is at most $\alpha$. We claim that for every $m \ge 0$,
\begin{align*}
\mathcal{L}^n\!\left(\{x \in Q_0 : |f(x) - f_{Q_0}| > m\gamma + \alpha\}\right) \le \alpha^{-m}|Q_0| \qquad \text{(\#m)}.
\end{align*}
The case $m = 0$ is property (P3) of step 2 applied to $Q_0$ in the role of $\widetilde{Q}$: away from the generation-$1$ selected cubes, $|f - f_{Q_0}| \le \alpha$ a.e., which makes the level set $\{|f - f_{Q_0}| > \alpha\}$ contained (up to a null set) in $\bigcup \mathcal{F}_1$, and $\sum_{Q \in \mathcal{F}_1}|Q| \le \alpha^{-1}|Q_0| \le |Q_0|$ — actually for $m=0$ the inequality becomes $\mathcal{L}^n(\{|f - f_{Q_0}| > \alpha\}) \le |Q_0|$, which is a tautology since the level set is a subset of $Q_0$ up to a null set.
For the inductive step, suppose (\#m) holds. Let $x \in Q_0$ satisfy $|f(x) - f_{Q_0}| > (m+1)\gamma + \alpha$. We show that $x$ lies in some $Q \in \mathcal{F}_{m+1}$ up to a null set.
Indeed, recall property (I5): for a.e. $x \in \widetilde{Q} \setminus \bigcup_{Q \in \mathcal{F}_{m+1}, Q \subseteq \widetilde{Q}} Q$ with $\widetilde{Q} \in \mathcal{F}_m$, we have $|f(x) - f_{\widetilde{Q}}| \le \alpha$. Combined with the cumulative mean-drift bound — which states that for any chain $Q_0 \supset Q^{(1)} \supset \cdots \supset Q^{(m)} = \widetilde{Q}$ with $Q^{(j)} \in \mathcal{F}_j$,
\begin{align*}
|f_{\widetilde{Q}} - f_{Q_0}| \le \sum_{j=1}^m |f_{Q^{(j)}} - f_{Q^{(j-1)}}| \le m\gamma
\end{align*}
by the triangle inequality and (I3) — this gives $|f(x) - f_{Q_0}| \le |f(x) - f_{\widetilde{Q}}| + |f_{\widetilde{Q}} - f_{Q_0}| \le \alpha + m\gamma$ for a.e. $x \in \widetilde{Q} \setminus \bigcup \mathcal{F}_{m+1}$. Equivalently,
\begin{align*}
\{x \in Q_0 : |f(x) - f_{Q_0}| > m\gamma + \alpha\} \subseteq \bigcup_{Q \in \mathcal{F}_{m+1}} Q \quad \text{up to a null set.}
\end{align*}
By (I4) the right-hand side has measure at most $\alpha^{-(m+1)}|Q_0|$, but this is a stronger statement than (\#m) — in fact it gives
\begin{align*}
\mathcal{L}^n\!\left(\{x \in Q_0 : |f(x) - f_{Q_0}| > m\gamma + \alpha\}\right) \le \alpha^{-(m+1)}|Q_0|.
\end{align*}
Re-indexing with $m' = m + 1$, the same argument that gave (P3) in the $m$-th iteration shows
\begin{align*}
\mathcal{L}^n\!\left(\{x \in Q_0 : |f(x) - f_{Q_0}| > (m+1)\gamma + \alpha\}\right) \le \mathcal{L}^n\!\left(\textstyle\bigcup \mathcal{F}_{m+1}\right) \le \alpha^{-(m+1)}|Q_0|.
\end{align*}
This completes the inductive step.
[/step]
[step:Convert the discrete bound into the continuous exponential estimate]
For arbitrary $\lambda > 0$ choose $m \in \mathbb{Z}_{\ge 0}$ with $m\gamma + \alpha \le \lambda < (m+1)\gamma + \alpha$, where $\gamma = 2^n\alpha$. Such an $m$ exists provided $\lambda \ge \alpha$, namely $m = \lfloor (\lambda - \alpha)/\gamma \rfloor$. From the previous step,
\begin{align*}
\mathcal{L}^n\!\left(\{x \in Q_0 : |f(x) - f_{Q_0}| > \lambda\}\right) \le \mathcal{L}^n\!\left(\{x \in Q_0 : |f(x) - f_{Q_0}| > m\gamma + \alpha\}\right) \le \alpha^{-m}|Q_0|.
\end{align*}
Taking logarithms,
\begin{align*}
\log\frac{\mathcal{L}^n(\{|f - f_{Q_0}| > \lambda\})}{|Q_0|} \le -m\log\alpha \le -\frac{\lambda - \alpha - \gamma}{\gamma}\log\alpha = -\frac{\log\alpha}{\gamma}\lambda + \frac{(\alpha+\gamma)\log\alpha}{\gamma}.
\end{align*}
Equivalently,
\begin{align*}
\mathcal{L}^n(\{x \in Q_0 : |f(x) - f_{Q_0}| > \lambda\}) \le \alpha^{(\alpha+\gamma)/\gamma}|Q_0|\,\exp\!\left(-\frac{\log\alpha}{2^n\alpha}\,\lambda\right).
\end{align*}
For $\lambda < \alpha$ the elementary bound $\mathcal{L}^n(\cdot) \le |Q_0|$ holds, and we can absorb this case by inflating the constant $c_1$. Setting
\begin{align*}
\alpha &:= 2^{n+1}, & c_1 &:= e \cdot \alpha^{(\alpha+\gamma)/\gamma} = e \cdot \alpha^{(\alpha+2^n\alpha)/(2^n\alpha)} = e\cdot \alpha^{(1+2^n)/2^n}, & c_2 &:= \frac{\log\alpha}{2^n\alpha} = \frac{(n+1)\log 2}{2^{2n+1}},
\end{align*}
we obtain
\begin{align*}
\mathcal{L}^n\!\left(\{x \in Q_0 : |f(x) - f_{Q_0}| > \lambda\}\right) \le c_1|Q_0|\exp(-c_2\lambda) \qquad \text{for all } \lambda > 0.
\end{align*}
Substituting back the original $f$ via the rescaling $\lambda \mapsto \lambda/\|f\|_{\mathrm{BMO}}$ from step 1 gives the desired inequality with $c_1, c_2$ depending only on $n$.
[/step]