[proofplan]
The proof is a dyadic peeling argument. The shells $S_{-1},S_0,S_1,\dots$ partition $\mathcal F$ according to the size of $R(f)$, and each shell is contained in the localized class at its upper radius $r_k$. We choose a deterministic deviation level $u_k$ large enough that the assumed localized tail bound gives probability at most $e^{-a_k}$ on the $k$-th shell, then sum these bounds over all shell indices by [countable subadditivity](/theorems/1108).
[/proofplan]
[step:Choose the universal shell deviation constant]Define
\begin{align*}
c_B:=\max\{1,B\}+1.
\end{align*}
Then $c_B\ge 2$ depends only on $B$. Since $c_B\ge 1$ and $c_B>B$, we have
\begin{align*}
c_B^2\ge B
\end{align*}
and
\begin{align*}
c_B^2\ge c_B.
\end{align*}
Fix $s>0$ and $k\ge -1$. Define the shell deviation level $u_k\in(0,\infty)$ by
\begin{align*}
u_k:=c_B\left(r_k\sqrt{\frac{a_k}{n}}+\frac{M a_k}{n}\right).
\end{align*}
Since $r_k\ge r_0>0$ and $a_k>0$, this number is strictly positive. Set
\begin{align*}
x_k:=r_k\sqrt{\frac{a_k}{n}}
\end{align*}
and
\begin{align*}
y_k:=\frac{M a_k}{n}.
\end{align*}
Then $x_k>0$, $y_k\ge0$, and $u_k=c_B(x_k+y_k)$. Also,
\begin{align*}
r_k^2=\frac{n x_k^2}{a_k}
\end{align*}
and, since $a_k>0$, we also have $M=n y_k/a_k$. Therefore
\begin{align*}
B r_k^2+M u_k
=\frac{n}{a_k}\left(Bx_k^2+c_B y_k(x_k+y_k)\right).
\end{align*}
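Using $c_B^2\ge B$ and $c_B^2\ge c_B$ for the first inequality, and $x_k^2+y_k(x_k+y_k)\le(x_k+y_k)^2$ (valid because $x_k y_k\ge0$) for the second, we obtain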
\begin{align*}
B x_k^2+c_B y_k(x_k+y_k)
\le c_B^2 x_k^2+c_B^2 y_k(x_k+y_k)
\le c_B^2(x_k+y_k)^2.
\end{align*}
Consequently,
\begin{align*}
\frac{n u_k^2}{B r_k^2+M u_k}
=
\frac{a_k c_B^2(x_k+y_k)^2}{B x_k^2+c_B y_k(x_k+y_k)}
\ge a_k.
\end{align*}[/step]
[guided]The only algebraic point in the proof is to choose the constant multiplying the deviation level so that it dominates both terms in the denominator of the assumed Bernstein-type bound. We make this explicit by defining
\begin{align*}
c_B:=\max\{1,B\}+1.
\end{align*}
This gives two inequalities that will be used below:
\begin{align*}
c_B^2\ge B
\end{align*}
and
\begin{align*}
c_B^2\ge c_B.
\end{align*}
For a fixed shell index $k\ge -1$, define
\begin{align*}
u_k:=c_B\left(r_k\sqrt{\frac{a_k}{n}}+\frac{M a_k}{n}\right).
\end{align*}
This is the deviation level we will feed into the localized tail estimate. It is positive because $r_k\ge r_0>0$, $a_k>0$, and $n\in\mathbb N$.
To verify that this choice produces the exponent $a_k$, introduce the auxiliary nonnegative quantities
\begin{align*}
x_k:=r_k\sqrt{\frac{a_k}{n}}
\end{align*}
and
\begin{align*}
y_k:=\frac{M a_k}{n}.
\end{align*}
Then $u_k=c_B(x_k+y_k)$. The point of this substitution is that it expresses both pieces of the Bernstein denominator in the same scale. Since $a_k>0$,
\begin{align*}
r_k^2=\frac{n x_k^2}{a_k}
\end{align*}
and
\begin{align*}
M=\frac{n y_k}{a_k}.
\end{align*}
Hence
\begin{align*}
B r_k^2+M u_k
=\frac{n}{a_k}\left(Bx_k^2+c_B y_k(x_k+y_k)\right).
\end{align*}
Now use the two inequalities built into the definition of $c_B$. Since $x_k\ge0$ and $y_k\ge0$,
\begin{align*}
B x_k^2\le c_B^2 x_k^2
\end{align*}
and
\begin{align*}
c_B y_k(x_k+y_k)\le c_B^2 y_k(x_k+y_k).
\end{align*}
Adding these inequalities gives
\begin{align*}
B x_k^2+c_B y_k(x_k+y_k)
\le c_B^2 x_k^2+c_B^2 y_k(x_k+y_k).
\end{align*}
Finally, since $(x_k+y_k)^2-x_k^2-y_k(x_k+y_k)=x_k y_k\ge0$,
\begin{align*}
x_k^2+y_k(x_k+y_k)\le (x_k+y_k)^2,
\end{align*}
so
\begin{align*}
B x_k^2+c_B y_k(x_k+y_k)
\le c_B^2(x_k+y_k)^2.
\end{align*}
Substituting this estimate into the exponent yields
\begin{align*}
\frac{n u_k^2}{B r_k^2+M u_k}
=
\frac{a_k c_B^2(x_k+y_k)^2}{B x_k^2+c_B y_k(x_k+y_k)}
\ge a_k.
\end{align*}
This is the uniform inequality needed for every shell index, and it depends on $B$ only through the single constant $c_B$.[/guided]
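The exponent inequality above can be sanity-checked numerically. The following is a minimal sketch, not part of the proof: the function names `c_B` and `bernstein_exponent` and the sampled parameter ranges are illustrative assumptions, and the check presumes $B>0$, $M\ge0$, $r_k>0$, and $a_k>0$.

```python
import math
import random


def c_B(B: float) -> float:
    """Constant c_B = max{1, B} + 1 as defined in the text."""
    return max(1.0, B) + 1.0


def bernstein_exponent(n: int, B: float, M: float, r: float, a: float) -> float:
    """Return n u^2 / (B r^2 + M u) for u = c_B (r sqrt(a/n) + M a / n)."""
    c = c_B(B)
    u = c * (r * math.sqrt(a / n) + M * a / n)
    return n * u * u / (B * r * r + M * u)


# Sample hypothetical parameter values and record the smallest observed
# ratio exponent / a; the algebra in the text predicts a ratio of at least 1.
random.seed(0)
worst_ratio = math.inf
for _ in range(10_000):
    n = random.randint(1, 10_000)
    B = random.uniform(0.01, 50.0)   # assumed B > 0
    M = random.uniform(0.0, 20.0)    # assumed M >= 0
    r = random.uniform(0.01, 10.0)   # plays the role of r_k > 0
    a = random.uniform(0.1, 30.0)    # plays the role of a_k > 0
    worst_ratio = min(worst_ratio, bernstein_exponent(n, B, M, r, a) / a)

print(worst_ratio >= 1.0)
```

A random search of this kind only illustrates the inequality on the sampled range; the algebraic argument is what establishes it for all admissible parameters.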
[step:Embed each shell into its localized class]
For $k=-1$, the definitions give
\begin{align*}
S_{-1}=\{f\in\mathcal F:R(f)\le r_0\}=\mathcal F(r_{-1}).
\end{align*}
For $k\ge0$, if $f\in S_k$, then
\begin{align*}
R(f)\le 2^{k+1}r_0=r_k.
\end{align*}
Thus
\begin{align*}
S_k\subseteq\mathcal F(r_k)
\end{align*}
for every $k\ge -1$.
[/step]
[step:Apply the localized tail bound to each shell]
For each $k\ge -1$, define the event
\begin{align*}
E_k:=\left\{\sup_{f\in S_k}|(P_n-P)f|>A(r_k)+u_k\right\}.
\end{align*}
Since $S_k\subseteq\mathcal F(r_k)$, monotonicity of the supremum over nested index sets gives
\begin{align*}
\sup_{f\in S_k}|(P_n-P)f|
\le
\sup_{f\in\mathcal F(r_k)}|(P_n-P)f|.
\end{align*}
Therefore
\begin{align*}
E_k\subseteq
\left\{\sup_{f\in\mathcal F(r_k)}|(P_n-P)f|>A(r_k)+u_k\right\}.
\end{align*}
Because $r_k\ge r_0$ and $u_k>0$, the assumed localized deviation inequality applies with $r=r_k$ and $u=u_k$. Hence
\begin{align*}
\mathbb P(E_k)
\le
\exp\left(-\frac{n u_k^2}{B r_k^2+M u_k}\right).
\end{align*}
By the exponent estimate from the first step,
\begin{align*}
\mathbb P(E_k)\le e^{-a_k}.
\end{align*}
[/step]
[step:Sum the shell probabilities by countable subadditivity]
Define the exceptional event
\begin{align*}
E:=\bigcup_{k=-1}^{\infty}E_k.
\end{align*}
By countable subadditivity of probability measures,
\begin{align*}
\mathbb P(E)\le \sum_{k=-1}^{\infty}\mathbb P(E_k).
\end{align*}
Using the shellwise estimates,
\begin{align*}
\mathbb P(E)\le \sum_{k=-1}^{\infty}e^{-a_k}.
\end{align*}
By the definitions of $a_{-1}$ and $a_k$ for $k\ge0$,
\begin{align*}
\sum_{k=-1}^{\infty}e^{-a_k}
=
e^{-(s+1)}+\sum_{k=0}^{\infty}e^{-(s+k+2)}.
\end{align*}
The second term is a geometric series:
\begin{align*}
\sum_{k=0}^{\infty}e^{-(s+k+2)}
=
e^{-s}\sum_{k=0}^{\infty}e^{-(k+2)}
=
e^{-s}\frac{e^{-2}}{1-e^{-1}}.
\end{align*}
Thus
\begin{align*}
\sum_{k=-1}^{\infty}e^{-a_k}
=
e^{-s}\left(e^{-1}+\frac{e^{-2}}{1-e^{-1}}\right).
\end{align*}
Since
\begin{align*}
e^{-1}+\frac{e^{-2}}{1-e^{-1}}
=
\frac{1}{e-1},
\end{align*}
we obtain
\begin{align*}
\mathbb P(E)\le \frac{e^{-s}}{e-1}.
\end{align*}
This is precisely
\begin{align*}
\mathbb P\left(
\exists k\ge -1:\sup_{f\in S_k}|(P_n-P)f|
>A(r_k)+c_B\left(r_k\sqrt{\frac{a_k}{n}}+\frac{M a_k}{n}\right)
\right)
\le \frac{e^{-s}}{e-1}.
\end{align*}
[/step]
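The closed form of the shell sum can be checked numerically. This is a small sketch under the definitions used above ($a_{-1}=s+1$ and $a_k=s+k+2$ for $k\ge0$); the function names and the truncation length are illustrative assumptions.

```python
import math


def shell_tail_sum(s: float, terms: int = 200) -> float:
    """Truncated sum of e^{-a_k} over k = -1, 0, ..., terms - 1."""
    total = math.exp(-(s + 1.0))            # k = -1 term, a_{-1} = s + 1
    for k in range(terms):
        total += math.exp(-(s + k + 2.0))   # a_k = s + k + 2 for k >= 0
    return total


def closed_form(s: float) -> float:
    """Closed form e^{-s} / (e - 1) derived in the text."""
    return math.exp(-s) / (math.e - 1.0)


for s in (0.5, 1.0, 3.0):
    print(s, math.isclose(shell_tail_sum(s), closed_form(s), rel_tol=1e-12))
```

The truncation discards terms of order $e^{-(s+202)}$, which is far below double-precision resolution relative to the total.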
[step:Pass from the exceptional event to the simultaneous functionwise bound]
It remains to justify the equivalent formulation. Since $P f^2<\infty$ for every $f\in\mathcal F$, the number $R(f)$ is finite for every $f\in\mathcal F$. Because $r_0>0$, every finite nonnegative number belongs to exactly one of the intervals
\begin{align*}
[0,r_0]
\end{align*}
or
\begin{align*}
(2^k r_0,2^{k+1}r_0]
\end{align*}
with $k\ge0$. Therefore every $f\in\mathcal F$ belongs to exactly one shell $S_k$ with $k\ge -1$.
On the complement $E^c$, no event $E_k$ occurs. Hence, for every $k\ge -1$,
\begin{align*}
\sup_{f\in S_k}|(P_n-P)f|
\le A(r_k)+u_k.
\end{align*}
Substituting the definition of $u_k$ gives
\begin{align*}
\sup_{f\in S_k}|(P_n-P)f|
\le
A(r_k)+c_B\left(r_k\sqrt{\frac{a_k}{n}}+\frac{M a_k}{n}\right).
\end{align*}
If $f\in\mathcal F$ and $k$ is its unique shell index, then $f\in S_k$, and so
\begin{align*}
|(P_n-P)f|
\le
A(r_k)+c_B\left(r_k\sqrt{\frac{a_k}{n}}+\frac{M a_k}{n}\right).
\end{align*}
Since $\mathbb P(E^c)\ge 1-e^{-s}/(e-1)$, this bound holds simultaneously for all $f\in\mathcal F$ with the asserted probability.
[/step]
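To make the functionwise bound concrete, the sketch below locates the shell of a function from its value $R(f)$ and evaluates the right-hand side of the bound. It assumes $r_k=2^{k+1}r_0$ for all $k\ge-1$ (so $r_{-1}=r_0$, as the identity $S_{-1}=\mathcal F(r_{-1})$ suggests) and $a_k=s+k+2$, which also reproduces $a_{-1}=s+1$. The names `shell_index` and `deviation_bound`, and the callable `A` standing in for $A(\cdot)$, are hypothetical.

```python
import math
from typing import Callable


def shell_index(R: float, r0: float) -> int:
    """Return the unique k >= -1 such that R lies in the k-th shell:
    [0, r0] for k = -1, and (2^k r0, 2^{k+1} r0] for k >= 0."""
    if R < 0 or r0 <= 0:
        raise ValueError("need R >= 0 and r0 > 0")
    if R <= r0:
        return -1
    k = math.ceil(math.log2(R / r0)) - 1
    # Guard against floating-point rounding near shell boundaries.
    while R > 2.0 ** (k + 1) * r0:
        k += 1
    while k >= 0 and R <= 2.0 ** k * r0:
        k -= 1
    return k


def deviation_bound(R: float, r0: float, n: int, B: float, M: float,
                    s: float, A: Callable[[float], float]) -> float:
    """Evaluate A(r_k) + c_B (r_k sqrt(a_k / n) + M a_k / n) for the shell of R."""
    k = shell_index(R, r0)
    r_k = 2.0 ** (k + 1) * r0       # assumed convention, r_{-1} = r0
    a_k = s + k + 2.0               # equals s + 1 when k = -1
    c = max(1.0, B) + 1.0           # c_B from the first step
    return A(r_k) + c * (r_k * math.sqrt(a_k / n) + M * a_k / n)


# Hypothetical usage with A identically zero.
print(shell_index(1.5, 1.0), deviation_bound(0.5, 1.0, 100, 1.0, 0.0, 1.0, lambda r: 0.0))
```

Because the bound is stated in terms of the upper radius $r_k$ of the shell rather than $R(f)$ itself, it can overshoot the radius of a given function by a factor of up to two for $k\ge0$.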