[proofplan]
We argue by contradiction. If an honest band is also narrow at the smoother rate around $f_0$, then it cannot contain both $f_0$ and the rough perturbation $f_n$, because their supremum-norm separation is much larger than $r_n(s_2)$. This converts the band into a test between $P_{f_0,n}$ and $P_{f_n,n}$ whose sum of errors is asymptotically at most $2\alpha$. The total-variation lower bound for testing shows that every such test has sum of errors at least $\eta$, contradicting $\eta>2\alpha$. The theorem is stated directly in terms of the membership, separation, and testing-distance assumptions that this argument requires.
[/proofplan]
[step:Assume an adaptive honest band exists]
Suppose, toward a contradiction, that there exist a sequence of random confidence bands $C_n$ and a constant $M<\infty$ satisfying the two asserted properties. We use the standard measurability convention for confidence bands: for each deterministic signal $g$ in the model class, the event $\{g\in C_n\}$ is measurable, and the event $\{\operatorname{diam}_\infty(C_n)>u\}$ is measurable for each real number $u>0$. Since $f_0\in\mathcal{F}_{s_2}\subset\mathcal{F}_{s_1}$, honesty over $\mathcal{F}_{s_1}$ implies $\limsup_{n\to\infty}P_{f_0,n}(f_0\notin C_n)\leq\alpha$. Since $f_n\in\mathcal{F}_{s_1}$ for all sufficiently large $n$ and honesty is a uniform statement over $\mathcal{F}_{s_1}$, it applies along the moving sequence $f_n$ as well, so $\limsup_{n\to\infty}P_{f_n,n}(f_n\notin C_n)\leq\alpha$. Finally, the diameter condition for the smooth class $\mathcal{F}_{s_2}$, applied at $f_0$, implies that $P_{f_0,n}\left(\operatorname{diam}_\infty(C_n)>M r_n(s_2)\right)\to 0$.
[/step]
[step:Use the separation of $f_0$ and $f_n$ to exclude simultaneous containment in a narrow band]
Define the separation
\begin{align*}
\Delta_n:=\|f_n-f_0\|_\infty.
\end{align*}
By hypothesis,
\begin{align*}
\frac{\Delta_n}{r_n(s_2)}\to\infty.
\end{align*}
Hence, for all sufficiently large $n$,
\begin{align*}
\Delta_n>M r_n(s_2).
\end{align*}
On the event
\begin{align*}
A_n:=\{f_0\in C_n\}\cap\{\operatorname{diam}_\infty(C_n)\leq M r_n(s_2)\},
\end{align*}
the band cannot also contain $f_n$. Indeed, if both $f_0$ and $f_n$ belonged to $C_n$, then the definition of $\operatorname{diam}_\infty(C_n)$ would imply
\begin{align*}
\Delta_n=\|f_n-f_0\|_\infty\leq \operatorname{diam}_\infty(C_n)\leq M r_n(s_2),
\end{align*}
contradicting $\Delta_n>M r_n(s_2)$.
[guided]
The role of the bump construction is to produce two signals that are statistically hard to distinguish but geometrically far apart in supremum norm at the smoother confidence-band scale. We formalize the geometric part first.
Define
\begin{align*}
\Delta_n:=\|f_n-f_0\|_\infty.
\end{align*}
The hypothesis gives
\begin{align*}
\frac{\Delta_n}{r_n(s_2)}\to\infty.
\end{align*}
Since $M<\infty$ is fixed, this implies that, for all sufficiently large $n$,
\begin{align*}
\Delta_n>M r_n(s_2).
\end{align*}
Now consider the event
\begin{align*}
A_n:=\{f_0\in C_n\}\cap\{\operatorname{diam}_\infty(C_n)\leq M r_n(s_2)\}.
\end{align*}
On this event, the band contains $f_0$ and has supremum-norm diameter at most $M r_n(s_2)$. If $f_n$ also belonged to $C_n$, then the pair $f_0,f_n$ would be among the functions over which the diameter is computed, so
\begin{align*}
\|f_n-f_0\|_\infty\leq \operatorname{diam}_\infty(C_n)\leq M r_n(s_2).
\end{align*}
This contradicts $\|f_n-f_0\|_\infty=\Delta_n>M r_n(s_2)$. Hence, on $A_n$, the event $\{f_n\in C_n\}$ cannot occur.
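As an illustration of how the separation hypothesis can arise, suppose (purely as an assumption for this example, not as part of the theorem) that $s_1<s_2$, that the rates have the familiar supremum-norm form $r_n(s)=(\log n/n)^{s/(2s+1)}$, and that the perturbation is calibrated so that $\Delta_n\geq c\,r_n(s_1)$ for some constant $c>0$. Then
\begin{align*}
\frac{\Delta_n}{r_n(s_2)}
\geq
c\left(\frac{n}{\log n}\right)^{\frac{s_2}{2s_2+1}-\frac{s_1}{2s_1+1}}
=
c\left(\frac{n}{\log n}\right)^{\frac{s_2-s_1}{(2s_1+1)(2s_2+1)}}
\to\infty,
\end{align*}
because the exponent is strictly positive. Under these illustrative assumptions, the threshold $\Delta_n>M r_n(s_2)$ is eventually met for every fixed $M<\infty$.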
[/guided]
[/step]
[step:Convert the band into a test between $f_0$ and $f_n$]
For all sufficiently large $n$, let $\mathcal{Y}_n$ denote the observation space of the Gaussian white noise experiment and define the measurable test $\varphi_n:\mathcal{Y}_n\to\{0,1\}$ by
\begin{align*}
\varphi_n(Y)=\mathbb{1}_{\{f_n\in C_n(Y)\}}.
\end{align*}
We interpret $\varphi_n=1$ as rejection of $H_0:f=f_0$ in favor of $H_1:f=f_n$.
Under $P_{f_n,n}$, the type II error is
\begin{align*}
P_{f_n,n}(\varphi_n=0)
=
P_{f_n,n}(f_n\notin C_n),
\end{align*}
so
\begin{align*}
\limsup_{n\to\infty}P_{f_n,n}(\varphi_n=0)\leq \alpha.
\end{align*}
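Under $P_{f_0,n}$, the preceding step shows that, for all sufficiently large $n$, the events $\{f_n\in C_n\}$ and $A_n$ are disjoint. Taking complements gives the event inclusion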
\begin{align*}
\{\varphi_n=1\}
=\{f_n\in C_n\}
\subset
\{f_0\notin C_n\}\cup\{\operatorname{diam}_\infty(C_n)>M r_n(s_2)\}.
\end{align*}
Therefore
\begin{align*}
P_{f_0,n}(\varphi_n=1)
\leq
P_{f_0,n}(f_0\notin C_n)
+
P_{f_0,n}\left(\operatorname{diam}_\infty(C_n)>M r_n(s_2)\right),
\end{align*}
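and, since the first term has limit superior at most $\alpha$ by honesty and the second term tends to $0$ by the diameter condition,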
\begin{align*}
\limsup_{n\to\infty}P_{f_0,n}(\varphi_n=1)\leq \alpha.
\end{align*}
Combining the two error bounds,
\begin{align*}
\limsup_{n\to\infty}\left(P_{f_0,n}(\varphi_n=1)+P_{f_n,n}(\varphi_n=0)\right)\leq 2\alpha.
\end{align*}
[/step]
[step:Apply the total-variation testing lower bound]
For any measurable space $\mathcal{Y}$, any measurable test $\varphi:\mathcal{Y}\to\{0,1\}$, and any two probability measures $P,Q$ on $\mathcal{Y}$, the defining variational bound for total variation distance applied to the event $\{y\in\mathcal{Y}:\varphi(y)=1\}$ gives
\begin{align*}
|P(\varphi=1)-Q(\varphi=1)|\leq \|P-Q\|_{\mathrm{TV}}.
\end{align*}
In particular, $P(\varphi=1)-Q(\varphi=1)\geq -\|P-Q\|_{\mathrm{TV}}$. Moreover, since $Q(\varphi=0)=1-Q(\varphi=1)$, we have the identity
\begin{align*}
P(\varphi=1)+Q(\varphi=0)=1+P(\varphi=1)-Q(\varphi=1).
\end{align*}
The preceding lower bound for $P(\varphi=1)-Q(\varphi=1)$ therefore gives
\begin{align*}
P(\varphi=1)+Q(\varphi=0)\geq 1-\|P-Q\|_{\mathrm{TV}}.
\end{align*}
Applying this with $P=P_{f_0,n}$, $Q=P_{f_n,n}$, and $\varphi=\varphi_n$, and using $\|P_{f_n,n}-P_{f_0,n}\|_{\mathrm{TV}}\leq 1-\eta$, yields
\begin{align*}
P_{f_0,n}(\varphi_n=1)+P_{f_n,n}(\varphi_n=0)\geq \eta
\end{align*}
for all sufficiently large $n$.
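As a side remark, the general lower bound $1-\|P-Q\|_{\mathrm{TV}}$ cannot be improved. The following sketch uses extra notation that appears nowhere else in the proof: let $\mu$ be any measure dominating $P$ and $Q$ (for instance $\mu=P+Q$), write $p=dP/d\mu$ and $q=dQ/d\mu$, and consider the likelihood-ratio test $\varphi^\ast=\mathbb{1}_{\{q>p\}}$. Then
\begin{align*}
P(\varphi^\ast=1)+Q(\varphi^\ast=0)
=\int_{\{q>p\}}p\,d\mu+\int_{\{q\leq p\}}q\,d\mu
=\int\min(p,q)\,d\mu
=1-\|P-Q\|_{\mathrm{TV}},
\end{align*}
where the last equality uses the supremum-over-events definition of total variation distance. Thus $1-\|P-Q\|_{\mathrm{TV}}$ is exactly the smallest achievable sum of errors, so the testing-distance hypothesis is the natural quantity to control in this argument.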
[guided]
We now use the statistical indistinguishability assumption. The relevant elementary fact is that total variation controls the best possible testing error.
Let $\mathcal{Y}$ be the common observation space, let $P$ and $Q$ be probability measures on $\mathcal{Y}$, and let $\varphi:\mathcal{Y}\to\{0,1\}$ be any measurable test. By the definition of total variation distance as the supremum discrepancy over measurable events, applied to the event $\{\varphi=1\}$, we have
\begin{align*}
|P(\varphi=1)-Q(\varphi=1)|\leq \|P-Q\|_{\mathrm{TV}}.
\end{align*}
In particular,
\begin{align*}
P(\varphi=1)-Q(\varphi=1)\geq -\|P-Q\|_{\mathrm{TV}}.
\end{align*}
Separately, the identity $Q(\varphi=0)=1-Q(\varphi=1)$ gives
\begin{align*}
P(\varphi=1)+Q(\varphi=0)=P(\varphi=1)+1-Q(\varphi=1).
\end{align*}
Rearranging the right-hand side gives
\begin{align*}
P(\varphi=1)+1-Q(\varphi=1)=1+P(\varphi=1)-Q(\varphi=1).
\end{align*}
Using the lower bound for $P(\varphi=1)-Q(\varphi=1)$ now yields
\begin{align*}
P(\varphi=1)+Q(\varphi=0)\geq 1-\|P-Q\|_{\mathrm{TV}}.
\end{align*}
This inequality says that if two probability measures have total variation distance bounded away from $1$, then no test can make both the type I and type II errors arbitrarily small.
Apply this inequality with $P=P_{f_0,n}$, $Q=P_{f_n,n}$, and $\varphi=\varphi_n$ to obtain
\begin{align*}
P_{f_0,n}(\varphi_n=1)+P_{f_n,n}(\varphi_n=0)
\geq
1-\|P_{f_0,n}-P_{f_n,n}\|_{\mathrm{TV}}.
\end{align*}
By hypothesis, for all sufficiently large $n$,
\begin{align*}
\|P_{f_n,n}-P_{f_0,n}\|_{\mathrm{TV}}\leq 1-\eta.
\end{align*}
Since total variation distance is symmetric in its two arguments, combining the two displays gives, for all such $n$,
\begin{align*}
P_{f_0,n}(\varphi_n=1)+P_{f_n,n}(\varphi_n=0)
\geq \eta.
\end{align*}
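To see what the testing-distance hypothesis asks for in a concrete case, assume for illustration that the experiment is normalized as $dY(t)=f(t)\,dt+n^{-1/2}\,dW(t)$ on $[0,1]$ (this normalization is an assumption made here, not taken from the theorem), and write $\delta_n:=\|f_n-f_0\|_{L^2}$ and $\Phi$ for the standard normal distribution function. By the Cameron--Martin formula the log-likelihood ratio is Gaussian, and the problem reduces to testing $N(0,1)$ against $N(\sqrt{n}\,\delta_n,1)$, so that
\begin{align*}
\|P_{f_n,n}-P_{f_0,n}\|_{\mathrm{TV}}
=2\Phi\left(\frac{\sqrt{n}\,\delta_n}{2}\right)-1.
\end{align*}
Consequently, in this setting,
\begin{align*}
\|P_{f_n,n}-P_{f_0,n}\|_{\mathrm{TV}}\leq 1-\eta
\quad\Longleftrightarrow\quad
\sqrt{n}\,\delta_n\leq 2\,\Phi^{-1}\left(1-\frac{\eta}{2}\right).
\end{align*}
Under this normalization, the hypothesis holds exactly when the $L^2$ distance between $f_n$ and $f_0$ is at most a fixed multiple of $n^{-1/2}$, even though the supremum-norm separation $\Delta_n$ may be much larger.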
[/guided]
[/step]
[step:Derive the contradiction]
The constructed tests satisfy
\begin{align*}
\limsup_{n\to\infty}\left(P_{f_0,n}(\varphi_n=1)+P_{f_n,n}(\varphi_n=0)\right)\leq 2\alpha,
\end{align*}
while the total-variation lower bound gives
\begin{align*}
P_{f_0,n}(\varphi_n=1)+P_{f_n,n}(\varphi_n=0)\geq \eta
\end{align*}
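for all sufficiently large $n$. Taking the limit superior in the [second inequality](/theorems/2136) shows that the limit superior of the sum of errors is at least $\eta$; combined with the first inequality, this yields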
\begin{align*}
\eta\leq 2\alpha,
\end{align*}
contradicting the hypothesis $\eta>2\alpha$. Therefore no such sequence of confidence bands $C_n$ and constant $M<\infty$ can exist. The contradiction used only honesty over $\mathcal{F}_{s_1}$ and the smooth-class diameter condition for $i=2$. The full-adaptivity requirement additionally imposes the diameter condition for $i=1$, so it is stronger and is, a fortiori, impossible to satisfy together with honesty.
[/step]