[proofplan]
We decompose the estimation error into a centered stochastic term and a deterministic bias term. The centered term is a triangular-array sum of independent kernel summands; Lyapunov's [central limit theorem](/theorems/521) applies because boundedness of $K$, together with the pointwise variance expansion, controls the required Lyapunov moments and identifies the limiting variance. The undersmoothing condition removes the bias at the standard-error scale. Finally, pointwise consistency makes the positive-part plug-in standard error asymptotically equivalent to the oracle standard error, and Slutsky's theorem gives the stated interval coverage.
[/proofplan]
[step:Normalize the centered stochastic term as a triangular array]
Let $(\Omega,\mathcal{F},\mathbb{P})$ be the probability space on which the independent, identically distributed random variables $X_i:(\Omega,\mathcal{F})\to(\mathbb{R},\mathcal{B}(\mathbb{R}))$ are defined. Let $(h_n)_{n\in\mathbb{N}}$ denote the bandwidth sequence from the theorem statement; it plays the role of the bandwidth parameter $h$ in the interval formula, now allowed to depend on $n$. For each $n\in\mathbb{N}$ and $1\leq i\leq n$, define the real-valued [random variable](/page/Random%20Variable)
\begin{align*}
Y_{n,i}
:=
\frac{1}{\sqrt{n h_n}}
\left[
K\left(\frac{x-X_i}{h_n}\right)
-
\mathbb{E}\left[K\left(\frac{x-X_i}{h_n}\right)\right]
\right].
\end{align*}
Then $(Y_{n,i})_{1\leq i\leq n}$ are independent and centered. Their sum is exactly the centered kernel estimator at the $\sqrt{n h_n}$ scale:
\begin{align*}
\sum_{i=1}^{n}Y_{n,i}
&=
\sqrt{n h_n}
\left(
\hat f_{h_n}(x)-\mathbb{E}[\hat f_{h_n}(x)]
\right).
\end{align*}
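Since the $X_i$ are independent and identically distributed, so are the $Y_{n,i}$ for each fixed $n$, and the variance of their sum is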
\begin{align*}
s_n^2
:=
\operatorname{Var}\left(\sum_{i=1}^{n}Y_{n,i}\right)
=
n\operatorname{Var}(Y_{n,1})
=
n h_n\,\operatorname{Var}(\hat f_{h_n}(x)).
\end{align*}
By the assumed pointwise variance expansion,
\begin{align*}
s_n^2\to f(x)R(K).
\end{align*}
Since $f(x)>0$ and $R(K)>0$, the limiting variance is positive.
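For orientation, the limiting variance can be made concrete under an assumption that the theorem does not impose: suppose $K$ is the standard Gaussian kernel and $R(K)=\int_{\mathbb{R}}K(u)^2\,du$ is its roughness, as the notation suggests. Then
\begin{align*}
R(K)=\int_{\mathbb{R}}\frac{1}{2\pi}e^{-u^2}\,du=\frac{1}{2\sqrt{\pi}}\approx 0.2821,
\qquad
s_n^2\to\frac{f(x)}{2\sqrt{\pi}}.
\end{align*}
Any other bounded kernel with $0<R(K)<\infty$ changes only this constant.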
[/step]
[step:Verify Lyapunov's condition for the centered kernel array]
Set $\delta:=1$, let $\mathcal{L}^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb{R}$, and let $M:=\|K\|_\infty<\infty$. Since $K$ is bounded, for every $u\in\mathbb{R}$,
\begin{align*}
|K(u)|^{2+\delta}\leq M^\delta K(u)^2.
\end{align*}
Integrating this bound gives $\int_{\mathbb{R}} |K(u)|^{2+\delta}\,d\mathcal{L}^1(u)\leq M^\delta R(K)<\infty$, so finiteness of this integral follows from $K\in L^\infty(\mathbb{R})$ and $R(K)<\infty$.
Using the inequality $|a-b|^{2+\delta}\leq 2^{1+\delta}(|a|^{2+\delta}+|b|^{2+\delta})$ for $a,b\in\mathbb{R}$ and [Jensen's inequality](/theorems/9) for the convex function $t\mapsto |t|^{2+\delta}$, we obtain
\begin{align*}
\mathbb{E}\left[
\left|
K\left(\frac{x-X_1}{h_n}\right)
-
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]
\right|^{2+\delta}
\right]
\leq
2^{2+\delta}
\mathbb{E}\left[
\left|
K\left(\frac{x-X_1}{h_n}\right)
\right|^{2+\delta}
\right].
\end{align*}
Using the previous bound then gives
\begin{align*}
\mathbb{E}\left[
\left|
K\left(\frac{x-X_1}{h_n}\right)
-
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]
\right|^{2+\delta}
\right]
\leq
2^{2+\delta}M^\delta
\mathbb{E}\left[
K\left(\frac{x-X_1}{h_n}\right)^2
\right].
\end{align*}
The bias expansion gives
\begin{align*}
\mathbb{E}[\hat f_{h_n}(x)]=f(x)+o(1).
\end{align*}
Since
\begin{align*}
\mathbb{E}[\hat f_{h_n}(x)]=\frac{1}{h_n}\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right],
\end{align*}
we obtain
\begin{align*}
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]=h_n f(x)+o(h_n),
\end{align*}
and hence the squared mean term is $O(h_n^2)$. The variance expansion gives
\begin{align*}
\operatorname{Var}\left(K\left(\frac{x-X_1}{h_n}\right)\right)=h_n f(x)R(K)+o(h_n).
\end{align*}
Therefore, since the second moment is the variance plus the squared mean and $O(h_n^2)=o(h_n)$ as $h_n\to0$,
\begin{align*}
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)^2\right]=h_n f(x)R(K)+o(h_n).
\end{align*}
Choose $A>f(x)R(K)$ such that, for all sufficiently large $n$,
\begin{align*}
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)^2\right]\leq A h_n.
\end{align*}
Define the constant $C:=2^{2+\delta}M^\delta A$. Then, for all sufficiently large $n$,
\begin{align*}
\mathbb{E}\left[\left|K\left(\frac{x-X_1}{h_n}\right)-\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]\right|^{2+\delta}\right]\leq C h_n.
\end{align*}
By the definition of $Y_{n,i}$,
\begin{align*}
\sum_{i=1}^{n}\mathbb{E}[|Y_{n,i}|^{2+\delta}]=n(nh_n)^{-(1+\delta/2)}\mathbb{E}\left[\left|K\left(\frac{x-X_1}{h_n}\right)-\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]\right|^{2+\delta}\right].
\end{align*}
Using the preceding centered-moment bound,
\begin{align*}
\sum_{i=1}^{n}\mathbb{E}[|Y_{n,i}|^{2+\delta}]\leq C(nh_n)^{-\delta/2}\to0.
\end{align*}
Since $s_n^2\to f(x)R(K)>0$, Lyapunov's condition holds:
\begin{align*}
\frac{1}{s_n^{2+\delta}}
\sum_{i=1}^{n}\mathbb{E}[|Y_{n,i}|^{2+\delta}]
\to0.
\end{align*}
By the Lyapunov [central limit theorem](/theorems/1848) for triangular arrays,
\begin{align*}
\frac{\sum_{i=1}^{n}Y_{n,i}}{s_n}
\xrightarrow{d}
\mathcal N(0,1).
\end{align*}
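To illustrate the rate in the Lyapunov bound, consider the hypothetical bandwidth $h_n=n^{-1/3}$, which is chosen only as an example and is not part of the theorem. Then $nh_n=n^{2/3}$, and with $\delta=1$,
\begin{align*}
\sum_{i=1}^{n}\mathbb{E}[|Y_{n,i}|^{3}]\leq C(nh_n)^{-1/2}=Cn^{-1/3}\to0.
\end{align*}
Any other fixed $\delta>0$ would work in the same way, with rate $(nh_n)^{-\delta/2}$, because $K$ is bounded.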
[guided]
The stochastic part of the estimator is a sum of independent terms whose distribution changes with $n$, because the bandwidth $h_n$ changes with $n$. This is why the correct central limit theorem is a triangular-array version, not the classical i.i.d. central limit theorem.
We must verify Lyapunov's condition with exponent $\delta:=1$. The random variable in one summand is
\begin{align*}
K\left(\frac{x-X_1}{h_n}\right)
-
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right].
\end{align*}
The first issue is to control its $(2+\delta)$ moment. Let $M:=\|K\|_\infty$. Since $K$ is bounded, for every $u\in\mathbb{R}$,
\begin{align*}
|K(u)|^{2+\delta}=|K(u)|^\delta K(u)^2\leq M^\delta K(u)^2.
\end{align*}
Consequently $\int_{\mathbb{R}} |K(u)|^{2+\delta}\,d\mathcal{L}^1(u)\leq M^\delta R(K)<\infty$ follows automatically from boundedness of $K$ and $R(K)<\infty$.
Also, for [real numbers](/page/Real%20Numbers) $a,b$,
\begin{align*}
|a-b|^{2+\delta}\leq 2^{1+\delta}(|a|^{2+\delta}+|b|^{2+\delta}).
\end{align*}
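This inequality is a consequence of convexity; a short derivation, written with $p:=2+\delta$ as shorthand, is as follows. Since $t\mapsto|t|^{p}$ is convex for $p\geq1$,
\begin{align*}
\left|\frac{a-b}{2}\right|^{p}
=\left|\frac{a+(-b)}{2}\right|^{p}
\leq\frac{|a|^{p}+|b|^{p}}{2},
\end{align*}
and multiplying both sides by $2^{p}$ gives $|a-b|^{p}\leq2^{p-1}(|a|^{p}+|b|^{p})$, where $2^{p-1}=2^{1+\delta}$.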
Applying this with $a=K\left(\frac{x-X_1}{h_n}\right)$ and $b=\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]$, and then using [Jensen's inequality](/theorems/1977) for the convex function $t\mapsto |t|^{2+\delta}$ gives
\begin{align*}
\mathbb{E}\left[
\left|
K\left(\frac{x-X_1}{h_n}\right)
-
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]
\right|^{2+\delta}
\right]
\leq
2^{2+\delta}
\mathbb{E}\left[
\left|
K\left(\frac{x-X_1}{h_n}\right)
\right|^{2+\delta}
\right].
\end{align*}
Using $|K(u)|^{2+\delta}\leq M^\delta K(u)^2$ then gives
\begin{align*}
\mathbb{E}\left[
\left|
K\left(\frac{x-X_1}{h_n}\right)
-
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]
\right|^{2+\delta}
\right]
\leq
2^{2+\delta}M^\delta
\mathbb{E}\left[
K\left(\frac{x-X_1}{h_n}\right)^2
\right].
\end{align*}
The pointwise variance expansion gives the size of the last expectation. Indeed,
\begin{align*}
\operatorname{Var}(\hat f_{h_n}(x))
=
\frac{1}{n h_n^2}
\operatorname{Var}\left(K\left(\frac{x-X_1}{h_n}\right)\right),
\end{align*}
so the hypothesis
\begin{align*}
\operatorname{Var}(\hat f_{h_n}(x))
=
\frac{f(x)R(K)}{n h_n}+o\left(\frac{1}{n h_n}\right)
\end{align*}
implies
\begin{align*}
\operatorname{Var}\left(K\left(\frac{x-X_1}{h_n}\right)\right)
=
h_n f(x)R(K)+o(h_n).
\end{align*}
The bias expansion also gives
\begin{align*}
\mathbb{E}[\hat f_{h_n}(x)]=f(x)+o(1).
\end{align*}
Since $\mathbb{E}[\hat f_{h_n}(x)]=h_n^{-1}\mathbb{E}[K((x-X_1)/h_n)]$, this implies
\begin{align*}
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]=h_n f(x)+o(h_n).
\end{align*}
Thus the squared mean is $O(h_n^2)=o(h_n)$ as $h_n\to0$, and consequently, adding the variance and the squared mean,
\begin{align*}
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)^2\right]=h_n f(x)R(K)+o(h_n).
\end{align*}
Choose $A>f(x)R(K)$ such that, for all sufficiently large $n$,
\begin{align*}
\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)^2\right]\leq A h_n.
\end{align*}
The constant in the centered-moment estimate is therefore explicit: define $C:=2^{2+\delta}M^\delta A$. Then, for all sufficiently large $n$,
\begin{align*}
\mathbb{E}\left[\left|K\left(\frac{x-X_1}{h_n}\right)-\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]\right|^{2+\delta}\right]\leq C h_n.
\end{align*}
Now compute Lyapunov's numerator for the array $Y_{n,i}$. Recall that
\begin{align*}
Y_{n,i}
=
\frac{1}{\sqrt{n h_n}}
\left[
K\left(\frac{x-X_i}{h_n}\right)
-
\mathbb{E}\left[K\left(\frac{x-X_i}{h_n}\right)\right]
\right].
\end{align*}
Since the $X_i$ are identically distributed, every summand has the same $(2+\delta)$-th absolute moment, and hence
\begin{align*}
\sum_{i=1}^{n}\mathbb{E}[|Y_{n,i}|^{2+\delta}]=n(nh_n)^{-(1+\delta/2)}\mathbb{E}\left[\left|K\left(\frac{x-X_1}{h_n}\right)-\mathbb{E}\left[K\left(\frac{x-X_1}{h_n}\right)\right]\right|^{2+\delta}\right].
\end{align*}
The centered-moment estimate gives
\begin{align*}
\sum_{i=1}^{n}\mathbb{E}[|Y_{n,i}|^{2+\delta}]\leq C(nh_n)^{-\delta/2}.
\end{align*}
Because $nh_n\to\infty$, the right-hand side tends to $0$. The variance of the whole array is
\begin{align*}
s_n^2
=
\operatorname{Var}\left(\sum_{i=1}^{n}Y_{n,i}\right)
=
n h_n\,\operatorname{Var}(\hat f_{h_n}(x))
\to f(x)R(K)>0.
\end{align*}
Therefore
\begin{align*}
\frac{1}{s_n^{2+\delta}}
\sum_{i=1}^{n}\mathbb{E}[|Y_{n,i}|^{2+\delta}]
\to0.
\end{align*}
This is exactly Lyapunov's condition. Hence the Lyapunov central limit theorem for triangular arrays yields
\begin{align*}
\frac{\sum_{i=1}^{n}Y_{n,i}}{s_n}
\xrightarrow{d}
\mathcal N(0,1).
\end{align*}
[/guided]
[/step]
[step:Replace the random-array variance by its deterministic limit]
Since
\begin{align*}
s_n^2\to f(x)R(K)>0,
\end{align*}
the deterministic ratio satisfies
\begin{align*}
\frac{s_n}{\sqrt{f(x)R(K)}}\to1.
\end{align*}
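By Slutsky's theorem, multiplying the standardized sum $\sum_{i=1}^{n}Y_{n,i}/s_n$ by this convergent deterministic ratio and using the identity for $\sum_{i=1}^{n}Y_{n,i}$ from the first step gives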
\begin{align*}
\frac{\sqrt{n h_n}\left(\hat f_{h_n}(x)-\mathbb{E}[\hat f_{h_n}(x)]\right)}
{\sqrt{f(x)R(K)}}
\xrightarrow{d}
\mathcal N(0,1).
\end{align*}
Equivalently,
\begin{align*}
\frac{\hat f_{h_n}(x)-\mathbb{E}[\hat f_{h_n}(x)]}
{\sqrt{f(x)R(K)/(n h_n)}}
\xrightarrow{d}
\mathcal N(0,1).
\end{align*}
[/step]
[step:Use undersmoothing to remove the bias]
Decompose
\begin{align*}
\frac{\hat f_{h_n}(x)-f(x)}
{\sqrt{f(x)R(K)/(n h_n)}}
&=
\frac{\hat f_{h_n}(x)-\mathbb{E}[\hat f_{h_n}(x)]}
{\sqrt{f(x)R(K)/(n h_n)}}
+
\frac{\mathbb{E}[\hat f_{h_n}(x)]-f(x)}
{\sqrt{f(x)R(K)/(n h_n)}}.
\end{align*}
The first term converges in distribution to $\mathcal N(0,1)$ by the previous step. The second term equals
\begin{align*}
\frac{\sqrt{n h_n}\left(\mathbb{E}[\hat f_{h_n}(x)]-f(x)\right)}
{\sqrt{f(x)R(K)}},
\end{align*}
which tends to $0$ by the undersmoothing condition and the positivity of $f(x)R(K)$. A second application of Slutsky's theorem gives
\begin{align*}
\frac{\hat f_{h_n}(x)-f(x)}
{\sqrt{f(x)R(K)/(n h_n)}}
\xrightarrow{d}
\mathcal N(0,1).
\end{align*}
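As an illustration of the undersmoothing condition, suppose (an assumption beyond the theorem) that the bias satisfies $\mathbb{E}[\hat f_{h_n}(x)]-f(x)=O(h_n^2)$, as is typical for a twice continuously differentiable density and a symmetric second-order kernel. Then
\begin{align*}
\sqrt{nh_n}\,\bigl|\mathbb{E}[\hat f_{h_n}(x)]-f(x)\bigr|=O\bigl(\sqrt{nh_n^5}\bigr),
\end{align*}
so the hypothetical choice $h_n=n^{-1/3}$ gives $\sqrt{nh_n^5}=n^{-1/3}\to0$ while keeping $nh_n=n^{2/3}\to\infty$. By contrast, a bandwidth of order $n^{-1/5}$ keeps $\sqrt{nh_n^5}$ bounded away from $0$, so this bound alone does not make the bias term vanish.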
[/step]
[step:Show that the plug-in standard error is asymptotically equivalent]
We first prove pointwise consistency. The pointwise variance expansion gives
\begin{align*}
\operatorname{Var}(\hat f_{h_n}(x))
=
\frac{f(x)R(K)}{n h_n}+o\left(\frac{1}{n h_n}\right)
\to0,
\end{align*}
because the bandwidth assumptions include $n h_n\to\infty$. The undersmoothing condition gives
\begin{align*}
\sqrt{n h_n}\left(\mathbb{E}[\hat f_{h_n}(x)]-f(x)\right)\to0.
\end{align*}
Since $n h_n\to\infty$, this implies
\begin{align*}
\mathbb{E}[\hat f_{h_n}(x)]\to f(x).
\end{align*}
For every $\varepsilon>0$, the triangle inequality gives
\begin{align*}
\mathbb{P}\left(
|\hat f_{h_n}(x)-f(x)|>\varepsilon
\right)
\leq
\mathbb{P}\left(
|\hat f_{h_n}(x)-\mathbb{E}[\hat f_{h_n}(x)]|>\frac{\varepsilon}{2}
\right)
\end{align*}
for all sufficiently large $n$, because $|\mathbb{E}[\hat f_{h_n}(x)]-f(x)|\leq\varepsilon/2$ eventually. Hence, by [Chebyshev's inequality](/theorems/1126) applied to $\hat f_{h_n}(x)-\mathbb{E}[\hat f_{h_n}(x)]$,
\begin{align*}
\mathbb{P}\left(
|\hat f_{h_n}(x)-f(x)|>\varepsilon
\right)
\leq
\frac{4\operatorname{Var}(\hat f_{h_n}(x))}{\varepsilon^2}
\to0.
\end{align*}
Thus
\begin{align*}
\hat f_{h_n}(x)\xrightarrow{\mathbb P} f(x).
\end{align*}
Define the positive part $a^+:=\max\{a,0\}$ for $a\in\mathbb{R}$. Since $f(x)>0$ and $\hat f_{h_n}(x)\xrightarrow{\mathbb P} f(x)$, the [continuous mapping theorem](/theorems/1847) applied to the continuous map $a\mapsto \sqrt{a^+/f(x)}$ gives
\begin{align*}
\sqrt{\frac{\hat f_{h_n}(x)^+}{f(x)}}\xrightarrow{\mathbb P}1.
\end{align*}
Equivalently,
\begin{align*}
\frac{\sqrt{\hat f_{h_n}(x)^+R(K)/(n h_n)}}{\sqrt{f(x)R(K)/(n h_n)}}\xrightarrow{\mathbb P}1.
\end{align*}
Therefore, by Slutsky's theorem,
\begin{align*}
\frac{\hat f_{h_n}(x)-f(x)}{\sqrt{\hat f_{h_n}(x)^+R(K)/(n h_n)}}\xrightarrow{d}\mathcal N(0,1),
\end{align*}
where the statistic may be assigned any value on the event $\{\hat f_{h_n}(x)^+=0\}$, whose probability tends to $0$.
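The Slutsky step can be written out explicitly. On the event $\{\hat f_{h_n}(x)^+>0\}$,
\begin{align*}
\frac{\hat f_{h_n}(x)-f(x)}{\sqrt{\hat f_{h_n}(x)^+R(K)/(n h_n)}}
=
\frac{\hat f_{h_n}(x)-f(x)}{\sqrt{f(x)R(K)/(n h_n)}}
\cdot
\left(\frac{\hat f_{h_n}(x)^+}{f(x)}\right)^{-1/2},
\end{align*}
where the first factor converges in distribution to $\mathcal N(0,1)$ and the second converges in probability to $1$.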
[/step]
[step:Convert the studentized limit into interval coverage]
Let $\Phi: \mathbb{R}\to[0,1]$ denote the distribution function of $\mathcal N(0,1)$, and let $z_{1-\tau/2}$ be the number satisfying
\begin{align*}
\Phi(z_{1-\tau/2})=1-\tau/2.
\end{align*}
Let $L_n$ and $U_n$ denote the lower and upper endpoints
\begin{align*}
L_n:=\hat f_{h_n}(x)-z_{1-\tau/2}\sqrt{\frac{\hat f_{h_n}(x)^+R(K)}{n h_n}}
\end{align*}
and
\begin{align*}
U_n:=\hat f_{h_n}(x)+z_{1-\tau/2}\sqrt{\frac{\hat f_{h_n}(x)^+R(K)}{n h_n}}.
\end{align*}
Define $I_n:=[L_n,U_n]$.
The event that the interval covers $f(x)$ is $\{f(x)\in I_n\}$. On the event $\{\hat f_{h_n}(x)^+>0\}$, this event is equivalent to
\begin{align*}
\left\{\left|\frac{\hat f_{h_n}(x)-f(x)}{\sqrt{\hat f_{h_n}(x)^+R(K)/(n h_n)}}\right|\leq z_{1-\tau/2}\right\}.
\end{align*}
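The probability of $\{\hat f_{h_n}(x)^+=0\}$ tends to $0$ because $\hat f_{h_n}(x)\xrightarrow{\mathbb P}f(x)>0$, so the coverage probability $\mathbb{P}(f(x)\in I_n)$ differs from the probability of the displayed event by at most $\mathbb{P}(\hat f_{h_n}(x)^+=0)\to0$. Since the studentized statistic converges in distribution to $\mathcal N(0,1)$ and the boundary points $\pm z_{1-\tau/2}$ have zero normal probability,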
\begin{align*}
\mathbb{P}\left(
\left|
\frac{\hat f_{h_n}(x)-f(x)}{\sqrt{\hat f_{h_n}(x)^+R(K)/(n h_n)}}
\right|
\leq z_{1-\tau/2}
\right)
\to
\Phi(z_{1-\tau/2})-\Phi(-z_{1-\tau/2})
=
1-\tau.
\end{align*}
Thus the positive-part plug-in pointwise interval has asymptotic coverage $1-\tau$.
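As a purely illustrative computation with hypothetical values (none taken from the theorem), let $\tau=0.05$, so that $z_{0.975}\approx1.96$, and suppose a Gaussian kernel with $R(K)=1/(2\sqrt{\pi})\approx0.2821$, sample size $n=1000$, bandwidth $h_n=0.1$, and observed value $\hat f_{h_n}(x)=0.4$. Then
\begin{align*}
z_{0.975}\sqrt{\frac{\hat f_{h_n}(x)^+R(K)}{n h_n}}
\approx1.96\sqrt{\frac{0.4\times0.2821}{100}}
\approx0.066,
\end{align*}
so $I_n\approx[0.334,\,0.466]$.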
[/step]