[proofplan]
We write the centred kernel estimator as the sum of a triangular array of independent centred random variables. The variance of this array is computed directly by a change of variables and the [dominated convergence theorem](/theorems/4), giving the limiting variance $f(x)R(K)$. Boundedness and compact support of $K$ imply that the individual triangular-array summands converge uniformly to $0$, so the Lindeberg condition holds. The final centring replacement follows because the scaled bias is deterministic and tends to $0$.
[/proofplan]
[step:Rewrite the centred estimator as a triangular-array sum]
Let $(\Omega,\mathcal F,\mathbb P)$ denote the probability space on which the real-valued random variables $X_i:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ are defined. Let $\mathcal L^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb R$, let $\mathbb E[Z]:=\int_\Omega Z\,d\mathbb P$ denote expectation of an integrable real-valued [random variable](/page/Random%20Variable) $Z$, and let $\operatorname{Var}(Z):=\mathbb E[(Z-\mathbb E[Z])^2]$ denote its variance when $Z\in L^2(\Omega,\mathcal F,\mathbb P)$. Define the kernel estimator by
\begin{align*}
\hat f_{n,h_n}(x):=\frac{1}{nh_n}\sum_{i=1}^{n}K\left(\frac{x-X_i}{h_n}\right).
\end{align*}
Define the squared kernel integral by
\begin{align*}
R(K):=\int_{\mathbb R}K(t)^2\,d\mathcal L^1(t).
\end{align*}
Let $\operatorname{supp}K$ denote the closed support of the measurable kernel $K:\mathbb R\to\mathbb R$. Because $K$ is measurable, each composition $\omega\mapsto K((x-X_i(\omega))/h_n)$ is measurable.
For each $n\in\mathbb N$ and $1\le i\le n$, define the random variable
\begin{align*}
W_{n,i}:\Omega\to\mathbb R,\qquad W_{n,i}(\omega)=h_n^{-1}K\left(\frac{x-X_i(\omega)}{h_n}\right),
\end{align*}
and define
\begin{align*}
Y_{n,i}:\Omega\to\mathbb R,\qquad Y_{n,i}(\omega)=\sqrt{\frac{h_n}{n}}\left(W_{n,i}(\omega)-\mathbb E[W_{n,i}]\right).
\end{align*}
For each fixed $n$, the random variables $Y_{n,1},\dots,Y_{n,n}$ are independent because $X_1,\dots,X_n$ are independent, and they are centred by construction. Moreover,
\begin{align*}
\sum_{i=1}^{n}Y_{n,i}=\sqrt{\frac{h_n}{n}}\sum_{i=1}^{n}\left(h_n^{-1}K\left(\frac{x-X_i}{h_n}\right)-\mathbb E\left[h_n^{-1}K\left(\frac{x-X_i}{h_n}\right)\right]\right).
\end{align*}
Equivalently,
\begin{align*}
\sum_{i=1}^{n}Y_{n,i}=\sqrt{nh_n}\left(\frac{1}{nh_n}\sum_{i=1}^{n}K\left(\frac{x-X_i}{h_n}\right)-\mathbb E[\hat f_{n,h_n}(x)]\right).
\end{align*}
By the definition of $\hat f_{n,h_n}(x)$,
\begin{align*}
\sum_{i=1}^{n}Y_{n,i}=\sqrt{nh_n}\left(\hat f_{n,h_n}(x)-\mathbb E[\hat f_{n,h_n}(x)]\right).
\end{align*}
Thus it suffices to prove a [central limit theorem](/theorems/521) for the triangular array $(Y_{n,i})_{1\le i\le n}$.
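The algebraic identity above can be checked numerically. The following is a minimal sketch, not part of the proof: the Epanechnikov kernel, the standard normal density, the fixed sample, and every function name are assumptions made for illustration only, and $\mathbb E[W_{n,1}]$ is approximated by a midpoint rule over the assumed kernel support $[-1,1]$.

```python
import math

def epanechnikov(t):
    """Assumed example kernel: bounded, supported on [-1, 1], integrates to 1."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def normal_density(u):
    """Assumed example density f (standard normal)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, sample, h, kernel=epanechnikov):
    """Kernel estimator: (1 / (n h)) * sum_i K((x - X_i) / h)."""
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)

def mean_w(x, h, kernel=epanechnikov, density=normal_density, grid=20000):
    """Midpoint-rule approximation of E[W] = int K(t) f(x - h t) dt over [-1, 1]."""
    dt = 2.0 / grid
    total = 0.0
    for j in range(grid):
        t = -1.0 + (j + 0.5) * dt
        total += kernel(t) * density(x - h * t)
    return total * dt

# Hypothetical fixed sample, used only to compare the two sides of the identity.
sample = [-1.3, -0.4, -0.1, 0.0, 0.2, 0.35, 0.9, 1.7]
x, h = 0.0, 0.5
n = len(sample)
ew = mean_w(x, h)

# Left side: sum of Y_{n,i} = sqrt(h / n) * (W_{n,i} - E[W_{n,i}]).
lhs = sum(math.sqrt(h / n) * (epanechnikov((x - xi) / h) / h - ew) for xi in sample)
# Right side: sqrt(n h) * (f_hat(x) - E[f_hat(x)]), using E[f_hat(x)] = E[W_{n,1}].
rhs = math.sqrt(n * h) * (kde(x, sample, h) - ew)
print(lhs, rhs)
```

The two printed values should agree up to floating-point rounding, because the identity is purely algebraic and does not depend on how accurately `ew` approximates the true expectation.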
[/step]
[step:Compute the limiting variance of the triangular array]
Let
\begin{align*}
\sigma_n^2:=\sum_{i=1}^{n}\operatorname{Var}(Y_{n,i}).
\end{align*}
Since the $Y_{n,i}$ have the same distribution for fixed $n$,
\begin{align*}
\sigma_n^2=n\operatorname{Var}(Y_{n,1})=h_n\operatorname{Var}(W_{n,1}).
\end{align*}
We compute the two terms in this variance. Since $X_1$ has density $f$ with respect to $\mathcal L^1$,
\begin{align*}
\mathbb E[W_{n,1}]
&=\int_{\mathbb R} h_n^{-1}K\left(\frac{x-u}{h_n}\right)f(u)\,d\mathcal L^1(u).
\end{align*}
Under the substitution $t=(x-u)/h_n$, equivalently $u=x-h_nt$, the one-dimensional Lebesgue measure transforms as $d\mathcal L^1(u)=h_n\,d\mathcal L^1(t)$, and the domain $\mathbb R$ is mapped onto $\mathbb R$. Therefore
\begin{align*}
\mathbb E[W_{n,1}]
&=\int_{\mathbb R}K(t)f(x-h_nt)\,d\mathcal L^1(t).
\end{align*}
Similarly,
\begin{align*}
h_n\mathbb E[W_{n,1}^2]
=h_n\int_{\mathbb R}h_n^{-2}K\left(\frac{x-u}{h_n}\right)^2f(u)\,d\mathcal L^1(u).
\end{align*}
Applying the same substitution gives
\begin{align*}
h_n\mathbb E[W_{n,1}^2]
=\int_{\mathbb R}K(t)^2f(x-h_nt)\,d\mathcal L^1(t).
\end{align*}
Choose $A>0$ such that $\operatorname{supp}K\subset[-A,A]$. Since $f$ is continuous at $x$, there exist $\delta>0$ and $B<\infty$ such that $f(y)\le B$ whenever $|y-x|<\delta$. For all sufficiently large $n$, $h_nA<\delta$, so on $\operatorname{supp}K$ we have $f(x-h_nt)\le B$. Moreover, $f(x-h_nt)\to f(x)$ for each fixed $t\in\mathbb R$, because $h_n\to0$ and $f$ is continuous at $x$. Since $K$ is bounded and supported in $[-A,A]$, the functions $t\mapsto B K(t)^2$ and $t\mapsto B|K(t)|$ are integrable with respect to $\mathcal L^1$. The dominated convergence theorem, applied first with dominating function $B K(t)^2$ and then with dominating function $B|K(t)|$, gives
\begin{align*}
\int_{\mathbb R}K(t)^2f(x-h_nt)\,d\mathcal L^1(t)
\to f(x)\int_{\mathbb R}K(t)^2\,d\mathcal L^1(t)
=f(x)R(K),
\end{align*}
and
\begin{align*}
\int_{\mathbb R}K(t)f(x-h_nt)\,d\mathcal L^1(t)
\to f(x)\int_{\mathbb R}K(t)\,d\mathcal L^1(t).
\end{align*}
Hence $\mathbb E[W_{n,1}]$ is bounded as $n\to\infty$, and because $h_n\to0$,
\begin{align*}
h_n(\mathbb E[W_{n,1}])^2\to0.
\end{align*}
Recall that $\sigma_n^2=h_n\operatorname{Var}(W_{n,1})$. The identity $\operatorname{Var}(Z)=\mathbb E[Z^2]-(\mathbb E[Z])^2$, applied to $Z=W_{n,1}$, gives
\begin{align*}
\sigma_n^2=h_n\mathbb E[W_{n,1}^2]-h_n(\mathbb E[W_{n,1}])^2.
\end{align*}
Therefore
\begin{align*}
\sigma_n^2\to f(x)R(K).
\end{align*}
[guided]
The variance computation is the point where the scaling $\sqrt{nh_n}$ is determined. We need the total variance of the triangular array to converge to a finite non-zero quantity, and the natural candidate is the local value of the density times the squared $L^2$ norm of the kernel, namely $f(x)R(K)$.
For each $n\in\mathbb N$, define
\begin{align*}
\sigma_n^2:=\sum_{i=1}^{n}\operatorname{Var}(Y_{n,i}).
\end{align*}
Because $X_1,\dots,X_n$ are identically distributed, the random variables $Y_{n,1},\dots,Y_{n,n}$ also have the same distribution. Therefore
\begin{align*}
\sigma_n^2=n\operatorname{Var}(Y_{n,1})=n\operatorname{Var}\left(\sqrt{\frac{h_n}{n}}W_{n,1}\right)=h_n\operatorname{Var}(W_{n,1}),
\end{align*}
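where the second equality holds because subtracting the constant $\sqrt{h_n/n}\,\mathbb E[W_{n,1}]$ does not change the variance, and the third because $\operatorname{Var}(cZ)=c^2\operatorname{Var}(Z)$ for every constant $c\in\mathbb R$.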
We now compute $\mathbb E[W_{n,1}]$ and $\mathbb E[W_{n,1}^2]$ from the density of $X_1$. Since $X_1$ has density $f$ with respect to $\mathcal L^1$,
\begin{align*}
\mathbb E[W_{n,1}]
&=\int_{\mathbb R} h_n^{-1}K\left(\frac{x-u}{h_n}\right)f(u)\,d\mathcal L^1(u).
\end{align*}
Use the substitution $t=(x-u)/h_n$, so that $u=x-h_nt$. The map $t\mapsto x-h_nt$ sends $\mathbb R$ onto $\mathbb R$, and the one-dimensional Lebesgue measure transforms by
\begin{align*}
d\mathcal L^1(u)=h_n\,d\mathcal L^1(t).
\end{align*}
Thus
\begin{align*}
\mathbb E[W_{n,1}]
&=\int_{\mathbb R}K(t)f(x-h_nt)\,d\mathcal L^1(t).
\end{align*}
The same substitution applied to the second moment starts from
\begin{align*}
h_n\mathbb E[W_{n,1}^2]
=h_n\int_{\mathbb R}h_n^{-2}K\left(\frac{x-u}{h_n}\right)^2f(u)\,d\mathcal L^1(u).
\end{align*}
After substituting $t=(x-u)/h_n$, this becomes
\begin{align*}
h_n\mathbb E[W_{n,1}^2]
=\int_{\mathbb R}K(t)^2f(x-h_nt)\,d\mathcal L^1(t).
\end{align*}
Now we justify the limiting passage. Since $K$ has compact support, choose $A>0$ such that $\operatorname{supp}K\subset[-A,A]$. Since $f$ is continuous at $x$, it is bounded in a neighbourhood of $x$: there exist $\delta>0$ and $B<\infty$ such that $f(y)\le B$ whenever $|y-x|<\delta$. For all sufficiently large $n$, $h_nA<\delta$. Hence, whenever $t\in\operatorname{supp}K$, we have
\begin{align*}
|x-h_nt-x|\le h_nA<\delta,
\end{align*}
and therefore $f(x-h_nt)\le B$.
The functions $t\mapsto K(t)^2f(x-h_nt)$ are then dominated by the integrable function $t\mapsto B K(t)^2$ supported on $[-A,A]$. Also $f(x-h_nt)\to f(x)$ for each fixed $t\in\mathbb R$, because $h_n\to0$ and $f$ is continuous at $x$. By the dominated convergence theorem,
\begin{align*}
\int_{\mathbb R}K(t)^2f(x-h_nt)\,d\mathcal L^1(t)
\to f(x)\int_{\mathbb R}K(t)^2\,d\mathcal L^1(t)
=f(x)R(K).
\end{align*}
For the first moment, the possible sign changes of $K$ require domination by the absolute value. Since $K$ is bounded and compactly supported, the function $t\mapsto B|K(t)|$ is integrable with respect to $\mathcal L^1$, and
\begin{align*}
|K(t)f(x-h_nt)|\le B|K(t)|
\end{align*}
for all sufficiently large $n$ and all $t\in\operatorname{supp}K$. Applying the dominated convergence theorem again gives
\begin{align*}
\int_{\mathbb R}K(t)f(x-h_nt)\,d\mathcal L^1(t)
\to f(x)\int_{\mathbb R}K(t)\,d\mathcal L^1(t).
\end{align*}
In particular, $\mathbb E[W_{n,1}]$ is bounded as $n\to\infty$. Therefore
\begin{align*}
h_n(\mathbb E[W_{n,1}])^2\to0,
\end{align*}
because $h_n\to0$. Finally,
\begin{align*}
\sigma_n^2=h_n\operatorname{Var}(W_{n,1}).
\end{align*}
Using the variance identity gives
\begin{align*}
\sigma_n^2=h_n\mathbb E[W_{n,1}^2]-h_n(\mathbb E[W_{n,1}])^2.
\end{align*}
Combining the two limits proved above,
\begin{align*}
\sigma_n^2\to f(x)R(K).
\end{align*}
This proves that the total variance of the centred triangular array converges to the variance appearing in the claimed normal limit.
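The convergence $\sigma_n^2\to f(x)R(K)$ can also be observed numerically. The sketch below is only an illustration under stated assumptions: it takes the Epanechnikov kernel (for which $R(K)=3/5$) and the standard normal density as example choices, evaluates the two integrals in the variance identity by a midpoint rule, and uses hypothetical helper names throughout.

```python
import math

def epanechnikov(t):
    """Assumed example kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def normal_density(u):
    """Assumed example density f (standard normal)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def midpoint(g, a, b, grid=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dt = (b - a) / grid
    return sum(g(a + (j + 0.5) * dt) for j in range(grid)) * dt

def sigma2(x, h):
    """h * Var(W) = int K^2 f(x - h t) dt - h * (int K f(x - h t) dt)^2."""
    second = midpoint(lambda t: epanechnikov(t) ** 2 * normal_density(x - h * t), -1.0, 1.0)
    first = midpoint(lambda t: epanechnikov(t) * normal_density(x - h * t), -1.0, 1.0)
    return second - h * first ** 2

x = 0.0
roughness = midpoint(lambda t: epanechnikov(t) ** 2, -1.0, 1.0)  # R(K) for this kernel
limit = normal_density(x) * roughness                            # f(x) R(K)
# Distance between sigma_n^2 and its limit for a decreasing sequence of bandwidths.
gaps = [abs(sigma2(x, h) - limit) for h in (0.5, 0.1, 0.02, 0.004)]
print(limit, gaps)
```

In this example the gap shrinks as the bandwidth decreases, and for small $h$ it is dominated by the term $h_n(\mathbb E[W_{n,1}])^2$, which is of order $h_n$.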
[/guided]
[/step]
[step:Verify the Lindeberg condition from the uniform boundedness of the kernel]
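Let $M:=\sup_{t\in\mathbb R}|K(t)|<\infty$. From the preceding step, there is a constant $C_0<\infty$ such that $|\mathbb E[W_{n,1}]|\le C_0$ for all sufficiently large $n$; since the $X_i$ are identically distributed, $\mathbb E[W_{n,i}]=\mathbb E[W_{n,1}]$ for every $i$. By the triangle inequality, for all sufficiently large $n$ and all $1\le i\le n$,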
\begin{align*}
|Y_{n,i}|\le \sqrt{\frac{h_n}{n}}\left(|W_{n,i}|+|\mathbb E[W_{n,i}]|\right).
\end{align*}
Using
\begin{align*}
|W_{n,i}|\le \frac{M}{h_n}
\end{align*}
gives
\begin{align*}
|Y_{n,i}|\le \sqrt{\frac{h_n}{n}}\left(\frac{M}{h_n}+C_0\right).
\end{align*}
Equivalently,
\begin{align*}
|Y_{n,i}|\le\frac{M}{\sqrt{nh_n}}+C_0\sqrt{\frac{h_n}{n}}.
\end{align*}
Since
\begin{align*}
nh_n\to\infty
\end{align*}
and
\begin{align*}
\frac{h_n}{n}\to0,
\end{align*}
the right-hand side tends to $0$. Therefore, for every $\varepsilon>0$, there exists $N_\varepsilon\in\mathbb N$ such that $|Y_{n,i}|\le\varepsilon$ for all $n\ge N_\varepsilon$ and all $1\le i\le n$. Consequently,
\begin{align*}
\sum_{i=1}^{n}\mathbb E\left[Y_{n,i}^2\mathbb{1}_{\{|Y_{n,i}|>\varepsilon\}}\right]=0
\end{align*}
for all $n\ge N_\varepsilon$. In particular, this sum tends to $0$ as $n\to\infty$ for every $\varepsilon>0$, which is the Lindeberg condition for the triangular array $(Y_{n,i})_{1\le i\le n}$.
[guided]
We need to verify that no single summand in the triangular array can contribute a non-negligible jump. Let
\begin{align*}
M:=\sup_{t\in\mathbb R}|K(t)|<\infty.
\end{align*}
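From the variance computation, the sequence $(\mathbb E[W_{n,1}])_{n\in\mathbb N}$ converges, so there is a constant $C_0<\infty$ such that $|\mathbb E[W_{n,1}]|\le C_0$ for all sufficiently large $n$. Because the $X_i$ are identically distributed, $\mathbb E[W_{n,i}]=\mathbb E[W_{n,1}]$ for every $i$, so the same bound applies to each summand. For such $n$ and all $1\le i\le n$, the triangle inequality gives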
\begin{align*}
|Y_{n,i}|\le \sqrt{\frac{h_n}{n}}\left(|W_{n,i}|+|\mathbb E[W_{n,i}]|\right).
\end{align*}
The boundedness of $K$ gives
\begin{align*}
|W_{n,i}|=h_n^{-1}\left|K\left(\frac{x-X_i}{h_n}\right)\right|\le \frac{M}{h_n}.
\end{align*}
Therefore
\begin{align*}
|Y_{n,i}|\le\frac{M}{\sqrt{nh_n}}+C_0\sqrt{\frac{h_n}{n}}.
\end{align*}
The first term tends to $0$ because $nh_n\to\infty$, and the second term tends to $0$ because $h_n\to0$ while $n\to\infty$. Thus for every $\varepsilon>0$ there exists $N_\varepsilon\in\mathbb N$ such that $|Y_{n,i}|\le\varepsilon$ for all $n\ge N_\varepsilon$ and all $1\le i\le n$. Hence the event $\{|Y_{n,i}|>\varepsilon\}$ is empty for every row index $i$ once $n\ge N_\varepsilon$, and so
\begin{align*}
\sum_{i=1}^{n}\mathbb E\left[Y_{n,i}^2\mathbb{1}_{\{|Y_{n,i}|>\varepsilon\}}\right]=0.
\end{align*}
This proves the Lindeberg condition.
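To see how quickly the deterministic bound on $|Y_{n,i}|$ decays, one can evaluate it for a concrete bandwidth sequence. The sketch below assumes $M=0.75$ (the supremum of the Epanechnikov kernel), $C_0=1$, and $h_n=n^{-1/5}$; these values and the function names are illustrative assumptions, not quantities fixed by the proof.

```python
import math

def summand_bound(n, h, m, c0):
    """Deterministic bound M / sqrt(n h) + C_0 * sqrt(h / n) on |Y_{n,i}|."""
    return m / math.sqrt(n * h) + c0 * math.sqrt(h / n)

# Assumed example values: M = 0.75, C_0 = 1, bandwidth h_n = n^(-1/5).
m, c0 = 0.75, 1.0
bounds = [summand_bound(n, n ** -0.2, m, c0) for n in (10**2, 10**4, 10**6, 10**8)]

def some_n_below(eps, m, c0, exponent=0.2):
    """Return a power of two n whose bound is at most eps, for h_n = n^(-exponent).

    For 0 < exponent < 1 the bound is decreasing in n, so it stays below eps
    for all larger n as well; the returned value is a valid (not minimal) N_eps.
    """
    n = 1
    while summand_bound(n, n ** -exponent, m, c0) > eps:
        n *= 2
    return n

n_eps = some_n_below(0.01, m, c0)
print(bounds, n_eps)
```

Once $n$ exceeds `n_eps`, every indicator $\mathbb{1}_{\{|Y_{n,i}|>\varepsilon\}}$ with $\varepsilon=0.01$ vanishes in this example, which is exactly the mechanism used in the argument above.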
[/guided]
[/step]
[step:Apply the triangular-array central limit theorem]
If $R(K)>0$, then, since $f(x)>0$, the variance limit computed in the second step is
\begin{align*}
\sigma^2:=f(x)R(K)>0.
\end{align*}
The triangular array $(Y_{n,i})_{1\le i\le n}$ is row-wise independent, centred, satisfies the Lindeberg condition, and has total variance $\sigma_n^2\to\sigma^2$. By the Lindeberg-Feller [central limit theorem](/theorems/1848) for triangular arrays applied to this row-wise independent centred array,
\begin{align*}
\sum_{i=1}^{n}Y_{n,i}\xrightarrow{d}\mathcal N(0,\sigma^2)
=\mathcal N(0,f(x)R(K)).
\end{align*}
Using the identity from the first step,
\begin{align*}
\sqrt{nh_n}\left(\hat f_{n,h_n}(x)-\mathbb E[\hat f_{n,h_n}(x)]\right)
\xrightarrow{d}\mathcal N(0,f(x)R(K)).
\end{align*}
If $R(K)=0$, then $K=0$ $\mathcal L^1$-a.e. Since each $X_i$ has a density with respect to $\mathcal L^1$, the random variable $K((x-X_i)/h_n)$ is $0$ almost surely for every $i$ and $n$. Hence the scaled centred estimator is $0$ almost surely for every $n$, and the same conclusion holds with the degenerate normal law $\mathcal N(0,0)$.
[guided]
There are two cases, depending on whether the limiting variance is positive. First suppose $R(K)>0$. Since $f(x)>0$, define
\begin{align*}
\sigma^2:=f(x)R(K)>0.
\end{align*}
The preceding steps verify exactly the hypotheses of the Lindeberg-Feller central limit theorem for triangular arrays: the array is row-wise independent, each summand is centred, the Lindeberg condition holds for every $\varepsilon>0$, and the total variance satisfies $\sigma_n^2\to\sigma^2$. Therefore
\begin{align*}
\sum_{i=1}^{n}Y_{n,i}\xrightarrow{d}\mathcal N(0,\sigma^2)
=\mathcal N(0,f(x)R(K)).
\end{align*}
The first step identified this sum with the scaled centred estimator, so
\begin{align*}
\sqrt{nh_n}\left(\hat f_{n,h_n}(x)-\mathbb E[\hat f_{n,h_n}(x)]\right)
\xrightarrow{d}\mathcal N(0,f(x)R(K)).
\end{align*}
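The nondegenerate case can be probed by a small Monte Carlo experiment. The sketch below is an illustration under assumptions that are not part of the proof: standard normal data, the Epanechnikov kernel (so $R(K)=3/5$), the point $x=0$, the bandwidth $h_n=n^{-1/3}$, and a fixed seed. It compares the empirical variance of the scaled centred estimator with $f(x)R(K)$; at finite $n$ only rough agreement should be expected.

```python
import math
import random

def epanechnikov(t):
    """Assumed example kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def normal_density(u):
    """Assumed example density f (standard normal)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def expected_kde(x, h, grid=4000):
    """Midpoint-rule value of E[f_hat(x)] = int K(t) f(x - h t) dt over [-1, 1]."""
    dt = 2.0 / grid
    total = 0.0
    for j in range(grid):
        t = -1.0 + (j + 0.5) * dt
        total += epanechnikov(t) * normal_density(x - h * t)
    return total * dt

def simulate(x, n, h, reps, seed=0):
    """Draw reps copies of sqrt(n h) * (f_hat(x) - E[f_hat(x)]) from N(0, 1) samples."""
    rng = random.Random(seed)
    centre = expected_kde(x, h)
    scale = math.sqrt(n * h)
    draws = []
    for _ in range(reps):
        total = 0.0
        for _ in range(n):
            total += epanechnikov((x - rng.gauss(0.0, 1.0)) / h)
        draws.append(scale * (total / (n * h) - centre))
    return draws

x, n, reps = 0.0, 1000, 400
h = n ** (-1.0 / 3.0)
draws = simulate(x, n, h, reps)
sample_mean = sum(draws) / reps
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (reps - 1)
target_var = normal_density(x) * 0.6  # f(x) R(K), with R(K) = 3/5 for this kernel
print(sample_mean, sample_var, target_var)
```

The empirical variance estimates $\sigma_n^2$ rather than its limit, so a moderate downward gap of order $h_n(\mathbb E[W_{n,1}])^2$ relative to `target_var` is consistent with the variance computation in the second step.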
If $R(K)=0$, then
\begin{align*}
\int_{\mathbb R}K(t)^2\,d\mathcal L^1(t)=0,
\end{align*}
so $K=0$ $\mathcal L^1$-a.e. Because each $X_i$ has a density with respect to $\mathcal L^1$, the transformed random variable $K((x-X_i)/h_n)$ is $0$ almost surely for every $i$ and $n$. Hence the scaled centred estimator is $0$ almost surely for every $n$, and the asserted convergence holds with the degenerate normal law $\mathcal N(0,0)$.
[/guided]
[/step]
[step:Replace the expectation by $f(x)$ when the scaled bias vanishes]
Assume now that, for some $s>0$,
\begin{align*}
\mathbb E[\hat f_{n,h_n}(x)]-f(x)=O(h_n^s)
\end{align*}
and that $\sqrt{nh_n}\,h_n^s\to0$. Then there are constants $C_b<\infty$ and $N_b\in\mathbb N$ such that, for all $n\ge N_b$,
\begin{align*}
\left|\mathbb E[\hat f_{n,h_n}(x)]-f(x)\right|\le C_bh_n^s.
\end{align*}
Multiplying by $\sqrt{nh_n}$ gives
\begin{align*}
\sqrt{nh_n}\left|\mathbb E[\hat f_{n,h_n}(x)]-f(x)\right|
\le C_b\sqrt{nh_n}\,h_n^s\to0.
\end{align*}
Now decompose
\begin{align*}
\sqrt{nh_n}\left(\hat f_{n,h_n}(x)-f(x)\right)=\sqrt{nh_n}\left(\hat f_{n,h_n}(x)-\mathbb E[\hat f_{n,h_n}(x)]\right)+\sqrt{nh_n}\left(\mathbb E[\hat f_{n,h_n}(x)]-f(x)\right),
\end{align*}
where the first term converges in distribution to $\mathcal N(0,f(x)R(K))$ by the previous step, and the second term is deterministic and converges to $0$. Hence, by Slutsky's theorem (here in its simplest form, the stability of convergence in distribution under addition of deterministic $o(1)$ terms),
\begin{align*}
\sqrt{nh_n}\left(\hat f_{n,h_n}(x)-f(x)\right)
\xrightarrow{d}\mathcal N(0,f(x)R(K)).
\end{align*}
This proves the bias-corrected centring statement and completes the proof.
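For a polynomial bandwidth $h_n=n^{-a}$, the condition $\sqrt{nh_n}\,h_n^s\to0$ reads $n^{(1-a)/2-as}\to0$, that is, $a>1/(2s+1)$, while $nh_n\to\infty$ requires $a<1$. The following sketch tabulates the scaled bias rate for a few exponents; the choice $s=2$ and the function name are assumptions made only for this illustration.

```python
def scaled_bias_rate(n, a, s):
    """sqrt(n h_n) * h_n^s for h_n = n^(-a), which equals n^((1 - a) / 2 - a * s)."""
    h = n ** (-a)
    return (n * h) ** 0.5 * h ** s

s = 2.0  # assumed bias exponent, chosen for illustration
for a in (0.2, 0.25, 1.0 / 3.0):
    # a = 0.2 is the boundary case 1 / (2 s + 1): the rate stays constant.
    # a = 0.25 and a = 1/3 lie above the boundary: the rate decays with n.
    print(a, [scaled_bias_rate(n, a, s) for n in (10**2, 10**4, 10**6)])
```

With $s=2$ the boundary exponent is $a=1/5$: at that rate the scaled bias does not vanish, so the replacement of $\mathbb E[\hat f_{n,h_n}(x)]$ by $f(x)$ in this step requires a slightly smaller bandwidth.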
[/step]