[proofplan]
We first compute the expectation of the kernel density estimator and rewrite it as the integral of $K(u)f(x-hu)$ over $u$. Since $K$ has compact support, only the values of $f$ in a bounded neighbourhood of $x$, which shrinks to $\{x\}$ as $h\to0^+$, enter the integral. We then apply [Taylor's theorem](/theorems/827) to $f(x-hu)$ through order $s$, use the kernel moment conditions to cancel the terms of orders $1,\dots,s-1$, leaving only the zeroth and $s$th terms, and control the Taylor remainder uniformly on the support of $K$ by continuity of $f^{(s)}$ at $x$.
[/proofplan]
[step:Rewrite the expectation as a localized convolution integral]
Let $R>0$ be such that $\operatorname{supp} K \subset [-R,R]$. By the definition of a kernel of order $s$, $K\in L^1(\mathbb R)$, its zeroth moment equals $1$, and its moments of orders $1,\dots,s-1$ vanish. For each $h>0$, define the kernel density estimator $\hat f_{n,h}:\mathbb R\to\mathbb R$ by
\begin{align*}
\hat f_{n,h}(t) = \frac{1}{nh}\sum_{i=1}^n K\!\left(\frac{t-X_i}{h}\right), \qquad t\in\mathbb R.
\end{align*}
Fix $h>0$ small enough that $x+[-hR,hR]$ is contained in a neighbourhood on which $f$ is continuous. Then $f$ is bounded on the compact interval $x+[-hR,hR]$. Moreover, the map $\kappa_h:\mathbb R\to\mathbb R$ defined by $\kappa_h(y)=K((x-y)/h)$ belongs to $L^1(\mathbb R)$, because the affine change of variables $u=(x-y)/h$ gives
\begin{align*}
\int_{\mathbb R}|\kappa_h(y)|\,d\mathcal L^1(y)=h\int_{\mathbb R}|K(u)|\,d\mathcal L^1(u)<\infty.
\end{align*}
Thus $K((x-\cdot)/h)f$ is integrable with respect to $\mathcal L^1$ because its support is contained in $x+[-hR,hR]$ and $f$ is bounded there.
Since the random variables $X_1,\dots,X_n$ are identically distributed with density $f$, linearity of expectation gives
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] = \frac{1}{nh}\sum_{i=1}^n \mathbb E\!\left[K\!\left(\frac{x-X_i}{h}\right)\right].
\end{align*}
Since all summands have the same expectation and each $X_i$ has density $f$ with respect to $\mathcal{L}^1$,
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] = \frac{1}{h}\int_{\mathbb{R}} K\!\left(\frac{x-y}{h}\right)f(y)\,d\mathcal{L}^1(y).
\end{align*}
For $h>0$, use the [change of variables formula](/theorems/22) with $u=(x-y)/h$, equivalently $y=x-hu$. Under this affine substitution, $d\mathcal{L}^1(y)=h\,d\mathcal{L}^1(u)$, and the domain $\mathbb{R}$ is mapped onto $\mathbb{R}$. Hence
\begin{align*}
\mathbb E[\hat f_{n,h}(x)]
= \int_{\mathbb{R}} K(u)f(x-hu)\,d\mathcal{L}^1(u).
\end{align*}
[guided]
Let $R>0$ be such that $\operatorname{supp} K \subset [-R,R]$. The expectation reduces to a one-dimensional integral because each $X_i$ has density $f$ with respect to $\mathcal{L}^1$, but first we state exactly which estimator is being averaged. For each $h>0$, define the map $\hat f_{n,h}:\mathbb R\to\mathbb R$ by
\begin{align*}
\hat f_{n,h}(t) = \frac{1}{nh}\sum_{i=1}^n K\!\left(\frac{t-X_i}{h}\right), \qquad t\in\mathbb R.
\end{align*}
Now we check that the integrand appearing in the expectation is integrable. By the definition of a kernel of order $s$, $K\in L^1(\mathbb R)$, its zeroth moment equals $1$, and its moments of orders $1,\dots,s-1$ vanish. For $h>0$ small enough that $x+[-hR,hR]$ is contained in a neighbourhood on which $f$ is continuous, the function $f$ is bounded on the compact interval $x+[-hR,hR]$. Also $K((x-\cdot)/h)$ is supported in $x+[-hR,hR]$. To see that the rescaled kernel is integrable, define $\kappa_h:\mathbb R\to\mathbb R$ by $\kappa_h(y)=K((x-y)/h)$. The affine substitution $u=(x-y)/h$ gives
\begin{align*}
\int_{\mathbb R}|\kappa_h(y)|\,d\mathcal L^1(y)=h\int_{\mathbb R}|K(u)|\,d\mathcal L^1(u)<\infty.
\end{align*}
Thus $K((x-\cdot)/h)f$ is integrable with respect to $\mathcal L^1$, and for each index $i \in \{1,\dots,n\}$,
\begin{align*}
\mathbb E\!\left[K\!\left(\frac{x-X_i}{h}\right)\right]
= \int_{\mathbb{R}} K\!\left(\frac{x-y}{h}\right)f(y)\,d\mathcal{L}^1(y).
\end{align*}
Summing over $i$ and using linearity of expectation in the definition of $\hat f_{n,h}(x)$, we get
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] = \frac{1}{nh}\sum_{i=1}^n \int_{\mathbb{R}} K\!\left(\frac{x-y}{h}\right)f(y)\,d\mathcal{L}^1(y).
\end{align*}
Because the $n$ summands are identical, this becomes
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] = \frac{1}{h}\int_{\mathbb{R}} K\!\left(\frac{x-y}{h}\right)f(y)\,d\mathcal{L}^1(y).
\end{align*}
Now apply the [change of variables formula](/theorems/22) with $u=(x-y)/h$, so $y=x-hu$. Because $h>0$, the one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) transforms by
\begin{align*}
d\mathcal{L}^1(y)=h\,d\mathcal{L}^1(u).
\end{align*}
The affine map $u \mapsto x-hu$ maps $\mathbb{R}$ onto $\mathbb{R}$, so the integration domain remains the whole real line. Substituting gives
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] = \frac{1}{h}\int_{\mathbb{R}} K(u)f(x-hu)\,h\,d\mathcal{L}^1(u).
\end{align*}
Cancelling the scalar factor $h$ yields
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] = \int_{\mathbb{R}} K(u)f(x-hu)\,d\mathcal{L}^1(u).
\end{align*}
This is the form in which the kernel moments can act directly on the Taylor expansion of $f$ at $x$.
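As an illustration only, suppose $K$ is the box kernel $K=\tfrac12\mathbf 1_{[-1,1]}$. This choice is an assumption made for the example and is not required by the argument. It has $R=1$, zeroth moment $1$ and vanishing first moment, so it is a kernel of order $2$. Undoing the substitution $y=x-hu$ in the localized formula gives
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] = \frac12\int_{-1}^{1} f(x-hu)\,d\mathcal L^1(u) = \frac{1}{2h}\int_{x-h}^{x+h} f(y)\,d\mathcal L^1(y).
\end{align*}
In this special case the expectation is the average of $f$ over $[x-h,x+h]$, which makes visible that only values of $f$ within distance $hR$ of $x$ contribute.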
[/guided]
[/step]
[step:Apply Taylor's theorem uniformly on the support of the kernel]
Let $I\subset\mathbb R$ be an open interval containing $x$ on which $f$ has $s$ continuous derivatives. Choose $\delta>0$ such that $(x-\delta,x+\delta)\subset I$. For $0<h<\delta/R$ and every $u \in \operatorname{supp}K$, we have $|hu|\le hR<\delta$, so the point $x-hu$ lies in $I$. Hence the one-dimensional Taylor theorem with integral remainder, with order parameter $s-1$, can be applied to the restriction $f|_I:I\to\mathbb R$ on the line segment from $x$ to $x-hu$. For $u\in\operatorname{supp}K$, it gives
\begin{align*}
f(x-hu)
= \sum_{k=0}^{s-1} \frac{(-hu)^k}{k!}f^{(k)}(x)
+ \frac{(-hu)^s}{(s-1)!}\int_0^1 (1-\theta)^{s-1} f^{(s)}(x-\theta hu)\,d\mathcal L^1(\theta).
\end{align*}
Adding and subtracting the term $\frac{(-hu)^s}{s!}f^{(s)}(x)$, and using
\begin{align*}
\int_0^1 (1-\theta)^{s-1}\,d\mathcal L^1(\theta)=\frac{1}{s},
\end{align*}
we obtain
\begin{align*}
f(x-hu)
= \sum_{k=0}^s \frac{(-hu)^k}{k!}f^{(k)}(x) + \rho_h(u),
\end{align*}
where the remainder function $\rho_h:\mathbb R \to \mathbb{R}$ is defined on $\operatorname{supp}K$ by
\begin{align*}
\rho_h(u)=\frac{(-hu)^s}{(s-1)!}\int_0^1 (1-\theta)^{s-1}\left(f^{(s)}(x-\theta hu)-f^{(s)}(x)\right)\,d\mathcal L^1(\theta),
\end{align*}
and by $\rho_h(u)=0$ for $u\notin\operatorname{supp}K$. On $\operatorname{supp}K$, it satisfies
\begin{align*}
|\rho_h(u)|
\leq \frac{h^s |u|^s}{(s-1)!}\int_0^1 (1-\theta)^{s-1}
\left|f^{(s)}(x-\theta hu)-f^{(s)}(x)\right|\,d\mathcal{L}^1(\theta).
\end{align*}
Define the modulus quantity
\begin{align*}
\omega(h) := \sup\left\{\left|f^{(s)}(z)-f^{(s)}(x)\right| : |z-x|\leq hR\right\}.
\end{align*}
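Since $f^{(s)}$ is continuous at $x$, $\omega(h)\to 0$ as $h\to0^+$. For $u\in\operatorname{supp}K$ and $\theta\in[0,1]$, the point $z=x-\theta hu$ satisfies $|z-x|\le hR$, so the difference inside the integral is at most $\omega(h)$, and $\int_0^1(1-\theta)^{s-1}\,d\mathcal L^1(\theta)=1/s$. Thus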
\begin{align*}
|\rho_h(u)| \leq \frac{h^s |u|^s}{s!}\omega(h)
\end{align*}
for every $u \in \operatorname{supp}K$.
[guided]
Let $I\subset\mathbb R$ be an open interval containing $x$ on which $f$ has $s$ continuous derivatives. The reason for introducing $I$ is that Taylor's theorem must be applied on an interval containing both the expansion point $x$ and the evaluation point $x-hu$. Choose $\delta>0$ such that $(x-\delta,x+\delta)\subset I$. Since $R$ was chosen positive and $\operatorname{supp}K\subset[-R,R]$, if $0<h<\delta/R$ and $u\in\operatorname{supp}K$, then $|hu|\le hR<\delta$, so $x-hu\in I$.
The one-dimensional Taylor theorem with integral remainder, applied with order parameter $s-1$, applies to the map $f|_I:I\to\mathbb R$ because $f$ has $s$ continuous derivatives on $I$ and the segment from $x$ to $x-hu$ is contained in $I$. In this version the polynomial part stops at order $s-1$ and the integral remainder involves $f^{(s)}$, which is why $s$ continuous derivatives are needed. For each $u\in\operatorname{supp}K$, we add and subtract the $s$th Taylor term to write
\begin{align*}
f(x-hu)
= \sum_{k=0}^s \frac{(-hu)^k}{k!}f^{(k)}(x) + \rho_h(u),
\end{align*}
where the remainder function $\rho_h:\mathbb R\to\mathbb R$ is defined on $\operatorname{supp}K$ as the difference between the integral remainder of the expansion through order $s-1$ and the added $s$th term $\frac{(-hu)^s}{s!}f^{(s)}(x)$, and by $\rho_h(u)=0$ for $u\notin\operatorname{supp}K$. On $\operatorname{supp}K$, it satisfies
\begin{align*}
|\rho_h(u)|
\leq \frac{h^s |u|^s}{(s-1)!}\int_0^1 (1-\theta)^{s-1}
\left|f^{(s)}(x-\theta hu)-f^{(s)}(x)\right|\,d\mathcal{L}^1(\theta).
\end{align*}
Define
\begin{align*}
\omega(h) := \sup\left\{\left|f^{(s)}(z)-f^{(s)}(x)\right| : |z-x|\leq hR\right\}.
\end{align*}
For $u\in\operatorname{supp}K$ and $\theta\in[0,1]$, the point $z=x-\theta hu$ satisfies $|z-x|\le hR$, so the integrand is bounded above by $\omega(h)$. Therefore
\begin{align*}
|\rho_h(u)|
\leq \frac{h^s |u|^s\omega(h)}{(s-1)!}\int_0^1 (1-\theta)^{s-1}\,d\mathcal{L}^1(\theta).
\end{align*}
Since
\begin{align*}
\int_0^1 (1-\theta)^{s-1}\,d\mathcal{L}^1(\theta)=\frac{1}{s},
\end{align*}
we obtain
\begin{align*}
|\rho_h(u)| \leq \frac{h^s |u|^s}{s!}\omega(h).
\end{align*}
Finally, continuity of $f^{(s)}$ at $x$ gives $\omega(h)\to0$ as $h\to0^+$, which is the uniform smallness needed after integration against $K$.
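To see how the bound behaves under a stronger hypothesis, suppose (as an assumption beyond what is used in this proof) that $f^{(s)}$ is Lipschitz near $x$: there are hypothetical constants $L\ge0$ and $\eta>0$ with $|f^{(s)}(z)-f^{(s)}(x)|\le L|z-x|$ whenever $|z-x|\le\eta$. Then for $0<h<\min\{\delta,\eta\}/R$,
\begin{align*}
\omega(h)\le LhR, \qquad |\rho_h(u)|\le \frac{LR\,h^{s+1}|u|^s}{s!}\quad\text{for } u\in\operatorname{supp}K.
\end{align*}
Under this extra assumption the remainder is of order $h^{s+1}$ rather than merely of smaller order than $h^s$. The proof itself uses only continuity of $f^{(s)}$ at $x$.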
[/guided]
[/step]
[step:Use the kernel moment conditions to isolate the leading bias term]
For each $k\in\{0,\dots,s\}$, the function $u\mapsto u^kK(u)$ is integrable with respect to $\mathcal L^1$ because $K$ is compactly supported and $K\in L^1(\mathbb R)$. The remainder estimate gives $|K(u)\rho_h(u)|\leq h^s\omega(h)|u|^s|K(u)|/s!$ on $\operatorname{supp}K$, and the right-hand side is integrable. Hence linearity of the [Lebesgue integral](/page/Lebesgue%20Integral) allows the finite Taylor sum and the remainder term to be integrated separately. Substituting the Taylor expansion into the expectation formula gives
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] = \sum_{k=0}^s \frac{(-h)^k}{k!}f^{(k)}(x)
\int_{\mathbb{R}} u^kK(u)\,d\mathcal{L}^1(u)
+ \int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u).
\end{align*}
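The zeroth moment equals $1$, and the moments of orders $1,\dots,s-1$ vanish. With $\mu_s(K)=\int_{\mathbb R}u^sK(u)\,d\mathcal L^1(u)$ denoting the $s$th moment, and moving the $k=0$ term $f(x)$ to the left-hand side, we obtain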
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] - f(x)
= \frac{(-1)^s h^s}{s!}f^{(s)}(x)\mu_s(K)
+ \int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u).
\end{align*}
[guided]
We now use exactly the moment conditions encoded in the phrase ``kernel of order $s$''. First, for each $k\in\{0,\dots,s\}$, the function $u\mapsto u^kK(u)$ is integrable with respect to $\mathcal L^1$: the factor $|u|^k$ is bounded on the compact set $[-R,R]$, and $K\in L^1(\mathbb R)$.
The Taylor expansion from the previous step is valid for every $u\in\operatorname{supp}K$. Multiplying by $K(u)$ and integrating is legitimate because the finite Taylor terms are integrable, and the remainder estimate gives
\begin{align*}
|K(u)\rho_h(u)|\leq \frac{h^s\omega(h)}{s!}|u|^s|K(u)|
\end{align*}
on $\operatorname{supp}K$, with the right-hand side integrable with respect to $\mathcal L^1$. Hence linearity of the Lebesgue integral gives
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] = \sum_{k=0}^s \frac{(-h)^k}{k!}f^{(k)}(x)
\int_{\mathbb{R}} u^kK(u)\,d\mathcal{L}^1(u)
+ \int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u).
\end{align*}
The normalization condition gives
\begin{align*}
\int_{\mathbb{R}}K(u)\,d\mathcal L^1(u)=1,
\end{align*}
and the order-$s$ moment conditions give
\begin{align*}
\int_{\mathbb{R}}u^kK(u)\,d\mathcal L^1(u)=0,\qquad 1\leq k\leq s-1.
\end{align*}
Thus all Taylor terms except the constant term and the $s$th term disappear after integration. Using the definition
\begin{align*}
\mu_s(K)=\int_{\mathbb R}u^sK(u)\,d\mathcal L^1(u),
\end{align*}
we obtain
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] - f(x)
= \frac{(-1)^s h^s}{s!}f^{(s)}(x)\mu_s(K)
+ \int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u).
\end{align*}
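For a concrete instance, consider the polynomial kernel $K(u)=\tfrac{15}{32}(3-10u^2+7u^4)\mathbf 1_{[-1,1]}(u)$, chosen here purely as an example and not taken from the statement. Its odd moments vanish by symmetry, and a direct computation of the even moments gives
\begin{align*}
\int_{\mathbb R}K(u)\,d\mathcal L^1(u) &= \tfrac{15}{32}\cdot 2\left(3-\tfrac{10}{3}+\tfrac75\right)=1,\\
\int_{\mathbb R}u^2K(u)\,d\mathcal L^1(u) &= \tfrac{15}{32}\cdot 2\left(1-2+1\right)=0,\\
\int_{\mathbb R}u^4K(u)\,d\mathcal L^1(u) &= \tfrac{15}{32}\cdot 2\left(\tfrac35-\tfrac{10}{7}+\tfrac79\right)=-\tfrac{1}{21}.
\end{align*}
So this $K$ is a kernel of order $4$ with $\mu_4(K)=-\tfrac1{21}$. Assuming $f$ has four continuous derivatives near $x$, the identity above reads
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] - f(x) = -\frac{h^4}{504}f^{(4)}(x) + \int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u).
\end{align*}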
[/guided]
[/step]
[step:Show that the integrated Taylor remainder is $o(h^s)$]
Because $\operatorname{supp}K \subset [-R,R]$ and $K\in L^1(\mathbb R)$, the function $u \mapsto |u|^s|K(u)|$ is dominated by $R^s|K|$ and is therefore integrable with respect to $\mathcal{L}^1$. Using the triangle inequality for the Lebesgue integral,
\begin{align*}
\left|\int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u)\right| \leq \int_{\mathbb{R}} |K(u)|\,|\rho_h(u)|\,d\mathcal{L}^1(u).
\end{align*}
Using the remainder bound from the Taylor step on $\operatorname{supp}K$, and noting that $\rho_h(u)=0$ for $u\notin\operatorname{supp}K$ by definition, gives
\begin{align*}
\left|\int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u)\right| \leq \frac{h^s\omega(h)}{s!}
\int_{\mathbb{R}} |u|^s|K(u)|\,d\mathcal{L}^1(u).
\end{align*}
The integral on the right-hand side is finite and independent of $h$, while $\omega(h)\to0$. Hence
\begin{align*}
\int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u)=o(h^s).
\end{align*}
Combining this estimate with the previous identity yields
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] - f(x)
= \frac{(-1)^s h^s}{s!} f^{(s)}(x)\mu_s(K) + o(h^s),
\end{align*}
which is the asserted pointwise bias expansion.
[guided]
It remains to prove that the integrated Taylor remainder is of smaller order than $h^s$ as $h\to0^+$. Because $\operatorname{supp}K \subset [-R,R]$ and $K\in L^1(\mathbb R)$, the function $u\mapsto |u|^s|K(u)|$ is integrable with respect to $\mathcal L^1$.
Apply the triangle inequality for the Lebesgue integral to the remainder term:
\begin{align*}
\left|\int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u)\right| \leq \int_{\mathbb{R}} |K(u)|\,|\rho_h(u)|\,d\mathcal{L}^1(u).
\end{align*}
The pointwise Taylor remainder bound applies on $\operatorname{supp}K$. Outside $\operatorname{supp}K$, the integrand $|K(u)|\,|\rho_h(u)|$ vanishes because $\rho_h(u)=0$ there by definition, so the pointwise bound extends to an integral bound over $\mathbb R$:
\begin{align*}
\left|\int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u)\right| \leq \frac{h^s\omega(h)}{s!}
\int_{\mathbb{R}} |u|^s|K(u)|\,d\mathcal{L}^1(u).
\end{align*}
The integral
\begin{align*}
\int_{\mathbb{R}} |u|^s|K(u)|\,d\mathcal{L}^1(u)
\end{align*}
is finite and independent of $h$, while continuity of $f^{(s)}$ at $x$ gives $\omega(h)\to0$ as $h\to0^+$. Therefore
\begin{align*}
\int_{\mathbb{R}} K(u)\rho_h(u)\,d\mathcal{L}^1(u)=o(h^s).
\end{align*}
Substituting this into the identity from the moment-cancellation step yields
\begin{align*}
\mathbb E[\hat f_{n,h}(x)] - f(x)
= \frac{(-1)^s h^s}{s!} f^{(s)}(x)\mu_s(K) + o(h^s),
\end{align*}
which is the desired pointwise bias expansion.
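As a sanity check in a case where everything can be computed exactly, take (purely for illustration) $K(u)=\tfrac34(1-u^2)\mathbf 1_{[-1,1]}(u)$, let the density $f$ be the same function, and take $x=0$ and $0<h<1$. Here $R=1$, the zeroth moment is $1$, the first moment vanishes by symmetry, and $\mu_2(K)=\tfrac34\left(\tfrac23-\tfrac25\right)=\tfrac15$, so $s=2$. Since $f(-hu)=\tfrac34(1-h^2u^2)$ for $|u|\le1$,
\begin{align*}
\mathbb E[\hat f_{n,h}(0)] = \frac34\int_{\mathbb R}K(u)\left(1-h^2u^2\right)d\mathcal L^1(u) = \frac34\left(1-\frac{h^2}{5}\right),
\qquad
\mathbb E[\hat f_{n,h}(0)]-f(0) = -\frac{3h^2}{20}.
\end{align*}
The expansion predicts the leading term
\begin{align*}
\frac{(-1)^2h^2}{2!}f''(0)\mu_2(K) = \frac{h^2}{2}\cdot\left(-\frac32\right)\cdot\frac15 = -\frac{3h^2}{20},
\end{align*}
which agrees with the exact bias. In this example the remainder vanishes identically, because $f''$ is constant on $(-1,1)$ and hence $\omega(h)=0$.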
[/guided]
[/step]