[guided]We first reduce the expectation of the $U$-statistic to a deterministic integral. Let $(\Omega,\mathcal F,\mathbb P)$ be the probability space on which the independent identically distributed real-valued random variables $X_1,\dots,X_n:\Omega\to\mathbb R$ are defined, each with density $f:\mathbb R\to[0,\infty)$. Let $L^2(\mathbb R)$ denote the Hilbert space of real-valued $\mathcal L^1$-square-integrable functions on $\mathbb R$, modulo equality $\mathcal L^1$-a.e., with norm
\begin{align*}
\|g\|_{L^2(\mathbb R)}:=\left(\int_{\mathbb R}|g(x)|^2\,d\mathcal L^1(x)\right)^{1/2}.
\end{align*}
Let $H^2(\mathbb R)$ denote the Sobolev space of all $g\in L^2(\mathbb R)$ whose first and second weak derivatives belong to $L^2(\mathbb R)$. Define the quadratic functional $Q(f)\in\mathbb R$ by
\begin{align*}
Q(f):=\int_{\mathbb R}f(x)^2\,d\mathcal L^1(x).
\end{align*}
For $h>0$, define the rescaled kernel $K_h:\mathbb R\to\mathbb R$ by
\begin{align*}
K_h(z):=h^{-1}K(z/h).
\end{align*}
Define the kernel quadratic estimator $\widehat Q_{n,h}:\Omega\to\mathbb R$ by
\begin{align*}
\widehat Q_{n,h}:=\frac{2}{n(n-1)}\sum_{1\le i<j\le n}K_h(X_i-X_j).
\end{align*}
The first issue is integrability: we need the expectation and the changes in the order of integration to be legitimate. Since $K_h$ is a rescaling of $K$, the affine substitution $z=hu$ gives
\begin{align*}
\|K_h\|_{L^1(\mathbb R)}=\int_{\mathbb R}|K_h(z)|\,d\mathcal L^1(z)=\int_{\mathbb R}|K(u)|\,d\mathcal L^1(u)<\infty.
\end{align*}
Since $f\in H^2(\mathbb R)\subset L^2(\mathbb R)$, Tonelli's theorem applied to the non-negative integrand and Young's convolution bound for the pairing with $f$ give
\begin{align*}
\int_{\mathbb R}\int_{\mathbb R}|K_h(x-y)|f(x)f(y)\,d\mathcal L^1(y)\,d\mathcal L^1(x)
\le \|K\|_{L^1(\mathbb R)}\|f\|_{L^2(\mathbb R)}^2<\infty.
\end{align*}
Thus the signed kernel integral is absolutely integrable, so Fubini's theorem applies.
By independence and identical distribution, each pair $(X_i,X_j)$ with $i<j$ has joint density $(x,y)\mapsto f(x)f(y)$ on $\mathbb R^2$. Therefore
\begin{align*}
\mathbb E[\widehat Q_{n,h}]=\frac{2}{n(n-1)}\sum_{1\le i<j\le n}\mathbb E[K_h(X_i-X_j)].
\end{align*}
There are $n(n-1)/2$ pairs, and every pair has the same expectation, so the coefficient cancels the number of summands and yields
\begin{align*}
\mathbb E[\widehat Q_{n,h}]=\int_{\mathbb R}\int_{\mathbb R}K_h(x-y)f(x)f(y)\,d\mathcal L^1(y)\,d\mathcal L^1(x).
\end{align*}
For fixed $x\in\mathbb R$, make the affine substitution $u=(x-y)/h$, equivalently $y=x-hu$. Under this substitution, $d\mathcal L^1(y)=h\,d\mathcal L^1(u)$ and $K_h(x-y)=h^{-1}K(u)$, so the factors of $h$ cancel. Hence
\begin{align*}
\mathbb E[\widehat Q_{n,h}]=\int_{\mathbb R}K(u)\int_{\mathbb R}f(x)f(x-hu)\,d\mathcal L^1(x)\,d\mathcal L^1(u).
\end{align*}
The inner integral is finite for every $u\in\mathbb R$ by the Cauchy-Schwarz inequality and translation-invariance of $\mathcal L^1$, because
\begin{align*}
\left|\int_{\mathbb R}f(x)f(x-hu)\,d\mathcal L^1(x)\right|\le \|f\|_{L^2(\mathbb R)}^2.
\end{align*}
Consequently the outer integral is absolutely integrable against $|K(u)|\,d\mathcal L^1(u)$. Finally, using the normalization $\int_{\mathbb R}K(u)\,d\mathcal L^1(u)=1$, we subtract
\begin{align*}
Q(f)=\int_{\mathbb R}K(u)\int_{\mathbb R}f(x)^2\,d\mathcal L^1(x)\,d\mathcal L^1(u)
\end{align*}
from the preceding identity. This gives the bias identity
\begin{align*}
\mathbb E[\widehat Q_{n,h}]-Q(f)
=\int_{\mathbb R}K(u)\int_{\mathbb R}f(x)\{f(x-hu)-f(x)\}\,d\mathcal L^1(x)\,d\mathcal L^1(u).
\end{align*}
This identity is the bridge from probability to analysis: the rest of the proof only needs to estimate the translation difference $f(\cdot-hu)-f$ in $L^2(\mathbb R)$.[/guided]