[proofplan]
We write the prediction error as the sum of three pieces: the new noise, the deterministic bias, and the centered fluctuation of the estimator. After expanding the square and taking expectations, the three squared terms yield the noise variance, the squared bias, and the estimator variance. The mixed terms vanish because the new noise has mean zero, the centered estimator has mean zero, and the new noise is independent of the estimator.
[/proofplan]
[step:Decompose the prediction error into noise, bias, and centered estimator fluctuation]
Define the deterministic bias scalar $b \in \mathbb R$ by
\begin{align*}
b := \mathbb E[\hat f(x_0)] - f(x_0).
\end{align*}
Define the centered estimator fluctuation
\begin{align*}
Z : \Omega \to \mathbb R,
\qquad
Z := \hat f(x_0) - \mathbb E[\hat f(x_0)].
\end{align*}
Since $\hat f(x_0)$ is square-integrable, $Z$ is square-integrable and
\begin{align*}
\mathbb E[Z] = 0,
\qquad
\mathbb E[Z^2] = \operatorname{Var}(\hat f(x_0)).
\end{align*}
Using $Y_{\mathrm{new}} = f(x_0)+\varepsilon_{\mathrm{new}}$, we obtain
\begin{align*}
Y_{\mathrm{new}}-\hat f(x_0)
&=
f(x_0)+\varepsilon_{\mathrm{new}}-\hat f(x_0) \\
&=
\varepsilon_{\mathrm{new}}-\left(\mathbb E[\hat f(x_0)]-f(x_0)\right)
-\left(\hat f(x_0)-\mathbb E[\hat f(x_0)]\right) \\
&=
\varepsilon_{\mathrm{new}} - b - Z.
\end{align*}
[/step]
[step:Expand the squared error and take expectations]
Because $\varepsilon_{\mathrm{new}}$ and $Z$ are square-integrable, their product is integrable by the Cauchy--Schwarz inequality, so every term in the following expansion is integrable. Expanding the square gives
\begin{align*}
(Y_{\mathrm{new}}-\hat f(x_0))^2
&=
(\varepsilon_{\mathrm{new}}-b-Z)^2 \\
&=
\varepsilon_{\mathrm{new}}^2 + b^2 + Z^2
-2b\varepsilon_{\mathrm{new}}
-2\varepsilon_{\mathrm{new}}Z
+2bZ.
\end{align*}
Taking expectations and using linearity of expectation,
\begin{align*}
\mathbb E[(Y_{\mathrm{new}}-\hat f(x_0))^2]
&=
\mathbb E[\varepsilon_{\mathrm{new}}^2]
+b^2
+\mathbb E[Z^2]
-2b\,\mathbb E[\varepsilon_{\mathrm{new}}]
-2\mathbb E[\varepsilon_{\mathrm{new}}Z]
+2b\,\mathbb E[Z].
\end{align*}
[/step]
[step:Show that the mixed terms vanish]
Since $\mathbb E[\varepsilon_{\mathrm{new}}]=0$, the term $-2b\,\mathbb E[\varepsilon_{\mathrm{new}}]$ is zero. Since $\mathbb E[Z]=0$, the term $2b\,\mathbb E[Z]$ is zero.
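The random variable $Z$ is a measurable function of $\hat f(x_0)$, namely $\hat f(x_0)$ shifted by the constant $\mathbb E[\hat f(x_0)]$. Since $\hat f(x_0)$ is independent of $\varepsilon_{\mathrm{new}}$ and independence is preserved under measurable maps, the random variables $Z$ and $\varepsilon_{\mathrm{new}}$ are independent. Both are integrable, so the expectation of their product factorizes: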
\begin{align*}
\mathbb E[\varepsilon_{\mathrm{new}}Z]
=
\mathbb E[\varepsilon_{\mathrm{new}}]\,\mathbb E[Z]
=
0 \cdot 0
=
0.
\end{align*}
Thus all mixed terms vanish.
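As a side remark, the following short computation indicates what the independence assumption is used for. It relies only on the definition of $Z$ and on linearity of expectation:
\begin{align*}
\mathbb E[\varepsilon_{\mathrm{new}}Z]
&=
\mathbb E\left[\varepsilon_{\mathrm{new}}\left(\hat f(x_0)-\mathbb E[\hat f(x_0)]\right)\right] \\
&=
\mathbb E[\varepsilon_{\mathrm{new}}\hat f(x_0)]
-\mathbb E[\varepsilon_{\mathrm{new}}]\,\mathbb E[\hat f(x_0)] \\
&=
\operatorname{Cov}(\varepsilon_{\mathrm{new}},\hat f(x_0)).
\end{align*}
Hence the mixed term $-2\,\mathbb E[\varepsilon_{\mathrm{new}}Z]$ is zero exactly when $\varepsilon_{\mathrm{new}}$ and $\hat f(x_0)$ are uncorrelated. Independence is a convenient sufficient condition for this.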
[/step]
[step:Identify the remaining terms with noise variance, squared bias, and estimator variance]
Since $\mathbb E[\varepsilon_{\mathrm{new}}]=0$ and $\operatorname{Var}(\varepsilon_{\mathrm{new}})=\sigma^2$,
\begin{align*}
\mathbb E[\varepsilon_{\mathrm{new}}^2]
=
\operatorname{Var}(\varepsilon_{\mathrm{new}})
=
\sigma^2.
\end{align*}
By the definition of $b$,
\begin{align*}
b^2
=
\left(\mathbb E[\hat f(x_0)]-f(x_0)\right)^2.
\end{align*}
By the definition of $Z$ and of the variance,
\begin{align*}
\mathbb E[Z^2]
=
\operatorname{Var}(\hat f(x_0)).
\end{align*}
Substituting these three identities into the expectation expansion gives
\begin{align*}
\mathbb E[(Y_{\mathrm{new}}-\hat f(x_0))^2]
=
\sigma^2
+
\left(\mathbb E[\hat f(x_0)]-f(x_0)\right)^2
+
\operatorname{Var}(\hat f(x_0)).
\end{align*}
This is the desired bias–variance decomposition.
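As a sanity check, consider the following illustrative special case; it is an assumption made only for this example and is not part of the setting above. Suppose the training responses are $Y_i = f(x_0)+\varepsilon_i$ for $i=1,\dots,n$, where $\varepsilon_1,\dots,\varepsilon_n$ are i.i.d. with mean zero and variance $\sigma^2$ and are independent of $\varepsilon_{\mathrm{new}}$, and let $\hat f(x_0) := c\,\bar Y_n$ for a fixed constant $c\in\mathbb R$, where $\bar Y_n := \frac1n\sum_{i=1}^n Y_i$. Then
\begin{align*}
\mathbb E[\hat f(x_0)] = c\,f(x_0),
\qquad
b = (c-1)\,f(x_0),
\qquad
\operatorname{Var}(\hat f(x_0)) = \frac{c^2\sigma^2}{n},
\end{align*}
and the decomposition reads
\begin{align*}
\mathbb E[(Y_{\mathrm{new}}-\hat f(x_0))^2]
=
\sigma^2
+
(c-1)^2 f(x_0)^2
+
\frac{c^2\sigma^2}{n}.
\end{align*}
This agrees with a direct computation: writing $\bar\varepsilon_n := \frac1n\sum_{i=1}^n \varepsilon_i$, we have $Y_{\mathrm{new}}-c\,\bar Y_n = (1-c)f(x_0)+\varepsilon_{\mathrm{new}}-c\,\bar\varepsilon_n$, a constant plus two independent mean-zero terms with variances $\sigma^2$ and $c^2\sigma^2/n$. For $c=1$ the estimator is unbiased and the expected squared prediction error reduces to $\sigma^2(1+1/n)$.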
[/step]