[proofplan]
We condition on the training data, so the fitted predictor $\hat f$ is fixed relative to the remaining randomness. Each test observation is independent of the training data and has the same distribution as a fresh observation $(X,Y)$, hence each test loss has conditional expectation equal to the conditional risk $R(\hat f)$. Linearity of conditional expectation then shows that the average of the test losses has the same conditional expectation. Taking expectations gives the unconditional identity.
[/proofplan]
[step:Condition on the training data and define the test losses]
Let
\begin{align*}
\mathcal G:=\sigma(D_{\mathrm{train}})
\end{align*}
be the $\sigma$-algebra generated by the training sample. Since $A$ is measurable and $\hat f=A(D_{\mathrm{train}})$, the random hypothesis $\hat f$ is $\mathcal G$-measurable.
For each $i\in I_{\mathrm{test}}$, define the real-valued random variable
\begin{align*}
Z_i:\Omega\to\mathbb R,
\qquad
Z_i:=L(Y_i,\hat f(X_i)).
\end{align*}
The integrability hypothesis in the statement gives
\begin{align*}
\mathbb E\left[|L(Y,\hat f(X))|\right]<\infty.
\end{align*}
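Because $(X_i,Y_i)$ has the same distribution as $(X,Y)$ and is independent of $\mathcal G$, the conditioning argument of the next step, which does not use integrability, applies to the bounded non-negative functions $u\mapsto\min(|u|,n)$. Taking expectations and letting $n\to\infty$ by monotone convergence gives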
\begin{align*}
\mathbb E[|Z_i|]
=
\mathbb E\left[|L(Y,\hat f(X))|\right]
<
\infty.
\end{align*}
Thus each $Z_i$ is integrable, so the ordinary signed conditional expectations $\mathbb E[Z_i\mid\mathcal G]$ are well-defined.
[/step]
[step:Compute the conditional expectation of one test loss]
Fix $i\in I_{\mathrm{test}}$. Since $I_{\mathrm{train}}$ and $I_{\mathrm{test}}$ are disjoint and the observations are i.i.d., the random pair $(X_i,Y_i)$ is independent of $\mathcal G=\sigma(D_{\mathrm{train}})$ and has the same law as the fresh observation $(X,Y)$, which is itself independent of $\mathcal G$.
We justify the conditional distribution identity by testing against bounded [measurable functions](/page/Measurable%20Functions). Let $\varphi:\mathbb R\to\mathbb R$ be bounded and Borel measurable. Since $\hat f$ is $\mathcal G$-measurable and $(X_i,Y_i)$ is independent of $\mathcal G$, conditioning on $\mathcal G$ freezes the value of $\hat f$ and integrates only over the law of $(X_i,Y_i)$. The same computation applies to $(X,Y)$, which is likewise independent of $\mathcal G$. Since the two pairs have the same law, we obtain
\begin{align*}
\mathbb E\left[\varphi\left(L(Y_i,\hat f(X_i))\right)\mid\mathcal G\right]
=
\mathbb E\left[\varphi\left(L(Y,\hat f(X))\right)\mid\mathcal G\right]
\quad\text{a.s.}
\end{align*}
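This equality for all bounded Borel $\varphi$ identifies the conditional laws of $Z_i$ and $L(Y,\hat f(X))$ given $\mathcal G$. Applying it to the bounded truncations $\varphi_n(u):=\max(-n,\min(u,n))$ of the identity map and letting $n\to\infty$ by conditional dominated convergence, with dominating variables $|Z_i|$ and $|L(Y,\hat f(X))|$ that are integrable by the first step, gives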
\begin{align*}
\mathbb E[Z_i\mid\mathcal G]
&=
\mathbb E\left[L(Y,\hat f(X))\mid\mathcal G\right] \\
&=
R(\hat f)
\quad\text{a.s.}
\end{align*}
[guided]
Fix $i\in I_{\mathrm{test}}$. The purpose of conditioning on $\mathcal G=\sigma(D_{\mathrm{train}})$ is to freeze the fitted predictor. Since $\hat f=A(D_{\mathrm{train}})$ and $A$ is measurable, $\hat f$ is $\mathcal G$-measurable.
The test observation $(X_i,Y_i)$ is independent of $\mathcal G$ because the observations are i.i.d. and the test index $i$ does not belong to the training index set. It also has the same distribution as the fresh observation $(X,Y)$. The loss is integrable: by the argument from the first step,
\begin{align*}
\mathbb E\left[|L(Y_i,\hat f(X_i))|\right]
=
\mathbb E\left[|L(Y,\hat f(X))|\right]
<
\infty.
\end{align*}
To make the conditioning argument precise, let $\varphi:\mathbb R\to\mathbb R$ be bounded and Borel measurable. Because $\hat f$ is $\mathcal G$-measurable, conditioning on $\mathcal G$ freezes the value of $\hat f$. Because $(X_i,Y_i)$ is independent of $\mathcal G$ and has the same law as $(X,Y)$, the conditional expectations of the bounded test functions agree:
\begin{align*}
\mathbb E\left[\varphi\left(L(Y_i,\hat f(X_i))\right)\mid\mathcal G\right]
=
\mathbb E\left[\varphi\left(L(Y,\hat f(X))\right)\mid\mathcal G\right]
\quad\text{a.s.}
\end{align*}
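Taking $\varphi$ to be the bounded truncations $u\mapsto\max(-n,\min(u,n))$ of the identity map and letting $n\to\infty$ by conditional dominated convergence, which the integrability above justifies, gives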
\begin{align*}
\mathbb E\left[L(Y_i,\hat f(X_i))\mid\mathcal G\right]
=
\mathbb E\left[L(Y,\hat f(X))\mid\mathcal G\right]
\quad\text{a.s.}
\end{align*}
By the definition of the conditional risk,
\begin{align*}
R(\hat f)
&:=
\mathbb E\left[L(Y,\hat f(X))\mid D_{\mathrm{train}}\right] \\
&=
\mathbb E\left[L(Y,\hat f(X))\mid\mathcal G\right].
\end{align*}
Combining these identities gives
\begin{align*}
\mathbb E[Z_i\mid\mathcal G]=R(\hat f)
\quad\text{a.s.}
\end{align*}
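For intuition, the freezing step can also be written as an explicit integral. The following is only a sketch under additional assumptions that are not part of the statement: $P$ denotes the common law of $(X,Y)$, $h$ ranges over fixed (non-random) hypotheses, the map $(h,x,y)\mapsto L(y,h(x))$ is jointly measurable, and the integral below is well-defined. The symbols $P$, $h$, and $g$ are introduced here purely for illustration. Define
\begin{align*}
g(h)
:=
\int L(y,h(x))\,P(dx,dy)
=
\mathbb E\left[L(Y,h(X))\right].
\end{align*}
Under these assumptions, the independence of $(X_i,Y_i)$ from $\mathcal G$ together with the $\mathcal G$-measurability of $\hat f$ suggests
\begin{align*}
\mathbb E[Z_i\mid\mathcal G]
=
g(\hat f)
=
R(\hat f)
\quad\text{a.s.}
\end{align*}
In this reading, the conditional risk is the risk function $g$ of a fixed hypothesis, evaluated at the frozen predictor $\hat f$.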
[/guided]
[/step]
[step:Average the conditional expectations over the test sample]
By definition of the holdout estimator,
\begin{align*}
\hat R_{\mathrm{test}}
=
\frac{1}{m}\sum_{i\in I_{\mathrm{test}}}Z_i.
\end{align*}
Since $m=|I_{\mathrm{test}}|\ge 1$ and each $Z_i$ is integrable, the [linearity of conditional expectation](/page/Conditional%20Expectation) gives
\begin{align*}
\mathbb E[\hat R_{\mathrm{test}}\mid\mathcal G]
&=
\mathbb E\left[\frac{1}{m}\sum_{i\in I_{\mathrm{test}}}Z_i\mid\mathcal G\right] \\
&=
\frac{1}{m}\sum_{i\in I_{\mathrm{test}}}\mathbb E[Z_i\mid\mathcal G].
\end{align*}
Using the identity from the previous step for every $i\in I_{\mathrm{test}}$,
\begin{align*}
\mathbb E[\hat R_{\mathrm{test}}\mid\mathcal G]
&=
\frac{1}{m}\sum_{i\in I_{\mathrm{test}}}R(\hat f) \\
&=
\frac{m}{m}R(\hat f) \\
&=
R(\hat f)
\quad\text{a.s.}
\end{align*}
Since conditioning on $D_{\mathrm{train}}$ is conditioning on $\mathcal G=\sigma(D_{\mathrm{train}})$, this proves
\begin{align*}
\mathbb E[\hat R_{\mathrm{test}}\mid D_{\mathrm{train}}]
=
R(\hat f)
\quad\text{a.s.}
\end{align*}
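As a concrete illustration, consider the 0-1 loss $L(y,y')=\mathbf 1\{y\neq y'\}$ for a classification problem. This choice of loss is an assumption made here for illustration only and is not part of the statement; we also assume that the event $\{Y\neq\hat f(X)\}$ is measurable. The loss is bounded by $1$, so the integrability hypothesis holds automatically, and the identity above reads
\begin{align*}
\mathbb E\left[\frac{1}{m}\sum_{i\in I_{\mathrm{test}}}\mathbf 1\{Y_i\neq\hat f(X_i)\}\mid D_{\mathrm{train}}\right]
=
\mathbb P\left(Y\neq\hat f(X)\mid D_{\mathrm{train}}\right)
\quad\text{a.s.}
\end{align*}
In words: conditionally on the training data, the expected test error rate equals the probability that the fitted classifier misclassifies a fresh observation.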
[/step]
[step:Take expectations to obtain unconditional unbiasedness]
Since each $Z_i$ is integrable, $\hat R_{\mathrm{test}}$ is integrable. Also $R(\hat f)=\mathbb E[L(Y,\hat f(X))\mid\mathcal G]$ is integrable by the integrability hypothesis and the defining integrability property of conditional expectation. Taking expectations in the conditional identity and using the [tower property of conditional expectation](/page/Conditional%20Expectation) yields
\begin{align*}
\mathbb E[\hat R_{\mathrm{test}}]
&=
\mathbb E\left[\mathbb E[\hat R_{\mathrm{test}}\mid D_{\mathrm{train}}]\right] \\
&=
\mathbb E[R(\hat f)].
\end{align*}
Thus the holdout estimator is conditionally unbiased for the conditional risk, and its unconditional expectation equals the expected conditional risk.
[/step]