[proofplan]
Fix an arbitrary estimator and convert it into a test between $\theta_0$ and $\theta_1$ by declaring $\theta_0$ when the $\theta_0$-loss is less than $\Delta/2$. The loss separation assumption forces the sum of the risks under $\theta_0$ and $\theta_1$ to be at least $\Delta/2$ times the sum of the two testing errors. We then bound the sum of testing errors from below by one minus the total variation distance and combine this with the elementary inequality
\begin{align*}
\max\{x,y\} \geq \frac{x+y}{2}.
\end{align*}
[/proofplan]
[step:Convert an arbitrary estimator into a test between the two parameter points]
Fix an $\mathcal F/\mathcal G$-measurable estimator
\begin{align*}
\hat{\theta}:\Omega &\to \mathcal A.
\end{align*}
Define the measurable set
\begin{align*}
B_{\hat{\theta}}
:=
\{\omega \in \Omega : L(\theta_0,\hat{\theta}(\omega)) < \Delta/2\}.
\end{align*}
Measurability follows because the statement assumes $a \mapsto L(\theta_0,a)$ is $\mathcal G/\mathcal B([0,\infty])$-measurable and $\hat{\theta}$ is $\mathcal F/\mathcal G$-measurable, so $B_{\hat{\theta}}$ is the preimage of the Borel set $[0,\Delta/2)$ under a measurable map. The threshold $\Delta/2$ is an ordinary real number because the statement assumes $\Delta \in [0,\infty)$. Define the test $\varphi_{\hat{\theta}}:\Omega \to \{0,1\}$ by
\begin{align*}
\varphi_{\hat{\theta}}(\omega)
=
\mathbb{1}_{\Omega \setminus B_{\hat{\theta}}}(\omega)
\end{align*}
for each $\omega \in \Omega$. Here $\varphi_{\hat{\theta}}=0$ means deciding $\theta_0$ and $\varphi_{\hat{\theta}}=1$ means deciding $\theta_1$.
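As an illustration only, under assumptions that are not part of the statement, suppose $\mathcal A=\Theta=\mathbb R$ with the Borel $\sigma$-algebra, $L(\theta,a)=|\theta-a|$, $\theta_0=0$, and $\theta_1=\delta$ for some hypothetical $\delta>0$. The triangle inequality gives $L(\theta_0,a)+L(\theta_1,a)\geq\delta$, so one may take $\Delta=\delta$, and the construction above reads
\begin{align*}
B_{\hat{\theta}} &= \{\omega \in \Omega : |\hat{\theta}(\omega)| < \delta/2\},\\
\varphi_{\hat{\theta}}(\omega) &= \mathbb{1}\{|\hat{\theta}(\omega)| \geq \delta/2\}.
\end{align*}
In this sketch the test decides $\theta_0$ exactly when the estimate lies within $\delta/2$ of $\theta_0$.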
[/step]
[step:Use the loss separation to lower bound the two risks by testing errors]
Let
\begin{align*}
R_i(\hat{\theta})
:=
\mathbb E_{\theta_i}[L(\theta_i,\hat{\theta})]
\end{align*}
denote the risk at $\theta_i$ for $i \in \{0,1\}$. On $\Omega \setminus B_{\hat{\theta}}$ we have $L(\theta_0,\hat{\theta}) \geq \Delta/2$, so monotonicity of the [Lebesgue integral](/page/Lebesgue%20Integral) gives
\begin{align*}
R_0(\hat{\theta})
=
\int_\Omega L(\theta_0,\hat{\theta}(\omega))\,d\mathbb P_{\theta_0}(\omega)
\geq
\frac{\Delta}{2}\mathbb P_{\theta_0}(\Omega \setminus B_{\hat{\theta}}).
\end{align*}
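On $B_{\hat{\theta}}$ we have $L(\theta_0,\hat{\theta}) < \Delta/2$, so this loss is finite there, and the separation hypothesis $L(\theta_0,a)+L(\theta_1,a) \geq \Delta$ applied to $a=\hat{\theta}(\omega)$ gives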
\begin{align*}
L(\theta_1,\hat{\theta}(\omega))
\geq
\Delta - L(\theta_0,\hat{\theta}(\omega))
>
\Delta/2
\end{align*}
for every $\omega \in B_{\hat{\theta}}$. Hence
\begin{align*}
R_1(\hat{\theta})
=
\int_\Omega L(\theta_1,\hat{\theta}(\omega))\,d\mathbb P_{\theta_1}(\omega)
\geq
\frac{\Delta}{2}\mathbb P_{\theta_1}(B_{\hat{\theta}}).
\end{align*}
Adding the two estimates,
\begin{align*}
R_0(\hat{\theta})+R_1(\hat{\theta})
\geq
\frac{\Delta}{2}\left(
\mathbb P_{\theta_0}(\Omega \setminus B_{\hat{\theta}})
+
\mathbb P_{\theta_1}(B_{\hat{\theta}})
\right).
\end{align*}
[guided]
The role of the threshold $\Delta/2$ is to turn an estimator into a binary decision rule. Define
\begin{align*}
B_{\hat{\theta}}
:=
\{\omega \in \Omega : L(\theta_0,\hat{\theta}(\omega)) < \Delta/2\}.
\end{align*}
Because $a \mapsto L(\theta_0,a)$ is $\mathcal G/\mathcal B([0,\infty])$-measurable and $\hat{\theta}$ is $\mathcal F/\mathcal G$-measurable, the composition $\omega \mapsto L(\theta_0,\hat{\theta}(\omega))$ is $\mathcal F/\mathcal B([0,\infty])$-measurable. Since $[0,\Delta/2)$ belongs to $\mathcal B([0,\infty])$, this gives $B_{\hat{\theta}}\in\mathcal F$.
On this set the estimator has produced an action whose loss at $\theta_0$ is small, so the associated test decides $\theta_0$. On the complement $\Omega \setminus B_{\hat{\theta}}$, the test decides $\theta_1$.
We now compare risk with testing error. If the true parameter is $\theta_0$ and $\omega \in \Omega \setminus B_{\hat{\theta}}$, then by definition of the complement,
\begin{align*}
L(\theta_0,\hat{\theta}(\omega)) \geq \Delta/2.
\end{align*}
Therefore the $\theta_0$-risk satisfies
\begin{align*}
R_0(\hat{\theta})
=
\int_\Omega L(\theta_0,\hat{\theta}(\omega))\,d\mathbb P_{\theta_0}(\omega)
\geq
\int_{\Omega \setminus B_{\hat{\theta}}} L(\theta_0,\hat{\theta}(\omega))\,d\mathbb P_{\theta_0}(\omega)
\geq
\frac{\Delta}{2}\mathbb P_{\theta_0}(\Omega \setminus B_{\hat{\theta}}).
\end{align*}
This is exactly $\Delta/2$ times the probability that the induced test incorrectly rejects $\theta_0$.
If the true parameter is $\theta_1$ and $\omega \in B_{\hat{\theta}}$, then the loss at $\theta_0$ is less than $\Delta/2$. The assumed separation
\begin{align*}
L(\theta_0,a)+L(\theta_1,a) \geq \Delta
\end{align*}
applied to $a=\hat{\theta}(\omega)$ forces
\begin{align*}
L(\theta_1,\hat{\theta}(\omega))
\geq
\Delta - L(\theta_0,\hat{\theta}(\omega))
>
\Delta/2.
\end{align*}
Thus
\begin{align*}
R_1(\hat{\theta})
=
\int_\Omega L(\theta_1,\hat{\theta}(\omega))\,d\mathbb P_{\theta_1}(\omega)
\geq
\int_{B_{\hat{\theta}}} L(\theta_1,\hat{\theta}(\omega))\,d\mathbb P_{\theta_1}(\omega)
\geq
\frac{\Delta}{2}\mathbb P_{\theta_1}(B_{\hat{\theta}}).
\end{align*}
This is $\Delta/2$ times the probability that the induced test incorrectly accepts $\theta_0$ when $\theta_1$ is true. Adding the two inequalities gives
\begin{align*}
R_0(\hat{\theta})+R_1(\hat{\theta})
\geq
\frac{\Delta}{2}\left(
\mathbb P_{\theta_0}(\Omega \setminus B_{\hat{\theta}})
+
\mathbb P_{\theta_1}(B_{\hat{\theta}})
\right).
\end{align*}
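To see how the separation hypothesis might be verified in a concrete case, here is a sketch for squared error loss. The choices $\mathcal A=\mathbb R$, $L(\theta,a)=(\theta-a)^2$, and real-valued $\theta_0,\theta_1$ are assumptions made for illustration and are not part of the statement. Using $(x+y)^2 \leq 2x^2+2y^2$ with $x=\theta_0-a$ and $y=a-\theta_1$,
\begin{align*}
L(\theta_0,a)+L(\theta_1,a)
=
(\theta_0-a)^2+(a-\theta_1)^2
\geq
\frac{(\theta_0-\theta_1)^2}{2}.
\end{align*}
Under these assumptions the hypothesis holds with $\Delta=(\theta_0-\theta_1)^2/2$, and equality is attained at the midpoint $a=(\theta_0+\theta_1)/2$, so no larger $\Delta$ works for this loss.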
[/guided]
[/step]
[step:Lower bound the induced testing errors by total variation]
For any measurable set $B \in \mathcal F$, additivity of the probability measure $\mathbb P_{\theta_0}$ gives $\mathbb P_{\theta_0}(\Omega \setminus B)=1-\mathbb P_{\theta_0}(B)$, and therefore
\begin{align*}
\mathbb P_{\theta_0}(\Omega \setminus B)+\mathbb P_{\theta_1}(B)
=
1-\left(\mathbb P_{\theta_0}(B)-\mathbb P_{\theta_1}(B)\right).
\end{align*}
Since $\mathbb P_{\theta_0}(B)-\mathbb P_{\theta_1}(B) \leq \sup_{C \in \mathcal F}|\mathbb P_{\theta_0}(C)-\mathbb P_{\theta_1}(C)|$, the statement-level convention
\begin{align*}
\|P-Q\|_{\mathrm{TV}} := \sup_{C \in \mathcal F}|P(C)-Q(C)|
\end{align*}
for probability measures $P$ and $Q$ on $(\Omega,\mathcal F)$ gives
\begin{align*}
\mathbb P_{\theta_0}(\Omega \setminus B)+\mathbb P_{\theta_1}(B)
\geq
1-\|\mathbb P_{\theta_0}-\mathbb P_{\theta_1}\|_{\mathrm{TV}}.
\end{align*}
Applying this to $B=B_{\hat{\theta}}$ and using $\Delta/2 \geq 0$ gives
\begin{align*}
R_0(\hat{\theta})+R_1(\hat{\theta})
\geq
\frac{\Delta}{2}
\left(1-\|\mathbb P_{\theta_0}-\mathbb P_{\theta_1}\|_{\mathrm{TV}}\right).
\end{align*}
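A small worked case may help to see that the testing bound cannot be improved in general. The following setup is hypothetical and not part of the statement: take $\Omega=\{0,1\}$ with $\mathcal F$ the power set, $\mathbb P_{\theta_0}(\{1\})=p$, and $\mathbb P_{\theta_1}(\{1\})=q$ with $0 \leq p \leq q \leq 1$. Checking the four sets in $\mathcal F$ gives $\|\mathbb P_{\theta_0}-\mathbb P_{\theta_1}\|_{\mathrm{TV}}=q-p$, and the sum $\mathbb P_{\theta_0}(\Omega \setminus B)+\mathbb P_{\theta_1}(B)$ takes the values
\begin{align*}
B=\emptyset &: \quad 1+0 = 1,\\
B=\{0\} &: \quad p+(1-q) = 1-(q-p),\\
B=\{1\} &: \quad (1-p)+q = 1+(q-p),\\
B=\Omega &: \quad 0+1 = 1.
\end{align*}
Every value is at least $1-(q-p)$, and $B=\{0\}$ attains it, so in this sketch the inequality above holds with equality for the best choice of $B$.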
[/step]
[step:Pass from the two risks to the minimax risk]
Since $\{\theta_0,\theta_1\} \subset \Theta$,
\begin{align*}
\sup_{\theta \in \Theta}\mathbb E_\theta[L(\theta,\hat{\theta})]
\geq
\max\{R_0(\hat{\theta}),R_1(\hat{\theta})\}
\geq
\frac{R_0(\hat{\theta})+R_1(\hat{\theta})}{2}.
\end{align*}
Combining this with the preceding lower bound yields, for every estimator $\hat{\theta}$,
\begin{align*}
\sup_{\theta \in \Theta}\mathbb E_\theta[L(\theta,\hat{\theta})]
\geq
\frac{\Delta}{4}
\left(1-\|\mathbb P_{\theta_0}-\mathbb P_{\theta_1}\|_{\mathrm{TV}}\right).
\end{align*}
For every $\theta \in \Theta$, the composition $\omega \mapsto L(\theta,\hat{\theta}(\omega))$ is $\mathcal F/\mathcal B([0,\infty])$-measurable by the statement-level measurability assumption on $L$, so the risks appearing in the minimax expression are defined as extended non-negative expectations. Taking the infimum over all $\mathcal F/\mathcal G$-measurable estimators $\hat{\theta}:\Omega \to \mathcal A$ preserves the lower bound and gives
\begin{align*}
\inf_{\hat{\theta}} \sup_{\theta \in \Theta}\mathbb E_\theta[L(\theta,\hat{\theta})]
\geq
\frac{\Delta}{4}
\left(1-\|\mathbb P_{\theta_0}-\mathbb P_{\theta_1}\|_{\mathrm{TV}}\right).
\end{align*}
This is the claimed minimax lower bound.
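As a sanity check on the final bound, consider the following sketch under assumptions that are not part of the statement: a single observation with $\Omega=\mathbb R$, $\mathbb P_\theta=N(\theta,1)$ for $\theta \in \Theta=\mathbb R$, action space $\mathcal A=\mathbb R$, loss $L(\theta,a)=|\theta-a|$, and the hypothetical pair $\theta_0=0$, $\theta_1=\delta>0$. The triangle inequality gives the separation with $\Delta=\delta$, and with $\Phi$ denoting the standard normal distribution function the total variation distance in the supremum convention is $2\Phi(\delta/2)-1$. The bound then evaluates to
\begin{align*}
\frac{\Delta}{4}
\left(1-\|\mathbb P_{\theta_0}-\mathbb P_{\theta_1}\|_{\mathrm{TV}}\right)
=
\frac{\delta}{4}\left(2-2\Phi(\delta/2)\right)
=
\frac{\delta}{2}\left(1-\Phi(\delta/2)\right).
\end{align*}
The right-hand side tends to $0$ both as $\delta \to 0$ and as $\delta \to \infty$, which reflects the usual trade-off in the two-point method: the points must be far enough apart to make $\Delta$ large but close enough that the two distributions remain hard to distinguish.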
[/step]