[proofplan]
The proof is a direct consequence of stochastic equicontinuity applied at a random index. We fix an error level $\varepsilon>0$ and split the event where the empirical-process increment at the random index exceeds $\varepsilon$ according to whether the random index lies in a small $d_P$-ball around $m_{\theta_0}$. On that local event, the random increment is bounded by the local supremum appearing in the definition of stochastic equicontinuity; on the complementary event, convergence of $d_P(m_{\hat\theta_n},m_{\theta_0})$ to zero in outer probability controls the probability. Choosing the local radius via stochastic equicontinuity then gives the desired convergence in outer probability $\mathbb P_{\mathrm{out}}$.
[/proofplan]
[step:Control the random increment by a local supremum and a localization error]Fix $\varepsilon>0$. For each $\delta>0$ and $n\in\mathbb N$, define the event
\begin{align*}
A_{n,\delta}:=\{\omega\in\Omega:d_P(m_{\hat\theta_n(\omega)},m_{\theta_0})<\delta\}.
\end{align*}
Define the local empirical-process supremum
\begin{align*}
S_{n,\delta}:=\sup_{\theta\in\Theta:\ d_P(m_\theta,m_{\theta_0})<\delta}|G_n(m_\theta)-G_n(m_{\theta_0})|.
\end{align*}
On $A_{n,\delta}$, the random function $m_{\hat\theta_n}$ is one of the functions included in the supremum defining $S_{n,\delta}$. Therefore
\begin{align*}
|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|\le S_{n,\delta}
\end{align*}
on $A_{n,\delta}$. Hence, by [monotonicity and subadditivity](/theorems/1081) of outer probability,
\begin{align*}
\mathbb P_{\mathrm{out}}(|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|>\varepsilon)
\le \mathbb P_{\mathrm{out}}(S_{n,\delta}>\varepsilon)+\mathbb P_{\mathrm{out}}(A_{n,\delta}^c).
\end{align*}[/step]
[guided]Fix $\varepsilon>0$. The purpose of the argument is to replace the random index $\hat\theta_n$ by a deterministic local supremum. For a radius $\delta>0$, define the localization event
\begin{align*}
A_{n,\delta}:=\{\omega\in\Omega:d_P(m_{\hat\theta_n(\omega)},m_{\theta_0})<\delta\}.
\end{align*}
This event records exactly when the random criterion function $m_{\hat\theta_n}$ falls inside the $d_P$-ball of radius $\delta$ around the base function $m_{\theta_0}$.
Define also
\begin{align*}
S_{n,\delta}:=\sup_{\theta\in\Theta:\ d_P(m_\theta,m_{\theta_0})<\delta}|G_n(m_\theta)-G_n(m_{\theta_0})|.
\end{align*}
The index set of the supremum defining $S_{n,\delta}$ is deterministic; $S_{n,\delta}$ is random only through the empirical process $G_n$. If $\omega\in A_{n,\delta}$, then $m_{\hat\theta_n(\omega)}$ satisfies the condition appearing under the supremum. Therefore the particular increment at the random index is bounded by the supremum over all such local increments:
\begin{align*}
|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|\le S_{n,\delta}
\end{align*}
on $A_{n,\delta}$.
Consequently, if the random increment is larger than $\varepsilon$, then either the local supremum is larger than $\varepsilon$ or the localization event has failed. In set notation,
\begin{align*}
\{|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|>\varepsilon\}\subseteq \{S_{n,\delta}>\varepsilon\}\cup A_{n,\delta}^c.
\end{align*}
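One way to verify this inclusion is to split the left-hand event along $A_{n,\delta}$; the shorthand $X_n:=|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|$ is introduced here only for this display:
\begin{align*}
\{X_n>\varepsilon\}
&=\bigl(\{X_n>\varepsilon\}\cap A_{n,\delta}\bigr)\cup\bigl(\{X_n>\varepsilon\}\cap A_{n,\delta}^c\bigr)\\
&\subseteq\{S_{n,\delta}>\varepsilon\}\cup A_{n,\delta}^c,
\end{align*}
where the first piece is handled by $\varepsilon<X_n\le S_{n,\delta}$ on $A_{n,\delta}$, and the second piece is trivially contained in $A_{n,\delta}^c$.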
Taking outer probabilities and using monotonicity and subadditivity gives
\begin{align*}
\mathbb P_{\mathrm{out}}(|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|>\varepsilon)
\le \mathbb P_{\mathrm{out}}(S_{n,\delta}>\varepsilon)+\mathbb P_{\mathrm{out}}(A_{n,\delta}^c).
\end{align*}
This is the key reduction: the first term is controlled by stochastic equicontinuity, and the second by the assumed convergence of the random index in the semimetric $d_P$.[/guided]
[step:Choose the localization radius using stochastic equicontinuity]Let $\eta>0$. By stochastic equicontinuity at $m_{\theta_0}$, there exists $\delta>0$ such that
\begin{align*}
\limsup_{n\to\infty}\mathbb P_{\mathrm{out}}(S_{n,\delta}>\varepsilon)<\eta.
\end{align*}
For this fixed $\delta$, the assumption
\begin{align*}
d_P(m_{\hat\theta_n},m_{\theta_0})\xrightarrow{\mathbb P_{\mathrm{out}}}0
\end{align*}
implies
\begin{align*}
\lim_{n\to\infty}\mathbb P_{\mathrm{out}}(A_{n,\delta}^c)=0.
\end{align*}
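Indeed, the complement is the event that the distance is at least $\delta$, and the outer probability of such an event tends to zero for every fixed $\delta>0$ under convergence in outer probability: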
\begin{align*}
A_{n,\delta}^c=\{\omega\in\Omega:d_P(m_{\hat\theta_n(\omega)},m_{\theta_0})\ge\delta\}.
\end{align*}[/step]
[guided]Let $\eta>0$. The preceding step reduced the problem to two probabilities, so we now choose the radius $\delta$ to make the local-supremum probability small. Stochastic equicontinuity at $m_{\theta_0}$ states that the limit superior in $n$ of the outer probability that the local supremum exceeds $\varepsilon$ can be made arbitrarily small by shrinking the radius. Therefore there exists $\delta>0$ such that
\begin{align*}
\limsup_{n\to\infty}\mathbb P_{\mathrm{out}}(S_{n,\delta}>\varepsilon)<\eta.
\end{align*}
With this radius fixed, the localization error is exactly the event that the random index has not entered the $d_P$-ball of radius $\delta$ around $m_{\theta_0}$. By definition of $A_{n,\delta}$,
\begin{align*}
A_{n,\delta}^c=\{\omega\in\Omega:d_P(m_{\hat\theta_n(\omega)},m_{\theta_0})\ge\delta\}.
\end{align*}
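The inequality in this complement is non-strict. If convergence in outer probability is formulated with strict inequalities (an assumption about the convention in force, which is not restated here), this causes no difficulty, because the event is contained in a strict-inequality event at the smaller radius $\delta/2$:
\begin{align*}
A_{n,\delta}^c
&\subseteq\{\omega\in\Omega:d_P(m_{\hat\theta_n(\omega)},m_{\theta_0})>\delta/2\},\\
\mathbb P_{\mathrm{out}}(A_{n,\delta}^c)
&\le\mathbb P_{\mathrm{out}}\bigl(d_P(m_{\hat\theta_n},m_{\theta_0})>\delta/2\bigr).
\end{align*}
So, under either convention, it suffices that the outer probability of such exceedance events tends to zero for each fixed positive radius.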
The assumed convergence
\begin{align*}
d_P(m_{\hat\theta_n},m_{\theta_0})\xrightarrow{\mathbb P_{\mathrm{out}}}0
\end{align*}
means precisely that, for every fixed positive radius, the outer probability of this complement tends to zero. Hence
\begin{align*}
\lim_{n\to\infty}\mathbb P_{\mathrm{out}}(A_{n,\delta}^c)=0.
\end{align*}[/guided]
[step:Pass to the limit and obtain convergence in outer probability]Taking the limit superior in the estimate from the first step, with the $\delta$ chosen in the second step, and using $\lim_{n\to\infty}\mathbb P_{\mathrm{out}}(A_{n,\delta}^c)=0$ gives
\begin{align*}
\limsup_{n\to\infty}\mathbb P_{\mathrm{out}}(|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|>\varepsilon)
\le \limsup_{n\to\infty}\mathbb P_{\mathrm{out}}(S_{n,\delta}>\varepsilon).
\end{align*}
The right-hand side is strictly smaller than $\eta$, while the left-hand side depends on neither $\eta$ nor $\delta$. Since $\eta>0$ was arbitrary, we obtain
\begin{align*}
\lim_{n\to\infty}\mathbb P_{\mathrm{out}}(|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|>\varepsilon)=0.
\end{align*}
Because $\varepsilon>0$ was arbitrary, this is precisely
\begin{align*}
G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})\xrightarrow{\mathbb P_{\mathrm{out}}}0.
\end{align*}
If all displayed quantities are measurable, the same proof with ordinary probability $\mathbb P$ in place of outer probability $\mathbb P_{\mathrm{out}}$ yields the corresponding statement for convergence in probability. This proves the theorem.[/step]
[guided]We now combine the two bounds. From the first step, for the radius $\delta$ chosen using stochastic equicontinuity,
\begin{align*}
\mathbb P_{\mathrm{out}}(|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|>\varepsilon)
\le \mathbb P_{\mathrm{out}}(S_{n,\delta}>\varepsilon)+\mathbb P_{\mathrm{out}}(A_{n,\delta}^c).
\end{align*}
Taking the limit superior in $n$ and using
\begin{align*}
\lim_{n\to\infty}\mathbb P_{\mathrm{out}}(A_{n,\delta}^c)=0
\end{align*}
from the preceding step gives
\begin{align*}
\limsup_{n\to\infty}\mathbb P_{\mathrm{out}}(|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|>\varepsilon)
\le \limsup_{n\to\infty}\mathbb P_{\mathrm{out}}(S_{n,\delta}>\varepsilon).
\end{align*}
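The step above rests on subadditivity of the limit superior. With the shorthand $a_n:=\mathbb P_{\mathrm{out}}(S_{n,\delta}>\varepsilon)$ and $b_n:=\mathbb P_{\mathrm{out}}(A_{n,\delta}^c)$, introduced here only for illustration, the computation can be sketched as
\begin{align*}
\limsup_{n\to\infty}(a_n+b_n)
\le\limsup_{n\to\infty}a_n+\limsup_{n\to\infty}b_n
=\limsup_{n\to\infty}a_n+\lim_{n\to\infty}b_n
=\limsup_{n\to\infty}a_n,
\end{align*}
where the first equality uses that $b_n$ converges and the second that its limit is $0$. Both sequences take values in $[0,1]$, so no indeterminate forms arise.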
The choice of $\delta$ gives
\begin{align*}
\limsup_{n\to\infty}\mathbb P_{\mathrm{out}}(S_{n,\delta}>\varepsilon)<\eta.
\end{align*}
The left-hand side of the preceding bound is nonnegative and depends on neither $\eta$ nor $\delta$. Since $\eta>0$ was arbitrary, it must equal $0$. Hence the limit exists and
\begin{align*}
\lim_{n\to\infty}\mathbb P_{\mathrm{out}}(|G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})|>\varepsilon)=0.
\end{align*}
This is exactly convergence to zero in outer probability:
\begin{align*}
G_n(m_{\hat\theta_n})-G_n(m_{\theta_0})\xrightarrow{\mathbb P_{\mathrm{out}}}0.
\end{align*}
When all random variables and local suprema are measurable, each invocation of outer probability above may be replaced by ordinary probability $\mathbb P$, so the ordinary-probability statement follows by the identical argument.[/guided]