[proofplan]
We view the empirical distribution function as the empirical measure indexed by the class of lower half-lines. That class is a pointwise measurable Vapnik-Chervonenkis class with bounded envelope, so the ordinary empirical process converges to the Brownian-bridge process and the conditional multinomial bootstrap empirical process has the same weak limit in bounded-Lipschitz distance. We use the countable rational subfamily to fix separable versions of the indexed processes, which makes the identification of Borel laws in $\ell^\infty(\mathbb{R})$ legitimate. Finally, the map from the indexed process to the distribution-function process is continuous, and the supremum norm is a Lipschitz functional, so the Kolmogorov-statistic convergence follows from the continuous mapping theorem.
[/proofplan]
[step:Represent the empirical distribution functions as indexed empirical measures]
For each $x \in \mathbb{R}$, define the lower half-line $H_x := (-\infty,x]$. Let $h_x: \mathbb{R} \to \{0,1\}$ be the Borel measurable map whose value at $y \in \mathbb{R}$ is $\mathbb{1}_{H_x}(y)$, and define $\mathcal{H} := \{h_x \mid x \in \mathbb{R}\}$.
Let $P$ denote the law of $X_1$. Let $P_n$ denote the empirical measure on Borel sets $A \subset \mathbb{R}$ defined by $P_n(A) := n^{-1}\sum_{i=1}^{n}\mathbb{1}_A(X_i)$. Let $X_1^*,\dots,X_n^*$ be the nonparametric bootstrap sample, conditionally i.i.d. with conditional law $P_n$, and let $P_n^*$ denote its empirical measure. Let $\mathbb{P}^*$ and $\mathbb{E}^*$ denote [conditional probability](/page/Conditional%20Probability) and [conditional expectation](/page/Conditional%20Expectation) given $X_1,\dots,X_n$; all bootstrap laws below are conditional laws under $\mathbb{P}^*$. Then, for every $x \in \mathbb{R}$,
\begin{align*}
F(x) = P(H_x), \qquad F_n(x) = P_n(H_x), \qquad F_n^*(x) = P_n^*(H_x).
\end{align*}
Define the ordinary empirical process $\mathbb{G}_n: \mathcal{H} \to \mathbb{R}$ and the bootstrap empirical process $\mathbb{G}_n^*: \mathcal{H} \to \mathbb{R}$ by
\begin{align*}
\mathbb{G}_n(h_x) := \sqrt n\{P_n(H_x)-P(H_x)\}
\end{align*}
and
\begin{align*}
\mathbb{G}_n^*(h_x) := \sqrt n\{P_n^*(H_x)-P_n(H_x)\}.
\end{align*}
Thus conditional [weak convergence](/page/Weak%20Convergence) of $\mathbb{G}_n^*$ in $\ell^\infty(\mathcal{H})$, together with the identification of its limit carried out in the following steps, yields the first assertion after the index change $h_x \leftrightarrow x$.
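The representation above can be illustrated numerically. The following is a minimal sketch, not part of the proof: the helper names `ecdf` and `bootstrap_process`, the standard normal data, and the fixed seed are all illustrative assumptions. It evaluates $F_n(x)=P_n(H_x)$ by counting sample points in $(-\infty,x]$ and returns one realization of $x \mapsto \sqrt n\{F_n^*(x)-F_n(x)\}$.

```python
import random
from bisect import bisect_right
from math import sqrt

def ecdf(sample):
    """Return x -> P_n(H_x), the fraction of sample points in (-inf, x]."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def bootstrap_process(data, rng):
    """Draw one bootstrap sample; return x -> sqrt(n) * (F_n^*(x) - F_n(x))."""
    n = len(data)
    # Conditionally on the data, the resample is i.i.d. from P_n.
    resample = [rng.choice(data) for _ in range(n)]
    f_n, f_n_star = ecdf(data), ecdf(resample)
    return lambda x: sqrt(n) * (f_n_star(x) - f_n(x))

rng = random.Random(0)  # fixed seed, for reproducibility of the illustration only
data = [rng.gauss(0.0, 1.0) for _ in range(200)]  # hypothetical data set
g_star = bootstrap_process(data, rng)
print(g_star(0.0))  # one coordinate G_n^*(h_0) of the bootstrap process
```

Each call to `bootstrap_process` corresponds to one draw from the conditional law under $\mathbb{P}^*$; the data stay fixed while the resample changes.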
[/step]
[step:Apply the conditional multinomial Donsker theorem to lower half-lines]
The class $\mathcal{H}$ has Vapnik-Chervonenkis dimension $1$: on any two ordered points $a<b$, the lower half-lines cannot pick out the subset $\{b\}$, because every lower half-line containing $b$ also contains $a$, while any one-point set is shattered. The class is pointwise measurable because the countable subclass $\mathcal{H}_{\mathbb{Q}} := \{h_q \mid q \in \mathbb{Q}\}$ approximates it pointwise: for each $x \in \mathbb{R}$ choose rational numbers $q_m \downarrow x$, and then $h_{q_m}(y) \to h_x(y)$ for every $y \in \mathbb{R}$. Its envelope $E: \mathbb{R} \to \mathbb{R}$, $E(y):=1$, is bounded and satisfies $P(E^2)=1$.
Let $\mathbb{G}_P: \mathcal{H} \to \mathbb{R}$ denote the centered Gaussian process with covariance
\begin{align*}
\operatorname{Cov}(\mathbb{G}_P(h_x),\mathbb{G}_P(h_y)) := P(H_x \cap H_y)-P(H_x)P(H_y).
\end{align*}
Since $H_x \cap H_y = H_{\min\{x,y\}}$, this covariance equals
\begin{align*}
F(\min\{x,y\})-F(x)F(y).
\end{align*}
We use the conditional multinomial bootstrap Donsker theorem in the following precise form. Let $\mathcal{F}$ be a pointwise measurable $P$-Donsker class of measurable real-valued functions with envelope $E_{\mathcal{F}}$ satisfying $P(E_{\mathcal{F}}^2)<\infty$. If $X_1^*,\dots,X_n^*$ are conditionally i.i.d. from $P_n$, then a separable version of the conditional bootstrap empirical process $\sqrt n(P_n^*-P_n)$ indexed by $\mathcal{F}$ satisfies
\begin{align*}
d_{\mathrm{BL}}\bigl(\mathcal{L}^*(\sqrt n(P_n^*-P_n)),\mathcal{L}(\mathbb{G}_P)\bigr) \xrightarrow{\mathbb{P}} 0
\end{align*}
for the bounded-Lipschitz metric on Borel probability laws on $\ell^\infty(\mathcal{F})$, where the separable version is determined by the countable pointwise-dense subclass. In the present case, $\mathcal{H}_{\mathbb{Q}}$ supplies that countable subclass, the VC property gives the $P$-Donsker property, and $P(E^2)=1$ gives the square-integrable envelope condition. Therefore
\begin{align*}
d_{\mathrm{BL}}\bigl(\mathcal{L}^*(\mathbb{G}_n^*),\mathcal{L}(\mathbb{G}_P)\bigr) \xrightarrow{\mathbb{P}} 0,
\end{align*}
where $\mathcal{L}^*(\mathbb{G}_n^*)$ is the conditional law of the separable version of $\mathbb{G}_n^*$ under $\mathbb{P}^*$, $\mathcal{L}(\mathbb{G}_P)$ is the law of the corresponding tight separable Gaussian limit, and $d_{\mathrm{BL}}$ denotes the supremum of expectation differences over all real-valued Borel functions on $\ell^\infty(\mathcal{H})$ bounded by $1$ and Lipschitz with constant at most $1$. This bounded-Lipschitz metric statement is the asserted conditional weak convergence in probability of $\mathbb{G}_n^*$ to $\mathbb{G}_P$ in the empirical-process sense.
[guided]
The point of introducing $\mathcal{H}$ is that the empirical distribution function is an empirical process indexed by sets. For $h_x = \mathbb{1}_{(-\infty,x]}$, the coordinate $\mathbb{G}_n^*(h_x)$ is exactly $\sqrt n\{F_n^*(x)-F_n(x)\}$. Thus we need a bootstrap [central limit theorem](/theorems/521) for the whole indexed process, not merely for one fixed $x$.
We verify the hypotheses of the conditional multinomial bootstrap Donsker theorem. First, $\mathcal{H}$ is a VC class. If $a<b$ are two [real numbers](/page/Real%20Numbers), a lower half-line containing $b$ also contains $a$, so the subset $\{b\}$ of $\{a,b\}$ cannot be realized as a trace. Any one-point set can be shattered, hence the VC dimension is $1$.
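The shattering argument can be checked by brute force. The sketch below is illustrative only; the helper names `traces` and `is_shattered` are hypothetical, and the input points are assumed to be distinct. It enumerates every subset of a finite point set that a lower half-line can pick out and compares the count with the number of all subsets.

```python
def traces(points):
    """All subsets of `points` of the form points ∩ (-inf, x] for real x.

    Assumes the points are distinct. Only thresholds below every point
    and thresholds at each point can produce distinct traces.
    """
    pts = sorted(points)
    result = {frozenset()}  # threshold below the smallest point
    for x in pts:
        result.add(frozenset(p for p in pts if p <= x))
    return result

def is_shattered(points):
    """True when the half-lines pick out all 2^k subsets of the k points."""
    return len(traces(points)) == 2 ** len(points)

print(is_shattered([0.3]))       # a one-point set
print(is_shattered([0.0, 1.0]))  # a two-point set
```

For two points the traces are the empty set, $\{a\}$, and $\{a,b\}$, so $\{b\}$ is missing, in line with the argument above.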
Second, the uncountable index class is measurable in the empirical-process sense. Define the countable subclass $\mathcal{H}_{\mathbb{Q}} := \{h_q \mid q \in \mathbb{Q}\}$. For a fixed $x \in \mathbb{R}$, choose rational numbers $q_m$ with $q_m \downarrow x$. Then for every $y \in \mathbb{R}$, the indicators $h_{q_m}(y)=\mathbb{1}_{(-\infty,q_m]}(y)$ converge to $h_x(y)=\mathbb{1}_{(-\infty,x]}(y)$. This pointwise approximation by a countable subclass is the separability condition that prevents measurability pathologies in $\ell^\infty(\mathcal{H})$.
Third, the envelope condition holds. The envelope map $E: \mathbb{R} \to \mathbb{R}$ defined by $E(y):=1$ dominates every $h_x$ and satisfies
\begin{align*}
P(E^2)=1.
\end{align*}
Thus the envelope is square-integrable and bounded. Since pointwise measurable VC classes with square-integrable envelope are $P$-Donsker, the ordinary empirical process indexed by $\mathcal{H}$ has a tight centered Gaussian limit $\mathbb{G}_P$ in $\ell^\infty(\mathcal{H})$.
The conditional multinomial bootstrap Donsker theorem now applies because, under $\mathbb{P}^*$, the bootstrap variables $X_1^*,\dots,X_n^*$ are i.i.d. with law $P_n$. The theorem gives convergence of the conditional bootstrap law in bounded-Lipschitz metric, not merely pointwise convergence for each [test function](/page/Test%20Function):
\begin{align*}
d_{\mathrm{BL}}\bigl(\mathcal{L}^*(\mathbb{G}_n^*),\mathcal{L}(\mathbb{G}_P)\bigr) \xrightarrow{\mathbb{P}} 0.
\end{align*}
Here $\mathcal{L}^*(\mathbb{G}_n^*)$ is the conditional law of $\mathbb{G}_n^*$ given $X_1,\dots,X_n$, $\mathcal{L}(\mathbb{G}_P)$ is the law of the Gaussian limit, and $d_{\mathrm{BL}}$ is the supremum of the absolute expectation differences $|\mathbb{E}^* f(\mathbb{G}_n^*)-\mathbb{E} f(\mathbb{G}_P)|$ over all real-valued functions $f$ on $\ell^\infty(\mathcal{H})$ that are bounded by $1$ and Lipschitz with constant at most $1$. This metric formulation is what justifies calling the result conditional weak convergence in probability.
It remains to record the covariance of the Gaussian limit. The limiting process $\mathbb{G}_P$ is centered Gaussian, and for $x,y \in \mathbb{R}$ its covariance is the covariance of the two indicators:
\begin{align*}
\operatorname{Cov}(\mathbb{G}_P(h_x),\mathbb{G}_P(h_y)) = P(H_x \cap H_y)-P(H_x)P(H_y).
\end{align*}
Because $H_x \cap H_y = H_{\min\{x,y\}}$, this becomes
\begin{align*}
\operatorname{Cov}(\mathbb{G}_P(h_x),\mathbb{G}_P(h_y)) = F(\min\{x,y\})-F(x)F(y).
\end{align*}
That is the covariance of the Brownian bridge evaluated at $F(x)$ and $F(y)$.
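The same covariance structure already appears exactly at the sample level: conditionally on the data, the bootstrap process has covariance $F_n(\min\{x,y\})-F_n(x)F_n(y)$. The following sketch checks this with exact rational arithmetic. It assumes the standard multinomial representation $\mathbb{G}_n^*(h_x)=n^{-1/2}\sum_i (M_i-1)\mathbb{1}\{X_i\le x\}$ with $(M_1,\dots,M_n)$ multinomial with $n$ trials and equal cell probabilities $1/n$; the function names and the sample values are hypothetical.

```python
from fractions import Fraction

def conditional_cov(data, x, y):
    """Exact conditional covariance of G_n^*(h_x) and G_n^*(h_y) given the data.

    Uses Cov(M_i, M_j) = [i == j] - 1/n for multinomial resampling counts.
    """
    n = len(data)
    a = [1 if v <= x else 0 for v in data]
    b = [1 if v <= y else 0 for v in data]
    total = Fraction(0)
    for i in range(n):
        for j in range(n):
            cov_m = Fraction(int(i == j)) - Fraction(1, n)
            total += cov_m * a[i] * b[j]
    return total / n  # the two factors n^{-1/2} combine to 1/n

def plug_in_cov(data, x, y):
    """F_n(min{x, y}) - F_n(x) * F_n(y) as an exact fraction."""
    n = len(data)

    def f_n(t):
        return Fraction(sum(1 for v in data if v <= t), n)

    return f_n(min(x, y)) - f_n(x) * f_n(y)

sample = [0.2, 1.5, 1.5, 3.0, -0.7]  # hypothetical data, ties allowed
print(conditional_cov(sample, 1.0, 2.0), plug_in_cov(sample, 1.0, 2.0))
```

The two functions agree for every pair $(x,y)$, which is the finite-sample analogue of replacing $P$ by $P_n$ in the covariance formula above.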
[/guided]
[/step]
[step:Identify the Gaussian limit with the Brownian bridge composed with $F$]
Let $B: [0,1] \to \mathbb{R}$ be a standard Brownian bridge with continuous sample paths, meaning a centered Gaussian process with covariance
\begin{align*}
\operatorname{Cov}(B(s),B(t)) = \min\{s,t\}-st.
\end{align*}
Let $Z: \mathbb{R} \to \mathbb{R}$ be the random element of $\ell^\infty(\mathbb{R})$ whose value at $x \in \mathbb{R}$ is $Z(x):=B(F(x))$. For $x,y \in \mathbb{R}$, monotonicity of $F$ gives
\begin{align*}
\min\{F(x),F(y)\}=F(\min\{x,y\}).
\end{align*}
Therefore
\begin{align*}
\operatorname{Cov}(Z(x),Z(y)) = F(\min\{x,y\})-F(x)F(y),
\end{align*}
which is the covariance of $\mathbb{G}_P(h_x)$ and $\mathbb{G}_P(h_y)$.
It remains to identify the full laws, not only finite-dimensional distributions. Let $\mathcal{H}_{\mathbb{Q}}:=\{h_q\mid q\in\mathbb{Q}\}$. The Gaussian limit $\mathbb{G}_P$ is taken in its separable version determined by $\mathcal{H}_{\mathbb{Q}}$. For $x\in\mathbb{R}$ and rational $q_m\downarrow x$, right-continuity of $F$ gives $F(q_m)\to F(x)$, and continuity of $B$ gives $B(F(q_m))\to B(F(x))$. Hence $Z$ is also determined by its coordinates on $\mathbb{Q}$. Since the two centered Gaussian processes have equal covariances, they have identical finite-dimensional distributions on the countable determining set $\mathbb{Q}$, and so their induced Borel probability laws on the corresponding separable subspace of $\ell^\infty(\mathbb{R})$ agree. Writing $T: \ell^\infty(\mathcal{H}) \to \ell^\infty(\mathbb{R})$ for the index-change map $(Tz)(x) := z(h_x)$, it follows that $T\mathbb{G}_P$ has the same law as $B\circ F$ as a random element of $\ell^\infty(\mathbb{R})$.
[guided]
We must be slightly careful here: equality of finite-dimensional distributions on an uncountable index set does not automatically identify a Borel law on $\ell^\infty(\mathbb{R})$. The countable rational subfamily is what removes this ambiguity.
Let $B: [0,1]\to\mathbb{R}$ be a continuous standard Brownian bridge, so $B$ is centered Gaussian and
\begin{align*}
\operatorname{Cov}(B(s),B(t))=\min\{s,t\}-st.
\end{align*}
Define $Z: \mathbb{R}\to\mathbb{R}$ by $Z(x):=B(F(x))$. Then $Z$ is bounded because $B$ is continuous on the compact interval $[0,1]$. For $x,y\in\mathbb{R}$, the distribution function $F$ is nondecreasing, so
\begin{align*}
\min\{F(x),F(y)\}=F(\min\{x,y\}).
\end{align*}
Thus
\begin{align*}
\operatorname{Cov}(Z(x),Z(y))=F(\min\{x,y\})-F(x)F(y),
\end{align*}
which matches the covariance of $\mathbb{G}_P(h_x)$ and $\mathbb{G}_P(h_y)$.
Now we pass from covariance matching to equality of laws in $\ell^\infty(\mathbb{R})$. The process $\mathbb{G}_P$ was chosen as the separable Gaussian version determined by $\mathcal{H}_{\mathbb{Q}}$. For the Brownian-bridge process, take any $x\in\mathbb{R}$ and choose rational numbers $q_m\downarrow x$. Right-continuity of the distribution function gives $F(q_m)\to F(x)$, and continuity of the sample path of $B$ gives
\begin{align*}
B(F(q_m))\to B(F(x)).
\end{align*}
Therefore $Z(x)$ is determined by the values $Z(q)$ with $q\in\mathbb{Q}$. Both processes are consequently determined by their rational coordinates and take values in the same separable subspace of $\ell^\infty(\mathbb{R})$. On that countable coordinate set, matching covariances and centered Gaussianity imply that all finite-dimensional distributions match. Since the laws of countably many coordinates determine the Borel law of such a separable version, $T\mathbb{G}_P$ and $B\circ F$ have the same law in $\ell^\infty(\mathbb{R})$, where $T$ denotes the index-change map $(Tz)(x) := z(h_x)$.
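The covariance matching can be checked on a concrete example. In the sketch below, the distribution function `F` is a hypothetical choice with an atom of mass $1/2$ at $0$ and uniform mass on $(0,1]$, chosen so that the jump of $F$ is exercised; `bridge_cov` and `check_identity` are illustrative names.

```python
def bridge_cov(s, t):
    """Covariance kernel of a standard Brownian bridge on [0, 1]."""
    return min(s, t) - s * t

def F(x):
    """Hypothetical distribution function: atom of mass 1/2 at 0, uniform on (0, 1]."""
    if x < 0:
        return 0.0
    if x >= 1:
        return 1.0
    return 0.5 + 0.5 * x

def check_identity(grid):
    """Check Cov(B(F(x)), B(F(y))) == F(min{x, y}) - F(x) F(y) on the grid."""
    return all(
        bridge_cov(F(x), F(y)) == F(min(x, y)) - F(x) * F(y)
        for x in grid
        for y in grid
    )

print(check_identity([-1.0, 0.0, 0.25, 0.5, 1.0, 2.0]))
```

The check succeeds because $F$ is nondecreasing, so the minimum of $F(x)$ and $F(y)$ is $F$ evaluated at the smaller argument; no continuity of $F$ is needed for this step.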
[/guided]
[/step]
[step:Transfer the indexed convergence to $\ell^\infty(\mathbb{R})$]
Define the [linear map](/page/Linear%20Map) $T: \ell^\infty(\mathcal{H}) \to \ell^\infty(\mathbb{R})$ by
\begin{align*}
(Tz)(x) := z(h_x).
\end{align*}
For $z,w \in \ell^\infty(\mathcal{H})$,
\begin{align*}
\|Tz-Tw\|_{\ell^\infty(\mathbb{R})} \leq \|z-w\|_{\ell^\infty(\mathcal{H})},
\end{align*}
so $T$ is Lipschitz with constant $1$ and in particular continuous. Applying the [Continuous Mapping Theorem](/theorems/1847) in its conditional bounded-Lipschitz form to the continuous map $T$ and to the convergence from the previous step gives
\begin{align*}
T\mathbb{G}_n^* \xrightarrow{d} T\mathbb{G}_P
\end{align*}
conditionally in probability in $\ell^\infty(\mathbb{R})$. By the definitions of $T$ and $\mathbb{G}_n^*$,
\begin{align*}
(T\mathbb{G}_n^*)(x)=\sqrt n\{F_n^*(x)-F_n(x)\}.
\end{align*}
By the identification of the preceding step, $T\mathbb{G}_P$ has the same law as $B \circ F$. Hence
\begin{align*}
\sqrt n(F_n^*-F_n) \xrightarrow{d} B\circ F
\end{align*}
conditionally in probability as a random element of $\ell^\infty(\mathbb{R})$.
[guided]
The map $T$ only changes the index notation: it sends a function indexed by half-lines to the same values indexed by real numbers. Its continuity is exactly the hypothesis needed for the [Continuous Mapping Theorem](/theorems/1847). Since the bootstrap convergence is stated in bounded-Lipschitz distance conditionally on the data, applying the conditional version of the theorem gives the conditional convergence in probability of $T\mathbb{G}_n^*$ to $T\mathbb{G}_P$.
For each $x\in\mathbb{R}$, the definition of the bootstrap empirical process gives
\begin{align*}
(T\mathbb{G}_n^*)(x)=\mathbb{G}_n^*(h_x)=\sqrt n\{P_n^*(H_x)-P_n(H_x)\}=\sqrt n\{F_n^*(x)-F_n(x)\}.
\end{align*}
The previous step identifies the law of $T\mathbb{G}_P$ with the law of $B\circ F$ as a random element of $\ell^\infty(\mathbb{R})$. Therefore the distribution-function process itself satisfies
\begin{align*}
\sqrt n(F_n^*-F_n) \xrightarrow{d} B\circ F
\end{align*}
conditionally in probability in $\ell^\infty(\mathbb{R})$.
[/guided]
[/step]
[step:Apply the supremum functional to obtain the Kolmogorov statistic limit]
Define $S: \ell^\infty(\mathbb{R}) \to \mathbb{R}$ by
\begin{align*}
S(z) := \sup_{x \in \mathbb{R}} |z(x)|.
\end{align*}
For $z,w \in \ell^\infty(\mathbb{R})$, the [reverse triangle inequality](/theorems/2300) gives
\begin{align*}
|S(z)-S(w)| \leq \|z-w\|_{\ell^\infty(\mathbb{R})},
\end{align*}
so $S$ is Lipschitz continuous. The [Continuous Mapping Theorem](/theorems/1847), applied conditionally to $S$, yields
\begin{align*}
\sup_{x \in \mathbb{R}} \sqrt n |F_n^*(x)-F_n(x)| \xrightarrow{d} \sup_{x \in \mathbb{R}} |B(F(x))|
\end{align*}
conditionally in probability.
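The ordinary empirical-process Donsker theorem for the same pointwise measurable VC class $\mathcal{H}$ with square-integrable envelope $E=1$ gives $\mathbb{G}_n \xrightarrow{d} \mathbb{G}_P$ in $\ell^\infty(\mathcal{H})$, in the same separable empirical-process sense. Applying the continuous map $T$ and the identification of $T\mathbb{G}_P$ with $B\circ F$ then gives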
\begin{align*}
\sqrt n(F_n-F) \xrightarrow{d} B\circ F
\end{align*}
in $\ell^\infty(\mathbb{R})$. Applying the same Lipschitz functional $S$ and the [Continuous Mapping Theorem](/theorems/1847) gives
\begin{align*}
\sup_{x \in \mathbb{R}} \sqrt n |F_n(x)-F(x)| \xrightarrow{d} \sup_{x \in \mathbb{R}} |B(F(x))|.
\end{align*}
Thus the conditional distribution of the bootstrap supremum statistic converges in probability to the same limiting distribution as the ordinary empirical supremum statistic. This completes the proof.
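In practice the conditional law of the bootstrap supremum statistic is approximated by Monte Carlo. The following is a minimal sketch under stated assumptions: the function names, the number of replications, and the seed are illustrative. It uses the fact that $F_n^*-F_n$ is a right-continuous step function that jumps only at data points, so the supremum over $x\in\mathbb{R}$ is attained on the sample.

```python
import random
from bisect import bisect_right
from math import sqrt

def kolmogorov_bootstrap_stat(data, resample):
    """sqrt(n) * sup_x |F_n^*(x) - F_n(x)|, evaluated at the data points."""
    n = len(data)
    d_sorted, r_sorted = sorted(data), sorted(resample)
    return sqrt(n) * max(
        abs(bisect_right(r_sorted, x) - bisect_right(d_sorted, x)) / n
        for x in d_sorted
    )

def bootstrap_stats(data, reps, seed=0):
    """Monte Carlo draws from the conditional law of the bootstrap statistic."""
    rng = random.Random(seed)  # fixed seed, illustrative only
    n = len(data)
    return [
        kolmogorov_bootstrap_stat(data, [rng.choice(data) for _ in range(n)])
        for _ in range(reps)
    ]

draws = bootstrap_stats([float(i) for i in range(50)], reps=200)
print(sorted(draws)[int(0.95 * len(draws))])  # rough empirical 95% quantile
```

By the convergence established in this step, empirical quantiles of such draws approximate the quantiles of $\sup_{x}|B(F(x))|$ for large $n$ wherever the limiting distribution function is continuous.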
[/step]