[proofplan]
We prove Donsker convergence by verifying the two standard ingredients for [weak convergence](/page/Weak%20Convergence) of empirical processes in $\ell^\infty(\mathcal F)$: finite-dimensional convergence and asymptotic equicontinuity for the intrinsic $L^2(P)$ semimetric. The square-integrable envelope gives the required second moments and domination, while the uniform entropy integral gives the maximal inequality that forces asymptotic equicontinuity. Once these two conditions are established, the empirical-process convergence criterion yields a tight centred Gaussian limit, and the covariance is computed directly from the finite-dimensional [central limit theorem](/theorems/521).
[/proofplan]
[step:Define the empirical process and its finite-dimensional marginals]
Let $(\Omega,\mathcal G,\mathbb P)$ be the probability space on which the sample is defined, and let $X_1,X_2,\dots$ be independent measurable maps $X_i:(\Omega,\mathcal G)\to(\mathcal X,\mathcal A)$ with common law $P$, so that $X_1,\dots,X_n$ is the sample of size $n$ for each $n\in\mathbb N$. For every $P$-integrable measurable function $h:\mathcal X\to\mathbb R$, write
\begin{align*}
P h=\int_{\mathcal X} h(x)\,dP(x).
\end{align*}
Define the empirical process map $\alpha_n:\mathcal F\to\mathbb R$ by
\begin{align*}
\alpha_n(f)=\frac{1}{\sqrt n}\sum_{i=1}^n \bigl(f(X_i)-P f\bigr),\qquad f\in\mathcal F.
\end{align*}
Since $|f|\leq F_e$ and $P F_e^2<\infty$, every $f\in\mathcal F$ belongs to $L^2(P)$ and hence $P|f|<\infty$ by Cauchy-Schwarz. For fixed $f_1,\dots,f_k\in\mathcal F$, define the random vector $Z_i:\Omega\to\mathbb R^k$ by
\begin{align*}
Z_i(\omega)=\bigl(f_1(X_i(\omega))-P f_1,\dots,f_k(X_i(\omega))-P f_k\bigr),\qquad \omega\in\Omega.
\end{align*}
The vectors $Z_1,\dots,Z_n$ are independent and identically distributed because $X_1,\dots,X_n$ are independent and identically distributed. They have mean zero by the definition of $P f_j$, and they have finite second moments since $|f_j|\leq F_e$ and $P F_e^2<\infty$ for each $j\in\{1,\dots,k\}$. To prove vector convergence, fix an arbitrary vector $a=(a_1,\dots,a_k)\in\mathbb R^k$ and define the real-valued [random variable](/page/Random%20Variable) $Y_i^a:\Omega\to\mathbb R$ by
\begin{align*}
Y_i^a(\omega)=\sum_{j=1}^k a_j\bigl(f_j(X_i(\omega))-P f_j\bigr),\qquad \omega\in\Omega.
\end{align*}
The variables $Y_1^a,Y_2^a,\dots$ are independent and identically distributed, have mean zero, and have finite variance because
\begin{align*}
|Y_i^a|^2\leq \left(\sum_{j=1}^k |a_j|\,|f_j(X_i)-P f_j|\right)^2
\leq k\sum_{j=1}^k |a_j|^2 |f_j(X_i)-P f_j|^2
\end{align*}
and each summand has finite expectation. Applying the [Central Limit Theorem](/theorems/532) to $(Y_i^a)_{i\geq1}$ gives
\begin{align*}
\sum_{j=1}^k a_j\alpha_n(f_j)=\frac{1}{\sqrt n}\sum_{i=1}^n Y_i^a
\xrightarrow{d}
\mathcal N(0,a^\top\Sigma a),
\end{align*}
where the matrix $\Sigma\in\mathbb R^{k\times k}$ is defined by
\begin{align*}
\Sigma_{ij}=P(f_i f_j)-P f_i\,P f_j.
\end{align*}
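To check that the limiting variance in the one-dimensional limit is indeed $a^\top\Sigma a$, one can expand the second moment of $Y_1^a$ directly. The following is a short verification sketch; the summation index $l$ is introduced here only for the computation, and $\Sigma_{jl}$ denotes the entry of $\Sigma$ in row $j$ and column $l$.
\begin{align*}
\operatorname{Var}(Y_1^a)
&=\mathbb E\left[\left(\sum_{j=1}^k a_j\bigl(f_j(X_1)-P f_j\bigr)\right)^2\right]\\
&=\sum_{j=1}^k\sum_{l=1}^k a_j a_l\,\mathbb E\bigl[\bigl(f_j(X_1)-P f_j\bigr)\bigl(f_l(X_1)-P f_l\bigr)\bigr]\\
&=\sum_{j=1}^k\sum_{l=1}^k a_j a_l\bigl(P(f_j f_l)-P f_j\,P f_l\bigr)
=a^\top\Sigma a.
\end{align*}
The products $f_j f_l$ are $P$-integrable because $|f_j f_l|\leq F_e^2$ and $P F_e^2<\infty$.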
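Since $a\in\mathbb R^k$ was arbitrary, the Cramér-Wold characterization of convergence in distribution in finite-dimensional Euclidean spaces, applied to the one-dimensional limits obtained from the Central Limit Theorem, proves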
\begin{align*}
\bigl(\alpha_n(f_1),\dots,\alpha_n(f_k)\bigr)
\xrightarrow{d}
\mathcal N_k(0,\Sigma).
\end{align*}
Thus the finite-dimensional distributions converge to those of a centred Gaussian random map with the covariance stated in the theorem.
[/step]
[step:Use the entropy integral to obtain asymptotic equicontinuity]
Define the intrinsic semimetric $\rho_P: \mathcal F\times\mathcal F\to[0,\infty)$ by
\begin{align*}
\rho_P(f,g)=\|f-g\|_{L^2(P)}.
\end{align*}
For a finitely supported probability measure $Q$ with $0<\|F_e\|_{L^2(Q)}<\infty$, let $N(r,\mathcal H,L^2(Q))$ denote the least number of $L^2(Q)$-balls of radius $r$ needed to cover a function class $\mathcal H$. If $\{f_1,\dots,f_m\}$ is an $r$-cover of $\mathcal F$ in $L^2(Q)$, then $\{f_a-f_b:1\leq a,b\leq m\}$ is a $2r$-cover of $\mathcal F-\mathcal F$ in $L^2(Q)$: given $f,g\in\mathcal F$, choose $a$ and $b$ with $\|f-f_a\|_{L^2(Q)}<r$ and $\|g-f_b\|_{L^2(Q)}<r$, so that the triangle inequality gives
\begin{align*}
\|(f-g)-(f_a-f_b)\|_{L^2(Q)}\leq \|f-f_a\|_{L^2(Q)}+\|g-f_b\|_{L^2(Q)}<2r.
\end{align*}
Thus
\begin{align*}
N(2r,\mathcal F-\mathcal F,L^2(Q))\leq N(r,\mathcal F,L^2(Q))^2.
\end{align*}
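The class $\mathcal F-\mathcal F$ has envelope $2F_e$, and $P(2F_e)^2=4P F_e^2<\infty$. Taking $r=\varepsilon\|F_e\|_{L^2(Q)}$, so that $2r=\varepsilon\|2F_e\|_{L^2(Q)}$, the preceding covering inequality at most doubles the logarithm of the covering number at each envelope-normalized radius $\varepsilon$. Hence the uniform entropy integral of the increment class relative to the envelope $2F_e$ is finite whenever $J(1,\mathcal F)<\infty$.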
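The scaling behind this claim can be written out explicitly. The sketch below assumes the common convention $J(1,\mathcal F)=\sup_Q\int_0^1\sqrt{\log N(\varepsilon\|F_e\|_{L^2(Q)},\mathcal F,L^2(Q))}\,d\varepsilon$, with the supremum over finitely supported $Q$ satisfying $0<\|F_e\|_{L^2(Q)}<\infty$. This normalization is an assumption, since the definition of $J$ is not restated in this proof; the same argument applies if the integrand is $\sqrt{1+\log N}$. Applying the covering inequality with $r=\varepsilon\|F_e\|_{L^2(Q)}$, so that $2r=\varepsilon\|2F_e\|_{L^2(Q)}$, gives for each such $Q$
\begin{align*}
\int_0^1\sqrt{\log N\bigl(\varepsilon\|2F_e\|_{L^2(Q)},\mathcal F-\mathcal F,L^2(Q)\bigr)}\,d\varepsilon
&\leq\int_0^1\sqrt{2\log N\bigl(\varepsilon\|F_e\|_{L^2(Q)},\mathcal F,L^2(Q)\bigr)}\,d\varepsilon\\
&\leq\sqrt2\,J(1,\mathcal F).
\end{align*}
Taking the supremum over $Q$ bounds the uniform entropy integral of $\mathcal F-\mathcal F$ relative to $2F_e$ by $\sqrt2\,J(1,\mathcal F)$ under this convention.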
For $\delta>0$, define the localized increment class $\mathcal H_\delta$ by
\begin{align*}
\mathcal H_\delta=\{f-g:f,g\in\mathcal F,\ \rho_P(f,g)<\delta\}.
\end{align*}
This class has envelope $2F_e$ and satisfies
\begin{align*}
\sup_{h\in\mathcal H_\delta}\|h\|_{L^2(P)}\leq \delta.
\end{align*}
We use the uniform entropy maximal inequality for pointwise measurable classes, in the following form: if $\mathcal H$ is pointwise measurable, has envelope $H_e$ with $P H_e^2<\infty$, and has finite uniform entropy integral relative to $H_e$, then the expected suprema of the empirical process over the local classes $\{h\in\mathcal H:\|h\|_{L^2(P)}<\eta\}$ satisfy
\begin{align*}
\lim_{\eta\downarrow0}\limsup_{n\to\infty}
\mathbb E\left[\sup\left\{\left|\frac{1}{\sqrt n}\sum_{i=1}^n\bigl(h(X_i)-P h\bigr)\right|:h\in\mathcal H,\ \|h\|_{L^2(P)}<\eta\right\}\right]=0.
\end{align*}
The pointwise measurability hypothesis is inherited by $\mathcal F-\mathcal F$ from $\mathcal F$: if $\mathcal F_0$ is a countable pointwise-dense subclass of $\mathcal F$, then the countable class $\mathcal F_0-\mathcal F_0$ is pointwise dense in $\mathcal F-\mathcal F$. The envelope condition required by the maximal inequality is exactly $P(2F_e)^2<\infty$. The inequality is strictly weaker than the theorem being proved: it gives only a localized maximal expectation bound for increments, and it does not assert weak convergence, Gaussian limits, or the Donsker property.
We apply this maximal inequality to $\mathcal H=\mathcal F-\mathcal F$ with envelope $H_e=2F_e$. The preceding covering calculation verifies the finite entropy hypothesis, the inherited pointwise measurability verifies the measurability hypothesis, and $P(2F_e)^2<\infty$ verifies the envelope hypothesis. Since $\mathcal H_\delta=\{h\in\mathcal F-\mathcal F:\|h\|_{L^2(P)}<\delta\}$, taking $\eta=\delta$ gives
\begin{align*}
\lim_{\delta\downarrow0}\limsup_{n\to\infty}
\mathbb E\left[\sup_{h\in\mathcal H_\delta}\left|\frac{1}{\sqrt n}\sum_{i=1}^n\bigl(h(X_i)-P h\bigr)\right|\right]=0.
\end{align*}
For $h=f-g$, the displayed empirical process equals $\alpha_n(f)-\alpha_n(g)$. Markov's inequality then gives, for every $\varepsilon>0$,
\begin{align*}
\lim_{\delta\downarrow0}\limsup_{n\to\infty}
\mathbb P\left(\sup_{\rho_P(f,g)<\delta}|\alpha_n(f)-\alpha_n(g)|>\varepsilon\right)=0.
\end{align*}
[guided]
We need asymptotic equicontinuity, so the right object is not the original class alone but the class of local increments. For $\delta>0$, define
\begin{align*}
\mathcal H_\delta=\{f-g:f,g\in\mathcal F,\ \rho_P(f,g)<\delta\}.
\end{align*}
If $h=f-g\in\mathcal H_\delta$, then $|h|\leq |f|+|g|\leq 2F_e$, so $2F_e$ is an envelope for every localized increment class. Also
\begin{align*}
\|h\|_{L^2(P)}=\|f-g\|_{L^2(P)}=\rho_P(f,g)<\delta,
\end{align*}
and hence
\begin{align*}
\sup_{h\in\mathcal H_\delta}\|h\|_{L^2(P)}\leq\delta.
\end{align*}
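Next comes the covering estimate for the increment class, which holds for every finitely supported probability measure $Q$ with $0<\|F_e\|_{L^2(Q)}<\infty$. If $\{f_1,\dots,f_m\}$ is an $r$-cover of $\mathcal F$ in $L^2(Q)$, then for any $f,g\in\mathcal F$ we can choose $a$ and $b$ with $\|f-f_a\|_{L^2(Q)}<r$ and $\|g-f_b\|_{L^2(Q)}<r$. The increment $f-g$ is then within $2r$ of $f_a-f_b$, because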
\begin{align*}
\|(f-g)-(f_a-f_b)\|_{L^2(Q)}\leq \|f-f_a\|_{L^2(Q)}+\|g-f_b\|_{L^2(Q)}<2r.
\end{align*}
Therefore
\begin{align*}
N(2r,\mathcal F-\mathcal F,L^2(Q))\leq N(r,\mathcal F,L^2(Q))^2.
\end{align*}
Together with the envelope $2F_e$ and the identity
\begin{align*}
P(2F_e)^2=4P F_e^2<\infty,
\end{align*}
this proves that the increment class has finite uniform entropy integral whenever $J(1,\mathcal F)<\infty$.
Now we invoke only a maximal inequality, not the Donsker conclusion we are proving. The uniform entropy maximal inequality says that a pointwise measurable class $\mathcal H$ with envelope $H_e$, finite uniform entropy integral relative to $H_e$, and $P H_e^2<\infty$ satisfies the following local expectation bound over all $h\in\mathcal H$ with $\|h\|_{L^2(P)}<\eta$:
\begin{align*}
\lim_{\eta\downarrow0}\limsup_{n\to\infty}
\mathbb E\left[\sup\left\{\left|\frac{1}{\sqrt n}\sum_{i=1}^n\bigl(h(X_i)-P h\bigr)\right|:h\in\mathcal H,\ \|h\|_{L^2(P)}<\eta\right\}\right]=0.
\end{align*}
This is not circular: it is an expectation estimate for suprema over small $L^2(P)$ balls, and it does not assert weak convergence in $\ell^\infty(\mathcal F)$ or the existence of a Gaussian limit.
Apply the inequality with $\mathcal H=\mathcal F-\mathcal F$ and $H_e=2F_e$. Since every $h\in\mathcal H_\delta$ has $\|h\|_{L^2(P)}<\delta$, we obtain
\begin{align*}
\lim_{\delta\downarrow0}\limsup_{n\to\infty}
\mathbb E\left[\sup_{h\in\mathcal H_\delta}\left|\frac{1}{\sqrt n}\sum_{i=1}^n\bigl(h(X_i)-P h\bigr)\right|\right]=0.
\end{align*}
For $h=f-g$, linearity of the empirical process gives
\begin{align*}
\frac{1}{\sqrt n}\sum_{i=1}^n\bigl(h(X_i)-P h\bigr)=\alpha_n(f)-\alpha_n(g).
\end{align*}
Thus Markov's inequality yields, for every $\varepsilon>0$,
\begin{align*}
\mathbb P\left(\sup_{\rho_P(f,g)<\delta}|\alpha_n(f)-\alpha_n(g)|>\varepsilon\right)
\leq \frac{1}{\varepsilon}\mathbb E\left[\sup_{h\in\mathcal H_\delta}\left|\frac{1}{\sqrt n}\sum_{i=1}^n\bigl(h(X_i)-P h\bigr)\right|\right].
\end{align*}
Taking $\limsup_{n\to\infty}$ and then letting $\delta\downarrow0$ proves
\begin{align*}
\lim_{\delta\downarrow0}\limsup_{n\to\infty}
\mathbb P\left(\sup_{\rho_P(f,g)<\delta}|\alpha_n(f)-\alpha_n(g)|>\varepsilon\right)=0.
\end{align*}
[/guided]
[/step]
[step:Combine finite-dimensional convergence with the empirical-process convergence criterion]
We now use the empirical-process convergence criterion, which is distinct from the entropy sufficient condition being proved. In this context, asymptotic measurability means that for every bounded continuous functional $\Phi:\ell^\infty(\mathcal F)\to\mathbb R$, the difference between the outer and inner expectations of $\Phi(\alpha_n)$ tends to zero as $n\to\infty$. The criterion states that a pointwise measurable empirical process indexed by a semimetric class $(\mathcal F,\rho_P)$ converges weakly in $\ell^\infty(\mathcal F)$ to a tight limit if its finite-dimensional distributions converge, it is asymptotically uniformly equicontinuous in $\rho_P$, and the quotient of $(\mathcal F,\rho_P)$ by the relation $\rho_P(f,g)=0$ is totally bounded.
We verify these hypotheses. Pointwise measurability supplies the asymptotic measurability and separability requirement through a countable pointwise-dense subclass. The finite entropy integral supplies [total boundedness](/page/Total%20Boundedness) of the quotient of $(\mathcal F,\rho_P)$, because the covering-number bound that holds uniformly over finitely supported measures transfers to $L^2(P)$ through the standard approximation of $P$ by finitely supported measures used in the uniform-entropy criterion. The remaining two hypotheses come from the earlier steps.
The first step gives convergence of every finite-dimensional marginal of $\alpha_n$ to a centred Gaussian vector with covariance matrix
\begin{align*}
\Sigma_{ij}=P(f_i f_j)-P f_i\,P f_j.
\end{align*}
The preceding step proves the required asymptotic uniform equicontinuity:
\begin{align*}
\lim_{\delta\downarrow 0}\limsup_{n\to\infty}
\mathbb P\left(\sup_{\rho_P(f,g)<\delta}|\alpha_n(f)-\alpha_n(g)|>\varepsilon\right)=0
\end{align*}
for every $\varepsilon>0$. The convergence criterion therefore yields weak convergence in $\ell^\infty(\mathcal F)$ to a tight centred Gaussian random map $G_P:\mathcal F\to\mathbb R$. Its covariance is determined by the displayed finite-dimensional limits, namely
\begin{align*}
\operatorname{Cov}(G_P(f),G_P(g))=P(fg)-P f\,P g.
\end{align*}
This is precisely the asserted $P$-Donsker property of $\mathcal F$.
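As an illustration that is not part of the source argument, consider the hypothetical choice $\mathcal X=\mathbb R$ and $\mathcal F=\{\mathbf 1_{(-\infty,t]}:t\in\mathbb R\}$, a standard example with envelope $1$, and assume it satisfies the hypotheses of the theorem. Write $F_P(t)=P(-\infty,t]$ for the distribution function of $P$; the symbol $F_P$ is introduced here only for this example. For $f=\mathbf 1_{(-\infty,s]}$ and $g=\mathbf 1_{(-\infty,t]}$, the product $fg$ is the indicator of $(-\infty,s\wedge t]$, so the covariance formula reads
\begin{align*}
\operatorname{Cov}(G_P(f),G_P(g))=P(fg)-P f\,P g=F_P(s\wedge t)-F_P(s)\,F_P(t).
\end{align*}
This is the covariance of a Brownian bridge evaluated along $F_P$, as in the classical Donsker theorem for empirical distribution functions.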
[/step]