[guided]The finite bracketing cover turns the supremum over the possibly infinite class $\mathcal F$ into a maximum over finitely many endpoint functions. Fix $\eta>0$, and choose measurable $P$-integrable functions
\begin{align*}
l_1,u_1,\dots,l_m,u_m:S\to\mathbb R
\end{align*}
such that $l_j\le u_j$ pointwise on $S$, every $f\in\mathcal F$ lies in at least one bracket $[l_j,u_j]$, that is, $l_j\le f\le u_j$ pointwise for some $j$, and
\begin{align*}
\int_S |u_j(x)-l_j(x)|\,dP(x)=\int_S (u_j(x)-l_j(x))\,dP(x)\le \eta
\end{align*}
for each $j$. We write
\begin{align*}
\mathcal H_\eta:=\{l_1,u_1,\dots,l_m,u_m\}.
\end{align*}
This set is finite, and every element of it is integrable with respect to $P$.
Now fix $f\in\mathcal F$. Choose $j\in\{1,\dots,m\}$ such that
\begin{align*}
l_j(x)\le f(x)\le u_j(x)
\end{align*}
for every $x\in S$. The empirical measure $P_n$ is positive: if $a:S\to\mathbb R$ and $b:S\to\mathbb R$ are measurable and $a\le b$ pointwise, then
\begin{align*}
P_na=\frac{1}{n}\sum_{i=1}^{n}a(X_i)\le \frac{1}{n}\sum_{i=1}^{n}b(X_i)=P_nb.
\end{align*}
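The measure $P$ is also positive, so pointwise inequalities between integrable functions may be integrated with respect to $P$. Since $|f|\le |l_j|+|u_j|$ and both endpoints are $P$-integrable, $Pf$ is finite whenever $f$ is measurable. Hence $f\le u_j$ gives $P_nf\le P_nu_j$, and $l_j\le f$ gives $Pl_j\le Pf$. Combining these two inequalities,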
\begin{align*}
P_nf-Pf\le P_nu_j-Pl_j.
\end{align*}
We split the right-hand side into an empirical fluctuation term and a deterministic bracket-width term:
\begin{align*}
P_nu_j-Pl_j=(P_n-P)u_j+P(u_j-l_j).
\end{align*}
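As a quick check of this split, which uses only the linearity of $P_n$ and $P$ and the finiteness of $Pu_j$, one can expand the right-hand side:
\begin{align*}
(P_n-P)u_j+P(u_j-l_j)&=P_nu_j-Pu_j+Pu_j-Pl_j\\
&=P_nu_j-Pl_j.
\end{align*}
The cancellation of $Pu_j$ is legitimate because $u_j$ is $P$-integrable. Only $(P_n-P)u_j$ depends on the sample; $P(u_j-l_j)$ is a fixed number.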
The first term is bounded by the largest endpoint deviation,
\begin{align*}
(P_n-P)u_j\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|,
\end{align*}
and the second term is bounded by $\eta$ because the bracket width is at most $\eta$. Thus
\begin{align*}
P_nf-Pf\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta.
\end{align*}
The lower deviation is controlled by the same bracket. Since $f\le u_j$ gives $Pf\le Pu_j$ and $l_j\le f$ gives $P_nl_j\le P_nf$, we get
\begin{align*}
Pf-P_nf\le Pu_j-P_nl_j.
\end{align*}
Again split the right-hand side:
\begin{align*}
Pu_j-P_nl_j=(P-P_n)l_j+P(u_j-l_j).
\end{align*}
Since $l_j\in\mathcal H_\eta$, the endpoint fluctuation satisfies $(P-P_n)l_j\le |(P_n-P)l_j|\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|$, and the bracket width is bounded by $\eta$, so
\begin{align*}
Pf-P_nf\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta.
\end{align*}
Since $|a|=\max\{a,-a\}$ for every real $a$, the two one-sided estimates imply
\begin{align*}
|(P_n-P)f|\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta.
\end{align*}
Because the right-hand side does not depend on $f$ and the bound holds for every $f\in\mathcal F$, taking the supremum over $f$ gives
\begin{align*}
\sup_{f\in\mathcal F}|(P_n-P)f|\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta.
\end{align*}
This is the key reduction: the possibly nonseparable class $\mathcal F$ appears only through the deterministic bracket width, while the random part involves finitely many integrable functions.[/guided]
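As an illustration only, under assumptions not made in the argument above, take $S=[0,1]$, let $P$ be the uniform distribution on $[0,1]$, and let $\mathcal F=\{\mathbf 1_{[0,t]}:t\in[0,1]\}$, where $\mathbf 1_A$ denotes the indicator of $A$. Set $m=\lceil 1/\eta\rceil$ and $t_j=j/m$ for $j=0,\dots,m$. One possible bracketing cover is
\begin{align*}
l_j=\mathbf 1_{[0,t_{j-1}]},\qquad u_j=\mathbf 1_{[0,t_j]},\qquad j=1,\dots,m,
\end{align*}
since $l_j\le \mathbf 1_{[0,t]}\le u_j$ pointwise whenever $t_{j-1}\le t\le t_j$, and
\begin{align*}
P(u_j-l_j)=P\bigl((t_{j-1},t_j]\bigr)=\frac{1}{m}\le \eta.
\end{align*}
Writing $F_n(t):=P_n\mathbf 1_{[0,t]}$ for the empirical distribution function, so that $(P_n-P)\mathbf 1_{[0,t]}=F_n(t)-t$, the reduction in this special case reads
\begin{align*}
\sup_{t\in[0,1]}|F_n(t)-t|\le \max_{0\le j\le m}\Bigl|F_n\Bigl(\frac{j}{m}\Bigr)-\frac{j}{m}\Bigr|+\eta.
\end{align*}
The supremum over uncountably many thresholds $t$ is thus controlled by the deviations at the $m+1$ grid points $t_0,\dots,t_m$.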