[proofplan]
We prove the first one-sided deviation bound by applying one-sided symmetrisation to the expected supremum and then applying [McDiarmid's bounded differences inequality](/theorems/6072) to the sample-dependent supremum. Countability of $\mathcal F$ makes the relevant suprema measurable, and boundedness of the functions gives the bounded-differences constant $(b-a)/n$ in each coordinate. The second bound follows by applying the same argument to the reflected class $-\mathcal F$ and using Rademacher symmetry to identify its expected Rademacher complexity with $\mathfrak R_n(\mathcal F)$.
[/proofplan]
[step:Verify measurability and finiteness of the upper deviation supremum]
Define the product measurable space
\begin{align*}
(S^n,\mathcal E^{\otimes n})
\end{align*}
and define the sample map $X_{1:n}:(\Omega,\mathcal A)\to(S^n,\mathcal E^{\otimes n})$ by
\begin{align*}
X_{1:n}(\omega):=(X_1(\omega),\dots,X_n(\omega)).
\end{align*}
For each $f\in\mathcal F$, define
\begin{align*}
T_f:S^n\to\mathbb R,\qquad (x_1,\dots,x_n)\mapsto P f-\frac{1}{n}\sum_{i=1}^{n}f(x_i).
\end{align*}
Since $f$ is $\mathcal E$-measurable and bounded, $P f$ is finite and $T_f$ is $\mathcal E^{\otimes n}$-measurable. Because $\mathcal F$ is countable, the map $z:S^n\to\mathbb R$ defined by
\begin{align*}
z(x_1,\dots,x_n):=\sup_{f\in\mathcal F}T_f(x_1,\dots,x_n)
\end{align*}
is measurable as the pointwise supremum of a countable family of measurable real-valued maps. Moreover, for every $(x_1,\dots,x_n)\in S^n$ and every $f\in\mathcal F$,
\begin{align*}
a\le P f\le b
\end{align*}
and
\begin{align*}
a\le \frac{1}{n}\sum_{i=1}^{n}f(x_i)\le b,
\end{align*}
so
\begin{align*}
-(b-a)\le T_f(x_1,\dots,x_n)\le b-a.
\end{align*}
Thus $z$ is finite and bounded. Finally,
\begin{align*}
Z:=z(X_{1:n})=\sup_{f\in\mathcal F}(P f-P_n f)
\end{align*}
is a bounded real-valued [random variable](/page/Random%20Variable).
[guided]
We first isolate the random quantity to which concentration will be applied. The concentration inequality is a statement about a [measurable function](/page/Measurable%20Function) of independent inputs, so we must define that deterministic function and check measurability.
Let $X_{1:n}:(\Omega,\mathcal A)\to(S^n,\mathcal E^{\otimes n})$ be the vector of observations, defined by
\begin{align*}
X_{1:n}(\omega):=(X_1(\omega),\dots,X_n(\omega)).
\end{align*}
For each $f\in\mathcal F$, define $T_f:S^n\to\mathbb R$ by
\begin{align*}
T_f(x_1,\dots,x_n):=P f-\frac{1}{n}\sum_{i=1}^{n}f(x_i).
\end{align*}
The integral $P f=\int_S f(x)\,dP(x)$ is finite because $a\le f\le b$. The map $(x_1,\dots,x_n)\mapsto f(x_i)$ is $\mathcal E^{\otimes n}$-measurable for each coordinate $i$, hence $T_f$ is measurable as a finite linear combination of measurable real-valued functions plus the constant $P f$.
Now define $z:S^n\to\mathbb R$ by
\begin{align*}
z(x_1,\dots,x_n):=\sup_{f\in\mathcal F}T_f(x_1,\dots,x_n).
\end{align*}
The countability of $\mathcal F$ is used exactly here: a countable supremum of measurable real-valued functions is measurable. Without countability, this step would require an outer-probability or separability convention.
The function $z$ is also bounded. Since each $f$ takes values in $[a,b]$, the integral average $P f$ lies in $[a,b]$, and every empirical average $\frac{1}{n}\sum_{i=1}^{n}f(x_i)$ also lies in $[a,b]$. Therefore, for every $f\in\mathcal F$ and every sample point $(x_1,\dots,x_n)\in S^n$,
\begin{align*}
-(b-a)\le P f-\frac{1}{n}\sum_{i=1}^{n}f(x_i)\le b-a.
\end{align*}
Taking the supremum over $f\in\mathcal F$ preserves finiteness, so $z$ is a bounded measurable map. Consequently
\begin{align*}
Z:=z(X_{1:n})=\sup_{f\in\mathcal F}(P f-P_n f)
\end{align*}
is a bounded real-valued random variable.
[/guided]
[/step]
[step:Bound the expectation by one-sided symmetrisation]
The class $\mathcal F$ is nonempty and countable, and each $f\in\mathcal F$ is measurable and $P$-integrable because $a\le f\le b$. Hence the hypotheses of [citetheorem:9824] apply to $\mathcal F$. Therefore
\begin{align*}
\mathbb E[Z]\le 2\mathbb E\left[\sup_{f\in\mathcal F}\frac{1}{n}\sum_{i=1}^{n}\varepsilon_i f(X_i)\right].
\end{align*}
By the definition of $\mathfrak R_n(\mathcal F)$, this gives
\begin{align*}
\mathbb E[Z]\le 2\mathfrak R_n(\mathcal F).
\end{align*}
[/step]
[step:Check the bounded differences constants for the upper deviation]
Let $j\in\{1,\dots,n\}$. Let $x=(x_1,\dots,x_n)\in S^n$ and $x'=(x_1',\dots,x_n')\in S^n$ satisfy $x_i=x_i'$ for every $i\ne j$. For each $f\in\mathcal F$,
\begin{align*}
T_f(x)-T_f(x')=-\frac{1}{n}\bigl(f(x_j)-f(x_j')\bigr).
\end{align*}
Since $f(S)\subset[a,b]$, we have
\begin{align*}
|T_f(x)-T_f(x')|\le \frac{b-a}{n}.
\end{align*}
Taking suprema preserves this Lipschitz bound in the following sense: for all $f\in\mathcal F$,
\begin{align*}
T_f(x)\le T_f(x')+\frac{b-a}{n}\le z(x')+\frac{b-a}{n}.
\end{align*}
Taking the supremum over $f\in\mathcal F$ gives
\begin{align*}
z(x)\le z(x')+\frac{b-a}{n}.
\end{align*}
Interchanging $x$ and $x'$ gives
\begin{align*}
z(x')\le z(x)+\frac{b-a}{n}.
\end{align*}
Thus
\begin{align*}
|z(x)-z(x')|\le \frac{b-a}{n}.
\end{align*}
So $z$ has bounded differences with coordinate constants
\begin{align*}
c_j:=\frac{b-a}{n}
\end{align*}
for $j=1,\dots,n$.
[/step]
[step:Apply McDiarmid's inequality to obtain the upper deviation bound]
The random variables $X_1,\dots,X_n$ are independent, and the measurable function $z:S^n\to\mathbb R$ has bounded differences constants $c_j=(b-a)/n$. By McDiarmid's bounded differences inequality, for every $t>0$,
\begin{align*}
\mathbb P\bigl(Z-\mathbb E[Z]\ge t\bigr)\le \exp\left(-\frac{2t^2}{\sum_{j=1}^{n}c_j^2}\right).
\end{align*}
Since
\begin{align*}
\sum_{j=1}^{n}c_j^2=n\left(\frac{b-a}{n}\right)^2=\frac{(b-a)^2}{n},
\end{align*}
we obtain
\begin{align*}
\mathbb P\bigl(Z-\mathbb E[Z]\ge t\bigr)\le \exp\left(-\frac{2nt^2}{(b-a)^2}\right).
\end{align*}
Set
\begin{align*}
t_\delta:=(b-a)\sqrt{\frac{\log(1/\delta)}{2n}}.
\end{align*}
Because $\delta\in(0,1)$, $t_\delta>0$, and substitution gives
\begin{align*}
\mathbb P\bigl(Z-\mathbb E[Z]\ge t_\delta\bigr)\le \delta.
\end{align*}
Therefore, with probability at least $1-\delta$,
\begin{align*}
Z\le \mathbb E[Z]+t_\delta.
\end{align*}
Combining this with $\mathbb E[Z]\le 2\mathfrak R_n(\mathcal F)$ yields
\begin{align*}
\sup_{f\in\mathcal F}(P f-P_n f)\le 2\mathfrak R_n(\mathcal F)+(b-a)\sqrt{\frac{\log(1/\delta)}{2n}}.
\end{align*}
[guided]
We now convert the expectation estimate into a high-probability estimate. The correct concentration tool is McDiarmid's bounded differences inequality, because $Z$ is a function of the independent inputs $X_1,\dots,X_n$ and changing one input changes every empirical average by at most $(b-a)/n$.
The hypotheses required for McDiarmid's inequality are satisfied. First, $X_1,\dots,X_n$ are independent by assumption. Second, the map $z:S^n\to\mathbb R$ is measurable by the first step. Third, the previous step proved that if two samples differ only in coordinate $j$, then
\begin{align*}
|z(x)-z(x')|\le \frac{b-a}{n}.
\end{align*}
Thus the bounded differences constants are
\begin{align*}
c_j:=\frac{b-a}{n}
\end{align*}
for $j=1,\dots,n$.
McDiarmid's bounded differences inequality gives, for every $t>0$,
\begin{align*}
\mathbb P\bigl(Z-\mathbb E[Z]\ge t\bigr)\le \exp\left(-\frac{2t^2}{\sum_{j=1}^{n}c_j^2}\right).
\end{align*}
The denominator is explicit:
\begin{align*}
\sum_{j=1}^{n}c_j^2=n\left(\frac{b-a}{n}\right)^2=\frac{(b-a)^2}{n}.
\end{align*}
Therefore
\begin{align*}
\mathbb P\bigl(Z-\mathbb E[Z]\ge t\bigr)\le \exp\left(-\frac{2nt^2}{(b-a)^2}\right).
\end{align*}
To make the right-hand side equal to $\delta$, choose
\begin{align*}
t_\delta:=(b-a)\sqrt{\frac{\log(1/\delta)}{2n}}.
\end{align*}
Since $\delta\in(0,1)$, $\log(1/\delta)>0$, so $t_\delta>0$. Substituting this value gives
\begin{align*}
\mathbb P\bigl(Z-\mathbb E[Z]\ge t_\delta\bigr)\le \delta.
\end{align*}
Equivalently, with probability at least $1-\delta$,
\begin{align*}
Z\le \mathbb E[Z]+t_\delta.
\end{align*}
Finally, the symmetrisation step gave
\begin{align*}
\mathbb E[Z]\le 2\mathfrak R_n(\mathcal F).
\end{align*}
Since $Z=\sup_{f\in\mathcal F}(P f-P_n f)$, we conclude
\begin{align*}
\sup_{f\in\mathcal F}(P f-P_n f)\le 2\mathfrak R_n(\mathcal F)+(b-a)\sqrt{\frac{\log(1/\delta)}{2n}}.
\end{align*}
[/guided]
[/step]
[step:Reflect the class to prove the lower deviation bound]
Define the reflected class
\begin{align*}
-\mathcal F:=\{-f:f\in\mathcal F\}.
\end{align*}
Each $g\in-\mathcal F$ is a measurable function $g:S\to[-b,-a]$, and $-\mathcal F$ is nonempty and countable. Applying the already proved upper deviation bound to $-\mathcal F$ gives, with probability at least $1-\delta$,
\begin{align*}
\sup_{g\in-\mathcal F}(P g-P_n g)\le 2\mathfrak R_n(-\mathcal F)+(b-a)\sqrt{\frac{\log(1/\delta)}{2n}}.
\end{align*}
For $g=-f$,
\begin{align*}
P g-P_n g=P(-f)-P_n(-f)=P_n f-P f.
\end{align*}
Hence
\begin{align*}
\sup_{g\in-\mathcal F}(P g-P_n g)=\sup_{f\in\mathcal F}(P_n f-P f).
\end{align*}
It remains to identify the Rademacher complexities. Conditional on $X_1,\dots,X_n$, the vector $(-\varepsilon_1,\dots,-\varepsilon_n)$ has the same distribution as $(\varepsilon_1,\dots,\varepsilon_n)$. Therefore
\begin{align*}
\mathfrak R_n(-\mathcal F)=\mathbb E\left[\sup_{f\in\mathcal F}\frac{1}{n}\sum_{i=1}^{n}\varepsilon_i(-f)(X_i)\right].
\end{align*}
By Rademacher symmetry,
\begin{align*}
\mathfrak R_n(-\mathcal F)=\mathbb E\left[\sup_{f\in\mathcal F}\frac{1}{n}\sum_{i=1}^{n}\varepsilon_i f(X_i)\right]=\mathfrak R_n(\mathcal F).
\end{align*}
Substitution yields, with probability at least $1-\delta$,
\begin{align*}
\sup_{f\in\mathcal F}(P_n f-P f)\le 2\mathfrak R_n(\mathcal F)+(b-a)\sqrt{\frac{\log(1/\delta)}{2n}}.
\end{align*}
This proves both asserted inequalities.
[/step]