Bracketing Glivenko-Cantelli Criterion — Statement & Proof

Bracketing Glivenko-Cantelli Criterion (Theorem # 9835)

Theorem

Proof

[proofplan] We first use the bracketing assumption to show that each $f\in\mathcal F$ is squeezed between two $P$-integrable bracket endpoints and is therefore itself integrable. Then we fix a finite bracket cover and bound the uniform empirical deviation over $\mathcal F$ by the maximum empirical deviation over the finitely many bracket endpoints plus the deterministic bracket width. The finite maximum is controlled by the ordinary [strong law of large numbers](/theorems/520) applied to the finitely many integrable endpoint functions. Finally, because the bracket width can be made arbitrarily small, the uniform deviation converges to zero, with outer probability used when the supremum is not measurable. [/proofplan]
[step:Show that every function in the class is integrable] Fix $f\in\mathcal F$. Apply the bracketing hypothesis with $\varepsilon=1$. There exist $m\in\mathbb N$ and measurable $P$-integrable functions $l_1,u_1,\dots,l_m,u_m:S\to\mathbb R$ such that the brackets $[l_1,u_1],\dots,[l_m,u_m]$ cover $\mathcal F$ and have $L^1(P)$ width at most $1$. Hence for some $j\in\{1,\dots,m\}$, \begin{align*} l_j(x)\le f(x)\le u_j(x) \end{align*} for every $x\in S$. Since $f$ is measurable by hypothesis, it remains only to prove $P|f|<\infty$. The pointwise order gives \begin{align*} |f(x)|\le |l_j(x)|+|u_j(x)| \end{align*} for every $x\in S$. Indeed, if $f(x)\ge 0$, then $|f(x)|=f(x)\le u_j(x)\le |u_j(x)|$, while if $f(x)<0$, then $|f(x)|=-f(x)\le -l_j(x)\le |l_j(x)|$. Integrating with respect to $P$ gives \begin{align*} \int_S |f(x)|\,dP(x)\le \int_S |l_j(x)|\,dP(x)+\int_S |u_j(x)|\,dP(x)<\infty. \end{align*} Thus $f$ is $P$-integrable. [/step]
[step:Bound each empirical deviation by endpoint deviations and bracket width] Fix $\eta>0$. Choose a finite bracketing cover \begin{align*} [l_1,u_1],\dots,[l_m,u_m] \end{align*} as in the hypothesis, with $l_j\le u_j$ pointwise on $S$ and \begin{align*} P(u_j-l_j)=\int_S (u_j(x)-l_j(x))\,dP(x)\le\eta \end{align*} for every $j\in\{1,\dots,m\}$.
Because $l_j\le u_j$ pointwise on $S$, as the theorem statement requires, this integral coincides with the $L^1(P)$ width $\int_S |u_j(x)-l_j(x)|\,dP(x)$. Define the finite endpoint set \begin{align*} \mathcal H_\eta:=\{l_1,u_1,\dots,l_m,u_m\}. \end{align*} For any $f\in\mathcal F$, choose $j\in\{1,\dots,m\}$ such that $l_j\le f\le u_j$ pointwise on $S$. Since $P_n$ and $P$ act as positive linear functionals on $P$-integrable [measurable functions](/page/Measurable%20Functions), the inequalities $f\le u_j$ and $l_j\le f$ imply \begin{align*} P_nf-Pf\le P_nu_j-Pl_j. \end{align*} Rewriting the right-hand side gives \begin{align*} P_nu_j-Pl_j=(P_n-P)u_j+P(u_j-l_j)\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta. \end{align*} Similarly, \begin{align*} Pf-P_nf\le Pu_j-P_nl_j=(P-P_n)l_j+P(u_j-l_j)\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta. \end{align*} Therefore \begin{align*} |(P_n-P)f|\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta. \end{align*} Taking the supremum over $f\in\mathcal F$ yields \begin{align*} \sup_{f\in\mathcal F}|(P_n-P)f|\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta. \end{align*}
[guided] The finite bracketing cover turns an infinite supremum over $\mathcal F$ into a finite maximum over endpoint functions. Fix $\eta>0$, and choose measurable $P$-integrable functions \begin{align*} l_1,u_1,\dots,l_m,u_m:S\to\mathbb R \end{align*} such that every $f\in\mathcal F$ lies in at least one bracket, $l_j\le u_j$ pointwise on $S$, and \begin{align*} \int_S |u_j(x)-l_j(x)|\,dP(x)=\int_S (u_j(x)-l_j(x))\,dP(x)\le \eta \end{align*} for each $j$. We write \begin{align*} \mathcal H_\eta:=\{l_1,u_1,\dots,l_m,u_m\}. \end{align*} This set is finite, and every element of it is integrable with respect to $P$. Now fix $f\in\mathcal F$. Choose $j\in\{1,\dots,m\}$ such that \begin{align*} l_j(x)\le f(x)\le u_j(x) \end{align*} for every $x\in S$.
The empirical measure $P_n$ is positive: if $a:S\to\mathbb R$ and $b:S\to\mathbb R$ are measurable and $a\le b$ pointwise, then \begin{align*} P_na=\frac{1}{n}\sum_{i=1}^{n}a(X_i)\le \frac{1}{n}\sum_{i=1}^{n}b(X_i)=P_nb. \end{align*} The measure $P$ is also positive, so pointwise inequalities may be integrated with respect to $P$. Hence $f\le u_j$ gives $P_nf\le P_nu_j$, and $l_j\le f$ gives $Pl_j\le Pf$. Combining these two inequalities, \begin{align*} P_nf-Pf\le P_nu_j-Pl_j. \end{align*} We split the right-hand side into an empirical fluctuation term and a deterministic bracket-width term: \begin{align*} P_nu_j-Pl_j=(P_n-P)u_j+P(u_j-l_j). \end{align*} The first term is bounded by the largest endpoint deviation, \begin{align*} (P_n-P)u_j\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|, \end{align*} and the second term is bounded by $\eta$ because the bracket width is at most $\eta$. Thus \begin{align*} P_nf-Pf\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta. \end{align*} The lower tail is controlled by the same bracket. Since $f\le u_j$ gives $Pf\le Pu_j$ and $l_j\le f$ gives $P_nl_j\le P_nf$, we get \begin{align*} Pf-P_nf\le Pu_j-P_nl_j. \end{align*} Again split the right-hand side: \begin{align*} Pu_j-P_nl_j=(P-P_n)l_j+P(u_j-l_j). \end{align*} The endpoint fluctuation is bounded by the same finite maximum, and the bracket width is bounded by $\eta$, so \begin{align*} Pf-P_nf\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta. \end{align*} The two one-sided estimates imply \begin{align*} |(P_n-P)f|\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta. \end{align*} Because this bound holds for every $f\in\mathcal F$, taking the supremum over $f$ gives \begin{align*} \sup_{f\in\mathcal F}|(P_n-P)f|\le \max_{h\in\mathcal H_\eta}|(P_n-P)h|+\eta. \end{align*} This is the key reduction: the possibly nonseparable class $\mathcal F$ appears only through the deterministic bracket width, while the random part involves finitely many integrable functions. 
[/guided] [/step]
[step:Apply the strong law to the finitely many bracket endpoints] For each $h\in\mathcal H_\eta$, the random variables \begin{align*} h(X_1),h(X_2),\dots:\Omega\to\mathbb R \end{align*} are i.i.d. and integrable, because $h$ is measurable and $P|h|<\infty$. The set $\mathcal H_\eta$ is finite. Therefore the hypotheses of the [Finite-Class Uniform Law of Large Numbers](/theorems/9816) [citetheorem:9816] hold for the finite class $\mathcal H_\eta$, and it gives \begin{align*} \max_{h\in\mathcal H_\eta}|(P_n-P)h|\to 0 \end{align*} almost surely. Combining this almost sure convergence with the endpoint bound from the previous step, we obtain an event of probability $1$, depending on $\eta$, on which \begin{align*} \limsup_{n\to\infty}\sup_{f\in\mathcal F}|(P_n-P)f|\le \eta, \end{align*} where the supremum is interpreted pointwise as an extended real-valued function. [/step]
[step:Let the bracket width tend to zero] First assume that \begin{align*} Z_n:\Omega\to[0,\infty] \end{align*} defined by \begin{align*} Z_n(\omega):=\sup_{f\in\mathcal F}|((P_n-P)f)(\omega)| \end{align*} is measurable for every $n\in\mathbb N$. Apply the previous step with $\eta=1/k$ for each $k\in\mathbb N$. Intersecting the corresponding countably many probability-one events over $k\in\mathbb N$, we obtain an event $\Omega_0\in\mathcal A$ with $\mathbb P(\Omega_0)=1$ such that for every $\omega\in\Omega_0$ and every $k\in\mathbb N$, \begin{align*} \limsup_{n\to\infty}Z_n(\omega)\le \frac{1}{k}. \end{align*} Letting $k\to\infty$ gives \begin{align*} \limsup_{n\to\infty}Z_n(\omega)\le 0. \end{align*} Since $Z_n(\omega)\ge 0$, it follows that $Z_n(\omega)\to 0$ for every $\omega\in\Omega_0$. Hence \begin{align*} \sup_{f\in\mathcal F}|(P_n-P)f|\to 0 \end{align*} almost surely.
If the supremum is not known to be measurable, use outer probability. Fix $\delta>0$ and choose $\eta\in(0,\delta)$.
From the endpoint bound, \begin{align*} \left\{\sup_{f\in\mathcal F}|(P_n-P)f|>\delta\right\}\subseteq \left\{\max_{h\in\mathcal H_\eta}|(P_n-P)h|>\delta-\eta\right\} \end{align*} as subsets of $\Omega$. By monotonicity of outer probability, \begin{align*} \mathbb P^*\left(\sup_{f\in\mathcal F}|(P_n-P)f|>\delta\right)\le \mathbb P^*\left(\max_{h\in\mathcal H_\eta}|(P_n-P)h|>\delta-\eta\right). \end{align*} The event on the right is measurable because $\mathcal H_\eta$ is finite and each endpoint average is measurable, so its outer probability equals its ordinary probability. Hence \begin{align*} \mathbb P^*\left(\sup_{f\in\mathcal F}|(P_n-P)f|>\delta\right)\le \mathbb P\left(\max_{h\in\mathcal H_\eta}|(P_n-P)h|>\delta-\eta\right). \end{align*} Since $\delta-\eta>0$ and almost sure convergence implies convergence in probability, the right-hand side tends to $0$ by the finite endpoint law of large numbers established in the strong-law step. Hence \begin{align*} \mathbb P^*\left(\sup_{f\in\mathcal F}|(P_n-P)f|>\delta\right)\to 0. \end{align*} This proves the asserted Glivenko-Cantelli conclusion: almost surely when the supremum is measurable, and in outer probability otherwise. [/step]
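
The endpoint bound from the second step can be checked numerically on a concrete class. The sketch below is an illustration only and is not part of the source proof. It assumes $P$ is the uniform distribution on $[0,1]$ and takes $\mathcal F$ to be the half-line indicators $x\mapsto 1\{x\le t\}$ for $t\in[0,1]$, with the brackets $[1\{x\le (j-1)/m\},\,1\{x\le j/m\}]$ of width $\eta=1/m$. The function names (`empirical_cdf`, `sup_deviation`, `endpoint_bound`, `run`) are hypothetical helpers chosen for this example.

```python
import bisect
import random


def empirical_cdf(sorted_xs, t):
    """P_n applied to the indicator x -> 1{x <= t} (assumed example class)."""
    return bisect.bisect_right(sorted_xs, t) / len(sorted_xs)


def sup_deviation(sorted_xs):
    """sup over t in [0, 1] of |F_n(t) - t| for Uniform(0, 1) data.

    The supremum is attained or approached at the sample points, so it is
    enough to compare each order statistic with the empirical CDF just
    before and at that point.
    """
    n = len(sorted_xs)
    return max(
        max((i + 1) / n - x, x - i / n)
        for i, x in enumerate(sorted_xs)
    )


def endpoint_bound(sorted_xs, m):
    """Max over grid endpoints of |(P_n - P)h| plus the bracket width 1/m."""
    grid = [j / m for j in range(m + 1)]
    max_endpoint = max(abs(empirical_cdf(sorted_xs, t) - t) for t in grid)
    return max_endpoint + 1.0 / m


def run(n, m, seed=0):
    """Draw n Uniform(0, 1) points and return (sup deviation, bound)."""
    rng = random.Random(seed)
    xs = sorted(rng.random() for _ in range(n))
    return sup_deviation(xs), endpoint_bound(xs, m)


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        sup_dev, bound = run(n, m=20)
        print(n, round(sup_dev, 4), round(bound, 4))
```

In this setting the left-hand value should never exceed the right-hand one, for any sample. As $n$ grows with $m$ fixed, the endpoint term shrinks while the bound stays above $1/m$, which is why the proof then lets the bracket width tend to zero.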
