Dudley-Pollard Entropy Bound for VC Classes — Statement & Proof

Dudley-Pollard Entropy Bound for VC Classes (Theorem # 9830)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We prove the estimate by bounding packing numbers for the pseudometric $d_Q$. A maximal separated family of sets is first interpreted as a separated family of indicator functions, so its pairwise $L^2(Q)$ distances are lower bounds on the $Q$-measures of symmetric differences. The Pollard discretisation lemma converts this measure separation into a bound by finite traces, and the Sauer-Shelah lemma bounds the number of possible traces of a VC class on a finite sample. Finally we pass from packing to covering and absorb harmless numerical factors into one universal constant. [/proofplan] [step:Reduce covering to packing in the $L^2(Q)$ pseudometric] Define the pseudometric \begin{align*} d_Q:\mathcal C\times\mathcal C\to[0,\infty),\qquad d_Q(C,D):=Q(C\triangle D)^{1/2}. \end{align*} This is exactly the $L^2(Q)$ distance between the indicator functions $\mathbb 1_C$ and $\mathbb 1_D$, because \begin{align*} \|\mathbb 1_C-\mathbb 1_D\|_{L^2(Q)}^2 =\int_S |\mathbb 1_C(x)-\mathbb 1_D(x)|^2\,dQ(x) =Q(C\triangle D). \end{align*} For $\delta>0$, let $M(\delta,\mathcal C,d_Q)$ denote the supremum of the cardinalities of finite $\delta$-separated subsets of $\mathcal C$ with respect to $d_Q$. If every finite $\varepsilon$-separated subset of $\mathcal C$ has cardinality at most $B$, then a maximality argument gives \begin{align*} N(\varepsilon,\mathcal C,L^2(Q))\le B. \end{align*} Indeed, if no finite $\varepsilon$-net with at most $B$ points existed, one could inductively choose $B+1$ points with pairwise $d_Q$-distance greater than $\varepsilon$, contradicting the finite packing bound. It is therefore enough to prove the stated upper bound for every finite $\varepsilon$-separated subfamily of $\mathcal C$. [/step] [step:Apply Pollard's packing bound to the associated symmetric-difference class] Let $m\in\mathbb N$, and let $C_1,\dots,C_m\in\mathcal C$ be a finite $\varepsilon$-separated family in $d_Q$. For distinct indices $i,j\in\{1,\dots,m\}$, separation gives \begin{align*} Q(C_i\triangle C_j)\ge \varepsilon^2. \end{align*} We use Pollard's packing estimate for VC classes: if $\mathcal C$ has VC dimension $v<\infty$, $Q$ is a probability measure on $(S,\mathcal A)$, and $0<\alpha<1$, then every finite family $A_1,\dots,A_r\in\mathcal C$ satisfying $Q(A_a\triangle A_b)>\alpha$ for all $a\ne b$ has cardinality \begin{align*} r\le 2(v+1)(4e)^v\alpha^{-v}. \end{align*} This is Pollard's probabilistic discretisation argument: one samples finitely many points from $Q$, shows that an $L^1(Q)$-separated family must create many distinct traces with positive probability, and then bounds those traces by the Sauer-Shelah lemma [citetheorem:1969]. Fix $\alpha$ with $0<\alpha<\varepsilon^2$. Because $0<\varepsilon<1$, we have $0<\alpha<1$. Since $Q(C_i\triangle C_j)\ge \varepsilon^2>\alpha$ for all distinct $i,j$, the family $C_1,\dots,C_m$ satisfies the strict separation hypothesis in Pollard's estimate. Hence \begin{align*} m\le 2(v+1)(4e)^v\alpha^{-v}. \end{align*} Letting $\alpha\uparrow\varepsilon^2$ gives \begin{align*} m\le 2(v+1)(4e)^v\varepsilon^{-2v}. \end{align*} [guided] Let $m\in\mathbb N$, and let $C_1,\dots,C_m\in\mathcal C$ be a finite family satisfying $d_Q(C_i,C_j)>\varepsilon$ for all distinct indices $i,j\in\{1,\dots,m\}$. The metric in the theorem is an $L^2(Q)$ metric on indicator functions. Squaring it converts the problem into the symmetric-difference metric \begin{align*} \rho_Q:\mathcal C\times\mathcal C\to[0,1],\qquad \rho_Q(C,D):=Q(C\triangle D). \end{align*} Indeed, by the definition of $d_Q$, \begin{align*} \rho_Q(C_i,C_j)=d_Q(C_i,C_j)^2\ge \varepsilon^2 \end{align*} for all distinct $i,j$. The external ingredient is Pollard's packing estimate for VC classes. Its hypotheses are: the class has finite VC dimension, the measure is a probability measure on the underlying measurable space, and the separation parameter $\alpha$ satisfies $0<\alpha<1$. These are available here because $V(\mathcal C)=v<\infty$, $Q$ is a probability measure on $(S,\mathcal A)$, and any choice $0<\alpha<\varepsilon^2$ also satisfies $0<\alpha<1$ since $0<\varepsilon<1$. The conclusion of Pollard's estimate is that every finite family in $\mathcal C$ whose pairwise symmetric differences have $Q$-measure strictly larger than $\alpha$ has cardinality at most \begin{align*} 2(v+1)(4e)^v\alpha^{-v}. \end{align*} The finite-trace mechanism behind the estimate is that a random sample converts measure separation into separation of traces on a finite set, while the Sauer-Shelah lemma [citetheorem:1969] bounds the number of traces of a VC class of dimension $v$. Now fix $\alpha$ with $0<\alpha<\varepsilon^2$. Since $Q(C_i\triangle C_j)\ge\varepsilon^2>\alpha$ for all distinct $i,j$, the strict separation hypothesis in Pollard's estimate is satisfied. Therefore \begin{align*} m\le 2(v+1)(4e)^v\alpha^{-v}. \end{align*} This inequality holds for every $0<\alpha<\varepsilon^2$. Passing to the limit as $\alpha\uparrow\varepsilon^2$ gives \begin{align*} m\le 2(v+1)(4e)^v\varepsilon^{-2v}. \end{align*} [/guided] [/step] [step:Absorb numerical factors into the stated universal constant] From the previous step, every finite $\varepsilon$-separated subfamily of $\mathcal C$ has cardinality at most $2(v+1)(4e)^v\varepsilon^{-2v}$. The finite packing-to-covering reduction from the first step therefore gives \begin{align*} N(\varepsilon,\mathcal C,L^2(Q)) \le 2(v+1)(4e)^v\varepsilon^{-2v}. \end{align*} Since \begin{align*} 2(4e)^v\le (4e)^{v+1} \end{align*} for every $v\ge 0$, the estimate holds with the universal constant $K=1$: \begin{align*} N(\varepsilon,\mathcal C,L^2(Q)) \le (v+1)(4e)^{v+1}\varepsilon^{-2v}. \end{align*} Allowing an arbitrary universal constant $K\ge 1$ gives exactly the stated form. [/step]

Explore Further

What brings you to Androma?

Start with a route through the knowledge graph.

Dudley-Pollard Entropy Bound for VC Classes (Theorem # 9830)

Discussion

Proof

Explore Further

Sign in to Androma

Check your inbox

One last step

Dudley-Pollard Entropy Bound for VC Classes (Theorem # 9830)

Discussion

Proof

Explore Further