Attributions & Verification

Track contributions and verify content correctness

Proof

custom_env admin

[step:Convert uniform empirical convergence into weighted $L^2(\mu_0)$ convergence]Let $(\Omega,\mathcal{F},\mathbb{P}_F)$ denote the probability space carrying the sample sequence $(X_n)_{n=1}^{\infty}$ under the true distribution function $F$. Let $\mu_0$ denote the probability measure on $\mathbb{R}$ induced by the distribution function $F_0$. For each $n \in \mathbb{N}$, let $F_n: \mathbb{R} \to [0,1]$ denote the empirical distribution function of the first $n$ observations. Define the function $\Delta: \mathbb{R} \to [-1,1]$ by \begin{align*} \Delta(x)=F(x)-F_0(x). \end{align*} For each $n \in \mathbb{N}$, define the function $\Delta_n: \mathbb{R} \to [-1,1]$ by \begin{align*} \Delta_n(x)=F_n(x)-F_0(x). \end{align*} We invoke the classical [Glivenko-Cantelli theorem](/theorems/2004), applied to the independent identically distributed real-valued random variables $(X_n)_{n=1}^{\infty}$ with distribution function $F$. It gives \begin{align*} \sup_{x \in \mathbb{R}} |F_n(x)-F(x)| \to 0 \end{align*} $\mathbb{P}_F$-almost surely. For every $x \in \mathbb{R}$ and every $n \in \mathbb{N}$, since distribution functions take values in $[0,1]$, we have $|\Delta_n(x)| \leq 1$ and $|\Delta(x)| \leq 1$. The algebraic identity $a^2-b^2=(a-b)(a+b)$ gives \begin{align*} \left|(\Delta_n(x))^2-(\Delta(x))^2\right|= |\Delta_n(x)-\Delta(x)|\,|\Delta_n(x)+\Delta(x)|. \end{align*} Since $\Delta_n(x)-\Delta(x)=F_n(x)-F(x)$ and $|\Delta_n(x)+\Delta(x)| \leq 2$, it follows that \begin{align*} \left|(\Delta_n(x))^2-(\Delta(x))^2\right| \leq 2|F_n(x)-F(x)|. \end{align*} The triangle inequality for integrals gives \begin{align*} \left|\int_{\mathbb{R}}\bigl((\Delta_n(x))^2-(\Delta(x))^2\bigr)\,d\mu_0(x)\right| \leq \int_{\mathbb{R}}\left|(\Delta_n(x))^2-(\Delta(x))^2\right|\,d\mu_0(x). 
\end{align*} Combining this with the pointwise estimate above and the linearity of the integral gives \begin{align*} \left|\int_{\mathbb{R}}(\Delta_n(x))^2\,d\mu_0(x)-\int_{\mathbb{R}}(\Delta(x))^2\,d\mu_0(x)\right| \leq \int_{\mathbb{R}}2|F_n(x)-F(x)|\,d\mu_0(x). \end{align*} Since $\mu_0(\mathbb{R})=1$, the right-hand side is bounded by \begin{align*} 2\sup_{x \in \mathbb{R}}|F_n(x)-F(x)|, \end{align*} which converges to $0$ $\mathbb{P}_F$-almost surely by the Glivenko-Cantelli theorem. Therefore \begin{align*} \int_{\mathbb{R}}(F_n(x)-F_0(x))^2\,d\mu_0(x) \to \int_{\mathbb{R}}(F(x)-F_0(x))^2\,d\mu_0(x) \end{align*} $\mathbb{P}_F$-almost surely.[/step]
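The chain of inequalities in this step can be checked numerically for one fixed sample. The following is a minimal sketch under illustrative assumptions that are not part of the proof: $F_0$ is taken to be the Uniform$(0,1)$ distribution function (so $\mu_0$ is Lebesgue measure on $[0,1]$), $F(x)=x^2$ on $[0,1]$, the five sample points are arbitrary, and the $\mu_0$-integral is approximated by a midpoint grid average. The helper names `empirical_cdf`, `F0`, and `F` are hypothetical.

```python
import bisect

def empirical_cdf(sample):
    """Return F_n for the sample as a right-continuous step function."""
    ordered = sorted(sample)
    n = len(ordered)

    def F_n(x):
        # bisect_right counts the observations that are <= x
        return bisect.bisect_right(ordered, x) / n

    return F_n

# Illustrative assumptions (not from the proof):
def F0(x):
    """Uniform(0,1) distribution function, so mu_0 is Lebesgue measure on [0,1]."""
    return min(max(x, 0.0), 1.0)

def F(x):
    """Assumed true distribution function F(x) = x^2 on [0,1]."""
    c = min(max(x, 0.0), 1.0)
    return c * c

sample = [0.12, 0.35, 0.50, 0.81, 0.93]  # arbitrary fixed observations
F_n = empirical_cdf(sample)

# Midpoint grid on [0,1]; a grid average approximates an integral against mu_0.
m = 10_000
grid = [(k + 0.5) / m for k in range(m)]

sq_gap = []    # |Delta_n(x)^2 - Delta(x)^2| on the grid
unif_gap = []  # |F_n(x) - F(x)| on the grid
for x in grid:
    d_n = F_n(x) - F0(x)
    d = F(x) - F0(x)
    sq_gap.append(abs(d_n * d_n - d * d))
    unif_gap.append(abs(F_n(x) - F(x)))

# Pointwise estimate: |Delta_n^2 - Delta^2| <= 2 |F_n - F| (small float tolerance).
pointwise_ok = all(s <= 2 * u + 1e-12 for s, u in zip(sq_gap, unif_gap))

# Integrated estimate: grid average of the left side vs. twice the grid maximum.
integrated = sum(sq_gap) / m
bound = 2 * max(unif_gap)

print("pointwise estimate holds on the grid:", pointwise_ok)
print("integrated gap <= 2 * max gap:", integrated <= bound + 1e-9)
```

The grid maximum only approximates the supremum from below, but the comparison is still valid on the grid because every averaged term is individually bounded by twice the largest grid value.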

custom_env admin

[guided]The statistic is built from the weighted squared distance between $F_n$ and $F_0$, so the first task is to show that this weighted distance converges to the corresponding fixed distance between $F$ and $F_0$. The theorem statement defines $F_0: \mathbb{R} \to [0,1]$ and $F: \mathbb{R} \to [0,1]$ as distribution functions and $F_n: \mathbb{R} \to [0,1]$ as the empirical distribution function of the sample. The probability space $(\Omega,\mathcal{F},\mathbb{P}_F)$ carries the sample sequence $(X_n)_{n=1}^{\infty}$ under the true distribution function $F$. Let $\mu_0$ be the probability measure on $\mathbb{R}$ induced by the distribution function $F_0$. Define the discrepancy function $\Delta: \mathbb{R} \to [-1,1]$ by \begin{align*} \Delta(x)=F(x)-F_0(x). \end{align*} For each $n \in \mathbb{N}$, define the discrepancy function $\Delta_n: \mathbb{R} \to [-1,1]$ by \begin{align*} \Delta_n(x)=F_n(x)-F_0(x). \end{align*} These functions take values in $[-1,1]$ because all distribution functions take values in $[0,1]$. We now invoke the classical [Glivenko-Cantelli theorem](/theorems/2004). Its hypothesis is exactly that $X_1,X_2,\dots$ are independent and identically distributed real-valued random variables with common distribution function $F$, which is the sampling model under the true distribution. Therefore \begin{align*} \sup_{x \in \mathbb{R}} |F_n(x)-F(x)| \to 0 \end{align*} $\mathbb{P}_F$-almost surely. Why does [uniform convergence](/page/Uniform%20Convergence) imply convergence of the weighted $L^2(\mu_0)$ discrepancies? For each $x \in \mathbb{R}$, \begin{align*} \Delta_n(x)-\Delta(x)=(F_n(x)-F_0(x))-(F(x)-F_0(x)). \end{align*} After cancelling the two occurrences of $F_0(x)$, this becomes \begin{align*} \Delta_n(x)-\Delta(x)=F_n(x)-F(x). 
\end{align*} Taking absolute values in the algebraic identity $a^2-b^2=(a-b)(a+b)$ with $a=\Delta_n(x)$ and $b=\Delta(x)$, we obtain \begin{align*} \left|(\Delta_n(x))^2-(\Delta(x))^2\right|= |\Delta_n(x)-\Delta(x)|\,|\Delta_n(x)+\Delta(x)|. \end{align*} Substituting $\Delta_n(x)-\Delta(x)=F_n(x)-F(x)$ and applying the triangle inequality to $\Delta_n(x)+\Delta(x)$ gives \begin{align*} \left|(\Delta_n(x))^2-(\Delta(x))^2\right| \leq |F_n(x)-F(x)|\bigl(|\Delta_n(x)|+|\Delta(x)|\bigr). \end{align*} Since $|\Delta_n(x)| \leq 1$ and $|\Delta(x)| \leq 1$, we conclude \begin{align*} \left|(\Delta_n(x))^2-(\Delta(x))^2\right| \leq 2|F_n(x)-F(x)|. \end{align*} Now integrate with respect to $\mu_0$, which is a probability measure because it is induced by the distribution function $F_0$, so that $\mu_0(\mathbb{R})=1$. First, the triangle inequality for integrals gives \begin{align*} \left|\int_{\mathbb{R}}\bigl((\Delta_n(x))^2-(\Delta(x))^2\bigr)\,d\mu_0(x)\right| \leq \int_{\mathbb{R}}\left|(\Delta_n(x))^2-(\Delta(x))^2\right|\,d\mu_0(x). \end{align*} Using the pointwise bound above gives \begin{align*} \int_{\mathbb{R}}\left|(\Delta_n(x))^2-(\Delta(x))^2\right|\,d\mu_0(x) \leq \int_{\mathbb{R}}2|F_n(x)-F(x)|\,d\mu_0(x). \end{align*} Finally, since $|F_n(x)-F(x)| \leq \sup_{y \in \mathbb{R}}|F_n(y)-F(y)|$ for every $x \in \mathbb{R}$, and since $\mu_0(\mathbb{R})=1$, we obtain \begin{align*} \int_{\mathbb{R}}2|F_n(x)-F(x)|\,d\mu_0(x) \leq 2\sup_{y \in \mathbb{R}}|F_n(y)-F(y)|. \end{align*} The right-hand side converges to $0$ $\mathbb{P}_F$-almost surely by Glivenko-Cantelli. Therefore \begin{align*} \int_{\mathbb{R}}(F_n(x)-F_0(x))^2\,d\mu_0(x) \to \int_{\mathbb{R}}(F(x)-F_0(x))^2\,d\mu_0(x) \end{align*} $\mathbb{P}_F$-almost surely.[/guided]
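The convergence itself can be illustrated by simulation. The sketch below is not part of the proof and rests on assumed choices: $F_0$ is the Uniform$(0,1)$ distribution function, $F(x)=x^2$ on $[0,1]$ (sampled as the square root of a uniform draw), and the seed and sample sizes are arbitrary. Under these assumptions the limit is $\int_0^1 (x^2-x)^2\,dx = 1/30$. The function names `weighted_sq_distance_uniform` and `sup_distance` are hypothetical helpers.

```python
import math
import random

def weighted_sq_distance_uniform(sample):
    """Exact integral of (F_n(x) - x)^2 over [0,1].

    Assumes F_0 is the Uniform(0,1) cdf and every observation lies in [0,1].
    """
    xs = sorted(sample)
    n = len(xs)
    knots = [0.0] + xs + [1.0]
    total = 0.0
    for i in range(n + 1):
        a, b, c = knots[i], knots[i + 1], i / n
        # F_n equals c = i/n on [a, b); an antiderivative of (c - x)^2 is (x - c)^3 / 3.
        total += ((b - c) ** 3 - (a - c) ** 3) / 3.0
    return total

def sup_distance(sample, cdf):
    """sup_x |F_n(x) - cdf(x)| for a continuous cdf, evaluated at the jumps of F_n."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs)
    )

def true_cdf(x):
    """Assumed true distribution function F(x) = x^2 on [0,1]."""
    return x * x

rng = random.Random(0)   # arbitrary seed, for reproducibility only
limit = 1.0 / 30.0       # integral of (x^2 - x)^2 over [0,1]

# If U is Uniform(0,1), then sqrt(U) has distribution function x^2 on [0,1].
draws = [math.sqrt(rng.random()) for _ in range(5000)]

results = []
for n in (50, 500, 5000):
    sample = draws[:n]
    gap = abs(weighted_sq_distance_uniform(sample) - limit)
    bound = 2 * sup_distance(sample, true_cdf)
    results.append((n, gap, bound))
    print(f"n={n:5d}  gap={gap:.5f}  2*sup|F_n-F|={bound:.5f}")
```

For every $n$ the printed gap should not exceed the printed bound, since that is exactly the inequality derived above; both columns are expected to shrink as $n$ grows, although any single simulated path is only suggestive.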

custom_env admin

Verification Progress

5 Total Blocks

0 Verified

0% verified

Contributors

admin 5 blocks (0 verified)

Who Can Verify

Areas: Probability & Statistics

Viktor Miykov Admin

Max Vassiliev Global Reviewer

Horia Neagu Global Reviewer

강현욱 Global Reviewer

Demo Testing Global Reviewer

Archie Pennycook Global Reviewer
