Cantelli Inequality — Statement & Proof

Cantelli Inequality (Theorem # 6039)

Theorem

Let $(\Omega,\mathcal F,\mathbb P)$ be a probability space and let $X\in L^2(\Omega,\mathcal F,\mathbb P)$ be a real-valued random variable with $\mathbb E[X]=0$ and $\operatorname{Var}(X)=\sigma^2$. Then for every $r>0$, \begin{align*} \mathbb P(X\ge r)\le \frac{\sigma^2}{\sigma^2+r^2}. \end{align*}

Proof

[proofplan]
We prove a one-sided concentration bound by shifting $X$ by an arbitrary positive parameter $a$ and applying the elementary Markov argument to the non-negative [random variable](/page/Random%20Variable) $(X+a)^2$. The event $\{X\ge r\}$ forces $(X+a)^2$ to be at least $(r+a)^2$, so $\mathbb P(X\ge r)$ is bounded by a rational function of $a$. We then minimize this rational function over $a>0$, obtaining the optimal choice $a=\sigma^2/r$ when $\sigma^2>0$. The degenerate case $\sigma^2=0$ is handled separately.
[/proofplan]

[step:Handle the zero-variance case directly]
Assume first that $\sigma^2=0$. Since $\mathbb E[X]=0$, the definition of variance gives $\mathbb E[X^2]=0$. For the fixed number $r>0$, define the event
\begin{align*}
A_r:=\{\omega\in\Omega:X(\omega)\ge r\}\in\mathcal F.
\end{align*}
On $A_r$ we have $X\ge r>0$, hence $X^2\ge r^2$, and off $A_r$ the indicator vanishes while $X^2\ge 0$. Therefore, pointwise on $\Omega$,
\begin{align*}
r^2\mathbb 1_{A_r}\le X^2.
\end{align*}
Taking expectations and using monotonicity of expectation gives
\begin{align*}
r^2\mathbb P(A_r)=\mathbb E[r^2\mathbb 1_{A_r}]\le \mathbb E[X^2]=0.
\end{align*}
Because $r^2>0$, this implies $\mathbb P(X\ge r)=0$. Therefore
\begin{align*}
\mathbb P(X\ge r)=0=\frac{\sigma^2}{\sigma^2+r^2},
\end{align*}
so the desired inequality holds when $\sigma^2=0$.
[/step]

[step:Bound the upper tail after shifting by a positive parameter]
Assume now that $\sigma^2>0$. Fix $a>0$, and define the non-negative real-valued random variable $Y_a:\Omega\to[0,\infty)$ by
\begin{align*}
Y_a(\omega)=(X(\omega)+a)^2
\end{align*}
for every $\omega\in\Omega$. For the event
\begin{align*}
A_r:=\{\omega\in\Omega:X(\omega)\ge r\},
\end{align*}
we have $X(\omega)+a\ge r+a>0$ for every $\omega\in A_r$, hence
\begin{align*}
(r+a)^2\mathbb 1_{A_r}\le Y_a
\end{align*}
pointwise on $\Omega$. Taking expectations gives
\begin{align*}
(r+a)^2\mathbb P(A_r) =\mathbb E[(r+a)^2\mathbb 1_{A_r}] \le \mathbb E[Y_a].
\end{align*}
Since $X\in L^2(\Omega,\mathcal F,\mathbb P)$ and $a$ is finite, $Y_a$ is integrable. Expanding the square and using $\mathbb E[X]=0$ gives
\begin{align*}
\mathbb E[Y_a] =\mathbb E[(X+a)^2] =\mathbb E[X^2]+2a\mathbb E[X]+a^2 =\sigma^2+a^2.
\end{align*}
Dividing by $(r+a)^2>0$, we obtain
\begin{align*}
\mathbb P(X\ge r)\le \frac{\sigma^2+a^2}{(r+a)^2}
\end{align*}
for every $a>0$.

[guided]
The purpose of introducing $a>0$ is to build a non-negative square whose size is forced by the event $X\ge r$. Define $Y_a:\Omega\to[0,\infty)$ by
\begin{align*}
Y_a(\omega)=(X(\omega)+a)^2
\end{align*}
for every $\omega\in\Omega$. This is a valid non-negative random variable because $X$ is measurable and the map $t\mapsto(t+a)^2$ is Borel measurable.

Now define the tail event
\begin{align*}
A_r:=\{\omega\in\Omega:X(\omega)\ge r\}.
\end{align*}
Since $X$ is measurable and $[r,\infty)$ is Borel, $A_r\in\mathcal F$. On this event, $X(\omega)\ge r$, and therefore
\begin{align*}
X(\omega)+a\ge r+a>0.
\end{align*}
Squaring preserves the inequality because both sides are non-negative, so for every $\omega\in A_r$,
\begin{align*}
(X(\omega)+a)^2\ge (r+a)^2.
\end{align*}
For $\omega\notin A_r$, the indicator $\mathbb 1_{A_r}(\omega)$ is $0$, so the left-hand side below vanishes while $Y_a(\omega)\ge 0$. Hence the pointwise inequality on all of $\Omega$ is
\begin{align*}
(r+a)^2\mathbb 1_{A_r}\le Y_a.
\end{align*}
Taking expectations and using monotonicity of expectation gives
\begin{align*}
(r+a)^2\mathbb P(A_r) =\mathbb E[(r+a)^2\mathbb 1_{A_r}] \le \mathbb E[Y_a].
\end{align*}
The random variable $Y_a$ is integrable because
\begin{align*}
Y_a=(X+a)^2\le 2X^2+2a^2,
\end{align*}
and $X\in L^2(\Omega,\mathcal F,\mathbb P)$. Expanding the square inside the expectation,
\begin{align*}
\mathbb E[Y_a] =\mathbb E[(X+a)^2] =\mathbb E[X^2]+2a\mathbb E[X]+a^2.
\end{align*}
The hypotheses give $\mathbb E[X]=0$ and, since the mean is zero,
\begin{align*}
\operatorname{Var}(X)=\mathbb E[(X-\mathbb E[X])^2]=\mathbb E[X^2]=\sigma^2.
\end{align*}
Therefore
\begin{align*}
\mathbb E[Y_a]=\sigma^2+a^2.
\end{align*}
Because $r+a>0$, division by $(r+a)^2$ is valid, and we conclude that
\begin{align*}
\mathbb P(X\ge r)\le \frac{\sigma^2+a^2}{(r+a)^2}.
\end{align*}
This estimate holds for every $a>0$, so the remaining task is to choose the best shift.
[/guided]
[/step]

[step:Minimize the shifted bound over the parameter $a$]
Define the function $\phi:(0,\infty)\to(0,\infty)$ by
\begin{align*}
\phi(a)=\frac{\sigma^2+a^2}{(r+a)^2}
\end{align*}
for every $a\in(0,\infty)$. The function $\phi$ is differentiable on $(0,\infty)$. By the quotient rule,
\begin{align*}
\phi'(a)=\frac{2a(r+a)^2-2(r+a)(\sigma^2+a^2)}{(r+a)^4}.
\end{align*}
Factoring $2(r+a)$ from the numerator, simplifying $a(r+a)-(\sigma^2+a^2)=ar-\sigma^2$, and using $r+a>0$ gives
\begin{align*}
\phi'(a)=\frac{2(ar-\sigma^2)}{(r+a)^3}.
\end{align*}
Since $r>0$ and $\sigma^2>0$, the unique critical point is
\begin{align*}
a_0:=\frac{\sigma^2}{r}>0.
\end{align*}
Moreover, $\phi'(a)<0$ for $0<a<a_0$ and $\phi'(a)>0$ for $a>a_0$, so $\phi$ attains its minimum on $(0,\infty)$ at $a_0$. Substituting $a_0=\sigma^2/r$ into the bound from the previous step yields
\begin{align*}
\mathbb P(X\ge r)\le \frac{\sigma^2+(\sigma^2/r)^2}{(r+\sigma^2/r)^2}.
\end{align*}
Rewriting numerator and denominator over the common factor $r^2$ gives
\begin{align*}
\frac{\sigma^2+(\sigma^2/r)^2}{(r+\sigma^2/r)^2}=\frac{\sigma^2(r^2+\sigma^2)/r^2}{(r^2+\sigma^2)^2/r^2}.
\end{align*}
Cancelling $r^2>0$ and one factor of $r^2+\sigma^2>0$ gives
\begin{align*}
\mathbb P(X\ge r)\le \frac{\sigma^2}{r^2+\sigma^2}.
\end{align*}
This is exactly the claimed inequality.
[/step]
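
The minimization step can be sanity-checked numerically. The following is a minimal Python sketch, not part of the original proof: the function names `phi` and `cantelli_bound`, the sample values $\sigma^2=2$ and $r=3$, and the grid are all illustrative assumptions. It compares $\phi(a_0)$ with the closed form $\sigma^2/(\sigma^2+r^2)$, checks that no grid point gives a smaller value, and evaluates a hypothetical two-point distribution (also an assumption, chosen to have mean $0$ and variance $\sigma^2$) for which the tail probability coincides with the bound.

```python
def phi(a: float, sigma2: float, r: float) -> float:
    """Shifted bound (sigma^2 + a^2) / (r + a)^2 from the proof, for a > 0."""
    return (sigma2 + a * a) / (r + a) ** 2


def cantelli_bound(sigma2: float, r: float) -> float:
    """Closed-form bound sigma^2 / (sigma^2 + r^2)."""
    return sigma2 / (sigma2 + r * r)


# Illustrative values (assumptions, not taken from the source).
sigma2, r = 2.0, 3.0
a0 = sigma2 / r  # critical point from the minimization step

# 1) Evaluate phi on a grid of shifts a in (0, 20]; none should beat phi(a0).
grid = [k / 1000 for k in range(1, 20001)]
grid_min = min(phi(a, sigma2, r) for a in grid)

# 2) Hypothetical two-point law with mean 0 and variance sigma2:
#    X = r with probability p, X = -sigma2/r with probability 1 - p.
p = cantelli_bound(sigma2, r)
law = [(r, p), (-a0, 1.0 - p)]
mean = sum(x * w for x, w in law)
var = sum(x * x * w for x, w in law)
tail = sum(w for x, w in law if x >= r)  # P(X >= r) under this law

print(f"phi(a0)         = {phi(a0, sigma2, r):.6f}")
print(f"closed form     = {cantelli_bound(sigma2, r):.6f}")
print(f"grid minimum    = {grid_min:.6f}")
print(f"two-point: mean = {mean:.2e}, var = {var:.6f}, tail = {tail:.6f}")
```

For these inputs the two-point tail probability equals the closed-form value, which suggests the bound cannot be improved without further assumptions on $X$. A grid search is only a spot check and does not replace the derivative argument above.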

