Slutsky's Theorem — Statement & Proof

Slutsky's Theorem (Theorem # 10012)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We first prove the joint convergence statement $(X_n,Y_n)\Rightarrow (X,c)$ by checking two-dimensional distribution functions at their continuity points. Once joint convergence is available, the sum and product conclusions follow by applying the continuous mapping principle to the continuous maps $(x,y)\mapsto x+y$ and $(x,y)\mapsto xy$. For the quotient, we replace $1/y$ by a globally [continuous function](/page/Continuous%20Function) agreeing with $1/y$ near $c$, use convergence in probability to show that the replacement changes the quotient only on an event of vanishing probability, and then use a stability lemma for [convergence in distribution](/page/Convergence%20In%20Distribution) under convergence in probability of the difference. [/proofplan] [step:Prove joint convergence to the deterministic second coordinate] Let $F:\mathbb R\to[0,1]$ be the distribution function of $X$, defined by $F(t)=\mathbb P(X\le t)$. For each $n\in\mathbb N$, define the joint distribution function $H_n:\mathbb R^2\to[0,1]$ by \begin{align*} H_n(a,b)=\mathbb P(X_n\le a,\;Y_n\le b). \end{align*} Define $H:\mathbb R^2\to[0,1]$ by \begin{align*} H(a,b)=\mathbb P(X\le a,\;c\le b). \end{align*} Equivalently, $H(a,b)=0$ if $b<c$, and $H(a,b)=F(a)$ if $b\ge c$. Let $(a,b)\in\mathbb R^2$ be a continuity point of $H$. We show that $H_n(a,b)\to H(a,b)$. If $b<c$, define $\varepsilon=(c-b)/2>0$. The event $\{Y_n\le b\}$ is contained in $\{|Y_n-c|\ge\varepsilon\}$, hence \begin{align*} 0\le H_n(a,b)\le \mathbb P(|Y_n-c|\ge\varepsilon)\to 0. \end{align*} Since $H(a,b)=0$, this proves convergence in the case $b<c$. If $b>c$, define $\varepsilon=(b-c)/2>0$. Then $\{Y_n>b\}\subset\{|Y_n-c|\ge\varepsilon\}$. Therefore \begin{align*} \mathbb P(X_n\le a)-\mathbb P(|Y_n-c|\ge\varepsilon)\le H_n(a,b)\le \mathbb P(X_n\le a). \end{align*} Because $(a,b)$ is a continuity point of $H$ and $b>c$, the point $a$ is a continuity point of $F$. Since $X_n\Rightarrow X$, we have $\mathbb P(X_n\le a)\to F(a)$. Since $Y_n\xrightarrow{\mathbb P}c$, the error probability tends to $0$. Hence $H_n(a,b)\to F(a)=H(a,b)$. The case $b=c$ cannot be a continuity point of $H$ unless $F(a)=0$ and the preceding one-sided estimates already force the same conclusion. Thus $H_n(a,b)\to H(a,b)$ at every continuity point of $H$. By the distribution-function characterization of convergence in distribution on $\mathbb R^2$, \begin{align*} (X_n,Y_n)\Rightarrow (X,c). \end{align*} [guided] The goal of this step is to combine two different modes of convergence into one joint convergence statement. Define $F:\mathbb R\to[0,1]$ by $F(t)=\mathbb P(X\le t)$, the distribution function of $X$. For each $n$, define $H_n:\mathbb R^2\to[0,1]$ by \begin{align*} H_n(a,b)=\mathbb P(X_n\le a,\;Y_n\le b). \end{align*} This is the joint distribution function of the random vector $(X_n,Y_n)$. The candidate limiting joint distribution function is $H:\mathbb R^2\to[0,1]$ given by \begin{align*} H(a,b)=\mathbb P(X\le a,\;c\le b). \end{align*} Since the second limiting coordinate is the constant $c$, this means $H(a,b)=0$ when $b<c$, and $H(a,b)=F(a)$ when $b\ge c$. We now check convergence at continuity points of $H$. Fix a continuity point $(a,b)\in\mathbb R^2$ of $H$. First suppose $b<c$. Set $\varepsilon=(c-b)/2>0$. If $Y_n\le b$, then $Y_n$ is at least $\varepsilon$ away from $c$, so \begin{align*} \{Y_n\le b\}\subset\{|Y_n-c|\ge\varepsilon\}. \end{align*} Therefore \begin{align*} 0\le H_n(a,b)\le \mathbb P(|Y_n-c|\ge\varepsilon). \end{align*} The hypothesis $Y_n\xrightarrow{\mathbb P}c$ says precisely that this last probability tends to $0$ for every positive $\varepsilon$. Hence $H_n(a,b)\to 0$. Since $b<c$, we also have $H(a,b)=0$, so the desired convergence holds in this case. Now suppose $b>c$. Set $\varepsilon=(b-c)/2>0$. If $Y_n>b$, then $|Y_n-c|\ge\varepsilon$, so \begin{align*} \{Y_n>b\}\subset\{|Y_n-c|\ge\varepsilon\}. \end{align*} The event $\{X_n\le a\}$ splits into the part where $Y_n\le b$ and the part where $Y_n>b$. Consequently, \begin{align*} \mathbb P(X_n\le a)-\mathbb P(|Y_n-c|\ge\varepsilon)\le H_n(a,b)\le \mathbb P(X_n\le a). \end{align*} The right-hand probability $\mathbb P(X_n\le a)$ converges to $F(a)$ because $X_n\Rightarrow X$ and, since $(a,b)$ is a continuity point of $H$ with $b>c$, the point $a$ is a continuity point of $F$. The error term tends to $0$ by convergence in probability of $Y_n$ to $c$. The [squeeze theorem](/theorems/627) gives $H_n(a,b)\to F(a)=H(a,b)$. The horizontal line $b=c$ is the only possible discontinuity contributed by the deterministic second coordinate. At continuity points lying on that line, the same one-sided estimates force convergence as well. Hence $H_n(a,b)\to H(a,b)$ at every continuity point of the limiting joint distribution function. By the distribution-function characterization of convergence in distribution on $\mathbb R^2$, \begin{align*} (X_n,Y_n)\Rightarrow (X,c). \end{align*} [/guided] [/step] [step:Apply continuous maps to obtain the sum and product limits] We use the following elementary continuous mapping principle: if $Z_n\Rightarrow Z$ as random variables with values in $\mathbb R^d$ and $g:\mathbb R^d\to\mathbb R$ is continuous, then $g(Z_n)\Rightarrow g(Z)$. Indeed, for every bounded continuous map $\varphi:\mathbb R\to\mathbb R$, the composition $\varphi\circ g:\mathbb R^d\to\mathbb R$ is bounded and continuous, so $\mathbb E[\varphi(g(Z_n))]\to\mathbb E[\varphi(g(Z))]$. Define $s:\mathbb R^2\to\mathbb R$ by $s(x,y)=x+y$. The map $s$ is continuous, so the joint convergence from the previous step gives \begin{align*} X_n+Y_n=s(X_n,Y_n)\Rightarrow s(X,c)=X+c. \end{align*} Define $m:\mathbb R^2\to\mathbb R$ by $m(x,y)=xy$. The map $m$ is continuous, so the same principle gives \begin{align*} X_nY_n=m(X_n,Y_n)\Rightarrow m(X,c)=cX. \end{align*} [/step] [step:Replace the reciprocal by a continuous function near the nonzero limit] Assume $c\ne 0$. Define $\delta=|c|/2>0$. Choose a continuous map $a:\mathbb R\to\mathbb R$ such that $a(y)=1/y$ for every $y\in(c-\delta,c+\delta)$. Define $r:\mathbb R^2\to\mathbb R$ by \begin{align*} r(x,y)=x\,a(y). \end{align*} The map $r$ is continuous. Hence, by the continuous mapping principle and $(X_n,Y_n)\Rightarrow(X,c)$, \begin{align*} r(X_n,Y_n)\Rightarrow r(X,c)=\frac{X}{c}. \end{align*} Let $A_n=\{|Y_n-c|<\delta\}$. On $A_n$, we have $Y_n\ne 0$ and $a(Y_n)=1/Y_n$, so $Q_n=r(X_n,Y_n)$ on $A_n$. Therefore \begin{align*} \mathbb P(Q_n\ne r(X_n,Y_n))\le \mathbb P(A_n^c)=\mathbb P(|Y_n-c|\ge\delta)\to 0. \end{align*} [/step] [step:Transfer convergence across a vanishing probability modification] We use the following stability fact. If $U_n$ and $V_n$ are real-valued random variables, $U_n\Rightarrow U$, and for every $\eta>0$ one has $\mathbb P(|U_n-V_n|>\eta)\to 0$, then $V_n\Rightarrow U$. To prove it, let $G:\mathbb R\to[0,1]$ be the distribution function of $U$, and let $t\in\mathbb R$ be a continuity point of $G$. For each $\eta>0$, \begin{align*} \mathbb P(U_n\le t-\eta)-\mathbb P(|U_n-V_n|>\eta)\le \mathbb P(V_n\le t)\le \mathbb P(U_n\le t+\eta)+\mathbb P(|U_n-V_n|>\eta). \end{align*} Taking lower and upper limits as $n\to\infty$, then letting $\eta\downarrow0$ through values for which $t-\eta$ and $t+\eta$ are continuity points of $G$, gives \begin{align*} \lim_{n\to\infty}\mathbb P(V_n\le t)=G(t). \end{align*} Thus $V_n\Rightarrow U$. Apply this fact with $U_n=r(X_n,Y_n)$, $V_n=Q_n$, and $U=X/c$. The previous step gives $U_n\Rightarrow X/c$, and it also gives $\mathbb P(U_n\ne V_n)\to0$, hence $|U_n-V_n|\xrightarrow{\mathbb P}0$. Therefore \begin{align*} Q_n\Rightarrow \frac{X}{c}. \end{align*} Together with the sum and product conclusions, this proves Slutsky's theorem. [/step]

Prerequisites (0/4 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Theorems

Continuous Mapping Theorem

Definitions & Concepts

Explore Further

Probability Space Definition Random Variable Definition Convergence In Distribution Definition Continuous Mapping Theorem Theorem #1847 Optional Stopping for Non-Negative Supermartingales Martingale Theory Basic Symmetrisation Inequality Probability & Statistics Cauchy-Schwarz Inequality for Covariance Probability Theory MISE Rate for Multivariate Kernel Density Estimation Probability & Statistics Hoeffding Confidence Radius for Bounded Independent Means Probability & Statistics Asymptotic Normality of Interior Local Polynomial Regression Estimators with Negligible Projection Residual Probability & Statistics Chernoff Bound for Sub-Gaussian Random Variables Probability & Statistics Hoeffding Decomposition for U-Statistics Probability & Statistics Probability & Statistics Area

What brings you to Androma?

Start with a route through the knowledge graph.

Slutsky's Theorem (Theorem # 10012)

Discussion

Proof

Prerequisites (0/4 completed)

Prerequisites Graph

Explore Further

Sign in to Androma

Check your inbox

One last step

Slutsky's Theorem (Theorem # 10012)

Discussion

Proof

Prerequisites (0/4 completed)

Prerequisites Graph

Explore Further