Bai Yin Theorem — Statement & Proof

Bai Yin Theorem (Theorem # 5929)

Theorem


Proof

[proofplan] We reduce the eigenvalue assertions for \begin{align*} X_n^\top X_n/n \end{align*} to singular-value assertions for the rectangular matrix $X_n$. The required almost-sure limits are supplied by the [Bai-Yin singular-value edge theorem](/page/Bai-Yin%20Theorem) under the hypotheses of independence, centering, unit variance, finite fourth moment, and \begin{align*} d_n/n \to \gamma. \end{align*} After applying that theorem, the upper edge follows by squaring the largest singular value, the lower edge for $\gamma<1$ follows by squaring the smallest column-side singular value, the square case $\gamma=1$ follows by combining the zero lower singular-value edge with the deterministic kernel that is present whenever $d_n>n$, and the zero-eigenvalue statement for $\gamma>1$ follows from the [Rank-Nullity Theorem](/theorems/916). [/proofplan]

[step:Apply the Bai-Yin singular-value edge theorem to the rectangular matrices] For each $n \in \mathbb N$, define \begin{align*} q_n := \min\{n,d_n\}. \end{align*} Let \begin{align*} \sigma_{1,n} \geq \cdots \geq \sigma_{q_n,n} \geq 0 \end{align*} denote all singular values of the $n \times d_n$ matrix $X_n$, listed with multiplicity. The [Bai-Yin singular-value edge theorem](/page/Bai-Yin%20Theorem) applies because the entries of $X_n$ are i.i.d. real random variables with mean $0$, variance $1$, and finite fourth moment, and because \begin{align*} d_n/n \to \gamma \in (0,\infty). \end{align*} Hence, almost surely, \begin{align*} \frac{\sigma_{1,n}}{\sqrt{n}} &\to 1 + \sqrt{\gamma}. \end{align*} The same theorem gives the lower rectangular edge \begin{align*} \frac{\sigma_{q_n,n}}{\sqrt{n}} &\to |1-\sqrt{\gamma}|. \end{align*} In particular, if $\gamma<1$, then $d_n<n$ for all sufficiently large $n$, so $q_n=d_n$ eventually and \begin{align*} \frac{\sigma_{d_n,n}}{\sqrt{n}} &\to 1-\sqrt{\gamma}. \end{align*} If $\gamma=1$, the same lower-edge conclusion is \begin{align*} \frac{\sigma_{q_n,n}}{\sqrt{n}} &\to 0. \end{align*} If $\gamma>1$, then $d_n>n$ for all sufficiently large $n$, so $q_n=n$ eventually and \begin{align*} \frac{\sigma_{n,n}}{\sqrt{n}} &\to \sqrt{\gamma}-1>0. \end{align*}

[guided] The matrix \begin{align*} X_n^\top X_n/n \end{align*} is positive semidefinite, so its spectral edges are controlled by the singular values of $X_n$. We therefore introduce the singular values in a way that is defined in every rectangular regime. For each $n \in \mathbb N$, set \begin{align*} q_n := \min\{n,d_n\}, \end{align*} and let \begin{align*} \sigma_{1,n} \geq \cdots \geq \sigma_{q_n,n} \geq 0 \end{align*} denote all singular values of $X_n$, counted with multiplicity. This avoids referring to $\sigma_{d_n,n}$ before knowing that $d_n\leq n$. We now invoke the [Bai-Yin singular-value edge theorem](/page/Bai-Yin%20Theorem). Its hypotheses are exactly the hypotheses available here: the entries of $X_n$ are independent and identically distributed, have mean $0$, have variance $1$, have finite fourth moment, and the rectangular aspect ratio satisfies \begin{align*} d_n/n \to \gamma \in (0,\infty). \end{align*} Therefore the upper singular-value edge satisfies, almost surely, \begin{align*} \frac{\sigma_{1,n}}{\sqrt{n}} &\to 1 + \sqrt{\gamma}, \end{align*} and the lower rectangular edge satisfies, almost surely, \begin{align*} \frac{\sigma_{q_n,n}}{\sqrt{n}} &\to |1-\sqrt{\gamma}|. \end{align*} The distinction between $\gamma<1$, $\gamma=1$, and $\gamma>1$ matters. If $\gamma<1$, then eventually $d_n<n$, so $q_n=d_n$ and the lower edge is the smallest column-side singular value. If $\gamma=1$, there need not be eventual inequalities between $d_n$ and $n$; nevertheless the lower edge tends to $0$, which is exactly the desired limiting lower eigenvalue. If $\gamma>1$, then eventually $d_n>n$, so $q_n=n$ and \begin{align*} \frac{\sigma_{n,n}}{\sqrt{n}} \to \sqrt{\gamma}-1>0, \end{align*} which will force full row rank for all sufficiently large $n$ on the full-probability event. [/guided] [/step]

[step:Convert singular-value convergence into upper spectral-edge convergence] For each $n \in \mathbb N$, the eigenvalues of $X_n^\top X_n$ consist of \begin{align*} \sigma_{1,n}^2,\dots,\sigma_{q_n,n}^2 \end{align*} together with $d_n-q_n$ additional zero eigenvalues when $d_n>n$. Therefore \begin{align*} \lambda_{\max}(X_n^\top X_n/n) &= \frac{\sigma_{1,n}^2}{n} = \left(\frac{\sigma_{1,n}}{\sqrt{n}}\right)^2. \end{align*} Since \begin{align*} \frac{\sigma_{1,n}}{\sqrt{n}} \to 1+\sqrt{\gamma} \end{align*} almost surely and the map $t \mapsto t^2$ from $\mathbb R$ to $\mathbb R$ is continuous, the [continuous mapping theorem](/theorems/1847) for almost-sure convergence gives \begin{align*} \lambda_{\max}(X_n^\top X_n/n) &\xrightarrow{a.s.} (1+\sqrt{\gamma})^2. \end{align*}

[guided] The deterministic bridge from singular values to eigenvalues is the identity between the spectrum of $X_n^\top X_n$ and the squared singular values of $X_n$. For each $n$, the eigenvalues of $X_n^\top X_n$ are \begin{align*} \sigma_{1,n}^2,\dots,\sigma_{q_n,n}^2 \end{align*} plus $d_n-q_n$ extra zeros if $d_n>n$. The largest eigenvalue is therefore \begin{align*} \lambda_{\max}(X_n^\top X_n/n) &= \frac{\sigma_{1,n}^2}{n} = \left(\frac{\sigma_{1,n}}{\sqrt{n}}\right)^2. \end{align*} The previous step gives \begin{align*} \frac{\sigma_{1,n}}{\sqrt{n}} \to 1+\sqrt{\gamma} \end{align*} almost surely. Since $t\mapsto t^2$ is continuous on $\mathbb R$, the continuous mapping theorem for almost-sure convergence permits us to square the limit, giving \begin{align*} \lambda_{\max}(X_n^\top X_n/n) &\xrightarrow{a.s.} (1+\sqrt{\gamma})^2. \end{align*} [/guided] [/step]

[step:Convert the lower singular-value edge into the lower eigenvalue edge when $\gamma \leq 1$] Assume $\gamma \leq 1$. If $\gamma<1$, then $d_n<n$ for all sufficiently large $n$, so $q_n=d_n$ eventually. On the full-probability event supplied by the Bai-Yin lower-edge theorem, for all sufficiently large $n$, \begin{align*} \lambda_{\min}(X_n^\top X_n/n) &= \frac{\sigma_{d_n,n}^2}{n} = \left(\frac{\sigma_{d_n,n}}{\sqrt{n}}\right)^2 \to (1-\sqrt{\gamma})^2. \end{align*} Now assume $\gamma=1$. If $d_n>n$, then $X_n^\top X_n/n$ has a zero eigenvalue, so its smallest eigenvalue is $0$. If $d_n\leq n$, then $q_n=d_n$ and the smallest eigenvalue equals \begin{align*} \frac{\sigma_{q_n,n}^2}{n}. \end{align*} In both cases, \begin{align*} 0\leq \lambda_{\min}(X_n^\top X_n/n)\leq \frac{\sigma_{q_n,n}^2}{n}. \end{align*} Since \begin{align*} \frac{\sigma_{q_n,n}}{\sqrt{n}} \to 0 \end{align*} almost surely, the [squeeze theorem](/theorems/627) gives \begin{align*} \lambda_{\min}(X_n^\top X_n/n) &\xrightarrow{a.s.} 0=(1-\sqrt{\gamma})^2. \end{align*}

[guided] The case $\gamma<1$ is the tall-matrix regime. Since \begin{align*} d_n/n\to\gamma<1, \end{align*} we have $d_n<n$ for every sufficiently large $n$, hence $q_n=d_n$ eventually. The lower Bai-Yin edge gives \begin{align*} \frac{\sigma_{d_n,n}}{\sqrt{n}}\to 1-\sqrt{\gamma} \end{align*} almost surely. The smallest eigenvalue of $X_n^\top X_n/n$ is then the square of this smallest column-side singular value divided by $n$, so \begin{align*} \lambda_{\min}(X_n^\top X_n/n) &=\left(\frac{\sigma_{d_n,n}}{\sqrt{n}}\right)^2 \to (1-\sqrt{\gamma})^2. \end{align*} The square case $\gamma=1$ needs separate handling because $d_n\leq n$ need not hold eventually. If $d_n>n$, then the $d_n\times d_n$ matrix $X_n^\top X_n/n$ has rank at most $n<d_n$, so it has a zero eigenvalue and its smallest eigenvalue is $0$. If $d_n\leq n$, then $q_n=d_n$ and the smallest eigenvalue is the smallest squared singular value divided by $n$, namely \begin{align*} \frac{\sigma_{q_n,n}^2}{n}. \end{align*} Thus in all cases \begin{align*} 0\leq \lambda_{\min}(X_n^\top X_n/n)\leq \frac{\sigma_{q_n,n}^2}{n}. \end{align*} The lower Bai-Yin edge at $\gamma=1$ says \begin{align*} \frac{\sigma_{q_n,n}}{\sqrt{n}}\to0, \end{align*} so the squeeze theorem gives \begin{align*} \lambda_{\min}(X_n^\top X_n/n)\xrightarrow{a.s.}0=(1-\sqrt{\gamma})^2. \end{align*} [/guided] [/step]

[step:Compute the zero-eigenvalue multiplicity when $\gamma > 1$] Assume $\gamma > 1$. Since $d_n/n \to \gamma$, there exists $N \in \mathbb N$ such that $d_n > n$ for every $n \geq N$. For such $n$, the rank-nullity theorem applied to the [linear map](/page/Linear%20Map) \begin{align*} X_n: \mathbb R^{d_n} &\to \mathbb R^n \end{align*} gives \begin{align*} \dim \ker X_n &= d_n - \operatorname{rank}(X_n). \end{align*} Because $\ker(X_n^\top X_n)=\ker(X_n)$, the multiplicity of zero as an eigenvalue of $X_n^\top X_n/n$ is $d_n-\operatorname{rank}(X_n)$. By the first step, $\sigma_{n,n}/\sqrt{n}\to\sqrt{\gamma}-1>0$ almost surely, so almost surely $\sigma_{n,n}>0$, and hence $\operatorname{rank}(X_n)=n$, for all sufficiently large $n$. Therefore the zero-eigenvalue multiplicity $m_n$ satisfies \begin{align*} m_n &= d_n-n \end{align*} for all sufficiently large $n$ almost surely, and hence \begin{align*} \frac{m_n}{d_n} &= 1 - \frac{n}{d_n} \to 1 - \frac{1}{\gamma}. \end{align*} Thus zero is an eigenvalue of $X_n^\top X_n/n$ with multiplicity $m_n = d_n-n$ for all large $n$, that is, with asymptotic multiplicity $(1-1/\gamma)d_n$. [/step]
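The limits obtained in the steps above can be tabulated and compared against a simulation. The following is a minimal sketch, not part of the original proof: the function names `bai_yin_limits` and `empirical_edges` are hypothetical, the simulation assumes NumPy is installed, and it uses standard Gaussian entries as one convenient distribution satisfying the mean, variance, and fourth-moment hypotheses.

```python
import math


def bai_yin_limits(gamma):
    """Almost-sure limits from the proof for aspect ratio gamma = lim d_n / n."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    root = math.sqrt(gamma)
    return {
        # Upper spectral edge of X^T X / n.
        "lambda_max": (1 + root) ** 2,
        # Lower edge for gamma <= 1; for gamma > 1 the kernel of X forces 0.
        "lambda_min": (1 - root) ** 2 if gamma <= 1 else 0.0,
        # Limit of sigma_{q_n,n} / sqrt(n), the smallest of the q_n singular values.
        "smallest_singular_over_sqrt_n": abs(1 - root),
        # Limit of m_n / d_n, the fraction of zero eigenvalues.
        "zero_fraction": max(0.0, 1 - 1 / gamma),
    }


def empirical_edges(n, d, seed=0):
    """Same quantities for one simulated n x d Gaussian matrix (needs NumPy)."""
    import numpy as np  # assumption: NumPy is available

    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    s = np.linalg.svd(X, compute_uv=False)  # min(n, d) values, descending
    numerical_rank = int(np.sum(s > 1e-10 * s[0]))
    return {
        "lambda_max": float(s[0] ** 2 / n),
        # X^T X is d x d; it has a zero eigenvalue whenever d > n.
        "lambda_min": float(s[-1] ** 2 / n) if d <= n else 0.0,
        "smallest_singular_over_sqrt_n": float(s[-1] / math.sqrt(n)),
        "zero_fraction": (d - numerical_rank) / d,
    }
```

Calling `empirical_edges(n, d)` with a large `n` and `d / n` close to `gamma` should give values near `bai_yin_limits(gamma)`; the agreement is only approximate at finite `n`, since the theorem is an asymptotic statement.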

Prerequisites

Theorems: Rank-Nullity Theorem (Theorem #916), Continuous Mapping Theorem (Theorem #1847), Squeeze Theorem (Theorem #627)

Definitions & Concepts: Variance, Event, Matrix, Limit, Set
