Covariance-Structure Distinction Between Factor Analysis and Principal Component Analysis (Theorem # 4041)

Proof

[proofplan] We compare the two decompositions entry by entry. The diagonal residual in factor analysis forces all off-diagonal covariance to be carried by the low-rank term $\Lambda\Lambda^\top$. PCA instead removes leading orthogonal rank-one eigendirections, and its residual is the sum of the remaining eigendirections. A concrete positive semidefinite covariance matrix shows that this PCA residual can contain off-diagonal covariance, so the two decompositions encode different structural assumptions. [/proofplan]

[step:Show that factor analysis assigns all off-diagonal covariance to the loading term]
Assume $\Sigma = \Lambda\Lambda^\top + \Psi$, where $\Lambda \in \mathbb{R}^{p \times k}$ and $\Psi \in \mathbb{R}^{p \times p}$ is diagonal. Since $\Psi$ is diagonal, $\Psi_{ij} = 0$ whenever $i \neq j$. Therefore, for every $i,j \in \{1,\dots,p\}$ with $i \neq j$,
\begin{align*}
\Sigma_{ij} &= (\Lambda\Lambda^\top)_{ij} + \Psi_{ij} \\
&= (\Lambda\Lambda^\top)_{ij}.
\end{align*}
Thus the off-diagonal covariance structure of a factor analysis model is entirely determined by the low-rank matrix $\Lambda\Lambda^\top$.

[guided]
The point of the factor analysis decomposition is not merely that $\Sigma$ is written as a sum of two matrices. The crucial structural condition is that the residual matrix $\Psi$ is diagonal. Writing the equality entrywise gives, for every pair of indices $i,j \in \{1,\dots,p\}$,
\begin{align*}
\Sigma_{ij} = (\Lambda\Lambda^\top)_{ij} + \Psi_{ij}.
\end{align*}
If $i \neq j$, the diagonal assumption gives $\Psi_{ij}=0$. Hence
\begin{align*}
\Sigma_{ij} = (\Lambda\Lambda^\top)_{ij}.
\end{align*}
So factor analysis forces every cross-covariance between distinct coordinates to come from the common-factor term. The diagonal matrix $\Psi$ contributes only coordinate-specific variances.
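
The entrywise identity above can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical loading matrix `LOADINGS` (playing the role of $\Lambda$ with $p = 3$, $k = 2$) and hypothetical diagonal entries `UNIQUENESSES` for $\Psi$; neither comes from the proof, and both were chosen only for illustration.

```python
# Hypothetical values, chosen only for illustration (not taken from the proof).
LOADINGS = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]  # Lambda, with p = 3 and k = 2
UNIQUENESSES = [0.3, 0.7, 1.1]                    # diagonal entries of Psi


def common_part(loadings):
    """Return Lambda Lambda^T as a list of rows."""
    p = len(loadings)
    return [
        [sum(a * b for a, b in zip(loadings[i], loadings[j])) for j in range(p)]
        for i in range(p)
    ]


def factor_covariance(loadings, uniquenesses):
    """Return Sigma = Lambda Lambda^T + Psi, where Psi is diagonal."""
    sigma = common_part(loadings)
    for i, psi_ii in enumerate(uniquenesses):
        sigma[i][i] += psi_ii  # Psi only touches the diagonal
    return sigma


def off_diagonals_match(a, b, tol=1e-12):
    """Check that a and b agree in every entry (i, j) with i != j."""
    p = len(a)
    return all(
        abs(a[i][j] - b[i][j]) <= tol
        for i in range(p)
        for j in range(p)
        if i != j
    )


common = common_part(LOADINGS)
sigma = factor_covariance(LOADINGS, UNIQUENESSES)

# Every cross-covariance of Sigma comes from the common-factor term.
print(off_diagonals_match(sigma, common))
```

The two matrices differ only on the diagonal, where the difference is the corresponding entry of `UNIQUENESSES` up to floating-point rounding.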
[/guided]
[/step]

[step:Identify the PCA residual as the sum of omitted eigendirections]
Let $(q_1,\dots,q_p)$ be an orthonormal basis of $\mathbb{R}^p$ consisting of eigenvectors of $\Sigma$, with corresponding eigenvalues $\lambda_1 \geq \cdots \geq \lambda_p \geq 0$. The PCA decomposition is
\begin{align*}
\Sigma = \sum_{\ell=1}^{p} \lambda_\ell q_\ell q_\ell^\top.
\end{align*}
For $1 \leq k < p$, define the rank-$k$ PCA approximation $\Sigma_k \in \mathbb{R}^{p \times p}$ by
\begin{align*}
\Sigma_k = \sum_{\ell=1}^{k} \lambda_\ell q_\ell q_\ell^\top.
\end{align*}
Then the residual matrix $R_k \in \mathbb{R}^{p \times p}$ is
\begin{align*}
R_k &= \Sigma - \Sigma_k \\
&= \sum_{\ell=1}^{p} \lambda_\ell q_\ell q_\ell^\top - \sum_{\ell=1}^{k} \lambda_\ell q_\ell q_\ell^\top \\
&= \sum_{\ell=k+1}^{p} \lambda_\ell q_\ell q_\ell^\top.
\end{align*}
This formula contains no condition requiring $(R_k)_{ij}=0$ for $i \neq j$.

[guided]
PCA ranks orthogonal directions by variance. Algebraically, this means that $\Sigma$ is written as a sum of rank-one matrices $\lambda_\ell q_\ell q_\ell^\top$, where $q_\ell$ is a unit direction and $\lambda_\ell$ is the variance in that direction. The rank-$k$ approximation keeps the first $k$ such directions:
\begin{align*}
\Sigma_k = \sum_{\ell=1}^{k} \lambda_\ell q_\ell q_\ell^\top.
\end{align*}
Subtracting this approximation from the full covariance matrix leaves
\begin{align*}
R_k &= \Sigma - \Sigma_k \\
&= \sum_{\ell=1}^{p} \lambda_\ell q_\ell q_\ell^\top - \sum_{\ell=1}^{k} \lambda_\ell q_\ell q_\ell^\top \\
&= \sum_{\ell=k+1}^{p} \lambda_\ell q_\ell q_\ell^\top.
\end{align*}
The important difference from factor analysis is visible here: the residual is a sum of omitted eigendirections, not a matrix assumed to be diagonal in the original coordinate system. A rank-one matrix $q_\ell q_\ell^\top$ generally has off-diagonal entries, so the sum of omitted eigendirections generally does as well.
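
The residual identity above can be illustrated on a small case. The sketch below assumes a hypothetical $2 \times 2$ covariance matrix `SIGMA` with eigenvalues $3$ and $1$ and eigenvectors $(1,1)/\sqrt{2}$ and $(1,-1)/\sqrt{2}$; this matrix is not part of the proof and was picked only because its eigendecomposition is known in closed form. The helper names are likewise illustrative.

```python
import math

# Hypothetical 2x2 covariance matrix with a known eigendecomposition
# (chosen for illustration; it does not appear in the proof).
SIGMA = [[2.0, 1.0], [1.0, 2.0]]
S = 1.0 / math.sqrt(2.0)
# (eigenvalue, unit eigenvector) pairs, sorted by decreasing eigenvalue.
EIGENPAIRS = [(3.0, [S, S]), (1.0, [S, -S])]


def rank_one(lam, q):
    """Return the rank-one matrix lam * q q^T."""
    return [[lam * qi * qj for qj in q] for qi in q]


def sum_of_eigendirections(pairs, p):
    """Return the sum of lam * q q^T over the given (lam, q) pairs."""
    total = [[0.0] * p for _ in range(p)]
    for lam, q in pairs:
        term = rank_one(lam, q)
        total = [[total[i][j] + term[i][j] for j in range(p)] for i in range(p)]
    return total


k = 1
p = len(SIGMA)
sigma_k = sum_of_eigendirections(EIGENPAIRS[:k], p)  # rank-k approximation
residual = [[SIGMA[i][j] - sigma_k[i][j] for j in range(p)] for i in range(p)]
omitted = sum_of_eigendirections(EIGENPAIRS[k:], p)  # omitted eigendirections

# R_k = Sigma - Sigma_k agrees with the sum over l = k+1, ..., p.
agree = all(
    abs(residual[i][j] - omitted[i][j]) < 1e-12
    for i in range(p)
    for j in range(p)
)
print(agree)
print(round(residual[0][1], 6))  # off-diagonal residual entry, about -0.5
```

Even in this two-dimensional case the residual has a nonzero off-diagonal entry, which is the behavior that the diagonal constraint on $\Psi$ rules out in factor analysis.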
[/guided]
[/step]

[step:Exhibit a PCA residual with nonzero off-diagonal entries]
Consider the covariance matrix $\Sigma \in \mathbb{R}^{3 \times 3}$ given by
\begin{align*}
\Sigma = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix}.
\end{align*}
This matrix is symmetric, and its eigenvalues are $2+\sqrt{2}$, $2$, and $2-\sqrt{2}$, all of which are positive; hence it is positive definite, and in particular positive semidefinite. A unit eigenvector for the largest eigenvalue $\lambda_1 = 2+\sqrt{2}$ is
\begin{align*}
q_1 = \begin{pmatrix} \frac{1}{2} \\ \frac{\sqrt{2}}{2} \\ \frac{1}{2} \end{pmatrix}.
\end{align*}
The rank-$1$ PCA approximation is $\Sigma_1 = \lambda_1 q_1 q_1^\top$, so the residual $R_1 = \Sigma - \Sigma_1$ has $(1,3)$ entry
\begin{align*}
(R_1)_{13} &= \Sigma_{13} - \lambda_1(q_1)_1(q_1)_3 \\
&= 0 - (2+\sqrt{2})\frac{1}{2}\frac{1}{2} \\
&= -\frac{2+\sqrt{2}}{4}.
\end{align*}
Since $-\frac{2+\sqrt{2}}{4} \neq 0$, the PCA residual $R_1$ is not diagonal.

[guided]
To show that PCA imposes no diagonal residual condition, it is enough to give one covariance matrix for which the residual is not diagonal. Define
\begin{align*}
\Sigma = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix}.
\end{align*}
The matrix is symmetric. Its characteristic polynomial is
\begin{align*}
\det(\Sigma - \lambda I) = (2-\lambda)\left((2-\lambda)^2 - 2\right),
\end{align*}
so its eigenvalues are $2+\sqrt{2}$, $2$, and $2-\sqrt{2}$. Each is positive, so $\Sigma$ is positive definite, in particular positive semidefinite, and is a valid covariance matrix. For the largest eigenvalue $\lambda_1 = 2+\sqrt{2}$, take
\begin{align*}
q_1 = \begin{pmatrix} \frac{1}{2} \\ \frac{\sqrt{2}}{2} \\ \frac{1}{2} \end{pmatrix}.
\end{align*}
This vector is an eigenvector for $\lambda_1$, because
\begin{align*}
\Sigma q_1 = \begin{pmatrix} 1 + \frac{\sqrt{2}}{2} \\ 1 + \sqrt{2} \\ 1 + \frac{\sqrt{2}}{2} \end{pmatrix} = (2+\sqrt{2}) \begin{pmatrix} \frac{1}{2} \\ \frac{\sqrt{2}}{2} \\ \frac{1}{2} \end{pmatrix} = \lambda_1 q_1.
\end{align*}
It has Euclidean norm $1$, because
\begin{align*}
\left(\frac{1}{2}\right)^2 + \left(\frac{\sqrt{2}}{2}\right)^2 + \left(\frac{1}{2}\right)^2 = \frac{1}{4}+\frac{1}{2}+\frac{1}{4} = 1.
\end{align*}
The rank-$1$ PCA approximation is
\begin{align*}
\Sigma_1 = \lambda_1 q_1q_1^\top.
\end{align*}
Therefore the residual $R_1 = \Sigma-\Sigma_1$ has $(1,3)$ entry
\begin{align*}
(R_1)_{13} &= \Sigma_{13} - \lambda_1(q_1)_1(q_1)_3 \\
&= 0 - (2+\sqrt{2})\frac{1}{2}\frac{1}{2} \\
&= -\frac{2+\sqrt{2}}{4}.
\end{align*}
This entry is nonzero. Hence the PCA residual is not diagonal in this valid example.
[/guided]
[/step]

[step:Conclude that the decompositions answer different covariance-structure questions]
Factor analysis requires the residual covariance $\Psi$ to be diagonal, so it separates covariance into a low-rank common part and coordinate-specific residual variance. PCA requires orthogonality of the directions $q_1,\dots,q_p$ and orders the rank-one terms by eigenvalue size, but it does not require the omitted residual $R_k$ to be diagonal. Since a PCA residual can have nonzero off-diagonal entries, PCA does not impose the factor-analysis covariance structure. Therefore factor analysis and PCA are conceptually distinct even though both may involve eigenvectors or low-rank matrices.
[/step]
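
The numerical claims in the $3 \times 3$ example of the proof can be checked with a short script. This is a minimal sketch in plain Python that uses the matrix $\Sigma$, the eigenvalue $\lambda_1 = 2+\sqrt{2}$, and the eigenvector $q_1$ exactly as stated in the proof; the helper names `mat_vec` and `rank_one_residual` are illustrative and not part of the source. Python indices start at $0$, so the $(1,3)$ entry is `R1[0][2]`.

```python
import math

# Matrix, eigenvalue, and eigenvector as stated in the proof.
SIGMA = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
LAMBDA_1 = 2.0 + math.sqrt(2.0)
Q1 = [0.5, math.sqrt(2.0) / 2.0, 0.5]


def mat_vec(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rank_one_residual(sigma, lam, q):
    """Return R_1 = Sigma - lam * q q^T."""
    p = len(q)
    return [[sigma[i][j] - lam * q[i] * q[j] for j in range(p)] for i in range(p)]


# Largest deviation between Sigma q_1 and lambda_1 q_1 over all coordinates.
eigen_gap = max(
    abs(lhs - LAMBDA_1 * qi) for lhs, qi in zip(mat_vec(SIGMA, Q1), Q1)
)
norm_squared = sum(qi * qi for qi in Q1)

R1 = rank_one_residual(SIGMA, LAMBDA_1, Q1)
expected_13 = -(2.0 + math.sqrt(2.0)) / 4.0

print(eigen_gap < 1e-12)                    # q_1 is an eigenvector for lambda_1
print(abs(norm_squared - 1.0) < 1e-12)      # q_1 has unit norm
print(abs(R1[0][2] - expected_13) < 1e-12)  # (1,3) entry is -(2 + sqrt(2)) / 4
```

Comparisons use a small tolerance rather than exact equality because $\sqrt{2}$ is only approximated in floating point.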
