[proofplan]
We compare the two decompositions entry by entry. The diagonal residual in factor analysis forces all off-diagonal covariance to be carried by the low-rank term $\Lambda\Lambda^\top$. PCA instead keeps the leading $k$ rank-one terms of the spectral decomposition of $\Sigma$, and its residual is the sum of the omitted terms, with no constraint on its off-diagonal entries. A concrete positive semidefinite covariance matrix shows that this PCA residual can have nonzero off-diagonal entries, so the two decompositions encode different structural assumptions.
[/proofplan]
[step:Show that factor analysis assigns all off-diagonal covariance to the loading term]
Assume $\Sigma = \Lambda\Lambda^\top + \Psi$, where $\Lambda \in \mathbb{R}^{p \times k}$ and $\Psi \in \mathbb{R}^{p \times p}$ is diagonal. Since $\Psi$ is diagonal, $\Psi_{ij} = 0$ whenever $i \neq j$. Therefore, for every $i,j \in \{1,\dots,p\}$ with $i \neq j$,
\begin{align*}
\Sigma_{ij}
&= (\Lambda\Lambda^\top)_{ij} + \Psi_{ij} \\
&= (\Lambda\Lambda^\top)_{ij}.
\end{align*}
Thus the off-diagonal covariance structure of a factor analysis model is entirely determined by the low-rank matrix $\Lambda\Lambda^\top$.
[guided]
The point of the factor analysis decomposition is not merely that $\Sigma$ is written as a sum of two matrices. The crucial structural condition is that the residual matrix $\Psi$ is diagonal. Writing the equality entrywise gives, for every pair of indices $i,j \in \{1,\dots,p\}$,
\begin{align*}
\Sigma_{ij}
= (\Lambda\Lambda^\top)_{ij} + \Psi_{ij}.
\end{align*}
If $i \neq j$, the diagonal assumption gives $\Psi_{ij}=0$. Hence
\begin{align*}
\Sigma_{ij}
= (\Lambda\Lambda^\top)_{ij}.
\end{align*}
So factor analysis forces every cross-covariance between distinct coordinates to come from the common-factor term. The diagonal matrix $\Psi$ contributes only coordinate-specific variances.
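As a small illustration, take the hypothetical values $p=3$, $k=1$, $\Lambda = (1,2,1)^\top$, and $\Psi = \operatorname{diag}(1,1,1)$; these numbers are chosen only for concreteness and are not part of the argument. Then
\begin{align*}
\Lambda\Lambda^\top
=
\begin{pmatrix}
1 & 2 & 1 \\
2 & 4 & 2 \\
1 & 2 & 1
\end{pmatrix},
\qquad
\Sigma
= \Lambda\Lambda^\top + \Psi
=
\begin{pmatrix}
2 & 2 & 1 \\
2 & 5 & 2 \\
1 & 2 & 2
\end{pmatrix}.
\end{align*}
Every off-diagonal entry of $\Sigma$ agrees with the corresponding entry of $\Lambda\Lambda^\top$, for instance $\Sigma_{12} = 2 = \Lambda_1\Lambda_2$. Only the diagonal entries are changed by $\Psi$.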
[/guided]
[/step]
[step:Identify the PCA residual as the sum of omitted eigendirections]
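Since $\Sigma$ is symmetric and positive semidefinite, the spectral theorem provides an orthonormal basis $(q_1,\dots,q_p)$ of $\mathbb{R}^p$ consisting of eigenvectors of $\Sigma$, with corresponding eigenvalues $\lambda_1 \geq \cdots \geq \lambda_p \geq 0$. The spectral decomposition on which PCA is based is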
\begin{align*}
\Sigma
= \sum_{\ell=1}^{p} \lambda_\ell q_\ell q_\ell^\top.
\end{align*}
For $1 \leq k < p$, define the rank-$k$ PCA approximation $\Sigma_k \in \mathbb{R}^{p \times p}$ by
\begin{align*}
\Sigma_k
= \sum_{\ell=1}^{k} \lambda_\ell q_\ell q_\ell^\top.
\end{align*}
Then the residual matrix $R_k \in \mathbb{R}^{p \times p}$ is
\begin{align*}
R_k
&= \Sigma - \Sigma_k \\
&= \sum_{\ell=1}^{p} \lambda_\ell q_\ell q_\ell^\top
- \sum_{\ell=1}^{k} \lambda_\ell q_\ell q_\ell^\top \\
&= \sum_{\ell=k+1}^{p} \lambda_\ell q_\ell q_\ell^\top.
\end{align*}
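Nothing in this formula forces $(R_k)_{ij}=0$ for $i \neq j$: entrywise, $(R_k)_{ij} = \sum_{\ell=k+1}^{p} \lambda_\ell (q_\ell)_i (q_\ell)_j$, and these terms need not vanish or cancel.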
[guided]
PCA ranks orthogonal directions by variance. Algebraically, this means that $\Sigma$ is written as a sum of rank-one matrices $\lambda_\ell q_\ell q_\ell^\top$, where $q_\ell$ is a unit direction and $\lambda_\ell$ is the variance in that direction. The rank-$k$ approximation keeps the first $k$ such directions:
\begin{align*}
\Sigma_k
= \sum_{\ell=1}^{k} \lambda_\ell q_\ell q_\ell^\top.
\end{align*}
Subtracting this approximation from the full covariance matrix leaves
\begin{align*}
R_k
&= \Sigma - \Sigma_k \\
&= \sum_{\ell=1}^{p} \lambda_\ell q_\ell q_\ell^\top
- \sum_{\ell=1}^{k} \lambda_\ell q_\ell q_\ell^\top \\
&= \sum_{\ell=k+1}^{p} \lambda_\ell q_\ell q_\ell^\top.
\end{align*}
The important difference from factor analysis is visible here: the residual is a sum of omitted eigendirections, not a matrix assumed to be diagonal in the original coordinate system. A rank-one matrix $q_\ell q_\ell^\top$ has a nonzero off-diagonal entry whenever $q_\ell$ has at least two nonzero coordinates, so the sum of omitted eigendirections is generally not diagonal either.
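To see this in the smallest case, consider a hypothetical unit vector $q = \frac{1}{\sqrt{2}}(1,1)^\top \in \mathbb{R}^2$, introduced here only as an illustration and not tied to any particular covariance matrix. Since $(qq^\top)_{ij} = q_i q_j$,
\begin{align*}
qq^\top
=
\frac{1}{2}
\begin{pmatrix}
1 & 1 \\
1 & 1
\end{pmatrix},
\end{align*}
whose off-diagonal entries equal $\frac{1}{2}$. By contrast, a standard basis vector such as $q = (1,0)^\top$ gives a diagonal matrix $qq^\top$, which is the exceptional case in which the eigendirection is aligned with a coordinate axis.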
[/guided]
[/step]
[step:Exhibit a PCA residual with nonzero off-diagonal entries]
Consider the covariance matrix $\Sigma \in \mathbb{R}^{3 \times 3}$ given by
\begin{align*}
\Sigma
=
\begin{pmatrix}
2 & 1 & 0 \\
1 & 2 & 1 \\
0 & 1 & 2
\end{pmatrix}.
\end{align*}
This matrix is symmetric, and its eigenvalues are $2+\sqrt{2}$, $2$, and $2-\sqrt{2}$, all of which are strictly positive; hence it is positive definite, and in particular positive semidefinite. A unit eigenvector for the largest eigenvalue $\lambda_1 = 2+\sqrt{2}$ is
\begin{align*}
q_1 =
\begin{pmatrix}
\frac{1}{2} \\
\frac{\sqrt{2}}{2} \\
\frac{1}{2}
\end{pmatrix}.
\end{align*}
The rank-$1$ PCA approximation is $\Sigma_1 = \lambda_1 q_1 q_1^\top$, so the residual $R_1 = \Sigma - \Sigma_1$ has $(1,3)$ entry
\begin{align*}
(R_1)_{13}
&= \Sigma_{13} - \lambda_1(q_1)_1(q_1)_3 \\
&= 0 - (2+\sqrt{2})\frac{1}{2}\frac{1}{2} \\
&= -\frac{2+\sqrt{2}}{4}.
\end{align*}
Since $-\frac{2+\sqrt{2}}{4} \neq 0$, the PCA residual $R_1$ is not diagonal.
[guided]
To show that PCA imposes no diagonal residual condition, it is enough to give one covariance matrix for which the residual is not diagonal. Define
\begin{align*}
\Sigma
=
\begin{pmatrix}
2 & 1 & 0 \\
1 & 2 & 1 \\
0 & 1 & 2
\end{pmatrix}.
\end{align*}
The matrix is symmetric. Its eigenvalues are $2+\sqrt{2}$, $2$, and $2-\sqrt{2}$; each is strictly positive, so $\Sigma$ is positive definite, in particular positive semidefinite, and is therefore a valid covariance matrix.
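The stated eigenvalues can be checked from the characteristic polynomial. In the following sketch, $I$ denotes the $3 \times 3$ identity matrix and $a = 2-\lambda$ is shorthand introduced only for this computation. Expanding the determinant along the first row gives
\begin{align*}
\det(\Sigma - \lambda I)
&=
\det
\begin{pmatrix}
a & 1 & 0 \\
1 & a & 1 \\
0 & 1 & a
\end{pmatrix} \\
&= a(a^2 - 1) - 1\cdot(a - 0) \\
&= a(a^2 - 2).
\end{align*}
This vanishes exactly when $a = 0$ or $a = \pm\sqrt{2}$, that is, when $\lambda = 2$ or $\lambda = 2 \mp \sqrt{2}$.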
For the largest eigenvalue $\lambda_1 = 2+\sqrt{2}$, take
\begin{align*}
q_1 =
\begin{pmatrix}
\frac{1}{2} \\
\frac{\sqrt{2}}{2} \\
\frac{1}{2}
\end{pmatrix}.
\end{align*}
This vector has Euclidean norm $1$, because its squared norm is
\begin{align*}
\left(\frac{1}{2}\right)^2
+
\left(\frac{\sqrt{2}}{2}\right)^2
+
\left(\frac{1}{2}\right)^2
=
\frac{1}{4}+\frac{1}{2}+\frac{1}{4}
=
1.
\end{align*}
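It is also worth confirming that $q_1$ is an eigenvector of $\Sigma$ for $\lambda_1$, a fact the argument uses implicitly. Multiplying out,
\begin{align*}
\Sigma q_1
=
\begin{pmatrix}
2\cdot\frac{1}{2} + \frac{\sqrt{2}}{2} \\
\frac{1}{2} + 2\cdot\frac{\sqrt{2}}{2} + \frac{1}{2} \\
\frac{\sqrt{2}}{2} + 2\cdot\frac{1}{2}
\end{pmatrix}
=
\begin{pmatrix}
\frac{2+\sqrt{2}}{2} \\
1+\sqrt{2} \\
\frac{2+\sqrt{2}}{2}
\end{pmatrix}
=
(2+\sqrt{2})
\begin{pmatrix}
\frac{1}{2} \\
\frac{\sqrt{2}}{2} \\
\frac{1}{2}
\end{pmatrix}
=
\lambda_1 q_1,
\end{align*}
where the middle entry uses $(2+\sqrt{2})\frac{\sqrt{2}}{2} = 1+\sqrt{2}$.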
The rank-$1$ PCA approximation is
\begin{align*}
\Sigma_1 = \lambda_1 q_1q_1^\top.
\end{align*}
Therefore the residual $R_1 = \Sigma-\Sigma_1$ has $(1,3)$ entry
\begin{align*}
(R_1)_{13}
&= \Sigma_{13} - \lambda_1(q_1)_1(q_1)_3 \\
&= 0 - (2+\sqrt{2})\frac{1}{2}\frac{1}{2} \\
&= -\frac{2+\sqrt{2}}{4}.
\end{align*}
This entry is nonzero. Hence the PCA residual is not diagonal in this valid example.
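For completeness, the remaining entries of the residual can be worked out in the same way. Using $(q_1q_1^\top)_{ij} = (q_1)_i(q_1)_j$, a direct computation gives
\begin{align*}
R_1
= \Sigma - \lambda_1 q_1q_1^\top
=
\begin{pmatrix}
\frac{6-\sqrt{2}}{4} & \frac{1-\sqrt{2}}{2} & -\frac{2+\sqrt{2}}{4} \\
\frac{1-\sqrt{2}}{2} & \frac{2-\sqrt{2}}{2} & \frac{1-\sqrt{2}}{2} \\
-\frac{2+\sqrt{2}}{4} & \frac{1-\sqrt{2}}{2} & \frac{6-\sqrt{2}}{4}
\end{pmatrix}.
\end{align*}
Every off-diagonal entry of $R_1$ is nonzero. As a consistency check, the trace of $R_1$ is $\frac{6-\sqrt{2}}{2} + \frac{2-\sqrt{2}}{2} = 4-\sqrt{2}$, which equals the sum $2 + (2-\sqrt{2})$ of the two omitted eigenvalues.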
[/guided]
[/step]
[step:Conclude that the decompositions answer different covariance-structure questions]
Factor analysis requires the residual covariance $\Psi$ to be diagonal, so it separates covariance into a low-rank common part and coordinate-specific residual variance. PCA requires orthogonality of the directions $q_1,\dots,q_p$ and orders the rank-one terms by eigenvalue size, but it does not require the residual $R_k$ to be diagonal. Since the example shows that a PCA residual can have nonzero off-diagonal entries, PCA does not impose the factor-analysis covariance structure. Therefore factor analysis and PCA are conceptually distinct, even though both write $\Sigma$ as a low-rank matrix plus a residual.
[/step]