[proofplan]
This is a quoted standard one-spike form of the Baik-Ben Arous-Peche transition, in the Gaussian spiked-covariance formulation proved by Baik and Silverstein, "Eigenvalues of large sample covariance matrices of spiked population models," Journal of Multivariate Analysis 97 (2006), 1382-1408. The external theorem includes the uniform resolvent control, root transfer, no-outlier statement, and upper-edge sticking needed for almost-sure convergence of the largest sample eigenvalue. The proof below verifies that the model reduces to the one-spike Wishart setting of that cited theorem and then carries out the deterministic branch calculation showing that the theorem's outlier formula is above the Marchenko-Pastur edge exactly when $\theta>1+\sqrt c$.
[/proofplan]
[step:Represent the spike as a rank-one perturbation of a white sample covariance matrix]
Let $X_n$ be the $p\times n$ observation matrix whose columns are independent Gaussian vectors with covariance matrix $\Sigma_n$. Let $I_p\in\mathbb R^{p\times p}$ denote the identity matrix. Since only the eigenvalues of the covariance matrix matter for a Gaussian sample after an orthogonal change of basis, diagonalise $\Sigma_n$ and write
\begin{align*}
\Sigma_n = I_p + (\theta-1) u_p u_p^\top,
\end{align*}
where $u_p\in\mathbb{R}^p$ satisfies $|u_p|=1$. Let $Z_n$ be a $p\times n$ matrix with independent standard Gaussian entries and define the white Wishart matrix
\begin{align*}
W_n := \frac{1}{n} Z_n Z_n^\top.
\end{align*}
Let $\Sigma_n^{1/2}$ denote the symmetric positive definite square root of $\Sigma_n$. The sample covariance matrix is orthogonally equivalent to
\begin{align*}
S_n := \Sigma_n^{1/2} W_n \Sigma_n^{1/2}.
\end{align*}
Orthogonal equivalence preserves eigenvalues, so it is enough to study the eigenvalues of $S_n$.
[/step]
[step:Record the quoted one-spike BBP theorem used]
We use the following external result not yet entered as a wiki theorem: Baik-Silverstein's almost-sure spiked population eigenvalue theorem for large sample covariance matrices. In its one-spike identity-bulk specialization, if the population eigenvalues are
\begin{align*}
\theta,1,\dots,1
\end{align*}
with fixed $\theta>1$, if the observations are Gaussian, and if
\begin{align*}
\frac{p}{n}\to c\in(0,\infty)
\end{align*}
then the largest sample eigenvalue satisfies, almost surely,
\begin{align*}
\lambda_{1,n}\to
\begin{cases}
\theta\left(1+\dfrac{c}{\theta-1}\right), & \theta>1+\sqrt c,\\
(1+\sqrt c)^2, & 1<\theta\le 1+\sqrt c.
\end{cases}
\end{align*}
Equivalently, the quoted theorem asserts that a separated outlier exists exactly in the supercritical regime $\theta>1+\sqrt c$. The current theorem is precisely the one-spike specialization of that cited result. The remaining steps explain how the threshold and the displayed outlier location arise from the deterministic resolvent equation; the almost-sure no-outlier and edge-sticking conclusions are part of the quoted Baik-Silverstein theorem.
[/step]
[step:Use the rank-one outlier equation outside the noise spectrum]
Let $\lambda$ lie outside the spectrum of $W_n$. The matrix $S_n$ has an eigenvalue at $\lambda$ not already in the spectrum of $W_n$ precisely when
\begin{align*}
1 = (\theta-1)\, u_p^\top W_n(\lambda I_p-W_n)^{-1}u_p.
\end{align*}
Indeed, the non-zero eigenvalues of
\begin{align*}
\Sigma_n^{1/2}W_n\Sigma_n^{1/2}
\end{align*}
agree with the non-zero eigenvalues of $W_n\Sigma_n$. Since
\begin{align*}
W_n\Sigma_n=W_n+(\theta-1)W_nu_pu_p^\top,
\end{align*}
the matrix determinant lemma gives, for $\lambda\notin\sigma(W_n)$,
\begin{align*}
\det(\lambda I_p-W_n\Sigma_n)
&=
\det(\lambda I_p-W_n)
\left(
1-(\theta-1)u_p^\top(\lambda I_p-W_n)^{-1}W_nu_p
\right).
\end{align*}
Because $W_n$ commutes with its resolvent $(\lambda I_p-W_n)^{-1}$, the scalar factor vanishes exactly when
\begin{align*}
1=(\theta-1)u_p^\top W_n(\lambda I_p-W_n)^{-1}u_p.
\end{align*}
This is the Schur-complement eigenvalue equation for the rank-one multiplicative deformation.
[guided]
The point of passing to the resolvent is that a finite-dimensional eigenvalue problem becomes a scalar equation. For $\lambda\notin\sigma(W_n)$, the resolvent $(\lambda I_p-W_n)^{-1}$ is a well-defined [linear map](/page/Linear%20Map) on $\mathbb{R}^p$. The rank-one part of the population covariance only acts in the direction $u_p$, so the only scalar quantity that remains is
\begin{align*}
u_p^\top W_n(\lambda I_p-W_n)^{-1}u_p.
\end{align*}
The determinant calculation factors $\det(\lambda I_p-W_n\Sigma_n)$ into the white-Wishart factor $\det(\lambda I_p-W_n)$ and one scalar correction term. Since $W_n$ commutes with $(\lambda I_p-W_n)^{-1}$, that correction term is the one displayed above. Thus possible separated eigenvalues are exactly the zeros, outside the white Wishart spectrum, of the scalar function
\begin{align*}
\lambda \mapsto 1-(\theta-1)u_p^\top W_n(\lambda I_p-W_n)^{-1}u_p.
\end{align*}
[/guided]
[/step]
[step:Replace the spike-direction resolvent by its Marchenko-Pastur deterministic equivalent]
For every fixed $\lambda>(1+\sqrt c)^2$, the Gaussian isotropic Marchenko-Pastur resolvent limit gives, almost surely,
\begin{align*}
u_p^\top W_n(\lambda I_p-W_n)^{-1}u_p \longrightarrow \int \frac{x}{\lambda-x}\,d\mu_c(x),
\end{align*}
where $\mu_c$ is the Marchenko-Pastur law with aspect ratio $c$. The hypotheses are satisfied because the entries of $Z_n$ are independent Gaussian variables,
\begin{align*}
\frac{p}{n}\to c\in(0,\infty),
\end{align*}
and $u_p$ is deterministic with $|u_p|=1$.
Let
\begin{align*}
a_c=(1-\sqrt c)^2,
\qquad
b_c=(1+\sqrt c)^2.
\end{align*}
The continuous part of $\mu_c$ is supported on $[a_c,b_c]$, with a possible atom at $0$ when $c>1$.
Define
\begin{align*}
h_c:(b_c,\infty)&\to(0,\infty),\\
\lambda&\mapsto \int \frac{x}{\lambda-x}\,d\mu_c(x).
\end{align*}
The Stieltjes transform computation for the Marchenko-Pastur law gives that $h_c$ is continuous and strictly decreasing, with
\begin{align*}
\lim_{\lambda\downarrow b_c} h_c(\lambda)=\frac{1}{\sqrt c},
\qquad
\lim_{\lambda\to\infty} h_c(\lambda)=0.
\end{align*}
Moreover, solving
\begin{align*}
h_c(\lambda)=\frac{1}{\theta-1}
\end{align*}
on this physical branch gives
\begin{align*}
\lambda=\theta\left(1+\frac{c}{\theta-1}\right).
\end{align*}
This solution lies on the physical branch $\lambda>(1+\sqrt c)^2$ if and only if
\begin{align*}
\frac{1}{\theta-1}<\frac{1}{\sqrt c},
\end{align*}
equivalently $\theta>1+\sqrt c$. Hence any separated spike eigenvalue must converge almost surely to this value in the supercritical regime, while the deterministic outlier equation has no solution above the edge in the subcritical regime.
[guided]
The random scalar in the outlier equation is a quadratic form of the resolvent of the white Wishart matrix. The Gaussian isotropic Marchenko-Pastur resolvent theorem applies because the white matrix has independent standard Gaussian entries, the dimensions satisfy
\begin{align*}
\frac{p}{n}\to c\in(0,\infty),
\end{align*}
and the spike direction $u_p$ is deterministic and normalised. Therefore, for each fixed spectral parameter $\lambda$ above the upper edge, the random quantity
\begin{align*}
u_p^\top W_n(\lambda I_p-W_n)^{-1}u_p
\end{align*}
converges almost surely to the deterministic Marchenko-Pastur integral
\begin{align*}
\int \frac{x}{\lambda-x}\,d\mu_c(x).
\end{align*}
Substituting this deterministic equivalent into the outlier equation yields
\begin{align*}
1=(\theta-1)\int \frac{x}{\lambda-x}\,d\mu_c(x).
\end{align*}
The Marchenko-Pastur Stieltjes transform identity shows that the function
\begin{align*}
h_c(\lambda)=\int \frac{x}{\lambda-x}\,d\mu_c(x)
\end{align*}
decreases from $1/\sqrt c$ at the upper edge to $0$ at infinity. Therefore the equation has a solution above the edge exactly when
\begin{align*}
\frac{1}{\theta-1}<\frac{1}{\sqrt c},
\end{align*}
that is, exactly when $\theta>1+\sqrt c$. In that supercritical case, solving the equation gives the unique candidate limit
\begin{align*}
\lambda=\theta\left(1+\frac{c}{\theta-1}\right).
\end{align*}
This identifies the only possible limit of an eigenvalue that remains separated from the noise spectrum.
[/guided]
[/step]
[step:Check that the candidate lies above the noise edge exactly in the supercritical regime]
Define the function
\begin{align*}
\ell_c:(1+\sqrt c,\infty)&\to((1+\sqrt c)^2,\infty),\\
\theta&\mapsto \theta\left(1+\frac{c}{\theta-1}\right).
\end{align*}
Then
\begin{align*}
\ell_c(\theta)-(1+\sqrt c)^2
&=\theta+\frac{c\theta}{\theta-1}-1-2\sqrt c-c \\
&=\frac{(\theta-1-\sqrt c)^2}{\theta-1}.
\end{align*}
For $\theta>1+\sqrt c$ the denominator is positive and the numerator is nonzero, so $\ell_c(\theta)>(1+\sqrt c)^2$. At $\theta=1+\sqrt c$ the candidate touches the edge. When $1<\theta<1+\sqrt c$, the same algebraic expression is not the physical solution of the outlier equation, because the right-hand side
\begin{align*}
\frac{1}{\theta-1}
\end{align*}
lies outside the range
\begin{align*}
\left(0,\frac{1}{\sqrt c}\right)
\end{align*}
of $h_c$ on $((1+\sqrt c)^2,\infty)$. Thus the resolvent equation admits a stable solution separated from the edge precisely when
\begin{align*}
\theta>1+\sqrt c.
\end{align*}
[/step]
[step:Conclude separation above the edge and sticking at the edge below the threshold]
If $\theta>1+\sqrt c$, the previous steps give a unique solution to the outlier equation above $(1+\sqrt c)^2$, and the uniform resolvent convergence on compact subsets of $((1+\sqrt c)^2,\infty)$ implies that the associated sample eigenvalue converges almost surely to
\begin{align*}
\theta\left(1+\frac{c}{\theta-1}\right).
\end{align*}
This limit lies strictly above the upper Marchenko-Pastur edge, so the eigenvalue separates from the bulk.
If $\theta\le 1+\sqrt c$, there is no stable solution of the limiting outlier equation strictly above the edge. More explicitly, fix $\varepsilon>0$. On the compact interval $[(1+\sqrt c)^2+\varepsilon,\varepsilon^{-1}]$, the limiting function satisfies
\begin{align*}
1-(\theta-1)h_c(\lambda)>0
\end{align*}
after decreasing $\varepsilon$ only if necessary, because throughout the interval one has
\begin{align*}
h_c(\lambda)<\frac{1}{\sqrt c}\le \frac{1}{\theta-1}.
\end{align*}
The quoted Baik-Silverstein theorem supplies the corresponding uniform root exclusion and edge-sticking conclusion for the finite-dimensional eigenvalues. Hence the largest eigenvalue has limiting upper and lower edge equal to
\begin{align*}
(1+\sqrt c)^2.
\end{align*}
This proves both the supercritical outlier limit and the subcritical edge-sticking statement.
[guided]
In the supercritical case $\theta>1+\sqrt c$, the algebraic candidate found above lies strictly above the upper Marchenko-Pastur edge. [Uniform convergence](/page/Uniform%20Convergence) of the resolvent quadratic form on compact intervals above the edge transfers the deterministic zero of the limiting scalar equation to a random zero of the finite-$n$ scalar equation. This random zero is exactly the separated sample eigenvalue, so the eigenvalue converges almost surely to
\begin{align*}
\theta\left(1+\frac{c}{\theta-1}\right).
\end{align*}
Because this number is larger than $(1+\sqrt c)^2$, the eigenvalue separates from the bulk.
In the subcritical case $\theta\le 1+\sqrt c$, the deterministic scalar equation has no stable solution strictly above the edge. Indeed, $h_c$ takes values only in
\begin{align*}
\left(0,\frac{1}{\sqrt c}\right)
\end{align*}
above the edge, while
\begin{align*}
\frac{1}{\theta-1}\ge \frac{1}{\sqrt c}.
\end{align*}
The quoted Baik-Silverstein theorem turns this deterministic no-root condition into the finite-dimensional no-outlier and edge-sticking conclusion. Thus the largest sample eigenvalue converges almost surely to
\begin{align*}
(1+\sqrt c)^2.
\end{align*}
Together these two cases are exactly the BBP phase transition stated in the theorem.
[/guided]
[/step]