[proofplan]
The likelihood and observational distribution are unchanged by right multiplication of the loading matrix by an orthogonal matrix, because this operation preserves $\Lambda\Lambda^\top$. We therefore choose the orthogonal rotation to diagonalize the symmetric matrix $\hat{\Lambda}^\top\hat{\Psi}^{-1}\hat{\Lambda}$. The finite-dimensional real spectral theorem supplies such an orthogonal matrix, and the covariance calculation shows that the rotated representative remains in the same maximum likelihood equivalence class.
[/proofplan]
[step:Form the symmetric weighted Gram matrix of the loading representative]
Since $\hat{\Psi}$ is diagonal with strictly positive diagonal entries, it is invertible, and $\hat{\Psi}^{-1} \in \mathbb{R}^{p \times p}$ is again diagonal with strictly positive diagonal entries. Define the matrix
\begin{align*}
A := \hat{\Lambda}^\top \hat{\Psi}^{-1}\hat{\Lambda} \in \mathbb{R}^{m \times m}.
\end{align*}
The matrix $A$ is symmetric, because the diagonal matrix $\hat{\Psi}^{-1}$ is symmetric and hence
\begin{align*}
A^\top
&= \left(\hat{\Lambda}^\top \hat{\Psi}^{-1}\hat{\Lambda}\right)^\top \\
&= \hat{\Lambda}^\top \left(\hat{\Psi}^{-1}\right)^\top \hat{\Lambda} \\
&= \hat{\Lambda}^\top \hat{\Psi}^{-1}\hat{\Lambda} \\
&= A.
\end{align*}
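As a sanity check on a small case, the following sketch evaluates $A$ for $p = 3$ and $m = 2$. The numerical values of $\hat{\Lambda}$ and $\hat{\Psi}$ are hypothetical, chosen for easy arithmetic rather than obtained from an actual maximum likelihood fit. Take
\begin{align*}
\hat{\Lambda} = \begin{pmatrix} 2 & 2 \\ 1 & -1 \\ 1 & 1 \end{pmatrix},
\qquad
\hat{\Psi} = \operatorname{diag}(2,1,1),
\qquad
\hat{\Psi}^{-1} = \operatorname{diag}\left(\tfrac{1}{2},1,1\right).
\end{align*}
Each entry of $A$ is a sum over the $p = 3$ rows of $\hat{\Lambda}$, weighted by the diagonal of $\hat{\Psi}^{-1}$:
\begin{align*}
A_{11} &= \tfrac{1}{2}(2)(2) + (1)(1) + (1)(1) = 4, \\
A_{22} &= \tfrac{1}{2}(2)(2) + (-1)(-1) + (1)(1) = 4, \\
A_{12} = A_{21} &= \tfrac{1}{2}(2)(2) + (1)(-1) + (1)(1) = 2.
\end{align*}
Hence, in this illustrative case,
\begin{align*}
A = \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix},
\end{align*}
which is symmetric, as the general argument predicts.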
[/step]
[step:Choose an orthogonal rotation that diagonalizes the weighted Gram matrix]
By the finite-dimensional real spectral theorem for symmetric matrices (citing a result not yet in the wiki: Spectral Theorem for Real Symmetric Matrices), since $A \in \mathbb{R}^{m \times m}$ is symmetric, there exist an orthogonal matrix $T \in \mathbb{R}^{m \times m}$ and a diagonal matrix $D \in \mathbb{R}^{m \times m}$ such that
\begin{align*}
T^\top A T = D.
\end{align*}
Substituting the definition of $A$ gives
\begin{align*}
T^\top \hat{\Lambda}^\top \hat{\Psi}^{-1}\hat{\Lambda}T = D.
\end{align*}
Equivalently,
\begin{align*}
(\hat{\Lambda}T)^\top \hat{\Psi}^{-1}(\hat{\Lambda}T) = D,
\end{align*}
so the rotated loading matrix $\hat{\Lambda}T$ has diagonal weighted Gram matrix.
[guided]
The matrix we want to make diagonal is
\begin{align*}
A := \hat{\Lambda}^\top \hat{\Psi}^{-1}\hat{\Lambda}.
\end{align*}
The previous step verified that $A$ is a real symmetric $m \times m$ matrix. This is exactly the hypothesis needed for the finite-dimensional real spectral theorem for symmetric matrices (citing a result not yet in the wiki: Spectral Theorem for Real Symmetric Matrices). That theorem gives an orthogonal matrix $T \in \mathbb{R}^{m \times m}$ whose columns are an orthonormal eigenbasis for $A$, and it gives a diagonal matrix $D \in \mathbb{R}^{m \times m}$ such that
\begin{align*}
T^\top A T = D.
\end{align*}
Now replace $A$ by its definition. We obtain
\begin{align*}
T^\top \hat{\Lambda}^\top \hat{\Psi}^{-1}\hat{\Lambda}T = D.
\end{align*}
Because matrix multiplication is associative and $T^\top\hat{\Lambda}^\top = (\hat{\Lambda}T)^\top$, the left-hand side is exactly
\begin{align*}
(\hat{\Lambda}T)^\top \hat{\Psi}^{-1}(\hat{\Lambda}T).
\end{align*}
Thus the same orthogonal matrix $T$ that diagonalizes $A$ also makes the rotated representative satisfy
\begin{align*}
(\hat{\Lambda}T)^\top \hat{\Psi}^{-1}(\hat{\Lambda}T) = D,
\end{align*}
with $D$ diagonal.
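A small numerical illustration may help here. The values below are hypothetical, chosen for easy arithmetic with $p = 3$ and $m = 2$, and are not claimed to be actual maximum likelihood estimates. Take
\begin{align*}
\hat{\Lambda} = \begin{pmatrix} 2 & 2 \\ 1 & -1 \\ 1 & 1 \end{pmatrix},
\qquad
\hat{\Psi} = \operatorname{diag}(2,1,1),
\qquad
A = \hat{\Lambda}^\top \hat{\Psi}^{-1}\hat{\Lambda} = \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix}.
\end{align*}
The vectors $(1,1)^\top$ and $(1,-1)^\top$ are eigenvectors of this $A$ with eigenvalues $6$ and $2$. Normalizing them gives one possible choice of orthogonal matrix:
\begin{align*}
T = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
\qquad
T^\top A T = \begin{pmatrix} 6 & 0 \\ 0 & 2 \end{pmatrix} = D.
\end{align*}
The rotated loading matrix and its weighted Gram matrix are then
\begin{align*}
\hat{\Lambda}T = \begin{pmatrix} 2\sqrt{2} & 0 \\ 0 & \sqrt{2} \\ \sqrt{2} & 0 \end{pmatrix},
\qquad
(\hat{\Lambda}T)^\top \hat{\Psi}^{-1}(\hat{\Lambda}T)
= \begin{pmatrix} \tfrac{1}{2}(8) + 0 + 2 & 0 \\ 0 & 0 + 2 + 0 \end{pmatrix}
= \begin{pmatrix} 6 & 0 \\ 0 & 2 \end{pmatrix}.
\end{align*}
In this example the diagonal entries of $D$ are the eigenvalues of $A$. A different ordering or sign choice for the eigenvectors would give another valid $T$.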
[/guided]
[/step]
[step:Verify that the rotation preserves the fitted covariance and likelihood]
Since $T$ is orthogonal, $TT^\top = I_m$, where $I_m \in \mathbb{R}^{m \times m}$ denotes the identity matrix. Therefore the fitted covariance matrix of the rotated pair $(\hat{\Lambda}T,\hat{\Psi})$ is
\begin{align*}
\Sigma(\hat{\Lambda}T,\hat{\Psi})
&= (\hat{\Lambda}T)(\hat{\Lambda}T)^\top + \hat{\Psi} \\
&= \hat{\Lambda}T T^\top \hat{\Lambda}^\top + \hat{\Psi} \\
&= \hat{\Lambda} I_m \hat{\Lambda}^\top + \hat{\Psi} \\
&= \hat{\Lambda}\hat{\Lambda}^\top + \hat{\Psi} \\
&= \Sigma(\hat{\Lambda},\hat{\Psi}).
\end{align*}
By hypothesis, the Gaussian likelihood depends on the parameters only through the covariance matrix $\Sigma(\Lambda,\Psi)$. Hence $(\hat{\Lambda}T,\hat{\Psi})$ has the same likelihood value as $(\hat{\Lambda},\hat{\Psi})$. Since $(\hat{\Lambda},\hat{\Psi})$ is a maximum likelihood representative, $(\hat{\Lambda}T,\hat{\Psi})$ is also a maximum likelihood representative. The equality of covariance matrices also shows that the two representatives are observationally equivalent.
[guided]
The only remaining point is to check that the rotation did not change the statistical model represented by the parameters. The covariance map is
\begin{align*}
\Sigma(\Lambda,\Psi) := \Lambda\Lambda^\top + \Psi.
\end{align*}
For the rotated pair $(\hat{\Lambda}T,\hat{\Psi})$, we compute directly:
\begin{align*}
\Sigma(\hat{\Lambda}T,\hat{\Psi})
&= (\hat{\Lambda}T)(\hat{\Lambda}T)^\top + \hat{\Psi} \\
&= \hat{\Lambda}T T^\top \hat{\Lambda}^\top + \hat{\Psi}.
\end{align*}
The orthogonality of $T$ means $TT^\top = I_m$, so this becomes
\begin{align*}
\Sigma(\hat{\Lambda}T,\hat{\Psi})
&= \hat{\Lambda} I_m \hat{\Lambda}^\top + \hat{\Psi} \\
&= \hat{\Lambda}\hat{\Lambda}^\top + \hat{\Psi} \\
&= \Sigma(\hat{\Lambda},\hat{\Psi}).
\end{align*}
Thus the rotated and unrotated representatives determine the same covariance matrix. Since the Gaussian likelihood is assumed to depend on $(\Lambda,\Psi)$ only through $\Sigma(\Lambda,\Psi)$, the two representatives have the same likelihood value. The original representative $(\hat{\Lambda},\hat{\Psi})$ is maximum likelihood by hypothesis, so the rotated representative $(\hat{\Lambda}T,\hat{\Psi})$ must also be maximum likelihood. The same covariance equality proves observational equivalence.
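The invariance can also be checked by hand on a small case. The values below are hypothetical, chosen for easy arithmetic with $p = 3$ and $m = 2$, and are not claimed to be actual maximum likelihood estimates. Take
\begin{align*}
\hat{\Lambda} = \begin{pmatrix} 2 & 2 \\ 1 & -1 \\ 1 & 1 \end{pmatrix},
\qquad
\hat{\Psi} = \operatorname{diag}(2,1,1),
\qquad
T = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
\end{align*}
where $T$ is orthogonal because its columns are orthonormal. Then
\begin{align*}
\hat{\Lambda}T = \begin{pmatrix} 2\sqrt{2} & 0 \\ 0 & \sqrt{2} \\ \sqrt{2} & 0 \end{pmatrix},
\end{align*}
and the two outer products agree entry by entry:
\begin{align*}
\hat{\Lambda}\hat{\Lambda}^\top
= \begin{pmatrix} 8 & 0 & 4 \\ 0 & 2 & 0 \\ 4 & 0 & 2 \end{pmatrix}
= (\hat{\Lambda}T)(\hat{\Lambda}T)^\top.
\end{align*}
Adding $\hat{\Psi}$ to both sides gives the common fitted covariance
\begin{align*}
\Sigma(\hat{\Lambda}T,\hat{\Psi}) = \Sigma(\hat{\Lambda},\hat{\Psi})
= \begin{pmatrix} 10 & 0 & 4 \\ 0 & 3 & 0 \\ 4 & 0 & 3 \end{pmatrix},
\end{align*}
even though $\hat{\Lambda}T \neq \hat{\Lambda}$.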
[/guided]
[/step]
[step:Conclude the existence of a diagonalizing maximum likelihood representative]
The orthogonal matrix $T$ constructed above has both required properties. First, it preserves the fitted covariance:
\begin{align*}
\Sigma(\hat{\Lambda}T,\hat{\Psi}) = \Sigma(\hat{\Lambda},\hat{\Psi}).
\end{align*}
Second, the weighted Gram matrix of the rotated loadings,
\begin{align*}
(\hat{\Lambda}T)^\top \hat{\Psi}^{-1}(\hat{\Lambda}T),
\end{align*}
is diagonal. Therefore $(\hat{\Lambda}T,\hat{\Psi})$ is an observationally equivalent maximum likelihood representative whose weighted loading Gram matrix is diagonal. This proves the theorem.
[/step]