[proofplan]
We write the estimation error as an average of independent centered random matrices. Expanding the squared Frobenius norm eliminates all cross terms by independence and centering, reducing the risk to the second moment of one summand. That one-summand moment is computed by expanding $\|XX^\top-\Sigma\|_F^2$ and evaluating the required Gaussian fourth moments from the Gaussian moment generating function.
[/proofplan]
custom_env
admin
[step:Rewrite the covariance error as an average of centered independent matrices]
For each $i \in \{1,\dots,n\}$, define the random matrix $Y_i: \Omega \to \mathbb{R}^{p \times p}$ by
\begin{align*}
Y_i(\omega)=X_i(\omega)X_i(\omega)^\top-\Sigma .
\end{align*}
Since $\mathbb{E}[X_iX_i^\top]=\Sigma$, each entry of $Y_i$ has expectation $0$. The matrices $Y_1,\dots,Y_n$ are independent because $X_1,\dots,X_n$ are independent and $Y_i$ is a measurable function of $X_i$. By the definition of $\widehat{\Sigma}$,
\begin{align*}
\widehat{\Sigma}-\Sigma=\frac{1}{n}\sum_{i=1}^n Y_i .
\end{align*}
[/step]
custom_env
admin
[step:Use independence to remove all Frobenius cross terms]Let $\langle A,B\rangle_F := \operatorname{tr}(A^\top B)$ denote the Frobenius [inner product](/page/Inner%20Product) on $\mathbb{R}^{p \times p}$. Expanding the square gives
\begin{align*}
\mathbb{E}\left[\|\widehat{\Sigma}-\Sigma\|_F^2\right]=\mathbb{E}\left[\left\langle \frac{1}{n}\sum_{i=1}^n Y_i,\frac{1}{n}\sum_{j=1}^n Y_j\right\rangle_F\right]=\frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n\mathbb{E}\left[\langle Y_i,Y_j\rangle_F\right].
\end{align*}
If $i \neq j$, then independence and centering give
\begin{align*}
\mathbb{E}\left[\langle Y_i,Y_j\rangle_F\right]=\sum_{a=1}^p\sum_{b=1}^p\mathbb{E}\left[(Y_i)_{ab}(Y_j)_{ab}\right]=\sum_{a=1}^p\sum_{b=1}^p\mathbb{E}\left[(Y_i)_{ab}\right]\mathbb{E}\left[(Y_j)_{ab}\right]=0.
\end{align*}
Therefore only the diagonal terms remain. Since the $Y_i$ are identically distributed,
\begin{align*}
\mathbb{E}\left[\|\widehat{\Sigma}-\Sigma\|_F^2\right]=\frac{1}{n}\mathbb{E}\left[\|Y_1\|_F^2\right].
\end{align*}[/step]
custom_env
admin
[guided]The point of introducing $Y_i$ is that the sample covariance error becomes an average of independent mean-zero objects. We use the Frobenius inner product
\begin{align*}
\langle A,B\rangle_F := \operatorname{tr}(A^\top B)
\end{align*}
on $\mathbb{R}^{p \times p}$, so that $\|A\|_F^2=\langle A,A\rangle_F$. Expanding the norm of the average gives
\begin{align*}
\mathbb{E}\left[\|\widehat{\Sigma}-\Sigma\|_F^2\right]=\mathbb{E}\left[\left\langle \frac{1}{n}\sum_{i=1}^n Y_i,\frac{1}{n}\sum_{j=1}^n Y_j\right\rangle_F\right]=\frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n\mathbb{E}\left[\langle Y_i,Y_j\rangle_F\right].
\end{align*}
Now consider a cross term with $i \neq j$. Since $Y_i$ and $Y_j$ are independent random matrices, their corresponding entries are independent real-valued random variables. Since each entry is centered,
\begin{align*}
\mathbb{E}\left[\langle Y_i,Y_j\rangle_F\right]=\sum_{a=1}^p\sum_{b=1}^p\mathbb{E}\left[(Y_i)_{ab}(Y_j)_{ab}\right]=\sum_{a=1}^p\sum_{b=1}^p\mathbb{E}\left[(Y_i)_{ab}\right]\mathbb{E}\left[(Y_j)_{ab}\right]=0.
\end{align*}
Thus all cross terms vanish. The remaining $n$ diagonal terms are equal because the matrices $Y_1,\dots,Y_n$ have the same distribution. Hence
\begin{align*}
\mathbb{E}\left[\|\widehat{\Sigma}-\Sigma\|_F^2\right]=\frac{1}{n}\mathbb{E}\left[\|Y_1\|_F^2\right].
\end{align*}[/guided]
custom_env
admin
[step:Expand the one-sample Frobenius square into Gaussian moments]
Let $X: \Omega \to \mathbb{R}^p$ be a random vector with law $\mathcal{N}(0,\Sigma)$, and define the random matrix $Y: \Omega \to \mathbb{R}^{p \times p}$ by
\begin{align*}
Y(\omega)=X(\omega)X(\omega)^\top-\Sigma .
\end{align*}
Since $\Sigma^\top=\Sigma$,
\begin{align*}
\|Y\|_F^2=\|XX^\top\|_F^2-2\operatorname{tr}(XX^\top\Sigma)+\|\Sigma\|_F^2.
\end{align*}
The first term satisfies
\begin{align*}
\|XX^\top\|_F^2=\operatorname{tr}(XX^\top XX^\top)=(X^\top X)^2=|X|^4,
\end{align*}
and the trace cyclicity identity gives
\begin{align*}
\operatorname{tr}(XX^\top\Sigma)=\operatorname{tr}(\Sigma XX^\top)=X^\top\Sigma X.
\end{align*}
Thus
\begin{align*}
\mathbb{E}[\|Y\|_F^2]=\mathbb{E}[|X|^4]-2\mathbb{E}[X^\top\Sigma X]+\|\Sigma\|_F^2.
\end{align*}
[/step]
custom_env
admin
[step:Compute the required Gaussian fourth moments]
Because $X \sim \mathcal{N}(0,\Sigma)$, its moment generating function $M:\mathbb{R}^p \to \mathbb{R}$ is finite on all of $\mathbb{R}^p$ and is given by
\begin{align*}
M(t)=\mathbb{E}\left[e^{t^\top X}\right]=\exp\left(\frac{1}{2}t^\top\Sigma t\right).
\end{align*}
The finiteness of $M$ on a neighbourhood of $0$ justifies differentiating the moment generating function at $0$ to recover moments. For all $a,b,c,d \in \{1,\dots,p\}$, differentiating at $t=0$ gives
\begin{align*}
\mathbb{E}[X_aX_b]=\Sigma_{ab}.
\end{align*}
It also gives the Gaussian fourth-moment identity
\begin{align*}
\mathbb{E}[X_aX_bX_cX_d]=\Sigma_{ab}\Sigma_{cd}+\Sigma_{ac}\Sigma_{bd}+\Sigma_{ad}\Sigma_{bc}.
\end{align*}
Therefore
\begin{align*}
\mathbb{E}[|X|^4]=\mathbb{E}\left[\left(\sum_{a=1}^p X_a^2\right)\left(\sum_{b=1}^p X_b^2\right)\right]=\sum_{a=1}^p\sum_{b=1}^p\mathbb{E}[X_aX_aX_bX_b]=\sum_{a=1}^p\sum_{b=1}^p\left(\Sigma_{aa}\Sigma_{bb}+2\Sigma_{ab}^2\right)=\operatorname{tr}(\Sigma)^2+2\|\Sigma\|_F^2.
\end{align*}
Also,
\begin{align*}
\mathbb{E}[X^\top\Sigma X]=\mathbb{E}\left[\sum_{a=1}^p\sum_{b=1}^p\Sigma_{ab}X_aX_b\right]=\sum_{a=1}^p\sum_{b=1}^p\Sigma_{ab}\mathbb{E}[X_aX_b]=\sum_{a=1}^p\sum_{b=1}^p\Sigma_{ab}^2=\|\Sigma\|_F^2,
\end{align*}
where the final equality uses the symmetry $\Sigma_{ab}=\Sigma_{ba}$.
[/step]
custom_env
admin
[step:Substitute the one-sample moment into the risk identity]
Substituting the two Gaussian moment computations into the one-sample expansion gives
\begin{align*}
\mathbb{E}[\|Y\|_F^2]=\left(\operatorname{tr}(\Sigma)^2+2\|\Sigma\|_F^2\right)-2\|\Sigma\|_F^2+\|\Sigma\|_F^2=\operatorname{tr}(\Sigma)^2+\|\Sigma\|_F^2.
\end{align*}
Since
\begin{align*}
\mathbb{E}\left[\|\widehat{\Sigma}-\Sigma\|_F^2\right]=\frac{1}{n}\mathbb{E}[\|Y\|_F^2],
\end{align*}
we obtain
\begin{align*}
\mathbb{E}\left[\|\widehat{\Sigma}-\Sigma\|_F^2\right]=\frac{1}{n}\left(\operatorname{tr}(\Sigma)^2+\|\Sigma\|_F^2\right).
\end{align*}
This is the claimed Frobenius risk formula.
[/step]