[proofplan]
We embed a finite testing problem into the covariance class by perturbing a scalar covariance matrix in many rank-one directions. A spherical packing gives exponentially many unit vectors whose rank-one projectors are separated in operator norm. The Kullback-Leibler divergence between any two of the corresponding Gaussian product measures is at most a constant multiple of $n$ times the squared perturbation size, uniformly over the packing. Choosing a perturbation size of order $\sqrt{p/n}\wedge 1$ keeps these divergences below a small fraction of the log-cardinality of the packing, so [Fano's inequality](/theorems/1654) forces a nontrivial testing error, and the [testing-to-estimation reduction](/theorems/5895) converts that error into an operator-norm risk lower bound.
[/proofplan]
[step:Build many separated rank-one perturbations inside the spectrum class]
Set
\begin{align*}
a:=\frac{m+M}{2},
\qquad
r:=\frac{M-m}{4}.
\end{align*}
For each unit vector $u\in\mathbb R^p$, define the rank-one [orthogonal projection](/theorems/437) $P_u:\mathbb R^p\to\mathbb R^p$ by
\begin{align*}
P_u(x):=(x\cdot u)u \quad \text{for every } x\in\mathbb R^p.
\end{align*}
Equivalently, in matrix notation, $P_u=uu^\top$.
We use the following elementary spherical packing fact: there exist unit vectors $u_1,\dots,u_N\in\mathbb R^p$ such that
\begin{align*}
N\ge \exp(\beta p)
\end{align*}
for a universal constant $\beta>0$, and
\begin{align*}
|u_i\cdot u_j|\le \frac12
\end{align*}
for all $i\ne j$. For such $i\ne j$, the operator norm of the difference of the corresponding rank-one projections is
\begin{align*}
\|P_{u_i}-P_{u_j}\|_{\mathrm{op}}
=
\sqrt{1-|u_i\cdot u_j|^2}
\ge
\frac{\sqrt 3}{2}.
\end{align*}
Let $\lambda\in(0,r]$ be chosen later. For each $j\in\{1,\dots,N\}$, define
\begin{align*}
\Sigma_j:=aI_p+\lambda P_{u_j}.
\end{align*}
The eigenvalues of $\Sigma_j$ are $a+\lambda$ in the direction $\operatorname{span}\{u_j\}$ and $a$ on its orthogonal complement. Since $\lambda\le r$, we have
\begin{align*}
m
<
a-r
\le
a
\le
a+\lambda
\le
a+r
<
M.
\end{align*}
Thus $\Sigma_j\in\mathcal C_p(m,M)$ for every $j$.
Moreover, for $i\ne j$,
\begin{align*}
\|\Sigma_i-\Sigma_j\|_{\mathrm{op}}
=
\lambda\|P_{u_i}-P_{u_j}\|_{\mathrm{op}}
\ge
\frac{\sqrt 3}{2}\lambda.
\end{align*}
[guided]
The point of using rank-one perturbations is that they produce many well-separated covariance matrices whose pairwise Kullback-Leibler divergences remain small. We start from the scalar matrix $aI_p$, which lies in the middle of the spectral interval $[m,M]$, and add a small positive perturbation in one direction.
Define
\begin{align*}
a:=\frac{m+M}{2},
\qquad
r:=\frac{M-m}{4}.
\end{align*}
The number $r$ is a spectral safety margin: if $0<\lambda\le r$, then $a+\lambda$ is still strictly below $M$, while $a$ is strictly above $m$.
For a unit vector $u\in\mathbb R^p$, define the map $P_u:\mathbb R^p\to\mathbb R^p$ by
\begin{align*}
P_u(x):=(x\cdot u)u \quad \text{for every } x\in\mathbb R^p.
\end{align*}
This is the orthogonal projection onto the line $\operatorname{span}\{u\}$. Its matrix is $uu^\top$, and its only eigenvalues are $1$ on $\operatorname{span}\{u\}$ and $0$ on $\operatorname{span}\{u\}^{\perp}$.
Choose unit vectors $u_1,\dots,u_N\in\mathbb R^p$ with
\begin{align*}
N\ge \exp(\beta p),
\qquad
|u_i\cdot u_j|\le \frac12
\quad\text{for }i\ne j,
\end{align*}
where $\beta>0$ is a universal constant. This is the standard volumetric packing construction on the Euclidean unit sphere.
For each $j$, define
\begin{align*}
\Sigma_j:=aI_p+\lambda P_{u_j}.
\end{align*}
Since $P_{u_j}$ has eigenvalue $1$ in the direction $u_j$ and eigenvalue $0$ on $u_j^\perp$, the covariance matrix $\Sigma_j$ has eigenvalue $a+\lambda$ in the direction $u_j$ and eigenvalue $a$ on $u_j^\perp$. Therefore, if $\lambda\le r$,
\begin{align*}
m
<
a-r
\le
a
\le
a+\lambda
\le
a+r
<
M.
\end{align*}
So every $\Sigma_j$ belongs to $\mathcal C_p(m,M)$.
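As a quick check of the spectral margin, assuming $m<M$ as the strict inequalities in the chain require, the two endpoints can be written out explicitly:
\begin{align*}
a-r=\frac{m+M}{2}-\frac{M-m}{4}=\frac{3m+M}{4}>m,
\qquad
a+r=\frac{m+M}{2}+\frac{M-m}{4}=\frac{m+3M}{4}<M.
\end{align*}
Each of the two strict inequalities is equivalent to $m<M$.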
Finally, the separation of the directions gives separation of the covariance matrices. For rank-one orthogonal projections,
\begin{align*}
\|P_{u_i}-P_{u_j}\|_{\mathrm{op}}
=
\sqrt{1-|u_i\cdot u_j|^2}.
\end{align*}
Since $|u_i\cdot u_j|\le 1/2$, this gives
\begin{align*}
\|\Sigma_i-\Sigma_j\|_{\mathrm{op}}
=
\lambda\|P_{u_i}-P_{u_j}\|_{\mathrm{op}}
\ge
\frac{\sqrt 3}{2}\lambda.
\end{align*}
Thus the parameter set contains exponentially many covariances separated by order $\lambda$ in operator norm.
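The operator-norm identity for rank-one projections used above admits a short verification. The following is a sketch; the symbols $D$, $c$, and $\mu$ are notation introduced here for illustration only. Fix $i\ne j$ and put $D:=P_{u_i}-P_{u_j}$ and $c:=u_i\cdot u_j$. The matrix $D$ is symmetric, has rank at most $2$, and satisfies
\begin{align*}
\operatorname{tr}(D)=1-1=0,
\qquad
\operatorname{tr}(D^2)
=
\operatorname{tr}(P_{u_i})+\operatorname{tr}(P_{u_j})-2\operatorname{tr}(P_{u_i}P_{u_j})
=
2-2c^2.
\end{align*}
Since the trace vanishes and at most two eigenvalues can be nonzero, these two eigenvalues are $\mu$ and $-\mu$ for some $\mu\ge 0$. Hence $\operatorname{tr}(D^2)=2\mu^2$, and
\begin{align*}
\|D\|_{\mathrm{op}}=\mu=\sqrt{1-c^2}.
\end{align*}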
[/guided]
[/step]
[step:Bound the Gaussian product divergences]
For $j\in\{1,\dots,N\}$, let $\mathbb P_j$ denote the joint law of $(X_1,\dots,X_n)$ when $X_1,\dots,X_n$ are independent with common distribution $\mathcal N(0,\Sigma_j)$.
For two positive definite matrices $\Sigma,\Gamma\in\mathbb R^{p\times p}$, the Kullback-Leibler divergence between the centred Gaussian product laws is
\begin{align*}
D_{\mathrm{KL}}\left(\mathcal N(0,\Sigma)^{\otimes n}\,\middle\|\,\mathcal N(0,\Gamma)^{\otimes n}\right)
=
\frac n2
\left(
\operatorname{tr}(\Gamma^{-1}\Sigma-I_p)
-
\log\det(\Gamma^{-1}\Sigma)
\right).
\end{align*}
If $\Sigma$ and $\Gamma$ both have all eigenvalues in $[m,M]$, then the eigenvalues of $\Gamma^{-1/2}\Sigma\Gamma^{-1/2}$ lie in $[m/M,M/m]$. The scalar inequality
\begin{align*}
t-1-\log t\le L_{m,M}(t-1)^2
\end{align*}
holds for every $t\in[m/M,M/m]$, where
\begin{align*}
L_{m,M}:=\sup_{t\in[m/M,M/m]}
\frac{t-1-\log t}{(t-1)^2}
\end{align*}
with the value at $t=1$ interpreted as $1/2$.
Applying this inequality to the eigenvalues of $\Gamma^{-1/2}\Sigma\Gamma^{-1/2}$ gives
\begin{align*}
D_{\mathrm{KL}}\left(\mathcal N(0,\Sigma)^{\otimes n}\,\middle\|\,\mathcal N(0,\Gamma)^{\otimes n}\right)
\le
\frac{nL_{m,M}}{2m^2}\|\Sigma-\Gamma\|_F^2.
\end{align*}
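The step from the scalar inequality to the Frobenius-norm bound can be spelled out as follows. This is a sketch in which $t_1,\dots,t_p$ (notation introduced here) denote the eigenvalues of $\Gamma^{-1/2}\Sigma\Gamma^{-1/2}$, which coincide with those of the similar matrix $\Gamma^{-1}\Sigma$:
\begin{align*}
\operatorname{tr}(\Gamma^{-1}\Sigma-I_p)-\log\det(\Gamma^{-1}\Sigma)
&=
\sum_{k=1}^p\left(t_k-1-\log t_k\right)
\le
L_{m,M}\sum_{k=1}^p(t_k-1)^2
\\
&=
L_{m,M}\left\|\Gamma^{-1/2}(\Sigma-\Gamma)\Gamma^{-1/2}\right\|_F^2
\le
L_{m,M}\|\Gamma^{-1}\|_{\mathrm{op}}^2\|\Sigma-\Gamma\|_F^2
\le
\frac{L_{m,M}}{m^2}\|\Sigma-\Gamma\|_F^2.
\end{align*}
The last inequality uses that the smallest eigenvalue of $\Gamma$ is at least $m$. Multiplying by $n/2$ gives the bound displayed above.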
For $\Sigma_i-\Sigma_j=\lambda(P_{u_i}-P_{u_j})$, we have
\begin{align*}
\|P_{u_i}-P_{u_j}\|_F^2
=
2-2(u_i\cdot u_j)^2
\le 2.
\end{align*}
Hence, for every $i,j$,
\begin{align*}
D_{\mathrm{KL}}(\mathbb P_i\|\mathbb P_j)
\le
\frac{nL_{m,M}}{m^2}\lambda^2.
\end{align*}
[/step]
[step:Choose the perturbation size so Fano applies]
Let
\begin{align*}
\kappa
:=
\min\left\{
r,\,
\frac{m}{4}\sqrt{\frac{\beta}{L_{m,M}}}
\right\}
\end{align*}
and set
\begin{align*}
\lambda:=\kappa\left(\sqrt{\frac{p}{n}}\wedge 1\right).
\end{align*}
Then $\lambda\le r$, so the covariance matrices constructed above remain in $\mathcal C_p(m,M)$.
Since $\lambda^2\le \kappa^2(p/n)$, the divergence bound gives
\begin{align*}
D_{\mathrm{KL}}(\mathbb P_i\|\mathbb P_j)
\le
\frac{L_{m,M}}{m^2}n\lambda^2
\le
\frac{L_{m,M}\kappa^2}{m^2}p
\le
\frac{\beta}{16}p.
\end{align*}
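The final inequality in this chain comes from the second entry in the minimum defining $\kappa$. Written out,
\begin{align*}
\frac{L_{m,M}\kappa^2}{m^2}
\le
\frac{L_{m,M}}{m^2}\cdot\frac{m^2}{16}\cdot\frac{\beta}{L_{m,M}}
=
\frac{\beta}{16}.
\end{align*}
The middle inequality of the chain uses $n\lambda^2=n\kappa^2\left(\frac{p}{n}\wedge 1\right)\le\kappa^2p$.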
Because $N\ge \exp(\beta p)$, we have $\log N\ge \beta p$, and therefore
\begin{align*}
\max_{i,j}D_{\mathrm{KL}}(\mathbb P_i\|\mathbb P_j)
\le
\frac{1}{16}\log N.
\end{align*}
By Fano's inequality, every measurable testing rule
\begin{align*}
\widehat J:(\mathbb R^p)^n\to\{1,\dots,N\}
\end{align*}
satisfies
\begin{align*}
\sup_{j\in\{1,\dots,N\}}
\mathbb P_j(\widehat J\ne j)
\ge
\alpha
\end{align*}
for a universal constant $\alpha>0$.
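For concreteness, one admissible value of $\alpha$ can be computed under two assumptions that are not part of the statement above: that Fano's inequality is used in the common form
\begin{align*}
\sup_{j\in\{1,\dots,N\}}
\mathbb P_j(\widehat J\ne j)
\ge
1-\frac{\max_{i,j}D_{\mathrm{KL}}(\mathbb P_i\|\mathbb P_j)+\log 2}{\log N},
\end{align*}
and that $N\ge 4$. Under these assumptions, the divergence bound $\max_{i,j}D_{\mathrm{KL}}(\mathbb P_i\|\mathbb P_j)\le\frac{1}{16}\log N$ gives
\begin{align*}
\sup_{j\in\{1,\dots,N\}}
\mathbb P_j(\widehat J\ne j)
\ge
1-\frac{1}{16}-\frac{\log 2}{\log N}
\ge
1-\frac{1}{16}-\frac12
=
\frac{7}{16},
\end{align*}
so $\alpha=7/16$ would be admissible in that case.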
[/step]
[step:Convert testing error into estimation risk]
Let
\begin{align*}
\widetilde\Sigma:(\mathbb R^p)^n\to\mathbb R^{p\times p}
\end{align*}
be any measurable estimator. From it, define the nearest-neighbour testing rule $\widehat J:(\mathbb R^p)^n\to\{1,\dots,N\}$ by
\begin{align*}
\widehat J(x):=\min\operatorname*{argmin}_{1\le k\le N}
\|\widetilde\Sigma(x)-\Sigma_k\|_{\mathrm{op}} \quad \text{for every } x\in(\mathbb R^p)^n,
\end{align*}
where the outer minimum selects the smallest index among the minimisers and serves only to break ties.
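If the true index is $j$ and $\widehat J(x)\ne j$, then $\|\widetilde\Sigma(x)-\Sigma_{\widehat J(x)}\|_{\mathrm{op}}\le\|\widetilde\Sigma(x)-\Sigma_j\|_{\mathrm{op}}$ because $\widehat J(x)$ minimises the distance to $\widetilde\Sigma(x)$, so the triangle inequality gives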
\begin{align*}
\|\Sigma_j-\Sigma_{\widehat J(x)}\|_{\mathrm{op}}
\le
\|\Sigma_j-\widetilde\Sigma(x)\|_{\mathrm{op}}
+
\|\widetilde\Sigma(x)-\Sigma_{\widehat J(x)}\|_{\mathrm{op}}
\le
2\|\widetilde\Sigma(x)-\Sigma_j\|_{\mathrm{op}}.
\end{align*}
Since distinct parameter points are separated by at least $(\sqrt 3/2)\lambda$, the event $\{\widehat J\ne j\}$ implies
\begin{align*}
\|\widetilde\Sigma-\Sigma_j\|_{\mathrm{op}}
\ge
\frac{\sqrt 3}{4}\lambda.
\end{align*}
Therefore,
\begin{align*}
\mathbb E_j\left[\|\widetilde\Sigma-\Sigma_j\|_{\mathrm{op}}\right]
\ge
\frac{\sqrt 3}{4}\lambda\,\mathbb P_j(\widehat J\ne j).
\end{align*}
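This bound is an instance of Markov's inequality. Written out, with $Z:=\|\widetilde\Sigma-\Sigma_j\|_{\mathrm{op}}$ (notation introduced here for illustration),
\begin{align*}
\mathbb E_j[Z]
\ge
\mathbb E_j\left[Z\,\mathbf 1\{\widehat J\ne j\}\right]
\ge
\frac{\sqrt 3}{4}\lambda\,\mathbb P_j(\widehat J\ne j),
\end{align*}
where the first step uses $Z\ge 0$ and the second uses $Z\ge(\sqrt 3/4)\lambda$ on the event $\{\widehat J\ne j\}$.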
Taking the supremum over $j$ and using the Fano lower bound,
\begin{align*}
\sup_{1\le j\le N}
\mathbb E_j\left[\|\widetilde\Sigma-\Sigma_j\|_{\mathrm{op}}\right]
\ge
\frac{\sqrt 3}{4}\alpha\lambda.
\end{align*}
Since $\{\Sigma_1,\dots,\Sigma_N\}\subset\mathcal C_p(m,M)$, it follows that
\begin{align*}
\sup_{\Sigma\in\mathcal C_p(m,M)}
\mathbb E_\Sigma\left[\|\widetilde\Sigma-\Sigma\|_{\mathrm{op}}\right]
\ge
\frac{\sqrt 3}{4}\alpha\kappa
\left(\sqrt{\frac{p}{n}}\wedge 1\right).
\end{align*}
Finally, because the estimator $\widetilde\Sigma$ was arbitrary, taking the infimum over all measurable estimators gives the desired bound with
\begin{align*}
c(m,M):=\frac{\sqrt 3}{4}\alpha\kappa>0.
\end{align*}
[/step]