[proofplan]
We rewrite the hypothesis part using the matrix-normal distribution of $C\hat B M$. The row covariance of this estimator is $C(X^\top X)^{-1}C^\top$, so premultiplying by its inverse square root reduces the hypothesis sum of squares to the cross-product of an $r \times s$ matrix with independent $\mathcal{N}_s(0,\Omega_M)$ rows. The residual matrix is the projection of the noise onto the orthogonal complement of the column space of $X$, and after applying $M$ its cross-product is a Wishart matrix with $n-q$ degrees of freedom. Orthogonality of the projection onto $\operatorname{Range}(X)$ and its complement gives independence of the hypothesis and error parts.
[/proofplan]
[step:Standardize the estimable hypothesis matrix]
Define
\begin{align*}
A_C := C(X^\top X)^{-1}C^\top \in \mathbb{R}^{r \times r}.
\end{align*}
Since $X$ has rank $q$, the matrix $X^\top X$ is symmetric positive definite. Since $C$ has rank $r$, $A_C$ is symmetric positive definite. Let $A_C^{-1/2}$ denote the symmetric positive definite inverse square root of $A_C$. Let $(\Omega_{\mathrm{prob}},\mathcal{F}_{\mathrm{prob}},\mathbb{P})$ denote the probability space on which the random error matrix $U: \Omega_{\mathrm{prob}} \to \mathbb{R}^{n \times p}$ is defined.
The least-squares estimator satisfies
\begin{align*}
\hat B = B + (X^\top X)^{-1}X^\top U.
\end{align*}
Therefore
\begin{align*}
C\hat B M = CBM + C(X^\top X)^{-1}X^\top U M.
\end{align*}
Under $H_0$, $CBM=0$, so
\begin{align*}
C\hat B M = C(X^\top X)^{-1}X^\top U M.
\end{align*}
Define the standardized hypothesis matrix
\begin{align*}
Z: \Omega_{\mathrm{prob}} &\to \mathbb{R}^{r \times s} \\
\omega &\mapsto A_C^{-1/2}C\hat B(\omega)M.
\end{align*}
Because $UM$ has independent rows with distribution $\mathcal{N}_s(0,\Omega_M)$ and
\begin{align*}
A_C^{-1/2}C(X^\top X)^{-1}X^\top
\end{align*}
has row Gram matrix
\begin{align*}
A_C^{-1/2}C(X^\top X)^{-1}X^\top X(X^\top X)^{-1}C^\top A_C^{-1/2}
= A_C^{-1/2}A_C A_C^{-1/2}
= I_r,
\end{align*}
the rows of $Z$ are independent $\mathcal{N}_s(0,\Omega_M)$ random vectors. Hence
\begin{align*}
H_M
= (A_C^{-1/2}C\hat B M)^\top(A_C^{-1/2}C\hat B M)
= Z^\top Z.
\end{align*}
Thus
\begin{align*}
H_M \sim W_s(\Omega_M,r).
\end{align*}
[guided]
The role of $A_C$ is to measure the covariance among the $r$ linear hypothesis directions. We define
\begin{align*}
A_C := C(X^\top X)^{-1}C^\top.
\end{align*}
This matrix is invertible because $X^\top X$ is symmetric positive definite and $C$ has full row rank. Thus the inverse square root $A_C^{-1/2}$ exists and standardizes the hypothesis directions. Let $(\Omega_{\mathrm{prob}},\mathcal{F}_{\mathrm{prob}},\mathbb{P})$ denote the probability space on which the random error matrix $U: \Omega_{\mathrm{prob}} \to \mathbb{R}^{n \times p}$ is defined.
The least-squares estimator is
\begin{align*}
\hat B = (X^\top X)^{-1}X^\top Y
= (X^\top X)^{-1}X^\top(XB+U)
= B + (X^\top X)^{-1}X^\top U.
\end{align*}
Multiplying on the left by $C$ and on the right by $M$ gives
\begin{align*}
C\hat B M = CBM + C(X^\top X)^{-1}X^\top U M.
\end{align*}
Under the null hypothesis $CBM=0$, this becomes
\begin{align*}
C\hat B M = C(X^\top X)^{-1}X^\top U M.
\end{align*}
Now define
\begin{align*}
Z: \Omega_{\mathrm{prob}} &\to \mathbb{R}^{r \times s} \\
\omega &\mapsto A_C^{-1/2}C\hat B(\omega)M.
\end{align*}
The rows of $UM$ are independent centered normal vectors with covariance
\begin{align*}
\Omega_M = M^\top \Sigma M.
\end{align*}
The row covariance of $Z$ is the Gram matrix of the deterministic multiplier:
\begin{align*}
A_C^{-1/2}C(X^\top X)^{-1}X^\top X(X^\top X)^{-1}C^\top A_C^{-1/2}
= A_C^{-1/2}A_C A_C^{-1/2}
= I_r.
\end{align*}
Thus $Z$ has independent rows, each distributed as $\mathcal{N}_s(0,\Omega_M)$. Its cross-product is exactly
\begin{align*}
Z^\top Z
&= M^\top \hat B^\top C^\top A_C^{-1/2}A_C^{-1/2}C\hat B M \\
&= M^\top \hat B^\top C^\top A_C^{-1}C\hat B M \\
&= H_M.
\end{align*}
Therefore $H_M$ is a central Wishart matrix with scale $\Omega_M$ and $r$ degrees of freedom:
\begin{align*}
H_M \sim W_s(\Omega_M,r).
\end{align*}
[/guided]
[/step]
[step:Represent the residual cross-product as a projected Gaussian cross-product]
Define the projection matrices
\begin{align*}
P_X := X(X^\top X)^{-1}X^\top \in \mathbb{R}^{n \times n},
\qquad
Q_X := I_n - P_X \in \mathbb{R}^{n \times n}.
\end{align*}
Then
\begin{align*}
Y-X\hat B = (I_n-P_X)Y = Q_X(XB+U)=Q_XU,
\end{align*}
because $Q_XX=0$. Hence
\begin{align*}
E_M = M^\top U^\top Q_X U M = (Q_XUM)^\top(Q_XUM).
\end{align*}
The matrix $Q_X$ is symmetric idempotent with rank $n-q$. Therefore there exists an orthogonal matrix $O \in \mathbb{R}^{n \times n}$ such that
\begin{align*}
Q_X = O^\top
\begin{pmatrix}
I_{n-q} & 0 \\
0 & 0_q
\end{pmatrix}
O.
\end{align*}
Since left multiplication by an orthogonal matrix preserves independence and the common covariance of the rows of $UM$, the first $n-q$ rows of $OUM$ are independent $\mathcal{N}_s(0,\Omega_M)$ random vectors, and the remaining rows do not contribute to the cross-product. Therefore
\begin{align*}
E_M \sim W_s(\Omega_M,n-q).
\end{align*}
[guided]
The residual matrix is obtained by projecting $Y$ away from the column space of $X$. Define
\begin{align*}
P_X := X(X^\top X)^{-1}X^\top,
\qquad
Q_X := I_n-P_X.
\end{align*}
The matrix $P_X$ is the [orthogonal projection](/theorems/437) onto $\operatorname{Range}(X)$, and $Q_X$ is the orthogonal projection onto $\operatorname{Range}(X)^\perp$. Since $\hat B=(X^\top X)^{-1}X^\top Y$, we have
\begin{align*}
X\hat B = X(X^\top X)^{-1}X^\top Y = P_XY.
\end{align*}
Thus
\begin{align*}
Y-X\hat B = Y-P_XY = Q_XY.
\end{align*}
Substituting $Y=XB+U$ gives
\begin{align*}
Y-X\hat B = Q_X(XB+U)=Q_XXB+Q_XU.
\end{align*}
Because $P_XX=X$, we have $Q_XX=(I_n-P_X)X=0$, so
\begin{align*}
Y-X\hat B = Q_XU.
\end{align*}
After projecting through $M$, the error cross-product becomes
\begin{align*}
E_M = M^\top U^\top Q_X^\top Q_X U M.
\end{align*}
Since $Q_X$ is symmetric and idempotent, $Q_X^\top Q_X=Q_X$, hence
\begin{align*}
E_M = M^\top U^\top Q_X U M = (Q_XUM)^\top(Q_XUM).
\end{align*}
The matrix $Q_X$ is symmetric idempotent with rank $n-q$, so it is orthogonally diagonalizable with eigenvalue $1$ repeated $n-q$ times and eigenvalue $0$ repeated $q$ times. Therefore there is an orthogonal matrix $O \in \mathbb{R}^{n \times n}$ such that
\begin{align*}
Q_X = O^\top
\begin{pmatrix}
I_{n-q} & 0 \\
0 & 0_q
\end{pmatrix}
O.
\end{align*}
Let
\begin{align*}
V: \Omega_{\mathrm{prob}} &\to \mathbb{R}^{n \times s} \\
\omega &\mapsto O\,U(\omega)M.
\end{align*}
The rows of $UM$ are independent $\mathcal{N}_s(0,\Omega_M)$ random vectors, and orthogonal left multiplication preserves the joint centered Gaussian law and transforms the row covariance by $OO^\top=I_n$. Hence the rows of $V$ are independent $\mathcal{N}_s(0,\Omega_M)$ random vectors. The cross-product becomes the sum of the first $n-q$ row outer products:
\begin{align*}
E_M
= V^\top
\begin{pmatrix}
I_{n-q} & 0 \\
0 & 0_q
\end{pmatrix}
V.
\end{align*}
Thus $E_M$ is the cross-product of $n-q$ independent $\mathcal{N}_s(0,\Omega_M)$ rows, and therefore
\begin{align*}
E_M \sim W_s(\Omega_M,n-q).
\end{align*}
[/guided]
[/step]
[step:Use orthogonal projection to prove independence]
The standardized hypothesis matrix $Z$ depends on $U$ only through $P_XUM$, because
\begin{align*}
A_C^{-1/2}C(X^\top X)^{-1}X^\top U M
=
A_C^{-1/2}C(X^\top X)^{-1}X^\top P_X U M.
\end{align*}
The residual cross-product $E_M$ depends on $U$ only through $Q_XUM$. Since $P_XQ_X=0$, the Gaussian random matrices $P_XUM$ and $Q_XUM$ are independent. Therefore $Z$ and $E_M$ are independent, and hence $H_M=Z^\top Z$ and $E_M$ are independent.
[/step]
[step:Identify the resulting MANOVA hypothesis-error structure]
We have shown that, under $H_0$,
\begin{align*}
H_M \sim W_s(\Omega_M,r),
\qquad
E_M \sim W_s(\Omega_M,n-q),
\end{align*}
and that $H_M$ and $E_M$ are independent. The matrix $H_M$ is therefore the hypothesis sum-of-squares-and-products matrix for $r$ standardized hypothesis directions and $s$ response combinations, while $E_M$ is the corresponding error sum-of-squares-and-products matrix with $n-q$ residual degrees of freedom. This is precisely the central MANOVA hypothesis-error structure for the transformed responses $YM$ and the linear hypothesis $CBM=0$.
[/step]