[proofplan]
For fixed positive definite $\Sigma$, the part of the log-likelihood depending on $B$ is the negative of a weighted residual sum of squares. We show that the ordinary least-squares matrix $\hat{B} = (X^\top X)^{-1}X^\top Y$ satisfies the normal equations. Then we expand the weighted residual criterion around $\hat{B}$ and prove that every deviation $B-\hat{B}$ increases the criterion by a non-negative term, which vanishes only when $B=\hat{B}$.
[/proofplan]
[step:Write the likelihood as a weighted residual criterion]
For $B \in \mathbb{R}^{p \times m}$, define the residual matrix
\begin{align*}
R(B) := Y - XB \in \mathbb{R}^{n \times m}.
\end{align*}
Since the rows of $E$ are independent Gaussian vectors in $\mathbb{R}^m$ with common covariance matrix $\Sigma$, the log-likelihood as a function of $B$ and fixed $\Sigma$ differs from a constant independent of $B$ by
\begin{align*}
\ell(B;\Sigma)
=
-\frac{1}{2}\operatorname{tr}\!\left(\Sigma^{-1}R(B)^\top R(B)\right).
\end{align*}
The term $-\frac{n}{2}\log\det\Sigma$ is independent of $B$, so maximizing $\ell(B;\Sigma)$ over $B \in \mathbb{R}^{p \times m}$ is equivalent to minimizing the function
\begin{align*}
Q_\Sigma:\mathbb{R}^{p \times m} &\to \mathbb{R},\\
B &\mapsto \operatorname{tr}\!\left(\Sigma^{-1}(Y-XB)^\top(Y-XB)\right).
\end{align*}
[/step]
[step:Verify that the least-squares matrix satisfies the normal equations]
Because $X$ has full column rank, $X^\top X \in \mathbb{R}^{p \times p}$ is invertible. Define
\begin{align*}
\hat{B} := (X^\top X)^{-1}X^\top Y \in \mathbb{R}^{p \times m}.
\end{align*}
Then
\begin{align*}
X^\top(Y-X\hat{B})
&= X^\top Y - X^\top X (X^\top X)^{-1}X^\top Y\\
&= X^\top Y - X^\top Y\\
&= 0.
\end{align*}
Thus $\hat{B}$ satisfies the normal equations
\begin{align*}
X^\top R(\hat{B}) = 0.
\end{align*}
[guided]
The candidate estimator is the same matrix that appears in ordinary least squares. The full column rank hypothesis is used exactly here: since the columns of $X$ are linearly independent, the Gram matrix $X^\top X$ is invertible, so the formula
\begin{align*}
\hat{B} := (X^\top X)^{-1}X^\top Y
\end{align*}
defines a matrix in $\mathbb{R}^{p \times m}$.
Now compute the residual normal equations. The residual at $\hat{B}$ is
\begin{align*}
R(\hat{B}) = Y-X\hat{B}.
\end{align*}
Multiplying on the left by $X^\top$ gives
\begin{align*}
X^\top R(\hat{B})
&= X^\top(Y-X\hat{B})\\
&= X^\top Y - X^\top X (X^\top X)^{-1}X^\top Y\\
&= X^\top Y - X^\top Y\\
&= 0.
\end{align*}
This orthogonality identity is the algebraic reason the covariance matrix $\Sigma$ will not affect the minimizing value of $B$.
[/guided]
[/step]
[step:Expand the criterion around the normal-equation solution]
Let $B \in \mathbb{R}^{p \times m}$ be arbitrary and define
\begin{align*}
D := B-\hat{B} \in \mathbb{R}^{p \times m}.
\end{align*}
Then
\begin{align*}
R(B) = Y-XB = Y-X\hat{B}-XD = R(\hat{B})-XD.
\end{align*}
Expanding the residual product gives
\begin{align*}
R(B)^\top R(B)
&= \bigl(R(\hat{B})-XD\bigr)^\top\bigl(R(\hat{B})-XD\bigr)\\
&= R(\hat{B})^\top R(\hat{B})
- R(\hat{B})^\top XD
- D^\top X^\top R(\hat{B})
+ D^\top X^\top XD.
\end{align*}
Since $X^\top R(\hat{B})=0$, also $R(\hat{B})^\top X=0$, so the two cross terms vanish. Therefore
\begin{align*}
Q_\Sigma(B)
=
Q_\Sigma(\hat{B})
+
\operatorname{tr}\!\left(\Sigma^{-1}D^\top X^\top XD\right).
\end{align*}
[/step]
[step:Show that the remainder term is non-negative and vanishes only at $\hat{B}$]
Let $A := X^\top X \in \mathbb{R}^{p \times p}$. Since $X$ has full column rank, $A$ is positive definite. Since $\Sigma$ is positive definite, $\Sigma^{-1}$ is positive definite.
Let $A^{1/2} \in \mathbb{R}^{p \times p}$ and $\Sigma^{-1/2} \in \mathbb{R}^{m \times m}$ denote the positive definite square roots of $A$ and $\Sigma^{-1}$, respectively. Using cyclic invariance of trace for finite matrices,
\begin{align*}
\operatorname{tr}\!\left(\Sigma^{-1}D^\top A D\right)
&=
\operatorname{tr}\!\left(\Sigma^{-1/2}D^\top A D\Sigma^{-1/2}\right)\\
&=
\operatorname{tr}\!\left((A^{1/2}D\Sigma^{-1/2})^\top(A^{1/2}D\Sigma^{-1/2})\right)\\
&=
\sum_{i=1}^{p}\sum_{j=1}^{m}
\left(A^{1/2}D\Sigma^{-1/2}\right)_{ij}^{2}
\geq 0.
\end{align*}
Equality holds if and only if $A^{1/2}D\Sigma^{-1/2}=0$. Since $A^{1/2}$ and $\Sigma^{-1/2}$ are invertible, this is equivalent to $D=0$, hence to $B=\hat{B}$.
[guided]
The expansion from the previous step reduced the problem to the sign of one matrix trace:
\begin{align*}
\operatorname{tr}\!\left(\Sigma^{-1}D^\top X^\top XD\right).
\end{align*}
Set
\begin{align*}
A := X^\top X.
\end{align*}
The full column rank of $X$ implies that $A$ is positive definite, because for every nonzero $v \in \mathbb{R}^p$,
\begin{align*}
v^\top A v = v^\top X^\top Xv = |Xv|^2 > 0.
\end{align*}
Also, because $\Sigma$ is positive definite, its inverse $\Sigma^{-1}$ is positive definite.
We now rewrite the trace as a sum of squares. Let $A^{1/2}$ be the positive definite square root of $A$, and let $\Sigma^{-1/2}$ be the positive definite square root of $\Sigma^{-1}$. Then cyclic invariance of trace gives
\begin{align*}
\operatorname{tr}\!\left(\Sigma^{-1}D^\top A D\right)
&=
\operatorname{tr}\!\left(\Sigma^{-1/2}D^\top A D\Sigma^{-1/2}\right)\\
&=
\operatorname{tr}\!\left((A^{1/2}D\Sigma^{-1/2})^\top(A^{1/2}D\Sigma^{-1/2})\right).
\end{align*}
The final trace is the sum of the squares of all entries of the matrix $A^{1/2}D\Sigma^{-1/2}$:
\begin{align*}
\operatorname{tr}\!\left((A^{1/2}D\Sigma^{-1/2})^\top(A^{1/2}D\Sigma^{-1/2})\right)
=
\sum_{i=1}^{p}\sum_{j=1}^{m}
\left(A^{1/2}D\Sigma^{-1/2}\right)_{ij}^{2}
\geq 0.
\end{align*}
This sum is zero exactly when every entry of $A^{1/2}D\Sigma^{-1/2}$ is zero. Since both $A^{1/2}$ and $\Sigma^{-1/2}$ are invertible, this happens exactly when $D=0$, that is, exactly when $B=\hat{B}$.
[/guided]
[/step]
[step:Conclude uniqueness of the maximum likelihood estimator]
For every $B \in \mathbb{R}^{p \times m}$,
\begin{align*}
Q_\Sigma(B)
=
Q_\Sigma(\hat{B})
+
\operatorname{tr}\!\left(\Sigma^{-1}(B-\hat{B})^\top X^\top X(B-\hat{B})\right)
\geq
Q_\Sigma(\hat{B}),
\end{align*}
with equality if and only if $B=\hat{B}$. Thus $\hat{B}$ is the unique minimizer of $Q_\Sigma$ and therefore the unique maximizer of the log-likelihood over $B$ for fixed $\Sigma$.
The formula
\begin{align*}
\hat{B} = (X^\top X)^{-1}X^\top Y
\end{align*}
contains only $X$ and $Y$. Hence the maximum likelihood estimator of the coefficient matrix does not depend on the positive definite covariance matrix $\Sigma$.
[/step]