[proofplan]
We first compute the distribution of the sample mean directly from the characteristic function of a multivariate normal vector. For the covariance estimator, we temporarily replace the empirical centering by the true centering and apply the [multivariate central limit theorem](/theorems/1854) to the vector of products $Y_iY_i^\top-\Sigma$, where $Y_i=X_i-\mu$. The Gaussian fourth moment formula gives the limiting covariance, and the correction from replacing $\mu$ by $\widehat{\mu}_n$ is negligible after multiplication by $\sqrt n$. Finally, an [orthogonal decomposition](/theorems/436) of the Gaussian sample shows that the sample mean and the centred residual covariance are independent, so the joint limit has independent blocks.
[/proofplan]
[step:Compute the exact distribution of the scaled sample mean]
Let $(\Omega, \mathcal{F}, \mathbb{P})$ denote the probability space on which the sequence $(X_i)_{i \in \mathbb{N}}$ is defined. Let $\mathcal{B}(\mathbb{R}^p)$ denote the Borel $\sigma$-algebra on $\mathbb{R}^p$, and for each $i \in \mathbb{N}$ regard $X_i$ as the measurable random vector
\begin{align*}
X_i:(\Omega, \mathcal{F}) &\to (\mathbb{R}^p, \mathcal{B}(\mathbb{R}^p)).
\end{align*}
For each $i \in \{1,\dots,n\}$, define the centred random vector
\begin{align*}
Y_i:\Omega &\to \mathbb{R}^p, \\
\omega &\mapsto X_i(\omega)-\mu .
\end{align*}
Then $Y_1,\dots,Y_n$ are independent and identically distributed as $\mathcal{N}_p(0,\Sigma)$. Define
\begin{align*}
Z_n:\Omega &\to \mathbb{R}^p, \\
\omega &\mapsto \sqrt{n}(\widehat{\mu}_n(\omega)-\mu)
= \frac{1}{\sqrt n}\sum_{i=1}^{n}Y_i(\omega).
\end{align*}
For $t \in \mathbb{R}^p$, the characteristic function of $Y_i$ is
\begin{align*}
\mathbb{E}\left[e^{i t^\top Y_i}\right]
=
\exp\left(-\frac{1}{2}t^\top \Sigma t\right).
\end{align*}
Using independence,
\begin{align*}
\mathbb{E}\left[e^{i t^\top Z_n}\right]
&=
\prod_{i=1}^{n}\mathbb{E}\left[e^{i t^\top Y_i/\sqrt n}\right] \\
&=
\left[
\exp\left(-\frac{1}{2n}t^\top \Sigma t\right)
\right]^n \\
&=
\exp\left(-\frac{1}{2}t^\top \Sigma t\right).
\end{align*}
This is the characteristic function of $\mathcal{N}_p(0,\Sigma)$, so $Z_n \sim \mathcal{N}_p(0,\Sigma)$ for every $n$. Hence
\begin{align*}
\sqrt n(\widehat{\mu}_n-\mu)\xrightarrow{d}\mathcal{N}_p(0,\Sigma).
\end{align*}
[/step]
[step:Apply the multivariate central limit theorem to the true-centred covariance]
Define the true-centred covariance fluctuation
\begin{align*}
A_n:\Omega &\to \mathbb{R}^{p \times p}, \\
\omega &\mapsto \frac{1}{n}\sum_{i=1}^{n}Y_i(\omega)Y_i(\omega)^\top-\Sigma .
\end{align*}
Also define the product vector
\begin{align*}
V_i:\Omega &\to \mathbb{R}^{p^2}, \\
\omega &\mapsto \operatorname{vec}\left(Y_i(\omega)Y_i(\omega)^\top-\Sigma\right).
\end{align*}
Then $V_1,V_2,\dots$ are independent and identically distributed, $\mathbb{E}[V_i]=0$, and every second moment is finite because Gaussian random variables have finite moments of all orders. Define the covariance matrix $\Gamma \in \mathbb{R}^{p^2 \times p^2}$ by
\begin{align*}
\Gamma := \operatorname{Cov}(V_1).
\end{align*}
The hypotheses of the multivariate [central limit theorem](/theorems/1848) are therefore satisfied: the summands are independent and identically distributed in $\mathbb{R}^{p^2}$, have mean zero, and have finite second moments. The multivariate [central limit theorem](/theorems/521) gives
\begin{align*}
\sqrt n\,\operatorname{vec}(A_n)
=
\frac{1}{\sqrt n}\sum_{i=1}^{n}V_i
\xrightarrow{d}
\mathcal{N}_{p^2}(0,\Gamma).
\end{align*}
In coordinates, this definition of $\Gamma$ means
\begin{align*}
\Gamma_{(j,k),(l,m)}
=
\operatorname{Cov}(Y_{1,j}Y_{1,k},Y_{1,l}Y_{1,m}).
\end{align*}
It remains to compute this covariance. Since $Y_1$ is centred Gaussian, its moment generating function
\begin{align*}
M:\mathbb{R}^p &\to \mathbb{R}, \\
t &\mapsto \mathbb{E}\left[e^{t^\top Y_1}\right]
=
\exp\left(\frac{1}{2}t^\top\Sigma t\right)
\end{align*}
is smooth. Differentiating four times at $t=0$ gives the Gaussian fourth moment identity
\begin{align*}
\mathbb{E}[Y_{1,j}Y_{1,k}Y_{1,l}Y_{1,m}]
=
\Sigma_{jk}\Sigma_{lm}
+
\Sigma_{jl}\Sigma_{km}
+
\Sigma_{jm}\Sigma_{kl}.
\end{align*}
Therefore
\begin{align*}
\Gamma_{(j,k),(l,m)}
&=
\mathbb{E}[Y_{1,j}Y_{1,k}Y_{1,l}Y_{1,m}]
-
\mathbb{E}[Y_{1,j}Y_{1,k}]
\mathbb{E}[Y_{1,l}Y_{1,m}] \\
&=
\left(
\Sigma_{jk}\Sigma_{lm}
+
\Sigma_{jl}\Sigma_{km}
+
\Sigma_{jm}\Sigma_{kl}
\right)
-
\Sigma_{jk}\Sigma_{lm} \\
&=
\Sigma_{jl}\Sigma_{km}+\Sigma_{jm}\Sigma_{kl}.
\end{align*}
[/step]
[step:Show that empirical centering changes the covariance estimator by a negligible term]
Define
\begin{align*}
\delta_n:\Omega &\to \mathbb{R}^p, \\
\omega &\mapsto \widehat{\mu}_n(\omega)-\mu .
\end{align*}
For every $i \in \{1,\dots,n\}$,
\begin{align*}
X_i-\widehat{\mu}_n=Y_i-\delta_n.
\end{align*}
Expanding the outer product and using $\frac{1}{n}\sum_{i=1}^{n}Y_i=\delta_n$,
\begin{align*}
\widehat{\Sigma}_n
&=
\frac{1}{n}\sum_{i=1}^{n}(Y_i-\delta_n)(Y_i-\delta_n)^\top \\
&=
\frac{1}{n}\sum_{i=1}^{n}Y_iY_i^\top
-
\delta_n\delta_n^\top .
\end{align*}
Thus
\begin{align*}
\widehat{\Sigma}_n-\Sigma=A_n-\delta_n\delta_n^\top.
\end{align*}
Using the Frobenius norm $|\cdot|_F$ on $\mathbb{R}^{p\times p}$,
\begin{align*}
\left|\sqrt n\,\operatorname{vec}(\delta_n\delta_n^\top)\right|
=
\sqrt n\,|\delta_n\delta_n^\top|_F
=
\sqrt n\,|\delta_n|^2
=
\frac{1}{\sqrt n}\left|\sqrt n\,\delta_n\right|^2.
\end{align*}
The sequence $\sqrt n\,\delta_n$ is tight because it has the fixed distribution $\mathcal{N}_p(0,\Sigma)$ by the first step. Hence
\begin{align*}
\sqrt n\,\operatorname{vec}(\delta_n\delta_n^\top)\xrightarrow{\mathbb{P}}0.
\end{align*}
The preceding convergence in probability and the distributional convergence of $\sqrt n\,\operatorname{vec}(A_n)$ satisfy the hypotheses of Slutsky's theorem. Hence
\begin{align*}
\sqrt n\,\operatorname{vec}(\widehat{\Sigma}_n-\Sigma)
=
\sqrt n\,\operatorname{vec}(A_n)
-
\sqrt n\,\operatorname{vec}(\delta_n\delta_n^\top)
\xrightarrow{d}
\mathcal{N}_{p^2}(0,\Gamma).
\end{align*}
[/step]
[step:Establish independence of the limiting mean and covariance components]
Let $\Sigma^{1/2}\in \mathbb{R}^{p\times p}$ denote the symmetric positive definite square root of $\Sigma$, and define
\begin{align*}
R_i:\Omega &\to \mathbb{R}^p, \\
\omega &\mapsto \Sigma^{-1/2}(X_i(\omega)-\mu).
\end{align*}
Then $R_1,\dots,R_n$ are independent standard normal random vectors in $\mathbb{R}^p$. Choose an orthogonal matrix $Q_n\in\mathbb{R}^{n\times n}$ whose first row is
\begin{align*}
\left(\frac{1}{\sqrt n},\dots,\frac{1}{\sqrt n}\right).
\end{align*}
For $a\in\{1,\dots,n\}$, define
\begin{align*}
W_a:\Omega &\to \mathbb{R}^p, \\
\omega &\mapsto \sum_{i=1}^{n}(Q_n)_{ai}R_i(\omega).
\end{align*}
Because the joint vector $(R_1,\dots,R_n)$ is centred Gaussian and the transformation induced by $Q_n$ is orthogonal, $W_1,\dots,W_n$ are independent standard normal random vectors in $\mathbb{R}^p$.
The first row of $Q_n$ gives
\begin{align*}
\sqrt n(\widehat{\mu}_n-\mu)=\Sigma^{1/2}W_1.
\end{align*}
The remaining rows span the orthogonal complement of the constant vector in $\mathbb{R}^n$, so the residual sum of squares satisfies
\begin{align*}
\widehat{\Sigma}_n
=
\Sigma^{1/2}
\left(
\frac{1}{n}\sum_{a=2}^{n}W_aW_a^\top
\right)
\Sigma^{1/2}.
\end{align*}
Thus $\sqrt n(\widehat{\mu}_n-\mu)$ is a measurable function of $W_1$, while $\widehat{\Sigma}_n$ is a measurable function of $(W_2,\dots,W_n)$. Since $W_1$ is independent of $(W_2,\dots,W_n)$, the random vectors $\sqrt n(\widehat{\mu}_n-\mu)$ and $\sqrt n\,\operatorname{vec}(\widehat{\Sigma}_n-\Sigma)$ are independent for each $n$.
We now pass from marginal convergence to joint convergence using characteristic functions. For $u \in \mathbb{R}^p$ and $v \in \mathbb{R}^{p^2}$, define
\begin{align*}
\phi_n(u,v)
:=
\mathbb{E}\left[
\exp\left(
i u^\top \sqrt n(\widehat{\mu}_n-\mu)
+ i v^\top \sqrt n\operatorname{vec}(\widehat{\Sigma}_n-\Sigma)
\right)
\right].
\end{align*}
The finite-sample independence just proved gives the factorisation
\begin{align*}
\phi_n(u,v)
&=
\mathbb{E}\left[\exp\left(i u^\top \sqrt n(\widehat{\mu}_n-\mu)\right)\right]
\mathbb{E}\left[\exp\left(i v^\top \sqrt n\operatorname{vec}(\widehat{\Sigma}_n-\Sigma)\right)\right].
\end{align*}
The first factor converges to $\exp(-u^\top\Sigma u/2)$ by the exact normal distribution of the scaled sample mean. The second factor converges to $\exp(-v^\top\Gamma v/2)$ by the covariance asymptotic normality proved above. Therefore
\begin{align*}
\phi_n(u,v)
\to
\exp\left(-\frac{1}{2}u^\top\Sigma u\right)
\exp\left(-\frac{1}{2}v^\top\Gamma v\right)
=
\exp\left(
-\frac{1}{2}
\begin{pmatrix} u \\ v \end{pmatrix}^\top
\begin{pmatrix}
\Sigma & 0 \\
0 & \Gamma
\end{pmatrix}
\begin{pmatrix} u \\ v \end{pmatrix}
\right).
\end{align*}
This is the characteristic function of the centred normal vector in $\mathbb{R}^{p+p^2}$ with block diagonal covariance matrix. Hence
\begin{align*}
\left(\sqrt{n}(\widehat{\mu}_n-\mu),\sqrt{n}\operatorname{vec}(\widehat{\Sigma}_n-\Sigma)\right)
\xrightarrow{d}
\mathcal{N}_{p+p^2}\left(0,
\begin{pmatrix}
\Sigma & 0 \\
0 & \Gamma
\end{pmatrix}
\right).
\end{align*}
The factorisation of the limiting characteristic function into a function of $u$ times a function of $v$ shows that the limiting mean component and limiting covariance component are independent.
[/step]