[proofplan]
We identify the ordinary Gaussian likelihood ratio as $\Lambda_n^{n/2}$, where $\Lambda_n$ is Wilks' determinant lambda computed from the sample covariance blocks. The unrestricted covariance model has $pq$ more free parameters than the null model, so Wilks' likelihood-ratio theorem gives the limiting $\chi^2_{pq}$ law for $-n\log\Lambda_n$; the statistic is well defined on the event, of probability one for all sufficiently large $n$, that the sample covariance determinants are nonzero. Multiplying first by $(n-1)/n$ and then by the Bartlett factor $\bigl(n-1-(p+q+1)/2\bigr)/(n-1)$ changes the statistic only by deterministic factors tending to $1$, so Slutsky's theorem preserves the same first-order limiting distribution.
[/proofplan]
[step:Identify Wilks' lambda as the Gaussian likelihood-ratio statistic]
For $x_1,\dots,x_n \in \mathbb{R}^{p+q}$, define the sample mean $\bar x_n\in\mathbb{R}^{p+q}$ and the unbiased sample covariance matrix $S_n\in\mathbb{R}^{(p+q)\times(p+q)}$ by
\begin{align*}
\bar x_n &:= \frac{1}{n}\sum_{i=1}^n x_i, \\
S_n &:= \frac{1}{n-1}\sum_{i=1}^n (x_i-\bar x_n)(x_i-\bar x_n)^\top.
\end{align*}
Write the block decomposition of $S_n$ according to the split $\mathbb{R}^{p+q}=\mathbb{R}^p\times\mathbb{R}^q$ as
\begin{align*}
S_n=
\begin{pmatrix}
S_{11,n} & S_{12,n} \\
S_{21,n} & S_{22,n}
\end{pmatrix},
\end{align*}
where $S_{11,n}\in\mathbb{R}^{p\times p}$ and $S_{22,n}\in\mathbb{R}^{q\times q}$ are the diagonal sample covariance blocks. Define the Gaussian log-likelihood function
\begin{align*}
\ell_n:\mathbb{R}^{p+q}\times \mathcal{S}_{++}^{p+q} &\to \mathbb{R} \\
(m,A) &\mapsto
-\frac{n(p+q)}{2}\log(2\pi)
-\frac{n}{2}\log\det A
-\frac{1}{2}\sum_{i=1}^n (x_i-m)^\top A^{-1}(x_i-m),
\end{align*}
where $\mathcal{S}_{++}^{p+q}$ denotes the set of positive definite symmetric $(p+q)\times(p+q)$ real matrices.
With this notation, the unrestricted maximum likelihood estimator of the mean is $\bar x_n$, and the unrestricted maximum likelihood estimator of the covariance matrix is
\begin{align*}
\widehat\Sigma_n := \frac{1}{n}\sum_{i=1}^n (x_i-\bar x_n)(x_i-\bar x_n)^\top
= \frac{n-1}{n}S_n.
\end{align*}
Under the null model $\Sigma_{12}=0$, the covariance matrix is block diagonal; the maximum likelihood estimator of the mean is still $\bar x_n$, and the restricted maximum likelihood estimator of the covariance matrix is
\begin{align*}
\widehat\Sigma_{0,n} :=
\begin{pmatrix}
\frac{n-1}{n}S_{11,n} & 0 \\
0 & \frac{n-1}{n}S_{22,n}
\end{pmatrix}.
\end{align*}
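Substituting these maximizers into the likelihood gives the likelihood ratio
\begin{align*}
\frac{\sup\{\exp(\ell_n(m,A)):\ (m,A)\in\mathbb{R}^{p+q}\times\mathcal{S}_{++}^{p+q},\ A_{12}=0\}}
{\sup\{\exp(\ell_n(m,A)):\ (m,A)\in\mathbb{R}^{p+q}\times\mathcal{S}_{++}^{p+q}\}}
=
\left(\frac{\det \widehat\Sigma_n}{\det \widehat\Sigma_{0,n}}\right)^{n/2}
=
\Lambda_n^{n/2},
\end{align*}
where Wilks' lambda is the determinant ratio
\begin{align*}
\Lambda_n
:=
\frac{\det \widehat\Sigma_n}{\det \widehat\Sigma_{0,n}}
=
\frac{\det S_n}{\det S_{11,n}\det S_{22,n}}.
\end{align*}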
Thus the statistic in the statement is Bartlett's corrected form of Wilks' likelihood-ratio statistic for the Gaussian independence hypothesis.
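As a check on the equality between the two determinant ratios, note that each maximum likelihood block is $(n-1)/n$ times the corresponding block of $S_n$, so the scalar factors cancel:
\begin{align*}
\det\widehat\Sigma_n &= \left(\frac{n-1}{n}\right)^{p+q}\det S_n, \\
\det\widehat\Sigma_{0,n} &= \left(\frac{n-1}{n}\right)^{p}\det S_{11,n}\cdot\left(\frac{n-1}{n}\right)^{q}\det S_{22,n}
= \left(\frac{n-1}{n}\right)^{p+q}\det S_{11,n}\det S_{22,n}.
\end{align*}
Hence the determinant ratio is the same whether it is computed from the maximum likelihood estimators or from the unbiased sample covariance blocks.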
[/step]
[step:Compute the number of restrictions imposed by the null model]
The unrestricted covariance parameter space $\mathcal{S}_{++}^{p+q}$ is an open subset of the real [vector space](/page/Vector%20Space) of symmetric $(p+q)\times(p+q)$ matrices, so its dimension is
\begin{align*}
\frac{(p+q)(p+q+1)}{2}.
\end{align*}
Under $H_0$, the covariance matrix is block diagonal with one positive definite $p\times p$ block and one positive definite $q\times q$ block. Hence the null covariance parameter space has dimension
\begin{align*}
\frac{p(p+1)}{2}+\frac{q(q+1)}{2}.
\end{align*}
The difference between these dimensions is
\begin{align*}
\frac{(p+q)(p+q+1)}{2}
-
\frac{p(p+1)}{2}
-
\frac{q(q+1)}{2}
=
pq.
\end{align*}
Therefore the null hypothesis imposes exactly $pq$ independent smooth restrictions on the unrestricted covariance model, namely the vanishing of the $pq$ entries of $\Sigma_{12}$.
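For a concrete instance of this count, take the values $p=2$ and $q=3$, chosen here purely for illustration:
\begin{align*}
\frac{(2+3)(2+3+1)}{2} = 15,
\qquad
\frac{2\cdot 3}{2}+\frac{3\cdot 4}{2} = 3+6 = 9,
\qquad
15-9 = 6 = 2\cdot 3.
\end{align*}
In this example the six restrictions are the six entries of the $2\times 3$ block $\Sigma_{12}$.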
[/step]
[step:Apply Wilks' theorem to obtain the uncorrected chi-squared limit]
Let the full parameter space be
\begin{align*}
\Theta := \mathbb{R}^{p+q}\times \mathcal{S}_{++}^{p+q},
\end{align*}
and let the null parameter space be
\begin{align*}
\Theta_0 := \{(m,A)\in \Theta : A_{12}=0\}.
\end{align*}
The mean parameter $m\in\mathbb{R}^{p+q}$ is unrestricted in both models, so it is a nuisance parameter common to the full and null parameter spaces. The multivariate normal model is a regular finite-dimensional parametric model: the true parameter $(m_0,A_0)$ lies in the interior of $\Theta$ because $A_0\in\mathcal{S}_{++}^{p+q}$; under $H_0$ it lies in the relative interior of $\Theta_0$, since the diagonal blocks $A_{0,11}$ and $A_{0,22}$ of the positive definite matrix $A_0$ are themselves positive definite; the log-likelihood is twice continuously differentiable in a neighbourhood of $(m_0,A_0)$; and the Fisher information for the Gaussian mean-covariance parameter is nonsingular. The null model is a smooth embedded submodel of codimension $pq$, as computed above.
Let $E_n$ be the event that $S_n$, $S_{11,n}$, and $S_{22,n}$ are positive definite. Under the nonsingular multivariate normal law, the centered observations $x_i-\bar x_n$ span $\mathbb{R}^{p+q}$ with probability $1$ whenever $n\ge p+q+1$, and their first $p$ and last $q$ coordinate projections span $\mathbb{R}^p$ and $\mathbb{R}^q$ with probability $1$ whenever $n\ge \max\{p+1,q+1\}$. Since $p+q+1\ge\max\{p+1,q+1\}$, we have $\mathbb{P}(E_n)=1$ for all $n\ge p+q+1$, so the determinant ratio defining $\Lambda_n$ is almost surely well defined for every such $n$.
Define the ordinary likelihood-ratio statistic
\begin{align*}
R_n :=
\frac{\sup\{\exp(\ell_n(m,A)):\ (m,A)\in\Theta_0\}}
{\sup\{\exp(\ell_n(m,A)):\ (m,A)\in\Theta\}}.
\end{align*}
By Wilks' likelihood-ratio theorem (citing a result not yet in the wiki: [Wilks' theorem](/theorems/1431)), applied to the regular model $\Theta$ and the embedded null submodel $\Theta_0$ of codimension $pq$, under $H_0$,
\begin{align*}
-2\log R_n
\xrightarrow{d}
\chi^2_{pq}.
\end{align*}
At the maximizers computed above, the normalizing constants and the residual quadratic terms take the same value in the numerator and the denominator, so they cancel, giving
\begin{align*}
R_n
&=
\left(\frac{\det \widehat\Sigma_n}{\det \widehat\Sigma_{0,n}}\right)^{n/2}
=
\Lambda_n^{n/2}.
\end{align*}
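The cancellation can be checked directly; the following is a short verification sketch using only the definitions above. For any $A\in\mathcal{S}_{++}^{p+q}$, the trace identity gives
\begin{align*}
\sum_{i=1}^n (x_i-\bar x_n)^\top A^{-1}(x_i-\bar x_n)
=
\operatorname{tr}\Bigl(A^{-1}\sum_{i=1}^n (x_i-\bar x_n)(x_i-\bar x_n)^\top\Bigr)
=
n\operatorname{tr}\bigl(A^{-1}\widehat\Sigma_n\bigr).
\end{align*}
At $A=\widehat\Sigma_n$ this equals $n\operatorname{tr}(I_{p+q})=n(p+q)$. At $A=\widehat\Sigma_{0,n}$, which is block diagonal with the same diagonal blocks as $\widehat\Sigma_n$, it equals $n\bigl(\operatorname{tr}(I_p)+\operatorname{tr}(I_q)\bigr)=n(p+q)$ as well. Consequently
\begin{align*}
\ell_n(\bar x_n,\widehat\Sigma_n)
&=
-\frac{n(p+q)}{2}\log(2\pi)-\frac{n}{2}\log\det\widehat\Sigma_n-\frac{n(p+q)}{2}, \\
\ell_n(\bar x_n,\widehat\Sigma_{0,n})
&=
-\frac{n(p+q)}{2}\log(2\pi)-\frac{n}{2}\log\det\widehat\Sigma_{0,n}-\frac{n(p+q)}{2},
\end{align*}
and subtracting the first line from the second leaves
\begin{align*}
\log R_n
=
\frac{n}{2}\log\frac{\det\widehat\Sigma_n}{\det\widehat\Sigma_{0,n}}.
\end{align*}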
Therefore
\begin{align*}
-2\log R_n = -n\log\Lambda_n,
\end{align*}
and [Wilks' theorem](/theorems/1864) gives
\begin{align*}
-n\log\Lambda_n \xrightarrow{d} \chi^2_{pq}.
\end{align*}
Since $(n-1)/n\to 1$, Slutsky's theorem yields
\begin{align*}
-(n-1)\log\Lambda_n
=
\frac{n-1}{n}\bigl(-n\log\Lambda_n\bigr)
\xrightarrow{d}
\chi^2_{pq}.
\end{align*}
[/step]
[step:Insert Bartlett's correction without changing the limiting law]
Define the deterministic correction factor
\begin{align*}
a_n := \frac{n-1-\frac{p+q+1}{2}}{n-1}.
\end{align*}
Since $p$ and $q$ are fixed,
\begin{align*}
\lim_{n\to\infty} a_n = 1.
\end{align*}
The Bartlett-corrected statistic can be written as
\begin{align*}
-\left(n-1-\frac{p+q+1}{2}\right)\log\Lambda_n
=
a_n\bigl(-(n-1)\log\Lambda_n\bigr).
\end{align*}
Since $a_n \to 1$ and $-(n-1)\log\Lambda_n \xrightarrow{d}\chi^2_{pq}$, Slutsky's theorem gives
\begin{align*}
-\left(n-1-\frac{p+q+1}{2}\right)\log\Lambda_n
\xrightarrow{d}
\chi^2_{pq}.
\end{align*}
This is precisely the asserted first-order large-sample chi-squared approximation with $pq$ degrees of freedom. The argument uses only that the Bartlett factor tends to $1$; it does not assert the stronger second-order Bartlett correction property.
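To give a sense of the size of the correction factor, take the values $p=2$ and $q=3$, assumed here only for illustration, so that $(p+q+1)/2=3$ and
\begin{align*}
a_n = \frac{n-4}{n-1} = 1-\frac{3}{n-1},
\qquad
a_{50} = \frac{46}{49}\approx 0.939,
\qquad
a_{500} = \frac{496}{499}\approx 0.994.
\end{align*}
In general $1-a_n=\frac{p+q+1}{2(n-1)}$, which is of order $1/n$ for fixed $p$ and $q$.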
[/step]