[proofplan]
We prove the result directly from the Gaussian MANOVA decomposition. Under $H_0$, the between-group matrix $B$ has rank-$g-1$ Wishart form and $\operatorname{tr}(\Sigma^{-1}B)$ has the $\chi^2_{p(g-1)}$ distribution, while the pooled within-group matrix satisfies $W/N \xrightarrow{\mathbb P} \Sigma$. Since $B = O_{\mathbb P}(1)$ and $W = N\Sigma + o_{\mathbb P}(N)$, the logarithm of Wilks' lambda has the first-order expansion $-N\log\Lambda = \operatorname{tr}(\Sigma^{-1}B) + o_{\mathbb P}(1)$. The Bartlett factor differs from $N$ only by a fixed additive constant, so it has the same limiting distribution.
[/proofplan]
[step:Declare the MANOVA matrices and rewrite Wilks' lambda]
For each group $i \in \{1,\dots,g\}$, define the sample mean $\bar X_i: (\Omega,\mathcal F) \to \mathbb R^p$ and the grand mean $\bar X: (\Omega,\mathcal F) \to \mathbb R^p$ by
\begin{align*}
\bar X_i &= \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij},
&
\bar X &= \frac{1}{N}\sum_{i=1}^g\sum_{j=1}^{n_i} X_{ij}.
\end{align*}
Define the within-group and between-group sum-of-squares-and-cross-products matrices $W,B: \Omega \to \mathbb R^{p\times p}$ by
\begin{align*}
W &= \sum_{i=1}^g\sum_{j=1}^{n_i}(X_{ij}-\bar X_i)(X_{ij}-\bar X_i)^\top,\\
B &= \sum_{i=1}^g n_i(\bar X_i-\bar X)(\bar X_i-\bar X)^\top.
\end{align*}
Wilks' lambda is
\begin{align*}
\Lambda = \frac{\det W}{\det(W+B)}.
\end{align*}
For all sufficiently large $N$, $W$ is invertible with probability tending to $1$, since $N-g \to \infty$ and $\Sigma$ is positive definite. On this event,
\begin{align*}
-\log\Lambda
&= \log\det(W+B)-\log\det W\\
&= \log\det\left(I_p+W^{-1/2}BW^{-1/2}\right),
\end{align*}
where $I_p \in \mathbb R^{p\times p}$ is the identity matrix and $W^{-1/2}$ denotes the symmetric inverse square root of the positive definite matrix $W$.
[/step]
[step:Identify the null distribution of the between-group matrix]
Under $H_0$, let $\mu \in \mathbb R^p$ denote the common mean. Define the standardized group-mean matrix $Y: \Omega \to \mathbb R^{g\times p}$ by giving its $i$th row as
\begin{align*}
Y_i = \sqrt{n_i}(\bar X_i-\mu)^\top\Sigma^{-1/2} \in \mathbb R^{1\times p}.
\end{align*}
The rows $Y_1,\dots,Y_g$ are independent $\mathcal N(0,I_p)$ row vectors, because the observations are independent Gaussian vectors with covariance $\Sigma$. Define $a_N \in \mathbb R^g$ by $(a_N)_i = \sqrt{n_i/N}$. Then $|a_N|^2=1$, and the matrix
\begin{align*}
P_N = I_g-a_Na_N^\top \in \mathbb R^{g\times g}
\end{align*}
is the [orthogonal projection](/theorems/437) onto the $(g-1)$-dimensional subspace $a_N^\perp \subset \mathbb R^g$. A direct substitution of the definitions gives
\begin{align*}
\Sigma^{-1/2}B\Sigma^{-1/2} = Y^\top P_NY.
\end{align*}
Choose an orthogonal matrix $Q_N \in \mathbb R^{g\times g}$ whose first $g-1$ rows form an orthonormal basis of $a_N^\perp$ and whose last row is $a_N^\top$. Orthogonal invariance of the standard Gaussian law gives that $Z=Q_NY: \Omega \to \mathbb R^{g\times p}$ again has independent $\mathcal N(0,I_p)$ rows. Since $Q_NP_NQ_N^\top=\operatorname{diag}(1,\dots,1,0)$,
\begin{align*}
\Sigma^{-1/2}B\Sigma^{-1/2}
= Z^\top \operatorname{diag}(1,\dots,1,0)Z
= \sum_{r=1}^{g-1} Z_r^\top Z_r,
\end{align*}
where $Z_r \in \mathbb R^{1\times p}$ is the $r$th row of $Z$. Therefore
\begin{align*}
\operatorname{tr}(\Sigma^{-1}B)
= \operatorname{tr}(\Sigma^{-1/2}B\Sigma^{-1/2})
= \sum_{r=1}^{g-1}\sum_{k=1}^p Z_{rk}^2
\sim \chi^2_{p(g-1)}.
\end{align*}
[guided]
The aim of this step is to turn the between-group variation into a sum of squares of independent standard normal variables. Under $H_0$, all group means have the same expectation $\mu \in \mathbb R^p$, and the Gaussian assumption gives
\begin{align*}
\bar X_i \sim \mathcal N\left(\mu,\frac{1}{n_i}\Sigma\right).
\end{align*}
Thus the standardized row vector
\begin{align*}
Y_i = \sqrt{n_i}(\bar X_i-\mu)^\top\Sigma^{-1/2}
\end{align*}
is a standard Gaussian row vector in $\mathbb R^p$, and the rows are independent because the groups are independent.
The grand mean removes one linear degree of freedom among the $g$ group means. This is encoded by the unit vector $a_N \in \mathbb R^g$ with entries $(a_N)_i = \sqrt{n_i/N}$. The orthogonal projection
\begin{align*}
P_N = I_g-a_Na_N^\top
\end{align*}
projects onto the subspace perpendicular to $a_N$, which has dimension $g-1$. Expanding $Y^\top P_NY$ gives
\begin{align*}
Y^\top P_NY
&= \sum_{i=1}^g n_i\Sigma^{-1/2}(\bar X_i-\mu)(\bar X_i-\mu)^\top\Sigma^{-1/2}\\
&\quad - N\Sigma^{-1/2}(\bar X-\mu)(\bar X-\mu)^\top\Sigma^{-1/2}\\
&= \Sigma^{-1/2}\left[\sum_{i=1}^g n_i(\bar X_i-\bar X)(\bar X_i-\bar X)^\top\right]\Sigma^{-1/2}\\
&= \Sigma^{-1/2}B\Sigma^{-1/2}.
\end{align*}
This verifies the matrix identity rather than assuming the Wishart form.
Now choose an orthogonal matrix $Q_N$ whose first $g-1$ rows span $a_N^\perp$ and whose last row is $a_N^\top$. Orthogonal multiplication preserves the joint standard Gaussian distribution of the rows, so $Z=Q_NY$ has independent standard Gaussian rows. Since $Q_NP_NQ_N^\top=\operatorname{diag}(1,\dots,1,0)$, we obtain
\begin{align*}
\Sigma^{-1/2}B\Sigma^{-1/2}
= \sum_{r=1}^{g-1}Z_r^\top Z_r.
\end{align*}
Taking the trace converts this matrix identity into a scalar sum of squares:
\begin{align*}
\operatorname{tr}(\Sigma^{-1}B)
= \sum_{r=1}^{g-1}\sum_{k=1}^p Z_{rk}^2.
\end{align*}
There are exactly $p(g-1)$ independent standard normal squares, so the distribution is $\chi^2_{p(g-1)}$.
[/guided]
[/step]
[step:Show that the within-group covariance consistently estimates $\Sigma$]
Define the pooled covariance matrix $S_N: \Omega \to \mathbb R^{p\times p}$ by
\begin{align*}
S_N = \frac{1}{N}W.
\end{align*}
For each fixed group $i$, the centered residuals $X_{ij}-\bar X_i$ have the usual Gaussian within-group sum-of-squares decomposition. Since $p$ and $g$ are fixed and $N-g \to \infty$, the law of large numbers for the independent centered Gaussian quadratic terms gives
\begin{align*}
S_N \xrightarrow{\mathbb P} \Sigma.
\end{align*}
Because $\Sigma$ is positive definite, matrix inversion is continuous at $\Sigma$, hence
\begin{align*}
S_N^{-1} \xrightarrow{\mathbb P} \Sigma^{-1}.
\end{align*}
The same Gaussian decomposition also gives $B=O_{\mathbb P}(1)$, since the preceding step identifies $\operatorname{tr}(\Sigma^{-1}B)$ as a fixed chi-squared random variable.
[/step]
[step:Expand the log determinant to first order]
Define $A_N: \Omega \to \mathbb R^{p\times p}$ by
\begin{align*}
A_N = W^{-1/2}BW^{-1/2}.
\end{align*}
Since $W=NS_N$, we have
\begin{align*}
A_N = \frac{1}{N}S_N^{-1/2}BS_N^{-1/2}.
\end{align*}
The convergence $S_N^{-1}\xrightarrow{\mathbb P}\Sigma^{-1}$ and the bound $B=O_{\mathbb P}(1)$ imply $A_N=O_{\mathbb P}(N^{-1})$. For symmetric matrices $A$ with $\|A\|_{\mathrm{op}}\leq 1/2$, the Taylor expansion of $\log(1+t)$ applied to the eigenvalues of $A$ gives
\begin{align*}
\left|\log\det(I_p+A)-\operatorname{tr}A\right|
\leq \operatorname{tr}(A^2).
\end{align*}
Applying this estimate to $A_N$ on an event whose probability tends to $1$ yields
\begin{align*}
-N\log\Lambda
&= N\log\det(I_p+A_N)\\
&= N\operatorname{tr}(A_N)+o_{\mathbb P}(1)\\
&= \operatorname{tr}(S_N^{-1}B)+o_{\mathbb P}(1).
\end{align*}
Finally,
\begin{align*}
\operatorname{tr}(S_N^{-1}B)-\operatorname{tr}(\Sigma^{-1}B)
= \operatorname{tr}\left((S_N^{-1}-\Sigma^{-1})B\right)
\xrightarrow{\mathbb P}0,
\end{align*}
because $S_N^{-1}-\Sigma^{-1}\xrightarrow{\mathbb P}0$ and $B=O_{\mathbb P}(1)$. Therefore
\begin{align*}
-N\log\Lambda
= \operatorname{tr}(\Sigma^{-1}B)+o_{\mathbb P}(1).
\end{align*}
[guided]
The determinant ratio is converted into a trace because the matrix perturbation is small. Since $W=NS_N$ and $S_N\xrightarrow{\mathbb P}\Sigma$, the matrix $W$ is of order $N$, whereas $B$ is of order one under $H_0$. Thus
\begin{align*}
A_N = W^{-1/2}BW^{-1/2}
= \frac{1}{N}S_N^{-1/2}BS_N^{-1/2}
\end{align*}
has operator norm of order $N^{-1}$ in probability.
The scalar expansion behind the argument is
\begin{align*}
\log(1+t)=t+O(t^2) \quad \text{as } t\to 0.
\end{align*}
For a symmetric matrix, applying this expansion to the eigenvalues gives
\begin{align*}
\left|\log\det(I_p+A)-\operatorname{tr}A\right|
\leq \operatorname{tr}(A^2)
\end{align*}
whenever $\|A\|_{\mathrm{op}}\leq 1/2$. This condition holds for $A=A_N$ with probability tending to $1$. Therefore
\begin{align*}
-N\log\Lambda
&= N\log\det(I_p+A_N)\\
&= N\operatorname{tr}(A_N)+o_{\mathbb P}(1).
\end{align*}
Using cyclic invariance of trace and $W=NS_N$ gives
\begin{align*}
N\operatorname{tr}(A_N)
= N\operatorname{tr}(W^{-1}B)
= \operatorname{tr}(S_N^{-1}B).
\end{align*}
The pooled covariance matrix satisfies $S_N^{-1}\xrightarrow{\mathbb P}\Sigma^{-1}$, and the between-group matrix is bounded in probability. Hence
\begin{align*}
\operatorname{tr}(S_N^{-1}B)
= \operatorname{tr}(\Sigma^{-1}B)+o_{\mathbb P}(1).
\end{align*}
Combining these estimates gives
\begin{align*}
-N\log\Lambda
= \operatorname{tr}(\Sigma^{-1}B)+o_{\mathbb P}(1).
\end{align*}
[/guided]
[/step]
[step:Replace $N$ by the Bartlett factor and conclude]
Define the Bartlett factor $c_N \in \mathbb R$ by
\begin{align*}
c_N = N-1-\frac{p+g}{2}.
\end{align*}
Since $p$ and $g$ are fixed,
\begin{align*}
\frac{c_N}{N} \to 1.
\end{align*}
From the previous step, $-N\log\Lambda = O_{\mathbb P}(1)$, so
\begin{align*}
-c_N\log\Lambda
= \frac{c_N}{N}(-N\log\Lambda)
= \operatorname{tr}(\Sigma^{-1}B)+o_{\mathbb P}(1).
\end{align*}
The second step showed that $\operatorname{tr}(\Sigma^{-1}B)\sim \chi^2_{p(g-1)}$. Hence
\begin{align*}
-\left(N-1-\frac{p+g}{2}\right)\log\Lambda
\xrightarrow{d} \chi^2_{p(g-1)}.
\end{align*}
This is the claimed large-residual-degrees-of-freedom chi-squared approximation.
[/step]