[proofplan]
The argument reduces the statistic to the squared Euclidean norm of the vector of the first $m$ residual autocorrelations. Under the regularity hypotheses for fitted causal invertible ARMA models, this vector has an asymptotic centered Gaussian distribution whose covariance is an orthogonal projection of rank $m-p-q$, because estimating $p+q$ parameters removes $p+q$ asymptotic degrees of freedom. Diagonalising that projection turns the limiting quadratic form into the sum of squares of $m-p-q$ independent standard normal variables, which is precisely a $\chi^2_{m-p-q}$ random variable.
[/proofplan]
[step:Collect the residual autocorrelations into one asymptotic vector]
Let $(\Omega,\mathcal F,\mathbb P)$ denote the probability space on which the observed time series, fitted residuals, and residual autocorrelations are defined. Define the residual autocorrelation vector $\hat r_{m,n}: \Omega \to \mathbb{R}^{m}$ by
\begin{align*}
\hat r_{m,n}
=
(\hat r_1,\dots,\hat r_m).
\end{align*}
Then the Box–Pierce statistic can be written as the squared Euclidean norm
\begin{align*}
Q_m
=
n|\hat r_{m,n}|^2
=
|\sqrt{n}\,\hat r_{m,n}|^2.
\end{align*}
The regularity assumptions in the theorem include the residual autocorrelation central limit theorem for fitted causal invertible $\operatorname{ARMA}(p,q)$ models under the white-noise null. Its hypotheses are in force here: $m$ is fixed, $m>p+q$, the fitted ARMA model is causal and invertible, the innovation and estimation regularity conditions needed for asymptotic normality hold, and the residuals are white noise under the null. Hence there exists a matrix $P_m \in \mathbb{R}^{m \times m}$ and a random vector $Z_m:\Omega\to\mathbb R^m$ such that
\begin{align*}
\sqrt{n}\,\hat r_{m,n}
\xrightarrow{d}
Z_m,
\qquad
Z_m \sim \mathcal{N}(0,P_m),
\end{align*}
where the same theorem identifies $P_m$ as the symmetric idempotent covariance projection left after projecting away the $p+q$ estimated ARMA-parameter directions:
\begin{align*}
P_m^2=P_m,
\qquad
P_m^\top=P_m,
\qquad
\operatorname{rank}(P_m)=m-p-q.
\end{align*}
[guided]
Let $(\Omega,\mathcal F,\mathbb P)$ be the probability space on which the observed time series, fitted residuals, and residual autocorrelations are defined. The statistic $Q_m$ only depends on the first $m$ residual autocorrelations, so we package them into a single random vector
\begin{align*}
\hat r_{m,n}: \Omega \to \mathbb{R}^{m},
\qquad
\hat r_{m,n}
=
(\hat r_1,\dots,\hat r_m).
\end{align*}
With this notation, the definition of $Q_m$ becomes
\begin{align*}
Q_m
=
n\sum_{k=1}^{m}\hat r_k^2
=
n|\hat r_{m,n}|^2
=
|\sqrt{n}\,\hat r_{m,n}|^2.
\end{align*}
The substantive asymptotic input is the residual autocorrelation central limit theorem included among the theorem's regularity assumptions. We check the hypotheses being used: $m$ is fixed with $m>p+q$, the fitted model is causal and invertible, the innovation and estimation regularity assumptions required for asymptotic normality are assumed, and the null hypothesis says that the fitted residuals are white noise. Therefore the theorem applies to the random vector $\hat r_{m,n}$ and gives a random vector $Z_m:\Omega\to\mathbb R^m$ and a matrix $P_m\in\mathbb R^{m\times m}$ such that
\begin{align*}
\sqrt{n}\,\hat r_{m,n}
\xrightarrow{d}
Z_m,
\qquad
Z_m \sim \mathcal{N}(0,P_m).
\end{align*}
The same central limit theorem identifies the covariance matrix $P_m$ as the orthogonal projection onto the residual-autocorrelation directions not removed by estimating the $p+q$ ARMA parameters. Equivalently,
\begin{align*}
P_m^2=P_m,
\qquad
P_m^\top=P_m,
\qquad
\operatorname{rank}(P_m)=m-p-q.
\end{align*}
This is exactly where the loss of $p+q$ degrees of freedom enters the Box-Pierce approximation.
[/guided]
[/step]
[step:Diagonalise the projection covariance]
Since $P_m$ is a real symmetric matrix, there exists an orthogonal matrix $U_m \in \mathbb{R}^{m \times m}$ and real eigenvalues $\lambda_1,\dots,\lambda_m$ such that
\begin{align*}
P_m
=
U_m
\operatorname{diag}(\lambda_1,\dots,\lambda_m)
U_m^\top.
\end{align*}
Because $P_m^2=P_m$, each eigenvalue satisfies $\lambda_j^2=\lambda_j$, hence $\lambda_j \in \{0,1\}$. Since $\operatorname{rank}(P_m)=m-p-q$, exactly $m-p-q$ of the eigenvalues equal $1$ and exactly $p+q$ of them equal $0$. Reordering the orthonormal eigenbasis if necessary,
\begin{align*}
P_m
=
U_m
\operatorname{diag}(\underbrace{1,\dots,1}_{m-p-q},\underbrace{0,\dots,0}_{p+q})
U_m^\top.
\end{align*}
[guided]
The covariance matrix $P_m$ is an orthogonal projection, and the useful way to understand a projection is to choose coordinates adapted to its range and kernel. Since $P_m$ is real and symmetric, it has an orthonormal eigenbasis. Therefore there is an orthogonal matrix $U_m \in \mathbb{R}^{m \times m}$ and real numbers $\lambda_1,\dots,\lambda_m$ such that
\begin{align*}
P_m
=
U_m
\operatorname{diag}(\lambda_1,\dots,\lambda_m)
U_m^\top.
\end{align*}
The idempotence condition $P_m^2=P_m$ restricts the possible eigenvalues. If $v \in \mathbb{R}^m$ is an eigenvector with eigenvalue $\lambda_j$, then
\begin{align*}
P_m^2v
=
P_m(\lambda_j v)
=
\lambda_j^2 v,
\end{align*}
while $P_mv=\lambda_j v$. Since $P_m^2=P_m$, we get
\begin{align*}
\lambda_j^2 v=\lambda_j v.
\end{align*}
Because $v \neq 0$, this implies $\lambda_j^2=\lambda_j$, so $\lambda_j \in \{0,1\}$. The rank of a projection is the number of eigenvalues equal to $1$, counted with multiplicity. Since $\operatorname{rank}(P_m)=m-p-q$, exactly $m-p-q$ eigenvalues equal $1$ and the remaining $p+q$ eigenvalues equal $0$. After reordering the eigenbasis, we obtain
\begin{align*}
P_m
=
U_m
\operatorname{diag}(\underbrace{1,\dots,1}_{m-p-q},\underbrace{0,\dots,0}_{p+q})
U_m^\top.
\end{align*}
[/guided]
[/step]
[step:Represent the limiting Gaussian vector by independent standard normals]
Let $Y_m:\Omega \to \mathbb{R}^{m}$ be defined by
\begin{align*}
Y_m
=
U_m^\top Z_m.
\end{align*}
Since $U_m$ is orthogonal and $Z_m \sim \mathcal{N}(0,P_m)$, the vector $Y_m$ is centered Gaussian with covariance
\begin{align*}
\operatorname{Cov}(Y_m)
=
U_m^\top P_m U_m
=
\operatorname{diag}(\underbrace{1,\dots,1}_{m-p-q},\underbrace{0,\dots,0}_{p+q}).
\end{align*}
Thus there exist independent standard normal random variables $G_1,\dots,G_{m-p-q}$ such that
\begin{align*}
Y_m
=
(G_1,\dots,G_{m-p-q},0,\dots,0)
\end{align*}
in distribution. Since orthogonal transformations preserve Euclidean norm,
\begin{align*}
|Z_m|^2
=
|Y_m|^2
=
\sum_{j=1}^{m-p-q}G_j^2
\end{align*}
in distribution.
[/step]
[step:Pass the asymptotic distribution through the quadratic map]
Define the continuous function $\varphi:\mathbb{R}^{m}\to\mathbb{R}$ by
\begin{align*}
\varphi(x)=|x|^2.
\end{align*}
Since $\sqrt{n}\,\hat r_{m,n}\xrightarrow{d}Z_m$ and $\varphi$ is continuous, the continuous mapping theorem gives
\begin{align*}
Q_m
=
|\sqrt{n}\,\hat r_{m,n}|^2
=
\varphi(\sqrt{n}\,\hat r_{m,n})
\xrightarrow{d}
\varphi(Z_m)
=
|Z_m|^2.
\end{align*}
By the preceding step,
\begin{align*}
|Z_m|^2
\sim
\sum_{j=1}^{m-p-q}G_j^2.
\end{align*}
The definition of the chi-square distribution gives
\begin{align*}
\sum_{j=1}^{m-p-q}G_j^2
\sim
\chi^2_{m-p-q}.
\end{align*}
Therefore
\begin{align*}
Q_m \xrightarrow{d} \chi^2_{m-p-q}.
\end{align*}
This proves the Box–Pierce portmanteau approximation for fixed $m$ under the white-noise null hypothesis.
[guided]
We now transfer the vector convergence to the scalar statistic. Define the continuous map
\begin{align*}
\varphi:\mathbb{R}^{m} &\to \mathbb{R} \\
x &\mapsto |x|^2.
\end{align*}
The statistic is exactly this map applied to the normalized residual autocorrelation vector:
\begin{align*}
Q_m
=
|\sqrt{n}\,\hat r_{m,n}|^2
=
\varphi(\sqrt{n}\,\hat r_{m,n}).
\end{align*}
Since
\begin{align*}
\sqrt{n}\,\hat r_{m,n}
\xrightarrow{d}
Z_m
\end{align*}
and $\varphi$ is continuous, the continuous mapping theorem yields
\begin{align*}
Q_m
=
\varphi(\sqrt{n}\,\hat r_{m,n})
\xrightarrow{d}
\varphi(Z_m)
=
|Z_m|^2.
\end{align*}
It remains only to identify the law of $|Z_m|^2$. From the diagonalisation of the projection covariance, we already know that $Z_m$ has only $m-p-q$ non-degenerate independent standard normal coordinates after an orthogonal change of variables. Thus, for independent standard normal random variables $G_1,\dots,G_{m-p-q}$,
\begin{align*}
|Z_m|^2
\sim
\sum_{j=1}^{m-p-q}G_j^2.
\end{align*}
By definition, the sum of squares of $m-p-q$ independent standard normal random variables has the chi-square distribution with $m-p-q$ degrees of freedom:
\begin{align*}
\sum_{j=1}^{m-p-q}G_j^2
\sim
\chi^2_{m-p-q}.
\end{align*}
Combining these identities gives
\begin{align*}
Q_m \xrightarrow{d} \chi^2_{m-p-q}.
\end{align*}
This is the claimed asymptotic distribution of the Box–Pierce statistic.
[/guided]
[/step]