[proofplan]
We write the residual sum of squares as a quadratic form $Z^\top (I_n - P)Z$ in the centred vector $Z = Y - X\beta$, where $P = X(X^\top X)^{-1}X^\top$ is the hat matrix. We then verify the hypotheses of the quadratic-forms lemma: that $I_n - P$ is symmetric and idempotent with rank $n - p$, and that $Z \sim N_n(\mathbf{0}, \sigma^2 I_n)$. The lemma delivers $\mathrm{RSS} \sim \sigma^2 \chi^2_{n-p}$ immediately.
[/proofplan]
[step:Centre the response and identify $\mathrm{RSS}$ with a quadratic form in the hat matrix complement]
Let $P: \mathbb{R}^n \to \mathbb{R}^n$ be the orthogonal projection onto the column space of $X$, represented (since $X$ has full column rank $p$) by the matrix
\begin{align*}
P &= X(X^\top X)^{-1} X^\top \in \mathbb{R}^{n \times n}.
\end{align*}
The fitted values are $\hat{Y} = X\hat\beta = X(X^\top X)^{-1}X^\top Y = PY$, and the residual vector is $R = Y - \hat{Y} = (I_n - P)Y$. By definition,
\begin{align*}
\mathrm{RSS} &= R^\top R = Y^\top (I_n - P)^\top (I_n - P) Y.
\end{align*}
Define the centred vector $Z := Y - X\beta$. We claim $\mathrm{RSS} = Z^\top (I_n - P) Z$. Indeed, since $P X = X(X^\top X)^{-1}(X^\top X) = X$, we have $(I_n - P) X = 0$, and hence $(I_n - P)(X\beta) = 0$. Therefore $(I_n - P)Y = (I_n - P)(Z + X\beta) = (I_n - P)Z$, so
\begin{align*}
\mathrm{RSS} &= R^\top R = \big[(I_n - P)Z\big]^\top \big[(I_n - P)Z\big] = Z^\top (I_n - P)^\top (I_n - P) Z.
\end{align*}
To reduce this to the single quadratic form $Z^\top (I_n - P) Z$, we now verify that $I_n - P$ is symmetric and idempotent.
[guided]
Our goal is to express $\mathrm{RSS}$ in the form required by the [Quadratic Forms and Idempotent Matrices](/theorems/1441) lemma, which takes a quadratic form $Z^\top A Z$ with $Z \sim N_n(\mathbf{0}, \sigma^2 I_n)$ and $A$ symmetric idempotent, and returns its chi-squared distribution. We therefore need three ingredients: a matrix $A$, a centred normal vector $Z$, and the identification $\mathrm{RSS} = Z^\top A Z$.
The natural candidate for $A$ is the complement of the hat matrix, $I_n - P$, because the residual vector $R = (I_n - P)Y$ is the image of $Y$ under $I_n - P$. The natural candidate for $Z$ is the centred vector $Z := Y - X\beta$, since subtracting the mean converts $Y \sim N_n(X\beta, \sigma^2 I_n)$ into $Z \sim N_n(\mathbf{0}, \sigma^2 I_n)$ (we verify this distributional claim in the next step).
The connecting computation rests on a single structural fact: every vector of the form $X\beta$, that is, every element of the column space of $X$, is annihilated by $I_n - P$. To see this, observe that the hat matrix satisfies $PX = X(X^\top X)^{-1}(X^\top X) = X$, reflecting that the columns of $X$ lie in the column space of $X$ and $P$ acts as the identity there. Consequently $(I_n - P) X = X - X = 0$, which gives $(I_n - P)(X\beta) = 0$ for every $\beta \in \mathbb{R}^p$. This lets us swap $Y$ for $Z$ inside the residual formula:
\begin{align*}
(I_n - P)Y = (I_n - P)(Z + X\beta) = (I_n - P)Z + 0 = (I_n - P)Z.
\end{align*}
Taking the squared Euclidean norm of both sides and recalling $R = (I_n - P)Y$,
\begin{align*}
\mathrm{RSS} = \|R\|^2 = \|(I_n - P)Z\|^2 = Z^\top (I_n - P)^\top (I_n - P) Z.
\end{align*}
If $I_n - P$ is symmetric and idempotent, then $(I_n - P)^\top (I_n - P) = (I_n - P)(I_n - P) = I_n - P$, collapsing the product into a single factor and yielding $\mathrm{RSS} = Z^\top (I_n - P)Z$. We verify symmetry and idempotence in the next step.
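As a sanity check on this identification, consider the intercept-only model, used here purely as an illustrative special case that the proof does not need. Take $p = 1$ and $X = \mathbf{1}_n$, the vector of ones (our notation), so that $\beta \in \mathbb{R}$ is the common mean, and write $\bar{Y} = \frac{1}{n}\sum_{i=1}^n Y_i$ and $\bar{Z} = \frac{1}{n}\sum_{i=1}^n Z_i$. Then
\begin{align*}
P &= \mathbf{1}_n(\mathbf{1}_n^\top \mathbf{1}_n)^{-1}\mathbf{1}_n^\top = \tfrac{1}{n}\mathbf{1}_n\mathbf{1}_n^\top, & (I_n - P)Y &= Y - \bar{Y}\,\mathbf{1}_n.
\end{align*}
Since $Z_i = Y_i - \beta$ gives $\bar{Z} = \bar{Y} - \beta$, we have $Z_i - \bar{Z} = Y_i - \bar{Y}$ for every $i$, and therefore
\begin{align*}
\mathrm{RSS} = \sum_{i=1}^n (Y_i - \bar{Y})^2 = \sum_{i=1}^n (Z_i - \bar{Z})^2 = \|(I_n - P)Z\|^2.
\end{align*}
The unknown mean $\beta$ has dropped out, exactly as the general computation predicts.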
[/guided]
[/step]
[step:Verify that $I_n - P$ is symmetric, idempotent, and has rank $n - p$]
**Symmetry.** The hat matrix is symmetric:
\begin{align*}
P^\top &= \big(X(X^\top X)^{-1} X^\top\big)^\top = X\big((X^\top X)^{-1}\big)^\top X^\top = X(X^\top X)^{-1} X^\top = P,
\end{align*}
where $(X^\top X)^{-1}$ is symmetric because $X^\top X$ is symmetric and the inverse of a symmetric matrix is symmetric. Therefore $(I_n - P)^\top = I_n - P$.
**Idempotence.** Computing $P^2$:
\begin{align*}
P^2 &= X(X^\top X)^{-1} X^\top \cdot X(X^\top X)^{-1} X^\top = X(X^\top X)^{-1} (X^\top X) (X^\top X)^{-1} X^\top = X(X^\top X)^{-1} X^\top = P.
\end{align*}
Hence $(I_n - P)^2 = I_n - 2P + P^2 = I_n - 2P + P = I_n - P$.
**Rank.** By the [Quadratic Forms and Idempotent Matrices](/theorems/1441) lemma, the rank of a symmetric idempotent matrix equals its trace. The trace of $P$ is
\begin{align*}
\operatorname{tr}(P) &= \operatorname{tr}\big(X(X^\top X)^{-1} X^\top\big) = \operatorname{tr}\big((X^\top X)^{-1} X^\top X\big) = \operatorname{tr}(I_p) = p,
\end{align*}
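using the cyclic invariance $\operatorname{tr}(ABC) = \operatorname{tr}(BCA)$. Since $I_n - P$ is symmetric and idempotent, $\operatorname{rank}(I_n - P) = \operatorname{tr}(I_n - P) = \operatorname{tr}(I_n) - \operatorname{tr}(P) = n - p$.
Applying the symmetry and idempotence of $I_n - P$ to the final display of Step 1, we conclude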
\begin{align*}
\mathrm{RSS} &= Z^\top (I_n - P)^\top (I_n - P) Z = Z^\top (I_n - P) Z.
\end{align*}
[guided]
We need three facts about $I_n - P$: symmetry, idempotence, and rank. Symmetry and idempotence follow from short direct computations; the rank computation uses the cyclic invariance of the trace.
*Symmetry of $P$.* We compute
\begin{align*}
P^\top = \big[X(X^\top X)^{-1} X^\top\big]^\top = X\big[(X^\top X)^{-1}\big]^\top X^\top.
\end{align*}
The matrix $X^\top X$ is symmetric (as $(X^\top X)^\top = X^\top X$), and the inverse of a symmetric matrix is symmetric: if $M = M^\top$ and $M$ is invertible, then $M^{-1} = (M^\top)^{-1} = (M^{-1})^\top$. Therefore $\big[(X^\top X)^{-1}\big]^\top = (X^\top X)^{-1}$, giving $P^\top = P$, and so $(I_n - P)^\top = I_n^\top - P^\top = I_n - P$.
*Idempotence of $P$.* Why should squaring $P$ return $P$? Geometrically $P$ is the orthogonal projection onto the column space $\operatorname{Range}(X)$; projecting a point that already lies in the subspace leaves it fixed, so $P(Pv) = Pv$ for all $v$. Algebraically, the key collapse is $(X^\top X)^{-1}(X^\top X) = I_p$ in the middle:
\begin{align*}
P^2 = X(X^\top X)^{-1} (X^\top X) (X^\top X)^{-1} X^\top = X \underbrace{(X^\top X)^{-1}(X^\top X)}_{=I_p} (X^\top X)^{-1} X^\top = X (X^\top X)^{-1} X^\top = P.
\end{align*}
From this $(I_n - P)^2 = I_n - 2P + P^2 = I_n - 2P + P = I_n - P$.
*Rank of $I_n - P$.* By [Quadratic Forms and Idempotent Matrices](/theorems/1441), a symmetric idempotent matrix has rank equal to its trace (because its eigenvalues are all $0$ or $1$, and the number of unit eigenvalues equals both the rank and the trace). We therefore compute the trace of $P$ using the cyclic invariance $\operatorname{tr}(ABC) = \operatorname{tr}(BCA) = \operatorname{tr}(CAB)$, which holds whenever the products are defined:
\begin{align*}
\operatorname{tr}(P) = \operatorname{tr}\!\big(\underbrace{X}_{n\times p} \underbrace{(X^\top X)^{-1} X^\top}_{p\times n}\big) = \operatorname{tr}\!\big(\underbrace{(X^\top X)^{-1} X^\top}_{p\times n} \underbrace{X}_{n\times p}\big) = \operatorname{tr}\!\big((X^\top X)^{-1}(X^\top X)\big) = \operatorname{tr}(I_p) = p.
\end{align*}
Since $I_n - P$ is itself symmetric and idempotent, $\operatorname{rank}(I_n - P) = \operatorname{tr}(I_n - P) = \operatorname{tr}(I_n) - \operatorname{tr}(P) = n - p$. Geometrically, this expresses the decomposition $\mathbb{R}^n = \operatorname{Range}(X) \oplus \operatorname{Range}(X)^\perp$, where $P$ projects onto the first summand (of dimension $p$) and $I_n - P$ onto the second (of dimension $n - p$).
Finally, substituting symmetry and idempotence of $I_n - P$ into the end of Step 1:
\begin{align*}
\mathrm{RSS} = Z^\top (I_n - P)^\top (I_n - P) Z = Z^\top (I_n - P)(I_n - P) Z = Z^\top (I_n - P) Z.
\end{align*}
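All three properties can be checked by hand in the intercept-only model, taken here only as an illustration. Let $p = 1$ and $X = \mathbf{1}_n$, the vector of ones (our notation), so that $P = \mathbf{1}_n(\mathbf{1}_n^\top\mathbf{1}_n)^{-1}\mathbf{1}_n^\top = \frac{1}{n}\mathbf{1}_n\mathbf{1}_n^\top$, the matrix with every entry equal to $1/n$. Symmetry is visible by inspection, and using $\mathbf{1}_n^\top \mathbf{1}_n = n$,
\begin{align*}
P^2 = \tfrac{1}{n^2}\,\mathbf{1}_n(\mathbf{1}_n^\top \mathbf{1}_n)\mathbf{1}_n^\top = \tfrac{1}{n^2}\cdot n\,\mathbf{1}_n\mathbf{1}_n^\top = P, \qquad \operatorname{tr}(P) = \sum_{i=1}^n \tfrac{1}{n} = 1 = p.
\end{align*}
Hence $\operatorname{rank}(I_n - P) = n - 1$: in this case $I_n - P$ is the centring matrix $v \mapsto v - \bar{v}\,\mathbf{1}_n$, which removes exactly the one dimension spanned by $\mathbf{1}_n$.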
[/guided]
[/step]
[step:Apply the quadratic-forms lemma to the centred residual vector]
We now verify the distributional hypothesis on $Z$. Under the normal linear model, $Y \sim N_n(X\beta, \sigma^2 I_n)$. Since $Z = Y - X\beta$ is an affine function of $Y$, it is multivariate normal with mean $\mathbb{E}[Y] - X\beta = \mathbf{0}$ and covariance $\operatorname{Cov}(Z) = \operatorname{Cov}(Y) = \sigma^2 I_n$, because subtracting the constant vector $X\beta$ leaves the covariance unchanged. Therefore
\begin{align*}
Z &\sim N_n(\mathbf{0}, \sigma^2 I_n).
\end{align*}
All hypotheses of the [Quadratic Forms and Idempotent Matrices](/theorems/1441) lemma are now satisfied: $Z \sim N_n(\mathbf{0}, \sigma^2 I_n)$, and $A := I_n - P$ is symmetric, idempotent, and has rank $n - p$ (Step 2). The lemma yields
\begin{align*}
\frac{\mathrm{RSS}}{\sigma^2} &= \frac{Z^\top (I_n - P) Z}{\sigma^2} \sim \chi^2_{n - p},
\end{align*}
equivalently $\mathrm{RSS} \sim \sigma^2 \chi^2_{n-p}$, as claimed.
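For illustration, specialise to the intercept-only model, a special case chosen here and not part of the statement: $p = 1$ and $X = \mathbf{1}_n$, the vector of ones (our notation), so that $Y_1, \dots, Y_n$ are i.i.d. $N(\beta, \sigma^2)$ and $\mathrm{RSS} = \sum_{i=1}^n (Y_i - \bar{Y})^2$ with $\bar{Y} = \frac{1}{n}\sum_{i=1}^n Y_i$. The result then reads
\begin{align*}
\frac{1}{\sigma^2}\sum_{i=1}^n (Y_i - \bar{Y})^2 &\sim \chi^2_{n-1},
\end{align*}
which is the classical distribution of the sum of squared deviations of a normal sample. In the general case, since a $\chi^2_k$ variable has mean $k$, the result also gives $\mathbb{E}[\mathrm{RSS}] = \sigma^2 (n - p)$.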
[/step]