[proofplan]
We write the centered sum of squares as a matrix quadratic form using the [orthogonal projection](/theorems/437) onto the subspace of $\mathbb{R}^n$ perpendicular to the all-ones vector. An orthogonal change of coordinates diagonalizes this projection and transforms the centered data into $n-1$ independent Gaussian contrast vectors and one sample-mean direction. The contrast vectors have distribution $\mathcal{N}_p(0,\Sigma)$, and the centered cross-product matrix is exactly the sum of their outer products. This is the defining representation of the Wishart distribution with $n-1$ degrees of freedom and scale matrix $\Sigma$.
[/proofplan]
[step:Represent the centered cross-product matrix by the centering projection]
Let $Y: \Omega \to \mathbb{R}^{n \times p}$ be the random matrix whose $i$-th row is $Y_i^\top$. Let $\mathbf{1}_n \in \mathbb{R}^n$ denote the column vector whose entries are all $1$, and define the centering projection $P \in \mathbb{R}^{n \times n}$ by
\begin{align*}
P &= I_n - \frac{1}{n}\mathbf{1}_n\mathbf{1}_n^\top .
\end{align*}
For each $\omega \in \Omega$, the $i$-th row of $PY(\omega)$ is
\begin{align*}
Y_i(\omega)^\top - \bar{Y}(\omega)^\top .
\end{align*}
Therefore
\begin{align*}
Y^\top P Y
&= (PY)^\top(PY) \\
&= \sum_{i=1}^n (Y_i-\bar{Y})(Y_i-\bar{Y})^\top \\
&= (n-1)S.
\end{align*}
Here we used $P^\top=P$ and $P^2=P$, which follow directly from the definition of $P$ and the identity $\mathbf{1}_n^\top\mathbf{1}_n=n$.
[/step]
[step:Diagonalize the centering projection by an orthogonal basis of contrasts]
Choose vectors $q_1,\dots,q_{n-1} \in \mathbb{R}^n$ forming an orthonormal basis of the subspace
\begin{align*}
\mathbf{1}_n^\perp &= \{x \in \mathbb{R}^n : x^\top \mathbf{1}_n = 0\},
\end{align*}
and set
\begin{align*}
q_n &= \frac{1}{\sqrt{n}}\mathbf{1}_n.
\end{align*}
Let $Q \in \mathbb{R}^{n \times n}$ be the orthogonal matrix whose $a$-th row is $q_a^\top$. Then $Q Q^\top=I_n$, and the projection $P$ has the decomposition
\begin{align*}
P = Q^\top D Q,
\qquad
D =
\begin{pmatrix}
I_{n-1} & 0 \\
0 & 0
\end{pmatrix}.
\end{align*}
Indeed, $Pq_a=q_a$ for $a \in \{1,\dots,n-1\}$ because $q_a^\top\mathbf{1}_n=0$, while $Pq_n=0$ because $q_n$ spans $\operatorname{span}\{\mathbf{1}_n\}$.
[guided]
The matrix $P$ subtracts row averages, so it should preserve exactly the directions in $\mathbb{R}^n$ whose coordinates sum to zero. We make this precise by defining
\begin{align*}
\mathbf{1}_n^\perp &= \{x \in \mathbb{R}^n : x^\top \mathbf{1}_n = 0\}.
\end{align*}
This subspace has dimension $n-1$. Choose an orthonormal basis $q_1,\dots,q_{n-1}$ for it, and complete it by setting
\begin{align*}
q_n &= \frac{1}{\sqrt{n}}\mathbf{1}_n.
\end{align*}
The vectors $q_1,\dots,q_n$ form an orthonormal basis of $\mathbb{R}^n$, because $q_n$ has Euclidean norm $1$ and is orthogonal to every vector in $\mathbf{1}_n^\perp$.
Let $Q \in \mathbb{R}^{n \times n}$ be the orthogonal matrix whose $a$-th row is $q_a^\top$. The projection $P$ acts as the identity on $\mathbf{1}_n^\perp$: if $x \in \mathbf{1}_n^\perp$, then
\begin{align*}
Px
&= x - \frac{1}{n}\mathbf{1}_n\mathbf{1}_n^\top x \\
&= x.
\end{align*}
It annihilates the mean direction:
\begin{align*}
Pq_n
&= \frac{1}{\sqrt{n}}\mathbf{1}_n
- \frac{1}{n}\mathbf{1}_n\mathbf{1}_n^\top\frac{1}{\sqrt{n}}\mathbf{1}_n \\
&= \frac{1}{\sqrt{n}}\mathbf{1}_n
- \frac{1}{n\sqrt{n}}\mathbf{1}_n(\mathbf{1}_n^\top\mathbf{1}_n) \\
&= \frac{1}{\sqrt{n}}\mathbf{1}_n
- \frac{1}{\sqrt{n}}\mathbf{1}_n \\
&=0.
\end{align*}
Thus, in the orthonormal basis $q_1,\dots,q_n$, the matrix of $P$ is
\begin{align*}
D =
\begin{pmatrix}
I_{n-1} & 0 \\
0 & 0
\end{pmatrix},
\end{align*}
which means
\begin{align*}
P = Q^\top DQ.
\end{align*}
[/guided]
[/step]
[step:Transform the observations into independent Gaussian contrasts]
Define the transformed data matrix $Z: \Omega \to \mathbb{R}^{n \times p}$ by
\begin{align*}
Z &= QY.
\end{align*}
Let $Z_a: \Omega \to \mathbb{R}^p$ denote the column vector whose transpose is the $a$-th row of $Z$. Thus
\begin{align*}
Z_a &= \sum_{i=1}^n q_{a i}Y_i,
\end{align*}
where $q_{a i}$ is the $i$-th component of $q_a$.
For $a \in \{1,\dots,n-1\}$, since $\sum_{i=1}^n q_{a i}=q_a^\top\mathbf{1}_n=0$, the mean of $Z_a$ is
\begin{align*}
\mathbb{E}[Z_a]
&= \sum_{i=1}^n q_{a i}\mu \\
&= 0.
\end{align*}
Its covariance matrix is
\begin{align*}
\operatorname{Cov}(Z_a)
&= \sum_{i=1}^n q_{a i}^2\Sigma \\
&= \Sigma,
\end{align*}
because $|q_a|^2=1$. If $a,b \in \{1,\dots,n-1\}$ with $a \neq b$, then
\begin{align*}
\operatorname{Cov}(Z_a,Z_b)
&= \sum_{i=1}^n q_{a i}q_{b i}\Sigma \\
&= (q_a^\top q_b)\Sigma \\
&=0.
\end{align*}
Since $(Z_1,\dots,Z_n)$ is obtained from the jointly Gaussian vector $(Y_1,\dots,Y_n)$ by a linear transformation, it is jointly Gaussian. Therefore the zero cross-covariances imply that $Z_1,\dots,Z_{n-1}$ are independent, and we have
\begin{align*}
Z_1,\dots,Z_{n-1} \overset{\mathrm{ind}}{\sim} \mathcal{N}_p(0,\Sigma).
\end{align*}
[guided]
The purpose of multiplying by $Q$ is to replace the original observations by orthogonal linear combinations: $n-1$ contrast directions and one mean direction. Define
\begin{align*}
Z &= QY,
\end{align*}
and let $Z_a: \Omega \to \mathbb{R}^p$ be the column vector whose transpose is row $a$ of $Z$. Because row $a$ of $Q$ is $q_a^\top$, we have
\begin{align*}
Z_a = \sum_{i=1}^n q_{a i}Y_i.
\end{align*}
For $a \leq n-1$, the vector $q_a$ lies in $\mathbf{1}_n^\perp$, so its coordinates sum to zero. Therefore the common mean $\mu$ cancels:
\begin{align*}
\mathbb{E}[Z_a]
&= \mathbb{E}\left[\sum_{i=1}^n q_{a i}Y_i\right] \\
&= \sum_{i=1}^n q_{a i}\mathbb{E}[Y_i] \\
&= \sum_{i=1}^n q_{a i}\mu \\
&= \left(\sum_{i=1}^n q_{a i}\right)\mu \\
&=0.
\end{align*}
The covariance calculation uses independence of the original observations. Since $\operatorname{Cov}(Y_i,Y_j)=0$ for $i \neq j$ and $\operatorname{Cov}(Y_i)=\Sigma$, we get
\begin{align*}
\operatorname{Cov}(Z_a)
&= \operatorname{Cov}\left(\sum_{i=1}^n q_{a i}Y_i\right) \\
&= \sum_{i=1}^n q_{a i}^2\operatorname{Cov}(Y_i) \\
&= \left(\sum_{i=1}^n q_{a i}^2\right)\Sigma \\
&= \Sigma.
\end{align*}
For two distinct contrast rows $a,b \leq n-1$, the same independence calculation gives
\begin{align*}
\operatorname{Cov}(Z_a,Z_b)
&= \sum_{i=1}^n q_{a i}q_{b i}\operatorname{Cov}(Y_i) \\
&= \left(\sum_{i=1}^n q_{a i}q_{b i}\right)\Sigma \\
&= (q_a^\top q_b)\Sigma \\
&=0.
\end{align*}
The vector $(Z_1,\dots,Z_n)$ is a linear transformation of the jointly Gaussian vector $(Y_1,\dots,Y_n)$, hence is jointly Gaussian. For jointly Gaussian random vectors, vanishing cross-covariance is equivalent to independence. Thus
\begin{align*}
Z_1,\dots,Z_{n-1} \overset{\mathrm{ind}}{\sim} \mathcal{N}_p(0,\Sigma).
\end{align*}
[/guided]
[/step]
[step:Identify the centered cross-product matrix as a Wishart sum]
Using the diagonalization $P=Q^\top DQ$ and the definition $Z=QY$, we obtain
\begin{align*}
(n-1)S
&= Y^\top P Y \\
&= Y^\top Q^\top DQY \\
&= Z^\top D Z \\
&= \sum_{a=1}^{n-1} Z_a Z_a^\top.
\end{align*}
By the defining representation of the Wishart distribution, the sum of $m$ independent outer products $X_aX_a^\top$, where
\begin{align*}
X_1,\dots,X_m \overset{\mathrm{ind}}{\sim} \mathcal{N}_p(0,\Sigma),
\end{align*}
has distribution $W_p(m,\Sigma)$. Applying this definition with $m=n-1$ and $X_a=Z_a$ gives
\begin{align*}
(n-1)S \sim W_p(n-1,\Sigma).
\end{align*}
This is the desired conclusion.
[/step]