[proofplan]
We rotate the observation index by an orthogonal matrix whose first row is the normalized all-ones vector. This separates the data into one Gaussian coordinate equal to $\sqrt{n}(\bar{X}-\mu)$ and $n-1$ Gaussian contrast coordinates. Orthogonality preserves the joint Gaussian structure, and a characteristic-function computation shows that these transformed coordinates are independent with covariance $\Sigma$. Finally, the centered residual sum of squares is exactly the sum of the outer products of the contrast coordinates, which gives both independence from $\bar{X}$ and the Wishart law.
[/proofplan]
[step:Choose an orthogonal basis separating the mean direction]
Let $e \in \mathbb{R}^n$ be the vector
\begin{align*}
e &= \frac{1}{\sqrt{n}}(1,\dots,1).
\end{align*}
Extend $e$ to an orthonormal basis $h_1,\dots,h_n$ of $\mathbb{R}^n$, with $h_1=e$. Let $H \in \mathbb{R}^{n \times n}$ be the orthogonal matrix whose $k$-th row is $h_k^\top$.
Define centered observations $Y_j: \Omega \to \mathbb{R}^p$ by
\begin{align*}
Y_j &= X_j-\mu,
\end{align*}
and define transformed coordinates $Z_k: \Omega \to \mathbb{R}^p$ by
\begin{align*}
Z_k &= \sum_{j=1}^{n} h_{kj}Y_j,
\qquad 1 \leq k \leq n.
\end{align*}
Because $h_{1j}=1/\sqrt{n}$ for every $j$,
\begin{align*}
Z_1
&= \frac{1}{\sqrt{n}}\sum_{j=1}^{n}(X_j-\mu) \\
&= \sqrt{n}(\bar{X}-\mu).
\end{align*}
[guided]
The observation index is the index $j \in \{1,\dots,n\}$, not the coordinate index in $\mathbb{R}^p$. We choose an orthogonal change of basis in this $n$-dimensional observation space so that one new coordinate measures the average and the remaining coordinates measure deviations from the average.
Let
\begin{align*}
e &= \frac{1}{\sqrt{n}}(1,\dots,1) \in \mathbb{R}^n.
\end{align*}
Since $|e|=1$, we may extend $e$ to an orthonormal basis $h_1,\dots,h_n$ of $\mathbb{R}^n$ with $h_1=e$. Let $H \in \mathbb{R}^{n \times n}$ be the orthogonal matrix whose $k$-th row is $h_k^\top$. Thus
\begin{align*}
\sum_{j=1}^{n} h_{kj}h_{\ell j}=\delta_{k\ell},
\qquad
\sum_{k=1}^{n} h_{ki}h_{kj}=\delta_{ij}.
\end{align*}
Define $Y_j: \Omega \to \mathbb{R}^p$ by $Y_j=X_j-\mu$. Each $Y_j$ has law $\mathcal{N}_p(0,\Sigma)$, and the random vectors $Y_1,\dots,Y_n$ are independent because subtracting a deterministic vector preserves independence. For $1 \leq k \leq n$, define
\begin{align*}
Z_k &= \sum_{j=1}^{n} h_{kj}Y_j.
\end{align*}
The first transformed coordinate is the mean direction:
\begin{align*}
Z_1
&= \sum_{j=1}^{n} \frac{1}{\sqrt{n}}(X_j-\mu) \\
&= \sqrt{n}\left(\frac{1}{n}\sum_{j=1}^{n}X_j-\mu\right) \\
&= \sqrt{n}(\bar{X}-\mu).
\end{align*}
Thus proving independence of $Z_1$ from $Z_2,\dots,Z_n$ will prove independence of $\bar{X}$ from any measurable function of $Z_2,\dots,Z_n$.
[/guided]
[/step]
[step:Show the transformed coordinates are independent centered normal vectors]
We compute the joint characteristic function of $(Z_1,\dots,Z_n)$. Let $a_1,\dots,a_n \in \mathbb{R}^p$, and define $b_j \in \mathbb{R}^p$ by
\begin{align*}
b_j &= \sum_{k=1}^{n} h_{kj}a_k.
\end{align*}
Using independence of $Y_1,\dots,Y_n$ and the characteristic function of $\mathcal{N}_p(0,\Sigma)$,
\begin{align*}
\mathbb{E}\left[\exp\left(i\sum_{k=1}^{n} a_k^\top Z_k\right)\right]
&= \mathbb{E}\left[\exp\left(i\sum_{j=1}^{n} b_j^\top Y_j\right)\right] \\
&= \prod_{j=1}^{n}\exp\left(-\frac{1}{2}b_j^\top\Sigma b_j\right) \\
&= \exp\left(-\frac{1}{2}\sum_{j=1}^{n} b_j^\top\Sigma b_j\right).
\end{align*}
By orthogonality of $H$,
\begin{align*}
\sum_{j=1}^{n} b_j^\top\Sigma b_j
&= \sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{\ell=1}^{n}
h_{kj}h_{\ell j}a_k^\top\Sigma a_\ell \\
&= \sum_{k=1}^{n}\sum_{\ell=1}^{n}\delta_{k\ell}a_k^\top\Sigma a_\ell \\
&= \sum_{k=1}^{n}a_k^\top\Sigma a_k.
\end{align*}
Therefore
\begin{align*}
\mathbb{E}\left[\exp\left(i\sum_{k=1}^{n} a_k^\top Z_k\right)\right]
&= \prod_{k=1}^{n}\exp\left(-\frac{1}{2}a_k^\top\Sigma a_k\right).
\end{align*}
This factorization shows that $Z_1,\dots,Z_n$ are independent, and each $Z_k$ has law $\mathcal{N}_p(0,\Sigma)$.
[guided]
We now verify independence directly rather than merely asserting that an orthogonal transform preserves normality. Let $a_1,\dots,a_n \in \mathbb{R}^p$ be arbitrary test vectors. Define $b_j \in \mathbb{R}^p$ by
\begin{align*}
b_j &= \sum_{k=1}^{n}h_{kj}a_k.
\end{align*}
Then
\begin{align*}
\sum_{k=1}^{n}a_k^\top Z_k
&= \sum_{k=1}^{n}a_k^\top\left(\sum_{j=1}^{n}h_{kj}Y_j\right) \\
&= \sum_{j=1}^{n}\left(\sum_{k=1}^{n}h_{kj}a_k\right)^\top Y_j \\
&= \sum_{j=1}^{n}b_j^\top Y_j.
\end{align*}
Since $Y_1,\dots,Y_n$ are independent and each $Y_j$ has law $\mathcal{N}_p(0,\Sigma)$, their characteristic functions multiply:
\begin{align*}
\mathbb{E}\left[\exp\left(i\sum_{k=1}^{n}a_k^\top Z_k\right)\right]
&= \prod_{j=1}^{n}\mathbb{E}\left[\exp(i b_j^\top Y_j)\right] \\
&= \prod_{j=1}^{n}\exp\left(-\frac{1}{2}b_j^\top\Sigma b_j\right) \\
&= \exp\left(-\frac{1}{2}\sum_{j=1}^{n}b_j^\top\Sigma b_j\right).
\end{align*}
It remains to simplify the quadratic form. Expanding $b_j$ gives
\begin{align*}
\sum_{j=1}^{n}b_j^\top\Sigma b_j
&= \sum_{j=1}^{n}
\left(\sum_{k=1}^{n}h_{kj}a_k\right)^\top
\Sigma
\left(\sum_{\ell=1}^{n}h_{\ell j}a_\ell\right) \\
&= \sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{\ell=1}^{n}
h_{kj}h_{\ell j}a_k^\top\Sigma a_\ell.
\end{align*}
Because the rows of $H$ are orthonormal,
\begin{align*}
\sum_{j=1}^{n}h_{kj}h_{\ell j}=\delta_{k\ell}.
\end{align*}
Hence
\begin{align*}
\sum_{j=1}^{n}b_j^\top\Sigma b_j
&= \sum_{k=1}^{n}\sum_{\ell=1}^{n}\delta_{k\ell}a_k^\top\Sigma a_\ell \\
&= \sum_{k=1}^{n}a_k^\top\Sigma a_k.
\end{align*}
Therefore
\begin{align*}
\mathbb{E}\left[\exp\left(i\sum_{k=1}^{n}a_k^\top Z_k\right)\right]
&= \prod_{k=1}^{n}\exp\left(-\frac{1}{2}a_k^\top\Sigma a_k\right).
\end{align*}
This is exactly the product of the characteristic functions of $n$ independent $\mathcal{N}_p(0,\Sigma)$ random vectors. Thus $Z_1,\dots,Z_n$ are independent and each $Z_k$ has law $\mathcal{N}_p(0,\Sigma)$.
[/guided]
[/step]
[step:Rewrite the residual sum of squares using only the contrast coordinates]
Let $R_j: \Omega \to \mathbb{R}^p$ be the centered residual
\begin{align*}
R_j &= X_j-\bar{X}.
\end{align*}
Since $Y_j=X_j-\mu$, we have
\begin{align*}
R_j &= Y_j-\frac{1}{n}\sum_{m=1}^{n}Y_m.
\end{align*}
Using the inverse orthogonal expansion $Y_j=\sum_{k=1}^{n}h_{kj}Z_k$ and $h_{1j}=1/\sqrt{n}$,
\begin{align*}
R_j
&= \sum_{k=1}^{n}h_{kj}Z_k-\frac{1}{n}\sum_{m=1}^{n}\sum_{k=1}^{n}h_{km}Z_k \\
&= \sum_{k=1}^{n}h_{kj}Z_k-\frac{1}{n}\sum_{k=1}^{n}\left(\sum_{m=1}^{n}h_{km}\right)Z_k.
\end{align*}
Now $\sum_{m=1}^{n}h_{1m}=\sqrt{n}$, while $\sum_{m=1}^{n}h_{km}=0$ for $k \geq 2$ because $h_k$ is orthogonal to $h_1=e$. Hence
\begin{align*}
R_j &= \sum_{k=2}^{n}h_{kj}Z_k.
\end{align*}
Therefore, using orthonormality of the rows of $H$,
\begin{align*}
\sum_{j=1}^{n}R_jR_j^\top
&= \sum_{j=1}^{n}
\left(\sum_{k=2}^{n}h_{kj}Z_k\right)
\left(\sum_{\ell=2}^{n}h_{\ell j}Z_\ell\right)^\top \\
&= \sum_{k=2}^{n}\sum_{\ell=2}^{n}
\left(\sum_{j=1}^{n}h_{kj}h_{\ell j}\right)Z_kZ_\ell^\top \\
&= \sum_{k=2}^{n}Z_kZ_k^\top.
\end{align*}
[guided]
The sample covariance only depends on residuals from the sample mean, so we need to express those residuals in the new orthogonal coordinates. Define
\begin{align*}
R_j &= X_j-\bar{X}.
\end{align*}
Because $Y_j=X_j-\mu$,
\begin{align*}
R_j
&= X_j-\frac{1}{n}\sum_{m=1}^{n}X_m \\
&= Y_j-\frac{1}{n}\sum_{m=1}^{n}Y_m.
\end{align*}
The transformation from $(Y_1,\dots,Y_n)$ to $(Z_1,\dots,Z_n)$ is orthogonal in the observation index, so its inverse is the transpose transformation:
\begin{align*}
Y_j &= \sum_{k=1}^{n}h_{kj}Z_k.
\end{align*}
Substituting this into the residual gives
\begin{align*}
R_j
&= \sum_{k=1}^{n}h_{kj}Z_k
-\frac{1}{n}\sum_{m=1}^{n}\sum_{k=1}^{n}h_{km}Z_k \\
&= \sum_{k=1}^{n}h_{kj}Z_k
-\frac{1}{n}\sum_{k=1}^{n}\left(\sum_{m=1}^{n}h_{km}\right)Z_k.
\end{align*}
The row $h_1$ equals $(1/\sqrt{n},\dots,1/\sqrt{n})$, so
\begin{align*}
\sum_{m=1}^{n}h_{1m}=\sqrt{n}.
\end{align*}
For $k \geq 2$, the row $h_k$ is orthogonal to $h_1$, which means
\begin{align*}
0=h_k\cdot h_1=\frac{1}{\sqrt{n}}\sum_{m=1}^{n}h_{km},
\end{align*}
and therefore $\sum_{m=1}^{n}h_{km}=0$. Thus the mean component cancels and only the contrast coordinates remain:
\begin{align*}
R_j &= \sum_{k=2}^{n}h_{kj}Z_k.
\end{align*}
Now compute the residual sum of squares:
\begin{align*}
\sum_{j=1}^{n}R_jR_j^\top
&= \sum_{j=1}^{n}
\left(\sum_{k=2}^{n}h_{kj}Z_k\right)
\left(\sum_{\ell=2}^{n}h_{\ell j}Z_\ell\right)^\top \\
&= \sum_{k=2}^{n}\sum_{\ell=2}^{n}
\left(\sum_{j=1}^{n}h_{kj}h_{\ell j}\right)Z_kZ_\ell^\top.
\end{align*}
Orthonormality of the rows of $H$ gives
\begin{align*}
\sum_{j=1}^{n}h_{kj}h_{\ell j}=\delta_{k\ell}.
\end{align*}
Hence all cross terms vanish and
\begin{align*}
\sum_{j=1}^{n}R_jR_j^\top
&= \sum_{k=2}^{n}Z_kZ_k^\top.
\end{align*}
This identity is the algebraic reason the covariance matrix has $n-1$ degrees of freedom: the mean direction has been removed, leaving $n-1$ independent contrast directions.
[/guided]
[/step]
[step:Conclude independence and identify the Wishart distribution]
By the previous step,
\begin{align*}
(n-1)S
&= \sum_{j=1}^{n}(X_j-\bar{X})(X_j-\bar{X})^\top \\
&= \sum_{k=2}^{n}Z_kZ_k^\top.
\end{align*}
The random vector $\bar{X}$ is a measurable function of $Z_1$, since
\begin{align*}
\bar{X} &= \mu+\frac{1}{\sqrt{n}}Z_1.
\end{align*}
The matrix $(n-1)S$ is a measurable function of $(Z_2,\dots,Z_n)$. Since $Z_1$ is independent of $(Z_2,\dots,Z_n)$, it follows that $\bar{X}$ and $S$ are independent.
Finally, $Z_2,\dots,Z_n$ are $n-1$ independent random vectors with common law $\mathcal{N}_p(0,\Sigma)$. By the defining characterization of the Wishart distribution,
\begin{align*}
\sum_{k=2}^{n}Z_kZ_k^\top \sim W_p(\Sigma,n-1).
\end{align*}
Therefore
\begin{align*}
(n-1)S \sim W_p(\Sigma,n-1),
\end{align*}
which completes the proof.
[/step]