[proofplan]
We encode the centered observations in a data matrix $Y \in \mathbb{R}^{n \times p}$. The covariance matrix is a scalar multiple of $Y^\top Y$, so its column space is controlled by the column space of $Y^\top$. The centering condition forces every column of $Y$ to lie in the hyperplane of vectors in $\mathbb{R}^n$ whose coordinates sum to zero, giving the bound $n-1$, while the number of columns gives the bound $p$.
[/proofplan]
[step:Represent the sample covariance as a Gram matrix]
Define the centered data matrix $Y \in \mathbb{R}^{n \times p}$ by
\begin{align*}
Y_{ij} := (X_i)_j - \bar{X}_j
\end{align*}
for $1 \leq i \leq n$ and $1 \leq j \leq p$. Equivalently, the $i$-th row of $Y$ is $(X_i-\bar{X})^\top$. Therefore
\begin{align*}
Y^\top Y
&= \sum_{i=1}^{n} (X_i-\bar{X})(X_i-\bar{X})^\top,
\end{align*}
and hence
\begin{align*}
S = \frac{1}{n-1}Y^\top Y.
\end{align*}
Since $n \geq 2$, the scalar $\frac{1}{n-1}$ is nonzero, so multiplication by this scalar does not change rank. Thus
\begin{align*}
\operatorname{rank}(S)=\operatorname{rank}(Y^\top Y).
\end{align*}
[/step]
[step:Bound the rank of $Y^\top Y$ by the rank of $Y$]
For a real matrix $A \in \mathbb{R}^{r \times s}$, let $\operatorname{Range}(A) := \{Aw : w \in \mathbb{R}^s\} \subseteq \mathbb{R}^r$ denote its column space. For every vector $v \in \mathbb{R}^p$, the vector $Y^\top Yv$ belongs to $\operatorname{Range}(Y^\top)$. Hence
\begin{align*}
\operatorname{Range}(Y^\top Y) \subseteq \operatorname{Range}(Y^\top).
\end{align*}
Taking dimensions gives
\begin{align*}
\operatorname{rank}(Y^\top Y) \leq \operatorname{rank}(Y^\top).
\end{align*}
The rank of a matrix equals the rank of its transpose, so
\begin{align*}
\operatorname{rank}(Y^\top Y) \leq \operatorname{rank}(Y).
\end{align*}
[/step]
[step:Use centering to place the column space of $Y$ in an $(n-1)$-dimensional hyperplane]
Let $H \subset \mathbb{R}^n$ be the hyperplane
\begin{align*}
H := \left\{a \in \mathbb{R}^n : \sum_{i=1}^{n} a_i = 0\right\}.
\end{align*}
For each column index $j \in \{1,\dots,p\}$, the sum of the entries in the $j$-th column of $Y$ is
\begin{align*}
\sum_{i=1}^{n} Y_{ij}
&= \sum_{i=1}^{n}\bigl((X_i)_j-\bar{X}_j\bigr) \\
&= \sum_{i=1}^{n}(X_i)_j - n\bar{X}_j \\
&= \sum_{i=1}^{n}(X_i)_j - n\left(\frac{1}{n}\sum_{i=1}^{n}(X_i)_j\right) \\
&= 0.
\end{align*}
Thus every column of $Y$ belongs to $H$, so $\operatorname{Range}(Y) \subseteq H$. The hyperplane $H$ is the kernel of the nonzero [linear map](/page/Linear%20Map)
\begin{align*}
\ell: \mathbb{R}^n &\to \mathbb{R} \\
a &\mapsto \sum_{i=1}^{n} a_i,
\end{align*}
so $\dim H=n-1$. Therefore
\begin{align*}
\operatorname{rank}(Y) = \dim \operatorname{Range}(Y) \leq n-1.
\end{align*}
[/step]
[step:Combine the column-count and centering bounds]
Since $Y$ has $p$ columns, its column space is spanned by at most $p$ vectors in $\mathbb{R}^n$, and hence
\begin{align*}
\operatorname{rank}(Y) \leq p.
\end{align*}
Together with the previous step,
\begin{align*}
\operatorname{rank}(Y) \leq \min\{p,n-1\}.
\end{align*}
Combining this with $\operatorname{rank}(S)=\operatorname{rank}(Y^\top Y)\leq \operatorname{rank}(Y)$ gives
\begin{align*}
\operatorname{rank}(S) \leq \min\{p,n-1\}.
\end{align*}
If $p \geq n$, then $n-1 \leq p-1$, so
\begin{align*}
\operatorname{rank}(S) \leq n-1 \leq p-1 < p.
\end{align*}
Since $S$ is a $p \times p$ matrix with rank strictly smaller than $p$, $S$ is singular.
[/step]