[proofplan]
We prove orthogonality by computing the inner products of the columns of $P$. The defining property of a permutation matrix says that each column is one of the standard basis vectors of $\mathbb{R}^n$, and that distinct columns are distinct standard basis vectors. Therefore the columns form an orthonormal list, which is exactly the statement that $P^\top P = I_n$.
[/proofplan]
[step:Write the columns of the permutation matrix as distinct standard basis vectors]
For each $k \in \{1,\dots,n\}$, let $p_k \in \mathbb{R}^n$ denote the $k$th column of $P$. Since $P$ is a permutation matrix, every column of $P$ has exactly one entry equal to $1$ and all other entries equal to $0$, and no two columns have their $1$ in the same row.
Let $e_1,\dots,e_n \in \mathbb{R}^n$ denote the standard basis vectors, where $e_r$ has $r$th component equal to $1$ and all other components equal to $0$. Thus, for each $k \in \{1,\dots,n\}$, there exists a unique index $\sigma(k) \in \{1,\dots,n\}$ such that
\begin{align*}
p_k = e_{\sigma(k)}.
\end{align*}
The condition that no two columns have their $1$ in the same row means that the map
\begin{align*}
\sigma: \{1,\dots,n\} &\to \{1,\dots,n\}
\end{align*}
is injective.
[/step]
[step:Compute the inner products of the columns]
Let $i,j \in \{1,\dots,n\}$. Since $p_i = e_{\sigma(i)}$ and $p_j = e_{\sigma(j)}$, the Euclidean [inner product](/page/Inner%20Product) of the two columns is
\begin{align*}
p_i^\top p_j = e_{\sigma(i)}^\top e_{\sigma(j)}.
\end{align*}
For standard basis vectors, the inner product satisfies
\begin{align*}
e_r^\top e_s =
\begin{cases}
1, & r=s,
\end{align*}
\begin{align*}
0, & r \neq s.
\end{cases}
\end{align*}
Since $\sigma$ is injective, $\sigma(i)=\sigma(j)$ holds iff $i=j$. Therefore
\begin{align*}
p_i^\top p_j =
\begin{cases}
1, & i=j,
\end{align*}
\begin{align*}
0, & i \neq j.
\end{cases}
\end{align*}
[guided]
We want to prove $P^\top P = I_n$, and the entries of $P^\top P$ are column inner products. That is why we name the columns. For each $k \in \{1,\dots,n\}$, let $p_k \in \mathbb{R}^n$ be the $k$th column of $P$.
Because $P$ is a permutation matrix, each column has exactly one entry equal to $1$ and every other entry equal to $0$. Hence each column is a standard basis vector. Let $e_1,\dots,e_n \in \mathbb{R}^n$ denote the standard basis vectors. For every $k \in \{1,\dots,n\}$, define $\sigma(k) \in \{1,\dots,n\}$ to be the row index where the unique $1$ in the $k$th column occurs. Then
\begin{align*}
p_k = e_{\sigma(k)}.
\end{align*}
The row condition in the definition of a permutation matrix says that no two columns have their $1$ in the same row. Therefore if $\sigma(i)=\sigma(j)$, then the $i$th and $j$th columns have their unique $1$ in the same row, so $i=j$. Thus $\sigma$ is injective.
Now let $i,j \in \{1,\dots,n\}$. Using the column description, we compute
\begin{align*}
p_i^\top p_j = e_{\sigma(i)}^\top e_{\sigma(j)}.
\end{align*}
The standard basis vectors are orthonormal, so
\begin{align*}
e_r^\top e_s =
\begin{cases}
1, & r=s,
\end{align*}
\begin{align*}
0, & r \neq s.
\end{cases}
\end{align*}
Applying this with $r=\sigma(i)$ and $s=\sigma(j)$, and using the injectivity of $\sigma$, gives
\begin{align*}
p_i^\top p_j =
\begin{cases}
1, & i=j,
\end{align*}
\begin{align*}
0, & i \neq j.
\end{cases}
\end{align*}
Thus the columns of $P$ are orthonormal.
[/guided]
[/step]
[step:Identify the Gram matrix with the identity matrix]
For each $i,j \in \{1,\dots,n\}$, the $(i,j)$-entry of $P^\top P$ is the inner product of the $i$th and $j$th columns of $P$:
\begin{align*}
(P^\top P)_{ij} = p_i^\top p_j.
\end{align*}
By the preceding computation,
\begin{align*}
(P^\top P)_{ij} =
\begin{cases}
1, & i=j,
\end{align*}
\begin{align*}
0, & i \neq j.
\end{cases}
\end{align*}
These are exactly the entries of the identity matrix $I_n$. Hence
\begin{align*}
P^\top P = I_n.
\end{align*}
[/step]
[step:Conclude that the permutation matrix belongs to the orthogonal group]
By definition,
\begin{align*}
O(n) = \{Q \in \mathbb{R}^{n \times n} : Q^\top Q = I_n\}.
\end{align*}
Since $P \in \mathbb{R}^{n \times n}$ and $P^\top P = I_n$, it follows that $P \in O(n)$. This proves the theorem.
[/step]