[guided]The whole proof is a single idea: **a $G$-invariant inner product becomes the standard inner product after a change of basis**. The unitary group $\operatorname{U}(n)$ is by definition the matrices that preserve the standard Hermitian inner product $(\cdot, \cdot)_0$. So if our $G$-invariant inner product $\langle \cdot, \cdot \rangle$ were the standard one, every $g \in G$ would already be unitary by virtue of the $G$-invariance condition $\langle g\mathbf{v}, g\mathbf{w}\rangle = \langle \mathbf{v}, \mathbf{w}\rangle$. The change of basis $P$ is the dictionary translating between the two inner products.
**Why is the dictionary identity $\langle \mathbf{x}, \mathbf{y} \rangle = (P^{-1}\mathbf{x}, P^{-1}\mathbf{y})_0$ true?** Expand any $\mathbf{x}, \mathbf{y} \in \mathbb{C}^n$ in the basis $(\mathbf{f}_i)$: $\mathbf{x} = \sum_i x_i \mathbf{f}_i$ and $\mathbf{y} = \sum_i y_i \mathbf{f}_i$. The coordinate vector of $\mathbf{x}$ in the basis $(\mathbf{f}_i)$ is $P^{-1}\mathbf{x}$ (because $P$ takes the standard basis to $(\mathbf{f}_i)$, so $P^{-1}$ takes vectors to their $(\mathbf{f}_i)$-coordinates). Hence $x_i = (P^{-1}\mathbf{x})_i$, and likewise $y_i = (P^{-1}\mathbf{y})_i$. By orthonormality of $(\mathbf{f}_i)$ for $\langle \cdot, \cdot \rangle$,
\begin{align*}
\langle \mathbf{x}, \mathbf{y} \rangle = \sum_i x_i \overline{y_i} = \sum_i (P^{-1}\mathbf{x})_i \overline{(P^{-1}\mathbf{y})_i} = (P^{-1}\mathbf{x},\, P^{-1}\mathbf{y})_0.
\end{align*}
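As a concrete check of this identity (the inner product and basis below are chosen purely for illustration and do not come from the proof), take $n = 2$, write $\mathbf{x} = (a, b)^{\mathsf T}$ and $\mathbf{y} = (c, d)^{\mathsf T}$ in standard coordinates, and set $\langle \mathbf{x}, \mathbf{y} \rangle = a\overline{c} + \tfrac12\bigl(a\overline{d} + b\overline{c}\bigr) + \tfrac32\, b\overline{d}$. The vectors $\mathbf{f}_1 = (1, 0)^{\mathsf T}$ and $\mathbf{f}_2 = \tfrac{1}{\sqrt5}(-1, 2)^{\mathsf T}$ are orthonormal for this inner product, so
\begin{align*}
P = \begin{pmatrix} 1 & -1/\sqrt5 \\ 0 & 2/\sqrt5 \end{pmatrix}, \qquad
P^{-1} = \begin{pmatrix} 1 & 1/2 \\ 0 & \sqrt5/2 \end{pmatrix}, \qquad
P^{-1}\mathbf{x} = \begin{pmatrix} a + b/2 \\ \sqrt5\, b/2 \end{pmatrix},
\end{align*}
and expanding the standard inner product of the coordinate vectors recovers the original one:
\begin{align*}
(P^{-1}\mathbf{x},\, P^{-1}\mathbf{y})_0
= \bigl(a + \tfrac{b}{2}\bigr)\overline{\bigl(c + \tfrac{d}{2}\bigr)} + \tfrac54\, b\overline{d}
= a\overline{c} + \tfrac12\bigl(a\overline{d} + b\overline{c}\bigr) + \tfrac32\, b\overline{d}
= \langle \mathbf{x}, \mathbf{y} \rangle.
\end{align*}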
**Why does $P^{-1}gP$ work, not $PgP^{-1}$?** Because the dictionary $(\diamond)$ has the form $\langle P\mathbf{x}, P\mathbf{y}\rangle = (\mathbf{x},\mathbf{y})_0$; that is, $P$ pulls $\langle \cdot, \cdot\rangle$ back to $(\cdot, \cdot)_0$. Applying the dictionary to $A\mathbf{x}$ and $A\mathbf{y}$ gives $(A\mathbf{x}, A\mathbf{y})_0 = \langle P A \mathbf{x}, P A \mathbf{y}\rangle$, so a matrix $A$ preserves $(\cdot, \cdot)_0$ exactly when $\langle P A \mathbf{x}, P A \mathbf{y}\rangle = \langle P\mathbf{x}, P\mathbf{y}\rangle$ for all $\mathbf{x}, \mathbf{y}$. The simplest $A$ arising from $g$ that has this property is $A = P^{-1} g P$: then $P A = g P$, and $g$ preserves $\langle \cdot, \cdot\rangle$ by hypothesis, so $\langle g P \mathbf{x}, g P \mathbf{y}\rangle = \langle P\mathbf{x}, P\mathbf{y}\rangle = (\mathbf{x}, \mathbf{y})_0$. Tracing back, $(A\mathbf{x}, A\mathbf{y})_0 = \langle P A \mathbf{x}, P A \mathbf{y}\rangle = (\mathbf{x}, \mathbf{y})_0$.
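A small example makes the asymmetry visible (the group and basis here are assumptions chosen for illustration). Let $G = \{I, g\}$ with $g = \begin{pmatrix} 1 & 1 \\ 0 & -1 \end{pmatrix}$, so that $g^2 = I$. The averaged inner product $\langle \mathbf{x}, \mathbf{y}\rangle = \tfrac12\bigl[(\mathbf{x}, \mathbf{y})_0 + (g\mathbf{x}, g\mathbf{y})_0\bigr]$ is $G$-invariant, and $\mathbf{f}_1 = (1, 0)^{\mathsf T}$, $\mathbf{f}_2 = \tfrac{1}{\sqrt5}(-1, 2)^{\mathsf T}$ is one orthonormal basis for it, with $P = \begin{pmatrix} 1 & -1/\sqrt5 \\ 0 & 2/\sqrt5 \end{pmatrix}$ and $P^{-1} = \begin{pmatrix} 1 & 1/2 \\ 0 & \sqrt5/2 \end{pmatrix}$. Computing both conjugates,
\begin{align*}
P^{-1} g P = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
P g P^{-1} = \begin{pmatrix} 1 & 1 + \sqrt5/2 \\ 0 & -1 \end{pmatrix}.
\end{align*}
The first matrix is unitary. The second is not: its first row has norm greater than $1$, so conjugating in the wrong direction moves $g$ further from $\operatorname{U}(2)$ rather than into it.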
The hypothesis "$G$ finite" is consumed entirely inside Weyl's unitary trick, whose averaging argument sums over the elements of $G$ and therefore requires a finite group. The remainder of the proof is general linear algebra, valid for any finite-dimensional complex inner-product space.[/guided]