[guided]Let $T: V \to V$ be a linear operator. Recall that the matrix of $T$ in a given basis is defined by how $T$ acts on coordinate vectors: the matrix $[T]_\mathcal{B}$ satisfies $[T(v)]_\mathcal{B} = [T]_\mathcal{B} \, [v]_\mathcal{B}$ for every $v \in V$, and likewise $[T(v)]_\mathcal{C} = [T]_\mathcal{C} \, [v]_\mathcal{C}$.
The strategy is to express $[T(v)]_\mathcal{C}$ in two ways and equate them. On one hand, by the definition of $[T]_\mathcal{C}$:
\begin{align*}
[T(v)]_\mathcal{C} &= [T]_\mathcal{C} \, [v]_\mathcal{C}.
\end{align*}
On the other hand, we can compute $[T(v)]_\mathcal{C}$ by first working in $\mathcal{B}$-coordinates and then converting. The coordinate-vector formula (proved in the previous step) states that $[w]_\mathcal{C} = P_{\mathcal{B} \to \mathcal{C}} \, [w]_\mathcal{B}$ for every $w \in V$. Applying it with $w = T(v)$:
\begin{align*}
[T(v)]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T(v)]_\mathcal{B}.
\end{align*}
Now we use the definition of $[T]_\mathcal{B}$ to replace $[T(v)]_\mathcal{B}$:
\begin{align*}
[T(v)]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, [v]_\mathcal{B}.
\end{align*}
We still have $[v]_\mathcal{B}$ on the right, but we want everything in terms of $[v]_\mathcal{C}$. Since $P_{\mathcal{B} \to \mathcal{C}}$ is invertible (established in the first step), the coordinate-vector formula $[v]_\mathcal{C} = P_{\mathcal{B} \to \mathcal{C}} \, [v]_\mathcal{B}$ can be inverted to give $[v]_\mathcal{B} = (P_{\mathcal{B} \to \mathcal{C}})^{-1} \, [v]_\mathcal{C}$. Substituting:
\begin{align*}
[T(v)]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1} \, [v]_\mathcal{C}.
\end{align*}
Comparing with $[T(v)]_\mathcal{C} = [T]_\mathcal{C} \, [v]_\mathcal{C}$, we see that the matrices $[T]_\mathcal{C}$ and $P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1}$ give the same product with $[v]_\mathcal{C}$ for every $v \in V$. Why does $[v]_\mathcal{C}$ range over all of $\mathbb{F}^n$? Write $\mathcal{C} = (c_1, \dots, c_n)$. By [Unique Representation By A Basis](/theorems/372), every vector in $V$ has a unique coordinate representation, so the coordinate map $v \mapsto [v]_\mathcal{C}$ is well defined and injective. It is also surjective, because any column $(a_1, \dots, a_n)^{\mathsf{T}} \in \mathbb{F}^n$ is the coordinate vector of $a_1 c_1 + \dots + a_n c_n$. Hence the coordinate map is a bijection from $V$ to $\mathbb{F}^n$. Two matrices that agree on every vector in $\mathbb{F}^n$ must be equal (take $v = c_k$, so that $[v]_\mathcal{C} = e_k$, the $k$-th standard basis vector, to see that the $k$-th columns coincide for each $k$). Therefore:
\begin{align*}
[T]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1}.
\end{align*}
This is the similarity transformation relating the two matrix representations of the same linear operator. The formula has a clean interpretation when the product is read from right to left: to apply $T$ in $\mathcal{C}$-coordinates, first convert from $\mathcal{C}$ to $\mathcal{B}$ (via $(P_{\mathcal{B} \to \mathcal{C}})^{-1}$), then apply $[T]_\mathcal{B}$ in $\mathcal{B}$-coordinates, and finally convert the result back from $\mathcal{B}$ to $\mathcal{C}$ (via $P_{\mathcal{B} \to \mathcal{C}}$).[/guided]
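As a sanity check, here is a small worked example of the formula. The space, bases, and operator are illustrative choices, not taken from the proof above: take $V = \mathbb{R}^2$, let $\mathcal{B} = (e_1, e_2)$ be the standard basis, let $\mathcal{C} = (c_1, c_2)$ with $c_1 = (1, 1)$ and $c_2 = (0, 1)$, and let $T$ be the operator whose matrix in $\mathcal{B}$ is given below. Since $e_1 = c_1 - c_2$ and $e_2 = c_2$, applying $[v]_\mathcal{C} = P_{\mathcal{B} \to \mathcal{C}} \, [v]_\mathcal{B}$ to $v = e_1$ and $v = e_2$ shows that the columns of $P_{\mathcal{B} \to \mathcal{C}}$ are $[e_1]_\mathcal{C}$ and $[e_2]_\mathcal{C}$:
\begin{align*}
[T]_\mathcal{B} &= \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}, &
P_{\mathcal{B} \to \mathcal{C}} &= \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}, &
(P_{\mathcal{B} \to \mathcal{C}})^{-1} &= \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.
\end{align*}
Multiplying out the right-hand side of the formula:
\begin{align*}
P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1}
&= \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \\
&= \begin{pmatrix} 2 & 1 \\ -2 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \\
&= \begin{pmatrix} 3 & 1 \\ 0 & 2 \end{pmatrix}.
\end{align*}
This agrees with computing $[T]_\mathcal{C}$ directly from its columns $[T(c_1)]_\mathcal{C}$ and $[T(c_2)]_\mathcal{C}$:
\begin{align*}
T(c_1) &= (3, 3) = 3\,c_1 + 0\,c_2, &
T(c_2) &= (1, 3) = 1\,c_1 + 2\,c_2.
\end{align*}
So in this example both routes give $[T]_\mathcal{C} = \begin{pmatrix} 3 & 1 \\ 0 & 2 \end{pmatrix}$. The two matrices also share trace $5$ and determinant $6$, as similar matrices must.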