Change of Basis Transformation — Statement & Proof

Theorem

Let $V$ be a finite-dimensional vector space over a field $\mathbb{F}$ with $\dim V = n$, and let $\mathcal{B} = (b_1, \dots, b_n)$ and $\mathcal{C} = (c_1, \dots, c_n)$ be ordered bases of $V$. Let $P_{\mathcal{B} \to \mathcal{C}} \in \mathbb{F}^{n \times n}$ be the change-of-basis matrix whose $j$-th column is $[b_j]_\mathcal{C}$. Then $P_{\mathcal{B} \to \mathcal{C}}$ is invertible, and:

1. For every $v \in V$, $[v]_\mathcal{C} = P_{\mathcal{B} \to \mathcal{C}} \, [v]_\mathcal{B}$.
2. For every linear operator $T: V \to V$, $[T]_\mathcal{C} = P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1}$.

Proof

[proofplan] We prove both parts by expanding in coordinates. For the coordinate-vector formula, we write $v$ as a linear combination of the $\mathcal{B}$-basis vectors, re-express each $\mathcal{B}$-basis vector in the $\mathcal{C}$-basis using the columns of the change-of-basis matrix, and read off the $\mathcal{C}$-coordinates. For the operator-matrix formula, we combine the coordinate-vector identity with the defining property of the matrix of a linear operator to reduce the claim to a matrix multiplication. [/proofplan]

[step:Define the bases and the change-of-basis matrix]
Let $\dim V = n$, and write the two bases as ordered tuples:
\begin{align*} \mathcal{B} &= (b_1, \dots, b_n), & \mathcal{C} &= (c_1, \dots, c_n). \end{align*}
The change-of-basis matrix $P_{\mathcal{B} \to \mathcal{C}} \in \mathbb{F}^{n \times n}$ is defined by expressing each $\mathcal{B}$-basis vector in the $\mathcal{C}$-basis: for each $j \in \{1, \dots, n\}$,
\begin{align*} b_j &= \sum_{i=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, c_i. \end{align*}
That is, the $j$-th column of $P_{\mathcal{B} \to \mathcal{C}}$ is the coordinate vector $[b_j]_\mathcal{C}$.

Since $\mathcal{B}$ is a basis, the vectors $b_1, \dots, b_n$ are linearly independent; since the coordinate map $v \mapsto [v]_\mathcal{C}$ is an isomorphism, the columns $[b_1]_\mathcal{C}, \dots, [b_n]_\mathcal{C}$ are linearly independent in $\mathbb{F}^n$. Therefore $P_{\mathcal{B} \to \mathcal{C}}$ is invertible, i.e., $P_{\mathcal{B} \to \mathcal{C}} \in \mathrm{GL}_n(\mathbb{F})$.

[guided]
We begin by setting up the notation carefully. Let $\dim V = n$, and write the two bases as ordered tuples:
\begin{align*} \mathcal{B} &= (b_1, \dots, b_n), & \mathcal{C} &= (c_1, \dots, c_n). \end{align*}
The change-of-basis matrix $P_{\mathcal{B} \to \mathcal{C}} \in \mathbb{F}^{n \times n}$ encodes how to pass from $\mathcal{B}$-coordinates to $\mathcal{C}$-coordinates. Its definition is: for each $j \in \{1, \dots, n\}$, write the $j$-th $\mathcal{B}$-basis vector in terms of the $\mathcal{C}$-basis:
\begin{align*} b_j &= \sum_{i=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, c_i. \end{align*}
In other words, the $j$-th column of $P_{\mathcal{B} \to \mathcal{C}}$ is the coordinate vector $[b_j]_\mathcal{C}$.

Why is this the right definition? Because when we expand $v = \sum_j \lambda_j b_j$ and substitute each $b_j$ in terms of the $c_i$, the matrix $P_{\mathcal{B} \to \mathcal{C}}$ will act on the column vector $(\lambda_1, \dots, \lambda_n)^\top = [v]_\mathcal{B}$ by matrix multiplication and produce $[v]_\mathcal{C}$. We verify this in the next step.

Why is $P_{\mathcal{B} \to \mathcal{C}}$ invertible? Since $\mathcal{B}$ is a basis, $b_1, \dots, b_n$ are linearly independent. The coordinate map $v \mapsto [v]_\mathcal{C}$ is a linear isomorphism from $V$ to $\mathbb{F}^n$ (by [Unique Representation By A Basis](/theorems/372), every vector has a unique coordinate representation). An isomorphism preserves linear independence, so the columns $[b_1]_\mathcal{C}, \dots, [b_n]_\mathcal{C}$ of $P_{\mathcal{B} \to \mathcal{C}}$ are linearly independent in $\mathbb{F}^n$. A square matrix with linearly independent columns is invertible, so $P_{\mathcal{B} \to \mathcal{C}} \in \mathrm{GL}_n(\mathbb{F})$.
[/guided]
[/step]

[step:Prove the coordinate-vector formula $[v]_\mathcal{C} = P_{\mathcal{B} \to \mathcal{C}} \, [v]_\mathcal{B}$]
Let $v \in V$ with $\mathcal{B}$-coordinate vector $[v]_\mathcal{B} = (\lambda_1, \dots, \lambda_n)^\top$, so that
\begin{align*} v &= \sum_{j=1}^{n} \lambda_j \, b_j. \end{align*}
Substituting the expansion of each $b_j$ in the $\mathcal{C}$-basis:
\begin{align*} v &= \sum_{j=1}^{n} \lambda_j \sum_{i=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, c_i = \sum_{i=1}^{n} \left( \sum_{j=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, \lambda_j \right) c_i. \end{align*}
By [Unique Representation By A Basis](/theorems/372), the $\mathcal{C}$-coordinates of $v$ are uniquely determined, so
\begin{align*} ([v]_\mathcal{C})_i &= \sum_{j=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, \lambda_j = \sum_{j=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, ([v]_\mathcal{B})_j. \end{align*}
This is precisely the $i$-th component of the matrix-vector product $P_{\mathcal{B} \to \mathcal{C}} \, [v]_\mathcal{B}$. Therefore $[v]_\mathcal{C} = P_{\mathcal{B} \to \mathcal{C}} \, [v]_\mathcal{B}$.

[guided]
We now derive the coordinate-vector formula. Let $v \in V$ with $\mathcal{B}$-coordinate vector $[v]_\mathcal{B} = (\lambda_1, \dots, \lambda_n)^\top$, meaning
\begin{align*} v &= \sum_{j=1}^{n} \lambda_j \, b_j. \end{align*}
Our goal is to express $v$ as a linear combination of the $\mathcal{C}$-basis vectors and identify the coefficients. We substitute the expansion $b_j = \sum_{i=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, c_i$ from the previous step:
\begin{align*} v &= \sum_{j=1}^{n} \lambda_j \sum_{i=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, c_i. \end{align*}
We interchange the order of summation (both sums are finite, so no convergence hypothesis is needed):
\begin{align*} v &= \sum_{i=1}^{n} \left( \sum_{j=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, \lambda_j \right) c_i. \end{align*}
This expresses $v$ as a linear combination of $c_1, \dots, c_n$. By [Unique Representation By A Basis](/theorems/372), the coordinate representation in a basis is unique. Therefore the coefficient of $c_i$ is the $i$-th $\mathcal{C}$-coordinate of $v$:
\begin{align*} ([v]_\mathcal{C})_i &= \sum_{j=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, \lambda_j = \sum_{j=1}^{n} (P_{\mathcal{B} \to \mathcal{C}})_{ij} \, ([v]_\mathcal{B})_j. \end{align*}
The right-hand side is exactly the definition of the $i$-th entry of the matrix-vector product $P_{\mathcal{B} \to \mathcal{C}} \, [v]_\mathcal{B}$. Assembling all $n$ components, we conclude
\begin{align*} [v]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [v]_\mathcal{B}. \end{align*}
Note the key mechanism: each column of $P_{\mathcal{B} \to \mathcal{C}}$ contributes the $\mathcal{C}$-coordinates of one $\mathcal{B}$-basis vector, weighted by the corresponding $\mathcal{B}$-coordinate of $v$. The matrix multiplication performs exactly this weighted summation.
[/guided]
[/step]

[step:Derive the operator-matrix formula $[T]_\mathcal{C} = P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1}$]
Let $T: V \to V$ be a linear operator. The matrix $[T]_\mathcal{B}$ is defined by the property that for every $v \in V$,
\begin{align*} [T(v)]_\mathcal{B} &= [T]_\mathcal{B} \, [v]_\mathcal{B}, \end{align*}
and similarly $[T(v)]_\mathcal{C} = [T]_\mathcal{C} \, [v]_\mathcal{C}$. Applying the coordinate-vector formula from the previous step to $T(v)$, and then the defining property of $[T]_\mathcal{B}$:
\begin{align*} [T(v)]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T(v)]_\mathcal{B} = P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, [v]_\mathcal{B}. \end{align*}
Since $P_{\mathcal{B} \to \mathcal{C}}$ is invertible, the coordinate-vector formula applied to $v$ gives $[v]_\mathcal{B} = (P_{\mathcal{B} \to \mathcal{C}})^{-1} [v]_\mathcal{C}$. Substituting:
\begin{align*} [T(v)]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1} \, [v]_\mathcal{C}. \end{align*}
This holds for every $v \in V$. By [Unique Representation By A Basis](/theorems/372), the coordinate map $v \mapsto [v]_\mathcal{C}$ is a bijection from $V$ onto $\mathbb{F}^n$, so $[v]_\mathcal{C}$ ranges over all of $\mathbb{F}^n$ as $v$ ranges over $V$. Since the matrix $[T]_\mathcal{C}$ is the unique matrix satisfying $[T(v)]_\mathcal{C} = [T]_\mathcal{C} \, [v]_\mathcal{C}$ for all $v \in V$, we conclude
\begin{align*} [T]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1}. \end{align*}

[guided]
Let $T: V \to V$ be a linear operator. Recall that the matrix of $T$ in a given basis is defined by how $T$ acts on coordinate vectors: the matrix $[T]_\mathcal{B}$ satisfies $[T(v)]_\mathcal{B} = [T]_\mathcal{B} \, [v]_\mathcal{B}$ for every $v \in V$, and likewise $[T(v)]_\mathcal{C} = [T]_\mathcal{C} \, [v]_\mathcal{C}$.

The strategy is to express $[T(v)]_\mathcal{C}$ in two ways and equate them. On one hand, by the definition of $[T]_\mathcal{C}$:
\begin{align*} [T(v)]_\mathcal{C} &= [T]_\mathcal{C} \, [v]_\mathcal{C}. \end{align*}
On the other hand, we can compute $[T(v)]_\mathcal{C}$ by first working in $\mathcal{B}$-coordinates and then converting. Applying the coordinate-vector formula (proved in the previous step) to the vector $T(v)$:
\begin{align*} [T(v)]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T(v)]_\mathcal{B}. \end{align*}
Now we use the definition of $[T]_\mathcal{B}$ to replace $[T(v)]_\mathcal{B}$:
\begin{align*} [T(v)]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, [v]_\mathcal{B}. \end{align*}
We still have $[v]_\mathcal{B}$ on the right, but we want everything in terms of $[v]_\mathcal{C}$. Since $P_{\mathcal{B} \to \mathcal{C}}$ is invertible (established in the first step), the coordinate-vector formula $[v]_\mathcal{C} = P_{\mathcal{B} \to \mathcal{C}} \, [v]_\mathcal{B}$ can be inverted to give $[v]_\mathcal{B} = (P_{\mathcal{B} \to \mathcal{C}})^{-1} \, [v]_\mathcal{C}$. Substituting:
\begin{align*} [T(v)]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1} \, [v]_\mathcal{C}. \end{align*}
Comparing with $[T(v)]_\mathcal{C} = [T]_\mathcal{C} \, [v]_\mathcal{C}$, we see that the matrices $[T]_\mathcal{C}$ and $P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1}$ agree when applied to every vector $[v]_\mathcal{C} \in \mathbb{F}^n$.

Why does $[v]_\mathcal{C}$ range over all of $\mathbb{F}^n$? By [Unique Representation By A Basis](/theorems/372), every vector in $V$ has a unique coordinate representation, and every column $(\mu_1, \dots, \mu_n)^\top \in \mathbb{F}^n$ is the coordinate vector of $\sum_i \mu_i c_i$, so the coordinate map $v \mapsto [v]_\mathcal{C}$ is a bijection from $V$ to $\mathbb{F}^n$. Two matrices that agree on every vector in $\mathbb{F}^n$ must be equal (take $[v]_\mathcal{C} = e_k$, the $k$-th standard basis vector, to see that the $k$-th columns coincide for each $k$). Therefore:
\begin{align*} [T]_\mathcal{C} &= P_{\mathcal{B} \to \mathcal{C}} \, [T]_\mathcal{B} \, (P_{\mathcal{B} \to \mathcal{C}})^{-1}. \end{align*}
This is the similarity transformation relating the two matrix representations of the same linear operator. The formula has a clean interpretation: to apply $T$ in $\mathcal{C}$-coordinates, first convert from $\mathcal{C}$ to $\mathcal{B}$ (via $(P_{\mathcal{B} \to \mathcal{C}})^{-1}$), apply $[T]_\mathcal{B}$ in $\mathcal{B}$-coordinates, then convert the result back from $\mathcal{B}$ to $\mathcal{C}$ (via $P_{\mathcal{B} \to \mathcal{C}}$).
[/guided]
[/step]
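
As a sanity check, both formulas can be worked through on a small case. The particular bases, vector, and operator here are illustrative assumptions and are not taken from the theorem page: take $n = 2$, let $\mathcal{C} = (c_1, c_2)$ be any basis of a two-dimensional space $V$, and set $b_1 = c_1 + c_2$, $b_2 = c_1 + 2c_2$. Reading off the columns $[b_1]_\mathcal{C}$ and $[b_2]_\mathcal{C}$ gives
\begin{align*} P_{\mathcal{B} \to \mathcal{C}} &= \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}, & (P_{\mathcal{B} \to \mathcal{C}})^{-1} &= \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}. \end{align*}
For the coordinate-vector formula, suppose $[v]_\mathcal{B} = (3, -1)^\top$. Expanding directly, $v = 3b_1 - b_2 = 3(c_1 + c_2) - (c_1 + 2c_2) = 2c_1 + c_2$, and the matrix product agrees:
\begin{align*} P_{\mathcal{B} \to \mathcal{C}} \, [v]_\mathcal{B} &= \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 3 \\ -1 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix} = [v]_\mathcal{C}. \end{align*}
For the operator-matrix formula, suppose $T$ is the operator with $T(b_1) = 2b_1$ and $T(b_2) = 3b_2$, so that $[T]_\mathcal{B} = \operatorname{diag}(2, 3)$. Then
\begin{align*} [T]_\mathcal{C} &= \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 3 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ -2 & 4 \end{pmatrix}. \end{align*}
The first column can be confirmed by hand: $c_1 = 2b_1 - b_2$, so $T(c_1) = 4b_1 - 3b_2 = 4(c_1 + c_2) - 3(c_1 + 2c_2) = c_1 - 2c_2$, which matches $(1, -2)^\top$. The trace ($5$) and determinant ($6$) of $[T]_\mathcal{C}$ also equal those of $[T]_\mathcal{B}$, as expected of similar matrices.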

Change of Basis Transformation (Theorem # 3275)
