Change Of Basis

Also known as: change of coordinates, coordinate change, basis transformation, transition matrix, change-of-basis

A vector does not change when we rename the measuring rods around it, but its list of coordinates can change completely. The vector $x = 2e_1 + e_2$ in a plane is not a different vector if we measure it with the slanted basis $(e_1+e_2,e_2)$; only the coordinate record changes. Change of basis is the algebra that separates the vector from the coordinates used to describe it.

The problem becomes urgent as soon as a [linear map](/page/Linear%20Map) is present. A map may have a complicated matrix in one basis and a diagonal matrix in another; a quadratic form may contain a cross term in one basis and split into squares after a suitable coordinate choice; a [Lie algebra](/page/Lie%20Algebra) representation may become understandable only after the right basis exposes invariant pieces. Each setting has its own compatibility conditions, but the shared lesson is the same: the mathematical object is basis-independent, while computation happens through coordinates.

[example: The Same Vector in Two Bases]
Let $V=\mathbb{R}^2$ with the standard basis $\mathcal E=(e_1,e_2)$, and let $\mathcal B=(b_1,b_2)$ where $b_1=e_1+e_2$ and $b_2=e_2$. For $x=2e_1+e_2$, we compute its $\mathcal B$-coordinates by finding $a_1,a_2\in\mathbb{R}$ such that \begin{align*} x=a_1b_1+a_2b_2. \end{align*} Substituting the definitions of $b_1$ and $b_2$ gives \begin{align*} a_1b_1+a_2b_2=a_1(e_1+e_2)+a_2e_2. \end{align*} Distributing $a_1$ and collecting the $e_2$ terms gives \begin{align*} a_1(e_1+e_2)+a_2e_2=a_1e_1+a_1e_2+a_2e_2=a_1e_1+(a_1+a_2)e_2. \end{align*} Since $x=2e_1+e_2$, equality with $a_1e_1+(a_1+a_2)e_2$ forces equality of the coefficients of the ordered basis $\mathcal E$: \begin{align*} a_1=2,\qquad a_1+a_2=1. \end{align*} Substituting $a_1=2$ into the second equation gives \begin{align*} 2+a_2=1. \end{align*} Subtracting $2$ from both sides gives \begin{align*} a_2=-1. \end{align*} Therefore \begin{align*} x=2b_1-b_2. 
\end{align*} In standard coordinates, the same vector is $[x]_{\mathcal E}=(2,1)^\top$, while in the basis $\mathcal B$ it is $[x]_{\mathcal B}=(2,-1)^\top$. The vector has not moved; only the coordinate column has changed because the basis vectors used to reconstruct it have changed.
[/example]

This example already shows the central tension. Coordinates are indispensable for calculation, but they are not the object itself. Change of basis gives a controlled way to translate between coordinate languages without changing the underlying linear algebra.

## Definition

Change of basis is not merely a matrix trick. A change of basis is the act of keeping the [vector space](/page/Vector%20Space) fixed while replacing the coordinate system used to describe its elements. The definition must therefore mention both sides: the old ordered basis whose coordinates we start with, and the new ordered basis whose coordinates we want.

[definition: Change of Basis]
Let $V$ be a finite-dimensional vector space over a field $k$. A change of basis on $V$ is a passage from one ordered basis $\mathcal B=(b_1,\ldots,b_n)$ of $V$ to another ordered basis $\mathcal C=(c_1,\ldots,c_n)$ of $V$.
[/definition]

This definition names the mathematical situation, but the word "ordered" is doing real work. A basis alone is not enough for coordinates, because coordinates are an ordered list. Swapping two basis vectors swaps two entries in every coordinate column. To make the phrase "the first coordinate" meaningful, we first record the basis vectors in a definite order.

[definition: Ordered Basis]
Let $V$ be a vector space over a field $k$. An ordered basis of $V$ is a finite tuple $\mathcal B = (v_1,\ldots,v_n)$ of distinct vectors such that the set $\{v_1,\ldots,v_n\}$ is a basis of $V$.
[/definition]

An ordered basis gives names to the slots in a coordinate column, but it does not yet say how a vector fills those slots.
The next concept answers the reconstruction question: which scalars multiply the basis vectors to produce a given vector? This is the data that will later be converted from one basis to another.

[definition: Coordinate Vector]
Let $V$ be an $n$-dimensional vector space over a field $k$, and let $\mathcal B=(v_1,\ldots,v_n)$ be an ordered basis of $V$. For $x \in V$, the coordinate vector of $x$ with respect to $\mathcal B$ is the column vector $[x]_{\mathcal B} = (a_1,\ldots,a_n)^\top \in k^n$ defined by \begin{align*} x &= \sum_{i=1}^n a_i v_i. \end{align*}
[/definition]

The coordinate vector depends on $\mathcal B$, but the assignment $x\mapsto [x]_{\mathcal B}$ is systematic. It sends addition and scalar multiplication in $V$ to addition and scalar multiplication in $k^n$. The next definition packages the whole coordinate system as a single linear isomorphism, which is the cleanest way to compare two bases.

[definition: Coordinate Isomorphism]
Let $V$ be an $n$-dimensional vector space over a field $k$, and let $\mathcal B=(v_1,\ldots,v_n)$ be an ordered basis of $V$. The coordinate isomorphism determined by $\mathcal B$ is the map \begin{align*} \Phi_{\mathcal B}: V &\to k^n \end{align*} defined by \begin{align*} \Phi_{\mathcal B}(x) &= [x]_{\mathcal B}. \end{align*}
[/definition]

Now suppose two ordered bases describe the same vector space. A vector can be decoded from its $\mathcal B$-coordinates and then encoded again using $\mathcal C$-coordinates. The resulting operation is the abstract change of basis before any matrix entries are written down.

[definition: Change of Basis Map]
Let $V$ be an $n$-dimensional vector space over a field $k$, and let $\mathcal B$ and $\mathcal C$ be ordered bases of $V$. 
The change of basis map from $\mathcal B$-coordinates to $\mathcal C$-coordinates is the linear map \begin{align*} \Phi_{\mathcal C}\circ \Phi_{\mathcal B}^{-1}: k^n &\to k^n \end{align*} defined by \begin{align*} (\Phi_{\mathcal C}\circ \Phi_{\mathcal B}^{-1})(u) &= \Phi_{\mathcal C}(\Phi_{\mathcal B}^{-1}(u)). \end{align*}
[/definition]

Computations need the matrix of this map. The matrix is not a new object acting on $V$ itself; it acts on coordinate columns. Naming it separately keeps the vector space, the coordinate spaces, and the conversion operation from being conflated.

[definition: Change of Basis Matrix]
Let $V$ be an $n$-dimensional vector space over a field $k$, and let $\mathcal B$ and $\mathcal C$ be ordered bases of $V$. The change of basis matrix from $\mathcal B$ to $\mathcal C$ is the matrix $P_{\mathcal C \leftarrow \mathcal B} \in k^{n\times n}$ representing the map $\Phi_{\mathcal C}\circ \Phi_{\mathcal B}^{-1}$ with respect to the standard basis of $k^n$.
[/definition]

The notation $P_{\mathcal C \leftarrow \mathcal B}$ should be read from right to left: it starts with $\mathcal B$-coordinates and returns $\mathcal C$-coordinates. Thus $[x]_{\mathcal C} = P_{\mathcal C \leftarrow \mathcal B}[x]_{\mathcal B}$. To compute the matrix in practice, we need a column rule that turns the definition into a finite list of coordinate calculations.

[quotetheorem:8323]

This column rule is the practical recipe: write each old basis vector in the new basis, and assemble the resulting coordinate columns. The rest of the chapter repeatedly uses this single idea in increasingly structured settings.

## Coordinate Translation

The first skill is not diagonalising a matrix or simplifying an operator; it is translating one coordinate column into another without losing track of which basis is being used. The key point is that a coordinate column in $k^n$ does not carry its basis label inside itself.
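
The column rule described above can be sketched in a few lines of exact arithmetic. This is only an illustration under stated assumptions: the helper names `solve`, `change_of_basis`, and `apply` are hypothetical, basis vectors are given by their standard coordinates, and the bases are assumed to be linearly independent.

```python
from fractions import Fraction


def solve(columns, target):
    """Find scalars a_j with sum_j a_j * columns[j] == target (Gauss-Jordan).

    Assumes `columns` is a list of n linearly independent vectors of length n.
    """
    n = len(columns)
    # Row i holds the i-th entry of every basis vector, followed by target[i].
    rows = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(target[i])]
            for i in range(n)]
    for col in range(n):
        # Bring a row with a nonzero pivot into position.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [entry / scale for entry in rows[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


def change_of_basis(old, new):
    """Return P_{new <- old} as a list of rows, built by the column rule."""
    n = len(old)
    cols = [solve(new, b) for b in old]  # j-th column is [old_j]_new
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def apply(matrix, column):
    """Multiply a matrix (list of rows) by a coordinate column."""
    return [sum(a * x for a, x in zip(row, column)) for row in matrix]


E = [[1, 0], [0, 1]]  # standard basis (e1, e2)
B = [[1, 1], [0, 1]]  # b1 = e1 + e2, b2 = e2, in standard coordinates

P_B_from_E = change_of_basis(E, B)
x_E = [2, 1]                      # x = 2 e1 + e2
x_B = apply(P_B_from_E, x_E)      # coordinates of the same x in the basis B
print(x_B)
```

The variable name `P_B_from_E` mirrors the subscript $\mathcal B \leftarrow \mathcal E$, so the expected input language is visible at the call site. The resulting column agrees with the coordinates $2$ and $-1$ found by hand in the opening example.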
### Direction of Conversion

The arrow in $P_{\mathcal C \leftarrow \mathcal B}$ prevents a common reversal error. The matrix is built from old basis vectors written in the new basis, because an input coordinate column tells us how to combine old basis vectors. A concrete calculation makes the direction visible.

[example: Computing a Change of Basis Matrix]
Let $V=\mathbb{R}^2$, let $\mathcal E=(e_1,e_2)$, and let $\mathcal B=(b_1,b_2)$ with $b_1=e_1+e_2$ and $b_2=e_1-e_2$. To build $P_{\mathcal E \leftarrow \mathcal B}$, write each vector of $\mathcal B$ in the basis $\mathcal E$. Since \begin{align*} b_1=e_1+e_2=1e_1+1e_2 \end{align*} we have $[b_1]_{\mathcal E}=(1,1)^\top$. Similarly, \begin{align*} b_2=e_1-e_2=1e_1+(-1)e_2 \end{align*} so $[b_2]_{\mathcal E}=(1,-1)^\top$. Therefore the change-of-basis matrix from $\mathcal B$-coordinates to $\mathcal E$-coordinates is \begin{align*} P_{\mathcal E \leftarrow \mathcal B}=\begin{pmatrix}1&1\cr 1&-1\end{pmatrix}. \end{align*} Now let $x=3b_1+2b_2$, so $[x]_{\mathcal B}=(3,2)^\top$. Multiplying by the conversion matrix gives \begin{align*} P_{\mathcal E \leftarrow \mathcal B}[x]_{\mathcal B}=\begin{pmatrix}1&1\cr 1&-1\end{pmatrix}\begin{pmatrix}3\cr 2\end{pmatrix}. \end{align*} The first entry is $1\cdot 3+1\cdot 2=5$, and the second entry is $1\cdot 3+(-1)\cdot 2=1$, so \begin{align*} P_{\mathcal E \leftarrow \mathcal B}[x]_{\mathcal B}=\begin{pmatrix}5\cr 1\end{pmatrix}. \end{align*} Thus $[x]_{\mathcal E}=(5,1)^\top$, meaning $x=5e_1+e_2$. The same vector has coordinate column $(3,2)^\top$ in the basis $\mathcal B$ and coordinate column $(5,1)^\top$ in the basis $\mathcal E$.
[/example]

The computation gives one direction of conversion, but coordinate translation must also support reversal. If a conversion matrix did not have an inverse, two different old coordinate columns would collapse to the same new coordinate column, which would mean two different vectors had become indistinguishable.
The next theorem says this cannot happen for bases of the same space.

[quotetheorem:8324]

This result is often the fastest way to compute the reverse matrix. If the old basis vectors are easy to write in the new basis, compute that direction first and invert only when needed.

### Composition of Coordinate Changes

In the theorem cards below, $\mathrm{GL}_n(\mathbb F)$ denotes the invertible $n\times n$ matrices with entries in the field $\mathbb F$, $\mathrm{Mat}_n(\mathbb F)$ denotes all $n\times n$ matrices over $\mathbb F$, and $\mathrm{End}(V)$ denotes the linear maps from a vector space $V$ to itself.

Sometimes there are three natural bases in play: a standard basis for computation, an eigenbasis for a map, and a problem-specific basis chosen by geometry. Passing through an intermediate basis should give the same answer as translating directly. For bases $\mathcal A,\mathcal B,\mathcal C$ of the same vector space, the coordinate-change matrices compose as $P_{\mathcal C\leftarrow \mathcal A}=P_{\mathcal C\leftarrow \mathcal B}P_{\mathcal B\leftarrow \mathcal A}$. The order of multiplication reflects the order in which coordinates are converted. First translate $\mathcal A$-coordinates to $\mathcal B$-coordinates; then translate $\mathcal B$-coordinates to $\mathcal C$-coordinates.

[example: A Three-Basis Translation]
Let $V=\mathbb{R}^2$, let $\mathcal A=(a_1,a_2)$ with $a_1=e_1+e_2$ and $a_2=e_2$, let $\mathcal B=(e_1,e_2)$, and let $\mathcal C=(c_1,c_2)$ with $c_1=e_1-e_2$ and $c_2=e_2$. For an $\mathcal A$-coordinate column $(u_1,u_2)^\top$, the represented vector is \begin{align*} u_1a_1+u_2a_2=u_1(e_1+e_2)+u_2e_2=u_1e_1+(u_1+u_2)e_2. \end{align*} Therefore \begin{align*} P_{\mathcal B \leftarrow \mathcal A}(u_1,u_2)^\top=(u_1,u_1+u_2)^\top. \end{align*} Now convert from $\mathcal B$-coordinates to $\mathcal C$-coordinates. 
If a vector has standard coordinates $(v_1,v_2)^\top$, then we look for scalars $\alpha,\beta$ such that \begin{align*} v_1e_1+v_2e_2=\alpha c_1+\beta c_2. \end{align*} Substituting $c_1=e_1-e_2$ and $c_2=e_2$ gives \begin{align*} \alpha c_1+\beta c_2=\alpha(e_1-e_2)+\beta e_2=\alpha e_1+(-\alpha+\beta)e_2. \end{align*} Matching coefficients of $e_1$ and $e_2$ gives $\alpha=v_1$ and $-\alpha+\beta=v_2$, so $\beta=v_1+v_2$. Hence \begin{align*} P_{\mathcal C \leftarrow \mathcal B}(v_1,v_2)^\top=(v_1,v_1+v_2)^\top. \end{align*} Starting from $[x]_{\mathcal A}=(2,3)^\top$, the first conversion gives \begin{align*} [x]_{\mathcal B}=P_{\mathcal B \leftarrow \mathcal A}(2,3)^\top=(2,2+3)^\top=(2,5)^\top. \end{align*} The second conversion gives \begin{align*} [x]_{\mathcal C}=P_{\mathcal C \leftarrow \mathcal B}(2,5)^\top=(2,2+5)^\top=(2,7)^\top. \end{align*} The direct conversion from $\mathcal A$ to $\mathcal C$ is the composition of these two coordinate rules: \begin{align*} (P_{\mathcal C \leftarrow \mathcal B}P_{\mathcal B \leftarrow \mathcal A})(u_1,u_2)^\top=P_{\mathcal C \leftarrow \mathcal B}(u_1,u_1+u_2)^\top=(u_1,u_1+(u_1+u_2))^\top=(u_1,2u_1+u_2)^\top. \end{align*} Applying this direct rule to $(2,3)^\top$ gives \begin{align*} (P_{\mathcal C \leftarrow \mathcal B}P_{\mathcal B \leftarrow \mathcal A})(2,3)^\top=(2,2\cdot 2+3)^\top=(2,7)^\top. \end{align*} Thus converting through the intermediate basis $\mathcal B$ gives the same $\mathcal C$-coordinate column as the direct conversion from $\mathcal A$ to $\mathcal C$.
[/example]

The computation shows why the order of the factors matters: the two factors have different jobs, and reversing them changes the coordinate language expected at the input. This remains true even when, as in this example, the two matrices happen to have identical entries and so commute numerically.

## Matrices of Linear Maps

A linear map $T:V\to W$ is basis-independent, but its matrix requires a basis in the domain and a basis in the codomain. Change of basis tells us how the matrix changes when either coordinate system is replaced.
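
Since the rest of this section multiplies conversion matrices together, it may help to replay the three-basis translation from the previous section with explicit matrices. The sketch below is only a check of that example: the helper names `matmul` and `apply` are hypothetical, and the two matrices are read off from the coordinate rules $(u_1,u_2)^\top\mapsto(u_1,u_1+u_2)^\top$ and $(v_1,v_2)^\top\mapsto(v_1,v_1+v_2)^\top$ derived there.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(matrix, column):
    """Multiply a matrix (list of rows) by a coordinate column."""
    return [sum(a * x for a, x in zip(row, column)) for row in matrix]


# Matrices of the two coordinate rules from the three-basis example.
P_B_from_A = [[1, 0], [1, 1]]
P_C_from_B = [[1, 0], [1, 1]]

# The right-hand factor acts first: A-coordinates -> B-coordinates -> C-coordinates.
P_C_from_A = matmul(P_C_from_B, P_B_from_A)

x_A = [2, 3]
two_steps = apply(P_C_from_B, apply(P_B_from_A, x_A))
one_step = apply(P_C_from_A, x_A)
print(P_C_from_A, two_steps, one_step)
```

Both routes give the column $(2,7)^\top$ computed by hand, and the product matrix encodes the direct rule $(u_1,u_2)^\top\mapsto(u_1,2u_1+u_2)^\top$.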
### Matrix Representation

Before changing bases for a map, we need to say what a matrix of a map means. The columns come from images of domain basis vectors, written in the codomain basis. This definition is the place where a linear transformation becomes a rectangular array of scalars.

[definition: Matrix of a Linear Map]
Let $V$ and $W$ be finite-dimensional vector spaces over a field $k$, let $\mathcal B=(v_1,\ldots,v_n)$ be an ordered basis of $V$, and let $\mathcal D=(w_1,\ldots,w_m)$ be an ordered basis of $W$. For a linear map $T:V\to W$, the matrix of $T$ with respect to the domain basis $\mathcal B$ and codomain basis $\mathcal D$ is the matrix $[T]_{\mathcal D \leftarrow \mathcal B}\in k^{m\times n}$ defined by \begin{align*} [T(x)]_{\mathcal D} &= [T]_{\mathcal D \leftarrow \mathcal B}[x]_{\mathcal B} \end{align*} for every $x\in V$.
[/definition]

The notation records both coordinate systems because changing either one changes the matrix. To compare an old matrix with a new one, we must convert the input coordinates into the old domain basis, apply the old matrix, and then convert the output coordinates into the new codomain basis. The next theorem is exactly this bookkeeping written as a formula.

[quotetheorem:387]

Read the formula by following a vector through the coordinate systems. Start with $\mathcal C$-coordinates, convert them to $\mathcal B$-coordinates, apply the old matrix of $T$, and then convert the output from $\mathcal D$-coordinates to $\mathcal E$-coordinates.

### Endomorphisms and Similarity

When $T:V\to V$ has the same vector space as domain and codomain, it is common to use the same basis on both sides. In that case the change-of-basis formula becomes the similarity relation for square matrices. We isolate similarity because it is the [equivalence relation](/page/Equivalence%20Relation) that means "same linear operator, different basis."

[definition: Similar Matrices]
Let $A,B\in k^{n\times n}$. 
The matrices $A$ and $B$ are similar if there exists an invertible matrix $P\in k^{n\times n}$ such that \begin{align*} B &= P^{-1}AP. \end{align*}
[/definition]

Similarity is not arbitrary matrix manipulation. The issue is that a matrix by itself does not remember which ordered basis was used to encode the operator. To compare two square matrices as possible descriptions of the same endomorphism, the change-of-basis matrix must convert coordinates before and after applying the operator, so the two matrix formulas agree on every vector.

[quotetheorem:400]

The inverse placement depends on the convention for $P$. If $P=P_{\mathcal B \leftarrow \mathcal C}$, then the same formula reads $[T]_{\mathcal C \leftarrow \mathcal C}=P^{-1}[T]_{\mathcal B \leftarrow \mathcal B}P$. The mathematics is unchanged; the subscripts make the direction explicit.

[example: Diagonal Form from an Eigenbasis]
Let $T:\mathbb{R}^2\to\mathbb{R}^2$ be defined in standard coordinates by $T(u_1,u_2)^\top=(2u_1+u_2,3u_2)^\top$, and let $v_1=e_1$ and $v_2=e_1+e_2$. First, $\mathcal B=(v_1,v_2)$ is a basis: if $a v_1+b v_2=0$, then \begin{align*} a e_1+b(e_1+e_2)=(a+b)e_1+b e_2=0e_1+0e_2. \end{align*} Equality of coefficients in the standard basis gives $b=0$ and then $a+b=0$, so $a=0$. We now compute the action of $T$ on the basis vectors. Since $v_1=e_1=(1,0)^\top$, \begin{align*} T(v_1)=T(1,0)^\top=(2\cdot 1+0,3\cdot 0)^\top=(2,0)^\top=2e_1=2v_1. \end{align*} Since $v_2=e_1+e_2=(1,1)^\top$, \begin{align*} T(v_2)=T(1,1)^\top=(2\cdot 1+1,3\cdot 1)^\top=(3,3)^\top=3(e_1+e_2)=3v_2. \end{align*} For a vector with $\mathcal B$-coordinates $(a_1,a_2)^\top$, the represented vector is $a_1v_1+a_2v_2$. By linearity of $T$, \begin{align*} T(a_1v_1+a_2v_2)=a_1T(v_1)+a_2T(v_2). \end{align*} Substituting the two eigenvector computations gives \begin{align*} a_1T(v_1)+a_2T(v_2)=a_1(2v_1)+a_2(3v_2)=2a_1v_1+3a_2v_2. 
\end{align*} Therefore the $\mathcal B$-coordinate column of the output is \begin{align*} [T(a_1v_1+a_2v_2)]_{\mathcal B}=(2a_1,3a_2)^\top. \end{align*} Thus $[T]_{\mathcal B \leftarrow \mathcal B}$ sends $(a_1,a_2)^\top$ to $(2a_1,3a_2)^\top$, so it is the diagonal matrix with diagonal entries $2$ and $3$. In the eigenbasis, the two coordinate directions are the eigenlines, and $T$ acts on them independently by the scalars $2$ and $3$.
[/example]

Diagonalisation is the most visible use of change of basis, but the same bookkeeping controls many basis-dependent matrix representations: rotations, projections, inclusions, quotient maps, bilinear forms, and representations of algebras. The precise transformation law depends on the type of object being represented.

## Invariants Under Change of Basis

A change of basis should not alter intrinsic information about a linear map. It may change every entry of the matrix, but quantities attached to the underlying operator must survive. This section separates coordinate artefacts from basis-independent structure.

### Rank and Nullity

The rank and kernel of a linear map are defined without choosing bases. Their matrix descriptions must therefore be stable under change of basis. Since a change of basis multiplies a matrix by invertible matrices on the left and right, it should not change the dimension of the image.

[quotetheorem:8327]

The rank theorem shows that some matrix data survives coordinate translation. Eigenvalue theory asks for a subtler invariant: a polynomial whose roots record possible scalar actions on invariant directions. That polynomial is defined from a matrix, but the surrounding goal is to use it as information about the underlying operator.

[definition: Characteristic Polynomial]
Let $A\in k^{n\times n}$, and write $k[\lambda]$ for the [polynomial ring](/page/Polynomial%20Ring) over $k$ in the variable $\lambda$. 
The [characteristic polynomial](/page/Characteristic%20Polynomial) of $A$ is \begin{align*} \chi_A(\lambda) &= \det(\lambda I_n-A) \in k[\lambda]. \end{align*}
[/definition]

The characteristic polynomial is defined from a matrix, but for a linear operator it should be independent of the basis used to compute it. Similarity is the algebraic test for this independence. The next theorem supplies the determinant identity that makes the basis-independent language legitimate.

[quotetheorem:402]

This theorem justifies speaking of the characteristic polynomial of a linear operator $T:V\to V$. Choose a basis, compute the matrix, compute the determinant, and the resulting polynomial does not depend on that choice.

### Trace and Determinant

Trace and determinant are matrix expressions, but individual entries can change dramatically under a change of basis. The useful question is which combinations of entries remain unchanged when the matrix is replaced by a similar one. Trace and determinant pass this test, so they can be attached to the operator rather than to a particular coordinate description.

[quotetheorem:401]

The theorem does not say that diagonal entries or individual matrix entries are invariant. It says that certain combinations of entries survive every change of basis.

[example: Entries Change While Trace and Determinant Stay Fixed]
Let $A:\mathbb{R}^2\to\mathbb{R}^2$ be the linear map whose standard-coordinate rule is \begin{align*} A(u_1,u_2)^\top=(u_1+2u_2,4u_2)^\top. \end{align*} Thus, in the standard basis $\mathcal E=(e_1,e_2)$, the first column is $A(e_1)=(1,0)^\top$ and the second column is $A(e_2)=(2,4)^\top$, so \begin{align*} [A]_{\mathcal E\leftarrow \mathcal E}=\begin{pmatrix}1&2\cr 0&4\end{pmatrix}. \end{align*} Now change to the basis $\mathcal B=(b_1,b_2)$ where $b_1=e_1$ and $b_2=e_1+e_2$. A vector with $\mathcal B$-coordinates $(a_1,a_2)^\top$ is \begin{align*} a_1b_1+a_2b_2=a_1e_1+a_2(e_1+e_2)=(a_1+a_2)e_1+a_2e_2. 
\end{align*} Therefore its standard coordinates are $(a_1+a_2,a_2)^\top$. Applying $A$ gives \begin{align*} A(a_1+a_2,a_2)^\top=((a_1+a_2)+2a_2,4a_2)^\top=(a_1+3a_2,4a_2)^\top. \end{align*} To rewrite this output in the basis $\mathcal B$, solve \begin{align*} c_1b_1+c_2b_2=c_1e_1+c_2(e_1+e_2)=(c_1+c_2)e_1+c_2e_2. \end{align*} Matching this with $(a_1+3a_2)e_1+4a_2e_2$ gives \begin{align*} c_2=4a_2,\qquad c_1+c_2=a_1+3a_2. \end{align*} Substituting $c_2=4a_2$ into the second equation gives $c_1+4a_2=a_1+3a_2$, hence $c_1=a_1-a_2$. Thus \begin{align*} [A]_{\mathcal B\leftarrow \mathcal B}(a_1,a_2)^\top=(a_1-a_2,4a_2)^\top. \end{align*} So \begin{align*} [A]_{\mathcal B\leftarrow \mathcal B}=\begin{pmatrix}1&-1\cr 0&4\end{pmatrix}. \end{align*} The off-diagonal entry in the first row and second column has changed from $2$ in the standard basis to $-1$ in the basis $\mathcal B$. However, \begin{align*} \operatorname{tr}\begin{pmatrix}1&2\cr 0&4\end{pmatrix}=1+4=5 \end{align*} and \begin{align*} \det\begin{pmatrix}1&2\cr 0&4\end{pmatrix}=1\cdot 4-2\cdot 0=4. \end{align*} For the new matrix, \begin{align*} \operatorname{tr}\begin{pmatrix}1&-1\cr 0&4\end{pmatrix}=1+4=5 \end{align*} and \begin{align*} \det\begin{pmatrix}1&-1\cr 0&4\end{pmatrix}=1\cdot 4-(-1)\cdot 0=4. \end{align*} The individual off-diagonal entry depends on the chosen basis, while the trace and determinant remain fixed for this change of coordinates.
[/example]

The danger is to assign meaning to a matrix entry without specifying the basis. Entries can be engineered by a change of basis, while invariant quantities cannot.

## Bases Adapted to Structure

Change of basis is not only a translation device. It is also a strategy for choosing coordinates that match the structure of a problem. A good basis makes hidden decomposition visible.
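
The two plane examples above happen to use the same basis $(e_1,e_1+e_2)$, so both can be checked with one small conjugation routine. The sketch below assumes the convention $P=P_{\mathcal E\leftarrow\mathcal B}$, whose columns are the new basis vectors in standard coordinates, so that the new matrix is $P^{-1}AP$. The helper names (`matmul`, `inverse_2x2`, `in_new_basis`, `trace`, `det`) are hypothetical and restricted to the $2\times 2$ case.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix; raises if the matrix is singular."""
    (a, b), (c, d) = M
    determinant = Fraction(a * d - b * c)
    if determinant == 0:
        raise ValueError("matrix is not invertible")
    return [[d / determinant, -b / determinant],
            [-c / determinant, a / determinant]]


def in_new_basis(A, P):
    """Matrix of the same operator in the basis given by the columns of P."""
    return matmul(inverse_2x2(P), matmul(A, P))


def trace(M):
    return M[0][0] + M[1][1]


def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


P = [[1, 1], [0, 1]]   # columns: e1 and e1 + e2 in standard coordinates

T = [[2, 1], [0, 3]]   # standard matrix of T(u1, u2) = (2 u1 + u2, 3 u2)
A = [[1, 2], [0, 4]]   # standard matrix of A(u1, u2) = (u1 + 2 u2, 4 u2)

T_new = in_new_basis(T, P)   # diagonal, because the basis is an eigenbasis for T
A_new = in_new_basis(A, P)   # entries change, trace and determinant do not
print(T_new, A_new)
print(trace(A), trace(A_new), det(A), det(A_new))
```

For $T$ the basis exposes the diagonal form with entries $2$ and $3$; for $A$ only the off-diagonal entry moves, while the trace $5$ and determinant $4$ are unchanged.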
### Eigenbases

If a linear map has enough eigenvectors, then the basis formed from those eigenvectors makes the action of the map as simple as possible: each basis direction is merely scaled. This special kind of basis is the coordinate form of decomposing the vector space into invariant lines. Naming it lets us state diagonalisation without tying it to a particular matrix.

[definition: Eigenbasis]
Let $V$ be a finite-dimensional vector space over a field $k$, and let $T:V\to V$ be linear. An eigenbasis for $T$ is an ordered basis $\mathcal B=(v_1,\ldots,v_n)$ of $V$ such that each $v_i$ is an eigenvector of $T$.
[/definition]

An eigenbasis converts the operator matrix into a diagonal matrix. The obstruction is that diagonal coordinates require every basis direction to be preserved as a line, not merely that the characteristic polynomial have roots. Thus diagonalisation is exactly the problem of finding enough eigenvectors to form a basis of the whole space.

[quotetheorem:8325]

This theorem is the conceptual core of diagonalisation. We are not trying to make a matrix pretty; we are trying to find a basis made from invariant one-dimensional directions.

### Failure of Diagonalisation

The limitation is just as important. Some operators do not have enough eigenvectors, so no basis can make their matrices diagonal. A small example shows that this failure is structural rather than a result of poor calculation.

[example: A Matrix with Too Few Eigenvectors]
Let $T:\mathbb{R}^2\to\mathbb{R}^2$ have standard matrix $A$ defined by $A(u_1,u_2)^\top=(u_1+u_2,u_2)^\top$. In the standard basis, \begin{align*} A=\begin{pmatrix}1&1\cr 0&1\end{pmatrix}. \end{align*} If $\lambda$ is an eigenvalue, then there is a nonzero vector $x=(x_1,x_2)^\top$ such that $(A-\lambda I_2)x=0$. Here \begin{align*} A-\lambda I_2=\begin{pmatrix}1-\lambda&1\cr 0&1-\lambda\end{pmatrix}. \end{align*} The determinant is \begin{align*} \det(A-\lambda I_2)=(1-\lambda)(1-\lambda)-1\cdot 0=(1-\lambda)^2. 
\end{align*} For a nonzero solution to exist, this determinant must vanish, so \begin{align*} (1-\lambda)^2=0. \end{align*} Hence $\lambda=1$, so $1$ is the only eigenvalue. Now compute its eigenspace. Since \begin{align*} A-I_2=\begin{pmatrix}0&1\cr 0&0\end{pmatrix}, \end{align*} we have \begin{align*} (A-I_2)\begin{pmatrix}x_1\cr x_2\end{pmatrix}=\begin{pmatrix}0\cdot x_1+1\cdot x_2\cr 0\cdot x_1+0\cdot x_2\end{pmatrix}=\begin{pmatrix}x_2\cr 0\end{pmatrix}. \end{align*} Thus $(A-I_2)x=0$ is equivalent to $x_2=0$. Therefore every eigenvector has the form \begin{align*} x=\begin{pmatrix}x_1\cr 0\end{pmatrix}=x_1e_1 \end{align*} with $x_1\ne 0$, and the eigenspace is $\operatorname{span}(e_1)$. An eigenbasis of $\mathbb{R}^2$ would need two linearly independent eigenvectors. But all eigenvectors lie in the one-dimensional subspace $\operatorname{span}(e_1)$, so any two of them are scalar multiples of $e_1$ and cannot form a basis of $\mathbb{R}^2$. Therefore $T$ has no eigenbasis over $\mathbb{R}$, and no change of basis over $\mathbb{R}$ can make its matrix diagonal.
[/example]

The failure is not computational inconvenience. It is an invariant obstruction: similarity cannot create missing eigenvectors.

### Adapted Bases for Subspaces

A basis can also be chosen to respect a subspace. This is the coordinate form of passing from a subspace to a quotient or decomposing a map according to invariant pieces. The relevant basis places the chosen subspace in the first coordinate directions.

[definition: Basis Adapted to a Subspace]
Let $V$ be a finite-dimensional vector space over a field $k$, and let $U\subset V$ be a subspace. An ordered basis $\mathcal B=(v_1,\ldots,v_n)$ of $V$ is adapted to $U$ if there exists $r\in\{0,1,\ldots,n\}$ such that $(v_1,\ldots,v_r)$ is an ordered basis of $U$.
[/definition]

An adapted basis is useful only if such bases exist whenever a subspace is present.
Finite-dimensional linear algebra provides this by extending a basis of the subspace to a basis of the whole space. The next theorem is the formal existence statement behind block matrix methods.

[quotetheorem:8326]

Once a basis is adapted, vectors in $U$ are exactly those vectors whose later coordinates vanish. This is why adapted bases are useful for quotient spaces, invariant subspaces, and block triangular forms.

[example: Block Form from an Invariant Subspace]
Let $V$ be finite-dimensional over $k$, let $T:V\to V$ be linear, and let $U\subset V$ be $T$-invariant. Choose an adapted basis $\mathcal B=(u_1,\ldots,u_r,w_1,\ldots,w_s)$, so $(u_1,\ldots,u_r)$ is a basis of $U$. We show that the matrix of $T$ in this basis has a zero lower-left block. For each $i\in\{1,\ldots,r\}$, $T$-invariance gives $T(u_i)\in U$. Since $(u_1,\ldots,u_r)$ is a basis of $U$, there are scalars $a_{1i},\ldots,a_{ri}\in k$ such that \begin{align*} T(u_i)=a_{1i}u_1+\cdots+a_{ri}u_r. \end{align*} Viewed as an expansion in the full basis $\mathcal B$, this is \begin{align*} T(u_i)=a_{1i}u_1+\cdots+a_{ri}u_r+0w_1+\cdots+0w_s. \end{align*} Therefore the $i$th column of $[T]_{\mathcal B\leftarrow \mathcal B}$ is \begin{align*} [T(u_i)]_{\mathcal B}=(a_{1i},\ldots,a_{ri},0,\ldots,0)^\top. \end{align*} Thus every entry below row $r$ in the first $r$ columns is $0$. For the remaining basis vectors, write \begin{align*} T(w_\ell)=b_{1\ell}u_1+\cdots+b_{r\ell}u_r+c_{1\ell}w_1+\cdots+c_{s\ell}w_s. \end{align*} These columns are unrestricted. Hence, with $A=(a_{ji})\in k^{r\times r}$, $B=(b_{j\ell})\in k^{r\times s}$, and $C=(c_{m\ell})\in k^{s\times s}$, the matrix has block form \begin{align*} [T]_{\mathcal B\leftarrow \mathcal B}=\begin{pmatrix}A&B\cr 0&C\end{pmatrix}. \end{align*} The zero lower-left block records exactly that vectors starting in $U$ are sent back into $U$, so no $w_1,\ldots,w_s$ components appear in $T(u_1),\ldots,T(u_r)$. 
[/example]

This is the same philosophy as diagonalisation, but with a weaker structural goal. Instead of decomposing into one-dimensional invariant subspaces, we choose a basis that remembers a larger invariant subspace.

## Dual Bases and Coordinate Functionals

Coordinates can also be extracted by linear functionals. This is the dual viewpoint: a basis of $V$ determines a companion basis of $V^*$ whose elements read off coordinates. If $x=\sum_i a_i v_i$, the scalars $a_i$ are not found by inspecting the vector directly; they are produced by coordinate functionals. The [dual basis](/theorems/414) packages these functionals. This viewpoint becomes important whenever vectors are paired with linear functionals, as in bilinear forms and representation theory.

[definition: Dual Basis]
Let $V$ be an $n$-dimensional vector space over a field $k$, and let $\mathcal B=(v_1,\ldots,v_n)$ be an ordered basis of $V$. The dual basis of $\mathcal B$ is the ordered basis $\mathcal B^*=(f_1,\ldots,f_n)$ of $V^*$ such that each $f_i:V\to k$ is the linear functional defined by $f_i(v_j)=1$ when $i=j$ and $f_i(v_j)=0$ when $i\ne j$.
[/definition]

The functional $f_i$ is designed to detect the coefficient of $v_i$. The point that needs checking is that these values on basis vectors determine a linear functional on every vector, and that applying it to an arbitrary expansion $x=\sum_j a_jv_j$ really isolates the single coefficient $a_i$.

[quotetheorem:414]

Coordinate extraction is paired with coordinate conversion. If vector coordinates change by one matrix, then coordinate functionals must change in a compatible way so that evaluating a functional on a vector gives the same scalar. The next theorem identifies the forced conversion matrix on dual coordinates.

[quotetheorem:416]

The inverse transpose is not a mysterious correction factor. It is forced by the requirement that evaluating a functional on a vector gives a scalar independent of coordinates.
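
One concrete way to see where an inverse enters is to compute coordinate functionals numerically. If $P$ has the basis vectors as its columns in standard coordinates, then $P^{-1}P=I$ says that row $i$ of $P^{-1}$ pairs to $1$ with column $i$ of $P$ and to $0$ with the other columns, which is the defining condition of the dual basis. The sketch below illustrates this for the plane basis $b_1=e_1+e_2$, $b_2=e_2$; the helper names `inverse_2x2` and `evaluate` are hypothetical and limited to the $2\times 2$ case.

```python
from fractions import Fraction


def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix; raises if the matrix is singular."""
    (a, b), (c, d) = M
    determinant = Fraction(a * d - b * c)
    if determinant == 0:
        raise ValueError("matrix is not invertible")
    return [[d / determinant, -b / determinant],
            [-c / determinant, a / determinant]]


def evaluate(functional, vector):
    """Apply a functional, stored as a row of coefficients, to a vector."""
    return sum(c * x for c, x in zip(functional, vector))


b1, b2 = [1, 1], [0, 1]          # basis vectors in standard coordinates
P = [[b1[0], b2[0]],
     [b1[1], b2[1]]]             # columns of P are b1 and b2

f1, f2 = inverse_2x2(P)          # rows of the inverse are the dual functionals

# Defining conditions of the dual basis: f_i(b_j) is 1 if i == j, else 0.
pairings = [[evaluate(f, b) for b in (b1, b2)] for f in (f1, f2)]

x = [2, 1]                       # x = 2 e1 + e2
coords_in_B = [evaluate(f1, x), evaluate(f2, x)]
print(f1, f2, pairings, coords_in_B)
```

Here `coords_in_B` reproduces the column $(2,-1)^\top$ from the opening example, obtained this time by evaluating functionals rather than by solving for coefficients.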
[example: Dual Basis in the Plane]
Let $V=\mathbb{R}^2$ with $\mathcal B=(b_1,b_2)$, where $b_1=e_1+e_2$ and $b_2=e_2$. The dual basis $\mathcal B^*=(f_1,f_2)$ is determined by the conditions $f_1(b_1)=1$, $f_1(b_2)=0$, $f_2(b_1)=0$, and $f_2(b_2)=1$. Every linear functional on $\mathbb{R}^2$ has the form $x_1e_1+x_2e_2\mapsto \alpha x_1+\beta x_2$ for some scalars $\alpha,\beta\in\mathbb{R}$. Write \begin{align*} f_1(x_1e_1+x_2e_2)=\alpha x_1+\beta x_2. \end{align*} Since $b_1=e_1+e_2=1e_1+1e_2$, the condition $f_1(b_1)=1$ gives \begin{align*} f_1(b_1)=f_1(e_1+e_2)=\alpha\cdot 1+\beta\cdot 1=\alpha+\beta=1. \end{align*} Since $b_2=e_2=0e_1+1e_2$, the condition $f_1(b_2)=0$ gives \begin{align*} f_1(b_2)=f_1(e_2)=\alpha\cdot 0+\beta\cdot 1=\beta=0. \end{align*} Substituting $\beta=0$ into $\alpha+\beta=1$ gives $\alpha=1$, so \begin{align*} f_1(x_1e_1+x_2e_2)=x_1. \end{align*} Similarly, write \begin{align*} f_2(x_1e_1+x_2e_2)=\gamma x_1+\delta x_2. \end{align*} The condition $f_2(b_1)=0$ gives \begin{align*} f_2(e_1+e_2)=\gamma\cdot 1+\delta\cdot 1=\gamma+\delta=0. \end{align*} The condition $f_2(b_2)=1$ gives \begin{align*} f_2(e_2)=\gamma\cdot 0+\delta\cdot 1=\delta=1. \end{align*} Substituting $\delta=1$ into $\gamma+\delta=0$ gives $\gamma+1=0$, hence $\gamma=-1$. Therefore \begin{align*} f_2(x_1e_1+x_2e_2)=-x_1+x_2. \end{align*} For $x=2e_1+e_2$, we get \begin{align*} f_1(x)=f_1(2e_1+e_2)=2 \end{align*} and \begin{align*} f_2(x)=f_2(2e_1+e_2)=-2+1=-1. \end{align*} Thus the dual basis functionals read off the $\mathcal B$-coordinates of $x$, giving $[x]_{\mathcal B}=(2,-1)^\top$.
[/example]

Dual bases are the first place where the direction of change matters deeply. Vectors and covectors transform in paired ways so that evaluation remains unchanged.

## Common Failure Modes

Many errors in change of basis come from treating a coordinate column as if it were the vector itself. A column vector becomes meaningful only after its basis label is known.
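
In code, one way to guard against this is to store the basis label together with the column and to refuse operations that mix labels. The following is a minimal sketch of that idea, not a prescribed design: the class `Coordinates`, the registry `BASES`, and the functions `reconstruct` and `add` are all hypothetical names, and the bases are stored by their standard coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinates:
    """A coordinate column together with the name of the basis it refers to."""
    basis: str
    column: tuple


# Hypothetical registry: each ordered basis listed by standard coordinates.
BASES = {
    "E": ((1, 0), (0, 1)),   # e1, e2
    "B": ((1, 1), (0, 1)),   # b1 = e1 + e2, b2 = e2
}


def reconstruct(coords):
    """Return the vector (in standard coordinates) that the column describes."""
    vectors = BASES[coords.basis]
    dim = len(vectors[0])
    return tuple(sum(a * v[i] for a, v in zip(coords.column, vectors))
                 for i in range(dim))


def add(p, q):
    """Add two columns, but only if they are written in the same basis."""
    if p.basis != q.basis:
        raise ValueError(f"cannot add {p.basis}- and {q.basis}-coordinates")
    return Coordinates(p.basis, tuple(a + b for a, b in zip(p.column, q.column)))


same_column_in_E = Coordinates("E", (1, 0))
same_column_in_B = Coordinates("B", (1, 0))

print(reconstruct(same_column_in_E))   # the vector e1
print(reconstruct(same_column_in_B))   # the vector e1 + e2
```

The two objects hold the same column but reconstruct different vectors, and `add(same_column_in_E, same_column_in_B)` raises an error instead of silently combining incompatible coordinates.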
### Unlabelled Coordinates

The same column can describe different vectors in different bases, and the same vector can have different columns in different bases. This failure mode is small, but it is responsible for many incorrect computations with matrices. The example below isolates the issue before any linear map is involved.

[example: The Same Column Can Mean Different Vectors]
Let $V=\mathbb{R}^2$, let $\mathcal E=(e_1,e_2)$, and let $\mathcal B=(b_1,b_2)$ with $b_1=e_1+e_2$ and $b_2=e_2$. We compare what the same coordinate column $(1,0)^\top$ reconstructs from these two ordered bases.

Using the standard basis $\mathcal E$, the column $(1,0)^\top$ means the linear combination
\begin{align*} 1e_1+0e_2=e_1. \end{align*}
Thus the vector represented by $(1,0)^\top$ in the basis $\mathcal E$ is $e_1$.

Using the basis $\mathcal B$, the same column $(1,0)^\top$ means the linear combination
\begin{align*} 1b_1+0b_2=b_1=e_1+e_2, \end{align*}
where the last equality substitutes $b_1=e_1+e_2$. Thus the vector represented by $(1,0)^\top$ in the basis $\mathcal B$ is $e_1+e_2$.

Since $e_1$ and $e_1+e_2$ are different vectors in $\mathbb R^2$, the expression $(1,0)^\top$ alone does not determine a vector in $V$; the basis label is part of the coordinate data.
[/example]

The cure is not more notation for its own sake. The basis subscript is the information that makes the coordinate column interpretable.

### Reversing the Matrix

Another common error is building $P_{\mathcal E \leftarrow \mathcal B}$ and then using it as if it were $P_{\mathcal B \leftarrow \mathcal E}$. A semantic check helps: ask which coordinate language the input column uses. The following example shows how the wrong direction produces a different vector.

[example: A Reversed Matrix Gives the Wrong Vector]
Let $\mathcal B=(b_1,b_2)$ in $\mathbb{R}^2$, where $b_1=e_1+e_2$ and $b_2=e_2$. We compute both conversion directions and then apply the wrong one to show exactly where the error enters.
A vector with $\mathcal B$-coordinates $(u_1,u_2)^\top$ is
\begin{align*} u_1b_1+u_2b_2=u_1(e_1+e_2)+u_2e_2. \end{align*}
Distributing and collecting the standard basis vectors gives
\begin{align*} u_1(e_1+e_2)+u_2e_2=u_1e_1+u_1e_2+u_2e_2=u_1e_1+(u_1+u_2)e_2. \end{align*}
Thus
\begin{align*} P_{\mathcal E \leftarrow \mathcal B}(u_1,u_2)^\top=(u_1,u_1+u_2)^\top. \end{align*}

For the reverse direction, start with a vector whose standard coordinates are $(v_1,v_2)^\top$, so the vector is $v_1e_1+v_2e_2$. To find its $\mathcal B$-coordinates, solve
\begin{align*} v_1e_1+v_2e_2=a_1b_1+a_2b_2. \end{align*}
Substituting $b_1=e_1+e_2$ and $b_2=e_2$ gives
\begin{align*} a_1b_1+a_2b_2=a_1(e_1+e_2)+a_2e_2=a_1e_1+(a_1+a_2)e_2. \end{align*}
Matching coefficients of $e_1$ and $e_2$ gives
\begin{align*} a_1=v_1,\qquad a_1+a_2=v_2. \end{align*}
Substituting $a_1=v_1$ into the second equation gives $v_1+a_2=v_2$, hence $a_2=v_2-v_1$. Therefore
\begin{align*} P_{\mathcal B \leftarrow \mathcal E}(v_1,v_2)^\top=(v_1,v_2-v_1)^\top. \end{align*}

Now take $x=e_1$. Its standard coordinate column is
\begin{align*} [x]_{\mathcal E}=(1,0)^\top. \end{align*}
Applying the correct conversion matrix gives
\begin{align*} [x]_{\mathcal B}=P_{\mathcal B \leftarrow \mathcal E}(1,0)^\top=(1,0-1)^\top=(1,-1)^\top. \end{align*}
This agrees with the reconstruction
\begin{align*} 1b_1+(-1)b_2=(e_1+e_2)-e_2=e_1. \end{align*}

If we instead use $P_{\mathcal E \leftarrow \mathcal B}$ on the standard coordinate column, then
\begin{align*} P_{\mathcal E \leftarrow \mathcal B}(1,0)^\top=(1,1+0)^\top=(1,1)^\top. \end{align*}
Interpreting this output as a $\mathcal B$-coordinate column reconstructs
\begin{align*} 1b_1+1b_2=(e_1+e_2)+e_2=e_1+2e_2. \end{align*}
Since $e_1+2e_2\ne e_1$, the reversed matrix gives the wrong vector: it was built to accept $\mathcal B$-coordinates as input, not $\mathcal E$-coordinates.
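
As a supplementary check, the two conversions computed in this example can be written as matrices acting on coordinate columns:
\begin{align*} P_{\mathcal E \leftarrow \mathcal B}=\begin{pmatrix}1&0\\1&1\end{pmatrix},\qquad P_{\mathcal B \leftarrow \mathcal E}=\begin{pmatrix}1&0\\-1&1\end{pmatrix}. \end{align*}
The columns of $P_{\mathcal E \leftarrow \mathcal B}$ are the standard coordinates of $b_1$ and $b_2$. Multiplying the two matrices gives
\begin{align*} P_{\mathcal E \leftarrow \mathcal B}P_{\mathcal B \leftarrow \mathcal E}=\begin{pmatrix}1&0\\1&1\end{pmatrix}\begin{pmatrix}1&0\\-1&1\end{pmatrix}=\begin{pmatrix}1&0\\1-1&1\end{pmatrix}=\begin{pmatrix}1&0\\0&1\end{pmatrix}, \end{align*}
so the two directions are inverse to each other, and confusing them amounts to applying a matrix where its inverse was needed.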
[/example]

The wrong answer is not random; it is the result of applying a map to inputs written in the wrong coordinate language.

## Beyond and Connected Topics

Change of basis is the coordinate engine behind much of finite-dimensional algebra. In [Cambridge IB Linear Algebra](/page/Cambridge%20IB%20Linear%20Algebra), it appears in diagonalisation, canonical forms, bilinear forms, and the study of linear maps through their matrices.

In linear analysis, the same idea persists when vector spaces become normed spaces, but only after adding analytic hypotheses. A bounded invertible linear map with bounded inverse can be viewed as a change of coordinates in an infinite-dimensional setting; a merely algebraic isomorphism may fail to preserve convergence or boundedness. This is one bridge to [Cambridge II Linear Analysis](/page/Cambridge%20II%20Linear%20Analysis).

In Lie theory, changing basis in a Lie algebra changes the structure constants but not the Lie algebra itself. This distinction is central when comparing presentations of the same algebra in [Lie Algebras I: Foundations](/page/Lie%20Algebras%20I%3A%20Foundations) and when choosing bases adapted to ideals, Cartan subalgebras, or root decompositions in [Lie Algebras II: Structure and Classification](/page/Lie%20Algebras%20II%3A%20Structure%20and%20Classification).

Change of basis also leads toward canonical form theory. [Jordan normal form](/theorems/864), [rational canonical form](/theorems/863), and singular value decompositions each ask for bases that make a matrix reveal the structure that survives coordinate changes.

## References

Androma, [Cambridge IB Linear Algebra](/page/Cambridge%20IB%20Linear%20Algebra).

Androma, [Cambridge II Linear Analysis](/page/Cambridge%20II%20Linear%20Analysis).

Androma, [Lie Algebras I: Foundations](/page/Lie%20Algebras%20I%3A%20Foundations).

Androma, [Lie Algebras II: Structure and Classification](/page/Lie%20Algebras%20II%3A%20Structure%20and%20Classification).

Sheldon Axler, *Linear Algebra Done Right* (2015).

Kenneth Hoffman and Ray Kunze, *Linear Algebra* (1971).

Serge Lang, *Linear Algebra* (1987).

Created by admin on 6/20/2026 | Last updated on 6/20/2026
