A [linear map](/page/Linear%20Map) is often handed to us as a matrix, but the entries of that matrix are coordinates rather than permanent features of the map. If we change the basis, the same transformation may have a very different-looking matrix. The characteristic polynomial answers the need for a coordinate-independent algebraic fingerprint: it packages the values where $tI-A$ loses invertibility, and from that single polynomial we recover eigenvalues, determinant, trace, and relations among powers of the operator.
The first reason to introduce this polynomial is the eigenvalue problem. The equation $Av=\lambda v$ asks for a scalar $\lambda$ and a nonzero vector $v$. Rewriting it as $(\lambda I-A)v=0$ shows that such a vector exists exactly when $\lambda I-A$ is singular. Determinants turn singularity into one equation in the scalar parameter.
Here $M_n(k)$ denotes the set of $n\times n$ matrices with entries in a field $k$, $I_n$ denotes the $n\times n$ identity matrix, and $k[t]$ denotes the [polynomial ring](/page/Polynomial%20Ring) in the formal variable $t$ with coefficients in $k$. Thus $M_2(\mathbb R)$ is the space of $2\times2$ real matrices.
[example: Searching for Eigenvalues by a Determinant]
Let $A=\begin{pmatrix}2&1\cr1&2\end{pmatrix}\in M_2(\mathbb R)$ and let $v=\begin{pmatrix}v_1\cr v_2\end{pmatrix}$. The equation $Av=\lambda v$ has a nonzero solution exactly when the homogeneous system $(\lambda I_2-A)v=0$ has a nonzero solution, which, because $\lambda I_2-A$ is a square matrix, is equivalent to $\det(\lambda I_2-A)=0$.
Here
\begin{align*}
\lambda I_2-A=\begin{pmatrix}\lambda&0\cr0&\lambda\end{pmatrix}-\begin{pmatrix}2&1\cr1&2\end{pmatrix}=\begin{pmatrix}\lambda-2&-1\cr-1&\lambda-2\end{pmatrix}.
\end{align*}
Using $\det\begin{pmatrix}x&y\cr z&w\end{pmatrix}=xw-yz$, we get
\begin{align*}
\det(\lambda I_2-A)=(\lambda-2)(\lambda-2)-(-1)(-1).
\end{align*}
Thus
\begin{align*}
\det(\lambda I_2-A)=(\lambda-2)^2-1.
\end{align*}
Expanding and factoring,
\begin{align*}
(\lambda-2)^2-1&=\lambda^2-4\lambda+4-1\\
&=\lambda^2-4\lambda+3\\
&=(\lambda-1)(\lambda-3).
\end{align*}
Therefore $\det(\lambda I_2-A)=0$ exactly for $\lambda=1$ or $\lambda=3$. For $\lambda=1$,
\begin{align*}
A-I_2=\begin{pmatrix}1&1\cr1&1\end{pmatrix},
\end{align*}
so $(A-I_2)v=0$ means
\begin{align*}
v_1+v_2=0.
\end{align*}
The corresponding eigenvectors are the nonzero multiples of $\begin{pmatrix}1\cr-1\end{pmatrix}$.
For $\lambda=3$,
\begin{align*}
A-3I_2=\begin{pmatrix}-1&1\cr1&-1\end{pmatrix},
\end{align*}
so $(A-3I_2)v=0$ means
\begin{align*}
-v_1+v_2=0,
\end{align*}
equivalently $v_1-v_2=0$. The corresponding eigenvectors are the nonzero multiples of $\begin{pmatrix}1\cr1\end{pmatrix}$. The determinant polynomial has therefore found the two lines on which $A$ acts by scalar multiplication.
[/example]
This example also fixes a convention. Some books use $\det(A-tI_n)$, but this page uses $\det(tI_n-A)$ because it gives a monic polynomial in $t$.
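The eigenpairs found in the example can be checked mechanically. The following is a minimal sketch in plain Python; the helper name `mat_vec` is an illustrative choice, not notation used elsewhere on this page.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list of entries)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[2, 1], [1, 2]]

# (eigenvalue, eigenvector) pairs taken from the worked example.
pairs = [(1, [1, -1]), (3, [1, 1])]

for lam, v in pairs:
    Av = mat_vec(A, v)
    scaled = [lam * x for x in v]
    # Av and lam * v should agree entry by entry.
    print(lam, Av, scaled, Av == scaled)
```

Both lines of output should end in `True`, matching the two eigenlines found by hand.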
## Definition
The determinant condition becomes useful only after we stop substituting a particular scalar and leave the parameter formal. Otherwise we only test one possible eigenvalue at a time, with no single object that records all the values where invertibility fails. The formal determinant below packages that whole obstruction into one polynomial: its roots are candidates for eigenvalues, and its coefficients encode familiar invariants.
[definition: Characteristic Polynomial of a Matrix]
Let $k$ be a field, let $n\in\mathbb N$, and let $A\in M_n(k)$. The characteristic polynomial of $A$ is the polynomial $\chi_A(t)\in k[t]$ defined by
\begin{align*}
\chi_A(t)=\det(tI_n-A).
\end{align*}
[/definition]
The variable $t$ is formal, so $tI_n-A$ is a matrix with entries in $k[t]$. This guarantees that the determinant is a genuine polynomial rather than a collection of separate scalar tests. Its general coefficient pattern will be recorded later when trace and determinant are compared systematically.
Before using that general coefficient result, it helps to compute the smallest non-scalar case. The next example is needed because the $2\times2$ formula is the model behind many quick characteristic polynomial calculations.
[example: The $2\times2$ Formula]
Let $A=\begin{pmatrix}a&b\cr c&d\end{pmatrix}\in M_2(k)$. By the definition of the characteristic polynomial,
\begin{align*}
\chi_A(t)=\det(tI_2-A).
\end{align*}
Now
\begin{align*}
tI_2=\begin{pmatrix}t&0\cr0&t\end{pmatrix}.
\end{align*}
Subtracting entries gives
\begin{align*}
tI_2-A=\begin{pmatrix}t-a&0-b\cr0-c&t-d\end{pmatrix}.
\end{align*}
Hence
\begin{align*}
tI_2-A=\begin{pmatrix}t-a&-b\cr-c&t-d\end{pmatrix}.
\end{align*}
Using $\det\begin{pmatrix}x&y\cr z&w\end{pmatrix}=xw-yz$, with $x=t-a$, $y=-b$, $z=-c$, and $w=t-d$, we get
\begin{align*}
\chi_A(t)=(t-a)(t-d)-(-b)(-c).
\end{align*}
Since $(-b)(-c)=bc$,
\begin{align*}
\chi_A(t)=(t-a)(t-d)-bc.
\end{align*}
Expanding the first product,
\begin{align*}
(t-a)(t-d)=t^2-td-at+ad.
\end{align*}
Because $td=dt$ in the commutative ring $k[t]$,
\begin{align*}
t^2-td-at+ad=t^2-(a+d)t+ad.
\end{align*}
Therefore
\begin{align*}
\chi_A(t)=t^2-(a+d)t+ad-bc.
\end{align*}
Here $\operatorname{tr}A=a+d$ and $\det A=ad-bc$, so
\begin{align*}
\chi_A(t)=t^2-(\operatorname{tr}A)t+\det A.
\end{align*}
The off-diagonal entries $b$ and $c$ enter only through the determinant term $ad-bc$, while the coefficient of $t$ is determined by the trace.
[/example]
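The $2\times2$ formula translates directly into code. This sketch stores a polynomial as a list of coefficients with the highest degree first; that convention and the names `char_poly_2x2` and `evaluate` are assumptions made for illustration.

```python
def char_poly_2x2(a, b, c, d):
    """Coefficients of t^2 - (a+d) t + (ad - bc), highest degree first."""
    trace = a + d
    det = a * d - b * c
    return [1, -trace, det]

def evaluate(coeffs, t):
    """Evaluate a polynomial (highest degree first) by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * t + c
    return value

# The matrix [[2, 1], [1, 2]] from the opening example.
coeffs = char_poly_2x2(2, 1, 1, 2)
print(coeffs)                                  # [1, -4, 3]
print([evaluate(coeffs, t) for t in (1, 3)])   # [0, 0]
```

The second line confirms that $1$ and $3$ are roots of $t^2-4t+3$, in agreement with the factorisation $(t-1)(t-3)$.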
A diagonal matrix shows what the polynomial looks like when the coordinate axes are already eigenvector directions. The next example is the baseline for triangular and diagonalizable cases.
[example: Diagonal Matrices]
Let $A=\operatorname{diag}(a_1,\dots,a_n)\in M_n(k)$. Since $I_n=\operatorname{diag}(1,\dots,1)$, multiplying by the formal variable $t$ gives
\begin{align*}
tI_n=\operatorname{diag}(t,\dots,t).
\end{align*}
Subtracting diagonal entries and subtracting $0$ from every off-diagonal entry gives
\begin{align*}
tI_n-A=\operatorname{diag}(t-a_1,\dots,t-a_n).
\end{align*}
To compute the determinant, use the permutation formula. The identity permutation contributes
\begin{align*}
(t-a_1)(t-a_2)\cdots(t-a_n).
\end{align*}
Every non-identity permutation $\sigma$ has some index $i$ with $\sigma(i)\ne i$, so the product $\prod_{i=1}^n (tI_n-A)_{i,\sigma(i)}$ contains an off-diagonal entry of the diagonal matrix $tI_n-A$, hence contains a factor $0$. Therefore all non-identity permutation terms vanish, and
\begin{align*}
\chi_A(t)=\det(tI_n-A)=\prod_{i=1}^n(t-a_i).
\end{align*}
Thus each diagonal entry $a_i$ appears as a linear factor $t-a_i$, so the roots of the characteristic polynomial are exactly the diagonal entries, counted with their repetitions in the diagonal list.
[/example]
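For a diagonal matrix the characteristic polynomial is just a product of linear factors, which can be multiplied out coefficient by coefficient. A small sketch, with hypothetical helper names and coefficients listed from the highest degree down:

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def char_poly_diagonal(diag):
    """Expand the product of (t - a_i) over the diagonal entries a_i."""
    result = [1]
    for a in diag:
        result = poly_mul(result, [1, -a])  # multiply by the factor t - a
    return result

# diag(1, 1, 4): the repeated entry 1 gives the repeated factor (t - 1)^2.
print(char_poly_diagonal([1, 1, 4]))  # [1, -6, 9, -4]
```

The output encodes $t^3-6t^2+9t-4=(t-1)^2(t-4)$, so the repeated diagonal entry shows up as a repeated root.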
## Eigenvalues and Multiplicity
The characteristic polynomial would be much less useful if its roots were only formal artifacts. The main reason it matters is that its roots in $k$ are exactly the eigenvalues, so the polynomial detects the directions on which the operator acts by a scalar.
### Eigenvalues and Eigenspaces
To connect the polynomial to geometry, we first name the scalars that act on some nonzero direction by stretching. This definition is needed because the determinant equation will be interpreted through the existence of such directions.
[definition: Eigenvalue]
Let $k$ be a field, let $V$ be a [vector space](/page/Vector%20Space) over $k$, and let $T:V\to V$ be a linear map. A scalar $\lambda\in k$ is an eigenvalue of $T$ if there exists a nonzero vector $v\in V$ such that
\begin{align*}
T(v)=\lambda v.
\end{align*}
[/definition]
For a fixed scalar $\lambda$, we need to collect all vectors that satisfy the eigenvalue equation. This collection is a subspace, and its dimension will later measure how many independent eigenvectors the polynomial root has produced.
[definition: Eigenspace]
Let $k$ be a field, let $V$ be a vector space over $k$, let $T:V\to V$ be a linear map, and let $\lambda\in k$. The eigenspace of $T$ with eigenvalue $\lambda$ is
\begin{align*}
E_\lambda(T)=\ker(T-\lambda\operatorname{id}_V).
\end{align*}
[/definition]
The determinant criterion for singularity is now ready to be translated into spectral language. The remaining gap is to connect a root of the polynomial, which is a statement about a determinant, with the existence of a nonzero vector solving the eigenvalue equation. The theorem below closes that gap and explains why characteristic polynomials are computed in eigenvalue problems.
[quotetheorem:7911]
The theorem also exposes a dependence on the base field. A polynomial may have roots after enlarging the field even when it has none over the original field.
[example: Rotation Over $\mathbb R$]
Let $R=\begin{pmatrix}0&-1\cr1&0\end{pmatrix}\in M_2(\mathbb R)$. We compute the characteristic polynomial and then interpret its roots using *Eigenvalue Criterion*. Since
\begin{align*}
tI_2=\begin{pmatrix}t&0\cr0&t\end{pmatrix},
\end{align*}
entrywise subtraction gives
\begin{align*}
tI_2-R=\begin{pmatrix}t&0\cr0&t\end{pmatrix}-\begin{pmatrix}0&-1\cr1&0\end{pmatrix}=\begin{pmatrix}t&1\cr-1&t\end{pmatrix}.
\end{align*}
Using $\det\begin{pmatrix}x&y\cr z&w\end{pmatrix}=xw-yz$ with $x=t$, $y=1$, $z=-1$, and $w=t$, we get
\begin{align*}
\chi_R(t)=\det(tI_2-R)=t\cdot t-1\cdot(-1).
\end{align*}
Thus
\begin{align*}
\chi_R(t)=t^2+1.
\end{align*}
If $\lambda\in\mathbb R$ were a root, then $\lambda^2+1=0$, so $\lambda^2=-1$. But $\lambda^2\ge 0$ for every real $\lambda$, so no such real root exists. Therefore $R$ has no real eigenvalue by *Eigenvalue Criterion*.
Over $\mathbb C$, the same determinant calculation gives the same polynomial, now viewed in $\mathbb C[t]$. Since $i^2=-1$,
\begin{align*}
(t-i)(t+i)=t^2+it-it-i^2.
\end{align*}
The middle terms cancel, and $-i^2=1$, so
\begin{align*}
(t-i)(t+i)=t^2+1.
\end{align*}
Hence $\chi_R(t)=(t-i)(t+i)$ over $\mathbb C$, and the complexified operator has eigenvalues $i$ and $-i$. The same rotation has no real eigenline, but after extending scalars it has two complex eigenvalues.
[/example]
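The dependence on the base field can be seen numerically through the discriminant of a monic quadratic. The sketch below uses the standard-library `cmath` module and the quadratic formula, which assumes that division by $2$ is allowed (true over $\mathbb R$ and $\mathbb C$); the function name is illustrative.

```python
import cmath

def quadratic_roots(b, c):
    """Discriminant and complex roots of the monic quadratic t^2 + b t + c."""
    disc = b * b - 4 * c
    root = cmath.sqrt(disc)  # complex square root, defined even when disc < 0
    return disc, ((-b + root) / 2, (-b - root) / 2)

# chi_R(t) = t^2 + 1 for the rotation matrix R.
disc, roots = quadratic_roots(0, 1)
print(disc)   # negative, so there is no real root
print(roots)  # the two complex roots
print([r * r + 1 == 0 for r in roots])
```

A negative discriminant is the numerical counterpart of the argument that $\lambda^2=-1$ has no real solution, while the complex roots are $i$ and $-i$.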
### Algebraic and Geometric Multiplicity
Roots can repeat, and a repeated root raises a new question: does the repetition mean several independent eigenvectors, or only a higher-order algebraic obstruction to invertibility? To separate those two possibilities, we first need a way to count repetition using only divisibility in the characteristic polynomial.
[definition: Algebraic Multiplicity]
Let $k$ be a field, let $A\in M_n(k)$, and let $\lambda\in k$. The algebraic multiplicity of $\lambda$ as a root of $\chi_A(t)$ is the largest integer $m\ge 0$ such that $(t-\lambda)^m$ divides $\chi_A(t)$ in $k[t]$.
[/definition]
Algebraic multiplicity counts the root in the polynomial. To compare it with actual eigenvectors, we need a separate dimension that belongs to the eigenspace.
[definition: Geometric Multiplicity]
Let $k$ be a field, let $V$ be a finite-dimensional vector space over $k$, let $T:V\to V$ be a linear map, and let $\lambda\in k$. The geometric multiplicity of $\lambda$ is
\begin{align*}
\dim\ker(T-\lambda\operatorname{id}_V).
\end{align*}
[/definition]
These two multiplicities cannot vary independently. The possible obstruction is that eigenspace dimension is geometric data, while algebraic multiplicity is only a polynomial count; a priori one might expect either one to be larger. The basic constraint is that the number of independent eigenvectors cannot exceed the multiplicity with which the scalar appears in the characteristic polynomial.
[quotetheorem:919]
The inequality can be strict, and that strictness is the first sign that the characteristic polynomial alone does not determine the matrix up to similarity.
[example: A Repeated Root with One Eigenline]
Let $A=\begin{pmatrix}1&1\cr0&1\end{pmatrix}\in M_2(k)$. We compute its characteristic polynomial and then compare the algebraic and geometric multiplicities of the root $1$. Since
\begin{align*}
tI_2=\begin{pmatrix}t&0\cr0&t\end{pmatrix},
\end{align*}
entrywise subtraction gives
\begin{align*}
tI_2-A=\begin{pmatrix}t-1&0-1\cr0-0&t-1\end{pmatrix}=\begin{pmatrix}t-1&-1\cr0&t-1\end{pmatrix}.
\end{align*}
Using $\det\begin{pmatrix}x&y\cr z&w\end{pmatrix}=xw-yz$ with $x=t-1$, $y=-1$, $z=0$, and $w=t-1$, we get
\begin{align*}
\chi_A(t)=\det(tI_2-A)=(t-1)(t-1)-(-1)\cdot 0.
\end{align*}
Because $(-1)\cdot 0=0$, this is
\begin{align*}
\chi_A(t)=(t-1)^2.
\end{align*}
Thus $1$ is a root of $\chi_A(t)$ with algebraic multiplicity $2$.
Now
\begin{align*}
A-I_2=\begin{pmatrix}1&1\cr0&1\end{pmatrix}-\begin{pmatrix}1&0\cr0&1\end{pmatrix}=\begin{pmatrix}0&1\cr0&0\end{pmatrix}.
\end{align*}
For $v=\begin{pmatrix}v_1\cr v_2\end{pmatrix}$, matrix multiplication gives
\begin{align*}
(A-I_2)v=\begin{pmatrix}0&1\cr0&0\end{pmatrix}\begin{pmatrix}v_1\cr v_2\end{pmatrix}=\begin{pmatrix}v_2\cr0\end{pmatrix}.
\end{align*}
Therefore $(A-I_2)v=0$ exactly when $v_2=0$, so
\begin{align*}
\ker(A-I_2)=\left\{\begin{pmatrix}v_1\cr0\end{pmatrix}:v_1\in k\right\}.
\end{align*}
This kernel is spanned by $\begin{pmatrix}1\cr0\end{pmatrix}$, so its dimension is $1$. The characteristic polynomial records a double root, but the eigenspace for that root is only one-dimensional.
[/example]
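Geometric multiplicity can be computed as $n-\operatorname{rank}(A-\lambda I_n)$ by rank-nullity. The following sketch does this with exact rational arithmetic from the standard library; it assumes integer or rational entries, and the function names are illustrative.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def geometric_multiplicity(A, lam):
    """Dimension of ker(A - lam * I), computed as n - rank(A - lam * I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)

print(geometric_multiplicity([[1, 1], [0, 1]], 1))  # 1
print(geometric_multiplicity([[1, 0], [0, 1]], 1))  # 2
```

Both matrices have characteristic polynomial $(t-1)^2$, yet the first has a one-dimensional eigenspace and the second a two-dimensional one.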
## Invariance Under Change of Basis
A coordinate-dependent polynomial would not be useful for studying a linear map. The next issue is therefore invariance: if two matrices represent the same operator in different bases, they must have the same characteristic polynomial.
### Similarity
A change of basis replaces a matrix by a conjugate matrix. To ask whether the characteristic polynomial belongs to the underlying operator rather than to its coordinates, we need a precise relation identifying matrices that differ only by such a coordinate change.
[definition: Similar Matrices]
Let $k$ be a field and let $A,B\in M_n(k)$. The matrices $A$ and $B$ are similar if there exists an invertible matrix $P\in GL_n(k)$ such that
\begin{align*}
B=P^{-1}AP.
\end{align*}
[/definition]
Similarity is the matrix-level form of describing the same linear map in two bases. Different bases produce different matrices, so the determinant formula must survive conjugation before it can be attached to the operator itself.
The question is therefore not how to compute $\det(tI-A)$ in one chosen basis, but whether the computation gives the same answer after every allowed change of basis. The invariance result needed has two parts. First, similar matrices must have the same characteristic polynomial. Second, once a basis has been chosen for a linear map, the matrix computation must agree with the coordinate-free determinant expression attached to the operator. The quoted theorem supplies both parts, and so makes the later definition independent of the basis.
In the quoted theorem, $\mathrm{Mat}_n(\mathbb F)$ means the same kind of matrix space as $M_n(k)$, but written over a field denoted $\mathbb F$ instead of $k$. The notation $\mathrm{End}(V)$ denotes the set of linear maps $V\to V$, and $\mathrm{id}$ denotes the identity map on $V$.
[quotetheorem:402]
Once similarity invariance is known, we can define the characteristic polynomial directly for a linear map. The remaining notational problem is that a linear map has no entries until a basis is chosen; invariance lets us choose any basis without changing the polynomial that results.
[definition: Characteristic Polynomial of a Linear Map]
Let $k$ be a field, let $V$ be an $n$-dimensional vector space over $k$, and let $T:V\to V$ be a linear map. The characteristic polynomial of $T$ is
\begin{align*}
\chi_T(t)=\chi_A(t),
\end{align*}
where $A\in M_n(k)$ is the matrix of $T$ with respect to any basis of $V$.
[/definition]
The previous theorem guarantees that this definition does not depend on the chosen basis. We are now free to compute in whichever basis exposes the most structure.
[example: Same Operator, Different Matrices]
Let $T:\mathbb R^2\to\mathbb R^2$ have standard matrix $A=\begin{pmatrix}2&1\cr0&3\end{pmatrix}$, and take the ordered basis $w_1=(1,0)$, $w_2=(1,1)$. The change-of-basis matrix from $w$-coordinates to standard coordinates has columns $w_1$ and $w_2$, so
\begin{align*}
P=\begin{pmatrix}1&1\cr0&1\end{pmatrix}.
\end{align*}
Since
\begin{align*}
\begin{pmatrix}1&1\cr0&1\end{pmatrix}\begin{pmatrix}1&-1\cr0&1\end{pmatrix}=\begin{pmatrix}1\cdot 1+1\cdot 0&1\cdot(-1)+1\cdot 1\cr0\cdot 1+1\cdot 0&0\cdot(-1)+1\cdot 1\end{pmatrix}=\begin{pmatrix}1&0\cr0&1\end{pmatrix},
\end{align*}
we have
\begin{align*}
P^{-1}=\begin{pmatrix}1&-1\cr0&1\end{pmatrix}.
\end{align*}
First multiply
\begin{align*}
AP=\begin{pmatrix}2&1\cr0&3\end{pmatrix}\begin{pmatrix}1&1\cr0&1\end{pmatrix}=\begin{pmatrix}2\cdot 1+1\cdot 0&2\cdot 1+1\cdot 1\cr0\cdot 1+3\cdot 0&0\cdot 1+3\cdot 1\end{pmatrix}=\begin{pmatrix}2&3\cr0&3\end{pmatrix}.
\end{align*}
Then
\begin{align*}
P^{-1}AP=\begin{pmatrix}1&-1\cr0&1\end{pmatrix}\begin{pmatrix}2&3\cr0&3\end{pmatrix}=\begin{pmatrix}1\cdot 2+(-1)\cdot 0&1\cdot 3+(-1)\cdot 3\cr0\cdot 2+1\cdot 0&0\cdot 3+1\cdot 3\end{pmatrix}=\begin{pmatrix}2&0\cr0&3\end{pmatrix}.
\end{align*}
For the original matrix,
\begin{align*}
tI_2-A=\begin{pmatrix}t&0\cr0&t\end{pmatrix}-\begin{pmatrix}2&1\cr0&3\end{pmatrix}=\begin{pmatrix}t-2&-1\cr0&t-3\end{pmatrix}.
\end{align*}
Using $\det\begin{pmatrix}x&y\cr z&w\end{pmatrix}=xw-yz$ gives
\begin{align*}
\chi_A(t)=\det(tI_2-A)=(t-2)(t-3)-(-1)\cdot 0=(t-2)(t-3).
\end{align*}
For the matrix in the basis $w_1,w_2$,
\begin{align*}
tI_2-P^{-1}AP=\begin{pmatrix}t&0\cr0&t\end{pmatrix}-\begin{pmatrix}2&0\cr0&3\end{pmatrix}=\begin{pmatrix}t-2&0\cr0&t-3\end{pmatrix},
\end{align*}
and hence
\begin{align*}
\chi_{P^{-1}AP}(t)=(t-2)(t-3)-0\cdot 0=(t-2)(t-3).
\end{align*}
The two coordinate matrices therefore have the same characteristic polynomial, while the basis $w_1,w_2$ makes the two eigenvector directions visible as the coordinate axes.
[/example]
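The matrix products in this example are easy to replay in code. The sketch below reuses the matrices $A$, $P$, and $P^{-1}$ from the example and compares the two characteristic polynomials through the $2\times2$ trace and determinant formula; the helper names are illustrative.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly_2x2(M):
    """Coefficients [1, -trace, det] of det(tI - M) for a 2x2 matrix M."""
    (a, b), (c, d) = M
    return [1, -(a + d), a * d - b * c]

A = [[2, 1], [0, 3]]
P = [[1, 1], [0, 1]]
P_inv = [[1, -1], [0, 1]]

B = mat_mul(P_inv, mat_mul(A, P))  # matrix of the same operator in the basis w_1, w_2

print(mat_mul(P, P_inv))                   # [[1, 0], [0, 1]]
print(B)                                   # [[2, 0], [0, 3]]
print(char_poly_2x2(A), char_poly_2x2(B))  # [1, -5, 6] [1, -5, 6]
```

The coefficient list `[1, -5, 6]` is $t^2-5t+6=(t-2)(t-3)$ for both coordinate matrices.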
### Triangular Forms
Diagonal form may be unavailable, but triangular form is often enough for characteristic polynomials. When $A$ is triangular, so is $tI_n-A$, and its off-diagonal entries cannot contribute to the determinant, so the diagonal alone determines the full characteristic polynomial.
[quotetheorem:7912]
Triangular matrices show why diagonal entries become meaningful only in a suitable basis. In an arbitrary basis they are not invariant, but in triangular form they display the roots of the invariant polynomial.
## Coefficients and Matrix Data
The roots of the characteristic polynomial are spectral data, but the coefficients matter even when the polynomial does not factor. The coefficient of $t^{n-1}$ and the constant term recover trace and determinant up to sign, so the characteristic polynomial unifies two older invariants.
### Trace and Determinant
The coefficient of $t^{n-1}$ is controlled by the sum of diagonal entries. We isolate this sum because the characteristic polynomial will show why it is invariant under change of basis.
[definition: Trace]
Let $k$ be a field and let $A=(a_{ij})\in M_n(k)$. The trace of $A$ is
\begin{align*}
\operatorname{tr}A=\sum_{i=1}^n a_{ii}.
\end{align*}
[/definition]
The constant term of the characteristic polynomial is controlled by invertibility at $t=0$. To state that coefficient relation without ambiguity, we need the scalar obtained from the determinant function itself: the invariant that detects whether the matrix is invertible and measures the signed volume-scaling factor in the classical geometric interpretation.
[definition: Determinant]
Let $k$ be a field and let $A\in M_n(k)$. The determinant of $A$ is the scalar $\det A\in k$ obtained by the alternating multilinear determinant function on the columns of $A$ normalized by $\det I_n=1$.
[/definition]
Trace and determinant appear as coefficients in the same polynomial. The issue is to identify exactly which coefficients they control, including the signs imposed by the convention $\chi_A(t)=\det(tI_n-A)$. The coefficient formula below turns those invariants into a practical check on characteristic-polynomial computations.
[quotetheorem:7910]
This theorem gives a fast way to test computations. A candidate polynomial with the wrong trace coefficient or constant term cannot be the characteristic polynomial.
[example: A Consistency Check]
Let $A=\begin{pmatrix}1&2&0\cr0&3&4\cr5&0&6\end{pmatrix}$. Its trace is
\begin{align*}
\operatorname{tr}A=1+3+6=10.
\end{align*}
Also, expanding $\det A$ along the first row gives
\begin{align*}
\det A=1(3\cdot 6-4\cdot 0)-2(0\cdot 6-4\cdot 5)+0.
\end{align*}
Thus
\begin{align*}
\det A=18-2(-20)=58.
\end{align*}
So *Trace and Determinant from the Characteristic Polynomial* predicts that the coefficient of $t^2$ in $\chi_A(t)$ is $-10$, and that the constant term is $(-1)^3\det A=-58$.
Now compute the characteristic polynomial itself. Since
\begin{align*}
tI_3-A=\begin{pmatrix}t-1&-2&0\cr0&t-3&-4\cr-5&0&t-6\end{pmatrix},
\end{align*}
expansion along the first row gives
\begin{align*}
\chi_A(t)=(t-1)\det\begin{pmatrix}t-3&-4\cr0&t-6\end{pmatrix}-(-2)\det\begin{pmatrix}0&-4\cr-5&t-6\end{pmatrix}+0.
\end{align*}
Using $\det\begin{pmatrix}x&y\cr z&w\end{pmatrix}=xw-yz$, the first minor is
\begin{align*}
\det\begin{pmatrix}t-3&-4\cr0&t-6\end{pmatrix}=(t-3)(t-6)-(-4)\cdot 0=(t-3)(t-6).
\end{align*}
The second minor is
\begin{align*}
\det\begin{pmatrix}0&-4\cr-5&t-6\end{pmatrix}=0\cdot(t-6)-(-4)(-5)=-20.
\end{align*}
Therefore
\begin{align*}
\chi_A(t)=(t-1)(t-3)(t-6)+2(-20).
\end{align*}
First,
\begin{align*}
(t-3)(t-6)=t^2-6t-3t+18=t^2-9t+18.
\end{align*}
Then
\begin{align*}
(t-1)(t^2-9t+18)=t^3-9t^2+18t-t^2+9t-18=t^3-10t^2+27t-18.
\end{align*}
Hence
\begin{align*}
\chi_A(t)=t^3-10t^2+27t-18-40=t^3-10t^2+27t-58.
\end{align*}
The computed polynomial has coefficient $-10$ on $t^2$ and constant term $-58$, so it passes both the trace and determinant checks.
[/example]
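The hand computation can be cross-checked with a different method. The sketch below uses the Faddeev-LeVerrier recursion, which is a swapped-in technique rather than the row expansion used in the example; it divides by $1,\dots,n$, so it assumes a field of characteristic $0$ (here the rationals, via `fractions.Fraction`). All function names are illustrative.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def char_poly(A):
    """Coefficients of det(tI - A), highest degree first (Faddeev-LeVerrier)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(1)]                       # leading coefficient
    M = [[Fraction(0)] * n for _ in range(n)]    # M_0 = 0
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        # M_k = A * M_{k-1} + (previous coefficient) * I
        M = [[AM[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # Next coefficient: -trace(A * M_k) / k
        coeffs.append(-trace(mat_mul(A, M)) / k)
    return coeffs

A = [[1, 2, 0], [0, 3, 4], [5, 0, 6]]
print([int(c) for c in char_poly(A)])  # [1, -10, 27, -58]
```

The result agrees with $t^3-10t^2+27t-58$, including the trace coefficient $-10$ and the constant term $-58$.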
### Principal Minors
The middle coefficients need more than trace and determinant. They are built from determinants of submatrices obtained by keeping the same set of rows and columns, so we name those pieces before stating the coefficient formula.
[definition: Principal Minor]
Let $k$ be a field, let $A\in M_n(k)$, and let $S\subset\{1,\dots,n\}$. The principal submatrix $A_S$ is the matrix obtained from $A$ by keeping exactly the rows and columns indexed by $S$. The principal minor associated to $S$ is $\det A_S$.
[/definition]
Principal minors are the entry-level objects that assemble the middle coefficients. The obstruction is combinatorial: a middle coefficient comes from determinant terms that choose some entries from $A$ and the remaining powers of $t$ from $tI_n$. The formula below organizes those choices by principal submatrices.
[quotetheorem:7913]
For diagonal matrices, this formula reduces to elementary symmetric polynomials in the diagonal entries. For general matrices, it explains why the middle coefficients are structured combinations of subdeterminants rather than arbitrary expressions in the entries.
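As a hedged illustration, assume the quoted formula is the standard one: the coefficient of $t^{n-k}$ in $\chi_A(t)$ is $(-1)^k$ times the sum of all $k\times k$ principal minors. The sketch below applies that assumption to the $3\times3$ matrix from the consistency check; the function names are hypothetical, and the cofactor-expansion determinant is only practical for small matrices.

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return 1  # determinant of the empty 0x0 matrix
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def principal_minor_sum(A, k):
    """Sum of det(A_S) over all index sets S of size k."""
    n = len(A)
    return sum(det([[A[i][j] for j in S] for i in S])
               for S in combinations(range(n), k))

def char_poly_from_minors(A):
    """Coefficients of det(tI - A), highest degree first, from principal minors."""
    n = len(A)
    return [(-1) ** k * principal_minor_sum(A, k) for k in range(n + 1)]

A = [[1, 2, 0], [0, 3, 4], [5, 0, 6]]
print(principal_minor_sum(A, 2))  # 27, from the minors 3, 6, and 18
print(char_poly_from_minors(A))   # [1, -10, 27, -58]
```

The middle coefficient $27$ is the sum of the three $2\times2$ principal minors, which trace and determinant alone do not supply.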
## Cayley-Hamilton and Operator Algebra
The characteristic polynomial begins as a determinant, but it also becomes an identity satisfied by the operator. This is the point where spectral data turns into algebraic control over powers of a matrix.
### Evaluating Polynomials at Operators
To say that an operator satisfies a polynomial, we need to substitute the operator into the polynomial. This is not ordinary numerical substitution: the constant term must become a scalar multiple of the identity map, and powers of $t$ must become iterated compositions of $T$.
[definition: Polynomial in a Linear Operator]
Let $k$ be a field, let $V$ be a vector space over $k$, let $T:V\to V$ be a linear map, and let $p(t)=a_m t^m+a_{m-1}t^{m-1}+\cdots+a_1t+a_0\in k[t]$. The value of $p$ at $T$ is the linear map
\begin{align*}
p(T)=a_mT^m+a_{m-1}T^{m-1}+\cdots+a_1T+a_0\operatorname{id}_V.
\end{align*}
[/definition]
Polynomial evaluation gives a way to turn an identity in $k[t]$ into an identity among powers of an operator. The useful question is whether the determinant polynomial attached to $T$ is strong enough to force such an operator identity, because that would let high powers of $T$ be reduced using only lower powers.
One related measure of how strongly an operator is constrained by polynomial identities is its [minimal polynomial](/page/Minimal%20Polynomial): when $V$ is finite-dimensional, among the nonzero polynomials $p(t)\in k[t]$ with $p(T)=0$ there is a unique monic polynomial of least degree, denoted here by $M_T(t)$. Every polynomial that annihilates $T$ is a multiple of $M_T(t)$, so a statement that the characteristic polynomial is divisible by $M_T(t)$ is a statement about all polynomial relations forced on $T$.
The next structural question is whether the characteristic polynomial itself always belongs to this collection of annihilating polynomials. If it does, then the coefficients coming from a determinant do not merely record eigenvalue data: they give a universal recurrence relation for the operator. This is the content of the [Cayley-Hamilton theorem](/theorems/865).
[quotetheorem:407]
If $\chi_T(t)=t^n+c_{n-1}t^{n-1}+\cdots+c_0$, Cayley-Hamilton gives
\begin{align*}
T^n=-c_{n-1}T^{n-1}-\cdots-c_1T-c_0\operatorname{id}_V.
\end{align*}
Every power $T^m$ with $m\ge n$ can therefore be reduced to a linear combination of $\operatorname{id}_V,T,\dots,T^{n-1}$.
[example: Reducing Powers of a Matrix]
Let $A=\begin{pmatrix}1&1\cr1&0\end{pmatrix}$. We first compute the characteristic polynomial that will be substituted into $A$. Since
\begin{align*}
tI_2-A=\begin{pmatrix}t&0\cr0&t\end{pmatrix}-\begin{pmatrix}1&1\cr1&0\end{pmatrix}=\begin{pmatrix}t-1&-1\cr-1&t\end{pmatrix},
\end{align*}
the $2\times 2$ determinant formula gives
\begin{align*}
\chi_A(t)=\det(tI_2-A)=(t-1)t-(-1)(-1).
\end{align*}
Because $(t-1)t=t^2-t$ and $(-1)(-1)=1$, this becomes
\begin{align*}
\chi_A(t)=t^2-t-1.
\end{align*}
By *[Cayley-Hamilton Theorem](/theorems/923)*, substituting $A$ into its characteristic polynomial gives
\begin{align*}
\chi_A(A)=A^2-A-I_2=0.
\end{align*}
Adding $A+I_2$ to both sides gives
\begin{align*}
A^2=A+I_2.
\end{align*}
Multiplying this identity on the left by $A$ gives
\begin{align*}
A^3=A(A^2)=A(A+I_2).
\end{align*}
Using distributivity and $AI_2=A$, we get
\begin{align*}
A(A+I_2)=A^2+A.
\end{align*}
Substituting $A^2=A+I_2$ into this expression gives
\begin{align*}
A^3=(A+I_2)+A=2A+I_2.
\end{align*}
The same reduction works for all higher powers. If $A^m=xA+yI_2$ for some scalars $x,y$, then
\begin{align*}
A^{m+1}=A(xA+yI_2)=xA^2+yA.
\end{align*}
Substituting $A^2=A+I_2$ gives
\begin{align*}
A^{m+1}=x(A+I_2)+yA=(x+y)A+xI_2.
\end{align*}
Thus each time a power $A^2$ appears, it can be replaced by $A+I_2$, so every power $A^m$ is a linear combination of $A$ and $I_2$.
[/example]
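The reduction rule $(x,y)\mapsto(x+y,x)$ from the example can be iterated and compared against direct matrix multiplication. This is a minimal sketch for the specific matrix $A=\begin{pmatrix}1&1\cr1&0\end{pmatrix}$; the function names are illustrative.

```python
def mat_mul(X, Y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 1], [1, 0]]
I = [[1, 0], [0, 1]]

def power_coefficients(m):
    """Return (x, y) with A^m = x*A + y*I, using A^2 = A + I."""
    x, y = 0, 1  # A^0 = I
    for _ in range(m):
        x, y = x + y, x  # A * (x*A + y*I) = (x + y)*A + x*I
    return x, y

def direct_power(m):
    """Compute A^m by repeated multiplication."""
    result = I
    for _ in range(m):
        result = mat_mul(result, A)
    return result

checks = []
for m in range(8):
    x, y = power_coefficients(m)
    reduced = [[x * A[i][j] + y * I[i][j] for j in range(2)] for i in range(2)]
    checks.append(reduced == direct_power(m))

print(power_coefficients(3))  # (2, 1), i.e. A^3 = 2A + I
print(all(checks))            # True
```

Only two scalars are carried from step to step, which is the practical content of reducing every power to a combination of $A$ and $I_2$.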
### Inverses as Polynomials
For an invertible matrix, the constant term of the characteristic polynomial is nonzero. After substituting $A$ into that polynomial, the Cayley-Hamilton relation can be rearranged so that the single negative power $A^{-1}$ is expressed using only nonnegative powers of $A$.
[quotetheorem:7915]
This formula is more structural than numerical. It says that no operation outside polynomial expressions in $A$ is needed to express the inverse.
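As a small worked instance of this rearrangement, take $A=\begin{pmatrix}1&1\cr1&0\end{pmatrix}$, whose characteristic polynomial $t^2-t-1$ has nonzero constant term $-1$. Cayley-Hamilton gives $A^2-A-I_2=0$. Moving $I_2$ to the other side and factoring out $A$ yields
\begin{align*}
A(A-I_2)=I_2,
\end{align*}
so
\begin{align*}
A^{-1}=A-I_2=\begin{pmatrix}0&1\cr1&-1\end{pmatrix}.
\end{align*}
A direct multiplication confirms this:
\begin{align*}
\begin{pmatrix}1&1\cr1&0\end{pmatrix}\begin{pmatrix}0&1\cr1&-1\end{pmatrix}=\begin{pmatrix}1\cdot 0+1\cdot 1&1\cdot 1+1\cdot(-1)\cr1\cdot 0+0\cdot 1&1\cdot 1+0\cdot(-1)\end{pmatrix}=\begin{pmatrix}1&0\cr0&1\end{pmatrix}.
\end{align*}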
## Computation and Factorisation
Direct determinant expansion grows quickly with dimension. Computation becomes manageable when the matrix has structure, when the vector space has invariant subspaces, or when the field is enlarged so the polynomial factors.
### Block Structure
An invariant subspace rarely makes the whole matrix diagonal or even fully triangular, but it does force part of the matrix to stop feeding into its complement. To use determinant expansion on this partial decomposition, we need a precise name for matrices whose lower-left block regions vanish.
[definition: Block Upper Triangular Matrix]
Let $k$ be a field. A matrix $A\in M_n(k)$ is block upper triangular if there are positive integers $n_1,\dots,n_r$ with $n_1+\cdots+n_r=n$ such that, when $A$ is partitioned into blocks $A_{ij}$ of size $n_i\times n_j$, every block below the diagonal vanishes: $A_{ij}=0$ for $i>j$. The diagonal blocks $A_{ii}\in M_{n_i}(k)$ are square, and the blocks on or above the diagonal are arbitrary.
[/definition]
Once the entries below the diagonal blocks are zero, the determinant should no longer depend on the off-diagonal blocks in the same way it would for a general matrix. The immediate tool is a determinant factorisation for block upper triangular matrices; applying that determinant result to the particular matrix $tI-A$ is what gives the corresponding factorisation of the characteristic polynomial.
[quotetheorem:399]
For characteristic polynomials, the theorem is used with $tI-A$ in place of the original matrix. Off-diagonal blocks may change eigenvectors and complements, but after this substitution they do not change the factorisation of the characteristic polynomial into the diagonal block contributions.
[example: A Preserved Plane]
Let $A=\begin{pmatrix}2&1&5\cr0&3&4\cr0&0&7\end{pmatrix}\in M_3(\mathbb R)$. If $e_1,e_2,e_3$ are the standard basis vectors, then
\begin{align*}
Ae_1=\begin{pmatrix}2\cr0\cr0\end{pmatrix}=2e_1.
\end{align*}
Also,
\begin{align*}
Ae_2=\begin{pmatrix}1\cr3\cr0\end{pmatrix}=e_1+3e_2.
\end{align*}
Thus $A(\operatorname{span}\{e_1,e_2\})\subseteq \operatorname{span}\{e_1,e_2\}$, so the plane spanned by the first two standard basis vectors is preserved by $A$.
With respect to the block decomposition $\mathbb R^3=\operatorname{span}\{e_1,e_2\}\oplus \operatorname{span}\{e_3\}$, the diagonal blocks are
\begin{align*}
A_{11}=\begin{pmatrix}2&1\cr0&3\end{pmatrix}, \qquad A_{22}=\begin{pmatrix}7\end{pmatrix}.
\end{align*}
By *[Block Triangular Determinant](/theorems/399)*,
\begin{align*}
\chi_A(t)=\chi_{A_{11}}(t)\chi_{A_{22}}(t).
\end{align*}
Now
\begin{align*}
tI_2-A_{11}=\begin{pmatrix}t&0\cr0&t\end{pmatrix}-\begin{pmatrix}2&1\cr0&3\end{pmatrix}=\begin{pmatrix}t-2&-1\cr0&t-3\end{pmatrix}.
\end{align*}
Using $\det\begin{pmatrix}x&y\cr z&w\end{pmatrix}=xw-yz$, we get
\begin{align*}
\chi_{A_{11}}(t)=(t-2)(t-3)-(-1)\cdot 0.
\end{align*}
Since $(-1)\cdot 0=0$,
\begin{align*}
\chi_{A_{11}}(t)=(t-2)(t-3).
\end{align*}
For the $1\times 1$ block,
\begin{align*}
\chi_{A_{22}}(t)=\det\begin{pmatrix}t-7\end{pmatrix}=t-7.
\end{align*}
Therefore
\begin{align*}
\chi_A(t)=(t-2)(t-3)(t-7).
\end{align*}
The entries $5$ and $4$ of the off-diagonal block can change how eigenvectors sit relative to the preserved plane, but the characteristic polynomial is determined by the diagonal blocks.
[/example]
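The block factorisation can be spot-checked by evaluating both sides at sample values of $t$. Two monic cubics that agree at three or more points are equal, so the sketch below, which tests seven integer points, is enough for this example; the function names are illustrative.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return 1  # determinant of the empty 0x0 matrix
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def char_value(A, t):
    """Value of det(tI - A) at a specific scalar t."""
    n = len(A)
    return det([[(t if i == j else 0) - A[i][j] for j in range(n)]
                for i in range(n)])

A = [[2, 1, 5], [0, 3, 4], [0, 0, 7]]
A11 = [[2, 1], [0, 3]]
A22 = [[7]]

samples = range(-3, 4)
agree = [char_value(A, t) == char_value(A11, t) * char_value(A22, t)
         for t in samples]
print(all(agree))                              # True
print([char_value(A, t) for t in (2, 3, 7)])   # [0, 0, 0]
```

The second line shows that $2$, $3$, and $7$ are roots, consistent with $\chi_A(t)=(t-2)(t-3)(t-7)$.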
### Splitting Fields and Companion Matrices
Some characteristic polynomials do not factor into linear factors over the original field. To discuss eigenvalues systematically, we need language for the case where all roots already lie in the field.
[definition: Splitting of the Characteristic Polynomial]
Let $k$ be a field and let $A\in M_n(k)$. The characteristic polynomial $\chi_A(t)$ splits over $k$ if there exist scalars $\lambda_1,\dots,\lambda_n\in k$ such that
\begin{align*}
\chi_A(t)=\prod_{i=1}^n(t-\lambda_i).
\end{align*}
[/definition]
Extending the field can reveal roots that were invisible over the original field, but it should not alter the determinant expression used to compute $\chi_A(t)$.
To use splitting fields safely, we need a compatibility statement: when the entries of $A$ are viewed inside a larger field, the determinant calculation must give the same polynomial, now regarded over that larger field. This is the bridge between computing $\chi_A(t)$ over the original field and factoring it after scalar extension.
[quotetheorem:7914]
This permits us to factor over a larger field without changing the polynomial as an invariant. We only change the ring where factorisation is allowed.
[example: Same Polynomial, More Roots]
For $R=\begin{pmatrix}0&-1\cr1&0\end{pmatrix}$, first compute
\begin{align*}
tI_2-R=\begin{pmatrix}t&0\cr0&t\end{pmatrix}-\begin{pmatrix}0&-1\cr1&0\end{pmatrix}=\begin{pmatrix}t&1\cr-1&t\end{pmatrix}.
\end{align*}
Using $\det\begin{pmatrix}x&y\cr z&w\end{pmatrix}=xw-yz$ gives
\begin{align*}
\chi_R(t)=\det(tI_2-R)=t\cdot t-1\cdot(-1).
\end{align*}
Since $t\cdot t=t^2$ and $1\cdot(-1)=-1$, this becomes
\begin{align*}
\chi_R(t)=t^2-(-1)=t^2+1.
\end{align*}
Over $\mathbb R$, if $\lambda$ were a root of $\chi_R(t)$, then
\begin{align*}
\lambda^2+1=0.
\end{align*}
But $\lambda^2\ge 0$ for every $\lambda\in\mathbb R$, so $\lambda^2+1\ge 1$, and therefore $\lambda^2+1\ne 0$. Thus $\chi_R(t)$ has no real root, so $R$ has no real eigenvalue and no real eigenline by *Eigenvalue Criterion*.
By *[Compatibility of the Characteristic Polynomial with Field Extension](/theorems/7914)*, extending scalars from $\mathbb R$ to $\mathbb C$ keeps the same determinant polynomial, now viewed in $\mathbb C[t]$. Since $i^2=-1$,
\begin{align*}
(t-i)(t+i)=t^2+it-it-i^2.
\end{align*}
The terms $it$ and $-it$ cancel, and $-i^2=-(-1)=1$, so
\begin{align*}
(t-i)(t+i)=t^2+1.
\end{align*}
Hence
\begin{align*}
\chi_R(t)=(t-i)(t+i)
\end{align*}
in $\mathbb C[t]$, so the complex eigenvalues are $i$ and $-i$ by *Eigenvalue Criterion*.
For $\lambda=i$,
\begin{align*}
R-iI_2=\begin{pmatrix}-i&-1\cr1&-i\end{pmatrix}.
\end{align*}
For $v=\begin{pmatrix}z_1\cr z_2\end{pmatrix}\in\mathbb C^2$, the equation $(R-iI_2)v=0$ gives
\begin{align*}
-iz_1-z_2=0.
\end{align*}
Thus $z_2=-iz_1$, and the second row is then automatically satisfied, since $z_1-iz_2=z_1-i(-iz_1)=z_1+i^2z_1=0$, so
\begin{align*}
E_i=\operatorname{span}_{\mathbb C}\left\{\begin{pmatrix}1\cr-i\end{pmatrix}\right\}.
\end{align*}
For $\lambda=-i$,
\begin{align*}
R+iI_2=\begin{pmatrix}i&-1\cr1&i\end{pmatrix}.
\end{align*}
The equation $(R+iI_2)v=0$ gives
\begin{align*}
iz_1-z_2=0.
\end{align*}
Thus $z_2=iz_1$, and the second row is then automatically satisfied, since $z_1+iz_2=z_1+i(iz_1)=z_1+i^2z_1=0$, so
\begin{align*}
E_{-i}=\operatorname{span}_{\mathbb C}\left\{\begin{pmatrix}1\cr i\end{pmatrix}\right\}.
\end{align*}
The same matrix has no real eigenline, but after extending scalars to $\mathbb C$ its characteristic polynomial splits and produces two complex eigenspaces.
[/example]
Characteristic polynomials are not only invariants to compute from known matrices. Given a monic polynomial, one can ask whether there is a canonical matrix whose action stores the polynomial coefficients and forces that polynomial to appear as its characteristic polynomial. The standard answer is built by shifting basis vectors and placing the coefficients in the last column.
[definition: Companion Matrix]
Let $k$ be a field, let $n\ge 1$, and let $p(t)=t^n+a_{n-1}t^{n-1}+\cdots+a_1t+a_0\in k[t]$ be a monic polynomial of degree $n$. The companion matrix of $p$ is
\begin{align*}
C_p=\begin{pmatrix}0&0&\cdots&0&-a_0\cr1&0&\cdots&0&-a_1\cr0&1&\cdots&0&-a_2\cr\vdots&\vdots&\ddots&\vdots&\vdots\cr0&0&\cdots&1&-a_{n-1}\end{pmatrix}\in M_n(k).
\end{align*}
[/definition]
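The shifting behaviour can be read off the columns. As a sketch, write $e_1,\dots,e_n$ for the standard basis vectors of $k^n$ (notation assumed here, not fixed above). Since the $j$-th column of a matrix is the image of $e_j$,
\begin{align*}
C_pe_j=e_{j+1}\quad(1\le j\le n-1), \qquad C_pe_n=-a_0e_1-a_1e_2-\cdots-a_{n-1}e_n.
\end{align*}
So $C_p$ moves each basis vector to the next one, and the coefficients of $p$ enter only through the image of the last basis vector.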
The definition is useful only if the displayed matrix really stores the polynomial, not just its list of coefficients. The structural point of this construction is that the determinant of $tI-C_p$ reconstructs $p(t)$ with the coefficients in exactly the prescribed order. Thus companion matrices provide a controlled way to build matrices with a chosen characteristic polynomial.
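For small $n$ the claim can be checked by hand; the following is a sketch for $n=2$ and $n=3$ only, not a proof for general $n$. For $p(t)=t^2+a_1t+a_0$,
\begin{align*}
\det(tI_2-C_p)=\det\begin{pmatrix}t&a_0\cr-1&t+a_1\end{pmatrix}=t(t+a_1)-a_0\cdot(-1)=t^2+a_1t+a_0.
\end{align*}
For $p(t)=t^3+a_2t^2+a_1t+a_0$, expanding along the first row gives
\begin{align*}
\det(tI_3-C_p)&=\det\begin{pmatrix}t&0&a_0\cr-1&t&a_1\cr0&-1&t+a_2\end{pmatrix}\\
&=t\det\begin{pmatrix}t&a_1\cr-1&t+a_2\end{pmatrix}+a_0\det\begin{pmatrix}-1&t\cr0&-1\end{pmatrix}\\
&=t(t^2+a_2t+a_1)+a_0\\
&=t^3+a_2t^2+a_1t+a_0.
\end{align*}
In both cases the coefficients reappear in the prescribed order.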
This construction is also the first appearance of a broader theme: some matrices can be understood by decomposing the underlying space into cyclic pieces, each represented by a companion matrix. Later refinements, such as [rational canonical form](/theorems/863), make that theme precise using additional invariants; for now the companion matrix only serves as a concrete bridge from a monic polynomial to an explicit matrix.
## Beyond and Connected Topics
The characteristic polynomial is the first spectral invariant, but it is not the finest. The minimal polynomial is the monic polynomial of least degree annihilating a linear operator. It divides the characteristic polynomial and detects the size of the largest Jordan block for each eigenvalue, or, over arbitrary fields, the largest companion-block contributions in rational canonical form. For example, the matrices
\begin{align*}
\begin{pmatrix}1&0\cr0&1\end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix}1&1\cr0&1\end{pmatrix}
\end{align*}
both have characteristic polynomial $(t-1)^2$, but their minimal polynomials are $t-1$ and $(t-1)^2$ respectively. The characteristic polynomial sees the repeated eigenvalue; the minimal polynomial sees whether a nilpotent part remains.
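The two minimal polynomials can be checked directly. As a sketch, write $J=\begin{pmatrix}1&1\cr0&1\end{pmatrix}$ (a name used only here). Then
\begin{align*}
J-I_2=\begin{pmatrix}0&1\cr0&0\end{pmatrix}\ne 0, \qquad (J-I_2)^2=\begin{pmatrix}0&1\cr0&0\end{pmatrix}\begin{pmatrix}0&1\cr0&0\end{pmatrix}=\begin{pmatrix}0&0\cr0&0\end{pmatrix}.
\end{align*}
So $t-1$ does not annihilate $J$ while $(t-1)^2$ does, whereas $I_2-I_2=0$ shows that $t-1$ already annihilates the identity matrix.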
Diagonalisation is the next major question after computing $\chi_T(t)$. If the characteristic polynomial splits and the geometric multiplicities add up to $\dim V$, then the operator has a basis of eigenvectors. When this fails over an algebraically closed field, Jordan form records how eigenvectors are missing.
Rational canonical form keeps the theory over the original field. Instead of requiring roots of the characteristic polynomial, it decomposes a linear operator into companion blocks governed by invariant factors. This is where the characteristic polynomial meets module theory over $k[t]$, a theme developed further in [Cambridge III Commutative Algebra](/page/Cambridge%20III%20Commutative%20Algebra).
The determinant and trace aspects of the characteristic polynomial connect directly with the linear algebra of [Cambridge IA Vectors and Matrices](/page/Cambridge%20IA%20Vectors%20and%20Matrices) and [Cambridge IB Linear Algebra](/page/Cambridge%20IB%20Linear%20Algebra). In systems of linear differential equations, eigenvalues obtained from $\chi_A(t)$ govern growth, decay, oscillation, and stability, connecting this algebraic invariant to [Cambridge IA Differential Equations](/page/Cambridge%20IA%20Differential%20Equations).
## References
Androma, [Cambridge IA Vectors and Matrices](/page/Cambridge%20IA%20Vectors%20and%20Matrices).
Androma, [Cambridge IB Linear Algebra](/page/Cambridge%20IB%20Linear%20Algebra).
Androma, [Cambridge III Commutative Algebra](/page/Cambridge%20III%20Commutative%20Algebra).
Androma, [Cambridge IA Differential Equations](/page/Cambridge%20IA%20Differential%20Equations).
Axler, *Linear Algebra Done Right* (2015).
Hoffman and Kunze, *Linear Algebra* (1971).
Lang, *Linear Algebra* (1987).