Characteristic Polynomial via Minimal Polynomial (Theorem # 1573)
Theorem
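Let $L/K$ be a finite field extension, let $\alpha \in L$ have minimal polynomial $p_\alpha \in K[x]$, and let $m_\alpha: L \to L$, $x \mapsto \alpha x$, denote multiplication by $\alpha$, viewed as a $K$-linear map. The characteristic polynomial of $m_\alpha$ is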
\begin{align*}
\det(xI - m_\alpha) = p_\alpha^{[L:K(\alpha)]}.
\end{align*}
If $p_\alpha$ splits as $p_\alpha(x) = (x - \alpha_1)\cdots(x - \alpha_r)$ in some extension field, then
\begin{align*}
N_{K(\alpha)/K}(\alpha) &= \prod_i \alpha_i, \qquad \operatorname{tr}_{K(\alpha)/K}(\alpha) = \sum_i \alpha_i,
\end{align*}
and for the full extension:
\begin{align*}
N_{L/K}(\alpha) &= \Bigl(\prod_i \alpha_i\Bigr)^{[L:K(\alpha)]}, \qquad
\operatorname{tr}_{L/K}(\alpha) = [L:K(\alpha)] \sum_i \alpha_i.
\end{align*}
Proof
[proofplan]
View $L$ as a vector space over $K(\alpha)$, with scalar multiplication given by multiplication in $L$, and note that the multiplication-by-$\alpha$ map $m_\alpha: L \to L$ is $K(\alpha)$-linear. Choose a $K(\alpha)$-basis $e_1, \ldots, e_t$ of $L$, where $t = [L:K(\alpha)]$; this decomposes $L$ as a $K$-vector space into a direct sum $L = \bigoplus_{j=1}^t K(\alpha) \cdot e_j$, with each summand invariant under $m_\alpha$. A block-diagonal argument shows that the characteristic polynomial of $m_\alpha$ on $L$ equals the $t$-th power of its characteristic polynomial on the single summand $K(\alpha)$. On $K(\alpha)$, the matrix of $m_\alpha$ with respect to the $K$-basis $1, \alpha, \ldots, \alpha^{r-1}$, where $r = \deg p_\alpha$, is the companion matrix of $p_\alpha$, whose characteristic polynomial is precisely $p_\alpha$. Combining, $\det(xI - m_\alpha) = p_\alpha^t = p_\alpha^{[L:K(\alpha)]}$. The norm and trace formulas then follow by reading off the constant term and the coefficient of $x^{n-1}$, where $n = [L:K]$, in the expansion of $p_\alpha^t = \prod_i (x - \alpha_i)^t$.
[/proofplan]
[step:Set up the multiplication map $m_\alpha: L \to L$ and record its $K(\alpha)$-linearity]
Let $n := [L:K]$, a positive integer because $L/K$ is a finite extension. (The theorem is stated for a number field $L$ and a subfield $K$, both finite-dimensional over $\mathbb{Q}$; in any case the statement presupposes $[L:K] < \infty$.) The **multiplication-by-$\alpha$ map** is
\begin{align*}
m_\alpha: L &\to L \\
x &\mapsto \alpha x,
\end{align*}
which is well-defined because $L$ is a field (closed under multiplication). It is $K$-linear: for $x, y \in L$ and $c \in K$,
\begin{align*}
m_\alpha(x + y) &= \alpha(x + y) = \alpha x + \alpha y = m_\alpha(x) + m_\alpha(y), \\
m_\alpha(c x) &= \alpha \cdot (c x) = c \cdot (\alpha x) = c \cdot m_\alpha(x),
\end{align*}
using commutativity in the field $L$ and that $K \subseteq L$.
More strongly, $m_\alpha$ is $K(\alpha)$-linear: for any $\lambda \in K(\alpha)$ and $x \in L$,
\begin{align*}
m_\alpha(\lambda x) = \alpha (\lambda x) = (\alpha \lambda) x = (\lambda \alpha) x = \lambda (\alpha x) = \lambda \cdot m_\alpha(x),
\end{align*}
using commutativity in $L$.
The characteristic polynomial of $m_\alpha$ as a $K$-endomorphism of the $n$-dimensional $K$-vector space $L$ is the monic polynomial
\begin{align*}
\chi_{m_\alpha}(x) := \det(xI_n - M) \in K[x],
\end{align*}
where $M$ is the matrix of $m_\alpha$ with respect to any $K$-basis of $L$. This is independent of the chosen basis, because a change-of-basis matrix $P$ transforms $M \mapsto P^{-1} M P$, which does not affect $\det(xI - \cdot)$.
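As a small sanity check of basis independence (an illustrative example chosen here, not part of the source), assume $K = \mathbb{Q}$, $L = \mathbb{Q}(\sqrt{2})$ and $\alpha = \sqrt{2}$, so that $n = 2$. In the basis $(1, \sqrt{2})$ we have $m_\alpha(1) = \sqrt{2}$ and $m_\alpha(\sqrt{2}) = 2$. In the basis $(1, 1 + \sqrt{2})$ we have $m_\alpha(1) = -1 + (1 + \sqrt{2})$ and $m_\alpha(1 + \sqrt{2}) = 2 + \sqrt{2} = 1 + (1 + \sqrt{2})$. Writing images as columns, the two matrices and their characteristic polynomials are
\begin{align*}
M &= \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}, & \det(xI_2 - M) &= x^2 - 2, \\
M' &= \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}, & \det(xI_2 - M') &= (x + 1)(x - 1) - 1 = x^2 - 2.
\end{align*}
Both bases give the same polynomial, as the conjugation argument predicts.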
[guided]
The multiplication-by-$\alpha$ map
\begin{align*}
m_\alpha: L \to L, \qquad x \mapsto \alpha x,
\end{align*}
is the central object of this proof. It is $K$-linear — as a direct check: $m_\alpha(x + y) = \alpha(x + y) = \alpha x + \alpha y = m_\alpha(x) + m_\alpha(y)$ by distributivity in the ring $L$, and $m_\alpha(c x) = \alpha(c x) = c(\alpha x) = c \cdot m_\alpha(x)$ for $c \in K$, using commutativity in $L$.
A stronger observation: $m_\alpha$ is not just $K$-linear but $K(\alpha)$-linear. This is because commutativity in $L$ allows us to move any $\lambda \in K(\alpha) \subseteq L$ past $\alpha$:
\begin{align*}
m_\alpha(\lambda x) = \alpha \lambda x = \lambda \alpha x = \lambda \cdot m_\alpha(x).
\end{align*}
This $K(\alpha)$-linearity is what will let us decompose $m_\alpha$ into blocks, one per basis vector of $L$ over $K(\alpha)$.
Define the characteristic polynomial of $m_\alpha$:
\begin{align*}
\chi_{m_\alpha}(x) := \det(xI_n - M) \in K[x],
\end{align*}
where $M \in K^{n \times n}$ is the matrix of $m_\alpha$ with respect to any chosen $K$-basis of $L$ (here $n = [L:K] = \dim_K L$). The polynomial $\chi_{m_\alpha}$ is independent of the basis: a change of $K$-basis via matrix $P \in GL_n(K)$ replaces $M$ by $P^{-1}MP$, and
\begin{align*}
\det(xI - P^{-1}MP) = \det(P^{-1}(xI - M)P) = \det(P^{-1})\det(xI - M)\det(P) = \det(xI - M).
\end{align*}
Hence $\chi_{m_\alpha}$ is an invariant of the operator $m_\alpha$ alone, not of the basis.
Our ultimate goal is $\chi_{m_\alpha}(x) = p_\alpha(x)^{[L:K(\alpha)]}$ and the resulting norm/trace formulas.
[/guided]
[/step]
[step:Choose a $K(\alpha)$-basis of $L$ and decompose $L$ as a $K$-vector space]
Let $t := [L:K(\alpha)]$ and $r := [K(\alpha):K] = \deg p_\alpha$ (the latter equality is the fundamental identity "$[K(\alpha):K] = \deg p_\alpha$" — verified below). By the [tower law](/pages/???),
\begin{align*}
n = [L:K] = [L:K(\alpha)] \cdot [K(\alpha):K] = t \cdot r.
\end{align*}
Choose a $K(\alpha)$-basis $e_1, \ldots, e_t$ of $L$: such a basis exists because $L$ is a finite-dimensional $K(\alpha)$-vector space of dimension $t$. For each $j \in \{1, \ldots, t\}$, the cyclic $K(\alpha)$-submodule
\begin{align*}
V_j := K(\alpha) \cdot e_j = \{\lambda e_j : \lambda \in K(\alpha)\} \subseteq L
\end{align*}
is a $K$-subspace of $L$ of dimension $r$ (it is isomorphic to $K(\alpha)$ as a $K$-vector space via $\lambda e_j \mapsto \lambda$, which is $K$-linear and bijective because $\{e_j\}$ is $K(\alpha)$-linearly independent).
Since $\{e_1, \ldots, e_t\}$ is a $K(\alpha)$-basis of $L$, every $x \in L$ has a unique expression $x = \sum_{j=1}^t \lambda_j e_j$ with $\lambda_j \in K(\alpha)$. This is precisely the $K$-vector space direct sum decomposition
\begin{align*}
L = \bigoplus_{j=1}^t V_j \qquad \text{(as $K$-vector spaces)}. \tag{$\ast$}
\end{align*}
Each $V_j$ is $m_\alpha$-invariant: if $x = \lambda e_j \in V_j$ with $\lambda \in K(\alpha)$, then $m_\alpha(x) = \alpha \lambda e_j$, and $\alpha \lambda \in K(\alpha)$ since $K(\alpha)$ is closed under multiplication. Hence $m_\alpha(V_j) \subseteq V_j$.
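To make the decomposition concrete, here is a hedged example (the fields are an assumption made for illustration only): take $K = \mathbb{Q}$, $\alpha = \sqrt{2}$ and $L = \mathbb{Q}(\sqrt{2}, \sqrt{3})$, so that $r = 2$, $t = 2$, $n = 4$, and $e_1 = 1$, $e_2 = \sqrt{3}$ is a $K(\alpha)$-basis of $L$. Then
\begin{align*}
V_1 &= \mathbb{Q}(\sqrt{2}) \cdot 1 = \{a + b\sqrt{2} : a, b \in \mathbb{Q}\}, \\
V_2 &= \mathbb{Q}(\sqrt{2}) \cdot \sqrt{3} = \{c\sqrt{3} + d\sqrt{6} : c, d \in \mathbb{Q}\}, \\
m_\alpha(a + b\sqrt{2}) &= 2b + a\sqrt{2} \in V_1, \qquad m_\alpha(c\sqrt{3} + d\sqrt{6}) = 2d\sqrt{3} + c\sqrt{6} \in V_2.
\end{align*}
Each $V_j$ has $\mathbb{Q}$-dimension $2 = r$, and $L = V_1 \oplus V_2$ because $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$ is a $\mathbb{Q}$-basis of $L$.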
[guided]
Let $r := \deg p_\alpha$ and $t := [L:K(\alpha)]$. Note that $r = [K(\alpha):K]$: this is the standard fact that $K(\alpha) \cong K[x]/\langle p_\alpha \rangle$ (as $K$-algebras) via $\alpha \leftrightarrow x$, and the quotient $K[x]/\langle p_\alpha \rangle$ has $K$-basis $\{1, x, x^2, \ldots, x^{r-1}\}$, hence $K$-dimension $r$. (This is the [Minimal Polynomial Degree Equals Extension Degree](/theorems/???) principle.) Therefore, by the [tower law](/pages/???),
\begin{align*}
n = [L:K] = [L:K(\alpha)] \cdot [K(\alpha):K] = tr.
\end{align*}
Pick a $K(\alpha)$-basis $e_1, \ldots, e_t$ of $L$ (exists because $L$ is finite-dimensional over $K(\alpha)$, of dimension $t$). For each $j$, consider
\begin{align*}
V_j := K(\alpha) \cdot e_j = \{\lambda e_j : \lambda \in K(\alpha)\}.
\end{align*}
This is a $K$-subspace of $L$. Its $K$-dimension is $r$: the map $K(\alpha) \to V_j$, $\lambda \mapsto \lambda e_j$, is $K$-linear, surjective by definition of $V_j$, and injective because $\{e_j\}$ is linearly independent over $K(\alpha)$ (singleton subset of a basis), so $\lambda e_j = 0 \implies \lambda = 0$. Hence $V_j \cong K(\alpha)$ as $K$-vector spaces, and $\dim_K V_j = \dim_K K(\alpha) = r$.
The direct sum decomposition $(\ast)$ follows from the fact that $\{e_1, \ldots, e_t\}$ is a $K(\alpha)$-basis of $L$: every $x \in L$ has a unique expression $x = \sum_j \lambda_j e_j$ with $\lambda_j \in K(\alpha)$, and the uniqueness gives the direct-sum structure. The total $K$-dimension of $\bigoplus_j V_j$ is $\sum_{j=1}^t r = tr = n$, matching $\dim_K L$, confirming that the decomposition exhausts $L$.
Crucially, each $V_j$ is $m_\alpha$-invariant: for $\lambda e_j \in V_j$, we have $m_\alpha(\lambda e_j) = \alpha(\lambda e_j) = (\alpha \lambda) e_j$, and $\alpha \lambda \in K(\alpha)$ (because $\alpha \in K(\alpha)$ and $K(\alpha)$ is a subring of $L$, closed under multiplication). So $m_\alpha(\lambda e_j) \in V_j$.
Invariance is the structural fact that enables block diagonalisation: because $m_\alpha$ preserves each summand, the matrix of $m_\alpha$ in a basis compatible with the decomposition is block-diagonal.
[/guided]
[/step]
[step:Express the matrix of $m_\alpha$ on $L$ as block-diagonal and reduce to the single block $K(\alpha) \cdot e_1$]
Let $1, \alpha, \ldots, \alpha^{r-1}$ be the standard $K$-basis of $K(\alpha)$ (it is a $K$-basis because $K(\alpha) \cong K[x]/\langle p_\alpha \rangle$ and $\{1, x, \ldots, x^{r-1}\}$ is a $K$-basis of the quotient). For each $j \in \{1, \ldots, t\}$, the set
\begin{align*}
\mathcal{B}_j := \{e_j, \alpha e_j, \alpha^2 e_j, \ldots, \alpha^{r-1} e_j\}
\end{align*}
is a $K$-basis of $V_j$ (it is the image of $\{1, \alpha, \ldots, \alpha^{r-1}\}$ under the $K$-linear isomorphism $K(\alpha) \to V_j$, $\lambda \mapsto \lambda e_j$ from the previous step). Concatenating over $j$:
\begin{align*}
\mathcal{B} := \mathcal{B}_1 \sqcup \mathcal{B}_2 \sqcup \cdots \sqcup \mathcal{B}_t
\end{align*}
is a $K$-basis of $L$ (totalling $tr = n$ vectors, matching $\dim_K L$). This is an ordered basis: we list $\mathcal{B}_1$, then $\mathcal{B}_2$, etc.
With respect to $\mathcal{B}$, the matrix $M \in K^{n \times n}$ of $m_\alpha$ is block-diagonal:
\begin{align*}
M = \begin{pmatrix} C & 0 & \cdots & 0 \\ 0 & C & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & C \end{pmatrix} \in K^{n \times n}, \tag{$\dagger$}
\end{align*}
where $C \in K^{r \times r}$ is the matrix of $m_\alpha$ restricted to $V_j$, expressed in the basis $\mathcal{B}_j$ — and this matrix is the **same** for every $j$. Indeed, for each $j$, the map $V_j \to K(\alpha)$ sending $\alpha^i e_j \mapsto \alpha^i$ is a $K$-linear isomorphism that intertwines $m_\alpha|_{V_j}$ with $m_\alpha|_{K(\alpha)}$ (both are "multiplication by $\alpha$"), so the matrix representations agree.
The off-diagonal blocks are zero because each $V_j$ is $m_\alpha$-invariant (previous step): $m_\alpha(\alpha^i e_j) = \alpha^{i+1} e_j \in V_j$, never involving any $e_{j'}$ with $j' \neq j$.
Taking determinants of $(xI_n - M)$ using the multiplicativity of the determinant on block-diagonal matrices:
\begin{align*}
\chi_{m_\alpha}(x) = \det(xI_n - M) = \prod_{j=1}^t \det(xI_r - C) = \det(xI_r - C)^t. \tag{$\ddagger$}
\end{align*}
So the characteristic polynomial of $m_\alpha$ on $L$ is the $t$-th power of the characteristic polynomial of $m_\alpha$ restricted to a single summand $K(\alpha) \cdot e_1 \cong K(\alpha)$.
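For illustration, assume $K = \mathbb{Q}$, $\alpha = \sqrt{2}$ and $L = \mathbb{Q}(\sqrt{2}, \sqrt{3})$ with $K(\alpha)$-basis $e_1 = 1$, $e_2 = \sqrt{3}$ (an example chosen here, not taken from the source). The ordered basis is then $\mathcal{B} = (1, \sqrt{2}, \sqrt{3}, \sqrt{6})$, and $m_\alpha$ acts by $1 \mapsto \sqrt{2}$, $\sqrt{2} \mapsto 2$, $\sqrt{3} \mapsto \sqrt{6}$, $\sqrt{6} \mapsto 2\sqrt{3}$. Writing images as columns:
\begin{align*}
M &= \begin{pmatrix} 0 & 2 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad C = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}, \\
\det(xI_4 - M) &= \det(xI_2 - C)^2 = (x^2 - 2)^2.
\end{align*}
The same $2 \times 2$ block appears twice on the diagonal, once for each of the $t = 2$ basis vectors $e_j$.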
[guided]
We upgrade the $K(\alpha)$-basis $\{e_1, \ldots, e_t\}$ of $L$ to a $K$-basis by refining each $e_j$ using the $K$-basis $\{1, \alpha, \ldots, \alpha^{r-1}\}$ of $K(\alpha)$. The set $\{1, \alpha, \ldots, \alpha^{r-1}\}$ is a $K$-basis of $K(\alpha)$: the isomorphism $K[x]/\langle p_\alpha \rangle \cong K(\alpha)$ (sending $x$ to $\alpha$) transfers the basis $\{1, x, \ldots, x^{r-1}\}$ of the quotient to $K(\alpha)$.
For each $j$, define
\begin{align*}
\mathcal{B}_j := \{e_j, \alpha e_j, \alpha^2 e_j, \ldots, \alpha^{r-1} e_j\}.
\end{align*}
This is a $K$-basis of $V_j = K(\alpha) \cdot e_j$: applying the $K$-linear isomorphism $K(\alpha) \xrightarrow{\sim} V_j$, $\lambda \mapsto \lambda e_j$, to the $K$-basis $\{1, \alpha, \ldots, \alpha^{r-1}\}$ of $K(\alpha)$ gives a $K$-basis of $V_j$, which is exactly $\mathcal{B}_j$.
Concatenating $\mathcal{B} := \mathcal{B}_1 \sqcup \cdots \sqcup \mathcal{B}_t$ produces a set of $tr = n$ vectors in $L$. Is $\mathcal{B}$ a $K$-basis of $L$? Yes: since $L = \bigoplus_j V_j$ as $K$-vector spaces (previous step), the union of $K$-bases of the summands is a $K$-basis of $L$, and each $\mathcal{B}_j$ is a $K$-basis of $V_j$, as just verified.
Now consider the matrix $M \in K^{n \times n}$ of $m_\alpha$ with respect to the ordered basis $\mathcal{B}$ (we fix the ordering: $e_1, \alpha e_1, \ldots, \alpha^{r-1} e_1, e_2, \alpha e_2, \ldots, \alpha^{r-1} e_t$). Since $m_\alpha$ preserves each $V_j$ (by invariance), applying $m_\alpha$ to any basis vector in $\mathcal{B}_j$ produces a $K$-linear combination of vectors in $\mathcal{B}_j$ alone — no "cross-block" contributions. The matrix $M$ is therefore block-diagonal with $t$ blocks of size $r$:
\begin{align*}
M = \operatorname{diag}(C_1, C_2, \ldots, C_t),
\end{align*}
where $C_j \in K^{r \times r}$ is the matrix of $m_\alpha|_{V_j}$ in the basis $\mathcal{B}_j$.
Claim: $C_1 = C_2 = \cdots = C_t =: C$. Reason: for each $j$, the isomorphism $\psi_j: K(\alpha) \to V_j$, $\psi_j(\alpha^i) = \alpha^i e_j$, is $K$-linear and intertwines "multiplication by $\alpha$" — specifically, $m_\alpha \circ \psi_j = \psi_j \circ m_\alpha|_{K(\alpha)}$. In terms of matrices, the matrix of $m_\alpha|_{V_j}$ in $\mathcal{B}_j$ is the same as the matrix of $m_\alpha|_{K(\alpha)}$ in the basis $\{1, \alpha, \ldots, \alpha^{r-1}\}$ of $K(\alpha)$. This is independent of $j$.
Denote the common block by $C$. Then $M = \operatorname{diag}(C, C, \ldots, C)$. Taking characteristic polynomials: for block-diagonal matrices, the determinant factors as the product of the determinants of the diagonal blocks (this is a standard determinant identity, following from Laplace expansion or the definition via permutations):
\begin{align*}
\chi_{m_\alpha}(x) = \det(xI_n - M) = \prod_{j=1}^t \det(xI_r - C) = \det(xI_r - C)^t.
\end{align*}
So we have reduced computing the characteristic polynomial of $m_\alpha$ on the $n$-dimensional $L$ to computing the characteristic polynomial of $m_\alpha$ on the $r$-dimensional $K(\alpha)$, namely $\det(xI_r - C)$.
[/guided]
[/step]
[step:Identify $C$ as the companion matrix of $p_\alpha$ and compute $\det(xI_r - C) = p_\alpha(x)$]
We compute the matrix $C \in K^{r \times r}$ of $m_\alpha: K(\alpha) \to K(\alpha)$ in the $K$-basis $\{1, \alpha, \alpha^2, \ldots, \alpha^{r-1}\}$ of $K(\alpha)$.
For $i \in \{0, 1, \ldots, r-2\}$, the action of $m_\alpha$ on the basis vector $\alpha^i$ is
\begin{align*}
m_\alpha(\alpha^i) = \alpha \cdot \alpha^i = \alpha^{i+1} \in K(\alpha),
\end{align*}
which is the next basis vector.
For $i = r - 1$, we compute
\begin{align*}
m_\alpha(\alpha^{r-1}) = \alpha^r.
\end{align*}
Writing the minimal polynomial as $p_\alpha(x) = x^r + a_{r-1} x^{r-1} + \cdots + a_1 x + a_0$ with $a_i \in K$, the identity $p_\alpha(\alpha) = 0$ rearranges to
\begin{align*}
\alpha^r = -a_{r-1} \alpha^{r-1} - a_{r-2} \alpha^{r-2} - \cdots - a_1 \alpha - a_0.
\end{align*}
Reading off the coefficients: the matrix $C$ (with the convention that the $i$-th column of $C$ encodes $m_\alpha(\alpha^{i-1})$ in the basis) is
\begin{align*}
C = \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & -a_0 \\
1 & 0 & 0 & \cdots & 0 & -a_1 \\
0 & 1 & 0 & \cdots & 0 & -a_2 \\
0 & 0 & 1 & \cdots & 0 & -a_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & -a_{r-1}
\end{pmatrix} \in K^{r \times r}. \tag{$\star$}
\end{align*}
This is the **companion matrix** of $p_\alpha$.
We now compute $\chi_C(x) := \det(xI_r - C)$.
[claim:Characteristic polynomial of the companion matrix equals the minimal polynomial]
$\chi_C(x) = p_\alpha(x)$.
[/claim]
[proof]
We compute $\det(xI_r - C)$ directly by Laplace expansion along the first row. Form the matrix
\begin{align*}
xI_r - C = \begin{pmatrix}
x & 0 & 0 & \cdots & 0 & a_0 \\
-1 & x & 0 & \cdots & 0 & a_1 \\
0 & -1 & x & \cdots & 0 & a_2 \\
0 & 0 & -1 & \cdots & 0 & a_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -1 & x + a_{r-1}
\end{pmatrix}.
\end{align*}
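We prove, by induction on $r$, that $\det(xI_r - C) = p(x)$ whenever $C$ is the companion matrix of a monic polynomial $p(x) = x^r + a_{r-1} x^{r-1} + \cdots + a_1 x + a_0 \in K[x]$ of degree $r$. Irreducibility plays no role in this computation, and the claim is the case $p = p_\alpha$. Stating the induction for all monic polynomials is what allows the hypothesis to be applied to a smaller companion matrix in the inductive step.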
**Base case $r = 1$:** $C = (-a_0) \in K^{1 \times 1}$ and $p_\alpha(x) = x + a_0$. Directly, $\det(xI_1 - C) = \det(x - (-a_0)) = x + a_0 = p_\alpha(x)$.
**Inductive step:** Assume the claim for size $r - 1$. Expand $\det(xI_r - C)$ along the first row, which has two nonzero entries: $x$ in position $(1,1)$ and $a_0$ in position $(1, r)$. Using cofactor expansion:
\begin{align*}
\det(xI_r - C) = x \cdot \det(M_{11}) + (-1)^{1+r} a_0 \cdot \det(M_{1r}),
\end{align*}
where $M_{11}$ is the $(r-1) \times (r-1)$ submatrix obtained by deleting row $1$ and column $1$, and $M_{1r}$ is obtained by deleting row $1$ and column $r$.
The submatrix $M_{11}$ is $xI_{r-1} - C'$, where $C' \in K^{(r-1) \times (r-1)}$ is the companion matrix of $p'(x) := x^{r-1} + a_{r-1} x^{r-2} + \cdots + a_2 x + a_1$. By the inductive hypothesis, $\det(M_{11}) = p'(x) = x^{r-1} + a_{r-1} x^{r-2} + \cdots + a_1$.
The submatrix $M_{1r}$ is upper-triangular with $-1$'s on the diagonal: explicitly,
\begin{align*}
M_{1r} = \begin{pmatrix}
-1 & x & 0 & \cdots & 0 \\
0 & -1 & x & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & -1 & x \\
0 & 0 & \cdots & 0 & -1
\end{pmatrix} \in K^{(r-1) \times (r-1)}.
\end{align*}
The determinant of a triangular matrix is the product of its diagonal entries: $\det(M_{1r}) = (-1)^{r-1}$.
Combining:
\begin{align*}
\det(xI_r - C) &= x \cdot (x^{r-1} + a_{r-1} x^{r-2} + \cdots + a_1) + (-1)^{1+r} a_0 \cdot (-1)^{r-1} \\
&= x^r + a_{r-1} x^{r-1} + \cdots + a_1 x + (-1)^{1+r}(-1)^{r-1} a_0.
\end{align*}
The final sign is $(-1)^{1+r+r-1} = (-1)^{2r} = 1$. Therefore
\begin{align*}
\det(xI_r - C) = x^r + a_{r-1} x^{r-1} + \cdots + a_1 x + a_0 = p_\alpha(x).
\end{align*}
[/proof]
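A worked instance of the expansion may help. The cubic below is an assumption chosen for illustration (it is irreducible over $\mathbb{Q}$ by Eisenstein's criterion at $2$, although the computation only uses that it is monic). For $p(x) = x^3 - 2$, that is $r = 3$, $a_0 = -2$ and $a_1 = a_2 = 0$:
\begin{align*}
C &= \begin{pmatrix} 0 & 0 & 2 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad
xI_3 - C = \begin{pmatrix} x & 0 & -2 \\ -1 & x & 0 \\ 0 & -1 & x \end{pmatrix}, \\
\det(xI_3 - C) &= x \cdot \det\begin{pmatrix} x & 0 \\ -1 & x \end{pmatrix} + (-1)^{1+3} (-2) \cdot \det\begin{pmatrix} -1 & x \\ 0 & -1 \end{pmatrix} \\
&= x \cdot x^2 + (-2) \cdot 1 = x^3 - 2.
\end{align*}
The second minor is triangular with diagonal entries $-1$, so its determinant is $(-1)^{2} = 1$, and the two signs cancel exactly as in the general argument.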
[guided]
We compute the matrix $C$ of $m_\alpha$ restricted to $K(\alpha)$, in the $K$-basis $\{1, \alpha, \alpha^2, \ldots, \alpha^{r-1}\}$. For $i \in \{0, 1, \ldots, r-2\}$:
\begin{align*}
m_\alpha(\alpha^i) = \alpha \cdot \alpha^i = \alpha^{i+1}.
\end{align*}
This is already a single basis vector — the $(i+2)$-th basis vector, namely $\alpha^{i+1}$. So the $(i+1)$-th column of $C$ (with 1-indexing) is the standard basis vector $e_{i+2}$: a single $1$ in position $i+2$, zeros elsewhere.
The interesting case is $i = r - 1$:
\begin{align*}
m_\alpha(\alpha^{r-1}) = \alpha^r.
\end{align*}
Now $\alpha^r$ is not itself one of the basis vectors $\{1, \alpha, \ldots, \alpha^{r-1}\}$ — it is one power higher. But $\alpha$ is a root of its minimal polynomial:
\begin{align*}
p_\alpha(\alpha) = \alpha^r + a_{r-1} \alpha^{r-1} + \cdots + a_1 \alpha + a_0 = 0,
\end{align*}
where we write $p_\alpha(x) = x^r + a_{r-1} x^{r-1} + \cdots + a_1 x + a_0 \in K[x]$ (the coefficients $a_i$ lie in $K$ because $p_\alpha \in K[x]$). Rearranging:
\begin{align*}
\alpha^r = -a_0 - a_1 \alpha - a_2 \alpha^2 - \cdots - a_{r-1} \alpha^{r-1}.
\end{align*}
This expresses $\alpha^r$ as a $K$-linear combination of the basis vectors. So the $r$-th column of $C$ is $(-a_0, -a_1, \ldots, -a_{r-1})^\top$.
Assembling: the matrix $C$ is the **companion matrix** of $p_\alpha$, displayed in $(\star)$.
We claim $\det(xI_r - C) = p_\alpha(x)$. This is a classical fact about companion matrices, proved by induction on $r$ via Laplace expansion along the first row of $xI_r - C$:
- The $(1,1)$ entry is $x$; its cofactor is the determinant of an $(r-1) \times (r-1)$ matrix of the same shape, which by induction equals $x^{r-1} + a_{r-1} x^{r-2} + \cdots + a_1$ (the characteristic polynomial of the companion matrix of the monic polynomial $(p_\alpha(x) - a_0)/x$).
- The $(1, r)$ entry is $a_0$; its cofactor is $(-1)^{1+r}$ times the determinant of an upper-triangular matrix with $-1$'s on the diagonal, which equals $(-1)^{r-1}$.
The signs combine to $(-1)^{1+r} \cdot (-1)^{r-1} = (-1)^{2r} = 1$, giving
\begin{align*}
\det(xI_r - C) = x \cdot (x^{r-1} + a_{r-1} x^{r-2} + \cdots + a_1) + a_0 = x^r + a_{r-1} x^{r-1} + \cdots + a_1 x + a_0 = p_\alpha(x).
\end{align*}
The proof of the claim above carries out the full induction. The base case $r = 1$ is $C = (-a_0)$, $p_\alpha(x) = x + a_0$, and $\det(xI_1 - C) = x + a_0$, which matches; the inductive step is the expansion just sketched.
This is sometimes phrased: the minimal polynomial of an element $\alpha$ is equal to the characteristic polynomial of multiplication-by-$\alpha$ acting on $K(\alpha)$.
[/guided]
[/step]
[step:Combine to obtain $\det(xI - m_\alpha) = p_\alpha^{[L:K(\alpha)]}$]
Combining $(\ddagger)$ with the claim in the previous step:
\begin{align*}
\chi_{m_\alpha}(x) = \det(xI_n - M) = \det(xI_r - C)^t = p_\alpha(x)^t = p_\alpha(x)^{[L:K(\alpha)]}.
\end{align*}
This is the first assertion of the theorem.
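As a cross-check in a basis that is not adapted to the decomposition (an example assumed here for illustration, not taken from the source), take $K = \mathbb{Q}$, $L = \mathbb{Q}(\beta)$ with $\beta = \sqrt[4]{2}$, and $\alpha = \beta^2 = \sqrt{2}$. Since $x^4 - 2$ is irreducible over $\mathbb{Q}$ by Eisenstein's criterion, $[L:K] = 4$, while $p_\alpha = x^2 - 2$ and $t = [L:K(\alpha)] = 2$. In the $\mathbb{Q}$-basis $(1, \beta, \beta^2, \beta^3)$ the map $m_\alpha$ sends $1 \mapsto \beta^2$, $\beta \mapsto \beta^3$, $\beta^2 \mapsto 2$, $\beta^3 \mapsto 2\beta$, so
\begin{align*}
M &= \begin{pmatrix} 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \\
\det(xI_4 - M) &= \det\begin{pmatrix} xI_2 & -2I_2 \\ -I_2 & xI_2 \end{pmatrix} = \det(x^2 I_2 - 2I_2) = (x^2 - 2)^2 = x^4 - 4x^2 + 4.
\end{align*}
The middle equality uses that the four $2 \times 2$ blocks commute with one another. The matrix is not block-diagonal in this basis, yet the characteristic polynomial is still $p_\alpha^t$, consistent with basis independence.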
[guided]
Assembling everything:
- From step 3, the characteristic polynomial of $m_\alpha$ on $L$ factors as $\chi_{m_\alpha}(x) = \det(xI_r - C)^t$, where $C$ is the matrix of $m_\alpha$ on the single summand $K(\alpha)$ and $t = [L:K(\alpha)]$.
- From step 4, $C$ is the companion matrix of $p_\alpha$ and $\det(xI_r - C) = p_\alpha(x)$.
Substituting:
\begin{align*}
\chi_{m_\alpha}(x) = p_\alpha(x)^t = p_\alpha(x)^{[L:K(\alpha)]}.
\end{align*}
This is the main claim of the theorem.
The formula has a clean structural interpretation: the characteristic polynomial of $m_\alpha$ is the minimal polynomial $p_\alpha$ raised to the "redundancy factor" $[L:K(\alpha)]$. The redundancy measures how much bigger $L$ is than the smallest field $K(\alpha)$ in which $m_\alpha$ already "uses up" all its algebraic content.
[/guided]
[/step]
[step:Derive the norm and trace formulas on $K(\alpha)$ from the constant and sub-leading coefficients of $p_\alpha$]
Suppose $p_\alpha$ splits in some extension field $M \supseteq K$ (here $M$ denotes a field, not the matrix of $m_\alpha$ from the earlier steps) as $p_\alpha(x) = \prod_{i=1}^r (x - \alpha_i)$, where $\alpha_1, \ldots, \alpha_r \in M$ are the roots listed with multiplicity. Since $p_\alpha$ is irreducible and a number field has characteristic $0$, the roots are in fact distinct.
Expanding the product using [Vieta's formulas](/pages/???):
\begin{align*}
p_\alpha(x) = \prod_{i=1}^r (x - \alpha_i) = x^r - \left(\sum_i \alpha_i\right) x^{r-1} + \cdots + (-1)^r \prod_i \alpha_i. \tag{$\diamond$}
\end{align*}
So the coefficient of $x^{r-1}$ in $p_\alpha$ is $-\sum_i \alpha_i$, and the constant term is $(-1)^r \prod_i \alpha_i$.
By definition of norm and trace on the field $K(\alpha)$, where $m_\alpha: K(\alpha) \to K(\alpha)$ has characteristic polynomial $p_\alpha$:
\begin{align*}
N_{K(\alpha)/K}(\alpha) &:= \det(m_\alpha|_{K(\alpha)}) \in K, \\
\operatorname{tr}_{K(\alpha)/K}(\alpha) &:= \operatorname{tr}(m_\alpha|_{K(\alpha)}) \in K.
\end{align*}
For any $r \times r$ matrix $A$ with characteristic polynomial $\det(xI_r - A) = x^r + c_{r-1} x^{r-1} + \cdots + c_1 x + c_0$:
- The determinant equals $(-1)^r c_0$: substituting $x = 0$ gives $c_0 = \det(-A) = (-1)^r \det(A)$. For $A$ the matrix of $m_\alpha|_{K(\alpha)}$, whose characteristic polynomial is $p_\alpha$, this reads $\det(m_\alpha|_{K(\alpha)}) = (-1)^r p_\alpha(0)$.
- The trace equals $-c_{r-1}$: in the Leibniz expansion of $\det(xI_r - A)$, only the identity permutation contributes to the coefficient of $x^{r-1}$, and its contribution is $-\operatorname{tr}(A)$.
Applying these formulas to the characteristic polynomial $p_\alpha$ of $m_\alpha|_{K(\alpha)}$ with the expansion $(\diamond)$:
\begin{align*}
N_{K(\alpha)/K}(\alpha) &= (-1)^r p_\alpha(0) = (-1)^r \cdot (-1)^r \prod_i \alpha_i = \prod_i \alpha_i, \\
\operatorname{tr}_{K(\alpha)/K}(\alpha) &= -(-\sum_i \alpha_i) = \sum_i \alpha_i.
\end{align*}
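A short example with nonzero trace (assumed here for illustration, not from the source): let $K = \mathbb{Q}$ and $\alpha = 1 + \sqrt{2}$. From $(\alpha - 1)^2 = 2$ we get $p_\alpha(x) = x^2 - 2x - 1 = (x - (1 + \sqrt{2}))(x - (1 - \sqrt{2}))$, so $\alpha_1 = 1 + \sqrt{2}$, $\alpha_2 = 1 - \sqrt{2}$ and
\begin{align*}
N_{K(\alpha)/K}(\alpha) &= \alpha_1 \alpha_2 = 1 - 2 = -1, \qquad \operatorname{tr}_{K(\alpha)/K}(\alpha) = \alpha_1 + \alpha_2 = 2.
\end{align*}
This agrees with the matrix of $m_\alpha$ in the basis $(1, \alpha)$: since $m_\alpha(1) = \alpha$ and $m_\alpha(\alpha) = \alpha^2 = 1 + 2\alpha$,
\begin{align*}
C = \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix}, \qquad \det(C) = -1, \qquad \operatorname{tr}(C) = 2.
\end{align*}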
[guided]
Suppose we have a splitting $p_\alpha(x) = \prod_{i=1}^r (x - \alpha_i)$ in some extension $M$ of $K$. Expanding the product by distribution gives the [Vieta formulas](/pages/???):
\begin{align*}
\prod_{i=1}^r (x - \alpha_i) = \sum_{k=0}^r (-1)^k e_k(\alpha_1, \ldots, \alpha_r) x^{r-k},
\end{align*}
where $e_k$ is the $k$-th elementary symmetric polynomial. So
\begin{align*}
p_\alpha(x) = x^r - e_1(\alpha) x^{r-1} + e_2(\alpha) x^{r-2} - \cdots + (-1)^r e_r(\alpha),
\end{align*}
writing $e_k(\alpha) := e_k(\alpha_1, \ldots, \alpha_r)$ for brevity.
Reading off specific coefficients:
- Coefficient of $x^{r-1}$: $-e_1(\alpha) = -\sum_i \alpha_i$.
- Constant term (coefficient of $x^0$): $(-1)^r e_r(\alpha) = (-1)^r \prod_i \alpha_i$.
Now apply the definitions of norm and trace. For any $K$-linear endomorphism $T$ of a finite-dimensional $K$-vector space with characteristic polynomial
\begin{align*}
\chi_T(x) = x^N + c_{N-1} x^{N-1} + \cdots + c_1 x + c_0,
\end{align*}
the standard facts are:
- $\det(T) = (-1)^N c_0$: substituting $x = 0$ into $\chi_T(x) = \det(xI - T)$ gives $c_0 = \chi_T(0) = \det(-T) = (-1)^N \det(T)$, so $\det(T) = (-1)^N c_0$.
- $\operatorname{tr}(T) = -c_{N-1}$: expanding $\det(xI - T)$ by the Leibniz formula, the identity permutation contributes $\prod_i (x - T_{ii}) = x^N - (\sum_i T_{ii}) x^{N-1} + \cdots = x^N - \operatorname{tr}(T) x^{N-1} + \cdots$, and any other permutation contributes a term of $x$-degree at most $N - 2$; hence $c_{N-1} = -\operatorname{tr}(T)$.
Apply these with $T = m_\alpha|_{K(\alpha)}$, $N = r$. The definitions of the field-theoretic norm and trace are
\begin{align*}
N_{K(\alpha)/K}(\alpha) &:= \det(m_\alpha|_{K(\alpha)}), \\
\operatorname{tr}_{K(\alpha)/K}(\alpha) &:= \operatorname{tr}(m_\alpha|_{K(\alpha)}).
\end{align*}
From the characteristic polynomial $p_\alpha$, we read:
\begin{align*}
c_0 &= (-1)^r \prod_i \alpha_i, \quad c_{r-1} = -\sum_i \alpha_i.
\end{align*}
Hence:
\begin{align*}
N_{K(\alpha)/K}(\alpha) &= (-1)^r \cdot c_0 = (-1)^r \cdot (-1)^r \prod_i \alpha_i = \prod_i \alpha_i, \\
\operatorname{tr}_{K(\alpha)/K}(\alpha) &= -c_{r-1} = -(-\sum_i \alpha_i) = \sum_i \alpha_i.
\end{align*}
[/guided]
[/step]
[step:Derive the norm and trace formulas on $L$ from $\chi_{m_\alpha} = p_\alpha^t$]
The characteristic polynomial of $m_\alpha$ on $L$ is $p_\alpha(x)^t = \prod_{i=1}^r (x - \alpha_i)^t$, where $t = [L:K(\alpha)]$ and $n = rt = [L:K]$.
Expanding $\prod_{i=1}^r (x - \alpha_i)^t$: this is a polynomial of degree $n = rt$ whose roots (in $M$, with multiplicity) are $\alpha_1, \ldots, \alpha_r$, each appearing with multiplicity $t$. By Vieta's formulas applied to this degree-$n$ polynomial:
\begin{align*}
\prod_{i=1}^r (x - \alpha_i)^t &= x^n - \left(\text{sum of roots with multiplicity}\right) x^{n-1} + \cdots + (-1)^n \left(\text{product of roots with multiplicity}\right).
\end{align*}
The "sum of roots with multiplicity" is $t \cdot \sum_i \alpha_i$ (each $\alpha_i$ contributing $t$ times). The "product of roots with multiplicity" is $\prod_i \alpha_i^t = (\prod_i \alpha_i)^t$.
Applying the determinant/trace formulas from the previous step with $N = n$:
\begin{align*}
N_{L/K}(\alpha) &:= \det(m_\alpha) = (-1)^n c_0 = (-1)^n \cdot (-1)^n \left(\prod_i \alpha_i\right)^t = \left(\prod_i \alpha_i\right)^{[L:K(\alpha)]}, \\
\operatorname{tr}_{L/K}(\alpha) &:= \operatorname{tr}(m_\alpha) = -c_{n-1} = -\left(-t \sum_i \alpha_i\right) = [L:K(\alpha)] \sum_i \alpha_i.
\end{align*}
This completes the derivation of all four formulas in the theorem statement.
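As a numerical check of the last two formulas (the fields are an assumption made for illustration only), take $K = \mathbb{Q}$, $\alpha = 1 + \sqrt{2}$ and $L = \mathbb{Q}(\sqrt{2}, \sqrt{3})$. Then $p_\alpha(x) = x^2 - 2x - 1$ with roots $1 \pm \sqrt{2}$, and $r = 2$, $t = [L:K(\alpha)] = 2$, $n = 4$. Expanding the characteristic polynomial:
\begin{align*}
p_\alpha(x)^2 &= (x^2 - 2x - 1)^2 = x^4 - 4x^3 + 2x^2 + 4x + 1, \\
N_{L/K}(\alpha) &= (-1)^4 \cdot 1 = 1 = \bigl((1 + \sqrt{2})(1 - \sqrt{2})\bigr)^2 = (-1)^2, \\
\operatorname{tr}_{L/K}(\alpha) &= -(-4) = 4 = 2 \cdot \bigl((1 + \sqrt{2}) + (1 - \sqrt{2})\bigr).
\end{align*}
The coefficients $c_0 = 1$ and $c_3 = -4$ read off from the expansion match the product and sum of the roots counted with multiplicity $t = 2$.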
[guided]
On $L$, the characteristic polynomial of $m_\alpha$ is $p_\alpha^t$ with $t = [L:K(\alpha)]$. Reading off the roots of $p_\alpha^t$ in $M$ (counted with multiplicity):
\begin{align*}
p_\alpha(x)^t = \prod_{i=1}^r (x - \alpha_i)^t.
\end{align*}
This polynomial has degree $rt = n = [L:K]$. Its roots (with multiplicity) are $\alpha_1, \ldots, \alpha_r$, each repeated $t$ times. Applying Vieta's formulas to this degree-$n$ polynomial:
The coefficient of $x^{n-1}$ is the negative of the sum of the roots counted with multiplicity:
\begin{align*}
-\left[\underbrace{(\alpha_1 + \alpha_1 + \cdots + \alpha_1)}_{t \text{ times}} + \cdots + \underbrace{(\alpha_r + \cdots + \alpha_r)}_{t \text{ times}}\right] = -t \sum_i \alpha_i.
\end{align*}
The constant term is $(-1)^n$ times the product of the roots counted with multiplicity:
\begin{align*}
(-1)^n \cdot \alpha_1^t \cdot \alpha_2^t \cdots \alpha_r^t = (-1)^n \left(\prod_i \alpha_i\right)^t.
\end{align*}
Apply the standard determinant-constant and trace-coefficient identities for an $n \times n$ matrix, this time with $N = n$:
\begin{align*}
N_{L/K}(\alpha) &= \det(m_\alpha) = (-1)^n c_0 = (-1)^n \cdot (-1)^n \left(\prod_i \alpha_i\right)^t = \left(\prod_i \alpha_i\right)^t, \\
\operatorname{tr}_{L/K}(\alpha) &= \operatorname{tr}(m_\alpha) = -c_{n-1} = t \sum_i \alpha_i.
\end{align*}
With $t = [L:K(\alpha)]$, these are the announced formulas:
\begin{align*}
N_{L/K}(\alpha) &= \left(\prod_i \alpha_i\right)^{[L:K(\alpha)]}, \\
\operatorname{tr}_{L/K}(\alpha) &= [L:K(\alpha)] \sum_i \alpha_i.
\end{align*}
Interpretation: the norm and trace over $L$ are obtained from the norm and trace over $K(\alpha)$ by raising the product to the power $t$ (for the norm, which is multiplicative) and multiplying the sum by $t$ (for the trace, which is additive). The factor $t$ records how many "copies" of the building block $K(\alpha)$ sit inside $L$, reflecting the multiplicative behaviour of $\det$ and the additive behaviour of $\operatorname{tr}$ on block-diagonal matrices with repeated diagonal blocks.
This completes the proof of the theorem.
[/guided]
[/step]