Transitivity of Trace and Norm (Theorem # 1293)
Theorem
Let $K \subset F \subset L$ be a tower of finite field extensions. Then for all $\alpha \in L$,
\begin{align*}
\operatorname{Tr}_{L/K}(\alpha) &= \operatorname{Tr}_{F/K}\bigl(\operatorname{Tr}_{L/F}(\alpha)\bigr), \\
\operatorname{N}_{L/K}(\alpha) &= \operatorname{N}_{F/K}\bigl(\operatorname{N}_{L/F}(\alpha)\bigr).
\end{align*}
Algebra
Abstract Algebra
Proof
[proofplan]
We express the multiplication-by-$\alpha$ map $m_\alpha$ on $L$ in terms of a composite $K$-basis built from a $K$-basis $\{e_i\}$ of $F$ and an $F$-basis $\{f_j\}$ of $L$, following the construction of the [Tower Law](/theorems/1248). Ordering the composite basis $\{e_i f_j\}$ by the index $j$ gives the matrix of $m_\alpha$ an $r \times r$ block structure: the $(k,j)$-block is the $K$-linear matrix of multiplication by $c_{kj}$ on $F$, where $C = (c_{kj})$ is the $F$-linear matrix of $m_\alpha$ on $L$. The trace of the full matrix is the sum of the traces of the diagonal blocks, which yields the trace formula by the additivity of $\operatorname{Tr}_{F/K}$. The block matrix is in general neither block-diagonal nor block-triangular, so the norm formula needs more: all blocks lie in a commutative subring of $K^{m \times m}$ isomorphic to $F$, and we show that the determinant of such a block matrix equals the $K$-norm of its $F$-determinant by checking the identity on elementary matrices and treating singular matrices separately.
[/proofplan]
[step:Fix a composite $K$-basis of $L$ from the tower $K \subset F \subset L$]
Set $m := [F : K]$ and $r := [L : F]$, so that $[L : K] = mr$ by the [Tower Law](/theorems/1248). Fix a $K$-basis $\{e_1, \ldots, e_m\}$ of $F$ and an $F$-basis $\{f_1, \ldots, f_r\}$ of $L$. By the proof of the [Tower Law](/theorems/1248), the set
\begin{align*}
\mathcal{B} := \{e_i f_j : 1 \le i \le m, \, 1 \le j \le r\}
\end{align*}
is a $K$-basis of $L$ with $mr$ elements. We order $\mathcal{B}$ by grouping by the $F$-basis index first: the basis elements are
\begin{align*}
\underbrace{e_1 f_1, \, e_2 f_1, \, \ldots, \, e_m f_1}_{\text{block } 1}, \quad \underbrace{e_1 f_2, \, e_2 f_2, \, \ldots, \, e_m f_2}_{\text{block } 2}, \quad \ldots, \quad \underbrace{e_1 f_r, \, e_2 f_r, \, \ldots, \, e_m f_r}_{\text{block } r}.
\end{align*}
This ordering ensures that the first $m$ basis vectors correspond to the $F$-component along $f_1$, the next $m$ to the component along $f_2$, and so on.
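As a concrete illustration (an example chosen here for concreteness, not part of the argument itself), take $K = \mathbb{Q}$, $F = \mathbb{Q}(\sqrt{2})$ and $L = \mathbb{Q}(\theta)$ with $\theta = \sqrt[4]{2}$, so that $\theta^2 = \sqrt{2}$ and $m = r = 2$. With $e_1 = 1$, $e_2 = \sqrt{2}$ and $f_1 = 1$, $f_2 = \theta$, the ordered composite basis is
\begin{align*}
\underbrace{1, \, \sqrt{2}}_{\text{block } 1}, \quad \underbrace{\theta, \, \sqrt{2}\,\theta}_{\text{block } 2},
\end{align*}
which is the familiar $\mathbb{Q}$-basis $1, \theta^2, \theta, \theta^3$ of $\mathbb{Q}(\sqrt[4]{2})$, listed in a rearranged order.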
[guided]
The construction of the composite basis is identical to the one used in proving the Tower Law. The purpose of choosing a specific ordering is to reveal the block structure of the multiplication map. If we instead grouped by $K$-basis index first (listing $e_1 f_1, e_1 f_2, \ldots, e_1 f_r, e_2 f_1, \ldots$), the block structure would correspond to a different decomposition and the argument would become more complicated.
By grouping by $j$ (the $F$-basis index), the $j$-th block of $m$ consecutive basis elements spans the "copy of $F$" sitting inside the $f_j$-component of $L$. When we multiply by $\alpha \in L$, the element $\alpha$ acts $F$-linearly on $L$ (since $\alpha \cdot (cf) = c(\alpha f)$ for $c \in F$), and the $F$-linear structure of this action determines how the blocks interact. The $K$-linear structure within each block then records the action of the relevant $F$-element on $F$ itself.
[/guided]
[/step]
[step:Express the multiplication map $m_\alpha$ on $L$ in block form]
Let $\alpha \in L$. The multiplication-by-$\alpha$ map is the $K$-linear endomorphism
\begin{align*}
m_\alpha \colon L &\to L \\
x &\mapsto \alpha x.
\end{align*}
The trace $\operatorname{Tr}_{L/K}(\alpha)$ is defined as $\operatorname{tr}(M)$ and the norm $\operatorname{N}_{L/K}(\alpha)$ is defined as $\det(M)$, where $M \in K^{mr \times mr}$ is the matrix of $m_\alpha$ with respect to any $K$-basis of $L$. We compute $M$ with respect to the ordered basis $\mathcal{B}$.
Since $\{f_1, \ldots, f_r\}$ is an $F$-basis of $L$ and $\alpha \in L$, there exist unique elements $c_{kj} \in F$ (for $1 \le k \le r$, $1 \le j \le r$) such that
\begin{align*}
\alpha \cdot f_j = \sum_{k=1}^{r} c_{kj} \, f_k \quad \text{for each } j \in \{1, \ldots, r\}.
\end{align*}
The matrix $C := (c_{kj})_{1 \le k, j \le r} \in F^{r \times r}$ is the matrix of $m_\alpha$ acting on $L$ as an $F$-vector space, with respect to the basis $\{f_1, \ldots, f_r\}$. In particular, $\operatorname{Tr}_{L/F}(\alpha) = \sum_{k=1}^{r} c_{kk}$ and $\operatorname{N}_{L/F}(\alpha) = \det(C)$.
Now expand the $K$-action on each block. For the basis element $e_i f_j$ (where $1 \le i \le m$ and $1 \le j \le r$):
\begin{align*}
\alpha \cdot (e_i f_j) = e_i \cdot (\alpha \cdot f_j) = e_i \cdot \sum_{k=1}^{r} c_{kj} \, f_k = \sum_{k=1}^{r} (c_{kj} \, e_i) \, f_k.
\end{align*}
Here we used the commutativity of multiplication in $L$ (so $e_i \cdot (\alpha f_j) = \alpha \cdot (e_i f_j)$) and the distributive law. Each term $c_{kj} \, e_i$ lies in $F$, so we expand it in the $K$-basis $\{e_1, \ldots, e_m\}$ of $F$. The multiplication-by-$c_{kj}$ map on $F$ is the $K$-linear endomorphism
\begin{align*}
m_{c_{kj}} \colon F &\to F \\
y &\mapsto c_{kj} \cdot y,
\end{align*}
with matrix $A_{kj} \in K^{m \times m}$ with respect to the basis $\{e_1, \ldots, e_m\}$. Writing $A_{kj} = ((A_{kj})_{\ell i})_{1 \le \ell, i \le m}$, we have
\begin{align*}
c_{kj} \cdot e_i = \sum_{\ell=1}^{m} (A_{kj})_{\ell i} \, e_\ell.
\end{align*}
Substituting back:
\begin{align*}
\alpha \cdot (e_i f_j) = \sum_{k=1}^{r} \sum_{\ell=1}^{m} (A_{kj})_{\ell i} \, (e_\ell f_k).
\end{align*}
This shows that the coefficient of $e_\ell f_k$ in the expansion of $\alpha \cdot (e_i f_j)$ is $(A_{kj})_{\ell i}$. In the matrix $M$, the entry in row $(\ell, k)$ and column $(i, j)$ — where the row corresponds to the basis element $e_\ell f_k$ and the column to $e_i f_j$ — is $(A_{kj})_{\ell i}$.
Therefore, the $mr \times mr$ matrix $M$ has the $r \times r$ block structure
\begin{align*}
M = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1r} \\ A_{21} & A_{22} & \cdots & A_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ A_{r1} & A_{r2} & \cdots & A_{rr} \end{pmatrix},
\end{align*}
where each $A_{kj} \in K^{m \times m}$ is the matrix of the multiplication-by-$c_{kj}$ map on $F$ with respect to $\{e_1, \ldots, e_m\}$.
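As a worked instance, consider the illustrative tower $K = \mathbb{Q}$, $F = \mathbb{Q}(\sqrt{2})$, $L = \mathbb{Q}(\theta)$ with $\theta = \sqrt[4]{2}$ (an example chosen here for concreteness), with bases $e_1 = 1$, $e_2 = \sqrt{2}$ and $f_1 = 1$, $f_2 = \theta$, and take $\alpha = 1 + \theta$. Since $\alpha f_1 = f_1 + f_2$ and $\alpha f_2 = \theta + \theta^2 = \sqrt{2}\, f_1 + f_2$, we get
\begin{align*}
C = \begin{pmatrix} 1 & \sqrt{2} \\ 1 & 1 \end{pmatrix}, \qquad A_{11} = A_{21} = A_{22} = I_2, \qquad A_{12} = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix},
\end{align*}
where $A_{12}$ records $\sqrt{2} \cdot e_1 = e_2$ and $\sqrt{2} \cdot e_2 = 2 e_1$. In the ordered basis $1, \sqrt{2}, \theta, \sqrt{2}\,\theta$, the matrix of $m_\alpha$ is therefore
\begin{align*}
M = \begin{pmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}.
\end{align*}
For instance, the fourth column reads off $\alpha \cdot \sqrt{2}\,\theta = 2 + \sqrt{2}\,\theta$, with coefficient $2$ on $e_1 f_1 = 1$ and coefficient $1$ on $e_2 f_2 = \sqrt{2}\,\theta$.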
[guided]
The core of this step is translating the two-layer structure of the tower $K \subset F \subset L$ into block-matrix language.
**The outer layer: $\alpha$ acts on $L$ over $F$.** The element $\alpha$ acts $F$-linearly on $L$, and the matrix of this action with respect to the $F$-basis $\{f_1, \ldots, f_r\}$ is $C = (c_{kj}) \in F^{r \times r}$. The diagonal entries $c_{11}, \ldots, c_{rr}$ of $C$ determine $\operatorname{Tr}_{L/F}(\alpha) = \sum_{k=1}^{r} c_{kk}$, and $\det(C) = \operatorname{N}_{L/F}(\alpha)$.
**The inner layer: each $F$-coefficient acts on $F$ over $K$.** Each entry $c_{kj}$ of the matrix $C$ is itself an element of $F$, and it acts on $F$ by multiplication. The matrix of this action in the $K$-basis $\{e_1, \ldots, e_m\}$ is $A_{kj} \in K^{m \times m}$.
**Combining the layers.** To compute $\alpha \cdot (e_i f_j)$ in terms of the composite basis $\mathcal{B}$, we first apply the outer layer ($\alpha \cdot f_j = \sum_k c_{kj} f_k$), then the inner layer ($c_{kj} \cdot e_i = \sum_\ell (A_{kj})_{\ell i} \, e_\ell$). The combined result is
\begin{align*}
\alpha \cdot (e_i f_j) = \sum_{k=1}^{r} \sum_{\ell=1}^{m} (A_{kj})_{\ell i} \, (e_\ell f_k).
\end{align*}
This means: the column of $M$ corresponding to basis element $e_i f_j$ has its entries partitioned into $r$ blocks of size $m$. The $k$-th block (corresponding to the $f_k$-component) contains the $i$-th column of the matrix $A_{kj}$. Running over all $i$ in a fixed column-block $j$, the $k$-th row-block is exactly the full matrix $A_{kj}$.
**Why is commutativity used?** The equation $\alpha \cdot (e_i f_j) = e_i \cdot (\alpha f_j)$ uses the commutativity of multiplication in $L$. Without commutativity (e.g., for noncommutative algebras), the left-multiplication and right-multiplication maps would differ, and the block structure would not decompose this way.
In summary, the $mr \times mr$ matrix $M$ of $m_\alpha$ with respect to $\mathcal{B}$ is the $r \times r$ block matrix whose $(k, j)$-block is the $m \times m$ matrix $A_{kj}$ representing multiplication by $c_{kj}$ on $F$. Note that $A_{kj}$ is the matrix of $m_{c_{kj}}$ in the fixed $K$-basis $\{e_1, \ldots, e_m\}$, where $c_{kj}$ is the $(k,j)$-entry of the $F$-matrix $C$ of $m_\alpha$ on $L$.
[/guided]
[/step]
[step:Compute $\operatorname{Tr}_{L/K}(\alpha) = \operatorname{Tr}_{F/K}(\operatorname{Tr}_{L/F}(\alpha))$ from the diagonal blocks]
The trace of $M$ is the sum of its diagonal entries. Since $M$ is an $r \times r$ block matrix with $m \times m$ blocks, the diagonal entries of $M$ are precisely the diagonal entries of the diagonal blocks $A_{11}, A_{22}, \ldots, A_{rr}$. Therefore
\begin{align*}
\operatorname{Tr}_{L/K}(\alpha) = \operatorname{tr}(M) = \sum_{k=1}^{r} \operatorname{tr}(A_{kk}).
\end{align*}
Each diagonal block $A_{kk}$ is the matrix of the multiplication-by-$c_{kk}$ map on $F$ with respect to $\{e_1, \ldots, e_m\}$, so $\operatorname{tr}(A_{kk}) = \operatorname{Tr}_{F/K}(c_{kk})$. Substituting:
\begin{align*}
\operatorname{Tr}_{L/K}(\alpha) = \sum_{k=1}^{r} \operatorname{Tr}_{F/K}(c_{kk}).
\end{align*}
The field trace $\operatorname{Tr}_{F/K} \colon F \to K$ is $K$-linear, because $c \mapsto m_c$ is $K$-linear and the matrix trace is linear. In particular it is additive, so we may move the sum inside:
\begin{align*}
\operatorname{Tr}_{L/K}(\alpha) = \operatorname{Tr}_{F/K}\!\left(\sum_{k=1}^{r} c_{kk}\right) = \operatorname{Tr}_{F/K}\!\bigl(\operatorname{Tr}_{L/F}(\alpha)\bigr).
\end{align*}
The last equality uses $\sum_{k=1}^{r} c_{kk} = \operatorname{tr}(C) = \operatorname{Tr}_{L/F}(\alpha)$.
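To see the formula in a small case, take the illustrative tower $K = \mathbb{Q}$, $F = \mathbb{Q}(\sqrt{2})$, $L = \mathbb{Q}(\theta)$ with $\theta = \sqrt[4]{2}$ and $\alpha = 1 + \theta$ (an example chosen here, with $F$-basis $\{1, \theta\}$ of $L$ and $K$-basis $\{1, \sqrt{2}\}$ of $F$). From $\alpha \cdot 1 = 1 + \theta$ and $\alpha \cdot \theta = \sqrt{2} + \theta$, the diagonal entries of $C$ are $c_{11} = c_{22} = 1$, so
\begin{align*}
\operatorname{Tr}_{L/F}(\alpha) = 2, \qquad \operatorname{Tr}_{F/K}\bigl(\operatorname{Tr}_{L/F}(\alpha)\bigr) = \operatorname{Tr}_{F/K}(2) = 2 \cdot [F : K] = 4.
\end{align*}
This agrees with the direct computation: $\operatorname{Tr}_{L/K}(1) = [L : K] = 4$, and $\operatorname{Tr}_{L/K}(\theta) = 0$ because the minimal polynomial $x^4 - 2$ of $\theta$ over $\mathbb{Q}$ has no $x^3$ term, so $\operatorname{Tr}_{L/K}(1 + \theta) = 4$.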
[guided]
The trace of a block matrix is a standard fact from linear algebra: for any block matrix $M$ with square diagonal blocks $A_{11}, \ldots, A_{rr}$, the trace of the full matrix equals $\sum_{k=1}^{r} \operatorname{tr}(A_{kk})$. This holds regardless of the off-diagonal blocks, because the trace depends only on the diagonal entries, and the diagonal entries of the full matrix are exactly the union of the diagonal entries of the diagonal blocks.
More explicitly, the diagonal entries of $M$ are $M_{(\ell, k), (\ell, k)}$ for $1 \le \ell \le m$ and $1 \le k \le r$. From the previous step, $M_{(\ell, k), (i, j)} = (A_{kj})_{\ell i}$. Setting $i = \ell$ and $j = k$: the diagonal entry at position $(\ell, k)$ is $(A_{kk})_{\ell \ell}$. Summing over all $\ell$ and $k$:
\begin{align*}
\operatorname{tr}(M) = \sum_{k=1}^{r} \sum_{\ell=1}^{m} (A_{kk})_{\ell \ell} = \sum_{k=1}^{r} \operatorname{tr}(A_{kk}).
\end{align*}
Now, $A_{kk}$ is the matrix of $m_{c_{kk}} \colon F \to F$ in the $K$-basis $\{e_1, \ldots, e_m\}$, so $\operatorname{tr}(A_{kk}) = \operatorname{Tr}_{F/K}(c_{kk})$ by the definition of the field trace. Therefore
\begin{align*}
\operatorname{Tr}_{L/K}(\alpha) = \sum_{k=1}^{r} \operatorname{Tr}_{F/K}(c_{kk}).
\end{align*}
The $K$-linearity of $\operatorname{Tr}_{F/K}$ allows us to factor:
\begin{align*}
\sum_{k=1}^{r} \operatorname{Tr}_{F/K}(c_{kk}) = \operatorname{Tr}_{F/K}\!\left(\sum_{k=1}^{r} c_{kk}\right).
\end{align*}
Finally, $\sum_{k=1}^{r} c_{kk}$ is the trace of the matrix $C \in F^{r \times r}$, which represents $m_\alpha$ on $L$ over $F$. By the definition of the field trace, $\operatorname{tr}(C) = \operatorname{Tr}_{L/F}(\alpha)$. Combining:
\begin{align*}
\operatorname{Tr}_{L/K}(\alpha) = \operatorname{Tr}_{F/K}\!\bigl(\operatorname{Tr}_{L/F}(\alpha)\bigr).
\end{align*}
The essential mechanism is that the block structure converts the $K$-linear trace on $L$ into a two-stage computation: first take the $F$-linear trace on $L$ (reading off the diagonal entries of $C$), then take the $K$-linear trace on $F$ (reading off the diagonal entries of each $A_{kk}$). The $K$-linearity of $\operatorname{Tr}_{F/K}$ is what allows the sum over blocks to be absorbed into a single application of $\operatorname{Tr}_{F/K}$.
[/guided]
[/step]
[step:Compute $\operatorname{N}_{L/K}(\alpha) = \operatorname{N}_{F/K}(\operatorname{N}_{L/F}(\alpha))$ using the block determinant]
For the norm, we need to show that $\det(M) = \operatorname{N}_{F/K}(\det(C))$. Define the map
\begin{align*}
\Phi \colon F^{r \times r} &\to K^{mr \times mr} \\
(c_{kj})_{1 \le k, j \le r} &\mapsto (A_{kj})_{1 \le k, j \le r},
\end{align*}
where each entry $c_{kj} \in F$ is replaced by the $m \times m$ matrix $A_{kj}$ of the multiplication-by-$c_{kj}$ map on $F$. In particular, $M = \Phi(C)$. We first verify that $\Phi$ is a ring homomorphism.
[claim:The map $\Phi$ is a $K$-algebra homomorphism from $F^{r \times r}$ to $K^{mr \times mr}$]
Let $\mu \colon F \to K^{m \times m}$ be the map sending $c \in F$ to the matrix of $m_c$ with respect to $\{e_1, \ldots, e_m\}$, so that $A_{kj} = \mu(c_{kj})$. Then $\mu$ is an injective $K$-algebra homomorphism, and $\Phi$, which applies $\mu$ entry-wise, is a $K$-algebra homomorphism from $F^{r \times r}$ to $K^{mr \times mr}$.
[/claim]
[proof]
We first check the properties of $\mu$.
**Additivity:** For $c, d \in F$, the map $m_{c+d}$ equals $m_c + m_d$ (since $(c+d)y = cy + dy$ for all $y \in F$), so $\mu(c + d) = \mu(c) + \mu(d)$.
**Multiplicativity:** For $c, d \in F$, the map $m_{cd}$ equals $m_c \circ m_d$ (since $(cd)y = c(dy)$ for all $y \in F$), so $\mu(cd) = \mu(c)\mu(d)$, because matrix multiplication corresponds to composition.
**Scalar action:** For $a \in K$ and $c \in F$, $\mu(ac) = a \cdot \mu(c)$, since $m_{ac} = a \cdot m_c$ as $K$-linear maps.
**Identity:** $\mu(1_F) = I_m$, since $m_{1_F}$ is the identity map on $F$.
**Injectivity:** If $\mu(c) = 0$, then $m_c$ is the zero map, so $c = m_c(1_F) = 0$.
We now turn to $\Phi$, which applies $\mu$ to each entry of a matrix in $F^{r \times r}$. It preserves addition and multiplication by scalars in $K$ block-wise, because $\mu$ does. It preserves multiplication: for $C_1 = (c^{(1)}_{kj})$ and $C_2 = (c^{(2)}_{kj})$ with $A^{(1)}_{kj} = \mu(c^{(1)}_{kj})$ and $A^{(2)}_{kj} = \mu(c^{(2)}_{kj})$, the $(k, j)$-block of the product $\Phi(C_1)\Phi(C_2)$ is
\begin{align*}
\sum_{s=1}^{r} A^{(1)}_{ks} A^{(2)}_{sj} = \sum_{s=1}^{r} \mu(c^{(1)}_{ks})\mu(c^{(2)}_{sj}) = \mu\!\left(\sum_{s=1}^{r} c^{(1)}_{ks} c^{(2)}_{sj}\right),
\end{align*}
which is the $(k,j)$-block of $\Phi(C_1 C_2)$. Finally, $\Phi$ sends the identity matrix $I_r \in F^{r \times r}$ to $I_{mr} \in K^{mr \times mr}$, because $\mu(1_F) = I_m$ and $\mu(0) = 0$.
[/proof]
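For a concrete picture of $\mu$, take the illustrative case $K = \mathbb{Q}$, $F = \mathbb{Q}(\sqrt{2})$ with $K$-basis $e_1 = 1$, $e_2 = \sqrt{2}$ (an example chosen here for concreteness). For $c = a + b\sqrt{2}$ with $a, b \in \mathbb{Q}$ we have $c \cdot e_1 = a e_1 + b e_2$ and $c \cdot e_2 = 2b\, e_1 + a e_2$, so
\begin{align*}
\mu(a + b\sqrt{2}) = \begin{pmatrix} a & 2b \\ b & a \end{pmatrix}, \qquad \mu(\sqrt{2})^2 = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}^{\!2} = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = \mu(2).
\end{align*}
The determinant of $\mu(a + b\sqrt{2})$ is $a^2 - 2b^2$, which is the usual norm $\operatorname{N}_{F/K}(a + b\sqrt{2})$.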
Since $M = \Phi(C)$ by the block form obtained in the previous step, we have
\begin{align*}
\operatorname{N}_{L/K}(\alpha) = \det(M) = \det(\Phi(C)).
\end{align*}
On the other side, $\operatorname{N}_{L/F}(\alpha) = \det(C) \in F$, and by the definition of the field norm, $\operatorname{N}_{F/K}(c) = \det(\mu(c))$ for every $c \in F$. Writing $\det_F$ for the determinant on $F^{r \times r}$ and $\det_K$ for the determinant of a square matrix over $K$, the goal becomes
\begin{align*}
\det_K(\Phi(C)) = \det_K\bigl(\mu(\det_F(C))\bigr) = \operatorname{N}_{F/K}(\det_F(C)).
\end{align*}
The middle term can be read as a two-stage determinant. Expand $\det_F(C)$ by the Leibniz formula,
\begin{align*}
\det_F(C) = \sum_{\sigma \in S_r} \operatorname{sgn}(\sigma) \prod_{k=1}^{r} c_{k, \sigma(k)} \in F,
\end{align*}
where $S_r$ is the symmetric group on $\{1, \ldots, r\}$. Since $\mu$ is a $K$-algebra homomorphism,
\begin{align*}
\mu(\det_F(C)) = \sum_{\sigma \in S_r} \operatorname{sgn}(\sigma) \prod_{k=1}^{r} \mu(c_{k, \sigma(k)}) = \sum_{\sigma \in S_r} \operatorname{sgn}(\sigma) \prod_{k=1}^{r} A_{k, \sigma(k)}.
\end{align*}
The blocks $A_{kj}$ lie in $\mu(F)$, which is a commutative subalgebra of $K^{m \times m}$, so the order of the factors in each product is immaterial and the right-hand side is the Leibniz determinant of $\Phi(C)$ regarded as an $r \times r$ matrix over the commutative ring $\mu(F)$. Denoting this block-level determinant by $\det_{\mu(F)}$, we have
\begin{align*}
\det_{\mu(F)}(\Phi(C)) = \mu(\det_F(C)).
\end{align*}
This is an $m \times m$ matrix over $K$, not yet a scalar. It remains to show that the full $mr \times mr$ determinant of $\Phi(C)$ equals the $K$-determinant of this block-level determinant. The usual characterization of the determinant as the unique normalized alternating multilinear function does not apply directly here, because $\operatorname{N}_{F/K}$ is multiplicative but not additive. Instead, the identity is established in the following claim by exploiting multiplicativity.
[claim:Determinant over a commutative subring equals the full determinant]
Let $R := \mu(F)$, a commutative subring of $K^{m \times m}$ containing $I_m$, and let $B = \Phi(X) \in R^{r \times r}$ for some $X \in F^{r \times r}$. Then
\begin{align*}
\det_{K}(B) = \det_{K}(\det_{R}(B)),
\end{align*}
where $\det_K(B)$ is the determinant of $B$ viewed as an element of $K^{mr \times mr}$, and $\det_R(B) = \sum_{\sigma \in S_r} \operatorname{sgn}(\sigma) \prod_{k=1}^r B_{k,\sigma(k)} \in R$ is the Leibniz determinant computed in $R$. Since $\mu$ is a ring homomorphism, $\det_R(\Phi(X)) = \mu(\det_F(X))$, and $\det_K(\mu(c)) = \operatorname{N}_{F/K}(c)$ for $c \in F$, so the statement is equivalent to
\begin{align*}
\det_K(\Phi(X)) = \operatorname{N}_{F/K}(\det_F(X)) \quad \text{for all } X \in F^{r \times r}.
\end{align*}
[/claim]
[proof]
Both sides of the identity are multiplicative in $X$: for $X, Y \in F^{r \times r}$,
\begin{align*}
\det_K(\Phi(XY)) &= \det_K(\Phi(X)\Phi(Y)) = \det_K(\Phi(X)) \cdot \det_K(\Phi(Y)), \\
\operatorname{N}_{F/K}(\det_F(XY)) &= \operatorname{N}_{F/K}(\det_F(X) \cdot \det_F(Y)) = \operatorname{N}_{F/K}(\det_F(X)) \cdot \operatorname{N}_{F/K}(\det_F(Y)),
\end{align*}
where the first line uses that $\Phi$ is a ring homomorphism and $\det_K$ is multiplicative, and the second uses that $\det_F$ is multiplicative on $F^{r \times r}$ and $\operatorname{N}_{F/K}$ is multiplicative on $F$. Both sides send $I_r$ to $1$, so both maps restrict to group homomorphisms $\operatorname{GL}_r(F) \to K^*$. By Gaussian elimination, $\operatorname{GL}_r(F)$ is generated by the three types of elementary matrices below, so it suffices to check that the two homomorphisms agree on each type.
**Type 1: Permutation matrices.** For $\sigma \in S_r$, the matrix $\Phi(P_\sigma)$ is the $mr \times mr$ permutation matrix that permutes the $r$ block-rows according to $\sigma$. Writing $\sigma$ as a product of transpositions, each block transposition is a product of $m$ row transpositions, so $\det_K(\Phi(P_\sigma)) = \operatorname{sgn}(\sigma)^m$. Meanwhile $\det_F(P_\sigma) = \operatorname{sgn}(\sigma)$, and $\operatorname{N}_{F/K}(\operatorname{sgn}(\sigma)) = \operatorname{sgn}(\sigma)^m$, since $\operatorname{sgn}(\sigma) \in \{1, -1\} \subset K$ and $\operatorname{N}_{F/K}(a) = a^m$ for $a \in K$.
**Type 2: Diagonal matrices.** Let $D = \operatorname{diag}(c_1, \ldots, c_r)$ with $c_k \in F^*$. Then $\Phi(D)$ is block-diagonal with blocks $\mu(c_1), \ldots, \mu(c_r)$, so $\det_K(\Phi(D)) = \prod_{k=1}^{r} \det_K(\mu(c_k)) = \prod_{k=1}^{r} \operatorname{N}_{F/K}(c_k)$. Meanwhile, $\det_F(D) = \prod_{k=1}^{r} c_k$, and $\operatorname{N}_{F/K}(\prod_{k=1}^r c_k) = \prod_{k=1}^r \operatorname{N}_{F/K}(c_k)$ by the multiplicativity of $\operatorname{N}_{F/K}$.
**Type 3: Transvections.** For $k \neq j$ and $c \in F$, let $E_{kj}(c)$ be the matrix with ones on the diagonal, the entry $c$ in position $(k, j)$, and zeros elsewhere. Then $\det_F(E_{kj}(c)) = 1$ and $\operatorname{N}_{F/K}(1) = 1$. The block matrix $\Phi(E_{kj}(c))$ has $I_m$ in every diagonal block, $\mu(c)$ in the $(k,j)$-block, and zero blocks elsewhere. If $k < j$, the block $\mu(c)$ lies entirely above the main diagonal, so $\Phi(E_{kj}(c))$ is upper triangular with ones on the diagonal; if $k > j$, it is lower triangular with ones on the diagonal. In both cases $\det_K(\Phi(E_{kj}(c))) = 1$.
Since the two group homomorphisms agree on generators, they agree on all of $\operatorname{GL}_r(F)$:
\begin{align*}
\det_K(\Phi(X)) = \operatorname{N}_{F/K}(\det_F(X)) \quad \text{for all } X \in \operatorname{GL}_r(F).
\end{align*}
If $X \notin \operatorname{GL}_r(F)$, that is, $\det_F(X) = 0$, then there exists $v \in F^r \setminus \{0\}$ with $Xv = 0$. Let $V \in F^{r \times r}$ be the matrix whose first column is $v$ and whose other columns are zero. Then $XV = 0$ and $V \neq 0$, so $\Phi(X)\Phi(V) = \Phi(XV) = 0$, while $\Phi(V) \neq 0$ because $\mu$ is injective. Hence $\Phi(X)$ is not invertible, and $\det_K(\Phi(X)) = 0 = \operatorname{N}_{F/K}(0) = \operatorname{N}_{F/K}(\det_F(X))$. This proves the identity for all $X \in F^{r \times r}$.
[/proof]
Applying the claim with $X = C$, and using $M = \Phi(C)$ and $\det_F(C) = \operatorname{N}_{L/F}(\alpha)$, we obtain
\begin{align*}
\det(M) = \det_K(\Phi(C)) = \operatorname{N}_{F/K}(\det_F(C)) = \operatorname{N}_{F/K}(\operatorname{N}_{L/F}(\alpha)).
\end{align*}
Since $\det(M) = \operatorname{N}_{L/K}(\alpha)$, this gives $\operatorname{N}_{L/K}(\alpha) = \operatorname{N}_{F/K}(\operatorname{N}_{L/F}(\alpha))$.
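The norm identity can be checked by hand in a small case. Take the illustrative tower $K = \mathbb{Q}$, $F = \mathbb{Q}(\sqrt{2})$, $L = \mathbb{Q}(\theta)$ with $\theta = \sqrt[4]{2}$ and $\alpha = 1 + \theta$ (an example chosen here, with $F$-basis $\{1, \theta\}$ of $L$ and $K$-basis $\{1, \sqrt{2}\}$ of $F$). From $\alpha \cdot 1 = 1 + \theta$ and $\alpha \cdot \theta = \sqrt{2} + \theta$,
\begin{align*}
C = \begin{pmatrix} 1 & \sqrt{2} \\ 1 & 1 \end{pmatrix}, \qquad \operatorname{N}_{L/F}(\alpha) = \det_F(C) = 1 - \sqrt{2}, \qquad \operatorname{N}_{F/K}(1 - \sqrt{2}) = 1^2 - 2 \cdot (-1)^2 = -1.
\end{align*}
On the other side, the matrix of $m_\alpha$ in the composite basis $1, \sqrt{2}, \theta, \sqrt{2}\,\theta$ can be reduced by subtracting row 1 from row 3 and row 2 from row 4:
\begin{align*}
\det \begin{pmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix} = \det \begin{pmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & -2 \\ 0 & 0 & -1 & 1 \end{pmatrix} = \det \begin{pmatrix} 1 & -2 \\ -1 & 1 \end{pmatrix} = -1.
\end{align*}
Both computations give $\operatorname{N}_{L/K}(1 + \theta) = -1$. As an independent cross-check, the conjugates of $\theta$ are the roots of $p(x) = x^4 - 2$, and the product of $1 + \theta'$ over these roots equals $p(-1) = -1$.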
[guided]
The norm formula is deeper than the trace formula because the determinant of a block matrix does not decompose as simply as the trace. For the trace, only the diagonal blocks matter, and the linearity of trace handles the rest. For the determinant, the off-diagonal blocks contribute, and the Leibniz expansion of the determinant of an $mr \times mr$ matrix does not factor cleanly into an $r \times r$ determinant over $F$ followed by an $m \times m$ determinant over $K$ — at least, not without using the commutativity of the blocks.
**The key insight** is that the block entries $A_{kj} = \mu(c_{kj})$ lie in the commutative subring $\mu(F) \subset K^{m \times m}$. Because $\mu(F)$ is commutative, the Leibniz formula makes sense for $\Phi(C)$ regarded as an $r \times r$ matrix with entries in $\mu(F)$, and this block-level determinant is the $m \times m$ matrix $\mu(\det_F(C))$, whose $K$-determinant is $\operatorname{N}_{F/K}(\det_F(C))$.
However, this "determinant over a commutative subring" identity requires justification: why does the $mr \times mr$ determinant equal the $m \times m$ determinant of the $r \times r$ block-determinant? The proof verifies this by showing that both sides define the same multiplicative function on $F^{r \times r}$, and then checking agreement on elementary matrices (permutation matrices, diagonal matrices, and transvections), which generate $\operatorname{GL}_r(F)$.
**Checking each type of elementary matrix:**
*Permutation matrices* $P_\sigma$: the block matrix $\Phi(P_\sigma)$ permutes $m$-dimensional blocks, giving $\det_K(\Phi(P_\sigma)) = \operatorname{sgn}(\sigma)^m$ (each block transposition flips $m$ rows). The field norm gives $\operatorname{N}_{F/K}(\operatorname{sgn}(\sigma)) = \operatorname{sgn}(\sigma)^m$ because $\operatorname{sgn}(\sigma) \in K$ and $\operatorname{N}_{F/K}(a) = a^{[F:K]} = a^m$ for $a \in K$. These agree.
*Diagonal matrices* $\operatorname{diag}(c_1, \ldots, c_r)$: the block matrix is block-diagonal with blocks $\mu(c_k)$, so $\det_K = \prod_k \operatorname{N}_{F/K}(c_k) = \operatorname{N}_{F/K}(\prod_k c_k)$ by multiplicativity of $\operatorname{N}_{F/K}$. The field side gives $\det_F = \prod_k c_k$, so both sides agree.
*Transvections* $I_r + c \cdot e_k e_j^\top$: the block matrix is block-unitriangular with diagonal blocks $I_m$, so $\det_K = 1$. The field determinant is $\det_F = 1$, and $\operatorname{N}_{F/K}(1) = 1$.
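**The singular case** ($\det_F(C) = 0$) is handled separately: a nonzero vector $v \in F^r$ with $Cv = 0$ gives a nonzero matrix $V \in F^{r \times r}$ (with $v$ as its first column and zeros elsewhere) satisfying $CV = 0$. Then $\Phi(C)\Phi(V) = 0$ with $\Phi(V) \neq 0$, so $\Phi(C)$ is singular over $K$ and both sides are zero.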
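The three checks can be made concrete for $r = m = 2$ in the illustrative case $K = \mathbb{Q}$, $F = \mathbb{Q}(\sqrt{2})$ with $K$-basis $\{1, \sqrt{2}\}$ (an example chosen here), where $\mu(\sqrt{2}) = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}$:
\begin{align*}
\Phi\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} &= \begin{pmatrix} 0 & I_2 \\ I_2 & 0 \end{pmatrix}, & \det_K &= (-1)^2 = 1 = \operatorname{N}_{F/K}(-1), \\
\Phi\begin{pmatrix} \sqrt{2} & 0 \\ 0 & 1 \end{pmatrix} &= \begin{pmatrix} \mu(\sqrt{2}) & 0 \\ 0 & I_2 \end{pmatrix}, & \det_K &= \det \mu(\sqrt{2}) = -2 = \operatorname{N}_{F/K}(\sqrt{2}), \\
\Phi\begin{pmatrix} 1 & \sqrt{2} \\ 0 & 1 \end{pmatrix} &= \begin{pmatrix} I_2 & \mu(\sqrt{2}) \\ 0 & I_2 \end{pmatrix}, & \det_K &= 1 = \operatorname{N}_{F/K}(1).
\end{align*}
In the first line the block swap exchanges rows $1 \leftrightarrow 3$ and $2 \leftrightarrow 4$, which is two row transpositions; in the third line the $4 \times 4$ matrix is upper triangular with ones on the diagonal.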
[/guided]
[/step]
[step:Combine the trace and norm identities to conclude]
The preceding steps establish both formulas. For the trace:
\begin{align*}
\operatorname{Tr}_{L/K}(\alpha) = \operatorname{tr}(M) = \sum_{k=1}^{r} \operatorname{Tr}_{F/K}(c_{kk}) = \operatorname{Tr}_{F/K}\!\left(\sum_{k=1}^{r} c_{kk}\right) = \operatorname{Tr}_{F/K}(\operatorname{Tr}_{L/F}(\alpha)).
\end{align*}
For the norm:
\begin{align*}
\operatorname{N}_{L/K}(\alpha) = \det(M) = \operatorname{N}_{F/K}(\det(C)) = \operatorname{N}_{F/K}(\operatorname{N}_{L/F}(\alpha)).
\end{align*}
Since $\alpha \in L$ was arbitrary, the transitivity formulas hold for all elements of $L$.
[/step]