Number fields are the natural generalization of the rational numbers within algebraic number theory: they are the finite extensions of $\mathbb{Q}$, and each contains a ring of integers generalizing $\mathbb{Z}$. This course develops the fundamental theory of algebraic integers and their arithmetic properties, revealing how classical results about the ordinary integers, such as unique factorization, can be understood and refined through the lens of algebraic structures. The course begins by establishing what number fields are and introducing the ring of integers within them, then systematically develops the tools needed to understand their multiplicative structure and the arithmetic of their ideals.
The central theme is the interplay between algebraic and arithmetic properties: we study how ideals factor into primes, how norms and traces relate elements to rational integers, and how the fundamental group-theoretic quantities — the class number and unit group — constrain the overall arithmetic of the field. The chapters build progressively from concrete objects (number fields and their basic invariants) to structural results, culminating in finiteness theorems that assert the class group is finite and the unit group is finitely generated. Along the way, the Minkowski bound provides a geometric bridge between ideals and lattice points, offering a tool to compute class groups explicitly for specific fields.
The final chapter on $L$-functions and Dirichlet series connects algebraic number theory to analytic methods, showing how the analytic properties of certain generating functions encode deep arithmetic information about the field. Throughout, the course emphasizes both the theory — understanding why these structures exist and what they measure — and explicit computation, developing techniques to work with concrete examples and compute invariants that govern the arithmetic of a given number field.
# 1. Number fields
This chapter introduces the central objects of the course: number fields and their rings of integers. Starting from concrete arithmetic questions about which integers can be expressed as sums of two squares, we arrive at the abstract framework of algebraic integers and integral ring extensions. The key algebraic engine is a Cayley–Hamilton argument that drives the proof that $\mathcal{O}_L$ is a ring — a fact that would be trivial if the extension were a vector space over a field, but requires genuine work for module extensions over $\mathbb{Z}$.
## From Arithmetic Questions to Number Fields
The motivating question is deceptively simple: which integers can be written as $x^2 + y^2$ for $x, y \in \mathbb{Z}$? And if $a$ and $b$ are both of this form, must $ab$ be as well?
The elegant answer comes from working inside $\mathbb{Z}[i]$. The integers expressible as sums of two squares are precisely the norms of elements of $\mathbb{Z}[i]$: for $\alpha = x + iy$,
\begin{align*}
N(x + iy) &= |x + iy|^2 = x^2 + y^2.
\end{align*}
The multiplicativity of the norm then gives the result immediately: if $a = N(z)$ and $b = N(w)$, then $ab = N(zw)$.
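The multiplicativity can be checked on a small case. The following Python sketch is illustrative only: the helper names `gaussian_mul` and `norm` are assumptions introduced here, and a Gaussian integer $x + iy$ is stored as the pair `(x, y)`.

```python
def gaussian_mul(z, w):
    """Multiply x + iy and u + iv, each stored as a pair of integers."""
    (x, y), (u, v) = z, w
    return (x * u - y * v, x * v + y * u)

def norm(z):
    """N(x + iy) = x^2 + y^2."""
    x, y = z
    return x * x + y * y

z, w = (1, 2), (2, 3)      # norms 5 = 1 + 4 and 13 = 4 + 9
zw = gaussian_mul(z, w)    # (1 + 2i)(2 + 3i) = -4 + 7i
print(zw, norm(zw))        # (-4, 7) 65
```

Here $65 = 5 \cdot 13$ is exhibited as $(-4)^2 + 7^2$, a sum of two squares produced directly from the representations of $5$ and $13$.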
A second example: the solutions to the Diophantine equation $x^2 + 2 = y^3$ are found by working in $\mathbb{Z}[\sqrt{-2}]$. Both examples illustrate the same technique — embed an integer arithmetic problem into a larger ring $\mathcal{O}_L$ sitting inside a field extension $L$ of $\mathbb{Q}$, leverage the ring structure there, then descend back to conclusions about $\mathbb{Z}$.
[definition: Field Extension]
A **field extension** is an inclusion of fields $K \subseteq L$. We write this as $L/K$.
[/definition]
[definition: Degree of a Field Extension]
Let $K \subseteq L$ be a field extension. Then $L$ is a vector space over $K$, and the **degree** of the extension is
\begin{align*}
[L : K] &= \dim_K(L).
\end{align*}
If $[L : K]$ is finite, the extension is called **finite**.
[/definition]
The degree records the size of the extension; it is the most fundamental numerical invariant we attach to $L/K$. We now single out the class of extensions the course studies.
[definition: Number Field]
A **number field** is a finite field extension of $\mathbb{Q}$.
[/definition]
The examples $\mathbb{Q}(i)$ and $\mathbb{Q}(\sqrt{-2})$ are both number fields of degree 2. Degree alone does not determine the field: $\mathbb{Q}(\sqrt{2})$ and $\mathbb{Q}(\sqrt{-1})$ are both quadratic extensions but are not isomorphic, since $\mathbb{Q}(\sqrt{2})$ embeds in $\mathbb{R}$ while $\mathbb{Q}(\sqrt{-1})$ contains a square root of $-1$.
## Algebraic Integers
A field is the simplest kind of ring: its only ideals are $\{0\}$ and the field itself. If we want interesting algebraic structure inside a number field, we must look for a distinguished subring. For $\mathbb{Q}$, that subring is $\mathbb{Z}$. What is the correct analogue in a general number field $L$?
One naive attempt: take the $\mathbb{Z}$-span of a $\mathbb{Q}$-basis of $L$. This gives a $\mathbb{Z}$-module inside $L$, but it depends on the choice of basis and is generally not a ring. Another attempt: call $\alpha \in L$ an "integer" if $\alpha$ satisfies some polynomial with integer coefficients. But non-monic polynomials allow fractions: $\frac{1}{2}$ satisfies $2x - 1 = 0$, and we certainly don't want $\frac{1}{2} \in \mathcal{O}_\mathbb{Q}$. This suggests the right criterion: restrict to elements satisfying a monic polynomial with integer coefficients. The monic requirement forces the leading term to be $1$, cutting off the denominators that admitted $\frac{1}{2}$. Equivalently, monic polynomials in $\mathbb{Z}[x]$ are precisely those for which the leading coefficient does not introduce denominators when one uses the relation to express high powers of $\alpha$ in terms of lower powers.
[definition: Algebraic Integer]
Let $L$ be a number field. An **algebraic integer** in $L$ is an element $\alpha \in L$ for which there exists a monic polynomial $f \in \mathbb{Z}[x]$ with $f(\alpha) = 0$. The set of all algebraic integers in $L$ is denoted $\mathcal{O}_L$.
[/definition]
The most accessible illustration is the ring $\mathbb{Z}[i]$ of Gaussian integers. Here the definition is transparent: every element $x + iy$ with $x, y \in \mathbb{Z}$ is a root of a monic quadratic with integer coefficients, so we can verify integrality by direct computation.
[example: Gaussian Integers]
For $L = \mathbb{Q}(i)$, one has $\mathcal{O}_L = \mathbb{Z}[i]$. Every element $a + bi$ with $a, b \in \mathbb{Z}$ satisfies the monic polynomial $(x - a - bi)(x - a + bi) = x^2 - 2ax + (a^2 + b^2) \in \mathbb{Z}[x]$. The converse — that no other elements of $\mathbb{Q}(i)$ are algebraic integers — requires more work and will be proved in the next chapter.
[/example]
We first verify that $\mathcal{O}_L$ genuinely generalises $\mathbb{Z} \subseteq \mathbb{Q}$.
[quotetheorem:1564]
[citeproof:1564]
This check is worth pausing on. The proof uses the coprimality hypothesis $\gcd(r,s) = 1$ in an essential way: it is this assumption that forces $s = 1$ from $s \mid r^n$. If we dropped coprimality and allowed $s$ to share factors with $r$, the divisibility $s \mid r^n$ would no longer constrain $s$ and the argument would break down. The theorem tells us that algebraic integrality inside $\mathbb{Q}$ introduces nothing beyond ordinary integers: $\mathcal{O}_\mathbb{Q} = \mathbb{Z}$ exactly. This justifies calling $\mathcal{O}_L$ a generalisation of $\mathbb{Z}$: when the number field is $\mathbb{Q}$ itself, we recover the object we started with.
## Integrality and the Ring of Integers
The next question is whether $\mathcal{O}_L$ is a ring — that is, whether sums, differences, and products of algebraic integers are again algebraic integers. This is not immediate from the definition, because combining roots of two different monic polynomials need not produce a root of a third by any simple procedure.
The analogous statement for algebraic numbers over a field is proved in Galois theory: if $\alpha, \beta$ are algebraic over $K$, then $K[\alpha, \beta]$ is a finite extension of $K$, and every element of a finite extension is algebraic. The proof for algebraic integers over $\mathbb{Z}$ follows the same shape, but since $\mathbb{Z}$ is not a field, we must work with finitely-generated modules rather than finite-dimensional vector spaces.
[definition: Integrality Over a Ring]
Let $R \subseteq S$ be rings. An element $\alpha \in S$ is **integral over $R$** if there exists a monic polynomial $f \in R[x]$ with $f(\alpha) = 0$. If every element of $S$ is integral over $R$, we say $S$ is **integral over $R$**.
[/definition]
The passage from finite-dimensional vector spaces over a field to finitely-generated modules over a ring is forced on us precisely because $\mathbb{Z}$ is not a field. The notion of "finite-dimensional vector space over $\mathbb{Z}$" is meaningless — but "finitely generated $\mathbb{Z}$-module" is perfectly well-defined and plays the same structural role in the integer setting.
[definition: Finitely Generated Module]
We say $S$ is **finitely generated over $R$** (as an $R$-module) if there exist elements $\alpha_1, \ldots, \alpha_n \in S$ such that every element of $S$ can be written as an $R$-linear combination $\sum r_i \alpha_i$.
[/definition]
The notation $R[\alpha_1, \ldots, \alpha_r]$ denotes the subring of $S$ generated by $R$ together with $\alpha_1, \ldots, \alpha_r$; equivalently, it is the image of the evaluation homomorphism $R[x_1, \ldots, x_r] \to S$ sending $x_i \mapsto \alpha_i$.
The relationship between integrality and finite generation runs in both directions. The easier direction:
[quotetheorem:1565]
[citeproof:1565]
Monicity is essential here: it gives a bound on the powers of $s$ we need to generate the module. Concretely, for a monic polynomial of degree $n$, we get a generating set of size $n$, namely $\{1, s, \ldots, s^{n-1}\}$. This is the easy direction of the equivalence because the monic relation directly exhibits a finite generating set — no further argument is needed. The monicity hypothesis is what prevents $\frac{1}{2}$ (a root of $2x - 1$) from being an algebraic integer: without it the generating set would be unbounded, since the relation $2s = 1$ fails to express $s^2$ in terms of $\{1, s\}$ over $\mathbb{Z}$. It is also worth noting what this theorem does *not* say: the converse — that finite generation implies integrality — will require genuine work and is the content of the next theorem.
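To see how a monic relation keeps every power of $s$ inside the $\mathbb{Z}$-span of $\{1, s, \ldots, s^{n-1}\}$, here is a minimal Python sketch. The function names `times_s` and `power_in_basis` are hypothetical, and the monic polynomial $x^n + a_{n-1}x^{n-1} + \cdots + a_0$ is passed as the list `lower = [a_0, ..., a_{n-1}]`.

```python
def times_s(vec, lower):
    """Multiply c_0 + c_1 s + ... + c_{n-1} s^{n-1} by s, then rewrite s^n
    using the monic relation s^n = -(a_0 + a_1 s + ... + a_{n-1} s^{n-1})."""
    n = len(lower)
    top = vec[-1]                  # coefficient that lands on s^n after shifting
    shifted = [0] + vec[:-1]
    return [shifted[i] - top * lower[i] for i in range(n)]

def power_in_basis(lower, k):
    """Integer coordinates of s^k in the basis 1, s, ..., s^{n-1}."""
    vec = [1] + [0] * (len(lower) - 1)
    for _ in range(k):
        vec = times_s(vec, lower)
    return vec

# s a root of x^2 - x - 1, so s^2 = s + 1
print(power_in_basis([-1, -1], 5))   # [3, 5], i.e. s^5 = 3 + 5s
```

No division ever occurs, which is exactly where monicity is used: with leading coefficient $2$, the reduction step would need to divide by $2$ and the coordinates would leave $\mathbb{Z}$.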
The converse requires more work. The key idea is to mimic the Cayley–Hamilton theorem: if $s \in S$ acts on $S$ by multiplication, and $S$ is a finitely generated $R$-module, then $s$ satisfies its own "characteristic polynomial" over $R$.
[quotetheorem:1566]
[citeproof:1566]
[remark: Why Not Just Use Linear Dependence]
In a vector space over a field, one could instead argue that $1, s, s^2, \ldots$ are linearly dependent, giving a polynomial relation for $s$. But this polynomial might not be monic (the leading coefficient over a ring need not be a unit). The Cayley–Hamilton argument circumvents this by constructing the characteristic polynomial, which is automatically monic.
[/remark]
The hypothesis $\alpha_1 = 1$ in the proof is what allows the conclusion $\det(sI - B) = 0$ to follow from $\det(sI - B) \cdot \alpha_i = 0$: taking $i = 1$ gives $\det(sI - B) = \det(sI - B) \cdot 1 = 0$. The essential point is that $1$ lies in the module, so an element that annihilates every generator annihilates $1$ and is therefore zero. For a general finitely generated module not containing $1$, annihilating all generators would only show that $\det(sI - B)$ kills the module, not that it vanishes.
Combining the two theorems immediately gives the ring structure of $\mathcal{O}_L$:
[quotetheorem:1567]
[citeproof:1567]
This is the foundational result that makes the arithmetic of $L$ tractable. The ring $\mathcal{O}_L$ is our replacement for $\mathbb{Z}$ inside the number field $L$: closed under addition and multiplication, and containing $\mathbb{Z}$ itself. The finiteness argument closes because $\alpha$ and $\beta$ are integral, so $\mathbb{Z}[\alpha,\beta] \subseteq L$ is a finitely generated $\mathbb{Z}$-module, and every element of it, in particular $\alpha \pm \beta$ and $\alpha\beta$, is then integral. Finite-dimensionality of $L/\mathbb{Q}$ alone would not suffice: $\mathbb{Z}[\tfrac{1}{2}] \subseteq \mathbb{Q}$ is not finitely generated over $\mathbb{Z}$. The result sets up the entire rest of the course: we study $\mathcal{O}_L$ via its ideal theory (Chapter 3), unique factorisation in Dedekind domains (Chapter 3), ideal norms (Chapter 4), and class groups measuring how far $\mathcal{O}_L$ is from being a unique factorisation domain (Chapters 3 and 6).
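The Cayley–Hamilton mechanism can be made concrete for $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$. The module $\mathbb{Z}[\sqrt{2}, \sqrt{3}]$ is spanned by $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$, and multiplication by $\sqrt{2} + \sqrt{3}$ acts on this spanning set by an integer matrix $B$. The sketch below computes $\det(xI - B)$. It uses the Faddeev–LeVerrier recursion purely as a convenient way to get a characteristic polynomial in plain Python; that choice, and the names `mat_mul` and `char_poly`, are assumptions of this illustration rather than part of the proof.

```python
from fractions import Fraction

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Return [c_0, ..., c_n] with det(xI - A) = sum c_k x^k, so c_n = 1."""
    n = len(A)
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        M = [[AM[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AM = mat_mul(A, M)
        # c_{n-k} = -tr(A * M_k) / k
        coeffs[n - k] = -sum(AM[i][i] for i in range(n)) / k
    return coeffs

# Columns: images of 1, sqrt2, sqrt3, sqrt6 under multiplication by sqrt2 + sqrt3,
# written in the basis (1, sqrt2, sqrt3, sqrt6).
B = [[0, 2, 3, 0],
     [1, 0, 0, 3],
     [1, 0, 0, 2],
     [0, 1, 1, 0]]

print([int(c) for c in char_poly(B)])   # [1, 0, -10, 0, 1]
```

The output encodes $x^4 - 10x^2 + 1$, a monic integer polynomial vanishing at $\sqrt{2} + \sqrt{3}$, found without guessing any relation by hand.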
[remark: Integral Does Not Imply Finitely Generated]
The implication "finitely generated $\Rightarrow$ integral" does not reverse: the ring of all algebraic integers in $\mathbb{C}$ is integral over $\mathbb{Z}$ by definition, but is not finitely generated as a $\mathbb{Z}$-module.
[/remark]
## Transitivity of Integrality
Suppose we have a tower of rings $A \subseteq B \subseteq C$, with $B$ integral over $A$ and $C$ integral over $B$. Is $C$ necessarily integral over $A$? A priori there is no reason this should hold — integrality is defined in terms of a specific ring the element sits over, and stacking hypotheses could force $\alpha \in C$ to satisfy increasingly complex relations over $A$. The answer is yes, but it requires a proof.
[quotetheorem:1568]
[citeproof:1568]
Transitivity is less obvious than the corresponding fact for finite generation, because when we stack an integral extension on top of another, the coefficients of the polynomial for $c$ over $B$ may themselves require complex monic relations over $A$. The proof handles this by collecting finitely many such coefficients $b_0, \ldots, b_N$ into a finitely generated ring $B_0 = A[b_0, \ldots, b_N]$, reducing to the finitely-generated case. The hypotheses cannot be weakened: if $B$ is not integral over $A$, the argument fails at the step where we use that $B_0$ is finitely generated over $A$. In practice, transitivity is used whenever we build towers of rings in composite extensions — if $K \subseteq L \subseteq M$ are number fields, then $\mathcal{O}_M$ is integral over $\mathcal{O}_L$, which is integral over $\mathbb{Z}$, so transitivity gives $\mathcal{O}_M$ integral over $\mathbb{Z}$ as expected.
## Recognising Algebraic Integers via Minimal Polynomials
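Given an element $\alpha \in L$, how do we decide whether $\alpha \in \mathcal{O}_L$? Exhibiting one monic polynomial in $\mathbb{Z}[x]$ that vanishes at $\alpha$ proves membership, but showing $\alpha \notin \mathcal{O}_L$ in principle means showing that no monic polynomial in $\mathbb{Z}[x]$ vanishes at $\alpha$, and checking all polynomials is infeasible. The minimal polynomial reduces this to a single check.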
Recall that for a field extension $K \subseteq L$ and $\alpha \in L$, the **minimal polynomial** $p_\alpha \in K[x]$ is the monic polynomial of least degree over $K$ with $p_\alpha(\alpha) = 0$.
[quotetheorem:1569]
[citeproof:1569]
This divisibility property implies that $p_\alpha$ is the unique monic polynomial of its degree vanishing at $\alpha$: any two monic polynomials of minimal degree annihilating $\alpha$ must divide each other and hence be equal. The hypothesis that $K$ is a field is essential to the proof: polynomial division in $K[x]$ requires a field precisely so that remainders have strictly lower degree than the divisor — over a general ring, division with a controlled-degree remainder fails in general. The uniqueness and minimality we establish here are exactly the properties that make the next theorem's criterion work: they turn an infinite search over all monic $\mathbb{Z}$-polynomials into a single check on one specific polynomial. It also connects integrality to questions in Galois theory: whether $p_\alpha$ splits over $\mathbb{Q}$ or requires a larger field is the starting point for understanding separability and Galois groups, which appear in Chapter 5 when we study ramification.
[quotetheorem:1570]
[citeproof:1570]
This criterion is what makes integrality testable in practice: it reduces an infinite search — over all monic polynomials in $\mathbb{Z}[x]$ — to a single computation, since one need only find $p_\alpha$ and inspect its coefficients. The monicity of $p_\alpha$ matches the required form of the criterion automatically, because minimal polynomials are monic by convention, so the only condition that needs to be checked is whether the coefficients fall in $\mathbb{Z}$ as opposed to merely $\mathbb{Q}$. Two remarks on the structure of the argument: first, it is essential that the coefficients land in $\mathbb{Z}$ and not just $\mathbb{Q}$ — after all, $p_\alpha$ already has coefficients in $\mathbb{Q}$ by definition, and the content is precisely that integrality forces them into the smaller set $\mathbb{Z}$. Second, the key step is that $\mathcal{O}_M$ is a ring, which is why the result that $\mathcal{O}_L$ is a ring (established above) is foundational rather than decorative. The example below shows the criterion in action on two contrasting cases.
[example: Testing Integrality]
Is $\alpha = \frac{1 + \sqrt{5}}{2}$ an algebraic integer? Its minimal polynomial over $\mathbb{Q}$ is $x^2 - x - 1 \in \mathbb{Z}[x]$, which is monic with integer coefficients. So yes, $\alpha \in \mathcal{O}_{\mathbb{Q}(\sqrt{5})}$.
What is surprising here is that $\alpha$ lies *outside* $\mathbb{Z}[\sqrt{5}] = \{a + b\sqrt{5} : a, b \in \mathbb{Z}\}$. The naive guess for the ring of integers of $\mathbb{Q}(\sqrt{5})$ would be $\mathbb{Z}[\sqrt{5}]$, but the golden ratio $\frac{1+\sqrt{5}}{2}$ is integral and lies outside it. This means $\mathcal{O}_{\mathbb{Q}(\sqrt{5})}$ is strictly larger than $\mathbb{Z}[\sqrt{5}]$ — the ring of integers of a number field is generally *not* of the form $\mathbb{Z}[\text{basis element}]$. Determining $\mathcal{O}_L$ for a general number field $L$ is the central computational problem of the course.
By contrast, $\beta = \frac{1}{2}$ has minimal polynomial $x - \frac{1}{2} \in \mathbb{Q}[x]$, whose coefficients are not all integers. So $\beta \notin \mathcal{O}_\mathbb{Q}$, consistent with $\mathcal{O}_\mathbb{Q} = \mathbb{Z}$. This confirms the heuristic: the condition $2x - 1 = 0$ satisfied by $\frac{1}{2}$ is not monic, and no monic $\mathbb{Z}$-polynomial can annihilate a non-integer rational.
[/example]
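For quadratic fields the minimal-polynomial criterion is mechanical enough to code. The sketch below assumes $d$ is a non-square integer and takes $\alpha = a + b\sqrt{d}$ with rational $a, b$. For $b \neq 0$ the minimal polynomial is $x^2 - 2ax + (a^2 - db^2)$, so the test is whether both coefficients are integers. The function name `is_integral_quadratic` is hypothetical.

```python
from fractions import Fraction

def is_integral_quadratic(a, b, d):
    """Decide whether a + b*sqrt(d) is an algebraic integer (d a non-square integer)."""
    a, b = Fraction(a), Fraction(b)
    if b == 0:
        # alpha is rational; its minimal polynomial is x - a
        return a.denominator == 1
    trace = 2 * a                  # minimal polynomial: x^2 - trace*x + norm
    norm = a * a - d * b * b
    return trace.denominator == 1 and norm.denominator == 1

half = Fraction(1, 2)
print(is_integral_quadratic(half, half, 5))   # True:  x^2 - x - 1
print(is_integral_quadratic(half, half, 3))   # False: x^2 - x - 1/2
print(is_integral_quadratic(half, 0, 5))      # False: x - 1/2
```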
## The Fraction Field of the Ring of Integers
Having established that $\mathcal{O}_L$ is a ring sitting inside $L$, we might ask: how much of $L$ does it capture? Can every element of $L$ be written as a ratio of algebraic integers? The answer is yes — in fact, something stronger is true: every element of $L$ is a rational multiple of an algebraic integer.
[quotetheorem:1571]
[citeproof:1571]
This result means $L$ is obtained from $\mathcal{O}_L$ simply by inverting nonzero integers: every element of the number field is a rational multiple of an algebraic integer. The ring $\mathcal{O}_L$ therefore captures all the arithmetic of $L$ without losing the ability to recover $L$ itself. Just as $\mathbb{Q}$ is built from $\mathbb{Z}$ by allowing denominators, so $L$ is built from $\mathcal{O}_L$ by allowing denominators in $\mathbb{Z}$. The parallel is precise: $\operatorname{Frac}(\mathbb{Z}) = \mathbb{Q}$ and $\operatorname{Frac}(\mathcal{O}_L) = L$. What makes the construction work is that every $\alpha \in L$ is algebraic over $\mathbb{Q}$, which finiteness of $L/\mathbb{Q}$ guarantees: the argument takes the *finite* set of denominators appearing in the coefficients of $g$ and clears them with a single integer $n$. A transcendental element, which can only occur in an infinite-dimensional extension of $\mathbb{Q}$, satisfies no polynomial $g$ at all, so there would be no coefficients to clear and no $n$ would work.
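The denominator-clearing step can be sketched as follows. This assumes $g = x^d + c_{d-1}x^{d-1} + \cdots + c_0 \in \mathbb{Q}[x]$ is monic with $g(\alpha) = 0$, and takes $n$ to be the least common multiple of the denominators of the $c_k$. Then $n\alpha$ is a root of $n^d g(x/n) = x^d + n c_{d-1} x^{d-1} + \cdots + n^d c_0$, which has integer coefficients. The function name `clear_denominators` is an assumption of this illustration.

```python
from fractions import Fraction
from math import gcd

def clear_denominators(g):
    """g = [c_0, ..., c_{d-1}, 1], a monic polynomial over Q vanishing at alpha.
    Returns (n, h): a positive integer n and the integer coefficient list
    of the monic polynomial n^d * g(x/n), which vanishes at n*alpha."""
    g = [Fraction(c) for c in g]
    d = len(g) - 1
    n = 1
    for c in g:
        n = n * c.denominator // gcd(n, c.denominator)   # running lcm
    h = [c * n ** (d - k) for k, c in enumerate(g)]       # coefficient of x^k
    return n, [int(c) for c in h]

# alpha = sqrt(1/2) is a root of x^2 - 1/2; then 2*alpha is a root of x^2 - 2
print(clear_denominators([Fraction(-1, 2), 0, 1]))   # (2, [-2, 0, 1])
```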
Now that we have established $\mathcal{O}_L$ as a ring capturing all the arithmetic of $L$, we need invariants of individual elements to detect units and irreducibles. The norm and trace are the foundational tools: viewed as matrix determinant and trace, they translate algebraic questions into computable integer conditions.
# 2. Norm, trace, discriminant
In Chapter 1 we introduced number fields and their rings of integers, drawing motivation from the Gaussian integers $\mathbb{Z}[i]$ and the norm $N(x + iy) = x^2 + y^2$. This chapter develops that idea systematically. We attach three numerical invariants to a number field $L/\mathbb{Q}$: the **norm** $N_{L/\mathbb{Q}}(\alpha)$ and **trace** $\operatorname{tr}_{L/\mathbb{Q}}(\alpha)$ of an element $\alpha \in L$, and the **discriminant** $D_L$, a single integer attached to the field itself. These invariants are surprisingly powerful: the norm detects units and irreducibles in $\mathcal{O}_L$, while the discriminant encodes how the prime factorisation behaves in $L$.
## Norm and Trace
Given $\alpha \in L$, can we computationally detect whether $\alpha$ is an algebraic integer? If it is, can we tell whether it is a unit, or irreducible? The norm and trace are the elementary invariants that help answer these questions. They generalise the familiar determinant and trace of a matrix. Given $\alpha \in L$, multiplication by $\alpha$ is a $\mathbb{Q}$-linear map $m_\alpha: L \to L$, $\ell \mapsto \alpha \ell$. We read off its invariants.
[definition: Norm and Trace]
Let $L/K$ be a field extension and $\alpha \in L$. Write $m_\alpha: L \to L$ for the $K$-linear map $\ell \mapsto \alpha \ell$. The **norm** and **trace** of $\alpha$ are
\begin{align*}
N_{L/K}(\alpha) &= \det m_\alpha, \\
\operatorname{tr}_{L/K}(\alpha) &= \operatorname{tr} m_\alpha.
\end{align*}
[/definition]
These satisfy the expected algebraic rules.
[quotetheorem:1572]
Both identities follow from $m_{\alpha\beta} = m_\alpha \circ m_\beta$ and $m_{\alpha+\beta} = m_\alpha + m_\beta$, together with the multiplicativity of the determinant and the linearity of the trace.
The multiplicativity and additivity have complementary roles. Multiplicativity makes the norm a group homomorphism $\mathcal{O}_L^\times \to \mathbb{Z}^\times = \{\pm 1\}$: units must have norm $\pm 1$. Additivity is what makes the trace into a bilinear form $L \times L \to K$ via $(x, y) \mapsto \operatorname{tr}(xy)$ — the trace form that will define the discriminant. Importantly, multiplicativity does not require $L/K$ to be Galois; it is a purely linear-algebraic statement about the map $m_\alpha$.
A more explicit formula ties the norm and trace to the minimal polynomial. Let $p_\alpha \in K[x]$ be the minimal polynomial of $\alpha$ over $K$.
[quotetheorem:1573]
[citeproof:1573]
The exponent $[L:K(\alpha)]$ records how many copies of $\alpha$'s minimal polynomial appear in the characteristic polynomial. In the separable case, $\alpha$ has $\deg p_\alpha$ distinct conjugates in any algebraic closure, each appearing $[L:K(\alpha)]$ times as a root of the characteristic polynomial. In the primitive element case $L = K(\alpha)$, the exponent is 1 and the characteristic polynomial coincides with the minimal polynomial. The identity itself is linear algebra and holds for any finite extension; only the count of distinct conjugates depends on separability, which is automatic for number fields. In characteristic $p$, an inseparable $p_\alpha$ has repeated roots, so the conjugates are no longer distinct.
An immediate consequence connects integrality to the coefficients of the characteristic polynomial.
[quotetheorem:1574]
[citeproof:1574]
### Explicit formulas for quadratic fields
It is instructive to work out the norm and trace concretely in the quadratic case before moving to the general theory.
[example: Quadratic Fields]
Let $L = K(\sqrt{d})$ where $d \in K$ is not a square. Using the $K$-basis $\{1, \sqrt{d}\}$, any element $\alpha = x + y\sqrt{d}$ has multiplication matrix
\begin{align*}
m_\alpha = \begin{pmatrix} x & dy \\ y & x \end{pmatrix}.
\end{align*}
Reading off the determinant and trace:
\begin{align*}
N_{L/K}(x + y\sqrt{d}) &= x^2 - dy^2 = (x + y\sqrt{d})(x - y\sqrt{d}), \\
\operatorname{tr}_{L/K}(x + y\sqrt{d}) &= 2x = (x + y\sqrt{d}) + (x - y\sqrt{d}).
\end{align*}
Alternatively, the minimal polynomial of $\alpha = x + y\sqrt{d}$ (with $y \neq 0$) is $(z - x)^2 - y^2 d = z^2 - 2xz + (x^2 - dy^2)$, which has roots $x \pm y\sqrt{d}$.
[/example]
Notice that when $L = \mathbb{Q}(\sqrt{d})$ with $d < 0$, the norm $x^2 - dy^2 = x^2 + |d|y^2$ is exactly the squared modulus of $\alpha$ viewed as a complex number. This is precisely the norm we used in $\mathbb{Z}[i]$.
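The matrix description translates directly into a computation. In this minimal Python sketch an element $x + y\sqrt{d}$ is given by its coordinates, and the helper names `mult_matrix`, `norm`, `trace` and `mul` are hypothetical. The norm and trace are read off as the determinant and trace of the $2 \times 2$ multiplication matrix from the example.

```python
def mult_matrix(x, y, d):
    """Matrix of multiplication by x + y*sqrt(d) in the basis {1, sqrt(d)}."""
    return [[x, d * y],
            [y, x]]

def norm(x, y, d):
    (a, b), (c, e) = mult_matrix(x, y, d)
    return a * e - b * c           # determinant: x^2 - d*y^2

def trace(x, y, d):
    m = mult_matrix(x, y, d)
    return m[0][0] + m[1][1]       # 2x

def mul(p, q, d):
    """Product of two elements given as coordinate pairs (x, y)."""
    (x1, y1), (x2, y2) = p, q
    return (x1 * x2 + d * y1 * y2, x1 * y2 + x2 * y1)

p, q, d = (3, 1), (2, -5), 7
pq = mul(p, q, d)
print(norm(*p, d), norm(*q, d), norm(*pq, d))   # 2 -171 -342
```

The last line illustrates multiplicativity: $2 \cdot (-171) = -342$.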
### Integers of quadratic fields
The explicit norm formula allows us to determine $\mathcal{O}_L$ for quadratic fields. The answer depends on $d \pmod{4}$ in a way that is not immediately obvious.
[quotetheorem:1575]
[citeproof:1575]
[remark: On the $d \equiv 1 \pmod{4}$ Case]
The element $\omega = \tfrac{1}{2}(1 + \sqrt{d})$ is not in $\mathbb{Z}[\sqrt{d}]$, yet it is an algebraic integer. This is a genuine phenomenon: the ring of integers is often strictly larger than the "naive" ring one might guess. For $d = -3$, the ring $\mathcal{O}_{\mathbb{Q}(\sqrt{-3})} = \mathbb{Z}[\tfrac{1+\sqrt{-3}}{2}]$ is the ring of Eisenstein integers, which features prominently in the study of cubic reciprocity.
[/remark]
## Field Embeddings
To go further — in particular, to get a formula for the norm and trace in terms of the roots of $p_\alpha$ — we need to understand the embeddings of $L$ into $\mathbb{C}$.
The key tool is the primitive element theorem, recalled from Galois Theory.
[quotetheorem:1267]
For example, $\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\sqrt{2} + \sqrt{3})$.
Since every number field has the form $\mathbb{Q}(\alpha)$, we can count its embeddings into $\mathbb{C}$ explicitly.
[quotetheorem:1576]
[citeproof:1576]
The labelling $\sigma_1, \ldots, \sigma_n$ is not canonical — any permutation gives an equally valid list. Complex conjugation acts on the embeddings by $\sigma \mapsto \bar{\sigma}$, pairing up the non-real embeddings and fixing the real ones. This action is precisely what determines the signature $(r, s)$ introduced in the next subsection. The count $n$ is specific to characteristic $0$: in characteristic $p$, inseparability can reduce the number of distinct embeddings below the degree.
Using these $n$ embeddings $\sigma_1, \ldots, \sigma_n: L \to \mathbb{C}$, the norm and trace have a clean closed form.
[quotetheorem:1577]
[citeproof:1577]
This formula turns norm and trace computations into embedding computations, which are often more concrete. For instance, to evaluate $N_{\mathbb{Q}(\sqrt{2})/\mathbb{Q}}(\alpha)$ we just compute $\sigma_1(\alpha)\sigma_2(\alpha)$ where $\sigma_1, \sigma_2$ send $\sqrt{2}$ to $\pm\sqrt{2}$. For a Galois extension, the embeddings are automorphisms, and the formula shows that $N$ and $\operatorname{tr}$ are Galois-invariant — the sum and product of all Galois conjugates land in the base field. For a non-Galois $L/\mathbb{Q}$, individual conjugates $\sigma_i(\beta)$ may lie outside $L$, but their product and sum are constrained to land in $\mathbb{Q}$ by the characteristic polynomial theorem.
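As a numerical illustration of the embedding formula in a non-Galois case, take $L = \mathbb{Q}(\theta)$ with $\theta = 2^{1/3}$. The three embeddings send $\theta$ to the three complex roots of $x^3 - 2$. The sketch below uses floating-point arithmetic, so results are approximate; the names `embeddings` and `norm_and_trace` are assumptions of this example.

```python
import cmath

theta = 2 ** (1 / 3)
zeta = cmath.exp(2j * cmath.pi / 3)              # primitive cube root of unity
roots = [theta, theta * zeta, theta * zeta ** 2]  # the three roots of x^3 - 2

def embeddings(a, b, c):
    """Images of a + b*theta + c*theta^2 under the three embeddings L -> C."""
    return [a + b * r + c * r * r for r in roots]

def norm_and_trace(a, b, c):
    images = embeddings(a, b, c)
    product = 1
    for value in images:
        product *= value
    return product, sum(images)

n, t = norm_and_trace(1, 1, 0)                   # alpha = 1 + theta
print(round(n.real, 9), round(t.real, 9))
```

Two of the three conjugates of $1 + \theta$ are non-real and lie outside $L$, yet the product and sum come out (up to rounding) as the rational numbers $3$ and $3$.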
### Units and irreducibles via the norm
The embedding formula makes the norm particularly effective for detecting units.
[quotetheorem:1578]
[citeproof:1578]
This gives a criterion but not a method for finding all units. The shape of the unit group depends strongly on the signature. For imaginary quadratic fields ($r = 0$, $s = 1$), the norm form $N(a + b\sqrt{d}) = a^2 + |d|b^2$ is positive-definite in two real variables, so $|N(\alpha)| = 1$ has only finitely many integer solutions: $\mathcal{O}_L^\times$ is finite. For real quadratic fields ($r = 2$, $s = 0$), $|N(\alpha)| = 1$ becomes a Pell equation $a^2 - db^2 = \pm 1$, which has infinitely many solutions. The general pattern — that the rank of $\mathcal{O}_L^\times$ equals $r + s - 1$ — is Dirichlet's unit theorem, proved in Chapter 7.
[quotetheorem:1579]
[citeproof:1579]
This criterion is sufficient but not necessary: the converse fails. Consider $\mathbb{Z}[\sqrt{-5}]$ and the element $3$. Any factorisation $3 = \alpha\beta$ in $\mathbb{Z}[\sqrt{-5}]$ forces $N(\alpha)N(\beta) = N(3) = 9$. But there is no element of norm $3$ in $\mathbb{Z}[\sqrt{-5}]$, since $a^2 + 5b^2 = 3$ has no integer solutions ($b = 0$ gives $a^2 = 3$, and $b \neq 0$ gives $a^2 + 5b^2 \geq 5$). Hence one of $\alpha, \beta$ has norm $1$ and is a unit, so $3$ is irreducible in $\mathbb{Z}[\sqrt{-5}]$, yet $N(3) = 9$ is composite: a prime norm is not necessary for irreducibility. This failure is not an isolated accident. The element $3$ is irreducible but not prime in $\mathbb{Z}[\sqrt{-5}]$ (since $3 \mid 6 = (1 + \sqrt{-5})(1 - \sqrt{-5})$ but $3$ divides neither factor), and this gap is precisely what motivates passing from elements to ideals in Chapter 3. At the level of ideals, $(3)$ factors into prime ideals of prime norm, restoring unique factorisation.
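The claims about $3$ in $\mathbb{Z}[\sqrt{-5}]$ can be confirmed by a finite search. In this sketch an element $a + b\sqrt{-5}$ is the pair `(a, b)`; all helper names are hypothetical. The search window is large enough because $a^2 + 5b^2 = n$ forces $|a|, |b| \le \sqrt{n}$.

```python
def norm(a, b):
    return a * a + 5 * b * b

def elements_of_norm(n):
    """All (a, b) with a^2 + 5 b^2 = n."""
    bound = int(n ** 0.5) + 1
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if norm(a, b) == n]

def mul(p, q):
    (a, b), (c, e) = p, q
    return (a * c - 5 * b * e, a * e + b * c)

def divisible_by_3(p):
    """3 divides a + b*sqrt(-5) in Z[sqrt(-5)] iff 3 divides both a and b."""
    return p[0] % 3 == 0 and p[1] % 3 == 0

print(elements_of_norm(3))        # [] : no element of norm 3
print(mul((1, 1), (1, -1)))       # (6, 0)
print(divisible_by_3((6, 0)), divisible_by_3((1, 1)), divisible_by_3((1, -1)))
```

The empty list gives irreducibility of $3$, and the last line shows $3$ dividing the product $6$ without dividing either factor.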
[example: Units in the Gaussian Integers]
For $L = \mathbb{Q}(i)$, we have $\mathcal{O}_L = \mathbb{Z}[i]$ and $N(a + bi) = a^2 + b^2$. The norm equals 1 if and only if $a^2 + b^2 = 1$, which has the four solutions $\pm 1, \pm i$. So $\mathcal{O}_L^\times = \{1, -1, i, -i\}$, as expected.
For a more interesting example, consider $L = \mathbb{Q}(\sqrt{2})$. Here $N(a + b\sqrt{2}) = a^2 - 2b^2$. We need $a^2 - 2b^2 = \pm 1$. The element $1 + \sqrt{2}$ has norm $1 - 2 = -1$, so it is a unit. Its powers $(1 + \sqrt{2})^k$ give infinitely many distinct units, so $\mathcal{O}_L^\times$ is infinite.
[/example]
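Both halves of the example can be checked by machine. The sketch below lists the Gaussian units by direct search and verifies that the first few powers of $1 + \sqrt{2}$ are distinct elements of norm $\pm 1$. Elements $a + b\sqrt{d}$ are stored as pairs `(a, b)`, and the helper names are assumptions of this illustration.

```python
def mul(p, q, d):
    (a, b), (c, e) = p, q
    return (a * c + d * b * e, a * e + b * c)

def norm(p, d):
    a, b = p
    return a * a - d * b * b

def power(p, k, d):
    result = (1, 0)
    for _ in range(k):
        result = mul(result, p, d)
    return result

# Units of Z[i]: integer solutions of a^2 + b^2 = 1 (here d = -1)
gaussian_units = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)
                  if norm((a, b), -1) == 1]
print(gaussian_units)              # [(-1, 0), (0, -1), (0, 1), (1, 0)]

# Powers of 1 + sqrt(2) in Z[sqrt(2)]
powers = [power((1, 1), k, 2) for k in range(1, 6)]
print(powers)                      # [(1, 1), (3, 2), (7, 5), (17, 12), (41, 29)]
print([norm(u, 2) for u in powers])  # [-1, 1, -1, 1, -1]
```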
### Real and complex embeddings
The $n$ embeddings of $L$ split into two types: those whose image lies in $\mathbb{R}$, and conjugate pairs of embeddings into $\mathbb{C} \setminus \mathbb{R}$. This partition is what controls the signature of $L$, which governs the shape of the unit group (via Dirichlet's theorem) and the Minkowski embedding of $\mathcal{O}_L$.
[definition: Signature of a Number Field]
Let $L$ be a number field of degree $n$. Write $r$ for the number of real embeddings $L \hookrightarrow \mathbb{R}$, and $s$ for the number of conjugate pairs of non-real embeddings $L \hookrightarrow \mathbb{C}$. Then $n = r + 2s$. The pair $(r, s)$ is the **signature** of $L$.
Equivalently, if $\alpha$ is a primitive element with minimal polynomial $p_\alpha$, then $r$ is the number of real roots of $p_\alpha$ and $s$ is the number of pairs of complex conjugate roots.
[/definition]
The distinction between real and complex embeddings becomes critical when we study the unit group in Chapter 7 and the class group in Chapters 3 and 6. For now, two quick examples illustrate the range of possibilities.
[example: Signatures]
- $L = \mathbb{Q}(\sqrt{-1})$ has degree 2. The minimal polynomial of $i$ is $x^2 + 1$, which has no real roots. So $r = 0$, $s = 1$, signature $(0,1)$.
- $L = \mathbb{Q}(\sqrt{2})$ has degree 2. The minimal polynomial $x^2 - 2$ has two real roots $\pm\sqrt{2}$. So $r = 2$, $s = 0$, signature $(2,0)$.
- $L = \mathbb{Q}(2^{1/3})$ has degree 3. The minimal polynomial $x^3 - 2$ has one real root $2^{1/3}$ and a conjugate pair of complex roots $2^{1/3}\omega, 2^{1/3}\omega^2$ (where $\omega = e^{2\pi i/3}$). So $r = 1$, $s = 1$, signature $(1,1)$.
[/example]
## Discriminant
The norm and trace are invariants of elements. For field-level questions — most importantly: which rational primes ramify in $\mathcal{O}_L$? — we need an invariant of $L$ itself. The discriminant $D_L$ is that invariant.
### The trace form
To define a discriminant of $L$ we need a numerical invariant of the $\mathbb{Z}$-module $\mathcal{O}_L$. The construction passes through a bilinear form. The natural choice, and one that takes integer values on $\mathcal{O}_L$, is the trace form.
[quotetheorem:1580]
[citeproof:1580]
We write $\Delta(\alpha_1, \ldots, \alpha_n) = \det(\operatorname{tr}_{L/K}(\alpha_i\alpha_j))$ for the determinant of the Gram matrix. This quantity depends on the choice of basis: if $\alpha_i' = \sum_j a_{ij}\alpha_j$ with change-of-basis matrix $A = (a_{ij}) \in \operatorname{GL}_n(K)$, then
\begin{align*}
\Delta(\alpha_1', \ldots, \alpha_n') = (\det A)^2\, \Delta(\alpha_1, \ldots, \alpha_n).
\end{align*}
So different bases give values differing by a perfect square in $K^\times$. For number fields, we can eliminate this ambiguity by restricting to integral bases.
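The transformation rule can be verified on a small case. The sketch below works in $\mathbb{Q}(\sqrt{5})$ with exact rational arithmetic and compares the bases $\{1, \sqrt{5}\}$ and $\{1, \tfrac{1}{2}(1 + \sqrt{5})\}$, whose change-of-basis matrix has determinant $\tfrac{1}{2}$. Elements $x + y\sqrt{5}$ are stored as pairs `(x, y)`, and the helper names are hypothetical.

```python
from fractions import Fraction

D = 5

def mul(p, q):
    (x1, y1), (x2, y2) = p, q
    return (x1 * x2 + D * y1 * y2, x1 * y2 + x2 * y1)

def tr(p):
    """Trace of x + y*sqrt(D) over Q, namely 2x."""
    return 2 * p[0]

def gram_det(basis):
    """Determinant of the 2x2 Gram matrix (tr(b_i * b_j))."""
    g = [[tr(mul(u, v)) for v in basis] for u in basis]
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

one = (Fraction(1), Fraction(0))
sqrt_d = (Fraction(0), Fraction(1))
w = (Fraction(1, 2), Fraction(1, 2))      # (1 + sqrt5)/2

# New basis {1, w} in terms of {1, sqrt5}: A = [[1, 0], [1/2, 1/2]]
det_A = Fraction(1, 2)

print(gram_det([one, sqrt_d]), gram_det([one, w]))   # 20 5
```

The two values differ by the factor $(\det A)^2 = \tfrac{1}{4}$, as the formula predicts.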
### Integral bases and the discriminant
We now have a numerical invariant $\Delta(\alpha_1, \ldots, \alpha_n)$ attached to any $\mathbb{Q}$-basis of $L$, but it depends on the basis in a controlled way. For the invariant to depend only on $L$ and not on the chosen basis, we need to restrict to $\mathbb{Z}$-bases of $\mathcal{O}_L$. Do such bases always exist? And when they do, is $\Delta$ independent of which one we pick?
[definition: Integral Basis]
Let $L/\mathbb{Q}$ be a number field of degree $n$. A $\mathbb{Q}$-basis $\alpha_1, \ldots, \alpha_n$ of $L$ is an **integral basis** if
\begin{align*}
\mathcal{O}_L = \mathbb{Z}\alpha_1 \oplus \cdots \oplus \mathbb{Z}\alpha_n.
\end{align*}
Equivalently, it is simultaneously a $\mathbb{Q}$-basis for $L$ and a $\mathbb{Z}$-basis for $\mathcal{O}_L$.
[/definition]
The key existence theorem guarantees that integral bases always exist.
[quotetheorem:1581]
[citeproof:1581]
Two integral bases are related by a matrix in $\operatorname{GL}_n(\mathbb{Z})$, whose determinant is $\pm 1$. Consequently $(\det A)^2 = 1$ and the value of $\Delta$ is independent of the choice of integral basis. This gives a well-defined invariant.
[definition: Discriminant]
The **discriminant** $D_L$ of a number field $L$ is
\begin{align*}
D_L = \Delta(\alpha_1, \ldots, \alpha_n)
\end{align*}
for any integral basis $\alpha_1, \ldots, \alpha_n$ of $\mathcal{O}_L$.
[/definition]
### Computing discriminants of quadratic fields
With the machinery in place, we compute $D_L$ for $L = \mathbb{Q}(\sqrt{d})$. The answer confirms something we saw earlier: the case $d \equiv 1 \pmod{4}$ behaves differently, reflecting the fact that $\mathcal{O}_L$ is larger than the naive $\mathbb{Z}[\sqrt{d}]$ when $d \equiv 1 \pmod{4}$.
[example: Discriminant of a Quadratic Field]
Let $L = \mathbb{Q}(\sqrt{d})$ with $d$ square-free, $d \neq 0, 1$.
**Case $d \equiv 2$ or $3 \pmod{4}$:** The integral basis is $\{1, \sqrt{d}\}$. The embeddings send $\sqrt{d} \mapsto \pm\sqrt{d}$, so the matrix $S$ from the proof of non-degeneracy is
\begin{align*}
S = \begin{pmatrix} 1 & \sqrt{d} \\ 1 & -\sqrt{d} \end{pmatrix},
\end{align*}
and $D_L = (\det S)^2 = (-2\sqrt{d})^2 = 4d$.
**Case $d \equiv 1 \pmod{4}$:** The integral basis is $\{1, \omega\}$ where $\omega = \tfrac{1}{2}(1 + \sqrt{d})$. The embeddings send $\omega \mapsto \tfrac{1}{2}(1 \pm \sqrt{d})$, so
\begin{align*}
S = \begin{pmatrix} 1 & \tfrac{1}{2}(1 + \sqrt{d}) \\ 1 & \tfrac{1}{2}(1 - \sqrt{d}) \end{pmatrix},
\end{align*}
and $\det S = -\sqrt{d}$, giving $D_L = d$.
In summary:
\begin{align*}
D_{\mathbb{Q}(\sqrt{d})} = \begin{cases} 4d & \text{if } d \equiv 2 \text{ or } 3 \pmod{4}, \\ d & \text{if } d \equiv 1 \pmod{4}. \end{cases}
\end{align*}
[/example]
For instance, $D_{\mathbb{Q}(\sqrt{-1})} = -4$, $D_{\mathbb{Q}(\sqrt{5})} = 5$, and $D_{\mathbb{Q}(\sqrt{-3})} = -3$.
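The same values can be recovered directly from the definition, as the determinant of the trace-form Gram matrix on an integral basis. The following is a minimal sketch for square-free $d$; the pair representation `(a, b)` for $a + b\sqrt{d}$ and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def integral_basis(d):
    """Integral basis of Q(sqrt(d)), d square-free and not 0 or 1, as pairs (a, b)."""
    one = (Fraction(1), Fraction(0))
    if d % 4 == 1:
        return [one, (Fraction(1, 2), Fraction(1, 2))]  # 1, (1 + sqrt(d))/2
    return [one, (Fraction(0), Fraction(1))]            # 1, sqrt(d)


def trace_of_product(u, v, d):
    """tr(u*v) for u = a + b*sqrt(d), v = c + e*sqrt(d): twice the rational part."""
    (a, b), (c, e) = u, v
    return 2 * (a * c + d * b * e)


def discriminant(d):
    """D_L as the determinant of the trace-form Gram matrix on an integral basis."""
    w = integral_basis(d)
    g = [[trace_of_product(x, y, d) for y in w] for x in w]
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]


for d in (-5, -3, -1, 2, 3, 5, 13):
    formula = d if d % 4 == 1 else 4 * d
    print(d, discriminant(d), formula)
```

Python's `%` returns a non-negative remainder for a positive modulus, so `d % 4 == 1` also handles negative $d$ such as $-3$ correctly.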
### Connection to polynomial discriminants
There is a natural consistency check between the discriminant of a number field and the classical discriminant of a polynomial. If $f(x) = \prod_{i=1}^n (x - \alpha_i)$, the discriminant of $f$ is defined as
\begin{align*}
\operatorname{disc}(f) = \prod_{i < j}(\alpha_i - \alpha_j)^2.
\end{align*}
When $L = \mathbb{Q}(\theta)$ and the basis $1, \theta, \ldots, \theta^{n-1}$ is integral (which happens for many natural examples but not always), the Vandermonde computation in the proof above gives
\begin{align*}
\Delta(1, \theta, \ldots, \theta^{n-1}) = \prod_{i < j}(\sigma_i(\theta) - \sigma_j(\theta))^2 = \operatorname{disc}(p_\theta),
\end{align*}
where $p_\theta$ is the minimal polynomial of $\theta$ and $\sigma_i(\theta)$ are its roots. So $D_L = \operatorname{disc}(p_\theta)$ whenever $1, \theta, \ldots, \theta^{n-1}$ is an integral basis. When it is not (e.g., $\theta = \sqrt{d}$ in the $d \equiv 1 \pmod{4}$ quadratic case) but $\theta \in \mathcal{O}_L$, the two differ by a perfect square: applying the change-of-basis formula to the integer matrix expressing $1, \theta, \ldots, \theta^{n-1}$ in terms of an integral basis gives $\operatorname{disc}(p_\theta) = [\mathcal{O}_L : \mathbb{Z}[\theta]]^2 \, D_L$.
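For a quadratic field with $d \equiv 1 \pmod{4}$ the square factor can be seen by hand. The sketch below compares the discriminant of $x^2 - d$ (the minimal polynomial of $\sqrt{d}$) with $D_L$, using the fact that $\mathbb{Z}[\sqrt{d}]$ has index $2$ in $\mathcal{O}_L$ in this case. It also checks that the minimal polynomial $x^2 - x + \tfrac{1-d}{4}$ of $\tfrac{1}{2}(1+\sqrt{d})$ has discriminant exactly $D_L$. Function names are illustrative.

```python
def disc_monic_quadratic(b, c):
    """Discriminant of x^2 + b*x + c, i.e. (r1 - r2)^2 for its roots r1, r2."""
    return b * b - 4 * c


def field_discriminant(d):
    """D_L for L = Q(sqrt(d)), d square-free, from the case formula."""
    return d if d % 4 == 1 else 4 * d


checks = []
for d in (5, 13, -3, -7):  # all congruent to 1 mod 4
    index = 2  # index of Z[sqrt(d)] in O_L when d = 1 mod 4
    power_basis_ok = disc_monic_quadratic(0, -d) == index**2 * field_discriminant(d)
    integral_basis_ok = disc_monic_quadratic(-1, (1 - d) // 4) == field_discriminant(d)
    checks.append((d, power_basis_ok, integral_basis_ok))

print(checks)
```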
[remark: Discriminant and Ramification]
The discriminant $D_L$ is not merely a computational curiosity. A rational prime $p$ ramifies in $\mathcal{O}_L$ — meaning some prime ideal above $p$ appears with multiplicity greater than one — if and only if $p \mid D_L$. This is proved in Chapter 5 once we have developed the theory of ideals. For quadratic fields, the formula above shows that the only primes that can ramify are those dividing $D_L$: for $\mathbb{Q}(\sqrt{-1})$, only $p = 2$ ramifies; for $\mathbb{Q}(\sqrt{5})$, only $p = 5$ ramifies.
[/remark]
With the norm and trace in hand, we have computational tests for when elements are units, but a deeper problem remains: elements of $\mathcal{O}_L$ fail to factor uniquely in general. The remedy is to rise above element-level arithmetic and work instead with ideals, where unique factorization holds perfectly once we understand the multiplicative structure.
# 3. Multiplicative structure of ideals
Having established the ring of integers $\mathcal{O}_L$ in the previous chapter, a natural question is how well-behaved it is as a ring. The answer, in general, is that it fails to be a unique factorization domain. This chapter develops the remedy: by passing from elements to ideals, and then equipping the collection of ideals with a multiplicative structure, we recover a perfect analogue of unique factorization. The central theorem is that every nonzero ideal in $\mathcal{O}_L$ factors uniquely as a product of prime ideals. The invariant measuring how far $\mathcal{O}_L$ is from being a PID — the class group — is introduced at the end of the chapter.
## Failure of unique factorization in $\mathcal{O}_L$
The ring $\mathcal{O}_L$ can fail to be a UFD. The simplest example makes this concrete.
[example: Failure of Unique Factorization in $\mathbb{Z}[\sqrt{-5}]$]
Let $L = \mathbb{Q}(\sqrt{-5})$, so $\mathcal{O}_L = \mathbb{Z}[\sqrt{-5}]$. We have the factorization
\begin{align*}
3 \cdot 7 &= (1 + 2\sqrt{-5})(1 - 2\sqrt{-5}).
\end{align*}
The norms of these four elements are $N(3) = 9$, $N(7) = 49$, $N(1 \pm 2\sqrt{-5}) = 21$. Associates have equal norms up to sign, so neither $3$ nor $7$ is an associate of $1 \pm 2\sqrt{-5}$, and these are genuinely distinct factorisations. Moreover, all four elements $3, 7, 1 \pm 2\sqrt{-5}$ are irreducible. For instance, if $3 = \alpha\beta$ with neither factor a unit, then $9 = N(\alpha)N(\beta)$ forces $N(\alpha) = 3$ (norms in $\mathbb{Z}[\sqrt{-5}]$ are non-negative). But the equation $x^2 + 5y^2 = 3$ has no integer solutions, so no such $\alpha$ exists. The checks for the other three elements are analogous: each would require an element of norm $3$ or $7$, and $x^2 + 5y^2 = 7$ has no integer solutions either.
Thus unique factorization fails in $\mathbb{Z}[\sqrt{-5}]$.
[/example]
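The norm argument in this example is a finite search and can be automated. The sketch below enumerates solutions of $x^2 + 5y^2 = m$ and applies the sufficient test used above: if no element has norm equal to a proper divisor $m$ of $N(\alpha)$ with $1 < m < N(\alpha)$, then $\alpha$ is irreducible. Elements $x + y\sqrt{-5}$ are stored as pairs `(x, y)`; the helper names are hypothetical.

```python
from math import isqrt


def norm(x, y):
    """N(x + y*sqrt(-5)) = x^2 + 5*y^2."""
    return x * x + 5 * y * y


def elements_of_norm(n):
    """All (x, y) with x^2 + 5*y^2 == n; a finite search since the form is positive definite."""
    bound = isqrt(n)
    return [
        (x, y)
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if norm(x, y) == n
    ]


def passes_norm_test(x, y):
    """Sufficient (not necessary) irreducibility test: no element has norm
    equal to a proper divisor of N(x + y*sqrt(-5))."""
    n = norm(x, y)
    if n <= 1:
        return False  # zero or a unit
    return all(not elements_of_norm(m) for m in range(2, n) if n % m == 0)


candidates = [(3, 0), (7, 0), (1, 2), (1, -2)]
results = {c: passes_norm_test(*c) for c in candidates}
print(results)
```

The test is only sufficient: an element such as $6$ fails it because elements of norm $4$ exist, and in that case the test alone says nothing about irreducibility.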
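It is still possible to factor any nonzero non-unit into irreducibles (not necessarily uniquely), by induction on $|N(\alpha)|$. Recall that $|N(\alpha)| = 1$ exactly when $\alpha$ is a unit. If $\alpha$ is irreducible there is nothing to do; otherwise $\alpha = \beta\gamma$ with neither factor a unit, so $|N(\beta)|, |N(\gamma)| > 1$, and $N(\beta)N(\gamma) = N(\alpha)$ gives $|N(\beta)|, |N(\gamma)| < |N(\alpha)|$. Both factors are then covered by the inductive hypothesis.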
The fix is to pass from elements to ideals. Ideals carry a natural multiplicative structure, and the fundamental theorem of this chapter says that unique factorization holds perfectly at the level of ideals.
## Ideal multiplication and prime ideals
Since element-level factorisation fails in $\mathcal{O}_L$, we lift to ideals. The first question: what should multiplication of ideals mean, and does it behave the way we need it to?
[definition: Ideal Multiplication]
Let $\mathfrak{a}, \mathfrak{b} \unlhd \mathcal{O}_L$ be ideals. Their product is
\begin{align*}
\mathfrak{a}\mathfrak{b} &= \left\{ \sum_{i=1}^{k} \alpha_i \beta_i : k \geq 0,\ \alpha_i \in \mathfrak{a},\, \beta_i \in \mathfrak{b} \right\}.
\end{align*}
We say $\mathfrak{a}$ **divides** $\mathfrak{b}$, written $\mathfrak{a} \mid \mathfrak{b}$, if there exists an ideal $\mathfrak{c}$ with $\mathfrak{a}\mathfrak{c} = \mathfrak{b}$.
[/definition]
Multiplication is associative: $(\mathfrak{a}\mathfrak{b})\mathfrak{c} = \mathfrak{a}(\mathfrak{b}\mathfrak{c})$, and commutative. For finitely generated ideals we compute products via generators:
\begin{align*}
\langle x_1, \ldots, x_n \rangle \langle y_1, \ldots, y_m \rangle &= \langle x_i y_j : 1 \leq i \leq n,\, 1 \leq j \leq m \rangle.
\end{align*}
In particular, $\langle x \rangle \langle y \rangle = \langle xy \rangle$.
[example: Factoring $\langle 3 \rangle$ in $\mathbb{Z}[\sqrt{-5}]$]
In $\mathbb{Z}[\sqrt{-5}]$, the principal ideal $\langle 3 \rangle$ factors as
\begin{align*}
\langle 3 \rangle &= \langle 3, 1 + 2\sqrt{-5} \rangle \langle 3, 1 - 2\sqrt{-5} \rangle.
\end{align*}
To verify this, we compute the product on the right:
\begin{align*}
\langle 3, 1 + 2\sqrt{-5} \rangle \langle 3, 1 - 2\sqrt{-5} \rangle &= \langle 9,\, 3(1+2\sqrt{-5}),\, 3(1-2\sqrt{-5}),\, 21 \rangle.
\end{align*}
Since $\gcd(9, 21) = 3$, Euclid's algorithm gives $\langle 9, 21 \rangle = \langle 3 \rangle$, so $3$ lies in the product; conversely, every generator on the right is a multiple of $3$. Hence the whole expression reduces to $\langle 3 \rangle$.
Notice that $3$ was irreducible as an element: there is no element of norm $3$ in $\mathbb{Z}[\sqrt{-5}]$. By passing to ideals, we can refine the factorization further, and the ideals $\langle 3, 1 \pm 2\sqrt{-5} \rangle$ are not principal (if they were, we would recover a nontrivial factorization of $3$ as an element). These ideals can be thought of as "generalized elements" that live outside $\mathcal{O}_L$ but carry arithmetic significance.
[/example]
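Ideal products like this one can be checked mechanically by treating each ideal of $\mathbb{Z}[\sqrt{-5}]$ as a rank-2 sublattice of $\mathbb{Z}^2$ and comparing canonical $\mathbb{Z}$-bases. The sketch below is one way to do this, not a method taken from the course: elements $a + b\sqrt{-5}$ are pairs `(a, b)`, and `lattice_basis` is a hypothetical helper that runs Euclid's algorithm on the second coordinate. Since $1 + 2\sqrt{-5} \equiv 1 - \sqrt{-5} \pmod{3}$, the generator pairs $\langle 3, 1 \pm \sqrt{-5} \rangle$ and $\langle 3, 1 \pm 2\sqrt{-5} \rangle$ describe the same two ideals, and the code checks the product for both presentations.

```python
from functools import reduce
from math import gcd


def mul(u, v):
    """(a + b*sqrt(-5)) * (c + e*sqrt(-5)), with elements stored as integer pairs."""
    (a, b), (c, e) = u, v
    return (a * c - 5 * b * e, a * e + b * c)


def lattice_basis(vectors):
    """Canonical Z-basis ((a, 0), (b, c)) with a, c > 0 and 0 <= b < a
    of the lattice spanned by `vectors` (assumed to have rank 2)."""
    pivot, firsts = [0, 0], []
    for r in map(list, vectors):
        while r[1] != 0:  # Euclid's algorithm on the sqrt(-5)-coordinate
            q = pivot[1] // r[1]
            pivot = [pivot[0] - q * r[0], pivot[1] - q * r[1]]
            pivot, r = r, pivot
        firsts.append(r[0])  # r is now a rational integer in the lattice
    a = reduce(gcd, firsts, 0)
    if pivot[1] < 0:
        pivot = [-pivot[0], -pivot[1]]
    return ((a, 0), (pivot[0] % a, pivot[1]))


def ideal(*gens):
    """Lattice of the ideal generated by gens: the Z-span of g and g*sqrt(-5)."""
    spanning = []
    for g in gens:
        spanning += [g, mul(g, (0, 1))]
    return lattice_basis(spanning)


def product(gens1, gens2):
    """Ideal product: generated by all pairwise products of generators."""
    return ideal(*[mul(x, y) for x in gens1 for y in gens2])


three = ideal((3, 0))
outcomes = {}
for k in (1, 2):
    p_plus, p_minus = [(3, 0), (1, k)], [(3, 0), (1, -k)]
    outcomes[k] = product(p_plus, p_minus) == three
print(outcomes)
```

Comparing canonical bases rather than generating sets is the design choice that matters here: two different generating sets of the same ideal reduce to the same triangular basis.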
The example shows ideals factor more finely than elements, but to exploit this we need the right notion of an ideal playing the role of a prime — one for which the familiar reasoning "if a prime divides a product, it divides a factor" still applies. The ring-theoretic definition of a prime ideal generalises exactly this property, and it is the first ingredient in the axiomatic framework we will develop.
[definition: Prime Ideal]
Let $R$ be a ring. An ideal $\mathfrak{p} \subseteq R$ is **prime** if $R/\mathfrak{p}$ is an integral domain. Equivalently, for all $x, y \in R$: $xy \in \mathfrak{p}$ implies $x \in \mathfrak{p}$ or $y \in \mathfrak{p}$.
In this course, prime ideals are always required to be nonzero.
[/definition]
## Dedekind domains
The ring of integers $\mathcal{O}_L$ belongs to a special class of rings whose algebraic structure is perfectly suited for ideal-theoretic arguments.
[definition: Dedekind Domain]
A ring $R$ is a **Dedekind domain** if:
1. $R$ is an integral domain.
2. $R$ is Noetherian.
3. $R$ is integrally closed in $\operatorname{Frac}(R)$: if $x \in \operatorname{Frac}(R)$ is integral over $R$, then $x \in R$.
4. Every nonzero prime ideal is maximal.
[/definition]
Condition (4) is the geometric heart of the definition: it says $R$ is "one-dimensional" in the sense of Krull dimension. The theorem we need is:
[quotetheorem:1582]
[citeproof:1582]
Of the three conditions beyond being a domain, integral closure is the subtlest: it is exactly the hypothesis that distinguishes $\mathcal{O}_L$ from smaller orders like $\mathbb{Z}[\sqrt{-3}]$, which is Noetherian of dimension 1 but not integrally closed ($\frac{1+\sqrt{-3}}{2}$ is a root of $x^2 - x + 1$, hence integral over $\mathbb{Z}$, yet does not lie in $\mathbb{Z}[\sqrt{-3}]$). That failure of integral closure is exactly what breaks unique factorization of ideals in $\mathbb{Z}[\sqrt{-3}]$, whereas it holds in the full ring of integers $\mathcal{O}_L = \mathbb{Z}[\frac{1+\sqrt{-3}}{2}] = \mathbb{Z}[\zeta_3]$. The Krull-dimension-1 condition is equally essential: without it, nonzero primes need not be maximal and the quotients $R/\mathfrak{p}$ need not be fields. Dedekind domains are precisely the domains in which all three properties (Noetherian, dimension 1, integrally closed) hold simultaneously, which is exactly what every step in the rest of the chapter needs.
The lemma used in the proof of (iv) is important in its own right.
[quotetheorem:1583]
[citeproof:1583]
The finiteness of $\mathcal{O}_L/\mathfrak{a}$ is foundational for the rest of number theory: it lets us define the ideal norm $N(\mathfrak{a}) = |\mathcal{O}_L/\mathfrak{a}|$, which underpins the Minkowski bound for class number estimates, the statement that only finitely many primes ramify, and the finiteness of the class group itself (Chapter 7). Integral closure of $\mathcal{O}_L$ was used in the proof: dropping it breaks the argument at the step where $\mathcal{O}_L/a\mathcal{O}_L$ injects into $(1/\alpha)\mathbb{Z}^n/\mathbb{Z}^n$ — without $\alpha \cdot \mathcal{O}_L$ being contained in $\mathcal{O}_L$, this quotient need not be finite.
## Fractional ideals and their inverses
The strategy for proving unique factorization of ideals mirrors the proof for $\mathbb{Z}$: once one has established that prime ideals divide products by dividing one factor, uniqueness follows by cancellation. But cancellation in ideals requires inverses, and a proper nonzero integral ideal $\mathfrak{a}$ has no integral ideal inverse (since $\mathfrak{a}\mathfrak{b} \subseteq \mathfrak{a} \subsetneq \mathcal{O}_L$ for any ideal $\mathfrak{b}$). The solution is to enlarge the universe from integral ideals to fractional ideals.
[definition: Fractional Ideal]
A **fractional ideal** of $\mathcal{O}_L$ is an $\mathcal{O}_L$-submodule of $L$ that is finitely generated over $\mathcal{O}_L$.
[/definition]
Every ideal of $\mathcal{O}_L$ is a fractional ideal (it is already an $\mathcal{O}_L$-submodule of $L$, finitely generated by the Noetherian condition). We introduce the word "integral" to distinguish these from fractional ideals that are not contained in $\mathcal{O}_L$.
[definition: Integral Ideal]
When we wish to emphasise that $\mathfrak{a} \unlhd \mathcal{O}_L$ is an ordinary ideal, we call it an **integral ideal** (or honest ideal). We never use the word "ideal" alone to mean a fractional ideal.
[/definition]
The following lemma characterises fractional ideals more concretely.
[quotetheorem:1584]
In other words, every fractional ideal is of the form $\frac{1}{c}\mathfrak{a}$ for some integral ideal $\mathfrak{a}$ and nonzero integer $c$.
[proof]
$(\Leftarrow)$ If $c\mathfrak{q}$ is an integral ideal, it is finitely generated (since $\mathcal{O}_L$ is Noetherian). Since $c\mathfrak{q} \cong \mathfrak{q}$ as $\mathcal{O}_L$-modules, $\mathfrak{q}$ is also finitely generated.
$(\Rightarrow)$ Suppose $\mathfrak{q}$ is generated by $x_1, \ldots, x_k$ over $\mathcal{O}_L$. Write $x_i = y_i/n_i$ with $y_i \in \mathcal{O}_L$ and $n_i \in \mathbb{Z} \setminus \{0\}$ (possible because every element of $L$ is an algebraic integer divided by a nonzero rational integer). Let $c = \operatorname{lcm}(n_1, \ldots, n_k)$. Then each $c x_i$ lies in $\mathcal{O}_L$, so $c\mathfrak{q} \subseteq \mathcal{O}_L$ is an $\mathcal{O}_L$-submodule of $\mathcal{O}_L$, i.e., an integral ideal.
[/proof]
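The denominator-clearing step in the proof is easy to carry out explicitly. As a small sketch, assume generators are given by their rational coordinates with respect to an integral basis (here for $L = \mathbb{Q}(i)$ with basis $1, i$, an example chosen only for illustration); then a valid $c$ is the least common multiple of all coordinate denominators. The function name is hypothetical.

```python
from fractions import Fraction
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def common_denominator(generators):
    """Smallest positive integer c such that c*x has integer coordinates
    for every generator x (coordinates taken in an integral basis)."""
    c = 1
    for x in generators:
        for coord in x:
            c = lcm(c, coord.denominator)
    return c


# Generators 1/2 + (1/3) i and 3/4 of a fractional ideal of Z[i]
gens = [(Fraction(1, 2), Fraction(1, 3)), (Fraction(3, 4), Fraction(0))]
c = common_denominator(gens)
scaled = [tuple(c * coord for coord in x) for x in gens]
print(c, scaled)
```

After scaling, every generator has integer coordinates, so $c\mathfrak{q}$ is generated by elements of $\mathcal{O}_L$ and is an integral ideal.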
[remark: Rank of Fractional Ideals]
The proof implicitly used that fractional ideals have finite rank; we record this. As an abelian group, every nonzero fractional ideal $\mathfrak{q}$ satisfies $\mathfrak{q} \cong \mathbb{Z}^n$ where $n = [L:\mathbb{Q}]$. Indeed, $\mathfrak{q} \cong c\mathfrak{q}$, and $c\mathfrak{q}$ is a nonzero integral ideal, hence a subgroup of $\mathcal{O}_L \cong \mathbb{Z}^n$ that contains $a_0\mathcal{O}_L \cong \mathbb{Z}^n$ for some nonzero $a_0 \in \mathbb{Z}$. A subgroup of $\mathbb{Z}^n$ is free of rank at most $n$, and one that contains a copy of $\mathbb{Z}^n$ has rank at least $n$, so the rank is exactly $n$.
[/remark]
The natural candidate for the inverse of a fractional ideal $\mathfrak{q}$ is the set $\{x \in L : x\mathfrak{q} \subseteq \mathcal{O}_L\}$. Two preparatory results show this construction is well-behaved.
[quotetheorem:1585]
[citeproof:1585]
The second part says that for proper ideals, the set $\{y \in L : y\mathfrak{a} \subseteq \mathcal{O}_L\}$ is strictly larger than $\mathcal{O}_L$. Combined with the characterisation lemma (pick a nonzero $a \in \mathfrak{a}$; then $a \cdot \{y : y\mathfrak{a} \subseteq \mathcal{O}_L\} \subseteq \mathcal{O}_L$), this set is a fractional ideal, and it will be the inverse of $\mathfrak{a}$.
[definition: Invertible Fractional Ideal]
A fractional ideal $\mathfrak{q}$ is **invertible** if there exists a fractional ideal $\mathfrak{r}$ such that $\mathfrak{q}\mathfrak{r} = \mathcal{O}_L$.
[/definition]
The key theorem of this section is that every nonzero fractional ideal is invertible; the proof constructs the inverse explicitly.
[quotetheorem:1586]
[citeproof:1586]
All three Dedekind conditions are essential to this theorem. Noetherianness gives that the maximality argument terminates. Integral closure ensures the candidate inverse $\mathfrak{b} := \{x \in L : x\mathfrak{a} \subseteq \mathcal{O}_L\}$ satisfies $\mathfrak{a}\mathfrak{b} = \mathcal{O}_L$ rather than landing in some intermediate conductor (in $\mathbb{Z}[\sqrt{-3}]$, the ideal $\langle 2, 1+\sqrt{-3} \rangle$ has conductor equal to itself and is *not* invertible). Dimension 1 ensures that any proper nonzero ideal is contained in a maximal ideal to which we can apply the stability argument. This theorem sets up the next section: divisibility of ideals can now be studied via cancellation.
## Divisibility and containment
With inverses in hand, cancellation becomes available, and the relationship between divisibility and containment becomes transparent.
[quotetheorem:1587]
[citeproof:1587]
[remark: To Contain Is to Divide]
The equivalence of divisibility and containment yields a famous slogan in algebraic number theory: to contain is to divide ($\mathfrak{b} \subseteq \mathfrak{a} \iff \mathfrak{a} \mid \mathfrak{b}$). Showing that $\mathfrak{a}$ divides $\mathfrak{b}$ — which requires producing a quotient ideal — is replaced by the much simpler task of checking containment.
[/remark]
## Unique factorization of ideals
We now ask the central question: does the machinery we have built — Dedekind domains, invertibility, the to-contain-is-to-divide equivalence — actually yield a unique factorization theorem at the ideal level? The affirmative answer is the main theorem of the chapter.
Before proving the theorem, we record a useful lemma relating prime ideals to products.
[quotetheorem:1588]
[citeproof:1588]
This is Euclid's lemma for ideals and is the key step of the uniqueness half of the unique factorization theorem. Without "prime divides product implies prime divides a factor," factorisation could split in incompatible ways — exactly the phenomenon that kills uniqueness of element factorisation in $\mathbb{Z}[\sqrt{-5}]$. The ideal version succeeds because prime ideals in a Dedekind domain are maximal, so $\mathcal{O}_L/\mathfrak{p}$ is a field and hence an integral domain.
We now have all ingredients: existence via Noetherian induction, uniqueness via prime-divides-product. The main theorem follows.
[quotetheorem:1589]
[citeproof:1589]
This is the structural payoff of everything built so far. The three Dedekind conditions enter separately: Noetherianness drives the existence half (the ascending chain condition guarantees that the construction terminates), maximality of primes enables the key step that a prime divides a product by dividing one factor, and invertibility (which uses integral closure) lets us cancel to conclude uniqueness. Dropping any condition breaks the theorem: $\mathbb{Z}[\sqrt{-3}]$ is Noetherian of dimension 1 but not integrally closed, and there the prime ideal $\mathfrak{p} = \langle 2, 1+\sqrt{-3}\rangle = \langle 2, 1-\sqrt{-3}\rangle$ satisfies $\mathfrak{p}^2 = \langle 2\rangle\mathfrak{p}$ although $\mathfrak{p} \neq \langle 2\rangle$, so cancellation fails. In fact $\mathfrak{p}^2 \subsetneq \langle 2 \rangle \subsetneq \mathfrak{p}$ and $\mathfrak{p}$ is the only prime ideal containing $2$, so $\langle 2 \rangle$ is not a product of prime ideals at all. Note also that the theorem does not claim unique factorisation at the element level: $2 \cdot 3 = (1+\sqrt{-5})(1-\sqrt{-5})$ still gives two distinct factorisations into irreducibles in $\mathbb{Z}[\sqrt{-5}]$. What unique factorisation of ideals restores is that $\langle 6 \rangle$ factors uniquely as a product of prime ideals.
The unique factorization theorem has an elegant structural consequence: the nonzero fractional ideals form a group with a free abelian structure.
[quotetheorem:1590]
[citeproof:1590]
[remark: LCM and GCD of Ideals]
The free abelian group structure is exactly what makes the integers and the nonzero fractional ideals of $\mathcal{O}_L$ look similar arithmetically: both are free abelian groups on a set of primes, and the freeness — no relations among distinct primes — is precisely unique factorisation phrased group-theoretically. In non-Dedekind settings this fails: primes need not be maximal, or ideals need not be invertible, and the free structure collapses. This freeness directly motivates the class group construction that follows: passing to the quotient by the principal ideals measures the arithmetic gap between $\mathcal{O}_L$ and $\mathbb{Z}$. Since divisibility and containment coincide, the lattice of ideals under inclusion has a clean description: the **least common multiple** of $\mathfrak{a}$ and $\mathfrak{b}$ is $\mathfrak{a} \cap \mathfrak{b}$, and the **greatest common divisor** is their sum
\begin{align*}
\mathfrak{a} + \mathfrak{b} &= \{a + b : a \in \mathfrak{a},\, b \in \mathfrak{b}\}.
\end{align*}
[/remark]
## Quadratic fields: an explicit verification
The invertibility proof produced an inverse by an abstract maximality argument, giving no recipe. For quadratic fields, can we write down the inverse by hand? This section shows yes, and the argument doubles as a concrete illustration of the ideal-theoretic machinery.
[example: Every Ideal in a Quadratic Field Has a Principal Multiple]
Let $\mathfrak{a} \unlhd \mathcal{O}_L$ be a nonzero ideal, where $[L:\mathbb{Q}] = 2$. We show directly that there exists an ideal $\mathfrak{b}$ such that $\mathfrak{a}\mathfrak{b}$ is principal — providing a hands-on proof of invertibility for this case.
Since $\mathcal{O}_L \cong \mathbb{Z}^2$, any ideal $\mathfrak{a} \cong \mathbb{Z}^2$ as abelian groups, so $\mathfrak{a}$ is generated by at most two elements as an $\mathcal{O}_L$-module. If one generator suffices, $\mathfrak{a}$ is already principal. Otherwise write $\mathfrak{a} = \langle b, \alpha \rangle$ with $b \in \mathbb{Z}$ and $\alpha \in \mathcal{O}_L$ (we may arrange one generator in $\mathbb{Z}$ by subtracting $\mathbb{Z}$-linear combinations). Write $\alpha = x + y\sqrt{d}$, $\bar{\alpha} = x - y\sqrt{d}$ and let $\bar{\mathfrak{a}} = \langle b, \bar{\alpha} \rangle$. Then
\begin{align*}
\mathfrak{a}\bar{\mathfrak{a}} &= \langle b^2,\, b\alpha,\, b\bar{\alpha},\, \alpha\bar{\alpha} \rangle \\
&= \langle b^2,\, b\alpha,\, b\operatorname{tr}(\alpha),\, N(\alpha) \rangle,
\end{align*}
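using $\operatorname{tr}(\alpha) = \alpha + \bar{\alpha}$ and $N(\alpha) = \alpha\bar{\alpha}$ (so that $b\bar{\alpha} = b\operatorname{tr}(\alpha) - b\alpha$). Here $b^2$, $b\operatorname{tr}(\alpha)$, and $N(\alpha)$ are all integers. Let $c = \gcd(b^2, b\operatorname{tr}(\alpha), N(\alpha))$. As an integer combination of these three generators, $c$ lies in $\mathfrak{a}\bar{\mathfrak{a}}$, so $\langle c \rangle \subseteq \mathfrak{a}\bar{\mathfrak{a}}$. For the reverse inclusion, $c$ divides the three integer generators by construction, so it remains to see that $b\alpha \in \langle c \rangle$. Set $\beta = b\alpha/c \in L$ and note that $\operatorname{tr}(\beta) = \frac{b}{c}\operatorname{tr}(\alpha) \in \mathbb{Z}$ (since $c \mid b\operatorname{tr}(\alpha)$) and $N(\beta) = \frac{b^2 N(\alpha)}{c^2} \in \mathbb{Z}$ (since $c^2 \mid b^2 N(\alpha)$, which follows from $c \mid b^2$ and $c \mid N(\alpha)$). Thus $\beta$ is a root of the monic integer polynomial $t^2 - \operatorname{tr}(\beta)\,t + N(\beta)$, so $\beta \in \mathcal{O}_L$ and $b\alpha \in c\mathcal{O}_L$. Therefore $\mathfrak{a}\bar{\mathfrak{a}} = \langle c \rangle$, a principal ideal.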
[/example]
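The recipe in this example is explicit enough to run. The sketch below takes an ideal $\langle b, \alpha \rangle$ with $b \in \mathbb{Z}$ and $\alpha = x + y\sqrt{d}$, where for simplicity we assume $x, y \in \mathbb{Z}$. It computes $c = \gcd(b^2, b\operatorname{tr}(\alpha), N(\alpha))$ and confirms that $b\alpha/c$ has integer trace and norm. The function name and argument order are illustrative choices.

```python
from fractions import Fraction
from math import gcd


def principal_multiple(b, x, y, d):
    """For a = <b, alpha> with alpha = x + y*sqrt(d), x and y integers,
    return the integer c with a * conj(a) = <c>, following the argument above."""
    tr = 2 * x               # tr(alpha) = alpha + conj(alpha)
    nm = x * x - d * y * y   # N(alpha) = alpha * conj(alpha)
    c = gcd(gcd(b * b, b * tr), nm)
    # b*alpha/c must be an algebraic integer: check integer trace and norm
    quotient_trace = Fraction(b * tr, c)
    quotient_norm = Fraction(b * b * nm, c * c)
    assert quotient_trace.denominator == 1 and quotient_norm.denominator == 1
    return c


# <3, 1 + 2*sqrt(-5)> and <2, 1 + sqrt(-5)> in Z[sqrt(-5)]
c_three = principal_multiple(3, 1, 2, -5)
c_two = principal_multiple(2, 1, 1, -5)
print(c_three, c_two)
```

For $\langle 3, 1 + 2\sqrt{-5} \rangle$ this gives $c = \gcd(9, 6, 21) = 3$, and for $\langle 2, 1 + \sqrt{-5} \rangle$ it gives $c = \gcd(4, 4, 6) = 2$.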
## The class group
How far is $\mathcal{O}_L$ from being a PID? The nonzero fractional ideals form a group, the principal ones form a subgroup, and the quotient of the two, the class group, measures exactly this obstruction.
[definition: Class Group]
The **class group** (or **ideal class group**) of a number field $L$ is
\begin{align*}
\mathrm{Cl}_L &= I_L / P_L,
\end{align*}
where $I_L$ is the group of nonzero fractional ideals and $P_L$ is the normal subgroup of principal fractional ideals. For $\mathfrak{a} \in I_L$, we write $[\mathfrak{a}]$ for its class in $\mathrm{Cl}_L$.
[/definition]
Two ideals $\mathfrak{a}$ and $\mathfrak{b}$ represent the same class if and only if there exists $\gamma \in L^\times$ with $\gamma\mathfrak{a} = \mathfrak{b}$. The class group is trivial precisely when $P_L = I_L$, meaning every fractional ideal is principal.
[quotetheorem:1591]
[citeproof:1591]
[remark: The Role of the Class Group]
The class group is the precise measure of how much $\mathcal{O}_L$ fails to be a PID: a trivial class group means all ideals are principal and element-level unique factorization holds; a nontrivial class group records the exact "defect" of principal ideals among all fractional ideals, and understanding its structure (especially its order, the class number) is one of the central goals of algebraic number theory. The example $\mathbb{Z}[\sqrt{-5}]$ has $\mathrm{Cl}_L \cong \mathbb{Z}/2\mathbb{Z}$ — a nontrivial class group of order 2. The nonprincipal class is represented by the ideals $\langle 3, 1 \pm 2\sqrt{-5} \rangle$. The factorizations $3 \cdot 7 = (1 + 2\sqrt{-5})(1 - 2\sqrt{-5})$ correspond, at the ideal level, to the same product $\langle 3 \rangle \langle 7 \rangle = \langle 1+2\sqrt{-5} \rangle \langle 1-2\sqrt{-5} \rangle$, but now uniquely recoverable from the prime ideal factorizations of each side.
[/remark]
In Chapters 5 and 6, we will develop methods — notably Dedekind's criterion and the Minkowski bound — to compute the class group of any given number field explicitly.
Having recovered unique factorization at the ideal level, we now ask how to measure the size of ideals. The ideal norm $N(\mathfrak{a}) = |\mathcal{O}_L/\mathfrak{a}|$ provides the answer: a completely multiplicative invariant that extends the element norm and opens the door to geometric arguments about how many ideals exist of bounded size.
# 4. Norms of ideals
Having established unique factorisation of ideals and the class group in the previous chapter, we now turn to measuring the size of ideals. The ideal norm is the central tool for making the class group computable: it lets us count ideals of bounded norm, and the key estimate that the class group is finite rests on showing that every ideal class contains an ideal of norm at most a specific bound — the Minkowski bound — computable from the field discriminant and degree. Without the ideal norm, we would have no way to reduce the class group to a finite search. The goal of this chapter is to construct this invariant, establish its multiplicativity, and reconcile it with the element norm that has run in parallel throughout the course.
## The Norm of an Ideal
The norm of an element $\alpha \in \mathcal{O}_L$ was defined using the field-theoretic trace and determinant. For ideals we adopt a more direct approach: the size of the quotient.
[definition: Norm of an Ideal]
Let $L$ be a number field with ring of integers $\mathcal{O}_L$, and let $\mathfrak{a} \unlhd \mathcal{O}_L$ be a nonzero ideal. The **norm** of $\mathfrak{a}$ is
\begin{align*}
N(\mathfrak{a}) = |\mathcal{O}_L / \mathfrak{a}|.
\end{align*}
[/definition]
The nonzero hypothesis is essential. For the zero ideal, $\mathcal{O}_L / (0) \cong \mathcal{O}_L$ is infinite, so the quotient-index definition breaks down entirely. The quotient-index definition also explains why extending the element norm $N_{L/\mathbb{Q}}(\alpha)$ directly to non-principal ideals fails: that approach needs a generator, and a non-principal ideal has no single generator. Such an ideal has many generating sets, but none picks out a single element whose field norm we could take. The quotient-index definition bypasses this entirely: it depends only on the ideal as a subset of $\mathcal{O}_L$, not on any generator.
We proved in the chapter on ideals that $\mathcal{O}_L/\mathfrak{a}$ is always a finite set, so $N(\mathfrak{a}) \in \mathbb{N}$ is well-defined. It is immediate that $N(\mathfrak{a}) = 1$ if and only if $\mathfrak{a} = \mathcal{O}_L$: the quotient is trivial precisely when the ideal is the whole ring.
[example: Norm of a Principal Ideal Generated by an Integer]
Let $d \in \mathbb{Z}$ be nonzero and consider the principal ideal $\langle d \rangle = d\mathcal{O}_L$. Since $\mathcal{O}_L \cong \mathbb{Z}^n$ as abelian groups (where $n = [L:\mathbb{Q}]$), we have $d\mathcal{O}_L \cong (d\mathbb{Z})^n$ under the same identification, and thus
\begin{align*}
N(\langle d \rangle) = |\mathbb{Z}^n / (d\mathbb{Z})^n| = |\mathbb{Z}/d\mathbb{Z}|^n = |d|^n.
\end{align*}
[/example]
Before establishing the key multiplicativity property, we record a simple but useful observation. The proof is short, but the conclusion is the cornerstone of several later arguments: knowing that $N(\mathfrak{a})$ actually belongs to $\mathfrak{a}$ turns the abstract norm into a concrete element of the ring.
[quotetheorem:1592]
[citeproof:1592]
This says the norm of $\mathfrak{a}$ is always an integer lying inside $\mathfrak{a}$ itself. The theorem does NOT say that $N(\mathfrak{a})$ generates $\mathfrak{a}$: a prime ideal $\mathfrak{p}$ with $N(\mathfrak{p}) = p$ contains the rational prime $p$, but $\langle p \rangle \subsetneq \mathfrak{p}$ as soon as $n = [L:\mathbb{Q}] > 1$, since $N(\langle p \rangle) = p^n$. What it does give is that $\mathfrak{a}$ always contains a nonzero rational integer; in particular, $N(\mathfrak{a})$ is an explicit such integer. The forward connection to note: since $\mathfrak{a} \supseteq \langle N(\mathfrak{a}) \rangle$, every prime ideal dividing $\mathfrak{a}$ also divides $\langle N(\mathfrak{a}) \rangle$ and hence lies above a rational prime dividing $N(\mathfrak{a})$. A bound $N(\mathfrak{a}) \leq B$ therefore restricts the possible prime ideal divisors of $\mathfrak{a}$ to those lying above rational primes $\leq B$. This is a key step in the proof that the class group is finite.
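Both the definition of $N(\mathfrak{a})$ as an index and the fact that $N(\mathfrak{a}) \in \mathfrak{a}$ can be checked in examples by viewing an ideal of $\mathbb{Z}[\sqrt{-5}]$ as a sublattice of $\mathbb{Z}^2$. The following sketch reduces a spanning set to a triangular basis $((a, 0), (b, c))$, so the index is $ac$ and membership is a divisibility test. The helper names and the pair representation `(u, v)` for $u + v\sqrt{-5}$ are assumptions made for illustration.

```python
from functools import reduce
from math import gcd

D = -5  # work in Z[sqrt(-5)], the ring of integers of Q(sqrt(-5))


def lattice_basis(vectors):
    """Triangular Z-basis ((a, 0), (b, c)), a, c > 0, 0 <= b < a, of a rank-2 lattice."""
    pivot, firsts = [0, 0], []
    for r in map(list, vectors):
        while r[1] != 0:  # Euclid's algorithm on the second coordinate
            q = pivot[1] // r[1]
            pivot = [pivot[0] - q * r[0], pivot[1] - q * r[1]]
            pivot, r = r, pivot
        firsts.append(r[0])
    a = reduce(gcd, firsts, 0)
    if pivot[1] < 0:
        pivot = [-pivot[0], -pivot[1]]
    return ((a, 0), (pivot[0] % a, pivot[1]))


def ideal(*gens):
    """Lattice of the ideal generated by gens: Z-span of g and g*sqrt(D)."""
    spanning = []
    for a, b in gens:
        spanning += [(a, b), (D * b, a)]  # (a + b*sqrt(D)) * sqrt(D) = D*b + a*sqrt(D)
    return lattice_basis(spanning)


def ideal_norm(basis):
    """Index of the lattice in Z^2: the determinant of the triangular basis."""
    (a, _), (_, c) = basis
    return a * c


def contains(basis, element):
    """Does u + v*sqrt(D) lie in the lattice with basis ((a, 0), (b, c))?"""
    (a, _), (b, c) = basis
    u, v = element
    return v % c == 0 and (u - (v // c) * b) % a == 0


examples = {
    "<3>": ideal((3, 0)),
    "<3, 1+2sqrt(-5)>": ideal((3, 0), (1, 2)),
    "<2, 1+sqrt(-5)>": ideal((2, 0), (1, 1)),
}
report = {name: (ideal_norm(I), contains(I, (ideal_norm(I), 0))) for name, I in examples.items()}
print(report)
```

The first entry agrees with $N(\langle d \rangle) = |d|^n$ for $d = 3$, $n = 2$; the other two are ideals of prime norm that contain their norm without being generated by it.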
## Multiplicativity of the Norm
The key property of the norm is that it is completely multiplicative: it turns ideal multiplication into integer multiplication. This mirrors exactly the multiplicativity of the element norm $N_{L/\mathbb{Q}}(\alpha\beta) = N_{L/\mathbb{Q}}(\alpha) N_{L/\mathbb{Q}}(\beta)$.
[quotetheorem:1593]
The course presents two proofs of this result, which illuminate different aspects of the structure.
[proof]
**First approach.** By unique factorisation into prime ideals, it suffices to prove the case $\mathfrak{b} = \mathfrak{p}$ prime, i.e. to show $N(\mathfrak{a}\mathfrak{p}) = N(\mathfrak{a}) N(\mathfrak{p})$.
By the third isomorphism theorem,
\begin{align*}
\frac{\mathcal{O}_L}{\mathfrak{a}} \cong \frac{\mathcal{O}_L/\mathfrak{a}\mathfrak{p}}{\mathfrak{a}/\mathfrak{a}\mathfrak{p}},
\end{align*}
so it suffices to construct an isomorphism of abelian groups $\mathcal{O}_L/\mathfrak{p} \cong \mathfrak{a}/\mathfrak{a}\mathfrak{p}$.
By unique factorisation, $\mathfrak{a} \neq \mathfrak{a}\mathfrak{p}$, so we can choose $\alpha \in \mathfrak{a} \setminus \mathfrak{a}\mathfrak{p}$. Define the map
\begin{align*}
\phi: \mathcal{O}_L/\mathfrak{p} &\to \mathfrak{a}/\mathfrak{a}\mathfrak{p}, \quad x + \mathfrak{p} \mapsto \alpha x + \mathfrak{a}\mathfrak{p}.
\end{align*}
This is well-defined because $\alpha \in \mathfrak{a}$ implies $\alpha\mathfrak{p} \subseteq \mathfrak{a}\mathfrak{p}$.
For **injectivity**: if $\alpha x \in \mathfrak{a}\mathfrak{p}$, write $\langle \alpha \rangle = \mathfrak{a}\mathfrak{c}$ for some ideal $\mathfrak{c}$ (possible since $\langle \alpha \rangle \subseteq \mathfrak{a}$). Then $x\mathfrak{a}\mathfrak{c} \subseteq \mathfrak{a}\mathfrak{p}$, hence $x\mathfrak{c} \subseteq \mathfrak{p}$. Since $\mathfrak{p}$ is prime, either $\mathfrak{c} \subseteq \mathfrak{p}$ or $x \in \mathfrak{p}$. But $\mathfrak{c} \subseteq \mathfrak{p}$ would give $\langle \alpha \rangle = \mathfrak{a}\mathfrak{c} \subseteq \mathfrak{a}\mathfrak{p}$, contradicting $\alpha \notin \mathfrak{a}\mathfrak{p}$. So $x \in \mathfrak{p}$.
For **surjectivity**: we need $\mathfrak{a}\mathfrak{p} + \langle \alpha \rangle = \mathfrak{a}$. We have the inclusions $\mathfrak{a}\mathfrak{p} \subsetneq \mathfrak{a}\mathfrak{p} + \langle \alpha \rangle \subseteq \mathfrak{a}$, the first being strict because $\alpha \notin \mathfrak{a}\mathfrak{p}$. Since $\langle \alpha \rangle \subseteq \mathfrak{a}$, writing $\langle \alpha \rangle = \mathfrak{a}\mathfrak{c}$ gives $\mathfrak{a}\mathfrak{p} + \mathfrak{a}\mathfrak{c} = \mathfrak{a}(\mathfrak{p} + \mathfrak{c})$. Since $\alpha \notin \mathfrak{a}\mathfrak{p}$, we have $\mathfrak{c} \not\subseteq \mathfrak{p}$. Thus $\mathfrak{p} + \mathfrak{c} \supsetneq \mathfrak{p}$, and maximality of $\mathfrak{p}$ gives $\mathfrak{p} + \mathfrak{c} = \mathcal{O}_L$, so $\mathfrak{a}\mathfrak{p} + \langle \alpha \rangle = \mathfrak{a}$.
**Second approach.** By unique factorisation, write $\mathfrak{a}\mathfrak{b} = \mathfrak{p}_1^{a_1} \cdots \mathfrak{p}_r^{a_r}$ for distinct primes $\mathfrak{p}_i$. The Chinese Remainder Theorem gives
\begin{align*}
\frac{\mathcal{O}_L}{\mathfrak{p}_1^{a_1} \cdots \mathfrak{p}_r^{a_r}} \cong \frac{\mathcal{O}_L}{\mathfrak{p}_1^{a_1}} \times \cdots \times \frac{\mathcal{O}_L}{\mathfrak{p}_r^{a_r}}.
\end{align*}
It remains to show $|\mathcal{O}_L/\mathfrak{p}^e| = |\mathcal{O}_L/\mathfrak{p}|^e$ for a prime ideal $\mathfrak{p}$ and an exponent $e \geq 1$. The tower of ideals $\mathcal{O}_L \supset \mathfrak{p} \supset \mathfrak{p}^2 \supset \cdots \supset \mathfrak{p}^e$ gives $|\mathcal{O}_L/\mathfrak{p}^e| = \prod_{k=0}^{e-1} |\mathfrak{p}^k/\mathfrak{p}^{k+1}|$.
The key claim is that each successive quotient $\mathfrak{p}^k/\mathfrak{p}^{k+1}$ is a one-dimensional vector space over the residue field $\mathcal{O}_L/\mathfrak{p}$. To see this, pick any $\beta \in \mathfrak{p}^k \setminus \mathfrak{p}^{k+1}$, which exists since $\mathfrak{p}^k \neq \mathfrak{p}^{k+1}$ by unique factorisation. Multiplication by $\beta$ defines a map $\mathcal{O}_L \to \mathfrak{p}^k$ sending $x \mapsto \beta x$. This descends to a map $\mathcal{O}_L/\mathfrak{p} \to \mathfrak{p}^k/\mathfrak{p}^{k+1}$. The same injectivity and surjectivity argument as in the first approach shows this map is an isomorphism: injectivity holds because $\beta x \in \mathfrak{p}^{k+1}$ and $\langle \beta \rangle = \mathfrak{p}^k \mathfrak{c}$ with $\mathfrak{c} \not\subseteq \mathfrak{p}$ forces $x \in \mathfrak{p}$, and surjectivity follows from $\mathfrak{p}^k = \mathfrak{p}^{k+1} + \langle \beta \rangle$ by the maximality of $\mathfrak{p}$. Thus $|\mathfrak{p}^k/\mathfrak{p}^{k+1}| = |\mathcal{O}_L/\mathfrak{p}|$, and the result follows.
[/proof]
The multiplicativity theorem is more than an analogy with the element norm: it is the structural property that makes the ideal norm useful as an arithmetic invariant. The key point is that this result requires $\mathcal{O}_L$ to be a Dedekind domain. In a non-Dedekind order, multiplicativity can fail. For example, consider the order $R = \mathbb{Z}[\sqrt{-3}] \subsetneq \mathcal{O}_{\mathbb{Q}(\sqrt{-3})} = \mathbb{Z}[\frac{1+\sqrt{-3}}{2}]$. In $R$, ideals do not factor uniquely into prime ideals, and the identity $N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a}) N(\mathfrak{b})$ fails for certain pairs of ideals: with $\mathfrak{p} = \langle 2, 1 + \sqrt{-3} \rangle$ one finds $\mathfrak{p}^2 = 2\mathfrak{p}$, so $|R/\mathfrak{p}^2| = 8$ while $|R/\mathfrak{p}|^2 = 4$. The proof above uses unique factorisation (to reduce to prime powers), the fact that containment of ideals implies divisibility (to write $\langle \alpha \rangle = \mathfrak{a}\mathfrak{c}$), and the maximality of nonzero prime ideals. Nonzero prime ideals are still maximal in a general order, but unique factorisation and the containment-implies-divisibility step both fail there. Looking ahead, multiplicativity of the norm, combined with unique factorisation of ideals, is what makes the Dedekind zeta function $\zeta_L(s) = \sum_{\mathfrak{a} \neq 0} N(\mathfrak{a})^{-s}$ factor as an Euler product $\prod_{\mathfrak{p}} (1 - N(\mathfrak{p})^{-s})^{-1}$, extending the classical Euler product for the Riemann zeta function.
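The failure of multiplicativity in $\mathbb{Z}[\sqrt{-3}]$ can be checked by hand, or with a short computation. The sketch below is illustrative only: the ideal $\langle 2, 1 + \sqrt{-3} \rangle$ is our choice of test case, the helper names are hypothetical, and the index is computed using the standard fact that a full-rank sublattice of $\mathbb{Z}^2$ has index equal to the gcd of the $2 \times 2$ minors of any spanning set.

```python
from functools import reduce
from itertools import combinations
from math import gcd

D = -3  # the order R = Z[sqrt(-3)]; the pair (a, b) stands for a + b*sqrt(D)

def mul(x, y):
    """Product of two elements of Z[sqrt(D)]."""
    (a, b), (c, e) = x, y
    return (a * c + D * b * e, a * e + b * c)

def index_in_R(generators):
    """Index [R : I] of the ideal I generated by `generators`.

    As a Z-module, I is spanned by g and g*sqrt(D) for each generator g.
    The index of a full-rank sublattice of Z^2 is the gcd of the 2x2
    minors of a spanning set.
    """
    span = [v for g in generators for v in (g, mul(g, (0, 1)))]
    minors = [a * d - b * c for (a, b), (c, d) in combinations(span, 2)]
    return reduce(gcd, (abs(m) for m in minors))

def ideal_product(gens1, gens2):
    """Generators of the product ideal: all pairwise products."""
    return [mul(g, h) for g in gens1 for h in gens2]

p = [(2, 0), (1, 1)]            # the ideal <2, 1 + sqrt(-3)>
p_squared = ideal_product(p, p)

print(index_in_R(p), index_in_R(p_squared))
```

If the index of $\mathfrak{p}^2$ printed here differs from the square of the index of $\mathfrak{p}$, multiplicativity has failed for this pair, which is what the discussion above predicts for a non-maximal order.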
## The Norm and the Discriminant
Computing the ideal norm abstractly requires an integral basis of the ideal. Is there a shortcut via the discriminant? The norm and discriminant are coupled by a natural identity: the discriminant of any $\mathbb{Z}$-basis of $\mathfrak{a}$ differs from the field discriminant by exactly a factor of $N(\mathfrak{a})^2$. The precise formulation is the following theorem.
Recall the discriminant of a sequence $\alpha_1, \ldots, \alpha_n \in L$:
\begin{align*}
\Delta(\alpha_1, \ldots, \alpha_n) = \det\bigl(\operatorname{Tr}_{L/\mathbb{Q}}(\alpha_i \alpha_j)\bigr) = \det(\sigma_i(\alpha_j))^2,
\end{align*}
where $\sigma_1, \ldots, \sigma_n$ are the embeddings $L \hookrightarrow \mathbb{C}$.
[quotetheorem:1594]
[citeproof:1594]
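The identity $\Delta = N(\mathfrak{a})^2 D_L$ can be sanity-checked in a quadratic field directly from the trace-form definition. The following is a minimal sketch, assuming $L = \mathbb{Q}(\sqrt{5})$ as a test case; the function name `trace_form_disc` and the encoding of elements as rational pairs are our own conventions, not part of the notes.

```python
from fractions import Fraction

def trace_form_disc(d, basis):
    """Delta(b1, b2) = det(Tr(b_i * b_j)) in Q(sqrt(d)).

    Elements are pairs (x, y) of rationals standing for x + y*sqrt(d),
    so that Tr(x + y*sqrt(d)) = 2x.
    """
    def mul(u, v):
        return (u[0] * v[0] + d * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

    def tr(u):
        return 2 * u[0]

    b1, b2 = basis
    return tr(mul(b1, b1)) * tr(mul(b2, b2)) - tr(mul(b1, b2)) ** 2

half = Fraction(1, 2)
one = (Fraction(1), Fraction(0))
omega = (half, half)                                  # (1 + sqrt(5)) / 2

disc_OL = trace_form_disc(5, (one, omega))            # integral basis 1, omega
disc_two = trace_form_disc(5, ((Fraction(2), Fraction(0)),
                               (Fraction(1), Fraction(1))))  # basis 2, 2*omega of <2>
disc_order = trace_form_disc(5, (one, (Fraction(0), Fraction(1))))  # basis 1, sqrt(5)

print(disc_OL, disc_two, disc_order)
```

Here $\langle 2 \rangle$ has norm $4$ and $\mathbb{Z}[\sqrt{5}]$ has index $2$ in $\mathcal{O}_L$, so the second and third values should be $4^2$ and $2^2$ times the first.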
Before discussing the features of this formula, one consequence of part 1 deserves to be singled out, because it underpins every later argument in which ideals are treated as lattices. The freeness and the exact rank count are what permit Minkowski-style geometry of numbers to be applied to ideals at all.
[remark: Ideals Are Rank-$n$ Free $\mathbb{Z}$-Modules]
The first part of the theorem is structurally important: every nonzero ideal in $\mathcal{O}_L$ has a $\mathbb{Z}$-basis of exactly $n$ elements, the same rank as $\mathcal{O}_L$ itself. This is what allows ideals to be treated as lattices, which is the starting point for the proof that the class group is finite.
[/remark]
Several features of this theorem deserve attention. First, the nonzero hypothesis is essential: the zero ideal has $\mathcal{O}_L/(0) = \mathcal{O}_L$, which is infinite, so neither the rank statement nor the discriminant formula makes sense. Second, the $\mathbb{Z}$-basis provided by the theorem is not canonical: just as $\mathcal{O}_L$ has many integral bases, an ideal $\mathfrak{a}$ admits many equally natural $\mathbb{Z}$-bases, and the theorem only asserts the existence of one. This non-canonicality is not a defect. Two $\mathbb{Z}$-bases of $\mathfrak{a}$ differ by a matrix in $\mathrm{GL}_n(\mathbb{Z})$, whose determinant is $\pm 1$, and the discriminant scales by the square of that determinant, so $\Delta(\alpha_1, \ldots, \alpha_n)$ is the same for every basis. Correspondingly, the right-hand side of $\Delta = N(\mathfrak{a})^2 D_L$ involves no choice of basis. This is precisely what makes the discriminant of an ideal well-defined as an invariant of $\mathfrak{a}$, not of a particular generating set. Third, note what the theorem does NOT say: it does not say that every element of $\mathfrak{a}$ lies in a $\mathbb{Z}$-basis. The elements $\alpha_1, \ldots, \alpha_n$ form a $\mathbb{Z}$-basis of $\mathfrak{a}$ as a free $\mathbb{Z}$-module, and most elements of $\mathfrak{a}$ are non-trivial $\mathbb{Z}$-linear combinations of the $\alpha_i$.
A particularly clean application is the following criterion for an integral basis. The key identity is $\Delta(\alpha_1, \ldots, \alpha_n) = N(\mathfrak{a})^2 D_L$: as soon as the left side is square-free, the factor $N(\mathfrak{a})^2$ must equal 1.
[quotetheorem:1595]
[citeproof:1595]
This theorem is the one most often used to verify integral bases by inspection. However, note carefully what it does NOT say. First, the converse fails: $D_L$ being square-free does not force $\operatorname{disc}(p_\alpha)$ to be square-free for any particular generator $\alpha$. Second, when $\operatorname{disc}(p_\alpha)$ has a square factor, $\mathcal{O}_L$ can be strictly larger than $\mathbb{Z}[\alpha]$. The paradigmatic example is $\alpha = \sqrt{5}$, where $\operatorname{disc}(x^2 - 5) = 4 \cdot 5 = 20$. The square factor 4 signals that $\mathcal{O}_{\mathbb{Q}(\sqrt{5})}$ might be larger than $\mathbb{Z}[\sqrt{5}]$, and indeed $\mathcal{O}_{\mathbb{Q}(\sqrt{5})} = \mathbb{Z}[\frac{1+\sqrt{5}}{2}]$ is strictly larger. Similarly, $\alpha = \sqrt{-3}$ has $\operatorname{disc}(x^2 + 3) = -12 = -4 \cdot 3$, again with a square factor, and $\mathcal{O}_{\mathbb{Q}(\sqrt{-3})} = \mathbb{Z}[\frac{1+\sqrt{-3}}{2}] \supsetneq \mathbb{Z}[\sqrt{-3}]$. A square factor is only a warning, though: $\alpha = \sqrt{2}$ has $\operatorname{disc}(x^2 - 2) = 8 = 4 \cdot 2$, yet $\mathcal{O}_{\mathbb{Q}(\sqrt{2})} = \mathbb{Z}[\sqrt{2}]$. To appreciate why the $N(\mathfrak{a})^2$ factor in the discriminant formula is essential, observe that without it the formula would say $\Delta(\alpha_1, \ldots, \alpha_n) = D_L$ for every ideal, which would force the discriminant to be constant across all $\mathbb{Z}$-bases of all nonzero ideals. This is manifestly false: taking $\mathfrak{a} = \langle d \rangle$ for $d \in \mathbb{Z}$ with $|d| > 1$, and an integral basis $\alpha_1', \ldots, \alpha_n'$ of $\mathcal{O}_L$, gives $\Delta(d\alpha_1', \ldots, d\alpha_n') = d^{2n} D_L \neq D_L$.
[explanation: The Discriminant Criterion in Practice]
The theorem generalises naturally: the condition $\mathfrak{a} = \mathcal{O}_L$ does not actually require $\mathfrak{a}$ to be an ideal in the ring-theoretic sense — it only needs to be a subgroup of $\mathcal{O}_L$ isomorphic to $\mathbb{Z}^n$, since the quotient $\mathcal{O}_L/\mathfrak{a}$ is well-defined for any such subgroup.
This gives a powerful method for proving that a given set of algebraic integers is an integral basis. Let $\alpha$ be an algebraic integer with $L = \mathbb{Q}(\alpha)$ and $n = [L:\mathbb{Q}]$. The subgroup $\mathbb{Z}[\alpha] = \{a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1} : a_i \in \mathbb{Z}\}$ sits inside $\mathcal{O}_L$, and
\begin{align*}
\Delta(1, \alpha, \alpha^2, \ldots, \alpha^{n-1}) = \operatorname{disc}(p_\alpha),
\end{align*}
where $p_\alpha$ is the minimal polynomial of $\alpha$ over $\mathbb{Q}$.
If $\operatorname{disc}(p_\alpha)$ is square-free, then $\mathbb{Z}[\alpha] = \mathcal{O}_L$ and $\{1, \alpha, \ldots, \alpha^{n-1}\}$ is an integral basis.
Even when $\operatorname{disc}(p_\alpha)$ is not square-free, the formula still gives useful information. Write $\operatorname{disc}(p_\alpha) = d^2 m$ where $m$ is square-free. Then the index $N(\mathbb{Z}[\alpha]) = |\mathcal{O}_L/\mathbb{Z}[\alpha]|$ divides $d$, because $N(\mathbb{Z}[\alpha])^2 D_L = d^2 m$ with $m$ square-free. By Lagrange's theorem the order of any $x + \mathbb{Z}[\alpha] \in \mathcal{O}_L/\mathbb{Z}[\alpha]$ divides the index and hence divides $d$, so $d \cdot x \in \mathbb{Z}[\alpha]$ for all $x \in \mathcal{O}_L$, yielding the inclusion
\begin{align*}
\mathbb{Z}[\alpha] \subseteq \mathcal{O}_L \subseteq \tfrac{1}{d}\mathbb{Z}[\alpha].
\end{align*}
For example, take $\alpha = \sqrt{a}$ with $a \in \mathbb{Z}$ square-free. The minimal polynomial is $p_\alpha = x^2 - a$, with $\operatorname{disc}(p_\alpha) = 4a$. Here $d = 2$ and we recover
\begin{align*}
\mathbb{Z}[\sqrt{a}] \subseteq \mathcal{O}_{\mathbb{Q}(\sqrt{a})} \subseteq \tfrac{1}{2}\mathbb{Z}[\sqrt{a}],
\end{align*}
which matches the classical determination of the ring of integers of a quadratic field.
[/explanation]
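The decomposition $\operatorname{disc}(p_\alpha) = d^2 m$ is easy to automate for small examples. The sketch below assumes the standard closed formula for the discriminant of a monic cubic; the function names are hypothetical, and the cubic $x^3 - x - 1$ is our own illustrative choice rather than an example from the notes.

```python
def cubic_disc(a, b, c):
    """Discriminant of the monic cubic x^3 + a*x^2 + b*x + c."""
    return a * a * b * b - 4 * b ** 3 - 4 * a ** 3 * c - 27 * c * c + 18 * a * b * c

def square_part(n):
    """Write n = d^2 * m with m square-free and d > 0; return (d, m)."""
    d, m, f = 1, n, 2
    while f * f <= abs(m):
        while m % (f * f) == 0:   # strip every factor f^2 from m
            m //= f * f
            d *= f
        f += 1
    return d, m

# x^3 - x - 1: if the discriminant is square-free, then d = 1 and Z[alpha] = O_L.
print(square_part(cubic_disc(0, -1, -1)))

# x^2 - 5 has discriminant 20: here the index of Z[sqrt(5)] divides d.
print(square_part(20))
```

The value of `d` returned bounds the index $|\mathcal{O}_L / \mathbb{Z}[\alpha]|$ from above in the divisibility sense; it does not by itself determine the index.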
## The Element Norm and the Ideal Norm
For a principal ideal $\langle \alpha \rangle$, we now have two norms in play: the field-theoretic norm $N_{L/\mathbb{Q}}(\alpha) = \prod_{i=1}^n \sigma_i(\alpha)$, defined via the embeddings of $L$, and the ideal norm $N(\langle \alpha \rangle) = |\mathcal{O}_L/\langle \alpha \rangle|$, defined via the quotient ring. Do they agree? Must we always take the absolute value of $N_{L/\mathbb{Q}}(\alpha)$, or do the two norms happen to coincide? And critically: why can we not simply define $N(\mathfrak{a}) = N_{L/\mathbb{Q}}(\alpha)$ whenever $\mathfrak{a} = \langle \alpha \rangle$ is principal, and avoid the quotient-index definition altogether? The naive approach fails because $N_{L/\mathbb{Q}}(\alpha)$ is not an ideal invariant: replacing $\alpha$ by a unit multiple $u\alpha$ (which generates the same ideal) gives $N_{L/\mathbb{Q}}(u\alpha) = N_{L/\mathbb{Q}}(u) N_{L/\mathbb{Q}}(\alpha)$, and since $N_{L/\mathbb{Q}}(u) = \pm 1$ for a unit, this can change the sign. The quotient-index $|\mathcal{O}_L/\langle u\alpha \rangle| = |\mathcal{O}_L/\langle \alpha \rangle|$ is invariant under unit multiples. The following theorem shows that, up to sign, the two norms are the same.
[quotetheorem:1596]
[citeproof:1596]
Before closing the chapter with the final reconciliation, one technical point remains. The squared equality obtained from the discriminant formula only determines the two norms up to sign, and the two norms are objects of genuinely different type — a cardinality on one side and a signed integer on the other. It is worth recording explicitly how this sign mismatch is resolved, because it clarifies why the ideal norm is the universal size invariant on both principal and non-principal ideals alike.
[remark: Why the Absolute Value]
The element norm $N_{L/\mathbb{Q}}(\alpha)$ can be negative (for example, $N_{\mathbb{Q}(\sqrt{2})/\mathbb{Q}}(\sqrt{2}) = -2$), while the ideal norm $N(\langle \alpha \rangle) = |\mathcal{O}_L/\langle \alpha \rangle|$ is a cardinality and hence always positive. The absolute value reconciles the two.
[/remark]
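The sign ambiguity can be seen concretely in $\mathbb{Z}[\sqrt{2}]$. The following sketch compares the signed element norm with the index $|\mathcal{O}_L/\langle \alpha \rangle|$, computed as the absolute determinant of multiplication by $\alpha$ on the basis $\{1, \sqrt{2}\}$. The elements $3 + \sqrt{2}$ and the unit $1 + \sqrt{2}$ are our illustrative choices, and the helper names are hypothetical.

```python
def mul(x, y, d):
    """Product in Z[sqrt(d)]; the pair (a, b) stands for a + b*sqrt(d)."""
    (a, b), (c, e) = x, y
    return (a * c + d * b * e, a * e + b * c)

def element_norm(x, d):
    """Signed field norm of a + b*sqrt(d)."""
    a, b = x
    return a * a - d * b * b

def ideal_index(x, d):
    """|Z[sqrt(d)] / <x>| as |det| of multiplication by x on the basis {1, sqrt(d)}.

    Assumes Z[sqrt(d)] is the full ring of integers (d = 2 or 3 mod 4, square-free).
    """
    row1 = mul(x, (1, 0), d)   # image of 1
    row2 = mul(x, (0, 1), d)   # image of sqrt(d)
    return abs(row1[0] * row2[1] - row1[1] * row2[0])

d = 2
alpha = (3, 1)                 # 3 + sqrt(2)
unit = (1, 1)                  # 1 + sqrt(2), a unit of norm -1
beta = mul(unit, alpha, d)     # generates the same ideal as alpha

print(element_norm(alpha, d), element_norm(beta, d))  # opposite signs
print(ideal_index(alpha, d), ideal_index(beta, d))    # equal indices
```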
This theorem closes the loop between the two norms that have run in parallel throughout the course. The surprising consequence is that the quotient-index definition — which requires no choice of generator and works for all ideals — gives exactly the same value as the embedding-based definition on principal ideals. The ideal norm is thus the correct extension of the element norm to all ideals, not merely a convenient substitute. Combined with multiplicativity, the theorem now allows us to compute ideal norms of explicit prime ideals directly from the norms of elements: if $\mathfrak{p} = \langle \pi \rangle$ is principal (which happens when the class group is trivial, or over a suitable localisation), then $N(\mathfrak{p}) = |N_{L/\mathbb{Q}}(\pi)|$. More importantly, in Chapter 6 we will use norms to bound the sizes of ideal classes and establish finiteness of the class group via the Minkowski bound.
The ideal norm allows us to count ideals, but the fine arithmetic structure — which ideals are principal, how they cluster into classes — is determined by the prime ideals themselves. Every prime ideal lies above a unique rational prime, and the way $\langle p \rangle$ factors into prime powers is entirely controlled by how a generator's minimal polynomial reduces modulo $p$.
# 5. Structure of prime ideals
Having established that every ideal in $\mathcal{O}_L$ factors uniquely into prime ideals, a natural question arises: what are these prime ideals, and how can we find them concretely? This chapter answers both questions. Every prime ideal of $\mathcal{O}_L$ lies above a unique rational prime $p$, and the way $\langle p \rangle$ factors in $\mathcal{O}_L$ is completely controlled by how the minimal polynomial of a generator reduces modulo $p$. The central tool is Dedekind's criterion, which turns the arithmetic problem of factoring ideals into a problem in polynomial arithmetic over $\mathbb{F}_p$.
## Every prime ideal lies above a rational prime
How do we get a handle on all prime ideals of $\mathcal{O}_L$? Without some structural reduction the classification would be hopeless — $\mathcal{O}_L$ could, in principle, carry prime ideals with no obvious connection to rational primes. The key observation is that every prime ideal of $\mathcal{O}_L$ sits above a unique rational prime $p \in \mathbb{Z}$, reducing the entire classification to a finite problem for each prime $p$.
[quotetheorem:1597]
[citeproof:1597]
[explanation: Significance of primality and the road ahead]
The primality of $p$ is not incidental to the argument: it is forced. If instead $\mathfrak{p} \cap \mathbb{Z} = m\mathbb{Z}$ for some composite $m = ab$ with $1 < a, b < m$, then $ab \in \mathfrak{p}$ would force $a \in \mathfrak{p}$ or $b \in \mathfrak{p}$, meaning $a$ or $b$ would lie in $\mathfrak{p} \cap \mathbb{Z} = m\mathbb{Z}$, and hence $m \mid a$ or $m \mid b$. This is impossible because $0 < a, b < m$. The point is that primality of $\mathfrak{p}$ in $\mathcal{O}_L$ descends to primality of the generator of $\mathfrak{p} \cap \mathbb{Z}$ in $\mathbb{Z}$.
The theorem also does not say how many prime ideals lie above a given $p$, nor how they are structured. That information is contained in the factorization $\langle p \rangle = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_m^{e_m}$, and the fundamental constraint on this factorization is the identity $\sum_{i=1}^m e_i f_i = n$ which we will derive in the next section. The integers $e_i$ (ramification indices) and $f_i$ (inertia degrees) thus carry all the arithmetic information about how $p$ behaves in $\mathcal{O}_L$.
[/explanation]
## Ramification, inertia, and splitting
Given a rational prime $p$, the factorization $\langle p \rangle = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_m^{e_m}$ can occur in qualitatively different ways: the primes might be distinct or repeated, the exponents might all equal one or some might exceed one, and the residue degrees might be large or small. To classify these patterns and communicate about them efficiently, we introduce standard terminology. Write the factorization
\begin{align*}
\langle p \rangle = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_m^{e_m}
\end{align*}
for distinct prime ideals $\mathfrak{p}_i$ with $N(\mathfrak{p}_i) = p^{f_i}$. Taking norms gives $p^n = \prod p^{e_i f_i}$, so we have the fundamental identity
\begin{align*}
\sum_{i=1}^{m} e_i f_i = n.
\end{align*}
The integers $e_i$ and $f_i$ measure how $p$ behaves in the extension and have classical names.
[definition: Ramification Index]
In the factorization $\langle p \rangle = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_m^{e_m}$, the integers $e_1, \ldots, e_m$ are called the **ramification indices** of $p$.
[/definition]
[definition: Ramified Prime]
A prime $p$ is **ramified** in $\mathcal{O}_L$ if some $e_i > 1$, i.e., some prime ideal appears with multiplicity greater than one in the factorization of $\langle p \rangle$.
[/definition]
[definition: Inert Prime]
A prime $p$ is **inert** in $\mathcal{O}_L$ if $m = 1$ and $e_1 = 1$, i.e., $\langle p \rangle$ itself remains a prime ideal in $\mathcal{O}_L$.
[/definition]
[definition: Split Prime]
A prime $p$ **splits completely** in $\mathcal{O}_L$ if $e_1 = \cdots = e_m = 1$ and $f_1 = \cdots = f_m = 1$, so $m = n$. In this case $\langle p \rangle$ factors into $n$ distinct prime ideals each of norm $p$.
[/definition]
[remark: These cases do not exhaust all possibilities]
A prime can exhibit intermediate behaviour: for instance, $\langle p \rangle$ might factor into two prime ideals of different residue degrees, or split into some ideals with $e_i > 1$ and others with $e_i = 1$. The terms above name the extreme cases that arise most frequently. Ramification is particularly significant and will return prominently later in the course.
[/remark]
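To see how many shapes the identity $\sum e_i f_i = n$ permits, one can simply enumerate them. The sketch below lists every multiset of pairs $(e_i, f_i)$ compatible with the identity for a given degree $n$; the function name is hypothetical, and the enumeration says nothing about which patterns actually occur in a particular field.

```python
def splitting_patterns(n):
    """All multisets of pairs (e_i, f_i) with sum(e_i * f_i) == n.

    Each pattern is a tuple of (e, f) pairs in non-decreasing order.
    """
    pieces = [(e, f) for e in range(1, n + 1) for f in range(1, n + 1) if e * f <= n]

    def extend(remaining, start):
        if remaining == 0:
            yield ()
            return
        # Only use pieces from index `start` on, so each multiset appears once.
        for i in range(start, len(pieces)):
            e, f = pieces[i]
            if e * f <= remaining:
                for rest in extend(remaining - e * f, i):
                    yield ((e, f),) + rest

    return list(extend(n, 0))

for pattern in splitting_patterns(2):
    print(pattern)   # the split, inert and ramified shapes for a quadratic field
print(len(splitting_patterns(3)))
```

For $n = 2$ the three named cases are the only possibilities, while for larger $n$ the list includes the intermediate shapes mentioned in the remark above.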
## Dedekind's criterion
The key question is now computational: given $p$ and $L$, how does one explicitly find the $\mathfrak{p}_i$ and $e_i$? The answer, under a mild hypothesis, is completely explicit.
The idea is as follows. If $\alpha \in \mathcal{O}_L$ is an element such that $\mathbb{Z}[\alpha]$ is "close" to $\mathcal{O}_L$ in the sense that $p$ does not divide the index $|\mathcal{O}_L / \mathbb{Z}[\alpha]|$, then the factorization of $\langle p \rangle$ can be read off directly from the factorization of the minimal polynomial of $\alpha$ modulo $p$.
[quotetheorem:1598]
[citeproof:1598]
The hypothesis $p \nmid |\mathcal{O}_L / \mathbb{Z}[\alpha]|$ is not a cosmetic assumption — the criterion can give a completely wrong answer when it fails. The following example illustrates this concretely.
[example: Failure when $p$ divides the index]
Let $L = \mathbb{Q}(\sqrt{5})$. Since $5 \equiv 1 \pmod{4}$, the ring of integers is $\mathcal{O}_L = \mathbb{Z}[\omega]$ where $\omega = \frac{1+\sqrt{5}}{2}$, with minimal polynomial $h(x) = x^2 - x - 1$.
Suppose instead we try to apply Dedekind's criterion with $\theta = \sqrt{5}$, so $\mathbb{Z}[\theta]$ has index $2$ in $\mathcal{O}_L$ (since $\omega = \frac{1+\theta}{2} \notin \mathbb{Z}[\theta]$). At $p = 2$ we have $2 \mid 2 = |\mathcal{O}_L / \mathbb{Z}[\theta]|$, so the hypothesis fails.
Now factor the minimal polynomial of $\theta$: $x^2 - 5 \equiv x^2 + 1 = (x+1)^2 \pmod{2}$. Dedekind's formula would give $\langle 2 \rangle = \mathfrak{p}^2$ with $\mathfrak{p} = \langle 2, \sqrt{5} + 1 \rangle$, suggesting that $2$ ramifies in $L$.
But this is wrong. Applying Dedekind's criterion correctly with $\omega$: we reduce $h(x) = x^2 - x - 1 \equiv x^2 + x + 1 \pmod{2}$. This polynomial is irreducible over $\mathbb{F}_2$, so $2$ is in fact **inert** in $\mathcal{O}_L$: $\langle 2 \rangle$ is already prime. The criterion misled us because we chose $\theta$ whose index is divisible by $2$.
The lesson is that Dedekind's criterion is a recipe for $\alpha$, not for $L$: the choice of generator matters, and one must verify the index hypothesis before trusting the output.
[/example]
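The two reductions in this example can be reproduced by brute force over $\mathbb{F}_p$. The following is a small sketch with hypothetical helper names; it finds the roots of a polynomial modulo $p$ together with their multiplicities by repeated synthetic division, which suffices to factor quadratics.

```python
def eval_mod(coeffs, x, p):
    """Evaluate a polynomial (coefficients from the leading term down) at x mod p."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % p
    return acc

def divide_by_linear(coeffs, r, p):
    """Quotient of the polynomial by (x - r) over F_p; valid when r is a root."""
    out, acc = [], 0
    for c in coeffs[:-1]:
        acc = (acc * r + c) % p
        out.append(acc)
    return out

def roots_with_multiplicity(coeffs, p):
    """Map each root r in F_p to its multiplicity."""
    result = {}
    for r in range(p):
        q, mult = list(coeffs), 0
        while len(q) > 1 and eval_mod(q, r, p) == 0:
            q = divide_by_linear(q, r, p)
            mult += 1
        if mult:
            result[r] = mult
    return result

print(roots_with_multiplicity([1, 0, -5], 2))    # x^2 - 5 mod 2
print(roots_with_multiplicity([1, -1, -1], 2))   # x^2 - x - 1 mod 2
```

A double root for $x^2 - 5$ and no root at all for $x^2 - x - 1$ correspond to the misleading "ramified" reading and the correct "inert" reading respectively.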
[remark: Practical use of Dedekind's criterion]
In practice, if $\mathcal{O}_L = \mathbb{Z}[\alpha]$ for some $\alpha$, then the condition $p \nmid |\mathcal{O}_L/\mathbb{Z}[\alpha]|$ holds for all primes $p$, and Dedekind's criterion applies everywhere. If not, one seeks an $\alpha$ with $\mathcal{O}_L/\mathbb{Z}[\alpha]$ of order coprime to the prime of interest. For quadratic fields $\mathbb{Q}(\sqrt{d})$, the ring $\mathbb{Z}[\sqrt{d}]$ has index at most $2$ in $\mathcal{O}_L$, so it works for all odd primes. For general number fields and primes dividing the index for all natural choices of $\alpha$, more sophisticated methods such as the Montes algorithm and Newton polygon factorisation are needed.
[/remark]
## Factoring primes in quadratic fields
Quadratic fields are the first nontrivial case where Dedekind's criterion can be applied systematically, and the results are particularly clean: the splitting behaviour of every odd prime $p$ is completely captured by a single invariant, the Legendre symbol $\left(\frac{d}{p}\right)$. This is not just a computational convenience: the connection between Legendre symbols and splitting is the starting point for quadratic reciprocity and, more broadly, for class field theory. We work out the complete picture for $L = \mathbb{Q}(\sqrt{d})$ with $d \neq 0, 1$ square-free.
### Odd primes in $\mathbb{Q}(\sqrt{d})$
For an odd prime $p$, the ring $\mathbb{Z}[\sqrt{d}]$ has index $1$ or $2$ in $\mathcal{O}_L$, so $p \nmid |\mathcal{O}_L / \mathbb{Z}[\sqrt{d}]|$ and Dedekind applies. The minimal polynomial is $x^2 - d$, and we must factor $x^2 - d \bmod p$.
The three cases are determined entirely by the Legendre symbol $\left(\frac{d}{p}\right)$:
1. **$\left(\frac{d}{p}\right) = 1$** (i.e., $d$ is a nonzero square mod $p$): write $x^2 - d \equiv (x-r)(x+r) \pmod{p}$ for some integer $r$ with $r^2 \equiv d$ and $r \not\equiv 0 \pmod{p}$; the two factors are distinct because $p$ is odd. Dedekind gives
\begin{align*}
\langle p \rangle = \mathfrak{p}_1 \mathfrak{p}_2, \quad \mathfrak{p}_1 = \langle p, \sqrt{d} - r \rangle, \quad \mathfrak{p}_2 = \langle p, \sqrt{d} + r \rangle,
\end{align*}
with $N(\mathfrak{p}_1) = N(\mathfrak{p}_2) = p$. So $p$ **splits**.
2. **$\left(\frac{d}{p}\right) = -1$** (i.e., $d$ is not a square mod $p$): $x^2 - d$ is irreducible mod $p$. Dedekind gives $\langle p \rangle = \mathfrak{p}$ prime. So $p$ is **inert**.
3. **$\left(\frac{d}{p}\right) = 0$** (i.e., $p \mid d$): $x^2 - d \equiv x^2 \pmod{p}$, a repeated root. Dedekind gives
\begin{align*}
\langle p \rangle = \mathfrak{p}^2, \quad \mathfrak{p} = \langle p, \sqrt{d} \rangle.
\end{align*}
So $p$ **ramifies**.
The Legendre symbol thus completely encodes the splitting behaviour of odd primes in quadratic fields.
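The three cases above translate directly into a few lines of code. This is a minimal sketch, assuming $p$ is an odd prime and $d$ is square-free; the Legendre symbol is computed via Euler's criterion, and the function names are our own.

```python
def legendre(d, p):
    """Legendre symbol (d/p) for an odd prime p, via Euler's criterion."""
    s = pow(d % p, (p - 1) // 2, p)   # equals 0, 1 or p - 1
    return -1 if s == p - 1 else s

def odd_prime_behaviour(d, p):
    """Splitting behaviour of the odd prime p in Q(sqrt(d))."""
    return {1: "splits", -1: "inert", 0: "ramifies"}[legendre(d, p)]

for p in (3, 5, 7, 11):
    print(p, odd_prime_behaviour(-11, p))
```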
[example: Factoring 5 in $\mathbb{Q}(\sqrt{-11})$]
Consider $L = \mathbb{Q}(\sqrt{-11})$. We have $d = -11$ and $p = 5$. The ring $\mathbb{Z}[\sqrt{-11}]$ has index $2$ in $\mathcal{O}_L$ (since $d \equiv 1 \pmod 4$, we have $\mathcal{O}_L = \mathbb{Z}[\frac{1+\sqrt{-11}}{2}]$), but $5 \nmid 2$, so Dedekind applies. The minimal polynomial of $\sqrt{-11}$ is $x^2 + 11$. Reducing mod $5$:
\begin{align*}
x^2 + 11 \equiv x^2 - 4 = (x-2)(x+2) \pmod 5.
\end{align*}
So $5$ splits, and
\begin{align*}
\langle 5 \rangle = \langle 5, \sqrt{-11} + 2 \rangle \langle 5, \sqrt{-11} - 2 \rangle.
\end{align*}
[/example]
### The prime 2 in $\mathbb{Q}(\sqrt{d})$
The case $p = 2$ requires more care. If $d \equiv 2$ or $3 \pmod 4$, we use $\alpha = \sqrt{d}$; if $d \equiv 1 \pmod 4$, we use $\alpha = \frac{1+\sqrt{d}}{2}$ (for which $\mathcal{O}_L = \mathbb{Z}[\alpha]$, so Dedekind applies with $p = 2$).
[quotetheorem:1599]
[citeproof:1599]
These computations reveal a striking pattern connecting ramification to the discriminant.
[explanation: Ramification and the discriminant]
Recall that the discriminant of $L = \mathbb{Q}(\sqrt{d})$ is
\begin{align*}
D_L = \begin{cases} d & \text{if } d \equiv 1 \pmod 4, \\ 4d & \text{if } d \equiv 2, 3 \pmod 4. \end{cases}
\end{align*}
Inspecting the splitting behaviour case by case, one finds that a prime $p$ ramifies in $L$ if and only if $p \mid D_L$. For odd $p$: $p$ ramifies iff $p \mid d$, and iff $p \mid D_L$ (since $D_L$ and $d$ have the same odd prime divisors). For $p = 2$: $2$ ramifies iff $d \equiv 2, 3 \pmod 4$, iff $4 \mid D_L$, iff $2 \mid D_L$.
This relationship between ramification and the discriminant is not a coincidence: it holds in complete generality, as stated in the theorem below. The discriminant thus captures, in a single integer, all the primes where the arithmetic of $\mathcal{O}_L$ is "singular" over $\mathbb{Z}$.
[/explanation]
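The case-by-case inspection can be confirmed mechanically over a range of fields. The sketch below encodes the ramification conditions quoted above ($p \mid d$ for odd $p$, and $d \equiv 2, 3 \pmod 4$ for $p = 2$) and compares them with divisibility of $D_L$; the range of $d$ and the list of primes are arbitrary choices, and the helper names are hypothetical.

```python
def quadratic_disc(d):
    """Discriminant of Q(sqrt(d)) for square-free d."""
    return d if d % 4 == 1 else 4 * d

def ramifies(d, p):
    """Ramification as read off from the case analysis in the text."""
    if p == 2:
        return d % 4 in (2, 3)
    return d % p == 0

def is_squarefree(d):
    return all(d % (f * f) != 0 for f in range(2, int(abs(d) ** 0.5) + 1))

fields = [d for d in range(-50, 51) if d not in (0, 1) and is_squarefree(d)]
primes = [2, 3, 5, 7, 11, 13]

mismatches = [(d, p) for d in fields for p in primes
              if ramifies(d, p) != (quadratic_disc(d) % p == 0)]
print(mismatches)   # an empty list means the two conditions agree on this range
```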
## A corollary on complete splitting
Can a small prime split completely in a large-degree extension? When Dedekind's criterion applies, this is a very concrete question about polynomial arithmetic mod $p$: complete splitting requires the reduction $\bar{g}(x)$ of the minimal polynomial to factor into $n$ distinct linear factors over $\mathbb{F}_p$, which demands $n$ distinct elements of $\mathbb{F}_p$. Under that hypothesis, Dedekind's criterion gives a clean negative answer.
[quotetheorem:1600]
[citeproof:1600]
The bound $p \geq n$ is only a necessary condition, not a sufficient one. Even for $p \geq n$, most primes will fail to split completely: the density of completely split primes in any Galois extension of degree $n > 1$ is $1/n$, a fact made precise by the Chebotarev density theorem. The theorem above captures the simplest obstruction, but the full picture of which primes split completely is a much deeper story tied to the Frobenius conjugacy class.
[example: Complete splitting with $p < n$]
Let $L = \mathbb{Q}(\alpha)$ where $\alpha$ has minimal polynomial $x^3 - x^2 - 2x - 8$, so $n = 3$ and $p = 2 < 3$. One can verify (and it appears on the example sheets) that $2$ splits completely, meaning $\mathcal{O}_L / 2\mathcal{O}_L \cong \mathbb{F}_2 \times \mathbb{F}_2 \times \mathbb{F}_2$. The corollary then tells us something about the structure of $\mathcal{O}_L$: for every $\beta \in \mathcal{O}_L$ with $L = \mathbb{Q}(\beta)$, the index $|\mathcal{O}_L / \mathbb{Z}[\beta]|$ must be even. In other words, $\mathcal{O}_L \neq \mathbb{Z}[\beta]$ for every $\beta$: no single element generates $\mathcal{O}_L$ as a ring.
[/example]
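The counting obstruction behind the corollary is easy to verify exhaustively for $p = 2$ and $n = 3$. The sketch below checks every monic cubic over $\mathbb{F}_2$ and records the largest number of distinct roots; the helper name is hypothetical.

```python
from itertools import product

def distinct_roots(coeffs, p):
    """Number of distinct roots in F_p; coefficients from the leading term down."""
    def value(x):
        acc = 0
        for c in coeffs:
            acc = (acc * x + c) % p
        return acc
    return sum(1 for x in range(p) if value(x) == 0)

p, n = 2, 3
# Largest number of distinct roots among all monic cubics over F_2.
best = max(distinct_roots((1,) + tail, p) for tail in product(range(p), repeat=n))
# The cubic from the example, reduced mod 2.
dedekind = distinct_roots((1, -1, -2, -8), p)

print(best, dedekind)
```

Since no monic cubic over $\mathbb{F}_2$ has three distinct roots, the reduction of a minimal polynomial can never exhibit complete splitting of $2$ in a cubic field, which is why the index hypothesis of Dedekind's criterion must fail for every generator in this example.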
## Ramification and the discriminant: the general theorem
We have seen case-by-case in quadratic fields that the discriminant appears to flag exactly the ramified primes. Is this a universal phenomenon? The general theorem confirms it: $p$ ramifies in $\mathcal{O}_L$ if and only if $p \mid D_L$.
[quotetheorem:1601]
The proof requires more commutative algebra than we develop in these notes (it rests on the structure theorem for finite étale algebras over a Dedekind domain), but the intuition has already been carried through: ramification of $p$ means $\mathcal{O}_L/\langle p \rangle$ is not reduced, and non-reducedness is detected by the vanishing of a discriminant of the residue algebra, which in turn is $D_L \pmod p$. Since $D_L$ is a nonzero integer and so has only finitely many prime divisors, this theorem implies that only finitely many primes ramify in any given number field, a fact of fundamental importance in class field theory and beyond.
Knowing how to factor rational primes into ideals is the first step toward understanding the class group. A priori the class group could be infinite; the Minkowski bound, drawing on convex geometry, reduces the problem to a finite search: every ideal class contains a representative of bounded norm, and there are only finitely many ideals of bounded norm, forcing the class group to be finite.
# 6. Minkowski bound and finiteness of the class group
Dedekind's criterion gave us a complete algorithm for factoring rational primes in a number field, but it leaves open a more global question: is the class group of a given number field finite? We cannot check every prime ideal individually. The resolution comes from a beautiful piece of geometry — by embedding a number field into Euclidean space, we can use the geometry of convex bodies to show that every ideal class has a representative of bounded norm. Once we have such a bound, the finiteness of the class group follows immediately from the elementary fact that there are only finitely many ideals of any given norm.
We develop the argument in two stages. The case of imaginary quadratic fields is treated first, where the geometry is two-dimensional and entirely pictorial. We then carry out the full argument in arbitrary dimensions using Minkowski's theorem on lattices.
## Quadratic extensions
Why begin with imaginary quadratic fields? Because the geometry is two-dimensional and already contains every key idea — a lattice, a convex body, and the volume-counting argument that lies at the heart of the proof. Once you see how area forces a short vector to exist, the $n$-dimensional version is a straightforward generalization.
Consider $L = \mathbb{Q}(\sqrt{d})$ with $d < 0$. The ring of integers is $\mathcal{O}_L = \mathbb{Z}[\alpha]$, where
\begin{align*}
\alpha =
\begin{cases}
\sqrt{d} & d \equiv 2, 3 \pmod{4} \\
\frac{1}{2}(1 + \sqrt{d}) & d \equiv 1 \pmod{4}.
\end{cases}
\end{align*}
Since $d < 0$, we have $L \subseteq \mathbb{C}$, and we can view $\mathcal{O}_L$ as a discrete subset of the complex plane. When $d \equiv 2, 3 \pmod{4}$, the lattice of $\mathcal{O}_L$ is rectangular, spanned by $1$ and $\sqrt{d}$. When $d \equiv 1 \pmod{4}$, the lattice is centred rectangular (and genuinely hexagonal only when $d = -3$), spanned by $1$ and $\frac{1}{2}(1 + \sqrt{d})$.
Any ideal $\mathfrak{a} \leq \mathcal{O}_L$ is isomorphic to $\mathbb{Z}^2$ as an abelian group, so it too forms a sublattice of $\mathbb{C} \cong \mathbb{R}^2$. The key geometric input is the following lemma, which guarantees the existence of a short non-zero vector in any lattice.
Before stating it, it is worth asking what can go wrong if we drop the hypotheses. Convexity and symmetry are both genuinely necessary: a symmetric but non-convex region can have area exceeding $4A(\Lambda)$ yet contain no lattice point, because it surrounds the origin without reaching the nearby lattice points, and a convex but non-symmetric region of large area can be placed away from the origin so that it slips between the lattice points. These failure modes explain why Minkowski's theorem imposes both conditions simultaneously, rather than a condition on area alone.
[example: Non-Convex Region That Misses All Lattice Points]
Take $\Lambda = \mathbb{Z}^2$ and $S = (-2, 2)^2 \setminus [-1, 1]^2$, an open square annulus. The area is $4^2 - 2^2 = 12 > 4 = 4 \cdot \operatorname{covol}(\mathbb{Z}^2)$, so $S$ far exceeds the area threshold, and $S$ is symmetric about the origin. Yet every lattice point of $\mathbb{Z}^2$ with $\|(x,y)\|_\infty \leq 1$ lies in the removed inner square, and every lattice point with $\|(x,y)\|_\infty \geq 2$ lies outside the open outer square, so $S$ contains no lattice point at all. The failure is non-convexity: $S$ has a "hole" around the origin.
[/example]
[example: Non-Symmetric Region That Misses All Lattice Points]
Take $\Lambda = \mathbb{Z}^2$ and $S = [0.1, 0.9] \times [-5, 5]$, a thin vertical strip lying strictly between the lattice columns $x = 0$ and $x = 1$. It is convex and its area is $0.8 \cdot 10 = 8 > 4 = 4 \cdot \operatorname{covol}(\mathbb{Z}^2)$, yet it contains no lattice point, since no integer lies in $[0.1, 0.9]$. Since $S$ is not symmetric about $\mathbf{0}$, the theorem does not apply. By contrast, the closed axis-aligned square $[-1, 1]^2$ of area $4$ centred at the origin is convex and symmetric, and it does contain non-zero lattice points such as $(\pm 1, 0)$ and $(0, \pm 1)$.
[/example]
[quotetheorem:1602]
[citeproof:1602]
The remarkable feature of this lemma is that the threshold depends only on the area of the fundamental parallelogram, not on its shape. Why does $4A(\Lambda)$ appear rather than, say, $2A(\Lambda)$? The standard proof shrinks $S$ to $\frac{1}{2}S$, which divides the area by $2^2 = 4$, and the counting argument needs $\frac{1}{2}S$ to have area at least $A(\Lambda)$ so that two of its points differ by a non-zero lattice vector; symmetry and convexity then place that difference inside $S$. The factor $4$ is thus the square of the scaling factor $2$. The bound is sharp: the closed square $[-1,1]^2$ has area $4$, exactly $4 \cdot \operatorname{covol}(\mathbb{Z}^2)$, and its only non-zero lattice points are $(\pm 1, 0)$, $(0, \pm 1)$ and $(\pm 1, \pm 1)$, all on the boundary. If we remove the boundary, the open square $(-1,1)^2$ of area $4$ contains no non-zero lattice point, so the equality case genuinely requires closedness.
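The boundary behaviour and the non-convex counterexample can both be checked by listing lattice points. The following is an illustrative sketch for $\Lambda = \mathbb{Z}^2$ only; the region predicates and the search window are our own choices.

```python
def nonzero_lattice_points(inside, bound=3):
    """Non-zero points of Z^2 with |x|, |y| <= bound that satisfy `inside`."""
    return [(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if (x, y) != (0, 0) and inside(x, y)]

def closed_square(x, y):
    return max(abs(x), abs(y)) <= 1        # [-1, 1]^2, area 4

def open_square(x, y):
    return max(abs(x), abs(y)) < 1         # (-1, 1)^2, area 4

def open_annulus(x, y):
    return 1 < max(abs(x), abs(y)) < 2     # (-2, 2)^2 minus [-1, 1]^2, area 12

print(nonzero_lattice_points(closed_square))   # points on the boundary only
print(nonzero_lattice_points(open_square))     # none: closedness matters at equality
print(nonzero_lattice_points(open_annulus))    # none: convexity matters
```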
To apply the lemma to ideals, we need to know the areas of the relevant lattices.
[quotetheorem:1603]
[citeproof:1603]
This identity is the bridge between ideal theory and Euclidean geometry. The norm $N(\mathfrak{a})$ encodes how $\mathfrak{a}$ sits inside $\mathcal{O}_L$ as a $\mathbb{Z}$-module — it is the index $[\mathcal{O}_L : \mathfrak{a}]$ — and so it governs the ratio of the fundamental parallelogram of $\mathfrak{a}$ to that of $\mathcal{O}_L$. The discriminant $D_L$ enters through the formula $A(\mathcal{O}_L) = \frac{1}{2}\sqrt{|D_L|}$: without this formula, we would know only the abstract $\mathbb{Z}$-module structure of $\mathfrak{a}$ and have no way to tie the algebra to a Euclidean area. The discriminant is therefore the precise bridge between arithmetic and geometry.
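The formula $A(\mathcal{O}_L) = \frac{1}{2}\sqrt{|D_L|}$ can be checked numerically from the explicit basis $1, \alpha$ given earlier: the parallelogram has base $1$ and height $\operatorname{Im}(\alpha)$. The sketch below does this for a handful of square-free $d < 0$; the function names and the sample values of $d$ are our own.

```python
import math

def quadratic_disc(d):
    """Discriminant of Q(sqrt(d)) for square-free d."""
    return d if d % 4 == 1 else 4 * d

def fundamental_area(d):
    """Area of the parallelogram spanned by 1 and alpha in C, for square-free d < 0."""
    if d % 4 == 1:
        alpha = complex(0.5, math.sqrt(-d) / 2)   # (1 + sqrt(d)) / 2
    else:
        alpha = complex(0.0, math.sqrt(-d))       # sqrt(d)
    return abs(alpha.imag)                        # base 1 times height Im(alpha)

for d in (-1, -2, -3, -5, -7, -11, -163):
    print(d, fundamental_area(d), 0.5 * math.sqrt(abs(quadratic_disc(d))))
```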
Now we can extract a norm bound. Given any ideal $\mathfrak{a} \leq \mathcal{O}_L$, Minkowski's lemma produces a non-zero $\alpha \in \mathfrak{a}$ with
\begin{align*}
N(\alpha) = |\alpha|^2 \leq \frac{4 A(\mathfrak{a})}{\pi} = N(\mathfrak{a}) \cdot \frac{2\sqrt{|D_L|}}{\pi}.
\end{align*}
Since $\alpha \in \mathfrak{a}$, we have $\langle \alpha \rangle \subseteq \mathfrak{a}$, so $\langle \alpha \rangle = \mathfrak{a} \mathfrak{b}$ for some ideal $\mathfrak{b}$, giving $N(\mathfrak{b}) = N(\langle \alpha \rangle)/N(\mathfrak{a}) \leq c_L$ where
\begin{align*}
c_L = \frac{2\sqrt{|D_L|}}{\pi}.
\end{align*}
Since $[\mathfrak{b}] = [\mathfrak{a}^{-1}]$ in $\mathrm{Cl}_L$, every class in $\mathrm{Cl}_L$ is represented by an ideal of norm at most $c_L$.
[quotetheorem:1604]
[citeproof:1604]
The bound $c_L = 2\sqrt{|D_L|}/\pi$ is the effective version of "every class has a small representative." It does not say that every ideal of norm $\leq c_L$ is non-principal: many are principal, and those all represent the identity element of $\mathrm{Cl}_L$. What it does guarantee is that the prime ideals of norm $\leq c_L$ already generate the entire class group: every class contains an integral ideal of norm $\leq c_L$, and the prime factors of such an ideal themselves have norm $\leq c_L$. This makes the class group directly computable from a finite piece of data.
To pass from bounded norm to finiteness, we need the following elementary counting lemma.
[quotetheorem:1605]
[citeproof:1605]
This lemma is really a statement about the coarseness of the norm: all ideals of norm $m$ must contain $m\mathcal{O}_L$, and the quotient $\mathcal{O}_L / m\mathcal{O}_L$ is a finite ring. The fact that there are only finitely many ideals in any finite ring is what drives the finiteness. The argument would fail if we replaced "norm $m$" by "norm $\leq m$" and $m \to \infty$, which is why we needed the bound $c_L$ to be a fixed finite number first.
Combining these two results immediately yields the main finiteness theorem.
[quotetheorem:1606]
[citeproof:1606]
The structure of the proof is worth pausing over. It is a two-step reduction: first we use geometry (Minkowski's lemma) to reduce an a priori infinite problem to a finite search, then we use algebra (unique factorization of ideals) to identify generators from a finite list. Neither step alone is enough. Without the norm bound, there would be no finite search. Without unique factorization, the norm bound would not translate into generators. The interplay between the geometric bound $c_L$ and the algebraic structure is characteristic of algebraic number theory.
[explanation: Reading the Minkowski Bound]
In practice, the bound $c_L = 2\sqrt{|D_L|}/\pi$ is small enough to be computable. To determine $\mathrm{Cl}_L$, one factors $\langle p \rangle$ for every prime $p < c_L$ using Dedekind's criterion, then checks which of the resulting prime ideals are principal by looking for elements of the appropriate norm.
[/explanation]
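As a rough illustration of this recipe, the following sketch computes $c_L$ for an imaginary quadratic field $\mathbb{Q}(\sqrt{d})$ and lists the rational primes that would then need to be factored. The function names are illustrative choices rather than notation from these notes, and $d$ is assumed to be negative and square-free.

```python
import math

def fundamental_discriminant(d: int) -> int:
    """Discriminant of Q(sqrt(d)) for square-free d (assumed, not checked)."""
    return d if d % 4 == 1 else 4 * d

def minkowski_bound_imag_quadratic(d: int) -> float:
    """c_L = 2*sqrt(|D_L|)/pi for square-free d < 0."""
    return 2 * math.sqrt(abs(fundamental_discriminant(d))) / math.pi

def primes_up_to(x: float) -> list[int]:
    """All rational primes p <= x, found by trial division."""
    return [p for p in range(2, int(x) + 1)
            if all(p % q for q in range(2, math.isqrt(p) + 1))]

for d in (-1, -2, -3, -7, -5, -17):
    c = minkowski_bound_imag_quadratic(d)
    print(d, round(c, 3), primes_up_to(c))
```

An empty list of primes means the class group is trivial; a non-empty list is only the starting point for the factoring and principality checks described above.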
We illustrate with three examples of increasing complexity.
[example: $\mathbb{Q}(\sqrt{-7})$]
Let $d = -7$. Then $d \equiv 1 \pmod{4}$, so $D_L = -7$ and
\begin{align*}
c_L = \frac{2\sqrt{7}}{\pi} \approx 1.68.
\end{align*}
There are no rational primes $p$ with $p < c_L$, so there are no prime ideals of norm less than $c_L$. This means $\mathrm{Cl}_L$ is generated by the empty set: every ideal class is the identity, so $\mathrm{Cl}_L = \{[1]\}$ is the class group of order 1. Hence $\mathcal{O}_L$ is a principal ideal domain and, in particular, a UFD. The same conclusion holds for $d \in \{-1, -2, -3\}$, each giving a Minkowski bound less than 2.
This example shows how powerful the bound can be: a single inequality $c_L < 2$ tells us the class group has order 1 without any further computation. For $d = -7$ specifically, the underlying reason is that $\mathbb{Q}(\sqrt{-7})$ is one of the nine Heegner fields — imaginary quadratic fields whose ring of integers is a UFD.
[/example]
Moving from the trivial class group to the first genuinely non-trivial case shows how the bound becomes more informative as $|D_L|$ grows: once $c_L$ exceeds $2$, there are primes to factor, and the first surprise arises when one of them fails to be principal.
[example: $\mathbb{Q}(\sqrt{-5})$]
Let $d = -5$. Then $D_L = -20$ and
\begin{align*}
c_L = \frac{4\sqrt{5}}{\pi} \approx 2.85.
\end{align*}
So $\mathrm{Cl}_L$ is generated by prime ideals dividing $\langle 2 \rangle$. From Dedekind's criterion (or the formula for quadratic fields), $x^2 + 5 \equiv (x+1)^2 \pmod{2}$, giving $\langle 2 \rangle = \mathfrak{p}^2$ where $\mathfrak{p} = \langle 2, 1 + \sqrt{-5} \rangle$.
We claim $\mathfrak{p}$ is not principal. If $\mathfrak{p} = \langle \beta \rangle$ with $\beta = x + y\sqrt{-5}$, then $N(\beta) = x^2 + 5y^2 = 2$, which has no integer solutions. Therefore $[\mathfrak{p}] \neq [1]$ in $\mathrm{Cl}_L$, and since $[\mathfrak{p}]^2 = [\mathfrak{p}^2] = [\langle 2 \rangle] = [1]$, we conclude $\mathrm{Cl}_L = \langle [\mathfrak{p}] \rangle \cong \mathbb{Z}/2\mathbb{Z}$.
This is consistent with the classical observation that $6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$ gives two distinct factorizations into irreducibles in $\mathbb{Z}[\sqrt{-5}]$. The failure of unique factorization is exactly accounted for by the non-identity class $[\mathfrak{p}]$: the factorization of $\langle 6 \rangle$ into prime ideals is unique, and the two factorizations of $6$ correspond to two different ways of grouping those non-principal prime ideals into principal products.
[/example]
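The arithmetic behind the two factorizations of $6$ can be checked mechanically. The sketch below models elements of $\mathbb{Z}[\sqrt{-5}]$ as integer pairs $(a, b) = a + b\sqrt{-5}$; the helper names are hypothetical and the search is a plain brute force, not an efficient algorithm.

```python
D = -5  # elements of Z[sqrt(-5)] are stored as pairs (a, b) = a + b*sqrt(-5)

def mul(u, v):
    """Product of two elements of Z[sqrt(-5)]."""
    (a, b), (c, e) = u, v
    return (a * c + D * b * e, a * e + b * c)

def norm(u):
    """Field norm N(a + b*sqrt(-5)) = a^2 + 5*b^2."""
    a, b = u
    return a * a - D * b * b

def has_element_of_norm(n):
    """Brute-force search for integer solutions of a^2 + 5*b^2 = n."""
    bound = int(n ** 0.5) + 1
    return any(norm((a, b)) == n
               for a in range(-bound, bound + 1)
               for b in range(-bound, bound + 1))

print(mul((1, 1), (1, -1)))                                   # the element 6
print([norm(u) for u in [(2, 0), (3, 0), (1, 1), (1, -1)]])   # norms 4, 9, 6, 6
print(has_element_of_norm(2), has_element_of_norm(3))
```

A proper factor of any of the four elements would need norm $2$ or $3$, and no such element exists, so all four are irreducible.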
Stepping up the class number once more, we encounter a field where the class group is cyclic of order $4$ — the Minkowski bound now admits several small primes, and one must combine them to recover the full group structure.
[example: $\mathbb{Q}(\sqrt{-17})$]
Let $d = -17 \equiv 3 \pmod{4}$, so $D_L = -4 \cdot 17 = -68$ and $c_L = 2\sqrt{68}/\pi \approx 5.25$. We must factor $\langle 2 \rangle$, $\langle 3 \rangle$, and $\langle 5 \rangle$.
Modulo 2: $x^2 + 17 \equiv (x+1)^2 \pmod{2}$, so $\langle 2 \rangle = \mathfrak{p}^2$ with $\mathfrak{p} = \langle 2, 1 + \sqrt{-17} \rangle$.
Modulo 3: $x^2 + 17 \equiv x^2 - 1 \equiv (x-1)(x+1) \pmod{3}$, so $\langle 3 \rangle = \mathfrak{q} \bar{\mathfrak{q}}$ with $\mathfrak{q} = \langle 3, 1 + \sqrt{-17} \rangle$ and $\bar{\mathfrak{q}} = \langle 3, 1 - \sqrt{-17} \rangle$.
Modulo 5: $x^2 + 17 \equiv x^2 + 2 \pmod{5}$, which is irreducible over $\mathbb{F}_5$, so $5$ is inert and $[\langle 5 \rangle] = [1]$ in $\mathrm{Cl}_L$.
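Thus $\mathrm{Cl}_L = \langle [\mathfrak{p}], [\mathfrak{q}] \rangle$. To find a relation, note $N(\langle 1 + \sqrt{-17} \rangle) = 18 = 2 \cdot 3^2$ and $1 + \sqrt{-17} \in \mathfrak{p} \cap \mathfrak{q}$. So $\langle 1 + \sqrt{-17} \rangle$ is $\mathfrak{p}$ times an ideal of norm $9$ divisible by $\mathfrak{q}$, namely $\mathfrak{q}^2$ or $\mathfrak{q}\bar{\mathfrak{q}} = \langle 3 \rangle$. The second option would mean $3 \mid 1 + \sqrt{-17}$ in $\mathcal{O}_L$, which is false because $(1 + \sqrt{-17})/3 \notin \mathbb{Z}[\sqrt{-17}]$. Hence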
\begin{align*}
\langle 1 + \sqrt{-17} \rangle = \mathfrak{p} \mathfrak{q}^2,
\end{align*}
giving $[\mathfrak{p}] = [\mathfrak{q}]^{-2}$ in $\mathrm{Cl}_L$.
Neither $\mathfrak{p}$ nor $\mathfrak{q}$ is principal: for $\mathfrak{p}$, the equation $N(\mathfrak{p}) = 2 = x^2 + 17y^2$ has no integer solutions; for $\mathfrak{q}$, $N(\mathfrak{q}) = 3 = x^2 + 17y^2$ has no integer solutions. So $[\mathfrak{p}] \neq [1]$ and $[\mathfrak{q}] \neq [1]$.
We claim $\mathrm{Cl}_L \cong \mathbb{Z}/4\mathbb{Z}$, generated by $[\mathfrak{q}]$. Since $[\mathfrak{p}] = [\mathfrak{q}]^{-2}$, the class $[\mathfrak{q}]$ alone generates $\mathrm{Cl}_L$, so the group is cyclic. From $[\mathfrak{p}]^2 = [\langle 2 \rangle] = [1]$ we get $[\mathfrak{q}]^4 = [1]$, so the order of $[\mathfrak{q}]$ divides $4$, and it remains to show $[\mathfrak{q}]^2 \neq [1]$. This is immediate from $[\mathfrak{q}]^2 = [\mathfrak{p}]^{-1}$ and $[\mathfrak{p}] \neq [1]$. It can also be seen directly: if $\mathfrak{q}^2 = \langle \beta \rangle$ for some $\beta$, then $N(\beta) = N(\mathfrak{q})^2 = 9$. The equation $x^2 + 17y^2 = 9$ forces $y = 0$ and $x = \pm 3$, so $\beta = \pm 3$ and $\mathfrak{q}^2 = \langle 3 \rangle = \mathfrak{q}\bar{\mathfrak{q}}$, implying $\mathfrak{q} = \bar{\mathfrak{q}}$. But then $\mathfrak{q}$ would contain both $1 + \sqrt{-17}$ and $1 - \sqrt{-17}$, hence their sum $2$, and together with $3 \in \mathfrak{q}$ this gives $1 \in \mathfrak{q}$, which is absurd. Therefore $[\mathfrak{q}]$ has order exactly $4$, and we conclude $\mathrm{Cl}_L \cong \mathbb{Z}/4\mathbb{Z}$.
[/example]
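The norm equations used in this example are small enough to settle by exhaustive search. The sketch below is a brute-force helper (the name `norm_form_solutions` is hypothetical) for $x^2 + n y^2 = t$; it tests for principal generators only when $\mathcal{O}_L = \mathbb{Z}[\sqrt{-n}]$, which holds for $n = 5$ and $n = 17$.

```python
import math

def norm_form_solutions(n: int, target: int) -> list[tuple[int, int]]:
    """All integer pairs (x, y) with x^2 + n*y^2 == target, where n > 0."""
    solutions = set()
    y_max = math.isqrt(target // n)
    for y in range(-y_max, y_max + 1):
        rest = target - n * y * y          # this must equal x^2
        x = math.isqrt(rest)
        if x * x == rest:
            solutions.update({(x, y), (-x, y)})
    return sorted(solutions)

for target in (2, 3, 9, 18):
    print(target, norm_form_solutions(17, target))
```

Targets $2$ and $3$ have no solutions, target $9$ has only $(\pm 3, 0)$, and target $18$ is solved by $(\pm 1, \pm 1)$, matching the element $1 + \sqrt{-17}$ of norm $18$.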
These three examples sit within a much more restrictive global picture: although class numbers can be computed case by case, the question of which $d$ actually give class groups of order 1 turns out to have a complete and striking answer.
[remark: Imaginary Quadratic UFDs]
A classical theorem in algebraic number theory states that $\mathcal{O}_L$ is a UFD for $L = \mathbb{Q}(\sqrt{d})$ with $d < 0$ if and only if $-d \in \{1, 2, 3, 7, 11, 19, 43, 67, 163\}$. The "if" direction follows from computing the Minkowski bound for each value and verifying that the class group has order 1. The converse — that no other values work — is considerably harder and uses deep results in analytic number theory.
[/remark]
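The "if" direction can be sanity-checked numerically. The sketch below does not use the Minkowski-bound method of this chapter; instead it relies on a different classical technique, assumed here without proof: for a negative fundamental discriminant $D$, the class number equals the number of reduced primitive binary quadratic forms $ax^2 + bxy + cy^2$ with $b^2 - 4ac = D$. The function names are illustrative.

```python
import math

def discriminant(d: int) -> int:
    """Discriminant of Q(sqrt(d)) for square-free d (assumed, not checked)."""
    return d if d % 4 == 1 else 4 * d

def class_number(D: int) -> int:
    """Count reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0.

    Reduced means -a < b <= a <= c, with b >= 0 whenever a == c.
    """
    assert D < 0 and D % 4 in (0, 1)
    count = 0
    a = 1
    while 3 * a * a <= -D:                 # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):
            numerator = b * b - D          # equals 4ac
            if numerator % (4 * a):
                continue
            c = numerator // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if math.gcd(a, math.gcd(b, c)) == 1:
                count += 1
        a += 1
    return count

for d in (-1, -2, -3, -7, -11, -19, -43, -67, -163, -5, -17):
    print(d, class_number(discriminant(d)))
```

The nine values in the remark all give $1$, while $d = -5$ and $d = -17$ give $2$ and $4$, in agreement with the examples above.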
## Minkowski in arbitrary dimension
The two-dimensional picture worked because $\mathbb{C}$ naturally carries an inner product, and an imaginary quadratic field embeds entirely into $\mathbb{C}$. For an arbitrary number field $L$ of degree $n$, the embeddings are partly real and partly complex: there are $r$ real embeddings and $s$ conjugate pairs of complex embeddings, with $r + 2s = n$. The challenge is to build an analogue of the two-dimensional lattice and still make the volume-counting argument work. The answer is to embed $L$ into $\mathbb{R}^r \times \mathbb{C}^s \cong \mathbb{R}^n$, where each complex embedding contributes a two-dimensional real piece. The rest of the proof follows the same pattern, with area replaced by $n$-dimensional volume.
We begin by establishing the necessary lattice theory.
[definition: Discrete Subset]
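A subset $X \subseteq \mathbb{R}^n$ is **discrete** if for every $x \in X$, there is some $\varepsilon > 0$ such that $B(x, \varepsilon) \cap X = \{x\}$. When $X$ is a subgroup of $\mathbb{R}^n$ (the only case we need), this is equivalent to requiring that $K \cap X$ be finite for every compact $K \subseteq \mathbb{R}^n$; for arbitrary subsets the second condition is strictly stronger, as $\{1/k : k \geq 1\} \subseteq \mathbb{R}$ shows.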
[/definition]
Discreteness is an exact condition: a subset is either discrete or it is not. The next theorem characterizes which subgroups are discrete. The condition is stronger than one might expect — it is not enough to have a spanning set; the spanning elements must be linearly independent over $\mathbb{R}$ in a sense that is controlled by the ambient dimension. This makes discreteness a genuinely geometric condition, not just an algebraic one.
[quotetheorem:1607]
[citeproof:1607]
[remark: Linear Independence Is Essential]
The linear independence condition cannot be dropped. Consider the subgroup $\mathbb{Z}\sqrt{2} + \mathbb{Z}\sqrt{3} \subseteq \mathbb{R}$. This is generated by two elements, but the ambient space is $\mathbb{R}^1$, which has $\mathbb{R}$-dimension $1$. Any two non-zero elements of $\mathbb{R}$ are $\mathbb{R}$-linearly dependent, so we can have at most one $\mathbb{R}$-linearly independent generator in $\mathbb{R}$. The subgroup $\mathbb{Z}\sqrt{2} + \mathbb{Z}\sqrt{3}$ is therefore not a discrete subgroup. Indeed, it is dense in $\mathbb{R}$: since $\sqrt{3}/\sqrt{2}$ is irrational, Kronecker's theorem implies that the integer combinations $m\sqrt{2} + n\sqrt{3}$ come arbitrarily close to $0$ (and to any other real number). Discreteness of a free $\mathbb{Z}$-module in $\mathbb{R}^n$ requires its $\mathbb{Z}$-rank to equal the $\mathbb{R}$-dimension of its $\mathbb{R}$-span, and that span has dimension at most $n$. A rank-$2$ $\mathbb{Z}$-submodule of $\mathbb{R}^1$ violates this: its two generators are $\mathbb{R}$-linearly dependent but $\mathbb{Z}$-linearly independent, so their ratio is irrational and the module is dense in $\mathbb{R}$. (If the ratio were rational, both generators would be integer multiples of a common element and the module would have rank $1$.)
[/remark]
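The failure of discreteness can be observed numerically. The following sketch searches for small values of $|m\sqrt{2} + n\sqrt{3}|$ with $1 \leq m \leq$ `bound`, choosing for each $m$ the integer $n$ that makes the combination smallest. It is a floating-point experiment with a hypothetical helper name, offered as an illustration rather than a proof of density.

```python
import math

def closest_to_zero(bound: int) -> tuple[float, int, int]:
    """Smallest |m*sqrt(2) + n*sqrt(3)| over 1 <= m <= bound, with n optimal.

    Returns (value, m, n). Restricting to m >= 1 loses nothing by symmetry.
    """
    best = (float("inf"), 0, 0)
    for m in range(1, bound + 1):
        # The best n for this m is the integer nearest to -m*sqrt(2)/sqrt(3).
        n = -round(m * math.sqrt(2) / math.sqrt(3))
        value = abs(m * math.sqrt(2) + n * math.sqrt(3))
        best = min(best, (value, m, n))
    return best

for bound in (10, 100, 1000, 10000):
    print(bound, closest_to_zero(bound))
```

The minimum keeps shrinking as the search range grows, so no ball around $0$ isolates the origin from the rest of the subgroup.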
With the structure of discrete subgroups in hand, we can single out those of maximal rank and attach to each a fundamental geometric invariant — the covolume.
[definition: Lattice and Covolume]
A discrete subgroup $\Lambda \subseteq \mathbb{R}^n$ is a **lattice** if $\operatorname{rank} \Lambda = n$. Given a basis $x_1, \ldots, x_n$ of $\Lambda$, the **fundamental domain** is
\begin{align*}
P = \left\{\sum_{i=1}^n \lambda_i x_i : \lambda_i \in [0,1]\right\},
\end{align*}
and the **covolume** of $\Lambda$ is $\operatorname{covol}(\Lambda) = \operatorname{vol}(P) = |\det A|$, where $A$ is the matrix with columns $x_i$.
[/definition]
This definition chose a specific basis, so we should check that the resulting number does not secretly depend on that choice.
[remark: Covolume is Well-Defined]
If $x_1', \ldots, x_n'$ is a different basis of $\Lambda$, the change-of-basis matrix has entries in $\mathbb{Z}$ and determinant $\pm 1$, so $\operatorname{covol}(\Lambda)$ is independent of the choice of basis. The name "covolume" reflects that the torus $\mathbb{R}^n/\Lambda$ has volume $\operatorname{covol}(\Lambda)$.
[/remark]
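Basis independence is easy to test on a concrete lattice. The sketch below computes $|\det A|$ exactly for one basis and for the basis obtained by multiplying with an integer matrix $U$ of determinant $1$; the matrices are arbitrary illustrative choices, and the determinant routine is a small exact Gaussian elimination written for this example.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:                       # a row swap flips the sign
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

A = [[2, 1, 0], [0, 3, 1], [0, 0, 1]]   # columns form a basis of a lattice in R^3
U = [[1, 1, 0], [1, 2, 0], [0, 1, 1]]   # integer change of basis with det(U) = 1
print(abs(det(A)), abs(det(U)), abs(det(matmul(A, U))))
```

The columns of $AU$ are a second basis of the same lattice, and both bases report the same covolume.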
We are now ready for the central geometric statement. Minkowski's theorem relates the covolume of a lattice to the volume of a symmetric convex set, and it is precisely the tool needed to force a small lattice element to exist inside a convex body of sufficient size.
[quotetheorem:1608]
[citeproof:1608]
[remark: Sharpness]
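Taking $\Lambda = \mathbb{Z}^n$ shows that the threshold $\operatorname{vol}(S) = 2^n = 2^n \operatorname{covol}(\Lambda)$ is sharp: the open cube $(-1,1)^n$ is convex, symmetric, and has volume exactly $2^n$, yet contains no non-zero lattice point. The closed cube $[-1,1]^n$ has the same volume and does contain non-zero lattice points ($3^n - 1$ of them), but all of them lie on its boundary.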
[/remark]
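For small $n$ the lattice points of the cube can simply be enumerated. The sketch below (with a hypothetical helper name) lists the non-zero points of $\mathbb{Z}^n$ in the closed cube $[-1,1]^n$ and in the open cube $(-1,1)^n$; only coordinates in $\{-1, 0, 1\}$ need to be tried.

```python
from itertools import product

def nonzero_lattice_points(n: int, closed: bool) -> list[tuple[int, ...]]:
    """Non-zero points of Z^n in [-1,1]^n (closed=True) or (-1,1)^n (closed=False)."""
    points = []
    for p in product((-1, 0, 1), repeat=n):
        if closed:
            inside = all(abs(c) <= 1 for c in p)
        else:
            inside = all(abs(c) < 1 for c in p)
        if inside and any(p):              # any(p) excludes the origin
            points.append(p)
    return points

for n in (1, 2, 3):
    print(n, len(nonzero_lattice_points(n, True)), len(nonzero_lattice_points(n, False)))
```

Both cubes have volume $2^n$, so the comparison isolates the effect of including the boundary.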
Minkowski's theorem is genuinely a theorem about volume and convexity, not about algebraic structure. Its hypothesis is only that $S$ is large, symmetric, and convex; the conclusion is that any lattice whose covolume is small enough relative to $\operatorname{vol}(S)$ must have a non-zero point inside $S$. The role of convexity is subtle: once the pigeonhole step produces two points $y, z \in \tfrac{1}{2}S$ that agree modulo $\Lambda$, convexity guarantees that the midpoint of $2y$ and $-2z$, which is the lattice element $y - z$, lies in $S$. Without convexity, that midpoint could fall outside $S$, for instance when $S$ is disconnected. Without symmetry, the argument that $-2z \in S$ fails, and we cannot form the lattice element $y - z$ from elements of $S$.
Now we embed the number field into Euclidean space. Let $L$ be a number field with $[L:\mathbb{Q}] = n$. Let $\sigma_1, \ldots, \sigma_r : L \to \mathbb{R}$ be the real embeddings and $\sigma_{r+1}, \ldots, \sigma_{r+s}, \bar{\sigma}_{r+1}, \ldots, \bar{\sigma}_{r+s} : L \to \mathbb{C}$ be the pairs of complex conjugate embeddings, so that $n = r + 2s$. Using the identification $\mathbb{C} \cong \mathbb{R}^2$ via $x + iy \mapsto (x, y)$, we obtain an embedding
\begin{align*}
\sigma : L \hookrightarrow \mathbb{R}^r \times \mathbb{C}^s \cong \mathbb{R}^n, \qquad \sigma(\alpha) = (\sigma_1(\alpha), \ldots, \sigma_r(\alpha), \sigma_{r+1}(\alpha), \ldots, \sigma_{r+s}(\alpha)).
\end{align*}
[quotetheorem:1609]
[citeproof:1609]
This is the $n$-dimensional analogue of the formula $A(\mathfrak{a}) = N(\mathfrak{a}) \cdot A(\mathcal{O}_L)$ proved in the quadratic case. The factor $2^{-s}$ arises because each complex embedding contributes a two-dimensional real piece but we are counting it as a single complex coordinate; the coordinate change from the standard complex embedding $(\sigma_{r+j}, \bar\sigma_{r+j})$ to the real–imaginary splitting $({\rm Re}, {\rm Im})$ has Jacobian $2^{-1}$ per pair. The quantity $2^{-s}|D_L|^{1/2}$ plays the same role as $\frac{1}{2}\sqrt{|D_L|}$ in the imaginary quadratic case, which it recovers when $s = 1$: it is the geometric volume of the ring of integers under the embedding $\sigma$.
We now state and prove the Minkowski bound for a general number field.
[quotetheorem:1610]
[citeproof:1610]
[remark: Recovering the Quadratic Bound]
For an imaginary quadratic field, $r = 0$, $s = 1$, $n = 2$, so $c_L = \frac{4}{\pi} \cdot \frac{2}{4} |D_L|^{1/2} = \frac{2}{\pi}|D_L|^{1/2}$, which matches the bound $2\sqrt{|D_L|}/\pi$ derived geometrically above.
[/remark]
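The specialisation can also be checked numerically. The sketch below assumes the general formula $c_L = (4/\pi)^s \, \frac{n!}{n^n} \, |D_L|^{1/2}$ that the remark evaluates at $n = 2$, $s = 1$; the function name is an illustrative choice.

```python
import math

def minkowski_bound(n: int, s: int, disc: int) -> float:
    """c_L = (4/pi)^s * n!/n^n * sqrt(|D_L|) for degree n with s complex pairs."""
    return (4 / math.pi) ** s * math.factorial(n) / n ** n * math.sqrt(abs(disc))

# Imaginary quadratic fields: n = 2, s = 1, compared with 2*sqrt(|D|)/pi.
for disc in (-7, -20, -68):
    general = minkowski_bound(2, 1, disc)
    quadratic = 2 * math.sqrt(abs(disc)) / math.pi
    print(disc, round(general, 4), round(quadratic, 4))

# Real quadratic field Q(sqrt(2)): n = 2, s = 0, D = 8.
print(round(minkowski_bound(2, 0, 8), 4))
```

For a real quadratic field the factor $(4/\pi)^s$ disappears and the bound is simply $\frac{1}{2}\sqrt{D_L}$.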
The Minkowski bound $c_L$ depends on the signature $(r,s)$ and the discriminant $|D_L|^{1/2}$. The factor $(4/\pi)^s$ is larger than $1$ for $s > 0$: complex embeddings make the body $B_{r,s}(t)$ smaller in Euclidean volume relative to the $\ell^1$-ball, so we need a larger $t$, which translates into a larger bound on $|N(\alpha)|$. The factor $n!/n^n$ goes to $0$ as $n \to \infty$ (by Stirling: $n!/n^n \sim \sqrt{2\pi n}/e^n$), so the purely combinatorial part of $c_L$ actually decreases with degree, while the discriminant $|D_L|^{1/2}$ typically grows rapidly with $n$. In practice, $c_L$ grows with $n$ because the discriminant dominates.
[quotetheorem:1611]
[citeproof:1611]
This theorem is the culmination of the geometric work. Every ideal class, no matter how "large" an ideal it might contain, also contains a representative whose norm is bounded by the fixed constant $c_L$. The proof works by moving to the inverse class $[\mathfrak{a}^{-1}]$ — this is the standard trick in ideal class arithmetic, since applying the Minkowski bound to $\mathfrak{a}^{-1}$ directly gives an element whose norm is controlled, and multiplying back by $\mathfrak{a}$ produces a small integral representative of the original class. The key step is the norm identity $N(\mathfrak{a}^{-1}) \cdot N(\mathfrak{a}) = 1$, which follows from the multiplicativity of the norm.
[quotetheorem:1612]
[citeproof:1612]
This is the central result of the chapter, and it deserves to be appreciated for what it does and does not assert. It says the class group is finite — this is a structural finiteness result, not an effective computation of the group. The bound $c_L$ tells us which primes can generate the group, but computing the full group structure requires additional work: determining which prime ideals are principal, finding relations among non-principal ones, and identifying the abstract group from generators and relations. The theorem also says nothing about how large $\mathrm{Cl}_L$ can be — the class number $h_L = |\mathrm{Cl}_L|$ can be arbitrarily large even for quadratic fields. What finiteness gives is a guarantee that the computation terminates.
[explanation: Computing Class Groups in Practice]
In practice, to determine $\mathrm{Cl}_L$ for a given number field $L$:
1. Compute $c_L$ from the formula, using the discriminant $D_L$ and the signature $(r, s)$.
2. For each rational prime $p \leq c_L$, factor $\langle p \rangle$ in $\mathcal{O}_L$ using Dedekind's criterion, obtaining prime ideals $\mathfrak{p}_{i}$ of known norm.
3. Determine which of these prime ideals are principal by searching for elements of $\mathcal{O}_L$ with the required norm (using the norm form of $L$).
4. Find relations among the classes $[\mathfrak{p}_i]$ by looking for principal ideals with small norm that factor through several $\mathfrak{p}_i$, as in the example of $\mathbb{Q}(\sqrt{-17})$ above.
5. Identify the abstract group structure of $\mathrm{Cl}_L$ from the generators and relations found.
The Minkowski bound guarantees that this finite computation determines $\mathrm{Cl}_L$ completely.
[/explanation]
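For quadratic fields, step 2 reduces to a short case analysis. The sketch below assumes the standard splitting rule for $\mathbb{Q}(\sqrt{d})$ with $d$ square-free: an odd prime $p$ is ramified, split, or inert according to whether $d$ is zero, a non-zero square, or a non-square modulo $p$, and $p = 2$ is decided by $d$ modulo $8$. The function name is hypothetical, and $p$ is assumed to be prime.

```python
def splitting_type(d: int, p: int) -> str:
    """Behaviour of the prime p in Q(sqrt(d)), for square-free d not equal to 0 or 1."""
    if p == 2:
        if d % 4 in (2, 3):
            return "ramified"
        return "split" if d % 8 == 1 else "inert"   # here d is 1 mod 4
    legendre = pow(d, (p - 1) // 2, p)              # Euler's criterion
    if legendre == 0:
        return "ramified"
    return "split" if legendre == 1 else "inert"

for d, primes in ((-5, (2,)), (-17, (2, 3, 5))):
    for p in primes:
        print(d, p, splitting_type(d, p))
```

Inert primes give principal prime ideals and can be discarded at once; only the ramified and split primes contribute potential generators to steps 3 and 4.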
With the finiteness of the class group secured, attention shifts to the other major structural group: the units $\mathcal{O}_L^\times$. The signature $(r,s)$ determines whether the unit group is finite or infinite: it is infinite exactly when $r + s > 1$, and in general the logarithmic map linearizes the unit condition and reveals a free abelian group of rank $r + s - 1$.
# 7. Dirichlet's unit theorem
Having established the finiteness of the class group in the previous chapter, we turn to another fundamental structural result about number fields: the classification of the unit group $\mathcal{O}_L^\times$. The norm condition $|N(\alpha)| = 1$ tells us when $\alpha$ is a unit, but gives no control over how many units exist or how they fit together. Must the unit group be finite? If it is infinite, does it have any discernible structure? What governs the rank — the number of independent units? Dirichlet's unit theorem provides the complete answer, identifying $\mathcal{O}_L^\times$ as a product of a finite cyclic group with a free abelian group of explicitly computed rank.
## Statement of Dirichlet's Unit Theorem
The norm condition $|N(\alpha)| = 1$ is necessary and sufficient for a unit, but it is a multiplicative constraint: it tells us the product of all embeddings has absolute value 1, not what the individual embeddings look like. This is not enough to control how many independent units there are. One key difficulty is that the unit group is multiplicative, not additive, so naive counting by size does not work.
To see the contrast sharply: $\mathbb{Q}(i)$ has only finitely many units ($\{\pm 1, \pm i\}$), while $\mathbb{Q}(\sqrt{2})$ has infinitely many ($\pm(1+\sqrt{2})^n$ for all $n \in \mathbb{Z}$). The difference comes down to the signature. An imaginary quadratic field has one pair of complex embeddings and no real ones; the constraint $|N(\alpha)| = 1$ translates to $|z|^2 = 1$, which cuts out a compact set (the unit circle) in $\mathbb{C}$, and a compact set meets a lattice in only finitely many points. A real quadratic field has two real embeddings; the constraint $|x_1 x_2| = 1$ cuts out a pair of hyperbolas in $\mathbb{R}^2$, which is non-compact and turns out to contain infinitely many points of the lattice $\sigma(\mathcal{O}_L)$.
This geometric intuition — the signature determines whether the norm-1 surface is compact or not — is the heart of Dirichlet's theorem.
Recall that for a number field $L$ of degree $n = [L : \mathbb{Q}]$, the signature $(r, s)$ records the number of real embeddings $r$ and pairs of complex conjugate embeddings $s$, so that $r + 2s = n$. The roots of unity in $L$ form a subgroup:
\begin{align*}
\mu_L = \{ \alpha \in L : \alpha^N = 1 \text{ for some } N > 0 \}.
\end{align*}
[quotetheorem:1613]
The integer $r + s - 1$ is the rank of the unit group. The $-1$ is not an accident: one entire dimension is automatically consumed by the norm-1 constraint, which forces all log-embeddings to lie in a trace-zero hyperplane. The surprise is that exactly $r + s - 1$ independent units always exist — no more, no fewer.
For a totally real field of degree $n$ (signature $(n, 0)$), the rank is $n - 1$, meaning there are $n - 1$ multiplicatively independent units beyond $\pm 1$. For an imaginary quadratic field (signature $(0, 1)$), the rank is $0$ and the unit group is finite; a general totally imaginary field of degree $2s$ has rank $s - 1$, which is positive as soon as $s \geq 2$. The rank formula correctly predicts the contrast we observed above.
The unit theorem does not directly say anything about the sizes of units, only about their count. Measuring how large the lattice of units is requires the regulator, defined below in the section on the regulator. The regulator enters the analytic class number formula, linking the sizes of units to the leading term of the Dedekind zeta function.
Before proving the full theorem, we work through the case of real quadratic fields, where the geometry is completely visible and the argument already contains all the essential ideas.
## Units in Real Quadratic Fields
The contrast between imaginary and real quadratic fields is stark: one has at most six units total, the other has infinitely many. The unit group flips from finite to infinite exactly when the field acquires a real embedding. What explains this dichotomy, and how do we extract an explicit generator of the infinite part?
Let $L = \mathbb{Q}(\sqrt{d})$, where $d > 1$ is square-free. The two real embeddings are $\sigma_1(\alpha) = x + y\sqrt{d}$ and $\sigma_2(\alpha) = x - y\sqrt{d}$ for $\alpha = x + y\sqrt{d}$, so $r = 2$, $s = 0$. The only roots of unity in a real field are $\pm 1$, so $\mu_L = \{\pm 1\}$. Dirichlet's theorem predicts
\begin{align*}
\mathcal{O}_L^\times \cong \{\pm 1\} \times \mathbb{Z}.
\end{align*}
The norm satisfies $N(x + y\sqrt{d}) = x^2 - dy^2$, so the unit condition $|N(\alpha)| = 1$ becomes the Pell equation $x^2 - dy^2 = \pm 1$, with $x, y \in \mathbb{Z}$ when $d \equiv 2, 3 \pmod{4}$; when $d \equiv 1 \pmod{4}$, $x$ and $y$ may also both be half-odd-integers. Dirichlet's theorem says this equation has infinitely many such solutions, and they are all $\pm$ powers of a single element.
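For small $d$ the generator can be found by direct search. The sketch below looks for the smallest $y \geq 1$ such that $dy^2 \pm 1$ is a perfect square; the function name and the search cap `y_max` are assumptions made for illustration. It only searches $\mathbb{Z}[\sqrt{d}]$, so it returns the fundamental unit of $\mathcal{O}_L$ only when $d \equiv 2, 3 \pmod 4$, and brute force becomes impractical when the smallest solution is large.

```python
import math

def smallest_pell_solution(d: int, y_max: int = 10**6):
    """Smallest y >= 1 with x^2 - d*y^2 = sign, sign in {-1, +1}.

    Returns (x, y, sign) with x > 0, or None if nothing is found up to y_max.
    Assumes d > 1 is not a perfect square.
    """
    for y in range(1, y_max + 1):
        for sign in (-1, 1):
            x_squared = d * y * y + sign
            x = math.isqrt(x_squared)
            if x * x == x_squared:
                return x, y, sign
    return None

for d in (2, 3, 6, 7):
    print(d, smallest_pell_solution(d))
```

Because the coefficients of the powers of a unit greater than $1$ increase, the solution with the smallest $y$ corresponds to the generator $x + y\sqrt{d}$ of the units of $\mathbb{Z}[\sqrt{d}]$ modulo $\pm 1$.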
We prove this in two steps: first we show infinitely many units exist, then we identify a single generator.
### Infinitely Many Units via Minkowski
[quotetheorem:1614]
[citeproof:1614]
This proof is entirely non-constructive: it finds units by the pigeonhole principle rather than by an explicit formula. The residue pigeonhole — identifying two elements that are congruent mod $m\mathcal{O}_L$ and have the same norm — is the same mechanism that drives the general proof for arbitrary number fields. Note that the argument breaks down for imaginary quadratic fields: there the Minkowski box $S_t$ is replaced by a disk, and the norm-1 locus is compact, so the iterative construction produces only finitely many new units. This is precisely why imaginary quadratic fields have finite unit groups.
### The Fundamental Unit
[quotetheorem:1615]
[citeproof:1615]
The fundamental unit is the smallest unit greater than 1 in $\mathbb{R}$, a fact that followed from compactness of the bounded unit-norm locus. This minimality criterion is the prototype for the general argument: in the full proof, linear independence of log-images follows from a similar diagonal-dominance observation. The connection to the class number formula is through the regulator $R_L = \log \sigma_1(\varepsilon_0)$: larger fundamental units mean larger regulators and faster growth of the Dedekind zeta function near $s = 1$.
## Proof of the General Dirichlet Unit Theorem
The proof of the full theorem follows the same two-step pattern: (1) introduce logarithmic coordinates to turn the problem into one about discrete subgroups of $\mathbb{R}^{r+s}$, then (2) find enough independent units to span a lattice of the correct rank. The logarithm is the key tool: it converts the multiplicative unit condition $|N(\alpha)| = 1$ into the additive trace-zero condition $\sum y_i = 0$, linearising the problem. Without this transformation, trying to build units by ad hoc means — say, products of small-norm elements — gives vectors in $\mathbb{R}^{r+s}$ with no reason to be linearly independent.
### The Logarithmic Embedding
Define the logarithmic map $\ell : \mathcal{O}_L^\times \to \mathbb{R}^{r+s}$ by
\begin{align*}
x \mapsto \bigl(\log|\sigma_1(x)|,\ \ldots,\ \log|\sigma_r(x)|,\ 2\log|\sigma_{r+1}(x)|,\ \ldots,\ 2\log|\sigma_{r+s}(x)|\bigr).
\end{align*}
The factors of $2$ on the complex places arise because $\sigma_{r+i}$ and its conjugate $\overline{\sigma_{r+i}}$ contribute equally to the norm; recording $2\log|\sigma_{r+i}(x)|$ weights each complex place consistently. Since $\log|ab| = \log|a| + \log|b|$, the map $\ell$ is a group homomorphism from $(\mathcal{O}_L^\times, \cdot)$ to $(\mathbb{R}^{r+s}, +)$.
The image of $\ell$ is a discrete subgroup of $\mathbb{R}^{r+s}$ lying inside the trace-zero hyperplane, and the kernel is exactly $\mu_L$. We verify both facts and then establish that the image has the maximal possible rank inside the hyperplane. The argument proceeds in four steps: identifying the kernel, showing the image is discrete and lies in the trace-zero hyperplane, constructing enough units to achieve maximal rank, and finally stitching these together via a linear algebra computation.
### Identifying the Kernel as Roots of Unity
$\ker \ell$ consists of those $\alpha \in \mathcal{O}_L^\times$ with $|\sigma_i(\alpha)| = 1$ for all $i$. Since $\sigma(\mathcal{O}_L)$ is a lattice, hence discrete, and $\{(y_i, z_j) : |y_i| = 1, |z_j| = 1\}$ is compact, the intersection is finite, so $\ker \ell$ is finite. Every element of a finite group has finite order, so $\ker \ell \subseteq \mu_L$. Conversely, if $\alpha \in L$ satisfies $\alpha^N = 1$, then $\alpha$ is an algebraic integer with inverse $\alpha^{N-1} \in \mathcal{O}_L$, so $\alpha \in \mathcal{O}_L^\times$; moreover $|\sigma_i(\alpha)|^N = 1$, which gives $|\sigma_i(\alpha)| = 1$ for all $i$, so $\alpha \in \ker \ell$. Thus $\ker \ell = \mu_L$.
Since $L$ embeds in $\mathbb{C}$, the group $\mu_L$ is a finite subgroup of $\mathbb{C}^\times$, hence cyclic (a finite subgroup of the multiplicative group of a field is always cyclic).
### Discreteness and the Trace-Zero Hyperplane
Let $H = \{ (y_1, \ldots, y_{r+s}) \in \mathbb{R}^{r+s} : \sum_{i=1}^{r+s} y_i = 0 \} \cong \mathbb{R}^{r+s-1}$.
For any $\alpha \in \mathcal{O}_L^\times$, the norm satisfies $N(\alpha) = \pm 1$, so $|N(\alpha)| = 1$. Computing directly:
\begin{align*}
|N(\alpha)| = \prod_{i=1}^r |\sigma_i(\alpha)| \cdot \prod_{j=1}^s |\sigma_{r+j}(\alpha)|^2 = 1.
\end{align*}
Taking logarithms gives $\sum_{i=1}^r \log|\sigma_i(\alpha)| + 2\sum_{j=1}^s \log|\sigma_{r+j}(\alpha)| = 0$, which is exactly $\sum y_i = 0$. So $\operatorname{im} \ell \subseteq H$.
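A numerical check in the simplest case makes the hyperplane condition concrete. The sketch below evaluates $\ell$ for elements of the real quadratic field $\mathbb{Q}(\sqrt{2})$, where $r = 2$ and $s = 0$, so there are no factors of $2$; the helper name is hypothetical and the computation is in floating point.

```python
import math

def log_embedding(x: int, y: int, d: int) -> tuple[float, float]:
    """(log|x + y*sqrt(d)|, log|x - y*sqrt(d)|) for a real quadratic field."""
    root = math.sqrt(d)
    return (math.log(abs(x + y * root)), math.log(abs(x - y * root)))

u = log_embedding(1, 1, 2)    # the unit 1 + sqrt(2), of norm -1
u2 = log_embedding(3, 2, 2)   # its square 3 + 2*sqrt(2), of norm 1
v = log_embedding(2, 1, 2)    # 2 + sqrt(2), of norm 2, which is not a unit

print(u, sum(u))              # coordinates sum to zero up to rounding
print(u2, 2 * u[0])           # first coordinate doubles: l is a homomorphism
print(sum(v), math.log(2))    # for a non-unit the sum is log|N|
```

The coordinate sum is $\log|N(\alpha)|$ in general, which is why exactly the units land in $H$.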
To see that $\operatorname{im} \ell$ is discrete, note that $\ell^{-1}([-A, A]^{r+s})$ maps under $\sigma$ into the compact set
\begin{align*}
\{ (y_i, z_j) : e^{-A} \leq |y_i| \leq e^A,\ e^{-A/2} \leq |z_j| \leq e^{A/2} \}.
\end{align*}
Since $\sigma(\mathcal{O}_L)$ is discrete, it meets this compact set in only finitely many points, so $\ell^{-1}([-A, A]^{r+s})$ is finite and hence $\operatorname{im} \ell \cap [-A, A]^{r+s}$ is finite for every $A > 0$.
Thus $\operatorname{im} \ell$ is a discrete subgroup of $H \cong \mathbb{R}^{r+s-1}$, and hence isomorphic to $\mathbb{Z}^a$ for some $a \leq r + s - 1$. The remaining task is to show $a = r + s - 1$.
### Constructing Units with Prescribed Sign Patterns
The sign-pattern construction is the engine of the proof. We show that for each $k \in \{1, \ldots, r+s\}$, there exists a unit $u_k \in \mathcal{O}_L^\times$ such that if $\ell(u_k) = (y_1, \ldots, y_{r+s})$, then $y_i < 0$ for all $i \neq k$ (and hence $y_k > 0$ since $\sum y_i = 0$). These $r+s$ sign-pattern units will then yield $r+s-1$ linearly independent vectors in $H$.
Fix $k$ and pick any nonzero $\alpha_1 \in \mathcal{O}_L$. Extend $\ell$ to all nonzero elements of $\mathcal{O}_L$ by the same formula, and write $(a_1, \ldots, a_{r+s}) = \ell(\alpha_1)$. Apply Minkowski to the box
\begin{align*}
S = \{ (y_1, \ldots, y_r, z_1, \ldots, z_s) \in \mathbb{R}^r \times \mathbb{C}^s : |y_i| \leq c_i,\ |z_j|^2 \leq c_{r+j} \},
\end{align*}
which has volume $2^r \pi^s c_1 \cdots c_{r+s}$ (each real coordinate contributes an interval of length $2c_i$ and each complex coordinate a disc of area $\pi c_{r+j}$), is convex, and is symmetric about $0$. Choose $0 < c_i < e^{a_i}$ for $i \neq k$, and set
\begin{align*}
c_k = \left(\frac{2}{\pi}\right)^s |D_L|^{1/2} \frac{1}{c_1 \cdots \hat{c}_k \cdots c_{r+s}}
\end{align*}
so that $\operatorname{vol}(S) = 2^{r+s} |D_L|^{1/2} = 2^n \operatorname{covol}(\mathcal{O}_L)$. Since $S$ is closed and bounded, Minkowski produces a nonzero $\beta \in \mathcal{O}_L$ with $\sigma(\beta) \in S$. Write $\ell(\beta) = (b_1, \ldots, b_{r+s})$. The defining inequalities of $S$ say $e^{b_i} \leq c_i$ for every $i$ (at a complex place, $e^{b_{r+j}} = |\sigma_{r+j}(\beta)|^2$), so $e^{b_i} \leq c_i < e^{a_i}$ and hence $b_i < a_i$ for all $i \neq k$. The norm bound follows from the volume choice: $|N(\beta)| = \prod |\sigma_i(\beta)| \cdot \prod |\sigma_{r+j}(\beta)|^2 \leq c_1 \cdots c_{r+s} = (2/\pi)^s |D_L|^{1/2}$.
Repeating the construction with $\beta$ in place of $\alpha_1$, and iterating, produces a sequence $\alpha_1, \alpha_2, \ldots \in \mathcal{O}_L$ with $|N(\alpha_t)| \leq (2/\pi)^s |D_L|^{1/2}$ for all $t \geq 2$, and with the coordinates $b_i^{(t)}$ of $\ell(\alpha_t)$ strictly decreasing in $t$ for each $i \neq k$. Since $N(\alpha_t)$ takes only finitely many values and each quotient $\mathcal{O}_L / m\mathcal{O}_L$ is finite, the pigeonhole principle gives indices $t > t'$ with $N(\alpha_t) = N(\alpha_{t'}) = m$ and $\alpha_t \equiv \alpha_{t'} \pmod{m\mathcal{O}_L}$. The same algebraic argument as in the quadratic case shows that $u_k := \alpha_t / \alpha_{t'}$ lies in $\mathcal{O}_L$ and has norm $1$, so $u_k \in \mathcal{O}_L^\times$, and $\ell(u_k) = \ell(\alpha_t) - \ell(\alpha_{t'})$ has $y_i < 0$ for all $i \neq k$ because $t > t'$.
### Linear Independence via Diagonal Dominance
Let $m = r + s$ and form the $m \times m$ matrix $A$ whose $j$th row is $\ell(u_j)$. By construction: $a_{jj} > 0$, $a_{ji} < 0$ for $i \neq j$, and $\sum_i a_{ji} = 0$ (since every row lies in $H$). The sign-pattern construction is precisely what gives the diagonal dominance that will force linear independence.
We apply the following linear algebra fact: if $A \in \operatorname{Mat}_m(\mathbb{R})$ satisfies $a_{ii} > 0$, $a_{ij} < 0$ for $i \neq j$, and $\sum_j a_{ij} = 0$ for each row $i$, then any $m - 1$ rows are linearly independent.
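To prove this for the first $m - 1$ rows (any other choice of $m - 1$ rows is handled by relabelling rows and columns simultaneously), let $B = (a_{ij})_{1 \leq i, j \leq m-1}$ be the top-left block of $A$. It suffices to show that $B$ is invertible: the rows of $B$ are then linearly independent, and hence so are the first $m - 1$ rows of $A$, that is, $\ell(u_1), \ldots, \ell(u_{m-1})$. Suppose instead that $B$ is singular, so there exist $t_1, \ldots, t_{m-1}$, not all zero, with $\sum_{i=1}^{m-1} a_{ji} t_i = 0$ for every $j \leq m - 1$. Choose $k$ with $|t_k| = \max_i |t_i|$; since not all $t_i$ are zero, $|t_k| > 0$. After dividing by $t_k$, we may assume $t_k = 1$ and $|t_i| \leq 1$ for all $i$. The $k$th of these equations gives: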
\begin{align*}
0 = \sum_{i=1}^{m-1} t_i a_{ki}.
\end{align*}
For $i = k$, we have $t_k a_{kk} = a_{kk} > 0$. For $i \neq k$ with $i \leq m-1$, we have $a_{ki} < 0$ and $|t_i| \leq 1 = |t_k|$. Since $t_k = 1 > 0$ and we chose $k$ to maximise $|t_i|$, we have two cases. If $t_i \geq 0$, then $t_i a_{ki} \geq t_i \cdot a_{ki} \geq a_{ki}$ (since $0 \leq t_i \leq 1$ and $a_{ki} \leq 0$, so $t_i a_{ki} \geq a_{ki}$). If $t_i < 0$, then $t_i a_{ki} > 0 > a_{ki}$, so again $t_i a_{ki} \geq a_{ki}$. In both cases $t_i a_{ki} \geq a_{ki}$, with the combined bound:
\begin{align*}
0 = \sum_{i=1}^{m-1} t_i a_{ki} \geq \sum_{i=1}^{m-1} a_{ki}.
\end{align*}
But $\sum_{i=1}^m a_{ki} = 0$ (row-sum zero), so $\sum_{i=1}^{m-1} a_{ki} = -a_{km} > 0$ (since $a_{km} < 0$ as $m \neq k$). This gives $0 \geq -a_{km} > 0$, a contradiction. Hence the first $m - 1$ rows are linearly independent.
Applying this with $m = r + s$ shows $\ell(u_1), \ldots, \ell(u_{r+s-1})$ are linearly independent in $H \cong \mathbb{R}^{r+s-1}$. These span a full-rank sublattice of $\operatorname{im} \ell$, so $\operatorname{rank}(\operatorname{im} \ell) = r + s - 1$.
Assembling the pieces: we have shown $\operatorname{im} \ell \cong \mathbb{Z}^{r+s-1}$ and $\ker \ell = \mu_L$ is finite cyclic. The short exact sequence
\begin{align*}
1 \to \mu_L \to \mathcal{O}_L^\times \xrightarrow{\ell} \mathbb{Z}^{r+s-1} \to 0
\end{align*}
splits (since $\mathbb{Z}^{r+s-1}$ is free abelian), giving $\mathcal{O}_L^\times \cong \mu_L \times \mathbb{Z}^{r+s-1}$.
The sign-pattern units $u_k$ are not themselves fundamental units — they may be far from minimal — but their log-images are the right shape to force linear independence. The diagonal dominance argument is why the sign pattern is designed the way it is: making $y_k > 0$ and all other $y_i < 0$ ensures each $u_k$ "knows about" dimension $k$ that the others do not.
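The rank statement behind the linear algebra fact can be checked numerically. The following is a minimal sketch, not part of the proof: the helper names `rank` and `random_sign_pattern_matrix` are illustrative, and the matrices are random integer matrices with the same sign pattern and zero row sums as the matrix of log-vectors, not actual log-vectors of units.

```python
from fractions import Fraction
import random

def rank(rows):
    """Rank of a matrix with rational entries, by exact Gaussian elimination."""
    mat = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(mat[0])):
        pivot = next((r for r in range(rk, len(mat)) if mat[r][col] != 0), None)
        if pivot is None:
            continue
        mat[rk], mat[pivot] = mat[pivot], mat[rk]
        for r in range(rk + 1, len(mat)):
            factor = mat[r][col] / mat[rk][col]
            mat[r] = [a - factor * b for a, b in zip(mat[r], mat[rk])]
        rk += 1
    return rk

def random_sign_pattern_matrix(m, seed=0):
    """Integer matrix with negative off-diagonal entries and zero row sums."""
    rng = random.Random(seed)
    rows = []
    for i in range(m):
        row = [-rng.randint(1, 9) for _ in range(m)]
        row[i] = 0
        row[i] = -sum(row)  # positive diagonal entry that makes the row sum zero
        rows.append(row)
    return rows

A = random_sign_pattern_matrix(5)
print(rank(A))  # the lemma predicts rank m - 1
```

Exact rational arithmetic is used so that the rank is not affected by floating-point rounding.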
## The Regulator
The unit theorem gives a lattice structure: $\ell(\mathcal{O}_L^\times) \cong \mathbb{Z}^{r+s-1}$ sits inside the hyperplane $H \cong \mathbb{R}^{r+s-1}$. But different number fields can have the same rank $r + s - 1$ yet very different unit groups: the lattice might be sparse or dense, the generators might be small or astronomically large. How can we measure how large this lattice is?
[definition: Regulator]
The **regulator** of a number field $L$ is the covolume of the unit lattice inside $H$, where $H$ is identified with $\mathbb{R}^{r+s-1}$ by forgetting any one coordinate and is given the resulting Lebesgue measure:
\begin{align*}
R_L = \operatorname{covol}\bigl(\ell(\mathcal{O}_L^\times) \subseteq H\bigr).
\end{align*}
Concretely, if $\varepsilon_1, \ldots, \varepsilon_{r+s-1} \in \mathcal{O}_L^\times$ are generators of the free part of $\mathcal{O}_L^\times$, then $R_L$ is the absolute value of any $(r+s-1) \times (r+s-1)$ minor obtained by deleting one row from the $(r+s) \times (r+s-1)$ matrix $(\ell(\varepsilon_1) \mid \cdots \mid \ell(\varepsilon_{r+s-1}))$ whose columns are the log-vectors. The value of $R_L$ is independent of the deleted row because every column sums to zero. This normalisation differs from the Euclidean covolume of the lattice inside $H \subseteq \mathbb{R}^{r+s}$ by a factor of $\sqrt{r+s}$; the minor is the standard convention, and it is the one that gives $R_L = \log \varepsilon_0$ for a real quadratic field with fundamental unit $\varepsilon_0 > 1$.
[/definition]
A large regulator means the fundamental units are large: the generators of the free part of $\mathcal{O}_L^\times$ have large absolute values. A small regulator means the units are clustered near 1 in absolute value. The regulator appears alongside the discriminant, class number, and number of roots of unity in the analytic class number formula, which expresses the residue of the Dedekind zeta function $\zeta_L(s)$ at $s = 1$ in terms of these invariants. For imaginary quadratic fields (where $r + s - 1 = 0$), the convention is $R_L = 1$.
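To make the minor description concrete, here is a small numerical sketch for $\mathbb{Q}(\sqrt{2})$ with the unit $1 + \sqrt{2}$ (shown to be fundamental later in this chapter). The function name `log_vector_real_quadratic` is an illustrative choice. With $r + s = 2$ the matrix of log-vectors is a single column with two entries, so each minor is just the absolute value of one entry.

```python
import math

def log_vector_real_quadratic(a, b, d):
    """ell(a + b*sqrt(d)) = (log|a + b*sqrt(d)|, log|a - b*sqrt(d)|) for d > 0."""
    root = math.sqrt(d)
    return (math.log(abs(a + b * root)), math.log(abs(a - b * root)))

v = log_vector_real_quadratic(1, 1, 2)   # the unit 1 + sqrt(2)

# The coordinates sum to zero, so the vector lies in the hyperplane H.
print(v[0] + v[1])

# Deleting either row leaves a 1x1 minor; both have the same absolute value.
minor_delete_first = abs(v[1])
minor_delete_second = abs(v[0])
print(minor_delete_first, minor_delete_second)
```

Both minors agree up to rounding, which is the independence-of-row statement in the smallest possible case.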
## Units in Quadratic Fields
Having established the general theorem and the regulator that measures the size of the unit lattice, we now return to concrete computations. Quadratic fields offer the simplest non-trivial testing ground: the signature is either $(0, 1)$ or $(2, 0)$, so the rank is either $0$ or $1$, and explicit computation is feasible. Working through these cases in detail serves two purposes — confirming the general theory gives the right answer, and developing the computational techniques (Pell-equation solving, minimality verification) that underpin all subsequent explicit unit calculations.
### Imaginary Quadratic Fields
For $d < 0$ square-free, we have $r = 0$, $s = 1$, so $r + s - 1 = 0$ and $\mathcal{O}_L^\times = \mu_L$ is a finite cyclic group. The norm equation $|N(\alpha)| = 1$ becomes $|\sigma(\alpha)|^2 = 1$, confining units to the unit circle in $\mathbb{C}$, a compact set. Since $\sigma(\mathcal{O}_L)$ is a lattice in $\mathbb{C}$, only finitely many elements of $\mathcal{O}_L$ lie on the unit circle, so the unit group is finite.
[example: Units in Imaginary Quadratic Fields]
The unit group of $\mathcal{O}_L$ for $L = \mathbb{Q}(\sqrt{d})$, $d < 0$, is determined as follows.
**Case $d = -1$:** $\mathcal{O}_L = \mathbb{Z}[i]$. The norm equation is $x^2 + y^2 = 1$ for $x, y \in \mathbb{Z}$. The integer solutions are $(x, y) \in \{(\pm 1, 0), (0, \pm 1)\}$, giving units $\{1, -1, i, -i\} \cong \mathbb{Z}/4\mathbb{Z}$.
**Case $d = -3$:** $\mathcal{O}_L = \mathbb{Z}[\omega]$ where $\omega = \frac{1}{2}(1 + \sqrt{-3})$. A general element is $\alpha = a + b\omega$ for $a, b \in \mathbb{Z}$, with norm $N(\alpha) = a^2 + ab + b^2$. Setting this equal to 1, we look for integer solutions to $a^2 + ab + b^2 = 1$. Completing the square: $(a + b/2)^2 + 3b^2/4 = 1$, so $b^2 \leq 4/3$, giving $b \in \{0, \pm 1\}$. Checking each: $(b = 0)$ gives $a = \pm 1$; $(b = 1)$ gives $a^2 + a = 0$, so $a \in \{0, -1\}$; $(b = -1)$ gives $a^2 - a = 0$, so $a \in \{0, 1\}$. Using $\omega^2 = \omega - 1$, this yields six units: $\{1, -1, \omega, -\omega, \omega^2, -\omega^2\} \cong \mathbb{Z}/6\mathbb{Z}$, generated by $\omega$, which is a primitive sixth root of unity.
**Case $d = -2$:** $\mathcal{O}_L = \mathbb{Z}[\sqrt{-2}]$. The norm equation is $x^2 + 2y^2 = 1$ for $x, y \in \mathbb{Z}$. Since $2y^2 \geq 0$, we need $x^2 \leq 1$, so $x \in \{0, \pm 1\}$. If $x = 0$: $2y^2 = 1$, no integer solution. If $x = \pm 1$: $y^2 = 0$, so $y = 0$, giving units $\pm 1$.
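**For all other $d < 0$:** If $d \equiv 2, 3 \pmod{4}$, the norm equation is $x^2 + |d|y^2 = 1$ with $|d| \geq 5$, which forces $y = 0$ and $x = \pm 1$. If $d \equiv 1 \pmod{4}$, elements have the form $\frac{1}{2}(x + y\sqrt{d})$ with $x \equiv y \pmod 2$, and the norm equation $x^2 + |d|y^2 = 4$ with $|d| \geq 7$ forces $y = 0$ and $x = \pm 2$. In both cases $\mathcal{O}_L^\times = \{\pm 1\} \cong \mathbb{Z}/2\mathbb{Z}$.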
[/example]
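The case analysis in this example can be reproduced by brute force. The sketch below (the function name `imaginary_quadratic_unit_count` is illustrative) assumes $d < 0$ is square-free and simply enumerates the finitely many integer solutions of the norm equation.

```python
def imaginary_quadratic_unit_count(d):
    """Count units of the ring of integers of Q(sqrt(d)) for square-free d < 0.

    Elements are (x + y*sqrt(d))/2 with x = y (mod 2) when d = 1 (mod 4),
    and x + y*sqrt(d) otherwise; units are the solutions of norm = 1.
    """
    assert d < 0
    count = 0
    if d % 4 == 1:
        # Norm of (x + y*sqrt(d))/2 is (x^2 - d*y^2)/4, so solve x^2 + |d| y^2 = 4.
        for x in range(-2, 3):
            for y in range(-2, 3):
                if (x - y) % 2 == 0 and x * x - d * y * y == 4:
                    count += 1
    else:
        # Norm of x + y*sqrt(d) is x^2 - d*y^2, so solve x^2 + |d| y^2 = 1.
        for x in range(-1, 2):
            for y in range(-1, 2):
                if x * x - d * y * y == 1:
                    count += 1
    return count

for d in (-1, -2, -3, -7):
    print(d, imaginary_quadratic_unit_count(d))
```

The search ranges are small because the norm form is positive definite, which is exactly the compactness argument used above.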
The above case analysis matches Dirichlet's prediction exactly: rank zero means a finite unit group, and the cyclic structure emerges because any finite subgroup of a field's multiplicative group is cyclic. The torsion orders $2, 4, 6$ visible above are the only possibilities: $\mu_L$ always contains $-1$, so its order $n$ is even, and $\mathbb{Q}(\zeta_n) \subseteq L$ forces $\varphi(n) \leq 2$, which for even $n$ leaves only $n \in \{2, 4, 6\}$. In every case the regulator satisfies $R_L = 1$, consistent with the convention for rank-zero unit groups.
### Real Quadratic Fields and Fundamental Units
For $d > 0$, we have $r = 2$, $s = 0$, and $\mathcal{O}_L^\times \cong \{\pm 1\} \times \mathbb{Z}$. The regulator is $R_L = \log \varepsilon_0$, where $\varepsilon_0 > 1$ is the fundamental unit under a fixed real embedding $\sigma_1$. Finding $\varepsilon_0$ explicitly amounts to solving a Pell-type equation $x^2 - dy^2 = \pm 1$ (or $\pm 4$ when $d \equiv 1 \pmod 4$). No polynomial-time algorithm is known for computing fundamental units in general, though for quadratic fields continued fractions give a method that works well in practice.
A practical approach: guess a unit $\varepsilon$ (e.g., a small solution to $x^2 - dy^2 = \pm 1$), then verify it is fundamental by checking there is no unit strictly between $1$ and $\varepsilon$ in $\mathbb{R}$. The next two examples illustrate the method — first for a small discriminant where the fundamental unit is easy to spot, then for a larger discriminant where the first few candidates fail and one must search more carefully.
[example: Fundamental Unit of $\mathbb{Q}(\sqrt{2})$]
Consider $L = \mathbb{Q}(\sqrt{2})$, with $\mathcal{O}_L = \mathbb{Z}[\sqrt{2}]$. We try $\varepsilon = 1 + \sqrt{2}$. Since $\sqrt{2} \approx 1.414$, we have $\varepsilon \approx 2.414$. The norm is $N(\varepsilon) = (1 + \sqrt{2})(1 - \sqrt{2}) = 1 - 2 = -1$, so $\varepsilon$ is a unit.
To show $\varepsilon$ is fundamental, we check there is no unit $u = a + b\sqrt{2}$ with $1 < u < 1 + \sqrt{2}$ and $a, b \in \mathbb{Z}$ (or half-integer, but since $d = 2 \equiv 2 \pmod 4$, $\mathcal{O}_L = \mathbb{Z}[\sqrt{2}]$ and $a, b$ are integers). Since $u > 1 > 0$ and $|N(u)| = 1$, the conjugate $\bar{u} = a - b\sqrt{2}$ satisfies $|a - b\sqrt{2}| = 1/u < 1$. So $u > 1$ and $|\bar{u}| < 1$ mean $a + b\sqrt{2} > 1$ and $|a - b\sqrt{2}| < 1$.
Adding and subtracting these constraints pins down the signs of $a$ and $b$. Since $u > 1$ and $\bar{u} > -1$, we get $2a = u + \bar{u} > 0$, so $a \geq 1$. Since $u > 1$ and $\bar{u} < 1$, we get $2b\sqrt{2} = u - \bar{u} > 0$, so $b \geq 1$. Hence $u = a + b\sqrt{2} \geq 1 + \sqrt{2} = \varepsilon$, contradicting $u < 1 + \sqrt{2}$. No integer pair $(a, b)$ gives a unit strictly between 1 and $1 + \sqrt{2}$, so $\varepsilon_0 = 1 + \sqrt{2}$ is the fundamental unit and $R_L = \log(1 + \sqrt{2}) \approx 0.881$.
[/example]
For $\mathbb{Q}(\sqrt{2})$ the fundamental unit was visible at the smallest value of $y$, so the search terminated immediately. The next example shows that even for modest discriminants, one may need to check several values before the Pell equation yields a solution — and that the resulting fundamental unit can already be significantly larger than the naive first guess would suggest.
[example: Fundamental Unit of $\mathbb{Q}(\sqrt{11})$]
Consider $L = \mathbb{Q}(\sqrt{11})$. Since $11 \equiv 3 \pmod{4}$, $\mathcal{O}_L = \mathbb{Z}[\sqrt{11}]$. We look for small solutions to $x^2 - 11y^2 = \pm 1$: try $y = 1$: $x^2 = 11 \pm 1$, so $x^2 = 12$ or $x^2 = 10$, neither a perfect square. Try $y = 2$: $x^2 = 44 \pm 1$, so $x^2 = 45$ or $x^2 = 43$, neither perfect. Try $y = 3$: $x^2 = 99 \pm 1$, so $x^2 = 100$ or $x^2 = 98$. We have $x^2 = 100$, giving $x = 10$.
So $\varepsilon = 10 - 3\sqrt{11}$ satisfies $N(\varepsilon) = 100 - 9 \cdot 11 = 1$. Since $3\sqrt{11} \approx 9.95$, we have $\varepsilon \approx 0.05 < 1$, so $\varepsilon^{-1} = 10 + 3\sqrt{11} \approx 19.95 > 1$ is the natural fundamental unit candidate.
To show $\varepsilon^{-1} = 10 + 3\sqrt{11}$ is fundamental, suppose $u = a + b\sqrt{11}$ is a unit with $1 < u < 10 + 3\sqrt{11}$. Since $u > 1$ and $|N(u)| = 1$, we argue in two steps.
First, if $N(u) = -1$, then $a^2 - 11b^2 = -1$, so $a^2 \equiv -1 \pmod{11}$. But $-1$ is not a quadratic residue mod $11$ (since $11 \equiv 3 \pmod 4$), a contradiction. So $N(u) = 1$.
With $N(u) = 1$, $u\bar{u} = 1$, so $\bar{u} = a - b\sqrt{11}$ satisfies $0 < \bar{u} < 1$ (since $u > 1$ and $\bar{u} = 1/u$). From $1 < a + b\sqrt{11}$ and $0 < a - b\sqrt{11} < 1$, adding gives $2a > 1$, so $a \geq 1$. Subtracting gives $0 < 2b\sqrt{11} = u - \bar{u} < u < 10 + 3\sqrt{11} < 7\sqrt{11}$ (using $10 < 4\sqrt{11}$), so $b < 7/2$, meaning $b \in \{1, 2, 3\}$. For each: $b = 1$ gives $a^2 = 11 + 1 = 12$, not a perfect square; $b = 2$ gives $a^2 = 44 + 1 = 45$, not a perfect square; $b = 3$ gives $a^2 = 99 + 1 = 100$, so $a = 10$ and $u = 10 + 3\sqrt{11}$, which is exactly $\varepsilon^{-1}$, not strictly less. So no unit lies strictly between 1 and $\varepsilon^{-1}$, confirming $\varepsilon_0 = 10 + 3\sqrt{11}$ is fundamental.
[/example]
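The search over $y$ used in the last two examples is easy to automate. The following is a minimal sketch under the assumption $d \equiv 2, 3 \pmod 4$, so that $\mathcal{O}_L = \mathbb{Z}[\sqrt{d}]$; the function name `smallest_pell_unit` and the cut-off `y_max` are illustrative choices. Since the units $\varepsilon_0^n > 1$ have increasing $y$-coordinate, the first hit is the fundamental unit.

```python
from math import isqrt

def smallest_pell_unit(d, y_max=10_000):
    """Smallest y >= 1 with x^2 - d*y^2 = -1 or +1; returns (x, y, norm) or None."""
    for y in range(1, y_max + 1):
        for norm in (-1, 1):
            target = d * y * y + norm
            x = isqrt(target)
            if x * x == target:
                return x, y, norm
    return None  # no unit found below the cut-off

print(smallest_pell_unit(2))    # the unit 1 + sqrt(2), of norm -1
print(smallest_pell_unit(11))   # the unit 10 + 3*sqrt(11), of norm +1
```

This brute-force search takes time proportional to the $y$-coordinate of the fundamental unit, so it is only practical when that coordinate is small.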
The $\mathbb{Q}(\sqrt{11})$ computation is representative of a ubiquitous phenomenon: fundamental units can grow dramatically with $d$, and there is no simple pattern. For instance, the fundamental unit of $\mathbb{Q}(\sqrt{94})$ is already $2143295 + 221064\sqrt{94}$, and examples with exponentially larger fundamental units exist for innocuous-looking $d$. This explosive growth makes algorithmic questions highly non-trivial — a phenomenon explored further in the following remark.
[remark: No General Efficient Algorithm]
While there exist good algorithms for finding fundamental units of quadratic fields (notably via the continued fraction expansion of $\sqrt{d}$, which is eventually periodic and whose period reveals the fundamental unit), no polynomial-time algorithm is known for number fields of arbitrary degree. The best general algorithms run in sub-exponential time using LLL lattice reduction applied to the log-unit lattice: one computes approximate log-images of candidate units and uses LLL to find a short vector, then checks exactness. This is in contrast to ideal class groups, where the Minkowski bound gives a finite search procedure whose complexity is more tractable.
[/remark]
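The continued fraction method mentioned in the remark can be sketched in a few lines. This is the standard recurrence for the expansion of $\sqrt{d}$ and its convergents, written here with the illustrative name `pell_via_continued_fraction`. It returns the smallest solution of $x^2 - dy^2 = \pm 1$, which is the fundamental unit of $\mathbb{Z}[\sqrt{d}]$; when $d \equiv 1 \pmod 4$ the full ring of integers may contain a smaller unit with half-integer coordinates, which this sketch does not look for.

```python
from math import isqrt

def pell_via_continued_fraction(d):
    """Smallest (x, y) with x^2 - d*y^2 = +1 or -1, for non-square d > 1."""
    a0 = isqrt(d)
    assert a0 * a0 != d, "d must not be a perfect square"
    m, q, a = 0, 1, a0          # state of the continued fraction of sqrt(d)
    h_prev, h = 1, a0           # numerators of successive convergents
    k_prev, k = 0, 1            # denominators of successive convergents
    while h * h - d * k * k not in (1, -1):
        m = q * a - m
        q = (d - m * m) // q
        a = (a0 + m) // q
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k

print(pell_via_continued_fraction(11))
print(pell_via_continued_fraction(94))
```

For $d = 94$ this reproduces the unit $2143295 + 221064\sqrt{94}$ quoted above after sixteen steps, where a brute-force search over $y$ would need more than two hundred thousand.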
The norm, trace, discriminant, class group, and unit group are all arithmetic invariants tied to the field $L$ itself. Chapter 8 reveals that these quantities appear together in the analytic class number formula, encoded in the residue of the Dedekind zeta function at $s = 1$, connecting the algebra of $L$ to the distribution of primes and analytic properties of $L$-functions.
# 8. $L$-functions and Dirichlet series
This chapter is non-examinable. It brings together the algebraic machinery developed throughout the course and connects it to analytic number theory. The central goal is Dirichlet's theorem on primes in arithmetic progressions: if $\gcd(a, q) = 1$, then there are infinitely many primes congruent to $a$ modulo $q$. The proof is analytic, and reaching it requires building up the theory of Dirichlet series, the Riemann zeta function, and Dirichlet $L$-functions. Along the way, we see how the zeta functions of number fields encode deep arithmetic information about their rings of integers.
## Infinitely Many Primes via Euler Products
The starting point is a new proof of an old result.
[quotetheorem:724]
[citeproof:724]
The proof is more than a curiosity: it shows that divergence of a sum over primes forces infinitely many primes. The key structure here is the **Euler product**, a factorisation of a sum over all positive integers into a product over primes. Euler's product converts a global analytic question (divergence of the harmonic series) into information about the distribution of primes, the most fundamental multiplicative structure in $\mathbb{Z}$. Dirichlet's theorem generalises this idea to primes in arithmetic progressions by replacing $\sum_p p^{-s}$ with a sum over primes in a congruence class, and analysing its behaviour near $s = 1$ using $L$-functions. The surprising fact is that no purely algebraic proof is known for general $a$ and $q$: in the argument below, the non-vanishing of certain analytic functions at a single point is genuinely the hard step.
## Dirichlet Series and the Riemann Zeta Function
Euler's product is a formal identity between a sum $\sum_n n^{-s}$ and an infinite product over primes — but does this expression actually converge, and if so, what analytic object does it define? Without a precise convergence theory, treating $\zeta(s)$ as a meromorphic function of a complex variable $s$ would be groundless. To make rigorous sense of this, we need the theory of Dirichlet series.
To see what can go wrong, consider the series $\sum_{n \geq 1} (-1)^{n-1} n^{-s}$. For real $s > 0$ this converges by the alternating series test (only conditionally when $0 < s \leq 1$), but $\sum_{n \geq 1} n^{-s}$ itself diverges for real $s \leq 1$. The partial sums of $a_n = (-1)^{n-1}$ are bounded ($0$ or $1$), which is precisely why the alternating series has better convergence than the harmonic series. A general theory should detect this from the growth of partial sums. The naive idea of plugging $s = 0$ or $s = -1$ into $\sum n^{-s}$ produces divergent expressions, and no amount of formal manipulation changes this: the series simply does not converge in those regions.
[definition: Dirichlet Series]
A **Dirichlet series** is a series of the form $\sum_{n \geq 1} a_n n^{-s}$, where $a_1, a_2, \ldots \in \mathbb{C}$ and $s \in \mathbb{C}$.
[/definition]
The most important Dirichlet series in number theory is the Riemann zeta function.
[definition: Riemann Zeta Function]
The **Riemann zeta function** is the Dirichlet series
\begin{align*}
\zeta(s) = \sum_{n \geq 1} n^{-s}, \quad s \in \mathbb{C}.
\end{align*}
[/definition]
To understand where $\zeta(s)$ converges, and more generally where Dirichlet series converge, we have the following lemma, proved by the standard technique of partial summation (Abel summation) from IA Analysis.
[quotetheorem:1616]
[citeproof:1616]
Applying this lemma to $\zeta(s)$ with $a_n = 1$ gives $T(N) = N = O(N^1)$, so $\zeta(s)$ converges and is holomorphic for $\operatorname{Re}(s) > 1$. The content of the theorem is a half-plane of convergence: the Dirichlet series converges uniformly on compact subsets of $\{\operatorname{Re}(s) > r\}$ and the sum is holomorphic there. Notice also what the theorem does NOT say: it does not assert the boundary behaviour is nice, nor does it give a formula for the abscissa of convergence. The hypothesis $T(N) = O(N^r)$ is sufficient but not necessary — the alternating series $\sum (-1)^{n-1} n^{-s}$ has $T(N) = O(1)$ (so $r = 0$), while $\sum n^{-s}$ has $r = 1$; the actual regions of convergence differ because of cancellation in the coefficients. This distinction will matter when we deal with $L(\chi, s)$ for non-trivial $\chi$: character orthogonality will force $T(N) = O(1)$, pushing the half-plane of holomorphicity all the way to $\operatorname{Re}(s) > 0$.
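The growth of the partial sums $T(N)$ is easy to inspect directly. The sketch below compares three coefficient sequences: $a_n = 1$, $a_n = (-1)^{n-1}$, and the character $\chi_{-4}$ that appears later in this chapter. The helper names `partial_sums` and `chi_minus_4` are illustrative.

```python
def partial_sums(coeff, n_max):
    """Return [T(1), ..., T(n_max)] where T(N) = a_1 + ... + a_N."""
    sums, total = [], 0
    for n in range(1, n_max + 1):
        total += coeff(n)
        sums.append(total)
    return sums

def chi_minus_4(n):
    """The character chi_{-4}: values 1, 0, -1, 0 on n = 1, 2, 3, 0 (mod 4)."""
    return (0, 1, 0, -1)[n % 4]

ones = partial_sums(lambda n: 1, 1000)
alternating = partial_sums(lambda n: (-1) ** (n - 1), 1000)
character = partial_sums(chi_minus_4, 1000)

# Largest |T(N)| for N <= 1000 in each case.
print(max(map(abs, ones)), max(map(abs, alternating)), max(map(abs, character)))
```

The first sequence has $T(N) = N$, while the other two stay bounded, which is the $r = 1$ versus $r = 0$ distinction in the lemma.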
The zeta function has three key properties that make it central to number theory.
[quotetheorem:1617]
Part (i) follows immediately from the Dirichlet series lemma. For part (ii), the idea is to write $\frac{1}{s-1} = \sum_{n=1}^\infty \int_n^{n+1} x^{-s}\,dx$ and check that $\sum_n \phi_n$ converges uniformly on compact subsets of $\{\operatorname{Re}(s) > 0\}$, where $\phi_n = n^{-s} - \int_n^{n+1} x^{-s}\,dx$.
For part (iii), given the first $r$ primes $p_1, \ldots, p_r$, expanding each geometric series gives $\prod_{i=1}^r (1 - p_i^{-s})^{-1} = \sum n^{-s}$, where the sum ranges over positive integers whose prime factors lie among $p_1, \ldots, p_r$. In particular all integers $1, \ldots, r$ appear, so
\begin{align*}
\left|\zeta(s) - \prod_{i=1}^r (1 - p_i^{-s})^{-1}\right| \leq \sum_{n > r} n^{-\operatorname{Re}(s)} \to 0 \quad \text{as } r \to \infty.
\end{align*}
Absolute convergence follows since $\sum_p p^{-\operatorname{Re}(s)} \leq \sum_n n^{-\operatorname{Re}(s)} < \infty$ and the product $\prod(1 - a_n)$ converges absolutely if and only if $\sum |a_n|$ converges.
These three properties together make $\zeta(s)$ into a well-defined meromorphic function on $\{\operatorname{Re}(s) > 0\}$ that simultaneously encodes the multiplicative structure of $\mathbb{Z}$ (via the Euler product) and the additive structure (via the Dirichlet series). The simple pole at $s = 1$ with residue $1$ is the analytic shadow of the fact that there are "roughly $N/\log N$" primes up to $N$ — a quantitative statement that becomes the prime number theorem. The Euler product is crucial: it shows $\zeta(s) \neq 0$ for $\operatorname{Re}(s) > 1$ (since each factor is non-zero and the product converges), a fact that the Dirichlet series representation alone does not guarantee.
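The Euler product can be tested numerically at $s = 2$, where the standard value $\zeta(2) = \pi^2/6$ is available for comparison (this value is a classical fact not derived in these notes). The sketch below uses illustrative helper names `primes_up_to` and `euler_product` and a simple sieve.

```python
import math

def primes_up_to(n):
    """All primes <= n, by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def euler_product(s, prime_bound):
    """Truncated Euler product: prod over primes p <= prime_bound of (1 - p^-s)^-1."""
    product = 1.0
    for p in primes_up_to(prime_bound):
        product /= 1.0 - p ** (-s)
    return product

print(euler_product(2, 10_000), math.pi ** 2 / 6)
```

Each factor is non-zero and the truncated products approach $\zeta(2)$ from below, consistent with the non-vanishing of $\zeta(s)$ for $\operatorname{Re}(s) > 1$.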
## The Zeta Function of a Number Field
The Euler product of $\zeta_\mathbb{Q}$ runs over rational primes, but for a number field $L$ the natural primes are the prime ideals of $\mathcal{O}_L$. The rational prime Euler product is insufficient for number field arithmetic: a prime $p \in \mathbb{Z}$ contributes a single factor $(1 - p^{-s})^{-1}$ to $\zeta_\mathbb{Q}$, but in $\mathcal{O}_L$ the ideal $(p)$ may split, remain prime, or ramify — three fundamentally different behaviours that a product over rational primes cannot distinguish. What zeta function is formed by a prime-ideal Euler product, and does it encode the arithmetic of $L$?
[definition: Zeta Function of a Number Field]
Let $L \supseteq \mathbb{Q}$ be a number field with $[L:\mathbb{Q}] = n$. The **zeta function of $L$** is
\begin{align*}
\zeta_L(s) = \sum_{\mathfrak{a} \lhd \mathcal{O}_L} N(\mathfrak{a})^{-s},
\end{align*}
where the sum runs over all non-zero ideals of $\mathcal{O}_L$.
[/definition]
When $L = \mathbb{Q}$, the non-zero ideals of $\mathbb{Z}$ are $(n)$ for $n \geq 1$ and $N((n)) = n$, so $\zeta_\mathbb{Q}(s) = \zeta(s)$ is the Riemann zeta function.
The key theorem about $\zeta_L$ mirrors the properties of $\zeta$, but now the residue at $s = 1$ encodes deep arithmetic invariants of $L$.
[quotetheorem:1618]
The proof does not require new ideas beyond those already seen. The Euler product identity holds formally as an immediate consequence of unique factorisation of ideals into prime ideals; convergence then reduces to estimating the number of ideals of fixed norm, which is done geometrically (using Minkowski-type arguments), and that geometric counting is precisely what produces the residue in the analytic class number formula.
The analytic class number formula is one of the most striking results in algebraic number theory. It says that a single complex-analytic datum (the residue of a Dirichlet series at a pole) encodes the class number, the regulator, the discriminant, and the number of roots of unity simultaneously. In particular, the residue is always positive and non-zero, since all factors $|\mathrm{Cl}_L|, 2^{r_1}, (2\pi)^{r_2}, R_L, |D_L|^{1/2}, |\mu_L|$ are positive real numbers. The formula does not say that these invariants are computable from $\zeta_L$ individually; one needs additional structure to disentangle them. What it does say is that $\zeta_L$ carries all of them together, as a single product. The extension to $\operatorname{Re}(s) > 1 - 1/n$ (and in fact much further by the full theory) is what makes it possible to speak of the pole of $\zeta_L$ at $s = 1$ and to extract arithmetic information from its residue.
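The geometric counting behind the residue can be seen numerically for $L = \mathbb{Q}(i)$. The sketch below assumes two standard facts: that every ideal of $\mathbb{Z}[i]$ is principal with exactly four generators, and that the number of ideals of norm at most $X$ grows like the residue times $X$. For this field the residue is $\pi/4$, the value given by the class number formula with trivial class group, $|D_L| = 4$ and $|\mu_L| = 4$. The function name `gaussian_ideal_count` is illustrative.

```python
import math

def gaussian_ideal_count(bound):
    """Number of non-zero ideals of Z[i] with norm <= bound."""
    r = math.isqrt(bound)
    points = sum(
        1
        for x in range(-r, r + 1)
        for y in range(-r, r + 1)
        if 0 < x * x + y * y <= bound
    )
    return points // 4  # each ideal (x + iy) has exactly four generators

X = 10_000
print(gaussian_ideal_count(X) / X, math.pi / 4)
```

The count is a lattice-point count in a disc of area $\pi X$, divided by the number of units, which is where the factors $\pi$ and $|\mu_L|$ in the residue come from in this example.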
## Dirichlet $L$-Functions and Quadratic Fields
To see how $\zeta_L$ and the Euler product interact with the splitting behaviour of primes, we compute $\zeta_L$ explicitly for $L = \mathbb{Q}(\sqrt{d})$.
[example: Zeta Function of a Quadratic Field]
Let $L = \mathbb{Q}(\sqrt{d})$ with discriminant $D$ (which equals $d$ or $4d$ depending on $d \bmod 4$). Every prime ideal $\mathfrak{p}$ of $\mathcal{O}_L$ lies above a unique rational prime $p$, and we enumerate the Euler factors prime by prime.
- If $p \mid D$: the prime ramifies, $(p) = \mathfrak{p}^2$, $N(\mathfrak{p}) = p$. The Euler factor is $(1 - p^{-s})^{-1}$.
- If $p$ remains prime in $\mathcal{O}_L$: $N((p)) = p^2$, giving a factor $(1 - p^{-2s})^{-1} = (1 - p^{-s})^{-1}(1 + p^{-s})^{-1}$.
- If $p$ splits: $(p) = \mathfrak{p}_1 \mathfrak{p}_2$ with $N(\mathfrak{p}_i) = p$, giving a factor $(1 - p^{-s})^{-2}$.
Collecting all primes, one finds $\zeta_L(s) = \zeta(s) L(\chi_D, s)$, where $\chi_D(p) = 0, -1, 1$ according to whether $p$ ramifies, remains prime, or splits.
[/example]
The calculation is striking: the splitting type of $p$ in $\mathcal{O}_L$ determines whether $\chi_D(p) = 0, -1$, or $1$, and the extra Euler factor at each prime captures exactly this arithmetic data. This is the sense in which $\zeta_L$ "knows more" than $\zeta_\mathbb{Q}$: the quotient $\zeta_L(s)/\zeta_\mathbb{Q}(s) = L(\chi_D, s)$ detects which primes split and which remain inert. The function $L(\chi_D, s)$ is an instance of a Dirichlet $L$-function. We now give the general definition.
[definition: Dirichlet Character]
A function $\chi: \mathbb{Z} \to \mathbb{C}$ is a **Dirichlet character of modulus $D$** if there exists a group homomorphism $\omega: (\mathbb{Z}/D\mathbb{Z})^\times \to \mathbb{C}^\times$ such that
\begin{align*}
\chi(m) = \begin{cases} \omega(m \bmod D) & \gcd(m, D) = 1 \\ 0 & \text{otherwise.} \end{cases}
\end{align*}
The character $\chi$ is **non-trivial** if $\omega$ is non-trivial.
[/definition]
Dirichlet characters are multiplicative: $\chi(mn) = \chi(m)\chi(n)$ for all $m, n$.
[definition: Dirichlet $L$-Function]
For a Dirichlet character $\chi$, the **Dirichlet $L$-function** is defined by the Euler product
\begin{align*}
L(\chi, s) = \prod_{p\text{ prime}} (1 - \chi(p)p^{-s})^{-1}.
\end{align*}
By multiplicativity of $\chi$, this equals the Dirichlet series $\sum_{n \geq 1} \chi(n) n^{-s}$ for $\operatorname{Re}(s) > 1$, where both converge absolutely.
[/definition]
To compute $L(\chi, s)$ in practice, one writes down $\chi$ explicitly on a complete set of residues modulo $D$. For example, to evaluate $L(\chi_{-4}, s)$ numerically, one notes that $\chi_{-4}$ takes values $1, 0, -1, 0$ on residues $1, 2, 3, 4$ respectively, so $L(\chi_{-4}, s) = 1 - 3^{-s} + 5^{-s} - 7^{-s} + \cdots$, grouping terms in blocks of length $4$ to see the cancellation. The character $\chi_D$ also provides a practical test for splitting: given an odd prime $p$, computing $\chi_D(p)$ via the Legendre symbol $\left(\frac{d}{p}\right)$ tells you immediately whether $p$ splits ($+1$), remains inert ($-1$), or ramifies ($0$, which happens exactly when $p \mid D$) in $\mathbb{Q}(\sqrt{d})$.
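The splitting test via the Legendre symbol can be sketched with Euler's criterion, $\left(\frac{d}{p}\right) \equiv d^{(p-1)/2} \pmod p$. The function name `splitting_type` is illustrative, and the sketch assumes that $d$ is square-free and that $p$ is an odd prime (primality is not checked).

```python
def splitting_type(d, p):
    """Splitting of an odd prime p in Q(sqrt(d)), via Euler's criterion."""
    assert p > 2, "the Legendre symbol is only used for odd primes"
    legendre = pow(d % p, (p - 1) // 2, p)  # equals 0, 1, or p - 1
    if legendre == 0:
        return "ramified"
    return "split" if legendre == 1 else "inert"

for p in (3, 5, 7, 11, 13):
    print(p, splitting_type(-1, p), splitting_type(11, p))
```

For $d = -1$ this recovers the familiar rule that $p$ splits in $\mathbb{Z}[i]$ exactly when $p \equiv 1 \pmod 4$.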
[example: The Character $\chi_{-4}$]
For $L = \mathbb{Q}(\sqrt{-1})$, the discriminant is $D = -4$. We have $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$ for odd primes $p$, and $\chi_{-4}(2) = 0$ since $2$ ramifies. Extending multiplicatively to all of $\mathbb{Z}$, the character $\chi_{-4}$ is
\begin{align*}
\chi_{-4}(m) = \begin{cases} (-1)^{(m-1)/2} & m \text{ odd} \\ 0 & m \text{ even.} \end{cases}
\end{align*}
Notice that $\chi_{-4}(m - 4) = \chi_{-4}(m)$, confirming it has period $4$. The corresponding $L$-function is
\begin{align*}
L(\chi_{-4}, s) = 1 - \frac{1}{3^s} + \frac{1}{5^s} - \frac{1}{7^s} + \cdots.
\end{align*}
[/example]
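The factorisation $\zeta_L(s) = \zeta(s) L(\chi_{-4}, s)$ for $L = \mathbb{Q}(i)$ can be checked coefficient by coefficient. Multiplying the two Dirichlet series shows that the coefficient of $n^{-s}$ is $\sum_{d \mid n} \chi_{-4}(d)$, and this should equal the number of ideals of $\mathbb{Z}[i]$ of norm $n$. The sketch below counts those ideals directly, assuming that every ideal of $\mathbb{Z}[i]$ is principal with four generators; the function names are illustrative.

```python
from math import isqrt

def chi_minus_4(m):
    """The character chi_{-4}: values 1, 0, -1, 0 on m = 1, 2, 3, 0 (mod 4)."""
    return (0, 1, 0, -1)[m % 4]

def ideals_of_norm_from_character(n):
    """Coefficient of n^-s in zeta(s) * L(chi_{-4}, s)."""
    return sum(chi_minus_4(d) for d in range(1, n + 1) if n % d == 0)

def ideals_of_norm_by_counting(n):
    """Ideals of Z[i] of norm n, counted as solutions of x^2 + y^2 = n up to units."""
    r = isqrt(n)
    points = sum(
        1
        for x in range(-r, r + 1)
        for y in range(-r, r + 1)
        if x * x + y * y == n
    )
    return points // 4

for n in (1, 2, 3, 5, 25, 65):
    print(n, ideals_of_norm_from_character(n), ideals_of_norm_by_counting(n))
```

Agreement of the two columns for every $n$ is the statement that the Euler factors were matched correctly at split, inert and ramified primes.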
The character $\chi_D$ arising from $\mathbb{Q}(\sqrt{d})$ is a Dirichlet character of modulus $D$.
[quotetheorem:1619]
[citeproof:1619]
Characters taking only values in $\{0, \pm 1\}$, as $\chi_D$ does, are called **quadratic Dirichlet characters**.
## Non-Vanishing of $L(\chi, 1)$
The key analytic input needed for Dirichlet's theorem is that $L(\chi, 1) \neq 0$ for non-trivial characters. We first show that $L(\chi, s)$ is holomorphic on $\operatorname{Re}(s) > 0$ for every non-trivial character, so that the value $L(\chi, 1)$ makes sense, and then deduce the non-vanishing for quadratic characters.
[quotetheorem:1620]
[citeproof:1620]
This gives us the non-vanishing at $s = 1$ for quadratic characters as a corollary of the analytic class number formula.
[quotetheorem:1621]
[citeproof:1621]
The argument is worth examining carefully. If $L(\chi_D, 1)$ were zero, then the zero would cancel the simple pole of $\zeta_\mathbb{Q}$, and $\zeta_{\mathbb{Q}(\sqrt{d})}(s) = \zeta_\mathbb{Q}(s) \cdot L(\chi_D, s)$ would be holomorphic at $s = 1$, contradicting the analytic class number formula, which guarantees a genuine simple pole with positive residue. The non-vanishing is thus not a miracle but a consequence of the finiteness of the class number and the specific form of the residue. The argument itself only uses that the residue is non-zero. Comparing the residues on both sides gives more, namely an explicit value for $L(\chi_D, 1)$. For $D < 0$:
\begin{align*}
L(\chi_D, 1) = \frac{2\pi\,|\mathrm{Cl}_{\mathbb{Q}(\sqrt{d})}|}{|D|^{1/2}\,|\mu_{\mathbb{Q}(\sqrt{d})}|}.
\end{align*}
[example: Leibniz Formula for $\pi$]
For $L = \mathbb{Q}(\sqrt{-1})$, the class group is trivial ($|\mathrm{Cl}_L| = 1$), the discriminant is $|D| = 4$, and $|\mu_L| = 4$ (the fourth roots of unity). The analytic class number formula gives
\begin{align*}
1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \frac{2\pi \cdot 1}{2 \cdot 4} = \frac{\pi}{4}.
\end{align*}
More generally, whenever we know the class number of an imaginary quadratic field, we obtain a series expansion for $\pi$. These series converge extremely slowly.
[/example]
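The slow convergence is easy to observe numerically. The sketch below sums $\sum \chi(n)/n$ over complete periods for $\chi_{-4}$ and, as a second illustration not worked out in the text, for $\chi_{-3}$. For the latter the predicted value $\pi/(3\sqrt{3})$ assumes the standard fact that $\mathbb{Q}(\sqrt{-3})$ has class number $1$, together with $|D| = 3$ and $|\mu_L| = 6$. The function name `l_value_at_one` is illustrative.

```python
import math

def l_value_at_one(values, periods):
    """Partial sum of sum chi(n)/n over `periods` full periods.

    `values[k]` is chi(k + 1), listed over one period of the character.
    """
    q = len(values)
    total = 0.0
    for block in range(periods):
        for k, chi in enumerate(values):
            if chi:
                total += chi / (block * q + k + 1)
    return total

leibniz = l_value_at_one([1, 0, -1, 0], 50_000)     # chi_{-4}
eisenstein = l_value_at_one([1, -1, 0], 50_000)     # chi_{-3}

print(leibniz, math.pi / 4)
print(eisenstein, 2 * math.pi / (math.sqrt(3) * 6))
```

Summing over whole periods makes the cancellation within each block visible; the remaining error still decays only like the reciprocal of the number of periods.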
## The Zeta Function of a Cyclotomic Field
Cyclotomic fields $\mathbb{Q}(\omega_q)$ are the case where $\zeta_L$ factors as a product over all the Dirichlet $L$-functions of a fixed modulus $q$, and this factorisation is what drives the proof of Dirichlet's theorem. The reason this works for cyclotomic fields (and, more generally, for abelian extensions of $\mathbb{Q}$ such as the quadratic fields above) and not for arbitrary number fields is that $\operatorname{Gal}(\mathbb{Q}(\omega_q)/\mathbb{Q}) \cong (\mathbb{Z}/q\mathbb{Z})^\times$ is abelian, so all its irreducible representations are one-dimensional characters, which are exactly the Dirichlet characters of modulus $q$. For non-abelian extensions, the Galois group has higher-dimensional irreducible representations (entering the Langlands programme), and the Euler-factor calculation becomes far more involved.
We collect the relevant facts about cyclotomic extensions (proofs from Galois Theory).
[quotetheorem:1622]
Parts (i) and (ii) are from the Galois Theory course. Parts (iii) and (iv) are on the example sheet. Part (v) is Galois theory and is omitted here.
Part (v) is the key input for our zeta function computation: it tells us that the splitting type of $p$ in $\mathcal{O}_L$ is controlled entirely by the order of $p$ in $(\mathbb{Z}/q\mathbb{Z})^\times$, which is a purely group-theoretic datum. This is what allows the Euler-factor calculation to be expressed in terms of characters of $(\mathbb{Z}/q\mathbb{Z})^\times$.
Now we can factor $\zeta_L$ into a product of Dirichlet $L$-functions. Let $\omega_1, \ldots, \omega_{\varphi(q)}: (\mathbb{Z}/q\mathbb{Z})^\times \to \mathbb{C}^\times$ be the distinct irreducible (one-dimensional) characters, with $\omega_1$ the trivial character. Let $\chi_i$ be the corresponding Dirichlet characters.
For $p \nmid q$, the prime $p$ generates a cyclic subgroup $\langle p \rangle \subset (\mathbb{Z}/q\mathbb{Z})^\times$ of order $f$. The values $\omega_1(p), \ldots, \omega_{\varphi(q)}(p)$ are then the $f$th roots of unity, each appearing $\varphi(q)/f$ times. This follows from representation theory: the restriction of the regular representation of $(\mathbb{Z}/q\mathbb{Z})^\times$ to $\langle p \rangle \cong \mathbb{Z}/f$ is $\frac{\varphi(q)}{f}$ copies of the regular representation of $\mathbb{Z}/f$. Using the factorisation $1 - t^f = \prod_{\gamma^f = 1}(1 - \gamma t)$ with $t = p^{-s}$:
\begin{align*}
(1 - p^{-fs})^{-\varphi(q)/f} = \prod_{i=1}^{\varphi(q)} (1 - \omega_i(p) p^{-s})^{-1}.
\end{align*}
[quotetheorem:1623]
[citeproof:1623]
This factorisation is the analytic analogue of the group-theoretic decomposition of the regular representation of $(\mathbb{Z}/q\mathbb{Z})^\times$ into irreducible characters. Each $L(\chi_i, s)$ captures a single "frequency component" of the arithmetic of $\mathcal{O}_L$, and the product of all such components reassembles the full zeta function. The factorisation also reveals the structure of the singularity at $s=1$: only $L(\chi_1, s) = \zeta_\mathbb{Q}(s)$ (up to a finite correction) has a pole there, since all other $\chi_i$ are non-trivial and thus $L(\chi_i, s)$ is holomorphic at $s = 1$ by the holomorphicity theorem. This means that the residue of $\zeta_L$ at $s = 1$ equals $\prod_{i=2}^{\varphi(q)} L(\chi_i, 1)$ times the residue of $\zeta_\mathbb{Q}$, again up to the finite correction at primes dividing $q$, so the class number formula becomes a formula for this product of $L$-values.
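The Euler-factor identity $\prod_i (1 - \omega_i(p) t) = (1 - t^f)^{\varphi(q)/f}$ that underlies the factorisation can be verified numerically. The sketch below restricts to prime $q$ as a simplifying assumption, so that $(\mathbb{Z}/q\mathbb{Z})^\times$ is cyclic and its characters can be written down from a primitive root $g$ as $\omega_j(g^k) = e^{2\pi i jk/(q-1)}$. All function names are illustrative.

```python
import cmath

def multiplicative_order(p, q):
    """Order of p in (Z/qZ)^x, assuming gcd(p, q) = 1."""
    assert p % q != 0
    order, power = 1, p % q
    while power != 1:
        power = power * p % q
        order += 1
    return order

def primitive_root(q):
    """Smallest generator of (Z/qZ)^x for a prime q."""
    return next(g for g in range(1, q) if multiplicative_order(g, q) == q - 1)

def euler_factor_sides(q, p, t):
    """Both sides of prod_i (1 - omega_i(p) t) = (1 - t^f)^(phi(q)/f), q prime."""
    g = primitive_root(q)
    phi = q - 1
    log_p = next(k for k in range(phi) if pow(g, k, q) == p % q)  # p = g^k mod q
    lhs = 1
    for idx in range(phi):
        # omega_idx(p) = exp(2*pi*i * idx * log_p / phi)
        lhs *= 1 - cmath.exp(2j * cmath.pi * idx * log_p / phi) * t
    f = multiplicative_order(p, q)
    rhs = (1 - t ** f) ** (phi // f)
    return lhs, rhs

print(euler_factor_sides(7, 2, 0.3))   # here f = 3 and phi(q)/f = 2
```

Taking $t = p^{-s}$ and inverting both sides gives the identity in the form used in the text.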
## Non-Vanishing for All Dirichlet Characters
The quadratic case established $L(\chi_D, 1) \neq 0$ for real characters via the class number formula for quadratic fields. But Dirichlet's theorem requires non-vanishing for ALL non-trivial characters of ANY modulus, including complex characters. The obstruction to a direct generalisation is that a complex character $\chi$ is not of the form $\chi_D$ for any quadratic field, so the two-factor identity $\zeta_L(s) = \zeta(s) \cdot L(\chi, s)$ does not directly apply. The cyclotomic factorisation provides a substitute: instead of a product of two functions, we have a product of $\varphi(q)$ functions, and a zero of one factor at $s = 1$ would cancel the simple pole that $\zeta_L$ is known to have there.
[quotetheorem:1624]
[citeproof:1624]
Non-vanishing of $L(\chi, 1)$ is the one genuinely hard analytic step in the whole theory. The rest of the argument (the Euler product, the character orthogonality, the logarithm expansion) consists of formal computations. But if even one $L(\chi_i, 1)$ vanished, the argument for primes in progressions would collapse: $\log L(\chi_i, s)$ would tend to $-\infty$ as $s \to 1^+$, and this could cancel the divergence to $+\infty$ of $\sum_i \overline{\chi_i(a)} \log L(\chi_i, s)$ that guarantees infinitely many primes in the progression. The proof above gives non-vanishing "for free" by a counting argument on poles, but notice that it tells us nothing about where in $\mathbb{C}$ the functions $L(\chi, s)$ might vanish for $\operatorname{Re}(s) < 1$. The distribution of zeros of $L$-functions in the critical strip $\{0 < \operatorname{Re}(s) < 1\}$ is the subject of the Riemann hypothesis and its generalisations, and remains open.
## Dirichlet's Theorem on Primes in Arithmetic Progressions
We now prove the main theorem. The strategy is to express the sum $\sum_{p \equiv a \pmod{q}} p^{-s}$ in terms of $\log L(\chi_i, s)$ via character orthogonality, and then show that this sum diverges as $s \to 1^+$.
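In symbols, the strategy can be sketched as follows for $\gcd(a, q) = 1$ and real $s > 1$. This is the standard form of the argument, written with $\chi_1$ the trivial character; the cited proof may arrange the steps differently.

\begin{align*}
\frac{1}{\varphi(q)} \sum_{i=1}^{\varphi(q)} \overline{\chi_i(a)}\, \chi_i(n) &= \begin{cases} 1 & \text{if } n \equiv a \pmod{q}, \\ 0 & \text{otherwise}, \end{cases} \\
\sum_{p \equiv a \pmod{q}} p^{-s} &= \frac{1}{\varphi(q)} \sum_{i=1}^{\varphi(q)} \overline{\chi_i(a)} \sum_{p} \chi_i(p)\, p^{-s}, \\
\sum_{p} \chi_i(p)\, p^{-s} &= \log L(\chi_i, s) + O(1) \quad \text{as } s \to 1^+.
\end{align*}

Only the trivial character contributes a divergent term, $\log L(\chi_1, s) = \log\frac{1}{s-1} + O(1)$; non-vanishing of $L(\chi_i, 1)$ keeps every other term bounded.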
[quotetheorem:1625]
[citeproof:1625]
Dirichlet's theorem answers a qualitative question (infinitely many or not) but says nothing about density. In fact the primes are equidistributed among the $\varphi(q)$ reduced residue classes modulo $q$: each class contains asymptotically $\pi(x)/\varphi(q)$ primes up to $x$. This is the prime number theorem for arithmetic progressions; its generalisation to arbitrary Galois extensions is Chebotarev's density theorem. The proof method itself already gives more than infinitude: by examining the singular part at $s = 1$, one finds that $\sum_{p \equiv a} p^{-s}$ behaves like $\frac{1}{\varphi(q)} \log\frac{1}{s-1}$ as $s \to 1^+$, so each congruence class gets an equal share of the "logarithmic weight" of primes (equidistribution in the sense of Dirichlet density). Notice also that the non-vanishing hypothesis is used at a single point: we need $L(\chi_i, s) \neq 0$ only at $s = 1$, and only to ensure that $\log L(\chi_i, s)$ stays bounded as $s \to 1^+$. A zero of some $L(\chi_i, s)$ at $s = 1$ would contribute a term tending to $-\infty$, which could cancel or dominate the divergence from the trivial character, making the argument fail.
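The equidistribution is easy to observe experimentally. Here is a small sketch that counts primes up to a bound $x$ in each reduced residue class modulo $q$; the function names and the choice $q = 10$, $x = 10^5$ are illustrative assumptions, not part of the course.

```python
from math import gcd

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n (for n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def residue_class_counts(q, x):
    """Number of primes p <= x in each class a mod q with gcd(a, q) = 1."""
    counts = {a: 0 for a in range(1, q) if gcd(a, q) == 1}
    for p in primes_up_to(x):
        if p % q in counts:  # primes dividing q fall in no reduced class
            counts[p % q] += 1
    return counts

counts = residue_class_counts(q=10, x=100_000)
for a, c in sorted(counts.items()):
    print(f"p = {a} (mod 10): {c} primes")
```

With $q = 10$ this tallies primes by last digit. The four counts come out close to one another, as the asymptotic $\pi(x)/\varphi(q)$ predicts, though a finite count is only an illustration.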
## Artin $L$-Functions and the Langlands Programme
The constructions in this chapter extend far beyond abelian extensions. For a Galois extension $L/\mathbb{Q}$ with $G = \operatorname{Gal}(L/\mathbb{Q})$, the zeta function always factors as
\begin{align*}
\zeta_L(s) = \prod_\rho L(\rho, s)^{\dim \rho},
\end{align*}
where $\rho$ ranges over the irreducible representations of $G$, and $L(\rho, s)$ is the **Artin $L$-function** associated to $\rho$. The representation $\rho = 1$ gives $L(1, s) = \zeta_\mathbb{Q}(s)$, and each non-trivial $\rho$ contributes its own $L$-function, which factorises as an Euler product $L(\rho, s) = \prod_{p} L_p(\rho, s)$ over Euler factors.
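For a concrete non-abelian instance, take $L$ to be the splitting field of $x^3 - 2$, so that $G \cong S_3$. This example is not from the text above; it is a sketch using the standard character table of $S_3$, whose irreducible representations are the trivial one, the sign character $\operatorname{sgn}$, and one two-dimensional representation, here called $\rho_2$.

\begin{align*}
\zeta_L(s) &= \zeta_\mathbb{Q}(s)\, L(\operatorname{sgn}, s)\, L(\rho_2, s)^2, \\
\zeta_{\mathbb{Q}(\sqrt{-3})}(s) &= \zeta_\mathbb{Q}(s)\, L(\operatorname{sgn}, s), \\
\zeta_{\mathbb{Q}(\sqrt[3]{2})}(s) &= \zeta_\mathbb{Q}(s)\, L(\rho_2, s).
\end{align*}

The exponents in the first line are the dimensions of the representations, matching the degree count $6 = 1^2 + 1^2 + 2^2$. The sign character factors through $\operatorname{Gal}(\mathbb{Q}(\sqrt{-3})/\mathbb{Q})$, so $L(\operatorname{sgn}, s)$ is the Dirichlet $L$-function of the quadratic character attached to $\mathbb{Q}(\sqrt{-3})$, while $L(\rho_2, s)$ is not a Dirichlet $L$-function of any character.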
By the **Kronecker--Weber theorem** (not proved in this course), every abelian extension of $\mathbb{Q}$ is contained in some cyclotomic field. So the cyclotomic computations above cover all abelian extensions, and the characters $\chi_i$ account for all one-dimensional representations.
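The simplest instance of Kronecker--Weber is that every quadratic field lies in a cyclotomic field. For an odd prime $p$ this is made explicit by the classical quadratic Gauss sum $g = \sum_{a=1}^{p-1} \left(\frac{a}{p}\right) \zeta_p^a$, which satisfies $g^2 = (-1)^{(p-1)/2} p$, so that $\mathbb{Q}(\sqrt{\pm p}) \subset \mathbb{Q}(\zeta_p)$ with the sign determined by $p \bmod 4$. The identity is a standard fact not proved in the text above; the sketch below only checks it numerically, and the function names are illustrative.

```python
import cmath

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """Quadratic Gauss sum: sum of (a/p) * zeta_p^a over a = 1, ..., p-1."""
    zeta = cmath.exp(2j * cmath.pi / p)  # primitive p-th root of unity
    return sum(legendre(a, p) * zeta ** a for a in range(1, p))

for p in (5, 7, 13):
    g = gauss_sum(p)
    p_star = (-1) ** ((p - 1) // 2) * p  # +p if p = 1 mod 4, -p if p = 3 mod 4
    print(f"p={p}: g^2 is approximately {g * g:.6f}, expected {p_star}")
```

Since $g$ is by construction an element of $\mathbb{Q}(\zeta_p)$, the identity exhibits $\sqrt{5} \in \mathbb{Q}(\zeta_5)$ and $\sqrt{-7} \in \mathbb{Q}(\zeta_7)$ directly.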
The function $L(\rho, s)$ extends meromorphically to all of $\mathbb{C}$, and it is conjectured to be holomorphic everywhere when $\rho$ is irreducible and $\rho \neq 1$ (the **Artin conjecture**, still open in general).
When $\rho$ is one-dimensional, Artin's $L$-function is a Dirichlet $L$-function $L(\chi, s)$ for some character $\chi$. Identifying which $\chi$ corresponds to a given $\rho$ is a far-reaching generalisation of quadratic reciprocity, and the general theory doing this is **class field theory**.
When $\dim \rho > 1$, one enters the domain of non-abelian class field theory, known as the **Langlands programme** — one of the central open problems in modern number theory and the subject of ongoing research connecting number theory, representation theory, and algebraic geometry.
## References
Grojnowski, I. *Number Fields*. Cambridge Part II Lecture Notes, Lent 2016.
Created by admin on 4/24/2026 | Last updated on 4/24/2026