These notes study the passage from classical modular forms to Galois representations. The guiding question is how analytic data, such as the Hecke eigenvalues of a modular eigenform, can encode arithmetic data over $\mathbb Q$. The course develops this question by moving through modular curves, Hecke correspondences, Jacobians, cohomology, elliptic curves, and finally two-dimensional representations of absolute Galois groups.
The chapters are arranged as a sequence of translations. Congruence conditions first become level structures on modular curves. Hecke eigenvalues then become geometric correspondences, and those correspondences act on cohomology and Tate modules. In weight $2$ this construction is visible through Jacobians of modular curves; in higher weight it is completed by Deligne's construction. Later chapters use the resulting compatible systems to discuss residual representations, congruences, elliptic curves, modular parametrizations, the modularity theorem, and the route from modularity to [Fermat's Last Theorem](/theorems/4789).
The aim of this introductory chapter is not to prove the main theorems. It fixes the objects and compatibility conditions that the rest of the course will make precise: which coefficients of a modular form should become Frobenius traces, which geometric spaces carry the relevant operators, and why elliptic curves and modular eigenforms can be compared by their local data.
# Introduction
## What Does It Mean for a Modular Form to Be Arithmetic?
A modular form begins life as a [holomorphic function](/page/Holomorphic%20Function) on the upper half-plane with transformation laws and growth conditions. The first problem of the course is that this analytic definition does not yet explain why its Fourier coefficients should have Galois-theoretic meaning. A $q$-expansion alone gives a sequence of complex numbers, but it does not explain why the prime-index coefficients should satisfy local Euler factor identities, be compatible as $p$ varies, or match point counts over finite fields. Hecke operators provide the bridge: their eigenvalues are simultaneously analytic spectral data and arithmetic counting data on modular curves.
For the first definition, the notation is as follows. The congruence subgroup
\begin{align*}
\Gamma_1(N)=\left\{\begin{pmatrix}a&b\\ c&d\end{pmatrix}\in SL_2(\mathbb Z): c\equiv 0\pmod N,\ a\equiv d\equiv 1\pmod N\right\}
\end{align*}
acts on the upper half-plane by fractional linear transformations. The space $S_k(\Gamma_1(N))$ is the complex [vector space](/page/Vector%20Space) of cusp forms of weight $k$ for this subgroup, meaning modular forms whose expansions at all cusps have zero constant term. For each $n\ge 1$, the Hecke operator $T_n$ is the standard double-coset operator acting linearly on $S_k(\Gamma_1(N))$; later chapters rebuild this operator geometrically as a correspondence on modular curves. A simultaneous eigenvector for all $T_n$ is the analytic input whose eigenvalues will later be compared with Frobenius traces. When a character $\varepsilon$ appears later, it is the nebentypus character, the Dirichlet character recording the diamond-operator transformation factor of the form.
[definition: Normalised Hecke Eigenform]
Let $N \in \mathbb N$ and $k \ge 2$. A normalised Hecke eigenform of level $N$ and weight $k$ is a cusp form $f \in S_k(\Gamma_1(N))$ such that
\begin{align*}
f(q) = \sum_{n=1}^{\infty} a_n(f)q^n, \qquad a_1(f)=1,
\end{align*}
and $f$ is an eigenvector for all Hecke operators $T_n$ acting on $S_k(\Gamma_1(N))$.
[/definition]
The normalisation fixes the scale, so the eigenvalues can be read off as the coefficients $a_n(f)$ under the standard Hecke action. In this course, the coefficients $a_p(f)$ for primes $p \nmid N$ will become traces of Frobenius elements in Galois representations.
[example: Ramanujan Delta As A Prototype]
The discriminant form is
\begin{align*}
\Delta(q)=q\prod_{n=1}^{\infty}(1-q^n)^{24}=\sum_{n=1}^{\infty}\tau(n)q^n .
\end{align*}
The product has constant term $1$, so multiplying by $q$ gives first coefficient $\tau(1)=1$ and no constant term. Thus the displayed $q$-expansion is normalised, and the vanishing constant term is the cusp-form condition at the cusp $\infty$. The classical discriminant form is a level $1$ Hecke eigenform of weight $12$.
For a prime $p$, the prime-index Hecke eigenvalue is the coefficient $\tau(p)$. Since the level is $1$, the nebentypus is trivial, so the determinant term predicted for weight $12$ is
\begin{align*}
\varepsilon(p)p^{k-1}=1\cdot p^{12-1}=p^{11}.
\end{align*}
Hence the local Euler factor has the form
\begin{align*}
1-\tau(p)X+p^{11}X^2.
\end{align*}
This is exactly the shape of a two-dimensional Frobenius characteristic polynomial written in reciprocal form: if a $2 \times 2$ matrix $A$ has eigenvalues $\alpha$ and $\beta$, so that $\operatorname{tr}(A)=\alpha+\beta$ and $\det(A)=\alpha\beta$, then
\begin{align*}
\det(1-XA)&=(1-\alpha X)(1-\beta X)\\
&=1-(\alpha+\beta)X+\alpha\beta X^2\\
&=1-\operatorname{tr}(A)X+\det(A)X^2.
\end{align*}
Thus, for $\Delta$, in the arithmetic-Frobenius convention used in these notes, the comparison data at $p$ would be $\operatorname{tr}(\operatorname{Frob}_p)=\tau(p)$ and $\det(\operatorname{Frob}_p)=p^{11}$. The reusable phenomenon is that prime-index Hecke eigenvalues are the trace entries one compares with local Galois data once the Frobenius convention has been fixed.
[/example]
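The coefficients $\tau(n)$ in the example can be checked numerically. The following is a minimal sketch, not part of the course's own development: it expands the product $q\prod(1-q^n)^{24}$ as a truncated power series and tests two consequences of the eigenform property. The function name `ramanujan_tau` is a hypothetical helper, and the coefficient relation $\tau(pn)+p^{11}\tau(n/p)=\tau(p)\tau(n)$ (with $\tau(n/p)=0$ when $p\nmid n$) is the standard level $1$ Hecke formula, assumed here rather than derived in these notes.

```python
def ramanujan_tau(limit):
    """Return [None, tau(1), ..., tau(limit)] from q * prod_{n>=1} (1 - q^n)^24."""
    # coeffs[i] is the coefficient of q^i in prod (1 - q^n)^24, truncated at
    # degree limit - 1; the leading factor q shifts every index by one.
    coeffs = [0] * limit
    coeffs[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # Multiply the truncated series by (1 - q^n) in place.
            for i in range(limit - 1, n - 1, -1):
                coeffs[i] -= coeffs[i - n]
    return [None] + coeffs


tau = ramanujan_tau(40)

# Assumed standard level 1 relation: tau(p*n) + p^11 * tau(n/p) = tau(p) * tau(n).
for p in (2, 3, 5):
    for n in range(1, 40 // p + 1):
        lower = tau[n // p] if n % p == 0 else 0
        assert tau[p * n] + p ** 11 * lower == tau[p] * tau[n]

print(tau[1:4])  # [1, -24, 252]
```

The case $n=p$ of the relation reads $\tau(p^2)=\tau(p)^2-p^{11}$, which is the recursion encoded by the Euler factor $1-\tau(p)X+p^{11}X^2$.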
## Why Modular Curves Enter the Story
If Hecke eigenvalues are to have arithmetic meaning, they need a geometric home. Modular curves supply that home by reinterpreting modular forms and Hecke operators in terms of elliptic curves with level structure. This changes the subject from functions on the upper half-plane to algebraic curves defined over number fields.
[definition: Modular Curve Of Type Gamma Zero]
Let $N \in \mathbb N$. The modular curve $Y_0(N)$ is the coarse moduli space whose complex points classify pairs $(E,C)$, where $E$ is an elliptic curve over $\mathbb C$ and $C \subset E[N]$ is a cyclic subgroup of order $N$.
[/definition]
The compactification $X_0(N)$ adds cusps, which correspond to degenerating elliptic curves. The analytic description as a quotient of the upper half-plane and the moduli description as elliptic curves with cyclic subgroups are two views of the same object.
The informal phrase "classifies elliptic curves with level structure" is not enough for later arithmetic use: one has to know exactly which data are being identified and which degenerations are represented after compactification. Without that precision, a Hecke correspondence would have no well-defined moduli interpretation at the cusps or on points with automorphisms. The following classification fixes the complex points of $Y_0(N)$ and explains how the cusp points enter $X_0(N)$.
[quotetheorem:4729]
[citeproof:4729]
The hypotheses indicate exactly what the moduli problem remembers. The cyclic subgroup of order $N$ is the geometric datum corresponding to the congruence condition defining $\Gamma_0(N)$, whereas full level structure would remember a basis of $E[N]$. For now this is a classification over $\mathbb C$: it tells us what the complex points and cusps mean. The arithmetic refinements, such as integral models, bad reduction, and cohomological actions, enter only after the basic moduli picture is stable.
## How Hecke Operators Become Correspondences
The next problem is to turn the algebra of Hecke operators into geometry. On $q$-expansions, $T_p$ is a formula on coefficients. On modular curves, it records the collection of degree $p$ isogenies out of an elliptic curve with level structure.
[definition: Hecke Correspondence At A Prime]
Let $p$ be a prime with $p \nmid N$. Let $Z_p(N)$ be the modular curve whose non-cuspidal complex points classify triples $(E,C,D)$, where $(E,C)$ is a point of $Y_0(N)$ and $D \subset E[p]$ is a cyclic subgroup of order $p$. The prime-to-level Hecke correspondence is the diagram
\begin{align*}
X_0(N) \xleftarrow{\pi_1} Z_p(N) \xrightarrow{\pi_2} X_0(N),
\end{align*}
where
\begin{align*}
\pi_1(E,C,D)&=(E,C), & \pi_2(E,C,D)&=(E/D,(C+D)/D).
\end{align*}
It induces an operator
\begin{align*}
T_p=(\pi_2)_*(\pi_1)^*: \operatorname{Div}(X_0(N))\longrightarrow \operatorname{Div}(X_0(N)).
\end{align*}
On non-cuspidal moduli points this operator is given by
\begin{align*}
(E,C) \longmapsto \sum_{D \subset E[p]} (E/D, (C+D)/D),
\end{align*}
where the sum ranges over cyclic subgroups $D \subset E[p]$ of order $p$.
[/definition]
This definition is the first place where the geometry begins to do arithmetic work. A single point of the modular curve is replaced by the formal sum of all its degree $p$ cyclic quotients, so an operator that was originally a formula on Fourier coefficients becomes a correspondence of curves. That is why the same symbol $T_p$ can later act on modular forms, divisor groups, Jacobians, and cohomology without changing its underlying moduli meaning.
[example: The Prime-To-Level Hecke Operator]
Suppose $p \nmid N$ and $(E,C)$ is a non-cuspidal point of $Y_0(N)$, so $C \subset E[N]$ is cyclic of order $N$. Over $\mathbb C$, the group $E[p]$ is a two-dimensional $\mathbb F_p$-vector space. Its subgroups of order $p$ are exactly its one-dimensional $\mathbb F_p$-subspaces: there are $p^2-1$ nonzero vectors in $E[p]$, and each one-dimensional subspace contains $p-1$ nonzero vectors, so the number of such subgroups is
\begin{align*}
\frac{p^2-1}{p-1}
&=\frac{(p-1)(p+1)}{p-1}\\
&=p+1.
\end{align*}
For each subgroup $D \subset E[p]$ of order $p$, the quotient map $\phi_D:E \to E/D$ is a cyclic isogeny of degree $p$ with kernel $D$. The transported level subgroup is
\begin{align*}
(C+D)/D \subset E/D.
\end{align*}
Since $|C|=N$, $|D|=p$, and $p \nmid N$, the intersection $C \cap D$ has order dividing both $N$ and $p$, hence $|C \cap D|=1$. Therefore
\begin{align*}
|(C+D)/D|
&=\frac{|C+D|}{|D|}\\
&=\frac{|C||D|/|C\cap D|}{|D|}\\
&=\frac{Np/1}{p}\\
&=N.
\end{align*}
Also $(C+D)/D$ is cyclic because it is the image of the cyclic group $C$ under the quotient map $E \to E/D$. Thus $T_p$ sends
\begin{align*}
(E,C)\longmapsto \sum_{\substack{D\subset E[p]\\ |D|=p}} (E/D,(C+D)/D),
\end{align*}
a formal sum of exactly $p+1$ targets. The point is that the prime-to-level Hecke operator records all degree $p$ cyclic isogenies out of $(E,C)$, with the level structure carried along through each quotient.
[/example]
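The count of $p+1$ subgroups can be confirmed by brute force. The sketch below assumes a basis has been chosen, so that $E[p]$ is identified with $(\mathbb Z/p\mathbb Z)^2$; the function name `order_p_subgroups` is a hypothetical helper, and the enumeration is only meant for small primes.

```python
def order_p_subgroups(p):
    """Return the subgroups of order p in (Z/pZ)^2, each as a frozenset of points.

    Assumes p is prime, so that every nonzero vector spans a subgroup of order p.
    """
    subgroups = set()
    for x in range(p):
        for y in range(p):
            if (x, y) == (0, 0):
                continue
            # The multiples of a nonzero vector form the line it spans.
            line = frozenset(((k * x) % p, (k * y) % p) for k in range(p))
            subgroups.add(line)
    return subgroups


for p in (2, 3, 5, 7):
    # Each line is found p - 1 times, once per nonzero vector on it.
    assert len(order_p_subgroups(p)) == p + 1
```

Each element of the returned set plays the role of one subgroup $D$ in the sum defining $T_p$.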
## What A Galois Representation Attached To A Form Should Do
The main arithmetic object of the course is a representation of the absolute [Galois group](/page/Galois%20Group) $G_{\mathbb Q}=\operatorname{Gal}(\overline{\mathbb Q}/\mathbb Q)$. The question is what such a representation should remember from a modular eigenform. For primes away from the level and the auxiliary prime $\ell$, the answer is encoded by Frobenius traces and determinants. In these notes, $\operatorname{Frob}_p$ denotes arithmetic Frobenius in the good-prime comparison formulas below. Geometric Frobenius is the inverse convention, so formulas written with that convention must invert the Frobenius element and adjust the polynomial consistently.
Several coefficient-field conventions appear repeatedly from this point on. A semisimple representation means one that decomposes, after the chosen coefficient field is fixed, as a direct sum of irreducible subrepresentations; this is the form determined by almost-all Frobenius characteristic polynomials. A field embedding $\iota_\ell:\overline{\mathbb Q}\hookrightarrow\overline{\mathbb Q}_\ell$ is the chosen way to regard algebraic Hecke eigenvalues as $\ell$-adic scalars. If $K_f$ is the number field generated by the coefficients of an eigenform and $\lambda\mid\ell$ is a prime of $K_f$, then $K_{f,\lambda}$ is the completion of $K_f$ at $\lambda$, the local field in which the $\lambda$-adic representation is written. Tensor products such as $T_\ell A\otimes_{\mathbb Z_\ell}\mathbb Q_\ell$ mean that an integral Tate module is being converted into a vector space, while tensoring with $K_{f,\lambda}$ extracts the coefficient-field component attached to the chosen prime $\lambda$.
[definition: Galois Representation Attached To A Modular Eigenform]
Let $f \in S_k(\Gamma_1(N))$ be a normalised Hecke eigenform and let $\ell$ be a prime. An $\ell$-adic Galois representation attached to $f$ is a continuous representation
\begin{align*}
\rho_{f,\ell}:G_{\mathbb Q}\longrightarrow GL_2(\overline{\mathbb Q}_\ell)
\end{align*}
which is unramified at every prime $p \nmid N\ell$ and satisfies
\begin{align*}
\operatorname{tr}(\rho_{f,\ell}(\operatorname{Frob}_p)) &= a_p(f),\\
\det(\rho_{f,\ell}(\operatorname{Frob}_p)) &= \varepsilon(p)p^{k-1}
\end{align*}
for all primes $p \nmid N\ell$, where $\varepsilon$ is the nebentypus character of $f$.
[/definition]
The definition is a target specification rather than a construction. It says what the answer must look like at good primes, but it does not explain why a representation with these traces and determinants should exist, why it should be two-dimensional, or why the same object should work compatibly as the auxiliary prime $\ell$ varies. Those are genuine constraints: arbitrary sequences of Hecke eigenvalues do not automatically come from Galois actions. The existence theorem quoted next is therefore the point where the local Euler-factor pattern becomes a global representation-theoretic object.
[quotetheorem:4730]
[citeproof:4730]
This theorem is the structural hinge for the general weight case. It marks the moment when Hecke eigenvalues stop being only analytic spectral data: they become traces of a continuous representation of $G_{\mathbb Q}$, with determinants fixed by the weight and nebentypus. The restriction to primes away from $N\ell$ is part of the content rather than a technical annoyance, because it isolates the clean Frobenius comparison before the course turns to the harder local questions at bad primes and at the auxiliary prime.
## Why Weight Two Is The First Complete Case
The course gives special attention to weight $2$ because cusp forms of weight $2$ correspond to holomorphic differentials on modular curves. This makes the passage from eigenforms to Galois representations more concrete: Jacobians of modular curves carry Tate modules, and Hecke operators act on those Tate modules.
[definition: Tate Module]
Let $A$ be an abelian variety over a field $K$, and let $\ell$ be a prime different from the characteristic of $K$. The $\ell$-adic Tate module of $A$ is
\begin{align*}
T_\ell A = \varprojlim_n A[\ell^n](\overline K),
\end{align*}
where the transition maps are multiplication by $\ell$.
[/definition]
The absolute Galois group $G_K$ acts on $A[\ell^n](\overline K)$ for every $n$, compatibly with the transition maps, hence on the inverse limit $T_\ell A$. When $A$ is a quotient of the Jacobian $J_0(N)$ cut out by a Hecke eigenform $f$, this action gives the desired two-dimensional representation once the component over $K_{f,\lambda}$ has been extracted as described above.
The point now is to make that construction canonical and compatible with the Hecke eigenvalues. One needs a theorem saying that the eigensystem of a weight $2$ newform really cuts out a Galois representation whose Frobenius traces are the Fourier coefficients of the form.
[quotetheorem:4731]
[citeproof:4731]
This is the first construction in the course where all major objects are visible at once: modular curves, Hecke operators, abelian varieties, cohomology, and Galois representations. Weight $2$ is essential here because weight $2$ cusp forms are holomorphic differentials on modular curves, so they naturally contribute to the Jacobian. The newform hypothesis isolates one Hecke eigensystem from the oldform contributions coming from lower levels, which is why the associated quotient $A_f$ carries the representation relevant to that eigenform. The theorem is not yet the modularity theorem for elliptic curves: it constructs Galois representations from modular forms, while modularity asks when the Galois representation already attached to an elliptic curve arises in this way.
## How Elliptic Curves Fit Into The Same Picture
The bridge to elliptic curves comes from the fact that an elliptic curve over $\mathbb Q$ also has a two-dimensional Galois representation on its Tate module. Modularity says that this representation is not isolated: it comes from a weight $2$ modular form.
[definition: Modular Elliptic Curve]
An elliptic curve $E$ over $\mathbb Q$ is modular if there exists $N \in \mathbb N$ and a non-constant morphism of algebraic curves over $\mathbb Q$
\begin{align*}
X_0(N) \longrightarrow E.
\end{align*}
[/definition]
Equivalently, the $L$-function of $E$ agrees with the $L$-function of a weight $2$ newform of level equal to the conductor of $E$. The course treats this equivalence as part of the modularity package connecting geometry, analysis, and Galois representations.
The definition of modularity could a priori describe only a special class of elliptic curves whose Galois representations happen to come from modular forms. The obstruction is that an arbitrary elliptic curve over $\mathbb Q$ is given by an equation, while a modular form is analytic data constrained by Hecke operators. The theorem below removes that gap by asserting that every elliptic curve over $\mathbb Q$ is controlled by the modular framework just described.
[quotetheorem:4781]
[citeproof:4781]
Here modularity reverses the direction established earlier. Instead of starting with a modular form and constructing a Galois representation, it starts with the Tate module of an elliptic curve and asserts that the same local traces occur in a weight $2$ newform. This reversal is what makes the theorem powerful in arithmetic applications: once an elliptic curve is known to be modular, its point counts, conductor, and local Euler factors can be studied through the Hecke eigenvalues of a modular form. The later use in Fermat's Last Theorem depends on this comparison together with the Frey curve and Ribet's level-lowering theorem.
[example: Frobenius Trace Of An Elliptic Curve]
Let $E/\mathbb Q$ have good reduction at a prime $p$, and let $\ell \ne p$. Define
\begin{align*}
a_p(E)=p+1-|E(\mathbb F_p)|.
\end{align*}
By the usual Frobenius trace formula for elliptic curves over finite fields, arithmetic Frobenius on
\begin{align*}
V_\ell E=T_\ell E\otimes_{\mathbb Z_\ell}\mathbb Q_\ell
\end{align*}
has trace $a_p(E)$ and determinant $p$. Since $V_\ell E$ is two-dimensional, the characteristic polynomial of a linear operator $A$ on it is
\begin{align*}
\det(X-A)
&=(X-\alpha)(X-\beta)\\
&=X^2-(\alpha+\beta)X+\alpha\beta\\
&=X^2-\operatorname{tr}(A)X+\det(A),
\end{align*}
where $\alpha,\beta$ are the two eigenvalues after extending scalars. Applying this to $A=\operatorname{Frob}_p$ gives
\begin{align*}
\det(X-\operatorname{Frob}_p\mid V_\ell E)
&=X^2-\operatorname{tr}(\operatorname{Frob}_p)X+\det(\operatorname{Frob}_p)\\
&=X^2-a_p(E)X+p.
\end{align*}
If $E$ is modular with associated weight $2$ newform $f$, then the modularity comparison identifies the good-prime Euler factor of $E$ with the Euler factor of $f$. The elliptic-curve factor is determined by $a_p(E)$, while the newform factor is determined by $a_p(f)$, so equality of the factors gives
\begin{align*}
1-a_p(E)X+pX^2=1-a_p(f)X+pX^2,
\end{align*}
and comparing the coefficients of $X$ yields
\begin{align*}
-a_p(E)=-a_p(f), \qquad\text{hence}\qquad a_p(E)=a_p(f).
\end{align*}
Thus the same number is simultaneously the Frobenius trace on the Tate module of $E$ and the prime-index Hecke eigenvalue of the modular form attached to $E$.
[/example]
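The equality $a_p(E)=a_p(f)$ can be observed in one well-known case. The sketch below is an illustration under stated assumptions, not a construction from these notes: it takes the curve $E: y^2+y=x^3-x^2$ of conductor $11$ and the weight $2$ form $f(q)=q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$ of level $11$, a standard pair that is assumed here to correspond under modularity. The function names are hypothetical helpers, and the point count is a naive search suitable only for small primes of good reduction ($p\ne 11$).

```python
def count_points(p):
    """Count points of y^2 + y = x^3 - x^2 over F_p, including the point at infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - x ** 3 + x * x) % p == 0
    )
    return affine + 1


def frobenius_trace(p):
    """a_p(E) = p + 1 - |E(F_p)| for a prime p of good reduction."""
    return p + 1 - count_points(p)


def level_11_coefficients(limit):
    """Return [None, a_1, ..., a_limit] for q * prod (1 - q^n)^2 (1 - q^(11n))^2."""
    # coeffs[i] is the coefficient of q^i in the product without the leading q.
    coeffs = [0] * limit
    coeffs[0] = 1
    for n in range(1, limit):
        for step in (n, 11 * n):
            for _ in range(2):
                # Multiply by (1 - q^step); the loop is empty once step >= limit.
                for i in range(limit - 1, step - 1, -1):
                    coeffs[i] -= coeffs[i - step]
    return [None] + coeffs


a = level_11_coefficients(20)
for p in (2, 3, 5, 7, 13):
    assert frobenius_trace(p) == a[p]
```

The left side of each comparison is a point count over a finite field, while the right side is a coefficient of a $q$-expansion; the check covers only the listed primes and is not a proof of the correspondence.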
## The Logical Shape Of The Course
The course is organised around a sequence of translations. Modular curves translate modular forms into geometry; Hecke correspondences translate eigenvalues into algebraic cycles; cohomology translates cycles into linear operators; Galois actions translate arithmetic over $\mathbb Q$ into representations; modularity translates elliptic curves back into modular forms.
[explanation: Main Thread]
The first lectures construct modular curves and revisit Hecke operators geometrically. The middle of the course introduces the cohomological setting in which Hecke operators and Galois actions can be compared. The later lectures formulate the Galois representations attached to eigenforms, study their local behaviour, and explain how modularity theorems use deformation and level-lowering ideas. The final destination is not a complete proof of the modularity theorem, but a coherent account of the objects and statements that make the theorem meaningful.
[/explanation]
Several results will be proved only in the cases where the course has built enough machinery. Deeper theorems from étale cohomology, the Langlands programme, and modularity lifting will be stated with enough precision to be used. This separation is part of the architecture: the course proves the geometric mechanism in accessible cases and records the general theorems that extend it.
The introduction has now set out the course’s division of labour: the geometric constructions will be developed explicitly where they can be seen, while the broader theorems are recorded in a form usable later. To make that mechanism concrete, we next fix the modular curves and level structures that encode congruence conditions as moduli problems.
# 1. Modular Curves and Level Structures
This opening chapter fixes the geometric objects that carry the Hecke correspondences and cohomology classes used later in the course. The guiding question is how a congruence condition on lattices in the upper half-plane becomes a moduli problem for elliptic curves with extra finite subgroup data. We assume the basic language of Riemann surfaces, elliptic curves over $\mathbb C$, finite torsion subgroups, and group actions on the upper half-plane. We move between the analytic quotient description, the moduli interpretation, and the compactification by cusps because all three viewpoints are needed for the construction of Galois representations from modular forms.
## Analytic Quotients of the Upper Half-Plane
The first problem is to turn the upper half-plane into a family of Riemann surfaces whose points remember congruence-level information. Classical modular forms of level $N$ are functions on the upper half-plane with transformation laws under congruence subgroups, so the natural geometric space is the quotient by the same subgroup.
[definition: Upper Half-Plane]
The upper half-plane is
\begin{align*}
\mathcal H := \{z \in \mathbb C : \operatorname{Im}(z) > 0\}.
\end{align*}
[/definition]
The group $SL_2(\mathbb Z)$ acts on $\mathcal H$ by fractional linear transformations:
\begin{align*}
\begin{pmatrix} a & b \\ c & d \end{pmatrix} z = \frac{az+b}{cz+d}.
\end{align*}
Congruence subgroups impose arithmetic restrictions on the lower row of the matrix. These restrictions are the analytic shadows of level structure on elliptic curves: they control which changes of lattice basis preserve a chosen finite subgroup or torsion point. We therefore need explicit subgroups before forming the quotients that will become modular curves of level $N$.
[definition: Congruence Subgroups Gamma Zero and Gamma One]
Let $N \in \mathbb N$. Define
\begin{align*}
\Gamma_0(N) &:= \left\{\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb Z) : c \equiv 0 \pmod N\right\}, \\
\Gamma_1(N) &:= \left\{\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb Z) : c \equiv 0 \pmod N,\ a \equiv d \equiv 1 \pmod N\right\}.
\end{align*}
[/definition]
The inclusions
\begin{align*}
\Gamma_1(N) \le \Gamma_0(N) \le SL_2(\mathbb Z)
\end{align*}
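mean that a smaller subgroup, which remembers more level structure, gives a finer quotient: the quotient of $\mathcal H$ by $\Gamma_1(N)$ maps onto the quotient by $\Gamma_0(N)$, which in turn maps onto the quotient by $SL_2(\mathbb Z)$.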
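The congruence conditions and the action on $\mathcal H$ are easy to test on explicit matrices. The following sketch represents a matrix as a pair of rows `((a, b), (c, d))`; this representation and the helper names are illustrative assumptions.

```python
def in_sl2z(m):
    """Check that an integer matrix ((a, b), (c, d)) has determinant 1."""
    (a, b), (c, d) = m
    return a * d - b * c == 1


def in_gamma0(m, N):
    """Membership in Gamma_0(N): determinant 1 and c divisible by N."""
    (a, b), (c, d) = m
    return in_sl2z(m) and c % N == 0


def in_gamma1(m, N):
    """Membership in Gamma_1(N): in Gamma_0(N) with a and d congruent to 1 mod N."""
    (a, b), (c, d) = m
    return in_gamma0(m, N) and a % N == 1 % N and d % N == 1 % N


def act(m, z):
    """Fractional linear action of m on a complex number z."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)


g1 = ((1, 1), (5, 6))   # lies in Gamma_1(5)
g0 = ((2, 1), (5, 3))   # lies in Gamma_0(5) but not in Gamma_1(5)
g = ((1, 0), (1, 1))    # lies in SL_2(Z) but not in Gamma_0(5)

assert in_gamma1(g1, 5) and in_gamma0(g1, 5)
assert in_gamma0(g0, 5) and not in_gamma1(g0, 5)
assert in_sl2z(g) and not in_gamma0(g, 5)
# The action preserves the upper half-plane.
assert act(g1, 1j).imag > 0
```

The comparison with `1 % N` makes the test behave correctly for $N=1$, where every residue is $0$.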
The group definitions by themselves are only algebraic congruence conditions; to study modular forms geometrically, they must be turned into spaces on which functions, differentials, and later compactifications live. Taking orbits of the upper half-plane packages elliptic curves with the corresponding level data into Riemann surfaces, but before adding cusps these surfaces are still open and noncompact. These open quotients are the first geometric modular curves.
The next object should therefore record exactly which analytic quotient is being used before any cusps are added. This notation separates the open moduli problem from its later compactification, so that statements about forms, maps, and boundary points have a fixed geometric base.
[definition: Open Modular Curves]
Let $N \in \mathbb N$. The open modular curves of level $\Gamma_0(N)$ and $\Gamma_1(N)$ are
\begin{align*}
Y_0(N) &:= \Gamma_0(N) \backslash \mathcal H, \\
Y_1(N) &:= \Gamma_1(N) \backslash \mathcal H.
\end{align*}
[/definition]
These quotients are generally orbifold quotients at elliptic fixed points, but they carry natural structures as Riemann surfaces after the usual local uniformising charts are introduced. The points added later by compactification are not mysterious extra analytic points: they are the limiting directions in which a lattice degenerates.
[example: Level One Quotient]
For $N=1$, the congruence conditions modulo $1$ impose no restriction. Indeed, every integer is congruent to $0$ modulo $1$ and every integer is congruent to $1$ modulo $1$, so
\begin{align*}
\Gamma_0(1)
&=\left\{\begin{pmatrix}a&b\\ c&d\end{pmatrix}\in SL_2(\mathbb Z): c\equiv 0\pmod 1\right\} \\
&=SL_2(\mathbb Z),
\end{align*}
and
\begin{align*}
\Gamma_1(1)
&=\left\{\begin{pmatrix}a&b\\ c&d\end{pmatrix}\in SL_2(\mathbb Z): c\equiv 0\pmod 1,\ a\equiv d\equiv 1\pmod 1\right\} \\
&=SL_2(\mathbb Z).
\end{align*}
Therefore the two open quotients are the same:
\begin{align*}
Y_0(1)
&=\Gamma_0(1)\backslash \mathcal H \\
&=SL_2(\mathbb Z)\backslash \mathcal H \\
&=\Gamma_1(1)\backslash \mathcal H \\
&=Y_1(1).
\end{align*}
After compactification, there is one added cusp because $SL_2(\mathbb Z)$ acts transitively on $\mathbb P^1(\mathbb Q)$. The standard modular function $j$ is invariant under $SL_2(\mathbb Z)$ and has Fourier expansion
\begin{align*}
j(q)=q^{-1}+744+196884q+\cdots,
\end{align*}
where $q=e^{2\pi iz}$. Thus as $z\to i\infty$, one has $q\to 0$ and $j(q)$ has a pole, so the cusp is sent to $j=\infty$. With the usual normalisation, the elliptic point represented by $e^{2\pi i/3}$ has $j=0$, and the elliptic point represented by $i$ has $j=1728$. Hence $j$ identifies the compactified quotient $X_0(1)$ with $\mathbb P^1$, with the cusp at $\infty$ and the two elliptic special values at $0$ and $1728$.
[/example]
The example explains why compactification is unavoidable. The quotient $Y_0(1)$ is not compact because the imaginary part of $z$ may tend to infinity, but adding the cusp produces the projective line.
## Moduli of Elliptic Curves with Level Structure
The second problem is to interpret a point of $Y_0(N)$ without choosing a coordinate $z \in \mathcal H$. The upper half-plane parametrises complex elliptic curves because $z \in \mathcal H$ determines the lattice $\Lambda_z=\mathbb Z z+\mathbb Z$ and hence the elliptic curve $E_z=\mathbb C/\Lambda_z$. The congruence subgroup records which finite subgroup or torsion point survives a change of lattice basis.
[definition: Cyclic Subgroup of Order N]
Let $E$ be an elliptic curve over $\mathbb C$ and let $N \in \mathbb N$. A cyclic subgroup of order $N$ is a subgroup $C \le E[N]$ that is isomorphic to $\mathbb Z/N\mathbb Z$.
[/definition]
For $\Gamma_0(N)$ the extra datum is such a subgroup. For $\Gamma_1(N)$ the extra datum is a chosen point of exact order $N$, which is finer because it selects a generator rather than only the subgroup it generates.
The analytic quotient remembers only how a lattice basis may be changed, so the moduli interpretation must translate that matrix condition into intrinsic data on the elliptic curve itself. Otherwise the phrase "level $N$ structure" would depend on the chosen uniformisation rather than on the isomorphism class of the curve. The definition below names the two intrinsic pieces of torsion data that match the congruence subgroups $\Gamma_0(N)$ and $\Gamma_1(N)$.
[definition: Gamma Zero and Gamma One Level Structures]
Let $E$ be an elliptic curve over $\mathbb C$ and let $N \in \mathbb N$.
A $\Gamma_0(N)$-level structure on $E$ is a cyclic subgroup $C \le E[N]$ of order $N$.
A $\Gamma_1(N)$-level structure on $E$ is a point $P \in E[N]$ of exact order $N$.
[/definition]
The definition is deliberately asymmetric: $\Gamma_0(N)$ forgets the choice of generator, while $\Gamma_1(N)$ remembers it. This distinction later becomes visible in the maps between modular curves and in the diamond operators.
[quotetheorem:4733]
[citeproof:4733]
This theorem is the prototype for much of the course: an analytic quotient becomes arithmetic because changing a basis of homology is controlled by an integral matrix. The cyclic order-$N$ hypothesis is essential: a noncyclic subgroup of order $N$ in $E[N]$ has rank two, such as $E[p] \subset E[p^2]$ when $N=p^2$ for a prime $p$, and is not stabilised by the same parabolic congruence condition. If the subgroup has smaller order, then the point belongs to a lower-level moduli problem rather than to level $N$. The statement is also a coarse moduli statement over $\mathbb C$: elliptic curves with extra automorphisms can identify points in the quotient, so the theorem records isomorphism classes rather than a universal family with no stabilisers. This limitation matters later because Hecke correspondences act on these coarse curves, while more refined integral or stack-theoretic constructions are needed to track automorphisms uniformly.
[quotetheorem:4734]
[citeproof:4734]
The exact order condition is necessary because a point $P \in E[N]$ of smaller order generates lower-level data and would not distinguish the same congruence subgroup. For example, the zero point lies in $E[N]$ for every $N$, but allowing it as level structure would collapse the intended level information. The theorem is again a coarse statement: for small $N$ or elliptic curves with extra automorphisms, the pair $(E,P)$ can still have nontrivial automorphisms, so $Y_1(N)$ should not be read as a fine moduli scheme in those cases. This limitation matters later because forgetting the generator loses the action of units on primitive torsion points, which is precisely the information measured by diamond operators.
The two moduli interpretations are related by forgetting a generator. The map $(E,P) \mapsto (E,\langle P\rangle)$ induces a finite morphism $Y_1(N) \to Y_0(N)$.
[example: Forgetting the Generator]
Let $E/\mathbb C$ be an elliptic curve, and let $C \le E[N]$ be cyclic of order $N$. Choose one generator $Q$ of $C$. Since $C$ is cyclic of order $N$, every element of $C$ has the form $aQ$ for a unique residue class $a \in \mathbb Z/N\mathbb Z$.
We determine exactly which of these elements are generators. Let $g=\gcd(a,N)$. For an integer $m$,
\begin{align*}
m(aQ)=0
&\Longleftrightarrow (ma)Q=0 \\
&\Longleftrightarrow N \mid ma \\
&\Longleftrightarrow \frac{N}{g}\mid m.
\end{align*}
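The last equivalence holds because $N \mid ma$ is the same as $\frac{N}{g} \mid m\frac{a}{g}$, and $\gcd(N/g,a/g)=1$. Thus the order of $aQ$ is $N/g$. Therefore $aQ$ has exact order $N$ precisely when $g=1$, that is, precisely when $a \in (\mathbb Z/N\mathbb Z)^\times$.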
Hence the $\Gamma_1(N)$-level structures lying above the $\Gamma_0(N)$-level structure $(E,C)$ are exactly the points
\begin{align*}
aQ \in C \qquad \text{with } a\in(\mathbb Z/N\mathbb Z)^\times,
\end{align*}
before quotienting by automorphisms of the pair. The map $Y_1(N)\to Y_0(N)$ therefore forgets the chosen generator and remembers only the cyclic subgroup it generates.
[/example]
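The order computation in the example can be checked directly in the cyclic group $\mathbb Z/N\mathbb Z$, with the residue $a$ standing in for the point $aQ$. This is a small sketch under that identification; the helper names are hypothetical.

```python
from math import gcd


def order_of_multiple(a, N):
    """Order of a*Q in a cyclic group of order N generated by Q, by brute force."""
    m = 1
    while (m * a) % N != 0:
        m += 1
    return m


def generators(N):
    """Residues a mod N for which a*Q has exact order N."""
    return [a for a in range(N) if order_of_multiple(a, N) == N]


for N in range(1, 40):
    for a in range(N):
        # The order of a*Q is N / gcd(a, N).
        assert order_of_multiple(a, N) == N // gcd(a, N)
    # The generators are exactly the units modulo N.
    assert generators(N) == [a for a in range(N) if gcd(a, N) == 1]

print(generators(12))  # [1, 5, 7, 11]
```

The length of `generators(N)` is the number of points of exact order $N$ in $C$, before any identification by automorphisms of the pair.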
This finite cover is the first appearance of a recurring idea: changing the level structure produces maps between modular curves, and the induced maps on cohomology will interact with Hecke operators.
## Cusps and Compactification
The third problem is to complete $Y_0(N)$ and $Y_1(N)$ without losing the moduli meaning. Analytically, noncompactness arises from orbits of rational boundary points of the upper half-plane. Moduli-theoretically, those missing points represent degenerations of elliptic curves to nodal cubic curves with compatible level structure.
[definition: Cusps of a Congruence Subgroup]
Let $\Gamma \le SL_2(\mathbb Z)$ be a congruence subgroup. The cusps of $\Gamma$ are the orbits of $\Gamma$ on $\mathbb P^1(\mathbb Q)=\mathbb Q \cup \{\infty\}$.
[/definition]
For $\Gamma_0(N)$ the cusp set is finite, so there are only finitely many missing boundary directions to add.
The open curves $Y_0(N)$ and $Y_1(N)$ are not adequate for divisors, Jacobians, or cohomology because sequences can escape toward the cusps. The required object is a completed quotient that includes those rational boundary orbits while keeping them visibly separate from the interior moduli of honest elliptic curves. This compactified curve is the surface on which later correspondences and divisor constructions will act.
We now need notation for the completed curves themselves, not just for the added cusp set. Naming the compactifications makes it possible to distinguish maps and divisors on the compact surface from their restrictions to the open modular curve.
[definition: Compactified Modular Curves]
Let $N \in \mathbb N$. The compactified modular curves are
\begin{align*}
X_0(N) &:= \Gamma_0(N) \backslash (\mathcal H \cup \mathbb P^1(\mathbb Q)), \\
X_1(N) &:= \Gamma_1(N) \backslash (\mathcal H \cup \mathbb P^1(\mathbb Q)).
\end{align*}
[/definition]
These quotients carry the standard compact Riemann surface structure extending the quotient structures on $Y_0(N)$ and $Y_1(N)$.
The word "standard" here refers to the local $q$-parameter at a cusp. Near $\infty$, the coordinate $q=e^{2\pi i z}$ identifies a punctured neighbourhood of the cusp with a punctured disc, and adding the cusp fills in the missing point $q=0$.
The local $q$-coordinate shows how to fill one cusp, but compactness is a global assertion: the quotient must have only finitely many regions and finitely many cusp ends to complete. Without finite index, the same local filling procedure need not control the whole quotient. The theorem below supplies the global finiteness and compactness statement needed before treating the completed quotient as a compact Riemann surface.
[quotetheorem:4735]
[citeproof:4735]
Compactness is the bridge from analytic modular forms to algebraic geometry. The finite-index hypothesis is what leaves only finitely many fundamental-domain pieces, and the finiteness of cusp orbits is what leaves only finitely many punctures to fill. If infinitely many cusp ends remained, the same filling argument would not produce a compact surface. Compactness alone does not provide an integral moduli interpretation or a universal elliptic curve over the boundary; those require the more refined theory of generalized elliptic curves. Once $X_0(N)$ is compact, meromorphic modular functions become meromorphic functions on a projective curve, and divisors, Jacobians, and cohomology enter the theory.
[quotetheorem:4736]
[citeproof:4736]
The compactification statement should be read as a structural description of the boundary. The added cusp points are not arbitrary extras: they are the limiting objects needed so that modular curves become compact and so that correspondences extend across degenerations. The algebro-geometric refinement is naturally phrased using generalized elliptic curves and the Deligne-Rapoport moduli stack.
[example: Cusps for X Zero One]
For $N=1$, the condition defining $\Gamma_0(1)$ imposes no restriction, so
\begin{align*}
\Gamma_0(1)=SL_2(\mathbb Z).
\end{align*}
We verify that there is exactly one cusp. Let $r\in \mathbb Q$ and write $r=a/c$ with $a,c\in \mathbb Z$ and $\gcd(a,c)=1$. By Bezout's identity, there exist $b,d\in \mathbb Z$ such that
\begin{align*}
ad-bc=1.
\end{align*}
Then
\begin{align*}
\gamma=\begin{pmatrix}a&b\\ c&d\end{pmatrix}\in SL_2(\mathbb Z),
\end{align*}
and its action on $\infty$ is
\begin{align*}
\gamma\cdot \infty=\frac{a}{c}=r.
\end{align*}
The point $\infty$ itself is fixed by the identity matrix, so every point of $\mathbb P^1(\mathbb Q)=\mathbb Q\cup\{\infty\}$ lies in the $SL_2(\mathbb Z)$-orbit of $\infty$. Hence $X_0(1)$ has one cusp.
The classical modular function $j$ is $SL_2(\mathbb Z)$-invariant, so it descends from $\mathcal H$ to the quotient $Y_0(1)$. Its Fourier expansion at the cusp is
\begin{align*}
j(q)=q^{-1}+744+196884q+\cdots,
\end{align*}
where $q=e^{2\pi iz}$. As $z\to i\infty$, one has
\begin{align*}
q=e^{2\pi i z}\to 0,
\end{align*}
and the leading term $q^{-1}$ tends to $\infty$, so the added cusp is the point $j=\infty$. With the usual normalization, the compactified quotient is identified with $\mathbb P^1$ using $j$ as a coordinate. Thus modular functions of level $1$ may be read as rational functions in the single coordinate $j$ on the compact curve $X_0(1)\cong \mathbb P^1$.
[/example]
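The Bezout step in this example is constructive. The sketch below, with hypothetical helper names, uses the extended Euclidean algorithm to produce a matrix in $SL_2(\mathbb Z)$ whose first column is $(a,c)$, so that it sends $\infty$ to $a/c$. It assumes the fraction is given in lowest terms.

```python
def extended_gcd(a, c):
    """Return (g, x, y) with a*x + c*y == g and g = gcd(a, c) >= 0."""
    old_r, r = a, c
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

def matrix_sending_infinity_to(a, c):
    """Return (a, b, c, d) with a*d - b*c == 1, so gamma(infinity) = a/c."""
    g, x, y = extended_gcd(a, c)
    if g != 1:
        raise ValueError("a/c must be in lowest terms")
    # a*x + c*y = 1, so taking d = x and b = -y gives a*d - b*c = 1.
    return a, -y, c, x
```

The pair $(a,c)=(1,0)$ returns the identity matrix, matching the remark that $\infty$ lies in its own orbit.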
The single-cusp case hides the richer geometry at higher level. At level $N$, several cusps may appear, and their widths control Fourier expansions at different boundary components.
## Maps Between Levels
The fourth problem is to compare modular curves when the level changes. These maps are not auxiliary bookkeeping: Hecke correspondences are built from them, and later the old/new decomposition of modular forms depends on their geometry.
Suppose $M \mid N$. Then $\Gamma_0(N) \le \Gamma_0(M)$, so there is a natural finite holomorphic map
\begin{align*}
X_0(N) \longrightarrow X_0(M).
\end{align*}
In moduli terms, this map forgets part of the level structure.
[definition: Degeneracy Maps for Gamma Zero Level]
Let $M \mid N$. A degeneracy map of $\Gamma_0$-level from $N$ to $M$ is a finite morphism
\begin{align*}
\delta: X_0(N) \longrightarrow X_0(M)
\end{align*}
induced by a functorial operation on pairs $(E,C)$, where $C \le E[N]$ is cyclic of order $N$.
For $p \nmid N$, the two standard degeneracy maps are
\begin{align*}
\alpha_p,\beta_p: X_0(Np) \longrightarrow X_0(N),
\end{align*}
where $\alpha_p(E,C)=(E,C[N])$ and $\beta_p$ is induced by quotienting $E$ by the order-$p$ subgroup of $C$ and transporting the remaining order-$N$ level structure.
[/definition]
The two standard degeneracy maps from level $Np$ to level $N$ are the geometric source of the Hecke correspondence $T_p$ when $p \nmid N$. In the next chapter they will be assembled into a correspondence acting on divisors, differentials, and cohomology.
[example: Two Degeneracy Maps at p]
Let $p\nmid N$ and let $(E,C)$ be a point of $Y_0(Np)$, with $C$ cyclic of order $Np$. Choose a generator $Q$ of $C$, so
\begin{align*}
C=\langle Q\rangle,\qquad \operatorname{ord}(Q)=Np.
\end{align*}
The subgroup of elements of $C$ killed by $N$ is generated by $pQ$, since
\begin{align*}
\operatorname{ord}(pQ)
=\frac{Np}{\gcd(Np,p)}
=\frac{Np}{p}
=N.
\end{align*}
Thus
\begin{align*}
C[N]=\langle pQ\rangle
\end{align*}
is cyclic of order $N$, and the first degeneracy map is
\begin{align*}
\alpha_p(E,C)=(E,C[N])=(E,\langle pQ\rangle).
\end{align*}
For the second map, take the order-$p$ subgroup of $C$. It is generated by $NQ$, because
\begin{align*}
\operatorname{ord}(NQ)
=\frac{Np}{\gcd(Np,N)}
=\frac{Np}{N}
=p.
\end{align*}
Let
\begin{align*}
K:=C[p]=\langle NQ\rangle
\end{align*}
and let
\begin{align*}
\pi:E\longrightarrow E/K
\end{align*}
be the quotient isogeny. Since $K\le C$, the image $\pi(C)$ is naturally the [quotient group](/page/Quotient%20Group) $C/K$, so
\begin{align*}
|\pi(C)|=\frac{|C|}{|K|}
=\frac{Np}{p}
=N.
\end{align*}
Also, a quotient of a cyclic group is cyclic, so $\pi(C)$ is a cyclic subgroup of $E/K$ of order $N$. Hence the second degeneracy map is
\begin{align*}
\beta_p(E,C)=(E/K,\pi(C))=(E/C[p],\pi(C)).
\end{align*}
The two outputs keep different parts of the original cyclic subgroup: $\alpha_p$ keeps the order-$N$ subgroup inside $E$, while $\beta_p$ first kills the order-$p$ subgroup and then keeps the resulting order-$N$ subgroup on the quotient curve. Together they give the two legs of the Hecke correspondence defining $T_p$ for $p\nmid N$.
[/example]
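The subgroup bookkeeping in this example can be replayed in a toy model. The sketch below assumes $C$ is modelled as $\mathbb Z/Np\mathbb Z$ with $Q=1$; it does not model the elliptic curve or the isogeny, only the finite group $C$, and all names are illustrative.

```python
def subgroup_generated(g, order):
    """Sorted elements of the subgroup of Z/order generated by g."""
    return sorted({(k * g) % order for k in range(order)})

def torsion(n, order):
    """Elements x of Z/order with n*x = 0."""
    return [x for x in range(order) if (n * x) % order == 0]

def degeneracy_data(N, p):
    """Model C = Z/(Np) with Q = 1; return C[N], C[p] and the order of C/C[p]."""
    order = N * p
    C_N = torsion(N, order)   # level subgroup kept by alpha_p
    C_p = torsion(p, order)   # kernel of the isogeny used by beta_p
    quotient_order = order // len(C_p)
    return C_N, C_p, quotient_order
```

With $N=6$ and $p=5$, the first list coincides with the subgroup generated by $pQ$, the second with the subgroup generated by $NQ$, and the quotient has order $N$, as in the displayed computations.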
The language of correspondences is essential because a Hecke operator is rarely a single-valued map on points. It is a finite multi-valued operation, naturally expressed as a diagram of finite morphisms between modular curves.
## Genus and the First Nontrivial Example
The final problem in this chapter is to measure how complicated $X_0(N)$ is as a compact curve. The genus determines the dimension of holomorphic differentials, hence the dimension of weight-two cusp forms, and it controls whether the curve can itself be an elliptic curve.
[quotetheorem:4737]
[citeproof:4737]
This formula packages the geometry of the finite map $X_0(N) \to X_0(1)$. The terms $e_2(N)$ and $e_3(N)$ are necessary because the quotient map is ramified above the elliptic points of $X_0(1)$, while $c(N)$ records the contribution of the boundary. Omitting these corrections would treat the cover as unramified over the orbifold and cusp points, producing the wrong genus even at small levels. The formula determines only a numerical invariant of the compact curve; it does not identify equations, a rational model, or a distinguished origin when the genus is one.
[example: The Curve X Zero Eleven]
For $N=11$, we evaluate the terms in the genus formula for $X_0(N)$. Since $11$ is prime,
\begin{align*}
\mu_{11}=[SL_2(\mathbb Z):\Gamma_0(11)]=11+1=12.
\end{align*}
The cusps of $\Gamma_0(11)$ are represented by $\infty$ and $0$, so
\begin{align*}
c(11)=2.
\end{align*}
There are no elliptic points of order $2$: the relevant congruence is $x^2\equiv -1\pmod {11}$, and the quadratic residues modulo $11$ are
\begin{align*}
0^2&\equiv 0, &
1^2&\equiv 1, &
2^2&\equiv 4, &
3^2&\equiv 9, &
4^2&\equiv 5, &
5^2&\equiv 3 \pmod {11},
\end{align*}
with the remaining squares repeating by symmetry. Since $-1\equiv 10\pmod {11}$ is not in $\{0,1,3,4,5,9\}$, we have
\begin{align*}
e_2(11)=0.
\end{align*}
There are also no elliptic points of order $3$: the relevant congruence is $x^2+x+1\equiv 0\pmod {11}$, equivalently
\begin{align*}
4x^2+4x+4&\equiv 0\pmod {11} \\
(2x+1)^2+3&\equiv 0\pmod {11} \\
(2x+1)^2&\equiv -3\equiv 8\pmod {11}.
\end{align*}
But $8$ is not a quadratic residue modulo $11$, so
\begin{align*}
e_3(11)=0.
\end{align*}
Substituting into the genus formula gives
\begin{align*}
g(X_0(11))
&=1+\frac{\mu_{11}}{12}-\frac{e_2(11)}{4}-\frac{e_3(11)}{3}-\frac{c(11)}{2} \\
&=1+\frac{12}{12}-\frac{0}{4}-\frac{0}{3}-\frac{2}{2} \\
&=1+1-0-0-1 \\
&=1.
\end{align*}
Thus $X_0(11)$ is a compact Riemann surface of genus one. Choosing a cusp as the origin gives it the structure of an elliptic curve. A standard Weierstrass model is
\begin{align*}
y^2+y=x^3-x^2-10x-20.
\end{align*}
For this model, the Weierstrass coefficients are
\begin{align*}
a_1&=0, &
a_2&=-1, &
a_3&=1, &
a_4&=-10, &
a_6&=-20.
\end{align*}
Hence
\begin{align*}
b_2&=a_1^2+4a_2=-4, \\
b_4&=2a_4+a_1a_3=-20, \\
b_6&=a_3^2+4a_6=1-80=-79, \\
b_8&=a_1^2a_6+4a_2a_6-a_1a_3a_4+a_2a_3^2-a_4^2 \\
&=0+80-0-1-100 \\
&=-21.
\end{align*}
The discriminant is therefore
\begin{align*}
\Delta
&=-b_2^2b_8-8b_4^3-27b_6^2+9b_2b_4b_6 \\
&=-(-4)^2(-21)-8(-20)^3-27(-79)^2+9(-4)(-20)(-79) \\
&=336+64000-168507-56880 \\
&=-161051 \\
&=-11^5.
\end{align*}
Since $\Delta\neq 0$, the cubic is nonsingular and therefore defines an elliptic curve. This is the first level at which the modular curve itself has genus one, so modular geometry already produces an elliptic curve at level $11$.
[/example]
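The genus computation can be repeated for other levels. The following sketch is not taken from the course: it assumes the standard closed forms $\mu_N=N\prod_{p\mid N}(1+1/p)$ and $c(N)=\sum_{d\mid N}\varphi(\gcd(d,N/d))$, and it counts $e_2(N)$ and $e_3(N)$ as the numbers of solutions modulo $N$ of $x^2+1\equiv 0$ and $x^2+x+1\equiv 0$, which are the congruences used in the example.

```python
from fractions import Fraction
from math import gcd

def prime_divisors(n):
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def euler_phi(n):
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)
    return result

def index_gamma0(N):
    """Index of Gamma_0(N) in SL_2(Z) (assumed closed form)."""
    mu = N
    for p in prime_divisors(N):
        mu = mu // p * (p + 1)
    return mu

def cusp_count(N):
    """Number of cusps of Gamma_0(N) (assumed closed form)."""
    return sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)

def elliptic_points(N):
    """(e_2, e_3), counted as solutions of the two congruences mod N."""
    e2 = sum(1 for x in range(N) if (x * x + 1) % N == 0)
    e3 = sum(1 for x in range(N) if (x * x + x + 1) % N == 0)
    return e2, e3

def genus_X0(N):
    e2, e3 = elliptic_points(N)
    g = (1 + Fraction(index_gamma0(N), 12) - Fraction(e2, 4)
         - Fraction(e3, 3) - Fraction(cusp_count(N), 2))
    if g.denominator != 1:
        raise ArithmeticError("genus formula did not return an integer")
    return int(g)
```

Running `genus_X0` over $N=1,\dots,11$ gives genus zero until $N=11$, consistent with the claim that level $11$ is the first level of genus one.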
Because $X_0(11)$ has genus one, its holomorphic differentials form a one-dimensional space. In classical terms this space is $S_2(\Gamma_0(11))$, generated by a normalized newform whose coefficients encode the arithmetic of the elliptic curve above.
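One concrete way to see coefficients encoding arithmetic is to compare point counts with a $q$-expansion. The sketch below makes an assumption not stated in this section: that the normalised newform of level $11$ is the classical product $q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$. Under that assumption, for primes $p\neq 11$ the coefficient $a_p$ should equal $p+1-\#E(\mathbb F_p)$ for the Weierstrass model above. The function names are illustrative.

```python
def count_points(p):
    """Points on y^2 + y = x^3 - x^2 - 10x - 20 over F_p, including infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - (x**3 - x * x - 10 * x - 20)) % p == 0
    )
    return affine + 1

def eta_product_coefficients(bound):
    """Coefficients a_0..a_bound of q * prod (1 - q^n)^2 (1 - q^(11n))^2."""
    series = [0] * (bound + 1)
    series[1] = 1  # start from the monomial q
    for n in range(1, bound + 1):
        for step in (n, 11 * n):
            for _ in range(2):  # each factor appears squared
                # multiply the truncated series by (1 - q^step) in place
                for m in range(bound, step - 1, -1):
                    series[m] -= series[m - step]
    return series

def trace_of_frobenius(p):
    """a_p computed from the point count (meaningful for primes p != 11)."""
    return p + 1 - count_points(p)
```

Comparing `trace_of_frobenius(p)` with `eta_product_coefficients(40)[p]` for small primes $p\neq 11$ is a quick sanity check of the dictionary between Fourier coefficients and point counts; it is evidence for the comparison, not a proof of it.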
[remark: Why Level Eleven Matters]
The level $11$ example is the smallest level for which $X_0(N)$ has genus one. It is therefore the first level where the modular curve has its own group law after a cusp is chosen as the identity. Later, this group law will be related to the Jacobian $J_0(N)$ and to Galois representations cut out by Hecke eigenclasses.
[/remark]
We have now seen modular curves as analytic quotients, moduli spaces, and algebraic curves with level structure. The next step is to replace the Fourier-coefficient description of Hecke operators with geometric correspondences between these curves, which is the form needed for Jacobians and cohomology.
# 2. Hecke Correspondences on Modular Curves
This chapter changes the viewpoint on Hecke operators. In a first course, $T_n$ is often introduced by its action on Fourier expansions; here the same operator is built geometrically from maps between modular curves. This geometric construction is the form that interacts with Jacobians, cohomology, and eventually Galois representations.
Building on Chapter 1, the chapter assumes the moduli interpretation of $Y_0(N)$ and $X_0(N)$, the description of points as elliptic curves with cyclic level subgroup, and the basic action of finite morphisms on divisors. It also uses the analytic uniformisation of modular curves by the upper half-plane and the standard $q$-expansion description of modular forms. The guiding idea is that a Hecke operator is not a single map from a modular curve to itself. It is a correspondence: a diagram with two maps to the same curve, so that a point is first lifted to a space of auxiliary choices and then pushed back down after modifying the elliptic curve by an isogeny.
## From Operators to Correspondences
What geometric object should replace the formula for $T_n$ on $q$-expansions? The answer is a curve parameterising extra finite subgroup data. The two projections remember either the original elliptic curve or the quotient elliptic curve, and their combined effect is the Hecke operator.
[definition: Algebraic Correspondence]
Let $X$ be a smooth projective curve over a field $k$. An algebraic correspondence from $X$ to itself is a diagram
\begin{align*}
X \xleftarrow{\pi_1} C \xrightarrow{\pi_2} X
\end{align*}
where $C$ is a curve over $k$ and $\pi_1,\pi_2$ are finite morphisms.
[/definition]
A correspondence acts on divisors by pullback along one projection followed by pushforward along the other. Thus a point $x \in X$ is sent to the formal sum of the images under $\pi_2$ of the points of $C$ lying above $x$ under $\pi_1$.
To compare this geometry with Hecke operators, the pull-push action has to be promoted from an informal rule on points to a homomorphism on the whole divisor group. This is the level at which correspondences can later descend to Jacobians and be compared with their action on modular forms and cohomology. The next definition fixes that divisor-level operator.
[definition: Hecke Correspondence on Divisors]
Let $X \xleftarrow{\pi_1} C \xrightarrow{\pi_2} X$ be an algebraic correspondence over a field $k$. The induced correspondence operator on the divisor group of $X$ is the group homomorphism
\begin{align*}
T_C : \operatorname{Div}(X) &\longrightarrow \operatorname{Div}(X), &
T_C([Z]) &= (\pi_2)_*(\pi_1)^*[Z].
\end{align*}
[/definition]
This definition is deliberately geometric: it says nothing about [Fourier series](/page/Fourier%20Series). The bridge back to modular forms comes from applying the correspondence to differential forms on modular curves, or equivalently to sections of the line bundle of modular forms.
The next theorem uses a little scheme notation to keep track of the primes at which the construction is valid. The expression $\operatorname{Spec}\mathbb Z[1/Nn]$ means the base over which the primes dividing $Nn$ have been inverted, so elliptic curves have well-behaved $N$- and $n$-torsion there. Saying that a modular curve is defined over this base means that its fibres can be taken over any field whose characteristic does not divide $Nn$. A base change is the operation of passing from that universal base to such a field, and "coarse" means that the curve records isomorphism classes of elliptic curves with level structure rather than a fine moduli object with a universal family in every small-level case.
[quotetheorem:4738]
[citeproof:4738]
The compatibility condition in the theorem prevents the level subgroup from degenerating in the quotient. If $p \mid N$ and $D=C[p]$, then quotienting by $D$ collapses part of the level subgroup, so $C/D$ has order $N/p$ rather than $N$ and does not define a point of $X_0(N)$. Thus the displayed formula is a clean construction of the prime-to-level operators, not a definition of the bad-prime operators. In the most important first case, $n=p$ is prime and $p \nmid N$, and then every subgroup $D \subset E[p]$ of order $p$ is allowed. The missing bad-prime case is handled later by degeneracy maps and the operator $U_p$.
[example: Prime Hecke Correspondence Away From The Level]
Let $p \nmid N$ and let $(E,C)$ represent a non-cuspidal point of $Y_0(N)$, with $C \subset E$ cyclic of order $N$. The points of the fibre of $\pi_1$ above $(E,C)$ are precisely the triples $(E,C,D)$ with $D \subset E$ cyclic of order $p$ satisfying the compatibility condition $D \cap C=0$. Because $p \nmid N$, every cyclic subgroup $D$ of order $p$ automatically satisfies this condition: the subgroup $D \cap C$ has order dividing both $p$ and $N$, hence
\begin{align*}
|D \cap C| \mid \gcd(p,N)=1,
\end{align*}
so $D \cap C=0$.
Over an algebraically closed field of characteristic different from $p$, the $p$-torsion group is
\begin{align*}
E[p] \cong (\mathbb Z/p\mathbb Z)^2.
\end{align*}
A cyclic order-$p$ subgroup of $E[p]$ is the same thing as a one-dimensional $\mathbb F_p$-subspace of this two-dimensional vector space. There are $p^2-1$ nonzero vectors in $\mathbb F_p^2$, and each one-dimensional subspace contains $p-1$ nonzero vectors, so the number of such subgroups is
\begin{align*}
\frac{p^2-1}{p-1}=p+1.
\end{align*}
For each subgroup $D$, the quotient map $E \to E/D$ sends $C$ injectively because $C \cap D=0$, and
\begin{align*}
(C+D)/D \cong C/(C\cap D)=C,
\end{align*}
so $(C+D)/D$ is cyclic of order $N$. Therefore the correspondence sends
\begin{align*}
(E,C) \longmapsto \sum_{\substack{D \subset E[p] \\ |D|=p}} (E/D,(C+D)/D).
\end{align*}
Thus $T_p$ records exactly the $p+1$ degree-$p$ quotients of $E$ whose transported subgroup still defines a $\Gamma_0(N)$-level structure.
[/example]
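The count of order-$p$ subgroups can be confirmed by enumeration. This is a minimal sketch that models $E[p]$ as the abstract group $(\mathbb Z/p\mathbb Z)^2$, as in the example; it does not involve an actual elliptic curve, and the function name is illustrative.

```python
def order_p_subgroups(p):
    """Cyclic order-p subgroups of (Z/pZ)^2, each as a frozenset of elements."""
    subgroups = set()
    for x in range(p):
        for y in range(p):
            if (x, y) == (0, 0):
                continue
            # the line spanned by the nonzero vector (x, y)
            line = frozenset(((k * x) % p, (k * y) % p) for k in range(p))
            subgroups.add(line)
    return subgroups
```

Each nonzero vector spans one line, and collecting the lines in a set removes the $p-1$-fold repetition, so the size of the result is the quotient $(p^2-1)/(p-1)$ computed above.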
## Double Cosets and Isogenies
Why does the same operator have a double-coset formula on the upper half-plane? The analytic quotient description of modular curves remembers lattices in $\mathbb C$, while a finite-index inclusion of lattices gives an isogeny of elliptic curves. Double cosets organise all such inclusions modulo the equivalence imposed by the level group.
[definition: Hecke Double Coset]
View $\Gamma_0(N)$ as its usual congruence subgroup of $SL_2(\mathbb Z)$ inside $GL_2^+(\mathbb Q)$. For $\alpha \in GL_2^+(\mathbb Q)$ such that $\Gamma_0(N)\alpha\Gamma_0(N)$ is a finite union of right cosets, choose a decomposition
\begin{align*}
\Gamma_0(N)\alpha\Gamma_0(N) = \bigsqcup_i \Gamma_0(N)\alpha_i.
\end{align*}
The associated double-coset operator is the endomorphism of $M_k(\Gamma_0(N))$ defined by summing the weight-$k$ slash actions of the representatives $\alpha_i$, and it also defines a finite correspondence on $\Gamma_0(N)\backslash \mathcal H$.
[/definition]
Concretely, the operator attached to the double coset is computed from any such set of representatives $\alpha_i$. For $T_p$ with $p \nmid N$, one may take
\begin{align*}
\Gamma_0(N)
\begin{pmatrix}1&0\\0&p\end{pmatrix}
\Gamma_0(N)
= \bigsqcup_{a=0}^{p-1} \Gamma_0(N)
\begin{pmatrix}1&a\\0&p\end{pmatrix}
\sqcup
\Gamma_0(N)
\begin{pmatrix}p&0\\0&1\end{pmatrix}.
\end{align*}
This is the analytic source of the $p+1$ terms in the geometric correspondence.
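Part of the displayed decomposition can be checked mechanically. The sketch below, with hypothetical helper names, verifies only that the $p+1$ listed matrices have determinant $p$ and represent pairwise distinct cosets, using the criterion that $\Gamma_0(N)\alpha_i=\Gamma_0(N)\alpha_j$ exactly when $\alpha_i\alpha_j^{-1}\in\Gamma_0(N)$. It does not verify that these cosets exhaust the double coset.

```python
from fractions import Fraction

def hecke_representatives(p):
    """The p + 1 displayed matrices, each stored as (a, b, c, d) for (a b; c d)."""
    reps = [(1, a, 0, p) for a in range(p)]
    reps.append((p, 0, 0, 1))
    return reps

def times_inverse(m1, m2):
    """Return m1 * m2^{-1} with rational entries."""
    a, b, c, d = m1
    e, f, g, h = m2
    det = Fraction(e * h - f * g)
    # inverse of (e f; g h) is (1/det) * (h -f; -g e)
    ie, if_, ig, ih = h / det, -f / det, -g / det, e / det
    return (a * ie + b * ig, a * if_ + b * ih, c * ie + d * ig, c * if_ + d * ih)

def in_gamma0(m, N):
    """Integral entries, determinant 1, lower-left entry divisible by N."""
    a, b, c, d = m
    integral = all(Fraction(x).denominator == 1 for x in m)
    return integral and a * d - b * c == 1 and c % N == 0

def cosets_are_distinct(p, N):
    reps = hecke_representatives(p)
    return all(
        not in_gamma0(times_inverse(reps[i], reps[j]), N)
        for i in range(len(reps))
        for j in range(len(reps))
        if i != j
    )
```

Exact rational arithmetic is used so that integrality of $\alpha_i\alpha_j^{-1}$ is tested without rounding.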
The remaining issue is to identify this double-coset calculation with the moduli-theoretic correspondence by degree-$p$ isogenies. The next comparison ensures that the analytic operator and the geometric correspondence are not merely analogous but describe the same prime-to-level Hecke action.
[quotetheorem:4739]
[citeproof:4739]
This theorem is the conceptual bridge between the formula seen in classical modular forms and the correspondence needed for geometry. The hypothesis $p \nmid N$ is essential: if $p$ divides $N$, the same double coset no longer corresponds to all $p+1$ quotient directions compatible with a fixed $\Gamma_0(N)$-structure, because the level subgroup already contains a distinguished $p$-part. The theorem identifies the correspondence on the open analytic curve; it does not by itself describe the integral model at bad reduction or the bad-prime operator. It also explains why Hecke operators commute away from the level: the relevant double-coset algebra is commutative in the prime-to-level part.
The next comparison asks whether this geometric correspondence has the same normalization as the classical operator on Fourier expansions. This matters because later trace formulas use the coefficient $a_p(f)$, so the geometric pull-push operator must recover the same $q$-expansion coefficient operator rather than a rescaled variant.
[quotetheorem:4740]
[citeproof:4740]
The condition $p \nmid N$ again separates the $p$-isogeny data from the level subgroup, so the Tate-curve calculation sees exactly the $p$ etale quotients and the one canonical quotient. If $p \mid N$, the level structure chooses a $p$-direction and the second summand in the displayed $T_p$ formula is not part of the bad-prime operator $U_p$. The theorem is a comparison of normalisations on modular forms; it does not assert the same formula for divisors, Jacobians, or cohomology without translating the action through the relevant functor. This coefficient formula is the link that later lets Hecke eigenvalues computed from $q$-expansions be read geometrically.
[example: Recovering the Coefficient Formula for $T_p$]
Let $f(q)=\sum_{m\ge 0}a_mq^m$ have weight $k$ and level $\Gamma_0(N)$, with $p\nmid N$, and let $\zeta_p$ be a primitive $p$-th root of unity. The $p$ quotients with Tate parameters $\zeta_p^a q^{1/p}$ contribute, with the standard normalisation of the geometric action on forms,
\begin{align*}
\frac{1}{p}\sum_{a=0}^{p-1} f(\zeta_p^a q^{1/p})
&= \frac{1}{p}\sum_{a=0}^{p-1}\sum_{m\ge 0} a_m(\zeta_p^a q^{1/p})^m\\
&= \frac{1}{p}\sum_{m\ge 0} a_m q^{m/p}\sum_{a=0}^{p-1}\zeta_p^{am}.
\end{align*}
For the inner sum, if $p\mid m$, then $\zeta_p^{am}=1$ for every $a$, so
\begin{align*}
\sum_{a=0}^{p-1}\zeta_p^{am}=p.
\end{align*}
If $p\nmid m$, then $\zeta_p^m\neq 1$ and the finite geometric sum gives
\begin{align*}
\sum_{a=0}^{p-1}\zeta_p^{am}
&=\frac{1-(\zeta_p^m)^p}{1-\zeta_p^m}
=\frac{1-\zeta_p^{mp}}{1-\zeta_p^m}
=\frac{1-1}{1-\zeta_p^m}
=0.
\end{align*}
Therefore only the indices $m=pr$ survive:
\begin{align*}
\frac{1}{p}\sum_{a=0}^{p-1} f(\zeta_p^a q^{1/p})
&=\frac{1}{p}\sum_{r\ge 0} a_{pr}q^r\cdot p\\
&=\sum_{r\ge 0}a_{pr}q^r.
\end{align*}
The remaining quotient has Tate parameter $q^p$; the weight-$k$ transformation of the invariant differential contributes the factor $p^{k-1}$, so its contribution is
\begin{align*}
p^{k-1}f(q^p)
&=p^{k-1}\sum_{m\ge 0}a_m(q^p)^m\\
&=p^{k-1}\sum_{m\ge 0}a_mq^{pm}.
\end{align*}
Adding the two families of quotient isogenies gives
\begin{align*}
(T_pf)(q)=\sum_{m\ge 0}a_{pm}q^m+p^{k-1}\sum_{m\ge 0}a_mq^{pm},
\end{align*}
so the geometric correspondence separates exactly into the coefficient-extraction term and the $q\mapsto q^p$ term.
[/example]
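The final formula acts directly on lists of Fourier coefficients. The sketch below implements it and tests it on an example not discussed in this section: it assumes the classical fact that $\Delta=q\prod_{n\ge 1}(1-q^n)^{24}$ is a weight-$12$, level-$1$ eigenform with $T_p\Delta=\tau(p)\Delta$. Function names are illustrative.

```python
def delta_coefficients(bound):
    """Coefficients tau(0..bound) of q * prod_{n>=1} (1 - q^n)^24."""
    series = [0] * (bound + 1)
    series[1] = 1  # start from the monomial q
    for n in range(1, bound + 1):
        for _ in range(24):
            # multiply the truncated series by (1 - q^n) in place
            for m in range(bound, n - 1, -1):
                series[m] -= series[m - n]
    return series

def hecke_Tp(coeffs, p, k):
    """Apply b_m = a_{pm} + p^(k-1) * a_{m/p}, the second term only if p | m.

    Only coefficients b_m with p*m < len(coeffs) are determined by the input,
    so the output is shorter than the input.
    """
    out_len = (len(coeffs) - 1) // p + 1
    out = []
    for m in range(out_len):
        b = coeffs[p * m]
        if m % p == 0:
            b += p ** (k - 1) * coeffs[m // p]
        out.append(b)
    return out
```

Applying `hecke_Tp` with $p=2$ and $k=12$ to the coefficients of $\Delta$ should return the same list scaled by $\tau(2)$, up to the truncation imposed by the finite precision of the input.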
## Degeneracy Maps and the Operator $U_p$
What changes when the prime is part of the level? The prime-to-level correspondence used for $T_p$ no longer treats the level subgroup and the $p$-isogeny independently. Instead, maps between different levels become central, and the operator at $p$ is built from degeneracy maps.
[definition: Degeneracy Maps from Level $Np$ to Level $N$]
Let $p$ be prime and let $N \ge 1$. On non-cuspidal moduli points of $Y_0(Np)$, written as $(E,C)$ with $C \subset E$ cyclic of order $Np$, define two maps to $Y_0(N)$ by
\begin{align*}
\alpha(E,C) &= (E,C[N]), & \beta(E,C) &= (E/C[p], C/C[p]).
\end{align*}
[/definition]
Here $C[p]$ is the unique subgroup of $C$ of order $p$, and $C[N]$ is the unique subgroup of order $N$ when $p \nmid N$. The map $\alpha$ forgets the $p$-part of the level, while $\beta$ quotients by that $p$-part and then remembers the residual $N$-level structure.
[example: The Two Degeneracy Maps]
Assume $p \nmid N$ and let $(E,C_{Np}) \in Y_0(Np)$, where $C_{Np} \subset E$ is cyclic of order $Np$. Since $C_{Np}$ is cyclic, it has a unique subgroup of order $d$ for each divisor $d$ of $Np$. We write
\begin{align*}
C_N &= C_{Np}[N], &
C_p &= C_{Np}[p],
\end{align*}
so $|C_N|=N$ and $|C_p|=p$.
The first degeneracy map forgets the $p$-part of the level:
\begin{align*}
\alpha(E,C_{Np})=(E,C_N).
\end{align*}
Because $C_N$ is cyclic of order $N$, this is a point of $Y_0(N)$.
For the second degeneracy map, quotient by the order-$p$ subgroup $C_p$. The image of $C_{Np}$ in $E/C_p$ is
\begin{align*}
C_{Np}/C_p.
\end{align*}
Its order is
\begin{align*}
|C_{Np}/C_p|
=\frac{|C_{Np}|}{|C_p|}
=\frac{Np}{p}
=N,
\end{align*}
and a quotient of a cyclic group is cyclic. Hence
\begin{align*}
\beta(E,C_{Np})=(E/C_p,\,C_{Np}/C_p)
\end{align*}
is also a point of $Y_0(N)$. Thus one level-$Np$ point determines two level-$N$ points: one by forgetting the $p$-level subgroup, and one by moving across the degree-$p$ isogeny $E \to E/C_p$ and carrying the remaining cyclic subgroup along.
[/example]
At primes dividing the level, the usual $T_p$ correspondence is no longer the right operator: it would count quotient directions that are not compatible with the level structure already present at $p$. The geometry therefore has to retain only the quotients selected by the existing $p$-part of the cyclic subgroup. This modified bad-prime correspondence is named separately because it produces a different endomorphism of modular forms and a different rule on $q$-expansions.
The obstruction is that a single notation $T_p$ would blur two different local situations: at good primes all $p+1$ order-$p$ quotient directions are available, while at bad primes the level structure has already chosen a direction that must be respected. The next definition fixes the bad-prime operator so that later eigenvalue statements can distinguish primes dividing the level from primes away from it.
[definition: The Operator $U_p$]
Let $p \mid N$ and let $k \ge 2$. The operator $U_p$ on modular forms of level $\Gamma_0(N)$ is the endomorphism
\begin{align*}
U_p : M_k(\Gamma_0(N)) \longrightarrow M_k(\Gamma_0(N))
\end{align*}
induced by the Hecke correspondence that remembers cyclic order-$p$ quotients compatible with the existing $p$-part of the level. Its restriction to cusp forms gives an endomorphism $U_p:S_k(\Gamma_0(N))\to S_k(\Gamma_0(N))$.
[/definition]
In $q$-expansions, the normalisation used in classical modular forms gives
\begin{align*}
(U_p f)(q)=\sum_{m\ge 0}a_{pm}q^m.
\end{align*}
Geometrically, the absence of the second $p^{k-1}q^{pm}$ term reflects that the level already chooses a direction at $p$.
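On coefficient lists, $U_p$ is simply the extraction of every $p$-th coefficient. The sketch below illustrates this at level $11$ under an assumption not made in this section: that $q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$ is the weight-$2$ newform of level $11$ and is a $U_{11}$-eigenform with eigenvalue $a_{11}=1$. The names are illustrative.

```python
def eta_product_coefficients(bound):
    """Coefficients a_0..a_bound of q * prod (1 - q^n)^2 (1 - q^(11n))^2."""
    series = [0] * (bound + 1)
    series[1] = 1  # start from the monomial q
    for n in range(1, bound + 1):
        for step in (n, 11 * n):
            for _ in range(2):  # each factor appears squared
                # multiply the truncated series by (1 - q^step) in place
                for m in range(bound, step - 1, -1):
                    series[m] -= series[m - step]
    return series

def apply_Up(coeffs, p):
    """Coefficients of U_p f: b_m = a_{pm}, for as many m as the input allows."""
    return coeffs[::p]
```

Under the stated assumption, `apply_Up(f, 11)` reproduces the leading coefficients of `f` itself. There is no $p^{k-1}f(q^p)$ correction to add, in line with the geometric remark above.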
[remark: Distinction Between $T_p$ and $U_p$]
For $p \nmid N$, the operator $T_p$ sums over all $p+1$ cyclic order-$p$ subgroups. For $p \mid N$, the operator $U_p$ is adapted to the level structure at $p$ and behaves differently both geometrically and spectrally. This distinction is one of the reasons that local behaviour at primes dividing the level carries arithmetic information.
[/remark]
## Atkin-Lehner Involutions and Oldforms
How do forms at lower level sit inside forms at higher level, and how can we separate genuinely new forms from those obtained by level raising? Degeneracy maps give the embeddings from old levels; Atkin-Lehner involutions supply symmetries at the bad primes. The new subspace is defined by removing the images that come from lower levels.
[definition: Atkin-Lehner Involution]
Let $Q \mid N$ with $\gcd(Q,N/Q)=1$. The Atkin-Lehner involution $w_Q$ on $X_0(N)$ is the automorphism induced by an element of $GL_2^+(\mathbb Q)$ normalising $\Gamma_0(N)$ and having determinant $Q$.
[/definition]
In the moduli description, $w_Q$ quotients by the distinguished order-$Q$ part of the level subgroup. The condition $\gcd(Q,N/Q)=1$ says that $Q$ is an exact divisor of $N$, and the most important case is when $Q$ is the exact power of a prime dividing $N$. On newforms $w_Q$ acts by a sign $\pm 1$, and this sign is linked to the local functional equation of the associated $L$-function.
[example: Atkin-Lehner at Prime Level]
Let $N=p$ and let $(E,C)$ represent a non-cuspidal point of $Y_0(p)$, with $C \subset E[p]$ cyclic of order $p$. Let
\begin{align*}
\varphi:E \longrightarrow E/C
\end{align*}
be the quotient isogeny. The image $\varphi(E[p])$ is a cyclic subgroup of $E/C$ of order $p$: indeed
\begin{align*}
\ker(\varphi|_{E[p]})=E[p]\cap C=C,
\end{align*}
so
\begin{align*}
\varphi(E[p]) \cong E[p]/C,
\end{align*}
and since $|E[p]|=p^2$ and $|C|=p$, this quotient has order $p$. Thus
\begin{align*}
w_p(E,C)=(E/C,\varphi(E[p])).
\end{align*}
Now let $\widehat{\varphi}:E/C\to E$ be the dual isogeny, so
\begin{align*}
\widehat{\varphi}\circ\varphi=[p]_E,
\qquad
\varphi\circ\widehat{\varphi}=[p]_{E/C}.
\end{align*}
Since $\ker(\widehat{\varphi})=\varphi(E[p])$, applying $w_p$ a second time gives the quotient
\begin{align*}
(E/C)/\varphi(E[p]) \cong E
\end{align*}
via $\widehat{\varphi}$. The transported level subgroup is the image of $(E/C)[p]$ under $\widehat{\varphi}$:
\begin{align*}
\widehat{\varphi}((E/C)[p])
&\subseteq \ker(\varphi) \\
&=C,
\end{align*}
because for $y\in (E/C)[p]$,
\begin{align*}
\varphi(\widehat{\varphi}(y))=[p]_{E/C}(y)=0.
\end{align*}
The subgroup $\widehat{\varphi}((E/C)[p])$ has order $p$: the kernel of $\widehat{\varphi}$ restricted to $(E/C)[p]$ is $\ker(\widehat{\varphi})=\varphi(E[p])$, which has order $p$, and $|(E/C)[p]|=p^2$, so the image has order $p^2/p=p$. Hence
\begin{align*}
\widehat{\varphi}((E/C)[p])=C.
\end{align*}
Therefore the second application of $w_p$ identifies the resulting pair with $(E,C)$, so $w_p^2=1$ on $X_0(p)$.
[/example]
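The same involution can be seen on the matrix side. The sketch below assumes the standard choice $W_p=\begin{pmatrix}0&-1\\ p&0\end{pmatrix}$, which has determinant $p$; this particular matrix is not named in the text above. The code checks that $W_p^2$ is scalar, hence trivial as a Möbius transformation, and that conjugation by $W_p$ preserves $\Gamma_0(p)$ on sample elements.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(A):
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    return ((d / det, -b / det), (-c / det, a / det))

def fricke(p):
    """Assumed representative of w_p, of determinant p."""
    return ((0, -1), (p, 0))

def conjugate_by_fricke(gamma, p):
    W = fricke(p)
    return matmul(matmul(W, gamma), inverse(W))

def in_gamma0(M, N):
    """Integral entries, determinant 1, lower-left entry divisible by N."""
    entries = [Fraction(x) for row in M for x in row]
    if any(x.denominator != 1 for x in entries):
        return False
    a, b, c, d = entries
    return a * d - b * c == 1 and c % N == 0
```

A finite sample of matrices cannot prove normalisation, but the general computation $W_p\begin{pmatrix}a&b\\ c&d\end{pmatrix}W_p^{-1}=\begin{pmatrix}d&-c/p\\ -pb&a\end{pmatrix}$ shows why the check succeeds whenever $p\mid c$.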
At level $1$ this symmetry is invisible: $w_p$ interchanges the two degeneracy maps $X_0(p)\to X_0(1)$, sending $(E,C)$ to $E$ and to $E/C$ respectively, and hence interchanges the two ways of importing a level-$1$ form to level $p$. Separating genuinely new level-$p$ information from such imported forms is the old/new distinction. In higher level, the same issue appears whenever a form at level $N$ is obtained by pulling a form up from a proper divisor of $N$. To isolate the primitive contribution of level $N$, one first names the subspace generated by all such imported forms.
[definition: Old Subspace]
Fix a weight $k$ and a level $N$. The old subspace of $S_k(\Gamma_0(N))$ is the span of the images of the spaces $S_k(\Gamma_0(M))$, for all proper divisors $M$ of $N$, under the maps on cusp forms induced by all degeneracy maps $X_0(N) \to X_0(M)$.
[/definition]
After the old subspace has been identified, the remaining primitive contribution cannot be chosen by an arbitrary vector-space complement, since such a choice would have no reason to respect Hecke operators. The Petersson inner product supplies a canonical way to remove the imported forms while preserving the operator structure away from the level. This gives the correct definition of the new part.
[definition: New Subspace]
The new subspace $S_k(\Gamma_0(N))^{\mathrm{new}}$ is the orthogonal complement of the old subspace inside $S_k(\Gamma_0(N))$ with respect to the Petersson inner product.
[/definition]
Once old and new subspaces have been separated, the natural structural question is whether this separation accounts for all cusp forms at the level. The point is not merely to split a vector space, but to split it in a way compatible with the Hecke operators that will later supply eigenvalue systems. A canonical decomposition is therefore needed before one can speak cleanly about newforms as the primitive building blocks at level $N$.
[quotetheorem:4741]
[citeproof:4741]
The hypotheses $k \ge 2$ and the use of cusp forms matter because the Petersson inner product gives the orthogonality used to define the new subspace; without this inner-product structure, an arbitrary complement to the old subspace would not be canonical and need not be Hecke-stable. The decomposition is a statement about Hecke modules, not a claim that every vector-space complement has arithmetic meaning. Later, the newforms in this decomposition provide the eigenvalue systems used to isolate Galois representations.
[example: Oldforms at Level $Np$]
Let $p \nmid N$ and let $f \in S_k(\Gamma_0(N))$ have $q$-expansion
\begin{align*}
f(q)=\sum_{m\ge 1}a_mq^m.
\end{align*}
The two degeneracy maps from level $Np$ to level $N$ give two pullbacks of $f$ to level $Np$. The pullback along the map that forgets the $p$-part of the level leaves the Tate parameter unchanged, so its $q$-expansion is
\begin{align*}
f(q)=\sum_{m\ge 1}a_mq^m.
\end{align*}
The other degeneracy map is induced at the cusp by quotienting by the order-$p$ subgroup, which sends the Tate parameter to $q^p$. Therefore its pullback has expansion
\begin{align*}
f(q^p)
&=\sum_{m\ge 1}a_m(q^p)^m\\
&=\sum_{m\ge 1}a_mq^{pm}.
\end{align*}
Thus the two level-$Np$ forms generated from $f$ are
\begin{align*}
f(q)
\qquad\text{and}\qquad
f(q^p),
\end{align*}
and their linear span is
\begin{align*}
\operatorname{span}\{f(q),f(q^p)\}\subseteq S_k(\Gamma_0(Np)).
\end{align*}
As $f$ ranges over $S_k(\Gamma_0(N))$, these spans give the $p$-old contribution at level $Np$. The new subspace consists of the cusp forms at level $Np$ that are Petersson-orthogonal to all old contributions coming from proper divisors of $Np$.
[/example]
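The two $q$-expansions in the example can be sketched numerically. The following is a minimal Python illustration, assuming truncated coefficient lists with the convention `coeffs[m] = a_m`; the helper name `raise_level` is hypothetical and not notation from these notes. It uses the first coefficients of the level $11$ newform with $p=2$.

```python
def raise_level(coeffs, p, prec):
    """Return the coefficients of f(q^p) up to q^(prec-1).

    coeffs[m] is the coefficient a_m of f(q); the list is a truncation,
    so the output is only meaningful below q^(p * len(coeffs)).
    """
    out = [0] * prec
    for m, a_m in enumerate(coeffs):
        if p * m < prec:
            out[p * m] = a_m  # the term a_m q^m becomes a_m q^(p m)
    return out


# First coefficients of the level 11 newform: q - 2q^2 - q^3 + 2q^4 + q^5 + ...
f = [0, 1, -2, -1, 2, 1]
f_at_q2 = raise_level(f, 2, 12)  # coefficients of f(q^2) up to q^11

# f(q) starts with q and f(q^2) starts with q^2, so the two oldforms
# are linearly independent.
print(f)
print(f_at_q2)
```

Only the exponents move: the coefficient list of $f(q^p)$ is the list of $f$ spread out by a factor of $p$, which is why the leading terms $q$ and $q^p$ already show that the two forms are independent.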
## Why Correspondences Matter for Galois Representations
Why spend this time replacing formulas by geometry? The reason is that correspondences act not only on modular forms, but also on divisors, Jacobians, and etale cohomology. These are the objects from which the Galois representations in the rest of the course will be extracted.
The key point is functoriality. A finite correspondence $X \xleftarrow{\pi_1} C \xrightarrow{\pi_2} X$ gives pull-push operators on cohomology and on the Jacobian $J(X)$, and these operators are defined algebraically over the field of definition of the modular curve. Therefore Hecke eigenvalues can be detected inside Galois modules. Chapter 3 makes this precise by passing from correspondences on $X_0(N)$ to endomorphisms of $J_0(N)$ and its Hecke algebra.
[remark: Hecke Operators as Algebraic Endomorphisms]
For $X=X_0(N)$, the Hecke correspondence $T_n$ induces an endomorphism of the Jacobian $J_0(N)$ and an endomorphism of etale cohomology groups such as $H^1_{\mathrm{et}}(X_{\overline{\mathbb Q}},\mathbb Q_\ell)$. These actions commute with the natural action of $\operatorname{Gal}(\overline{\mathbb Q}/\mathbb Q)$ because the correspondences are defined over $\mathbb Q$.
[/remark]
This is the first appearance of the mechanism behind Eichler-Shimura. A normalised eigenform cuts out a system of Hecke eigenvalues; the corresponding eigenspaces in cohomology carry Galois actions; those Galois actions become the two-dimensional representations studied later.
The previous chapter built Hecke operators from finite diagrams of modular curves and showed how they act on geometric invariants. We now shift from the curves themselves to their Jacobians and the Hecke algebra acting on divisor classes, differentials, and homology, where the first traces of Galois representations begin to appear.
# 3. Jacobians and the Hecke Algebra
This chapter moves from modular curves as geometric objects to their Jacobians as linear receptacles for Hecke correspondences. The guiding question is how the correspondences from the previous chapter act on divisor classes, differentials, and homology, and why this action is rich enough to remember Hecke eigenforms. The answer is the integral Hecke algebra: a commutative ring, finitely generated as a $\mathbb Z$-module, acting on $J_0(N)$, whose maximal ideals isolate congruence classes of eigenforms.
Using the correspondences constructed in Chapter 2, the chapter also prepares the Galois-theoretic construction that follows. Eichler-Shimura relates Frobenius and Hecke operators on the Jacobian, while semisimplicity in characteristic zero lets us decompose the resulting modules according to systems of eigenvalues.
## Divisor Classes on the Jacobian
The first problem is to replace a compact Riemann surface by an abelian group that still records its holomorphic differentials and algebraic correspondences. Points of the curve themselves do not form an additive object, and meromorphic functions create unavoidable relations between formal sums of points. For $X_0(N)$ this replacement is the Jacobian, built from degree-zero divisors modulo principal divisors.
[definition: Divisor Group]
Let $X$ be a compact Riemann surface. The divisor group $\operatorname{Div}(X)$ is the free abelian group on the points of $X$. For
\begin{align*}
D = \sum_{P \in X} n_P[P]
\end{align*}
with finitely many non-zero $n_P \in \mathbb Z$, the degree is
\begin{align*}
\deg D = \sum_{P \in X} n_P.
\end{align*}
The subgroup of degree-zero divisors is denoted $\operatorname{Div}^0(X)$.
[/definition]
The degree-zero condition is the first constraint compatible with meromorphic functions, because the zeros and poles of a [meromorphic function](/page/Meromorphic%20Function) on a compact Riemann surface have the same total multiplicity. But it is still too coarse: adding the zeros of a meromorphic function to a divisor and subtracting its poles changes the divisor as a formal sum, even though both divisors should represent the same linear equivalence class. Degree alone cannot distinguish genuinely different divisor classes from changes produced by a function.
Principal divisors are the degree-zero divisors that should become zero in the Jacobian. Naming them isolates the subgroup by which degree-zero divisors will be quotiented.
The next formal step is to make this quotient subgroup intrinsic, not dependent on an informal picture of moving zeros and poles. This requires a divisor map from non-zero meromorphic functions to degree-zero divisors, whose image is exactly the collection of relations imposed by linear equivalence.
[definition: Principal Divisor]
Let $X$ be a compact Riemann surface, and let $\mathcal M(X)^*$ be the multiplicative group of non-zero meromorphic functions on $X$. The divisor map is the function
\begin{align*}
\operatorname{div}:\mathcal M(X)^* \to \operatorname{Div}^0(X)
\end{align*}
defined by
\begin{align*}
\operatorname{div}(f)=\sum_{P \in X} \operatorname{ord}_P(f)[P].
\end{align*}
The group of principal divisors is
\begin{align*}
\operatorname{Prin}(X)=\operatorname{im}(\operatorname{div})\subseteq \operatorname{Div}^0(X).
\end{align*}
[/definition]
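For the simplest compact Riemann surface, the Riemann sphere, the divisor map can be written out by hand. The following Python sketch assumes a rational function given in factored form, $f(z)=\prod (z-a)^{m_a}/\prod (z-b)^{n_b}$ in lowest terms; the function names and the string label `"inf"` for the point at infinity are illustrative choices. It uses the standard fact that the order at infinity is the degree of the denominator minus the degree of the numerator.

```python
from collections import Counter

INF = "inf"  # label for the point at infinity on the Riemann sphere


def divisor_of_rational_function(zeros, poles):
    """Divisor of f = prod (z - a)^m / prod (z - b)^n on the Riemann sphere.

    zeros and poles map finite points to positive multiplicities.
    """
    div = Counter()
    for a, m in zeros.items():
        div[a] += m
    for b, n in poles.items():
        div[b] -= n
    # ord_inf(f) = deg(denominator) - deg(numerator)
    div[INF] += sum(poles.values()) - sum(zeros.values())
    return {P: n for P, n in div.items() if n != 0}


def degree(div):
    """Sum of the multiplicities of a divisor."""
    return sum(div.values())


# f(z) = z^2 (z - 1) / (z + 1)
div_f = divisor_of_rational_function({0: 2, 1: 1}, {-1: 1})
print(div_f, degree(div_f))
```

In this example the divisor is $2[0]+[1]-[-1]-2[\infty]$, and the contribution at infinity is exactly what forces the degree to be zero.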
Since [principal divisors have degree zero](/theorems/2177) and $\operatorname{div}$ is a group homomorphism, $\operatorname{Prin}(X)$ is a subgroup of $\operatorname{Div}^0(X)$, and its cosets define an [equivalence relation](/page/Equivalence%20Relation), linear equivalence, on $\operatorname{Div}^0(X)$. The point of quotienting is to forget changes caused by meromorphic functions while retaining the global obstruction to moving one divisor to another. For a compact Riemann surface this quotient is not merely bookkeeping: it is the abelian group that will carry Hecke actions and later becomes a complex torus.
[definition: Jacobian]
Let $X$ be a compact Riemann surface. The Jacobian of $X$ is
\begin{align*}
J(X)=\operatorname{Div}^0(X)/\operatorname{Prin}(X).
\end{align*}
For the modular curve $X_0(N)$ we write
\begin{align*}
J_0(N)=J(X_0(N)).
\end{align*}
[/definition]
Analytically, the divisor quotient definition of $J(X)$ should match a more concrete object built from holomorphic differentials and periods. This comparison is necessary because Hecke correspondences act naturally on divisors, differentials, and homology, and later arguments need these actions to be the same construction viewed in different languages. The theorem below supplies the analytic torus model of the Jacobian and identifies its dimension with the genus.
[quotetheorem:4742]
[citeproof:4742]
This description explains why the Jacobian is the natural place for Hecke correspondences: correspondences push divisors forward, pull differentials back, and act on homology by functoriality. The genus hypothesis is essential in the dimension statement: for genus $0$ the space of holomorphic differentials is zero and the Jacobian is trivial, so there is no non-trivial period lattice to quotient by. The theorem does not give a canonical set of coordinates on $J(X)$, because the period lattice depends on the choice of homology basis; it gives the functorial analytic model needed to compare divisors, differentials, and homology.
[example: The Genus-One Boundary Case]
Let $O=\infty$ be the chosen cusp on the genus-one curve $X_0(11)$. The Abel-Jacobi map based at $O$ is
\begin{align*}
\phi_O:X_0(11) &\longrightarrow J_0(11),\\
P &\longmapsto [P]-[O].
\end{align*}
Since $X_0(11)$ has genus $1$, the [Abel-Jacobi theorem for compact Riemann surfaces](/theorems/4742) says that $\phi_O$ is an isomorphism of complex tori, and the chosen point $O$ maps to the zero element because
\begin{align*}
\phi_O(O)=[O]-[O]=0 \in J_0(11).
\end{align*}
Thus the divisor class $[P]-[\infty]$ is exactly the point $P$ after transporting the group law from $J_0(11)$ back to $X_0(11)$.
The dependence on the base point is visible algebraically. If $O'$ is another point, then for every $P\in X_0(11)$,
\begin{align*}
\phi_{O'}(P)
&=[P]-[O']\\
&=([P]-[O])-([O']-[O])\\
&=\phi_O(P)-\phi_O(O').
\end{align*}
So changing the origin translates the identification by the fixed element $-\phi_O(O')$ of the Jacobian. This is why a genus-one curve without a chosen origin is naturally a torsor for its Jacobian, while the pair $(X_0(11),\infty)$ is an elliptic curve. In this case that elliptic curve is the modular elliptic curve of conductor $11$.
[/example]
The example is the smallest case in which the Jacobian is already visible as an elliptic curve rather than as a higher-dimensional abelian variety.
## Hecke Operators on Geometry and Cohomology
The next question is how a Hecke correspondence acts on the Jacobian rather than only on modular forms. A formula on $q$-expansions does not by itself act on divisor classes, and an arbitrary relation between points would not preserve rational equivalence. A correspondence $C$ from $X$ to itself is a curve equipped with two finite maps to $X$, and it acts by pulling divisors back along one map and pushing them forward along the other.
[definition: Action of a Correspondence on Divisors]
Let $X$ be a smooth projective curve and let $C$ be a correspondence
\begin{align*}
X \xleftarrow{\pi_1} C \xrightarrow{\pi_2} X
\end{align*}
with $\pi_1$ and $\pi_2$ finite. The induced endomorphism of $\operatorname{Div}(X)$ is
\begin{align*}
C_*D=(\pi_2)_*(\pi_1)^*D.
\end{align*}
If $C$ is a Hecke correspondence on $X_0(N)$, the resulting endomorphism of $J_0(N)$ is denoted by the same Hecke symbol.
[/definition]
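The pull-push formula can be sketched on a toy example. The Python illustration below is not a Hecke correspondence: it assumes the correspondence on the Riemann sphere with $\pi_1(t)=t^2$ and $\pi_2(t)=t$, restricted to a few hand-picked points, and the table `SQUARE_ROOT_FIBRES` and the function name are hypothetical. Each entry lists the points $\pi_2(c)$ for $c$ in the fibre $\pi_1^{-1}(P)$ together with the ramification index of $\pi_1$ at $c$, which is the multiplicity appearing in $(\pi_1)^*[P]$.

```python
from collections import Counter

# Toy data for pi_1(t) = t^2, pi_2(t) = t.
# P -> list of (pi_2(c), ramification index of pi_1 at c) for c over P.
SQUARE_ROOT_FIBRES = {
    4: [(2, 1), (-2, 1)],
    1: [(1, 1), (-1, 1)],
    0: [(0, 2)],  # pi_1 is ramified over 0
}


def apply_correspondence(fibres, divisor):
    """Compute C_* D = (pi_2)_* (pi_1)^* D for a divisor given as {point: multiplicity}."""
    out = Counter()
    for P, n in divisor.items():
        for Q, e in fibres[P]:
            out[Q] += n * e  # pull back with multiplicity e, then push forward
    return {Q: m for Q, m in out.items() if m != 0}


D = {4: 1, 1: 1, 0: -2}  # a degree-zero divisor
print(apply_correspondence(SQUARE_ROOT_FIBRES, D))
```

The degree of $C_*D$ is $\deg(\pi_1)\deg(D)$, so degree-zero divisors are sent to degree-zero divisors; the ramified fibre over $0$ shows why the multiplicities cannot be dropped.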
Because pullback and pushforward along finite maps preserve principal divisors, the construction descends from divisors to divisor classes. This descent is the key point needed to turn a geometric correspondence on the curve into an actual endomorphism of its Jacobian. Without it, Hecke operators would act only on formal divisors and would not yet interact with the abelian variety whose Tate module will later carry Galois representations. The theorem below records precisely this passage from correspondences on $X_0(N)$ to endomorphisms of $J_0(N)$, which is the setting where later torsion and Tate-module arguments take place.
[quotetheorem:4743]
[citeproof:4743]
The finite-map hypothesis is what makes the construction descend to the Jacobian: without finiteness there is no pushforward of divisors with finite multiplicities, and without preservation of principal divisors the operation would not respect rational equivalence. The theorem does not say that all endomorphisms of $J_0(N)$ arise from Hecke correspondences; it only constructs the arithmetic subalgebra generated by them. The same operator can now be read in three compatible languages: divisor classes on $J_0(N)$, holomorphic differentials on $X_0(N)$, and singular homology of the Riemann surface.
[illustration:hecke-correspondence-divisors]
[quotetheorem:4744]
[citeproof:4744]
This compatibility is the bridge between analytic eigenforms and arithmetic quotients of the Jacobian. The identification with weight $2$ cusp forms is essential: modular forms of other weights do not directly give holomorphic differentials on $X_0(N)$. The result also does not produce an integral eigenbasis of homology; denominators and congruences between eigenforms are exactly why the integral Hecke algebra must be introduced next.
[example: Point Counts Recover Hecke Eigenvalues in Level Eleven]
For $N=11$, the curve $X_0(11)$ has genus $1$, so $J_0(11)$ is an elliptic curve and $S_2(\Gamma_0(11))$ is a one-dimensional complex vector space generated by a normalized newform
\begin{align*}
f=\sum_{n\ge 1}a_nq^n.
\end{align*}
By compatibility of the Hecke action with holomorphic differentials, each $T_n$ acts on this one-dimensional differential line as multiplication by the scalar $a_n$.
Let $p\nmid 11$. Then $X_0(11)$ has good reduction at $p$. On the $f$-isotypic part, the Eichler-Shimura relation on the Jacobian gives the Frobenius characteristic polynomial
\begin{align*}
X^2-a_pX+p.
\end{align*}
Hence the trace of Frobenius is $a_p$. For an elliptic curve $E/\mathbb F_p$, the point-count formula is
\begin{align*}
\#E(\mathbb F_p)
&=1-\operatorname{tr}(\operatorname{Frob}_p\mid H^1(E))+p\\
&=p+1-\operatorname{tr}(\operatorname{Frob}_p\mid H^1(E)).
\end{align*}
Applying this to the good reduction of $X_0(11)$ and substituting $\operatorname{tr}(\operatorname{Frob}_p)=a_p$ gives
\begin{align*}
\#X_0(11)(\mathbb F_p)
&=p+1-a_p.
\end{align*}
Solving for $a_p$ yields
\begin{align*}
a_p=p+1-\#X_0(11)(\mathbb F_p).
\end{align*}
Thus geometric point counts determine the same integers that occur as Hecke eigenvalues. The condition $p\nmid 11$ is essential: at $p=11$ the curve has bad reduction, so the good-reduction Frobenius trace formula is replaced by the local $U_{11}$ operator together with the reduction type of the elliptic curve.
[/example]
This example previews the form of the Galois representation attached to an eigenform: Frobenius traces should be Hecke eigenvalues.
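The identity $a_p=p+1-\#X_0(11)(\mathbb F_p)$ can be tested by brute force. The Python sketch below relies on two standard facts that are not stated in these notes and should be read as assumptions: the Weierstrass model $y^2+y=x^3-x^2-10x-20$ for $X_0(11)$, and the product formula $f=q\prod_{n\ge1}(1-q^n)^2(1-q^{11n})^2$ for the level $11$ newform. The function names are illustrative.

```python
def eta_product_level11(prec):
    """Coefficients a_0, ..., a_(prec-1) of q * prod (1 - q^n)^2 (1 - q^(11n))^2."""
    series = [0] * prec
    series[0] = 1

    def multiply_by_one_minus_q_power(m):
        # In-place truncated multiplication by (1 - q^m); descending order
        # ensures series[i - m] still holds the old value.
        for i in range(prec - 1, m - 1, -1):
            series[i] -= series[i - m]

    for n in range(1, prec):
        for _ in range(2):
            multiply_by_one_minus_q_power(n)
            multiply_by_one_minus_q_power(11 * n)
    return [0] + series[: prec - 1]  # multiply by q


def count_points(p):
    """Number of F_p-points on y^2 + y = x^3 - x^2 - 10x - 20, including infinity."""
    count = 1  # the point at infinity
    for x in range(p):
        rhs = (x**3 - x**2 - 10 * x - 20) % p
        for y in range(p):
            if (y * y + y - rhs) % p == 0:
                count += 1
    return count


a = eta_product_level11(20)
for p in (2, 3, 5, 7, 13):
    print(p, a[p], p + 1 - count_points(p))
```

For each good prime in the loop the last two columns agree. The brute-force count takes on the order of $p^2$ steps, so this is only a sanity check for small primes.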
## Eichler-Shimura on the Jacobian
The central problem is to connect Hecke operators with Frobenius. If $p \nmid N$, the curve $X_0(N)$ and its Jacobian have good reduction at $p$, and the geometric correspondence $T_p$ reduces modulo $p$ in a way controlled by Frobenius and Verschiebung.
[quotetheorem:4745]
[citeproof:4745]
The second formulation is the one used in Galois representations. The hypothesis $p \nmid N$ is essential: it gives good reduction and separates the order-$p$ subgroup schemes into Frobenius and Verschiebung pieces. The theorem does not describe the local representation at bad primes, where the geometry of the special fibre and the operator $U_p$ replace the displayed quadratic relation. For good primes it says that the characteristic polynomial of Frobenius on the relevant two-dimensional quotient should be $X^2-a_pX+p$ in weight $2$.
[remark: Arithmetic Meaning of the Relation]
For a weight $2$ eigenform $f$ with $T_pf=a_pf$, the Eichler-Shimura relation predicts that Frobenius should have trace $a_p$ and determinant $p$ on the $f$-isotypic part of the Tate module. This is the first appearance of the trace-determinant recipe for modular Galois representations.
[/remark]
The relation is special to good primes. At primes dividing $N$, the operator $U_p$ and the geometry of bad reduction replace the simple Frobenius plus Verschiebung formula.
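One consequence of the quadratic relation can be sketched numerically. Assuming the standard formula $\#E(\mathbb F_{p^n})=p^n+1-(\alpha^n+\beta^n)$, where $\alpha,\beta$ are the roots of $X^2-a_pX+p$ (a fact not proved in this section), the power sums $s_n=\alpha^n+\beta^n$ satisfy $s_n=a_ps_{n-1}-ps_{n-2}$, so a single eigenvalue $a_p$ determines the point counts over every extension of $\mathbb F_p$. The function names below are illustrative.

```python
def frobenius_power_traces(a_p, p, n_max):
    """Return [s_0, ..., s_n_max] with s_n = alpha^n + beta^n.

    alpha and beta are the roots of X^2 - a_p X + p, so the sequence
    satisfies s_n = a_p * s_(n-1) - p * s_(n-2) with s_0 = 2, s_1 = a_p.
    """
    traces = [2, a_p]
    for _ in range(2, n_max + 1):
        traces.append(a_p * traces[-1] - p * traces[-2])
    return traces[: n_max + 1]


def point_counts(a_p, p, n_max):
    """Point counts over F_(p^n) for n = 1, ..., n_max predicted by the trace formula."""
    s = frobenius_power_traces(a_p, p, n_max)
    return [p**n + 1 - s[n] for n in range(1, n_max + 1)]


# Level 11 at p = 2, where a_2 = -2.
print(point_counts(-2, 2, 3))
```

The recursion is the trace form of the relation $\operatorname{Frob}^2-a_p\operatorname{Frob}+p=0$ on a two-dimensional space, so it uses nothing beyond the trace-determinant recipe.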
## The Integral Hecke Algebra
The final question is how to isolate the part of $J_0(N)$ belonging to a particular eigenform. Over $\mathbb C$ we can project onto eigenspaces of differentials, but an abelian variety has no canonical operation of taking a complex eigenspace, and reduction modulo $\lambda$ can merge distinct characteristic-zero eigensystems. The answer is to package all Hecke operators into a single commutative algebra and then localise or quotient at the maximal ideal determined by the eigenvalues.
[definition: Integral Hecke Algebra]
The integral Hecke algebra of level $N$ is the subring
\begin{align*}
\mathbb T = \mathbb Z[T_n : n \ge 1] \subset \operatorname{End}(J_0(N)).
\end{align*}
It is generated by the Hecke endomorphisms acting on $J_0(N)$.
[/definition]
The same algebra acts faithfully on several objects after tensoring with $\mathbb Q$, including $S_2(\Gamma_0(N))$ and $H_1(X_0(N),\mathbb Z)$. The integral structure matters because congruences between eigenforms are detected by maximal ideals of $\mathbb T$.
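The action on $S_2(\Gamma_0(N))$ can be checked on $q$-expansions. The sketch below assumes the standard formula $a_n(T_pf)=a_{pn}+p\,a_{n/p}$ for weight $2$ and $p\nmid N$, with $a_{n/p}=0$ when $p\nmid n$; this formula is quoted here as an assumption, and the helper name `hecke_tp` is hypothetical. The coefficient list is that of the level $11$ newform up to $q^{11}$.

```python
def hecke_tp(coeffs, p, n_max):
    """Coefficients a_0..a_n_max of T_p f in weight 2 for p not dividing the level.

    Uses a_n(T_p f) = a_(pn) + p * a_(n/p); coeffs must be known up to index p * n_max.
    """
    out = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        lower = p * coeffs[n // p] if n % p == 0 else 0
        out[n] = coeffs[p * n] + lower
    return out


# Level 11 newform: q - 2q^2 - q^3 + 2q^4 + q^5 + 2q^6 - 2q^7 - 2q^9 - 2q^10 + q^11 + ...
f = [0, 1, -2, -1, 2, 1, 2, -2, 0, -2, -2, 1]

t2_f = hecke_tp(f, 2, 5)
t3_f = hecke_tp(f, 3, 3)
print(t2_f, [f[2] * c for c in f[:6]])  # T_2 f against a_2 * f
print(t3_f, [f[3] * c for c in f[:4]])  # T_3 f against a_3 * f
```

Because the space is one-dimensional at level $11$, each $T_p$ must act as a scalar, and the comparison shows that the scalar is $a_p$ on the coefficients that the truncation can see.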
A single eigenform usually has Hecke eigenvalues that are not rational numbers, so the map from $\mathbb T$ determined by that form may land outside $\mathbb Q$. Before one can reduce its eigenvalues modulo a prime or compare them with integral Hecke operators, the field generated by all of those eigenvalues has to be specified.
This field is the coefficient domain in which the eigensystem lives. Naming it now makes later choices of primes, completions, and reductions precise rather than treating the eigenvalues as floating complex numbers.
[definition: Hecke Field]
Let $f=\sum_{n\ge1}a_nq^n$ be a normalized eigenform. The Hecke field of $f$ is
\begin{align*}
K_f=\mathbb Q(a_n : n \ge 1).
\end{align*}
[/definition]
The eigenvalues define a ring homomorphism from the Hecke algebra to the ring generated by the coefficients of $f$.
Congruence questions require more than the characteristic-zero field: two eigenforms can be distinct over $\mathbb C$ but become indistinguishable after reducing their eigenvalues modulo a prime. The way to record this loss of information inside the Hecke algebra is to choose a prime of the coefficient ring and take the kernel of the reduced eigenvalue system.
[definition: Maximal Ideal Attached to an Eigenform]
Let $f=\sum_{n\ge1}a_nq^n$ be a normalized eigenform, let $\mathcal O_f$ be the ring of integers of its Hecke field, and let $\lambda$ be a non-zero prime ideal of $\mathcal O_f$, so that $\mathcal O_f/\lambda$ is a finite field. Since the Hecke operators preserve the integral lattice of modular forms, the eigenvalues $a_n$ are algebraic integers, so $a_n \in \mathcal O_f$. The eigenvalue homomorphism is the ring map
\begin{align*}
\theta_f:\mathbb T \to \mathcal O_f, \qquad \theta_f(T_n)=a_n.
\end{align*}
The maximal ideal of $\mathbb T$ attached to $(f,\lambda)$ is
\begin{align*}
\mathfrak m_{f,\lambda}=\ker\bigl(\mathbb T \xrightarrow{\theta_f} \mathcal O_f \to \mathcal O_f/\lambda\bigr).
\end{align*}
[/definition]
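A residual comparison of eigensystems can be sketched in the simplest case $\mathcal O_f=\mathbb Z$. The Python illustration below compares the first prime-index eigenvalues of the level $11$ newform with the Eisenstein system $T_p\mapsto p+1$ modulo $5$. Two caveats apply: the Eisenstein system does not come from a cusp form, so this only illustrates the congruence test itself, and a check at finitely many primes is evidence for a congruence, not a proof. The dictionary layout and function name are illustrative choices.

```python
def agree_mod(eigs_a, eigs_b, ell):
    """Check a_p = b_p (mod ell) at every prime p listed in eigs_a."""
    return all((eigs_a[p] - eigs_b[p]) % ell == 0 for p in eigs_a)


# Eigenvalues a_p of the level 11 newform at the first good primes.
cusp_form = {2: -2, 3: -1, 5: 1, 7: -2}

# Eisenstein-type system T_p -> p + 1 at the same primes.
eisenstein = {p: p + 1 for p in cusp_form}

print(agree_mod(cusp_form, eisenstein, 5))  # the two systems agree mod 5 at these primes
print(agree_mod(cusp_form, eisenstein, 3))  # but not mod 3
```

In the language of the definition, agreement modulo $\lambda$ at all $T_n$ means that the two reduced eigenvalue maps have the same kernel, so they determine the same maximal ideal.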
Localising at $\mathfrak m_{f,\lambda}$ extracts the congruence class of eigenforms whose Hecke eigenvalues agree with those of $f$ modulo $\lambda$. Before passing to congruences, however, one needs to know what the Hecke algebra looks like in characteristic zero, where distinct eigenvalue systems should separate cleanly. The obstruction is that the integral Hecke algebra can contain congruence information and need not split as a product over $\mathbb Z$. Tensoring with $\mathbb Q$ inverts every prime, which discards that congruence information and reveals the semisimple picture against which the later mod-$\lambda$ complications are measured.
[quotetheorem:4746]
[citeproof:4746]
Semisimplicity explains why characteristic zero sees eigenforms separately. The characteristic-zero hypothesis is essential: after reducing modulo $\lambda$, two different systems of eigenvalues can become equal and nilpotent congruence phenomena may appear in the local Hecke algebra. The theorem does not assert that $\mathbb T$ itself is a product of rings over $\mathbb Z$; it only describes the rational algebra after tensoring with $\mathbb Q$. Integral and mod-$\lambda$ questions are subtler because different characteristic-zero eigenforms may become congruent.
The semisimple characteristic-zero picture is only useful for Galois representations if an individual newform can be isolated inside it. The obstruction is that the full space contains many Hecke eigensystems at once, while a later construction needs the eigenspace cut out by the eigenvalues of a fixed $f$ to be as small as possible. The theorem below supplies that separation over $\mathbb C$, before integral congruences and $\lambda$-adic localisations make the picture less clean.
[quotetheorem:4747]
[citeproof:4747]
The newform hypothesis is essential: oldforms can give several copies of the same lower-level eigensystem inside a larger level. The theorem does not rule out congruences between distinct newforms modulo $\lambda$; it is a statement over $\mathbb C$. In this course the theorem is used as an input to control the size of eigenspaces and the corresponding quotients of $J_0(N)$.
[example: Certifying a Hecke Field by a Sturm Bound]
Let $K=\mathbb Q(\sqrt{5})$, and let
\begin{align*}
f=q+a_2q^2+a_3q^3+\cdots \in S_2(\Gamma_0(N))
\end{align*}
be a normalized eigenform with $a_2=\frac{1+\sqrt{5}}{2}$ and $a_3=-2$. For weight $2$, the Sturm bound is
\begin{align*}
B=\left\lfloor \frac{2}{12}[SL_2(\mathbb Z):\Gamma_0(N)]\right\rfloor
=\left\lfloor \frac{1}{6}[SL_2(\mathbb Z):\Gamma_0(N)]\right\rfloor.
\end{align*}
Suppose the computed coefficients satisfy
\begin{align*}
a_n\in K \qquad \text{for every } 1\le n\le B.
\end{align*}
We show that every coefficient of $f$ lies in $K$. Let $\sigma$ be any field automorphism of $\mathbb C$ fixing $K$. Since $a_n\in K$ for $1\le n\le B$, we have
\begin{align*}
\sigma(a_n)=a_n \qquad \text{for } 1\le n\le B.
\end{align*}
Apply $\sigma$ coefficientwise to $f$. Because $S_2(\Gamma_0(N))$ has a basis of forms with rational $q$-expansion coefficients, $\sigma(f)$ again lies in $S_2(\Gamma_0(N))$, and
\begin{align*}
\sigma(f)-f
&=\sum_{n\ge 1}(\sigma(a_n)-a_n)q^n.
\end{align*}
For $1\le n\le B$, each coefficient $\sigma(a_n)-a_n$ is $0$, and the constant term vanishes because both forms are cuspidal, so every coefficient of $\sigma(f)-f$ up to $q^B$ vanishes. By *Sturm's theorem*, this forces
\begin{align*}
\sigma(f)-f=0.
\end{align*}
Hence $\sigma(a_n)=a_n$ for every $n\ge 1$, so all coefficients of $f$ are fixed by every automorphism of $\mathbb C$ fixing $K$. Since the subfield of $\mathbb C$ fixed by all such automorphisms is $K$ itself,
\begin{align*}
a_n\in K \qquad \text{for every } n\ge 1.
\end{align*}
Now $K_f=\mathbb Q(a_n:n\ge 1)$ is contained in $K$. The coefficient $a_2$ gives the reverse containment because
\begin{align*}
2a_2-1=(1+\sqrt{5})-1=\sqrt{5},
\end{align*}
so
\begin{align*}
\sqrt{5}\in K_f.
\end{align*}
Thus
\begin{align*}
\mathbb Q(\sqrt{5})\subseteq K_f\subseteq \mathbb Q(\sqrt{5}),
\end{align*}
and therefore
\begin{align*}
K_f=\mathbb Q(\sqrt{5}).
\end{align*}
For example, reducing the certified coefficient relations modulo the prime ideal $\lambda=(\sqrt{5})$ above $5$, whose residue field is $\mathbb F_5$, gives a residual Hecke eigensystem with
\begin{align*}
a_2=\frac{1+\sqrt{5}}{2}\equiv \frac{1}{2}\equiv 3 \pmod{\lambda},
\qquad
a_3=-2\equiv 3 \pmod{\lambda}.
\end{align*}
The finite Sturm check certifies the whole Hecke field, not merely the first few displayed coefficients.
[/example]
This computation is typical: finitely many coefficients up to a Sturm bound determine the eigenform, while the field generated by those coefficients determines where its Galois representation will be realised.
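The bound $B$ is easy to compute in practice. The Python sketch below assumes the standard index formula $[SL_2(\mathbb Z):\Gamma_0(N)]=N\prod_{p\mid N}(1+1/p)$, which is not derived in this section; the function names are illustrative.

```python
def prime_divisors(n):
    """Distinct prime divisors of n >= 1 by trial division."""
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def gamma0_index(N):
    """Index of Gamma_0(N) in SL_2(Z): N * prod over p | N of (1 + 1/p)."""
    index = N
    for p in prime_divisors(N):
        index = index // p * (p + 1)  # exact: p still divides index at this step
    return index


def sturm_bound(k, N):
    """Floor of k/12 times the index of Gamma_0(N)."""
    return (k * gamma0_index(N)) // 12


for N in (11, 23, 37):
    print(N, gamma0_index(N), sturm_bound(2, N))
```

For prime level $N$ and weight $2$ the index is $N+1$, so the number of coefficients to check grows only linearly in the level.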
## From Hecke Ideals to Abelian Variety Quotients
The chapter ends with the construction that turns an eigenform into a geometric factor of the Jacobian. Instead of taking an eigenspace of differentials, which is a vector-space operation and need not define an abelian subvariety over $\mathbb Q$, we take a quotient abelian variety cut out by the annihilator of the eigenform. In practice, this means computing enough Hecke matrices on modular symbols or homology to identify the relations $T_n-a_n(f)$, generating the ideal they span, and quotienting the Jacobian by the abelian subvariety generated by the images of the elements of that ideal.
[definition: Modular Abelian Variety Attached to a Newform]
Let $f$ be a normalized newform of weight $2$ and level $N$. Let
\begin{align*}
I_f=\ker(\mathbb T \to \mathcal O_f),
\end{align*}
where $T_n$ maps to $a_n(f)$. The modular abelian variety attached to $f$ is
\begin{align*}
A_f=J_0(N)/I_fJ_0(N).
\end{align*}
[/definition]
The quotient $A_f$ is defined over $\mathbb Q$ and has dimension $[K_f:\mathbb Q]$ under the usual newform hypotheses. The newform hypothesis is needed here: an oldform would reproduce a factor already coming from a lower-level Jacobian rather than define a genuinely new quotient at level $N$. When $K_f=\mathbb Q$, it is an elliptic curve quotient of $J_0(N)$.
The quotient definition still needs its arithmetic justification: one must know that this geometric factor really carries the eigenform's Hecke data in the expected way. The following result is the point where the newform is promoted from an analytic eigenvector to a modular abelian variety with the corresponding Hecke action.
[quotetheorem:4748]
[citeproof:4748]
The quotient $A_f$ is the geometric home for the Galois representation associated with $f$. The theorem does not yet construct the two-dimensional representation itself: $A_f$ has dimension $[K_f:\mathbb Q]$, so its full Tate module has dimension $2[K_f:\mathbb Q]$ over $\mathbb Q_\ell$, which exceeds two whenever $K_f\neq\mathbb Q$. In Chapters 5 and 6, its $\lambda$-adic Tate module is localised at $\mathfrak m_{f,\lambda}$ to produce a two-dimensional representation whose Frobenius traces are the Hecke eigenvalues.
At this point the Hecke action has been transported from correspondences on curves to linear algebra on Jacobians and homology. The next chapter identifies the precise bridge between modular forms and the topology of modular curves, with weight 2 providing the cleanest case through Eichler-Shimura theory.
# 4. Eichler-Shimura Theory
The previous chapters constructed modular curves and their Hecke correspondences as geometric objects. We now turn to the bridge between analytic modular forms and the topology of those curves. For weight $2$, this bridge is especially concrete: cusp forms are holomorphic differentials, their periods define cohomology classes, and Hecke operators act compatibly on both sides. The outcome is Eichler-Shimura theory, which is the cohomological starting point for attaching arithmetic representations to eigenforms.
## Weight Two Cusp Forms as Differentials
Why does weight $2$ play a privileged role on a modular curve? The answer is that the transformation law for a weight $2$ modular form is exactly the transformation law needed for $f(z)\,dz$ to descend from the upper half-plane to the quotient curve.
Let $\mathcal H = \{z \in \mathbb C : \operatorname{Im}(z) > 0\}$, and let $\Gamma_0(N) \le SL_2(\mathbb Z)$ act on $\mathcal H$ by fractional linear transformations. If $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then
\begin{align*}
\gamma z = \frac{az+b}{cz+d}, \qquad d(\gamma z) = \frac{dz}{(cz+d)^2}.
\end{align*}
Thus a holomorphic function $f: \mathcal H \to \mathbb C$ satisfying $f(\gamma z) = (cz+d)^2 f(z)$ gives
\begin{align*}
f(\gamma z)\,d(\gamma z) = f(z)\,dz.
\end{align*}
This invariant differential is the geometric reason that $S_2(\Gamma_0(N))$ appears in the cohomology of $X_0(N)$.
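The derivative formula used above can be checked directly from the quotient rule, using only $ad-bc=1$:
\begin{align*}
\frac{d}{dz}\left(\frac{az+b}{cz+d}\right)
&=\frac{a(cz+d)-c(az+b)}{(cz+d)^2}\\
&=\frac{ad-bc}{(cz+d)^2}\\
&=\frac{1}{(cz+d)^2}.
\end{align*}
Substituting into the weight $2$ transformation law gives
\begin{align*}
f(\gamma z)\,d(\gamma z)
&=(cz+d)^2f(z)\cdot\frac{dz}{(cz+d)^2}\\
&=f(z)\,dz,
\end{align*}
so the two automorphy factors cancel exactly when the weight is $2$.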
[definition: Holomorphic Differential on a Riemann Surface]
Let $X$ be a compact Riemann surface. A holomorphic differential on $X$ is a holomorphic section of the canonical bundle $\Omega_X^1$.
[/definition]
For a modular curve, it is often best to construct such a differential upstairs on $\mathcal H$ and then check invariance under the group. The cusp condition controls the extension across the compactifying points.
[quotetheorem:4749]
[citeproof:4749]
This theorem gives the first manifestation of modular forms as cohomological objects. The weight $2$ hypothesis is essential because $d(\gamma z)$ contributes exactly the factor $(cz+d)^{-2}$; for weights other than $2$, the expression $f(z)\,dz$ is not invariant and must instead be interpreted as a section of a different automorphic line bundle. The cusp condition is also essential: a modular form with non-zero constant term at a cusp gives $f(q)dq/q$, which has a logarithmic pole at $q=0$ rather than a holomorphic extension. Thus the theorem does not identify all modular forms with differentials; it identifies precisely the cuspidal weight $2$ forms with regular differentials on the compact modular curve, setting up the later passage from differentials to cohomology classes.
[example: The Level Eleven Differential]
For $N=11$, the cusp-form space $S_2(\Gamma_0(11))$ is one-dimensional, generated by the normalized newform
\begin{align*}
f(q)=q-2q^2-q^3+2q^4+q^5+2q^6-2q^7-2q^9-2q^{10}+q^{11}+\cdots.
\end{align*}
Under the weight $2$ differential construction, the associated differential is
\begin{align*}
\omega_f
&= f(q)\frac{dq}{q} \\
&= \left(q-2q^2-q^3+2q^4+q^5+2q^6-2q^7-2q^9-2q^{10}+q^{11}+\cdots\right)\frac{dq}{q} \\
&= \left(\frac{q}{q}-2\frac{q^2}{q}-\frac{q^3}{q}+2\frac{q^4}{q}+\frac{q^5}{q}+2\frac{q^6}{q}-2\frac{q^7}{q}-2\frac{q^9}{q}-2\frac{q^{10}}{q}+\frac{q^{11}}{q}+\cdots\right)dq \\
&= \left(1-2q-q^2+2q^3+q^4+2q^5-2q^6-2q^8-2q^9+q^{10}+\cdots\right)dq.
\end{align*}
There is no pole at $q=0$, because the displayed coefficient of $dq$ is a [power series](/page/Power%20Series) with non-negative powers of $q$. Since $X_0(11)$ has genus $1$, the space $H^0(X_0(11),\Omega^1)$ has complex dimension $1$, so the non-zero differential $\omega_f$ spans the holomorphic differentials on the elliptic curve $X_0(11)$.
[/example]
## Betti Cohomology and Periods
How can a holomorphic modular form be recorded using only topology? The basic operation is integration over paths and cycles. Once a cusp form is viewed as a differential, its periods along homology classes give linear functionals on homology, and hence classes in Betti cohomology.
[definition: Betti Cohomology of a Modular Curve]
Let $X_0(N)(\mathbb C)$ be the complex analytic modular curve. Its first Betti cohomology is
\begin{align*}
H^1_B(X_0(N),\mathbb C) := H^1(X_0(N)(\mathbb C),\mathbb C).
\end{align*}
[/definition]
The de Rham class of a holomorphic differential gives a Betti cohomology class by integration. Explicitly, if $\omega$ is a holomorphic differential and $\gamma$ is a singular $1$-cycle, the period is
\begin{align*}
\gamma \longmapsto \int_\gamma \omega.
\end{align*}
The value depends only on the homology class of $\gamma$, because holomorphic differentials on a Riemann surface are closed and so integrate to zero over boundaries.
The period construction suggests a map from differentials to cohomology, but it does not by itself describe the whole cohomology group. The obstruction is a dimension mismatch: holomorphic differentials supply only the $(1,0)$ part, while Betti cohomology also contains classes detected by anti-holomorphic differentials. A precise comparison theorem is needed to say that these two analytic pieces together account for all complex Betti cohomology.
[quotetheorem:4750]
[citeproof:4750]
The complex conjugate summand is not an error in the dictionary; it is essential. A compact Riemann surface of genus $g$ has $\dim_{\mathbb C} H^0(X,\Omega^1)=g$ but $\dim_{\mathbb C}H^1_B(X,\mathbb C)=2g$, so holomorphic differentials alone account for only half of Betti cohomology. For example, an elliptic curve has one holomorphic differential but two independent complex Betti cohomology classes. The theorem also depends on compactness: on the open curve $Y_0(N)$, cuspidal and Eisenstein boundary phenomena enter and the clean two-summand statement must be replaced by a statement involving relative or compactly supported cohomology. Eichler-Shimura theory packages both a form and its conjugate into a rational cohomological object, which is why this decomposition is the bridge to rational Hecke modules rather than just to complex analysis.
For computations, arbitrary cycles in Betti homology are too flexible to serve as a finite, explicit basis. The next definition replaces them with canonical relative paths between cusps, which are rigid enough to encode by rational endpoints and still rich enough to pair with modular forms.
[definition: Modular Symbol]
Let $\alpha,\beta \in \mathbb P^1(\mathbb Q)$ and let $\Gamma \le SL_2(\mathbb Z)$ be a congruence subgroup. The modular symbol $\{\alpha,\beta\}$ is the homology class, relative to the cusps, of the geodesic path in $\mathcal H \cup \mathbb P^1(\mathbb Q)$ from $\alpha$ to $\beta$, modulo the action of $\Gamma$.
[/definition]
Modular symbols replace arbitrary paths on a modular curve by paths between cusps. They are computationally useful because the cusps are rational and the relations among such symbols can be described using the action of $SL_2(\mathbb Z)$.
What remains is to know that these symbols are not merely convenient examples of relative cycles. The theorem below supplies the finite generation statement that makes modular symbols a practical model for the relevant homology.
[quotetheorem:4751]
[citeproof:4751]
This result is the computational form of relative homology on modular curves: it replaces arbitrary paths by finitely many cusp-to-cusp symbols with explicit relations.
[illustration:manin-tessellation-modular-symbol]
The standard tessellation of the upper half-plane by ideal triangles is the geometric source of those finite relations. The congruence-subgroup hypothesis is important because it makes the quotient of the tessellation finite enough to produce a finite presentation; without such finiteness, modular symbols need not give a practical finite linear-algebra model. The theorem concerns relative homology with cusps included, so it does not by itself isolate the cuspidal part; one must later pass to the cuspidal quotient or pair with cusp forms to remove boundary classes. This is the reason modular symbols make Hecke eigenclasses algorithmically accessible.
[example: Periods from a Modular Symbol]
Let $f\in S_2(\Gamma_0(N))$, write its Fourier expansion at infinity as
\begin{align*}
f(z)=\sum_{n\ge 1} a_n e^{2\pi inz},
\end{align*}
and let $\omega_f=2\pi i f(z)\,dz$. For the modular symbol $\{0,\infty\}$, represented by the vertical path $z=it$ with $0<t<\infty$, the period is
\begin{align*}
\int_{\{0,\infty\}}\omega_f
&=2\pi i\int_0^{i\infty} f(z)\,dz \\
&=2\pi i\int_0^\infty f(it)\, i\,dt \\
&=-2\pi\int_0^\infty f(it)\,dt.
\end{align*}
Along this path,
\begin{align*}
q=e^{2\pi iz}=e^{2\pi i(it)}=e^{-2\pi t},
\end{align*}
so
\begin{align*}
f(it)=\sum_{n\ge 1}a_n e^{-2\pi nt}.
\end{align*}
For each $n\ge 1$,
\begin{align*}
\int_0^\infty e^{-2\pi nt}\,dt
&=\left[-\frac{1}{2\pi n}e^{-2\pi nt}\right]_{0}^{\infty} \\
&=0-\left(-\frac{1}{2\pi n}\right) \\
&=\frac{1}{2\pi n}.
\end{align*}
Thus, whenever the termwise integration is justified, for instance after using cuspidality at the endpoints to control the improper integral, the period becomes
\begin{align*}
\int_{\{0,\infty\}}\omega_f
&=-2\pi\sum_{n\ge 1}a_n\int_0^\infty e^{-2\pi nt}\,dt \\
&=-2\pi\sum_{n\ge 1}a_n\frac{1}{2\pi n} \\
&=-\sum_{n\ge 1}\frac{a_n}{n}.
\end{align*}
Equivalently, using $dq/q=2\pi i\,dz$, the same computation reads
\begin{align*}
\int_{\{0,\infty\}}\omega_f
&=\int_{q=1}^{q=0} f(q)\frac{dq}{q} \\
&=\int_1^0 \left(\sum_{n\ge 1}a_n q^n\right)\frac{dq}{q} \\
&=\int_1^0 \sum_{n\ge 1}a_n q^{n-1}\,dq \\
&=-\sum_{n\ge 1}\frac{a_n}{n}.
\end{align*}
The modular-symbol period is therefore the value at $s=1$ of the Dirichlet series attached to $f$, up to the displayed sign coming from the orientation from $0$ to $\infty$; this is the concrete numerical bridge between the $q$-expansion and a homology class.
[/example]
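The termwise integration in the example above can be sanity-checked numerically. The following is a minimal sketch, not a period computation: it assumes a coefficient list truncated to the five coefficients $a_1,\dots,a_5$ displayed for the level $11$ newform later in this chapter, and the names `COEFFS`, `period_by_quadrature`, and `period_termwise` are hypothetical helpers introduced only for illustration. It compares a midpoint-rule approximation of $-2\pi\int_0^\infty f(it)\,dt$ with the closed form $-\sum a_n/n$ for the same truncated sum.

```python
import math

# Truncated coefficient list a_1..a_5 (an illustrative truncation, not a full modular form).
COEFFS = [1, -2, -1, 2, 1]

def f_on_imaginary_axis(t, coeffs=COEFFS):
    """Evaluate the truncated sum  sum_n a_n * exp(-2*pi*n*t)  at z = i*t."""
    return sum(a * math.exp(-2 * math.pi * n * t)
               for n, a in enumerate(coeffs, start=1))

def period_by_quadrature(coeffs=COEFFS, upper=5.0, steps=20000):
    """Midpoint-rule approximation of  -2*pi * integral_0^upper f(it) dt.

    The tail beyond t = upper is ignored; it is of size about exp(-2*pi*upper).
    """
    h = upper / steps
    total = sum(f_on_imaginary_axis((k + 0.5) * h, coeffs) for k in range(steps))
    return -2 * math.pi * h * total

def period_termwise(coeffs=COEFFS):
    """Closed form obtained by integrating term by term:  -sum_n a_n / n."""
    return -sum(a / n for n, a in enumerate(coeffs, start=1))

if __name__ == "__main__":
    print(period_by_quadrature(), period_termwise())
```

For a finite sum the interchange of sum and integral is automatic, so the two numbers agree up to quadrature error; the analytic content of the example lies in justifying the same interchange for the full series.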
## The Eichler-Shimura Isomorphism
What extra structure is preserved when we pass from modular forms to cohomology? The answer is the Hecke action. Since Hecke operators were constructed geometrically as correspondences on modular curves, they act on cohomology as well as on modular forms, and the comparison respects these actions.
[definition: Hecke Action on Cohomology]
Let $T_n$ be the Hecke correspondence on $X_0(N)$, written as a diagram
\begin{align*}
X_0(N) \xleftarrow{p_1} C_n \xrightarrow{p_2} X_0(N).
\end{align*}
The induced Hecke operator on Betti cohomology is the [linear map](/page/Linear%20Map)
\begin{align*}
T_n := (p_2)_*\,p_1^*: H^1_B(X_0(N),\mathbb C) \longrightarrow H^1_B(X_0(N),\mathbb C).
\end{align*}
[/definition]
This definition is deliberately geometric. It is the same correspondence that produced the usual Hecke operator on $q$-expansions, but expressed in a form that can act on homology and cohomology.
[quotetheorem:4752]
[citeproof:4752]
The theorem should be read as more than a complex vector-space identity. The compactness of $X_0(N)$ and the cuspidality of the forms are what keep Eisenstein boundary classes out of this statement; for non-cuspidal modular forms on the open curve, extra cohomology appears from the cusps. The Hecke-equivariance also depends on using the same pull-push normalisation on cohomology as in the geometric definition of the classical Hecke operators; changing the correspondence normalisation would change the eigenvalue comparison. What the theorem does not yet supply is a Galois representation: it supplies a rational Hecke module in Betti cohomology, and the later comparison with étale cohomology turns that Hecke module into arithmetic information.
To turn Hecke eigenvalues into Galois traces, the geometric Hecke correspondence must be compared with Frobenius after reduction modulo a good prime. The obstruction is that $T_p$ is defined by a modular correspondence, while Frobenius is an arithmetic endomorphism of the special fibre. The theorem below identifies these two kinds of operators inside étale cohomology, which is the first concrete bridge from Hecke action to Galois action.
[quotetheorem:4753]
[citeproof:4753]
This relation is the geometric prototype for the Galois-representation trace formula. The hypothesis $p\nmid N$ is the good-reduction hypothesis; at primes dividing $N$, the curve has bad reduction and the formula is replaced by local statements involving operators such as $U_p$, monodromy, and conductor data. The Frobenius convention also matters: switching between arithmetic and geometric Frobenius replaces the operator by its inverse, and so changes where the inverse appears in the displayed relation. With the stated convention, the identity explains why a weight $2$ eigenform should give a two-dimensional representation with trace $a_p(f)$ and determinant $p$ at primes away from $N$.
[example: The Level Eleven Hecke Module]
For the level $11$ newform
\begin{align*}
f(q)=q-2q^2-q^3+2q^4+q^5+\cdots,
\end{align*}
the coefficient of $q^n$ is the Hecke eigenvalue $a_n(f)$ because $f$ is normalized. Reading off the displayed coefficients gives
\begin{align*}
a_2(f)&=-2,\\
a_3(f)&=-1,\\
a_5(f)&=1.
\end{align*}
Thus
\begin{align*}
T_2 f=-2f,\qquad T_3 f=-f,\qquad T_5 f=f.
\end{align*}
Since $S_2(\Gamma_0(11))$ is one-dimensional, the holomorphic summand in the Eichler-Shimura decomposition is
\begin{align*}
S_2(\Gamma_0(11))=\mathbb C f.
\end{align*}
The [Hodge decomposition](/theorems/2745) for this genus $1$ modular curve gives
\begin{align*}
H^1_B(X_0(11),\mathbb C)
&\cong S_2(\Gamma_0(11))\oplus \overline{S_2(\Gamma_0(11))}\\
&=\mathbb C f\oplus \mathbb C\overline f,
\end{align*}
so
\begin{align*}
\dim_{\mathbb C}H^1_B(X_0(11),\mathbb C)
=\dim_{\mathbb C}\mathbb C f+\dim_{\mathbb C}\mathbb C\overline f
=1+1=2.
\end{align*}
On the conjugate summand, the Hecke eigenvalue is the complex conjugate of the eigenvalue on $f$. Therefore
\begin{align*}
T_p\overline f=\overline{a_p(f)}\,\overline f.
\end{align*}
For $p=2,3,5$, the numbers $-2,-1,1$ are real, so
\begin{align*}
\overline{a_2(f)}=-2,\qquad \overline{a_3(f)}=-1,\qquad \overline{a_5(f)}=1.
\end{align*}
Thus the two complex eigenlines in $H^1_B(X_0(11),\mathbb C)$ have the same displayed Hecke eigenvalues for these primes. The associated rational Hecke module is the first cohomology of the elliptic curve $X_0(11)$, and its complexification splits into the two conjugate eigenlines above.
[/example]
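The displayed coefficients in the example above can be reproduced by machine. The sketch below assumes the classical eta-product formula $f(q)=q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$ for the level $11$ newform, which is not stated in these notes and is used here as an outside fact; the function name `level11_newform_coeffs` is hypothetical. It expands the product as a truncated power series with integer arithmetic.

```python
def level11_newform_coeffs(num_terms):
    """Return [a_1, ..., a_num_terms] for q * prod_{n>=1} (1-q^n)^2 (1-q^(11n))^2.

    Assumption: this eta product equals the level 11 newform.
    """
    size = num_terms
    # series[k] is the coefficient of q^k in the product WITHOUT the leading q,
    # truncated at degree size - 1.
    series = [0] * size
    series[0] = 1

    def multiply_by_one_minus_q_power(m):
        # In-place multiplication by (1 - q^m); iterate from high degree to low
        # so that series[k - m] still holds the old value when it is used.
        for k in range(size - 1, m - 1, -1):
            series[k] -= series[k - m]

    for n in range(1, size):
        for _ in range(2):  # each factor appears squared
            multiply_by_one_minus_q_power(n)
            if 11 * n < size:
                multiply_by_one_minus_q_power(11 * n)

    # The leading q shifts indices: a_n is the coefficient of q^(n-1) in the product.
    return series

if __name__ == "__main__":
    print(level11_newform_coeffs(10))
```

Since the returned list is zero-indexed, `coeffs[n - 1]` is $a_n$. Multiplicativity of Hecke eigenvalues can then be observed directly, for instance $a_6=a_2a_3$ and $a_4=a_2^2-2$.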
## Modular Symbols in Computation
How does one compute the cohomology class attached to an eigenform in practice? Instead of triangulating the whole modular curve, Manin's theorem turns the problem into finite linear algebra on symbols. Hecke operators become explicit matrices, and eigenforms can be recovered by finding simultaneous eigenspaces.
[definition: Manin Symbol]
For $\Gamma_0(N)$, a Manin symbol is the class of an element $g\in SL_2(\mathbb Z)$ in the finite set $\Gamma_0(N)\backslash SL_2(\mathbb Z)$, interpreted as the modular symbol $\{g0,g\infty\}$.
[/definition]
The Manin relations encode the boundary and triangle relations in the standard tessellation. After imposing them, the resulting finite module maps onto the relative homology generated by modular symbols.
[example: Modular-Symbol Computation at Level Eleven]
At level $11$, the index formula for $\Gamma_0(N)$ gives
\begin{align*}
[\operatorname{SL}_2(\mathbb Z):\Gamma_0(11)]
&=11\left(1+\frac{1}{11}\right)\\
&=11\cdot \frac{12}{11}\\
&=12.
\end{align*}
Thus the Manin-symbol presentation starts with $12$ complex generators. Imposing the two-term and three-term Manin relations gives a finite presentation of relative homology, and passing to the cuspidal quotient kills the boundary classes. At this level the Manin relations have rank $9$, leaving a three-dimensional relative homology space, and the boundary classes account for one further dimension. The combined relation matrix therefore has rank $10$, so the cuspidal modular-symbol space has dimension
\begin{align*}
12-10=2.
\end{align*}
This agrees with the cohomological dimension: the modular curve $X_0(11)$ has genus $1$, hence
\begin{align*}
\dim_{\mathbb C}H^1_B(X_0(11),\mathbb C)=2g=2\cdot 1=2.
\end{align*}
For the normalized newform
\begin{align*}
f(q)=q-2q^2-q^3+2q^4+q^5+\cdots,
\end{align*}
the coefficient of $q^2$ is $a_2(f)=-2$, so $T_2f=-2f$. By *Eichler-Shimura Isomorphism*, the complexified cuspidal modular-symbol space splits as the eigenline generated by $f$ together with its conjugate eigenline. Since $-2$ is real,
\begin{align*}
T_2\overline f=\overline{-2}\,\overline f=-2\overline f.
\end{align*}
In the basis $(f,\overline f)$, the matrix of $T_2$ is therefore
\begin{align*}
\begin{pmatrix}
-2&0\\
0&-2
\end{pmatrix}.
\end{align*}
Its characteristic polynomial is
\begin{align*}
\det\left(
X I-
\begin{pmatrix}
-2&0\\
0&-2
\end{pmatrix}
\right)
&=
\det
\begin{pmatrix}
X+2&0\\
0&X+2
\end{pmatrix}\\
&=(X+2)(X+2)\\
&=(X+2)^2.
\end{align*}
Thus the finite Manin-symbol calculation recovers the same Hecke eigenvalue $a_2(f)=-2$ on both conjugate cohomology classes.
[/example]
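The count of $12$ Manin symbols at level $11$ can be checked by enumeration. The sketch below assumes the standard identification of $\Gamma_0(N)\backslash SL_2(\mathbb Z)$ with the projective line $\mathbb P^1(\mathbb Z/N\mathbb Z)$ via the bottom row $(c,d)$ of a matrix, which is not spelled out in these notes; the helper names `projective_line_mod` and `index_gamma0` are hypothetical. It lists one representative per class and compares the count with the index formula.

```python
from math import gcd

def projective_line_mod(N):
    """List one representative (c, d) for each point of P^1(Z/NZ).

    Pairs with gcd(c, d, N) = 1 are identified up to scaling by units mod N.
    """
    units = [u for u in range(1, N + 1) if gcd(u, N) == 1]
    seen = set()
    reps = []
    for c in range(N):
        for d in range(N):
            if gcd(gcd(c, d), N) != 1:
                continue
            # The smallest element of the unit-scaling orbit serves as a canonical label.
            rep = min(((u * c) % N, (u * d) % N) for u in units)
            if rep not in seen:
                seen.add(rep)
                reps.append(rep)
    return reps

def index_gamma0(N):
    """Index formula N * prod_{p | N} (1 + 1/p), computed with integer arithmetic."""
    index, m, p = N, N, 2
    while p * p <= m:
        if m % p == 0:
            index = index // p * (p + 1)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:  # one remaining prime factor
        index = index // m * (m + 1)
    return index

if __name__ == "__main__":
    print(len(projective_line_mod(11)), index_gamma0(11))
```

This only produces the generators. The two-term and three-term relations, and the Hecke matrices, would still have to be built on top of this list, which is what dedicated modular-symbols software does.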
This computation illustrates the practical content of Eichler-Shimura theory. A finite presentation of relative homology, together with Hecke matrices, detects the same eigenvalues that appear in Fourier expansions. Chapters 5 and 6 replace Betti cohomology by étale cohomology and Tate modules, but the Hecke eigenpacket is already visible here.
Eichler-Shimura explains why Hecke eigenvalues already appear in Betti cohomology and relative homology. What remains is to replace this topological picture by an arithmetic one, using $l$-adic Tate modules and étale cohomology to produce genuine Galois representations.
# 5. $\ell$-adic Tate Modules and Galois Representations
The previous chapters built the geometric and Hecke-theoretic side of modular forms: modular curves, their Jacobians, and the Hecke algebra acting by correspondences. This chapter turns those geometric objects into linear representations of absolute Galois groups. The main mechanism is the $\ell$-adic Tate module, which records all $\ell$-power torsion points of an elliptic curve or abelian variety at once.
The guiding question is: how much arithmetic information is visible in the action of $G_K = \operatorname{Gal}(\overline{K}/K)$ on torsion points? For elliptic curves this action gives two-dimensional $l$-adic representations, and for Hecke-stable factors of modular Jacobians it is the bridge from Hecke eigenvalues to Galois representations.
The chapter begins with torsion points because they are the most concrete place where Galois groups act on geometry. It then passes from finite torsion layers to inverse limits, from inverse limits to continuous $l$-adic representations, and from general abelian varieties to modular Jacobians. The point of the progression is to make the later modularity dictionary precise: Hecke eigenvalues will be recovered as Frobenius traces on two-dimensional pieces of Tate modules.
## Torsion Points and Tate Modules
A first attempt to study an elliptic curve $E/K$ by Galois theory is to look at the finite group $E[l](\overline{K})$ of $l$-torsion points. This gives a representation over $\mathbb{F}_l$, but for modular forms we need a characteristic-zero object whose traces can equal Hecke eigenvalues. The Tate module is the inverse limit that remembers all compatible $l^n$-torsion data.
[definition: Elliptic Curve Tate Module]
Let $K$ be a field, let $l$ be a prime distinct from $\operatorname{char}(K)$, and let $E/K$ be an elliptic curve. The $l$-adic Tate module of $E$ is
\begin{align*}
T_lE := \varprojlim_n E[l^n](\overline{K}),
\end{align*}
where the transition maps are multiplication by $l$.
[/definition]
Thus an element of $T_lE$ is a compatible system $(P_n)_{n\ge 1}$ with $P_n \in E[l^n](\overline{K})$ and $lP_{n+1}=P_n$. Since $l\ne \operatorname{char}(K)$, the group $E[l^n](\overline{K})$ is isomorphic to $(\mathbb{Z}/l^n\mathbb{Z})^2$, so $T_lE$ is a free $\mathbb{Z}_l$-module of rank $2$.
[example: Tate Module of a Complex Elliptic Curve]
Let $E(\mathbb{C})\cong \mathbb{C}/\Lambda$, where $\Lambda$ is a rank-$2$ lattice. A point of $E(\mathbb{C})$ is a class $z+\Lambda$, and it is killed by $l^n$ exactly when
\begin{align*}
l^n(z+\Lambda)=0+\Lambda
\quad\Longleftrightarrow\quad
l^nz\in \Lambda
\quad\Longleftrightarrow\quad
z\in l^{-n}\Lambda.
\end{align*}
Hence
\begin{align*}
E[l^n](\mathbb{C})=l^{-n}\Lambda/\Lambda.
\end{align*}
Choose a $\mathbb{Z}$-basis $\omega_1,\omega_2$ of $\Lambda$. Then every class in $l^{-n}\Lambda/\Lambda$ has the form
\begin{align*}
\frac{a_1}{l^n}\omega_1+\frac{a_2}{l^n}\omega_2+\Lambda,
\end{align*}
with $a_1,a_2\in \mathbb{Z}$, and changing $a_i$ by a multiple of $l^n$ does not change the class. Thus the map
\begin{align*}
(\mathbb{Z}/l^n\mathbb{Z})^2 &\longrightarrow l^{-n}\Lambda/\Lambda,\\
(a_1,a_2)&\longmapsto \frac{a_1}{l^n}\omega_1+\frac{a_2}{l^n}\omega_2+\Lambda
\end{align*}
is an isomorphism.
Under this identification, multiplication by $l$ sends
\begin{align*}
\frac{b_1}{l^{n+1}}\omega_1+\frac{b_2}{l^{n+1}}\omega_2+\Lambda
\end{align*}
to
\begin{align*}
\frac{b_1}{l^n}\omega_1+\frac{b_2}{l^n}\omega_2+\Lambda.
\end{align*}
Equivalently, the transition map
\begin{align*}
(\mathbb{Z}/l^{n+1}\mathbb{Z})^2\longrightarrow(\mathbb{Z}/l^n\mathbb{Z})^2
\end{align*}
is reduction modulo $l^n$ in each coordinate. Therefore
\begin{align*}
T_lE
&=\varprojlim_n E[l^n](\mathbb{C})\\
&\cong \varprojlim_n (\mathbb{Z}/l^n\mathbb{Z})^2\\
&\cong \left(\varprojlim_n \mathbb{Z}/l^n\mathbb{Z}\right)^2\\
&=\mathbb{Z}_l^2.
\end{align*}
With the chosen basis of $\Lambda$, the map
\begin{align*}
\Lambda\otimes_{\mathbb{Z}}\mathbb{Z}_l&\longrightarrow T_lE,\\
(a_1\omega_1+a_2\omega_2)\otimes u&\longmapsto (ua_1,ua_2)\in \mathbb{Z}_l^2
\end{align*}
identifies $\Lambda\otimes_{\mathbb{Z}}\mathbb{Z}_l$ with $T_lE$. Thus the Tate module is the $l$-adic completion of the lattice $\Lambda$, not merely one finite torsion layer.
[/example]
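The compatibility condition $lP_{n+1}=P_n$ in the example above can be made concrete with exact rational arithmetic. The sketch below is an illustration under simplifying assumptions: points of $l^{-n}\Lambda/\Lambda$ are stored as coordinate pairs modulo $1$ in the basis $\omega_1,\omega_2$, and an ordinary pair of integers `u` stands in for an element of $\mathbb Z_l^2$. All function names are hypothetical.

```python
from fractions import Fraction

def torsion_point(b, l, n):
    """Coordinates, modulo 1, of (b1/l^n)*w1 + (b2/l^n)*w2 + Lambda in the basis (w1, w2)."""
    return tuple(Fraction(bi, l ** n) % 1 for bi in b)

def multiply_by_l(point, l):
    """Multiplication by l on C/Lambda, written in the same coordinates modulo 1."""
    return tuple((l * x) % 1 for x in point)

def compatible_system(u, l, depth):
    """Truncations (P_1, ..., P_depth) of u = (u1, u2), where P_n is killed by l^n.

    Assumption: u is a pair of ordinary integers standing in for l-adic integers.
    """
    return [
        torsion_point((u[0] % l ** n, u[1] % l ** n), l, n)
        for n in range(1, depth + 1)
    ]

if __name__ == "__main__":
    for point in compatible_system((17, 5), 3, 4):
        print(point)
```

Each entry of the list is obtained from the next one by multiplication by $l$, which is exactly the statement that the transition map is reduction modulo $l^n$ in each coordinate.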
The same construction works for higher-dimensional abelian varieties, and this is the form needed for Jacobians of modular curves. A single torsion layer cannot distinguish phenomena that disappear after reduction modulo $l$, while the full inverse system retains congruence information at every $l$-power level. For modular Jacobians this extra information is essential, because Hecke eigenspaces are usually isolated only after passing to a characteristic-zero coefficient field.
For Jacobians and their quotients, the relevant torsion groups are no longer two-generated as in the elliptic curve case; their size must reflect the dimension of the abelian variety. The next definition sets up the inverse-limit object whose rank and Galois action will later be cut into eigenform pieces by Hecke operators.
[definition: Abelian Variety Tate Module]
Let $K$ be a field, let $l$ be a prime distinct from $\operatorname{char}(K)$, and let $A/K$ be an abelian variety of dimension $g$. The $l$-adic Tate module of $A$ is
\begin{align*}
T_lA := \varprojlim_n A[l^n](\overline{K}),
\end{align*}
with transition maps given by multiplication by $l$.
[/definition]
For an abelian variety of dimension $g$, tensoring the Tate module with $\mathbb{Q}_l$ gives the rational Tate module
\begin{align*}
V_lA := T_lA \otimes_{\mathbb{Z}_l} \mathbb{Q}_l,
\end{align*}
which is often the natural coefficient space for semisimple representation-theoretic statements.
Before using $T_lA$ as a representation space, one must know that the inverse limit has the expected size and no hidden torsion defects. This is not automatic from the definition: it depends on the structure of prime-to-characteristic torsion on an abelian variety.
The needed structural input is a rank statement: it must identify the Tate module as a free $\mathbb Z_l$-module of the expected rank. Without that theorem, the later Galois representation would not have a well-defined dimension to compare with modular forms.
[quotetheorem:4754]
[citeproof:4754]
This theorem explains why elliptic curves give two-dimensional Galois representations while Jacobians of modular curves give much larger representations that must later be decomposed using Hecke operators. The hypothesis $l\ne \operatorname{char}(K)$ is necessary here: in characteristic $l$, the $l$-power torsion group scheme can have connected parts and the geometric points need not have full rank. The theorem is therefore the reason the later construction always separates the auxiliary prime $l$ from primes of reduction. It also tells us exactly how large a representation must be before Hecke operators are used to cut out the pieces attached to eigenforms.
## Galois Action on Tate Modules
The torsion points of an elliptic curve or abelian variety are defined over $\overline{K}$, so the absolute Galois group acts on them. The question is whether this action respects the inverse limit structure strongly enough to produce a continuous $l$-adic representation.
[definition: Galois Representation on a Tate Module]
Let $A/K$ be an abelian variety, let $l\ne \operatorname{char}(K)$, and let $G_K=\operatorname{Gal}(\overline{K}/K)$. The $l$-adic Galois representation attached to $A$ is the continuous homomorphism
\begin{align*}
\rho_{A,l}:G_K \longrightarrow \operatorname{Aut}_{\mathbb{Z}_l}(T_lA)
\end{align*}
induced by the action of $G_K$ on $A[l^n](\overline{K})$ for all $n\ge 1$.
[/definition]
After choosing a $\mathbb{Z}_l$-basis of $T_lA$, this becomes a matrix representation
\begin{align*}
\rho_{A,l}:G_K \longrightarrow GL_{2g}(\mathbb{Z}_l).
\end{align*}
For an elliptic curve $E/K$, the representation has dimension $2$ over $\mathbb{Q}_l$ after passing to $V_lE$.
[example: The Weil Pairing and the Determinant]
Let $E/K$ be an elliptic curve and suppose $l\ne \operatorname{char}(K)$. Fix $n\ge 1$, and choose a basis $P,Q$ of $E[l^n](\overline K)$ such that the Weil pairing value $e_{l^n}(P,Q)$ is a primitive $l^n$th root of unity. For $\sigma\in G_K$, write the action on this basis as
\begin{align*}
\sigma(P)&=aP+cQ,\\
\sigma(Q)&=bP+dQ,
\end{align*}
with $a,b,c,d\in \mathbb{Z}/l^n\mathbb{Z}$. By bilinearity and alternatingness of the Weil pairing,
\begin{align*}
e_{l^n}(\sigma P,\sigma Q)
&=e_{l^n}(aP+cQ,bP+dQ)\\
&=e_{l^n}(P,P)^{ab}e_{l^n}(P,Q)^{ad}e_{l^n}(Q,P)^{cb}e_{l^n}(Q,Q)^{cd}\\
&=1^{ab}e_{l^n}(P,Q)^{ad}e_{l^n}(P,Q)^{-cb}1^{cd}\\
&=e_{l^n}(P,Q)^{ad-bc}.
\end{align*}
The exponent $ad-bc$ is exactly the determinant of the matrix of $\sigma$ on $E[l^n]$ in the basis $P,Q$.
On the other hand, Galois equivariance of the Weil pairing gives
\begin{align*}
e_{l^n}(\sigma P,\sigma Q)
=\sigma(e_{l^n}(P,Q)).
\end{align*}
If $\chi_{l^n}:G_K\to(\mathbb{Z}/l^n\mathbb{Z})^\times$ is the mod-$l^n$ cyclotomic character, then
\begin{align*}
\sigma(e_{l^n}(P,Q))=e_{l^n}(P,Q)^{\chi_{l^n}(\sigma)}.
\end{align*}
Since $e_{l^n}(P,Q)$ is primitive, equality of the two powers implies
\begin{align*}
\det(\rho_{E,l}(\sigma)\bmod l^n)\equiv \chi_{l^n}(\sigma)\pmod{l^n}.
\end{align*}
These congruences hold for every $n$, so in the inverse limit
\begin{align*}
\det(\rho_{E,l}(\sigma))=\chi_l(\sigma)\in \mathbb{Z}_l^\times.
\end{align*}
Thus the Weil pairing forces the determinant of the Tate-module representation to be exactly the $l$-adic cyclotomic character.
[/example]
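The bilinear-algebra step in the example above can be checked exhaustively for a small modulus. The sketch below does not compute an actual Weil pairing. It assumes only what the example uses, namely that the pairing is bilinear and alternating, so that in the basis $P,Q$ it is determined on the level of exponents by $e(x_1P+y_1Q,\;x_2P+y_2Q)=\zeta^{x_1y_2-x_2y_1}$ with $\zeta=e(P,Q)$. The function names are hypothetical.

```python
from itertools import product

def pairing_exponent(u, v, modulus):
    """Exponent of zeta = e(P, Q) in e(u, v), for u = x1*P + y1*Q and v = x2*P + y2*Q."""
    (x1, y1), (x2, y2) = u, v
    return (x1 * y2 - x2 * y1) % modulus

def images_of_basis(a, b, c, d):
    """Coordinates of sigma(P) = a*P + c*Q and sigma(Q) = b*P + d*Q."""
    return (a, c), (b, d)

def check_determinant_identity(modulus):
    """Verify e(sigma P, sigma Q) = e(P, Q)^(ad - bc) for every 2x2 matrix mod `modulus`."""
    for a, b, c, d in product(range(modulus), repeat=4):
        sigma_P, sigma_Q = images_of_basis(a, b, c, d)
        if pairing_exponent(sigma_P, sigma_Q, modulus) != (a * d - b * c) % modulus:
            return False
    return True

if __name__ == "__main__":
    print(check_determinant_identity(9))  # modulus l^n with l = 3, n = 2
```

The arithmetic input of the example, Galois equivariance of the pairing, is not modelled here; the check covers only the identification of the exponent with the determinant.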
For modular forms of weight $2$, the determinant condition is the first sign that the representation has the expected shape: the determinant of Frobenius at an unramified prime $p$ should be $p$, matching the constant term of the Hecke polynomial $X^2-a_pX+p$.
## Frobenius and Good Reduction
A central test for an $l$-adic representation is what it does at primes away from $l$. Without a good local model, the Galois action can contain extra inertia information coming from singular reduction rather than from the clean arithmetic of point-counting over finite fields. Good reduction is the condition that removes this obstruction: if $A/K$ has good reduction at a finite prime $\mathfrak{p}$, then inertia should act without effect on $T_lA$, and the arithmetic is carried by a Frobenius element.
[definition: Unramified Representation]
Let $K$ be a number field, let $\mathfrak{p}$ be a finite prime of $K$, and let $l$ be a rational prime with $\mathfrak{p}\nmid l$. A continuous representation $\rho:G_K\to GL_n(\mathbb{Q}_l)$ is unramified at $\mathfrak{p}$ if the inertia subgroup $I_{\mathfrak{p}}\subset G_K$ acts as the identity under $\rho$.
[/definition]
When $\rho$ is unramified at $\mathfrak{p}$, the conjugacy class of Frobenius has a well-defined characteristic polynomial under $\rho$. In this chapter $\operatorname{Frob}_{\mathfrak{p}}$ denotes arithmetic Frobenius, acting on the residue field by $x\mapsto x^{N\mathfrak{p}}$.
The key obstruction is ramification: if inertia acts nontrivially, Frobenius alone cannot summarize the local Galois action. For abelian varieties with good reduction away from $l$, the expected clean behavior is that the Tate module representation is unramified, so its local information is carried by Frobenius. This is the structural input needed before Frobenius polynomials can be compared with Euler factors.
[quotetheorem:4755]
[citeproof:4755]
This theorem is the local input behind the Euler factors of Hasse-Weil $L$-functions and, later, the Euler factors of modular forms. Its necessity is that Frobenius is only a clean conjugacy class after inertia has been killed; otherwise the local representation contains ramification data that cannot be summarized by a single unramified characteristic polynomial. The restriction $\mathfrak{p}\nmid l$ is also essential, since the $l$-adic Tate module has subtler $p$-adic Hodge-theoretic behaviour at primes above $l$. In the modular-form comparison, this theorem is what permits the simple Euler factor at primes away from both the level and $l$.
The next computation specializes this unramified Frobenius action to elliptic curves with good reduction. It turns the abstract Frobenius conjugacy class into the concrete polynomial whose trace is counted by points over $\mathbb F_p$.
[quotetheorem:4756]
[citeproof:4756]
This theorem is the point at which the abstract Galois action becomes computational. The good-reduction hypothesis is necessary because the formula uses a smooth reduction and the Lefschetz trace formula over $\mathbb{F}_p$; at bad primes the local factor has a different shape and inertia contributes. The auxiliary prime $l$ disappears from the answer, which is a first indication that the compatible system is intrinsic to $E$ rather than to the chosen Tate module. The coefficient $a_p$ is the same integer that appears in the local $L$-factor
\begin{align*}
L_p(E,s)^{-1}=1-a_pp^{-s}+p^{1-2s}.
\end{align*}
This is the template for matching Frobenius traces with Hecke eigenvalues.
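The local factor above also determines the coefficients at prime powers. As a hedged illustration, the sketch below expands $1/(1-a_pX+pX^2)$ as a power series in $X=p^{-s}$; the coefficient of $X^k$ is then the coefficient of $p^{-ks}$ in the Euler factor, and it satisfies the recursion $c_{k+1}=a_pc_k-pc_{k-1}$. The function name `prime_power_coefficients` is hypothetical.

```python
def prime_power_coefficients(a_p, p, max_exponent):
    """Coefficients c_0, ..., c_max_exponent of 1 / (1 - a_p*X + p*X^2) as a series in X."""
    coeffs = [1, a_p]
    for _ in range(2, max_exponent + 1):
        # Comparing coefficients in (1 - a_p*X + p*X^2) * sum c_k X^k = 1
        # gives c_{k+1} = a_p * c_k - p * c_{k-1}.
        coeffs.append(a_p * coeffs[-1] - p * coeffs[-2])
    return coeffs[: max_exponent + 1]

if __name__ == "__main__":
    # a_2 = -2 is the value read off from the level 11 newform earlier in these notes.
    print(prime_power_coefficients(-2, 2, 4))
```

With $a_2=-2$ the coefficient of $X^2$ is $2$, which matches the coefficient of $q^4$ in the level $11$ expansion $q-2q^2-q^3+2q^4+q^5+\cdots$ displayed earlier.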
[example: Counting Points for a Frobenius Trace]
Consider $E:y^2+y=x^3-x^2$ over $\mathbb{Q}$. To check that $E$ has good reduction at $5$, compute the discriminant of this model. The Weierstrass coefficients are $a_1=0$, $a_2=-1$, $a_3=1$, $a_4=0$, and $a_6=0$, so
\begin{align*}
b_2&=a_1^2+4a_2=-4,\\
b_4&=2a_4+a_1a_3=0,\\
b_6&=a_3^2+4a_6=1,\\
b_8&=a_1^2a_6+4a_2a_6-a_1a_3a_4+a_2a_3^2-a_4^2=-1.
\end{align*}
Hence
\begin{align*}
\Delta
&=-b_2^2b_8-8b_4^3-27b_6^2+9b_2b_4b_6\\
&=-(-4)^2(-1)-8\cdot 0^3-27\cdot 1^2+9(-4)\cdot 0\cdot 1\\
&=16-27\\
&=-11.
\end{align*}
Since $-11\equiv 4\not\equiv 0\pmod 5$, the reduction is smooth over $\mathbb{F}_5$.
Now count the $\mathbb{F}_5$-points. The left-hand side takes the values
\begin{align*}
y^2+y
&=0 &&\text{for } y=0,\\
&=2 &&\text{for } y=1,\\
&=1 &&\text{for } y=2,\\
&=2 &&\text{for } y=3,\\
&=0 &&\text{for } y=4.
\end{align*}
The right-hand side $x^3-x^2$ takes the values
\begin{align*}
0^3-0^2&=0,\\
1^3-1^2&=0,\\
2^3-2^2&=8-4=4\equiv 4\pmod 5,\\
3^3-3^2&=27-9=18\equiv 3\pmod 5,\\
4^3-4^2&=64-16=48\equiv 3\pmod 5.
\end{align*}
Thus $x=0$ gives the two solutions $y=0,4$, $x=1$ gives the two solutions $y=0,4$, and $x=2,3,4$ give no solutions. There are therefore $4$ affine points and one point at infinity, so
\begin{align*}
|E(\mathbb{F}_5)|=4+1=5.
\end{align*}
By the Frobenius trace formula for elliptic curves,
\begin{align*}
a_5=5+1-|E(\mathbb{F}_5)|=5+1-5=1.
\end{align*}
Therefore the Frobenius polynomial at $5$ is
\begin{align*}
1-a_5X+5X^2=1-X+5X^2.
\end{align*}
This point count turns the abstract trace $a_5$ into an explicit integer by reducing the curve and counting its smooth special fibre.
[/example]
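The hand count in the example above is easy to automate for small primes. The following brute-force sketch counts solutions of $y^2+y=x^3-x^2$ over $\mathbb F_p$ and adds the point at infinity; the function names are hypothetical, and the quadratic running time makes it suitable only for small $p$. It presupposes good reduction at $p$, which for this model means $p\ne 11$.

```python
def count_points(p):
    """Number of F_p-points on y^2 + y = x^3 - x^2, including the point at infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - (x ** 3 - x * x)) % p == 0
    )
    return affine + 1

def frobenius_trace(p):
    """a_p = p + 1 - |E(F_p)|, assuming E has good reduction at p."""
    return p + 1 - count_points(p)

if __name__ == "__main__":
    for p in (2, 3, 5, 7, 13):
        print(p, count_points(p), frobenius_trace(p))
```

The values at $p=2,3,5$ agree with the coefficients $-2,-1,1$ of the level $11$ newform displayed earlier, which is the numerical shadow of the comparison between Frobenius traces and Hecke eigenvalues.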
Point counts show what Frobenius looks like once a smooth reduction is already available, but they do not explain how to recognize good reduction from the representation itself. The obstruction is that a bad integral model can introduce inertia into the Tate module action, and this ramification is invisible if one looks only at a single reduced equation. The Néron-Ogg-Shafarevich criterion supplies the converse test: good reduction is exactly the condition detected by unramifiedness of the Tate module.
[quotetheorem:4757]
[citeproof:4757]
This result is used here as a quoted structural theorem from the theory of Néron models. Its role is conceptual: good reduction is not merely a property of equations, but is encoded in the ramification behaviour of the Tate module. The criterion is also a warning that unramifiedness is a strong condition: it detects the existence of a smooth integral model, not just the absence of visible singularities in one chosen equation. The independence of the auxiliary prime $l$ is especially important for modularity, because it lets local reduction properties be read uniformly across the compatible family of $l$-adic representations.
## Modular Jacobians and Hecke-Stable Quotients
The representations attached to modular forms do not appear from a single torsion point. They arise by cutting the large Tate module of a modular Jacobian using Hecke operators. Without such a cutting operation, $T_lJ_0(N)$ contains the contributions of many eigenforms at once, so its Frobenius traces are too large and too mixed to represent one modular form. The problem is to isolate a two-dimensional piece from $T_lJ_0(N)$ whose Frobenius traces are the Hecke eigenvalues of a chosen newform.
Let $J_0(N)$ be the Jacobian of $X_0(N)$ over $\mathbb{Q}$, and let $\mathbb{T}$ be the Hecke algebra generated by the correspondences $T_n$ acting on $J_0(N)$. The action of $\mathbb{T}$ commutes with the action of $G_{\mathbb{Q}}$, so Hecke-stable quotients of $J_0(N)$ produce Galois-stable quotients of Tate modules.
[definition: Hecke Stable Quotient]
Let $A$ be an abelian variety over $\mathbb{Q}$ equipped with an action homomorphism
\begin{align*}
\mathbb{T} \longrightarrow \operatorname{End}_{\mathbb{Q}}(A).
\end{align*}
A quotient abelian variety $\pi:A\to B$ is Hecke-stable if there is an action homomorphism
\begin{align*}
\mathbb{T} \longrightarrow \operatorname{End}_{\mathbb{Q}}(B)
\end{align*}
such that $\pi\circ T = T\circ \pi$ for every $T\in\mathbb{T}$.
[/definition]
For a normalized weight-$2$ newform $f=\sum_{n\ge 1}a_nq^n$ on $\Gamma_0(N)$, the Eichler-Shimura construction gives a quotient $A_f$ of $J_0(N)$ on which each $T_n$ acts through the eigenvalue $a_n$, after allowing coefficients in the Hecke field $K_f=\mathbb{Q}(a_n:n\ge 1)$. The remaining issue is to say how large this quotient is and in what sense it is determined by $f$. Since conjugates of $f$ have conjugate eigenvalue systems, the quotient cannot usually be just one elliptic curve over $\mathbb Q$. The theorem identifies the abelian variety cut out by the full Galois orbit of the eigensystem, which is the geometric object whose Tate module will realize the representation.
[quotetheorem:4758]
[citeproof:4758]
The dimension of $A_f$ is $[K_f:\mathbb{Q}]$. This matters because the Tate module of $A_f$ is not usually two-dimensional over $\mathbb{Q}_l$; it becomes a collection of two-dimensional representations only after using the coefficient field. The quotient is canonical only up to isogeny, so all later statements must be formulated in terms of rational Tate modules or semisimple representations. When $f$ has rational Hecke eigenvalues, the quotient $A_f$ is an elliptic curve over $\mathbb{Q}$, which is the special case where the construction is visible without decomposing by a larger coefficient field.
We now spell out the coefficient-field notation used in this extraction. The Hecke action on $A_f$ gives an embedding of $K_f$ into the rational endomorphism algebra
\begin{align*}
\operatorname{End}_{\mathbb Q}(A_f)\otimes_{\mathbb Z}\mathbb Q,
\end{align*}
where $\operatorname{End}_{\mathbb Q}(A_f)$ means endomorphisms of $A_f$ defined over $\mathbb Q$. The rational Tate module is
\begin{align*}
V_lA_f:=T_lA_f\otimes_{\mathbb Z_l}\mathbb Q_l.
\end{align*}
After scalar extension from $\mathbb Q$ to $\mathbb Q_l$, the coefficient algebra decomposes as
\begin{align*}
K_f\otimes_{\mathbb Q}\mathbb Q_l\cong \prod_{\lambda' \mid l}K_{f,\lambda'},
\end{align*}
where $K_{f,\lambda'}$ is the completion of $K_f$ at the prime $\lambda'$ above $l$. The factor indexed by a chosen $\lambda\mid l$ cuts out the two-dimensional $K_{f,\lambda}$-vector space denoted $V_\lambda A_f$. This is the precise meaning of the "$K_{f,\lambda}$-linear component" on which the representation in the following definition acts.
[definition: The $\ell$-adic Representation Attached to a Newform of Weight Two]
Let $f\in S_2(\Gamma_0(N))$ be a normalized newform with Hecke field $K_f$, let $\lambda$ be a prime of $K_f$ above $l$, and let $A_f$ be the Eichler-Shimura quotient. The $\lambda$-adic representation attached to $f$ is the two-dimensional representation
\begin{align*}
\rho_{f,\lambda}:G_{\mathbb{Q}}\longrightarrow GL_2(K_{f,\lambda})
\end{align*}
obtained from the natural action of $G_{\mathbb Q}$ on $V_\lambda A_f$.
[/definition]
This definition hides a decomposition: $V_lA_f$ has dimension $2[K_f:\mathbb{Q}]$ over $\mathbb{Q}_l$, and the action of $K_f\otimes_{\mathbb{Q}}\mathbb{Q}_l$ splits it into pieces $V_\lambda A_f$ indexed by the primes $\lambda\mid l$, each two-dimensional over its own field $K_{f,\lambda}$. After extracting one such piece, the essential test is whether it remembers the Hecke eigenvalues of $f$.
The obstruction is that the construction passes through the geometry of a quotient abelian variety, while the form $f$ is specified by its Fourier coefficients. To justify calling the resulting Galois representation the one attached to $f$, one needs a comparison at primes where both sides have clean unramified local factors. The trace and determinant formulas at primes $p\nmid Nl$ provide that comparison.
[quotetheorem:4759]
[citeproof:4759]
This is the first precise form of the slogan that Hecke eigenvalues are Frobenius traces. The restrictions $p\nmid Nl$ are part of the content, not a technical afterthought: $p\nmid N$ gives good reduction of the modular curve and $p\ne l$ keeps the Tate module in the prime-to-$p$ range. The theorem does not yet describe the local representation at primes dividing $N$ or at $l$, where ramification and $l$-adic Hodge theory enter. Its role is to identify the unramified local factors, which are the factors controlled directly by the Hecke operators $T_p$.
## Isogenies and Semisimplicity
The construction of $A_f$ is only canonical up to isogeny, so the attached representation must be insensitive to replacing an abelian variety by an isogenous one. Otherwise the representation attached to a newform would depend on arbitrary choices made while cutting out the quotient of $J_0(N)$. More deeply, the Galois representation should determine the abelian variety up to isogeny; without such a rigidity theorem, the dictionary from geometry to representations would lose information in an uncontrolled way. Faltings' theorem supplies this rigidity.
[quotetheorem:4760]
[citeproof:4760]
This theorem is used here as a quoted major result in arithmetic geometry. In this course it justifies treating the $l$-adic representation as a faithful isogeny-level invariant of an abelian variety. The statement is stronger than merely saying that isogenous varieties have isomorphic Tate modules: it identifies all Galois-equivariant homomorphisms of Tate modules with genuine algebraic homomorphisms after $l$-adic completion. Its limitation is equally important, since it works at the isogeny level and therefore does not distinguish integral models inside the same isogeny class. This is exactly the level of precision needed for modular forms, where the Eichler-Shimura quotient is defined up to isogeny.
[example: The Representation Attached to X0(11)]
For $X_0(11)$ the genus is $1$, so $J_0(11)=X_0(11)$ as an elliptic curve after choosing a rational base point. Use the model
\begin{align*}
E:\quad y^2+y=x^3-x^2-10x-20.
\end{align*}
Here $a_1=0$, $a_2=-1$, $a_3=1$, $a_4=-10$, and $a_6=-20$, so
\begin{align*}
b_2&=a_1^2+4a_2=0+4(-1)=-4,\\
b_4&=2a_4+a_1a_3=2(-10)+0=-20,\\
b_6&=a_3^2+4a_6=1+4(-20)=-79,\\
b_8&=a_1^2a_6+4a_2a_6-a_1a_3a_4+a_2a_3^2-a_4^2\\
&=0+4(-1)(-20)-0+(-1)(1)^2-(-10)^2\\
&=80-1-100=-21.
\end{align*}
Thus
\begin{align*}
\Delta
&=-b_2^2b_8-8b_4^3-27b_6^2+9b_2b_4b_6\\
&=-(-4)^2(-21)-8(-20)^3-27(-79)^2+9(-4)(-20)(-79)\\
&=336+64000-27\cdot 6241-56880\\
&=64336-168507-56880\\
&=-161051\\
&=-11^5.
\end{align*}
Therefore $E$ has good reduction at every prime $p\ne 11$.
The associated weight-$2$ newform has rational Fourier coefficients, so its Hecke field is $\mathbb{Q}$ and the Eichler-Shimura quotient is the elliptic curve $A_f=E$. Since $T_lE$ is a free $\mathbb{Z}_l$-module of rank $2$, the Tate-module action gives
\begin{align*}
\rho_{E,l}:G_{\mathbb{Q}}\longrightarrow GL_2(\mathbb{Z}_l).
\end{align*}
If $p\nmid 11l$, then $p\ne 11$, so $E$ has good reduction at $p$, and $p\ne l$, so the prime-to-$p$ Tate module is in the good-reduction range. Hence $\rho_{E,l}$ is unramified at $p$, and the Frobenius trace formula gives
\begin{align*}
\det(1-X\rho_{E,l}(\operatorname{Frob}_p))
=1-a_pX+pX^2,
\end{align*}
where
\begin{align*}
a_p=p+1-|E(\mathbb{F}_p)|.
\end{align*}
On the modular side, the same $a_p$ is the $p$th Hecke eigenvalue of the newform attached to $X_0(11)$. Thus in this genus-$1$ case the two-dimensional Galois representation attached to the newform is exactly the Tate-module representation of the elliptic curve $X_0(11)$.
[/example]
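The arithmetic in the $X_0(11)$ example can be checked by brute force. The following Python sketch recomputes the discriminant of the model above, counts points over a few small prime fields, and compares the resulting $a_p$ with the coefficients of $q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$. That eta-product formula for the level $11$ newform is a classical fact quoted here as an assumption, since it is not derived in these notes. All function names are illustrative.

```python
# Brute-force check of the X_0(11) example; a sketch, not an efficient algorithm.

def discriminant(a1, a2, a3, a4, a6):
    """Discriminant of y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6


def count_points(coeffs, p):
    """Number of points over F_p, including the point at infinity."""
    a1, a2, a3, a4, a6 = coeffs
    affine = 0
    for x in range(p):
        rhs = x ** 3 + a2 * x * x + a4 * x + a6
        for y in range(p):
            if (y * y + a1 * x * y + a3 * y - rhs) % p == 0:
                affine += 1
    return affine + 1


def eta_product_coefficients(bound):
    """Coefficients a_1..a_bound of q * prod (1 - q^n)^2 (1 - q^(11 n))^2."""
    series = [0] * bound  # series[i] is the coefficient of q^i in the product
    series[0] = 1
    for n in range(1, bound):
        for step in (n, 11 * n):
            if step >= bound:
                continue
            for _ in range(2):  # each factor appears squared
                # Multiply in place by (1 - q^step), from the top degree down.
                for i in range(bound - 1, step - 1, -1):
                    series[i] -= series[i - step]
    return {n: series[n - 1] for n in range(1, bound + 1)}


E = (0, -1, 1, -10, -20)
delta = discriminant(*E)
a_geom = {p: p + 1 - count_points(E, p) for p in (2, 3, 5, 7, 13, 17, 19)}
a_mod = eta_product_coefficients(20)

print("discriminant:", delta)
for p in a_geom:
    print(p, a_geom[p], a_mod[p])
```

The two columns printed for each good prime should agree, which is the numerical shadow of the statement that the Frobenius traces of $\rho_{E,l}$ are the Hecke eigenvalues of the newform.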
The chapter has built the basic dictionary: abelian varieties give Tate modules, Tate modules give continuous Galois representations, good reduction gives unramifiedness, and modular Jacobians with Hecke-stable quotients give the two-dimensional representations attached to weight-$2$ eigenforms. The next step is to formulate the construction for general weights and to understand the precise local conditions at primes dividing the level and at $l$ itself.
# 6. Galois Representations Attached to Weight 2 Newforms
This chapter turns the geometric Hecke theory of modular curves into two-dimensional Galois representations. The guiding question is: how can the numbers $a_p(f)$ occurring in the $q$-expansion of a newform be recognised as traces of Frobenius elements in an actual representation of $\operatorname{Gal}(\overline{\mathbb Q}/\mathbb Q)$? For weight $2$, the answer is especially concrete because weight $2$ modular forms occur in the first cohomology of modular curves, where Hecke correspondences and Galois actions coexist.
The construction has three moving parts. First, a newform determines a number field generated by its Hecke eigenvalues. Second, each prime ideal above a rational prime $l$ gives a local coefficient field in which an $l$-adic representation can live. Third, the Eichler-Shimura relation identifies Frobenius acting on cohomology with the Hecke operator $T_p$, forcing the characteristic polynomial of Frobenius to have coefficients determined by the modular form.
## Newforms, Coefficient Fields, and Primes Above $l$
The first problem is that a Hecke eigenform need not have rational Fourier coefficients. If its eigenvalues lie in a larger number field, then any Galois representation attached to it must remember not only the form but also a choice of completion of that number field.
[definition: Weight Two Newform]
Let $N \ge 1$ and let $\varepsilon: (\mathbb Z/N\mathbb Z)^\times \to \mathbb C^\times$ be a Dirichlet character. A weight two newform of level $N$ and nebentypus $\varepsilon$ is a normalized cuspidal Hecke eigenform
$ f \in S_2(\Gamma_1(N), \varepsilon) $
which lies in the new subspace and has Fourier expansion
\begin{align*}
f(q) = \sum_{n=1}^{\infty} a_n(f)q^n, \qquad a_1(f)=1.
\end{align*}
[/definition]
The normalization $a_1(f)=1$ fixes the scaling, so the eigenvalue of $T_n$ on $f$ is the coefficient $a_n(f)$; in particular $T_pf=a_p(f)f$ for every prime $p\nmid N$. The new condition removes the part inherited from lower levels, which is important because the resulting Galois representation is meant to be an irreducible arithmetic object rather than a sum of old contributions.
Even after normalization, the coefficients need not lie in $\mathbb Q$. Since these coefficients will become Frobenius traces and Hecke eigenvalues, the coefficient field records the smallest number field over which the arithmetic data of the newform is visible.
[definition: Coefficient Field of a Newform]
Let $f=\sum_{n\ge 1}a_n(f)q^n$ be a normalized newform. The coefficient field of $f$ is
\begin{align*}
K_f := \mathbb Q(a_n(f) : n\ge 1) \subset \mathbb C.
\end{align*}
[/definition]
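For a normalized newform, the coefficients $a_n(f)$ are algebraic integers, and they generate a finite extension of $\mathbb Q$ because the Hecke algebra acting on the finite-dimensional space of cusp forms is finitely generated as a $\mathbb Z$-module. Thus $K_f$ is a number field, and the Hecke eigenvalues may be studied simultaneously at all finite places of $K_f$.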
[example: Rational Coefficient Field]
Suppose $f\in S_2(\Gamma_0(N))$ is a normalized newform and every Fourier coefficient satisfies $a_n(f)\in\mathbb Q$. The coefficients of a normalized newform are algebraic integers, so each coefficient lies in
\begin{align*}
\mathbb Q\cap \overline{\mathbb Z}=\mathbb Z,
\end{align*}
where $\overline{\mathbb Z}$ denotes the ring of algebraic integers. Hence $a_n(f)\in\mathbb Z$ for every $n$.
The coefficient field is therefore
\begin{align*}
K_f
&=\mathbb Q(a_n(f):n\ge 1)\\
&=\mathbb Q(\text{integers }a_n(f):n\ge 1)\\
&=\mathbb Q.
\end{align*}
For a rational prime $l$, the only prime of $\mathbb Z=\mathcal O_{\mathbb Q}$ above $l$ is $(l)$, and completing gives
\begin{align*}
K_{f,(l)}=(\mathbb Q)_{(l)}=\mathbb Q_l.
\end{align*}
Thus in the rational-coefficient case there is no choice of prime of the coefficient field above $l$, and the attached $l$-adic representation has coefficient field $\mathbb Q_l$.
[/example]
The rational coefficient case includes the newforms attached to elliptic curves over $\mathbb Q$. In general, however, a fixed rational prime $l$ can split, ramify, or remain inert in $K_f$, and different primes above $l$ give different coefficient fields.
The obstruction is that an $l$-adic representation cannot use the abstract field $K_f$ alone; it must live over a completion determined by a chosen prime above $l$. Choosing different primes above the same rational prime can produce different local coefficient fields, so the local field has to be named explicitly before the representation is defined.
[definition: Local Coefficient Field]
Let $K_f$ be the coefficient field of a newform $f$, let $\mathcal O_{K_f}$ be its ring of integers, and let $\lambda \trianglelefteq \mathcal O_{K_f}$ be a nonzero prime ideal above a rational prime $l$. The local coefficient field at $\lambda$ is the completion
\begin{align*}
K_{f,\lambda}:=(K_f)_\lambda.
\end{align*}
Its valuation ring is denoted $\mathcal O_{f,\lambda}$ and its residue field is denoted $k_{f,\lambda}$.
[/definition]
The notation records that the same rational prime $l$ may give several inequivalent embeddings of $K_f$ into finite extensions of $\mathbb Q_l$, one for each prime $\lambda\mid l$. The $\lambda$-adic representation attached to $f$ is built over this completion.
[example: Real Quadratic Hecke Field]
Let $K_f=\mathbb Q(\sqrt d)$ be a real quadratic coefficient field, and suppose the rational prime $l$ splits in $\mathcal O_{K_f}$ as
\begin{align*}
l\mathcal O_{K_f}=\lambda_1\lambda_2.
\end{align*}
Splitting means that $d$ has two square roots in $\mathbb Q_l$; choose one of them and call it $\alpha$, so $\alpha^2=d$ and the other root is $-\alpha$.
The two $l$-adic embeddings of $K_f$ are therefore
\begin{align*}
\iota_1:K_f&\longrightarrow \mathbb Q_l,
&
\iota_1(a+b\sqrt d)&=a+b\alpha,\\
\iota_2:K_f&\longrightarrow \mathbb Q_l,
&
\iota_2(a+b\sqrt d)&=a-b\alpha.
\end{align*}
Indeed,
\begin{align*}
\iota_1((\sqrt d)^2)&=\iota_1(d)=d=\alpha^2=\iota_1(\sqrt d)^2,\\
\iota_2((\sqrt d)^2)&=\iota_2(d)=d=(-\alpha)^2=\iota_2(\sqrt d)^2,
\end{align*}
so both maps respect the defining relation $(\sqrt d)^2=d$. The two primes $\lambda_1,\lambda_2$ give the two completions
\begin{align*}
K_{f,\lambda_1}\cong \mathbb Q_l,
\qquad
K_{f,\lambda_2}\cong \mathbb Q_l,
\end{align*}
corresponding to these two embeddings.
If, for a good prime $p$, the Hecke eigenvalue has the form
\begin{align*}
a_p(f)=u_p+v_p\sqrt d
\qquad (u_p,v_p\in \mathbb Q),
\end{align*}
then the two associated $l$-adic Frobenius traces are
\begin{align*}
\iota_1(a_p(f))&=u_p+v_p\alpha,\\
\iota_2(a_p(f))&=u_p-v_p\alpha.
\end{align*}
Thus the same algebraic Hecke eigenvalue $a_p(f)$ gives two possibly different $l$-adic traces, depending on which prime of $K_f$ above $l$ is used.
[/example]
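The two embeddings can be made concrete by approximating $\alpha$ modulo a power of $l$. The sketch below is a minimal illustration under stated assumptions: it takes $d=5$, $l=11$, and a made-up eigenvalue $a_p(f)=1+2\sqrt 5$ that is not claimed to belong to any actual newform, and it handles only odd $l\nmid d$. The root is lifted by Newton iteration (Hensel's lemma), and the helper name `sqrt_mod_prime_power` is hypothetical.

```python
def sqrt_mod_prime_power(d, l, k):
    """Return alpha with alpha^2 = d (mod l^k), for an odd prime l not dividing d."""
    if l == 2 or d % l == 0:
        raise ValueError("this sketch only handles odd l not dividing d")
    roots = [r for r in range(1, l) if (r * r - d) % l == 0]
    if not roots:
        raise ValueError("d is not a square mod l, so l does not split")
    alpha, modulus = roots[0], l
    for _ in range(k - 1):
        modulus *= l
        # Newton step alpha <- alpha - (alpha^2 - d) / (2 alpha), taken mod l^j.
        inverse = pow(2 * alpha, -1, modulus)  # needs Python 3.8 or later
        alpha = (alpha - (alpha * alpha - d) * inverse) % modulus
    return alpha


d, l, k = 5, 11, 4
modulus = l ** k
alpha = sqrt_mod_prime_power(d, l, k)

# Hypothetical Hecke eigenvalue u + v*sqrt(d) with integer u, v (illustration only).
u, v = 1, 2
trace_1 = (u + v * alpha) % modulus  # image under iota_1, modulo l^k
trace_2 = (u - v * alpha) % modulus  # image under iota_2, modulo l^k

print(alpha, trace_1, trace_2)
```

The two residues differ, although their sum and product are the images of the rational numbers $2u$ and $u^2-dv^2$, which do not depend on the choice of $\lambda_1$ or $\lambda_2$.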
## Constructing the Representation $\rho_{f,\lambda}$
The second problem is to find a natural vector space on which both the absolute Galois group and the Hecke algebra act. For weight $2$, the right home is the first etale cohomology of a modular curve, because holomorphic differentials on the modular curve correspond to weight $2$ cusp forms and Hecke correspondences act geometrically.
[definition: Galois Representation Attached to a Newform]
Let $f$ be a weight two newform of level $N$, nebentypus $\varepsilon$, coefficient field $K_f$, and let $\lambda\mid l$ be a prime of $K_f$. A $\lambda$-adic Galois representation attached to $f$ is a continuous representation
\begin{align*}
\rho_{f,\lambda}: \operatorname{Gal}(\overline{\mathbb Q}/\mathbb Q) \longrightarrow GL_2(K_{f,\lambda})
\end{align*}
which is unramified at every prime $p\nmid Nl$ and whose Frobenius characteristic polynomial at each such $p$ is $X^2-a_p(f)X+\varepsilon(p)p$.
[/definition]
This definition states what the object must do, but it does not yet prove that such an object exists.
The construction now needs an existence theorem strong enough to produce a continuous two-dimensional representation and to certify its good-prime Frobenius polynomials. Deligne's theorem is the result that turns the desired compatibility condition into an actual representation.
[quotetheorem:4761]
[citeproof:4761]
The determinant contains two familiar factors. With the arithmetic-Frobenius convention used in this chapter, the $l$-adic cyclotomic character contributes the factor $p$, and the nebentypus contributes $\varepsilon(p)$. If one instead writes formulas using geometric Frobenius, the cyclotomic value is inverted, so the two conventions must not be identified without changing the formula accordingly.
[remark: Frobenius Convention]
Some authors use geometric Frobenius instead of arithmetic Frobenius. This changes where inverses appear in the formula. In these notes, $\operatorname{Frob}_p$ is chosen so that the characteristic polynomial is
\begin{align*}
X^2-a_p(f)X+\varepsilon(p)p.
\end{align*}
[/remark]
## Frobenius Traces, Determinants, and Ramification
The main test of the construction is whether the representation can be read from its values at good primes. Away from the level $N$ and the auxiliary prime $l$, the answer is yes: the representation is unramified, and Frobenius has the characteristic polynomial predicted by the Hecke eigenvalues.
[definition: Unramified at a Prime]
Let $p$ be a rational prime and let $V$ be a finite-dimensional $K_{f,\lambda}$-vector space with a continuous action of $G_{\mathbb Q}:=\operatorname{Gal}(\overline{\mathbb Q}/\mathbb Q)$. The representation is unramified at $p$ if the inertia subgroup $I_p\subset G_{\mathbb Q}$ acts as the identity on $V$.
[/definition]
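When the representation is unramified at $p$, the action of a Frobenius element $\operatorname{Frob}_p$ on $V$ is well defined up to conjugation. Since trace and determinant are invariant under conjugation, the trace and determinant of Frobenius are well-defined functions of the representation and the prime $p$.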
The next point is to identify those two functions for the representation attached to a modular form. This is where the abstract Galois action becomes computable from the Fourier coefficients and nebentypus.
[quotetheorem:4762]
[citeproof:4762]
This theorem is the point at which modular forms become arithmetic representations. The analytic coefficient $a_p(f)$ is recovered as a trace, while the nebentypus and the cyclotomic character combine in the determinant.
[example: Point Counts on an Elliptic Curve]
Let $E/\mathbb Q$ be an elliptic curve of conductor $N$, and let $f_E\in S_2(\Gamma_0(N))$ be the associated normalized newform. For a prime $p\nmid N$, the reduction $E_{\mathbb F_p}$ is smooth, and the modularity relation between $E$ and $f_E$ gives
\begin{align*}
a_p(f_E)=p+1-|E(\mathbb F_p)|.
\end{align*}
If also $l\ne p$, then $p\nmid Nl$, so the Frobenius trace of the attached $l$-adic representation is
\begin{align*}
\operatorname{tr}\rho_{f_E,l}(\operatorname{Frob}_p)
&=a_p(f_E)\\
&=p+1-|E(\mathbb F_p)|.
\end{align*}
For example, suppose
\begin{align*}
|E(\mathbb F_p)|=p+1-3.
\end{align*}
Then
\begin{align*}
a_p(f_E)
&=p+1-|E(\mathbb F_p)|\\
&=p+1-(p+1-3)\\
&=p+1-p-1+3\\
&=3.
\end{align*}
Since $f_E$ has nebentypus equal to $1$, the determinant at $\operatorname{Frob}_p$ is $p$, and therefore
\begin{align*}
\det\left(XI-\rho_{f_E,l}(\operatorname{Frob}_p)\right)
&=X^2-a_p(f_E)X+p\\
&=X^2-3X+p.
\end{align*}
Thus the point count over $\mathbb F_p$ determines the good-prime Frobenius polynomial of the representation.
[/example]
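The passage from a point count to a Frobenius polynomial is mechanical, and a short Python sketch makes the steps explicit. It uses the hypothetical count $|E(\mathbb F_p)|=p+1-3$ from the example with the arbitrary choice $p=5$, and it also checks the Hasse bound $|a_p|\le 2\sqrt p$, a standard fact that is not proved in this section. The function name is illustrative.

```python
import cmath


def frobenius_data(p, num_points):
    """Return a_p, the coefficients of X^2 - a_p X + p, and its two complex roots."""
    a_p = p + 1 - num_points
    disc = a_p * a_p - 4 * p
    if disc > 0:
        raise ValueError("count violates the Hasse bound |a_p| <= 2 sqrt(p)")
    root = cmath.sqrt(disc)
    alpha = (a_p + root) / 2
    beta = (a_p - root) / 2
    return a_p, (1, -a_p, p), (alpha, beta)


p = 5
a_p, charpoly, (alpha, beta) = frobenius_data(p, p + 1 - 3)

print("a_p =", a_p)
print("characteristic polynomial coefficients:", charpoly)
print("|alpha|^2 =", abs(alpha) ** 2, " |beta|^2 =", abs(beta) ** 2)
```

Because the discriminant $a_p^2-4p$ is not positive, the two roots are complex conjugates of absolute value $\sqrt p$. In particular the count in the example can occur only when $9\le 4p$, that is, for $p\ge 3$.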
The elliptic curve example shows the geometric origin of the trace formula in its most concrete form. For a general newform, the modular abelian variety attached to $f$ replaces the elliptic curve, and its Tate module supplies the same two-dimensional representation after cutting out the $f$-factor.
The trace formula applies only where the action of Frobenius does not depend on the choice of a Frobenius lift, that is, where inertia acts trivially. At primes of bad reduction, or at the auxiliary prime $l$, inertia can act nontrivially, so the polynomial computed from point counts no longer gives a complete local description. One therefore needs a precise ramification statement separating the good primes, where the Hecke eigenvalues control Frobenius directly, from the exceptional primes that require separate local analysis.
[quotetheorem:4763]
[citeproof:4763]
At primes dividing $N$, ramification reflects the bad reduction and the local automorphic type of $f$. At the prime $l$, the representation is usually ramified in a subtler $l$-adic Hodge-theoretic way, and later chapters study the extra structure present there. Thus the clean Frobenius polynomial is a good-prime invariant, not a complete local description at every prime. This limitation is useful pedagogically: it isolates the part of the representation governed directly by Hecke operators before the course turns to inertia, conductors, and local-global compatibility.
## Uniqueness from Chebotarev
The final problem is whether the Frobenius trace and determinant formulas determine the representation. Since Frobenius elements are only specified at primes away from $Nl$, this is a density question about conjugacy classes in the absolute Galois group.
[quotetheorem:4764]
[citeproof:4764]
For newforms, this uniqueness theorem means that Deligne's representation is characterized by its good-prime Frobenius polynomials. The construction may use cohomology, modular curves, and projectors, but the resulting semisimple representation is the unique one with the prescribed traces and determinants at almost all primes.
[example: Distinguishing Two Newforms]
Let $f$ and $g$ be weight two newforms with the same level $N$ and the same nebentypus $\varepsilon$. Fix primes $\lambda_f\mid l$ of $K_f$ and $\lambda_g\mid l$ of $K_g$, and view both local coefficient fields inside a common finite extension $L/\mathbb Q_l$. Suppose first that $p\nmid Nl$ and
\begin{align*}
a_p(f)\ne a_p(g)
\end{align*}
in $L$. The good-prime Frobenius characteristic polynomial formula gives the traces
\begin{align*}
\operatorname{tr}\rho_{f,\lambda_f}(\operatorname{Frob}_p)&=a_p(f),\\
\operatorname{tr}\rho_{g,\lambda_g}(\operatorname{Frob}_p)&=a_p(g).
\end{align*}
If the two $L$-linear representations were isomorphic, then the matrices representing $\operatorname{Frob}_p$ would be conjugate in $GL_2(L)$, and conjugate matrices have the same trace:
\begin{align*}
\operatorname{tr}\rho_{f,\lambda_f}(\operatorname{Frob}_p)
=
\operatorname{tr}\rho_{g,\lambda_g}(\operatorname{Frob}_p).
\end{align*}
Substituting the trace formulas gives
\begin{align*}
a_p(f)=a_p(g),
\end{align*}
contradicting the chosen prime $p$. Hence the associated representations cannot be isomorphic.
Conversely, suppose that after embedding both coefficient fields into $L$, one has
\begin{align*}
a_p(f)=a_p(g)
\end{align*}
for all primes $p$ outside a finite set containing the bad primes. For every such good prime,
\begin{align*}
\operatorname{tr}\rho_{f,\lambda_f}(\operatorname{Frob}_p)
&=a_p(f)\\
&=a_p(g)\\
&=\operatorname{tr}\rho_{g,\lambda_g}(\operatorname{Frob}_p).
\end{align*}
Both representations are semisimple and unramified outside finitely many primes, so *Chebotarev Uniqueness for Semisimple Representations* gives
\begin{align*}
\rho_{f,\lambda_f}\otimes_{K_{f,\lambda_f}}L
\cong
\rho_{g,\lambda_g}\otimes_{K_{g,\lambda_g}}L
\end{align*}
as semisimple $L$-representations. Thus a single differing good-prime coefficient separates the representations, while agreement of almost all good-prime traces identifies them after passing to common coefficients.
[/example]
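A numerical instance of the first half of this example can be sketched with two elliptic curves. The Weierstrass coefficients below are recalled from the standard tables as representatives of the two isogeny classes of conductor $26$ (Cremona labels 26a1 and 26b1); that identification, and hence the statement that both correspond to newforms in $S_2(\Gamma_0(26))$ with trivial nebentypus, is an assumption of the sketch and is not verified by the code. The code only counts points and compares the resulting traces at primes away from $2$ and $13$.

```python
def count_points(coeffs, p):
    """Number of points over F_p, including the point at infinity."""
    a1, a2, a3, a4, a6 = coeffs
    affine = 0
    for x in range(p):
        rhs = x ** 3 + a2 * x * x + a4 * x + a6
        for y in range(p):
            if (y * y + a1 * x * y + a3 * y - rhs) % p == 0:
                affine += 1
    return affine + 1


def frobenius_trace(coeffs, p):
    """a_p = p + 1 - #E(F_p) at a prime of good reduction."""
    return p + 1 - count_points(coeffs, p)


# Assumed representatives of the two conductor-26 isogeny classes.
E1 = (1, 0, 1, -5, -8)
E2 = (1, -1, 1, -3, 3)

good_primes = [3, 5, 7, 11, 17, 19]
traces = [(p, frobenius_trace(E1, p), frobenius_trace(E2, p)) for p in good_primes]
separating = [p for p, t1, t2 in traces if t1 != t2]

for row in traces:
    print(row)
print("primes separating the two representations:", separating)
```

A single prime in the list `separating` already shows that the two $l$-adic representations are not isomorphic, for any choice of $l$ different from that prime and from $2$ and $13$.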
The chapter has established the main dictionary for weight $2$ newforms: Hecke eigenvalues become Frobenius traces, the nebentypus times the cyclotomic character becomes the determinant, and bad primes are confined to the level and the chosen auxiliary prime. Subsequent chapters refine this dictionary locally, especially at primes dividing $N$ and at the prime $l$.
# 7. Higher Weight Forms and Deligne's Construction
The previous chapters attached Galois representations to weight $2$ forms by using the ordinary cohomology of modular curves and the Jacobians that they define. Higher weights force a new ingredient: the modular curve still parametrises elliptic curves, but the modular form transforms as a tensor power of the invariant differential, so the relevant cohomology must remember a varying vector space over the curve. Deligne's construction packages this variation as an etale local system, then applies the same Hecke-eigenspace philosophy to obtain a two-dimensional Galois representation.
The main question in this chapter is how the geometry of modular curves records the Hecke eigenvalues of a normalized cuspidal eigenform $f \in S_k(\Gamma_1(N),\varepsilon)$ with $k \ge 2$. We state Deligne's theorem, explain the trace and determinant formulas, and interpret the weights through Hodge-Tate theory, purity, and the functional equation. The chapter ends with Ramanujan's Delta form, where the abstract construction becomes the representation whose Frobenius traces are the coefficients $\tau(p)$.
## Why Higher Weight Requires Local Systems
What changes when a modular form has weight $k>2$? For weight $2$, a cusp form $f(z)\,dz$ is a holomorphic differential on the modular curve, so it naturally belongs to the first cohomology of the curve. For weight $k$, the factor $(cz+d)^k$ says that the form behaves like a section of a higher tensor power of the Hodge bundle, and ordinary cohomology of the curve has no room to store the additional symmetric-power data.
[definition: Hodge Bundle On A Modular Curve]
Let $Y_1(N)$ be the modular curve over a base on which $N$ is invertible, and let $\pi:E\to Y_1(N)$ be the universal elliptic curve. The Hodge bundle is
\begin{align*}
\omega := \pi_*\Omega^1_{E/Y_1(N)}.
\end{align*}
[/definition]
For $k\ge 0$, weight $k$ modular forms are modelled by sections of $\omega^{\otimes k}$, with the appropriate condition at the cusps after compactification.
The Hodge bundle explains the automorphy factor on the analytic side. If $\gamma=\begin{pmatrix}a&b\\ c&d\end{pmatrix}$ acts on the upper half-plane and $z\mapsto \gamma z$, the invariant differential on the corresponding elliptic curve scales by $cz+d$. A weight $k$ modular form compensates by the $k$-th tensor power of this scaling.
[example: Weight Two As Ordinary Cohomology]
Let $f\in S_2(\Gamma_1(N))$. For
\begin{align*}
\gamma=\begin{pmatrix}a&b\\ c&d\end{pmatrix}\in \Gamma_1(N),
\end{align*}
the weight $2$ transformation rule and the derivative of the fractional linear transformation give
\begin{align*}
f(\gamma z)&=(cz+d)^2f(z),\\
d(\gamma z)&=d\left(\frac{az+b}{cz+d}\right)
=\frac{(cz+d)a-c(az+b)}{(cz+d)^2}\,dz\\
&=\frac{ad-bc}{(cz+d)^2}\,dz
=\frac{1}{(cz+d)^2}\,dz.
\end{align*}
Therefore
\begin{align*}
f(\gamma z)\,d(\gamma z)
=(cz+d)^2f(z)\cdot \frac{1}{(cz+d)^2}\,dz
=f(z)\,dz,
\end{align*}
so $f(z)\,dz$ is invariant under $\Gamma_1(N)$ and descends to a holomorphic differential on $Y_1(N)(\mathbb C)$.
At a cusp, choose a local parameter $q=e^{2\pi iz}$. Since $f$ is cuspidal,
\begin{align*}
f(q)=\sum_{n=1}^{\infty}a_nq^n.
\end{align*}
Also
\begin{align*}
dq=2\pi i e^{2\pi iz}\,dz=2\pi iq\,dz,
\end{align*}
so
\begin{align*}
f(z)\,dz
=\left(\sum_{n=1}^{\infty}a_nq^n\right)\frac{dq}{2\pi iq}
=\frac{1}{2\pi i}\left(\sum_{n=1}^{\infty}a_nq^{n-1}\right)dq.
\end{align*}
The power series $\sum_{n=1}^{\infty}a_nq^{n-1}$ has no negative powers of $q$, so the differential extends holomorphically across the cusp. Thus $f(z)\,dz$ defines a class in $H^1(X_1(N)(\mathbb C),\mathbb C)$, and Hecke correspondences act on this class by the same pull-push operation that acts on the differential. This is the geometric reason that weight $2$ eigenforms are visible in the Jacobian of $X_1(N)$.
[/example]
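The derivative computation in the example can be sanity-checked numerically. The sketch below compares a central finite difference of $z\mapsto \gamma z$ with $(cz+d)^{-2}$ for one matrix in $\Gamma_1(11)$; the matrix, the sample point, and the step size are arbitrary illustrative choices.

```python
def mobius(a, b, c, d, z):
    """Fractional linear transformation z -> (a z + b) / (c z + d)."""
    return (a * z + b) / (c * z + d)


# A matrix in Gamma_1(11): determinant 1, a = d = 1 mod 11, c = 0 mod 11.
a, b, c, d = 12, 1, 11, 1
assert a * d - b * c == 1

z = complex(0.3, 1.2)  # sample point in the upper half-plane
h = 1e-6               # finite-difference step

numeric = (mobius(a, b, c, d, z + h) - mobius(a, b, c, d, z - h)) / (2 * h)
exact = 1 / (c * z + d) ** 2
error = abs(numeric - exact)

print("finite difference:", numeric)
print("1/(cz+d)^2:       ", exact)
print("absolute error:   ", error)
```

The factor $(cz+d)^{-2}$ produced by $d(\gamma z)$ is exactly what cancels the weight $2$ automorphy factor, which is why the cancellation fails for $k>2$ unless extra coefficients are introduced.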
For $k>2$, replacing $dz$ by a higher tensor does not produce an ordinary differential form on the curve. The missing weight must be carried by coefficients rather than by the curve alone.
The object that supplies those coefficients must be a local system over the modular curve. Its fibre must remember the first cohomology of the elliptic curve being parametrized, and taking symmetric powers of that fibre is what inserts the exponent $k-2$ into the cohomological construction.
[definition: Symmetric Power Local System]
Fix a prime $\ell\nmid N$. Let $\pi:E\to Y_1(N)$ be the universal elliptic curve, and set
\begin{align*}
\mathcal V_\ell := R^1\pi_*\mathbb Q_\ell.
\end{align*}
For $k\ge 2$, the local system relevant to weight $k$ is
\begin{align*}
\operatorname{Sym}^{k-2}\mathcal V_\ell.
\end{align*}
Its fibre at a geometric point corresponding to an elliptic curve $E_y$ is $\operatorname{Sym}^{k-2}H^1_{\mathrm{et}}(E_y,\mathbb Q_\ell)$.
[/definition]
The exponent $k-2$ is not an accident. The curve itself contributes the cohomological degree $1$, which accounts for the differential part present already in weight $2$; the remaining $k-2$ powers come from the relative cohomology of the universal elliptic curve.
[explanation: Analytic Meaning Of The Coefficient System]
On the analytic modular curve, the local system is the one attached to the algebraic representation $\operatorname{Sym}^{k-2}$ of $GL_2$. A point of the upper half-plane determines an elliptic curve $E_z=\mathbb C/(\mathbb Z+z\mathbb Z)$, and $H^1(E_z)$ varies as $z$ varies. Passing to the symmetric power records polynomial functions in the periods of the elliptic curve, which is the extra data needed to match weight $k$ automorphy.
This is the cohomological source of Eichler-Shimura in higher weight: cusp forms of weight $k$ appear in the parabolic part of $H^1$ of the modular curve with coefficients in $\operatorname{Sym}^{k-2}$. The same Hecke correspondences used in weight $2$ act on this cohomology because they act not only on elliptic curves but also on their cohomology.
[/explanation]
The next structural question is whether this cohomological space actually contains the same Hecke eigensystems as the analytic cusp forms. This is not automatic from the definition of the local system: ordinary cohomology can also see boundary phenomena, and the coefficient system must match the weight exactly. The comparison theorem supplies the bridge by identifying the cuspidal Hecke eigensystems inside parabolic cohomology, which is the cohomological setting where Galois can act.
[quotetheorem:4765]
[citeproof:4765]
This theorem explains the target of Deligne's construction. Instead of searching for a Galois representation directly from a $q$-expansion, we search inside the etale cohomology group where the Hecke eigenclass attached to $f$ lives. The condition $k\ge 2$ is part of the geometry: the coefficient system is $\operatorname{Sym}^{k-2}$, and negative symmetric powers do not define local systems of this kind. Parabolic cohomology is needed because ordinary cohomology also contains boundary and Eisenstein classes from the cusps, whose Hecke eigenvalues are not the cuspidal systems we want. The congruence subgroup fixes the moduli problem and the available Hecke correspondences; changing from $\Gamma_1(N)$ to another level changes the coefficient spaces and the diamond-operator bookkeeping. The theorem does not by itself construct a Galois representation; it identifies the cohomological location in which Deligne will later impose both Hecke and Galois actions.
[example: Boundary Classes Do Not Give Cusp Forms]
On the non-compact curve $Y_1(N)(\mathbb C)$, ordinary cohomology with coefficients in the local system $\mathbb V_{k-2,\mathbb Q}$ sees not only interior classes but also classes whose restrictions to small punctured neighbourhoods of the cusps are nonzero. The parabolic cohomology is the kernel of the restriction map to the boundary:
\begin{align*}
H^1_{\mathrm{par}}(Y_1(N)(\mathbb C),\mathbb V_{k-2,\mathbb Q})
=
\ker\left(
H^1(Y_1(N)(\mathbb C),\mathbb V_{k-2,\mathbb Q})
\to
H^1(\partial X_1(N)(\mathbb C),\mathbb V_{k-2,\mathbb Q})
\right).
\end{align*}
Thus a class survives in parabolic cohomology exactly when its boundary restriction is zero at every cusp.
This matches the analytic cusp condition. If $f(q)=\sum_{n=0}^{\infty}a_nq^n$ is holomorphic at a cusp, then $f$ is cuspidal at that cusp precisely when
\begin{align*}
a_0=0,
\end{align*}
so its $q$-expansion begins with positive powers of $q$. Boundary classes record the data left at the cusps, while cusp forms have no constant term there. Therefore ordinary cohomology contains Eisenstein boundary eigensystems in addition to cuspidal ones, and passing to parabolic cohomology removes exactly the boundary contribution needed for the Eichler-Shimura cuspidal correspondence.
[/example]
## Deligne's Galois Representation Attached To A Weight k Eigenform
Given a normalized cuspidal Hecke eigenform, what should its Galois representation do? The answer is dictated by Frobenius at primes away from $N\ell$: the trace must be the Hecke eigenvalue and the determinant must be the product of the cyclotomic factor and the nebentypus. Deligne's theorem asserts that these desired local conditions are realised by an actual two-dimensional representation.
[definition: Normalized Cuspidal Eigenform With Nebentypus]
Let $k\ge 2$, $N\ge 1$, and let $\varepsilon:(\mathbb Z/N\mathbb Z)^\times\to \mathbb C^\times$ be a Dirichlet character. A normalized cuspidal eigenform of weight $k$, level $\Gamma_1(N)$, and nebentypus $\varepsilon$ is a form
\begin{align*}
f(q)=\sum_{n=1}^{\infty}a_n(f)q^n\in S_k(\Gamma_1(N),\varepsilon)
\end{align*}
such that $a_1(f)=1$, $f$ is an eigenvector for each Hecke operator
\begin{align*}
T_p:S_k(\Gamma_1(N),\varepsilon)\to S_k(\Gamma_1(N),\varepsilon) \qquad (p\nmid N),
\end{align*}
and $f$ is an eigenvector for each bad-prime operator
\begin{align*}
U_p:S_k(\Gamma_1(N),\varepsilon)\to S_k(\Gamma_1(N),\varepsilon) \qquad (p\mid N).
\end{align*}
[/definition]
The coefficients $a_n(f)$ generate a number field $K_f$, called the coefficient field. For a finite place $\lambda$ of $K_f$ above a rational prime $\ell$, the representation should live over the completion $K_{f,\lambda}$, but existence alone would not identify which representation has been constructed.
The identifying data are the good-prime Frobenius polynomials. A useful theorem must show that the trace recovers $a_p(f)$ and that the determinant recovers the nebentypus and cyclotomic factor, because only those formulas tie the Galois representation back to the Hecke eigensystem.
[quotetheorem:4766]
[citeproof:4766]
Cuspidality is essential because it removes Eisenstein pieces whose Galois representations decompose into characters rather than giving the two-dimensional irreducible objects attached to cusp forms. Normalization fixes the scale of the eigenform: without $a_1(f)=1$, the same eigensystem would be represented by many scalar multiples, and the coefficient $a_p(f)$ would not be a canonical trace. The exclusion $p\nmid N\ell$ is also substantive, since at primes dividing $N$ the representation can be ramified and at $p=\ell$ the correct language is $\ell$-adic Hodge theory rather than an unramified Frobenius polynomial. Thus the theorem determines the representation by its good-prime Frobenius data, but it does not give a full description of the local representation at every bad prime.
The determinant formula is often written as
\begin{align*}
\det\rho_{f,\lambda}=\varepsilon\,\chi_\ell^{k-1},
\end{align*}
where $\chi_\ell:G_{\mathbb Q}\to \mathbb Z_\ell^\times$ is the $\ell$-adic cyclotomic character and $\varepsilon$ is viewed as a finite-order Galois character through the [Kronecker-Weber theorem](/theorems/3340). This identity is understood by evaluating both sides on Frobenius elements away from $N\ell$ and then using continuity and density.
[example: Determinant And Frobenius Roots]
Let $f\in S_k(\Gamma_0(N))$ be a normalized newform with identity nebentypus, and let
\begin{align*}
A_p=\rho_{f,\lambda}(\operatorname{Frob}_p)
\end{align*}
for a prime $p\nmid N\ell$. By *Deligne Representation For A Cuspidal Eigenform*,
\begin{align*}
\operatorname{tr}(A_p)=a_p(f),\qquad \det(A_p)=p^{k-1}.
\end{align*}
For any $2\times 2$ matrix $A$ with trace $t$ and determinant $d$, the characteristic polynomial is
\begin{align*}
\det(XI-A)=X^2-tX+d.
\end{align*}
Applying this with $A=A_p$, $t=a_p(f)$, and $d=p^{k-1}$ gives
\begin{align*}
\det(XI-A_p)=X^2-a_p(f)X+p^{k-1}.
\end{align*}
If $\alpha_p$ and $\beta_p$ are the two Frobenius roots, then
\begin{align*}
X^2-a_p(f)X+p^{k-1}=(X-\alpha_p)(X-\beta_p)
=X^2-(\alpha_p+\beta_p)X+\alpha_p\beta_p.
\end{align*}
Comparing coefficients gives
\begin{align*}
\alpha_p+\beta_p=a_p(f),\qquad \alpha_p\beta_p=p^{k-1}.
\end{align*}
Thus the trace fixes the sum of the roots, while the determinant fixes their product.
If the nebentypus is $\varepsilon$ instead of the identity character, the determinant formula becomes
\begin{align*}
\det(A_p)=\varepsilon(p)p^{k-1},
\end{align*}
so the same coefficient comparison gives
\begin{align*}
\alpha_p\beta_p=\varepsilon(p)p^{k-1}.
\end{align*}
Since $\varepsilon(p)$ is a root of unity, every complex embedding has $|\varepsilon(p)|=1$, so the nebentypus changes the phase of the product but not its complex absolute value.
[/example]
The example uses only the trace and determinant of Frobenius at good primes, and such data determine a representation only up to semisimplification. The following remark, Semisimplicity In The Statement, records that limitation before the construction is used again.
[remark: Semisimplicity In The Statement]
Deligne's theorem is usually stated for the semisimplification because the trace and determinant at almost all Frobenius elements determine the semisimple representation. Integral lattices inside $\rho_{f,\lambda}$ matter for residual representations and congruences between modular forms, but the rational $\lambda$-adic representation is the canonical object at this stage of the course.
[/remark]
The local behaviour at primes dividing $N\ell$ is more delicate. At primes $p\mid N$, ramification reflects the level and the local automorphic representation of $f$. At $\ell$, the representation is governed by $p$-adic Hodge theory (applied at the prime $p=\ell$), and the first invariant to record is the pair of Hodge-Tate weights.
## Hodge-Tate Weights And The Weight Of A Modular Form
How does the integer $k$ reappear once the representation has been constructed? It is visible in the Hodge-Tate weights at $\ell$, in the size of Frobenius eigenvalues at good primes, and in the centre of the functional equation of the $L$-function. These three viewpoints express the same motivic weight $k-1$.
[definition: Hodge-Tate Weights Of Deligne's Representation]
Let $f\in S_k(\Gamma_1(N),\varepsilon)$ be a normalized cuspidal eigenform with $k\ge 2$, and let $\lambda\mid \ell$. With the convention that the cyclotomic character $\chi_\ell$ has Hodge-Tate weight $1$, the Hodge-Tate weights of $\rho_{f,\lambda}$ are
\begin{align*}
0 \quad \text{and} \quad k-1.
\end{align*}
[/definition]
The two weights reflect the two Hodge pieces of the motive attached to $f$: one from the holomorphic cusp form and one from its dual partner. Their sum agrees with the exponent in the determinant, since $\det\rho_{f,\lambda}$ has Hodge-Tate weight $k-1$.
The definition records the expected pair, but the representation constructed by Deligne still needs a theorem placing it in this Hodge-Tate framework. That result is what makes the modular weight visible in the local $\ell$-adic representation.
[quotetheorem:4767]
[citeproof:4767]
This theorem gives the $\ell$-adic version of the weight of a modular form. It is also the reason the determinant contains $\chi_\ell^{k-1}$ rather than $\chi_\ell^k$ or $\chi_\ell^{k-2}$. The hypothesis that the representation comes from Deligne's geometric construction is important: an arbitrary two-dimensional $\ell$-adic representation with the same determinant need not have these Hodge-Tate weights. The theorem records only the multiset of Hodge-Tate weights, not whether the local representation at $\ell$ is crystalline, semistable, ordinary, or how its filtered module is arranged. The finite-order nebentypus affects the determinant and ramification but contributes Hodge-Tate weight $0$, so it does not change the pair $0,k-1$.
[example: Weight Two And Elliptic Curves]
For $k=2$, the modular-form Hodge-Tate weights are
\begin{align*}
0 \quad \text{and} \quad k-1=2-1=1.
\end{align*}
This is the same pair that occurs for the $\ell$-adic representation attached to an elliptic curve $E/\mathbb Q$. Indeed, set
\begin{align*}
V_\ell(E)=T_\ell(E)\otimes_{\mathbb Z_\ell}\mathbb Q_\ell.
\end{align*}
The de Rham cohomology of $E$ has the Hodge filtration
\begin{align*}
0\subset H^0(E,\Omega^1_{E/\mathbb Q})\subset H^1_{\mathrm{dR}}(E/\mathbb Q),
\end{align*}
and for an elliptic curve
\begin{align*}
\dim_{\mathbb Q}H^0(E,\Omega^1_{E/\mathbb Q})=1,\qquad
\dim_{\mathbb Q}H^1(E,\mathcal O_E)=1.
\end{align*}
Thus the two Hodge pieces occur in degrees $1$ and $0$, so, with the convention that $\chi_\ell$ has Hodge-Tate weight $1$, the Hodge-Tate weights of $V_\ell(E)$ are
\begin{align*}
0 \quad \text{and} \quad 1.
\end{align*}
Therefore the weight $2$ formula for modular forms reproduces exactly the Hodge-Tate pattern of elliptic curves, which is why weight $2$ newforms are the ones later compared directly with Galois representations coming from modular elliptic curves.
[/example]
## Purity And Weil Bounds For Fourier Coefficients
Why should the Fourier coefficient $a_p(f)$ have size about $p^{(k-1)/2}$? Deligne's proof of the [Weil conjectures](/theorems/2199) supplies the geometric input: the Frobenius eigenvalues on the relevant cohomology are pure of weight $k-1$. Since $a_p(f)$ is the sum of two such eigenvalues, the Ramanujan-Petersson bound follows.
[definition: Purity Of Frobenius Eigenvalues]
Let $V$ be a finite-dimensional $K_{f,\lambda}$-vector space and let $\rho:G_{\mathbb Q}\to GL(V)$ be a continuous representation that is unramified at a prime $p\ne \ell$. The representation is pure of weight $w$ at $p$ if every eigenvalue $\alpha$ of $\rho(\operatorname{Frob}_p)$ is algebraic over $\mathbb Q$ and satisfies
\begin{align*}
|\iota(\alpha)|=p^{w/2}
\end{align*}
for every embedding $\iota:\overline{\mathbb Q}\hookrightarrow \mathbb C$.
[/definition]
Purity is a statement about all complex embeddings of the Frobenius eigenvalues, not only about their value in a chosen $\lambda$-adic field. This is exactly the missing input in the analytic estimate: the trace formula expresses $a_p(f)$ as a sum of two Frobenius roots, but without a bound on the size of each root the trace formula alone gives no Ramanujan-Petersson inequality.
For the representations attached to cuspidal eigenforms, the obstruction is therefore to prove that the two good-prime roots have the geometric weight predicted by the coefficient system. Once that purity statement is known in weight $k-1$, the bound follows by adding the two roots.
[quotetheorem:4768]
[citeproof:4768]
This is the modern form of the Ramanujan-Petersson conjecture for holomorphic cusp forms. The hypotheses separate good primes from bad primes: the clean two-root purity statement applies at primes where the representation is unramified, while primes dividing $N$ require local factors determined by the ramified local representation. Cuspidality matters because Eisenstein series have coefficients governed by sums of characters and powers, not by pure two-dimensional cuspidal motives. Purity controls the absolute values of the Frobenius roots; it does not compute $a_p(f)$ directly, but once the roots are known to have size $p^{(k-1)/2}$ the trace estimate follows by adding them. The case of Ramanujan's Delta form was the original motivating example, and it becomes a direct corollary of Deligne's theorem.
[example: Recovering A Weil Bound From A Frobenius Polynomial]
Suppose $f$ has identity nebentypus and $p$ is a good prime with Frobenius polynomial
\begin{align*}
X^2-a_p(f)X+p^{k-1}=(X-\alpha_p)(X-\beta_p).
\end{align*}
Expanding the right-hand side gives
\begin{align*}
(X-\alpha_p)(X-\beta_p)
&=X^2-X\beta_p-\alpha_pX+\alpha_p\beta_p\\
&=X^2-(\alpha_p+\beta_p)X+\alpha_p\beta_p.
\end{align*}
Comparing the coefficient of $X$ in the two expressions for the same monic quadratic polynomial gives
\begin{align*}
-a_p(f)=-(\alpha_p+\beta_p),
\end{align*}
and hence
\begin{align*}
a_p(f)=\alpha_p+\beta_p.
\end{align*}
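Let $\iota:\overline{\mathbb Q}\hookrightarrow \mathbb C$ be an embedding. The roots $\alpha_p$ and $\beta_p$ are algebraic but need not lie in $K_f$, so the embedding is taken on $\overline{\mathbb Q}$, as in the definition of purity. Since $\iota$ preserves addition,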
\begin{align*}
\iota(a_p(f))=\iota(\alpha_p+\beta_p)=\iota(\alpha_p)+\iota(\beta_p).
\end{align*}
Purity of weight $k-1$ gives
\begin{align*}
|\iota(\alpha_p)|=p^{(k-1)/2},\qquad
|\iota(\beta_p)|=p^{(k-1)/2}.
\end{align*}
Therefore the triangle inequality in $\mathbb C$ gives
\begin{align*}
|\iota(a_p(f))|
&=|\iota(\alpha_p)+\iota(\beta_p)|\\
&\le |\iota(\alpha_p)|+|\iota(\beta_p)|\\
&=p^{(k-1)/2}+p^{(k-1)/2}\\
&=2p^{(k-1)/2}.
\end{align*}
Thus the trace coefficient $a_p(f)$ is bounded by adding the two pure Frobenius roots, each of size $p^{(k-1)/2}$.
[/example]
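The root-by-root estimate can be checked numerically. The following Python sketch is an illustration only: the helper name `frobenius_roots` is hypothetical, the nebentypus is assumed to be trivial, and the input $\tau(2)=-24$ is the standard second coefficient of the weight $12$ form $\Delta$ rather than data taken from the example. For an integer trace, the two roots are complex conjugates of absolute value $p^{(k-1)/2}$ exactly when the discriminant $a_p^2-4p^{k-1}$ is non-positive, and that is the situation the sketch exhibits.

```python
import cmath


def frobenius_roots(a_p, p, k):
    """Roots of X^2 - a_p X + p^(k-1); trivial nebentypus is assumed."""
    disc = a_p * a_p - 4 * p ** (k - 1)
    sqrt_disc = cmath.sqrt(disc)
    return (a_p + sqrt_disc) / 2, (a_p - sqrt_disc) / 2


# Illustrative input (an assumption, not from the text): tau(2) = -24, k = 12.
alpha, beta = frobenius_roots(-24, 2, 12)
pure_size = 2 ** (11 / 2)

print(abs(alpha), abs(beta), pure_size)    # the three values should agree
print(abs(alpha + beta) <= 2 * pure_size)  # the triangle-inequality bound
```

The sum of the two computed roots recovers the trace and their product recovers $p^{k-1}$, up to floating-point error.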
The previous example treats a general eigenform with identity nebentypus. The next example, Ramanujan Delta And The Tau Bound, specializes the same argument to the case $k=12$ and $N=1$, where the trace is Ramanujan's $\tau(p)$.
[example: Ramanujan Delta And The Tau Bound]
The discriminant modular form
\begin{align*}
\Delta(q)=q\prod_{n=1}^{\infty}(1-q^n)^{24}=\sum_{n=1}^{\infty}\tau(n)q^n
\end{align*}
spans $S_{12}(SL_2(\mathbb Z))$, so it is a normalized eigenform of weight $k=12$, level $N=1$, and identity nebentypus. Fix a prime $\ell$, and for a prime $p\ne \ell$ set
\begin{align*}
A_p=\rho_{\Delta,\ell}(\operatorname{Frob}_p).
\end{align*}
Since $N=1$, the condition $p\nmid N\ell$ is exactly $p\ne \ell$. By *Deligne Representation For A Cuspidal Eigenform*,
\begin{align*}
\operatorname{tr}(A_p)&=\tau(p),\\
\det(A_p)&=p^{12-1}=p^{11}.
\end{align*}
For a $2\times 2$ matrix
\begin{align*}
A=\begin{pmatrix}a&b\\ c&d\end{pmatrix},
\end{align*}
we have
\begin{align*}
\det(XI-A)
&=\det\begin{pmatrix}X-a&-b\\ -c&X-d\end{pmatrix}\\
&=(X-a)(X-d)-(-b)(-c)\\
&=X^2-dX-aX+ad-bc\\
&=X^2-(a+d)X+(ad-bc)\\
&=X^2-\operatorname{tr}(A)X+\det(A).
\end{align*}
Applying this to $A=A_p$ gives
\begin{align*}
\det(XI-A_p)
&=X^2-\operatorname{tr}(A_p)X+\det(A_p)\\
&=X^2-\tau(p)X+p^{11}.
\end{align*}
Let $\alpha_p$ and $\beta_p$ be the two roots of this polynomial. Then
\begin{align*}
X^2-\tau(p)X+p^{11}
&=(X-\alpha_p)(X-\beta_p)\\
&=X^2-X\beta_p-\alpha_p X+\alpha_p\beta_p\\
&=X^2-(\alpha_p+\beta_p)X+\alpha_p\beta_p.
\end{align*}
Comparing the coefficients of $X$ gives
\begin{align*}
-\tau(p)=-(\alpha_p+\beta_p),
\end{align*}
hence
\begin{align*}
\tau(p)=\alpha_p+\beta_p.
\end{align*}
By *Deligne Weil Bound For Hecke Eigenvalues*, the two Frobenius roots have complex absolute value
\begin{align*}
|\alpha_p|=|\beta_p|=p^{(12-1)/2}=p^{11/2}.
\end{align*}
Therefore the triangle inequality gives
\begin{align*}
|\tau(p)|
&=|\alpha_p+\beta_p|\\
&\le |\alpha_p|+|\beta_p|\\
&=p^{11/2}+p^{11/2}\\
&=2p^{11/2}.
\end{align*}
Thus Ramanujan's bound for the prime coefficient $\tau(p)$ is the trace estimate obtained by adding the two pure Frobenius roots of Deligne's representation.
[/example]
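The bound can be tested on actual coefficients. The sketch below, a minimal illustration with hypothetical helper names `tau_coefficients` and `is_prime`, expands the product $q\prod_{n\ge 1}(1-q^n)^{24}$ by repeated in-place multiplication and then checks $|\tau(p)|\le 2p^{11/2}$ in the equivalent exact integer form $\tau(p)^2\le 4p^{11}$. The cut-off $60$ is an arbitrary choice.

```python
def tau_coefficients(limit):
    """Return [tau(1), ..., tau(limit)] from q * prod_{n>=1} (1 - q^n)^24."""
    # coeffs[m] is the coefficient of q^m in the truncated infinite product.
    coeffs = [0] * limit
    coeffs[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # Multiply in place by (1 - q^n). Going downward in m ensures
            # coeffs[m - n] is still the value from before this multiplication.
            for m in range(limit - 1, n - 1, -1):
                coeffs[m] -= coeffs[m - n]
    # The extra factor q shifts indices: tau(m + 1) = coeffs[m].
    return coeffs


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


tau = tau_coefficients(60)
for p in range(2, 61):
    if is_prime(p):
        tau_p = tau[p - 1]
        # Exact integer form of |tau(p)| <= 2 p^(11/2).
        print(p, tau_p, tau_p ** 2 <= 4 * p ** 11)
```

Working with squares avoids floating-point comparison, so each printed truth value is an exact statement about integers.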
## Functional Equations And The Motivic Viewpoint
What does the Galois representation know about the analytic $L$-function of $f$? Away from bad primes, the characteristic polynomial of Frobenius is the Euler factor of the modular $L$-function. The determinant, Hodge-Tate weights, and purity together explain why the completed $L$-function has its functional equation centred at $s=k/2$.
[definition: Good Euler Factor Of A Modular Form]
Let $f\in S_k(\Gamma_1(N),\varepsilon)$ be a normalized eigenform. For a prime $p\nmid N$, the good Euler factor of $L(f,s)$ is
\begin{align*}
L_p(f,s)=\left(1-a_p(f)p^{-s}+\varepsilon(p)p^{k-1-2s}\right)^{-1}.
\end{align*}
[/definition]
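Writing $X=p^{-s}$, this Euler factor is the reciprocal of the reversed characteristic polynomial of Frobenius, which for $p\nmid N\ell$ is
\begin{align*}
\det\bigl(1-X\rho_{f,\lambda}(\operatorname{Frob}_p)\bigr)=1-a_p(f)X+\varepsilon(p)p^{k-1}X^2.
\end{align*}
Deligne's representation therefore turns the Hecke eigenvalue data into the standard local factor of a Galois representation.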
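One way to see the Euler factor at work is to expand it as a power series in $X=p^{-s}$: the coefficients $c_r$ satisfy $c_0=1$, $c_1=a_p(f)$, and $c_{r+1}=a_p(f)c_r-\varepsilon(p)p^{k-1}c_{r-1}$, and for a normalized eigenform they are the prime-power coefficients $a_{p^r}(f)$. The Python sketch below is illustrative; the function name is hypothetical and the inputs $\tau(2)=-24$ and $\tau(3)=252$ are standard values for $\Delta$, assumed here rather than derived.

```python
def euler_factor_coefficients(a_p, p, k, count, eps_p=1):
    """First `count` power-series coefficients of
    1 / (1 - a_p X + eps_p p^(k-1) X^2)."""
    coeffs = [1, a_p]
    for _ in range(count - 2):
        # Linear recurrence read off from the quadratic denominator.
        coeffs.append(a_p * coeffs[-1] - eps_p * p ** (k - 1) * coeffs[-2])
    return coeffs[:count]


# For Delta (k = 12, trivial nebentypus) the output lists tau(1), tau(p), tau(p^2), ...
print(euler_factor_coefficients(-24, 2, 12, 4))  # [1, -24, -1472, 84480]
print(euler_factor_coefficients(252, 3, 12, 3))  # [1, 252, -113643]
```

The printed values agree with $\tau(4)=-1472$, $\tau(8)=84480$, and $\tau(9)=-113643$, which is the multiplicativity encoded by the local factor.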
To make that comparison part of the theory rather than a formal analogy, one needs a statement identifying the good Euler factors of the modular form with the Frobenius factors of the attached representation. The following theorem records this compatibility at primes where no inertia correction is present.
[quotetheorem:4769]
[citeproof:4769]
The restriction $p\nmid N\ell$ is exactly the range in which the representation is unramified and the Frobenius conjugacy class is available without extra inertia data. If $p\mid N$, the Euler factor is modified according to the local automorphic representation of $f$ and may have degree smaller than two. If $p=\ell$, the local factor is not read from an unramified Frobenius element in the same way; it is described using the appropriate $\ell$-adic Hodge-theoretic object. Thus the theorem gives the clean good-prime Euler factor, while the completed $L$-function also requires separate bad-prime and archimedean factors.
[explanation: Centre Of The Functional Equation]
A pure two-dimensional motive of weight $k-1$ has Frobenius eigenvalues of size $p^{(k-1)/2}$. The functional equation exchanges $s$ with $k-s$, so its centre is $s=k/2$. This is compatible with the determinant formula: the product of the two Frobenius roots is $\varepsilon(p)p^{k-1}$, and the power $k-1$ is the motivic weight.
The Hodge-Tate weights $0$ and $k-1$ also determine the archimedean gamma factor. Thus the three pieces of structure seen in this chapter are not separate coincidences: the local system explains the cohomological construction, the determinant records the cyclotomic weight, and purity gives the analytic size of the Fourier coefficients.
[/explanation]
The pieces just assembled are the data needed later: good-prime traces, determinants, residual reductions, and the separation between clean unramified primes and the bad local factors.
[remark: What Is Used Later]
In Chapters 8, 9, and 10, the representation $\rho_{f,\lambda}$ will be reduced modulo $\lambda$ to obtain residual representations, compared with Galois representations from elliptic curves, and used in modularity lifting arguments. The formulas
\begin{align*}
\operatorname{tr}(\rho_{f,\lambda}(\operatorname{Frob}_p))=a_p(f),\qquad
\det\rho_{f,\lambda}=\varepsilon\chi_\ell^{k-1}
\end{align*}
are the bridge between congruences of modular forms and congruences of Galois representations.
[/remark]
Deligne's theorem extends the weight $2$ picture to arbitrary weight, but now the resulting representations are sensitive to the coefficient field and its primes. The next chapter studies what survives after reduction modulo $\ell$, and how congruences of modular forms are reflected in residual Galois representations.
# 8. Residual Representations and Congruences
This chapter passes from characteristic-zero Galois representations to their reductions modulo primes of coefficient fields. The main question is how much arithmetic information survives after reduction, and how congruences between Hecke eigenvalues are reflected in the residual representation. The guiding principle is that congruences between modular forms are not only congruences between $q$-expansions: they are congruences between systems of Frobenius traces, hence between Galois representations after semisimplification.
The chapter also introduces the first deformation-theoretic viewpoint. Reducible residual representations, especially those arising from Eisenstein congruences, give a bridge from modular forms to extension classes, Selmer groups, and deformation rings. We only state the deepest global theorems here, but we keep the exact hypotheses visible because Chapters 10 and 11 use them in modularity lifting and level lowering.
## Reduction Modulo Primes of Coefficient Fields
A Galois representation attached to a newform has coefficients in a number field, while congruences are measured modulo a prime of its ring of integers. The first problem is therefore notational and structural: given a representation over a finite extension of $\mathbb{Q}_\ell$, what does it mean to reduce it modulo the maximal ideal, and why is the answer independent of choices only after semisimplification?
Let $f = \sum_{n\ge 1} a_n(f)q^n$ be a normalised cuspidal newform of weight $k\ge 2$, level $N$, and nebentypus $\varepsilon$. Let $K_f = \mathbb{Q}(a_n(f): n\ge 1)$ be its coefficient field, and let $\lambda$ be a prime of $K_f$ above the rational prime $\ell$. Write $K_{f,\lambda}$ for the completion, $\mathcal{O}_{f,\lambda}$ for its valuation ring, and $k_\lambda$ for its residue field.
[definition: Integral Lattice]
Let $V$ be a finite-dimensional $K_{f,\lambda}$-vector space and let $\rho: G_\mathbb{Q}\to GL(V)$ be a continuous representation. An integral lattice in $V$ is a finitely generated $\mathcal{O}_{f,\lambda}$-submodule $L\subset V$ such that the natural map $L\otimes_{\mathcal{O}_{f,\lambda}} K_{f,\lambda}\to V$ is an isomorphism. It is $G_\mathbb{Q}$-stable if $\rho(g)L\subset L$ for all $g\in G_\mathbb{Q}$.
[/definition]
A stable lattice allows us to reduce matrices modulo $\lambda$, but the resulting representation can depend on the chosen lattice. Two lattices in the same characteristic-zero representation may produce different extension classes after reduction.
For congruence and deformation arguments, the canonical object is therefore not the raw reduction alone but its semisimplification. The definition below separates the lattice-dependent representation from the semisimplified residual representation that later notation such as $\bar{\rho}_{f,\lambda}$ is meant to denote.
[definition: Residual Representation]
Let $\rho: G_\mathbb{Q}\to GL(V)$ be a continuous representation over $K_{f,\lambda}$ and let $L\subset V$ be a $G_\mathbb{Q}$-stable integral lattice. Write $\rho_L$ for the resulting action of $G_\mathbb{Q}$ on $L$. The residual representation attached to $L$ is
\begin{align*}
\rho_L \otimes k_\lambda: G_\mathbb{Q}\longrightarrow GL(L/\lambda L).
\end{align*}
The semisimplified residual representation is the semisimplification of $\rho_L\otimes k_\lambda$.
[/definition]
We write $\bar{\rho}^{\mathrm{ss}}$ for the semisimplified residual representation when the characteristic-zero representation is understood. For modular forms the standard notation is $\bar{\rho}_{f,\lambda}$, with the convention that it denotes the semisimplified residual representation unless a lattice has been specified.
This convention needs a theorem behind it. Otherwise the symbol $\bar{\rho}^{\mathrm{ss}}$ could still depend on an unstated choice of stable lattice, making residual congruence statements ill-defined.
[quotetheorem:4770]
[citeproof:4770]
The independence statement is what makes the notation $\bar{\rho}^{\mathrm{ss}}$ meaningful without naming a lattice each time. It does not say that every residual lattice model is the same before semisimplification; extension classes can still depend on the chosen lattice. What survives canonically is the collection of Jordan-Hölder constituents, and that is the level of information used in congruence arguments.
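The dependence on the lattice can be seen in a toy computation. The Python sketch below is only an analogy under stated assumptions: it uses a single unipotent matrix over $\mathbb Q_\ell$ with $\ell=5$ in place of a Galois representation, and the names `matmul` and `reduce_mod` are hypothetical. The same matrix is written in bases of two stable lattices, spanned by $e_1,e_2$ and by $e_1,\ell e_2$, and then reduced modulo $\ell$.

```python
from fractions import Fraction

ELL = 5  # illustrative residue characteristic (an assumption)


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def reduce_mod(matrix, ell):
    """Reduce a matrix with integer entries modulo ell."""
    assert all(Fraction(x).denominator == 1 for row in matrix for x in row)
    return [[int(x) % ell for x in row] for row in matrix]


# g in the basis e1, e2; the lattice spanned by e1, e2 is stable under g.
g = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(1)]]

# Second stable lattice: spanned by e1 and ELL * e2 (columns of P).
P = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(ELL)]]
P_inv = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1, ELL)]]
g_second = matmul(P_inv, matmul(g, P))

first = reduce_mod(g, ELL)
second = reduce_mod(g_second, ELL)
print(first)   # [[1, 1], [0, 1]]  nonsplit unipotent reduction
print(second)  # [[1, 0], [0, 1]]  split (identity) reduction
```

Both reductions have trace $2$ and determinant $1$, so they share the same semisimplification, but only the first one is a nontrivial extension.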
For a newform, Deligne's representation gives a particularly concrete residual object. Since the characteristic-zero representation is characterized at good primes by Hecke eigenvalues, reducing the representation should reduce those same trace and determinant formulas modulo $\lambda$. The next theorem makes this precise and turns congruences between modular forms into congruences between residual Galois representations. This is the point where mod-$\lambda$ Hecke data becomes a representation-theoretic invariant.
[quotetheorem:4771]
[citeproof:4771]
[example: Ramanujan Tau Modulo Small Primes]
Let $\Delta(q)=\sum_{n\ge 1}\tau(n)q^n$ be the normalised cusp form of weight $12$, level $1$, and nebentypus equal to $1$. For a prime $\lambda\mid \ell$, the residual representation attached to $\Delta$ is unramified at every prime $p\ne \ell$, and the newform trace-determinant formulas give
\begin{align*}
\operatorname{tr}\bar{\rho}_{\Delta,\lambda}(\operatorname{Frob}_p)&\equiv \tau(p)\pmod{\lambda},\\
\det\bar{\rho}_{\Delta,\lambda}(\operatorname{Frob}_p)&\equiv p^{12-1}=p^{11}\pmod{\lambda}.
\end{align*}
Take $\ell=691$. Ramanujan's congruence says that, for every prime $p\ne 691$,
\begin{align*}
\tau(p)\equiv 1+p^{11}\pmod{691}.
\end{align*}
Hence the characteristic polynomial of $\bar{\rho}_{\Delta,691}(\operatorname{Frob}_p)$ is
\begin{align*}
X^2-\operatorname{tr}\bar{\rho}_{\Delta,691}(\operatorname{Frob}_p)X
+\det\bar{\rho}_{\Delta,691}(\operatorname{Frob}_p)
&\equiv X^2-\tau(p)X+p^{11}\pmod{691}\\
&\equiv X^2-(1+p^{11})X+p^{11}\pmod{691}\\
&=X^2-X-p^{11}X+p^{11}\pmod{691}\\
&=(X-1)(X-p^{11})\pmod{691}.
\end{align*}
For the representation $1\oplus \bar{\chi}_{691}^{11}$, the two Frobenius eigenvalues at $p\ne 691$ are $1$ and
\begin{align*}
\bar{\chi}_{691}(\operatorname{Frob}_p)^{11}\equiv p^{11}\pmod{691},
\end{align*}
so its characteristic polynomial at $\operatorname{Frob}_p$ is
\begin{align*}
(X-1)(X-p^{11})=X^2-(1+p^{11})X+p^{11}.
\end{align*}
Thus the two residual representations have the same characteristic polynomial at every $\operatorname{Frob}_p$ with $p\ne 691$. Chebotarev density and the Brauer-Nesbitt theorem imply that equality on this dense set of Frobenius elements determines the semisimplification, so
\begin{align*}
\bar{\rho}_{\Delta,691}^{\mathrm{ss}}\cong 1\oplus \bar{\chi}_{691}^{11}.
\end{align*}
The congruence of Hecke eigenvalues therefore records a reducibility statement for the residual Galois representation, not merely a numerical coincidence among Fourier coefficients.
[/example]
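The factorization modulo $691$ can be confirmed on a few primes. The following Python sketch is illustrative: the table `TAU_AT_PRIMES` hard-codes standard values of $\tau(p)$ for small primes (assumed here, not computed), and the helper names are hypothetical. It compares the coefficients of $X^2-\tau(p)X+p^{11}$ with those of $(X-1)(X-p^{11})$ after reduction modulo $691$.

```python
ELL = 691

# Standard values of tau(p) at small primes (assumed as known inputs).
TAU_AT_PRIMES = {2: -24, 3: 252, 5: 4830, 7: -16744, 11: 534612, 13: -577738}


def char_poly_mod(trace, det, modulus):
    """Coefficients of X^2 - trace X + det, reduced modulo `modulus`."""
    return (1, (-trace) % modulus, det % modulus)


def split_poly_mod(r, s, modulus):
    """Coefficients of (X - r)(X - s), reduced modulo `modulus`."""
    return (1, (-(r + s)) % modulus, (r * s) % modulus)


for p, tau_p in TAU_AT_PRIMES.items():
    frobenius_side = char_poly_mod(tau_p, p ** 11, ELL)
    eisenstein_side = split_poly_mod(1, pow(p, 11, ELL), ELL)
    print(p, frobenius_side == eisenstein_side)
```

Agreement at finitely many primes is of course only a sanity check; the isomorphism of semisimplifications rests on the congruence at all $p\ne 691$ together with the density argument in the example.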
## Hecke Congruences and Congruence Primes
The next problem is to recognise when two modular forms are congruent. The $q$-expansion principle says that congruences of Fourier coefficients are meaningful, while the Galois representation viewpoint says that congruent eigenvalues force congruent Frobenius traces. This section makes that bridge precise.
[definition: Hecke Congruence]
Let $f$ and $g$ be normalised Hecke eigenforms with coefficients in number fields contained in a common finite extension $K/\mathbb{Q}$. Let $\lambda$ be a prime of $K$. We say that $f$ and $g$ are congruent modulo $\lambda$ away from a finite set $S$ of primes if
\begin{align*}
a_p(f)\equiv a_p(g)\pmod{\lambda}
\end{align*}
for every prime $p\notin S$.
[/definition]
The finite set $S$ usually contains primes dividing the levels and the residue characteristic. The determinant must also match if the residual representations are to be isomorphic, so congruences are strongest when the weights and nebentypus characters are compatible modulo $\lambda$.
In applications one often wants to name the primes at which a fixed eigenform is congruent to a genuinely different eigensystem inside a chosen Hecke module. That requires recording both the ambient module and the exclusion of mere Galois conjugates.
[definition: Congruence Prime]
Let $f$ be a normalised Hecke eigenform in a Hecke module $M$ over a number field $K$. A prime $\lambda$ of $K$ is a congruence prime for $f$ in $M$ if there exists another normalised eigenform $g$ represented in $M$ such that $g$ is not Galois-conjugate to $f$ and the systems of Hecke eigenvalues of $f$ and $g$ are congruent modulo $\lambda$ for all Hecke operators in the chosen Hecke algebra.
[/definition]
The phrase "in $M$" matters. A congruence inside the full space of modular forms can have a different meaning from a congruence inside the new subspace, the cuspidal subspace, or an integral cohomology group.
The reason this notion interacts with Galois representations is that good-prime Hecke congruences are exactly trace congruences for Frobenius. To turn a coefficient congruence into an isomorphism statement, one also has to compare determinants and then use the density of Frobenius classes.
[quotetheorem:4772]
[citeproof:4772]
The converse is often how residual representations are used in practice: an isomorphism of residual representations produces congruences among Hecke eigenvalues at almost all primes.
[example: A Tau Congruence as a Hecke Congruence]
The normalized Eisenstein eigenpacket of weight $12$ and level $1$, written $E_{12}$ below and normalized so that $a_1(E_{12})=1$, has $T_p$-eigenvalue $1+p^{11}$ for every prime $p$. The cusp form $\Delta(q)=\sum_{n\ge 1}\tau(n)q^n$ is a normalized Hecke eigenform, so its $T_p$-eigenvalue is $\tau(p)$. Ramanujan's congruence gives, for every prime $p$,
\begin{align*}
\tau(p)&\equiv 1+p^{11}\pmod{691}.
\end{align*}
Thus, prime by prime, the two systems of Hecke eigenvalues satisfy
\begin{align*}
a_p(\Delta)&=\tau(p),\\
a_p(E_{12})&=1+p^{11},\\
a_p(\Delta)-a_p(E_{12})&=\tau(p)-(1+p^{11})\equiv 0\pmod{691}.
\end{align*}
Equivalently,
\begin{align*}
a_p(\Delta)\equiv a_p(E_{12})\pmod{691}
\end{align*}
for every prime $p$, so $\Delta$ is congruent modulo $691$ to the Eisenstein eigenpacket of weight $12$. This is not a congruence to another cusp form at level $1$, because $E_{12}$ is Eisenstein rather than cuspidal; it is a Hecke congruence inside the full space of modular forms of weight $12$.
[/example]
The Ramanujan congruence is an example of a residual eigensystem appearing more naturally modulo a prime than in characteristic zero. To use such eigensystems in the same framework as Deligne's representations, one needs a way to lift a mod-$\lambda$ eigenpacket to an honest characteristic-zero eigenform after possibly enlarging coefficients. The obstruction is algebraic rather than analytic: eigenvectors over a residue field need not visibly come from eigenvectors over the original coefficient ring. The [lifting lemma](/theorems/2437) supplies the finite-algebra input that bridges this gap.
[quotetheorem:4773]
[citeproof:4773]
The lemma is a bridge from mod $\lambda$ eigenclasses to characteristic-zero eigenforms. Conceptually, it says that a residual eigensystem can be lifted to characteristic zero after passing through the finite local algebra that records the Hecke action.
[example: Lifting a Modulo Lambda Eigenpacket]
Suppose $\bar{h}$ is a mod $\lambda$ Hecke eigenform of level $N$ with eigenvalues $\bar{a}_p\in k_\lambda$ for every Hecke operator $T_p$ with $p\nmid N\ell$. This means that, for each such $p$,
\begin{align*}
T_p\bar{h}=\bar{a}_p\bar{h}.
\end{align*}
By the *[Deligne-Serre Lifting Lemma](/theorems/4773)*, after replacing the coefficient ring by the ring of integers of a finite extension if necessary, there is a characteristic-zero eigenform $f$ whose reduction has the same residual eigensystem. Thus for every $p\nmid N\ell$,
\begin{align*}
T_pf&=a_p(f)f,\\
a_p(f)\bmod \lambda&=\bar{a}_p,
\end{align*}
or equivalently
\begin{align*}
a_p(f)\equiv \bar{a}_p\pmod{\lambda}.
\end{align*}
The lift is therefore not merely some characteristic-zero vector reducing to $\bar{h}$: it is a simultaneous Hecke eigenform, so its eigenvalues determine Frobenius traces in the attached Galois representation at all primes $p\nmid N\ell$.
[/example]
Lifting explains how residual eigenpackets can be compared with characteristic-zero forms, but it does not yet say how much level is truly forced by the residual representation. A form may have level divisible by a prime $p$ because its characteristic-zero representation is ramified there, while the residual representation becomes unramified or has weaker ramification after reduction. The level-lowering question asks when such a residual representation can be realized by a newform of smaller level. [Ribet's theorem](/theorems/4516) is the decisive result that makes this loss of level arithmetic rather than accidental.
[quotetheorem:4774]
[citeproof:4774]
The conceptual message of Ribet's theorem is that residual representations remember less level structure than characteristic-zero forms: ramification of the modular form at $p$ can disappear modulo $\lambda$. This is why level lowering is powerful enough to move a residual representation from a high-conductor modular form to a lower-level one.
For the architecture of the course, this theorem is the first place where residual representations become more than reductions of compatible systems. They acquire their own minimal level, determined by the ramification that survives after reduction. The distinction matters because a congruence can hide part of the original level even when the characteristic-zero form genuinely needed it. In the Fermat application, this is the step that turns the high conductor of a Frey curve into a contradiction at a much smaller level. Thus level lowering should be read as a rigidity theorem for residual Galois data, not only as a statement about changing the index $N$ in a space of modular forms.
## Eisenstein Ideals and Reducibility
The final problem is the most important source of congruences in the course: when does a cuspidal eigenform become congruent to an Eisenstein series? The Galois representation attached to an Eisenstein eigenpacket is reducible, so an Eisenstein congruence forces a cuspidal residual representation to become reducible after semisimplification.
[definition: Eisenstein Ideal]
Let $\mathbb{T}$ be a Hecke algebra acting on modular forms of weight $2$ and level $N$. The Eisenstein ideal is the ideal $I_E\subset \mathbb{T}$ generated by the elements
\begin{align*}
T_p-(1+p)
\end{align*}
for primes $p\nmid N$, together with the additional level operators imposed by the chosen Eisenstein series.
[/definition]
The exact list of generators at primes dividing $N$ depends on whether one works with $\Gamma_0(N)$, $\Gamma_1(N)$, new quotients, or a specified character. The prime-to-$N$ generators already record the essential reducible Frobenius traces.
The point of introducing this ideal is to detect when a cuspidal eigensystem has the same residual good-prime traces as the reducible Eisenstein eigensystem. The theorem below turns containment of the Eisenstein ideal in a maximal ideal into the corresponding reducibility statement for the residual Galois representation.
[quotetheorem:4775]
[citeproof:4775]
The Eisenstein ideal is the algebraic device that makes reducibility visible inside the Hecke algebra. Away from the level, the generator $T_p-(1+p)$ is exactly the difference between the trace of a cuspidal eigenpacket and the trace expected from a reducible sum of the constant character and the cyclotomic character. If a maximal ideal contains these generators, the residual representation attached to the corresponding eigensystem has the same semisimplified good-prime traces as that reducible representation. This explains why Eisenstein congruences are not isolated coefficient accidents: they are Hecke-algebra shadows of reducible residual Galois representations.
[example: Eisenstein Congruence at Level 11]
Let $f$ be the normalised weight $2$ newform of level $11$ attached to the elliptic curve of conductor $11$, with nebentypus equal to $1$. The Eisenstein congruence modulo $5$ says that, for every prime $p\ne 5,11$,
\begin{align*}
a_p(f)\equiv 1+p\pmod{5}.
\end{align*}
For such $p$, the residual representation $\bar{\rho}_{f,5}$ is unramified at $p$, and the trace-determinant formulas for a weight $2$ newform with nebentypus equal to $1$ give
\begin{align*}
\operatorname{tr}\bar{\rho}_{f,5}(\operatorname{Frob}_p)&\equiv a_p(f)\pmod{5},\\
\det\bar{\rho}_{f,5}(\operatorname{Frob}_p)&\equiv 1\cdot p^{2-1}=p\pmod{5}.
\end{align*}
Substituting the Eisenstein congruence into the characteristic polynomial gives
\begin{align*}
X^2-\operatorname{tr}\bar{\rho}_{f,5}(\operatorname{Frob}_p)X
+\det\bar{\rho}_{f,5}(\operatorname{Frob}_p)
&\equiv X^2-a_p(f)X+p\pmod{5}\\
&\equiv X^2-(1+p)X+p\pmod{5}\\
&=X^2-X-pX+p\pmod{5}\\
&=(X-1)(X-p)\pmod{5}.
\end{align*}
For the reducible representation $1\oplus\bar{\chi}_5$, the mod $5$ cyclotomic character satisfies
\begin{align*}
\bar{\chi}_5(\operatorname{Frob}_p)\equiv p\pmod{5},
\end{align*}
so its Frobenius eigenvalues are $1$ and $p$, and its characteristic polynomial is
\begin{align*}
(X-1)(X-p)=X^2-X-pX+p=X^2-(1+p)X+p.
\end{align*}
Thus $\bar{\rho}_{f,5}$ and $1\oplus\bar{\chi}_5$ have the same characteristic polynomial at every $\operatorname{Frob}_p$ with $p\ne 5,11$. By Chebotarev density and the Brauer-Nesbitt theorem,
\begin{align*}
\bar{\rho}_{f,5}^{\mathrm{ss}}\cong 1\oplus\bar{\chi}_5.
\end{align*}
The level $11$ Eisenstein congruence therefore says that the cuspidal residual representation becomes reducible after semisimplification.
[/example]
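The congruence modulo $5$ can be checked on explicit coefficients. The sketch below relies on a fact not stated in the example, namely that the weight $2$ newform of level $11$ has the product expansion $q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$; the function names and the cut-off $40$ are hypothetical choices made for illustration.

```python
def level11_coefficients(limit):
    """[a_1, ..., a_limit] of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11 n))^2."""
    coeffs = [0] * limit
    coeffs[0] = 1

    def multiply_by_one_minus_q_power(step):
        # In-place multiplication by (1 - q^step), truncated below q^limit.
        for m in range(limit - 1, step - 1, -1):
            coeffs[m] -= coeffs[m - step]

    for n in range(1, limit):
        for _ in range(2):
            multiply_by_one_minus_q_power(n)
            if 11 * n < limit:
                multiply_by_one_minus_q_power(11 * n)
    # The extra factor q shifts indices: a_(m + 1) = coeffs[m].
    return coeffs


def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


a = level11_coefficients(40)
for p in range(2, 41):
    if is_prime(p) and p not in (5, 11):
        # Eisenstein congruence: a_p is congruent to 1 + p modulo 5.
        print(p, a[p - 1], (a[p - 1] - (1 + p)) % 5 == 0)
```

Each printed line should end in `True`, matching the statement that the residual characteristic polynomial at $\operatorname{Frob}_p$ is $(X-1)(X-p)$ modulo $5$.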
The example shows a particular Eisenstein congruence, but the structural question is broader: for prime level, when can a cuspidal Hecke eigenform be congruent to an Eisenstein series? [Mazur's theorem](/theorems/985) gives the controlling criterion and turns the reducibility of residual representations into a precise statement about the Eisenstein ideal rather than a numerical accident in a few Fourier coefficients.
[quotetheorem:4776]
[citeproof:4776]
Mazur's theorem shows that Eisenstein congruences are controlled by arithmetic invariants rather than accidental coefficient coincidences. The congruence prime is tied to the numerator of $(p-1)/12$, so the phenomenon is governed by the geometry of $X_0(p)$ and its Jacobian.
[explanation: From Reducibility to Deformation Problems]
Once a residual representation $\bar{\rho}$ is fixed, a deformation problem asks for all lifts of $\bar{\rho}$ to local Artinian rings with prescribed ramification, determinant, and local behaviour. If $\bar{\rho}$ is absolutely irreducible, [Schur's lemma](/theorems/2414) shows that its only endomorphisms are scalars, and this is what makes the deformation functor well behaved under standard hypotheses. If $\bar{\rho}^{\mathrm{ss}}\cong 1\oplus \bar{\chi}_\ell$, the reducibility creates extra extension data, and Eisenstein ideals become a way to measure which reducible residual systems arise from cuspidal characteristic-zero forms.
This is the first appearance of the philosophy behind modularity lifting. Congruences place modular points inside deformation spaces; deformation rings organise all possible lifts; Hecke algebras cut out the modular lifts. Later chapters make this comparison precise through $R=\mathbb{T}$ theorems.
[/explanation]
This deformation viewpoint also explains why semisimplification is useful but lossy: it keeps the Jordan-Hoelder factors while discarding extension classes that later deformation rings may need to remember.
[remark: What Semisimplification Forgets]
The representation $1\oplus \bar{\chi}_\ell$ records only the two Jordan-Hoelder factors. A nonsplit extension
\begin{align*}
0\longrightarrow \bar{\chi}_\ell\longrightarrow \bar{\rho}\longrightarrow 1\longrightarrow 0
\end{align*}
can carry arithmetic information invisible in $\bar{\rho}^{\mathrm{ss}}$. This hidden extension data is exactly what deformation theory and Selmer groups are designed to retain.
[/remark]
Once representations are reduced modulo primes, semisimplification no longer captures all of the arithmetic information. The next chapter returns to elliptic curves, modular parametrizations, and L-functions, where these residual and extension-theoretic phenomena reappear in a more geometric setting.
# 9. Elliptic Curves, Modular Parametrizations, and L-functions
This chapter brings the Galois-representation viewpoint back to elliptic curves. The guiding question is how the modular form attached to an elliptic curve can be recognized through its $L$-function, and how the geometry of $X_0(N)$ produces the curve itself. The chapter also records the local invariants that determine the level: the conductor, the minimal discriminant, and the reduction type at each prime.
## Modularity as Equality of L-functions
What does it mean, in arithmetic terms, for an elliptic curve over $\mathbb{Q}$ to be modular? Chapters 6 and 7 attached Galois representations to eigenforms and saw that Hecke eigenvalues control Frobenius traces. For elliptic curves the same Frobenius traces arise from counting points over finite fields, so modularity becomes a comparison of Euler factors prime by prime.
[definition: Hasse-Weil L-function of an Elliptic Curve]
Let $E/\mathbb{Q}$ be an elliptic curve of conductor $N$. For each prime $p$, define $a_p(E)$ by
\begin{align*}
a_p(E) &= p + 1 - |E(\mathbb{F}_p)|
\end{align*}
when $E$ has good reduction at $p$. The Hasse-Weil $L$-function of $E$ is the function
\begin{align*}
L(E,-):\{s\in\mathbb C:\operatorname{Re}(s)>3/2\}\longrightarrow \mathbb C
\end{align*}
defined by the Euler product
\begin{align*}
L(E,s) &= \prod_p L_p(E,s)^{-1},
\end{align*}
where for $p \nmid N$,
\begin{align*}
L_p(E,s) &= 1 - a_p(E)p^{-s} + p^{1-2s}.
\end{align*}
[/definition]
At primes of bad reduction, the local factor is modified according to the reduction type. For multiplicative reduction the local factor is $1-a_p(E)p^{-s}$ with $a_p(E)=1$ in the split case and $a_p(E)=-1$ in the non-split case; for additive reduction the local factor is $1$, so one sets $a_p(E)=0$. This is precisely the information that a list of good-prime point counts does not record: two curves whose Frobenius traces agree away from finitely many primes still have bad Euler factors that must be checked locally.
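The case distinction for local factors can be recorded as a small lookup. The following is a minimal sketch, assuming the convention of the definition above, where $L_p(E,s)$ is a polynomial in $T=p^{-s}$ and $L(E,s)$ is the product of the inverses. The function name and the reduction-type labels are hypothetical.

```python
def local_euler_polynomial(p, reduction, a_p=None):
    """Return (c0, c1, c2) with L_p(E, s) = c0 + c1*T + c2*T**2, where T = p**(-s)."""
    if reduction == "good":
        if a_p is None:
            raise ValueError("a good prime needs the trace a_p from a point count")
        return (1, -a_p, p)      # 1 - a_p T + p T^2
    if reduction == "split":
        return (1, -1, 0)        # a_p = 1
    if reduction == "nonsplit":
        return (1, 1, 0)         # a_p = -1
    if reduction == "additive":
        return (1, 0, 0)         # trivial local factor
    raise ValueError(f"unknown reduction type: {reduction}")


def evaluate_local_factor(p, coefficients, s):
    """Evaluate the local polynomial at a real or complex number s."""
    t = p ** (-s)
    c0, c1, c2 = coefficients
    return c0 + c1 * t + c2 * t * t


# A good prime p = 5 with trace a_5 = 1, evaluated at s = 2.
print(evaluate_local_factor(5, local_euler_polynomial(5, "good", a_p=1), 2))
```

Only the good case needs a point count; the bad cases depend on the reduction type alone, which is why they have to be determined separately.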
To compare an elliptic curve with a modular form, the modular form needs its own analytic object built from the same Hecke data. In weight two, the Fourier coefficients form a Dirichlet series whose Euler factors mirror the Frobenius polynomials appearing for elliptic curves.
[definition: L-function of a Weight Two Newform]
Let $f \in S_2(\Gamma_0(N))$ be a normalized newform with Fourier expansion
\begin{align*}
f(q) &= \sum_{n=1}^{\infty} a_n(f)q^n, \qquad a_1(f)=1.
\end{align*}
The $L$-function of $f$ is the function
\begin{align*}
L(f,-):\{s\in\mathbb C:\operatorname{Re}(s)>3/2\}\longrightarrow \mathbb C
\end{align*}
defined by
\begin{align*}
L(f,s) &= \sum_{n=1}^{\infty} \frac{a_n(f)}{n^s}.
\end{align*}
[/definition]
The Euler product of $L(f,s)$ is governed by the Hecke eigenvalues. For $p \nmid N$ the local factor is
\begin{align*}
1-a_p(f)p^{-s}+p^{1-2s},
\end{align*}
and the factors at primes dividing $N$ are determined by the corresponding $U_p$-eigenvalues.
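Expanding the Euler product shows how the prime-index coefficients determine all the others: $a_{mn}=a_ma_n$ for coprime $m,n$, $a_{p^{r+1}}=a_pa_{p^r}-p\,a_{p^{r-1}}$ for $p\nmid N$, and $a_{p^r}=a_p^r$ for $p\mid N$. The sketch below applies these standard relations, which the text above implies but does not spell out. The input values are the first prime coefficients of the level $11$ newform, taken here as given, and the function names are hypothetical.

```python
def smallest_prime_factor(n):
    """Smallest prime dividing n >= 2, by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def coefficients_from_primes(a_prime, level, bound):
    """Return [0, a_1, ..., a_bound] from the prime-index coefficients a_prime[p]."""
    a = [0] * (bound + 1)
    a[1] = 1
    for n in range(2, bound + 1):
        p = smallest_prime_factor(n)
        r, m = 0, n
        while m % p == 0:          # write n = p^r * m with p not dividing m
            m //= p
            r += 1
        if m > 1:
            a[n] = a[p**r] * a[m]  # multiplicativity on coprime parts
        elif r == 1:
            a[n] = a_prime[p]
        elif level % p == 0:
            a[n] = a_prime[p] * a[p**(r - 1)]                        # p divides the level
        else:
            a[n] = a_prime[p] * a[p**(r - 1)] - p * a[p**(r - 2)]    # good prime
    return a


# Prime coefficients of the level 11 newform, treated as input data.
level_11_primes = {2: -2, 3: -1, 5: 1, 7: -2}
a = coefficients_from_primes(level_11_primes, level=11, bound=10)
print(a[1:])
```

With these inputs the recursion reproduces, for instance, $a_4=a_2^2-2=2$ and $a_6=a_2a_3=2$.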
The comparison problem is now well-posed: an elliptic curve and a weight two newform can be called the same from the modularity point of view when all of their Euler factors, including the bad ones, agree. This is packaged as equality of their $L$-functions.
[definition: Modular Elliptic Curve]
An elliptic curve $E/\mathbb{Q}$ of conductor $N$ is modular if there exists a normalized newform $f \in S_2(\Gamma_0(N))$ with rational Fourier coefficients such that
\begin{align*}
L(E,s) &= L(f,s).
\end{align*}
[/definition]
This definition packages infinitely many point-counting statements into one analytic identity. The equality is equivalent to equality of the Euler factors at every prime, so at the good primes it can be tested by comparing point counts with Hecke eigenvalues.
In practice, the usable criterion starts with the good primes: for each $p\nmid N$, the point count of the curve gives $a_p(E)$, while the newform gives $a_p(f)$. The remaining issue is to state precisely when agreement of these local data is enough to identify the modular form attached to the curve.
[quotetheorem:4777]
[citeproof:4777]
The hypothesis at all good primes is stronger than what computations can literally verify, but it states the exact mathematical comparison: every unramified Frobenius characteristic polynomial must match. Agreement for all but finitely many good primes is enough to identify the semisimplified compatible Galois representations by Chebotarev density, but it does not by itself determine the finitely many exceptional Euler factors. The rational-coefficient hypothesis matters because it makes the $\ell$-adic representation attached to the newform two-dimensional over $\mathbb{Q}_\ell$, so it can match the Tate module of a single elliptic curve rather than that of a higher-dimensional abelian variety. Computationally, checking several small primes proposes a candidate; the theorem explains what must then be proved or verified locally.
[example: Point Count for the Curve 11a1]
Consider the elliptic curve
\begin{align*}
E: y^2+y=x^3-x^2-10x-20,
\end{align*}
the curve labelled $11a1$. Since $5\nmid 11$, the prime $5$ is a good prime for this curve, and reducing the equation modulo $5$ gives
\begin{align*}
y^2+y&=x^3-x^2.
\end{align*}
We count the affine solutions in $\mathbb{F}_5$. For the left-hand side,
\begin{align*}
0^2+0&=0,\\
1^2+1&=2,\\
2^2+2&=6\equiv 1 \pmod 5,\\
3^2+3&=12\equiv 2 \pmod 5,\\
4^2+4&=20\equiv 0 \pmod 5.
\end{align*}
Thus $y^2+y$ takes the value $0$ for $y=0,4$, the value $1$ for $y=2$, and the value $2$ for $y=1,3$. For the right-hand side,
\begin{align*}
0^3-0^2&=0,\\
1^3-1^2&=0,\\
2^3-2^2&=8-4=4,\\
3^3-3^2&=27-9=18\equiv 3 \pmod 5,\\
4^3-4^2&=64-16=48\equiv 3 \pmod 5.
\end{align*}
The matching pairs are therefore
\begin{align*}
(x,y)&=(0,0),(0,4),(1,0),(1,4),
\end{align*}
so there are $4$ affine points over $\mathbb{F}_5$. Adding the point at infinity gives
\begin{align*}
|E(\mathbb{F}_5)|&=4+1=5.
\end{align*}
Hence
\begin{align*}
a_5(E)&=5+1-|E(\mathbb{F}_5)|\\
&=5+1-5\\
&=1.
\end{align*}
The associated newform of level $11$ has matching $q^5$-coefficient. This single count does not prove modularity, but it rules out any candidate newform whose fifth Hecke eigenvalue is not $1$ and illustrates how good-prime traces become a finite-dimensional search inside $S_2(\Gamma_0(11))$.
[/example]
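The hand count in this example is small enough to confirm by brute force. The sketch below enumerates the affine solutions of the unreduced equation modulo $5$ and recomputes $a_5(E)$. It assumes nothing beyond the equation of $11a1$ stated in the example, and the function name is hypothetical.

```python
def affine_points(p):
    """All (x, y) in F_p x F_p on y^2 + y = x^3 - x^2 - 10x - 20."""
    return [(x, y)
            for x in range(p)
            for y in range(p)
            if (y * y + y - (x**3 - x**2 - 10 * x - 20)) % p == 0]


points = affine_points(5)
num_points = len(points) + 1     # add the point at infinity
a_5 = 5 + 1 - num_points

print(points)
print(num_points, a_5)
```

The same function works for any good prime, at a cost of $p^2$ evaluations, which is adequate for the small primes used in these notes.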
## Modular Parametrizations and Optimal Quotients
If modularity is equality of $L$-functions, where is the elliptic curve geometrically visible? A direct map out of $X_0(N)$ is too rigid to carry the full linear algebra of Hecke eigenspaces. The Jacobian $J_0(N)$ is needed because degree-zero divisor classes on $X_0(N)$ form an abelian variety on which the Hecke algebra acts, and quotients of this abelian variety isolate the eigensystem attached to a newform. Thus a modular elliptic curve occurs as a quotient of $J_0(N)$, and the corresponding map from $X_0(N)$ to $E$ is obtained by composing the Abel-Jacobi map with this quotient.
[illustration:modular-parametrization-factorization]
[definition: Modular Parametrization]
Let $E/\mathbb{Q}$ be an elliptic curve of conductor $N$. A modular parametrization of $E$ of level $N$ is a non-constant morphism of curves over $\mathbb{Q}$
\begin{align*}
\varphi: X_0(N) \longrightarrow E.
\end{align*}
[/definition]
Such a map sends the geometry of cyclic $N$-isogenies to points on $E$. We need the Jacobian viewpoint because, after choosing a rational cusp as base point, the map factors through $J_0(N)$ and the construction is controlled by abelian varieties rather than by the curve alone.
[definition: Optimal Quotient of the Modular Jacobian]
Let $A$ be an abelian variety over $\mathbb{Q}$ and let $\pi:J_0(N)\to A$ be a surjective homomorphism over $\mathbb{Q}$ with connected kernel. Then $A$ is an optimal quotient of $J_0(N)$.
[/definition]
Connectedness of the kernel removes finite isogeny ambiguity. We need this condition to identify the distinguished representative through which all other elliptic quotients in the same isogeny class are reached by isogeny.
[quotetheorem:4778]
[citeproof:4778]
The theorem is the geometric bridge from eigenforms to elliptic curves, but its hypotheses are doing real work. Weight $2$ is the weight whose modular forms are holomorphic differentials on $X_0(N)$, so they live naturally in the cotangent space of $J_0(N)$. Rational Fourier coefficients force the newform quotient to have dimension $1$; if the coefficient field is larger, the same construction gives a higher-dimensional abelian variety rather than an elliptic curve. The output is also an isogeny class or optimal quotient, not a canonical Weierstrass equation, so additional choices are needed before one writes down a specific model.
[definition: Newform Quotient]
Let $f \in S_2(\Gamma_0(N))$ be a normalized newform. The newform quotient attached to $f$ is the abelian variety
\begin{align*}
A_f &= J_0(N) / I_fJ_0(N),
\end{align*}
where $I_f$ is the annihilator ideal of $f$ in the Hecke algebra acting on $S_2(\Gamma_0(N))$.
[/definition]
For rational $f$, this quotient is an elliptic curve up to isogeny. For non-rational coefficient fields, $A_f$ has higher dimension, and the dimension is the degree of the coefficient field generated by the Hecke eigenvalues.
[example: The Modular Curve X0(11)]
Let $f$ be the normalized generator of the one-dimensional space $S_2(\Gamma_0(11))$. Since $\dim S_2(\Gamma_0(11))=1$, every Hecke operator preserves the line spanned by $f$, so $f$ is automatically a simultaneous Hecke eigenform. Its eigenvalues are rational because each Hecke operator acts on this one-dimensional $\mathbb{Q}$-rational space by multiplication by a scalar.
The genus of $X_0(11)$ is $1$, and the dimension of the Jacobian of a smooth projective curve equals its genus. Hence
\begin{align*}
\dim J_0(11)&=g(X_0(11))\\
&=1.
\end{align*}
Therefore $J_0(11)$ is an elliptic curve. The newform quotient attached to $f$ is
\begin{align*}
A_f&=J_0(11)/I_fJ_0(11).
\end{align*}
Because $S_2(\Gamma_0(11))$ has only the eigensystem of $f$, this quotient is nonzero; because $J_0(11)$ already has dimension $1$, the quotient $A_f$ also has dimension $1$. Thus the map
\begin{align*}
J_0(11)\longrightarrow A_f
\end{align*}
is a surjective homomorphism between elliptic curves, so its kernel is finite and the map is an isogeny.
The rational newform quotient is the elliptic curve in the isogeny class labelled $11a$, so it is isogenous to the curve $11a1$. Choosing the cusp $\infty$ as base point gives the Abel-Jacobi map
\begin{align*}
X_0(11)&\longrightarrow J_0(11),\\
P&\longmapsto [P]-[\infty].
\end{align*}
Composing with the quotient map gives
\begin{align*}
X_0(11)\longrightarrow J_0(11)\longrightarrow A_f\sim 11a1.
\end{align*}
In this level, the modular parametrization is concrete because the whole Jacobian is already elliptic: no higher-dimensional piece has to be separated before reaching the curve whose Frobenius traces agree with the coefficients of $f$.
[/example]
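The genus used in this example can be recomputed from the standard genus formula $g=1+\mu/12-\nu_2/4-\nu_3/3-\nu_\infty/2$. The sketch below restricts to prime level $p$, where $\mu=p+1$, $\nu_\infty=2$, and $\nu_2$, $\nu_3$ count the roots of $x^2+1$ and $x^2+x+1$ modulo $p$. This formula is a standard fact that is assumed here and not derived in this section, and the function name is hypothetical.

```python
from fractions import Fraction


def genus_x0_prime(p):
    """Genus of X_0(p) for a prime p, via the standard genus formula."""
    mu = p + 1                                                   # index of Gamma_0(p)
    nu2 = sum(1 for x in range(p) if (x * x + 1) % p == 0)       # elliptic points of order 2
    nu3 = sum(1 for x in range(p) if (x * x + x + 1) % p == 0)   # elliptic points of order 3
    nu_inf = 2                                                   # cusps 0 and infinity
    g = 1 + Fraction(mu, 12) - Fraction(nu2, 4) - Fraction(nu3, 3) - Fraction(nu_inf, 2)
    if g.denominator != 1:
        raise ArithmeticError("genus formula did not return an integer")
    return int(g)


for p in (11, 13, 23, 37):
    print(p, genus_x0_prime(p))
```

For $p=11$ both root counts vanish, so $g=1+1-1=1$, in agreement with $\dim S_2(\Gamma_0(11))=1$.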
## Conductors, Discriminants, and Reduction Types
Why should the level of the modular form be the conductor of the elliptic curve? The level records ramification in the Galois representation attached to the newform, while the conductor of $E$ records the primes and depths of bad reduction. Matching the two is the local part of modularity.
[definition: Minimal Discriminant]
Let $E/\mathbb{Q}$ be an elliptic curve. A minimal integral Weierstrass equation for $E$ has discriminant $\Delta_E\in \mathbb{Z}$ with minimal $p$-adic valuation at every prime $p$ among integral Weierstrass equations for $E$. The integer $\Delta_E$ is the minimal discriminant of $E$.
[/definition]
The minimal discriminant measures singularity in reductions of the chosen integral model. We need a separate classification of the bad fibre because a prime $p$ divides $\Delta_E$ exactly when the reduced cubic is singular modulo $p$, but this does not by itself give the conductor exponent.
[definition: Reduction Type of an Elliptic Curve]
Let $E/\mathbb{Q}$ be an elliptic curve and let $p$ be a prime. The reduction type of $E$ at $p$ is good, split multiplicative, non-split multiplicative, or additive according to the singularity type of a minimal Weierstrass model over $\mathbb{Z}_p$.
[/definition]
Good reduction contributes no conductor exponent. We need a single global invariant that records these local contributions: multiplicative reduction contributes exponent $1$, while additive reduction contributes a larger exponent depending on the wild part of the local Galois action, especially at $p=2$ and $p=3$.
[definition: Conductor of an Elliptic Curve]
Let $E/\mathbb{Q}$ be an elliptic curve. The conductor of $E$ is the positive integer
\begin{align*}
N_E &= \prod_p p^{f_p(E)},
\end{align*}
where $f_p(E)$ is the local conductor exponent of the $\ell$-adic Tate module representation $T_\ell(E)\otimes_{\mathbb{Z}_\ell}\mathbb{Q}_\ell$ for any prime $\ell\ne p$.
[/definition]
The definition is independent of the auxiliary prime $\ell$. It is designed so that $N_E$ is exactly the Artin conductor of the two-dimensional Galois representation carried by the Tate module.
[quotetheorem:4779]
[citeproof:4779]
This local relation separates three different pieces of information: $\Delta_E$ detects which primes are bad, the reduction type distinguishes multiplicative from additive behaviour, and the conductor exponent measures the ramification depth of the Tate module. For $p\ge 5$ the displayed cases are clean, but at $p=2$ and $p=3$ wild ramification can make additive conductor exponents larger than $2$, so the valuation of the discriminant alone is not a conductor formula. The curve $11a1$ below is a boundary example in the other direction: $v_{11}(\Delta_E)=5$ while the conductor exponent is only $1$ because the reduction is multiplicative.
[example: Local Invariants of 11a1]
For
\begin{align*}
E: y^2+y=x^3-x^2-10x-20,
\end{align*}
write the equation in the form
\begin{align*}
y^2+a_1xy+a_3y=x^3+a_2x^2+a_4x+a_6.
\end{align*}
Here
\begin{align*}
a_1&=0,& a_2&=-1,& a_3&=1,& a_4&=-10,& a_6&=-20.
\end{align*}
The standard Weierstrass invariants are
\begin{align*}
b_2&=a_1^2+4a_2=0^2+4(-1)=-4,\\
b_4&=2a_4+a_1a_3=2(-10)+0\cdot 1=-20,\\
b_6&=a_3^2+4a_6=1^2+4(-20)=1-80=-79,\\
b_8&=a_1^2a_6+4a_2a_6-a_1a_3a_4+a_2a_3^2-a_4^2\\
&=0^2(-20)+4(-1)(-20)-0\cdot 1\cdot(-10)+(-1)(1^2)-(-10)^2\\
&=80-1-100\\
&=-21.
\end{align*}
Therefore the discriminant is
\begin{align*}
\Delta_E
&=-b_2^2b_8-8b_4^3-27b_6^2+9b_2b_4b_6\\
&=-(-4)^2(-21)-8(-20)^3-27(-79)^2+9(-4)(-20)(-79)\\
&=-16(-21)-8(-8000)-27(6241)+9(80)(-79)\\
&=336+64000-168507-56880\\
&=-161051\\
&=-11^5.
\end{align*}
Thus the only prime dividing the discriminant is $11$, so the curve has good reduction at every prime $p\ne 11$.
To determine the reduction type at $11$, compute
\begin{align*}
c_4&=b_2^2-24b_4\\
&=(-4)^2-24(-20)\\
&=16+480\\
&=496.
\end{align*}
Since $496=11\cdot 45+1$, we have $11\nmid c_4$. A bad prime $p$ with $p\mid \Delta_E$ and $p\nmid c_4$ gives multiplicative reduction, so $E$ has multiplicative reduction at $11$. Hence the local conductor exponent at $11$ is $1$, and
\begin{align*}
N_E&=11^1\\
&=11.
\end{align*}
The level of the associated newform is therefore $11$: the conductor records the multiplicative reduction exponent, not the full discriminant valuation $v_{11}(\Delta_E)=5$.
[/example]
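The arithmetic in this example follows fixed formulas, so it can be checked mechanically. The sketch below computes $b_2,b_4,b_6,b_8,\Delta,c_4$ from the Weierstrass coefficients and applies the test used above: $p\nmid\Delta$ is good, $p\mid\Delta$ with $p\nmid c_4$ is multiplicative, and otherwise the reduction is additive. That test is only valid for a model that is minimal at $p$, which is assumed here. The function names are hypothetical.

```python
def weierstrass_invariants(a1, a2, a3, a4, a6):
    """Return the standard invariants of y^2 + a1 xy + a3 y = x^3 + a2 x^2 + a4 x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    disc = -b2 * b2 * b8 - 8 * b4**3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    c4 = b2 * b2 - 24 * b4
    return {"b2": b2, "b4": b4, "b6": b6, "b8": b8, "disc": disc, "c4": c4}


def reduction_type(p, disc, c4):
    """Reduction type at p, assuming the model is minimal at p."""
    if disc % p != 0:
        return "good"
    if c4 % p != 0:
        return "multiplicative"
    return "additive"


inv = weierstrass_invariants(0, -1, 1, -10, -20)   # the curve 11a1
print(inv)
print(reduction_type(11, inv["disc"], inv["c4"]))
```

The code does not attempt the additive conductor exponent, which needs Tate's algorithm and is outside this sketch.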
## From Local Data to the Modular Form
How do these ingredients fit together in computations? Given an elliptic curve $E/\mathbb{Q}$, the minimal discriminant first identifies the bad primes, the conductor determines the candidate level, and point counts at good primes determine the expected Hecke eigenvalues.
[quotetheorem:4781]
[citeproof:4781]
This theorem is the culmination of the elliptic-curve side of the course. The statement does not produce a preferred equation for $E$, nor does it say that a few point counts certify the identity of $L$-functions. What it does say is that the conductor is exactly the required level and that the analytic, geometric, and Galois-theoretic constructions are guaranteed to meet at a newform of that level. This is why the computational workflow below starts with local conductor data before comparing Hecke eigenvalues.
[example: Matching a Curve to a Newform]
Let $E/\mathbb{Q}$ have conductor $37$, and let $p$ be a prime with $p\ne 37$. Since $p\nmid 37$, the prime $p$ is a good prime for $E$, so its good Euler factor is determined by
\begin{align*}
a_p(E)&=p+1-|E(\mathbb{F}_p)|.
\end{align*}
Thus, if the point counts at $p=2,3,5$ are denoted
\begin{align*}
|E(\mathbb{F}_2)|&=m_2,&
|E(\mathbb{F}_3)|&=m_3,&
|E(\mathbb{F}_5)|&=m_5,
\end{align*}
then the first traces obtained from the curve are
\begin{align*}
a_2(E)&=2+1-m_2=3-m_2,\\
a_3(E)&=3+1-m_3=4-m_3,\\
a_5(E)&=5+1-m_5=6-m_5.
\end{align*}
Now list the normalized newforms $f\in S_2(\Gamma_0(37))$ and compare these integers with their Hecke eigenvalues $a_2(f),a_3(f),a_5(f)$. A candidate newform must satisfy
\begin{align*}
a_2(f)&=3-m_2,\\
a_3(f)&=4-m_3,\\
a_5(f)&=6-m_5.
\end{align*}
For any good prime $p$ where this equality holds, the Euler factors agree because
\begin{align*}
L_p(E,s)&=1-a_p(E)p^{-s}+p^{1-2s}\\
&=1-a_p(f)p^{-s}+p^{1-2s}\\
&=L_p(f,s).
\end{align*}
Equivalently, the Frobenius characteristic polynomials agree:
\begin{align*}
1-a_p(E)T+pT^2
&=1-a_p(f)T+pT^2.
\end{align*}
Matching several small good primes therefore narrows the search to the newform whose Hecke eigenvalues have the same Frobenius traces as $E$; proving equality at all good primes and checking the bad local factor at $37$ is exactly the comparison required by *Modularity Criterion by Frobenius Traces*.
[/example]
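The filtering step in this example can be made concrete. The sketch below rests on two assumptions that are not part of the example. First, it takes the sample curve $y^2+y=x^3-x$, whose discriminant is $37$, as the curve of conductor $37$. Second, it uses illustrative eigenvalue lists for the two rational newforms of level $37$, quoted from standard tables and to be verified before relying on them. The labels `newform A` and `newform B` are hypothetical.

```python
def trace_of_frobenius(p):
    """a_p for the sample curve y^2 + y = x^3 - x (assumed conductor 37)."""
    affine = sum(1
                 for x in range(p)
                 for y in range(p)
                 if (y * y + y - (x**3 - x)) % p == 0)
    return p + 1 - (affine + 1)   # include the point at infinity


# Assumed Hecke eigenvalues a_2, a_3, a_5 of the candidate newforms (illustrative data).
candidates = {
    "newform A": {2: -2, 3: -3, 5: -2},
    "newform B": {2: 0, 3: 1, 5: 0},
}

curve_traces = {p: trace_of_frobenius(p) for p in (2, 3, 5)}
matches = [name
           for name, eigenvalues in candidates.items()
           if all(eigenvalues[p] == curve_traces[p] for p in curve_traces)]

print(curve_traces)
print(matches)
```

A surviving candidate is only a proposal: agreement at three primes narrows the search, and the remaining good primes and the factor at $37$ still have to be handled as the example describes.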
The chapter leaves us with three equivalent perspectives on the same phenomenon. Analytically, modularity is the equality $L(E,s)=L(f,s)$. Geometrically, it is the existence of a modular parametrization $X_0(N)\to E$ or an elliptic quotient of $J_0(N)$. Galois-theoretically, it is the identification of the Tate module representation of $E$ with the representation attached to the newform $f$.
We now have both sides of the dictionary: modular forms describe the analytic data, and elliptic curves supply the Galois representations and modular parametrizations. The culminating theorem of the course identifies these descriptions in full generality, showing that modularity is exactly the compatibility between them.
# 10. The Modularity Theorem
The course has built a dictionary between modular forms and two-dimensional Galois representations: modular curves provide the geometry, Hecke operators provide the arithmetic correspondences, and eigenforms provide compatible systems of representations. This chapter records the culminating theorem in the direction relevant to elliptic curves over $\mathbb{Q}$: every such curve is detected by a modular form. The emphasis is not on reproducing the proof of Wiles, but on isolating the objects that enter the proof and explaining why deformation rings and Hecke algebras are the right comparison.
## The Modularity Statement for Elliptic Curves
The guiding question is: when does the $\ell$-adic Galois representation on the Tate module of an elliptic curve come from a modular eigenform? For an elliptic curve $E/\mathbb{Q}$ and a prime $\ell$, the representation
\begin{align*}
\rho_{E,\ell}:G_{\mathbb{Q}}\longrightarrow GL_2(\mathbb{Z}_\ell)
\end{align*}
encodes the action of $G_{\mathbb{Q}}$ on $T_\ell E$. Modularity predicts that the Frobenius traces of this representation are the Hecke eigenvalues of a weight $2$ newform.
[definition: Modular Elliptic Curve]
An elliptic curve $E/\mathbb{Q}$ is modular if there exist an integer $N \ge 1$ and a nonconstant morphism of curves over $\mathbb{Q}$
\begin{align*}
X_0(N) \longrightarrow E.
\end{align*}
[/definition]
This geometric formulation is equivalent to an automorphic formulation in terms of the $L$-function of $E$. The theorem below asserts more: the integer $N$ may be taken to be the conductor of $E$, with an associated newform of weight $2$ and level $N$.
[quotetheorem:4781]
[citeproof:4781]
This theorem is the global bridge from elliptic curves to modular forms. In the semistable case it is the Wiles-Taylor modularity theorem, and in full generality it uses the later work of Breuil, Conrad, Diamond, and Taylor.
In Galois-representation language, the same theorem says that for each prime $\ell$, the representation $V_\ell E=T_\ell E\otimes_{\mathbb{Z}_\ell}\mathbb{Q}_\ell$ is isomorphic, after choosing an embedding of coefficient fields into $\mathbb{Q}_\ell$, to the $\ell$-adic representation attached to $f$. The equality of $L$-functions is then a consequence of matching characteristic polynomials of Frobenius at almost all primes.
[example: The Curve of Conductor Eleven]
For
\begin{align*}
E: y^2+y=x^3-x^2-10x-20,
\end{align*}
the Weierstrass coefficients are $a_1=0$, $a_2=-1$, $a_3=1$, $a_4=-10$, and $a_6=-20$. Hence
\begin{align*}
b_2&=a_1^2+4a_2=-4,\\
b_4&=2a_4+a_1a_3=-20,\\
b_6&=a_3^2+4a_6=1-80=-79,\\
b_8&=a_1^2a_6+4a_2a_6-a_1a_3a_4+a_2a_3^2-a_4^2\\
&=0+80-0-1-100=-21.
\end{align*}
Therefore the discriminant of this integral model is
\begin{align*}
\Delta
&=-b_2^2b_8-8b_4^3-27b_6^2+9b_2b_4b_6\\
&=-(-4)^2(-21)-8(-20)^3-27(-79)^2+9(-4)(-20)(-79)\\
&=336+64000-168507-56880\\
&=-161051\\
&=-11^5.
\end{align*}
Tate's algorithm for this minimal Weierstrass equation gives conductor $11$.
The modular curve $X_0(11)$ has genus $1$. Choosing the cusp at infinity as the rational base point makes $X_0(11)$ into an elliptic curve, and the standard modular parametrization identifies it with an elliptic curve in the isogeny class of $E$. On the modular-form side, $S_2(\Gamma_0(11))$ is one-dimensional, so its normalized eigenform is unique; it begins
\begin{align*}
f(q)=q-2q^2-q^3+2q^4+q^5+2q^6-2q^7+\cdots.
\end{align*}
We can see the coefficient matching in small primes by counting points. For $p=2$, the equation becomes
\begin{align*}
y^2+y=x^3+x^2
\end{align*}
over $\mathbb{F}_2$. The left side is $0$ for both $y=0$ and $y=1$, while $x^3+x^2=0$ for both $x=0$ and $x=1$. Thus there are $4$ affine points and one point at infinity, so
\begin{align*}
|E(\mathbb{F}_2)|=5,
\qquad
a_2(E)=2+1-5=-2.
\end{align*}
For $p=3$, the right side is
\begin{align*}
x^3-x^2-10x-20\equiv x^3+2x^2+2x+1 \pmod 3.
\end{align*}
The values of $y^2+y$ for $y=0,1,2$ are $0,2,0$, and the right side at $x=0,1,2$ is $1,0,0$. Hence there are $4$ affine points and one point at infinity, giving
\begin{align*}
|E(\mathbb{F}_3)|=5,
\qquad
a_3(E)=3+1-5=-1.
\end{align*}
For $p=5$, the equation becomes
\begin{align*}
y^2+y=x^3-x^2.
\end{align*}
The values of $y^2+y$ for $y=0,1,2,3,4$ are $0,2,1,2,0$, and the values of $x^3-x^2$ for $x=0,1,2,3,4$ are $0,0,4,3,3$. Thus only $x=0$ and $x=1$ contribute solutions, with two choices of $y$ each, so
\begin{align*}
|E(\mathbb{F}_5)|=5,
\qquad
a_5(E)=5+1-5=1.
\end{align*}
These are exactly the displayed coefficients $a_2(f)=-2$, $a_3(f)=-1$, and $a_5(f)=1$; in general the modular parametrization gives $a_p(f)=p+1-|E(\mathbb{F}_p)|$ for every prime $p\ne 11$.
[/example]
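The displayed $q$-expansion can also be reproduced independently of the curve. The sketch below assumes the standard identity $f(q)=q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$ for the level $11$ newform, a classical fact that this example does not state. It expands the product to a fixed precision and compares the result with the traces obtained from the three point counts above. The function name is hypothetical.

```python
def level_11_eta_product(bound):
    """Coefficients [a_0, ..., a_bound] of q * prod (1 - q^n)^2 (1 - q^(11n))^2."""
    series = [0] * (bound + 1)
    series[1] = 1  # the leading factor q (requires bound >= 1)

    def multiply_by_one_minus_q_power(k):
        # In-place multiplication by (1 - q^k), truncated at q^bound.
        # Going from high to low degree keeps series[i - k] at its old value.
        for i in range(bound, k - 1, -1):
            series[i] -= series[i - k]

    for n in range(1, bound + 1):
        for k in (n, 11 * n):
            if k <= bound:
                multiply_by_one_minus_q_power(k)
                multiply_by_one_minus_q_power(k)  # each factor appears squared
    return series


coefficients = level_11_eta_product(12)

# Point counts from the example: |E(F_p)| = 5 for p = 2, 3, 5.
point_counts = {2: 5, 3: 5, 5: 5}
traces_from_counts = {p: p + 1 - m for p, m in point_counts.items()}

print(coefficients[1:8])
print(all(coefficients[p] == traces_from_counts[p] for p in point_counts))
```

Raising `bound` extends the comparison to further primes, provided the corresponding point counts are computed separately.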
The theorem is often called the Taniyama-Shimura-Weil modularity theorem. Historically, the semistable case was proved by Wiles and Taylor-Wiles, and the remaining cases were completed later by Breuil, Conrad, Diamond, and Taylor. Chapter 11 uses precisely the semistable case, together with Ribet level lowering, to prove Fermat's Last Theorem.
## Semistable Curves and the Wiles-Taylor Strategy
The next question is why the semistable case is the first place where the proof becomes accessible. Semistability restricts the possible bad reduction of the elliptic curve, which in turn gives manageable local deformation conditions for the Galois representation.
[definition: Semistable Elliptic Curve]
An elliptic curve $E/\mathbb{Q}$ is semistable if, for every prime $p$, the reduction of $E$ at $p$ is either good or multiplicative.
[/definition]
Semistability excludes additive reduction. This matters because the local Galois representation at a bad prime then has a relatively simple shape, and the deformation problem can impose explicit local conditions without losing control of the deformation ring.
[example: Semistable Reduction and Local Shape]
Let $E/\mathbb{Q}$ have multiplicative reduction at $p$, and fix a prime $\ell\ne p$. The local Tate-curve description of multiplicative reduction says that, after choosing a suitable basis of $V_\ell E$, there is an unramified character $\varepsilon:G_{\mathbb{Q}_p}\to\{\pm 1\}$ such that every $\sigma\in G_{\mathbb{Q}_p}$ acts by a matrix of the form
\begin{align*}
\rho_{E,\ell}(\sigma)
=
\begin{pmatrix}
\varepsilon(\sigma)\chi_\ell(\sigma) & c(\sigma)\\
0 & \varepsilon(\sigma)
\end{pmatrix},
\end{align*}
where $\chi_\ell$ is the $\ell$-adic cyclotomic character and $c(\sigma)$ records the extension class. The character $\varepsilon$ distinguishes split from nonsplit multiplicative reduction; because it is unramified, its value on every inertia element is $1$.
Now let $\tau\in I_p$. Since $\ell\ne p$, inertia acts as the identity on all $\ell$-power roots of unity, so $\chi_\ell(\tau)=1$. Substituting $\varepsilon(\tau)=1$ and $\chi_\ell(\tau)=1$ into the displayed matrix gives
\begin{align*}
\rho_{E,\ell}(\tau)
=
\begin{pmatrix}
1\cdot 1 & c(\tau)\\
0 & 1
\end{pmatrix}
=
\begin{pmatrix}
1 & c(\tau)\\
0 & 1
\end{pmatrix}.
\end{align*}
Therefore
\begin{align*}
\rho_{E,\ell}(\tau)-I
&=
\begin{pmatrix}
1 & c(\tau)\\
0 & 1
\end{pmatrix}
-
\begin{pmatrix}
1 & 0\\
0 & 1
\end{pmatrix} \\
&=
\begin{pmatrix}
0 & c(\tau)\\
0 & 0
\end{pmatrix},
\end{align*}
and
\begin{align*}
(\rho_{E,\ell}(\tau)-I)^2
&=
\begin{pmatrix}
0 & c(\tau)\\
0 & 0
\end{pmatrix}
\begin{pmatrix}
0 & c(\tau)\\
0 & 0
\end{pmatrix} \\
&=
\begin{pmatrix}
0\cdot 0+c(\tau)\cdot 0 & 0\cdot c(\tau)+c(\tau)\cdot 0\\
0\cdot 0+0\cdot 0 & 0\cdot c(\tau)+0\cdot 0
\end{pmatrix} \\
&=
\begin{pmatrix}
0 & 0\\
0 & 0
\end{pmatrix}.
\end{align*}
Thus each inertia element acts unipotently on $V_\ell E$. This upper-triangular unipotent inertia action is the local Galois-theoretic signature of multiplicative reduction, and it is the shape imposed at $p$ in the semistable deformation condition.
[/example]
The proof strategy begins with a residual representation
\begin{align*}
\bar{\rho}:G_{\mathbb{Q}}\longrightarrow GL_2(k),
\end{align*}
where $k$ is a finite field of characteristic $\ell$. If $\bar{\rho}$ is known to be modular, the goal is to prove that certain $\ell$-adic lifts of $\bar{\rho}$ are also modular.
[definition: Residual Modularity]
A continuous representation $\bar{\rho}:G_{\mathbb{Q}}\to GL_2(k)$ is residually modular if there exists a normalized cuspidal Hecke eigenform $f$ and a prime $\lambda$ of its coefficient field such that the semisimplification of $\bar{\rho}$ is isomorphic to the residual representation attached to $f$ modulo $\lambda$.
[/definition]
The bridge from residual modularity to modularity of a lift is a modularity lifting theorem. Such a theorem is not merely a statement about representations: it compares a universal deformation object with a Hecke-theoretic object.
The statement below uses standard deformation-theory shorthand, so we fix the vocabulary before the theorem card. A deformation problem for $\bar{\rho}$ means a rule specifying which lifts of $\bar{\rho}$ are allowed, including a fixed determinant and local conditions at the finitely many primes where ramification or $\ell$-adic behaviour matters. At $p\ne\ell$, a minimal condition keeps the same ramification as $\bar{\rho}$, while a semistable condition permits the controlled unipotent inertia that comes from multiplicative reduction. At $\ell$, an ordinary condition means that the local representation has a $G_{\mathbb Q_\ell}$-stable line with prescribed character, while a finite-flat condition means that the representation comes from a finite flat group scheme over $\mathbb Z_\ell$. The adjoint trace-zero representation $\operatorname{ad}^0(\bar{\rho})$ is the Galois module of trace-zero endomorphisms of the underlying two-dimensional space; its Selmer group measures allowed infinitesimal deformations, and the dual Selmer group measures the corresponding obstructions. The Taylor-Wiles numerical criterion is the dimension equality that makes the patching argument identify the deformation ring with a Hecke algebra.
[quotetheorem:4782]
[citeproof:4782]
[explanation: Why the Hypotheses Are Needed]
Oddness is not a cosmetic condition: even two-dimensional representations over $\mathbb{Q}$ have the wrong sign at complex conjugation and do not correspond to holomorphic weight $2$ modular forms. Absolute irreducibility of $\bar{\rho}$ is also essential, because reducible residual representations can have deformation functors with extra components that are not detected by cuspidal Hecke algebras.
The local conditions determine which automorphic forms are being compared with the deformation ring. If the condition at $\ell$ is neither ordinary nor finite flat in the required sense, the local deformation ring can have components not represented by the modular forms under consideration. If ramification away from $\ell$ is allowed without a matching level condition, the deformation ring may classify Galois representations whose conductors are larger than the Hecke algebra can see. The theorem therefore does not say that every lift of a modular residual representation is modular; it says that lifts lying on the deformation problem matched to the Hecke algebra are modular.
[/explanation]
## Deformation Rings and Hecke Algebras
The central algebraic question is: why should a deformation ring know anything about modular forms? The answer is that both the deformation ring and the Hecke algebra classify the same Galois representations, but from opposite directions.
[definition: Deformation Functor]
Let $\bar{\rho}:G_{\mathbb{Q}}\to GL_2(k)$ be a continuous representation. A deformation functor assigns to each complete local Noetherian $\mathcal{O}$-algebra $A$ with residue field $k$ the set of equivalence classes of continuous lifts
\begin{align*}
\rho_A:G_{\mathbb{Q}}\to GL_2(A)
\end{align*}
reducing to $\bar{\rho}$ modulo the maximal ideal of $A$ and satisfying specified local conditions.
[/definition]
When the usual representability hypotheses hold, this functor is represented by a complete local Noetherian ring $R$. The point of imposing local deformation conditions is that they cut out a deformation problem small enough to compare with modular forms, while still containing the representation attached to the elliptic curve.
[definition: Hecke Algebra Localised at a Residual Representation]
Fix integers $k\ge 2$ and $N\ge 1$, and let
\begin{align*}
M=S_k(\Gamma_0(N);\mathcal{O})
\end{align*}
be the $\mathcal{O}$-module of cusp forms of weight $k$ and level $N$ with coefficients in $\mathcal{O}$. For each prime $p\nmid N$, the Hecke operator is an $\mathcal{O}$-linear map
\begin{align*}
T_p:M\longrightarrow M.
\end{align*}
The Hecke algebra $\mathbb{T}\subseteq \operatorname{End}_{\mathcal{O}}(M)$ is the $\mathcal{O}$-subalgebra generated by the operators $T_p$ for $p\nmid N$ and the relevant operators at primes dividing $N$. If $\mathfrak{m}\subset\mathbb{T}$ is the maximal ideal determined by the residual representation $\bar{\rho}$, then $\mathbb{T}_{\mathfrak{m}}$ is the localisation of $\mathbb{T}$ at $\mathfrak{m}$.
[/definition]
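The action of $T_p$ can be made concrete on $q$-expansions. The following Python sketch is an illustration only, not part of the formal development. It assumes the standard coefficient formula $a_n(T_pf)=a_{pn}+p^{k-1}a_{n/p}$ for $p\nmid N$ and trivial character, and it assumes the classical identification of the weight $2$ newform of level $11$ with the eta product $q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$. The function names are hypothetical, and the computation uses integer coefficients truncated at a fixed precision rather than an $\mathcal{O}$-module.

```python
def eta_product_level_11(precision):
    """Return [a_0, ..., a_precision] for q * prod (1 - q^n)^2 (1 - q^(11n))^2."""
    series = [0] * precision          # product coefficients in degrees 0..precision-1
    series[0] = 1
    for n in range(1, precision):
        for step in (n, 11 * n):
            if step >= precision:
                continue              # this factor cannot change the kept degrees
            for _ in range(2):        # each factor occurs squared
                for i in range(precision - 1, step - 1, -1):
                    series[i] -= series[i - step]   # multiply in place by (1 - q^step)
    return [0] + series               # multiplying by q shifts every index by one


def hecke_weight_two(a, p):
    """Apply T_p in weight 2 (p not dividing the level): b_n = a_{pn} + p * a_{n/p}."""
    top = (len(a) - 1) // p           # largest n for which a_{pn} is known
    b = [0] * (top + 1)
    for n in range(1, top + 1):
        b[n] = a[p * n] + (p * a[n // p] if n % p == 0 else 0)
    return b


def is_eigen_for(a, p):
    """Check T_p f = a_p f on the range where the truncation allows it."""
    b = hecke_weight_two(a, p)
    return all(b[n] == a[p] * a[n] for n in range(1, len(b)))


coefficients = eta_product_level_11(60)
print(coefficients[1:8])
print([is_eigen_for(coefficients, p) for p in (2, 3, 5, 7)])
```

On the truncated range, the check is consistent with this form being a simultaneous eigenvector with eigenvalue $a_p$ for each $T_p$. A maximal ideal of the Hecke algebra records such a system of eigenvalues after reduction modulo the maximal ideal of $\mathcal{O}$.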
The localisation isolates congruence classes of modular forms whose residual Galois representation is $\bar{\rho}$. It is therefore the Hecke-theoretic counterpart of the deformation ring for $\bar{\rho}$.
For the comparison theorem, the local deformation conditions $\mathcal D_v$ are the local pieces of the global deformation problem: each one says which lifts of $\bar{\rho}|_{G_{\mathbb Q_v}}$ are allowed at the prime $v$. The global ring $R$ represents all lifts satisfying these local rules, while the localised Hecke algebra $\mathbb T_{\mathfrak m}$ records the modular lifts with the same residual representation. The Selmer and dual Selmer dimensions are the tangent-space and obstruction counts that tell Taylor-Wiles patching whether the natural map $R\to\mathbb T_{\mathfrak m}$ has the right size to be an isomorphism.
[quotetheorem:4783]
[citeproof:4783]
This result converts modularity into commutative algebra. If a Galois representation defines a point of $R$, then the isomorphism $R\cong\mathbb{T}_{\mathfrak{m}}$ identifies that point with a system of Hecke eigenvalues, and hence with a modular form.
[explanation: Failure Modes in the Comparison]
The hypotheses prevent several concrete mismatches. If $\bar{\rho}$ is reducible, the deformation ring may contain Eisenstein components, while the cuspidal Hecke algebra sees only the cuspidal part of the automorphic spectrum. If the deformation condition permits ramification at a prime but the Hecke algebra is taken at a level that does not include that ramification, then $R$ classifies lifts with no possible image in $\mathbb{T}_{\mathfrak{m}}$.
Localising at the wrong maximal ideal is equally fatal: it compares deformations of one residual representation with modular forms congruent to another. Finally, if the deformation problem is not of Taylor-Wiles type, the auxiliary-prime patching may not produce enough variables to balance the number of relations, so the depth and dimension comparison behind the numerical criterion can fail.
[/explanation]
The next example makes the local conditions explicit at a single prime $p\ne\ell$, comparing the minimal condition with the semistable multiplicative condition on inertia.
[example: Local Conditions at a Prime]
Fix a residual representation $\bar{\rho}:G_{\mathbb{Q}}\to GL_2(k)$ and a prime $p\ne \ell$, and write $I_p\subset G_{\mathbb{Q}_p}$ for inertia. A lift to a complete local $\mathcal{O}$-algebra $A$ is a representation
\begin{align*}
\rho_A:G_{\mathbb{Q}}\to GL_2(A)
\end{align*}
whose reduction modulo the maximal ideal of $A$ is $\bar{\rho}$. If $\bar{\rho}$ is unramified at $p$, then
\begin{align*}
\bar{\rho}(\tau)=I
\qquad\text{for every }\tau\in I_p.
\end{align*}
The minimal local condition at $p$ requires the same inertia equation for the lift:
\begin{align*}
\rho_A(\tau)=I
\qquad\text{for every }\tau\in I_p.
\end{align*}
Thus, under the minimal condition, the restriction $\rho_A|_{G_{\mathbb{Q}_p}}$ factors through the quotient $G_{\mathbb{Q}_p}/I_p$, so the local deformation remembers only the Frobenius action at $p$.
For a semistable multiplicative condition, the allowed inertia action is larger but still controlled. After choosing a basis, the condition permits matrices of the form
\begin{align*}
\rho_A(\tau)
=
\begin{pmatrix}
1 & c(\tau)\\
0 & 1
\end{pmatrix}
\qquad(\tau\in I_p),
\end{align*}
where $c(\tau)\in A$. Then
\begin{align*}
\rho_A(\tau)-I
&=
\begin{pmatrix}
1 & c(\tau)\\
0 & 1
\end{pmatrix}
-
\begin{pmatrix}
1 & 0\\
0 & 1
\end{pmatrix} \\
&=
\begin{pmatrix}
0 & c(\tau)\\
0 & 0
\end{pmatrix},
\end{align*}
and multiplying this matrix by itself gives
\begin{align*}
(\rho_A(\tau)-I)^2
&=
\begin{pmatrix}
0 & c(\tau)\\
0 & 0
\end{pmatrix}
\begin{pmatrix}
0 & c(\tau)\\
0 & 0
\end{pmatrix} \\
&=
\begin{pmatrix}
0\cdot 0+c(\tau)\cdot 0 & 0\cdot c(\tau)+c(\tau)\cdot 0\\
0\cdot 0+0\cdot 0 & 0\cdot c(\tau)+0\cdot 0
\end{pmatrix} \\
&=
\begin{pmatrix}
0 & 0\\
0 & 0
\end{pmatrix}.
\end{align*}
So the minimal condition imposes $\rho_A(\tau)-I=0$ on inertia, while the semistable multiplicative condition permits $\rho_A(\tau)-I$ to be nonzero but forces it to be nilpotent of square zero. These two choices define different local deformation conditions at $p$, hence different local deformation rings, and the global deformation ring $R$ is obtained by imposing the chosen condition at every ramified prime simultaneously.
[/example]
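The matrix identities in the example can be checked mechanically. The sketch below is a minimal illustration under stated assumptions: the coefficient ring is taken to be $A=\mathbb Z/125$ (a hypothetical choice standing in for a quotient of $\mathcal{O}$ with $\ell=5$), and the helper names are invented for this illustration. It verifies that $(\rho_A(\tau)-I)^2=0$ for upper unipotent matrices and that multiplying two such matrices adds their entries $c(\tau)$.

```python
MODULUS = 125  # hypothetical coefficient ring A = Z/5^3


def mat_mul(x, y, modulus=MODULUS):
    """Multiply two 2x2 matrices with entries reduced modulo `modulus`."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) % modulus
             for j in range(2)] for i in range(2)]


def unipotent(c, modulus=MODULUS):
    """The matrix [[1, c], [0, 1]] allowed by the semistable multiplicative condition."""
    return [[1, c % modulus], [0, 1]]


def minus_identity(x, modulus=MODULUS):
    """Return x - I, the quantity that the minimal condition forces to vanish."""
    return [[(x[i][j] - (1 if i == j else 0)) % modulus
             for j in range(2)] for i in range(2)]


ZERO = [[0, 0], [0, 0]]

# Square-zero nilpotence of rho(tau) - I for every possible entry c.
square_zero = all(
    mat_mul(minus_identity(unipotent(c)), minus_identity(unipotent(c))) == ZERO
    for c in range(MODULUS)
)

# Products of unipotent matrices add the upper-right entries.
additive = all(
    mat_mul(unipotent(c1), unipotent(c2)) == unipotent(c1 + c2)
    for c1 in range(0, MODULUS, 7)
    for c2 in range(0, MODULUS, 11)
)

print(square_zero, additive)
```

The second check reflects that, in this basis, $\tau\mapsto c(\tau)$ is additive on inertia; the minimal condition is the special case $c=0$.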
The local conditions are not decorations on the theorem; they determine which automorphic forms can appear on the Hecke side. Minimal deformation conditions correspond to keeping the level small, while allowing extra ramification corresponds to raising the level in a controlled way.
## From Modularity to Fermat and Langlands
The final question is why the modularity theorem became a turning point rather than a single result about elliptic curves. Its importance is that it confirms a precise case of the Langlands philosophy: arithmetic Galois representations should correspond to automorphic forms.
[quotetheorem:4784]
[citeproof:4784]
[explanation: Why Each Input Is Necessary]
The prime-exponent reduction is needed because the Frey curve is built using a prime $p$ so that the mod $p$ representation can be compared with modular forms modulo $p$. Semistability is needed because Wiles's theorem, in the form used for Fermat's Last Theorem, proves modularity only for semistable elliptic curves; without semistability the implication would not apply to the Frey curve. Ribet level lowering is the step that turns modularity into a contradiction: modularity alone would only place the Frey curve at its original conductor, where modular forms may exist.
The argument also does not prove Fermat's Last Theorem independently of modularity. It proves that a Fermat solution would contradict the simultaneous validity of semistable modularity and Ribet level lowering for the associated Frey curve.
[/explanation]
The proof demonstrates the power of moving between [Diophantine equations](/page/Diophantine%20Equations), elliptic curves, Galois representations, and modular forms. A statement about integer solutions becomes a statement about the nonexistence of a modular form of a certain level.
[remark: Place in the Langlands Program]
The modularity theorem is the two-dimensional, weight $2$, rational-coefficient case of a broader expected correspondence between Galois representations and automorphic representations. The representation $\rho_{E,\ell}$ is the Galois side, while the newform $f$ is the automorphic side. The equality of $L$-functions is the visible shadow of the deeper matching of local factors.
[/remark]
The same pattern now appears throughout arithmetic geometry. Modularity lifting theorems have been extended far beyond elliptic curves over $\mathbb{Q}$, including higher-dimensional automorphic forms, totally real fields, and potential automorphy theorems. The essential architecture remains the one introduced here: start with residual modularity, formulate a deformation problem, compare it with a Hecke algebra, and use the comparison to lift automorphy.
The modularity theorem completes the bridge from elliptic curves to modular forms and Galois representations. The next chapter uses that bridge in a classical application: a Frey curve attached to a hypothetical solution of Fermat's equation would force a contradiction with modularity.
# 11. Fermat's Last Theorem via Frey Curves
Chapters 6 through 10 built the bridge from modular forms to two-dimensional Galois representations, and then from elliptic curves to modularity. This chapter explains how that bridge gives a proof of Fermat's Last Theorem. The argument is indirect: a hypothetical solution to $a^n+b^n=c^n$ produces an elliptic curve with exceptional arithmetic properties, and the modularity theorems force those properties to be incompatible.
The main point is not that modular forms give a way to search for Fermat solutions. Rather, they rule out the existence of any primitive solution by translating it into a statement about levels of modular forms. The contradiction appears at the smallest possible level, where the relevant space of cusp forms has no suitable newform.
## From a Hypothetical Fermat Solution to the Frey Curve
What arithmetic object should be attached to a putative solution of Fermat's equation? The decisive insight is that the equation itself can be encoded into the reduction behaviour of an elliptic curve. The Frey curve is engineered so that its discriminant and conductor remember the primes dividing $abc$, while its mod-$p$ Galois representation has a level that Ribet's theorem can lower dramatically.
[definition: Primitive Fermat Solution]
Let $p$ be an odd prime. A primitive Fermat solution of exponent $p$ is a triple $(a,b,c) \in \mathbb Z^3$ such that
\begin{align*}
a^p+b^p&=c^p, \\
abc&\ne 0, \\
\gcd(a,b,c)&=1.
\end{align*}
[/definition]
The reduction to prime exponents is standard. Every integer $n>2$ is divisible either by $4$ or by an odd prime $p$, and a non-zero solution of exponent $n$ gives a non-zero solution of exponent $m$ for every divisor $m$ of $n$, since $x^n=(x^{n/m})^m$. The exponent $4$ case was already settled by Fermat's infinite descent, so the modular argument is needed only for odd primes $p$.
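The reduction step is simple enough to write out. The following Python sketch uses a hypothetical function name and is not part of the proof: given $n>2$, it returns a divisor $m$ of $n$ that is either an odd prime or $4$, which is the exponent to which a putative solution of exponent $n$ can be transferred by replacing $(x,y,z)$ with $(x^{n/m},y^{n/m},z^{n/m})$.

```python
def reduced_exponent(n):
    """Return a divisor m of n that is an odd prime, or 4 if n is a power of two."""
    if n <= 2:
        raise ValueError("the reduction applies only to exponents n > 2")
    odd_part = n
    while odd_part % 2 == 0:
        odd_part //= 2            # strip all factors of two
    if odd_part == 1:
        return 4                  # n = 2^k with k >= 2, so 4 divides n
    d = 3
    while d * d <= odd_part:
        if odd_part % d == 0:
            return d              # smallest odd divisor > 1 is necessarily prime
        d += 2
    return odd_part               # no smaller divisor found: odd_part is prime


for n in (3, 4, 6, 8, 12, 35, 49, 64):
    print(n, reduced_exponent(n))
```

The function prefers an odd prime whenever one divides $n$, so the exponent $4$ is returned only for powers of two.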
[definition: Frey Curve]
Let $p$ be an odd prime and let $(a,b,c)$ be a primitive Fermat solution of exponent $p$. The associated Frey curve is the elliptic curve
\begin{align*}
E_{a,b,p}: y^2 = x(x-a^p)(x+b^p)
\end{align*}
over $\mathbb Q$.
[/definition]
This Weierstrass equation has full rational $2$-torsion, with non-zero $2$-torsion points at $x=0$, $x=a^p$, and $x=-b^p$. The Fermat equation enters through the identity $a^p+b^p=c^p$, which makes the difference between the two non-zero roots equal to $c^p$.
[example: The Discriminant of the Frey Curve]
The roots of $x(x-a^p)(x+b^p)$ are
\begin{align*}
r_1&=0, &
r_2&=a^p, &
r_3&=-b^p.
\end{align*}
Their pairwise differences are
\begin{align*}
r_1-r_2&=0-a^p=-a^p,\\
r_1-r_3&=0-(-b^p)=b^p,\\
r_2-r_3&=a^p-(-b^p)=a^p+b^p=c^p.
\end{align*}
For a monic cubic with roots $r_1,r_2,r_3$, the cubic discriminant is
\begin{align*}
\prod_{i<j}(r_i-r_j)^2.
\end{align*}
Thus
\begin{align*}
\operatorname{disc}\bigl(x(x-a^p)(x+b^p)\bigr)
&=(-a^p)^2(b^p)^2(c^p)^2\\
&=a^{2p}b^{2p}c^{2p}.
\end{align*}
For a Weierstrass equation $y^2=f(x)$ with $f$ a monic cubic, the discriminant of the equation is $16\operatorname{disc}(f)$, so the displayed Frey equation has
\begin{align*}
\Delta
&=16a^{2p}b^{2p}c^{2p}.
\end{align*}
Consequently, for every odd prime $\ell\mid abc$, the exponent of $\ell$ in $\Delta$ is divisible by $p$; this is the high divisibility that later makes the odd primes dividing $abc$ removable by level lowering modulo $p$.
[/example]
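The discriminant identity is a polynomial identity in the roots, so it can be tested on arbitrary integers. In the sketch below, `A` and `B` are hypothetical stand-ins for $a^p$ and $b^p$, and $C=A+B$ plays the role of $c^p$; no actual Fermat solution is involved. The general formula for the discriminant of a monic cubic $x^3+p_2x^2+p_1x+p_0$ is assumed from standard algebra.

```python
def cubic_discriminant(p2, p1, p0):
    """Discriminant of the monic cubic x^3 + p2*x^2 + p1*x + p0."""
    return (p2**2 * p1**2 - 4 * p1**3 - 4 * p2**3 * p0
            - 27 * p0**2 + 18 * p2 * p1 * p0)


def frey_shape_discriminant(A, B):
    """16 * disc(f) for y^2 = f(x) = x(x - A)(x + B)."""
    # Expanding: x(x - A)(x + B) = x^3 + (B - A) x^2 - A*B x.
    return 16 * cubic_discriminant(B - A, -A * B, 0)


def closed_form(A, B):
    """The value 16 * A^2 * B^2 * C^2 with C = A + B, as in the worked example."""
    C = A + B
    return 16 * A**2 * B**2 * C**2


checks = [
    frey_shape_discriminant(A, B) == closed_form(A, B)
    for A in range(-20, 21)
    for B in range(-20, 21)
]
print(all(checks))
```

Taking $A=3$ and $B=5$ gives $C=8$ and the common value $16\cdot 9\cdot 25\cdot 64=230400$.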
The discriminant calculation alone does not yet put the Frey curve into the modularity machine. To apply Wiles's theorem cleanly, the curve must have controlled local reduction at every prime, especially the primes dividing $abc$ and the prime $2$. The next result supplies that local control: the special shape of the Fermat equation forces the Frey curve to be semistable, so its bad reduction is mild enough for the modularity and level-lowering arguments to interact.
[quotetheorem:4785]
[citeproof:4785]
The hypotheses in this theorem are doing real work. The primitivity assumption ensures that an odd prime cannot divide two of $a,b,c$ at once; without it, the local equation could have worse collisions among the roots and the reduction need not be multiplicative. The restriction $p\ge 5$ separates the modular argument from the classical low-exponent cases and avoids small-prime pathologies in the local computations. The theorem does not say that every elliptic curve with full rational $2$-torsion is semistable; the special Fermat relation and primitivity are what force the bad reduction to be multiplicative. Its forward role is precise: semistability puts the Frey curve inside the scope of Wiles's modularity theorem.
[quotetheorem:4786]
[citeproof:4786]
This conductor statement is weaker than a complete conductor formula, but it is exactly the part used in the contradiction. Primitivity is again essential: if an odd prime divided two of $a,b,c$, the reduction type could be worse and the conductor exponent would no longer be forced to be $1$. The theorem also does not say that the same primes disappear from the conductor of the elliptic curve itself; level lowering concerns the residual mod-$p$ representation after comparing the conductor with the minimal discriminant. The usable test is this: at an odd prime $\ell\mid abc$, the conductor contributes a single factor of $\ell$, while the minimal discriminant has valuation divisible by $p$, and that is the local pattern Ribet's theorem can remove.
## Serre, Ribet, and the Contradiction with Modularity
How does an elliptic curve contradict a theorem about modular forms? The route is through its Galois representation on $p$-torsion. Modularity attaches the representation to a modular form of level $N(E)$, while Ribet's theorem removes from the level the primes at which the discriminant has the special divisibility forced by the Fermat equation.
[definition: Mod-$p$ Representation of an Elliptic Curve]
Let $E/\mathbb Q$ be an elliptic curve and let $p$ be a prime. The mod-$p$ Galois representation of $E$ is the homomorphism
\begin{align*}
\bar{\rho}_{E,p}: G_{\mathbb Q} \to GL(E[p]) \cong GL_2(\mathbb F_p)
\end{align*}
obtained from the natural action of $G_{\mathbb Q}=\operatorname{Gal}(\overline{\mathbb Q}/\mathbb Q)$ on the $p$-torsion subgroup $E[p]$.
[/definition]
For the Frey curve, this representation is odd and irreducible in the range needed for the argument. We need a modularity theorem applicable to such semistable elliptic curves before Ribet's level lowering can create the contradiction.
[quotetheorem:4787]
[citeproof:4787]
This is the Wiles and Taylor-Wiles modularity theorem in the semistable case. The semistability hypothesis is the bridge to the version of modularity available in the original Fermat proof; without semistability one needs the later full modularity theorem for elliptic curves over $\mathbb Q$. The conductor matters because it is the level of the newform before level lowering, and weight $2$ matters because elliptic curves correspond to weight $2$ modular forms rather than to arbitrary weights. Modularity does not by itself give a contradiction: it first places the Frey representation in the modular world, and the contradiction only appears after Ribet's theorem lowers the level.
The modularity theorem puts the Frey curve into the world of modular forms, but Fermat's equation needs a contradiction at an impossibly small level. The obstruction is that the initial modular form still has the conductor dictated by the bad primes of the Frey curve. [Ribet's level-lowering theorem](/theorems/4774) is the technical hinge: under the special local ramification created by a Fermat solution, it removes those primes from the level while preserving the residual Galois representation.
[quotetheorem:4788]
[citeproof:4788]
Ribet's theorem is often described as proving Serre's epsilon conjecture. In this application, the important point is the direction from a modular residual representation with ramification at many primes to a newform whose level has had those primes removed. The divisibility of the minimal discriminant valuation by $p$ is indispensable: for a general multiplicative prime with valuation not divisible by $p$, the residual representation need not become unramified in the required sense, and the prime may remain in the level. Irreducibility of $\bar{\rho}_{E_{a,b,p},p}$ is also not cosmetic, since level lowering is a theorem about residual representations satisfying precise modularity and local hypotheses. The theorem therefore does not say that every bad prime of an elliptic curve can be erased; it says that the special bad primes created by a Fermat solution are exactly of the removable kind.
[example: Why Level Two Is Impossible]
The contradiction is concentrated in the target space $S_2(\Gamma_0(2))$. The modular curve $X_0(2)$ has genus $0$. By the definition of genus for a smooth projective complex curve,
\begin{align*}
\dim_{\mathbb C} H^0(X_0(2),\Omega^1)=0.
\end{align*}
Weight $2$ cusp forms on $\Gamma_0(2)$ identify with holomorphic differentials on $X_0(2)$ by sending $f(z)$ to $f(z)\,dz$, so
\begin{align*}
\dim_{\mathbb C} S_2(\Gamma_0(2))
&=\dim_{\mathbb C} H^0(X_0(2),\Omega^1)\\
&=0.
\end{align*}
Therefore
\begin{align*}
S_2(\Gamma_0(2))=0.
\end{align*}
A newform is in particular a nonzero cusp form, so there is no weight $2$ newform of level $2$ from which the Frey representation could arise.
[/example]
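The genus of $X_0(N)$ can be computed from the classical formula $g=1+\mu/12-\nu_2/4-\nu_3/3-\nu_\infty/2$, where $\mu$ is the index of $\Gamma_0(N)$, $\nu_2$ and $\nu_3$ count elliptic points, and $\nu_\infty$ counts cusps. The Python sketch below assumes this standard formula, which is not derived in these notes, and uses hypothetical function names. Since $\dim S_2(\Gamma_0(N))$ equals the genus, it reproduces the vanishing at level $2$.

```python
from fractions import Fraction
from math import gcd


def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def euler_phi(n):
    result = n
    for p in prime_factors(n):
        result -= result // p
    return result


def genus_x0(N):
    """Genus of X_0(N), equal to dim S_2(Gamma_0(N))."""
    primes = prime_factors(N)
    mu = Fraction(N)
    for p in primes:
        mu *= Fraction(p + 1, p)                 # index of Gamma_0(N) in SL_2(Z)/{+-1}
    nu2 = 0 if N % 4 == 0 else 1
    nu3 = 0 if N % 9 == 0 else 1
    for p in primes:
        nu2 *= 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))   # 1 + (-1/p)
        nu3 *= 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))   # 1 + (-3/p)
    nu_inf = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    g = 1 + mu / 12 - Fraction(nu2, 4) - Fraction(nu3, 3) - Fraction(nu_inf, 2)
    if g.denominator != 1:
        raise ArithmeticError("genus formula did not return an integer")
    return int(g)


for level in (1, 2, 11, 37):
    print(level, genus_x0(level))
```

Level $11$ is the smallest level at which the genus is positive, which is consistent with the level $11$ newform used later in the notes.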
This emptiness result is the final target of the level-lowering step. Once the residual representation has been forced to level $2$, there is no remaining modular object available to carry it.
[quotetheorem:4789]
[citeproof:4789]
The hypotheses in the final theorem are exactly the classical ones: non-zero integer solutions are excluded only for exponents $n>2$, while $n=2$ has many Pythagorean triples. The reduction to primitive prime-exponent solutions is a necessary preliminary step; without it the Frey curve construction would not have the local conductor and discriminant properties used above. The theorem does not classify near-misses or quantify how close $a^n+b^n$ can come to another $n$th power. What it does provide is a reusable proof pattern: construct a representation from arithmetic data, prove it is modular, use local information to lower the level, and then inspect the resulting space of modular forms.
## What the Proof Uses and What It Does Not Prove About Explicit Solutions
What has actually been shown, and what information has the method discarded? The proof is existential and global. It eliminates the possibility of a primitive Fermat solution by forcing an impossible modular form, but it does not compute a list of near-solutions or describe numerical searches for triples.
[explanation: Inputs to the Modular Proof]
The argument has four main ingredients. First, the Frey construction turns a primitive Fermat solution into a semistable elliptic curve with controlled discriminant and conductor. Second, the [modularity theorem for semistable elliptic curves](/theorems/4787) turns that elliptic curve into a weight $2$ newform. Third, Mazur-type irreducibility results ensure that the residual representation satisfies the hypotheses needed for level lowering. Fourth, Ribet's theorem removes the odd primes dividing $abc$ from the level, leaving the impossible level $2$ case.
Each input has a different mathematical nature. The Frey curve calculation is explicit arithmetic geometry. Modularity is deformation theory and the arithmetic of Hecke algebras. Level lowering is a comparison between local Galois behaviour and the oldform-newform structure of modular forms. The final contradiction is a small-dimensional computation on a modular curve.
[/explanation]
These inputs also explain why the argument is not an elementary descent in disguise. Each step preserves only the information needed to reach the modular contradiction.
[remark: No Formula for Fermat Solutions]
The theorem proves that Fermat solutions do not exist, so it cannot produce a parametrisation of them. More importantly, the argument never searches through triples $(a,b,c)$. It shows that any such triple would lead to a Galois representation with incompatible modular properties.
[/remark]
The construction therefore sacrifices explicit numerical information in exchange for rigid structural information. The next remark records why that structural information is present in the Frey curve and not in a random elliptic curve attached to three integers.
[remark: Why the Frey Curve Is Not Arbitrary]
Many elliptic curves can be written down from three integers, but the Frey curve is tailored to the equation. Its roots are arranged so that their pairwise differences are, up to sign, $a^p$, $b^p$, and $c^p$, which forces the discriminant to contain $a^{2p}b^{2p}c^{2p}$. At the same time, semistability keeps the conductor small enough for level lowering to have force.
[/remark]
The next example lays out the steps of the contradiction in order, so that the role of each input can be checked directly.
[example: The Logical Shape of the Contradiction]
Assume that a primitive Fermat solution $(a,b,c)$ exists for some prime exponent $p\ge 5$, so
\begin{align*}
a^p+b^p&=c^p,\\
abc&\ne 0,\\
\gcd(a,b,c)&=1.
\end{align*}
Attach to it the Frey curve $E=E_{a,b,p}$ and its residual representation
\begin{align*}
\bar{\rho}_{E,p}:G_{\mathbb Q}\to GL_2(\mathbb F_p).
\end{align*}
By *Modularity of Semistable Elliptic Curves*, the semistable Frey curve is modular, so $\bar{\rho}_{E,p}$ comes from a weight $2$ modular form at the conductor level of $E$. By *Ribet Level Lowering for the Frey Curve*, the same residual representation must then arise from a weight $2$ newform of level $2$.
But the level-$2$ target space is zero:
\begin{align*}
S_2(\Gamma_0(2))&=0.
\end{align*}
A newform is, by definition, a nonzero cusp form, so a weight $2$ newform of level $2$ would be a nonzero element of $S_2(\Gamma_0(2))$. This would require
\begin{align*}
S_2(\Gamma_0(2))&\ne 0,
\end{align*}
contradicting
\begin{align*}
S_2(\Gamma_0(2))&=0.
\end{align*}
Thus the assumed primitive solution cannot exist. The contradiction is not obtained by manipulating the equation $a^p+b^p=c^p$ numerically; it is obtained because the Frey representation is forced into an empty space of modular forms.
[/example]
The chapter completes the course's main narrative. Modular forms began as analytic objects with Fourier expansions and Hecke operators; through modular curves and cohomology they produced Galois representations; through elliptic curves and modularity they became powerful enough to settle a Diophantine problem that had resisted elementary methods for centuries.
The Fermat argument shows how far the modular-form-and-Galois-representation machine can reach when combined with modularity. The closing chapter then steps back and places that machine inside the broader Langlands programme, where the same pattern is expected to govern much more general arithmetic objects.
# 12. Beyond the Course: Galois Representations and Langlands
This final chapter steps back from the Fermat application in Chapter 11 and explains how the constructions of the course sit inside the wider Langlands programme. The earlier chapters attached Galois representations to modular forms and used Frobenius traces to compare arithmetic and analytic data. We now step beyond the main syllabus to compatible systems, Serre modularity, Fontaine-Mazur, Hilbert modular forms, Langlands reciprocity, and the modularity argument behind Fermat's Last Theorem.
## Compatible Systems and Motives Attached to Modular Forms
A single $\ell$-adic representation attached to an eigenform already contains a great deal of arithmetic information, but it depends on a choice of prime and a choice of place above it in the coefficient field. The first problem is to understand why these different representations should be viewed as different realisations of one object, rather than as unrelated representations.
[definition: Compatible System Attached to a Modular Eigenform]
Let $f = \sum_{n \ge 1} a_n q^n$ be a normalized cuspidal Hecke eigenform of weight $k \ge 2$, level $N$, nebentypus character $\chi$, and coefficient field $K_f = \mathbb Q(a_n : n \ge 1)$. A compatible system attached to $f$ is the collection of continuous representations
\begin{align*}
\rho_{f,\lambda}: G_{\mathbb Q} \to GL_2(K_{f,\lambda})
\end{align*}
indexed by finite places $\lambda$ of $K_f$, satisfying the standard unramified Frobenius trace and determinant identities away from $N\ell$, where $\ell$ is the residue characteristic of $\lambda$.
[/definition]
The word compatible refers to the fact that, at almost all primes $p$, the characteristic polynomial of Frobenius has coefficients in $K_f$ and does not depend on the auxiliary place $\lambda$. The remaining issue is existence: a definition of a compatible system would be empty unless modular eigenforms actually produced such Galois representations. Deligne's construction is the theorem that fills this gap, identifying Hecke eigenvalues with Frobenius traces and showing that the same modular form controls all the auxiliary $\lambda$-adic realisations.
[quotetheorem:4790]
[citeproof:4790]
This theorem is the bridge from modular forms to arithmetic: Hecke eigenvalues become Frobenius traces. The restriction $p \nmid N\ell$ is essential: primes dividing $N$ are precisely where the modular curve or local system may have bad reduction, and $p=\ell$ is where the representation is governed by $p$-adic Hodge theory rather than by an ordinary unramified Frobenius element. Cuspidality removes Eisenstein pieces whose Galois representations decompose into characters, and the eigenform hypothesis is what lets every Hecke operator act through a scalar $a_p$. The condition $k \ge 2$ keeps the representation in the cohomological range treated here; weight one forms have finite-image Artin representations and require the Deligne-Serre theory instead. It also explains why $q$-expansions are not merely analytic data; their coefficients are recording point-counting and Galois action.
[example: Weight Two Newform and an Elliptic Curve]
Let $f \in S_2(\Gamma_0(11))$ be the normalized newform attached to
\begin{align*}
E: y^2+y=x^3-x^2-10x-20.
\end{align*}
For a prime $p \nmid 11\ell$, the trace of Frobenius on $T_\ell E \otimes_{\mathbb Z_\ell}\mathbb Q_\ell$ is
\begin{align*}
a_p(E)=p+1-|E(\mathbb F_p)|,
\end{align*}
and modularity identifies this trace with the Hecke eigenvalue $a_p(f)$.
For example, over $\mathbb F_2$ the right-hand side is
\begin{align*}
x^3-x^2-10x-20 \equiv x^3+x^2=x^2(x+1).
\end{align*}
For $x=0$ and $x=1$ this is $0$, and in $\mathbb F_2$ both $y=0$ and $y=1$ satisfy $y^2+y=0$. Hence there are $4$ affine points and one point at infinity, so
\begin{align*}
|E(\mathbb F_2)|=5,
\qquad
a_2(f)=2+1-5=-2.
\end{align*}
Over $\mathbb F_3$,
\begin{align*}
x^3-x^2-10x-20 \equiv x^3+2x^2+2x+1.
\end{align*}
Evaluating this at $x=0,1,2$ gives
\begin{align*}
1,\quad 1+2+2+1 \equiv 0,\quad 8+8+4+1 \equiv 0.
\end{align*}
The values of $y^2+y$ in $\mathbb F_3$ are
\begin{align*}
0^2+0=0,\qquad 1^2+1=2,\qquad 2^2+2=6\equiv 0,
\end{align*}
so the fibers over $x=0,1,2$ have $0,2,2$ points respectively. Thus
\begin{align*}
|E(\mathbb F_3)|=0+2+2+1=5,
\qquad
a_3(f)=3+1-5=-1.
\end{align*}
These computations give the beginning of the $q$-expansion
\begin{align*}
f(q)=q-2q^2-q^3+\cdots,
\end{align*}
and they illustrate concretely that the compatible system attached to $f$ is the same system obtained from the Tate modules of $E$.
[/example]
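The point counts in this example can be reproduced by brute force. The following Python sketch, with hypothetical function names, enumerates the affine solutions of $y^2+y=x^3-x^2-10x-20$ over $\mathbb F_p$, adds the point at infinity, and returns $a_p=p+1-|E(\mathbb F_p)|$. It is meant only for small primes of good reduction, so $p\ne 11$.

```python
def count_points(p):
    """Number of points of E over F_p, including the point at infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - (x**3 - x**2 - 10 * x - 20)) % p == 0
    )
    return affine + 1


def frobenius_trace(p):
    """a_p(E) = p + 1 - |E(F_p)| for a prime p of good reduction (p != 11)."""
    if p == 11:
        raise ValueError("E has bad reduction at 11")
    return p + 1 - count_points(p)


for p in (2, 3, 5, 7):
    print(p, count_points(p), frobenius_trace(p))
```

For $p=2$ and $p=3$ this returns the values $-2$ and $-1$ computed by hand above; the same loop gives further coefficients of the $q$-expansion at prime indices.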
The motivic language packages this pattern. Instead of saying that every $\ell$ gives a separate representation, one expects a pure motive over $\mathbb Q$ whose $\ell$-adic realisations are the $\rho_{f,\lambda}$.
[explanation: Motives as a Common Source]
A motive is not needed as a fully constructed object for the main arguments of this course, but it gives the right organising principle. Cohomology theories such as singular cohomology, de Rham cohomology, and $\ell$-adic cohomology often behave like different shadows of the same geometric object. For modular eigenforms of weight $k \ge 2$, the relevant piece of the cohomology of a modular curve, or of a Kuga-Sato variety in higher weight, is expected to define a motive $M_f$ over $\mathbb Q$ with coefficients in $K_f$.
The $\ell$-adic realisation of $M_f$ is $\rho_{f,\lambda}$, and the Hodge-theoretic realisation remembers the weight and type of the modular form. In this language, the compatibility of Frobenius polynomials reflects the fact that all $\ell$-adic realisations come from one common source.
[/explanation]
The compatible-system language packages this phenomenon into a reusable object: a family of $\ell$-adic representations whose good-prime Frobenius polynomials are independent of $\ell$. We need a purity condition as well, because Langlands-style comparisons also track the complex absolute values of Frobenius eigenvalues.
[definition: Pure Compatible System]
Let $K$ be a number field, let $E$ be a coefficient field, and let $w \in \mathbb Z$. A two-dimensional pure $E$-rational compatible system of weight $w$ over $K$ consists of representations
\begin{align*}
\rho_\lambda: G_K \to GL_2(E_\lambda)
\end{align*}
for finite places $\lambda$ of $E$, together with a finite set of primes $S$ of $K$, such that for every prime $v \notin S$ with residue characteristic different from that of $\lambda$, the representation $\rho_\lambda$ is unramified at $v$, the characteristic polynomial of $\rho_\lambda(\operatorname{Frob}_v)$ lies in $E[X]$ independently of $\lambda$, and every root $\alpha$ of this polynomial satisfies
\begin{align*}
|\iota(\alpha)| = q_v^{w/2}
\end{align*}
for every embedding $\iota: E(\alpha) \hookrightarrow \mathbb C$, where $q_v$ is the size of the residue field at $v$.
[/definition]
For a modular form of weight $k$, the attached compatible system is pure of weight $k-1$: at primes $p \nmid N\ell$, the Frobenius eigenvalues have complex absolute value $p^{(k-1)/2}$ after any embedding into $\mathbb C$.
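In weight $2$ with trivial character, the Frobenius polynomial at a good prime $p$ is $X^2-a_pX+p$, and purity of weight $1$ says that both roots have complex absolute value $p^{1/2}$. The sketch below checks this numerically for the traces $a_2=-2$ and $a_3=-1$ of the level $11$ example. The function names are hypothetical, and the comparison is made in floating point with a tolerance, so it is a sanity check rather than a proof.

```python
import cmath
import math


def frobenius_roots(a_p, p):
    """Roots of X^2 - a_p X + p, the weight 2 Frobenius polynomial at p."""
    root_of_disc = cmath.sqrt(a_p * a_p - 4 * p)
    return (a_p + root_of_disc) / 2, (a_p - root_of_disc) / 2


def is_pure(a_p, p, weight=1):
    """Test |alpha| = p^(weight/2) for both roots, up to floating-point tolerance."""
    target = p ** (weight / 2)
    return all(
        math.isclose(abs(alpha), target, rel_tol=1e-9)
        for alpha in frobenius_roots(a_p, p)
    )


print(is_pure(-2, 2), is_pure(-1, 3))
# A trace violating the bound |a_p| <= 2*sqrt(p) gives real roots of unequal size.
print(is_pure(4, 2))
```

The last line illustrates that purity is equivalent here to the bound $|a_p|\le 2p^{1/2}$: when the bound fails, the roots are real with different absolute values.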
## Serre Modularity and the Fontaine-Mazur Conjecture
The next problem reverses the direction of the course. We know that modular eigenforms produce Galois representations; which Galois representations arise in this way?
[definition: Odd Two-Dimensional Mod p Representation]
Let $p$ be a prime. A continuous representation
\begin{align*}
\bar\rho: G_{\mathbb Q} \to GL_2(\overline{\mathbb F}_p)
\end{align*}
is odd if, for complex conjugation $c \in G_{\mathbb Q}$, one has
\begin{align*}
\det(\bar\rho(c)) = -1.
\end{align*}
[/definition]
Oddness is the sign condition forced by modular forms over $\mathbb Q$: the determinant of the Galois representation attached to an eigenform is an odd character, so complex conjugation acts with determinant $-1$. It is the requirement that a two-dimensional representation have the correct behaviour at the real place.
[quotetheorem:4791]
[citeproof:4791]
This theorem was Serre's modularity conjecture and was proved by Khare and Wintenberger, with important input from modularity lifting methods. The hypotheses are restrictive in the right way. Absolute irreducibility excludes representations built from characters, which correspond to Eisenstein phenomena rather than to cuspidal eigenforms. Oddness is forced by the real place for modular forms over $\mathbb Q$; even two-dimensional residual representations are not expected to come from classical cuspidal modular forms in this statement. Thus the theorem is not saying that every mod $p$ representation is modular, but that every continuous absolutely irreducible representation with the correct real-place sign is.
[example: Odd Mod p Representation]
Let $E/\mathbb Q$ be an elliptic curve, let $p$ be a prime, and consider the Galois action on the $p$-torsion:
\begin{align*}
\bar\rho_{E,p}:G_{\mathbb Q}\to \operatorname{Aut}(E[p])\cong GL_2(\mathbb F_p).
\end{align*}
Choose an $\mathbb F_p$-basis $P,Q$ of $E[p]$, and write
\begin{align*}
\bar\rho_{E,p}(\sigma)
=
\begin{pmatrix}
a & b\\
c & d
\end{pmatrix},
\qquad
\sigma(P)=aP+cQ,\qquad
\sigma(Q)=bP+dQ.
\end{align*}
The Weil pairing
\begin{align*}
e_p:E[p]\times E[p]\to \mu_p
\end{align*}
is bilinear, alternating, and Galois-equivariant. If $e_p(P,Q)=\zeta$, then
\begin{align*}
e_p(\sigma P,\sigma Q)
&=e_p(aP+cQ,bP+dQ)\\
&=e_p(P,P)^{ab}e_p(P,Q)^{ad}e_p(Q,P)^{cb}e_p(Q,Q)^{cd}\\
&=1^{ab}\zeta^{ad}\zeta^{-cb}1^{cd}\\
&=\zeta^{ad-bc}\\
&=\zeta^{\det(\bar\rho_{E,p}(\sigma))}.
\end{align*}
On the other hand, Galois-equivariance gives
\begin{align*}
e_p(\sigma P,\sigma Q)
=
\sigma(e_p(P,Q))
=
\sigma(\zeta)
=
\zeta^{\bar\chi_p(\sigma)},
\end{align*}
where $\bar\chi_p:G_{\mathbb Q}\to \mathbb F_p^\times$ is the mod $p$ cyclotomic character. Since $\zeta$ has order $p$, the equality
\begin{align*}
\zeta^{\det(\bar\rho_{E,p}(\sigma))}
=
\zeta^{\bar\chi_p(\sigma)}
\end{align*}
implies
\begin{align*}
\det(\bar\rho_{E,p}(\sigma))=\bar\chi_p(\sigma)
\end{align*}
in $\mathbb F_p^\times$.
For complex conjugation $c\in G_{\mathbb Q}$, one has
\begin{align*}
c(\zeta)=\overline{\zeta}=\zeta^{-1},
\end{align*}
so
\begin{align*}
\bar\chi_p(c)=-1\in \mathbb F_p^\times.
\end{align*}
Therefore, for odd $p$,
\begin{align*}
\det(\bar\rho_{E,p}(c))=-1,
\end{align*}
so $\bar\rho_{E,p}$ is odd. When $\bar\rho_{E,p}$ is irreducible, *[Serre Modularity Theorem](/theorems/4791)* says that this residual representation comes from a modular eigenform.
[/example]
Serre's theorem is a mod $p$ result. The corresponding $\ell$-adic question asks for a criterion recognizing those representations which should come from geometry and automorphic forms.
[definition: Geometric L-Adic Representation]
Let $\ell$ be a prime. A continuous representation
\begin{align*}
\rho: G_{\mathbb Q} \to GL_n(\overline{\mathbb Q}_\ell)
\end{align*}
is geometric if it is unramified outside finitely many primes and is de Rham at $\ell$.
[/definition]
The de Rham condition is a $p$-adic Hodge-theoretic replacement for saying that the representation has a sensible Hodge theory. Modular Galois representations satisfy this condition, with Hodge-Tate weights governed by the weight of the modular form: for an eigenform of weight $k$ they are $0$ and $k-1$, up to the choice of sign convention.
[remark: Fontaine-Mazur Conjecture]
The [Fontaine-Mazur conjecture](/theorems/4792) is recorded here as a guiding prediction, not as a theorem proved in these notes. In the form relevant to this chapter, it says that continuous irreducible geometric $\ell$-adic representations of $G_{\mathbb Q}$ should arise from algebraic geometry after allowing Tate twists. In two-dimensional odd cases over $\mathbb Q$, this expectation points toward modular forms or automorphic representations. The course uses the conjecture to explain the direction of the subject, not as an input to any proof.
[/remark]
Each hypothesis of the conjecture removes a real obstruction. Finite ramification is necessary because algebraic varieties over $\mathbb Q$ have good reduction outside finitely many primes, while a representation ramified at infinitely many primes cannot arise from such cohomology. The de Rham condition at $\ell$ rules out many continuous $\ell$-adic representations with no Hodge-theoretic structure. Irreducibility separates the basic building blocks from direct sums, and oddness in dimension two is the real-place sign compatible with classical modular forms. The conjecture therefore does not assert automorphy for arbitrary continuous representations; it predicts automorphic or geometric origin only for representations satisfying the same finiteness and Hodge-theoretic constraints seen in the examples from modular forms.
[example: Artin Representations and Weight One Forms]
An Artin representation is a continuous representation
\begin{align*}
\rho:G_{\mathbb Q}\to GL_2(\mathbb C)
\end{align*}
whose image is finite. If $H=\ker(\rho)$, then $H$ has finite index in $G_{\mathbb Q}$, so by Galois theory it corresponds to a finite Galois extension $L/\mathbb Q$, and $\rho$ factors as
\begin{align*}
G_{\mathbb Q}\longrightarrow \operatorname{Gal}(L/\mathbb Q)\longrightarrow GL_2(\mathbb C).
\end{align*}
Only finitely many rational primes ramify in the finite extension $L/\mathbb Q$, so for every prime $p$ outside that finite ramification set, the conjugacy class of $\operatorname{Frob}_p$ in $\operatorname{Gal}(L/\mathbb Q)$ is defined. The local Euler factor of the Artin $L$-function is then
\begin{align*}
\det\!\left(1-\rho(\operatorname{Frob}_p)p^{-s}\right)^{-1}.
\end{align*}
If the eigenvalues of $\rho(\operatorname{Frob}_p)$ are $\alpha_p$ and $\beta_p$, then
\begin{align*}
\det\!\left(1-\rho(\operatorname{Frob}_p)p^{-s}\right)
&=(1-\alpha_pp^{-s})(1-\beta_pp^{-s})\\
&=1-(\alpha_p+\beta_p)p^{-s}+\alpha_p\beta_p p^{-2s}\\
&=1-\operatorname{tr}(\rho(\operatorname{Frob}_p))p^{-s}
+\det(\rho(\operatorname{Frob}_p))p^{-2s}.
\end{align*}
Thus the Frobenius traces are exactly the coefficients appearing in the unramified local factors.
Weight one normalized eigenforms give the same kind of two-dimensional Galois representations, but with finite image rather than the infinite $\ell$-adic images typical in weights $k\ge 2$. The *Deligne-Serre theorem* attaches to every normalized weight one cuspidal newform an odd irreducible two-dimensional Artin representation with matching Frobenius traces. This is the weight one counterpart of the higher-weight story: Frobenius traces still match Hecke eigenvalues, but the representation is complex and finite-image, so when it is viewed $\ell$-adically both of its Hodge-Tate weights are $0$, in contrast to the distinct weights $0$ and $k-1$ of the geometric $\ell$-adic representations attached to higher-weight modular forms.
[/example]
## Hilbert Modular Forms and Automorphic Representations
The final broadening changes the ground field. If $\mathbb Q$ is replaced by a number field $F$, especially a totally real field, the relevant modular objects are no longer classical modular forms on the upper half-plane alone. The problem is to find the correct automorphic replacement and the matching Galois representations.
[definition: Hilbert Modular Eigenform]
Let $F$ be a totally real number field, let $\mathbb A_F$ be its ring of adeles, and let $V$ be the coefficient field or representation space determined by the weight. A Hilbert modular eigenform over $F$ is a function
\begin{align*}
\varphi: GL_2(F) \backslash GL_2(\mathbb A_F) \to V
\end{align*}
which is cuspidal, transforms at the archimedean places with algebraic discrete-series type, satisfies the prescribed level condition at the finite places, and is a simultaneous eigenvector for the appropriate finite-place Hecke operators.
[/definition]
For $F=\mathbb Q$, this recovers classical modular eigenforms after translating between adelic and classical language. The general case differs in one visible way: for a totally real field $F$ there is one archimedean factor for each embedding $F \hookrightarrow \mathbb R$, so the weight becomes a tuple indexed by these embeddings. The following theorem attaches Galois representations in this setting.
[quotetheorem:4793]
[citeproof:4793]
This is the direct analogue of the construction for classical modular forms, but the hypotheses carry real content. The field $F$ is taken totally real so that the archimedean components have the discrete-series algebraicity needed for the cohomology of Hilbert modular varieties. Regular algebraicity is the condition that makes the representation visible in algebraic cohomology and gives Hodge-Tate weights; cuspidality removes Eisenstein and reducible contributions. The matching statement is only made at places where both sides are unramified, because ramified places require a more delicate local Langlands comparison rather than a single Frobenius polynomial. The technical work is deeper because Hilbert modular varieties have higher dimension and the local conditions at primes above $\ell$ require $p$-adic Hodge theory.
[example: Base Change from Q to a Totally Real Field]
Let $f=\sum_{n\ge 1}a_nq^n$ be a classical newform of level $N$, weight $k$, nebentypus $\chi$, and coefficient field $K_f$, and let $F/\mathbb Q$ be a totally real extension for which base change is available. The restricted representation
\begin{align*}
\rho_{f,\lambda}|_{G_F}:G_F\to GL_2(K_{f,\lambda})
\end{align*}
is the Galois representation attached to the corresponding Hilbert modular form over $F$.
Let $v$ be a finite place of $F$ above a rational prime $p$, and assume that $p\nmid N\ell$ and that $v$ is unramified in $F/\mathbb Q$. Write $r$ for the residue degree of $v$ over $p$, so the residue field at $v$ has size
\begin{align*}
q_v=p^r.
\end{align*}
For the original compatible system over $\mathbb Q$, choose Frobenius eigenvalues $\alpha_p,\beta_p$ so that
\begin{align*}
\alpha_p+\beta_p&=a_p,\\
\alpha_p\beta_p&=\chi(p)p^{k-1},
\end{align*}
and hence
\begin{align*}
\det\!\left(X-\rho_{f,\lambda}(\operatorname{Frob}_p)\right)
&=(X-\alpha_p)(X-\beta_p)\\
&=X^2-(\alpha_p+\beta_p)X+\alpha_p\beta_p\\
&=X^2-a_pX+\chi(p)p^{k-1}.
\end{align*}
Because $v$ has residue degree $r$ over $p$, the arithmetic Frobenius element $\operatorname{Frob}_v\in G_F$ maps to the conjugacy class of $\operatorname{Frob}_p^r$ in $G_{\mathbb Q}$. Therefore
\begin{align*}
(\rho_{f,\lambda}|_{G_F})(\operatorname{Frob}_v)
&=\rho_{f,\lambda}(\operatorname{Frob}_p^r)\\
&=\rho_{f,\lambda}(\operatorname{Frob}_p)^r.
\end{align*}
If $\rho_{f,\lambda}(\operatorname{Frob}_p)$ has eigenvalues $\alpha_p,\beta_p$, then its $r$th power has eigenvalues $\alpha_p^r,\beta_p^r$, so the Frobenius polynomial at $v$ is
\begin{align*}
\det\!\left(X-(\rho_{f,\lambda}|_{G_F})(\operatorname{Frob}_v)\right)
&=(X-\alpha_p^r)(X-\beta_p^r)\\
&=X^2-(\alpha_p^r+\beta_p^r)X+\alpha_p^r\beta_p^r\\
&=X^2-(\alpha_p^r+\beta_p^r)X+(\alpha_p\beta_p)^r\\
&=X^2-(\alpha_p^r+\beta_p^r)X+\chi(p)^r p^{r(k-1)}\\
&=X^2-(\alpha_p^r+\beta_p^r)X+\chi(p)^r q_v^{k-1}.
\end{align*}
Thus base change does not invent new Frobenius data at unramified places; it reads the old compatible system on the Frobenius class determined by $v$ over $p$.
[/example]
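The trace $\alpha_p^r+\beta_p^r$ at $v$ can be computed from $a_p$ and $\chi(p)p^{k-1}$ alone, without finding the eigenvalues, through the recurrence $t_0=2$, $t_1=a_p$, $t_n=a_pt_{n-1}-\alpha_p\beta_p\,t_{n-2}$. The following is a minimal sketch of that computation; the function name `frobenius_power_trace` and the sample values ($p=2$, $a_p=-2$, weight $2$, trivial character) are assumptions made for illustration only.

```python
def frobenius_power_trace(a_p, det_p, r):
    """Return alpha^r + beta^r, given alpha + beta = a_p and alpha * beta = det_p."""
    # Power-sum recurrence: t_0 = 2, t_1 = a_p, t_n = a_p * t_{n-1} - det_p * t_{n-2}.
    prev, curr = 2, a_p
    if r == 0:
        return prev
    for _ in range(r - 1):
        prev, curr = curr, a_p * curr - det_p * prev
    return curr


# Hypothetical sample data: weight k = 2, trivial character, p = 2, a_p = -2,
# so the determinant of Frobenius at p is chi(p) * p^(k-1) = 2.
p, a_p, k = 2, -2, 2
det_p = p ** (k - 1)
for r in (1, 2, 3):
    # Trace and determinant of Frobenius at a place v of residue degree r.
    print(r, frobenius_power_trace(a_p, det_p, r), det_p ** r)
```

For $r=2$ the recurrence reduces to the familiar identity $\alpha_p^2+\beta_p^2=a_p^2-2\chi(p)p^{k-1}$.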
The modern formulation does not stop with $GL_2$. It predicts a relation between automorphic representations of reductive groups and Galois representations into dual groups.
[explanation: Langlands Reciprocity Perspective]
Langlands reciprocity generalizes the modularity dictionary. On one side are automorphic representations, analytic objects built from harmonic analysis on adelic groups. On the other side are Galois representations or Langlands parameters, arithmetic objects encoding Frobenius elements, ramification, and local behaviour.
For classical modular forms, the correspondence sends an eigenform $f$ to the compatible system $\rho_{f,\lambda}$, and the equality
\begin{align*}
\operatorname{tr}(\rho_{f,\lambda}(\operatorname{Frob}_p)) = a_p
\end{align*}
is the visible local shadow. For elliptic curves over $\mathbb Q$, modularity says that the same dictionary accounts for the Tate module of the curve. For higher-dimensional groups, one replaces $GL_2$ by a reductive group and replaces two-dimensional representations by representations valued in the appropriate dual group.
[/explanation]
After the concrete $GL_2$ examples, the natural question is what general pattern they are instances of. The obstruction is that automorphic objects are analytic and representation-theoretic, while Galois representations are arithmetic; a reciprocity principle has to specify when these very different kinds of data should determine one another. The following formulation records the guiding expectation behind the examples already seen, while keeping the statement at the level appropriate for this course.
[remark: Langlands Reciprocity Perspective]
The broad Langlands reciprocity principle is also a perspective rather than a theorem in the generality suggested by the slogan. For the holomorphic cuspidal eigenforms treated earlier in the course, Deligne's construction supplies the precise proved case: a normalized eigenform of weight $k\ge 2$ gives a weakly compatible system of $\ell$-adic Galois representations whose unramified Frobenius polynomials match the Hecke polynomials. The point of the present discussion is to place that proved modular-form case inside the larger expected correspondence, not to assert the full reciprocity principle as a finished theorem.
[/remark]
Several qualifications are built into this perspective. Algebraicity is essential: non-algebraic automorphic representations may have analytic $L$-functions but are not expected to produce compatible systems of $\ell$-adic Galois representations with Hodge-Tate weights. Compatibility is also essential, since isolated representations with unrelated Frobenius polynomials do not behave like the realisations of a common motive. A further limitation is that the matching of Frobenius and Hecke polynomials is asserted only at almost all places: ramified local factors contain extra monodromy and conductor data that are not captured by the unramified Frobenius polynomial alone. The power of the principle is that it explains why modularity theorems, compatible systems, motives, and $L$-functions are parts of one structure.
## Fermat's Last Theorem as a Modularity Argument
The historical reason Galois representations became central to the study of modular forms is that they allow Diophantine equations to be attacked through modularity and congruences. The final question is how the machinery of this course enters the proof of Fermat's Last Theorem.
[quotetheorem:4781]
[citeproof:4781]
The displayed trace formula is stated at primes of good reduction because only there is the $\ell$-adic representation unramified and the point-counting formula directly gives the Frobenius trace. At bad primes, the conductor records the type and depth of ramification, but equality of the level with the conductor does not mean that the local representation is unramified or that the same point-counting formula applies. The theorem is the central modularity input tying the elliptic curve to a newform of the same conductor.
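The trace formula at good primes can be checked by hand in a small case. The sketch below uses the elliptic curve $y^2+y=x^3-x^2$ of conductor $11$ and the weight $2$, level $11$ newform $q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$; both are standard examples chosen here for illustration and are not taken from the theorem statement. It compares $a_p=p+1-\#E(\mathbb F_p)$ with the $p$th Fourier coefficient at a few primes $p\ne 11$. The function names are hypothetical helpers.

```python
def eta_product_coefficients(max_n):
    """Coefficients c[0..max_n] of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2."""
    # prod[d] is the coefficient of q^d in the infinite product, truncated at degree max_n - 1.
    prod = [0] * max_n
    prod[0] = 1
    for n in range(1, max_n):
        for step in (n, 11 * n):
            if step >= max_n:
                continue  # this factor cannot affect the truncated series
            for _ in range(2):
                # Multiply in place by (1 - q^step), from high degree to low.
                for d in range(max_n - 1, step - 1, -1):
                    prod[d] -= prod[d - step]
    return [0] + prod  # the leading factor q shifts every exponent by one


def a_p_by_point_count(p):
    """Return p + 1 - #E(F_p) for E: y^2 + y = x^3 - x^2, by brute-force enumeration."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - (x ** 3 - x * x)) % p == 0
    )
    return p + 1 - (affine + 1)  # the extra 1 is the point at infinity


coeffs = eta_product_coefficients(20)
for p in (2, 3, 5, 7, 13):
    print(p, coeffs[p], a_p_by_point_count(p))
```

The prime $11$ is skipped on purpose: it is the prime of bad reduction, where the point-count formula for the Frobenius trace does not apply in the same way.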
[explanation: Frey-Ribet-Wiles Strategy]
Assume for contradiction that there is a solution of $a^n+b^n=c^n$ in nonzero integers with $n \ge 3$. After standard reductions to a prime exponent $n$ and coprime $a,b,c$, one constructs the Frey elliptic curve
\begin{align*}
E_{a,b,n}: y^2 = x(x-a^n)(x+b^n).
\end{align*}
This curve is semistable, and its mod $n$ Galois representation has very little ramification: it is unramified outside $2n$.
Modularity attaches a weight $2$ newform to the Frey curve. Ribet's level-lowering theorem then shows that the residual representation must arise from a newform of a much smaller level, ultimately level $2$ in the classical setup. But there is no such weight $2$ cuspidal newform at that level. The contradiction rules out the original Fermat solution.
[/explanation]
The explanation gives the organizing idea. The following example makes the discriminant and level computations behind it explicit.
[example: Where Level Lowering Enters]
After the usual reduction, it is enough to consider a primitive solution
\begin{align*}
a^\ell+b^\ell=c^\ell
\end{align*}
with $\ell$ an odd prime and $\gcd(a,b,c)=1$. Attach the Frey curve
\begin{align*}
E_{a,b,\ell}:y^2=x(x-a^\ell)(x+b^\ell).
\end{align*}
The cubic on the right has roots $0$, $a^\ell$, and $-b^\ell$. For a curve $y^2=(x-r_1)(x-r_2)(x-r_3)$, the discriminant is $16\prod_{i<j}(r_i-r_j)^2$, so here
\begin{align*}
\Delta(E_{a,b,\ell})
&=16(0-a^\ell)^2(0+b^\ell)^2(a^\ell+b^\ell)^2\\
&=16a^{2\ell}b^{2\ell}(a^\ell+b^\ell)^2\\
&=16a^{2\ell}b^{2\ell}c^{2\ell}.
\end{align*}
Thus, for every odd prime $q\mid abc$, primitivity implies that $q$ divides exactly one of $a,b,c$, and hence
\begin{align*}
v_q(\Delta(E_{a,b,\ell}))=2\ell\,v_q(abc).
\end{align*}
In particular, $v_q(\Delta(E_{a,b,\ell}))$ is divisible by $\ell$.
Modularity attaches to $E_{a,b,\ell}$ a weight $2$ newform whose level is the conductor of the curve. At the odd primes $q\mid abc$, the Frey curve is semistable with multiplicative reduction, so each such $q$ contributes one factor to the conductor. Since the exponent $v_q(\Delta(E_{a,b,\ell}))$ computed above is divisible by $\ell$, *Ribet's level-lowering theorem* removes these odd primes from the level of the residual representation. Therefore the modular form forced by the residual representation has level supported only at $2$; in the classical Fermat normalization this gives level $2$.
But
\begin{align*}
S_2(\Gamma_0(2))=0,
\end{align*}
so there is no weight $2$ cuspidal newform of level $2$. The contradiction is not merely that the Frey curve is modular: it is that modularity gives a form at the conductor, while level lowering forces the same residual representation to come from a level where no such form exists.
[/example]
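The vanishing $S_2(\Gamma_0(2))=0$ can be confirmed from the standard genus formula for $X_0(N)$, since $\dim S_2(\Gamma_0(N))$ equals the genus $g$ of $X_0(N)$ and $12g = 12+\mu-3\nu_2-4\nu_3-6\nu_\infty$. The following is a minimal sketch of that formula, with $\mu$ the index of $\Gamma_0(N)$ in $SL_2(\mathbb Z)$ modulo $\pm 1$, $\nu_2,\nu_3$ the numbers of elliptic points, and $\nu_\infty$ the number of cusps. The function names are hypothetical helpers, and the formula is quoted rather than derived in this section.

```python
from math import gcd


def prime_factors(n):
    """Return the distinct prime factors of n in increasing order."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def euler_phi(n):
    """Euler's totient, by direct counting (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def genus_x0(N):
    """Genus of X_0(N), which equals dim S_2(Gamma_0(N))."""
    primes = prime_factors(N)
    mu = N
    for p in primes:
        mu = mu // p * (p + 1)  # index: N * prod (1 + 1/p)
    nu2 = 0 if N % 4 == 0 else 1
    nu3 = 0 if N % 9 == 0 else 1
    for p in primes:
        # Kronecker symbols (-1/p) and (-3/p), written out case by case.
        nu2 *= 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))
        nu3 *= 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))
    nu_inf = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    twelve_g = 12 + mu - 3 * nu2 - 4 * nu3 - 6 * nu_inf
    assert twelve_g % 12 == 0
    return twelve_g // 12


for N in (2, 11, 23, 37):
    print(N, genus_x0(N))
```

Level $2$ gives genus $0$, matching the vanishing used in the example, while level $11$ gives the first level with a nonzero weight $2$ cusp form for $\Gamma_0(N)$.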
This closes the course by returning to its central theme: arithmetic geometry becomes tractable when geometric objects, modular forms, and Galois representations are identified through their local data. The equality of Hecke eigenvalues and Frobenius traces is the computational face of a much broader reciprocity principle.
Contents
- Introduction
- What Does It Mean for a Modular Form to Be Arithmetic?
- Why Modular Curves Enter the Story
- How Hecke Operators Become Correspondences
- What A Galois Representation Attached To A Form Should Do
- Why Weight Two Is The First Complete Case
- How Elliptic Curves Fit Into The Same Picture
- The Logical Shape Of The Course
- 1. Modular Curves and Level Structures
- Analytic Quotients of the Upper Half-Plane
- Moduli of Elliptic Curves with Level Structure
- Cusps and Compactification
- Maps Between Levels
- Genus and the First Nontrivial Example
- 2. Hecke Correspondences on Modular Curves
- From Operators to Correspondences
- Double Cosets and Isogenies
- Degeneracy Maps and the Operator $U_p$
- Atkin-Lehner Involutions and Oldforms
- Why Correspondences Matter for Galois Representations
- 3. Jacobians and the Hecke Algebra
- Divisor Classes on the Jacobian
- Hecke Operators on Geometry and Cohomology
- Eichler-Shimura on the Jacobian
- The Integral Hecke Algebra
- From Hecke Ideals to Abelian Variety Quotients
- 4. Eichler-Shimura Theory
- Weight Two Cusp Forms as Differentials
- Betti Cohomology and Periods
- The Eichler-Shimura Isomorphism
- Modular Symbols in Computation
- 5. $\ell$-adic Tate Modules and Galois Representations
- Torsion Points and Tate Modules
- Galois Action on Tate Modules
- Frobenius and Good Reduction
- Modular Jacobians and Hecke-Stable Quotients
- Isogenies and Semisimplicity
- 6. Galois Representations Attached to Weight 2 Newforms
- Newforms, Coefficient Fields, and Primes Above $l$
- Constructing the Representation $\rho_{f,\lambda}$
- Frobenius Traces, Determinants, and Ramification
- Uniqueness from Chebotarev
- 7. Higher Weight Forms and Deligne's Construction
- Why Higher Weight Requires Local Systems
- Deligne's Galois Representation Attached To A Weight k Eigenform
- Hodge-Tate Weights And The Weight Of A Modular Form
- Purity And Weil Bounds For Fourier Coefficients
- Functional Equations And The Motivic Viewpoint
- 8. Residual Representations and Congruences
- Reduction Modulo Primes of Coefficient Fields
- Hecke Congruences and Congruence Primes
- Eisenstein Ideals and Reducibility
- 9. Elliptic Curves, Modular Parametrizations, and L-functions
- Modularity as Equality of L-functions
- Modular Parametrizations and Optimal Quotients
- Conductors, Discriminants, and Reduction Types
- From Local Data to the Modular Form
- 10. The Modularity Theorem
- The Modularity Statement for Elliptic Curves
- Semistable Curves and the Wiles-Taylor Strategy
- Deformation Rings and Hecke Algebras
- From Modularity to Fermat and Langlands
- 11. Fermat's Last Theorem via Frey Curves
- From a Hypothetical Fermat Solution to the Frey Curve
- Serre, Ribet, and the Contradiction with Modularity
- What the Proof Uses and What It Does Not Prove About Explicit Solutions
- 12. Beyond the Course: Galois Representations and Langlands
- Compatible Systems and Motives Attached to Modular Forms
- Serre Modularity and the Fontaine-Mazur Conjecture
- Hilbert Modular Forms and Automorphic Representations
- Fermat's Last Theorem as a Modularity Argument
Modular Forms II: Galois Representations
Created by admin on 5/31/2026 | Last updated on 6/1/2026