Diophantine equations ask for integer or rational solutions to polynomial equations. This page is a broad survey: it begins with concrete problems such as Pythagorean triples and sums of two squares, then follows the same existence and classification questions through descent, Pell equations, quadratic forms, elliptic curves, and the modern proof of Fermat's Last Theorem.
The early chapters develop hands-on techniques in detail: primitive reduction, congruences, rational parametrization, factorization, and descent. The middle chapters deliberately introduce a small amount of algebraic language, such as unique factorization, local fields, quadratic forms, and norm equations, because those are the natural settings in which the elementary examples become systematic. Later chapters are explicitly panoramic: they use elliptic curves and modularity only as named landmarks showing where the earlier methods lead, not as machinery to be proved in this course. The formal definitions and worked examples begin in Chapter 1.
# 1. Integer Points, Rational Points, and Primitive Solutions
This chapter sets up the language used throughout the course: integer solutions, rational solutions, and the passage between them by scaling. The first examples already show the two recurring themes of Diophantine equations: algebraic equations define geometric objects, but arithmetic conditions decide which points on those objects are allowed. We begin with polynomial equations over $\mathbb Z$ and $\mathbb Q$, then introduce primitive solutions and the first congruence obstructions that rule out many equations before any parametrization is attempted.
## Polynomial Equations Over the Integers and Rationals
What does it mean to solve a polynomial equation arithmetically rather than geometrically? A polynomial equation defines a set of points over any ring or field where its variables may take values, but the answer changes sharply when the allowed values are restricted from $\mathbb R$ to $\mathbb Q$ or $\mathbb Z$. The notation $\mathbb Z[x_1,\dots,x_n]$ means the set of polynomials in variables $x_1,\dots,x_n$ with integer coefficients.
[definition: Diophantine Equation]
A Diophantine equation over a set $S$ is an equation
\begin{align*}
F(x_1,\dots,x_n)=0
\end{align*}
where $F \in \mathbb Z[x_1,\dots,x_n]$ and the unknown tuple $(x_1,\dots,x_n)$ is required to lie in $S^n$.
[/definition]
In this course the set $S$ is usually $\mathbb Z$, $\mathbb Q$, or $\mathbb N$. The same polynomial can be studied over several arithmetic domains. For instance, $x^2+y^2=z^2$ has many nonzero integer solutions, while $x^2+y^2=3z^2$ has no nonzero integer solutions. Over $\mathbb R$, both equations have abundant nonzero solutions, so the difference is arithmetic rather than geometric over the real numbers.
[example: Integer And Rational Solutions]
For integer solutions of $x^2+y^2=1$, the squares $x^2$ and $y^2$ are nonnegative integers whose sum is $1$, so one of them is $1$ and the other is $0$. Hence the integer solutions are exactly $(\pm1,0)$ and $(0,\pm1)$.
Now let $t\in\mathbb Q$ and take the rational line through $(-1,0)$ with slope $t$:
\begin{align*}
y=t(x+1).
\end{align*}
Substituting this into $x^2+y^2=1$ gives
\begin{align*}
x^2+t^2(x+1)^2&=1,\\
x^2+t^2(x^2+2x+1)&=1,\\
(1+t^2)x^2+2t^2x+t^2-1&=0.
\end{align*}
Since the line already passes through $(-1,0)$, $x=-1$ is one root. Factoring the quadratic makes the other root visible:
\begin{align*}
(1+t^2)x^2+2t^2x+t^2-1
&=(x+1)\bigl((1+t^2)x+t^2-1\bigr).
\end{align*}
Thus the second intersection has
\begin{align*}
X=\frac{1-t^2}{1+t^2}.
\end{align*}
Its $y$-coordinate is
\begin{align*}
Y=t(X+1)
=t\left(\frac{1-t^2}{1+t^2}+1\right)
=t\left(\frac{1-t^2+1+t^2}{1+t^2}\right)
=\frac{2t}{1+t^2}.
\end{align*}
Because $t\in\mathbb Q$ and $1+t^2\ne0$, both $X$ and $Y$ are rational. They really lie on the circle, since
\begin{align*}
X^2+Y^2
&=\left(\frac{1-t^2}{1+t^2}\right)^2+\left(\frac{2t}{1+t^2}\right)^2\\
&=\frac{(1-t^2)^2+4t^2}{(1+t^2)^2}\\
&=\frac{1-2t^2+t^4+4t^2}{(1+t^2)^2}\\
&=\frac{1+2t^2+t^4}{(1+t^2)^2}\\
&=\frac{(1+t^2)^2}{(1+t^2)^2}\\
&=1.
\end{align*}
Different slopes give different second intersection points because the slope from $(-1,0)$ to that point is $t$. Thus the circle has infinitely many rational solutions even though it has only four integer solutions, showing how rational points connect the geometric circle with the stricter arithmetic question over $\mathbb Z$.
[/example]
The affine viewpoint keeps the variables themselves as coordinates. The projective viewpoint records ratios and treats scalar multiples as representing the same point; this is suited to homogeneous equations.
To make that projective viewpoint precise, we need to know which equations respect scaling. The relevant condition is that every term has the same total degree, so multiplying all coordinates by one scalar changes the whole equation by one predictable factor.
[definition: Homogeneous Polynomial]
A polynomial $F\in \mathbb Z[x_1,\dots,x_n]$ is homogeneous of degree $d$ if every monomial appearing in $F$ has total degree $d$.
[/definition]
For a homogeneous equation $F(x_1,\dots,x_n)=0$, scaling a solution by $\lambda$ multiplies the value of $F$ by $\lambda^d$. Hence nonzero integer solutions naturally determine rational projective points.
The next formal step is to specify what is meant by keeping only the ratio of coordinates. This lets us speak about one point represented by many nonzero tuples, rather than repeatedly counting all scalar multiples as different solutions.
[definition: Rational Projective Point]
A rational projective point in $\mathbb P^{n-1}(\mathbb Q)$ is an equivalence class
\begin{align*}
[x_1:\dots:x_n]
\end{align*}
of nonzero tuples $(x_1,\dots,x_n)\in\mathbb Q^n\setminus\{0\}$, where two tuples are equivalent if one is a nonzero rational scalar multiple of the other.
[/definition]
Here $\mathbb A^n$ denotes affine $n$-space, which in this course can be read as the ordinary coordinate space of $n$-tuples over the field being used. This language will be used throughout the course for equations such as $x^2+y^2=z^2$, where the affine cone in $\mathbb A^3$ and the projective conic in $\mathbb P^2$ encode related but different questions.
## Primitive Solutions and Scaling
When a homogeneous equation has one integer solution, it often has infinitely many by multiplying all coordinates by the same integer. How do we separate the arithmetic content of the solution from this automatic scaling?
[definition: Primitive Integer Tuple]
An integer tuple $(x_1,\dots,x_n)\in\mathbb Z^n$ is primitive if
\begin{align*}
\gcd(x_1,\dots,x_n)=1.
\end{align*}
[/definition]
A primitive tuple is the integer representative of a rational projective point with all common factors removed. For homogeneous equations, this reduction loses no essential information about rational points.
The key issue is whether this representative is merely convenient or actually canonical. We need a precise statement saying that every rational projective point can be cleared to an integral tuple, then reduced to a primitive one, with only the unavoidable sign ambiguity left.
[quotetheorem:4479]
[citeproof:4479]
This theorem is the reason primitive solutions dominate the first part of the course. The nonzero condition in projective space matters: the zero tuple cannot represent a ratio of coordinates, and allowing it would destroy uniqueness because every scalar multiple of zero is still zero. Homogeneity is also essential. For a non-homogeneous equation, scaling an integer solution usually changes the equation, so primitive representatives need not preserve the solution set. Thus the theorem does not classify solutions by itself; it only says that, once a homogeneous equation has been moved to projective language, each rational point has a primitive integral representative unique up to sign.
[example: Reducing A Pythagorean Triple]
The triple $(6,8,10)$ is a solution of $x^2+y^2=z^2$ because
\begin{align*}
6^2+8^2&=36+64\\
&=100\\
&=10^2.
\end{align*}
It is not primitive, since
\begin{align*}
\gcd(6,8,10)=2.
\end{align*}
Dividing every coordinate by this common factor gives
\begin{align*}
\left(\frac{6}{2},\frac{8}{2},\frac{10}{2}\right)=(3,4,5),
\end{align*}
and this new triple is primitive because
\begin{align*}
\gcd(3,4,5)=1.
\end{align*}
It still satisfies the same equation:
\begin{align*}
3^2+4^2&=9+16\\
&=25\\
&=5^2.
\end{align*}
Conversely, if $k$ is a positive integer, then $(3k,4k,5k)$ is again a solution, since
\begin{align*}
(3k)^2+(4k)^2&=9k^2+16k^2\\
&=25k^2\\
&=(5k)^2.
\end{align*}
Thus $(3,4,5)$ is the primitive representative of this family, while triples such as $(6,8,10)=2(3,4,5)$ are obtained from it by scaling.
[/example]
Many reductions use the principle that coprimality passes through products in controlled ways. The obstruction is that a divisibility statement about one factor usually cannot be separated from the other factor without some certificate that the two factors share no prime divisor. Bezout's lemma supplies exactly such a certificate by converting a gcd condition into a linear combination equal to $1$.
[quotetheorem:4480]
[citeproof:4480]
Bezout's lemma turns a gcd condition into an equation, which is often easier to multiply, reduce modulo primes, or combine with divisibility assumptions. The hypothesis that $a$ and $b$ are not both zero is necessary because $\gcd(0,0)$ is not defined in the usual positive-divisor convention, and there is no least positive integer combination to begin the proof. The lemma also does not say that the coefficients $u,v$ are unique; for example, once $ua+vb=1$ is known, adding $b$ to $u$ and subtracting $a$ from $v$ gives another representation. Its role is existence, not canonical choice, and this is exactly what is needed for divisibility arguments.
The next obstruction is subtler than ordinary divisibility. If a product is a square, its prime exponents are even in total, but that does not by itself force each factor to have even exponents. A coprimality condition is what prevents the same prime from contributing to both factors, so it allows the square condition to be separated factor by factor.
[quotetheorem:4481]
[citeproof:4481]
This lemma is used repeatedly when a product is known to be a square. The coprimality hypothesis cannot be dropped: $2\cdot2=4$ is a square, but neither factor is a square up to sign. The divisibility conclusion has the same limitation; from $2\mid 2c$ one cannot conclude $2\mid c$. Thus the lemma is a permission to split square factors only after verifying that the factors share no prime divisors, a check that will appear in the parametrisation of Pythagorean triples.
## Parity in Pythagorean Triples
Before parametrising all Pythagorean triples, what can congruences already tell us about the possible parities of $x,y,z$? The equation $x^2+y^2=z^2$ is rigid modulo $4$ because every square is congruent to $0$ or $1$ modulo $4$.
[quotetheorem:4482]
[citeproof:4482]
This parity result is the first structural fact about primitive triples, but its strength depends on primitivity. Nonprimitive triples such as $(6,8,10)$ still satisfy the equation, yet their common factor obscures the parity pattern inherited from the primitive triple $(3,4,5)$. The theorem also does not give a parametrization; it only reduces the possible parity cases to the one compatible with primitivity. In Chapter 2 this reduction justifies writing the even leg as $2mn$ and the odd leg as a difference of squares.
[example: Applying The Parity Classification]
Suppose $(x,y,z)$ is a primitive Pythagorean triple and $x$ is odd. By *Parity Classification For Primitive Pythagorean Triples*, $y$ is even and $z$ is odd. Since $z$ and $x$ are both odd, their sum and difference are even, so
\begin{align*}
A=\frac{z-x}{2},\qquad B=\frac{z+x}{2}
\end{align*}
are integers. The Pythagorean equation gives
\begin{align*}
y^2&=z^2-x^2\\
&=(z-x)(z+x)\\
&=(2A)(2B)\\
&=4AB.
\end{align*}
Because $y$ is even, write $y=2w$ with $w\in\mathbb Z$. Then
\begin{align*}
4w^2&=4AB,
\end{align*}
and dividing by $4$ gives
\begin{align*}
w^2=AB.
\end{align*}
The two factors $A$ and $B$ are coprime. Indeed, if $d$ divides both $A$ and $B$, then $d$ divides
\begin{align*}
A+B=\frac{z-x}{2}+\frac{z+x}{2}=z
\end{align*}
and also
\begin{align*}
B-A=\frac{z+x}{2}-\frac{z-x}{2}=x.
\end{align*}
Thus $d$ divides both $x$ and $z$. Since
\begin{align*}
y^2=z^2-x^2,
\end{align*}
any common divisor of $x$ and $z$ also divides $y$, so primitivity forces $d=1$. Therefore
\begin{align*}
\gcd\left(\frac{z-x}{2},\frac{z+x}{2}\right)=1.
\end{align*}
This rewrites the even leg as a product of two coprime factors whose product is a square, which is exactly the point at which the coprime square-factor argument enters Euclid's formula.
[/example]
## Congruence Obstructions
How can we prove that an equation has no integer solutions without solving it? A congruence obstruction shows that any hypothetical integer solution would produce a forbidden residue class modulo some integer.
The notation $\mathbb Z/m\mathbb Z$ denotes the set of residue classes modulo $m$; thus $(\mathbb Z/m\mathbb Z)^n$ is the set of $n$-tuples of such residue classes.
[definition: Congruence Obstruction]
A congruence obstruction to an equation $F(x_1,\dots,x_n)=0$ over $\mathbb Z$ is a modulus $m\ge2$ such that there is no tuple $(r_1,\dots,r_n)\in(\mathbb Z/m\mathbb Z)^n$ with
\begin{align*}
F(r_1,\dots,r_n)\equiv0\pmod m
\end{align*}
that can occur as the reduction modulo $m$ of an integer tuple allowed by the original problem.
[/definition]
The most useful first moduli are $4$, $8$, and odd primes. Squares modulo $4$ are $0,1$; squares modulo $8$ are $0,1,4$; and modulo an odd prime $p$, the nonzero square classes form a subgroup of $(\mathbb Z/p\mathbb Z)^\times$ of index $2$.
[example: No Nonzero Solutions To x Squared Plus y Squared Equals Three z Squared]
We show that
\begin{align*}
x^2+y^2=3z^2
\end{align*}
has no nonzero integer solution. Suppose, for contradiction, that $(x,y,z)\in\mathbb Z^3$ is a nonzero solution for which
\begin{align*}
|x|+|y|+|z|
\end{align*}
is as small as possible among all nonzero solutions.
Every integer square is congruent to $0$ or $1$ modulo $4$: if $n$ is even, then $n=2k$ and
\begin{align*}
n^2=(2k)^2=4k^2\equiv0\pmod4,
\end{align*}
while if $n$ is odd, then $n=2k+1$ and
\begin{align*}
n^2=(2k+1)^2=4k^2+4k+1=4k(k+1)+1\equiv1\pmod4.
\end{align*}
If $z$ were odd, then
\begin{align*}
3z^2\equiv3\cdot1\equiv3\pmod4.
\end{align*}
But $x^2$ and $y^2$ are each congruent to $0$ or $1$ modulo $4$, so
\begin{align*}
x^2+y^2\equiv0,\ 1,\ \text{or }2\pmod4,
\end{align*}
never $3\pmod4$. Hence $z$ is even.
Write $z=2z_1$ with $z_1\in\mathbb Z$. Then
\begin{align*}
x^2+y^2=3z^2=3(2z_1)^2=12z_1^2,
\end{align*}
so
\begin{align*}
x^2+y^2\equiv0\pmod4.
\end{align*}
Since each of $x^2$ and $y^2$ is congruent to $0$ or $1$ modulo $4$, the only way their sum is $0$ modulo $4$ is
\begin{align*}
x^2\equiv0\pmod4,\qquad y^2\equiv0\pmod4.
\end{align*}
Thus $x$ and $y$ are even. Write
\begin{align*}
x=2x_1,\qquad y=2y_1,\qquad z=2z_1
\end{align*}
with $x_1,y_1,z_1\in\mathbb Z$. Substituting into the original equation gives
\begin{align*}
(2x_1)^2+(2y_1)^2&=3(2z_1)^2,\\
4x_1^2+4y_1^2&=12z_1^2,\\
4(x_1^2+y_1^2)&=4(3z_1^2),
\end{align*}
and dividing by $4$ gives
\begin{align*}
x_1^2+y_1^2=3z_1^2.
\end{align*}
The new triple is still nonzero, because otherwise $(x,y,z)=(0,0,0)$. Also,
\begin{align*}
|x_1|+|y_1|+|z_1|
=\frac{|x|+|y|+|z|}{2}
<|x|+|y|+|z|.
\end{align*}
This contradicts the minimal choice of $(x,y,z)$. Therefore the equation has no nonzero integer solution.
[/example]
The descent step in this example illustrates a common pattern: a congruence may first prove that every solution is nonprimitive, and homogeneity then lets us divide by a common factor to get a smaller solution of the same equation. The useful question is therefore not just whether one residue class is impossible, but whether the congruence forces a common divisor in every alleged solution. When that happens for a homogeneous equation, the obstruction becomes stronger than a single contradiction: it creates an infinite descent.
[quotetheorem:4483]
[citeproof:4483]
The first part of the test is only a parity obstruction: it rules out the odd-odd case, but it says nothing about mixed parity, where triples such as $(3,4,5)$ do exist. The second part is stronger because it forces every solution of $a^2+b^2=3c^2$ to be nonprimitive, and then homogeneity turns that into an infinite descent for nonzero solutions. The test is also modulus-dependent; failure modulo $4$ proves impossibility, but passing modulo $4$ does not prove that an integer solution exists.
Modulo $8$ refines parity because odd squares are all congruent to $1\pmod8$. This is useful when modulo $4$ records only that a number is odd.
[example: Odd Squares Modulo Eight]
Let $n$ be an odd integer. Then there is an integer $k$ such that
\begin{align*}
n=2k+1.
\end{align*}
Squaring and expanding gives
\begin{align*}
n^2
&=(2k+1)^2\\
&=(2k)^2+2(2k)(1)+1^2\\
&=4k^2+4k+1\\
&=4k(k+1)+1.
\end{align*}
The integers $k$ and $k+1$ have opposite parity, so one of them is even. Hence there is an integer $m$ such that
\begin{align*}
k(k+1)=2m.
\end{align*}
Substituting this into the previous expression gives
\begin{align*}
n^2
&=4k(k+1)+1\\
&=4(2m)+1\\
&=8m+1.
\end{align*}
Therefore
\begin{align*}
n^2\equiv1\pmod8.
\end{align*}
So every odd square has residue $1$ modulo $8$, and any equation that would force an odd square to be congruent to $3$, $5$, or $7$ modulo $8$ has no solution in that parity class.
[/example]
Odd prime moduli introduce quadratic residues. We will develop the theory of residues later, but the elementary idea already appears here: reducing an equation modulo $p$ can force a number to be a square class that it is not.
[example: A Modulo Three Obstruction]
Squares modulo $3$ are $0$ and $1$: if $n\equiv0,1,$ or $2\pmod3$, then
\begin{align*}
n^2&\equiv0^2,\ 1^2,\ \text{or }2^2\pmod3\\
&\equiv0,\ 1,\ \text{or }4\pmod3\\
&\equiv0\ \text{or }1\pmod3.
\end{align*}
Suppose
\begin{align*}
x^2+y^2=3z^2.
\end{align*}
Reducing modulo $3$ gives
\begin{align*}
x^2+y^2\equiv 3z^2\equiv0\pmod3.
\end{align*}
Since each of $x^2$ and $y^2$ is congruent to $0$ or $1$ modulo $3$, the possible sums are
\begin{align*}
0+0&\equiv0\pmod3,\\
0+1&\equiv1\pmod3,\\
1+0&\equiv1\pmod3,\\
1+1&\equiv2\pmod3.
\end{align*}
Thus $x^2\equiv0\pmod3$ and $y^2\equiv0\pmod3$, so $3\mid x$ and $3\mid y$. Write
\begin{align*}
x=3x_1,\qquad y=3y_1
\end{align*}
with $x_1,y_1\in\mathbb Z$. Substituting into the equation gives
\begin{align*}
(3x_1)^2+(3y_1)^2&=3z^2,\\
9x_1^2+9y_1^2&=3z^2,\\
3x_1^2+3y_1^2&=z^2.
\end{align*}
Hence
\begin{align*}
z^2\equiv0\pmod3,
\end{align*}
and the square-residue calculation above forces $3\mid z$. Writing $z=3z_1$ gives
\begin{align*}
(3x_1)^2+(3y_1)^2&=3(3z_1)^2,\\
9x_1^2+9y_1^2&=27z_1^2,\\
x_1^2+y_1^2&=3z_1^2.
\end{align*}
Thus any integer solution is divisible by $3$ in all three coordinates and scales down to another solution, giving a modulo $3$ descent route for the same impossibility result.
[/example]
The chapter's main lesson is methodological. For homogeneous Diophantine equations, first pass to primitive representatives, then use gcd and parity arguments to constrain their shape, and only then attempt a full parametrization. Chapter 2 applies this program to $x^2+y^2=z^2$, where these preliminary reductions leave exactly enough structure for a complete classification.
With primitive solutions and parity constraints in hand, the next natural test case is the most familiar quadratic equation in three variables. Pythagorean triples show how a homogeneous equation can be reduced to a complete classification once the right arithmetic structure is exposed.
# 2. Pythagorean Triples
This chapter answers the first substantial classification problem of the course: which integer right triangles can occur, and how can they be listed without repetition? Chapter 1 reduced many homogeneous Diophantine questions to primitive solutions and parity constraints, especially through the primitive representative theorem and the parity classification for primitive triples. We now apply those reductions to the equation $x^2+y^2=z^2$, where a complete answer is possible and already displays two recurring methods: factorisation in $\mathbb Z$ and rational parametrisation of a conic.
## Primitive Right Triangles
The starting problem is to separate the essential triples from their scalar multiples. If $(x,y,z)$ is a solution of $x^2+y^2=z^2$, then so is $(kx,ky,kz)$ for every $k \in \mathbb N$, so classification begins by removing this scaling ambiguity.
[definition: Pythagorean Triple]
A Pythagorean triple is a triple $(x,y,z) \in \mathbb Z^3$ such that
\begin{align*}
x^2+y^2=z^2.
\end{align*}
It is positive if $x,y,z \in \mathbb N$.
[/definition]
The classification becomes manageable only after separating triples that differ by an overall scaling factor. Without this separation, the single triangle shape represented by $(3,4,5)$ would also appear as $(6,8,10)$, $(9,12,15)$, and infinitely many further multiples.
To count the reduced triangle rather than all of its enlargements, we need a condition that detects when no common scaling remains. That condition is primitivity: the side lengths share no common divisor, so the triple represents the essential arithmetic pattern rather than a multiple of a smaller one.
[definition: Primitive Pythagorean Triple]
A primitive Pythagorean triple is a positive Pythagorean triple $(x,y,z)$ such that $\gcd(x,y,z)=1$.
[/definition]
For Pythagorean triples, primitivity can be checked using any two entries. If a prime divides two of $x,y,z$, then the equation forces it to divide the third. Thus $\gcd(x,y,z)=1$ is equivalent to pairwise coprimality of $x$, $y$, and $z$.
[example: Scaling A Primitive Triple]
The triple $(3,4,5)$ is a positive Pythagorean triple because
\begin{align*}
3^2+4^2&=9+16\\
&=25\\
&=5^2.
\end{align*}
It is primitive because
\begin{align*}
\gcd(3,4)&=1,\\
\gcd(1,5)&=1,
\end{align*}
so $\gcd(3,4,5)=1$. Multiplying each entry by $7$ gives $(21,28,35)$, and this new triple still satisfies the Pythagorean equation:
\begin{align*}
21^2+28^2&=(7\cdot 3)^2+(7\cdot 4)^2\\
&=7^2\cdot 3^2+7^2\cdot 4^2\\
&=7^2(3^2+4^2)\\
&=7^2\cdot 5^2\\
&=(7\cdot 5)^2\\
&=35^2.
\end{align*}
However, $7$ divides $21$, $28$, and $35$, so $\gcd(21,28,35)\ge 7$ and $(21,28,35)$ is not primitive. Thus scaling preserves the equation but introduces a common factor, which is why primitive triples are the irreducible objects in the classification.
[/example]
The parity restrictions from Chapter 1 now become decisive. Squares are congruent to $0$ or $1$ modulo $4$, so a primitive triple cannot have both legs odd and cannot have both legs even. Therefore exactly one of $x,y$ is even, while $z$ is odd.
[quotetheorem:4484]
[citeproof:4484]
The primitivity hypothesis is essential here. Nonprimitive triples such as $(6,8,10)$ can have both legs even, so the theorem is not a parity classification of all Pythagorean triples. What it gives is the precise parity pattern after all common scaling has been removed, and that pattern is what makes the factorisation argument work. After swapping $x$ and $y$ if needed, we may assume the even leg is $y$. The equation becomes
\begin{align*}
y^2=z^2-x^2=(z-x)(z+x),
\end{align*}
and the two factors on the right are positive even integers. The classification comes from understanding when their product is a square.
## Euclid's Formula
The main question is now more precise: once the even leg is chosen, can every primitive triple be recovered from two smaller coprime integers? The answer is Euclid's formula. It gives both a construction and a uniqueness statement, up to exchanging the two legs.
[quotetheorem:4485]
[citeproof:4485]
This theorem is a full classification of primitive triples with a chosen even leg, not merely a source of examples. It does not classify nonprimitive triples directly: those are obtained afterwards by multiplying primitive triples by a common positive integer. The hypotheses on $m,n$ cannot be suppressed; for instance, $(m,n)=(3,1)$ gives $(8,6,10)$ because the parameters have the same parity, while $(m,n)=(4,2)$ gives $(12,16,20)$ because the parameters are not coprime. To list primitive triples, choose coprime positive integers $m>n$ of opposite parity and apply the formula. To recognise a given primitive triple, identify the even leg and solve for $m^2$ and $n^2$ from $z+x$ and $z-x$.
[example: Generating The First Triples]
For $(m,n)=(2,1)$, the parameters satisfy $2>1$, $\gcd(2,1)=1$, and $2$ and $1$ have opposite parity. Applying Euclid's formulas gives
\begin{align*}
x&=m^2-n^2\\
&=2^2-1^2\\
&=4-1\\
&=3,\\
y&=2mn\\
&=2\cdot 2\cdot 1\\
&=4,\\
z&=m^2+n^2\\
&=2^2+1^2\\
&=4+1\\
&=5.
\end{align*}
Thus $(m,n)=(2,1)$ produces $(3,4,5)$.
For $(m,n)=(3,2)$, we again have $3>2$, $\gcd(3,2)=1$, and the two parameters have opposite parity. The same formulas give
\begin{align*}
x&=3^2-2^2\\
&=9-4\\
&=5,\\
y&=2\cdot 3\cdot 2\\
&=12,\\
z&=3^2+2^2\\
&=9+4\\
&=13.
\end{align*}
So this pair produces $(5,12,13)$.
For $(m,n)=(5,2)$, the conditions also hold: $5>2$, $\gcd(5,2)=1$, and $5$ is odd while $2$ is even. The entries are
\begin{align*}
x&=5^2-2^2\\
&=25-4\\
&=21,\\
y&=2\cdot 5\cdot 2\\
&=20,\\
z&=5^2+2^2\\
&=25+4\\
&=29.
\end{align*}
This gives $(21,20,29)$, which is usually written as $(20,21,29)$ after exchanging the two legs. In each case, coprimality and opposite parity are the conditions that keep Euclid's formula from producing a scaled or repeated nonprimitive triple.
[/example]
The parametrization is most useful when it can be read backwards from a concrete triple. The next example shows how the parameter is recovered rather than chosen in advance.
[example: Recovering Parameters From A Triple]
Consider the primitive triple $(20,21,29)$. Since the even leg is $20$, we take
\begin{align*}
y&=20, & x&=21, & z&=29.
\end{align*}
The recovery formulas give
\begin{align*}
m^2&=\frac{z+x}{2}\\
&=\frac{29+21}{2}\\
&=\frac{50}{2}\\
&=25\\
&=5^2,
\end{align*}
and
\begin{align*}
n^2&=\frac{z-x}{2}\\
&=\frac{29-21}{2}\\
&=\frac{8}{2}\\
&=4\\
&=2^2.
\end{align*}
Because $m,n \in \mathbb N$, this gives $m=5$ and $n=2$. These parameters satisfy $5>2$, $\gcd(5,2)=1$, and they have opposite parity. Substituting them back into Euclid's formulas recovers the triple:
\begin{align*}
m^2-n^2&=5^2-2^2\\
&=25-4\\
&=21\\
&=x,\\
2mn&=2\cdot 5\cdot 2\\
&=20\\
&=y,\\
m^2+n^2&=5^2+2^2\\
&=25+4\\
&=29\\
&=z.
\end{align*}
Thus $(20,21,29)$ is obtained from the parameter pair $(m,n)=(5,2)$, with the even leg assigned to $y=2mn$.
[/example]
The hypotheses in Euclid's formula are necessary. If $m,n$ have a common divisor, then all three entries have a common divisor. If $m,n$ are both odd, then $m^2-n^2$ and $m^2+n^2$ are even, so the resulting triple is not primitive.
[example: Why The Conditions Matter]
For $(m,n)=(3,1)$, the Euclid expressions produce
\begin{align*}
m^2-n^2&=3^2-1^2\\
&=9-1\\
&=8,\\
2mn&=2\cdot 3\cdot 1\\
&=6,\\
m^2+n^2&=3^2+1^2\\
&=9+1\\
&=10.
\end{align*}
Thus the resulting triple is $(8,6,10)$. It is a Pythagorean triple because
\begin{align*}
8^2+6^2&=64+36\\
&=100\\
&=10^2,
\end{align*}
but it is not primitive:
\begin{align*}
8&=2\cdot 4,\\
6&=2\cdot 3,\\
10&=2\cdot 5,
\end{align*}
so $2$ divides all three entries. Equivalently,
\begin{align*}
(8,6,10)&=2(4,3,5).
\end{align*}
The failure comes from parity: $3$ and $1$ are both odd, so $m$ and $n$ do not have opposite parity.
For $(m,n)=(4,2)$, the entries are
\begin{align*}
m^2-n^2&=4^2-2^2\\
&=16-4\\
&=12,\\
2mn&=2\cdot 4\cdot 2\\
&=16,\\
m^2+n^2&=4^2+2^2\\
&=16+4\\
&=20.
\end{align*}
Thus the resulting triple is $(12,16,20)$. It satisfies the equation since
\begin{align*}
12^2+16^2&=144+256\\
&=400\\
&=20^2,
\end{align*}
but it is not primitive because
\begin{align*}
12&=4\cdot 3,\\
16&=4\cdot 4,\\
20&=4\cdot 5,
\end{align*}
so $4$ divides all three entries. This time the obstruction is coprimality:
\begin{align*}
\gcd(4,2)=2\ne 1.
\end{align*}
These two computations show separately why Euclid's classification requires opposite parity and coprime parameters.
[/example]
## Rational Points On The Unit Circle
The next problem is to reinterpret the same classification geometrically. Dividing $x^2+y^2=z^2$ by $z^2$ turns a positive integer solution into a rational point on the unit circle:
\begin{align*}
X^2+Y^2=1, \qquad X=\frac{x}{z}, \qquad Y=\frac{y}{z}.
\end{align*}
Conversely, a rational point $(X,Y)$ on the unit circle can be cleared of denominators to produce an integer Pythagorean triple.
[definition: Rational Point On The Unit Circle]
A rational point on the unit circle is a point $(X,Y) \in \mathbb Q^2$ satisfying
\begin{align*}
X^2+Y^2=1.
\end{align*}
[/definition]
Listing rational points directly is hard because the equation $X^2+Y^2=1$ couples two rational unknowns by a quadratic constraint. The useful idea is to replace the two-coordinate search by the single rational parameter given by a slope. The circle contains the rational point $(-1,0)$. Every line of rational slope through this point has equation $Y=t(X+1)$ with $t \in \mathbb Q$, and its second intersection with the circle gives a rational point. This is the slope method, and it is the prototype for parametrising rational points on conics.
[illustration:unit-circle-slope-parametrization]
The picture suggests a complete parametrisation, but a drawing alone does not show that every rational point has been found or that the coordinates produced from a rational slope are themselves rational. The obstruction is the chosen base point: a finite slope line through $(-1,0)$ can meet the circle again, but the base point itself is not recovered as a second intersection from a finite slope. The needed result turns this geometric construction into exact formulas and records the single exceptional point.
[quotetheorem:4486]
[citeproof:4486]
The exceptional point $(-1,0)$ matters because it is the chosen base point: the line through $(-1,0)$ and itself has no finite slope parameter in this description. For every other rational point, the slope is rational and recovers the point uniquely. The parametrisation is the rational version of Euclid's formula, but it does not by itself assert that the integer triple obtained after clearing denominators is primitive; the fraction $t=n/m$ must first be written in lowest terms, and the parity of $m,n$ controls whether a common factor remains. If $t=n/m$ with $m,n \in \mathbb N$, then
\begin{align*}
X&=\frac{m^2-n^2}{m^2+n^2}, & Y&=\frac{2mn}{m^2+n^2}.
\end{align*}
Thus the numerator and denominator of a rational point on the circle encode the same data as a primitive Pythagorean triple after the same coprimality and parity reductions as before. This geometric viewpoint will reappear for other conics, while later higher-degree equations show that such a single rational parametrisation need not exist.
[example: Slope Corresponding To A Triple]
For the triple $(5,12,13)$ with even leg $12$, divide the two legs by the hypotenuse to get the rational point
\begin{align*}
(X,Y)&=\left(\frac{5}{13},\frac{12}{13}\right).
\end{align*}
It lies on the unit circle because
\begin{align*}
\left(\frac{5}{13}\right)^2+\left(\frac{12}{13}\right)^2
&=\frac{25}{169}+\frac{144}{169}\\
&=\frac{169}{169}\\
&=1.
\end{align*}
The line through $(-1,0)$ and this point has slope
\begin{align*}
t&=\frac{Y-0}{X-(-1)}\\
&=\frac{12/13}{5/13+1}\\
&=\frac{12/13}{5/13+13/13}\\
&=\frac{12/13}{18/13}\\
&=\frac{12}{13}\cdot \frac{13}{18}\\
&=\frac{12}{18}\\
&=\frac{2}{3}.
\end{align*}
For the Euclid parameters $(m,n)=(3,2)$, we have
\begin{align*}
\frac{n}{m}&=\frac{2}{3},
\end{align*}
so the slope parameter is exactly $t=n/m$. Thus the geometric slope parametrisation records the same parameter ratio that produces the triple $(5,12,13)$.
[/example]
The same circle parametrization also explains how rational points encode integer triples after denominators are cleared. The next example makes that conversion explicit.
[example: A Rational Point Gives An Integer Triple]
Take $t=3/4$ in the rational parametrisation formulas. Since
\begin{align*}
t^2&=\left(\frac{3}{4}\right)^2\\
&=\frac{9}{16},
\end{align*}
the first coordinate is
\begin{align*}
X&=\frac{1-t^2}{1+t^2}\\
&=\frac{1-\frac{9}{16}}{1+\frac{9}{16}}\\
&=\frac{\frac{16}{16}-\frac{9}{16}}{\frac{16}{16}+\frac{9}{16}}\\
&=\frac{\frac{7}{16}}{\frac{25}{16}}\\
&=\frac{7}{16}\cdot \frac{16}{25}\\
&=\frac{7}{25}.
\end{align*}
The second coordinate is
\begin{align*}
Y&=\frac{2t}{1+t^2}\\
&=\frac{2\cdot \frac{3}{4}}{1+\frac{9}{16}}\\
&=\frac{\frac{6}{4}}{\frac{16}{16}+\frac{9}{16}}\\
&=\frac{\frac{3}{2}}{\frac{25}{16}}\\
&=\frac{3}{2}\cdot \frac{16}{25}\\
&=\frac{48}{50}\\
&=\frac{24}{25}.
\end{align*}
Thus the rational point is
\begin{align*}
(X,Y)&=\left(\frac{7}{25},\frac{24}{25}\right).
\end{align*}
Clearing denominators gives the integer triple $(7,24,25)$, and it satisfies the Pythagorean equation because
\begin{align*}
7^2+24^2&=49+576\\
&=625\\
&=25^2.
\end{align*}
It is primitive since
\begin{align*}
\gcd(7,24)&=1,\\
\gcd(1,25)&=1,
\end{align*}
so $\gcd(7,24,25)=1$. Finally, this is exactly Euclid's formula with $(m,n)=(4,3)$:
\begin{align*}
m^2-n^2&=4^2-3^2\\
&=16-9\\
&=7,\\
2mn&=2\cdot 4\cdot 3\\
&=24,\\
m^2+n^2&=4^2+3^2\\
&=16+9\\
&=25.
\end{align*}
So the slope $t=3/4$ produces the same primitive triple as the parameter pair $(m,n)=(4,3)$.
[/example]
The slope method reframes the arithmetic classification as a geometric statement: rational lines through one known rational point account for all rational points on the conic. Chapter 3 turns this into the general conic parametrization theorem, while Chapter 8 explains why higher-degree equations need not be captured by a single elementary parametrization.
The classification of Pythagorean triples is the first instance where a rational point leads to a full parametrization of all others. That geometric idea extends beyond the circle, and the next chapter develops it for general conics by turning a single rational point into a systematic description of all rational points.
# 3. Conics and Parametrization
This chapter moves from the special case of Pythagorean triples to the general geometry of quadratic equations in two variables. The guiding question is: once a single rational point on a conic is known, can all the other rational points be found by a uniform procedure? The answer is the slope method, which turns lines through the known point into a rational parametrization of the conic.
## Lines Through a Known Rational Point
Suppose a plane quadratic equation has one rational solution. The problem is to decide whether this isolated datum is enough to describe every rational solution, rather than searching for solutions one by one.
[definition: Affine Plane Conic]
An affine plane conic over $\mathbb Q$ is the zero set in $\mathbb A^2$ of a polynomial
\begin{align*}
F(X,Y)=aX^2+bXY+cY^2+dX+eY+f
\end{align*}
with $a,b,c,d,e,f \in \mathbb Q$, not all quadratic coefficients zero.
[/definition]
The useful feature of a conic is that a line intersects it in degree two. If one intersection point is already known, the second point can be recovered by solving a linear equation after the known root has been factored out.
[example: Unit Circle From The Point Minus One Zero]
Consider the circle $X^2+Y^2=1$ and the rational point $P=(-1,0)$. A line of rational slope $t \in \mathbb Q$ through $P$ has equation
\begin{align*}
Y=t(X+1).
\end{align*}
Substituting this expression for $Y$ into $X^2+Y^2=1$ gives
\begin{align*}
X^2+t^2(X+1)^2&=1,\\
X^2+t^2(X^2+2X+1)&=1,\\
(1+t^2)X^2+2t^2X+t^2-1&=0.
\end{align*}
Since $P=(-1,0)$ lies on both the line and the circle, $X=-1$ must be one root. To factor the quadratic, write
\begin{align*}
(1+t^2)X^2+2t^2X+t^2-1
&=(X+1)\bigl((1+t^2)X+(t^2-1)\bigr),
\end{align*}
because
\begin{align*}
(X+1)\bigl((1+t^2)X+(t^2-1)\bigr)
&=(1+t^2)X^2+(t^2-1)X+(1+t^2)X+(t^2-1)\\
&=(1+t^2)X^2+2t^2X+t^2-1.
\end{align*}
Thus the second intersection is found from
\begin{align*}
(1+t^2)X+(t^2-1)&=0,\\
(1+t^2)X&=1-t^2,\\
X&=\frac{1-t^2}{1+t^2}.
\end{align*}
Then the line equation gives
\begin{align*}
Y&=t\left(\frac{1-t^2}{1+t^2}+1\right)\\
&=t\left(\frac{1-t^2+1+t^2}{1+t^2}\right)\\
&=\frac{2t}{1+t^2}.
\end{align*}
For every $t \in \mathbb Q$, the denominator $1+t^2$ is nonzero, so this gives a rational point
\begin{align*}
\left(\frac{1-t^2}{1+t^2},\frac{2t}{1+t^2}\right)
\end{align*}
on the unit circle. This is the standard rational parametrization that later becomes Euclid's formula for primitive Pythagorean triples after clearing denominators.
[/example]
The same computation works for any nonsingular conic once the known rational point is fixed. We use only the elementary form of the idea here: draw every rational line through the known point, solve the resulting quadratic equation, and take the second intersection point.
With that language in place, the obstruction becomes precise. A rational line through a rational base point already accounts for one rational intersection with the conic. Since the total intersection has degree two, the remaining intersection is forced by a linear calculation in the slope, and this is what turns the family of lines into a rational parametrization.
[explanation: Rational Parametrization Of A Conic From One Point]
Let a nonsingular conic over $\mathbb Q$ have one rational point $P$. Lines through $P$ with rational slope meet the conic in exactly one further point, except for the tangent direction. Solving for that second intersection expresses the rational points of the conic by rational functions of the slope.
[/explanation]
This parametrization is the geometric form of a familiar algebraic trick: once a quadratic has one known root, the other root is controlled by the sum or product of the roots. The hypotheses are doing real work. If the base point is not rational, a rational slope line need not produce a rational second point; for example, the point $(\sqrt{2},0)$ on $X^2+Y^2=2$ is not a rational starting point for a rational parametrization over $\mathbb Q$. Nonsingularity also matters: a singular quadratic may split into two lines or have a double point, so the line-intersection argument no longer describes a smooth degree-two curve. Finally, the projective completion is not cosmetic, because vertical lines and points at infinity are exactly where the affine slope parameter alone can miss part of the geometry.
[example: Parametrizing X Squared Minus Two Y Squared Equals Z Squared]
Work projectively with
\begin{align*}
X^2-2Y^2=Z^2.
\end{align*}
The point $P=(1:0:1)$ is rational, and in the affine chart $Z=1$ it is the point $(1,0)$. For a rational slope $t \in \mathbb Q$, the line through $(1,0)$ has equation
\begin{align*}
Y=t(X-1).
\end{align*}
Substituting this into $X^2-2Y^2=1$ gives
\begin{align*}
X^2-2t^2(X-1)^2&=1,\\
X^2-2t^2(X^2-2X+1)&=1,\\
X^2-2t^2X^2+4t^2X-2t^2&=1,\\
(1-2t^2)X^2+4t^2X-2t^2-1&=0.
\end{align*}
Since $X=1$ is the known intersection point, factor the quadratic by writing
\begin{align*}
(1-2t^2)X^2+4t^2X-2t^2-1
&=(X-1)\bigl((1-2t^2)X+(1+2t^2)\bigr),
\end{align*}
because
\begin{align*}
(X-1)\bigl((1-2t^2)X+(1+2t^2)\bigr)
&=(1-2t^2)X^2+(1+2t^2)X\\
&\qquad -(1-2t^2)X-(1+2t^2)\\
&=(1-2t^2)X^2+4t^2X-2t^2-1.
\end{align*}
The second intersection therefore satisfies
\begin{align*}
(1-2t^2)X+(1+2t^2)&=0,\\
(1-2t^2)X&=-(1+2t^2),\\
X&=\frac{1+2t^2}{2t^2-1}.
\end{align*}
Then the line equation gives
\begin{align*}
Y&=t\left(\frac{1+2t^2}{2t^2-1}-1\right)\\
&=t\left(\frac{1+2t^2-(2t^2-1)}{2t^2-1}\right)\\
&=\frac{2t}{2t^2-1}.
\end{align*}
Thus, for rational $t$ with $2t^2 \ne 1$, the second affine intersection point is
\begin{align*}
\left(\frac{2t^2+1}{2t^2-1},\frac{2t}{2t^2-1}\right).
\end{align*}
This gives the rational points in the affine chart $Z=1$ obtained from nonvertical rational lines through $(1,0)$; the vertical tangent direction at the base point is the projective direction not represented by a finite slope $t$.
[/example]
## Homogenization And Projective Conics Over $\mathbb{Q}$
Affine equations can lose points at infinity, and the slope method naturally wants a parameter value $\infty$. The next problem is to set the conic in a space where finite points and points at infinity are treated by the same equation.
[definition: Homogenization Of A Polynomial]
Let $F(X,Y) \in \mathbb Q[X,Y]$ have total degree $d$. Its homogenization is the homogeneous polynomial
\begin{align*}
F^h(X,Y,Z)=Z^d F\left(\frac{X}{Z},\frac{Y}{Z}\right) \in \mathbb Q[X,Y,Z].
\end{align*}
[/definition]
After homogenization, the affine equation $F(X,Y)=0$ becomes $F^h(X,Y,Z)=0$. The original affine plane is recovered by setting $Z=1$, while $Z=0$ records the directions in which the affine conic goes to infinity.
We now need a name for the projective object produced by a homogeneous quadratic equation. This definition packages both the equation and the rule that proportional nonzero triples represent the same point, which is what makes points at infinity and ordinary affine points part of one setting.
[definition: Projective Plane Conic]
A projective plane conic over $\mathbb Q$ is the zero set in $\mathbb P^2$ of a nonzero homogeneous quadratic polynomial
\begin{align*}
Q(X,Y,Z) \in \mathbb Q[X,Y,Z].
\end{align*}
Its rational points are triples $(x:y:z)$ with $x,y,z \in \mathbb Q$, not all zero, satisfying $Q(x,y,z)=0$, modulo multiplication by a nonzero rational scalar.
[/definition]
Projective language is especially suited to Diophantine equations because integer triples and rational projective points differ mostly by scaling. This is why primitive integer solutions are often the correct integral representatives of rational projective solutions.
For Pythagorean triples, this relationship must be stated carefully because the same projective point has many integral representatives. The obstruction is scaling: $(3,4,5)$ and $(6,8,10)$ give the same point of $X^2+Y^2=Z^2$ in projective space, but only one should count as the reduced integer representative. What has to be proved is that imposing primitivity removes exactly this ambiguity and loses no rational projective point with integer coordinates after clearing denominators.
[quotetheorem:4488]
[citeproof:4488]
The theorem explains why the projective conic is the natural home for Pythagorean triples. The arithmetic condition of being primitive is not extra geometry; it is a choice of integral representative for a rational point. The primitivity condition is necessary for uniqueness: $(3,4,5)$ and $(6,8,10)$ determine the same projective point but should not be counted as different rational points. The theorem does not classify which primitive triples arise from a particular choice of parameters; that refinement comes from applying the slope parametrization and then imposing coprimality and parity conditions. This prepares the transition from projective rational points back to explicit integer formulae.
[example: Recovering Euclid Parameters From A Slope]
On $X^2+Y^2=Z^2$, use the rational point $(-1:0:1)$ and work in the affine chart $Z=1$, where the equation is $X^2+Y^2=1$. With the convention that the line through $(-1,0)$ has equation $Y=t(X+1)$, take a rational slope $t=n/m$ with $m,n \in \mathbb Z$ and $m \ne 0$. From the slope parametrization already computed from this base point,
\begin{align*}
X=\frac{1-t^2}{1+t^2}, \qquad Y=\frac{2t}{1+t^2}, \qquad Z=1.
\end{align*}
Substituting $t=n/m$ gives
\begin{align*}
X
&=\frac{1-\left(\frac{n}{m}\right)^2}{1+\left(\frac{n}{m}\right)^2}\\
&=\frac{\frac{m^2}{m^2}-\frac{n^2}{m^2}}{\frac{m^2}{m^2}+\frac{n^2}{m^2}}\\
&=\frac{\frac{m^2-n^2}{m^2}}{\frac{m^2+n^2}{m^2}}\\
&=\frac{m^2-n^2}{m^2+n^2},
\end{align*}
and
\begin{align*}
Y
&=\frac{2\left(\frac{n}{m}\right)}{1+\left(\frac{n}{m}\right)^2}\\
&=\frac{\frac{2n}{m}}{\frac{m^2+n^2}{m^2}}\\
&=\frac{2n}{m}\cdot \frac{m^2}{m^2+n^2}\\
&=\frac{2mn}{m^2+n^2}.
\end{align*}
Thus the corresponding rational projective point is
\begin{align*}
\left(\frac{m^2-n^2}{m^2+n^2}:\frac{2mn}{m^2+n^2}:1\right).
\end{align*}
Multiplying all three homogeneous coordinates by the nonzero scalar $m^2+n^2$ gives the same projective point represented by
\begin{align*}
(m^2-n^2:2mn:m^2+n^2).
\end{align*}
The identity
\begin{align*}
(m^2-n^2)^2+(2mn)^2
&=m^4-2m^2n^2+n^4+4m^2n^2\\
&=m^4+2m^2n^2+n^4\\
&=(m^2+n^2)^2
\end{align*}
confirms that this is a Pythagorean triple. Imposing $\gcd(m,n)=1$ and opposite parity is the later integrality condition that removes the common-factor cases and selects the primitive triples.
[/example]
## Elementary Local Obstructions
The parametrization theorem starts with a rational point. The remaining problem is to detect when no such point exists, preferably before attempting a search over rational numbers.
[definition: Local Congruence Obstruction]
Let $Q(X,Y,Z) \in \mathbb Z[X,Y,Z]$ be homogeneous. A local congruence obstruction modulo $m$ to a nonzero rational point on $Q=0$ is the assertion that there is a prime $p \mid m$ such that every integer solution of
\begin{align*}
Q(x,y,z) \equiv 0 \pmod m
\end{align*}
that could arise from a solution of $Q(x,y,z)=0$ satisfies $p \mid x$, $p \mid y$, and $p \mid z$. This contradicts the existence of a primitive integer representative, since a rational projective point can be cleared of denominators and divided by the greatest common divisor of its coordinates.
[/definition]
The point of a local obstruction is that a rational solution can be cleared of denominators and then made primitive. If every congruence solution forces a common prime divisor, the assumed primitive representative cannot exist.
[quotetheorem:4489]
[citeproof:4489]
This argument is called local because it uses information modulo a single prime. The primitive representative condition is essential: if a common factor were allowed, the conclusion $7 \mid x,y,z$ would not contradict the existence of the displayed integer triple, because the same projective point could be represented after dividing by $7$. The modulus is also essential to the proof; modulo $4$, for instance, the equation does not force $x$ and $y$ to have a common prime divisor. Thus the obstruction is not merely that a congruence was checked, but that the congruence forces every supposed primitive representative to be divisible by the same prime. This is powerful enough to rule out parametrization: a conic with no rational point cannot be parametrized by rational lines through a rational base point.
[example: Why Modulo Seven Is The Right Modulus]
For the equation $x^2+y^2=7z^2$, compare what happens modulo $4$ and modulo $7$. Squares modulo $4$ are only $0$ and $1$, because
\begin{align*}
0^2&\equiv 0 \pmod 4,\\
1^2&\equiv 1 \pmod 4,\\
2^2&=4\equiv 0 \pmod 4,\\
3^2&=9\equiv 1 \pmod 4.
\end{align*}
Therefore $x^2+y^2$ can be congruent to
\begin{align*}
0+0&\equiv 0 \pmod 4,\\
0+1&\equiv 1 \pmod 4,\\
1+0&\equiv 1 \pmod 4,\\
1+1&\equiv 2 \pmod 4.
\end{align*}
Since $7\equiv 3 \pmod 4$, the right side satisfies $7z^2\equiv 0$ or $3 \pmod 4$ according as $z^2\equiv 0$ or $1 \pmod 4$. The congruence modulo $4$ can rule out some parity patterns, but it does not force one prime to divide $x,y,$ and $z$ simultaneously.
Modulo $7$, the equation becomes
\begin{align*}
x^2+y^2 \equiv 7z^2 \equiv 0 \pmod 7.
\end{align*}
The squares modulo $7$ are
\begin{align*}
0^2&\equiv 0 \pmod 7,\\
1^2&\equiv 1 \pmod 7,\\
2^2&=4\equiv 4 \pmod 7,\\
3^2&=9\equiv 2 \pmod 7,\\
4^2&=16\equiv 2 \pmod 7,\\
5^2&=25\equiv 4 \pmod 7,\\
6^2&=36\equiv 1 \pmod 7.
\end{align*}
Thus the possible square residues are $0,1,2,4$. To have $a^2+b^2\equiv 0 \pmod 7$, the possible sums of nonzero square residues would have to include $0$:
\begin{align*}
1+1&\equiv 2 \pmod 7,&
1+2&\equiv 3 \pmod 7,&
1+4&\equiv 5 \pmod 7,\\
2+1&\equiv 3 \pmod 7,&
2+2&\equiv 4 \pmod 7,&
2+4&\equiv 6 \pmod 7,\\
4+1&\equiv 5 \pmod 7,&
4+2&\equiv 6 \pmod 7,&
4+4&\equiv 1 \pmod 7.
\end{align*}
None is $0$, so $a^2+b^2\equiv 0 \pmod 7$ forces $a^2\equiv b^2\equiv 0 \pmod 7$, hence $a\equiv b\equiv 0 \pmod 7$. Applied to $x$ and $y$, this gives $7\mid x$ and $7\mid y$; then the original equation gives $7\mid z^2$, so $7\mid z$. This is the common-divisor contradiction that modulo $4$ cannot produce.
[/example]
## The Working Dichotomy For Conics
The course will use conics through a simple dichotomy: either a rational point is found, in which case the whole conic can be parametrized, or a congruence obstruction proves that no rational point exists. This is the first appearance of a local-to-global theme, although for general equations the relationship between local information and rational solutions becomes much more subtle.
[remark: What The Slope Method Does Not Prove]
The slope method does not prove that a conic has a rational point. It proves that once such a point is available, the rest of the rational points are controlled by lines through it. Finding the first point is an arithmetic problem, and congruence obstructions are the first tools for showing that the search may be impossible.
[/remark]
Conic parametrization explains how rational points can be generated once one point is known, but it also highlights a more arithmetic question: when does a given equation even have a solution? For sums of two squares, congruence obstructions and factorization in a larger ring combine to give a complete answer.
# 4. Sums of Two Squares
This chapter studies the Diophantine equation $x^2+y^2=n$ from the point of view developed in Chapters 1-3: congruence obstructions, primitive factorization, and the use of enlarged arithmetic settings suggested by conics and projective scaling. The new idea is to enlarge the integers to the Gaussian integers, where $x^2+y^2$ becomes a norm. This turns a quadratic representation problem into a factorisation problem, and it gives a complete criterion for which positive integers are sums of two squares.
## The Sum-Of-Two-Squares Criterion
The prime case is the first test: for which primes $p$ do there exist $a,b\in \mathbb Z$ with $p=a^2+b^2$? Congruences give a necessary condition. If $p$ is an odd prime and $p=a^2+b^2$, then neither $a$ nor $b$ is divisible by $p$, and $-1\equiv (ab^{-1})^2\pmod p$ after choosing the nonzero term. Thus $-1$ must be a quadratic residue modulo $p$.
For a general positive integer $n$, prime behavior is not enough by itself: representations multiply, but a prime divisor congruent to $3\pmod 4$ can only occur in paired form. The complete criterion therefore records exactly which prime exponents are allowed, turning the prime obstruction into a statement about the factorization of $n$.
[quotetheorem:1725]
[citeproof:1725/diophantine-equations]
This theorem says that the congruence obstruction modulo $4$ is not merely a necessary test for individual primes. It is the organizing principle for all integers: primes congruent to $1\pmod 4$ may contribute freely, while primes congruent to $3\pmod 4$ must occur to even exponent. The Gaussian integers $\mathbb Z[i]$ explain this because $a^2+b^2$ is the norm of $a+bi$, so the representation problem becomes a factorization problem in an enlarged arithmetic setting.
[example: Small Prime Representations]
The primes $5$, $13$, and $17$ satisfy
\begin{align*}
5&\equiv 1\pmod 4,\\
13&\equiv 1\pmod 4,\\
17&\equiv 1\pmod 4,
\end{align*}
so the Fermat two-square criterion for primes predicts that each should be representable as a sum of two squares. The representations are obtained by evaluating the displayed squares:
\begin{align*}
1^2+2^2&=1+4=5,\\
2^2+3^2&=4+9=13,\\
1^2+4^2&=1+16=17.
\end{align*}
These same equalities appear as Gaussian factorizations into conjugate factors:
\begin{align*}
(2+i)(2-i)&=2^2-i^2=4-(-1)=5,\\
(3+2i)(3-2i)&=3^2-(2i)^2=9-4i^2=9-4(-1)=13,\\
(4+i)(4-i)&=4^2-i^2=16-(-1)=17.
\end{align*}
Equivalently, the factors $2+i$, $3+2i$, and $4+i$ have norms
\begin{align*}
N(2+i)&=2^2+1^2=5,\\
N(3+2i)&=3^2+2^2=13,\\
N(4+i)&=4^2+1^2=17.
\end{align*}
The pattern is that these primes $p\equiv 1\pmod 4$ split into two conjugate Gaussian factors, and the norm of either factor recovers the corresponding representation $p=a^2+b^2$.
[/example]
## Why introduce the Gaussian integers?
The expression $x^2+y^2$ factors over the complex numbers as $(x+iy)(x-iy)$. Inside $\mathbb Z$ alone this factorisation is invisible: knowing that a prime divides $x^2+y^2$ gives congruence information, but it does not produce two integer factors whose sizes can be controlled. The Gaussian integers are the smallest integral setting where this factorisation can be used while retaining a useful divisibility theory. The main purpose of this section is to show that $\mathbb Z[i]$ behaves enough like $\mathbb Z$ for prime factorisation arguments to work.
[definition: Gaussian Integer]
A Gaussian integer is a complex number of the form $a+bi$ with $a,b\in\mathbb Z$. The set of Gaussian integers is
\begin{align*}
\mathbb Z[i]=\{a+bi:a,b\in\mathbb Z\}.
\end{align*}
[/definition]
Addition and multiplication are inherited from $\mathbb C$, and $\mathbb Z[i]$ is a commutative ring with identity. The conjugate of $z=a+bi$ is $\overline z=a-bi$.
The factorisation $x^2+y^2=(x+iy)(x-iy)$ becomes arithmetically useful only if we can measure the size of a Gaussian integer by an ordinary integer. That measure must turn $a+bi$ into the sum $a^2+b^2$, so that algebra inside $\mathbb Z[i]$ continues to remember the original sum-of-two-squares problem.
[definition: Gaussian Norm]
The Gaussian norm is the map
\begin{align*}
N: \mathbb Z[i]&\to \mathbb Z_{\ge 0},\\
z=a+bi&\mapsto z\overline z=a^2+b^2.
\end{align*}
[/definition]
The norm is integer-valued and multiplicative. This is the bridge between algebra in $\mathbb Z[i]$ and representations as sums of two squares. To use Gaussian integers arithmetically, we need multiplication in the ring to translate into multiplication of ordinary integers.
For the Gaussian integers, the needed norm fact can be checked directly:
\begin{align*}
N(zw)=N(z)N(w)
\end{align*}
for all $z,w\in\mathbb Z[i]$. This is enough for the sum-of-two-squares argument, because it turns multiplication in $\mathbb Z[i]$ into multiplication of ordinary integers represented as sums of two squares.
The important point is not just the identity $N(zw)=N(z)N(w)$, but where it lives: both $z$ and $w$ remain inside $\mathbb Z[i]$, and the norm remains an ordinary nonnegative integer. That makes divisibility in the Gaussian integers visible in $\mathbb Z$. The first payoff is the description of units. A Gaussian integer is invertible only when its norm is $1$, so the only units are $1,-1,i,-i$. Multiplicativity alone still does not classify all divisors of $N(z)$; that requires the factorization theory introduced next.
[definition: Gaussian Prime]
A nonzero nonunit $\pi\in\mathbb Z[i]$ is a Gaussian prime if whenever $\pi=zw$ with $z,w\in\mathbb Z[i]$, one of $z,w$ is a unit.
[/definition]
The norm gives a way to measure Gaussian integers, but Fermat's theorem needs more than a size function. We must be able to factor elements of $\mathbb Z[i]$ and know that the resulting prime factors are essentially determined.
The ring-theoretic words used here are the standard ones. A unique factorization domain, or UFD, is an integral domain where every nonzero nonunit factors into irreducibles, and that factorization is unique up to reordering and multiplication by units.
This is the bridge from geometry to arithmetic. The Euclidean division available in $\mathbb Z[i]$ supplies unique factorization, and unique factorization is what lets us interpret Gaussian prime factors in a controlled way.
[remark: Why Unique Factorization Matters]
In $\mathbb Z[i]$, Euclidean division implies unique factorization. Other quadratic integer rings can fail this property; in $\mathbb Z[\sqrt{-5}]$, for example, $6$ has the distinct factorizations $6=2\cdot 3=(1+\sqrt{-5})(1-\sqrt{-5})$. Thus the Euclidean division property is the structural reason that Gaussian-integer factorization behaves reliably in this chapter.
[/remark]
[illustration:gaussian-nearest-integer-square]
[example: Factoring Rational Primes in Gaussian Integers]
The splittings are obtained by multiplying each Gaussian integer by its conjugate:
\begin{align*}
(2+i)(2-i)&=4-2i+2i-i^2=4-(-1)=5,\\
(3+2i)(3-2i)&=9-6i+6i-4i^2=9-4(-1)=13,\\
(4+i)(4-i)&=16-4i+4i-i^2=16-(-1)=17.
\end{align*}
Equivalently, using the Gaussian norm $N(a+bi)=a^2+b^2$,
\begin{align*}
N(2+i)&=2^2+1^2=4+1=5,\\
N(2-i)&=2^2+(-1)^2=4+1=5,\\
N(3+2i)&=3^2+2^2=9+4=13,\\
N(3-2i)&=3^2+(-2)^2=9+4=13,\\
N(4+i)&=4^2+1^2=16+1=17,\\
N(4-i)&=4^2+(-1)^2=16+1=17.
\end{align*}
We now check why these factors are Gaussian primes. For example, suppose $2+i=zw$ with $z,w\in\mathbb Z[i]$. By multiplicativity of the Gaussian norm,
\begin{align*}
5=N(2+i)=N(zw)=N(z)N(w).
\end{align*}
The integers $N(z)$ and $N(w)$ are nonnegative, and their product is the rational prime $5$, so
\begin{align*}
(N(z),N(w))=(1,5)\quad\text{or}\quad (5,1).
\end{align*}
If $N(a+bi)=1$, then $a^2+b^2=1$, so $a+bi$ is one of $1,-1,i,-i$, hence a unit. Thus one of $z,w$ is a unit, and $2+i$ is Gaussian prime. The same argument applies to $2-i$, $3\pm 2i$, and $4\pm i$, because their norms are respectively the rational primes $5$, $13$, and $17$. These computations show concretely how the represented primes split into conjugate Gaussian prime factors.
[/example]
## How do sums of two squares multiply?
Before classifying all $n$, we need to know how existing representations combine. The identity below is the integer shadow of norm multiplicativity in $\mathbb Z[i]$.
[quotetheorem:4490]
[citeproof:4490]
This theorem explains why prime representations are enough for the positive direction of the classification: once each relevant prime power is represented, products of represented numbers are represented. The converse is not built into the identity; from a representation of $mn$ one cannot generally read off representations of $m$ and $n$ without using prime factorisation. The formula also allows zero squares, which is essential later because $q^{2r}=(q^r)^2+0^2$ for primes $q\equiv 3\pmod 4$.
[example: Writing 65 as a Sum of Two Squares]
Since $65=5\cdot 13$, start from the two representations
\begin{align*}
5&=1^2+2^2,\\
13&=2^2+3^2.
\end{align*}
Using the *Brahmagupta-Fibonacci Identity* with $(a,b)=(1,2)$ and $(c,d)=(2,3)$, we get
\begin{align*}
65&=5\cdot 13\\
&=(1^2+2^2)(2^2+3^2)\\
&=(1\cdot 2-2\cdot 3)^2+(1\cdot 3+2\cdot 2)^2\\
&=(2-6)^2+(3+4)^2\\
&=(-4)^2+7^2\\
&=16+49\\
&=65.
\end{align*}
Therefore $65$ is represented as a sum of two squares, namely $65=4^2+7^2$.
[/example]
The same identity can be applied repeatedly, so prime powers and products with repeated factors require no new algebra. The next example shows how a square factor changes the numerical representation while using the same multiplication rule.
[example: Writing 325 as a Sum of Two Squares]
We first write the factors already known to be sums of two squares:
\begin{align*}
25&=5^2=(3^2+4^2),\\
13&=2^2+3^2.
\end{align*}
Applying the *Brahmagupta-Fibonacci Identity* with $(a,b)=(3,4)$ and $(c,d)=(2,3)$, we compute
\begin{align*}
325&=25\cdot 13\\
&=(3^2+4^2)(2^2+3^2)\\
&=(3\cdot 2-4\cdot 3)^2+(3\cdot 3+4\cdot 2)^2\\
&=(6-12)^2+(9+8)^2\\
&=(-6)^2+17^2\\
&=36+289\\
&=325.
\end{align*}
Thus one representation is $325=6^2+17^2$.
Changing the sign of one Gaussian factor corresponds to using $13=2^2+(-3)^2$ instead. The same identity with $(a,b)=(3,4)$ and $(c,d)=(2,-3)$ gives
\begin{align*}
325&=(3^2+4^2)(2^2+(-3)^2)\\
&=(3\cdot 2-4\cdot(-3))^2+(3\cdot(-3)+4\cdot 2)^2\\
&=(6+12)^2+(-9+8)^2\\
&=18^2+(-1)^2\\
&=324+1\\
&=325.
\end{align*}
So the same prime factors can produce the alternative representation $325=1^2+18^2$.
[/example]
## Which integers are sums of two squares?
The final question is global: given the prime factorisation of $n$, when does $x^2+y^2=n$ have an integer solution? Fermat's theorem handles primes, and multiplicativity handles products, but primes congruent to $3\pmod 4$ behave differently. The obstruction is that such a prime dividing a sum of two coprime squares would make $-1$ a quadratic residue modulo that prime.
[quotetheorem:4491]
[citeproof:4491]
Repeatedly applying this obstruction shows that any prime $q\equiv 3\pmod 4$ must occur to even exponent in a represented integer. The congruence condition on $q$ is essential: for example, $5\equiv 1\pmod 4$ divides $1^2+2^2$ without dividing either summand. Thus the obstruction is not a general divisibility property of sums of squares, but a special consequence of the nonexistence of a square root of $-1$ modulo primes $3\pmod 4$.
[quotetheorem:4492]
[citeproof:4492]
This is the complete existence classification. The primes $q\equiv 3\pmod 4$ are not forbidden entirely; they are allowed exactly when they appear in square factors, so an odd exponent obstructs representation while an even exponent can be absorbed as $(q^r)^2+0^2$. The theorem does not classify uniqueness or count the number of representations; those questions depend on units, signs, ordering, and the number of split prime factors. Its role here is to show that the local congruence obstruction and Gaussian factorisation together give a complete yes-or-no test.
[example: Why 21 Is Not a Sum of Two Squares]
The prime factorisation of $21$ is
\begin{align*}
21&=3\cdot 7.
\end{align*}
Both prime factors are of the obstructing congruence class:
\begin{align*}
3&\equiv 3\pmod 4,\\
7&\equiv 3\pmod 4.
\end{align*}
In the notation of *Fermat's Theorem on Sums of Two Squares*, this means
\begin{align*}
21=3^1\cdot 7^1,
\end{align*}
so the exponents of the primes $3\equiv 3\pmod 4$ and $7\equiv 3\pmod 4$ are both equal to $1$. Since $1$ is odd, not every prime congruent to $3\pmod 4$ occurs to even exponent. Therefore *Fermat's Theorem on Sums of Two Squares* rules out integers $x,y\in\mathbb Z$ satisfying
\begin{align*}
x^2+y^2=21.
\end{align*}
This shows that $21$ fails because of the parity of the exponents in its prime factorisation, not because the number is too large or because no small squares happen to work.
[/example]
The obstruction is about parity of exponents, not about the mere presence of primes $3\pmod 4$. A square factor built from such primes can still appear in a represented integer.
[example: A Number with Obstructing and Non-Obstructing Primes]
Let $n=45$. Its prime factorisation is
\begin{align*}
45&=9\cdot 5\\
&=3^2\cdot 5.
\end{align*}
The prime in the obstructing congruence class is $3$, and
\begin{align*}
3&\equiv 3\pmod 4,
\end{align*}
but its exponent in $45=3^2\cdot 5$ is $2$, which is even. The remaining prime factor satisfies
\begin{align*}
5&\equiv 1\pmod 4.
\end{align*}
Thus *Fermat's Theorem on Sums of Two Squares* predicts that $45$ should be representable as a sum of two squares.
The representation can be checked by expanding the squares:
\begin{align*}
6^2+3^2&=36+9\\
&=45.
\end{align*}
So
\begin{align*}
45=6^2+3^2.
\end{align*}
This example shows that primes congruent to $3\pmod 4$ are allowed in a sum of two squares exactly when their contribution is through even powers, such as the square factor $3^2$ here.
[/example]
## How the Gaussian proof fits the earlier methods
The earlier chapters used congruences, coprimality, and descent to rule out or parametrize solutions. The present chapter keeps those tools but adds a new algebraic move: factor the equation in a larger ring where the quadratic form becomes a norm. In $\mathbb Z[i]$, the equation $x^2+y^2=n$ becomes $N(x+iy)=n$, so arithmetic of norms controls arithmetic of representations.
[remark: Role of Unique Factorization]
Unique factorization in $\mathbb Z[i]$ is stronger than needed for small examples, but it is the conceptual reason the classification has such a simple prime-factor form. Without it, the passage from a congruence solution $u^2\equiv -1\pmod p$ to an actual equality $p=a^2+b^2$ would have no systematic mechanism. This is the first appearance in the course of a typical algebraic number theory theme: a Diophantine problem over $\mathbb Z$ becomes simpler after passing to a larger ring, but only if that ring has sufficiently good factorisation.
[/remark]
The chapter also prepares later stages of the course. Chapter 6 treats Pell equations, and Chapter 7 treats quadratic forms; both again ask whether local congruence information can be assembled into integer solutions, but the answer becomes more delicate. Sums of two squares are the model case where algebraic factorization gives a complete and usable answer.
The sums-of-two-squares theorem is a model case in which factorization resolves a Diophantine problem cleanly. When factorization alone is not enough, descent becomes the next tool, replacing a direct search with an argument that any solution would force a smaller one.
# 5. Descent Methods
Descent methods turn a hypothetical solution into a smaller solution of the same kind. This chapter uses that idea to prove impossibility results: instead of searching all integers, we show that any solution would force an infinite strictly decreasing sequence of positive integers. The main application is Fermat's theorem that no right triangle with integer sides has square area, which then gives the exponent-$4$ case of Fermat's Last Theorem.
## Why Descent Proves Impossibility
How can a proof use the existence of one solution to contradict itself? The key observation is that positive integers cannot decrease forever. If a construction takes any alleged solution and produces a new solution measured by a smaller positive integer, then the original solution cannot exist.
[definition: Descent Measure]
Let $S$ be a set of integer solutions to a Diophantine problem. A descent measure on $S$ is a function $M:S\to \mathbb N$.
[/definition]
The measure is chosen to record the size that the descent construction decreases. For Pythagorean triples it is usually the hypotenuse; for equations such as $x^4+y^4=z^2$ it is often the value $z$.
[remark: Infinite Descent Principle]
If every alleged positive integer solution produces another solution with a strictly smaller positive descent measure, then no such solution exists. Otherwise, a nonempty set of solutions would have a minimal measure, and the descent step would contradict that minimality.
[/remark]
The principle is often used through a minimal counterexample. If a solution exists, choose one with minimal descent measure. The descent step then constructs a smaller solution, contradicting minimality.
[example: Descent From Even Square Divisibility]
Suppose positive integers $a,b$ satisfy $a^2=2b^2$. Since $2b^2$ is divisible by $2$, the integer $a^2$ is even. If $a$ were odd, then $a=2k+1$ for some integer $k$, and
\begin{align*}
a^2=(2k+1)^2=4k^2+4k+1=2(2k^2+2k)+1,
\end{align*}
which is odd. Hence $a$ is even, so write $a=2c$ with $c\in\mathbb N$. Substituting into $a^2=2b^2$ gives
\begin{align*}
(2c)^2&=2b^2,\\
4c^2&=2b^2,\\
2c^2&=b^2.
\end{align*}
Thus $b^2$ is even, and the same parity argument shows that $b$ is even; write $b=2d$ with $d\in\mathbb N$. Substituting $b=2d$ into $2c^2=b^2$ gives
\begin{align*}
2c^2&=(2d)^2,\\
2c^2&=4d^2,\\
c^2&=2d^2.
\end{align*}
Therefore any positive solution $(a,b)$ produces another positive solution $(c,d)=(a/2,b/2)$ with smaller entries. Repeating this would give an infinite strictly decreasing sequence of positive integers, which is impossible, so no positive integer solution exists.
[/example]
This simple example contains the pattern used later: a congruence or divisibility argument forces a common factor, and removing that factor preserves the same equation.
## Coprime Factors and Square Divisibility
When a product is a square, how can the square property be distributed among its factors? Descent arguments repeatedly split an equation into coprime factors and then use prime factorization to promote a product statement into separate square statements.
[quotetheorem:4494]
[citeproof:4494]
The coprimality hypothesis is the essential safeguard in this statement. If two factors share a prime, the square exponent can be split between them and neither factor has to be a square on its own. Once the factors are coprime, prime exponents cannot be shared, so the square condition separates factor by factor. This is why the theorem is useful after rewriting an equation into two visibly coprime factors.
The square-factor test still needs one more form for Pythagorean triples, because their parametrization naturally produces products equal to twice a square rather than to a square. The question is where the single extra factor of $2$ can live. Coprimality controls the odd primes, while parity identifies which factor carries the even contribution.
[quotetheorem:4495]
[citeproof:4495]
The point of the twice-a-square variant is not just bookkeeping. In descent arguments the factor $2$ has to be isolated before coprimality can force the remaining odd factors to be squares. The parity hypothesis prevents the power of $2$ from being hidden across both factors, while coprimality prevents any odd prime exponent from being shared. We will use this as a diagnostic: after a Pythagorean triple is parametrized, the square-area condition can be split into separate square conditions on the coprime pieces.
[example: Splitting The Area Condition]
Let $(x,y,z)$ be a primitive Pythagorean triple with $x$ odd and $y$ even, so write
\begin{align*}
x=m^2-n^2,\qquad y=2mn,\qquad z=m^2+n^2,
\end{align*}
where $\gcd(m,n)=1$ and $m,n$ have opposite parity. If the area is a square, then
\begin{align*}
\frac{xy}{2}
&=\frac{(m^2-n^2)(2mn)}{2}\\
&=(m^2-n^2)mn\\
&=xmn
\end{align*}
is a square.
We first check that the factors $x$, $m$, and $n$ are pairwise coprime. Since $\gcd(m,n)=1$, it remains to compare $x$ with $m$ and $n$. If a prime $p$ divides both $x$ and $m$, then $p\mid m$ and
\begin{align*}
p\mid x=m^2-n^2.
\end{align*}
Because $p\mid m$, we have $p\mid m^2$, so from $p\mid m^2-n^2$ it follows that $p\mid n^2$. Hence $p\mid n$, contradicting $\gcd(m,n)=1$. Thus $\gcd(x,m)=1$. Similarly, if a prime $p$ divides both $x$ and $n$, then $p\mid n$, $p\mid n^2$, and
\begin{align*}
p\mid x=m^2-n^2
\end{align*}
imply $p\mid m^2$, hence $p\mid m$, again contradicting $\gcd(m,n)=1$. Thus $\gcd(x,n)=1$.
Therefore $x$, $m$, and $n$ are pairwise coprime and their product $xmn$ is a square. By the *Coprime Product Square Lemma*, each factor is itself a square, so there exist positive integers $u,v,w$ such that
\begin{align*}
x=u^2,\qquad m=v^2,\qquad n=w^2.
\end{align*}
The square-area condition has therefore forced the Pythagorean parameters themselves to become squares, which is the arithmetic input needed for the later descent.
[/example]
The last example is the bridge from square area to a new Pythagorean relation:
\begin{align*}
u^2=x=m^2-n^2=v^4-w^4.
\end{align*}
The descent will exploit the same identity in a more symmetric form.
## Fermat's Descent for Square Area Right Triangles
Can a right triangle with integer side lengths have area equal to a square? The answer is no, and the proof is a model descent: from one such triangle we manufacture a smaller one with the same property.
[definition: Square-Area Right Triangle]
A square-area right triangle is a triple $(x,y,z)\in\mathbb N^3$ such that
\begin{align*}
x^2+y^2=z^2
\end{align*}
and $xy/2$ is a square integer.
[/definition]
The definition allows nonprimitive triples, but any nonprimitive example can be divided by its common factor until a primitive example remains. Thus it is enough to descend among primitive triples.
[quotetheorem:4496]
[citeproof:4496]
At the page level, the theorem is best understood as a minimal-counterexample obstruction. A square-area right triangle would not merely be a rare Pythagorean triple; it would have to survive Fermat's descent. The cited proof supplies the descent mechanism, while the surrounding note uses the consequence: there can be no first example, so there can be no example at all.
[example: Why This Is A Descent]
The theorem should be read as a no-minimal-counterexample result. If a primitive square-area triangle existed, choose one with the smallest possible hypotenuse. Fermat's argument would produce another such triangle with a smaller hypotenuse, contradicting that choice. This is the same descent pattern as the irrationality of $\sqrt 2$, but with the Pythagorean parametrization and the coprime square-divisibility lemmas supplying the mechanism.
[/example]
The example explains the shape of the contradiction, but the detailed algebraic construction belongs inside the proof rather than in the surrounding exposition. The next remark separates the use of the theorem in these notes from the descent calculation that proves it.
[remark: What The Construction Is Used For]
The detailed construction is not repeated here because it belongs to the proof accordion. Its role in the surrounding note is to show how elementary divisibility, parametrization, and well-ordering combine into a genuinely global impossibility result.
[/remark]
The important lesson for the rest of the course is the measurement principle: a descent argument must name a positive integer that strictly decreases. Here that measure is the hypotenuse, so the well-ordering of $\mathbb N$ turns the construction inside the cited proof into a contradiction.
## The Equation $x^4+y^4=z^2$
What does square-area descent say about fourth powers? The equation $x^4+y^4=z^2$ says that the legs $x^2$ and $y^2$ form a right triangle whose hypotenuse is $z$. More directly, its primitive parametrization forces the same nested Pythagorean structure that appeared in Fermat's square-area descent.
[quotetheorem:4497]
[citeproof:4497]
This theorem is the standard exponent-$4$ descent in a compressed form. The construction is not a congruence obstruction: the contradiction comes from the repeated reproduction of the same equation at a smaller hypotenuse.
[example: Exponent Four Fermat]
The exponent-$4$ case of Fermat's Last Theorem is contained in the stronger square statement. Any hypothetical solution of $a^4+b^4=c^4$ would turn the right side into the square $(c^2)^2$, so it would be a special case of an equation of the form $x^4+y^4=z^2$. The cited theorem rules out that larger possibility, which is why the exponent-$4$ case follows immediately without repeating the descent construction here.
[/example]
The exponent-$4$ result is stronger than the statement $a^4+b^4=c^4$ has no nonzero integer solutions, because it rules out the larger possibility that the sum of two fourth powers is any square.
## What Descent Contributes To The Course
Why is descent more powerful than a congruence obstruction? A congruence obstruction usually proves impossibility by finding a single modulus that no solution can satisfy. Descent proves impossibility even when solutions survive many local tests, by showing that the global structure of a hypothetical solution reproduces itself at smaller scale.
[remark: Descent Versus Parametrization]
Parametrization describes all solutions to a simpler equation, such as $x^2+y^2=z^2$. Descent becomes effective when a second condition, such as square area or fourth powers, imposes enough coprimality and square-divisibility constraints on the parameters to create a smaller solution.
[/remark]
The first remark settles one point of interpretation. The following remark, Minimality Must Be Attached To A Positive Integer, records a second boundary case before the notes move on.
[remark: Minimality Must Be Attached To A Positive Integer]
A descent proof needs a well-founded measure. In this chapter the measure is the hypotenuse or the square root $z$ in $x^4+y^4=z^2$. Measures such as rational size or real area require extra work, because the usual order on positive rationals or positive reals admits infinite decreasing sequences.
[/remark]
The next stage of the course moves from descent to Pell equations. The same ingredients remain present: coprimality, factorization, congruence restrictions, and a well-chosen arithmetic setting, but the emphasis shifts from impossibility to generating infinite families of solutions.
Descent is especially powerful when the goal is to prove impossibility, but it also prepares us for equations with infinitely many solutions. Pell's equation shifts the emphasis from contradiction to construction, showing how one arithmetic idea can generate an infinite family of solutions.
# 6. Pell's Equation
Pell equations ask for integer points on a hyperbola rather than on a circle or on a conic with only a few visible small solutions. The equation $x^2-Dy^2=1$ is controlled by two linked ideas: good rational approximation to $\sqrt D$, and multiplication of units in a quadratic order. The chapter develops both viewpoints, then combines them into an algorithm that produces every positive solution from a single fundamental unit.
## The Equation $x^2-Dy^2=1$
Why is the case where $D$ is a square excluded, and what remains when $D$ is nonsquare? If $D=a^2$, then $(x-ay)(x+ay)=1$, so integer solutions force $x=\pm1$ and $y=0$. The nonsquare case is the genuine Pell case: the equation has a positive solution, then infinitely many, and their sizes are governed by the continued fraction of $\sqrt D$.
[definition: Pell Equation]
Let $D\in\mathbb N$ be positive and not a square. The Pell equation associated to $D$ is
\begin{align*}
x^2-Dy^2=1,
\end{align*}
with unknowns $x,y\in\mathbb Z$.
[/definition]
The identity solutions are $(1,0)$ and $(-1,0)$. In this chapter a positive solution means a solution with $x>0$ and $y>0$.
[example: First Solutions For D Equals Two]
For $D=2$, the equation is
\begin{align*}
x^2-2y^2=1,
\end{align*}
so for each positive integer $y$ we need
\begin{align*}
x^2=1+2y^2.
\end{align*}
For $y=1$, this gives $x^2=3$, and $1^2<3<2^2$, so there is no integer $x$. For $y=2$,
\begin{align*}
x^2=1+2\cdot 2^2=1+2\cdot 4=1+8=9=3^2,
\end{align*}
so $(x,y)=(3,2)$ is a solution, and indeed
\begin{align*}
3^2-2\cdot 2^2=9-2\cdot 4=9-8=1.
\end{align*}
Continuing the same check, the next displayed solution occurs at $y=12$:
\begin{align*}
x^2=1+2\cdot 12^2=1+2\cdot 144=1+288=289=17^2,
\end{align*}
so
\begin{align*}
17^2-2\cdot 12^2=289-2\cdot 144=289-288=1.
\end{align*}
Thus $(3,2)$ and $(17,12)$ both lie on the Pell hyperbola $x^2-2y^2=1$, but their $y$-coordinates jump from $2$ to $12$, showing that the positive solutions are not evenly spaced even when the first one is small.
[/example]
Rearranging the equation gives
\begin{align*}
(x+y\sqrt D)(x-y\sqrt D)=1.
\end{align*}
Thus every positive solution gives a strong rational approximation $x/y$ to $\sqrt D$, since
\begin{align*}
\left|\sqrt D-\frac{x}{y}\right|=\frac{1}{y(x+y\sqrt D)}.
\end{align*}
This is why continued fractions enter the problem.
[remark: Sign Symmetries]
If $(x,y)$ solves Pell equation, then $(\pm x,\pm y)$ also solves it. The positive branch $x>0$, $y>0$ contains the arithmetic information, while the remaining signs are recovered at the end.
[/remark]
## Continued Fractions of Square Roots
How can a finite calculation find a solution that may be extremely large? Continued fractions give a sequence of best rational approximations to an irrational number, and quadratic irrationalities have eventually periodic continued fractions. For square roots of nonsquare integers, the periodic part starts immediately after the integer part.
[definition: Simple Continued Fraction]
Let $\alpha\in\mathbb R\setminus\mathbb Q$. Its simple continued fraction is the expression
\begin{align*}
\alpha=[a_0;a_1,a_2,a_3,\dots]
=a_0+\frac{1}{a_1+\frac{1}{a_2+\cdots}},
\end{align*}
where $a_0\in\mathbb Z$ and $a_i\in\mathbb N$ for $i\ge 1$.
[/definition]
The finite truncations of a continued fraction are its convergents. They are the fractions that the Pell equation will eventually select. To use them in an algorithm, we need more than the visual idea of cutting off an infinite expression: we need a recurrence that computes the numerator and denominator at each stage. This recurrence is what makes continued fractions practical for Pell equations, because it turns approximation into a finite sequence of integer calculations.
[definition: Convergent]
For a simple continued fraction $[a_0;a_1,a_2,\dots]$, define integers $p_n,q_n$ by
\begin{align*}
p_{-2}&=0, & p_{-1}&=1, & p_n&=a_np_{n-1}+p_{n-2},\\
q_{-2}&=1, & q_{-1}&=0, & q_n&=a_nq_{n-1}+q_{n-2}
\end{align*}
for $n\ge 0$. The rational number $p_n/q_n$ is called the $n$th convergent.
[/definition]
The recurrence makes the computation fast because each new convergent uses only the previous two. For a general irrational number there is no reason for the partial quotients to repeat, so this still leaves an infinite computation. The special feature of $\sqrt D$ is that the complete quotients remain in a finite set of quadratic irrationalities, and that finiteness is what the next theorem formalises.
[quotetheorem:4498]
[citeproof:4498]
The nonsquare hypothesis is essential: if $D$ were a square, then $\sqrt D$ would be rational and the continued fraction would terminate instead of becoming periodic. The theorem does not yet say which convergent solves Pell's equation, but it reduces the search to arithmetic inside one repeating block. The final term $2a_0$ is a useful check on computations, since an incorrectly computed period usually fails this symmetry.
[example: Continued Fraction Of Square Root Two]
For $\sqrt2$, the continued fraction is $\sqrt2=[1;\overline{2}]$, so the partial quotients are
\begin{align*}
a_0&=1, & a_1&=2, & a_2&=2, & a_3&=2.
\end{align*}
Using the convergent recurrence, we get
\begin{align*}
p_0&=1\cdot 1+0=1, & q_0&=1\cdot 0+1=1,\\
p_1&=2\cdot 1+1=3, & q_1&=2\cdot 1+0=2,\\
p_2&=2\cdot 3+1=7, & q_2&=2\cdot 2+1=5,\\
p_3&=2\cdot 7+3=17, & q_3&=2\cdot 5+2=12.
\end{align*}
Thus the first four convergents are
\begin{align*}
\frac11,\quad \frac32,\quad \frac75,\quad \frac{17}{12}.
\end{align*}
Checking the Pell expression $x^2-2y^2$ for these pairs gives
\begin{align*}
1^2-2\cdot 1^2&=1-2=-1,\\
3^2-2\cdot 2^2&=9-2\cdot 4=9-8=1,\\
7^2-2\cdot 5^2&=49-2\cdot 25=49-50=-1,\\
17^2-2\cdot 12^2&=289-2\cdot 144=289-288=1.
\end{align*}
So the convergents shown alternate between the negative equation and the positive Pell equation, and the norm-one solutions displayed are $(3,2)$ and $(17,12)$.
[/example]
The next example shows that changing the parity of the period changes where the norm-one solutions appear. It also shows why looking only at the first few convergents without tracking the period can be misleading.
[example: Continued Fraction Of Square Root Three]
For $\sqrt3$, the continued fraction is
\begin{align*}
\sqrt3=[1;\overline{1,2}],
\end{align*}
so the partial quotients begin
\begin{align*}
a_0&=1, & a_1&=1, & a_2&=2, & a_3&=1, & a_4&=2.
\end{align*}
Using the convergent recurrence, with $p_{-2}=0$, $p_{-1}=1$, $q_{-2}=1$, and $q_{-1}=0$, we get
\begin{align*}
p_0&=1\cdot 1+0=1, & q_0&=1\cdot 0+1=1,\\
p_1&=1\cdot 1+1=2, & q_1&=1\cdot 1+0=1,\\
p_2&=2\cdot 2+1=5, & q_2&=2\cdot 1+1=3,\\
p_3&=1\cdot 5+2=7, & q_3&=1\cdot 3+1=4,\\
p_4&=2\cdot 7+5=19, & q_4&=2\cdot 4+3=11.
\end{align*}
Thus the first five convergents are
\begin{align*}
\frac11,\quad \frac21,\quad \frac53,\quad \frac74,\quad \frac{19}{11}.
\end{align*}
Checking $x^2-3y^2$ for these pairs gives
\begin{align*}
1^2-3\cdot 1^2&=1-3=-2,\\
2^2-3\cdot 1^2&=4-3=1,\\
5^2-3\cdot 3^2&=25-3\cdot 9=25-27=-2,\\
7^2-3\cdot 4^2&=49-3\cdot 16=49-48=1,\\
19^2-3\cdot 11^2&=361-3\cdot 121=361-363=-2.
\end{align*}
So the norm-one convergents shown give the positive Pell solutions $(2,1)$ and $(7,4)$; among these, $(2,1)$ is the smallest positive solution.
[/example]
## Extracting Solutions From the Period
Which convergent first solves $x^2-Dy^2=1$? The parity of the period length determines whether one period is enough or whether two periods are needed. The obstruction is that the natural endpoint of a continued-fraction period may have norm $-1$ rather than norm $1$. In that case one must pass through the period twice before the sign becomes positive and a Pell solution appears.
[quotetheorem:4499]
[citeproof:4499]
This theorem is constructive: compute the period, compute the indicated convergent, and read off $x=p_n$, $y=q_n$. The parity condition is not cosmetic; it records whether one pass through the repeating block gives norm $1$ or norm $-1$. The theorem also explains why the nonsquare assumption remains in force, since otherwise there is no periodic block from which to extract the convergent.
[quotetheorem:4500]
[citeproof:4500]
Existence is a qualitative consequence of the explicit convergent construction, not a separate compactness or descent argument. The positivity condition comes from choosing the convergent with positive denominator and positive numerator, so it rules out the identity solutions $(\pm1,0)$. The result guarantees at least one positive solution, but it does not by itself describe how later solutions are organised; that structure comes from multiplication in the quadratic order.
[example: The Large First Solution For D Equals Sixty One]
For $D=61$, the continued fraction is
\begin{align*}
\sqrt{61}=[7;\overline{1,4,3,1,2,2,1,3,4,1,14}],
\end{align*}
so the period length is $\ell=11$. Since $\ell$ is odd, the period rule for Pell convergents selects the convergent with index
\begin{align*}
2\ell-1=2\cdot 11-1=22-1=21.
\end{align*}
Using the recurrence for convergents gives
\begin{align*}
\frac{p_{21}}{q_{21}}=\frac{1766319049}{226153980},
\end{align*}
so the candidate smallest positive norm-one solution is
\begin{align*}
(x,y)=(1766319049,226153980).
\end{align*}
Now verify the Pell equation by expanding the two terms separately:
\begin{align*}
1766319049^2&=3119882982860264401,\\
226153980^2&=51145622669840400,\\
61\cdot 226153980^2&=61\cdot 51145622669840400\\
&=60\cdot 51145622669840400+51145622669840400\\
&=3068737360190424000+51145622669840400\\
&=3119882982860264400.
\end{align*}
Therefore
\begin{align*}
1766319049^2-61\cdot226153980^2
&=3119882982860264401-3119882982860264400\\
&=1.
\end{align*}
Thus a relatively small value of $D$ can have a fundamental solution whose coordinates are already in the hundreds of millions and billions.
[/example]
## Units in Quadratic Orders
Why do all later solutions follow a regular pattern once the first positive solution is known? The reason is multiplicative: $x^2-Dy^2$ is the norm of $x+y\sqrt D$ in the quadratic order $\mathbb Z[\sqrt D]$.
[definition: Norm In A Quadratic Order]
Let $D\in\mathbb N$ be positive and not a square. The conjugation map is
\begin{align*}
\sigma_D: \mathbb Z[\sqrt D] &\to \mathbb Z[\sqrt D], & \sigma_D(x+y\sqrt D)&=x-y\sqrt D.
\end{align*}
The norm map is
\begin{align*}
N: \mathbb Z[\sqrt D] &\to \mathbb Z, & N(\alpha)&=\alpha\sigma_D(\alpha).
\end{align*}
For $\alpha=x+y\sqrt D$, this is
\begin{align*}
N(x+y\sqrt D)=x^2-Dy^2.
\end{align*}
[/definition]
Pell equation is therefore the condition $N(x+y\sqrt D)=1$. Since $N(\alpha\beta)=N(\alpha)N(\beta)$, products of norm-one elements again give solutions. This raises the organizing question: can all positive solutions be generated from one smallest solution by multiplication? The fundamental unit is the element chosen to play that role, so the definition singles out the minimal norm-one unit greater than $1$.
[definition: Fundamental Unit]
Let $D\in\mathbb N$ be positive and not a square. A fundamental unit for Pell equation is a unit
\begin{align*}
\varepsilon_1=x_1+y_1\sqrt D\in\mathbb Z[\sqrt D]
\end{align*}
with $x_1,y_1\in\mathbb N$, $N(\varepsilon_1)=1$, and $\varepsilon_1>1$ minimal among all norm-one units $x+y\sqrt D>1$ with $x,y\in\mathbb Z$.
[/definition]
The continued-fraction construction supplies this unit. Finding several solutions by search would not prove that the list is complete, because a later solution could in principle fail to be generated by the ones already found. The next result rules out that failure: the smallest positive norm-one unit generates the whole positive branch.
[quotetheorem:4501]
[citeproof:4501]
The norm-one hypothesis is what keeps the powers on the same Pell equation rather than moving to equations $x^2-Dy^2=m$ with another right-hand side. Positivity is also essential for the ordering argument, since units with negative real value or negative $y$ are recovered later by sign symmetries rather than by the positive branch. Multiplying by the fundamental unit gives the recurrence
\begin{align*}
x_{n+1}&=x_1x_n+Dy_1y_n,\\
y_{n+1}&=x_1y_n+y_1x_n.
\end{align*}
This recurrence turns the abstract generation theorem into an effective listing procedure after the first solution is known.
[example: All Solutions For D Equals Two]
For $D=2$, the fundamental unit is $3+2\sqrt2$, and its norm is
\begin{align*}
N(3+2\sqrt2)=3^2-2\cdot 2^2=9-8=1.
\end{align*}
The generation theorem for positive Pell solutions therefore gives every positive solution in the form
\begin{align*}
x_n+y_n\sqrt2=(3+2\sqrt2)^n,\qquad n\ge 1.
\end{align*}
The first power is
\begin{align*}
(3+2\sqrt2)^1=3+2\sqrt2,
\end{align*}
so $(x_1,y_1)=(3,2)$. The second power is
\begin{align*}
(3+2\sqrt2)^2
&=3^2+2\cdot 3\cdot 2\sqrt2+(2\sqrt2)^2\\
&=9+12\sqrt2+4\cdot 2\\
&=17+12\sqrt2,
\end{align*}
so $(x_2,y_2)=(17,12)$. Multiplying once more,
\begin{align*}
(3+2\sqrt2)^3
&=(17+12\sqrt2)(3+2\sqrt2)\\
&=17\cdot 3+17\cdot 2\sqrt2+12\sqrt2\cdot 3+12\sqrt2\cdot 2\sqrt2\\
&=51+34\sqrt2+36\sqrt2+24\cdot 2\\
&=99+70\sqrt2,
\end{align*}
so $(x_3,y_3)=(99,70)$. Thus the first three positive solutions are $(3,2)$, $(17,12)$, and $(99,70)$, and the rest are obtained by continuing the same powers of $3+2\sqrt2$.
[/example]
This first norm computation shows how a unit acts on one solution. For $D=3$, the same recurrence is simple enough to display as a complete family.
[example: All Solutions For D Equals Three]
For $D=3$, the fundamental unit is $2+\sqrt3$, and its norm is
\begin{align*}
N(2+\sqrt3)=2^2-3\cdot 1^2=4-3=1.
\end{align*}
By *Generation Of Positive Pell Solutions*, every positive solution is obtained from a positive power of this unit:
\begin{align*}
x_n+y_n\sqrt3=(2+\sqrt3)^n,\qquad n\ge 1.
\end{align*}
The first power is
\begin{align*}
(2+\sqrt3)^1=2+\sqrt3,
\end{align*}
so $(x_1,y_1)=(2,1)$, and
\begin{align*}
2^2-3\cdot 1^2=4-3=1.
\end{align*}
The second power is
\begin{align*}
(2+\sqrt3)^2
&=2^2+2\cdot 2\sqrt3+(\sqrt3)^2\\
&=4+4\sqrt3+3\\
&=7+4\sqrt3,
\end{align*}
so $(x_2,y_2)=(7,4)$, and
\begin{align*}
7^2-3\cdot 4^2=49-3\cdot 16=49-48=1.
\end{align*}
Multiplying once more,
\begin{align*}
(2+\sqrt3)^3
&=(7+4\sqrt3)(2+\sqrt3)\\
&=7\cdot 2+7\sqrt3+4\sqrt3\cdot 2+4\sqrt3\cdot\sqrt3\\
&=14+7\sqrt3+8\sqrt3+4\cdot 3\\
&=26+15\sqrt3,
\end{align*}
so $(x_3,y_3)=(26,15)$, and
\begin{align*}
26^2-3\cdot 15^2=676-3\cdot 225=676-675=1.
\end{align*}
Thus the first three positive solutions are $(2,1)$, $(7,4)$, and $(26,15)$, and all later positive solutions are produced by continuing the powers of $2+\sqrt3$.
[/example]
Not every value of $D$ gives the same behavior. The next example contrasts the preceding family with a norm obstruction for $D=2$.
[example: A Norm Obstruction For D Equals Two]
For $D=2$, compute the norm of $1+\sqrt2$ using $N(x+y\sqrt D)=x^2-Dy^2$:
\begin{align*}
N(1+\sqrt2)&=1^2-2\cdot 1^2\\
&=1-2\\
&=-1.
\end{align*}
Since the norm is multiplicative, powers of $1+\sqrt2$ have norms
\begin{align*}
N\bigl((1+\sqrt2)^n\bigr)&=N(1+\sqrt2)^n\\
&=(-1)^n.
\end{align*}
Thus odd powers have norm $-1$, while even powers have norm $1$.
Squaring the norm-negative element gives
\begin{align*}
(1+\sqrt2)^2
&=1^2+2\cdot 1\cdot \sqrt2+(\sqrt2)^2\\
&=1+2\sqrt2+2\\
&=3+2\sqrt2.
\end{align*}
Its norm is therefore
\begin{align*}
N(3+2\sqrt2)&=3^2-2\cdot 2^2\\
&=9-2\cdot 4\\
&=9-8\\
&=1,
\end{align*}
so $(x,y)=(3,2)$ solves
\begin{align*}
x^2-2y^2=1.
\end{align*}
This illustrates the obstruction-and-squaring pattern: a solution of $x^2-Dy^2=-1$, when it exists, can be squared to produce a solution of the positive Pell equation.
[/example]
## Period Parity and the Negative Equation
What information is contained in the sign obtained after one continued-fraction period? The same calculation that solves Pell equation also detects when $x^2-Dy^2=-1$ is soluble.
[quotetheorem:4502]
[citeproof:4502]
The theorem separates two superficially similar equations: $x^2-Dy^2=1$ is always soluble for nonsquare positive $D$, while $x^2-Dy^2=-1$ is controlled by period parity. The converse depends on the continued-fraction best-approximation theorem, so it is not merely a statement about factoring norms. For example, $1^2-2\cdot1^2=-1$ and the period length of $\sqrt2$ is $1$. By contrast, $x^2-3y^2=-1$ has no integer solution, and the period length of $\sqrt3$ is $2$.
## The Complete Pell Algorithm
How should a Pell equation be solved in practice? The computation has three stages: find the continued-fraction period of $\sqrt D$, extract the fundamental unit, and take its powers.
[explanation: Pell Equation Algorithm]
Given positive nonsquare $D$, compute
\begin{align*}
\sqrt D=[a_0;\overline{a_1,\dots,a_\ell}].
\end{align*}
If $\ell$ is even, take $(x_1,y_1)=(p_{\ell-1},q_{\ell-1})$. If $\ell$ is odd, take $(x_1,y_1)=(p_{2\ell-1},q_{2\ell-1})$. Put $\varepsilon_1=x_1+y_1\sqrt D$. Then every positive solution is determined by
\begin{align*}
x_k+y_k\sqrt D=\varepsilon_1^k,\qquad k\in\mathbb N.
\end{align*}
[/explanation]
This closes the transition from parametrizing individual conics to studying arithmetic structure. Earlier methods used congruences, factorization, and descent to rule out or classify solution sets; Pell equations show how an infinite family of integer solutions can be generated by one unit in a quadratic order. Chapter 7 turns next to local solubility for quadratic forms.
Pell's equation completes the transition from isolated solutions to structured infinite families. The chapter on quadratic forms then asks how far the same local and arithmetic ideas can be pushed when the equation is no longer tied to a single recurrence or unit group.
# 7. Quadratic Forms and Local Conditions
This chapter studies quadratic equations after the earlier chapters have developed parametrisations, congruences, descent, and Pell-type methods. The guiding question is whether a quadratic equation has a solution over the integers or rationals, and how much of that question can be tested by finite congruence calculations before attempting a global search.
Quadratic forms are the first setting where local tests become especially systematic. We focus on elementary congruence obstructions and Legendre's theorem for ternary forms, while mentioning deeper local-global theorems only as context.
## Quadratic Forms Over Different Ground Rings
The first problem is to decide what counts as the same quadratic equation when variables may be changed. Over $\mathbb Z$ we care about integral changes of variables and divisibility; over $\mathbb Q$, $\mathbb R$, and $\mathbb Q_p$ we can divide by more scalars, so the same-looking equation may become simpler.
[definition: Quadratic Form]
Let $R$ be a commutative ring. A quadratic form in $n$ variables over $R$ is a map
\begin{align*}
q:R^n\to R
\end{align*}
whose coordinate expression is a homogeneous polynomial
\begin{align*}
q(x_1,\dots,x_n)=\sum_{1\le i\le j\le n} a_{ij}x_ix_j
\end{align*}
with coefficients $a_{ij}\in R$.
[/definition]
The word homogeneous matters: $q(\lambda x)=\lambda^2q(x)$, so rational and integral solutions are naturally studied projectively. A nonzero solution of $q=0$ is usually called an isotropic vector for $q$.
The central yes-or-no question for a quadratic form is whether such a nonzero zero exists over the field being studied. Naming this property lets us separate forms that define rational points on their projective quadric from forms whose only zero is the zero vector.
[definition: Isotropic Quadratic Form]
Let $F$ be a field and let $q$ be a quadratic form over $F$ in $n$ variables. The form $q$ is isotropic over $F$ if there exists $x=(x_1,\dots,x_n)\in F^n\setminus\{0\}$ such that $q(x)=0$.
[/definition]
When $q$ is not isotropic over $F$, it is called anisotropic over $F$. For Diophantine equations, isotropy is the projective version of solubility: the equation $q=0$ has a nonzero rational solution exactly when the projective quadric it defines has a rational point.
[example: Binary Form As Norm Equation]
Let $d\in\mathbb Q$. We show that the binary form
\begin{align*}
q(x,y)=x^2-dy^2
\end{align*}
is isotropic over $\mathbb Q$ exactly when $d$ is a square in $\mathbb Q$.
Suppose first that $q$ is isotropic. Then there is a nonzero pair $(x,y)\in\mathbb Q^2$ such that
\begin{align*}
x^2-dy^2=0.
\end{align*}
If $y=0$, then the equation becomes
\begin{align*}
x^2-d\cdot 0^2=x^2=0,
\end{align*}
so $x=0$, contradicting that $(x,y)$ is nonzero. Hence $y\ne 0$, and division by $y^2$ is allowed. From
\begin{align*}
x^2-dy^2&=0
\end{align*}
we get
\begin{align*}
x^2&=dy^2,\\
\frac{x^2}{y^2}&=d,\\
\left(\frac{x}{y}\right)^2&=d.
\end{align*}
Thus $d$ is a square in $\mathbb Q$.
Conversely, suppose $d=r^2$ for some $r\in\mathbb Q$. Then the nonzero vector $(r,1)\in\mathbb Q^2$ satisfies
\begin{align*}
q(r,1)&=r^2-d\cdot 1^2\\
&=r^2-d\\
&=r^2-r^2\\
&=0.
\end{align*}
So $q$ is isotropic over $\mathbb Q$. Therefore this binary diagonal form detects exactly whether $d$ is trivial in the square-class group of $\mathbb Q$.
[/example]
Over $\mathbb Q$ or $\mathbb R$, completing the square often converts a quadratic form into a diagonal one. Over $\mathbb Z$ this must be handled more carefully, because changes of variables can introduce denominators or fail to preserve integral solutions. For this course, diagonal forms are therefore treated as a useful normal form for examples, not as a replacement for checking the original integer equation.
The number of variables is the next organizing feature. Two-variable and three-variable forms behave differently geometrically: binary forms give the first norm-style examples, while ternary forms are the natural language for projective plane conics and many homogenized Diophantine equations.
[definition: Binary And Ternary Quadratic Forms]
A binary quadratic form is a quadratic form in two variables. A ternary quadratic form is a quadratic form in three variables.
[/definition]
Binary forms lead to conics with two variables after projectivisation, while ternary forms define plane conics. The ternary case is central because equations such as $ax^2+by^2+cz^2=0$ include many classical Diophantine problems after homogenisation.
[example: Homogenising A Conic]
The affine equation $X^2+Y^2=3$ is homogenised by introducing a new variable $z$ and replacing
\begin{align*}
X=\frac{x}{z},\qquad Y=\frac{y}{z}
\end{align*}
when $z\ne 0$. Then
\begin{align*}
X^2+Y^2=3
&\Longleftrightarrow \left(\frac{x}{z}\right)^2+\left(\frac{y}{z}\right)^2=3\\
&\Longleftrightarrow \frac{x^2}{z^2}+\frac{y^2}{z^2}=3\\
&\Longleftrightarrow \frac{x^2+y^2}{z^2}=3\\
&\Longleftrightarrow x^2+y^2=3z^2,
\end{align*}
where the last step multiplies by $z^2$, which is allowed because $z\ne 0$.
Thus any rational affine solution $(X,Y)$ gives a rational homogeneous solution
\begin{align*}
(x,y,z)=(X,Y,1),
\end{align*}
since
\begin{align*}
x^2+y^2&=X^2+Y^2\\
&=3\\
&=3\cdot 1^2\\
&=3z^2.
\end{align*}
Conversely, any rational homogeneous solution $(x,y,z)$ with $z\ne 0$ gives the affine solution
\begin{align*}
(X,Y)=\left(\frac{x}{z},\frac{y}{z}\right),
\end{align*}
because the displayed equivalences above run in reverse.
Finally, if $(x,y,z)\in\mathbb Z^3\setminus\{0\}$ satisfies
\begin{align*}
x^2+y^2=3z^2,
\end{align*}
and $g=\gcd(x,y,z)$, then write
\begin{align*}
x=gx_0,\qquad y=gy_0,\qquad z=gz_0
\end{align*}
with $\gcd(x_0,y_0,z_0)=1$. Substituting gives
\begin{align*}
(gx_0)^2+(gy_0)^2&=3(gz_0)^2,\\
g^2x_0^2+g^2y_0^2&=3g^2z_0^2,\\
g^2(x_0^2+y_0^2)&=g^2(3z_0^2).
\end{align*}
Since $g\ne 0$, division by $g^2$ gives
\begin{align*}
x_0^2+y_0^2=3z_0^2.
\end{align*}
So the homogeneous equation records the projective conic, and integral homogeneous solutions may be represented by primitive triples.
[/example]
## Solubility Modulo Prime Powers
The next problem is to rule out solutions before attempting an infinite search. Congruences give finite tests: if an integer equation has a solution, then it has a solution modulo every positive integer, and in particular modulo every prime power.
[definition: Solubility Modulo A Prime Power]
Let $f\in\mathbb Z[x_1,\dots,x_n]$, let $p$ be prime, and let $k\in\mathbb N$. The congruence $f=0$ is soluble modulo $p^k$ if there exists $x=(x_1,\dots,x_n)\in(\mathbb Z/p^k\mathbb Z)^n$ such that
\begin{align*}
f(x)\equiv 0\pmod{p^k}.
\end{align*}
For a homogeneous equation, a solution modulo $p^k$ is primitive if at least one coordinate is not divisible by $p$.
[/definition]
Primitive solubility is the right modular shadow of a nonzero integer solution to a homogeneous equation. If all coordinates are divisible by $p$, the solution may only be the zero solution reduced modulo $p^k$.
The obstruction we want is one-way: a genuine primitive integer solution must survive reduction modulo every prime power. Therefore a single prime power with no primitive congruence solution is enough to rule out the global equation before any search for integer solutions begins.
[quotetheorem:4503]
[citeproof:4503]
The contrapositive is the useful direction: failure modulo one prime power proves there is no primitive integer solution. Homogeneity is needed because scaling and reduction preserve the equation in a projective way; for a non-homogeneous equation, reducing a solution may lose the relation between primitive integer vectors and projective local data. Primitivity is also essential: the zero vector always solves a homogeneous congruence, and a vector all of whose coordinates are divisible by $p$ may carry no information about a nonzero rational or integral point. Thus the theorem gives a necessary local test, not a sufficient one.
[example: The Equation x Squared Plus y Squared Equals Three z Squared]
Consider the homogeneous equation
\begin{align*}
x^2+y^2=3z^2.
\end{align*}
We show that it has no primitive integer solution by reducing modulo $4$. For any integer $n$, either $n=2m$ or $n=2m+1$. In the first case,
\begin{align*}
n^2=(2m)^2=4m^2\equiv 0\pmod 4,
\end{align*}
and in the second case,
\begin{align*}
n^2=(2m+1)^2=4m^2+4m+1=4(m^2+m)+1\equiv 1\pmod 4.
\end{align*}
Thus every square is congruent to $0$ or $1$ modulo $4$.
Suppose, for contradiction, that $(x,y,z)$ is a primitive integer solution. If $z$ is odd, then
\begin{align*}
z^2&\equiv 1\pmod 4,\\
3z^2&\equiv 3\cdot 1\equiv 3\pmod 4.
\end{align*}
But $x^2$ and $y^2$ are each congruent to $0$ or $1$ modulo $4$, so the possible residues of the left-hand side are
\begin{align*}
x^2+y^2&\equiv 0+0\equiv 0\pmod 4,\\
x^2+y^2&\equiv 0+1\equiv 1\pmod 4,\\
x^2+y^2&\equiv 1+0\equiv 1\pmod 4,\\
x^2+y^2&\equiv 1+1\equiv 2\pmod 4.
\end{align*}
So $x^2+y^2$ cannot be congruent to $3$ modulo $4$, contradicting $x^2+y^2=3z^2$. Hence $z$ is even.
Since $z$ is even, write $z=2w$. Then
\begin{align*}
3z^2&=3(2w)^2\\
&=3\cdot 4w^2\\
&=12w^2\\
&\equiv 0\pmod 4.
\end{align*}
The equation therefore gives
\begin{align*}
x^2+y^2\equiv 0\pmod 4.
\end{align*}
From the list of possible residues above, this can happen only when
\begin{align*}
x^2\equiv 0\pmod 4,\qquad y^2\equiv 0\pmod 4,
\end{align*}
so $x$ and $y$ are even. Thus $x,y,z$ are all divisible by $2$, contradicting primitivity. Therefore the congruence test modulo $4$ proves that the equation has no primitive integer solution before any numerical search begins.
[/example]
The modulo $4$ example gives one local obstruction, but a complete local test must look simultaneously at the real numbers and at every prime. Solubility modulo every $p^k$ is packaged by the $p$-adic numbers: a $p$-adic solution records a coherent sequence of solutions modulo $p, p^2, p^3,\dots$. We write $\mathbb{Q}_p$ for the field of $p$-adic numbers and $\mathbb{Z}_p$ for its subring of $p$-adic integers, the elements whose $p$-adic absolute value is at most $1$.
Local solubility names the situation in which no obstruction appears in any of these completions. It is a necessary test for rational solubility: a rational solution would be visible over $\mathbb R$ and over every $\mathbb Q_p$.
[definition: Local Solubility]
Let $q\in\mathbb Q[x_1,\dots,x_n]$ be a quadratic form. The equation $q=0$ is locally soluble if it has a nonzero solution over $\mathbb R$ and a nonzero solution over $\mathbb Q_p$ for every prime $p$.
[/definition]
For a homogeneous equation with integral coefficients, a nonzero solution over $\mathbb{Q}_p$ is equivalent, after scaling, to a primitive solution over $\mathbb{Z}_p$. In practice this is checked by solving congruences modulo increasing powers of $p$, often using a lifting result.
[remark: Simple Root Lifting]
If a polynomial congruence has a solution modulo $p$ at which at least one relevant derivative is nonzero modulo $p$, then the solution can often be lifted coherently to solutions modulo $p^2,p^3,\dots$. This is the practical content of Hensel lifting used here: nonsingular congruence solutions usually represent genuine local solutions.
[/remark]
This lifting theorem explains why many local checks stop after a small modulus: once a nonsingular solution is found modulo $p$, it automatically represents a genuine $p$-adic solution. The nonzero partial derivative hypothesis cannot be dropped; for example, $f(x)=x^2-p$ has $x=0$ as a solution modulo $p$ with derivative $0$ modulo $p$, but for odd $p$ it has no solution in $\mathbb Q_p$ because a square cannot have odd $p$-adic valuation. Singular solutions modulo $p$ therefore require separate analysis, and for $p=2$ the first useful modulus is often $4$ or $8$ rather than $2$. This is why local solubility computations combine Hensel lifting with explicit congruence checks at the bad primes.
## Local Obstructions Without Heavy Machinery
The local viewpoint can be developed much further in advanced quadratic form theory, but this course only needs the elementary shadow of that theory: reduce the equation modulo carefully chosen prime powers, check real sign obstructions, and use those failures to rule out global solutions. The advanced language is useful in later courses because it packages many congruence calculations at once, but the computations here remain explicit.
## Legendre's Theorem For Ternary Quadratic Forms
The main problem for ternary diagonal forms is to decide when
\begin{align*}
ax^2+by^2+cz^2=0
\end{align*}
has a nonzero integer or rational solution. Legendre's theorem gives an explicit square-class and congruence criterion in the case where the coefficients are squarefree and pairwise coprime.
[quotetheorem:4505]
[citeproof:4505]
The sign condition is the real local condition: if $a,b,c$ all have the same sign, the only real solution is the zero vector. The three residue conditions are the finite local conditions at primes dividing the coefficients.
Legendre's theorem should be read as an explicit local-to-global theorem for ternary quadratic forms. The squarefree hypothesis removes repeated prime powers so that residue conditions modulo primes combine cleanly into conditions modulo $|a|$, $|b|$, and $|c|$. Pairwise coprimality is what lets the reduction modulo a prime dividing one coefficient leave the other two coefficients invertible; without it, the local conditions must include higher-adic information. The sign hypothesis is exactly the real condition: a definite form such as $x^2+y^2+z^2$ cannot have a nonzero real solution, regardless of finite congruences. Thus the theorem is powerful but deliberately normalised before use.
[example: Legendre Applied To x Squared Plus y Squared Equals Three z Squared]
We rewrite
\begin{align*}
x^2+y^2=3z^2
\end{align*}
as
\begin{align*}
x^2+y^2-3z^2=0,
\end{align*}
so this is the diagonal ternary equation
\begin{align*}
ax^2+by^2+cz^2=0
\end{align*}
with
\begin{align*}
a=1,\qquad b=1,\qquad c=-3.
\end{align*}
These coefficients are nonzero, squarefree, pairwise coprime, and not all of the same sign, so *Legendre's theorem on ternary quadratic forms* applies.
The three residue conditions are
\begin{align*}
-bc&=-(1)(-3)=3 \quad \text{modulo } |a|=1,\\
-ca&=-(-3)(1)=3 \quad \text{modulo } |b|=1,\\
-ab&=-(1)(1)=-1 \quad \text{modulo } |c|=3.
\end{align*}
The first two conditions are modulo $1$, so they impose no restriction. The remaining condition asks whether $-1$ is a quadratic residue modulo $3$.
The residue classes modulo $3$ are $0,1,2$. Their squares are
\begin{align*}
0^2&=0\equiv 0\pmod 3,\\
1^2&=1\equiv 1\pmod 3,\\
2^2&=4=3+1\equiv 1\pmod 3.
\end{align*}
Thus the only square residues modulo $3$ are $0$ and $1$. But
\begin{align*}
-1\equiv 2\pmod 3,
\end{align*}
so $-1$ is not a quadratic residue modulo $3$. Therefore one of Legendre's residue conditions fails, and the equation
\begin{align*}
x^2+y^2=3z^2
\end{align*}
has no nonzero integer solution.
[/example]
This example matches the earlier modulo $4$ obstruction only in conclusion, not in mechanism. Legendre sees the obstruction at $3$, while the parity argument sees that any primitive integer solution would force all variables even.
## How Local Conditions Guide Computation
The practical question is how to use the theory without computing in every completion from scratch. For a diagonal rational quadratic form, only finitely many primes require serious attention: the prime $2$ and primes dividing the coefficients after clearing denominators and removing square factors.
[explanation: Finite Local Checking]
Start with a rational diagonal form $a_1x_1^2+\dots+a_nx_n^2$ and multiply by a rational square if convenient. Clear denominators and remove square factors from each coefficient, since square factors can be absorbed into variables over $\mathbb Q$ and over $\mathbb Q_p$. For primes not dividing $2a_1\cdots a_n$, the reduction modulo $p$ is nonsingular and has enough points in the ternary and higher-dimensional cases relevant here. The difficult primes are therefore among the finite set dividing $2a_1\cdots a_n$.
[/explanation]
This finite reduction is what makes local solubility useful in computations. It replaces a search through infinitely many possible integer triples by a short list of sign, residue, and lifting checks.
[example: Congruence Failure Before Search]
Suppose, for contradiction, that there is a primitive integer solution $(x,y,z)$ to
\begin{align*}
x^2+y^2=3z^2.
\end{align*}
Reducing the equation modulo $3$ gives
\begin{align*}
x^2+y^2&\equiv 3z^2 \pmod 3,\\
x^2+y^2&\equiv 0 \pmod 3.
\end{align*}
The square residues modulo $3$ are
\begin{align*}
0^2&\equiv 0\pmod 3,\\
1^2&\equiv 1\pmod 3,\\
2^2&=4\equiv 1\pmod 3.
\end{align*}
Thus each of $x^2$ and $y^2$ is congruent to either $0$ or $1$ modulo $3$. The possible sums are
\begin{align*}
0+0&\equiv 0\pmod 3,\\
0+1&\equiv 1\pmod 3,\\
1+0&\equiv 1\pmod 3,\\
1+1&\equiv 2\pmod 3.
\end{align*}
Since $x^2+y^2\equiv 0\pmod 3$, the only possible case is
\begin{align*}
x^2\equiv 0\pmod 3,\qquad y^2\equiv 0\pmod 3.
\end{align*}
From the list of square residues, this forces
\begin{align*}
x\equiv 0\pmod 3,\qquad y\equiv 0\pmod 3.
\end{align*}
So write $x=3x_1$ and $y=3y_1$ with $x_1,y_1\in\mathbb Z$. Substituting into the original equation gives
\begin{align*}
(3x_1)^2+(3y_1)^2&=3z^2,\\
9x_1^2+9y_1^2&=3z^2,\\
9(x_1^2+y_1^2)&=3z^2.
\end{align*}
Dividing by $3$ gives
\begin{align*}
3(x_1^2+y_1^2)=z^2,
\end{align*}
so $z^2$ is divisible by $3$. Again using the square residues modulo $3$, this implies
\begin{align*}
z\equiv 0\pmod 3.
\end{align*}
Therefore $x,y,z$ are all divisible by $3$, contradicting the assumption that the solution is primitive. Hence the equation has no primitive integer solution, and this finite congruence check rules out the search before any triples are tested.
[/example]
The chapter's moral is that local conditions are not merely quick tricks. For quadratic forms over $\mathbb Q$, they are the complete existence theory; for integral equations, they are the first and often decisive filter before descent, parametrisation, or explicit search begins.
Quadratic forms show both the strength and the limits of local reasoning: for quadratic equations, local conditions can often settle existence, but they do not simplify every problem. Once degree rises beyond two, the methods developed so far begin to lose their reach, and new geometric ideas become necessary.
# 8. Beyond Quadratics: Cubics and Higher Degree Equations
This chapter marks the point where the elementary methods developed for linear and quadratic Diophantine equations begin to lose their force. For conics, a single rational point often unlocks a parametrization of all rational points, and integer solutions can then be studied through divisibility and congruence conditions on the parameters. For cubics and higher degree equations, rational points no longer behave like a one-parameter family in general, and the distinction between rational and integer solutions becomes more structural.
## Why the Slope Method Stops Being Enough
The central question is: why does the successful parametrization of conics not extend to arbitrary higher degree curves? The slope method works for a conic because a line through a known rational point meets the conic in exactly one further point, counted with multiplicity, and this second point is forced to have rational coordinates when the line has rational slope. For a cubic, the same line usually meets the curve in two further points, so a single slope no longer selects a single new solution.
[explanation: Degree And Intersections]
A plane curve of degree $d$ is expected to meet a line in $d$ points over an algebraic closure, counted with multiplicity. For a quadratic curve, fixing one rational point leaves one unknown intersection point. The coefficients of the resulting quadratic equation are rational, so if one root is rational then the other root is rational.
For a cubic curve, fixing one rational point leaves a quadratic equation for the remaining intersections. The two further roots may be irrational, or they may be rational only together after an additional arithmetic condition. This is the first sign that rational points on cubics have a richer structure than rational points on conics.
[/explanation]
The explanation gives the organizing idea; the example now supplies a test case. This keeps the next construction tied to computations rather than only to terminology.
[example: Unit Circle Parametrization Revisited]
Let $t \in \mathbb Q$, and consider the line of rational slope through the known rational point $(-1,0)$:
\begin{align*}
Y=t(X+1).
\end{align*}
Substituting this into $X^2+Y^2=1$ gives
\begin{align*}
X^2+t^2(X+1)^2&=1,\\
X^2+t^2(X^2+2X+1)&=1,\\
(1+t^2)X^2+2t^2X+t^2-1&=0.
\end{align*}
Since $X=-1$ is the known intersection point, the quadratic must have a factor $X+1$. Dividing visibly,
\begin{align*}
(1+t^2)X^2+2t^2X+t^2-1
&=(X+1)\bigl((1+t^2)X+(t^2-1)\bigr),
\end{align*}
because
\begin{align*}
(X+1)\bigl((1+t^2)X+(t^2-1)\bigr)
&=(1+t^2)X^2+(t^2-1)X+(1+t^2)X+(t^2-1)\\
&=(1+t^2)X^2+2t^2X+t^2-1.
\end{align*}
The second intersection therefore satisfies
\begin{align*}
(1+t^2)X+(t^2-1)&=0,\\
(1+t^2)X&=1-t^2,\\
X&=\frac{1-t^2}{1+t^2}.
\end{align*}
Then
\begin{align*}
Y=t(X+1)
&=t\left(\frac{1-t^2}{1+t^2}+1\right)\\
&=t\left(\frac{1-t^2+1+t^2}{1+t^2}\right)\\
&=\frac{2t}{1+t^2}.
\end{align*}
Thus every rational slope $t$ gives the rational point
\begin{align*}
X=\frac{1-t^2}{1+t^2}, \qquad Y=\frac{2t}{1+t^2},
\end{align*}
which is the quadratic benchmark for the cubic failure discussed here.
[/example]
The same line-intersection idea does not disappear for cubics; instead, it changes character. A line through two rational points on a cubic has a third intersection point whose coordinates are rational. This observation is the beginning of the group law on elliptic curves, but it is not a parametrization by one free rational parameter.
[example: A Cubic Line Intersection]
Let $C$ be the affine cubic $y^2=x^3-2$, and let $y=mx+b$ be a line with $m,b\in\mathbb Q$. Substituting the line equation into the curve gives
\begin{align*}
(mx+b)^2&=x^3-2,\\
m^2x^2+2mbx+b^2&=x^3-2,\\
0&=x^3-m^2x^2-2mbx-(b^2+2).
\end{align*}
Thus the $x$-coordinates of the intersection points are the roots of
\begin{align*}
f(x)=x^3-m^2x^2-2mbx-(b^2+2)\in\mathbb Q[x].
\end{align*}
Suppose the line meets $C$ in two rational points whose $x$-coordinates are $r_1,r_2\in\mathbb Q$, and let $r_3$ be the remaining root, counted with multiplicity. Since $f$ is monic,
\begin{align*}
f(x)&=(x-r_1)(x-r_2)(x-r_3).
\end{align*}
Expanding the right-hand side,
\begin{align*}
(x-r_1)(x-r_2)(x-r_3)
&=(x^2-(r_1+r_2)x+r_1r_2)(x-r_3)\\
&=x^3-r_3x^2-(r_1+r_2)x^2+(r_1+r_2)r_3x+r_1r_2x-r_1r_2r_3\\
&=x^3-(r_1+r_2+r_3)x^2+\bigl((r_1+r_2)r_3+r_1r_2\bigr)x-r_1r_2r_3.
\end{align*}
Comparing the coefficient of $x^2$ with
\begin{align*}
f(x)=x^3-m^2x^2-2mbx-(b^2+2),
\end{align*}
we get
\begin{align*}
-(r_1+r_2+r_3)&=-m^2,\\
r_1+r_2+r_3&=m^2,\\
r_3&=m^2-r_1-r_2.
\end{align*}
Because $m,r_1,r_2\in\mathbb Q$, this shows $r_3\in\mathbb Q$. The corresponding $y$-coordinate is
\begin{align*}
y_3=mr_3+b,
\end{align*}
which is also rational because $m,b,r_3\in\mathbb Q$. So a line through two rational points on this cubic produces a third rational point, but the input is two rational points together with a line, not one freely chosen rational slope that parametrizes all rational points.
[/example]
The contrast with cubics shows what is special about conics. A line through one known rational point on a conic has only one remaining intersection point, so the slope of the line can serve as a single rational parameter. This is the same elementary parametrization idea used earlier for the unit circle.
[explanation: Plane Conic Parametrization]
For an irreducible plane conic over $\mathbb Q$ with a rational point, projection from that point gives a rational parametrization of the remaining rational points. In concrete terms, choose a rational slope, write the line through the known point, substitute into the conic equation, and solve for the second intersection.
[/explanation]
This parametrization is the conceptual reason the early part of the course could solve so many quadratic equations explicitly. Its role is to separate two tasks that often look the same in examples: first produce a rational parametrization of the rational points on an irreducible conic, and only afterward impose divisibility, coprimality, and parity conditions when the original problem asks for integer solutions. Reducible conics such as $xy=0$ sit outside this picture because they are unions of simpler components rather than one irreducible curve controlled by projection from a chosen rational point. The next sections explain why the analogous cubic problems lead to finiteness theorems and group structures rather than elementary parametrizations.
## Mordell-Type Equations As First Elliptic Curves
The next question is: what is the simplest cubic equation whose integer points already show new behaviour? A standard family is
\begin{align*}
y^2 = x^3 + k,
\end{align*}
where $k \in \mathbb Z$ is fixed. These are called Mordell-type equations, and for many values of $k$ they are affine models of elliptic curves after adding a point at infinity.
[definition: Mordell-Type Equation]
Let $k \in \mathbb Z$. A Mordell-type equation is an equation of the form
\begin{align*}
y^2 = x^3 + k
\end{align*}
in unknowns $x,y \in \mathbb Z$ or $x,y \in \mathbb Q$.
[/definition]
The same polynomial equation can be studied over different number systems. The set of integer solutions is denoted by
\begin{align*}
\{(x,y) \in \mathbb Z^2 : y^2 = x^3+k\},
\end{align*}
while the rational solutions lie in
\begin{align*}
\{(x,y) \in \mathbb Q^2 : y^2 = x^3+k\}.
\end{align*}
Rational points are more flexible because denominators are allowed; integer points are more rigid because divisibility constraints remain visible.
To use the geometry of cubic curves systematically, we need to restrict to the nonsingular cubics where tangents and chords behave predictably. The Weierstrass form isolates the standard family used for elliptic curves and adds the point at infinity required by the rational group law.
[definition: Elliptic Curve In Weierstrass Form]
Let $A,B \in \mathbb Q$ satisfy $4A^3 + 27B^2 \ne 0$. An elliptic curve in Weierstrass form over $\mathbb Q$ is the projective curve obtained from
\begin{align*}
y^2 = x^3 + Ax + B
\end{align*}
together with its point at infinity.
[/definition]
The condition $4A^3+27B^2 \ne 0$ excludes singular cubics. For the Mordell-type curve $y^2=x^3+k$, this says $k \ne 0$. The point at infinity is not an integer solution of the affine equation, but it is essential for the rational group law.
[example: Small Integer Solutions To A Mordell Equation]
Consider
\begin{align*}
y^2=x^3-2.
\end{align*}
Since $y^2\ge 0$ for every integer $y$, any integer solution must satisfy $x^3-2\ge 0$. If $x\le 1$, then $x^3\le 1$, so
\begin{align*}
x^3-2\le 1-2=-1<0,
\end{align*}
and no such $x$ can occur. For $x=3$,
\begin{align*}
3^3-2=27-2=25=5^2=(-5)^2,
\end{align*}
so $(3,5)$ and $(3,-5)$ are integer solutions.
For $0\le x\le 10$, the right-hand side is computed value by value:
\begin{align*}
0^3-2&=-2,\\
1^3-2&=-1,\\
2^3-2&=8-2=6,\\
3^3-2&=27-2=25,\\
4^3-2&=64-2=62,\\
5^3-2&=125-2=123,\\
6^3-2&=216-2=214,\\
7^3-2&=343-2=341,\\
8^3-2&=512-2=510,\\
9^3-2&=729-2=727,\\
10^3-2&=1000-2=998.
\end{align*}
The negative values $-2$ and $-1$ are not squares. Among the positive values, we compare each with consecutive integer squares:
\begin{align*}
2^2=4&<6<9=3^2,\\
25&=5^2,\\
7^2=49&<62<64=8^2,\\
11^2=121&<123<144=12^2,\\
14^2=196&<214<225=15^2,\\
18^2=324&<341<361=19^2,\\
22^2=484&<510<529=23^2,\\
26^2=676&<727<729=27^2,\\
31^2=961&<998<1024=32^2.
\end{align*}
Thus, in the range $0\le x\le 10$, the only value producing a square is $x=3$, giving $y=\pm 5$. This computation finds the small integral solutions in the chosen range, but it does not prove that larger values of $x$ cannot produce more solutions.
[/example]
Congruences still help, but they rarely settle the full equation. For instance, reducing $y^2=x^3-2$ modulo a prime can rule out some residue classes for $x$, yet many classes often remain. The need to combine local congruence information with global structure is one of the main transitions from elementary Diophantine equations to arithmetic geometry.
[remark: Mordell's Theorem As Context]
Mordell's theorem says that the rational points on an elliptic curve over $\mathbb Q$ form a finitely generated abelian group. Here it is used only as orientation: it explains why rational points on cubics are organized by finitely many generators and the geometric addition law, even though finding those generators is not an elementary task.
[/remark]
This context result means that all rational points on an elliptic curve can be generated from finitely many basic rational points using the geometric addition law. It does not say that finding those generators is a routine elementary task.
## Rational Points Versus Integer Points
The guiding question is now: if rational points on an elliptic curve may be infinite, why should integer points often be finite? Denominators can vary indefinitely under the group law, while integer points must satisfy a much stricter divisibility condition. The contrast is already visible in simple examples where rational operations quickly produce fractions even when starting from integral points.
[definition: Integral Point On An Affine Curve]
Let $F \in \mathbb Z[x,y]$. An integral point on the affine curve $F(x,y)=0$ is a pair $(a,b) \in \mathbb Z^2$ such that $F(a,b)=0$.
[/definition]
It is useful to compare this with the same equation considered over $\mathbb Q$. Allowing rational coordinates changes the nature of the problem: line intersections, tangent constructions, and group laws may produce points with denominators even when the original equation has integral coefficients. The rational notion records all such solutions before one asks the harder question of which of them have integral coordinates.
[definition: Rational Point On An Affine Curve]
Let $F \in \mathbb Z[x,y]$. A rational point on the affine curve $F(x,y)=0$ is a pair $(a,b) \in \mathbb Q^2$ such that $F(a,b)=0$.
[/definition]
The definitions differ only by the ambient set, but the arithmetic consequences are substantial. Passing from integers to rationals allows scaling and line-intersection constructions; returning from rationals to integers requires controlling denominators, which is usually a much harder problem.
[example: Rational Points Need Not Stay Integral]
On the curve $y^2=x^3-2$, the point $P=(3,5)$ is integral because
\begin{align*}
5^2&=25,\\
3^3-2&=27-2=25.
\end{align*}
We compute the tangent line at $P$ and then find its third intersection with the cubic. Differentiating $y^2=x^3-2$ implicitly gives
\begin{align*}
2y\frac{dy}{dx}&=3x^2,
\end{align*}
so at $(3,5)$ the tangent slope is
\begin{align*}
m&=\frac{3x^2}{2y}\bigg|_{(3,5)}
=\frac{3\cdot 3^2}{2\cdot 5}
=\frac{27}{10}.
\end{align*}
Thus the tangent line is
\begin{align*}
y-5&=\frac{27}{10}(x-3),\\
y&=\frac{27}{10}x-\frac{81}{10}+5
=\frac{27}{10}x-\frac{81}{10}+\frac{50}{10}
=\frac{27}{10}x-\frac{31}{10}.
\end{align*}
Substitute this line into $y^2=x^3-2$:
\begin{align*}
\left(\frac{27}{10}x-\frac{31}{10}\right)^2&=x^3-2,\\
\frac{(27x-31)^2}{100}&=x^3-2,\\
(27x-31)^2&=100x^3-200.
\end{align*}
Expanding the left side gives
\begin{align*}
(27x-31)^2
&=(27x)^2-2\cdot 27x\cdot 31+31^2\\
&=729x^2-1674x+961,
\end{align*}
so
\begin{align*}
729x^2-1674x+961&=100x^3-200,\\
0&=100x^3-729x^2+1674x-1161.
\end{align*}
The tangent point $x=3$ occurs with multiplicity $2$, so we divide by $(x-3)^2=x^2-6x+9$. The factorization is
\begin{align*}
100x^3-729x^2+1674x-1161
&=(x^2-6x+9)(100x-129),
\end{align*}
because
\begin{align*}
(x^2-6x+9)(100x-129)
&=100x^3-129x^2-600x^2+774x+900x-1161\\
&=100x^3-729x^2+1674x-1161.
\end{align*}
The third intersection therefore has
\begin{align*}
100x-129&=0,\\
100x&=129,\\
x&=\frac{129}{100}.
\end{align*}
The elliptic curve group law reflects this third intersection across the $x$-axis, so the doubled point $2P$ has the same $x$-coordinate:
\begin{align*}
x(2P)=\frac{129}{100}.
\end{align*}
This is rational but not an integer, since $100\nmid 129$. Thus starting from the integral point $(3,5)$, the group law produces a rational point whose $x$-coordinate is nonintegral, showing that integral points form a much thinner subset than rational points.
[/example]
The group law produces many rational points, but the example shows that integrality is not preserved by adding points. This raises a different question from the rational one: even if an elliptic curve has infinitely many rational points, can it have infinitely many points whose coordinates are integers? Siegel's theorem gives the fundamental finiteness answer for this integral problem.
[remark: Siegel's Theorem As Context]
Siegel's theorem says that an affine curve of genus at least $1$ has only finitely many integral points under the standard hypotheses. In this course it functions as a finiteness landmark: it explains why equations such as $y^2=x^3+k$ are expected to have only finitely many integer solutions even when their rational points form an infinite group.
[/remark]
This context result explains why equations such as $y^2=x^3+k$ are expected to have only finitely many integer solutions even when their rational points form an infinite group. In its classical form it proves finiteness without directly listing all integral points.
[example: Conic Versus Cubic Fermat Equations]
For the quadratic Fermat equation
\begin{align*}
x^2+y^2=z^2,
\end{align*}
the parametrization from the conic method gives
\begin{align*}
x=m^2-n^2,\qquad y=2mn,\qquad z=m^2+n^2,
\end{align*}
with $m,n\in\mathbb Z$, $\gcd(m,n)=1$, and $m,n$ of opposite parity. Substituting these formulas verifies the equation:
\begin{align*}
x^2+y^2
&=(m^2-n^2)^2+(2mn)^2\\
&=(m^4-2m^2n^2+n^4)+4m^2n^2\\
&=m^4+2m^2n^2+n^4\\
&=(m^2+n^2)^2\\
&=z^2.
\end{align*}
This gives infinitely many primitive integer solutions, for example by taking $n=1$ and $m=2k$ with $k\ge 1$:
\begin{align*}
x&=4k^2-1,\\
y&=4k,\\
z&=4k^2+1.
\end{align*}
The values of $z=4k^2+1$ are distinct as $k$ varies, so these give infinitely many different solutions.
By contrast, the cubic Fermat equation
\begin{align*}
x^3+y^3=z^3
\end{align*}
has no nonzero integer solutions, by *Fermat for exponent three*. Thus the cubic equation is not obtained from the quadratic one by a small change in degree: the conic case has a rational parametrization producing infinite families, while the cubic case requires descent in a richer arithmetic setting.
[/example]
The contrast with the parametrized Pythagorean triples raises a sharper question: can the same equation remain solvable after the exponent is raised from $2$ to $3$? A direct search is misleading, because congruence obstructions alone do not immediately rule out every possible nonzero triple. What is needed is a structural argument showing that any hypothetical solution would force a smaller one, eventually contradicting positivity. This is the first place where the elementary-looking equation already requires a genuine descent principle rather than a parametrization.
[quotetheorem:4510]
[citeproof:4510]
This result shows that higher degree equations can be rigid in ways that conics are not. The word nonzero is essential: $(0,0,0)$ is always a solution of $x^3+y^3=z^3$, and allowing zeros also gives degenerate cases such as $x^3+0=x^3$. The theorem is also specific to exponent $3$; it is one case of Fermat's Last Theorem, not a proof of the corresponding statement for every exponent. Nor does it say that all nearby cubic equations lack solutions, since equations such as $y^2=x^3-2$ may have integer and rational solutions. The failure of parametrization is therefore not merely a technical inconvenience: the arithmetic of cubics is governed by new structures, including elliptic curve groups, descent, and deep finiteness theorems.
## What Elementary Methods Still Contribute
The final question of the chapter is: what remains useful from the elementary toolkit? Congruences, parity, coprimality, factorization, and descent still matter, but they usually provide partial information rather than complete classifications. Their role changes from solving every equation directly to reducing the problem to a smaller arithmetic core.
[example: A Congruence Filter For A Mordell Equation]
For the Mordell equation
\begin{align*}
y^2=x^3-2,
\end{align*}
we reduce both sides modulo $9$ and compute the possible residue classes.
The square residues modulo $9$ come from the nine residue classes:
\begin{align*}
0^2&\equiv 0 \pmod 9,\\
1^2&\equiv 1 \pmod 9,\\
2^2&=4\equiv 4 \pmod 9,\\
3^2&=9\equiv 0 \pmod 9,\\
4^2&=16\equiv 7 \pmod 9,\\
5^2&=25\equiv 7 \pmod 9,\\
6^2&=36\equiv 0 \pmod 9,\\
7^2&=49\equiv 4 \pmod 9,\\
8^2&=64\equiv 1 \pmod 9.
\end{align*}
Thus
\begin{align*}
y^2 \equiv 0,1,4,\text{ or }7 \pmod 9.
\end{align*}
Similarly, the cubic residues modulo $9$ are
\begin{align*}
0^3&\equiv 0 \pmod 9,\\
1^3&\equiv 1 \pmod 9,\\
2^3&=8\equiv 8 \pmod 9,\\
3^3&=27\equiv 0 \pmod 9,\\
4^3&=64\equiv 1 \pmod 9,\\
5^3&=125\equiv 8 \pmod 9,\\
6^3&=216\equiv 0 \pmod 9,\\
7^3&=343\equiv 1 \pmod 9,\\
8^3&=512\equiv 8 \pmod 9.
\end{align*}
Hence
\begin{align*}
x^3\equiv 0,1,\text{ or }8 \pmod 9.
\end{align*}
Subtracting $2$ from these three possibilities gives
\begin{align*}
0-2&\equiv -2\equiv 7 \pmod 9,\\
1-2&\equiv -1\equiv 8 \pmod 9,\\
8-2&\equiv 6 \pmod 9.
\end{align*}
Therefore
\begin{align*}
x^3-2\equiv 7,8,\text{ or }6 \pmod 9.
\end{align*}
Since $y^2=x^3-2$, the residue of $x^3-2$ must be one of the square residues $0,1,4,7$. Among $7,8,6$, only $7$ appears in the square-residue list, so
\begin{align*}
x^3-2&\equiv 7 \pmod 9,\\
x^3&\equiv 9\equiv 0 \pmod 9.
\end{align*}
From the cubic-residue computation, $x^3\equiv 0\pmod 9$ occurs exactly when
\begin{align*}
x\equiv 0,3,\text{ or }6 \pmod 9,
\end{align*}
which is equivalent to $3\mid x$. Thus every integer solution must have $3\mid x$. The known solution $x=3$, $y=\pm 5$ satisfies this necessary condition, but the congruence filter only restricts possible solutions; it does not rule out larger multiples of $3$.
[/example]
The example also points to a boundary of the idea. The following remark, The New Shape Of The Subject, records that interpretation before the construction is used again.
[remark: The New Shape Of The Subject]
For quadratic Diophantine equations, the course repeatedly turned geometry into parametrization and then turned parametrization into arithmetic. For cubic equations, geometry produces operations on points, while arithmetic asks which points are integral, which are rational, and which can be generated from known ones. This is the bridge from the elementary theory of Diophantine equations to the arithmetic of elliptic curves.
[/remark]
Cubic and higher-degree equations bring us to the boundary of the elementary theory, where geometry and arithmetic interact in deeper ways. Fermat's Last Theorem sits precisely at that boundary. The next chapter is a proof overview rather than a technical development: its purpose is to show how the familiar Diophantine strategy of attaching structure to a hypothetical solution survives in a modern setting.
# 9. Fermat's Last Theorem: History and Proof Overview
Fermat's Last Theorem is the point at which the elementary theory of Diophantine equations meets modern arithmetic geometry. Earlier chapters used parametrization, congruences, unique factorization in favorable rings, Pell units, local conditions, and infinite descent to solve particular equations. This chapter follows the historical path from Fermat's original claim through the classical exponents $3$ and $4$, then gives a high-level map of the twentieth-century proof. Advanced terms such as modularity and level lowering are used only to mark the logical steps of the proof, not as tools developed in detail here.
## Fermat's Statement and Prime Exponent Reduction
The first question is how much of Fermat's claim has to be proved separately. The equation has one exponent $n$, but divisibility of exponents lets us reduce the problem to a small family of cases.
[quotetheorem:4789]
The theorem is deliberately stated for positive integers. If zero or negative integers are allowed, the statement must be adjusted: signs can be absorbed when $n$ is odd, while zero gives degenerate solutions such as $0^n+b^n=b^n$. The restriction to positive integers also keeps the statement focused on the genuinely Diophantine obstruction: whether two nonzero like powers can add to another like power. In this form, the theorem separates the classical Pythagorean equation from all higher exponents, even though both are written in the same additive shape. The next issue is economy: proving the assertion for every integer exponent separately would obscure the divisibility relation among exponents.
[quotetheorem:4512]
[citeproof:4512]
This reduction explains the shape of the historical work. Fermat himself supplied the descent argument for $n=4$, Euler treated $n=3$ by methods in quadratic integer rings, and later work concentrated on prime exponents.
[example: Reducing Exponent Twelve]
Suppose, for contradiction, that positive integers $a,b,c$ satisfy
\begin{align*}
a^{12}+b^{12}=c^{12}.
\end{align*}
Set $A=a^3$, $B=b^3$, and $C=c^3$. Then
\begin{align*}
A^4+B^4
&=(a^3)^4+(b^3)^4\\
&=a^{12}+b^{12}\\
&=c^{12}\\
&=(c^3)^4\\
&=C^4.
\end{align*}
Thus the same solution produces a fourth-power equation.
It also produces a cubic equation. Set $X=a^4$, $Y=b^4$, and $Z=c^4$. Then
\begin{align*}
X^3+Y^3
&=(a^4)^3+(b^4)^3\\
&=a^{12}+b^{12}\\
&=c^{12}\\
&=(c^4)^3\\
&=Z^3.
\end{align*}
So a solution at exponent $12$ would force both a solution at exponent $4$ and a solution at exponent $3$; either *Fermat's Descent for the Fourth Power* or *Fermat's Last Theorem for Exponent Three* rules out the original equation.
[/example]
## The Classical Case of the Fourth Power
Why is exponent $4$ accessible by elementary methods? The key is that a solution to $x^4+y^4=z^4$ gives a smaller solution to a closely related equation, contradicting the well-ordering of the positive integers.
[quotetheorem:4513]
[citeproof:4513]
The descent proves the stronger equation $x^4+y^4=z^2$ rather than only $x^4+y^4=z^4$. This strengthening is typical of infinite descent: the equation is enlarged to a form stable under the descent step.
[example: Why the Descent Starts from Fermat's Equation]
Suppose positive integers $x,y,z$ satisfy $x^4+y^4=z^4$. Since $z^4=(z^2)^2$, the same equation can be rewritten as
\begin{align*}
x^4+y^4
&=z^4\\
&=(z^2)^2.
\end{align*}
Now set $X=x$, $Y=y$, and $Z=z^2$. Because $x,y,z$ are positive integers, $X,Y,Z$ are positive integers, and the displayed equality becomes
\begin{align*}
X^4+Y^4
&=x^4+y^4\\
&=(z^2)^2\\
&=Z^2.
\end{align*}
Thus any solution of $x^4+y^4=z^4$ is automatically a solution of the stronger square-right-hand-side equation $X^4+Y^4=Z^2$.
By *Fermat's Descent for the Fourth Power*, no positive integers satisfy $X^4+Y^4=Z^2$. Therefore the assumed fourth-power equation cannot have had a positive integer solution. The descent begins here because the right-hand side only needs to be a square; it does not need to remain visibly a fourth power.
[/example]
## The Classical Case of the Cubic
What changes for exponent $3$? The equation $x^3+y^3=z^3$ factors over the Eisenstein integers, and the arithmetic of that ring supplies a substitute for the parametrisation used in the fourth-power descent.
[quotetheorem:4514]
[citeproof:4514]
This proof is already less elementary than the exponent $4$ argument because it depends on unique factorisation outside $\mathbb Z$. The following example only identifies the algebraic factorisation that motivates the method; the coprimality and descent details belong to the proof itself.
[example: The Eisenstein Factorisation]
Let $\omega$ satisfy $\omega^2+\omega+1=0$, so
\begin{align*}
1+\omega+\omega^2=0.
\end{align*}
We verify the factorisation by multiplying the last two factors first:
\begin{align*}
(x+\omega y)(x+\omega^2y)
&=x^2+(\omega+\omega^2)xy+\omega^3y^2\\
&=x^2-xy+y^2,
\end{align*}
because $\omega+\omega^2=-1$ and $\omega^3=1$. Hence
\begin{align*}
(x+y)(x+\omega y)(x+\omega^2y)
&=(x+y)(x^2-xy+y^2)\\
&=x(x^2-xy+y^2)+y(x^2-xy+y^2)\\
&=x^3-x^2y+xy^2+x^2y-xy^2+y^3\\
&=x^3+y^3.
\end{align*}
Thus an equation $x^3+y^3=z^3$ becomes
\begin{align*}
(x+y)(x+\omega y)(x+\omega^2y)=z^3
\end{align*}
inside $\mathbb Z[\omega]$. The theorem's proof then studies how these three factors can share divisors and how unique factorisation constrains a product that is a cube. This example stops at the identity because the remaining descent is the substantive theorem, not a separate classroom computation.
[/example]
## Sophie Germain's Theorem and Modular Restrictions
For an odd prime exponent $p$, the natural question is whether congruences alone can block primitive solutions. Sophie Germain's work gave the first systematic answer for infinitely many primes, separating possible solutions according to divisibility by $p$.
[definition: First Case of Fermat's Last Theorem]
Let $p$ be an odd prime. The first case of Fermat's Last Theorem for $p$ is the assertion that there are no positive integers $a,b,c \in \mathbb Z$ such that
\begin{align*}
a^p+b^p=c^p, \qquad p \nmid abc.
\end{align*}
[/definition]
The complementary situation, where $p$ divides at least one of $a,b,c$, is called the second case. Many early results attacked the first case because it is more directly controlled by congruences modulo auxiliary primes.
The difficulty is to find a modulus that sees more than the original equation modulo $p$. Sophie Germain's method introduces an auxiliary prime whose residue classes force any hypothetical first-case solution into an impossible divisibility pattern, turning a global equation into a sharper congruence obstruction. In this setting $(\mathbb Z/q\mathbb Z)^\times$ denotes the multiplicative group of nonzero residue classes modulo the prime $q$.
[quotetheorem:4515]
[citeproof:4515]
The theorem does not prove Fermat's Last Theorem for all exponents, but it changed the problem. It introduced auxiliary primes and residue conditions as a systematic method, foreshadowing the later use of congruences attached to elliptic curves and modular forms.
[example: Germain Prime Strategy for Small Exponents]
For $p=3$, take the auxiliary prime
\begin{align*}
q=7=2\cdot 1\cdot 3+1,
\end{align*}
so $q \equiv 1 \pmod{2p}$. The nonzero cubic residues modulo $7$ are found by cubing the six nonzero residue classes:
\begin{align*}
1^3&\equiv 1 \pmod 7,\\
2^3&=8\equiv 1 \pmod 7,\\
3^3&=27\equiv 6 \pmod 7,\\
4^3&=64\equiv 1 \pmod 7,\\
5^3&=125\equiv 6 \pmod 7,\\
6^3&=216\equiv 6 \pmod 7.
\end{align*}
Thus the nonzero cubic residues modulo $7$ are exactly
\begin{align*}
\{1,6\}.
\end{align*}
Their possible differences are
\begin{align*}
1-6&=-5\equiv 2 \pmod 7,\\
6-1&=5\pmod 7,
\end{align*}
so no two nonzero cubic residues differ by $1$ modulo $7$. Also,
\begin{align*}
3\not\equiv 1 \pmod 7
\qquad\text{and}\qquad
3\not\equiv 6 \pmod 7,
\end{align*}
so $p=3$ is not itself a cubic residue modulo $7$.
Both hypotheses in *Sophie Germain's Theorem* therefore hold for $p=3$ and $q=7$. Hence there is no first-case solution
\begin{align*}
a^3+b^3=c^3,
\qquad 3\nmid abc,
\end{align*}
and the obstruction comes entirely from the explicitly computed residue classes modulo $7$.
[/example]
## From Hypothetical Solutions to Frey Curves
The decisive modern question is different from the classical one: what global object would a hypothetical solution create? Frey's insight was that a solution to Fermat's equation should produce an elliptic curve with arithmetic properties too extreme to exist.
[definition: Frey Curve Attached to a Fermat Solution]
Let $p$ be an odd prime, and suppose $a,b,c \in \mathbb Z$ are pairwise coprime nonzero integers satisfying
\begin{align*}
a^p+b^p=c^p.
\end{align*}
The associated Frey curve is the elliptic curve over $\mathbb Q$ given by
\begin{align*}
E_{a,b,p}: y^2=x(x-a^p)(x+b^p).
\end{align*}
[/definition]
The curve is built so that its bad primes and modularity behavior remember the arithmetic of the original equation. The equation $a^p+b^p=c^p$ makes the three branch points $0$, $a^p$, and $-b^p$ interact in an unusually rigid way.
[example: Shape of the Frey Curve]
Starting from a hypothetical primitive solution $a^p+b^p=c^p$ with $p$ odd, consider
\begin{align*}
E_{a,b,p}: y^2=x(x-a^p)(x+b^p).
\end{align*}
The three roots of the cubic on the right are
\begin{align*}
x=0,\qquad x=a^p,\qquad x=-b^p.
\end{align*}
They are rational and distinct: $a,b$ are nonzero, and $a^p=-b^p$ would imply $a^p+b^p=0$, contrary to $c^p\ne 0$. Hence the points
\begin{align*}
(0,0),\qquad (a^p,0),\qquad (-b^p,0)
\end{align*}
lie on $E_{a,b,p}(\mathbb Q)$. On a Weierstrass curve, the inverse of $(x,y)$ is $(x,-y)$, so each of these points is equal to its own inverse because $y=0$. Therefore each has order $2$, and together with the point at infinity they give the full rational $2$-torsion.
Now write
\begin{align*}
x(x-a^p)(x+b^p)=(x-0)(x-a^p)(x-(-b^p)).
\end{align*}
For a curve $y^2=(x-r_1)(x-r_2)(x-r_3)$, the Weierstrass discriminant is
\begin{align*}
16(r_1-r_2)^2(r_1-r_3)^2(r_2-r_3)^2.
\end{align*}
With $r_1=0$, $r_2=a^p$, and $r_3=-b^p$, this gives
\begin{align*}
\Delta(E_{a,b,p})
&=16(0-a^p)^2(0-(-b^p))^2(a^p-(-b^p))^2\\
&=16(-a^p)^2(b^p)^2(a^p+b^p)^2\\
&=16a^{2p}b^{2p}(c^p)^2\\
&=16a^{2p}b^{2p}c^{2p}.
\end{align*}
Thus every odd prime appearing in the discriminant divides $abc$, and it appears there with exponent divisible by $2p$. This is the arithmetic shape Frey's construction needs: the curve has rational $2$-torsion, and its bad primes are forced by the prime divisors of $abc$. The modern proof uses this rigid pattern as input to structural theorems about elliptic curves.
[/example]
The discriminant computation explains why Frey's curve is not just an auxiliary object attached to a hypothetical solution. Its bad reduction is forced into a very special pattern by the equation $a^p+b^p=c^p$, so the curve carries arithmetic information that should not occur accidentally. The missing link is a theorem that turns this special pattern into a contradiction with modularity; in this overview, that link is recorded by name as Ribet's theorem.
[remark: Ribet's Theorem As The Bridge]
Ribet's theorem supplies the level-lowering bridge used in the modern proof of Fermat's Last Theorem: a primitive Fermat solution would force the associated Frey curve to have a modular profile incompatible with the relevant lower level. The statement is recorded here as a historical and logical bridge, not as a result proved from the course's elementary tools.
[/remark]
Ribet's theorem is the bridge from Fermat's equation to modularity. Its proof is not part of this course. Conceptually, it converts the existence of a Fermat solution into the existence of an elliptic curve with an impossible modular profile. The theorem is therefore not an isolated technical lemma in these notes; it is the named point where the arithmetic of the original equation is handed off to the classification theory of elliptic curves. Without this bridge, the Frey curve would remain a suggestive association rather than a decisive implication.
## Modularity and Wiles's Proof
The final question is whether the Frey curve can exist. Ribet says that a Fermat solution would produce a non-modular semistable elliptic curve, while Wiles proved that every semistable elliptic curve over $\mathbb Q$ is modular.
[remark: Wiles's Modularity Input]
Wiles's modularity theorem for semistable elliptic curves over $\mathbb Q$ supplies the decisive modern input: every semistable elliptic curve over $\mathbb Q$ is modular. Combined with Ribet's level-lowering implication for the Frey curve, this creates the contradiction needed for Fermat's Last Theorem.
[/remark]
This is the decisive modern input. Its proof belongs to arithmetic geometry, but for the logic of Fermat's Last Theorem the important point is its scope: semistability is exactly the condition needed for the Frey curve arising from a primitive Fermat solution. Combined with Ribet's theorem, Wiles's result says that the same curve would have to be modular and non-modular at once. The contradiction is not produced by estimating the size of solutions, but by placing the hypothetical solution inside an incompatible pair of structural theorems.
With Ribet's implication and Wiles's modularity theorem in place, the remaining task is to assemble the contradiction back into the original Diophantine statement. A hypothetical primitive solution would create a Frey curve satisfying the hypotheses of both structural results, so the final theorem records the consequence for the equation itself.
[remark: Fermat's Last Theorem From The Modern Inputs]
The final assembly is logical: a primitive Fermat solution would produce a Frey curve; Ribet's theorem forces that curve into a forbidden non-modular situation; Wiles's theorem says the semistable Frey curve must be modular. Therefore the hypothetical solution cannot exist.
[/remark]
The modern proof has the same logical structure as many earlier arguments in the course: assume a smallest or primitive solution, attach extra structure, and prove that the structure cannot exist. What changes is the scale of the attached object. Instead of a parametrised conic or a factorisation in a quadratic ring, the proof uses elliptic curves and modularity, which belong to a later course in arithmetic geometry.
[remark: Historical Meaning of the Proof]
Fermat's original margin note suggested that he had a descent proof, but no such proof can cover the general theorem using only the methods now attributed to him. Wiles's proof did not complete a missing elementary argument; it revealed that the equation lies inside a much larger web of arithmetic geometry. This is why Fermat's Last Theorem is both the endpoint of a classical Diophantine course and a gateway to modern number theory.
[/remark]
The proof of Fermat's Last Theorem shows that the course's earlier techniques are best understood as parts of a larger strategy, not as isolated tricks. The synthesis chapter now gathers those methods into a practical toolkit for deciding which Diophantine problems can be attacked by congruences, parametrization, descent, factorization, or local analysis.
# 10. Synthesis: Strategies for Diophantine Problems
This final chapter gathers the main methods from the course into a practical guide for solving Diophantine equations. Earlier chapters developed congruences, parametrization of conics, descent, factorization in arithmetic rings, Pell equations, and local methods separately; here the emphasis is on choosing between them. The goal is not to replace the detailed arguments from previous chapters, but to make explicit the diagnostic questions that guide a complete solution.
## Reading the Shape of a Diophantine Problem
When a problem first appears, the main question is not "which theorem applies?" but "what kind of solution set could this equation have?" Some equations ask for a finite classification, some ask for a parametrisation, some ask only for impossibility, and some have infinitely many solutions generated by a unit or recurrence.
For homogeneous equations, this first diagnostic must also separate essential solutions from scaled copies. Without that separation, a single solution can generate infinitely many multiples and obscure whether the equation has genuinely different arithmetic patterns. The primitive condition is the standard way to remove this scaling noise before applying congruences, parametrisations, or descent.
[definition: Primitive Integer Solution]
A solution $(x_1,\dots,x_n)\in\mathbb Z^n$ of $F(x_1,\dots,x_n)=0$ is primitive if
\begin{align*}
\gcd(x_1,\dots,x_n)=1.
\end{align*}
[/definition]
Primitive reduction is useful because many homogeneous equations preserve solutions under scaling. Once primitive solutions are classified, nonprimitive solutions are obtained by multiplying all coordinates by a common integer, subject to the homogeneity of the equation.
[remark: Choosing The First Test]
A useful first pass is to ask what kind of answer the problem is likely to have. Congruences are good at proving impossibility, especially when residues modulo a small number already contradict the equation. Homogeneous quadratic equations often invite a search for a rational point and then a parametrisation by lines through that point. Equations resembling $x^2-Dy^2=N$ suggest factoring or using recurrences coming from units. These are not theorems that solve the problem automatically; they are diagnostic choices that help decide which proof strategy is worth trying.
[/remark]
The limitations are part of the method. A conic parametrisation needs an initial rational point, and the line method has nothing to start from if the conic is already obstructed modulo some integer. Factorisation methods can also fail to classify solutions when the two factors have uncontrolled common divisors or when unique factorisation is unavailable. Passing congruence tests is only a necessary condition for an integer solution, so these tests should be read as increasingly structural filters rather than as guarantees of solubility.
[example: Obstructing A Quadratic Equation Modulo Eight]
Consider a primitive integer solution $(x,y,z)$ of
\begin{align*}
x^2+y^2=3z^2.
\end{align*}
Modulo $4$, every square is congruent to $0$ or $1$. If $x$ and $y$ were both odd, then
\begin{align*}
x^2+y^2\equiv 1+1\equiv 2\pmod 4,
\end{align*}
whereas
\begin{align*}
3z^2\equiv 0 \text{ or } 3\pmod 4.
\end{align*}
Thus $x$ and $y$ are not both odd.
Now work modulo $8$. The square residues modulo $8$ are
\begin{align*}
0^2&\equiv 0,&
1^2&\equiv 1,&
2^2&\equiv 4,&
3^2&\equiv 1,&
4^2&\equiv 0,&
5^2&\equiv 1,&
6^2&\equiv 4,&
7^2&\equiv 1\pmod 8,
\end{align*}
so every square is congruent to $0$, $1$, or $4$ modulo $8$, and every odd square is congruent to $1$ modulo $8$. If $z$ were odd, then
\begin{align*}
3z^2\equiv 3\pmod 8.
\end{align*}
Since $x$ and $y$ are not both odd, the possible residues for $x^2+y^2$ modulo $8$ are
\begin{align*}
0+0&\equiv 0,&
0+4&\equiv 4,&
4+4&\equiv 0,&
1+0&\equiv 1,&
1+4&\equiv 5\pmod 8,
\end{align*}
and none is congruent to $3$. Hence $z$ is even.
With $z$ even, the right-hand side is congruent to $0$ or $4$ modulo $8$. If exactly one of $x,y$ were odd, then
\begin{align*}
x^2+y^2\equiv 1+0\equiv 1
\quad\text{or}\quad
x^2+y^2\equiv 1+4\equiv 5\pmod 8,
\end{align*}
again impossible. Since $x$ and $y$ are not both odd, this leaves only the case that $x$ and $y$ are both even. Thus $x,y,z$ are all even, contradicting primitivity. Therefore no nonzero primitive integer solution exists, and the equation is ruled out by local congruence information before any parametrisation is needed.
[/example]
## Choosing the Arithmetic Setting
Which number system makes a Diophantine equation factor? Over $\mathbb Z$, only some expressions split usefully. Passing to $\mathbb Z[i]$, to a quadratic order, or to a $p$-adic completion can turn an opaque equation into a divisibility or norm problem.
[definition: Norm Equation In A Quadratic Ring]
Let $D\in\mathbb Z$ be nonsquare, let $K=\mathbb Q(\sqrt D)$, and let $R$ be an order in $K$. The field norm is the map
\begin{align*}
N_{K/\mathbb Q}:K&\to\mathbb Q,
&
\alpha&\mapsto \alpha\sigma(\alpha),
\end{align*}
where $\sigma:K\to K$ is the nontrivial $\mathbb Q$-automorphism. A norm equation in $R$ is an equation
\begin{align*}
N_{K/\mathbb Q}(\alpha)=N,
\end{align*}
with $\alpha\in R$ and $N\in\mathbb Z$.
[/definition]
For $\alpha=x+y\sqrt D$, the norm is $x^2-Dy^2$. Thus Pell equations and their variants are not isolated tricks; they are the basic norm equations in real quadratic fields.
[remark: Factorization Method]
When an equation can be rewritten as a product
\begin{align*}
UV=N
\end{align*}
inside a unique factorization domain, one can often reduce the search to possible factor pairs of $N$. This is a method, not a theorem that solves every equation: it works only when common factors, units, and compatibility conditions can be controlled.
[/remark]
The hypothesis about controlling common factors is essential. A factorisation that creates uncontrolled common divisors may hide rather than solve the original problem. For instance, from $ab=C$ in $\mathbb Z$ one can list divisors of $C$, but from $(x+y)(x-y)=C$ the parity condition that $x+y$ and $x-y$ have the same parity is an extra compatibility condition. In non-UFD rings, even the phrase "the divisors of $C$" may fail to give a usable finite classification without replacing principal elements by ideals or adding class group information.
[example: Gaussian Integers And Sums Of Two Squares]
For an odd prime $p$, asking whether
\begin{align*}
p=a^2+b^2
\end{align*}
is the same as asking whether $p$ factors in $\mathbb Z[i]$ as
\begin{align*}
p=(a+bi)(a-bi),
\end{align*}
because
\begin{align*}
(a+bi)(a-bi)
&=a^2-a bi+a bi-b^2 i^2\\
&=a^2-b^2(-1)\\
&=a^2+b^2.
\end{align*}
Suppose first that $p\equiv 3\pmod 4$ and that $p=a^2+b^2$. If $p\mid b$, then
\begin{align*}
a^2=p-b^2
\end{align*}
is divisible by $p$, so $p\mid a$; then $p^2\mid a^2+b^2=p$, impossible. Hence $b\not\equiv 0\pmod p$. Reducing $a^2+b^2=p$ modulo $p$ gives
\begin{align*}
a^2+b^2&\equiv 0\pmod p,\\
a^2&\equiv -b^2\pmod p.
\end{align*}
Multiplying by the inverse of $b^2$ modulo $p$ gives
\begin{align*}
(ab^{-1})^2\equiv -1\pmod p.
\end{align*}
But for $p\equiv 3\pmod 4$, $-1$ is not a quadratic residue modulo $p$, so no such representation can exist.
Now suppose $p\equiv 1\pmod 4$. Then $-1$ is a quadratic residue modulo $p$, so choose $t$ with
\begin{align*}
t^2\equiv -1\pmod p.
\end{align*}
In $\mathbb Z[i]$,
\begin{align*}
(t+i)(t-i)
&=t^2-i^2\\
&=t^2+1,
\end{align*}
so $p\mid (t+i)(t-i)$. However, $p\nmid t+i$: if $t+i=p(c+di)$, then comparing imaginary parts gives $1=pd$, impossible in integers. Similarly $p\nmid t-i$. Thus $p$ divides a product in $\mathbb Z[i]$ without dividing either factor, so $p$ is not prime in the Euclidean domain $\mathbb Z[i]$. Therefore $p$ is reducible:
\begin{align*}
p=\alpha\beta
\end{align*}
with neither $\alpha$ nor $\beta$ a unit. Taking Gaussian norms gives
\begin{align*}
p^2=N(p)=N(\alpha)N(\beta).
\end{align*}
Since nonunits in $\mathbb Z[i]$ have norm at least $2$, and neither factor is a unit, the only possibility is
\begin{align*}
N(\alpha)=N(\beta)=p.
\end{align*}
Writing $\alpha=a+bi$, we get
\begin{align*}
p=N(a+bi)=a^2+b^2.
\end{align*}
Thus primes $p\equiv 1\pmod 4$ split as sums of two squares, while primes $p\equiv 3\pmod 4$ do not.
[/example]
The Gaussian integer example shows the factorisation method in its cleanest form because $\mathbb Z[i]$ is Euclidean. Pell equations are different: the decisive arithmetic is not a finite list of divisors, but the infinite unit group acting on elements of fixed norm.
[example: Pell Equations As Unit Problems]
The norm of $x+y\sqrt 7$ in $\mathbb Z[\sqrt 7]$ is
\begin{align*}
N(x+y\sqrt 7)
&=(x+y\sqrt 7)(x-y\sqrt 7)\\
&=x^2-xy\sqrt 7+xy\sqrt 7-7y^2\\
&=x^2-7y^2.
\end{align*}
Thus solving $x^2-7y^2=1$ is the same as finding units of norm $1$ in this quadratic ring.
The continued fraction for $\sqrt 7$ begins
\begin{align*}
\sqrt 7
&=[2;\overline{1,1,1,4}],
\end{align*}
and the convergent before the period closes is
\begin{align*}
[2;1,1,1]
&=2+\frac{1}{1+\frac{1}{1+\frac{1}{1}}}\\
&=2+\frac{1}{1+\frac{1}{2}}\\
&=2+\frac{1}{\frac{3}{2}}\\
&=2+\frac{2}{3}\\
&=\frac{8}{3}.
\end{align*}
This gives
\begin{align*}
8^2-7\cdot 3^2
&=64-7\cdot 9\\
&=64-63\\
&=1,
\end{align*}
so $8+3\sqrt 7$ is a positive unit of norm $1$. Since the period length of the continued fraction is $4$, the standard continued-fraction theorem for Pell equations gives $8+3\sqrt 7$ as the fundamental positive unit.
Therefore every positive solution is obtained from
\begin{align*}
x_k+y_k\sqrt 7=(8+3\sqrt 7)^k,\qquad k\ge 1.
\end{align*}
The construction preserves the equation because, if $x^2-7y^2=1$, then multiplying by $8+3\sqrt 7$ gives
\begin{align*}
(x+y\sqrt 7)(8+3\sqrt 7)
&=8x+3x\sqrt 7+8y\sqrt 7+21y\\
&=(8x+21y)+(3x+8y)\sqrt 7,
\end{align*}
and
\begin{align*}
(8x+21y)^2-7(3x+8y)^2
&=(64x^2+336xy+441y^2)-7(9x^2+48xy+64y^2)\\
&=64x^2+336xy+441y^2-63x^2-336xy-448y^2\\
&=x^2-7y^2\\
&=1.
\end{align*}
So the Pell equation has infinitely many positive solutions, organized by powers of the fundamental unit.
[/example]
The Pell example gives global integer solutions, but many Diophantine problems fail earlier: they already have no compatible solution in one prime-adic integer system. Since every integer solution determines a solution in $\mathbb Z_p$ for each prime $p$, checking a single prime can reveal an obstruction invisible over the real numbers or over small samples of integers.
[definition: Local Solubility At A Prime]
Let $F\in\mathbb Z[x_1,\dots,x_n]$ and let $p$ be prime. The equation $F=0$ is soluble over $\mathbb{Z}_p$ if there exists $(a_1,\dots,a_n)\in\mathbb{Z}_p^n$ such that
\begin{align*}
F(a_1,\dots,a_n)=0.
\end{align*}
[/definition]
A solution in integers gives a solution in every $\mathbb{Z}_p$, so failure over one $\mathbb{Z}_p$ proves global impossibility.
## Parametrisation and Descent as Opposite Moves
What should be done after congruence tests fail to obstruct? There are two opposite possibilities. Parametrisation expands one known solution into all solutions; descent assumes a solution exists and compresses it into a smaller one until contradiction.
[definition: Rational Parametrisation From A Point]
Let $C\subset\mathbb A^2$ be a plane conic over $\mathbb Q$ with a rational point $P\in C(\mathbb Q)$. A rational parametrisation from $P$ is a map
\begin{align*}
\varphi:U\subset\mathbb Q&\to C(\mathbb Q),
&
t&\mapsto \varphi(t),
\end{align*}
where $U$ is the set of rational slopes for which the line through $P$ of slope $t$ is not exceptional, and $\varphi(t)$ is the second intersection point of that line with $C$.
[/definition]
The definition is geometric, but the output is algebraic: substituting the line equation into the conic gives a quadratic with one known root, so the other root is a rational function of $t$.
[illustration:conic-line-parametrization]
For equations such as $x^2+2y^2=z^2$, the conic viewpoint turns the search for integer triples into a search for rational points and then into a divisibility problem. A rational parametrisation gives many rational solutions, but primitive integer solutions require controlling common factors and parity after clearing denominators. The coefficient $2$ changes the bookkeeping compared with Pythagorean triples, so the final formula must record exactly which parameters avoid a common divisor. The classification below is the arithmetic form of that geometric parametrisation.
[quotetheorem:4521]
[citeproof:4521]
This result is the direct analogue of Euclid's formula for Pythagorean triples. The coefficient $2$ changes the arithmetic of the parity condition: requiring $m$ odd is what prevents the displayed triple from having a common factor $2$. The theorem is a classification of primitive triples only; nonprimitive triples are obtained afterwards by multiplying a primitive triple by a common integer.
[example: Extracting Primitive Triples]
Using the parametrising formulas
\begin{align*}
x&=m^2-2n^2,&
y&=2mn,&
z&=m^2+2n^2,
\end{align*}
the choice $(m,n)=(3,1)$ gives
\begin{align*}
x&=3^2-2\cdot 1^2=9-2=7,\\
y&=2\cdot 3\cdot 1=6,\\
z&=3^2+2\cdot 1^2=9+2=11.
\end{align*}
Substituting these values into $x^2+2y^2=z^2$ gives
\begin{align*}
7^2+2\cdot 6^2
&=49+2\cdot 36\\
&=49+72\\
&=121\\
&=11^2.
\end{align*}
The triple is primitive because
\begin{align*}
\gcd(7,6,11)=1.
\end{align*}
If instead $(m,n)=(2,1)$, then
\begin{align*}
x&=2^2-2\cdot 1^2=4-2=2,\\
y&=2\cdot 2\cdot 1=4,\\
z&=2^2+2\cdot 1^2=4+2=6.
\end{align*}
This still satisfies the equation:
\begin{align*}
2^2+2\cdot 4^2
&=4+2\cdot 16\\
&=4+32\\
&=36\\
&=6^2.
\end{align*}
However,
\begin{align*}
\gcd(2,4,6)=2,
\end{align*}
so $(2,4,6)$ is not primitive. Dividing all three coordinates by $2$ gives
\begin{align*}
(1,2,3),
\end{align*}
and the reduced triple satisfies
\begin{align*}
1^2+2\cdot 2^2
&=1+8\\
&=9\\
&=3^2.
\end{align*}
This shows why the parity condition on $m$ is part of the primitive parametrisation: taking $m$ even can produce a valid triple with a common factor.
[/example]
Parametrisation is constructive: once a rational point is known, the slope parameter produces solutions. Descent moves in the opposite direction, using the existence of one solution to force a smaller one and thereby prove that no first solution could have existed.
The obstruction descent exploits is well-ordering: a positive integer measure cannot decrease forever. If every supposed solution creates a strictly smaller one of the same kind, then a minimal counterexample cannot exist in the first place.
[definition: Infinite Descent]
An infinite descent argument proves impossibility by showing that any positive integer solution satisfying a chosen minimality condition would produce another positive integer solution with a strictly smaller value of the chosen measure.
[/definition]
The measure might be $z$, $|xy|$, $x+y$, or a norm. The hard part is not the final contradiction; it is choosing a measure that the algebra actually decreases.
[remark: Infinite Descent Template]
An infinite descent proof has three parts: choose a well-founded positive measure $M$, show that any solution $s$ produces another solution $s'$ with $M(s')<M(s)$, and conclude that no starting solution can exist.
[/remark]
This template underlies Fermat's descent for special fourth-power equations. The mathematical content in each application is the construction of $s'$. The strict inequality $M(s')<M(s)$ is indispensable: a construction that merely returns another solution of the same size, such as swapping $x$ and $y$ in a symmetric equation, gives no contradiction. Positivity or another well-founded target is also needed; a decreasing sequence inside $\mathbb Z$ need not stop if negative values are allowed.
[example: Descent Behind Fermat's Right Triangle Argument]
The detailed descent for $x^4+y^4=z^2$ belongs in the cited theorem proof, but its shape is worth isolating. A hypothetical primitive solution first turns $(x^2,y^2,z)$ into a primitive Pythagorean triple. Euclid's parametrization then forces the relevant coprime factors to be squares or twice squares, and repeating the parametrization produces another solution of the same fourth-power-square equation with a smaller square root on the right. The example therefore illustrates how a familiar parametrization becomes an impossibility proof once it is paired with a decreasing measure.
[/example]
## Local-to-Global Testing
How much can be learned by solving the equation modulo many primes? Local testing is a disciplined way to search for impossibility: every global integer solution must be compatible with every modular and $p$-adic shadow of the equation.
[definition: Local Obstruction]
A local obstruction to an integer equation $F=0$ is a prime $p$ and an exponent $k\ge 1$ such that the congruence
\begin{align*}
F(x_1,\dots,x_n)\equiv 0\pmod{p^k}
\end{align*}
has no solution satisfying the required side conditions.
[/definition]
The side conditions matter. For primitive homogeneous equations, the local search must exclude the all-zero residue class and enforce that not every coordinate is divisible by $p$.
[remark: Global Solutions Give Local Congruence Solutions]
Every integer solution of an equation reduces to a solution modulo $m$ for each $m\ge 1$. Therefore, if the required congruence has no solution modulo some $m$, the original integer equation has no solution satisfying the same side conditions.
[/remark]
The converse is delicate. Passing every modulus is strong evidence that no finite congruence obstruction has been found, but it need not classify integer solutions and, outside special settings, need not guarantee an integer point. Thus local testing is best understood as an obstruction-finding method unless the course has proved a local-to-global theorem for the specific class under discussion.
[example: A Pell-Type Local Obstruction]
Consider the equation
\begin{align*}
x^2-3y^2=2
\end{align*}
with $x,y\in\mathbb Z$. Reducing both sides modulo $3$, the term $3y^2$ vanishes because
\begin{align*}
3y^2\equiv 0\pmod 3,
\end{align*}
so any integer solution would satisfy
\begin{align*}
x^2-3y^2&\equiv 2\pmod 3,\\
x^2-0&\equiv 2\pmod 3,\\
x^2&\equiv 2\pmod 3.
\end{align*}
But every integer $x$ is congruent to $0$, $1$, or $2$ modulo $3$, and the corresponding square residues are
\begin{align*}
0^2&\equiv 0\pmod 3,\\
1^2&\equiv 1\pmod 3,\\
2^2&=4\equiv 1\pmod 3.
\end{align*}
Thus no square is congruent to $2$ modulo $3$, contradicting the necessary congruence $x^2\equiv 2\pmod 3$. Therefore $x^2-3y^2=2$ has no integer solution.
Although the left-hand side is the norm
\begin{align*}
N(x+y\sqrt 3)=x^2-3y^2
\end{align*}
in $\mathbb Z[\sqrt 3]$, the modulo $3$ obstruction already proves impossibility, so no unit-generation argument is needed.
[/example]
## Writing a Complete Solution
What makes a Diophantine solution write-up complete? It must match the promise of the problem. A proof of existence, a classification, and an impossibility proof have different obligations.
[explanation: Three Write-Up Patterns]
For existence, exhibit a solution and verify it satisfies the equation and side conditions.
For classification, prove both directions. First show that every object produced by the formula is a solution. Then start with an arbitrary solution and prove it appears in the formula, including gcd, sign, ordering, and parity conventions.
For impossibility, identify the obstruction and prove that every hypothetical solution meets the hypotheses of the obstruction. If the proof uses descent, state the minimality condition before the descent step. If the proof uses a local obstruction, specify the modulus and list the relevant residue classes.
[/explanation]
These patterns also determine how much detail belongs in the proof. A parametrisation is especially demanding because it makes a two-sided promise: it must produce solutions and prove that no solutions have escaped the chosen parameters.
[remark: Checking A Parametrisation]
A complete parametrisation should state exactly which parameters are allowed, verify that every permitted choice gives a solution, and prove conversely that every solution arises from some permitted choice. It should also record boundary conventions: primitive versus nonprimitive, positive versus signed, ordered versus unordered, and whether different parameter values can describe the same solution.
[/remark]
This checklist is especially important for quadratic forms. The algebra may be short, but most mistakes occur at the boundary. The duplicate-parameter condition is not cosmetic; for example, the slopes $t$ and $-t$ in a conic parametrisation often correspond to changing the sign of one coordinate, while multiplying all parameters by a common factor may leave the same rational point after clearing denominators. Stating these identifications prevents a formula from being mistaken for a one-to-one parametrisation.
[example: Solving Or Obstructing X Squared Minus D Y Squared Equals N]
For equations of the form
\begin{align*}
x^2-Dy^2=N,
\end{align*}
the first tests should usually be congruence tests modulo $4$, modulo $8$, and modulo primes dividing $D$ or $N$. These are necessary tests: any integer solution remains a solution after reducing both sides modulo any chosen modulus.
For the concrete equation
\begin{align*}
x^2-2y^2=-1,
\end{align*}
there is no obstruction to the small solution $(x,y)=(1,1)$, since
\begin{align*}
1^2-2\cdot 1^2
&=1-2\\
&=-1.
\end{align*}
The associated Pell equation
\begin{align*}
u^2-2v^2=1
\end{align*}
has the positive unit $3+2\sqrt2$, because
\begin{align*}
3^2-2\cdot 2^2
&=9-8\\
&=1.
\end{align*}
Multiplying $1+\sqrt2$ by this unit preserves norm $-1$. For one step,
\begin{align*}
(1+\sqrt2)(3+2\sqrt2)
&=3+2\sqrt2+3\sqrt2+2(\sqrt2)^2\\
&=3+5\sqrt2+4\\
&=7+5\sqrt2,
\end{align*}
and the new coefficients again satisfy
\begin{align*}
7^2-2\cdot 5^2
&=49-2\cdot 25\\
&=49-50\\
&=-1.
\end{align*}
In general, define integers $x_k,y_k$ by
\begin{align*}
x_k+y_k\sqrt2=(1+\sqrt2)(3+2\sqrt2)^k,\qquad k\ge 0.
\end{align*}
If $x_k^2-2y_k^2=-1$, then
\begin{align*}
(x_k+y_k\sqrt2)(3+2\sqrt2)
&=3x_k+2x_k\sqrt2+3y_k\sqrt2+2y_k(\sqrt2)^2\\
&=(3x_k+4y_k)+(2x_k+3y_k)\sqrt2,
\end{align*}
so
\begin{align*}
x_{k+1}&=3x_k+4y_k,\\
y_{k+1}&=2x_k+3y_k.
\end{align*}
Then
\begin{align*}
x_{k+1}^2-2y_{k+1}^2
&=(3x_k+4y_k)^2-2(2x_k+3y_k)^2\\
&=(9x_k^2+24x_ky_k+16y_k^2)-2(4x_k^2+12x_ky_k+9y_k^2)\\
&=9x_k^2+24x_ky_k+16y_k^2-8x_k^2-24x_ky_k-18y_k^2\\
&=x_k^2-2y_k^2\\
&=-1.
\end{align*}
Thus the powers
\begin{align*}
(1+\sqrt2)(3+2\sqrt2)^k
\end{align*}
produce infinitely many positive integer solutions of $x^2-2y^2=-1$, illustrating how local tests, one seed solution, and a norm-one unit combine to organize a Pell-type equation.
[/example]
This final example combines the whole chapter's workflow: local tests first, then a search for one global solution, then the unit action to organise the infinite family. The same sequence of moves is useful in many problems, but each step has to be justified in the arithmetic setting chosen for that equation.
[remark: Strategy Is Not A Substitute For Arithmetic]
The methods in this chapter are reliable because they are tied to invariants: gcd, parity, residues, norms, units, valuations, and minimality. A solution write-up should name the invariant being used at each stage. This keeps the argument checkable and prevents a search computation from being mistaken for a proof.
[/remark]
The course began with the concrete geometry of Pythagorean triples and ends with a broader lesson. Diophantine equations are not solved by a single universal trick; they are solved by choosing the arithmetic setting in which the equation exposes its structure.
Contents
- 1. Integer Points, Rational Points, and Primitive Solutions
- Polynomial Equations Over the Integers and Rationals
- Primitive Solutions and Scaling
- Parity in Pythagorean Triples
- Congruence Obstructions
- 2. Pythagorean Triples
- Primitive Right Triangles
- Euclid's Formula
- Rational Points On The Unit Circle
- 3. Conics and Parametrization
- Lines Through a Known Rational Point
- Homogenization And Projective Conics Over $\mathbb{Q}$
- Elementary Local Obstructions
- The Working Dichotomy For Conics
- 4. Sums of Two Squares
- The Sum-Of-Two-Squares Criterion
- Why introduce the Gaussian integers?
- How do sums of two squares multiply?
- Which integers are sums of two squares?
- How the Gaussian proof fits the earlier methods
- 5. Descent Methods
- Why Descent Proves Impossibility
- Coprime Factors and Square Divisibility
- Fermat's Descent for Square Area Right Triangles
- The Equation $x^4+y^4=z^2$
- What Descent Contributes To The Course
- 6. Pell's Equation
- The Equation $x^2-Dy^2=1$
- Continued Fractions of Square Roots
- Extracting Solutions From the Period
- Units in Quadratic Orders
- Period Parity and the Negative Equation
- The Complete Pell Algorithm
- 7. Quadratic Forms and Local Conditions
- Quadratic Forms Over Different Ground Rings
- Solubility Modulo Prime Powers
- Local Obstructions Without Heavy Machinery
- Legendre's Theorem For Ternary Quadratic Forms
- How Local Conditions Guide Computation
- 8. Beyond Quadratics: Cubics and Higher Degree Equations
- Why the Slope Method Stops Being Enough
- Mordell-Type Equations As First Elliptic Curves
- Rational Points Versus Integer Points
- What Elementary Methods Still Contribute
- 9. Fermat's Last Theorem: History and Proof Overview
- Fermat's Statement and Prime Exponent Reduction
- The Classical Case of the Fourth Power
- The Classical Case of the Cubic
- Sophie Germain's Theorem and Modular Restrictions
- From Hypothetical Solutions to Frey Curves
- Modularity and Wiles's Proof
- 10. Synthesis: Strategies for Diophantine Problems
- Reading the Shape of a Diophantine Problem
- Choosing the Arithmetic Setting
- Parametrisation and Descent as Opposite Moves
- Local-to-Global Testing
- Writing a Complete Solution
Diophantine Equations
Content
Problems
History
Created by admin on 5/31/2026 | Last updated on 6/1/2026
Prerequisites (0/6 completed)
Log in to track your prerequisite progress.
Prerequisites Graph
Interactive dependency map showing prerequisite concepts
Loading dependency graph...
Theorem
Definition
Current
Requires
Rate this page
★
★
★
★
★
Poor
Excellent