This course develops the analytic side of number theory through the study of multiplicative functions, Dirichlet series, and the analytic behavior of zeta and $L$-functions. It begins with arithmetic functions and Dirichlet convolution as the algebraic language for encoding number-theoretic data, then turns to Dirichlet series as generating functions that translate sums over integers into complex-analytic objects. From there, the course focuses on the Riemann zeta function on and near the half-plane $\operatorname{Re}(s)>1$, Perron's formula, and summatory-function techniques that connect analytic estimates back to counting problems.
A central theme is how analytic information about generating functions controls the distribution of primes. After proving the prime number theorem by means of $\zeta(s)$, the course introduces Dirichlet characters and orthogonality to study arithmetic progressions, leading to Dirichlet $L$-functions and their Euler products. The later chapters establish nonvanishing of $L(s,\chi)$ at $s=1$ for nonprincipal characters, prove Dirichlet's theorem on primes in arithmetic progressions, and extend the framework to completed $L$-functions and functional equations. Each stage builds on the previous one: algebraic structure leads to analytic continuation, analytic continuation leads to asymptotics, and asymptotics reveal deep prime-distribution phenomena.
# Introduction
This course asks how analytic objects remember arithmetic information. We begin with functions on the positive integers, package them into Dirichlet series, and study the resulting complex functions near their singularities. The guiding example is the passage from the zeta function to the prime number theorem, followed by the extension from all primes to primes in arithmetic progressions through Dirichlet characters and $L$-functions.
The course assumes elementary number theory, complex analysis, real analysis, and finite abelian groups. The first part builds the algebra of arithmetic functions and Dirichlet series. The second part uses complex-analytic continuation, residues, zero-free regions, and Tauberian reasoning to extract asymptotic information about primes.
## The Central Problem
The first question is how to turn statements about divisibility and primes into analytic statements about functions of a complex variable. Counting primes directly is difficult because primality is not additive or smooth. Analytic number theory replaces the primes by weighted sums and generating functions whose singularities encode growth.
[definition: Prime Counting Functions]
Let $x \ge 1$. The prime counting functions used in this course are
\begin{align*}
\pi(x) &:= |\{p \le x : p \text{ prime}\}|,\\
\vartheta(x) &:= \sum_{p \le x} \log p,\\
\psi(x) &:= \sum_{n \le x} \Lambda(n),
\end{align*}
where $\Lambda$ is the von Mangoldt function.
[/definition]
The function $\pi(x)$ is the most direct object, but $\vartheta(x)$ and $\psi(x)$ are better adapted to products over primes and logarithmic differentiation. A major theme of the course is that these three functions contain essentially the same first-order information.
To make $\psi(x)$ usable, we need the weight that counts prime powers with exactly the logarithmic mass contributed by their underlying prime. This is the arithmetic function that will later appear when Euler products are logarithmically differentiated.
[definition: Von Mangoldt Function]
The von Mangoldt function is the arithmetic function $\Lambda: \mathbb N \to \mathbb R$ defined by
\begin{align*}
\Lambda(n) :=
\begin{cases}
\log p, & n = p^k \text{ for some prime } p \text{ and } k \in \mathbb N,\\
0, & \text{otherwise}.
\end{cases}
\end{align*}
[/definition]
The weight $\Lambda(n)$ treats prime powers as the natural logarithmic atoms of the integers. It appears when one differentiates Euler products, so it forms the bridge between prime factorisation and analytic functions.
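The three counting functions can be tabulated directly from their definitions. The following is a minimal sketch, not part of the course's development: the helper names `primes_up_to` and `prime_counts` are illustrative, and the sample values of $x$ are arbitrary. It prints the ratios $\pi(x)\log x/x$, $\vartheta(x)/x$, and $\psi(x)/x$, which can be compared with the claim that the three functions carry the same first-order information.

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes: the list of primes p <= n."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def prime_counts(x):
    """Return (pi(x), theta(x), psi(x)) for an integer x >= 1."""
    ps = primes_up_to(x)
    pi_x = len(ps)
    theta_x = sum(log(p) for p in ps)
    # psi adds log p once for every prime power p^k <= x (k >= 1).
    psi_x = 0.0
    for p in ps:
        q = p
        while q <= x:
            psi_x += log(p)
            q *= p
    return pi_x, theta_x, psi_x

for x in (10**3, 10**4, 10**5):
    pi_x, theta_x, psi_x = prime_counts(x)
    print(x, pi_x * log(x) / x, theta_x / x, psi_x / x)
```

The function $\psi$ differs from $\vartheta$ only by the contribution of proper prime powers $p^k$ with $k\ge2$, which is why the inner loop over powers is the only change between the two sums.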
The prime number theorem will be the endpoint of the zeta-function part of the course, but the objects needed for its proof appear earlier. The function $\Lambda$ is the version of prime counting that enters the logarithmic derivative of an Euler product, so the path begins by treating arithmetic functions as algebraic data before introducing their Dirichlet series.
## Arithmetic Functions as Data
The next question is what kind of arithmetic data should be studied before introducing complex variables. An arithmetic function is a sequence indexed by the positive integers, but multiplication of integers suggests a convolution product rather than pointwise multiplication.
[definition: Arithmetic Function]
An arithmetic function is a function $f: \mathbb N \to \mathbb C$.
[/definition]
We use the standard notation $\tau(n)$ for the number of positive divisors of $n$, $\sigma_k(n):=\sum_{d\mid n}d^k$ for the $k$th divisor-sum function, $\varphi(n):=|\{1\le a\le n:\gcd(a,n)=1\}|$ for Euler's totient function, and $\mu(n)$ for the Mobius function.
Arithmetic functions form a vector space under pointwise addition and scalar multiplication. Pointwise multiplication misses the way an integer is assembled from its divisors: multiplying $f(n)$ and $g(n)$ only compares the two functions at the same integer $n$, while divisor identities need all decompositions $n=ab$. The useful multiplication for this course is Dirichlet convolution, because divisors are the natural finite pieces of the multiplicative structure of $\mathbb N$.
[definition: Dirichlet Convolution]
For arithmetic functions $f,g: \mathbb N \to \mathbb C$, their Dirichlet convolution is the arithmetic function $f*g: \mathbb N \to \mathbb C$ defined by
\begin{align*}
(f*g)(n) := \sum_{d \mid n} f(d)g(n/d).
\end{align*}
[/definition]
The definition is deliberately compressed, so the first useful check is whether it recovers the familiar divisor-counting function.
[example: Divisor Function As A Convolution]
Let $1:\mathbb N\to\mathbb C$ be the constant arithmetic function $1(n)=1$. For any $n\in\mathbb N$, the definition of Dirichlet convolution gives
\begin{align*}
(1*1)(n)
&=\sum_{d\mid n}1(d)\,1(n/d)\\
&=\sum_{d\mid n}1\cdot 1\\
&=\sum_{d\mid n}1.
\end{align*}
The last sum has one summand for each positive divisor $d$ of $n$, and each summand equals $1$, so
\begin{align*}
(1*1)(n)=|\{d\in\mathbb N:d\mid n\}|=\tau(n).
\end{align*}
Since this holds for every $n\in\mathbb N$, we have the identity of arithmetic functions $1*1=\tau$. Thus the divisor function arises by summing over all factorizations $n=d(n/d)$, which is exactly the information encoded by Dirichlet convolution.
[/example]
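The identity $1*1=\tau$ can also be checked mechanically for small $n$. The sketch below implements Dirichlet convolution straight from the definition; the names `divisors` and `dirichlet_convolution` are illustrative choices, and trial division is used only for simplicity.

```python
def divisors(n):
    """All positive divisors of n, found by trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]

def dirichlet_convolution(f, g):
    """Return the arithmetic function n -> sum over d | n of f(d) * g(n // d)."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

one = lambda n: 1                     # the constant function 1
tau = lambda n: len(divisors(n))      # divisor-counting function
one_conv_one = dirichlet_convolution(one, one)

# Values of (1*1)(n) for n = 1, ..., 12, followed by a comparison with tau.
print([one_conv_one(n) for n in range(1, 13)])
print(all(one_conv_one(n) == tau(n) for n in range(1, 200)))
```

A finite check like this is not a proof, but it is a quick way to catch indexing mistakes when experimenting with other convolution identities.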
The first chapter develops this algebra systematically. Mobius inversion is the statement that divisor-sum identities can be inverted inside this convolution algebra. The basic problem is that divisor sums often hide the original function: knowing all values of $g(n)=\sum_{d\mid n}f(d)$ gives cumulative information over the divisor lattice, not the individual values of $f$. To use divisor sums effectively, we need an exact rule that removes the contribution of the proper divisors and recovers $f$ pointwise.
[quotetheorem:4334]
[citeproof:4334]
This result sets the pattern for the early lectures: define an arithmetic operation, identify its unit and inverses, then interpret classical divisor identities as identities in a ring. The condition that $g$ is a divisor sum of $f$ is necessary for this exact inverse formula; Mobius inversion is not an arbitrary transform but the inverse of convolution by the constant function $1$. The limitation is that it inverts divisor aggregation, not general summation procedures. This distinction matters later because Dirichlet series turn precisely this convolution inverse into ordinary reciprocal functions.
## Dirichlet Series and Euler Products
Once arithmetic functions have an algebra, the next question is how to encode them analytically. Ordinary power series are poorly matched to divisibility: the product of two ordinary generating functions groups terms by additive decompositions $n=a+b$, while divisibility problems group terms by multiplicative decompositions $n=ab$. Dirichlet series turn convolution into multiplication and prime-power data into Euler factors.
[definition: Dirichlet Series]
Given an arithmetic function $f: \mathbb N \to \mathbb C$, its Dirichlet series is the formal expression
\begin{align*}
\sum_{n=1}^\infty \frac{f(n)}{n^s}.
\end{align*}
When the series converges on a set $\Omega_f\subseteq\mathbb C$, it defines a function $F:\Omega_f\to\mathbb C$ by
\begin{align*}
F(s) := \sum_{n=1}^\infty \frac{f(n)}{n^s}.
\end{align*}
[/definition]
When the series converges absolutely, Dirichlet convolution becomes ordinary multiplication of functions. This is the analytic reason that convolution is the correct algebraic structure.
The next issue is whether the formal coefficient identity for convolution is legitimate for infinite sums. Multiplying two Dirichlet series creates a double series, so we need a convergence condition strong enough to justify regrouping its terms by the product $ab=n$.
[quotetheorem:4335]
[citeproof:4335]
The absolute convergence hypothesis is the point that allows the arithmetic regrouping by $ab=n$ to become a legitimate analytic operation. Without it, the same formal identity may still be true in special cases, but rearranging an infinite double series can change its value or fail to converge. The result is therefore both algebraic and analytic: convolution gives the coefficient rule, while absolute convergence supplies permission to multiply the series as functions.
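A numerical sanity check makes the multiplication rule concrete. Assuming the quoted theorem has the standard form, in which the product of two absolutely convergent Dirichlet series is the Dirichlet series of the convolution, the identity $1*1=\tau$ predicts $\zeta(s)^2=\sum_n \tau(n)n^{-s}$ for $\operatorname{Re}(s)>1$. The sketch below compares a partial sum at $s=2$ with $\zeta(2)^2=(\pi^2/6)^2$; the truncation point `N` is an arbitrary choice.

```python
from math import pi

N = 20000  # truncation point (an arbitrary choice for this sketch)

# tau[n] = number of divisors of n, filled in by a divisor sieve.
tau = [0] * (N + 1)
for d in range(1, N + 1):
    for m in range(d, N + 1, d):
        tau[m] += 1

s = 2
lhs = sum(tau[n] / n ** s for n in range(1, N + 1))  # partial sum of sum tau(n) / n^s
rhs = (pi ** 2 / 6) ** 2                             # zeta(2)^2, using zeta(2) = pi^2 / 6

print(lhs, rhs)
```

The partial sum lies slightly below the limit because every omitted term is positive.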
[definition: Multiplicative Function]
An arithmetic function $f: \mathbb N \to \mathbb C$ is multiplicative if $f(1)=1$ and
\begin{align*}
f(mn)=f(m)f(n)
\end{align*}
whenever $\gcd(m,n)=1$.
[/definition]
Multiplicative functions are determined by their values on prime powers. Their Dirichlet series factor into local pieces, one for each prime. The point is not merely formal: unique factorization writes each integer as a product of independent prime-power parts, while multiplicativity turns the coefficient attached to that integer into the product of the corresponding local coefficients. The analytic question is when this prime-by-prime decomposition may be multiplied into an actual infinite product without changing the value of the series.
[quotetheorem:4336]
[citeproof:4336]
The condition of multiplicativity is exactly what makes the local factors independent: values at coprime prime-power parts multiply to recover the value at the whole integer. Absolute convergence is again needed to pass from finite products to an infinite product without losing control of the omitted primes. This theorem explains why later arguments can study primes one local factor at a time and then recombine the information globally.
[example: Zeta As The First Euler Product]
Let $1:\mathbb N\to\mathbb C$ be the constant arithmetic function $1(n)=1$. Its Dirichlet series is
\begin{align*}
\sum_{n=1}^\infty \frac{1(n)}{n^s}
&=\sum_{n=1}^\infty \frac{1}{n^s}
=\zeta(s).
\end{align*}
For a prime $p$ and an integer $k\ge 0$, we have $1(p^k)=1$, so the local prime-power series is
\begin{align*}
\sum_{k=0}^\infty \frac{1(p^k)}{p^{ks}}
&=\sum_{k=0}^\infty \frac{1}{p^{ks}}
=\sum_{k=0}^\infty \left(p^{-s}\right)^k.
\end{align*}
If $\operatorname{Re}(s)>1$, then $|p^{-s}|=p^{-\operatorname{Re}(s)}<1$, and the geometric-series formula gives
\begin{align*}
\sum_{k=0}^\infty \left(p^{-s}\right)^k
=\frac{1}{1-p^{-s}}.
\end{align*}
For a finite set of primes $P$, multiplying the local factors gives
\begin{align*}
\prod_{p\in P}\left(\sum_{k=0}^\infty p^{-ks}\right)
&=\sum_{(k_p)_{p\in P}}\prod_{p\in P}p^{-k_ps} \\
&=\sum_{(k_p)_{p\in P}}\left(\prod_{p\in P}p^{k_p}\right)^{-s}.
\end{align*}
By unique factorisation, the integers of the form $\prod_{p\in P}p^{k_p}$ are exactly the positive integers whose prime factors all lie in $P$, each occurring once. Hence
\begin{align*}
\prod_{p\in P}\left(1-p^{-s}\right)^{-1}
=\sum_{\substack{n\ge 1\\ p\mid n\Rightarrow p\in P}}\frac{1}{n^s}.
\end{align*}
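Now let $P$ increase through all primes. If $P$ contains every prime up to $N$, then each integer omitted from the right-hand side exceeds $N$, so the omitted terms are bounded in absolute value by $\sum_{n>N}n^{-\operatorname{Re}(s)}$. This is the tail of the series $\sum_{n=1}^\infty n^{-s}$, which converges absolutely for $\operatorname{Re}(s)>1$, so it tends to $0$ as $N\to\infty$. Hence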
\begin{align*}
\zeta(s)=\sum_{n=1}^\infty \frac{1}{n^s}
=\prod_p\left(1-p^{-s}\right)^{-1}.
\end{align*}
Thus the global zeta series factors into independent prime-power contributions; later, logarithmic differentiation of these factors will convert the product into weighted sums over primes and prime powers.
[/example]
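The convergence of the finite products to $\zeta(s)$ can be observed numerically. The following sketch multiplies the Euler factors at $s=2$ over the primes up to $1000$ and compares the result with $\zeta(2)=\pi^2/6$. The cutoff and the helper name `primes_up_to` are illustrative choices.

```python
from math import pi

def primes_up_to(n):
    """Sieve of Eratosthenes: the list of primes p <= n."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

s = 2
product = 1.0
for p in primes_up_to(1000):
    product *= 1.0 / (1.0 - p ** (-s))   # local factor (1 - p^{-s})^{-1}

print(product, pi ** 2 / 6)
```

Each local factor exceeds $1$ when $s$ is real and greater than $1$, so the finite products increase toward $\zeta(2)$ from below.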
## Complex Analysis Enters Through Singularities
The following question drives the middle of the course: once an arithmetic function is encoded by a Dirichlet series, which analytic features determine the asymptotic behaviour of its partial sums? Poles, residues, and zeros become arithmetic data.
[definition: Summatory Function]
For an arithmetic function $f: \mathbb N \to \mathbb C$, its summatory function is the map $S_f:[1,\infty)\to\mathbb C$ defined by
\begin{align*}
S_f(x):=\sum_{n\le x} f(n)
\end{align*}
for $x\ge 1$.
[/definition]
A pole at $s=1$ often produces a main term of size $x$. The residue of that pole determines the constant in the asymptotic formula, while zeros near the line of convergence control the error term.
[example: Pole Of Zeta And The Size Of The Integers]
For the constant arithmetic function $1(n)=1$, its summatory function is obtained by counting the integers that occur in the sum. If $x\ge 1$, then
\begin{align*}
\sum_{n\le x}1
&=\sum_{\substack{n\in\mathbb N\\ n\le x}}1 \\
&=|\{n\in\mathbb N:n\le x\}| \\
&=\lfloor x\rfloor.
\end{align*}
Since
\begin{align*}
\lfloor x\rfloor \le x < \lfloor x\rfloor+1,
\end{align*}
subtracting $\lfloor x\rfloor$ gives
\begin{align*}
0\le x-\lfloor x\rfloor<1,
\end{align*}
and hence
\begin{align*}
\left|\frac{\lfloor x\rfloor}{x}-1\right|
=\frac{x-\lfloor x\rfloor}{x}
<\frac{1}{x}.
\end{align*}
Therefore $\lfloor x\rfloor/x\to 1$ as $x\to\infty$, so
\begin{align*}
\sum_{n\le x}1=\lfloor x\rfloor \sim x.
\end{align*}
The Dirichlet series attached to $1(n)=1$ is $\zeta(s)=\sum_{n=1}^{\infty}n^{-s}$, and its simple pole at $s=1$ has residue $1$, matching the coefficient of the main term $x$. This is the basic model for the later principle that a simple pole at $s=1$ with residue $A$ contributes a main term $Ax$ to the corresponding summatory function.
[/example]
The examples above suggest that poles of Dirichlet series should control summatory functions, but a mechanism is needed to pass from a series in the complex variable $s$ back to a sharp cutoff $n\le x$. The obstruction is that the cutoff is discontinuous, so it cannot be recovered by ordinary termwise evaluation of the Dirichlet series. Perron's formula supplies the bridge: it represents the cutoff sum as a vertical complex integral whose contour can later be shifted to pick up residues.
[quotetheorem:4337]
[citeproof:4337]
Perron's formula is the mechanism behind many later contour shifts. The nonintegral condition on $x$ avoids the jump discontinuity in the indicator function produced by the kernel; when $x$ is an integer, the formula must be stated with a boundary convention or a small smoothing. The vertical line $\operatorname{Re}(s)=b$ is chosen inside the half-plane of absolute convergence, that is, to the right of the abscissa of absolute convergence, so that the Dirichlet series may be inserted and integrated term by term. Later arguments move this line leftward, and the residues crossed during that movement become the main terms of arithmetic asymptotics.
[illustration:perron-contour-shift]
The course will use Perron's formula with $\zeta(s)$, $-\zeta'(s)/\zeta(s)$, and Dirichlet $L$-functions.
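The cutoff behaviour of the Perron kernel can be seen numerically. The sketch below relies on the standard fact that, for $b>0$, the integral $\frac{1}{2\pi i}\int_{b-i\infty}^{b+i\infty} y^s\,\frac{ds}{s}$ equals $1$ when $y>1$ and $0$ when $0<y<1$. It approximates the integral over a truncated segment with the trapezoid rule. The function name `perron_kernel` and the parameters `T` (truncation height) and `h` (step size) are assumptions of this illustration, not notation from the course.

```python
from math import pi, log
import cmath

def perron_kernel(y, b=1.0, T=1000.0, h=0.01):
    """Trapezoid approximation of (1/(2*pi*i)) * integral of y^s / s ds
    over the segment s = b + it with -T <= t <= T."""
    def integrand(t):
        s = complex(b, t)
        return (cmath.exp(s * log(y)) / s).real
    # The values at t and -t are complex conjugates, so the integral over
    # [-T, T] is twice the integral of the real part over [0, T].
    n = round(T / h)
    total = 0.5 * (integrand(0.0) + integrand(n * h))
    for k in range(1, n):
        total += integrand(k * h)
    return total * h / pi

print(perron_kernel(2.0))   # y > 1: close to 1
print(perron_kernel(0.5))   # y < 1: close to 0
```

Applied with $y=x/n$, the kernel keeps the terms with $n<x$ and discards those with $n>x$. The truncation error grows as $y$ approaches $1$, which reflects the jump discontinuity at integer $x$.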
## Characters And Arithmetic Progressions
After the prime number theorem, the next question is whether primes are evenly distributed among the reduced residue classes modulo $q$. The algebraic tool for isolating one residue class is the character group of $(\mathbb Z/q\mathbb Z)^\times$.
[definition: Dirichlet Character]
A Dirichlet character modulo $q$ is a function $\chi: \mathbb Z \to \mathbb C$ such that there exists a group homomorphism
\begin{align*}
\chi_0:(\mathbb Z/q\mathbb Z)^\times \to \mathbb C^\times
\end{align*}
with
\begin{align*}
\chi(n)=
\begin{cases}
\chi_0(\bar n), & \gcd(n,q)=1,\\
0, & \gcd(n,q)>1.
\end{cases}
\end{align*}
[/definition]
Characters diagonalise congruence conditions: they turn the count of primes in a single residue class into a finite linear combination of twisted prime sums. We write $\varphi(q)=|(\mathbb Z/q\mathbb Z)^\times|$ for Euler's totient function, which is also the number of Dirichlet characters modulo $q$.
To carry out that diagonalisation, we need an exact formula that detects one reduced residue class using all characters modulo $q$. The required orthogonality relation is the finite Fourier inversion principle for the group of units modulo $q$.
[quotetheorem:4338]
[citeproof:4338]
The condition $\gcd(n,q)=1$ is necessary because characters modulo $q$ are extended by zero away from the unit group. Orthogonality therefore detects residue classes only inside $(\mathbb Z/q\mathbb Z)^\times$; primes dividing $q$ must be treated separately and contribute only finitely many exceptions.
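A small case shows the detection mechanism at work. The sketch below assumes the orthogonality relation in its usual form, $\frac{1}{\varphi(q)}\sum_{\chi}\overline{\chi(a)}\chi(n)=1$ if $n\equiv a \pmod q$ and $0$ otherwise, for $\gcd(a,q)=1$. It takes $q=5$ and builds the four characters from the generator $2$ of $(\mathbb Z/5\mathbb Z)^\times$. The choice of modulus and the helper names `character` and `detector` are illustrative.

```python
import cmath

q = 5
g = 2                                            # 2 generates the units mod 5
phi_q = q - 1
units = [pow(g, k, q) for k in range(phi_q)]     # [1, 2, 4, 3]
index = {u: k for k, u in enumerate(units)}      # discrete logarithm to base g

def character(m):
    """The m-th character mod q: chi(g^k) = exp(2*pi*i*m*k/phi(q)), zero off the units."""
    def chi(n):
        r = n % q
        if r == 0:
            return 0j
        return cmath.exp(2j * cmath.pi * m * index[r] / phi_q)
    return chi

characters = [character(m) for m in range(phi_q)]

def detector(a, n):
    """(1/phi(q)) * sum over chi of conj(chi(a)) * chi(n)."""
    return sum(chi(a).conjugate() * chi(n) for chi in characters) / phi_q

# Detect the class a = 2 among n = 1, ..., 15.
print([round(detector(2, n).real) for n in range(1, 16)])
```

For a prime modulus the unit group is cyclic, so one generator suffices. A general modulus would need a product of cyclic factors, which this sketch does not attempt.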
The algebraic decomposition now needs an analytic object that can carry each character as a coefficient system. For zeta, the coefficients are all $1$; for arithmetic progressions, the natural replacement is a Dirichlet series whose coefficients are the values of a fixed character. This is the twisted analogue of the zeta function.
[definition: Dirichlet L-Function]
For a Dirichlet character $\chi$ modulo $q$, the Dirichlet $L$-function attached to $\chi$ is the function $L(\cdot,\chi):\{s\in\mathbb C:\operatorname{Re}(s)>1\}\to\mathbb C$ defined by
\begin{align*}
L(s,\chi):=\sum_{n=1}^\infty \frac{\chi(n)}{n^s}.
\end{align*}
[/definition]
The final part of the course studies these $L$-functions as the zeta-function method with a congruence twist. The $L$-function of the principal character has a pole at $s=1$, while the $L$-functions of nonprincipal characters must be shown not to vanish at $s=1$.
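The simplest instance of nonvanishing can be checked by hand or by machine. For the nonprincipal character modulo $4$, the series at $s=1$ is the Leibniz series $1-\tfrac13+\tfrac15-\cdots=\pi/4$. It converges only conditionally, so it lies outside the half-plane used in the definition above; its convergence is a standard fact taken for granted in this sketch. The names `chi4` and `L_partial` are illustrative.

```python
from math import pi

def chi4(n):
    """The nonprincipal Dirichlet character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def L_partial(s, N):
    """Partial sum over n <= N of chi4(n) / n^s."""
    return sum(chi4(n) / n ** s for n in range(1, N + 1))

print(L_partial(1, 10**5), pi / 4)   # partial sum at s = 1 against pi / 4
print(L_partial(2, 10**5))           # a value inside the half-plane Re(s) > 1
```

The nonzero value $\pi/4$ at $s=1$ is the special case $q=4$ of the nonvanishing statement that the course proves for every nonprincipal character.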
The motivating payoff is the prime number theorem in each reduced residue class. After characters isolate a class and $L$-functions encode the resulting sums, the remaining question is whether primes are asymptotically equidistributed among the allowable congruence classes.
[quotetheorem:1625]
[citeproof:1625]
This theorem shows how the course extends from the integers as a whole to arithmetic progressions. The coprimality condition $\gcd(a,q)=1$ is necessary: if $d=\gcd(a,q)>1$, then every term of the progression $a+kq$ is divisible by $d$, so the progression contains at most one prime. The factor $1/\varphi(q)$ expresses the expected equidistribution among the reduced residue classes, while the analytic difficulty lies in ruling out boundary zeros for all nonprincipal twists. The new ideas are not a replacement for the zeta method; they are a representation-theoretic refinement of it.
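The equidistribution statement can be explored empirically by sorting primes into reduced residue classes. This is a minimal sketch with illustrative helper names; the bound $10^5$ and the modulus $10$ in the final line are arbitrary choices.

```python
from math import gcd

def primes_up_to(n):
    """Sieve of Eratosthenes: the list of primes p <= n."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def primes_by_residue(N, q):
    """Count the primes p <= N in each reduced residue class modulo q."""
    counts = {a: 0 for a in range(q) if gcd(a, q) == 1}
    for p in primes_up_to(N):
        if p % q in counts:          # primes dividing q fall in no reduced class
            counts[p % q] += 1
    return counts

print(primes_by_residue(100, 4))     # {1: 11, 3: 13}
print(primes_by_residue(10**5, 10))
```

The counts in the different classes are close but not equal; the theorem concerns their ratio in the limit, not exact agreement at any finite height.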
## Structure Of The Course
The course now has a visible path. We first learn the algebra of arithmetic functions, then translate that algebra into Dirichlet series, then use complex analysis to read asymptotics from singularities.
[explanation: Lecture Progression]
The first lectures cover arithmetic functions, Dirichlet convolution, Mobius inversion, multiplicativity, and formal Euler products. These tools explain why functions such as $\mu$, $\varphi$, $\tau$, $\sigma_k$, and $\Lambda$ appear together.
The next lectures study Dirichlet series as analytic functions. Absolute convergence, abscissae of convergence, Euler products, logarithmic derivatives, and Perron-type formulae provide the bridge from algebra to analysis.
The central analytic block concerns $\zeta(s)$. The course proves its meromorphic continuation, analyses its pole at $s=1$, establishes enough zero-free information on $\operatorname{Re}(s)=1$, and derives the prime number theorem.
The final lectures introduce Dirichlet characters and $L(s,\chi)$. Orthogonality separates residue classes, Euler products connect characters to primes, and non-vanishing at $s=1$ leads to primes in arithmetic progressions.
[/explanation]
The emphasis throughout is foundational, and the same tools reappear in more advanced subjects. Sieve methods use arithmetic functions and convolution to estimate primes without relying solely on complex zeros. Additive prime problems combine weighted prime sums with Fourier analysis. Automorphic forms and modern $L$-functions generalise the same Euler product and analytic continuation pattern far beyond Dirichlet characters. Deep zero-density estimates refine the link between zeros and error terms that this course first exposes.
The opening discussion of primes, zeros, and error terms now shifts to the algebraic framework that makes those ideas workable. Before analytic estimates can bite, we need a language for arithmetic functions and the divisor-based operations that organize them. That is the role of Dirichlet convolution, which turns summation over divisors into a structural tool.
# 1. Arithmetic Functions and Dirichlet Convolution
This opening chapter sets up the algebraic language used throughout analytic number theory. Instead of studying the primes directly at first, we study functions on the positive integers and the operations that encode summation over divisors. The main structural point is that divisor sums are products in a convolution algebra, and many classical identities become statements about inverses in that algebra.
## Arithmetic Functions as Divisor Data
How can information about the divisors of an integer be packaged so that it can later interact with infinite series and Euler products? The first step is to regard any numerical rule on positive integers as an arithmetic function, then to single out the basic functions that measure divisors, prime powers, and logarithmic prime content.
[definition: Arithmetic Function]
An arithmetic function is a function $f: \mathbb N \to \mathbb C$.
[/definition]
The codomain $\mathbb C$ is chosen because Dirichlet series and $L$-functions will later be complex-valued. At this stage, the important feature is the domain: arithmetic functions are indexed by positive integers, and many useful operations are defined by summing over divisors of an integer.
Before studying identities, we need a small vocabulary of functions that will recur in every chapter. Some count divisors, some sum powers of divisors, some measure coprimality, and some isolate prime-power structure. Naming them now lets later convolution and Dirichlet-series formulas be stated without repeatedly rebuilding the same examples.
[definition: Basic Arithmetic Functions]
1. The constant-one function $\mathbf{1}: \mathbb N \to \mathbb C$ is defined by $\mathbf{1}(n)=1$ for all $n \in \mathbb N$.
2. The convolution identity $\varepsilon: \mathbb N \to \mathbb C$ is defined by $\varepsilon(1)=1$ and $\varepsilon(n)=0$ for $n>1$.
3. The identity function $\operatorname{id}: \mathbb N \to \mathbb C$ is defined by $\operatorname{id}(n)=n$.
4. For $k \in \mathbb C$, the power function $\operatorname{id}_k: \mathbb N \to \mathbb C$ is defined by $\operatorname{id}_k(n)=n^k$.
5. The divisor-counting function $\tau: \mathbb N \to \mathbb C$ is defined by $\tau(n)=\sum_{d \mid n} 1$.
6. The divisor-sum function $\sigma_k: \mathbb N \to \mathbb C$ is defined by $\sigma_k(n)=\sum_{d \mid n} d^k$.
7. Euler's totient function $\varphi: \mathbb N \to \mathbb C$ is defined by $\varphi(n)=|\{a \in \{1,\dots,n\}: \gcd(a,n)=1\}|$.
8. The von Mangoldt function $\Lambda: \mathbb N \to \mathbb C$ is defined by $\Lambda(n)=\log p$ if $n=p^a$ for a prime $p$ and an integer $a \ge 1$, and $\Lambda(n)=0$ otherwise.
9. The Mobius function $\mu: \mathbb N \to \mathbb C$ is defined by $\mu(1)=1$, by $\mu(n)=(-1)^r$ if $n=p_1\cdots p_r$ is a product of distinct primes, and by $\mu(n)=0$ if $p^2 \mid n$ for some prime $p$.
[/definition]
The function $\varepsilon$ is not arithmetically interesting by itself, but it is the identity element for Dirichlet convolution. The functions $\mathbf{1}$, $\operatorname{id}_k$, $\mu$, and $\Lambda$ are the building blocks from which the standard divisor identities are formed.
[example: Prime Power Values of Standard Functions]
Let $p$ be prime and let $a\ge 1$. The divisors of $p^a$ are exactly
\begin{align*}
1,p,p^2,\dots,p^a,
\end{align*}
because every divisor of a prime power has the form $p^j$ with $0\le j\le a$. Hence
\begin{align*}
\tau(p^a)
&=\sum_{d\mid p^a}1\\
&=\sum_{j=0}^{a}1\\
&=a+1,
\end{align*}
and, for any $k\in\mathbb C$,
\begin{align*}
\sigma_k(p^a)
&=\sum_{d\mid p^a}d^k\\
&=\sum_{j=0}^{a}(p^j)^k\\
&=1+p^k+p^{2k}+\cdots+p^{ak}.
\end{align*}
For the totient, an integer $m$ with $1\le m\le p^a$ fails to be coprime to $p^a$ exactly when $p\mid m$. The multiples of $p$ in this interval are
\begin{align*}
p,2p,3p,\dots,p^{a-1}p,
\end{align*}
so there are $p^{a-1}$ of them. Therefore
\begin{align*}
\varphi(p^a)
&=|\{1\le m\le p^a:\gcd(m,p^a)=1\}|\\
&=p^a-p^{a-1}.
\end{align*}
For the Mobius function, $p^a$ is a single prime (the case $r=1$ in the definition) when $a=1$, while $p^2\mid p^a$ when $a\ge2$. Thus
\begin{align*}
\mu(p^a)
&=
\begin{cases}
(-1)^1,&a=1,\\
0,&a\ge2
\end{cases}\\
&=
\begin{cases}
-1,&a=1,\\
0,&a\ge2.
\end{cases}
\end{align*}
Finally, $p^a$ is a prime power with base prime $p$, so the definition of the von Mangoldt function gives
\begin{align*}
\Lambda(p^a)=\log p.
\end{align*}
These formulas isolate the local data attached to one prime, which is why prime-power values become the basic inputs for multiplicative functions.
[/example]
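The prime-power formulas above can be confirmed against direct implementations of the definitions. The following sketch uses naive trial division, which is adequate for small arguments; the function names are illustrative and `sigma(k, n)` is restricted to integer exponents $k$ for exact arithmetic.

```python
from math import gcd, log, isclose

def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]

def factorize(n):
    """Prime factorization of n as a dict {prime: exponent}, by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def tau(n):
    return len(divisors(n))

def sigma(k, n):
    return sum(d ** k for d in divisors(n))

def phi(n):
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def mobius(n):
    f = factorize(n)
    return 0 if any(e >= 2 for e in f.values()) else (-1) ** len(f)

def von_mangoldt(n):
    f = factorize(n)
    return log(next(iter(f))) if len(f) == 1 else 0.0

# Compare with the closed forms at prime powers p^a.
for p in (2, 3, 5):
    for a in range(1, 5):
        n = p ** a
        print(n, tau(n) == a + 1, phi(n) == p ** a - p ** (a - 1),
              mobius(n) == (-1 if a == 1 else 0),
              isclose(von_mangoldt(n), log(p)))
```

The empty factorization of $1$ gives $\mu(1)=(-1)^0=1$ and $\Lambda(1)=0$ without special cases, which matches the definitions.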
## Dirichlet Convolution and Inversion
What operation turns a divisor sum into an algebraic product? Ordinary pointwise multiplication does not remember divisors, so analytic number theory uses Dirichlet convolution, where the value at $n$ is obtained by multiplying values over complementary divisors $d$ and $n/d$.
[definition: Dirichlet Convolution]
For arithmetic functions $f,g: \mathbb N \to \mathbb C$, their Dirichlet convolution $f*g: \mathbb N \to \mathbb C$ is defined by
\begin{align*}
(f*g)(n)=\sum_{d \mid n} f(d)g(n/d).
\end{align*}
[/definition]
This definition makes divisor sums into products. For instance, summing $f$ over all divisors of $n$ is the same as convolving $f$ with $\mathbf{1}$.
For convolution to serve as an algebraic language, it must behave like a genuine product rather than just a notation for one divisor sum. We therefore need the structural facts that allow parentheses to be moved, factors to be reordered, and an identity element to be used.
[quotetheorem:4339]
[citeproof:4339]
This theorem matters because it identifies the exact algebraic setting in which divisor sums can be manipulated without repeatedly expanding definitions. The finiteness of the divisor sum is essential here; no convergence hypothesis is needed at this stage. The identity element $\varepsilon$ is what makes inversion meaningful, and the next theorem describes precisely when such inverses exist.
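The algebraic laws can be spot-checked on random integer-valued functions. The sketch below assumes the quoted theorem includes commutativity, associativity, and the identity property of $\varepsilon$. It stores an arithmetic function as a list of its values at $1,\dots,N$, which loses nothing because $(f*g)(n)$ depends only on values at divisors of $n$. The bound `N`, the random seed, and the name `convolve` are arbitrary choices.

```python
import random

N = 60

def convolve(f, g):
    """Dirichlet convolution of functions stored as lists f[1..N] (index 0 unused)."""
    h = [0] * (N + 1)
    for d in range(1, N + 1):
        for m in range(1, N // d + 1):
            h[d * m] += f[d] * g[m]      # the pair (d, m) contributes to n = d * m
    return h

random.seed(0)
f, g, h = ([0] + [random.randint(-5, 5) for _ in range(N)] for _ in range(3))
epsilon = [0, 1] + [0] * (N - 1)         # epsilon(1) = 1, epsilon(n) = 0 for n > 1

print(convolve(f, g) == convolve(g, f))                            # commutativity
print(convolve(convolve(f, g), h) == convolve(f, convolve(g, h)))  # associativity
print(convolve(f, epsilon) == f)                                   # identity element
```

Integer values keep every comparison exact, so no floating-point tolerance is needed.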
The algebra viewpoint lets us replace many divisor manipulations by short symbolic identities. It is especially powerful because the algebra has many invertible elements. For inversion to be useful, however, we need a criterion that can be checked from the arithmetic function itself, not by guessing a second function and multiplying. Since the value at $1$ is the first coefficient in every convolution product, it is the natural place where invertibility can either begin or fail.
[quotetheorem:4340]
[citeproof:4340]
The condition $f(1)\ne0$ cannot be omitted: if $f(1)=0$, then $(f*g)(1)=0$ for every arithmetic function $g$, so $f*g$ can never equal $\varepsilon$. The recursive construction also explains why Dirichlet inversion is arithmetic rather than analytic: each value of the inverse depends only on smaller divisor data.
The most important inverse at the beginning of the course is the inverse of $\mathbf{1}$, because convolution by $\mathbf{1}$ is exactly the operation of summing over divisors. To undo divisor sums, we therefore need to identify the arithmetic function that cancels $\mathbf{1}$ under convolution. The Mobius function is designed to perform this cancellation by inclusion-exclusion over prime divisors.
[quotetheorem:4341]
[citeproof:4341]
The theorem says that $\mu$ performs inclusion-exclusion over the prime divisors of $n$. The squarefree condition in the definition of $\mu$ is essential: repeated prime factors should not create new inclusion-exclusion choices.
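The recursive description of Dirichlet inverses can be run directly. The sketch below assumes the usual recursion obtained from $(f*g)(1)=1$ and $(f*g)(n)=0$ for $n>1$, namely $g(1)=1/f(1)$ and $g(n)=-\frac{1}{f(1)}\sum_{d\mid n,\,d<n} f(n/d)g(d)$. Applied to $\mathbf 1$, it should reproduce the values of $\mu$. The function name `dirichlet_inverse` is illustrative, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def dirichlet_inverse(f, N):
    """Values g[1..N] of the Dirichlet inverse of f (index 0 unused).

    Requires f(1) != 0. Each g[n] is computed from g at proper divisors of n.
    """
    if f(1) == 0:
        raise ValueError("f(1) = 0: no Dirichlet inverse exists")
    g = [None] * (N + 1)
    g[1] = Fraction(1, 1) / f(1)
    for n in range(2, N + 1):
        acc = sum(f(n // d) * g[d] for d in range(1, n) if n % d == 0)
        g[n] = -acc / f(1)
    return g

inverse_of_one = dirichlet_inverse(lambda n: 1, 30)
print([int(v) for v in inverse_of_one[1:]])
```

The printed list begins $1,-1,-1,0,-1,1,\dots$, the first values of $\mu$, and the vanishing entries sit exactly at the integers divisible by a square of a prime.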
After identifying $\mu$ as the convolution inverse of $\mathbf{1}$, the next task is to translate that inverse into a usable recovery rule. Many arithmetic identities present a function only through its divisor sums, so the practical question is how to solve $g(n)=\sum_{d\mid n}f(d)$ for the hidden function $f(n)$. Mobius inversion is the pointwise formula that performs this recovery.
[quotetheorem:1749]
[citeproof:1749]
The equivalence is useful because many natural functions are first encountered through cumulative divisor sums rather than by a direct formula. The hypothesis is an identity for every $n$, so the inversion is exact and pointwise, not asymptotic. The example below shows how a familiar arithmetic function appears by undoing a divisor sum.
[example: Inverting a Divisor Sum]
Suppose $g(n)=\sum_{d\mid n}f(d)$ for every $n$, so $g=f*\mathbf{1}$. By the *Mobius Inversion Formula*, convolving with $\mu$ recovers $f$:
\begin{align*}
f
&=g*\mu,\\
f(n)
&=(g*\mu)(n)\\
&=\sum_{d\mid n}\mu(d)g(n/d).
\end{align*}
If $g(n)=n$, then $g=\operatorname{id}$, and therefore
\begin{align*}
f(n)
&=\sum_{d\mid n}\mu(d)\operatorname{id}(n/d)\\
&=\sum_{d\mid n}\mu(d)\frac{n}{d}.
\end{align*}
Equivalently,
\begin{align*}
f
&=\mu*\operatorname{id}\\
&=\operatorname{id}*\mu,
\end{align*}
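where the last equality uses commutativity of Dirichlet convolution. Euler's totient function satisfies the classical divisor-sum identity $\sum_{d\mid n}\varphi(d)=n$ for every $n$, so it solves $f*\mathbf{1}=\operatorname{id}$, and inversion shows that this solution is unique. Thus the recovered function is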
\begin{align*}
\varphi(n)=\sum_{d\mid n}\mu(d)\frac{n}{d},
\end{align*}
so $\varphi=\operatorname{id}*\mu$. This shows that Euler's totient function is obtained by applying Mobius inclusion-exclusion to the unrestricted count $g(n)=n$.
[/example]
## Classical Identities in the Convolution Algebra
Which familiar number-theoretic formulas become transparent once convolution is available? The point of the notation is that divisor functions, totients, and logarithmic prime weights can be computed as products of a few basic arithmetic functions.
[quotetheorem:4342]
[citeproof:4342]
These identities show why $\tau$ and $\sigma_k$ are not isolated formulas but products in the convolution algebra. The role of $\mathbf{1}$ is necessary: it is exactly the function that performs an unweighted divisor sum. The identity for $\sigma_k$ is the prototype for later Dirichlet-series factorizations, where multiplication of Dirichlet series turns convolution into ordinary multiplication of generating functions.
Euler's totient function has a different flavor because it counts only residue classes coprime to $n$, not all divisors or all integers up to $n$. The obstruction is the coprimality condition: the unrestricted count is easy, but the classes sharing a prime factor with $n$ must be removed. Mobius inclusion-exclusion is exactly suited to impose this coprimality constraint.
[quotetheorem:4343]
[citeproof:4343]
This theorem identifies $\varphi$ as an inclusion-exclusion correction to the identity function. The coprimality condition in the definition of $\varphi$ is exactly what Mobius inclusion-exclusion imposes on the unrestricted count of integers $1\le a\le n$. The following computations show the local prime-power formula and a small composite example from the same convolution identity.
[example: Computing Phi from the Convolution Formula]
Using the convolution formula $\varphi=\operatorname{id}*\mu$, equivalently
\begin{align*}
\varphi(n)=\sum_{d\mid n}\mu(d)\frac{n}{d},
\end{align*}
we compute first at a prime power $n=p^a$. The divisors of $p^a$ are $p^j$ for $0\le j\le a$, so
\begin{align*}
\varphi(p^a)
&=\sum_{j=0}^{a}\mu(p^j)\frac{p^a}{p^j}\\
&=\mu(1)p^a+\mu(p)p^{a-1}+\sum_{j=2}^{a}\mu(p^j)p^{a-j}.
\end{align*}
Since $\mu(1)=1$, $\mu(p)=-1$, and $\mu(p^j)=0$ for every $j\ge2$ because $p^2\mid p^j$, this becomes
\begin{align*}
\varphi(p^a)
&=1\cdot p^a+(-1)p^{a-1}+\sum_{j=2}^{a}0\cdot p^{a-j}\\
&=p^a-p^{a-1}.
\end{align*}
For $n=12$, the divisors are $1,2,3,4,6,12$, hence
\begin{align*}
\varphi(12)
&=\sum_{d\mid 12}\mu(d)\frac{12}{d}\\
&=\mu(1)\frac{12}{1}+\mu(2)\frac{12}{2}+\mu(3)\frac{12}{3}
+\mu(4)\frac{12}{4}+\mu(6)\frac{12}{6}+\mu(12)\frac{12}{12}\\
&=1\cdot 12+(-1)\cdot 6+(-1)\cdot 4+0\cdot 3+1\cdot 2+0\cdot 1\\
&=12-6-4+0+2+0\\
&=4.
\end{align*}
Only squarefree divisors contribute nonzero Mobius values, so the convolution formula is exactly inclusion-exclusion over the prime divisors of $n$.
[/example]
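The convolution formula can also be checked numerically. The following is a minimal Python sketch, not part of the course text: the helper names `mobius`, `phi_by_convolution`, and `phi_by_counting` are illustrative assumptions, and trial division is used only because it is short, not because it is efficient.

```python
from math import gcd


def mobius(n: int) -> int:
    """Mobius function computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                # p^2 divides the original n, so mu vanishes.
                return 0
            result = -result
        p += 1
    if n > 1:
        # One remaining prime factor, appearing exactly once.
        result = -result
    return result


def phi_by_convolution(n: int) -> int:
    """Evaluate (id * mu)(n) = sum over d | n of mu(d) * n / d."""
    return sum(mobius(d) * (n // d) for d in range(1, n + 1) if n % d == 0)


def phi_by_counting(n: int) -> int:
    """Count the integers 1 <= a <= n with gcd(a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


print(phi_by_convolution(12), phi_by_counting(12))
```

Comparing the two functions over a range of $n$ tests the identity $\varphi=\operatorname{id}*\mu$ directly against the definition of $\varphi$ as a count of coprime residues.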
The von Mangoldt function isolates prime powers with logarithmic weight. Its basic identity says that the total prime-power logarithmic content of $n$ is exactly $\log n$.
To connect this weight with convolution, we need to express ordinary logarithms as a divisor sum of prime-power contributions. That identity is the algebraic form of unique factorization that later becomes the logarithmic derivative of an Euler product.
[quotetheorem:4344]
[citeproof:4344]
This theorem turns prime factorization into a divisor identity: each occurrence of a prime in $n$ contributes one copy of $\log p$. The restriction to prime powers in the definition of $\Lambda$ is necessary, because logarithms add over prime-power exponents rather than over arbitrary composite divisors. By Mobius inversion, the preceding theorem is also a formula for $\Lambda$ itself:
\begin{align*}
\Lambda=\mu*\log.
\end{align*}
This inverted form is frequently useful when replacing prime-power weights by more flexible divisor sums.
[example: Recovering Lambda by Mobius Inversion]
Using $\Lambda=\mu*\log$, where $\log(n)$ denotes the arithmetic function $n\mapsto \log n$, we first compute at $n=p^a$ with $p$ prime and $a\ge1$. The divisors of $p^a$ are $p^j$ for $0\le j\le a$, so
\begin{align*}
\Lambda(p^a)
&=(\mu*\log)(p^a)\\
&=\sum_{j=0}^{a}\mu(p^j)\log(p^{a-j})\\
&=\mu(1)\log(p^a)+\mu(p)\log(p^{a-1})
+\sum_{j=2}^{a}\mu(p^j)\log(p^{a-j})\\
&=1\cdot \log(p^a)+(-1)\cdot \log(p^{a-1})
+\sum_{j=2}^{a}0\cdot \log(p^{a-j})\\
&=\log(p^a)-\log(p^{a-1})\\
&=a\log p-(a-1)\log p\\
&=\log p.
\end{align*}
For $n=6$, the divisors are $1,2,3,6$, and the squarefree values are $\mu(1)=1$, $\mu(2)=-1$, $\mu(3)=-1$, and $\mu(6)=1$. Hence
\begin{align*}
\Lambda(6)
&=(\mu*\log)(6)\\
&=\sum_{d\mid 6}\mu(d)\log(6/d)\\
&=\mu(1)\log 6+\mu(2)\log 3+\mu(3)\log 2+\mu(6)\log 1\\
&=1\cdot\log 6+(-1)\cdot\log 3+(-1)\cdot\log 2+1\cdot\log 1\\
&=\log 6-\log 3-\log 2+\log 1\\
&=(\log 2+\log 3)-\log 3-\log 2+0\\
&=0.
\end{align*}
The Mobius-weighted divisor sum keeps the single-prime-power contribution $\log p$ and cancels the logarithmic contributions when at least two distinct primes are present.
[/example]
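As a sanity check on the inverted identity $\Lambda=\mu*\log$, here is a small Python sketch under the same caveats as any numerical experiment: the function names are hypothetical, and floating-point logarithms are compared with a tolerance rather than exactly.

```python
from math import isclose, log


def mobius(n: int) -> int:
    """Mobius function computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def von_mangoldt(n: int) -> float:
    """Return log p if n = p^a with a >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # Strip every factor of the smallest prime p.
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # no divisor up to sqrt(n), so n itself is prime


def mobius_log_convolution(n: int) -> float:
    """Evaluate (mu * log)(n) = sum over d | n of mu(d) * log(n / d)."""
    return sum(mobius(d) * log(n // d) for d in range(1, n + 1) if n % d == 0)


for n in (6, 8, 9, 10, 11):
    print(n, von_mangoldt(n), mobius_log_convolution(n))
```

The two columns should agree up to rounding, with nonzero values appearing only at prime powers.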
## Multiplicativity and Euler Factorization
Why do prime powers govern so many arithmetic functions? The unique factorization of integers implies that a function compatible with coprime products is determined prime by prime, and Dirichlet convolution respects this prime-by-prime structure.
[definition: Multiplicative Arithmetic Function]
An arithmetic function $f: \mathbb N\to\mathbb C$ is multiplicative if $f(1)=1$ and
\begin{align*}
f(mn)=f(m)f(n)
\end{align*}
whenever $\gcd(m,n)=1$.
[/definition]
Multiplicativity only controls products of coprime integers, so it does not determine how a function behaves when the same prime appears in both factors. For example, knowing $f(p)$ does not by itself force $f(p^2)=f(p)^2$. Some later Euler factors and character examples require this stronger compatibility with all products, including products sharing prime divisors.
[definition: Completely Multiplicative Arithmetic Function]
An arithmetic function $f: \mathbb N\to\mathbb C$ is completely multiplicative if $f(1)=1$ and
\begin{align*}
f(mn)=f(m)f(n)
\end{align*}
for all $m,n\in\mathbb N$.
[/definition]
Complete multiplicativity is stronger because it also controls products with common prime factors. Multiplicativity only separates different primes, so the values $f(p^a)$ may vary in a more flexible way.
[example: Multiplicative but Not Completely Multiplicative]
We show that the Mobius function $\mu$ is multiplicative but not completely multiplicative. Let $m,n\in\mathbb N$ with $\gcd(m,n)=1$. If $m$ is not squarefree, then $q^2\mid m$ for some prime $q$, so $q^2\mid mn$ and
\begin{align*}
\mu(mn)=0=\mu(m)\mu(n).
\end{align*}
The same argument applies if $n$ is not squarefree. It remains to consider the case where both are squarefree. Write
\begin{align*}
m=p_1\cdots p_r,\qquad n=q_1\cdots q_s,
\end{align*}
where the $p_i$ are distinct primes and the $q_j$ are distinct primes. Since $\gcd(m,n)=1$, no $p_i$ equals any $q_j$, so
\begin{align*}
mn=p_1\cdots p_rq_1\cdots q_s
\end{align*}
is a product of $r+s$ distinct primes. Therefore
\begin{align*}
\mu(mn)
&=(-1)^{r+s}\\
&=(-1)^r(-1)^s\\
&=\mu(m)\mu(n).
\end{align*}
Thus $\mu$ is multiplicative.
It is not completely multiplicative: for any prime $p$,
\begin{align*}
\mu(p)\mu(p)
&=(-1)(-1)\\
&=1,
\end{align*}
while $p^2\mid p^2$, so
\begin{align*}
\mu(p^2)=0.
\end{align*}
Hence $\mu(p^2)\ne \mu(p)\mu(p)$.
By contrast, $\operatorname{id}_k$ is completely multiplicative. For all $m,n\in\mathbb N$,
\begin{align*}
\operatorname{id}_k(mn)
&=(mn)^k\\
&=\exp(k\log(mn))\\
&=\exp(k(\log m+\log n))\\
&=\exp(k\log m)\exp(k\log n)\\
&=m^k n^k\\
&=\operatorname{id}_k(m)\operatorname{id}_k(n).
\end{align*}
So $\mu$ separates only coprime products, while $\operatorname{id}_k$ separates every product.
[/example]
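The contrast between the two notions can be explored by brute force. The sketch below is an illustration only, with hypothetical helper names: it lists the pairs $(m,n)$ up to a bound where $f(mn)\ne f(m)f(n)$, first restricted to coprime pairs and then over all pairs.

```python
from math import gcd


def mobius(n: int) -> int:
    """Mobius function computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def coprime_failures(f, limit):
    """Coprime pairs (m, n) with m, n <= limit where f(mn) != f(m) f(n)."""
    return [
        (m, n)
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
        if gcd(m, n) == 1 and f(m * n) != f(m) * f(n)
    ]


def all_pair_failures(f, limit):
    """All pairs (m, n) with m, n <= limit where f(mn) != f(m) f(n)."""
    return [
        (m, n)
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
        if f(m * n) != f(m) * f(n)
    ]


print(coprime_failures(mobius, 30))          # no failures on coprime pairs
print(all_pair_failures(mobius, 30)[:3])     # failures once primes are shared
print(all_pair_failures(lambda n: n ** 3, 30))
```

A finite search cannot prove multiplicativity, but a single failing pair such as $(2,2)$ is enough to rule out complete multiplicativity of $\mu$.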
The examples show that multiplicativity is a local condition tied to coprime factorization, so the next question is how much data is needed to specify such a function. Since every integer has a unique prime-power factorization, values on prime powers are the natural local input. What remains to prove is that these local choices determine one and only one global multiplicative function.
[quotetheorem:4345]
[citeproof:4345]
This theorem is the precise sense in which prime-power data determines a multiplicative function. The condition $\gcd(m,n)=1$ is essential, since multiplicativity gives no rule for combining powers of the same prime. The next result explains why convolution is the correct multiplication law for multiplicative functions, and it is the finite, formal version of Euler product factorization.
[quotetheorem:4346]
[citeproof:4346]
The coprimality hypothesis is the mechanism that lets divisors of $mn$ split uniquely into divisors of $m$ and divisors of $n$. Without it, the same prime could appear in both factors and the divisor decomposition would no longer separate. The prime-power formula is therefore the local rule from which global convolution identities are assembled.
[example: Local Computation of Tau and Sigma]
Let $p$ be prime and $a\ge 0$. The divisors of $p^a$ are $p^j$ with $0\le j\le a$, so the convolution identity $\tau=\mathbf{1}*\mathbf{1}$ gives
\begin{align*}
\tau(p^a)
&=(\mathbf{1}*\mathbf{1})(p^a)\\
&=\sum_{j=0}^{a}\mathbf{1}(p^j)\mathbf{1}(p^{a-j})\\
&=\sum_{j=0}^{a}1\cdot 1\\
&=\underbrace{1+\cdots+1}_{a+1\text{ terms}}\\
&=a+1.
\end{align*}
Similarly, using $\sigma_k=\operatorname{id}_k*\mathbf{1}$,
\begin{align*}
\sigma_k(p^a)
&=(\operatorname{id}_k*\mathbf{1})(p^a)\\
&=\sum_{j=0}^{a}\operatorname{id}_k(p^j)\mathbf{1}(p^{a-j})\\
&=\sum_{j=0}^{a}(p^j)^k\cdot 1\\
&=\sum_{j=0}^{a}p^{jk}\\
&=1+p^k+p^{2k}+\cdots+p^{ak}.
\end{align*}
Now let $n=p_1^{a_1}\cdots p_r^{a_r}$ be the prime factorization of a positive integer $n$. Since $\tau$ and $\sigma_k$ are multiplicative, their values on $n$ are the products of their prime-power values:
\begin{align*}
\tau(n)
&=\prod_{i=1}^{r}\tau(p_i^{a_i})\\
&=\prod_{i=1}^{r}(a_i+1),
\end{align*}
and
\begin{align*}
\sigma_k(n)
&=\prod_{i=1}^{r}\sigma_k(p_i^{a_i})\\
&=\prod_{i=1}^{r}\left(1+p_i^k+p_i^{2k}+\cdots+p_i^{a_i k}\right).
\end{align*}
Thus the global divisor-counting and divisor-sum formulas are obtained by multiplying the local contributions from each prime power in $n$.
[/example]
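The local-to-global recipe translates directly into code. The following Python sketch is one possible illustration, assuming a simple trial-division factorizer; the names `factorize`, `tau_local`, and `sigma_local` are not from the text.

```python
def factorize(n: int) -> dict:
    """Return {p: a} with n equal to the product of p**a, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def tau_local(n: int) -> int:
    """tau(n) as the product of the local factors a_i + 1."""
    result = 1
    for a in factorize(n).values():
        result *= a + 1
    return result


def sigma_local(n: int, k: int) -> int:
    """sigma_k(n) as the product of 1 + p^k + ... + p^(a k) over p^a || n."""
    result = 1
    for p, a in factorize(n).items():
        result *= sum(p ** (j * k) for j in range(a + 1))
    return result


def divisors(n: int) -> list:
    """All positive divisors of n, for a brute-force comparison."""
    return [d for d in range(1, n + 1) if n % d == 0]


n = 360
print(tau_local(n), len(divisors(n)))
print(sigma_local(n, 2), sum(d ** 2 for d in divisors(n)))
```

For $n=1$ the factorization is empty, so both products are empty and return $1$, matching $\tau(1)=\sigma_k(1)=1$.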
The local formulas for $\tau$ and $\sigma_k$ illustrate a general pattern: once a multiplicative function is known on prime powers, its global values are obtained by multiplying local contributions. This is the same structural principle behind Euler products. The formal Euler factorization is the conceptual bridge to analytic applications: if $f$ is multiplicative, then any generating object built from $f(n)$ and compatible with multiplication should split into independent prime factors.
The remaining task is to state that factorization independently of any particular arithmetic function. A formal Euler product records all choices of prime-power exponents at once, and multiplicativity is exactly the condition that turns each such choice into the coefficient of the resulting integer.
[quotetheorem:4347]
[citeproof:4347]
This factorization requires multiplicativity; without it, the coefficient attached to $\prod_p p^{a_p}$ would not collapse to $f(n)$. The result is formal in this chapter, but it already explains the shape of Euler products for zeta and $L$-functions. The formal factorization shows why the algebra of arithmetic functions is the right starting point for analytic number theory. Convolution explains divisor sums, Mobius inversion explains cancellation, and multiplicativity explains the passage from global functions to local prime-power data. The next chapter turns these formal identities into analytic identities for Dirichlet series.
Once arithmetic functions are organized by convolution, the next step is to package them into analytic objects. Dirichlet series convert divisor identities and multiplicative structure into functions of a complex variable, where convergence and coefficient growth can be studied together. This prepares the way for the zeta function, the central example where the dictionary becomes especially powerful.
# 2. Dirichlet Series as Generating Functions
Dirichlet convolution turns arithmetic functions into an algebra, but it is still a formal operation until we attach analytic functions to it. This chapter introduces Dirichlet series as generating functions adapted to divisibility rather than addition. The main questions are when such series converge, how multiplication of series records Dirichlet convolution, and why Euler products convert multiplicativity into complex-analytic information about primes.
The chapter assumes the basic convolution algebra of arithmetic functions from Chapter 1, together with standard facts about absolute convergence of complex series and locally uniform limits of holomorphic functions. The guiding example is the Riemann zeta function. Its series and Euler product provide the bridge between integers and primes, while its logarithmic derivative isolates prime powers through the von Mangoldt function.
## Convergence of Dirichlet Series
When we write a generating function $\sum a_n x^n$, the variable $x$ measures additive size. For arithmetic functions the natural scale is multiplicative, so the analogue is to weight $a_n$ by $n^{-s}$ and ask for which complex numbers $s$ the resulting series behaves like a genuine analytic function.
[definition: Dirichlet Series]
Let $(a_n)_{n\ge 1}\in\mathbb C^{\mathbb N}$ be a sequence of complex numbers. The Dirichlet series associated to $(a_n)$ is the formal series
\begin{align*}
\sum_{n=1}^{\infty} a_n n^{-s}.
\end{align*}
Its convergence domain is
\begin{align*}
D_F=\left\{s\in\mathbb C: \sum_{n=1}^{\infty}a_n n^{-s}\text{ converges in }\mathbb C\right\}.
\end{align*}
On $D_F$, the associated function is $F:D_F\to\mathbb C$ defined by
\begin{align*}
F(s)=\sum_{n=1}^{\infty}a_n n^{-s}.
\end{align*}
[/definition]
Writing $s=\sigma+it$, the size of the $n$-th term is controlled by $n^{-\sigma}$, while $n^{-it}=e^{-it\log n}$ only oscillates. Thus convergence is organized by right half-planes rather than by discs.
Because later arguments multiply, rearrange, and compare Dirichlet series term by term, ordinary pointwise convergence is not enough. We need a half-plane where the absolute values form a convergent majorant, since that is the region in which these analytic manipulations are stable.
[definition: Absolute Convergence Half-Plane]
A Dirichlet series $\sum_{n=1}^{\infty}a_n n^{-s}$ is absolutely convergent at $s\in\mathbb C$ if
\begin{align*}
\sum_{n=1}^{\infty}|a_n|n^{-\operatorname{Re}(s)}<\infty.
\end{align*}
Its abscissa of absolute convergence is
\begin{align*}
\sigma_a=\inf\{\beta\in\mathbb R: \sum_{n=1}^{\infty}|a_n|n^{-\sigma}\text{ converges for every }\sigma>\beta\}.
\end{align*}
[/definition]
The displayed set definition is to be read with the usual convention that the infimum may be $\pm\infty$. Because $|a_n|n^{-\sigma}$ is nonincreasing in $\sigma$, absolute convergence at one point $s_0$ implies absolute convergence at every $s$ with $\operatorname{Re}(s)\ge\operatorname{Re}(s_0)$.
[definition: Convergence Abscissa]
The abscissa of convergence of a Dirichlet series $\sum_{n=1}^{\infty}a_n n^{-s}$ is
\begin{align*}
\sigma_c=\inf\{\beta\in\mathbb R: \sum_{n=1}^{\infty}a_n n^{-s}\text{ converges for every }s\text{ with }\operatorname{Re}(s)>\beta\}.
\end{align*}
[/definition]
The distinction between $\sigma_c$ and $\sigma_a$ matters when cancellation in the coefficients gives conditional convergence. In this chapter most arithmetic examples are first used in the region of absolute convergence, since products and rearrangements are then justified without additional summability arguments.
The analytic reason for privileging absolute convergence is local uniform control. To treat a Dirichlet series as a holomorphic function, we need more than convergence at isolated points: we need uniform convergence on smaller right half-planes so that standard complex-analytic operations apply.
[quotetheorem:4348]
[citeproof:4348]
This theorem is the analytic licence for treating an absolutely convergent Dirichlet series like a holomorphic function in its half-plane of convergence. The strict shift by $\varepsilon$ is essential: convergence at one boundary line does not control uniform convergence all the way down to that line, just as a power series may behave badly on the boundary of its disc of convergence. The hypothesis of absolute convergence at $s_0$ supplies a summable majorant; without it, oscillation may give pointwise conditional convergence while leaving no uniform bound suitable for rearrangement, multiplication, or termwise limiting arguments. Thus the theorem says nothing about the exact boundary $\operatorname{Re}(s)=\sigma_0$, and later arguments deliberately work on smaller half-planes such as $\operatorname{Re}(s)\ge 1+\varepsilon$ before returning to questions about boundary behaviour separately.
[example: Zeta Series Convergence]
For the constant arithmetic function $\mathbf{1}(n)=1$, the associated Dirichlet series is
\begin{align*}
\zeta(s)=\sum_{n=1}^{\infty}n^{-s}.
\end{align*}
Write $s=\sigma+it$. Since $n^{-s}=e^{-s\log n}$, we have
\begin{align*}
|n^{-s}|
&=|e^{-(\sigma+it)\log n}|\\
&=|e^{-\sigma\log n}e^{-it\log n}|\\
&=e^{-\sigma\log n}\\
&=n^{-\sigma}.
\end{align*}
Therefore
\begin{align*}
\sum_{n=1}^{\infty}|n^{-s}|=\sum_{n=1}^{\infty}n^{-\sigma},
\end{align*}
which converges exactly when $\sigma>1$ by the $p$-series test. Hence the zeta series is absolutely convergent exactly on the half-plane $\operatorname{Re}(s)>1$, so $\sigma_a=1$.
This also gives ordinary convergence for every $s$ with $\operatorname{Re}(s)>1$, so $\sigma_c\le 1$. Conversely, if $\beta<1$, choose a real number $r$ with $\beta<r\le 1$. Then
\begin{align*}
\sum_{n=1}^{\infty}n^{-r}
\end{align*}
diverges by the same $p$-series test, so the zeta series fails to converge at the point $s=r$, which satisfies $\operatorname{Re}(s)>\beta$. Hence no $\beta<1$ belongs to the set defining $\sigma_c$. Thus $\sigma_c\ge 1$, and therefore $\sigma_c=1$. By uniform absolute convergence on each smaller half-plane $\operatorname{Re}(s)\ge 1+\varepsilon$, the partial sums converge locally uniformly to a holomorphic function on $\operatorname{Re}(s)>1$.
[/example]
The same estimate also gives a convenient rule of thumb: polynomially growing coefficients force absolute convergence once the real part of $s$ is far enough to the right.
[example: Polynomial Growth Coefficients]
Suppose $|a_n|\le Cn^A$ for constants $C>0$ and $A\ge 0$, and write $s=\sigma+it$. For each $n\ge 1$,
\begin{align*}
|a_n n^{-s}|
&=|a_n|\,|n^{-s}|\\
&=|a_n|\,|e^{-s\log n}|\\
&=|a_n|\,|e^{-(\sigma+it)\log n}|\\
&=|a_n|\,|e^{-\sigma\log n}e^{-it\log n}|\\
&=|a_n|\,e^{-\sigma\log n}\\
&=|a_n|n^{-\sigma}\\
&\le Cn^A n^{-\sigma}\\
&=Cn^{A-\sigma}.
\end{align*}
Thus
\begin{align*}
\sum_{n=1}^{\infty}|a_n n^{-s}|
\le C\sum_{n=1}^{\infty}n^{A-\sigma}
= C\sum_{n=1}^{\infty}\frac{1}{n^{\sigma-A}}.
\end{align*}
The $p$-series test says that $\sum_{n=1}^{\infty}n^{-p}$ converges exactly when $p>1$, so the majorant converges when $\sigma-A>1$, equivalently when $\operatorname{Re}(s)=\sigma>A+1$. By comparison, $\sum_{n=1}^{\infty}a_n n^{-s}$ converges absolutely throughout the half-plane $\operatorname{Re}(s)>A+1$.
This gives a first convergence region whenever the coefficients are known only to have polynomial growth; sharper estimates on $a_n$ may move the boundary farther left.
[/example]
## Multiplication and Dirichlet Convolution
The reason Dirichlet series are the correct generating functions for arithmetic functions is that ordinary multiplication of series matches divisor summation. The question is: if $F(s)$ represents $f$ and $G(s)$ represents $g$, which arithmetic function is represented by $F(s)G(s)$?
[quotetheorem:4349]
[citeproof:4349]
This is the analytic form of the convolution algebra from Chapter 1. Absolute convergence is the condition that makes the product of the two series independent of the order in which terms are grouped: for conditionally convergent series, rearrangements can change sums, so grouping the double series by the value of $ab$ would not be justified by formal algebra alone. The theorem does not claim that every product of conditionally convergent Dirichlet series has convolution coefficients, nor that the resulting series converges at boundary points. In applications we first prove identities in a right half-plane of absolute convergence, where the divisor grouping is legitimate, and only later ask whether analytic continuation extends the identity further.
[example: Divisor Function from Zeta Squared]
For $\operatorname{Re}(s)>1$, the zeta series is absolutely convergent, so the *Product Theorem for Absolutely Convergent Dirichlet Series* applies to
\begin{align*}
\zeta(s)=\sum_{a=1}^{\infty}a^{-s}
\qquad\text{and}\qquad
\zeta(s)=\sum_{b=1}^{\infty}b^{-s}.
\end{align*}
Multiplying and grouping terms by the product $n=ab$ gives
\begin{align*}
\zeta(s)^2
&=\left(\sum_{a=1}^{\infty}a^{-s}\right)\left(\sum_{b=1}^{\infty}b^{-s}\right)\\
&=\sum_{a=1}^{\infty}\sum_{b=1}^{\infty}a^{-s}b^{-s}\\
&=\sum_{a=1}^{\infty}\sum_{b=1}^{\infty}(ab)^{-s}\\
&=\sum_{n=1}^{\infty}\left(\sum_{\substack{a,b\ge 1\\ ab=n}}1\right)n^{-s}.
\end{align*}
For fixed $n$, the pairs $(a,b)$ with $ab=n$ are in bijection with the positive divisors of $n$: a divisor $d\mid n$ determines the pair $(d,n/d)$, and every pair $(a,b)$ with $ab=n$ arises this way from $d=a$. Hence
\begin{align*}
\sum_{\substack{a,b\ge 1\\ ab=n}}1
=\sum_{d\mid n}1
=\tau(n).
\end{align*}
Therefore
\begin{align*}
\zeta(s)^2=\sum_{n=1}^{\infty}\tau(n)n^{-s}\qquad(\operatorname{Re}(s)>1).
\end{align*}
This identity says that squaring the Dirichlet series for $\mathbf{1}$ records the convolution $\mathbf{1}*\mathbf{1}$, whose value at $n$ is the number of positive divisors of $n$.
[/example]
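On the level of coefficients, grouping terms by the product $n=ab$ is a double loop. The Python sketch below is a hedged illustration of that grouping for truncated coefficient lists; the function name `dirichlet_convolution` and the list layout (index $0$ unused) are assumptions made for the example.

```python
def dirichlet_convolution(f, g, limit):
    """Coefficients h[1..limit] of f * g.

    f and g are lists indexed from 1 (index 0 is unused padding).
    Each pair (a, b) with a * b <= limit contributes f[a] * g[b] to h[a * b],
    which is exactly the grouping of the double series by n = ab.
    """
    h = [0] * (limit + 1)
    for a in range(1, limit + 1):
        for b in range(1, limit // a + 1):
            h[a * b] += f[a] * g[b]
    return h


limit = 30
one = [0] + [1] * limit          # the constant function 1
tau = dirichlet_convolution(one, one, limit)

print(tau[1:13])
```

Only coefficients up to the truncation bound are computed, which is harmless here because the coefficient of $n^{-s}$ in a product depends only on the coefficients at divisors of $n$.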
Changing one of the two factors from $\mathbf{1}$ to the power function $n^k$ turns divisor counting into weighted divisor summation.
[example: Sum of Powers of Divisors]
For $k\ge 0$, define $\operatorname{id}_k(n)=n^k$. For $\operatorname{Re}(s)>k+1$,
\begin{align*}
\sum_{n=1}^{\infty}\operatorname{id}_k(n)n^{-s}
&=\sum_{n=1}^{\infty}n^k n^{-s}\\
&=\sum_{n=1}^{\infty}n^{k-s}\\
&=\sum_{n=1}^{\infty}n^{-(s-k)}\\
&=\zeta(s-k),
\end{align*}
and also $\sum_{n=1}^{\infty}\mathbf{1}(n)n^{-s}=\zeta(s)$, since $\operatorname{Re}(s)>k+1\ge 1$ implies $\operatorname{Re}(s)>1$.
For each $n\ge 1$, the Dirichlet convolution of $\operatorname{id}_k$ with $\mathbf{1}$ is
\begin{align*}
(\operatorname{id}_k*\mathbf{1})(n)
&=\sum_{d\mid n}\operatorname{id}_k(d)\mathbf{1}(n/d)\\
&=\sum_{d\mid n}d^k\cdot 1\\
&=\sum_{d\mid n}d^k\\
&=\sigma_k(n).
\end{align*}
Thus, by the *Product Theorem for Absolutely Convergent Dirichlet Series*,
\begin{align*}
\zeta(s-k)\zeta(s)
&=\left(\sum_{a=1}^{\infty}\operatorname{id}_k(a)a^{-s}\right)
\left(\sum_{b=1}^{\infty}\mathbf{1}(b)b^{-s}\right)\\
&=\sum_{n=1}^{\infty}(\operatorname{id}_k*\mathbf{1})(n)n^{-s}\\
&=\sum_{n=1}^{\infty}\sigma_k(n)n^{-s}
\end{align*}
on the half-plane $\operatorname{Re}(s)>k+1$. The identity records that weighting each divisor $d$ by $d^k$ is exactly the convolution operation represented analytically by multiplying $\zeta(s-k)$ and $\zeta(s)$.
[/example]
The same product rule also packages Mobius inversion into a single quotient of zeta functions, as the Euler totient function illustrates.
[example: Euler Phi from Mobius Inversion]
For $\operatorname{Re}(s)>2$, the two Dirichlet series
\begin{align*}
\sum_{n=1}^{\infty}\operatorname{id}_1(n)n^{-s}
\qquad\text{and}\qquad
\sum_{n=1}^{\infty}\mu(n)n^{-s}
\end{align*}
are absolutely convergent, and
\begin{align*}
\sum_{n=1}^{\infty}\operatorname{id}_1(n)n^{-s}
&=\sum_{n=1}^{\infty}n\cdot n^{-s}\\
&=\sum_{n=1}^{\infty}n^{1-s}\\
&=\sum_{n=1}^{\infty}n^{-(s-1)}\\
&=\zeta(s-1),
\end{align*}
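while the identity $\mu*\mathbf{1}=\delta$ together with the *Product Theorem for Absolutely Convergent Dirichlet Series* gives $\left(\sum_{n=1}^{\infty}\mu(n)n^{-s}\right)\zeta(s)=1$ for $\operatorname{Re}(s)>1$, so the Dirichlet series for $\mu$ is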
\begin{align*}
\sum_{n=1}^{\infty}\mu(n)n^{-s}=\frac{1}{\zeta(s)}.
\end{align*}
Using the identity $\varphi=\operatorname{id}_1*\mu$ and the *Product Theorem for Absolutely Convergent Dirichlet Series*, we get
\begin{align*}
\sum_{n=1}^{\infty}\varphi(n)n^{-s}
&=\sum_{n=1}^{\infty}(\operatorname{id}_1*\mu)(n)n^{-s}\\
&=\left(\sum_{a=1}^{\infty}\operatorname{id}_1(a)a^{-s}\right)
\left(\sum_{b=1}^{\infty}\mu(b)b^{-s}\right)\\
&=\zeta(s-1)\cdot \frac{1}{\zeta(s)}\\
&=\frac{\zeta(s-1)}{\zeta(s)}.
\end{align*}
Thus a quotient of Dirichlet series appears because multiplying by the Dirichlet inverse $\mu$ corresponds, on the analytic side, to dividing by the Dirichlet series for $\mathbf{1}$.
[/example]
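The quotient formula can be tested numerically at a single point. The sketch below evaluates a partial sum of $\sum\varphi(n)n^{-s}$ at $s=3$ and compares it with $\zeta(2)/\zeta(3)$. It is an illustration under stated assumptions: the helper names are hypothetical, $\zeta(2)=\pi^2/6$ is used in closed form, and the value of $\zeta(3)$ is a quoted double-precision constant rather than something computed here.

```python
from math import pi


def totients_up_to(limit):
    """Sieve returning a list phi with phi[n] = Euler's totient of n (phi[0] unused)."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:
            # phi[p] is untouched, so no smaller prime divides p: p is prime.
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
    return phi


def phi_series_partial_sum(s, limit):
    """Partial sum of sum phi(n) n^(-s) over 1 <= n <= limit."""
    phi = totients_up_to(limit)
    return sum(phi[n] * n ** (-s) for n in range(1, limit + 1))


ZETA_2 = pi ** 2 / 6
ZETA_3 = 1.2020569031595942  # quoted numerical value of zeta(3)

partial = phi_series_partial_sum(3.0, 20000)
target = ZETA_2 / ZETA_3
print(partial, target)
```

Since $\varphi(n)\le n$, the omitted tail at $s=3$ is at most $\sum_{n>N}n^{-2}<1/N$, so the partial sum lies below the target and within $1/N$ of it.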
## Euler Products for Multiplicative Functions
Multiplicative functions are determined by their values on prime powers, so their Dirichlet series should factor into independent local pieces, one for each prime. The analytic issue is to justify passing from the formal prime-power expansion to an infinite product.
[quotetheorem:4350]
[citeproof:4350]
The theorem turns multiplicativity into factorisation. Multiplicativity is necessary because the coefficient of $p^a q^b$ in the product of local factors is forced to be $f(p^a)f(q^b)$; if $f$ is not multiplicative, these local choices do not reconstruct $f(n)$ for composite $n$ with several prime factors. Absolute convergence is also necessary for this analytic statement, since an infinite product over primes involves both an ordering limit and a regrouping of infinitely many terms. The theorem therefore does not give Euler products for arbitrary arithmetic functions, nor does it justify Euler products on the boundary of convergence. For completely multiplicative functions each local factor becomes a geometric series; for merely multiplicative functions, the local data may contain richer prime-power information.
[example: Euler Product for the Zeta Function]
Let $s=\sigma+it$ with $\sigma>1$. For the constant arithmetic function $\mathbf{1}(n)=1$, the zeta series
\begin{align*}
\zeta(s)=\sum_{n=1}^{\infty}\mathbf{1}(n)n^{-s}
\end{align*}
converges absolutely, and $\mathbf{1}$ is multiplicative because
\begin{align*}
\mathbf{1}(mn)=1=1\cdot 1=\mathbf{1}(m)\mathbf{1}(n)
\end{align*}
whenever $\gcd(m,n)=1$. Therefore the *Euler Product for Absolutely Convergent Multiplicative Series* gives
\begin{align*}
\zeta(s)
&=\prod_p\left(\sum_{a=0}^{\infty}\mathbf{1}(p^a)p^{-as}\right)\\
&=\prod_p\left(\sum_{a=0}^{\infty}p^{-as}\right)\\
&=\prod_p\left(1+p^{-s}+p^{-2s}+\cdots\right).
\end{align*}
For each prime $p$,
\begin{align*}
|p^{-s}|
&=|e^{-s\log p}|\\
&=|e^{-\sigma\log p}e^{-it\log p}|\\
&=e^{-\sigma\log p}\\
&=p^{-\sigma}\\
&<1,
\end{align*}
so the geometric series formula applies with ratio $p^{-s}$:
\begin{align*}
\sum_{a=0}^{\infty}p^{-as}
&=\sum_{a=0}^{\infty}(p^{-s})^a\\
&=\frac{1}{1-p^{-s}}.
\end{align*}
Hence
\begin{align*}
\zeta(s)=\prod_p(1-p^{-s})^{-1}\qquad(\operatorname{Re}(s)>1).
\end{align*}
This identity rewrites the sum over all positive integers as independent prime-power contributions, so unique factorisation is now visible inside the analytic function $\zeta(s)$.
[/example]
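A truncated Euler product already approximates $\zeta(s)$ well when $s$ is real and larger than $1$. The following Python sketch compares the product over primes up to a bound with the known value $\zeta(2)=\pi^2/6$; the function names and the choice of bound are illustrative assumptions.

```python
from math import pi


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit (limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def truncated_euler_product(s, prime_bound):
    """Product of (1 - p^(-s))^(-1) over primes p <= prime_bound."""
    product = 1.0
    for p in primes_up_to(prime_bound):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


approx = truncated_euler_product(2.0, 10000)
exact = pi ** 2 / 6
print(approx, exact)
```

Every omitted factor exceeds $1$ for real $s>1$, so the truncated product approaches $\zeta(2)$ from below as the prime bound grows.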
Taking the reciprocal Euler product identifies the Dirichlet series of the Mobius function and makes the inverse relation $\mu * \mathbf{1}=\delta$ analytic.
[example: Dirichlet Series for the Mobius Function]
Let $s=\sigma+it$ with $\sigma>1$. The Möbius function is multiplicative, and its prime-power values are
\begin{align*}
\mu(p^0)&=\mu(1)=1,\\
\mu(p^1)&=-1,\\
\mu(p^a)&=0\qquad(a\ge 2).
\end{align*}
The series $\sum_{n=1}^{\infty}\mu(n)n^{-s}$ is absolutely convergent on this half-plane because $|\mu(n)|\le 1$ for every $n$ and
\begin{align*}
\sum_{n=1}^{\infty}|\mu(n)n^{-s}|
&=\sum_{n=1}^{\infty}|\mu(n)|\,|n^{-s}|\\
&=\sum_{n=1}^{\infty}|\mu(n)|n^{-\sigma}\\
&\le \sum_{n=1}^{\infty}n^{-\sigma},
\end{align*}
which converges when $\sigma>1$ by the $p$-series test.
Thus the *Euler Product for Absolutely Convergent Multiplicative Series* gives
\begin{align*}
\sum_{n=1}^{\infty}\mu(n)n^{-s}
&=\prod_p\left(\sum_{a=0}^{\infty}\mu(p^a)p^{-as}\right)\\
&=\prod_p\left(\mu(1)p^{0}+\mu(p)p^{-s}+\sum_{a=2}^{\infty}\mu(p^a)p^{-as}\right)\\
&=\prod_p\left(1+(-1)p^{-s}+\sum_{a=2}^{\infty}0\cdot p^{-as}\right)\\
&=\prod_p(1-p^{-s}).
\end{align*}
On the same half-plane, the zeta Euler product is
\begin{align*}
\zeta(s)=\prod_p(1-p^{-s})^{-1}.
\end{align*}
For every finite set $P$ of primes, taking reciprocals of the finite partial products gives
\begin{align*}
\left(\prod_{p\in P}(1-p^{-s})^{-1}\right)^{-1}
=\prod_{p\in P}(1-p^{-s}),
\end{align*}
and passing to the absolutely convergent infinite-product limit yields
\begin{align*}
\prod_p(1-p^{-s})=\frac{1}{\zeta(s)}.
\end{align*}
Therefore
\begin{align*}
\sum_{n=1}^{\infty}\mu(n)n^{-s}=\frac{1}{\zeta(s)}\qquad(\operatorname{Re}(s)>1).
\end{align*}
This is the analytic form of the arithmetic fact that $\mu$ is the Dirichlet inverse of $\mathbf{1}$.
[/example]
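The identity can be checked at $s=2$, where $1/\zeta(2)=6/\pi^2$. The sketch below is a numerical illustration only, with a hypothetical sieve `mobius_up_to` and an arbitrary truncation point.

```python
from math import pi


def mobius_up_to(limit):
    """Sieve returning a list mu with mu[n] = Mobius function of n (mu[0] unused)."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
            # Each prime divisor flips the sign once.
            for multiple in range(p, limit + 1, p):
                mu[multiple] = -mu[multiple]
            # A square factor p^2 forces the value 0.
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0
    return mu


def mobius_series_partial_sum(s, limit):
    """Partial sum of sum mu(n) n^(-s) over 1 <= n <= limit."""
    mu = mobius_up_to(limit)
    return sum(mu[n] * n ** (-s) for n in range(1, limit + 1))


approx = mobius_series_partial_sum(2.0, 20000)
target = 6 / pi ** 2
print(approx, target)
```

Because $|\mu(n)|\le 1$, the truncation error at $s=2$ is bounded by $\sum_{n>N}n^{-2}<1/N$, the same majorant used in the absolute convergence argument.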
For divisor-sum functions, the Euler product can be read prime by prime; each local factor is a product of two geometric series in disguise.
[example: Prime-Power Data for Sigma Functions]
Fix an integer $k\ge 0$ and let $s=\sigma+it$ with $\sigma>k+1$. For a prime power $p^a$, the positive divisors are $1,p,\ldots,p^a$, so
\begin{align*}
\sigma_k(p^a)
&=\sum_{j=0}^{a}(p^j)^k\\
&=\sum_{j=0}^{a}p^{jk}\\
&=1+p^k+\cdots+p^{ak}.
\end{align*}
For the local generating function, write each term using this divisor sum:
\begin{align*}
\sum_{a=0}^{\infty}\sigma_k(p^a)p^{-as}
&=\sum_{a=0}^{\infty}\left(\sum_{j=0}^{a}p^{jk}\right)p^{-as}\\
&=\sum_{a=0}^{\infty}\sum_{j=0}^{a}p^{jk}p^{-as}.
\end{align*}
Set $a=j+b$, where $j\ge 0$ and $b\ge 0$; the pairs $(a,j)$ with $0\le j\le a$ correspond bijectively to the pairs $(j,b)$ with $j,b\ge 0$. Since
\begin{align*}
p^{jk}p^{-as}
&=p^{jk}p^{-(j+b)s}\\
&=p^{j(k-s)}p^{-bs},
\end{align*}
we get
\begin{align*}
\sum_{a=0}^{\infty}\sigma_k(p^a)p^{-as}
&=\sum_{j=0}^{\infty}\sum_{b=0}^{\infty}p^{j(k-s)}p^{-bs}\\
&=\left(\sum_{j=0}^{\infty}(p^{k-s})^j\right)
\left(\sum_{b=0}^{\infty}(p^{-s})^b\right).
\end{align*}
Here
\begin{align*}
|p^{k-s}|=p^{k-\sigma}<1
\qquad\text{and}\qquad
|p^{-s}|=p^{-\sigma}<1,
\end{align*}
so both geometric series converge, and therefore
\begin{align*}
\sum_{a=0}^{\infty}\sigma_k(p^a)p^{-as}
&=\frac{1}{1-p^{k-s}}\cdot\frac{1}{1-p^{-s}}\\
&=\frac{1}{(1-p^{-s})(1-p^{k-s})}.
\end{align*}
Since $\sigma_k=\operatorname{id}_k*\mathbf{1}$ is multiplicative, the Euler product for absolutely convergent multiplicative Dirichlet series gives
\begin{align*}
\sum_{n=1}^{\infty}\sigma_k(n)n^{-s}
&=\prod_p\left(\sum_{a=0}^{\infty}\sigma_k(p^a)p^{-as}\right)\\
&=\prod_p\frac{1}{(1-p^{-s})(1-p^{k-s})}\\
&=\left(\prod_p(1-p^{-s})^{-1}\right)
\left(\prod_p(1-p^{-(s-k)})^{-1}\right)\\
&=\zeta(s)\zeta(s-k).
\end{align*}
The condition $\sigma>k+1$ ensures both $\operatorname{Re}(s)>1$ and $\operatorname{Re}(s-k)>1$, so both zeta Euler products are being used inside their absolute convergence half-planes. Thus the prime-power formula for $\sigma_k(p^a)$ is exactly the local data behind the global identity
\begin{align*}
\sum_{n=1}^{\infty}\sigma_k(n)n^{-s}=\zeta(s)\zeta(s-k).
\end{align*}
[/example]
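The closed form for a single local factor is easy to test numerically. The Python sketch below compares a partial sum of $\sum_a\sigma_k(p^a)p^{-as}$ with $\bigl((1-p^{-s})(1-p^{k-s})\bigr)^{-1}$ for real $s>k+1$; the function names and the sample parameters are assumptions chosen for illustration.

```python
from math import isclose


def sigma_prime_power(p, a, k):
    """sigma_k(p^a) = 1 + p^k + ... + p^(a k)."""
    return sum(p ** (j * k) for j in range(a + 1))


def local_factor_partial_sum(p, k, s, terms):
    """Sum of sigma_k(p^a) p^(-a s) over 0 <= a < terms."""
    return sum(sigma_prime_power(p, a, k) * p ** (-a * s) for a in range(terms))


def local_factor_closed_form(p, k, s):
    """The product of the two geometric-series sums 1/(1-p^(-s)) and 1/(1-p^(k-s))."""
    return 1.0 / ((1.0 - p ** (-s)) * (1.0 - p ** (k - s)))


# p = 2, k = 1, s = 3: the closed form is 1 / ((7/8)(3/4)) = 32/21.
print(local_factor_partial_sum(2, 1, 3.0, 60), local_factor_closed_form(2, 1, 3.0))
```

The terms decay geometrically with ratio roughly $p^{k-s}$, so a few dozen terms already reach double precision for these parameters.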
## Logarithmic Differentiation and Prime-Sensitive Coefficients
Products over primes are powerful because taking a logarithm turns multiplication over primes into addition over primes. Differentiating the logarithm then weights each prime power by a logarithm, producing coefficients that detect prime powers rather than all divisors evenly.
[definition: Von Mangoldt Function]
The von Mangoldt function is the arithmetic function $\Lambda:\mathbb N\to\mathbb R$ defined by
\begin{align*}
\Lambda(n)=
\begin{cases}
\log p, & n=p^a\text{ for some prime }p\text{ and }a\ge 1,\\
0, & \text{otherwise}.
\end{cases}
\end{align*}
[/definition]
This function is designed so that sums of $\Lambda(n)$ count primes with the correct logarithmic weight. The identity below explains why $\Lambda$ is the natural coefficient sequence attached to the logarithmic derivative of $\zeta$.
[quotetheorem:1750]
[citeproof:1750]
This identity is the analytic gateway to the prime number theorem. The restriction $\operatorname{Re}(s)>1$ is not cosmetic: it is the region where the Euler product, logarithm, and differentiated prime-power series are all controlled by locally uniform absolute convergence. At the boundary $\operatorname{Re}(s)=1$, the function $\zeta(s)$ has a pole at $s=1$, where the defining series diverges, and the logarithmic derivative has singular behavior that cannot be reached by the same termwise argument. Beyond the boundary, any use of the identity must come from analytic continuation rather than from direct manipulation of the defining series. Information about zeros and poles of $\zeta(s)$ becomes information about weighted prime counts because $-\zeta'(s)/\zeta(s)$ has coefficients $\Lambda(n)$.
[example: Recovering the Divisor Sum for Logarithms]
Fix $s$ with $\operatorname{Re}(s)>1$. In this half-plane,
\begin{align*}
-\frac{\zeta'(s)}{\zeta(s)}
&=\sum_{n=1}^{\infty}\Lambda(n)n^{-s}
\end{align*}
by the *Logarithmic Derivative of Zeta*, and
\begin{align*}
\zeta(s)=\sum_{n=1}^{\infty}\mathbf{1}(n)n^{-s}.
\end{align*}
Both series are absolutely convergent there, so the *Product Theorem for Absolutely Convergent Dirichlet Series* gives
\begin{align*}
-\zeta'(s)
&=\left(-\frac{\zeta'(s)}{\zeta(s)}\right)\zeta(s)\\
&=\left(\sum_{a=1}^{\infty}\Lambda(a)a^{-s}\right)
\left(\sum_{b=1}^{\infty}\mathbf{1}(b)b^{-s}\right)\\
&=\sum_{n=1}^{\infty}(\Lambda*\mathbf{1})(n)n^{-s}.
\end{align*}
For each $n\ge 1$, the convolution coefficient is
\begin{align*}
(\Lambda*\mathbf{1})(n)
&=\sum_{d\mid n}\Lambda(d)\mathbf{1}(n/d)\\
&=\sum_{d\mid n}\Lambda(d)\cdot 1\\
&=\sum_{d\mid n}\Lambda(d),
\end{align*}
and hence
\begin{align*}
-\zeta'(s)=\sum_{n=1}^{\infty}\left(\sum_{d\mid n}\Lambda(d)\right)n^{-s}.
\end{align*}
On the other hand, differentiating the zeta series term by term is justified on every smaller half-plane $\operatorname{Re}(s)\ge 1+\varepsilon$, since
\begin{align*}
\sum_{n=1}^{\infty}(\log n)n^{-1-\varepsilon}
\end{align*}
converges. For each $n$,
\begin{align*}
\frac{d}{ds}n^{-s}
&=\frac{d}{ds}e^{-s\log n}\\
&=-(\log n)e^{-s\log n}\\
&=-(\log n)n^{-s},
\end{align*}
so
\begin{align*}
\zeta'(s)
&=\sum_{n=1}^{\infty}-(\log n)n^{-s},
\end{align*}
and therefore
\begin{align*}
-\zeta'(s)=\sum_{n=1}^{\infty}(\log n)n^{-s}.
\end{align*}
Thus, for all real $s>1$,
\begin{align*}
\sum_{n=1}^{\infty}\left(\sum_{d\mid n}\Lambda(d)\right)n^{-s}
=
\sum_{n=1}^{\infty}(\log n)n^{-s}.
\end{align*}
By *coefficient comparison in a common half-plane of absolute convergence*, the coefficients agree:
\begin{align*}
\log n=\sum_{d\mid n}\Lambda(d)\qquad(n\ge 1).
\end{align*}
The logarithmic derivative therefore recovers exactly the divisor-sum identity expressing $\log n$ as the total von Mangoldt weight of the divisors of $n$.
[/example]
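The divisor-sum identity can be checked numerically for small $n$. The following is a minimal sketch, not part of the proof: the helper names `von_mangoldt`, `divisor_sum_of_lambda`, and `max_deviation` are illustrative choices, and trial division is used only because the range is small.

```python
import math

def von_mangoldt(n):
    """Return Lambda(n): log p if n = p**a with a >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    # Smallest prime factor of n by trial division (n itself if n is prime).
    p = next((d for d in range(2, math.isqrt(n) + 1) if n % d == 0), n)
    m = n
    while m % p == 0:
        m //= p
    # n is a power of p exactly when nothing is left after removing p.
    return math.log(p) if m == 1 else 0.0

def divisor_sum_of_lambda(n):
    """Return the sum of Lambda(d) over all divisors d of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

def max_deviation(limit):
    """Largest value of |sum_{d | n} Lambda(d) - log n| for 1 <= n <= limit."""
    return max(abs(divisor_sum_of_lambda(n) - math.log(n))
               for n in range(1, limit + 1))

print(max_deviation(300))  # expected to be at the level of floating-point rounding
```

For $n=12$ the divisors $2,3,4$ contribute $\log 2+\log 3+\log 2=\log 12$, while $1$, $6$, and $12$ contribute nothing, which is the pattern the sketch verifies for every $n$ in the range.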
The logarithmic derivative calculation can also be reversed: it gives the Dirichlet series whose coefficients are the von Mangoldt weights themselves.
[example: Lambda as a Dirichlet Series Coefficient]
For $\operatorname{Re}(s)>1$, the coefficient sequence $\Lambda$ cannot be handled by the Euler product theorem for multiplicative functions, because $\Lambda$ is not multiplicative. Indeed, $(2,3)=1$, but
\begin{align*}
\Lambda(2)&=\log 2,\\
\Lambda(3)&=\log 3,\\
\Lambda(2)\Lambda(3)&=(\log 2)(\log 3),
\end{align*}
whereas $6$ is not a prime power, so
\begin{align*}
\Lambda(6)=0.
\end{align*}
Since $(\log 2)(\log 3)\ne 0$, we have
\begin{align*}
\Lambda(6)\ne \Lambda(2)\Lambda(3),
\end{align*}
so $\Lambda$ is not multiplicative.
Instead, $\Lambda$ appears when the Euler product for $\zeta$ is logarithmically differentiated. On $\operatorname{Re}(s)>1$,
\begin{align*}
\zeta(s)=\prod_p(1-p^{-s})^{-1}.
\end{align*}
Taking logarithms and using $-\log(1-z)=\sum_{a=1}^{\infty}z^a/a$ for $|z|<1$ gives
\begin{align*}
\log \zeta(s)
&=\sum_p -\log(1-p^{-s})\\
&=\sum_p\sum_{a=1}^{\infty}\frac{(p^{-s})^a}{a}\\
&=\sum_p\sum_{a=1}^{\infty}\frac{p^{-as}}{a}.
\end{align*}
Differentiating term by term in the absolutely convergent half-plane,
\begin{align*}
\frac{d}{ds}\left(\frac{p^{-as}}{a}\right)
&=\frac{1}{a}\frac{d}{ds}e^{-as\log p}\\
&=\frac{1}{a}\left(-a\log p\right)e^{-as\log p}\\
&=-(\log p)p^{-as}.
\end{align*}
Therefore
\begin{align*}
\frac{\zeta'(s)}{\zeta(s)}
&=-\sum_p\sum_{a=1}^{\infty}(\log p)p^{-as},
\end{align*}
and hence
\begin{align*}
-\frac{\zeta'(s)}{\zeta(s)}
&=\sum_p\sum_{a=1}^{\infty}(\log p)(p^a)^{-s}.
\end{align*}
Each term in this double sum is indexed by a prime power $n=p^a$, and its coefficient is $\log p=\Lambda(p^a)$. If $n$ is not a prime power, it appears in none of the local prime-power sums, and its coefficient is $0=\Lambda(n)$. Thus
\begin{align*}
-\frac{\zeta'(s)}{\zeta(s)}
=\sum_{n=1}^{\infty}\Lambda(n)n^{-s}.
\end{align*}
The coefficient $\Lambda(n)$ therefore comes from differentiating the prime factors of the Euler product, not from an Euler product for $\Lambda$ itself.
[/example]
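As a numerical sanity check of the identity just derived, one can compare truncated sums for both sides at a real point well inside the half-plane of absolute convergence. This is a rough sketch under stated assumptions: the point $s=3$ and the cutoff `limit=20000` are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for multiple in range(2 * p, limit + 1, p):
            is_prime[multiple] = False
        # Put log p on every prime power p, p^2, p^3, ... up to limit.
        q = p
        while q <= limit:
            lam[q] = math.log(p)
            q *= p
    return lam

def compare_at(s, limit=20000):
    """Truncated sums for both sides of -zeta'(s)/zeta(s) = sum Lambda(n) n^(-s)."""
    lam = von_mangoldt_table(limit)
    lambda_series = sum(lam[n] * n ** (-s) for n in range(2, limit + 1))
    zeta = sum(n ** (-s) for n in range(1, limit + 1))
    minus_zeta_prime = sum(math.log(n) * n ** (-s) for n in range(2, limit + 1))
    return lambda_series, minus_zeta_prime / zeta

left, right = compare_at(3.0)
print(left, right)  # the two truncated values should agree to several decimals
```

The same table also exhibits the failure of multiplicativity: `lam[6]` is zero while `lam[2] * lam[3]` is positive.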
## Dirichlet Series Identities as Arithmetic Identities
The chapter closes by collecting the dictionary that will be used throughout the course. Arithmetic operations become analytic operations only in regions where convergence permits the required rearrangements, and this restriction is part of the statement rather than a technical afterthought.
[quotetheorem:4351]
[citeproof:4351]
This uniqueness principle lets us move both ways across the dictionary: convolution identities imply products of Dirichlet series, and analytic products imply arithmetic convolution identities. The common absolute convergence half-plane gives enough control to isolate the first nonzero coefficient by sending real $s$ to $\infty$; without a shared region of convergence, the expressions being compared may not define functions on any common domain. Equality must also hold on a set of real values tending to infinity, or more generally on a set with enough accumulation inside a connected domain of holomorphy, because isolated coincidences of analytic functions do not determine coefficients. The theorem is therefore not a formal rule for arbitrary Dirichlet series; it is a uniqueness statement for genuinely convergent analytic representatives.
[example: Basic Dictionary]
For $k\ge 0$, the basic Dirichlet-series dictionary is obtained by matching each arithmetic operation with its analytic operation in a half-plane where all series involved converge absolutely. First, for $\operatorname{Re}(s)>1$,
\begin{align*}
\sum_{n=1}^{\infty}\mathbf{1}(n)n^{-s}
&=\sum_{n=1}^{\infty}1\cdot n^{-s}\\
&=\sum_{n=1}^{\infty}n^{-s}\\
&=\zeta(s).
\end{align*}
The Möbius identity follows from the Euler product: since $\mu(p^0)=1$, $\mu(p)=-1$, and $\mu(p^a)=0$ for $a\ge 2$, the local factor is
\begin{align*}
\sum_{a=0}^{\infty}\mu(p^a)p^{-as}
&=1-p^{-s}+\sum_{a=2}^{\infty}0\cdot p^{-as}\\
&=1-p^{-s}.
\end{align*}
Thus, for $\operatorname{Re}(s)>1$,
\begin{align*}
\sum_{n=1}^{\infty}\mu(n)n^{-s}
&=\prod_p(1-p^{-s})\\
&=\left(\prod_p(1-p^{-s})^{-1}\right)^{-1}\\
&=\frac{1}{\zeta(s)}.
\end{align*}
The divisor function satisfies $\tau=\mathbf{1}*\mathbf{1}$, because
\begin{align*}
(\mathbf{1}*\mathbf{1})(n)
&=\sum_{d\mid n}\mathbf{1}(d)\mathbf{1}(n/d)\\
&=\sum_{d\mid n}1\cdot 1\\
&=\sum_{d\mid n}1\\
&=\tau(n).
\end{align*}
Therefore, by the *Product Theorem for Absolutely Convergent Dirichlet Series*, for $\operatorname{Re}(s)>1$,
\begin{align*}
\sum_{n=1}^{\infty}\tau(n)n^{-s}
&=\sum_{n=1}^{\infty}(\mathbf{1}*\mathbf{1})(n)n^{-s}\\
&=\left(\sum_{a=1}^{\infty}\mathbf{1}(a)a^{-s}\right)
\left(\sum_{b=1}^{\infty}\mathbf{1}(b)b^{-s}\right)\\
&=\zeta(s)\zeta(s)\\
&=\zeta(s)^2.
\end{align*}
For $\sigma_k(n)=\sum_{d\mid n}d^k$, we have $\sigma_k=\operatorname{id}_k*\mathbf{1}$, since
\begin{align*}
(\operatorname{id}_k*\mathbf{1})(n)
&=\sum_{d\mid n}\operatorname{id}_k(d)\mathbf{1}(n/d)\\
&=\sum_{d\mid n}d^k\cdot 1\\
&=\sigma_k(n).
\end{align*}
Also, for $\operatorname{Re}(s)>k+1$,
\begin{align*}
\sum_{n=1}^{\infty}\operatorname{id}_k(n)n^{-s}
&=\sum_{n=1}^{\infty}n^k n^{-s}\\
&=\sum_{n=1}^{\infty}n^{-(s-k)}\\
&=\zeta(s-k).
\end{align*}
Hence, again by the *Product Theorem for Absolutely Convergent Dirichlet Series*,
\begin{align*}
\sum_{n=1}^{\infty}\sigma_k(n)n^{-s}
&=\sum_{n=1}^{\infty}(\operatorname{id}_k*\mathbf{1})(n)n^{-s}\\
&=\left(\sum_{a=1}^{\infty}\operatorname{id}_k(a)a^{-s}\right)
\left(\sum_{b=1}^{\infty}\mathbf{1}(b)b^{-s}\right)\\
&=\zeta(s-k)\zeta(s).
\end{align*}
For Euler's totient function, the arithmetic identity is $\varphi=\operatorname{id}_1*\mu$. Thus, for $\operatorname{Re}(s)>2$,
\begin{align*}
\sum_{n=1}^{\infty}\varphi(n)n^{-s}
&=\sum_{n=1}^{\infty}(\operatorname{id}_1*\mu)(n)n^{-s}\\
&=\left(\sum_{a=1}^{\infty}\operatorname{id}_1(a)a^{-s}\right)
\left(\sum_{b=1}^{\infty}\mu(b)b^{-s}\right)\\
&=\zeta(s-1)\cdot \frac{1}{\zeta(s)}\\
&=\frac{\zeta(s-1)}{\zeta(s)}.
\end{align*}
Finally, the von Mangoldt function is the coefficient sequence of the logarithmic derivative of $\zeta$. By the *Logarithmic Derivative of Zeta*, for $\operatorname{Re}(s)>1$,
\begin{align*}
-\frac{\zeta'(s)}{\zeta(s)}
&=\sum_{n=1}^{\infty}\Lambda(n)n^{-s}.
\end{align*}
So the dictionary is
\begin{align*}
\sum_{n=1}^{\infty}\mathbf{1}(n)n^{-s}&=\zeta(s),\\
\sum_{n=1}^{\infty}\mu(n)n^{-s}&=\frac{1}{\zeta(s)},\\
\sum_{n=1}^{\infty}\tau(n)n^{-s}&=\zeta(s)^2,\\
\sum_{n=1}^{\infty}\sigma_k(n)n^{-s}&=\zeta(s)\zeta(s-k),\\
\sum_{n=1}^{\infty}\varphi(n)n^{-s}&=\frac{\zeta(s-1)}{\zeta(s)},\\
\sum_{n=1}^{\infty}\Lambda(n)n^{-s}&=-\frac{\zeta'(s)}{\zeta(s)}.
\end{align*}
These identities show, respectively, the analytic forms of the constant function, Dirichlet inverse, convolution square, divisor weighting, Möbius inversion for $\varphi$, and prime-power detection.
[/example]
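The arithmetic side of the dictionary can be tested directly, since each entry rests on a convolution identity between coefficient sequences. The sketch below is illustrative only: `dirichlet_convolution`, `mobius_table`, and the cutoff `LIMIT = 200` are hypothetical names and choices, and sequences are stored as lists indexed from $1$ with an unused entry at index $0$.

```python
from math import gcd

def dirichlet_convolution(f, g, limit):
    """Return h with h[n] = sum over d | n of f[d] * g[n // d], for 1 <= n <= limit."""
    h = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for q in range(1, limit // d + 1):
            h[d * q] += f[d] * g[q]
    return h

def mobius_table(limit):
    """Return mu[0..limit]: flip the sign once per prime factor, zero out multiples of squares."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0
    return mu

LIMIT = 200
one = [0] + [1] * LIMIT            # the constant function 1
ident = list(range(LIMIT + 1))     # id_1(n) = n
mu = mobius_table(LIMIT)

tau = dirichlet_convolution(one, one, LIMIT)       # tau = 1 * 1
sigma1 = dirichlet_convolution(ident, one, LIMIT)  # sigma_1 = id_1 * 1
phi = dirichlet_convolution(ident, mu, LIMIT)      # phi = id_1 * mu
delta = dirichlet_convolution(mu, one, LIMIT)      # mu * 1: 1 at n = 1, 0 elsewhere

def phi_direct(n):
    """Euler's totient straight from the definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

print(tau[12], sigma1[12], phi[12], delta[1], delta[12])
```

The last convolution, $\mu*\mathbf{1}$, is the coefficient-level form of $\zeta(s)\cdot 1/\zeta(s)=1$.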
Chapter 3 uses this dictionary in the first genuinely complex-analytic direction, with the Riemann zeta function as the model case. Once Dirichlet series are known in a right half-plane, analytic continuation and the location of singularities begin to control asymptotic sums of their coefficients.
The general theory of Dirichlet series is now ready for its central example. Near $\operatorname{Re}(s)>1$, the Riemann zeta function brings together Euler products, absolute convergence, and coefficient sums; later arguments will keep returning to this example when analytic behavior is translated into arithmetic information.
# 3. The Riemann Zeta Function Near $\operatorname{Re}(s)>1$
Chapter 2 treated Dirichlet series as generating functions and explained how arithmetic identities become analytic identities in a half-plane of absolute convergence. We now apply that dictionary to the most important example, the Riemann zeta function. The guiding questions are: how does unique factorisation appear analytically, what singularity does the harmonic series create at $s=1$, and what estimates are needed before contour integration can extract arithmetic information?
## Euler Products and the Harmonic Sum over Primes
The first problem is to understand how the primes are encoded in a Dirichlet series whose coefficients are all equal to $1$. The point is that $\frac{1}{n^s}$ factors according to the prime factorisation of $n$, and absolute convergence permits the resulting infinite product to be manipulated without ambiguity.
[definition: Riemann Zeta Function]
The Riemann zeta function is the map
\begin{align*}
\zeta:\{s\in \mathbb C: \operatorname{Re}(s)>1\}&\longrightarrow \mathbb C,\\
s&\longmapsto \sum_{n=1}^{\infty}\frac{1}{n^s}.
\end{align*}
[/definition]
The condition $\operatorname{Re}(s)>1$ is the natural first domain because the series is dominated by $\sum n^{-\sigma}$, where $\sigma=\operatorname{Re}(s)$.
To make zeta useful for primes, the additive-looking sum over integers must be converted into a multiplicative object indexed by primes. Absolute convergence in this half-plane is exactly what permits the formal factorisation suggested by unique factorisation to become a genuine analytic identity.
[quotetheorem:1694]
[citeproof:1694]
This product is the first bridge from analysis to prime numbers. The hypothesis $\operatorname{Re}(s)>1$ is essential here: at $s=1$ the harmonic series diverges, so neither the Dirichlet series nor the infinite product is absolutely convergent. The theorem does not yet say anything about primes quantitatively; it only converts unique factorisation into a non-vanishing analytic identity in the initial half-plane. Since no factor $1-p^{-s}$ vanishes in this region and the product converges absolutely, it also shows that $\zeta(s)\ne 0$ for $\operatorname{Re}(s)>1$, a fact later used when logarithmic derivatives are introduced.
[example: Logarithm of the Euler Product]
Let $s=\sigma+it$ with $\sigma>1$. Each prime $p$ satisfies $|p^{-s}|=p^{-\sigma}<1$, so in every factor of the Euler product for zeta the power-series identity
\begin{align*}
-\log(1-z)=\sum_{k=1}^{\infty}\frac{z^k}{k}
\qquad (|z|<1)
\end{align*}
may be applied with $z=p^{-s}$. For a finite set $P$ of primes this gives
\begin{align*}
\log \prod_{p\in P}(1-p^{-s})^{-1}
&=\sum_{p\in P}\log (1-p^{-s})^{-1}\\
&=\sum_{p\in P}\sum_{k=1}^{\infty}\frac{(p^{-s})^k}{k}\\
&=\sum_{p\in P}\sum_{k=1}^{\infty}\frac{1}{k p^{ks}}.
\end{align*}
The absolute values are controlled by
\begin{align*}
\sum_p\sum_{k=1}^{\infty}\left|\frac{1}{k p^{ks}}\right|
&\le \sum_p\sum_{k=1}^{\infty}p^{-k\sigma}\\
&=\sum_p\frac{p^{-\sigma}}{1-p^{-\sigma}}\\
&\le \frac{1}{1-2^{-\sigma}}\sum_p p^{-\sigma}\\
&\le \frac{1}{1-2^{-\sigma}}\sum_{n=2}^{\infty}n^{-\sigma}<\infty.
\end{align*}
Thus the finite-prime identities pass to the limit, and
\begin{align*}
\log \zeta(s)
=\sum_p\sum_{k=1}^{\infty}\frac{1}{k p^{ks}}
=\sum_p p^{-s}+\sum_p\sum_{k=2}^{\infty}\frac{1}{k p^{ks}}.
\end{align*}
For real $s>1$, the higher prime-power part is uniformly bounded as $s\to 1^+$, because
\begin{align*}
0\le \sum_p\sum_{k=2}^{\infty}\frac{1}{k p^{ks}}
&\le \sum_p\sum_{k=2}^{\infty}\frac{1}{p^k}\\
&=\sum_p\frac{p^{-2}}{1-p^{-1}}\\
&=\sum_p\frac{1}{p(p-1)}
\le \sum_{n=2}^{\infty}\frac{1}{n(n-1)}
=1.
\end{align*}
So the only part of $\log \zeta(s)$ that can diverge at the boundary $s=1$ is the first prime layer $\sum_p p^{-s}$.
[/example]
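The split into a first prime layer and a bounded remainder can be observed numerically for real $s>1$. The following sketch truncates the prime sums at $10^5$, an arbitrary cutoff chosen for illustration, so the first layer it prints is only a partial sum; the function names are hypothetical.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit (limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(range(p * p, limit + 1, p))
    return [n for n, flag in enumerate(sieve) if flag]

def prime_layers(s, primes):
    """Split the truncated sum for log zeta(s) into its k = 1 and k >= 2 parts (real s > 1)."""
    first = sum(p ** (-s) for p in primes)
    # -log(1 - z) - z equals the sum over k >= 2 of z**k / k.
    higher = sum(-math.log1p(-p ** (-s)) - p ** (-s) for p in primes)
    return first, higher

PRIMES = primes_up_to(100000)
for s in (2.0, 1.5, 1.1, 1.01):
    first, higher = prime_layers(s, PRIMES)
    print(f"s={s}: first layer {first:.4f}, higher powers {higher:.4f}")
```

As $s$ decreases towards $1$, the higher-power column stays below the bound $1$ from the example, while the first-layer column keeps growing as more primes are included.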
The logarithmic form leaves one boundary question: as $s$ approaches $1$ from the right, which part of the Euler product is responsible for the divergence of $\zeta(s)$? The boundedness of the higher prime-power terms shows that the first prime layer must carry the divergence.
[quotetheorem:1745]
[citeproof:1745/analytic-number-theory-i-multiplicative-functions-and-l-functions]
This theorem strengthens Euclid's theorem: not only are there infinitely many primes, but their reciprocal weights are still too large to have a finite sum. The real limiting process $s\to 1^+$ is essential; for each fixed $s>1$ the series $\sum_p p^{-s}$ is bounded by $\sum_{n=1}^{\infty}n^{-s}$, so the divergence is invisible away from the boundary. The theorem does not estimate how many primes lie below $x$; it only proves that primes are numerous enough for the reciprocal sum to diverge. Its forward role is to make the pole of $\zeta$ feel arithmetical before the later use of $-\zeta'(s)/\zeta(s)$.
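The divergence is extremely slow, which a short computation makes visible. The sketch below tabulates partial sums of $1/p$ and prints $\log\log x$ alongside them; the comparison with $\log\log x$ is Mertens' classical estimate, which is quoted here as outside background and is not proved in these notes. The cutoff $10^6$ and the function names are illustrative assumptions.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit (limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(range(p * p, limit + 1, p))
    return [n for n, flag in enumerate(sieve) if flag]

def reciprocal_prime_sum(x, primes):
    """Return the sum of 1/p over primes p <= x taken from the given list."""
    return sum(1.0 / p for p in primes if p <= x)

PRIMES = primes_up_to(10 ** 6)
for exponent in range(2, 7):
    x = 10 ** exponent
    partial = reciprocal_prime_sum(x, PRIMES)
    print(f"x=10^{exponent}: sum 1/p = {partial:.4f}, log log x = {math.log(math.log(x)):.4f}")
```

The partial sums increase with $x$ but remain small even at $x=10^6$, which is consistent with a divergent series that is invisible for each fixed $s>1$.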
The reciprocal-prime divergence is stronger than Euclid's theorem, but it is worth isolating the logical consequence that a finite set of primes is impossible. If there were only finitely many primes, the Euler product and the reciprocal-prime sum would both be finite objects, contradicting the divergence just proved. The theorem below records this qualitative arithmetic conclusion as the simplest application of the analytic argument.
[quotetheorem:4352]
[citeproof:4352]
The contradiction uses the full reciprocal-prime theorem; Euclid's original proof obtains the same conclusion without zeta functions, so this result is not logically necessary for infinitude alone. Its value in these notes is methodological: it shows how an analytic singularity can imply an arithmetic impossibility. Later arguments follow the same pattern but aim for asymptotic formulae rather than a qualitative contradiction.
## The Pole at One and Continuation to the Right Half-Plane
The next problem is to decide how much of $\zeta(s)$ survives beyond its initial half-plane of convergence. The series itself stops being absolutely convergent at $\operatorname{Re}(s)=1$, but the function has a continuation past that boundary, with exactly one singularity in $\operatorname{Re}(s)>0$.
[quotetheorem:4353]
[citeproof:4353]
The approach to $s=1$ from the half-plane of convergence is essential: the original series is not defined at $s=1$, and the comparison subtracts exactly the divergent integral that has the same leading size as the harmonic series. The theorem does not yet give a global continuation of $\zeta$, only the local principal part at the boundary point. The pole records the order of growth of the counting function for the positive integers; later, the prime number theorem will come from performing a similar analysis for logarithmically weighted primes.
For later contour arguments, local information at $s=1$ is not enough. The notes will use the global continuation and functional equation as structural input: they say where $\zeta$ is allowed to have singularities, how values on opposite sides of the critical strip are related, and why the completed normalization is the natural object.
The obstruction is that Perron-type arguments need to move across vertical lines where the original series no longer converges. A meromorphic continuation and a functional equation identify the continued object, so later zero and residue statements are about a well-defined global function rather than a formal extrapolation of the series.
[quotetheorem:3374]
[citeproof:3374]
The exceptional point $s=1$ is forced by the previous pole computation, while the functional equation relates values on opposite sides of the critical strip and explains why the gamma factor is part of the natural normalization. This is the analytic setting in which contour shifts can cross the line $\operatorname{Re}(s)=1$ and pick up exactly one residue.
For the prime number theorem, the most important consequence is not the full symmetry of the functional equation, but the permission to treat $\zeta$ as a meromorphic function beyond its defining series. The original sum $\sum n^{-s}$ is still only an initial representation; after continuation, poles and zeros of the continued function become the objects that control arithmetic asymptotics.
## Estimates in Vertical Strips
Contour arguments require more than continuation: they need bounds on the function along vertical and horizontal sides of rectangles. For this chapter, the estimates only need to be strong enough to justify moving contours in elementary Perron-type arguments.
[quotetheorem:4354]
[citeproof:4354]
The restriction $0<\delta\le 1$ prevents the displayed exponent from claiming decay in strips lying wholly to the right of $1$, where $\zeta(\sigma+it)$ is bounded but not forced by this estimate to tend to $0$. The theorem does not give sharp convexity or subconvexity bounds; it supplies only polynomial control sufficient for elementary contour shifts. Its role is to ensure that integrals over long horizontal edges can be controlled after multiplying by kernels such as $x^s/s$ or $x^s/s^2$.
[remark: Absolute Convergence Bounds]
In the smaller region $\operatorname{Re}(s)\ge 1+\varepsilon$, the Dirichlet series gives the stronger bounds
\begin{align*}
|\zeta(s)|\le \zeta(1+\varepsilon),
\qquad
|\zeta'(s)|\le \sum_{n=2}^{\infty}\frac{\log n}{n^{1+\varepsilon}}.
\end{align*}
These bounds are uniform in the imaginary direction and are often used before a contour is moved leftward.
[/remark]
## Two First Residue Calculations
The purpose of locating the pole at $s=1$ is that residues turn analytic singularities into main terms for sums of arithmetic functions. The two computations below are deliberately elementary versions of the contour shifts that later involve $\Lambda(n)$ and Dirichlet characters.
[example: Counting Positive Integers by a Residue]
For $x>1$ and $c>1$, Perron's formula gives, with the usual half-weight convention when $x$ is an integer,
\begin{align*}
A(x)=\sum_{n\le x}1
=\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\zeta(s)\frac{x^s}{s}\,ds .
\end{align*}
Shift the vertical line to a line $\operatorname{Re}(s)=\delta$ with $0<\delta<1$, avoiding $s=0$. The only singularity crossed in the right half-plane is the simple pole of $\zeta$ at $s=1$, so Cauchy's residue theorem contributes
\begin{align*}
\operatorname{Res}_{s=1}\left(\zeta(s)\frac{x^s}{s}\right).
\end{align*}
To compute this residue, write $u=s-1$. By the pole expansion for $\zeta$ at $1$,
\begin{align*}
\zeta(1+u)=\frac{1}{u}+H(1+u),
\end{align*}
where $H$ is holomorphic near $1$. Also
\begin{align*}
\frac{x^{1+u}}{1+u}
&=x\frac{e^{u\log x}}{1+u}\\
&=x\left(1+u\log x+O(u^2)\right)\left(1-u+O(u^2)\right)\\
&=x\left(1+u(\log x-1)+O(u^2)\right).
\end{align*}
Therefore
\begin{align*}
\zeta(1+u)\frac{x^{1+u}}{1+u}
&=\left(\frac{1}{u}+H(1+u)\right)
x\left(1+u(\log x-1)+O(u^2)\right)\\
&=\frac{x}{u}+x(\log x-1)+xH(1+u)\left(1+u(\log x-1)+O(u^2)\right)+O(u).
\end{align*}
The coefficient of $u^{-1}=(s-1)^{-1}$ is $x$, hence
\begin{align*}
\operatorname{Res}_{s=1}\frac{\zeta(s)x^s}{s}=x.
\end{align*}
The integrals on the shifted contour are controlled by the vertical-strip bounds for $\zeta$, while the factor $x^s$ has size $x^{\operatorname{Re}(s)}$ on each side of the rectangle. Letting the height tend to infinity and then taking $\delta$ close enough to $1$ gives an error $o(x)$. Thus
\begin{align*}
\sum_{n\le x}1=A(x)=x+o(x),
\end{align*}
so the residue at the pole of $\zeta$ recovers the main term for counting positive integers.
[/example]
This calculation is intentionally simple: the counting function is already known exactly as $\lfloor x\rfloor$. Its value is methodological, because the same contour shape will later be applied to functions whose summatory behaviour is not visible from their definition. A double pole changes the residue calculation by introducing a logarithmic main term, which is the next phenomenon to isolate.
[example: The Divisor Function and the Square of Zeta]
Let $\tau(n)=\sum_{d\mid n}1$. Equivalently, $\tau(n)$ counts ordered factorisations $n=ab$ with $a,b\ge 1$, because choosing $d\mid n$ determines the pair $(d,n/d)$ and every such pair arises uniquely in this way. For $\operatorname{Re}(s)>1$, absolute convergence gives
\begin{align*}
\zeta(s)^2
&=\left(\sum_{a=1}^{\infty}\frac{1}{a^s}\right)\left(\sum_{b=1}^{\infty}\frac{1}{b^s}\right)\\
&=\sum_{a=1}^{\infty}\sum_{b=1}^{\infty}\frac{1}{(ab)^s}\\
&=\sum_{n=1}^{\infty}\left(\sum_{\substack{a,b\ge 1\\ab=n}}1\right)\frac{1}{n^s}\\
&=\sum_{n=1}^{\infty}\frac{\tau(n)}{n^s}.
\end{align*}
Thus the Dirichlet series for $\tau$ has the same singular behaviour at $s=1$ as $\zeta(s)^2$.
To see the residue that would appear in Perron's formula, expand $\zeta(s)^2x^s/s$ at $s=1$. Put $u=s-1$. The pole expansion of $\zeta$ at $1$ gives
\begin{align*}
\zeta(1+u)=\frac{1}{u}+H(1+u),
\end{align*}
where $H$ is holomorphic near $1$. Hence
\begin{align*}
H(1+u)&=H(1)+O(u),\\
\zeta(1+u)^2
&=\left(\frac{1}{u}+H(1+u)\right)^2\\
&=\frac{1}{u^2}+\frac{2H(1+u)}{u}+H(1+u)^2\\
&=\frac{1}{u^2}+\frac{2H(1)}{u}+O(1).
\end{align*}
Also,
\begin{align*}
\frac{x^{1+u}}{1+u}
&=x\frac{e^{u\log x}}{1+u}\\
&=x\left(1+u\log x+O(u^2)\right)\left(1-u+O(u^2)\right)\\
&=x\left(1+u(\log x-1)+O(u^2)\right).
\end{align*}
Multiplying the two Laurent expansions,
\begin{align*}
\zeta(1+u)^2\frac{x^{1+u}}{1+u}
&=\left(\frac{1}{u^2}+\frac{2H(1)}{u}+O(1)\right)
x\left(1+u(\log x-1)+O(u^2)\right)\\
&=\frac{x}{u^2}+\frac{x(\log x-1)}{u}+\frac{2xH(1)}{u}+O(1)\\
&=\frac{x}{u^2}+\frac{x\log x+(2H(1)-1)x}{u}+O(1).
\end{align*}
Therefore
\begin{align*}
\operatorname{Res}_{s=1}\left(\zeta(s)^2\frac{x^s}{s}\right)
=x\log x+(2H(1)-1)x.
\end{align*}
The double pole contributes the extra factor $\log x$, while the coefficient of $x$ is controlled by the constant term $H(1)$ in the Laurent expansion of $\zeta$ at $1$.
[/example]
The divisor example shows why the order of a pole matters. A simple pole gives a term proportional to $x$, while a double pole gives a logarithmic enhancement; later, zeros and poles of $L$-functions will govern the distribution of primes in the same way.
Having built analytic functions from arithmetic data, we now reverse the process and recover arithmetic information from those functions. Perron's formula is the bridge: it turns a Dirichlet series back into a summatory function by means of contour integration. This is the mechanism that lets poles, residues, and continuation translate into asymptotic formulas for counting problems.
# 4. Perron's Formula and Summatory Functions
This chapter turns Dirichlet series back into information about their coefficients. In the previous chapter the main objects were analytic functions such as $\zeta(s)$ and Euler products, initially built from arithmetic data. Perron's formula reverses that construction: it recovers summatory functions from vertical integrals of Dirichlet series, and contour shifting then turns poles into main terms. The prerequisites are the absolute convergence and coefficient dictionary from Chapter 2, the Euler product and pole of $\zeta(s)$ from Chapter 3, the residue theorem, and basic estimates for holomorphic functions on vertical contours.
The guiding question is: if
\begin{align*}
F(s)=\sum_{n=1}^{\infty}\frac{a_n}{n^s},
\end{align*}
what can the analytic behaviour of $F$ tell us about $A(x)=\sum_{n\le x}a_n$? The answer is the template behind many arguments in analytic number theory: write $A(x)$ as an inverse Mellin integral, move the contour, collect residues, and estimate the remaining integrals.
## Perron's Formula as an Inverse Mellin Transform
How can a vertical complex integral detect the cutoff condition $n\le x$? The mechanism is the Mellin transform of a step function. Instead of trying to approximate the indicator $\mathbf{1}_{[1,\infty)}(y)$ by elementary functions, Perron's formula represents it by the singularity of $1/s$ at the origin.
[quotetheorem:4355]
[citeproof:4355]
The condition $c>0$ places the original line to the right of the pole at $s=0$, so shifting left for $y>1$ crosses exactly the singularity that creates the value $1$. The assumption $y>0$ is needed because $y^s=e^{s\log y}$ is being used with a real logarithm; a complex or negative value would require choosing a branch and would no longer represent an order cutoff. The discontinuity at $y=1$ is not a defect of the proof but a feature of the sharp indicator: symmetric inversion returns the midpoint value $1/2$. The theorem is not an absolutely convergent integral identity, so it cannot be inserted into arbitrary sums without separate convergence or truncation arguments. This kernel is the analytic replacement for the sharp cutoff, and Perron's formula is obtained by applying it to the ratios $x/n$.
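The three values of the kernel can be observed by numerical integration over a long but finite vertical segment. This is a minimal sketch, assuming the illustrative parameters $c=1$, $T=200$, and a trapezoid rule with step $0.01$; the function name `perron_kernel` is hypothetical, and the finite height means the values are only approximate.

```python
import cmath
import math

def perron_kernel(y, c=1.0, T=200.0, steps=40000):
    """Trapezoid approximation of (1/(2*pi*i)) * integral of y**s / s over s = c + i*t, |t| <= T."""
    log_y = math.log(y)
    h = 2.0 * T / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        t = -T + k * h
        s = complex(c, t)
        value = cmath.exp(s * log_y) / s
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * value
    # ds = i dt, so the factor 1/(2*pi*i) becomes 1/(2*pi) on the t-integral.
    return (total * h / (2.0 * math.pi)).real

for y in (2.0, 1.0, 0.5):
    print(y, perron_kernel(y))
```

With these parameters the three printed values should be close to $1$, $1/2$, and $0$ respectively. The deviation shrinks as $T$ grows, but only slowly, which reflects the fact that the integral is not absolutely convergent.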
[quotetheorem:4356]
[citeproof:4356]
[example: Counting Integers With Perron's Formula]
Take $a_n=1$. Then, for $\operatorname{Re}(s)>1$,
\begin{align*}
F(s)=\sum_{n=1}^{\infty}\frac{1}{n^s}=\zeta(s).
\end{align*}
For $c>1$ and non-integral $x>0$, Perron's formula gives
\begin{align*}
\lfloor x\rfloor
=
\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\zeta(s)\frac{x^s}{s}\,ds.
\end{align*}
To see the main term, shift the line of integration from $\operatorname{Re}(s)=c$ to a line $\operatorname{Re}(s)=\alpha$ with $0<\alpha<1$. The integrand
\begin{align*}
\zeta(s)\frac{x^s}{s}
\end{align*}
has a pole at $s=1$ inside the strip, because $\zeta(s)$ has Laurent expansion
\begin{align*}
\zeta(s)=\frac{1}{s-1}+O(1)
\end{align*}
there, while $x^s/s$ is holomorphic at $s=1$. Hence
\begin{align*}
\operatorname{Res}_{s=1}\left(\zeta(s)\frac{x^s}{s}\right)
&=
\lim_{s\to 1}(s-1)\zeta(s)\frac{x^s}{s}\\
&=
\left(\lim_{s\to 1}(s-1)\zeta(s)\right)\frac{x^1}{1}\\
&=x.
\end{align*}
Thus the contour shift separates
\begin{align*}
\lfloor x\rfloor=x+\text{shifted-contour contribution}.
\end{align*}
The residue $x$ is the smooth main term, and the shifted-contour contribution is exactly what remains to produce the jumps in $\lfloor x\rfloor-x$.
[/example]
The hypotheses are doing real work. Absolute convergence on the line $\operatorname{Re}(s)=c$ permits the Dirichlet series to be paired with the kernel term by term; if $c\le \sigma_a$, even the expression obtained by expanding $F(s)$ may fail to converge. The integer case records the unavoidable boundary convention for a sharp cutoff: the vertical integral cannot distinguish whether the jump at $n=x$ should be counted from the left or from the right, so symmetric truncation gives half the jump. The formula should be read as a bridge, not as a final estimate, because it gives a representation rather than a useful bound. To obtain usable asymptotics, the infinite line must be truncated and then shifted into a region where the Dirichlet series has analytic continuation and controlled growth.
## Effective Truncation and Contour Shifts
What changes when we replace the infinite Perron integral by an integral over $|\operatorname{Im}(s)|\le T$? The truncation introduces an error near the discontinuity $n=x$, because a finite vertical integral only approximates the step function. The practical form of Perron's formula balances this truncation error against the estimates available for $F(s)$ on shifted contours.
[quotetheorem:4357]
[citeproof:4357]
This estimate separates two difficulties. Terms with $n$ far from $x$ are well approximated because $|\log(x/n)|$ is not small. Terms close to $x$ are harder, and their contribution is usually bounded using growth information about the coefficients. The condition $T\ge 1$ ensures that the truncated kernel has enough oscillation to approximate a step function; for very small $T$, the integral has no reason to detect the cutoff. The condition $c>\max(0,\sigma_a)$ keeps the coefficient series absolutely summable after weighting by $(x/n)^c$ and avoids the pole of $1/s$ lying on the integration line. The theorem does not estimate the displayed error by itself: for example, if many large coefficients cluster near $x$, the near-boundary part can dominate unless a separate bound for sums of $|a_n|$ is available. This is why later applications combine truncated Perron with coefficient-growth estimates or smoothing.
[remark: Smoothing Versus Sharp Cutoffs]
The sharp Perron formula has a visible boundary error because the summatory function jumps. Many later arguments replace the sharp cutoff by a smooth weight, whose Mellin transform decays faster on vertical lines. Smoothing reduces truncation errors but answers a slightly different summatory question.
[/remark]
To turn Perron's formula into an asymptotic estimate, the integral must be compared with integrals on lines farther left. The obstruction is that the new contour may cross poles and may introduce horizontal boundary terms that are not automatically small.
The next tool is therefore not another Perron estimate, but the bookkeeping rule that makes contour movement usable. It identifies the residues that become main terms and separates them from the horizontal and new vertical integrals that still require bounds.
[quotetheorem:4358]
[citeproof:4358]
The residue principle explains why poles create main terms. Meromorphic continuation throughout the rectangle is necessary because the contour shift crosses points where $F$ was not initially defined by its Dirichlet series; without continuation, Cauchy's theorem cannot be applied. The condition $c>\sigma_a$ keeps the starting integral tied to the original coefficients, while the new line $\operatorname{Re}(s)=\alpha$ is useful only if $F$ is controlled there. Horizontal sides are not harmless by definition: if $F(s)$ grows too quickly on the segments $|\operatorname{Im}(s)|=T$ as $T\to\infty$, then $E_{\mathrm{hor}}(x,T)$ may be larger than the residues. A pole at $s=\rho$ contributes a term of size about $x^{\operatorname{Re}(\rho)}$, with possible powers of $\log x$ if the pole has order greater than $1$.
[example: A Double Pole Produces a Logarithm]
Suppose $F(s)$ has a double pole at $s=1$ with Laurent expansion
\begin{align*}
F(s)=\frac{A}{(s-1)^2}+\frac{B}{s-1}+O(1).
\end{align*}
Put $u=s-1$, so that $s=1+u$ and $u\to 0$. For fixed $x>0$,
\begin{align*}
x^s
&=x^{1+u}\\
&=x e^{u\log x}\\
&=x\left(1+u\log x+O(u^2)\right),
\end{align*}
using the Taylor expansion of $e^z$ at $z=0$. Also,
\begin{align*}
\frac{1}{s}
&=\frac{1}{1+u}\\
&=1-u+O(u^2),
\end{align*}
using the Taylor expansion of $(1+u)^{-1}$ at $u=0$. Therefore
\begin{align*}
\frac{x^s}{s}
&=x\left(1+u\log x+O(u^2)\right)\left(1-u+O(u^2)\right)\\
&=x\left(1+u\log x-u+O(u^2)\right)\\
&=x\left(1+u(\log x-1)+O(u^2)\right).
\end{align*}
Multiplying the Laurent expansion of $F$ by this Taylor expansion gives
\begin{align*}
F(s)\frac{x^s}{s}
&=\left(\frac{A}{u^2}+\frac{B}{u}+O(1)\right)x\left(1+u(\log x-1)+O(u^2)\right)\\
&=x\left(\frac{A}{u^2}+\frac{A(\log x-1)}{u}+\frac{B}{u}+O(1)\right).
\end{align*}
The residue is the coefficient of $u^{-1}=(s-1)^{-1}$, so
\begin{align*}
\operatorname{Res}_{s=1}\left(F(s)\frac{x^s}{s}\right)
&=xA(\log x-1)+xB\\
&=Ax\log x+(B-A)x.
\end{align*}
Thus the second-order pole supplies the $x\log x$ term, while the next Laurent coefficient and the factor $1/s$ together determine the accompanying constant multiple of $x$.
[/example]
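The residue formula can be confirmed by integrating numerically around a small circle centred at $s=1$. The sketch below is illustrative: the values $A=2$, $B=3$, $x=10$ and the holomorphic remainder $5-7u$ are arbitrary test data, and the radius $0.5$ is chosen only so that the circle excludes the pole of $1/s$ at $s=0$.

```python
import cmath
import math

def residue_at_one(f, radius=0.5, samples=512):
    """Approximate Res_{s=1} f(s) by the trapezoid rule on the circle |s - 1| = radius."""
    total = 0.0 + 0.0j
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        offset = radius * cmath.exp(1j * theta)
        # ds = i * offset * dtheta, and (1/(2*pi*i)) * i * (2*pi/samples) = 1/samples.
        total += f(1.0 + offset) * offset
    return total / samples

A, B, x = 2.0, 3.0, 10.0  # hypothetical Laurent data and cutoff

def integrand(s):
    u = s - 1.0
    # Prescribed principal part plus an arbitrary holomorphic remainder.
    F = A / u ** 2 + B / u + (5.0 - 7.0 * u)
    return F * cmath.exp(s * math.log(x)) / s

numeric = residue_at_one(integrand)
predicted = A * x * math.log(x) + (B - A) * x
print(numeric, predicted)
```

Changing the holomorphic remainder leaves the numerical residue unchanged, in line with the computation: only $A$, $B$, and the expansion of $x^s/s$ enter the coefficient of $u^{-1}$.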
## Divisor-Type Summatory Functions
How does the pole structure of $\zeta(s)^k$ translate into average orders of divisor functions? The $n$-th Dirichlet coefficient of $\zeta(s)^k$ is the $k$-fold divisor function $d_k(n)$, defined by counting ordered factorizations of $n$ into $k$ positive factors. Since $\zeta(s)$ has a simple pole at $s=1$, the function $\zeta(s)^k$ has a pole of order $k$ there, and this produces a polynomial in $\log x$ of degree $k-1$.
[definition: K-Fold Divisor Function]
For $k\in\mathbb N$, the $k$-fold divisor function $d_k:\mathbb N\to\mathbb N$ is defined by
\begin{align*}
d_k(n)=\#\{(n_1,\dots,n_k)\in\mathbb N^k:n_1\cdots n_k=n\}.
\end{align*}
[/definition]
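The definition can be compared with the convolution description $d_k=\mathbf{1}*\cdots*\mathbf{1}$ ($k$ factors) on small inputs. The following is a small sketch; the function names are hypothetical, and the brute-force count is feasible only for small $n$ and $k$.

```python
from itertools import product
from math import prod

def k_fold_divisor_table(k, limit):
    """Return d with d[n] = d_k(n) for 1 <= n <= limit, via repeated convolution with 1."""
    # d_1(n) = 1: the only factorisation with a single factor is n itself.
    d = [0] + [1] * limit
    for _ in range(k - 1):
        nxt = [0] * (limit + 1)
        for a in range(1, limit + 1):
            for b in range(1, limit // a + 1):
                nxt[a * b] += d[a]
        d = nxt
    return d

def k_fold_divisor_brute(k, n):
    """Count ordered k-tuples of positive integers with product n, straight from the definition."""
    divisors = [m for m in range(1, n + 1) if n % m == 0]
    return sum(1 for factors in product(divisors, repeat=k) if prod(factors) == n)

d2 = k_fold_divisor_table(2, 30)
d3 = k_fold_divisor_table(3, 30)
print(d2[12], d3[12])
```

For $n=12$ the first value is the usual divisor count $\tau(12)=6$, and the second counts ordered triples with product $12$.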
For $k=2$, this is the usual divisor function $\tau(n)$. Its Dirichlet series is especially important because
\begin{align*}
\sum_{n=1}^{\infty}\frac{\tau(n)}{n^s}=\zeta(s)^2
\end{align*}
for $\operatorname{Re}(s)>1$. This identity turns the average order of $\tau(n)$ into a question about the singular behavior of $\zeta(s)^2$ at $s=1$. Because the pole is double rather than simple, the main term should acquire a logarithmic factor, but the exact constant and the next term require a careful residue calculation. The theorem makes this prediction precise and separates the main asymptotic from the harder error term.
[quotetheorem:4359]
[citeproof:4359]
The pole order is the central point of the theorem. A simple pole would only produce a term proportional to $x$, but the double pole of $\zeta(s)^2$ forces the extra factor $\log x$. The theorem gives the first two terms of the average order, not a sharp description of the error term; the Dirichlet divisor problem asks how small that error can be made. The proof therefore separates residue computation, which determines the displayed main terms, from vertical-line estimates, which determine the strength of the remainder.
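The two main terms $x\log x+(2\gamma-1)x$ can be compared with the exact divisor sum on a computer. The sketch below is a numerical illustration rather than part of the proof: it uses the lattice-point identity $\sum_{n\le x}\tau(n)=\sum_{a\le x}\lfloor x/a\rfloor$, and the Euler-Mascheroni constant is hardcoded as `EULER_GAMMA`. The cutoff `x = 100_000` is an arbitrary choice.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hardcoded

def divisor_summatory(x):
    """Sum of tau(n) for n <= x, counted as lattice points (a, b) with ab <= x."""
    return sum(x // a for a in range(1, x + 1))

def main_term(x):
    """The two leading terms x log x + (2*gamma - 1) x."""
    return x * math.log(x) + (2 * EULER_GAMMA - 1) * x

x = 100_000
exact = divisor_summatory(x)
remainder = exact - main_term(x)
print(exact, round(main_term(x), 1), round(remainder, 1))
```

The printed remainder should be far smaller than $x$, consistent with Dirichlet's classical $O(\sqrt{x})$ bound for the error.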
[example: Why the Constant Is $2\gamma-1$]
Near $s=1$, write $u=s-1$. The Laurent expansion of $\zeta$ gives
\begin{align*}
\zeta(s)^2
&=\frac{1}{u^2}+\frac{2\gamma}{u}+O(1).
\end{align*}
For fixed $x>0$,
\begin{align*}
x^s
&=x^{1+u}\\
&=xe^{u\log x}\\
&=x\left(1+u\log x+O(u^2)\right),
\end{align*}
using the Taylor expansion of $e^z$ at $z=0$. Also,
\begin{align*}
\frac{1}{s}
&=\frac{1}{1+u}\\
&=1-u+O(u^2),
\end{align*}
using the Taylor expansion of $(1+u)^{-1}$ at $u=0$. Hence
\begin{align*}
\frac{x^s}{s}
&=x\left(1+u\log x+O(u^2)\right)\left(1-u+O(u^2)\right)\\
&=x\left(1+u\log x-u+O(u^2)\right)\\
&=x\left(1+u(\log x-1)+O(u^2)\right).
\end{align*}
Multiplying the two expansions,
\begin{align*}
\zeta(s)^2\frac{x^s}{s}
&=\left(\frac{1}{u^2}+\frac{2\gamma}{u}+O(1)\right)
x\left(1+u(\log x-1)+O(u^2)\right)\\
&=x\left(\frac{1}{u^2}+\frac{\log x-1}{u}+\frac{2\gamma}{u}+O(1)\right)\\
&=x\left(\frac{1}{u^2}+\frac{\log x+2\gamma-1}{u}+O(1)\right).
\end{align*}
The residue at $s=1$ is the coefficient of $u^{-1}=(s-1)^{-1}$, so
\begin{align*}
\operatorname{Res}_{s=1}\left(\zeta(s)^2\frac{x^s}{s}\right)
&=x(\log x+2\gamma-1)\\
&=x\log x+(2\gamma-1)x.
\end{align*}
Thus the constant $2\gamma-1$ comes from the $2\gamma/(s-1)$ term in $\zeta(s)^2$ together with the $-u$ term in the expansion of $1/s$.
[/example]
The divisor example suggests a general rule that pole order controls the logarithmic degree in a summatory asymptotic. To use that rule beyond $d_2$, one needs a uniform statement for the fixed higher divisor functions $d_k$, whose Dirichlet series is $\zeta(s)^k$ and whose pole at $s=1$ has order $k$.
[quotetheorem:4360]
[citeproof:4360]
The word fixed is essential: the constants in the contour estimates and the coefficients of $P_{k-1}$ depend on $k$, so this statement is not uniform for a growing number of factors. The theorem records the average size of $d_k(n)$ through its summatory function, rather than giving pointwise control of $d_k(n)$. Its proof also shows the recurring pattern of the course: algebraic operations on Dirichlet series change pole order, and pole order dictates powers of $\log x$ in the main term. Later analytic estimates refine the $o$-term, but the polynomial main term is already determined by the Laurent expansion at $s=1$.
## The Von Mangoldt Function and Chebyshev's Psi Function
How do primes enter the same Perron framework? The logarithmic derivative of the zeta function turns the Euler product into a Dirichlet series whose coefficients are the von Mangoldt function. Its summatory function is Chebyshev's function $\psi(x)$, the weighted count of prime powers up to $x$.
[definition: Chebyshev Psi Function]
The Chebyshev psi function is the map $\psi:\mathbb R_{+}\to\mathbb R$ defined by
\begin{align*}
\psi:x\mapsto \sum_{n\le x}\Lambda(n).
\end{align*}
[/definition]
The identity
\begin{align*}
-\frac{\zeta'}{\zeta}(s)=\sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^s}
\end{align*}
holds for $\operatorname{Re}(s)>1$.
Now Perron's formula can be applied to the specific Dirichlet series that encodes prime powers. The purpose of the resulting formula is not merely to restate $\psi(x)$ as an integral, but to put the weighted prime-counting problem into a form where poles and zeros of $\zeta(s)$ become residue terms after contour shifting.
[quotetheorem:4361]
[citeproof:4361]
The restriction $c>1$ is necessary because the Dirichlet series for $-\zeta'/\zeta(s)$ is obtained from the Euler product only in the half-plane of absolute convergence. The same integer-boundary issue from Perron's formula remains: if $x\in\mathbb N$, the symmetric integral gives a half contribution from $\Lambda(x)$ rather than the right-continuous value of $\psi(x)$. This representation alone is not the prime number theorem; without analytic continuation and zero-free information for $\zeta(s)$, it is only an exact integral form of the weighted prime-counting problem. Its value is that every pole or zero of $\zeta(s)$ becomes a residue term after contour shifting, so estimates for primes are converted into estimates for zeros and vertical integrals.
[example: Residues in the Explicit Formula for Psi]
Start from
\begin{align*}
\frac{1}{2\pi i}\int_{c-iT}^{c+iT}-\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}\,ds,
\end{align*}
where $c>1$ and $x>0$. If the contour is shifted left through $s=1$, the local expansion
\begin{align*}
\zeta(s)=\frac{1}{s-1}+h(s)
\end{align*}
with $h$ holomorphic near $1$ gives, writing $u=s-1$,
\begin{align*}
\zeta(s)&=\frac{1+u h(s)}{u},\\
\zeta'(s)&=-\frac{1}{u^2}+h'(s).
\end{align*}
Therefore
\begin{align*}
-\frac{\zeta'}{\zeta}(s)
&=\frac{u^{-2}-h'(s)}{u^{-1}+h(s)}\\
&=\frac{u^{-1}-u h'(s)}{1+u h(s)}\\
&=\frac{1}{u}+O(1),
\end{align*}
so $-\zeta'/\zeta$ has a simple pole at $s=1$ with residue $1$. Since $x^s/s$ is holomorphic at $s=1$ and has value $x^1/1=x$, the residue of the full integrand at $s=1$ is
\begin{align*}
\operatorname{Res}_{s=1}\left(-\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}\right)
&=\operatorname{Res}_{s=1}\left(-\frac{\zeta'}{\zeta}(s)\right)\cdot\frac{x^1}{1}\\
&=x.
\end{align*}
Now let $\rho\ne 0$ be a zero of $\zeta$ of multiplicity $m$. Then near $\rho$ we can write
\begin{align*}
\zeta(s)=(s-\rho)^m g(s),
\end{align*}
where $g$ is holomorphic and $g(\rho)\ne 0$. Differentiating and dividing by $\zeta(s)$ gives
\begin{align*}
\frac{\zeta'}{\zeta}(s)
&=\frac{m(s-\rho)^{m-1}g(s)+(s-\rho)^m g'(s)}{(s-\rho)^m g(s)}\\
&=\frac{m}{s-\rho}+\frac{g'(s)}{g(s)}.
\end{align*}
Thus
\begin{align*}
-\frac{\zeta'}{\zeta}(s)
&=-\frac{m}{s-\rho}-\frac{g'(s)}{g(s)},
\end{align*}
so the residue at $\rho$ is $-m$. Since $x^s/s$ is holomorphic at $\rho\ne 0$, the corresponding residue of the full integrand is
\begin{align*}
\operatorname{Res}_{s=\rho}\left(-\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}\right)
&=-m\frac{x^\rho}{\rho}.
\end{align*}
For a simple zero this becomes $-x^\rho/\rho$. Thus the pole of $\zeta$ at $1$ contributes the main term $x$, while zeros of $\zeta$ contribute oscillating terms of the form $-m x^\rho/\rho$ once the contour shift and error estimates are justified.
[/example]
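This example shows why the prime number theorem is a statement about analytic continuation and zero-free regions. A zero $\rho$ with $\operatorname{Re}(\rho)=1$ would contribute a term $-m x^\rho/\rho$ of absolute value $m x/|\rho|$, which is of the same order as the main term. If no zero of $\zeta(s)$ lies on the line $\operatorname{Re}(s)=1$, then each individual zero contributes a term of strictly smaller order than $x$, and the remaining work is to control their combined contribution.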
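To get a feeling for the sizes involved, one can evaluate the contribution of a single conjugate pair of zeros. The sketch below assumes the standard numerical value $14.1347\ldots$ for the imaginary part of the first nontrivial zero and assumes that this zero is simple; neither fact is derived here. It prints the pair term $-x^\rho/\rho-x^{\bar\rho}/\bar\rho$ next to the main term $x$ and the envelope $2\sqrt{x}/|\rho|$.

```python
import cmath
import math

# Imaginary part of the first nontrivial zero of zeta (a standard numerical value,
# quoted here as an assumption; it is not computed by this snippet).
GAMMA_1 = 14.134725141734693
rho = complex(0.5, GAMMA_1)

def zero_pair_term(x, rho):
    """Contribution -x^rho/rho - x^conj(rho)/conj(rho) of a conjugate pair of simple zeros."""
    term = -cmath.exp(rho * math.log(x)) / rho
    return 2 * term.real

for x in (1e2, 1e4, 1e6):
    pair = zero_pair_term(x, rho)
    envelope = 2 * math.sqrt(x) / abs(rho)
    print(f"x={x:.0e}  main term={x:.0f}  pair term={pair:+.2f}  envelope={envelope:.2f}")
```

The pair term oscillates in sign as $x$ grows but never exceeds the envelope, which is of order $\sqrt{x}$ because this zero has real part $1/2$.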
## From Perron Integrals to Later Asymptotic Theorems
What remains after Perron's formula has converted coefficients into contour integrals? The obstacle is no longer finding the right integral representation, but proving that the shifted contour and horizontal sides are smaller than the residues. Perron's formula supplies the central conversion rule of the course: analytic singularities of Dirichlet series become asymptotic terms for summatory arithmetic functions. A simple pole at $s=1$ usually gives a main term proportional to $x$, a pole of order $k$ gives $x$ times a polynomial in $\log x$ of degree $k-1$, and zeros or poles away from $1$ contribute oscillating secondary terms.
The method has three recurring steps. First, identify the Dirichlet series attached to the arithmetic function. Second, use Perron's formula to express the summatory function as a vertical integral. Third, shift the contour through a meromorphic continuation, collecting residues and estimating the new contour. The following chapters refine the analytic estimates needed to make this scheme strong enough for the prime number theorem and for primes in arithmetic progressions.
The contour method from the previous chapter becomes decisive when applied to primes. For $\zeta(s)$, the pole at $s=1$ supplies the main term, while control of the line $\operatorname{Re}(s)=1$ determines the quality of the error. That analytic pattern is the prototype for the prime number theorem and for every later application to primes in congruence classes.
# 5. The Prime Number Theorem for zeta
The preceding chapters built the analytic machinery around Dirichlet series, Euler products, and the meromorphic continuation of the Riemann zeta function. This chapter applies that machinery to the distribution of primes. The guiding theme is that the pole of $\zeta(s)$ at $s=1$ produces the main term in prime counting, while the absence of zeros on the boundary line $\operatorname{Re}(s)=1$ prevents an oscillating term of the same size.
We prove the prime number theorem first in its weighted form $\psi(x) \sim x$, then translate it into the more familiar statement
\begin{align*}
\pi(x) \sim \frac{x}{\log x}.
\end{align*}
The weighted functions are not cosmetic: they are exactly the coefficients of $-\zeta'/\zeta$, so they are the form in which complex analysis naturally sees primes.
## Weighted Prime Counting Functions
How should primes be counted if the goal is to connect them to an Euler product? Counting each prime once gives $\pi(x)$, but differentiating the Euler product of $\zeta(s)$ counts prime powers with logarithmic weights. The Chebyshev functions are designed to match that logarithmic derivative.
[definition: Prime Counting Function]
The prime counting function is the map
\begin{align*}
\pi:[1,\infty) &\longrightarrow \mathbb N\cup\{0\},\\
x &\longmapsto |\{p \le x : p \text{ is prime}\}|.
\end{align*}
[/definition]
The unweighted function $\pi(x)$ is the object appearing in the classical statement of the prime number theorem, but it is not the coefficient sum naturally produced by Euler products. The immediate problem is to keep a prime-counting function while changing the weights to the logarithms forced by differentiating Euler factors. The first weighted substitute keeps only primes but replaces the weight $1$ by $\log p$.
[definition: Chebyshev Theta Function]
The Chebyshev theta function is the map
\begin{align*}
\vartheta:[1,\infty) &\longrightarrow \mathbb R,\\
x &\longmapsto \sum_{p \le x} \log p.
\end{align*}
[/definition]
The function $\vartheta(x)$ counts primes with weight $\log p$. This is already closer to the analytic side than $\pi(x)$, because logarithms are the weights produced by differentiating prime factors.
However, the logarithmic derivative of an Euler product does not stop at primes: each factor also contributes the powers $p^2,p^3,\ldots$. To match that Dirichlet series coefficient-by-coefficient, one needs an arithmetic function that recognizes prime powers, assigns them the logarithm of their underlying prime, and vanishes on all other integers.
[definition: Von Mangoldt Function]
The von Mangoldt function is the map
\begin{align*}
\Lambda:\mathbb N &\longrightarrow \mathbb R,\\
n &\longmapsto
\begin{cases}
\log p, & \text{if } n=p^k \text{ for some prime } p \text{ and } k\in\mathbb N,\\
0, & \text{otherwise.}
\end{cases}
\end{align*}
[/definition]
The von Mangoldt function gives the local weight attached to each integer, but Perron's formula applies to cumulative sums rather than to isolated coefficients. The next object packages these weights up to a real cutoff $x$, producing the analytic prime-counting function that can be represented by the logarithmic derivative $-\zeta'/\zeta$.
[definition: Chebyshev Psi Function]
The Chebyshev psi function is the map
\begin{align*}
\psi:[1,\infty) &\longrightarrow \mathbb R,\\
x &\longmapsto \sum_{n \le x} \Lambda(n).
\end{align*}
[/definition]
Since $\Lambda(n)$ is supported on prime powers, $\psi(x)$ can also be written as
\begin{align*}
\psi(x) = \sum_{p^k \le x} \log p.
\end{align*}
This is the form directly generated by the logarithmic derivative of the Euler product.
[example: First Values Of Chebyshev Functions]
For $x=10$, the primes at most $10$ are exactly $2,3,5,7$, so the definition of $\vartheta$ gives
\begin{align*}
\vartheta(10)
&=\sum_{p\le 10}\log p\\
&=\log 2+\log 3+\log 5+\log 7.
\end{align*}
For $\psi(10)$, we must list all prime powers at most $10$:
\begin{align*}
2&=2^1, & 3&=3^1, & 4&=2^2, & 5&=5^1,\\
7&=7^1, & 8&=2^3, & 9&=3^2.
\end{align*}
Thus
\begin{align*}
\psi(10)
&=\sum_{n\le 10}\Lambda(n)\\
&=\Lambda(2)+\Lambda(3)+\Lambda(4)+\Lambda(5)+\Lambda(7)+\Lambda(8)+\Lambda(9)\\
&=\log 2+\log 3+\log 2+\log 5+\log 7+\log 2+\log 3.
\end{align*}
Subtracting the two displayed formulae term by term,
\begin{align*}
\psi(10)-\vartheta(10)
&=(\log 2+\log 3+\log 2+\log 5+\log 7+\log 2+\log 3)\\
&\quad -(\log 2+\log 3+\log 5+\log 7)\\
&=\log 2+\log 2+\log 3.
\end{align*}
The remaining terms are exactly the contributions from the higher prime powers $4=2^2$, $8=2^3$, and $9=3^2$. In general, if $p^k\le x$ with $k\ge 2$, then $p^2\le p^k\le x$, hence $p\le x^{1/2}$; this is why only prime bases up to $x^{1/2}$ can contribute to $\psi(x)-\vartheta(x)$.
[/example]
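The hand computation at $x=10$ can be reproduced directly from the definitions. This is a minimal sketch using trial division, which is adequate only for small arguments; the function names are illustrative.

```python
import math

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def theta(x):
    return sum(math.log(p) for p in range(2, int(x) + 1) if is_prime(p))

def psi(x):
    return sum(von_mangoldt(n) for n in range(1, int(x) + 1))

gap = psi(10) - theta(10)
print(theta(10), psi(10), gap)
```

The gap equals $2\log 2+\log 3=\log 12$, matching the contributions of $4$, $8$, and $9$ found above.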
The example shows why the weighted prime-power count is not an arbitrary modification of $\pi(x)$. To connect that count to complex analysis, the coefficient sequence must appear as an actual Dirichlet series. Logarithmically differentiating the Euler product supplies precisely that series in the half-plane of absolute convergence.
The next step is to identify the analytic object whose coefficients are the von Mangoldt weights. Once this identity is available, estimates for $\psi(x)$ can be obtained by studying the singularities and continuation of $-\zeta'(s)/\zeta(s)$.
[quotetheorem:1750]
[citeproof:1750]
This identity is the bridge from analytic information about $\zeta(s)$ to summatory information about primes. The hypothesis $\operatorname{Re}(s)>1$ is essential at this stage: it is where the Euler product and the differentiated double series are absolutely convergent, so termwise logarithmic differentiation is justified. The theorem does not assert convergence in the critical strip or on the boundary line $\operatorname{Re}(s)=1$; all boundary information must come later from meromorphic continuation and nonvanishing. The rest of the proof is a method for extracting asymptotics for partial sums of $\Lambda(n)$ from the singularity of this Dirichlet series at $s=1$.
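At the level of coefficients, the identity has a finite counterpart that is easy to test. Multiplying $-\zeta'/\zeta$ by $\zeta$ gives $-\zeta'(s)=\sum_{n\ge 1}(\log n)n^{-s}$, so comparing Dirichlet coefficients yields $\sum_{d\mid n}\Lambda(d)=\log n$. The sketch below checks this for a few sample integers, which are arbitrary choices.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def divisor_sum_of_lambda(n):
    """Sum of Lambda(d) over all positive divisors d of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

for n in (12, 30, 64, 97, 360):
    print(n, divisor_sum_of_lambda(n), math.log(n))
```

Each prime power $p^k$ exactly dividing $n$ contributes $k\log p$ to the divisor sum, which is why the two printed columns agree.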
## Equivalent Forms Of The Prime Number Theorem
Which prime-counting statement should count as the prime number theorem? Analytic arguments naturally give $\psi(x)\sim x$, while elementary formulations usually use $\pi(x)\sim x/\log x$. The problem is to show that the weighted count, the Chebyshev function, and the ordinary prime-counting function carry the same first-order information. The theorem below provides that translation, allowing us to prove whichever form is analytically most convenient.
[quotetheorem:4362]
[citeproof:4362]
The theorem lets us prove the most analytically convenient version and then translate it. Its scope is deliberately first-order: it transfers the statement that the quotient tends to $1$, but it does not preserve sharp error terms without extra estimates. For example, an error bound for $\psi(x)-x$ can be weakened when prime powers are removed and partial summation is applied, so quantitative versions of the prime number theorem require more careful bookkeeping. The function $\psi$ is preferred here because it is the summatory function of the Dirichlet coefficients of $-\zeta'/\zeta$.
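The three equivalent statements can be watched converging at different speeds. The following sketch computes $\pi(x)\log x/x$, $\vartheta(x)/x$, and $\psi(x)/x$ with a sieve of Eratosthenes; the cutoffs are arbitrary, and the output is numerical evidence only.

```python
import math

def chebyshev_ratios(x):
    """Return (pi(x) log x / x, theta(x) / x, psi(x) / x) using a sieve up to x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    count, theta, psi = 0, 0.0, 0.0
    for p in range(2, x + 1):
        if sieve[p]:
            log_p = math.log(p)
            count += 1
            theta += log_p
            power = p
            while power <= x:        # psi adds log p once for every power p^k <= x
                psi += log_p
                power *= p
    return count * math.log(x) / x, theta / x, psi / x

for x in (10**3, 10**4, 10**5):
    print(x, chebyshev_ratios(x))
```

The weighted ratios are already close to $1$ at these heights, while $\pi(x)\log x/x$ approaches $1$ much more slowly, in line with the remark that the translation between forms does not preserve sharp error terms.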
[example: From Theta To The Logarithmic Integral]
Assume $\vartheta(x)\sim x$. Partial summation gives
\begin{align*}
\pi(x)
&=\frac{\vartheta(x)}{\log x}+\int_2^x \frac{\vartheta(t)}{t(\log t)^2}\,dt.
\end{align*}
Write $\vartheta(t)=t(1+\varepsilon(t))$, where $\varepsilon(t)\to 0$. Then
\begin{align*}
\pi(x)
&=\frac{x(1+\varepsilon(x))}{\log x}
+\int_2^x \frac{t(1+\varepsilon(t))}{t(\log t)^2}\,dt\\
&=\frac{x}{\log x}+\frac{x\varepsilon(x)}{\log x}
+\int_2^x \frac{dt}{(\log t)^2}
+\int_2^x \frac{\varepsilon(t)}{(\log t)^2}\,dt.
\end{align*}
The second term is $o(x/\log x)$. For the last term, fix $\eta>0$ and choose $T$ such that $|\varepsilon(t)|\le \eta$ for $t\ge T$. Then
\begin{align*}
\left|\int_2^x \frac{\varepsilon(t)}{(\log t)^2}\,dt\right|
&\le \left|\int_2^T \frac{\varepsilon(t)}{(\log t)^2}\,dt\right|
+\eta\int_T^x \frac{dt}{(\log t)^2}.
\end{align*}
The first integral is constant in $x$, while
\begin{align*}
\int_2^x \frac{dt}{(\log t)^2}
&=\int_2^{\sqrt{x}} \frac{dt}{(\log t)^2}
+\int_{\sqrt{x}}^x \frac{dt}{(\log t)^2}\\
&\le \frac{\sqrt{x}}{(\log 2)^2}
+\frac{x}{(\frac12\log x)^2}\\
&=\frac{\sqrt{x}}{(\log 2)^2}+\frac{4x}{(\log x)^2}
=o\left(\frac{x}{\log x}\right).
\end{align*}
Since $\eta$ was arbitrary,
\begin{align*}
\int_2^x \frac{\varepsilon(t)}{(\log t)^2}\,dt
=o\left(\frac{x}{\log x}\right),
\end{align*}
and therefore
\begin{align*}
\pi(x)
&=\frac{x}{\log x}+\int_2^x \frac{dt}{(\log t)^2}
+o\left(\frac{x}{\log x}\right).
\end{align*}
Now compute the corresponding expansion for the logarithmic integral. By integration by parts with $u=1/\log t$ and $dv=dt$,
\begin{align*}
\operatorname{Li}(x)
&=\int_2^x \frac{dt}{\log t}\\
&=\left[\frac{t}{\log t}\right]_2^x
-\int_2^x t\left(-\frac{1}{t(\log t)^2}\right)\,dt\\
&=\frac{x}{\log x}-\frac{2}{\log 2}
+\int_2^x \frac{dt}{(\log t)^2}\\
&=\frac{x}{\log x}+\int_2^x \frac{dt}{(\log t)^2}+O(1).
\end{align*}
Thus $\pi(x)-\operatorname{Li}(x)=o(x/\log x)$. Since the displayed bound above gives $\int_2^x dt/(\log t)^2=o(x/\log x)$, we also have $\operatorname{Li}(x)\sim x/\log x$, and hence $\pi(x)\sim \operatorname{Li}(x)$.
[/example]
This calculation explains why $x/\log x$ and $\operatorname{Li}(x)$ are interchangeable at the level of first-order asymptotics, even though $\operatorname{Li}(x)$ is the more accurate main term. The analytic proof below only needs this coarse equivalence.
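A quick numerical comparison at $x=10^6$ illustrates the difference in accuracy. In this sketch $\operatorname{Li}(x)=\int_2^x dt/\log t$ is evaluated by Simpson's rule after the substitution $t=e^u$; the step count is an arbitrary choice, and the function names are illustrative.

```python
import math

def prime_count(x):
    """pi(x) via a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)

def log_integral(x, steps=20_000):
    """Li(x) = int_2^x dt/log t, via t = e^u and Simpson's rule on [log 2, log x]."""
    a, b = math.log(2), math.log(x)
    h = (b - a) / steps                      # steps must be even for Simpson's rule
    f = lambda u: math.exp(u) / u            # dt/log t becomes e^u du / u
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

x = 10**6
pi_x, li_x, crude = prime_count(x), log_integral(x), x / math.log(x)
print(pi_x, round(li_x, 1), round(crude, 1))
```

At this height $\operatorname{Li}(x)$ misses $\pi(x)$ by roughly a hundred, whereas $x/\log x$ is off by several thousand.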
[remark: Why Psi Is Smoother Than Pi]
The jumps of $\pi(x)$ all have size $1$, while the jumps of $\psi(x)$ have size $\log p$ at prime powers. This may look less smooth pointwise, but it is smoother analytically: its Mellin transform is a logarithmic derivative of an Euler product, so poles and zeros of $\zeta(s)$ govern its asymptotics.
[/remark]
## Nonvanishing On The Boundary Line
What analytic obstruction could prevent $\psi(x)$ from being asymptotic to $x$? The pole of $-\zeta'/\zeta$ at $s=1$ should contribute $x$, but any zero of $\zeta(s)$ on $\operatorname{Re}(s)=1$ would create another pole of $-\zeta'/\zeta$ on the boundary line. Such a pole would produce an oscillating term of size $x$, incompatible with a clean first-order asymptotic.
[quotetheorem:4363]
[citeproof:4363]
The theorem is the exact boundary statement needed for the prime number theorem in this course. A zero at $1+it$ would make $-\zeta'/\zeta$ have a pole at $1+it$, and after the logarithmic change of variables this would become a boundary singularity of the Tauberian transform at $it$. Newman's theorem cannot cross such a singularity; analytically, it would correspond to an oscillating contribution of the same size as the main term. Stronger zero-free regions give error terms, but the first-order theorem only needs $\zeta(s)\ne 0$ on the whole line $\operatorname{Re}(s)=1$.
[remark: Boundary Nonvanishing Versus Zero-Free Regions]
A zero-free region of the classical type asserts that $\zeta(s)\ne 0$ whenever $\operatorname{Re}(s)\ge 1-c/\log(|\operatorname{Im}(s)|+2)$, for some absolute constant $c>0$; the point $s=1$ is excluded because it is a pole. That stronger statement implies quantitative estimates for $\psi(x)-x$. The argument here only uses the qualitative consequence that $-\zeta'/\zeta$ is analytic on $\operatorname{Re}(s)\ge 1$ except for a simple pole at $s=1$.
[/remark]
## Extracting The Main Term From The Pole
How does a pole at $s=1$ become a term $x$ in a summatory function? The answer is that Mellin inversion converts division by $s$ and a pole at $1$ into a residue equal to $x$. The logarithmic derivative $-\zeta'/\zeta$ has exactly this pole because $\zeta(s)$ has a simple pole at $s=1$.
In the next theorem, $\mathcal M(\Omega)$ denotes the meromorphic functions on the open set $\Omega\subset\mathbb C$.
[quotetheorem:3368]
[citeproof:3368]
The simple pole statement is local, and it depends on two facts from the previous chapter: $\zeta$ has a simple pole at $1$, and the holomorphic factor $(s-1)\zeta(s)$ is nonzero at $s=1$, hence on a small neighbourhood of $1$. It says nothing about other singularities of $-\zeta'/\zeta$, which come from zeros of $\zeta$ and must be controlled separately. In the Tauberian proof, this theorem identifies the only boundary singularity that is allowed to remain, because that singularity is precisely the source of the main term.
[example: Residue Producing The Main Term]
For fixed $x>0$, Perron-type formulae contain the meromorphic factor
\begin{align*}
-\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}.
\end{align*}
By the local pole statement for logarithmic derivatives, there is a function $H$ holomorphic near $s=1$ such that
\begin{align*}
-\frac{\zeta'}{\zeta}(s)=\frac{1}{s-1}+H(s).
\end{align*}
Hence, near $s=1$,
\begin{align*}
-\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}
&=\left(\frac{1}{s-1}+H(s)\right)\frac{x^s}{s}\\
&=\frac{x^s}{s(s-1)}+H(s)\frac{x^s}{s}.
\end{align*}
The function $s\mapsto x^s/s$ is holomorphic at $s=1$, and $H(s)x^s/s$ is also holomorphic there. Therefore only the first summand contributes to the residue:
\begin{align*}
\operatorname{Res}_{s=1}\left(-\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}\right)
&=\lim_{s\to 1}(s-1)\left(\frac{x^s}{s(s-1)}+H(s)\frac{x^s}{s}\right)\\
&=\lim_{s\to 1}\frac{x^s}{s}
+\lim_{s\to 1}(s-1)H(s)\frac{x^s}{s}\\
&=x+0\\
&=x.
\end{align*}
Thus the simple pole of $-\zeta'/\zeta$ at $s=1$ contributes exactly the main term $x$ in the asymptotic formula for $\psi(x)$.
[/example]
The example isolates the whole philosophy of the proof. The pole at $1$ contributes $x$; the remaining analytic part must be shown to contribute $o(x)$.
## Newman's Tauberian Argument
Can we prove $\psi(x)\sim x$ without a long explicit formula and zero-free region? Newman's proof gives a compact Tauberian route: it converts analytic continuation of a Laplace transform to the boundary line into an asymptotic statement for a nonnegative summatory function.
In the theorem below, $d\mathcal L^1(t)$ denotes integration with respect to one-dimensional Lebesgue measure on the real line; equivalently, it is the usual real integral notation $dt$ written in measure form.
[quotetheorem:4364]
[citeproof:4364]
This theorem is the conversion principle at the end of the zeta argument. Its hypotheses explain what the preceding sections were arranged to provide: bounded real-variable data, a Laplace transform, and holomorphic continuation to the boundary. The theorem itself does not know about primes; the prime number theorem enters only after the logarithmic derivative of $\zeta$ supplies the relevant transform and boundary nonvanishing removes the possible singularities on the line $\operatorname{Re}(s)=1$.
The analytic work above would be incomplete without translating it back into the original counting problem for primes. The Tauberian conclusion controls a weighted prime-counting function, while the statement readers usually want is an asymptotic for how many primes lie below a large bound. The next quoted result packages that conversion in the standard equivalent form of the prime number theorem.
[quotetheorem:1692]
[citeproof:1692]
This is the central result of the first half of the course. It is a first-order theorem: it determines the main density of the primes, but it does not give a useful numerical error term for $\pi(x)-\operatorname{Li}(x)$. Such error terms require more detailed information about where zeros of $\zeta$ lie inside the critical strip, not merely the absence of zeros on the boundary line. In later chapters, Dirichlet characters and Dirichlet $L$-functions repeat the same pattern: Euler products encode congruence-restricted primes, boundary nonvanishing supplies the Tauberian input, and the pole of the principal $L$-function produces the main term.
To study primes in arithmetic progressions, we need a way to isolate residue classes without abandoning multiplicative structure. Dirichlet characters provide exactly that by acting as finite Fourier modes on $(\mathbb{Z}/q\mathbb{Z})^\times$. With them, the global Dirichlet-series machinery can be refined to focus on individual congruence classes.
# 6. Dirichlet Characters and Orthogonality
This chapter introduces the finite Fourier analysis needed to isolate arithmetic progressions. In the previous chapters, Euler products and Dirichlet series organised multiplicative information over all integers at once. Dirichlet characters let us keep the multiplicative structure while imposing congruence conditions such as $n \equiv a \pmod q$. The main mechanism is orthogonality: averaging characters kills every unwanted residue class and keeps the one we ask for.
## Multiplicative Characters Modulo an Integer
How can a congruence condition be encoded by multiplicative data rather than by additive residue classes? The reduced residue classes modulo $q$ form the finite abelian group $(\mathbb{Z}/q\mathbb{Z})^\times$, and its homomorphisms to $\mathbb{C}^\times$ are the objects that diagonalise multiplicative periodicity. Extending those homomorphisms by zero on non-units gives functions on all integers, which is the form needed for Dirichlet series.
[definition: Dirichlet Character Modulo Q]
Let $q \in \mathbb{N}$. A Dirichlet character modulo $q$ is a function $\chi: \mathbb{Z} \to \mathbb{C}$ such that there exists a group homomorphism
\begin{align*}
\widetilde{\chi}: (\mathbb{Z}/q\mathbb{Z})^\times \to \mathbb{C}^\times
\end{align*}
with
\begin{align*}
\chi(n) =
\begin{cases}
\widetilde{\chi}(\bar n), & \gcd(n,q)=1,\\
0, & \gcd(n,q)>1.
\end{cases}
\end{align*}
[/definition]
The bar denotes the residue class modulo $q$. Since the nonzero values come from a finite subgroup of $\mathbb{C}^\times$, every nonzero value of $\chi$ is a root of unity. Thus $|\chi(n)|$ is either $1$ or $0$, according as $n$ is coprime to $q$ or not.
Among all characters there is one exceptional case with no oscillation on the reduced residue classes. This creates a real obstruction in later analytic arguments: its $L$-function carries the same pole at $s=1$ as the zeta function, while the other characters are expected to contribute only oscillatory error terms.
The terminology must therefore separate the character that only records coprimality from the characters that actually oscillate. The definition below singles out the identity character on the unit classes modulo $q$, extended by zero on the non-units.
[definition: Principal Character]
Let $q \in \mathbb{N}$. The principal character modulo $q$ is the Dirichlet character $\chi_0$ defined by
\begin{align*}
\chi_0(n) =
\begin{cases}
1, & \gcd(n,q)=1,\\
0, & \gcd(n,q)>1.
\end{cases}
\end{align*}
[/definition]
A character modulo $q$ that is not $\chi_0$ is called nonprincipal. Principal characters carry only the information of coprimality to $q$, while nonprincipal characters oscillate among the reduced residue classes.
Before forming $L(s,\chi)$, we need the structural facts that make characters suitable Dirichlet-series coefficients. Periodicity supplies congruence information, complete multiplicativity supplies Euler products, and vanishing on non-units keeps the definition compatible with reduction modulo $q$.
[quotetheorem:4365]
[citeproof:4365]
The point of the definition is that a character is not merely a periodic function. An arbitrary periodic weight can detect congruence classes, but it usually destroys Euler products because it need not satisfy $a(mn)=a(m)a(n)$. The quoted properties record what survives when a finite group homomorphism is extended to all integers by zero on non-units.
These properties explain why characters are compatible with Dirichlet series: the coefficients $\chi(n)$ are periodic and completely multiplicative. The coprimality hypothesis is essential: a character modulo $q$ must vanish at every integer sharing a prime factor with $q$, so it cannot behave like a nonzero group homomorphism on all residue classes modulo $q$. The theorem also does not say that every completely multiplicative periodic function is primitive or nonprincipal; for example $\chi_0$ is completely multiplicative but carries no oscillation. This is the reason the later $L$-function $L(s,\chi)=\sum_{n=1}^\infty \chi(n)n^{-s}$ has both congruence information and an Euler product.
[example: Characters Modulo Three]
Modulo $3$, the reduced residue classes are
\begin{align*}
(\mathbb{Z}/3\mathbb{Z})^\times=\{\bar 1,\bar 2\}.
\end{align*}
The class $\bar 2$ has order $2$, because
\begin{align*}
\bar 2^2=\bar 4=\bar 1 \quad \text{in } \mathbb{Z}/3\mathbb{Z}.
\end{align*}
Thus a character is determined by the value assigned to $\bar 2$. If $\widetilde{\chi}(\bar 2)=z$, then the homomorphism law gives
\begin{align*}
z^2=\widetilde{\chi}(\bar 2)^2=\widetilde{\chi}(\bar 2^2)=\widetilde{\chi}(\bar 1)=1,
\end{align*}
so $z=1$ or $z=-1$. These two choices give exactly two characters.
For $z=1$, we get the principal character:
\begin{align*}
\chi_0(n)=
\begin{cases}
1, & n \equiv 1 \pmod 3,\\
1, & n \equiv 2 \pmod 3,\\
0, & n \equiv 0 \pmod 3.
\end{cases}
\end{align*}
For $z=-1$, we get the nonprincipal character:
\begin{align*}
\chi_1(n)=
\begin{cases}
1, & n \equiv 1 \pmod 3,\\
-1, & n \equiv 2 \pmod 3,\\
0, & n \equiv 0 \pmod 3.
\end{cases}
\end{align*}
Thus $\chi_1$ gives a sign to the two reduced residue classes modulo $3$: it is $1$ on the class of $1$, $-1$ on the class of $2$, and $0$ exactly on the non-units.
[/example]
The modulus $3$ case shows the first nontrivial oscillation: the nonprincipal character distinguishes the two units by a sign. Modulus $4$ has the same group shape, but it is arithmetically different because the zero values now remove all even integers.
[example: Characters Modulo Four]
Modulo $4$, the reduced residue classes are
\begin{align*}
(\mathbb{Z}/4\mathbb{Z})^\times=\{\bar 1,\bar 3\}.
\end{align*}
The class $\bar 3$ has order $2$, since
\begin{align*}
\bar 3^2=\bar 9=\bar 1 \quad \text{in } \mathbb{Z}/4\mathbb{Z}.
\end{align*}
Thus a homomorphism $\widetilde{\chi}:(\mathbb{Z}/4\mathbb{Z})^\times\to \mathbb{C}^\times$ is determined by the value $z=\widetilde{\chi}(\bar 3)$. The homomorphism law forces
\begin{align*}
z^2
&=\widetilde{\chi}(\bar 3)^2\\
&=\widetilde{\chi}(\bar 3^2)\\
&=\widetilde{\chi}(\bar 1)\\
&=1,
\end{align*}
so $z=1$ or $z=-1$. These two choices give exactly two characters modulo $4$.
For $z=1$, the character is principal:
\begin{align*}
\chi_0(n)=
\begin{cases}
1, & n \equiv 1 \pmod 4,\\
1, & n \equiv 3 \pmod 4,\\
0, & n \equiv 0 \pmod 2.
\end{cases}
\end{align*}
For $z=-1$, the nonprincipal character is
\begin{align*}
\chi_{-4}(n)=
\begin{cases}
1, & n \equiv 1 \pmod 4,\\
-1, & n \equiv 3 \pmod 4,\\
0, & n \equiv 0 \pmod 2.
\end{cases}
\end{align*}
Thus $\chi_{-4}$ separates the two odd residue classes modulo $4$, while every even integer receives value $0$ because it is not coprime to $4$.
[/example]
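The structural properties of characters can be tested on $\chi_{-4}$ by brute force. The sketch below encodes the character as a lookup table indexed by $n \bmod 4$ and checks periodicity and complete multiplicativity on a small window of integers; the window is an arbitrary choice and the check is evidence, not a proof.

```python
def chi_minus_4(n):
    """The nonprincipal character modulo 4: 1 and -1 on n = 1, 3 (mod 4), and 0 on even n."""
    return (0, 1, 0, -1)[n % 4]

pairs = [(m, n) for m in range(-20, 21) for n in range(-20, 21)]
multiplicative = all(
    chi_minus_4(m * n) == chi_minus_4(m) * chi_minus_4(n) for m, n in pairs
)
periodic = all(chi_minus_4(n + 4) == chi_minus_4(n) for n in range(-20, 21))
print(multiplicative, periodic)
```

Python's `%` returns a nonnegative remainder for a positive modulus, so the table also handles negative integers; for instance $\chi_{-4}(-1)=\chi_{-4}(3)=-1$.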
Modulo $5$ gives the first example where the character group has four elements, including genuinely complex-valued characters.
[example: Characters Modulo Five]
The reduced residue classes modulo $5$ are
\begin{align*}
(\mathbb{Z}/5\mathbb{Z})^\times=\{\bar 1,\bar 2,\bar 3,\bar 4\}.
\end{align*}
The class $\bar 2$ generates this group, because
\begin{align*}
\bar 2^0&=\bar 1,\\
\bar 2^1&=\bar 2,\\
\bar 2^2&=\bar 4,\\
\bar 2^3&=\bar 8=\bar 3,\\
\bar 2^4&=\bar {16}=\bar 1
\end{align*}
in $\mathbb{Z}/5\mathbb{Z}$. Thus $(\mathbb{Z}/5\mathbb{Z})^\times$ is cyclic of order $4$.
Let $i=\sqrt{-1}$. If $\widetilde{\chi}$ is a homomorphism on $(\mathbb{Z}/5\mathbb{Z})^\times$ and $z=\widetilde{\chi}(\bar 2)$, then
\begin{align*}
z^4
&=\widetilde{\chi}(\bar 2)^4\\
&=\widetilde{\chi}(\bar 2^4)\\
&=\widetilde{\chi}(\bar 1)\\
&=1.
\end{align*}
Hence $z$ must be one of the four fourth roots of unity $1,i,-1,-i$, which are exactly $i^j$ for $j\in\{0,1,2,3\}$. Conversely, choosing $\widetilde{\chi}_j(\bar 2)=i^j$ determines a homomorphism: every reduced residue class is $\bar 2^k$ for a unique $k\in\{0,1,2,3\}$, and since $(i^j)^4=1$ the rule $\bar 2^k\mapsto i^{jk}$ is well defined and multiplicative.
For each $j\in\{0,1,2,3\}$, extend this homomorphism to a Dirichlet character $\chi_j$ modulo $5$ by setting $\chi_j(n)=0$ when $5\mid n$. On the reduced residue classes, the values are obtained from the powers of $\bar 2$:
\begin{align*}
\chi_j(1)
&=\chi_j(2^0)
=(i^j)^0
=1,\\
\chi_j(2)
&=i^j,\\
\chi_j(4)
&=\chi_j(2^2)
=(i^j)^2
=i^{2j}
=(-1)^j,\\
\chi_j(3)
&=\chi_j(2^3)
=(i^j)^3
=i^{3j}
=(i^3)^j
=(-i)^j.
\end{align*}
Together with $\chi_j(n)=0$ for $5\mid n$, these four choices list all four characters modulo $5$; the case $j=0$ is the principal character, since it takes value $1$ on every reduced residue class.
[/example]
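The character table modulo $5$ can be checked mechanically. The following Python sketch is illustrative only: the names `POWERS_OF_I`, `DLOG`, and `chi` are assumptions introduced here, not notation from the text. It stores the powers of $i$ exactly, looks up the exponent $k$ with $2^k\equiv n \pmod 5$, and returns $i^{jk}$.

```python
# Sketch: the four Dirichlet characters modulo 5, built from the generator 2.
# All names below are illustrative helpers, not notation from the notes.

# Exact powers i^0, i^1, i^2, i^3 (no floating-point rounding is involved).
POWERS_OF_I = [1, 1j, -1, -1j]

# Discrete logarithm base 2 modulo 5: DLOG[r] = k with 2^k = r (mod 5).
DLOG = {pow(2, k, 5): k for k in range(4)}


def chi(j, n):
    """Value of chi_j at the integer n, where chi_j(2) = i^j."""
    r = n % 5
    if r == 0:  # 5 divides n, so n is not a unit modulo 5
        return 0
    return POWERS_OF_I[(j * DLOG[r]) % 4]


# Print the row (chi_j(1), chi_j(2), chi_j(3), chi_j(4)) for each j.
for j in range(4):
    print(j, [chi(j, n) for n in range(1, 5)])
```

The row for $j=0$ is the principal character, and the rows for $j=1$ and $j=3$ are complex conjugates of each other.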
## Orthogonality And Projection Onto Residue Classes
How do characters isolate one residue class among many? The obstacle is that the ordinary congruence indicator $\mathbb{1}_{n \equiv a \pmod q}$ is defined by an additive condition and is not multiplicative in general, so inserting it directly into a Dirichlet series usually loses the Euler product structure. The guiding analogy is the Fourier identity saying that exponentials on a finite abelian group are orthogonal: just as the functions $x \mapsto e^{2\pi i kx/N}$ form an orthogonal basis on $\mathbb{Z}/N\mathbb{Z}$, the characters of $(\mathbb{Z}/q\mathbb{Z})^\times$ form the multiplicative Fourier basis on the reduced residue classes. Averaging them gives exact projection operators on reduced residue classes.
[remark: Character Orthogonality on Reduced Residue Classes]
Let $q\ge 1$, and let the sum run over all Dirichlet characters modulo $q$. If $a$ and $b$ are both coprime to $q$, then
\begin{align*}
\frac{1}{\varphi(q)}\sum_{\chi \bmod q}\chi(a)\overline{\chi(b)}
=
\begin{cases}
1, & a\equiv b \pmod q,\\
0, & a\not\equiv b \pmod q.
\end{cases}
\end{align*}
The hypotheses $\gcd(a,q)=\gcd(b,q)=1$ are not cosmetic: if $a$ is not a unit modulo $q$, then its class is not an element of $(\mathbb{Z}/q\mathbb{Z})^\times$, so multiplicative Fourier analysis has no object to evaluate. For example, modulo $4$ characters cannot separate the classes $0$ and $2$, because every character is zero on both even classes. Thus this projection principle does not apply to all residue classes modulo $q$; it applies only inside the reduced residue classes.
[/remark]
For applications to arithmetic progressions, the projection formula must be expressed as an actual indicator function in the variable being counted. The obstruction is that the residue class condition should vanish on every wrong reduced class while remaining compatible with character sums. The next result records this indicator form, which is the version inserted into Dirichlet series and prime sums.
[quotetheorem:4366]
[citeproof:4366]
The caveat about reduced residue classes is important. The formula is a replacement for the additive indicator only after all non-units have been excluded; it is not a formula for arbitrary congruence classes. For instance, modulo $6$ the classes $2$ and $4$ are both invisible to every character, even though they are different additive residue classes. In analytic number theory this is usually the correct setting, since primes in an arithmetic progression $a \pmod q$ can only be equidistributed among the classes with $\gcd(a,q)=1$; the finitely many primes dividing $q$ are exceptional.
[example: Projecting Onto One Class Modulo Five]
Take $q=5$ and $a=2$. For the four characters $\chi_j$ modulo $5$ from the previous example, $\chi_j(2)=i^j$, so
\begin{align*}
\overline{\chi_j(2)}
&=\overline{i^j}\\
&=\overline{i}^{\,j}\\
&=(-i)^j.
\end{align*}
Thus the projection expression for the reduced class $2 \pmod 5$ is
\begin{align*}
P(n)
=\frac{1}{4}\sum_{j=0}^{3}\overline{\chi_j(2)}\chi_j(n)
=\frac{1}{4}\sum_{j=0}^{3}(-i)^j\chi_j(n).
\end{align*}
If $5\mid n$, then $\chi_j(n)=0$ for every $j$, so
\begin{align*}
P(n)=\frac{1}{4}\sum_{j=0}^{3}(-i)^j\cdot 0=0.
\end{align*}
Now suppose $5\nmid n$. The values from the character table modulo $5$ are
\begin{align*}
\chi_j(1)&=1,\\
\chi_j(2)&=i^j,\\
\chi_j(3)&=(-i)^j,\\
\chi_j(4)&=(-1)^j.
\end{align*}
For $n\equiv 1\pmod 5$,
\begin{align*}
P(n)
&=\frac{1}{4}\sum_{j=0}^{3}(-i)^j\\
&=\frac{1}{4}\left(1+(-i)+(-i)^2+(-i)^3\right)\\
&=\frac{1}{4}(1-i-1+i)\\
&=0.
\end{align*}
For $n\equiv 2\pmod 5$,
\begin{align*}
P(n)
&=\frac{1}{4}\sum_{j=0}^{3}(-i)^j i^j\\
&=\frac{1}{4}\sum_{j=0}^{3}((-i)i)^j\\
&=\frac{1}{4}\sum_{j=0}^{3}1^j\\
&=\frac{1}{4}(1+1+1+1)\\
&=1.
\end{align*}
For $n\equiv 3\pmod 5$,
\begin{align*}
P(n)
&=\frac{1}{4}\sum_{j=0}^{3}(-i)^j(-i)^j\\
&=\frac{1}{4}\sum_{j=0}^{3}((-i)^2)^j\\
&=\frac{1}{4}\sum_{j=0}^{3}(-1)^j\\
&=\frac{1}{4}(1-1+1-1)\\
&=0.
\end{align*}
For $n\equiv 4\pmod 5$,
\begin{align*}
P(n)
&=\frac{1}{4}\sum_{j=0}^{3}(-i)^j(-1)^j\\
&=\frac{1}{4}\sum_{j=0}^{3}((-i)(-1))^j\\
&=\frac{1}{4}\sum_{j=0}^{3}i^j\\
&=\frac{1}{4}(1+i-1-i)\\
&=0.
\end{align*}
Therefore $P(n)$ is $1$ exactly on the reduced residue class $n\equiv 2\pmod 5$ and is $0$ on the other residue classes, so the character average is precisely the desired projection.
[/example]
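The four case computations above can be repeated numerically. This is a minimal sketch, assuming the same illustrative helper `chi` for the characters modulo $5$; the function name `projection` is also an assumption made here. Because every character value lies in $\{0,\pm 1,\pm i\}$, the complex arithmetic is exact.

```python
# Sketch: P(n) = (1/4) * sum_j conj(chi_j(2)) * chi_j(n) for the modulus 5.
# Helper names are illustrative, not notation from the notes.

POWERS_OF_I = [1, 1j, -1, -1j]
DLOG = {1: 0, 2: 1, 4: 2, 3: 3}  # exponents k with 2^k = r (mod 5)


def chi(j, n):
    """Value of chi_j at n, where chi_j(2) = i^j and chi_j(n) = 0 if 5 | n."""
    r = n % 5
    return 0 if r == 0 else POWERS_OF_I[(j * DLOG[r]) % 4]


def projection(n, a=2):
    """Character average that should equal 1 if n = a (mod 5) and 0 otherwise."""
    total = sum(complex(chi(j, a)).conjugate() * chi(j, n) for j in range(4))
    return total / 4


# Values of P(n) for n = 0, 1, ..., 9; only n = 2 and n = 7 give 1.
print([projection(n) for n in range(10)])
```

Changing the default argument `a` to another unit modulo $5$ projects onto that class instead.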
The first orthogonality relation used characters to detect a fixed reduced residue class. The complementary question is whether the reduced residue classes can detect equality of characters themselves. This matters because later arguments decompose functions into character components, and that decomposition is useful only if distinct characters do not interfere with one another under averaging.
[explanation: Orthogonality of Dirichlet Characters]
If $\chi$ and $\psi$ are Dirichlet characters modulo $q$, then
\begin{align*}
\frac{1}{\varphi(q)}\sum_{\substack{a\bmod q\\(a,q)=1}}\chi(a)\overline{\psi(a)}
=
\begin{cases}
1,&\chi=\psi,\\
0,&\chi\ne\psi.
\end{cases}
\end{align*}
[/explanation]
The second orthogonality relation says that different characters are independent Fourier modes. The restriction to reduced residue classes is again essential: summing over all residues modulo $q$ would merely add zeros from non-units, while trying to compare values on non-units would give no information because all characters vanish there. The relation also does not assert cancellation for partial sums such as $\sum_{a\le x}\chi(a)$; it is an exact complete-period identity over the finite group. Together with the first relation, it is precisely the finite Fourier orthogonality pair: one relation projects points using modes, and the other separates modes using points.
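As a concrete check of this relation, the sketch below computes the normalised sum over the reduced residues modulo $5$ for every pair of characters. The helper names `chi_mod5` and `inner` are assumptions introduced for illustration; the values are exact because only $0,\pm 1,\pm i$ occur.

```python
# Sketch: second orthogonality relation for the four characters modulo 5.
# Helper names are illustrative, not notation from the notes.

POWERS_OF_I = [1, 1j, -1, -1j]
DLOG = {1: 0, 2: 1, 4: 2, 3: 3}  # exponents k with 2^k = r (mod 5)


def chi_mod5(j):
    """Return the character chi_j modulo 5 (as a function) with chi_j(2) = i^j."""
    def chi(n):
        r = n % 5
        return 0 if r == 0 else POWERS_OF_I[(j * DLOG[r]) % 4]
    return chi


def inner(chi, psi, q=5, phi_q=4):
    """(1/phi(q)) * sum over a = 1..q-1 of chi(a) * conj(psi(a)).

    Non-units contribute 0 automatically, and for q = 5 every a in 1..4 is a unit.
    """
    total = sum(chi(a) * complex(psi(a)).conjugate() for a in range(1, q))
    return total / phi_q


# Gram matrix of the characters: the identity matrix is expected.
for j in range(4):
    print([inner(chi_mod5(j), chi_mod5(k)) for k in range(4)])
```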
[example: Averaging A Nonprincipal Character]
For the nonprincipal character $\chi_{-4}$ modulo $4$, the reduced residue classes are exactly $\bar 1$ and $\bar 3$, since
\begin{align*}
\gcd(0,4)&=4,\\
\gcd(1,4)&=1,\\
\gcd(2,4)&=2,\\
\gcd(3,4)&=1.
\end{align*}
Therefore the complete reduced sum is
\begin{align*}
\sum_{\substack{a \bmod 4\\ \gcd(a,4)=1}}\chi_{-4}(a)
&=\chi_{-4}(1)+\chi_{-4}(3)\\
&=1+(-1)\\
&=0.
\end{align*}
Equivalently, taking $\psi=\chi_0$ and $\chi=\chi_{-4}$ in the *Second Orthogonality Relation For Dirichlet Characters* gives
\begin{align*}
\frac{1}{\varphi(4)}
\sum_{\substack{a \bmod 4\\ \gcd(a,4)=1}}
\chi_{-4}(a)\overline{\chi_0(a)}
&=0,
\end{align*}
because $\chi_{-4}\ne \chi_0$. Since $\varphi(4)=2$ and $\chi_0(1)=\chi_0(3)=1$, this is the same cancellation:
\begin{align*}
\frac{1}{2}\left(\chi_{-4}(1)\overline{\chi_0(1)}
+\chi_{-4}(3)\overline{\chi_0(3)}\right)
&=\frac{1}{2}\left(1\cdot \overline{1}+(-1)\cdot \overline{1}\right)\\
&=\frac{1}{2}(1-1)\\
&=0.
\end{align*}
Thus the two reduced residue classes contribute equal and opposite values, which is the complete-period cancellation that orthogonality detects.
[/example]
## Primitive Characters And Conductors
When does a character modulo $q$ carry information that genuinely depends on the modulus $q$? Some characters modulo $q$ are obtained by pulling back characters from a smaller modulus. The primitive characters are those for which no smaller modulus carries the same information, and the conductor records the minimal modulus.
[definition: Induced Character]
Let $d,q \in \mathbb{N}$ with $d \mid q$. Let $\chi^*$ be a Dirichlet character modulo $d$. The character modulo $q$ induced by $\chi^*$ is the function $\chi: \mathbb{Z} \to \mathbb{C}$ defined by
\begin{align*}
\chi(n)=
\begin{cases}
\chi^*(n), & \gcd(n,q)=1,\\
0, & \gcd(n,q)>1.
\end{cases}
\end{align*}
[/definition]
The extra zero condition matters: even if $\chi^*(n)$ is nonzero, the induced character modulo $q$ is forced to vanish when $n$ is not coprime to $q$. Thus induction changes the behaviour at primes dividing $q$ but not $d$.
This creates an ambiguity in the modulus attached to a character. A character displayed modulo $q$ may have all of its genuine oscillation coming from a smaller divisor, with the larger modulus only adding forced zeros at new primes. The terminology below removes that ambiguity by naming characters that cannot be descended further and by recording the smallest modulus from which a given character is induced.
[definition: Primitive Character And Conductor]
Let $\chi$ be a Dirichlet character modulo $q$. The character $\chi$ is primitive modulo $q$ if it is not induced by any character modulo $d$ with $d \mid q$ and $d<q$. The conductor of $\chi$ is the smallest positive integer $f$ such that $\chi$ is induced by a primitive character modulo $f$.
[/definition]
The conductor is the true modulus of oscillation. Later, analytic properties of $L(s,\chi)$ are naturally stated in terms of primitive characters; imprimitive characters inherit local factors from their primitive source and have modified Euler factors at the primes dividing $q$ but not $f$.
For this terminology to be useful, every character must have a well-defined primitive source. The following structural result guarantees that the conductor exists, divides the visible modulus, and supplies the primitive character from which the original one is induced.
[quotetheorem:4367]
[citeproof:4367]
This theorem is mostly structural at this stage, but the hypotheses are doing real work. The divisor condition $f\mid q$ reflects the fact that induction changes only the modulus through which the reduced residue classes are viewed; it is not an arbitrary change of period. The theorem also does not say that the visible modulus $q$ is analytically decisive: sums over all characters modulo $q$ include both primitive and imprimitive characters, while analytic continuation and functional equations are most naturally proved first for primitive characters.
[example: The Principal Character Modulo Four Is Imprimitive]
Modulo $1$, every integer is congruent to the single residue class $\bar 0$, and every integer is coprime to $1$ because $\gcd(n,1)=1$ for all $n\in\mathbb{Z}$. Thus the principal character modulo $1$ is the function
\begin{align*}
\chi_0^{(1)}(n)=1
\end{align*}
for every integer $n$.
Now induce $\chi_0^{(1)}$ to modulus $4$. By the definition of an induced character, the induced character $\chi$ modulo $4$ is
\begin{align*}
\chi(n)
=
\begin{cases}
\chi_0^{(1)}(n), & \gcd(n,4)=1,\\
0, & \gcd(n,4)>1.
\end{cases}
\end{align*}
Since $\chi_0^{(1)}(n)=1$ for every $n$, this becomes
\begin{align*}
\chi(n)
=
\begin{cases}
1, & \gcd(n,4)=1,\\
0, & \gcd(n,4)>1.
\end{cases}
\end{align*}
The condition $\gcd(n,4)=1$ holds exactly when $n$ is odd, because the only prime divisor of $4$ is $2$. Hence
\begin{align*}
\chi(n)
=
\begin{cases}
1, & n \equiv 1 \pmod 2,\\
0, & n \equiv 0 \pmod 2.
\end{cases}
\end{align*}
This is exactly the principal character modulo $4$, so the principal character modulo $4$ is induced from modulus $1$. Therefore its conductor is $1$, the smallest possible modulus from which it can be induced.
[/example]
The principal character modulo $4$ loses all information except parity; the nonprincipal character modulo $4$ behaves differently because it distinguishes the two odd classes.
[example: The Nonprincipal Character Modulo Four Is Primitive]
The character $\chi_{-4}$ modulo $4$ is
\begin{align*}
\chi_{-4}(n)
=
\begin{cases}
1, & n\equiv 1 \pmod 4,\\
-1, & n\equiv 3 \pmod 4,\\
0, & n\equiv 0 \pmod 2.
\end{cases}
\end{align*}
To show that it is primitive, we check the proper divisors of $4$, namely $1$ and $2$.
Modulo $1$, the only character is the principal character $\chi_0^{(1)}$, and it satisfies
\begin{align*}
\chi_0^{(1)}(n)=1
\end{align*}
for every integer $n$. If it is induced to modulus $4$, the induced character $\chi$ is
\begin{align*}
\chi(n)
=
\begin{cases}
\chi_0^{(1)}(n), & \gcd(n,4)=1,\\
0, & \gcd(n,4)>1,
\end{cases}
=
\begin{cases}
1, & \gcd(n,4)=1,\\
0, & \gcd(n,4)>1.
\end{cases}
\end{align*}
Hence
\begin{align*}
\chi(1)&=1,\\
\chi(3)&=1,
\end{align*}
while
\begin{align*}
\chi_{-4}(1)&=1,\\
\chi_{-4}(3)&=-1.
\end{align*}
So $\chi_{-4}$ is not induced from modulus $1$.
Modulo $2$, the reduced residue classes are only $\bar 1$, so the only character modulo $2$ is the principal character $\chi_0^{(2)}$, with
\begin{align*}
\chi_0^{(2)}(n)
=
\begin{cases}
1, & n\equiv 1 \pmod 2,\\
0, & n\equiv 0 \pmod 2.
\end{cases}
\end{align*}
Inducing $\chi_0^{(2)}$ to modulus $4$ gives
\begin{align*}
\chi(n)
=
\begin{cases}
\chi_0^{(2)}(n), & \gcd(n,4)=1,\\
0, & \gcd(n,4)>1.
\end{cases}
\end{align*}
For $n=1$ and $n=3$, both integers are coprime to $4$ and both are congruent to $1 \pmod 2$, so
\begin{align*}
\chi(1)&=\chi_0^{(2)}(1)=1,\\
\chi(3)&=\chi_0^{(2)}(3)=1.
\end{align*}
Again this does not match $\chi_{-4}(3)=-1$.
Therefore $\chi_{-4}$ is induced from neither proper divisor of $4$, so it is primitive modulo $4$. Its conductor is consequently $4$.
[/example]
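The divisor-by-divisor check in the last two examples can be automated. The sketch below is hedged: it uses the standard criterion (not stated in the text above) that $\chi$ modulo $q$ is induced from a divisor $d$ exactly when $\chi(a)=\chi(b)$ for all units $a,b$ modulo $q$ with $a\equiv b \pmod d$. The function names `is_induced_from` and `conductor` are assumptions introduced here, and the smallest such $d$ is taken as the conductor.

```python
from math import gcd

# Sketch: test whether a character mod q is induced from a divisor d, and
# find the smallest such d. Function names are illustrative assumptions.


def is_induced_from(chi, q, d):
    """True if chi takes equal values on units mod q that agree mod d."""
    units = [a for a in range(1, q + 1) if gcd(a, q) == 1]
    return all(chi(a) == chi(b) for a in units for b in units if (a - b) % d == 0)


def conductor(chi, q):
    """Smallest divisor d of q from which chi is induced."""
    return min(d for d in range(1, q + 1) if q % d == 0 and is_induced_from(chi, q, d))


def chi0_mod4(n):
    """Principal character modulo 4."""
    return 1 if n % 2 == 1 else 0


def chi_m4(n):
    """Nonprincipal character modulo 4 (called chi_{-4} in the text)."""
    return {1: 1, 3: -1}.get(n % 4, 0)


print(conductor(chi0_mod4, 4))  # 1
print(conductor(chi_m4, 4))     # 4
```

The brute-force comparison over pairs of units is quadratic in $\varphi(q)$, which is fine for small moduli such as these.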
## Parity Of A Character
What information remains when a character is evaluated at $-1$? Since $-1$ has order at most $2$ in every group $(\mathbb{Z}/q\mathbb{Z})^\times$, a character takes the value $1$ or $-1$ at $-1$. This sign is called the parity of the character, and it controls the symmetry of later Gauss sums and functional equations.
[definition: Parity Of A Dirichlet Character]
Let $\chi$ be a Dirichlet character modulo $q$. The character $\chi$ is even if $\chi(-1)=1$ and odd if $\chi(-1)=-1$.
[/definition]
For all $n \in \mathbb{Z}$, multiplicativity gives $\chi(-n)=\chi(-1)\chi(n)$. Thus an even character satisfies $\chi(-n)=\chi(n)$, while an odd character satisfies $\chi(-n)=-\chi(n)$.
[example: Parity Modulo Three And Four]
For the nonprincipal character modulo $3$, the integer $-1$ is congruent to $2$ modulo $3$, since
\begin{align*}
-1-2=-3
\end{align*}
is divisible by $3$. Therefore periodicity gives
\begin{align*}
\chi_1(-1)=\chi_1(2).
\end{align*}
From the character table modulo $3$,
\begin{align*}
\chi_1(2)=-1,
\end{align*}
so
\begin{align*}
\chi_1(-1)=-1.
\end{align*}
Hence $\chi_1$ is odd.
For the character $\chi_{-4}$ modulo $4$, the integer $-1$ is congruent to $3$ modulo $4$, since
\begin{align*}
-1-3=-4
\end{align*}
is divisible by $4$. Therefore
\begin{align*}
\chi_{-4}(-1)=\chi_{-4}(3).
\end{align*}
From the character table modulo $4$,
\begin{align*}
\chi_{-4}(3)=-1,
\end{align*}
so
\begin{align*}
\chi_{-4}(-1)=-1.
\end{align*}
Thus $\chi_{-4}$ is also odd.
For the principal character modulo $3$, we have
\begin{align*}
\gcd(-1,3)=1,
\end{align*}
and for the principal character modulo $4$, we have
\begin{align*}
\gcd(-1,4)=1.
\end{align*}
By the definition of the principal character, this gives
\begin{align*}
\chi_0^{(3)}(-1)&=1,\\
\chi_0^{(4)}(-1)&=1.
\end{align*}
Therefore the principal characters modulo $3$ and modulo $4$ are even, while the two nonprincipal characters in these moduli are odd.
[/example]
Modulo $5$, parity can be read directly from the value of a character on the generator $2$, since $-1\equiv 2^2\pmod 5$.
[example: Parity Of Characters Modulo Five]
For the characters $\chi_j$ modulo $5$ defined by $\chi_j(2)=i^j$, first locate the residue class of $-1$. Since
\begin{align*}
-1-4=-5,
\end{align*}
we have $-1\equiv 4\pmod 5$. Also
\begin{align*}
2^2=4,
\end{align*}
so $\bar 4=\bar 2^2$ in $(\mathbb{Z}/5\mathbb{Z})^\times$. Therefore, by periodicity and multiplicativity of Dirichlet characters,
\begin{align*}
\chi_j(-1)
&=\chi_j(4)\\
&=\chi_j(2^2)\\
&=\chi_j(2)^2\\
&=(i^j)^2\\
&=i^{2j}\\
&=(i^2)^j\\
&=(-1)^j.
\end{align*}
Thus
\begin{align*}
\chi_0(-1)&=(-1)^0=1,\\
\chi_1(-1)&=(-1)^1=-1,\\
\chi_2(-1)&=(-1)^2=1,\\
\chi_3(-1)&=(-1)^3=-1.
\end{align*}
By the definition of parity, $\chi_0$ and $\chi_2$ are even, while $\chi_1$ and $\chi_3$ are odd. This split is the first sign that the four characters modulo $5$ will have two different symmetry types in their analytic theory.
[/example]
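Parity is a one-line computation once a character is available as a function. The following sketch is illustrative; `chi_mod5`, `chi_m4`, and `parity` are names assumed here. It relies on the fact that Python's `%` operator returns a nonnegative remainder, so `-1 % 5` is `4` and `-1 % 4` is `3`, matching the periodicity step in the examples.

```python
# Sketch: read off the parity of a Dirichlet character from its value at -1.
# Helper names are illustrative, not notation from the notes.

POWERS_OF_I = [1, 1j, -1, -1j]
DLOG = {1: 0, 2: 1, 4: 2, 3: 3}  # exponents k with 2^k = r (mod 5)


def chi_mod5(j):
    """Return the character chi_j modulo 5 with chi_j(2) = i^j."""
    def chi(n):
        r = n % 5  # remainder in 0..4, also for negative n
        return 0 if r == 0 else POWERS_OF_I[(j * DLOG[r]) % 4]
    return chi


def chi_m4(n):
    """Nonprincipal character modulo 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def parity(chi):
    """Return 'even' if chi(-1) = 1 and 'odd' if chi(-1) = -1."""
    value = chi(-1)
    if value == 1:
        return "even"
    if value == -1:
        return "odd"
    raise ValueError("chi(-1) must be 1 or -1 for a Dirichlet character")


print([parity(chi_mod5(j)) for j in range(4)])  # ['even', 'odd', 'even', 'odd']
print(parity(chi_m4))                           # odd
```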
The chapter's main output is the dictionary between reduced congruence classes and characters. Arithmetic progressions are additive-looking conditions, but on reduced residue classes they can be represented through multiplicative homomorphisms. The next chapter attaches Dirichlet series to these homomorphisms, turning the finite Fourier decomposition into analytic objects.
# 7. Dirichlet L-Functions and Euler Products
Dirichlet characters turn the zeta-function method into a tool for arithmetic progressions. The main objects are the Dirichlet $L$-functions $L(s,\chi)$, their Euler products, their logarithmic derivatives, and the elementary analytic continuation facts needed near $s=1$.
The guiding question is: how much of the analytic behaviour of $\zeta(s)$ survives after twisting by a Dirichlet character? Principal characters retain the pole of $\zeta(s)$, with finitely many Euler factors removed. Nonprincipal characters have enough cancellation in their partial sums to continue beyond the half-plane of absolute convergence, and this cancellation is the first analytic indication that primes are spread across the reduced residue classes modulo $q$.
## Dirichlet Series Attached to Characters
A Dirichlet character is already an arithmetic function, so the natural first question is what its Dirichlet generating function records. Since characters are periodic and multiplicative, the resulting series is both a Dirichlet series and a finite Fourier-like object on residue classes modulo $q$.
[definition: Dirichlet L-Function]
Let $q \in \mathbb N$, and let $\chi$ be a Dirichlet character modulo $q$. The Dirichlet $L$-function attached to $\chi$ is the function
\begin{align*}
L(\cdot,\chi):\{s\in\mathbb C: \operatorname{Re}(s)>1\} &\to \mathbb C,\\
s &\mapsto \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}.
\end{align*}
[/definition]
The restriction $\operatorname{Re}(s)>1$ is the first domain where no cancellation is needed: $|\chi(n)|\le 1$, so the series is dominated by $\sum n^{-\sigma}$ with $\sigma=\operatorname{Re}(s)$. Before using characters for cancellation near the line $\operatorname{Re}(s)=1$, we need a baseline result that the defining series is a legitimate analytic object in this initial half-plane.
[quotetheorem:4389]
[citeproof:4389]
This theorem is deliberately restricted to $\operatorname{Re}(s)>1$: at the boundary $s=1$ the comparison series becomes the divergent harmonic series, so the comparison argument no longer gives convergence. The hypothesis that $\chi$ is a Dirichlet character is used only through the bound $|\chi(n)|\le 1$; no arithmetic cancellation has entered yet. The next step asks what can replace absolute convergence when the coefficients are periodic and have average zero.
For nonprincipal characters, periodic cancellation replaces absolute convergence. The key input is that a nonprincipal character has mean zero over a complete residue system modulo $q$. To use this cancellation analytically, we need it in a cumulative form: partial sums must remain bounded instead of growing proportionally to their length. That boundedness is the bridge from finite orthogonality to convergence statements for Dirichlet series beyond the half-plane of absolute convergence.
[quotetheorem:4368]
[citeproof:4368]
The nonprincipal hypothesis is essential: for the principal character modulo $q$, the partial sums grow like a positive proportion of $x$, so no bounded estimate of this kind can hold. The theorem does not yet prove convergence of an $L$-series; it supplies the cancellation estimate that Abel summation will convert into convergence. Conceptually, this is the first place where arithmetic information about residue classes becomes analytic information about a complex function.
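The contrast between the two cases is easy to observe numerically. The sketch below, with illustrative function names `chi_m4` and `chi0_mod4` assumed here, accumulates partial sums of the two characters modulo $4$: the nonprincipal sums stay bounded, while the principal sums grow like $x/2$.

```python
from itertools import accumulate

# Sketch: partial sums of the two characters modulo 4.
# Function names are illustrative, not notation from the notes.


def chi_m4(n):
    """Nonprincipal character modulo 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def chi0_mod4(n):
    """Principal character modulo 4."""
    return 1 if n % 2 == 1 else 0


N = 1000
nonprincipal_sums = list(accumulate(chi_m4(n) for n in range(1, N + 1)))
principal_sums = list(accumulate(chi0_mod4(n) for n in range(1, N + 1)))

print(max(abs(s) for s in nonprincipal_sums))  # 1
print(principal_sums[-1])                      # 500
```

The nonprincipal partial sums cycle through $1,1,0,0$, so they return to $0$ at the end of every complete period.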
The bounded partial-sum estimate is the analytic reason that nonprincipal $L$-functions continue farther left than the defining series suggests. The remaining question is how to convert bounded sums of coefficients into convergence of the weighted series $\sum \chi(n)n^{-s}$. Abel summation supplies exactly that conversion: the powers $n^{-s}$ vary slowly enough when $\operatorname{Re}(s)>0$ that the bounded character sums control the tail.
[quotetheorem:4369]
[citeproof:4369]
The conclusion is weaker than absolute convergence: the series may converge only because the character values cancel over complete periods. The condition $\operatorname{Re}(s)>0$ is also tied to bounded partial sums; without that bound, Abel summation would leave an uncontrolled integral tail. This result separates the analytic behaviour of nonprincipal characters from the principal case, where the pole inherited from $\zeta(s)$ still has to be treated separately.
[example: Principal Character Series]
Let $\chi_0$ be the principal character modulo $q$, so $\chi_0(n)=1$ when $(n,q)=1$ and $\chi_0(n)=0$ otherwise. For $\operatorname{Re}(s)>1$, the Euler product for $\zeta(s)$ is absolutely convergent, and multiplying by the finite product over primes dividing $q$ gives
\begin{align*}
\zeta(s)\prod_{p\mid q}(1-p^{-s})
&=\left(\prod_p(1-p^{-s})^{-1}\right)\prod_{p\mid q}(1-p^{-s})\\
&=\left(\prod_{p\mid q}(1-p^{-s})^{-1}\right)
\left(\prod_{p\nmid q}(1-p^{-s})^{-1}\right)
\prod_{p\mid q}(1-p^{-s})\\
&=\prod_{p\nmid q}(1-p^{-s})^{-1}.
\end{align*}
For each prime $p\nmid q$,
\begin{align*}
(1-p^{-s})^{-1}=\sum_{k=0}^{\infty}p^{-ks},
\end{align*}
so absolute convergence permits multiplication of the local geometric series:
\begin{align*}
\prod_{p\nmid q}(1-p^{-s})^{-1}
&=\prod_{p\nmid q}\left(\sum_{k=0}^{\infty}p^{-ks}\right)\\
&=\sum_{\substack{n\ge 1\\(n,q)=1}}\frac{1}{n^s}.
\end{align*}
The last equality follows from unique factorisation: an integer is coprime to $q$ exactly when its prime factorisation uses no prime divisor of $q$, and such an integer contributes the term $\prod_p p^{-k_p s}=n^{-s}$. Therefore
\begin{align*}
L(s,\chi_0)
&=\sum_{n=1}^{\infty}\frac{\chi_0(n)}{n^s}\\
&=\sum_{\substack{n\ge 1\\(n,q)=1}}\frac{1}{n^s}\\
&=\zeta(s)\prod_{p\mid q}(1-p^{-s}).
\end{align*}
Thus the principal character series is exactly the zeta-function with the finitely many Euler factors at primes dividing $q$ removed.
[/example]
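A numerical sanity check of this identity is possible at $s=2$, where $\zeta(2)=\pi^2/6$ is known in closed form. The sketch below takes $q=6$ as an arbitrary test modulus; the helper `prime_divisors` and the truncation point `N` are assumptions made here. The truncated series omits a tail of size less than $1/N$, so agreement is only expected up to that error.

```python
from math import gcd, pi

# Sketch: compare a truncated principal-character series with
# zeta(2) * prod_{p | q} (1 - p^{-2}) for q = 6. Names are illustrative.


def prime_divisors(q):
    """Distinct prime divisors of q, found by trial division."""
    primes, p = [], 2
    while p * p <= q:
        if q % p == 0:
            primes.append(p)
            while q % p == 0:
                q //= p
        p += 1
    if q > 1:
        primes.append(q)
    return primes


q, s, N = 6, 2, 100_000

# Left side: sum of n^{-s} over n <= N coprime to q.
series = sum(n ** -s for n in range(1, N + 1) if gcd(n, q) == 1)

# Right side: zeta(2) with the Euler factors at primes dividing q removed.
zeta_side = pi ** 2 / 6
for p in prime_divisors(q):
    zeta_side *= 1 - p ** -s

print(series, zeta_side)
```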
This example shows why the principal character behaves like $\zeta(s)$: it differs only by finitely many local factors. Nonprincipal characters instead encode cancellation between residue classes.
## Euler Products and Prime Data
The next problem is to see how the prime factorisation of integers appears inside $L(s,\chi)$. A Dirichlet series does not automatically have an Euler product: the coefficients must interact correctly with unique factorisation, and rearranging infinitely many local factors requires absolute convergence. Complete multiplicativity of Dirichlet characters supplies the algebraic condition, while the half-plane $\operatorname{Re}(s)>1$ supplies the analytic condition. With both in place, the Euler product has the same formal shape as the product for $\zeta(s)$, with each prime weighted by $\chi(p)$.
[quotetheorem:4370]
[citeproof:4370]
The restriction $\operatorname{Re}(s)>1$ matters because outside this half-plane the product need not converge absolutely, so the passage from local factors to the global Dirichlet series is no longer justified by this argument. Complete multiplicativity is also essential: ordinary multiplicativity would still control coprime factors but would not force the local geometric series to have coefficients $\chi(p^k)=\chi(p)^k$. The product shows exactly how primes enter the $L$-function, and the next operation extracts that prime information coefficient by coefficient.
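To see the product and the series agree in a concrete case, the sketch below evaluates both at $s=2$ for the nonprincipal character modulo $4$, truncating each at an arbitrary bound. The names `chi_m4`, `primes_up_to`, and `BOUND` are assumptions introduced for illustration, and the two truncations agree only up to a small tail error.

```python
# Sketch: truncated Dirichlet series versus truncated Euler product for the
# nonprincipal character modulo 4 at s = 2. Names are illustrative.


def chi_m4(n):
    """Nonprincipal character modulo 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
    return [p for p in range(2, n + 1) if is_prime[p]]


s, BOUND = 2, 20_000

# Series side: sum of chi(n) / n^s for n <= BOUND.
series = sum(chi_m4(n) * n ** -s for n in range(1, BOUND + 1))

# Product side: prod over primes p <= BOUND of (1 - chi(p) p^{-s})^{-1}.
# The prime 2 divides the modulus, so chi(2) = 0 and its factor equals 1.
product = 1.0
for p in primes_up_to(BOUND):
    product *= 1 / (1 - chi_m4(p) * p ** -s)

print(series, product)
```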
The Euler product is the bridge from $L$-functions to primes. Taking a logarithmic derivative isolates prime powers and introduces the von Mangoldt function. This step is not merely formal: differentiating an infinite product requires locally uniform convergence of the logarithmic expansion, which is why the argument remains in $\operatorname{Re}(s)>1$.
[quotetheorem:4371]
[citeproof:4371]
This identity is the twisted analogue of the formula $-\zeta'(s)/\zeta(s)=\sum_{n\ge 1}\Lambda(n)n^{-s}$. It does not by itself locate primes in any single residue class; that requires character orthogonality to separate congruence classes after the twist has been introduced. The theorem is nevertheless the analytic form needed later, because information about primes is usually carried by $\Lambda(n)$ rather than by the unweighted prime indicator.
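Comparing coefficients in $-L'(s,\chi)=L(s,\chi)\sum_{n\ge 1}\Lambda(n)\chi(n)n^{-s}$ gives the finite identity $\chi(n)\log n=\sum_{d\mid n}\Lambda(d)\chi(d)\chi(n/d)$, which can be tested directly. The sketch below does this for the nonprincipal character modulo $4$; the helper names `von_mangoldt`, `chi_m4`, and `twisted_convolution` are assumptions made here, and floating-point logarithms are compared with a small tolerance.

```python
from math import isclose, log

# Sketch: coefficient form of the twisted logarithmic derivative identity.
# Helper names are illustrative, not notation from the notes.


def von_mangoldt(n):
    """Lambda(n): log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # no divisor up to sqrt(n), so n itself is prime


def chi_m4(n):
    """Nonprincipal character modulo 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def twisted_convolution(n):
    """Sum over divisors d of n of Lambda(d) * chi(d) * chi(n / d)."""
    return sum(
        von_mangoldt(d) * chi_m4(d) * chi_m4(n // d)
        for d in range(1, n + 1)
        if n % d == 0
    )


# Check chi(n) * log(n) against the convolution for small n.
ok = all(
    isclose(chi_m4(n) * log(n), twisted_convolution(n), abs_tol=1e-9)
    for n in range(1, 201)
)
print(ok)  # True
```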
[example: Local Factors for a Vanishing Character Value]
Suppose $p\mid q$ and $\chi$ is a Dirichlet character modulo $q$. Since $p\mid q$, we have $(p,q)\ge p>1$, so $p$ is not coprime to $q$. By the defining vanishing property of Dirichlet characters on non-coprime integers,
\begin{align*}
\chi(p)=0.
\end{align*}
Therefore the Euler factor attached to $p$ is
\begin{align*}
(1-\chi(p)p^{-s})^{-1}
&=(1-0\cdot p^{-s})^{-1}\\
&=(1-0)^{-1}\\
&=1^{-1}\\
&=1.
\end{align*}
Thus a prime divisor of the modulus contributes no nontrivial local factor to the Euler product. More generally, if $n$ has a prime divisor in common with $q$, then $(n,q)>1$, hence $\chi(n)=0$; this is the coefficient-side reason that primes dividing the modulus disappear from the product.
[/example]
The logarithmic derivative identity is the form used in prime number theorems for arithmetic progressions. It replaces the coefficient $\Lambda(n)$ for $\zeta(s)$ by the twisted coefficient $\Lambda(n)\chi(n)$.
## Behaviour Near s Equals One
The analytic question needed for arithmetic progressions is what happens at $s=1$. The zeta-function has a simple pole there; Dirichlet $L$-functions either inherit that pole in the principal case or remain holomorphic in the nonprincipal case.
[quotetheorem:4372]
[citeproof:4372]
The principal hypothesis is responsible for the pole: the finite Euler factors can remove only finitely many local contributions, not the singularity of $\zeta(s)$ at $s=1$. The residue records the density of integers coprime to $q$, namely $\varphi(q)/q$, rather than a new analytic phenomenon. This prepares the contrast with nonprincipal characters, where cancellation removes the pole altogether in the elementary half-plane considered here.
For nonprincipal characters, the Abel-summation argument already provides continuation to $\operatorname{Re}(s)>0$, so there is no pole at $s=1$ in this elementary range. The point still needs to be stated explicitly because the behaviour at $s=1$ is where principal and nonprincipal characters diverge most sharply. The following result isolates the regularity statement that later arguments will pair with the deeper question of whether the value at $1$ can vanish.
[quotetheorem:4373]
[citeproof:4373]
The theorem proves regularity at $s=1$, but it deliberately stops short of proving $L(1,\chi)\ne 0$. That nonvanishing statement is the deeper ingredient needed to prove Dirichlet's theorem for arithmetic progressions. The example below shows that in a basic nonprincipal case the value at $1$ is not only finite but explicitly positive.
[example: The Character Modulo Four]
Let $\chi_4$ be the nonprincipal character modulo $4$ (written $\chi_{-4}$ in the previous chapter), defined by $\chi_4(n)=0$ for even $n$, $\chi_4(n)=1$ for $n\equiv 1\pmod 4$, and $\chi_4(n)=-1$ for $n\equiv 3\pmod 4$. At $s=1$ the series converges only conditionally, so its terms may be grouped in consecutive blocks but may not be split into separate sums over residue classes: the sums $\sum_m 1/(4m+1)$ and $\sum_m 1/(4m+3)$ each diverge. Grouping each block of four consecutive integers gives
\begin{align*}
L(1,\chi_4)
&=\sum_{n=1}^{\infty}\frac{\chi_4(n)}{n}\\
&=\sum_{m=0}^{\infty}\left(\frac{\chi_4(4m+1)}{4m+1}
+\frac{\chi_4(4m+2)}{4m+2}
+\frac{\chi_4(4m+3)}{4m+3}
+\frac{\chi_4(4m+4)}{4m+4}\right)\\
&=\sum_{m=0}^{\infty}\left(\frac{1}{4m+1}
+\frac{0}{4m+2}
+\frac{-1}{4m+3}
+\frac{0}{4m+4}\right)\\
&=\sum_{m=0}^{\infty}\left(\frac{1}{4m+1}-\frac{1}{4m+3}\right)\\
&=1-\frac{1}{3}+\frac{1}{5}-\frac{1}{7}+\cdots.
\end{align*}
For $0\le t<1$, the geometric series gives
\begin{align*}
\frac{1}{1+t^2}=\sum_{m=0}^{\infty}(-1)^m t^{2m},
\end{align*}
and for each fixed $r$ with $0\le r<1$ the convergence is uniform on $0\le t\le r$, so integration term by term gives
\begin{align*}
\arctan r
&=\int_0^r\frac{dt}{1+t^2}\\
&=\int_0^r\sum_{m=0}^{\infty}(-1)^m t^{2m}\,dt\\
&=\sum_{m=0}^{\infty}(-1)^m\int_0^r t^{2m}\,dt\\
&=\sum_{m=0}^{\infty}(-1)^m\frac{r^{2m+1}}{2m+1}.
\end{align*}
The series $\sum_{m=0}^{\infty}(-1)^m/(2m+1)$ converges by the alternating series test, and *Abel's limiting theorem* gives
\begin{align*}
\sum_{m=0}^{\infty}\frac{(-1)^m}{2m+1}
&=\lim_{r\to 1^-}\sum_{m=0}^{\infty}(-1)^m\frac{r^{2m+1}}{2m+1}\\
&=\lim_{r\to 1^-}\arctan r\\
&=\arctan 1\\
&=\frac{\pi}{4}.
\end{align*}
Therefore
\begin{align*}
L(1,\chi_4)=\frac{\pi}{4},
\end{align*}
so this nonprincipal $L$-function is finite and positive at $s=1$.
[/example]
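The value $\pi/4$ can be observed numerically, although the convergence is slow. The sketch below sums the first `M` grouped blocks $1/(4m+1)-1/(4m+3)$; the function name `partial_L1` and the cutoff `M` are assumptions made here. Each block is positive, so the partial sums increase toward the limit, and the omitted tail is roughly $1/(8M)$.

```python
from math import pi

# Sketch: partial sums of L(1, chi_4) grouped in pairs, compared with pi / 4.
# The function name and cutoff are illustrative assumptions.


def partial_L1(M):
    """Sum of 1/(4m+1) - 1/(4m+3) for m = 0, 1, ..., M - 1."""
    return sum(1 / (4 * m + 1) - 1 / (4 * m + 3) for m in range(M))


M = 100_000
approx = partial_L1(M)

print(approx, pi / 4)
print(pi / 4 - approx)  # small positive gap, of order 1 / (8 * M)
```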
This example gives the first concrete value of a nonprincipal Dirichlet $L$-function at $1$. Its positivity is a special case of the general nonvanishing theorem $L(1,\chi)\ne 0$ for nonprincipal characters, which lies beyond the elementary continuation facts established in this chapter.
## What Remains for Arithmetic Progressions
What analytic data is still missing before primes can be counted in each reduced residue class modulo $q$? The principal character carries the pole at $s=1$, while nonprincipal characters are holomorphic there because their periodic sums cancel. The remaining question is whether a nonprincipal value $L(1,\chi)$ can vanish; Chapter 8 is devoted to ruling out exactly that obstruction.
# 8. Nonvanishing at $s=1$ and Dirichlet's Theorem
This chapter turns the analytic machinery of Dirichlet characters into the arithmetic statement that every reduced residue class contains infinitely many primes. Chapter 7 built Euler products and logarithmic expansions for $L(s,\chi)$ in the half-plane $\operatorname{Re}(s)>1$; here the point is to understand what happens as $s \to 1^+$. The whole argument has one decisive input: if $\chi$ is nonprincipal modulo $q$, then $L(1,\chi)\ne 0$.
## The Decisive Role of Nonvanishing
Why should a single value $L(1,\chi)$ control the existence of infinitely many primes in each arithmetic progression? The answer is that character orthogonality separates residue classes, while the principal character contributes exactly the logarithmic divergence needed to force primes to appear.
Let $q\in\mathbb N$, let $a\in\mathbb Z$ with $\gcd(a,q)=1$, and write
\begin{align*}
P_a(s) := \sum_{p\equiv a\pmod q} p^{-s}, \qquad s>1.
\end{align*}
The goal is to prove that $P_a(s)\to\infty$ as $s\to 1^+$: if only finitely many primes satisfied $p\equiv a\pmod q$, then $P_a(s)$ would be a finite sum and would stay bounded as $s\to 1^+$.
[definition: Principal Character]
Let $q\in\mathbb N$. The principal character modulo $q$ is the map $\chi_0:\mathbb Z\to\mathbb C$ defined by
\begin{align*}
n\mapsto \chi_0(n):=
\begin{cases}
1, & \gcd(n,q)=1,\\
0, & \gcd(n,q)>1.
\end{cases}
\end{align*}
[/definition]
The principal character is responsible for the pole of the corresponding $L$-function. The nonprincipal characters should contribute bounded correction terms at $s=1$; this is exactly where nonvanishing enters.
The obstruction is that the desired prime sum is not itself an $L$-function, while the analytic information is encoded in Euler products. We therefore need an estimate converting sums over primes weighted by a character into logarithms of Dirichlet $L$-functions, with an error small enough that the singular behavior at $s=1$ remains visible.
[quotetheorem:4374]
[citeproof:4374]
This theorem is the analytic bridge from Euler products to prime sums. Its limitation is that it only controls the first prime term up to a bounded error; by itself it cannot decide whether a prime sum diverges or converges. The next step supplies the arithmetic filter that extracts a single residue class before the nonvanishing theorem controls the nonprincipal logarithms.
[quotetheorem:4375]
[citeproof:4375]
The hypothesis $\gcd(a,q)=1$ is essential: if $\gcd(a,q)>1$, then any prime $p\equiv a\pmod q$ satisfies $\gcd(p,q)=\gcd(a,q)>1$, so $p$ divides $q$ and the class contains at most one prime. Thus the theorem is not a statement about all congruence classes, but about the unit group modulo $q$. Its role is to turn the problem into a finite linear combination of character sums, where the principal character carries the divergent part and the nonprincipal characters must be shown harmless.
Combining the two theorems gives
\begin{align*}
P_a(s)=\frac{1}{\varphi(q)}\overline{\chi_0(a)}\log L(s,\chi_0)
+\frac{1}{\varphi(q)}\sum_{\chi\ne\chi_0}\overline{\chi(a)}\log L(s,\chi)+O(1).
\end{align*}
The first term diverges like
\begin{align*}
\frac{1}{\varphi(q)}\log\frac{1}{s-1}.
\end{align*}
If every nonprincipal $L(1,\chi)$ is nonzero, then the remaining logarithms stay bounded near $s=1$. If some nonprincipal $L(1,\chi)$ vanished, its logarithm could contribute a negative singularity and might cancel the principal divergence in the residue-class average. The whole nonvanishing theorem exists to rule out exactly that cancellation.
[example: Isolating the Principal Divergence]
Take $q=5$ and $a=2$. The reduced residue classes modulo $5$ form a group of order $\varphi(5)=4$, and $2$ is a generator since
\begin{align*}
2^0\equiv 1,\qquad
2^1\equiv 2,\qquad
2^2\equiv 4,\qquad
2^3\equiv 3
\pmod 5.
\end{align*}
Thus each character modulo $5$ is determined by its value on $2$, and the possible values are $1,i,-1,-i$. Character orthogonality gives
\begin{align*}
P_2(s)
&=\sum_{p\equiv 2\pmod 5}p^{-s} \\
&=\frac14\sum_{\chi\bmod 5}\overline{\chi(2)}\sum_{p\ne 5}\chi(p)p^{-s}.
\end{align*}
For the principal character $\chi_0$, we have $\chi_0(2)=1$ and $\chi_0(p)=1$ for every prime $p\ne 5$, so its contribution is
\begin{align*}
\frac14\overline{\chi_0(2)}\sum_{p\ne 5}\chi_0(p)p^{-s}
&=\frac14\sum_{p\ne 5}p^{-s} \\
&=\frac14\left(\sum_p p^{-s}-5^{-s}\right).
\end{align*}
Since $\log \zeta(s)=\sum_p p^{-s}+O(1)$ as $s\to 1^+$ and $\zeta(s)$ has a simple pole at $s=1$, this term is
\begin{align*}
\frac14\log\frac{1}{s-1}+O(1).
\end{align*}
Now let $\psi$ be the character with $\psi(2)=i$. Then $\overline{\psi(2)}=-i$, and the part of $P_2(s)$ coming from $\psi$ is
\begin{align*}
\frac14\overline{\psi(2)}\sum_{p\ne 5}\psi(p)p^{-s}
&=-\frac{i}{4}\sum_{p\ne 5}\psi(p)p^{-s} \\
&=-\frac{i}{4}\bigl(\log L(s,\psi)+O(1)\bigr) \\
&=-\frac{i}{4}\log L(s,\psi)+O(1).
\end{align*}
If $L(1,\psi)\ne 0$, then $L(s,\psi)$ stays nonzero and finite for $s$ sufficiently close to $1$ from the right, so $\log L(s,\psi)$ remains bounded there. The same calculation for the characters with values $-1$ and $-i$ at $2$ gives only bounded terms, while the principal character contributes the unbounded term $\frac14\log\frac{1}{s-1}$.
[/example]
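The orthogonality step in the example above can be checked by direct computation. The following is a minimal sketch, assuming the four characters modulo $5$ are indexed by $j$ with $\chi_j(2)=i^j$; the names `character`, `CHARACTERS`, and `indicator` are hypothetical helpers introduced only for this illustration.

```python
Q = 5
GENERATOR = 2  # 2 generates (Z/5Z)^x, as computed in the example

# Discrete logarithm table: LOG[2^k mod 5] = k for k = 0, 1, 2, 3.
LOG = {pow(GENERATOR, k, Q): k for k in range(4)}

# Exact powers of i, so that no floating-point rounding enters.
POWERS_OF_I = [1 + 0j, 1j, -1 + 0j, -1j]


def character(j):
    """Return the character chi_j modulo 5 with chi_j(2) = i**j (hypothetical helper)."""
    def chi(n):
        r = n % Q
        if r == 0:
            return 0j
        return POWERS_OF_I[(j * LOG[r]) % 4]
    return chi


CHARACTERS = [character(j) for j in range(4)]


def indicator(n, a):
    """Orthogonality average (1/phi(5)) * sum over chi of conj(chi(a)) * chi(n)."""
    total = sum(chi(a).conjugate() * chi(n) for chi in CHARACTERS)
    return total / 4


# The average is 1 exactly when n is congruent to a modulo 5 (for a coprime to 5).
for n in (2, 3, 7, 10, 12):
    print(n, indicator(n, 2))
```

The average returns $1$ on integers congruent to $2$ modulo $5$ and $0$ elsewhere, which is the filter applied to the prime sum $P_2(s)$.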
## Nonvanishing for Real Characters
How can we rule out a zero at $s=1$ when the character is real-valued? The central idea is positivity: for real characters the relevant Euler products can be arranged so that a hypothetical zero would force an analytic function with nonnegative Dirichlet coefficients to be too small near its first singularity.
[definition: Real Dirichlet Character]
A real Dirichlet character modulo $q$ is a Dirichlet character $\chi:\mathbb Z\to\mathbb C$ such that
\begin{align*}
n\mapsto \chi(n)\in\{-1,0,1\}
\end{align*}
for every $n\in\mathbb Z$.
[/definition]
Real characters are either principal or quadratic after passing to their primitive part. Their values do not rotate around the unit circle, so the Euler product has a sign structure that can be used directly.
The obstruction at $s=1$ is a possible real zero: if $L(1,\chi)$ vanished for a nonprincipal real character, the logarithmic prime-sum argument would lose control of one of its correction terms. The positivity available for real-valued characters is the tool that prevents this obstruction.
[quotetheorem:4376]
[citeproof:4376]
The real-valued hypothesis is what makes the coefficients $a_n$ nonnegative; without it, cancellations in the Dirichlet series would invalidate the Landau argument. The nonprincipal hypothesis is also necessary, because the principal character has $L(s,\chi_0)$ with a pole at $s=1$ rather than a nonzero finite value. This theorem handles exactly the case where positivity is available, and the next section replaces positivity of coefficients by a logarithmic inequality for complex characters.
[example: Why Positivity Is Needed]
For a genuinely complex character $\chi$, the coefficients in $\zeta(s)L(s,\chi)$ need not be nonnegative real numbers. Suppose, for instance, that $\chi(p)=i$ at some prime $p$. The Euler factor at $p$ in the product for $\zeta(s)L(s,\chi)$ is
\begin{align*}
(1-p^{-s})^{-1}(1-\chi(p)p^{-s})^{-1}
&=(1+p^{-s}+p^{-2s}+\cdots)(1+\chi(p)p^{-s}+\chi(p)^2p^{-2s}+\cdots).
\end{align*}
The coefficient of $p^{-s}$ comes from choosing $p^{-s}$ from the first factor and $1$ from the second, or choosing $1$ from the first factor and $\chi(p)p^{-s}$ from the second. Hence that coefficient is
\begin{align*}
1+\chi(p)=1+i.
\end{align*}
Since $1+i$ is not a real number, it is not a nonnegative real coefficient. Thus the real-character proof cannot be copied directly in the complex case: the Dirichlet series for $\zeta(s)L(s,\chi)$ no longer has the nonnegative coefficients on which the Landau argument depends.
[/example]
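The contrast between real and complex characters can be seen by computing the coefficients of $\zeta(s)L(s,\chi)$, which are the divisor sums $\sum_{d\mid n}\chi(d)$. The sketch below is only an illustration: it uses the real character modulo $4$ and, as an assumed sample complex character, the character modulo $5$ with value $i$ at $2$. The function names are hypothetical.

```python
def chi4(n):
    """Real nonprincipal character modulo 4."""
    return {0: 0, 1: 1, 2: 0, 3: -1}[n % 4]


POWERS_OF_I = [1, 1j, -1, -1j]
LOG5 = {1: 0, 2: 1, 4: 2, 3: 3}  # discrete logarithm to base 2 modulo 5


def psi5(n):
    """Complex character modulo 5 with psi5(2) = i (sample choice for this sketch)."""
    r = n % 5
    return 0 if r == 0 else POWERS_OF_I[LOG5[r]]


def product_coefficient(chi, n):
    """Coefficient of n^{-s} in zeta(s) * L(s, chi): the sum of chi(d) over divisors d of n."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)


# Real character: every coefficient in this range is a nonnegative integer.
real_coeffs = [product_coefficient(chi4, n) for n in range(1, 51)]

# Complex character: the coefficient at the prime 2 is 1 + chi(2) = 1 + i.
complex_coeff_at_2 = product_coefficient(psi5, 2)

print(min(real_coeffs), complex_coeff_at_2)
```

A finite range of coefficients proves nothing in general, but it shows concretely where the sign structure is lost.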
The character modulo $4$ gives a concrete boundary value: its $L$-series at $s=1$ is the classical odd alternating harmonic series.
[example: The Character Modulo Four]
Let $\chi_4:\mathbb Z\to\mathbb C$ be defined by $\chi_4(n)=0$ for even $n$, $\chi_4(n)=1$ for $n\equiv 1\pmod 4$, and $\chi_4(n)=-1$ for $n\equiv 3\pmod 4$. At $s=1$, its Dirichlet series is the odd alternating harmonic series:
\begin{align*}
L(1,\chi_4)
&=\sum_{n=1}^{\infty}\frac{\chi_4(n)}{n} \\
&=\sum_{k=0}^{\infty}\frac{\chi_4(4k+1)}{4k+1}
+\sum_{k=0}^{\infty}\frac{\chi_4(4k+3)}{4k+3} \\
&=\sum_{k=0}^{\infty}\frac{1}{4k+1}
-\sum_{k=0}^{\infty}\frac{1}{4k+3} \\
&=\sum_{k=0}^{\infty}\left(\frac{1}{4k+1}-\frac{1}{4k+3}\right) \\
&=1-\frac13+\frac15-\frac17+\cdots .
\end{align*}
To evaluate this series, use the geometric expansion
\begin{align*}
\frac{1}{1+t^2}=\sum_{k=0}^{\infty}(-1)^k t^{2k}, \qquad |t|<1.
\end{align*}
For $0<r<1$, the series converges uniformly on $[0,r]$, so termwise integration gives
\begin{align*}
\arctan r
&=\int_0^r \frac{1}{1+t^2}\,dt \\
&=\int_0^r \sum_{k=0}^{\infty}(-1)^k t^{2k}\,dt \\
&=\sum_{k=0}^{\infty}(-1)^k\int_0^r t^{2k}\,dt \\
&=\sum_{k=0}^{\infty}(-1)^k\frac{r^{2k+1}}{2k+1}.
\end{align*}
Letting $r\to 1^-$ and using *Abel's theorem for power series* on the convergent alternating series,
\begin{align*}
\arctan 1
&=\sum_{k=0}^{\infty}\frac{(-1)^k}{2k+1} \\
&=1-\frac13+\frac15-\frac17+\cdots .
\end{align*}
Since $\arctan 1=\pi/4$, we obtain
\begin{align*}
L(1,\chi_4)=\frac{\pi}{4}\ne 0.
\end{align*}
Thus this particular real character is visibly nonvanishing at $1$, while the general nonvanishing theorem gives the same conclusion for every nonprincipal real character without requiring an explicit evaluation.
[/example]
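The value $\pi/4$ can be sanity-checked numerically. The sketch below sums the first $N$ terms of the series and compares the result with $\pi/4$, using the standard alternating-series bound (the error is at most the first omitted term). The name `leibniz_partial_sum` and the choice $N=100{,}000$ are illustrative assumptions.

```python
import math


def leibniz_partial_sum(terms):
    """Sum of (-1)^k / (2k + 1) for k = 0, ..., terms - 1."""
    return sum((-1) ** k / (2 * k + 1) for k in range(terms))


N = 100_000
TARGET = math.pi / 4
approx = leibniz_partial_sum(N)

# Alternating series with decreasing terms: the error is at most the first omitted term.
error_bound = 1 / (2 * N + 1)

print(approx, TARGET, abs(approx - TARGET) <= error_bound)
```

The slow convergence is typical at the boundary $s=1$: the series converges only conditionally, so many terms are needed for a few digits.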
## Nonvanishing for Complex Characters
What changes when the character takes genuinely complex values? Positivity is no longer available for $L(s,\chi)$ alone, so the argument pairs $\chi$ with its conjugate and studies logarithmic singularities.
[definition: Conjugate Character]
Let $\chi:\mathbb Z\to\mathbb C$ be a Dirichlet character modulo $q$. Its conjugate character is the map $\overline{\chi}:\mathbb Z\to\mathbb C$ defined by
\begin{align*}
n\mapsto \overline{\chi}(n):=\overline{\chi(n)}.
\end{align*}
[/definition]
The functions $L(s,\chi)$ and $L(s,\overline{\chi})$ have conjugate Euler products for real $s>1$. Hence a zero at $s=1$ for one forces a zero at $s=1$ for the other.
For genuinely complex characters, direct positivity is unavailable because the coefficients are no longer real, let alone nonnegative. The remaining obstruction is again a possible zero at $s=1$, but it must be ruled out by comparing the character with its conjugate so that the complex oscillation can be converted into a real inequality.
[quotetheorem:4377]
[citeproof:4377]
The non-real hypothesis separates this case from the real quadratic case, where squaring the character returns the principal character and a separate pole analysis is needed. That boundary case is exactly why the real-character theorem was proved first. Conjugation matters because it lets the proof pair $\chi$ with $\overline{\chi}$ and turn complex oscillation into a real nonnegative comparison near $s=1$.
[remark: Primitive and Imprimitive Characters]
If $\chi$ modulo $q$ is induced by a primitive character $\chi^*$ modulo $q^*$, then
\begin{align*}
L(s,\chi)=L(s,\chi^*)\prod_{p\mid q}\left(1-\chi^*(p)p^{-s}\right).
\end{align*}
The finite product is nonzero at $s=1$, so nonvanishing at $1$ is unchanged by passing between a character and its primitive inducing character.
[/remark]
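The factorization in the remark can be tested on a small case. The sketch below assumes, as an illustrative choice not taken from the text, the character modulo $12$ induced by $\chi_4$, and compares truncated Dirichlet series at $s=2$, where convergence is absolute. All names are hypothetical.

```python
import math


def chi4(n):
    """Primitive real character modulo 4."""
    return {0: 0, 1: 1, 2: 0, 3: -1}[n % 4]


def chi12(n):
    """Character modulo 12 induced by chi4: agrees with chi4 on n coprime to 12, else 0."""
    return chi4(n) if math.gcd(n, 12) == 1 else 0


def partial_L(chi, s, terms):
    """Truncated Dirichlet series: sum of chi(n) / n^s for n = 1, ..., terms."""
    return sum(chi(n) / n ** s for n in range(1, terms + 1))


S = 2.0
TERMS = 200_000

lhs = partial_L(chi12, S, TERMS)
# Primes dividing 12 are 2 and 3; chi4(2) = 0, so only the factor at p = 3 is nontrivial.
rhs = partial_L(chi4, S, TERMS) * (1 - chi4(3) * 3 ** (-S))

# At s = 1 the same correction factor equals 1 + 1/3, which is nonzero.
factor_at_one = 1 - chi4(3) / 3

print(lhs, rhs, factor_at_one)
```

Both truncations have tails of size at most about $1/\text{TERMS}$, so agreement to several decimal places is the expected outcome rather than a proof.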
## Dirichlet's Theorem on Primes in Arithmetic Progressions
How does nonvanishing become an infinitude theorem for primes? We now return to the residue-class prime sum and show that it diverges for every reduced class.
[quotetheorem:1625]
[citeproof:1625]
This proof does not estimate how many such primes there are up to $x$; it proves infinitude through divergence of a weighted prime sum. Chapter 9 replaces this divergence statement by asymptotics for the weighted functions $\psi(x;q,a)$ and then for $\pi(x;q,a)$.
[example: Infinitely Many Primes Congruent to One Modulo Four]
For $q=4$, the reduced residue classes are $1$ and $3$, so $\varphi(4)=2$. The two characters modulo $4$ are the principal character $\chi_0$ and the real character $\chi_4$, and
\begin{align*}
\chi_0(1)=1,\qquad \chi_4(1)=1.
\end{align*}
Applying character orthogonality to the class $1\pmod 4$ gives, for $s>1$,
\begin{align*}
\sum_{p\equiv 1\pmod 4}p^{-s}
&=\frac{1}{2}\sum_{\chi\bmod 4}\overline{\chi(1)}\sum_{p\nmid 4}\chi(p)p^{-s} \\
&=\frac{1}{2}\overline{\chi_0(1)}\sum_{p\ne 2}\chi_0(p)p^{-s}
+\frac{1}{2}\overline{\chi_4(1)}\sum_{p\ne 2}\chi_4(p)p^{-s} \\
&=\frac{1}{2}\sum_{p\ne 2}p^{-s}
+\frac{1}{2}\sum_{p\ne 2}\chi_4(p)p^{-s}.
\end{align*}
For the principal term,
\begin{align*}
\frac{1}{2}\sum_{p\ne 2}p^{-s}
&=\frac{1}{2}\left(\sum_p p^{-s}-2^{-s}\right).
\end{align*}
By the *Logarithmic Prime Expansion for Characters* applied to the principal character,
\begin{align*}
\sum_p p^{-s}=\log \zeta(s)+O(1),
\end{align*}
and since $\zeta(s)$ has a simple pole at $s=1$,
\begin{align*}
\log\zeta(s)=\log\frac{1}{s-1}+O(1)
\end{align*}
as $s\to 1^+$. Also $2^{-s}$ remains bounded near $s=1$, so
\begin{align*}
\frac{1}{2}\sum_{p\ne 2}p^{-s}
&=\frac{1}{2}\log\frac{1}{s-1}+O(1).
\end{align*}
For the nonprincipal term, the same logarithmic prime expansion gives
\begin{align*}
\sum_{p\ne 2}\chi_4(p)p^{-s}
&=\log L(s,\chi_4)+O(1).
\end{align*}
The earlier computation $L(1,\chi_4)=\pi/4\ne 0$ shows that $L(s,\chi_4)$ stays finite and nonzero for real $s$ sufficiently close to $1$, so $\log L(s,\chi_4)$ remains bounded there. Hence
\begin{align*}
\sum_{p\equiv 1\pmod 4}p^{-s}
&=\frac{1}{2}\log\frac{1}{s-1}+O(1),
\end{align*}
which diverges to $+\infty$ as $s\to 1^+$. If there were only finitely many primes $p\equiv 1\pmod 4$, then $\sum_{p\equiv 1\pmod 4}p^{-s}$ would tend to the finite sum $\sum_{p\equiv 1\pmod 4}p^{-1}$ as $s\to 1^+$, a contradiction. Therefore infinitely many primes are congruent to $1$ modulo $4$.
[/example]
The other reduced class modulo $4$ uses the same two characters, but the nonprincipal contribution appears with the opposite sign. This comparison is a useful check that the principal divergence is shared equally between the two reduced classes.
[example: Infinitely Many Primes Congruent to Three Modulo Four]
For $q=4$, the reduced residue classes are $1$ and $3$, and the two characters modulo $4$ are $\chi_0$ and $\chi_4$. Since
\begin{align*}
\chi_0(3)=1,\qquad \chi_4(3)=-1,
\end{align*}
we have
\begin{align*}
\overline{\chi_0(3)}=1,\qquad \overline{\chi_4(3)}=-1.
\end{align*}
Applying *Character Isolation of a Residue Class* to the class $3\pmod 4$ gives, for $s>1$,
\begin{align*}
\sum_{p\equiv 3\pmod 4}p^{-s}
&=\frac{1}{2}\sum_{\chi\bmod 4}\overline{\chi(3)}\sum_{p\nmid 4}\chi(p)p^{-s} \\
&=\frac{1}{2}\overline{\chi_0(3)}\sum_{p\ne 2}\chi_0(p)p^{-s}
+\frac{1}{2}\overline{\chi_4(3)}\sum_{p\ne 2}\chi_4(p)p^{-s} \\
&=\frac{1}{2}\sum_{p\ne 2}p^{-s}
-\frac{1}{2}\sum_{p\ne 2}\chi_4(p)p^{-s}.
\end{align*}
For the principal term,
\begin{align*}
\frac{1}{2}\sum_{p\ne 2}p^{-s}
&=\frac{1}{2}\left(\sum_p p^{-s}-2^{-s}\right).
\end{align*}
By the *Logarithmic Prime Expansion for Characters* applied to the principal character,
\begin{align*}
\sum_p p^{-s}=\log\zeta(s)+O(1),
\end{align*}
and the simple pole of $\zeta(s)$ at $s=1$ gives
\begin{align*}
\log\zeta(s)=\log\frac{1}{s-1}+O(1)
\end{align*}
as $s\to 1^+$. Since $2^{-s}$ is bounded near $s=1$,
\begin{align*}
\frac{1}{2}\sum_{p\ne 2}p^{-s}
&=\frac{1}{2}\log\frac{1}{s-1}+O(1).
\end{align*}
For the nonprincipal term, the same logarithmic prime expansion gives
\begin{align*}
\sum_{p\ne 2}\chi_4(p)p^{-s}
&=\log L(s,\chi_4)+O(1).
\end{align*}
The earlier evaluation $L(1,\chi_4)=\pi/4\ne 0$ implies that $L(s,\chi_4)$ remains finite and nonzero for real $s$ sufficiently close to $1$, so $\log L(s,\chi_4)$ is bounded there. Therefore
\begin{align*}
\sum_{p\equiv 3\pmod 4}p^{-s}
&=\frac{1}{2}\log\frac{1}{s-1}+O(1),
\end{align*}
which tends to $+\infty$ as $s\to 1^+$. If only finitely many primes satisfied $p\equiv 3\pmod 4$, then $\sum_{p\equiv 3\pmod 4}p^{-s}$ would tend to the finite sum $\sum_{p\equiv 3\pmod 4}p^{-1}$ as $s\to 1^+$, contradicting the divergence above. Hence infinitely many primes are congruent to $3$ modulo $4$, and the sign change in the $\chi_4$ term shows how the same principal divergence is shared by the two reduced residue classes modulo $4$.
[/example]
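The two examples modulo $4$ can be accompanied by a finite computation. The sketch below counts primes up to a cutoff in the classes $1$ and $3$ modulo $4$ and forms the truncated sums $\sum_{p\le x,\ p\equiv a}p^{-1}$. The cutoff $10^5$ and all names are illustrative assumptions, and a finite computation cannot establish divergence; it only shows the two classes growing at a comparable rate.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes p <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross out multiples of p, starting from p * p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


LIMIT = 100_000
PRIMES = primes_up_to(LIMIT)

# Number of primes in each reduced class modulo 4 (the prime 2 lies in neither class).
count = {a: sum(1 for p in PRIMES if p % 4 == a) for a in (1, 3)}

# Truncated versions of the residue-class prime sum at s = 1.
prime_sum = {a: sum(1 / p for p in PRIMES if p % 4 == a) for a in (1, 3)}

print(count, prime_sum)
```

The difference between the two truncated sums is a truncation of the bounded $\chi_4$ term, while their common growth reflects the shared principal divergence.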
## What This Argument Establishes
What has been gained compared with Euclid-style arguments for special progressions? The proof works uniformly for every reduced residue class modulo every positive integer $q$, and its only class-specific input is the finite character table of $(\mathbb Z/q\mathbb Z)^\times$.
[remark: Infinitude Versus Asymptotic Distribution]
Dirichlet's theorem says that each reduced residue class modulo $q$ contains infinitely many primes. It does not by itself prove
\begin{align*}
\pi(x;q,a)\sim \frac{1}{\varphi(q)}\frac{x}{\log x}.
\end{align*}
That stronger statement requires analytic continuation and zero-free information for Dirichlet $L$-functions in a region near the line $\operatorname{Re}(s)=1$.
[/remark]
Euler products turn primes into logarithms, characters isolate residue classes, and nonvanishing at $s=1$ prevents the nonprincipal characters from cancelling the principal pole. To strengthen this qualitative conclusion into an asymptotic distribution, Chapter 9 combines the same character decomposition with the Tauberian framework used for the ordinary prime number theorem.
# 9. Prime Number Theorem in Arithmetic Progressions
This chapter applies the analytic machinery of Dirichlet $L$-functions to the distribution of primes in reduced residue classes. The ordinary prime number theorem in Chapter 5 comes from the pole of $\zeta(s)$ at $s=1$ and the absence of zeros on the boundary line $\operatorname{Re}(s)=1$. For arithmetic progressions, Dirichlet characters separate the congruence condition, and the corresponding $L$-functions supply the same analytic input class by class.
## Detecting a Residue Class with Characters
How can an analytic function see the congruence condition $n \equiv a \pmod q$? The key is that characters modulo $q$ form an orthogonal basis for functions on the finite abelian group $(\mathbb Z/q\mathbb Z)^\times$, so a congruence restriction may be written as an average over characters.
[definition: Generalized Chebyshev Function]
Let $\chi$ be a Dirichlet character modulo $q$. The generalized Chebyshev function attached to $\chi$ is the map
\begin{align*}
\psi_\chi : [1,\infty) &\longrightarrow \mathbb C,\\
x &\longmapsto \psi(x,\chi) := \sum_{n \le x} \Lambda(n)\chi(n).
\end{align*}
[/definition]
The function $\psi(x,\chi)$ is the character-twisted analogue of the Chebyshev function $\psi(x)=\sum_{n\le x}\Lambda(n)$. It packages primes and prime powers with a phase depending only on their residue classes modulo $q$.
To state a prime number theorem for one residue class, however, we also need the untwisted target quantity: the weighted count of prime powers that lie in a fixed congruence class. This is the object that character orthogonality will later decompose into twisted sums.
[definition: Chebyshev Function in an Arithmetic Progression]
Let $q \in \mathbb N$ and let $a \in \mathbb Z$ satisfy $\gcd(a,q)=1$. The Chebyshev function in the arithmetic progression $a \pmod q$ is the map
\begin{align*}
\psi_{q,a} : [1,\infty) &\longrightarrow \mathbb R,\\
x &\longmapsto \psi(x;q,a) := \sum_{\substack{n \le x \\ n \equiv a \pmod q}} \Lambda(n).
\end{align*}
[/definition]
This is the weighted prime-counting function in the residue class $a \pmod q$. The restriction $\gcd(a,q)=1$ is natural: if $\gcd(a,q)>1$, every prime power $n\equiv a\pmod q$ shares a factor with $q$, so it is a power of a prime dividing $q$, and the class contains at most one prime.
The analytic difficulty is that the congruence condition $n \equiv a \pmod q$ is not itself a Dirichlet coefficient of a single $L$-function. To bring $\psi(x;q,a)$ into the same contour-integral framework as the untwisted Chebyshev function, the residue-class indicator must be rewritten as a finite linear combination of character values. Character orthogonality supplies exactly that conversion, turning the arithmetic progression problem into twisted sums $\psi(x,\chi)$ attached to Dirichlet $L$-functions.
[quotetheorem:4378]
[citeproof:4378]
The decomposition isolates the problem into one principal character and finitely many nonprincipal characters. The hypothesis $\gcd(a,q)=1$ is essential: for example, if $q=4$ and $a=2$, every character modulo $4$ vanishes on even integers, so the character average cannot detect the prime power terms in the class $2\pmod 4$. The theorem is only an exact algebraic identity; it gives no estimate for the size of any summand. Its role is to reduce the analytic work to proving that the principal character supplies the main term and that the nonprincipal characters contribute only lower-order oscillation.
[example: Modulus Three Decomposition]
Modulo $3$, the reduced residue classes are $1$ and $2$, so $\varphi(3)=2$. The two characters are the principal character $\chi_0$, with $\chi_0(1)=\chi_0(2)=1$, and the nonprincipal real character $\chi$, with $\chi(1)=1$ and $\chi(2)=-1$. Hence the character decomposition gives
\begin{align*}
\psi(x;3,1)
&=\frac{1}{2}\left(\overline{\chi_0(1)}\psi(x,\chi_0)+\overline{\chi(1)}\psi(x,\chi)\right)\\
&=\frac{1}{2}\left(1\cdot \psi(x,\chi_0)+1\cdot \psi(x,\chi)\right)\\
&=\frac{1}{2}\bigl(\psi(x,\chi_0)+\psi(x,\chi)\bigr),
\end{align*}
and
\begin{align*}
\psi(x;3,2)
&=\frac{1}{2}\left(\overline{\chi_0(2)}\psi(x,\chi_0)+\overline{\chi(2)}\psi(x,\chi)\right)\\
&=\frac{1}{2}\left(1\cdot \psi(x,\chi_0)+(-1)\cdot \psi(x,\chi)\right)\\
&=\frac{1}{2}\bigl(\psi(x,\chi_0)-\psi(x,\chi)\bigr).
\end{align*}
Writing
\begin{align*}
A(x)=\psi(x;3,1), \qquad B(x)=\psi(x;3,2),
\end{align*}
the two displayed identities imply
\begin{align*}
A(x)+B(x)
&=\frac{1}{2}\bigl(\psi(x,\chi_0)+\psi(x,\chi)\bigr)
+\frac{1}{2}\bigl(\psi(x,\chi_0)-\psi(x,\chi)\bigr)\\
&=\psi(x,\chi_0),
\end{align*}
and
\begin{align*}
A(x)-B(x)
&=\frac{1}{2}\bigl(\psi(x,\chi_0)+\psi(x,\chi)\bigr)
-\frac{1}{2}\bigl(\psi(x,\chi_0)-\psi(x,\chi)\bigr)\\
&=\psi(x,\chi).
\end{align*}
Thus $A(x)\sim x/2$ and $B(x)\sim x/2$ imply $\psi(x,\chi_0)=A(x)+B(x)\sim x$ and $\psi(x,\chi)=A(x)-B(x)=o(x)$. Conversely, if $\psi(x,\chi_0)\sim x$ and $\psi(x,\chi)=o(x)$, then
\begin{align*}
\psi(x;3,1)
&=\frac{1}{2}\bigl(\psi(x,\chi_0)+\psi(x,\chi)\bigr)
=\frac{1}{2}\bigl(x+o(x)+o(x)\bigr)
\sim \frac{x}{2},
\end{align*}
and the same calculation with the minus sign gives $\psi(x;3,2)\sim x/2$. The principal character supplies the shared main term, while the nonprincipal character measures exactly the lower-order imbalance between the two classes.
[/example]
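The modulus-$3$ identities are exact for every $x$, so they can be verified by brute force. The sketch below is a minimal check with a naive von Mangoldt function; the cutoff $x=2000$ and the function names are assumptions made for illustration.

```python
import math


def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            # p is the smallest prime factor; strip it and see whether anything is left.
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # no factor up to sqrt(n), so n is prime


def chi0(n):
    """Principal character modulo 3."""
    return 0 if n % 3 == 0 else 1


def chi(n):
    """Nonprincipal real character modulo 3."""
    return {0: 0, 1: 1, 2: -1}[n % 3]


def psi_twisted(x, character):
    """psi(x, chi) = sum of Lambda(n) * chi(n) over n <= x."""
    return sum(von_mangoldt(n) * character(n) for n in range(1, x + 1))


def psi_progression(x, q, a):
    """psi(x; q, a) = sum of Lambda(n) over n <= x with n congruent to a modulo q."""
    return sum(von_mangoldt(n) for n in range(1, x + 1) if n % q == a % q)


X = 2000
A = psi_progression(X, 3, 1)
B = psi_progression(X, 3, 2)
P0 = psi_twisted(X, chi0)
P1 = psi_twisted(X, chi)

print(A, (P0 + P1) / 2)
print(B, (P0 - P1) / 2)
```

Agreement up to floating-point rounding is expected here because the decomposition is an algebraic identity, not an estimate.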
## Logarithmic Derivatives and Boundary Nonvanishing
What analytic property of $L(s,\chi)$ forces cancellation in $\psi(x,\chi)$? As in the proof of the ordinary prime number theorem, the relevant Dirichlet series is the logarithmic derivative of the Euler product. Its singularities record zeros and poles of $L(s,\chi)$, so boundary nonvanishing at $\operatorname{Re}(s)=1$ is the analytic condition that prevents an additional main term.
[quotetheorem:4379]
[citeproof:4379]
The identity ties the summatory function $\psi(x,\chi)$ to the analytic continuation of $-L'/L$. The half-plane condition $\operatorname{Re}(s)>1$ is needed for the Euler product and termwise differentiation; at $s=1$ the series for the principal character already diverges, reflecting a main term of size $x$ in $\psi(x,\chi_0)$. The formula by itself does not prove continuation, nonvanishing, or any asymptotic estimate. It identifies the analytic object whose boundary singularities must be controlled next: a pole of $-L'/L$ at $s=1$ produces a term of size $x$, while holomorphy on the boundary line gives enough cancellation for $o(x)$.
[remark: Boundary Nonvanishing for the Characters Used Below]
For the prime number theorem in arithmetic progressions, the boundary assertion is proved for primitive nonprincipal characters and then transferred to imprimitive ones. If $\chi$ is induced by a primitive character $\chi^*$, then $L(s,\chi)$ differs from $L(s,\chi^*)$ by the finitely many Euler factors $1-\chi^*(p)p^{-s}$ at primes $p$ dividing the modulus. Each such factor satisfies $|\chi^*(p)p^{-s}|\le p^{-1}<1$ on the line $\operatorname{Re}(s)=1$, so it is holomorphic and nonzero there; its zeros, when $\chi^*(p)\ne 0$, lie on the imaginary axis $\operatorname{Re}(s)=0$. Consequently $L(s,\chi)$ and $L(s,\chi^*)$ have the same zeros on the line $\operatorname{Re}(s)=1$, and boundary nonvanishing transfers from $\chi^*$ to $\chi$ without change. The finite factors therefore create no pole of the logarithmic derivative on the boundary line; the primitive nonprincipal part supplies the genuine boundary nonvanishing input.
[/remark]
This boundary input is qualitative: it gives no explicit zero-free region. For a fixed modulus $q$, qualitative control is enough because only finitely many characters occur.
[remark: Principal Character Factor]
If $\chi_0$ is the principal character modulo $q$, then
\begin{align*}
L(s,\chi_0)=\sum_{\gcd(n,q)=1}\frac{1}{n^s}=\zeta(s)\prod_{p\mid q}(1-p^{-s}).
\end{align*}
Thus the pole at $s=1$ has residue $\varphi(q)/q$, reflecting the density of integers coprime to $q$ among all positive integers.
[/remark]
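The residue in the remark comes from evaluating the finite product at $s=1$, where $\prod_{p\mid q}(1-p^{-1})=\varphi(q)/q$. The sketch below checks this identity in exact rational arithmetic for small moduli; the helper names and the range of $q$ are illustrative assumptions.

```python
from fractions import Fraction
from math import gcd


def prime_divisors(q):
    """Distinct prime divisors of q, found by trial division."""
    primes, p = [], 2
    while p * p <= q:
        if q % p == 0:
            primes.append(p)
            while q % p == 0:
                q //= p
        p += 1
    if q > 1:
        primes.append(q)
    return primes


def euler_phi(q):
    """Euler's totient, counted directly from the definition."""
    return sum(1 for n in range(1, q + 1) if gcd(n, q) == 1)


def residue_factor(q):
    """Value at s = 1 of the finite product over p | q of (1 - p^{-s})."""
    result = Fraction(1)
    for p in prime_divisors(q):
        result *= 1 - Fraction(1, p)
    return result


for q in (4, 5, 12, 30):
    print(q, residue_factor(q), Fraction(euler_phi(q), q))
```

Since $\zeta(s)$ has residue $1$ at $s=1$, this factor is exactly the residue of $L(s,\chi_0)$ there.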
## Twisted Prime Number Theorems
How does boundary nonvanishing become an asymptotic formula for $\psi(x,\chi)$? The answer is the same Tauberian bridge used for the classical prime number theorem: analytic continuation of a logarithmic derivative to the closed half-plane $\operatorname{Re}(s)\ge 1$, with controlled singularities on the boundary line, determines the first-order growth of its coefficients.
[quotetheorem:4381]
[citeproof:4381]
The point is that the pole at $s=1$ occurs only for the principal character. The distinction between principal and nonprincipal characters is essential: treating $\chi_0$ as if it had cancellation would contradict the ordinary prime number theorem, since $\psi(x,\chi_0)$ differs from $\psi(x)$ only by the powers of the finitely many primes dividing $q$, a contribution of size $O(\log x)$ for fixed $q$. The result is still qualitative for fixed $q$; it gives no error term and no uniformity in the conductor. It is exactly the input needed for the character average, because every nonprincipal character has cancellation strong enough to disappear in the first-order asymptotic.
[example: Main Term and Cancellation]
Fix $q$ and $a$ with $\gcd(a,q)=1$. Since $a$ is coprime to $q$, the principal character satisfies $\chi_0(a)=1$, so $\overline{\chi_0(a)}=1$. By the *Character Decomposition of Arithmetic Progression Chebyshev Function*,
\begin{align*}
\psi(x;q,a)
&=\frac{1}{\varphi(q)}\sum_{\chi\bmod q}\overline{\chi(a)}\,\psi(x,\chi)\\
&=\frac{1}{\varphi(q)}\overline{\chi_0(a)}\,\psi(x,\chi_0)
+\frac{1}{\varphi(q)}\sum_{\substack{\chi\bmod q\\ \chi\ne\chi_0}}\overline{\chi(a)}\,\psi(x,\chi)\\
&=\frac{1}{\varphi(q)}\psi(x,\chi_0)
+\frac{1}{\varphi(q)}\sum_{\substack{\chi\bmod q\\ \chi\ne\chi_0}}\overline{\chi(a)}\,\psi(x,\chi).
\end{align*}
By the *Twisted Prime Number Theorem for Characters*, the principal character term satisfies $\psi(x,\chi_0)\sim x$, so there is a function $\varepsilon_0(x)\to 0$ such that
\begin{align*}
\psi(x,\chi_0)=x+x\varepsilon_0(x).
\end{align*}
For each nonprincipal character $\chi$, the same theorem gives $\psi(x,\chi)=o(x)$, so there is a function $\varepsilon_\chi(x)\to 0$ such that
\begin{align*}
\psi(x,\chi)=x\varepsilon_\chi(x).
\end{align*}
Substituting these forms gives
\begin{align*}
\psi(x;q,a)
&=\frac{x+x\varepsilon_0(x)}{\varphi(q)}
+\frac{1}{\varphi(q)}
\sum_{\substack{\chi\bmod q\\ \chi\ne\chi_0}}
\overline{\chi(a)}\,x\varepsilon_\chi(x)\\
&=\frac{x}{\varphi(q)}
+\frac{x\varepsilon_0(x)}{\varphi(q)}
+\frac{x}{\varphi(q)}
\sum_{\substack{\chi\bmod q\\ \chi\ne\chi_0}}
\overline{\chi(a)}\,\varepsilon_\chi(x).
\end{align*}
For fixed $q$, the number of nonprincipal characters modulo $q$ is finite. Also $\gcd(a,q)=1$ implies $|\chi(a)|=1$ for every character modulo $q$. Hence
\begin{align*}
\left|
\frac{x}{\varphi(q)}
\sum_{\substack{\chi\bmod q\\ \chi\ne\chi_0}}
\overline{\chi(a)}\,\varepsilon_\chi(x)
\right|
&\le
\frac{x}{\varphi(q)}
\sum_{\substack{\chi\bmod q\\ \chi\ne\chi_0}}
|\overline{\chi(a)}|\,|\varepsilon_\chi(x)|\\
&=
\frac{x}{\varphi(q)}
\sum_{\substack{\chi\bmod q\\ \chi\ne\chi_0}}
|\varepsilon_\chi(x)|\\
&=o(x),
\end{align*}
because a finite sum of functions tending to $0$ again tends to $0$. The term $x\varepsilon_0(x)/\varphi(q)$ is also $o(x)$, so
\begin{align*}
\psi(x;q,a)=\frac{x}{\varphi(q)}+o(x).
\end{align*}
Thus the principal character contributes the whole main term, while the finitely many nonprincipal characters contribute only lower-order cancellation.
[/example]
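The conclusion $\psi(x;q,a)=x/\varphi(q)+o(x)$ can be illustrated numerically. The sketch below takes $q=5$ and $x=10^5$ as assumed sample values and prints the normalized ratios $\varphi(q)\psi(x;q,a)/x$ for the four reduced classes. The helper names are hypothetical, and a single value of $x$ says nothing rigorous about the limit.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes p <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


def psi_progression(x, q, a):
    """psi(x; q, a): sum of log p over prime powers p^k <= x with p^k congruent to a mod q."""
    total = 0.0
    for p in primes_up_to(x):
        power = p
        while power <= x:
            if power % q == a % q:
                total += math.log(p)
            power *= p
    return total


Q, PHI_Q, X = 5, 4, 100_000

# Normalized so that the theorem predicts a value tending to 1 as x grows.
ratios = {a: psi_progression(X, Q, a) * PHI_Q / X for a in (1, 2, 3, 4)}

for a, ratio in ratios.items():
    print(a, round(ratio, 4))
```

The small deviations from $1$ at this height correspond to the nonprincipal terms $\psi(x,\chi)$, which the theorem shows are $o(x)$ without giving a rate.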
## Prime Number Theorem in Arithmetic Progressions
What does the weighted estimate for $\psi(x;q,a)$ say about actual primes? Weighted prime powers are analytically convenient, but the final theorem is a statement about the number of primes $p\le x$ in a fixed reduced residue class.
[quotetheorem:4382]
[citeproof:4382]
This theorem says that, to first order, the weighted prime powers do not prefer one reduced residue class modulo $q$ over another. The reduced-residue hypothesis is essential: for $q=4$ and $a=2$, the only prime in the class is $2$, so no asymptotic $x/\varphi(4)$ can hold. The theorem is still a Chebyshev-form statement rather than a direct count of primes, and the modulus is fixed throughout; stronger uniform versions require zero-free regions and estimates that this chapter does not pursue. The remaining task is to remove the logarithmic weights and the prime-power terms by partial summation.
[quotetheorem:4383]
[citeproof:4383]
The hypotheses again matter: if $a$ is not coprime to $q$, then the progression contains at most the primes dividing $q$, so the stated main term is false. The theorem does not give a bound for the error term, nor does it compare different moduli as $q$ varies with $x$. The proof has separated the argument into algebraic orthogonality, analytic nonvanishing, a Tauberian step, and partial summation; this pattern is the template for many later equidistribution theorems in analytic number theory.
[example: Primes One and Two Modulo Three]
Taking $q=3$, we have $\varphi(3)=2$, and the reduced residue classes modulo $3$ are $1$ and $2$. Applying the *Prime Number Theorem for Arithmetic Progressions* separately to $a=1$ and $a=2$ gives
\begin{align*}
\pi(x;3,1)&\sim \frac{\operatorname{Li}(x)}{\varphi(3)}
=\frac{\operatorname{Li}(x)}{2},\\
\pi(x;3,2)&\sim \frac{\operatorname{Li}(x)}{\varphi(3)}
=\frac{\operatorname{Li}(x)}{2}.
\end{align*}
Equivalently, there are functions $\varepsilon_1(x)\to 0$ and $\varepsilon_2(x)\to 0$ such that
\begin{align*}
\pi(x;3,1)&=\frac{\operatorname{Li}(x)}{2}\bigl(1+\varepsilon_1(x)\bigr),\\
\pi(x;3,2)&=\frac{\operatorname{Li}(x)}{2}\bigl(1+\varepsilon_2(x)\bigr).
\end{align*}
Subtracting the two estimates term by term gives
\begin{align*}
\pi(x;3,1)-\pi(x;3,2)
&=\frac{\operatorname{Li}(x)}{2}\bigl(1+\varepsilon_1(x)\bigr)
-\frac{\operatorname{Li}(x)}{2}\bigl(1+\varepsilon_2(x)\bigr)\\
&=\frac{\operatorname{Li}(x)}{2}
+\frac{\operatorname{Li}(x)}{2}\varepsilon_1(x)
-\frac{\operatorname{Li}(x)}{2}
-\frac{\operatorname{Li}(x)}{2}\varepsilon_2(x)\\
&=\frac{\operatorname{Li}(x)}{2}\bigl(\varepsilon_1(x)-\varepsilon_2(x)\bigr).
\end{align*}
Since $\varepsilon_1(x)\to 0$ and $\varepsilon_2(x)\to 0$, their difference also tends to $0$, so
\begin{align*}
\pi(x;3,1)-\pi(x;3,2)=o(\operatorname{Li}(x)).
\end{align*}
At the weighted Chebyshev level, the earlier modulus-$3$ character decomposition gives
\begin{align*}
\psi(x;3,1)&=\frac{1}{2}\bigl(\psi(x,\chi_0)+\psi(x,\chi)\bigr),\\
\psi(x;3,2)&=\frac{1}{2}\bigl(\psi(x,\chi_0)-\psi(x,\chi)\bigr),
\end{align*}
where $\chi_0$ is the principal character and $\chi$ is the nonprincipal character modulo $3$. Therefore
\begin{align*}
\psi(x;3,1)-\psi(x;3,2)
&=\frac{1}{2}\bigl(\psi(x,\chi_0)+\psi(x,\chi)\bigr)
-\frac{1}{2}\bigl(\psi(x,\chi_0)-\psi(x,\chi)\bigr)\\
&=\frac{1}{2}\psi(x,\chi_0)+\frac{1}{2}\psi(x,\chi)
-\frac{1}{2}\psi(x,\chi_0)+\frac{1}{2}\psi(x,\chi)\\
&=\psi(x,\chi).
\end{align*}
Thus the two prime-counting functions have the same first-order asymptotic, and the corresponding weighted imbalance between the two residue classes is exactly the nonprincipal character contribution.
[/example]
## What This Chapter Proves and What It Leaves Open
Which parts of the theorem are qualitative, and which parts are quantitative? For a fixed modulus, boundary nonvanishing gives the limiting equidistribution of primes among reduced residue classes. It does not give a useful error term, nor does it control how the estimate changes when $q$ grows with $x$.
[remark: Fixed Modulus]
The notation $q$ fixed means that constants and implicit convergence may depend on $q$. The statement does not imply uniformity for all moduli up to a function of $x$.
[/remark]
The second remark records how this result relates to Dirichlet's theorem from Chapter 8.
[remark: From Dirichlet to Prime Number Theorem]
Dirichlet's theorem says each reduced residue class contains infinitely many primes. The prime number theorem in arithmetic progressions strengthens this by giving the asymptotic density of those primes among all primes.
[/remark]
The prime-distribution results above used $L$-functions mainly near $s=1$. The final chapter changes perspective: it completes primitive Dirichlet $L$-functions by adding their archimedean factors, revealing the functional equation that governs the whole critical strip.
# 10. Completed $L$-Functions and Functional Equations
The missing ingredient is archimedean: it comes from the infinite place rather than from the finite Euler factors at primes. Once the correct gamma factor is attached, a primitive Dirichlet $L$-function satisfies a symmetry relating $s$ to $1-s$, just as the completed zeta function does.
The guiding question is: what must be added to $L(s,\chi)$ so that the resulting function has a natural reflection law? The answer depends on two finite pieces of data attached to a primitive character: its parity and its Gauss sum.
## Gauss Sums and Parity
A Dirichlet character is periodic and multiplicative, but the functional equation is revealed by taking its finite Fourier transform. The finite Fourier transform of a character is controlled by a single complex number, the Gauss sum, and this is where primitivity enters the analytic continuation of $L(s,\chi)$.
[definition: Parity of a Dirichlet Character]
Let $\chi:\mathbb Z \to \mathbb C$ be a Dirichlet character modulo $q$, viewed as a periodic function with $\chi(n)=0$ when $(n,q)>1$. The parity of $\chi$ is the number $a \in \{0,1\}$ defined by
\begin{align*}
\chi(-1)=(-1)^a.
\end{align*}
[/definition]
Thus $a=0$ means that $\chi$ is even, while $a=1$ means that $\chi$ is odd. The parity determines whether the archimedean factor should involve $\Gamma(s/2)$ or $\Gamma((s+1)/2)$.
The finite part of the same functional equation needs its own normalizing constant. When a periodic character is Fourier transformed over one period, the coefficient at the basic additive frequency is the number that later becomes the root-number factor.
[definition: Gauss Sum]
Let $\chi:\mathbb Z \to \mathbb C$ be a Dirichlet character modulo $q$, viewed as a periodic function. Its Gauss sum is
\begin{align*}
\tau(\chi)=\sum_{r=1}^{q} \chi(r)e^{2\pi i r/q}.
\end{align*}
[/definition]
The Gauss sum is a finite Fourier coefficient of $\chi$. For primitive characters, all the other Fourier coefficients are multiples of this one. This is the finite Fourier fact needed before the functional equation can be normalized cleanly: additive frequencies should not produce unrelated constants. The obstruction is imprimitive behaviour, where the character really lives at a smaller conductor and the Fourier transform can vanish at incompatible frequencies.
[quotetheorem:4384]
[citeproof:4384]
The special case $n=1$ recovers the definition of $\tau(\chi)$, while the formula says that changing the additive frequency only twists by $\overline{\chi}(n)$. The hypothesis of primitivity is essential: if a character modulo $q$ is induced from a smaller conductor, then its finite Fourier transform is supported only on compatible additive frequencies and the neat formula with a single Gauss sum can vanish or pick up extra conductor factors. For example, the principal character modulo $4$ has Gauss sum $e^{2\pi i/4}+e^{6\pi i/4}=0$, so it cannot satisfy a primitive formula with nonzero magnitude $\sqrt{4}$. Thus the theorem is not a statement about every Dirichlet character modulo $q$; it is the Fourier signature of characters whose conductor is exactly $q$.
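The twisting behaviour described above can be tested numerically. This is a minimal sketch assuming the quoted theorem has the standard form $\sum_{r=1}^{q}\chi(r)e^{2\pi i rn/q}=\overline{\chi}(n)\tau(\chi)$ for primitive $\chi$; the function names and the choice of a character modulo $5$ with $\chi(2)=i$ are illustrative and do not come from the notes.

```python
import cmath
from math import gcd, sqrt

def e(t):
    """Additive character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

def twisted_sum(chi, q, n):
    """Finite Fourier coefficient: sum over r = 1..q of chi(r) e(r n / q)."""
    return sum(chi(r) * e(r * n / q) for r in range(1, q + 1))

def gauss_sum(chi, q):
    """Gauss sum tau(chi): the coefficient at the basic frequency n = 1."""
    return twisted_sum(chi, q, 1)

def chi5(n):
    """A primitive character modulo 5 fixed by chi5(2) = i (2 generates the units)."""
    return {0: 0, 1: 1, 2: 1j, 4: -1, 3: -1j}[n % 5]

def chi0_mod4(n):
    """Principal character modulo 4, which is imprimitive."""
    return 1 if gcd(n, 4) == 1 else 0

tau5 = gauss_sum(chi5, 5)
# Largest deviation from conj(chi5(n)) * tau5 over a range of frequencies n.
max_error = max(
    abs(twisted_sum(chi5, 5, n) - complex(chi5(n)).conjugate() * tau5)
    for n in range(10)
)
print(abs(tau5), sqrt(5), max_error)
print(abs(gauss_sum(chi0_mod4, 4)))  # vanishes up to rounding
```

The frequencies divisible by $5$ are included on purpose: there $\overline{\chi}(n)=0$, and the twisted sum reduces to $\sum_r\chi(r)=0$.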
Once the Fourier transform of a primitive character is controlled by a single Gauss sum, the next missing datum is its size. The phase can carry subtle arithmetic information, but the normalization of the completed $L$-function needs the absolute value. The following remark records the exact magnitude that gives the functional equation its expected square-root conductor factor.
[remark: Magnitude of a Primitive Dirichlet Gauss Sum]
If $\chi$ is primitive modulo $q$, then its Gauss sum satisfies
\begin{align*}
|\tau(\chi)|=\sqrt q.
\end{align*}
This determines only the size of the Gauss sum, not its phase. The phase is arithmetic: for quadratic characters it depends on congruence information, and in the functional equation it becomes part of the root number. Primitivity is again doing real work. For an imprimitive character the Gauss sum modulo the larger modulus may vanish, as with the principal character modulo $4$, so the statement would be false if the conductor were not the modulus used in the sum.
[/remark]
The abstract magnitude formula becomes concrete in a small odd primitive example. Working it out modulo $4$ also introduces the parity convention that will appear in the gamma factor of the completed $L$-function.
[example: Quadratic Character Modulo Four]
Let $\chi_4$ be the nonprincipal character modulo $4$, so $\chi_4(n)=0$ for even $n$, $\chi_4(1)=1$, and $\chi_4(3)=-1$. Since $-1 \equiv 3 \pmod 4$, periodicity gives
\begin{align*}
\chi_4(-1)=\chi_4(3)=-1=(-1)^1,
\end{align*}
so the parity is $a=1$ and $\chi_4$ is odd.
Using the definition of the Gauss sum with $q=4$,
\begin{align*}
\tau(\chi_4)
&=\sum_{r=1}^{4}\chi_4(r)e^{2\pi i r/4} \\
&=\chi_4(1)e^{2\pi i/4}
+\chi_4(2)e^{4\pi i/4}
+\chi_4(3)e^{6\pi i/4}
+\chi_4(4)e^{8\pi i/4} \\
&=1\cdot e^{\pi i/2}
+0\cdot e^{\pi i}
+(-1)\cdot e^{3\pi i/2}
+0\cdot e^{2\pi i} \\
&=1\cdot i+0\cdot(-1)+(-1)\cdot(-i)+0\cdot 1 \\
&=i+i \\
&=2i.
\end{align*}
Therefore
\begin{align*}
|\tau(\chi_4)|=|2i|=2=\sqrt{4}.
\end{align*}
This confirms in this concrete case that the primitive Gauss sum has size $\sqrt q$.
[/example]
This example is the model for the odd gamma factor that appears when Dirichlet's beta function $L(s,\chi_4)$ is completed.
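For comparison, the same computation can be sketched for the nonprincipal character modulo $3$, written here as $\chi_3$ (a name used only for this illustration), with $\chi_3(1)=1$, $\chi_3(2)=-1$, and $\chi_3(3)=0$. Since $-1\equiv 2 \pmod 3$, we have $\chi_3(-1)=-1$, so this character is also odd, and
\begin{align*}
\tau(\chi_3)
&=\chi_3(1)e^{2\pi i/3}+\chi_3(2)e^{4\pi i/3}+\chi_3(3)e^{6\pi i/3} \\
&=\left(-\frac12+\frac{\sqrt3}{2}i\right)-\left(-\frac12-\frac{\sqrt3}{2}i\right) \\
&=i\sqrt3.
\end{align*}
Hence $|\tau(\chi_3)|=\sqrt3$, again matching $\sqrt q$ with $q=3$.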
## Completed Dirichlet L-Functions
The ordinary Dirichlet series $L(s,\chi)$ has a clean Euler product for $\operatorname{Re}(s)>1$, but its reflection law is not visible in that form. The missing factor is archimedean: it records the parity of the character in the same way that $\pi^{-s/2}\Gamma(s/2)$ records the symmetry of the zeta function.
[definition: Completed Primitive Dirichlet L-Function]
Let $\chi$ be a primitive Dirichlet character modulo $q$, and let $a \in \{0,1\}$ be defined by $\chi(-1)=(-1)^a$. The completed Dirichlet $L$-function attached to $\chi$ is
\begin{align*}
\Lambda(s,\chi)=\left(\frac{q}{\pi}\right)^{(s+a)/2}\Gamma\left(\frac{s+a}{2}\right)L(s,\chi).
\end{align*}
[/definition]
For $a=0$ this uses $\Gamma(s/2)$; for $a=1$ it uses $\Gamma((s+1)/2)$. This shift is forced by whether the character is even or odd under $n \mapsto -n$.
With the conductor, parity, and Gauss sum all built into the normalization, the completed function is ready to state its reflection law. The result below is the analytic payoff of those choices: it identifies the precise symmetry relating $s$ to $1-s$.
[quotetheorem:4386]
[citeproof:4386]
The factor $\varepsilon(\chi)$ is called the root number. It is a complex number on the unit circle and measures the phase in the reflection symmetry. The conjugate character appears because the finite Fourier transform of $\chi$ is naturally expressed using $\overline{\chi}(n)$; replacing it by $\chi$ would give the wrong symmetry for non-real characters. Primitivity is also essential: an imprimitive character has a functional equation governed by its conductor, with extra Euler factors at primes dividing the modulus but not the conductor. The theorem therefore does not say that the bare Dirichlet series $L(s,\chi)$ is symmetric, nor does it locate its zeros; it says that after inserting the correct conductor, gamma factor, and root number, the completed function reflects across the line $\operatorname{Re}(s)=1/2$.
[example: Completed L-Function for the Character Modulo Four]
For the nonprincipal character $\chi_4$ modulo $4$, the preceding computation gives $\chi_4(-1)=-1=(-1)^1$, so its parity is $a=1$. Substituting $q=4$ and $a=1$ into the definition of the completed primitive Dirichlet $L$-function gives
\begin{align*}
\Lambda(s,\chi_4)
&=\left(\frac{q}{\pi}\right)^{(s+a)/2}\Gamma\left(\frac{s+a}{2}\right)L(s,\chi_4) \\
&=\left(\frac{4}{\pi}\right)^{(s+1)/2}\Gamma\left(\frac{s+1}{2}\right)L(s,\chi_4).
\end{align*}
The Gauss sum computation gives $\tau(\chi_4)=2i$, and $\sqrt{4}=2$. Since $i^{-1}=1/i=-i$, the root number from the *Functional Equation for Primitive Dirichlet L-Functions* is
\begin{align*}
\varepsilon(\chi_4)
&=\frac{i^{-a}\tau(\chi_4)}{\sqrt{q}} \\
&=\frac{i^{-1}\cdot 2i}{2} \\
&=\frac{(-i)\cdot 2i}{2} \\
&=\frac{-2i^2}{2} \\
&=\frac{-2(-1)}{2} \\
&=1.
\end{align*}
Also $\chi_4$ is real-valued, so $\overline{\chi_4}=\chi_4$. Therefore the functional equation becomes
\begin{align*}
\Lambda(s,\chi_4)
&=\varepsilon(\chi_4)\Lambda(1-s,\overline{\chi_4}) \\
&=1\cdot \Lambda(1-s,\chi_4) \\
&=\Lambda(1-s,\chi_4).
\end{align*}
Thus the completed $L$-function attached to $\chi_4$ reflects with no extra phase factor.
[/example]
This is the completed form of Dirichlet's beta function, since $L(s,\chi_4)=1-3^{-s}+5^{-s}-7^{-s}+\cdots$.
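The reflection $\Lambda(s,\chi_4)=\Lambda(1-s,\chi_4)$ can be tested numerically on the real segment $0<s\le 1$, where the alternating series above still converges. The following is a rough sketch rather than a rigorous computation: the function names, the truncation length, and the averaging of two consecutive partial sums are illustrative choices made here.

```python
from math import gamma, pi, fsum

def beta(s, terms=100_000):
    """Approximate L(s, chi_4) = sum over n >= 0 of (-1)^n (2n+1)^(-s), real s > 0.

    The alternating series converges slowly, so two consecutive partial
    sums are averaged, which damps the oscillation of the tail.
    """
    partial = fsum((-1) ** n * (2 * n + 1) ** (-s) for n in range(terms))
    next_term = (-1) ** terms * (2 * terms + 1) ** (-s)
    return partial + next_term / 2

def completed(s):
    """Lambda(s, chi_4) = (4/pi)^((s+1)/2) * Gamma((s+1)/2) * L(s, chi_4)."""
    return (4 / pi) ** ((s + 1) / 2) * gamma((s + 1) / 2) * beta(s)

left = completed(0.3)
right = completed(0.7)
print(left, right)     # agree to several decimal places
print(completed(1.0))  # close to 1, since L(1, chi_4) = pi/4
```

The value at $s=1$ is a useful sanity check because it is known in closed form: $\Lambda(1,\chi_4)=\frac{4}{\pi}\cdot\Gamma(1)\cdot\frac{\pi}{4}=1$.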
## Forced Zeros and the Critical Strip
The functional equation transfers information between the half-plane $\operatorname{Re}(s)>1$ and the half-plane $\operatorname{Re}(s)<0$. Since the gamma factor has poles at nonpositive integers of the appropriate parity, the ordinary $L$-function must vanish at certain points so that the completed function remains holomorphic there.
[definition: Critical Strip for a Dirichlet L-Function]
For a Dirichlet $L$-function $L(s,\chi)$, the critical strip is the region
\begin{align*}
0<\operatorname{Re}(s)<1.
\end{align*}
[/definition]
Zeros inside this strip are the difficult zeros. Zeros outside it that are forced by the gamma factor are part of the formal structure of the functional equation.
The completed function is holomorphic for primitive nonprincipal characters, but its definition includes a gamma factor with unavoidable poles at certain nonpositive integers. At those points the only way for the product to remain holomorphic is for the ordinary factor $L(s,\chi)$ to vanish. Before studying the genuinely mysterious zeros in the critical strip, we need to separate off these forced zeros and identify exactly which nonpositive integers arise from the parity of the character.
[quotetheorem:4387]
[citeproof:4387]
For even characters the forced zeros occur at $s=0,-2,-4,\dots$; for odd characters they occur at $s=-1,-3,-5,\dots$. The nonprincipal hypothesis is needed because the principal character has the pole structure inherited from the zeta function; for example, the principal character modulo $1$ gives $\zeta(s)$, whose completed form requires a pole-removing factor and whose zero/pole bookkeeping is different at $s=0$ and $s=1$. Primitivity keeps the conductor and gamma factor aligned with the actual functional equation; if an imprimitive character is written with a larger modulus, the forced-zero statement must be read after reducing to the primitive character and accounting for missing Euler factors. These zeros are forced by the archimedean factor only, so the theorem says nothing about whether zeros in the critical strip lie on the central line or how many such zeros occur.
[example: Forced Zeros for an Odd Quadratic Character]
For $\chi_4$ we have $\chi_4(-1)=-1=(-1)^1$, so $a=1$. Substituting $a=1$ into the completed $L$-function gives
\begin{align*}
\Lambda(s,\chi_4)
&=\left(\frac{4}{\pi}\right)^{(s+1)/2}\Gamma\left(\frac{s+1}{2}\right)L(s,\chi_4).
\end{align*}
The gamma function has simple poles at $0,-1,-2,\dots$. Therefore $\Gamma((s+1)/2)$ has poles exactly when
\begin{align*}
\frac{s+1}{2}&=-k \qquad (k=0,1,2,\dots),\\
s+1&=-2k,\\
s&=-1-2k.
\end{align*}
Thus the pole locations are
\begin{align*}
s=-1,-3,-5,\dots.
\end{align*}
By *Functional Equation for Primitive Dirichlet L-Functions*, $\Lambda(s,\chi_4)$ is entire because $\chi_4$ is primitive and nonprincipal. At each point $s=-1-2k$, the factor $\left(4/\pi\right)^{(s+1)/2}$ is nonzero and finite, while $\Gamma((s+1)/2)$ has a pole. Hence $L(s,\chi_4)$ must vanish there to cancel that pole:
\begin{align*}
L(-1,\chi_4)=L(-3,\chi_4)=L(-5,\chi_4)=\cdots=0.
\end{align*}
At $s=0$ we have
\begin{align*}
\frac{s+1}{2}=\frac{0+1}{2}=\frac12,
\end{align*}
and $\Gamma(1/2)$ is finite, so $s=0$ is not forced to be a zero by the gamma factor for this odd character.
[/example]
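The forced zeros of $L(s,\chi_4)$ can also be seen in exact arithmetic. The sketch below relies on a classical identity that these notes do not derive, namely $L(-k,\chi)=-\frac{q^{k}}{k+1}\sum_{a=1}^{q}\chi(a)B_{k+1}(a/q)$ for nonprincipal $\chi$, where $B_n$ is the $n$-th Bernoulli polynomial. The identity, the hard-coded polynomials, and all names are assumptions introduced for illustration.

```python
from fractions import Fraction

# Bernoulli polynomials B_1 .. B_4, hard-coded for this sketch.
BERNOULLI_POLY = {
    1: lambda x: x - Fraction(1, 2),
    2: lambda x: x**2 - x + Fraction(1, 6),
    3: lambda x: x**3 - Fraction(3, 2) * x**2 + Fraction(1, 2) * x,
    4: lambda x: x**4 - 2 * x**3 + x**2 - Fraction(1, 30),
}

def chi4(n):
    """Nonprincipal character modulo 4: values 0, 1, 0, -1 on residues 0..3."""
    return (0, 1, 0, -1)[n % 4]

def l_value_at_minus_k(k, chi, q):
    """Exact value of L(-k, chi) for 0 <= k <= 3 via the assumed identity."""
    b = BERNOULLI_POLY[k + 1]
    total = sum(chi(a) * b(Fraction(a, q)) for a in range(1, q + 1))
    return -Fraction(q**k, k + 1) * total

# Values at s = 0, -1, -2, -3.
values = [l_value_at_minus_k(k, chi4, 4) for k in range(4)]
print(values)
```

Under these assumptions the values at $s=-1$ and $s=-3$ vanish, while those at $s=0$ and $s=-2$ do not, matching the pattern found in the example for an odd character.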
The critical strip is therefore where the arithmetic content remains after the forced cancellations have been accounted for. The analogue with the completed zeta function is exact in structure: add an archimedean gamma factor, prove a reflection formula, then distinguish the forced zeros from the zeros in $0<\operatorname{Re}(s)<1$.
## Relation with the Completed Zeta Function
The completed zeta function is the prototype for every construction in this chapter. The Dirichlet case adds finite Fourier analysis through Gauss sums and replaces the single gamma factor by a parity-dependent one.
[definition: Completed Zeta Function]
The completed zeta function is
\begin{align*}
\xi(s)=\frac{1}{2}s(s-1)\pi^{-s/2}\Gamma\left(\frac{s}{2}\right)\zeta(s).
\end{align*}
[/definition]
The extra factor $\frac{1}{2}s(s-1)$ removes the poles at $s=0$ and $s=1$. For nonprincipal primitive Dirichlet characters, no such pole-removing factor is needed because the completed function is already entire.
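As a consistency check on this normalization, the values of $\xi$ at the two removed poles can be computed directly. The sketch assumes three standard facts not restated in this chapter: $\zeta(s)$ has residue $1$ at $s=1$, $\zeta(0)=-\tfrac12$, and $s\,\Gamma(s/2)\to 2$ as $s\to 0$. It also uses $\Gamma(\tfrac12)=\sqrt\pi$.
\begin{align*}
\xi(1)&=\frac12\cdot 1\cdot \pi^{-1/2}\Gamma\left(\frac12\right)\cdot\lim_{s\to 1}(s-1)\zeta(s)
=\frac12\cdot 1\cdot 1\cdot 1=\frac12,\\
\xi(0)&=\frac12\cdot(0-1)\cdot\pi^{0}\cdot\lim_{s\to 0}s\,\Gamma\left(\frac{s}{2}\right)\cdot\zeta(0)
=\frac12\cdot(-1)\cdot 1\cdot 2\cdot\left(-\frac12\right)=\frac12.
\end{align*}
Both values are finite, and they agree, as a symmetry exchanging $s=0$ and $s=1$ would require.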
This prototype clarifies what the Dirichlet functional equation is imitating and what it changes. The comparison theorem below places the zeta symmetry beside the primitive-character symmetry, showing how the pole-removing factor, gamma factor, and finite Fourier data play parallel roles.
[quotetheorem:4388]
[citeproof:4388]
The primitive Dirichlet functional equation follows the same outline, with the theta function replaced by a character-twisted theta series. The finite Fourier transform of the character is what introduces $\tau(\chi)$ and the root number. The comparison also shows what the zeta case hides: the zeta function corresponds to the constant finite character, so there is no Gauss sum or conjugate character to track. The polynomial factor in $\xi(s)$ is not decorative; without it, the product $\pi^{-s/2}\Gamma(s/2)\zeta(s)$ would still be symmetric under $s\mapsto 1-s$ but would have simple poles at $s=1$, inherited from $\zeta(s)$, and at $s=0$, coming from $\Gamma(s/2)$, so it would not be entire. Thus the theorem gives the model for the completed-object symmetry, not a symmetry of the uncompleted Dirichlet series itself.
[example: Parity-Dependent Gamma Factor for a Quadratic Character]
Let $\chi$ be a primitive quadratic character modulo $q$. Its parity is the number $a\in\{0,1\}$ determined by
\begin{align*}
\chi(-1)=(-1)^a.
\end{align*}
Substituting this value of $a$ into the definition of the completed primitive Dirichlet $L$-function gives
\begin{align*}
\Lambda(s,\chi)
=\left(\frac{q}{\pi}\right)^{(s+a)/2}
\Gamma\left(\frac{s+a}{2}\right)L(s,\chi).
\end{align*}
If $\chi(-1)=1$, then
\begin{align*}
1=\chi(-1)=(-1)^a.
\end{align*}
Since $a\in\{0,1\}$, the two possible values are
\begin{align*}
(-1)^0=1,\qquad (-1)^1=-1,
\end{align*}
so $a=0$. Therefore
\begin{align*}
\Lambda(s,\chi)
&=\left(\frac{q}{\pi}\right)^{(s+0)/2}
\Gamma\left(\frac{s+0}{2}\right)L(s,\chi)\\
&=\left(\frac{q}{\pi}\right)^{s/2}
\Gamma\left(\frac{s}{2}\right)L(s,\chi).
\end{align*}
If $\chi(-1)=-1$, then
\begin{align*}
-1=\chi(-1)=(-1)^a.
\end{align*}
Again using $a\in\{0,1\}$ and
\begin{align*}
(-1)^0=1,\qquad (-1)^1=-1,
\end{align*}
we get $a=1$. Hence
\begin{align*}
\Lambda(s,\chi)
&=\left(\frac{q}{\pi}\right)^{(s+1)/2}
\Gamma\left(\frac{s+1}{2}\right)L(s,\chi).
\end{align*}
Thus the sign of $\chi(-1)$ determines whether the completed function uses the even factor $\Gamma(s/2)$ or the odd factor $\Gamma((s+1)/2)$.
[/example]
The chapter closes the first construction of Dirichlet $L$-functions as global analytic objects. In the next stage of the course, this functional equation becomes one of the tools for locating zeros and for translating analytic information back into statements about primes in arithmetic progressions.
Analytic Number Theory I: Multiplicative Functions and L-Functions
Created by admin on 5/30/2026 | Last updated on 6/1/2026