[proofplan]
The theorem is the equivalence between the Riemann Hypothesis (RH) and the sharp error bound $|\pi(x) - \mathrm{li}(x)| = O(\sqrt{x}\log x)$. The central tool is the von Mangoldt explicit formula, which expresses the Chebyshev function $\psi(x) = \sum_{n \leq x} \Lambda(n)$ as a sum over the non-trivial zeros $\rho$ of $\zeta$:
\begin{align*}
\psi_0(x) &= x - \sum_{\rho} \frac{x^\rho}{\rho} - \log(2\pi) - \frac{1}{2}\log(1 - x^{-2}).
\end{align*}
For the forward direction (RH $\Rightarrow$ error bound), each zero $\rho = 1/2 + i\gamma$ contributes a term of size $x^{1/2}/|\rho|$; the number of zeros up to height $T$ is $O(T \log T)$ by the Riemann--von Mangoldt zero-counting estimate, so summing over the zeros up to a suitable height gives $\psi(x) - x = O(\sqrt{x}(\log x)^2)$, and an elementary Abel-summation conversion removes one logarithm to give $\pi(x) - \mathrm{li}(x) = O(\sqrt{x}\log x)$. For the reverse direction, we argue by contraposition: a zero $\rho_0 = \beta_0 + i\gamma_0$ with $\beta_0 > 1/2$ forces $|\psi(x) - x|$ to exceed $x^{\beta_0 - \varepsilon}$ for arbitrarily large $x$ (for every $\varepsilon > 0$), so the bound $\psi(x) - x = O(\sqrt{x}(\log x)^2)$ fails, and with it the bound $\pi(x) - \mathrm{li}(x) = O(\sqrt{x}\log x)$.
[/proofplan]
[step:Reduce the theorem about $\pi(x) - \mathrm{li}(x)$ to an estimate on $\psi(x) - x$]
Define the [Chebyshev psi function](/pages/Chebyshev%20Functions)
\begin{align*}
\psi : (0, \infty) &\to [0, \infty) \\
x &\mapsto \sum_{n \leq x} \Lambda(n),
\end{align*}
where $\Lambda$ is the [von Mangoldt function](/pages/Von%20Mangoldt%20Function): $\Lambda(n) = \log p$ if $n = p^k$ for a prime $p$ and integer $k \geq 1$, else $\Lambda(n) = 0$. Also define $\vartheta(x) := \sum_{p \leq x} \log p$.
We claim that the two bounds
\begin{align*}
\pi(x) - \mathrm{li}(x) &= O(\sqrt{x} \log x), \tag{A}\\
\psi(x) - x &= O(\sqrt{x} (\log x)^2) \tag{B}
\end{align*}
are equivalent (as $x \to \infty$). Once this equivalence is established, we reduce the theorem to proving (B) $\Leftrightarrow$ RH.
**Bound difference $\psi - \vartheta$:** Since $\psi(x) - \vartheta(x) = \sum_{k \geq 2} \vartheta(x^{1/k})$ and $\vartheta(y) \leq \psi(y) \leq C y$ (elementary Chebyshev bound), we have
\begin{align*}
|\psi(x) - \vartheta(x)| &\leq C \sum_{k=2}^{\lfloor \log_2 x \rfloor} x^{1/k} \leq C \log_2(x) \cdot \sqrt{x} = O(\sqrt{x} \log x).
\end{align*}
So (B) is equivalent to the corresponding bound on $\vartheta(x) - x$.
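The identity $\psi(x) - \vartheta(x) = \sum_{k \geq 2} \vartheta(x^{1/k})$ and the crude bound above can be checked numerically. The following is a minimal sketch, not part of the proof: the helper names (`primes_up_to`, `integer_root`, `prime_power_sum`) are illustrative, and the comparison uses the constant $C = 1$ purely for illustration.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]

def theta(y, primes):
    """Chebyshev theta: sum of log p over primes p <= y."""
    return sum(math.log(p) for p in primes if p <= y)

def psi(x, primes):
    """Chebyshev psi: each prime p contributes log p once per power p^k <= x."""
    total = 0.0
    for p in primes:
        if p > x:
            break
        power = p
        while power <= x:
            total += math.log(p)
            power *= p
    return total

def integer_root(x, k):
    """Largest integer r with r**k <= x (exact, avoids float rounding)."""
    r = int(round(x ** (1.0 / k)))
    while r ** k > x:
        r -= 1
    while (r + 1) ** k <= x:
        r += 1
    return r

def prime_power_sum(x, primes):
    """Right-hand side of psi - theta = sum over k >= 2 of theta(x^(1/k))."""
    total, k = 0.0, 2
    while 2 ** k <= x:
        total += theta(integer_root(x, k), primes)
        k += 1
    return total

x = 10_000
primes = primes_up_to(x)
gap = psi(x, primes) - theta(x, primes)
# The two values agree, and both sit well below sqrt(x) * log2(x).
print(gap, prime_power_sum(x, primes), math.sqrt(x) * math.log2(x))
```

Since $p \leq x^{1/k}$ is equivalent to $p^k \leq x$, the sketch evaluates $\vartheta$ at the exact integer $k$-th root rather than at a floating-point root.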
**Convert $\vartheta(x) - x$ to $\pi(x) - \mathrm{li}(x)$ by Abel summation:** We have
\begin{align*}
\pi(x) &= \frac{\vartheta(x)}{\log x} + \int_2^x \frac{\vartheta(t)}{t (\log t)^2} \, d\mathcal{L}^1(t)
\end{align*}
(derived in [Abel Summation](/theorems/???)). Also, integration by parts gives
\begin{align*}
\mathrm{li}(x) &= \int_2^x \frac{d\mathcal{L}^1(t)}{\log t} = \frac{x}{\log x} - \frac{2}{\log 2} + \int_2^x \frac{d\mathcal{L}^1(t)}{(\log t)^2}.
\end{align*}
Subtracting:
\begin{align*}
\pi(x) - \mathrm{li}(x) &= \frac{\vartheta(x) - x}{\log x} + \int_2^x \frac{\vartheta(t) - t}{t (\log t)^2} \, d\mathcal{L}^1(t) + O(1).
\end{align*}
If $|\vartheta(t) - t| \leq K \sqrt{t} (\log t)^2$ for all $t \geq 2$ (which follows from (B)), then
\begin{align*}
|\pi(x) - \mathrm{li}(x)| &\leq \frac{K \sqrt{x} (\log x)^2}{\log x} + K \int_2^x \frac{\sqrt{t} (\log t)^2}{t (\log t)^2} \, d\mathcal{L}^1(t) + O(1) \\
&= K \sqrt{x} \log x + K \int_2^x t^{-1/2} \, d\mathcal{L}^1(t) + O(1) \\
&= K \sqrt{x} \log x + 2K \sqrt{x} + O(1) = O(\sqrt{x} \log x),
\end{align*}
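establishing (A). For the reverse implication (A) $\Rightarrow$ (B), Abel summation in the other direction gives $\vartheta(x) = \pi(x) \log x - \int_2^x \pi(t)/t \, d\mathcal{L}^1(t)$, and the same expression with $\mathrm{li}$ in place of $\pi$ equals $x - 2$; subtracting, $\vartheta(x) - x = (\pi(x) - \mathrm{li}(x)) \log x - \int_2^x (\pi(t) - \mathrm{li}(t))/t \, d\mathcal{L}^1(t) - 2$. Under (A) the first term is $O(\sqrt{x} (\log x)^2)$ and the integral is $O(\sqrt{x} \log x)$, so $\vartheta(x) - x = O(\sqrt{x} (\log x)^2)$, which is equivalent to (B) by the bound on $\psi - \vartheta$.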
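As a sanity check on the Abel-summation identity for $\pi(x)$ used above, the right-hand side can be evaluated exactly, because $\vartheta$ is a step function and $1/(t (\log t)^2)$ has the antiderivative $-1/\log t$. This is a minimal sketch with illustrative helper names (`primes_up_to`, `pi_via_theta`), assuming $x \geq 2$.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]

def pi_via_theta(x):
    """Evaluate theta(x)/log x + integral_2^x theta(t) / (t log^2 t) dt."""
    primes = primes_up_to(int(x))
    theta = 0.0
    integral = 0.0
    for i, p in enumerate(primes):
        theta += math.log(p)  # theta jumps by log p at t = p
        right = primes[i + 1] if i + 1 < len(primes) else x
        # theta is constant on [p, right), so the piece integrates in closed form
        integral += theta * (1 / math.log(p) - 1 / math.log(right))
    return theta / math.log(x) + integral

# The formula reproduces the prime counts pi(1000) = 168 and pi(10^5) = 9592.
print(pi_via_theta(1000), pi_via_theta(10**5))
```

The sum telescopes to the number of primes up to $x$, which is exactly the content of the identity.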
Thus the theorem reduces to proving:
\begin{align*}
\text{(B)} \; \text{holds} \; \iff \; \text{all non-trivial zeros of } \zeta \text{ lie on } \operatorname{Re}(s) = \tfrac{1}{2}.
\end{align*}
The remaining steps prove this equivalence.
[/step]
[step:State the von Mangoldt explicit formula for $\psi(x)$]
[claim:Von Mangoldt explicit formula]
For $x > 1$ not a prime power, define $\psi_0(x) := \psi(x)$ (the function is continuous at such points). Then
\begin{align*}
\psi_0(x) &= x - \sum_{\rho} \frac{x^\rho}{\rho} - \log(2\pi) - \frac{1}{2} \log\!\left(1 - \frac{1}{x^2}\right),
\end{align*}
where the sum is over all non-trivial zeros $\rho$ of $\zeta$ (those in the critical strip $0 < \operatorname{Re}(s) < 1$), counted with multiplicity and understood in the symmetric sense $\lim_{T \to \infty} \sum_{|\operatorname{Im}(\rho)| \leq T}$.
[/claim]
[proof]
This is the [von Mangoldt explicit formula](/theorems/???), proved in detail on its own page. We sketch the derivation here to locate the inputs used in the sequel. The derivation uses the Hadamard factorisation of $\xi(s) := \frac{1}{2} s (s - 1) \pi^{-s/2} \Gamma(s/2) \zeta(s)$:
\begin{align*}
\xi(s) &= e^{A + Bs} \prod_{\rho} \left( 1 - \frac{s}{\rho} \right) e^{s/\rho},
\end{align*}
valid because $\xi$ is an [entire function](/pages/Entire%20Function) of order $1$. Taking the logarithmic derivative yields an expression for $-\zeta'(s)/\zeta(s)$ as a sum over zeros plus explicit terms from the gamma factor and the pole. Applying Perron's formula
\begin{align*}
\psi_0(x) &= \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \left( -\frac{\zeta'(s)}{\zeta(s)} \right) \frac{x^s}{s} \, ds, \qquad c > 1,
\end{align*}
and shifting the contour leftward past the pole at $s = 1$ (giving the $+x$ term), past each non-trivial zero $\rho$ (giving the $-x^\rho/\rho$ terms), past $s = 0$ (giving $-\zeta'(0)/\zeta(0) = -\log(2\pi)$), and past the trivial zeros at $s = -2, -4, \ldots$ (giving the $-\frac{1}{2}\log(1 - 1/x^2)$ term, summed geometrically), produces the formula.
[/proof]
Subtracting $x$ from both sides (the sum over zeros is still taken in the symmetric sense above, since it converges only conditionally):
\begin{align*}
\psi_0(x) - x &= - \sum_{\rho} \frac{x^\rho}{\rho} - \log(2\pi) - \frac{1}{2} \log\!\left( 1 - \frac{1}{x^2} \right).
\end{align*}
The last two terms are $O(1)$ uniformly for $x \geq 2$. Hence the crux of the estimate is the sum over zeros.
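To see the formula at work, one can truncate the sum to the first few zeros and compare with the exact step function. The sketch below is illustrative only: it hardcodes the imaginary parts of the first ten non-trivial zeros to six decimals (these zeros are known to lie on the critical line), and the names `GAMMAS`, `mangoldt`, `psi` and `psi_explicit` are introduced here for the example. With so few zeros the agreement is rough.

```python
import cmath
import math

# Imaginary parts of the first ten non-trivial zeros, rounded to six decimals.
GAMMAS = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
          37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, else 0."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # n is prime (or n == 1, giving log 1 = 0)

def psi(x):
    """Chebyshev psi as a direct sum of Lambda(n) over n <= x."""
    return sum(mangoldt(n) for n in range(2, int(x) + 1))

def psi_explicit(x, gammas=GAMMAS):
    """Explicit formula truncated to the zeros 1/2 +- i*gamma in `gammas`."""
    zero_sum = 0.0
    for g in gammas:
        rho = complex(0.5, g)
        # rho and its conjugate together contribute 2 * Re(x^rho / rho)
        zero_sum += 2 * (cmath.exp(rho * math.log(x)) / rho).real
    return x - zero_sum - math.log(2 * math.pi) - 0.5 * math.log(1 - x ** -2)

for x in (20.5, 50.5, 100.5):
    print(x, round(psi(x), 3), round(psi_explicit(x), 3))
```

Evaluating at half-integers avoids the jump points, where $\psi_0$ takes the midpoint value.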
[/step]
[step:Recall the Riemann--von Mangoldt zero-counting formula]
[claim:Riemann--von Mangoldt]
Let $N(T)$ denote the number of non-trivial zeros $\rho = \beta + i\gamma$ of $\zeta$ with $0 < \beta < 1$ and $0 < \gamma \leq T$, counted with multiplicity. Then
\begin{align*}
N(T) &= \frac{T}{2\pi} \log\!\left( \frac{T}{2\pi} \right) - \frac{T}{2\pi} + O(\log T) \qquad \text{as } T \to \infty.
\end{align*}
In particular, the number of zeros with $|\gamma| \in (T, T + 1]$ is $O(\log T)$.
[/claim]
This is the [Riemann--von Mangoldt Formula](/theorems/???). As a consequence, the sum
\begin{align*}
\sum_{\rho} \frac{1}{|\rho|^2} &= \sum_{\rho} \frac{1}{\beta^2 + \gamma^2}
\end{align*}
converges (by partial summation against $dN(T)$, the tail is controlled by $\int^\infty \log T / T^2 \, dT < \infty$), whereas $\sum_\rho 1/|\rho|$ diverges, because the corresponding integral $\int^T \log t / t \, dt = (\log T)^2/2$ is unbounded as $T \to \infty$.
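The counting formula can be compared with the actual zeros at small heights. The sketch below is a rough illustration under stated assumptions: it hardcodes the imaginary parts of the first ten zeros (all zeros with $0 < \gamma \leq 50$), so it is only meaningful for $T \leq 50$, and the function names are hypothetical.

```python
import math

# Imaginary parts of the first ten non-trivial zeros (all those with gamma <= 50).
GAMMAS = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
          37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def count_zeros(T):
    """N(T) for T <= 50, read off from the hardcoded list."""
    return sum(1 for g in GAMMAS if g <= T)

def rvm_main_term(T):
    """Main term (T / 2 pi) log(T / 2 pi) - T / 2 pi of the counting formula."""
    u = T / (2 * math.pi)
    return u * math.log(u) - u

for T in (20, 30, 40, 50):
    # The last column is log T, the size of the permitted error term.
    print(T, count_zeros(T), round(rvm_main_term(T), 3), round(math.log(T), 3))

# Partial sums over these ten zeros: the squared sum is already small,
# while the first-power sum keeps growing as more zeros are included.
inv_abs = sum(1 / abs(complex(0.5, g)) for g in GAMMAS)
inv_sq = sum(1 / abs(complex(0.5, g)) ** 2 for g in GAMMAS)
print(inv_abs, inv_sq)
```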
[/step]
[step:Forward direction — assuming RH, bound $|\psi(x) - x|$]
Assume the Riemann Hypothesis: every non-trivial zero $\rho = \beta + i\gamma$ satisfies $\beta = 1/2$. Then $|x^\rho| = x^{1/2}$ and $|\rho| = \sqrt{1/4 + \gamma^2}$.
The sum $\sum_\rho x^\rho/\rho$ over all zeros does not converge absolutely, so we use a truncated form. Fix a truncation height $T$ with $2 \leq T \leq x$; the truncated explicit formula (proved alongside the formula in the cited reference) gives
\begin{align*}
\psi(x) - x &= -\sum_{|\gamma| \leq T} \frac{x^\rho}{\rho} + R(x, T),
\end{align*}
with remainder bound
\begin{align*}
|R(x, T)| &\leq C \left( \frac{x (\log x)^2}{T} + \log x \right)
\end{align*}
for some absolute constant $C > 0$, valid uniformly for $x \geq 2$ and $2 \leq T \leq x$ with $x$ not a prime power (at prime powers the formula holds after adjusting by $O(\log x)$).
**Truncated zero-sum estimate under RH.** Under RH,
\begin{align*}
\left| \sum_{|\gamma| \leq T} \frac{x^\rho}{\rho} \right| &\leq \sum_{|\gamma| \leq T} \frac{x^{1/2}}{|\rho|} = x^{1/2} \sum_{|\gamma| \leq T} \frac{1}{\sqrt{1/4 + \gamma^2}}.
\end{align*}
By the Riemann--von Mangoldt formula (Step 3), split the sum into dyadic intervals $|\gamma| \in (2^k, 2^{k+1}]$ for $0 \leq k \leq \log_2 T$, plus an initial block $|\gamma| \leq 1$ containing $O(1)$ zeros. Each dyadic block contains $O(2^k \log 2^{k+1}) = O(2^k (k+1))$ zeros, each contributing at most $2^{-k}$ to the sum (since $|\rho| \geq |\gamma| > 2^k$). Summing:
\begin{align*}
\sum_{|\gamma| \leq T} \frac{1}{|\rho|} &\leq O(1) + \sum_{k=0}^{\lfloor \log_2 T \rfloor} O(2^k (k+1)) \cdot 2^{-k} = O\!\left( \sum_{k=0}^{\lfloor \log_2 T \rfloor} (k+1) \right) = O((\log T)^2).
\end{align*}
Hence
\begin{align*}
\left| \sum_{|\gamma| \leq T} \frac{x^\rho}{\rho} \right| &\leq C_1 \sqrt{x} (\log T)^2.
\end{align*}
**Optimise $T$.** Combining with the remainder bound:
\begin{align*}
|\psi(x) - x| &\leq C_1 \sqrt{x} (\log T)^2 + C \frac{x (\log x)^2}{T} + C \log x.
\end{align*}
Choose $T = \sqrt{x}$. Then $(\log T)^2 = \frac{1}{4} (\log x)^2$ and $x/T = \sqrt{x}$, so
\begin{align*}
|\psi(x) - x| &\leq \frac{C_1}{4} \sqrt{x} (\log x)^2 + C \sqrt{x} (\log x)^2 + C \log x = O(\sqrt{x} (\log x)^2),
\end{align*}
which is bound (B). By Step 1, this implies bound (A): $|\pi(x) - \mathrm{li}(x)| = O(\sqrt{x} \log x)$.
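The trade-off in the choice of $T$ can be tabulated directly. The sketch below is only an illustration of the balancing step: it sets the unspecified constants $C_1$ and $C$ to $1$ (an assumption, since the proof does not determine them), and the function names are hypothetical.

```python
import math

def bound_terms(x, T, c1=1.0, c=1.0):
    """Return (zero-sum bound, remainder bound) for truncation height T."""
    zero_sum = c1 * math.sqrt(x) * math.log(T) ** 2
    remainder = c * (x * math.log(x) ** 2 / T + math.log(x))
    return zero_sum, remainder

def total_bound(x, T):
    """Sum of both bounds, with the illustrative constants c1 = c = 1."""
    return sum(bound_terms(x, T))

x = 1e6
target = math.sqrt(x) * math.log(x) ** 2  # the size claimed in bound (B)
for T in (1e1, 1e2, 1e3, 1e4, 1e5, 1e6):
    zero_sum, remainder = bound_terms(x, T)
    # Ratio of the total to sqrt(x) (log x)^2; it is O(1) once T >= sqrt(x).
    print(f"T={T:>9.0f}  zero_sum={zero_sum:12.1f}  remainder={remainder:14.1f}  "
          f"ratio={(zero_sum + remainder) / target:8.3f}")
```

For $T$ below $\sqrt{x}$ the remainder dominates and the ratio is large; from $T = \sqrt{x}$ onwards both terms are of order $\sqrt{x}(\log x)^2$, so any such choice yields (B), and $T = \sqrt{x}$ is the simplest.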
[guided]
The heart of the forward direction is to estimate the oscillatory sum $\sum_\rho x^\rho/\rho$ under RH. Each zero contributes a term of size $x^{1/2}/|\rho|$ because $|x^\rho| = x^{\operatorname{Re}(\rho)} = x^{1/2}$ under RH. The question is: how do we sum the $1/|\rho|$?
The obstruction is that $\sum_\rho 1/|\rho|$ diverges logarithmically (by Riemann--von Mangoldt, the density of zeros grows like $\log T$, and $\int^T \log t / t \, dt = (\log T)^2/2$, divergent). So we cannot directly sum over all zeros — we must truncate at some height $T$ and absorb the tail into a remainder term.
The remainder $R(x, T)$ in the truncated explicit formula comes from truncating the Perron integral at height $\pm T$ instead of integrating over the whole vertical line; its size is governed by $x/T$, up to logarithmic factors. We then optimise the trade-off: making $T$ larger shrinks the remainder but enlarges the bound on the zero sum. The two are balanced at $T \asymp \sqrt{x}$, where $\sqrt{x}(\log T)^2$ and $x(\log x)^2/T$ are both of order $\sqrt{x}(\log x)^2$.
Why does the final bound have $(\log x)^2$ rather than $\log x$? Because the Riemann--von Mangoldt density is $\log \gamma$, giving a $(\log T)^2$ sum after integrating $\log T / T \cdot dT$ up to $T$. The extra logarithm can then be shaved by the Abel-summation conversion of Step 1, which replaces $(\log x)^2 / \log x$ with $\log x$; this is why the theorem statement is $O(\sqrt{x} \log x)$ for $\pi(x) - \mathrm{li}(x)$ rather than $O(\sqrt{x} (\log x)^2)$.
Why the bound for $\psi$ carries $(\log x)^2$ while the bound for $\pi$ carries only $\log x$: the conversion from $\vartheta$ to $\pi$ divides the boundary term by $\log x$, and the integral term contributes only $O(\sqrt{x})$, so exactly one power of the logarithm is absorbed.
[/guided]
[/step]
[step:Reverse direction — contrapositive, assuming a zero off the critical line]
We prove: if $\zeta$ has a non-trivial zero $\rho_0 = \beta_0 + i\gamma_0$ with $\beta_0 > 1/2$, then $\psi(x) - x \neq O(\sqrt{x} (\log x)^2)$; more strongly, we show
\begin{align*}
\limsup_{x \to \infty} \frac{|\psi(x) - x|}{x^{\beta_0 - \varepsilon}} &= +\infty \qquad \text{for every } \varepsilon > 0,
\end{align*}
which contradicts (B) since $\beta_0 - \varepsilon > 1/2$ for sufficiently small $\varepsilon > 0$.
**Setup.** By the functional equation $\zeta(s) = \chi(s) \zeta(1 - s)$ for an explicit $\chi$, zeros of $\zeta$ in the critical strip come in pairs $\rho, 1 - \rho$. The supremum $\Theta := \sup_\rho \operatorname{Re}(\rho)$ over non-trivial zeros is therefore $\geq 1/2$, with $\Theta = 1/2$ iff RH holds. We assume $\Theta > 1/2$, and must show (B) fails.
By the explicit formula (Step 2), the zero $\rho_0$ and its complex conjugate $\overline{\rho_0}$ (also a zero, by the Schwarz reflection $\zeta(\bar s) = \overline{\zeta(s)}$) together contribute the terms below. Note that the functional-equation partner $1 - \rho_0$ is a different zero, with real part $1 - \beta_0 < 1/2$, and plays no role here.
\begin{align*}
-\frac{x^{\rho_0}}{\rho_0} - \frac{x^{\overline{\rho_0}}}{\overline{\rho_0}} &= -2\operatorname{Re}\!\left( \frac{x^{\rho_0}}{\rho_0} \right) = -\frac{2 x^{\beta_0}}{|\rho_0|} \cos(\gamma_0 \log x - \arg \rho_0).
\end{align*}
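This is an oscillating term of amplitude $2 x^{\beta_0} / |\rho_0|$: along a sequence of $x \to \infty$ on which the cosine equals $\pm 1$, its absolute value is a constant multiple of $x^{\beta_0}$, so taken alone it is not $O(x^{\beta_0 - \varepsilon})$ for any $\varepsilon > 0$.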
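The closed form for the paired contribution is a purely algebraic identity and can be checked with complex arithmetic. In the sketch below the point $\rho = 0.75 + 20i$ is a hypothetical test value, not an actual zero of $\zeta$; the function names are likewise illustrative.

```python
import cmath
import math

def pair_direct(x, rho):
    """-(x^rho / rho + x^conj(rho) / conj(rho)), computed term by term."""
    log_x = math.log(x)
    rho_bar = rho.conjugate()
    return -(cmath.exp(rho * log_x) / rho + cmath.exp(rho_bar * log_x) / rho_bar)

def pair_closed_form(x, rho):
    """-(2 x^beta / |rho|) * cos(gamma log x - arg rho)."""
    beta, gamma = rho.real, rho.imag
    return -2 * x ** beta / abs(rho) * math.cos(gamma * math.log(x) - cmath.phase(rho))

rho = complex(0.75, 20.0)  # hypothetical off-line point, for illustration only
for x in (10.0, 1e3, 1e6):
    direct = pair_direct(x, rho)
    # The imaginary part cancels and the real part matches the cosine form.
    print(x, direct.real, direct.imag, pair_closed_form(x, rho))
```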
**Extracting oscillations via Landau's oscillation theorem.** To make the argument rigorous against the possibility of other zeros cancelling the contribution of $\rho_0$, invoke [Landau's Oscillation Theorem](/theorems/???) in the following form. Let $f(x) = \psi(x) - x$. For $\operatorname{Re}(s) > 1$, its Mellin transform is
\begin{align*}
\tilde f(s) &= \int_1^\infty f(x) x^{-s-1} \, d\mathcal{L}^1(x) = -\frac{\zeta'(s)}{s\,\zeta(s)} - \frac{1}{s - 1}.
\end{align*}
If $f(x) = O(x^\alpha)$ for some $\alpha > 1/2$, then the integral converges absolutely and is holomorphic in the half-plane $\operatorname{Re}(s) > \alpha$, so the right-hand side extends holomorphically to that half-plane; we continue to write $\tilde f$ for the meromorphic continuation given by the right-hand side. Contrapositively, a singularity of $\tilde f$ at a point $s_0$ with $\operatorname{Re}(s_0) = \sigma_0 > 1/2$ rules out $f(x) = O(x^{\alpha})$ for every $\alpha < \sigma_0$, that is, $\limsup_{x \to \infty} x^{-(\sigma_0 - \varepsilon)} |f(x)| = +\infty$ for every $\varepsilon > 0$.
Landau's theorem applied contrapositively: the function $\tilde f(s)$ has a pole at every non-trivial zero $\rho$ of $\zeta$ (because $-\zeta'/\zeta$ does). Hence the abscissa of holomorphy of $\tilde f$ is exactly $\Theta = \sup_\rho \operatorname{Re}(\rho)$. If $\Theta > 1/2$, then
\begin{align*}
\limsup_{x \to \infty} \frac{|\psi(x) - x|}{x^{\Theta - \varepsilon}} &= +\infty \qquad \text{for every } \varepsilon > 0,
\end{align*}
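because otherwise $\psi(x) - x = O(x^{\Theta - \varepsilon})$ for some $\varepsilon > 0$, and the Mellin transform would be holomorphic in $\operatorname{Re}(s) > \Theta - \varepsilon$; but by definition of the supremum there is a zero $\rho$ with $\operatorname{Re}(\rho) > \Theta - \varepsilon$, at which $\tilde f$ has a pole, a contradiction. (The supremum $\Theta$ need not be attained, so the argument uses a zero close to the line $\operatorname{Re}(s) = \Theta$ rather than on it.)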
**Conclusion of the contrapositive.** If (B) held, then $\psi(x) - x = O(\sqrt{x}(\log x)^2) = O(x^{1/2 + \delta})$ for every $\delta > 0$. By Landau's theorem, the abscissa of holomorphy of $\tilde f$ would be $\leq 1/2 + \delta$ for every $\delta > 0$, hence $\leq 1/2$. Therefore $\Theta \leq 1/2$, i.e., every non-trivial zero has real part $\leq 1/2$. Combined with the functional-equation symmetry $\rho \leftrightarrow 1 - \rho$ (which maps zeros to zeros and swaps real parts $\beta$ with $1 - \beta$), every zero has $\operatorname{Re}(\rho) = 1/2$, which is RH.
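The link between the Mellin transform of $\psi$ and $-\zeta'/\zeta$ rests on a finite identity that can be verified numerically: for every integer $N \geq 1$, $\int_1^N \psi(x) x^{-s-1} \, dx = \frac{1}{s} \sum_{n \leq N} \Lambda(n) (n^{-s} - N^{-s})$, and letting $N \to \infty$ for $\operatorname{Re}(s) > 1$ gives $-\zeta'(s)/(s\zeta(s))$. The sketch below checks the finite identity; the function names and the sample point $s = 2 + 3i$ are illustrative choices.

```python
import math

def mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, else 0 (including n = 1)."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # prime n, or n == 1 where log 1 = 0

def truncated_mellin_of_psi(s, N):
    """Integral of psi(x) x^(-s-1) over [1, N], one unit interval at a time."""
    total, psi = 0j, 0.0
    for m in range(1, N):
        psi += mangoldt(m)  # psi(x) = psi(m) for x in [m, m + 1)
        total += psi * (m ** (-s) - (m + 1) ** (-s)) / s
    return total

def truncated_dirichlet_side(s, N):
    """(1/s) * sum over n <= N of Lambda(n) * (n^(-s) - N^(-s))."""
    return sum(mangoldt(n) * (n ** (-s) - N ** (-s)) for n in range(1, N + 1)) / s

s = complex(2.0, 3.0)
print(truncated_mellin_of_psi(s, 200), truncated_dirichlet_side(s, 200))

# At s = 2 the Dirichlet side approaches -zeta'(2) / (2 zeta(2)) as N grows.
print(2 * truncated_dirichlet_side(2.0, 2000).real)
```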
[guided]
The reverse direction is more delicate than the forward direction because we cannot simply read off the contribution of a single zero from the explicit formula — the full sum over all zeros could in principle conspire to cancel the "bad" zero's contribution. Landau's oscillation theorem is precisely the tool to rule this out.
Landau's theorem says: the Mellin transform of a function $f$ with $f(x) = O(x^\alpha)$ is holomorphic in $\operatorname{Re}(s) > \alpha$. We use it in contrapositive form: a singularity of the Mellin transform at $s_0$ with $\operatorname{Re}(s_0) = \alpha$ means that $f(x) = O(x^{\alpha - \varepsilon})$ fails for every $\varepsilon > 0$, that is, $\limsup |f(x)|/x^{\alpha - \varepsilon} = +\infty$ for every $\varepsilon > 0$.
Applied to our problem: the Mellin transform of $\psi(x) - x$ is $-\zeta'(s)/(s \zeta(s)) - 1/(s-1)$, which has a pole at every non-trivial zero of $\zeta$. The supremum of the real parts of these poles is $\Theta$, the supremum of real parts of non-trivial zeros (it need not be attained by any single zero). Hence the abscissa of holomorphy of the Mellin transform of $\psi(x) - x$ equals $\Theta$, and $|\psi(x) - x|$ cannot be $O(x^{\Theta - \varepsilon})$ for any $\varepsilon > 0$.
If $\Theta > 1/2$, this defeats the bound $O(\sqrt{x}(\log x)^2) \subset O(x^{1/2 + \delta})$ for small enough $\delta$. Hence (B) forces $\Theta \leq 1/2$, and the functional-equation symmetry upgrades this to $\Theta = 1/2$, which is RH.
Why is the functional-equation symmetry needed? Without it, the conclusion would only be "no zero has real part $> 1/2$", leaving open zeros with $\beta < 1/2$. But the functional equation $\zeta(s) = \chi(s)\zeta(1-s)$ pairs zeros at $\rho$ with zeros at $1 - \rho$ (the point is that a zero at $\beta + i\gamma$ with $\beta < 1/2$ would force a zero at $1 - \beta - i\gamma$ with $1 - \beta > 1/2$, which we have excluded).
[/guided]
[/step]
[step:Combine both directions to conclude the equivalence]
By Step 4 (forward direction): RH implies $|\psi(x) - x| = O(\sqrt{x}(\log x)^2)$, which by Step 1 implies $|\pi(x) - \mathrm{li}(x)| = O(\sqrt{x}\log x)$.
By Step 1 (reverse Abel summation), the bound $|\pi(x) - \mathrm{li}(x)| = O(\sqrt{x}\log x)$ implies $|\psi(x) - x| = O(\sqrt{x}(\log x)^2)$, which by Step 5 (reverse direction) implies that every non-trivial zero of $\zeta$ has real part exactly $1/2$, i.e., RH.
Therefore the two statements are equivalent:
\begin{align*}
|\pi(x) - \mathrm{li}(x)| &= O(\sqrt{x} \log x) \quad \iff \quad \text{every non-trivial zero of } \zeta \text{ has } \operatorname{Re}(s) = \tfrac{1}{2}.
\end{align*}
The explicit Schoenfeld bound $|\pi(x) - \mathrm{li}(x)| \leq \frac{1}{8\pi} \sqrt{x} \log x$ for $x \geq 2657$, assuming RH, sharpens the forward direction by tracking constants through the dyadic zero-sum estimate and optimising the truncation $T$ more carefully; its proof is due to Schoenfeld (1976) and follows the same template, with explicit numerical bounds on the coefficients appearing in the Riemann--von Mangoldt formula and the truncated explicit formula.
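For small $x$ the inequality can be checked unconditionally by direct computation. The sketch below is a numerical illustration, not a proof: it uses this page's convention $\mathrm{li}(x) = \int_2^x dt/\log t$, which differs from the principal-value logarithmic integral by the constant $\approx 1.045$, it approximates the integral with composite Simpson quadrature, and the function names are hypothetical.

```python
import math

def prime_count(n):
    """pi(n) via the sieve of Eratosthenes."""
    if n < 2:
        return 0
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return sum(sieve)

def li_from_2(x, steps=200_000):
    """Integral of 1/log t over [2, x] by composite Simpson (steps must be even)."""
    h = (x - 2) / steps
    total = 1 / math.log(2) + 1 / math.log(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) / math.log(2 + i * h)
    return total * h / 3

def schoenfeld_bound(x):
    """Right-hand side sqrt(x) log(x) / (8 pi) of the stated inequality."""
    return math.sqrt(x) * math.log(x) / (8 * math.pi)

for x in (2657, 10_000, 100_000):
    diff = abs(prime_count(x) - li_from_2(x))
    # In each row the difference stays below the bound.
    print(x, prime_count(x), round(li_from_2(x), 3), round(diff, 3),
          round(schoenfeld_bound(x), 3))
```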
This completes the proof of the equivalence.
[/step]