[step:Reduce the theorem to the asymptotic $\psi(x) \sim x$ for the Chebyshev function]
Define the Chebyshev functions
\begin{align*}
\vartheta : (0, \infty) &\to [0, \infty) \\
x &\mapsto \sum_{p \leq x,\, p \text{ prime}} \log p,
\end{align*}
\begin{align*}
\psi : (0, \infty) &\to [0, \infty) \\
x &\mapsto \sum_{p^k \leq x,\, p \text{ prime},\, k \geq 1} \log p.
\end{align*}
Equivalently $\psi(x) = \sum_{n \leq x} \Lambda(n)$ where $\Lambda$ is the [von Mangoldt function](/pages/Von%20Mangoldt%20Function) defined by $\Lambda(n) = \log p$ if $n = p^k$ for some prime $p$ and integer $k \geq 1$, and $\Lambda(n) = 0$ otherwise.
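As a quick illustration (a worked instance, not part of the argument), take $x = 10$. The primes up to $10$ are $2, 3, 5, 7$, and the prime powers up to $10$ are $2, 3, 4, 5, 7, 8, 9$, so
\begin{align*}
\vartheta(10) &= \log 2 + \log 3 + \log 5 + \log 7 = \log 210, \\
\psi(10) &= 3 \log 2 + 2 \log 3 + \log 5 + \log 7 = \log 2520.
\end{align*}
Here $2520 = \mathrm{lcm}(1, \dots, 10)$, an instance of the general identity $\psi(x) = \log \mathrm{lcm}(1, \dots, \lfloor x \rfloor)$ for $x \geq 1$.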
We claim that $\pi(x) \sim \mathrm{li}(x)$ follows once we establish
\begin{align*}
\psi(x) &\sim x \quad \text{as } x \to \infty. \tag{$*$}
\end{align*}
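First, the two Chebyshev functions differ by $o(x)$, so $(*)$ is equivalent to $\vartheta(x) \sim x$. Grouping prime powers by their exponent $k$, the difference is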
\begin{align*}
\psi(x) - \vartheta(x) &= \sum_{k \geq 2} \sum_{p^k \leq x} \log p = \sum_{k=2}^{\lfloor \log_2 x \rfloor} \vartheta(x^{1/k}) \leq (\log_2 x) \cdot \vartheta(\sqrt{x}) \leq (\log_2 x) \cdot \sqrt{x} \log x,
\end{align*}
using that $\vartheta$ is nondecreasing and the elementary bound $\vartheta(y) \leq \pi(y) \log y \leq y \log y$ for $y \geq 1$. Hence $\psi(x) - \vartheta(x) = O(\sqrt{x} (\log x)^2) = o(x)$, giving $\vartheta(x) \sim \psi(x) \sim x$ under $(*)$.
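For a concrete check of the grouping by exponent, take $x = 10$, where $\lfloor \log_2 10 \rfloor = 3$. The prime powers $p^k \leq 10$ with $k \geq 2$ are $4, 8, 9$, and
\begin{align*}
\psi(10) - \vartheta(10) &= \log 2 + \log 2 + \log 3 = \log 12, \\
\vartheta(10^{1/2}) + \vartheta(10^{1/3}) &= (\log 2 + \log 3) + \log 2 = \log 12.
\end{align*}
The final bound $(\log_2 10) \cdot \sqrt{10} \log 10 \approx 24.2$ is far from sharp at this small value, which is harmless since only the order $o(x)$ matters.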
Next, we convert $\vartheta(x) \sim x$ to $\pi(x) \sim x/\log x$ by [Abel Summation](/theorems/???). Write
\begin{align*}
\pi(x) &= \sum_{p \leq x} 1 = \sum_{p \leq x} \frac{\log p}{\log p},
\end{align*}
and apply Abel's identity to the sequence $a_p = \log p$ summed against the weight $f(t) = 1/\log t$ for $t \geq 2$:
\begin{align*}
\pi(x) &= \frac{\vartheta(x)}{\log x} + \int_2^x \frac{\vartheta(t)}{t (\log t)^2} \, d\mathcal{L}^1(t).
\end{align*}
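To derive this, set $a_n = \log n$ if $n$ is prime and $a_n = 0$ otherwise, so that $A(t) := \sum_{n \leq t} a_n = \vartheta(t)$. For any $y \in (1, 2)$, Abel summation on $(y, x]$ gives $\pi(x) = A(x)/\log x - A(y)/\log y - \int_y^x A(t) \cdot (1/\log t)' \, d\mathcal{L}^1(t)$. The boundary term at $y$ vanishes because $\vartheta(y) = 0$, the derivative is $(1/\log t)' = -1/(t (\log t)^2)$, and since $\vartheta(t) = 0$ for $t < 2$ the integral may be taken over $[2, x]$, whence the formula.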
Assuming $\vartheta(t) \sim t$, for every $\varepsilon > 0$ there is $T_\varepsilon \geq 2$ with $(1 - \varepsilon) t \leq \vartheta(t) \leq (1 + \varepsilon) t$ for $t \geq T_\varepsilon$. Then $\vartheta(x)/\log x \sim x/\log x$, and
\begin{align*}
\int_{T_\varepsilon}^x \frac{\vartheta(t)}{t(\log t)^2} \, d\mathcal{L}^1(t) &\leq (1 + \varepsilon) \int_{T_\varepsilon}^x \frac{d\mathcal{L}^1(t)}{(\log t)^2} = o\!\left( \frac{x}{\log x} \right)
\end{align*}
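as $x \to \infty$, by the elementary estimate $\int_2^x (\log t)^{-2} \, d\mathcal{L}^1(t) = O(x/(\log x)^2) = o(x/\log x)$ (integrate by parts or split at $\sqrt{x}$). The remaining piece $\int_2^{T_\varepsilon} \vartheta(t)/(t (\log t)^2) \, d\mathcal{L}^1(t)$ does not depend on $x$, so it is $O(1) = o(x/\log x)$ as well. Therefore $\pi(x) = x/\log x + o(x/\log x)$.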
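One way to obtain the elementary estimate on $\int_2^x (\log t)^{-2} \, d\mathcal{L}^1(t)$, sketched here for $x \geq 4$: split at $\sqrt{x}$ and bound the integrand by its value at the left endpoint of each piece, using that $(\log t)^{-2}$ is decreasing on $[2, \infty)$:
\begin{align*}
\int_2^x \frac{d\mathcal{L}^1(t)}{(\log t)^2} &= \int_2^{\sqrt{x}} \frac{d\mathcal{L}^1(t)}{(\log t)^2} + \int_{\sqrt{x}}^x \frac{d\mathcal{L}^1(t)}{(\log t)^2} \\
&\leq \frac{\sqrt{x}}{(\log 2)^2} + \frac{x}{(\log \sqrt{x})^2} = \frac{\sqrt{x}}{(\log 2)^2} + \frac{4x}{(\log x)^2}.
\end{align*}
Both terms are $O(x/(\log x)^2)$, hence $o(x/\log x)$.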
Finally, $\mathrm{li}(x) = \int_2^x (\log t)^{-1} \, d\mathcal{L}^1(t) = x/\log x + O(x/(\log x)^2)$ by integration by parts, so $\pi(x) \sim x/\log x \sim \mathrm{li}(x)$.
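The integration by parts behind this expansion, written out with the normalization $\mathrm{li}(x) = \int_2^x (\log t)^{-1} \, d\mathcal{L}^1(t)$ used here:
\begin{align*}
\mathrm{li}(x) &= \left[ \frac{t}{\log t} \right]_2^x + \int_2^x \frac{d\mathcal{L}^1(t)}{(\log t)^2} = \frac{x}{\log x} - \frac{2}{\log 2} + \int_2^x \frac{d\mathcal{L}^1(t)}{(\log t)^2}.
\end{align*}
The remaining integral is $O(x/(\log x)^2)$ and the constant $2/\log 2$ is $O(1)$, which gives the stated error term.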
It remains to prove $(*)$. The rest of the argument is devoted to this.
[/step]