[proofplan]
We expand the residue-class indicator using the orthogonality relations for Dirichlet characters modulo the fixed modulus $q$. With the character set, the principal character, and the twisted Chebyshev sums defined in the first step, this expansion expresses the residue-class Chebyshev function $\psi(x;q,a)$ as an average of twisted sums. The principal character contributes the main term, while every nonprincipal character contributes an error negligible compared with $x$. Here and below, $r(x)=o(x)$ means
\begin{align*}
\frac{r(x)}{x}\to0
\end{align*}
as $x \to \infty$. Since there are only finitely many characters modulo the fixed modulus, the total nonprincipal contribution remains $o(x)$.
[/proofplan]
[step:Expand the residue class by Dirichlet character orthogonality]
Let $\mathcal{X}_q$ denote the finite set of Dirichlet characters modulo $q$, extended to all integers by setting $\chi(n)=0$ whenever $\gcd(n,q)>1$. Let $\chi_0 \in \mathcal{X}_q$ denote the principal character modulo $q$. Let $A_a := \{m \in \mathbb{N} : m \equiv a \pmod q\}$, and define the indicator function
\begin{align*}
\mathbb{1}_{A_a}: \mathbb{N} &\to \{0,1\} \\
m &\mapsto
\begin{cases}
1, & m \in A_a, \\
0, & m \notin A_a.
\end{cases}
\end{align*}
We write $\mathbb{1}_{\{n \equiv a \pmod q\}}$ for the value $\mathbb{1}_{A_a}(n)$. For each $\chi \in \mathcal{X}_q$, define the twisted Chebyshev function
\begin{align*}
\psi_\chi : [1,\infty) &\to \mathbb{C} \\
x &\mapsto \sum_{n \leq x} \Lambda(n)\chi(n).
\end{align*}
By the [Orthogonality Relation for Dirichlet Characters](/theorems/TEMP-5), for every $n \in \mathbb{N}$,
\begin{align*}
\mathbb{1}_{\{n \equiv a \pmod q\}}
=
\frac{1}{\varphi(q)}
\sum_{\chi \in \mathcal{X}_q} \overline{\chi(a)}\chi(n).
\end{align*}
Indeed, if $n \equiv a \pmod q$, then $\gcd(n,q)=\gcd(a,q)=1$ and the sum over $\chi$ equals $\varphi(q)$; if $n \not\equiv a \pmod q$, then either $\gcd(n,q)>1$, in which case $\chi(n)=0$ for every $\chi$, or $n$ represents a different unit class modulo $q$, in which case character orthogonality gives zero.
Multiplying by $\Lambda(n)$ and summing over $n \leq x$ gives
\begin{align*}
\psi(x;q,a)
&=
\sum_{n \leq x}\Lambda(n)\mathbb{1}_{\{n \equiv a \pmod q\}} \\
&=
\frac{1}{\varphi(q)}
\sum_{\chi \in \mathcal{X}_q}
\overline{\chi(a)}
\sum_{n \leq x}\Lambda(n)\chi(n) \\
&=
\frac{1}{\varphi(q)}
\sum_{\chi \in \mathcal{X}_q}
\overline{\chi(a)}\psi_\chi(x).
\end{align*}
[guided]
The purpose of Dirichlet characters is to replace the arithmetic condition $n \equiv a \pmod q$ by a finite Fourier expansion on the multiplicative group $(\mathbb{Z}/q\mathbb{Z})^\times$.
Let $\mathcal{X}_q$ be the set of Dirichlet characters modulo $q$, where each character is extended to all integers by the rule $\chi(n)=0$ when $\gcd(n,q)>1$. Let $\chi_0$ be the principal character, so $\chi_0(n)=1$ for $\gcd(n,q)=1$ and $\chi_0(n)=0$ otherwise. Let $A_a := \{m \in \mathbb{N} : m \equiv a \pmod q\}$, and define the indicator function
\begin{align*}
\mathbb{1}_{A_a}: \mathbb{N} &\to \{0,1\} \\
m &\mapsto
\begin{cases}
1, & m \in A_a, \\
0, & m \notin A_a.
\end{cases}
\end{align*}
We write $\mathbb{1}_{\{n \equiv a \pmod q\}}$ for the value $\mathbb{1}_{A_a}(n)$. For each character $\chi \in \mathcal{X}_q$, define
\begin{align*}
\psi_\chi : [1,\infty) &\to \mathbb{C} \\
x &\mapsto \sum_{n \leq x} \Lambda(n)\chi(n).
\end{align*}
We now use the [Orthogonality Relation for Dirichlet Characters](/theorems/TEMP-5):
\begin{align*}
\frac{1}{\varphi(q)}
\sum_{\chi \in \mathcal{X}_q}\overline{\chi(a)}\chi(n)
=
\mathbb{1}_{\{n \equiv a \pmod q\}}.
\end{align*}
The hypothesis $\gcd(a,q)=1$ is used here because $a$ must represent an element of the unit group modulo $q$. If $n \equiv a \pmod q$, then $n$ is also coprime to $q$, and the sum over characters is the full orthogonality sum at the identity element, which equals $\varphi(q)$; after division by $\varphi(q)$ the left-hand side equals $1$. If $n$ is not congruent to $a$ modulo $q$, then either $n$ is not coprime to $q$, so all characters vanish at $n$, or $n$ is a different unit class, so the orthogonality sum is zero.
Therefore
\begin{align*}
\psi(x;q,a)
&=
\sum_{n \leq x}\Lambda(n)\mathbb{1}_{\{n \equiv a \pmod q\}} \\
&=
\sum_{n \leq x}
\Lambda(n)
\frac{1}{\varphi(q)}
\sum_{\chi \in \mathcal{X}_q}\overline{\chi(a)}\chi(n).
\end{align*}
Because $\mathcal{X}_q$ is finite, the character sum can be interchanged with the finite sum over $n \leq x$, giving
\begin{align*}
\psi(x;q,a)
&=
\frac{1}{\varphi(q)}
\sum_{\chi \in \mathcal{X}_q}
\overline{\chi(a)}
\sum_{n \leq x}\Lambda(n)\chi(n) \\
&=
\frac{1}{\varphi(q)}
\sum_{\chi \in \mathcal{X}_q}
\overline{\chi(a)}\psi_\chi(x).
\end{align*}
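As a sanity check, the decomposition can be verified by hand in a small case. The values $q=4$, $a=3$, $x=10$ used below are chosen only for illustration and play no role in the proof. Modulo $4$ there are $\varphi(4)=2$ characters: the principal character $\chi_0$ and the character $\chi_1$ with $\chi_1(n)=1$ for $n\equiv 1 \pmod 4$, $\chi_1(n)=-1$ for $n\equiv 3 \pmod 4$, and $\chi_1(n)=0$ for even $n$. Since $\overline{\chi_1(3)}=-1$, the orthogonality expansion reads
\begin{align*}
\frac{1}{2}\bigl(\chi_0(n)-\chi_1(n)\bigr)
=
\begin{cases}
1, & n \equiv 3 \pmod 4, \\
0, & \text{otherwise}.
\end{cases}
\end{align*}
The only $n \leq 10$ with $\Lambda(n)\neq 0$ and $n$ odd are $3,5,7,9$, so
\begin{align*}
\psi_{\chi_0}(10) &= \Lambda(3)+\Lambda(5)+\Lambda(7)+\Lambda(9) = 2\log 3+\log 5+\log 7, \\
\psi_{\chi_1}(10) &= -\Lambda(3)+\Lambda(5)-\Lambda(7)+\Lambda(9) = \log 5-\log 7, \\
\frac{1}{2}\bigl(\psi_{\chi_0}(10)-\psi_{\chi_1}(10)\bigr) &= \log 3+\log 7 = \Lambda(3)+\Lambda(7) = \psi(10;4,3).
\end{align*}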
[/guided]
[/step]
[step:Isolate the principal character contribution]
Since $\gcd(a,q)=1$, the principal character satisfies $\chi_0(a)=1$. Hence the decomposition from the previous step becomes
\begin{align*}
\psi(x;q,a)
=
\frac{1}{\varphi(q)}\psi_{\chi_0}(x)
+
\frac{1}{\varphi(q)}
\sum_{\substack{\chi \in \mathcal{X}_q \\ \chi \neq \chi_0}}
\overline{\chi(a)}\psi_\chi(x).
\end{align*}
For the principal character,
\begin{align*}
\psi_{\chi_0}(x)
=
\sum_{\substack{n \leq x \\ \gcd(n,q)=1}}\Lambda(n).
\end{align*}
Let
\begin{align*}
E_q(x) := \sum_{\substack{n \leq x \\ \gcd(n,q)>1}}\Lambda(n).
\end{align*}
Then
\begin{align*}
\psi_{\chi_0}(x)=\psi(x)-E_q(x),
\end{align*}
where $\psi:[1,\infty)\to\mathbb R$ is the ordinary Chebyshev function defined by
\begin{align*}
\psi(x):=\sum_{n\leq x}\Lambda(n).
\end{align*}
If $p$ ranges over the finitely many primes dividing $q$, then only prime powers $p^k \leq x$ with $p \mid q$ contribute to $E_q(x)$, so
\begin{align*}
0 \leq E_q(x)
\leq
\sum_{p \mid q}\sum_{\substack{k \geq 1 \\ p^k \leq x}}\log p
\leq
\sum_{p \mid q}(\log x)
=
\omega(q)\log x,
\end{align*}
where $\omega(q)$ denotes the number of distinct prime divisors of $q$. Thus $E_q(x)=O_q(\log x)=o(x)$.
By the [Prime Number Theorem for Dirichlet Characters](/theorems/TEMP-48), applied to the principal character,
\begin{align*}
\psi(x)\sim x.
\end{align*}
Combining this with the bound $E_q(x)=o(x)$ gives
\begin{align*}
\psi_{\chi_0}(x)
=
x+o(x).
\end{align*}
[guided]
The principal character is the only character whose twisted Chebyshev sum has a main term of size $x$. Since $\gcd(a,q)=1$, the principal character satisfies $\chi_0(a)=1$, and therefore $\overline{\chi_0(a)}=1$. Separating $\chi_0$ from the character decomposition gives
\begin{align*}
\psi(x;q,a)
=
\frac{1}{\varphi(q)}\psi_{\chi_0}(x)
+
\frac{1}{\varphi(q)}
\sum_{\substack{\chi \in \mathcal{X}_q \\ \chi \neq \chi_0}}
\overline{\chi(a)}\psi_\chi(x).
\end{align*}
We now identify the size of $\psi_{\chi_0}(x)$. Since $\chi_0(n)=1$ when $\gcd(n,q)=1$ and $\chi_0(n)=0$ otherwise,
\begin{align*}
\psi_{\chi_0}(x)
=
\sum_{\substack{n \leq x \\ \gcd(n,q)=1}}\Lambda(n).
\end{align*}
This is almost the ordinary Chebyshev function $\psi:[1,\infty)\to\mathbb R$ defined by
\begin{align*}
\psi(x):=\sum_{n \leq x}\Lambda(n),
\end{align*}
except that it omits the terms with $\gcd(n,q)>1$. Define the omitted part by
\begin{align*}
E_q(x) := \sum_{\substack{n \leq x \\ \gcd(n,q)>1}}\Lambda(n).
\end{align*}
Then
\begin{align*}
\psi_{\chi_0}(x)=\psi(x)-E_q(x).
\end{align*}
We estimate $E_q(x)$. The von Mangoldt function is supported on prime powers: $\Lambda(n)\neq 0$ only when $n=p^k$ for a prime $p$ and an integer $k\geq 1$, in which case $\Lambda(p^k)=\log p$. If $\gcd(p^k,q)>1$, then $p$ divides $q$. Therefore
\begin{align*}
0 \leq E_q(x)
\leq
\sum_{p \mid q}\sum_{\substack{k \geq 1 \\ p^k \leq x}}\log p.
\end{align*}
For a fixed prime $p$, the number of integers $k\geq 1$ with $p^k \leq x$ is at most $\log x/\log p$, so
\begin{align*}
\sum_{\substack{k \geq 1 \\ p^k \leq x}}\log p
\leq
\log x.
\end{align*}
Summing over the finitely many distinct prime divisors of $q$ gives
\begin{align*}
0 \leq E_q(x) \leq \omega(q)\log x,
\end{align*}
where $\omega(q)$ is the number of distinct prime divisors of $q$. Since $q$ is fixed, $\omega(q)\log x=o(x)$.
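To see the bound on $E_q(x)$ in a concrete case, consider the illustrative values $q=12$ and $x=100$, which are assumptions made only for this example. The primes dividing $12$ are $2$ and $3$, so $\omega(12)=2$. The powers of $2$ not exceeding $100$ are $2,4,\dots,64$ (six of them), and the powers of $3$ not exceeding $100$ are $3,9,27,81$ (four of them). Hence
\begin{align*}
E_{12}(100)
&= 6\log 2+4\log 3
= \log 64+\log 81 \\
&\leq \log 100+\log 100
= \omega(12)\log 100,
\end{align*}
with numerical values $E_{12}(100)\approx 8.55$ and $\omega(12)\log 100\approx 9.21$. Each prime contributes the logarithm of its largest power not exceeding $x$, which is why the contribution of each prime is at most $\log x$.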
Finally, by the [Prime Number Theorem for Dirichlet Characters](/theorems/TEMP-48), applied to the principal character,
\begin{align*}
\psi(x)\sim x.
\end{align*}
Combining this with $E_q(x)=o(x)$ yields
\begin{align*}
\psi_{\chi_0}(x)=\psi(x)-E_q(x)=x+o(x).
\end{align*}
[/guided]
[/step]
[step:Bound the total contribution of the nonprincipal characters]
Let $\chi \in \mathcal{X}_q$ be nonprincipal, that is, $\chi \neq \chi_0$. Since $q$ is fixed and $\chi$ is a nonprincipal Dirichlet character modulo $q$, the [Prime Number Theorem for Dirichlet Characters](/theorems/TEMP-48) applies and gives
\begin{align*}
\psi_\chi(x)=o(x)
\end{align*}
as $x \to \infty$.
Because $q$ is fixed, the set $\mathcal{X}_q$ is finite with cardinality $\varphi(q)$, independent of $x$. Since $\gcd(a,q)=1$, we also have $|\overline{\chi(a)}|=1$ for every $\chi \in \mathcal{X}_q$. Hence
\begin{align*}
\left|
\frac{1}{\varphi(q)}
\sum_{\substack{\chi \in \mathcal{X}_q \\ \chi \neq \chi_0}}
\overline{\chi(a)}\psi_\chi(x)
\right|
&\leq
\frac{1}{\varphi(q)}
\sum_{\substack{\chi \in \mathcal{X}_q \\ \chi \neq \chi_0}}
|\psi_\chi(x)| \\
&=
o(x).
\end{align*}
[guided]
Each nonprincipal character gives an oscillatory twist of the von Mangoldt function. The analytic input is the nonprincipal Chebyshev estimate for Dirichlet characters. Its hypotheses are satisfied here because the modulus $q$ is fixed and $\chi \in \mathcal{X}_q$ is assumed nonprincipal. Therefore, as $x \to \infty$,
\begin{align*}
\psi_\chi(x)=o(x)
\end{align*}
for every nonprincipal $\chi \in \mathcal{X}_q$ by the [Prime Number Theorem for Dirichlet Characters](/theorems/TEMP-48).
We must also check that summing these error terms still gives $o(x)$. This is where the hypothesis that $q$ is fixed is used. The set $\mathcal{X}_q$ has exactly $\varphi(q)$ elements, so the number of nonprincipal characters modulo $q$ is $\varphi(q)-1$, which is finite and independent of $x$. Since $\gcd(a,q)=1$, each $\chi(a)$ is a complex number of modulus $1$, so $|\overline{\chi(a)}|=1$. Thus
\begin{align*}
\left|
\frac{1}{\varphi(q)}
\sum_{\substack{\chi \in \mathcal{X}_q \\ \chi \neq \chi_0}}
\overline{\chi(a)}\psi_\chi(x)
\right|
&\leq
\frac{1}{\varphi(q)}
\sum_{\substack{\chi \in \mathcal{X}_q \\ \chi \neq \chi_0}}
|\overline{\chi(a)}|\,|\psi_\chi(x)| \\
&\leq
\frac{1}{\varphi(q)}
\sum_{\substack{\chi \in \mathcal{X}_q \\ \chi \neq \chi_0}}
|\psi_\chi(x)|.
\end{align*}
Every summand on the right is $o(x)$, and the number of summands, $\varphi(q)-1$, depends only on the fixed modulus $q$ and not on $x$. Therefore the entire nonprincipal contribution is $o(x)$.
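The passage from finitely many $o(x)$ terms to a single $o(x)$ bound can be spelled out as follows. This is a sketch; the thresholds $X_\chi$ and $X$ are auxiliary names introduced here for illustration. If there are no nonprincipal characters, the sum is empty and there is nothing to prove. Otherwise, fix $\varepsilon>0$. For each nonprincipal $\chi \in \mathcal{X}_q$, the estimate $\psi_\chi(x)=o(x)$ provides some $X_\chi$ with $|\psi_\chi(x)|\leq \varepsilon x$ for all $x \geq X_\chi$. Put $X:=\max_{\chi \neq \chi_0} X_\chi$, which is finite because the maximum is taken over $\varphi(q)-1$ characters. Then, for all $x \geq X$,
\begin{align*}
\frac{1}{\varphi(q)}
\sum_{\substack{\chi \in \mathcal{X}_q \\ \chi \neq \chi_0}}
|\psi_\chi(x)|
\leq
\frac{\varphi(q)-1}{\varphi(q)}\,\varepsilon x
\leq
\varepsilon x.
\end{align*}
Since $\varepsilon>0$ was arbitrary, the left-hand side is $o(x)$.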
[/guided]
[/step]
[step:Combine the principal main term with the nonprincipal error]
Substituting the estimates from the previous two steps into the character decomposition gives
\begin{align*}
\psi(x;q,a)
&=
\frac{1}{\varphi(q)}\psi_{\chi_0}(x)
+
\frac{1}{\varphi(q)}
\sum_{\substack{\chi \in \mathcal{X}_q \\ \chi \neq \chi_0}}
\overline{\chi(a)}\psi_\chi(x) \\
&=
\frac{1}{\varphi(q)}(x+o(x))+o(x) \\
&=
\frac{x}{\varphi(q)}+o(x).
\end{align*}
Since $\varphi(q)$ is a positive constant depending only on the fixed modulus $q$, this is equivalent to
\begin{align*}
\psi(x;q,a)\sim \frac{x}{\varphi(q)}.
\end{align*}
This proves the theorem.
[guided]
We now combine the two estimates already proved. The character decomposition gives
\begin{align*}
\psi(x;q,a)
&=
\frac{1}{\varphi(q)}\psi_{\chi_0}(x)
+
\frac{1}{\varphi(q)}
\sum_{\substack{\chi \in \mathcal{X}_q \\ \chi \neq \chi_0}}
\overline{\chi(a)}\psi_\chi(x).
\end{align*}
The principal-character step proved $\psi_{\chi_0}(x)=x+o(x)$, and the nonprincipal-character step proved that the whole second term is $o(x)$. Substituting these two estimates gives
\begin{align*}
\psi(x;q,a)
&=
\frac{1}{\varphi(q)}(x+o(x))+o(x) \\
&=
\frac{x}{\varphi(q)}+o(x).
\end{align*}
Since $q$ is fixed, $\varphi(q)$ is a positive constant independent of $x$, so $\varphi(q)\cdot o(x)=o(x)$. Dividing the last display by $x/\varphi(q)$ therefore gives
\begin{align*}
\frac{\psi(x;q,a)}{x/\varphi(q)} \to 1
\end{align*}
as $x \to \infty$, which is
\begin{align*}
\psi(x;q,a)\sim \frac{x}{\varphi(q)}.
\end{align*}
This proves the theorem.
[/guided]
[/step]