[proofplan]
We prove the estimate by reducing the discrepancy in residue classes to averaged character sums and then applying the standard Bombieri-Vinogradov mean-value estimate for the von Mangoldt function. The principal character contributes the expected main term $x/\varphi(q)$, while the non-principal characters are controlled on average over $q \le x^{1/2}(\log x)^{-B}$. The parameter $B(A)$ is chosen large enough to absorb the logarithmic losses in the mean-value estimate.
[/proofplan]
[step:Express the progression error through non-principal Dirichlet characters]
For $q \in \mathbb{N}$ and $a \in \mathbb{Z}$ with $\gcd(a,q)=1$, let $\mathcal{X}(q)$ denote the finite set of Dirichlet characters modulo $q$, and let $\chi_0 \in \mathcal{X}(q)$ denote the principal character. For $x \ge 2$ and $\chi \in \mathcal{X}(q)$, define the character sum
\begin{align*}
\Psi(x,\chi) := \sum_{n \le x} \Lambda(n)\chi(n).
\end{align*}
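By the character orthogonality relation on the unit group $(\mathbb{Z}/q\mathbb{Z})^\times$, namely $\sum_{\chi \in \mathcal{X}(q)} \overline{\chi(a)}\chi(n) = \varphi(q)$ if $n \equiv a \pmod q$ and $0$ otherwise, for every reduced residue class $a \pmod q$ we have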
\begin{align*}
\psi(x;q,a) = \frac{1}{\varphi(q)}\sum_{\chi \in \mathcal{X}(q)} \overline{\chi(a)}\Psi(x,\chi).
\end{align*}
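Since $\chi_0(n)=1$ when $\gcd(n,q)=1$ and $\chi_0(n)=0$ otherwise, $\Psi(x,\chi_0)$ differs from $\sum_{n \le x}\Lambda(n)$ only by the prime powers $p^k \le x$ with $p \mid q$. The principal character contribution is therefore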
\begin{align*}
\frac{1}{\varphi(q)}\Psi(x,\chi_0) = \frac{x}{\varphi(q)} + O\left(\frac{1}{\varphi(q)}\sum_{p^k \le x,\ p \mid q} \log p\right) + O\left(\frac{x}{\varphi(q)}e^{-c\sqrt{\log x}}\right)
\end{align*}
for an absolute constant $c>0$, using the [prime number theorem](/theorems/1692) in the form $\sum_{n \le x}\Lambda(n)=x+O(xe^{-c\sqrt{\log x}})$. Since $|\chi(a)|=1$ whenever $\gcd(a,q)=1$, the triangle inequality gives
\begin{align*}
\max_{\gcd(a,q)=1}\left|\psi(x;q,a)-\frac{x}{\varphi(q)}\right| \le \frac{1}{\varphi(q)}\sum_{\substack{\chi \in \mathcal{X}(q),\ \chi \ne \chi_0}} |\Psi(x,\chi)| + E_q(x),
\end{align*}
where
\begin{align*}
E_q(x) := \frac{1}{\varphi(q)}\left|\Psi(x,\chi_0)-x\right| \ll \frac{1}{\varphi(q)}\sum_{p^k \le x,\ p \mid q} \log p + \frac{x}{\varphi(q)}e^{-c\sqrt{\log x}}.
\end{align*}
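As a numerical sanity check of the orthogonality expansion, the following Python sketch compares $\psi(x;q,a)$ computed directly with the character-sum expression. It is an illustration only and not part of the proof: the restriction to a prime modulus (so that the characters can be built from a primitive root), the brute-force helpers, and all function names are assumptions introduced here.

```python
import cmath
import math


def von_mangoldt(n):
    """Return Lambda(n): log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            # p is the smallest prime factor; strip every copy of it.
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # no factor up to sqrt(n), so n is prime


def primitive_root(q):
    """Smallest primitive root modulo a prime q, found by brute force."""
    for g in range(1, q):
        if len({pow(g, k, q) for k in range(q - 1)}) == q - 1:
            return g
    raise ValueError("q must be prime")


def characters(q):
    """All Dirichlet characters modulo a prime q; entry 0 is principal."""
    g = primitive_root(q)
    index = {pow(g, k, q): k for k in range(q - 1)}  # discrete logarithms

    def make(m):
        def chi(n):
            r = n % q
            if r == 0:
                return 0.0
            return cmath.exp(2j * cmath.pi * m * index[r] / (q - 1))
        return chi

    return [make(m) for m in range(q - 1)]


def psi_progression(x, q, a):
    """psi(x; q, a): sum of Lambda(n) over n <= x with n = a mod q."""
    return sum(von_mangoldt(n) for n in range(1, x + 1) if n % q == a % q)


def psi_twisted(x, chi):
    """Psi(x, chi): sum of Lambda(n) * chi(n) over n <= x."""
    return sum(von_mangoldt(n) * chi(n) for n in range(1, x + 1))


def psi_via_characters(x, q, a):
    """Right-hand side of the orthogonality expansion (prime q only)."""
    total = sum(chi(a).conjugate() * psi_twisted(x, chi)
                for chi in characters(q))
    return total / (q - 1)  # phi(q) = q - 1 for prime q
```

For instance, with the sample values $q=7$ and $x=200$, `psi_via_characters(200, 7, a)` and `psi_progression(200, 7, a)` should agree up to floating-point rounding for each $a \in \{1,\dots,6\}$, and the imaginary part of the former should be negligible.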
[/step]
[step:Control the principal-character error after summing over moduli]
Fix a parameter $B>0$, to be chosen in the final step, and let $Q: [3,\infty) \to (0,\infty)$ be defined by
\begin{align*}
Q(x) := x^{1/2}(\log x)^{-B}.
\end{align*}
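For $q \le Q(x) \le x^{1/2}$, each prime $p \mid q$ has at most $\log x/\log p$ powers $p^k \le x$ and so contributes at most $\log x$, while $q$ has at most $\log q/\log 2 \le \log x$ distinct prime divisors. Hence $\sum_{p^k \le x,\ p \mid q}\log p \le \sum_{p \mid q} \log x \le (\log x)^2$, which together with $1/\varphi(q) \le 1$ gives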
\begin{align*}
\sum_{q \le Q(x)} E_q(x) \ll Q(x)(\log x)^2 + x e^{-c\sqrt{\log x}}\sum_{q \le Q(x)}\frac{1}{\varphi(q)}.
\end{align*}
Using the standard estimate $\sum_{q \le Q}1/\varphi(q) \ll \log(2Q)$ for $Q \ge 1$ together with $Q(x) \le x^{1/2}$, this becomes
\begin{align*}
\sum_{q \le Q(x)} E_q(x) \ll x^{1/2}(\log x)^{2-B} + x e^{-c\sqrt{\log x}}\log x.
\end{align*}
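For every fixed $A>0$, both terms are $O_A(x(\log x)^{-A})$: the first is at most $x^{1/2}(\log x)^{2}$ because $B>0$, and $e^{-c\sqrt{\log x}}\log x$ decays faster than any fixed power of $\log x$.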
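The two elementary inputs of this step can be checked numerically for small parameters. The Python sketch below is a sanity check rather than a proof; the helper names and the constant $3$ used in the second comparison are assumptions chosen for illustration, not values taken from the argument.

```python
import math


def prime_divisors(n):
    """Distinct prime divisors of n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def euler_phi(n):
    """Euler's totient, via phi(n) = n * prod over p | n of (1 - 1/p)."""
    phi = n
    for p in prime_divisors(n):
        phi -= phi // p
    return phi


def prime_power_log_sum(x, q):
    """Sum of log p over prime powers p^k <= x with p dividing q."""
    total = 0.0
    for p in prime_divisors(q):
        pk = p
        while pk <= x:
            total += math.log(p)
            pk *= p
    return total


def inverse_phi_sum(Q):
    """Sum of 1/phi(q) over 1 <= q <= Q."""
    return sum(1.0 / euler_phi(q) for q in range(1, Q + 1))


if __name__ == "__main__":
    x = 10**4
    worst = max(prime_power_log_sum(x, q)
                for q in range(1, math.isqrt(x) + 1))
    print("largest prime-power sum:", worst, "vs (log x)^2:", math.log(x) ** 2)
    for Q in (1, 10, 100):
        print(Q, inverse_phi_sum(Q), 3 * math.log(2 * Q))
```

Only moduli $q \le x^{1/2}$ are tested in the first comparison, matching the range $q \le Q(x)$ used in the step.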
[/step]
[step:Apply the averaged character estimate with enough logarithmic saving]
We use the assumed Bombieri-Vinogradov character mean-value estimate: for every $A>0$ there exists $B_0=B_0(A)>0$ such that, with $Q(x)=x^{1/2}(\log x)^{-B_0}$,
\begin{align*}
\sum_{q \le Q(x)} \frac{1}{\varphi(q)}\sum_{\substack{\chi \in \mathcal{X}(q),\ \chi \ne \chi_0}} |\Psi(x,\chi)| \ll_A \frac{x}{(\log x)^A}.
\end{align*}
This is exactly the prerequisite stated in the theorem statement. It holds for every exponent $A>0$; the final step invokes it with $A+1$ in place of $A$.
[guided]
The point of passing to characters is that the arithmetic progression condition $n \equiv a \pmod q$ can be separated into characters, so the difficult part becomes bounding $\Psi(x,\chi)$ on average over $q$ and $\chi$. The required input is the assumed Bombieri-Vinogradov character mean-value estimate: for each fixed $A>0$, there is a logarithmic loss parameter $B_0(A)>0$ such that
\begin{align*}
\sum_{q \le x^{1/2}(\log x)^{-B_0}} \frac{1}{\varphi(q)}\sum_{\substack{\chi \in \mathcal{X}(q),\ \chi \ne \chi_0}} |\Psi(x,\chi)| \ll_A \frac{x}{(\log x)^A}.
\end{align*}
We verify the hypotheses in this application. The modulus range is exactly $q \le x^{1/2}(\log x)^{-B_0}$, the character sums are formed with $\Psi(x,\chi)=\sum_{n\le x}\Lambda(n)\chi(n)$, and the principal character is excluded because its contribution was isolated in the previous step.
[/guided]
[/step]
[step:Choose $B(A)$ and combine the estimates]
Fix $A>0$ and choose
\begin{align*}
B(A) := B_0(A+1)+A+3.
\end{align*}
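Since $B(A) \ge B_0(A+1)$ and $\log x > 1$ for $x \ge 3$, the range $q \le Q(x)=x^{1/2}(\log x)^{-B(A)}$ is contained in the modulus range permitted by the averaged character estimate with exponent $A+1$; as every summand there is non-negative, shortening the range cannot increase the sum. Summing the inequality from the first step over $q \le Q(x)$, using the principal-character error estimate from the second step, and applying the averaged character estimate from the third step with $A+1$ in place of $A$ gives a total of $O_A(x(\log x)^{-A-1}) + O_A(x(\log x)^{-A})$, that is,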
\begin{align*}
\sum_{q \le Q(x)} \max_{\gcd(a,q)=1}\left|\psi(x;q,a)-\frac{x}{\varphi(q)}\right| \ll_A \frac{x}{(\log x)^A}.
\end{align*}
This is precisely the asserted Bombieri-Vinogradov estimate, with the implicit constant depending only on $A$.
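To make the averaged quantity concrete, the following Python sketch evaluates the left-hand side $\sum_{q \le Q(x)} \max_{\gcd(a,q)=1}|\psi(x;q,a)-x/\varphi(q)|$ by brute force. It is a hedged illustration only: the function names and the sample values $x=10^4$, $B=1$ are hypothetical, and such small $x$ is far from the asymptotic regime, so the output does not test the theorem.

```python
import math


def von_mangoldt_table(x):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= x."""
    lam = [0.0] * (x + 1)
    is_composite = [False] * (x + 1)
    for p in range(2, x + 1):
        if is_composite[p]:
            continue
        # p is prime: mark its multiples and record log p at its powers.
        for m in range(2 * p, x + 1, p):
            is_composite[m] = True
        pk = p
        while pk <= x:
            lam[pk] = math.log(p)
            pk *= p
    return lam


def max_discrepancy(lam, x, q):
    """Max over reduced classes a mod q of |psi(x;q,a) - x/phi(q)|."""
    psi = [0.0] * q
    for n in range(1, x + 1):
        psi[n % q] += lam[n]
    reduced = [a for a in range(q) if math.gcd(a, q) == 1]
    main_term = x / len(reduced)  # len(reduced) equals phi(q)
    return max(abs(psi[a] - main_term) for a in reduced)


def averaged_discrepancy(x, B):
    """Sum of max_discrepancy over integers q <= sqrt(x) / (log x)^B."""
    lam = von_mangoldt_table(x)
    Q = math.sqrt(x) / math.log(x) ** B
    return sum(max_discrepancy(lam, x, q) for q in range(1, int(Q) + 1))


if __name__ == "__main__":
    x, B = 10**4, 1.0
    print("averaged discrepancy:", averaged_discrepancy(x, B))
    print("x / log x for comparison:", x / math.log(x))
```

The modulus $q=1$ is included, where the single class recovers $|\psi(x)-x|$.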
[/step]