[proofplan]
We identify Dirichlet characters modulo $q$ with the complex character group of the finite abelian group $G = (\mathbb{Z}/q\mathbb{Z})^\times$. Since $a$ and $n$ are both coprime to $q$, the summand becomes the value of each group character at the single group element $\bar n \bar a^{-1}$. The result is then the finite-group character orthogonality relation: the average of all characters is $1$ at the identity and $0$ away from the identity. We prove the needed separation fact for finite abelian groups directly, so no external character-theory theorem is being invoked.
[/proofplan]
[step:Translate the Dirichlet character sum to a finite abelian group character sum]
Let
\begin{align*}
G := (\mathbb{Z}/q\mathbb{Z})^\times
\end{align*}
be the multiplicative group of invertible residue classes modulo $q$. Its order is $|G|=\varphi(q)$. Let
\begin{align*}
\widehat G := \{\psi: G \to \mathbb{C}^\times : \psi(xy)=\psi(x)\psi(y) \text{ for all } x,y \in G\}
\end{align*}
be the group of complex-valued group characters of $G$.
A Dirichlet character modulo $q$ is precisely a character $\psi \in \widehat G$, viewed as a function $\chi:\mathbb{Z}\to\mathbb{C}$ by setting $\chi(m)=\psi(\bar m)$ when $\gcd(m,q)=1$ and $\chi(m)=0$ when $\gcd(m,q)>1$. Since $\gcd(a,q)=\gcd(n,q)=1$, both $\bar a$ and $\bar n$ lie in $G$, and every corresponding Dirichlet character satisfies
\begin{align*}
\chi(a)=\psi(\bar a),
\qquad
\chi(n)=\psi(\bar n).
\end{align*}
For every $\psi\in \widehat G$, the value $\psi(\bar a)$ is a root of unity, because $\bar a$ has finite order in $G$. Hence $|\psi(\bar a)|=1$ and
\begin{align*}
\overline{\psi(\bar a)}=\psi(\bar a)^{-1}=\psi(\bar a^{-1}).
\end{align*}
Therefore
\begin{align*}
\overline{\chi(a)}\,\chi(n)
=
\overline{\psi(\bar a)}\,\psi(\bar n)
=
\psi(\bar a^{-1})\psi(\bar n)
=
\psi(\bar n \bar a^{-1}).
\end{align*}
Defining
\begin{align*}
g := \bar n \bar a^{-1} \in G,
\end{align*}
the desired average becomes
\begin{align*}
\frac{1}{\varphi(q)}\sum_{\chi \bmod q}\overline{\chi(a)}\,\chi(n)
=
\frac{1}{|G|}\sum_{\psi\in\widehat G}\psi(g).
\end{align*}
[guided]
The point of this step is to remove the arithmetic notation and expose the underlying group-theoretic statement. Since $a$ and $n$ are both coprime to $q$, their residue classes $\bar a$ and $\bar n$ are elements of the unit group
\begin{align*}
G := (\mathbb{Z}/q\mathbb{Z})^\times.
\end{align*}
A Dirichlet character modulo $q$ is a group character
\begin{align*}
\psi: G \to \mathbb{C}^\times
\end{align*}
extended to all integers by assigning the value $0$ on residue classes not coprime to $q$. Because $a$ and $n$ are units modulo $q$, no zero values occur in this proof.
For $\psi\in\widehat G$, the value $\psi(\bar a)$ has finite multiplicative order, since $\bar a$ has finite order in the finite group $G$. Thus $\psi(\bar a)$ lies on the unit circle, and complex conjugation agrees with inversion:
\begin{align*}
\overline{\psi(\bar a)}=\psi(\bar a)^{-1}.
\end{align*}
Using the homomorphism property of $\psi$, we get
\begin{align*}
\overline{\chi(a)}\,\chi(n)
=
\overline{\psi(\bar a)}\,\psi(\bar n)
=
\psi(\bar a)^{-1}\psi(\bar n)
=
\psi(\bar a^{-1})\psi(\bar n)
=
\psi(\bar n\bar a^{-1}).
\end{align*}
Thus the whole Dirichlet character sum is the average of all group characters evaluated at the single element
\begin{align*}
g := \bar n\bar a^{-1}\in G.
\end{align*}
So it remains to prove the finite-group assertion
\begin{align*}
\frac{1}{|G|}\sum_{\psi\in\widehat G}\psi(g)
=
\begin{cases}
1, & g=e,\\
0, & g\ne e.
\end{cases}
\end{align*}
[/guided]
[/step]
[step:Prove that finite abelian group characters separate nonidentity elements]
[claim:Every nonidentity element of a finite abelian group is detected by a character]
Let $A$ be a finite abelian group, and let $x\in A$ with $x\ne e_A$. Then there exists a character
\begin{align*}
\theta: A \to \mathbb{C}^\times
\end{align*}
such that $\theta(x)\ne 1$.
[/claim]
[proof]
Let $m\in\mathbb{N}$ be the order of $x$, so $m>1$. Let $H:=\langle x\rangle\subset A$ be the cyclic subgroup generated by $x$. Define
\begin{align*}
\theta_0: H &\to \mathbb{C}^\times \\
x^k &\mapsto \exp(2\pi i k/m).
\end{align*}
This is well-defined because $x^k=x^\ell$ holds exactly when $k\equiv \ell \pmod m$, and the exponential above has period $m$. It is a group homomorphism, and
\begin{align*}
\theta_0(x)=\exp(2\pi i/m)\ne 1.
\end{align*}
It remains to extend $\theta_0$ from $H$ to all of $A$. Consider all pairs $(B,\eta)$ where $B$ is a subgroup of $A$ containing $H$ and
\begin{align*}
\eta:B\to\mathbb{C}^\times
\end{align*}
is a group homomorphism extending $\theta_0$. Since $A$ is finite, choose such a pair $(B,\eta)$ with $B$ maximal by inclusion.
We show that $B=A$. Suppose not. Choose $y\in A\setminus B$. Let $r\in\mathbb{N}$ be the smallest positive integer such that $y^r\in B$; such an $r$ exists because $A$ is finite. Choose $\lambda\in\mathbb{C}^\times$ satisfying
\begin{align*}
\lambda^r=\eta(y^r),
\end{align*}
which exists because every nonzero complex number has an $r$-th root.
Let $B' := \langle B,y\rangle$. Every element of $B'$ has a unique representation $b y^k$ with $b\in B$ and $0\le k<r$. Indeed, existence follows by reducing the exponent of $y$ modulo $r$, and uniqueness follows from the minimality of $r$: if $b y^k=b' y^\ell$ with $0\le k,\ell<r$, then $y^{k-\ell}\in B$, so $k=\ell$ and then $b=b'$.
Define
\begin{align*}
\eta':B' &\to \mathbb{C}^\times \\
b y^k &\mapsto \eta(b)\lambda^k
\end{align*}
for $b\in B$ and $0\le k<r$. The uniqueness just proved makes $\eta'$ well-defined. If $b,d\in B$ and $0\le k,\ell<r$, write $k+\ell=sr+t$ with $s\in\{0,1\}$ and $0\le t<r$. Since $A$ is abelian,
\begin{align*}
(b y^k)(d y^\ell)=bd(y^r)^s y^t.
\end{align*}
Therefore
\begin{align*}
\eta'((b y^k)(d y^\ell))
&=
\eta(bd(y^r)^s)\lambda^t \\
&=
\eta(b)\eta(d)\eta(y^r)^s\lambda^t \\
&=
\eta(b)\eta(d)\lambda^{rs+t} \\
&=
\eta(b)\lambda^k\,\eta(d)\lambda^\ell \\
&=
\eta'(b y^k)\eta'(d y^\ell).
\end{align*}
Thus $\eta'$ is a character on $B'$ extending $\eta$. Since $y\notin B$, we have $B\subsetneq B'$, contradicting the maximality of $B$. Hence $B=A$.
The maximal extension $\eta:A\to\mathbb{C}^\times$ extends $\theta_0$, so
\begin{align*}
\eta(x)=\theta_0(x)=\exp(2\pi i/m)\ne 1.
\end{align*}
Taking $\theta:=\eta$ proves the claim.
[/proof]
Apply the claim with $A=G$. If $g\ne e_G$, there exists $\psi_0\in\widehat G$ with $\psi_0(g)\ne 1$.
[/step]
[step:Count the characters of a finite abelian group]
[claim:A finite abelian group has exactly as many complex characters as elements]
Let $A$ be a finite abelian group, and let
\begin{align*}
\widehat A := \{\theta:A\to\mathbb{C}^\times : \theta(xy)=\theta(x)\theta(y) \text{ for all } x,y\in A\}
\end{align*}
be its complex character group. Then $|\widehat A|=|A|$.
[/claim]
[proof]
We argue by induction on $|A|$. If $|A|=1$, then $A=\{e_A\}$ and the only character is the constant character $e_A\mapsto 1$, so $|\widehat A|=1=|A|$.
Assume $|A|>1$ and that the assertion is known for all finite abelian groups of smaller order. Choose $x\in A$ with $x\ne e_A$, let $m\in\mathbb{N}$ be the order of $x$, and let $H:=\langle x\rangle\subset A$. Then $1<m\le |A|$. The cyclic group $H$ has exactly $m$ characters: for each integer $j\in\{0,1,\dots,m-1\}$, define
\begin{align*}
\theta_j:H &\to \mathbb{C}^\times \\
x^k &\mapsto \exp(2\pi i jk/m),
\end{align*}
and these are all characters because a character on $H$ is determined by its value on $x$, which must be an $m$-th root of unity.
Define the restriction map
\begin{align*}
R:\widehat A &\to \widehat H \\
\theta &\mapsto \theta|_H.
\end{align*}
We prove that $R$ is surjective. Let $\theta_0:H\to\mathbb{C}^\times$ be a character. Consider all pairs $(B,\eta)$ where $B$ is a subgroup of $A$ containing $H$ and $\eta:B\to\mathbb{C}^\times$ is a character extending $\theta_0$. Since $A$ is finite, choose such a pair maximal by inclusion. If $B\ne A$, choose $y\in A\setminus B$, let $r\in\mathbb{N}$ be minimal with $y^r\in B$, choose $\lambda\in\mathbb{C}^\times$ with $\lambda^r=\eta(y^r)$, and define on $B':=\langle B,y\rangle$ the map
\begin{align*}
\eta':B' &\to \mathbb{C}^\times \\
b y^k &\mapsto \eta(b)\lambda^k,
\end{align*}
where $b\in B$ and $0\le k<r$. The same uniqueness and multiplication computation as in the preceding separation claim shows that $\eta'$ is a well-defined character extending $\eta$, contradicting maximality. Hence $B=A$, so $\theta_0$ extends to a character of $A$ and $R$ is surjective.
The kernel of $R$ is
\begin{align*}
\ker R = \{\theta\in\widehat A : \theta(h)=1 \text{ for all } h\in H\}.
\end{align*}
Characters in $\ker R$ are exactly the characters that factor through the [quotient group](/page/Quotient%20Group) $A/H$: the bijection is
\begin{align*}
\widehat{A/H} &\to \ker R \\
\rho &\mapsto \bigl(a\mapsto \rho(aH)\bigr).
\end{align*}
The map is well-defined because $H$ is a subgroup of the abelian group $A$, hence $A/H$ is a finite abelian group. It is injective because every coset $aH$ is represented by some $a\in A$, and it is surjective because if $\theta\in\ker R$, then $\rho(aH):=\theta(a)$ is independent of the representative: whenever $aH=bH$, one has $b^{-1}a\in H$, and therefore
\begin{align*}
\theta(a)=\theta(b)\theta(b^{-1}a)=\theta(b).
\end{align*}
Thus $|\ker R|=|\widehat{A/H}|$.
Since $R:\widehat A\to\widehat H$ is a surjective homomorphism of finite groups, the fiber-counting formula for a homomorphism gives
\begin{align*}
|\widehat A|=|\ker R|\,|\widehat H|.
\end{align*}
The group $A/H$ has smaller order than $A$, so the induction hypothesis gives $|\widehat{A/H}|=|A/H|$. Combining this with $|\widehat H|=|H|=m$, we obtain
\begin{align*}
|\widehat A|
=
|\widehat{A/H}|\,|\widehat H|
=
|A/H|\,|H|
=
|A|.
\end{align*}
This completes the induction.
[/proof]
Applying the claim with $A=G$ gives $|\widehat G|=|G|$.
[/step]
[step:Evaluate the character average at the identity]
If $g=e_G$, then every $\psi\in\widehat G$ satisfies $\psi(g)=\psi(e_G)=1$. Since the preceding step proved $|\widehat G|=|G|$, we get
\begin{align*}
\frac{1}{|G|}\sum_{\psi\in\widehat G}\psi(e_G)
=
\frac{1}{|G|}\sum_{\psi\in\widehat G}1
=
\frac{|\widehat G|}{|G|}
=
1.
\end{align*}
[/step]
[step:Force the character average to vanish away from the identity]
Assume $g\ne e_G$. By the separation claim, choose $\psi_0\in\widehat G$ such that $\psi_0(g)\ne 1$. Define the character-sum function
\begin{align*}
S:G &\to \mathbb{C} \\
h &\mapsto \sum_{\psi\in\widehat G}\psi(h).
\end{align*}
Multiplication by $\psi_0$ permutes $\widehat G$, because the map
\begin{align*}
M_{\psi_0}:\widehat G &\to \widehat G \\
\psi &\mapsto \psi_0\psi
\end{align*}
has inverse $M_{\psi_0^{-1}}$. Therefore
\begin{align*}
S(g)
&=
\sum_{\psi\in\widehat G}(\psi_0\psi)(g) \\
&=
\sum_{\psi\in\widehat G}\psi_0(g)\psi(g) \\
&=
\psi_0(g)S(g).
\end{align*}
Since $\psi_0(g)\ne 1$, subtracting the right-hand side gives
\begin{align*}
(1-\psi_0(g))S(g)=0,
\end{align*}
and hence $S(g)=0$. Thus
\begin{align*}
\frac{1}{|G|}\sum_{\psi\in\widehat G}\psi(g)=0
\end{align*}
whenever $g\ne e_G$.
[guided]
Here is the cancellation mechanism in full detail. We know $g\ne e_G$, so the previous step gives a character
\begin{align*}
\psi_0:G\to\mathbb{C}^\times
\end{align*}
with $\psi_0(g)\ne 1$. Define the character-sum function
\begin{align*}
S:G &\to \mathbb{C} \\
h &\mapsto \sum_{\psi\in\widehat G}\psi(h).
\end{align*}
The key idea is to evaluate this function at $g$ and multiply every character in the sum by the fixed character $\psi_0$. This does not change the set being summed over: the map
\begin{align*}
M_{\psi_0}:\widehat G &\to \widehat G \\
\psi &\mapsto \psi_0\psi
\end{align*}
is a bijection, with inverse $\psi\mapsto \psi_0^{-1}\psi$.
Because this relabelling is a bijection, the sum is unchanged:
\begin{align*}
S(g)
=
\sum_{\psi\in\widehat G}(\psi_0\psi)(g).
\end{align*}
Now evaluate the product character at $g$:
\begin{align*}
(\psi_0\psi)(g)=\psi_0(g)\psi(g).
\end{align*}
Since $\psi_0(g)$ is independent of $\psi$, it factors out:
\begin{align*}
S(g)
=
\sum_{\psi\in\widehat G}\psi_0(g)\psi(g)
=
\psi_0(g)\sum_{\psi\in\widehat G}\psi(g)
=
\psi_0(g)S(g).
\end{align*}
Thus
\begin{align*}
(1-\psi_0(g))S(g)=0.
\end{align*}
The factor $1-\psi_0(g)$ is nonzero by construction, so $S(g)=0$. This is the orthogonality cancellation: away from the identity, the character values cancel exactly.
[/guided]
[/step]
[step:Convert the group identity condition back to the congruence condition]
By definition,
\begin{align*}
g=e_G
\iff
\bar n \bar a^{-1}=1
\iff
\bar n=\bar a
\iff
n\equiv a \pmod q.
\end{align*}
Combining this equivalence with the preceding two steps gives
\begin{align*}
\frac{1}{\varphi(q)}\sum_{\chi \bmod q}\overline{\chi(a)}\,\chi(n)
=
\frac{1}{|G|}\sum_{\psi\in\widehat G}\psi(g)
=
\begin{cases}
1, & n \equiv a \pmod q,\\
0, & n \not\equiv a \pmod q.
\end{cases}
\end{align*}
This is exactly the asserted orthogonality relation for Dirichlet characters modulo $q$.
[/step]