[proofplan]
We first translate the interval of summation to $\{1,\dots,N\}$; this only multiplies each exponential sum by a complex number of modulus $1$. We then prove the dual estimate for sums indexed by $\Theta$: the desired inequality is an operator-norm bound for a linear map $T$, and $T$ has the same operator norm as its adjoint. The dual estimate is obtained by expanding the square: the diagonal contribution is $N\sum_{\theta\in\Theta} |c_\theta|^2$, and the full off-diagonal [bilinear form](/page/Bilinear%20Form) is controlled by the Montgomery--Vaughan Hilbert inequality for $\Delta$-spaced points on $\mathbb{R}/\mathbb{Z}$.
[/proofplan]
[step:Translate the interval to $\{1,\dots,N\}$]
For each integer $j$ with $1 \le j \le N$, define $b_j := a_{M+j} \in \mathbb{C}$. Then, for every $\theta \in \Theta$,
\begin{align*}
\sum_{M < n \le M+N} a_n e(n\theta) = e(M\theta)\sum_{j=1}^{N} b_j e(j\theta).
\end{align*}
Since $|e(M\theta)| = 1$, the left-hand side in the theorem is equal to
\begin{align*}
\sum_{\theta \in \Theta}\left|\sum_{j=1}^{N} b_j e(j\theta)\right|^2.
\end{align*}
Also
\begin{align*}
\sum_{M < n \le M+N}|a_n|^2 = \sum_{j=1}^{N}|b_j|^2.
\end{align*}
It is therefore enough to prove the theorem in the case $M=0$, with the summation range $1 \le n \le N$.
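As a sanity check of the translation, consider the case $M=3$, $N=2$ (values chosen purely for illustration), so that $b_1=a_4$ and $b_2=a_5$. Using $e((M+j)\theta)=e(M\theta)\,e(j\theta)$,
\begin{align*}
a_4 e(4\theta) + a_5 e(5\theta) = e(3\theta)\bigl(a_4 e(\theta) + a_5 e(2\theta)\bigr) = e(3\theta)\bigl(b_1 e(\theta) + b_2 e(2\theta)\bigr),
\end{align*}
and taking absolute values removes the factor $e(3\theta)$.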
[/step]
[step:Pass to the finite-dimensional dual inequality]
Define the [linear map](/page/Linear%20Map)
\begin{align*}
T: \mathbb{C}^{N} &\to \mathbb{C}^{\Theta}
\end{align*}
by
\begin{align*}
(Tb)(\theta) := \sum_{n=1}^{N} b_n e(n\theta)
\end{align*}
for $b=(b_1,\dots,b_N)\in \mathbb{C}^N$ and $\theta \in \Theta$. Equip $\mathbb{C}^{N}$ with the norm
\begin{align*}
\|b\|_{\ell^2_N}^2 := \sum_{n=1}^{N}|b_n|^2
\end{align*}
and equip $\mathbb{C}^{\Theta}$ with the norm
\begin{align*}
\|c\|_{\ell^2(\Theta)}^2 := \sum_{\theta \in \Theta}|c_\theta|^2.
\end{align*}
It suffices to prove that, for every family $c=(c_\theta)_{\theta \in \Theta} \in \mathbb{C}^{\Theta}$,
\begin{align*}
\sum_{n=1}^{N}\left|\sum_{\theta \in \Theta} c_\theta e(n\theta)\right|^2 \le \left(N-1+\Delta^{-1}\right)\sum_{\theta \in \Theta}|c_\theta|^2.
\end{align*}
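Indeed, the adjoint of $T$ is given by $(T^*c)_n = \sum_{\theta \in \Theta} c_\theta e(-n\theta)$, so the left-hand side above equals $\|T^*\bar c\|_{\ell^2_N}^2$ with $\bar c := (\overline{c_\theta})_{\theta \in \Theta}$, and $\|\bar c\|_{\ell^2(\Theta)} = \|c\|_{\ell^2(\Theta)}$. The displayed estimate for all $c$ is therefore the bound $\|T^*\|^2 \le N-1+\Delta^{-1}$. Since $T$ is a linear map between finite-dimensional Hilbert spaces, $T$ and $T^*$ have the same operator norm. Hence the dual estimate implies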
\begin{align*}
\sum_{\theta \in \Theta}\left|\sum_{n=1}^{N} b_n e(n\theta)\right|^2 \le \left(N-1+\Delta^{-1}\right)\sum_{n=1}^{N}|b_n|^2.
\end{align*}
[guided]
The desired inequality says precisely that the operator $T$ has squared operator norm at most $N-1+\Delta^{-1}$. Instead of estimating $T$ directly, we estimate its adjoint. This is legitimate because, in finite-dimensional Hilbert spaces, the operator norm of a linear map is equal to the operator norm of its adjoint.
Let
\begin{align*}
T: \mathbb{C}^{N} &\to \mathbb{C}^{\Theta}
\end{align*}
be defined by
\begin{align*}
(Tb)(\theta) := \sum_{n=1}^{N} b_n e(n\theta).
\end{align*}
The norm on the domain is
\begin{align*}
\|b\|_{\ell^2_N}^2 := \sum_{n=1}^{N}|b_n|^2,
\end{align*}
and the norm on the target is
\begin{align*}
\|c\|_{\ell^2(\Theta)}^2 := \sum_{\theta \in \Theta}|c_\theta|^2.
\end{align*}
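The adjoint operator is represented by the conjugate transpose matrix: $(T^*c)_n = \sum_{\theta \in \Theta} c_\theta e(-n\theta)$. Replacing $c$ by its entrywise conjugate $\bar c$, which has the same norm, turns $|(T^*\bar c)_n|$ into $\left|\sum_{\theta \in \Theta} c_\theta e(n\theta)\right|$. Thus it is enough to prove the dual estimate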
\begin{align*}
\sum_{n=1}^{N}\left|\sum_{\theta \in \Theta} c_\theta e(n\theta)\right|^2 \le \left(N-1+\Delta^{-1}\right)\sum_{\theta \in \Theta}|c_\theta|^2.
\end{align*}
Once this estimate is known for all $c \in \mathbb{C}^{\Theta}$, it gives $\|T^*\|^2 \le N-1+\Delta^{-1}$. Therefore $\|T\|^2 \le N-1+\Delta^{-1}$, which is exactly the desired inequality for the coefficients $b_1,\dots,b_N$.
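The adjoint formula and the passage from $T^*$ back to $T$ can be checked by hand. The following computation is a sketch, assuming the inner products $\langle u,v\rangle = \sum u\,\overline{v}$ on both $\mathbb{C}^N$ and $\mathbb{C}^\Theta$ (a convention not fixed explicitly above):
\begin{align*}
\langle Tb, c\rangle &= \sum_{\theta \in \Theta}\Bigl(\sum_{n=1}^{N} b_n e(n\theta)\Bigr)\overline{c_\theta} = \sum_{n=1}^{N} b_n\,\overline{\sum_{\theta \in \Theta} c_\theta e(-n\theta)} = \langle b, T^*c\rangle, \\
\|Tb\|^2 &= \langle b, T^*Tb\rangle \le \|b\|\,\|T^*(Tb)\| \le \|b\|\,\|T^*\|\,\|Tb\|.
\end{align*}
If $Tb \ne 0$, dividing by $\|Tb\|$ gives $\|Tb\| \le \|T^*\|\,\|b\|$; if $Tb = 0$ this bound is trivial.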
[/guided]
[/step]
[step:Expand the dual square and separate the diagonal]
Fix $c=(c_\theta)_{\theta \in \Theta} \in \mathbb{C}^{\Theta}$. Expanding the square and interchanging the two finite sums gives
\begin{align*}
\sum_{n=1}^{N}\left|\sum_{\theta \in \Theta} c_\theta e(n\theta)\right|^2 = \sum_{\theta,\theta' \in \Theta} c_\theta \overline{c_{\theta'}}\sum_{n=1}^{N} e(n(\theta-\theta')).
\end{align*}
For $\theta=\theta'$, the inner sum equals $N$. Hence the diagonal contribution is
\begin{align*}
N\sum_{\theta \in \Theta}|c_\theta|^2.
\end{align*}
The off-diagonal contribution is therefore
\begin{align*}
\mathcal{O} := \sum_{\theta,\theta' \in \Theta,\ \theta \ne \theta'} c_\theta \overline{c_{\theta'}}\sum_{n=1}^{N} e(n(\theta-\theta')).
\end{align*}
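For distinct $\theta,\theta'$ the inner sum is a geometric series and can be evaluated in closed form. Write $\alpha := \theta-\theta'$ (a symbol introduced only for this computation) and fix any real representative of $\alpha$; since $\|\alpha\|_{\mathbb{R}/\mathbb{Z}} \ge \Delta > 0$, we have $e(\alpha) \ne 1$, and
\begin{align*}
\sum_{n=1}^{N} e(n\alpha) = e(\alpha)\,\frac{e(N\alpha)-1}{e(\alpha)-1} = e\!\left(\tfrac{(N+1)\alpha}{2}\right)\frac{\sin(\pi N\alpha)}{\sin(\pi\alpha)}.
\end{align*}
In particular, using $|\sin(\pi\alpha)| \ge 2\|\alpha\|_{\mathbb{R}/\mathbb{Z}}$,
\begin{align*}
\left|\sum_{n=1}^{N} e(n\alpha)\right| \le \frac{1}{|\sin(\pi\alpha)|} \le \frac{1}{2\Delta},
\end{align*}
so every off-diagonal inner sum is bounded uniformly in $N$. This entrywise bound alone does not give the estimate for $\mathcal{O}$, but it shows where the cosecant kernel of Hilbert-type inequalities enters.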
[/step]
[step:Control the off-diagonal bilinear form by Hilbert's inequality]
We use the Montgomery--Vaughan Hilbert inequality for additive characters on the circle as an external prerequisite: if $\Theta \subset \mathbb{R}/\mathbb{Z}$ is finite and $\Delta$-spaced, then for every $c=(c_\theta)_{\theta \in \Theta} \in \mathbb{C}^{\Theta}$ and every $N \in \mathbb{N}$,
\begin{align*}
\left|\sum_{\theta,\theta' \in \Theta,\ \theta \ne \theta'} c_\theta \overline{c_{\theta'}}\sum_{n=1}^{N} e(n(\theta-\theta'))\right| \le \left(\Delta^{-1}-1\right)\sum_{\theta \in \Theta}|c_\theta|^2.
\end{align*}
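The spacing hypothesis needed in this inequality is exactly the hypothesis $\|\theta-\theta'\|_{\mathbb{R}/\mathbb{Z}} \ge \Delta$ for distinct $\theta,\theta'\in\Theta$. The form $\mathcal{O}$ is real, being the difference of the real left-hand side of the expansion and the real diagonal contribution, so $\mathcal{O} \le |\mathcal{O}|$. Applying the inequality to $\mathcal{O}$ therefore gives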
\begin{align*}
\mathcal{O} \le \left(\Delta^{-1}-1\right)\sum_{\theta \in \Theta}|c_\theta|^2.
\end{align*}
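The classical Montgomery--Vaughan statement is phrased with the kernel $1/\sin(\pi(\theta-\theta'))$ and the constant $\Delta^{-1}$; the form quoted above, with the geometric-sum kernel and the constant $\Delta^{-1}-1$, is the sharpened version that this proof takes as its input.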
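As a consistency check in one special case (not a proof), take $\Theta = \{0, \tfrac12\}$, so that $\Delta = \tfrac12$ and $\Delta^{-1}-1 = 1$. Since $e(\pm n/2) = (-1)^n$, setting $s_N := \sum_{n=1}^{N}(-1)^n \in \{-1,0\}$ gives
\begin{align*}
\mathcal{O} = \bigl(c_0\overline{c_{1/2}} + c_{1/2}\overline{c_0}\bigr)\, s_N = 2 s_N \operatorname{Re}\bigl(c_0\overline{c_{1/2}}\bigr),
\qquad
|\mathcal{O}| \le 2|c_0|\,|c_{1/2}| \le |c_0|^2 + |c_{1/2}|^2.
\end{align*}
This agrees with the stated bound $(\Delta^{-1}-1)\sum_{\theta}|c_\theta|^2$, with equality when $N$ is odd and $c_0 = c_{1/2}$.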
[/step]
[step:Combine the diagonal and off-diagonal estimates]
Combining the expansion with the diagonal computation and the off-diagonal estimate yields
\begin{align*}
\sum_{n=1}^{N}\left|\sum_{\theta \in \Theta} c_\theta e(n\theta)\right|^2 \le N\sum_{\theta \in \Theta}|c_\theta|^2 + \left(\Delta^{-1}-1\right)\sum_{\theta \in \Theta}|c_\theta|^2.
\end{align*}
Therefore
\begin{align*}
\sum_{n=1}^{N}\left|\sum_{\theta \in \Theta} c_\theta e(n\theta)\right|^2 \le \left(N-1+\Delta^{-1}\right)\sum_{\theta \in \Theta}|c_\theta|^2.
\end{align*}
This proves the dual estimate. By the finite-dimensional duality argument above, the corresponding estimate for $T$ follows. Finally, the initial translation from $M<n\le M+N$ to $1\le n\le N$ transfers the bound back to the original coefficients $(a_n)_{M<n\le M+N}$. This proves the additive large sieve inequality.
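A simple example suggests that the constant $N-1+\Delta^{-1}$ cannot be lowered in general. Take $\Theta = \{k/R : 0 \le k < R\}$ for an integer $R \ge 1$ (a parameter introduced only for this illustration), so that $\Delta = 1/R$, and take $M=0$, $N=1$. Then
\begin{align*}
\sum_{\theta \in \Theta}\left|a_1 e(\theta)\right|^2 = R\,|a_1|^2 = \left(N-1+\Delta^{-1}\right)|a_1|^2,
\end{align*}
so the inequality holds with equality in this case.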
[/step]