[proofplan]
We apply Weyl differencing $k-1$ times, reducing the degree-$k$ polynomial phase to linear phases whose coefficients are controlled by the leading coefficient $\alpha_k$. The iterated finite difference has leading linear coefficient $k!\alpha_k h_1\cdots h_{k-1}$, so after grouping difference tuples by their product the divisor bound reduces the problem to estimating sums of $\min\{X,\|k!h\alpha_k\|_{\mathbb R/\mathbb Z}^{-1}\}$. The rational approximation $\alpha_k\approx a/q$ gives the standard Vinogradov summation estimate for these quantities. Substituting that estimate into the differenced inequality and taking the $2^{k-1}$-st root gives the claimed bound.
[/proofplan]
[step:Apply Weyl differencing until only linear phases remain]
Set $N:=\lfloor X\rfloor$ and define
\begin{align*}
S:=\sum_{1\le n\le N}e(P(n)).
\end{align*}
For $h\in\mathbb Z$ and a function $Q:\mathbb Z\to\mathbb R$, define the finite difference
\begin{align*}
\delta_h Q:\mathbb Z\to\mathbb R,\qquad (\delta_hQ)(n):=Q(n+h)-Q(n).
\end{align*}
Iterating the [[Weyl Differencing Lemma](/theorems/9052)][citetheorem:9052] $k-1$ times, with all intervals produced by the successive restrictions retained explicitly, gives
\begin{align*}
|S|^{2^{k-1}}\lesssim_k X^{2^{k-1}-k}\sum_{\substack{h_1, \dots, h_{k-1}\in\mathbb Z, |h_i|<X}}\left|\sum_{n\in I(h_1,\dots,h_{k-1})}e((\delta_{h_{k-1}}\cdots\delta_{h_1}P)(n))\right|,
\end{align*}
where, for each tuple $(h_1,\dots,h_{k-1})$, the set $I(h_1,\dots,h_{k-1})\subset\mathbb Z$ is an integer interval of cardinality at most $N$.
Define the product map
\begin{align*}
H:\mathbb Z^{k-1}\to\mathbb Z,\qquad (h_1,\dots,h_{k-1})\mapsto h_1\cdots h_{k-1}
\end{align*}.
By the [[Leading Coefficient of an Iterated Finite Difference](/theorems/9053)][citetheorem:9053], whenever $H(h_1,\dots,h_{k-1})\ne 0$, the polynomial
\begin{align*}
n\mapsto(\delta_{h_{k-1}}\cdots\delta_{h_1}P)(n)
\end{align*}
has degree $1$ and leading coefficient
\begin{align*}
k!\alpha_k H(h_1,\dots,h_{k-1}).
\end{align*}
[guided]
Set $N:=\lfloor X\rfloor$ and write the Weyl sum as
\begin{align*}
S:=\sum_{1\le n\le N}e(P(n)).
\end{align*}
The purpose of Weyl differencing is to trade one power of $|S|$ for a family of sums whose phases are finite differences of $P$. Since each finite difference lowers the degree of a polynomial by one when the difference parameter is nonzero, applying the process $k-1$ times turns a degree-$k$ phase into a linear phase.
For $h\in\mathbb Z$ and a function $Q:\mathbb Z\to\mathbb R$, define
\begin{align*}
\delta_h Q:\mathbb Z\to\mathbb R,\qquad (\delta_hQ)(n):=Q(n+h)-Q(n).
\end{align*}
We apply the [Weyl Differencing Lemma][citetheorem:9052] successively. At each stage the summation interval is restricted so that both shifted arguments remain in the preceding interval; hence after $k-1$ applications the inner summation is over an integer interval $I(h_1,\dots,h_{k-1})\subset\mathbb Z$ with at most $N$ elements. The iteration gives
\begin{align*}
|S|^{2^{k-1}}\lesssim_k X^{2^{k-1}-k}\sum_{\substack{h_1, \dots, h_{k-1}\in\mathbb Z, |h_i|<X}}\left|\sum_{n\in I(h_1,\dots,h_{k-1})}e((\delta_{h_{k-1}}\cdots\delta_{h_1}P)(n))\right|.
\end{align*}
The factor $X^{2^{k-1}-k}$ is the accumulated normalization from the $k-1$ differencing steps.
Now define the product map $H:\mathbb Z^{k-1}\to\mathbb Z$ by
\begin{align*}
H(h_1,\dots,h_{k-1}):=h_1\cdots h_{k-1}.
\end{align*}
When this product is nonzero, each difference step genuinely lowers the degree by one. The [Leading Coefficient of an Iterated Finite Difference][citetheorem:9053] applies to the polynomial $P$ over the characteristic-zero field $\mathbb R$ and gives that
\begin{align*}
n\mapsto(\delta_{h_{k-1}}\cdots\delta_{h_1}P)(n)
\end{align*}
is linear with leading coefficient
\begin{align*}
k!\alpha_k H(h_1,\dots,h_{k-1}).
\end{align*}
This is the key structural point: all lower coefficients of $P$ affect only the constant term of the final linear phase, while the oscillation of the final sum is governed by $k!\alpha_k h_1\cdots h_{k-1}$.
[/guided]
[/step]
[step:Bound the resulting linear sums by distance to the nearest integer]
Let $D:=k!$. For $t\in\mathbb R$, define the distance-to-integer map
\begin{align*}
\|\cdot\|_{\mathbb R/\mathbb Z}:\mathbb R\to\left[0,\tfrac12\right],\qquad t\mapsto \operatorname{dist}(t,\mathbb Z)=\inf_{m\in\mathbb Z}|t-m|
\end{align*}.
If $H(h_1,\dots,h_{k-1})=0$, the corresponding inner sum has absolute value at most $N\le X$. There are $\lesssim_k X^{k-2}$ such tuples, so the total contribution of these tuples is $\lesssim_k X^{k-1}$.
Assume now that $H(h_1,\dots,h_{k-1})\ne 0$. There exists $c(h_1,\dots,h_{k-1})\in\mathbb R$ such that, for all $n\in\mathbb Z$,
\begin{align*}
(\delta_{h_{k-1}}\cdots\delta_{h_1}P)(n)=D\alpha_kH(h_1,\dots,h_{k-1})n+c(h_1,\dots,h_{k-1}).
\end{align*}
For every $\theta\in\mathbb R$, every $\gamma\in\mathbb R$, and every integer interval $J\subset\mathbb Z$ with $\#J\le X$, the finite geometric-series identity gives
\begin{align*}
\left|\sum_{n\in J}e(\theta n+\gamma)\right|\le \min\left\{X,\frac{1}{2\|\theta\|_{\mathbb R/\mathbb Z}}\right\},
\end{align*}
with the convention that the second term is $+\infty$ when $\theta\in\mathbb Z$.
Using this with $\theta=D\alpha_kH(h_1,\dots,h_{k-1})$, we obtain
\begin{align*}
|S|^{2^{k-1}}\lesssim_k X^{2^{k-1}-k}\left(X^{k-1}+\sum_{1\le |r|\le X^{k-1}}d_{k-1}(|r|)\min\left\{X,\|D r\alpha_k\|_{\mathbb R/\mathbb Z}^{-1}\right\}\right),
\end{align*}
where the $(k-1)$-fold divisor function is the map
\begin{align*}
d_{k-1}:\mathbb N\to\mathbb N,\qquad m\mapsto \#\{(u_1,\dots,u_{k-1})\in\mathbb N^{k-1}:u_1\cdots u_{k-1}=m\}
\end{align*}.
We use the elementary divisor bound: for each $\eta>0$ there is a constant $C_{k,\eta}>0$ such that
\begin{align*}
d_{k-1}(m)\le C_{k,\eta}m^\eta
\end{align*}
for every $m\in\mathbb N$. Indeed, this follows by induction on $k-1$ from the estimate $\#\{d\in\mathbb N:d\mid m\}\le 2m^{\eta/2}$ for all sufficiently large $m$, with the finitely many smaller values absorbed into $C_{k,\eta}$. Since $1\le r\le X^{k-1}$ implies $r^\eta\le X^{(k-1)\eta}$, choosing $\eta>0$ so that $(k-1)\eta\le\varepsilon$ absorbs the divisor factor into $X^\varepsilon$. Hence
\begin{align*}
|S|^{2^{k-1}}\lesssim_{k,\varepsilon} X^{2^{k-1}-k+\varepsilon}\left(X^{k-1}+\sum_{1\le r\le X^{k-1}}\min\left\{X,\|D r\alpha_k\|_{\mathbb R/\mathbb Z}^{-1}\right\}\right).
\end{align*}
[/step]
[step:Estimate the averaged reciprocal distances using the rational approximation]
We use the following Vinogradov rational-approximation summation estimate with a bounded approximation constant: if $Y\ge 1$, $R\ge 1$, $A\ge 1$, $\theta\in\mathbb R$, and there are coprime integers $b$ and $Q\ge 1$ with
\begin{align*}
\left|\theta-\frac{b}{Q}\right|\le A Q^{-2},
\end{align*}
then
\begin{align*}
\sum_{1\le r\le R}\min\{Y,\|r\theta\|_{\mathbb R/\mathbb Z}^{-1}\}\lesssim_{A,\eta} (RYQ^{-1}+R+Q)Y^\eta
\end{align*}
for every $\eta>0$. This is the standard block decomposition form of Vinogradov's summation estimate; the bounded constant $A$ is part of the hypothesis and is carried into the implicit constant.
Apply the lemma to $\theta=D\alpha_k$. Reduce $Da/q$ to lowest terms, writing $Da/q=b/Q$ with $\gcd(b,Q)=1$. Then $Q=q/g$ for some divisor $g$ of $D$, so $q/D\le Q\le q$. Moreover
\begin{align*}
\left|D\alpha_k-\frac{b}{Q}\right|=D\left|\alpha_k-\frac{a}{q}\right|\le Dq^{-2}\le DQ^{-2}.
\end{align*}
Thus the lemma applies with $A=D$, and all constants depend on $D=k!$, hence only on $k$.
With $R:=\lfloor X^{k-1}\rfloor$ and $Y:=X$, the lemma gives
\begin{align*}
\sum_{1\le r\le X^{k-1}}\min\left\{X,\|D r\alpha_k\|_{\mathbb R/\mathbb Z}^{-1}\right\}\lesssim_{k,\varepsilon}X^\varepsilon\left(X^kq^{-1}+X^{k-1}+q\right).
\end{align*}
Since
\begin{align*}
X^kq^{-1}+X^{k-1}+q=X^k(q^{-1}+X^{-1}+qX^{-k}),
\end{align*}
and since $q\le X^k$ implies $X^{k-1}\le X^k(q^{-1}+X^{-1}+qX^{-k})$, we conclude that
\begin{align*}
X^{k-1}+\sum_{1\le r\le X^{k-1}}\min\left\{X,\|D r\alpha_k\|_{\mathbb R/\mathbb Z}^{-1}\right\}\lesssim_{k,\varepsilon}X^{k+\varepsilon}(q^{-1}+X^{-1}+qX^{-k}).
\end{align*}
[guided]
The point of this step is to convert the Diophantine information on $\alpha_k$ into an average bound for the reciprocal distances $\|Dr\alpha_k\|_{\mathbb R/\mathbb Z}^{-1}$. The correct summation input must allow a bounded constant in the rational approximation: after multiplying $\alpha_k$ by $D=k!$, the approximation error is multiplied by $D$.
We use the Vinogradov rational-approximation summation lemma in the following form. If $Y\ge 1$, $R\ge 1$, $A\ge 1$, $\theta\in\mathbb R$, and $\theta$ has a reduced rational approximation $b/Q$ satisfying
\begin{align*}
\left|\theta-\frac{b}{Q}\right|\le A Q^{-2},
\end{align*}
then, for every $\eta>0$,
\begin{align*}
\sum_{1\le r\le R}\min\{Y,\|r\theta\|_{\mathbb R/\mathbb Z}^{-1}\}\lesssim_{A,\eta}(RYQ^{-1}+R+Q)Y^\eta.
\end{align*}
This estimate is proved by grouping $r$ into blocks of length $Q$ and using the spacing of the residues $rb\bmod Q$; the constant depends on $A$ because the approximation is allowed to be within $AQ^{-2}$ rather than exactly $Q^{-2}$.
Now set $\theta:=D\alpha_k$. Reduce $Da/q$ to lowest terms and write $Da/q=b/Q$ with $\gcd(b,Q)=1$. Since the reduction cancels a divisor of $D$, one has $Q=q/g$ for some $g\mid D$, hence $q/D\le Q\le q$. The approximation hypothesis gives
\begin{align*}
\left|D\alpha_k-\frac{b}{Q}\right|=D\left|\alpha_k-\frac{a}{q}\right|\le Dq^{-2}\le DQ^{-2}.
\end{align*}
Therefore the lemma applies with $A=D$, and its implicit constant depends only on $k$ and $\eta$.
Taking $R:=\lfloor X^{k-1}\rfloor$ and $Y:=X$, and using $Q\asymp_k q$, gives
\begin{align*}
\sum_{1\le r\le X^{k-1}}\min\left\{X,\|D r\alpha_k\|_{\mathbb R/\mathbb Z}^{-1}\right\}\lesssim_{k,\varepsilon}X^\varepsilon\left(X^kq^{-1}+X^{k-1}+q\right).
\end{align*}
Finally,
\begin{align*}
X^kq^{-1}+X^{k-1}+q=X^k(q^{-1}+X^{-1}+qX^{-k}).
\end{align*}
The extra $X^{k-1}$ term already present outside the sum is also bounded by $X^k(q^{-1}+X^{-1}+qX^{-k})$, because the middle summand is exactly $X^{k-1}$. Hence
\begin{align*}
X^{k-1}+\sum_{1\le r\le X^{k-1}}\min\left\{X,\|D r\alpha_k\|_{\mathbb R/\mathbb Z}^{-1}\right\}\lesssim_{k,\varepsilon}X^{k+\varepsilon}(q^{-1}+X^{-1}+qX^{-k}).
\end{align*}
[/guided]
[/step]
[step:Insert the summation estimate and take the final root]
Substituting the preceding estimate into the differenced inequality yields
\begin{align*}
|S|^{2^{k-1}}\lesssim_{k,\varepsilon}X^{2^{k-1}+\varepsilon}(q^{-1}+X^{-1}+qX^{-k}).
\end{align*}
Taking the $2^{k-1}$-st root gives
\begin{align*}
|S|\lesssim_{k,\varepsilon}X^{1+\varepsilon}\left(q^{-1}+X^{-1}+qX^{-k}\right)^{2^{1-k}},
\end{align*}
after replacing $\varepsilon$ by a smaller positive constant depending only on $k$. Since $S=\sum_{\substack{n\in\mathbb Z, 1\le n\le X}}e(P(n))$, this is exactly the desired estimate.
[guided]
The previous steps give the powered estimate
\begin{align*}
|S|^{2^{k-1}}\lesssim_{k,\varepsilon}X^{2^{k-1}+\varepsilon}(q^{-1}+X^{-1}+qX^{-k}).
\end{align*}
Taking the $2^{k-1}$-st root is legitimate because both sides are nonnegative [real numbers](/page/Real%20Numbers). This gives
\begin{align*}
|S|\lesssim_{k,\varepsilon}X^{1+\varepsilon/2^{k-1}}\left(q^{-1}+X^{-1}+qX^{-k}\right)^{2^{1-k}}.
\end{align*}
Since the theorem asks for the estimate for every positive $\varepsilon$, we replace the earlier exponent parameter by $2^{k-1}\varepsilon$ and obtain
\begin{align*}
|S|\lesssim_{k,\varepsilon}X^{1+\varepsilon}\left(q^{-1}+X^{-1}+qX^{-k}\right)^{2^{1-k}}.
\end{align*}
By definition, $N=\lfloor X\rfloor$, so the integers $n$ with $1\le n\le X$ are exactly $1,\dots,N$. Therefore
\begin{align*}
S=\sum_{\substack{n\in\mathbb Z, 1\le n\le X}}e(P(n)),
\end{align*}
and the displayed bound is precisely the claimed estimate.
[/guided]
[/step]