[proofplan]
We pass to frozen action-angle variables depending on the slow time $s=\varepsilon t$. The time-dependence of this canonical change of variables contributes an order-$\varepsilon$ angle-dependent perturbation to the Hamiltonian. Since the unperturbed angle velocity is bounded away from zero, we solve a homological equation that removes the oscillatory part of the perturbation from the action equation. The transformed action changes only at order $\varepsilon^2$, so over times of length $O(\varepsilon^{-1})$ it changes by $O(\varepsilon)$; the original action differs from it by another $O(\varepsilon)$.
[/proofplan]
[step:Write the Hamiltonian in slow action-angle variables]
Set $s=\varepsilon t$. For each $s\in[0,C]$, define $\mu(s):=\lambda(s)$. By hypothesis, the map
\begin{align*}
\Phi_{\mu(s)}:J\times\mathbb{T}\to\mathbb{R}^2
\end{align*}
is a symplectic action-angle chart for the frozen Hamiltonian. Thus, along the solution, we write
\begin{align*}
(q_\varepsilon(t),p_\varepsilon(t))=\Phi_{\mu(\varepsilon t)}(I(t),\theta(t)).
\end{align*}
By the exact moving-chart hypothesis, the transformed Hamiltonian is the smooth map
\begin{align*}
\mathcal{H}_\varepsilon:J\times\mathbb{T}\times[0,C]\to\mathbb R
\end{align*}
defined by
\begin{align*}
\mathcal{H}_\varepsilon(I,\theta,s)=K(I,\mu(s))+\varepsilon R(I,\theta,s),
\end{align*}
where $R:J\times\mathbb{T}\times[0,C]\to\mathbb{R}$ is smooth and $2\pi$-periodic in $\theta$. This exactness assumption rules out a non-Hamiltonian flux term from the moving annulus of charts.
Hamilton's equations in these variables are therefore
\begin{align*}
\dot I(t)=-\varepsilon\,\partial_\theta R(I(t),\theta(t),\varepsilon t).
\end{align*}
Also,
\begin{align*}
\dot\theta(t)=\partial_I K(I(t),\mu(\varepsilon t))+\varepsilon\,\partial_I R(I(t),\theta(t),\varepsilon t).
\end{align*}
By assumption, $I(t)\in J_0$ for all $0\le t\le C/\varepsilon$. Choose once and for all a compact interval $J_1$ determined by $J_0$ and $J$, with $J_0\subset J_1^\circ\subset J$. All estimates below are made on $J_1\times\mathbb{T}\times[0,C]$.
[/step]
[step:Solve the homological equation for the oscillatory perturbation]
Define the frozen frequency
\begin{align*}
\omega(I,s):=\partial_I K(I,\mu(s)).
\end{align*}
By hypothesis, $|\omega(I,s)|\ge\omega_0>0$ on $J_1\times[0,C]$.
Define the angle average of $R$ by
\begin{align*}
\overline R(I,s):=\frac{1}{2\pi}\int_0^{2\pi}R(I,\theta,s)\,d\mathcal{L}^1(\theta).
\end{align*}
Since $R-\overline R$ has zero average in $\theta$, define
\begin{align*}
S:J_1\times\mathbb{T}\times[0,C]\to\mathbb{R}
\end{align*}
to be the unique smooth $2\pi$-periodic function with zero angle average satisfying
\begin{align*}
\omega(I,s)\,\partial_\theta S(I,\theta,s)=R(I,\theta,s)-\overline R(I,s).
\end{align*}
This definition is valid because division by $\omega(I,s)$ is uniformly bounded by $\omega_0^{-1}$, and the right-hand side has zero average in $\theta$.
[guided]
The action equation contains the oscillatory term $-\varepsilon\partial_\theta R$. If we integrate this directly over time $0\le t\le C/\varepsilon$, the factor $\varepsilon$ is not enough by itself, since the time interval has length of order $\varepsilon^{-1}$. We need to use the rapid rotation in $\theta$.
The fast rotation is controlled by
\begin{align*}
\omega(I,s):=\partial_I K(I,\mu(s)).
\end{align*}
The non-separatrix hypothesis is encoded in the bound $|\omega(I,s)|\ge\omega_0>0$ on $J_1\times[0,C]$. This is the place where the proof would fail near a separatrix: the period can become unbounded, equivalently the frequency can approach zero, and the division below would no longer be uniformly controlled.
We separate $R$ into its average and oscillatory parts. Define
\begin{align*}
\overline R(I,s):=\frac{1}{2\pi}\int_0^{2\pi}R(I,\theta,s)\,d\mathcal{L}^1(\theta).
\end{align*}
The function $R(I,\theta,s)-\overline R(I,s)$ has zero average over $\theta$. Therefore it has a periodic primitive in $\theta$. Because $\omega(I,s)$ is bounded away from zero, we can define a smooth periodic function
\begin{align*}
S:J_1\times\mathbb{T}\times[0,C]\to\mathbb{R}
\end{align*}
with zero angle average by requiring
\begin{align*}
\omega(I,s)\,\partial_\theta S(I,\theta,s)=R(I,\theta,s)-\overline R(I,s).
\end{align*}
The zero-average normalization removes the additive constant ambiguity. Smoothness follows from the smoothness of $R$, $K$, and $\lambda$, together with the uniform lower bound on $|\omega|$.
[/guided]
[/step]
[step:Introduce an averaged action that differs from the frozen action by order $\varepsilon$]
Define the corrected action function
\begin{align*}
\mathcal J_\varepsilon:[0,C/\varepsilon]\to\mathbb R
\end{align*}
by
\begin{align*}
\mathcal J_\varepsilon(t):=I(t)+\varepsilon\,\partial_\theta S(I(t),\theta(t),\varepsilon t).
\end{align*}
Since $S$ is smooth on the compact set $J_1\times\mathbb{T}\times[0,C]$, the quantity $\partial_\theta S$ is bounded there. Let
\begin{align*}
A_0:=\sup_{(I,\theta,s)\in J_1\times\mathbb{T}\times[0,C]}|\partial_\theta S(I,\theta,s)|.
\end{align*}
Then, whenever $I(t)\in J_1$,
\begin{align*}
|\mathcal J_\varepsilon(t)-I(t)|\le \varepsilon A_0.
\end{align*}
[/step]
[step:Differentiate the corrected action and cancel the order $\varepsilon$ oscillation]
Differentiate $\mathcal J_\varepsilon(t)$ using the chain rule:
\begin{align*}
\dot{\mathcal J}_\varepsilon(t)=\dot I(t)+\varepsilon\,\partial_{I\theta}S(I(t),\theta(t),s)\dot I(t)+\varepsilon\,\partial_{\theta\theta}S(I(t),\theta(t),s)\dot\theta(t)+\varepsilon^2\,\partial_{s\theta}S(I(t),\theta(t),s),
\end{align*}
where $s=\varepsilon t$.
Substitute the equations of motion. The order-$\varepsilon$ part is
\begin{align*}
-\varepsilon\,\partial_\theta R(I,\theta,s)+\varepsilon\,\omega(I,s)\,\partial_{\theta\theta}S(I,\theta,s).
\end{align*}
Differentiate the homological equation in $\theta$. Since $\omega$ is independent of $\theta$,
\begin{align*}
\omega(I,s)\,\partial_{\theta\theta}S(I,\theta,s)=\partial_\theta R(I,\theta,s).
\end{align*}
Thus the order-$\varepsilon$ terms cancel.
All remaining terms contain the factor $\varepsilon^2$ and are smooth functions of $(I,\theta,s)$ on the compact set $J_1\times\mathbb{T}\times[0,C]$. Hence there is a constant $A_1>0$ such that
\begin{align*}
|\dot{\mathcal J}_\varepsilon(t)|\le A_1\varepsilon^2
\end{align*}
for every $t$ for which $I(t)\in J_1$.
[/step]
[step:Integrate the averaged equation over the long time interval]
For $0\le t\le C/\varepsilon$, integrate the previous estimate with respect to one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on the time interval:
\begin{align*}
|\mathcal J_\varepsilon(t)-\mathcal J_\varepsilon(0)|\le \int_0^t A_1\varepsilon^2\,d\mathcal{L}^1(\tau).
\end{align*}
Since $t\le C/\varepsilon$, this gives
\begin{align*}
|\mathcal J_\varepsilon(t)-\mathcal J_\varepsilon(0)|\le A_1C\varepsilon.
\end{align*}
[/step]
[step:Transfer the estimate back to the frozen action]
Using $|I(t)-\mathcal J_\varepsilon(t)|\le A_0\varepsilon$ and $|I(0)-\mathcal J_\varepsilon(0)|\le A_0\varepsilon$, we obtain
\begin{align*}
|I(t)-I(0)|\le |I(t)-\mathcal J_\varepsilon(t)|+|\mathcal J_\varepsilon(t)-\mathcal J_\varepsilon(0)|+|\mathcal J_\varepsilon(0)-I(0)|.
\end{align*}
Therefore
\begin{align*}
|I(t)-I(0)|\le (2A_0+A_1C)\varepsilon.
\end{align*}
Set $M:=2A_0+A_1C$. Since $J_1$ was fixed as a choice determined by $J_0$ and $J$, this constant depends only on the compact action interval $J_0$, the frozen Hamiltonian family, the slow path $\lambda$, the chosen exact action-angle chart, and $C$. Hence
\begin{align*}
I(t)=I(0)+O(\varepsilon)
\end{align*}
uniformly for $0\le t\le C/\varepsilon$. This is the claimed adiabatic invariance of the action.
[/step]