[proofplan]
The proof reduces the rank statistic to a finite-population sampling problem. Under the continuous null hypothesis, the ranks occupied by the $X$-sample form a uniformly chosen subset of size $m_N$ of $\{1,\dots,N\}$. After centering the deterministic scores, the statistic is therefore the sum of $m_N$ elements sampled by simple random sampling without replacement from a finite population with mean zero and variance $s_N^2$. The stated maximum-score assumption is exactly the Lindeberg-type hypothesis in the Hájek finite-population [central limit theorem](/theorems/521) for simple random sampling without replacement, which gives convergence in distribution to the standard normal distribution once the sum is divided by the square root of the finite-population variance factor $\frac{m_Nn_N}{N-1}s_N^2$.
[/proofplan]
[step:Show that the $X$-sample ranks form a uniform subset]
For each $N$, let $R_{N,i}$ denote the rank of $X_{N,i}$ among the $N$ pooled observations, and define the random subset $S_N=\{R_{N,1},\dots,R_{N,m_N}\}$ of $\{1,\dots,N\}$. Because the common distribution under the null is continuous, ties occur with probability zero. The $N$ pooled observations are independent and identically distributed, hence exchangeable, so on the no-tie event each of the $N!$ strict orderings has probability $1/N!$. Exactly $m_N!\,n_N!$ of these orderings place the $X$-observations in the positions of a given subset of size $m_N$. Hence, for every subset $A\subset \{1,\dots,N\}$ with $|A|=m_N$,
\begin{align*}
\mathbb{P}(S_N=A)=\frac{m_N!n_N!}{N!}=\binom{N}{m_N}^{-1}.
\end{align*}
Thus $S_N$ is a uniformly chosen subset of $\{1,\dots,N\}$ of cardinality $m_N$.
[guided]
For each $N$, let $R_{N,i}$ denote the rank of $X_{N,i}$ among the $N$ pooled observations, and define $S_N=\{R_{N,1},\dots,R_{N,m_N}\}$ as the subset of $\{1,\dots,N\}$ occupied by the observations from the first sample. The continuity assumption is used here and only here: since the common distribution is continuous, the probability that two pooled observations are equal is zero. Therefore the pooled observations have a strict ordering with probability one.
The $N$ pooled observations are independent and identically distributed, hence exchangeable: their joint distribution is invariant under every permutation of the labels. Consequently, on the no-tie event, every assignment of the $N$ labels to the ordered positions has the same probability. Since there are $N!$ such assignments, each has probability $1/N!$.
Now fix a subset $A\subset \{1,\dots,N\}$ with $|A|=m_N$. The event $S_N=A$ says exactly that the $m_N$ observations labelled $X$ occupy the ordered positions in $A$, while the $n_N$ observations labelled $Y$ occupy the complement $\{1,\dots,N\}\setminus A$. Once the positions in $A$ are fixed, the $X$ labels may be assigned to those positions in $m_N!$ ways, and the $Y$ labels may be assigned to the remaining positions in $n_N!$ ways. Hence
\begin{align*}
\mathbb{P}(S_N=A)
=
\frac{m_N!n_N!}{N!}
=
\binom{N}{m_N}^{-1}.
\end{align*}
This probability is the same for every subset $A$ of size $m_N$, so $S_N$ is a uniformly chosen subset of $\{1,\dots,N\}$ with cardinality $m_N$.
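As a small sanity check of the counting argument, take the illustrative values $N=4$ and $m_N=n_N=2$ (chosen here only as an example). There are $4!=24$ strict orderings, and for a fixed two-element subset, say $A=\{1,4\}$, exactly $2!\,2!=4$ of them place the two $X$-observations in positions $1$ and $4$. Hence
\begin{align*}
\mathbb{P}(S_4=\{1,4\})
=
\frac{2!\,2!}{4!}
=
\frac{4}{24}
=
\frac{1}{6}
=
\binom{4}{2}^{-1},
\end{align*}
and the six subsets of size two each receive probability $1/6$, so the probabilities sum to one.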
[/guided]
[/step]
[step:Rewrite the centered rank statistic as a sampled finite-population sum]
Define the centered score population $b_N:\{1,\dots,N\}\to \mathbb{R}$ by
\begin{align*}
b_N(r)=a_N(r)-\bar a_N.
\end{align*}
Define the two-sample linear rank statistic $L_N^{(a)}:\mathbb{R}^{N}\to\mathbb{R}$ on no-tie pooled samples by
\begin{align*}
L_N^{(a)}(Z_N)=\sum_{i=1}^{m_N} a_N(R_{N,i}),
\end{align*}
where $R_{N,i}$ is the rank of $X_{N,i}$ among the pooled observations.
Then
\begin{align*}
\frac{1}{N}\sum_{r=1}^{N} b_N(r)=0,
\qquad
\frac{1}{N}\sum_{r=1}^{N} b_N(r)^2=s_N^2.
\end{align*}
First subtract $\bar a_N$ from each of the $m_N$ selected scores:
\begin{align*}
L_N^{(a)}(Z_N)-m_N\bar a_N
=
\sum_{i=1}^{m_N}\bigl(a_N(R_{N,i})-\bar a_N\bigr).
\end{align*}
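On the no-tie event the ranks $R_{N,1},\dots,R_{N,m_N}$ are distinct, so $i\mapsto R_{N,i}$ is a bijection from $\{1,\dots,m_N\}$ onto $S_N$. Since $b_N(r)=a_N(r)-\bar a_N$ for each $r\in\{1,\dots,N\}$, the sum over $i$ can be rewritten as a sum over $S_N$: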
\begin{align*}
L_N^{(a)}(Z_N)-m_N\bar a_N
=
\sum_{r\in S_N} b_N(r).
\end{align*}
Thus the centered rank statistic is the sum of $m_N$ values sampled without replacement from the finite population $\{b_N(1),\dots,b_N(N)\}$.
[guided]
The purpose of this step is to remove the rank-statistic notation and replace it with a finite-population sum. Define $b_N:\{1,\dots,N\}\to\mathbb{R}$ by $b_N(r)=a_N(r)-\bar a_N$. By the definition of $\bar a_N$, this population has mean zero, and by the definition of $s_N^2$, it has variance $s_N^2$:
\begin{align*}
\frac{1}{N}\sum_{r=1}^{N} b_N(r)=0,
\qquad
\frac{1}{N}\sum_{r=1}^{N} b_N(r)^2=s_N^2.
\end{align*}
We also define the statistic explicitly. On the no-tie event, let $L_N^{(a)}:\mathbb{R}^{N}\to\mathbb{R}$ be given by
\begin{align*}
L_N^{(a)}(Z_N)=\sum_{i=1}^{m_N}a_N(R_{N,i}),
\end{align*}
where $R_{N,i}$ is the rank of $X_{N,i}$ among the pooled observations. Subtracting $m_N\bar a_N$ subtracts the same centering constant from each selected score:
\begin{align*}
L_N^{(a)}(Z_N)-m_N\bar a_N
=
\sum_{i=1}^{m_N}\bigl(a_N(R_{N,i})-\bar a_N\bigr).
\end{align*}
Since $S_N$ is exactly the set of selected ranks and $b_N(r)=a_N(r)-\bar a_N$, the last display is the same as
\begin{align*}
L_N^{(a)}(Z_N)-m_N\bar a_N
=
\sum_{r\in S_N}b_N(r).
\end{align*}
This is the finite-population reformulation needed for Hájek's theorem: the randomness is only the uniformly selected subset $S_N$, while the values $b_N(1),\dots,b_N(N)$ are deterministic.
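For orientation, the identity can be checked on the Wilcoxon scores $a_N(r)=r$. This choice of scores is an assumption made only for illustration and is not part of the argument. In that case $\bar a_N=(N+1)/2$ and $s_N^2=(N^2-1)/12$, so
\begin{align*}
b_N(r)=r-\frac{N+1}{2},
\qquad
L_N^{(a)}(Z_N)-\frac{m_N(N+1)}{2}
=
\sum_{r\in S_N}\Bigl(r-\frac{N+1}{2}\Bigr).
\end{align*}
With the illustrative values $N=4$, $m_N=2$ and $S_N=\{1,4\}$, the left-hand side is $(1+4)-5=0$ and the right-hand side is $-\tfrac{3}{2}+\tfrac{3}{2}=0$.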
[/guided]
[/step]
[step:Apply Hájek's finite-population central limit theorem]
Define the normalized finite population $c_N:\{1,\dots,N\}\to \mathbb{R}$ by
\begin{align*}
c_N(r)=\frac{b_N(r)}{s_N}.
\end{align*}
Then
\begin{align*}
\frac{1}{N}\sum_{r=1}^{N} c_N(r)=0,
\qquad
\frac{1}{N}\sum_{r=1}^{N} c_N(r)^2=1,
\end{align*}
and the assumed maximum condition becomes
\begin{align*}
\frac{\max_{1\leq r\leq N}|c_N(r)|}{\sqrt{N}}\to 0.
\end{align*}
We use the following form of the Hájek finite-population [central limit theorem](/theorems/1848) for simple random sampling without replacement. If $d_N:\{1,\dots,N\}\to\mathbb{R}$ is a deterministic population satisfying
\begin{align*}
\frac{1}{N}\sum_{r=1}^{N} d_N(r)=0,
\end{align*}
\begin{align*}
\frac{1}{N}\sum_{r=1}^{N} d_N(r)^2=1,
\end{align*}
and
\begin{align*}
\frac{\max_{1\leq r\leq N}|d_N(r)|}{\sqrt{N}}\to 0,
\end{align*}
and if $T_N$ is a uniformly chosen subset of $\{1,\dots,N\}$ with $|T_N|=m_N$ and $m_N/N\to\lambda\in(0,1)$, then
\begin{align*}
\frac{\sum_{r\in T_N} d_N(r)}{\sqrt{m_N(N-m_N)/(N-1)}}
\xrightarrow{d}
\mathcal{N}(0,1).
\end{align*}
This statement uses the finite-population variance convention $N^{-1}\sum_{r=1}^{N}d_N(r)^2=1$, so the sampling variance factor is $m_N(N-m_N)/(N-1)$. We apply it with $d_N=c_N$ and $T_N=S_N$. The mean, variance, and maximum conditions were verified above, the sampling fraction condition $m_N/N\to\lambda\in(0,1)$ is part of the statement of the theorem being proved, and the first step proved that $S_N$ is uniformly sampled without replacement. Since $N-m_N=n_N$, the theorem gives
\begin{align*}
\frac{\sum_{r\in S_N} c_N(r)}{\sqrt{\frac{m_Nn_N}{N-1}}}
\xrightarrow{d}
\mathcal{N}(0,1).
\end{align*}
Multiplying both the numerator and the denominator by $s_N$, and using $b_N(r)=s_Nc_N(r)$, gives
\begin{align*}
\frac{\sum_{r\in S_N} b_N(r)}{\sqrt{\frac{m_Nn_N}{N-1}s_N^2}}
\xrightarrow{d}
\mathcal{N}(0,1).
\end{align*}
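The normalizing factor $m_N(N-m_N)/(N-1)$ can be checked directly. The following is a sketch of the standard second-moment computation and is not needed for the proof. For a uniformly chosen subset $T_N$ of size $m_N$, each index satisfies $\mathbb{P}(r\in T_N)=m_N/N$, and each pair $r\neq s$ satisfies $\mathbb{P}(r,s\in T_N)=\frac{m_N(m_N-1)}{N(N-1)}$. Since $\sum_{r}d_N(r)=0$, the sampled sum has mean zero, and $\sum_{r\neq s}d_N(r)d_N(s)=\bigl(\sum_{r}d_N(r)\bigr)^2-\sum_{r}d_N(r)^2=-N$. Therefore
\begin{align*}
\operatorname{Var}\Bigl(\sum_{r\in T_N}d_N(r)\Bigr)
&=
\frac{m_N}{N}\sum_{r=1}^{N}d_N(r)^2
+
\frac{m_N(m_N-1)}{N(N-1)}\sum_{r\neq s}d_N(r)d_N(s)\\
&=
m_N-\frac{m_N(m_N-1)}{N-1}
=
\frac{m_N(N-m_N)}{N-1}.
\end{align*}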
[guided]
We now apply the finite-population central limit theorem to the deterministic population after normalising it to have variance one. Define $c_N:\{1,\dots,N\}\to\mathbb{R}$ by
\begin{align*}
c_N(r)=\frac{b_N(r)}{s_N}.
\end{align*}
This definition is valid because $s_N^2>0$, hence $s_N>0$. The centering and variance computations give
\begin{align*}
\frac{1}{N}\sum_{r=1}^{N}c_N(r)=0,
\qquad
\frac{1}{N}\sum_{r=1}^{N}c_N(r)^2=1.
\end{align*}
The maximum condition in the theorem statement becomes
\begin{align*}
\frac{\max_{1\leq r\leq N}|c_N(r)|}{\sqrt{N}}
=
\frac{\max_{1\leq r\leq N}|a_N(r)-\bar a_N|}{\sqrt{N}s_N}
\to 0.
\end{align*}
The Hájek finite-population central limit theorem applies to a deterministic population $d_N:\{1,\dots,N\}\to\mathbb{R}$ with mean zero, variance one, and maximum absolute value $o(\sqrt{N})$, sampled by a uniformly chosen subset $T_N$ of size $m_N$ with $m_N/N\to\lambda\in(0,1)$. We apply it with $d_N=c_N$ and $T_N=S_N$. The mean, variance, and maximum conditions are the three displays above, the uniform-subset hypothesis was proved in the first step, and the sampling fraction hypothesis is part of the statement of the theorem being proved. Therefore
\begin{align*}
\frac{\sum_{r\in S_N}c_N(r)}{\sqrt{m_N(N-m_N)/(N-1)}}
\xrightarrow{d}
\mathcal{N}(0,1).
\end{align*}
Since $N-m_N=n_N$ and $b_N(r)=s_Nc_N(r)$ for every $r\in\{1,\dots,N\}$, this is equivalently
\begin{align*}
\frac{\sum_{r\in S_N}b_N(r)}{\sqrt{\frac{m_Nn_N}{N-1}s_N^2}}
\xrightarrow{d}
\mathcal{N}(0,1).
\end{align*}
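As an illustration of the maximum condition, consider the Wilcoxon scores $a_N(r)=r$ (an assumption made only for this example). Then $\bar a_N=(N+1)/2$, $s_N^2=(N^2-1)/12$, and $\max_{1\leq r\leq N}|a_N(r)-\bar a_N|=(N-1)/2$, so
\begin{align*}
\frac{\max_{1\leq r\leq N}|c_N(r)|}{\sqrt{N}}
=
\frac{(N-1)/2}{\sqrt{N}\sqrt{(N^2-1)/12}}
=
\sqrt{\frac{3(N-1)}{N(N+1)}}
\to 0.
\end{align*}
For these scores the condition holds with a rate of order $N^{-1/2}$.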
[/guided]
[/step]
[step:Return from the sampled-score sum to the rank statistic]
From the finite-population representation already proved,
\begin{align*}
\sum_{r\in S_N} b_N(r)=L_N^{(a)}(Z_N)-m_N\bar a_N.
\end{align*}
Substituting this identity into the convergence obtained from Hájek's theorem yields
\begin{align*}
\frac{L_N^{(a)}(Z_N)-m_N\bar a_N}{\sqrt{\frac{m_Nn_N}{N-1}s_N^2}}
\xrightarrow{d}
\mathcal{N}(0,1).
\end{align*}
This is precisely the asserted asymptotic normality of the two-sample linear rank statistic for the sample sizes $m_N$ and $n_N$.
[guided]
The previous step proved the asymptotic normality of the sampled centered-score sum:
\begin{align*}
\frac{\sum_{r\in S_N} b_N(r)}{\sqrt{\frac{m_Nn_N}{N-1}s_N^2}}
\xrightarrow{d}
\mathcal{N}(0,1).
\end{align*}
The finite-population representation proved earlier identifies this numerator exactly with the centered linear rank statistic:
\begin{align*}
\sum_{r\in S_N} b_N(r)=L_N^{(a)}(Z_N)-m_N\bar a_N.
\end{align*}
Substituting this identity gives
\begin{align*}
\frac{L_N^{(a)}(Z_N)-m_N\bar a_N}{\sqrt{\frac{m_Nn_N}{N-1}s_N^2}}
\xrightarrow{d}
\mathcal{N}(0,1).
\end{align*}
This is exactly the claimed convergence under the continuous two-sample null.
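To see what the conclusion says in a familiar case, specialise to the Wilcoxon scores $a_N(r)=r$. These scores are an illustrative assumption, and the maximum condition can be verified directly for them. Here $\bar a_N=(N+1)/2$ and $s_N^2=(N^2-1)/12$, so the variance factor simplifies to
\begin{align*}
\frac{m_Nn_N}{N-1}s_N^2
=
\frac{m_Nn_N}{N-1}\cdot\frac{(N-1)(N+1)}{12}
=
\frac{m_Nn_N(N+1)}{12},
\end{align*}
and the convergence reads
\begin{align*}
\frac{\sum_{i=1}^{m_N}R_{N,i}-\frac{m_N(N+1)}{2}}{\sqrt{\frac{m_Nn_N(N+1)}{12}}}
\xrightarrow{d}
\mathcal{N}(0,1),
\end{align*}
which is the usual normal approximation for the rank-sum statistic.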
[/guided]
[/step]