[proofplan]
We first use ergodicity to show that every non-zero eigenfunction has constant modulus and that each eigenspace is one-dimensional. Orthogonality of distinct eigenspaces inside the separable Hilbert space $L^2(X,\mathcal B,\mu)$ gives countability, and products and conjugates give the subgroup property. We then choose the normalized eigenfunctions multiplicatively, use them as coordinate functions for a measurable map into the compact dual group $G=\widehat{\Lambda(T)}$, and prove by Fourier coefficients that the pushforward of $\mu$ is Haar measure. Finally, the pullback by this map sends the character basis of $L^2(G)$ onto the eigenfunction basis of $L^2(X)$, so the factor map is an isomorphism; the classification follows by comparing both systems with the same compact rotation model.
[/proofplan]
[step:Normalize eigenfunctions and make each eigenspace one-dimensional]
Let
\begin{align*}
U_T:L^2(X,\mathcal B,\mu;\mathbb C)&\to L^2(X,\mathcal B,\mu;\mathbb C)\\
[f]&\mapsto [f\circ T]
\end{align*}
be the Koopman operator of $T$. Since $T$ is measure-preserving, $U_T$ is an isometry. Define the eigenvalue set
\begin{align*}
\Lambda(T):=\{\lambda\in \mathbb T:\ker(U_T-\lambda I)\neq \{0\}\}.
\end{align*}
For $\lambda\in \Lambda(T)$, write
\begin{align*}
E_\lambda:=\ker(U_T-\lambda I)
\end{align*}
for the corresponding eigenspace.
Let $\lambda\in\Lambda(T)$ and let $f\in E_\lambda$ be non-zero. Then
\begin{align*}
|f|\circ T=|f\circ T|=|\lambda f|=|f|
\end{align*}
as elements of $L^2(X,\mathcal B,\mu)$. By the invariant-function form of the [Equivalence of Ergodicity Conditions](/theorems/3444), every $T$-invariant function in $L^2(X,\mathcal B,\mu)$ is constant almost everywhere. Hence there is a constant $c_\lambda\geq 0$ such that $|f|=c_\lambda$ almost everywhere, and $c_\lambda>0$ because $f\neq 0$. Therefore $c_\lambda^{-1}f$ is a $\mathbb T$-valued eigenfunction with eigenvalue $\lambda$.
Now let $u,v\in E_\lambda$ be $\mathbb T$-valued eigenfunctions. Define
\begin{align*}
h:X&\to \mathbb T\\
x&\mapsto u(x)\overline{v(x)}
\end{align*}
as an element of $L^\infty(X,\mathcal B,\mu;\mathbb T)$. Since $u\circ T=\lambda u$ and $v\circ T=\lambda v$ almost everywhere,
\begin{align*}
h\circ T=(u\circ T)\overline{(v\circ T)}
=\lambda u\,\overline{\lambda v}
=u\overline v
=h
\end{align*}
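almost everywhere. Applying the invariant-function criterion again, $h=a$ almost everywhere for some $a\in\mathbb T$. Thus $u=av$ almost everywhere. Since every non-zero element of $E_\lambda$ is a scalar multiple of a $\mathbb T$-valued eigenfunction, and any two $\mathbb T$-valued eigenfunctions in $E_\lambda$ are proportional, every eigenspace $E_\lambda$ is one-dimensional.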
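As an illustration only (not part of the proof), consider the standard irrational rotation. The space $X=\mathbb R/\mathbb Z$ with Lebesgue measure, the irrational number $\theta$, and the functions $e_n$ are assumptions introduced for this example. With $Tx=x+\theta \bmod 1$ and $e_n(x):=e^{2\pi i n x}$ for $n\in\mathbb Z$,
\begin{align*}
e_n(Tx)=e^{2\pi i n(x+\theta)}=e^{2\pi i n\theta}\,e_n(x),
\qquad
|e_n(x)|=1.
\end{align*}
So each $e_n$ is a $\mathbb T$-valued eigenfunction with eigenvalue $e^{2\pi i n\theta}$. Because $\theta$ is irrational, these eigenvalues are pairwise distinct, and because $(e_n)_{n\in\mathbb Z}$ is an orthonormal basis of $L^2$, there are no other eigenvalues. Each eigenspace is spanned by a single $e_n$, as the step predicts.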
[/step]
[step:Prove that the eigenvalue set is a countable subgroup of $\mathbb T$]
The constant function $\mathbf 1_X:X\to\mathbb C$, $x\mapsto 1$, belongs to $E_1$, so $1\in\Lambda(T)$.
Let $\lambda,\eta\in\Lambda(T)$. Choose $\mathbb T$-valued eigenfunctions $u_\lambda\in E_\lambda$ and $u_\eta\in E_\eta$, whose existence was proved in the previous step. The product
\begin{align*}
u_\lambda u_\eta:X&\to \mathbb T\\
x&\mapsto u_\lambda(x)u_\eta(x)
\end{align*}
satisfies
\begin{align*}
(u_\lambda u_\eta)\circ T
=(u_\lambda\circ T)(u_\eta\circ T)
=(\lambda u_\lambda)(\eta u_\eta)
=(\lambda\eta)(u_\lambda u_\eta)
\end{align*}
almost everywhere. Hence $\lambda\eta\in\Lambda(T)$. Also
\begin{align*}
\overline{u_\lambda}\circ T
=\overline{u_\lambda\circ T}
=\overline{\lambda u_\lambda}
=\lambda^{-1}\overline{u_\lambda}
\end{align*}
almost everywhere, so $\lambda^{-1}\in\Lambda(T)$. Thus $\Lambda(T)$ is a subgroup of $\mathbb T$.
For countability, choose for each $\lambda\in\Lambda(T)$ a $\mathbb T$-valued eigenfunction $u_\lambda\in E_\lambda$. Since $\mu$ is a probability measure,
\begin{align*}
\|u_\lambda\|_{L^2(X,\mu)}^2
=\int_X |u_\lambda(x)|^2\,d\mu(x)
=\int_X 1\,d\mu(x)
=1.
\end{align*}
If $\lambda\neq\eta$, then
\begin{align*}
\langle u_\lambda,u_\eta\rangle_{L^2(X,\mu)}
&=\langle U_Tu_\lambda,U_Tu_\eta\rangle_{L^2(X,\mu)}\\
&=\lambda\overline{\eta}\,\langle u_\lambda,u_\eta\rangle_{L^2(X,\mu)}.
\end{align*}
Because $\lambda\overline{\eta}\neq 1$, it follows that
\begin{align*}
\langle u_\lambda,u_\eta\rangle_{L^2(X,\mu)}=0.
\end{align*}
Thus $\{u_\lambda:\lambda\in\Lambda(T)\}$ is an orthonormal set, and the map $\lambda\mapsto u_\lambda$ is injective. By the [Separability of $L^p$ Spaces](/theorems/548), applied with $p=2$ to the standard probability space $(X,\mathcal B,\mu)$, the Hilbert space $L^2(X,\mathcal B,\mu)$ is separable. Every orthonormal set in a separable Hilbert space is at most countable, so $\Lambda(T)$ is countable.
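The countability of orthonormal sets can be checked directly. The dense sequence $(d_m)_{m\in\mathbb N}$ below is notation introduced only for this sketch. For $\lambda\neq\eta$,
\begin{align*}
\|u_\lambda-u_\eta\|_{L^2(X,\mu)}^2
&=\|u_\lambda\|_{L^2(X,\mu)}^2-2\operatorname{Re}\langle u_\lambda,u_\eta\rangle_{L^2(X,\mu)}+\|u_\eta\|_{L^2(X,\mu)}^2\\
&=1-0+1\\
&=2.
\end{align*}
Hence the open balls of radius $\sqrt 2/2$ centered at the $u_\lambda$ are pairwise disjoint. If $(d_m)_{m\in\mathbb N}$ is dense in $L^2(X,\mathcal B,\mu)$, each ball contains some $d_m$, and disjoint balls contain different points, which gives an injection $\Lambda(T)\to\mathbb N$.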
[/step]
[step:Choose eigenfunctions that multiply according to the eigenvalue group]
Let $\mathcal E$ be the abelian group, under pointwise multiplication of equivalence classes, of all $\mathbb T$-valued eigenfunctions of $U_T$. Define
\begin{align*}
q:\mathcal E&\to \Lambda(T)\\
u&\mapsto \lambda \quad\text{where } U_Tu=\lambda u.
\end{align*}
The map $q$ is a surjective group homomorphism. Its kernel consists exactly of the constant $\mathbb T$-valued functions, by ergodicity and the one-dimensionality of $E_1$.
We construct a group homomorphism
\begin{align*}
s:\Lambda(T)&\to \mathcal E
\end{align*}
such that $q\circ s=\operatorname{id}_{\Lambda(T)}$. Suppose $H$ is a subgroup of $\Lambda(T)$ and
\begin{align*}
s_H:H&\to\mathcal E
\end{align*}
is a homomorphism with $q\circ s_H=\operatorname{id}_H$. Let $\alpha\in\Lambda(T)$ and set $K:=\langle H,\alpha\rangle$.
If no positive integer $n$ satisfies $\alpha^n\in H$, then each element of $K$ has a unique form $\beta\alpha^k$ with $\beta\in H$ and $k\in\mathbb Z$. Choose $u\in\mathcal E$ with $q(u)=\alpha$, and define
\begin{align*}
s_K(\beta\alpha^k):=s_H(\beta)u^k.
\end{align*}
The uniqueness of the representation makes $s_K$ well-defined, and the definition gives a homomorphism extending $s_H$ with $q\circ s_K=\operatorname{id}_K$.
If there is a positive integer $n$ with $\alpha^n\in H$, let $n$ be the least such integer. Choose $u\in\mathcal E$ with $q(u)=\alpha$. Then
\begin{align*}
u^n s_H(\alpha^n)^{-1}\in \ker q,
\end{align*}
so there is a constant $c\in\mathbb T$ such that
\begin{align*}
u^n s_H(\alpha^n)^{-1}=c.
\end{align*}
Choose $a\in\mathbb T$ with $a^n=c^{-1}$, and set $v:=au$. Then
\begin{align*}
v^n=s_H(\alpha^n).
\end{align*}
Every element of $K$ has a unique form $\beta\alpha^k$ with $\beta\in H$ and $0\leq k\leq n-1$. Define
\begin{align*}
s_K(\beta\alpha^k):=s_H(\beta)v^k.
\end{align*}
The relation $v^n=s_H(\alpha^n)$ proves that $s_K$ is multiplicative when exponents cross $n$, so $s_K$ is a homomorphism extending $s_H$ and satisfying $q\circ s_K=\operatorname{id}_K$.
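The case where exponents cross $n$ can be written out explicitly; this is a sketch of the verification, using only the objects already defined. Let $\beta,\beta'\in H$ and $0\leq k,l\leq n-1$ with $k+l\geq n$. Then $\beta\alpha^k\cdot\beta'\alpha^l=(\beta\beta'\alpha^n)\alpha^{k+l-n}$ with $\beta\beta'\alpha^n\in H$ and $0\leq k+l-n\leq n-1$, and
\begin{align*}
s_K(\beta\alpha^k)\,s_K(\beta'\alpha^l)
&=s_H(\beta)s_H(\beta')\,v^{k+l}\\
&=s_H(\beta\beta')\,v^{n}\,v^{k+l-n}\\
&=s_H(\beta\beta')\,s_H(\alpha^n)\,v^{k+l-n}\\
&=s_H(\beta\beta'\alpha^n)\,v^{k+l-n}\\
&=s_K(\beta\alpha^k\cdot\beta'\alpha^l).
\end{align*}
Moreover $q(v)=q(a)q(u)=\alpha$ because the constant $a$ lies in $\ker q$, so $q(s_K(\beta\alpha^k))=\beta\alpha^k$.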
Since $\Lambda(T)$ is countable, choose an enumeration $(\lambda_j)_{j\in\mathbb N}$ of $\Lambda(T)$ with $\lambda_1=1$. Start with $H_1=\{1\}$ and $s_{H_1}(1)=\mathbf 1_X$. Applying the extension construction inductively to
\begin{align*}
H_j:=\langle \lambda_1,\ldots,\lambda_j\rangle
\end{align*}
gives compatible homomorphic sections $s_{H_j}:H_j\to\mathcal E$. Their union is the required section $s:\Lambda(T)\to\mathcal E$.
For each $\lambda\in\Lambda(T)$, set
\begin{align*}
e_\lambda:=s(\lambda).
\end{align*}
Then
\begin{align*}
e_1=\mathbf 1_X,\qquad e_{\lambda\eta}=e_\lambda e_\eta,\qquad U_Te_\lambda=\lambda e_\lambda
\end{align*}
for all $\lambda,\eta\in\Lambda(T)$.
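A small worked instance of the finite-order case may help. The system below is a hypothetical example chosen for illustration, assuming finite probability spaces are admitted. Take $X=\mathbb Z/4\mathbb Z$ with the uniform measure and $Tx=x+1$, so that $\Lambda(T)=\{1,i,-1,-i\}$. Let $H=\{1,-1\}$ with $s_H(-1)(x)=(-1)^x$, and let $\alpha=i$, so $n=2$. For an arbitrary $\omega\in\mathbb T$, the function $u(x)=\omega\, i^x$ satisfies $q(u)=i$, and
\begin{align*}
u(x)^2\, s_H(\alpha^2)(x)^{-1}
=\omega^2(-1)^x(-1)^{-x}
=\omega^2
=:c.
\end{align*}
Choosing $a=\omega^{-1}$, one of the two solutions of $a^2=c^{-1}$, gives
\begin{align*}
v(x)=a\,u(x)=i^x,
\qquad
v(x)^2=(-1)^x=s_H(\alpha^2)(x).
\end{align*}
The resulting section is $s(i^k)(x)=i^{kx}$ for $k\in\{0,1,2,3\}$, which is visibly multiplicative in $i^k$.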
[/step]
[step:Realize the multiplicative eigenfunctions as a measurable map into the dual group]
For each $\lambda\in\Lambda(T)$, choose a measurable representative
\begin{align*}
\widetilde e_\lambda:X&\to\mathbb T
\end{align*}
of the equivalence class $e_\lambda$. Since $\Lambda(T)$ is countable, there is a measurable set $X_0\in\mathcal B$ with $\mu(X_0)=1$ such that $T(X_0)\subseteq X_0$ and, for every $x\in X_0$ and every $\lambda,\eta\in\Lambda(T)$,
\begin{align*}
\widetilde e_\lambda(Tx)=\lambda \widetilde e_\lambda(x),
\qquad
\widetilde e_{\lambda\eta}(x)=\widetilde e_\lambda(x)\widetilde e_\eta(x),
\qquad
\widetilde e_1(x)=1.
\end{align*}
Indeed, take the union of the countably many null sets on which one of these identities fails, and then remove all of its inverse images under the maps $T^k$ for $k\in\mathbb N\cup\{0\}$.
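The construction of $X_0$ can be made explicit as follows; the symbol $N$ is notation introduced only for this sketch. Let $N\in\mathcal B$ be the union of the countably many null sets on which one of the three identities fails for some $\lambda,\eta\in\Lambda(T)$, and set
\begin{align*}
X_0:=X\setminus\bigcup_{k=0}^{\infty}T^{-k}(N).
\end{align*}
Since $T$ preserves $\mu$,
\begin{align*}
\mu(X\setminus X_0)
\leq\sum_{k=0}^{\infty}\mu\bigl(T^{-k}(N)\bigr)
=\sum_{k=0}^{\infty}\mu(N)
=0.
\end{align*}
If $x\in X_0$, then $T^{k}(Tx)=T^{k+1}x\notin N$ for every $k\geq 0$, so $Tx\in X_0$. The case $k=0$ gives $x\notin N$, so all three identities hold at every point of $X_0$.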
Let
\begin{align*}
G:=\widehat{\Lambda(T)}
\end{align*}
be the compact abelian group of all homomorphisms $\gamma:\Lambda(T)\to\mathbb T$, equipped with the product topology inherited from $\mathbb T^{\Lambda(T)}$. Let $\mathcal B_G$ be its Borel $\sigma$-algebra. Define
\begin{align*}
\Phi:X&\to G
\end{align*}
by
\begin{align*}
\Phi(x)(\lambda)=
\begin{cases}
\widetilde e_\lambda(x),&x\in X_0,\\
1,&x\notin X_0.
\end{cases}
\end{align*}
For $x\in X_0$, the identity $\widetilde e_{\lambda\eta}(x)=\widetilde e_\lambda(x)\widetilde e_\eta(x)$ shows that $\lambda\mapsto\Phi(x)(\lambda)$ is a homomorphism, so $\Phi(x)\in G$. For $x\notin X_0$, $\Phi(x)$ is the trivial character, which also lies in $G$. Since the coordinate maps generate $\mathcal B_G$ and each coordinate $\Phi(\cdot)(\lambda)$ is $\mathcal B$-measurable, $\Phi$ is measurable.
Define
\begin{align*}
g:\Lambda(T)&\to\mathbb T\\
\lambda&\mapsto \lambda
\end{align*}
and
\begin{align*}
R_g:G&\to G\\
\gamma&\mapsto g\gamma,
\end{align*}
where $(g\gamma)(\lambda)=g(\lambda)\gamma(\lambda)$. Since $\Lambda(T)$ is a subgroup of $\mathbb T$, the map $g$ is a character of $\Lambda(T)$, hence $g\in G$. For $x\in X_0$ and $\lambda\in\Lambda(T)$,
\begin{align*}
\Phi(Tx)(\lambda)
=\widetilde e_\lambda(Tx)
=\lambda \widetilde e_\lambda(x)
=g(\lambda)\Phi(x)(\lambda)
=(R_g(\Phi(x)))(\lambda).
\end{align*}
Therefore
\begin{align*}
\Phi\circ T=R_g\circ \Phi
\end{align*}
on $X_0$, and hence almost everywhere.
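For orientation, here is what the construction gives for the standard irrational rotation. The setup $X=\mathbb R/\mathbb Z$ with Lebesgue measure, $Tx=x+\theta\bmod 1$ with $\theta$ irrational, is an assumption made only for this example. In that case $\Lambda(T)=\{e^{2\pi i n\theta}:n\in\mathbb Z\}$ is infinite cyclic, so a character $\gamma\in G$ is determined by the single value $\gamma(e^{2\pi i\theta})$, and $\gamma\mapsto\gamma(e^{2\pi i\theta})$ identifies $G$ with $\mathbb T$. With the multiplicative choice $e_{\lambda}(x)=e^{2\pi i n x}$ for $\lambda=e^{2\pi i n\theta}$,
\begin{align*}
\Phi(x)&\longleftrightarrow \Phi(x)(e^{2\pi i\theta})=e^{2\pi i x},\\
g&\longleftrightarrow g(e^{2\pi i\theta})=e^{2\pi i\theta},\\
R_g&\longleftrightarrow \bigl(z\mapsto e^{2\pi i\theta}z\bigr).
\end{align*}
Under this identification the model rotation is the original rotation written multiplicatively on $\mathbb T$.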
[/step]
[step:Identify the pushforward measure as Haar measure by Fourier coefficients]
Let
\begin{align*}
\nu:=\Phi_*\mu
\end{align*}
be the pushforward probability measure on $(G,\mathcal B_G)$, and let $m_G$ denote Haar probability measure on $G$. For each $\lambda\in\Lambda(T)$, define the continuous character
\begin{align*}
\chi_\lambda:G&\to\mathbb T\\
\gamma&\mapsto \gamma(\lambda).
\end{align*}
Then
\begin{align*}
\int_G \chi_\lambda(\gamma)\,d\nu(\gamma)
&=\int_X \chi_\lambda(\Phi(x))\,d\mu(x)\\
&=\int_X \widetilde e_\lambda(x)\,d\mu(x).
\end{align*}
If $\lambda=1$, this integral equals $1$. If $\lambda\neq 1$, then $e_\lambda$ is orthogonal to the constant eigenspace $E_1$, so
\begin{align*}
\int_X \widetilde e_\lambda(x)\,d\mu(x)
=\langle e_\lambda,\mathbf 1_X\rangle_{L^2(X,\mu)}
=0.
\end{align*}
Thus
\begin{align*}
\int_G \chi_\lambda(\gamma)\,d\nu(\gamma)
=
\begin{cases}
1,&\lambda=1,\\
0,&\lambda\neq 1.
\end{cases}
\end{align*}
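The same identities hold for $m_G$. For $\lambda=1$, $\chi_1$ is the constant function $1$, whose integral against the probability measure $m_G$ equals $1$. For $\lambda\neq 1$, there is some $\gamma_0\in G$ such that $\chi_\lambda(\gamma_0)\neq 1$: for instance, the inclusion character $g$ from the previous step satisfies $\chi_\lambda(g)=g(\lambda)=\lambda\neq 1$, so one may take $\gamma_0:=g$. If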
\begin{align*}
I_\lambda:=\int_G \chi_\lambda(\gamma)\,d m_G(\gamma),
\end{align*}
then left-invariance of Haar measure gives
\begin{align*}
I_\lambda
=\int_G \chi_\lambda(\gamma_0\gamma)\,d m_G(\gamma)
=\chi_\lambda(\gamma_0) I_\lambda.
\end{align*}
Since $\chi_\lambda(\gamma_0)\neq 1$, $I_\lambda=0$.
The linear span of $\{\chi_\lambda:\lambda\in\Lambda(T)\}$ is a unital self-adjoint subalgebra of $C(G;\mathbb C)$ and separates points of $G$. By the [Stone-Weierstrass Theorem](/theorems/886), it is uniformly dense in $C(G;\mathbb C)$. Since $\nu$ and $m_G$ are regular Borel probability measures on the compact Hausdorff space $G$ and agree on a uniformly dense subspace of $C(G;\mathbb C)$, the [Riesz-Markov-Kakutani Representation Theorem](/theorems/976) implies
\begin{align*}
\nu=m_G.
\end{align*}
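The two ingredients of this density argument can be sketched explicitly; the finite set $F$, the coefficients $c_\lambda$, and the function $P$ are notation introduced only here. The characters satisfy
\begin{align*}
\chi_\lambda\chi_\eta=\chi_{\lambda\eta},
\qquad
\overline{\chi_\lambda}=\chi_{\lambda^{-1}},
\qquad
\chi_1=\mathbf 1_G,
\end{align*}
which is why their linear span is a unital self-adjoint subalgebra. For a finite set $F\subseteq\Lambda(T)$ containing $1$ and $P=\sum_{\lambda\in F}c_\lambda\chi_\lambda$,
\begin{align*}
\int_G P\,d\nu
=\sum_{\lambda\in F}c_\lambda\int_G\chi_\lambda\,d\nu
=c_1
=\sum_{\lambda\in F}c_\lambda\int_G\chi_\lambda\,dm_G
=\int_G P\,dm_G.
\end{align*}
For $f\in C(G;\mathbb C)$ and any such $P$,
\begin{align*}
\left|\int_G f\,d\nu-\int_G f\,dm_G\right|
\leq\int_G|f-P|\,d\nu+\int_G|P-f|\,dm_G
\leq 2\|f-P\|_\infty,
\end{align*}
and the right-hand side can be made arbitrarily small, so the two measures integrate every continuous function identically.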
[/step]
[step:Use spectral density to make the factor map an isomorphism]
Define the pullback operator
\begin{align*}
C_\Phi:L^2(G,\mathcal B_G,m_G;\mathbb C)&\to L^2(X,\mathcal B,\mu;\mathbb C)\\
[F]&\mapsto [F\circ\Phi].
\end{align*}
Since $\Phi_*\mu=m_G$, for every $F\in L^2(G,\mathcal B_G,m_G;\mathbb C)$,
\begin{align*}
\|C_\Phi F\|_{L^2(X,\mu)}^2
&=\int_X |F(\Phi(x))|^2\,d\mu(x)\\
&=\int_G |F(\gamma)|^2\,d(\Phi_*\mu)(\gamma)\\
&=\int_G |F(\gamma)|^2\,d m_G(\gamma)\\
&=\|F\|_{L^2(G,m_G)}^2.
\end{align*}
Thus $C_\Phi$ is an isometry.
For each $\lambda\in\Lambda(T)$,
\begin{align*}
C_\Phi\chi_\lambda=e_\lambda
\end{align*}
as elements of $L^2(X,\mathcal B,\mu)$. The linear span of $\{\chi_\lambda:\lambda\in\Lambda(T)\}$ is dense in $L^2(G,\mathcal B_G,m_G)$ because it is uniformly dense in $C(G;\mathbb C)$ and continuous functions are dense in $L^2(G,m_G)$. The linear span of $\{e_\lambda:\lambda\in\Lambda(T)\}$ is dense in $L^2(X,\mathcal B,\mu)$ by the pure discrete spectrum hypothesis and the one-dimensionality of the eigenspaces. Hence the range of $C_\Phi$ contains a dense subspace of $L^2(X,\mathcal B,\mu)$. Since the range of an isometry is closed, $C_\Phi$ is surjective.
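Therefore $C_\Phi$ is unitary. In particular, every indicator function $\mathbf 1_A$ with $A\in\mathcal B$ is of the form $F\circ\Phi$ almost everywhere, so the pullback $\sigma$-algebra $\Phi^{-1}(\mathcal B_G)$ equals $\mathcal B$ modulo $\mu$-null sets. For standard probability systems, a measure-preserving factor map with this property is an isomorphism modulo null sets. Hence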
\begin{align*}
\Phi:(X,\mathcal B,\mu,T)\to (G,\mathcal B_G,m_G,R_g)
\end{align*}
is a measure-theoretic isomorphism.
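At the level of Koopman operators, the equivariance $\Phi\circ T=R_g\circ\Phi$ can be checked as follows; $U_{R_g}$ denotes the Koopman operator of $R_g$, a notation introduced here by analogy with $U_T$. For $F\in L^2(G,\mathcal B_G,m_G;\mathbb C)$,
\begin{align*}
C_\Phi U_{R_g}F
=(F\circ R_g)\circ\Phi
=F\circ(\Phi\circ T)
=(F\circ\Phi)\circ T
=U_TC_\Phi F.
\end{align*}
On characters this reads
\begin{align*}
U_{R_g}\chi_\lambda(\gamma)=\chi_\lambda(g\gamma)=g(\lambda)\gamma(\lambda)=\lambda\chi_\lambda(\gamma),
\qquad
U_Te_\lambda=\lambda e_\lambda,
\end{align*}
so $C_\Phi$ carries the $\lambda$-eigenvector $\chi_\lambda$ of $U_{R_g}$ to the $\lambda$-eigenvector $e_\lambda$ of $U_T$.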
The rotation is ergodic as well. If $\lambda\neq 1$, then
\begin{align*}
\chi_\lambda(g)=g(\lambda)=\lambda\neq 1,
\end{align*}
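so $\chi_\lambda\circ R_g=\lambda\chi_\lambda\neq\chi_\lambda$, and no non-constant character is fixed by $R_g$. Since the characters form an orthonormal basis of $L^2(G,m_G)$ consisting of eigenfunctions of the Koopman operator of $R_g$, comparing Fourier coefficients shows that every $R_g$-invariant $L^2$ function is constant. Thus $R_g$ is ergodic. Equivalently, the cyclic subgroup $\{g^n:n\in\mathbb Z\}$ is dense in $G$.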
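The Fourier-coefficient comparison can be sketched as follows; the coefficients $c_\lambda$ are notation introduced only here. Write $F\in L^2(G,m_G)$ as the $L^2$-convergent series $F=\sum_{\lambda\in\Lambda(T)}c_\lambda\chi_\lambda$ with $c_\lambda=\langle F,\chi_\lambda\rangle_{L^2(G,m_G)}$. Then
\begin{align*}
F\circ R_g
=\sum_{\lambda\in\Lambda(T)}c_\lambda\,(\chi_\lambda\circ R_g)
=\sum_{\lambda\in\Lambda(T)}\lambda c_\lambda\,\chi_\lambda.
\end{align*}
If $F\circ R_g=F$, uniqueness of the coefficients gives
\begin{align*}
(\lambda-1)c_\lambda=0\quad\text{for all }\lambda\in\Lambda(T),
\end{align*}
so $c_\lambda=0$ whenever $\lambda\neq 1$, and $F=c_1\chi_1$ is constant.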
[/step]
[step:Classify pure discrete ergodic systems by their eigenvalue group]
Let $(X_1,\mathcal B_1,\mu_1,T_1)$ and $(X_2,\mathcal B_2,\mu_2,T_2)$ be ergodic measure-preserving systems with pure discrete spectrum.
First suppose
\begin{align*}
\Lambda(T_1)=\Lambda(T_2)=:\Lambda.
\end{align*}
Applying the construction above to each system gives measure-theoretic isomorphisms
\begin{align*}
\Phi_i:(X_i,\mathcal B_i,\mu_i,T_i)&\to (G,\mathcal B_G,m_G,R_g),
\qquad i\in\{1,2\},
\end{align*}
where
\begin{align*}
G:=\widehat{\Lambda}
\end{align*}
and
\begin{align*}
g:\Lambda&\to\mathbb T\\
\lambda&\mapsto \lambda.
\end{align*}
Therefore $\Phi_2^{-1}\circ\Phi_1$ is a measure-theoretic isomorphism from $(X_1,\mathcal B_1,\mu_1,T_1)$ to $(X_2,\mathcal B_2,\mu_2,T_2)$.
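The equivariance and measure preservation of the composite can be verified directly from $\Phi_i\circ T_i=R_g\circ\Phi_i$ and $(\Phi_i)_*\mu_i=m_G$; all identities below hold modulo null sets. The relation for $i=2$ gives $\Phi_2^{-1}\circ R_g=T_2\circ\Phi_2^{-1}$, hence
\begin{align*}
(\Phi_2^{-1}\circ\Phi_1)\circ T_1
&=\Phi_2^{-1}\circ R_g\circ\Phi_1\\
&=T_2\circ(\Phi_2^{-1}\circ\Phi_1),
\end{align*}
and
\begin{align*}
(\Phi_2^{-1}\circ\Phi_1)_*\mu_1
=(\Phi_2^{-1})_*m_G
=\mu_2.
\end{align*}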
Conversely, suppose
\begin{align*}
\Psi:(X_1,\mathcal B_1,\mu_1,T_1)&\to (X_2,\mathcal B_2,\mu_2,T_2)
\end{align*}
is a measure-theoretic isomorphism. Define
\begin{align*}
C_\Psi:L^2(X_2,\mathcal B_2,\mu_2;\mathbb C)&\to L^2(X_1,\mathcal B_1,\mu_1;\mathbb C)\\
[F]&\mapsto [F\circ\Psi].
\end{align*}
The operator $C_\Psi$ is unitary and intertwines the Koopman operators:
\begin{align*}
C_\Psi U_{T_2}F
&=(F\circ T_2)\circ\Psi\\
&=F\circ(T_2\circ\Psi)\\
&=F\circ(\Psi\circ T_1)\\
&=(F\circ\Psi)\circ T_1\\
&=U_{T_1}C_\Psi F.
\end{align*}
If $F\neq 0$ satisfies $U_{T_2}F=\lambda F$, then $C_\Psi F\neq 0$ and
\begin{align*}
U_{T_1}(C_\Psi F)=C_\Psi(U_{T_2}F)=\lambda C_\Psi F.
\end{align*}
Hence $\Lambda(T_2)\subseteq \Lambda(T_1)$. Applying the same argument to $\Psi^{-1}$ gives $\Lambda(T_1)\subseteq\Lambda(T_2)$. Therefore
\begin{align*}
\Lambda(T_1)=\Lambda(T_2).
\end{align*}
This proves that ergodic measure-preserving systems with pure discrete spectrum are classified, up to measure-theoretic isomorphism, by their eigenvalue group.
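As a worked example of the classification, consider two irrational rotations. The setup $X=\mathbb R/\mathbb Z$ with Lebesgue measure, $T_\theta x=x+\theta\bmod 1$, and the irrational numbers $\theta,\theta'$ are assumptions introduced only for this illustration. The Fourier basis shows that each $T_\theta$ is ergodic with pure discrete spectrum and
\begin{align*}
\Lambda(T_\theta)=\{e^{2\pi i n\theta}:n\in\mathbb Z\}.
\end{align*}
If $\Lambda(T_\theta)=\Lambda(T_{\theta'})$, then $e^{2\pi i\theta'}=e^{2\pi i m\theta}$ and $e^{2\pi i\theta}=e^{2\pi i k\theta'}$ for some $m,k\in\mathbb Z$, so
\begin{align*}
e^{2\pi i\theta}=e^{2\pi i km\theta},
\end{align*}
which forces $km=1$ because $\theta$ is irrational, that is, $m=\pm 1$. Hence $T_\theta$ and $T_{\theta'}$ are isomorphic if and only if $\theta'\equiv\pm\theta \pmod 1$. For $\theta'\equiv-\theta$, the map $x\mapsto -x$ is an explicit isomorphism.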
[/step]