[proofplan]
The hypothesis $L^2(X,\mu) \neq \{0\}$ excludes the degenerate zero space and guarantees that the operator norm equals exactly $1$ (rather than $0$). We first verify that composition with $T$ is well defined on $L^2(X,\mu)$: measure preservation shows that preimages of null sets under $T$ are null, so almost-everywhere equality survives composition. We then compute the $L^2$ norm of $U_T f$ and prove linearity, which together give boundedness, isometry, and operator norm $1$. Finally, under the additional hypothesis that $T$ is invertible with measurable measure-preserving inverse, we identify the adjoint $U_T^* = U_{T^{-1}}$ via the change-of-variables formula for pushforward measures and verify that $U_{T^{-1}}$ is the two-sided inverse of $U_T$, proving unitarity.
[/proofplan]
[step:Show that composition with $T$ is well defined on $L^2(X,\mu)$]
Let $\mathcal{N} \subseteq \mathcal{B}$ denote the collection of $\mu$-null measurable sets. By definition of a [measure-preserving transformation](/page/Measure-Preserving%20Transformation), $T: X \to X$ is $\mathcal{B}$-measurable and satisfies
\begin{align*}
\mu(T^{-1}(E)) = \mu(E)
\end{align*}
for every $E \in \mathcal{B}$.
Let $f: X \to \mathbb{C}$ be a $\mathcal{B}$-measurable representative of an element of $L^2(X,\mu)$. Since $T$ is $\mathcal{B}$-measurable, the composition
\begin{align*}
f \circ T: X &\to \mathbb{C} \\
x &\mapsto f(T(x))
\end{align*}
is $\mathcal{B}$-measurable.
If $f_1: X \to \mathbb{C}$ and $f_2: X \to \mathbb{C}$ are measurable representatives with $f_1 = f_2$ $\mu$-almost everywhere, define
\begin{align*}
N := \{y \in X : f_1(y) \neq f_2(y)\}.
\end{align*}
Then $N \in \mathcal{N}$, and
\begin{align*}
\{x \in X : f_1(T(x)) \neq f_2(T(x))\} = T^{-1}(N).
\end{align*}
Using measure preservation with $E=N$ gives
\begin{align*}
\mu(T^{-1}(N)) = \mu(N) = 0.
\end{align*}
Hence $f_1 \circ T = f_2 \circ T$ $\mu$-almost everywhere. Therefore the rule defining the [Koopman operator](/page/Koopman%20Operator)
\begin{align*}
U_T: L^2(X,\mu) &\to L^2(X,\mu) \\
[f] &\mapsto [f \circ T]
\end{align*}
is independent of the chosen representative. It remains to show that $f \circ T \in L^2(X,\mu)$, which is done in the next step.
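As an illustration of why measure preservation is needed here, consider the following sketch, in which the space, the map $S$, and the functions are assumptions chosen for the example and are not part of the theorem. Take $X = [0,1]$ with Lebesgue measure $\lambda$ and the constant map $S(x) := 0$, which is measurable but not measure-preserving, since $\lambda(S^{-1}(\{0\})) = 1 \neq 0 = \lambda(\{0\})$. With $f_1 := 0$ and $f_2 := \mathbf{1}_{\{0\}}$ we have $f_1 = f_2$ $\lambda$-almost everywhere, yet
\begin{align*}
f_1(S(x)) = 0 \neq 1 = f_2(S(x)) \quad\text{for every } x \in [0,1],
\end{align*}
so composition with $S$ does not respect almost-everywhere equality. Here the exceptional set is $N = \{0\}$ and $S^{-1}(N) = [0,1]$ has full measure, which is exactly what the identity $\mu(T^{-1}(N)) = \mu(N)$ rules out for a measure-preserving $T$.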
[/step]
[step:Use measure preservation to prove the $L^2$ isometry identity]
Let $f: X \to \mathbb{C}$ be a measurable representative of an element of $L^2(X,\mu)$. Define the non-negative measurable function
\begin{align*}
h: X &\to [0,\infty] \\
y &\mapsto |f(y)|^2.
\end{align*}
Because $f \in L^2(X,\mu)$,
\begin{align*}
\int_X h(y) \, d\mu(y) < \infty.
\end{align*}
Since $T$ is measure-preserving, the [pushforward measure](/page/Pushforward%20Measure) $T_\#\mu$ on $(X,\mathcal{B})$, defined by
\begin{align*}
(T_\#\mu)(E) := \mu(T^{-1}(E))
\end{align*}
for $E \in \mathcal{B}$, equals $\mu$. By [Change of Variables (general)](/theorems/22), applied to the non-negative measurable function $h$ together with the identity $T_\#\mu = \mu$,
\begin{align*}
\int_X |f(T(x))|^2 \, d\mu(x)
= \int_X h(T(x)) \, d\mu(x)
= \int_X h(y) \, d(T_\#\mu)(y)
= \int_X |f(y)|^2 \, d\mu(y).
\end{align*}
Thus $f \circ T \in L^2(X,\mu)$ and
\begin{align*}
\|U_T f\|_{L^2(X,\mu)} = \|f\|_{L^2(X,\mu)}.
\end{align*}
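A concrete check of this identity, as a sketch under assumptions not made in the statement: take $X = [0,1)$ with Lebesgue measure, the doubling map $D(x) := 2x \bmod 1$ (a standard example of a measure-preserving transformation), and $f(y) := y$. Splitting the integral at $x = 1/2$ gives
\begin{align*}
\int_0^1 |f(D(x))|^2 \, dx
= \int_0^{1/2} (2x)^2 \, dx + \int_{1/2}^1 (2x-1)^2 \, dx
= \frac{1}{6} + \frac{1}{6}
= \frac{1}{3}
= \int_0^1 y^2 \, dy,
\end{align*}
in agreement with $\|U_D f\|_{L^2} = \|f\|_{L^2}$.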
[guided]
We need to prove more than a formal identity: $U_T f$ must actually be an element of $L^2(X,\mu)$. Let $f: X \to \mathbb{C}$ be a measurable representative of an $L^2$ class, and define
\begin{align*}
h: X &\to [0,\infty] \\
y &\mapsto |f(y)|^2.
\end{align*}
This function is measurable because $f$ is measurable and the map $z \mapsto |z|^2$ from $\mathbb{C}$ to $[0,\infty)$ is continuous. Since $f \in L^2(X,\mu)$,
\begin{align*}
\int_X h(y) \, d\mu(y) < \infty.
\end{align*}
The measure-preserving hypothesis says exactly that pulling measurable sets back by $T$ does not change their measure. Equivalently, the [pushforward measure](/page/Pushforward%20Measure) $T_\#\mu$ defined by
\begin{align*}
(T_\#\mu)(E) := \mu(T^{-1}(E))
\end{align*}
for every $E \in \mathcal{B}$ satisfies $T_\#\mu = \mu$. The [Change of Variables (general)](/theorems/22) theorem requires only that the integrand be non-negative and measurable (or, alternatively, integrable). Since $h$ is non-negative and measurable, applying the theorem to $h$ gives
\begin{align*}
\int_X |f(T(x))|^2 \, d\mu(x)
= \int_X h(T(x)) \, d\mu(x)
= \int_X h(y) \, d(T_\#\mu)(y)
= \int_X h(y) \, d\mu(y)
= \int_X |f(y)|^2 \, d\mu(y).
\end{align*}
The right-hand side is finite, so $f \circ T \in L^2(X,\mu)$. Taking square roots gives
\begin{align*}
\|U_T f\|_{L^2(X,\mu)} = \|f\|_{L^2(X,\mu)}.
\end{align*}
[/guided]
[/step]
[step:Deduce linearity, boundedness, and norm one]
Let $f,g \in L^2(X,\mu)$, and let $\alpha,\beta \in \mathbb{C}$. Choose measurable representatives, still denoted $f$ and $g$. For every $x \in X$,
\begin{align*}
U_T(\alpha f+\beta g)(x)
= (\alpha f+\beta g)(T(x))
= \alpha f(T(x))+\beta g(T(x))
= \alpha U_T f(x)+\beta U_T g(x).
\end{align*}
Therefore $U_T$ is linear on $L^2(X,\mu)$.
The isometry identity from the previous step gives
\begin{align*}
\|U_T f\|_{L^2(X,\mu)} = \|f\|_{L^2(X,\mu)}
\end{align*}
for every $f \in L^2(X,\mu)$. Hence $U_T$ is bounded with
\begin{align*}
\|U_T\|_{\mathcal{L}(L^2)}
= \sup_{\|f\|_{L^2(X,\mu)} \leq 1} \|U_T f\|_{L^2(X,\mu)}
\leq 1.
\end{align*}
By the hypothesis $L^2(X,\mu) \neq \{0\}$, there exists $f_0 \in L^2(X,\mu)$ with $\|f_0\|_{L^2(X,\mu)} > 0$. The normalised element
\begin{align*}
e_0 := \frac{f_0}{\|f_0\|_{L^2(X,\mu)}} \in L^2(X,\mu)
\end{align*}
satisfies $\|e_0\|_{L^2(X,\mu)} = 1$, and the isometry identity gives
\begin{align*}
\|U_T e_0\|_{L^2(X,\mu)} = \|e_0\|_{L^2(X,\mu)} = 1.
\end{align*}
Combined with the upper bound $\|U_T\|_{\mathcal{L}(L^2)} \leq 1$, this shows that the supremum is attained at $e_0$ and
\begin{align*}
\|U_T\|_{\mathcal{L}(L^2)} = 1.
\end{align*}
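For instance, assuming additionally that $0 < \mu(X) < \infty$ (an assumption made only for this illustration, not part of the hypotheses), one may take $f_0$ to be the constant function $\mathbf{1}_X$. Then
\begin{align*}
e_0 = \mu(X)^{-1/2}\,\mathbf{1}_X,
\qquad
U_T e_0 = e_0 \circ T = e_0,
\end{align*}
so in this case the norm is attained at a vector fixed by $U_T$.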
[/step]
[step:Compute the adjoint when $T$ is invertible and $T^{-1}$ is measure preserving]
Assume now that $T$ is invertible, that $T^{-1}: X \to X$ is $\mathcal{B}$-measurable, and that $T^{-1}$ is measure-preserving. Define
\begin{align*}
U_{T^{-1}}: L^2(X,\mu) &\to L^2(X,\mu) \\
[g] &\mapsto [g \circ T^{-1}].
\end{align*}
Applying the preceding three steps with $T$ replaced by $T^{-1}$, we see that $U_{T^{-1}}$ is a well-defined bounded linear isometry; in particular, $g \circ T^{-1} \in L^2(X,\mu)$ with $\|g \circ T^{-1}\|_{L^2(X,\mu)} = \|g\|_{L^2(X,\mu)}$ for every $g \in L^2(X,\mu)$.
Let $f,g \in L^2(X,\mu)$, represented by measurable functions $f,g: X \to \mathbb{C}$. Define the candidate integrand
\begin{align*}
\phi: X &\to \mathbb{C} \\
y &\mapsto f(y)\,\overline{g(T^{-1}(y))}.
\end{align*}
Then $\phi$ is $\mathcal{B}$-measurable, and by the [Cauchy-Schwarz Inequality](/theorems/432) in $L^2(X,\mu)$ applied to $f$ and $g \circ T^{-1}$,
\begin{align*}
\int_X |\phi(y)| \, d\mu(y)
= \int_X |f(y)|\,|g(T^{-1}(y))| \, d\mu(y)
\leq \|f\|_{L^2(X,\mu)} \, \|g \circ T^{-1}\|_{L^2(X,\mu)}
= \|f\|_{L^2(X,\mu)} \, \|g\|_{L^2(X,\mu)}
< \infty,
\end{align*}
so $\phi \in L^1(X,\mu)$.
We now apply the change-of-variables formula for pushforward measures to the integrable function $\phi$ and the map $T$; since $T$ is measure-preserving, $T_\#\mu = \mu$. Because $T^{-1}(T(x)) = x$ for every $x \in X$,
\begin{align*}
\phi(T(x)) = f(T(x))\,\overline{g(T^{-1}(T(x)))} = f(T(x))\,\overline{g(x)}.
\end{align*}
The change-of-variables formula gives
\begin{align*}
\int_X f(T(x))\,\overline{g(x)} \, d\mu(x)
&= \int_X \phi(T(x)) \, d\mu(x) \\
&= \int_X \phi(y) \, d(T_\#\mu)(y) \\
&= \int_X \phi(y) \, d\mu(y) \\
&= \int_X f(y)\,\overline{g(T^{-1}(y))} \, d\mu(y).
\end{align*}
Recognising both sides as $L^2$ inner products,
\begin{align*}
\langle U_T f,g\rangle_{L^2(X,\mu)}
= \langle f,U_{T^{-1}}g\rangle_{L^2(X,\mu)}.
\end{align*}
Because this identity holds for every $f,g \in L^2(X,\mu)$ and $U_{T^{-1}}$ is a bounded linear operator, uniqueness of the adjoint gives
\begin{align*}
U_T^* = U_{T^{-1}}.
\end{align*}
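As a sanity check on the formula $U_T^* = U_{T^{-1}}$, consider the following sketch; the circle rotation $R$, the angle $\alpha$, and the characters $e_k$ are assumptions introduced for illustration. Let $X = [0,1)$ with Lebesgue measure, fix $\alpha \in \mathbb{R}$, and let $R(x) := x + \alpha \bmod 1$, an invertible measure-preserving map with $R^{-1}(y) = y - \alpha \bmod 1$. For $k \in \mathbb{Z}$ put $e_k(x) := e^{2\pi i k x}$. Then
\begin{align*}
U_R e_k &= e^{2\pi i k \alpha}\, e_k, &
U_{R^{-1}} e_\ell &= e^{-2\pi i \ell \alpha}\, e_\ell,
\end{align*}
and, by orthonormality of the $e_k$,
\begin{align*}
\langle U_R e_k, e_\ell\rangle
= e^{2\pi i k \alpha}\,\delta_{k\ell}
= \overline{e^{-2\pi i \ell \alpha}}\,\delta_{k\ell}
= \langle e_k, U_{R^{-1}} e_\ell\rangle,
\end{align*}
which is consistent with the adjoint identity on these test functions.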
[guided]
Assume $T$ is invertible, $T^{-1}: X \to X$ is measurable, and $T^{-1}$ is measure-preserving. Applying the isometry result already proved to $T^{-1}$ defines a bounded linear isometry
\begin{align*}
U_{T^{-1}}: L^2(X,\mu) &\to L^2(X,\mu) \\
[g] &\mapsto [g \circ T^{-1}].
\end{align*}
In particular, $g \circ T^{-1} \in L^2(X,\mu)$ and $\|g \circ T^{-1}\|_{L^2(X,\mu)} = \|g\|_{L^2(X,\mu)}$.
We now compute the adjoint from the defining identity for adjoints on the Hilbert space $L^2(X,\mu)$: for each fixed $g \in L^2(X,\mu)$, we seek $h \in L^2(X,\mu)$ such that
\begin{align*}
\langle U_T f, g\rangle_{L^2(X,\mu)} = \langle f, h\rangle_{L^2(X,\mu)}
\quad\text{for every } f \in L^2(X,\mu),
\end{align*}
and then $U_T^* g = h$.
To evaluate the left-hand side rigorously rather than informally substituting in an integral, we use the change-of-variables formula for pushforward measures. The strategy is to identify a single integrable function $\phi$ to which the formula will be applied. Define
\begin{align*}
\phi: X &\to \mathbb{C} \\
y &\mapsto f(y)\,\overline{g(T^{-1}(y))}.
\end{align*}
We verify $\phi \in L^1(X,\mu)$ before applying the formula. Since $g \circ T^{-1} \in L^2(X,\mu)$ (from applying the isometry step to $T^{-1}$), the [Cauchy-Schwarz Inequality](/theorems/432) in $L^2(X,\mu)$ gives
\begin{align*}
\int_X |\phi(y)| \, d\mu(y)
= \int_X |f(y)|\,|g(T^{-1}(y))| \, d\mu(y)
\leq \|f\|_{L^2(X,\mu)} \, \|g \circ T^{-1}\|_{L^2(X,\mu)}
= \|f\|_{L^2(X,\mu)} \, \|g\|_{L^2(X,\mu)}
< \infty.
\end{align*}
Therefore $\phi \in L^1(X,\mu)$, so the [Change of Variables (general)](/theorems/22) applies to $\phi$ through the pushforward identity.
By the measure-preserving hypothesis, $T_\#\mu = \mu$. Computing $\phi \circ T$ using $T^{-1} \circ T = \operatorname{id}_X$,
\begin{align*}
\phi(T(x)) = f(T(x))\,\overline{g(T^{-1}(T(x)))} = f(T(x))\,\overline{g(x)}.
\end{align*}
The change-of-variables formula then gives
\begin{align*}
\int_X f(T(x))\,\overline{g(x)} \, d\mu(x)
&= \int_X \phi(T(x)) \, d\mu(x) \\
&= \int_X \phi(y) \, d(T_\#\mu)(y) \\
&= \int_X \phi(y) \, d\mu(y) \\
&= \int_X f(y)\,\overline{g(T^{-1}(y))} \, d\mu(y).
\end{align*}
Reinterpreting the endpoints of this chain as inner products,
\begin{align*}
\langle U_T f, g\rangle_{L^2(X,\mu)}
= \int_X f(T(x))\,\overline{g(x)} \, d\mu(x)
= \int_X f(y)\,\overline{g(T^{-1}(y))} \, d\mu(y)
= \langle f, U_{T^{-1}} g\rangle_{L^2(X,\mu)}.
\end{align*}
Because this identity holds for every $f,g \in L^2(X,\mu)$ and $U_{T^{-1}}$ is a bounded linear operator, uniqueness of the adjoint gives
\begin{align*}
U_T^* = U_{T^{-1}}.
\end{align*}
[/guided]
[/step]
[step:Identify the inverse and conclude unitarity]
For every measurable representative $f: X \to \mathbb{C}$ of an element of $L^2(X,\mu)$ and every $x \in X$,
\begin{align*}
(U_{T^{-1}}U_T f)(x)
= (U_T f)(T^{-1}(x))
= f(T(T^{-1}(x)))
= f(x),
\end{align*}
and
\begin{align*}
(U_TU_{T^{-1}} f)(x)
= (U_{T^{-1}} f)(T(x))
= f(T^{-1}(T(x)))
= f(x).
\end{align*}
Thus
\begin{align*}
U_{T^{-1}}U_T = I_{L^2(X,\mu)}
\quad\text{and}\quad
U_TU_{T^{-1}} = I_{L^2(X,\mu)}.
\end{align*}
Combining with $U_T^* = U_{T^{-1}}$ from the preceding step,
\begin{align*}
U_T^*U_T = I_{L^2(X,\mu)}
\quad\text{and}\quad
U_TU_T^* = I_{L^2(X,\mu)}.
\end{align*}
By the definition of a [unitary operator](/page/Unitary%20Operator) on a Hilbert space, $U_T$ is unitary.
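The invertibility hypothesis cannot simply be dropped. As a hedged illustration, with the doubling map $D$ and the characters $e_k$ being assumptions introduced here: on $X = [0,1)$ with Lebesgue measure let $D(x) := 2x \bmod 1$ and $e_k(x) := e^{2\pi i k x}$ for $k \in \mathbb{Z}$. The map $D$ is measure-preserving but not injective. Since the $e_k$ form an orthonormal basis of $L^2([0,1))$ and $U_D$ is bounded,
\begin{align*}
U_D e_k = e_{2k} \quad\text{for every } k \in \mathbb{Z},
\qquad\text{so}\qquad
\langle U_D f, e_1\rangle = 0 \quad\text{for every } f \in L^2([0,1)).
\end{align*}
Thus $e_1$ is a non-zero vector orthogonal to the range of $U_D$, and $U_D$ is an isometry that is not surjective, hence not unitary.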
[/step]