[proofplan]
Let $(X,\mathcal A,\mu,T)$ and $(Y,\mathcal B,\nu,S)$ be measure-preserving systems, and let $X_0\subset X$, $Y_0\subset Y$, and $\Phi:X_0\to Y_0$ denote the full-measure invariant sets and bimeasurable measure-preserving conjugacy supplied by the definition of measure-theoretic isomorphism. We compare Kolmogorov-Sinai entropy through $\Phi$ by pulling every finite measurable partition of $Y$ back to a finite measurable partition of $X$. The pullback preserves atom measures, and the intertwining relation $\Phi \circ T = S \circ \Phi$ identifies the iterated joins used in the definition of partition entropy. This gives $h_\nu(S) \le h_\mu(T)$; applying the same argument to $\Phi^{-1}$ gives the reverse inequality. The Bernoulli-shift obstruction follows by combining this invariance with the [entropy formula for Bernoulli shifts](/theorems/6776).
[/proofplan]
[step:Pull a finite partition of $Y$ back to a finite partition of $X$]
Let $\mathcal Q = \{Q_1,\dots,Q_m\}$ be a finite measurable partition of $Y$, where each $Q_j \in \mathcal B$. Define the pullback partition $\Phi^{-1}\mathcal Q$ of $X$ by
\begin{align*}
\Phi^{-1}\mathcal Q := \{\Phi^{-1}(Q_1 \cap Y_0),\dots,\Phi^{-1}(Q_m \cap Y_0)\}.
\end{align*}
Since $\Phi: X_0 \to Y_0$ is measurable and the sets $Q_j \cap Y_0$ partition $Y_0$, this is a finite measurable partition of $X_0$. Because $X \setminus X_0$ is $\mu$-null, it is a partition of $X$ modulo null sets. Throughout the proof, finite partitions are identified modulo null atoms; equivalently, one may add the null atom $X \setminus X_0$ to obtain an actual finite measurable partition of $X$, which does not change the Shannon entropy because a null atom contributes nothing to the entropy sum.
For each $j \in \{1,\dots,m\}$, the measure-preserving property of $\Phi$ gives
\begin{align*}
\mu(\Phi^{-1}(Q_j \cap Y_0)) = \nu(Q_j \cap Y_0) = \nu(Q_j).
\end{align*}
Thus $\mathcal Q$ and $\Phi^{-1}\mathcal Q$ have the same atom measures. Therefore their Shannon entropies agree:
\begin{align*}
H_\mu(\Phi^{-1}\mathcal Q) = H_\nu(\mathcal Q),
\end{align*}
where the partition entropy functionals
\begin{align*}
H_\mu:\{\text{finite measurable partitions of }X\}\to[0,\infty)
\end{align*}
and
\begin{align*}
H_\nu:\{\text{finite measurable partitions of }Y\}\to[0,\infty)
\end{align*}
are defined by
\begin{align*}
H_\mu(\mathcal P) := -\sum_{A \in \mathcal P} \mu(A)\log \mu(A)
\end{align*}
for any finite measurable partition $\mathcal P$ of $X$, and by the analogous formula
\begin{align*}
H_\nu(\mathcal R) := -\sum_{B \in \mathcal R} \nu(B)\log \nu(B)
\end{align*}
for any finite measurable partition $\mathcal R$ of $Y$, with the convention $0\log 0 := 0$.
[guided]
Start with an arbitrary finite measurable partition $\mathcal Q = \{Q_1,\dots,Q_m\}$ of $Y$. The reason to begin with an arbitrary partition is that Kolmogorov-Sinai entropy is defined by taking the supremum over all such partitions. We define its pullback along the isomorphism by
\begin{align*}
\Phi^{-1}\mathcal Q := \{\Phi^{-1}(Q_1 \cap Y_0),\dots,\Phi^{-1}(Q_m \cap Y_0)\}.
\end{align*}
The intersections with $Y_0$ are included because $\Phi$ is defined on the full-measure model $X_0 \to Y_0$. Since $Y \setminus Y_0$ has $\nu$-measure zero, replacing $Q_j$ by $Q_j \cap Y_0$ does not change any entropy computation.
Each atom $\Phi^{-1}(Q_j \cap Y_0)$ is measurable because $\Phi$ is measurable. The atoms cover $X_0$ and are pairwise disjoint; since $X \setminus X_0$ is $\mu$-null, they form a partition of $X$ up to null sets. Equality up to null sets is exactly the [equivalence relation](/page/Equivalence%20Relation) used in measure-theoretic entropy.
Now compare atom measures. Since $\Phi_*\mu = \nu$ on $Y_0$, for each $j \in \{1,\dots,m\}$ we have
\begin{align*}
\mu(\Phi^{-1}(Q_j \cap Y_0)) = \nu(Q_j \cap Y_0).
\end{align*}
Because $\nu(Y \setminus Y_0) = 0$, this equals $\nu(Q_j)$. Hence the list of atom measures of $\Phi^{-1}\mathcal Q$ is exactly the same as the list of atom measures of $\mathcal Q$.
The Shannon entropy of a finite measurable partition depends only on the measures of its atoms. The entropy functionals are
\begin{align*}
H_\mu:\{\text{finite measurable partitions of }X\}\to[0,\infty)
\end{align*}
and
\begin{align*}
H_\nu:\{\text{finite measurable partitions of }Y\}\to[0,\infty),
\end{align*}
defined by
\begin{align*}
H_\mu(\mathcal P) := -\sum_{A \in \mathcal P} \mu(A)\log \mu(A),
\end{align*}
and by the analogous formula
\begin{align*}
H_\nu(\mathcal R) := -\sum_{B \in \mathcal R} \nu(B)\log \nu(B),
\end{align*}
using the convention $0\log 0 := 0$. The equality of atom measures gives
\begin{align*}
H_\mu(\Phi^{-1}\mathcal Q) = H_\nu(\mathcal Q).
\end{align*}
This is the static entropy comparison; the next step checks that it remains true after all dynamical refinements by $T$ and $S$.
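For a concrete instance of the static comparison, consider a hypothetical three-atom partition, chosen only for illustration and not taken from the argument above: suppose $\nu(Q_1)=\tfrac12$ and $\nu(Q_2)=\nu(Q_3)=\tfrac14$. The pullback atoms then have $\mu$-measures $\tfrac12,\tfrac14,\tfrac14$ as well, and both entropies are computed from the same list of numbers:
\begin{align*}
H_\mu(\Phi^{-1}\mathcal Q)
&= -\tfrac12\log\tfrac12 - \tfrac14\log\tfrac14 - \tfrac14\log\tfrac14 \\
&= \tfrac12\log 2 + \tfrac12\log 2 + \tfrac12\log 2 \\
&= \tfrac32\log 2 = H_\nu(\mathcal Q).
\end{align*}
Nothing about the sets $Q_j$ themselves enters the computation, only their measures.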
[/guided]
[/step]
[step:Identify the iterated joins under the conjugacy]
For each integer $n \ge 1$, define the $n$-fold dynamical refinements
\begin{align*}
\mathcal Q_n := \bigvee_{k=0}^{n-1} S^{-k}\mathcal Q
\end{align*}
and
\begin{align*}
\mathcal P_n := \bigvee_{k=0}^{n-1} T^{-k}(\Phi^{-1}\mathcal Q).
\end{align*}
We claim that $\mathcal P_n$ equals $\Phi^{-1}\mathcal Q_n$ up to $\mu$-null sets.
An atom of $\mathcal Q_n$ has the form
\begin{align*}
B = \bigcap_{k=0}^{n-1} S^{-k}Q_{i_k}
\end{align*}
for a choice of indices $i_0,\dots,i_{n-1} \in \{1,\dots,m\}$. Its pullback under $\Phi$ is
\begin{align*}
\Phi^{-1}(B \cap Y_0) = \bigcap_{k=0}^{n-1} \Phi^{-1}(S^{-k}Q_{i_k} \cap Y_0).
\end{align*}
Iterating the intertwining relation $\Phi \circ T = S \circ \Phi$ shows that, for each $k \in \{0,\dots,n-1\}$, the identity $\Phi \circ T^k = S^k \circ \Phi$ holds $\mu$-almost everywhere on $X_0$. Consequently
\begin{align*}
\Phi^{-1}(S^{-k}Q_{i_k} \cap Y_0) = T^{-k}\Phi^{-1}(Q_{i_k} \cap Y_0)
\end{align*}
up to $\mu$-null sets. Therefore
\begin{align*}
\Phi^{-1}(B \cap Y_0) = \bigcap_{k=0}^{n-1} T^{-k}\Phi^{-1}(Q_{i_k} \cap Y_0)
\end{align*}
up to $\mu$-null sets, and the right-hand side is an atom of $\mathcal P_n$. Since every atom of $\mathcal P_n$ arises from such a choice of indices, this proves $\mathcal P_n = \Phi^{-1}\mathcal Q_n$ modulo $\mu$-null sets. Applying the atom-measure comparison from the previous step to $\mathcal Q_n$ gives
\begin{align*}
H_\mu\left(\bigvee_{k=0}^{n-1} T^{-k}(\Phi^{-1}\mathcal Q)\right)
=
H_\nu\left(\bigvee_{k=0}^{n-1} S^{-k}\mathcal Q\right).
\end{align*}
[guided]
Fix an integer $n \ge 1$. We must compare the finite-time refinements before passing to entropy rates. Define
\begin{align*}
\mathcal Q_n := \bigvee_{k=0}^{n-1} S^{-k}\mathcal Q
\end{align*}
and
\begin{align*}
\mathcal P_n := \bigvee_{k=0}^{n-1} T^{-k}(\Phi^{-1}\mathcal Q).
\end{align*}
The goal is to prove that $\mathcal P_n$ and $\Phi^{-1}\mathcal Q_n$ have the same atoms modulo $\mu$-null sets.
An atom of $\mathcal Q_n$ is obtained by choosing indices $i_0,\dots,i_{n-1}\in\{1,\dots,m\}$ and taking
\begin{align*}
B = \bigcap_{k=0}^{n-1} S^{-k}Q_{i_k}.
\end{align*}
Pulling this atom back through $\Phi:X_0\to Y_0$ gives
\begin{align*}
\Phi^{-1}(B\cap Y_0)=\bigcap_{k=0}^{n-1}\Phi^{-1}(S^{-k}Q_{i_k}\cap Y_0).
\end{align*}
Iterating the conjugacy relation $\Phi\circ T=S\circ\Phi$ gives $\Phi\circ T^k=S^k\circ\Phi$ at $\mu$-almost every point of $X_0$, for every $k\in\{0,\dots,n-1\}$; the exceptional sets remain $\mu$-null because $T$ preserves $\mu$. Therefore membership in $\Phi^{-1}(S^{-k}Q_{i_k}\cap Y_0)$ is equivalent, outside a $\mu$-null set, to membership in $T^{-k}\Phi^{-1}(Q_{i_k}\cap Y_0)$. Hence
\begin{align*}
\Phi^{-1}(B\cap Y_0)=\bigcap_{k=0}^{n-1}T^{-k}\Phi^{-1}(Q_{i_k}\cap Y_0)
\end{align*}
modulo $\mu$-null sets.
The right-hand side is exactly an atom of $\mathcal P_n$. Conversely, every atom of $\mathcal P_n$ arises from such a choice of indices and therefore from the corresponding atom $B$ of $\mathcal Q_n$. Thus $\mathcal P_n=\Phi^{-1}\mathcal Q_n$ modulo $\mu$-null sets. Since the previous guided step proved that pullback by $\Phi$ preserves the measures of atoms of every finite partition, applying that result to $\mathcal Q_n$ gives
\begin{align*}
H_\mu\left(\bigvee_{k=0}^{n-1} T^{-k}(\Phi^{-1}\mathcal Q)\right)=H_\nu\left(\bigvee_{k=0}^{n-1} S^{-k}\mathcal Q\right).
\end{align*}
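The smallest nontrivial case $n=2$ can be written out pointwise as a sketch of the argument. Assume, as in the proof plan, that $X_0$ and $Y_0$ are invariant, and let $x\in X_0$ lie outside the $\mu$-null set on which $\Phi\circ T=S\circ\Phi$ fails. Then, for indices $i_0,i_1\in\{1,\dots,m\}$,
\begin{align*}
x\in\Phi^{-1}(Q_{i_0}\cap S^{-1}Q_{i_1}\cap Y_0)
&\iff \Phi(x)\in Q_{i_0}\ \text{and}\ S(\Phi(x))\in Q_{i_1} \\
&\iff \Phi(x)\in Q_{i_0}\ \text{and}\ \Phi(T(x))\in Q_{i_1} \\
&\iff x\in\Phi^{-1}(Q_{i_0}\cap Y_0)\cap T^{-1}\Phi^{-1}(Q_{i_1}\cap Y_0).
\end{align*}
The middle equivalence is the only place where the intertwining relation is used; the general case repeats it for each $k\in\{0,\dots,n-1\}$.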
[/guided]
[/step]
[step:Pass from finite-time partition entropies to entropy rates]
For the finite partition $\Phi^{-1}\mathcal Q$, define
\begin{align*}
a_n := H_\mu\left(\bigvee_{k=0}^{n-1} T^{-k}(\Phi^{-1}\mathcal Q)\right)
\end{align*}
for $n\geq 1$. Because $\Phi^{-1}\mathcal Q$ is finite, each refinement $\bigvee_{k=0}^{n-1}T^{-k}(\Phi^{-1}\mathcal Q)$ is finite, so $a_n<\infty$ for every $n\geq 1$. The standard subadditivity of finite partition entropy under joins gives $a_{n+r}\leq a_n+a_r$ for all $n,r\geq 1$; measure preservation of $T$ is the hypothesis used to identify the entropy of the shifted length-$r$ block with that of the unshifted one. By Fekete's lemma for finite-valued subadditive sequences, the limit of $a_n/n$ exists and equals $\inf_{n\geq 1}a_n/n$. We may therefore define the entropy rate
\begin{align*}
h_\mu(T,\Phi^{-1}\mathcal Q) := \lim_{n \to \infty}\frac{a_n}{n}.
\end{align*}
Similarly, with
\begin{align*}
b_n := H_\nu\left(\bigvee_{k=0}^{n-1} S^{-k}\mathcal Q\right),
\end{align*}
subadditivity and Fekete's lemma justify the definition
\begin{align*}
h_\nu(S,\mathcal Q) := \lim_{n \to \infty}\frac{b_n}{n}.
\end{align*}
The previous step proves $a_n=b_n$ for every $n\geq 1$, so the normalized limits are equal:
\begin{align*}
h_\mu(T,\Phi^{-1}\mathcal Q) = h_\nu(S,\mathcal Q).
\end{align*}
Since the Kolmogorov-Sinai entropy $h_\mu(T)$ is the supremum of $h_\mu(T,\mathcal P)$ over all finite measurable partitions $\mathcal P$ of $X$, and $\Phi^{-1}\mathcal Q$ is one such partition, we obtain
\begin{align*}
h_\nu(S,\mathcal Q) = h_\mu(T,\Phi^{-1}\mathcal Q) \le h_\mu(T).
\end{align*}
Taking the supremum over all finite measurable partitions $\mathcal Q$ of $Y$ gives
\begin{align*}
h_\nu(S) \le h_\mu(T).
\end{align*}
[guided]
We now pass from equality of finite-time entropies to equality of entropy rates. For the pulled-back partition $\Phi^{-1}\mathcal Q$, define
\begin{align*}
a_n := H_\mu\left(\bigvee_{k=0}^{n-1} T^{-k}(\Phi^{-1}\mathcal Q)\right)
\end{align*}
for each $n\geq 1$. The sequence $(a_n)$ is subadditive: the join for a block of length $n+r$ splits into the first $n$ iterates and the next $r$ iterates, entropy of a join is at most the sum of the entropies, and measure preservation of $T$ identifies the entropy of the shifted $r$-block with the entropy of the original $r$-block. Hence $a_{n+r}\leq a_n+a_r$.
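Written out as a chain of inequalities, with $\mathcal P_n := \bigvee_{k=0}^{n-1}T^{-k}(\Phi^{-1}\mathcal Q)$ as in the previous step so that $a_n = H_\mu(\mathcal P_n)$, the estimate reads
\begin{align*}
a_{n+r}
&= H_\mu\left(\mathcal P_n \vee T^{-n}\mathcal P_r\right) \\
&\le H_\mu(\mathcal P_n) + H_\mu\left(T^{-n}\mathcal P_r\right) \\
&= a_n + a_r.
\end{align*}
The first line splits the join of length $n+r$ into its first $n$ and last $r$ terms, the second is subadditivity of $H_\mu$ under joins, and the third uses $\mu(T^{-n}A)=\mu(A)$ for every atom $A$ of $\mathcal P_r$.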
Because $\Phi^{-1}\mathcal Q$ is a finite partition, each finite join $\bigvee_{k=0}^{n-1}T^{-k}(\Phi^{-1}\mathcal Q)$ is also finite, and therefore $a_n<\infty$ for every $n\geq 1$. The finite-valued subadditive-sequence lemma, also called Fekete's lemma, applies to $(a_n)$, so the limit $\lim_{n\to\infty}a_n/n$ exists and equals $\inf_{n\geq 1}a_n/n$. This justifies the definition
\begin{align*}
h_\mu(T,\Phi^{-1}\mathcal Q) := \lim_{n \to \infty}\frac{a_n}{n}.
\end{align*}
The same argument for $S$ and $\mathcal Q$, with
\begin{align*}
b_n := H_\nu\left(\bigvee_{k=0}^{n-1} S^{-k}\mathcal Q\right),
\end{align*}
gives the entropy rate
\begin{align*}
h_\nu(S,\mathcal Q) := \lim_{n \to \infty}\frac{b_n}{n}.
\end{align*}
The conjugacy comparison proved above says $a_n=b_n$ for every $n\geq 1$. Dividing by $n$ and taking limits gives
\begin{align*}
h_\mu(T,\Phi^{-1}\mathcal Q) = h_\nu(S,\mathcal Q).
\end{align*}
Finally, $h_\mu(T)$ is defined as the supremum of $h_\mu(T,\mathcal P)$ over all finite measurable partitions $\mathcal P$ of $X$. Since $\Phi^{-1}\mathcal Q$ is one admissible finite partition, we have
\begin{align*}
h_\nu(S,\mathcal Q)=h_\mu(T,\Phi^{-1}\mathcal Q)\leq h_\mu(T).
\end{align*}
Taking the supremum over all finite measurable partitions $\mathcal Q$ of $Y$ gives
\begin{align*}
h_\nu(S)\leq h_\mu(T).
\end{align*}
[/guided]
[/step]
[step:Apply the same argument to the inverse isomorphism]
By the bimeasurability and inverse measure-preservation included in the definition of measure-theoretic isomorphism, the inverse map
\begin{align*}
\Phi^{-1}: Y_0 \to X_0
\end{align*}
is measurable and measure-preserving. Hence it is a measure-preserving isomorphism from $(Y_0,\mathcal B|_{Y_0},\nu,S)$ to $(X_0,\mathcal A|_{X_0},\mu,T)$, and the intertwining relation for $\Phi$ implies
\begin{align*}
\Phi^{-1} \circ S = T \circ \Phi^{-1}
\end{align*}
$\nu$-almost everywhere on $Y_0$. Repeating the preceding argument with the roles of $(X,\mathcal A,\mu,T)$ and $(Y,\mathcal B,\nu,S)$ interchanged yields
\begin{align*}
h_\mu(T) \le h_\nu(S).
\end{align*}
Combining this inequality with
\begin{align*}
h_\nu(S) \le h_\mu(T)
\end{align*}
gives
\begin{align*}
h_\mu(T) = h_\nu(S).
\end{align*}
[guided]
The previous inequality was asymmetric because it started with a partition of $Y$ and pulled it back to $X$. An isomorphism is symmetric: by definition, the inverse map $\Phi^{-1}:Y_0\to X_0$ is measurable and measure-preserving, and the original intertwining relation implies $\Phi^{-1}\circ S=T\circ\Phi^{-1}$ for $\nu$-almost every point of $Y_0$. Therefore the same pullback argument can be repeated with $X$ and $Y$ interchanged.
That repetition gives
\begin{align*}
h_\mu(T)\leq h_\nu(S).
\end{align*}
Together with the inequality already proved,
\begin{align*}
h_\nu(S)\leq h_\mu(T),
\end{align*}
we conclude
\begin{align*}
h_\mu(T)=h_\nu(S).
\end{align*}
This proves that Kolmogorov-Sinai entropy is invariant under measure-theoretic isomorphism.
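For completeness, here is a sketch of how the inverse intertwining relation used in this step follows from the original one, assuming as in the proof plan that $X_0$ and $Y_0$ are invariant. Let $N\subset X_0$ be the $\mu$-null set where $\Phi\circ T=S\circ\Phi$ fails. For $y\in Y_0\setminus\Phi(N)$, write $x=\Phi^{-1}(y)$; then
\begin{align*}
\Phi(T(x)) = S(\Phi(x)) = S(y)
\quad\Longrightarrow\quad
T(\Phi^{-1}(y)) = T(x) = \Phi^{-1}(S(y)).
\end{align*}
The exceptional set $\Phi(N)$ is measurable by bimeasurability and satisfies $\nu(\Phi(N))=\mu(N)=0$ because $\Phi^{-1}$ is measure-preserving, so the relation holds $\nu$-almost everywhere on $Y_0$.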
[/guided]
[/step]
[step:Use the Bernoulli entropy formula to separate shifts with different base entropies]
Let $\sigma_p$ and $\sigma_q$ denote the Bernoulli shifts with base probability vectors $p = (p_i)_{i \in I}$ and $q = (q_j)_{j \in J}$, respectively. Let $\mu_p$ and $\mu_q$ denote the corresponding product probability measures on the two shift spaces. We write their Kolmogorov-Sinai entropies with respect to these product measures as $h_{\mu_p}(\sigma_p)$ and $h_{\mu_q}(\sigma_q)$. Define the base Shannon entropies by
\begin{align*}
H(p) := -\sum_{i\in I}p_i\log p_i
\end{align*}
and
\begin{align*}
H(q) := -\sum_{j\in J}q_j\log q_j,
\end{align*}
with the convention $0\log 0:=0$. By the entropy formula for Bernoulli shifts, applied to the product measures $\mu_p$ and $\mu_q$, we have
\begin{align*}
h_{\mu_p}(\sigma_p) = H(p)
\end{align*}
and
\begin{align*}
h_{\mu_q}(\sigma_q) = H(q).
\end{align*}
If $H(p) \ne H(q)$ and the two Bernoulli shifts were measure-theoretically isomorphic, the isomorphism invariance just proved would imply
\begin{align*}
H(p) = h_{\mu_p}(\sigma_p) = h_{\mu_q}(\sigma_q) = H(q),
\end{align*}
contradicting $H(p) \ne H(q)$. Therefore Bernoulli shifts with different base entropies are not measure-theoretically isomorphic.
[guided]
The invariant just proved becomes an obstruction once we know an explicit entropy formula for Bernoulli shifts. Let $\sigma_p$ be the Bernoulli shift with base probability vector $p=(p_i)_{i\in I}$ and product measure $\mu_p$, and let $\sigma_q$ be the Bernoulli shift with base probability vector $q=(q_j)_{j\in J}$ and product measure $\mu_q$. Their Kolmogorov-Sinai entropies are denoted by $h_{\mu_p}(\sigma_p)$ and $h_{\mu_q}(\sigma_q)$.
The base entropies are
\begin{align*}
H(p):=-\sum_{i\in I}p_i\log p_i
\end{align*}
and
\begin{align*}
H(q):=-\sum_{j\in J}q_j\log q_j,
\end{align*}
where $0\log 0:=0$.
The entropy formula for Bernoulli shifts says precisely that the Kolmogorov-Sinai entropy of a Bernoulli shift equals the Shannon entropy of its one-coordinate base law. Applying it to $\sigma_p$ and $\sigma_q$ gives
\begin{align*}
h_{\mu_p}(\sigma_p)=H(p)
\end{align*}
and
\begin{align*}
h_{\mu_q}(\sigma_q)=H(q).
\end{align*}
If the two Bernoulli shifts were measure-theoretically isomorphic, the invariance theorem proved above would force $h_{\mu_p}(\sigma_p)=h_{\mu_q}(\sigma_q)$. Substituting the Bernoulli entropy formula yields
\begin{align*}
H(p)=h_{\mu_p}(\sigma_p)=h_{\mu_q}(\sigma_q)=H(q),
\end{align*}
which contradicts the assumption $H(p)\ne H(q)$. Hence Bernoulli shifts with different base entropies are not measure-theoretically isomorphic.
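As an illustration, with base vectors chosen here only as an example, take $p=(\tfrac12,\tfrac12)$ and $q=(\tfrac13,\tfrac13,\tfrac13)$. Then
\begin{align*}
H(p) &= -2\cdot\tfrac12\log\tfrac12 = \log 2, \\
H(q) &= -3\cdot\tfrac13\log\tfrac13 = \log 3.
\end{align*}
Since $\log 2\ne\log 3$, the argument above shows that the Bernoulli shift on two equally likely symbols is not measure-theoretically isomorphic to the Bernoulli shift on three equally likely symbols.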
[/guided]
[/step]