[proofplan]
The finite-entropy generating partition allows the metric entropy to be read from the exponential decay rate of cylinder measures. The bounded cylinder-measure estimate identifies the negative logarithm of the cylinder measure, up to a uniformly bounded additive error, with $\log |(T^n)'|$ evaluated at the same point. The chain rule converts $\log |(T^n)'|$ into a Birkhoff sum of $\log |T'|$, and Birkhoff's ergodic theorem turns the corresponding time average into the space integral.
[/proofplan]
[step:Discard the endpoint orbit set where derivatives or cylinders are ambiguous]
Let $B\subset[0,1]$ denote the set of all endpoints of intervals in $\alpha$. Since $\alpha$ is finite or countable, $B$ is countable, and hence $\mathcal L^1(B)=0$. Because $\mu$ is equivalent to $\mathcal L^1$, we have $\mu(B)=0$.
Define the exceptional set
\begin{align*}
E:=\bigcup_{k=0}^{\infty}T^{-k}(B).
\end{align*}
The map $T$ is nonsingular with respect to $\mathcal L^1$, and $\mu$ is equivalent to $\mathcal L^1$; hence $T$ is nonsingular with respect to $\mu$. Therefore $\mu(T^{-k}(B))=0$ for every $k\in\mathbb N\cup\{0\}$. [Countable subadditivity](/theorems/1108) gives $\mu(E)=0$.
For every $x\in[0,1]\setminus E$ and every $n\in\mathbb N$, the points $x,T(x),\dots,T^{n-1}(x)$ do not belong to $B$. Hence, for each $0\le k\le n-1$, there is a unique interval atom $I_{i_k}\in\alpha$ such that $T^k x\in I_{i_k}^{\circ}$. It follows that the $n$-cylinder $C_n(x)$, that is, the atom of $\alpha_n:=\bigvee_{k=0}^{n-1}T^{-k}\alpha$ containing $x$, is uniquely determined. Since $T$ is $C^1$ on every branch interior and the whole finite orbit segment remains inside branch interiors, the ordinary chain rule applies along this orbit segment:
\begin{align*}
\log |(T^n)'(x)|=\sum_{k=0}^{n-1}\log |T'(T^k x)|.
\end{align*}
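As an illustrative sketch (the doubling map is an assumed example, not part of the hypotheses above), take $T(x)=2x \bmod 1$ with $\alpha=\{[0,\tfrac12],[\tfrac12,1]\}$. Then $B=\{0,\tfrac12,1\}$, the set $E$ is the countable set of dyadic rationals in $[0,1]$, and $|T'|=2$ on each branch interior, so for every $x\in[0,1]\setminus E$ and every $n\in\mathbb N$,
\begin{align*}
\log |(T^n)'(x)|=\sum_{k=0}^{n-1}\log 2=n\log 2.
\end{align*}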
[/step]
[step:Read entropy from the generating branch partition]
For each $n\in\mathbb N$, define the cylinder partition
\begin{align*}
\alpha_n:=\bigvee_{k=0}^{n-1}T^{-k}\alpha.
\end{align*}
For a countable measurable partition $\beta$ of $[0,1]$, let $H_\mu(\beta)$ denote its Shannon entropy,
\begin{align*}
H_\mu(\beta):=-\sum_{A\in\beta}\mu(A)\log \mu(A),
\end{align*}
with the convention $0\log 0:=0$. Let $h_\mu(T,\alpha)$ denote the metric entropy of $T$ relative to the partition $\alpha$, defined by
\begin{align*}
h_\mu(T,\alpha):=\lim_{n\to\infty}\frac{1}{n}H_\mu(\alpha_n),
\end{align*}
where the limit exists by subadditivity of partition entropy for a probability-preserving transformation.
Let $\mathcal B([0,1])$ denote the Borel $\sigma$-algebra on $[0,1]$. Because $\alpha$ is generating modulo $\mu$-null sets and $H_\mu(\alpha)<\infty$, the Kolmogorov-Sinai entropy formula for a one-sided probability-preserving transformation with a finite-entropy countable generator gives
\begin{align*}
h_\mu(T)=h_\mu(T,\alpha)=\lim_{n\to\infty}\frac{1}{n}H_\mu(\alpha_n).
\end{align*}
This version applies to the non-invertible transformation $T$ because $\mu$ is $T$-invariant, $\alpha$ is countable with finite entropy, and the preimage partitions $T^{-k}\alpha$, $k\ge 0$, generate $\mathcal B([0,1])$ modulo $\mu$-null sets. Moreover, by the [Shannon-McMillan-Breiman theorem](/theorems/6766) for one-sided ergodic probability-preserving transformations and finite-entropy countable partitions, applied to the system $([0,1],\mathcal B([0,1]),\mu,T)$ and the partition $\alpha$, for $\mu$-almost every $x\in[0,1]$,
\begin{align*}
\lim_{n\to\infty}-\frac{1}{n}\log \mu(C_n(x))=h_\mu(T,\alpha)=h_\mu(T).
\end{align*}
Here $C_n(x)$ denotes the unique atom of $\alpha_n$ containing $x$, for $x$ outside the endpoint orbit set $E$.
[guided]
The purpose of this step is to replace the abstract metric entropy $h_\mu(T)$ by a pointwise quantity attached to the cylinder containing a typical point. The hypotheses needed for this replacement are exactly the hypotheses imposed on the partition: $\alpha$ is generating and has finite entropy.
For each $n\in\mathbb N$, the partition
\begin{align*}
\alpha_n:=\bigvee_{k=0}^{n-1}T^{-k}\alpha
\end{align*}
is the $n$-cylinder partition. For a countable measurable partition $\beta$ of $[0,1]$, its Shannon entropy is
\begin{align*}
H_\mu(\beta):=-\sum_{A\in\beta}\mu(A)\log \mu(A),
\end{align*}
with $0\log 0:=0$. The entropy of $T$ relative to $\alpha$ is
\begin{align*}
h_\mu(T,\alpha):=\lim_{n\to\infty}\frac{1}{n}H_\mu(\alpha_n),
\end{align*}
where the limit exists by subadditivity of partition entropy for a probability-preserving transformation.
First, let $\mathcal B([0,1])$ denote the Borel $\sigma$-algebra on $[0,1]$. Since $\alpha$ is generating modulo $\mu$-null sets and $H_\mu(\alpha)<\infty$, the Kolmogorov-Sinai entropy theorem for one-sided probability-preserving transformations with generating finite-entropy countable partitions applies to $([0,1],\mathcal B([0,1]),\mu,T)$. The theorem does not require $T$ to be invertible; it requires $T$ to preserve the probability measure, the partition to have finite entropy, and the preimages $T^{-k}\alpha$, $k\ge 0$, of the partition to generate the ambient $\sigma$-algebra modulo null sets. These conditions hold because $\mu$ is $T$-invariant, $H_\mu(\alpha)<\infty$, and $\alpha$ is generating by hypothesis. Hence
\begin{align*}
h_\mu(T)=h_\mu(T,\alpha)=\lim_{n\to\infty}\frac{1}{n}H_\mu(\alpha_n).
\end{align*}
Second, the Shannon-McMillan-Breiman theorem for one-sided ergodic probability-preserving transformations and finite-entropy countable partitions applies. Its hypotheses are: the system is probability-preserving, the system is ergodic, and the partition has finite entropy. The first holds because $\mu$ is $T$-invariant, the second is an assumption on $\mu$, and the third is $H_\mu(\alpha)<\infty$. With $\alpha_n=\bigvee_{k=0}^{n-1}T^{-k}\alpha$ as the corresponding cylinder partition, it gives, for $\mu$-almost every $x$,
\begin{align*}
\lim_{n\to\infty}-\frac{1}{n}\log \mu(C_n(x))=h_\mu(T,\alpha).
\end{align*}
Because $\alpha$ is generating, $h_\mu(T,\alpha)=h_\mu(T)$. Thus the exponential rate at which the measure of the cylinder $C_n(x)$ shrinks is exactly the entropy of $T$ for almost every $x$.
[/guided]
[/step]
[step:Use the cylinder-measure estimate to replace cylinder mass by derivative growth]
Fix $x\in[0,1]\setminus E$ for which the Shannon-McMillan-Breiman conclusion holds. For each $n\in\mathbb N$, let $C_n(x)\in\alpha_n$ be the $n$-cylinder containing $x$. The bounded cylinder-measure hypothesis gives
\begin{align*}
C_0^{-1}\le \mu(C_n(x))\, |(T^n)'(x)|\le C_0.
\end{align*}
Taking logarithms gives
\begin{align*}
-\log C_0\le \log \mu(C_n(x))+\log |(T^n)'(x)|\le \log C_0.
\end{align*}
Equivalently,
\begin{align*}
\left|-\log \mu(C_n(x))-\log |(T^n)'(x)|\right|\le \log C_0.
\end{align*}
Dividing by $n$ yields
\begin{align*}
\left|-\frac{1}{n}\log \mu(C_n(x))-\frac{1}{n}\log |(T^n)'(x)|\right|\le \frac{\log C_0}{n}.
\end{align*}
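Since $\frac{\log C_0}{n}\to0$ as $n\to\infty$, the two sequences have the same limit whenever either limit exists. By the choice of $x$, the sequence $-\frac{1}{n}\log \mu(C_n(x))$ converges to $h_\mu(T)$. Therefore, for this $x$,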
\begin{align*}
h_\mu(T)=\lim_{n\to\infty}\frac{1}{n}\log |(T^n)'(x)|.
\end{align*}
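For orientation, the estimate can be checked by hand in a piecewise-linear case. The following is a sketch under assumptions not made in the text: fix $p\in(0,1)$, let $\alpha=\{[0,p],[p,1]\}$, let $T$ map each of the two intervals affinely and increasingly onto $[0,1]$, and take $\mu=\mathcal L^1$, which is $T$-invariant because the preimage of an interval of length $\ell$ has total length $p\ell+(1-p)\ell=\ell$. Write $p_1:=p$, $p_2:=1-p$, and let $i_k\in\{1,2\}$ be the index of the atom whose interior contains $T^k x$. Then $|T'|=1/p_i$ on $I_i^{\circ}$, and $C_n(x)$ is an interval of length $\prod_{k=0}^{n-1}p_{i_k}$, so
\begin{align*}
\mu(C_n(x))\,|(T^n)'(x)|=\prod_{k=0}^{n-1}p_{i_k}\cdot\prod_{k=0}^{n-1}\frac{1}{p_{i_k}}=1.
\end{align*}
In this example the bounded cylinder-measure estimate holds with $C_0=1$, and the sequences $-\frac{1}{n}\log \mu(C_n(x))$ and $\frac{1}{n}\log |(T^n)'(x)|$ coincide for every $n$.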
[/step]
[step:Convert derivative growth into a Birkhoff average]
For $x\in[0,1]\setminus E$, the chain rule from the first step gives
\begin{align*}
\frac{1}{n}\log |(T^n)'(x)|=\frac{1}{n}\sum_{k=0}^{n-1}\log |T'(T^k x)|.
\end{align*}
Define the observable $\varphi:[0,1]\to\mathbb R$ by setting $\varphi(x):=\log |T'(x)|$ for $x\in[0,1]\setminus B$ and $\varphi(x):=0$ for $x\in B$. Since $B$ is $\mu$-null and $\log |T'|\in L^1(\mu)$ by hypothesis, this extension satisfies $\varphi\in L^1([0,1],\mu)$. Since $T$ preserves $\mu$ and is ergodic, Birkhoff's ergodic theorem applies to the integrable observable $\varphi$ and gives, for $\mu$-almost every $x\in[0,1]$,
\begin{align*}
\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1}\varphi(T^k x)=\int_{[0,1]}\varphi\,d\mu.
\end{align*}
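For $x\in[0,1]\setminus E$, no iterate $T^k x$ lies in $B$, so $\varphi(T^k x)=\log |T'(T^k x)|$ for every $k\ge 0$, and the Birkhoff sum of $\varphi$ coincides with the chain-rule sum above. Since $\mu(E)=0$, it follows that for $\mu$-almost every $x\in[0,1]$,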
\begin{align*}
\lim_{n\to\infty}\frac{1}{n}\log |(T^n)'(x)|=\int_{[0,1]}\log |T'|\,d\mu.
\end{align*}
[/step]
[step:Identify the two almost sure limits]
The set of points $x\in[0,1]\setminus E$ at which both the Shannon-McMillan-Breiman limit and the Birkhoff limit hold has full $\mu$-measure, because it is the intersection of three full-measure sets. Choose any $x$ in this intersection. The cylinder-measure estimate gives
\begin{align*}
h_\mu(T)=\lim_{n\to\infty}\frac{1}{n}\log |(T^n)'(x)|.
\end{align*}
The Birkhoff average computation gives
\begin{align*}
\lim_{n\to\infty}\frac{1}{n}\log |(T^n)'(x)|=\int_{[0,1]}\log |T'|\,d\mu.
\end{align*}
Combining these two equalities yields
\begin{align*}
h_\mu(T)=\int_{[0,1]}\log |T'(x)|\,d\mu(x).
\end{align*}
This is the claimed Rokhlin entropy formula.
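As a consistency check, consider the Gauss map $T(x)=1/x-\lfloor 1/x\rfloor$ with the Gauss measure $d\mu=\frac{1}{\log 2}\,\frac{dx}{1+x}$ and the branch partition $\alpha=\{(\tfrac{1}{k+1},\tfrac{1}{k}):k\in\mathbb N\}$. That this system satisfies the hypotheses used above, in particular the bounded cylinder-measure estimate, is assumed here rather than verified. Since $|T'(x)|=x^{-2}$ on branch interiors and $\int_0^1\frac{\log x}{1+x}\,dx=-\frac{\pi^2}{12}$, the formula would give
\begin{align*}
h_\mu(T)=\frac{1}{\log 2}\int_0^1\frac{-2\log x}{1+x}\,dx=\frac{2}{\log 2}\cdot\frac{\pi^2}{12}=\frac{\pi^2}{6\log 2},
\end{align*}
which agrees with the classical value of the entropy of the Gauss map.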
[/step]