Variational Principle for Topological Entropy (Theorem # 6728)
Theorem
Let $X$ be a compact metrizable space, let $T:X\to X$ be continuous, and let $\mathcal M_T(X)$ denote the set of $T$-invariant Borel probability measures on $X$. Then
\begin{align*}
h_{\mathrm{top}}(T)=\sup_{\mu\in\mathcal M_T(X)} h_\mu(T).
\end{align*}
Discussion
This page states and proves the variational principle for topological entropy: the topological entropy of a continuous map on a compact metrizable space equals the supremum of the measure-theoretic entropies of its invariant Borel probability measures.
Proof
[proofplan]
We prove the two inequalities separately. The upper bound compares the information in a finite measurable partition with the number of orbit names allowed by a finite open cover, using regularity of Borel probability measures on compact metric spaces. The lower bound constructs invariant measures from maximal separated orbit sets: uniform measures on separated sets are averaged along orbits, weak* limits are invariant, and a small-diameter partition with null boundary converts separated orbit names into partition entropy. Letting the time length and then the spatial scale vary gives equality.
[/proofplan]
[step:Define the entropy objects used in the proof]
Fix a compatible metric $d:X\times X\to[0,\infty)$ on the compact [metrizable space](/page/Metrizable%20Space) $X$. For $n\in\mathbb N$, define the Bowen metric $d_n:X\times X\to[0,\infty)$ by
\begin{align*}
d_n(x,y):=\max_{0\le k\le n-1}d(T^k x,T^k y).
\end{align*}
For $\varepsilon>0$, let $s_n(\varepsilon)$ denote the maximal cardinality of an $(n,\varepsilon)$-separated set in $X$, meaning a set $E\subset X$ such that $d_n(x,y)>\varepsilon$ whenever $x,y\in E$ and $x\ne y$. Define
\begin{align*}
h_{\mathrm{sep}}(T,\varepsilon):=\limsup_{n\to\infty}\frac{1}{n}\log s_n(\varepsilon).
\end{align*}
By the separated-set characterization of topological entropy, applied to the compatible metric $d$ on the [compact space](/page/Compact%20Space) $X$,
\begin{align*}
h_{\mathrm{top}}(T)=\lim_{\varepsilon\downarrow 0}h_{\mathrm{sep}}(T,\varepsilon).
\end{align*}
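As an illustration (not part of the proof), these quantities can be evaluated for the one-sided full $2$-shift. The space $X=\{0,1\}^{\mathbb N_0}$, the shift map $\sigma$, and the metric $d(x,y)=2^{-\min\{i\ge0:\,x_i\ne y_i\}}$ (with $d(x,x)=0$) are assumptions chosen only for this example. For $k\ge1$, two points satisfy $d_n(x,y)>2^{-k}$ exactly when they differ somewhere in the coordinates $0,\dots,n+k-2$, so a maximal $(n,2^{-k})$-separated set contains one point per word of length $n+k-1$:
\begin{align*}
s_n(2^{-k})&=2^{\,n+k-1},\\
h_{\mathrm{sep}}(\sigma,2^{-k})&=\limsup_{n\to\infty}\frac{(n+k-1)\log 2}{n}=\log 2,\\
h_{\mathrm{top}}(\sigma)&=\lim_{k\to\infty}h_{\mathrm{sep}}(\sigma,2^{-k})=\log 2.
\end{align*}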
If $\mathcal U$ is a finite open cover of $X$, define
\begin{align*}
\mathcal U_0^{n-1}:=\bigvee_{k=0}^{n-1}T^{-k}\mathcal U,
\end{align*}
where the join of covers consists of all sets $U_0\cap T^{-1}U_1\cap\cdots\cap T^{-(n-1)}U_{n-1}$ with $U_k\in\mathcal U$. Let $N(\mathcal U_0^{n-1})$ denote the least cardinality of a subcover of $\mathcal U_0^{n-1}$.
If $\mu\in\mathcal M_T(X)$ and $\mathcal P=\{P_1,\dots,P_m\}$ is a finite Borel partition of $X$, define
\begin{align*}
H_\mu(\mathcal P):=-\sum_{i=1}^m\mu(P_i)\log\mu(P_i),
\end{align*}
with the convention $0\log 0=0$. Define the joined partition
\begin{align*}
\mathcal P_0^{n-1}:=\bigvee_{k=0}^{n-1}T^{-k}\mathcal P.
\end{align*}
The entropy of $T$ relative to $\mathcal P$ is
\begin{align*}
h_\mu(T,\mathcal P):=\lim_{n\to\infty}\frac{1}{n}H_\mu(\mathcal P_0^{n-1}),
\end{align*}
where the limit exists by subadditivity. Finally,
\begin{align*}
h_\mu(T):=\sup_{\mathcal P}h_\mu(T,\mathcal P),
\end{align*}
where the supremum is over all finite Borel partitions of $X$.
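A standard example, assumed here only for illustration, shows how these definitions are evaluated. On the one-sided full $2$-shift $\sigma$ on $\{0,1\}^{\mathbb N_0}$, let $\mu$ be the Bernoulli product measure with weights $(\tfrac12,\tfrac12)$ and let $\mathcal P$ be the two-atom partition by the zeroth coordinate. Each atom of $\mathcal P_0^{n-1}$ is a cylinder of length $n$ with measure $2^{-n}$, so
\begin{align*}
H_\mu(\mathcal P_0^{n-1})&=-\sum_{w\in\{0,1\}^n}2^{-n}\log 2^{-n}=n\log 2,\\
h_\mu(\sigma,\mathcal P)&=\lim_{n\to\infty}\frac{1}{n}\,n\log 2=\log 2,\qquad\text{hence}\qquad h_\mu(\sigma)\ge\log 2.
\end{align*}
Since the full $2$-shift has topological entropy $\log 2$, this is an instance in which the supremum in the theorem is attained.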
[/step]
[step:Bound every measure entropy by topological entropy]
Fix $\mu\in\mathcal M_T(X)$ and a finite Borel partition $\mathcal P=\{P_1,\dots,P_m\}$ of $X$. We prove
\begin{align*}
h_\mu(T,\mathcal P)\le h_{\mathrm{top}}(T).
\end{align*}
Let $0<\alpha\le\tfrac12$; this restriction is harmless because $\alpha$ will be sent to $0$. By regularity of the Borel probability measure $\mu$ on the compact [metric space](/page/Metric%20Space) $X$, for each atom $P_i$ choose a compact set $K_i\subset P_i$ such that
\begin{align*}
\mu(P_i\setminus K_i)<\frac{\alpha}{m}.
\end{align*}
Set $K:=K_1\cup\dots\cup K_m$. Since the finitely many compact sets $K_i$ are pairwise disjoint, choose open sets $U_i\subset X$ with $K_i\subset U_i$ such that $U_i\cap U_l=\varnothing$ whenever $i\ne l$. Define the finite open cover
\begin{align*}
\mathcal U:=\{U_1,\dots,U_m,X\setminus K\}.
\end{align*}
This cover is fixed after $\alpha$ and $\mathcal P$ have been chosen.
Let $\mathcal R=\{R_0,R_1,\dots,R_m\}$ be the finite Borel partition defined by $R_i=K_i$ for $1\le i\le m$ and $R_0=X\setminus K$. For each $x\in K_i$, membership in $\mathcal R$ determines membership in $\mathcal P$. Thus the only possible disagreement between the one-step information in $\mathcal P$ and $\mathcal R$ occurs on $R_0=X\setminus K$, whose measure satisfies
\begin{align*}
\mu(R_0)\le\sum_{i=1}^m\mu(P_i\setminus K_i)<\alpha.
\end{align*}
Let $H_2:[0,1]\to[0,\log 2]$ denote the binary entropy function $H_2(t)=-t\log t-(1-t)\log(1-t)$, with the convention $0\log 0=0$.
[claim:Control conditional entropy by an exceptional set]
If a finite partition $\mathcal A$ with at most $M$ atoms is determined by a finite partition $\mathcal B$ outside a Borel set $E\subset X$ with $\mu(E)\le\alpha\le\tfrac12$, then
\begin{align*}
H_\mu(\mathcal A\mid\mathcal B)\le H_2(\alpha)+\alpha\log M.
\end{align*}
[/claim]
[proof]
Let $I_E=\{E,X\setminus E\}$ be the two-atom partition generated by $E$. The chain rule for finite conditional entropy gives
\begin{align*}
H_\mu(\mathcal A\mid\mathcal B)\le H_\mu(I_E\mid\mathcal B)+H_\mu(\mathcal A\mid\mathcal B\vee I_E).
\end{align*}
Since conditioning reduces finite entropy, $H_\mu(I_E\mid\mathcal B)\le H_\mu(I_E)=H_2(\mu(E))\le H_2(\alpha)$. The last inequality uses that $H_2$ is increasing on $[0,\tfrac12]$, so it requires $\alpha\le\tfrac12$, which is harmless because $\alpha$ is later sent to $0$. On $X\setminus E$, the atom of $\mathcal A$ is determined by the atom of $\mathcal B$, so the remaining conditional entropy is supported on $E$ and is at most $\mu(E)\log M\le\alpha\log M$. Combining these two estimates proves the claim.
[/proof]
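The estimate of the second term can be unpacked as follows. This is a sketch in which $B$ ranges over the atoms of $\mathcal B$, $\mu_C$ denotes the normalized restriction of $\mu$ to a set $C$ of positive measure (notation introduced only here), and terms with $\mu(C)=0$ are omitted:
\begin{align*}
H_\mu(\mathcal A\mid\mathcal B\vee I_E)
&=\sum_{B}\mu(B\setminus E)\,H_{\mu_{B\setminus E}}(\mathcal A)+\sum_{B}\mu(B\cap E)\,H_{\mu_{B\cap E}}(\mathcal A)\\
&\le\sum_{B}\mu(B\cap E)\log M=\mu(E)\log M.
\end{align*}
The first sum vanishes because each set $B\setminus E$ lies in a single atom of $\mathcal A$, and the second sum uses that a partition with at most $M$ atoms has entropy at most $\log M$.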
Applying the claim to $\mathcal A=\mathcal P$, $\mathcal B=\mathcal R$, $E=R_0$, and $M=m$ gives
\begin{align*}
H_\mu(\mathcal P\mid\mathcal R)\le H_2(\alpha)+\alpha\log m.
\end{align*}
By the chain rule for finite partition entropy and $T$-invariance of $\mu$,
\begin{align*}
H_\mu(\mathcal P_0^{n-1})\le H_\mu(\mathcal R_0^{n-1})+\sum_{k=0}^{n-1}H_\mu(T^{-k}\mathcal P\mid T^{-k}\mathcal R)
\end{align*}
and therefore
\begin{align*}
H_\mu(\mathcal P_0^{n-1})\le H_\mu(\mathcal R_0^{n-1})+n\bigl(H_2(\alpha)+\alpha\log m\bigr).
\end{align*}
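For reference, the chain-rule inequality can be expanded as in the following sketch. It uses monotonicity of entropy under refinement, the identity $H_\mu(\mathcal C\vee\mathcal D)=H_\mu(\mathcal D)+H_\mu(\mathcal C\mid\mathcal D)$ for generic finite partitions $\mathcal C,\mathcal D$, subadditivity of conditional entropy, and the fact that conditioning on a finer partition does not increase entropy:
\begin{align*}
H_\mu(\mathcal P_0^{n-1})
&\le H_\mu(\mathcal P_0^{n-1}\vee\mathcal R_0^{n-1})
=H_\mu(\mathcal R_0^{n-1})+H_\mu(\mathcal P_0^{n-1}\mid\mathcal R_0^{n-1})\\
&\le H_\mu(\mathcal R_0^{n-1})+\sum_{k=0}^{n-1}H_\mu(T^{-k}\mathcal P\mid\mathcal R_0^{n-1})\\
&\le H_\mu(\mathcal R_0^{n-1})+\sum_{k=0}^{n-1}H_\mu(T^{-k}\mathcal P\mid T^{-k}\mathcal R).
\end{align*}
Invariance of $\mu$ then gives $H_\mu(T^{-k}\mathcal P\mid T^{-k}\mathcal R)=H_\mu(\mathcal P\mid\mathcal R)$ for each $k$.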
[claim:Code the refined partition by the open-cover name and exceptional times]
For every $n\in\mathbb N$,
\begin{align*}
H_\mu(\mathcal R_0^{n-1})\le \log N(\mathcal U_0^{n-1})+nH_2(\alpha).
\end{align*}
[/claim]
[proof]
Let $\mathcal V_n\subset\mathcal U_0^{n-1}$ be a subcover with $|\mathcal V_n|=N(\mathcal U_0^{n-1})$, and choose a measurable tie-breaking map $C_n:X\to\mathcal V_n$ such that $x\in C_n(x)$ for every $x\in X$. For $0\le k\le n-1$, define the Borel set $E_k:=T^{-k}R_0$ and the two-atom partition $I_k:=\{E_k,X\setminus E_k\}$. Since $\mu$ is $T$-invariant, $\mu(E_k)=\mu(R_0)<\alpha$, and hence $H_\mu(I_k)\le H_2(\alpha)$.
We claim that the partition $\mathcal R_0^{n-1}$ is determined by the finite data consisting of $C_n$ and the exceptional indicators $I_0,\dots,I_{n-1}$. Indeed, write a selected cover element as
\begin{align*}
V=V_0\cap T^{-1}V_1\cap\cdots\cap T^{-(n-1)}V_{n-1},
\end{align*}
with each $V_k\in\mathcal U$. If $T^k x\in R_0$, the $k$th symbol of the $\mathcal R$-name is $0$. If $T^k x\notin R_0$, then $T^k x\in K_i$ for a unique $i\ge1$. Because the open sets $U_1,\dots,U_m$ are pairwise disjoint and $K_i\subset U_i$, membership of $T^k x$ in $V_k$ determines this unique index $i$. Thus no additional information is needed.
Therefore the entropy of $\mathcal R_0^{n-1}$ is bounded by the entropy of $C_n$ plus the entropy of the joined indicator partition $I_0\vee\cdots\vee I_{n-1}$. The [random variable](/page/Random%20Variable) $C_n$ takes at most $N(\mathcal U_0^{n-1})$ values, so its entropy is at most $\log N(\mathcal U_0^{n-1})$. Subadditivity of finite entropy gives
\begin{align*}
H_\mu(I_0\vee\cdots\vee I_{n-1})\le\sum_{k=0}^{n-1}H_\mu(I_k)\le nH_2(\alpha).
\end{align*}
Combining the two bounds proves the claim.
[/proof]
Combining the two estimates and dividing by $n$ gives
\begin{align*}
\frac{1}{n}H_\mu(\mathcal P_0^{n-1})\le \frac{1}{n}\log N(\mathcal U_0^{n-1})+2H_2(\alpha)+\alpha\log m.
\end{align*}
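Let $n\to\infty$. The left-hand side converges to $h_\mu(T,\mathcal P)$. Because the cover $\mathcal U$ is fixed, $\frac{1}{n}\log N(\mathcal U_0^{n-1})$ converges to the entropy of $T$ relative to $\mathcal U$, which is at most $h_{\mathrm{top}}(T)$ by the open-cover definition of topological entropy as the supremum over finite open covers. Hence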
\begin{align*}
h_\mu(T,\mathcal P)\le h_{\mathrm{top}}(T)+2H_2(\alpha)+\alpha\log m.
\end{align*}
Letting $\alpha\downarrow0$ yields
\begin{align*}
h_\mu(T,\mathcal P)\le h_{\mathrm{top}}(T).
\end{align*}
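That the error term vanishes can be made quantitative. The following sketch uses the elementary inequality $-\log(1-\alpha)\le\alpha/(1-\alpha)$ for $0<\alpha<1$; the sample values $\alpha=0.01$ and $m=10$, evaluated with natural logarithms, are chosen only for illustration:
\begin{align*}
H_2(\alpha)&=\alpha\log\frac{1}{\alpha}-(1-\alpha)\log(1-\alpha)\le\alpha\log\frac{1}{\alpha}+\alpha=\alpha\log\frac{e}{\alpha},\\
2H_2(\alpha)+\alpha\log m&\le 2\alpha\log\frac{e}{\alpha}+\alpha\log m\longrightarrow 0\qquad(\alpha\downarrow0),\\
2H_2(0.01)+0.01\log 10&\approx 2(0.0560)+0.0230\approx 0.135.
\end{align*}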
Since $\mathcal P$ was arbitrary,
\begin{align*}
h_\mu(T)\le h_{\mathrm{top}}(T).
\end{align*}
Taking the supremum over $\mu\in\mathcal M_T(X)$ gives
\begin{align*}
\sup_{\mu\in\mathcal M_T(X)}h_\mu(T)\le h_{\mathrm{top}}(T).
\end{align*}
[guided]
Fix an invariant Borel probability measure $\mu$ and a finite Borel partition $\mathcal P=\{P_1,\dots,P_m\}$ of $X$. The goal is to compare the measurable orbit names from $\mathcal P$ with the orbit names allowed by one fixed finite open cover. The cover must be fixed before $n$ varies, because topological entropy controls the growth of $N(\mathcal U_0^{n-1})$ for a fixed cover $\mathcal U$.
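Choose $0<\alpha\le\tfrac12$; the upper restriction is harmless because $\alpha$ will tend to $0$, and it guarantees $H_2(t)\le H_2(\alpha)$ for $0\le t\le\alpha$. Since $X$ is compact metric, every Borel probability measure on $X$ is regular. Therefore, for each $1\le i\le m$, choose a compact set $K_i\subset P_i$ such that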
\begin{align*}
\mu(P_i\setminus K_i)<\frac{\alpha}{m}.
\end{align*}
Define the compact set $K\subset X$ by $K=K_1\cup\dots\cup K_m$. The compact sets $K_i$ are pairwise disjoint because the $P_i$ are disjoint. Since $X$ is metric and there are only finitely many of them, choose pairwise disjoint open sets $U_i\subset X$ with $K_i\subset U_i$ for $1\le i\le m$. Define the finite open cover
\begin{align*}
\mathcal U:=\{U_1,\dots,U_m,X\setminus K\}.
\end{align*}
This cover is the topological object that will encode the non-exceptional part of the partition names.
Define the finite Borel partition $\mathcal R=\{R_0,R_1,\dots,R_m\}$ by $R_i=K_i$ for $1\le i\le m$ and $R_0=X\setminus K$. If $x\in R_i$ with $i\ge1$, then $x\in P_i$, so the $\mathcal R$-atom determines the $\mathcal P$-atom. The only possible failure of determination is the exceptional set $R_0$, and finite subadditivity gives
\begin{align*}
\mu(R_0)\le\sum_{i=1}^m\mu(P_i\setminus K_i)<\alpha.
\end{align*}
Let $H_2:[0,1]\to[0,\log 2]$ denote the binary entropy function $H_2(t)=-t\log t-(1-t)\log(1-t)$, with $0\log0=0$.
We now prove the conditional-entropy estimate used at one time. Let $\mathcal A$ be a finite partition with at most $M$ atoms, let $\mathcal B$ be a finite partition, and let $E\subset X$ be a Borel set with $\mu(E)\le\alpha$. Assume that $\mathcal A$ is determined by $\mathcal B$ on $X\setminus E$. Define the two-atom partition $I_E=\{E,X\setminus E\}$. The chain rule and monotonicity of conditional entropy give
\begin{align*}
H_\mu(\mathcal A\mid\mathcal B)\le H_\mu(I_E\mid\mathcal B)+H_\mu(\mathcal A\mid\mathcal B\vee I_E).
\end{align*}
[Conditioning reduces entropy](/theorems/1652), so
\begin{align*}
H_\mu(I_E\mid\mathcal B)\le H_\mu(I_E)\le H_2(\alpha).
\end{align*}
On $X\setminus E$, the atom of $\mathcal A$ is already determined by the atom of $\mathcal B$, so no conditional entropy remains there. On $E$, there are at most $M$ possible atoms of $\mathcal A$, so the contribution is at most $\mu(E)\log M\le\alpha\log M$. Hence
\begin{align*}
H_\mu(\mathcal A\mid\mathcal B)\le H_2(\alpha)+\alpha\log M.
\end{align*}
Applying this with $\mathcal A=\mathcal P$, $\mathcal B=\mathcal R$, $E=R_0$, and $M=m$ gives
\begin{align*}
H_\mu(\mathcal P\mid\mathcal R)\le H_2(\alpha)+\alpha\log m.
\end{align*}
For $n\in\mathbb N$, the chain rule for finite partition entropy gives
\begin{align*}
H_\mu(\mathcal P_0^{n-1})\le H_\mu(\mathcal R_0^{n-1})+\sum_{k=0}^{n-1}H_\mu(T^{-k}\mathcal P\mid T^{-k}\mathcal R).
\end{align*}
Since $\mu$ is $T$-invariant, the entropy of the pulled-back conditional partition is unchanged:
\begin{align*}
H_\mu(T^{-k}\mathcal P\mid T^{-k}\mathcal R)=H_\mu(\mathcal P\mid\mathcal R).
\end{align*}
Therefore
\begin{align*}
H_\mu(\mathcal P_0^{n-1})\le H_\mu(\mathcal R_0^{n-1})+n\bigl(H_2(\alpha)+\alpha\log m\bigr).
\end{align*}
It remains to bound $H_\mu(\mathcal R_0^{n-1})$ by the open-cover complexity. Let $\mathcal V_n\subset\mathcal U_0^{n-1}$ be a subcover with $|\mathcal V_n|=N(\mathcal U_0^{n-1})$. Since $\mathcal V_n$ is finite, define a measurable tie-breaking map $C_n:X\to\mathcal V_n$ by fixing an ordering $\mathcal V_n=\{V_1,\dots,V_N\}$ and setting $C_n(x)=V_l$ for the least $l$ such that $x\in V_l$. Each fibre of $C_n$ is Borel because it is obtained from finitely many Borel cover elements by intersections and differences.
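Explicitly, with the ordering $V_1,\dots,V_N$ fixed above, the fibres of the tie-breaking map can be written as
\begin{align*}
C_n^{-1}(V_1)=V_1,\qquad C_n^{-1}(V_l)=V_l\setminus(V_1\cup\dots\cup V_{l-1})\quad(2\le l\le N).
\end{align*}
These sets are pairwise disjoint, and they exhaust $X$ because $\mathcal V_n$ is a cover, so they form a Borel partition with at most $N=N(\mathcal U_0^{n-1})$ nonempty atoms.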
For $0\le k\le n-1$, define $E_k=T^{-k}R_0$ and $I_k=\{E_k,X\setminus E_k\}$. Invariance gives $\mu(E_k)=\mu(R_0)<\alpha$, hence
\begin{align*}
H_\mu(I_k)\le H_2(\alpha).
\end{align*}
We claim that the atom of $\mathcal R_0^{n-1}$ containing $x$ is determined by $C_n(x)$ together with the atoms of $I_0,\dots,I_{n-1}$ containing $x$. Write the selected cover element in the form
\begin{align*}
C_n(x)=V_0\cap T^{-1}V_1\cap\cdots\cap T^{-(n-1)}V_{n-1},
\end{align*}
with $V_k\in\mathcal U$. If $x\in E_k$, then $T^k x\in R_0$, so the $k$th $\mathcal R$-symbol is $0$. If $x\notin E_k$, then $T^k x\in K_i$ for a unique $i\ge1$. Because the sets $U_1,\dots,U_m$ are pairwise disjoint and $K_i\subset U_i$, the condition $T^k x\in V_k$ identifies this unique index $i$. Thus the selected cover element plus the exceptional indicators determines every symbol in the $\mathcal R$-name.
Consequently, the entropy of $\mathcal R_0^{n-1}$ is bounded by the entropy of the finite-valued map $C_n$ plus the entropy of the joined indicator partition $I_0\vee\cdots\vee I_{n-1}$. The map $C_n$ has at most $N(\mathcal U_0^{n-1})$ values, so
\begin{align*}
H_\mu(C_n)\le \log N(\mathcal U_0^{n-1}).
\end{align*}
Subadditivity of finite entropy gives
\begin{align*}
H_\mu(I_0\vee\cdots\vee I_{n-1})\le\sum_{k=0}^{n-1}H_\mu(I_k)\le nH_2(\alpha).
\end{align*}
Hence
\begin{align*}
H_\mu(\mathcal R_0^{n-1})\le \log N(\mathcal U_0^{n-1})+nH_2(\alpha).
\end{align*}
Combining this estimate with the previous estimate for $H_\mu(\mathcal P_0^{n-1})$ and dividing by $n$ gives
\begin{align*}
\frac{1}{n}H_\mu(\mathcal P_0^{n-1})\le \frac{1}{n}\log N(\mathcal U_0^{n-1})+2H_2(\alpha)+\alpha\log m.
\end{align*}
Because $\mathcal U$ is fixed, the open-cover definition of topological entropy implies
\begin{align*}
h_\mu(T,\mathcal P)\le h_{\mathrm{top}}(T)+2H_2(\alpha)+\alpha\log m.
\end{align*}
Finally let $\alpha\downarrow0$. Since $H_2(\alpha)\to0$, we obtain
\begin{align*}
h_\mu(T,\mathcal P)\le h_{\mathrm{top}}(T).
\end{align*}
Taking the supremum over all finite Borel partitions gives $h_\mu(T)\le h_{\mathrm{top}}(T)$, and then taking the supremum over $\mu\in\mathcal M_T(X)$ gives the desired upper bound.
[/guided]
[/step]
[step:Build invariant measures from separated orbit sets]
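Fix $\varepsilon>0$ and a real number $L$ with $0\le L<h_{\mathrm{sep}}(T,\varepsilon)$; when $h_{\mathrm{sep}}(T,\varepsilon)=\infty$, $L$ may be any nonnegative real number. (If $h_{\mathrm{sep}}(T,\varepsilon)=0$ there is no such $L$, and the bound on $h_{\mathrm{sep}}(T,\varepsilon)$ obtained in this part is trivial, because entropies are nonnegative and $\mathcal M_T(X)$ is nonempty.) By the definition of the limit superior, choose integers $n_j\to\infty$ such that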
\begin{align*}
\frac{1}{n_j}\log s_{n_j}(\varepsilon)\ge L
\end{align*}
for every $j$. For each $j$, choose an $(n_j,\varepsilon)$-separated set $E_j\subset X$ with
\begin{align*}
|E_j|=s_{n_j}(\varepsilon).
\end{align*}
Define the uniform probability measure on $E_j$ by
\begin{align*}
\sigma_j:=\frac{1}{|E_j|}\sum_{x\in E_j}\delta_x.
\end{align*}
For a Borel probability measure $\rho$ on $X$, define the pushforward measure $T_*\rho$ on the Borel subsets of $X$ by $(T_*\rho)(A)=\rho(T^{-1}A)$. Define the orbit-averaged probability measure
\begin{align*}
\nu_j:=\frac{1}{n_j}\sum_{k=0}^{n_j-1}T_*^k\sigma_j.
\end{align*}
Since $X$ is a compact metric space, the space of Borel probability measures on $X$ is compact and metrizable in the weak* topology. Passing to a subsequence, still indexed by $j$, there exists a Borel probability measure $\mu$ on $X$ such that $\nu_j\to\mu$ in the weak* topology.
We verify that $\mu$ is $T$-invariant. Let
\begin{align*}
f:X\to\mathbb R
\end{align*}
be continuous. Then
\begin{align*}
\int_X f\,d(T_*\nu_j)-\int_X f\,d\nu_j=\frac{1}{n_j}\left(\int_X f\,d(T_*^{n_j}\sigma_j)-\int_X f\,d\sigma_j\right).
\end{align*}
Taking absolute values and using $| \int_X f\,d\rho |\le \|f\|_\infty$ for every probability measure $\rho$ gives
\begin{align*}
\left|\int_X f\,d(T_*\nu_j)-\int_X f\,d\nu_j\right|\le \frac{2\|f\|_\infty}{n_j}.
\end{align*}
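The right-hand side tends to $0$. Since $f\circ T$ is continuous, $\int_X f\,d(T_*\nu_j)=\int_X f\circ T\,d\nu_j\to\int_X f\circ T\,d\mu=\int_X f\,d(T_*\mu)$, while $\int_X f\,d\nu_j\to\int_X f\,d\mu$. Passing to the weak* limit therefore gives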
\begin{align*}
\int_X f\,d(T_*\mu)=\int_X f\,d\mu.
\end{align*}
Since this holds for every continuous $f:X\to\mathbb R$, we have $T_*\mu=\mu$, so $\mu$ is an invariant measure and $\mu\in\mathcal M_T(X)$.
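The identity behind the invariance estimate is a telescoping sum. Written at the level of measures, as a sketch using only the definition of $\nu_j$ and linearity of $T_*$:
\begin{align*}
T_*\nu_j-\nu_j
=\frac{1}{n_j}\sum_{k=0}^{n_j-1}T_*^{k+1}\sigma_j-\frac{1}{n_j}\sum_{k=0}^{n_j-1}T_*^{k}\sigma_j
=\frac{1}{n_j}\bigl(T_*^{n_j}\sigma_j-\sigma_j\bigr).
\end{align*}
Integrating a continuous $f$ against both sides gives the displayed formula for the difference of integrals, and each of the two remaining integrals is bounded in absolute value by $\|f\|_\infty$.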
[/step]
[step:Choose a small partition whose boundary is invisible to the limit measure]
For a Borel set $A\subset X$, let $\partial A:=\overline{A}\setminus A^\circ$ denote its topological boundary in $X$. We choose a finite Borel partition $\mathcal P=\{P_1,\dots,P_m\}$ of $X$ such that every atom has diameter less than $\varepsilon$ and
\begin{align*}
\mu(\partial P_i)=0
\end{align*}
for every $1\le i\le m$.
To construct it, cover $X$ by finitely many open balls $B(x_a,\rho_a)$ with radii $\rho_a<\varepsilon/3$. Fix one centre $x_a$. For distinct radii $r>0$, the boundaries $\partial B(x_a,r)$ are pairwise disjoint, because $\partial B(x_a,r)\subset\{y\in X:d(x_a,y)=r\}$. Since $\mu(X)=1$, for each integer $N\ge1$ there are at most $N$ radii $r>0$ with $\mu(\partial B(x_a,r))\ge 1/N$; otherwise $N+1$ disjoint such boundaries would have total measure exceeding $1$. Hence the set of radii with $\mu(\partial B(x_a,r))>0$ is countable. For each chosen centre, select a radius $r_a$ with $\rho_a\le r_a<\varepsilon/3$ outside this countable exceptional set; the enlarged balls $B(x_a,r_a)$ still cover $X$. Taking these balls, disjointizing them by subtracting the preceding balls, and discarding empty sets gives a finite Borel partition. Boundaries of the resulting atoms are contained in finite unions of ball boundaries of $\mu$-measure zero, and the diameters are at most $2\varepsilon/3<\varepsilon$.
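In symbols, the construction can be sketched as follows, where $B_1,\dots,B_M$ denote the selected balls and $r_1,\dots,r_M<\varepsilon/3$ their radii (the labels $B_a$, $r_a$, $M$ are introduced only for this illustration):
\begin{align*}
P_1&:=B_1,\qquad P_a:=B_a\setminus(B_1\cup\dots\cup B_{a-1})\quad(2\le a\le M),\\
\operatorname{diam}P_a&\le\operatorname{diam}B_a\le 2r_a<\tfrac{2\varepsilon}{3}<\varepsilon,\\
\partial P_a&\subset\partial B_1\cup\dots\cup\partial B_a,\qquad\text{so}\qquad\mu(\partial P_a)\le\sum_{b=1}^{a}\mu(\partial B_b)=0.
\end{align*}
The boundary inclusion uses $\partial(A\cap B)\subset\partial A\cup\partial B$ and $\partial(X\setminus A)=\partial A$. After discarding empty sets $P_a$, the partition has $m\le M$ atoms.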
[/step]
[step:Convert separated orbit names into partition entropy]
For each $j$, the map
\begin{align*}
\pi_j:E_j\to \mathcal P_0^{n_j-1},\qquad x\mapsto \text{the atom of }\mathcal P_0^{n_j-1}\text{ containing }x
\end{align*}
is injective. Indeed, if $x,y\in E_j$ have the same $\mathcal P_0^{n_j-1}$-name, then for each $0\le k\le n_j-1$ the points $T^k x$ and $T^k y$ lie in the same atom of $\mathcal P$, whose diameter is less than $\varepsilon$. Therefore
\begin{align*}
d_{n_j}(x,y)<\varepsilon.
\end{align*}
Since $E_j$ is $(n_j,\varepsilon)$-separated, this forces $x=y$.
Thus the partition $\mathcal P_0^{n_j-1}$ separates all atoms of the uniform measure $\sigma_j$, and hence
\begin{align*}
H_{\sigma_j}(\mathcal P_0^{n_j-1})=\log |E_j|=\log s_{n_j}(\varepsilon).
\end{align*}
Fix $q\in\mathbb N$. We prove the block estimate used to average orbit names. For each offset $r\in\{0,\dots,q-1\}$, let
\begin{align*}
B_{j,r}:=\{t\in\{0,\dots,n_j-q\}:t\equiv r\pmod q\}.
\end{align*}
The intervals $\{t,t+1,\dots,t+q-1\}$ with $t\in B_{j,r}$ are disjoint. They cover all times in $\{0,\dots,n_j-1\}$ except an initial remainder of length $r<q$ and a terminal remainder of length less than $q$. The joined partition over each remainder has entropy at most its length times $\log m$, so [subadditivity of entropy](/theorems/1634) under refinement gives
\begin{align*}
H_{\sigma_j}(\mathcal P_0^{n_j-1})\le \sum_{t\in B_{j,r}}H_{\sigma_j}(T^{-t}\mathcal P_0^{q-1})+2q\log m.
\end{align*}
Averaging this inequality over $r=0,\dots,q-1$ and using the identity $H_{\sigma_j}(T^{-t}\mathcal P_0^{q-1})=H_{T_*^t\sigma_j}(\mathcal P_0^{q-1})$ yields
\begin{align*}
H_{\sigma_j}(\mathcal P_0^{n_j-1})\le \frac{1}{q}\sum_{t=0}^{n_j-1}H_{T_*^t\sigma_j}(\mathcal P_0^{q-1})+2q\log m.
\end{align*}
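A small instance may help fix the indexing. Taking the illustrative values $n_j=10$ and $q=3$, so that $n_j-q=7$:
\begin{align*}
B_{j,0}&=\{0,3,6\}, &&\text{blocks }\{0,1,2\},\{3,4,5\},\{6,7,8\}, &&\text{uncovered }\{9\},\\
B_{j,1}&=\{1,4,7\}, &&\text{blocks }\{1,2,3\},\{4,5,6\},\{7,8,9\}, &&\text{uncovered }\{0\},\\
B_{j,2}&=\{2,5\}, &&\text{blocks }\{2,3,4\},\{5,6,7\}, &&\text{uncovered }\{0,1,8,9\}.
\end{align*}
In each case fewer than $2q=6$ times are left uncovered, and $B_{j,0}\cup B_{j,1}\cup B_{j,2}=\{0,\dots,7\}$. This is why averaging over $r$ produces a sum over all starting times $t\le n_j-q$, which is bounded by the sum over $0\le t\le n_j-1$ because entropies are nonnegative.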
The concavity of finite partition entropy as a function of the underlying probability measure, applied to the average measure $\nu_j=n_j^{-1}\sum_{t=0}^{n_j-1}T_*^t\sigma_j$, gives
\begin{align*}
\frac{1}{n_j}\sum_{t=0}^{n_j-1}H_{T_*^t\sigma_j}(\mathcal P_0^{q-1})\le H_{\nu_j}(\mathcal P_0^{q-1}).
\end{align*}
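The concavity step can be checked atom by atom. In the following sketch, $\rho_t:=T_*^t\sigma_j$ is shorthand introduced only here, $\psi(s):=-s\log s$ with $\psi(0)=0$, and $A$ ranges over the atoms of $\mathcal P_0^{q-1}$. Jensen's inequality for the concave function $\psi$ gives
\begin{align*}
H_{\nu_j}(\mathcal P_0^{q-1})
=\sum_{A}\psi\Bigl(\frac{1}{n_j}\sum_{t=0}^{n_j-1}\rho_t(A)\Bigr)
\ge\sum_{A}\frac{1}{n_j}\sum_{t=0}^{n_j-1}\psi\bigl(\rho_t(A)\bigr)
=\frac{1}{n_j}\sum_{t=0}^{n_j-1}H_{\rho_t}(\mathcal P_0^{q-1}).
\end{align*}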
Combining these estimates with $H_{\sigma_j}(\mathcal P_0^{n_j-1})=\log s_{n_j}(\varepsilon)$ and dividing by $n_j$ yields
\begin{align*}
\frac{1}{n_j}\log s_{n_j}(\varepsilon)\le \frac{1}{q}H_{\nu_j}(\mathcal P_0^{q-1})+\frac{2q\log m}{n_j}.
\end{align*}
[/step]
[step:Pass the entropy estimate to the weak star limit]
Because $\mu(\partial P_i)=0$ for every atom $P_i$ of $\mathcal P$, every atom of the joined partition $\mathcal P_0^{q-1}$ has $\mu$-null boundary. Indeed, if $A=P_{i_0}\cap T^{-1}P_{i_1}\cap\cdots\cap T^{-(q-1)}P_{i_{q-1}}$ is such an atom, we use the standard topological fact that for a continuous map $S:X\to X$ and a Borel set $B\subset X$ one has $\partial(S^{-1}B)\subset S^{-1}(\partial B)$. Applying this to $S=T^k$ and $B=P_{i_k}$ gives
\begin{align*}
\partial A\subset \bigcup_{k=0}^{q-1}T^{-k}(\partial P_{i_k}).
\end{align*}
Since $\mu$ is $T$-invariant, $\mu(T^{-k}(\partial P_{i_k}))=\mu(\partial P_{i_k})=0$ for each $k$, and hence $\mu(\partial A)=0$ by finite subadditivity. By the [Portmanteau Theorem](/theorems/1171) for continuity sets, weak* convergence $\nu_j\to\mu$ therefore implies convergence of the atom masses of $\mathcal P_0^{q-1}$. Since the function
\begin{align*}
\psi:[0,1]\to[0,\infty),\qquad \psi(t):=-t\log t
\end{align*}
is continuous with $\psi(0)=0$, we obtain
\begin{align*}
H_{\nu_j}(\mathcal P_0^{q-1})\to H_\mu(\mathcal P_0^{q-1}).
\end{align*}
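The null-boundary condition cannot be dropped. A standard illustration, with all objects assumed only for this example: on $X=[0,1]$ let $\nu_j:=\tfrac12(\delta_0+\delta_{1/j})$, which converges weak* to $\mu=\delta_0$, and let $\mathcal Q:=\{\{0\},(0,1]\}$. Then
\begin{align*}
H_{\nu_j}(\mathcal Q)=\log 2\quad\text{for all }j\ge1,\qquad H_\mu(\mathcal Q)=0,\qquad\mu(\partial\{0\})=\mu(\{0\})=1,
\end{align*}
so the entropies do not converge, precisely because an atom boundary carries positive $\mu$-mass.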
Taking the limit superior in the previous estimate and using the defining lower bound for the sequence $n_j$ gives
\begin{align*}
L\le \frac{1}{q}H_\mu(\mathcal P_0^{q-1}).
\end{align*}
Now let $q\to\infty$. By the definition of partition entropy,
\begin{align*}
L\le h_\mu(T,\mathcal P)\le h_\mu(T).
\end{align*}
Since $\mu\in\mathcal M_T(X)$, this proves
\begin{align*}
L\le \sup_{\rho\in\mathcal M_T(X)}h_\rho(T).
\end{align*}
Because $L<h_{\mathrm{sep}}(T,\varepsilon)$ was arbitrary, including arbitrarily large finite $L$ when $h_{\mathrm{sep}}(T,\varepsilon)=\infty$, we obtain
\begin{align*}
h_{\mathrm{sep}}(T,\varepsilon)\le \sup_{\rho\in\mathcal M_T(X)}h_\rho(T).
\end{align*}
Finally let $\varepsilon\downarrow0$. Using the separated-set formula for topological entropy,
\begin{align*}
h_{\mathrm{top}}(T)=\lim_{\varepsilon\downarrow0}h_{\mathrm{sep}}(T,\varepsilon),
\end{align*}
we conclude
\begin{align*}
h_{\mathrm{top}}(T)\le \sup_{\rho\in\mathcal M_T(X)}h_\rho(T).
\end{align*}
[/step]
[step:Combine the two inequalities]
The upper bound proved
\begin{align*}
\sup_{\mu\in\mathcal M_T(X)}h_\mu(T)\le h_{\mathrm{top}}(T).
\end{align*}
The lower bound proved
\begin{align*}
h_{\mathrm{top}}(T)\le \sup_{\mu\in\mathcal M_T(X)}h_\mu(T).
\end{align*}
Therefore
\begin{align*}
h_{\mathrm{top}}(T)=\sup_{\mu\in\mathcal M_T(X)}h_\mu(T).
\end{align*}
This is the variational principle for topological entropy.
[/step]