[proofplan]
The [Shannon-McMillan-Breiman theorem](/theorems/6766) says that, for $\mu$-almost every point $x$, the measure of the atom of $\mathcal P_0^{n-1}$ containing $x$ decays like $e^{-nh}$ on the exponential scale: $-\frac{1}{n}\log$ of this measure converges to $h$. We define the typical family $\mathcal W_n(\varepsilon)$ to consist of precisely those atoms whose measures lie between $e^{-n(h+\varepsilon)}$ and $e^{-n(h-\varepsilon)}$. The lower mass bound on each typical atom gives the upper bound on the number of typical names, while the upper mass bound on each typical atom forces any family covering a fixed positive measure to contain exponentially many atoms.
[/proofplan]
[step:Define the typical atoms using Shannon-McMillan-Breiman]
For each $n\in\mathbb N$ and each $x\in X$, let $P_n(x)\in\mathcal P_0^{n-1}$ denote the unique atom of the partition $\mathcal P_0^{n-1}$ containing $x$. Define the information function
\begin{align*}
I_n:X\to[0,\infty],\qquad x\mapsto -\log \mu(P_n(x)),
\end{align*}
with the convention $-\log 0=\infty$.
By the Shannon-McMillan-Breiman theorem for finite partitions, since $(X,\mathcal B,\mu,T)$ is ergodic and $\mathcal P$ is finite,
\begin{align*}
\frac{1}{n}I_n(x)\to h
\end{align*}
for $\mu$-almost every $x\in X$.
For $\varepsilon>0$, define
\begin{align*}
E_n(\varepsilon):=\left\{x\in X: h-\varepsilon\leq \frac{1}{n}I_n(x)\leq h+\varepsilon\right\}.
\end{align*}
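Since almost-everywhere convergence implies convergence in measure on a probability space, $\mu(E_n(\varepsilon))\to 1$. Unwinding the definition of $I_n$, a point $x$ lies in $E_n(\varepsilon)$ exactly when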
\begin{align*}
e^{-n(h+\varepsilon)}\leq \mu(P_n(x))\leq e^{-n(h-\varepsilon)}.
\end{align*}
Define the family of typical $n$-name atoms
\begin{align*}
\mathcal W_n(\varepsilon):=\left\{A\in\mathcal P_0^{n-1}: e^{-n(h+\varepsilon)}\leq \mu(A)\leq e^{-n(h-\varepsilon)}\right\}.
\end{align*}
Since membership in $E_n(\varepsilon)$ depends only on the atom $P_n(x)$, we have
\begin{align*}
E_n(\varepsilon)=\bigcup_{A\in\mathcal W_n(\varepsilon)}A.
\end{align*}
Therefore
\begin{align*}
\mu\left(\bigcup_{A\in\mathcal W_n(\varepsilon)}A\right)=\mu(E_n(\varepsilon))\to 1.
\end{align*}
[guided]
For a point $x\in X$, the atom $P_n(x)$ records the first $n$ symbols of the orbit of $x$ with respect to the partition $\mathcal P$. Its measure is the probability of seeing that particular $n$-name. The Shannon-McMillan-Breiman theorem says that, for $\mu$-almost every $x$, this probability has exponential size $e^{-nh}$ in the sense that
\begin{align*}
\frac{-\log \mu(P_n(x))}{n}\to h.
\end{align*}
Fix $\varepsilon>0$. The typical event is therefore
\begin{align*}
E_n(\varepsilon):=\left\{x\in X: h-\varepsilon\leq \frac{-\log \mu(P_n(x))}{n}\leq h+\varepsilon\right\}.
\end{align*}
By Shannon-McMillan-Breiman, almost every $x$ belongs to $E_n(\varepsilon)$ for all sufficiently large $n$, and hence $\mu(E_n(\varepsilon))\to 1$.
Multiplying the two inequalities in the definition of $E_n(\varepsilon)$ by $-n$ reverses them, and exponentiating then preserves their direction. Thus $x\in E_n(\varepsilon)$ if and only if
\begin{align*}
e^{-n(h+\varepsilon)}\leq \mu(P_n(x))\leq e^{-n(h-\varepsilon)}.
\end{align*}
This motivates defining $\mathcal W_n(\varepsilon)$ as the family of atoms whose measures satisfy exactly this two-sided exponential estimate:
\begin{align*}
\mathcal W_n(\varepsilon):=\left\{A\in\mathcal P_0^{n-1}: e^{-n(h+\varepsilon)}\leq \mu(A)\leq e^{-n(h-\varepsilon)}\right\}.
\end{align*}
Because every point $x$ of a fixed atom $A\in\mathcal P_0^{n-1}$ satisfies $P_n(x)=A$, the condition defining $E_n(\varepsilon)$ is constant on atoms. Hence
\begin{align*}
E_n(\varepsilon)=\bigcup_{A\in\mathcal W_n(\varepsilon)}A.
\end{align*}
Consequently the typical atoms cover asymptotically full measure:
\begin{align*}
\mu\left(\bigcup_{A\in\mathcal W_n(\varepsilon)}A\right)=\mu(E_n(\varepsilon))\to 1.
\end{align*}
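As an illustration under assumptions not made in the statement, suppose $X=\{0,1\}^{\mathbb N}$ carries the Bernoulli product measure $\mu$ with $\mu(\{x_0=1\})=p\in(0,1)$, $T$ is the shift, and $\mathcal P$ is the partition according to the coordinate $x_0$. Writing $k_n(x)$ (a name introduced only for this example) for the number of ones among $x_0,\dots,x_{n-1}$, the atom $P_n(x)$ is a cylinder and
\begin{align*}
\mu(P_n(x))=p^{k_n(x)}(1-p)^{n-k_n(x)},\qquad \frac{-\log\mu(P_n(x))}{n}=-\frac{k_n(x)}{n}\log p-\left(1-\frac{k_n(x)}{n}\right)\log(1-p).
\end{align*}
By the strong law of large numbers, $k_n(x)/n\to p$ for $\mu$-almost every $x$, so the right-hand side converges to $-p\log p-(1-p)\log(1-p)$, which is the value of $h$ in this example. Here $E_n(\varepsilon)$ consists of the sequences whose frequency of ones among the first $n$ symbols keeps this expression within $\varepsilon$ of $h$.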
[/guided]
[/step]
[step:Count typical atoms from the lower bound on their measures]
The atoms of $\mathcal P_0^{n-1}$ are pairwise disjoint, so the atoms in $\mathcal W_n(\varepsilon)$ are pairwise disjoint. For every $A\in\mathcal W_n(\varepsilon)$,
\begin{align*}
\mu(A)\geq e^{-n(h+\varepsilon)}.
\end{align*}
Therefore
\begin{align*}
1\geq \mu\left(\bigcup_{A\in\mathcal W_n(\varepsilon)}A\right)=\sum_{A\in\mathcal W_n(\varepsilon)}\mu(A)\geq |\mathcal W_n(\varepsilon)|e^{-n(h+\varepsilon)}.
\end{align*}
Multiplying by $e^{n(h+\varepsilon)}$ gives
\begin{align*}
|\mathcal W_n(\varepsilon)|\leq e^{n(h+\varepsilon)}.
\end{align*}
This estimate holds for every $n\in\mathbb N$, and hence in particular for all sufficiently large $n$.
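A quick sanity check, in an example chosen here for illustration rather than taken from the statement: for the shift on $\{0,1\}^{\mathbb N}$ with the uniform product measure and the partition by the first symbol, every atom of $\mathcal P_0^{n-1}$ has measure $2^{-n}$ and $h=\log 2$. Every atom is then typical for each $\varepsilon>0$, and
\begin{align*}
|\mathcal W_n(\varepsilon)|=2^n=e^{nh}\leq e^{n(h+\varepsilon)},
\end{align*}
so in this case the bound is sharp up to the factor $e^{n\varepsilon}$.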
[/step]
[step:Force many atoms in any family covering positive measure]
Fix $0<\delta<1$ and $\varepsilon>0$. Since $\mu(E_n(\varepsilon))\to 1$, there exists $N_1\in\mathbb N$ such that for every $n\geq N_1$,
\begin{align*}
\mu(E_n(\varepsilon))\geq 1-\frac{\delta}{2}.
\end{align*}
Let $n\geq N_1$, and let $\mathcal V_n\subseteq\mathcal P_0^{n-1}$ satisfy
\begin{align*}
\mu\left(\bigcup_{A\in\mathcal V_n}A\right)\geq \delta.
\end{align*}
Define
\begin{align*}
V_n:=\bigcup_{A\in\mathcal V_n}A.
\end{align*}
Then
\begin{align*}
\mu(V_n\cap E_n(\varepsilon))\geq \mu(V_n)-\mu(X\setminus E_n(\varepsilon))\geq \delta-\frac{\delta}{2}=\frac{\delta}{2}.
\end{align*}
Since $E_n(\varepsilon)$ is a union of atoms of $\mathcal P_0^{n-1}$, the set $V_n\cap E_n(\varepsilon)$ is the union of those atoms $A\in\mathcal V_n$ that also belong to $\mathcal W_n(\varepsilon)$. For each such atom,
\begin{align*}
\mu(A)\leq e^{-n(h-\varepsilon)}.
\end{align*}
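Using pairwise disjointness of atoms,
\begin{align*}
\frac{\delta}{2}\leq \mu(V_n\cap E_n(\varepsilon))=\sum_{A\in\mathcal V_n\cap\mathcal W_n(\varepsilon)}\mu(A)\leq |\mathcal V_n\cap\mathcal W_n(\varepsilon)|e^{-n(h-\varepsilon)}\leq |\mathcal V_n|e^{-n(h-\varepsilon)}.
\end{align*}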
Multiplying by $e^{n(h-\varepsilon)}$ yields
\begin{align*}
|\mathcal V_n|\geq \frac{\delta}{2}e^{n(h-\varepsilon)}.
\end{align*}
Thus the claimed lower bound holds for every $n\geq N_1$, completing the proof.
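It may help to record the same bound on the logarithmic scale; this is only a restatement under the assumptions above. For $n\geq N_1$ and any such $\mathcal V_n$,
\begin{align*}
\frac{1}{n}\log|\mathcal V_n|\geq h-\varepsilon+\frac{1}{n}\log\frac{\delta}{2},
\end{align*}
and since $\frac{1}{n}\log\frac{\delta}{2}\to 0$, the exponential growth rate of $|\mathcal V_n|$ is at least $h-\varepsilon$. Because $\mu\left(\bigcup_{A\in\mathcal W_n(\varepsilon)}A\right)\to 1$ and $\delta<1$, the choice $\mathcal V_n=\mathcal W_n(\varepsilon)$ is admissible for all sufficiently large $n$, and combining with the counting bound $|\mathcal W_n(\varepsilon)|\leq e^{n(h+\varepsilon)}$ gives
\begin{align*}
h-\varepsilon+\frac{1}{n}\log\frac{\delta}{2}\leq \frac{1}{n}\log|\mathcal W_n(\varepsilon)|\leq h+\varepsilon.
\end{align*}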
[/step]