[proofplan]
Let $M$ be the finite irreducible zero-one matrix defining $\Sigma_M$, and let $\lambda$ be its Perron eigenvalue. The upper bound follows by comparing the entropy of every invariant measure with the logarithmic growth rate of admissible words. The lower bound is attained by the Parry Markov measure built from positive left and right Perron eigenvectors. The entropy formula for stationary Markov measures then gives $h_{\mu_P}(\sigma)=\log\lambda$, matching the topological entropy.
[/proofplan]
[step:Fix the symbolic model and the Perron data]
Write the state set as $A=\{1,\dots,k\}$, and let $M\in\{0,1\}^{A\times A}$ be the irreducible adjacency matrix defining
\begin{align*}
\Sigma_M:=\{x\in A^{\mathbb Z}:M_{x_nx_{n+1}}=1\text{ for every }n\in\mathbb Z\}.
\end{align*}
Let $\lambda>0$ be the Perron eigenvalue of $M$. Choose positive right and left Perron eigenvectors $r,l\in(0,\infty)^A$ satisfying
\begin{align*}
Mr=\lambda r
\end{align*}
and
\begin{align*}
l^\top M=\lambda l^\top.
\end{align*}
Normalize them so that
\begin{align*}
\sum_{i\in A}l_ir_i=1.
\end{align*}
The topological entropy of the irreducible topological Markov chain is
\begin{align*}
h_{\mathrm{top}}(\sigma|_{\Sigma_M})=\log\lambda.
\end{align*}
This is the Perron-Frobenius word-growth formula applied to the finite irreducible non-negative matrix $M$. Irreducibility guarantees that every finite path in the graph of $M$ extends to a bi-infinite one, so the number of admissible words of length $n$ is $\sum_{i,j\in A}(M^{n-1})_{ij}$, and this path-counting sequence has exponential growth rate $\lambda$.
[guided]
The shift space is determined by a finite directed graph encoded by $M$. A bi-infinite sequence belongs to $\Sigma_M$ exactly when every adjacent pair is allowed:
\begin{align*}
M_{x_nx_{n+1}}=1
\end{align*}
for every $n\in\mathbb Z$. Since $M$ is irreducible and non-negative, Perron-Frobenius theory gives a leading eigenvalue $\lambda>0$ and positive vectors $r,l$ with
\begin{align*}
Mr=\lambda r
\end{align*}
and
\begin{align*}
l^\top M=\lambda l^\top.
\end{align*}
We normalize by requiring
\begin{align*}
\sum_{i\in A}l_ir_i=1.
\end{align*}
Finally, $M$ is finite, irreducible, and non-negative, and the admissible words of length $n$ are counted by the sum of the entries of $M^{n-1}$. The Perron-Frobenius word-growth formula therefore gives exponential growth rate $\lambda$. Hence
\begin{align*}
h_{\mathrm{top}}(\sigma|_{\Sigma_M})=\log\lambda.
\end{align*}
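As a concrete illustration (an assumed example, not part of the argument above), consider the golden mean shift: $A=\{1,2\}$ with the transition $2\to2$ forbidden. Writing $\varphi=(1+\sqrt5)/2$, so that $\varphi^2=\varphi+1$, the Perron data can be checked by hand:
\begin{align*}
M=\begin{pmatrix}1&1\\1&0\end{pmatrix},\qquad
\lambda=\varphi,\qquad
r=\begin{pmatrix}\varphi\\1\end{pmatrix},\qquad
l=\frac{1}{\varphi+2}\begin{pmatrix}\varphi\\1\end{pmatrix}.
\end{align*}
Indeed $Mr=(\varphi+1,\varphi)^\top=\varphi\,r$, and $l^\top M=\lambda l^\top$ follows because $M$ is symmetric. The normalization holds since
\begin{align*}
\sum_{i\in A}l_ir_i=\frac{\varphi^2+1}{\varphi+2}=\frac{\varphi+2}{\varphi+2}=1.
\end{align*}
In this example the entropy formula reads $h_{\mathrm{top}}(\sigma|_{\Sigma_M})=\log\varphi$.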
[/guided]
[/step]
[step:Bound every invariant measure by word growth]
Let $\mu$ be a $\sigma$-invariant Borel probability measure on $\Sigma_M$. Let $\mathcal P$ be the finite partition according to the coordinate $x_0$. The shifts $\sigma^{-j}\mathcal P$, $j\in\mathbb Z$, together generate the Borel $\sigma$-algebra of $\Sigma_M$, so
\begin{align*}
h_\mu(\sigma)=\lim_{n\to\infty}\frac{1}{n}H_\mu\left(\bigvee_{j=0}^{n-1}\sigma^{-j}\mathcal P\right).
\end{align*}
For $n\geq 1$, define
\begin{align*}
\mathcal L_n(\Sigma_M):=\{a_0\cdots a_{n-1}\in A^n:\text{there exists }x\in\Sigma_M\text{ with }x_i=a_i\text{ for }0\leq i\leq n-1\}.
\end{align*}
Since $\mathcal P$ is finite and generating, the limit above is justified by the Kolmogorov-Sinai generator formula. The nonempty atoms of $\bigvee_{j=0}^{n-1}\sigma^{-j}\mathcal P$ are exactly the cylinders determined by the words in $\mathcal L_n(\Sigma_M)$, so there are $|\mathcal L_n(\Sigma_M)|$ of them. The entropy of a probability distribution supported on at most $N$ atoms is at most $\log N$, so
\begin{align*}
H_\mu\left(\bigvee_{j=0}^{n-1}\sigma^{-j}\mathcal P\right)\leq \log|\mathcal L_n(\Sigma_M)|.
\end{align*}
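Dividing by $n$, passing to the limit, and using $h_{\mathrm{top}}(\sigma|_{\Sigma_M})=\lim_{n\to\infty}\frac{1}{n}\log|\mathcal L_n(\Sigma_M)|$ gives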
\begin{align*}
h_\mu(\sigma)\leq h_{\mathrm{top}}(\sigma|_{\Sigma_M})=\log\lambda.
\end{align*}
Since $\mu$ was arbitrary,
\begin{align*}
\sup\{h_\mu(\sigma):\mu\text{ is }\sigma\text{-invariant}\}\leq\log\lambda.
\end{align*}
[guided]
Take any invariant probability measure $\mu$. The one-coordinate partition $\mathcal P$ is finite because $A$ is finite, and it is generating because its shifts recover every coordinate of a point in $\Sigma_M$. Thus the Kolmogorov-Sinai generator formula computes $h_\mu(\sigma)$ from block partitions. For $n\geq 1$, set
\begin{align*}
\mathcal L_n(\Sigma_M):=\{a_0\cdots a_{n-1}\in A^n:\text{there exists }x\in\Sigma_M\text{ with }x_i=a_i\text{ for }0\leq i\leq n-1\}.
\end{align*}
The join
\begin{align*}
\bigvee_{j=0}^{n-1}\sigma^{-j}\mathcal P
\end{align*}
records the word from time $0$ through time $n-1$. Its nonempty atoms are exactly admissible length-$n$ cylinders. Therefore there are at most
\begin{align*}
|\mathcal L_n(\Sigma_M)|
\end{align*}
nonempty atoms. A probability distribution on at most $N$ atoms has entropy at most $\log N$, so
\begin{align*}
H_\mu\left(\bigvee_{j=0}^{n-1}\sigma^{-j}\mathcal P\right)\leq \log|\mathcal L_n(\Sigma_M)|.
\end{align*}
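After dividing by $n$ and letting $n\to\infty$, the left side converges to $h_\mu(\sigma)$ and the right side converges to the topological entropy $\lim_{n\to\infty}\frac{1}{n}\log|\mathcal L_n(\Sigma_M)|$. Thus every invariant $\mu$ satisfies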
\begin{align*}
h_\mu(\sigma)\leq h_{\mathrm{top}}(\sigma|_{\Sigma_M})=\log\lambda.
\end{align*}
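To see the word-growth bound in a small case, consider (as an assumed illustration, not taken from the argument above) the golden mean shift on $A=\{1,2\}$ with adjacency matrix $M=\begin{pmatrix}1&1\\1&0\end{pmatrix}$, whose Perron eigenvalue is $\varphi=(1+\sqrt5)/2$. With the Fibonacci numbers $F_0=0$, $F_1=1$, $F_{m+1}=F_m+F_{m-1}$, induction gives for $n\geq 2$
\begin{align*}
M^{n-1}=\begin{pmatrix}F_n&F_{n-1}\\F_{n-1}&F_{n-2}\end{pmatrix},
\end{align*}
and summing the entries counts the admissible words:
\begin{align*}
|\mathcal L_n(\Sigma_M)|=F_n+2F_{n-1}+F_{n-2}=F_{n+1}+F_n=F_{n+2}.
\end{align*}
For instance $|\mathcal L_1|=2$, $|\mathcal L_2|=3$, and $|\mathcal L_3|=5$ (the words $111,112,121,211,212$). By Binet's formula $F_m=(\varphi^m-\psi^m)/\sqrt5$ with $\psi=(1-\sqrt5)/2$ and $|\psi|<1$, so
\begin{align*}
\lim_{n\to\infty}\frac{1}{n}\log|\mathcal L_n(\Sigma_M)|=\lim_{n\to\infty}\frac{1}{n}\log F_{n+2}=\log\varphi,
\end{align*}
and the bound reads $h_\mu(\sigma)\leq\log\varphi$ for every invariant $\mu$ on this shift.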
[/guided]
[/step]
[step:Construct the Parry measure]
Define
\begin{align*}
\pi_i:=l_ir_i
\end{align*}
for $i\in A$, and define a transition matrix $P=(P_{ij})$ by
\begin{align*}
P_{ij}:=\frac{M_{ij}r_j}{\lambda r_i}.
\end{align*}
The rows of $P$ sum to $1$, because
\begin{align*}
\sum_{j\in A}P_{ij}=\frac{1}{\lambda r_i}\sum_{j\in A}M_{ij}r_j=\frac{(Mr)_i}{\lambda r_i}=1.
\end{align*}
The vector $\pi$ is stationary for $P$, since
\begin{align*}
\sum_{i\in A}\pi_iP_{ij}
=\sum_{i\in A}l_ir_i\frac{M_{ij}r_j}{\lambda r_i}
=\frac{r_j}{\lambda}\sum_{i\in A}l_iM_{ij}
=\frac{r_j}{\lambda}(l^\top M)_j
=l_jr_j
=\pi_j.
\end{align*}
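Let $\mu_P$ be the stationary Markov measure with initial distribution $\pi$ and transition matrix $P$. Since $P_{ij}=0$ whenever $M_{ij}=0$, this measure assigns zero mass to every cylinder containing a forbidden transition, so it is a $\sigma$-invariant probability measure on $\Sigma_M$. This is the Parry measure.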
[guided]
The Perron eigenvectors define a Markov chain on the allowed edges. Set
\begin{align*}
\pi_i=l_ir_i
\end{align*}
and
\begin{align*}
P_{ij}=\frac{M_{ij}r_j}{\lambda r_i}.
\end{align*}
Since $l$ and $r$ are positive, the normalization $\sum_i l_ir_i=1$ makes $\pi$ a probability vector. The identity $Mr=\lambda r$ makes every row of $P$ sum to $1$, and the identity $l^\top M=\lambda l^\top$ proves stationarity:
\begin{align*}
\sum_i\pi_iP_{ij}=\pi_j.
\end{align*}
Thus these data define a stationary Markov measure $\mu_P$ supported on $\Sigma_M$. This is the Parry measure.
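For a hand-checkable instance (an assumed illustration, not part of the argument above), take the golden mean shift with $M=\begin{pmatrix}1&1\\1&0\end{pmatrix}$, $\lambda=\varphi=(1+\sqrt5)/2$, $r=(\varphi,1)^\top$, and $l=(\varphi,1)^\top/(\varphi+2)$; these satisfy $Mr=\lambda r$, $l^\top M=\lambda l^\top$, and $\sum_i l_ir_i=1$ because $\varphi^2=\varphi+1$. The definitions then give
\begin{align*}
\pi=\left(\frac{\varphi^2}{\varphi+2},\ \frac{1}{\varphi+2}\right),\qquad
P=\begin{pmatrix}1/\varphi&1/\varphi^2\\1&0\end{pmatrix}.
\end{align*}
The first row sums to $1/\varphi+1/\varphi^2=(\varphi+1)/\varphi^2=1$, and stationarity can be verified entrywise:
\begin{align*}
(\pi P)_1&=\frac{\varphi^2}{\varphi+2}\cdot\frac{1}{\varphi}+\frac{1}{\varphi+2}=\frac{\varphi+1}{\varphi+2}=\pi_1,\\
(\pi P)_2&=\frac{\varphi^2}{\varphi+2}\cdot\frac{1}{\varphi^2}=\frac{1}{\varphi+2}=\pi_2.
\end{align*}
Numerically $\pi\approx(0.7236,\,0.2764)$, and the entry $P_{22}=0$ reflects the forbidden transition $2\to2$.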
[/guided]
[/step]
[step:Compute the entropy of the Parry measure]
The matrix $P$ is a finite stochastic matrix, $\pi$ is stationary for $P$, and $\mu_P$ is the stationary Markov measure built from these data. Therefore the entropy formula for stationary Markov measures applies:
\begin{align*}
h_{\mu_P}(\sigma)=-\sum_{i\in A}\pi_i\sum_{\{j\in A:P_{ij}>0\}}P_{ij}\log P_{ij}.
\end{align*}
For allowed transitions, $M_{ij}=1$, and therefore
\begin{align*}
\log P_{ij}=\log r_j-\log\lambda-\log r_i.
\end{align*}
Substituting this expression gives
\begin{align*}
h_{\mu_P}(\sigma)
=\sum_{i\in A}\pi_i\sum_jP_{ij}\bigl(\log\lambda+\log r_i-\log r_j\bigr).
\end{align*}
Since the rows of $P$ sum to $1$ and $\sum_{i\in A}\pi_i=1$, the first two terms become
\begin{align*}
\log\lambda+\sum_{i\in A}\pi_i\log r_i.
\end{align*}
Since $\pi P=\pi$, the last term is
\begin{align*}
\sum_{i\in A}\pi_i\sum_jP_{ij}\log r_j=\sum_{j\in A}\pi_j\log r_j.
\end{align*}
The two $r$-terms cancel, and hence
\begin{align*}
h_{\mu_P}(\sigma)=\log\lambda.
\end{align*}
Combining this with the upper bound proves
\begin{align*}
h_{\mathrm{top}}(\sigma|_{\Sigma_M})=\sup\{h_\mu(\sigma):\mu\text{ is }\sigma\text{-invariant}\},
\end{align*}
and the supremum is attained by the Parry measure $\mu_P$.
[guided]
Here $P$ is finite and stochastic, $\pi$ is stationary, and $\mu_P$ is the corresponding stationary Markov measure. Thus the stationary Markov entropy formula applies: entropy is the stationary average of transition-row entropies,
\begin{align*}
h_{\mu_P}(\sigma)=-\sum_i\pi_i\sum_{\{j:P_{ij}>0\}}P_{ij}\log P_{ij}.
\end{align*}
For the Parry transition probabilities,
\begin{align*}
P_{ij}=\frac{M_{ij}r_j}{\lambda r_i}.
\end{align*}
On allowed transitions $M_{ij}=1$, so
\begin{align*}
\log P_{ij}=\log r_j-\log\lambda-\log r_i.
\end{align*}
The entropy becomes the average of $\log\lambda+\log r_i-\log r_j$. The average of $\log\lambda$ is $\log\lambda$. The average of $\log r_i$ with respect to $\pi_iP_{ij}$ is $\sum_i\pi_i\log r_i$, while the average of $\log r_j$ is $\sum_j\pi_j\log r_j$ by stationarity. These cancel. Therefore
\begin{align*}
h_{\mu_P}(\sigma)=\log\lambda.
\end{align*}
The upper bound showed no invariant measure can have entropy larger than $\log\lambda$, so the Parry measure attains the supremum.
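As a sanity check on a small case (an assumed illustration, not part of the argument above), take the golden mean shift with $M=\begin{pmatrix}1&1\\1&0\end{pmatrix}$, $\lambda=\varphi=(1+\sqrt5)/2$, and $r=(\varphi,1)^\top$. Using $\varphi^2=\varphi+1$, the Parry data computed from the definitions are
\begin{align*}
\pi=\left(\frac{\varphi^2}{\varphi+2},\ \frac{1}{\varphi+2}\right),\qquad
P=\begin{pmatrix}1/\varphi&1/\varphi^2\\1&0\end{pmatrix}.
\end{align*}
The second row of $P$ is deterministic and contributes no entropy, so the Markov entropy formula reduces to
\begin{align*}
h_{\mu_P}(\sigma)
&=-\pi_1\left(\frac{1}{\varphi}\log\frac{1}{\varphi}+\frac{1}{\varphi^2}\log\frac{1}{\varphi^2}\right)\\
&=\frac{\varphi^2}{\varphi+2}\cdot\frac{\varphi+2}{\varphi^2}\,\log\varphi\\
&=\log\varphi.
\end{align*}
With natural logarithms this is approximately $0.4812$, in agreement with $\log\lambda$ for this matrix.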
[/guided]
[/step]