[proofplan]
Choose a finite forbidden list defining $X$ and take $N$ so that every forbidden word has length at most $N+1$. The new alphabet consists of the admissible length-$N$ words in $X$, and two new symbols are allowed to follow each other exactly when their length-$N$ words overlap in $N-1$ coordinates and form an admissible length-$(N+1)$ word. The conjugacy sends a bi-infinite point of $X$ to its sequence of consecutive length-$N$ blocks. The overlap condition gives an explicit inverse by reading the first coordinate of each block, and the length bound on the forbidden list proves that the reconstructed point lies in $X$.
[/proofplan]
[step:Choose a defining forbidden list with a uniform length bound]
Since $X \subset A^{\mathbb Z}$ is a subshift of finite type, there is a finite set $\mathcal F$ of finite words over $A$ such that
\begin{align*}
X = \{x \in A^{\mathbb Z} : \text{no translate of any word in } \mathcal F \text{ occurs in } x\}.
\end{align*}
Choose an integer $N \geq 1$ such that every word in $\mathcal F$ has length at most $N+1$.
For each integer $m \geq 1$, define the length-$m$ language of $X$ by
\begin{align*}
\mathcal L_m(X) := \{a_0 \cdots a_{m-1} \in A^m : \text{there exist } x \in X \text{ and } k \in \mathbb Z \text{ with } x_{k+i}=a_i \text{ for } 0 \leq i \leq m-1\}.
\end{align*}
Let
\begin{align*}
B := \mathcal L_N(X).
\end{align*}
Because $A$ is finite, $B \subset A^N$ is finite.
[guided]
The finite type hypothesis says that membership in $X$ is controlled by excluding finitely many finite patterns. We fix one such finite forbidden list and record its maximum relevant length. More precisely, choose a finite set $\mathcal F$ of finite words over $A$ such that a sequence $x \in A^{\mathbb Z}$ lies in $X$ exactly when no word from $\mathcal F$ occurs in $x$ at any integer position.
Since $\mathcal F$ is finite, the lengths of its words have a maximum. Choose $N \geq 1$ so that every forbidden word has length at most $N+1$. This choice is the reason the later one-step construction works: checking allowed transitions between length-$N$ blocks checks all words of length $N+1$, and every shorter forbidden word is contained in some length-$(N+1)$ window.
For each $m \geq 1$, define
\begin{align*}
\mathcal L_m(X) := \{a_0 \cdots a_{m-1} \in A^m : \text{there exist } x \in X \text{ and } k \in \mathbb Z \text{ with } x_{k+i}=a_i \text{ for } 0 \leq i \leq m-1\}.
\end{align*}
Thus $\mathcal L_m(X)$ is the set of length-$m$ words that actually occur somewhere in points of $X$. We define the new alphabet by
\begin{align*}
B := \mathcal L_N(X).
\end{align*}
This is finite because it is a subset of the finite set $A^N$.
[/guided]
[/step]
[step:Define the one-step transition matrix by overlap and admissibility]
For $u = u_0 \cdots u_{N-1} \in B$ and $v = v_0 \cdots v_{N-1} \in B$, define $M_{uv} \in \{0,1\}$ by
\begin{align*}
M_{uv} = 1
\end{align*}
if and only if both conditions hold:
\begin{align*}
u_i = v_{i-1} \text{ for every } 1 \leq i \leq N-1
\end{align*}
and
\begin{align*}
u_0 u_1 \cdots u_{N-1} v_{N-1} \in \mathcal L_{N+1}(X).
\end{align*}
Otherwise set $M_{uv}=0$.
Let
\begin{align*}
Y := \{y \in B^{\mathbb Z} : M_{y_k y_{k+1}}=1 \text{ for every } k \in \mathbb Z\}.
\end{align*}
By definition, $Y$ is the topological Markov chain determined by the finite zero-one matrix $M$.
[/step]
[step:Send each point to its sequence of length-$N$ blocks]
Define the map
\begin{align*}
\Phi: X &\to B^{\mathbb Z}
\end{align*}
by declaring, for each $x \in X$ and $k \in \mathbb Z$,
\begin{align*}
\Phi(x)_k := x_k x_{k+1}\cdots x_{k+N-1} \in B.
\end{align*}
This is well-defined because every length-$N$ block of a point of $X$ belongs to $\mathcal L_N(X)=B$.
For each $x \in X$ and $k \in \mathbb Z$, the words $\Phi(x)_k$ and $\Phi(x)_{k+1}$ overlap in their last and first $N-1$ coordinates:
\begin{align*}
(\Phi(x)_k)_i = x_{k+i} = (\Phi(x)_{k+1})_{i-1} \text{ for every } 1 \leq i \leq N-1.
\end{align*}
Their concatenated length-$(N+1)$ word is
\begin{align*}
x_k x_{k+1}\cdots x_{k+N},
\end{align*}
which belongs to $\mathcal L_{N+1}(X)$ because it occurs in $x \in X$. Hence $M_{\Phi(x)_k,\Phi(x)_{k+1}}=1$ for every $k \in \mathbb Z$, so $\Phi(X) \subset Y$.
[/step]
[step:Recover the original sequence from overlapping blocks]
Define the reconstruction map
\begin{align*}
\Psi: Y &\to A^{\mathbb Z}
\end{align*}
by declaring, for each $y \in Y$ and $k \in \mathbb Z$,
\begin{align*}
\Psi(y)_k := (y_k)_0,
\end{align*}
where $(y_k)_0$ denotes the first coordinate of the word $y_k \in B \subset A^N$.
We prove that $\Psi(y) \in X$ for every $y \in Y$. Fix $y \in Y$ and set $x := \Psi(y)$. Since $M_{y_k y_{k+1}}=1$, the overlap condition gives
\begin{align*}
(y_k)_i = (y_{k+1})_{i-1} \text{ for every } 1 \leq i \leq N-1.
\end{align*}
Iterating this overlap identity gives, for every $0 \leq i \leq N-1$,
\begin{align*}
x_{k+i} = (y_{k+i})_0 = (y_k)_i.
\end{align*}
Therefore the length-$(N+1)$ block $x_k x_{k+1}\cdots x_{k+N}$ is exactly the admissible word whose first $N$ coordinates are $y_k$ and whose final coordinate is $(y_{k+1})_{N-1}$. Because $M_{y_k y_{k+1}}=1$, this word belongs to $\mathcal L_{N+1}(X)$.
Now let $w \in \mathcal F$ be any forbidden word, and let $m$ be its length. By the choice of $N$, $m \leq N+1$. If $w$ occurred in $x$ beginning at some coordinate $k$, then $w$ would be a subword of the length-$(N+1)$ block $x_k x_{k+1}\cdots x_{k+N}$ after shifting the starting coordinate if necessary. But every length-$(N+1)$ block of $x$ belongs to $\mathcal L_{N+1}(X)$, so it occurs in some point of $X$ and cannot contain a forbidden word from $\mathcal F$. Hence no word in $\mathcal F$ occurs in $x$, and therefore $x \in X$.
[guided]
The purpose of the inverse map is to turn a sequence of overlapping length-$N$ words back into a sequence of letters. Define
\begin{align*}
\Psi: Y &\to A^{\mathbb Z}
\end{align*}
by reading the first coordinate of each block:
\begin{align*}
\Psi(y)_k := (y_k)_0.
\end{align*}
This gives a bi-infinite sequence over $A$, but we still have to prove that it belongs to $X$.
Fix $y \in Y$ and write $x := \Psi(y)$. Since $y \in Y$, every adjacent pair $y_k,y_{k+1}$ is allowed by the matrix $M$. The definition of $M$ has two consequences. First, the words overlap:
\begin{align*}
(y_k)_i = (y_{k+1})_{i-1} \text{ for every } 1 \leq i \leq N-1.
\end{align*}
Second, the concatenated word formed from $y_k$ and the last symbol of $y_{k+1}$ lies in $\mathcal L_{N+1}(X)$.
The overlap identity tells us that the whole block $y_k$ is visible inside the reconstructed sequence $x$. Indeed, for $0 \leq i \leq N-1$, repeated use of the overlap condition gives
\begin{align*}
x_{k+i} = (y_{k+i})_0 = (y_k)_i.
\end{align*}
Thus the length-$N$ block of $x$ starting at $k$ is $y_k$. Adding the next coordinate, the length-$(N+1)$ block $x_k x_{k+1}\cdots x_{k+N}$ is precisely the concatenation of $y_k$ with the final coordinate $(y_{k+1})_{N-1}$. Since $M_{y_k y_{k+1}}=1$, this length-$(N+1)$ word belongs to $\mathcal L_{N+1}(X)$.
It remains to connect this to the forbidden list. Every forbidden word has length at most $N+1$. If some forbidden word $w \in \mathcal F$ occurred in $x$, then by choosing a length-$(N+1)$ window of $x$ containing that occurrence, we would get a length-$(N+1)$ word of $x$ containing $w$. But every such length-$(N+1)$ word belongs to $\mathcal L_{N+1}(X)$, so it occurs in some point of $X$. No point of $X$ contains a word from $\mathcal F$, by the definition of $\mathcal F$. This contradiction shows that no forbidden word occurs in $x$. Therefore $x \in X$.
[/guided]
[/step]
[step:Verify that the two block maps are inverse conjugacies]
For $x \in X$ and $k \in \mathbb Z$,
\begin{align*}
(\Psi(\Phi(x)))_k = (\Phi(x)_k)_0 = x_k.
\end{align*}
Hence $\Psi \circ \Phi = \operatorname{id}_X$.
For $y \in Y$ and $k \in \mathbb Z$, the $k$-th coordinate of $\Phi(\Psi(y))$ is the length-$N$ word
\begin{align*}
\Psi(y)_k \Psi(y)_{k+1}\cdots \Psi(y)_{k+N-1}.
\end{align*}
By the overlap computation in the previous step, this word is $y_k$. Thus $\Phi(\Psi(y))=y$ for every $y \in Y$, so $\Phi$ is a bijection from $X$ onto $Y$ with inverse $\Psi$.
[/step]
[step:Check continuity and shift-commutation]
Let $\sigma_A: A^{\mathbb Z} \to A^{\mathbb Z}$ and $\sigma_B: B^{\mathbb Z} \to B^{\mathbb Z}$ denote the shift maps
\begin{align*}
(\sigma_A x)_k := x_{k+1}
\end{align*}
and
\begin{align*}
(\sigma_B y)_k := y_{k+1}.
\end{align*}
The map $\Phi$ is a sliding block code with finite window $\{0,1,\dots,N-1\}$, so each coordinate $\Phi(x)_k$ depends only on $x_k,\dots,x_{k+N-1}$. Therefore $\Phi$ is continuous in the [product topology](/page/Product%20Topology). The map $\Psi$ is a one-block code, since $\Psi(y)_k=(y_k)_0$, and is continuous in the product topology.
For $x \in X$ and $k \in \mathbb Z$,
\begin{align*}
(\Phi(\sigma_A x))_k = x_{k+1}x_{k+2}\cdots x_{k+N} = \Phi(x)_{k+1} = (\sigma_B\Phi(x))_k.
\end{align*}
Thus $\Phi \circ \sigma_A = \sigma_B \circ \Phi$. Since $\Phi$ is a continuous bijection from $X$ onto $Y$, its inverse $\Psi$ is continuous, and $\Phi$ commutes with the shifts, $\Phi$ is a topological conjugacy from $X$ onto the topological Markov chain $Y=X_M$.
[/step]