[guided]We now spell out why one simulated transition costs only a constant number of full scans, rather than a number of scans depending on the input. The key point is that $k$ is fixed as part of the theorem, so the single-tape machine may devote finitely many states to remembering one symbol from each virtual tape.
Suppose the encoded configuration has state $q \in Q$ and blocks $\# B_1 \# B_2 \# \cdots \# B_k \#$. The machine $S$ makes one left-to-right scan of this entire encoded word. In block $B_i$, it finds the unique marked symbol $\widehat{a_i}$ and records $a_i \in \Gamma$ in its finite control. Since there are only $|\Gamma|^k$ possible tuples $(a_1,\dots,a_k)$, this storage is finite.
Once the tuple $(a_1,\dots,a_k)$ is known, the transition function of $M$ gives exactly one next instruction: $\delta(q,a_1,\dots,a_k) = (q',b_1,\dots,b_k,d_1,\dots,d_k)$. This tuple is also finite data, because $Q$, $\Gamma$, and $\{L,R,N\}$ are finite. The machine $S$ can therefore remember it in its finite control while it edits the encoded tape.
For a fixed tape index $i$, the required edit is local inside $B_i$: replace the old marked symbol by the written symbol $b_i$, then place the marker on the neighboring cell to the left, the neighboring cell to the right, or the same cell according to whether $d_i=L$, $d_i=R$, or $d_i=N$. If the neighboring cell is outside the represented interval, $S$ inserts one new blank cell at the corresponding endpoint and marks that blank cell. The single-tape implementation may use separate passes for left moves and right moves, or may process each block individually; either way, because there are only the fixed indices $1,\dots,k$, the number of passes is bounded by a constant depending only on $M$ and the encoding. Thus there is a constant $B > 0$ such that one simulated transition uses at most $B$ scans of the current encoded region, plus at most $B$ additional endpoint-editing moves.[/guided]