[proofplan]
We encode the $k$ tapes of $M$ on the single tape of $S$ as $k$ delimited blocks, with one distinguished marked cell in each block indicating the corresponding virtual head position. A bounded number of full scans of this encoded active region lets $S$ read the $k$ scanned symbols, compute the transition of $M$ in finite control, update the marked cells, and enlarge the representation by blank cells when a virtual head moves past an endpoint. After $m$ simulated steps, no virtual head has moved farther than $m$ cells from its initial position, so the encoded active region has length $O(n+m)$; since $m \leq t(n)$ and $t(n) \geq n$ eventually, each simulated step costs $O(t(n))$, and $t(n)$ simulated steps cost $O(t(n)^2)$.
[/proofplan]
[step:Encode the multi-tape configuration on one tape]
Let $Q$ be the finite state set of $M$, let $\Gamma$ be its finite tape alphabet, let $\blank \in \Gamma$ be the blank symbol, and let $\delta$ be the deterministic transition function of $M$. We construct a deterministic single-tape machine $S$ whose tape alphabet contains delimiters and, for every $a \in \Gamma$, two symbols $a$ and $\widehat{a}$. The unmarked symbol $a$ represents a virtual tape cell containing $a$, while the marked symbol $\widehat{a}$ represents a virtual tape cell containing $a$ and currently scanned by the corresponding virtual head.
At any simulated time $m \in \mathbb{N} \cup \{0\}$, the tape of $S$ contains an encoding of the form contains the word $\# B_1 \# B_2 \# \cdots \# B_k \#$, where each $B_i$ is a finite word over $\Gamma \cup \widehat{\Gamma}$, where $\widehat{\Gamma} := \{\widehat{a} : a \in \Gamma\}$, and where each block $B_i$ contains exactly one marked symbol. The block $B_i$ records a finite interval of the $i$-th virtual tape of $M$ containing every nonblank cell that has been visited or written so far and containing the current virtual head position. Cells outside the represented interval are understood to contain $\blank$.
On input $x = x_1 \cdots x_n \in \Sigma^*$, the machine $S$ first writes the initial encoding of the $k$ virtual tapes: the first block contains the input word on tape $1$, with the first input cell marked if $n \geq 1$ and with a marked blank cell if $n = 0$; each remaining block contains a single marked blank cell. This initialization takes at most $A(n+1)$ moves for a constant $A > 0$ depending only on $k$, $\Gamma$, and the chosen encoding.
[/step]
[step:Simulate one transition by a bounded number of scans]
Assume that the tape of $S$ encodes a configuration of $M$ in state $q \in Q$. The machine $S$ scans the full encoded region from left to right and records in its finite control the unique tuple the unique tuple $(a_1,\dots,a_k) \in \Gamma^k$, where $\widehat{a_i}$ is the marked symbol found in block $B_i$. This is possible because $k$ and $\Gamma$ are fixed, so the set $\Gamma^k$ is finite.
The deterministic transition function of $M$ determines a unique tuple $\delta(q,a_1,\dots,a_k) = (q',b_1,\dots,b_k,d_1,\dots,d_k)$, where $q' \in Q$ is the next state, each $b_i \in \Gamma$ is the symbol written on tape $i$, and each $d_i \in \{L,R,N\}$ is the movement direction of the $i$-th virtual head. The machine $S$ stores this finite tuple in its finite control.
For each $i \in \{1,\dots,k\}$, the machine $S$ performs a bounded number of scans of block $B_i$ to replace the marked symbol $\widehat{a_i}$ by $b_i$, move the marker one cell left, one cell right, or not at all according to $d_i$, and insert an additional blank cell at the left or right endpoint if the marker would otherwise leave the represented interval. Because $k$ is fixed, doing this for all tapes requires at most $B$ complete scans of the encoded region for a constant $B > 0$ independent of $x$, $n$, and the simulated time $m$.
[guided]
We now spell out why one simulated transition costs only a constant number of full scans, rather than a number of scans depending on the input. The key point is that $k$ is fixed as part of the theorem, so the single-tape machine may devote finitely many states to remembering one symbol from each virtual tape.
Suppose the encoded configuration has state $q \in Q$ and blocks $\# B_1 \# B_2 \# \cdots \# B_k \#$. The machine $S$ makes one left-to-right scan of this entire encoded word. In block $B_i$, it finds the unique marked symbol $\widehat{a_i}$ and records $a_i \in \Gamma$ in its finite control. Since there are only $|\Gamma|^k$ possible tuples $(a_1,\dots,a_k)$, this storage is finite.
Once the tuple $(a_1,\dots,a_k)$ is known, the transition function of $M$ gives exactly one next instruction: $\delta(q,a_1,\dots,a_k) = (q',b_1,\dots,b_k,d_1,\dots,d_k)$. This tuple is also finite data, because $Q$, $\Gamma$, and $\{L,R,N\}$ are finite. The machine $S$ can therefore remember it in its finite control while it edits the encoded tape.
For a fixed tape index $i$, the required edit is local inside $B_i$: replace the old marked symbol by the written symbol $b_i$, then place the marker on the neighboring cell to the left, the neighboring cell to the right, or the same cell according to whether $d_i=L$, $d_i=R$, or $d_i=N$. If the neighboring cell is outside the represented interval, $S$ inserts one new blank cell at the corresponding endpoint and marks that blank cell. The single-tape implementation may use separate passes for left moves and right moves, or may process each block individually; either way, because there are only the fixed indices $1,\dots,k$, the number of passes is bounded by a constant depending only on $M$ and the encoding. Thus there is a constant $B > 0$ such that one simulated transition uses at most $B$ scans of the current encoded region, plus at most $B$ additional endpoint-editing moves.
[/guided]
[/step]
[step:Bound the length of the encoded active region after $m$ simulated steps]
Let $\ell_m$ denote the length of the encoded region on the tape of $S$ after $m$ simulated steps of $M$, including delimiters. During one transition of $M$, each virtual head moves by at most one cell. Hence, after $m$ steps, the head on tape $i$ has visited only cells at distance at most $m$ from its initial position. On the first tape, the initially represented input has length $n$, so the represented interval has length at most $n+2m+1$. On each of the remaining $k-1$ tapes, the represented interval has length at most $2m+1$.
Therefore there is a constant $D > 0$, depending only on $k$ and the encoding alphabet, such that $\ell_m \leq D(n+m+1)$ for every $m \geq 0$.
[/step]
[step:Sum the simulation cost over all simulated steps]
Let $x \in \Sigma^*$ have length $n$, and let $T(x)$ be the number of steps taken by $M$ on input $x$. By hypothesis, $T(x) \leq t(n)$. The initialization cost is at most $A(n+1)$. During the $m$-th simulated transition, where $0 \leq m < T(x)$, the previous step gives $\ell_m \leq D(n+m+1) \leq D(n+t(n)+1)$. Thus the cost of this simulated transition is at most $E(n+t(n)+1)$ for a constant $E > 0$ depending only on $M$ and the chosen encoding.
Choose $N \in \mathbb{N}$ such that $t(n) \geq n$ for all $n \geq N$ and $t(n) \geq 1$ for all $n \geq N$. For such $n$, $n+t(n)+1 \leq 3t(n)$. Hence the total running time of $S$ on $x$ is at most $A(n+1) + T(x) \cdot 3E t(n)$. Using $T(x) \leq t(n)$ and $n+1 \leq 2t(n) \leq 2t(n)^2$ for $n \geq N$, we obtain $A(n+1) + T(x) \cdot 3E t(n) \leq 2A t(n)^2 + 3E t(n)^2$. Define $C := 2A + 3E$. Then, for every input $x$ of length $n \geq N$, the machine $S$ halts within $C t(n)^2$ steps.
[/step]
[step:Conclude that the simulated and original machines decide the same language]
The construction preserves the complete configuration of $M$ at every simulated time: the state stored by $S$ is the current state of $M$, each block records exactly the represented portion of the corresponding virtual tape, and the unique marked symbol in each block records the corresponding virtual head position. The transition simulation applies precisely the transition function $\delta$ of $M$. By induction on the simulated time $m$, after $m$ simulated steps the encoded configuration of $S$ represents the configuration of $M$ after $m$ steps.
Therefore, when $M$ enters an accepting or rejecting halting state, $S$ enters the same accepting or rejecting halting outcome. Since $M$ decides $L$, the machine $S$ also decides $L$. Together with the time estimate above, this proves that $S$ decides $L$ in time $O(t(n)^2)$.
[/step]