[proofplan]
Fix a deterministic decider $M$ using $O(f(n))$ work-tape space, and bound the number of configurations that $M$ can enter on inputs of length $n$. A configuration is determined by the finite control state, input-head position, work-tape contents, and work-tape head positions. The work-tape part contributes exponentially many possibilities in $f(n)$, while the input-head and head-position factors are absorbed into the exponent because $f(n)\geq \log_2 n$. Since a deterministic decider cannot repeat a nonhalting configuration before halting, the original decider itself halts within the number of possible configurations, which is $2^{O(f(n))}$.
[/proofplan]
[step:Fix a space-bounded deterministic decider and name its machine constants]
Let $L\in \operatorname{SPACE}(f(n))$. We use the standard deterministic space model with a read-only input tape, finitely many work tapes, and finite work-tape alphabet; changing among the usual deterministic models changes the space and time bounds only by constant or polynomial factors, which are absorbed by the notation in the theorem. By definition, there is a deterministic Turing-machine decider $M$ deciding $L$ and constants $c>0$ and $d\geq 0$ such that, on every input $x\in\{0,1\}^*$ of length $n\geq 1$, the number of work-tape cells visited by $M$ is at most the value of the integer-valued function $s:\mathbb{N}\to\mathbb{N}$ defined by
\begin{align*}
s(n):=\lceil c f(n)+d\rceil.
\end{align*}
This display is a single-row align environment and contains no TeX row separator.
Fix the following machine-dependent constants. Let $Q$ be the finite state set of $M$, let $q:=|Q|$, let $\Gamma$ be the finite work-tape alphabet, let $a:=|\Gamma|$, and let $r\in\mathbb{N}$ be the number of work tapes. These constants depend on $M$ but not on the input length $n$.
[/step]
[step:Count all configurations reachable under the space bound]
For an input $x$ of length $n$, a configuration of $M$ is determined by: the current state in $Q$; the input-head position, which has at most $n+1$ possibilities; for each of the $r$ work tapes, the contents of the first $s(n)+1$ possible visited cells; and, for each of the $r$ work tapes, the work-head position, which has at most $s(n)+1$ possibilities. Therefore the number $N(n)$ of configurations compatible with the space bound satisfies
\begin{align*}
N(n)\leq q(n+1)(s(n)+1)^r a^{r(s(n)+1)}.
\end{align*}
[guided]
We count configurations because a deterministic machine has no hidden memory beyond its current configuration. Once the input $x$ is fixed, the transition function determines the next configuration uniquely from the current one.
Recall the notation fixed in the previous step: $M$ is the deterministic decider, $Q$ is its finite state set, $q:=|Q|$, $\Gamma$ is its finite work-tape alphabet, $a:=|\Gamma|$, $r\in\mathbb{N}$ is the number of work tapes, and $s:\mathbb{N}\to\mathbb{N}$ is the integer-valued space bound with $s(n)=\lceil c f(n)+d\rceil$. Let $N(n)$ denote the number of configurations that can occur while $M$ runs on inputs of length $n$. The finite control contributes $q$ choices. The input tape is read-only and already fixed by $x$, so the input tape contents do not vary; only the input-head position matters, and there are at most $n+1$ such positions if the head may sit on one of the $n$ input symbols or at an endmarker. The space bound says that each work tape uses at most $s(n)$ cells up to harmless endpoint conventions, so allowing $s(n)+1$ possible cells only enlarges the count. Thus each work tape has at most $a^{s(n)+1}$ possible contents and at most $s(n)+1$ possible head positions.
Multiplying these independent choices gives
\begin{align*}
N(n)\leq q(n+1)(s(n)+1)^r a^{r(s(n)+1)}.
\end{align*}
This is an upper bound, not an exact count. It may include impossible configurations, but an upper bound is enough for the simulation time estimate.
[/guided]
[/step]
[step:Absorb the polynomial factors into an exponential in the space bound]
We show that $N(n)\leq 2^{C f(n)}$ for a constant $C>0$ depending only on $M$ and the constants in the space bound. Taking base-two logarithms of the configuration bound gives
\begin{align*}
\log_2 N(n)\leq \log_2 q+\log_2(n+1)+r\log_2(s(n)+1)+r(s(n)+1)\log_2 a.
\end{align*}
Since $f(n)\geq \log_2 n$ and $f(n)\geq 1$, we have $n+1\leq 2n$ for $n\geq 1$, so
\begin{align*}
\log_2(n+1)\leq 1+\log_2 n\leq 2f(n).
\end{align*}
Thus we may take $C_1:=2$. Since $s(n)=\lceil c f(n)+d\rceil$, the ceiling estimate gives
\begin{align*}
s(n)+1\leq c f(n)+d+2\leq (c+d+2)f(n),
\end{align*}
where the last inequality uses $f(n)\geq 1$. Define $C_2:=c+d+2$. Because $s(n)+1\geq 1$ and $\log_2 t\leq t$ for $t\geq 1$, we get
\begin{align*}
\log_2(s(n)+1)\leq s(n)+1\leq C_2 f(n).
\end{align*}
Thus we may take $C_3:=C_2$. Finally, since $f(n)\geq 1$, the constant term satisfies $\log_2 q\leq (\log_2 q)f(n)$. Substituting these bounds into the logarithmic estimate yields
\begin{align*}
\log_2 N(n)\leq \left(\log_2 q+C_1+rC_3+rC_2\log_2 a\right)f(n).
\end{align*}
Defining $C:=\log_2 q+C_1+rC_3+rC_2\log_2 a$, we have $C>0$ and
\begin{align*}
\log_2 N(n)\leq C f(n).
\end{align*}
Hence
\begin{align*}
N(n)\leq 2^{C f(n)}.
\end{align*}
[/step]
[step:Use determinism to rule out repeated nonhalting configurations]
Fix an input $x$ of length $n$. Suppose the computation of $M$ on $x$ visits the same nonhalting configuration at two distinct times $t_1<t_2$. Since $M$ is deterministic, the entire future computation starting from the configuration at time $t_1$ is identical to the future computation starting from the same configuration at time $t_2$. Thus the computation repeats the same cycle forever and never reaches a halting configuration.
But $M$ is a decider, so it halts on every input. Therefore, on input $x$, the computation of $M$ cannot repeat a nonhalting configuration. Since there are at most $N(n)$ configurations compatible with the space bound, $M$ halts within at most $N(n)$ steps up to a fixed additive convention for counting the initial configuration.
[/step]
[step:Conclude that the original decider already has exponential running time]
Let $S$ be the same deterministic Turing machine as $M$, viewed now as a time-bounded decider rather than only as a space-bounded decider. This definition does not require $S$ to compute $f(n)$, compute the constant $C$, or know the configuration bound in advance.
For every input $x\in\{0,1\}^*$ of length $n\geq 1$, the previous step shows that the computation of $M$ on $x$ cannot repeat a nonhalting configuration. Since the configuration count gives $N(n)\leq 2^{C f(n)}$, the machine $M$, and hence $S$, halts on $x$ after at most $N(n)$ steps up to a fixed additive convention for counting the initial configuration. Therefore the running time of $S$ is bounded by $2^{O(f(n))}$, and $S$ decides $L$ because $M$ decides $L$.
Since $L\in\operatorname{SPACE}(f(n))$ was arbitrary, we conclude
\begin{align*}
\operatorname{SPACE}(f(n)) \subseteq \operatorname{TIME}(2^{O(f(n))}).
\end{align*}
[/step]