[proofplan]
We prove the inclusion directly at the machine level, using the standard convention that space counts only work-tape cells and not the read-only input tape. If a deterministic Turing machine halts after at most a constant multiple of $f(n)$ steps, then on each work tape it can visit only linearly many cells in the number of elapsed steps, because a tape head moves by at most one cell per transition. Since $f: \mathbb{N} \to \mathbb{N}$ takes positive integer values, the additive constants in the visited-cell bound are absorbed into a larger multiple of $f(n)$, so the same machine decides the language using $O(f(n))$ space.
[/proofplan]
[step:Fix a time-bounded machine deciding the language]
Let $L \subseteq \{0,1\}^*$ be a language with $L \in \operatorname{TIME}(f(n))$. We use the standard multitape deterministic Turing-machine convention in which the input tape is read-only and space is the number of work-tape cells visited during the computation. By the definition of $\operatorname{TIME}(f(n))$, there exist a deterministic Turing machine $M$, a constant $C > 0$, and a constant $C_0 \geq 0$ such that $M$ decides $L$ and, for every input word $x \in \{0,1\}^*$, the computation of $M$ on $x$ halts within $C f(|x|) + C_0$ steps.
Let $k \in \mathbb{N}$ denote the number of work tapes of $M$. For an input $x$, let $T_M(x) \in \mathbb{N}$ denote the number of transitions made by $M$ before halting on $x$, and let $S_M(x) \in \mathbb{N}$ denote the total number of distinct work-tape cells visited during that computation, summed over all $k$ work tapes.
[/step]
[step:Bound the number of visited work-tape cells by the number of steps]
At time $0$, each work-tape head is positioned on one initial cell, so at most $k$ work-tape cells have been visited. During one transition, each work-tape head either stays put or moves to an adjacent cell, so each work tape can contribute at most one newly visited cell during that transition. Hence one transition can increase the total number of distinct visited work-tape cells by at most $k$.
Therefore, for every input $x \in \{0,1\}^*$, we have $S_M(x) \leq k(T_M(x) + 1)$.
Using the time bound on $M$, we obtain $S_M(x) \leq k(C f(|x|) + C_0 + 1)$.
Define $C' = kC$ and $C_0' = k(C_0 + 1)$. Then $S_M(x) \leq C' f(|x|) + C_0'$. Since $f: \mathbb{N} \to \mathbb{N}$, we have $f(n) \geq 1$ for every $n \in \mathbb{N}$. Hence, for every input length in the complexity parameter range,
\begin{align*}
S_M(x) \leq (C' + C_0') f(|x|).
\end{align*}
Thus $M$ uses at most $C'' f(|x|)$ work-tape cells on input $x$, where $C'' = C' + C_0' = kC + k(C_0 + 1)$.
[guided]
The point is that space usage cannot grow faster than elapsed time. Let $x \in \{0,1\}^*$ be fixed, and let $T_M(x)$ be the number of transitions made before $M$ halts on $x$. Let $S_M(x)$ be the total number of different work-tape cells visited during this computation, summed over the $k$ work tapes.
Initially, before any transition is made, each work-tape head is scanning one cell. Therefore the computation has visited at most $k$ work-tape cells at time $0$. Now consider a single transition. On each work tape, the head either remains on its current cell or moves one cell left or right. Therefore a single work tape can add at most one new cell during that transition. Since there are $k$ work tapes, one transition can add at most $k$ new visited work-tape cells in total.
After $T_M(x)$ transitions, this gives $S_M(x) \leq k + kT_M(x)$.
Equivalently, $S_M(x) \leq k(T_M(x) + 1)$.
Because $M$ was chosen as a time-$O(f(n))$ decider for $L$, there are constants $C > 0$ and $C_0 \geq 0$ such that $T_M(x) \leq C f(|x|) + C_0$.
Substituting this time bound into the cell-count estimate gives $S_M(x) \leq k(C f(|x|) + C_0 + 1)$.
Define constants $C' = kC$ and $C_0' = k(C_0 + 1)$. Then $S_M(x) \leq C' f(|x|) + C_0'$.
Now we convert this affine bound into the pure multiplicative form required for $O(f(n))$. Because $f: \mathbb{N} \to \mathbb{N}$ and natural numbers start at $1$, we have $f(n) \geq 1$ for every $n \in \mathbb{N}$. Therefore, for every input length in the complexity parameter range,
\begin{align*}
S_M(x) \leq C' f(|x|) + C_0' f(|x|).
\end{align*}
Hence
\begin{align*}
S_M(x) \leq (C' + C_0') f(|x|).
\end{align*}
Define $C'' = C' + C_0' = kC + k(C_0 + 1)$. Then $S_M(x) \leq C'' f(|x|)$, which is exactly an $O(f(n))$ work-tape space bound for the same deterministic machine.
[/guided]
[/step]
[step:Pass from the machine bound to the class inclusion]
The machine $M$ decides $L$ and uses at most $C'' f(|x|)$ work-tape cells on every input $x \in \{0,1\}^*$. By the definition of $\operatorname{SPACE}(f(n))$ under the same work-tape space convention, this means $L \in \operatorname{SPACE}(f(n))$.
Since $L \in \operatorname{TIME}(f(n))$ was arbitrary, every language in $\operatorname{TIME}(f(n))$ belongs to $\operatorname{SPACE}(f(n))$. Hence $\operatorname{TIME}(f(n)) \subseteq \operatorname{SPACE}(f(n))$.
[/step]