[proofplan]
We construct the approximating sequence explicitly. For each $k \ge 1$, partition the range $[0, k)$ into $k \cdot 2^k$ intervals of equal length $2^{-k}$, and define $s_k$ to take the value of the left endpoint on each corresponding preimage set, with a cap of $k$ on the region $\{f \ge k\}$. Measurability of $s_k$ follows from measurability of $f$. Monotonicity $s_k \le s_{k+1}$ holds because refining the partition and raising the cap can only increase the approximant. Pointwise convergence follows from the estimate $0 \le f(x) - s_k(x) \le 2^{-k}$ on $\{f \le k\}$, and uniform convergence when $f$ is bounded follows from the same estimate applied globally once $k$ exceeds $\|f\|_{L^\infty}$.
[/proofplan]
[step:Construct the approximating simple functions by dyadic partition of the range]
For each integer $k \ge 1$, define $s_k: X \to [0, \infty)$ by
\begin{align*}
s_k(x) := \begin{cases} \displaystyle \frac{i-1}{2^k} & \text{if } \frac{i-1}{2^k} \le f(x) < \frac{i}{2^k}, \quad i = 1, 2, \ldots, k \cdot 2^k, \\[6pt] k & \text{if } f(x) \ge k. \end{cases}
\end{align*}
Equivalently, define the sets
\begin{align*}
A_{k,i} &:= \left\{x \in X : \frac{i-1}{2^k} \le f(x) < \frac{i}{2^k}\right\}, \quad i = 1, \ldots, k \cdot 2^k, \\
A_{k,0} &:= \{x \in X : f(x) \ge k\},
\end{align*}
and write
\begin{align*}
s_k = \sum_{i=1}^{k \cdot 2^k} \frac{i-1}{2^k}\, \mathbb{1}_{A_{k,i}} + k\, \mathbb{1}_{A_{k,0}}.
\end{align*}
[guided]
The idea is to discretise the range of $f$, not its domain. For each level $k$, we tile the interval $[0, k)$ into subintervals of width $2^{-k}$ and round $f$ down to the nearest grid point. Values of $f$ at or above $k$ are capped at $k$.
As $k$ increases, two things happen simultaneously: the grid spacing halves (from $2^{-k}$ to $2^{-(k+1)}$), and the cap rises (from $k$ to $k+1$). The finer grid reduces the approximation error on $\{f < k\}$, while the higher cap captures more of the range of $f$.
Concretely, the sets $A_{k,1}, \ldots, A_{k, k \cdot 2^k}, A_{k,0}$ partition $X$: every $x \in X$ falls into exactly one of them (since $f(x) \ge 0$, the value $f(x)$ either lies in some half-open interval $[(i-1)/2^k,\, i/2^k)$ for $1 \le i \le k \cdot 2^k$, or satisfies $f(x) \ge k$). So $s_k$ is well-defined and takes finitely many values, hence is a simple function.
[/guided]
[/step]
[step:Verify that each $s_k$ is $\mathcal{F}$-measurable]
Each set $A_{k,i}$ for $i = 1, \ldots, k \cdot 2^k$ has the form
\begin{align*}
A_{k,i} = f^{-1}\!\left(\left[\frac{i-1}{2^k},\, \frac{i}{2^k}\right)\right) = \left\{f \ge \frac{i-1}{2^k}\right\} \cap \left\{f < \frac{i}{2^k}\right\}.
\end{align*}
Since $f$ is $\mathcal{F}$-measurable, both $\{f \ge c\}$ and $\{f < c\}$ belong to $\mathcal{F}$ for every $c \in \mathbb{R}$ (the former as the complement of $\{f < c\}$, and $\{f < c\} = \bigcup_{m=1}^\infty \{f \le c - 1/m\}$, or directly from the [Generator Criterion for Measurability](/theorems/525)). Their intersection $A_{k,i}$ therefore lies in $\mathcal{F}$. Similarly, $A_{k,0} = \{f \ge k\} \in \mathcal{F}$.
Since $s_k$ is a finite linear combination of indicator functions of $\mathcal{F}$-sets, it is a simple $\mathcal{F}$-measurable function.
[/step]
[step:Prove the monotonicity $s_k \le s_{k+1}$ pointwise]
Fix $x \in X$. We consider two cases.
**Case 1: $f(x) \ge k$.** Then $s_k(x) = k$. Since $f(x) \ge k$ and $k + 1 > k \ge f(x)$ need not hold, we split further. If $f(x) \ge k + 1$, then $s_{k+1}(x) = k + 1 \ge k = s_k(x)$. If $k \le f(x) < k + 1$, then $s_{k+1}(x)$ is the left endpoint of the dyadic interval of width $2^{-(k+1)}$ containing $f(x)$ in $[0, k+1)$. Since $f(x) \ge k$ and the left endpoint of the interval containing $f(x)$ satisfies $s_{k+1}(x) \le f(x)$, we get $s_{k+1}(x) \ge k = s_k(x)$ because the left endpoint of any subinterval of $[k, k+1)$ is at least $k$.
**Case 2: $f(x) < k$.** Then $f(x)$ lies in some interval $[(i-1)/2^k,\, i/2^k)$ and $s_k(x) = (i-1)/2^k$. In the partition at level $k+1$, the interval $[(i-1)/2^k,\, i/2^k)$ is subdivided into two halves:
\begin{align*}
\left[\frac{2(i-1)}{2^{k+1}},\, \frac{2i - 1}{2^{k+1}}\right) \quad \text{and} \quad \left[\frac{2i - 1}{2^{k+1}},\, \frac{2i}{2^{k+1}}\right).
\end{align*}
If $f(x)$ falls in the left half, then $s_{k+1}(x) = \frac{2(i-1)}{2^{k+1}} = \frac{i-1}{2^k} = s_k(x)$. If $f(x)$ falls in the right half, then $s_{k+1}(x) = \frac{2i-1}{2^{k+1}} > \frac{2(i-1)}{2^{k+1}} = s_k(x)$. In both sub-cases, $s_{k+1}(x) \ge s_k(x)$.
[guided]
The key geometric point is that the level-$(k+1)$ partition *refines* the level-$k$ partition. Every interval $[(i-1)/2^k,\, i/2^k)$ at level $k$ is split into exactly two subintervals at level $k+1$, and the left endpoints of both subintervals are at least $(i-1)/2^k$ (the left endpoint of the parent interval). So the "round-down" value can only increase or stay the same under refinement.
For the cap region $\{f \ge k\}$: at level $k$ the approximant is capped at $k$, but at level $k+1$ the cap is raised to $k+1$. If $f(x) \ge k$, then at level $k+1$ either $f(x)$ is captured by a dyadic subinterval of $[k, k+1)$ (giving $s_{k+1}(x) \ge k$) or $f(x) \ge k+1$ (giving $s_{k+1}(x) = k+1 > k$). Either way, $s_{k+1}(x) \ge k = s_k(x)$.
[/guided]
[/step]
[step:Establish pointwise convergence $s_k(x) \to f(x)$ for every $x \in X$]
Fix $x \in X$. We must show $s_k(x) \to f(x)$ as $k \to \infty$.
**Case 1: $f(x) < \infty$.** Choose $K \in \mathbb{N}$ with $K > f(x)$. For every $k \ge K$, we have $f(x) < k$, so $x \in A_{k,i}$ for some $i \in \{1, \ldots, k \cdot 2^k\}$, and
\begin{align*}
0 \le f(x) - s_k(x) < \frac{1}{2^k}.
\end{align*}
This follows directly from the construction: on $A_{k,i}$, $s_k(x) = (i-1)/2^k$ and $f(x) < i/2^k$, so the gap is at most $1/2^k$. Since $1/2^k \to 0$ as $k \to \infty$, we conclude $s_k(x) \to f(x)$.
**Case 2: $f(x) = \infty$.** For every $k \ge 1$, $f(x) \ge k$, so $s_k(x) = k$. Thus $s_k(x) = k \to \infty = f(x)$.
In both cases, the non-negativity $s_k \ge 0$ together with $s_k \le f$ (which holds by construction: on $A_{k,i}$, $s_k = (i-1)/2^k \le f$, and on $A_{k,0}$, $s_k = k \le f$) gives $0 \le s_k(x) \le f(x)$ for all $k$.
[guided]
The convergence rate in Case 1 is worth highlighting: once $k$ is large enough that $f(x) < k$ (so the cap is irrelevant), the error is bounded by the mesh width $2^{-k}$. This is a *uniform* bound over all $x$ with $f(x) < k$. The only obstruction to uniform convergence over all of $X$ is the cap region $\{f \ge k\}$, where the error can be as large as $f(x) - k$, which is unbounded if $f$ is unbounded.
[/guided]
[/step]
[step:Upgrade to uniform convergence when $f$ is bounded]
Suppose $f$ is bounded, say $\|f\|_{L^\infty(X)} \le M$ for some $M < \infty$. Choose $K \in \mathbb{N}$ with $K > M$. Then for every $k \ge K$ and every $x \in X$, we have $f(x) \le M < k$, so $x \notin A_{k,0}$ and the cap is never active. The estimate from the previous step gives
\begin{align*}
0 \le f(x) - s_k(x) < \frac{1}{2^k} \quad \text{for all } x \in X, \; k \ge K.
\end{align*}
Taking the supremum over $x \in X$:
\begin{align*}
\|s_k - f\|_{L^\infty(X)} \le \frac{1}{2^k} \to 0 \quad \text{as } k \to \infty.
\end{align*}
This is part (iii).
[/step]