[proofplan]
Define $N := \sum_{\ell = 1}^{|w|} |\Omega|^\ell$. We must show: $w \in \mathcal{L}(G)$ iff $w$ has a $G$-derivation of length at most $N$. The reverse direction is immediate — any derivation of $w$ witnesses $w \in \mathcal{L}(G)$. For the forward direction, suppose $w \in \mathcal{L}(G)$ and pick a $G$-derivation of $w$ of minimal length. We show the length is at most $N$ via two structural observations: (i) the noncontracting property ensures the sequence of string lengths is monotone nondecreasing and is bounded above by $|w|$, so at each constant-length plateau all the intermediate strings belong to a finite set; (ii) minimality of the derivation forces all intermediate strings to be distinct, because a repeat would let us excise a cycle and shorten the derivation. A straightforward counting argument then bounds the total number of intermediate strings by $N$.
[/proofplan]
[step:Define $N$ and reduce to the forward direction]
Set
\begin{align*}
N &:= \sum_{\ell = 1}^{|w|} |\Omega|^\ell \in \mathbb{N}.
\end{align*}
This is a finite integer since both $|w|$ and $|\Omega|$ are finite, and it depends only on $|w|$ and $|\Omega|$ as asserted. We prove the two directions of the biconditional.
**Backward direction ($\Leftarrow$).** Suppose $w$ has a $G$-derivation of length at most $N$. Then by the definition of $\mathcal{L}(G)$ — the set of words reachable from $S$ by some finite $G$-derivation — we have $w \in \mathcal{L}(G)$.
It therefore remains to prove the forward direction: if $w \in \mathcal{L}(G)$, then $w$ admits a $G$-derivation of length at most $N$.
[/step]
[step:Fix a minimal-length derivation of $w$]
Suppose $w \in \mathcal{L}(G)$. Then the set
\begin{align*}
\mathcal{D}_w &:= \{\, n \in \mathbb{N} \;\big|\; \text{there exists a $G$-derivation } S = \sigma_0 \xrightarrow{G}_1 \sigma_1 \xrightarrow{G}_1 \cdots \xrightarrow{G}_1 \sigma_n = w \,\}
\end{align*}
is non-empty (it contains the length of the derivation witnessing $w \in \mathcal{L}(G)$). By the well-ordering of $\mathbb{N}$, $\mathcal{D}_w$ has a least element $n \in \mathbb{N}$. Fix a derivation
\begin{align*}
S = \sigma_0 \xrightarrow{G}_1 \sigma_1 \xrightarrow{G}_1 \sigma_2 \xrightarrow{G}_1 \cdots \xrightarrow{G}_1 \sigma_n &= w
\end{align*}
of length $n$. By construction, no derivation of $w$ from $S$ in $G$ is strictly shorter than this one.
[/step]
[step:Show the intermediate strings are pairwise distinct]
We claim that $\sigma_0, \sigma_1, \dots, \sigma_n$ are pairwise distinct.
Suppose for contradiction that $\sigma_r = \sigma_s$ for some $0 \leq r < s \leq n$. Consider the excised sequence
\begin{align*}
\sigma_0 \xrightarrow{G}_1 \sigma_1 \xrightarrow{G}_1 \cdots \xrightarrow{G}_1 \sigma_r = \sigma_s \xrightarrow{G}_1 \sigma_{s+1} \xrightarrow{G}_1 \cdots \xrightarrow{G}_1 \sigma_n,
\end{align*}
obtained by deleting $\sigma_{r+1}, \dots, \sigma_s$ from the original derivation. Each arrow in this excised sequence is either an original arrow from the first $r$ steps or from the last $n - s$ steps. In particular, the joining step $\sigma_r \xrightarrow{G}_1 \sigma_{s+1}$ is well-defined because $\sigma_r = \sigma_s$, so the original arrow $\sigma_s \xrightarrow{G}_1 \sigma_{s+1}$ becomes $\sigma_r \xrightarrow{G}_1 \sigma_{s+1}$ under the identification $\sigma_r = \sigma_s$. The first and last strings are unchanged at $S$ and $w = \sigma_n$ respectively. The length of the excised derivation is $n - (s - r) < n$.
But this contradicts the minimality of $n$ chosen in the previous step: we have exhibited a shorter derivation of $w$ from $S$ in $G$. Hence no such repeat exists, and the strings $\sigma_0, \sigma_1, \dots, \sigma_n$ are pairwise distinct.
[/step]
[step:Bound the lengths of the intermediate strings]
We claim
\begin{align*}
|\sigma_0| \leq |\sigma_1| \leq \cdots \leq |\sigma_n| &= |w|,
\end{align*}
and in particular $1 \leq |\sigma_i| \leq |w|$ for every $i$.
For the nondecreasing inequality, fix $i \in \{0, 1, \dots, n-1\}$. The reduction $\sigma_i \xrightarrow{G}_1 \sigma_{i+1}$ means there exist $\alpha', \beta' \in \Omega^\star$ and a rule $(\gamma \to \delta) \in P$ with
\begin{align*}
\sigma_i &= \alpha' \gamma \beta', & \sigma_{i+1} &= \alpha' \delta \beta'.
\end{align*}
Since $G$ is noncontracting, $|\gamma| \leq |\delta|$. Length of concatenation being additive,
\begin{align*}
|\sigma_{i+1}| - |\sigma_i| &= (|\alpha'| + |\delta| + |\beta'|) - (|\alpha'| + |\gamma| + |\beta'|) = |\delta| - |\gamma| \geq 0.
\end{align*}
Hence $|\sigma_i| \leq |\sigma_{i+1}|$.
The terminal value $|\sigma_n| = |w|$ is just the hypothesis $\sigma_n = w$. The initial value satisfies $|\sigma_0| = |S| = 1$ since $S$ is a single variable in $\Omega$. Combining,
\begin{align*}
1 = |\sigma_0| \leq |\sigma_1| \leq \cdots \leq |\sigma_n| &= |w|,
\end{align*}
so each $|\sigma_i| \in \{1, 2, \dots, |w|\}$.
[guided]
The noncontracting property — $|\gamma| \leq |\delta|$ for every rule — is used at exactly one place: the length of the string never decreases along a single reduction step. Applied $n$ times along the chain, it forces the length sequence to be monotone nondecreasing.
Why does this matter? It guarantees that we *cannot* reach arbitrarily long strings during the derivation: once the chain exits the length regime $\{1, \dots, |w|\}$, it cannot come back. Since the final string $\sigma_n$ has length $|w|$, every intermediate string must also have length at most $|w|$. Without the noncontracting property, derivations could pass through arbitrarily long strings — for example through "scratchwork" — and the bound $N$ would fail to hold.
The lower bound $|\sigma_0| = 1$ uses that $S$ is a *single* variable — a single-letter string in $\Omega$. This is a convention of grammar definitions rather than a structural fact, but it lets us restrict attention to lengths in $\{1, \dots, |w|\}$ rather than $\{0, \dots, |w|\}$, matching the range of the sum defining $N$.
[/guided]
[/step]
[step:Count the intermediate strings and conclude]
Partition the intermediate strings by their length: for $\ell \in \{1, 2, \dots, |w|\}$ define
\begin{align*}
I_\ell &:= \{\, i \in \{0, 1, \dots, n\} \;\big|\; |\sigma_i| = \ell \,\}.
\end{align*}
From the previous step, $I_1, I_2, \dots, I_{|w|}$ partition $\{0, 1, \dots, n\}$, so
\begin{align*}
n + 1 &= \sum_{\ell = 1}^{|w|} |I_\ell|.
\end{align*}
For each $\ell$, the map
\begin{align*}
I_\ell &\to \Omega^\ell \\
i &\mapsto \sigma_i
\end{align*}
is injective because the intermediate strings are pairwise distinct (third step). The codomain $\Omega^\ell$ has exactly $|\Omega|^\ell$ elements, being the set of length-$\ell$ words over the finite alphabet $\Omega$. Hence
\begin{align*}
|I_\ell| &\leq |\Omega|^\ell.
\end{align*}
Summing over $\ell$,
\begin{align*}
n + 1 &= \sum_{\ell = 1}^{|w|} |I_\ell| \leq \sum_{\ell = 1}^{|w|} |\Omega|^\ell = N.
\end{align*}
Therefore $n \leq N - 1 \leq N$, and the fixed derivation has length at most $N$. This proves the forward direction and, combined with the backward direction from the first step, gives the biconditional claimed by the theorem.
[/step]