[proofplan]
We track two numerical invariants along any $G$-derivation from a regular grammar: the length $|\sigma|$ and the variable count $v(\sigma)$. Terminal rules $A \to a$ preserve length and decrement the variable count by one; nonterminal rules $A \to aB$ increment length by one and preserve the variable count. By the [Shape of Regular Sentential Forms](/theorems/1786) theorem, every intermediate string in a derivation from $S$ has variable count either $0$ or $1$. Starting from $S$ with length $1$ and variable count $1$, any derivation ending at a terminal word $w$ of length $|w|$ must (i) reach length $|w|$ using $|w| - 1$ nonterminal rules (each contributing $+1$ to length), and (ii) reach variable count $0$ using exactly one terminal rule, which must come last because after a terminal rule the variable count is $0$ and no further rule can fire.
[/proofplan]
[step:Set up the invariants length and variable count]
Fix a regular grammar $G = (\Sigma, V, P, S)$ and a derivation
\begin{align*}
S = \sigma_0 \to_G \sigma_1 \to_G \cdots \to_G \sigma_n = w,
\end{align*}
with $w \in \mathbb{W} = \Sigma^\star$ of length $|w| \geq 1$. (The case $|w| = 0$, i.e. $w = \varepsilon$, cannot occur because regular rules have nonempty right-hand side, so the invariants below force $|\sigma_i| \geq 1$ for every $i$.)
Define for $\sigma \in (\Sigma \cup V)^\star$ the two nonnegative integers
\begin{align*}
\ell(\sigma) &:= |\sigma|\ \ \text{(total length)}, & v(\sigma) &:= |\{i : \sigma_i \in V\}|\ \ \text{(number of variable symbols)}.
\end{align*}
We have $\ell(\sigma) \geq v(\sigma) \geq 0$ and $\ell(\sigma) = v(\sigma) + (\text{number of terminal symbols})$.
Classify rules of $P$:
- a **terminal rule** has the form $A \to a$ with $A \in V$, $a \in \Sigma$;
- a **nonterminal rule** has the form $A \to aB$ with $A, B \in V$, $a \in \Sigma$.
[/step]
[step:Compute the effect of each rule type on $(\ell, v)$]
[claim:Terminal rules preserve $\ell$ and decrement $v$ by $1$]
Suppose $\sigma_i \to_G \sigma_{i+1}$ uses a terminal rule $A \to a$ applied at position $\mu, \nu$, so $\sigma_i = \mu A \nu$ and $\sigma_{i+1} = \mu a \nu$. Then
\begin{align*}
\ell(\sigma_{i+1}) &= |\mu| + 1 + |\nu| = \ell(\sigma_i), \\
v(\sigma_{i+1}) &= v(\mu) + 0 + v(\nu) = v(\sigma_i) - 1,
\end{align*}
since $A \in V$ contributes $1$ to $v(\sigma_i)$ whereas $a \in \Sigma$ contributes $0$ to $v(\sigma_{i+1})$, and $\mu, \nu$ are unchanged.
[/claim]
[proof]
Direct computation from the definitions of $\ell, v$ and the structure of a single derivation step.
[/proof]
[claim:Nonterminal rules increment $\ell$ by $1$ and preserve $v$]
Suppose $\sigma_i \to_G \sigma_{i+1}$ uses a nonterminal rule $A \to aB$ applied at position $\mu, \nu$, so $\sigma_i = \mu A \nu$ and $\sigma_{i+1} = \mu a B \nu$. Then
\begin{align*}
\ell(\sigma_{i+1}) &= |\mu| + 2 + |\nu| = \ell(\sigma_i) + 1, \\
v(\sigma_{i+1}) &= v(\mu) + 1 + v(\nu) = v(\sigma_i),
\end{align*}
since $A \in V$ (contributing $1$) is replaced by $aB$ with $a \in \Sigma$ (contributing $0$) and $B \in V$ (contributing $1$), keeping the variable count the same while increasing the total length by $1$.
[/claim]
[proof]
Direct computation from the definitions.
[/proof]
[/step]
[step:Apply the shape theorem to bound $v(\sigma_i) \in \{0, 1\}$]
By the [Shape of Regular Sentential Forms](/theorems/1786) theorem, every sentential form $\sigma_i$ with $S \xrightarrow{G} \sigma_i$ satisfies $\sigma_i \in \mathbb{W} \cup \mathbb{W} V$. The two subsets are characterised by the variable count:
\begin{align*}
\sigma_i \in \mathbb{W} &\iff v(\sigma_i) = 0, \\
\sigma_i \in \mathbb{W} V &\iff v(\sigma_i) = 1.
\end{align*}
The first equivalence is immediate ($\mathbb{W} = \Sigma^\star$ contains no variables). The second: $\mathbb{W}V = \{wA : w \in \mathbb{W}, A \in V\}$ contains exactly one variable, namely the trailing $A$.
Hence $v(\sigma_i) \in \{0, 1\}$ for every $i \in \{0, 1, \dots, n\}$.
[/step]
[step:Conclude terminal rules must be the final step and are unique]
We have $v(\sigma_0) = v(S) = 1$ and $v(\sigma_n) = v(w) = 0$. Each nonterminal rule keeps $v$ fixed at $1$ (it cannot drop $v$ to $0$), and each terminal rule drops $v$ from $1$ to $0$. We argue:
*There is exactly one terminal rule in the derivation.* The sequence $v(\sigma_0), v(\sigma_1), \dots, v(\sigma_n)$ starts at $1$, ends at $0$, and takes values in $\{0, 1\}$ by the previous step. Each nonterminal step fixes the value; each terminal step decreases it by one. Hence the number of terminal steps equals the total decrease $v(\sigma_0) - v(\sigma_n) = 1 - 0 = 1$.
*The terminal rule is the last step.* Suppose the terminal step is $\sigma_j \to_G \sigma_{j+1}$ with $j < n - 1$, so at least one further step $\sigma_{j+1} \to_G \sigma_{j+2}$ exists. By the first claim of the second step, $v(\sigma_{j+1}) = v(\sigma_j) - 1 = 0$. The claim of the fourth step of the proof of [Shape of Regular Sentential Forms](/theorems/1786) (recorded here as a sub-observation: strings with variable count $0$ are in $\mathbb{W}$, and no $G$-rule applies to an element of $\mathbb{W}$, since every rule has left-hand side a single variable of $V$) then says that $\sigma_{j+1}$ admits no further derivation step, contradicting the existence of $\sigma_{j+2}$. Hence $j = n - 1$, meaning the terminal rule is the last step $\sigma_{n-1} \to_G \sigma_n$.
[/step]
[step:Count the nonterminal rules and the total length]
The derivation has $n$ steps total: exactly $1$ terminal rule (the last), and $n - 1$ nonterminal rules (the rest, which occupy positions $0, 1, \dots, n - 2$).
Track the length along the derivation. At step $0$, $\ell(\sigma_0) = |S| = 1$. Each nonterminal step increments $\ell$ by $1$; each terminal step preserves $\ell$. Hence
\begin{align*}
\ell(\sigma_n) &= \ell(\sigma_0) + (\text{number of nonterminal steps}) + 0 \cdot (\text{number of terminal steps}) \\
&= 1 + (n - 1) = n.
\end{align*}
On the other hand $\ell(\sigma_n) = \ell(w) = |w|$. Combining,
\begin{align*}
n &= |w|.
\end{align*}
Therefore the derivation has length $|w|$, consisting of exactly $|w| - 1$ nonterminal rules followed by a single terminal rule at the end.
[guided]
The proof is a careful bookkeeping exercise. We track two numerical statistics of each sentential form: its length $\ell(\sigma)$ and its variable count $v(\sigma)$. A terminal rule $A \to a$ replaces one variable by one terminal — no change in length, decrease of $1$ in variable count. A nonterminal rule $A \to aB$ replaces one variable by one terminal and one variable — increase of $1$ in length, no change in variable count.
The [Shape of Regular Sentential Forms](/theorems/1786) theorem gives us a crucial restriction: at every intermediate stage, the sentential form lies in $\mathbb{W} \cup \mathbb{W}V$, which means the variable count is $0$ or $1$. The count is $1$ as long as we are "mid-derivation" and $0$ only when we have reached a pure terminal string, from which no further derivation is possible.
From these three facts the conclusion follows by counting:
1. **Number of terminal rules.** Each terminal rule drops $v$ from $1$ to $0$, and nothing can increase $v$ back up (check: nonterminal rules preserve $v$, terminal rules decrement $v$, and there are no other rule types). So the number of terminal rules equals the net decrease $v(S) - v(w) = 1 - 0 = 1$.
2. **Position of the terminal rule.** Once $v = 0$, the current string is in $\mathbb{W}$, and no rule applies — regular rules have single-variable left-hand sides, which cannot match a terminal-only string. So the terminal rule must be the *final* step; otherwise the derivation would be stuck at a pure terminal string with more steps scheduled.
3. **Number of nonterminal rules.** Each nonterminal rule adds $1$ to the length. Starting from $|S| = 1$ and ending at $|w|$, the net length increase is $|w| - 1$, all contributed by nonterminal rules. So there are exactly $|w| - 1$ nonterminal rules.
4. **Total length.** Adding up: $1$ terminal rule $+ |w| - 1$ nonterminal rules $= |w|$ rules total, so the derivation has length $|w|$.
The structure is rigid: a derivation of a word of length $m$ from $S$ looks like
\begin{align*}
S &\to_{G} a_1 A_1 \to_G a_1 a_2 A_2 \to_G \cdots \to_G a_1 a_2 \cdots a_{m-1} A_{m-1} \to_G a_1 a_2 \cdots a_m,
\end{align*}
with the first $m - 1$ steps being nonterminal rules $A_{i-1} \to a_i A_i$ (taking $A_0 := S$) and the final step being the terminal rule $A_{m-1} \to a_m$. No other pattern is possible.
[/guided]
[/step]
[step:State the conclusion]
Combining the counts: the derivation of $w$ from $S$ has length $n = |w|$, uses exactly $|w| - 1$ nonterminal rules in steps $0, \dots, |w| - 2$, and a single terminal rule at step $|w| - 1$. This matches the theorem statement.
[/step]