[proofplan]
Given a regular grammar $G = (\Sigma, V, P, S)$, we construct a nondeterministic automaton $N$ whose states are the variables of $G$ augmented by a fresh halt state $H$: reading a letter $a$ from a variable $A$ moves $N$ (nondeterministically) to every variable $B$ such that $G$ has a nonterminal rule $A \to aB$, and additionally to $H$ whenever $G$ has a terminal rule $A \to a$. The start state is $S$; the only accepting state is $H$. A $G$-derivation of a terminal word $w$, by the [Structure of Terminal Derivations](/theorems/1787) theorem, consists of $|w| - 1$ nonterminal rules followed by exactly one terminal rule. The nonterminal rules encode transitions between variable states of $N$; the final terminal rule encodes a transition *into* $H$. Matching rules to transitions yields a bijection between $G$-derivations of $w$ and accepting $N$-runs on $w$, giving $\mathcal{L}(G) = \mathcal{L}(N)$.
[/proofplan]
[step:Construct the nondeterministic automaton $N$ from the regular grammar $G$]
Fix the regular grammar $G = (\Sigma, V, P, S)$. Choose a fresh symbol $H \notin \Sigma \cup V$ (the *halt state*); set
\begin{align*}
Q &:= V \cup \{H\}.
\end{align*}
Define the nondeterministic transition function $\delta: Q \times \Sigma \to \mathcal{P}(Q)$ by
\begin{align*}
\delta(A, a) &:= \{B \in V : (A \to aB) \in P\} \cup \begin{cases} \{H\} & \text{if } (A \to a) \in P, \\ \varnothing & \text{otherwise,} \end{cases} \qquad A \in V,\ a \in \Sigma, \\
\delta(H, a) &:= \varnothing \qquad a \in \Sigma.
\end{align*}
The halt state has no outgoing transitions: $H$ is a dead-end beyond which no further letter can be read. Define
\begin{align*}
N &:= (\Sigma,\ Q,\ \delta,\ S,\ \{H\}).
\end{align*}
By construction, $N$ is a nondeterministic automaton with start state $S$ (the start variable of $G$) and accepting set $\{H\}$.
Let $\hat{\delta}: \mathcal{P}(Q) \times \Sigma^\star \to \mathcal{P}(Q)$ be the extended transition function: $\hat{\delta}(X, \varepsilon) = X$, and $\hat{\delta}(X, wa) = \bigcup_{q \in \hat{\delta}(X, w)} \delta(q, a)$. Write $\hat{\delta}(q, w)$ for $\hat{\delta}(\{q\}, w)$. Recall $w \in \mathcal{L}(N) \iff \hat{\delta}(S, w) \cap \{H\} \neq \varnothing \iff H \in \hat{\delta}(S, w)$.
[/step]
[step:Show $\mathcal{L}(G) \subseteq \mathcal{L}(N)$ by converting a $G$-derivation into an $N$-run]
Let $w \in \mathcal{L}(G)$. We handle $w \neq \varepsilon$ first; regular grammars (in the standard definition adopted here) have no $\varepsilon$-rules, so $w \neq \varepsilon$ automatically. Write $w = a_0 a_1 \cdots a_n$ with $n = |w| - 1 \geq 0$, each $a_i \in \Sigma$.
By the [Structure of Terminal Derivations](/theorems/1787) theorem applied to $G$ and $w$, any $G$-derivation of $w$ from $S$ has exactly $|w|$ steps, consisting of $|w| - 1$ nonterminal rules followed by a single terminal rule at the end. Concretely, such a derivation has the form
\begin{align*}
S = A_0 \to_G a_0 A_1 \to_G a_0 a_1 A_2 \to_G \cdots \to_G a_0 a_1 \cdots a_{n-1} A_n \to_G a_0 a_1 \cdots a_{n-1} a_n = w,
\end{align*}
where $A_0 := S$ and $A_1, A_2, \dots, A_n \in V$ are variables. The first $n$ steps use nonterminal rules $A_i \to a_i A_{i+1}$ for $i = 0, 1, \dots, n - 1$; the final step uses a terminal rule $A_n \to a_n$.
We translate these rules to $N$-transitions.
*First $n$ transitions (from variable to variable).* For $i = 0, 1, \dots, n - 1$, the rule $A_i \to a_i A_{i+1} \in P$ is a nonterminal rule, so by the definition of $\delta$,
\begin{align*}
A_{i+1} &\in \delta(A_i, a_i).
\end{align*}
*Final transition (from variable to halt state).* The rule $A_n \to a_n \in P$ is a terminal rule, so by the second clause of $\delta$'s definition,
\begin{align*}
H &\in \delta(A_n, a_n).
\end{align*}
Chain these into an $N$-computation: starting from $S = A_0$, there is a valid $N$-run
\begin{align*}
A_0 \xrightarrow{a_0} A_1 \xrightarrow{a_1} A_2 \xrightarrow{a_2} \cdots \xrightarrow{a_{n-1}} A_n \xrightarrow{a_n} H.
\end{align*}
By definition of $\hat{\delta}$ (which closes under the union of all possible transitions), this witnesses $H \in \hat{\delta}(S, w)$: at each step $i$, the state $A_{i+1}$ (respectively $H$ for the last step) is included in the set of reachable states.
Formally, we prove by induction on $i$ that $A_i \in \hat{\delta}(S, a_0 a_1 \cdots a_{i-1})$ for $i = 0, 1, \dots, n$, and $H \in \hat{\delta}(S, a_0 a_1 \cdots a_n)$.
*Base*, $i = 0$: $A_0 = S \in \{S\} = \hat{\delta}(S, \varepsilon)$.
*Inductive step*, $i \to i + 1$ for $i < n$: by hypothesis $A_i \in \hat{\delta}(S, a_0 \cdots a_{i-1})$; by the recursive definition of $\hat{\delta}$,
\begin{align*}
\hat{\delta}(S, a_0 \cdots a_i) &= \bigcup_{q \in \hat{\delta}(S, a_0 \cdots a_{i-1})} \delta(q, a_i) \supseteq \delta(A_i, a_i) \ni A_{i+1},
\end{align*}
so $A_{i+1} \in \hat{\delta}(S, a_0 \cdots a_i)$.
*Final step*, $i = n$ for $H$: by the induction, $A_n \in \hat{\delta}(S, a_0 \cdots a_{n-1})$; by the recursive definition,
\begin{align*}
\hat{\delta}(S, a_0 \cdots a_n) &\supseteq \delta(A_n, a_n) \ni H,
\end{align*}
so $H \in \hat{\delta}(S, w)$, i.e., $w \in \mathcal{L}(N)$.
Hence $\mathcal{L}(G) \subseteq \mathcal{L}(N)$.
[/step]
[step:Show $\mathcal{L}(N) \subseteq \mathcal{L}(G)$ by converting an accepting $N$-run into a $G$-derivation]
Conversely, let $w \in \mathcal{L}(N)$, so $H \in \hat{\delta}(S, w)$. Since $H$ has no outgoing transitions ($\delta(H, a) = \varnothing$ for all $a \in \Sigma$), $H$ cannot appear in any intermediate state of a run that is then extended by further letters. Formally:
[claim:If $H \in \hat{\delta}(X, w)$ for some $w \in \Sigma^\star$ and $X \subseteq V$, then there is a sequence of states $A_0, A_1, \dots, A_n, H$ with $A_0 \in X$, $A_1, \dots, A_n \in V$, and $A_{i+1} \in \delta(A_i, a_i)$ for $i < n$, $H \in \delta(A_n, a_n)$, where $w = a_0 a_1 \cdots a_n$]
Write $w = a_0 a_1 \cdots a_n$. Unfolding the definition of $\hat{\delta}$, the condition $H \in \hat{\delta}(X, w)$ is equivalent to the existence of a sequence of states $q_0, q_1, \dots, q_{n+1}$ with $q_0 \in X$, $q_{i+1} \in \delta(q_i, a_i)$ for $i = 0, \dots, n$, and $q_{n+1} = H$. We verify inductively that $q_i \in V$ for $i \leq n$. If $q_i = H$ for some $i \leq n$, then the inductive hypothesis on the length of the remaining suffix would force $q_{i+1} \in \delta(H, a_i) = \varnothing$, a contradiction. Hence $q_i \in V$ for $i \leq n$, and we may take $A_i := q_i$.
[/claim]
[proof]
Unfolding $\hat{\delta}$ and checking that $H$ is a dead-end: if $H$ appeared at an intermediate step, the next transition from $H$ would have to be nonempty, contradicting $\delta(H, a) = \varnothing$.
[/proof]
Apply the claim with $X = \{S\}$: there exists a sequence $A_0, A_1, \dots, A_n \in V$ and the terminal state $H$, with $A_0 = S$, $A_{i+1} \in \delta(A_i, a_i)$ for $i = 0, \dots, n - 1$, and $H \in \delta(A_n, a_n)$, where $w = a_0 a_1 \cdots a_n$.
Decode each transition into a $G$-rule:
- For $i = 0, \dots, n - 1$: $A_{i+1} \in \delta(A_i, a_i)$ and $A_{i+1} \in V$, so by the first clause of $\delta$'s definition, the rule $A_i \to a_i A_{i+1}$ lies in $P$. (The second clause, which would contribute $H \in \delta(A_i, a_i)$, cannot give us $A_{i+1} \in V$ since $A_{i+1} \neq H$.)
- For the final step: $H \in \delta(A_n, a_n)$, so by the second clause of $\delta$'s definition, the rule $A_n \to a_n$ lies in $P$.
Applying these rules in order,
\begin{align*}
S = A_0 \to_G a_0 A_1 \to_G a_0 a_1 A_2 \to_G \cdots \to_G a_0 \cdots a_{n-1} A_n \to_G a_0 \cdots a_{n-1} a_n = w.
\end{align*}
This exhibits a $G$-derivation of $w$ from $S$, so $w \in \mathcal{L}(G)$. Hence $\mathcal{L}(N) \subseteq \mathcal{L}(G)$.
[guided]
The converse direction requires us to extract a $G$-derivation from an accepting $N$-computation. The construction of $N$ was deliberately engineered so that this extraction is mechanical, but we must verify two things: (i) that an accepting run on $w$ indeed visits $n + 1$ variable states of $V$ followed by $H$, and (ii) that each step of the run corresponds to a legitimate $G$-rule.
For (i), the deadend property $\delta(H, a) = \varnothing$ is crucial. Without it, a run might visit $H$ at an intermediate step (say after reading a prefix $u$) and then try to continue reading the suffix $v$ after $u$ — but from $H$ there is no way to continue, so the run would fail to reach any accepting configuration. By making $H$ a deadend, we force every accepting run on $w = a_0 \cdots a_n$ to visit $H$ only at the very last moment, after reading the full word. Equivalently, an accepting run has exactly one $V$-to-$H$ transition, at the final step. The logic of this argument — the unique $V$-to-$H$ transition is the very last step — directly mirrors the [Structure of Terminal Derivations](/theorems/1787) theorem's logic that the unique terminal rule is the final step of a $G$-derivation.
For (ii), we read the definition of $\delta$ backwards. An inclusion $A_{i+1} \in \delta(A_i, a_i)$ with $A_{i+1} \in V$ can only have been caused by the first clause of $\delta$'s construction — the one that contributes variable successors from nonterminal rules $A \to aB$. Hence $A_i \to a_i A_{i+1}$ is a rule of $P$. An inclusion $H \in \delta(A_n, a_n)$ can only have been caused by the second clause — the one that contributes $H$ from terminal rules $A \to a$. Hence $A_n \to a_n$ is a rule of $P$. No ambiguity arises because $V$ and $\{H\}$ are disjoint by design ($H$ is fresh).
Chaining these rules together gives the $G$-derivation of $w$. The variable sequence $S = A_0, A_1, \dots, A_n$ precisely shadows the variable-state sequence of the $N$-run, and the concluding terminal rule $A_n \to a_n$ shadows the final $V$-to-$H$ transition. The construction of $N$ was the minimum possible addition (one fresh state, transitions dictated by rule shapes) to turn derivation-steps into transition-steps. The correspondence is tight because the structure theorem for regular derivations matches exactly the structure forced on accepting runs by $H$'s deadend property.
[/guided]
[/step]
[step:Combine the two inclusions]
From the two previous steps, $\mathcal{L}(G) = \mathcal{L}(N)$. We exhibited the nondeterministic automaton $N$ explicitly in the first step. This completes the proof that for every regular grammar $G$ there exists a nondeterministic automaton $N$ with $\mathcal{L}(G) = \mathcal{L}(N)$.
[/step]