[proofplan]
Let $L$ be a regular language; by the equivalence of regular grammars and deterministic automata there is a deterministic automaton $D = (\Sigma, Q, \delta, q_0, F)$ with $\mathcal{L}(D) = L$. The pumping number is $n := |Q|$. For any $w \in L$ with $|w| \geq n$, consider the run of $D$ on the prefix of length $n$: this run passes through $n + 1$ states but $D$ has only $n$ distinct states, so by the pigeonhole principle two of the states must coincide. Those two equal states identify a *loop* in $D$ — a substring $y$ of $w$ whose transition returns $D$ to the same state. The word splits as $w = xyz$ where $x$ precedes the loop, $y$ traverses the loop, and $z$ is everything after; the loop can be repeated any number of times without changing the final state, so every $xy^k z$ is accepted. The bounds $|xy| \leq n$ and $|y| \geq 1$ follow from the choice of indices with $0 \leq i < j \leq n$. A clean induction on $k$ formalises "the loop may be repeated any number of times".
[/proofplan]
[step:Pick a deterministic automaton for $L$ and set the pumping number]
Let $L$ be a regular language. By the [equivalence of regular grammars and deterministic automata](/theorems/???) (combining [Regular Grammars Produce Nondeterministic Automata](/theorems/1792), [Nondeterministic and Deterministic Automata Are Equivalent](/theorems/1791), and [Automaton Languages Are Regular](/theorems/1790)), there exists a deterministic automaton
\begin{align*}
D &= (\Sigma, Q, \delta, q_0, F)
\end{align*}
with $\mathcal{L}(D) = L$. Since $D$ is a finite automaton, $Q$ is a finite set. Set
\begin{align*}
n &:= |Q| \geq 1.
\end{align*}
(The case $|Q| = 0$ is excluded: every automaton has at least one state, namely $q_0$.)
We claim $n$ is a pumping number for $L$: for every $w \in L$ with $|w| \geq n$, there exists a decomposition $w = xyz$ with $|y| \geq 1$, $|xy| \leq n$, and $xy^k z \in L$ for all $k \geq 0$.
[/step]
[step:Unroll the run of $D$ on the length-$n$ prefix of $w$]
Let $w \in L$ with $|w| \geq n$. Write $w = a_0 a_1 \cdots a_{n-1} v$, where $a_0, a_1, \dots, a_{n-1} \in \Sigma$ are the first $n$ letters and $v \in \Sigma^\star$ is the (possibly empty) suffix of length $|w| - n$.
Define the state sequence along the length-$n$ prefix: let $q_0$ denote the start state (matching the notation of $D$), and for $i = 1, 2, \dots, n$ set
\begin{align*}
q_i &:= \delta(q_{i-1}, a_{i-1}).
\end{align*}
Equivalently, $q_i = \hat{\delta}(q_0, a_0 a_1 \cdots a_{i-1})$ by the recursive definition of the [extended transition function](/theorems/???). The sequence $q_0, q_1, \dots, q_n$ has $n + 1$ entries.
[/step]
[step:Apply the pigeonhole principle to find a repeated state]
The sequence $q_0, q_1, \dots, q_n$ has $n + 1$ entries, but by definition $q_i \in Q$ for each $i$, and $|Q| = n$. By the [pigeonhole principle](/theorems/???) applied to the function $i \mapsto q_i$ from the set $\{0, 1, \dots, n\}$ (size $n + 1$) to $Q$ (size $n$), the function is not injective: there exist indices $i, j \in \{0, 1, \dots, n\}$ with
\begin{align*}
0 \leq i < j \leq n \qquad \text{and} \qquad q_i = q_j.
\end{align*}
Pick any such pair $(i, j)$; for instance, choose $i$ minimal and then $j$ minimal subject to $q_j = q_i$.
[/step]
[step:Define the decomposition $w = xyz$]
Using the indices $i, j$ from the previous step, define
\begin{align*}
x &:= a_0 a_1 \cdots a_{i-1}, & y &:= a_i a_{i+1} \cdots a_{j-1}, & z &:= a_j a_{j+1} \cdots a_{n-1} v.
\end{align*}
Here $x$ is the empty word when $i = 0$, $y$ is nonempty because $i < j$, and $z$ consists of the letters $a_j, \dots, a_{n-1}$ of the prefix followed by the original suffix $v$.
Then $w = xyz$ by concatenation, and we compute the required length bounds:
\begin{align*}
|y| &= j - i \geq 1 \qquad (\text{since } i < j), \\
|xy| &= i + (j - i) = j \leq n \qquad (\text{since } j \leq n).
\end{align*}
Record the state-reaching properties of the decomposition:
\begin{align*}
\hat{\delta}(q_0, x) &= q_i, && \text{(the run of } D \text{ on } x \text{ ends in } q_i\text{)} \\
\hat{\delta}(q_i, y) &= q_j = q_i, && \text{(the run of } D \text{ on } y \text{ starting at } q_i \text{ returns to } q_i\text{)} \\
\hat{\delta}(q_i, z) &\in F. && \text{(the run of } D \text{ on } z \text{ starting at } q_i \text{ ends in } F\text{, because } w \in L\text{)}
\end{align*}
The first identity: $\hat{\delta}(q_0, a_0 a_1 \cdots a_{i-1}) = q_i$ by the definition of the state sequence. The second: $\hat{\delta}(q_i, a_i a_{i+1} \cdots a_{j-1}) = q_j$ by iterating $\delta$ from $q_i$ for $j - i$ steps and reading $a_i, \dots, a_{j-1}$; by the choice of $(i, j)$, $q_j = q_i$. The third: combining the previous two, $\hat{\delta}(q_0, xy) = q_j = q_i$, and since $w = xyz \in L$,
\begin{align*}
\hat{\delta}(q_0, w) &= \hat{\delta}(q_0, xyz) = \hat{\delta}(\hat{\delta}(q_0, xy), z) = \hat{\delta}(q_i, z) \in F,
\end{align*}
using the fundamental concatenation identity $\hat{\delta}(q, uv) = \hat{\delta}(\hat{\delta}(q, u), v)$ for the extended transition function, and the acceptance criterion $w \in L = \mathcal{L}(D)$.
[/step]
[step:Prove $xy^k z \in L$ for all $k \geq 0$ by induction on $k$]
[claim:For every $k \geq 0$, $\hat{\delta}(q_0, xy^k) = q_i$]
We induct on $k$.
*Base*, $k = 0$: $xy^0 = x$, and $\hat{\delta}(q_0, x) = q_i$ by the identity recorded in the previous step.
*Inductive step*: assume $\hat{\delta}(q_0, xy^k) = q_i$. Then
\begin{align*}
\hat{\delta}(q_0, xy^{k+1}) &= \hat{\delta}(q_0, xy^k y) = \hat{\delta}(\hat{\delta}(q_0, xy^k), y) = \hat{\delta}(q_i, y) = q_i,
\end{align*}
using the concatenation identity $\hat{\delta}(q, uv) = \hat{\delta}(\hat{\delta}(q, u), v)$ with $u := xy^k$, $v := y$; the inductive hypothesis $\hat{\delta}(q_0, xy^k) = q_i$; and the loop identity $\hat{\delta}(q_i, y) = q_i$ from the previous step. This closes the induction.
[/claim]
[proof]
See the induction stated in the claim.
[/proof]
For every $k \geq 0$,
\begin{align*}
\hat{\delta}(q_0, xy^k z) &= \hat{\delta}(\hat{\delta}(q_0, xy^k), z) = \hat{\delta}(q_i, z) \in F,
\end{align*}
again by the concatenation identity and the third identity from the previous step. Hence $xy^k z \in \mathcal{L}(D) = L$.
[guided]
The pumping lemma records a combinatorial consequence of the *finite memory* of a deterministic automaton. Intuitively: if a word is long enough to force the automaton to revisit a state, then everything between the two visits is a "loop" that can be repeated indefinitely, or omitted entirely, without changing the acceptance verdict.
The three technical choices in the proof all come from this intuition:
1. **Why is the pumping number $n = |Q|$?** Because by the pigeonhole principle, any run of length $\geq n$ must revisit some state. More precisely, reading a word of length $n$ exposes $n + 1$ consecutive states $q_0, q_1, \dots, q_n$ in $Q$, and $n + 1 > n = |Q|$ forces a collision.
2. **Why decompose at the *first* repetition in the prefix?** Because we need the bound $|xy| \leq n$: the first repetition occurs at index $j \leq n$, so the loop traversed between $q_i$ and $q_j$ ends no later than position $n$ in $w$. The constraint $|xy| \leq n$ forces the loop to sit near the start of the word — a feature needed for some applications of the pumping lemma (e.g., when $w = 0^n 1^n$ and one wants the pump to be all $0$s).
3. **Why $|y| \geq 1$?** Because $i < j$ by construction; the repetition is at *distinct* positions in the state sequence. Without this, the "loop" could be trivial ($y = \varepsilon$), and the pumping lemma would be vacuous.
The induction on $k$ formalises the verbal statement "a loop can be repeated arbitrarily many times". The invariant is $\hat{\delta}(q_0, xy^k) = q_i$: reading the prefix $xy^k$ always lands back at $q_i$, regardless of how many times we go around. This is because going around once adds $y$ at the end, and $\hat{\delta}(q_i, y) = q_i$ — reading $y$ from $q_i$ returns to $q_i$. The base case $k = 0$ handles the empty repetition: with no loops, we just read $x$ and land at $q_i$ directly. Concatenating $z$ to $xy^k$ then reads $z$ from $q_i$, which lands in $F$ (as it does for the original $w$, which corresponds to $k = 1$). Hence $xy^k z \in L$ for every $k \geq 0$.
One subtle point: the lemma allows $k = 0$, so $xz \in L$. This says we can not only repeat the loop but also delete it. Deleting one loop from a long word still yields a word in $L$ — the automaton has no way to tell the difference between the single pass $q_i \to_y q_i$ and no pass at all, because both end at $q_i$ ready to read $z$.
Contrast this with the pumping lemma's frequent applications, which prove non-regularity by exhibiting a language that fails the lemma: for instance, $L = \{0^n 1^n : n \geq 0\}$ violates the lemma because no decomposition of $0^n 1^n$ with $|xy| \leq n$ can satisfy $xy^2 z \in L$ — pumping $y$ (which lies inside the $0$-block) unbalances the counts. The pumping lemma's *necessary condition* for regularity becomes a tool for proving non-regularity.
[/guided]
[/step]
[step:Assemble the conclusion]
We have exhibited, for the arbitrary $w \in L$ with $|w| \geq n$, a decomposition $w = xyz$ satisfying
\begin{align*}
|y| &\geq 1, & |xy| &\leq n, & xy^k z &\in L \text{ for all } k \geq 0.
\end{align*}
These are exactly the conditions of the [pumping lemma for regular languages](/theorems/???) at the pumping number $n$. Since $L$ was an arbitrary regular language, every regular language satisfies the pumping lemma. This completes the proof.
[/step]