[proofplan]
The three positive closure properties are proved by explicit grammar constructions: given context-free grammars $G_1, G_2$ for $L_1, L_2$, we build a third grammar for each of $L_1 \cup L_2$, $L_1 L_2$, and $L_1^\star$ by introducing a single fresh start variable and a small number of rules that route a derivation into one (or several, in sequence) of the original grammars. Each construction is verified by double inclusion, using that derivations in the new grammar either begin with one of the fresh start rules (forcing the routing) or descend entirely into a copy of one of the original grammars. The three negative results reduce to a single failure of closure under intersection: the classical non-context-free language $\{a^n b^n c^n : n \ge 0\}$ arises as the intersection of two context-free languages, namely $L_0 = \{a^i b^n c^n : i, n \ge 0\}$ and $L_1 = \{a^n b^n c^j : j, n \ge 0\}$. Closure under complement would, combined with the already-proved closure under union and De Morgan, imply closure under intersection; closure under set difference would do the same via $A \cap B = A \setminus (A \setminus B)$ (or via $A \cap B = A \setminus (\Sigma^* \setminus B)$). Both would contradict the intersection counterexample.
[/proofplan]
[step:Fix notation and the setup for the closure constructions]
Let $G_1 = (V_1, \Sigma, P_1, S_1)$ and $G_2 = (V_2, \Sigma, P_2, S_2)$ be [context-free grammars](/pages/???) over a common terminal alphabet $\Sigma$, with $\mathcal{L}(G_i) = L_i$ for $i = 1, 2$. Without loss of generality (by renaming variables) $V_1 \cap V_2 = \varnothing$; we reserve fresh variable symbols $S$, distinct from any variable already in $V_1 \cup V_2$, for each construction below. Write $\xrightarrow{*}_G$ for multi-step derivation in grammar $G$; we use $\Rightarrow_G$ for one step.
Recall that $\mathcal{L}(G) = \{w \in \Sigma^* : S_G \xrightarrow{*}_G w\}$ where $S_G$ is the start variable of $G$. Two grammars are equivalent iff they generate the same language.
[/step]
[step:Construction for union — prove $L_1 \cup L_2 \in \mathrm{CFL}$]
Define
\begin{align*}
V_\cup &:= V_1 \cup V_2 \cup \{S\}, \\
P_\cup &:= P_1 \cup P_2 \cup \{S \to S_1,\ S \to S_2\}, \\
G_\cup &:= (V_\cup, \Sigma, P_\cup, S).
\end{align*}
This is a well-defined context-free grammar: each rule is of the form $A \to \gamma$ with $A \in V_\cup$ and $\gamma \in (V_\cup \cup \Sigma)^*$ (the embeddings $P_1, P_2 \subseteq V_\cup \times (V_\cup \cup \Sigma)^*$ are preserved because $V_1, V_2 \subseteq V_\cup$ and the fresh variable $S$ does not appear in any right-hand side of $P_1 \cup P_2$).
[claim:$\mathcal{L}(G_\cup) = L_1 \cup L_2$]
*Forward ($L_1 \cup L_2 \subseteq \mathcal{L}(G_\cup)$).* Let $w \in L_i$ for some $i \in \{1, 2\}$. Then $S_i \xrightarrow{*}_{G_i} w$, and this derivation lies entirely in $P_i \subseteq P_\cup$, so it is a valid $G_\cup$-derivation. Prepending the rule $S \to S_i \in P_\cup$:
\begin{align*}
S \Rightarrow_{G_\cup} S_i \xrightarrow{*}_{G_\cup} w,
\end{align*}
so $w \in \mathcal{L}(G_\cup)$.
*Reverse ($\mathcal{L}(G_\cup) \subseteq L_1 \cup L_2$).* Let $w \in \mathcal{L}(G_\cup)$, so $S \xrightarrow{*}_{G_\cup} w$. The first one-step transition must apply a rule with left-hand side $S$, and the only such rules in $P_\cup$ are $S \to S_1$ and $S \to S_2$. Thus the derivation has the form $S \Rightarrow_{G_\cup} S_i \xrightarrow{*}_{G_\cup} w$ for some $i \in \{1, 2\}$. After the initial step, the current sentential form is $S_i \in V_i \subseteq V_\cup$; every subsequent transition rewrites a variable. Because $V_1 \cap V_2 = \varnothing$ and the fresh variable $S$ appears on no right-hand side of $P_\cup$, every variable appearing after the first step lies in $V_i$ (by induction: rewriting a $V_i$-variable using a rule of $P_i$ produces a sentential form whose variables lie in $V_i$; the only cross-component rules $S \to S_j$ have $S$ on the left, but $S$ never reappears). Hence every transition after the first uses a rule from $P_i$, giving a $G_i$-derivation $S_i \xrightarrow{*}_{G_i} w$, so $w \in L_i \subseteq L_1 \cup L_2$.
[/claim]
[/step]
[step:Construction for concatenation — prove $L_1 L_2 \in \mathrm{CFL}$]
Define
\begin{align*}
V_\cdot &:= V_1 \cup V_2 \cup \{S\}, \\
P_\cdot &:= P_1 \cup P_2 \cup \{S \to S_1\, S_2\}, \\
G_\cdot &:= (V_\cdot, \Sigma, P_\cdot, S).
\end{align*}
[claim:$\mathcal{L}(G_\cdot) = L_1 L_2 := \{w_1 w_2 : w_1 \in L_1,\ w_2 \in L_2\}$]
*Forward.* Let $w = w_1 w_2$ with $w_1 \in L_1$, $w_2 \in L_2$. Derivations $S_1 \xrightarrow{*}_{G_1} w_1$ and $S_2 \xrightarrow{*}_{G_2} w_2$ exist; both lie in $P_\cdot$. Chaining:
\begin{align*}
S \Rightarrow_{G_\cdot} S_1\, S_2 \xrightarrow{*}_{G_\cdot} w_1\, S_2 \xrightarrow{*}_{G_\cdot} w_1\, w_2 = w,
\end{align*}
where the first multi-step sub-derivation rewrites $S_1$ using rules of $P_1$ (leaving $S_2$ untouched, possible because each $G_\cdot$-derivation step rewrites only one position), and the second rewrites $S_2$ using rules of $P_2$. Thus $w \in \mathcal{L}(G_\cdot)$.
*Reverse.* Let $w \in \mathcal{L}(G_\cdot)$, so $S \xrightarrow{*}_{G_\cdot} w$. The only rule with left-hand side $S$ is $S \to S_1 S_2$, so the first step must use it, yielding the sentential form $S_1 S_2$. Because $V_1 \cap V_2 = \varnothing$ and no rule in $P_1 \cup P_2$ introduces a symbol from the other component (the only cross-component rule $S \to S_1 S_2$ has $S$ on the left and $S$ does not reappear), every subsequent rewrite of a symbol descended from $S_1$ uses a rule of $P_1$ (whose right-hand sides have variables in $V_1$), and every rewrite of a symbol descended from $S_2$ uses a rule of $P_2$. Hence we can decompose the derivation: $w = w_1 w_2$ where $S_1 \xrightarrow{*}_{G_1} w_1$ (consisting of the transitions that rewrite $V_1$-variables) and $S_2 \xrightarrow{*}_{G_2} w_2$ (consisting of those that rewrite $V_2$-variables). Therefore $w_1 \in L_1$, $w_2 \in L_2$, and $w \in L_1 L_2$.
[/claim]
[/step]
[step:Construction for Kleene star — prove $L_1^\star \in \mathrm{CFL}$]
Define
\begin{align*}
V_\star &:= V_1 \cup \{S\}, \\
P_\star &:= P_1 \cup \{S \to S\, S_1,\ S \to \varepsilon\}, \\
G_\star &:= (V_\star, \Sigma, P_\star, S).
\end{align*}
Recall $L_1^\star := \bigcup_{k \ge 0} L_1^k = \{w_1 w_2 \cdots w_k : k \ge 0,\ w_i \in L_1\}$, with the empty product (at $k = 0$) being $\{\varepsilon\}$.
[claim:$\mathcal{L}(G_\star) = L_1^\star$]
*Forward ($L_1^\star \subseteq \mathcal{L}(G_\star)$).* We show by induction on $k \ge 0$ that every word $w = w_1 w_2 \cdots w_k$ with $w_i \in L_1$ is derivable in $G_\star$.
- $k = 0$: $w = \varepsilon$ and $S \Rightarrow_{G_\star} \varepsilon$ using $S \to \varepsilon$.
- $k \to k + 1$: Assume $S \xrightarrow{*}_{G_\star} w_1 \cdots w_k$. Apply $S \to S\, S_1$ once, then continue the (inductively obtained) derivation on the left occurrence of $S$, finally use $S_1 \xrightarrow{*}_{G_1} w_{k+1}$ via $P_1 \subseteq P_\star$:
\begin{align*}
S \Rightarrow_{G_\star} S\, S_1 \xrightarrow{*}_{G_\star} w_1 \cdots w_k\, S_1 \xrightarrow{*}_{G_\star} w_1 \cdots w_k\, w_{k+1}.
\end{align*}
*Reverse ($\mathcal{L}(G_\star) \subseteq L_1^\star$).* We induct on the length $n \ge 1$ of a derivation $S \xrightarrow{*}_{G_\star} w$.
- $n = 1$: the only one-step derivation from $S$ with terminal target is $S \Rightarrow_{G_\star} \varepsilon$ (the alternative $S \Rightarrow_{G_\star} S S_1$ has non-terminal target), giving $w = \varepsilon \in L_1^\star$.
- $n \ge 2$: the first step is either $S \to \varepsilon$ (impossible, as then $w = \varepsilon$ and the derivation has length $1$), or $S \to S S_1$, giving sentential form $S\, S_1$. By the disjoint-variable analysis in the concatenation step, the subsequent derivation decomposes into $S \xrightarrow{*}_{G_\star} w'$ and $S_1 \xrightarrow{*}_{G_1} w''$ with $w = w' w''$. By inductive hypothesis on the shorter derivation $S \xrightarrow{*}_{G_\star} w'$, we have $w' \in L_1^\star$; and $w'' \in L_1$ because $S_1 \xrightarrow{*}_{G_1} w''$ is a $G_1$-derivation. Therefore $w = w' w'' \in L_1^\star \cdot L_1 \subseteq L_1^\star$.
[/claim]
[/step]
[step:Exhibit a concrete non-closure under intersection]
We use the classical counterexample:
\begin{align*}
L_0 &:= \{a^i b^n c^n : i, n \ge 0\}, & L_1 &:= \{a^n b^n c^j : j, n \ge 0\}.
\end{align*}
[claim:$L_0$ and $L_1$ are context-free]
Grammar for $L_0$: variables $\{S_0, A, B\}$, productions
\begin{align*}
S_0 &\to A\, B, & A &\to a A \mid \varepsilon, & B &\to b B c \mid \varepsilon,
\end{align*}
with $A$ generating $\{a^i : i \ge 0\}$ (straightforward induction on derivation length) and $B$ generating $\{b^n c^n : n \ge 0\}$ (induction: each use of $B \to bBc$ wraps a matching pair). Therefore $\mathcal{L}(S_0) = \{a^i\}\{b^n c^n\} = L_0$.
Grammar for $L_1$: symmetrically, variables $\{S_1, D, C\}$, productions
\begin{align*}
S_1 &\to D\, C, & D &\to a D b \mid \varepsilon, & C &\to c C \mid \varepsilon,
\end{align*}
with $D$ generating $\{a^n b^n : n \ge 0\}$ (induction on derivation length, each $D \to aDb$ adds a matching pair) and $C$ generating $\{c^j : j \ge 0\}$. Therefore $\mathcal{L}(S_1) = \{a^n b^n\}\{c^j\} = L_1$.
[/claim]
[claim:$L_0 \cap L_1 = \{a^n b^n c^n : n \ge 0\}$]
A word $w \in \{a, b, c\}^*$ belongs to both $L_0$ and $L_1$ iff
\begin{align*}
w &\in \{a^i b^n c^n : i, n \ge 0\}, & w &\in \{a^n b^n c^j : j, n \ge 0\}.
\end{align*}
Membership in the first set forces $w = a^{i_0} b^{n_0} c^{n_0}$ for some $i_0, n_0 \ge 0$. Membership in the second forces $w = a^{n_1} b^{n_1} c^{j_1}$ for some $n_1, j_1 \ge 0$. Matching letter counts: the number of $a$'s gives $i_0 = n_1$, the number of $b$'s gives $n_0 = n_1$, and the number of $c$'s gives $n_0 = j_1$. Setting $n := n_0 = n_1$ yields $i_0 = j_1 = n$, so $w = a^n b^n c^n$. Conversely every $a^n b^n c^n$ is in both sets (take $i = n$ and $j = n$). This proves the equality.
[/claim]
[claim:$\{a^n b^n c^n : n \ge 0\}$ is not context-free]
This is a standard application of the [Pumping Lemma for context-free languages](/theorems/1806). Assume for contradiction that $L := \{a^n b^n c^n : n \ge 0\}$ is context-free with pumping number $p \ge 1$. Apply the lemma to the word $w := a^p b^p c^p \in L$, which has length $3p \ge p$. We obtain a decomposition $w = xuyvz$ with $|uyv| \le p$, $|uv| > 0$, and $x u^k y v^k z \in L$ for all $k \ge 0$.
The bound $|uyv| \le p$ forces $uyv$ to be a factor of $w$ of length at most $p$; since $w = a^p b^p c^p$ consists of three blocks each of length $p$, the factor $uyv$ cannot span all three letters $a, b, c$. Thus $uyv$ contains occurrences of at most two distinct letters. Pump with $k = 2$: the word $x u^2 y v^2 z$ differs from $w$ in the counts of *at most two* letters (those appearing in $uv$, since pumping adds one more copy of $u$ and of $v$). By $|uv| > 0$, at least one letter's count strictly increases. Therefore in $x u^2 y v^2 z$ at least one of the three letter counts strictly exceeds $p$ and at least one equals $p$, so the counts are not all equal, and $x u^2 y v^2 z \notin L$, contradicting the pumping conclusion. Hence $L$ is not context-free.
[/claim]
Since $L_0 \cap L_1 = L$ is not context-free while $L_0, L_1$ are, the class of context-free languages is not closed under binary intersection.
[/step]
[step:Deduce non-closure under complement and set difference]
Let $\mathrm{CFL}$ denote the class of context-free languages over $\Sigma$.
[claim:$\mathrm{CFL}$ is not closed under complement]
Suppose for contradiction that $\mathrm{CFL}$ is closed under complement. For any $A, B \in \mathrm{CFL}$, De Morgan's law gives
\begin{align*}
A \cap B &= \Sigma^* \setminus \big((\Sigma^* \setminus A) \cup (\Sigma^* \setminus B)\big) = \overline{\overline{A} \cup \overline{B}},
\end{align*}
where we write $\overline{X} := \Sigma^* \setminus X$. Assuming complement closure, $\overline{A}, \overline{B} \in \mathrm{CFL}$; by the already-proved union closure, $\overline{A} \cup \overline{B} \in \mathrm{CFL}$; and by complement closure again, $A \cap B = \overline{\overline{A} \cup \overline{B}} \in \mathrm{CFL}$. So $\mathrm{CFL}$ would be closed under intersection, contradicting the previous step. Hence $\mathrm{CFL}$ is not closed under complement.
[/claim]
[claim:$\mathrm{CFL}$ is not closed under set difference]
Suppose for contradiction that $\mathrm{CFL}$ is closed under set difference. For any $A, B \in \mathrm{CFL}$, using also that $\Sigma^* \in \mathrm{CFL}$ (the context-free grammar with single variable $S$ and rules $S \to a S$ for each $a \in \Sigma$, together with $S \to \varepsilon$, generates $\Sigma^*$), we compute
\begin{align*}
\overline{B} &= \Sigma^* \setminus B, \\
A \cap B &= A \setminus \overline{B} = A \setminus (\Sigma^* \setminus B).
\end{align*}
Under assumed set-difference closure, $\Sigma^* \setminus B \in \mathrm{CFL}$, and then $A \setminus (\Sigma^* \setminus B) \in \mathrm{CFL}$; so $A \cap B \in \mathrm{CFL}$, contradicting the non-closure under intersection. Hence $\mathrm{CFL}$ is not closed under set difference.
[/claim]
[/step]
[step:Collect the results]
We have proved:
- $\mathrm{CFL}$ is closed under union (construction $G_\cup$), concatenation (construction $G_\cdot$), and Kleene star (construction $G_\star$);
- $\mathrm{CFL}$ is not closed under intersection (by exhibiting $L_0, L_1 \in \mathrm{CFL}$ with $L_0 \cap L_1 = \{a^n b^n c^n\} \notin \mathrm{CFL}$);
- $\mathrm{CFL}$ is not closed under complement (implied by non-closure under intersection plus closure under union via De Morgan);
- $\mathrm{CFL}$ is not closed under set difference (implied by non-closure under intersection via $A \cap B = A \setminus (\Sigma^* \setminus B)$).
This establishes both halves of the theorem.
[/step]