This course develops the foundations of model theory as the study of mathematical structures through first-order logic, then follows those foundations into algebraically closed fields, real closed fields, model completeness, and compactness constructions. It asks what can be expressed in a formal language, how structures satisfy sentences, and how different structures can be compared by the theories they realize. The central objects are languages, structures, theories, and notions such as completeness and elementary equivalence, which together provide a framework for understanding when two models are indistinguishable by first-order statements.
The course then develops the main tools that make model theory powerful: filters and ultrafilters, ultraproducts, and Łoś’s theorem, culminating in a new proof of compactness. From there, compactness and Löwenheim-Skolem theorems are used to obtain existence and size results, while quantifier elimination gives a more refined way to analyze theories by reducing formulas to simpler ones. The later chapters focus on major examples and structural consequences, especially algebraically closed fields and real closed fields, before turning to model completeness and Robinson’s test as criteria for when embeddings preserve first-order truth.
The chapters are arranged to move from foundations to methods and then to applications. Early chapters build the basic language and semantic notions, middle chapters introduce the ultraproduct machinery and its consequences, and the final chapters show how these ideas illuminate concrete theories and clarify the limits of what compactness can prove. By the end, the course aims to leave you with both a technical toolkit and a conceptual map of how model theory connects syntax, semantics, and algebraic structure.
# 1. First-Order Languages, Structures, and Satisfaction
This chapter fixes the basic interface between syntax and semantics. We first decide which finite strings count as terms and formulas, then decide what kind of mathematical object interprets those strings, and finally define truth by recursion on the complexity of formulas. The main point is that first-order logic separates the formal vocabulary from the structures in which that vocabulary is realised.
## Languages and Formulas
Before asking whether a statement is true in a structure, we need to know what counts as a statement in the first place. The guiding problem is to build a formal language flexible enough to talk about algebraic and relational structures, while keeping syntax finite and mechanically checkable.
[definition: First-Order Signature]
A first-order signature $L$ consists of three specified collections of non-logical symbols:
1. function symbols $f$, each assigned an arity $n_f \in \mathbb N \cup \{0\}$;
2. relation symbols $R$, each assigned an arity $n_R \in \mathbb N$;
3. constant symbols, regarded as function symbols of arity $0$.
[/definition]
The logical symbols are fixed once and for all: variables, equality, connectives such as $\neg$, $\wedge$, $\vee$, $\implies$, quantifiers $\forall$, $\exists$, and parentheses. The signature records only the symbols whose interpretation changes from one mathematical context to another.
[example: Standard Algebraic Signatures]
For groups, take
\begin{align*}
L_{\mathrm{grp}}=(\cdot,{}^{-1},e).
\end{align*}
Here $\cdot$ is a binary function symbol, so it is meant to be interpreted as a map $G^2\to G$; ${}^{-1}$ is a unary function symbol, so it is meant to be interpreted as a map $G\to G$; and $e$ is a constant symbol, equivalently a $0$-ary function symbol, so it is meant to name one element of $G$.
For rings, take
\begin{align*}
L_{\mathrm{ring}}=(+,\cdot,0,1).
\end{align*}
The symbols $+$ and $\cdot$ are binary function symbols, while $0$ and $1$ are constant symbols. Thus a structure for this language supplies two operations $R^2\to R$ and two distinguished elements of $R$.
For ordered fields, take
\begin{align*}
L_{\mathrm{of}}=(+,\cdot,<,0,1).
\end{align*}
This has the same two binary function symbols $+$ and $\cdot$ and the same two constant symbols $0$ and $1$ as the ring language, together with one binary relation symbol $<$. The signature records only the formal vocabulary; the actual group operation, ring operations, or order relation appear only after a particular structure interprets these symbols.
[/example]
Terms are the first layer of syntax that can later be evaluated inside a structure. They do not assert anything yet; instead, they provide the formal names for elements built from variables, constants, and the function symbols of the language. Before we can speak about equations or relations, we need this precise recursive description of which expressions count as element-denoting expressions.
[definition: Term]
Let $L$ be a first-order signature. The $L$-terms are defined inductively as follows:
1. every variable is an $L$-term;
2. every constant symbol of $L$ is an $L$-term;
3. if $f$ is an $n$-ary function symbol of $L$ and $t_1,\dots,t_n$ are $L$-terms, then $f(t_1,\dots,t_n)$ is an $L$-term.
[/definition]
A term has no truth value by itself. It becomes an element of a structure after the variables are assigned values and the function symbols are interpreted.
[example: Terms in the Language of Groups]
In $L_{\mathrm{grp}}$, the expression $x \cdot (y^{-1} \cdot e)$ is a term because $x$ and $y$ are variables, $e$ is a constant symbol, ${}^{-1}$ is a unary function symbol, and $\cdot$ is a binary function symbol.
Let $G$ be a group regarded as an $L_{\mathrm{grp}}$-structure, and let $s$ be an assignment with $s(x)=a$ and $s(y)=b$. The recursive definition of term value gives
\begin{align*}
\bigl(x \cdot (y^{-1}\cdot e)\bigr)^G[s]
&= \cdot^G\bigl(x^G[s],(y^{-1}\cdot e)^G[s]\bigr)\\
&= \cdot^G\bigl(a,(y^{-1}\cdot e)^G[s]\bigr)\\
&= \cdot^G\bigl(a,\cdot^G((y^{-1})^G[s],e^G[s])\bigr)\\
&= \cdot^G\bigl(a,\cdot^G(({}^{-1})^G(y^G[s]),e^G)\bigr)\\
&= \cdot^G\bigl(a,\cdot^G(({}^{-1})^G(b),e_G)\bigr)\\
&= a\cdot_G(b^{-1}\cdot_G e_G).
\end{align*}
Thus the same formal term can be interpreted in any group, but its value depends on the chosen group operation, inverse map, identity element, and assignment.
[/example]
The term computation above produces elements of a structure, but logic needs a way to turn such computed elements into statements that can be true or false. The first step is to isolate the smallest possible assertions: either two terms have the same value, or a relation symbol holds of the values of some terms. These clauses will be the semantic base cases for every later recursion on formulas.
[definition: Atomic Formula]
Let $L$ be a first-order signature. An atomic $L$-formula is an expression of one of the following forms:
1. $t_1 = t_2$, where $t_1,t_2$ are $L$-terms;
2. $R(t_1,\dots,t_n)$, where $R$ is an $n$-ary relation symbol of $L$ and $t_1,\dots,t_n$ are $L$-terms.
[/definition]
Atomic formulas only express single basic facts, but first-order logic also needs to combine facts and quantify over elements of a structure. This raises a bookkeeping issue: some variables are bound by quantifiers, while others still require values from an external assignment. The next definition simultaneously specifies the recursive construction of formulas and records which variables remain free.
[definition: Formula and Free Variable]
The $L$-formulas are defined inductively as follows:
1. every atomic $L$-formula is an $L$-formula;
2. if $\varphi$ and $\psi$ are $L$-formulas, then $\neg\varphi$, $(\varphi\wedge\psi)$, $(\varphi\vee\psi)$, and $(\varphi\implies\psi)$ are $L$-formulas;
3. if $\varphi$ is an $L$-formula and $x$ is a variable, then $\forall x\,\varphi$ and $\exists x\,\varphi$ are $L$-formulas.
The set of free variables of a formula is defined recursively: variables occurring in atomic formulas are free, Boolean connectives preserve free variables, and $\forall x$ and $\exists x$ bind occurrences of $x$ in their scope.
[/definition]
If $\varphi$ has free variables among $x_1,\dots,x_n$, we write $\varphi(x_1,\dots,x_n)$ to display a list of variables sufficient to evaluate it. The list may contain variables that do not occur freely, but it must include all those that do.
[example: A Formula in the Language of Graphs]
Let $L_{\mathrm{graph}}=(E)$, where $E$ is a binary relation symbol, and consider
\begin{align*}
\varphi(x) := \exists y\,\bigl(E(x,y) \wedge \neg(x=y)\bigr).
\end{align*}
The occurrence of $x$ inside $E(x,y)$ and inside $x=y$ is not in the scope of any quantifier binding $x$, so $x$ is free. The occurrences of $y$ are both inside the scope of the quantifier $\exists y$, so $y$ is bound.
If $G$ is a graph regarded as an $L_{\mathrm{graph}}$-structure and $s$ is an assignment with $s(x)=a$, then
\begin{align*}
G\models \varphi[s]
&\iff G\models \exists y\,\bigl(E(x,y)\wedge \neg(x=y)\bigr)[s]\\
&\iff \text{there exists } b\in |G| \text{ such that }
G\models \bigl(E(x,y)\wedge \neg(x=y)\bigr)[s[y\mapsto b]]\\
&\iff \text{there exists } b\in |G| \text{ such that }
G\models E(x,y)[s[y\mapsto b]]
\text{ and }
G\models \neg(x=y)[s[y\mapsto b]]\\
&\iff \text{there exists } b\in |G| \text{ such that }
(a,b)\in E^G
\text{ and }
G\not\models (x=y)[s[y\mapsto b]]\\
&\iff \text{there exists } b\in |G| \text{ such that }
(a,b)\in E^G
\text{ and }
a\neq b.
\end{align*}
Thus $\varphi(x)$ defines precisely the set of vertices that have an $E$-neighbour different from themselves; the free variable $x$ supplies the input vertex, while the bound variable $y$ ranges internally over possible witnesses.
[/example]
## Structures and Maps Between Structures
A language by itself is only a vocabulary. The next problem is to say what it means for a set to realise that vocabulary: which actual functions, relations, and distinguished elements interpret the symbols of the signature?
[definition: Structure]
Let $L$ be a first-order signature. An $L$-structure $M$ consists of a nonempty set $|M|$, called its universe, together with interpretations of all symbols of $L$:
1. for each $n$-ary function symbol $f$, a function $f^M: |M|^n \to |M|$;
2. for each $n$-ary relation symbol $R$, a subset $R^M \subset |M|^n$;
3. for each constant symbol $c$, an element $c^M \in |M|$.
[/definition]
The notation $M$ often denotes both the structure and its universe when no confusion is possible. The vertical bars $|M|$ are useful when we need to distinguish the underlying set from the additional interpretations.
[example: Four Basic Classes of Structures]
Let $L_{\mathrm{grp}}=(\cdot,{}^{-1},e)$. A group $G$ becomes an $L_{\mathrm{grp}}$-structure by taking the universe to be the underlying set $|G|$ and interpreting the three symbols as
\begin{align*}
\cdot^G &: |G|^2 \to |G|, & (a,b) &\mapsto a\cdot_G b,\\
({}^{-1})^G &: |G| \to |G|, & a &\mapsto a^{-1},\\
e^G &\in |G|, & e^G &= e_G.
\end{align*}
The arities match the signature: $\cdot$ is binary, ${}^{-1}$ is unary, and $e$ is a constant symbol, so it is interpreted as a single element of $|G|$.
Let $L_{\mathrm{ring}}=(+,\cdot,0,1)$. A commutative ring $R$ becomes an $L_{\mathrm{ring}}$-structure with universe $|R|$ by setting
\begin{align*}
+^R &: |R|^2 \to |R|, & (a,b) &\mapsto a+_R b,\\
\cdot^R &: |R|^2 \to |R|, & (a,b) &\mapsto a\cdot_R b,\\
0^R &\in |R|, & 0^R &= 0_R,\\
1^R &\in |R|, & 1^R &= 1_R.
\end{align*}
Thus the formal symbols $+$ and $\cdot$ are interpreted as the two ring operations, while $0$ and $1$ name the additive and multiplicative identities.
Let $L_{\mathrm{of}}=(+,\cdot,<,0,1)$. An ordered field $F$ is an $L_{\mathrm{of}}$-structure by using the same interpretations of $+$, $\cdot$, $0$, and $1$ as in the ring language and adding
\begin{align*}
<^F \subset |F|^2,
\end{align*}
where $(a,b)\in <^F$ exactly when $a<_F b$ in the given order on $F$. Finally, for $L_{\mathrm{graph}}=(E)$, a graph $\Gamma$ becomes an $L_{\mathrm{graph}}$-structure by taking $|\Gamma|$ to be its vertex set and interpreting
\begin{align*}
E^\Gamma \subset |\Gamma|^2
\end{align*}
as its edge relation. These examples show that a structure does not add new syntax; it assigns the already chosen formal symbols to concrete functions, relations, and distinguished elements.
[/example]
After defining structures, we need a way to recognize when a smaller universe inherits the same language from a larger one. Relations can simply be restricted to the subset, but function symbols create a closure requirement: applying a named operation to elements of the subset must stay inside the subset. This is the model-theoretic version of familiar subobjects such as subgroups and subrings.
[definition: Substructure]
Let $M$ be an $L$-structure. An $L$-substructure $A \subset M$ consists of a nonempty subset $|A| \subset |M|$ such that:
1. for every $n$-ary function symbol $f$ and all $a_1,\dots,a_n \in |A|$, we have $f^M(a_1,\dots,a_n) \in |A|$;
2. for every constant symbol $c$, we have $c^M \in |A|$;
3. each function symbol is interpreted by restriction from $M$;
4. each relation symbol is interpreted by restriction from $M$, so $R^A = R^M \cap |A|^n$ for $n$-ary $R$.
[/definition]
Thus a subgroup is a substructure in the group language, and a subring containing $1$ is a substructure in the ring language. A subset of a graph's vertices is always closed under functions because the graph language has no function symbols, so it becomes an induced subgraph.
Substructures compare one structure with a smaller part of itself; we also need maps that compare two possibly different structures in the same language. Such maps should respect the interpretations of the symbols, sending named operations and constants to their counterparts and preserving named relations. This leads to the general notion of homomorphism.
[definition: Homomorphism]
Let $M$ and $N$ be $L$-structures. A homomorphism $h:M\to N$ is a function $h:|M|\to |N|$ such that:
1. for every $n$-ary function symbol $f$ and all $a_1,\dots,a_n\in |M|$,
\begin{align*}
h(f^M(a_1,\dots,a_n))=f^N(h(a_1),\dots,h(a_n));
\end{align*}
2. for every constant symbol $c$, $h(c^M)=c^N$;
3. for every $n$-ary relation symbol $R$ and all $a_1,\dots,a_n\in |M|$, if $(a_1,\dots,a_n)\in R^M$, then $(h(a_1),\dots,h(a_n))\in R^N$.
[/definition]
Relations are required to be preserved forward by homomorphisms. For model theory, embeddings and isomorphisms are more central because they preserve atomic information in both directions.
A homomorphism may collapse elements or create extra relational facts in the target, so it need not exhibit one structure as an exact copy inside another. To regard $M$ as sitting inside $N$ without losing or adding atomic information, the map must be injective and must both preserve and reflect each relation. This stronger comparison is an embedding.
[definition: Embedding]
Let $M$ and $N$ be $L$-structures. An embedding $h:M\to N$ is an injective function $h:|M|\to |N|$ that preserves all function symbols and constants, and such that for every $n$-ary relation symbol $R$ and all $a_1,\dots,a_n\in |M|$,
\begin{align*}
(a_1,\dots,a_n)\in R^M \iff (h(a_1),\dots,h(a_n))\in R^N.
\end{align*}
[/definition]
After embeddings, we need a comparison that says not merely that one structure appears faithfully inside another, but that no part of the target is left over. Requiring the embedding to be onto makes every element and every interpretation of the language correspond exactly. This is the formal version of saying that two structures differ only by a relabelling of their underlying sets.
[definition: Isomorphism]
An isomorphism $h:M\to N$ of $L$-structures is a bijective embedding. If such an isomorphism exists, we write $M\cong N$.
[/definition]
An isomorphism identifies two structures as different presentations of the same $L$-structure. First-order formulas cannot distinguish between isomorphic structures, a result proved after satisfaction has been defined.
[example: Graph Embeddings]
In the graph language $L_{\mathrm{graph}}=(E)$, there are no function symbols and no constant symbols, so the definition of embedding reduces to the relation clause plus injectivity. Thus a map $h:G\to H$ is an embedding exactly when $h:|G|\to |H|$ is injective and, for all vertices $a,b\in |G|$,
\begin{align*}
(a,b)\in E^G
\iff
(h(a),h(b))\in E^H.
\end{align*}
This equivalence has two directions. The forward direction says every edge of $G$ is carried to an edge of $H$:
\begin{align*}
(a,b)\in E^G
\implies
(h(a),h(b))\in E^H.
\end{align*}
The reverse direction says no new edge appears among the vertices in the image of $h$:
\begin{align*}
(h(a),h(b))\in E^H
\implies
(a,b)\in E^G.
\end{align*}
Taking the contrapositive of the reverse direction gives
\begin{align*}
(a,b)\notin E^G
\implies
(h(a),h(b))\notin E^H.
\end{align*}
Therefore $\neg E(x,y)$ is preserved on pairs from $G$ as well as $E(x,y)$ itself. This is why a model-theoretic graph embedding identifies $G$ with an induced subgraph of $H$, not merely with a subgraph obtained by remembering some of the edges.
[/example]
## Assignments and Satisfaction
The syntax now has formulas, and the semantics now has structures. The remaining problem is to define truth for formulas with free variables, since such formulas are neither true nor false in a structure until their free variables receive values.
[definition: Assignment]
Let $M$ be an $L$-structure. An assignment in $M$ is a function $s$ from the set of variables to $|M|$.
If $x$ is a variable and $a\in |M|$, then $s[x\mapsto a]$ denotes the assignment that sends $x$ to $a$ and agrees with $s$ on every variable other than $x$.
[/definition]
Assignments are useful only after we specify how they turn syntactic terms into actual elements of a structure. Variables are read from the assignment, constants are read from the structure, and compound terms are evaluated by applying the interpreted function symbols to previously evaluated subterms. This recursive evaluation is the bridge from syntax to the atomic formulas that satisfaction will interpret next.
[definition: Value of a Term]
Let $M$ be an $L$-structure and let $s$ be an assignment in $M$. For each $L$-term $t$, its value $t^M[s]\in |M|$ is defined recursively by:
1. $x^M[s]=s(x)$ for every variable $x$;
2. $c^M[s]=c^M$ for every constant symbol $c$;
3. if $t=f(t_1,\dots,t_n)$, then
\begin{align*}
t^M[s]=f^M(t_1^M[s],\dots,t_n^M[s]).
\end{align*}
[/definition]
We can now define truth by recursion on formulas, but the definition must handle both logical structure and variable assignments. Atomic formulas are evaluated from term values, Boolean connectives combine truth values, and quantifiers change the assignment by allowing one variable to range over the universe. This recursive definition is the foundational semantic mechanism for the rest of the course.
[quotetheorem:4271]
[citeproof:4271]
The definition is necessary because formulas with free variables do not have an absolute truth value in a structure: the formula $x=x$ can be evaluated only after $x$ has been assigned an element, and a formula such as $x<y$ may change truth value when either assigned element changes. The recursion also explains the limited task of semantics: it tells us what truth means once a structure and assignment are fixed, but it does not decide whether an arbitrary theory has a model or whether a sentence follows from a theory. Those later questions rely on this definition because models, elementary equivalence, completeness, and compactness all use $\models$ as their semantic baseline.
When the free variables of $\varphi$ are among $x_1,\dots,x_n$ and $a=(a_1,\dots,a_n)\in |M|^n$, we write
\begin{align*}
M\models\varphi(a)
\end{align*}
for satisfaction under any assignment sending $x_i$ to $a_i$. This notation is justified only if the unused parts of the assignment are irrelevant.
The issue is that a formal assignment gives values to all variables, while a formula can mention only some of them freely. If truth depended on variables not occurring freely, then tuple notation such as $M\models\varphi(a)$ would be ambiguous. The next result isolates the precise invariance needed to make formulas define relations on tuples.
[quotetheorem:4276]
[citeproof:4276]
This theorem says that the displayed tuple notation $M\models\varphi(a_1,\dots,a_n)$ is not hiding an arbitrary choice of values for the other variables. Agreement on free variables is exactly the right hypothesis: if $x$ is free in $\varphi(x)$, changing the value assigned to $x$ can change truth, while changing a bound variable outside its quantified clause cannot. This is what allows definable subsets of $|M|^n$ to be treated, starting in Chapter 7 on quantifier elimination, as honest relations on tuples rather than as relations on full assignments.
[example: Satisfaction in an Ordered Field]
Let $R$ be an ordered field, considered as an $L_{\mathrm{of}}$-structure, and let
\begin{align*}
\varphi(x) := \exists y\,(y\cdot y = x \wedge 0 < y).
\end{align*}
For $a\in R$, the existential clause for satisfaction gives
\begin{align*}
R\models \varphi(a)
&\iff R\models \exists y\,(y\cdot y=x\wedge 0<y)[x\mapsto a]\\
&\iff \text{there exists } b\in R \text{ such that }
R\models (y\cdot y=x\wedge 0<y)[x\mapsto a,y\mapsto b]\\
&\iff \text{there exists } b\in R \text{ such that }
R\models (y\cdot y=x)[x\mapsto a,y\mapsto b]
\text{ and }
R\models (0<y)[x\mapsto a,y\mapsto b]\\
&\iff \text{there exists } b\in R \text{ such that }
b\cdot_R b=a
\text{ and }
0_R<_R b.
\end{align*}
Thus $\varphi(a)$ says exactly that $a$ is the square of a positive element of $R$.
In the usual ordered field $\mathbb R$, if $\mathbb R\models\varphi(a)$, then there is $b>0$ with $b^2=a$, so $a=b^2>0$. Conversely, if $a>0$, the usual square-root property of the real numbers gives a real number $\sqrt a>0$ with
\begin{align*}
(\sqrt a)\cdot(\sqrt a)=a,
\end{align*}
so $\mathbb R\models\varphi(a)$. Hence $\mathbb R\models\varphi(a)$ exactly when $a>0$.
In $\mathbb Q$, the formula holds at $x=4$ because the rational witness $y=2$ satisfies
\begin{align*}
2\cdot 2 &=4,\\
0&<2.
\end{align*}
It does not hold at $x=2$. If a rational witness existed, write it in lowest terms as $y=m/n$ with $m,n\in\mathbb Z$, $n\neq 0$, and $\gcd(m,n)=1$. Then
\begin{align*}
\left(\frac mn\right)^2=2
&\implies \frac{m^2}{n^2}=2\\
&\implies m^2=2n^2.
\end{align*}
So $m^2$ is even, hence $m$ is even; write $m=2k$. Substituting gives
\begin{align*}
(2k)^2&=2n^2\\
4k^2&=2n^2\\
2k^2&=n^2.
\end{align*}
Thus $n^2$ is even, hence $n$ is even, contradicting $\gcd(m,n)=1$. Therefore $2$ has no rational square root, so $\mathbb Q\not\models\varphi(2)$.
[/example]
Truth should not depend on the particular names chosen for elements of a structure. If two structures are isomorphic, their functions, relations, and constants have the same pattern after translating elements across the isomorphism. The semantic question is whether every formula respects that translation, including formulas with parameters and quantifiers.
[quotetheorem:4274]
[citeproof:4274]
This invariance is a first warning about the expressive limits of first-order logic: formulas speak only about structure, not about the names or external presentation of the underlying set. The hypotheses are doing real work. Bijectivity is needed for quantified formulas because an existential witness in $N$ must be pulled back to $M$, while relation reflection is needed so that negated atomic facts, such as $\neg E(x,y)$ in a graph, are preserved as well as positive atomic facts. The theorem does not say that first-order formulas distinguish all non-isomorphic structures; later chapters introduce elementarily equivalent structures, which satisfy the same sentences even when no isomorphism exists. Thus isomorphism invariance is the baseline semantic symmetry, and elementary equivalence is the weaker relation obtained by forgetting the specific tuple map.
## Sentences, Theories, and Semantic Consequence
Formulas with free variables define subsets or relations inside a structure. To axiomatise a class of structures, however, we need formulas whose truth does not depend on an assignment.
[definition: Sentence]
An $L$-sentence is an $L$-formula with no free variables.
[/definition]
For a sentence $\sigma$, the notation $M\models\sigma$ is independent of assignments. To study classes of structures, we package many such assignment-free requirements together and ask which structures satisfy all of them at once. This leads to the paired notions of a theory and its models.
[definition: Theory and Model]
An $L$-theory $T$ is a set of $L$-sentences.
An $L$-structure $M$ is a model of $T$, written $M\models T$, if $M\models\sigma$ for every $\sigma\in T$.
[/definition]
A theory can be finite or infinite, complete or incomplete, algebraic or relational. The group axioms give a familiar first example of a finite theory.
[example: Group Axioms as a Theory]
In $L_{\mathrm{grp}}=(\cdot,{}^{-1},e)$, let $T_{\mathrm{grp}}$ be the theory consisting of the three sentences
\begin{align*}
&\forall x\,\forall y\,\forall z\,((x\cdot y)\cdot z = x\cdot (y\cdot z)),\\
&\forall x\,(e\cdot x=x \wedge x\cdot e=x),\\
&\forall x\,(x\cdot x^{-1}=e \wedge x^{-1}\cdot x=e).
\end{align*}
If $M$ is an $L_{\mathrm{grp}}$-structure, write its interpreted operation, inverse map, and constant as
\begin{align*}
\cdot^M &: |M|^2\to |M|,\\
({}^{-1})^M &: |M|\to |M|,\\
e^M &\in |M|.
\end{align*}
The first sentence says that for every $a,b,c\in |M|$,
\begin{align*}
M\models ((x\cdot y)\cdot z = x\cdot (y\cdot z))[x\mapsto a,y\mapsto b,z\mapsto c]
&\iff ((x\cdot y)\cdot z)^M[s]=(x\cdot (y\cdot z))^M[s]\\
&\iff \cdot^M\bigl((x\cdot y)^M[s],z^M[s]\bigr)
=\cdot^M\bigl(x^M[s],(y\cdot z)^M[s]\bigr)\\
&\iff \cdot^M\bigl(\cdot^M(a,b),c\bigr)
=\cdot^M\bigl(a,\cdot^M(b,c)\bigr),
\end{align*}
where $s$ is the displayed assignment. Thus the first sentence is exactly associativity of $\cdot^M$.
The second sentence says that for every $a\in |M|$,
\begin{align*}
M\models (e\cdot x=x \wedge x\cdot e=x)[x\mapsto a]
&\iff M\models (e\cdot x=x)[x\mapsto a]
\text{ and }M\models (x\cdot e=x)[x\mapsto a]\\
&\iff (e\cdot x)^M[x\mapsto a]=x^M[x\mapsto a]
\text{ and }(x\cdot e)^M[x\mapsto a]=x^M[x\mapsto a]\\
&\iff \cdot^M(e^M,a)=a
\text{ and }\cdot^M(a,e^M)=a.
\end{align*}
So $e^M$ is a two-sided identity element for $\cdot^M$.
The third sentence says that for every $a\in |M|$,
\begin{align*}
M\models (x\cdot x^{-1}=e \wedge x^{-1}\cdot x=e)[x\mapsto a]
&\iff M\models (x\cdot x^{-1}=e)[x\mapsto a]
\text{ and }M\models (x^{-1}\cdot x=e)[x\mapsto a]\\
&\iff (x\cdot x^{-1})^M[x\mapsto a]=e^M
\text{ and }(x^{-1}\cdot x)^M[x\mapsto a]=e^M\\
&\iff \cdot^M\bigl(a,({}^{-1})^M(a)\bigr)=e^M
\text{ and }\cdot^M\bigl(({}^{-1})^M(a),a\bigr)=e^M.
\end{align*}
Thus every element has the inverse assigned to it by the unary function symbol ${}^{-1}$. Therefore an $L_{\mathrm{grp}}$-structure satisfies $T_{\mathrm{grp}}$ exactly when its interpretations form a group in the usual algebraic sense.
[/example]
The example makes satisfaction a property of a theory rather than of one sentence. This is the point at which model existence becomes a central question.
[definition: Satisfiable Theory]
An $L$-theory $T$ is satisfiable if there exists an $L$-structure $M$ such that $M\models T$.
[/definition]
Satisfiability asks whether a theory has at least one model. Semantic consequence asks what must be true in every model once the theory is assumed.
[definition: Semantic Consequence]
Let $T$ be an $L$-theory and let $\sigma$ be an $L$-sentence. We write $T\models\sigma$ if every model of $T$ is a model of $\sigma$.
[/definition]
Semantic consequence is therefore a statement about all structures, not about derivations in a formal proof system. The compactness proof in Chapter 4 will use the completeness theorem from first-order logic to compare this semantic notion with syntactic provability.
[example: Ordered Fields Satisfy No-Endpoint Consequences]
Let $T_{\mathrm{of}}$ be the usual first-order theory of ordered fields in $L_{\mathrm{of}}$. We show that
\begin{align*}
T_{\mathrm{of}}\models \forall x\,\exists y\,(x<y).
\end{align*}
By the definition of semantic consequence, this means: for every $L_{\mathrm{of}}$-structure $M$, if $M\models T_{\mathrm{of}}$, then
\begin{align*}
M\models \forall x\,\exists y\,(x<y).
\end{align*}
So let $M\models T_{\mathrm{of}}$, and let $a\in |M|$ be arbitrary. Since $M$ is an ordered field, $1^M$ is positive:
\begin{align*}
0^M <^M 1^M.
\end{align*}
Set
\begin{align*}
b := +^M(a,1^M).
\end{align*}
Using the ordered-field axiom that addition preserves strict order, add $a$ to both sides of $0^M <^M 1^M$:
\begin{align*}
0^M <^M 1^M
&\implies +^M(a,0^M) <^M +^M(a,1^M)\\
&\implies a <^M b,
\end{align*}
because $+^M(a,0^M)=a$ and $+^M(a,1^M)=b$. Thus for this arbitrary $a\in |M|$ there exists $b\in |M|$ such that $a<^M b$, so
\begin{align*}
M\models \exists y\,(x<y)[x\mapsto a].
\end{align*}
Since $a$ was arbitrary,
\begin{align*}
M\models \forall x\,\exists y\,(x<y).
\end{align*}
Therefore every model of $T_{\mathrm{of}}$ satisfies the sentence, and hence
\begin{align*}
T_{\mathrm{of}}\models \forall x\,\exists y\,(x<y).
\end{align*}
This example shows that semantic consequence is proved by reasoning inside an arbitrary model of the theory, not by checking one particular ordered field.
[/example]
## Quantifier-Free Preservation Under Substructures
The last issue in this chapter is how truth behaves when passing between a structure and a substructure. Quantifiers can create new demands, since a witness may exist in the larger structure but not in the smaller one. Quantifier-free formulas avoid this difficulty because their truth is determined entirely by interpreting terms and relations on the given tuple.
[definition: Quantifier-Free Formula]
An $L$-formula is quantifier-free if it is built from atomic formulas using only Boolean connectives and contains no occurrence of $\forall$ or $\exists$.
[/definition]
For a substructure $A\subset M$, terms with parameters from $A$ have the same values in $A$ and in $M$, and relation symbols are interpreted by restriction. This makes atomic truth stable between the two structures. Since quantifier-free formulas are built from atomic formulas using only Boolean operations, there is no need to search for witnesses outside $A$.
[quotetheorem:4278]
[citeproof:4278]
The substructure hypothesis is essential: an arbitrary subset of the universe may fail to contain the interpretations of constants or to be closed under function symbols, so terms with parameters from the subset may no longer evaluate inside it. The quantifier-free restriction is also essential because Boolean combinations of atomic formulas only inspect the tuple already present, whereas an existential quantifier may ask for a new witness that exists in $M$ but not in $A$. This preservation result is therefore a local statement about the induced atomic diagram of a substructure, not a general transfer theorem for all first-order formulas. It will be used later as the first model-theoretic preservation principle, before stronger notions such as elementary substructure are introduced to handle formulas with quantifiers.
[example: Why Quantifiers Are Different]
Let $A=\mathbb Q$ and $M=\mathbb R$ as ordered fields in $L_{\mathrm{of}}$, with the usual inclusion $\mathbb Q\subset\mathbb R$. First consider the quantifier-free formula
\begin{align*}
\psi(x) := x<x\cdot x.
\end{align*}
For a rational number $q\in\mathbb Q$, the interpretations of $<$ and $\cdot$ in $\mathbb Q$ are the restrictions of the usual interpretations in $\mathbb R$, so
\begin{align*}
\mathbb Q\models \psi(q)
&\iff q <_{\mathbb Q} q\cdot_{\mathbb Q} q\\
&\iff q <_{\mathbb Q} q^2\\
&\iff q <_{\mathbb R} q^2\\
&\iff q <_{\mathbb R} q\cdot_{\mathbb R} q\\
&\iff \mathbb R\models \psi(q).
\end{align*}
Thus this quantifier-free formula has the same truth value in $\mathbb Q$ and $\mathbb R$ on every rational input.
Now consider the quantified formula
\begin{align*}
\varphi(x) := \exists y\,(y\cdot y = x \wedge 0<y).
\end{align*}
In $\mathbb R$, at $x=2$, choose $y=\sqrt 2$. Then $\sqrt2\in\mathbb R$, $\sqrt2>0$, and
\begin{align*}
\sqrt2\cdot_{\mathbb R}\sqrt2=2.
\end{align*}
Therefore
\begin{align*}
\mathbb R\models \varphi(2).
\end{align*}
In $\mathbb Q$, the same formula would require a rational witness. Suppose, for contradiction, that
\begin{align*}
\mathbb Q\models \varphi(2).
\end{align*}
Then there is $r\in\mathbb Q$ such that
\begin{align*}
r\cdot_{\mathbb Q} r=2
\quad\text{and}\quad
0<_{\mathbb Q}r.
\end{align*}
Write $r=m/n$ in lowest terms, with $m,n\in\mathbb Z$, $n\neq 0$, and $\gcd(m,n)=1$. From $r^2=2$ we get
\begin{align*}
\left(\frac mn\right)^2&=2\\
\frac{m^2}{n^2}&=2\\
m^2&=2n^2.
\end{align*}
Hence $m^2$ is even, so $m$ is even. Write $m=2k$ for some $k\in\mathbb Z$. Substituting gives
\begin{align*}
(2k)^2&=2n^2\\
4k^2&=2n^2\\
2k^2&=n^2.
\end{align*}
Hence $n^2$ is even, so $n$ is even. Thus both $m$ and $n$ are divisible by $2$, contradicting $\gcd(m,n)=1$. Therefore no rational witness exists, and
\begin{align*}
\mathbb Q\not\models \varphi(2).
\end{align*}
The quantifier changes the situation because it asks for an element of the ambient universe: $\sqrt2$ is available in $\mathbb R$ but not in $\mathbb Q$.
[/example]
This chapter has fixed the semantic ground rules for first-order model theory. A theory is now a set of sentences, its models are the structures satisfying them, and semantic consequence means truth in all such models. The following chapters ask how much of a structure is determined by its theory, when two structures satisfy the same sentences, and how to build new models from old ones.
Having fixed languages, structures, satisfaction, and semantic consequence, we can now shift attention from single structures to the collections of sentences they satisfy. The next step is to ask when two structures are indistinguishable by first-order sentences, and how that notion controls elementary equivalence and model comparison.
# 2. Theories, Complete Theories, and Elementary Equivalence
After Chapter 1 fixed languages, structures, satisfaction, and semantic consequence, this chapter moves from individual structures to the theories they satisfy. The central question is how much first-order information is contained in a set of sentences, and when two structures cannot be separated by any first-order sentence. We also introduce elementary embeddings and diagrams, which turn semantic questions about embeddings into satisfiability questions about theories.
## Axiomatizable Classes and Complete Theories
When studying a class of structures, the first model-theoretic question is whether membership in the class is controlled by first-order sentences. Some familiar algebraic classes, such as groups and fields, are given exactly by first-order axioms. Other natural classes require more than first-order logic can supply.
[definition: Axiomatizable Class]
Let $L$ be a first-order language. A class $\mathcal K$ of $L$-structures is axiomatizable if there is an $L$-theory $T$ such that, for every $L$-structure $M$,
\begin{align*}
M \in \mathcal K \iff M \models T.
\end{align*}
If $T$ may be chosen finite, then $\mathcal K$ is finitely axiomatizable.
[/definition]
The theory $T$ is allowed to contain infinitely many sentences. This matters because first-order logic cannot always compress an infinite list of requirements into one sentence.
[example: Infinite Sets]
Work in the empty language $L=\varnothing$. For each $n \ge 1$, let $\sigma_n$ be the sentence
\begin{align*}
\exists x_1\dots \exists x_n \bigwedge_{1\le i<j\le n} x_i\ne x_j,
\end{align*}
which says that the structure has at least $n$ distinct elements.
We show that
\begin{align*}
T_{\infty}=\{\sigma_n:n\ge 1\}
\end{align*}
axiomatizes the class of infinite sets. If $M\models T_\infty$, then $M\models \sigma_n$ for every $n\ge 1$, so for every $n$ there are elements $a_1,\dots,a_n\in M$ such that $a_i\ne a_j$ whenever $i<j$. Hence $|M|\ge n$ for every $n$, and therefore $M$ is infinite. Conversely, if $M$ is infinite, then for each fixed $n\ge 1$ we can choose distinct elements $a_1,\dots,a_n\in M$, so
\begin{align*}
M\models \exists x_1\dots \exists x_n \bigwedge_{1\le i<j\le n} x_i\ne x_j,
\end{align*}
and therefore $M\models \sigma_n$. Since this holds for every $n$, we have $M\models T_\infty$.
A finite subset $\Delta\subseteq T_\infty$ has the form
\begin{align*}
\Delta\subseteq \{\sigma_1,\dots,\sigma_N\}
\end{align*}
for some $N\ge 1$, so satisfying $\Delta$ requires only at least the largest finite size mentioned by its sentences. Thus infinitude is captured by the whole infinite theory $T_\infty$, not by any bounded finite fragment of it.
[/example]
The preceding example also shows why theories are usually not treated as single sentences: the expressive unit is the whole set of sentences, not an individual axiom. Even after a theory axiomatizes a natural class, it may still leave many first-order questions undecided.
To measure how much information a theory contains, we ask whether it decides every sentence in its language. A complete theory is not merely a large set of axioms; it is a satisfiable theory whose semantic consequences leave no first-order sentence unresolved. This notion separates axiomatizing a class from fully determining its first-order behavior.
[definition: Complete Theory]
Let $T$ be an $L$-theory. The theory $T$ is complete if $T$ is satisfiable and for every $L$-sentence $\varphi$, exactly one of the following holds:
\begin{align*}
T \models \varphi, \qquad T \models \neg\varphi.
\end{align*}
[/definition]
The satisfiability condition rules out the degenerate case where an inconsistent theory entails every sentence. Completeness is therefore a semantic notion of maximal information among satisfiable theories.
The abstract notion of completeness becomes concrete when a single structure is used as the source of truth. Instead of choosing axioms first, we collect exactly the sentences that hold in that structure.
This motivates a separate definition: a structure determines its own complete theory, and that theory records precisely the first-order information visible inside the structure. This construction gives a canonical example of a complete theory and provides the language for comparing structures by their first-order content rather than by their elements.
[definition: Complete Theory of a Structure]
Let $M$ be an $L$-structure. The complete theory of $M$ is
\begin{align*}
\operatorname{Th}(M)=\{\varphi : \varphi \text{ is an } L\text{-sentence and } M\models \varphi\}.
\end{align*}
[/definition]
For every structure $M$, the theory $\operatorname{Th}(M)$ is complete: each sentence is either true or false in $M$. This gives the main source of complete theories in the course.
Once complete theories of structures are available, we can compare structures by the sentences they satisfy rather than by explicit maps between their elements. This comparison ignores presentation and asks whether first-order logic can tell the structures apart at all. The resulting relation is elementary equivalence.
[definition: Elementary Equivalence]
Let $M$ and $N$ be $L$-structures. We say that $M$ and $N$ are elementarily equivalent, written $M\equiv N$, if
\begin{align*}
\operatorname{Th}(M)=\operatorname{Th}(N).
\end{align*}
[/definition]
Thus $M\equiv N$ means that $M$ and $N$ satisfy the same first-order sentences. This is weaker than isomorphism because first-order sentences cannot name arbitrary elements unless constants for them are present.
[example: Finite Structures and Elementary Equivalence]
Let $M=\{a_1,\dots,a_m\}$, and suppose $L$ has relation symbols $R_1,\dots,R_s$, with $R_\ell$ of arity $r_\ell$. We encode the whole finite relational structure of $M$ by the single $L$-sentence
\begin{align*}
\theta_M:=\exists x_1\dots \exists x_m\Bigg(
&\bigwedge_{1\le i<j\le m} x_i\ne x_j
\ \wedge\
\forall y\,\bigvee_{i=1}^m y=x_i \\
&\wedge\
\bigwedge_{\ell=1}^s\
\bigwedge_{(i_1,\dots,i_{r_\ell})\in \{1,\dots,m\}^{r_\ell}}
\eta_{\ell,i_1,\dots,i_{r_\ell}}(x_{i_1},\dots,x_{i_{r_\ell}})
\Bigg),
\end{align*}
where
\begin{align*}
\eta_{\ell,i_1,\dots,i_{r_\ell}}=
\begin{cases}
R_\ell(x_{i_1},\dots,x_{i_{r_\ell}}), &\text{if } M\models R_\ell(a_{i_1},\dots,a_{i_{r_\ell}}),\\
\neg R_\ell(x_{i_1},\dots,x_{i_{r_\ell}}), &\text{if } M\not\models R_\ell(a_{i_1},\dots,a_{i_{r_\ell}}).
\end{cases}
\end{align*}
The first conjunction says the witnesses are distinct, the universal clause says they exhaust the whole structure, and the final finite conjunction records the truth value of every relation on every tuple of elements.
Since $M\models \theta_M$ using the witnesses $a_1,\dots,a_m$, elementary equivalence gives $N\models \theta_M$. Choose witnesses $b_1,\dots,b_m\in N$ for $\theta_M$. The clauses $b_i\ne b_j$ for $i<j$ make the map $f(a_i)=b_i$ injective, and the clause $\forall y\,\bigvee_{i=1}^m y=b_i$ makes it surjective. For each relation symbol $R_\ell$ and each tuple $(i_1,\dots,i_{r_\ell})$,
\begin{align*}
M\models R_\ell(a_{i_1},\dots,a_{i_{r_\ell}})
\iff
N\models R_\ell(b_{i_1},\dots,b_{i_{r_\ell}}),
\end{align*}
because $\theta_M$ contains exactly the corresponding positive or negated atomic formula. Thus $f$ is an isomorphism, so $M\cong N$. For infinite structures this argument breaks down because one first-order sentence cannot list all elements, and non-isomorphic infinite structures may satisfy the same complete theory.
[/example]
The finite case is therefore misleading: elementary equivalence only becomes a genuinely weaker relation than isomorphism once infinite structures enter. Linear orders provide the first useful contrast.
[example: Linear Orders Without Endpoints]
In the language $L=\{<\}$, the class of linear orders without endpoints is axiomatized by the linear order axioms
\begin{align*}
&\forall x\,\neg(x<x),\\
&\forall x\,\forall y\,\forall z\,((x<y\wedge y<z)\to x<z),\\
&\forall x\,\forall y\,(x<y\vee x=y\vee y<x),
\end{align*}
together with the two endpoint-exclusion sentences
\begin{align*}
\forall x\,\exists y\,(y<x), \qquad \forall x\,\exists z\,(x<z).
\end{align*}
Both $(\mathbb Z,<)$ and $(\mathbb Q,<)$ satisfy these sentences. Indeed, if $n\in\mathbb Z$, then $n-1<n<n+1$, so $n-1$ witnesses the first endpoint sentence and $n+1$ witnesses the second. If $q\in\mathbb Q$, then $q-1<q<q+1$, and $q-1,q+1\in\mathbb Q$, so $(\mathbb Q,<)$ also has no endpoints.
Let
\begin{align*}
\delta:=\forall x\,\forall z\,\bigl(x<z\to \exists y\,(x<y\wedge y<z)\bigr),
\end{align*}
which says that every nonempty open interval contains a point. The structure $(\mathbb Q,<)$ satisfies $\delta$: if $q<r$, set $y=(q+r)/2$. Since $q,r\in\mathbb Q$, also $y\in\mathbb Q$, and
\begin{align*}
q<r
&\implies 2q<q+r\\
&\implies q<\frac{q+r}{2}=y,
\end{align*}
while
\begin{align*}
q<r
&\implies q+r<2r\\
&\implies y=\frac{q+r}{2}<r.
\end{align*}
Thus $q<y<r$. On the other hand, $(\mathbb Z,<)$ does not satisfy $\delta$, because $0<1$ but there is no integer $m$ with $0<m<1$: if $m\in\mathbb Z$ and $0<m$, then $m\ge 1$, so $m<1$ fails.
Therefore the axioms for linear orders without endpoints have one model satisfying $\delta$ and another model satisfying $\neg\delta$. The theory is consequently not complete: it fixes the absence of first and last elements, but it does not decide density of the order.
[/example]
## Elementary Embeddings and Elementary Substructures
A homomorphism or embedding preserves atomic information. The next question is what it means for a map to preserve all first-order information, including quantified formulas whose witnesses may lie elsewhere in the structure.
[definition: Elementary Embedding]
Let $M$ and $N$ be $L$-structures. An embedding $f:M\to N$ is elementary if, for every $L$-formula $\varphi(x_1,\dots,x_n)$ and every $a_1,\dots,a_n\in M$,
\begin{align*}
M\models \varphi(a_1,\dots,a_n) \iff N\models \varphi(f(a_1),\dots,f(a_n)).
\end{align*}
We write $f:M\preccurlyeq N$ when $f$ is an elementary embedding.
[/definition]
Many applications compare a smaller structure with a larger one through the inclusion map rather than through an arbitrary embedding. In that setting, the question is whether formulas with parameters from the smaller structure have the same truth values when evaluated in the larger structure. This special case is important enough to name separately.
[definition: Elementary Substructure]
Let $M\subset N$ be $L$-structures, with the inclusion map an $L$-embedding. We say that $M$ is an elementary substructure of $N$, written $M\preccurlyeq N$, if the inclusion $M\hookrightarrow N$ is elementary.
[/definition]
Elementary substructures are much stronger than ordinary substructures. The obstruction is existential quantification: a witness that exists in $N$ must already be available in $M$ whenever the parameters come from $M$.
[quotetheorem:4279]
[citeproof:4279]
This lemma is useful because arguments often establish only one implication by induction on formulas. Negation then supplies the missing implication, but only because the hypothesis ranges over all first-order formulas. Checking only atomic formulas would say merely that $f$ is an embedding; for example, the inclusion $(\mathbb Z,<)\subset(\mathbb Q,<)$ preserves and reflects atomic order relations on integer parameters, but it does not preserve the existential formula $\exists y\,(0<y<1)$. The theorem also does not say that every embedding between elementarily equivalent structures is elementary: elementarity is a property of the map together with all parameter formulas, not just of the two complete theories.
For inclusions of substructures, elementarity can be tested more economically. Universal and Boolean structure causes no new elements to appear, so the real obstruction is existential: whenever the larger structure has a witness for a formula with parameters from the smaller structure, the smaller structure must already contain such a witness. The Tarski-Vaught criterion turns this obstruction into a practical test.
[quotetheorem:4280]
[citeproof:4280]
The substructure hypothesis is essential: without it, atomic formulas and function symbols may already be misinterpreted by the inclusion, so the witness condition would not even test the right map. The point of the criterion is that existential formulas are the only place where a larger structure can create new truth by adding new elements. It does not say that every substructure is elementary; it says that whenever parameters from the smaller structure define a nonempty set in the ambient structure, that set must already meet the smaller structure. Later this becomes the main tool for constructing elementary submodels: build $N\subset M$ so that it is closed under enough definable choices, then use Tarski-Vaught to prove $N\preccurlyeq M$.
[example: A Non-Elementary Substructure]
The inclusion map $(\mathbb Z,<)\hookrightarrow(\mathbb Q,<)$ is an $L$-embedding in the language $L=\{<\}$: for any $m,n\in\mathbb Z$,
\begin{align*}
(\mathbb Z,<)\models m<n
\iff m<n \text{ as integers}
\iff m<n \text{ as rationals}
\iff (\mathbb Q,<)\models m<n.
\end{align*}
However, this substructure is not elementary. In $(\mathbb Q,<)$,
\begin{align*}
0<\frac{1}{2}<1,
\end{align*}
so
\begin{align*}
(\mathbb Q,<)\models \exists y\,(0<y\wedge y<1).
\end{align*}
If $m\in\mathbb Z$ and $0<m$, then $m\ge 1$, so $m<1$ is false. Hence there is no $m\in\mathbb Z$ such that
\begin{align*}
0<m<1,
\end{align*}
and therefore
\begin{align*}
(\mathbb Z,<)\not\models \exists y\,(0<y\wedge y<1).
\end{align*}
Thus the formula $\exists y\,(0<y\wedge y<1)$ has a witness in the larger structure with parameters from $\mathbb Z$, but no witness in the substructure. This is exactly the obstruction detected by the Tarski-Vaught witness condition, so $(\mathbb Z,<)\not\preccurlyeq(\mathbb Q,<)$.
[/example]
## Elementary Diagrams and the Diagram Method
How can we force a model of a theory to contain a specified structure? The diagram method answers this by expanding the language with constant symbols for the elements of the structure and writing down all atomic information about them.
[definition: Expanded Language With Constants]
Let $M$ be an $L$-structure. The language $L(M)$ is obtained from $L$ by adding a new constant symbol $c_a$ for each element $a\in M$.
[/definition]
Once constants have been added, the structure $M$ can be described inside the expanded language by writing sentences about those named elements. To connect this expanded language with embeddings, we first need the smallest record that still fixes all function values and relation facts among the named elements. Keeping exactly the atomic facts and their negations gives the theory whose models contain a copy of $M$ as an ordinary substructure.
[definition: Atomic Diagram]
Let $M$ be an $L$-structure. The atomic diagram $\operatorname{Diag}_{\mathrm{at}}(M)$ is the set of all atomic and negated atomic $L(M)$-sentences true in the expansion of $M$ interpreting each $c_a$ as $a$.
[/definition]
Atomic information is enough to recover the algebraic shape of $M$, but not enough to control formulas with quantifiers. To force elementary behavior, the diagram must record every sentence in the expanded language.
[definition: Elementary Diagram]
Let $M$ be an $L$-structure. The elementary diagram $\operatorname{Diag}_{\mathrm{el}}(M)$ is the set of all $L(M)$-sentences true in the expansion of $M$ interpreting each $c_a$ as $a$.
[/definition]
The atomic diagram controls embeddings of $M$, because it records exactly the atomic and negated atomic facts that an embedding must preserve and reflect. The elementary diagram asks for more: it records every first-order fact about the named elements of $M$, including facts involving quantifiers. Models of these expanded theories therefore encode ordinary embeddings and elementary embeddings of the original structure.
[quotetheorem:4281]
[citeproof:4281]
The distinction between the two diagrams is exactly the distinction between preserving quantifier-free structure and preserving all first-order structure. Atomic diagrams must include constants for all elements of $M$, since an embedding is a map defined on every element, not only on a chosen generating set unless the structure is generated by that set. The atomic diagram alone cannot prevent new witnesses from appearing around the named copy of $M$; for instance, it can embed $(\mathbb Z,<)$ into a denser order without making the copy elementary. Conversely, a model of the expanded language is not literally an extension of $M$ until we pass through the map $a\mapsto c_a^N$, and that map may fail to be surjective onto the whole $L$-reduct.
[example: Diagram Method for Extending Models]
Suppose $T$ is an $L$-theory and $M$ is an $L$-structure. Form the expanded language $L(M)$ by adding a constant $c_a$ for each $a\in M$, and consider
\begin{align*}
T\cup \operatorname{Diag}_{\mathrm{at}}(M),
\end{align*}
where each sentence of $T$ is viewed as an $L(M)$-sentence.
If $N\models T\cup \operatorname{Diag}_{\mathrm{at}}(M)$, define
\begin{align*}
f:M&\to N|_L,\\
f(a)&=c_a^N.
\end{align*}
For distinct $a,b\in M$, the sentence $c_a\ne c_b$ belongs to $\operatorname{Diag}_{\mathrm{at}}(M)$, so
\begin{align*}
N\models c_a\ne c_b,
\end{align*}
and hence $f(a)\ne f(b)$. Thus $f$ is injective. If $R$ is an $r$-ary relation symbol of $L$ and $a_1,\dots,a_r\in M$, then the atomic diagram contains exactly one of
\begin{align*}
R(c_{a_1},\dots,c_{a_r}),\qquad \neg R(c_{a_1},\dots,c_{a_r}),
\end{align*}
according as $M\models R(a_1,\dots,a_r)$ or $M\not\models R(a_1,\dots,a_r)$. Since $N$ satisfies that sentence,
\begin{align*}
M\models R(a_1,\dots,a_r)
\iff
N\models R(f(a_1),\dots,f(a_r)).
\end{align*}
Similarly, if $g$ is an $r$-ary function symbol and
\begin{align*}
g^M(a_1,\dots,a_r)=b,
\end{align*}
then the atomic sentence
\begin{align*}
g(c_{a_1},\dots,c_{a_r})=c_b
\end{align*}
lies in $\operatorname{Diag}_{\mathrm{at}}(M)$, so
\begin{align*}
g^{N|_L}(f(a_1),\dots,f(a_r))=f(b).
\end{align*}
Therefore $f$ is an embedding of $M$ into the $L$-reduct $N|_L$, and $N|_L\models T$ because $N\models T$.
If we instead require
\begin{align*}
N\models T\cup \operatorname{Diag}_{\mathrm{el}}(M),
\end{align*}
then the same map $f(a)=c_a^N$ preserves every formula with parameters from $M$. Indeed, for any $L$-formula $\varphi(x_1,\dots,x_n)$ and any $a_1,\dots,a_n\in M$, the corresponding $L(M)$-sentence
\begin{align*}
\varphi(c_{a_1},\dots,c_{a_n})
\end{align*}
belongs to the elementary diagram exactly when
\begin{align*}
M\models \varphi(a_1,\dots,a_n).
\end{align*}
Since $N$ satisfies the elementary diagram, this gives
\begin{align*}
M\models \varphi(a_1,\dots,a_n)
\iff
N|_L\models \varphi(f(a_1),\dots,f(a_n)).
\end{align*}
Thus the atomic diagram forces an embedded copy of $M$, while the elementary diagram forces that copy to be elementary.
[/example]
The diagram method will become more powerful after compactness is proved in Chapter 4. At that point, it will often be enough to check finite pieces of a diagram to build large extensions, as in the compactness applications of Chapter 5.
Once theories and elementary equivalence are in view, the natural question becomes how to build new structures while preserving first-order information. Ultraproducts provide exactly that mechanism: they package infinitely many structures into one, with filters and ultrafilters governing what counts as a large set of coordinates.
# 3. Filters, Ultrafilters, and Ultraproducts
Ultraproducts turn a family of structures into a single structure by declaring two sequences to be the same when they agree on a large set of indices. The word "large" is not measured by cardinality, but by a filter, and the special case of an ultrafilter is what makes first-order logic behave well in the next chapter. This chapter builds the set-theoretic and algebraic machinery needed for Łoś's theorem in Chapter 4: filters, ultrafilters, reduced products, ultraproducts, and the interpretation of symbols in quotient structures.
## Measuring Large Subsets of an Index Set
How can we speak about a property holding for almost all indices without choosing a probability measure or a finite counting convention? A filter is the abstract device that records which subsets of an index set are to count as large. The axioms say that large sets are nonempty in the relevant sense, are closed under finite conjunction, and remain large after weakening.
[definition: Filter]
Let $I$ be a set. A filter on $I$ is a collection $\mathcal F \subseteq \mathcal P(I)$ such that:
1. $I \in \mathcal F$ and $\varnothing \notin \mathcal F$.
2. If $A,B \in \mathcal F$, then $A \cap B \in \mathcal F$.
3. If $A \in \mathcal F$ and $A \subseteq B \subseteq I$, then $B \in \mathcal F$.
[/definition]
The guiding example is the collection of subsets containing a fixed point. It treats agreement at that point as the only relevant notion of largeness.
[example: Principal Filter At A Point]
Let $i_0 \in I$, and define
\begin{align*}
\mathcal F_{i_0}=\{A\subseteq I : i_0\in A\}.
\end{align*}
We verify the filter axioms. Since $i_0\in I$, we have $I\in \mathcal F_{i_0}$, while $\varnothing\notin \mathcal F_{i_0}$ because $i_0\notin \varnothing$. If $A,B\in \mathcal F_{i_0}$, then $i_0\in A$ and $i_0\in B$, hence $i_0\in A\cap B$, so
\begin{align*}
A\cap B\in \mathcal F_{i_0}.
\end{align*}
If $A\in \mathcal F_{i_0}$ and $A\subseteq B\subseteq I$, then $i_0\in A$ and $A\subseteq B$ imply $i_0\in B$, so $B\in \mathcal F_{i_0}$.
Thus $\mathcal F_{i_0}$ is a filter on $I$. In this case, a subset of $I$ is large exactly when it contains the single coordinate $i_0$, so quotienting by large agreement only remembers what happens at $i_0$; the corresponding ultraproduct collapses to the $i_0$-th factor and produces no limiting behaviour.
[/example]
For infinite index sets, the filter of cofinite subsets records the familiar phrase "all but finitely many".
[example: Frechet Filter]
Let $I$ be an infinite set, and define
\begin{align*}
\mathcal F_{\mathrm{cof}}=\{A\subseteq I : I\setminus A \text{ is finite}\}.
\end{align*}
We verify that this is a filter. Since
\begin{align*}
I\setminus I=\varnothing,
\end{align*}
the set $I$ belongs to $\mathcal F_{\mathrm{cof}}$. Also
\begin{align*}
I\setminus \varnothing=I,
\end{align*}
which is infinite, so $\varnothing\notin \mathcal F_{\mathrm{cof}}$.
If $A,B\in \mathcal F_{\mathrm{cof}}$, then $I\setminus A$ and $I\setminus B$ are finite. By De Morgan's law,
\begin{align*}
I\setminus (A\cap B)=(I\setminus A)\cup (I\setminus B).
\end{align*}
The union of two finite sets is finite, so $I\setminus(A\cap B)$ is finite, and hence
\begin{align*}
A\cap B\in \mathcal F_{\mathrm{cof}}.
\end{align*}
If $A\in \mathcal F_{\mathrm{cof}}$ and $A\subseteq B\subseteq I$, then
\begin{align*}
I\setminus B\subseteq I\setminus A.
\end{align*}
Since $I\setminus A$ is finite, every subset of it is finite, so $I\setminus B$ is finite. Therefore $B\in \mathcal F_{\mathrm{cof}}$.
Thus $\mathcal F_{\mathrm{cof}}$ is a filter on $I$. It is not an ultrafilter when $I$ is infinite: choose $A\subseteq I$ such that both $A$ and $I\setminus A$ are infinite, for instance the even and odd natural numbers when $I=\mathbb N$. Then
\begin{align*}
I\setminus A \text{ is infinite}
\end{align*}
so $A\notin \mathcal F_{\mathrm{cof}}$, and also
\begin{align*}
I\setminus (I\setminus A)=A
\end{align*}
is infinite, so $I\setminus A\notin \mathcal F_{\mathrm{cof}}$. Hence the Frechet filter does not decide this subset, which is exactly the obstruction to being an ultrafilter.
[/example]
A filter does not always decide whether a subset is large or small. Ultraproducts require a sharper notion: for every subset of the index set, exactly one of the subset and its complement is declared large.
[definition: Ultrafilter]
Let $I$ be a set. An ultrafilter on $I$ is a filter $\mathcal U$ on $I$ such that for every $A \subseteq I$, either $A \in \mathcal U$ or $I \setminus A \in \mathcal U$.
[/definition]
The definition of ultrafilter says that every subset is decided, but in practice ultrafilters are often recognized by a more structural condition. We need a criterion that turns the deciding property into maximality among proper filters, because maximality is the form that can be produced by extension arguments. The next result supplies that equivalence.
[quotetheorem:4283]
[citeproof:4283]
This characterisation explains why ultrafilters are often constructed by maximality rather than by explicitly deciding every subset. The properness hypothesis is essential: if $\varnothing$ were allowed, the whole power set $\mathcal P(I)$ would be a maximal collection under inclusion but would not encode a meaningful notion of largeness. The theorem also separates ordinary filters, such as the Frechet filter on an infinite set, from ultrafilters: the Frechet filter is proper but not maximal because it can be extended while still avoiding finite sets.
Not all ultrafilters produce genuinely new quotient structures. Some simply select one coordinate and make the quotient behave like that factor. We need a name for this degenerate case before constructing nonprincipal ultrafilters, because the model-theoretic power of ultraproducts comes from avoiding concentration at a single index.
[definition: Principal Ultrafilter]
Let $I$ be a set. An ultrafilter $\mathcal U$ on $I$ is principal if there exists $i_0 \in I$ such that
\begin{align*}
\mathcal U = \{A \subseteq I : i_0 \in A\}.
\end{align*}
[/definition]
Principal ultrafilters are easy to understand, but they make an ultraproduct collapse to one selected coordinate. For infinite index sets, model theory needs ultrafilters that extend a chosen notion of largeness without concentrating on a single point. The existence theorem below is the tool that produces such deciding filters from proper preliminary filters.
[quotetheorem:4284]
[citeproof:4284]
The extension lemma is the point at which a genuine choice principle enters the construction. It guarantees that every consistent preliminary notion of largeness can be completed to a deciding one, but it does not usually describe the resulting ultrafilter explicitly. For finite index sets this produces only principal ultrafilters, because some singleton must be large; for infinite index sets it becomes powerful when applied to filters that already exclude finite sets, such as the Frechet filter.
[example: Nonprincipal Ultrafilters On Natural Numbers]
Let $I=\mathbb N$, and let
\begin{align*}
\mathcal F_{\mathrm{cof}}=\{A\subseteq \mathbb N : \mathbb N\setminus A \text{ is finite}\}
\end{align*}
be the Frechet filter. By the *Ultrafilter Extension Lemma*, there is an ultrafilter $\mathcal U$ on $\mathbb N$ such that
\begin{align*}
\mathcal F_{\mathrm{cof}}\subseteq \mathcal U.
\end{align*}
We first check that $\mathcal U$ contains no finite set. If $F\subseteq \mathbb N$ is finite, then
\begin{align*}
\mathbb N\setminus(\mathbb N\setminus F)=F
\end{align*}
is finite, so $\mathbb N\setminus F\in \mathcal F_{\mathrm{cof}}\subseteq \mathcal U$. If also $F\in\mathcal U$, then closure under finite intersections would give
\begin{align*}
F\cap(\mathbb N\setminus F)=\varnothing\in\mathcal U,
\end{align*}
contradicting the definition of a filter. Hence $F\notin\mathcal U$ for every finite $F\subseteq\mathbb N$.
Therefore $\mathcal U$ is not principal. Indeed, if $\mathcal U$ were principal at some $n\in\mathbb N$, then $\{n\}\in\mathcal U$, but $\{n\}$ is finite, contradicting the previous paragraph. Finally, because $\mathcal U$ is an ultrafilter, for every $A\subseteq\mathbb N$ exactly one of $A$ and $\mathbb N\setminus A$ belongs to $\mathcal U$: at least one belongs by the ultrafilter property, and both cannot belong since their intersection is empty. Thus $\mathcal U$ is a nonprincipal ultrafilter extending the Frechet filter, and it turns every subset of $\mathbb N$ into a decided large-or-small set.
[/example]
## Quotienting Products By Large Agreement
Given structures $(M_i)_{i \in I}$ in a common first-order language $L$, how should we identify two elements of the direct product $\prod_{i \in I} M_i$ if we only care about large sets of coordinates? The answer is to compare sequences coordinatewise and keep only the set of indices on which they agree. A filter then turns this comparison into an equivalence relation.
[definition: Filter Equivalence Relation]
Let $(M_i)_{i \in I}$ be nonempty sets and let $\mathcal F$ be a filter on $I$. For $a,b \in \prod_{i \in I} M_i$, define
\begin{align*}
a \sim_{\mathcal F} b \quad \Longleftrightarrow \quad \{i \in I : a_i = b_i\} \in \mathcal F.
\end{align*}
[/definition]
The quotient remembers the value of a sequence only up to large agreement. For this quotient to make sense, large agreement must behave like equality: every sequence must agree with itself on a large set, agreement must be symmetric, and agreement through a middle sequence must remain large. We write $[a]_{\mathcal F}$ for the equivalence class of $a$ once these properties are verified.
[quotetheorem:4285]
[citeproof:4285]
With the equivalence relation in place, the next construction forms the actual quotient set of sequences modulo large agreement. This is the underlying universe on which reduced products and ultraproducts will later interpret functions, relations, and constants coordinatewise. Naming the set-level quotient first separates the basic equality construction from the additional model-theoretic structure.
[definition: Reduced Product Of Sets]
Let $(M_i)_{i \in I}$ be nonempty sets and let $\mathcal F$ be a filter on $I$. The reduced product of the sets $M_i$ modulo $\mathcal F$ is
\begin{align*}
\prod_{i \in I} M_i / \mathcal F := \left(\prod_{i \in I} M_i\right) / {\sim_{\mathcal F}}.
\end{align*}
[/definition]
The same quotient construction becomes especially important when the filter is an ultrafilter. For an arbitrary filter, a statement about a representative sequence can leave many complementary cases unresolved; an ultrafilter removes this ambiguity by classifying every set of indices as large or small. This sharper large-set notion is what will later make truth in the quotient testable by one coordinatewise condition.
Before functions, relations, and constants can be interpreted on such a quotient, the underlying universe has to be named separately. We need the set-level quotient first because the later model-theoretic construction can only define symbols after its domain has been fixed.
[definition: Ultraproduct Of Sets]
Let $(M_i)_{i \in I}$ be nonempty sets and let $\mathcal U$ be an ultrafilter on $I$. The ultraproduct of the sets $M_i$ modulo $\mathcal U$ is the reduced product
\begin{align*}
\prod_{i \in I} M_i / \mathcal U.
\end{align*}
[/definition]
The quotient definition is easier to read through representatives: two sequences become equal exactly when they agree on a large set of indices.
[example: Equivalence Classes Of Sequences]
Let $x=(5,5,5,\dots)$ and $y=(5,6,5,6,\dots)$, with the displayed pattern indexed starting at $1$. Thus
\begin{align*}
x_i&=5 \quad \text{for every } i\in\mathbb N,\\
y_i&=
\begin{cases}
5,& i \text{ odd},\\
6,& i \text{ even}.
\end{cases}
\end{align*}
The agreement set is therefore
\begin{align*}
\{i\in\mathbb N:x_i=y_i\}
&=\{i\in\mathbb N:5=y_i\}\\
&=\{i\in\mathbb N:i\text{ is odd}\}.
\end{align*}
By the definition of the quotient relation $\sim_{\mathcal U}$,
\begin{align*}
[x]_{\mathcal U}=[y]_{\mathcal U}
\quad\Longleftrightarrow\quad
\{i\in\mathbb N:i\text{ is odd}\}\in\mathcal U.
\end{align*}
Since $\mathcal U$ is an ultrafilter, exactly one of the odd indices and the even indices belongs to $\mathcal U$, so this equality is decided by which parity class $\mathcal U$ declares large.
Now let $z=(1,2,3,\dots)$, so $z_i=i$. For a constant sequence $c^\ast=(c,c,c,\dots)$, its agreement set with $z$ is
\begin{align*}
\{i\in\mathbb N:z_i=c^\ast_i\}
&=\{i\in\mathbb N:i=c\}.
\end{align*}
This set is either a singleton or empty, hence finite. A nonprincipal ultrafilter on $\mathbb N$ contains no finite set: if it contained a singleton $\{n\}$, upward closure would force every subset containing $n$ to belong to $\mathcal U$, and the ultrafilter condition would force every subset not containing $n$ to be excluded, making $\mathcal U$ principal at $n$. Therefore
\begin{align*}
\{i\in\mathbb N:z_i=c^\ast_i\}\notin\mathcal U,
\end{align*}
so $[z]_{\mathcal U}\ne[c^\ast]_{\mathcal U}$ for every constant sequence $c^\ast$. Thus the class of $(1,2,3,\dots)$ is genuinely new: it is not represented by any standard constant natural number.
[/example]
## Interpreting The Language In A Quotient Product
The quotient of the Cartesian product is only a set until we explain how the symbols of the language act on equivalence classes. The main issue is independence of representatives: changing a sequence on a small set of coordinates must not change the output class or the truth of an interpreted relation. Without this compatibility, a quotient construction would assign different outputs to the same element; for instance, replacing a representative by an equivalent one could change a coordinatewise function value on a small set, and the operation would not descend to equivalence classes.
[definition: Reduced Product Structure]
Let $L$ be a first-order language, let $(M_i)_{i \in I}$ be $L$-structures, and let $\mathcal F$ be a filter on $I$. The reduced product $\prod_{i \in I} M_i / \mathcal F$ is the $L$-structure whose universe is the reduced product of the underlying sets and whose symbols are interpreted as follows:
1. For each $n$-ary function symbol $f \in L$, define
\begin{align*}
f^{\prod_i M_i/\mathcal F}: \left(\prod_{i \in I} M_i / \mathcal F\right)^n \longrightarrow \prod_{i \in I} M_i / \mathcal F
\end{align*}
by
\begin{align*}
f^{\prod_i M_i/\mathcal F}([a_1]_{\mathcal F},\dots,[a_n]_{\mathcal F}) = [b]_{\mathcal F}, \quad b_i = f^{M_i}((a_1)_i,\dots,(a_n)_i).
\end{align*}
2. For each constant symbol $c \in L$,
\begin{align*}
c^{\prod_i M_i/\mathcal F} = [a]_{\mathcal F}, \quad a_i = c^{M_i}.
\end{align*}
3. For each $n$-ary relation symbol $R \in L$,
\begin{align*}
R^{\prod_i M_i/\mathcal F}([a_1]_{\mathcal F},\dots,[a_n]_{\mathcal F})
\end{align*}
holds if and only if
\begin{align*}
\{i \in I : M_i \models R((a_1)_i,\dots,(a_n)_i)\} \in \mathcal F.
\end{align*}
[/definition]
The definition is only legitimate if representatives can be changed without affecting the result. This is the central technical check before the logical theorem about formulas.
[quotetheorem:4286]
[citeproof:4286]
The theorem turns the set-level quotient into an honest $L$-structure. Its role is to ensure that changing representatives inside an equivalence class cannot change the value of a function symbol or the truth of a relation symbol in the quotient. Without this well-definedness step, an ultraproduct would be only a quotient set, not a model in the original language.
[remark: Filters Suffice For Well-Definedness]
The proof of well-definedness uses the filter axioms rather than the full ultrafilter decision property. The ultrafilter condition becomes essential when passing from atomic formulas to negations and disjunctions in Los's theorem.
[/remark]
The first concrete model-theoretic payoff, proved by Łoś's theorem in Chapter 4, is that ultrapowers can enlarge familiar structures while preserving their first-order behaviour.
[example: Ultrapower Of Natural Numbers]
Let $L=\{+,\cdot,0,1,<\}$, let every factor be the standard $L$-structure $\mathbb N$, and let $\mathcal U$ be a nonprincipal ultrafilter on $\mathbb N$. For each $n\in\mathbb N$, write
\begin{align*}
\bar n=(n,n,n,\dots)\in \mathbb N^{\mathbb N}.
\end{align*}
The assignment $n\mapsto [\bar n]_{\mathcal U}$ is injective: if $m\ne n$, then
\begin{align*}
\{i\in\mathbb N:\bar m_i=\bar n_i\}
&=\{i\in\mathbb N:m=n\}\\
&=\varnothing,
\end{align*}
and $\varnothing\notin\mathcal U$, so $[\bar m]_{\mathcal U}\ne[\bar n]_{\mathcal U}$. The operations agree with the usual ones because they are interpreted coordinatewise:
\begin{align*}
[\bar m]_{\mathcal U}+[\bar n]_{\mathcal U}
&=[(m+n,m+n,m+n,\dots)]_{\mathcal U}\\
&=[\overline{m+n}]_{\mathcal U},
\end{align*}
and similarly
\begin{align*}
[\bar m]_{\mathcal U}\cdot[\bar n]_{\mathcal U}
&=[(mn,mn,mn,\dots)]_{\mathcal U}\\
&=[\overline{mn}]_{\mathcal U}.
\end{align*}
Also $0$ and $1$ are interpreted as $[\bar 0]_{\mathcal U}$ and $[\bar 1]_{\mathcal U}$, so this is the diagonal copy of $\mathbb N$ inside the ultrapower.
Now let $h=(1,2,3,\dots)$, so $h_i=i$ with the displayed indexing. For any fixed $m\in\mathbb N$, the interpreted order gives
\begin{align*}
[\bar m]_{\mathcal U}< [h]_{\mathcal U}
\quad\Longleftrightarrow\quad
\{i\in\mathbb N:\bar m_i<h_i\}\in\mathcal U.
\end{align*}
Since $\bar m_i=m$ and $h_i=i$, this truth set is
\begin{align*}
\{i\in\mathbb N:\bar m_i<h_i\}
&=\{i\in\mathbb N:m<i\}\\
&=\mathbb N\setminus\{i\in\mathbb N:i\le m\}.
\end{align*}
The set $\{i\in\mathbb N:i\le m\}$ is finite, so its complement is cofinite. A nonprincipal ultrafilter on $\mathbb N$ contains every cofinite set: finite sets are not large, and the ultrafilter condition therefore puts their complements in $\mathcal U$. Hence
\begin{align*}
\{i\in\mathbb N:m<i\}\in\mathcal U,
\end{align*}
so $[\bar m]_{\mathcal U}< [h]_{\mathcal U}$ for every standard $m\in\mathbb N$. Thus $[(1,2,3,\dots)]_{\mathcal U}$ is an element of the ultrapower larger than every element in the diagonal copy of $\mathbb N$.
[/example]
## Reduced Products Versus Ultraproducts
What changes when the filter is not an ultrafilter? The quotient structure can still be formed, and atomic information is still interpreted by large truth sets, but the structure may no longer make every first-order statement correspond to a clean large-set condition.
[definition: Reduced Product]
Let $L$ be a first-order language, let $(M_i)_{i \in I}$ be $L$-structures, and let $\mathcal F$ be a filter on $I$. The reduced product of the structures $(M_i)_{i \in I}$ modulo $\mathcal F$ is the $L$-structure $\prod_{i \in I} M_i / \mathcal F$ obtained from coordinatewise interpretation of symbols and quotienting by $\sim_{\mathcal F}$.
[/definition]
For structures, the decisive case is the quotient by an ultrafilter rather than by an arbitrary filter. The ultrafilter condition ensures that coordinatewise truth sets are always decided as large or not large, which is exactly what Los's theorem will require. This motivates giving the ultrafilter case its own name.
[definition: Ultraproduct]
Let $L$ be a first-order language, let $(M_i)_{i \in I}$ be $L$-structures, and let $\mathcal U$ be an ultrafilter on $I$. The ultraproduct of the structures $(M_i)_{i \in I}$ modulo $\mathcal U$ is the reduced product $\prod_{i \in I} M_i / \mathcal U$.
[/definition]
The terminology reflects a logical divide. Reduced products preserve enough coordinatewise structure for algebraic constructions, while ultraproducts are tuned to first-order truth.
[example: Principal Ultraproduct]
Let $\mathcal U$ be the principal ultrafilter on $I$ determined by $i_0$, so for every $A\subseteq I$,
\begin{align*}
A\in\mathcal U \quad\Longleftrightarrow\quad i_0\in A.
\end{align*}
Define
\begin{align*}
\Phi:\prod_{i\in I}M_i/\mathcal U&\longrightarrow M_{i_0},&
\Phi([a]_{\mathcal U})&=a_{i_0}.
\end{align*}
This is well-defined: if $[a]_{\mathcal U}=[b]_{\mathcal U}$, then
\begin{align*}
\{i\in I:a_i=b_i\}\in\mathcal U,
\end{align*}
so, by principality of $\mathcal U$,
\begin{align*}
i_0\in\{i\in I:a_i=b_i\},
\end{align*}
and therefore $a_{i_0}=b_{i_0}$.
The map $\Phi$ is injective. Indeed, if $\Phi([a]_{\mathcal U})=\Phi([b]_{\mathcal U})$, then $a_{i_0}=b_{i_0}$, so
\begin{align*}
i_0\in\{i\in I:a_i=b_i\}.
\end{align*}
Hence $\{i\in I:a_i=b_i\}\in\mathcal U$, and therefore $[a]_{\mathcal U}=[b]_{\mathcal U}$. It is also surjective: given $m\in M_{i_0}$, choose any sequence $r\in\prod_{i\in I}M_i$ and define
\begin{align*}
b_i=
\begin{cases}
m,& i=i_0,\\
r_i,& i\ne i_0.
\end{cases}
\end{align*}
Then $b\in\prod_{i\in I}M_i$ and
\begin{align*}
\Phi([b]_{\mathcal U})=b_{i_0}=m.
\end{align*}
It remains to check preservation of the $L$-structure. For an $n$-ary function symbol $f\in L$ and representatives $a_1,\dots,a_n$, the reduced product interpretation gives
\begin{align*}
f^{\prod_i M_i/\mathcal U}([a_1]_{\mathcal U},\dots,[a_n]_{\mathcal U})
=[c]_{\mathcal U},
\end{align*}
where
\begin{align*}
c_i=f^{M_i}((a_1)_i,\dots,(a_n)_i).
\end{align*}
Applying $\Phi$ gives
\begin{align*}
\Phi([c]_{\mathcal U})
&=c_{i_0}\\
&=f^{M_{i_0}}((a_1)_{i_0},\dots,(a_n)_{i_0})\\
&=f^{M_{i_0}}\bigl(\Phi([a_1]_{\mathcal U}),\dots,\Phi([a_n]_{\mathcal U})\bigr).
\end{align*}
For a constant symbol $d\in L$, its interpretation in the reduced product is $[e]_{\mathcal U}$ with $e_i=d^{M_i}$, so
\begin{align*}
\Phi(d^{\prod_i M_i/\mathcal U})
=\Phi([e]_{\mathcal U})
=e_{i_0}
=d^{M_{i_0}}.
\end{align*}
For an $n$-ary relation symbol $R\in L$,
\begin{align*}
\prod_i M_i/\mathcal U \models R([a_1]_{\mathcal U},\dots,[a_n]_{\mathcal U})
\end{align*}
holds exactly when
\begin{align*}
\{i\in I:M_i\models R((a_1)_i,\dots,(a_n)_i)\}\in\mathcal U.
\end{align*}
By principality, this is equivalent to
\begin{align*}
M_{i_0}\models R((a_1)_{i_0},\dots,(a_n)_{i_0}),
\end{align*}
which is the same as
\begin{align*}
M_{i_0}\models R(\Phi([a_1]_{\mathcal U}),\dots,\Phi([a_n]_{\mathcal U})).
\end{align*}
Thus $\Phi$ is an isomorphism of $L$-structures. The quotient identifies exactly those sequences that agree at $i_0$, so it retains the $i_0$-th factor and discards all other coordinates.
[/example]
The ultrapower example is the prototype for nonstandard analysis: the element represented by $(1,2,3,\dots)$ behaves like an infinite natural number once transfer is available. The same construction also appears in asymptotic algebra, where a sequence of finite structures is replaced by a single limiting structure.
[example: Ultraproducts Of Finite Fields]
Let $(p_i)_{i\in\mathbb N}$ be a sequence of primes tending to infinity, let $M_i=\mathbb F_{p_i}$ in the language of rings, and let $\mathcal U$ be a nonprincipal ultrafilter on $\mathbb N$. Set
\begin{align*}
K=\prod_{i\in\mathbb N}\mathbb F_{p_i}/\mathcal U.
\end{align*}
For a fixed positive integer $n$, the interpretation of $n\cdot 1$ in $K$ is represented by the sequence
\begin{align*}
a_i=n\cdot 1_{\mathbb F_{p_i}}.
\end{align*}
In the field $\mathbb F_{p_i}$, we have
\begin{align*}
n\cdot 1_{\mathbb F_{p_i}}=0
\quad\Longleftrightarrow\quad
p_i\mid n,
\end{align*}
because the additive order of $1_{\mathbb F_{p_i}}$ is $p_i$. Therefore
\begin{align*}
\{i\in\mathbb N:a_i\ne 0\}
&=\{i\in\mathbb N:n\cdot 1_{\mathbb F_{p_i}}\ne 0\}\\
&=\{i\in\mathbb N:p_i\nmid n\}.
\end{align*}
Since $p_i\to\infty$, there is $N$ such that $p_i>n$ for all $i\ge N$. For those indices $i\ge N$, the prime $p_i$ cannot divide the positive integer $n$, so
\begin{align*}
\{i\in\mathbb N:i\ge N\}\subseteq \{i\in\mathbb N:p_i\nmid n\}.
\end{align*}
The set $\{i\in\mathbb N:i<N\}$ is finite, so $\{i\in\mathbb N:i\ge N\}$ is cofinite. A nonprincipal ultrafilter on $\mathbb N$ contains every cofinite set, hence upward closure gives
\begin{align*}
\{i\in\mathbb N:p_i\nmid n\}\in\mathcal U.
\end{align*}
Thus $n\cdot 1_K\ne 0$ for every positive integer $n$, because equality to $0$ in the quotient would require the complementary set $\{i:p_i\mid n\}$ to belong to $\mathcal U$.
So $K$ has characteristic $0$, even though each factor $\mathbb F_{p_i}$ has positive characteristic $p_i$. The ultraproduct records the limiting first-order behaviour of finite fields whose characteristics escape to infinity.
[/example]
The finite-fields example illustrates the practical method: translate the property into a first-order sentence, check it on a large set of coordinates, and then read the conclusion inside the ultraproduct. Reduced products follow the same coordinatewise construction, but without the ultrafilter decision rule they lose the clean truth-set calculus.
[example: Reduced Product Modulo The Frechet Filter]
Let $I=\mathbb N$, let each $M_i=\{0,1\}$ in the language with equality only, and quotient the product $\{0,1\}^{\mathbb N}$ by the Frechet filter
\begin{align*}
\mathcal F_{\mathrm{cof}}=\{A\subseteq\mathbb N:\mathbb N\setminus A\text{ is finite}\}.
\end{align*}
For two binary sequences $a,b\in\{0,1\}^{\mathbb N}$, the quotient relation is
\begin{align*}
a\sim_{\mathcal F_{\mathrm{cof}}} b
&\Longleftrightarrow \{i\in\mathbb N:a_i=b_i\}\in\mathcal F_{\mathrm{cof}}\\
&\Longleftrightarrow \mathbb N\setminus\{i\in\mathbb N:a_i=b_i\}\text{ is finite}\\
&\Longleftrightarrow \{i\in\mathbb N:a_i\ne b_i\}\text{ is finite}.
\end{align*}
Thus two sequences define the same element of the reduced product exactly when they differ at only finitely many coordinates.
The Frechet filter does not decide the set of even indices. Let
\begin{align*}
E=\{i\in\mathbb N:i\text{ is even}\}.
\end{align*}
Then
\begin{align*}
\mathbb N\setminus E=\{i\in\mathbb N:i\text{ is odd}\},
\end{align*}
and both $E$ and $\mathbb N\setminus E$ are infinite. Hence
\begin{align*}
E\notin\mathcal F_{\mathrm{cof}}
\quad\text{and}\quad
\mathbb N\setminus E\notin\mathcal F_{\mathrm{cof}},
\end{align*}
because membership in $\mathcal F_{\mathrm{cof}}$ requires finite complement. Therefore $\mathcal F_{\mathrm{cof}}$ is not an ultrafilter.
The quotient still records eventual behaviour: for example, a sequence that is eventually constant $1$ has the same class as $(1,1,1,\dots)$, because the set of indices where they differ is finite. But alternating behaviour is not decided by the filter: the sequence
\begin{align*}
e_i=
\begin{cases}
1,& i\text{ even},\\
0,& i\text{ odd}
\end{cases}
\end{align*}
does not agree with the constant $1$ sequence on a cofinite set, and it does not agree with the constant $0$ sequence on a cofinite set. This is the precise obstruction to a direct *Los-style truth theorem* for all first-order formulas: complements of truth sets need not be large when the original truth set is not large.
[/example]
## The Road To Los's Theorem
Why was so much care spent on the quotient construction? The point is that first-order formulas are evaluated from atomic formulas by Boolean operations and quantifiers, and ultrafilters are designed to match exactly those operations at the level of large subsets.
[explanation: Truth Sets]
Given an $L$-formula $\varphi(x_1,\dots,x_n)$ and representatives $a_1,\dots,a_n \in \prod_i M_i$, the associated truth set is
\begin{align*}
T_\varphi(a_1,\dots,a_n) = \{i \in I : M_i \models \varphi((a_1)_i,\dots,(a_n)_i)\}.
\end{align*}
For atomic formulas, the definition of the reduced product was arranged so that truth in the quotient is membership of this set in the filter. For negation, an ultrafilter is needed because $T_{\neg \varphi}$ is the complement of $T_\varphi$. For conjunction, the filter intersection axiom is enough. For existential quantifiers, representatives are chosen coordinate by coordinate from witnesses on a large set.
[/explanation]
This truth-set notation is the working procedure for ultraproduct arguments: compute the coordinatewise set on which a formula holds, then ask whether that set is large. Before tackling all formulas, the atomic language has to be aligned with the quotient structure, because terms, relations, and equality are interpreted directly from coordinates.
[quotetheorem:4287]
[citeproof:4287]
The lemma identifies the part of truth that is already visible from the coordinatewise construction: atomic formulas are evaluated by checking where they hold in the factors. This is the base case for Łoś's theorem, so its role is structural rather than merely computational. Equality in the quotient, functions, and relation symbols must all agree with the filter notion of large agreement before Boolean connectives and quantifiers can be handled by induction.
Its limitation is also important. With a non-ultrafilter, a truth set and its complement may both fail to be large, so the quotient need not decide negations in a Łoś-style way. For example, modulo the Frechet filter on $\mathbb N$, the set of even indices is neither cofinite nor has cofinite complement, so a statement alternating between even and odd coordinates has no definite large side. The lemma therefore marks the exact boundary between the algebraic quotient construction and the full transfer theorem: atomic truth is already coordinatewise, but first-order truth needs the ultrafilter to decide complements and support the induction through all formulas.
This chapter has isolated the construction from the logical transfer theorem. Filters describe large agreement, ultrafilters decide every subset, and quotient products turn coordinatewise structures into new structures. Chapter 4 combines these ingredients to prove Łoś's theorem and then compactness, explaining why ultraproducts are central to model theory.
The ultraproduct construction is useful because it turns local agreement into global structure, but its real force appears only when formulas are tracked through the construction. Łoś's theorem is the bridge: it tells us that truth in the ultraproduct is governed by truth on an ultrafilter-large set, and from that compactness follows.
# 4. Łoś's Theorem and the Ultraproduct Proof of Compactness
Ultraproducts turn many local satisfiability checks into a single global structure. The guiding point of this chapter is that first-order formulas cannot distinguish an ultraproduct from holding on an ultrafilter-large set of coordinates. This is Łoś's theorem, and it converts the combinatorial data of finite satisfiability into the compactness theorem for first-order logic.
Chapters 1 and 2 introduced structures, satisfaction, elementary equivalence, and diagrams, while Chapter 3 built filters and ultraproducts. We now use that construction as a proof engine: build many finite or partial models, quotient their product by an ultrafilter, and read truth in the quotient coordinatewise.
## Why Truth in a Quotient Should Be Coordinatewise
The first question is how a formula should be evaluated in a product of structures after identifying sequences that agree on a large set of indices. Atomic formulas are forced by the definition of the quotient relation. The main content of Łoś theorem is that this coordinatewise rule survives Boolean connectives and quantifiers.
Let $L$ be a first-order language, let $I$ be a non-empty index set, let $(M_i)_{i \in I}$ be a family of $L$-structures, and let $\mathcal U$ be an ultrafilter on $I$. Write $\prod_{i \in I} M_i$ for the Cartesian product of the underlying sets.
[definition: Ultraproduct Equivalence Relation]
For $a,b \in \prod_{i \in I} M_i$, define
\begin{align*}
a \sim_{\mathcal U} b \quad \Longleftrightarrow \quad \{i \in I : a(i)=b(i)\} \in \mathcal U.
\end{align*}
[/definition]
The equivalence class of $a$ is written $[a]_{\mathcal U}$, or just $[a]$ when the ultrafilter is fixed. The next step is to turn the quotient set into a genuine $L$-structure, not merely a set of equivalence classes. Each symbol must be interpreted in a way that is independent of the chosen representatives, so the definition records the coordinatewise interpretations explicitly.
[definition: Ultraproduct]
The ultraproduct of $(M_i)_{i \in I}$ modulo $\mathcal U$ is the $L$-structure
\begin{align*}
\prod_{i \in I} M_i / \mathcal U
\end{align*}
whose universe is $\left(\prod_{i \in I} M_i\right)/\sim_{\mathcal U}$ and whose symbols are interpreted as follows:
For each $n$-ary function symbol $f \in L$,
\begin{align*}
f^{\prod M_i/\mathcal U}([a_1],\dots,[a_n]) = [b], \qquad b(i)=f^{M_i}(a_1(i),\dots,a_n(i)).
\end{align*}
For each $n$-ary relation symbol $R \in L$,
\begin{align*}
\prod M_i/\mathcal U \models R([a_1],\dots,[a_n])
\end{align*}
exactly when
\begin{align*}
\{i \in I : M_i \models R(a_1(i),\dots,a_n(i))\}\in \mathcal U.
\end{align*}
For each constant symbol $c \in L$,
\begin{align*}
c^{\prod M_i/\mathcal U}=[a_c], \qquad a_c(i)=c^{M_i}.
\end{align*}
[/definition]
Well-definedness uses closure of $\mathcal U$ under finite intersections. If representatives are changed on small sets, then the coordinatewise values and relation truth sets change only on small sets.
[example: Reduced Product of Finite Rings]
Let $I$ be the set of prime numbers, let $M_p=\mathbb F_p$ in the language of rings, and let $\mathcal U$ be a non-principal ultrafilter on $I$. Put
\begin{align*}
K=\prod_{p\in I}\mathbb F_p/\mathcal U,
\qquad
e(p)=1_{\mathbb F_p}.
\end{align*}
The class $[e]$ is the multiplicative identity in $K$: if $a\in\prod_{p\in I}\mathbb F_p$, then the product $ea$ is represented by the sequence
\begin{align*}
(ea)(p)=e(p)a(p)=1_{\mathbb F_p}a(p)=a(p),
\end{align*}
so
\begin{align*}
\{p\in I:(ea)(p)=a(p)\}=I\in\mathcal U,
\end{align*}
and therefore $[e][a]=[a]$. The same coordinatewise calculation gives $[a][e]=[a]$.
We show that $K$ has characteristic $0$. Fix an integer $n\ge 1$. The element $n[e]$ is represented by the sequence
\begin{align*}
p\longmapsto n\cdot 1_{\mathbb F_p}.
\end{align*}
In $\mathbb F_p$, the equality $n\cdot 1_{\mathbb F_p}=0$ holds exactly when $p$ divides $n$, because $\mathbb F_p$ has characteristic $p$. Hence
\begin{align*}
\{p\in I:n\cdot 1_{\mathbb F_p}=0\}
=
\{p\in I:p\mid n\}.
\end{align*}
The right-hand set is finite, since a positive integer has only finitely many prime divisors. A non-principal ultrafilter on $I$ contains no finite set, so
\begin{align*}
\{p\in I:n\cdot 1_{\mathbb F_p}=0\}\notin\mathcal U.
\end{align*}
By the definition of equality in the ultraproduct quotient, this means
\begin{align*}
n[e]\ne 0
\end{align*}
in $K$. Since this holds for every $n\ge 1$, the ultraproduct $K$ has characteristic $0$, even though each coordinate field $\mathbb F_p$ has positive characteristic $p$.
[/example]
The example already shows the logical flavour of ultraproducts: a fixed first-order sentence may ignore finitely many bad coordinates when the ultrafilter is non-principal. The general question is whether this coordinatewise test works for every first-order formula, including formulas built using negation and existential quantifiers. Łoś's theorem gives the exact answer by identifying truth in the ultraproduct with membership of the corresponding truth set in the ultrafilter.
[quotetheorem:4288]
[citeproof:4288]
Łoś's theorem says that every first-order formula commutes with passage to an ultraproduct, provided truth is interpreted through the ultrafilter. Its content is not just that the quotient is well defined, but that the full first-order language is transferred through it. The ultrafilter handles Boolean operations by deciding complements and preserving finite intersections, while existential formulas are handled by choosing witnesses coordinate by coordinate on a large set. This is why ultraproducts can convert many local model-theoretic facts into a single global structure.
## Where Existential Witnesses Enter
The delicate question in the proof is why an existential statement in many coordinates produces a single element of the ultraproduct. Universal and Boolean formulas are controlled by set operations inside the ultrafilter. Existential formulas require a choice of coordinatewise witnesses and therefore reveal exactly where the ultraproduct construction uses the ambient product.
[explanation: The Existential Step]
Suppose
\begin{align*}
A=\{i\in I : M_i\models \exists y\,\psi(y,a_1(i),\dots,a_n(i))\}\in \mathcal U.
\end{align*}
For each $i\in A$, choose $b(i)\in M_i$ such that
\begin{align*}
M_i\models \psi(b(i),a_1(i),\dots,a_n(i)).
\end{align*}
For $i\notin A$, choose any element $b(i)\in M_i$; in standard presentations all structures have non-empty universes. The sequence $b\in \prod_i M_i$ defines an ultraproduct element $[b]$. Since the truth set of $\psi(b,a_1,\dots,a_n)$ contains $A$, it belongs to $\mathcal U$, and the induction hypothesis gives the desired existential truth in the ultraproduct.
[/explanation]
This is the bridge between coordinatewise satisfiability and global satisfiability. The ultrafilter does not merely record a majority vote; it guarantees that finite intersections of large sets remain large and that every coordinatewise Boolean decision has a definite large side.
[remark: Ultrafilter Use in the Induction]
Filters are enough for conjunctions and for upward closure arguments. The ultrafilter property is needed for negation, because Łoś theorem for $\neg\varphi$ requires
\begin{align*}
I\setminus \{i : M_i\models \varphi(a(i))\}\in \mathcal U
\end{align*}
exactly when the original truth set is not in $\mathcal U$. Without maximality of the filter, the quotient may still be a reduced product, but full first-order transfer can fail.
[/remark]
The transfer statement is most useful when a coordinatewise witness can be assembled into one ultraproduct element, as the square-root example shows.
[example: Witnesses for Square Roots]
Let $I=\mathbb N$, let $M_n=\mathbb R$ as ordered fields, and let $\mathcal U$ be a non-principal ultrafilter on $\mathbb N$. Define $a\in\prod_{n\in\mathbb N}\mathbb R$ by $a(n)=n^2$, and define $b\in\prod_{n\in\mathbb N}\mathbb R$ by $b(n)=n$. For every $n\in\mathbb N$ with the usual convention $\mathbb N=\{1,2,3,\dots\}$,
\begin{align*}
b(n)^2
&=n^2\\
&=a(n),
\end{align*}
and also
\begin{align*}
b(n)=n>0.
\end{align*}
Hence the coordinate truth set for the formula $y>0\wedge y^2=x$ at the pair $(b(n),a(n))$ is
\begin{align*}
\{n\in\mathbb N:\mathbb R\models b(n)>0\wedge b(n)^2=a(n)\}
=\mathbb N.
\end{align*}
Since $\mathbb N\in\mathcal U$, *Łoś theorem* gives
\begin{align*}
\prod_{n\in\mathbb N}\mathbb R/\mathcal U
\models [b]>0\wedge [b]^2=[a].
\end{align*}
Therefore
\begin{align*}
\prod_{n\in\mathbb N}\mathbb R/\mathcal U
\models \exists y\,(y>0\wedge y^2=[a]).
\end{align*}
The witness is not obtained by a formula naming $n$ inside the language; it is the ultraproduct element $[b]$ represented by the chosen coordinatewise sequence $b(n)=n$.
[/example]
The distinction between definable witnesses and chosen coordinatewise witnesses is important. Łoś theorem proves existence in the quotient even when no uniform term or formula selects witnesses across all coordinates. This raises the structural question of how an original structure sits inside its ultrapower: the diagonal copy should preserve every first-order fact, even though the surrounding quotient contains many new sequence classes.
[quotetheorem:4289]
[citeproof:4289]
The ultrapower therefore contains an elementary copy of the original structure. If the ultrafilter is principal, the ultrapower is essentially the original structure; if it is non-principal, new elements often appear. The theorem is also a warning about how quotient constructions should be read: the diagonal copy preserves all first-order facts of the original structure, but the surrounding ultrapower may contain genuinely new equivalence classes of sequences. Thus elementary equivalence is compatible with a much larger universe, and later arguments will exploit this gap between preserving formulas and preserving the underlying set.
## Compactness from Ultraproducts
The central model-theoretic question is whether finite consistency already guarantees consistency. Compactness answers yes for first-order logic: if every finite fragment of a theory has a model, then the whole theory has a model. Ultraproducts give a direct semantic proof by indexing models by finite pieces of the theory.
[definition: Finitely Satisfiable Theory]
Let $L$ be a first-order language and let $T$ be a set of $L$-sentences. The theory $T$ is finitely satisfiable if every finite subset $\Delta\subset T$ has an $L$-structure $M_\Delta$ such that $M_\Delta\models \Delta$.
[/definition]
Finite satisfiability is a local condition on the set of axioms. The obstruction is that the models for different finite fragments may have no obvious common universe or compatible interpretations. Compactness says that first-order logic nevertheless lets these local models be assembled into one model of the entire theory.
[quotetheorem:4290]
[citeproof:4290]
Compactness is a metatheorem about first-order syntax, not about any particular class of structures. Its strength is that it turns a family of local consistency checks into one global model, even when the finite models have no visible compatibility with one another. Its limitation is just as important: the model produced by compactness need not be canonical, countable, standard, or close to the intended example. The theorem is therefore best used as an existence and non-definability tool: if every finite obstruction can be avoided, first-order logic cannot rule out the global object.
[example: Finitely Satisfiable Graph Coloring Constraints]
Work in the language with a binary relation symbol $E$, unary predicates $C_1,\dots,C_k$, and a constant symbol $c_v$ for each vertex $v\in V(G)$. Let $T$ contain the graph facts
\begin{align*}
E(c_v,c_w)
\end{align*}
for every edge $\{v,w\}$ of $G$, together with the coloring axioms
\begin{align*}
&C_1(c_v)\vee\cdots\vee C_k(c_v),\\
&\neg(C_i(c_v)\wedge C_j(c_v))\qquad (1\le i<j\le k),\\
&\neg(C_i(c_v)\wedge C_i(c_w))\qquad (\{v,w\}\in E(G),\ 1\le i\le k).
\end{align*}
The first line says that $v$ receives at least one color, the second says that it receives at most one color, and the third says that adjacent vertices do not receive the same color.
Let $\Delta\subset T$ be finite. Only finitely many constants occur in $\Delta$, say
\begin{align*}
c_{v_1},\dots,c_{v_m}.
\end{align*}
By assumption, the finite induced subgraph of $G$ on $\{v_1,\dots,v_m\}$ has a proper $k$-coloring. Choose such a coloring
\begin{align*}
\gamma:\{v_1,\dots,v_m\}\to \{1,\dots,k\}.
\end{align*}
Interpret each $c_{v_\ell}$ as $v_\ell$, interpret $E$ as the edge relation on this finite induced subgraph, and interpret
\begin{align*}
C_i=\{v_\ell:\gamma(v_\ell)=i\}.
\end{align*}
Then $C_1(c_{v_\ell})\vee\cdots\vee C_k(c_{v_\ell})$ holds because $\gamma(v_\ell)$ is one of $1,\dots,k$; the formula $\neg(C_i(c_{v_\ell})\wedge C_j(c_{v_\ell}))$ holds for $i<j$ because $\gamma(v_\ell)$ has a single value; and if $\{v_\ell,v_r\}$ is an edge, then $\gamma(v_\ell)\ne\gamma(v_r)$, so no predicate $C_i$ contains both $v_\ell$ and $v_r$. Thus every finite $\Delta\subset T$ has a model.
By *Compactness Theorem for First-Order Logic*, there is a model $M\models T$. Define a coloring of $G$ by assigning to each vertex $v$ the unique $i\in\{1,\dots,k\}$ such that
\begin{align*}
M\models C_i(c_v).
\end{align*}
Existence of such an $i$ follows from $C_1(c_v)\vee\cdots\vee C_k(c_v)$, and uniqueness follows from the pairwise incompatibility sentences. If $\{v,w\}$ is an edge and both vertices had color $i$, then $M\models C_i(c_v)\wedge C_i(c_w)$, contradicting the axiom $\neg(C_i(c_v)\wedge C_i(c_w))$. Hence the coloring obtained from $M$ is a proper $k$-coloring of the whole graph $G$.
[/example]
The same finite-satisfiability method also creates ordered-field extensions with elements beyond every standard real bound.
[example: Nonstandard Real Closed Extensions]
Expand the language of ordered rings by constants $c_r$ for each $r\in\mathbb R$ and one additional constant $a$. Let $T$ consist of the full elementary diagram of $\mathbb R$ in the constants $c_r$, together with the sentences
\begin{align*}
a>c_n
\end{align*}
for every $n\in\mathbb N$.
We verify finite satisfiability. Let $\Delta\subset T$ be finite, and let
\begin{align*}
F=\{n\in\mathbb N : a>c_n \text{ occurs in } \Delta\}.
\end{align*}
The set $F$ is finite. If $F=\varnothing$, interpret $a$ as $0\in\mathbb R$. If $F\ne\varnothing$, let $N=\max F$ and interpret $a$ as $N+1$. For every $n\in F$,
\begin{align*}
N=\max F &\ge n,\\
N+1 &> n,
\end{align*}
so, with $c_n$ interpreted as the real number $n$,
\begin{align*}
\mathbb R\models a>c_n.
\end{align*}
All sentences from the elementary diagram that occur in $\Delta$ hold because each $c_r$ is interpreted as $r$ in $\mathbb R$. Thus every finite $\Delta\subset T$ has a model.
By *Compactness Theorem for First-Order Logic*, there is a model $R^*\models T$. For each $r\in\mathbb R$, view $r$ inside $R^*$ as the interpretation of $c_r$. The elementary diagram ensures that this copy is elementary: for any ordered-ring formula $\varphi(x_1,\dots,x_m)$ and any $r_1,\dots,r_m\in\mathbb R$, the sentence $\varphi(c_{r_1},\dots,c_{r_m})$ or its negation belongs to the elementary diagram according as it is true or false in $\mathbb R$. Hence
\begin{align*}
\mathbb R\models \varphi(r_1,\dots,r_m)
\quad\Longleftrightarrow\quad
R^*\models \varphi(c_{r_1},\dots,c_{r_m}).
\end{align*}
Also, since $R^*\models a>c_n$ for every $n\in\mathbb N$, the element interpreted by $a$ is larger than every standard natural number in the embedded copy of $\mathbb R$. The ordered field reduct of $R^*$ is real closed because the axioms for real closed ordered fields are first-order sentences true in $\mathbb R$, hence included in its elementary diagram and therefore true in $R^*$.
[/example]
This compactness argument is often the fastest way to produce nonstandard extensions. It does not construct a canonical extension, but it guarantees that no finite first-order test can prevent the desired element from existing.
## First Consequences of Łoś and Compactness
The last question in this chapter is what these results immediately buy us. Three consequences recur throughout model theory: finite satisfiability as a criterion for consistency, elementary equivalence of ultrapowers, and the existence of nonstandard elements satisfying all standard finite bounds.
[quotetheorem:4291]
[citeproof:4291]
This consequence lets us build models of a theory by varying the coordinates, as long as each required axiom holds on a large enough set. It is the basic transfer mechanism behind many examples in algebra and arithmetic. The point is not that one coordinate model satisfies everything at once, but that the ultrafilter lets different coordinates be responsible for different finite pieces while still producing a single quotient model. This is why finite satisfiability and largeness conditions naturally translate into consistency results.
The next issue is how much first-order information is preserved when a structure is replaced by an ultrapower of itself. Since every coordinate is the same model, Łoś theorem suggests that sentences should transfer perfectly, but this must be separated from the stronger and usually false claim that the ultrapower is isomorphic to the original structure.
[quotetheorem:4292]
[citeproof:4292]
Elementary equivalence is weaker than isomorphism. A non-principal ultrapower may satisfy exactly the same first-order sentences as $M$ while having elements that do not correspond to any element of $M$.
[example: A Nonstandard Natural Number]
Let $M=(\mathbb N,+,\cdot,0,1,<)$, let $\mathcal U$ be a non-principal ultrafilter on $\mathbb N$, and form the ultrapower
\begin{align*}
N^*=M^{\mathbb N}/\mathcal U.
\end{align*}
Let $h=[s]$, where $s(n)=n$, and identify each standard $k\in\mathbb N$ with its diagonal element $[c_k]$, where $c_k(n)=k$ for every $n\in\mathbb N$.
Fix $k\in\mathbb N$. The coordinate truth set for $h>k$ is
\begin{align*}
A_k
&=\{n\in\mathbb N : M\models s(n)>c_k(n)\}\\
&=\{n\in\mathbb N : n>k\}.
\end{align*}
Its complement is
\begin{align*}
\mathbb N\setminus A_k
&=\{n\in\mathbb N : n\le k\}\\
&=\{1,2,\dots,k\},
\end{align*}
with the convention $\mathbb N=\{1,2,3,\dots\}$. This complement is finite. Since $\mathcal U$ is non-principal, no finite subset of $\mathbb N$ belongs to $\mathcal U$; since $\mathcal U$ is an ultrafilter, exactly one of a set and its complement belongs to $\mathcal U$. Hence
\begin{align*}
A_k\in\mathcal U.
\end{align*}
By *Łoś theorem* applied to the formula $x>y$,
\begin{align*}
N^*\models [s]>[c_k].
\end{align*}
Thus $N^*\models h>k$ for every standard $k\in\mathbb N$, so $h$ is larger than every element in the diagonal copy of the standard natural numbers.
[/example]
This example also points to a boundary of the idea: transfer preserves first-order induction, but it does not make the new element standard.
[remark: Why This Does Not Contradict Induction]
The ultrapower satisfies the same first-order theory as $\mathbb N$ in the given language, including every first-order instance of induction expressible in that language. The element $h$ is nonstandard from the external viewpoint because it is not equal to any diagonal constant $[n\mapsto k]$. First-order induction controls definable subsets inside the model; it does not say that every element is named by a standard numeral.
[/remark]
The same distinction appears in ordered fields, where an ultrapower can contain infinitesimal elements while preserving every first-order sentence true in the original real field.
[example: Infinitesimals in an Ultrapower of the Reals]
Let $R^*=\mathbb R^{\mathbb N}/\mathcal U$, where $\mathcal U$ is a non-principal ultrafilter on $\mathbb N=\{1,2,3,\dots\}$. Define $\varepsilon\in\mathbb R^{\mathbb N}$ by
\begin{align*}
\varepsilon(n)=\frac{1}{n}.
\end{align*}
For every $n\in\mathbb N$,
\begin{align*}
\varepsilon(n)=\frac{1}{n}>0
\end{align*}
in $\mathbb R$, so the coordinate truth set for $\varepsilon>0$ is
\begin{align*}
\{n\in\mathbb N:\mathbb R\models \varepsilon(n)>0\}
=\mathbb N.
\end{align*}
Since $\mathbb N\in\mathcal U$, *Łoś theorem* gives
\begin{align*}
R^*\models [\varepsilon]>0.
\end{align*}
Now fix a standard real number $r>0$, and let $c_r\in\mathbb R^{\mathbb N}$ be the constant sequence $c_r(n)=r$. By the Archimedean property of $\mathbb R$, choose $m\in\mathbb N$ such that
\begin{align*}
m>\frac{1}{r}.
\end{align*}
Since $r>0$, this implies
\begin{align*}
\frac{1}{m}<r.
\end{align*}
If $n\ge m$, then
\begin{align*}
0<\frac{1}{n}\le \frac{1}{m}<r,
\end{align*}
and therefore
\begin{align*}
\{n\in\mathbb N:\mathbb R\models \varepsilon(n)<c_r(n)\}
\supseteq
\{n\in\mathbb N:n\ge m\}.
\end{align*}
The complement of $\{n\in\mathbb N:n\ge m\}$ is the finite set $\{1,\dots,m-1\}$. A non-principal ultrafilter on $\mathbb N$ contains no finite set, so it contains every cofinite set; hence
\begin{align*}
\{n\in\mathbb N:\mathbb R\models \varepsilon(n)<c_r(n)\}\in\mathcal U.
\end{align*}
Applying *Łoś theorem* to $x<y$ gives
\begin{align*}
R^*\models [\varepsilon]<[c_r].
\end{align*}
Thus for every standard real $r>0$,
\begin{align*}
R^*\models 0<[\varepsilon]<r.
\end{align*}
The element $[\varepsilon]$ is therefore a positive infinitesimal in the ultrapower, while the ultrapower remains elementarily equivalent to $\mathbb R$ by the ultrapower elementary equivalence theorem.
[/example]
These examples mark the beginning of nonstandard methods in model theory. Compactness and ultraproducts do not merely prove existence theorems; they create models whose new elements encode infinite limiting behaviour while preserving all first-order truths of the original structures.
Compactness is most effective when used in reverse, by prescribing finite fragments of a desired behavior and letting the theorem assemble them into a model. The applications in the next chapter show how this idea produces non-definability results, special kinds of models, and existence arguments that cannot be obtained by direct construction alone.
# 5. Applications of Compactness
Compactness becomes most useful when it is used backwards. Instead of building one model directly, we prescribe every finite piece of the desired behaviour and let the compactness theorem assemble a model satisfying all of it at once. This chapter develops the standard applications: non-definability, nonstandard models, algebraic extension arguments, and the warning that first-order logic cannot enforce genuinely infinite global conditions by inspecting only finite fragments. The organizing theme is that first-order theories control finite pieces of information, while compactness turns coherent finite control into a single global structure.
## Non-Definability from Compactness
Which mathematical properties fail to be expressible by a single first-order sentence, even when they feel structurally natural? Compactness gives a uniform test: if a proposed sentence would separate a class from structures that approximate it finitely, then the sentence would force an impossible compactness conclusion.
[remark: Compactness Non-Definability Pattern]
To show that a class $\mathcal C$ is not first-order axiomatizable, one often assumes an axiomatization $\Gamma$ and adds a set $\Sigma$ of requirements that force a structure outside $\mathcal C$. If every finite subset of $\Gamma\cup\Sigma$ is satisfiable but the whole set has no model in $\mathcal C$, compactness contradicts the assumed axiomatization.
The finite-satisfiability hypothesis is essential: compactness has no force if some finite obstruction is already present. The failure condition is also essential, because otherwise the compactness model would not contradict the proposed class. For example, if $\mathcal C$ is the class of all graphs and $\Gamma$ is empty, adding infinitely many distinct constants produces an infinite graph but refutes nothing, since infinite graphs still lie in that class. The method says that a proposed first-order description cannot separate $\mathcal C$ from structures obtained by finitely satisfiable approximation.
[/remark]
We first apply the template to cardinality. The finite fragments ask only for finitely many distinct named elements, so they can be hidden inside a large finite structure.
[example: Finiteness Is Not First-Order Definable for Graphs]
Work in the graph language $\mathcal{L}=\{E\}$, where $E$ is a binary relation, and suppose that a sentence $\varphi$ is true exactly in finite graphs. Expand the language by constants $c_1,c_2,\dots$, and consider
\begin{align*}
T=\{\varphi\}\cup\{c_i\neq c_j:i,j\in\mathbb{N},\ i\neq j\}.
\end{align*}
Let $T_0\subseteq T$ be finite. Then $T_0$ contains only finitely many inequalities, so there is some $N$ such that every constant appearing in $T_0$ lies among $c_1,\dots,c_N$. Take a finite graph with at least $N$ vertices, for instance the empty graph on $\{1,\dots,N\}$, and interpret $c_i$ as $i$ for each constant $c_i$ appearing in $T_0$. Since this graph is finite, it satisfies $\varphi$, and since the chosen interpretations are distinct, it satisfies every inequality from $T_0$.
Thus every finite subset of $T$ is satisfiable. By the *Compactness Theorem*, $T$ has a model $M$. Since $M\models\varphi$, the sentence $\varphi$ says that the graph reduct of $M$ should be finite. But for every $i\neq j$,
\begin{align*}
M\models c_i\neq c_j,
\end{align*}
so the elements $c_1^M,c_2^M,\dots$ are pairwise distinct. Hence the domain of $M$ contains infinitely many elements, contradicting that $\varphi$ defines exactly the finite graphs.
[/example]
This example is the prototype: first-order logic can demand at least $n$ elements for each fixed $n$, but compactness prevents it from enforcing that these demands fail somewhere.
[example: Infinitude Is Not Defined by the Complement of Finiteness]
For each $n\in\mathbb{N}$, let
\begin{align*}
\theta_n=\exists x_1\dots \exists x_n\bigwedge_{1\le i<j\le n}x_i\neq x_j.
\end{align*}
The sentence $\theta_n$ says that the graph has at least $n$ distinct vertices: if $G\models\theta_n$, then the witnesses for $x_1,\dots,x_n$ are pairwise distinct; conversely, if $G$ has at least $n$ vertices, choose distinct vertices $a_1,\dots,a_n$, and then $G\models\bigwedge_{i<j}a_i\neq a_j$. Hence a graph satisfies
\begin{align*}
\{\theta_n:n\in\mathbb{N}\}
\end{align*}
exactly when it has at least $n$ vertices for every $n$, which is exactly to say that it is infinite.
This shows that infinitude is axiomatizable by an infinite first-order theory. It is not definable by a single first-order sentence. Indeed, suppose a sentence $\psi$ were true exactly in the infinite graphs. Then for every graph $G$,
\begin{align*}
G\models\neg\psi
&\Longleftrightarrow G\not\models\psi\\
&\Longleftrightarrow G\text{ is not infinite}\\
&\Longleftrightarrow G\text{ is finite}.
\end{align*}
Thus $\neg\psi$ would define the class of finite graphs, contradicting the preceding compactness argument that finiteness is not first-order definable for graphs. Infinitude can be enforced by all finite lower-bound sentences together, but not compressed into one sentence.
[/example]
A theory may describe an infinite phenomenon by infinitely many finite lower bounds. What compactness rules out is the compression of that phenomenon into one sentence when its complement would require a global finite bound.
## Building Models with Prescribed Finite Behavior
How can we build a structure satisfying infinitely many local requirements when each finite list of requirements is harmless? Compactness says that finite consistency is the only obstruction, so the construction reduces to checking finite pieces.
[quotetheorem:4294]
[citeproof:4294]
This is the first major model-building use of compactness. The assumption that $M$ is infinite is necessary: if $M$ were finite, the finitely many inequalities $c\neq\bar{m}$ for all $m\in M$ would already form an unsatisfiable finite set. The elementary diagram is also necessary for the stated conclusion, because it preserves all first-order truths with parameters from $M$, not merely the atomic operations and relations. The theorem does not identify the new extension uniquely, nor does it imply that the original theory is complete or decidable. It only says that finite satisfiability of the demand for a new element is enough to produce an elementary extension realizing that demand.
For arithmetic and ordered fields, the same proof becomes more concrete when the new element is required to dominate or be smaller than every standard named element. These examples show how compactness manufactures nonstandard behaviour while preserving first-order truth over the original structure.
[example: Nonstandard Natural Number Models]
Let $M=(\mathbb{N},+,\cdot,0,1,<)$, and expand the language by a constant symbol $\bar{n}$ for each $n\in\mathbb{N}$, interpreted in $M$ as $n$. Add one new constant symbol $c$, and consider
\begin{align*}
T=\operatorname{Diag}_{\mathrm{el}}(M)\cup\{\bar{n}<c:n\in\mathbb{N}\}.
\end{align*}
To see that every finite subset of $T$ is satisfiable, let $T_0\subseteq T$ be finite. The inequalities from $T_0$ have the form
\begin{align*}
\bar{n}_1<c,\ \bar{n}_2<c,\ \dots,\ \bar{n}_r<c
\end{align*}
for finitely many natural numbers $n_1,\dots,n_r$. Choose
\begin{align*}
N=1+\max\{n_1,\dots,n_r\}
\end{align*}
if $r>0$, and choose any $N\in\mathbb{N}$ if $r=0$. Interpret $c$ as $N$ in $M$. Then, for each $1\le i\le r$,
\begin{align*}
\bar{n}_i^M=n_i<N=c^M,
\end{align*}
so all displayed inequalities in $T_0$ hold. The remaining sentences of $T_0$ lie in $\operatorname{Diag}_{\mathrm{el}}(M)$, hence hold in $M$ by definition of the elementary diagram.
Thus every finite subset of $T$ is satisfiable. By the *Compactness Theorem*, there is a model $M^+\models T$. The constants $\bar{n}$ embed the original structure elementarily into the reduct of $M^+$, because $M^+$ satisfies every sentence in $\operatorname{Diag}_{\mathrm{el}}(M)$. Let
\begin{align*}
a=c^{M^+}.
\end{align*}
Since $M^+\models \bar{n}<c$ for every $n\in\mathbb{N}$, we have
\begin{align*}
\bar{n}^{M^+}<a
\end{align*}
for every standard natural number $n$. In particular, $a$ is not equal to any standard element: if $a=\bar{m}^{M^+}$, then $\bar{m}^{M^+}<a$ would give $\bar{m}^{M^+}<\bar{m}^{M^+}$, contradicting the elementary-diagram sentence expressing $\neg(\bar{m}<\bar{m})$. Therefore the elementary extension satisfies the same first-order sentences as $\mathbb{N}$ in the original language, while containing an element larger than every standard natural number.
[/example]
The construction does not say that arithmetic is complete or decidable. It says that first-order truth in the standard natural numbers cannot exclude an elementary extension once the language only records finitely many comparisons at a time.
[example: Non-Archimedean Real Closed Fields]
Let $\operatorname{RCF}$ be the first-order theory of real closed ordered fields. Add a constant symbol $\varepsilon$, and for each positive integer $n$ add the sentence
\begin{align*}
0<\varepsilon<\frac{1}{n},
\end{align*}
where $n$ denotes the ordered-field term $1+\cdots+1$ with $n$ summands.
Let $T_0$ be a finite subset of
\begin{align*}
\operatorname{RCF}\cup\left\{0<\varepsilon<\frac{1}{n}:n\ge 1\right\}.
\end{align*}
The infinitesimal requirements appearing in $T_0$ involve only finitely many integers $n_1,\dots,n_r$. If $r=0$, interpret $\varepsilon$ as $1$ in $\mathbb{R}$. If $r>0$, set
\begin{align*}
m=\max\{n_1,\dots,n_r\}
\qquad\text{and}\qquad
\varepsilon^{\mathbb{R}}=\frac{1}{m+1}.
\end{align*}
Then $\varepsilon^{\mathbb{R}}>0$. For each $i$, we have $n_i\le m<m+1$, so, since all terms are positive,
\begin{align*}
\frac{1}{m+1}<\frac{1}{n_i}.
\end{align*}
Thus
\begin{align*}
0<\varepsilon^{\mathbb{R}}=\frac{1}{m+1}<\frac{1}{n_i}
\end{align*}
for every displayed requirement in $T_0$. The remaining sentences of $T_0$ from $\operatorname{RCF}$ hold in $\mathbb{R}$ because $\mathbb{R}$ is a real closed ordered field.
Hence every finite subset is satisfiable. By the *Compactness Theorem*, there is a model $F\models\operatorname{RCF}$ with an element $\varepsilon^F$ such that
\begin{align*}
0<\varepsilon^F<\frac{1}{n}
\end{align*}
for every positive integer $n$. This element is a positive infinitesimal. The ordered field $F$ is not Archimedean: if it were, then for $x=1/\varepsilon^F$ there would be some positive integer $n$ with
\begin{align*}
\frac{1}{\varepsilon^F}<n.
\end{align*}
Multiplying by the positive element $\varepsilon^F$ gives
\begin{align*}
1<n\varepsilon^F.
\end{align*}
But $\varepsilon^F<1/n$ also gives
\begin{align*}
n\varepsilon^F<1,
\end{align*}
a contradiction. Therefore compactness produces a real closed ordered field containing a positive infinitesimal, and so a non-Archimedean real closed field.
[/example]
This example separates two notions that coincide in the usual real field: being real closed is first-order, while Archimedeanness is not forced by the first-order theory of real closed ordered fields. The obstruction is that Archimedeanness asks for one standard integer bound to work against each positive element, and compactness lets every finite approximation to such a demand look harmless inside $\mathbb R$. The full collection of requirements instead forces an infinitesimal in an elementary-style extension, which is why later non-axiomatizability statements must specify the class and language carefully.
[quotetheorem:4295]
[citeproof:4295]
The obstruction is not algebraic but logical: real closedness fixes the algebraic part of ordered-field behaviour while leaving the Archimedean condition exposed as a global order property. The result rules out arbitrary first-order axiomatizations, not only single sentences. It does not say that Archimedeanness is undefinable in second-order logic or in the surrounding set theory; those languages can quantify over sets or sequences in ways first-order ordered rings cannot. The infinitesimal phenomenon also previews the diagram method developed next, where compactness is used to add controlled new elements while preserving old first-order information.
## Compactness and Algebraic Diagrams
How can algebraic data given only partially be extended to a full structure? The diagram method packages the already-known atomic facts of a structure as a theory, then uses compactness to realize all finitely consistent extra conditions.
[definition: Atomic Diagram]
Let $M$ be an $\mathcal{L}$-structure. Expand $\mathcal{L}$ to $\mathcal{L}(M)$ by adding a constant symbol $\bar{m}$ for each $m\in M$. The atomic diagram of $M$, denoted $\operatorname{Diag}_{\mathrm{at}}(M)$, is the set of all atomic and negated atomic $\mathcal{L}(M)$-sentences true in $M$ under the interpretation $\bar{m}\mapsto m$.
[/definition]
The atomic diagram records the quantifier-free algebra of $M$: which equations, inequalities, relations, and function values already hold among named elements. For compactness arguments about elementary extensions, quantifier-free control is not enough, because formulas with quantifiers over old parameters must keep the same truth values. We therefore need a stronger diagram that records the full first-order theory of the named structure.
[definition: Elementary Diagram]
Let $M$ be an $\mathcal{L}$-structure. In the language $\mathcal{L}(M)$, the elementary diagram of $M$, denoted $\operatorname{Diag}_{\mathrm{el}}(M)$, is the set of all $\mathcal{L}(M)$-sentences true in $M$.
[/definition]
The elementary diagram is stronger: a model of it contains a copy of $M$ that is elementary rather than merely embedded as a substructure. This stronger diagram is the right background theory when we want to add a new element satisfying proposed conditions over $M$ without changing the truth of formulas about the old parameters. The natural question is therefore exactly a compactness question: if every finite batch of desired conditions can coexist with the elementary diagram of $M$, can all of them be realized together in one elementary extension?
[quotetheorem:4296]
[citeproof:4296]
The finite satisfiability assumption cannot be weakened: if a finite part of $\Sigma(c)$ already contradicts the elementary diagram, then no extension can realize the whole set. The elementary diagram is what upgrades the copy of $M$ from a merely compatible substructure to an elementary substructure; using only the atomic diagram would generally give an embedding preserving quantifier-free facts but not all formulas with parameters. The theorem also does not promise that $a$ lies in the original structure $M$, since the point is often to pass to an extension where a finitely consistent type becomes realized. This type-realization viewpoint connects compactness with algebraic extension arguments and with the later model-theoretic language of saturated structures and ultraproduct intuition.
[example: Realizing a Finite-System Pattern in a Field Extension]
For a field $K$, the notation $K[X]$ denotes the ring of polynomials in one indeterminate $X$ with coefficients in $K$. More generally, $K[x_1,\dots,x_n]$ denotes the polynomial ring in $n$ indeterminates over $K$.
Let $K$ be a field, and write each condition in $\Sigma(x)$ as either
\begin{align*}
p(x)=0
\qquad\text{or}\qquad
q(x)\neq 0
\end{align*}
with $p,q\in K[X]$. Work in the ring language expanded by constant symbols $\bar{k}$ for all $k\in K$ and by one new constant $c$. Consider the theory
\begin{align*}
T=\operatorname{Diag}_{\mathrm{at}}(K)\cup\{\text{field axioms}\}\cup\Sigma(c).
\end{align*}
If $T_0\subseteq T$ is finite, then it contains only finitely many polynomial constraints, say $\Sigma_0(c)$. By assumption, there is a field extension $E/K$ and an element $b\in E$ satisfying every condition in $\Sigma_0(x)$. Interpret $\bar{k}$ as the image of $k$ in $E$ and interpret $c$ as $b$. The field axioms hold in $E$, the finite part of $\operatorname{Diag}_{\mathrm{at}}(K)$ holds because the embedding $K\hookrightarrow E$ preserves addition, multiplication, $0$, and $1$, and $\Sigma_0(c)$ holds by the choice of $b$. Hence every finite subset of $T$ is satisfiable.
By the *Compactness Theorem*, there is a model $L^+\models T$. Let $L$ be its reduct to the ring language, and let
\begin{align*}
\iota:K\to L,\qquad \iota(k)=\bar{k}^{L^+}.
\end{align*}
For $k,\ell\in K$, the atomic diagram contains
\begin{align*}
\overline{k+\ell}=\bar{k}+\bar{\ell},
\qquad
\overline{k\ell}=\bar{k}\bar{\ell},
\qquad
\bar{0}=0,
\qquad
\bar{1}=1,
\end{align*}
so $\iota(k+\ell)=\iota(k)+\iota(\ell)$, $\iota(k\ell)=\iota(k)\iota(\ell)$, $\iota(0)=0$, and $\iota(1)=1$. If $k\neq\ell$, the atomic diagram contains $\bar{k}\neq\bar{\ell}$, so $\iota(k)\neq\iota(\ell)$. Thus $\iota$ embeds $K$ as a subfield of $L$, and after identifying $K$ with its image, $L/K$ is a field extension.
Finally set
\begin{align*}
a=c^{L^+}.
\end{align*}
If $p(X)=\alpha_0+\alpha_1X+\cdots+\alpha_nX^n\in K[X]$ and the constraint $p(c)=0$ belongs to $\Sigma(c)$, then $L^+\models p(c)=0$, which means
\begin{align*}
\bar{\alpha}_0+\bar{\alpha}_1a+\cdots+\bar{\alpha}_na^n=0.
\end{align*}
After identifying $\bar{\alpha}_i^{L^+}$ with $\alpha_i\in K$, this is exactly $p(a)=0$ in $L$. The same reading of terms shows that each inequation $q(c)\neq 0$ in $\Sigma(c)$ becomes $q(a)\neq 0$ in $L$. Therefore one field extension $L/K$ contains a single element $a$ satisfying all polynomial equations and inequations in $\Sigma(x)$ simultaneously.
[/example]
The same pattern works for groups, rings, ordered fields, and modules whenever finite pieces can be realized in compatible extensions. The model-theoretic gain is that compatibility is checked finitely but obtained globally.
[example: No First-Order Theory Axiomatizes Exactly the Finite Fields]
In the language of rings, suppose for contradiction that a first-order theory $T$ has exactly the finite fields as its models. For each $m\ge 1$, let
\begin{align*}
\theta_m=\exists x_1\dots\exists x_m\bigwedge_{1\le i<j\le m}x_i\neq x_j.
\end{align*}
The sentence $\theta_m$ says that the structure has at least $m$ distinct elements, because its witnesses $x_1,\dots,x_m$ are required to be pairwise unequal.
Consider
\begin{align*}
T^*=T\cup\{\theta_m:m\ge 1\}.
\end{align*}
Let $T_0\subseteq T^*$ be finite. Then $T_0$ contains only finitely many of the lower-bound sentences, say $\theta_{m_1},\dots,\theta_{m_r}$. If $r=0$, choose any finite field, for instance $\mathbb{F}_2$. If $r>0$, set
\begin{align*}
M=\max\{m_1,\dots,m_r\}.
\end{align*}
Choose $n$ with $2^n\ge M$; the finite field $\mathbb{F}_{2^n}$ has $2^n$ elements, so it satisfies each $\theta_{m_i}$, since
\begin{align*}
m_i\le M\le 2^n
\end{align*}
for every $i$. It also satisfies every sentence from the finite fragment of $T$, because $T$ is assumed to hold in exactly the finite fields. Hence every finite subset of $T^*$ is satisfiable.
By the *Compactness Theorem*, $T^*$ has a model $K$. Since $K\models T$, the assumed axiomatization says that $K$ is a finite field. Let $|K|=s$. But $K\models\theta_{s+1}$, so there are elements
\begin{align*}
a_1,\dots,a_{s+1}\in K
\end{align*}
such that $a_i\neq a_j$ whenever $i\neq j$. Thus $K$ contains at least $s+1$ distinct elements, contradicting $|K|=s$. Therefore no first-order theory axiomatizes exactly the finite fields.
[/example]
This argument deliberately ignores the internal classification of finite fields. It uses only unbounded finite size, which is precisely the kind of information compactness cannot turn into a first-order upper bound.
## Preservation Warnings and Finite Fragments
What kinds of mathematical properties are invisible to first-order logic because they require control over all finite stages at once? Compactness warns that first-order theories preserve finite satisfiability too well: if every finite obstruction can be avoided, then the full obstruction cannot be first-order enforced.
[example: Well-Ordering Is Not First-Order Axiomatizable]
Suppose, toward a contradiction, that there is a first-order theory $T$ in the language of linear orders whose models are exactly the well-orders. Add constant symbols $c_1,c_2,\dots$, and consider
\begin{align*}
T^*=T\cup\{c_{n+1}<c_n:n\in\mathbb{N}\}.
\end{align*}
We show that every finite subset of $T^*$ is satisfiable. Let $T_0\subseteq T^*$ be finite. The descending-chain requirements appearing in $T_0$ have the form
\begin{align*}
c_{n_1+1}<c_{n_1},\quad c_{n_2+1}<c_{n_2},\quad \dots,\quad c_{n_r+1}<c_{n_r}
\end{align*}
for finitely many indices $n_1,\dots,n_r$. If $r=0$, take any finite well-order. If $r>0$, set
\begin{align*}
m=\max\{n_1,\dots,n_r\}.
\end{align*}
Use the finite linear order
\begin{align*}
0<1<\cdots<m.
\end{align*}
For each $1\le i\le m+1$, interpret
\begin{align*}
c_i=m+1-i.
\end{align*}
Then for every $n\le m$,
\begin{align*}
c_{n+1}=m+1-(n+1)=m-n<m+1-n=c_n,
\end{align*}
so every displayed inequality in $T_0$ holds. The finite order $0<1<\cdots<m$ is a well-order, so it satisfies every sentence from the finite fragment of $T$, because $T$ is assumed to hold exactly in the well-orders.
Thus every finite subset of $T^*$ is satisfiable. By the *Compactness Theorem*, there is a model $M\models T^*$. Since $M\models T$, the assumption on $T$ says that the order of $M$ is a well-order. But $M$ also satisfies
\begin{align*}
c_{n+1}^M<c_n^M
\end{align*}
for every $n\in\mathbb{N}$. Hence the nonempty set
\begin{align*}
A=\{c_n^M:n\in\mathbb{N}\}
\end{align*}
has no least element: if $c_m^M\in A$, then $c_{m+1}^M\in A$ and
\begin{align*}
c_{m+1}^M<c_m^M.
\end{align*}
This contradicts the definition of a well-order. Therefore no first-order theory axiomatizes exactly the well-orders.
[/example]
The well-ordering property is global: it forbids infinite descending sequences, but first-order sentences can only inspect finitely many named positions in any one formula.
[example: Infinite Graph with Every Finite Subgraph $k$-Colorable]
Fix $k\ge 1$, and let $G=(V,E)$ be a graph such that every finite subgraph of $G$ is $k$-colorable. Add constant symbols $\bar v$ for all $v\in V$ and unary predicates $P_1,\dots,P_k$. Consider the theory consisting of the atomic diagram of $G$, together with the coloring requirements
\begin{align*}
P_1(\bar v)\vee\cdots\vee P_k(\bar v)
\end{align*}
for each $v\in V$,
\begin{align*}
\neg(P_i(\bar v)\wedge P_j(\bar v))
\end{align*}
for each $v\in V$ and each $1\le i<j\le k$, and
\begin{align*}
\neg(P_i(\bar u)\wedge P_i(\bar v))
\end{align*}
for each edge $E(u,v)$ of $G$ and each $1\le i\le k$.
Let $T_0$ be a finite subset of this theory. Only finitely many constants occur in $T_0$; write them as
\begin{align*}
\bar v_1,\dots,\bar v_m.
\end{align*}
The induced subgraph of $G$ on $\{v_1,\dots,v_m\}$ is finite, so by assumption it has a $k$-coloring
\begin{align*}
\chi:\{v_1,\dots,v_m\}\to\{1,\dots,k\}.
\end{align*}
Interpret the graph symbols and constants as they are interpreted in $G$, and define the predicates on the mentioned vertices by
\begin{align*}
P_i(v_j)\quad\Longleftrightarrow\quad \chi(v_j)=i.
\end{align*}
On vertices not among $v_1,\dots,v_m$, define the predicates arbitrarily. The atomic-diagram sentences in $T_0$ hold because they are true in $G$. For each mentioned vertex $v_j$, exactly one value $\chi(v_j)$ is chosen, so
\begin{align*}
P_1(v_j)\vee\cdots\vee P_k(v_j)
\end{align*}
holds and
\begin{align*}
\neg(P_i(v_j)\wedge P_\ell(v_j))
\end{align*}
holds whenever $i\neq \ell$. If $E(v_a,v_b)$ and $1\le i\le k$, then $\chi(v_a)$ and $\chi(v_b)$ cannot both equal $i$, since $\chi$ is a proper coloring of the finite subgraph. Hence
\begin{align*}
\neg(P_i(v_a)\wedge P_i(v_b))
\end{align*}
holds for every edge-color restriction appearing in $T_0$.
Thus every finite subset of the expanded theory is satisfiable. By the *Compactness Theorem*, the whole theory has a model $M$. For each $v\in V$, the sentence
\begin{align*}
P_1(\bar v)\vee\cdots\vee P_k(\bar v)
\end{align*}
holds in $M$, and the pairwise-exclusion sentences ensure that exactly one predicate $P_i$ holds of $\bar v^M$. Define
\begin{align*}
\chi(v)=i\quad\Longleftrightarrow\quad M\models P_i(\bar v).
\end{align*}
If $E(u,v)$ in $G$, then for every $i$ the theory contains
\begin{align*}
\neg(P_i(\bar u)\wedge P_i(\bar v)),
\end{align*}
so $\chi(u)\neq \chi(v)$. Therefore $\chi$ is a $k$-coloring of the whole graph $G$. Compactness turns the compatible finite colorings into one global coloring.
[/example]
Here compactness proves a positive theorem rather than a non-definability result. The same finite-fragment principle says that a global coloring exists because all finite demands are jointly satisfiable.
[remark: Compactness Does Not Preserve Second-Order Meaning]
Compactness is a theorem about first-order theories. Properties such as finiteness, well-ordering, the standard natural numbers, and Archimedeanness can be expressed naturally in stronger logics or in set-theoretic metatheory, but they are not controlled by first-order syntax alone. When a proposed axiom system tries to enforce one of these properties, the compactness test should be the first diagnostic.
[/remark]
The chapter's applications all have the same shape. Add constants or predicates to name the finite behaviour desired, verify that every finite fragment has a model, and then invoke compactness to obtain a single model realizing the whole scheme. The resulting model either supplies the desired construction or contradicts a proposed first-order definition.
Compactness tells us that finitely satisfiable conditions can be realized, but it says nothing yet about how large the resulting models must be. The Löwenheim-Skolem theorems supply that missing control by showing that first-order theories admit models of many sizes, often far smaller or larger than one might expect.
# 6. Löwenheim–Skolem Theorems
The compactness theorem showed that first-order theories can have models built by controlling only finitely many sentences at a time. This chapter studies a complementary kind of control: the size of models. The Löwenheim-Skolem theorems say that, once an infinite model exists, first-order logic usually cannot determine its cardinality. We prove the downward theorem by closing a chosen set under witnesses, prove the upward theorem by adding many constants and using compactness, and then explain why these results produce the Skolem paradox without producing a contradiction.
## Building Small Elementary Substructures
The first problem is the following. Given a large structure $M$ and a small set of parameters $A \subset M$, can we find a small elementary substructure of $M$ containing $A$? A substructure alone is too weak, because existential formulas may have witnesses in $M$ which are absent from the smaller universe. The construction therefore closes $A$ under a controlled supply of witnesses for existential formulas.
[definition: Skolem Function For A Formula]
Let $L$ be a first-order language, let $M$ be an $L$-structure, and let $\varphi(x,y)$ be an $L$-formula, where $x$ is a single variable and $y=(y_1,\dots,y_n)$ is a finite tuple of variables. A Skolem function for $\varphi$ in $M$ is a function $f_\varphi:M^n\to M$ such that, for every $a\in M^n$, if $M\models \exists x\,\varphi(x,a)$, then $M\models \varphi(f_\varphi(a),a)$.
[/definition]
If no witness exists for a parameter tuple $a$, the value of $f_\varphi(a)$ is irrelevant. These externally chosen functions are useful only after we close a parameter set under them.
The next object is the closure generated by that requirement. It is the candidate small universe: it contains the starting parameters and all witnesses demanded by the chosen formulas, while being built by an explicit finite-iteration process.
[definition: Skolem Hull]
Let $M$ be an $L$-structure, let $A\subset M$, and let $\mathcal F$ be a family of functions such that each $f\in\mathcal F$ has some finite arity $n$ and is a function $f:M^n\to M$. The Skolem hull of $A$ under $\mathcal F$ is the smallest subset $H\subset M$ such that $A\subset H$ and, whenever $f\in\mathcal F$ has arity $n$ and $a\in H^n$, then $f(a)\in H$.
[/definition]
The hull is obtained by starting with $A$ and iterating all functions in $\mathcal F$ finitely many times. The key payoff is that this closure process can be made small while still supplying witnesses for existential formulas over the closed set. The following result packages that idea into the construction needed for downward Löwenheim-Skolem.
[quotetheorem:4297]
[citeproof:4297]
This lemma packages the main idea behind downward Löwenheim-Skolem: elementary submodels are produced by making sure every existential demand that is satisfiable in the big model has a witness inside the small set. The closure under the original function symbols is necessary because, without it, $H$ might not even carry the induced structure: in the language of groups, a subset not closed under multiplication is not a subgroup. The closure under Skolem functions is the extra ingredient which upgrades substructure to elementary substructure; for instance, a subfield of an algebraically closed field may fail to contain a root of a polynomial whose coefficients lie in the subfield. The next theorem isolates this witness condition directly, so that we can recognise elementary substructures without naming the auxiliary Skolem functions.
[quotetheorem:4280]
[citeproof:4280]
As introduced in Chapter 2, the Tarski-Vaught test is often the most convenient form of the Skolem hull argument. Rather than mentioning chosen functions, it states the exact witness property needed for elementarity. The assumption that $N$ is already a substructure is essential: atomic formulas and terms must have the same interpretation before the induction on formulas can begin. The test also shows what elementarity does not mean: it does not require $N$ and $M$ to have the same elements, only that existential facts over parameters from $N$ have witnesses inside $N$. This is the local form of reflection that will be used to shrink large structures while preserving the formulas relevant to a chosen parameter set.
[example: Countable Elementary Submodel Of An Uncountable Structure]
Let $L$ be countable, let $M$ be an uncountable $L$-structure, and choose a countable set $A\subset M$. For each existential formula $\exists x\,\varphi(x,y)$, choose a Skolem function $f_\varphi$ in $M$, and also include the interpretations of the function symbols and constant symbols of $L$ among the closure operations. Since $L$ is countable, the set of $L$-formulas is countable, so the total family of these finitary operations is countable.
Define stages by
\begin{align*}
H_0&=A\cup\{c^M:c\text{ is a constant symbol of }L\},\\
H_{n+1}&=H_n\cup\{g(a_1,\dots,a_k):g\text{ is one of the chosen finitary operations and }(a_1,\dots,a_k)\in H_n^k\}.
\end{align*}
The set $H_0$ is countable because it is the union of two countable sets. If $H_n$ is countable and $k<\omega$, then $H_n^k$ is countable; for each fixed operation $g$ of arity $k$, the set of values $g[H_n^k]$ is therefore countable. Taking the union over countably many operations still gives a countable set, so $H_{n+1}$ is countable. By induction every $H_n$ is countable, and hence
\begin{align*}
H=\bigcup_{n<\omega}H_n
\end{align*}
is countable as a countable union of countable sets.
The set $H$ contains $A$, is closed under the original function symbols of $L$, and is closed under all the chosen Skolem functions. Therefore the induced $L$-substructure $N$ with universe $H$ satisfies $N\preccurlyeq M$ by the *Skolem Hull Lemma*. Thus every countable parameter set in $M$ is contained in a countable elementary substructure of $M$.
[/example]
This example is the basic mechanism behind many applications: arguments about a large structure can often be reflected down to a smaller elementary substructure containing the relevant parameters.
## The Downward Löwenheim-Skolem Theorem
The next question is how far the preceding construction can be pushed when the language or parameter set is not countable. The correct size bound is the larger of the language size, the parameter set size, and $\aleph_0$, because there must be enough room for the parameters, the symbols of the language, and the countably many logical operations used to form formulas. The theorem cannot ignore the language size: if $L$ has constants naming many different elements, any elementary substructure must contain their interpretations. It also cannot ignore the parameter set, since asking for $A\subset N$ already forces $|N|\ge |A|$. This is the first sign that first-order logic measures complexity through formulas and finite tuples rather than through the full size of the ambient universe.
[quotetheorem:4299]
[citeproof:4299]
The theorem does not say that every substructure of the bounded size is elementary. It says that a suitable one exists, and its construction is driven by witnesses for formulas. The infinitude assumption matters because finite structures can have their exact size described up to isomorphism by a single first-order sentence saying that there are exactly $n$ elements. The bound is also sharp in the expected sense: if the language contains $\kappa$ many distinct constant symbols interpreted differently, then every elementary substructure must have size at least $\kappa$. In algebraically closed fields, this abstract bound produces genuine elementary subfields rather than arbitrary small subsets.
[example: Countable Elementary Subfields Of Algebraically Closed Fields]
Let $K$ be an uncountable algebraically closed field in the language of rings, and let $A\subset K$ be countable. By *Downward Löwenheim-Skolem*, since the language of rings has only finitely many symbols, there is an elementary substructure $F\preccurlyeq K$ such that $A\subset F$ and $F$ is countable.
Because $F$ is an $L$-substructure of $K$, it contains the interpretations of $0$ and $1$ and is closed under the ring operations. Thus, if $a,b\in F$, then
\begin{align*}
a+b\in F,\qquad -a\in F,\qquad a\cdot b\in F.
\end{align*}
The field axioms are first-order sentences in the language of rings, and $K$ satisfies them. Since $F\preccurlyeq K$, every first-order sentence true in $K$ is true in $F$, so $F$ is a field.
To see algebraic closedness, fix $n\ge 1$. The assertion that every nonconstant polynomial of degree at most $n$ has a root is expressed by the first-order sentence
\begin{align*}
\forall a_0\cdots \forall a_n\,
\bigl(a_n\ne 0 \to \exists x\,(a_nx^n+a_{n-1}x^{n-1}+\cdots+a_1x+a_0=0)\bigr).
\end{align*}
This sentence holds in $K$ because $K$ is algebraically closed. By elementarity it also holds in $F$. Since this is true for every $n\ge 1$, every nonconstant polynomial with coefficients in $F$ has a root in $F$, so $F$ is algebraically closed.
Thus every countable parameter set inside an uncountable algebraically closed field is contained in a countable elementary subfield which is itself algebraically closed.
[/example]
This example is stronger than merely taking the subfield generated by $A$. The generated subfield need not be algebraically closed and need not be elementary; for example, $\mathbb Q$ inside $\mathbb C$ does not contain a root of $x^2-2$. The Skolem construction adds witnesses for all first-order existential requirements, including algebraic roots and more complicated definable elements. This algebraic illustration is a useful warning for the rest of the chapter: small elementary substructures can preserve first-order truth while looking very different from the ambient structure as plain sets.
## Creating Large Models By Compactness
The downward theorem produces small elementary substructures of a given model. The opposite problem asks whether a theory with one infinite model must also have larger models. The word infinite is essential: a theory saying that there are exactly $n$ elements has no model of any larger size. Compactness gives a direct method in the infinite case: add many constants and require them to be distinct. The finite satisfiability check works precisely because an infinite model can realise any finite demand for distinct elements.
[quotetheorem:1489]
[citeproof:1489]
The theorem shows that no first-order theory with an infinite model can impose a largest infinite cardinality. The lower bound $\kappa\ge |L|+\aleph_0$ ensures that, after the large model is built, downward Löwenheim-Skolem can trim it to size exactly $\kappa$ without losing the named constants. The theorem does not say that the larger model is an elementary extension of a particular original model; it only produces some model of the same theory. To force an elementary extension, we must preserve all formulas with parameters from the original structure, which is the role of the elementary diagram in the next example.
[example: Elementary Extensions Of Arbitrarily Large Cardinality]
Let $M$ be an infinite $L$-structure and let $\kappa\ge |L|+|M|+\aleph_0$. Expand the language by adding a constant symbol $d_a$ for each $a\in M$ and a further family of constant symbols $(c_i)_{i<\kappa}$. Let $\operatorname{Diag}_{\mathrm{el}}(M)$ be the elementary diagram of $M$ in the language with the constants $d_a$, and consider the expanded theory
\begin{align*}
\Gamma
=
\operatorname{Diag}_{\mathrm{el}}(M)
\cup
\{c_i\ne c_j:i,j<\kappa,\ i\ne j\}.
\end{align*}
We show that every finite $\Gamma_0\subset \Gamma$ is satisfiable. Only finitely many constants $c_{i_1},\dots,c_{i_n}$ occur in $\Gamma_0$. Since $M$ is infinite, choose distinct elements $m_1,\dots,m_n\in M$. Expand $M$ by interpreting each $d_a$ as $a$ and each mentioned $c_{i_r}$ as $m_r$. Then every sentence from $\operatorname{Diag}_{\mathrm{el}}(M)$ appearing in $\Gamma_0$ is true by the definition of the elementary diagram, and for $r\ne s$ we have
\begin{align*}
c_{i_r}^M=m_r\ne m_s=c_{i_s}^M,
\end{align*}
so every required inequality between the mentioned constants is also true. Thus $\Gamma_0$ has a model.
By *Compactness Theorem*, $\Gamma$ has a model $B$. Define $e:M\to B$ by
\begin{align*}
e(a)=d_a^B.
\end{align*}
If $a\ne b$ in $M$, then the sentence $d_a\ne d_b$ belongs to $\operatorname{Diag}_{\mathrm{el}}(M)$, so $B\models d_a\ne d_b$ and hence $e(a)\ne e(b)$. Thus $e$ is injective. Moreover, for every $L$-formula $\varphi(x_1,\dots,x_n)$ and every tuple $a_1,\dots,a_n\in M$, exactly one of
\begin{align*}
M\models \varphi(a_1,\dots,a_n),
\qquad
M\models \neg\varphi(a_1,\dots,a_n)
\end{align*}
holds. In the first case, $\varphi(d_{a_1},\dots,d_{a_n})$ belongs to the elementary diagram, so
\begin{align*}
B\models \varphi(e(a_1),\dots,e(a_n)).
\end{align*}
In the second case, $\neg\varphi(d_{a_1},\dots,d_{a_n})$ belongs to the elementary diagram, so
\begin{align*}
B\models \neg\varphi(e(a_1),\dots,e(a_n)).
\end{align*}
Therefore $e[M]$ is an elementary copy of $M$ inside the $L$-reduct of $B$.
Finally, since $\Gamma$ contains $c_i\ne c_j$ for all distinct $i,j<\kappa$, the map
\begin{align*}
i\longmapsto c_i^B
\end{align*}
is an injection from $\kappa$ into $B$. Hence $|B|\ge \kappa$. So $M$ has an elementary extension, up to the displayed elementary copy, of cardinality at least $\kappa$.
[/example]
For this application, the elementary diagram makes the extension elementary rather than merely a model of the same theory. The constants naming elements of $M$ force all first-order facts with parameters from $M$ to be preserved, so the embedding of $M$ into the compactness model is elementary. This distinction is important in algebra and set theory: two structures can satisfy the same sentences while failing to preserve formulas with parameters from a chosen substructure. The next section returns to the witness mechanism behind this preservation and turns it into a formal expansion of the language.
## Skolemization And Cardinality Control
The preceding proofs used Skolem functions externally. A related syntactic construction adds formal function symbols to the language so that existential witnesses are named by terms. This is useful for model construction because it turns existential search into closure under functions.
[definition: Skolemization]
Let $T$ be an $L$-theory. A Skolemization of $T$ is an expansion $L^{\mathrm{Sk}}$ of $L$ obtained by adding, for selected formulas $\exists x\,\varphi(x,y)$, a function symbol $f_\varphi$ of arity $|y|$, together with axioms
\begin{align*}
\forall y\,\bigl(\exists x\,\varphi(x,y) \to \varphi(f_\varphi(y),y)\bigr).
\end{align*}
[/definition]
After Skolemization, substructures of models of the expanded theory are far more likely to be elementary in the reduct, because closure under the added functions supplies witnesses. The phrase "for all existential $L$-formulas" is crucial: adding Skolem functions for only some formulas may leave another existential formula with witnesses in the ambient model but none in the substructure.
The obstruction is that being a substructure only preserves the values of named functions; it does not by itself ensure that existential witnesses from the larger model remain inside the smaller one. The next result identifies the extra naming condition under which closure in the Skolemized language exactly supplies the missing witnesses needed for elementarity in the original language.
[quotetheorem:4300]
[citeproof:4300]
The distinction between external and formal Skolem functions matters in practice. External functions prove existence of elementary substructures inside a fixed model; formal Skolem functions modify the language so that elementary substructures can be produced by ordinary closure. The theorem does not assert elementarity for formulas in the expanded language unless the relevant Skolem expansion is itself treated as the base language and the same witness condition is checked there. It also explains why omitting even one family of existential witnesses can break the argument: the Tarski-Vaught condition is an all-formulas condition. The following example converts this logical observation into a concrete countability computation.
[example: Cardinality Of A Skolem Hull In A Skolemized Language]
Suppose $L$ is countable, $L^{\mathrm{Sk}}$ is obtained from $L$ by adding countably many Skolem function symbols, and $M^{\mathrm{Sk}}$ is a model of the Skolemized theory. Let $A\subset M$ be countable. Enumerate the function symbols of $L^{\mathrm{Sk}}$ as a countable family of finitary operations, and define
\begin{align*}
H_0&=A\cup\{c^{M^{\mathrm{Sk}}}:c\text{ is a constant symbol of }L^{\mathrm{Sk}}\},\\
H_{n+1}&=H_n\cup\{f(a_1,\dots,a_k):f\text{ is a }k\text{-ary function symbol of }L^{\mathrm{Sk}},\ (a_1,\dots,a_k)\in H_n^k\}.
\end{align*}
The set $H_0$ is countable because it is the union of the countable set $A$ and the countable set of interpretations of constant symbols.
Assume $H_n$ is countable. For each fixed $k<\omega$, the set $H_n^k$ is countable: for $k=0$ it has one element, and for $k\ge 1$ it is a finite Cartesian product of countable sets. If $f$ has arity $k$, then
\begin{align*}
f[H_n^k]=\{f(a_1,\dots,a_k):(a_1,\dots,a_k)\in H_n^k\}
\end{align*}
is countable because it is the image of a countable set. Since there are only countably many function symbols in $L^{\mathrm{Sk}}$, the union of all such sets $f[H_n^k]$ is countable. Hence $H_{n+1}$ is countable. By induction, every $H_n$ is countable, and therefore
\begin{align*}
H=\bigcup_{n<\omega}H_n
\end{align*}
is countable as a countable union of countable sets.
The set $H$ contains $A$ and is closed under every function symbol of $L^{\mathrm{Sk}}$: if $a_1,\dots,a_k\in H$, then each $a_i$ lies in some $H_{n_i}$, so with $n=\max\{n_1,\dots,n_k\}$ we have $(a_1,\dots,a_k)\in H_n^k$, and hence $f(a_1,\dots,a_k)\in H_{n+1}\subset H$. Thus $H$ is exactly the universe of the $L^{\mathrm{Sk}}$-substructure generated by $A$. By *Skolem Hulls Are Elementary Substructures*, the $L$-reduct of this substructure is an elementary substructure of the original $L$-reduct of $M^{\mathrm{Sk}}$. So, in a countable Skolemized language, the downward Löwenheim-Skolem construction is precisely the countable closure of $A$ under countably many finitary operations.
[/example]
Skolemization therefore explains why first-order logic has a persistent countability phenomenon in countable languages: only countably many formulas can ask for witnesses, and each asks using only finitely many parameters.
## The Skolem Paradox
The final question is conceptual. If set theory has a model, downward Löwenheim-Skolem can produce a countable elementary model of a large fragment of set theory. How can such a model satisfy sentences saying that uncountable sets exist?
[example: Countable Models Of Set-Theoretic Fragments]
Let $T$ be a countable first-order fragment of set theory with an infinite model $M$. Since $T$ is countable and we take no extra parameter set, *Downward Löwenheim-Skolem* gives an elementary submodel $N\preccurlyeq M$ whose underlying set is countable.
Suppose $N$ satisfies the sentence saying that there is no bijection from $\omega$ onto its real line. Written internally, this means
\begin{align*}
N\models \neg\exists f\,\bigl(f:\omega^N\to \mathbb R^N\text{ is a bijection}\bigr).
\end{align*}
Thus, inside $N$, there is no element $f\in N$ which $N$ recognizes as a bijection from $\omega^N$ onto $\mathbb R^N$. In this precise sense, $N$ believes that $\mathbb R^N$ is uncountable.
Externally, however, the domain of $N$ is countable. Since the collection that $N$ calls $\mathbb R$ is a subset of the domain of $N$, we have
\begin{align*}
\mathbb R^N\subseteq N.
\end{align*}
A subset of a countable set is countable, so from the outside viewpoint there is some enumeration of the elements of $\mathbb R^N$. There is no contradiction: the external enumeration need not be an element of $N$, and only functions belonging to $N$ are available to the internal existential quantifier displayed above.
[/example]
There is no contradiction because the bijections available inside $N$ are only the functions which are elements of $N$. An external enumeration of the domain of $N$ need not itself belong to $N$, and so it need not be seen by $N$. This is another instance of the same hypothesis sensitivity seen above: first-order quantifiers range over the universe of the structure, not over the surrounding metatheory. The apparent paradox comes from silently switching between these two ranges of quantification.
[explanation: Internal And External Countability]
A set $X\in N$ is internally countable in $N$ if $N$ contains a function which it satisfies is a bijection from $\omega$ onto $X$. The same set is externally countable if, from the surrounding metatheory, its collection of elements can be placed in bijection with $\mathbb N$. These are different notions because internal countability quantifies only over functions present in the model, while external countability quantifies over all functions available in the metatheory.
The Skolem paradox is the observation that a countable model of set theory can contain sets which it regards as uncountable. The paradox dissolves once the phrase "there exists a bijection" is read relative to a model. First-order satisfaction is internal to the structure, while the assertion that the domain of the structure is countable is made externally about that domain.
[/explanation]
This distinction is a recurring theme in model theory. A structure may be small from the outside while internally satisfying rich statements about large mathematical objects. In set theory the contrast is especially vivid because the objects inside the structure include functions, bijections, ordinals, and cardinals as the model understands them. The Löwenheim-Skolem theorem therefore does not refute uncountability results; it shows that first-order theories cannot force their intended external universe merely by describing internal membership relations.
[remark: Why The Paradox Matters]
The Skolem paradox is not a defect in set theory or in the Löwenheim-Skolem theorem. It records a limitation of first-order expressibility: first-order theories can describe membership, functions, and bijections inside their models, but they do not control the external cardinality of the whole domain once an infinite model exists.
[/remark]
The Löwenheim-Skolem theorems complete the first major arc of the course. Compactness and elementary substructures together show that first-order theories are flexible across cardinalities, and the later study of quantifier elimination and model completeness will ask when this flexibility can still be controlled by syntactic normal forms.
After compactness and the Löwenheim-Skolem theorems, we know that first-order theories are flexible across cardinalities, but we still need a way to control their internal form. Quantifier elimination answers that question by turning semantic information into syntactic normal forms, making embeddings and definability far more transparent.
# 7. Quantifier Elimination: Method and First Examples
Quantifier elimination is the point where the syntactic and semantic sides of the course meet in a usable form. The compactness theorem from Chapter 4 gave us existence arguments, diagrams from Chapter 2 gave us control over embeddings, and elementary equivalence compared structures by sentences. In this chapter we learn a stronger method: in good theories, every first-order formula is equivalent to a quantifier-free formula, so definability can be read directly from the language and the substructure relation.
The guiding example is dense linear orders without endpoints. There, quantified statements about hidden points reduce to finite order comparisons among the visible variables and parameters. The same method also explains why some theories, such as pure groups, do not eliminate quantifiers in their usual languages, and why algebraically closed fields need a more algebraic analysis even though they also have quantifier elimination.
## Why Remove Quantifiers?
The basic problem is this: a formula with quantifiers may describe the existence of elements that are not named by the tuple under discussion, while a quantifier-free formula only records the finite algebraic or relational configuration of that tuple. Quantifier elimination asks when the hidden witnesses do not add any definable information beyond the visible configuration.
[definition: Quantifier Elimination]
Let $L$ be a first-order language and let $T$ be an $L$-theory. The theory $T$ has quantifier elimination if for every $L$-formula $\varphi(x_1,\dots,x_n)$ there is a quantifier-free $L$-formula $\psi(x_1,\dots,x_n)$ such that
\begin{align*}
T \models \forall x_1\cdots \forall x_n\, (\varphi(x_1,\dots,x_n) \leftrightarrow \psi(x_1,\dots,x_n)).
\end{align*}
[/definition]
This definition is relative to a theory, not to a single structure. It says that the equivalence must hold uniformly in every model of $T$. If parameters are allowed from a model $M \models T$, formulas over those parameters are reduced by treating the parameters as constants.
[example: Eliminating A Bounded Existential In A Linear Order]
Work in the language $L=\{<\}$, and let $a,b$ be parameters in a dense linear order. We show that
\begin{align*}
\exists y\,(a<y \wedge y<b)
\end{align*}
is equivalent to the quantifier-free formula $a<b$.
First suppose $\exists y\,(a<y \wedge y<b)$ holds. Then for some element $y$ we have $a<y$ and $y<b$, so transitivity of the linear order gives $a<b$. Conversely, suppose $a<b$. By density, there is an element $y$ such that
\begin{align*}
a<y \quad\text{and}\quad y<b.
\end{align*}
Therefore $y$ witnesses $\exists y\,(a<y \wedge y<b)$. Hence, in every dense linear order,
\begin{align*}
\exists y\,(a<y \wedge y<b) \leftrightarrow a<b.
\end{align*}
The existential quantifier has introduced no information beyond the visible order comparison between $a$ and $b$.
[/example]
The definition becomes useful because it has several equivalent forms. The following formulation is often the working criterion: two tuples with the same quantifier-free behaviour must have the same full first-order behaviour.
[definition: Quantifier Free Type]
Let $M$ be an $L$-structure and let $a = (a_1,\dots,a_n) \in M^n$. The quantifier-free type of $a$ in $M$ is the set of all quantifier-free $L$-formulas $\theta(x_1,\dots,x_n)$ such that $M \models \theta(a)$.
[/definition]
With this terminology, quantifier elimination can be tested by asking whether quantifier-free sameness already forces complete first-order sameness. This replaces a syntactic demand about rewriting every formula with a semantic demand about tuples in models, and it exposes existential formulas as the remaining obstruction once Boolean operations are under control.
[quotetheorem:4301]
[citeproof:4301]
This theorem explains why quantifier elimination is more than a syntactic convenience. The tuple formulation says that quantifier-free data must already determine all existential behaviour; without this, two elements could satisfy the same equations and relations but differ on whether some hidden witness exists. For example, in the language of groups, an element can have the same atomic one-variable data in a subgroup and in a larger group while acquiring a square root in the larger group, so existential formulas need not be controlled by quantifier-free type. The theorem does not provide a practical algorithm by itself, but it reduces the task to checking preservation of existential information.
The existential version is the most economical one for applications. When a formula has the shape $\exists y\,\theta(x,y)$ with $\theta$ quantifier-free, the question is whether the possible witness $y$ imposes only quantifier-free conditions on $x$.
[remark: Definability Consequence]
If $T$ has quantifier elimination, every definable subset of $M^n$ in a model $M \models T$ is defined by a quantifier-free formula with parameters from $M$. Thus the geometry of definable sets is controlled by the atomic relations and functions of the language.
[/remark]
## Back-And-Forth Criteria
How can we prove quantifier elimination without explicitly transforming every formula? The usual answer is a back-and-forth criterion: instead of simplifying syntax one formula at a time, we compare partial isomorphisms between models and show that they preserve all first-order information.
[definition: Partial Isomorphism]
Let $M$ and $N$ be $L$-structures. A partial isomorphism from $M$ to $N$ is an isomorphism $f:A \to B$ between substructures $A \subset M$ and $B \subset N$.
[/definition]
A partial isomorphism records agreement on quantifier-free formulas. To use such maps for full first-order formulas, we need a way to extend them whenever a quantified variable asks for a new element. The back-and-forth property formalizes this extension ability for finitely generated pieces of models.
[definition: Back And Forth Property]
Let $\mathcal K$ be a class of $L$-structures. The class $\mathcal K$ has the back-and-forth property for finitely generated substructures if whenever $M,N \in \mathcal K$, $A \subset M$ and $B \subset N$ are finitely generated substructures, and $f:A \to B$ is an isomorphism, then for every $a \in M$ there are a finitely generated substructure $A' \subset M$ with $A \cup \{a\} \subset A'$ and a finitely generated substructure $B' \subset N$ such that $f$ extends to an isomorphism $f':A' \to B'$.
[/definition]
The word "forth" refers to adding an element from the first model. A symmetric condition adds elements from the second model, and it is obtained by applying the same condition to $f^{-1}$.
This extension property is useful because quantified formulas ask for new elements. With both directions available, existential witnesses can be matched on either side, which is exactly the mechanism needed to turn quantifier-free agreement into agreement for all formulas.
[quotetheorem:4302]
[citeproof:4302]
The extension hypothesis is the point where existential quantifiers are handled. If a witness in $M$ cannot be matched by a witness in $N$ with the same quantifier-free relations to the old tuple, then the existential formula recording that witness separates the two tuples. Symmetry is also essential: a one-sided extension property only transfers existential truths in one direction, while elementary equivalence requires both directions. The criterion is sufficient rather than necessary in this exact form, since some theories eliminate quantifiers for reasons better expressed algebraically than by a bare finite-substructure argument.
This theorem is often applied in a slightly more concrete form: classify finitely generated substructures and show that every finite configuration can be extended in the required way. For relational languages, finitely generated substructures are finite induced substructures, so the criterion becomes especially transparent.
[example: Finite Configurations In Dense Orders]
In the language $\{<\}$, the substructure generated by a finite set is just that finite set with the induced order, since there are no function symbols. Let $M,N$ be dense linear orders without endpoints, let
\begin{align*}
A=\{a_1<\cdots<a_n\}\subset M,\qquad B=\{b_1<\cdots<b_n\}\subset N,
\end{align*}
and let $f:A\to B$ be the order-preserving bijection, so $f(a_i)=b_i$ for each $i$.
We extend $f$ over one new point $a\in M$. If $a=a_i$ for some $i$, then $a$ is already in the domain and no new value is needed. If $a<a_1$, then the no-left-endpoint axiom in $N$ gives some $b\in N$ with $b<b_1$; defining $f'(a)=b$ preserves every comparison with the old points because
\begin{align*}
a<a_i \quad\text{and}\quad b<b_i
\end{align*}
for all $i$. If $a_n<a$, the no-right-endpoint axiom gives $b_n<b$, and again
\begin{align*}
a_i<a \quad\text{and}\quad b_i<b
\end{align*}
for all $i$. Finally, if $a_i<a<a_{i+1}$ for some $i<n$, density in $N$ gives $b$ with
\begin{align*}
b_i<b<b_{i+1}.
\end{align*}
Then for $j\le i$ we have $a_j\le a_i<a$ and $b_j\le b_i<b$, while for $j\ge i+1$ we have $a<a_{i+1}\le a_j$ and $b<b_{i+1}\le b_j$. Thus every old comparison involving $a$ is matched by the corresponding comparison involving $b$, so $f\cup\{(a,b)\}$ is an isomorphism of the enlarged finite suborders. This is exactly why finite configurations in dense orders are controlled by cuts.
[/example]
## Model Completeness And Complete Theories
What does quantifier elimination say about embeddings between models? Since quantifier-free formulas are preserved by substructures, quantifier elimination forces embeddings of models to preserve all formulas. This links the elimination method with the elementary embedding machinery from earlier chapters.
[definition: Model Complete Theory]
An $L$-theory $T$ is model complete if for every pair of models $M,N \models T$, every embedding $f:M \to N$ is an elementary embedding.
[/definition]
Quantifier elimination is stronger than this definition because it rewrites formulas inside a single model, while model completeness only compares models through embeddings. Still, quantifier elimination should imply model completeness: an embedding preserves terms, equations, and all quantifier-free formulas, and after elimination there are no hidden quantified conditions left to check. The following theorem records this basic bridge from syntactic normal forms to elementary embeddings.
[quotetheorem:4303]
[citeproof:4303]
Model completeness does not usually imply quantifier elimination in the original language. Quantifier elimination is stronger because it demands that every definable set be described without quantifiers, while model completeness only controls embeddings between models of the theory. For instance, algebraically closed fields are model complete and in fact eliminate quantifiers in the ring language, but if the language is artificially reduced or if definable relations are hidden behind omitted symbols, model completeness alone need not recover quantifier-free definitions in that smaller language. The forward use of the theorem is therefore one-way: once quantifier elimination has been proved, model completeness is automatic, but proving model completeness is not a substitute for finding normal forms.
[quotetheorem:4304]
[citeproof:4304]
The common-substructure hypothesis is not cosmetic: quantifier elimination alone does not force completeness when the language leaves room for different quantifier-free sentences. The relevant field-theoretic example is algebraically closed fields separated by characteristic: the sentence $1+\cdots+1=0$ with $p$ summands is quantifier-free and distinguishes characteristic $p$ from characteristic $0$. Thus the theorem does not say that every quantifier-eliminating theory is complete, only that completeness follows once the quantifier-free part common to all models has been fixed. In the ring language, fixing the characteristic fixes the prime field shared by all algebraically closed fields in that characteristic, which is why quantifier elimination for algebraically closed fields leads to completeness after the characteristic is specified.
[example: Algebraically Closed Fields Compared]
In the language of rings, the algebraic content of quantifier elimination is that an existential formula
\begin{align*}
\exists y\,\theta(x,y)
\end{align*}
with $\theta$ quantifier-free defines the projection of the set of pairs $(x,y)$ satisfying finitely many polynomial equations and inequations. The projection map forgets the $y$-coordinates:
\begin{align*}
\{(x,y):\theta(x,y)\}\longmapsto \{x:\exists y\,\theta(x,y)\}.
\end{align*}
By *Chevalley's constructibility theorem*, this projection is constructible, meaning it is a finite Boolean combination of algebraic sets. In the ring language, algebraic sets are defined by polynomial equations $f(x)=0$, and Boolean combinations of such equations and their negations are exactly quantifier-free ring formulas. Thus algebraic closedness turns the hidden existential witness $y$ into quantifier-free polynomial information about $x$.
Completeness then depends on fixing the characteristic. If $K$ is an algebraically closed field of characteristic $0$, the prime subfield is the copy of $\mathbb Q$ given by
\begin{align*}
\frac{m}{n}\longmapsto (m\cdot 1_K)(n\cdot 1_K)^{-1},
\end{align*}
where $n\ne 0$ and $n\cdot 1_K\ne 0$ because the characteristic is $0$. If $K$ has characteristic $p$, then $p\cdot 1_K=0$, and the prime subfield is the copy of $\mathbb F_p$ given by
\begin{align*}
[m]\longmapsto m\cdot 1_K.
\end{align*}
This map is well-defined because if $m\equiv n \pmod p$, then $m-n=kp$ for some integer $k$, so
\begin{align*}
(m-n)\cdot 1_K=(kp)\cdot 1_K=k\cdot(p\cdot 1_K)=k\cdot 0=0.
\end{align*}
Therefore all algebraically closed fields of characteristic $0$ contain the same prime field $\mathbb Q$, and all algebraically closed fields of characteristic $p$ contain the same prime field $\mathbb F_p$. Once the characteristic is fixed, quantifier elimination reduces every sentence to quantifier-free information over this common prime subfield, so the corresponding theory is complete.
[/example]
## Dense Linear Orders Without Endpoints
The first full example should be simple enough that the whole elimination argument can be seen directly. Dense linear orders without endpoints are ideal for this because the language has only the order relation, and all finite configurations are finite linear orders.
[definition: Dense Linear Order Without Endpoints]
A dense linear order without endpoints is an $L = \{<\}$-structure $M$ such that:
1. $<$ is a linear order on $M$.
2. For all $a,b \in M$, if $a < b$, then there exists $c \in M$ such that $a < c < b$.
3. For every $a \in M$, there exist $b,c \in M$ such that $b < a < c$.
[/definition]
The theory of dense linear orders without endpoints is denoted by $\operatorname{DLO}$. The standard examples are $(\mathbb Q,<)$ and $(\mathbb R,<)$, and these are elementarily equivalent even though they are not isomorphic.
The obstruction to eliminating quantifiers is that an existential variable might have to occupy a cut determined by finitely many parameters. In a dense order without endpoints, every consistent finite cut has a realization, so no additional first-order condition is needed beyond the quantifier-free order pattern.
[quotetheorem:4305]
[citeproof:4305]
Each axiom of $\operatorname{DLO}$ is used in the extension argument. Density supplies a new point between adjacent finite parameters; without it, as in $(\mathbb Z,<)$, the formula $\exists y\,(x<y\wedge y<z)$ is not equivalent simply to $x<z$. Absence of endpoints supplies points below or above a finite configuration; with a first or last element, endpoint behaviour becomes a special definable case. Linearity ensures that a finite quantifier-free configuration is just an ordered list with equalities, whereas partial orders can have incomparable patterns whose extensions are not controlled by cuts alone.
Quantifier elimination gives a concrete normal form. Every formula in variables $x_1,\dots,x_n$ is equivalent over $\operatorname{DLO}$ to a Boolean combination of comparisons $x_i < x_j$, $x_i < a$, $a < x_i$, and equalities among variables and parameters.
[example: Solving A Formula By Ordering Parameters]
Let $a<b<c$ be parameters in a model of $\operatorname{DLO}$, and write
\begin{align*}
\varphi(x) \ :=\ \exists y\,(a<y \wedge y<c \wedge y\ne b \wedge x<y).
\end{align*}
We show that $\varphi(x)$ is equivalent to $x<c$.
First suppose $\varphi(x)$ holds. Then for some $y$,
\begin{align*}
x<y \quad\text{and}\quad y<c.
\end{align*}
By transitivity of $<$, this gives $x<c$.
Conversely suppose $x<c$. Since $a<b<c$, linearity places $x$ in one of the following cases.
If $x\le a$, density applied to $a<b$ gives $y$ with
\begin{align*}
a<y<b.
\end{align*}
Then $y<c$ because $y<b<c$, and $x<y$ because $x\le a<y$. Also $y\ne b$ because $y<b$.
If $a<x<b$, density applied to $x<b$ gives $y$ with
\begin{align*}
x<y<b.
\end{align*}
Then $a<y$ because $a<x<y$, and $y<c$ because $y<b<c$. Also $y\ne b$ because $y<b$.
If $x=b$, density applied to $b<c$ gives $y$ with
\begin{align*}
b<y<c.
\end{align*}
Then $x<y$ because $x=b<y$, and $a<y$ because $a<b<y$. Also $y\ne b$ because $b<y$.
If $b<x<c$, density applied to $x<c$ gives $y$ with
\begin{align*}
x<y<c.
\end{align*}
Then $a<y$ because $a<b<x<y$. Also $y\ne b$ because $b<x<y$.
In every case there is a witness $y$ satisfying
\begin{align*}
a<y \wedge y<c \wedge y\ne b \wedge x<y.
\end{align*}
Hence, over $\operatorname{DLO}$ with parameters $a<b<c$,
\begin{align*}
\exists y\,(a<y \wedge y<c \wedge y\ne b \wedge x<y)
\end{align*}
is equivalent to the quantifier-free condition $x<c$.
[/example]
The one-variable definable sets are especially transparent. With parameters $a_1<\cdots<a_n$, any formula in one free variable defines a finite Boolean combination of intervals and points whose endpoints are among the parameters.
[example: Definable Subsets In One Variable]
Let $M \models \operatorname{DLO}$ and let $a<b$ be parameters. For any $m\in M$,
\begin{align*}
M \models (a<x \wedge x<b)\vee x=a \ [m]
\end{align*}
holds exactly when either
\begin{align*}
a<m \quad\text{and}\quad m<b,
\end{align*}
or
\begin{align*}
m=a.
\end{align*}
The first alternative says $m\in (a,b)$, and the second says $m\in \{a\}$. Hence the formula defines precisely
\begin{align*}
\{a\}\cup(a,b).
\end{align*}
More generally, by *Quantifier Elimination For Dense Linear Orders*, every one-variable formula over parameters $a_1<\cdots<a_n$ is equivalent over $\operatorname{DLO}$ to a Boolean combination of comparisons of the forms $x<a_i$, $a_i<x$, and $x=a_i$. These comparisons can only distinguish the finitely many points
\begin{align*}
a_1,\dots,a_n
\end{align*}
and the finitely many cuts
\begin{align*}
(-\infty,a_1),\ (a_1,a_2),\ \dots,\ (a_{n-1},a_n),\ (a_n,\infty).
\end{align*}
On each open interval between consecutive parameters, every comparison with every $a_i$ has a fixed truth value, so a definable set either contains the whole interval or none of it. Since every nonempty open interval in a dense linear order is dense in itself, any definable discrete subset must avoid all such interval pieces and therefore is contained in the finite set $\{a_1,\dots,a_n\}$. Thus no formula with finitely many parameters can define a discrete infinite subset of a dense order.
[/example]
The same normal-form analysis also controls sentences, not just definable subsets. In the language with only the order relation, a closed quantifier-free sentence has very little room to distinguish one nonempty dense linear order without endpoints from another. Since the language has no constant symbols, sentences have no parameters to name distinguished elements, and quantifier elimination therefore leads to completeness: all nonempty dense linear orders without endpoints satisfy the same first-order sentences even though their sizes and order-completion properties may differ.
[quotetheorem:4306]
[citeproof:4306]
The hypotheses are again essential for completeness. Dense linear orders with a named endpoint, discrete linear orders such as $(\mathbb Z,<)$, or orders with adjacent pairs have quantifier-free or existential sentences detecting those extra features, so they do not all collapse to one complete theory. Completeness also does not mean isomorphism: $(\mathbb Q,<)$ and $(\mathbb R,<)$ satisfy the same first-order sentences in $\{<\}$, but they differ in cardinality and in order-completeness. What the theorem gives is exactly elementary indistinguishability, not a classification of models up to order isomorphism.
## First Non-Examples And Comparisons
Quantifier elimination is a strong property of a theory in a chosen language. It can fail because the theory is not complete enough, because embeddings do not behave elementarily, or because the language lacks symbols for the relevant definable relations.
[example: Pure Groups Do Not Eliminate Quantifiers]
In the group language $L=(\cdot,{}^{-1},e)$, consider the cyclic group
\begin{align*}
G=\{e,a\},\qquad a^2=e,
\end{align*}
as a subgroup of the cyclic group
\begin{align*}
H=\{e,r,r^2,r^3\},\qquad r^4=e,
\end{align*}
by the embedding $e\mapsto e$ and $a\mapsto r^2$. The element $a$ has no square root in $G$: the only possible values of $y$ are $e$ and $a$, and
\begin{align*}
e\cdot e=e,\qquad a\cdot a=e,
\end{align*}
so no $y\in G$ satisfies $y\cdot y=a$. Thus
\begin{align*}
G \models \neg \exists y\,(y\cdot y=a).
\end{align*}
In $H$, however,
\begin{align*}
r\cdot r=r^2,
\end{align*}
so $r$ is a square root of $r^2$. Hence
\begin{align*}
H \models \exists y\,(y\cdot y=r^2).
\end{align*}
The inclusion $G\subset H$ preserves all quantifier-free group formulas with parameters from $G$, because every group term in elements of $G$ is evaluated using the same multiplication, inverse, and identity in both $G$ and $H$. Therefore no quantifier-free formula over $G$ can distinguish $a$ in $G$ from its image $r^2$ in $H$, while the existential formula $\exists y\,(y\cdot y=x)$ does distinguish them. Thus the ordinary theory of groups does not eliminate quantifiers in the pure group language.
[/example]
This failure also shows why model completeness matters. If a theory had quantifier elimination, then every embedding between its models would be elementary. The ordinary theory of groups cannot satisfy such a condition, since subgroup embeddings often create new solutions to equations.
[example: Language Matters]
Changing the language changes what quantifier-free formulas are allowed to say. For algebraically closed fields, the relevant language is the ring language $L_{\mathrm{ring}}=\{0,1,+,-,\cdot\}$. A quantifier-free formula in variables $x$ and $y$ is a Boolean combination of polynomial equations
\begin{align*}
f(x,y)=0
\end{align*}
and inequations
\begin{align*}
g(x,y)\ne 0.
\end{align*}
An existential formula $\exists y\,\theta(x,y)$ therefore defines the projection of the set of pairs satisfying those polynomial conditions:
\begin{align*}
\{(x,y):\theta(x,y)\}\longmapsto \{x:\exists y\,\theta(x,y)\}.
\end{align*}
By *Chevalley's constructibility theorem*, such projections are finite Boolean combinations of algebraic sets, and algebraic sets are defined in the ring language by polynomial equations. Thus, in algebraically closed fields, the hidden witness $y$ can be removed and replaced by quantifier-free polynomial information about $x$.
For real closed fields, the appropriate language is the ordered ring language
\begin{align*}
L_{\mathrm{or}}=\{0,1,+,-,\cdot,<\}.
\end{align*}
The order symbol is not cosmetic: quantifier-free formulas may now distinguish conditions such as
\begin{align*}
x>0,\qquad x=0,\qquad x<0.
\end{align*}
This is the language in which real closed fields eliminate quantifiers; the order relation records sign information that is essential for describing projections of semialgebraic sets.
Pure equality is the simplest case. In the language $L=\{=\}$, the only atomic formulas in variables $x_1,\dots,x_n$ are equalities
\begin{align*}
x_i=x_j.
\end{align*}
Quantifier-free formulas are Boolean combinations of these equalities, so they describe only which variables are equal and which variables are distinct. For example,
\begin{align*}
\exists y\,(y=x_1 \wedge y\ne x_2)
\end{align*}
is equivalent to
\begin{align*}
x_1\ne x_2,
\end{align*}
because any witness $y$ must equal $x_1$, and then the condition $y\ne x_2$ becomes exactly $x_1\ne x_2$. Thus pure equality eliminates quantifiers because existential variables add no structure beyond equality patterns among the visible variables.
[/example]
The lesson is methodological. To prove quantifier elimination, isolate the finite quantifier-free configurations, prove an extension property for partial isomorphisms, and then read off definable sets from the resulting normal forms. Dense linear orders demonstrate the whole method in its most accessible form, and Chapters 8 and 9 follow the same template for algebraically closed fields and real closed fields, with more structure in the finite configurations.
Quantifier elimination is where the syntactic and semantic sides of the course meet in a concrete way. With the method established, the natural next examples are the major algebraic classes where it succeeds most dramatically: algebraically closed fields and then real closed fields.
# 8. Algebraically Closed Fields
Algebraically closed fields are the first major example where quantifier elimination meets a familiar algebraic class. The guiding question is: what can first-order logic see inside a field once every non-constant polynomial has a root? The answer is that definable sets are controlled by polynomial equations and their projections, so the logical analysis becomes a controlled fragment of algebraic geometry.
Building on the quantifier-elimination method of Chapter 7, this chapter builds the theory $\mathrm{ACF}$ of algebraically closed fields, refines it by characteristic, and proves quantifier elimination using the algebraic fact that projections of constructible sets are constructible. We then extract the model-theoretic consequences: completeness of $\mathrm{ACF}_p$ and $\mathrm{ACF}_0$, elementary equivalence by characteristic, and the finite-or-cofinite shape of definable subsets of the field sort.
## The Theory of Algebraically Closed Fields
The first problem is to express algebraic closure in first-order language. A field being algebraically closed says that every non-constant polynomial has a zero, but first-order logic cannot quantify over all polynomials as objects. Instead, the theory uses one axiom scheme for each degree.
[definition: Language of Rings]
The language of rings is the first-order language
\begin{align*}
\mathcal L_{\mathrm{ring}} = \{+, -, \cdot, 0, 1\}.
\end{align*}
[/definition]
In this language, the coefficients of a polynomial are variables from the field, while the root is another variable. This makes each fixed-degree algebraic-closedness assertion a first-order sentence.
We can now package the usual algebraic condition into a first-order theory. Since no single sentence quantifies over all polynomial degrees at once, the theory is given by an axiom scheme, one sentence for each positive degree.
[definition: Algebraically Closed Field Theory]
The theory $\mathrm{ACF}$ in $\mathcal L_{\mathrm{ring}}$ consists of the field axioms together with, for each $n \ge 1$, the sentence
\begin{align*}
\forall a_0\cdots \forall a_n\,\bigl(a_n \ne 0 \implies \exists x\,(a_nx^n + a_{n-1}x^{n-1}+\cdots+a_1x+a_0 = 0)\bigr).
\end{align*}
[/definition]
The scheme says that every polynomial of positive degree has a root. Together with the field axioms, this is exactly the usual algebraic notion: repeated factorisation gives linear factors for every polynomial over the field.
Algebraic closure alone leaves a basic ambiguity: fields of characteristic $0$ and fields of characteristic $p$ satisfy different ring sentences, even when both are algebraically closed. To obtain the complete theories used in quantifier elimination and classification, the characteristic must be built into the theory by first-order axioms.
This creates two different first-order tasks. For a fixed prime characteristic, one sentence is enough to say that adding $1$ to itself $p$ times gives $0$; for characteristic $0$, no single finite bound suffices, so the theory must forbid each positive multiple of $1$ from vanishing. The refined theories below make that distinction part of the axioms.
[definition: Characteristic Refinements of Algebraically Closed Fields]
For a prime $p$, the theory $\mathrm{ACF}_p$ is $\mathrm{ACF}$ together with the sentence
\begin{align*}
\underbrace{1+\cdots+1}_{p\text{ times}} = 0.
\end{align*}
The theory $\mathrm{ACF}_0$ is $\mathrm{ACF}$ together with, for each $n \ge 1$, the sentence
\begin{align*}
\underbrace{1+\cdots+1}_{n\text{ times}} \ne 0.
\end{align*}
[/definition]
The characteristic has to be handled by theories rather than by a single uniform sentence for characteristic $0$. Characteristic $p$ is first-order expressible by one sentence, while characteristic $0$ is expressed by the infinite scheme saying that no positive integer multiple of $1$ vanishes.
[example: Algebraic Closures as Models]
The field $\overline{\mathbb Q}$ is algebraically closed by definition of algebraic closure, and its characteristic is $0$ because it contains $\mathbb Q$ as a subfield. Thus it satisfies the field axioms, every positive-degree polynomial over it has a root in it, and for every $n\ge 1$,
\begin{align*}
\underbrace{1+\cdots+1}_{n\text{ times}} \ne 0
\end{align*}
because this equality is already false in $\mathbb Q$. Hence $\overline{\mathbb Q}\models \mathrm{ACF}_0$.
Similarly, $\mathbb F_p(t)$ has characteristic $p$, since its prime field is $\mathbb F_p$, so
\begin{align*}
\underbrace{1+\cdots+1}_{p\text{ times}} = 0.
\end{align*}
Its algebraic closure $\overline{\mathbb F_p(t)}$ is algebraically closed by definition and has the same characteristic as $\mathbb F_p(t)$, because algebraic field extensions preserve the equation $p\cdot 1=0$. Therefore $\overline{\mathbb F_p(t)}\models \mathrm{ACF}_p$. The two examples show that the theory records algebraic closedness and characteristic, while later quantifier elimination will explain why fields of the same characteristic can agree on all first-order sentences even when their field-theoretic sizes differ.
[/example]
## Polynomial Equations and Constructible Sets
The next problem is to understand what quantifier-free formulas define in an algebraically closed field. Since atomic formulas in the ring language are polynomial equations, Boolean combinations of atomic formulas define Boolean combinations of algebraic sets. This is the point where algebraic geometry enters the course.
[definition: Affine Algebraic Set]
Let $K$ be a field and let $S \subset K[x_1,\dots,x_n]$. The affine algebraic set defined by $S$ is
\begin{align*}
V(S) = \{a \in K^n : f(a)=0 \text{ for all } f \in S\}.
\end{align*}
[/definition]
For first-order purposes, finite systems of equations are enough, since each formula contains only finitely many terms. Over algebraically closed fields, the geometry of these zero sets is rich enough to control existential quantifiers.
[definition: Constructible Set]
Let $K$ be a field. A subset $X \subset K^n$ is constructible if it is a finite Boolean combination of affine algebraic subsets of $K^n$.
[/definition]
Constructible sets are exactly the subsets of affine space defined by quantifier-free formulas in the language of rings, after parameters from $K$ are allowed. Inequalities such as $f(x) \ne 0$ are complements of algebraic sets, and arbitrary quantifier-free formulas are built from these by finite Boolean operations.
To use this dictionary systematically, we need a theorem identifying the geometric sets produced by quantifier-free formulas. This is the first half of the bridge between algebraic geometry and quantifier elimination for algebraically closed fields.
[quotetheorem:4307]
[citeproof:4307]
This theorem gives the geometric reading of quantifier-free formulas: Boolean combinations of polynomial equations define constructible sets. The converse direction is the next ingredient, because quantifier elimination also needs a way to turn every constructible set back into a formula. The issue is finiteness: the geometric definition allows algebraic sets cut out by arbitrary families of polynomials, whereas a first-order formula can mention only finitely many equations.
A subset of $K^n$ is locally closed if it is the intersection of a Zariski open set with a Zariski closed set. Equivalently, it has the form $U\cap Z$ where $Z$ is algebraic and $U$ is the complement of an algebraic set. This terminology is another way to package Boolean combinations of polynomial equations and inequations.
The remaining obstruction is syntactic: a constructible set is defined geometrically, while quantifier elimination needs an actual quantifier-free formula. A finite locally closed decomposition removes that obstruction because each closed condition is polynomial equality and each open condition is polynomial inequation.
[quotetheorem:4308]
[citeproof:4308]
The remaining question is whether existential quantification preserves constructibility.
[example: Projection Need Not Preserve Closedness]
Over an algebraically closed field $K$, consider
\begin{align*}
X = V(xy-1) = \{(a,b)\in K^2 : ab-1=0\}.
\end{align*}
Let $\pi\colon K^2\to K$ be the projection onto the first coordinate. We compute $\pi(X)$ by checking both inclusions. If $a\in \pi(X)$, then for some $b\in K$,
\begin{align*}
ab-1=0,
\end{align*}
so $ab=1$, and hence $a\ne 0$. Thus $\pi(X)\subseteq K\setminus\{0\}$. Conversely, if $a\ne 0$, then $a$ has an inverse $a^{-1}\in K$, and with $b=a^{-1}$ we get
\begin{align*}
ab-1 = a a^{-1}-1 = 1-1 = 0.
\end{align*}
Therefore $(a,a^{-1})\in X$, so $a\in \pi(X)$. Hence
\begin{align*}
\pi(X)=K\setminus\{0\}.
\end{align*}
This set is constructible because
\begin{align*}
K\setminus\{0\}=K\setminus V(x),
\end{align*}
the complement of the algebraic set $V(x)\subset K$. It is not Zariski closed in $K$: if $K\setminus\{0\}=V(S)$ for some $S\subset K[x]$, then every $f\in S$ vanishes at every nonzero element of $K$. The field $K$ is infinite, since a finite field with elements $a_1,\dots,a_N$ would make the positive-degree polynomial $1+\prod_{i=1}^N (t-a_i)$ have no root in $K$. Thus each $f\in S$ has infinitely many roots, so each $f$ is the zero polynomial, and then $V(S)=K$, contradicting $0\notin K\setminus\{0\}$. The projection is therefore constructible but not closed, so projection theorems must allow Boolean combinations of algebraic sets rather than only algebraic sets.
[/example]
The example shows why the geometric form of quantifier elimination cannot say that projections of algebraic sets remain algebraic. Existential quantifiers correspond to forgetting the witness coordinates, so the correct geometric closure property must allow finite Boolean combinations of algebraic conditions.
The problem is to find a class of geometric sets large enough to survive projection but still controlled by polynomial equations and inequations. Constructible sets are designed for this role, and the next result gives the projection theorem that makes them match existential quantification in algebraically closed fields.
[quotetheorem:4309]
[citeproof:4309]
[example: Solvability of a Polynomial System]
Fix polynomials $f_1(x,y),\dots,f_r(x,y)\in K[x_1,\dots,x_n,y_1,\dots,y_m]$ over an algebraically closed field $K$, and let
\begin{align*}
Z=V(f_1,\dots,f_r)\subset K^{n+m}.
\end{align*}
Writing a point of $K^{n+m}$ as $(a,b)$ with $a\in K^n$ and $b\in K^m$, the definition of $V(f_1,\dots,f_r)$ gives
\begin{align*}
(a,b)\in Z
&\Longleftrightarrow f_i(a,b)=0 \text{ for every } i=1,\dots,r.
\end{align*}
Let $\pi\colon K^{n+m}\to K^n$ be the projection $\pi(a,b)=a$. Then
\begin{align*}
a\in \pi(Z)
&\Longleftrightarrow \exists b\in K^m\,((a,b)\in Z)\\
&\Longleftrightarrow \exists b\in K^m\,\bigl(f_1(a,b)=0\wedge \cdots \wedge f_r(a,b)=0\bigr).
\end{align*}
Thus $\pi(Z)$ is exactly the set of parameter values $a\in K^n$ for which the system
\begin{align*}
f_1(a,y)=\cdots=f_r(a,y)=0
\end{align*}
has a solution $y\in K^m$.
Since $Z$ is an affine algebraic set, it is constructible. By *Chevalley Projection Theorem*, $\pi(Z)\subset K^n$ is constructible. By *Constructible Sets Are Quantifier-Free Definable*, there is a quantifier-free $\mathcal L_{\mathrm{ring}}(K)$-formula $\psi(x)$ such that
\begin{align*}
\psi(K)=\pi(Z).
\end{align*}
So the existential condition $\exists y\,(f_1(x,y)=0\wedge\cdots\wedge f_r(x,y)=0)$ is represented, over $K$, by a quantifier-free description of the same parameter set.
[/example]
## Quantifier Elimination for Algebraically Closed Fields
The central logical problem is now direct: can every formula over an algebraically closed field be replaced by a quantifier-free formula? The preceding section says that existential images of quantifier-free definable sets remain constructible, and constructible sets are quantifier-free definable. This proves quantifier elimination.
[quotetheorem:4310]
[citeproof:4310]
Quantifier elimination is stronger than the statement that every definable set is constructible in each model. It says that the replacement formula is forced by the theory itself, uniformly across all algebraically closed fields. Algebraic closedness is the structural hypothesis that makes this possible: in arbitrary fields, the formula $\exists y\,(y^2=x)$ need not be equivalent to a Boolean combination of polynomial equalities in one variable. Real closed fields also eliminate quantifiers, as Chapter 9 will show, but only after their order structure is included; in the pure ring language, order-sensitive sets such as the nonnegative elements are not controlled by Zariski constructibility alone. Quantifier elimination does not imply that all models of $\mathrm{ACF}$ are isomorphic, but it is the mechanism that will make their sentence theories depend only on characteristic.
[example: Eliminating a Simple Quantifier]
Let $K$ be an algebraically closed field. We show that, for each $a\in K$,
\begin{align*}
K \models \exists y\,(ay=1)
\quad\Longleftrightarrow\quad
a\ne 0.
\end{align*}
If $K \models \exists y\,(ay=1)$, choose $b\in K$ with $ab=1$. If $a=0$, then
\begin{align*}
ab=0b=0,
\end{align*}
contradicting $ab=1$ because $0\ne 1$ in a field. Hence $a\ne 0$.
Conversely, suppose $a\ne 0$. Since $K$ is a field, $a$ has a multiplicative inverse $a^{-1}\in K$. Taking $b=a^{-1}$ gives
\begin{align*}
ab=a a^{-1}=1,
\end{align*}
so $K \models \exists y\,(ay=1)$. Therefore
\begin{align*}
\{a\in K:K\models \exists y\,(ay=1)\}=K\setminus\{0\}.
\end{align*}
Geometrically, let
\begin{align*}
X=V(xy-1)=\{(a,b)\in K^2:ab-1=0\}.
\end{align*}
The projection of $X$ to the $x$-axis is
\begin{align*}
\pi(X)
&=\{a\in K:\exists b\in K\,((a,b)\in X)\}\\
&=\{a\in K:\exists b\in K\,(ab-1=0)\}\\
&=\{a\in K:\exists b\in K\,(ab=1)\}\\
&=K\setminus\{0\}\\
&=K\setminus V(x).
\end{align*}
Thus eliminating the quantifier replaces the existential equation $xy=1$ by the quantifier-free inequality $x\ne 0$, matching the projection of the algebraic set $V(xy-1)$ with the constructible set $K\setminus V(x)$.
[/example]
The example also points to a boundary of the idea: eliminating quantifiers does not mean eliminating dependence on named parameters.
[remark: Parameters]
With parameters from an algebraically closed field $K$, every definable subset of $K^n$ is constructible over $K$. Without parameters, the defining Boolean combinations use polynomials with coefficients from the prime field, or from the named parameter set if the language has been expanded by constants.
[/remark]
The one-variable case is the first place where the geometric description becomes a sharp model-theoretic constraint.
[example: Finite or Cofinite Definable Subsets]
Let $K$ be an algebraically closed field and let $X\subset K$ be definable with parameters from $K$. By *Quantifier Elimination for Algebraically Closed Fields*, $X$ is defined by a quantifier-free formula in one variable, so $X$ is a finite Boolean combination of sets of the form
\begin{align*}
V(f)=\{a\in K:f(a)=0\}
\end{align*}
with $f\in K[t]$.
For such an $f$, there are two cases. If $f=0$, then $f(a)=0$ for every $a\in K$, so
\begin{align*}
V(f)=K.
\end{align*}
If $f\ne 0$ and $\deg(f)=d$, then $V(f)$ is finite: indeed, whenever $c\in K$ satisfies $f(c)=0$, polynomial division gives
\begin{align*}
f(t)=(t-c)q(t)
\end{align*}
for some $q\in K[t]$ with $\deg(q)=d-1$. Repeating this for distinct roots shows that $f$ has at most $d$ roots, since after removing $d$ linear factors no further distinct root can remain in a nonzero polynomial of degree $0$.
Thus each basic zero set $V(f)$ is either finite or all of $K$. Complements of such sets are therefore either cofinite or empty, finite unions of finite sets are finite, finite intersections of cofinite sets are cofinite, and intersections such as
\begin{align*}
(K\setminus F)\cap E=E\setminus F
\end{align*}
with $E,F$ finite are finite. Hence every finite Boolean combination of one-variable polynomial zero sets is finite or cofinite. Therefore every definable subset of the field sort $K$ is finite or cofinite.
[/example]
This one-dimensional consequence is a useful test case. It shows that one-variable definable subsets of an algebraically closed field cannot have complicated infinite shape: they are controlled by finitely many exceptional points. In particular, there is no definable infinite subset of the field sort that is both infinite and coinfinite. More sophisticated order-like configurations require looking at definable relations on $K^2$, not only at one-variable definable subsets.
## Completeness in Fixed Characteristic
The next problem is to determine when two algebraically closed fields satisfy the same sentences. Quantifier elimination reduces every sentence to a quantifier-free sentence, and quantifier-free sentences in the ring language are controlled by the prime field and its characteristic.
[quotetheorem:4311]
[citeproof:4311]
Fixing the characteristic is necessary. The sentence $p\cdot 1=0$ is true in every model of $\mathrm{ACF}_p$ and false in every algebraically closed field of characteristic different from $p$, so no theory mixing characteristics can be complete. Completeness also does not imply categoricity: algebraically closed fields of characteristic $p$ can have different transcendence degrees and hence need not be isomorphic. What completeness gives is sentence-level agreement, and the later discussion of transcendence degree explains exactly what is being forgotten.
The same argument works in characteristic $0$, with the prime field replaced by $\mathbb Q$. Here the theory includes all sentences asserting that $n\cdot 1\ne 0$, so the arithmetic of the prime field is fixed.
The characteristic-zero case deserves its own statement because it is not obtained by choosing a single prime. Instead the theory fixes an infinite list of inequalities saying that no positive integer multiple of $1$ vanishes. Once those prime-field facts are fixed, quantifier elimination reduces every sentence about an algebraically closed field to information already visible over $\mathbb Q$.
This raises a separate completeness question for characteristic $0$: after imposing all of those inequalities, is there still any sentence that can distinguish two algebraically closed fields of characteristic $0$? The point of the next result is that quantifier elimination leaves no hidden residue beyond the prime-field information, so the whole characteristic-zero theory becomes complete.
[quotetheorem:4312]
[citeproof:4312]
Completeness here is a statement about sentences only. It does not say that all algebraically closed fields of characteristic $0$ are isomorphic: $\overline{\mathbb Q}$ and $\mathbb C$ already give different cardinalities, and their transcendence degrees over $\mathbb Q$ are different. The hypothesis of characteristic $0$ is doing real work, since in positive characteristic the closed sentence $p\cdot 1=0$ changes truth value. The theorem will be used next to compress elementary equivalence of algebraically closed fields to a single invariant, namely characteristic.
[quotetheorem:4313]
[citeproof:4313]
The algebraic closedness assumption cannot be dropped. Two non-algebraically closed fields of the same characteristic can disagree on sentences such as $\exists x\,(x^2+1=0)$, as $\mathbb R$ and $\mathbb Q$ do in characteristic $0$. The theorem says that once all polynomial root-existence questions have been settled positively by algebraic closedness, the remaining sentence-level information is exactly the characteristic. This prepares the final distinction: transcendence degree classifies algebraically closed fields up to isomorphism, but not up to elementary equivalence.
[example: Different Sizes with the Same Theory]
The field $\overline{\mathbb Q}$ is countable: there are countably many polynomials in $\mathbb Q[t]$, each nonzero polynomial of degree $d$ has at most $d$ roots, and $\overline{\mathbb Q}$ is the union of these finite root sets. The field $\mathbb C$ is uncountable, so there cannot be a bijection $\overline{\mathbb Q}\to \mathbb C$. Since any field isomorphism is in particular a bijection of underlying sets, $\overline{\mathbb Q}$ and $\mathbb C$ are not isomorphic.
Both fields are algebraically closed: $\overline{\mathbb Q}$ by definition of algebraic closure, and $\mathbb C$ by the fundamental theorem of algebra. Both have characteristic $0$: for every $n\ge 1$,
\begin{align*}
\underbrace{1+\cdots+1}_{n\text{ times}} \ne 0
\end{align*}
in $\mathbb Q$, and the embeddings $\mathbb Q\subset \overline{\mathbb Q}$ and $\mathbb Q\subset \mathbb C$ preserve $0$, $1$, and addition. Therefore $\overline{\mathbb Q}$ and $\mathbb C$ are both models of $\mathrm{ACF}_0$, so by *Elementary Equivalence by Characteristic*,
\begin{align*}
\overline{\mathbb Q}\equiv \mathbb C.
\end{align*}
Thus first-order sentences in the pure ring language can agree even when the fields have different sizes, and in particular they do not detect the transcendence degree distinction between $\overline{\mathbb Q}$ and $\mathbb C$.
[/example]
## Transcendence Degree and Sentence-Level Behaviour
The final problem is to place transcendence degree correctly in the model-theoretic picture. It is a genuine isomorphism invariant of algebraically closed fields, but the preceding completeness theorem says that it disappears from the theory of sentences once the characteristic is fixed.
[definition: Transcendence Degree]
Let $K/k$ be a field extension. A subset $B \subset K$ is a transcendence basis for $K/k$ if $B$ is algebraically independent over $k$ and $K$ is algebraic over $k(B)$. The transcendence degree $\operatorname{trdeg}(K/k)$ is the cardinality of a transcendence basis.
[/definition]
Transcendence degree classifies algebraically closed fields up to isomorphism once the base prime field and characteristic are fixed. Model theory sees it through formulas with parameters and through cardinality-sensitive questions, but not through first-order sentences in the unexpanded language of rings.
[remark: Why Sentences Do Not See Transcendence Degree]
A sentence has no parameters and no free variables, so after quantifier elimination it becomes a quantifier-free assertion about closed ring terms. Closed ring terms live in the prime field. Thus the sentence can test characteristic, but it has no tuple available with which to express algebraic independence over the prime field.
[/remark]
With parameters present, the language can name particular elements, and polynomial relations among those elements become visible.
[example: Parameters Can See Algebraic Dependence]
Let $k \subset K$ be a subfield, let $K$ be algebraically closed, and fix parameters $a_1,\dots,a_n\in K$. Suppose there is a nonzero polynomial
\begin{align*}
f(x_1,\dots,x_n)=\sum_{\alpha} c_{\alpha}x_1^{\alpha_1}\cdots x_n^{\alpha_n}\in k[x_1,\dots,x_n]
\end{align*}
such that
\begin{align*}
f(a_1,\dots,a_n)=0.
\end{align*}
Expanding the evaluation gives the explicit relation
\begin{align*}
0
&=f(a_1,\dots,a_n)\\
&=\sum_{\alpha} c_{\alpha}a_1^{\alpha_1}\cdots a_n^{\alpha_n},
\end{align*}
where each coefficient $c_{\alpha}$ lies in $k$ and not all $c_{\alpha}$ are zero. This is exactly an algebraic dependence relation among $a_1,\dots,a_n$ over $k$: a nonzero polynomial with coefficients in $k$ vanishes at the tuple.
For instance, if $n=2$ and $a_2=a_1^2$, then the polynomial
\begin{align*}
f(x_1,x_2)=x_2-x_1^2\in k[x_1,x_2]
\end{align*}
satisfies
\begin{align*}
f(a_1,a_2)
&=a_2-a_1^2\\
&=a_1^2-a_1^2\\
&=0.
\end{align*}
Thus the parameter tuple $(a_1,a_2)$ satisfies a polynomial equation over $k$, so its algebraic dependence is visible to the ring-language formula $f(x_1,x_2)=0$ with coefficients from $k$.
By quantifier elimination for algebraically closed fields, any relation definable over the named parameters is equivalent, over $K$, to a Boolean combination of polynomial equations and inequalities with coefficients from those parameters. Thus sentence-level first-order logic does not see transcendence degree, but once parameters are named, definable geometry can record the polynomial relations those parameters satisfy.
[/example]
The chapter therefore ends with a clean separation. Algebraically closed fields of the same characteristic may have very different algebraic sizes, but their first-order sentences agree. Quantifier elimination explains this by reducing every sentence to arithmetic in the prime field, while reducing every definable set with variables to constructible algebraic geometry.
Algebraically closed fields show how quantifier elimination can turn a rich algebraic theory into a precise description of definable sets. The real closed case develops the same philosophy in the ordered setting, where sign conditions and semialgebraic geometry replace purely algebraic data.
# 9. Real Closed Fields and Ordered Algebra
This chapter studies ordered fields from the model-theoretic point of view. The guiding question is why the first-order theory of the real ordered field admits such strong algebraic and geometric control, even though the real numbers themselves are analytically rich. The answer is that real closed fields are precisely the ordered fields in which the algebraic obstructions to solving polynomial equations have been removed, and Tarski-Seidenberg quantifier elimination turns this algebraic completeness into a precise description of definable sets.
Chapter 7 developed quantifier elimination as a general tool, and Chapter 10 will isolate model completeness as the corresponding embedding-theoretic notion. Here those tools meet real algebraic geometry: formulas in the language of ordered rings define exactly semialgebraic sets, projections of such sets remain semialgebraic, and elementary equivalence becomes a theorem about all real closed ordered fields rather than a special feature of $\mathbb R$.
## Ordered Fields and Real Closed Fields
What algebraic condition on an ordered field makes it resemble $\mathbb R$ as far as polynomial equations and inequalities are concerned? An ordered field already knows how to compare elements, but the order alone does not force square roots of positive elements or roots of odd-degree polynomials to exist. Real closed fields are the ordered fields in which these two requirements hold, and they form the algebraic setting for the theory RCF.
[definition: Ordered Field]
An ordered field is a field $F$ equipped with a binary relation $<$ such that $(F,<)$ is a linear order and, for all $a,b,c \in F$,
\begin{align*}
a < b &\implies a + c < b + c,\\
0 < a \text{ and } 0 < b &\implies 0 < ab.
\end{align*}
[/definition]
The order determines a positive cone $F_{>0} = \{x \in F : x>0\}$, and the field operations preserve that cone in exactly the way needed to interpret polynomial inequalities. Ordered fields have characteristic $0$, since $1>0$ and hence $1+\cdots+1>0$ for every positive number of summands.
[example: Rational Ordered Field]
The field $\mathbb Q$ with its usual order is an ordered field: if $a<b$ and $c\in \mathbb Q$, then $a+c<b+c$, and if $0<a$ and $0<b$, then $0<ab$.
It is not real closed because the positive rational number $2$ has no square root in $\mathbb Q$. Suppose, toward a contradiction, that $r\in \mathbb Q$ satisfies $r^2=2$. Write $r=m/n$ with $m,n\in \mathbb Z$, $n\ne 0$, and $\gcd(m,n)=1$. Then
\begin{align*}
\left(\frac{m}{n}\right)^2 &= 2,\\
m^2 &= 2n^2.
\end{align*}
Thus $m^2$ is even, so $m$ is even. Write $m=2k$. Substituting gives
\begin{align*}
(2k)^2 &= 2n^2,\\
4k^2 &= 2n^2,\\
2k^2 &= n^2.
\end{align*}
Hence $n^2$ is even, so $n$ is even. This contradicts $\gcd(m,n)=1$, since both $m$ and $n$ are divisible by $2$. Therefore $X^2-2$ has no root in $\mathbb Q$, even though $2>0$, so $\mathbb Q$ fails the square-root condition for real closed ordered fields.
[/example]
The rational example isolates the two missing algebraic features: compatibility with order is not enough unless the field also has the right roots.
The obstruction is that an ordered field can look order-compatible while still having gaps detected by polynomial equations, such as a missing square root of a positive element. To capture the ordered-field analogue of algebraic closedness, we require precisely the root-existence conditions that remove those first-order algebraic gaps.
[definition: Real Closed Ordered Field]
A real closed ordered field is an ordered field $F$ such that every positive element of $F$ has a square root in $F$ and every polynomial $f \in F[X]$ of odd degree has a root in $F$.
[/definition]
This definition is tailored to the algebraic behaviour of $\mathbb R$, but it is first-order expressible as a scheme in the language $L_{\mathrm{or}} = \{0,1,+,-,\cdot,<\}$. For each odd degree and each degree of polynomial, one writes a sentence asserting existence of a root or a square root in terms of its coefficients.
[quotetheorem:4314]
[citeproof:4314]
The hypotheses are tight. The ordered field $\mathbb Q$ has the right order compatibility but fails the square-root condition, while $\mathbb C$ is algebraically closed but cannot be ordered because $-1=i^2$ would have to be nonnegative. The theorem does not say that an arbitrary field has a canonical order; it says that once these equivalent conditions hold, the order and the algebraic closure after adjoining $i$ determine each other. This equivalence is what lets the later model theory move freely between ordered formulas, polynomial root conditions, and algebraic extensions.
[example: The Real Algebraic Numbers]
Let $\mathbb R_{\mathrm{alg}}=\{x\in \mathbb R:x\text{ is algebraic over }\mathbb Q\}$. It is a subfield of $\mathbb R$, since sums, products, negatives, and inverses of algebraic numbers are algebraic; with the order inherited from $\mathbb R$, it is therefore an ordered field.
We verify the two real-closedness conditions. Let $a\in \mathbb R_{\mathrm{alg}}$ with $a>0$, and let $s\in \mathbb R$ be the positive real number satisfying $s^2=a$. Since $a$ is algebraic over $\mathbb Q$, the field $\mathbb Q(a)$ is algebraic over $\mathbb Q$. Also $s$ is a root of
\begin{align*}
X^2-a \in \mathbb Q(a)[X],
\end{align*}
so $s$ is algebraic over $\mathbb Q(a)$. By transitivity of algebraicity, $s$ is algebraic over $\mathbb Q$, hence $s\in \mathbb R_{\mathrm{alg}}$.
Now let
\begin{align*}
f(X)=c_dX^d+c_{d-1}X^{d-1}+\cdots+c_0 \in \mathbb R_{\mathrm{alg}}[X]
\end{align*}
have odd degree $d$, with $c_d\ne 0$. Viewed as a polynomial over $\mathbb R$, $f$ has a real root $r\in \mathbb R$ by the *Intermediate Value Theorem*. Let $K=\mathbb Q(c_0,\dots,c_d)$. Each $c_i$ is algebraic over $\mathbb Q$, so $K$ is algebraic over $\mathbb Q$. Since $f\in K[X]$ and $f(r)=0$, the element $r$ is algebraic over $K$. Again by transitivity of algebraicity, $r$ is algebraic over $\mathbb Q$, so $r\in \mathbb R_{\mathrm{alg}}$.
Thus every positive element of $\mathbb R_{\mathrm{alg}}$ has a square root in $\mathbb R_{\mathrm{alg}}$, and every odd-degree polynomial over $\mathbb R_{\mathrm{alg}}$ has a root in $\mathbb R_{\mathrm{alg}}$. Therefore $\mathbb R_{\mathrm{alg}}$ is a real closed ordered field.
[/example]
## Quantifier Elimination for Real Closed Fields
Can every first-order statement about ordered polynomial equations and inequalities be reduced to a quantifier-free statement? The obstruction is already visible in a formula such as $\exists y\,p(x,y)>0$: it asks whether the fibre over $x$ contains a point satisfying a sign condition, so it is a projection problem. For real closed fields, Tarski-Seidenberg says that these projections introduce no new first-order complexity. Existential quantification over a real closed field therefore does not create sets more complicated than finite Boolean combinations of polynomial sign conditions.
[definition: Theory of Real Closed Ordered Fields]
The theory RCF is the $L_{\mathrm{or}}$-theory whose models are real closed ordered fields.
[/definition]
The axioms are given by the ordered-field axioms, the square-root scheme for positive elements, and the odd-degree root scheme. Since the class of models is elementary, it makes sense to ask whether RCF eliminates quantifiers.
[quotetheorem:4315]
[citeproof:4315]
The real closedness hypotheses are essential: over $\mathbb Q$, the existential formula $\exists y\,(y^2=2)$ is not captured by simply treating positive elements as squares inside the field. The theorem also does not produce a short or geometrically transparent quantifier-free formula; elimination can be computationally large. Its importance for the rest of the chapter is structural: once quantifiers are removed, definable sets can be read as finite Boolean combinations of polynomial sign conditions, so projection and transfer become algebraic consequences.
[example: Eliminating a Quadratic Inequality]
Let $F$ be a real closed field, and consider
\begin{align*}
\exists y\, (y^2+ay+b<0)
\end{align*}
with parameters $a,b\in F$. Since $F$ is an ordered field, $2=1+1>0$, so $2\ne 0$ and division by $2$ and $4=2^2$ is valid. For every $y\in F$,
\begin{align*}
\left(y+\frac a2\right)^2+b-\frac{a^2}{4}
&= y^2+2\cdot y\cdot \frac a2+\left(\frac a2\right)^2+b-\frac{a^2}{4}\\
&= y^2+ay+\frac{a^2}{4}+b-\frac{a^2}{4}\\
&= y^2+ay+b.
\end{align*}
Thus
\begin{align*}
y^2+ay+b<0
\end{align*}
is equivalent to
\begin{align*}
\left(y+\frac a2\right)^2 < \frac{a^2}{4}-b.
\end{align*}
If some $y$ satisfies the inequality, then the square $\left(y+\frac a2\right)^2$ is nonnegative in the ordered field, so
\begin{align*}
0\le \left(y+\frac a2\right)^2 < \frac{a^2}{4}-b.
\end{align*}
Hence $\frac{a^2}{4}-b>0$, and multiplying by the positive element $4$ gives
\begin{align*}
a^2-4b>0.
\end{align*}
Conversely, if $a^2-4b>0$, then dividing by the positive element $4$ gives
\begin{align*}
\frac{a^2}{4}-b>0.
\end{align*}
Taking $y=-\frac a2$ makes the square term equal to $0$, so
\begin{align*}
y^2+ay+b
&=\left(y+\frac a2\right)^2+b-\frac{a^2}{4}\\
&=0+b-\frac{a^2}{4}\\
&<0.
\end{align*}
Therefore
\begin{align*}
\exists y\,(y^2+ay+b<0)
\end{align*}
is equivalent over real closed fields to the quantifier-free polynomial inequality
\begin{align*}
a^2-4b>0.
\end{align*}
This elimination replaces the fibre variable $y$ by the single discriminant condition on the parameters $a,b$.
[/example]
The example also points to a boundary of the idea: quantifier elimination is always a statement about a chosen language.
[remark: Language Matters]
Quantifier elimination here uses the order relation. In the pure ring language, the theory of real closed fields does not eliminate quantifiers in this form because the order must first be recovered by the formula
\begin{align*}
x>0 \leftrightarrow \exists y\,(x=y^2 \wedge x \ne 0).
\end{align*}
[/remark]
## Semialgebraic Sets and Projections
What are the subsets of $F^n$ that first-order formulas define in a real closed field $F$? Quantifier elimination translates this model-theoretic question into geometry. The answer is that definable sets are exactly semialgebraic sets, and the main closure property is stability under coordinate projection.
[definition: Semialgebraic Set]
Let $F$ be an ordered field. A subset $A \subset F^n$ is semialgebraic over $F$ if it is a finite Boolean combination of sets of the form
\begin{align*}
\{x \in F^n : p(x)=0\}, \qquad \{x \in F^n : q(x)>0\},
\end{align*}
where $p,q \in F[X_1,\dots,X_n]$.
[/definition]
This definition matches quantifier-free definability in the ordered-ring language with parameters from $F$.
The next issue is closure under existential quantifiers. Geometrically, existentially quantifying some variables means projecting away their coordinates, so the required theorem says that semialgebraic sets remain semialgebraic under projection.
[quotetheorem:4316]
[citeproof:4316]
The real closed-field assumption cannot be dropped: projecting algebraic or order-defined sets over a non-real-closed ordered field may require asking for roots that do not exist in the field, as with $\exists y\,(y^2=x)$ over $\mathbb Q$. The theorem also does not say that projections of algebraic sets are algebraic; inequalities and Boolean combinations are genuinely needed. This is the geometric form of quantifier elimination and is the bridge from model theory to semialgebraic geometry.
[illustration:semialgebraic-projection]
[example: Projection of a Disk]
In $\mathbb R^2$, let
\begin{align*}
A=\{(x,y)\in \mathbb R^2:x^2+y^2<1\}.
\end{align*}
The projection of $A$ onto the $x$-axis is
\begin{align*}
\pi(A)=\{x\in \mathbb R:\exists y\,(x^2+y^2<1)\}.
\end{align*}
We show that this set is exactly $\{x\in \mathbb R:x^2<1\}$.
First suppose $x\in \pi(A)$. Then there is some $y\in \mathbb R$ such that
\begin{align*}
x^2+y^2<1.
\end{align*}
Since $y^2\ge 0$ in the ordered field $\mathbb R$, we have
\begin{align*}
x^2 \le x^2+y^2<1,
\end{align*}
and therefore
\begin{align*}
x^2<1.
\end{align*}
Conversely, suppose $x^2<1$. Taking $y=0$ gives
\begin{align*}
x^2+y^2
&=x^2+0^2\\
&=x^2+0\\
&=x^2\\
&<1,
\end{align*}
so there exists $y\in \mathbb R$ with $x^2+y^2<1$. Hence
\begin{align*}
\pi(A)=\{x\in \mathbb R:x^2<1\}.
\end{align*}
Because
\begin{align*}
x^2<1
\end{align*}
is equivalent, by adding $-x^2$ to both sides, to
\begin{align*}
1-x^2>0,
\end{align*}
the projection is the semialgebraic subset of the line defined by the single polynomial inequality $1-x^2>0$, namely the interval $(-1,1)$.
[/example]
The disk example shows projection in its simplest geometric form: a two-dimensional inequality becomes a one-dimensional inequality after eliminating the fibre variable. On the line, this phenomenon becomes especially rigid because polynomial signs can change only at roots.
This raises a classification question for one-variable definable sets: after all quantifiers are eliminated, how complicated can the resulting Boolean combination of polynomial inequalities be? Since finitely many roots cut the line into finitely many sign-constant intervals, definability in one variable should reduce to finitely many points and intervals.
[quotetheorem:4317]
[citeproof:4317]
Real closedness and the ordered-ring language are both doing work here. In richer structures, such as $(\mathbb R,+,\cdot,<,\mathbb Z)$, the subset $\mathbb Z \subset \mathbb R$ is definable and is not a finite union of points and intervals. The theorem also does not classify definable subsets of $F^n$ for $n>1$ as finite unions of boxes; higher-dimensional semialgebraic sets have richer geometry. What it does provide is the one-dimensional template for o-minimality, which will later generalise this finite interval decomposition to higher-dimensional cell decompositions.
[illustration:semialgebraic-line-decomposition]
## Completeness and Model Completeness
What does quantifier elimination imply about models of RCF as structures? A naive transfer principle would say that anything true over $\mathbb R$ is true over every real closed field, but this is false for statements about metric completeness, cardinality, or arbitrary subsets. Quantifier elimination identifies the correct boundary: first-order ordered-field statements transfer, and embeddings preserve all formulas because formulas reduce to quantifier-free algebraic comparisons. These are model-theoretic consequences of the same algebraic normal form.
[quotetheorem:4318]
[citeproof:4318]
The ordered-field language is essential: it cannot express Dedekind completeness, Archimedeanness, or uncountability, and those properties distinguish $\mathbb R$ from many other real closed fields. Completeness also does not say that all real closed fields are isomorphic; $\mathbb R$ and $\mathbb R_{\mathrm{alg}}$ already give different cardinalities. What it does say is that every sentence built only from ordered-field operations has the same truth value in every real closed field, preparing the transfer principle used at the end of the chapter.
[example: Real Numbers and Real Algebraic Numbers]
The fields $\mathbb R$ and $\mathbb R_{\mathrm{alg}}$ are not isomorphic. Indeed, $\mathbb Q[X]$ is countable because
\begin{align*}
\mathbb Q[X]=\bigcup_{d\ge 0}\mathbb Q^{d+1}
\end{align*}
after recording a polynomial of degree at most $d$ by its $d+1$ coefficients, and each $\mathbb Q^{d+1}$ is countable. Every nonzero polynomial has only finitely many roots, so the set of roots of all nonzero polynomials in $\mathbb Q[X]$ is a countable union of finite sets. Hence $\mathbb R_{\mathrm{alg}}$ is countable. By Cantor's diagonal argument, $\mathbb R$ is uncountable, so no bijection $\mathbb R_{\mathrm{alg}}\to\mathbb R$ exists; in particular, the two fields cannot be isomorphic.
Nevertheless, $\mathbb R$ is a real closed ordered field, and $\mathbb R_{\mathrm{alg}}$ is a real closed ordered field by the preceding example. Therefore *Completeness of Real Closed Ordered Fields* applies and gives
\begin{align*}
(\mathbb R,+,\cdot,<,0,1) \equiv (\mathbb R_{\mathrm{alg}},+,\cdot,<,0,1).
\end{align*}
Thus isomorphism is stronger than elementary equivalence here: although the two ordered fields have different cardinalities, every first-order sentence in the ordered-ring language has the same truth value in $\mathbb R$ and in $\mathbb R_{\mathrm{alg}}$.
[/example]
The example shows why completeness is weaker than elementary embedding: it compares whole structures without saying how a subfield sits inside a larger field.
For applications, the harder question is whether an inclusion of real closed ordered fields preserves formulas with parameters from the smaller field. Quantifier elimination suggests that it should, because polynomial equalities and inequalities are already interpreted correctly by the inclusion.
[quotetheorem:4319]
[citeproof:4319]
The inclusion hypotheses matter. If $K$ is not real closed, the inclusion $\mathbb Q \subset \mathbb R$ is not elementary because $\mathbb R \models \exists y\,(y^2=2)$ while $\mathbb Q$ does not. The theorem also does not say that every ordered-field embedding into a real closed field is elementary; both sides must be models of RCF. Its use is that naturally occurring inclusions such as $\mathbb R_{\mathrm{alg}} \subset \mathbb R$ become elementary without a separate back-and-forth construction.
[example: Elementary Subfields Inside a Real Closure]
Let $K$ be a real closed subfield of a real closed field $L$, with the order on $K$ inherited from $L$. For example, $\mathbb R_{\mathrm{alg}}\subset \mathbb R$. If $a,b\in K$, then the inclusion map $\iota:K\to L$ satisfies
\begin{align*}
\iota(0)&=0,\\
\iota(1)&=1,\\
\iota(a+b)&=a+b=\iota(a)+\iota(b),\\
\iota(-a)&=-a=-\iota(a),\\
\iota(ab)&=ab=\iota(a)\iota(b).
\end{align*}
Since the order on $K$ is the restriction of the order on $L$, for $a,b\in K$ we also have
\begin{align*}
K\models a<b \quad \Longleftrightarrow \quad L\models a<b.
\end{align*}
Thus the inclusion is an ordered-field embedding.
Because both $K$ and $L$ are real closed ordered fields, *Model Completeness of Real Closed Ordered Fields* applies to this inclusion and gives
\begin{align*}
K\preccurlyeq L.
\end{align*}
Equivalently, for every ordered-ring formula $\varphi(x_1,\dots,x_n)$ and every tuple $a\in K^n$,
\begin{align*}
K\models \varphi(a) \quad \Longleftrightarrow \quad L\models \varphi(a).
\end{align*}
So any definable question whose parameters lie in $K$ has the same answer whether it is evaluated inside $K$ or inside the larger real closed field $L$.
[/example]
## Consequences for Real Algebraic Geometry
How does a theorem about formulas become a theorem about geometry? In real closed fields, polynomial equations and inequalities are the bridge: they are both the atomic formulas of the language and the basic building blocks of semialgebraic geometry. The model-theoretic consequences of RCF therefore translate into uniform algebraic statements about all real closed fields.
[quotetheorem:4320]
[citeproof:4320]
The restriction to the ordered-ring language is necessary: adding a predicate for $\mathbb Z$ over $\mathbb R$ creates definable sets that are not semialgebraic. The theorem also does not claim that a definable set has a unique semialgebraic presentation; many different Boolean combinations can define the same subset. Its practical value is that arbitrary formulas may be replaced by finite sign conditions on polynomials, which is the language in which real algebraic geometry operates.
[example: A Definable Image]
Let $F$ be a real closed field, and let $f:F\to F^2$ be the polynomial map $f(t)=(t^2,t^3)$. Its image is defined by
\begin{align*}
\exists t\,(x=t^2 \wedge y=t^3),
\end{align*}
and we show that this is equivalent to the quantifier-free conditions
\begin{align*}
x\ge 0,\qquad y^2=x^3.
\end{align*}
First suppose $(x,y)=f(t)$ for some $t\in F$. Then $x=t^2$, so $x\ge 0$ because squares are nonnegative in an ordered field. Also,
\begin{align*}
y^2
&=(t^3)^2\\
&=t^6\\
&=(t^2)^3\\
&=x^3.
\end{align*}
Thus every point in the image satisfies $x\ge 0$ and $y^2=x^3$.
Conversely, suppose $x\ge 0$ and $y^2=x^3$. If $x=0$, then
\begin{align*}
y^2=x^3=0^3=0,
\end{align*}
so $y=0$, and taking $t=0$ gives
\begin{align*}
t^2=0=x,\qquad t^3=0=y.
\end{align*}
Now assume $x>0$. Since $F$ is real closed, there is $s\in F$ with $s^2=x$. Then
\begin{align*}
y^2
&=x^3\\
&=(s^2)^3\\
&=s^6\\
&=(s^3)^2.
\end{align*}
Hence
\begin{align*}
0
&=y^2-(s^3)^2\\
&=(y-s^3)(y+s^3).
\end{align*}
Since $F$ is a field, this implies $y=s^3$ or $y=-s^3$. If $y=s^3$, take $t=s$. If $y=-s^3$, take $t=-s$, and then
\begin{align*}
t^2=(-s)^2=s^2=x,\qquad t^3=(-s)^3=-s^3=y.
\end{align*}
Therefore
\begin{align*}
\exists t\,(x=t^2 \wedge y=t^3)
\end{align*}
is equivalent over real closed fields to
\begin{align*}
x\ge 0 \wedge y^2=x^3.
\end{align*}
The image is therefore the semialgebraic cusp cut out by a nonnegativity condition and the equation $y^2=x^3$, rather than the graph of a polynomial function.
[/example]
This example illustrates why images are naturally projection problems: the parameter $t$ is hidden, and the semialgebraic description records exactly the algebraic compatibility conditions left behind after eliminating it.
[remark: Transfer Across Real Closed Fields]
Completeness of RCF gives a transfer principle for first-order statements in the ordered-ring language. If a sentence about polynomial equations and inequalities holds in $\mathbb R$, then it holds in every real closed field. Statements involving completeness of the metric, countability, or arbitrary subsets of $\mathbb R$ are not first-order ordered-field statements and are outside this transfer principle.
[/remark]
The chapter therefore closes the loop between algebra, logic, and geometry. Real closedness gives the correct algebraic class of ordered fields, quantifier elimination turns formulas into sign conditions, and completeness/model completeness explain why the same first-order real algebraic geometry is shared by all real closed ordered fields.
Real closed fields show that quantifier elimination can produce a complete and highly controlled first-order theory even in the presence of order. The final step is to isolate the general semantic mechanism behind that control: model completeness, and Robinson's test as a practical criterion for recognizing it.
# 10. Model Completeness and Robinson's Test
Model completeness is a strengthening of completeness that controls not just which sentences hold in all models, but how models of the same theory sit inside one another. Chapters 7 through 9 treated quantifier elimination as a precise way of reducing formulas to quantifier-free information, first abstractly and then in dense orders, algebraically closed fields, and real closed fields. This chapter studies the weaker but very robust condition that embeddings between models preserve all formulas, and explains Robinson's test as the main practical criterion for proving it.
The guiding question is: when does a substructure inclusion $M \subset N$ between models of a theory $T$ already behave like an elementary substructure relation? Quantifier elimination gives a direct answer, because quantifier-free formulas are preserved by embeddings. Model completeness isolates the elementary-embedding conclusion and keeps it even in languages where full quantifier elimination is unavailable or inconvenient.
## Embeddings Between Models
A theory may have many models, and embeddings between them need not preserve formulas with quantifiers. The first problem is to identify the theories for which this defect never occurs: once two structures both satisfy the theory, every embedding between them is already as strong as an elementary embedding.
[definition: Model Complete Theory]
Let $L$ be a first-order language and let $T$ be an $L$-theory. The theory $T$ is model complete if, whenever $M \models T$ and $N \models T$, every $L$-embedding $f:M\to N$ is an elementary embedding.
[/definition]
Thus model completeness is a property of the class of models of $T$ together with the embeddings between them. In particular, if $M \subset N$ are both models of $T$, then model completeness says $M \preccurlyeq N$.
[example: Dense Linear Orders]
Let $T=\operatorname{DLO}$ in the language $L=\{<\}$, and let $M,N\models T$ with $f:M\to N$ an order embedding. We show that $f$ is elementary. Fix an $L$-formula $\varphi(x_1,\ldots,x_n)$ and a tuple $a=(a_1,\ldots,a_n)$ from $M$. By *Quantifier Elimination for Dense Linear Orders*, there is a quantifier-free formula $\theta(x_1,\ldots,x_n)$ such that
\begin{align*}
T \models \forall x\,\bigl(\varphi(x)\leftrightarrow \theta(x)\bigr).
\end{align*}
Every atomic $L$-formula is either $x_i<x_j$ or $x_i=x_j$. Since $f$ is an order embedding, for all $i,j$ we have
\begin{align*}
M\models a_i<a_j
&\Longleftrightarrow a_i<a_j \text{ in } M\\
&\Longleftrightarrow f(a_i)<f(a_j) \text{ in } N\\
&\Longleftrightarrow N\models f(a_i)<f(a_j),
\end{align*}
and since every embedding is injective,
\begin{align*}
M\models a_i=a_j
&\Longleftrightarrow a_i=a_j\\
&\Longleftrightarrow f(a_i)=f(a_j)\\
&\Longleftrightarrow N\models f(a_i)=f(a_j).
\end{align*}
Truth of Boolean combinations is obtained from truth of their atomic parts, so
\begin{align*}
M\models \theta(a)
\Longleftrightarrow
N\models \theta(f(a)).
\end{align*}
Using the equivalence of $\varphi$ and $\theta$ in both models of $T$,
\begin{align*}
M\models \varphi(a)
&\Longleftrightarrow M\models \theta(a)\\
&\Longleftrightarrow N\models \theta(f(a))\\
&\Longleftrightarrow N\models \varphi(f(a)).
\end{align*}
Thus every order embedding between models of $\operatorname{DLO}$ preserves and reflects all formulas, so it is elementary. Hence $\operatorname{DLO}$ is model complete.
[/example]
This example shows one route to model completeness: prove quantifier elimination first. Robinson's test will give a second route, based on existential formulas rather than quantifier-free formulas.
To formulate that route, we need to isolate the formulas whose truth is witnessed by finitely many elements. These are the formulas naturally preserved upward along embeddings, because a witness in the smaller structure remains available in the larger one.
[definition: Existential Formula]
An $L$-formula $\varphi(x)$ is existential if it has the form
\begin{align*}
\exists y\, \psi(x,y),
\end{align*}
where $\psi(x,y)$ is quantifier-free.
[/definition]
Existential formulas are the formulas whose truth is automatically preserved when passing upward along embeddings, provided the relevant witnesses already exist in the larger structure. This asymmetry is the reason model completeness can be tested using existential information.
## Existential Preservation
The central technical question is how much of elementary equivalence can be recovered from the behavior of existential formulas. Since arbitrary formulas may alternate quantifiers, it is not immediate that existential formulas should suffice. The point of model completeness is that, over models of the theory, every formula can be replaced by an existential one.
[quotetheorem:4321]
[citeproof:4321]
The theorem explains why existential formulas are not merely a technical convenience. In a model complete theory, all first-order information has an existential representative, so embeddings cannot create new truth without also respecting its negation.
[remark: Universal Forms]
Since the negation of an existential formula is universal, the theorem also says that in a model complete theory every formula is equivalent modulo $T$ to a universal formula. The existential and universal representatives usually differ syntactically, but they define the same set in every model of $T$.
[/remark]
The embedding criterion is abstract, so it is useful to test it against the most concrete source of embeddings: solving algebraic equations in field extensions.
[example: Algebraic Equations in Fields]
In the language of rings, the formula
\begin{align*}
\exists y\,(y^2=x)
\end{align*}
says that $x$ has a square root. Let $K\subset L$ be a field extension and let $a\in K$. If $K\models \exists y\,(y^2=a)$, then there is some $b\in K$ such that
\begin{align*}
b^2=a.
\end{align*}
Since $K\subset L$, the same element $b$ also lies in $L$, and the multiplication of elements of $K$ agrees with their multiplication in $L$. Hence, computed in $L$,
\begin{align*}
b^2=b\cdot b=a,
\end{align*}
so $L\models \exists y\,(y^2=a)$.
The converse can fail. Take $K=\mathbb{Q}$, $L=\mathbb{Q}(\sqrt{2})$, and $a=2$. In $L$ there is an element $\sqrt{2}$ with
\begin{align*}
(\sqrt{2})^2=2,
\end{align*}
so $L\models \exists y\,(y^2=2)$. But if $q\in\mathbb{Q}$ and $q^2=2$, write $q=m/n$ with integers $m,n$, $n\neq 0$, and $\gcd(m,n)=1$. Then
\begin{align*}
\left(\frac{m}{n}\right)^2=2
&\Longleftrightarrow m^2=2n^2.
\end{align*}
Thus $m^2$ is even, so $m$ is even; write $m=2r$. Substituting gives
\begin{align*}
(2r)^2=2n^2
&\Longleftrightarrow 4r^2=2n^2\\
&\Longleftrightarrow 2r^2=n^2.
\end{align*}
Thus $n^2$ is even, so $n$ is even, contradicting $\gcd(m,n)=1$. Therefore $\mathbb{Q}\not\models \exists y\,(y^2=2)$.
This shows the asymmetric behavior of existential formulas: witnesses already present in the smaller field remain witnesses in the larger field, but a larger field may contain new algebraic witnesses that the smaller field lacks.
[/example]
## Robinson's Test
The practical problem is that checking every formula directly is not feasible. Robinson's test replaces that task with a diagram-completeness condition: after naming the elements of a model $M\models T$, the quantifier-free diagram of $M$ should already determine every sentence in the expanded language.
[definition: Diagram]
Let $A$ be an $L$-structure. The diagram $\operatorname{Diag}(A)$ is the set of all quantifier-free $L(A)$-sentences true in $A$, where $L(A)$ is obtained from $L$ by adding a constant symbol for each element of $A$.
[/definition]
The diagram records the atomic and negated atomic facts of a structure with parameters named. It is designed so that a model of $\operatorname{Diag}(A)$ contains an isomorphic copy of $A$ as an $L$-substructure.
The remaining issue is how this syntactic control detects model completeness. Robinson's test asks whether, after a model is named by constants, the quantifier-free information compatible with the theory already decides all expanded-language consequences needed to force embeddings to be elementary.
[quotetheorem:4322]
[citeproof:4322]
Robinson's test is often paired with an equivalent amalgamation-style form: finite quantifier-free information over a named substructure can be tested by asking whether the corresponding diagram extensions stay compatible with $T$.
[example: Algebraically Closed Fields]
For algebraically closed fields, the diagram-compatibility condition behind Robinson's test reflects a familiar algebraic fact: algebraically closed field extensions over a common subfield can be compared inside a larger algebraically closed field after arranging the transcendence data compatibly. Thus finite polynomial equations and inequations over the named common part can be realized together in one algebraically closed field.
The model-theoretic consequence is that inclusions between algebraically closed fields preserve first-order truth in the ring language. This is weaker than quantifier elimination as a method but agrees with the outcome already obtained from quantifier elimination: $\operatorname{ACF}$ is model complete.
[/example]
The algebraically closed case uses transcendence bases and algebraic extension embeddings. Ordered fields require the same compatibility idea, but now the common embedding must preserve the order.
[example: Real Closed Fields]
For real closed fields, the analogous compatibility condition must also respect order. Ordered-field amalgamation and real closure provide the algebraic background: two real closed extensions of a common ordered subfield can be compared inside a larger ordered real closed field in a way that preserves the ordered-ring information over the common part.
Robinson's test packages that compatibility as model completeness of $\operatorname{RCF}$. This agrees with the earlier quantifier-elimination viewpoint: embeddings between real closed ordered fields are elementary, even though the proof strategy here emphasizes diagram compatibility rather than explicit normal forms for formulas.
[/example]
These examples are important because model completeness is weaker than quantifier elimination but still strong enough to control embeddings. Algebraically closed fields and real closed fields in their standard languages in fact have quantifier elimination, but Robinson's test gives a proof strategy that also works in settings where quantifier elimination is too much to ask for.
## Quantifier Elimination and Model Completeness
The final question is how the new notion compares with quantifier elimination. Quantifier elimination controls formulas down to quantifier-free data in each model; model completeness controls embeddings between models. The former implies the latter, but the converse depends on the language.
[quotetheorem:4303]
[citeproof:4303]
The implication is frequently the quickest way to prove model completeness. Dense linear orders, algebraically closed fields, and real closed fields all admit quantifier elimination in natural languages, so each is model complete. Conceptually, the reason is that an embedding already preserves quantifier-free formulas, and quantifier elimination says every formula is equivalent to one of those preserved formulas. This also shows the limitation of the result: it proves model completeness only after quantifier elimination has been established in the chosen language.
The converse question is subtler and depends on how much definable structure the language names. Model completeness says that embeddings preserve all formulas between models, but it does not automatically provide quantifier-free definitions for all definable sets. To recover quantifier elimination from model completeness, one needs additional hypotheses ensuring that definable relations can be separated by quantifier-free information.
[quotetheorem:4324]
[citeproof:4324]
The criterion is useful because it turns the search for quantifier-free definitions into a separation problem: if two tuples cannot be separated by quantifier-free information, model completeness forces them to agree on all formulas. The hypothesis is therefore not just technical; it supplies the missing bridge between preservation under embeddings and explicit quantifier-free descriptions.
A common way to recognize this bridge is to add names for enough definable relations. Expanding the language can make formerly hidden structure quantifier-free, and then model completeness has a better chance of upgrading to quantifier elimination. The following formulation isolates that language-dependent mechanism.
[quotetheorem:4325]
[citeproof:4325]
[example: Model Completeness Without Full Quantifier Elimination]
A concrete instance is obtained by forgetting the order symbol from real closed fields. In the ordered-ring language $L'=\{0,1,+,-,\cdot,<\}$, real closed fields have quantifier elimination. Pass to the smaller ring language $L=\{0,1,+,-,\cdot\}$, and let $T$ be the $L$-theory of real closed fields.
We first check why the reduct is still model complete. Let $M,N\models T$ and let $f:M\to N$ be an $L$-embedding. In a real closed field, the order is recovered from the ring structure by
\begin{align*}
x<y
\Longleftrightarrow
\exists z\,\bigl(z\neq 0\wedge y-x=z^2\bigr).
\end{align*}
Indeed, if $x<y$, then $y-x>0$, and every positive element of a real closed field is a square; moreover the square root is nonzero because $y-x\neq 0$. Conversely, if $y-x=z^2$ with $z\neq 0$, then $z^2>0$, so $y-x>0$, hence $x<y$. Therefore, if $a<b$ in $M$, choose $c\in M$ with $c\neq 0$ and
\begin{align*}
b-a=c^2.
\end{align*}
Applying the ring embedding $f$ gives
\begin{align*}
f(b)-f(a)
&=f(b-a)\\
&=f(c^2)\\
&=f(c)^2,
\end{align*}
and $f(c)\neq 0$ because $f$ is injective. Hence $f(a)<f(b)$ in $N$. Thus every ring embedding between real closed fields is automatically an ordered-ring embedding, so by *Model Completeness of Real Closed Fields* it is elementary in the ordered-ring language, and therefore elementary in the smaller ring language.
But the reduct does not have quantifier elimination in the ring language. The formula
\begin{align*}
\exists z\,\bigl(z\neq 0\wedge x=z^2\bigr)
\end{align*}
defines the positive elements. If it were equivalent over $T$ to a quantifier-free ring formula $\theta(x)$, then in $\mathbb{R}$ the set $(0,\infty)$ would be a Boolean combination of solution sets of polynomial equations $p(x)=0$. Each nonzero polynomial in one variable over $\mathbb{R}$ has finitely many roots, while the equation $0=0$ defines all of $\mathbb{R}$ and $1=0$ defines the empty set. Hence every quantifier-free one-variable ring formula over $\mathbb{R}$ defines either a finite set or the complement of a finite set. The set $(0,\infty)$ is infinite, and its complement $(-\infty,0]$ is also infinite, so it is neither finite nor cofinite. Thus no such quantifier-free $\theta(x)$ exists.
This illustrates the gap: after removing the order symbol, embeddings are still controlled strongly enough to give model completeness, but quantifier-free formulas in the smaller language no longer describe all definable sets.
[/example]
The distinction matters in applications. Quantifier elimination gives explicit normal forms for formulas, while model completeness gives elementary behavior of embeddings and existential descriptions of definable sets. Robinson's test sits between the two: it often proves model completeness by checking a concrete compatibility condition on diagrams rather than producing full quantifier-free normal forms.
By the end of the course, the central tools have all been assembled: compactness, ultraproducts, quantifier elimination, model completeness, and elementary chains. The final synthesis asks what these methods can build, what they cannot force, and how the limits of first-order logic still support powerful structural conclusions.
# 11. Synthesis: What Compactness Buys and What It Cannot Buy
This chapter synthesises the model-building tools developed so far and asks what they accomplish when used together. Compactness, diagrams, ultraproducts, quantifier elimination, model completeness, and elementary chains all convert finite or local first-order information into global structure. The guiding tension is that the same flexibility which produces rich models also prevents first-order logic from isolating many intended infinite structures.
## Compactness, Ultraproducts, and Elementary Equivalence as a Toolkit
The central problem is this: given partial first-order information about a structure, when can we turn it into an actual model or compare it with another structure? Compactness answers existence questions from finite satisfiability, ultraproducts package infinitely many approximations into one structure, and elementary equivalence records agreement on all first-order sentences. Used together, these methods explain why first-order logic is powerful enough to build nonstandard models but too weak to enforce many intended infinite features.
[definition: Elementary Equivalence]
Let $M$ and $N$ be $L$-structures. We write $M \equiv N$ if for every $L$-sentence $\sigma$,
\begin{align*}
M \models \sigma \quad \Longleftrightarrow \quad N \models \sigma.
\end{align*}
[/definition]
Elementary equivalence forgets the actual elements of the structures and remembers only their first-order theory. It is therefore coarser than isomorphism, but much finer than sharing a few algebraic invariants.
[example: Nonisomorphic Elementarily Equivalent Dense Orders]
The ordered sets $(\mathbb Q,<)$ and $(\mathbb R,<)$ cannot be isomorphic: an order isomorphism is in particular a bijection between the underlying sets, but $\mathbb Q$ is countable and $\mathbb R$ is uncountable.
Both structures are dense linear orders without endpoints. For $\mathbb Q$, if $q<r$, then
\begin{align*}
q<\frac{q+r}{2}<r,
\end{align*}
and $q-1<q<q+1$, so there are no endpoints. The same calculation shows density and absence of endpoints in $\mathbb R$. By *Quantifier Elimination for Dense Linear Orders Without Endpoints*, every $\{<\}$-sentence is equivalent, modulo the theory of dense linear orders without endpoints, to a quantifier-free sentence. Since the language $\{<\}$ has no constant symbols, a quantifier-free sentence has no closed terms to compare, so its truth value is fixed independently of whether it is interpreted in $(\mathbb Q,<)$ or $(\mathbb R,<)$. Hence
\begin{align*}
(\mathbb Q,<)\models \sigma
\quad\Longleftrightarrow\quad
(\mathbb R,<)\models \sigma
\end{align*}
for every first-order sentence $\sigma$ in the language $\{<\}$, so $(\mathbb Q,<)\equiv(\mathbb R,<)$. Elementary equivalence preserves all first-order order-theoretic information here, but it does not preserve the cardinality of the underlying set.
[/example]
The diagram method gives a systematic way to use compactness to construct extensions, embeddings, and elementary extensions. Without diagrams, a compactness construction may produce a model of the right theory but fail to remember where the original structure went. Constants for old elements prevent this loss: they force any new model to carry a named copy of the original structure, and the choice between atomic and elementary diagrams controls whether the copy is merely embedded or elementary.
[definition: Atomic Diagram]
Let $M$ be an $L$-structure. Expand $L$ by constants $c_a$ for each $a \in M$. The atomic diagram $\operatorname{Diag}_{\mathrm{at}}(M)$ is the set of all atomic and negated atomic $L(M)$-sentences true in $M$ under the interpretation $c_a^M=a$.
[/definition]
The atomic diagram records enough information to force an embedding of $M$ into any model of the expanded theory. It does not force the embedding to preserve formulas with quantifiers, because existential or universal facts about named parameters may be lost in the larger model.
To control elementary embeddings rather than mere embeddings, the diagram must record every first-order fact about the named elements. This stronger record is the elementary diagram.
[definition: Elementary Diagram]
Let $M$ be an $L$-structure. The elementary diagram $\operatorname{Diag}_{\mathrm{el}}(M)$ is the set of all $L(M)$-sentences true in $M$, where $L(M)$ is obtained by adding a constant $c_a$ for every $a \in M$.
[/definition]
The reason for introducing elementary diagrams is that they let compactness build extensions without losing the original structure. If every finite part of the desired extension can already be realized, then compactness produces a model containing named copies of all old elements and satisfying all old first-order facts with parameters.
The useful compactness question is therefore finite satisfiability: when can additional requirements be imposed while still preserving the complete first-order behavior of the old structure? The obstruction is that new conditions may accidentally contradict some formula with parameters from $M$, even if they look harmless in the smaller language. The construction principle below gives the exact compactness test: check only finite pieces after adjoining the elementary diagram.
[quotetheorem:4326]
[citeproof:4326]
The finite satisfiability hypothesis is the only place where the proposed requirements must be checked. The elementary diagram is essential: using only the complete theory of $M$ would produce a model elementarily equivalent to $M$, but not a controlled extension of the given structure. The schema is therefore the standard compactness template for adding new elements while preserving all first-order facts about old parameters.
[example: Constructing a Proper Elementary Extension]
Let $M$ be an infinite $L$-structure, and expand $L(M)$ by one new constant symbol $d$. Consider
\begin{align*}
\operatorname{Diag}_{\mathrm{el}}(M)\ \cup\ \{d\ne c_a : a\in M\}.
\end{align*}
To check finite satisfiability, let $\Delta$ be a finite subset of this theory. Only finitely many inequalities involving $d$ occur in $\Delta$, say
\begin{align*}
d\ne c_{a_1},\dots,d\ne c_{a_n}.
\end{align*}
Since $M$ is infinite and $\{a_1,\dots,a_n\}$ is finite, choose $m\in M\setminus\{a_1,\dots,a_n\}$. Interpret every old constant by $c_a^M=a$ and interpret $d$ by $m$. Then each sentence from $\operatorname{Diag}_{\mathrm{el}}(M)$ appearing in $\Delta$ is true by the definition of the elementary diagram, and each displayed inequality is true because
\begin{align*}
d^M=m\ne a_i=c_{a_i}^M
\end{align*}
for every $1\le i\le n$. Thus every finite $\Delta$ is satisfiable.
By *Elementary Diagram Compactness Construction*, there is a model $N^*$ of the full expanded theory. Its $L$-reduct contains an elementary copy of $M$, and the element $d^{N^*}$ is distinct from every named old element because $N^*\models d\ne c_a$ for each $a\in M$. After identifying $M$ with its elementary image, we obtain a proper elementary extension $N\succeq M$.
[/example]
Ultraproducts provide the parallel semantic construction. Instead of adding constants and invoking compactness, we quotient products by an ultrafilter and use Los's theorem to transfer first-order truth coordinatewise.
There is a deeper comparison theorem, due to Keisler and Shelah, saying roughly that elementary equivalence can be detected by passing to suitable ultrapowers. That result belongs to a later course: its proof uses saturation and a more refined analysis of ultrafilters than this foundations sequence has developed. For the present page, the important takeaway is more modest: ultraproducts preserve first-order truth by Łoś's theorem, while compactness and elementary diagrams explain how to build extensions with controlled first-order behavior.
## Comparing Dense Orders, Algebraically Closed Fields, and Real Closed Fields
The next problem is to compare what first-order methods reveal in the three guiding examples. Dense linear orders without endpoints, algebraically closed fields, and real closed fields all have quantifier elimination after choosing the correct language or theory. The resulting descriptions of definable sets are quite different, and this difference explains why the examples feel algebraic, order-theoretic, and semi-algebraic respectively.
[definition: One-Variable Definable Set]
Let $M$ be an $L$-structure and let $A \subset M$. A subset $X \subset M$ is definable over $A$ if there is an $L(A)$-formula $\varphi(x)$ such that
\begin{align*}
X = \{a \in M : M \models \varphi(a)\}.
\end{align*}
[/definition]
The shape of definable subsets of the line is a compact way to remember the behaviour of a complete theory. Quantifier elimination makes this shape computable from atomic formulas.
For dense linear orders, the possible atomic information about one variable is only its comparison with finitely many parameters. The question is whether arbitrary formulas can define wilder subsets after quantifiers are allowed, or whether quantifier elimination keeps every one-variable definable set controlled by finitely many cuts. The next result gives that classification.
[quotetheorem:4328]
[citeproof:4328]
The hypothesis that the order be dense and have no endpoints is what prevents isolated gaps and boundary anomalies from appearing in the classification. Quantifier elimination is doing the real work: it reduces all possible definitions to finitely many comparisons with parameters. This prepares the contrast with fields, where the atomic formulas are equations or inequalities rather than order cuts.
[example: A Definable Set in a Dense Order]
Let $(M,<)$ be a dense linear order without endpoints, and fix parameters $a<b<c$ in $M$. The formula
\begin{align*}
(a<x \wedge x<b) \vee (x=c)
\end{align*}
is true exactly on $(a,b)\cup\{c\}$.
The ordered parameters cut $M$ into the seven possible regions
\begin{align*}
(-\infty,a),\quad \{a\},\quad (a,b),\quad \{b\},\quad (b,c),\quad \{c\},\quad (c,\infty).
\end{align*}
If $x\in(a,b)$, then $a<x$ and $x<b$, so $a<x\wedge x<b$ is true, and the whole disjunction is true. If $x=c$, then $x=c$ is true, so the disjunction is true. Conversely, if the disjunction is true, then either $a<x\wedge x<b$ is true, which means $x\in(a,b)$, or $x=c$ is true, which means $x\in\{c\}$. Hence the defined set is exactly
\begin{align*}
\{x\in M:M\models (a<x\wedge x<b)\vee(x=c)\}=(a,b)\cup\{c\}.
\end{align*}
This illustrates the finite-cut method: with finitely many parameters, one checks finitely many intervals and singleton parameter points. Since one-variable definable sets in dense linear orders without endpoints are finite unions of such pieces, an infinite definable set must contain a nonempty interval; by density, no such interval is discrete.
[/example]
Dense linear orders show that quantifier elimination can impose strong one-variable geometry. Algebraically closed fields have a different kind of geometry: polynomial equations in one variable have only finitely many roots unless the polynomial is identically zero. Because quantifier elimination reduces arbitrary definable sets to Boolean combinations of polynomial equations, this algebraic fact suggests a sharp finite-or-cofinite dichotomy.
The point that still needs a theorem is that Boolean combinations of polynomial equations introduce no hidden one-variable complexity beyond those finite zero sets. The following classification makes that precise for algebraically closed fields.
[quotetheorem:4329]
[citeproof:4329]
Algebraic closedness is essential here, because it turns the zero set of a one-variable polynomial into a completely controlled finite object unless the polynomial is identically zero. The theorem is the model-theoretic shadow of the Zariski topology on the affine line: definable subsets are no more complicated than constructible subsets, which in one dimension collapse to finite or cofinite sets. This is why algebraically closed fields behave so differently from ordered fields in one variable.
[example: Finite or Cofinite Behaviour in an Algebraically Closed Field]
Over an algebraically closed field $K$, fix parameters $a,b\in K$ and consider
\begin{align*}
\varphi(x)\ :=\ (x^2-a=0)\vee(x^3+b=0).
\end{align*}
The set defined by $\varphi$ is
\begin{align*}
X
&=\{x\in K:K\models x^2-a=0\}\cup\{x\in K:K\models x^3+b=0\}\\
&=\{x\in K:x^2=a\}\cup\{x\in K:x^3=-b\}.
\end{align*}
The polynomial $x^2-a$ has degree $2$ unless it is the zero polynomial, which cannot happen because its leading coefficient is $1$. Hence it has at most $2$ roots in the field $K$. Similarly, $x^3+b$ has degree $3$ and leading coefficient $1$, so it has at most $3$ roots. Therefore
\begin{align*}
|X|
\le |\{x\in K:x^2=a\}|+|\{x\in K:x^3=-b\}|
\le 2+3
=5.
\end{align*}
If the two root sets overlap, or if either polynomial has repeated roots, the actual number of elements is smaller, but it is still finite.
The negation of $\varphi$ defines
\begin{align*}
K\setminus X
=\{x\in K:(x^2-a\ne 0)\wedge(x^3+b\ne 0)\},
\end{align*}
so it is cofinite because its complement is the finite set $X$. Thus this formula exhibits the finite side of the finite-or-cofinite classification: a one-variable set in an algebraically closed field that is infinite and has infinite complement cannot arise in this way in the pure field language with parameters.
[/example]
Real closed fields retain an order, so their one-variable definable sets cannot collapse to the finite-or-cofinite pattern. Inequalities cut the line into intervals, while polynomial equations contribute finitely many boundary points. The appropriate analogue of the algebraically closed field result is therefore a finite union of points and intervals, with endpoints controlled by the parameter field.
[quotetheorem:4330]
[citeproof:4330]
The endpoint condition reflects the algebraic nature of the construction: endpoints arise as roots of polynomials over the parameter field. Unlike the algebraically closed case, inequalities remember order and therefore produce intervals rather than merely finite exceptional sets. This is the one-dimensional beginning of semi-algebraic geometry and foreshadows o-minimality, where definable sets remain geometrically tame in all dimensions.
[example: A Semi-Algebraic One-Variable Set]
In a real closed field $R$, let $\sqrt{2}$ denote the positive element $s\in R$ with $s^2=2$; such an element exists because $2>0$ and positive elements have square roots in a real closed field. We compute the set defined by
\begin{align*}
x^2<2 \wedge x^2-1\ne 0.
\end{align*}
Since $s^2=2$, we have
\begin{align*}
x^2<2
&\Longleftrightarrow x^2<s^2\\
&\Longleftrightarrow x^2-s^2<0\\
&\Longleftrightarrow (x-s)(x+s)<0.
\end{align*}
In an ordered field, a product is negative exactly when its two factors have opposite signs, so
\begin{align*}
(x-s)(x+s)<0
\Longleftrightarrow -s<x<s.
\end{align*}
Thus $x^2<2$ defines $(-\sqrt{2},\sqrt{2})$.
For the second condition,
\begin{align*}
x^2-1\ne 0
&\Longleftrightarrow (x-1)(x+1)\ne 0\\
&\Longleftrightarrow x-1\ne 0\ \text{and}\ x+1\ne 0\\
&\Longleftrightarrow x\ne 1\ \text{and}\ x\ne -1.
\end{align*}
Since
\begin{align*}
1^2=1<2
\quad\text{and}\quad
(-1)^2=1<2,
\end{align*}
both $1$ and $-1$ lie inside $(-\sqrt{2},\sqrt{2})$. Therefore the formula defines
\begin{align*}
(-\sqrt{2},\sqrt{2})\setminus\{-1,1\}.
\end{align*}
This is a semi-algebraic one-variable set: it is obtained from polynomial inequalities and polynomial inequations, producing intervals with finitely many points removed rather than the finite-or-cofinite behaviour of algebraically closed fields.
[/example]
These three examples show the same general mechanism with different outputs. Quantifier elimination reduces definability to atomic data, but the atomic data depends strongly on the language: order relations produce cuts, algebraic equations produce finite root sets in one variable, and ordered polynomial inequalities produce semi-algebraic cells.
## First-Order Properties and Their Limits
The next question is which mathematical properties can be captured by first-order theories. Compactness often proves non-definability by showing that if a property were first-order, then an impossible limit structure would have to exist. This is the main negative lesson of the course: first-order logic controls finite patterns very well, but it cannot directly count infinite size, enforce standardness, or express many global finiteness conditions.
[example: Nonstandard Arithmetic from Compactness]
Work in $L=\{+,\cdot,0,1,<\}$, and let
\begin{align*}
T=\operatorname{Th}(\mathbb N,+,\cdot,0,1,<).
\end{align*}
Expand the language by a new constant symbol $c$, and consider
\begin{align*}
T^*=T\cup\{\bar n<c:n\in\mathbb N\},
\end{align*}
where $\bar n$ is the standard numeral for $n$.
To check finite satisfiability, let $\Delta\subseteq T^*$ be finite. Only finitely many of the new inequalities occur in $\Delta$, say
\begin{align*}
\bar n_1<c,\ \bar n_2<c,\ \dots,\ \bar n_r<c.
\end{align*}
If $r=0$, interpret $c$ as $0$. If $r>0$, let
\begin{align*}
m=\max\{n_1,\dots,n_r\}+1.
\end{align*}
Then for each $1\le i\le r$,
\begin{align*}
n_i\le \max\{n_1,\dots,n_r\}<\max\{n_1,\dots,n_r\}+1=m,
\end{align*}
so $\mathbb N\models \bar n_i<c$ when $c$ is interpreted as $m$. Every sentence of $T$ appearing in $\Delta$ is true in $\mathbb N$ by the definition of $T=\operatorname{Th}(\mathbb N,+,\cdot,0,1,<)$. Hence every finite $\Delta\subseteq T^*$ is satisfiable.
By the *Compactness Theorem*, there is an $L(c)$-structure $M^*$ satisfying all of $T^*$. Let $M$ be its $L$-reduct and let $a=c^{M^*}$. Since $M^*\models \bar n<c$ for every $n\in\mathbb N$, we have
\begin{align*}
M\models \bar n<a
\end{align*}
for every standard numeral $\bar n$. In particular, $a$ cannot be the interpretation of any standard numeral: if $a=\bar k^M$ for some $k\in\mathbb N$, then $M\models \bar k<\bar k$, contradicting the sentence $\forall x\,\neg(x<x)$, which belongs to $T$ because it is true in $\mathbb N$.
Thus compactness produces a model of true arithmetic containing an element greater than every standard natural number. This is the basic source of nonstandard elements in first-order arithmetic.
[/example]
This example explains why first-order Peano arithmetic has nonstandard models. The issue is not a weakness of the axioms alone; it is built into compactness for first-order logic.
A broader obstruction is now visible: first-order logic cannot always enforce a global size restriction when every finite fragment can be satisfied in some finite structure. The next result isolates this compactness phenomenon, showing why any theory with arbitrarily large finite models must also have an infinite model.
[quotetheorem:4331]
[citeproof:4331]
The point is not that first-order theories can never have well-controlled infinite models; later categoricity theorems explain when that can happen. The point is that compactness alone prevents a theory from saying that all of its models are finite while allowing arbitrarily large finite examples. This turns many attempted first-order descriptions of global finiteness into immediate compactness contradictions.
A more elementary and immediate limitation is that many natural properties are not first-order expressible in a class of structures.
[example: Deciding First-Order Expressibility in Algebra]
In the language of rings, the condition that a ring has no zero divisors is expressed by
\begin{align*}
\forall x\,\forall y\,((xy=0)\to(x=0\vee y=0)).
\end{align*}
Indeed, if $R$ satisfies this sentence and $a,b\in R$ satisfy $ab=0$, then substituting $a$ for $x$ and $b$ for $y$ gives $a=0$ or $b=0$. Conversely, if $R$ has no zero divisors and $ab=0$, then by the definition of having no zero divisors, one of $a,b$ is $0$, so the displayed sentence holds. Thus, together with the usual ring-theoretic background convention that integral domains are commutative rings with $0\ne 1$, being an integral domain is first-order.
Algebraic closedness is first-order as an axiom scheme. For each integer $n\ge 1$, include the sentence
\begin{align*}
\forall a_0\,\forall a_1\cdots\forall a_{n-1}\,\exists x\,
(x^n+a_{n-1}x^{n-1}+\cdots+a_1x+a_0=0).
\end{align*}
For fixed coefficients $a_0,\dots,a_{n-1}$, the polynomial
\begin{align*}
X^n+a_{n-1}X^{n-1}+\cdots+a_1X+a_0
\end{align*}
is monic of degree $n$, and the displayed sentence asserts that it has a root. Requiring this for every $n\ge 1$ says that every nonconstant monic polynomial has a root; dividing a nonzero polynomial by its leading coefficient reduces the general case to the monic case.
By contrast, finiteness is not first-order expressible by a theory whose models are exactly the finite rings. Suppose such a theory $T$ existed. Add, for each $n\ge 1$, the sentence
\begin{align*}
\exists x_1\cdots\exists x_n\bigwedge_{1\le i<j\le n}x_i\ne x_j,
\end{align*}
which says that there are at least $n$ distinct elements. Any finite subset of these new sentences only asks for at least $N$ elements for some finite $N$, and there is a finite ring with at least $N$ elements, for example $\mathbb Z/p\mathbb Z$ with a prime $p\ge N$. Hence every finite part is satisfiable with $T$. By the *Compactness Theorem*, the enlarged theory has a model satisfying $T$ and having at least $n$ elements for every $n$, so it is infinite. This contradicts the assumption that the models of $T$ are exactly the finite rings.
[/example]
The same compactness obstruction applies to any class that contains finite structures of unbounded size but is claimed to have only finite models. First-order theories can enforce each finite lower bound separately, but compactness then assembles all those bounds into a single infinite model. The next theorem packages this recurring argument as a general non-axiomatizability principle.
[quotetheorem:4332]
[citeproof:4332]
The assumption of arbitrarily large finite models is necessary: a theory with all models of size at most $n$ can enforce that bound by first-order sentences. What compactness rules out is an unbounded finite class with no infinite limit model. This is the standard template behind many non-expressibility results for finiteness and global finite combinatorial properties.
[example: Connectedness Is Not First-Order for Graphs]
Assume, for contradiction, that there is a first-order theory $T$ in the graph language $\{E\}$ whose models are exactly the connected graphs. Add two new constant symbols $c,d$. For each $n\ge 0$, let $\rho_n(c,d)$ be the sentence saying that there is no path of length at most $n$ from $c$ to $d$. Explicitly, $\rho_0(c,d)$ is $c\ne d$, and for $n\ge 1$ it says
\begin{align*}
\neg\exists x_0\cdots\exists x_n\,
\bigl(x_0=c\wedge x_n=d\wedge \bigwedge_{i=0}^{n-1}E(x_i,x_{i+1})\bigr).
\end{align*}
Consider the expanded theory
\begin{align*}
T^*=T\cup\{\rho_n(c,d):n\ge 0\}.
\end{align*}
Every finite subset of $T^*$ is satisfiable. Indeed, a finite subset mentions only
\begin{align*}
\rho_0(c,d),\rho_1(c,d),\dots,\rho_N(c,d)
\end{align*}
for some $N$. Take the finite path graph with vertices
\begin{align*}
0,1,\dots,N+2
\end{align*}
and edges exactly between consecutive vertices:
\begin{align*}
E(i,j)\quad\Longleftrightarrow\quad |i-j|=1.
\end{align*}
Interpret $c$ as $0$ and $d$ as $N+2$. This graph is connected because for any $i<j$ the sequence
\begin{align*}
i,i+1,i+2,\dots,j
\end{align*}
is a path from $i$ to $j$. Also, any path from $0$ to $N+2$ must move from $k$ to $k+1$ at least once for each $k=0,\dots,N+1$, so it has length at least $N+2$. Hence there is no path of length at most $N$ from $c$ to $d$, and the graph satisfies each $\rho_n(c,d)$ with $0\le n\le N$.
By the *Compactness Theorem*, $T^*$ has a model $G^*$. Let $G$ be its reduct to the language $\{E\}$, and let $a=c^{G^*}$ and $b=d^{G^*}$. Since $G^*\models T$, the assumed meaning of $T$ says that $G$ is connected. But since $G^*\models \rho_n(c,d)$ for every $n\ge 0$, there is no finite path of any length from $a$ to $b$. Thus $G$ is not connected, a contradiction. Therefore connectedness is not first-order axiomatizable uniformly for graphs; the obstruction is that first-order theories can impose each bounded path condition separately, but connectedness requires one unbounded reachability condition.
[/example]
## Elementary Chains and Controlled Unions
The next problem is how to build large structures from smaller elementary pieces without losing first-order information. Compactness builds models all at once, while elementary chains build them by stages. The key point is that unions of increasing elementary chains remain elementary over each stage.
[definition: Elementary Chain]
An elementary chain is a sequence or directed family $(M_i)_{i \in I}$ of $L$-structures such that $I$ is directed and, whenever $i \le j$,
\begin{align*}
M_i \preceq M_j.
\end{align*}
[/definition]
The point of the definition is that first-order formulas mention only finitely many parameters at a time. In a directed elementary chain, those parameters already lie together in some stage, where elementarity can be used before passing to the union. This finite-parameter feature is what makes the union behave like a genuine elementary extension of each earlier model.
[quotetheorem:4333]
[citeproof:4333]
The theorem depends on elementarity at each stage, not merely on substructure inclusions. A union of ordinary substructures may introduce new existential truths about old parameters without providing old witnesses. Elementary chains are therefore the controlled way to build large models by stages while preserving the first-order theory seen from every earlier stage.
[example: Building an Elementary Extension by a Chain]
Start with an infinite $L$-structure $M_0$. Having constructed $M_n$, apply *Elementary Diagram Compactness Construction* to the elementary diagram of $M_n$ together with a new constant required to be different from every named element of $M_n$. This gives a proper elementary extension
\begin{align*}
M_n \preceq M_{n+1}
\end{align*}
and an element
\begin{align*}
b_n\in M_{n+1}\setminus M_n.
\end{align*}
Since $M_n\preceq M_{n+1}$ for every $n$, transitivity of elementary substructure gives
\begin{align*}
M_i\preceq M_j
\end{align*}
whenever $i\le j$. Thus
\begin{align*}
M_0\preceq M_1\preceq M_2\preceq\cdots
\end{align*}
is an elementary chain.
Set
\begin{align*}
M_\omega=\bigcup_{n\in\mathbb N}M_n.
\end{align*}
By the *Elementary Chain Theorem*,
\begin{align*}
M_0\preceq M_\omega.
\end{align*}
The union contains a genuinely new element from each successor stage. Indeed, if $i<j$, then
\begin{align*}
b_i\in M_{i+1}\subseteq M_j
\end{align*}
because the chain is increasing, while
\begin{align*}
b_j\notin M_j
\end{align*}
by the choice of $b_j$. Hence $b_i\ne b_j$. Therefore the elements
\begin{align*}
b_0,b_1,b_2,\dots
\end{align*}
are pairwise distinct elements of $M_\omega$, each added at a later stage than the previous structure contained.
This shows how compactness constructs one proper elementary extension at a time, while the elementary chain theorem preserves elementarity after taking the countable union.
[/example]
Elementary chains are also the technical bridge to Skolem hulls, downward Löwenheim-Skolem arguments, and later saturation constructions. They allow model-building to be local at each stage while preserving the global elementary embedding at the end.
## Preparing for Types and Stability Without Using Them
The final question is what remains unresolved after compactness, diagrams, ultraproducts, and quantifier elimination. We have learned to construct models, compare theories, and classify definable sets in special examples. What we have not yet developed is a systematic language for describing possible elements over parameter sets or measuring the complexity of definable families.
[definition: Partial Type]
Let $T$ be an $L$-theory, let $M \models T$, and let $A \subset M$. A partial type over $A$ in variables $x$ is a set $p(x)$ of $L(A)$-formulas such that every finite subset of $p(x)$ is realized in some elementary extension of $M$.
[/definition]
This definition is included only as a signpost. The present course has already supplied the reason it matters: compactness says that finite satisfiability of formulas is the right existence condition, while elementary extensions provide the natural place where new realizations may live.
[example: A Cut Type in Dense Linear Orders]
Let $M=(\mathbb Q,<)$, and put
\begin{align*}
A=\{a\in\mathbb Q:a^2<2\}.
\end{align*}
Consider the set of formulas
\begin{align*}
p(x)=\{a<x:a\in A\}\cup\{x<b:b\in\mathbb Q,\ b>0,\ b^2>2\}.
\end{align*}
This says that $x$ lies above every rational whose square is less than $2$ and below every positive rational whose square is greater than $2$.
No rational realizes $p(x)$. If $r\in\mathbb Q$ and $r^2<2$, then $r\in A$, so the formula $r<x$ would require $r<r$. If $r^2=2$, write $r=m/n$ in lowest terms with $n\ne 0$. Then
\begin{align*}
m^2=2n^2.
\end{align*}
Thus $m^2$ is even, so $m$ is even; writing $m=2k$ gives
\begin{align*}
4k^2=2n^2,
\end{align*}
hence
\begin{align*}
n^2=2k^2,
\end{align*}
so $n$ is even, contradicting that $m/n$ was in lowest terms. Finally, if $r^2>2$ and $r>0$, then the formula $x<r$ belongs to $p(x)$ and would require $r<r$; if $r\le 0$, then $0\in A$ because $0^2=0<2$, so the formula $0<x$ would require $0<r$, impossible.
Every finite part of $p(x)$ is realized in $\mathbb Q$. A finite part mentions only finitely many lower bounds
\begin{align*}
a_1<x,\dots,a_m<x
\end{align*}
with $a_i^2<2$, and finitely many upper bounds
\begin{align*}
x<b_1,\dots,x<b_n
\end{align*}
with $b_j>0$ and $b_j^2>2$. For each pair $a_i,b_j$, if $a_i<0$, then $a_i<b_j$ because $b_j>0$. If $a_i\ge 0$, then
\begin{align*}
b_j^2-a_i^2>0
\end{align*}
and
\begin{align*}
b_j^2-a_i^2=(b_j-a_i)(b_j+a_i).
\end{align*}
Since $b_j+a_i>0$, it follows that $b_j-a_i>0$, hence $a_i<b_j$. Therefore every listed lower bound is below every listed upper bound. Let
\begin{align*}
\alpha=\max\{a_1,\dots,a_m\},
\qquad
\beta=\min\{b_1,\dots,b_n\}
\end{align*}
when both lists are nonempty. Then $\alpha<\beta$, and
\begin{align*}
\alpha<\frac{\alpha+\beta}{2}<\beta,
\end{align*}
so $q=(\alpha+\beta)/2\in\mathbb Q$ realizes that finite part. If one list is empty, choose a rational respectively below the least upper bound listed or above the greatest lower bound listed; if both are empty, choose $q=0$.
Thus $p(x)$ is finitely satisfiable in $(\mathbb Q,<)$ but not realized there. By compactness applied together with the elementary diagram of $(\mathbb Q,<)$, there is an elementary extension of $(\mathbb Q,<)$ containing an element realizing all formulas in $p(x)$. This element fills the irrational cut determined by $\sqrt 2$ without being a rational number itself.
[/example]
This example turns the earlier discussion of dense orders into a prototype for types. The missing element is not specified by a single formula, but by a whole finitely satisfiable family of inequalities. Compactness converts that approximate description into an actual element in a larger elementary environment.
[example: A Transcendental Type Over a Field]
Let $K$ be an algebraically closed field and let $A\subset K$ be a subfield. Consider the set of $L(A)$-formulas
\begin{align*}
p(x)=\{f(x)\ne 0: f\in A[X]\setminus\{0\}\}.
\end{align*}
An element realizing $p(x)$ is exactly an element that is not a root of any nonzero polynomial over $A$, hence is transcendental over $A$.
Every finite subset of $p(x)$ has the form
\begin{align*}
f_1(x)\ne 0,\dots,f_n(x)\ne 0
\end{align*}
with $f_1,\dots,f_n\in A[X]\setminus\{0\}$. Let $t$ be an indeterminate over $K$, and let $E$ be an algebraic closure of $K(t)$. Since $A\subset K$, each $f_i$ is also a polynomial in $K[X]$. If $f_i(t)=0$ in $K(t)$, then the nonzero polynomial $f_i(X)$ would vanish at the indeterminate $t$, contradicting the defining property of $t$ as transcendental over $K$. Thus
\begin{align*}
E\models f_i(t)\ne 0
\end{align*}
for each $1\le i\le n$.
Therefore every finite part of the transcendence requirements is satisfiable in an algebraically closed field extension of $K$. Applying compactness together with the elementary diagram of $K$ gives an elementary extension $K'\succeq K$ and an element $u\in K'$ such that
\begin{align*}
K'\models f(u)\ne 0
\end{align*}
for every nonzero $f\in A[X]$. The element $u$ realizes the type saying “transcendental over $A$”: compactness turns the infinitely many exclusions of algebraic roots over $A$ into one element in a larger elementary field.
[/example]
These examples preview the later shift from asking whether a whole theory has a model to asking how many possible elements or tuples can exist over a parameter set. Stability theory begins when this counting and organisation of types becomes the main object of study.
[remark: What Compactness Buys]
Compactness turns finite consistency into models, produces nonstandard elements, constructs elementary extensions, proves preservation phenomena, and explains why many algebraic theories have rich classes of models. It is the engine behind the diagram method and the semantic force behind many ultraproduct arguments.
[/remark]
The positive uses all share the same pattern: impose only finitely checkable requirements, verify each finite part inside some known model, and then pass to a global model. That pattern is powerful, but it also explains the limitations that follow. Anything depending on excluding all nonstandard or limiting behaviour is exactly where compactness resists the intended interpretation.
[remark: What Compactness Cannot Buy]
Compactness does not identify the intended infinite model, does not enforce standardness, does not express finiteness, and does not by itself classify the possible shapes of models of a theory. For classification, the next tools are types, saturation, indiscernibles, ranks, and stability-theoretic dividing lines.
[/remark]
The foundation course therefore ends with a useful tension. First-order logic is strong enough to formalise major parts of algebra and order, and compactness makes it flexible enough to build models with prescribed finite behaviour. The same flexibility prevents first-order theories from pinning down many intended infinite structures, which is exactly why the later subject studies the space of all models rather than only the axioms themselves.
Contents
- 1. First-Order Languages, Structures, and Satisfaction
- Languages and Formulas
- Structures and Maps Between Structures
- Assignments and Satisfaction
- Sentences, Theories, and Semantic Consequence
- Quantifier-Free Preservation Under Substructures
- 2. Theories, Complete Theories, and Elementary Equivalence
- Axiomatizable Classes and Complete Theories
- Elementary Embeddings and Elementary Substructures
- Elementary Diagrams and the Diagram Method
- 3. Filters, Ultrafilters, and Ultraproducts
- Measuring Large Subsets of an Index Set
- Quotienting Products By Large Agreement
- Interpreting The Language In A Quotient Product
- Reduced Products Versus Ultraproducts
- The Road To Los's Theorem
- 4. Łoś's Theorem and the Ultraproduct Proof of Compactness
- Why Truth in a Quotient Should Be Coordinatewise
- Where Existential Witnesses Enter
- Compactness from Ultraproducts
- First Consequences of Łoś and Compactness
- 5. Applications of Compactness
- Non-Definability from Compactness
- Building Models with Prescribed Finite Behavior
- Compactness and Algebraic Diagrams
- Preservation Warnings and Finite Fragments
- 6. Löwenheim–Skolem Theorems
- Building Small Elementary Substructures
- The Downward Löwenheim-Skolem Theorem
- Creating Large Models By Compactness
- Skolemization And Cardinality Control
- The Skolem Paradox
- 7. Quantifier Elimination: Method and First Examples
- Why Remove Quantifiers?
- Back-And-Forth Criteria
- Model Completeness And Complete Theories
- Dense Linear Orders Without Endpoints
- First Non-Examples And Comparisons
- 8. Algebraically Closed Fields
- The Theory of Algebraically Closed Fields
- Polynomial Equations and Constructible Sets
- Quantifier Elimination for Algebraically Closed Fields
- Completeness in Fixed Characteristic
- Transcendence Degree and Sentence-Level Behaviour
- 9. Real Closed Fields and Ordered Algebra
- Ordered Fields and Real Closed Fields
- Quantifier Elimination for Real Closed Fields
- Semialgebraic Sets and Projections
- Completeness and Model Completeness
- Consequences for Real Algebraic Geometry
- 10. Model Completeness and Robinson's Test
- Embeddings Between Models
- Existential Preservation
- Robinson's Test
- Quantifier Elimination and Model Completeness
- 11. Synthesis: What Compactness Buys and What It Cannot Buy
- Compactness, Ultraproducts, and Elementary Equivalence as a Toolkit
- Comparing Dense Orders, Algebraically Closed Fields, and Real Closed Fields
- First-Order Properties and Their Limits
- Elementary Chains and Controlled Unions
- Preparing for Types and Stability Without Using Them
Model Theory I: Foundations
Content
Problems
History
Created by admin on 5/29/2026 | Last updated on 6/1/2026
Prerequisites (0/2 completed)
Log in to track your prerequisite progress.
Prerequisites Graph
Interactive dependency map showing prerequisite concepts
Loading dependency graph...
Theorem
Definition
Current
Requires
Rate this page
★
★
★
★
★
Poor
Excellent