[proofplan]
This proof is a structured sketch of the classical argument; the fine technical points (existence of triangulations and the combinatorial reductions) are cited to named results rather than carried out in full, as each would require a separate multi-page proof. The argument has three movements. First, every compact surface admits a triangulation (Radó's theorem), which encodes it as a finite polygon with edges identified in pairs. Second, a finite sequence of combinatorial moves — cutting, pasting, and permuting edges — reduces every such polygon to one of the normal forms $a_1 b_1 a_1^{-1} b_1^{-1} \cdots a_g b_g a_g^{-1} b_g^{-1}$ (orientable, genus $g$) or $a_1 a_1 a_2 a_2 \cdots a_n a_n$ (non-orientable, genus $n$). Third, distinct normal forms yield non-homeomorphic surfaces because the Euler characteristic (and, separately, orientability) is a topological invariant that takes different values on different normal forms.
[/proofplan]
[step:Reduce to polygons with edge identifications via triangulation]
Let $S$ be a compact surface, i.e., a compact Hausdorff space in which every point has a neighbourhood homeomorphic to an open subset of $\mathbb{R}^2$.
By [Radó's theorem](/theorems/???), every compact surface admits a **triangulation**: a finite simplicial complex $K$ together with a homeomorphism $|K| \to S$ such that the image of each $2$-simplex is a topological triangle in $S$. In particular, $K$ has finitely many $2$-simplices $T_1, \ldots, T_F$, finitely many $1$-simplices (edges), and finitely many $0$-simplices (vertices).
Using the triangulation, we can represent $S$ as a **polygonal presentation**: choose a spanning tree in the dual graph of $K$, and open up the triangulation along the remaining edges. This unfolds $S$ into a single $2m$-gon $P$ in the plane, with a pairing of its boundary edges: each edge on $\partial P$ is paired with exactly one other edge, and identifying the paired edges (with appropriate orientations) via homeomorphisms reconstructs $S$. Formally, $S$ is homeomorphic to the quotient $P / \sim$, where $\sim$ is the equivalence relation generated by the edge identifications.
A polygonal presentation is encoded by an **edge word** $w$ over an alphabet $\{a_1, a_1^{-1}, a_2, a_2^{-1}, \ldots\}$: reading the boundary of $P$ counter-clockwise, each edge is labelled by a letter, with $a_i$ and $a_i^{-1}$ denoting the two edges of the $i$-th identification pair, and the sign indicating whether the identification is orientation-preserving ($a_i a_i^{-1}$) or orientation-reversing ($a_i a_i$).
[/step]
[step:Reduce the edge word to normal form via elementary moves]
Three elementary operations on edge words preserve the underlying surface up to homeomorphism:
1. **Cancellation of adjacent inverse pairs** $\cdots a a^{-1} \cdots \to \cdots \cdots$ (provided the polygon has at least 4 edges after cancellation, or equivalently, the result is non-empty).
2. **Cutting and pasting** along a diagonal: if $w = w_1 w_2$ (product of two subwords), introduce a new letter $c$, cut the polygon along the diagonal labelled $c$, and re-glue to produce an equivalent presentation $w_1 c$ and $w_2 c^{-1}$, which can be recombined as $w_1 c c^{-1} w_2$ or as $w_2 c^{-1} w_1 c$ etc.
3. **Vertex identification**: after a sequence of moves, all vertices of $P$ can be brought to a single equivalence class under $\sim$.
By [the Classification Reduction Algorithm](/theorems/???) — the combinatorial heart of the classification, due in essence to Brahana — every edge word can be transformed by a finite sequence of these moves into one of the following **normal forms**:
- **Orientable normal form**: $a_1 b_1 a_1^{-1} b_1^{-1} a_2 b_2 a_2^{-1} b_2^{-1} \cdots a_g b_g a_g^{-1} b_g^{-1}$, for some $g \geq 0$, or
- **Non-orientable normal form**: $a_1 a_1 a_2 a_2 \cdots a_n a_n$, for some $n \geq 1$, or
- **Trivial form**: $a a^{-1}$, corresponding to $S^2$.
The trichotomy is forced because mixed forms (containing both orientable-type pairs $a a^{-1}$ and non-orientable pairs $a a$) reduce to purely non-orientable forms via the identity $a b a b^{-1} = a a c c$ (a classical combinatorial identity in the edge-word calculus).
The surface corresponding to the orientable normal form of length $4g$ is by definition $\Sigma_g$, the orientable surface of genus $g$; the surface corresponding to the non-orientable form of length $2n$ is by definition $E_n$, the non-orientable surface of genus $n$. The trivial form $a a^{-1}$ gives $\Sigma_0 = S^2$.
[guided]
The classification reduction algorithm is the technical core of the theorem. We outline the main ideas but refer to a textbook (e.g., Massey, *Algebraic Topology: An Introduction*, Chapter 1) for the complete case analysis, which runs to several pages of combinatorial bookkeeping.
**Step A: Bring all vertices to a single equivalence class.** Starting from the polygon $P$, the vertices of $\partial P$ are grouped into equivalence classes by the identifications. If there is more than one class, one can find a vertex $v$ in a class $[v]$ of size $\geq 2$, and a cutting/pasting move that reduces the size of $[v]$ by merging it with a neighbouring class. Iterating, all vertices become equivalent.
**Step B: Make pairs of the form $a \cdots a$ adjacent.** If the edge word contains a pair of like-oriented occurrences of the same letter (an $a a$ pair, non-orientable type) separated by other letters, a cutting-and-pasting move brings them adjacent: $w_1 a w_2 a w_3 \to w_1 a a (w_2^{-1} w_3)$ after suitable re-labelling.
**Step C: Group inverse pairs $a \cdots a^{-1} \cdots b \cdots b^{-1}$ into commutator blocks.** The key identity is: if the word contains two "linked" inverse pairs — $\cdots a \cdots b \cdots a^{-1} \cdots b^{-1} \cdots$ — then a sequence of cutting/pasting moves brings them into the commutator form $a b a^{-1} b^{-1}$ as a consecutive block.
**Step D: Resolve the mixed case.** If after steps A–C the word contains both a non-orientable block ($a a$) and an orientable commutator block ($b c b^{-1} c^{-1}$), the identity $a a b c b^{-1} c^{-1} = a a b b c c$ (provable by a direct cutting/pasting calculation) converts the commutator block into two non-orientable blocks. So any mixed word reduces to purely non-orientable form.
**Step E: Cancel adjacent inverse pairs.** Any remaining $a a^{-1}$ outside a commutator block (which would only occur in degenerate configurations) can be cancelled, possibly after reaching the trivial form $a a^{-1}$ for $S^2$.
The output is: either no non-orientable blocks (pure orientable normal form of length $4g$, giving $\Sigma_g$), or all non-orientable blocks (pure non-orientable normal form of length $2n$, giving $E_n$), or the trivial word (giving $S^2 = \Sigma_0$).
[/guided]
[/step]
[step:Verify that distinct normal forms give non-homeomorphic surfaces]
We must check that the list $\{\Sigma_g\}_{g \geq 0} \cup \{E_n\}_{n \geq 1}$ contains no repetitions. Two invariants suffice.
**Orientability.** Orientability of a surface is a topological invariant — it is preserved under homeomorphism by [invariance of orientability](/theorems/???). Direct inspection of the polygonal presentations:
- The orientable normal form $\prod_{i=1}^g [a_i, b_i] := a_i b_i a_i^{-1} b_i^{-1}$ uses each letter once with each sign, so consistent orientation of the faces extends across all edge identifications. Hence $\Sigma_g$ is orientable.
- The non-orientable normal form $a_1 a_1 \cdots a_n a_n$ uses each letter twice with the same sign, forcing a reversal of orientation across each paired edge. Hence $E_n$ is non-orientable.
Therefore no $\Sigma_g$ is homeomorphic to any $E_n$.
**Euler characteristic.** The Euler characteristic $\chi(S) := V - E + F$ of a triangulated surface is a topological invariant by [topological invariance of the Euler characteristic](/theorems/???). Computed from the normal-form polygonal presentation:
- For $\Sigma_g$: the polygon has one face ($F = 1$), $4g$ edges on its boundary identified in $2g$ pairs ($E = 2g$), and all $4g$ vertices identified to a single point ($V = 1$, by construction of the normal form). Thus
\begin{align*}
\chi(\Sigma_g) = 1 - 2g + 1 = 2 - 2g.
\end{align*}
- For $E_n$: the polygon has $F = 1$, $2n$ edges identified in $n$ pairs ($E = n$), and all $2n$ vertices identified to one ($V = 1$). Thus
\begin{align*}
\chi(E_n) = 1 - n + 1 = 2 - n.
\end{align*}
Since $\chi$ is injective on the set $\{\Sigma_g : g \geq 0\}$ — distinct values $g$ give distinct $\chi = 2 - 2g$ — no two $\Sigma_g$ are homeomorphic. Similarly $\chi$ is injective on $\{E_n : n \geq 1\}$. Combined with the orientability distinction of the previous paragraph, the full list contains no repetitions.
[/step]
[step:Assemble the classification]
By Step 1, every compact surface $S$ has a polygonal presentation. By Step 2, its edge word reduces to one of the normal forms, so $S$ is homeomorphic to some $\Sigma_g$ ($g \geq 0$) or some $E_n$ ($n \geq 1$). By Step 3, this surface is unique: different normal forms yield non-homeomorphic surfaces, distinguished by orientability and Euler characteristic. This is exactly the statement of the theorem.
[/step]