[proofplan]
Let $f, g: X \to Y$ be two order isomorphisms between well-orderings. We show $f = g$ by transfinite induction on $X$. Fix $x \in X$ and assume $f(y) = g(y)$ for every $y < x$. Let $S := \{f(y) : y < x\}$; because $f$ and $g$ agree strictly below $x$, this also equals $\{g(y) : y < x\}$. We use the rigidity of well-orderings to identify $f(x)$ and $g(x)$ as the **same** canonical element, namely $\min(Y \setminus S)$ — the least element of the complement, which exists because $Y$ is well-ordered. Order-preservation and bijectivity of $f$ force $f(x) = \min(Y \setminus S)$, and the identical argument applies to $g$. The transfinite induction then closes.
[/proofplan]
[step:Set up the transfinite induction on $X$]
Suppose $f: X \to Y$ and $g: X \to Y$ are two order isomorphisms between well-orderings $(X, <_X)$ and $(Y, <_Y)$. We prove $f(x) = g(x)$ for all $x \in X$ by [transfinite induction](/theorems/1463) on $X$. Writing
\begin{align*}
\Phi(x) &: \quad f(x) = g(x),
\end{align*}
the inductive principle over the well-ordering $(X, <_X)$ states that if
\begin{align*}
\bigl(\forall y <_X x,\ \Phi(y)\bigr) \implies \Phi(x)
\end{align*}
holds for every $x \in X$, then $\Phi(x)$ holds for every $x \in X$. There is no separate base case: when $x$ is the least element of $X$ (if one exists), the hypothesis "$\forall y <_X x,\ \Phi(y)$" is vacuously true.
Fix $x \in X$ and assume $\Phi(y)$ for every $y <_X x$, i.e.\ $f(y) = g(y)$ whenever $y <_X x$. We must show $f(x) = g(x)$.
[guided]
Transfinite induction is the natural tool when the underlying index set is well-ordered but not necessarily of order type $\omega$. Unlike ordinary induction on $\mathbb{N}$, there is no distinguished "base case" because $X$ may have no minimum (e.g.\ $X = \omega + \omega$ restricted above $\omega$). Instead, transfinite induction on a well-ordering $(X, <_X)$ uses a single unified clause:
\begin{align*}
\bigl(\forall y <_X x,\ \Phi(y)\bigr) \implies \Phi(x) \qquad \text{for every } x \in X.
\end{align*}
When $x$ is the minimum of $X$ (if one exists), the premise is vacuous and reduces to $\Phi(x)$ directly — so the minimum must be checked explicitly if the argument requires it. When $x$ is a limit or successor, the premise packages the inductive hypothesis on all strictly smaller indices. The conclusion is that $\Phi(x)$ holds for every $x \in X$, proved by invoking well-orderedness: if $\Phi$ failed somewhere, the set $\{x \in X : \neg \Phi(x)\}$ would be a non-empty subset of $X$ and would have a least element $x_0$, but the inductive implication applied at $x_0$ would force $\Phi(x_0)$ — contradiction.
For our theorem, we set $\Phi(x) : f(x) = g(x)$. Fix $x \in X$ and assume $\Phi(y)$ for all $y <_X x$, i.e.\ $f(y) = g(y)$ for every $y$ strictly below $x$. The goal is to deduce $f(x) = g(x)$.
[/guided]
[/step]
[step:Introduce the common image $S$ of the initial segment below $x$]
Define
\begin{align*}
S &:= \{f(y) : y \in X,\ y <_X x\} \subseteq Y.
\end{align*}
By the inductive hypothesis $f(y) = g(y)$ for every $y <_X x$, so $\{g(y) : y <_X x\} = S$ as well. The set $S$ is therefore the common image, under either $f$ or $g$, of the initial segment $\{y \in X : y <_X x\}$.
We record two properties of $S$ that will be used below:
1. $f(x) \notin S$ and $g(x) \notin S$. Indeed, if $f(x) = f(y)$ for some $y <_X x$, injectivity of $f$ would force $x = y$, contradicting $y <_X x$. The same argument, using injectivity of $g$, shows $g(x) \notin S$.
2. Every element of $S$ is strictly less than $f(x)$ (and strictly less than $g(x)$). For if $s \in S$, then $s = f(y)$ for some $y <_X x$, and order-preservation of $f$ gives $f(y) <_Y f(x)$. Since $f(y) = g(y)$, the same reasoning applied to $g$ gives $g(y) <_Y g(x)$.
[guided]
Fix $x \in X$ and consider the image of all "earlier" elements:
\begin{align*}
S &:= \{f(y) : y \in X,\ y <_X x\}.
\end{align*}
The inductive hypothesis $f(y) = g(y)$ for all $y <_X x$ says that replacing $f$ by $g$ in the defining formula leaves $S$ unchanged:
\begin{align*}
S &= \{g(y) : y \in X,\ y <_X x\}.
\end{align*}
This is the critical observation enabled by the inductive hypothesis: **$S$ is simultaneously the image under $f$ of the initial segment below $x$, and the image under $g$ of the same initial segment**. Now we will characterize $f(x)$ and $g(x)$ in terms of $S$ and see they must coincide.
**Property (1): $f(x), g(x) \notin S$.** Suppose for contradiction that $f(x) \in S$. Then $f(x) = f(y)$ for some $y <_X x$. Since $f$ is injective (being an isomorphism, it is bijective), $x = y$, contradicting $y <_X x$. Identically, $g(x) \notin S$.
**Property (2): every $s \in S$ satisfies $s <_Y f(x)$ and $s <_Y g(x)$.** Let $s \in S$, so $s = f(y)$ for some $y <_X x$. Because $f$ is an order isomorphism, $y <_X x$ gives $f(y) <_Y f(x)$, i.e.\ $s <_Y f(x)$. Similarly, using $s = g(y)$ and that $g$ is an order isomorphism, $g(y) <_Y g(x)$, i.e.\ $s <_Y g(x)$.
These two properties say that $f(x)$ is an upper bound of $S$ not belonging to $S$, and the same for $g(x)$. The next step identifies both as the **least** such upper bound, which pins them down uniquely.
[/guided]
[/step]
[step:Identify $f(x)$ as the minimum of $Y \setminus S$]
We claim $f(x) = \min(Y \setminus S)$, where the minimum exists because:
- $Y \setminus S$ is non-empty (it contains $f(x)$ by Property 1 of Step 2);
- $Y$ is well-ordered, so every non-empty subset has a least element.
Let $a := \min(Y \setminus S)$. By Property 1, $f(x) \in Y \setminus S$, so $a \le_Y f(x)$ (with equality meaning $a = f(x)$).
Suppose for contradiction that $a <_Y f(x)$. Since $f: X \to Y$ is a bijection, there exists a unique $z \in X$ with $f(z) = a$. Because $f$ is an order isomorphism, $f(z) <_Y f(x)$ implies $z <_X x$. But then $a = f(z) \in S$ by the definition of $S$, contradicting $a \in Y \setminus S$. Hence $a = f(x)$, as claimed.
[guided]
The set $Y \setminus S$ is the "locus" in $Y$ of elements not yet hit by the initial segment. Since $Y$ is well-ordered and $Y \setminus S$ is non-empty (it contains $f(x) \notin S$), we may form
\begin{align*}
a &:= \min(Y \setminus S) \in Y.
\end{align*}
We now show $a = f(x)$. We have $f(x) \in Y \setminus S$ by Step 2 Property 1, and $a$ is the minimum of this set, so $a \le_Y f(x)$. Two cases.
**Case A: $a = f(x)$.** We are done.
**Case B: $a <_Y f(x)$.** We derive a contradiction. Since $f$ is surjective, there is a unique $z \in X$ with $f(z) = a$ (uniqueness by injectivity of $f$). Applying the **inverse** of the order isomorphism $f$ to the inequality $f(z) = a <_Y f(x)$ gives $z <_X x$ — this uses that $f$ is order-preserving in both directions, a standard property of order isomorphisms. But then $z <_X x$ places $a = f(z)$ into $S$, contradicting $a \in Y \setminus S$.
So Case B is impossible, and $a = f(x)$.
**Intuition.** $f(x)$ is the first element of $Y$ not yet hit by the initial segment under $f$. The order isomorphism must map the next element of $X$ to the next element of $Y$ — "next" meaning "least of what remains". This is the rigidity of order isomorphisms between well-orderings.
[/guided]
[/step]
[step:Identify $g(x)$ as the same minimum and close the induction]
The argument of Step 3 used no properties of $f$ beyond:
- $f$ is an order isomorphism $X \to Y$,
- the image of the initial segment $\{y : y <_X x\}$ under $f$ is $S$.
The same two properties hold with $f$ replaced by $g$: $g$ is an order isomorphism by hypothesis, and the inductive hypothesis $f = g$ on $\{y : y <_X x\}$ makes the image of that initial segment under $g$ equal to $S$. Substituting $g$ for $f$ throughout Step 3 gives
\begin{align*}
g(x) &= \min(Y \setminus S).
\end{align*}
Combining with Step 3,
\begin{align*}
f(x) &= \min(Y \setminus S) = g(x).
\end{align*}
This establishes $\Phi(x)$. By transfinite induction, $f(x) = g(x)$ for every $x \in X$, hence $f = g$.
[guided]
At this point we have shown $f(x) = \min(Y \setminus S)$. The goal was to establish $f(x) = g(x)$, and the cleanest route is to show that $g(x)$ satisfies the same characterization.
Let us examine Step 3 carefully. The argument there used:
- **(a)** $f$ is an order isomorphism;
- **(b)** $S = \{f(y) : y <_X x\}$;
- **(c)** $f(x) \notin S$ (a consequence of (a) and (b), via injectivity and the definition of $S$).
Each of these has an analogue for $g$:
- **(a')** $g$ is an order isomorphism — by hypothesis of the theorem;
- **(b')** $S = \{g(y) : y <_X x\}$ — by the inductive hypothesis $f(y) = g(y)$ for $y <_X x$;
- **(c')** $g(x) \notin S$ — by injectivity of $g$ and (b'), exactly as in Step 2.
Substituting $g$ for $f$ in Step 3's argument yields:
\begin{align*}
g(x) &= \min(Y \setminus S).
\end{align*}
Combining with Step 3's conclusion for $f$:
\begin{align*}
f(x) &= \min(Y \setminus S) = g(x).
\end{align*}
The crucial point is that the element $\min(Y \setminus S)$ depends only on $Y$, the well-ordering $<_Y$, and $S$ — **not** on the particular map ($f$ or $g$) used to present $S$. Since $f$ and $g$ present the **same** $S$ (by the inductive hypothesis), they must produce the same value at $x$.
This completes the inductive step. By [transfinite induction](/theorems/1463) over the well-ordering $(X, <_X)$, $f(x) = g(x)$ for every $x \in X$. Since $f$ and $g$ agree pointwise on $X$, they are equal as maps.
**Why well-orderedness is essential.** Without well-orderedness, the argument fails at Step 3: the set $Y \setminus S$ may be non-empty but have no minimum, so we cannot canonically pick out $f(x)$. Indeed, the theorem is false for arbitrary total orders: consider $X = Y = \mathbb{Z}$ with the usual order; the shift $f(n) = n + 1$ and the identity $g(n) = n$ are both **self-maps**, but they are distinct order automorphisms of $\mathbb{Z}$. (In fact, $\mathbb{Z}$ has an infinite automorphism group consisting of all shifts.) The well-ordering hypothesis eliminates exactly this ambiguity by guaranteeing that every non-empty subset of $Y$ has a canonical first element.
[/guided]
[/step]