[proofplan]
The strategy is to define $f: X \to Y$ by transfinite recursion, sending each $x \in X$ to the minimum of $Y$ that has not yet been used. The recursion may succeed at every $x \in X$, in which case $f$ is an order-preserving injection of $X$ into $Y$ whose image is an initial segment, giving $X \leq Y$. Or the recursion may stall at some first $x \in X$ where the target set $Y \setminus \{f(y) : y < x\}$ is empty; in that case $f$ restricted to $I_x$ is an order isomorphism onto $Y$, and $I_x$ is a well-ordering (initial segment of $X$), so $Y \leq X$.
[/proofplan]
[step:Set up the recursion that picks the least unused element of $Y$]
Let $(X, <_X)$ and $(Y, <_Y)$ be well-orderings. Adjoin a formal top element $\infty \notin Y$ and define a function
\begin{align*}
G: \mathcal{P}(X \times Y) &\to Y \cup \{\infty\} \\
h &\mapsto \begin{cases} \min_Y(Y \setminus \operatorname{range}(h)) & \text{if } Y \setminus \operatorname{range}(h) \neq \varnothing, \\ \infty & \text{otherwise.} \end{cases}
\end{align*}
Here $\operatorname{range}(h)$ denotes the image of the second projection of $h$ when $h$ is a function, and is the empty set when $h$ is not a function or is empty. The minimum in the first branch exists because $Y \setminus \operatorname{range}(h)$ is a non-empty subset of the well-ordering $Y$.
By the [Definition by Transfinite Recursion](/theorems/1464) applied to the well-ordering $X$ and target $Y \cup \{\infty\}$, there is a unique function $f: X \to Y \cup \{\infty\}$ such that for every $x \in X$,
\begin{align*}
f(x) = G(f|_{I_x}) = \begin{cases} \min_Y(Y \setminus \{f(y) : y < x\}) & \text{if this set is non-empty,} \\ \infty & \text{otherwise,} \end{cases}
\end{align*}
where $I_x = \{y \in X : y <_X x\}$ and $f|_{I_x}$ denotes the restriction of $f$ to $I_x$.
[guided]
The intuitive picture is a greedy matching: walk along $X$ in increasing order; at each step $x$, match $x$ with the least element of $Y$ that has not already been used. Either we exhaust $X$ without running out of $Y$ (in which case $X$ was no longer than $Y$), or we exhaust $Y$ first (in which case $Y$ was shorter).
To make "greedy matching along a well-ordering" rigorous we invoke the [Definition by Transfinite Recursion](/theorems/1464). That theorem requires: a well-ordering $X$, a target set $Y'$, and a function $G: \mathcal{P}(X \times Y') \to Y'$. It outputs a unique $f: X \to Y'$ satisfying $f(x) = G(f|_{I_x})$ for all $x$.
The technical nuisance is that our natural recipe
\begin{align*}
f(x) = \min_Y(Y \setminus \{f(y) : y < x\})
\end{align*}
only makes sense while $Y \setminus \{f(y) : y < x\}$ is non-empty. We need a target set that accommodates both outcomes. The clean fix is to enlarge the target by a formal "overflow" symbol $\infty \notin Y$ and let the recursion produce $\infty$ whenever the greedy set is empty.
Define
\begin{align*}
G: \mathcal{P}(X \times Y) &\to Y \cup \{\infty\} \\
h &\mapsto \begin{cases} \min_Y(Y \setminus \operatorname{range}(h)) & \text{if } Y \setminus \operatorname{range}(h) \neq \varnothing, \\ \infty & \text{otherwise.} \end{cases}
\end{align*}
The minimum in the first branch exists because $Y$ is a well-ordering and the set is a non-empty subset. The transfinite recursion theorem now produces a unique $f: X \to Y \cup \{\infty\}$ with
\begin{align*}
f(x) = \begin{cases} \min_Y(Y \setminus \{f(y) : y < x\}) & \text{if non-empty,} \\ \infty & \text{otherwise.} \end{cases}
\end{align*}
Two outcomes are possible: either $f$ never takes the value $\infty$ (the recursion "succeeds"), or there is a first $x$ at which $f(x) = \infty$ (the recursion "stalls").
[/guided]
[/step]
[step:Prove that $f$ is strictly increasing on $\{x : f(x) \in Y\}$]
Let $A := \{x \in X : f(x) \in Y\}$ (i.e., the $x$ where the recursion has not yet produced $\infty$). We claim $f|_A$ is strictly $<_Y$-increasing.
Let $x_1, x_2 \in A$ with $x_1 <_X x_2$. By definition,
\begin{align*}
f(x_2) = \min\nolimits_Y(Y \setminus \{f(y) : y <_X x_2\}),
\end{align*}
so $f(x_2) \notin \{f(y) : y <_X x_2\}$. In particular $f(x_2) \neq f(x_1)$.
It remains to rule out $f(x_2) <_Y f(x_1)$. Note that
\begin{align*}
\{f(y) : y <_X x_1\} \subseteq \{f(y) : y <_X x_2\},
\end{align*}
and taking complements in $Y$,
\begin{align*}
Y \setminus \{f(y) : y <_X x_2\} \subseteq Y \setminus \{f(y) : y <_X x_1\}.
\end{align*}
Both sets are non-empty (the first contains $f(x_2)$ by hypothesis $x_2 \in A$; the second contains $f(x_1)$ because $x_1 \in A$). The minimum of a subset is $\geq_Y$ the minimum of the superset, so
\begin{align*}
f(x_1) = \min\nolimits_Y(Y \setminus \{f(y) : y <_X x_1\}) \leq_Y \min\nolimits_Y(Y \setminus \{f(y) : y <_X x_2\}) = f(x_2).
\end{align*}
Combined with $f(x_1) \neq f(x_2)$, this gives $f(x_1) <_Y f(x_2)$.
[/step]
[step:Show that $A = \{x : f(x) \in Y\}$ is an initial segment of $X$]
If $x \in A$ and $x' <_X x$, we must show $x' \in A$, i.e., $f(x') \in Y$. Suppose for contradiction $f(x') = \infty$. Then $Y \setminus \{f(y) : y <_X x'\} = \varnothing$, which means
\begin{align*}
\{f(y) : y <_X x'\} \supseteq Y.
\end{align*}
But $\{f(y) : y <_X x'\} \subseteq \{f(y) : y <_X x\}$ (more elements in the second set), so
\begin{align*}
\{f(y) : y <_X x\} \supseteq Y,
\end{align*}
which forces $Y \setminus \{f(y) : y <_X x\} = \varnothing$ and hence $f(x) = \infty$, contradicting $x \in A$. Therefore $f(x') \in Y$, i.e., $x' \in A$.
Thus $A$ is downward closed in $X$, hence an initial segment.
[/step]
[step:Analyse the two outcomes: $A = X$ gives $X \leq Y$]
Suppose first that $A = X$, so $f: X \to Y$ is totally defined (never takes the value $\infty$).
*Injectivity.* By the second step, $f$ is strictly $<_Y$-increasing on $X$, hence injective.
*Image is an initial segment of $Y$.* Let $R := \operatorname{range}(f) = \{f(x) : x \in X\}$. We show $R$ is downward closed in $Y$. Fix $b \in R$, say $b = f(x_0)$, and let $c <_Y b$. We must exhibit $c = f(x)$ for some $x \in X$.
Consider the set $T := \{x \in X : c \in \{f(y) : y <_X x\}\}$. If $T$ is non-empty, take $x_1 := \min_X T$. Then $c \in \{f(y) : y <_X x_1\}$, so $c = f(y)$ for some $y <_X x_1$, and in particular $c \in R$, as required.
If $T = \varnothing$, then for every $x \in X$, $c \notin \{f(y) : y <_X x\}$. In particular $c \notin \{f(y) : y <_X x_0\}$, so $c \in Y \setminus \{f(y) : y <_X x_0\}$, whose minimum is $f(x_0) = b$. This forces $b \leq_Y c$, contradicting $c <_Y b$.
Hence $R$ is an initial segment of $Y$, and $f: X \to R$ is an order-preserving bijection. By the definition of the comparison order on well-orderings, $X \leq Y$.
[guided]
We have $A = X$, so the recursion ran to completion and $f: X \to Y$.
*Injectivity* is immediate from strict monotonicity (second step).
For the image being an initial segment, the subtle point is: why must every element of $Y$ below some $f(x_0)$ itself appear in the range of $f$? Suppose $c <_Y f(x_0) = b$. The greedy rule at step $x_0$ picked $f(x_0) = \min_Y(Y \setminus \{f(y) : y <_X x_0\})$. If $c$ were not in the excluded set $\{f(y) : y <_X x_0\}$, then $c$ would be a smaller available candidate, contradicting minimality of $f(x_0)$. Hence $c = f(y)$ for some $y <_X x_0$.
Formally: consider $T = \{x \in X : c \in \{f(y) : y <_X x\}\}$ — the $x$ at which $c$ "has already been used". If $T$ is non-empty, let $x_1 = \min_X T$; by definition of $T$, there is $y <_X x_1$ with $f(y) = c$, so $c \in R$. If $T = \varnothing$, then $c$ was never used before $x_0$, and the greedy minimum rule forces $f(x_0) \leq_Y c$, contradicting $c <_Y b = f(x_0)$.
So the image $R$ of $f$ is an initial segment of $Y$. Combined with strict monotonicity, $f: X \to R$ is an order-isomorphism, which is exactly the definition $X \leq Y$.
[/guided]
[/step]
[step:Analyse the two outcomes: $A \neq X$ gives $Y \leq X$]
Now suppose $A \neq X$. Since $A$ is an initial segment of $X$ (third step) and $X \setminus A$ is non-empty, by the [characterisation of initial segments](/theorems/1462), either $A = X$ or $A = I_{x^*}$ for the unique $x^* = \min_X(X \setminus A)$. The first is excluded, so
\begin{align*}
A = I_{x^*} = \{y \in X : y <_X x^*\}.
\end{align*}
By definition of $A$, $f(x^*) = \infty$, which means $Y \setminus \{f(y) : y <_X x^*\} = \varnothing$, i.e.,
\begin{align*}
Y = \{f(y) : y \in I_{x^*}\} = \operatorname{range}(f|_{I_{x^*}}).
\end{align*}
Thus $f|_{I_{x^*}}: I_{x^*} \to Y$ is surjective. By the second step, $f|_{I_{x^*}}$ is strictly increasing, hence injective, and order-preserving. Therefore $f|_{I_{x^*}}: I_{x^*} \to Y$ is an order-preserving bijection — an order-isomorphism.
Inverting, $f|_{I_{x^*}}^{-1}: Y \to I_{x^*} \subseteq X$ is an order-preserving bijection onto the initial segment $I_{x^*}$ of $X$. By definition of the comparison order, $Y \leq X$.
[/step]
[step:Combine the two cases]
In the first case ($A = X$) we established $X \leq Y$; in the second case ($A \neq X$) we established $Y \leq X$. These two cases are exhaustive — either $f$ never takes the value $\infty$, or it does at some point — so one of $X \leq Y$, $Y \leq X$ holds, which is the claim.
[/step]