[proofplan]
We prove the stronger statement that $|\operatorname{Hom}_K(L, E)| \leq [L : K]$ for every field extension $E/K$, by induction on $[L : K]$ (with the base field allowed to vary). The bound on $|\operatorname{Aut}_K(L)|$ follows immediately since $\operatorname{Aut}_K(L) \subset \operatorname{Hom}_K(L, L)$. In the inductive step, we pick an element $\alpha \in L \setminus K$, set $M := K(\alpha)$, and decompose each $K$-homomorphism $L \to E$ by first restricting it to $M$. The restriction lies in $\operatorname{Hom}_K(M, E)$, which has at most $\deg P_\alpha = [M : K]$ elements by the [Roots-Homomorphisms Correspondence](/theorems/1256). For each restriction $\tau$, the number of extensions of $\tau$ to all of $L$ is at most $[L : M]$ by the inductive hypothesis. The total count is therefore at most $[M : K] \cdot [L : M] = [L : K]$ by the [Tower Law](/theorems/1248).
[/proofplan]
[step:Reduce to bounding $|\operatorname{Hom}_K(L, E)|$ for an arbitrary field extension $E/K$]
Every $K$-automorphism of $L$ is in particular a $K$-homomorphism from $L$ to $L$, so
\begin{align*}
\operatorname{Aut}_K(L) \subset \operatorname{Hom}_K(L, L).
\end{align*}
It therefore suffices to prove the stronger inequality
\begin{align*}
|\operatorname{Hom}_K(L, E)| \leq [L : K]
\end{align*}
for every field extension $E/K$, and then specialise to $E = L$. The stronger statement is the one amenable to induction: restricting a $K$-automorphism of $L$ to an intermediate field $M$ yields a $K$-homomorphism $M \to L$ whose image need not be $M$, so automorphisms are not preserved under restriction, whereas $K$-homomorphisms into a fixed target $E$ are.
[guided]
Why prove the stronger statement? Attempting to prove $|\operatorname{Aut}_K(L)| \leq [L : K]$ directly by induction on $[L : K]$ leads to difficulty in the fibre-counting step. When we restrict an automorphism $\sigma \in \operatorname{Aut}_K(L)$ to an intermediate field $M = K(\alpha)$, the restriction $\sigma|_M$ is a $K$-homomorphism from $M$ into $L$, but not necessarily a $K$-automorphism of $M$ (its image $\sigma(M)$ may differ from $M$). The fibres of the restriction map therefore involve extensions of a $K$-embedding $\tau \colon M \hookrightarrow L$ to all of $L$ — and counting these extensions requires a bound on $K$-homomorphisms, not automorphisms.
By upgrading the inductive hypothesis from automorphisms to $\operatorname{Hom}_K(L, E)$ for an arbitrary target $E$, the fibre-counting step works cleanly: the extensions of $\tau$ are exactly the $M$-homomorphisms from $L$ into $E$, where $E$ is regarded as a field extension of $M$ via $\tau$. The inductive hypothesis applies directly.
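As an illustration of the obstruction (an example chosen here, not part of the original argument), take $K = \mathbb{Q}$, $L = \mathbb{Q}(\sqrt[3]{2}, \omega)$ with $\omega = e^{2\pi i/3}$, and $M = \mathbb{Q}(\sqrt[3]{2})$. The automorphism $\sigma \in \operatorname{Aut}_{\mathbb{Q}}(L)$ determined by $\sigma(\sqrt[3]{2}) = \omega\sqrt[3]{2}$ and $\sigma(\omega) = \omega$ satisfies
\begin{align*}
\sigma(M) = \mathbb{Q}(\omega\sqrt[3]{2}) \neq M,
\end{align*}
because $M \subset \mathbb{R}$ while $\omega\sqrt[3]{2} \notin \mathbb{R}$. So $\sigma|_M$ lies in $\operatorname{Hom}_{\mathbb{Q}}(M, L)$ but is not an automorphism of $M$.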
[/guided]
[/step]
[step:Establish the base case $[L : K] = 1$]
If $[L : K] = 1$, then $L$ is a one-dimensional $K$-vector space containing $K$, so $L = K$. A $K$-homomorphism $\sigma \colon K \to E$ is a field homomorphism that fixes $K$ pointwise, so $\sigma$ must be the inclusion $K \hookrightarrow E$. There is exactly one such map. Therefore
\begin{align*}
|\operatorname{Hom}_K(K, E)| = 1 = [K : K].
\end{align*}
[/step]
[step:Pick $\alpha \in L \setminus K$ and decompose $\operatorname{Hom}_K(L, E)$ via restriction to $K(\alpha)$]
Assume $[L : K] > 1$. Since $L \neq K$, there exists $\alpha \in L \setminus K$. Set $M := K(\alpha)$. Since $L/K$ is finite, $\alpha$ is algebraic over $K$ (by [Finite Implies Algebraic](/theorems/1249)), and the [Structure of Simple Algebraic Extensions](/theorems/1251) gives $[M : K] = \deg P_\alpha$, where $P_\alpha \in K[t]$ is the minimal polynomial of $\alpha$ over $K$. Since $\alpha \notin K$, we have $\deg P_\alpha \geq 2$, so $[M : K] \geq 2$.
By the [Tower Law](/theorems/1248), $[L : K] = [L : M] \cdot [M : K]$. Since $[M : K] \geq 2$, we obtain $[L : M] \leq [L : K] / 2 < [L : K]$.
Define the restriction map
\begin{align*}
\rho \colon \operatorname{Hom}_K(L, E) &\to \operatorname{Hom}_K(M, E) \\
\sigma &\mapsto \sigma|_M.
\end{align*}
This map is well-defined: if $\sigma \colon L \to E$ is a $K$-homomorphism, then the restriction $\sigma|_M \colon M \to E$ is a field homomorphism (as a restriction of a field homomorphism to a subfield), and it fixes $K$ pointwise (since $\sigma$ does and $K \subset M$), so $\sigma|_M \in \operatorname{Hom}_K(M, E)$.
The map $\rho$ partitions $\operatorname{Hom}_K(L, E)$ into fibres:
\begin{align*}
\operatorname{Hom}_K(L, E) = \bigsqcup_{\tau \in \operatorname{Im}(\rho)} \rho^{-1}(\tau),
\end{align*}
so that
\begin{align*}
|\operatorname{Hom}_K(L, E)| = \sum_{\tau \in \operatorname{Im}(\rho)} |\rho^{-1}(\tau)|.
\end{align*}
[guided]
The idea is to decompose a $K$-homomorphism $\sigma \colon L \to E$ into two pieces of information: what $\sigma$ does on the intermediate field $M = K(\alpha)$, and what $\sigma$ does on the rest of $L$ once its behaviour on $M$ is fixed.
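Why choose $M = K(\alpha)$ as the intermediate field rather than some other subfield? A strictly intermediate field $K \subsetneq M \subsetneq L$, when one exists, would allow the inductive hypothesis to be applied to both $M/K$ and $L/M$, but such a field need not exist (for instance when $[L : K]$ is prime). The choice $M = K(\alpha)$ always works: it is generated by a single element, which allows us to apply the [Roots-Homomorphisms Correspondence](/theorems/1256) to count the number of $K$-homomorphisms from $M$ into $E$, even when $M = L$.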
To verify that the restriction map $\rho$ is well-defined, note that $\sigma|_M$ is a ring homomorphism from $M$ to $E$ (since restricting a ring homomorphism to a subring gives a ring homomorphism). It is a field homomorphism because $M$ is a field and ring homomorphisms from fields are injective ($\ker \sigma|_M$ is an ideal of $M$, and the only ideals of a field are $\{0\}$ and the field itself; since $\sigma(1) = 1 \neq 0$, the kernel is $\{0\}$). Finally, $\sigma|_M$ fixes $K$ because $K \subset M$ and $\sigma$ fixes $K$.
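A small worked instance of the fibre decomposition, under assumptions chosen here for illustration: $K = \mathbb{Q}$, $L = \mathbb{Q}(\sqrt{2}, \sqrt{3})$, $\alpha = \sqrt{2}$, $M = \mathbb{Q}(\sqrt{2})$, and $E = \mathbb{C}$. Then $\operatorname{Hom}_{\mathbb{Q}}(M, \mathbb{C}) = \{\tau_+, \tau_-\}$ with $\tau_\pm(\sqrt{2}) = \pm\sqrt{2}$, and each fibre $\rho^{-1}(\tau_\pm)$ consists of the two maps sending $\sqrt{3} \mapsto \pm\sqrt{3}$, so that
\begin{align*}
|\operatorname{Hom}_{\mathbb{Q}}(L, \mathbb{C})| = |\rho^{-1}(\tau_+)| + |\rho^{-1}(\tau_-)| = 2 + 2 = 4 = [L : \mathbb{Q}].
\end{align*}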
[/guided]
[/step]
[step:Bound the number of fibres using the Roots-Homomorphisms Correspondence]
The number of non-empty fibres equals $|\operatorname{Im}(\rho)|$, which satisfies
\begin{align*}
|\operatorname{Im}(\rho)| \leq |\operatorname{Hom}_K(M, E)|.
\end{align*}
Since $M = K(\alpha)$ is a simple algebraic extension of $K$, the [Roots-Homomorphisms Correspondence](/theorems/1256) provides a bijection $\operatorname{Hom}_K(K(\alpha), E) \longleftrightarrow \operatorname{Root}_{P_\alpha}(E)$ given by $\tau \mapsto \tau(\alpha)$. Therefore
\begin{align*}
|\operatorname{Hom}_K(M, E)| = |\operatorname{Root}_{P_\alpha}(E)| \leq \deg P_\alpha = [M : K],
\end{align*}
where the inequality holds because a polynomial of degree $\deg P_\alpha$ with coefficients in $K$ has at most $\deg P_\alpha$ roots in any field extension $E$ of $K$, and the final equality is $[K(\alpha) : K] = \deg P_\alpha$ from the [Structure of Simple Algebraic Extensions](/theorems/1251).
[guided]
This step is where the theory of simple extensions does the heavy lifting. The [Roots-Homomorphisms Correspondence](/theorems/1256) converts the problem of counting $K$-homomorphisms from $K(\alpha)$ into a root-counting problem. The bijection sends each $\tau \in \operatorname{Hom}_K(K(\alpha), E)$ to $\tau(\alpha) \in E$, which must be a root of $P_\alpha$: since $\tau$ fixes the coefficients of $P_\alpha$ (they lie in $K$), we have $P_\alpha(\tau(\alpha)) = \tau(P_\alpha(\alpha)) = \tau(0) = 0$. Conversely, each root $\beta$ of $P_\alpha$ in $E$ gives rise to a unique $K$-homomorphism $K(\alpha) \to E$ sending $\alpha \mapsto \beta$.
The root count $|\operatorname{Root}_{P_\alpha}(E)| \leq \deg P_\alpha$ is the fundamental degree bound for polynomials over a field: in the polynomial ring $E[t]$, the Factor Theorem gives that each root $\beta$ of $P_\alpha$ contributes a linear factor $(t - \beta)$, and since $E[t]$ is a unique factorisation domain, the number of distinct linear factors cannot exceed $\deg P_\alpha$.
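To see how the count depends on the target field, consider (as an illustrative example not taken from the original text) $K = \mathbb{Q}$, $\alpha = \sqrt[3]{2}$, and $P_\alpha = t^3 - 2$, with $\omega = e^{2\pi i/3}$. The roots of $P_\alpha$ in three different targets are
\begin{align*}
\operatorname{Root}_{P_\alpha}(\mathbb{Q}) &= \emptyset, \\
\operatorname{Root}_{P_\alpha}(\mathbb{R}) &= \{\sqrt[3]{2}\}, \\
\operatorname{Root}_{P_\alpha}(\mathbb{C}) &= \{\sqrt[3]{2},\ \omega\sqrt[3]{2},\ \omega^2\sqrt[3]{2}\},
\end{align*}
so $|\operatorname{Hom}_{\mathbb{Q}}(\mathbb{Q}(\sqrt[3]{2}), E)|$ equals $0$, $1$, and $3$ for $E = \mathbb{Q}, \mathbb{R}, \mathbb{C}$ respectively. In every case the count is at most $\deg P_\alpha = 3$, and the inequality can be strict.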
[/guided]
[/step]
[step:Bound each fibre size by $[L : M]$ using the inductive hypothesis]
Fix $\tau \in \operatorname{Im}(\rho) \subset \operatorname{Hom}_K(M, E)$. The fibre $\rho^{-1}(\tau)$ consists of all $K$-homomorphisms $\sigma \colon L \to E$ satisfying $\sigma|_M = \tau$.
[claim:The fibre $\rho^{-1}(\tau)$ is contained in $\operatorname{Hom}_M(L, E_\tau)$]
Let $E_\tau$ denote the field $E$ equipped with the $M$-algebra structure given by $\tau$: for $m \in M$ and $e \in E$, the scalar multiplication is $m \cdot_\tau e := \tau(m) \cdot e$. Then
\begin{align*}
\rho^{-1}(\tau) \subset \operatorname{Hom}_M(L, E_\tau),
\end{align*}
where $\operatorname{Hom}_M(L, E_\tau)$ denotes the set of field homomorphisms $\sigma \colon L \to E$ satisfying $\sigma(m \cdot x) = \tau(m) \cdot \sigma(x)$ for all $m \in M$ and $x \in L$.
[/claim]
[proof]
Let $\sigma \in \rho^{-1}(\tau)$, so $\sigma \colon L \to E$ is a $K$-homomorphism with $\sigma|_M = \tau$. For any $m \in M$ and $x \in L$:
\begin{align*}
\sigma(m \cdot x) = \sigma(m) \cdot \sigma(x) = \tau(m) \cdot \sigma(x),
\end{align*}
where the first equality uses the fact that $\sigma$ is a ring homomorphism, and the second uses $\sigma|_M = \tau$. This is precisely the condition for $\sigma$ to be an $M$-homomorphism from $L$ to $E_\tau$.
[/proof]
Now $L/M$ is a finite extension with $[L : M] < [L : K]$ (by the [Tower Law](/theorems/1248) and $[M : K] \geq 2$, as shown when $M$ was chosen). The inductive hypothesis, applied to the finite extension $L/M$ and to $E_\tau$ regarded as a field extension of $M$ via $\tau$, gives
\begin{align*}
|\rho^{-1}(\tau)| \leq |\operatorname{Hom}_M(L, E_\tau)| \leq [L : M].
\end{align*}
[guided]
This is the step where the inductive hypothesis is consumed, and it requires care about what "$M$-homomorphism" means.
A $K$-homomorphism $\sigma \colon L \to E$ with $\sigma|_M = \tau$ is not the same as an $M$-homomorphism in the usual sense (which would require $\sigma(m \cdot x) = m \cdot \sigma(x)$ for $m \in M$). Rather, $\sigma$ satisfies $\sigma(m \cdot x) = \tau(m) \cdot \sigma(x)$, where $\tau(m)$ replaces $m$ on the right-hand side. To apply the inductive hypothesis, which concerns $M$-homomorphisms from $L$ into some field extension of $M$, we view $E$ as an $M$-algebra via the embedding $\tau \colon M \hookrightarrow E$. We denote this $M$-algebra by $E_\tau$.
In $E_\tau$, the $M$-action on $E$ is defined by $m \cdot_\tau e := \tau(m) \cdot e$. Under this convention, the condition $\sigma(m \cdot x) = \tau(m) \cdot \sigma(x)$ becomes exactly the condition that $\sigma$ is $M$-linear (with $M$ acting on $E$ via $\tau$). Thus $\rho^{-1}(\tau) \subset \operatorname{Hom}_M(L, E_\tau)$.
Why is it valid to apply the inductive hypothesis? We need $[L : M] < [L : K]$ and $L/M$ to be a finite extension. The first condition holds because $[L : K] = [L : M] \cdot [M : K]$ and $[M : K] = \deg P_\alpha \geq 2$ (since $\alpha \notin K$). The second holds because $[L : M] = [L : K] / [M : K]$ is finite. The inductive hypothesis then gives $|\operatorname{Hom}_M(L, E_\tau)| \leq [L : M]$ for any field extension $E_\tau$ of $M$ — and $E_\tau$ is indeed a field extension of $M$ via the embedding $\tau$.
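To see concretely how the twist by $\tau$ enters, consider an example chosen here for illustration (it is not part of the original argument): $K = \mathbb{Q}$, $M = \mathbb{Q}(\sqrt{2})$, $L = M(\beta)$ with $\beta = \sqrt[4]{2}$, and $E = \mathbb{C}$. Let $\tau \in \operatorname{Hom}_{\mathbb{Q}}(M, \mathbb{C})$ be given by $\tau(\sqrt{2}) = -\sqrt{2}$. Since $\beta^2 = \sqrt{2}$, every $\sigma \in \rho^{-1}(\tau)$ satisfies
\begin{align*}
\sigma(\beta)^2 = \sigma(\beta^2) = \sigma(\sqrt{2}) = \tau(\sqrt{2}) = -\sqrt{2},
\end{align*}
so $\sigma(\beta) \in \{i\sqrt[4]{2},\ -i\sqrt[4]{2}\}$. Both values occur, since $\sqrt[4]{2} \mapsto \pm i\sqrt[4]{2}$ are among the four embeddings of $L$ into $\mathbb{C}$. Hence $|\rho^{-1}(\tau)| = 2 = [L : M]$: the extensions of $\tau$ are governed by the roots of $t^2 - \tau(\sqrt{2}) = t^2 + \sqrt{2}$, the image under $\tau$ of the minimal polynomial $t^2 - \sqrt{2}$ of $\beta$ over $M$.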
[/guided]
[/step]
[step:Combine the fibre and image bounds to conclude $|\operatorname{Hom}_K(L, E)| \leq [L : K]$]
From the decomposition into fibres:
\begin{align*}
|\operatorname{Hom}_K(L, E)| &= \sum_{\tau \in \operatorname{Im}(\rho)} |\rho^{-1}(\tau)| \leq \sum_{\tau \in \operatorname{Im}(\rho)} [L : M] = |\operatorname{Im}(\rho)| \cdot [L : M].
\end{align*}
The inequality uses the fibre bound $|\rho^{-1}(\tau)| \leq [L : M]$ for every $\tau \in \operatorname{Im}(\rho)$. Applying the image bound $|\operatorname{Im}(\rho)| \leq [M : K]$:
\begin{align*}
|\operatorname{Hom}_K(L, E)| \leq [M : K] \cdot [L : M] = [L : K],
\end{align*}
where the final equality is the [Tower Law](/theorems/1248): $[L : K] = [L : M] \cdot [M : K]$.
Specialising to $E = L$, we obtain
\begin{align*}
|\operatorname{Aut}_K(L)| \leq |\operatorname{Hom}_K(L, L)| \leq [L : K].
\end{align*}
This completes the induction and proves the theorem.
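The bound can be strict. As an illustration (an example assumed here, not taken from the original text), take $K = \mathbb{Q}$, $L = \mathbb{Q}(\sqrt[4]{2}) \subset \mathbb{R}$, $\alpha = \sqrt{2}$, $M = \mathbb{Q}(\sqrt{2})$, and $E = L$. Then $\operatorname{Hom}_{\mathbb{Q}}(M, L) = \{\tau_+, \tau_-\}$ with $\tau_\pm(\sqrt{2}) = \pm\sqrt{2}$, but an extension $\sigma$ of $\tau_-$ would need $\sigma(\sqrt[4]{2})^2 = -\sqrt{2}$, which has no solution in $L \subset \mathbb{R}$. So $\operatorname{Im}(\rho) = \{\tau_+\}$, the fibre over $\tau_+$ consists of the two maps $\sqrt[4]{2} \mapsto \pm\sqrt[4]{2}$, and
\begin{align*}
|\operatorname{Aut}_{\mathbb{Q}}(L)| = |\operatorname{Hom}_{\mathbb{Q}}(L, L)| = |\operatorname{Im}(\rho)| \cdot 2 = 1 \cdot 2 = 2 < 4 = [L : \mathbb{Q}].
\end{align*}
Here the slack comes from the image bound $|\operatorname{Im}(\rho)| \leq [M : K]$, which is strict because $\tau_-$ does not extend to $L$.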
[/step]