[proofplan]
The proof divides into two parts matching the theorem statement. For Part (1), we show that $D_f = \Delta_f^2$ is a symmetric polynomial expression in the roots $\alpha_1, \ldots, \alpha_n$, hence fixed by every $\sigma \in G$. The key step is establishing that each $\sigma \in G$ acts on $\Delta_f$ by $\sigma(\Delta_f) = \operatorname{sgn}(\sigma) \cdot \Delta_f$, so that squaring eliminates the sign. For Part (2), the sign rule reduces the condition $G \subset A_n$ to the condition $\Delta_f \in L^G = K$: in the forward direction, every even permutation fixes $\Delta_f$; in the reverse, $\Delta_f \in K$ forces every $\sigma \in G$ to be even (using separability to ensure $\Delta_f \neq 0$). The equivalence with $D_f$ being a square follows from the factorisation $D_f = \Delta_f^2$.
[/proofplan]
[step:Establish the sign rule $\sigma(\Delta_f) = \operatorname{sgn}(\sigma) \cdot \Delta_f$ for each $\sigma \in G$]
Recall that $\Delta_f = \prod_{1 \le i < j \le n} (\alpha_i - \alpha_j)$ and that each $\sigma \in G \subset S_n$ acts on the roots by $\sigma(\alpha_i) = \alpha_{\sigma(i)}$. We compute:
\begin{align*}
\sigma(\Delta_f) = \prod_{1 \le i < j \le n} (\alpha_{\sigma(i)} - \alpha_{\sigma(j)}).
\end{align*}
Since $\sigma$ is a bijection on $\{1, \ldots, n\}$, as $(i,j)$ ranges over all pairs with $i < j$, the pair $\{\sigma(i), \sigma(j)\}$ ranges over all two-element subsets of $\{1, \ldots, n\}$ exactly once. For each such pair, we may write the factor in canonical order: if $\sigma(i) < \sigma(j)$, the factor is $(\alpha_{\sigma(i)} - \alpha_{\sigma(j)})$, already a factor of $\Delta_f$; if $\sigma(i) > \sigma(j)$, the factor is $-(\alpha_{\sigma(j)} - \alpha_{\sigma(i)})$, a factor of $\Delta_f$ with a sign of $-1$. Hence $\sigma(\Delta_f) = \pm \Delta_f$, where the overall sign is the product of these individual signs. We prove that this overall sign equals $\operatorname{sgn}(\sigma)$ by first establishing the result for transpositions and then extending multiplicatively.
[claim:Each transposition reverses the sign of $\Delta_f$]
For any transposition $\tau = (r \; s) \in S_n$ with $r < s$, we have $\tau(\Delta_f) = -\Delta_f$.
[/claim]
[proof]
Partition the $\binom{n}{2}$ factors of $\Delta_f = \prod_{i < j}(\alpha_i - \alpha_j)$ into three groups according to how $\tau = (r\;s)$ acts on the indices $i, j$:
**Type 1: The unique factor involving both $r$ and $s$.** Since $r < s$, the factor $(\alpha_r - \alpha_s)$ appears in $\Delta_f$. Under $\tau$:
\begin{align*}
\tau(\alpha_r - \alpha_s) = \alpha_{\tau(r)} - \alpha_{\tau(s)} = \alpha_s - \alpha_r = -(\alpha_r - \alpha_s).
\end{align*}
This contributes a sign of $-1$.
**Type 2: Factors involving exactly one of $r, s$.** For each index $k \in \{1, \ldots, n\} \setminus \{r, s\}$, there are exactly two factors in $\Delta_f$ that involve $k$ and one element of $\{r, s\}$. We show the product of these two factors is invariant under $\tau$ by examining three cases based on the position of $k$ relative to $r$ and $s$:
*Case $k < r < s$:* The two factors are $(\alpha_k - \alpha_r)$ and $(\alpha_k - \alpha_s)$. Under $\tau$, they become $(\alpha_k - \alpha_s)$ and $(\alpha_k - \alpha_r)$, so their product is unchanged.
*Case $r < k < s$:* The two factors are $(\alpha_r - \alpha_k)$ and $(\alpha_k - \alpha_s)$. Under $\tau$:
\begin{align*}
\tau\bigl((\alpha_r - \alpha_k)(\alpha_k - \alpha_s)\bigr) &= (\alpha_s - \alpha_k)(\alpha_k - \alpha_r) \\
&= \bigl(-(\alpha_k - \alpha_s)\bigr)\bigl(-(\alpha_r - \alpha_k)\bigr) \\
&= (\alpha_k - \alpha_s)(\alpha_r - \alpha_k) \\
&= (\alpha_r - \alpha_k)(\alpha_k - \alpha_s).
\end{align*}
The product is unchanged (two sign flips cancel).
*Case $r < s < k$:* The two factors are $(\alpha_r - \alpha_k)$ and $(\alpha_s - \alpha_k)$. Under $\tau$, they become $(\alpha_s - \alpha_k)$ and $(\alpha_r - \alpha_k)$, so the product is unchanged.
In every case, the product of the two Type 2 factors involving $k$ is invariant under $\tau$. Ranging over all $k \in \{1, \ldots, n\} \setminus \{r, s\}$, the Type 2 factors contribute a total sign of $+1$.
**Type 3: Factors involving neither $r$ nor $s$.** If $i, j \notin \{r, s\}$, then $\tau(\alpha_i) = \alpha_i$ and $\tau(\alpha_j) = \alpha_j$, so each such factor is fixed. These contribute a sign of $+1$.
Combining all three types: $\tau(\Delta_f) = (-1) \cdot (+1) \cdot (+1) \cdot \Delta_f = -\Delta_f$.
[/proof]
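As a small worked instance of the three-type bookkeeping, take the illustrative case $n = 3$ and $\tau = (1\;3)$ (this choice is ours, made only to exercise the middle case $r < k < s$ with $k = 2$; there are no Type 3 factors when $n = 3$):
\begin{align*}
\Delta_f &= (\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)(\alpha_2 - \alpha_3), \\
\tau(\alpha_1 - \alpha_3) &= \alpha_3 - \alpha_1 = -(\alpha_1 - \alpha_3) && \text{(Type 1)}, \\
\tau\bigl((\alpha_1 - \alpha_2)(\alpha_2 - \alpha_3)\bigr) &= (\alpha_3 - \alpha_2)(\alpha_2 - \alpha_1) = (\alpha_1 - \alpha_2)(\alpha_2 - \alpha_3) && \text{(Type 2, } k = 2\text{)}, \\
\tau(\Delta_f) &= -\Delta_f.
\end{align*}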
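Now let $\sigma \in S_n$ be an arbitrary permutation, acting on $\Delta_f$ by permuting the indices of the roots as in the displayed formula for $\sigma(\Delta_f)$ (for $\sigma \in G$ this is the Galois action). Write $\sigma = \tau_1 \circ \cdots \circ \tau_m$ as a composition of $m$ transpositions, which need not themselves lie in $G$. Applying each transposition in sequence and using the claim at each stage: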
\begin{align*}
\sigma(\Delta_f) &= (\tau_1 \circ \cdots \circ \tau_m)(\Delta_f) = (-1)^m \, \Delta_f = \operatorname{sgn}(\sigma) \cdot \Delta_f,
\end{align*}
where the final equality uses the definition of the sign homomorphism: $\operatorname{sgn}(\sigma) = (-1)^m$, and the well-definedness of $\operatorname{sgn}$ guarantees that $(-1)^m$ is independent of the choice of transposition decomposition.
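The sign rule can be sanity-checked numerically. The following is a minimal sketch, assuming four arbitrary distinct integers stand in for the roots (the values and the helper names `vandermonde`, `cycle_sign`, and `act` are hypothetical, chosen for illustration). It computes $\operatorname{sgn}(\sigma)$ from the cycle structure, independently of the product, and compares both sides of the rule over all of $S_4$.

```python
from itertools import combinations, permutations
from math import prod


def vandermonde(values):
    """Product of (values[i] - values[j]) over all index pairs i < j."""
    return prod(values[i] - values[j]
                for i, j in combinations(range(len(values)), 2))


def cycle_sign(perm):
    """Sign of a 0-indexed permutation tuple: (-1)^(n - number of cycles)."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            k = start
            while not seen[k]:
                seen[k] = True
                k = perm[k]
    return -1 if (len(perm) - cycles) % 2 else 1


def act(perm, values):
    """Replace alpha_i by alpha_{perm(i)} in every slot."""
    return [values[perm[i]] for i in range(len(values))]


roots = [2, 3, 5, 7]  # hypothetical distinct stand-ins for the roots
delta = vandermonde(roots)

# One boolean per permutation: does sigma(Delta) equal sgn(sigma) * Delta?
checks = [vandermonde(act(p, roots)) == cycle_sign(p) * delta
          for p in permutations(range(4))]
```

A check of this kind illustrates the rule for particular values only; it does not replace the argument by transpositions given above.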
[guided]
The product $\Delta_f = \prod_{i < j}(\alpha_i - \alpha_j)$ has $\binom{n}{2}$ factors. A permutation $\sigma$ permutes the index pairs, and when a pair $(i,j)$ with $i < j$ is sent to a pair $(\sigma(i), \sigma(j))$ with $\sigma(i) > \sigma(j)$, the corresponding factor acquires a sign of $-1$. The total sign is determined by the number of such "inversions" modulo $2$.
Why isolate transpositions first? Because every permutation is a product of transpositions, and the group action is multiplicative: $(\tau_1 \circ \tau_2)(\Delta_f) = \tau_1(\tau_2(\Delta_f))$. Once we know that each transposition contributes a factor of $-1$, the general formula follows by induction on the number of transpositions.
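The inversion count and the multiplicativity used here can be illustrated concretely. This is a sketch under our own conventions (0-indexed permutation tuples; the helper names `inversions`, `inversion_sign`, and `compose` are hypothetical): it defines the sign as the parity of the inversion count and checks, over all of $S_3$, that the sign of a composition is the product of the signs.

```python
from itertools import combinations, permutations


def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    return sum(1 for i, j in combinations(range(len(perm)), 2)
               if perm[i] > perm[j])


def inversion_sign(perm):
    """(-1) raised to the number of inversions."""
    return -1 if inversions(perm) % 2 else 1


def compose(s, t):
    """Composition s after t: i maps to s[t[i]]."""
    return tuple(s[t[i]] for i in range(len(t)))


s3 = list(permutations(range(3)))

# Multiplicativity of the sign, checked on every pair in S_3.
multiplicative = all(
    inversion_sign(compose(s, t)) == inversion_sign(s) * inversion_sign(t)
    for s in s3 for t in s3
)
```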
The partition into three types is the key organisational idea for the transposition case. The crucial observation is that Type 2 factors pair up: for each "bystander" index $k \notin \{r, s\}$, the transposition swaps the roles of $\alpha_r$ and $\alpha_s$ in the two factors involving $k$. The product of these two factors is a symmetric expression in $\alpha_r$ and $\alpha_s$ (after accounting for the ordering convention $i < j$), so swapping $\alpha_r$ and $\alpha_s$ either introduces no sign flips (the cases $k < r < s$ and $r < s < k$) or exactly two sign flips that cancel (the case $r < k < s$). Only Type 1, the direct swap of the pair $(r, s)$, produces an unpaired sign change.
Note that the sign rule $\sigma(\Delta_f) = \operatorname{sgn}(\sigma) \cdot \Delta_f$ is closely tied to the well-definedness of the sign homomorphism $\operatorname{sgn}: S_n \to \{+1, -1\}$. Since $\Delta_f \neq 0$ (the roots are distinct by separability), the value $\sigma(\Delta_f)/\Delta_f$ depends only on $\sigma$, not on any choice of transposition decomposition; provided $+1 \neq -1$ in $L$, that is $\operatorname{char} K \neq 2$, this value determines the parity of $m$, which gives an independent proof that $\operatorname{sgn}$ is well-defined.
[/guided]
[/step]
[step:Prove Part (1): $D_f \in K$]
By the sign rule established above, for any $\sigma \in G$:
\begin{align*}
\sigma(D_f) = \sigma(\Delta_f^2) = \bigl(\sigma(\Delta_f)\bigr)^2 = \bigl(\operatorname{sgn}(\sigma) \cdot \Delta_f\bigr)^2 = \operatorname{sgn}(\sigma)^2 \cdot \Delta_f^2 = \Delta_f^2 = D_f,
\end{align*}
where we use $\operatorname{sgn}(\sigma)^2 = 1$ since $\operatorname{sgn}(\sigma) \in \{+1, -1\}$. Since $\sigma \in G$ was arbitrary, $D_f$ is fixed by every element of $G$. Since $L/K$ is a Galois extension with $\operatorname{Gal}(L/K) = G$, the Fundamental Theorem of Galois Theory gives $L^G = K$. Therefore $D_f \in L^G = K$.
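As a quick check of Part (1) in the simplest case, consider a monic quadratic $f = x^2 + bx + c \in K[x]$ with roots $\alpha_1, \alpha_2$ (the symbols $b, c$ are introduced here only for illustration). Then $\Delta_f = \alpha_1 - \alpha_2$, and the relations $\alpha_1 + \alpha_2 = -b$, $\alpha_1 \alpha_2 = c$ give
\begin{align*}
D_f = (\alpha_1 - \alpha_2)^2 = (\alpha_1 + \alpha_2)^2 - 4\alpha_1\alpha_2 = b^2 - 4c \in K.
\end{align*}
This recovers the familiar quadratic discriminant, which lies in $K$ even when $\alpha_1$ and $\alpha_2$ individually do not.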
[/step]
[step:Prove Part (2): $G \subset A_n$ if and only if $\Delta_f \in K$]
Since $f$ is separable, its roots $\alpha_1, \ldots, \alpha_n$ are pairwise distinct, so every factor $(\alpha_i - \alpha_j)$ with $i \neq j$ is nonzero. Therefore $\Delta_f = \prod_{i < j}(\alpha_i - \alpha_j) \neq 0$.
**Forward direction.** Assume $G \subset A_n$. Then every $\sigma \in G$ is an even permutation, so $\operatorname{sgn}(\sigma) = +1$. By the sign rule:
\begin{align*}
\sigma(\Delta_f) = \operatorname{sgn}(\sigma) \cdot \Delta_f = \Delta_f \quad \text{for all } \sigma \in G.
\end{align*}
Hence $\Delta_f$ is fixed by every element of $G$. By the Fundamental Theorem of Galois Theory, $L^G = K$, so $\Delta_f \in K$.
**Reverse direction.** Assume $\Delta_f \in K$. Since $G = \operatorname{Gal}(L/K)$ fixes $K$ pointwise, $\sigma(\Delta_f) = \Delta_f$ for all $\sigma \in G$. The sign rule gives:
\begin{align*}
\operatorname{sgn}(\sigma) \cdot \Delta_f = \Delta_f \quad \text{for all } \sigma \in G.
\end{align*}
Since $\Delta_f \neq 0$ (separability), we may cancel $\Delta_f$ from both sides to obtain $\operatorname{sgn}(\sigma) = 1$ for all $\sigma \in G$. By definition of $A_n = \ker(\operatorname{sgn})$, this gives $G \subset A_n$.
[guided]
The separability hypothesis is consumed precisely in the reverse direction. If $f$ had a repeated root $\alpha_i = \alpha_j$ for some $i \neq j$, then the factor $(\alpha_i - \alpha_j) = 0$ would force $\Delta_f = 0$. The equation $\operatorname{sgn}(\sigma) \cdot \Delta_f = \Delta_f$ would then read $0 = 0$ for every $\sigma \in G$, regardless of whether $\sigma$ is even or odd. The cancellation step would be invalid, and the discriminant would carry no information about the parity of $G$.
The forward direction does not require separability in the same essential way: if $\Delta_f = 0$, the conclusion $\Delta_f \in K$ holds trivially (since $0 \in K$). But the equivalence between $G \subset A_n$ and $\Delta_f \in K$ genuinely requires $\Delta_f \neq 0$.
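It is also worth noting that the argument requires $\operatorname{char} K \neq 2$. If $\operatorname{char} K = 2$, then $+1 = -1$ in $K$, so the sign rule reads $\sigma(\Delta_f) = \Delta_f$ for every $\sigma \in G$ regardless of parity. Hence $\Delta_f \in K$ and $D_f$ is a square in $K$ in every case, and the cancellation step in the reverse direction yields only the identity $1 = 1$ in $K$, which says nothing about whether $\sigma$ is even. The homomorphism $\operatorname{sgn}: S_n \to \{+1, -1\}$ itself is unchanged and $A_n \neq S_n$ for $n \geq 2$, so the equivalence in Part (2) fails whenever $G$ contains an odd permutation. Separability does not rule this out: for example, $x^2 + x + 1$ is separable and irreducible over $\mathbb{F}_2$, with $G = S_2 \not\subset A_2$ although $\Delta_f = \alpha_1 + \alpha_2 = 1 \in \mathbb{F}_2$. The criterion should therefore be read under the assumption $\operatorname{char} K \neq 2$.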
[/guided]
[/step]
[step:Complete Part (2): the equivalence $\Delta_f \in K$ if and only if $D_f$ is a square in $K$]
We established $D_f = \Delta_f^2 \in K$ in Part (1).
**If $\Delta_f \in K$:** Then $D_f = \Delta_f^2$ has $\Delta_f$ as a square root in $K$, so $D_f$ is a square in $K$.
**If $D_f$ is a square in $K$:** Let $c \in K$ satisfy $c^2 = D_f = \Delta_f^2$. Then $c^2 - \Delta_f^2 = 0$, which factors as $(c - \Delta_f)(c + \Delta_f) = 0$ in the field $L$. Since $L$ is a field (hence an integral domain), either $c = \Delta_f$ or $c = -\Delta_f$. In both cases $\Delta_f = \pm c \in K$.
Combining the forward and reverse directions of both equivalences:
\begin{align*}
G \subset A_n \iff \Delta_f \in K \iff D_f \text{ is a square in } K.
\end{align*}
This completes the proof.
[guided]
The final equivalence translates the Galois-theoretic condition ($\Delta_f$ lies in the base field) into a purely arithmetic condition ($D_f$ has a square root in $K$) that can be checked without constructing the splitting field.
The factorisation $(c - \Delta_f)(c + \Delta_f) = 0$ uses the fact that $L$ is a field, so it has no zero divisors. In a ring with zero divisors, the equation $c^2 = \Delta_f^2$ would not force $c = \pm \Delta_f$, and the argument would fail.
The practical value of this equivalence is significant: $D_f$ can be computed directly from the coefficients of $f$ without ever finding the roots. For monic $f$, $D_f = (-1)^{n(n-1)/2} \operatorname{Res}(f, f')$, where $f'$ is the formal derivative. For a general leading coefficient $a_n$, the standard discriminant $(-1)^{n(n-1)/2} \operatorname{Res}(f, f') / a_n$ equals $a_n^{2n-2} \Delta_f^2$, which differs from $D_f = \Delta_f^2$ by the square factor $a_n^{2n-2}$ and is therefore a square in $K$ exactly when $D_f$ is. One then checks whether $D_f$ is a square in $K$ to decide whether $G \subset A_n$. For instance, if $K = \mathbb{Q}$ and $D_f > 0$, one checks whether $D_f$ is the square of a rational number; if $D_f < 0$, it cannot be a square in $\mathbb{Q}$, so $G \not\subset A_n$.
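The square test over $\mathbb{Q}$ can be sketched in a few lines. The example below is illustrative only: it assumes the well-known closed form $-4p^3 - 27q^2$ for the discriminant of the monic cubic $x^3 + px + q$ instead of a general resultant computation, and the function names `is_rational_square` and `depressed_cubic_discriminant` are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def is_rational_square(x):
    """True if the rational number x is the square of a rational number."""
    x = Fraction(x)
    if x < 0:
        return False
    # A fraction in lowest terms is a square exactly when its numerator
    # and denominator are both perfect squares.
    n, d = x.numerator, x.denominator
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d


def depressed_cubic_discriminant(p, q):
    """Discriminant -4p^3 - 27q^2 of the monic cubic x^3 + p*x + q."""
    p, q = Fraction(p), Fraction(q)
    return -4 * p**3 - 27 * q**2


d1 = depressed_cubic_discriminant(-3, 1)  # x^3 - 3x + 1, discriminant 81
d2 = depressed_cubic_discriminant(0, -2)  # x^3 - 2, discriminant -108

in_A3_first = is_rational_square(d1)   # square, so G is contained in A_3
in_A3_second = is_rational_square(d2)  # negative, so G is not contained in A_3
```

Both sample cubics are separable, since their discriminants are nonzero. The test decides only whether $G \subset A_3$; identifying $G$ exactly needs further information, such as the irreducibility of $f$.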
[/guided]
[/step]