[proofplan]
We prove the identity by comparing two expansions of the Cauchy double alternant. One expansion expresses the determinant as a sum of products of alternants $a_\mu(x)a_\mu(y)$, while the other expansion factors out the Vandermonde alternants and records the remaining coefficients as Jacobi-Trudi determinants. Comparing the coefficient of $a_{\lambda+\delta}(y_1,\dots,y_n)$ gives $a_{\lambda+\delta}(x_1,\dots,x_n)=a_\delta(x_1,\dots,x_n)s_\lambda(x_1,\dots,x_n)$, and division by the nonzero alternant $a_\delta$ gives the stated quotient formula.
[/proofplan]
[step:Introduce the complete homogeneous symmetric functions and the Jacobi-Trudi form of $s_\lambda$]
Let $R := \mathbb{Z}[x_1,\dots,x_n]$ be the [polynomial ring](/page/Polynomial%20Ring) in the variables $x_1,\dots,x_n$. Define
\begin{align*}
\delta := (n-1,n-2,\dots,0).
\end{align*}
For any $n$-tuple $\alpha=(\alpha_1,\dots,\alpha_n)$ of nonnegative integers and any variables $z_1,\dots,z_n$, define the alternant
\begin{align*}
a_\alpha(z_1,\dots,z_n)
:=
\det\left(z_i^{\alpha_j}\right)_{1\leq i,j\leq n}.
\end{align*}
In particular,
\begin{align*}
a_\delta(z_1,\dots,z_n)
=
\prod_{1\leq i<j\leq n}(z_i-z_j).
\end{align*}
For each integer $r \in \mathbb{Z}$, define $h_r \in R$ by
\begin{align*}
h_r :=
\begin{cases}
\displaystyle \sum_{\substack{\beta_1,\dots,\beta_n \geq 0\\ \beta_1+\cdots+\beta_n=r}}
x_1^{\beta_1}\cdots x_n^{\beta_n}, & r \geq 0,\\
0, & r < 0.
\end{cases}
\end{align*}
Thus $h_0=1$, and $h_r$ is the complete homogeneous symmetric polynomial of degree $r$ in $x_1,\dots,x_n$ for $r \geq 0$.
We use the [Jacobi-Trudi definition](/page/Jacobi-Trudi%20Identity) of the Schur polynomial:
\begin{align*}
s_\lambda(x_1,\dots,x_n)
:=
\det\left(h_{\lambda_i-i+j}\right)_{1 \leq i,j \leq n}.
\end{align*}
It is therefore enough to prove
\begin{align*}
a_{\lambda+\delta}(x_1,\dots,x_n)
=
a_\delta(x_1,\dots,x_n)
\det\left(h_{\lambda_i-i+j}\right)_{1 \leq i,j \leq n}.
\end{align*}
[/step]
[step:Expand the Cauchy double alternant as a sum of alternants]
Introduce a second set of indeterminates $y_1,\dots,y_n$, and work in the formal [power series](/page/Power%20Series) ring
\begin{align*}
\mathbb{Z}[x_1,\dots,x_n][[y_1,\dots,y_n]].
\end{align*}
Consider the determinant
\begin{align*}
D(x,y)
:=
\det\left(\frac{1}{1-x_i y_j}\right)_{1 \leq i,j \leq n}.
\end{align*}
Since
\begin{align*}
\frac{1}{1-x_i y_j}
=
\sum_{m=0}^{\infty} x_i^m y_j^m
\end{align*}
as a formal power series, multilinearity of the determinant in its columns gives
\begin{align*}
D(x,y)
=
\sum_{m_1,\dots,m_n \geq 0}
\det\left(x_i^{m_j}\right)_{1 \leq i,j \leq n}
\prod_{j=1}^n y_j^{m_j}.
\end{align*}
This equality is coefficientwise in $\mathbb{Z}[x_1,\dots,x_n][[y_1,\dots,y_n]]$: for a fixed monomial $y_1^{k_1}\cdots y_n^{k_n}$, only the tuple $(m_1,\dots,m_n)=(k_1,\dots,k_n)$ contributes. If two of the integers $m_1,\dots,m_n$ are equal, then the determinant $\det(x_i^{m_j})$ has two identical columns and is zero. Therefore only tuples with distinct entries contribute. Reordering each contributing tuple into strictly decreasing order $\mu_1>\cdots>\mu_n\geq 0$ gives one copy of the alternant in the $x$-variables and the same sign from reordering the $y$-monomial into the corresponding alternant. Hence
\begin{align*}
D(x,y)
=
\sum_{\mu_1>\cdots>\mu_n\geq 0}
a_\mu(x_1,\dots,x_n)\,a_\mu(y_1,\dots,y_n),
\end{align*}
where $\mu=(\mu_1,\dots,\mu_n)$.
[/step]
[step:Factor the same determinant by the Cauchy double alternant identity]
We now compute the same determinant as a rational function. Put all entries over the common denominator
\begin{align*}
P(x,y)
:=
\prod_{i=1}^n\prod_{j=1}^n (1-x_i y_j).
\end{align*}
Then
\begin{align*}
P(x,y)D(x,y)
\end{align*}
is a polynomial which is alternating in the variables $x_1,\dots,x_n$ and also alternating in the variables $y_1,\dots,y_n$. Therefore it is divisible by both Vandermonde alternants
\begin{align*}
a_\delta(x_1,\dots,x_n)
=
\prod_{1\leq i<j\leq n}(x_i-x_j),
\qquad
a_\delta(y_1,\dots,y_n)
=
\prod_{1\leq i<j\leq n}(y_i-y_j).
\end{align*}
By the [Vandermonde divisibility property for alternating polynomials](/page/Alternating%20Polynomial), write
\begin{align*}
P(x,y)D(x,y)
=
a_\delta(x_1,\dots,x_n)a_\delta(y_1,\dots,y_n)Q(x,y)
\end{align*}
for the resulting polynomial quotient $Q \in \mathbb{Z}[x_1,\dots,x_n,y_1,\dots,y_n]$. Since the quotient of an alternating polynomial by the Vandermonde alternant is symmetric, $Q$ is symmetric in $x_1,\dots,x_n$ and also symmetric in $y_1,\dots,y_n$. We show that $Q$ is constant. For each fixed $i \in \{1,\dots,n\}$, every term of $P(x,y)D(x,y)$ has degree at most $n-1$ in $x_i$, because after clearing denominators the factor depending on $x_i$ is a product of exactly $n-1$ factors of the form $1-x_i y_j$.
Use lexicographic monomial order in the $x$-variables with $x_1>x_2>\cdots>x_n$, treating coefficients as elements of $\mathbb{Z}[y_1,\dots,y_n]$. The leading monomial of $a_\delta(x_1,\dots,x_n)$ is $x_1^{n-1}x_2^{n-2}\cdots x_n^0$ with coefficient $1$. Since $Q$ is symmetric in the variables $x_1,\dots,x_n$, if $Q$ had positive degree in the $x$-variables, then its leading $x$-monomial would have the form $x_1^{\alpha_1}\cdots x_n^{\alpha_n}$ with $\alpha_1>0$. The leading monomial of the product $a_\delta(x_1,\dots,x_n)Q(x,y)$ would then be
\begin{align*}
x_1^{\alpha_1+n-1}x_2^{\alpha_2+n-2}\cdots x_n^{\alpha_n},
\end{align*}
because leading monomials multiply in the polynomial ring over the integral domain $\mathbb{Z}[y_1,\dots,y_n]$. This monomial has $x_1$-degree greater than $n-1$, contradicting the individual-degree bound for $P(x,y)D(x,y)$. Thus $Q$ is independent of $x_1,\dots,x_n$.
The same leading-monomial argument in the $y$-variables, using lexicographic order with $y_1>y_2>\cdots>y_n$ and treating coefficients as elements of $\mathbb{Z}[x_1,\dots,x_n]$, shows that $Q$ is independent of $y_1,\dots,y_n$. Hence $Q$ is an integer constant.
To identify the constant, compare the coefficient of
\begin{align*}
M:=x_1^{n-1}x_2^{n-2}\cdots x_n^0
y_1^{n-1}y_2^{n-2}\cdots y_n^0.
\end{align*}
The coefficient of $M$ in $a_\delta(x_1,\dots,x_n)a_\delta(y_1,\dots,y_n)$ is $1$. In $P(x,y)D(x,y)$, the determinant expansion gives
\begin{align*}
P(x,y)D(x,y)
=
\sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)
\prod_{i=1}^n\prod_{j\neq \sigma(i)}(1-x_i y_j).
\end{align*}
To obtain $M$, one must choose $n-i$ factors from the $i$th row and $n-j$ factors involving $y_j$ from the $j$th column. The unique zero-one selection matrix with these row and column sums is the triangular matrix selecting exactly the pairs $(i,j)$ with $j\leq n-i$. This selection is compatible with the omitted factors only for the reverse permutation $\sigma(i)=n+1-i$. Its contribution has sign
\begin{align*}
\operatorname{sgn}(\sigma)(-1)^{n(n-1)/2}
=
(-1)^{n(n-1)/2}(-1)^{n(n-1)/2}
=
1.
\end{align*}
Thus the coefficient of $M$ in $P(x,y)D(x,y)$ is also $1$, so the constant is $1$. Hence
\begin{align*}
D(x,y)
=
\frac{
a_\delta(x_1,\dots,x_n)a_\delta(y_1,\dots,y_n)
}{
\prod_{i=1}^n\prod_{j=1}^n (1-x_i y_j)
}.
\end{align*}
[guided]
The determinant $D(x,y)$ has two different useful descriptions. The first description came from expanding each entry as a geometric series. The second description comes from treating the determinant as a rational function and clearing denominators.
Define
\begin{align*}
P(x,y)
:=
\prod_{i=1}^n\prod_{j=1}^n (1-x_i y_j).
\end{align*}
Then $P(x,y)D(x,y)$ is a polynomial. If we interchange two $x$-variables, say $x_i$ and $x_j$, then two rows of the determinant are interchanged, so the determinant changes sign. The factor $P(x,y)$ is symmetric in the $x$-variables, so $P(x,y)D(x,y)$ is alternating in $x_1,\dots,x_n$. The same argument with columns shows that it is alternating in $y_1,\dots,y_n$.
Every alternating polynomial in $x_1,\dots,x_n$ is divisible by the Vandermonde alternant
\begin{align*}
a_\delta(x_1,\dots,x_n)
=
\prod_{1\leq i<j\leq n}(x_i-x_j),
\end{align*}
because setting $x_i=x_j$ forces the polynomial to vanish. Applying the same argument in the $y$-variables shows that
\begin{align*}
P(x,y)D(x,y)
\end{align*}
is divisible by
\begin{align*}
a_\delta(x_1,\dots,x_n)a_\delta(y_1,\dots,y_n).
\end{align*}
By the [Vandermonde divisibility property for alternating polynomials](/page/Alternating%20Polynomial), write
\begin{align*}
P(x,y)D(x,y)
=
a_\delta(x_1,\dots,x_n)a_\delta(y_1,\dots,y_n)Q(x,y)
\end{align*}
for the polynomial quotient $Q$. The quotient is symmetric in $x_1,\dots,x_n$ because both $P(x,y)D(x,y)$ and $a_\delta(x_1,\dots,x_n)$ change sign under every transposition of two $x$-variables, and the same argument gives symmetry in $y_1,\dots,y_n$. We must show that $Q$ has no remaining dependence on either set of variables. Fix $i \in \{1,\dots,n\}$. In each summand of the cleared determinant, the variable $x_i$ occurs in exactly the product of the $n-1$ factors $1-x_i y_j$ with one value of $j$ omitted. Hence $P(x,y)D(x,y)$ has degree at most $n-1$ in each individual variable $x_i$.
Now $Q$ is symmetric in $x_1,\dots,x_n$, because both $P(x,y)D(x,y)$ and $a_\delta(x_1,\dots,x_n)$ are alternating in those variables. To avoid any possible cancellation issue, use lexicographic monomial order in the $x$-variables with $x_1>x_2>\cdots>x_n$, treating coefficients as elements of $\mathbb{Z}[y_1,\dots,y_n]$. The leading monomial of $a_\delta(x_1,\dots,x_n)$ is
\begin{align*}
x_1^{n-1}x_2^{n-2}\cdots x_n^0
\end{align*}
with coefficient $1$.
Suppose $Q$ had a nonconstant term involving the $x$-variables. Since $Q$ is symmetric, the leading $x$-monomial of $Q$ has the form $x_1^{\alpha_1}\cdots x_n^{\alpha_n}$ with $\alpha_1>0$. In a polynomial ring over an integral domain, the leading monomial of a product is the product of the leading monomials. Therefore the leading monomial of $a_\delta(x_1,\dots,x_n)Q(x,y)$ would have $x_1$-degree $\alpha_1+n-1>n-1$. This contradicts the fact that $P(x,y)D(x,y)$ has degree at most $n-1$ in $x_1$. Hence $Q$ is independent of the $x$-variables.
Applying the corresponding lexicographic leading-monomial argument in the $y$-variables shows that $Q$ is independent of $y_1,\dots,y_n$. Therefore $Q$ is a constant.
It remains to identify this constant. Compare the monomial
\begin{align*}
M:=x_1^{n-1}x_2^{n-2}\cdots x_n^0
y_1^{n-1}y_2^{n-2}\cdots y_n^0.
\end{align*}
The coefficient of $M$ in
\begin{align*}
a_\delta(x_1,\dots,x_n)a_\delta(y_1,\dots,y_n)
\end{align*}
is $1$. For $P(x,y)D(x,y)$, expanding the determinant after clearing denominators gives
\begin{align*}
P(x,y)D(x,y)
=
\sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)
\prod_{i=1}^n\prod_{j\neq \sigma(i)}(1-x_i y_j).
\end{align*}
To obtain $M$, the selected factors $x_i y_j$ must have row sums $n-i$ and column sums $n-j$. These row and column sums force the triangular selection $j\leq n-i$. This selection avoids exactly the omitted positions only for the reverse permutation $\sigma(i)=n+1-i$. Its sign contribution is
\begin{align*}
\operatorname{sgn}(\sigma)(-1)^{n(n-1)/2}
=
(-1)^{n(n-1)/2}(-1)^{n(n-1)/2}
=
1.
\end{align*}
Thus the coefficient of $M$ in $P(x,y)D(x,y)$ is $1$, and the constant is $1$, giving
\begin{align*}
D(x,y)
=
\frac{
a_\delta(x_1,\dots,x_n)a_\delta(y_1,\dots,y_n)
}{
\prod_{i=1}^n\prod_{j=1}^n (1-x_i y_j)
}.
\end{align*}
[/guided]
[/step]
[step:Extract the coefficient of the alternant $a_{\lambda+\delta}(y)$]
Let $t$ be a formal indeterminate. Using the generating function for the complete homogeneous symmetric polynomials in the formal power series ring $R[[t]]$,
\begin{align*}
\prod_{i=1}^n \frac{1}{1-x_i t}
=
\sum_{r=0}^{\infty} h_r t^r,
\end{align*}
we rewrite the factored expression as
\begin{align*}
D(x,y)
&=
a_\delta(x_1,\dots,x_n)
a_\delta(y_1,\dots,y_n)
\prod_{j=1}^n
\left(\sum_{r=0}^{\infty} h_r y_j^r\right).
\end{align*}
Since
\begin{align*}
a_\delta(y_1,\dots,y_n)
=
\det\left(y_i^{n-j}\right)_{1\leq i,j\leq n},
\end{align*}
multiplying the $i$th row by the series $\sum_{r=0}^{\infty}h_r y_i^r$ gives
\begin{align*}
a_\delta(y_1,\dots,y_n)
\prod_{i=1}^n
\left(\sum_{r=0}^{\infty} h_r y_i^r\right)
=
\det\left(\sum_{r=0}^{\infty} h_r y_i^{r+n-j}\right)_{1\leq i,j\leq n}.
\end{align*}
For each integer $k \geq 0$, the coefficient of $y_i^k$ in the $(i,j)$ entry is $h_{k-n+j}$, with the convention $h_r=0$ for $r<0$. Expanding the determinant coefficientwise is legitimate in the formal power series ring because, for each fixed total degree in $y_1,\dots,y_n$, only finitely many exponent choices can contribute. The coefficientwise alternating expansion gives
\begin{align*}
\det\left(\sum_{r=0}^{\infty} h_r y_i^{r+n-j}\right)_{1\leq i,j\leq n}
=
\sum_{\nu_1>\cdots>\nu_n\geq 0}
\det\left(h_{\nu_i-n+j}\right)_{1\leq i,j\leq n}
a_\nu(y_1,\dots,y_n).
\end{align*}
Indeed, the coefficient of the ordered monomial $y_1^{\nu_1}\cdots y_n^{\nu_n}$ is obtained by choosing from column $j$ of row $i$ the coefficient $h_{\nu_i-n+j}$, and summing over permutations of columns gives the determinant $\det(h_{\nu_i-n+j})_{1\leq i,j\leq n}$. Here $a_\nu(y_1,\dots,y_n)=\det(y_i^{\nu_j})_{1\leq i,j\leq n}$, so the ordering $\nu_1>\cdots>\nu_n$ fixes the alternant sign.
Taking $\nu=\lambda+\delta$, so that
\begin{align*}
\nu_i=\lambda_i+n-i,
\end{align*}
the coefficient of $a_{\lambda+\delta}(y_1,\dots,y_n)$ in this expansion is
\begin{align*}
\det\left(h_{\lambda_i-i+j}\right)_{1\leq i,j\leq n}
=
s_\lambda(x_1,\dots,x_n).
\end{align*}
Therefore the coefficient of $a_{\lambda+\delta}(y_1,\dots,y_n)$ in the factored expansion of $D(x,y)$ is
\begin{align*}
a_\delta(x_1,\dots,x_n)s_\lambda(x_1,\dots,x_n).
\end{align*}
[/step]
[step:Compare the two coefficients and divide by the Vandermonde alternant]
From the alternant expansion of $D(x,y)$, the coefficient of $a_{\lambda+\delta}(y_1,\dots,y_n)$ is
\begin{align*}
a_{\lambda+\delta}(x_1,\dots,x_n).
\end{align*}
From the factored expansion of $D(x,y)$, the same coefficient is
\begin{align*}
a_\delta(x_1,\dots,x_n)s_\lambda(x_1,\dots,x_n).
\end{align*}
Equating the two coefficients gives the polynomial identity
\begin{align*}
a_{\lambda+\delta}(x_1,\dots,x_n)
=
a_\delta(x_1,\dots,x_n)s_\lambda(x_1,\dots,x_n).
\end{align*}
Since
\begin{align*}
a_\delta(x_1,\dots,x_n)
=
\prod_{1\leq i<j\leq n}(x_i-x_j)
\end{align*}
is a nonzero polynomial in the integral domain $\mathbb{Z}[x_1,\dots,x_n]$, we may divide by it in the fraction field $\mathbb{Q}(x_1,\dots,x_n)$. Hence
\begin{align*}
s_\lambda(x_1,\dots,x_n)
=
\frac{a_{\lambda+\delta}(x_1,\dots,x_n)}{a_\delta(x_1,\dots,x_n)}.
\end{align*}
This is the bialternant formula.
[/step]