[proofplan]
We encode a non-negative integer matrix by a monomial whose $x$-exponents record its row sums and whose $y$-exponents record its column sums. The generating function for all such matrices is the Cauchy kernel $\prod_{i,j}(1-x_i y_j)^{-1}$. We then apply the Schur Cauchy identity (citing a result not yet in the wiki: Schur Cauchy Identity), extract the coefficient of $x^r y^c$, and use the defining coefficient interpretation of the Kostka numbers.
[/proofplan]
[step:Encode matrices by the coefficient of the Cauchy kernel]
Let $x = (x_1,\dots,x_a)$ and $y = (y_1,\dots,y_b)$ be two finite sets of commuting indeterminates. For a matrix $M = (m_{ij}) \in \mathbb{Z}_{\ge 0}^{a \times b}$, define its associated monomial
\begin{align*}
\operatorname{mon}(M)
:= \prod_{i=1}^{a}\prod_{j=1}^{b} (x_i y_j)^{m_{ij}}
= x_1^{\sum_{j=1}^{b}m_{1j}}\cdots x_a^{\sum_{j=1}^{b}m_{aj}}
y_1^{\sum_{i=1}^{a}m_{i1}}\cdots y_b^{\sum_{i=1}^{a}m_{ib}}.
\end{align*}
For each pair $(i,j)$, the geometric series identity in formal [power series](/page/Power%20Series) gives
\begin{align*}
\frac{1}{1-x_i y_j} = \sum_{m=0}^{\infty} (x_i y_j)^m.
\end{align*}
Multiplying these finitely many series, we obtain
\begin{align*}
\prod_{i=1}^{a}\prod_{j=1}^{b}\frac{1}{1-x_i y_j}
=
\sum_{M \in \mathbb{Z}_{\ge 0}^{a \times b}} \operatorname{mon}(M).
\end{align*}
Therefore the coefficient of
\begin{align*}
x^r y^c := x_1^{r_1}\cdots x_a^{r_a}y_1^{c_1}\cdots y_b^{c_b}
\end{align*}
in the Cauchy kernel is exactly the number of matrices $M \in \mathbb{Z}_{\ge 0}^{a \times b}$ with row sums $r$ and column sums $c$.
[guided]
The purpose of the generating function is to make every matrix entry choose its own exponent. Fix a pair $(i,j) \in \{1,\dots,a\}\times\{1,\dots,b\}$. The entry $m_{ij}$ can be any non-negative integer, so its contribution is recorded by the formal geometric series
\begin{align*}
\frac{1}{1-x_i y_j} = \sum_{m=0}^{\infty} (x_i y_j)^m.
\end{align*}
Choosing one term from each factor is the same as choosing one integer $m_{ij} \ge 0$ for each pair $(i,j)$, hence the product expands as
\begin{align*}
\prod_{i=1}^{a}\prod_{j=1}^{b}\frac{1}{1-x_i y_j}
=
\sum_{M \in \mathbb{Z}_{\ge 0}^{a \times b}}
\prod_{i=1}^{a}\prod_{j=1}^{b} (x_i y_j)^{m_{ij}}.
\end{align*}
For a fixed matrix $M = (m_{ij})$, collect the powers of each $x_i$ and each $y_j$:
\begin{align*}
\prod_{i=1}^{a}\prod_{j=1}^{b} (x_i y_j)^{m_{ij}}
=
x_1^{\sum_{j=1}^{b}m_{1j}}\cdots x_a^{\sum_{j=1}^{b}m_{aj}}
y_1^{\sum_{i=1}^{a}m_{i1}}\cdots y_b^{\sum_{i=1}^{a}m_{ib}}.
\end{align*}
Thus the exponent of $x_i$ is the $i$-th row sum, and the exponent of $y_j$ is the $j$-th column sum. Consequently, the coefficient of
\begin{align*}
x^r y^c = x_1^{r_1}\cdots x_a^{r_a}y_1^{c_1}\cdots y_b^{c_b}
\end{align*}
counts precisely those matrices whose row sums are $r$ and whose column sums are $c$.
[/guided]
[/step]
[step:Expand the Cauchy kernel in the Schur basis]
By the Schur Cauchy identity (citing a result not yet in the wiki: Schur Cauchy Identity), in the ring of formal power series in $x_1,\dots,x_a,y_1,\dots,y_b$,
\begin{align*}
\prod_{i=1}^{a}\prod_{j=1}^{b}\frac{1}{1-x_i y_j}
=
\sum_{\lambda} s_\lambda(x_1,\dots,x_a)s_\lambda(y_1,\dots,y_b),
\end{align*}
where the sum runs over all partitions $\lambda$. Since $s_\lambda$ is homogeneous of degree $|\lambda|$, the product
\begin{align*}
s_\lambda(x_1,\dots,x_a)s_\lambda(y_1,\dots,y_b)
\end{align*}
has total $x$-degree $|\lambda|$ and total $y$-degree $|\lambda|$. The monomial $x^r y^c$ has total $x$-degree $N$ and total $y$-degree $N$, so only partitions $\lambda \vdash N$ can contribute to its coefficient.
[/step]
[step:Extract the coefficient and identify the Kostka factors]
Define the coefficient-extraction symbols $[x^r]$, $[y^c]$, and $[x^r y^c]$ as follows: for any formal power series $F$ in the $x$-variables, $[x^r]F$ denotes the coefficient of the monomial $x_1^{r_1}\cdots x_a^{r_a}$ in $F$; for any formal power series $G$ in the $y$-variables, $[y^c]G$ denotes the coefficient of the monomial $y_1^{c_1}\cdots y_b^{c_b}$ in $G$; and for any formal power series $H$ in all variables, $[x^r y^c]H$ denotes the coefficient of the monomial $x_1^{r_1}\cdots x_a^{r_a}y_1^{c_1}\cdots y_b^{c_b}$ in $H$.
For each partition $\lambda \vdash N$, coefficient extraction from the product of a polynomial in the $x$-variables and a polynomial in the $y$-variables gives
\begin{align*}
[x^r y^c]\left(s_\lambda(x_1,\dots,x_a)s_\lambda(y_1,\dots,y_b)\right)
=
\left([x^r]s_\lambda(x_1,\dots,x_a)\right)
\left([y^c]s_\lambda(y_1,\dots,y_b)\right).
\end{align*}
By the definitions in the statement,
\begin{align*}
[x^r]s_\lambda(x_1,\dots,x_a)=K_{\lambda r},
\qquad
[y^c]s_\lambda(y_1,\dots,y_b)=K_{\lambda c}.
\end{align*}
Combining this coefficient extraction with the Schur Cauchy expansion yields
\begin{align*}
[x^r y^c]\prod_{i=1}^{a}\prod_{j=1}^{b}\frac{1}{1-x_i y_j}
=
\sum_{\lambda \vdash N} K_{\lambda r}K_{\lambda c}.
\end{align*}
The left-hand side is the number of non-negative integer matrices with row sums $r$ and column sums $c$, by the first step. This proves the formula.
[/step]