[proofplan]
We prove the scaling theorem by a finite-dimensional variational argument. We introduce logarithmic variables $x_i=\log u_i$ and $y_j=\log v_j$, minimize a strictly convex functional modulo its one-dimensional translation invariance, and show that its Euler-Lagrange equations are exactly the prescribed row and column constraints. The strict positivity of $K$, $a$, and $b$ gives coercivity on a normalized affine slice and identifies the only possible degeneracy of the Hessian. Uniqueness follows by applying the same Hessian argument to the line segment between two logarithmic scaling pairs.
[/proofplan]
[step:Set up the logarithmic scaling functional on a normalized slice]
Define the affine subspace
\begin{align*}
E:=\left\{(x,y)\in\mathbb R^m\times\mathbb R^n:\sum_{i=1}^m a_i x_i=0\right\}.
\end{align*}
Define the function
\begin{align*}
\Phi:E\to\mathbb R,\qquad (x,y)\mapsto \sum_{i=1}^m\sum_{j=1}^n K_{ij}\exp(x_i+y_j)-\sum_{i=1}^m a_i x_i-\sum_{j=1}^n b_j y_j
\end{align*}
Since $(x,y)\in E$ implies $\sum_{i=1}^m a_i x_i=0$, on $E$ this is equivalently
\begin{align*}
\Phi(x,y)=\sum_{i=1}^m\sum_{j=1}^n K_{ij}\exp(x_i+y_j)-\sum_{j=1}^n b_j y_j.
\end{align*}
The normalization in the definition of $E$ removes the invariance
\begin{align*}
(x,y)\mapsto (x+c(1,\dots,1),y-c(1,\dots,1)),
\end{align*}
which leaves all sums $x_i+y_j$ unchanged.
[/step]
[step:Prove coercivity of the functional on the normalized slice]
We prove that $\Phi(z)\to+\infty$ whenever $z=(x,y)\in E$ and $|z|\to\infty$, where $|z|$ is the Euclidean norm on $\mathbb R^{m+n}$.
Let
\begin{align*}
\alpha:=\min_{1\le i\le m} a_i
\end{align*}
and
\begin{align*}
\beta:=\min_{1\le j\le n} b_j.
\end{align*}
Then $\alpha>0$ and $\beta>0$. For $(x,y)\in E$, the identity $\sum_i a_i x_i=0$ implies that either all $x_i=0$ or the vector $x$ has at least one nonnegative component and at least one nonpositive component. In particular,
\begin{align*}
\max_{1\le i\le m} x_i\ge 0
\end{align*}
and
\begin{align*}
\min_{1\le i\le m} x_i\le 0.
\end{align*}
Also,
\begin{align*}
\sum_{i=1}^m a_i |x_i|\le \max_{1\le i\le m}x_i-\min_{1\le i\le m}x_i.
\end{align*}
Let
\begin{align*}
M_K:=\min_{1\le i\le m,\ 1\le j\le n}K_{ij}.
\end{align*}
Then $M_K>0$. For every $(x,y)\in E$,
\begin{align*}
\sum_{i=1}^m\sum_{j=1}^n K_{ij}\exp(x_i+y_j)\ge M_K\exp\left(\max_i x_i+\max_j y_j\right).
\end{align*}
Set
\begin{align*}
A:=\max_{1\le i\le m}x_i+\max_{1\le j\le n}y_j.
\end{align*}
Since $\max_i x_i\ge 0$, we have $\max_j y_j\le A$. Hence
\begin{align*}
-\sum_{j=1}^n b_j y_j\ge -\max_{1\le j\le n}y_j\ge -A.
\end{align*}
Therefore, when $A\to+\infty$,
\begin{align*}
\Phi(x,y)\ge M_K\exp(A)-A\to+\infty,
\end{align*}
because the exponential function dominates the linear function.
It remains to control sequences for which $\max_i x_i+\max_j y_j$ is bounded above. Suppose $(x_k,y_k)\in E$ is a sequence with $|(x_k,y_k)|\to\infty$ and with $\max_i x_{k,i}+\max_j y_{k,j}$ bounded above. If $\max_j y_{k,j}\to+\infty$, then $\max_i x_{k,i}\to-\infty$, contradicting $\max_i x_{k,i}\ge0$. Hence $\max_j y_{k,j}$ is bounded above. Since
\begin{align*}
-\sum_{j=1}^n b_j y_{k,j}\ge -\max_{1\le j\le n}y_{k,j},
\end{align*}
the linear part cannot tend to $-\infty$ through the maximum of $y_k$.
If some component $y_{k,j}\to-\infty$, then, because $b_j>0$,
\begin{align*}
-\sum_{\ell=1}^n b_\ell y_{k,\ell}\to+\infty
\end{align*}
unless another component of $y_k$ tends to $+\infty$, which has already been excluded. Therefore $\Phi(x_k,y_k)\to+\infty$ in this case.
Finally, if $y_k$ is bounded and $|(x_k,y_k)|\to\infty$, then $|x_k|\to\infty$. Since $x_k\in E$, the oscillation $\max_i x_{k,i}-\min_i x_{k,i}$ tends to $+\infty$. With $y_k$ bounded, at least one sum $x_{k,i}+y_{k,j}$ tends to $+\infty$, and the positive exponential term again forces $\Phi(x_k,y_k)\to+\infty$. Thus $\Phi$ is coercive on $E$.
[guided]
The only delicate point is that $\Phi$ has a built-in scaling symmetry: adding the same constant to every $x_i$ and subtracting it from every $y_j$ does not change $x_i+y_j$. The slice
\begin{align*}
E=\left\{(x,y)\in\mathbb R^m\times\mathbb R^n:\sum_{i=1}^m a_i x_i=0\right\}
\end{align*}
removes exactly this freedom.
We now show that $\Phi$ tends to $+\infty$ along every unbounded sequence in $E$. Let $((x_k,y_k))_{k\in\mathbb N}$ be a sequence in $E$ with $|(x_k,y_k)|\to\infty$, where $x_k=(x_{k,1},\dots,x_{k,m})\in\mathbb R^m$ and $y_k=(y_{k,1},\dots,y_{k,n})\in\mathbb R^n$. Define
\begin{align*}
M_K:=\min_{1\le i\le m,\ 1\le j\le n}K_{ij}.
\end{align*}
Because every entry of $K$ is strictly positive, $M_K>0$. For every $(x,y)\in E$, the exponential part satisfies
\begin{align*}
\sum_{i=1}^m\sum_{j=1}^n K_{ij}\exp(x_i+y_j)\ge M_K\exp\left(\max_i x_i+\max_j y_j\right).
\end{align*}
Define
\begin{align*}
A_k:=\max_{1\le i\le m}x_{k,i}+\max_{1\le j\le n}y_{k,j}.
\end{align*}
If $A_k\to+\infty$, then the normalization gives $\max_i x_{k,i}\ge0$, so $\max_j y_{k,j}\le A_k$. Because the weights $b_j$ are positive and sum to $1$,
\begin{align*}
-\sum_{j=1}^n b_j y_{k,j}\ge -\max_{1\le j\le n}y_{k,j}\ge -A_k.
\end{align*}
Consequently
\begin{align*}
\Phi(x_k,y_k)\ge M_K\exp(A_k)-A_k\to+\infty.
\end{align*}
This is the precise domination estimate: the worst possible contribution from the linear term is only linear in $A_k$, while the positive exponential term grows like $\exp(A_k)$.
It remains to consider the case where $A_k=\max_i x_{k,i}+\max_j y_{k,j}$ is bounded above. Since
\begin{align*}
\sum_{i=1}^m a_i x_{k,i}=0
\end{align*}
and every $a_i$ is positive, the vector $x_k$ cannot have all components strictly negative or all components strictly positive unless it is balanced by components of the opposite sign. In particular,
\begin{align*}
\max_i x_{k,i}\ge0.
\end{align*}
Therefore boundedness of $\max_i x_{k,i}+\max_j y_{k,j}$ implies that $\max_j y_{k,j}$ is bounded above.
Now the negative linear term in $\Phi$ is helpful. Since each $b_j>0$, if any component $y_{k,j}$ tends to $-\infty$ while $\max_j y_{k,j}$ remains bounded above, then
\begin{align*}
-\sum_{\ell=1}^n b_\ell y_{k,\ell}\to+\infty.
\end{align*}
Hence $\Phi(x_k,y_k)\to+\infty$ in that case.
The only remaining possibility is that $y_k$ is bounded but $x_k$ is unbounded. Since $x_k$ satisfies the weighted normalization
\begin{align*}
\sum_{i=1}^m a_i x_{k,i}=0,
\end{align*}
unboundedness of $x_k$ forces the oscillation between its largest and smallest components to tend to $+\infty$. Because $y_k$ is bounded, some sum $x_{k,i}+y_{k,j}$ must tend to $+\infty$. The corresponding exponential term
\begin{align*}
K_{ij}\exp(x_{k,i}+y_{k,j})
\end{align*}
then tends to $+\infty$, and all other exponential terms are nonnegative. Thus $\Phi(x_k,y_k)\to+\infty$. This proves coercivity on $E$.
[/guided]
[/step]
[step:Minimize the functional and derive the row and column equations]
We use the finite-dimensional direct method. Let $((x_k,y_k))_{k\in\mathbb N}$ be a minimizing sequence for $\Phi$ on $E$. Since $\Phi$ is coercive on $E$, this sequence is bounded in the finite-dimensional Euclidean space $\mathbb R^m\times\mathbb R^n$. By Bolzano-Weierstrass, after passing to a subsequence there is a point $(x^*,y^*)\in\mathbb R^m\times\mathbb R^n$ such that $(x_k,y_k)\to(x^*,y^*)$. Since $E$ is a closed affine subspace, $(x^*,y^*)\in E$. Since $\Phi:E\to\mathbb R$ is continuous, $\Phi$ attains its minimum at $(x^*,y^*)$.
For $i\in\{1,\dots,m\}$ and $j\in\{1,\dots,n\}$, define
\begin{align*}
P_{ij}:=K_{ij}\exp(x_i^*+y_j^*).
\end{align*}
We first compute the column equations. For each $j\in\{1,\dots,n\}$ and each $s\in\mathbb R$, the point $(x^*,y^*+s e_j)$ lies in $E$, where $e_j\in\mathbb R^n$ is the $j$-th standard basis vector. Differentiating the one-variable function $s\mapsto\Phi(x^*,y^*+s e_j)$ at $s=0$ gives
\begin{align*}
0=\sum_{i=1}^m K_{ij}\exp(x_i^*+y_j^*)-b_j.
\end{align*}
Therefore
\begin{align*}
\sum_{i=1}^m P_{ij}=b_j.
\end{align*}
For the row equations, let $h=(h_1,\dots,h_m)\in\mathbb R^m$ satisfy
\begin{align*}
\sum_{i=1}^m a_i h_i=0.
\end{align*}
Then $(x^*+s h,y^*)\in E$ for every $s\in\mathbb R$. Differentiating $s\mapsto\Phi(x^*+s h,y^*)$ at $s=0$ gives
\begin{align*}
0=\sum_{i=1}^m h_i\left(\sum_{j=1}^n K_{ij}\exp(x_i^*+y_j^*)-a_i\right).
\end{align*}
Define $r=(r_1,\dots,r_m)\in\mathbb R^m$ by
\begin{align*}
r_i:=\sum_{j=1}^n P_{ij}-a_i.
\end{align*}
The preceding identity says
\begin{align*}
\sum_{i=1}^m h_i r_i=0
\end{align*}
for every $h\in\mathbb R^m$ satisfying $\sum_i a_i h_i=0$. Hence $r$ is a scalar multiple of $a$. Summing the components of $r$ and using the already proved column equations gives
\begin{align*}
\sum_{i=1}^m r_i=\sum_{i=1}^m\sum_{j=1}^n P_{ij}-\sum_{i=1}^m a_i=\sum_{j=1}^n b_j-1=0.
\end{align*}
Since $\sum_i a_i=1$, the scalar multiple must be zero. Thus $r_i=0$ for every $i$, namely
\begin{align*}
\sum_{j=1}^n P_{ij}=a_i.
\end{align*}
[/step]
[step:Convert the critical point into positive scaling vectors]
Define
\begin{align*}
u:\{1,\dots,m\}\to(0,\infty),\qquad i\mapsto \exp(x_i^*)
\end{align*}
and
\begin{align*}
v:\{1,\dots,n\}\to(0,\infty),\qquad j\mapsto \exp(y_j^*)
\end{align*}
Equivalently, write $u=(u_1,\dots,u_m)\in(0,\infty)^m$ and $v=(v_1,\dots,v_n)\in(0,\infty)^n$. Then
\begin{align*}
P_{ij}=u_iK_{ij}v_j.
\end{align*}
The row and column identities proved above show that
\begin{align*}
P=\operatorname{diag}(u)K\operatorname{diag}(v)\in\Pi(a,b).
\end{align*}
This proves existence.
[/step]
[step:Identify the only possible non-uniqueness of the scaling factors]
Let $\widetilde u\in(0,\infty)^m$ and $\widetilde v\in(0,\infty)^n$ satisfy
\begin{align*}
\widetilde P:=\operatorname{diag}(\widetilde u)K\operatorname{diag}(\widetilde v)\in\Pi(a,b).
\end{align*}
Define logarithmic differences $p=(p_1,\dots,p_m)\in\mathbb R^m$ and $q=(q_1,\dots,q_n)\in\mathbb R^n$ by
\begin{align*}
p_i:=\log \widetilde u_i-\log u_i
\end{align*}
and
\begin{align*}
q_j:=\log \widetilde v_j-\log v_j.
\end{align*}
Define the parameterized matrix family
\begin{align*}
P_{\mathrm{seg}}:[0,1]\to(0,\infty)^{m\times n},\qquad t\mapsto P_{\mathrm{seg}}(t)
\end{align*}
by
\begin{align*}
P_{\mathrm{seg}}(t)_{ij}:=P_{ij}\exp(t(p_i+q_j)).
\end{align*}
Both endpoint matrices $P_{\mathrm{seg}}(0)=P$ and $P_{\mathrm{seg}}(1)=\widetilde P$ have row marginals $a$ and column marginals $b$.
Consider the convex function $\psi:[0,1]\to\mathbb R$ defined by
\begin{align*}
\psi(t):=\sum_{i=1}^m\sum_{j=1}^n P_{ij}\exp(t(p_i+q_j))-\sum_{i=1}^m a_i t p_i-\sum_{j=1}^n b_j t q_j.
\end{align*}
The endpoint marginal equations imply $\psi'(0)=0$ and $\psi'(1)=0$. Its second derivative is
\begin{align*}
\psi''(t)=\sum_{i=1}^m\sum_{j=1}^n P_{ij}\exp(t(p_i+q_j))(p_i+q_j)^2.
\end{align*}
Since every $P_{ij}>0$, every term in this sum is nonnegative, and $\psi''(t)=0$ holds only if
\begin{align*}
p_i+q_j=0
\end{align*}
for every $i\in\{1,\dots,m\}$ and $j\in\{1,\dots,n\}$. Because $\psi''\ge0$, the derivative $\psi'$ is nondecreasing on $[0,1]$. Together with $\psi'(0)=\psi'(1)=0$, this forces $\psi'$ to be constant on $[0,1]$. Hence the continuous nonnegative function $\psi''$ vanishes on $(0,1)$. Therefore $p_i+q_j=0$ for all $i,j$.
Fix $i_0\in\{1,\dots,m\}$ and define
\begin{align*}
c:=p_{i_0}.
\end{align*}
From $p_i+q_j=0$ for all $i,j$, we obtain $q_j=-c$ for every $j$ and then $p_i=c$ for every $i$. Set
\begin{align*}
\lambda:=\exp(c).
\end{align*}
Then $\lambda>0$ and
\begin{align*}
\widetilde u=\lambda u
\end{align*}
while
\begin{align*}
\widetilde v=\lambda^{-1}v.
\end{align*}
Thus the scaling pair is unique up to reciprocal scalar rescaling, completing the proof.
[/step]