[step:Apply Kellerer's compactness theorem to a maximizing sequence]We invoke a previously established external result that is stronger than the present theorem: besides the existence of an optimizer, it yields compactness of maximizing sequences. Kellerer's compactness theorem for the Kantorovich dual problem states the following. Let $X$ and $Y$ be Polish spaces, let $\mu\in\mathcal{P}(X)$ and $\nu\in\mathcal{P}(Y)$, and let $c:X\times Y\to[0,\infty)$ be finite and lower semicontinuous. Suppose there exist Borel functions $a:X\to\mathbb{R}$ and $b:Y\to\mathbb{R}$ with $a\in L^1(X,\mu)$, $b\in L^1(Y,\nu)$, and $c(x,y)\leq a(x)+b(y)$ for every $(x,y)\in X\times Y$. Then every maximizing sequence in $\mathcal{A}_c(\mu,\nu)$ has a subsequence whose normalized potentials converge in the Kellerer compactness topology to Borel integrable potentials $(\varphi,\psi)\in\mathcal{A}_c(\mu,\nu)$ that attain the supremum $D_c(\mu,\nu)$.
We verify the hypotheses of this theorem. The spaces $X$ and $Y$ are Polish by assumption, and $\mu$ and $\nu$ are Borel probability measures because $\mu\in\mathcal{P}(X)$ and $\nu\in\mathcal{P}(Y)$. The cost function $c:X\times Y\to[0,\infty)$ is finite and lower semicontinuous by assumption. The domination hypothesis holds exactly as stated: the given Borel functions $a:X\to\mathbb{R}$ and $b:Y\to\mathbb{R}$ satisfy $a\in L^1(X,\mu)$, $b\in L^1(Y,\nu)$, and
\begin{align*}
0 \leq c(x,y) \leq a(x)+b(y)
\end{align*}
for every $(x,y)\in X\times Y$.
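The role of the domination hypothesis can be made explicit by a short computation. As a sketch, assume (this definition is not restated in the present step) that $\mathcal{A}_c(\mu,\nu)$ consists of pairs $(f,g)$ with $f\in L^1(X,\mu)$, $g\in L^1(Y,\nu)$, and $f(x)+g(y)\leq c(x,y)$ for every $(x,y)\in X\times Y$. Integrating against the product measure $\mu\otimes\nu$, whose marginals are $\mu$ and $\nu$, gives for every such pair
\begin{align*}
\int_X f(x)\, d\mu(x)+\int_Y g(y)\, d\nu(y)
&= \int_{X\times Y} \bigl(f(x)+g(y)\bigr)\, d(\mu\otimes\nu)(x,y) \\
&\leq \int_{X\times Y} c(x,y)\, d(\mu\otimes\nu)(x,y) \\
&\leq \int_X a(x)\, d\mu(x)+\int_Y b(y)\, d\nu(y) < \infty.
\end{align*}
Under the same assumed definition, the pair $(0,0)$ is admissible because $c\geq 0$, so the dual value is bounded below by $0$ and above by the last line of the display.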
Since $-\infty<D_c(\mu,\nu)<\infty$ by the preceding step, we may choose a maximizing sequence $((f_n,g_n))_{n=1}^{\infty}$ in $\mathcal{A}_c(\mu,\nu)$, that is, a sequence with $(f_n,g_n)\in\mathcal{A}_c(\mu,\nu)$ for every $n$ and
\begin{align*}
\lim_{n\to\infty}
\left(
\int_X f_n(x)\, d\mu(x)+\int_Y g_n(y)\, d\nu(y)
\right)
=
D_c(\mu,\nu).
\end{align*}
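Such a sequence can be produced directly from the definition of the supremum. The following is a minimal sketch, in which $J(f,g)$ is a shorthand introduced here for the dual functional and is not notation from the original argument. Because $D_c(\mu,\nu)$ is finite, for each $n\geq 1$ the number $D_c(\mu,\nu)-\frac{1}{n}$ is not an upper bound for $J$ on $\mathcal{A}_c(\mu,\nu)$, so there is a pair $(f_n,g_n)\in\mathcal{A}_c(\mu,\nu)$ with
\begin{align*}
D_c(\mu,\nu)-\frac{1}{n}
<
J(f_n,g_n)
\leq
D_c(\mu,\nu),
\qquad
J(f,g) := \int_X f(x)\, d\mu(x)+\int_Y g(y)\, d\nu(y).
\end{align*}
Letting $n\to\infty$ and squeezing between the two bounds gives the limit displayed above.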
Kellerer's compactness theorem, applied to this sequence, yields (after passing to a subsequence) real-valued Borel functions $\varphi:X\to\mathbb{R}$ and $\psi:Y\to\mathbb{R}$ such that $\varphi\in L^1(X,\mu)$, $\psi\in L^1(Y,\nu)$,
\begin{align*}
\varphi(x)+\psi(y)\leq c(x,y)
\end{align*}
for every $(x,y)\in X\times Y$, and
\begin{align*}
\int_X \varphi(x)\, d\mu(x)+\int_Y \psi(y)\, d\nu(y)
=
D_c(\mu,\nu).
\end{align*}[/step]