[proofplan]
We first pass the marginal constraints to the narrow limit by applying the continuous mapping principle to the coordinate projections $X \times Y \to X$ and $X \times Y \to Y$. The lower semicontinuity and lower boundedness of $c$ then give the liminf inequality by the Portmanteau lower semicontinuity theorem, after shifting $c$ by a constant if necessary. Under the recovery-sequence hypothesis, optimality of each $\pi_k$ compares the costs of $\pi_k$ with approximating competitors for the limiting problem, which proves that every subsequential limit is optimal. Finally, tightness and [Prokhorov's theorem](/theorems/1172) reduce convergence of the whole sequence of optimal values to the already proved subsequential liminf bound and the recovery-sequence limsup bound.
[/proofplan]
[step:Pass the marginal constraints to the narrow limit]
Let
\begin{align*}
p_X:X \times Y &\to X
\end{align*}
and
\begin{align*}
p_Y:X \times Y &\to Y
\end{align*}
denote the coordinate projections, defined by $p_X(x,y)=x$ and $p_Y(x,y)=y$. These maps are continuous because $X \times Y$ carries the [product topology](/page/Product%20Topology).
Since $\pi_{k_j} \to \pi$ narrowly in $\mathcal{P}(X \times Y)$, the [continuous mapping theorem](/theorems/1847) for pushforwards gives
\begin{align*}
(p_X)_\#\pi_{k_j} \to (p_X)_\#\pi
\end{align*}
narrowly in $\mathcal{P}(X)$ and
\begin{align*}
(p_Y)_\#\pi_{k_j} \to (p_Y)_\#\pi
\end{align*}
narrowly in $\mathcal{P}(Y)$.
For every $j \ge 1$, the condition $\pi_{k_j} \in \Pi(\mu_{k_j},\nu_{k_j})$ means
\begin{align*}
(p_X)_\#\pi_{k_j}=\mu_{k_j}
\end{align*}
and
\begin{align*}
(p_Y)_\#\pi_{k_j}=\nu_{k_j}.
\end{align*}
By hypothesis, $\mu_{k_j}\to\mu$ narrowly and $\nu_{k_j}\to\nu$ narrowly. Since narrow limits in $\mathcal{P}(X)$ and $\mathcal{P}(Y)$ are unique on Polish spaces, we obtain
\begin{align*}
(p_X)_\#\pi=\mu
\end{align*}
and
\begin{align*}
(p_Y)_\#\pi=\nu.
\end{align*}
Thus $\pi \in \Pi(\mu,\nu)$.
[guided]
The goal of this step is to prove that the limiting measure still has the correct marginals. Define the coordinate projections
\begin{align*}
p_X:X \times Y &\to X
\end{align*}
and
\begin{align*}
p_Y:X \times Y &\to Y
\end{align*}
by $p_X(x,y)=x$ and $p_Y(x,y)=y$. These maps are continuous for the product topology on $X \times Y$.
We apply the continuous mapping theorem for pushforwards. Its relevant content here is that if probability measures converge narrowly and the map is continuous, then their pushforwards also converge narrowly. Applying it first to $p_X$ gives
\begin{align*}
(p_X)_\#\pi_{k_j} \to (p_X)_\#\pi
\end{align*}
narrowly in $\mathcal{P}(X)$. Applying it to $p_Y$ gives
\begin{align*}
(p_Y)_\#\pi_{k_j} \to (p_Y)_\#\pi
\end{align*}
narrowly in $\mathcal{P}(Y)$.
Now we use the fact that each $\pi_{k_j}$ is a coupling of $\mu_{k_j}$ and $\nu_{k_j}$. By definition of $\Pi(\mu_{k_j},\nu_{k_j})$,
\begin{align*}
(p_X)_\#\pi_{k_j}=\mu_{k_j}
\end{align*}
and
\begin{align*}
(p_Y)_\#\pi_{k_j}=\nu_{k_j}.
\end{align*}
The hypotheses say that $\mu_{k_j}\to\mu$ narrowly and $\nu_{k_j}\to\nu$ narrowly. Therefore the same sequence $(p_X)_\#\pi_{k_j}$ has narrow limit $(p_X)_\#\pi$ by the pushforward argument and narrow limit $\mu$ by hypothesis. Narrow limits of probability measures on Polish spaces are unique, so
\begin{align*}
(p_X)_\#\pi=\mu.
\end{align*}
The same argument for the second coordinate gives
\begin{align*}
(p_Y)_\#\pi=\nu.
\end{align*}
These two identities are exactly the definition of $\pi \in \Pi(\mu,\nu)$.
[/guided]
[/step]
[step:Apply lower semicontinuity of the cost functional]
Choose a real number $a \in \mathbb{R}$ such that $c(x,y)\ge a$ for every $(x,y)\in X \times Y$. Define
\begin{align*}
\widetilde c:X \times Y &\to [0,\infty]
\end{align*}
\begin{align*}
(x,y) &\mapsto c(x,y)-a.
\end{align*}
Then $\widetilde c$ is lower semicontinuous and non-negative. By the Portmanteau lower semicontinuity theorem applied to $\widetilde c$ and the narrow convergence $\pi_{k_j}\to\pi$,
\begin{align*}
\int_{X \times Y} \widetilde c(x,y)\,d\pi(x,y)
\leq
\liminf_{j \to \infty}
\int_{X \times Y} \widetilde c(x,y)\,d\pi_{k_j}(x,y).
\end{align*}
Since all measures involved are probability measures,
\begin{align*}
\int_{X \times Y} \widetilde c(x,y)\,d\pi(x,y)
=
\int_{X \times Y} c(x,y)\,d\pi(x,y)-a
\end{align*}
and
\begin{align*}
\int_{X \times Y} \widetilde c(x,y)\,d\pi_{k_j}(x,y)
=
\int_{X \times Y} c(x,y)\,d\pi_{k_j}(x,y)-a.
\end{align*}
Adding $a$ to both sides of the liminf inequality yields
\begin{align*}
\int_{X \times Y} c(x,y)\,d\pi(x,y)
\leq
\liminf_{j \to \infty}
\int_{X \times Y} c(x,y)\,d\pi_{k_j}(x,y).
\end{align*}
[/step]
[step:Compare subsequential limits with recovered competitors]
Assume the recovery-sequence hypothesis. Let $\gamma \in \Pi(\mu,\nu)$ satisfy
\begin{align*}
\int_{X \times Y} c(x,y)\,d\gamma(x,y)<\infty.
\end{align*}
Choose a sequence $(\gamma_k)_{k\ge 1}$ such that $\gamma_k\in\Pi(\mu_k,\nu_k)$, $\gamma_k\to\gamma$ narrowly, and
\begin{align*}
\lim_{k\to\infty}
\int_{X \times Y} c(x,y)\,d\gamma_k(x,y)
=
\int_{X \times Y} c(x,y)\,d\gamma(x,y).
\end{align*}
For each $j\ge 1$, optimality of $\pi_{k_j}$ in $\Pi(\mu_{k_j},\nu_{k_j})$ gives
\begin{align*}
\int_{X \times Y} c(x,y)\,d\pi_{k_j}(x,y)
\leq
\int_{X \times Y} c(x,y)\,d\gamma_{k_j}(x,y).
\end{align*}
Combining this inequality with the liminf inequality already proved gives
\begin{align*}
\int_{X \times Y} c(x,y)\,d\pi(x,y)
\leq
\liminf_{j\to\infty}
\int_{X \times Y} c(x,y)\,d\pi_{k_j}(x,y)
\leq
\limsup_{j\to\infty}
\int_{X \times Y} c(x,y)\,d\pi_{k_j}(x,y)
\leq
\int_{X \times Y} c(x,y)\,d\gamma(x,y).
\end{align*}
Thus $\pi$ has cost no larger than every finite-cost competitor $\gamma \in \Pi(\mu,\nu)$.
If there is no finite-cost competitor in $\Pi(\mu,\nu)$, then every element of $\Pi(\mu,\nu)$ has cost $+\infty$, and every coupling is optimal in the extended sense. Otherwise, the preceding inequality proves that $\pi$ minimizes the cost over all finite-cost competitors. Since competitors of infinite cost cannot have smaller cost than $\pi$, it follows that $\pi$ is optimal in $\Pi(\mu,\nu)$.
[/step]
[step:Prove the limsup bound for optimal values]
For each $k\ge 1$, define the optimal value
\begin{align*}
V_k := \int_{X \times Y} c(x,y)\,d\pi_k(x,y).
\end{align*}
This equals the infimum over $\Pi(\mu_k,\nu_k)$ because $\pi_k$ is optimal. Let $V$ denote the limiting optimal value
\begin{align*}
V := \inf_{\sigma \in \Pi(\mu,\nu)}
\int_{X \times Y} c(x,y)\,d\sigma(x,y).
\end{align*}
If $V<\infty$, let $\varepsilon>0$ be arbitrary. By the definition of the infimum, choose $\gamma^\varepsilon\in\Pi(\mu,\nu)$ such that
\begin{align*}
\int_{X \times Y} c(x,y)\,d\gamma^\varepsilon(x,y)
\leq V+\varepsilon.
\end{align*}
The recovery-sequence hypothesis supplies $\gamma_k^\varepsilon\in\Pi(\mu_k,\nu_k)$ with $\gamma_k^\varepsilon\to\gamma^\varepsilon$ narrowly and
\begin{align*}
\lim_{k\to\infty}
\int_{X \times Y} c(x,y)\,d\gamma_k^\varepsilon(x,y)
=
\int_{X \times Y} c(x,y)\,d\gamma^\varepsilon(x,y).
\end{align*}
Optimality of $\pi_k$ gives
\begin{align*}
V_k
\leq
\int_{X \times Y} c(x,y)\,d\gamma_k^\varepsilon(x,y).
\end{align*}
Taking the limsup in $k$ yields
\begin{align*}
\limsup_{k\to\infty} V_k \leq V+\varepsilon.
\end{align*}
Since $\varepsilon>0$ was arbitrary,
\begin{align*}
\limsup_{k\to\infty} V_k \leq V.
\end{align*}
If $V=+\infty$, this limsup inequality is automatic in the extended real order.
[/step]
[step:Use tightness to prove the liminf bound for optimal values]
We prove
\begin{align*}
V \leq \liminf_{k\to\infty} V_k.
\end{align*}
Let $(k_j)_{j\ge 1}$ be a strictly increasing sequence such that
\begin{align*}
\lim_{j\to\infty} V_{k_j}
=
\liminf_{k\to\infty} V_k
\end{align*}
in the extended real sense. Since $(\pi_k)_{k\ge 1}$ is tight in $\mathcal{P}(X\times Y)$ and $X\times Y$ is Polish, Prokhorov's theorem gives a further subsequence $(k_{j_\ell})_{\ell\ge 1}$ and a probability measure $\rho\in\mathcal{P}(X\times Y)$ such that
\begin{align*}
\pi_{k_{j_\ell}}\to\rho
\end{align*}
narrowly in $\mathcal{P}(X\times Y)$.
By the first two steps, $\rho\in\Pi(\mu,\nu)$ and
\begin{align*}
\int_{X \times Y} c(x,y)\,d\rho(x,y)
\leq
\liminf_{\ell\to\infty} V_{k_{j_\ell}}.
\end{align*}
Since $V$ is the infimum over $\Pi(\mu,\nu)$,
\begin{align*}
V \leq \int_{X \times Y} c(x,y)\,d\rho(x,y).
\end{align*}
Therefore
\begin{align*}
V \leq \liminf_{\ell\to\infty} V_{k_{j_\ell}}
=
\liminf_{k\to\infty} V_k.
\end{align*}
This proves the liminf bound.
[/step]
[step:Conclude optimality of limits and convergence of values]
The comparison with recovered competitors proves that every narrow subsequential limit $\pi$ of $(\pi_k)_{k\ge 1}$ is optimal in $\Pi(\mu,\nu)$. The limsup and liminf bounds for the optimal values give
\begin{align*}
\limsup_{k\to\infty} V_k \leq V \leq \liminf_{k\to\infty} V_k.
\end{align*}
Hence
\begin{align*}
\lim_{k\to\infty} V_k = V
\end{align*}
in the extended real sense. This is precisely the asserted convergence of optimal values.
[/step]