[proofplan]
The proof is the direct method of the [calculus of variations](/page/Calculus%20of%20Variations) applied to the compact set of transport plans. First we use the finite-cost admissible plan and the lower bound on $c$ to see that the Kantorovich value is a real number, then choose a minimizing sequence. Fixed marginals make this sequence tight, so Prokhorov compactness gives a weakly convergent subsequence. We verify that the weak limit still has marginals $\mu$ and $\nu$, and then use the Portmanteau lower semicontinuity bound for the shifted nonnegative cost $c-m$ to prove that the limit plan attains the infimum.
[/proofplan]
[step:Choose a finite minimizing sequence in $\Pi(\mu,\nu)$]
Let $m \in \mathbb{R}$ be a lower bound for $c$, so $c(x,y) \geq m$ for every $(x,y) \in X \times Y$. Since every $\gamma \in \Pi(\mu,\nu)$ is a probability measure on $X \times Y$, we have
\begin{align*}
\int_{X \times Y} c(x,y)\,d\gamma(x,y) \geq m.
\end{align*}
The hypothesis gives $\bar{\gamma} \in \Pi(\mu,\nu)$ with finite cost, hence
\begin{align*}
m \leq \mathsf{K}_c(\mu,\nu) \leq \int_{X \times Y} c(x,y)\,d\bar{\gamma}(x,y) < \infty.
\end{align*}
Thus $\mathsf{K}_c(\mu,\nu) \in \mathbb{R}$. By the definition of the infimum, for each $n \in \mathbb{N}$ there exists $\gamma_n \in \Pi(\mu,\nu)$ such that
\begin{align*}
\int_{X \times Y} c(x,y)\,d\gamma_n(x,y) \leq \mathsf{K}_c(\mu,\nu) + \frac{1}{n}.
\end{align*}
The sequence $(\gamma_n)_{n \in \mathbb{N}}$ is a minimizing sequence.
[/step]
[step:Extract a weakly convergent subsequence from tightness of the fixed marginals]
Because $X$ and $Y$ are Polish spaces, the probability measures $\mu$ and $\nu$ are tight. Let $\varepsilon > 0$. Choose compact sets $K_X \subset X$ and $K_Y \subset Y$ such that
\begin{align*}
\mu(K_X) \geq 1 - \frac{\varepsilon}{2}
\end{align*}
and
\begin{align*}
\nu(K_Y) \geq 1 - \frac{\varepsilon}{2}.
\end{align*}
For every $n \in \mathbb{N}$, since $(\pi_X)_\#\gamma_n=\mu$ and $(\pi_Y)_\#\gamma_n=\nu$, the union bound gives
\begin{align*}
\gamma_n((X \times Y)\setminus (K_X \times K_Y)) \leq \gamma_n((X\setminus K_X)\times Y) + \gamma_n(X \times (Y\setminus K_Y)).
\end{align*}
Using the marginal identities, this becomes
\begin{align*}
\gamma_n((X \times Y)\setminus (K_X \times K_Y)) \leq \mu(X\setminus K_X) + \nu(Y\setminus K_Y) \leq \varepsilon.
\end{align*}
Thus $(\gamma_n)_{n \in \mathbb{N}}$ is tight in $\mathcal{P}(X \times Y)$. Since $X \times Y$ is Polish, [Prokhorov's theorem](/theorems/1172) implies that the tight family $(\gamma_n)_{n \in \mathbb{N}}$ is relatively sequentially compact for [weak convergence](/page/Weak%20Convergence) (citing a result not yet in the wiki: Prokhorov theorem). Therefore there exist a subsequence $(\gamma_{n_k})_{k \in \mathbb{N}}$ and a probability measure $\gamma^* \in \mathcal{P}(X \times Y)$ such that $\gamma_{n_k}$ converges weakly to $\gamma^*$.
[guided]
The compactness input is not compactness of $X \times Y$ itself; the spaces may be noncompact. Instead, the fixed marginals force uniform tightness. Since $X$ and $Y$ are Polish, every probability measure on them is tight. Therefore, for a given $\varepsilon > 0$, we can find compact sets $K_X \subset X$ and $K_Y \subset Y$ with
\begin{align*}
\mu(K_X) \geq 1 - \frac{\varepsilon}{2}
\end{align*}
and
\begin{align*}
\nu(K_Y) \geq 1 - \frac{\varepsilon}{2}.
\end{align*}
Now let $n \in \mathbb{N}$. Since $\gamma_n$ has first marginal $\mu$, the mass that $\gamma_n$ assigns to $(X\setminus K_X)\times Y$ is exactly $\mu(X\setminus K_X)$. Since $\gamma_n$ has second marginal $\nu$, the mass that $\gamma_n$ assigns to $X\times (Y\setminus K_Y)$ is exactly $\nu(Y\setminus K_Y)$. The complement of $K_X \times K_Y$ is contained in the union of these two bad sets, so
\begin{align*}
\gamma_n((X \times Y)\setminus (K_X \times K_Y)) \leq \mu(X\setminus K_X) + \nu(Y\setminus K_Y) \leq \varepsilon.
\end{align*}
The set $K_X \times K_Y$ is compact in $X \times Y$, so this estimate proves tightness of the whole sequence $(\gamma_n)_{n \in \mathbb{N}}$. Because $X \times Y$ is again Polish, Prokhorov's theorem applies and gives a weakly convergent subsequence. Thus there are indices $n_k \to \infty$ and a probability measure $\gamma^* \in \mathcal{P}(X \times Y)$ such that $\gamma_{n_k}$ converges weakly to $\gamma^*$ (citing a result not yet in the wiki: Prokhorov theorem).
[/guided]
[/step]
[step:Verify that the weak limit has marginals $\mu$ and $\nu$]
We prove that $\gamma^* \in \Pi(\mu,\nu)$. Let $\varphi: X \to \mathbb{R}$ be a bounded [continuous function](/page/Continuous%20Function). Then $\varphi \circ \pi_X: X \times Y \to \mathbb{R}$ is bounded and continuous. Weak convergence of $\gamma_{n_k}$ to $\gamma^*$ gives
\begin{align*}
\int_{X \times Y} \varphi(\pi_X(x,y))\,d\gamma^*(x,y) = \lim_{k \to \infty} \int_{X \times Y} \varphi(\pi_X(x,y))\,d\gamma_{n_k}(x,y).
\end{align*}
Since $(\pi_X)_\#\gamma_{n_k}=\mu$ for every $k \in \mathbb{N}$, the right-hand side equals
\begin{align*}
\int_X \varphi(x)\,d\mu(x).
\end{align*}
Hence $(\pi_X)_\#\gamma^*=\mu$. The same argument with an arbitrary bounded continuous function $\psi: Y \to \mathbb{R}$ gives $(\pi_Y)_\#\gamma^*=\nu$. Therefore $\gamma^* \in \Pi(\mu,\nu)$.
[/step]
[step:Apply Portmanteau to the shifted lower semicontinuous cost]
Define the shifted cost function $\tilde{c}: X \times Y \to [0,\infty]$ by
\begin{align*}
\tilde{c}(x,y) := c(x,y)-m.
\end{align*}
Since $c$ is lower semicontinuous and $m$ is constant, $\tilde{c}$ is lower semicontinuous. Since $c \geq m$, the function $\tilde{c}$ is nonnegative. The Portmanteau lower semicontinuity theorem for nonnegative lower semicontinuous functions gives
\begin{align*}
\int_{X \times Y} \tilde{c}(x,y)\,d\gamma^*(x,y) \leq \liminf_{k \to \infty} \int_{X \times Y} \tilde{c}(x,y)\,d\gamma_{n_k}(x,y).
\end{align*}
Because every $\gamma_{n_k}$ and $\gamma^*$ is a probability measure, subtracting and adding the constant $m$ gives
\begin{align*}
\int_{X \times Y} c(x,y)\,d\gamma^*(x,y) \leq \liminf_{k \to \infty} \int_{X \times Y} c(x,y)\,d\gamma_{n_k}(x,y).
\end{align*}
Using the minimizing property of $(\gamma_n)_{n \in \mathbb{N}}$, we obtain
\begin{align*}
\liminf_{k \to \infty} \int_{X \times Y} c(x,y)\,d\gamma_{n_k}(x,y) \leq \lim_{k \to \infty} \left(\mathsf{K}_c(\mu,\nu)+\frac{1}{n_k}\right) = \mathsf{K}_c(\mu,\nu).
\end{align*}
Thus
\begin{align*}
\int_{X \times Y} c(x,y)\,d\gamma^*(x,y) \leq \mathsf{K}_c(\mu,\nu).
\end{align*}
[/step]
[step:Compare with the defining infimum to conclude optimality]
Since $\gamma^* \in \Pi(\mu,\nu)$, the definition of $\mathsf{K}_c(\mu,\nu)$ as the infimum over all admissible plans gives
\begin{align*}
\mathsf{K}_c(\mu,\nu) \leq \int_{X \times Y} c(x,y)\,d\gamma^*(x,y).
\end{align*}
The reverse inequality was proved in the previous step. Therefore
\begin{align*}
\int_{X \times Y} c(x,y)\,d\gamma^*(x,y) = \mathsf{K}_c(\mu,\nu).
\end{align*}
Hence $\gamma^*$ is an optimal Kantorovich plan.
[/step]