[proofplan]
We use the direct method. First choose a minimizing sequence of transport plans. The fixed marginal constraints imply tightness of this sequence on the product space, so [Prokhorov's theorem](/theorems/1172) gives a weakly convergent subsequence. We then show that the marginal constraints are closed under [weak convergence](/page/Weak%20Convergence) and use the Portmanteau lower semicontinuity theorem for the bounded-below lower semicontinuous cost to pass the minimizing property to the limit.
[/proofplan]
[step:Choose a minimizing sequence with finite costs]
Let $V$ denote the Kantorovich value:
\begin{align*}
V := \inf_{\gamma \in \Pi(\mu,\nu)} \int_{X \times Y} c(x,y)\, d\gamma(x,y).
\end{align*}
By hypothesis, $V < \infty$. Since $c$ is bounded below, there exists $a \in \mathbb{R}$ such that $c(x,y) \geq a$ for every $(x,y) \in X \times Y$. Every $\gamma \in \Pi(\mu,\nu)$ is a probability measure, so $\int_{X \times Y} c\, d\gamma \geq a$, and hence $V \geq a > -\infty$. Thus $V \in \mathbb{R}$.
For each $k \in \mathbb{N}$, choose $\pi_k \in \Pi(\mu,\nu)$ such that
\begin{align*}
\int_{X \times Y} c(x,y)\, d\pi_k(x,y) \leq V + \frac{1}{k}.
\end{align*}
This is possible by the definition of the infimum. The sequence $(\pi_k)_{k \in \mathbb{N}}$ is a minimizing sequence in $\Pi(\mu,\nu)$.
[/step]
[step:Use the fixed marginals to prove tightness on $X \times Y$]
We prove that $(\pi_k)_{k \in \mathbb{N}}$ is tight in $\mathcal{P}(X \times Y)$. Let $\varepsilon > 0$. Since $X$ and $Y$ are Polish spaces, the probability measures $\mu$ and $\nu$ are tight. Hence there exist compact sets $K_X \subset X$ and $K_Y \subset Y$ such that
\begin{align*}
\mu(X \setminus K_X) < \frac{\varepsilon}{2}
\end{align*}
and
\begin{align*}
\nu(Y \setminus K_Y) < \frac{\varepsilon}{2}.
\end{align*}
The set $K_X \times K_Y$ is compact in $X \times Y$, and its complement is contained in $((X \setminus K_X) \times Y) \cup (X \times (Y \setminus K_Y))$. By subadditivity of $\pi_k$, we have
\begin{align*}
\pi_k((X \times Y) \setminus (K_X \times K_Y))
\leq
\pi_k((X \setminus K_X) \times Y)
+
\pi_k(X \times (Y \setminus K_Y)).
\end{align*}
Using the marginal identities gives
\begin{align*}
\pi_k((X \setminus K_X) \times Y) = \mu(X \setminus K_X)
\end{align*}
and
\begin{align*}
\pi_k(X \times (Y \setminus K_Y)) = \nu(Y \setminus K_Y).
\end{align*}
Therefore
\begin{align*}
\pi_k((X \times Y) \setminus (K_X \times K_Y)) < \varepsilon
\end{align*}
for every $k \in \mathbb{N}$. This proves tightness.
[guided]
The point of this step is that although the plans $\pi_k$ may vary, their two marginals do not vary. Tightness of the marginals therefore forces tightness of the whole sequence of joint measures.
Let $\varepsilon > 0$. Because $X$ and $Y$ are Polish spaces, every Borel probability measure on each space is tight. Applying this to $\mu \in \mathcal{P}(X)$ and $\nu \in \mathcal{P}(Y)$, choose compact sets $K_X \subset X$ and $K_Y \subset Y$ satisfying
\begin{align*}
\mu(X \setminus K_X) < \frac{\varepsilon}{2}
\end{align*}
and
\begin{align*}
\nu(Y \setminus K_Y) < \frac{\varepsilon}{2}.
\end{align*}
The product $K_X \times K_Y$ is compact in $X \times Y$. We now estimate how much mass $\pi_k$ can put outside this product compact set. The inclusion
\begin{align*}
(X \times Y) \setminus (K_X \times K_Y)
\subset
((X \setminus K_X) \times Y) \cup (X \times (Y \setminus K_Y))
\end{align*}
gives, by subadditivity of the probability measure $\pi_k$,
\begin{align*}
\pi_k((X \times Y) \setminus (K_X \times K_Y))
\leq
\pi_k((X \setminus K_X) \times Y)
+
\pi_k(X \times (Y \setminus K_Y)).
\end{align*}
Now we use the defining property $\pi_k \in \Pi(\mu,\nu)$. The first marginal of $\pi_k$ is $\mu$, so
\begin{align*}
\pi_k((X \setminus K_X) \times Y) = \mu(X \setminus K_X).
\end{align*}
The second marginal of $\pi_k$ is $\nu$, so
\begin{align*}
\pi_k(X \times (Y \setminus K_Y)) = \nu(Y \setminus K_Y).
\end{align*}
Combining the three estimates yields
\begin{align*}
\pi_k((X \times Y) \setminus (K_X \times K_Y))
<
\frac{\varepsilon}{2} + \frac{\varepsilon}{2}
=
\varepsilon.
\end{align*}
The compact set $K_X \times K_Y$ works uniformly for all $k$, which is exactly tightness of the sequence $(\pi_k)_{k \in \mathbb{N}}$ in $\mathcal{P}(X \times Y)$.
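As a concrete illustration (an assumed special case, not needed for the proof), take $X = Y = \mathbb{R}$. The intervals $[-R,R]$ increase to $\mathbb{R}$ as $R \to \infty$, so continuity from below of $\mu$ and $\nu$ gives some $R > 0$ with $\mu(\mathbb{R} \setminus [-R,R]) < \frac{\varepsilon}{2}$ and $\nu(\mathbb{R} \setminus [-R,R]) < \frac{\varepsilon}{2}$. Choosing $K_X = K_Y = [-R,R]$, the estimate above reads
\begin{align*}
\pi_k(\mathbb{R}^2 \setminus [-R,R]^2)
&\leq
\mu(\mathbb{R} \setminus [-R,R]) + \nu(\mathbb{R} \setminus [-R,R])
<
\varepsilon
\quad \text{for every } k \in \mathbb{N}.
\end{align*}
The radius $R$ depends only on $\mu$, $\nu$ and $\varepsilon$, not on the plan $\pi_k$.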
[/guided]
[/step]
[step:Extract a weakly convergent subsequence by Prokhorov compactness]
The product $X \times Y$ is Polish. Since $(\pi_k)_{k \in \mathbb{N}}$ is tight in $\mathcal{P}(X \times Y)$, Prokhorov's theorem applies and gives a subsequence $(\pi_{k_j})_{j \in \mathbb{N}}$ and a probability measure $\pi \in \mathcal{P}(X \times Y)$ such that
\begin{align*}
\pi_{k_j} \rightharpoonup \pi
\end{align*}
weakly in $\mathcal{P}(X \times Y)$ as $j \to \infty$.
(citing a result not yet in the wiki: Prokhorov's theorem)
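For contrast, here is a standard example (outside the proof, with $\delta_k$ denoting the Dirac mass at $k \in \mathbb{N}$ on $\mathbb{R}$) of what fails without tightness. For every compact $K \subset \mathbb{R}$ we have $\delta_k(\mathbb{R} \setminus K) = 1$ for all sufficiently large $k$, so $(\delta_k)_{k \in \mathbb{N}}$ is not tight. If $\varphi: \mathbb{R} \to \mathbb{R}$ is continuous with compact support, then
\begin{align*}
\lim_{k \to \infty} \int_{\mathbb{R}} \varphi(x)\, d\delta_k(x)
&=
\lim_{k \to \infty} \varphi(k)
=
0,
\end{align*}
so any weak limit of a subsequence would integrate every such $\varphi$ to $0$ and could not be a probability measure. The fixed marginals rule out this escape of mass for the plans $\pi_k$.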
[/step]
[step:Pass the marginal constraints to the weak limit]
Let
\begin{align*}
\operatorname{pr}_X: X \times Y &\to X
\end{align*}
denote the first coordinate projection, and let
\begin{align*}
\operatorname{pr}_Y: X \times Y &\to Y
\end{align*}
denote the second coordinate projection. Both projections are continuous.
Let $\varphi: X \to \mathbb{R}$ be a bounded [continuous function](/page/Continuous%20Function). Then $\varphi \circ \operatorname{pr}_X: X \times Y \to \mathbb{R}$ is bounded and continuous. By weak convergence,
\begin{align*}
\int_{X \times Y} \varphi(\operatorname{pr}_X(x,y))\, d\pi(x,y)
=
\lim_{j \to \infty}
\int_{X \times Y} \varphi(\operatorname{pr}_X(x,y))\, d\pi_{k_j}(x,y).
\end{align*}
Since $(\operatorname{pr}_X)_{\#}\pi_{k_j} = \mu$, the right-hand side equals
\begin{align*}
\lim_{j \to \infty} \int_X \varphi(x)\, d\mu(x)
=
\int_X \varphi(x)\, d\mu(x).
\end{align*}
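Since $\varphi$ was an arbitrary bounded continuous function, and bounded continuous functions determine a Borel probability measure on a metric space, it follows that $(\operatorname{pr}_X)_{\#}\pi = \mu$.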
The same argument with an arbitrary bounded continuous function $\psi: Y \to \mathbb{R}$ gives $(\operatorname{pr}_Y)_{\#}\pi = \nu$. Hence $\pi \in \Pi(\mu,\nu)$.
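For reference, the substitution used in this step is the change-of-variables formula for pushforward measures. Written out for a fixed $j$ and a bounded continuous $\varphi: X \to \mathbb{R}$, using only that $\pi_{k_j} \in \Pi(\mu,\nu)$:
\begin{align*}
\int_{X \times Y} \varphi(\operatorname{pr}_X(x,y))\, d\pi_{k_j}(x,y)
&=
\int_X \varphi(x)\, d\big((\operatorname{pr}_X)_{\#}\pi_{k_j}\big)(x)
=
\int_X \varphi(x)\, d\mu(x).
\end{align*}
The same identity with $\pi$ in place of $\pi_{k_j}$ rewrites the left-hand side of the limit as $\int_X \varphi\, d\big((\operatorname{pr}_X)_{\#}\pi\big)$, which is how the equality of integrals becomes a statement about the first marginal of $\pi$.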
[/step]
[step:Use lower semicontinuity of the cost to attain the infimum]
With $a \in \mathbb{R}$ the lower bound of $c$ from the first step, define the shifted cost
\begin{align*}
\tilde c: X \times Y &\to [0,\infty] \\
(x,y) &\mapsto c(x,y) - a.
\end{align*}
Since $c$ is lower semicontinuous and $a \in \mathbb{R}$ is constant, $\tilde c$ is lower semicontinuous and nonnegative. By the Portmanteau lower semicontinuity theorem applied to the weak convergence $\pi_{k_j} \rightharpoonup \pi$,
\begin{align*}
\int_{X \times Y} \tilde c(x,y)\, d\pi(x,y)
\leq
\liminf_{j \to \infty}
\int_{X \times Y} \tilde c(x,y)\, d\pi_{k_j}(x,y).
\end{align*}
Because $\pi$ and each $\pi_{k_j}$ are probability measures, the constant $a$ integrates to $a$ against each of them, so adding $a$ to both sides gives
\begin{align*}
\int_{X \times Y} c(x,y)\, d\pi(x,y)
\leq
\liminf_{j \to \infty}
\int_{X \times Y} c(x,y)\, d\pi_{k_j}(x,y).
\end{align*}
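The shift can be checked directly. For any probability measure $\gamma$ on $X \times Y$ (the symbol $\gamma$ is used only for this illustration and stands for $\pi$ or any $\pi_{k_j}$), linearity of the integral and $\gamma(X \times Y) = 1$ give
\begin{align*}
\int_{X \times Y} \tilde c(x,y)\, d\gamma(x,y)
&=
\int_{X \times Y} c(x,y)\, d\gamma(x,y) - a\, \gamma(X \times Y) \\
&=
\int_{X \times Y} c(x,y)\, d\gamma(x,y) - a,
\end{align*}
where both sides are allowed to equal $+\infty$. The same constant $a$ therefore appears on both sides of the lower semicontinuity inequality and cancels.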
Using the minimizing property of $(\pi_k)_{k \in \mathbb{N}}$,
\begin{align*}
\liminf_{j \to \infty}
\int_{X \times Y} c(x,y)\, d\pi_{k_j}(x,y)
\leq
\lim_{j \to \infty}
\left(V + \frac{1}{k_j}\right)
=
V.
\end{align*}
Thus
\begin{align*}
\int_{X \times Y} c(x,y)\, d\pi(x,y) \leq V.
\end{align*}
Since $\pi \in \Pi(\mu,\nu)$ and $V$ is the infimum over $\Pi(\mu,\nu)$, we also have
\begin{align*}
V \leq \int_{X \times Y} c(x,y)\, d\pi(x,y).
\end{align*}
Therefore
\begin{align*}
\int_{X \times Y} c(x,y)\, d\pi(x,y) = V,
\end{align*}
so $\pi$ attains the Kantorovich infimum.
(citing a result not yet in the wiki: Portmanteau lower semicontinuity theorem)
[/step]