[proofplan]
We prove the result for $M = \mathbb{R}$ using [distribution](/page/Distribution) [functions](/page/Function), then explain the adaptation to a general [metric space](/page/Metric%20Space). The argument for $\mathbb{R}$ has three stages. First, a diagonal extraction: the distribution functions $F_n$ take values in $[0,1]$, so by applying the [Bolzano-Weierstrass Theorem](/theorems/628) along an enumeration of $\mathbb{Q}$ and extracting a diagonal subsequence, we obtain a subsequence $F_{m_k}$ converging at every rational. Second, we extend the rational-point [limit](/page/Limit) to a right-continuous non-decreasing function $F$ on $\mathbb{R}$ and show $F_{m_k}(t) \to F(t)$ at every [continuity](/page/Continuity) point of $F$, via a squeeze argument using two characterisations of $F$: as a right infimum and via a left supremum identity. Third, the tightness hypothesis ensures $F$ has the correct [boundary](/page/Boundary) behaviour ($F(-\infty) = 0$, $F(+\infty) = 1$), so $F$ is a genuine distribution function, and convergence at continuity points gives [weak convergence](/page/Weak%20Convergence) by the [Portmanteau Theorem](/theorems/1171).
[/proofplan]
[step:Extract a subsequence converging at every rational by a diagonal argument]
For each $n \geq 1$, define the distribution function
\begin{align*}
F_n : \mathbb{R} &\to [0,1] \\
t &\mapsto \mu_n((-\infty, t]).
\end{align*}
Each $F_n$ is non-decreasing and right-continuous. Enumerate $\mathbb{Q} = \{q_1, q_2, q_3, \ldots\}$. The sequence $(F_n(q_1))_{n \geq 1}$ lies in $[0,1]$, so by the [Bolzano-Weierstrass Theorem](/theorems/628), there exists a subsequence $(n_k^{(1)})_{k \geq 1}$ such that $F_{n_k^{(1)}}(q_1)$ converges.
The sequence $(F_{n_k^{(1)}}(q_2))_{k \geq 1}$ again lies in $[0,1]$, so there is a further subsequence $(n_k^{(2)})_{k \geq 1}$ of $(n_k^{(1)})$ with $F_{n_k^{(2)}}(q_2)$ convergent. Continuing inductively, at stage $j$ we extract a subsequence $(n_k^{(j)})_k$ of $(n_k^{(j-1)})_k$ such that $F_{n_k^{(j)}}(q_j)$ converges.
The diagonal subsequence $m_k := n_k^{(k)}$ is eventually a subsequence of each $(n_k^{(j)})$, so $F_{m_k}(q)$ converges for every $q \in \mathbb{Q}$. Define $G : \mathbb{Q} \to [0,1]$ by $G(q) := \lim_{k \to \infty} F_{m_k}(q)$. Since each $F_{m_k}$ is non-decreasing, the pointwise limit $G$ is non-decreasing on $\mathbb{Q}$.
[guided]
The diagonal argument is a standard compactness technique in sequence spaces. The key structure exploited is: (i) $[0,1]$ is sequentially compact, and (ii) $\mathbb{Q}$ is countable.
At each stage $j$, we refine the subsequence to force convergence at one more rational point $q_j$. The crucial property of the diagonal subsequence $m_k = n_k^{(k)}$ is: for each fixed $j$, the tail $(m_k)_{k \geq j}$ is a subsequence of $(n_k^{(j)})_k$. Since $F_{n_k^{(j)}}(q_j)$ converges and $(m_k)_{k \geq j}$ is a subsequence, $F_{m_k}(q_j)$ converges as well. This holds for every $j$, so $F_{m_k}(q)$ converges for every $q \in \mathbb{Q}$.
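The two facts used about the diagonal indices can be checked directly. The following is a short verification, where $\phi_{j,k}$ is notation introduced only for this remark: for $k \geq j$ it denotes the strictly increasing reindexing map that exhibits $(n_i^{(k)})_i$ as a subsequence of $(n_i^{(j)})_i$.
\begin{align*}
n_i^{(k)} &= n_{\phi_{j,k}(i)}^{(j)} \quad \text{with } \phi_{j,k}(i) \geq i, \\
m_k = n_k^{(k)} &= n_{\phi_{j,k}(k)}^{(j)} \quad \text{for all } k \geq j, \\
m_{k+1} = n_{k+1}^{(k+1)} &\geq n_{k+1}^{(k)} > n_k^{(k)} = m_k.
\end{align*}
The second line shows that every $m_k$ with $k \geq j$ is a term of $(n_i^{(j)})_i$, and the third shows that $(m_k)$ is strictly increasing. Since $(n_i^{(j)})_i$ is itself strictly increasing, these terms appear in order, so the tail $(m_k)_{k \geq j}$ is a genuine subsequence of $(n_i^{(j)})_i$.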
This is the only place where the argument uses [sequential compactness](/page/Sequential%20Compactness) of $[0,1]$. The tightness hypothesis is not needed here — the diagonal extraction works for any sequence of distribution functions.
[/guided]
[/step]
[step:Extend to a right-continuous non-decreasing function on $\mathbb{R}$ via right limits]
Define
\begin{align*}
F : \mathbb{R} &\to [0,1] \\
x &\mapsto \inf_{q \in \mathbb{Q},\, q > x} G(q).
\end{align*}
The infimum exists because $G$ is bounded in $[0,1]$, and equals $\lim_{q \downarrow x,\, q \in \mathbb{Q}} G(q)$ since $G$ is non-decreasing. The function $F$ is non-decreasing: if $x_1 < x_2$ then $\{q \in \mathbb{Q} : q > x_2\} \subset \{q \in \mathbb{Q} : q > x_1\}$, so the infimum over the smaller [set](/page/Set) is at least the infimum over the larger. The function $F$ is right-continuous: for any decreasing sequence $\delta_j \downarrow 0$, we have $F(x + \delta_j) = \inf_{q > x + \delta_j} G(q) \downarrow \inf_{q > x} G(q) = F(x)$, since the sets $\{q > x + \delta_j\}$ increase to $\{q > x\}$.
[/step]
[step:Establish two key relations between $G$ and $F$]
We record two relations, an inequality and an identity, that will drive the convergence argument.
[claim:Monotonicity gives $G(s) \leq F(s)$ for every $s \in \mathbb{Q}$]
For every $s \in \mathbb{Q}$, we have $G(s) \leq F(s)$.
[/claim]
[proof]
Since $G$ is non-decreasing on $\mathbb{Q}$, every rational $q > s$ satisfies $G(q) \geq G(s)$. Taking the infimum over all such $q$:
\begin{align*}
F(s) = \inf_{q \in \mathbb{Q},\, q > s} G(q) \geq G(s).
\end{align*}
[/proof]
[claim:The left limit of $F$ equals the left supremum of $G$]
For every $x \in \mathbb{R}$, the left limit $F(x^-) := \lim_{y \uparrow x} F(y)$ satisfies
\begin{align*}
F(x^-) = \sup_{q \in \mathbb{Q},\, q < x} G(q).
\end{align*}
[/claim]
[proof]
Since $F$ is non-decreasing, the left limit $F(x^-) = \sup_{y < x} F(y)$ exists.
**Lower bound.** For any rational $s < x$, the preceding claim gives $G(s) \leq F(s)$. Since $s < x$, we have $F(s) \leq F(x^-)$, so $G(s) \leq F(x^-)$. Taking the supremum over all rational $s < x$ yields $\sup_{s < x} G(s) \leq F(x^-)$.
**Upper bound.** For any $y < x$, by definition $F(y) = \inf_{q > y} G(q)$. Choosing any rational $q$ with $y < q < x$ gives $F(y) \leq G(q) \leq \sup_{r < x} G(r)$. Taking the supremum over $y < x$: $F(x^-) \leq \sup_{r \in \mathbb{Q},\, r < x} G(r)$.
[/proof]
[/step]
[step:Show $F_{m_k}(t) \to F(t)$ at every continuity point of $F$ by a squeeze argument]
Let $t \in \mathbb{R}$ be a continuity point of $F$, meaning $F(t^-) = F(t)$. Fix $\varepsilon > 0$.
**Upper bound on $\limsup$.** By definition, $F(t) = \inf_{q > t,\, q \in \mathbb{Q}} G(q)$. Choose $s_2 \in \mathbb{Q}$ with $s_2 > t$ and $G(s_2) < F(t) + \varepsilon$. Since each $F_{m_k}$ is non-decreasing, $F_{m_k}(t) \leq F_{m_k}(s_2)$. Taking $k \to \infty$ and using $F_{m_k}(s_2) \to G(s_2)$ (since $s_2 \in \mathbb{Q}$):
\begin{align*}
\limsup_{k \to \infty} F_{m_k}(t) \leq G(s_2) < F(t) + \varepsilon.
\end{align*}
**Lower bound on $\liminf$.** By continuity at $t$, $F(t^-) = F(t)$. By the preceding claim, $F(t^-) = \sup_{q < t,\, q \in \mathbb{Q}} G(q)$. Since this supremum equals $F(t)$, there exists $s_1 \in \mathbb{Q}$ with $s_1 < t$ and $G(s_1) > F(t) - \varepsilon$. Since each $F_{m_k}$ is non-decreasing, $F_{m_k}(t) \geq F_{m_k}(s_1)$. Taking $k \to \infty$:
\begin{align*}
\liminf_{k \to \infty} F_{m_k}(t) \geq G(s_1) > F(t) - \varepsilon.
\end{align*}
**Conclusion.** Combining the two bounds: $F(t) - \varepsilon < \liminf_k F_{m_k}(t) \leq \limsup_k F_{m_k}(t) < F(t) + \varepsilon$. Since $\varepsilon > 0$ was arbitrary, $\lim_{k \to \infty} F_{m_k}(t) = F(t)$.
[guided]
The squeeze argument has a simple geometric structure. At a continuity point $t$, the function $F$ has no jump: $F(t^-) = F(t) = F(t^+)$. The right limit $F(t^+) = F(t)$ holds by right-continuity. The key content of continuity at $t$ is therefore $F(t^-) = F(t)$, which means the supremum of $G$ below $t$ equals the infimum of $G$ above $t$.
With this, both sides of the squeeze can be pushed arbitrarily close to $F(t)$:
- From above: $G(s_2) \downarrow F(t)$ as $s_2 \downarrow t$ through rationals (by definition of $F$ as the infimum).
- From below: $G(s_1) \uparrow F(t^-) = F(t)$ as $s_1 \uparrow t$ through rationals (by the left-limit identity).
At a discontinuity point, the gap $F(t) - F(t^-) > 0$ means the lower bound cannot reach $F(t)$: the best we can achieve is $\liminf_k F_{m_k}(t) \geq F(t^-)$ and $\limsup_k F_{m_k}(t) \leq F(t)$, which leaves the limit undetermined. This is precisely why the [Portmanteau Theorem](/theorems/1171) characterises weak convergence via convergence at continuity points of the distribution function, not at all points.
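A concrete pair of examples may help here (these sequences are illustrations chosen for this remark, not part of the proof). Take $\mu_n = \delta_{1/n}$ and $\nu_n = \delta_{-1/n}$, with distribution functions $F_n$ and $\tilde{F}_n$ respectively. Both sequences converge weakly to $\delta_0$, whose distribution function $F = \mathbf{1}_{[0,\infty)}$ has its only discontinuity at $t = 0$:
\begin{align*}
F_n(0) &= \mu_n((-\infty, 0]) = 0 \to 0 = F(0^-), \\
\tilde{F}_n(0) &= \nu_n((-\infty, 0]) = 1 \to 1 = F(0), \\
F_n(t),\ \tilde{F}_n(t) &\to F(t) \quad \text{for every } t \neq 0.
\end{align*}
So the same weak limit is compatible with either extreme value of the interval $[F(0^-), F(0)]$ at the jump.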
[/guided]
[/step]
[step:Use tightness to verify $F$ is a distribution function]
We must verify $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to +\infty} F(x) = 1$. By tightness, for every $\varepsilon > 0$ there exists $R_0 > 0$ with $\mu_{m_k}([-R_0, R_0]) \geq 1 - \varepsilon$ for all $k$. For every $R > R_0$ and every $k$, this gives $F_{m_k}(R) \geq \mu_{m_k}([-R_0, R_0]) \geq 1 - \varepsilon$ and $F_{m_k}(-R) \leq \mu_{m_k}((-\infty, -R_0)) \leq \varepsilon$. The strict inequality $R > R_0$ matters for the second bound: $F_{m_k}(-R_0) = \mu_{m_k}((-\infty, -R_0])$ includes any mass at the point $-R_0$, which belongs to $[-R_0, R_0]$.
Since $F$ has at most countably many discontinuities (it is monotone), we may choose $R > R_0$ so that both $R$ and $-R$ are continuity points of $F$. Passing $k \to \infty$ using the convergence established in the preceding step:
\begin{align*}
F(R) \geq 1 - \varepsilon \quad \text{and} \quad F(-R) \leq \varepsilon.
\end{align*}
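Since $F$ is non-decreasing with values in $[0,1]$, it follows that $\lim_{x \to +\infty} F(x) \geq F(R) \geq 1 - \varepsilon$ and $\lim_{x \to -\infty} F(x) \leq F(-R) \leq \varepsilon$. As $\varepsilon > 0$ was arbitrary, $F(-\infty) = 0$ and $F(+\infty) = 1$.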
[guided]
This is the step where tightness is essential. Without it, mass could escape to $\pm \infty$: for example, $\mu_n = \delta_n$ (the Dirac mass at $n$) has $F_n(x) \to 0$ for every $x \in \mathbb{R}$, so $F \equiv 0$, which is not a distribution function. The tightness condition prevents this by requiring that a uniform proportion of the mass remains in a fixed compact set, independent of $k$.
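A variant in which only part of the mass escapes shows the same failure (an illustrative example chosen for this remark, not taken from the theorem). Let $\mu_n = \tfrac12 \delta_0 + \tfrac12 \delta_n$. Then for each fixed $x \in \mathbb{R}$ and each $R > 0$:
\begin{align*}
F_n(x) &= \tfrac12 \mathbf{1}_{[0,\infty)}(x) + \tfrac12 \mathbf{1}_{[n,\infty)}(x) \to \tfrac12 \mathbf{1}_{[0,\infty)}(x) =: F(x), \\
\lim_{x \to +\infty} F(x) &= \tfrac12 \neq 1, \\
\mu_n([-R, R]) &= \tfrac12 \quad \text{for all } n > R.
\end{align*}
The last line says that no bounded interval captures more than half of the mass uniformly in $n$, so the sequence is not tight. Correspondingly, the limit $F$ is the distribution function of a measure of total mass $\tfrac12$ rather than of a probability measure.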
The technical point about choosing $R$ at a continuity point is minor but necessary: the convergence $F_{m_k}(R) \to F(R)$ was established only at continuity points. A monotone function on $\mathbb{R}$ has at most countably many discontinuities (at each discontinuity $x$ the open interval $(F(x^-), F(x^+))$ is non-empty and contains a rational, and these intervals are pairwise disjoint), so continuity points are dense.
[/guided]
[/step]
[step:Conclude weak convergence on $\mathbb{R}$ from convergence of distribution functions]
By the [Caratheodory Extension Theorem](/theorems/522), the right-continuous non-decreasing function $F : \mathbb{R} \to [0,1]$ with $F(-\infty) = 0$ and $F(+\infty) = 1$ determines a unique Borel probability measure $\mu$ on $\mathbb{R}$ via $\mu((a,b]) = F(b) - F(a)$.
The convergence $F_{m_k}(t) \to F(t)$ at every continuity point $t$ of $F$ is equivalent to $\mu_{m_k} \Rightarrow \mu$ by the [Portmanteau Theorem](/theorems/1171) characterisation (iv): the set $(-\infty, t]$ has $\mu$-boundary $\{t\}$, and $\mu(\{t\}) = 0$ precisely when $t$ is a continuity point of $F$.
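The identification of atoms with jumps used here follows from continuity of measure from above. As a brief check, using only the defining relation $\mu((a,b]) = F(b) - F(a)$:
\begin{align*}
\mu(\{t\}) &= \mu\Bigl(\bigcap_{n \geq 1} (t - \tfrac1n, t]\Bigr) = \lim_{n \to \infty} \mu\bigl((t - \tfrac1n, t]\bigr) \\
&= \lim_{n \to \infty} \bigl(F(t) - F(t - \tfrac1n)\bigr) = F(t) - F(t^-).
\end{align*}
Since $F$ is right-continuous, $\mu(\{t\}) = 0$ holds exactly when $F$ is continuous at $t$.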
[/step]
[step:Adapt the argument to a general metric space]
The distribution function approach is specific to $\mathbb{R}$, but the underlying strategy — diagonal extraction on a countable separating family, followed by identification of the limit via tightness — extends to any metric space $(M, d)$.
**Reduction to a [separable](/page/Separable) subspace.** Tightness provides, for each $j \geq 1$, a compact set $K_j \subset M$ with $\sup_n \mu_n(M \setminus K_j) \leq 1/j$. The set $S := \bigcup_{j=1}^\infty K_j$ is separable (a countable union of compact metric spaces is separable), and every $\mu_n$ satisfies $\mu_n(S) = 1$. We may therefore replace $M$ by $\overline{S}$ and assume without loss of generality that $M$ is separable.
**Constructing a countable convergence-determining family.** Let $\{x_1, x_2, \ldots\}$ be a countable [dense subset](/page/Dense%20Subset) of $M$. For each $i \geq 1$ and each positive rational $r \in \mathbb{Q}_{> 0}$, define
\begin{align*}
f_{i,r} : M &\to [0,1] \\
x &\mapsto \max\bigl(1 - d(x, x_i)/r,\, 0\bigr).
\end{align*}
Each $f_{i,r}$ is Lipschitz continuous (with constant $1/r$) and supported on the closed ball $\overline{B}(x_i, r)$. The countable collection $\mathcal{F} := \{f_{i,r} : i \geq 1,\, r \in \mathbb{Q}_{> 0}\}$ separates Borel probability measures on $M$: if $\int_M f_{i,r} \, d\nu_1 = \int_M f_{i,r} \, d\nu_2$ for all $f_{i,r} \in \mathcal{F}$, then $\nu_1(G) = \nu_2(G)$ for every open $G \subset M$ (since every [open set](/page/Open%20Set) is a countable union of balls centred at the $x_i$), so $\nu_1 = \nu_2$.
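The Lipschitz bound can be verified in two lines. The sketch below uses the auxiliary function $h(u) := \max(1 - u, 0)$ (a name introduced here for convenience), which is $1$-Lipschitz on $[0, \infty)$, together with the reverse triangle inequality:
\begin{align*}
|f_{i,r}(x) - f_{i,r}(y)| &= \bigl| h(d(x, x_i)/r) - h(d(y, x_i)/r) \bigr| \\
&\leq \frac{|d(x, x_i) - d(y, x_i)|}{r} \leq \frac{d(x, y)}{r}.
\end{align*}
Moreover, $f_{i,r}(x) > 0$ exactly when $d(x, x_i) < r$, which gives the support statement, and $f_{i,r}(x_i) = 1$.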
**Diagonal extraction.** Since each $\int_M f_{i,r} \, d\mu_n$ lies in $[0,1]$, the same diagonal argument as in the first step produces a subsequence $(m_k)$ such that $\lim_{k \to \infty} \int_M f_{i,r} \, d\mu_{m_k}$ exists for every $f_{i,r} \in \mathcal{F}$.
**Identifying the limit measure.** The functional $\Lambda(f_{i,r}) := \lim_k \int_M f_{i,r} \, d\mu_{m_k}$ extends by linearity and density to a positive linear functional on $C_b(M)$ with $\Lambda(1) = 1$. By the [Riesz Representation Theorem](/theorems/218) on locally compact or Polish spaces, $\Lambda$ is represented by a Borel probability measure $\mu$. The tightness of the original sequence ensures that $\mu$ does not lose mass (the same role it plays in the distribution function argument on $\mathbb{R}$): for every $\varepsilon > 0$, there exists compact $K$ with $\mu(K) \geq \limsup_k \mu_{m_k}(K) \geq 1 - \varepsilon$ by Portmanteau applied to the [closed set](/page/Closed%20Set) $K$. Therefore $\mu$ is a probability measure, and $\mu_{m_k} \Rightarrow \mu$.
[guided]
The key conceptual point is that the distribution function on $\mathbb{R}$ serves as a "countable skeleton" for the measure: knowing $F$ at all rationals determines the measure. In a general metric space, there is no canonical ordering, so there are no distribution functions. The replacement is the countable family $\mathcal{F} = \{f_{i,r}\}$, which plays the same role: knowing $\int f_{i,r} \, d\mu$ for all $f_{i,r}$ determines $\mu$.
The diagonal extraction is identical in structure: we enumerate $\mathcal{F}$ and refine subsequences one function at a time. The main difference from the $\mathbb{R}$ case is in the identification step: on $\mathbb{R}$, the [Caratheodory Extension Theorem](/theorems/522) constructs the limit measure from the distribution function, while in the general case, the [Riesz Representation Theorem](/theorems/221) constructs it from the limiting linear functional.
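Tightness serves the same purpose in both settings: it prevents mass from escaping. On $\mathbb{R}$, this means $F(+\infty) = 1$ and $F(-\infty) = 0$. In a general metric space, the identity $\Lambda(1) = 1$ holds automatically because each $\mu_{m_k}$ is a probability measure; what tightness adds is that the representing measure $\mu$ satisfies $\mu(K) \geq 1 - \varepsilon$ for a suitable compact $K$, so it carries total mass $1$.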
[/guided]
[/step]