[proofplan]
We first reinterpret the Stieltjes transform as integration against resolvent functions $t \mapsto (t-z)^{-1}$. Pointwise convergence of these transforms, together with local boundedness and [Cauchy's integral formula](/page/Cauchy's%20Integral%20Formula), gives convergence of integrals against all rational functions that have their poles off $\mathbb{R}$ and are bounded on $\mathbb{R}$. The [Stone-Weierstrass theorem](/theorems/886) on the one-point compactification of $\mathbb{R}$ then upgrades this to convergence against every compactly supported continuous [test function](/page/Test%20Function), that is, to vague convergence. Finally, vague convergence of probability measures to a probability measure implies tightness of the sequence and hence convergence against every bounded [continuous function](/page/Continuous%20Function), which is convergence in distribution.
[/proofplan]
[step:Convert Stieltjes convergence into convergence of resolvent integrals]
For $z \in \mathbb{C}\setminus\mathbb{R}$, define the resolvent test function $r_z: \mathbb{R} \to \mathbb{C}$ by $r_z(t)=(t-z)^{-1}$. Since $|r_z(t)| \leq |\operatorname{Im} z|^{-1}$ for every $t \in \mathbb{R}$, the function $r_z$ is bounded and Borel measurable. For a Borel probability measure $\nu$ on $\mathbb{R}$, the Stieltjes transform is
\begin{align*}
m_\nu(z)=\int_{\mathbb{R}} r_z(t)\,d\nu(t).
\end{align*}
Thus the hypothesis says that, for every $z \in \mathbb{C}\setminus\mathbb{R}$,
\begin{align*}
\int_{\mathbb{R}} r_z(t)\,d\mu_n(t) \to \int_{\mathbb{R}} r_z(t)\,d\mu(t).
\end{align*}
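As a quick check of the notation, consider a point mass. The measure $\delta_a$ with $a\in\mathbb{R}$ is introduced here only for illustration and is not part of the hypotheses. Its Stieltjes transform, and the bound from this step, read
\begin{align*}
m_{\delta_a}(z)=\int_{\mathbb{R}} r_z(t)\,d\delta_a(t)=\frac{1}{a-z},
\qquad
|m_{\delta_a}(z)|=\frac{1}{|a-z|}\leq\frac{1}{|\operatorname{Im} z|}.
\end{align*}
In particular, if $a_n\to a$ in $\mathbb{R}$, then $m_{\delta_{a_n}}(z)\to m_{\delta_a}(z)$ for every $z\in\mathbb{C}\setminus\mathbb{R}$, which is the hypothesis in its simplest instance.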
[/step]
[step:Differentiate the Stieltjes transforms to obtain convergence for higher poles]
Fix $z_0 \in \mathbb{C}\setminus\mathbb{R}$ and choose $\rho>0$ such that the closed disc $\overline{B}(z_0,\rho)$ is contained in $\mathbb{C}\setminus\mathbb{R}$. For $z\in\mathbb{C}$, write
\begin{align*}
\operatorname{dist}(z,\mathbb{R})=\inf_{t\in\mathbb{R}}|z-t|.
\end{align*}
For every Borel probability measure $\nu$ on $\mathbb{R}$, the map $m_\nu: \mathbb{C}\setminus\mathbb{R}\to\mathbb{C}$ is holomorphic because the integrand $z\mapsto (t-z)^{-1}$ is holomorphic and locally dominated by a constant depending only on this distance to $\mathbb{R}$. Moreover $|m_{\mu_n}(z)|\leq \operatorname{dist}(z,\mathbb{R})^{-1}$ on this disc, so the family is locally bounded.
By the [Cauchy integral formula](/page/Cauchy%20Integral%20Formula), applied on the circle $\partial B(z_0,\rho)$ with positive orientation, each derivative $m_{\mu_n}^{(j)}(z_0)$ is represented by a contour integral involving $m_{\mu_n}$ over that fixed circle. The functions $m_{\mu_n}$ converge pointwise to $m_\mu$ on the circle, and the bound $|m_{\mu_n}(z)|\leq \operatorname{dist}(z,\mathbb{R})^{-1}$ is uniform in $n$ and bounded on the compact circle $\partial B(z_0,\rho)\subset\mathbb{C}\setminus\mathbb{R}$. The [Dominated Convergence Theorem](/page/Dominated%20Convergence%20Theorem) with respect to arclength measure on $\partial B(z_0,\rho)$ therefore applies, so $m_{\mu_n}^{(j)}(z_0)\to m_\mu^{(j)}(z_0)$ for every $j\geq 0$. Hence, for every integer $k\geq 1$,
\begin{align*}
\int_{\mathbb{R}} \frac{1}{(t-z_0)^k}\,d\mu_n(t) \to \int_{\mathbb{R}} \frac{1}{(t-z_0)^k}\,d\mu(t).
\end{align*}
Here we used the identity $m_\nu^{(k-1)}(z_0)=(k-1)!\int_{\mathbb{R}}(t-z_0)^{-k}\,d\nu(t)$, obtained by differentiating under the integral sign with the same local domination.
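For concreteness, the contour representation used above can be written out. The integration variable $\zeta$ and the index $j=k-1$ are notation chosen for this sketch. For every Borel probability measure $\nu$ on $\mathbb{R}$,
\begin{align*}
m_\nu^{(j)}(z_0)=\frac{j!}{2\pi i}\oint_{\partial B(z_0,\rho)}\frac{m_\nu(\zeta)}{(\zeta-z_0)^{j+1}}\,d\zeta,
\end{align*}
and, since $|\zeta-z_0|=\rho$ on the circle,
\begin{align*}
\left|m_{\mu_n}^{(j)}(z_0)-m_\mu^{(j)}(z_0)\right|\leq\frac{j!}{2\pi\rho^{j+1}}\int_{\partial B(z_0,\rho)}\left|m_{\mu_n}(\zeta)-m_\mu(\zeta)\right|\,|d\zeta|.
\end{align*}
The integrand on the right tends to $0$ pointwise and is at most $2\,(|\operatorname{Im} z_0|-\rho)^{-1}$, which supplies the domination. In the case $k=2$, that is $j=1$, the identity reads $\int_{\mathbb{R}}(t-z_0)^{-2}\,d\nu(t)=m_\nu'(z_0)$.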
[guided]
The hypothesis gives convergence only for first powers of the resolvent, but rational approximation will also require higher powers such as $(t-z)^{-k}$. We obtain these powers by differentiating the analytic functions $m_{\mu_n}$.
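Fix $z_0 \in \mathbb{C}\setminus\mathbb{R}$ and choose $\rho>0$ so small that $\overline{B}(z_0,\rho)\subset \mathbb{C}\setminus\mathbb{R}$. On this closed disc the distance to the real axis is positive, so there is a constant $C_{z_0,\rho}>0$ such that $|(t-z)^{-1}|\leq C_{z_0,\rho}$ for all $t\in\mathbb{R}$ and all $z\in\overline{B}(z_0,\rho)$; for instance $C_{z_0,\rho}=(|\operatorname{Im} z_0|-\rho)^{-1}$ works, because $|t-z|\geq|\operatorname{Im} z|\geq|\operatorname{Im} z_0|-\rho>0$. The same constant bounds the $z$-derivative $(t-z)^{-2}$ by $C_{z_0,\rho}^2$, which gives holomorphy of $m_\nu$ by differentiation under the integral sign, and it gives the uniform local bound $|m_{\mu_n}(z)|\leq C_{z_0,\rho}$.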
The [Cauchy integral formula](/page/Cauchy%20Integral%20Formula) expresses each derivative $m_{\mu_n}^{(j)}(z_0)$ as a contour integral involving $m_{\mu_n}$ over the fixed circle $\partial B(z_0,\rho)$. The integrands converge pointwise on that circle and are dominated by the uniform bound $C_{z_0,\rho}$, so $m_{\mu_n}^{(j)}(z_0)\to m_\mu^{(j)}(z_0)$ for every $j\geq 0$. Since differentiating under the integral sign gives $m_\nu^{(k-1)}(z_0)=(k-1)!\int_{\mathbb{R}}(t-z_0)^{-k}\,d\nu(t)$, we conclude that, for every $k\geq 1$,
\begin{align*}
\int_{\mathbb{R}} \frac{1}{(t-z_0)^k}\,d\mu_n(t) \to \int_{\mathbb{R}} \frac{1}{(t-z_0)^k}\,d\mu(t).
\end{align*}
This is the needed upgrade from resolvents to all principal parts with poles off the real axis.
[/guided]
[/step]
[step:Approximate compactly supported continuous functions by rational resolvents]
Let $\widehat{\mathbb{R}}=\mathbb{R}\cup\{\infty\}$ denote the one-point compactification of $\mathbb{R}$. Let $\mathcal{A}$ be the complex algebra on $\widehat{\mathbb{R}}$ generated by the constant functions and by the continuous extensions of $r_z$ for $z\in\mathbb{C}\setminus\mathbb{R}$, where each extension has value $0$ at $\infty$. The algebra $\mathcal{A}$ contains constants, is closed under complex conjugation because $\overline{r_z}=r_{\bar z}$ on $\mathbb{R}$, and separates points of $\widehat{\mathbb{R}}$: the function $r_i$ separates $\infty$ from every real point, and it is injective on $\mathbb{R}$.
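By the complex [Stone-Weierstrass Theorem](/page/Stone-Weierstrass%20Theorem), $\mathcal{A}$ is uniformly dense in $C(\widehat{\mathbb{R}};\mathbb{C})$. Integrals of constants converge because $\mu_n(\mathbb{R})=\mu(\mathbb{R})=1$, and products of generators reduce by partial fractions to finite linear combinations of the already treated terms $(t-z)^{-k}$. Hence $\int_{\mathbb{R}} a\,d\mu_n\to\int_{\mathbb{R}} a\,d\mu$ for every $a\in\mathcal{A}$. Now let $f\in C_c(\mathbb{R};\mathbb{C})$, extended continuously to $\widehat{\mathbb{R}}$ by $f(\infty)=0$, and let $\varepsilon>0$. Choose $a\in\mathcal{A}$ with $\sup_{t\in\mathbb{R}}|f(t)-a(t)|<\varepsilon$. Because all the measures are probability measures,
\begin{align*}
\left|\int_{\mathbb{R}} f\,d\mu_n-\int_{\mathbb{R}} f\,d\mu\right|\leq 2\varepsilon+\left|\int_{\mathbb{R}} a\,d\mu_n-\int_{\mathbb{R}} a\,d\mu\right|,
\end{align*}
and the last term tends to $0$. Since $\varepsilon>0$ was arbitrary, it follows that
\begin{align*}
\int_{\mathbb{R}} f(t)\,d\mu_n(t) \to \int_{\mathbb{R}} f(t)\,d\mu(t)
\end{align*}
for every $f\in C_c(\mathbb{R};\mathbb{C})$.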
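The partial fraction reduction invoked in this step rests on a single identity, sketched here for two distinct poles $z\neq w$ in $\mathbb{C}\setminus\mathbb{R}$ (the symbol $w$ and the exponents $a,b$ below are notation introduced for this illustration):
\begin{align*}
r_z(t)\,r_w(t)=\frac{1}{(t-z)(t-w)}=\frac{1}{z-w}\left(\frac{1}{t-z}-\frac{1}{t-w}\right)=\frac{r_z(t)-r_w(t)}{z-w}.
\end{align*}
Applying it repeatedly to a mixed product $r_z^{a}r_w^{b}$ with $a,b\geq 1$ lowers the total degree $a+b$ by one at each stage, until only pure powers $(t-z)^{-k}$ and $(t-w)^{-k}$ remain. When $z=w$ no reduction is needed, since the product is already the pure power $(t-z)^{-2}$.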
[/step]
[step:Upgrade vague convergence to weak convergence using tightness]
The preceding step gives convergence against compactly supported continuous functions. We first prove that the family $(\mu_n)_{n\geq 1}$ is tight. Let $\varepsilon>0$. Since $\mu$ is a probability measure on $\mathbb{R}$, choose $R>0$ such that $\mu([-R,R])>1-\varepsilon$. Choose a continuous cutoff function $\phi_R:\mathbb{R}\to[0,1]$ such that $\phi_R(t)=1$ for $t\in[-R,R]$ and $\phi_R(t)=0$ for $t\notin[-R-1,R+1]$. Then $\phi_R\in C_c(\mathbb{R};\mathbb{C})$, so
\begin{align*}
\int_{\mathbb{R}} \phi_R(t)\,d\mu_n(t) \to \int_{\mathbb{R}} \phi_R(t)\,d\mu(t) \geq \mu([-R,R])>1-\varepsilon.
\end{align*}
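Because $0\leq \phi_R\leq \mathbb{1}_{[-R-1,R+1]}$, we have $\mu_n([-R-1,R+1])\geq\int_{\mathbb{R}}\phi_R\,d\mu_n$, and since the limit above exceeds $1-\varepsilon$, it follows that $\mu_n([-R-1,R+1])>1-2\varepsilon$ for all sufficiently large $n$, say $n\geq N$. Each of the finitely many remaining measures $\mu_1,\dots,\mu_{N-1}$ is a single Borel probability measure on $\mathbb{R}$, so each assigns mass greater than $1-2\varepsilon$ to some compact interval. Taking $K_\varepsilon\subset\mathbb{R}$ to be a compact interval containing $[-R-1,R+1]$ and these finitely many intervals, we obtain $\mu_n(\mathbb{R}\setminus K_\varepsilon)<2\varepsilon$ for every $n\geq 1$. Hence $(\mu_n)_{n\geq 1}$ is tight.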
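One explicit cutoff with the properties required of $\phi_R$ is the piecewise linear function below. It is given only as an illustration, since the argument uses nothing beyond those properties:
\begin{align*}
\phi_R(t)=\max\bigl(0,\,\min(1,\,R+1-|t|)\bigr).
\end{align*}
This function is continuous, equals $1$ for $|t|\leq R$, equals $R+1-|t|\in[0,1]$ for $R\leq|t|\leq R+1$, and vanishes for $|t|\geq R+1$.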
Now let $g\in C_b(\mathbb{R};\mathbb{C})$ and let $\delta>0$. By tightness of $(\mu_n)_{n\geq 1}$ and because $\mu$ is a probability measure, choose $S>0$ such that
\begin{align*}
\sup_{n\geq 1}\mu_n(\mathbb{R}\setminus[-S,S])<\delta
\end{align*}
and
\begin{align*}
\mu(\mathbb{R}\setminus[-S,S])<\delta.
\end{align*}
Choose a continuous cutoff function $\psi_S:\mathbb{R}\to[0,1]$ such that $\psi_S(t)=1$ for $t\in[-S,S]$ and $\psi_S(t)=0$ for $t\notin[-S-1,S+1]$. Then $g\psi_S\in C_c(\mathbb{R};\mathbb{C})$, so its integrals converge. Since $1-\psi_S$ vanishes on $[-S,S]$ and $0\leq 1-\psi_S\leq 1$, we have
\begin{align*}
\left|\int_{\mathbb{R}} g(t)(1-\psi_S(t))\,d\mu_n(t)\right|\leq \|g\|_\infty\mu_n(\mathbb{R}\setminus[-S,S])\leq\|g\|_\infty\delta
\end{align*}
for every $n$, and the same estimate with $\mu$ in place of $\mu_n$. Therefore
\begin{align*}
\limsup_{n\to\infty}\left|\int_{\mathbb{R}} g(t)\,d\mu_n(t)-\int_{\mathbb{R}} g(t)\,d\mu(t)\right|\leq 2\|g\|_\infty\delta.
\end{align*}
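The bound just stated comes from splitting $g=g\psi_S+g(1-\psi_S)$ and applying the triangle inequality, written out here as a sketch:
\begin{align*}
\left|\int_{\mathbb{R}} g\,d\mu_n-\int_{\mathbb{R}} g\,d\mu\right|
&\leq\left|\int_{\mathbb{R}} g\psi_S\,d\mu_n-\int_{\mathbb{R}} g\psi_S\,d\mu\right|
+\left|\int_{\mathbb{R}} g(1-\psi_S)\,d\mu_n\right|
+\left|\int_{\mathbb{R}} g(1-\psi_S)\,d\mu\right|\\
&\leq\left|\int_{\mathbb{R}} g\psi_S\,d\mu_n-\int_{\mathbb{R}} g\psi_S\,d\mu\right|+2\|g\|_\infty\delta.
\end{align*}
The first term on the last line tends to $0$ as $n\to\infty$ because $g\psi_S\in C_c(\mathbb{R};\mathbb{C})$, which leaves $2\|g\|_\infty\delta$ in the limit superior.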
Letting $\delta\downarrow0$ gives
\begin{align*}
\int_{\mathbb{R}} g(t)\,d\mu_n(t) \to \int_{\mathbb{R}} g(t)\,d\mu(t).
\end{align*}
By the bounded-continuous-test-function characterization of convergence in distribution on $\mathbb{R}$, this is exactly $\mu_n\xrightarrow{d}\mu$.
[/step]