[proofplan]
We prove the three implications separately. Almost sure convergence is converted into convergence in probability by applying dominated convergence to the indicators of the deviation events. $L^p$ convergence implies convergence in probability by Markov's inequality applied to $|X_n-X|^p$. Finally, convergence in probability implies convergence in distribution by comparing the distribution functions of $X_n$ and $X$ at continuity points of the distribution function of $X$.
[/proofplan]
[step:Declare the ambient probability space and the convergence notions]
Let $\mathbb N:=\{1,2,3,\ldots\}$. Let $(\Omega,\mathcal F,\mathbb P)$ be the probability space, let $\mathcal B(\mathbb R)$ denote the Borel $\sigma$-algebra on $\mathbb R$, and let
\begin{align*}
X_n:(\Omega,\mathcal F)&\to(\mathbb R,\mathcal B(\mathbb R)),\qquad n\in\mathbb N,\\
X:(\Omega,\mathcal F)&\to(\mathbb R,\mathcal B(\mathbb R))
\end{align*}
be the real-valued random variables in the theorem.
Convergence in probability $X_n\xrightarrow{\mathbb P}X$ means that, for every real number $\varepsilon>0$,
\begin{align*}
\lim_{n\to\infty}\mathbb P(\{\omega\in\Omega:|X_n(\omega)-X(\omega)|>\varepsilon\})=0.
\end{align*}
Almost sure convergence $X_n\xrightarrow{a.s.}X$ means that there exists an event $\Omega_0\in\mathcal F$ with $\mathbb P(\Omega_0)=1$ such that
\begin{align*}
\lim_{n\to\infty}X_n(\omega)=X(\omega)
\end{align*}
for every $\omega\in\Omega_0$. For a real number $p\ge1$, convergence $X_n\xrightarrow{L^p}X$ means that $|X_n-X|^p$ is integrable for each $n\in\mathbb N$ and
\begin{align*}
\lim_{n\to\infty}\mathbb E[|X_n-X|^p]
=
\lim_{n\to\infty}\int_\Omega |X_n(\omega)-X(\omega)|^p\,d\mathbb P(\omega)
=0.
\end{align*}
For a real-valued random variable $Y:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$, define its distribution function
\begin{align*}
F_Y:\mathbb R&\to[0,1]\\
t&\mapsto \mathbb P(\{\omega\in\Omega:Y(\omega)\le t\}).
\end{align*}
Convergence in distribution $X_n\xrightarrow{d}X$ means that
\begin{align*}
\lim_{n\to\infty}F_{X_n}(t)=F_X(t)
\end{align*}
for every real number $t\in\mathbb R$ at which $F_X$ is continuous.
[/step]
[step:Convert almost sure convergence into convergence in probability]
Assume $X_n\xrightarrow{a.s.}X$. Fix a real number $\varepsilon>0$. For each $n\in\mathbb N$, define the deviation event
\begin{align*}
A_n^\varepsilon:=\{\omega\in\Omega:|X_n(\omega)-X(\omega)|>\varepsilon\}\in\mathcal F
\end{align*}
and define the indicator random variable
\begin{align*}
\mathbb 1_{A_n^\varepsilon}:\Omega&\to\{0,1\}\\
\omega&\mapsto
\begin{cases}
1,&\omega\in A_n^\varepsilon,\\
0,&\omega\notin A_n^\varepsilon.
\end{cases}
\end{align*}
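Since $X_n\to X$ almost surely, there is an event $\Omega_0\in\mathcal F$ with $\mathbb P(\Omega_0)=1$ on which $X_n(\omega)\to X(\omega)$. For each $\omega\in\Omega_0$ there exists $N(\omega)\in\mathbb N$ such that $|X_n(\omega)-X(\omega)|\le\varepsilon$ for all $n\ge N(\omega)$, so $\mathbb 1_{A_n^\varepsilon}(\omega)=0$ for all $n\ge N(\omega)$. Hence $\mathbb 1_{A_n^\varepsilon}(\omega)\to0$ for $\mathbb P$-almost every $\omega\in\Omega$. Also $0\le \mathbb 1_{A_n^\varepsilon}\le1$ on $\Omega$, and the constant function $1:\Omega\to\mathbb R$ is integrable because $\mathbb P(\Omega)=1$. By the [Dominated Convergence Theorem](/theorems/???),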
\begin{align*}
\lim_{n\to\infty}\mathbb P(A_n^\varepsilon)
=
\lim_{n\to\infty}\int_\Omega \mathbb 1_{A_n^\varepsilon}(\omega)\,d\mathbb P(\omega)
=
\int_\Omega 0\,d\mathbb P(\omega)
=
0.
\end{align*}
Because this holds for every $\varepsilon>0$, $X_n\xrightarrow{\mathbb P}X$.
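As an illustrative sketch (the concrete space and random variables here are assumptions chosen for this example, not part of the theorem), take $\Omega=[0,1]$ with its Borel $\sigma$-algebra and Lebesgue measure, and set $X_n:=\mathbb 1_{[0,1/n]}$ and $X:=0$. For every $\omega\in(0,1]$ we have $X_n(\omega)=0$ as soon as $n>1/\omega$, so $X_n\to X$ on $\Omega_0:=(0,1]$, which has probability $1$. At $\omega=0$ the sequence is constantly $1$, which shows that the exceptional null set may be nonempty. The deviation events and their probabilities are
\begin{align*}
A_n^\varepsilon
=
\begin{cases}
[0,1/n],&0<\varepsilon<1,\\
\emptyset,&\varepsilon\ge1,
\end{cases}
\qquad
\mathbb P(A_n^\varepsilon)\le\frac1n\xrightarrow[n\to\infty]{}0,
\end{align*}
in agreement with the limit obtained from dominated convergence.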
[/step]
[step:Use Markov's inequality to pass from $L^p$ convergence to convergence in probability]
Assume $X_n\xrightarrow{L^p}X$ for some real number $p\ge1$. Fix a real number $\varepsilon>0$. For each $n\in\mathbb N$, define
\begin{align*}
Y_n:\Omega&\to[0,\infty)\\
\omega&\mapsto |X_n(\omega)-X(\omega)|^p.
\end{align*}
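The random variable $Y_n$ is nonnegative and integrable by the definition of $L^p$ convergence. Since $x\mapsto x^p$ is strictly increasing on $[0,\infty)$, the events $\{|X_n-X|>\varepsilon\}$ and $\{Y_n>\varepsilon^p\}$ coincide. Applying [Markov's Inequality](/theorems/???) to $Y_n$ with threshold $\varepsilon^p>0$ gives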
\begin{align*}
\mathbb P(\{\omega\in\Omega:|X_n(\omega)-X(\omega)|>\varepsilon\})
&=
\mathbb P(\{\omega\in\Omega:Y_n(\omega)>\varepsilon^p\})\\
&\le
\frac{\mathbb E[Y_n]}{\varepsilon^p}\\
&=
\frac{1}{\varepsilon^p}\int_\Omega |X_n(\omega)-X(\omega)|^p\,d\mathbb P(\omega).
\end{align*}
The right-hand side tends to $0$ as $n\to\infty$ because $X_n\xrightarrow{L^p}X$, and probabilities are nonnegative, so
\begin{align*}
\lim_{n\to\infty}\mathbb P(\{\omega\in\Omega:|X_n(\omega)-X(\omega)|>\varepsilon\})=0.
\end{align*}
Because this holds for every $\varepsilon>0$, $X_n\xrightarrow{\mathbb P}X$.
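As a hedged worked instance (the random variable $Z$ is hypothetical and introduced only for illustration), suppose $X_n=X+Z/n$, where $Z$ is a real-valued random variable on $(\Omega,\mathcal F,\mathbb P)$ with $\mathbb E[|Z|^p]<\infty$. Then $|X_n-X|^p=|Z|^p/n^p$, so $\mathbb E[|X_n-X|^p]=\mathbb E[|Z|^p]/n^p\to0$, and the Markov bound reads
\begin{align*}
\mathbb P(\{\omega\in\Omega:|X_n(\omega)-X(\omega)|>\varepsilon\})
\le
\frac{\mathbb E[|Z|^p]}{n^p\varepsilon^p}
\xrightarrow[n\to\infty]{}0.
\end{align*}
In this instance the inequality gives not only convergence in probability but also an explicit decay rate of order $n^{-p}$ for each fixed $\varepsilon>0$.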
[/step]
[step:Compare distribution functions at continuity points]
Assume $X_n\xrightarrow{\mathbb P}X$. Let $t\in\mathbb R$ be a point at which the distribution function $F_X:\mathbb R\to[0,1]$ is continuous. Fix a real number $\varepsilon>0$.
For each $n\in\mathbb N$, the event $\{X_n\le t\}$ is contained in the union of the events $\{X\le t+\varepsilon\}$ and $\{|X_n-X|>\varepsilon\}$, because if $X_n(\omega)\le t$ and $|X_n(\omega)-X(\omega)|\le\varepsilon$, then $X(\omega)\le t+\varepsilon$. Hence
\begin{align*}
F_{X_n}(t)
=
\mathbb P(\{X_n\le t\})
\le
F_X(t+\varepsilon)+\mathbb P(\{|X_n-X|>\varepsilon\}).
\end{align*}
Taking the limit superior and using convergence in probability gives
\begin{align*}
\limsup_{n\to\infty}F_{X_n}(t)\le F_X(t+\varepsilon).
\end{align*}
Similarly, the event $\{X\le t-\varepsilon\}$ is contained in the union of the events $\{X_n\le t\}$ and $\{|X_n-X|>\varepsilon\}$, because if $X(\omega)\le t-\varepsilon$ and $|X_n(\omega)-X(\omega)|\le\varepsilon$, then $X_n(\omega)\le t$. Therefore
\begin{align*}
F_X(t-\varepsilon)
\le
F_{X_n}(t)+\mathbb P(\{|X_n-X|>\varepsilon\}).
\end{align*}
Taking the limit inferior and using convergence in probability gives
\begin{align*}
F_X(t-\varepsilon)\le \liminf_{n\to\infty}F_{X_n}(t).
\end{align*}
[guided]
We prove convergence of distribution functions at a fixed continuity point $t$ of $F_X$. The goal is to trap $F_{X_n}(t)$ between values of $F_X$ slightly to the left and slightly to the right of $t$, with an error term controlled by convergence in probability.
Fix a real number $\varepsilon>0$. For each $n\in\mathbb N$, consider the event
\begin{align*}
\{\omega\in\Omega:X_n(\omega)\le t\}.
\end{align*}
If, in addition, $|X_n(\omega)-X(\omega)|\le\varepsilon$, then
\begin{align*}
X(\omega)\le X_n(\omega)+\varepsilon\le t+\varepsilon.
\end{align*}
Thus every $\omega\in\Omega$ with $X_n(\omega)\le t$ must satisfy at least one of the two alternatives $X(\omega)\le t+\varepsilon$ or $|X_n(\omega)-X(\omega)|>\varepsilon$. This proves the event inclusion
\begin{align*}
\{\omega\in\Omega:X_n(\omega)\le t\}
\subset
\{\omega\in\Omega:X(\omega)\le t+\varepsilon\}
\cup
\{\omega\in\Omega:|X_n(\omega)-X(\omega)|>\varepsilon\}.
\end{align*}
Taking probabilities and using subadditivity of the probability measure $\mathbb P$ gives
\begin{align*}
F_{X_n}(t)
=
\mathbb P(\{\omega\in\Omega:X_n(\omega)\le t\})
\le
F_X(t+\varepsilon)+\mathbb P(\{\omega\in\Omega:|X_n(\omega)-X(\omega)|>\varepsilon\}).
\end{align*}
Since $X_n\xrightarrow{\mathbb P}X$, the second term on the right tends to $0$ as $n\to\infty$. Therefore
\begin{align*}
\limsup_{n\to\infty}F_{X_n}(t)\le F_X(t+\varepsilon).
\end{align*}
For the lower bound, we compare from the left. If $X(\omega)\le t-\varepsilon$ and $|X_n(\omega)-X(\omega)|\le\varepsilon$, then
\begin{align*}
X_n(\omega)\le X(\omega)+\varepsilon\le t.
\end{align*}
Thus every $\omega\in\Omega$ with $X(\omega)\le t-\varepsilon$ must either satisfy $X_n(\omega)\le t$ or satisfy $|X_n(\omega)-X(\omega)|>\varepsilon$. Hence
\begin{align*}
\{\omega\in\Omega:X(\omega)\le t-\varepsilon\}
\subset
\{\omega\in\Omega:X_n(\omega)\le t\}
\cup
\{\omega\in\Omega:|X_n(\omega)-X(\omega)|>\varepsilon\}.
\end{align*}
Taking probabilities and using subadditivity gives
\begin{align*}
F_X(t-\varepsilon)
\le
F_{X_n}(t)+\mathbb P(\{\omega\in\Omega:|X_n(\omega)-X(\omega)|>\varepsilon\}).
\end{align*}
Again convergence in probability makes the error term tend to $0$, so
\begin{align*}
F_X(t-\varepsilon)\le \liminf_{n\to\infty}F_{X_n}(t).
\end{align*}
These two inequalities are the essential comparison: convergence in probability controls the error event, while continuity of $F_X$ at $t$ will let $\varepsilon$ shrink to $0$ in the next step.
[/guided]
[/step]
[step:Let the comparison width shrink to zero and conclude convergence in distribution]
Keep the continuity point $t$ of $F_X$ fixed as in the previous step. For every real number $\varepsilon>0$, the two bounds proved there combine to
\begin{align*}
F_X(t-\varepsilon)
\le
\liminf_{n\to\infty}F_{X_n}(t)
\le
\limsup_{n\to\infty}F_{X_n}(t)
\le
F_X(t+\varepsilon).
\end{align*}
Because $F_X$ is continuous at $t$, the one-sided limits satisfy
\begin{align*}
\lim_{\varepsilon\downarrow0}F_X(t-\varepsilon)=F_X(t),
\qquad
\lim_{\varepsilon\downarrow0}F_X(t+\varepsilon)=F_X(t).
\end{align*}
Letting $\varepsilon\downarrow0$ in the preceding inequalities, whose two middle terms do not depend on $\varepsilon$, gives
\begin{align*}
F_X(t)
\le
\liminf_{n\to\infty}F_{X_n}(t)
\le
\limsup_{n\to\infty}F_{X_n}(t)
\le
F_X(t).
\end{align*}
Therefore
\begin{align*}
\lim_{n\to\infty}F_{X_n}(t)=F_X(t).
\end{align*}
Since $t$ was an arbitrary continuity point of $F_X$, this is exactly $X_n\xrightarrow{d}X$. Combining the three separately proved implications completes the proof.
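To see where continuity of $F_X$ at $t$ enters, consider the following sketch (the specific sequence is an assumption made for illustration): let $X_n:=X+1/n$. Then $|X_n-X|=1/n$, so $X_n\xrightarrow{\mathbb P}X$, and $\{X_n\le t\}=\{X\le t-1/n\}$ for every $t\in\mathbb R$. Consequently
\begin{align*}
F_{X_n}(t)=F_X\Bigl(t-\frac1n\Bigr)
\xrightarrow[n\to\infty]{}
\lim_{s\uparrow t}F_X(s)
=
\mathbb P(\{\omega\in\Omega:X(\omega)<t\})
=
F_X(t)-\mathbb P(\{\omega\in\Omega:X(\omega)=t\}).
\end{align*}
The limit equals $F_X(t)$ exactly when $\mathbb P(\{X=t\})=0$, that is, when $F_X$ is continuous at $t$. At an atom of $X$ the bounds $F_X(t-\varepsilon)\le F_{X_n}(t)\le F_X(t+\varepsilon)$ still hold for $n\ge1/\varepsilon$, but they no longer pin down the limit, which is why the definition of convergence in distribution only asks for convergence at continuity points.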
[/step]