[proofplan]
We prove the equivalence by a cycle of quantitative implications. The moment generating function bound gives the two-sided tail estimate by the Chernoff argument. The tail estimate gives moment growth by integrating the distribution function and estimating the resulting gamma integral. Moment growth gives finiteness of the $\psi_2$-Orlicz norm by expanding the exponential series and bounding it termwise. Finally, the $\psi_2$ bound gives a quadratic moment generating function bound: the inequality $ab\le a^2/2+b^2/2$ bounds $\lambda Y$ by a multiple of $Y^2$ plus a term quadratic in $\lambda$, which yields the bound up to an absolute prefactor $2$, and the prefactor is then removed for small $\lambda$ using the centering $\mathbb E[Y]=0$.
[/proofplan]
[step:Derive the sub-Gaussian tail bound from the moment generating function bound]
Let $Y: \Omega \to \mathbb R$ denote the centered real-valued [random variable](/page/Random%20Variable) defined by $Y:=X-\mathbb E[X]$ on the underlying probability space $(\Omega,\mathcal F,\mathbb P)$. Assume condition 1 holds with constant $K_1>0$. Fix $t \ge 0$. For $\lambda > 0$, the monotonicity of the exponential function gives $\mathbb P(Y \ge t)=\mathbb P(e^{\lambda Y} \ge e^{\lambda t})$. Since condition 1 gives $\mathbb E[e^{\lambda Y}]<\infty$, [Markov's inequality](/theorems/514) applies to the non-negative random variable $e^{\lambda Y}$ and gives
\begin{align*}
e^{\lambda t}\mathbb P(e^{\lambda Y} \ge e^{\lambda t})\le \mathbb E[e^{\lambda Y}].
\end{align*}
Using the assumed moment generating function bound, we obtain
\begin{align*}
\mathbb P(Y \ge t)
\le \exp\left(\frac{K_1^2\lambda^2}{2}-\lambda t\right).
\end{align*}
If $t=0$, the bound below is trivial. If $t>0$, the choice $\lambda = t/K_1^2>0$ minimizes the exponent and gives
\begin{align*}
\mathbb P(Y \ge t) \le \exp\left(-\frac{t^2}{2K_1^2}\right).
\end{align*}
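As a brief check of this choice (the name $\varphi$ is introduced here only for illustration), write $\varphi(\lambda):=K_1^2\lambda^2/2-\lambda t$ for the exponent. Then
\begin{align*}
\varphi'(\lambda)&=K_1^2\lambda-t=0 \iff \lambda=\frac{t}{K_1^2},\\
\varphi\left(\frac{t}{K_1^2}\right)&=\frac{t^2}{2K_1^2}-\frac{t^2}{K_1^2}=-\frac{t^2}{2K_1^2},
\end{align*}
and since $\varphi$ is a convex quadratic, this critical point is its global minimum.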
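Condition 1 holds for every $\lambda\in\mathbb R$, so $-Y$ satisfies the same moment generating function bound. Applying the same argument to $-Y$ gives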
\begin{align*}
\mathbb P(Y \le -t) \le \exp\left(-\frac{t^2}{2K_1^2}\right).
\end{align*}
Therefore
\begin{align*}
\mathbb P(|Y| \ge t) \le 2\exp\left(-\frac{t^2}{2K_1^2}\right).
\end{align*}
Thus condition 2 holds with $K_2 = \sqrt 2\,K_1$.
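As an illustrative sanity check, not part of the proof, assume that $Y$ is a centered Gaussian with variance $\sigma^2>0$ (an example chosen here, with $\sigma$ a hypothetical parameter). Its moment generating function is known in closed form, so condition 1 holds with $K_1=\sigma$, and the step above specializes to
\begin{align*}
\mathbb E[e^{\lambda Y}]&=\exp\left(\frac{\sigma^2\lambda^2}{2}\right),\\
\mathbb P(|Y|\ge t)&\le 2\exp\left(-\frac{t^2}{2\sigma^2}\right),
\end{align*}
which is condition 2 with $K_2=\sqrt2\,\sigma$.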
[/step]
[step:Integrate the tail bound to obtain moment growth]
Assume condition 2 holds with constant $K_2>0$. Fix $p \ge 1$. Let $\mathcal L^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $\mathbb R$. The random variable $|Y|^p:\Omega\to[0,\infty)$ is non-negative and measurable, so the [layer-cake formula](/theorems/2956) applies as an identity in $[0,\infty]$, with no prior assumption that $\mathbb E[|Y|^p]$ is finite, and gives
\begin{align*}
\mathbb E[|Y|^p]
= p\int_0^\infty t^{p-1}\mathbb P(|Y|\ge t)\,d\mathcal L^1(t).
\end{align*}
Using the assumed tail bound,
\begin{align*}
\mathbb E[|Y|^p]
&\le 2p\int_0^\infty t^{p-1}\exp\left(-\frac{t^2}{K_2^2}\right)\,d\mathcal L^1(t).
\end{align*}
Make the substitution $u=t^2/K_2^2$, so $t=K_2u^{1/2}$ and
\begin{align*}
d\mathcal L^1(t)=\frac{K_2}{2}u^{-1/2}\,d\mathcal L^1(u).
\end{align*}
The integration domain $[0,\infty)$ is unchanged. Hence
\begin{align*}
\mathbb E[|Y|^p]
&\le pK_2^p\int_0^\infty u^{p/2-1}e^{-u}\,d\mathcal L^1(u).
\end{align*}
Define the gamma integral $G_p$ by
\begin{align*}
G_p := \int_0^\infty u^{p/2-1}e^{-u}\,d\mathcal L^1(u).
\end{align*}
We estimate this gamma integral explicitly. Let $a:=p/2$. If $1\le p\le2$, then $a\in[1/2,1]$ and
\begin{align*}
G_p=\Gamma(a)\le \sup_{r\in[1/2,1]}\Gamma(r)=\sqrt\pi.
\end{align*}
If $p\ge2$, then $a\ge1$ and the standard [Stirling upper bound](/theorems/1109) for the gamma function gives
\begin{align*}
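G_p=\Gamma(a)\le \sqrt{2\pi}\,a^{a-1/2}e^{-a}e^{1/(12a)}\le C_\Gamma a^a,
\end{align*}
where $C_\Gamma:=\sqrt{2\pi}e^{1/6}$ is a universal constant, independent of $a$; the last inequality uses $a^{-1/2}e^{-a}\le1$ and $e^{1/(12a)}\le e^{1/6}$ for $a\ge1$. In both cases, increasing the universal constant if necessary, there exists a universal constant $C_0>0$ such that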
\begin{align*}
pG_p \le (C_0\sqrt p)^p.
\end{align*}
Consequently, inserting this into $\mathbb E[|Y|^p]\le pK_2^pG_p$ and taking $p$-th roots gives
\begin{align*}
\left(\mathbb E[|Y|^p]\right)^{1/p} \le C_0K_2\sqrt p.
\end{align*}
Thus condition 3 holds with $K_3=C_0K_2$.
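The constant $C_0$ is left implicit above. As a sketch, one admissible (and certainly not optimal) choice appears to be $C_0=2\sqrt\pi$. For $1\le p\le2$ this follows from $pG_p\le2\sqrt\pi\le(2\sqrt\pi)^p\le(2\sqrt\pi\sqrt p)^p$. For $p\ge2$, using $\Gamma(p/2)\le\sqrt{2\pi}e^{1/6}(p/2)^{p/2}$ from the Stirling bound and $\sup_{p>0}p\,2^{-p/2}=2/(e\log2)$,
\begin{align*}
pG_p
&\le \sqrt{2\pi}e^{1/6}\left(p\,2^{-p/2}\right)p^{p/2}\\
&\le \sqrt{2\pi}e^{1/6}\cdot\frac{2}{e\log 2}\,p^{p/2}\\
&< 4\pi\,p^{p/2}
\le (2\sqrt\pi\sqrt p)^p,
\end{align*}
where the last step uses $(2\sqrt\pi)^p\ge(2\sqrt\pi)^2=4\pi$ for $p\ge2$.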
[guided]
Assume that the tail estimate holds with constant $K_2>0$. In this step, the goal is to turn information about the probability of large deviations into bounds for all moments. The correct identity is the [layer-cake formula](/theorems/2956), applied in the extended non-negative sense to the non-negative measurable random variable $|Y|^p$:
\begin{align*}
\mathbb E[|Y|^p]
= \int_0^\infty \mathbb P(|Y|^p \ge s)\,d\mathcal L^1(s).
\end{align*}
Now substitute $s=t^p$. Since $d\mathcal L^1(s)=pt^{p-1}\,d\mathcal L^1(t)$ and $|Y|^p\ge t^p$ is equivalent to $|Y|\ge t$ for $t\ge0$, this becomes
\begin{align*}
\mathbb E[|Y|^p]
= p\int_0^\infty t^{p-1}\mathbb P(|Y|\ge t)\,d\mathcal L^1(t).
\end{align*}
The assumed tail estimate gives
\begin{align*}
\mathbb E[|Y|^p]
&\le 2p\int_0^\infty t^{p-1}\exp\left(-\frac{t^2}{K_2^2}\right)\,d\mathcal L^1(t).
\end{align*}
We now compute the scale of this integral. Use the substitution $u=t^2/K_2^2$. Then $t=K_2u^{1/2}$ and
\begin{align*}
d\mathcal L^1(t)=\frac{K_2}{2}u^{-1/2}\,d\mathcal L^1(u).
\end{align*}
The domain $t\in[0,\infty)$ becomes $u\in[0,\infty)$. Therefore
\begin{align*}
2p\int_0^\infty t^{p-1}\exp\left(-\frac{t^2}{K_2^2}\right)\,d\mathcal L^1(t)
&=pK_2^p\int_0^\infty u^{p/2-1}e^{-u}\,d\mathcal L^1(u).
\end{align*}
Define
\begin{align*}
G_p:=\int_0^\infty u^{p/2-1}e^{-u}\,d\mathcal L^1(u).
\end{align*}
We now justify the growth estimate for $G_p$. Let $a:=p/2$. If $1\le p\le2$, then $a\in[1/2,1]$ and
\begin{align*}
G_p=\Gamma(a)\le \sup_{r\in[1/2,1]}\Gamma(r)=\sqrt\pi.
\end{align*}
If $p\ge2$, then $a\ge1$ and the [Stirling upper bound](/theorems/1109) for the gamma function gives
\begin{align*}
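G_p=\Gamma(a)\le \sqrt{2\pi}\,a^{a-1/2}e^{-a}e^{1/(12a)}\le C_\Gamma a^a,
\end{align*}
where $C_\Gamma:=\sqrt{2\pi}e^{1/6}$ is a universal constant, independent of $a$; the last inequality uses $a^{-1/2}e^{-a}\le1$ and $e^{1/(12a)}\le e^{1/6}$ for $a\ge1$. Therefore, after enlarging a universal constant if necessary, there exists a universal constant $C_0>0$ such that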
\begin{align*}
pG_p \le (C_0\sqrt p)^p.
\end{align*}
This estimate is the analytic content behind the heuristic that sub-Gaussian tails force the $L^p$ norm to grow at most like $\sqrt p$. Substituting it into the previous bound gives
\begin{align*}
\mathbb E[|Y|^p]\le K_2^p(C_0\sqrt p)^p.
\end{align*}
Taking the $p$-th root yields
\begin{align*}
\left(\mathbb E[|Y|^p]\right)^{1/p}\le C_0K_2\sqrt p.
\end{align*}
Thus the moment-growth condition holds with $K_3=C_0K_2$.
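To see the scale of the bound in a concrete case, take $p=2$ (a worked instance added for illustration). Then $G_2=\Gamma(1)=1$, and the integrated tail bound reads
\begin{align*}
\mathbb E[Y^2]\le 2K_2^2\,G_2=2K_2^2.
\end{align*}
For example, if $Y$ is assumed to be a centered Gaussian with variance $\sigma^2$, the standard Gaussian tail bound gives the tail estimate with $K_2=\sqrt2\,\sigma$, so this yields $\mathbb E[Y^2]\le4\sigma^2$, compared with the exact value $\sigma^2$. The loss is only a universal factor, which is all the argument requires.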
[/guided]
[/step]
[step:Use moment growth to make the Orlicz exponential integrable]
Assume condition 3 holds with constant $K_3>0$. Define
\begin{align*}
s:=4K_3.
\end{align*}
For each integer $N\ge1$, define the partial exponential polynomial $P_N:[0,\infty)\to[0,\infty)$ by
\begin{align*}
P_N(r):=\sum_{m=0}^N\frac{r^m}{m!}.
\end{align*}
The sequence $(P_N(Y^2/s^2))_{N=1}^\infty$ is non-negative and increases pointwise to $\exp(Y^2/s^2)$, so the hypotheses of the [monotone convergence theorem](/theorems/4) hold for the probability measure $\mathbb P$. Hence expectation may be passed through the increasing limit, giving
\begin{align*}
\mathbb E\left[\exp\left(\frac{Y^2}{s^2}\right)\right]
=1+\sum_{m=1}^\infty \frac{\mathbb E[|Y|^{2m}]}{m!s^{2m}}.
\end{align*}
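The intermediate step behind this identity can be spelled out as follows. For each fixed $N$, linearity of expectation applied to the finite sum of non-negative terms gives
\begin{align*}
\mathbb E\left[P_N\left(\frac{Y^2}{s^2}\right)\right]
=\sum_{m=0}^N\frac{\mathbb E[|Y|^{2m}]}{m!\,s^{2m}},
\end{align*}
where each moment is finite by condition 3, and letting $N\to\infty$ on both sides (monotone convergence on the left, an increasing sum of non-negative terms on the right) yields the displayed series.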
Applying the moment bound with $p=2m$ for each integer $m\ge1$ gives
\begin{align*}
\mathbb E\left[\exp\left(\frac{Y^2}{s^2}\right)\right]
\le 1+\sum_{m=1}^\infty \frac{(K_3\sqrt{2m})^{2m}}{m!(4K_3)^{2m}}.
\end{align*}
Since $(K_3\sqrt{2m})^{2m}=K_3^{2m}(2m)^m$ and $(4K_3)^{2m}=16^mK_3^{2m}$, the powers of $K_3$ cancel and
\begin{align*}
\mathbb E\left[\exp\left(\frac{Y^2}{s^2}\right)\right]
\le 1+\sum_{m=1}^\infty \frac{(2m)^m}{16^m m!}.
\end{align*}
Since $m!\ge (m/e)^m$, each summand is bounded by $(2e/16)^m=(e/8)^m$. Also $e/8<1/2$, so
\begin{align*}
\sum_{m=1}^\infty \left(\frac e8\right)^m=\frac{e/8}{1-e/8}<1.
\end{align*}
Hence
\begin{align*}
\mathbb E\left[\exp\left(\frac{Y^2}{s^2}\right)\right]<2.
\end{align*}
Therefore $s=4K_3$ is admissible in the infimum defining the Orlicz norm, so $\|Y\|_{\psi_2}\le 4K_3$ and condition 4 holds.
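For a sense of how much room the estimate leaves, the first few summands $(2m)^m/(16^m m!)$ can be evaluated directly and compared with the geometric bound $(e/8)^m$ (decimal values rounded):
\begin{align*}
m=1:&\quad \frac{2}{16}=\frac18=0.125 \le \frac e8\approx0.340,\\
m=2:&\quad \frac{4^2}{16^2\cdot2}=\frac1{32}\approx0.031 \le \left(\frac e8\right)^2\approx0.115,\\
m=3:&\quad \frac{6^3}{16^3\cdot6}=\frac9{1024}\approx0.009 \le \left(\frac e8\right)^3\approx0.039.
\end{align*}
The geometric series itself sums to $(e/8)/(1-e/8)\approx0.515$, comfortably below $1$.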
[/step]
[step:Convert the Orlicz bound into a centered moment generating function bound]
Assume condition 4 holds. Let
\begin{align*}
A:=\|Y\|_{\psi_2}.
\end{align*}
If $A=0$, then $Y=0$ almost surely and condition 1 holds with every $K_1>0$. Suppose $A>0$. By the definition of the infimum, together with the fact that $s\mapsto\mathbb E[\exp(Y^2/s^2)]$ is non-increasing on $(0,\infty)$, for every $\varepsilon>0$ the number
\begin{align*}
s_\varepsilon:=A+\varepsilon
\end{align*}
satisfies
\begin{align*}
\mathbb E\left[\exp\left(\frac{Y^2}{s_\varepsilon^2}\right)\right]\le 2.
\end{align*}
For every real number $a$ and every real number $b$, the inequality
\begin{align*}
ab\le \frac{a^2}{2}+\frac{b^2}{2}
\end{align*}
applied with $a=Y/s_\varepsilon$ and $b=\lambda s_\varepsilon$ gives
\begin{align*}
\lambda Y\le \frac{Y^2}{2s_\varepsilon^2}+\frac{\lambda^2s_\varepsilon^2}{2}.
\end{align*}
Therefore
\begin{align*}
\mathbb E[e^{\lambda Y}]
\le \exp\left(\frac{\lambda^2s_\varepsilon^2}{2}\right)
\mathbb E\left[\exp\left(\frac{Y^2}{2s_\varepsilon^2}\right)\right].
\end{align*}
Since $Y^2/(2s_\varepsilon^2)\le Y^2/s_\varepsilon^2$, monotonicity of the exponential function gives
\begin{align*}
\mathbb E[e^{\lambda Y}]
\le 2\exp\left(\frac{\lambda^2s_\varepsilon^2}{2}\right).
\end{align*}
This bound has an absolute prefactor. To remove it, first obtain small-$\lambda$ control directly from the Orlicz bound. For every integer $m\ge1$, the elementary inequality $r^m/m!\le e^r$ for $r\ge0$ gives, with $r=Y^2/s_\varepsilon^2$,
\begin{align*}
\frac{|Y|^{2m}}{m!s_\varepsilon^{2m}}\le \exp\left(\frac{Y^2}{s_\varepsilon^2}\right).
\end{align*}
Taking expectations and using $\mathbb E[\exp(Y^2/s_\varepsilon^2)]\le2$ gives
\begin{align*}
\mathbb E[|Y|^{2m}]\le 2m!s_\varepsilon^{2m}.
\end{align*}
Using $m!\le m^m$ and setting $m=\lceil q/2\rceil$ for a real number $q\ge2$, the [monotonicity of $L^q$ norms on a probability space](/page/%24L%5Ep%24%20Spaces) gives the moment estimate
\begin{align*}
\left(\mathbb E[|Y|^q]\right)^{1/q}\le C_1s_\varepsilon\sqrt q,
\end{align*}
where $C_1:=2\sqrt2$ is a universal constant. In particular, for every integer $m\ge2$,
\begin{align*}
|\mathbb E[Y^m]|\le \mathbb E[|Y|^m]\le (C_1s_\varepsilon\sqrt m)^m.
\end{align*}
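The moment estimate for real exponents above compresses several steps. As a sketch of the omitted computation, for real $q\ge2$ and $m=\lceil q/2\rceil$ one has $q\le2m$ and $m\le q$, so
\begin{align*}
\left(\mathbb E[|Y|^q]\right)^{1/q}
&\le \left(\mathbb E[|Y|^{2m}]\right)^{1/(2m)}
\le \left(2\,m^m s_\varepsilon^{2m}\right)^{1/(2m)}\\
&=2^{1/(2m)}\sqrt m\,s_\varepsilon
\le \sqrt2\,s_\varepsilon\sqrt q
\le C_1s_\varepsilon\sqrt q,
\end{align*}
so the stated value $C_1=2\sqrt2$ is admissible with room to spare.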
For each integer $N\ge0$, define $Q_N:\mathbb R\to\mathbb R$ by $Q_N(u):=\sum_{m=0}^N u^m/m!$, so that $Q_N(\lambda Y)\to e^{\lambda Y}$ pointwise as $N\to\infty$. Since $m!\ge(m/e)^m$, the bound above shows that the series $\sum_{m=0}^\infty |\lambda|^m\mathbb E[|Y|^m]/m!$ converges whenever $|\lambda|s_\varepsilon<1/(eC_1)$. Hence [Fubini's theorem](/theorems/2961), applied to the absolutely summable series of integrable functions with counting measure on $\mathbb N_0$, justifies termwise expectation of the exponential series in this range, that is, $\mathbb E[e^{\lambda Y}]=\lim_{N\to\infty}\mathbb E[Q_N(\lambda Y)]$. Because $\mathbb E[Y]=0$, for $|\lambda|s_\varepsilon<1/(eC_1)$ we get
\begin{align*}
\mathbb E[e^{\lambda Y}]
=1+\sum_{m=2}^\infty \frac{\lambda^m\mathbb E[Y^m]}{m!}.
\end{align*}
Using the moment bound and $m!\ge(m/e)^m$ again gives
\begin{align*}
\sum_{m=2}^\infty \frac{|\lambda|^m|\mathbb E[Y^m]|}{m!}
\le \sum_{m=2}^\infty (eC_1|\lambda|s_\varepsilon)^m m^{-m/2}.
\end{align*}
Since $m^{-m/2}\le1$ for every integer $m\ge2$, we further obtain
\begin{align*}
\sum_{m=2}^\infty \frac{|\lambda|^m|\mathbb E[Y^m]|}{m!}
\le (eC_1|\lambda|s_\varepsilon)^2\sum_{m=2}^\infty (eC_1|\lambda|s_\varepsilon)^{m-2}.
\end{align*}
Choose $c_1:=1/(2eC_1)$. If $|\lambda|s_\varepsilon\le c_1$, then the geometric series is bounded by $2$, and therefore
\begin{align*}
\mathbb E[e^{\lambda Y}]
\le 1+C_2\lambda^2s_\varepsilon^2,
\end{align*}
where $C_2:=2e^2C_1^2$ is universal. Using $1+r\le e^r$ for $r\ge0$ with $r=C_2\lambda^2s_\varepsilon^2$ gives
\begin{align*}
\mathbb E[e^{\lambda Y}]
\le \exp(C_2\lambda^2s_\varepsilon^2).
\end{align*}
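For $|\lambda|s_\varepsilon>c_1$ we have $\lambda^2s_\varepsilon^2/c_1^2>1$, hence $2=e^{\log2}\le\exp\left((\log2)\lambda^2s_\varepsilon^2/c_1^2\right)$, and the prefactor estimate $\mathbb E[e^{\lambda Y}]\le2\exp(\lambda^2s_\varepsilon^2/2)$ is controlled by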
\begin{align*}
2\exp\left(\frac{\lambda^2s_\varepsilon^2}{2}\right)
\le \exp\left(\left(\frac12+\frac{\log 2}{c_1^2}\right)\lambda^2s_\varepsilon^2\right).
\end{align*}
Combining the small and large cases, there is a universal constant $C_3>0$ such that
\begin{align*}
\mathbb E[e^{\lambda Y}]\le \exp(C_3\lambda^2s_\varepsilon^2)
\end{align*}
for every $\lambda\in\mathbb R$. Letting $\varepsilon\downarrow0$ gives
\begin{align*}
\mathbb E[e^{\lambda Y}]\le \exp(C_3\lambda^2A^2).
\end{align*}
Thus condition 1 holds with $K_1=\sqrt{2C_3}\,\|Y\|_{\psi_2}$.
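For orientation, the constants in this step can be made explicit. The following is a sketch under the assumption that $C_3$ is taken to be the larger of the two constants arising in the small and large cases, which is one admissible choice and not necessarily the one intended. With $C_1=2\sqrt2$,
\begin{align*}
c_1&=\frac1{2eC_1}=\frac1{4\sqrt2\,e},\qquad c_1^{-2}=32e^2,\\
C_2&=2e^2C_1^2=16e^2\approx118.2,\\
\frac12+\frac{\log2}{c_1^2}&=\frac12+32e^2\log2\approx164.4,
\end{align*}
so $C_3=\max\{C_2,\tfrac12+32e^2\log2\}\approx164.4$ and $\sqrt{2C_3}\approx18.1$. These values are far from optimal; only their universality matters.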
[/step]
[step:Collect the quantitative implications]
If $Y=0$ almost surely, then the infima of the admissible constants in conditions 1, 2, and 3 are all $0$, and $\|Y\|_{\psi_2}=0$, so the asserted comparability holds. Suppose now that $Y$ is not almost surely zero. The previous steps show that whenever one condition holds with a given constant, the next condition holds with a constant satisfying the corresponding estimate below. First,
\begin{align*}
K_2 \le \sqrt2K_1.
\end{align*}
They also give
\begin{align*}
K_3 \le C_0K_2.
\end{align*}
Furthermore,
\begin{align*}
\|Y\|_{\psi_2}\le 4K_3.
\end{align*}
Finally,
\begin{align*}
K_1 \le C_4\|Y\|_{\psi_2},
\end{align*}
where $C_0>0$ is the universal constant from the moment-growth step and $C_4:=\sqrt{2C_3}$ is the universal constant from the preceding step. Taking infima over all admissible positive constants in the nonzero case gives two-sided comparability among the infimal admissible constants $K_1^*$, $K_2^*$, $K_3^*$, and $\|Y\|_{\psi_2}$. Therefore the four conditions are equivalent up to universal multiplicative constants, as claimed.
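Written out as a single chain (a restatement of the four estimates above for the infimal constants, assuming each infimum is taken over the admissible positive constants), the cycle reads
\begin{align*}
K_1^*\le C_4\|Y\|_{\psi_2}\le 4C_4K_3^*\le 4C_4C_0K_2^*\le 4\sqrt2\,C_4C_0K_1^*,
\end{align*}
so each of the four quantities is bounded by a universal constant multiple of any other.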
[/step]