[proofplan]
We first translate the $\psi_2$ bound into a uniform sub-Gaussian tail estimate for each [random variable](/page/Random%20Variable). A union bound then gives a tail estimate for the maximum $M := \max_{1 \le i \le N}|X_i|$. The expectation bound follows by integrating this tail estimate and splitting the integral at the natural scale $K\sqrt{\log(N+1)}$. The deviation estimate is obtained by applying the same union-bound tail estimate at the shifted threshold and absorbing the finite maximum penalty into the logarithmic part of the threshold.
[/proofplan]
[step:Reduce the $\psi_2$ hypothesis to individual tail bounds]
Let $(\Omega,\mathcal F,\mathbb P)$ denote the probability space on which the random variables $X_1,\dots,X_N$ are defined. Since the statement has been formulated with $K>0$, all expressions involving $K^{-2}$ are defined. Fix $i \in \{1,\dots,N\}$. By definition,
\begin{align*}
\|X_i\|_{\psi_2}=\inf\left\{a>0: \mathbb E\left[\exp\left(\frac{X_i^2}{a^2}\right)\right]\le 2\right\}.
\end{align*}
Since $\|X_i\|_{\psi_2}\le K$, every $K+\varepsilon$ with $\varepsilon>0$ belongs to this admissible set, and hence
\begin{align*}
\mathbb E\left[\exp\left(\frac{X_i^2}{(K+\varepsilon)^2}\right)\right] \le 2.
\end{align*}
As $\varepsilon \downarrow 0$, the non-negative random variables $Y_{i,\varepsilon}: \Omega \to [0,\infty)$ defined by
\begin{align*}
Y_{i,\varepsilon}(\omega) := \exp\left(\frac{X_i(\omega)^2}{(K+\varepsilon)^2}\right)
\end{align*}
increase pointwise to the random variable $Y_i: \Omega \to [0,\infty)$ defined by
\begin{align*}
Y_i(\omega) := \exp\left(\frac{X_i(\omega)^2}{K^2}\right).
\end{align*}
By [Fatou's lemma](/theorems/510),
\begin{align*}
\mathbb E[Y_i]
\le \liminf_{\varepsilon \downarrow 0}\mathbb E[Y_{i,\varepsilon}]
\le 2.
\end{align*}
For each $t \ge 0$, applying [Markov's inequality](/theorems/514) to the non-negative random variable $Y_i$ at the level $\exp(t^2/K^2)$ gives
\begin{align*}
\mathbb P(|X_i| \ge t)=\mathbb P\left(\exp\left(\frac{X_i^2}{K^2}\right) \ge \exp\left(\frac{t^2}{K^2}\right)\right) \le 2\exp\left(-\frac{t^2}{K^2}\right).
\end{align*}
[/step]
[step:Apply the union bound to the maximum]
Define the random variable $M: \Omega \to [0,\infty)$ by
\begin{align*}
M(\omega) := \max_{1 \le i \le N}|X_i(\omega)|.
\end{align*}
For every $t \ge 0$, the event $\{M \ge t\}$ is contained in the union $\bigcup_{i=1}^N \{|X_i| \ge t\}$. Therefore the [union bound](/theorems/6078) and the tail estimate from the previous step give
\begin{align*}
\mathbb P(M \ge t) \le \sum_{i=1}^N \mathbb P(|X_i| \ge t) \le 2N\exp\left(-\frac{t^2}{K^2}\right).
\end{align*}
Also $\mathbb P(M \ge t) \le 1$, so
\begin{align*}
\mathbb P(M \ge t) \le \min\left\{1,2N\exp\left(-\frac{t^2}{K^2}\right)\right\}.
\end{align*}
[guided]
The maximum is large only if at least one of the variables is large. Formally, define the random variable $M: \Omega \to [0,\infty)$ by
\begin{align*}
M(\omega) := \max_{1 \le i \le N}|X_i(\omega)|.
\end{align*}
For a fixed threshold $t \ge 0$, if $M(\omega) \ge t$, then by the definition of the maximum there exists an index $i \in \{1,\dots,N\}$ such that $|X_i(\omega)| \ge t$. Hence
\begin{align*}
\{M \ge t\} \subset \bigcup_{i=1}^N \{|X_i| \ge t\}.
\end{align*}
Taking probabilities and using the [union bound](/theorems/6078), equivalently finite subadditivity of $\mathbb P$ over the finite family of events $\{|X_i| \ge t\}$, gives
\begin{align*}
\mathbb P(M \ge t) \le \sum_{i=1}^N \mathbb P(|X_i| \ge t).
\end{align*}
For each fixed index $i \in \{1,\dots,N\}$, the preceding tail argument applies Markov's inequality to the non-negative random variable $Y_i=\exp(X_i^2/K^2)$ at the level $\exp(t^2/K^2)$. Since the $\psi_2$ hypothesis gave $\mathbb E[Y_i]\le 2$, we obtain
\begin{align*}
\mathbb P(|X_i| \ge t)
= \mathbb P\left(Y_i \ge \exp\left(\frac{t^2}{K^2}\right)\right)
\le 2\exp\left(-\frac{t^2}{K^2}\right).
\end{align*}
Substituting this bound into the finite sum yields
\begin{align*}
\mathbb P(M \ge t) \le 2N\exp\left(-\frac{t^2}{K^2}\right).
\end{align*}
Since probabilities are always at most $1$, we may combine this with the elementary bound $\mathbb P(M \ge t) \le 1$ and obtain
\begin{align*}
\mathbb P(M \ge t) \le \min\left\{1,2N\exp\left(-\frac{t^2}{K^2}\right)\right\}.
\end{align*}
This is the key estimate: the price of taking the maximum over $N$ variables is the factor $N$, and the logarithm in the final bound will be exactly what cancels that factor.
[/guided]
[/step]
[step:Integrate the tail bound to estimate the expectation]
Let $\mathcal L^1$ denote one-dimensional [Lebesgue measure](/page/Lebesgue%20Measure) on $[0,\infty)$. Since $M$ is the finite maximum of measurable real-valued random variables, $M$ is measurable, and $M \ge 0$. The [layer-cake identity](/theorems/2956) for non-negative measurable random variables gives, with both sides allowed initially to take the value $+\infty$,
\begin{align*}
\mathbb E[M] = \int_0^\infty \mathbb P(M \ge t)\,d\mathcal L^1(t).
\end{align*}
Let
\begin{align*}
L := \log(N+1), \qquad a := K\sqrt{\log(2N)}.
\end{align*}
Using the tail estimate from the previous step and splitting the integral at $a$, we obtain
\begin{align*}
\mathbb E[M] \le \int_0^a 1\,d\mathcal L^1(t)+ \int_a^\infty 2N\exp\left(-\frac{t^2}{K^2}\right)\,d\mathcal L^1(t).
\end{align*}
For the second integral, the change of variables $t=Ks$ transforms $d\mathcal L^1(t)$ into $K\,d\mathcal L^1(s)$ and transforms the domain $[a,\infty)$ into $[\sqrt{\log(2N)},\infty)$. Hence
\begin{align*}
\mathbb E[M] \le a + K\int_{\sqrt{\log(2N)}}^\infty 2N e^{-s^2}\,d\mathcal L^1(s),
\end{align*}
where we used the change of variables $t=Ks$, so $d\mathcal L^1(t)=K\,d\mathcal L^1(s)$. For $s \ge \sqrt{\log(2N)}$, the Gaussian tail estimate
\begin{align*}
\int_r^\infty e^{-s^2}\,d\mathcal L^1(s) \le \frac{e^{-r^2}}{2r}
\end{align*}
with $r=\sqrt{\log(2N)}$ gives
\begin{align*}
K\int_{\sqrt{\log(2N)}}^\infty 2N e^{-s^2}\,d\mathcal L^1(s) \le K \cdot 2N \cdot \frac{e^{-\log(2N)}}{2\sqrt{\log(2N)}} = \frac{K}{2\sqrt{\log(2N)}}.
\end{align*}
Since $N \ge 1$, we have $\log(2N) \ge \log 2$, $\log(2N) \le 2\log(N+1)$, and $\log(N+1) \ge \log 2$. Therefore
\begin{align*}
\frac{K}{2\sqrt{\log 2}}
\le \frac{K\sqrt{\log(N+1)}}{2\log 2},
\end{align*}
and hence
\begin{align*}
\mathbb E[M] \le K\sqrt{2\log(N+1)} + \frac{K\sqrt{\log(N+1)}}{2\log 2} \le C_1 K\sqrt{\log(N+1)}
\end{align*}
for the universal constant
\begin{align*}
C_1 := \sqrt{2} + \frac{1}{2\log 2}.
\end{align*}
[/step]
[step:Shift the threshold to obtain the deviation estimate]
Let $C := 2\sqrt{2}$ and define
\begin{align*}
b := C K\sqrt{\log(N+1)} + u.
\end{align*}
For every $u \ge 0$, the [union bound](/theorems/6078) tail estimate gives
\begin{align*}
\mathbb P(M \ge b) \le 2N\exp\left(-\frac{b^2}{K^2}\right).
\end{align*}
Since $b \ge C K\sqrt{\log(N+1)}$, we have
\begin{align*}
\frac{b^2}{K^2} \ge \frac{C^2}{2}\log(N+1) + \frac{u^2}{2K^2},
\end{align*}
using $(r+s)^2 \ge r^2+s^2$ for $r,s \ge 0$ and weakening the bound by a factor of $2$. Therefore
\begin{align*}
\mathbb P(M \ge b) \le 2N\exp\left(-\frac{C^2}{2}\log(N+1)\right)\exp\left(-\frac{u^2}{2K^2}\right).
\end{align*}
With $C=2\sqrt{2}$, we have $C^2/2=4$, and since $N \le (N+1)^4$ for $N \ge 1$,
\begin{align*}
N\exp\left(-4\log(N+1)\right)
= \frac{N}{(N+1)^4}
\le 1.
\end{align*}
Thus
\begin{align*}
\mathbb P\left(M \ge C K\sqrt{\log(N+1)}+u\right)
\le 2\exp\left(-\frac{u^2}{2K^2}\right).
\end{align*}
Taking $c := 1/2$ proves the claimed deviation estimate.
[/step]
[step:Combine the estimates]
The expectation estimate follows from the third step with the universal constant $C_1$. The deviation estimate follows from the fourth step with universal constants $C=2\sqrt{2}$ and $c=1/2$. Enlarging $C$ if necessary so that the same universal constant may be used in both displayed conclusions gives the theorem.
[/step]