[proofplan]
The forecast error representation reduces the prediction error to a finite linear combination of future innovations. Since the innovations are Gaussian and independent across time, this finite linear combination is Gaussian, and its conditional law given $\mathcal F_t$ is the same as its unconditional law because only future innovations appear. Computing its mean and variance gives the stated conditional normal distribution. The prediction interval then follows by standardising the error and using the central quantiles of the standard normal distribution.
[/proofplan]
[step:Represent the forecast error as a finite sum of future innovations]
Let $(\Omega,\mathcal F,\mathbb P)$ be the probability space on which the ARMA process and its innovations are defined. For each $t \in \mathbb Z$, define the forecasting sigma-algebra $\mathcal F_t \subseteq \mathcal F$ by
\begin{align*}
\mathcal F_t := \sigma(Y_s : s \leq t).
\end{align*}
Fix $t \in \mathbb Z$ and $h \in \mathbb N$; by the global convention for Androma, $\mathbb N = \{1,2,3,\dots\}$. Define the forecast error random variable
\begin{align*}
E_{t,h} : \Omega &\to \mathbb R \\
\omega &\mapsto Y_{t+h}(\omega) - \hat Y_t(h)(\omega).
\end{align*}
By the assumed ARMA forecast error representation,
\begin{align*}
E_{t,h}
=
\sum_{j=0}^{h-1} \psi_j \varepsilon_{t+h-j}.
\end{align*}
The indices $t+h-j$ appearing in this sum are $t+1,t+2,\dots,t+h$, so $E_{t,h}$ depends only on innovations strictly after time $t$.
[/step]
[step:Check independence from the forecasting information]
For each $j \in \{0,\dots,h-1\}$, the innovation $\varepsilon_{t+h-j}$ is a future innovation relative to time $t$. Since the innovations $(\varepsilon_s)_{s \in \mathbb Z}$ are independent and the causal process satisfies $Y_s = \mu + \sum_{k=0}^{\infty}\psi_k \varepsilon_{s-k}$, each $Y_s$ with $s \leq t$ is measurable with respect to the sigma-algebra generated by innovations up to time $t$.
Thus the random vector
\begin{align*}
(\varepsilon_{t+1},\dots,\varepsilon_{t+h})
\end{align*}
is independent of $\mathcal F_t$. Since $E_{t,h}$ is a measurable function of this future innovation vector, $E_{t,h}$ is independent of $\mathcal F_t$.
[/step]
[step:Compute the conditional Gaussian law of the forecast error]
Define the coefficient vector $a \in \mathbb R^h$ by
\begin{align*}
a := (\psi_{h-1},\psi_{h-2},\dots,\psi_0).
\end{align*}
Define the future innovation vector $Z_{t,h}: \Omega \to \mathbb R^h$ by
\begin{align*}
Z_{t,h}: \Omega &\to \mathbb R^h \\
\omega &\mapsto (\varepsilon_{t+1}(\omega),\varepsilon_{t+2}(\omega),\dots,\varepsilon_{t+h}(\omega)).
\end{align*}
Then
\begin{align*}
E_{t,h}
=
\sum_{j=0}^{h-1}\psi_j \varepsilon_{t+h-j}
=
a \cdot Z_{t,h}.
\end{align*}
Because the components of $Z_{t,h}$ are independent $\mathcal N(0,\sigma^2)$ random variables, the linear combination $a \cdot Z_{t,h}$ is Gaussian with mean
\begin{align*}
\mathbb E[E_{t,h}]
=
\sum_{j=0}^{h-1}\psi_j \mathbb E[\varepsilon_{t+h-j}]
=
0
\end{align*}
and variance
\begin{align*}
\operatorname{Var}(E_{t,h})
&=
\operatorname{Var}\left(\sum_{j=0}^{h-1}\psi_j \varepsilon_{t+h-j}\right) \\
&=
\sum_{j=0}^{h-1}\psi_j^2 \operatorname{Var}(\varepsilon_{t+h-j}) \\
&=
\sigma^2 \sum_{j=0}^{h-1}\psi_j^2,
\end{align*}
where the cross-covariance terms vanish because distinct innovations are independent.
Since $E_{t,h}$ is independent of $\mathcal F_t$, its conditional distribution given $\mathcal F_t$ equals its unconditional distribution. Therefore, conditional on $\mathcal F_t$,
\begin{align*}
Y_{t+h}-\hat Y_t(h)
=
E_{t,h}
\sim
\mathcal N\left(0,\sigma^2\sum_{j=0}^{h-1}\psi_j^2\right).
\end{align*}
[/step]
[step:Standardise the error and read off the central Gaussian interval]
Define the forecast error standard deviation
\begin{align*}
s_h := \sigma\left(\sum_{j=0}^{h-1}\psi_j^2\right)^{1/2}.
\end{align*}
Since $\psi_0 = 1$ and $\sigma > 0$, we have $s_h > 0$. The conditional Gaussian law proved above gives
\begin{align*}
\frac{Y_{t+h}-\hat Y_t(h)}{s_h}
\;\middle|\; \mathcal F_t
\sim
\mathcal N(0,1).
\end{align*}
Let $z_{1-\alpha/2}$ be the $(1-\alpha/2)$-quantile of the standard normal distribution. Hence
\begin{align*}
\mathbb P\left(
-z_{1-\alpha/2}
\leq
\frac{Y_{t+h}-\hat Y_t(h)}{s_h}
\leq
z_{1-\alpha/2}
\;\middle|\;
\mathcal F_t
\right)
=
1-\alpha.
\end{align*}
Multiplying the middle inequality by $s_h > 0$ and adding $\hat Y_t(h)$ gives
\begin{align*}
\mathbb P\left(
\hat Y_t(h)-z_{1-\alpha/2}s_h
\leq
Y_{t+h}
\leq
\hat Y_t(h)+z_{1-\alpha/2}s_h
\;\middle|\;
\mathcal F_t
\right)
=
1-\alpha.
\end{align*}
Substituting the definition of $s_h$ yields exactly
\begin{align*}
\left[
\hat Y_t(h) - z_{1-\alpha/2}\sigma\left(\sum_{j=0}^{h-1}\psi_j^2\right)^{1/2},
\;
\hat Y_t(h) + z_{1-\alpha/2}\sigma\left(\sum_{j=0}^{h-1}\psi_j^2\right)^{1/2}
\right],
\end{align*}
so this interval has exact conditional coverage $1-\alpha$ under the model.
[/step]