Gaussian ARMA Prediction Interval Theorem — Statement & Proof

Gaussian ARMA Prediction Interval Theorem (Theorem # 3667)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

No discussion available for this theorem.

Proof

[proofplan] The forecast error representation reduces the prediction error to a finite linear combination of future innovations. Since the innovations are Gaussian and independent across time, this finite linear combination is Gaussian, and its conditional law given $\mathcal F_t$ is the same as its unconditional law because only future innovations appear. Computing its mean and variance gives the stated conditional normal distribution. The prediction interval then follows by standardising the error and using the central quantiles of the standard normal distribution. [/proofplan] [step:Represent the forecast error as a finite sum of future innovations] Let $(\Omega,\mathcal F,\mathbb P)$ be the probability space on which the ARMA process and its innovations are defined. For each $t \in \mathbb Z$, define the forecasting sigma-algebra $\mathcal F_t \subseteq \mathcal F$ by \begin{align*} \mathcal F_t := \sigma(Y_s : s \leq t). \end{align*} Fix $t \in \mathbb Z$ and $h \in \mathbb N$; by the global convention for Androma, $\mathbb N = \{1,2,3,\dots\}$. Define the forecast error random variable \begin{align*} E_{t,h} : \Omega &\to \mathbb R \\ \omega &\mapsto Y_{t+h}(\omega) - \hat Y_t(h)(\omega). \end{align*} By the assumed ARMA forecast error representation, \begin{align*} E_{t,h} = \sum_{j=0}^{h-1} \psi_j \varepsilon_{t+h-j}. \end{align*} The indices $t+h-j$ appearing in this sum are $t+1,t+2,\dots,t+h$, so $E_{t,h}$ depends only on innovations strictly after time $t$. [/step] [step:Check independence from the forecasting information] For each $j \in \{0,\dots,h-1\}$, the innovation $\varepsilon_{t+h-j}$ is a future innovation relative to time $t$. Since the innovations $(\varepsilon_s)_{s \in \mathbb Z}$ are independent and the causal process satisfies $Y_s = \mu + \sum_{k=0}^{\infty}\psi_k \varepsilon_{s-k}$, each $Y_s$ with $s \leq t$ is measurable with respect to the sigma-algebra generated by innovations up to time $t$. Thus the random vector \begin{align*} (\varepsilon_{t+1},\dots,\varepsilon_{t+h}) \end{align*} is independent of $\mathcal F_t$. Since $E_{t,h}$ is a measurable function of this future innovation vector, $E_{t,h}$ is independent of $\mathcal F_t$. [/step] [step:Compute the conditional Gaussian law of the forecast error] Define the coefficient vector $a \in \mathbb R^h$ by \begin{align*} a := (\psi_{h-1},\psi_{h-2},\dots,\psi_0). \end{align*} Define the future innovation vector $Z_{t,h}: \Omega \to \mathbb R^h$ by \begin{align*} Z_{t,h}: \Omega &\to \mathbb R^h \\ \omega &\mapsto (\varepsilon_{t+1}(\omega),\varepsilon_{t+2}(\omega),\dots,\varepsilon_{t+h}(\omega)). \end{align*} Then \begin{align*} E_{t,h} = \sum_{j=0}^{h-1}\psi_j \varepsilon_{t+h-j} = a \cdot Z_{t,h}. \end{align*} Because the components of $Z_{t,h}$ are independent $\mathcal N(0,\sigma^2)$ random variables, the linear combination $a \cdot Z_{t,h}$ is Gaussian with mean \begin{align*} \mathbb E[E_{t,h}] = \sum_{j=0}^{h-1}\psi_j \mathbb E[\varepsilon_{t+h-j}] = 0 \end{align*} and variance \begin{align*} \operatorname{Var}(E_{t,h}) &= \operatorname{Var}\left(\sum_{j=0}^{h-1}\psi_j \varepsilon_{t+h-j}\right) \\ &= \sum_{j=0}^{h-1}\psi_j^2 \operatorname{Var}(\varepsilon_{t+h-j}) \\ &= \sigma^2 \sum_{j=0}^{h-1}\psi_j^2, \end{align*} where the cross-covariance terms vanish because distinct innovations are independent. Since $E_{t,h}$ is independent of $\mathcal F_t$, its conditional distribution given $\mathcal F_t$ equals its unconditional distribution. Therefore, conditional on $\mathcal F_t$, \begin{align*} Y_{t+h}-\hat Y_t(h) = E_{t,h} \sim \mathcal N\left(0,\sigma^2\sum_{j=0}^{h-1}\psi_j^2\right). \end{align*} [/step] [step:Standardise the error and read off the central Gaussian interval] Define the forecast error standard deviation \begin{align*} s_h := \sigma\left(\sum_{j=0}^{h-1}\psi_j^2\right)^{1/2}. \end{align*} Since $\psi_0 = 1$ and $\sigma > 0$, we have $s_h > 0$. The conditional Gaussian law proved above gives \begin{align*} \frac{Y_{t+h}-\hat Y_t(h)}{s_h} \;\middle|\; \mathcal F_t \sim \mathcal N(0,1). \end{align*} Let $z_{1-\alpha/2}$ be the $(1-\alpha/2)$-quantile of the standard normal distribution. Hence \begin{align*} \mathbb P\left( -z_{1-\alpha/2} \leq \frac{Y_{t+h}-\hat Y_t(h)}{s_h} \leq z_{1-\alpha/2} \;\middle|\; \mathcal F_t \right) = 1-\alpha. \end{align*} Multiplying the middle inequality by $s_h > 0$ and adding $\hat Y_t(h)$ gives \begin{align*} \mathbb P\left( \hat Y_t(h)-z_{1-\alpha/2}s_h \leq Y_{t+h} \leq \hat Y_t(h)+z_{1-\alpha/2}s_h \;\middle|\; \mathcal F_t \right) = 1-\alpha. \end{align*} Substituting the definition of $s_h$ yields exactly \begin{align*} \left[ \hat Y_t(h) - z_{1-\alpha/2}\sigma\left(\sum_{j=0}^{h-1}\psi_j^2\right)^{1/2}, \; \hat Y_t(h) + z_{1-\alpha/2}\sigma\left(\sum_{j=0}^{h-1}\psi_j^2\right)^{1/2} \right], \end{align*} so this interval has exact conditional coverage $1-\alpha$ under the model. [/step]

Explore Further

Tail Dependence Coefficients of the Bivariate Student $t$ Copula probability Gauss-Markov Efficiency of Generalized Least Squares in Seemingly Unrelated Regressions probability Population Canonical Correlations via Singular Values probability Dickey-Fuller Unit Root Limiting Distribution probability Strong Consistency of the Multivariate Normal Maximum Likelihood Estimators probability One-Way MANOVA Sum-of-Squares-and-Products Decomposition probability Orthogonality of Population Canonical Variates probability Covariance-Structure Distinction Between Factor Analysis and Principal Component Analysis probability

What brings you to Androma?

Start with a route through the knowledge graph.

Gaussian ARMA Prediction Interval Theorem (Theorem # 3667)

Discussion

Proof

Explore Further

Sign in to Androma

Check your inbox

One last step

Gaussian ARMA Prediction Interval Theorem (Theorem # 3667)

Discussion

Proof

Explore Further