Androma — The Home of Mathematics on the Internet


[guided]We need two facts before any computation, and both are exactly where the hypotheses "causal" and "invertible" are consumed.

**Why these two facts?** When we condition the ARMA recursion on $\mathcal{F}_t$, each term is one of $Y_{t+h-i}$ or $Z_{t+h-j}$. Property (M) lets us keep a term unchanged precisely when it is $\mathcal{F}_t$-measurable, and property (I) lets us delete a centred term precisely when it is independent of $\mathcal{F}_t$. So we must know, for an arbitrary index, whether the corresponding $Z$ is "known" (measurable) or "unseen" (independent). The dividing line is the time $t$.

**Past innovations are known (invertibility).** Intuitively, once we have observed $Y_s, Y_{s-1}, \dots$ we can reconstruct the shock $Z_s$ that drove the system at time $s \le t$. Formally, invertibility gives $Z_s = \sum_{j=0}^\infty \pi_j Y_{s-j}$ in $L^2$. Every index $s - j$ is $\le s \le t$, so each $Y_{s-j}$ lies in $\mathcal{F}_t$; the partial sums $S_N = \sum_{j=0}^N \pi_j Y_{s-j}$ are $\mathcal{F}_t$-measurable, and since $S_N \to Z_s$ in $L^2$ a subsequence converges a.s., so the limit $Z_s$ is $\mathcal{F}_t$-measurable. (We tacitly complete $\mathcal{F}_t$ with null sets; this changes no conditional expectation.) Without invertibility we could not assert that the *past* innovations appearing in the recursion are observable, and the clean formula would break.

**Future innovations are unseen (causality).** Intuitively a causal system depends only on past and present shocks, so observing $Y$ up to time $t$ tells us nothing about a shock $Z_{t+k}$ that has not yet occurred. Formally, set $\mathcal{G}_t = \sigma(Z_v : v \le t)$. Causality $Y_u = \sum_{j \ge 0}\psi_j Z_{u-j}$ expresses each $Y_u$ ($u \le t$) as an $L^2$ limit of $\mathcal{G}_t$-measurable variables, so, by the same subsequence argument, $Y_u$ is $\mathcal{G}_t$-measurable and therefore $\mathcal{F}_t = \sigma(Y_u : u \le t) \subseteq \mathcal{G}_t$. Now $Z_{t+k}$ with $k \ge 1$ is, by the independence of the white-noise family, independent of $\sigma(Z_v : v \le t) = \mathcal{G}_t$; independence from a $\sigma$-algebra passes to every sub-$\sigma$-algebra, so $Z_{t+k}$ is independent of $\mathcal{F}_t$. This is the only place the *independence* (not merely uncorrelatedness) of the noise is used; it is what makes $\mathbb{E}[Z_{t+k}\mid\mathcal{F}_t]=\mathbb{E}[Z_{t+k}]=0$ via property (I), so that the conditional expectation coincides with the linear forecast.[/guided]
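To make the invertibility fact concrete, here is a worked instance for the special case $p = q = 1$, assuming $|\theta_1| < 1$ (this case and the closed form of the weights are an illustration added here, not part of the proof). Solving the ARMA identity for the innovation and iterating, which amounts to expanding $(1 + \theta_1 B)^{-1}$ as a geometric series in the backshift operator $B$, gives the weights $\pi_j$ explicitly:

```latex
\begin{align*}
Z_s &= Y_s - \phi_1 Y_{s-1} - \theta_1 Z_{s-1}, \\
\pi_0 &= 1, \qquad \pi_j = -(\phi_1 + \theta_1)\,(-\theta_1)^{j-1} \quad (j \ge 1), \\
Z_s &= Y_s - (\phi_1 + \theta_1) \sum_{j=1}^{\infty} (-\theta_1)^{j-1}\, Y_{s-j}.
\end{align*}
```

Only $Y_s, Y_{s-1}, \dots$ appear on the right-hand side, and the weights decay geometrically, which is what one would expect of an $\mathcal{F}_t$-measurable $L^2$ limit when $s \le t$.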


[step:Condition the ARMA identity at index $t+h$ and collapse each term]Fix $h \ge 1$. The defining ARMA identity, written at time index $t + h$, reads
\begin{align*}
Y_{t+h} = \sum_{i=1}^p \phi_i\, Y_{t+h-i} + Z_{t+h} + \sum_{j=1}^q \theta_j\, Z_{t+h-j}.
\end{align*}
Every random variable here lies in $L^2 \subseteq L^1$, so we may apply $P_t$ and use linearity **(L)**:
\begin{align*}
\hat{Y}_t(h) = P_t Y_{t+h} = \sum_{i=1}^p \phi_i\, P_t Y_{t+h-i} + P_t Z_{t+h} + \sum_{j=1}^q \theta_j\, P_t Z_{t+h-j}.
\end{align*}
We evaluate each term using the facts from the previous step.

*Autoregressive terms $P_t Y_{t+h-i}$ for $1 \le i \le p$.* If $h - i \ge 1$, then $Y_{t+(h-i)}$ is a future value and $P_t Y_{t+(h-i)} = \hat{Y}_t(h-i)$ by definition of the forecast. If $h - i \le 0$, then $t + (h-i) \le t$, so $Y_{t+h-i}$ is $\mathcal{F}_t$-measurable and $P_t Y_{t+h-i} = Y_{t+h-i}$ by **(M)**; under the convention $\hat{Y}_t(r) = Y_{t+r}$ for $r \le 0$ this again equals $\hat{Y}_t(h-i)$. In both cases
\begin{align*}
P_t Y_{t+h-i} = \hat{Y}_t(h - i).
\end{align*}

*Leading innovation $P_t Z_{t+h}$.* Since $h \ge 1$, $Z_{t+h}$ is independent of $\mathcal{F}_t$ with $\mathbb{E}[Z_{t+h}] = 0$, so $P_t Z_{t+h} = 0$ by **(I)**. With the convention $\hat{Z}_t(h) = 0$ for $h \ge 1$ this is $\hat{Z}_t(h)$.

*Moving-average terms $P_t Z_{t+h-j}$ for $1 \le j \le q$.* If $h - j \ge 1$, then $Z_{t+(h-j)}$ is a future innovation, independent of $\mathcal{F}_t$ with mean $0$, so $P_t Z_{t+h-j} = 0$ by **(I)**, which equals $\hat{Z}_t(h-j)$. If $h - j \le 0$, then $t + (h-j) \le t$, so $Z_{t+h-j}$ is $\mathcal{F}_t$-measurable and $P_t Z_{t+h-j} = Z_{t+h-j} = \hat{Z}_t(h-j)$ by **(M)**. In both cases
\begin{align*}
P_t Z_{t+h-j} = \hat{Z}_t(h - j).
\end{align*}

Substituting these three evaluations gives, for every $h \ge 1$,
\begin{align*}
\hat{Y}_t(h) = \sum_{i=1}^p \phi_i\, \hat{Y}_t(h - i) + \hat{Z}_t(h) + \sum_{j=1}^q \theta_j\, \hat{Z}_t(h - j),
\end{align*}
which is the asserted recursion.[/step]
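The recursion just derived can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `arma_forecasts` and its parameters are hypothetical, and it assumes the past innovations $Z_t, Z_{t-1}, \dots$ are supplied directly (the proof shows they are $\mathcal{F}_t$-measurable, but the sketch does not reconstruct them from the observations).

```python
def arma_forecasts(phi, theta, y_past, z_past, horizon):
    """Return [Yhat_t(1), ..., Yhat_t(horizon)] from the forecast recursion.

    phi, theta : AR coefficients (phi_1..phi_p) and MA coefficients (theta_1..theta_q).
    y_past     : observed values ending at time t, so y_past[-1] is Y_t.
    z_past     : past innovations ending at time t, so z_past[-1] is Z_t
                 (assumed known here; hypothetical input for illustration).
    """
    p, q = len(phi), len(theta)
    if len(y_past) < p or len(z_past) < q:
        raise ValueError("need at least p past Y values and q past Z values")

    forecasts = []  # forecasts[h - 1] holds Yhat_t(h)

    def y_hat(r):
        # Convention: Yhat_t(r) = Y_{t+r} for r <= 0 (property (M));
        # for r >= 1 it is a forecast computed earlier in the loop.
        return forecasts[r - 1] if r >= 1 else y_past[r - 1]

    def z_hat(r):
        # Convention: Zhat_t(r) = 0 for r >= 1 (property (I)),
        # and Zhat_t(r) = Z_{t+r} for r <= 0 (property (M)).
        return 0.0 if r >= 1 else z_past[r - 1]

    for h in range(1, horizon + 1):
        ar_part = sum(phi[i - 1] * y_hat(h - i) for i in range(1, p + 1))
        ma_part = sum(theta[j - 1] * z_hat(h - j) for j in range(1, q + 1))
        forecasts.append(ar_part + z_hat(h) + ma_part)
    return forecasts


# Usage: ARMA(1,1) with phi_1 = 0.5, theta_1 = 0.4, Y_t = 2.0, Z_t = 1.0.
example = arma_forecasts([0.5], [0.4], [2.0], [1.0], horizon=3)
```

The two helper functions mirror the conventions for $\hat{Y}_t(r)$ and $\hat{Z}_t(r)$, so the loop body is a direct transcription of the final displayed identity.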


[guided]The strategy is to write the ARMA law $h$ steps into the future, at index $t+h$, and then take the conditional expectation $P_t = \mathbb{E}[\cdot \mid \mathcal{F}_t]$ of both sides. Because $P_t$ is linear **(L)**, the conditional expectation of the sum is the sum of the conditional expectations, and the only work left is to classify each individual term. The classification is governed entirely by the sign of the time offset of each index relative to the present $t$. An index $> t$ is in the future; an index $\le t$ is observed.

- **Autoregressive part.** The term $Y_{t+h-i}$ has offset $h - i$. If $h - i \ge 1$ the value is still in the future and its conditional expectation is, by definition, the forecast $\hat{Y}_t(h-i)$; this is the recursive coupling that makes long-horizon forecasts depend on shorter-horizon ones. If $h - i \le 0$ the value has already been observed, so it is $\mathcal{F}_t$-measurable and property (M) returns it unchanged. The convention $\hat{Y}_t(r) := Y_{t+r}$ for $r \le 0$ is chosen precisely so these two cases read identically as $\hat{Y}_t(h-i)$; it is not an extra assumption but a bookkeeping device consistent with (M), since $\mathbb{E}[Y_{t+r}\mid\mathcal{F}_t] = Y_{t+r}$ when $r \le 0$.
- **Leading innovation.** The shock $Z_{t+h}$ driving $Y_{t+h}$ always has offset $h \ge 1$, so it is a genuine future innovation: independent of $\mathcal{F}_t$ (Claim 2) and centred, hence killed by property (I). This is the analytic content of "replace future innovations by $0$."
- **Moving-average part.** The term $Z_{t+h-j}$ has offset $h - j$. If $h - j \ge 1$ it is a future shock and is annihilated by (I), matching $\hat{Z}_t(h-j) = 0$. If $h - j \le 0$ it is a past shock, $\mathcal{F}_t$-measurable by Claim 1, and (M) returns it unchanged, matching $\hat{Z}_t(h-j) = Z_{t+h-j}$. Again the convention for $\hat{Z}_t$ merges the two cases.

Assembling the three evaluated pieces and using linearity in reverse gives the single recursion valid for all $h \ge 1$. Notice how the two structural claims did all the heavy lifting: every term became either "kept" or "deleted" with no residual error, which is exactly why the practical forecasting rule is exact rather than approximate.[/guided]
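As a quick check of the term-by-term classification, consider the special case $p = q = 1$ (chosen here purely for illustration). At horizon $h = 1$ both $Y_t$ and $Z_t$ have offset $0$ and are kept; at every horizon $h \ge 2$ the moving-average term has offset $h - 1 \ge 1$ and is deleted, leaving a pure autoregressive update:

```latex
\begin{align*}
\hat{Y}_t(1) &= \phi_1\, \hat{Y}_t(0) + \hat{Z}_t(1) + \theta_1\, \hat{Z}_t(0)
              = \phi_1 Y_t + \theta_1 Z_t, \\
\hat{Y}_t(h) &= \phi_1\, \hat{Y}_t(h-1) + \hat{Z}_t(h) + \theta_1\, \hat{Z}_t(h-1)
              = \phi_1\, \hat{Y}_t(h-1) \qquad (h \ge 2), \\
\hat{Y}_t(h) &= \phi_1^{\,h-1} \bigl( \phi_1 Y_t + \theta_1 Z_t \bigr) \qquad (h \ge 1).
\end{align*}
```

The last line follows by iterating the second, and it shows the moving-average part influencing only the starting value of the forecast path in this case.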
