Wold Decomposition Theorem — Statement & Proof

Wold Decomposition Theorem (Theorem # 3641)

Theorem


Let $(\Omega,\mathcal F,\mathbb P)$ be a probability space, and let $(X_t)_{t\in\mathbb Z}$ be a real-valued stochastic process such that each $X_t:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ belongs to $L^2(\Omega,\mathcal F,\mathbb P)$, satisfies $\mathbb E[X_t]=0$, and is second-order stationary: \begin{align*} \mathbb E[X_{t+h}X_{s+h}] = \mathbb E[X_tX_s] \end{align*} for all $s,t,h\in\mathbb Z$. For each $t\in\mathbb Z$, define the closed past linear span \begin{align*} \mathcal H_t^X := \overline{\operatorname{span}}\{X_s:s\leq t\}\subset L^2(\Omega,\mathcal F,\mathbb P), \end{align*} where the closure is taken in the $L^2$ norm, and define the remote past \begin{align*} \mathcal H_{-\infty}^X := \bigcap_{t\in\mathbb Z}\mathcal H_t^X. \end{align*} Then there exist a mean-zero second-order stationary deterministic process $(D_t)_{t\in\mathbb Z}$, a mean-zero second-order stationary purely nondeterministic process $(Y_t)_{t\in\mathbb Z}$, an orthogonal innovation process $(\varepsilon_t)_{t\in\mathbb Z}$, a number $\sigma^2\geq 0$, and coefficients $(\psi_j)_{j\geq 0}\subset\mathbb R$ such that, for every $t\in\mathbb Z$, \begin{align*} X_t = D_t + Y_t \end{align*} in $L^2(\Omega,\mathcal F,\mathbb P)$, the two components are orthogonal in the sense that \begin{align*} \mathbb E[D_tY_s]=0 \end{align*} for all $s,t\in\mathbb Z$, and \begin{align*} Y_t = \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j} \end{align*} with convergence in $L^2(\Omega,\mathcal F,\mathbb P)$. The innovation process satisfies \begin{align*} \mathbb E[\varepsilon_t]=0,\qquad \mathbb E[\varepsilon_t^2]=\sigma^2,\qquad \mathbb E[\varepsilon_t\varepsilon_s]=0\quad\text{for }s\neq t, \end{align*} and each deterministic variable $D_t$ is orthogonal to every innovation: \begin{align*} \mathbb E[D_t\varepsilon_s]=0 \end{align*} for all $s,t\in\mathbb Z$. If $\sigma^2>0$, then $\psi_0=1$ and \begin{align*} \sum_{j=0}^{\infty}|\psi_j|^2<\infty. 
\end{align*} If $\sigma^2=0$, then $Y_t=0$ in $L^2(\Omega,\mathcal F,\mathbb P)$ for every $t\in\mathbb Z$, the moving-average term is identically zero, and the normalization $\psi_0=1$ is only a convention. Here deterministic means \begin{align*} \bigcap_{t\in\mathbb Z}\overline{\operatorname{span}}\{D_s:s\leq t\} = \overline{\operatorname{span}}\{D_s:s\in\mathbb Z\}, \end{align*} and purely nondeterministic means \begin{align*} \bigcap_{t\in\mathbb Z}\overline{\operatorname{span}}\{Y_s:s\leq t\}=\{0\}. \end{align*}
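As an illustration that is not part of the original statement, consider a hypothetical process built from a mean-zero random variable $Z\in L^2$ with $\mathbb E[Z^2]>0$ and a white-noise sequence $(e_t)_{t\in\mathbb Z}$ with variance $\mathbb E[e_t^2]>0$ that is uncorrelated with $Z$, under the assumption $|\theta|<1$. The averages $n^{-1}\sum_{k=0}^{n-1}X_{t-k}$ converge to $Z$ in $L^2$, so $Z$ lies in the remote past, while $e_t=\sum_{k\geq 0}(-\theta)^kY_{t-k}$ shows that $e_t$ is the one-step prediction error of $Y_t$. Under these assumptions the objects in the theorem can be written down explicitly:

```latex
\begin{align*}
X_t &= Z + e_t + \theta e_{t-1}, \qquad |\theta|<1,\\
D_t &= Z, \qquad Y_t = e_t + \theta e_{t-1},\\
\varepsilon_t &= e_t, \qquad \sigma^2 = \mathbb E[e_t^2],\\
\psi_0 &= 1, \qquad \psi_1 = \theta, \qquad \psi_j = 0 \quad\text{for } j\geq 2.
\end{align*}
```

Here $\overline{\operatorname{span}}\{D_s:s\leq t\}=\operatorname{span}\{Z\}$ for every $t$, so $(D_t)$ is deterministic in the sense defined above, and $\sum_{j\geq 0}|\psi_j|^2=1+\theta^2<\infty$.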


Proof

[proofplan] We work inside the Hilbert space generated by the process and use the unitary shift induced by stationarity. The deterministic component is the orthogonal projection of $X_t$ onto the remote past $\mathcal H_{-\infty}^X$, and the remaining component $Y_t$ has zero remote past. For the purely nondeterministic component, each one-step increment $\mathcal H_t^Y\ominus\mathcal H_{t-1}^Y$ is spanned by the innovation $\varepsilon_t$, and iterating these orthogonal decompositions gives finite moving-average approximations. The projections of $Y_t$ onto the decreasing past spaces $\mathcal H_{t-m-1}^Y$ converge to zero as $m\to\infty$, so the finite expansions converge in $L^2$ to the desired Wold expansion. [/proofplan] [step:Construct the stationary shift on the Hilbert space generated by the process] Let \begin{align*} \mathcal H^X:=\overline{\operatorname{span}}\{X_t:t\in\mathbb Z\}\subset L^2(\Omega,\mathcal F,\mathbb P), \end{align*} with inner product $(U,V)_{L^2}:=\mathbb E[UV]$. Define first on finite linear combinations the map \begin{align*} S_0:\operatorname{span}\{X_t:t\in\mathbb Z\}&\to \operatorname{span}\{X_t:t\in\mathbb Z\}\\ \sum_{k=1}^n a_kX_{t_k}&\mapsto \sum_{k=1}^n a_kX_{t_k+1}. \end{align*} Second-order stationarity gives, for every finite family $(a_k)_{k=1}^n\subset\mathbb R$ and $(t_k)_{k=1}^n\subset\mathbb Z$, \begin{align*} \left\|\sum_{k=1}^n a_kX_{t_k+1}\right\|_{L^2}^2 &= \sum_{k=1}^n\sum_{\ell=1}^n a_ka_\ell\,\mathbb E[X_{t_k+1}X_{t_\ell+1}]\\ &= \sum_{k=1}^n\sum_{\ell=1}^n a_ka_\ell\,\mathbb E[X_{t_k}X_{t_\ell}]\\ &= \left\|\sum_{k=1}^n a_kX_{t_k}\right\|_{L^2}^2. \end{align*} Thus $S_0$ is well defined (a finite combination that vanishes in $L^2$ has a vanishing shift) and is a linear isometry on a dense subspace of $\mathcal H^X$. Its range is the same algebraic span, because the inverse shift on finite linear combinations sends $X_t$ to $X_{t-1}$. Hence $S_0$ extends uniquely by continuity to an isometry of $\mathcal H^X$ whose range is both dense and closed, that is, to a unitary operator \begin{align*} S:\mathcal H^X\to\mathcal H^X. 
\end{align*} For every $t\in\mathbb Z$, this operator satisfies \begin{align*} S\mathcal H_t^X=\mathcal H_{t+1}^X. \end{align*} Consequently $S\mathcal H_{-\infty}^X=\mathcal H_{-\infty}^X$. [guided] The point of stationarity is that it lets us represent time translation as a unitary operator. We define \begin{align*} \mathcal H^X:=\overline{\operatorname{span}}\{X_t:t\in\mathbb Z\}\subset L^2(\Omega,\mathcal F,\mathbb P), \end{align*} and use the Hilbert-space inner product $(U,V)_{L^2}:=\mathbb E[UV]$. On a finite linear combination of process variables, define the one-step shift by \begin{align*} S_0:\operatorname{span}\{X_t:t\in\mathbb Z\}&\to \operatorname{span}\{X_t:t\in\mathbb Z\}\\ \sum_{k=1}^n a_kX_{t_k}&\mapsto \sum_{k=1}^n a_kX_{t_k+1}. \end{align*} This is well-defined because it preserves the $L^2$ norm. Indeed, second-order stationarity gives \begin{align*} \left\|\sum_{k=1}^n a_kX_{t_k+1}\right\|_{L^2}^2 &= \sum_{k=1}^n\sum_{\ell=1}^n a_ka_\ell\,\mathbb E[X_{t_k+1}X_{t_\ell+1}]\\ &= \sum_{k=1}^n\sum_{\ell=1}^n a_ka_\ell\,\mathbb E[X_{t_k}X_{t_\ell}]\\ &= \left\|\sum_{k=1}^n a_kX_{t_k}\right\|_{L^2}^2. \end{align*} Hence if a finite linear combination represents the zero element of $L^2$, its shifted combination also represents the zero element. The map $S_0$ is not merely an isometry into $\mathcal H^X$; it maps the algebraic span onto itself, since the backward shift sends each generator $X_t$ to $X_{t-1}$. Because the algebraic span is dense in $\mathcal H^X$, the map $S_0$ extends uniquely by continuity to an isometry of $\mathcal H^X$. The range of this extension is dense, and it is closed because the range of an isometry on a complete space is closed; hence the range is all of $\mathcal H^X$. Thus the extension is a unitary map \begin{align*} S:\mathcal H^X\to\mathcal H^X. \end{align*} The identity $SX_t=X_{t+1}$, together with the continuity of $S$ and $S^{-1}$, implies \begin{align*} S\mathcal H_t^X=\mathcal H_{t+1}^X \end{align*} for every $t\in\mathbb Z$. 
Because $S$ is a bijection, it commutes with intersections, so applying this identity for every $t$ gives \begin{align*} S\mathcal H_{-\infty}^X = S\left(\bigcap_{t\in\mathbb Z}\mathcal H_t^X\right) = \bigcap_{t\in\mathbb Z}S\mathcal H_t^X = \bigcap_{t\in\mathbb Z}\mathcal H_{t+1}^X = \mathcal H_{-\infty}^X. \end{align*} Thus the remote past is invariant under the stationary time shift. [/guided] [/step] [step:Project onto the remote past to obtain the deterministic component] Let \begin{align*} P_{-\infty}:\mathcal H^X\to\mathcal H_{-\infty}^X \end{align*} be the orthogonal projection onto the closed subspace $\mathcal H_{-\infty}^X$. Define \begin{align*} D:\mathbb Z&\to L^2(\Omega,\mathcal F,\mathbb P)\\ t&\mapsto D_t:=P_{-\infty}X_t \end{align*} and \begin{align*} Y:\mathbb Z&\to L^2(\Omega,\mathcal F,\mathbb P)\\ t&\mapsto Y_t:=X_t-D_t. \end{align*} Let \begin{align*} L^2_0(\Omega,\mathcal F,\mathbb P):=\{Z\in L^2(\Omega,\mathcal F,\mathbb P):\mathbb E[Z]=0\} \end{align*} denote the closed mean-zero subspace. The expectation functional $Z\mapsto \mathbb E[Z]$ is continuous on $L^2(\Omega,\mathcal F,\mathbb P)$ by Cauchy-Schwarz, and each $X_t$ belongs to $L^2_0(\Omega,\mathcal F,\mathbb P)$; hence $\mathcal H^X\subset L^2_0(\Omega,\mathcal F,\mathbb P)$. Therefore $D_t,Y_t\in L^2_0(\Omega,\mathcal F,\mathbb P)$ for every $t\in\mathbb Z$. Since $S$ is unitary and $S\mathcal H_{-\infty}^X=\mathcal H_{-\infty}^X$, it also maps the orthogonal complement of $\mathcal H_{-\infty}^X$ in $\mathcal H^X$ onto itself, so the projection $P_{-\infty}$ commutes with $S$. Since $X_t=S^tX_0$, it follows that \begin{align*} D_t=S^tD_0,\qquad Y_t=S^tY_0. \end{align*} The unitarity of $S$ gives second-order stationarity of both processes, and the inclusion in $L^2_0(\Omega,\mathcal F,\mathbb P)$ shown above gives their mean-zero property. For all $s,t\in\mathbb Z$, $D_s\in\mathcal H_{-\infty}^X$ and $Y_t=X_t-P_{-\infty}X_t$ is orthogonal to $\mathcal H_{-\infty}^X$, so \begin{align*} \mathbb E[D_sY_t]=0. \end{align*} It remains to verify determinism of $(D_t)$. For each $t\in\mathbb Z$, define \begin{align*} \mathcal H_t^D:=\overline{\operatorname{span}}\{D_s:s\leq t\}. 
\end{align*} Since every $D_s$ belongs to $\mathcal H_{-\infty}^X$, we have $\mathcal H_t^D\subset\mathcal H_{-\infty}^X$. Conversely, if $Z\in\mathcal H_{-\infty}^X$, then $Z\in\mathcal H_t^X$, so there are finite linear combinations \begin{align*} Z_n=\sum_{k=1}^{N_n}a_{n,k}X_{s_{n,k}},\qquad s_{n,k}\leq t, \end{align*} such that $Z_n\to Z$ in $L^2$. Applying the continuous projection $P_{-\infty}$ gives \begin{align*} P_{-\infty}Z_n=\sum_{k=1}^{N_n}a_{n,k}D_{s_{n,k}}\to P_{-\infty}Z=Z \end{align*} in $L^2$. Hence $Z\in\mathcal H_t^D$, and therefore \begin{align*} \mathcal H_t^D=\mathcal H_{-\infty}^X \end{align*} for every $t\in\mathbb Z$. Thus $(D_t)$ is deterministic. [/step] [step:Show that the residual process has zero remote past] For every $t\in\mathbb Z$, define \begin{align*} \mathcal H_t^Y:=\overline{\operatorname{span}}\{Y_s:s\leq t\}. \end{align*} Because $Y_s=X_s-P_{-\infty}X_s$, each $Y_s$ is orthogonal to $\mathcal H_{-\infty}^X$. Also $X_s=D_s+Y_s$, with $D_s\in\mathcal H_{-\infty}^X$. Hence \begin{align*} \mathcal H_t^X=\mathcal H_{-\infty}^X\oplus \mathcal H_t^Y \end{align*} as an orthogonal direct sum. If \begin{align*} Z\in\bigcap_{t\in\mathbb Z}\mathcal H_t^Y, \end{align*} then $Z\in\bigcap_{t\in\mathbb Z}\mathcal H_t^X=\mathcal H_{-\infty}^X$, while also $Z\perp\mathcal H_{-\infty}^X$. Therefore $\|Z\|_{L^2}^2=(Z,Z)_{L^2}=0$, so $Z=0$ in $L^2$. Thus \begin{align*} \bigcap_{t\in\mathbb Z}\mathcal H_t^Y=\{0\}, \end{align*} and $(Y_t)$ is purely nondeterministic. [/step] [step:Define the innovations and identify each one-step increment] For each $t\in\mathbb Z$, let \begin{align*} P_{t-1}^Y:\mathcal H^Y\to\mathcal H_{t-1}^Y \end{align*} denote the orthogonal projection, where \begin{align*} \mathcal H^Y:=\overline{\operatorname{span}}\{Y_t:t\in\mathbb Z\}. \end{align*} Define the innovation process \begin{align*} \varepsilon:\mathbb Z&\to L^2(\Omega,\mathcal F,\mathbb P)\\ t&\mapsto \varepsilon_t:=Y_t-P_{t-1}^YY_t. 
\end{align*} Because $\mathcal H^Y\subset L^2_0(\Omega,\mathcal F,\mathbb P)$ and $\varepsilon_t\in\mathcal H^Y$, we have \begin{align*} \mathbb E[\varepsilon_t]=0 \end{align*} for every $t\in\mathbb Z$. Also $\varepsilon_t\perp\mathcal H_{t-1}^Y$ and \begin{align*} \mathcal H_t^Y=\mathcal H_{t-1}^Y\oplus \operatorname{span}\{\varepsilon_t\}. \end{align*} Since $Y_t=S^tY_0$, the unitary $S$ maps $\mathcal H_r^Y$ onto $\mathcal H_{r+1}^Y$ for every $r\in\mathbb Z$. Thus $S^tP_{-1}^YS^{-t}$ is the orthogonal projection from $\mathcal H^Y$ onto $\mathcal H_{t-1}^Y$. By uniqueness of orthogonal projection, \begin{align*} P_{t-1}^Y=S^tP_{-1}^YS^{-t}. \end{align*} Therefore \begin{align*} \varepsilon_t =Y_t-P_{t-1}^YY_t =S^tY_0-S^tP_{-1}^YY_0 =S^t\varepsilon_0. \end{align*} Define \begin{align*} \sigma^2:=\mathbb E[\varepsilon_0^2]\geq 0. \end{align*} Since $S$ is unitary and $\varepsilon_t=S^t\varepsilon_0$, we have $\mathbb E[\varepsilon_t^2]=\sigma^2$ for all $t\in\mathbb Z$. If $s<t$, then $\varepsilon_s\in\mathcal H_s^Y\subset\mathcal H_{t-1}^Y$, so $\varepsilon_t\perp\varepsilon_s$. Thus \begin{align*} \mathbb E[\varepsilon_t\varepsilon_s]=0 \end{align*} whenever $s\neq t$. Since $\mathcal H_{-\infty}^X\perp\mathcal H^Y$, every $D_s$ is orthogonal to every $\varepsilon_t$. If $\sigma^2=0$, then $\varepsilon_t=0$ in $L^2$ for every $t$, and therefore $\mathcal H_t^Y=\mathcal H_{t-1}^Y$ for every $t$. Hence all spaces $\mathcal H_t^Y$ are equal, and their intersection equals each one of them. Since $(Y_t)$ is purely nondeterministic, this common space is $\{0\}$, so $Y_t=0$ for all $t$. This is precisely the zero moving-average case. [guided] The innovation at time $t$ is the part of $Y_t$ that cannot be predicted from its closed linear past. We define \begin{align*} \mathcal H^Y:=\overline{\operatorname{span}}\{Y_t:t\in\mathbb Z\} \end{align*} and let \begin{align*} P_{t-1}^Y:\mathcal H^Y\to\mathcal H_{t-1}^Y \end{align*} be the orthogonal projection. 
Then set \begin{align*} \varepsilon:\mathbb Z&\to L^2(\Omega,\mathcal F,\mathbb P)\\ t&\mapsto \varepsilon_t:=Y_t-P_{t-1}^YY_t. \end{align*} By the defining property of orthogonal projection, $\varepsilon_t$ is orthogonal to $\mathcal H_{t-1}^Y$. Since $\mathcal H^Y$ is contained in the closed mean-zero subspace $L^2_0(\Omega,\mathcal F,\mathbb P)$, and since both $Y_t$ and $P_{t-1}^YY_t$ belong to $\mathcal H^Y$, the innovation also belongs to $L^2_0(\Omega,\mathcal F,\mathbb P)$. Hence \begin{align*} \mathbb E[\varepsilon_t]=0 \end{align*} for every $t\in\mathbb Z$. Why does this single vector span the whole new information at time $t$? Since \begin{align*} \mathcal H_t^Y=\overline{\operatorname{span}}\bigl(\mathcal H_{t-1}^Y\cup\{Y_t\}\bigr) \end{align*} and $Y_t=P_{t-1}^YY_t+\varepsilon_t$, every element added when passing from $\mathcal H_{t-1}^Y$ to $\mathcal H_t^Y$ lies in the direction of $\varepsilon_t$. Hence \begin{align*} \mathcal H_t^Y=\mathcal H_{t-1}^Y\oplus \operatorname{span}\{\varepsilon_t\}. \end{align*} The sum is orthogonal because $\varepsilon_t\perp\mathcal H_{t-1}^Y$. We also need the innovations to move correctly under the time shift. From $Y_t=S^tY_0$ it follows that $S\mathcal H_r^Y=\mathcal H_{r+1}^Y$ for every $r\in\mathbb Z$. Therefore the operator $S^tP_{-1}^YS^{-t}$ is an orthogonal projection onto $\mathcal H_{t-1}^Y$: the conjugation by the unitary $S^t$ transports the target space $\mathcal H_{-1}^Y$ to $\mathcal H_{t-1}^Y$ and preserves orthogonality. Orthogonal projections onto a fixed closed subspace are unique, so \begin{align*} P_{t-1}^Y=S^tP_{-1}^YS^{-t}. \end{align*} Consequently \begin{align*} \varepsilon_t &=Y_t-P_{t-1}^YY_t\\ &=S^tY_0-S^tP_{-1}^YS^{-t}S^tY_0\\ &=S^t(Y_0-P_{-1}^YY_0)\\ &=S^t\varepsilon_0. \end{align*} Define \begin{align*} \sigma^2:=\mathbb E[\varepsilon_0^2]\geq 0. 
\end{align*} Because $S$ is unitary and $\varepsilon_t=S^t\varepsilon_0$, we obtain \begin{align*} \mathbb E[\varepsilon_t^2]=\|\varepsilon_t\|_{L^2}^2=\|S^t\varepsilon_0\|_{L^2}^2=\|\varepsilon_0\|_{L^2}^2=\sigma^2 \end{align*} for every $t\in\mathbb Z$. The innovations are orthogonal at distinct times. If $s<t$, then \begin{align*} \varepsilon_s\in\mathcal H_s^Y\subset\mathcal H_{t-1}^Y. \end{align*} Since $\varepsilon_t\perp\mathcal H_{t-1}^Y$, we get \begin{align*} \mathbb E[\varepsilon_t\varepsilon_s]=0. \end{align*} Symmetry of the inner product gives the same conclusion for $t<s$. Also, every $D_s$ belongs to $\mathcal H_{-\infty}^X$, while every $\varepsilon_t$ belongs to $\mathcal H^Y$, and we already proved $\mathcal H_{-\infty}^X\perp\mathcal H^Y$. Thus \begin{align*} \mathbb E[D_s\varepsilon_t]=0 \end{align*} for all $s,t\in\mathbb Z$. Finally suppose $\sigma^2=0$. Then each $\varepsilon_t$ has zero $L^2$ norm, so $\varepsilon_t=0$ in $L^2$. The decomposition \begin{align*} \mathcal H_t^Y=\mathcal H_{t-1}^Y\oplus \operatorname{span}\{\varepsilon_t\} \end{align*} therefore reduces to $\mathcal H_t^Y=\mathcal H_{t-1}^Y$ for all $t$. Thus all the past spaces $\mathcal H_t^Y$ are the same space. Their intersection is that common space, but pure nondeterminism says the intersection is $\{0\}$. Hence $\mathcal H_t^Y=\{0\}$ for every $t$, and in particular $Y_t=0$ in $L^2$ for every $t$. This is the exceptional zero-variance case. [/guided] [/step] [step:Expand the purely nondeterministic component into orthogonal innovations] Assume now that $\sigma^2>0$. For each $j\geq 0$, define \begin{align*} \psi_j:=\frac{\mathbb E[Y_j\varepsilon_0]}{\sigma^2}. 
\end{align*} Using $Y_t=S^{t-j}Y_j$, $\varepsilon_{t-j}=S^{t-j}\varepsilon_0$, and unitarity of $S$, we obtain \begin{align*} \frac{\mathbb E[Y_t\varepsilon_{t-j}]}{\sigma^2} = \frac{(S^{t-j}Y_j,S^{t-j}\varepsilon_0)_{L^2}}{\sigma^2} = \frac{(Y_j,\varepsilon_0)_{L^2}}{\sigma^2} =\psi_j \end{align*} for all $t\in\mathbb Z$ and $j\geq 0$. Since \begin{align*} Y_t=P_{t-1}^YY_t+\varepsilon_t \end{align*} and $\varepsilon_t\perp\mathcal H_{t-1}^Y$, we have \begin{align*} \psi_0=\frac{\mathbb E[Y_t\varepsilon_t]}{\sigma^2} = \frac{\mathbb E[\varepsilon_t^2]}{\sigma^2} =1. \end{align*} For $m\geq 0$, iterating \begin{align*} \mathcal H_r^Y=\mathcal H_{r-1}^Y\oplus\operatorname{span}\{\varepsilon_r\} \end{align*} from $r=t$ down to $r=t-m$ gives \begin{align*} \mathcal H_t^Y = \mathcal H_{t-m-1}^Y \oplus \bigoplus_{j=0}^{m}\operatorname{span}\{\varepsilon_{t-j}\}. \end{align*} Projecting $Y_t\in\mathcal H_t^Y$ onto this orthogonal direct sum yields \begin{align*} Y_t = P_{t-m-1}^YY_t + \sum_{j=0}^{m}\psi_j\varepsilon_{t-j}. \end{align*} The finite sum is orthogonal, so Bessel's inequality gives \begin{align*} \sum_{j=0}^{m}\psi_j^2\sigma^2 = \left\|\sum_{j=0}^{m}\psi_j\varepsilon_{t-j}\right\|_{L^2}^2 \leq \|Y_t\|_{L^2}^2. \end{align*} Letting $m\to\infty$ gives \begin{align*} \sum_{j=0}^{\infty}|\psi_j|^2<\infty. \end{align*} [guided] Assume $\sigma^2>0$; the case $\sigma^2=0$ was already identified as the zero moving-average case. For each $j\geq 0$, define the coefficient \begin{align*} \psi_j:=\frac{\mathbb E[Y_j\varepsilon_0]}{\sigma^2}. \end{align*} This is the coefficient obtained by projecting $Y_j$ onto the one-dimensional innovation space $\operatorname{span}\{\varepsilon_0\}$. The denominator is non-zero by the present assumption. The same coefficient appears at every time because the innovations are transported by the unitary shift. Indeed, for every $t\in\mathbb Z$ and $j\geq 0$, we have $Y_t=S^{t-j}Y_j$ and $\varepsilon_{t-j}=S^{t-j}\varepsilon_0$. 
Since $S$ preserves the $L^2$ inner product, \begin{align*} \frac{\mathbb E[Y_t\varepsilon_{t-j}]}{\sigma^2} = \frac{(S^{t-j}Y_j,S^{t-j}\varepsilon_0)_{L^2}}{\sigma^2} = \frac{(Y_j,\varepsilon_0)_{L^2}}{\sigma^2} =\psi_j. \end{align*} For $j=0$, the decomposition $Y_t=P_{t-1}^YY_t+\varepsilon_t$ and the orthogonality $\varepsilon_t\perp\mathcal H_{t-1}^Y$ give \begin{align*} \psi_0 = \frac{\mathbb E[Y_t\varepsilon_t]}{\sigma^2} = \frac{\mathbb E[(P_{t-1}^YY_t+\varepsilon_t)\varepsilon_t]}{\sigma^2} = \frac{\mathbb E[\varepsilon_t^2]}{\sigma^2} =1. \end{align*} Now fix $m\geq 0$. Iterating the orthogonal decompositions \begin{align*} \mathcal H_r^Y=\mathcal H_{r-1}^Y\oplus\operatorname{span}\{\varepsilon_r\} \end{align*} for $r=t,t-1,\dots,t-m$ yields \begin{align*} \mathcal H_t^Y = \mathcal H_{t-m-1}^Y \oplus \bigoplus_{j=0}^{m}\operatorname{span}\{\varepsilon_{t-j}\}. \end{align*} Projecting $Y_t\in\mathcal H_t^Y$ onto this orthogonal direct sum gives \begin{align*} Y_t = P_{t-m-1}^YY_t + \sum_{j=0}^{m}\psi_j\varepsilon_{t-j}, \end{align*} because the coefficient of $\varepsilon_{t-j}$ in an orthogonal projection onto $\operatorname{span}\{\varepsilon_{t-j}\}$ is \begin{align*} \frac{\mathbb E[Y_t\varepsilon_{t-j}]}{\mathbb E[\varepsilon_{t-j}^2]}=\frac{\mathbb E[Y_t\varepsilon_{t-j}]}{\sigma^2}=\psi_j. \end{align*} The finite innovation sum is orthogonal, so the Pythagorean theorem gives \begin{align*} \left\|\sum_{j=0}^{m}\psi_j\varepsilon_{t-j}\right\|_{L^2}^2 = \sum_{j=0}^{m}\psi_j^2\|\varepsilon_{t-j}\|_{L^2}^2 = \sum_{j=0}^{m}\psi_j^2\sigma^2. \end{align*} Since this finite sum is the orthogonal projection of $Y_t$ onto a closed subspace of $\mathcal H_t^Y$, its norm is at most $\|Y_t\|_{L^2}$. Therefore \begin{align*} \sum_{j=0}^{m}\psi_j^2\sigma^2 \leq \|Y_t\|_{L^2}^2. \end{align*} Dividing by $\sigma^2>0$ and letting $m\to\infty$ gives \begin{align*} \sum_{j=0}^{\infty}|\psi_j|^2<\infty. 
\end{align*} [/guided] [/step] [step:Let the remote projections vanish to obtain the infinite moving average] We use the following Hilbert-space fact. If $(K_m)_{m\geq 0}$ is a decreasing sequence of closed subspaces of a Hilbert space $H$, if $P_m:H\to K_m$ is the orthogonal projection, and if \begin{align*} K_\infty:=\bigcap_{m=0}^{\infty}K_m, \end{align*} then $P_mx\to P_\infty x$ in $H$ for every $x\in H$, where $P_\infty:H\to K_\infty$ is the orthogonal projection. Indeed, for $n>m$, the inclusion $K_n\subset K_m$ implies $P_nx\in K_m$ and $P_mx-P_nx\perp K_n$, so the Pythagorean theorem applied to $P_mx=P_nx+(P_mx-P_nx)$ gives \begin{align*} \|P_mx-P_nx\|_H^2=\|P_mx\|_H^2-\|P_nx\|_H^2. \end{align*} Since $(\|P_mx\|_H^2)_{m\geq 0}$ is decreasing and bounded below by $0$, it converges; thus $(P_mx)_{m\geq 0}$ is Cauchy and converges to some $z\in H$. Since $P_mx\in K_r$ for every $m\geq r$ and $K_r$ is closed, $z\in K_r$ for every $r$, hence $z\in K_\infty$. For every $w\in K_\infty$, we have $w\in K_m$ for all $m$, so \begin{align*} (x-P_mx,w)_H=0. \end{align*} Passing to the limit gives $(x-z,w)_H=0$ for all $w\in K_\infty$, hence $z=P_\infty x$. Apply this fact with \begin{align*} H=\mathcal H^Y,\qquad K_m=\mathcal H_{t-m-1}^Y,\qquad x=Y_t. \end{align*} Then \begin{align*} \bigcap_{m=0}^{\infty}\mathcal H_{t-m-1}^Y = \bigcap_{r\in\mathbb Z}\mathcal H_r^Y = \{0\}, \end{align*} so \begin{align*} P_{t-m-1}^YY_t\to 0 \end{align*} in $L^2$. Taking the limit in \begin{align*} Y_t = P_{t-m-1}^YY_t + \sum_{j=0}^{m}\psi_j\varepsilon_{t-j} \end{align*} gives \begin{align*} Y_t=\sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j} \end{align*} in $L^2(\Omega,\mathcal F,\mathbb P)$. Combining this identity with $X_t=D_t+Y_t$ gives \begin{align*} X_t=D_t+\sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j} \end{align*} in $L^2(\Omega,\mathcal F,\mathbb P)$ for every $t\in\mathbb Z$. The deterministic component is orthogonal to all innovations, and the moving-average component is purely nondeterministic by the construction of $(Y_t)$. This completes the proof. 
[guided] It remains to justify that the finite expansions converge to the infinite moving average. We use the following Hilbert-space fact. Let $(K_m)_{m\geq 0}$ be a decreasing sequence of closed subspaces of a Hilbert space $H$, let $P_m:H\to K_m$ be the orthogonal projection, and define \begin{align*} K_\infty:=\bigcap_{m=0}^{\infty}K_m. \end{align*} If $P_\infty:H\to K_\infty$ is the orthogonal projection, then $P_mx\to P_\infty x$ in $H$ for every $x\in H$. We prove the fact because it is exactly the mechanism by which the remote-past term disappears. If $n>m$, then $K_n\subset K_m$, so $P_nx\in K_m$. The projection identity applied in $K_m$ gives the orthogonal decomposition \begin{align*} P_mx=P_nx+(P_mx-P_nx), \end{align*} where $P_mx-P_nx\perp K_n$ and $P_nx\in K_n$. Hence \begin{align*} \|P_mx-P_nx\|_H^2=\|P_mx\|_H^2-\|P_nx\|_H^2. \end{align*} The sequence $(\|P_mx\|_H^2)_{m\geq 0}$ is decreasing and bounded below by $0$, so it is Cauchy. The displayed identity then shows that $(P_mx)_{m\geq 0}$ is Cauchy in $H$. Let its limit be $z\in H$. For each fixed $r\geq 0$, all terms $P_mx$ with $m\geq r$ lie in $K_r$. Since $K_r$ is closed, the limit $z$ also lies in $K_r$. Therefore $z\in K_\infty$. If $w\in K_\infty$, then $w\in K_m$ for every $m$, and the defining property of the projection $P_m$ gives \begin{align*} (x-P_mx,w)_H=0. \end{align*} Passing to the limit in the inner product yields $(x-z,w)_H=0$ for every $w\in K_\infty$. Thus $z$ is the orthogonal projection of $x$ onto $K_\infty$, namely $z=P_\infty x$. Apply this fact with \begin{align*} H=\mathcal H^Y,\qquad K_m=\mathcal H_{t-m-1}^Y,\qquad x=Y_t. \end{align*} The subspaces are decreasing because earlier past spaces are contained in later past spaces. Their intersection is \begin{align*} \bigcap_{m=0}^{\infty}\mathcal H_{t-m-1}^Y = \bigcap_{r\in\mathbb Z}\mathcal H_r^Y = \{0\}, \end{align*} where the last equality is pure nondeterminism of $(Y_t)$. 
Therefore the corresponding projections satisfy \begin{align*} P_{t-m-1}^YY_t\to 0 \end{align*} in $L^2(\Omega,\mathcal F,\mathbb P)$. Taking the $L^2$ limit in the finite orthogonal expansion \begin{align*} Y_t = P_{t-m-1}^YY_t + \sum_{j=0}^{m}\psi_j\varepsilon_{t-j} \end{align*} gives \begin{align*} Y_t=\sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j} \end{align*} in $L^2(\Omega,\mathcal F,\mathbb P)$. Since $X_t=D_t+Y_t$, we conclude \begin{align*} X_t=D_t+ \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j} \end{align*} in $L^2(\Omega,\mathcal F,\mathbb P)$ for every $t\in\mathbb Z$. The orthogonality of $D_s$ to every innovation was proved when the innovations were constructed, and the moving-average component is purely nondeterministic because it is exactly the residual process $(Y_t)$. This proves the asserted Wold decomposition. [/guided] [/step]
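The innovations construction used in the proof has a finite-past analogue that can be checked numerically. The following is a minimal sketch, not part of the original proof: it assumes a hypothetical MA(1) process $Y_t=e_t+\theta e_{t-1}$ with illustrative values $\theta=0.5$ and $\mathbb E[e_t^2]=2$, and the function name `finite_past_innovations` is invented for this example. It runs Gram-Schmidt on $Y_0,\dots,Y_{n-1}$ (equivalently, an $LDL^\top$ factorization of their Toeplitz covariance matrix), which replaces the projection onto the infinite past $\mathcal H_{t-1}^Y$ by a projection onto finitely many past values.

```python
def finite_past_innovations(gamma, n):
    """LDL^T factorization of the n x n Toeplitz covariance built from gamma.

    gamma[h] is the autocovariance at lag h (lags beyond len(gamma) are 0).
    Row i of L expresses Y_i in the finite-past innovations e_0, ..., e_i,
    and v[i] is the variance of e_i, the one-step prediction error.
    """
    def g(h):
        return gamma[h] if h < len(gamma) else 0.0

    L = [[0.0] * n for _ in range(n)]
    v = [0.0] * n
    for i in range(n):
        for j in range(i):
            # Coefficient of Y_i on the j-th orthogonal innovation.
            s = g(i - j) - sum(L[i][k] * L[j][k] * v[k] for k in range(j))
            L[i][j] = s / v[j]
        L[i][i] = 1.0  # finite-past counterpart of psi_0 = 1
        v[i] = g(0) - sum(L[i][k] ** 2 * v[k] for k in range(i))
    return L, v


# Hypothetical MA(1) parameters, chosen only for illustration.
theta, sigma2 = 0.5, 2.0
gamma = [sigma2 * (1.0 + theta ** 2), sigma2 * theta]
L, v = finite_past_innovations(gamma, 50)
print("prediction-error variance:", v[-1])
print("lag-0 and lag-1 coefficients:", L[-1][-1], L[-1][-2])
```

For this invertible example the last prediction-error variance should approach $\sigma^2$ and the lag-1 coefficient should approach $\psi_1=\theta$ as the number of past values grows, mirroring the limit $P_{t-m-1}^YY_t\to 0$ in the final step of the proof.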
