Innovations Algorithm — Statement & Proof

Innovations Algorithm (Theorem # 3634)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

No discussion available for this theorem.

Proof

[proofplan] We regard the closed linear span of finitely many observations as a finite-dimensional Hilbert space inside $L^2(\Omega,\mathcal{F},\mathbb{P})$, with inner product given by covariance because the process has mean zero. The innovations are obtained from the observations by a triangular orthogonalization procedure, so $E_1,\dots,E_n$ form an orthogonal basis of the same subspace as $X_1,\dots,X_n$. Expanding the orthogonal projection of $X_{n+1}$ in this basis gives the coefficient formula. The covariance recursion follows by inserting the earlier predictor formula into $E_k=X_k-P_{k-1}X_k$, and the variance identity follows from the Pythagorean decomposition of $X_{n+1}$ into its projection and innovation. [/proofplan] [step:Place the process in the finite-dimensional Hilbert-space setting] Define the bilinear map \begin{align*} (\cdot,\cdot)_{L^2}: L^2(\Omega,\mathcal{F},\mathbb{P})\times L^2(\Omega,\mathcal{F},\mathbb{P}) &\to \mathbb{R}, \\ (Y,Z) &\mapsto \mathbb{E}[YZ]. \end{align*} Since each $X_t$ has mean zero, this inner product agrees with covariance on pairs of observations: \begin{align*} (X_r,X_s)_{L^2}=\operatorname{Cov}(X_r,X_s)=\kappa(r,s). \end{align*} Define $\mathcal{H}_0:=\{0\}$, and for each $n\geq 1$ define the finite-dimensional subspace \begin{align*} \mathcal{H}_n:=\operatorname{span}\{X_1,\dots,X_n\}\subset L^2(\Omega,\mathcal{F},\mathbb{P}). \end{align*} By the finite-dimensional [orthogonal projection theorem](/page/Orthogonal%20Projection), for every $n\geq 0$ the orthogonal projection \begin{align*} P_n:L^2(\Omega,\mathcal{F},\mathbb{P})&\to \mathcal{H}_n \end{align*} exists; in particular, $P_0$ is the zero projection onto $\mathcal{H}_0$. For each $n\geq 1$, define the covariance matrix $\Gamma_n\in\mathbb{R}^{n\times n}$ by \begin{align*} (\Gamma_n)_{rs}:=\kappa(r,s),\qquad 1\leq r,s\leq n. \end{align*} The nonsingularity of $\Gamma_n$ implies that $X_1,\dots,X_n$ are linearly independent in $L^2$. Indeed, if real numbers $a_1,\dots,a_n$ satisfy $\sum_{r=1}^n a_rX_r=0$ in $L^2$, then \begin{align*} 0 &= \mathbb{E}\left[\left(\sum_{r=1}^n a_rX_r\right)^2\right] \\ &= \sum_{r=1}^n\sum_{s=1}^n a_ra_s\kappa(r,s). \end{align*} Thus $a^\top\Gamma_na=0$, where $a=(a_1,\dots,a_n)^\top$. Since $\Gamma_n$ is a positive definite covariance matrix, $a=0$. Hence $\dim \mathcal{H}_n=n$. [/step] [step:Show that the innovations form an orthogonal basis of the past] Fix $n\geq 1$. For each $t\in\{1,\dots,n\}$, the innovation \begin{align*} E_t=X_t-P_{t-1}X_t \end{align*} belongs to $\mathcal{H}_t$, because $X_t\in\mathcal{H}_t$ and $P_{t-1}X_t\in\mathcal{H}_{t-1}\subset\mathcal{H}_t$. Also, by the defining property of orthogonal projection, \begin{align*} (E_t,Y)_{L^2}=0\qquad\text{for every }Y\in\mathcal{H}_{t-1}. \end{align*} In particular, if $1\leq s<t\leq n$, then $E_s\in\mathcal{H}_s\subset\mathcal{H}_{t-1}$, so \begin{align*} (E_t,E_s)_{L^2}=0. \end{align*} Thus $E_1,\dots,E_n$ are pairwise orthogonal. The relation $E_t=X_t-P_{t-1}X_t$ expresses $E_t$ as $X_t$ plus an element of $\mathcal{H}_{t-1}$. Thus, with respect to the ordered lists $(X_1,\dots,X_n)$ and $(E_1,\dots,E_n)$, the change-of-basis matrix is lower triangular with all diagonal entries equal to $1$. Its determinant is therefore $1$, so the matrix is invertible. Hence \begin{align*} \operatorname{span}\{E_1,\dots,E_n\} = \operatorname{span}\{X_1,\dots,X_n\} = \mathcal{H}_n. \end{align*} Since $X_1,\dots,X_n$ are linearly independent, no $E_t$ is zero in $L^2$. Consequently \begin{align*} v_t=(E_t,E_t)_{L^2}=\mathbb{E}[E_t^2]>0,\qquad 1\leq t\leq n. \end{align*} [guided] The innovations are the result of orthogonalizing the observations in chronological order. For each $t$, the variable $P_{t-1}X_t$ is the best linear prediction of $X_t$ from the earlier variables $X_1,\dots,X_{t-1}$, so the residual \begin{align*} E_t=X_t-P_{t-1}X_t \end{align*} is orthogonal to every element of the past space $\mathcal{H}_{t-1}$. Let us verify the two required facts carefully. First, $E_t\in\mathcal{H}_t$, because $X_t\in\mathcal{H}_t$ and $P_{t-1}X_t\in\mathcal{H}_{t-1}\subset\mathcal{H}_t$. Second, if $s<t$, then $E_s\in\mathcal{H}_s\subset\mathcal{H}_{t-1}$, while $E_t$ is orthogonal to all of $\mathcal{H}_{t-1}$. Hence \begin{align*} (E_t,E_s)_{L^2}=0. \end{align*} Thus the innovations are mutually orthogonal. It remains to check that they span the same space as the observations. The identity \begin{align*} E_t=X_t-P_{t-1}X_t \end{align*} writes $E_t$ as $X_t$ plus a linear combination of $X_1,\dots,X_{t-1}$. Therefore the transformation from $X_1,\dots,X_n$ to $E_1,\dots,E_n$ is triangular and has diagonal coefficient $1$ at every stage. Such a triangular transformation is invertible, so \begin{align*} \operatorname{span}\{E_1,\dots,E_n\} = \operatorname{span}\{X_1,\dots,X_n\}. \end{align*} Here $\Gamma_n\in\mathbb{R}^{n\times n}$ denotes the covariance matrix with entries $(\Gamma_n)_{rs}=\kappa(r,s)$ for $1\leq r,s\leq n$. Because $\Gamma_n$ is nonsingular, the variables $X_1,\dots,X_n$ are linearly independent in $L^2$, and hence the same is true of $E_1,\dots,E_n$. Since each $E_t$ is nonzero and orthogonal to the others, \begin{align*} v_t=\mathbb{E}[E_t^2]=(E_t,E_t)_{L^2}>0. \end{align*} [/guided] [/step] [step:Expand the one-step predictor in the orthogonal innovation basis] Fix $n\geq 1$. Since $E_1,\dots,E_n$ form an orthogonal basis of $\mathcal{H}_n$, there exist unique real numbers $c_{n,1},\dots,c_{n,n}$ such that \begin{align*} P_nX_{n+1}=\sum_{k=1}^{n}c_{n,k}E_k. \end{align*} Taking the $L^2$ inner product with $E_k$ and using orthogonality gives \begin{align*} (P_nX_{n+1},E_k)_{L^2}=c_{n,k}(E_k,E_k)_{L^2}=c_{n,k}v_k. \end{align*} Because $E_k\in\mathcal{H}_n$ and $X_{n+1}-P_nX_{n+1}$ is orthogonal to $\mathcal{H}_n$, \begin{align*} (P_nX_{n+1},E_k)_{L^2}=(X_{n+1},E_k)_{L^2}. \end{align*} The innovation $E_k$ has mean zero, since $P_{k-1}X_k\in\mathcal{H}_{k-1}$ is a linear combination of the mean-zero variables $X_1,\dots,X_{k-1}$. Therefore \begin{align*} (X_{n+1},E_k)_{L^2} = \operatorname{Cov}(X_{n+1},E_k). \end{align*} Since $v_k>0$, \begin{align*} c_{n,k}=\frac{\operatorname{Cov}(X_{n+1},E_k)}{v_k}. \end{align*} Now put $k=n+1-j$. Then \begin{align*} P_nX_{n+1} &= \sum_{j=1}^{n}\theta_{n,j}E_{n+1-j}, \\ \theta_{n,j} &= \frac{\operatorname{Cov}(X_{n+1},E_{n+1-j})}{v_{n+1-j}}. \end{align*} [/step] [step:Derive the covariance recursion from the previous predictor] Fix integers $n\geq 1$ and $k$ with $1\leq k\leq n$. If $k=1$, then $P_0$ is the zero projection onto $\mathcal{H}_0=\{0\}$, so $E_1=X_1-P_0X_1=X_1$. Hence \begin{align*} \operatorname{Cov}(X_{n+1},E_1)=\operatorname{Cov}(X_{n+1},X_1)=\kappa(n+1,1), \end{align*} which is the stated formula with an empty sum. Assume now that $2\leq k\leq n$. Applying the predictor representation already proved with $n=k-1$ gives \begin{align*} P_{k-1}X_k=\sum_{r=1}^{k-1}\theta_{k-1,r}E_{k-r}. \end{align*} Using $E_k=X_k-P_{k-1}X_k$ and bilinearity of covariance, \begin{align*} \operatorname{Cov}(X_{n+1},E_k) &= \operatorname{Cov}\left(X_{n+1},X_k-\sum_{r=1}^{k-1}\theta_{k-1,r}E_{k-r}\right)\\ &= \operatorname{Cov}(X_{n+1},X_k) - \sum_{r=1}^{k-1}\theta_{k-1,r}\operatorname{Cov}(X_{n+1},E_{k-r})\\ &= \kappa(n+1,k) - \sum_{r=1}^{k-1}\theta_{k-1,r}\operatorname{Cov}(X_{n+1},E_{k-r}). \end{align*} This is the desired recursion. [/step] [step:Compute the innovation variance by orthogonal decomposition] Fix $n\geq 1$. The decomposition \begin{align*} X_{n+1}=P_nX_{n+1}+E_{n+1} \end{align*} is orthogonal, because $P_nX_{n+1}\in\mathcal{H}_n$ and $E_{n+1}=X_{n+1}-P_nX_{n+1}$ is orthogonal to $\mathcal{H}_n$. The [Pythagorean identity](/page/Pythagorean%20Theorem) in the Hilbert space $L^2(\Omega,\mathcal{F},\mathbb{P})$ therefore gives \begin{align*} \mathbb{E}[X_{n+1}^2] = \mathbb{E}\bigl[(P_nX_{n+1})^2\bigr]+\mathbb{E}[E_{n+1}^2]. \end{align*} Using the expansion \begin{align*} P_nX_{n+1}=\sum_{j=1}^{n}\theta_{n,j}E_{n+1-j} \end{align*} and the pairwise orthogonality of $E_1,\dots,E_n$, we get \begin{align*} \mathbb{E}\bigl[(P_nX_{n+1})^2\bigr] &= \sum_{j=1}^{n}\theta_{n,j}^2\mathbb{E}[E_{n+1-j}^2] \\ &= \sum_{j=1}^{n}\theta_{n,j}^2v_{n+1-j}. \end{align*} Since $\mathbb{E}[X_{n+1}^2]=\kappa(n+1,n+1)$ and $v_{n+1}=\mathbb{E}[E_{n+1}^2]$, rearranging gives \begin{align*} v_{n+1} = \kappa(n+1,n+1) - \sum_{j=1}^{n}\theta_{n,j}^2v_{n+1-j}. \end{align*} This completes the proof of the innovations algorithm. [/step]

Explore Further

Quadratic Discriminant Analysis Bayes Rule probability Normal Equal-Covariance Bayes Rule probability Mahalanobis Quadratic Form Distribution probability Distribution of the Sample Mean of a Multivariate Normal Sample probability Principal Component Variances probability Tracy-Widom Limit for the Largest Eigenvalue of a Real Wishart Matrix probability Root-Pattern Sensitivity of Classical One-Way MANOVA Criteria probability Bartlett Decomposition probability

What brings you to Androma?

Start with a route through the knowledge graph.

Innovations Algorithm (Theorem # 3634)

Discussion

Proof

Explore Further

Sign in to Androma

Check your inbox

One last step

Innovations Algorithm (Theorem # 3634)

Discussion

Proof

Explore Further