Yule-Walker Equations for a Causal Autoregressive Process

Yule-Walker Equations for a Causal Autoregressive Process (Theorem # 3631)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

No discussion available for this theorem.

Proof

[proofplan] The proof is obtained by multiplying the autoregressive recursion by a lagged value of the process and taking expectations. For positive lags, the current innovation $Z_t$ is uncorrelated with the past variable $X_{t-h}$, so only the autoregressive terms remain. Stationarity converts each covariance $\operatorname{Cov}(X_{t-k},X_{t-h})$ into the autocovariance value $\gamma(h-k)$. For lag zero, the same computation leaves the current innovation contribution $\operatorname{Cov}(Z_t,X_t)=\sigma^2$, because causality gives $X_t=Z_t$ plus a component depending only on past innovations. [/proofplan] [step:Record the mean-zero covariance identities used in the computation] Since $(Z_t)_{t \in \mathbb{Z}}$ is white noise, $\mathbb{E}[Z_t]=0$ for every $t \in \mathbb{Z}$. Taking expectations in \begin{align*} X_t=\sum_{k=1}^{p}\phi_k X_{t-k}+Z_t \end{align*} and using second-order stationarity gives a constant mean $m:=\mathbb{E}[X_t]$ satisfying \begin{align*} m=\sum_{k=1}^{p}\phi_k m. \end{align*} For the covariance computations below, we use centered variables. Define $Y_t:=X_t-m$ for $t \in \mathbb{Z}$. Then $(Y_t)_{t \in \mathbb{Z}}$ is second-order stationary, has mean zero, has the same autocovariance function as $(X_t)$, and satisfies \begin{align*} Y_t=\sum_{k=1}^{p}\phi_k Y_{t-k}+Z_t. \end{align*} Therefore it is enough to prove the displayed identities under the normalization $\mathbb{E}[X_t]=0$ for every $t \in \mathbb{Z}$. Under this normalization, \begin{align*} \gamma(a-b)=\operatorname{Cov}(X_a,X_b)=\mathbb{E}[X_aX_b] \end{align*} for all $a,b \in \mathbb{Z}$. [/step] [step:Multiply by a past value to obtain the positive-lag equations] Fix an integer $h \geq 1$. Multiplying the autoregressive equation at time $t$ by $X_{t-h}$ and taking expectations gives \begin{align*} \mathbb{E}[X_tX_{t-h}] &=\mathbb{E}\left[\left(\sum_{k=1}^{p}\phi_k X_{t-k}+Z_t\right)X_{t-h}\right] \\ &=\sum_{k=1}^{p}\phi_k \mathbb{E}[X_{t-k}X_{t-h}] +\mathbb{E}[Z_tX_{t-h}]. \end{align*} The final expectation vanishes because $h \geq 1$ and causality gives that $Z_t$ is uncorrelated with $X_{t-h}$: \begin{align*} \mathbb{E}[Z_tX_{t-h}] =\operatorname{Cov}(Z_t,X_{t-h}) =0. \end{align*} By second-order stationarity, for each $k \in \{1,\dots,p\}$, \begin{align*} \mathbb{E}[X_{t-k}X_{t-h}] =\operatorname{Cov}(X_{t-k},X_{t-h}) =\gamma((t-k)-(t-h)) =\gamma(h-k). \end{align*} Also, \begin{align*} \mathbb{E}[X_tX_{t-h}] =\operatorname{Cov}(X_t,X_{t-h}) =\gamma(h). \end{align*} Substituting these identities yields \begin{align*} \gamma(h)=\sum_{k=1}^{p}\phi_k\gamma(h-k). \end{align*} Since $h \geq 1$ was arbitrary, the positive-lag Yule-Walker equations hold for every integer $h \geq 1$. [/step] [step:Compute the current innovation contribution at lag zero] By causality, there are real coefficients $(\psi_j)_{j \geq 0}$ with $\psi_0=1$ such that \begin{align*} X_t=\sum_{j=0}^{\infty}\psi_j Z_{t-j} \end{align*} with convergence in $L^2$. Define the past-innovation component $W_t$ by the $L^2$-convergent series \begin{align*} W_t=\sum_{j=1}^{\infty}\psi_j Z_{t-j}. \end{align*} Then $X_t=Z_t+W_t$. Since white noise has uncorrelated values at distinct times, $Z_t$ is uncorrelated with every $Z_{t-j}$ for $j \geq 1$. Passing to the $L^2$ limit in the covariance pairing gives \begin{align*} \operatorname{Cov}(Z_t,W_t)=0. \end{align*} Therefore \begin{align*} \operatorname{Cov}(Z_t,X_t) &=\operatorname{Cov}(Z_t,Z_t+W_t) \\ &=\operatorname{Var}(Z_t)+\operatorname{Cov}(Z_t,W_t) \\ &=\sigma^2. \end{align*} Because both variables have mean zero, this is equivalently \begin{align*} \mathbb{E}[Z_tX_t]=\sigma^2. \end{align*} [guided] The only point at lag zero that differs from the positive-lag calculation is that $X_t$ contains the current innovation $Z_t$ itself. Causality makes this precise: there are real coefficients $(\psi_j)_{j \geq 0}$, with $\psi_0=1$, such that \begin{align*} X_t=\sum_{j=0}^{\infty}\psi_j Z_{t-j} \end{align*} in $L^2$. We separate the current innovation from the strictly past innovations by defining \begin{align*} W_t=\sum_{j=1}^{\infty}\psi_j Z_{t-j}. \end{align*} Then $X_t=Z_t+W_t$. Since $(Z_t)_{t \in \mathbb{Z}}$ is white noise, $Z_t$ is uncorrelated with $Z_{t-j}$ for each $j \geq 1$. The covariance map is continuous under $L^2$ convergence, so $Z_t$ is uncorrelated with the $L^2$ limit $W_t$: \begin{align*} \operatorname{Cov}(Z_t,W_t)=0. \end{align*} Thus the only contribution to $\operatorname{Cov}(Z_t,X_t)$ comes from the coefficient-one copy of $Z_t$ inside $X_t$: \begin{align*} \operatorname{Cov}(Z_t,X_t) &=\operatorname{Cov}(Z_t,Z_t+W_t) \\ &=\operatorname{Cov}(Z_t,Z_t)+\operatorname{Cov}(Z_t,W_t) \\ &=\operatorname{Var}(Z_t)+0 \\ &=\sigma^2. \end{align*} Since the variables are centered, this also says $\mathbb{E}[Z_tX_t]=\sigma^2$. [/guided] [/step] [step:Multiply by the present value to obtain the lag-zero equation] Multiplying the autoregressive equation at time $t$ by $X_t$ and taking expectations gives \begin{align*} \mathbb{E}[X_t^2] &=\mathbb{E}\left[\left(\sum_{k=1}^{p}\phi_kX_{t-k}+Z_t\right)X_t\right] \\ &=\sum_{k=1}^{p}\phi_k\mathbb{E}[X_{t-k}X_t]+\mathbb{E}[Z_tX_t]. \end{align*} By second-order stationarity, \begin{align*} \mathbb{E}[X_t^2]=\gamma(0), \end{align*} and, for each $k \in \{1,\dots,p\}$, \begin{align*} \mathbb{E}[X_{t-k}X_t] =\operatorname{Cov}(X_{t-k},X_t) =\gamma(k). \end{align*} The previous step gives $\mathbb{E}[Z_tX_t]=\sigma^2$. Substituting these identities into the expectation equation yields \begin{align*} \gamma(0)=\sum_{k=1}^{p}\phi_k\gamma(k)+\sigma^2. \end{align*} Together with the positive-lag identities, this proves the Yule-Walker equations. [/step]

Explore Further

Beveridge–Nelson Decomposition probability Kalman Filter Step with a Missing Observation probability Wishart Distribution Density Formula probability Anderson's Asymptotic Normality Theorem for Sample Covariance Eigenvalues probability Strong Consistency of the Multivariate Normal Maximum Likelihood Estimators probability Quadratic Discriminant Analysis Bayes Rule probability Covariance Matrix Is Positive Semidefinite probability Durbin-Levinson Recursion probability

What brings you to Androma?

Start with a route through the knowledge graph.