[proofplan]
We write the mean-square prediction error as a quadratic function of the coefficient vector $b \in \mathbb{R}^p$. The minimizer is characterized by vanishing directional derivatives in each coordinate, which gives the orthogonality of the prediction error against every lagged regressor $X_{t-i}$. Weak stationarity then rewrites these orthogonality equations entirely in terms of the autocovariance function $\gamma$, producing the displayed linear system. Since $\Gamma_p$ is invertible, that system has exactly one solution, and hence the optimal coefficient vector is unique.
[/proofplan]
[step:Express the prediction risk as a quadratic function of the coefficients]
For $b = (b_1,\dots,b_p)^\top \in \mathbb{R}^p$, define the linear finite-lag predictor
\begin{align*}
\widehat{X}_{t,p}(b) = \sum_{j=1}^{p} b_j X_{t-j}.
\end{align*}
Define the prediction error [random variable](/page/Random%20Variable) $E_t(b): \Omega \to \mathbb{R}$ by
\begin{align*}
E_t(b) = X_t - \widehat{X}_{t,p}(b).
\end{align*}
Since $X_t$ and each $X_{t-j}$ are square-integrable and the sum has finitely many terms, $E_t(b)$ is square-integrable. The mean-square prediction risk is the function $Q: \mathbb{R}^p \to \mathbb{R}$ given by
\begin{align*}
Q(b) = \mathbb{E}[E_t(b)^2].
\end{align*}
A mean-square optimal linear finite $p$-lag predictor is therefore obtained by minimizing $Q$ over $\mathbb{R}^p$.
[/step]
[step:Differentiate the quadratic risk to obtain orthogonality conditions]Fix $i \in \{1,\dots,p\}$ and $b \in \mathbb{R}^p$. Let $e_i \in \mathbb{R}^p$ denote the $i$th standard basis vector and consider the one-variable function $\varphi_i: \mathbb{R} \to \mathbb{R}$ defined by
\begin{align*}
\varphi_i(r) = Q(b + r e_i).
\end{align*}
Because
\begin{align*}
E_t(b + r e_i) = E_t(b) - r X_{t-i},
\end{align*}
we have
\begin{align*}
\varphi_i(r) = \mathbb{E}[(E_t(b) - r X_{t-i})^2].
\end{align*}
Expanding the square and using linearity of expectation gives
\begin{align*}
\varphi_i(r) = \mathbb{E}[E_t(b)^2] - 2r\mathbb{E}[E_t(b)X_{t-i}] + r^2\mathbb{E}[X_{t-i}^2].
\end{align*}
Thus $\varphi_i$ is differentiable and
\begin{align*}
\varphi_i'(0) = -2\mathbb{E}[E_t(b)X_{t-i}].
\end{align*}
If $a \in \mathbb{R}^p$ minimizes $Q$, then with $b = a$ the function $\varphi_i$ attains its minimum at $r = 0$, so $\varphi_i'(0) = 0$ and hence
\begin{align*}
\mathbb{E}[E_t(a)X_{t-i}] = 0
\end{align*}
for every $i \in \{1,\dots,p\}$.[/step]
[guided]Fix one lag index $i \in \{1,\dots,p\}$. To find the normal equation in the $i$th coordinate, we vary only the coefficient of $X_{t-i}$. Let $e_i \in \mathbb{R}^p$ be the $i$th standard basis vector, and define $\varphi_i: \mathbb{R} \to \mathbb{R}$ by
\begin{align*}
\varphi_i(r) = Q(b + r e_i).
\end{align*}
The coefficient vector $b + r e_i$ changes the predictor by adding $rX_{t-i}$, so the corresponding error changes by subtracting $rX_{t-i}$:
\begin{align*}
E_t(b + r e_i) = E_t(b) - rX_{t-i}.
\end{align*}
Therefore
\begin{align*}
\varphi_i(r) = \mathbb{E}[(E_t(b) - rX_{t-i})^2].
\end{align*}
Now expand the square inside the expectation:
\begin{align*}
(E_t(b) - rX_{t-i})^2 = E_t(b)^2 - 2rE_t(b)X_{t-i} + r^2X_{t-i}^2.
\end{align*}
Each term is integrable because $E_t(b)$ and $X_{t-i}$ are square-integrable, and the product term is integrable by the [Cauchy-Schwarz inequality](/theorems/432) for random variables. Taking expectations gives
\begin{align*}
\varphi_i(r) = \mathbb{E}[E_t(b)^2] - 2r\mathbb{E}[E_t(b)X_{t-i}] + r^2\mathbb{E}[X_{t-i}^2].
\end{align*}
This is an ordinary quadratic polynomial in $r$, so
\begin{align*}
\varphi_i'(0) = -2\mathbb{E}[E_t(b)X_{t-i}].
\end{align*}
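If $a$ minimizes the full function $Q: \mathbb{R}^p \to \mathbb{R}$, then with $b = a$ we have $\varphi_i(r) = Q(a + re_i) \ge Q(a) = \varphi_i(0)$ for every $r \in \mathbb{R}$, positive or negative. So $\varphi_i$ has a minimum at the interior point $r = 0$, and its derivative there must be zero. We obtain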
\begin{align*}
\mathbb{E}[E_t(a)X_{t-i}] = 0.
\end{align*}
This is the orthogonality principle in this finite-dimensional setting: the optimal error is orthogonal, in the $L^2$ [inner product](/page/Inner%20Product), to each regressor used in the linear predictor.[/guided]
[step:Rewrite the orthogonality conditions using weak stationarity]
For $a = (a_1,\dots,a_p)^\top$, the error is
\begin{align*}
E_t(a) = X_t - \sum_{j=1}^{p} a_j X_{t-j}.
\end{align*}
For each $i \in \{1,\dots,p\}$, the orthogonality condition gives
\begin{align*}
0 = \mathbb{E}[X_tX_{t-i}] - \sum_{j=1}^{p} a_j\mathbb{E}[X_{t-j}X_{t-i}].
\end{align*}
Because the process is mean-zero and weakly stationary,
\begin{align*}
\mathbb{E}[X_tX_{t-i}] = \gamma(i).
\end{align*}
Likewise, for each $j \in \{1,\dots,p\}$,
\begin{align*}
\mathbb{E}[X_{t-j}X_{t-i}] = \gamma(i-j).
\end{align*}
Substituting these covariance identities into the orthogonality condition yields
\begin{align*}
\gamma(i) = \sum_{j=1}^{p} a_j\gamma(i-j)
\end{align*}
for every $i \in \{1,\dots,p\}$.
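As a sanity check on these equations, consider a hypothetical autocovariance of the form $\gamma(h) = \sigma^2\phi^{|h|}$ with $0 < |\phi| < 1$ and $\sigma^2 > 0$, which is the autocovariance shape of a causal AR(1) process. The symbols $\phi$ and $\sigma^2$ are illustrative and are not used elsewhere in this proof. The candidate $a = (\phi, 0, \dots, 0)^\top$ then satisfies every equation, since for each $i \in \{1,\dots,p\}$ we have $i - 1 \ge 0$ and
\begin{align*}
\sum_{j=1}^{p} a_j\gamma(i-j) = \phi\,\gamma(i-1) = \phi\,\sigma^2\phi^{i-1} = \sigma^2\phi^{i} = \gamma(i).
\end{align*}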
[/step]
[step:Collect the covariance equations into the normal equation system]
By the definition of $\Gamma_p$, the $i$th component of $\Gamma_p a$ is
\begin{align*}
(\Gamma_p a)_i = \sum_{j=1}^{p}(\Gamma_p)_{ij}a_j = \sum_{j=1}^{p}\gamma(i-j)a_j.
\end{align*}
The equations obtained above say exactly that
\begin{align*}
(\Gamma_p a)_i = \gamma(i)
\end{align*}
for every $i \in \{1,\dots,p\}$. Therefore
\begin{align*}
\Gamma_p a = (\gamma(1),\dots,\gamma(p))^\top.
\end{align*}
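For illustration, take $p = 2$. With $(\Gamma_p)_{ij} = \gamma(i-j)$ as above, and using the symmetry $\gamma(-1) = \gamma(1)$ of a real autocovariance function, the system reads
\begin{align*}
\begin{pmatrix} \gamma(0) & \gamma(1) \\ \gamma(1) & \gamma(0) \end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}
=
\begin{pmatrix} \gamma(1) \\ \gamma(2) \end{pmatrix}.
\end{align*}
When $\gamma(0)^2 \neq \gamma(1)^2$, so that this $2 \times 2$ matrix is invertible, Cramer's rule gives
\begin{align*}
a_1 = \frac{\gamma(1)\bigl(\gamma(0) - \gamma(2)\bigr)}{\gamma(0)^2 - \gamma(1)^2},
\qquad
a_2 = \frac{\gamma(0)\gamma(2) - \gamma(1)^2}{\gamma(0)^2 - \gamma(1)^2}.
\end{align*}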
[/step]
[step:Use invertibility to identify the unique optimal coefficient vector]
Since $\Gamma_p$ is invertible, the linear system
\begin{align*}
\Gamma_p a = (\gamma(1),\dots,\gamma(p))^\top
\end{align*}
has the unique solution
\begin{align*}
a = \Gamma_p^{-1}(\gamma(1),\dots,\gamma(p))^\top.
\end{align*}
The function $Q$ is a quadratic function whose Hessian is $2\Gamma_p$. Because $\Gamma_p$ is a covariance matrix, it is positive semidefinite; invertibility makes it positive definite. Hence $Q$ is strictly convex on $\mathbb{R}^p$, so its critical point is the unique global minimizer. Thus the mean-square optimal linear finite $p$-lag predictor has coefficients satisfying the stated normal equations.
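The Hessian claim can be checked by expanding $Q$ directly. Write $\gamma_p = (\gamma(1),\dots,\gamma(p))^\top$, a shorthand introduced only for this display, and use the covariance identities $\mathbb{E}[X_tX_{t-i}] = \gamma(i)$ and $\mathbb{E}[X_{t-i}X_{t-j}] = \gamma(i-j)$. One way to sketch the computation is
\begin{align*}
Q(b) &= \mathbb{E}[X_t^2] - 2\sum_{i=1}^{p} b_i\,\mathbb{E}[X_tX_{t-i}] + \sum_{i=1}^{p}\sum_{j=1}^{p} b_i b_j\,\mathbb{E}[X_{t-i}X_{t-j}] \\
&= \gamma(0) - 2\,b^\top\gamma_p + b^\top\Gamma_p b.
\end{align*}
Since $\Gamma_p$ is symmetric, $\nabla Q(b) = 2(\Gamma_p b - \gamma_p)$ and $\nabla^2 Q(b) = 2\Gamma_p$, so the critical points of $Q$ are exactly the solutions of the normal equations.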
[/step]