Covariance Stationarity Criterion for the GARCH(1,1) Process (Theorem # 3660)
Theorem
Let $(Z_t)_{t \in \mathbb{Z}}$ be a sequence of independent and identically distributed real-valued random variables with $\mathbb{E}[Z_t] = 0$ and $\mathbb{E}[Z_t^2] = 1$, defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Fix parameters $\omega > 0$ and $\alpha, \beta \ge 0$, and let $\mathcal{F}_t := \sigma(Z_s : s \le t)$ denote the natural filtration.
Suppose $(X_t)_{t \in \mathbb{Z}}$ is a **strictly stationary GARCH$(1,1)$ process** driven by $(Z_t)$: there exists a nonnegative process $(\sigma_t)_{t \in \mathbb{Z}}$ such that each $\sigma_t$ is $\mathcal{F}_{t-1}$-measurable (so that $Z_t$ is independent of $\sigma_t$), the pair process $(X_t, \sigma_t)_{t \in \mathbb{Z}}$ is strictly stationary, and for every $t \in \mathbb{Z}$,
\begin{align*}
X_t = \sigma_t Z_t, \qquad \sigma_t^2 = \omega + \alpha X_{t-1}^2 + \beta \sigma_{t-1}^2 .
\end{align*}
Then:
1. **(Finiteness criterion)** $\mathbb{E}[X_t^2] < \infty$ if and only if $\alpha + \beta < 1$.
2. **(Variance and covariance stationarity)** If $\alpha + \beta < 1$, then
\begin{align*}
\mathbb{E}[X_t^2] = \mathbb{E}[\sigma_t^2] = \frac{\omega}{1 - \alpha - \beta},
\end{align*}
and $(X_t)_{t \in \mathbb{Z}}$ is covariance stationary with
\begin{align*}
\mathbb{E}[X_t] = 0, \qquad \operatorname{Var}(X_t) = \frac{\omega}{1 - \alpha - \beta}, \qquad \operatorname{Cov}(X_t, X_{t+h}) = 0 \quad \text{for all } h \in \mathbb{Z} \setminus \{0\}.
\end{align*}
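As an informal sanity check of part (2), the following Python sketch simulates one GARCH$(1,1)$ path and compares the sample second moment with $\omega/(1-\alpha-\beta)$. Several things here are assumptions that do not come from the theorem: the Gaussian innovations, the parameter values, the seed, and the helper names `stationary_variance` and `simulate_garch11`. The path is started at the stationary mean of $\sigma_t^2$ rather than drawn from the stationary law, so it is only approximately stationary.

```python
import math
import random


def stationary_variance(omega, alpha, beta):
    """Return omega / (1 - alpha - beta), or +inf when alpha + beta >= 1."""
    s = alpha + beta
    return omega / (1.0 - s) if s < 1.0 else math.inf


def simulate_garch11(omega, alpha, beta, n, seed=0):
    """Simulate n values of X_t = sigma_t * Z_t with standard normal Z_t (assumed law)."""
    rng = random.Random(seed)
    # Hypothetical initialisation: start sigma^2 at its stationary mean,
    # falling back to omega when that mean is infinite.
    v = stationary_variance(omega, alpha, beta)
    sigma2 = v if math.isfinite(v) else omega
    xs = []
    for _ in range(n):
        x = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        xs.append(x)
        # sigma_{t+1}^2 = omega + alpha * X_t^2 + beta * sigma_t^2
        sigma2 = omega + alpha * x * x + beta * sigma2
    return xs


omega, alpha, beta = 0.1, 0.1, 0.8   # illustrative values with alpha + beta = 0.9
xs = simulate_garch11(omega, alpha, beta, n=200_000, seed=12345)
sample_second_moment = sum(x * x for x in xs) / len(xs)
print("theoretical variance:", stationary_variance(omega, alpha, beta))
print("sample second moment:", sample_second_moment)
```

With these values the theoretical variance is $1$, and the sample second moment should land close to it. A simulation of this kind illustrates the statement but proves nothing.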
Proof
[proofplan]
Write the conditional-variance recursion in the multiplicative form $\sigma_t^2 = \omega + a_{t-1}\sigma_{t-1}^2$ with random coefficients $a_{t-1} = \alpha Z_{t-1}^2 + \beta$, which are i.i.d. with mean $\alpha + \beta$. Iterating produces an exact finite expansion whose tail term is nonnegative; dropping it and taking expectations forces the unconditional variance to be infinite when $\alpha + \beta \ge 1$. When $\alpha + \beta < 1$, Jensen's inequality and the Strong Law of Large Numbers show the random products vanish almost surely, which lets us pass to the convergent infinite series $\sigma_t^2 = \omega \sum_{k \ge 0} \prod_{j=1}^k a_{t-j}$; the Monotone Convergence Theorem then evaluates $\mathbb{E}[\sigma_t^2]$ as a geometric series equal to $\omega/(1-\alpha-\beta)$. Finally, the relation $\mathbb{E}[X_t^2] = \mathbb{E}[\sigma_t^2]$ and the martingale-difference identity $\mathbb{E}[X_t \mid \mathcal{F}_{t-1}] = 0$ give zero mean and zero autocovariances at all nonzero lags, establishing covariance stationarity.
[/proofplan]
[step:Recast the conditional-variance recursion in multiplicative form and reduce the variance of $X_t$ to that of $\sigma_t^2$]
Since $X_{t-1}^2 = \sigma_{t-1}^2 Z_{t-1}^2$, the defining recursion becomes
\begin{align*}
\sigma_t^2 = \omega + \alpha \sigma_{t-1}^2 Z_{t-1}^2 + \beta \sigma_{t-1}^2 = \omega + a_{t-1}\sigma_{t-1}^2, \qquad a_{t-1} := \alpha Z_{t-1}^2 + \beta .
\end{align*}
For each $t$, the random variable $a_t = \alpha Z_t^2 + \beta \ge 0$ is a fixed measurable function of $Z_t$; hence the family $(a_t)_{t \in \mathbb{Z}}$ is i.i.d., and
\begin{align*}
\mathbb{E}[a_t] = \alpha\,\mathbb{E}[Z_t^2] + \beta = \alpha + \beta,
\end{align*}
using $\mathbb{E}[Z_t^2] = 1$. By strict stationarity of $(X_t, \sigma_t)$, the law of $\sigma_t^2$ does not depend on $t$; write
\begin{align*}
m := \mathbb{E}[\sigma_t^2] \in [0, \infty],
\end{align*}
a constant in $t$ (possibly $+\infty$ at this stage).
We first record that the unconditional second moments of $X_t$ and $\sigma_t$ coincide. Since $\sigma_t^2$ is $\mathcal{F}_{t-1}$-measurable and $Z_t^2$ is independent of $\mathcal{F}_{t-1}$,
\begin{align*}
\mathbb{E}[X_t^2] = \mathbb{E}\big[\sigma_t^2 Z_t^2\big] = \mathbb{E}\big[\sigma_t^2\,\mathbb{E}[Z_t^2 \mid \mathcal{F}_{t-1}]\big] = \mathbb{E}\big[\sigma_t^2\,\mathbb{E}[Z_t^2]\big] = \mathbb{E}[\sigma_t^2] = m,
\end{align*}
the second equality by the [Law of Total Expectation](/theorems/1121) together with the pull-out rule for $\mathcal{F}_{t-1}$-measurable factors in [Properties of Conditional Expectation](/theorems/1122), and the third by the [Conditioning and Independence](/theorems/1152) property. All integrands are nonnegative, so the identity holds in $[0,\infty]$ without any integrability assumption; hence $\mathbb{E}[X_t^2] < \infty$ if and only if $m < \infty$.
[guided]
The whole problem is governed by the conditional variance $\sigma_t^2$, so we first put its recursion in a form amenable to iteration. Substituting $X_{t-1} = \sigma_{t-1}Z_{t-1}$ into $\sigma_t^2 = \omega + \alpha X_{t-1}^2 + \beta\sigma_{t-1}^2$ gives
\begin{align*}
\sigma_t^2 = \omega + (\alpha Z_{t-1}^2 + \beta)\sigma_{t-1}^2 .
\end{align*}
This is an affine recursion in $\sigma_t^2$ with a **random** multiplicative coefficient $a_{t-1} = \alpha Z_{t-1}^2 + \beta$. Why is this the right move? Because the coefficients $(a_t)$ are i.i.d. (each is the same function of one $Z_t$), and their mean is exactly the quantity $\alpha + \beta$ appearing in the theorem:
\begin{align*}
\mathbb{E}[a_t] = \alpha\,\mathbb{E}[Z_t^2] + \beta = \alpha\cdot 1 + \beta = \alpha + \beta .
\end{align*}
So the threshold $\alpha + \beta < 1$ will emerge naturally as the condition that the expected multiplicative factor is below $1$.
Strict stationarity of $(X_t,\sigma_t)$ means every $\sigma_t^2$ has the same distribution, so $m := \mathbb{E}[\sigma_t^2]$ is a well-defined element of $[0,\infty]$ independent of $t$.
Finally, why is it enough to study $\sigma_t^2$? Because the unconditional variance of $X_t$ equals $m$. We condition on $\mathcal{F}_{t-1}$: the factor $\sigma_t^2$ is $\mathcal{F}_{t-1}$-measurable (it depends only on $Z_s$ with $s \le t-1$), so it pulls out of the conditional expectation, while $Z_t^2$ is independent of $\mathcal{F}_{t-1}$, so conditioning has no effect on it:
\begin{align*}
\mathbb{E}[X_t^2] = \mathbb{E}\big[\sigma_t^2 Z_t^2\big] = \mathbb{E}\big[\sigma_t^2\,\mathbb{E}[Z_t^2 \mid \mathcal{F}_{t-1}]\big] = \mathbb{E}\big[\sigma_t^2\big]\,\mathbb{E}[Z_t^2] = m .
\end{align*}
Every equality is valid in $[0,\infty]$ because all integrands are nonnegative, so no finiteness is assumed yet. Consequently $\mathbb{E}[X_t^2] < \infty \iff m < \infty$, and from now on we may work entirely with $m$.
[/guided]
[/step]
[step:Iterate the recursion into an exact finite expansion with a nonnegative remainder]
We claim that for every integer $n \ge 1$,
\begin{align*}
\sigma_t^2 = \omega \sum_{k=0}^{n-1} \Bigg(\prod_{j=1}^{k} a_{t-j}\Bigg) + \Bigg(\prod_{j=1}^{n} a_{t-j}\Bigg)\sigma_{t-n}^2,
\end{align*}
with the convention that the empty product ($k = 0$) equals $1$.
For $n = 1$ this is exactly $\sigma_t^2 = \omega + a_{t-1}\sigma_{t-1}^2$. Assume the identity holds for some $n \ge 1$. Substituting $\sigma_{t-n}^2 = \omega + a_{t-n-1}\sigma_{t-n-1}^2$ into the remainder term,
\begin{align*}
\Bigg(\prod_{j=1}^{n} a_{t-j}\Bigg)\sigma_{t-n}^2
&= \omega\Bigg(\prod_{j=1}^{n} a_{t-j}\Bigg) + \Bigg(\prod_{j=1}^{n} a_{t-j}\Bigg) a_{t-n-1}\,\sigma_{t-n-1}^2 \\
&= \omega \Bigg(\prod_{j=1}^{n} a_{t-j}\Bigg) + \Bigg(\prod_{j=1}^{n+1} a_{t-j}\Bigg)\sigma_{t-n-1}^2,
\end{align*}
where $\prod_{j=1}^{n} a_{t-j}\cdot a_{t-n-1} = \prod_{j=1}^{n+1} a_{t-j}$. Inserting this into the level-$n$ identity yields the level-$(n+1)$ identity, completing the induction.
Because $a_s \ge 0$ for all $s$ and $\sigma_{t-n}^2 \ge 0$, every product $\prod_{j=1}^{k} a_{t-j}$ and the remainder term are nonnegative.
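The finite expansion is a purely algebraic identity, so it can be checked on any concrete coefficient sequence. The sketch below does this in exact rational arithmetic. The function names `iterate_recursion` and `finite_expansion` and the sample numbers are hypothetical choices made for illustration only.

```python
from fractions import Fraction


def iterate_recursion(omega, a, sigma2_start):
    """Apply sigma_t^2 = omega + a_{t-1} * sigma_{t-1}^2 along a (oldest coefficient first)."""
    s = sigma2_start
    for coeff in a:
        s = omega + coeff * s
    return s


def finite_expansion(omega, a, sigma2_start):
    """Evaluate omega * sum_{k<n} prod_{j<=k} a_{t-j} + (prod_{j<=n} a_{t-j}) * sigma_{t-n}^2.

    The list a is ordered oldest first, so a[-1] plays the role of a_{t-1}
    and a[0] the role of a_{t-n}, where n = len(a).
    """
    total = Fraction(0)
    prod = Fraction(1)                  # empty product for k = 0
    for coeff in reversed(a):           # a_{t-1}, a_{t-2}, ..., a_{t-n}
        total += omega * prod
        prod *= coeff
    return total + prod * sigma2_start  # add the remainder term


w = Fraction(1, 10)
coeffs = [Fraction(1, 2), Fraction(3, 2), Fraction(4, 5)]   # a_{t-3}, a_{t-2}, a_{t-1}
print(iterate_recursion(w, coeffs, Fraction(2)))
print(finite_expansion(w, coeffs, Fraction(2)))
```

Both calls return the same rational number, as the induction predicts. Using `Fraction` avoids any floating-point discrepancy between the two evaluation orders.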
[/step]
[step:Conclude infinite variance when $\alpha + \beta \ge 1$]
Suppose $\alpha + \beta \ge 1$. Discarding the nonnegative remainder term in the level-$n$ expansion gives the pointwise lower bound
\begin{align*}
\sigma_t^2 \ge \omega \sum_{k=0}^{n-1} \prod_{j=1}^{k} a_{t-j} \qquad \text{for every } n \ge 1.
\end{align*}
Taking expectations and using linearity over the finite sum,
\begin{align*}
m = \mathbb{E}[\sigma_t^2] \ge \omega \sum_{k=0}^{n-1} \mathbb{E}\Bigg[\prod_{j=1}^{k} a_{t-j}\Bigg].
\end{align*}
For each fixed $k$, the variables $a_{t-1}, \dots, a_{t-k}$ are independent (being functions of distinct $Z$'s), so the expectation of their product factorises:
\begin{align*}
\mathbb{E}\Bigg[\prod_{j=1}^{k} a_{t-j}\Bigg] = \prod_{j=1}^{k} \mathbb{E}[a_{t-j}] = (\alpha + \beta)^k .
\end{align*}
Hence $m \ge \omega \sum_{k=0}^{n-1} (\alpha+\beta)^k$ for all $n$. Since $\alpha + \beta \ge 1$, the partial sums $\sum_{k=0}^{n-1}(\alpha+\beta)^k \ge n$ diverge to $+\infty$ as $n \to \infty$, forcing $m = +\infty$. By Step 1, $\mathbb{E}[X_t^2] = m = +\infty$.
Contrapositively, $\mathbb{E}[X_t^2] < \infty$ implies $\alpha + \beta < 1$, which is the forward direction of part (1).
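The factorisation $\mathbb{E}\big[\prod_{j=1}^{k} a_{t-j}\big] = (\alpha+\beta)^k$ and the resulting lower bound can be reproduced exactly for a small discrete innovation law. The sketch below assumes a toy three-point law, $Z \in \{-\sqrt{3}, 0, \sqrt{3}\}$ with probabilities $1/6, 2/3, 1/6$, which has mean $0$ and variance $1$. This law, like the names `Z2_LAW`, `expected_product` and `variance_lower_bound`, is an illustrative assumption and is not part of the theorem.

```python
from fractions import Fraction
from itertools import product

# Assumed toy law: Z in {-sqrt(3), 0, sqrt(3)} with probabilities 1/6, 2/3, 1/6.
# Only Z^2 enters a_t; it equals 0 with probability 2/3 and 3 with probability 1/3.
Z2_LAW = [(Fraction(0), Fraction(2, 3)), (Fraction(3), Fraction(1, 3))]


def expected_product(alpha, beta, k):
    """E[prod_{j=1}^k a_{t-j}] by brute-force enumeration of the 2^k outcomes."""
    total = Fraction(0)
    for outcome in product(Z2_LAW, repeat=k):
        prob = Fraction(1)
        value = Fraction(1)
        for z2, p in outcome:
            prob *= p
            value *= alpha * z2 + beta      # a = alpha * Z^2 + beta
        total += prob * value
    return total


def variance_lower_bound(omega, alpha, beta, n):
    """omega * sum_{k=0}^{n-1} E[prod_{j<=k} a_{t-j}]: the bound left after dropping the remainder."""
    return omega * sum(expected_product(alpha, beta, k) for k in range(n))


a, b = Fraction(3, 5), Fraction(1, 2)       # alpha + beta = 11/10 >= 1
print(expected_product(a, b, 4), (a + b) ** 4)
print([variance_lower_bound(Fraction(1), a, b, n) for n in (1, 3, 6)])
```

The enumeration agrees with $(\alpha+\beta)^k$, and the lower bound grows at least linearly in $n$ once $\alpha + \beta \ge 1$, which is what forces $m = +\infty$.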
[/step]
[step:Pass to the convergent series representation of $\sigma_t^2$ when $\alpha + \beta < 1$]
Assume now $\alpha + \beta < 1$. Define the partial sums and tail
\begin{align*}
T_n := \omega \sum_{k=0}^{n-1} \prod_{j=1}^{k} a_{t-j}, \qquad R_n := \Bigg(\prod_{j=1}^{n} a_{t-j}\Bigg)\sigma_{t-n}^2,
\end{align*}
so that the Step 2 identity reads $\sigma_t^2 = T_n + R_n$ with $T_n, R_n \ge 0$. Since the summands are nonnegative, $(T_n)$ is nondecreasing and converges almost surely to
\begin{align*}
S := \omega \sum_{k=0}^{\infty} \prod_{j=1}^{k} a_{t-j} \in [0, \infty].
\end{align*}
From $R_n = \sigma_t^2 - T_n \ge 0$ we get $T_n \le \sigma_t^2 < \infty$ almost surely, so $S \le \sigma_t^2 < \infty$ almost surely, and $R_n \downarrow \sigma_t^2 - S =: D \ge 0$ almost surely. We show $D = 0$.
[claim:The random products $P_n := \prod_{j=1}^{n} a_{t-j}$ tend to $0$ almost surely]
[proof]
The variables $\log a_{t-j}$, $j \ge 1$, are i.i.d. with values in $[-\infty, \infty)$ (the value $-\infty$ occurs when $a_{t-j} = 0$). Their positive part is integrable: using $\log^+ x \le x$ for $x \ge 0$,
\begin{align*}
\mathbb{E}[\log^+ a_t] \le \mathbb{E}[a_t] = \alpha + \beta < \infty .
\end{align*}
By [Jensen's Inequality](/theorems/9) applied to the concave function $\log$ and the integrable nonnegative variable $a_t$,
\begin{align*}
\mathbb{E}[\log a_t] \le \log \mathbb{E}[a_t] = \log(\alpha + \beta) < 0,
\end{align*}
where the value $\mathbb{E}[\log a_t] \in [-\infty, 0)$ is well defined because $\mathbb{E}[\log^+ a_t] < \infty$.
To apply the Strong Law in its integrable form we truncate from below. For $M > 0$ set $g_M := \max(\log a_t, -M)$, which satisfies $-M \le g_M \le \log^+ a_t \le a_t$ and is therefore integrable. As $M \uparrow \infty$, $g_M \downarrow \log a_t$ pointwise, so by the [Monotone Convergence Theorem](/theorems/509) applied to the nondecreasing sequence $(\log^+ a_t - g_M)$ of nonnegative variables, $\mathbb{E}[g_M] \downarrow \mathbb{E}[\log a_t] < 0$. Fix $M_0$ large enough that $c := \mathbb{E}[\max(\log a_t, -M_0)] < 0$.
The variables $\xi_j := \max(\log a_{t-j}, -M_0)$, $j \ge 1$, are i.i.d. and integrable, so by the [Strong Law of Large Numbers](/theorems/520),
\begin{align*}
\frac{1}{n}\sum_{j=1}^{n} \xi_j \xrightarrow{a.s.} c < 0 .
\end{align*}
Since $\log a_{t-j} \le \xi_j$ for every $j$,
\begin{align*}
\frac{1}{n}\log P_n = \frac{1}{n}\sum_{j=1}^{n} \log a_{t-j} \le \frac{1}{n}\sum_{j=1}^{n} \xi_j \xrightarrow{a.s.} c < 0,
\end{align*}
hence $\limsup_{n} \tfrac{1}{n}\log P_n \le c < 0$ almost surely. Consequently, almost surely $\log P_n \le nc/2$ for all sufficiently large $n$, so $\log P_n \to -\infty$, i.e. $P_n \to 0$ almost surely.
[/proof]
[/claim]
Next we show $R_n = P_n \sigma_{t-n}^2 \to 0$ in probability. By strict stationarity every $\sigma_{t-n}^2$ has the law of $\sigma_0^2$, a finite random variable; hence the family $\{\sigma_{t-n}^2 : n \ge 1\}$ is tight: given $\delta > 0$ choose $M_\delta < \infty$ with $\mathbb{P}(\sigma_0^2 > M_\delta) < \delta$, so $\mathbb{P}(\sigma_{t-n}^2 > M_\delta) < \delta$ for all $n$. For any $\varepsilon > 0$,
\begin{align*}
\mathbb{P}(R_n > \varepsilon) \le \mathbb{P}(\sigma_{t-n}^2 > M_\delta) + \mathbb{P}\!\left(P_n > \tfrac{\varepsilon}{M_\delta}\right) < \delta + \mathbb{P}\!\left(P_n > \tfrac{\varepsilon}{M_\delta}\right).
\end{align*}
Since $P_n \to 0$ almost surely (the Claim), $P_n \to 0$ in probability, so $\mathbb{P}(P_n > \varepsilon/M_\delta) \to 0$; thus $\limsup_n \mathbb{P}(R_n > \varepsilon) \le \delta$. As $\delta > 0$ was arbitrary, $\mathbb{P}(R_n > \varepsilon) \to 0$, i.e. $R_n \to 0$ in probability.
But $R_n \to D$ almost surely, hence also in probability. For any $\varepsilon > 0$,
\begin{align*}
\mathbb{P}(D > \varepsilon) \le \mathbb{P}\!\left(|D - R_n| > \tfrac{\varepsilon}{2}\right) + \mathbb{P}\!\left(R_n > \tfrac{\varepsilon}{2}\right) \xrightarrow[n\to\infty]{} 0,
\end{align*}
so $\mathbb{P}(D > \varepsilon) = 0$ for every $\varepsilon > 0$, giving $D = 0$ almost surely. Therefore
\begin{align*}
\sigma_t^2 = S = \omega \sum_{k=0}^{\infty} \prod_{j=1}^{k} a_{t-j} \qquad \text{almost surely.}
\end{align*}
[guided]
We have an exact identity $\sigma_t^2 = T_n + R_n$ for each $n$, and we would like to send $n \to \infty$ to obtain an infinite series for $\sigma_t^2$. The partial sums $T_n$ are increasing (we only add nonnegative terms), so they converge almost surely to some $S \in [0,\infty]$. Moreover $T_n \le \sigma_t^2 < \infty$, so in fact $S \le \sigma_t^2 < \infty$ almost surely. The entire difficulty is the remainder: does $R_n = \sigma_t^2 - T_n$ shrink to $0$? Writing $D := \sigma_t^2 - S = \lim_n R_n \ge 0$, the goal is $D = 0$.
**Why should $R_n \to 0$?** The remainder is a product $R_n = P_n\,\sigma_{t-n}^2$, where $P_n = \prod_{j=1}^n a_{t-j}$. The factor $\sigma_{t-n}^2$ does not shrink: by stationarity it has the same distribution for every $n$. So all the decay must come from the product $P_n$. The key point is that the **expected** multiplicative factor is $\alpha + \beta < 1$, so $\mathbb{E}[P_n] = (\alpha+\beta)^n \to 0$ and we expect $P_n \to 0$ as well. Almost-sure decay of a product is a statement about the average of the logarithms $\log a_{t-j}$.
*Establishing $P_n \to 0$.* We pass to logarithms: $\log P_n = \sum_{j=1}^n \log a_{t-j}$. The summands are i.i.d., and we want their average to be negative. The mean is controlled by Jensen's inequality (log is concave):
\begin{align*}
\mathbb{E}[\log a_t] \le \log \mathbb{E}[a_t] = \log(\alpha+\beta) < 0 .
\end{align*}
This is precisely where the hypothesis $\alpha + \beta < 1$ is consumed: it makes the logarithmic drift strictly negative. One technical point: $\mathbb{E}[\log a_t]$ might be $-\infty$ (this can only happen when $\beta = 0$, since otherwise $\log a_t \ge \log \beta > -\infty$), and the Strong Law in its textbook form wants an integrable summand. We sidestep this by truncating below at level $-M_0$: the truncated variable $\xi_j = \max(\log a_{t-j}, -M_0)$ is integrable, and for $M_0$ large its mean $c = \mathbb{E}[\xi_1]$ is still $< 0$ (because $\mathbb{E}[g_M] \downarrow \mathbb{E}[\log a_t] < 0$). The [Strong Law of Large Numbers](/theorems/520) gives $\frac1n\sum_{j=1}^n \xi_j \to c < 0$ almost surely, and since $\log a_{t-j} \le \xi_j$,
\begin{align*}
\frac1n \log P_n \le \frac1n\sum_{j=1}^n \xi_j \to c < 0,
\end{align*}
so $\log P_n \to -\infty$ and $P_n \to 0$ almost surely.
*From $P_n \to 0$ to $R_n \to 0$.* We cannot simply multiply "$P_n \to 0$" by "$\sigma_{t-n}^2$" because $\sigma_{t-n}^2$ is a different random variable for each $n$. The right notion is convergence in probability, exploiting tightness: a single distribution (that of $\sigma_0^2$) is tight, so for $\delta > 0$ there is a cutoff $M_\delta$ with $\mathbb{P}(\sigma_{t-n}^2 > M_\delta) < \delta$ uniformly in $n$. Then
\begin{align*}
\mathbb{P}(R_n > \varepsilon) \le \underbrace{\mathbb{P}(\sigma_{t-n}^2 > M_\delta)}_{<\,\delta} + \underbrace{\mathbb{P}(P_n > \varepsilon/M_\delta)}_{\to\, 0},
\end{align*}
so $\limsup_n \mathbb{P}(R_n > \varepsilon) \le \delta$, and letting $\delta \downarrow 0$ shows $R_n \to 0$ in probability.
*Reconciling the two limits.* We now have $R_n \to D$ almost surely (hence in probability) and $R_n \to 0$ in probability. A sequence cannot have two different limits in probability: by the triangle-inequality union bound,
\begin{align*}
\mathbb{P}(D > \varepsilon) \le \mathbb{P}(|D - R_n| > \varepsilon/2) + \mathbb{P}(R_n > \varepsilon/2) \to 0,
\end{align*}
so $\mathbb{P}(D > \varepsilon) = 0$ for all $\varepsilon$, i.e. $D = 0$ almost surely. Thus the remainder genuinely vanishes and
\begin{align*}
\sigma_t^2 = \omega \sum_{k=0}^\infty \prod_{j=1}^k a_{t-j} \quad \text{almost surely,}
\end{align*}
the strictly stationary series solution we sought.
[/guided]
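The Jensen bound and the role of the truncation can be made concrete with a small computation. The sketch below again assumes the toy innovation law with $Z^2 \in \{0, 3\}$ taken with probabilities $2/3$ and $1/3$ (so $\mathbb{E}[Z^2] = 1$); that law and the helper names `safe_log`, `mean_log_a` and `truncated_mean_log_a` are illustrative assumptions. It evaluates $\mathbb{E}[\log a_t]$ and the truncated mean $c = \mathbb{E}[\max(\log a_t, -M)]$ in closed form.

```python
import math


def safe_log(x):
    """Natural log extended by log(0) = -inf."""
    return math.log(x) if x > 0 else -math.inf


def mean_log_a(alpha, beta):
    """E[log a_t] when Z^2 is 0 w.p. 2/3 and 3 w.p. 1/3 (assumed toy law)."""
    return (2.0 / 3.0) * safe_log(beta) + (1.0 / 3.0) * safe_log(3.0 * alpha + beta)


def truncated_mean_log_a(alpha, beta, M):
    """c = E[max(log a_t, -M)] under the same toy law; finite for every M > 0."""
    low = max(safe_log(beta), -M)
    high = max(safe_log(3.0 * alpha + beta), -M)
    return (2.0 / 3.0) * low + (1.0 / 3.0) * high


# Case beta > 0: Jensen gives E[log a] <= log(alpha + beta) < 0.
print(mean_log_a(0.3, 0.6), math.log(0.3 + 0.6))

# Case beta = 0: a_t = 0 with positive probability, so E[log a] = -inf,
# while the truncated mean is finite and still negative.
print(mean_log_a(0.5, 0.0), truncated_mean_log_a(0.5, 0.0, 1.0))
```

In the second case the untruncated mean is $-\infty$, which is exactly the situation the truncation at $-M_0$ is designed to handle: the truncated mean is a finite negative number to which the Strong Law applies directly.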
[/step]
[step:Evaluate the unconditional variance as a geometric series]
With $\alpha + \beta < 1$ and the series representation $\sigma_t^2 = \omega \sum_{k=0}^{\infty} \prod_{j=1}^{k} a_{t-j}$ almost surely, all terms are nonnegative, so the [Monotone Convergence Theorem](/theorems/509) permits exchanging expectation and summation:
\begin{align*}
m = \mathbb{E}[\sigma_t^2] = \omega \sum_{k=0}^{\infty} \mathbb{E}\Bigg[\prod_{j=1}^{k} a_{t-j}\Bigg].
\end{align*}
As computed in Step 3, independence of $a_{t-1}, \dots, a_{t-k}$ gives $\mathbb{E}\big[\prod_{j=1}^{k} a_{t-j}\big] = (\alpha + \beta)^k$. Hence
\begin{align*}
m = \omega \sum_{k=0}^{\infty} (\alpha + \beta)^k = \frac{\omega}{1 - \alpha - \beta},
\end{align*}
the geometric series converging precisely because $0 \le \alpha + \beta < 1$. In particular $m < \infty$, so by Step 1,
\begin{align*}
\mathbb{E}[X_t^2] = \mathbb{E}[\sigma_t^2] = \frac{\omega}{1 - \alpha - \beta} < \infty .
\end{align*}
Combined with Step 3, this proves part (1): $\mathbb{E}[X_t^2] < \infty \iff \alpha + \beta < 1$, and supplies the value of the variance claimed in part (2).
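The convergence of $\mathbb{E}[T_n] = \omega\sum_{k=0}^{n-1}(\alpha+\beta)^k$ to $\omega/(1-\alpha-\beta)$, and its divergence at the boundary $\alpha+\beta = 1$, can be seen numerically. This is a minimal sketch; the function name `partial_variance` and the parameter values are hypothetical choices.

```python
def partial_variance(omega, alpha, beta, n):
    """omega * sum_{k=0}^{n-1} (alpha + beta)^k, the expectation of the partial sum T_n."""
    r = alpha + beta
    total, power = 0.0, 1.0
    for _ in range(n):
        total += omega * power
        power *= r
    return total


# alpha + beta = 0.9 < 1: partial sums approach omega / (1 - 0.9) = 1.
for n in (10, 50, 500):
    print(n, partial_variance(0.1, 0.1, 0.8, n))

# alpha + beta = 1: partial sums equal omega * n and grow without bound.
print(partial_variance(0.1, 0.5, 0.5, 1000))
```

The first loop shows the partial sums increasing towards the limit, which is the monotone convergence used in this step. The last line shows the same sums growing linearly in the boundary case.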
[/step]
[step:Establish zero mean, the martingale-difference identity, and vanishing autocovariances]
Assume $\alpha + \beta < 1$, so $\mathbb{E}[\sigma_t^2] = \mathbb{E}[X_t^2] = \omega/(1-\alpha-\beta) < \infty$ by Step 5.
**Integrability and zero mean.** By the [Cauchy–Schwarz Inequality](/theorems/432), $\mathbb{E}[\sigma_t] \le (\mathbb{E}[\sigma_t^2])^{1/2} < \infty$ and, since $Z_t$ is independent of $\sigma_t$ with $\mathbb{E}|Z_t| \le (\mathbb{E}[Z_t^2])^{1/2} = 1$,
\begin{align*}
\mathbb{E}|X_t| = \mathbb{E}[\sigma_t |Z_t|] = \mathbb{E}[\sigma_t]\,\mathbb{E}|Z_t| < \infty,
\end{align*}
so $X_t \in L^1$. As $\sigma_t$ is $\mathcal{F}_{t-1}$-measurable and $Z_t$ is independent of $\mathcal{F}_{t-1}$ with $\mathbb{E}[Z_t] = 0$, the pull-out and independence rules from [Properties of Conditional Expectation](/theorems/1122) and [Conditioning and Independence](/theorems/1152) give
\begin{align*}
\mathbb{E}[X_t \mid \mathcal{F}_{t-1}] = \sigma_t\,\mathbb{E}[Z_t \mid \mathcal{F}_{t-1}] = \sigma_t\,\mathbb{E}[Z_t] = 0 .
\end{align*}
Thus $(X_t)$ is a martingale-difference sequence with respect to $(\mathcal{F}_t)$, and by the [Law of Total Expectation](/theorems/1121), $\mathbb{E}[X_t] = \mathbb{E}\big[\mathbb{E}[X_t \mid \mathcal{F}_{t-1}]\big] = 0$. By strict stationarity this mean is the same constant ($0$) for all $t$.
**Zero autocovariances at nonzero lags.** Fix $h \ge 1$. Since $X_t = \sigma_t Z_t$ is $\mathcal{F}_t$-measurable and $\mathcal{F}_t \subseteq \mathcal{F}_{t+h-1}$, both $X_t$ and $\sigma_{t+h}$ are $\mathcal{F}_{t+h-1}$-measurable, while $Z_{t+h}$ is independent of $\mathcal{F}_{t+h-1}$. The product $X_t X_{t+h}$ is integrable by Cauchy–Schwarz:
\begin{align*}
\mathbb{E}|X_t X_{t+h}| \le \big(\mathbb{E}[X_t^2]\,\mathbb{E}[X_{t+h}^2]\big)^{1/2} = \frac{\omega}{1-\alpha-\beta} < \infty .
\end{align*}
Conditioning on $\mathcal{F}_{t+h-1}$ and pulling out the measurable factor $X_t \sigma_{t+h}$,
\begin{align*}
\mathbb{E}[X_t X_{t+h} \mid \mathcal{F}_{t+h-1}] = X_t\,\sigma_{t+h}\,\mathbb{E}[Z_{t+h} \mid \mathcal{F}_{t+h-1}] = X_t\,\sigma_{t+h}\,\mathbb{E}[Z_{t+h}] = 0 .
\end{align*}
By the [Law of Total Expectation](/theorems/1121), $\mathbb{E}[X_t X_{t+h}] = 0$. Since $\mathbb{E}[X_t] = 0$ for all $t$,
\begin{align*}
\operatorname{Cov}(X_t, X_{t+h}) = \mathbb{E}[X_t X_{t+h}] - \mathbb{E}[X_t]\,\mathbb{E}[X_{t+h}] = 0 .
\end{align*}
For $h \le -1$, symmetry of covariance gives $\operatorname{Cov}(X_t, X_{t+h}) = \operatorname{Cov}(X_{t+h}, X_t) = 0$ by the case just proved (applied at lag $-h \ge 1$). Hence $\operatorname{Cov}(X_t, X_{t+h}) = 0$ for every $h \ne 0$.
**Conclusion.** The process $(X_t)$ has constant mean $\mathbb{E}[X_t] = 0$, constant finite variance $\operatorname{Var}(X_t) = \mathbb{E}[X_t^2] = \omega/(1-\alpha-\beta)$, and an autocovariance function $\operatorname{Cov}(X_t, X_{t+h})$ that depends only on the lag $h$ (equal to $\omega/(1-\alpha-\beta)$ at $h = 0$ and $0$ otherwise). These are exactly the defining properties of covariance stationarity, proving part (2) and completing the proof.
[guided]
Having pinned down the second moment, covariance stationarity requires three things: a constant finite mean, a constant finite variance, and autocovariances depending only on the lag. The variance is done (Step 5); we now handle the mean and the cross-moments, and the engine for both is the martingale structure $\mathbb{E}[X_t \mid \mathcal{F}_{t-1}] = 0$.
*Why is $X_t$ integrable?* We need $L^1$ control before conditioning. Cauchy–Schwarz bounds $\mathbb{E}[\sigma_t] \le (\mathbb{E}[\sigma_t^2])^{1/2}$, which is finite by Step 5, and $\mathbb{E}|Z_t| \le (\mathbb{E}[Z_t^2])^{1/2} = 1$; independence of $Z_t$ and $\sigma_t$ then gives $\mathbb{E}|X_t| = \mathbb{E}[\sigma_t]\mathbb{E}|Z_t| < \infty$.
*The martingale-difference identity.* Conditioning $X_t = \sigma_t Z_t$ on $\mathcal{F}_{t-1}$: the volatility $\sigma_t$ is built from $Z_s$ with $s \le t-1$, hence is $\mathcal{F}_{t-1}$-measurable and pulls out; the innovation $Z_t$ is independent of $\mathcal{F}_{t-1}$, so its conditional mean is its unconditional mean $\mathbb{E}[Z_t] = 0$:
\begin{align*}
\mathbb{E}[X_t \mid \mathcal{F}_{t-1}] = \sigma_t\,\mathbb{E}[Z_t \mid \mathcal{F}_{t-1}] = \sigma_t\cdot 0 = 0 .
\end{align*}
Taking expectations (Law of Total Expectation) yields $\mathbb{E}[X_t] = 0$. This is the heart of why GARCH returns are *uncorrelated*: they form a martingale-difference sequence.
*Why do autocovariances vanish?* Take a lag $h \ge 1$ and condition the product $X_t X_{t+h}$ on the information $\mathcal{F}_{t+h-1}$ available just before time $t+h$. Everything except the latest innovation $Z_{t+h}$ is known at that time: $X_t$ (with $t \le t+h-1$) and $\sigma_{t+h}$ are both $\mathcal{F}_{t+h-1}$-measurable, so they pull out, and what remains is the conditional mean of the fresh, independent innovation:
\begin{align*}
\mathbb{E}[X_t X_{t+h} \mid \mathcal{F}_{t+h-1}] = X_t\,\sigma_{t+h}\,\mathbb{E}[Z_{t+h} \mid \mathcal{F}_{t+h-1}] = X_t\,\sigma_{t+h}\cdot 0 = 0 .
\end{align*}
We checked $X_t X_{t+h} \in L^1$ first (Cauchy–Schwarz, using $\mathbb{E}[X_t^2] = \mathbb{E}[X_{t+h}^2] < \infty$), so the Law of Total Expectation applies and $\mathbb{E}[X_t X_{t+h}] = 0$. With zero means, this is exactly $\operatorname{Cov}(X_t, X_{t+h}) = 0$. Negative lags follow by the symmetry $\operatorname{Cov}(X_t, X_{t+h}) = \operatorname{Cov}(X_{t+h}, X_t)$.
Notice what is *not* claimed: the $X_t$ are uncorrelated but **not** independent — the squared series $X_t^2$ is serially correlated through the volatility recursion, which is the entire point of GARCH modelling. Covariance stationarity is a statement only about first and second moments, and those we have now fully computed: constant zero mean, constant variance $\omega/(1-\alpha-\beta)$, and lag-dependent (indeed lag-zero-supported) autocovariances.
[/guided]
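The contrast between uncorrelated levels and correlated squares can be illustrated by simulation. The sketch below assumes Gaussian innovations, illustrative parameter values, a fixed seed, and hypothetical helper names (`simulate_garch11`, `sample_autocorr`). It estimates the lag-$1$ sample autocorrelation of $X_t$ and of $X_t^2$ from one long path started at the stationary mean of $\sigma_t^2$.

```python
import math
import random


def simulate_garch11(omega, alpha, beta, n, seed=0):
    """Simulate X_t = sigma_t * Z_t with standard normal Z_t; requires alpha + beta < 1."""
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)   # start at the stationary mean of sigma^2
    xs = []
    for _ in range(n):
        x = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        xs.append(x)
        sigma2 = omega + alpha * x * x + beta * sigma2
    return xs


def sample_autocorr(series, lag):
    """Sample autocorrelation at a positive lag (mean-centred, normalised by n)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


xs = simulate_garch11(0.1, 0.1, 0.8, n=200_000, seed=2024)
squares = [x * x for x in xs]
r_levels = sample_autocorr(xs, 1)        # expected to be near 0
r_squares = sample_autocorr(squares, 1)  # expected to be clearly positive
print("lag-1 autocorrelation of X_t:  ", r_levels)
print("lag-1 autocorrelation of X_t^2:", r_squares)
```

On a run like this the first estimate should be close to zero, in line with the vanishing autocovariances proved above. The second should be visibly positive, reflecting the dependence carried by the volatility recursion.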
[/step]