[proofplan]
We center the process and express the mean-square error of the sample mean as the variance of a finite average. Expanding this variance gives a finite double sum of covariances, and weak stationarity makes each term a function of the lag $s-t$ alone. Grouping the double sum by lag produces the triangular weights $1-|h|/n$, and the hypothesis that $\frac{1}{n}\sum_{|h|<n}|\gamma(h)|\to 0$ forces the resulting variance to vanish.
[/proofplan]
[step:Center the sample mean and reduce mean-square convergence to a variance estimate]
For each $t\in\mathbb Z$, define the centered [random variable](/page/Random%20Variable) $Y_t:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ by
\begin{align*}
Y_t=X_t-\mu.
\end{align*}
For each $n\in\mathbb N$, define the centered sample mean $\bar Y_n:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$ by
\begin{align*}
\bar Y_n=\bar X_n-\mu=\frac{1}{n}\sum_{t=1}^{n}Y_t.
\end{align*}
Linearity of expectation gives $\mathbb E[Y_t]=0$ for every $t\in\mathbb Z$, hence $\mathbb E[\bar Y_n]=0$ for every $n\in\mathbb N$. For a square-integrable real-valued random variable $Z:(\Omega,\mathcal F)\to(\mathbb R,\mathcal B(\mathbb R))$, define its variance by
\begin{align*}
\operatorname{Var}(Z)=\mathbb E[(Z-\mathbb E[Z])^2].
\end{align*}
Applying this definition to $Z=\bar Y_n$ and using $\mathbb E[\bar Y_n]=0$, we obtain
\begin{align*}
\mathbb E[(\bar X_n-\mu)^2]=\mathbb E[\bar Y_n^2]=\operatorname{Var}(\bar Y_n).
\end{align*}
It remains to prove that $\operatorname{Var}(\bar Y_n)\to 0$.
[/step]
[step:Expand the variance as a double covariance sum]Since each $X_t$ is square-integrable, each $Y_t$ is square-integrable. For every $s,t\in\{1,\dots,n\}$, the [Cauchy-Schwarz inequality](/theorems/432) gives
\begin{align*}
\mathbb E[|Y_sY_t|]\le \mathbb E[Y_s^2]^{1/2}\mathbb E[Y_t^2]^{1/2}<\infty,
\end{align*}
so each product $Y_sY_t$ is integrable, and in particular the finite average $\bar Y_n$ is square-integrable. Since $\mathbb E[\bar Y_n]=0$, the variance of $\bar Y_n$ equals its second moment:
\begin{align*}
\operatorname{Var}(\bar Y_n)=\mathbb E\left[\left(\frac{1}{n}\sum_{t=1}^{n}Y_t\right)^2\right].
\end{align*}
Expanding the square and using linearity of expectation over the finite double sum gives
\begin{align*}
\operatorname{Var}(\bar Y_n)=\frac{1}{n^2}\sum_{s=1}^{n}\sum_{t=1}^{n}\mathbb E[Y_sY_t].
\end{align*}
By weak stationarity, for every $s,t\in\{1,\dots,n\}$,
\begin{align*}
\mathbb E[Y_sY_t]=\mathbb E[(X_s-\mu)(X_t-\mu)]=\gamma(s-t).
\end{align*}
Therefore
\begin{align*}
\operatorname{Var}(\bar Y_n)=\frac{1}{n^2}\sum_{s=1}^{n}\sum_{t=1}^{n}\gamma(s-t).
\end{align*}[/step]
[guided]The reason for centering is that the sample mean error is itself an average of centered variables. Define $Y_t=X_t-\mu$ for each $t\in\mathbb Z$. Then $\mathbb E[Y_t]=\mathbb E[X_t]-\mu=0$, and
\begin{align*}
\bar X_n-\mu=\frac{1}{n}\sum_{t=1}^{n}(X_t-\mu)=\frac{1}{n}\sum_{t=1}^{n}Y_t.
\end{align*}
Because only finitely many square-integrable random variables are being summed, the square of the sum expands into finitely many products. Each product is integrable: for $s,t\in\{1,\dots,n\}$, the Cauchy-Schwarz inequality gives
\begin{align*}
\mathbb E[|Y_sY_t|]\le \mathbb E[Y_s^2]^{1/2}\mathbb E[Y_t^2]^{1/2}<\infty.
\end{align*}
Substituting the expression for $\bar X_n-\mu$ into the mean-square error gives
\begin{align*}
\mathbb E[(\bar X_n-\mu)^2]=\mathbb E\left[\left(\frac{1}{n}\sum_{t=1}^{n}Y_t\right)^2\right].
\end{align*}
Since every product $Y_sY_t$ is integrable, we may expand the square and move the expectation through the finite double sum, which gives one term for every ordered pair $(s,t)\in\{1,\dots,n\}^2$:
\begin{align*}
\mathbb E[(\bar X_n-\mu)^2]=\frac{1}{n^2}\sum_{s=1}^{n}\sum_{t=1}^{n}\mathbb E[Y_sY_t].
\end{align*}
Weak stationarity is used precisely here. Since $\mathbb E[X_s]=\mathbb E[X_t]=\mu$, the expectation $\mathbb E[Y_sY_t]$ is the covariance of $X_s$ and $X_t$, which depends only on the lag $s-t$:
\begin{align*}
\mathbb E[Y_sY_t]=\mathbb E[(X_s-\mu)(X_t-\mu)]=\gamma(s-t).
\end{align*}
Substituting this into the double sum yields
\begin{align*}
\mathbb E[(\bar X_n-\mu)^2]=\frac{1}{n^2}\sum_{s=1}^{n}\sum_{t=1}^{n}\gamma(s-t).
\end{align*}
Thus the mean-square error has been converted into a purely deterministic finite sum involving the autocovariance function.[/guided]
[step:Group the double sum by lag]
For a fixed $n\in\mathbb N$, define the lag-counting function $N_n:\mathbb Z\to\{0,1,\dots,n\}$ by
\begin{align*}
N_n(h)=\#\{(s,t)\in\{1,\dots,n\}^2:s-t=h\}.
\end{align*}
If $|h|\ge n$, then $N_n(h)=0$. If $0\le h<n$, then $s=t+h$ and $t$ ranges over $\{1,\dots,n-h\}$, so $N_n(h)=n-h$. If $-n<h<0$, then $t=s-h$ and $s$ ranges over $\{1,\dots,n+h\}$, so $N_n(h)=n+h$. Hence, for every $h\in\mathbb Z$ with $|h|<n$,
\begin{align*}
N_n(h)=n-|h|.
\end{align*}
Grouping the ordered pairs $(s,t)$ by their lag $h=s-t$ gives
\begin{align*}
\sum_{s=1}^{n}\sum_{t=1}^{n}\gamma(s-t)=\sum_{|h|<n}(n-|h|)\gamma(h).
\end{align*}
Therefore
\begin{align*}
\operatorname{Var}(\bar Y_n)=\frac{1}{n}\sum_{|h|<n}\left(1-\frac{|h|}{n}\right)\gamma(h).
\end{align*}
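As a small worked instance, written out only for illustration, take $n=3$. The nine ordered pairs $(s,t)\in\{1,2,3\}^2$ split by lag $h=s-t$ as follows: three pairs with $h=0$, two pairs each with $h=1$ and $h=-1$, and one pair each with $h=2$ and $h=-2$. This agrees with $N_3(h)=3-|h|$, and the grouped sum reads
\begin{align*}
\sum_{s=1}^{3}\sum_{t=1}^{3}\gamma(s-t)=3\gamma(0)+2\gamma(1)+2\gamma(-1)+\gamma(2)+\gamma(-2),
\end{align*}
so that
\begin{align*}
\operatorname{Var}(\bar Y_3)=\frac{1}{3}\left[\gamma(0)+\frac{2}{3}\bigl(\gamma(1)+\gamma(-1)\bigr)+\frac{1}{3}\bigl(\gamma(2)+\gamma(-2)\bigr)\right].
\end{align*}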
[/step]
[step:Bound the triangular covariance average by the assumed absolute average]
For every integer $h$ with $|h|<n$, the triangular weight satisfies
\begin{align*}
0\le 1-\frac{|h|}{n}\le 1.
\end{align*}
Taking absolute values and using the triangle inequality for the finite sum, we obtain
\begin{align*}
|\operatorname{Var}(\bar Y_n)|\le \frac{1}{n}\sum_{|h|<n}\left(1-\frac{|h|}{n}\right)|\gamma(h)|.
\end{align*}
Since each weight is at most $1$,
\begin{align*}
|\operatorname{Var}(\bar Y_n)|\le \frac{1}{n}\sum_{|h|<n}|\gamma(h)|.
\end{align*}
By hypothesis, the right-hand side tends to $0$ as $n\to\infty$. Hence $\operatorname{Var}(\bar Y_n)\to 0$.
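To see the bound at work, consider an assumed example that is not part of the hypotheses: suppose $\gamma(h)=\sigma^2\rho^{|h|}$ for some $\sigma^2>0$ and $0<|\rho|<1$. Summing the geometric series gives
\begin{align*}
\frac{1}{n}\sum_{|h|<n}|\gamma(h)|=\frac{\sigma^2}{n}\left(1+2\sum_{h=1}^{n-1}|\rho|^{h}\right)\le \frac{\sigma^2}{n}\cdot\frac{1+|\rho|}{1-|\rho|},
\end{align*}
which tends to $0$ as $n\to\infty$. Under this assumption the hypothesis holds, and the bound shows that $\operatorname{Var}(\bar Y_n)$ decays at least as fast as a constant multiple of $1/n$.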
[/step]
[step:Conclude mean-square convergence of the sample mean]
From the first step,
\begin{align*}
\mathbb E[(\bar X_n-\mu)^2]=\operatorname{Var}(\bar Y_n).
\end{align*}
The previous step proves that the right-hand side tends to $0$. Therefore
\begin{align*}
\mathbb E[(\bar X_n-\mu)^2]\to 0
\end{align*}
as $n\to\infty$, which is exactly the assertion that $\bar X_n\to\mu$ in mean square.
[/step]