[proofplan]
We first verify that the filtered process is well-defined in $L^2$ and compute its mean. After centering the process, we write the covariance of $Y$ as a double series involving the covariance function of $X$. The spectral representation of $\gamma_X$ is then substituted into this double series, and absolute summability of the filter justifies interchanging the sums with the spectral integral. The resulting double sum factors into $A(\lambda)\overline{A(\lambda)}$, giving the claimed spectral density and showing that the covariance of $Y$ depends only on the lag.
[/proofplan]
[step:Show that the filtered process is well-defined and compute its mean]
Let $(\Omega,\mathcal F,\mathbb P)$ be the probability space on which the process $(X_t)_{t \in \mathbb{Z}}$ is defined, and let $\mu_X = \mathbb{E}[X_0]$ denote its common mean. Define the centered process $Z_t: \Omega \to \mathbb{C}$ by
\begin{align*}
Z_t = X_t - \mu_X.
\end{align*}
Since $(X_t)$ is weakly stationary, $\mathbb{E}[|Z_t|^2] = \gamma_X(0) < \infty$ for every $t \in \mathbb{Z}$.
For each $t \in \mathbb{Z}$ and each finite set $F \subset \mathbb{Z}$, define the partial filtered sum $S_{t,F}: \Omega \to \mathbb{C}$ by
\begin{align*}
S_{t,F} = \sum_{j \in F} a_j X_{t-j}.
\end{align*}
If $F,G \subset \mathbb{Z}$ are finite, then by the triangle inequality in $L^2(\Omega)$,
\begin{align*}
\|S_{t,F} - S_{t,G}\|_{L^2(\Omega)}
&\leq \sum_{j \in F \triangle G} |a_j| \, \|X_{t-j}\|_{L^2(\Omega)}.
\end{align*}
Weak stationarity gives $\|X_{t-j}\|_{L^2(\Omega)} = \|X_0\|_{L^2(\Omega)}$, and $a \in \ell^1(\mathbb{Z})$, so the net $(S_{t,F})_F$ over finite subsets of $\mathbb{Z}$ is Cauchy in $L^2(\Omega)$. Hence $Y_t$ is well-defined as its $L^2$ limit.
Taking expectations and using absolute convergence of $\sum_{j \in \mathbb{Z}} a_j \mu_X$, we obtain
\begin{align*}
\mathbb{E}[Y_t]
= \sum_{j \in \mathbb{Z}} a_j \mathbb{E}[X_{t-j}]
= \sum_{j \in \mathbb{Z}} a_j \mu_X
= A(0)\mu_X.
\end{align*}
Thus $\mu_Y = A(0)\mu_X$ is independent of $t$.
[guided]
We first need to justify that the expression defining $Y_t$ is not merely formal. Define the centered random variable $Z_t: \Omega \to \mathbb{C}$ by
\begin{align*}
Z_t = X_t - \mu_X.
\end{align*}
Weak stationarity means, in particular, that all $Z_t$ have the same finite second moment:
\begin{align*}
\mathbb{E}[|Z_t|^2] = \gamma_X(0) < \infty.
\end{align*}
Since $X_t = Z_t + \mu_X$, each $X_t$ also belongs to $L^2(\Omega)$.
For a finite set $F \subset \mathbb{Z}$, define the finite partial filter
\begin{align*}
S_{t,F}: \Omega \to \mathbb{C}, \qquad
S_{t,F} = \sum_{j \in F} a_j X_{t-j}.
\end{align*}
To show that these partial sums converge in $L^2$, compare two finite sums indexed by finite sets $F,G \subset \mathbb{Z}$. By the triangle inequality in $L^2(\Omega)$,
\begin{align*}
\|S_{t,F} - S_{t,G}\|_{L^2(\Omega)}
&=
\left\|\sum_{j \in F \triangle G} \varepsilon_j a_j X_{t-j}\right\|_{L^2(\Omega)} \\
&\leq \sum_{j \in F \triangle G} |a_j|\,\|X_{t-j}\|_{L^2(\Omega)},
\end{align*}
where each $\varepsilon_j \in \{-1,1\}$ records whether the term comes from $F \setminus G$ or $G \setminus F$. Weak stationarity gives a common value for $\|X_{t-j}\|_{L^2(\Omega)}$, namely $\|X_0\|_{L^2(\Omega)}$. Therefore
\begin{align*}
\|S_{t,F} - S_{t,G}\|_{L^2(\Omega)}
\leq \|X_0\|_{L^2(\Omega)} \sum_{j \in F \triangle G} |a_j|.
\end{align*}
Because $a \in \ell^1(\mathbb{Z})$, the right-hand side tends to $0$ as the finite sets exhaust $\mathbb{Z}$. Thus the partial sums converge in $L^2(\Omega)$, and $Y_t$ is well-defined.
Now compute the mean. Since $L^2$ convergence implies $L^1$ convergence for probability measures, expectation passes to the limit of the finite partial sums. Hence
\begin{align*}
\mathbb{E}[Y_t]
= \sum_{j \in \mathbb{Z}} a_j \mathbb{E}[X_{t-j}]
= \sum_{j \in \mathbb{Z}} a_j \mu_X
= A(0)\mu_X.
\end{align*}
This quantity does not depend on $t$, which is the first part of weak stationarity for $(Y_t)$.
[/guided]
[/step]
[step:Express the filtered covariance as an absolutely convergent double series]
For $h \in \mathbb{Z}$, define the autocovariance of $(Y_t)$ by
\begin{align*}
\gamma_Y(h) = \mathbb{E}\big[(Y_{t+h}-\mu_Y)\overline{(Y_t-\mu_Y)}\big].
\end{align*}
Since
\begin{align*}
Y_t - \mu_Y = \sum_{j \in \mathbb{Z}} a_j Z_{t-j}
\end{align*}
in $L^2(\Omega)$, the finite partial sums converge to $Y_t-\mu_Y$ in $L^2(\Omega)$. The $L^2$ inner product is continuous by the Cauchy-Schwarz inequality: if $U_m \to U$ and $V_m \to V$ in $L^2(\Omega)$, then
\begin{align*}
\left|\mathbb{E}[U_m\overline{V_m}]-\mathbb{E}[U\overline{V}]\right|
&\leq \|U_m-U\|_{L^2(\Omega)}\|V_m\|_{L^2(\Omega)}
+ \|U\|_{L^2(\Omega)}\|V_m-V\|_{L^2(\Omega)}.
\end{align*}
Applying this continuity to the centered finite filtered sums gives
\begin{align*}
\gamma_Y(h)
&= \sum_{j \in \mathbb{Z}}\sum_{k \in \mathbb{Z}}
a_j \overline{a_k}\,
\mathbb{E}\big[Z_{t+h-j}\overline{Z_{t-k}}\big] \\
&= \sum_{j \in \mathbb{Z}}\sum_{k \in \mathbb{Z}}
a_j \overline{a_k}\,\gamma_X(h-j+k).
\end{align*}
The double series is absolutely convergent because
\begin{align*}
\sum_{j \in \mathbb{Z}}\sum_{k \in \mathbb{Z}}
|a_j|\,|a_k|\,|\gamma_X(h-j+k)|
\leq \gamma_X(0)\left(\sum_{j \in \mathbb{Z}} |a_j|\right)^2,
\end{align*}
where $|\gamma_X(r)| \leq \gamma_X(0)$ follows from the Cauchy-Schwarz inequality applied to $Z_{t+r}$ and $Z_t$ in $L^2(\Omega)$.
[guided]
We now compute the covariance of the centered filtered process. Since
\begin{align*}
Y_t-\mu_Y
= \sum_{j \in \mathbb{Z}} a_j Z_{t-j}
\end{align*}
in $L^2(\Omega)$, the covariance can be computed by taking limits of finite partial sums. For finite sets $F,G \subset \mathbb{Z}$,
\begin{align*}
\mathbb{E}\left[
\left(\sum_{j \in F} a_j Z_{t+h-j}\right)
\overline{\left(\sum_{k \in G} a_k Z_{t-k}\right)}
\right]
=
\sum_{j \in F}\sum_{k \in G}
a_j\overline{a_k}\,
\mathbb{E}[Z_{t+h-j}\overline{Z_{t-k}}].
\end{align*}
The expectation on the right is a covariance of $X$ at lag $(t+h-j)-(t-k)=h-j+k$, so weak stationarity gives
\begin{align*}
\mathbb{E}[Z_{t+h-j}\overline{Z_{t-k}}]
=
\gamma_X(h-j+k).
\end{align*}
Passing to the $L^2$ limit in both factors is justified by continuity of the $L^2$ inner product. Indeed, if $U_m \to U$ and $V_m \to V$ in $L^2(\Omega)$, then the Cauchy-Schwarz inequality gives
\begin{align*}
\left|\mathbb{E}[U_m\overline{V_m}]-\mathbb{E}[U\overline{V}]\right|
&\leq \|U_m-U\|_{L^2(\Omega)}\|V_m\|_{L^2(\Omega)}
+ \|U\|_{L^2(\Omega)}\|V_m-V\|_{L^2(\Omega)},
\end{align*}
and the right-hand side tends to $0$ because $(V_m)$ is bounded in $L^2(\Omega)$. Therefore
\begin{align*}
\gamma_Y(h)
=
\sum_{j \in \mathbb{Z}}\sum_{k \in \mathbb{Z}}
a_j\overline{a_k}\,\gamma_X(h-j+k).
\end{align*}
We also verify absolute convergence, because it will be needed when the spectral integral is inserted. For any $r \in \mathbb{Z}$, the Cauchy-Schwarz inequality in $L^2(\Omega)$ gives
\begin{align*}
|\gamma_X(r)|
&=
|\mathbb{E}[Z_{t+r}\overline{Z_t}]| \\
&\leq
\left(\mathbb{E}[|Z_{t+r}|^2]\right)^{1/2}
\left(\mathbb{E}[|Z_t|^2]\right)^{1/2}
=
\gamma_X(0).
\end{align*}
Therefore
\begin{align*}
\sum_{j \in \mathbb{Z}}\sum_{k \in \mathbb{Z}}
|a_j|\,|a_k|\,|\gamma_X(h-j+k)|
\leq
\gamma_X(0)
\left(\sum_{j \in \mathbb{Z}} |a_j|\right)
\left(\sum_{k \in \mathbb{Z}} |a_k|\right)
< \infty.
\end{align*}
Thus the covariance series is absolutely convergent.
[/guided]
[/step]
[step:Insert the spectral representation and factor the frequency response]
Using the spectral representation of $\gamma_X$, for each $h \in \mathbb{Z}$ we obtain
\begin{align*}
\gamma_Y(h)
&=
\sum_{j \in \mathbb{Z}}\sum_{k \in \mathbb{Z}}
a_j \overline{a_k}
\int_{-\pi}^{\pi}
e^{i(h-j+k)\lambda} f_X(\lambda)\, d\mathcal{L}^1(\lambda).
\end{align*}
The Tonelli-Fubini theorem justifies the interchange of the double sum and the integral, because the total absolute integral is finite:
\begin{align*}
\sum_{j \in \mathbb{Z}}\sum_{k \in \mathbb{Z}}
|a_j|\,|a_k|
\int_{-\pi}^{\pi}
|f_X(\lambda)|\, d\mathcal{L}^1(\lambda)
=
\|a\|_{\ell^1(\mathbb{Z})}^2
\|f_X\|_{L^1([-\pi,\pi])}
< \infty.
\end{align*}
Hence
\begin{align*}
\gamma_Y(h)
&=
\int_{-\pi}^{\pi}
e^{ih\lambda}
\left(\sum_{j \in \mathbb{Z}} a_j e^{-ij\lambda}\right)
\left(\sum_{k \in \mathbb{Z}} \overline{a_k} e^{ik\lambda}\right)
f_X(\lambda)\, d\mathcal{L}^1(\lambda) \\
&=
\int_{-\pi}^{\pi}
e^{ih\lambda}
A(\lambda)\overline{A(\lambda)}
f_X(\lambda)\, d\mathcal{L}^1(\lambda) \\
&=
\int_{-\pi}^{\pi}
e^{ih\lambda}
|A(\lambda)|^2 f_X(\lambda)\, d\mathcal{L}^1(\lambda).
\end{align*}
[guided]
The previous step reduced the problem to understanding the double series
\begin{align*}
\sum_{j \in \mathbb{Z}}\sum_{k \in \mathbb{Z}}
a_j\overline{a_k}\,\gamma_X(h-j+k).
\end{align*}
Now we use the spectral density of $X$. By hypothesis, for every integer $r$,
\begin{align*}
\gamma_X(r)
=
\int_{-\pi}^{\pi} e^{ir\lambda}f_X(\lambda)\, d\mathcal{L}^1(\lambda).
\end{align*}
Substituting $r=h-j+k$ gives
\begin{align*}
\gamma_Y(h)
=
\sum_{j \in \mathbb{Z}}\sum_{k \in \mathbb{Z}}
a_j \overline{a_k}
\int_{-\pi}^{\pi}
e^{i(h-j+k)\lambda}f_X(\lambda)\, d\mathcal{L}^1(\lambda).
\end{align*}
We must justify moving the two infinite sums inside the integral. The absolute value of the summand is bounded by
\begin{align*}
|a_j|\,|a_k|\,|f_X(\lambda)|,
\end{align*}
because $|e^{i(h-j+k)\lambda}|=1$. Thus the total absolute integral is
\begin{align*}
\sum_{j \in \mathbb{Z}}\sum_{k \in \mathbb{Z}}
|a_j|\,|a_k|
\int_{-\pi}^{\pi}
|f_X(\lambda)|\, d\mathcal{L}^1(\lambda)
=
\|a\|_{\ell^1(\mathbb{Z})}^2
\|f_X\|_{L^1([-\pi,\pi])}
< \infty.
\end{align*}
By the Tonelli-Fubini theorem, this absolute integrability permits the interchange of the sums and the integral.
After interchanging, the dependence on $j$ and $k$ separates:
\begin{align*}
\gamma_Y(h)
&=
\int_{-\pi}^{\pi}
e^{ih\lambda}
\left(\sum_{j \in \mathbb{Z}} a_j e^{-ij\lambda}\right)
\left(\sum_{k \in \mathbb{Z}} \overline{a_k} e^{ik\lambda}\right)
f_X(\lambda)\, d\mathcal{L}^1(\lambda).
\end{align*}
The first series is exactly the frequency response $A(\lambda)$. The second series is its complex conjugate, since
\begin{align*}
\overline{A(\lambda)}
=
\overline{\sum_{k \in \mathbb{Z}} a_k e^{-ik\lambda}}
=
\sum_{k \in \mathbb{Z}} \overline{a_k} e^{ik\lambda},
\end{align*}
where absolute convergence of the series justifies conjugating term by term. Therefore
\begin{align*}
\gamma_Y(h)
=
\int_{-\pi}^{\pi}
e^{ih\lambda}
|A(\lambda)|^2 f_X(\lambda)\, d\mathcal{L}^1(\lambda).
\end{align*}
[/guided]
[/step]
[step:Conclude stationarity and identify the transformed spectral density]
The function $A$ is bounded because
\begin{align*}
|A(\lambda)|
\leq
\sum_{j \in \mathbb{Z}} |a_j|
=
\|a\|_{\ell^1(\mathbb{Z})}
\end{align*}
for every $\lambda \in [-\pi,\pi]$. Hence the measurable function $f_Y: [-\pi,\pi] \to \mathbb{C}$ defined by
\begin{align*}
f_Y(\lambda) = |A(\lambda)|^2 f_X(\lambda)
\end{align*}
belongs to $L^1([-\pi,\pi], \mathcal{B}([-\pi,\pi]), \mathcal{L}^1)$. Since $f_X$ is a spectral density, it is nonnegative $\mathcal{L}^1$-a.e.; therefore $f_Y$ is also nonnegative $\mathcal{L}^1$-a.e. The previous step shows that
\begin{align*}
\gamma_Y(h)
=
\int_{-\pi}^{\pi}
e^{ih\lambda} f_Y(\lambda)\, d\mathcal{L}^1(\lambda)
\end{align*}
for every $h \in \mathbb{Z}$.
The mean $\mu_Y=A(0)\mu_X$ is independent of $t$, and $\gamma_Y(h)$ depends only on the lag $h$. Therefore $(Y_t)_{t \in \mathbb{Z}}$ is weakly stationary, and its spectral density is
\begin{align*}
f_Y(\lambda)=|A(\lambda)|^2 f_X(\lambda).
\end{align*}
This proves the theorem.
[/step]