[proofplan]
The absence of zeros of $\phi$ on the closed unit disc gives an absolutely summable impulse response $(\psi_j)_{j\geq 0}$ whose generating function is $\theta(z)/\phi(z)$. Causality expresses the ARMA process as the linear filter $X_t=\sum_{j=0}^{\infty}\psi_j Z_{t-j}$. We compute the autocovariance by expanding this moving-average representation and using the orthogonality of white noise, then take its absolutely convergent Fourier series. The resulting double sum factors into $\psi(e^{-i\lambda})\overline{\psi(e^{-i\lambda})}$, giving the stated density.
[/proofplan]
[step:Construct the absolutely summable impulse response from the ARMA polynomials]
Define
\begin{align*}
\psi: \{z\in\mathbb{C}: |z|<R\} &\to \mathbb{C},\\
z &\mapsto \frac{\theta(z)}{\phi(z)},
\end{align*}
where $R>1$ is chosen so that $\phi(z)\neq 0$ for $|z|<R$. Such an $R$ exists because $\phi$ is continuous and has no zeros on the compact set $\{z\in\mathbb{C}: |z|\leq 1\}$. Since $\psi$ is holomorphic on $\{z:|z|<R\}$, it has a power-series expansion
\begin{align*}
\psi(z)=\sum_{j=0}^{\infty}\psi_j z^j,
\qquad |z|<R,
\end{align*}
for a sequence $(\psi_j)_{j\geq 0}$ in $\mathbb{C}$. Choose $\rho$ with $1<\rho<R$. By Cauchy's coefficient estimate on the circle $|z|=\rho$, there is a constant
\begin{align*}
M_\rho:=\sup_{|z|=\rho}|\psi(z)|<\infty
\end{align*}
such that $|\psi_j|\leq M_\rho\rho^{-j}$ for every $j\geq 0$. Hence
\begin{align*}
\sum_{j=0}^{\infty}|\psi_j|
\leq
M_\rho\sum_{j=0}^{\infty}\rho^{-j}
=
M_\rho\frac{\rho}{\rho-1}
<\infty.
\end{align*}
Thus $(\psi_j)_{j\geq 0}\in \ell^1$.
[guided]
The goal of this step is to turn the rational transfer function $\theta/\phi$ into a genuine one-sided linear filter. The hypothesis $\phi(z)\neq 0$ for $|z|\leq 1$ is stronger than non-vanishing on the unit circle: because zeros of a polynomial form a finite closed set, there is some radius $R>1$ such that $\phi(z)\neq 0$ for every $z$ with $|z|<R$. Therefore the map
\begin{align*}
\psi: \{z\in\mathbb{C}: |z|<R\} &\to \mathbb{C},\\
z &\mapsto \frac{\theta(z)}{\phi(z)}
\end{align*}
is holomorphic on a disc strictly larger than the unit disc.
A holomorphic function on $\{z:|z|<R\}$ has a power-series expansion about $0$, so there are coefficients $\psi_j\in\mathbb{C}$ such that
\begin{align*}
\psi(z)=\sum_{j=0}^{\infty}\psi_j z^j,
\qquad |z|<R.
\end{align*}
We need more than convergence on $|z|<1$; to justify the later covariance and Fourier-series manipulations, we need absolute summability of the coefficients. Choose $\rho$ with $1<\rho<R$ and define
\begin{align*}
M_\rho:=\sup_{|z|=\rho}|\psi(z)|.
\end{align*}
This number is finite because $\psi$ is continuous on the compact circle $\{z\in\mathbb{C}:|z|=\rho\}$. Cauchy's coefficient estimate gives
\begin{align*}
|\psi_j|\leq M_\rho\rho^{-j},
\qquad j\geq 0.
\end{align*}
Since $\rho>1$, the geometric series converges, and therefore
\begin{align*}
\sum_{j=0}^{\infty}|\psi_j|
\leq
M_\rho\sum_{j=0}^{\infty}\rho^{-j}
=
M_\rho\frac{\rho}{\rho-1}
<\infty.
\end{align*}
This proves that the impulse response $(\psi_j)_{j\geq 0}$ belongs to $\ell^1$.
[/guided]
[/step]
[step:Represent the causal ARMA process as a one-sided linear filter]
Since $(X_t)_{t\in\mathbb{Z}}$ is causal, it is the stationary solution obtained by applying the one-sided filter $(\psi_j)_{j\geq 0}$ to the white noise:
\begin{align*}
X_t=\sum_{j=0}^{\infty}\psi_j Z_{t-j},
\qquad t\in\mathbb{Z}.
\end{align*}
The series converges in $L^2$ because
\begin{align*}
\mathbb{E}\left[
\left|
\sum_{j=m}^{n}\psi_j Z_{t-j}
\right|^2
\right]
=
\sigma^2\sum_{j=m}^{n}|\psi_j|^2
\end{align*}
for $0\leq m\leq n$, and $(\psi_j)_{j\geq 0}\in\ell^1\subset\ell^2$. The identity $\phi(z)\psi(z)=\theta(z)$ gives the ARMA equation after comparing coefficients and applying the filter identity in $L^2$.
[guided]
Causality means that $X_t$ depends only on the present and past innovations $Z_t,Z_{t-1},\dots$. The coefficients found in the previous step give the candidate causal representation
\begin{align*}
X_t=\sum_{j=0}^{\infty}\psi_j Z_{t-j},
\qquad t\in\mathbb{Z}.
\end{align*}
This series is legitimate in $L^2$. Indeed, for integers $0\leq m\leq n$, orthogonality of white noise gives
\begin{align*}
\mathbb{E}\left[
\left|
\sum_{j=m}^{n}\psi_j Z_{t-j}
\right|^2
\right]
=
\sum_{j=m}^{n}\sum_{k=m}^{n}
\psi_j\overline{\psi_k}\,
\mathbb{E}[Z_{t-j}\overline{Z_{t-k}}]
=
\sigma^2\sum_{j=m}^{n}|\psi_j|^2.
\end{align*}
Because $(\psi_j)_{j\geq 0}\in\ell^1$, it also lies in $\ell^2$, so the tails on the right tend to $0$. Hence the partial sums form a Cauchy sequence in $L^2$ and define $X_t$.
It remains to connect this filter to the ARMA equation. The power-series identity
\begin{align*}
\phi(z)\psi(z)=\theta(z)
\end{align*}
holds for $|z|<1$. Comparing coefficients says exactly that applying the autoregressive filter $\phi$ to the impulse response $(\psi_j)$ produces the finite moving-average coefficients of $\theta$. Therefore applying the same filter identity to the $L^2$-convergent series gives
\begin{align*}
X_t-\sum_{r=1}^{p}\phi_r X_{t-r}
=
Z_t+\sum_{r=1}^{q}\theta_r Z_{t-r},
\qquad t\in\mathbb{Z}.
\end{align*}
Thus the causal ARMA process is the one-sided linear process with impulse response $(\psi_j)_{j\geq 0}$.
[/guided]
[/step]
[step:Compute the autocovariance sequence of the filtered white noise]
Define the autocovariance function
\begin{align*}
\gamma_X:\mathbb{Z} &\to \mathbb{C},\\
h &\mapsto \mathbb{E}[X_{t+h}\overline{X_t}],
\end{align*}
which is independent of $t$ by stationarity. Using the $L^2$ representation and white-noise orthogonality, for each $h\in\mathbb{Z}$,
\begin{align*}
\gamma_X(h)
&=
\mathbb{E}\left[
\left(\sum_{j=0}^{\infty}\psi_j Z_{t+h-j}\right)
\overline{
\left(\sum_{k=0}^{\infty}\psi_k Z_{t-k}\right)}
\right]\\
&=
\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}
\psi_j\overline{\psi_k}\,
\mathbb{E}[Z_{t+h-j}\overline{Z_{t-k}}]\\
&=
\sigma^2
\sum_{\substack{j,k\geq 0\\ k=j-h}}
\psi_j\overline{\psi_k}.
\end{align*}
Equivalently,
\begin{align*}
\gamma_X(h)=
\sigma^2\sum_{j=\max\{0,h\}}^{\infty}\psi_j\overline{\psi_{j-h}},
\qquad h\in\mathbb{Z},
\end{align*}
where terms with negative indices are omitted.
[guided]
We now compute the covariance created by passing white noise through the filter. Define
\begin{align*}
\gamma_X:\mathbb{Z} &\to \mathbb{C},\\
h &\mapsto \mathbb{E}[X_{t+h}\overline{X_t}].
\end{align*}
Stationarity makes this independent of the chosen time $t\in\mathbb{Z}$. Substituting the moving-average representation gives
\begin{align*}
\gamma_X(h)
=
\mathbb{E}\left[
\left(\sum_{j=0}^{\infty}\psi_j Z_{t+h-j}\right)
\overline{
\left(\sum_{k=0}^{\infty}\psi_k Z_{t-k}\right)}
\right].
\end{align*}
The interchange of expectation with the two sums is justified by absolute summability: the covariance series is bounded in absolute value by
\begin{align*}
\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}
|\psi_j|\,|\psi_k|\,
\left|\mathbb{E}[Z_{t+h-j}\overline{Z_{t-k}}]\right|
\leq
\sigma^2
\left(\sum_{j=0}^{\infty}|\psi_j|\right)
\left(\sum_{k=0}^{\infty}|\psi_k|\right)
<\infty.
\end{align*}
Therefore
\begin{align*}
\gamma_X(h)
&=
\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}
\psi_j\overline{\psi_k}\,
\mathbb{E}[Z_{t+h-j}\overline{Z_{t-k}}].
\end{align*}
White-noise orthogonality says
\begin{align*}
\mathbb{E}[Z_{t+h-j}\overline{Z_{t-k}}]
=
\sigma^2\mathbb{1}_{\{t+h-j=t-k\}}
=
\sigma^2\mathbb{1}_{\{k=j-h\}}.
\end{align*}
Thus only pairs satisfying $k=j-h$ contribute, and we obtain
\begin{align*}
\gamma_X(h)
=
\sigma^2
\sum_{\substack{j,k\geq 0\\ k=j-h}}
\psi_j\overline{\psi_k}
=
\sigma^2\sum_{j=\max\{0,h\}}^{\infty}\psi_j\overline{\psi_{j-h}},
\qquad h\in\mathbb{Z}.
\end{align*}
The lower bound $j\geq \max\{0,h\}$ is exactly the condition that both indices $j$ and $j-h$ are non-negative.
[/guided]
[/step]
[step:Take the Fourier series and factor the double sum]
Because $(\psi_j)_{j\geq 0}\in\ell^1$, the sequence $(\gamma_X(h))_{h\in\mathbb{Z}}$ is absolutely summable:
\begin{align*}
\sum_{h\in\mathbb{Z}}|\gamma_X(h)|
\leq
\sigma^2
\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}|\psi_j|\,|\psi_k|
=
\sigma^2
\left(\sum_{j=0}^{\infty}|\psi_j|\right)^2
<\infty.
\end{align*}
Thus the spectral density is
\begin{align*}
f_X(\lambda)
=
\frac{1}{2\pi}\sum_{h\in\mathbb{Z}}\gamma_X(h)e^{-ih\lambda},
\qquad -\pi\leq \lambda\leq \pi.
\end{align*}
Using the covariance formula and the change of index $h=j-k$, we obtain
\begin{align*}
\sum_{h\in\mathbb{Z}}\gamma_X(h)e^{-ih\lambda}
&=
\sigma^2
\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}
\psi_j\overline{\psi_k}\,e^{-i(j-k)\lambda}\\
&=
\sigma^2
\left(\sum_{j=0}^{\infty}\psi_j e^{-ij\lambda}\right)
\overline{
\left(\sum_{k=0}^{\infty}\psi_k e^{-ik\lambda}\right)}\\
&=
\sigma^2|\psi(e^{-i\lambda})|^2.
\end{align*}
[guided]
The covariance sequence is absolutely summable. Indeed, using the formula from the previous step and summing over all possible pairs $(j,k)$ with $h=j-k$ gives
\begin{align*}
\sum_{h\in\mathbb{Z}}|\gamma_X(h)|
\leq
\sigma^2
\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}|\psi_j|\,|\psi_k|
=
\sigma^2
\left(\sum_{j=0}^{\infty}|\psi_j|\right)^2
<\infty.
\end{align*}
This absolute summability is the key technical point: it allows the spectral density to be recovered from the Fourier series of the autocovariance sequence and allows the sums to be rearranged.
By definition of the spectral density associated to an absolutely summable autocovariance sequence,
\begin{align*}
f_X(\lambda)
=
\frac{1}{2\pi}\sum_{h\in\mathbb{Z}}\gamma_X(h)e^{-ih\lambda},
\qquad -\pi\leq \lambda\leq \pi.
\end{align*}
Now substitute the covariance formula. Each nonzero covariance term corresponds to a pair of non-negative indices $(j,k)$ with $h=j-k$. Therefore
\begin{align*}
\sum_{h\in\mathbb{Z}}\gamma_X(h)e^{-ih\lambda}
&=
\sigma^2
\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}
\psi_j\overline{\psi_k}\,e^{-i(j-k)\lambda}.
\end{align*}
The double series is absolutely convergent because $(\psi_j)\in\ell^1$, so it factors as a product:
\begin{align*}
\sum_{h\in\mathbb{Z}}\gamma_X(h)e^{-ih\lambda}
&=
\sigma^2
\left(\sum_{j=0}^{\infty}\psi_j e^{-ij\lambda}\right)
\left(\sum_{k=0}^{\infty}\overline{\psi_k} e^{ik\lambda}\right)\\
&=
\sigma^2
\left(\sum_{j=0}^{\infty}\psi_j e^{-ij\lambda}\right)
\overline{
\left(\sum_{k=0}^{\infty}\psi_k e^{-ik\lambda}\right)}\\
&=
\sigma^2|\psi(e^{-i\lambda})|^2.
\end{align*}
This is the Fourier-domain version of the elementary principle that filtering white noise multiplies the white-noise spectrum by the squared modulus of the transfer function.
[/guided]
[/step]
[step:Substitute the ARMA transfer function to obtain the displayed density]
For $-\pi\leq \lambda\leq \pi$, the point $e^{-i\lambda}$ lies on the unit circle, and the hypothesis gives $\phi(e^{-i\lambda})\neq 0$. Since
\begin{align*}
\psi(e^{-i\lambda})
=
\frac{\theta(e^{-i\lambda})}{\phi(e^{-i\lambda})},
\end{align*}
the preceding step yields
\begin{align*}
f_X(\lambda)
=
\frac{1}{2\pi}\sigma^2|\psi(e^{-i\lambda})|^2
=
\frac{\sigma^2}{2\pi}
\left|
\frac{\theta(e^{-i\lambda})}{\phi(e^{-i\lambda})}
\right|^2,
\qquad -\pi\leq \lambda\leq \pi.
\end{align*}
This is the claimed spectral density of the causal ARMA$(p,q)$ process.
[/step]