[proofplan]
The argument is the moving-average inversion of the autoregressive causal representation. Since $\theta$ has no zeros on the closed unit disc, the reciprocal $1/\theta$ is analytic on a larger disc, so its Taylor coefficients are absolutely summable. We use these coefficients to define a stable linear filter, apply it to both sides of the ARMA equation, and justify the passage through infinite sums in $L^2$. The polynomial identity $(1/\theta)\theta=1$ then recovers $Z_t$, while $(1/\theta)\phi=\phi/\theta$ gives the asserted filter coefficients.
[/proofplan]
[step:Construct an absolutely summable reciprocal filter for $\theta$]
Because $\theta$ is a polynomial and $\theta(z)\neq 0$ for every $z \in \mathbb{C}$ with $|z|\leq 1$, there exists $r>1$ such that $\theta(z)\neq 0$ for every $z \in \mathbb{C}$ with $|z|<r$. Define the holomorphic map
\begin{align*}
\beta:\{z \in \mathbb{C}: |z|<r\}&\to \mathbb{C}\\
z&\mapsto \frac{1}{\theta(z)}.
\end{align*}
Let $(\beta_j)_{j=0}^{\infty}$ denote the Taylor coefficients of $\beta$ at $0$, so
\begin{align*}
\beta(z)=\sum_{j=0}^{\infty}\beta_j z^j,\qquad |z|<r.
\end{align*}
Choose $\rho$ with $1<\rho<r$. Since $\beta$ is continuous on the compact circle $\{z \in \mathbb{C}: |z|=\rho\}$, define
\begin{align*}
M_\rho:=\sup\{|\beta(z)|: |z|=\rho\}<\infty.
\end{align*}
Cauchy's coefficient estimate gives $|\beta_j|\leq M_\rho \rho^{-j}$ for every $j\geq 0$. Therefore
\begin{align*}
\sum_{j=0}^{\infty}|\beta_j|
\leq
M_\rho\sum_{j=0}^{\infty}\rho^{-j}
=
M_\rho\frac{\rho}{\rho-1}
<\infty.
\end{align*}
[guided]
The assumption on $\theta$ is exactly the stability condition needed for the inverse filter. Since $\theta$ is a polynomial, its zeros form a finite set. None of those zeros lies in $\{z \in \mathbb{C}: |z|\leq 1\}$, so the closest zero of $\theta$, if any exists, has modulus strictly larger than $1$. Hence we can choose $r>1$ such that $\theta(z)\neq 0$ whenever $|z|<r$.
On this disc define
\begin{align*}
\beta:\{z \in \mathbb{C}: |z|<r\}&\to \mathbb{C}\\
z&\mapsto \frac{1}{\theta(z)}.
\end{align*}
This map is holomorphic because $\theta$ is holomorphic and non-vanishing on the disc. Let $(\beta_j)_{j=0}^{\infty}$ be its Taylor coefficient sequence at $0$:
\begin{align*}
\beta(z)=\sum_{j=0}^{\infty}\beta_j z^j,\qquad |z|<r.
\end{align*}
To prove that the filter is stable, we need $\sum_{j=0}^{\infty}|\beta_j|<\infty$. Choose any $\rho$ satisfying $1<\rho<r$. The circle $|z|=\rho$ is compact and lies inside the domain of $\beta$, so
\begin{align*}
M_\rho:=\sup\{|\beta(z)|: |z|=\rho\}<\infty.
\end{align*}
Cauchy's coefficient estimate applied on this circle gives
\begin{align*}
|\beta_j|\leq M_\rho \rho^{-j},\qquad j\geq 0.
\end{align*}
Since $\rho>1$, the geometric series with ratio $\rho^{-1}$ converges, and hence
\begin{align*}
\sum_{j=0}^{\infty}|\beta_j|
\leq
M_\rho\sum_{j=0}^{\infty}\rho^{-j}
=
M_\rho\frac{\rho}{\rho-1}
<\infty.
\end{align*}
Thus the reciprocal power series defines an absolutely summable one-sided filter.
[/guided]
[/step]
[step:Define the inverse ARMA coefficients and prove their summability]
Define
\begin{align*}
\pi:\{z \in \mathbb{C}: |z|<r\}&\to \mathbb{C}\\
z&\mapsto \frac{\phi(z)}{\theta(z)}.
\end{align*}
Let $(\pi_j)_{j=0}^{\infty}$ be the Taylor coefficients of $\pi$ at $0$:
\begin{align*}
\pi(z)=\sum_{j=0}^{\infty}\pi_j z^j,\qquad |z|<r.
\end{align*}
Since $\pi=\phi\beta$ and $\phi(z)=\sum_{k=0}^{p}a_k z^k$ with $a_0=1$ and $a_k=-\phi_k$ for $1\leq k\leq p$, the coefficients satisfy
\begin{align*}
\pi_j=\sum_{k=0}^{\min\{p,j\}}a_k\beta_{j-k},\qquad j\geq 0.
\end{align*}
Therefore
\begin{align*}
\sum_{j=0}^{\infty}|\pi_j|
\leq
\sum_{j=0}^{\infty}\sum_{k=0}^{\min\{p,j\}}|a_k|\,|\beta_{j-k}|
\leq
\left(\sum_{k=0}^{p}|a_k|\right)\left(\sum_{m=0}^{\infty}|\beta_m|\right)
<\infty.
\end{align*}
[/step]
[step:Justify stable filters on second-order stationary processes]
Let $(Y_t)_{t\in\mathbb{Z}}$ be any real-valued second-order stationary process, and let $(c_j)_{j=0}^{\infty}$ be an absolutely summable sequence. For each $t \in \mathbb{Z}$, define the partial sums
\begin{align*}
S_{N,t}:=\sum_{j=0}^{N}c_jY_{t-j},\qquad N\geq 0.
\end{align*}
If $M>N$, then by the Cauchy-Schwarz inequality applied to the inner product $(U,V)_{L^2}:=\mathbb{E}[UV]$ on $L^2(\Omega,\mathcal{F},\mathbb{P})$,
\begin{align*}
\mathbb{E}\left[\left|S_{M,t}-S_{N,t}\right|^2\right]
&=
\mathbb{E}\left[\left|\sum_{j=N+1}^{M}c_jY_{t-j}\right|^2\right]\\
&\leq
\left(\sum_{j=N+1}^{M}|c_j|\left(\mathbb{E}[Y_{t-j}^2]\right)^{1/2}\right)^2\\
&=
\mathbb{E}[Y_0^2]\left(\sum_{j=N+1}^{M}|c_j|\right)^2.
\end{align*}
Since $(c_j)_{j=0}^{\infty}\in \ell^1$, the sequence $(S_{N,t})_{N=0}^{\infty}$ is Cauchy in $L^2(\Omega,\mathcal{F},\mathbb{P})$. Hence the filtered random variable
\begin{align*}
c(B)Y_t:=\sum_{j=0}^{\infty}c_jY_{t-j}
\end{align*}
is well-defined as an $L^2$ limit.
[guided]
We will apply infinite filters to both $X_t$ and $Z_t$, so we first record the basic $L^2$ convergence mechanism. Let $(Y_t)_{t\in\mathbb{Z}}$ be a real-valued second-order stationary process, and let $(c_j)_{j=0}^{\infty}$ satisfy
\begin{align*}
\sum_{j=0}^{\infty}|c_j|<\infty.
\end{align*}
For fixed $t\in\mathbb{Z}$, define
\begin{align*}
S_{N,t}:=\sum_{j=0}^{N}c_jY_{t-j},\qquad N\geq 0.
\end{align*}
We need to show that these partial sums converge in $L^2(\Omega,\mathcal{F},\mathbb{P})$.
For $M>N$, apply the Cauchy-Schwarz inequality in the Hilbert space $L^2(\Omega,\mathcal{F},\mathbb{P})$ to the finite sum:
\begin{align*}
\left\|\sum_{j=N+1}^{M}c_jY_{t-j}\right\|_{L^2}
&\leq
\sum_{j=N+1}^{M}|c_j|\,\|Y_{t-j}\|_{L^2}.
\end{align*}
Second-order stationarity gives $\|Y_{t-j}\|_{L^2}=\|Y_0\|_{L^2}$ for every $j$, so
\begin{align*}
\mathbb{E}\left[\left|S_{M,t}-S_{N,t}\right|^2\right]
&=
\left\|S_{M,t}-S_{N,t}\right\|_{L^2}^2\\
&\leq
\left(\sum_{j=N+1}^{M}|c_j|\,\|Y_0\|_{L^2}\right)^2\\
&=
\mathbb{E}[Y_0^2]\left(\sum_{j=N+1}^{M}|c_j|\right)^2.
\end{align*}
The tail $\sum_{j=N+1}^{M}|c_j|$ tends to $0$ uniformly in $M>N$ because $(c_j)$ is absolutely summable. Thus $(S_{N,t})$ is Cauchy in $L^2$, and $L^2$ is complete. This defines
\begin{align*}
c(B)Y_t:=\sum_{j=0}^{\infty}c_jY_{t-j}
\end{align*}
as an $L^2$ limit.
[/guided]
[/step]
[step:Apply the reciprocal filter to the ARMA equation]
Let $\theta_0:=1$, and let $\theta_k$ for $1\leq k\leq q$ be the moving-average coefficients in $\theta(z)=\sum_{k=0}^{q}\theta_k z^k$. The identity $\beta(z)\theta(z)=1$ on $|z|<r$ implies the coefficient identities
\begin{align*}
\sum_{k=0}^{\min\{q,m\}}\theta_k\beta_{m-k}
=
\begin{cases}
1, & m=0,\\
0, & m\geq 1.
\end{cases}
\end{align*}
Because $\theta(B)Z_t$ is a finite linear combination of the white-noise variables $Z_t,\dots,Z_{t-q}$, the reciprocal filter can be applied term by term in $L^2$. For every $t\in\mathbb{Z}$,
\begin{align*}
\beta(B)\theta(B)Z_t
&=
\sum_{j=0}^{\infty}\beta_j\sum_{k=0}^{q}\theta_k Z_{t-j-k}\\
&=
\sum_{m=0}^{\infty}\left(\sum_{k=0}^{\min\{q,m\}}\theta_k\beta_{m-k}\right)Z_{t-m}\\
&=
Z_t,
\end{align*}
where the rearrangement is justified because the outer coefficient sequence is absolutely summable and the inner sum over $k$ is finite.
Similarly, since $\phi(z)=\sum_{k=0}^{p}a_kz^k$ and $\pi(z)=\beta(z)\phi(z)$, the same finite-convolution calculation gives
\begin{align*}
\beta(B)\phi(B)X_t=\pi(B)X_t
\end{align*}
in $L^2(\Omega,\mathcal{F},\mathbb{P})$.
[guided]
The algebraic identity $\beta(z)\theta(z)=1$ is a power-series identity, and we now translate it into a filter identity. Write $\theta_0:=1$, so
\begin{align*}
\theta(z)=\sum_{k=0}^{q}\theta_kz^k.
\end{align*}
Multiplying the two power series gives
\begin{align*}
\left(\sum_{j=0}^{\infty}\beta_jz^j\right)
\left(\sum_{k=0}^{q}\theta_kz^k\right)
=
1.
\end{align*}
Equating coefficients of $z^m$ yields
\begin{align*}
\sum_{k=0}^{\min\{q,m\}}\theta_k\beta_{m-k}
=
\begin{cases}
1, & m=0,\\
0, & m\geq 1.
\end{cases}
\end{align*}
We may apply the infinite filter $\beta(B)$ to $\theta(B)Z_t$ because $(\beta_j)$ is absolutely summable and $\theta(B)Z_t$ is a second-order stationary process: it is a finite linear combination of shifts of the white noise process. Thus the filtered expression is an $L^2$ limit. Since the $\theta$-sum is finite, we may collect terms by the total lag $m=j+k$:
\begin{align*}
\beta(B)\theta(B)Z_t
&=
\sum_{j=0}^{\infty}\beta_j\sum_{k=0}^{q}\theta_k Z_{t-j-k}\\
&=
\sum_{m=0}^{\infty}\left(\sum_{k=0}^{\min\{q,m\}}\theta_k\beta_{m-k}\right)Z_{t-m}\\
&=
Z_t.
\end{align*}
The final equality uses the coefficient identities: the coefficient of $Z_t$ is $1$, and the coefficient of every lagged variable $Z_{t-m}$ with $m\geq 1$ is $0$.
The same computation applies to the autoregressive side. Write
\begin{align*}
\phi(z)=\sum_{k=0}^{p}a_kz^k,
\end{align*}
where $a_0=1$ and $a_k=-\phi_k$ for $1\leq k\leq p$. Since $\pi(z)=\beta(z)\phi(z)$, the coefficient of $z^m$ in this product is $\pi_m$. Therefore
\begin{align*}
\beta(B)\phi(B)X_t=\pi(B)X_t
\end{align*}
in $L^2(\Omega,\mathcal{F},\mathbb{P})$.
[/guided]
[/step]
[step:Recover the innovations from present and past observations]
The ARMA equation gives $\phi(B)X_t=\theta(B)Z_t$ in $L^2(\Omega,\mathcal{F},\mathbb{P})$ for every $t\in\mathbb{Z}$. Applying the stable filter $\beta(B)$ to both sides and using the identities established above gives
\begin{align*}
\pi(B)X_t
=
\beta(B)\phi(B)X_t
=
\beta(B)\theta(B)Z_t
=
Z_t.
\end{align*}
Equivalently,
\begin{align*}
Z_t=\sum_{j=0}^{\infty}\pi_jX_{t-j}
\end{align*}
with convergence in $L^2(\Omega,\mathcal{F},\mathbb{P})$. Since $\sum_{j=0}^{\infty}|\pi_j|<\infty$, this representation is a stable one-sided reconstruction of $Z_t$ from the present and past observations $X_t,X_{t-1},\dots$. Hence $(X_t)_{t\in\mathbb{Z}}$ is invertible, with inverse filter $\pi(z)=\phi(z)/\theta(z)$.
[/step]