[proofplan]
Set $Y_t=\Delta^dX_t$ and rewrite the ARIMA equation as an ARMA equation for $Y$. The root condition on $\phi$ gives an absolutely summable power-series inverse for $\phi(B)$, and multiplying this inverse by the finite moving-average polynomial $\theta(B)$ gives absolutely summable causal coefficients $(\psi_j)$. These coefficients define an $L^2$ convergent moving average driven by the white noise, which is weakly stationary. Finally, choosing $\mu_Y=c/\phi(1)$ accounts for the constant term and gives the asserted representation.
[/proofplan]
[step:Construct the absolutely summable causal inverse of $\phi$]
Because every zero of $\phi$ lies outside the closed unit disk, each reciprocal root has modulus strictly less than $1$. Hence $\phi$ factors over $\mathbb{C}$ in the form
\begin{align*}
\phi(z)=\prod_{r=1}^{m}(1-\lambda_r z)^{a_r},
\end{align*}
where $\lambda_1,\dots,\lambda_m \in \mathbb{C}$ are distinct, $a_1,\dots,a_m \in \mathbb{N}$, and $|\lambda_r|<1$ for every $r$.
The rational function $1/\phi(z)$ has a power-series expansion on a disk of radius strictly larger than $1$:
\begin{align*}
\frac{1}{\phi(z)}=\sum_{j=0}^{\infty}\pi_j z^j.
\end{align*}
Equivalently, by partial fractions, each coefficient $\pi_j$ is a finite linear combination of terms of the form $j^k\lambda_r^j$, with $0 \leq k \leq a_r-1$ and $1 \leq r \leq m$. Since $|\lambda_r|<1$, each numerical series
\begin{align*}
\sum_{j=0}^{\infty} j^k |\lambda_r|^j
\end{align*}
converges. Therefore
\begin{align*}
\sum_{j=0}^{\infty} |\pi_j| < \infty.
\end{align*}
Define the sequence $(\psi_j)_{j=0}^{\infty}$ by the identity
\begin{align*}
\sum_{j=0}^{\infty}\psi_j z^j
=
\frac{\theta(z)}{\phi(z)}
=
\left(\sum_{\ell=0}^{q}\theta_\ell z^\ell\right)
\left(\sum_{j=0}^{\infty}\pi_j z^j\right),
\end{align*}
where $\theta_0:=1$. Thus
\begin{align*}
\psi_j = \sum_{\ell=0}^{\min\{j,q\}}\theta_\ell \pi_{j-\ell}.
\end{align*}
Since $(\theta_\ell)_{\ell=0}^{q}$ is finite and $(\pi_j)_{j=0}^{\infty}$ is absolutely summable,
\begin{align*}
\sum_{j=0}^{\infty}|\psi_j|
\leq
\sum_{j=0}^{\infty}\sum_{\ell=0}^{\min\{j,q\}}|\theta_\ell|\,|\pi_{j-\ell}|
\leq
\left(\sum_{\ell=0}^{q}|\theta_\ell|\right)
\left(\sum_{j=0}^{\infty}|\pi_j|\right)
<\infty.
\end{align*}
[guided]
The root condition is exactly what makes the causal inverse stable. Since the zeros of $\phi$ are outside the closed unit disk, the reciprocal quantities $\lambda_r$ appearing in the factorisation
\begin{align*}
\phi(z)=\prod_{r=1}^{m}(1-\lambda_r z)^{a_r}
\end{align*}
all satisfy $|\lambda_r|<1$.
This implies that the reciprocal $1/\phi(z)$ has a convergent power series on a disk whose radius is larger than $1$:
\begin{align*}
\frac{1}{\phi(z)}=\sum_{j=0}^{\infty}\pi_j z^j.
\end{align*}
More concretely, partial fractions express $\pi_j$ as a finite linear combination of sequences $j^k\lambda_r^j$, where $0 \leq k \leq a_r-1$. The polynomial factor $j^k$ does not prevent summability because the exponential factor $|\lambda_r|^j$ decays geometrically. Therefore
\begin{align*}
\sum_{j=0}^{\infty}|\pi_j|<\infty.
\end{align*}
Now multiply by the finite moving-average polynomial $\theta$. Define $(\psi_j)_{j=0}^{\infty}$ through
\begin{align*}
\sum_{j=0}^{\infty}\psi_j z^j
=
\frac{\theta(z)}{\phi(z)}
=
\left(\sum_{\ell=0}^{q}\theta_\ell z^\ell\right)
\left(\sum_{j=0}^{\infty}\pi_j z^j\right),
\end{align*}
where $\theta_0:=1$. Comparing coefficients gives
\begin{align*}
\psi_j=\sum_{\ell=0}^{\min\{j,q\}}\theta_\ell\pi_{j-\ell}.
\end{align*}
This is a finite convolution of the absolutely summable sequence $(\pi_j)$ with the finite sequence $(\theta_\ell)_{\ell=0}^{q}$. Hence
\begin{align*}
\sum_{j=0}^{\infty}|\psi_j|
\leq
\left(\sum_{\ell=0}^{q}|\theta_\ell|\right)
\left(\sum_{j=0}^{\infty}|\pi_j|\right)
<\infty.
\end{align*}
[/guided]
[/step]
[step:Define the causal moving average and prove weak stationarity]
Define the process $(W_t)_{t\in\mathbb{Z}}$ by
\begin{align*}
W_t := \sum_{j=0}^{\infty}\psi_j Z_{t-j},
\end{align*}
where the series is interpreted in $L^2(\Omega,\mathcal{F},\mathbb{P})$. Since $(\psi_j)$ is absolutely summable, it is square summable. For $N>M$,
\begin{align*}
\mathbb{E}\left[\left|\sum_{j=M+1}^{N}\psi_j Z_{t-j}\right|^2\right]
=
\sigma^2\sum_{j=M+1}^{N}\psi_j^2,
\end{align*}
because the white-noise variables have zero covariance at distinct times. The right-hand side tends to $0$ as $M,N\to\infty$, so the series defining $W_t$ converges in $L^2$.
The mean is
\begin{align*}
\mathbb{E}[W_t]
=
\sum_{j=0}^{\infty}\psi_j\mathbb{E}[Z_{t-j}]
=
0.
\end{align*}
For $h\in\mathbb{Z}$, define $\psi_j:=0$ for $j<0$. Then
\begin{align*}
\operatorname{Cov}(W_{t+h},W_t)
&=
\mathbb{E}[W_{t+h}W_t] \\
&=
\sigma^2\sum_{j=0}^{\infty}\psi_{j+h}\psi_j,
\end{align*}
where the sum is absolutely convergent by the Cauchy-Schwarz inequality because $(\psi_j)\in \ell^2$. This expression depends only on $h$, not on $t$. Therefore $(W_t)_{t\in\mathbb{Z}}$ is weakly stationary.
[guided]
We now turn the formal power series into a stochastic process. Define
\begin{align*}
W_t := \sum_{j=0}^{\infty}\psi_j Z_{t-j}.
\end{align*}
This definition must be justified because the sum is infinite. Since
\begin{align*}
\sum_{j=0}^{\infty}|\psi_j|<\infty,
\end{align*}
we also have $(\psi_j)\in \ell^2$. Thus the partial sums are Cauchy in $L^2(\Omega,\mathcal{F},\mathbb{P})$: for $N>M$,
\begin{align*}
\mathbb{E}\left[\left|\sum_{j=M+1}^{N}\psi_j Z_{t-j}\right|^2\right]
=
\sum_{i=M+1}^{N}\sum_{j=M+1}^{N}\psi_i\psi_j\mathbb{E}[Z_{t-i}Z_{t-j}].
\end{align*}
The cross terms vanish when $i\neq j$ because white noise has zero covariance at distinct times, and the diagonal terms equal $\sigma^2\psi_j^2$. Hence
\begin{align*}
\mathbb{E}\left[\left|\sum_{j=M+1}^{N}\psi_j Z_{t-j}\right|^2\right]
=
\sigma^2\sum_{j=M+1}^{N}\psi_j^2
\to 0.
\end{align*}
So $W_t$ is well-defined as an $L^2$ limit.
The mean is zero because each $Z_t$ has mean zero:
\begin{align*}
\mathbb{E}[W_t]
=
\sum_{j=0}^{\infty}\psi_j\mathbb{E}[Z_{t-j}]
=
0.
\end{align*}
To compute covariance, set $\psi_j:=0$ for negative $j$. Then, for $h\in\mathbb{Z}$,
\begin{align*}
\operatorname{Cov}(W_{t+h},W_t)
&=
\mathbb{E}[W_{t+h}W_t] \\
&=
\sigma^2\sum_{j=0}^{\infty}\psi_{j+h}\psi_j.
\end{align*}
The sum is finite in absolute value because $(\psi_j)\in\ell^2$ and the Cauchy-Schwarz inequality applies. The resulting covariance depends only on the lag $h$, not on the time $t$. Thus $(W_t)$ is weakly stationary.
[/guided]
[/step]
[step:Verify that the moving average solves the centered ARMA equation]
The coefficient identity
\begin{align*}
\phi(z)\sum_{j=0}^{\infty}\psi_j z^j = \theta(z)
\end{align*}
means that, for every $k\geq 0$, the coefficient of $z^k$ in the product on the left equals the coefficient of $z^k$ in $\theta(z)$. Therefore applying the finite operator $\phi(B)$ to the $L^2$-convergent causal series gives
\begin{align*}
\phi(B)W_t = \theta(B)Z_t.
\end{align*}
Thus $(W_t)$ is the causal centered solution of the ARMA equation.
[/step]
[step:Choose the mean and identify the differenced process]
Since every zero of $\phi$ lies outside the closed unit disk, $1$ is not a zero of $\phi$, so $\phi(1)\neq 0$. Define
\begin{align*}
\mu_Y := \frac{c}{\phi(1)}.
\end{align*}
The constant process $t\mapsto \mu_Y$ satisfies
\begin{align*}
\phi(B)\mu_Y
=
\phi(1)\mu_Y
=
c.
\end{align*}
Therefore the process $t\mapsto \mu_Y+W_t$ satisfies
\begin{align*}
\phi(B)(\mu_Y+W_t)
=
c+\theta(B)Z_t.
\end{align*}
By the assumed causal second-order interpretation of the ARMA equation for $Y_t=\Delta^dX_t$, we have
\begin{align*}
Y_t=\mu_Y+W_t.
\end{align*}
Consequently
\begin{align*}
Y_t-\mu_Y
=
\sum_{j=0}^{\infty}\psi_jZ_{t-j},
\end{align*}
with convergence in $L^2(\Omega,\mathcal{F},\mathbb{P})$.
Since $(W_t)$ is weakly stationary and adding a constant changes only the mean, $(Y_t)$ is weakly stationary with mean $\mu_Y$. This proves the asserted stationarity of the differenced process and the causal representation.
[/step]