[proofplan]
The proof is an immediate consequence of the defining property of orthogonal projection in the Hilbert space $L^2(\Omega,\mathcal F,\mathbb P)$. First, the residual $X_t-P_{\mathcal H_{t-1}^X}X_t$, obtained by subtracting from $X_t$ its projection onto the closed past $\mathcal H_{t-1}^X$, is orthogonal to that closed past. Then, for $s<t$, the earlier innovation $\varepsilon_s$ belongs to $\mathcal H_s^X$, hence to $\mathcal H_{t-1}^X$, so the orthogonality just proved gives the covariance identity.
[/proofplan]
[step:Use the orthogonal projection residual to obtain orthogonality to the past]
Fix $t\in\mathbb Z$. The space $L^2(\Omega,\mathcal F,\mathbb P)$ is a complex Hilbert space with inner product
\begin{align*}
(Y,Z)_{L^2}:=\mathbb E[Y\overline Z],
\end{align*}
for $Y,Z\in L^2(\Omega,\mathcal F,\mathbb P)$. By definition, $\mathcal H_{t-1}^X$ is a closed linear subspace of this Hilbert space, and $P_{\mathcal H_{t-1}^X}X_t$ is the orthogonal projection of $X_t$ onto $\mathcal H_{t-1}^X$. Therefore the projection residual satisfies
\begin{align*}
X_t-P_{\mathcal H_{t-1}^X}X_t\perp \mathcal H_{t-1}^X.
\end{align*}
Since $\varepsilon_t:=X_t-P_{\mathcal H_{t-1}^X}X_t$, this gives
\begin{align*}
\varepsilon_t\perp \mathcal H_{t-1}^X.
\end{align*}
[guided]
Fix $t\in\mathbb Z$. We work in the complex Hilbert space $L^2(\Omega,\mathcal F,\mathbb P)$, whose inner product is
\begin{align*}
(Y,Z)_{L^2}:=\mathbb E[Y\overline Z],
\end{align*}
for $Y,Z\in L^2(\Omega,\mathcal F,\mathbb P)$. The subspace $\mathcal H_{t-1}^X$ was defined as the $L^2$-closed linear span of the random variables $X_s$ with $s\leq t-1$, so it is a closed linear subspace of $L^2(\Omega,\mathcal F,\mathbb P)$.
The orthogonal projection $P_{\mathcal H_{t-1}^X}X_t$ is the unique element of $\mathcal H_{t-1}^X$ whose difference from $X_t$ is orthogonal to $\mathcal H_{t-1}^X$. Hence
\begin{align*}
X_t-P_{\mathcal H_{t-1}^X}X_t\perp \mathcal H_{t-1}^X.
\end{align*}
But the innovation at time $t$ is defined by
\begin{align*}
\varepsilon_t:=X_t-P_{\mathcal H_{t-1}^X}X_t.
\end{align*}
Substituting this definition into the residual orthogonality gives
\begin{align*}
\varepsilon_t\perp \mathcal H_{t-1}^X.
\end{align*}
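As a sketch of what this notation says, the relation $\varepsilon_t\perp\mathcal H_{t-1}^X$ can be unpacked with the inner product above. The symbol $Y$ below is only an illustrative name for a generic element of the past:
\begin{align*}
\mathbb E[\varepsilon_t\overline Y]=(\varepsilon_t,Y)_{L^2}=0\qquad\text{for every }Y\in\mathcal H_{t-1}^X.
\end{align*}
In particular, since each $X_u$ with $u\leq t-1$ lies in $\mathcal H_{t-1}^X$, taking $Y=X_u$ gives
\begin{align*}
\mathbb E[\varepsilon_t\overline{X_u}]=0\qquad\text{for every }u\leq t-1.
\end{align*}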
[/guided]
[/step]
[step:Place every earlier innovation inside the later past]
Let $s,t\in\mathbb Z$ satisfy $s<t$. Since $s$ and $t$ are integers, $s\leq t-1$, so the index set $\{u\in\mathbb Z:u\leq s\}$ is contained in $\{u\in\mathbb Z:u\leq t-1\}$. Taking closed linear spans of the corresponding random variables $X_u$ gives
\begin{align*}
\mathcal H_s^X\subset \mathcal H_{t-1}^X.
\end{align*}
Also, $X_s\in\mathcal H_s^X$ by definition, and
\begin{align*}
P_{\mathcal H_{s-1}^X}X_s\in\mathcal H_{s-1}^X\subset \mathcal H_s^X.
\end{align*}
Because $\mathcal H_s^X$ is a linear subspace,
\begin{align*}
\varepsilon_s=X_s-P_{\mathcal H_{s-1}^X}X_s\in\mathcal H_s^X\subset \mathcal H_{t-1}^X.
\end{align*}
[guided]
Let $s,t\in\mathbb Z$ satisfy $s<t$. The goal is to use the orthogonality of $\varepsilon_t$ to $\mathcal H_{t-1}^X$, so we must verify that $\varepsilon_s$ is actually an element of $\mathcal H_{t-1}^X$.
Since $s<t$, we have $s\leq t-1$. Therefore every observation with index at most $s$ is also an observation with index at most $t-1$:
\begin{align*}
\{u\in\mathbb Z:u\leq s\}\subset \{u\in\mathbb Z:u\leq t-1\}.
\end{align*}
Taking linear spans and then $L^2$ closures preserves inclusion, so
\begin{align*}
\mathcal H_s^X\subset \mathcal H_{t-1}^X.
\end{align*}
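Written out, the inclusion can be traced as follows. The shorthand $A_s:=\{X_u:u\leq s\}$ and the symbol $\operatorname{cl}_{L^2}$ for closure in $L^2$ are introduced only for this illustration and are not notation from the text:
\begin{align*}
A_s\subset A_{t-1}
\;\Longrightarrow\;
\operatorname{span}A_s\subset\operatorname{span}A_{t-1}
\;\Longrightarrow\;
\operatorname{cl}_{L^2}(\operatorname{span}A_s)\subset\operatorname{cl}_{L^2}(\operatorname{span}A_{t-1}).
\end{align*}
With this shorthand, the two sets on the right are $\mathcal H_s^X$ and $\mathcal H_{t-1}^X$, respectively.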
Now we check that $\varepsilon_s\in\mathcal H_s^X$. By definition of $\mathcal H_s^X$, the random variable $X_s$ belongs to $\mathcal H_s^X$. The projection $P_{\mathcal H_{s-1}^X}X_s$ belongs to the target subspace $\mathcal H_{s-1}^X$, and since $\mathcal H_{s-1}^X\subset\mathcal H_s^X$, we have
\begin{align*}
P_{\mathcal H_{s-1}^X}X_s\in\mathcal H_s^X.
\end{align*}
Because $\mathcal H_s^X$ is a linear subspace, the difference of these two elements also belongs to $\mathcal H_s^X$:
\begin{align*}
\varepsilon_s=X_s-P_{\mathcal H_{s-1}^X}X_s\in\mathcal H_s^X.
\end{align*}
Combining this with $\mathcal H_s^X\subset\mathcal H_{t-1}^X$ gives
\begin{align*}
\varepsilon_s\in\mathcal H_{t-1}^X.
\end{align*}
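For a concrete instance, one may take the illustrative values $s=3$ and $t=5$ (chosen only as an example). Then $t-1=4$, and the argument reads
\begin{align*}
\varepsilon_3=X_3-P_{\mathcal H_2^X}X_3\in\mathcal H_3^X\subset\mathcal H_4^X,
\end{align*}
which is the membership needed to apply $\varepsilon_5\perp\mathcal H_4^X$.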
[/guided]
[/step]
[step:Convert Hilbert space orthogonality into the covariance identity]
From the first step, $\varepsilon_t\perp\mathcal H_{t-1}^X$. From the second step, $\varepsilon_s\in\mathcal H_{t-1}^X$. Hence
\begin{align*}
(\varepsilon_t,\varepsilon_s)_{L^2}=0.
\end{align*}
Using the definition of the complex $L^2$ inner product,
\begin{align*}
(\varepsilon_t,\varepsilon_s)_{L^2}=\mathbb E[\varepsilon_t\overline{\varepsilon_s}],
\end{align*}
and therefore
\begin{align*}
\mathbb E[\varepsilon_t\overline{\varepsilon_s}]=0.
\end{align*}
This proves the stated orthogonality and the asserted covariance identity.
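The computation above treats $s<t$. If the statement is meant for all $s\neq t$ (an assumption about its scope, since the statement is not restated here), the remaining case can be sketched using conjugate symmetry of the inner product. For $s>t$, the same argument with the roles of $s$ and $t$ exchanged gives $\mathbb E[\varepsilon_s\overline{\varepsilon_t}]=0$, and then
\begin{align*}
\mathbb E[\varepsilon_t\overline{\varepsilon_s}]=\overline{\mathbb E[\varepsilon_s\overline{\varepsilon_t}]}=0.
\end{align*}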
[/step]