[proofplan]
We introduce the exact unnormalized Feynman-Kac functionals $\gamma_s(h)$ and their particle analogues $\gamma_{s,N}(h)$, chosen so that $\gamma_{M,N}(1)$ is exactly the product likelihood estimator. The proof is then a one-step conditional unbiasedness argument: at time $0$ the proposal density $q_0$ and the weight $\omega_0$ recover the initial unnormalized measure, while at later times unbiased resampling preserves the previous empirical measure in [conditional expectation](/page/Conditional%20Expectation) and mutation with the weight ratio recovers the next Feynman-Kac prediction. Iterating these identities by conditional expectation gives $\mathbb{E}[\gamma_{M,N}(1)]=\gamma_M(1)=Z_M$.
[/proofplan]
[step:Define the exact and particle unnormalized functionals]
For $0 \leq s \leq M$, define the unnormalized path density
\begin{align*}
\gamma_s:E_0 \times \cdots \times E_s \to [0,\infty)
\end{align*}
by
\begin{align*}
\gamma_s(x_{0:s})
=
\mu_0(x_0)g_0(y_0\mid x_0)
\prod_{r=1}^s f_r(x_r\mid x_{r-1})g_r(y_r\mid x_r).
\end{align*}
For every bounded measurable function
\begin{align*}
h:E_0 \times \cdots \times E_s \to \mathbb{R},
\end{align*}
for which the following integral is finite, define
\begin{align*}
\gamma_s(h)
=
\int_{E_0 \times \cdots \times E_s}
h(x_{0:s})\gamma_s(x_{0:s})
\,d(\lambda_0 \otimes \cdots \otimes \lambda_s)(x_{0:s}).
\end{align*}
Thus $\gamma_M(1)=Z_M$.
For $0 \leq s \leq M$, define the empirical average weight
\begin{align*}
\overline W_s=\frac{1}{N}\sum_{i=1}^N W_{s,i}.
\end{align*}
For each realized particle system, define the time-zero particle unnormalized functional as the map
\begin{align*}
\gamma_{0,N}:\{h:E_0\to\mathbb{R}\text{ measurable with finite displayed sum}\}\to\mathbb{R}
\end{align*}
by
\begin{align*}
\gamma_{0,N}(h)=\frac{1}{N}\sum_{i=1}^N W_{0,i} h(X_{0,i}).
\end{align*}
For $1 \leq s \leq M$, define the time-$s$ particle unnormalized functional as the map
\begin{align*}
\gamma_{s,N}:\{h:E_0\times\cdots\times E_s\to\mathbb{R}\text{ measurable with finite displayed sum}\}\to\mathbb{R}
\end{align*}
by
\begin{align*}
\gamma_{s,N}(h)=\left(\prod_{r=0}^{s-1}\overline W_r\right)\frac{1}{N}\sum_{i=1}^N W_{s,i} h(X_{0:s,i}).
\end{align*}
With $h\equiv 1$, this gives
\begin{align*}
\gamma_{M,N}(1)
=
\prod_{s=0}^M \overline W_s
=
\widehat Z_{M,N}.
\end{align*}
[/step]
[step:Verify the initial unbiasedness identity by integrating under $q_0$]
Let
\begin{align*}
h:E_0 \to \mathbb{R}
\end{align*}
be measurable and integrable with respect to the initial unnormalized measure. Since $X_{0,1},\dots,X_{0,N}$ are independent with density $q_0$ with respect to $\lambda_0$, linearity of expectation gives
\begin{align*}
\mathbb{E}[\gamma_{0,N}(h)]
=
\frac{1}{N}\sum_{i=1}^N
\mathbb{E}[\omega_0(X_{0,i})h(X_{0,i})].
\end{align*}
For each $i$, the definition of $\omega_0$ and the positivity assumption on $q_0$ give
\begin{align*}
\mathbb{E}[\omega_0(X_{0,i})h(X_{0,i})]
=
\int_{E_0}
h(x_0)
\frac{\mu_0(x_0)g_0(y_0\mid x_0)}{q_0(x_0)}
q_0(x_0)
\,d\lambda_0(x_0).
\end{align*}
Hence
\begin{align*}
\mathbb{E}[\omega_0(X_{0,i})h(X_{0,i})]
=
\int_{E_0}
h(x_0)\mu_0(x_0)g_0(y_0\mid x_0)
\,d\lambda_0(x_0)
=
\gamma_0(h).
\end{align*}
Therefore
\begin{align*}
\mathbb{E}[\gamma_{0,N}(h)]
=
\gamma_0(h).
\end{align*}
[guided]
The first step checks that importance sampling is unbiased before any resampling occurs. We take a measurable function $h:E_0 \to \mathbb{R}$ such that the displayed expectations and integrals are finite. The random variables $X_{0,1},\dots,X_{0,N}$ are independent and each has density $q_0$ with respect to $\lambda_0$, while $W_{0,i}=\omega_0(X_{0,i})$. Therefore
\begin{align*}
\mathbb{E}[\gamma_{0,N}(h)]
=
\mathbb{E}\left[
\frac{1}{N}\sum_{i=1}^N W_{0,i} h(X_{0,i})
\right].
\end{align*}
By linearity of expectation and the identical distribution of the particles,
\begin{align*}
\mathbb{E}[\gamma_{0,N}(h)]
=
\frac{1}{N}\sum_{i=1}^N
\mathbb{E}[\omega_0(X_{0,i})h(X_{0,i})].
\end{align*}
Now compute one summand using the proposal density $q_0$:
\begin{align*}
\mathbb{E}[\omega_0(X_{0,i})h(X_{0,i})]
=
\int_{E_0}
h(x_0)\omega_0(x_0)q_0(x_0)
\,d\lambda_0(x_0).
\end{align*}
The positivity assumption on $q_0$ ensures that the ratio defining $\omega_0$ is valid wherever the numerator can contribute. Substituting
\begin{align*}
\omega_0(x_0)=\frac{\mu_0(x_0)g_0(y_0\mid x_0)}{q_0(x_0)}
\end{align*}
gives
\begin{align*}
\mathbb{E}[\omega_0(X_{0,i})h(X_{0,i})]
=
\int_{E_0}
h(x_0)\mu_0(x_0)g_0(y_0\mid x_0)
\,d\lambda_0(x_0).
\end{align*}
The last integral is exactly $\gamma_0(h)$ by the definition of the initial unnormalized functional. Averaging $N$ identical copies of the same expectation does not change the value, so
\begin{align*}
\mathbb{E}[\gamma_{0,N}(h)]
=
\gamma_0(h).
\end{align*}
[/guided]
[/step]
[step:Express one exact Feynman-Kac transition as a weighted proposal integral]
Fix $1 \leq s \leq M$. For every measurable function
\begin{align*}
h:E_0 \times \cdots \times E_s \to \mathbb{R},
\end{align*}
define
\begin{align*}
G_s h:E_0 \times \cdots \times E_{s-1} \to \mathbb{R}
\end{align*}
by
\begin{align*}
(G_s h)(x_{0:s-1})
=
\int_{E_s}
h(x_{0:s})
f_s(x_s\mid x_{s-1})g_s(y_s\mid x_s)
\,d\lambda_s(x_s),
\end{align*}
whenever the integral is finite. Then the product definition of $\gamma_s$ and [Tonelli's theorem](/theorems/601) for nonnegative integrands, extended to integrable signed functions by applying the same argument to positive and negative parts, gives
\begin{align*}
\gamma_s(h)=\gamma_{s-1}(G_s h).
\end{align*}
Moreover, by the definition of $\omega_s$,
\begin{align*}
(G_s h)(x_{0:s-1})
=
\int_{E_s}
h(x_{0:s})\omega_s(x_{0:s})Q_s(x_s\mid x_{0:s-1})
\,d\lambda_s(x_s)
\end{align*}
for every previous path $x_{0:s-1}$ for which the integral is finite. The positivity assumption on $Q_s$ ensures that the ratio defining $\omega_s$ is valid on the support of the unnormalized transition density.
[/step]
[step:Show unbiased resampling and mutation give the conditional Feynman-Kac update]
Fix $1 \leq s \leq M$, and let
\begin{align*}
h:E_0 \times \cdots \times E_s \to \mathbb{R}
\end{align*}
be measurable with the required integrability. Let $\mathcal F_{s-1}$ denote the sigma-algebra generated by all particles, weights, and resampling variables through time $s-1$. Conditional on $\mathcal F_{s-1}$, define the offspring count
\begin{align*}
C_{s,j}=\#\{i:A_{s,i}=j\}
\end{align*}
for $1 \leq j \leq N$. By assumption,
\begin{align*}
\mathbb{E}[C_{s,j}\mid\mathcal F_{s-1}]
=
\frac{N W_{s-1,j}}{\sum_{k=1}^N W_{s-1,k}}.
\end{align*}
Since mutation is conditionally independent given the ancestor indices and $\mathcal F_{s-1}$, the proposal-density calculation from the previous step gives
\begin{align*}
\mathbb{E}[W_{s,i} h(X_{0:s,i})\mid A_{s,i}=j,\mathcal F_{s-1}]
=
(G_s h)(X_{0:s-1,j}).
\end{align*}
Therefore
\begin{align*}
\mathbb{E}\left[
\frac{1}{N}\sum_{i=1}^N W_{s,i} h(X_{0:s,i})
\mid \mathcal F_{s-1}
\right]
=
\frac{1}{N}\sum_{j=1}^N
\mathbb{E}[C_{s,j}\mid\mathcal F_{s-1}]
(G_s h)(X_{0:s-1,j}).
\end{align*}
Substituting the unbiased offspring-count identity yields
\begin{align*}
\mathbb{E}\left[
\frac{1}{N}\sum_{i=1}^N W_{s,i} h(X_{0:s,i})
\mid \mathcal F_{s-1}
\right]
=
\frac{\sum_{j=1}^N W_{s-1,j}(G_s h)(X_{0:s-1,j})}
{\sum_{k=1}^N W_{s-1,k}}.
\end{align*}
Multiplying by $\prod_{r=0}^{s-1}\overline W_r$, which is $\mathcal F_{s-1}$-measurable, and using
\begin{align*}
\overline W_{s-1}
=
\frac{1}{N}\sum_{k=1}^N W_{s-1,k}
\end{align*}
gives
\begin{align*}
\mathbb{E}[\gamma_{s,N}(h)\mid\mathcal F_{s-1}]
=
\gamma_{s-1,N}(G_s h).
\end{align*}
[guided]
This is the main particle-filter step. We want to prove that one resampling-mutation-weighting cycle has the correct conditional mean. Fix $1 \leq s \leq M$ and a measurable function $h:E_0 \times \cdots \times E_s \to \mathbb{R}$ for which the relevant conditional expectations are finite. The sigma-algebra $\mathcal F_{s-1}$ contains the previous paths $X_{0:s-1,j}$ and weights $W_{s-1,j}$, so these quantities are fixed when we condition on $\mathcal F_{s-1}$.
For each previous particle $j$, define the random offspring count
\begin{align*}
C_{s,j}=\#\{i:A_{s,i}=j\}.
\end{align*}
The resampling scheme is not assumed to be multinomial or independent across children. The only property used is the unbiased count condition
\begin{align*}
\mathbb{E}[C_{s,j}\mid\mathcal F_{s-1}]
=
\frac{N W_{s-1,j}}{\sum_{k=1}^N W_{s-1,k}}.
\end{align*}
The denominator is positive almost surely by hypothesis, so these conditional probabilities and expectations are well-defined.
Now condition further on the event that the $i$-th new particle has ancestor $j$. Then the old part of its path is fixed as $X_{0:s-1,j}$, and the new coordinate $X_{s,i}$ is drawn from the proposal density $Q_s(\cdot\mid X_{0:s-1,j})$ with respect to $\lambda_s$. Thus
\begin{align*}
\mathbb{E}[W_{s,i} h(X_{0:s,i})\mid A_{s,i}=j,\mathcal F_{s-1}]
=
\int_{E_s}
h(X_{0:s-1,j},x_s)
\omega_s(X_{0:s-1,j},x_s)
Q_s(x_s\mid X_{0:s-1,j})
\,d\lambda_s(x_s).
\end{align*}
Substituting the incremental weight ratio gives
\begin{align*}
\mathbb{E}[W_{s,i} h(X_{0:s,i})\mid A_{s,i}=j,\mathcal F_{s-1}]
=
\int_{E_s}
h(X_{0:s-1,j},x_s)
f_s(x_s\mid x_{s-1,j})g_s(y_s\mid x_s)
\,d\lambda_s(x_s).
\end{align*}
Here $x_{s-1,j}$ denotes the terminal coordinate of the realized path $X_{0:s-1,j}$, so the transition density is evaluated at the previous state of ancestor $j$.
By definition of $G_s h$, this is
\begin{align*}
\mathbb{E}[W_{s,i} h(X_{0:s,i})\mid A_{s,i}=j,\mathcal F_{s-1}]
=
(G_s h)(X_{0:s-1,j}).
\end{align*}
The sum over children can now be grouped by ancestor. Conditional on $\mathcal F_{s-1}$, all children with ancestor $j$ have the same conditional mean contribution $(G_s h)(X_{0:s-1,j})$. Therefore
\begin{align*}
\mathbb{E}\left[
\frac{1}{N}\sum_{i=1}^N W_{s,i} h(X_{0:s,i})
\mid \mathcal F_{s-1}
\right]
=
\frac{1}{N}\sum_{j=1}^N
\mathbb{E}[C_{s,j}\mid\mathcal F_{s-1}]
(G_s h)(X_{0:s-1,j}).
\end{align*}
Using the unbiased offspring-count identity, this becomes
\begin{align*}
\mathbb{E}\left[
\frac{1}{N}\sum_{i=1}^N W_{s,i} h(X_{0:s,i})
\mid \mathcal F_{s-1}
\right]
=
\frac{\sum_{j=1}^N W_{s-1,j}(G_s h)(X_{0:s-1,j})}
{\sum_{k=1}^N W_{s-1,k}}.
\end{align*}
Finally multiply by the previous product of average weights. Since $\prod_{r=0}^{s-1}\overline W_r$ is $\mathcal F_{s-1}$-measurable, it may be pulled through the conditional expectation. The factor $\overline W_{s-1}=N^{-1}\sum_{k=1}^N W_{s-1,k}$ cancels the denominator above, leaving
\begin{align*}
\mathbb{E}[\gamma_{s,N}(h)\mid\mathcal F_{s-1}]
=
\left(\prod_{r=0}^{s-2}\overline W_r\right)
\frac{1}{N}\sum_{j=1}^N
W_{s-1,j}(G_s h)(X_{0:s-1,j}).
\end{align*}
The right-hand side is precisely $\gamma_{s-1,N}(G_s h)$ by the definition of the particle unnormalized functional. Hence
\begin{align*}
\mathbb{E}[\gamma_{s,N}(h)\mid\mathcal F_{s-1}]
=
\gamma_{s-1,N}(G_s h).
\end{align*}
[/guided]
[/step]
[step:Iterate the one-step identity to obtain the likelihood unbiasedness]
Define the constant function
\begin{align*}
\mathbf{1}_M:E_0 \times \cdots \times E_M \to \mathbb{R}
\end{align*}
by $\mathbf{1}_M(x_{0:M})=1$. Since $\gamma_M(\mathbf{1}_M)=Z_M<\infty$, the backward functions constructed below are nonnegative and finite inside the exact unnormalized integrals; no pointwise finiteness outside null or irrelevant sets is needed. To justify the particle identities for possibly unbounded $h_s$, first apply the previous conditional identities to the bounded truncations $h_{s,m}=\min\{h_s,m\}$ for $m\in\mathbb{N}$. For these truncations, the corresponding weighted sums are bounded above by the same sums with [test function](/page/Test%20Function) $m$, so the displayed conditional expectations are finite by the integrability assumptions on the weights. As $m\to\infty$, the functions $h_{s,m}$ increase pointwise to $h_s$, and the nonnegative particle sums also increase pointwise. The [monotone convergence theorem](/theorems/509) for conditional expectations therefore permits passage to the limit in the conditional identities. The stated integrability assumptions on the particle weights and on $\widehat Z_{M,N}$ ensure that the limiting particle expectations appearing below are finite. For $0 \leq s \leq M-1$, define recursively [measurable functions](/page/Measurable%20Functions)
\begin{align*}
h_s:E_0 \times \cdots \times E_s \to \mathbb{R}
\end{align*}
by $h_M=\mathbf{1}_M$ and
\begin{align*}
h_{s-1}=G_s h_s
\end{align*}
for $1 \leq s \leq M$. The exact Feynman-Kac identity from above gives
\begin{align*}
\gamma_M(\mathbf{1}_M)=\gamma_0(h_0).
\end{align*}
The conditional particle identity gives, for every $1 \leq s \leq M$,
\begin{align*}
\mathbb{E}[\gamma_{s,N}(h_s)]
=
\mathbb{E}[\gamma_{s-1,N}(h_{s-1})],
\end{align*}
where the equality follows by taking expectations on both sides of
\begin{align*}
\mathbb{E}[\gamma_{s,N}(h_s)\mid\mathcal F_{s-1}]
=
\gamma_{s-1,N}(h_{s-1}).
\end{align*}
Iterating from $s=T$ down to $s=1$ yields
\begin{align*}
\mathbb{E}[\gamma_{M,N}(\mathbf{1}_M)]
=
\mathbb{E}[\gamma_{0,N}(h_0)].
\end{align*}
By the initial unbiasedness identity,
\begin{align*}
\mathbb{E}[\gamma_{0,N}(h_0)]
=
\gamma_0(h_0).
\end{align*}
Combining these identities,
\begin{align*}
\mathbb{E}[\widehat Z_{M,N}]
=
\mathbb{E}[\gamma_{M,N}(\mathbf{1}_M)]
=
\gamma_M(\mathbf{1}_M)
=
Z_M.
\end{align*}
In the hidden Markov model interpretation, $p(y_{0:M})$ denotes the marginal likelihood of the fixed observations obtained by integrating the joint latent-observation density over the latent path, so $Z_M=p(y_{0:M})$.
[/step]