[proofplan]
We rewrite the estimator as the average of independent identically distributed real random variables $X_i=g(Y_i)$, where $g:S\to\mathbb R$ is the real-valued measurable representative of $hL$ fixed in the statement. The Radon--Nikodym identity converts expectations under the sampling law $\mathbb Q$ into integrals under the target law $\mathbb P$, giving unbiasedness and also showing that $I$ is finite. The square-integrability assumption gives finite second moments, so the variance of the average is computed by expanding the square and using independence to eliminate cross-covariances.
[/proofplan]
[step:Define the weighted summands and verify their second moments]
For each $i\in\{1,\dots,N\}$, define the real-valued [random variable](/page/Random%20Variable)
\begin{align*}
X_i:\Omega \to \mathbb R
\end{align*}
by
\begin{align*}
X_i(\omega):=g(Y_i(\omega)).
\end{align*}
Since $g:S\to\mathbb R$ is $\mathcal S$-measurable and real-valued, each $X_i=g\circ Y_i$ is $\mathcal F$-measurable and real-valued on all of $\Omega$. Because $Y_i$ has law $\mathbb Q$, the change-of-variables formula for the law of a random variable gives
\begin{align*}
\mathbb E_{\mathbb P_0}[X_i^2]=\int_S g(s)^2\,d\mathbb Q(s)=\int_S h(s)^2L(s)^2\,d\mathbb Q(s)<\infty,
\end{align*}
where the middle equality uses $g=hL$ $\mathbb Q$-a.e. Thus $X_i\in L^2(\Omega,\mathcal F,\mathbb P_0)$ for every $i$, and therefore
\begin{align*}
\widehat I_{N,\mathrm{IS}}=\frac{1}{N}\sum_{i=1}^N X_i
\end{align*}
also belongs to $L^2(\Omega,\mathcal F,\mathbb P_0)$.
[/step]
[step:Use the Radon--Nikodym derivative to identify the mean]
Since $\mathbb P \ll \mathbb Q$ and $L=d\mathbb P/d\mathbb Q$, the Radon--Nikodym integration identity gives, first for $|h|$ and then for $h$,
\begin{align*}
\int_S |h(s)|\,d\mathbb P(s)=\int_S |h(s)|L(s)\,d\mathbb Q(s).
\end{align*}
The right-hand side is finite by Cauchy--Schwarz on the probability space $(S,\mathcal S,\mathbb Q)$:
\begin{align*}
\int_S |h(s)|L(s)\,d\mathbb Q(s)\le \left(\int_S h(s)^2L(s)^2\,d\mathbb Q(s)\right)^{1/2}\left(\int_S 1\,d\mathbb Q(s)\right)^{1/2}<\infty.
\end{align*}
Hence $h\in L^1(S,\mathcal S,\mathbb P)$, so $I$ is finite. For each $i$, using again that $Y_i$ has law $\mathbb Q$, using $g=hL$ $\mathbb Q$-a.e., and then applying the Radon--Nikodym identity,
\begin{align*}
\mathbb E_{\mathbb P_0}[X_i]=\int_S g(s)\,d\mathbb Q(s)=\int_S h(s)L(s)\,d\mathbb Q(s)=\int_S h(s)\,d\mathbb P(s)=I.
\end{align*}
Therefore
\begin{align*}
\mathbb E_{\mathbb P_0}[\widehat I_{N,\mathrm{IS}}]=\frac{1}{N}\sum_{i=1}^N \mathbb E_{\mathbb P_0}[X_i]=I.
\end{align*}
[guided]
The estimator is unbiased because the likelihood ratio $L$ changes the sampling measure $\mathbb Q$ back into the target measure $\mathbb P$. We must first check that the target integral is a genuine finite number. Since $L=d\mathbb P/d\mathbb Q$, the Radon--Nikodym integration identity says that for every non-negative measurable function $g:S\to[0,\infty]$,
\begin{align*}
\int_S g(s)\,d\mathbb P(s)=\int_S g(s)L(s)\,d\mathbb Q(s).
\end{align*}
Apply this with $g=|h|$. We obtain
\begin{align*}
\int_S |h(s)|\,d\mathbb P(s)=\int_S |h(s)|L(s)\,d\mathbb Q(s).
\end{align*}
The assumption $hL\in L^2(S,\mathcal S,\mathbb Q)$ implies $hL\in L^1(S,\mathcal S,\mathbb Q)$ because $\mathbb Q(S)=1$. More explicitly, Cauchy--Schwarz applied to the functions $|h|L$ and $1$ on the probability space $(S,\mathcal S,\mathbb Q)$ gives
\begin{align*}
\int_S |h(s)|L(s)\,d\mathbb Q(s)\le \left(\int_S h(s)^2L(s)^2\,d\mathbb Q(s)\right)^{1/2}\left(\int_S 1\,d\mathbb Q(s)\right)^{1/2}<\infty.
\end{align*}
Thus $h$ is $\mathbb P$-integrable, and the number
\begin{align*}
I=\int_S h(s)\,d\mathbb P(s)
\end{align*}
is well-defined.
Now fix $i\in\{1,\dots,N\}$. Since $Y_i$ has law $\mathbb Q$, the expectation of the composition $g\circ Y_i$ is the integral of $g$ against $\mathbb Q$:
\begin{align*}
\mathbb E_{\mathbb P_0}[X_i]=\int_S g(s)\,d\mathbb Q(s)=\int_S h(s)L(s)\,d\mathbb Q(s),
\end{align*}
where the second equality uses $g=hL$ $\mathbb Q$-a.e. Applying the Radon--Nikodym identity to the integrable function $h$ gives
\begin{align*}
\int_S h(s)L(s)\,d\mathbb Q(s)=\int_S h(s)\,d\mathbb P(s)=I.
\end{align*}
Therefore every summand has mean $I$. Since the estimator is the arithmetic average of these summands,
\begin{align*}
\mathbb E_{\mathbb P_0}[\widehat I_{N,\mathrm{IS}}]=\mathbb E_{\mathbb P_0}\left[\frac{1}{N}\sum_{i=1}^N X_i\right]=\frac{1}{N}\sum_{i=1}^N \mathbb E_{\mathbb P_0}[X_i]=I.
\end{align*}
[/guided]
[/step]
[step:Use independence to reduce the variance to one summand]
Because $Y_1,\dots,Y_N$ are independent and each $X_i$ is obtained from $Y_i$ by the same measurable map $g:S\to\mathbb R$, the random variables $X_1,\dots,X_N$ are independent and identically distributed. Since each $X_i\in L^2(\Omega,\mathcal F,\mathbb P_0)$, the variance of their average can be computed by expanding the square:
\begin{align*}
\operatorname{Var}_{\mathbb P_0}(\widehat I_{N,\mathrm{IS}})=\mathbb E_{\mathbb P_0}\left[\left(\frac{1}{N}\sum_{i=1}^N (X_i-I)\right)^2\right].
\end{align*}
Expanding the finite square gives
\begin{align*}
\operatorname{Var}_{\mathbb P_0}(\widehat I_{N,\mathrm{IS}})=\frac{1}{N^2}\sum_{i=1}^N \mathbb E_{\mathbb P_0}[(X_i-I)^2]+\frac{2}{N^2}\sum_{1\le i<j\le N}\mathbb E_{\mathbb P_0}[(X_i-I)(X_j-I)].
\end{align*}
For $i\ne j$, independence gives
\begin{align*}
\mathbb E_{\mathbb P_0}[(X_i-I)(X_j-I)]=\mathbb E_{\mathbb P_0}[X_i-I]\mathbb E_{\mathbb P_0}[X_j-I]=0.
\end{align*}
Since the $X_i$ are identically distributed,
\begin{align*}
\operatorname{Var}_{\mathbb P_0}(\widehat I_{N,\mathrm{IS}})=\frac{1}{N}\operatorname{Var}_{\mathbb P_0}(X_1).
\end{align*}
[/step]
[step:Expand the single-summand variance]
By the definition of variance and the already established identity $\mathbb E_{\mathbb P_0}[X_1]=I$,
\begin{align*}
\operatorname{Var}_{\mathbb P_0}(X_1)=\mathbb E_{\mathbb P_0}[X_1^2]-I^2.
\end{align*}
Since $Y_1$ has law $\mathbb Q$ and $g=hL$ $\mathbb Q$-a.e.,
\begin{align*}
\mathbb E_{\mathbb P_0}[X_1^2]=\int_S g(s)^2\,d\mathbb Q(s)=\int_S h(s)^2L(s)^2\,d\mathbb Q(s).
\end{align*}
Combining this with the previous step yields
\begin{align*}
\operatorname{Var}_{\mathbb P_0}(\widehat I_{N,\mathrm{IS}})=\frac{1}{N}\left(\int_S h(s)^2L(s)^2\,d\mathbb Q(s)-I^2\right).
\end{align*}
Together with the unbiasedness already proved, this is exactly the asserted formula for the standard [importance sampling estimator](/theorems/2001).
[/step]