[proofplan]
We first check that the weight function $w$ is measurable and compute its expectation under the proposal law $\mathbb Q$. The defining support condition $\gamma=0$ on $\{q=0\}$ up to a $\mu$-null set ensures that the integral of $w$ against $q\,d\mu$ recovers the whole normalizing constant $Z$, rather than only the part of $\gamma$ on $\{q>0\}$. This gives integrability and unbiasedness by linearity of expectation. The almost sure convergence is then the [Strong Law of Large Numbers](/theorems/520) applied to the independent identically distributed sequence $(w(Y_i))_{i\in\mathbb N}$.
[/proofplan]
[step:Verify that the importance weight is a measurable nonnegative function]
Let
\begin{align*}
A:=\{y\in S:q(y)>0\}.
\end{align*}
Since $q:S\to[0,\infty)$ is $\mathcal S$-measurable, the set $A$ belongs to $\mathcal S$. Let $\mathcal S|_A:=\{E\cap A:E\in\mathcal S\}$ denote the trace $\sigma$-algebra on $A$. The restriction $q|_A:A\to(0,\infty)$ is $\mathcal S|_A$-measurable, and since the reciprocal map $(0,\infty)\to(0,\infty)$, $t\mapsto 1/t$, is continuous and hence Borel measurable, the function $1/q|_A:A\to(0,\infty)$ is $\mathcal S|_A$-measurable. The restriction $\gamma|_A$ is also $\mathcal S|_A$-measurable, so the product of $\gamma|_A$ and $1/q|_A$, which is the restriction $w|_A:A\to[0,\infty)$, is $\mathcal S|_A$-measurable. On $S\setminus A=\{q=0\}$, the function $w$ is defined to be $0$. Hence, for every Borel set $B\subseteq[0,\infty)$, the preimage $w^{-1}(B)$ equals $(w|_A)^{-1}(B)$ if $0\notin B$ and $(w|_A)^{-1}(B)\cup(S\setminus A)$ if $0\in B$; in both cases it belongs to $\mathcal S$, because $A\in\mathcal S$ implies $\mathcal S|_A\subseteq\mathcal S$. Therefore $w:S\to[0,\infty)$ is $\mathcal S$-measurable and nonnegative.
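As an illustration only, under choices that are assumptions and not part of the theorem, take $S=\mathbb R$ with $\mu$ the Lebesgue measure, $\gamma(y)=e^{-y}\mathbf 1_{[0,\infty)}(y)$ and $q(y)=\tfrac12 e^{-y/2}\mathbf 1_{[0,\infty)}(y)$. Then $A=[0,\infty)$ and the weight function is
\begin{align*}
w(y)=\begin{cases}\dfrac{e^{-y}}{\tfrac12 e^{-y/2}}=2e^{-y/2}, & y\ge 0,\\[1ex] 0, & y<0.\end{cases}
\end{align*}
In this example $w$ is continuous on each of the two Borel sets $[0,\infty)$ and $(-\infty,0)$, so it is Borel measurable and nonnegative, in line with the general argument.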
[/step]
[step:Compute the proposal expectation and recover the full normalizing constant]
Because $Y_1$ has law $\mathbb Q$ and $w:S\to[0,\infty)$ is measurable, the change-of-variables formula for the law of $Y_1$ gives
\begin{align*}
\mathbb E[w(Y_1)]=\int_S w(y)\,d\mathbb Q(y).
\end{align*}
By the definition of $\mathbb Q$ as the measure with density $q$ with respect to $\mu$,
\begin{align*}
\int_S w(y)\,d\mathbb Q(y)=\int_S w(y)q(y)\,d\mu(y).
\end{align*}
Using the definition of $w$ on $A=\{q>0\}$ and on $S\setminus A=\{q=0\}$, we have $w(y)q(y)=\gamma(y)$ for every $y\in A$ and $w(y)q(y)=0$ for every $y\in S\setminus A$. Hence
\begin{align*}
\int_S w(y)q(y)\,d\mu(y)=\int_A \gamma(y)\,d\mu(y).
\end{align*}
The hypothesis $\gamma=0$ $\mu$-almost everywhere on $S\setminus A$ gives
\begin{align*}
\int_{S\setminus A}\gamma(y)\,d\mu(y)=0.
\end{align*}
Since $\gamma$ is nonnegative and measurable, additivity of the integral over the disjoint sets $A$ and $S\setminus A$ gives
\begin{align*}
\int_A \gamma(y)\,d\mu(y)=\int_S \gamma(y)\,d\mu(y)=Z.
\end{align*}
Therefore
\begin{align*}
\mathbb E[w(Y_1)]=Z.
\end{align*}
Since $Z<\infty$ and $w(Y_1)\ge 0$, the [random variable](/page/Random%20Variable) $w(Y_1):\Omega\to[0,\infty)$ is integrable.
[guided]
The purpose of this step is to justify the identity behind importance sampling: averaging $\gamma/q$ under the proposal law with density $q$ should reproduce the integral of $\gamma$ under $\mu$. We must check this carefully because the quotient $\gamma/q$ is only meaningful on the set where $q>0$.
Let
\begin{align*}
A:=\{y\in S:q(y)>0\}.
\end{align*}
The random variable $Y_1:\Omega\to S$ has law $\mathbb Q$, and $w:S\to[0,\infty)$ is measurable. Therefore the change-of-variables formula for the law of a random variable gives
\begin{align*}
\mathbb E[w(Y_1)]=\int_S w(y)\,d\mathbb Q(y).
\end{align*}
The measure $\mathbb Q$ was defined by the density $q$ with respect to $\mu$, meaning that integration of any nonnegative measurable function $h:S\to[0,\infty]$ against $\mathbb Q$ is integration of $hq$ against $\mu$. Applying this with $h=w$ gives
\begin{align*}
\int_S w(y)\,d\mathbb Q(y)=\int_S w(y)q(y)\,d\mu(y).
\end{align*}
Now split the computation according to whether $q$ is positive or zero. If $y\in A$, then $w(y)=\gamma(y)/q(y)$, so
\begin{align*}
w(y)q(y)=\gamma(y).
\end{align*}
If $y\in S\setminus A$, then $q(y)=0$ and $w(y)=0$, so
\begin{align*}
w(y)q(y)=0.
\end{align*}
Thus the integral of $wq$ sees exactly the part of $\gamma$ lying on $A$:
\begin{align*}
\int_S w(y)q(y)\,d\mu(y)=\int_A \gamma(y)\,d\mu(y).
\end{align*}
This is where the support hypothesis is used. The theorem assumes that $\gamma=0$ $\mu$-almost everywhere on $S\setminus A=\{q=0\}$. Since $\gamma$ is nonnegative, this implies
\begin{align*}
\int_{S\setminus A}\gamma(y)\,d\mu(y)=0.
\end{align*}
Consequently no positive contribution to the normalizing constant is lost by sampling from $\mathbb Q$, and
\begin{align*}
\int_A \gamma(y)\,d\mu(y)=\int_S \gamma(y)\,d\mu(y)=Z.
\end{align*}
Combining the displayed equalities gives
\begin{align*}
\mathbb E[w(Y_1)]=Z.
\end{align*}
Finally, since $w(Y_1)\ge 0$ and its expectation is the finite number $Z$, the random variable $w(Y_1):\Omega\to[0,\infty)$ is integrable.
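The computation can be checked on a concrete case. The choices here are assumptions made for illustration and are not part of the theorem: $S=\mathbb R$, $\mu$ the Lebesgue measure, $\gamma(y)=e^{-y}\mathbf 1_{[0,\infty)}(y)$, so that $Z=1$, and $q(y)=\tfrac12 e^{-y/2}\mathbf 1_{[0,\infty)}(y)$. Then $A=[0,\infty)$, $w(y)=2e^{-y/2}$ on $A$, and
\begin{align*}
\mathbb E[w(Y_1)]=\int_0^\infty 2e^{-y/2}\cdot\tfrac12 e^{-y/2}\,dy=\int_0^\infty e^{-y}\,dy=1=Z.
\end{align*}
If the proposal density is instead taken to be $q=\mathbf 1_{[0,1]}$, then $A=[0,1]$ and the support hypothesis fails, because $\gamma>0$ on $(1,\infty)$, a set of positive Lebesgue measure contained in $\{q=0\}$. In that case $w(y)=e^{-y}$ on $[0,1]$ and
\begin{align*}
\int_S w(y)q(y)\,d\mu(y)=\int_0^1 e^{-y}\,dy=1-e^{-1}<1=Z,
\end{align*}
so the expectation of the weight recovers only the mass of $\gamma$ on $A$.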
[/guided]
[/step]
[step:Use linearity of expectation to prove unbiasedness of the sample average]
For each $i\in\{1,\dots,N\}$, the random variable $Y_i$ has law $\mathbb Q$, so the computation above gives
\begin{align*}
\mathbb E[w(Y_i)]=Z.
\end{align*}
By the definition of $\widehat Z_N$,
\begin{align*}
\mathbb E[\widehat Z_N]=\mathbb E\left[\frac{1}{N}\sum_{i=1}^N w(Y_i)\right].
\end{align*}
Since each $w(Y_i)$ is integrable, linearity of expectation gives
\begin{align*}
\mathbb E[\widehat Z_N]=\frac{1}{N}\sum_{i=1}^N \mathbb E[w(Y_i)].
\end{align*}
Substituting $\mathbb E[w(Y_i)]=Z$ for every $i$ yields
\begin{align*}
\mathbb E[\widehat Z_N]=\frac{1}{N}\sum_{i=1}^N Z=Z.
\end{align*}
This proves that $\widehat Z_N$ is an unbiased estimator of $Z$.
[/step]
[step:Apply the Strong Law of Large Numbers to obtain almost sure convergence]
For each $i\in\mathbb N$, define the real-valued random variable $X_i:\Omega\to[0,\infty)$ by
\begin{align*}
X_i(\omega):=w(Y_i(\omega)).
\end{align*}
Because $Y_1,Y_2,\dots$ are independent and identically distributed with common law $\mathbb Q$, and each $X_i$ is the image of $Y_i$ under the same measurable function $w$, the random variables $X_1,X_2,\dots$ are independent and identically distributed. From the previous computation, $X_1=w(Y_1)$ is integrable and
\begin{align*}
\mathbb E[X_1]=Z.
\end{align*}
By the [Strong Law of Large Numbers](/theorems/1852) for integrable i.i.d. real-valued random variables, applied to the sequence $(X_i)_{i\in\mathbb N}$,
\begin{align*}
\frac{1}{N}\sum_{i=1}^N X_i\to \mathbb E[X_1]
\end{align*}
$\mathbb P$-almost surely as $N\to\infty$. Since
\begin{align*}
\frac{1}{N}\sum_{i=1}^N X_i=\widehat Z_N
\end{align*}
and $\mathbb E[X_1]=Z$, we conclude that
\begin{align*}
\widehat Z_N\to Z
\end{align*}
$\mathbb P$-almost surely as $N\to\infty$. This completes the proof.
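The argument uses only the integrability of $w(Y_1)$, not a finite second moment. A worked example makes this visible; the choices here are assumptions for illustration and are not part of the theorem. Take $S=\mathbb R$, $\mu$ the Lebesgue measure, $\gamma(y)=e^{-y}\mathbf 1_{[0,\infty)}(y)$, so that $Z=1$, and $q(y)=2e^{-2y}\mathbf 1_{[0,\infty)}(y)$. Then $w(y)=\tfrac12 e^{y}$ for $y\ge 0$ and $w(y)=0$ for $y<0$, and
\begin{align*}
\mathbb E[w(Y_1)]&=\int_0^\infty \tfrac12 e^{y}\cdot 2e^{-2y}\,dy=\int_0^\infty e^{-y}\,dy=1,\\
\mathbb E[w(Y_1)^2]&=\int_0^\infty \tfrac14 e^{2y}\cdot 2e^{-2y}\,dy=\int_0^\infty \tfrac12\,dy=\infty.
\end{align*}
In this example the weights have infinite variance, yet the hypotheses of the theorem hold, so $\widehat Z_N\to 1$ $\mathbb P$-almost surely.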
[/step]