Variance of the Standard Importance Sampling Estimator

Variance of the Standard Importance Sampling Estimator (Theorem # 7207)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We rewrite the estimator as the average of independent identically distributed real random variables $X_i=g(Y_i)$, where $g:S\to\mathbb R$ is the real-valued measurable representative of $hL$ fixed in the statement. The Radon--Nikodym identity converts expectations under the sampling law $\mathbb Q$ into integrals under the target law $\mathbb P$, giving unbiasedness and also showing that $I$ is finite. The square-integrability assumption gives finite second moments, so the variance of the average is computed by expanding the square and using independence to eliminate cross-covariances. [/proofplan] [step:Define the weighted summands and verify their second moments] For each $i\in\{1,\dots,N\}$, define the real-valued [random variable](/page/Random%20Variable) \begin{align*} X_i:\Omega \to \mathbb R \end{align*} by \begin{align*} X_i(\omega):=g(Y_i(\omega)). \end{align*} Since $g:S\to\mathbb R$ is $\mathcal S$-measurable and real-valued, each $X_i=g\circ Y_i$ is $\mathcal F$-measurable and real-valued on all of $\Omega$. Because $Y_i$ has law $\mathbb Q$, the change-of-variables formula for the law of a random variable gives \begin{align*} \mathbb E_{\mathbb P_0}[X_i^2]=\int_S g(s)^2\,d\mathbb Q(s)=\int_S h(s)^2L(s)^2\,d\mathbb Q(s)<\infty, \end{align*} where the middle equality uses $g=hL$ $\mathbb Q$-a.e. Thus $X_i\in L^2(\Omega,\mathcal F,\mathbb P_0)$ for every $i$, and therefore \begin{align*} \widehat I_{N,\mathrm{IS}}=\frac{1}{N}\sum_{i=1}^N X_i \end{align*} also belongs to $L^2(\Omega,\mathcal F,\mathbb P_0)$. [/step] [step:Use the Radon--Nikodym derivative to identify the mean] Since $\mathbb P \ll \mathbb Q$ and $L=d\mathbb P/d\mathbb Q$, the Radon--Nikodym integration identity gives, first for $|h|$ and then for $h$, \begin{align*} \int_S |h(s)|\,d\mathbb P(s)=\int_S |h(s)|L(s)\,d\mathbb Q(s). \end{align*} The right-hand side is finite by Cauchy--Schwarz on the probability space $(S,\mathcal S,\mathbb Q)$: \begin{align*} \int_S |h(s)|L(s)\,d\mathbb Q(s)\le \left(\int_S h(s)^2L(s)^2\,d\mathbb Q(s)\right)^{1/2}\left(\int_S 1\,d\mathbb Q(s)\right)^{1/2}<\infty. \end{align*} Hence $h\in L^1(S,\mathcal S,\mathbb P)$, so $I$ is finite. For each $i$, using again that $Y_i$ has law $\mathbb Q$, using $g=hL$ $\mathbb Q$-a.e., and then applying the Radon--Nikodym identity, \begin{align*} \mathbb E_{\mathbb P_0}[X_i]=\int_S g(s)\,d\mathbb Q(s)=\int_S h(s)L(s)\,d\mathbb Q(s)=\int_S h(s)\,d\mathbb P(s)=I. \end{align*} Therefore \begin{align*} \mathbb E_{\mathbb P_0}[\widehat I_{N,\mathrm{IS}}]=\frac{1}{N}\sum_{i=1}^N \mathbb E_{\mathbb P_0}[X_i]=I. \end{align*} [guided] The estimator is unbiased because the likelihood ratio $L$ changes the sampling measure $\mathbb Q$ back into the target measure $\mathbb P$. We must first check that the target integral is a genuine finite number. Since $L=d\mathbb P/d\mathbb Q$, the Radon--Nikodym integration identity says that for every non-negative measurable function $g:S\to[0,\infty]$, \begin{align*} \int_S g(s)\,d\mathbb P(s)=\int_S g(s)L(s)\,d\mathbb Q(s). \end{align*} Apply this with $g=|h|$. We obtain \begin{align*} \int_S |h(s)|\,d\mathbb P(s)=\int_S |h(s)|L(s)\,d\mathbb Q(s). \end{align*} The assumption $hL\in L^2(S,\mathcal S,\mathbb Q)$ implies $hL\in L^1(S,\mathcal S,\mathbb Q)$ because $\mathbb Q(S)=1$. More explicitly, Cauchy--Schwarz applied to the functions $|h|L$ and $1$ on the probability space $(S,\mathcal S,\mathbb Q)$ gives \begin{align*} \int_S |h(s)|L(s)\,d\mathbb Q(s)\le \left(\int_S h(s)^2L(s)^2\,d\mathbb Q(s)\right)^{1/2}\left(\int_S 1\,d\mathbb Q(s)\right)^{1/2}<\infty. \end{align*} Thus $h$ is $\mathbb P$-integrable, and the number \begin{align*} I=\int_S h(s)\,d\mathbb P(s) \end{align*} is well-defined. Now fix $i\in\{1,\dots,N\}$. Since $Y_i$ has law $\mathbb Q$, the expectation of the composition $g\circ Y_i$ is the integral of $g$ against $\mathbb Q$: \begin{align*} \mathbb E_{\mathbb P_0}[X_i]=\int_S g(s)\,d\mathbb Q(s)=\int_S h(s)L(s)\,d\mathbb Q(s), \end{align*} where the second equality uses $g=hL$ $\mathbb Q$-a.e. Applying the Radon--Nikodym identity to the integrable function $h$ gives \begin{align*} \int_S h(s)L(s)\,d\mathbb Q(s)=\int_S h(s)\,d\mathbb P(s)=I. \end{align*} Therefore every summand has mean $I$. Since the estimator is the arithmetic average of these summands, \begin{align*} \mathbb E_{\mathbb P_0}[\widehat I_{N,\mathrm{IS}}]=\mathbb E_{\mathbb P_0}\left[\frac{1}{N}\sum_{i=1}^N X_i\right]=\frac{1}{N}\sum_{i=1}^N \mathbb E_{\mathbb P_0}[X_i]=I. \end{align*} [/guided] [/step] [step:Use independence to reduce the variance to one summand] Because $Y_1,\dots,Y_N$ are independent and each $X_i$ is obtained from $Y_i$ by the same measurable map $g:S\to\mathbb R$, the random variables $X_1,\dots,X_N$ are independent and identically distributed. Since each $X_i\in L^2(\Omega,\mathcal F,\mathbb P_0)$, the variance of their average can be computed by expanding the square: \begin{align*} \operatorname{Var}_{\mathbb P_0}(\widehat I_{N,\mathrm{IS}})=\mathbb E_{\mathbb P_0}\left[\left(\frac{1}{N}\sum_{i=1}^N (X_i-I)\right)^2\right]. \end{align*} Expanding the finite square gives \begin{align*} \operatorname{Var}_{\mathbb P_0}(\widehat I_{N,\mathrm{IS}})=\frac{1}{N^2}\sum_{i=1}^N \mathbb E_{\mathbb P_0}[(X_i-I)^2]+\frac{2}{N^2}\sum_{1\le i<j\le N}\mathbb E_{\mathbb P_0}[(X_i-I)(X_j-I)]. \end{align*} For $i\ne j$, independence gives \begin{align*} \mathbb E_{\mathbb P_0}[(X_i-I)(X_j-I)]=\mathbb E_{\mathbb P_0}[X_i-I]\mathbb E_{\mathbb P_0}[X_j-I]=0. \end{align*} Since the $X_i$ are identically distributed, \begin{align*} \operatorname{Var}_{\mathbb P_0}(\widehat I_{N,\mathrm{IS}})=\frac{1}{N}\operatorname{Var}_{\mathbb P_0}(X_1). \end{align*} [/step] [step:Expand the single-summand variance] By the definition of variance and the already established identity $\mathbb E_{\mathbb P_0}[X_1]=I$, \begin{align*} \operatorname{Var}_{\mathbb P_0}(X_1)=\mathbb E_{\mathbb P_0}[X_1^2]-I^2. \end{align*} Since $Y_1$ has law $\mathbb Q$ and $g=hL$ $\mathbb Q$-a.e., \begin{align*} \mathbb E_{\mathbb P_0}[X_1^2]=\int_S g(s)^2\,d\mathbb Q(s)=\int_S h(s)^2L(s)^2\,d\mathbb Q(s). \end{align*} Combining this with the previous step yields \begin{align*} \operatorname{Var}_{\mathbb P_0}(\widehat I_{N,\mathrm{IS}})=\frac{1}{N}\left(\int_S h(s)^2L(s)^2\,d\mathbb Q(s)-I^2\right). \end{align*} Together with the unbiasedness already proved, this is exactly the asserted formula for the standard [importance sampling estimator](/theorems/2001). [/step]

Explore Further

Ackermann Formula applied Particle Marginal Metropolis-Hastings Correctness Theorem applied Coordinate Euler-Lagrange Equations applied Polynomial-Time Algorithm for an NP-Hard Language Implies $P = NP$ applied 3-SAT Is NP-Complete applied Reduced Radial Schrödinger Equation applied Kalman Decomposition Theorem applied Continuous-Time Kalman Filter Algebraic Riccati Theorem applied

What brings you to Androma?

Start with a route through the knowledge graph.