[proofplan]
We first remove the stationary mean of $h$, since adding a constant to the summands does not change the variance. On the finite-dimensional [Hilbert space](/page/Hilbert%20Space) of $\pi$-mean-zero functions, reversibility makes each transition kernel a self-adjoint operator, and irreducibility plus aperiodicity force all eigenvalues to lie strictly between $-1$ and $1$. This gives the resolvent formula $\sigma_P^2(g)=\langle g,(I+P)(I-P)^{-1}g\rangle_\pi$ for centered $g$. Finally, the off-diagonal dominance assumption gives the Dirichlet form ordering $I-P_1 \ge I-P_2$, and inversion reverses the order for positive definite [self-adjoint operators](/page/Self-Adjoint%20Operators), yielding the desired comparison.
[/proofplan]
[step:Center the observable without changing the variance]
Define the stationary mean of $h$ by
\begin{align*}
\pi(h) := \sum_{x \in E} h(x)\pi(x).
\end{align*}
Define the centered function $g:E \to \mathbb{R}$ by $g(x):=h(x)-\pi(h)$ for $x \in E$. For every $N \in \mathbb{N}$,
\begin{align*}
\frac{1}{N}\sum_{n=0}^{N-1} h(X_n) = \pi(h) + \frac{1}{N}\sum_{n=0}^{N-1} g(X_n).
\end{align*}
Since variance is invariant under addition of a deterministic constant,
\begin{align*}
\operatorname{Var}\left(\frac{1}{N}\sum_{n=0}^{N-1} h(X_n)\right) = \operatorname{Var}\left(\frac{1}{N}\sum_{n=0}^{N-1} g(X_n)\right).
\end{align*}
It is therefore enough to prove the result for functions $g:E \to \mathbb{R}$ satisfying $\pi(g)=0$.
[/step]
[step:Represent each reversible kernel on the mean-zero Hilbert space]
Let $L^2(\pi)$ be the real Hilbert space of functions $f:E \to \mathbb{R}$ with [inner product](/page/Inner%20Product)
\begin{align*}
\langle f,k\rangle_\pi := \sum_{x \in E} f(x)k(x)\pi(x).
\end{align*}
Let
\begin{align*}
L^2_0(\pi) := \{f:E \to \mathbb{R} : \langle f,\mathbb{1}_E\rangle_\pi = 0\}
\end{align*}
be the closed subspace of $\pi$-mean-zero functions, where $\mathbb{1}_E:E \to \mathbb{R}$ denotes the constant function $\mathbb{1}_E(x)=1$.
For a Markov kernel $P:E \times E \to [0,1]$ reversible with respect to $\pi$, define the linear operator
\begin{align*}
T_P:L^2(\pi) &\to L^2(\pi)
\end{align*}
\begin{align*}
f &\mapsto T_Pf
\end{align*}
by
\begin{align*}
(T_Pf)(x) := \sum_{y \in E} P(x,y)f(y).
\end{align*}
Reversibility gives, for all $f,k \in L^2(\pi)$,
\begin{align*}
\langle T_P f,k\rangle_\pi = \sum_{x,y \in E} P(x,y)f(y)k(x)\pi(x).
\end{align*}
Using $\pi(x)P(x,y)=\pi(y)P(y,x)$ and exchanging the names of $x$ and $y$ gives
\begin{align*}
\langle T_P f,k\rangle_\pi = \sum_{x,y \in E} f(x)P(x,y)k(y)\pi(x) = \langle f,T_P k\rangle_\pi.
\end{align*}
Thus $T_P$ is self-adjoint on $L^2(\pi)$.
Since $\pi$ is stationary for $P$ by reversibility and stochasticity, $T_P$ preserves $L^2_0(\pi)$. We write $P$ also for the restricted self-adjoint operator
\begin{align*}
P:L^2_0(\pi) \to L^2_0(\pi).
\end{align*}
[/step]
[step:Show that the resolvent formula gives the asymptotic variance]
Let $P:E \times E \to [0,1]$ be irreducible, aperiodic, and reversible with respect to $\pi$, and let $g \in L^2_0(\pi)$. Since $E$ is finite, $L^2_0(\pi)$ is finite-dimensional. By the finite-dimensional [spectral theorem for self-adjoint operators](/theorems/6911), the self-adjoint operator $P:L^2_0(\pi) \to L^2_0(\pi)$ admits an orthonormal eigenbasis $(e_j)_{j=1}^m$ with real eigenvalues $(\lambda_j)_{j=1}^m$.
We record why these eigenvalues lie strictly inside $(-1,1)$. If $Pf=\lambda f$ on $L^2(\pi)$ and $x_0 \in E$ satisfies $|f(x_0)|=\max_{x \in E}|f(x)|$, then stochasticity gives $|\lambda|\le 1$ whenever $f \ne 0$; hence every eigenvalue of the finite Markov operator has modulus at most $1$. If $Pf=f$, then the maximum principle and irreducibility force $f$ to be constant on $E$, so the eigenspace for $1$ on $L^2(\pi)$ is $\operatorname{span}\{\mathbb{1}_E\}$; therefore $1$ is not an eigenvalue on $L^2_0(\pi)$. If $Pf=-f$, then equality in the same maximum-modulus argument forces the chain to alternate signs on every possible transition, which partitions the communicating class into two cyclic classes of period $2$; this contradicts aperiodicity. Thus
\begin{align*}
-1 < \lambda_j < 1
\end{align*}
for every $j \in \{1,\dots,m\}$.
Write
\begin{align*}
g = \sum_{j=1}^m a_j e_j,
\end{align*}
where $a_j := \langle g,e_j\rangle_\pi$. Let $\mathbb{P}$ denote the law of the stationary Markov chain with transition kernel $P$ and initial law $\pi$, and let $\mathbb{E}$ denote expectation with respect to $\mathbb{P}$. For square-integrable real random variables $Y,Z:\Omega \to \mathbb{R}$, define
\begin{align*}
\operatorname{Cov}(Y,Z) := \mathbb{E}[(Y-\mathbb{E}[Y])(Z-\mathbb{E}[Z])].
\end{align*}
Stationarity and the Markov property give
\begin{align*}
\operatorname{Cov}(g(X_0),g(X_k)) = \langle g,P^k g\rangle_\pi
\end{align*}
for every $k \in \mathbb{N}\cup\{0\}$. Hence, for $N \in \mathbb{N}$,
\begin{align*}
N\operatorname{Var}\left(\frac{1}{N}\sum_{n=0}^{N-1}g(X_n)\right) = \sum_{k=-(N-1)}^{N-1}\left(1-\frac{|k|}{N}\right)\langle g,P^{|k|}g\rangle_\pi.
\end{align*}
Using the eigenbasis expansion,
\begin{align*}
\langle g,P^k g\rangle_\pi = \sum_{j=1}^m a_j^2\lambda_j^k.
\end{align*}
Since $|\lambda_j|<1$, the finite geometric series limit gives
\begin{align*}
\lim_{N \to \infty} \sum_{k=-(N-1)}^{N-1}\left(1-\frac{|k|}{N}\right)\lambda_j^{|k|} = 1 + 2\sum_{k=1}^{\infty}\lambda_j^k = \frac{1+\lambda_j}{1-\lambda_j}.
\end{align*}
Therefore the stationary asymptotic variance exists and satisfies
\begin{align*}
\sigma_P^2(g) = \sum_{j=1}^m a_j^2\frac{1+\lambda_j}{1-\lambda_j}.
\end{align*}
Since $I-P$ is positive definite on $L^2_0(\pi)$, this is equivalently
\begin{align*}
\sigma_P^2(g)=\langle g,(I+P)(I-P)^{-1}g\rangle_\pi.
\end{align*}
[guided]
We prove the asymptotic variance identity from first principles in the finite-state setting. The advantage of working on $L^2_0(\pi)$ is that the constant eigenfunction has been removed, so the problematic eigenvalue $1$ no longer appears.
Because $P$ is reversible with respect to $\pi$, the operator $P:L^2_0(\pi)\to L^2_0(\pi)$ is self-adjoint. Since $L^2_0(\pi)$ is finite-dimensional, the finite-dimensional spectral theorem for self-adjoint operators gives an [orthonormal basis](/page/Orthonormal%20Basis) $(e_j)_{j=1}^m$ of $L^2_0(\pi)$ and [real numbers](/page/Real%20Numbers) $\lambda_1,\dots,\lambda_m$ such that $Pe_j=\lambda_j e_j$ for each $j$.
We now verify the spectral restrictions used below. If $Pf=\lambda f$ on $L^2(\pi)$ and $x_0 \in E$ is chosen so that $|f(x_0)|=\max_{x \in E}|f(x)|$, then
\begin{align*}
|\lambda| |f(x_0)| = |(Pf)(x_0)| \le \sum_{y \in E}P(x_0,y)|f(y)| \le |f(x_0)|.
\end{align*}
For a nonzero eigenfunction this gives $|\lambda|\le 1$. If $\lambda=1$, equality in the maximum principle and irreducibility force $f$ to be constant on all states reachable from $x_0$, hence on all of $E$; therefore the only eigenfunctions with eigenvalue $1$ are constants, and no nonzero constant lies in $L^2_0(\pi)$. If $\lambda=-1$, equality forces every transition with positive probability to move between opposite signs of an extremal eigenfunction, producing a two-class cyclic decomposition of the irreducible chain. That contradicts aperiodicity. Hence each eigenvalue on $L^2_0(\pi)$ satisfies
\begin{align*}
-1 < \lambda_j < 1.
\end{align*}
Now expand the centered observable:
\begin{align*}
g = \sum_{j=1}^m a_j e_j,
\end{align*}
where $a_j:=\langle g,e_j\rangle_\pi$. Let $\mathbb{P}$ denote the law of the stationary Markov chain with transition kernel $P$ and initial law $\pi$, and let $\mathbb{E}$ denote expectation with respect to $\mathbb{P}$. For square-integrable real random variables $Y,Z:\Omega \to \mathbb{R}$, define
\begin{align*}
\operatorname{Cov}(Y,Z) := \mathbb{E}[(Y-\mathbb{E}[Y])(Z-\mathbb{E}[Z])].
\end{align*}
Since the chain is stationary, $X_0$ has law $\pi$, and the Markov property gives
\begin{align*}
\mathbb{E}[g(X_k)\mid X_0] = (P^k g)(X_0).
\end{align*}
Because $\pi(g)=0$, the covariance at lag $k$ is
\begin{align*}
\operatorname{Cov}(g(X_0),g(X_k)) = \mathbb{E}[g(X_0)g(X_k)] = \langle g,P^k g\rangle_\pi.
\end{align*}
The variance of the empirical average is therefore the sum of all lag covariances:
\begin{align*}
N\operatorname{Var}\left(\frac{1}{N}\sum_{n=0}^{N-1}g(X_n)\right) = \sum_{k=-(N-1)}^{N-1}\left(1-\frac{|k|}{N}\right)\langle g,P^{|k|}g\rangle_\pi.
\end{align*}
The eigenbasis diagonalizes each covariance term:
\begin{align*}
\langle g,P^k g\rangle_\pi = \sum_{j=1}^m a_j^2\lambda_j^k.
\end{align*}
For a fixed eigenvalue $\lambda_j$ with $|\lambda_j|<1$, the weighted covariance sum converges to the two-sided geometric series
\begin{align*}
1 + 2\sum_{k=1}^{\infty}\lambda_j^k = \frac{1+\lambda_j}{1-\lambda_j}.
\end{align*}
There are only finitely many eigenvalues, so we may pass the limit through the finite sum and obtain
\begin{align*}
\sigma_P^2(g) = \sum_{j=1}^m a_j^2\frac{1+\lambda_j}{1-\lambda_j}.
\end{align*}
Finally, because $(I-P)e_j=(1-\lambda_j)e_j$ and $1-\lambda_j>0$, the inverse $(I-P)^{-1}$ is well-defined on $L^2_0(\pi)$, and
\begin{align*}
(I+P)(I-P)^{-1}e_j = \frac{1+\lambda_j}{1-\lambda_j}e_j.
\end{align*}
Thus the preceding spectral expression is exactly
\begin{align*}
\sigma_P^2(g)=\langle g,(I+P)(I-P)^{-1}g\rangle_\pi.
\end{align*}
[/guided]
[/step]
[step:Convert off-diagonal dominance into Dirichlet form ordering]
For $i \in \{1,2\}$, define the positive self-adjoint operator
\begin{align*}
A_i:L^2_0(\pi) &\to L^2_0(\pi)
\end{align*}
\begin{align*}
f &\mapsto (I-P_i)f.
\end{align*}
For every $f \in L^2_0(\pi)$, reversibility gives the Dirichlet form identity
\begin{align*}
\langle f,A_i f\rangle_\pi = \frac{1}{2}\sum_{x,y \in E}\pi(x)P_i(x,y)(f(x)-f(y))^2.
\end{align*}
Indeed, expanding the right-hand side gives
\begin{align*}
\frac{1}{2}\sum_{x,y \in E}\pi(x)P_i(x,y)(f(x)-f(y))^2 = \sum_{x \in E}\pi(x)f(x)^2 - \sum_{x,y \in E}\pi(x)P_i(x,y)f(x)f(y),
\end{align*}
where stochasticity gives the first quadratic term and reversibility identifies the two symmetric square terms. This equals $\langle f,(I-P_i)f\rangle_\pi$.
Since the diagonal terms satisfy $(f(x)-f(x))^2=0$, only pairs with $x \ne y$ contribute. The assumption $P_1(x,y)\ge P_2(x,y)$ for $x \ne y$ therefore implies
\begin{align*}
\langle f,A_1 f\rangle_\pi \ge \langle f,A_2 f\rangle_\pi
\end{align*}
for every $f \in L^2_0(\pi)$.
[/step]
[step:Invert the ordered positive operators to compare the resolvents]
The preceding step says $A_1-A_2$ is positive semidefinite on $L^2_0(\pi)$. From the spectral conclusion above, $A_i=I-P_i$ is positive definite on $L^2_0(\pi)$ for $i \in \{1,2\}$.
We prove the inverse-ordering directly. Define
\begin{align*}
B:=A_2^{-1/2}A_1A_2^{-1/2}.
\end{align*}
The square root $A_2^{-1/2}:L^2_0(\pi)\to L^2_0(\pi)$ is defined by the finite-dimensional spectral theorem applied to the positive definite self-adjoint operator $A_2$. Then $B$ is self-adjoint and $B\ge I$, so every eigenvalue of $B$ is at least $1$, hence $B^{-1}\le I$. Since
\begin{align*}
A_1^{-1}=A_2^{-1/2}B^{-1}A_2^{-1/2},
\end{align*}
we obtain $A_1^{-1}\le A_2^{-1}$.
[guided]
The comparison of asymptotic variances will come from comparing the inverses of $A_1=I-P_1$ and $A_2=I-P_2$. The direction of the inequality reverses under inversion: a larger positive definite operator has a smaller inverse.
We already proved that
\begin{align*}
\langle f,A_1f\rangle_\pi \ge \langle f,A_2f\rangle_\pi
\end{align*}
for every $f \in L^2_0(\pi)$. In operator notation, this is $A_1\ge A_2$. We also know that $A_2$ is positive definite, because all eigenvalues $\lambda$ of $P_2$ on $L^2_0(\pi)$ satisfy $\lambda<1$, so all eigenvalues $1-\lambda$ of $A_2$ are positive.
To prove the inverse comparison rigorously, define
\begin{align*}
B:=A_2^{-1/2}A_1A_2^{-1/2}.
\end{align*}
This is a self-adjoint operator on $L^2_0(\pi)$ because $A_1$ and $A_2^{-1/2}$ are self-adjoint. For any $v \in L^2_0(\pi)$, set $w:=A_2^{-1/2}v$. Then
\begin{align*}
\langle v,Bv\rangle_\pi = \langle w,A_1w\rangle_\pi \ge \langle w,A_2w\rangle_\pi = \langle v,v\rangle_\pi.
\end{align*}
Thus $B\ge I$. Since $B$ is finite-dimensional and self-adjoint, it has an orthonormal eigenbasis, and the inequality $B\ge I$ means that every eigenvalue of $B$ is at least $1$. Therefore every eigenvalue of $B^{-1}$ is at most $1$, so $B^{-1}\le I$.
Now compute the inverse of $A_1$ from the definition of $B$:
\begin{align*}
A_1 = A_2^{1/2}BA_2^{1/2}.
\end{align*}
Hence
\begin{align*}
A_1^{-1}=A_2^{-1/2}B^{-1}A_2^{-1/2}.
\end{align*}
Because $B^{-1}\le I$, for every $g \in L^2_0(\pi)$ we get
\begin{align*}
\langle g,A_1^{-1}g\rangle_\pi \le \langle g,A_2^{-1}g\rangle_\pi.
\end{align*}
This is the precise operator step that converts “more movement off the diagonal” into “smaller asymptotic variance.”
[/guided]
[/step]
[step:Apply the resolvent identity to finish the Peskun comparison]
For $i \in \{1,2\}$, the resolvent formula gives
\begin{align*}
\sigma_{P_i}^2(g)=\langle g,(I+P_i)(I-P_i)^{-1}g\rangle_\pi.
\end{align*}
Since $A_i=I-P_i$, we have $P_i=I-A_i$, and therefore
\begin{align*}
(I+P_i)(I-P_i)^{-1} = (2I-A_i)A_i^{-1}=2A_i^{-1}-I.
\end{align*}
Thus
\begin{align*}
\sigma_{P_i}^2(g)=2\langle g,A_i^{-1}g\rangle_\pi-\langle g,g\rangle_\pi.
\end{align*}
Using $A_1^{-1}\le A_2^{-1}$ in quadratic form order gives
\begin{align*}
\sigma_{P_1}^2(g)\le \sigma_{P_2}^2(g).
\end{align*}
Since $g=h-\pi(h)$ and centering does not change the variance of the empirical average, the same inequality holds for the original function $h:E\to\mathbb{R}$. The resolvent formula also proves that both asymptotic variance limits exist and are finite. This completes the proof.
[/step]