Gibbs Variational Principle (Theorem # 6723)
Theorem
Let $(E,\mathcal E)$ be a measurable space, let $\mu$ be a probability measure on $(E,\mathcal E)$, and let $g:E\to\mathbb R$ be measurable with
\begin{align*}
0<\int_E e^g\,d\mu<\infty.
\end{align*}
For each $\nu\in\mathcal P(E)$, interpret
\begin{align*}
\int_E g\,d\nu-D(\nu\|\mu)
\end{align*}
as $-\infty$ whenever
\begin{align*}
\int_E g^+\,d\nu=\infty
\end{align*}
or $D(\nu\|\mu)=+\infty$. Then
\begin{align*}
\log\int_E e^g\,d\mu=\sup_{\nu\in\mathcal P(E)}\left\{\int_E g\,d\nu-D(\nu\|\mu)\right\},
\end{align*}
where the supremum is over all probability measures on $(E,\mathcal E)$. If the variational expression at the tilted measure $\nu_g$ below is finite-valued, then the supremum is attained by $\nu_g$, where
\begin{align*}
\frac{d\nu_g}{d\mu}=\frac{e^g}{\int_E e^g\,d\mu}.
\end{align*}
In the remaining extended-value cases, the same supremum identity holds and is obtained as the limit of the formula applied to bounded truncations of $g$.
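As a sanity check of the statement, the following minimal sketch evaluates both sides on a three-point space, where every integral is a finite sum. The vectors `mu` and `g`, the helper names `kl` and `gibbs_value`, and the random search are illustrative assumptions, not part of the theorem.

```python
import math
import random

def kl(nu, mu):
    """Relative entropy D(nu || mu) on a finite set, with 0 log 0 = 0."""
    total = 0.0
    for n, m in zip(nu, mu):
        if n > 0.0:
            if m == 0.0:
                return math.inf  # nu is not absolutely continuous w.r.t. mu
            total += n * math.log(n / m)
    return total

def gibbs_value(nu, mu, g):
    """Variational expression: integral of g d(nu) minus D(nu || mu)."""
    return sum(n * x for n, x in zip(nu, g)) - kl(nu, mu)

# Hypothetical data: a probability vector mu and a potential g.
mu = [0.2, 0.5, 0.3]
g = [1.0, -2.0, 0.5]

Z = sum(m * math.exp(x) for m, x in zip(mu, g))
log_Z = math.log(Z)

# Tilted measure nu_g with density e^g / Z relative to mu.
nu_g = [m * math.exp(x) / Z for m, x in zip(mu, g)]
value_at_tilt = gibbs_value(nu_g, mu, g)

# Random competitors should never exceed log Z.
rng = random.Random(0)
best_random = -math.inf
for _ in range(1000):
    w = [rng.random() for _ in mu]
    s = sum(w)
    nu = [x / s for x in w]
    best_random = max(best_random, gibbs_value(nu, mu, g))
```

On a finite space the tilted expression is always finite, so this only illustrates the attained case of the theorem.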
Proof
[proofplan]
We first prove the variational identity for [measurable functions](/page/Measurable%20Functions) that are bounded above; this covers the bounded case needed for the truncations. The key algebra is to compare each competitor $\nu$ with the proposed tilted measure and to identify the gap as a relative entropy, whose non-negativity follows from the elementary inequality $a\log a-a+1\geq 0$. We then obtain the upper bound for a general $g$ by applying the bounded formula to two-sided truncations and passing to the limit. Finally, the reverse inequality is obtained from tilted probability measures supported on the sets $\{|g|\leq m\}$, on which $g$ is bounded, so each competitor has finite $g^+$-integral under the theorem's extended-value convention.
[/proofplan]
[step:Record the entropy inequality used in the variational comparison]
Let $(X,\mathcal A,\lambda)$ be a probability space, and let $\rho$ be a probability measure on $(X,\mathcal A)$ with $\rho\ll\lambda$. Let
\begin{align*}
r:X\to[0,\infty]
\end{align*}
denote a [Radon-Nikodym density](/page/Absolutely%20Continuous%20Measures) of $\rho$ with respect to $\lambda$, so that $\rho(A)=\int_A r\,d\lambda$ for every $A\in\mathcal A$ and $\int_X r\,d\lambda=1$.
Define $\psi:[0,\infty]\to[0,\infty]$ by $\psi(a)=a\log a-a+1$ for $a\in(0,\infty)$, with $\psi(0)=1$ and $\psi(+\infty)=+\infty$. We use the standard entropy convention $0\log 0=0$ when writing $r\log r$. On $[0,\infty)$, the function $\psi$ is convex, $\psi'(1)=0$, and $\psi(1)=0$, so $\psi(a)\geq 0$ for every $a\in[0,\infty]$ under this extended convention. Therefore
\begin{align*}
\int_X r\log r\,d\lambda=\int_X \psi(r)\,d\lambda+\int_X r\,d\lambda-\int_X 1\,d\lambda\geq 0.
\end{align*}
Thus $D(\rho\|\lambda)\geq 0$. Moreover, equality can occur only when $\psi(r)=0$ $\lambda$-a.e., hence only when $r=1$ $\lambda$-a.e., that is, $\rho=\lambda$.
[/step]
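The entropy inequality of this step can be checked numerically on a finite space. The sketch below assumes two hypothetical probability vectors `lam` and `rho`, and verifies that $\int r\log r\,d\lambda$ agrees with $\int\psi(r)\,d\lambda$ (because $\int r\,d\lambda=\int 1\,d\lambda=1$) and that $\psi$ is non-negative on a grid.

```python
import math

def psi(a):
    """psi(a) = a log a - a + 1, using the convention 0 log 0 = 0."""
    if a == 0.0:
        return 1.0
    return a * math.log(a) - a + 1.0

# Hypothetical probability vectors on a three-point space.
lam = [0.1, 0.4, 0.5]   # reference measure lambda
rho = [0.3, 0.3, 0.4]   # competitor rho, absolutely continuous w.r.t. lambda

# Density r of rho with respect to lambda.
r = [p / q for p, q in zip(rho, lam)]

# Integral of r log r against lambda, i.e. D(rho || lambda).
entropy = sum(q * (x * math.log(x) if x > 0.0 else 0.0) for q, x in zip(lam, r))

# Integral of psi(r) against lambda; equals the entropy since both measures have mass 1.
psi_integral = sum(q * psi(x) for q, x in zip(lam, r))

# Smallest value of psi on a grid of [0, 10]; the minimum is at a = 1.
grid_min = min(psi(k / 100.0) for k in range(0, 1001))
```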
[step:Prove the formula when the potential is bounded above]
Let
\begin{align*}
f:E\to\mathbb R
\end{align*}
be an $\mathcal E$-measurable function bounded above. Define $f^+:E\to[0,\infty)$ by $f^+(x)=\max\{f(x),0\}$ and $f^-:E\to[0,\infty)$ by $f^-(x)=\max\{-f(x),0\}$. Assume
\begin{align*}
0<Z_f:=\int_E e^f\,d\mu<\infty.
\end{align*}
Define the tilted probability measure $\mu_f$ by
\begin{align*}
\frac{d\mu_f}{d\mu}=\frac{e^f}{Z_f}.
\end{align*}
This is a probability measure because the density
\begin{align*}
E\to[0,\infty),\qquad x\mapsto \frac{e^{f(x)}}{Z_f}
\end{align*}
is non-negative and integrates to $1$ with respect to $\mu$.
Let $\nu\in\mathcal P(E)$. If $\nu\not\ll\mu$, then $D(\nu\|\mu)=+\infty$, and the variational expression is $-\infty$. Suppose $\nu\ll\mu$, and let
\begin{align*}
h:E\to[0,\infty]
\end{align*}
be a [Radon-Nikodym density](/page/Absolutely%20Continuous%20Measures) of $\nu$ with respect to $\mu$. If $D(\nu\|\mu)=+\infty$, then the variational expression is $-\infty$ under the same extended-value convention, so the upper bound is immediate. Hence assume $D(\nu\|\mu)<\infty$. Since the density
\begin{align*}
E\to(0,\infty),\qquad x\mapsto \frac{e^{f(x)}}{Z_f}
\end{align*}
is strictly positive, $\nu\ll\mu_f$, and the [Radon-Nikodym density](/page/Absolutely%20Continuous%20Measures) of $\nu$ with respect to $\mu_f$ is
\begin{align*}
\frac{d\nu}{d\mu_f}=\frac{h}{e^f/Z_f}.
\end{align*}
If
\begin{align*}
\int_E f^-\,d\nu=\infty,
\end{align*}
then
\begin{align*}
\int_E f\,d\nu=-\infty,
\end{align*}
so the desired upper bound is immediate. Otherwise all terms below are well-defined in $(-\infty,\infty]$. Since $D(\nu\|\mu)<\infty$, the positive part of $h\log h$ is $\mu$-integrable, and the negative part of $h\log h$ is integrable with respect to $\mu$ because $a\log a\geq -e^{-1}$ for $a\geq 0$ and $\mu(E)=1$. Since $f$ is bounded above, $f^+$ is bounded, so $\int_E f^+\,d\nu<\infty$; by the present case assumption, $\int_E f^-\,d\nu<\infty$. Hence $\int_E fh\,d\mu=\int_E f\,d\nu$ is finite, and the relative entropy expansion below is not an undefined extended-real subtraction. Direct expansion gives
\begin{align*}
D(\nu\|\mu_f)=\int_E h\log\left(\frac{h}{e^f/Z_f}\right)\,d\mu.
\end{align*}
Therefore
\begin{align*}
D(\nu\|\mu_f)=\int_E h\log h\,d\mu-\int_E fh\,d\mu+\log Z_f\int_E h\,d\mu.
\end{align*}
Since $\int_E h\,d\mu=1$, this rearranges to
\begin{align*}
\int_E f\,d\nu-D(\nu\|\mu)=\log Z_f-D(\nu\|\mu_f).
\end{align*}
The [entropy inequality](/theorems/6729) from the previous step gives $D(\nu\|\mu_f)\geq 0$, hence
\begin{align*}
\int_E f\,d\nu-D(\nu\|\mu)\leq \log Z_f.
\end{align*}
It remains to check that equality is achieved at $\mu_f$. Because $f$ is bounded above, say $f\leq M$ with $M\geq 0$, the positive part of $f e^f$ is bounded by $Me^M$. The negative part of $f e^f$ is also bounded: on $\{f<0\}$, the function $-f e^f$ is bounded above by $e^{-1}$, and on $\{f\geq 0\}$ there is no negative part. Hence $f e^f$ is bounded, so $\int_E f\,d\mu_f$ is finite. Also
\begin{align*}
D(\mu_f\|\mu)=\int_E \frac{e^f}{Z_f}\left(f-\log Z_f\right)\,d\mu
\end{align*}
is finite-valued. Substituting $\nu=\mu_f$ in the identity above gives
\begin{align*}
\int_E f\,d\mu_f-D(\mu_f\|\mu)=\log Z_f.
\end{align*}
Thus
\begin{align*}
\log\int_E e^f\,d\mu=\sup_{\nu\in\mathcal P(E)}\left\{\int_E f\,d\nu-D(\nu\|\mu)\right\}
\end{align*}
whenever $f$ is measurable, bounded above, and has $0<\int_E e^f\,d\mu<\infty$.
[guided]
The central idea is to compare every competitor $\nu$ with the probability measure suggested by the formula. Let
\begin{align*}
f:E\to\mathbb R
\end{align*}
be an $\mathcal E$-measurable function bounded above. Define $f^+:E\to[0,\infty)$ by $f^+(x)=\max\{f(x),0\}$ and $f^-:E\to[0,\infty)$ by $f^-(x)=\max\{-f(x),0\}$. Define
\begin{align*}
Z_f:=\int_E e^f\,d\mu.
\end{align*}
The assumptions say $0<Z_f<\infty$, so the formula
\begin{align*}
\frac{d\mu_f}{d\mu}=\frac{e^f}{Z_f}
\end{align*}
defines a probability measure $\mu_f$ on $(E,\mathcal E)$.
Now fix an arbitrary probability measure $\nu$ on $(E,\mathcal E)$. If $\nu$ is not absolutely continuous with respect to $\mu$, then $D(\nu\|\mu)=+\infty$ by definition, and this competitor contributes $-\infty$ to the supremum. Such a measure cannot improve the variational value.
Assume therefore that $\nu\ll\mu$, and let
\begin{align*}
h:E\to[0,\infty]
\end{align*}
be a [Radon-Nikodym density](/page/Absolutely%20Continuous%20Measures) of $\nu$ with respect to $\mu$. If $D(\nu\|\mu)=+\infty$, then this competitor has variational value $-\infty$ under the extended-value convention, so it cannot improve the upper bound. We may therefore assume $D(\nu\|\mu)<\infty$ before performing the entropy algebra. Since the density
\begin{align*}
E\to(0,\infty),\qquad x\mapsto \frac{e^{f(x)}}{Z_f}
\end{align*}
is strictly positive, $\nu$ is also absolutely continuous with respect to $\mu_f$, and its density relative to $\mu_f$ is
\begin{align*}
\frac{d\nu}{d\mu_f}=\frac{h}{e^f/Z_f}.
\end{align*}
This is the quantity whose entropy measures the loss from not choosing the tilted measure.
If
\begin{align*}
\int_E f^-\,d\nu=\infty,
\end{align*}
then
\begin{align*}
\int_E f\,d\nu=-\infty
\end{align*}
because $f^+$ is bounded by the assumed upper bound on $f$. The variational expression is then $-\infty$, and the upper bound follows. Otherwise $\int_E f\,d\nu$ is well-defined. We also check that the entropy computation is legitimate in the extended-real sense. Since $D(\nu\|\mu)<\infty$, the positive part of $h\log h$ is $\mu$-integrable, while the inequality $a\log a\geq -e^{-1}$ for $a\geq 0$ and the identity $\mu(E)=1$ show that the negative part of $h\log h$ is $\mu$-integrable. Since $f$ is bounded above and we are in the case $\int_E f^-\,d\nu<\infty$, the integral $\int_E fh\,d\mu=\int_E f\,d\nu$ is finite. Thus the following expansion is not an undefined $\infty-\infty$ subtraction. We compute the relative entropy of $\nu$ with respect to $\mu_f$:
\begin{align*}
D(\nu\|\mu_f)=\int_E h\log\left(\frac{h}{e^f/Z_f}\right)\,d\mu.
\end{align*}
Expanding the logarithm gives
\begin{align*}
D(\nu\|\mu_f)=\int_E h\log h\,d\mu-\int_E fh\,d\mu+\log Z_f\int_E h\,d\mu.
\end{align*}
Since $h$ is a probability density, $\int_E h\,d\mu=1$. Also $\int_E fh\,d\mu=\int_E f\,d\nu$. Therefore
\begin{align*}
\int_E f\,d\nu-D(\nu\|\mu)=\log Z_f-D(\nu\|\mu_f).
\end{align*}
The previous step proves $D(\nu\|\mu_f)\geq 0$, so every competitor satisfies
\begin{align*}
\int_E f\,d\nu-D(\nu\|\mu)\leq \log Z_f.
\end{align*}
Finally we verify that the proposed competitor actually reaches the upper bound. For $\nu=\mu_f$, the density $d\nu/d\mu_f$ is $1$, so the loss term $D(\mu_f\|\mu_f)$ is $0$. We also need the original expression to be finite. Since $-t e^t\leq e^{-1}$ for $t<0$, the negative part of $f e^f$ is bounded by $e^{-1}$; since $f\leq M$ for some $M\geq 0$, the positive part of $f e^f$ is bounded by $Me^M$. Hence $f e^f$ is bounded, so $\int_E f\,d\mu_f$ is finite, and
\begin{align*}
D(\mu_f\|\mu)=\int_E \frac{e^f}{Z_f}\left(f-\log Z_f\right)\,d\mu
\end{align*}
is finite-valued. Therefore
\begin{align*}
\int_E f\,d\mu_f-D(\mu_f\|\mu)=\log Z_f.
\end{align*}
This proves the bounded-above variational formula.
[/guided]
[/step]
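The key identity of this step, $\int_E f\,d\nu-D(\nu\|\mu)=\log Z_f-D(\nu\|\mu_f)$, can be verified directly on a finite space. The following sketch uses hypothetical vectors `mu`, `f` and `nu`; the helper `kl` is an assumed name for the discrete relative entropy.

```python
import math

def kl(nu, mu):
    """Relative entropy D(nu || mu) on a finite set, with 0 log 0 = 0."""
    total = 0.0
    for n, m in zip(nu, mu):
        if n > 0.0:
            if m == 0.0:
                return math.inf  # nu is not absolutely continuous w.r.t. mu
            total += n * math.log(n / m)
    return total

# Hypothetical data: reference measure, bounded potential, one competitor.
mu = [0.25, 0.25, 0.5]
f = [0.0, -3.0, 1.0]
nu = [0.6, 0.1, 0.3]

# Normalizing constant and tilted measure mu_f with density e^f / Z_f.
Z_f = sum(m * math.exp(x) for m, x in zip(mu, f))
mu_f = [m * math.exp(x) / Z_f for m, x in zip(mu, f)]

# Left side: variational expression at nu.
lhs = sum(n * x for n, x in zip(nu, f)) - kl(nu, mu)

# Right side: log Z_f minus the entropy gap to the tilted measure.
rhs = math.log(Z_f) - kl(nu, mu_f)
```

The gap `math.log(Z_f) - lhs` equals `kl(nu, mu_f)`, which is how the upper bound and the equality case arise together.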
[step:Pass two-sided truncations to obtain the upper bound for $g$]
For each $m\in\mathbb N$, define the bounded measurable truncation
\begin{align*}
g_m:E\to\mathbb R,\qquad x\mapsto \max\{-m,\min\{g(x),m\}\}.
\end{align*}
The bounded-above formula gives, for every $\nu\in\mathcal P(E)$,
\begin{align*}
\int_E g_m\,d\nu-D(\nu\|\mu)\leq \log\int_E e^{g_m}\,d\mu.
\end{align*}
The functions $e^{g_m}$ converge pointwise to $e^g$. Moreover $e^{g_m}\leq e^g+1$ for every $m$, and $e^g+1$ is $\mu$-integrable, so with $Z:=\int_E e^g\,d\mu$ the [Dominated Convergence Theorem](/theorems/4) gives
\begin{align*}
\int_E e^{g_m}\,d\mu\to\int_E e^g\,d\mu=Z.
\end{align*}
Now fix $\nu\in\mathcal P(E)$. If
\begin{align*}
\int_E g^+\,d\nu=\infty
\end{align*}
or
\begin{align*}
D(\nu\|\mu)=+\infty,
\end{align*}
then the variational expression for $g$ is $-\infty$ by convention and is bounded above by $\log Z$. Otherwise
\begin{align*}
\int_E g^+\,d\nu<\infty
\end{align*}
and
\begin{align*}
D(\nu\|\mu)<\infty.
\end{align*}
Since
\begin{align*}
g_m=(g^+\wedge m)-(g^-\wedge m),
\end{align*}
the [Monotone Convergence Theorem](/theorems/509) applied separately to $g^+\wedge m$ and $g^-\wedge m$ gives
\begin{align*}
\int_E g_m\,d\nu\to\int_E g\,d\nu
\end{align*}
in the extended real sense, with value $-\infty$ if $\int_E g^-\,d\nu=\infty$. Passing to the limit in the inequality for $g_m$ yields
\begin{align*}
\int_E g\,d\nu-D(\nu\|\mu)\leq \log Z.
\end{align*}
Taking the supremum over all $\nu\in\mathcal P(E)$ gives
\begin{align*}
\sup_{\nu\in\mathcal P(E)}\left\{\int_E g\,d\nu-D(\nu\|\mu)\right\}\leq \log Z.
\end{align*}
[guided]
The bounded-above formula is not applied directly to $g$, because $g$ may be unbounded above. Instead, for each $m\in\mathbb N$, we define the bounded measurable truncation
\begin{align*}
g_m:E\to\mathbb R,\qquad x\mapsto \max\{-m,\min\{g(x),m\}\}.
\end{align*}
The previous step applies to $g_m$, so every probability measure $\nu$ on $(E,\mathcal E)$ satisfies
\begin{align*}
\int_E g_m\,d\nu-D(\nu\|\mu)\leq \log\int_E e^{g_m}\,d\mu.
\end{align*}
We first pass to the limit on the right-hand side. Write $Z:=\int_E e^g\,d\mu$. The functions $e^{g_m}$ converge pointwise to $e^g$. Also $e^{g_m}\leq e^g+1$ for every $m$, and $e^g+1$ is $\mu$-integrable because $\mu$ is a probability measure and $\int_E e^g\,d\mu<\infty$. The [Dominated Convergence Theorem](/theorems/4) therefore gives
\begin{align*}
\int_E e^{g_m}\,d\mu\to\int_E e^g\,d\mu=Z.
\end{align*}
Now fix $\nu\in\mathcal P(E)$. If
\begin{align*}
\int_E g^+\,d\nu=\infty
\end{align*}
or
\begin{align*}
D(\nu\|\mu)=+\infty,
\end{align*}
then the convention in the theorem makes the variational expression for $g$ equal to $-\infty$, so the desired upper bound is immediate. Otherwise
\begin{align*}
\int_E g^+\,d\nu<\infty
\end{align*}
and
\begin{align*}
D(\nu\|\mu)<\infty.
\end{align*}
Since
\begin{align*}
g_m=(g^+\wedge m)-(g^-\wedge m),
\end{align*}
the [Monotone Convergence Theorem](/theorems/509) applied separately to $g^+\wedge m$ and $g^-\wedge m$ gives
\begin{align*}
\int_E g_m\,d\nu\to\int_E g\,d\nu
\end{align*}
in the extended real sense, with value $-\infty$ if $\int_E g^-\,d\nu=\infty$. Passing to the limit in the bounded-truncation inequality yields
\begin{align*}
\int_E g\,d\nu-D(\nu\|\mu)\leq \log Z.
\end{align*}
Because $\nu$ was arbitrary, taking the supremum over all probability measures on $(E,\mathcal E)$ gives
\begin{align*}
\sup_{\nu\in\mathcal P(E)}\left\{\int_E g\,d\nu-D(\nu\|\mu)\right\}\leq \log Z.
\end{align*}
[/guided]
[/step]
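The two facts used about the truncations, the pointwise domination $e^{g_m}\leq e^g+1$ and the convergence $\int_E e^{g_m}\,d\mu\to Z$, can be illustrated on a finite space. In the sketch below the normalized geometric weights `mu` and the sign-alternating potential `g` are hypothetical choices; on a finite space the truncation eventually coincides with `g`, so the convergence is exact after finitely many steps.

```python
import math

def truncate(x, m):
    """Two-sided truncation max(-m, min(x, m))."""
    return max(-m, min(x, m))

# Hypothetical finite space: normalized geometric weights on {0, ..., N-1}.
N = 40
weights = [2.0 ** -(k + 1) for k in range(N)]
total = sum(weights)
mu = [w / total for w in weights]

# Potential taking large values of both signs.
g = [((-1) ** k) * 0.5 * k for k in range(N)]

Z = sum(p * math.exp(x) for p, x in zip(mu, g))

# Pointwise domination e^{g_m} <= e^g + 1 for every truncation level tested.
dominated = all(
    math.exp(truncate(x, m)) <= math.exp(x) + 1.0
    for m in range(1, 30)
    for x in g
)

# Integrals of e^{g_m} against mu and their distance to Z.
Z_trunc = [
    sum(p * math.exp(truncate(x, m)) for p, x in zip(mu, g))
    for m in range(1, 30)
]
gaps = [abs(z - Z) for z in Z_trunc]
```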
[step:Use bounded-support tilted measures to obtain the reverse inequality]
For each $m\in\mathbb N$, define the measurable set
\begin{align*}
A_m:=\{x\in E: |g(x)|\leq m\}.
\end{align*}
Since $g$ is real-valued, the sets $A_m$ increase to $E$. For each $m\in\mathbb N$, let $\mathbb{1}_{A_m}:E\to\{0,1\}$ denote the indicator function of $A_m$, defined by $\mathbb{1}_{A_m}(x)=1$ for $x\in A_m$ and $\mathbb{1}_{A_m}(x)=0$ for $x\notin A_m$. Define
\begin{align*}
Z_m:=\int_{A_m} e^g\,d\mu.
\end{align*}
By the [Monotone Convergence Theorem](/theorems/509) applied to the non-negative functions $\mathbb{1}_{A_m}e^g$, we have $Z_m\uparrow Z$. Since $Z>0$, choose $m$ large enough that $Z_m>0$, and define the probability measure $\nu_m$ on $(E,\mathcal E)$ by
\begin{align*}
\frac{d\nu_m}{d\mu}=\frac{\mathbb{1}_{A_m}e^g}{Z_m}.
\end{align*}
On $A_m$ one has $|g|\leq m$, so $\int_E g^+\,d\nu_m\leq m$ and all terms below are finite. Moreover,
\begin{align*}
D(\nu_m\|\mu)=\int_E \frac{\mathbb{1}_{A_m}e^g}{Z_m}\left(g-\log Z_m\right)\,d\mu.
\end{align*}
Using the same density in the integral of $g$ gives
\begin{align*}
\int_E g\,d\nu_m-D(\nu_m\|\mu)=\log Z_m.
\end{align*}
Therefore
\begin{align*}
\sup_{\nu\in\mathcal P(E)}\left\{\int_E g\,d\nu-D(\nu\|\mu)\right\}\geq \log Z_m
\end{align*}
for every sufficiently large $m$. Since $Z_m\uparrow Z$ and the logarithm is increasing and continuous on $(0,\infty)$, passing $m\to\infty$ yields
\begin{align*}
\log Z\leq \sup_{\nu\in\mathcal P(E)}\left\{\int_E g\,d\nu-D(\nu\|\mu)\right\}.
\end{align*}
[/step]
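The lower-bound construction can also be traced numerically. The sketch below, under the same hypothetical finite-space assumptions (geometric weights `mu`, alternating potential `g`), builds the restricted tilted measures $\nu_m$ with density $\mathbb{1}_{A_m}e^g/Z_m$ and records their variational values next to $\log Z_m$.

```python
import math

def kl(nu, mu):
    """Relative entropy D(nu || mu) on a finite set, with 0 log 0 = 0."""
    total = 0.0
    for n, m in zip(nu, mu):
        if n > 0.0:
            if m == 0.0:
                return math.inf  # nu is not absolutely continuous w.r.t. mu
            total += n * math.log(n / m)
    return total

# Hypothetical finite space: normalized geometric weights on {0, ..., N-1}.
N = 40
weights = [2.0 ** -(k + 1) for k in range(N)]
total = sum(weights)
mu = [w / total for w in weights]
g = [((-1) ** k) * 0.5 * k for k in range(N)]
log_Z = math.log(sum(p * math.exp(x) for p, x in zip(mu, g)))

records = []  # tuples (m, variational value at nu_m, log Z_m)
for m in range(1, 25):
    in_A = [abs(x) <= m for x in g]  # indicator of A_m = {|g| <= m}
    Z_m = sum(p * math.exp(x) for p, x, a in zip(mu, g, in_A) if a)
    # Z_m > 0 here because g(0) = 0 lies in every A_m.
    nu_m = [p * math.exp(x) / Z_m if a else 0.0
            for p, x, a in zip(mu, g, in_A)]
    value = sum(n * x for n, x in zip(nu_m, g)) - kl(nu_m, mu)
    records.append((m, value, math.log(Z_m)))
```

Each recorded value matches $\log Z_m$ up to rounding, and the sequence increases to $\log Z$, mirroring the passage $m\to\infty$ in the proof.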
[step:Identify the maximizer when the tilted expression is finite]
[guided]
We now assemble the identity from the two estimates already proved. For the upper bound, two-sided truncations $g_m=\max\{-m,\min\{g,m\}\}$ satisfy the bounded variational formula, the [Dominated Convergence Theorem](/theorems/4) gives $\int_E e^{g_m}\,d\mu\to Z$, and the [Monotone Convergence Theorem](/theorems/509) passes $\int_E g_m\,d\nu$ to $\int_E g\,d\nu$ for every competitor with $\int_E g^+\,d\nu<\infty$ and $D(\nu\|\mu)<\infty$. For the lower bound, the probability measures with density $\mathbb{1}_{\{|g|\leq m\}}e^g/Z_m$ have admissible variational value $\log Z_m$, and $Z_m\uparrow Z$. Hence
\begin{align*}
\log\int_E e^g\,d\mu=\sup_{\nu\in\mathcal P(E)}\left\{\int_E g\,d\nu-D(\nu\|\mu)\right\}.
\end{align*}
Now assume that the variational expression at $\nu_g$ is finite-valued, where
\begin{align*}
\frac{d\nu_g}{d\mu}=\frac{e^g}{Z}.
\end{align*}
Equivalently, $\nu_g$ has density
\begin{align*}
E\to(0,\infty),\qquad x\mapsto \frac{e^{g(x)}}{Z}
\end{align*}
with respect to $\mu$. Let $\nu\in\mathcal P(E)$. Competitors outside the finite regime cannot beat $\log Z$: if $\nu\not\ll\mu$, if $D(\nu\|\mu)=+\infty$, or if
\begin{align*}
\int_E g^+\,d\nu=\infty,
\end{align*}
then the theorem's convention assigns value $-\infty$. If instead $\nu\ll\mu$, $D(\nu\|\mu)<\infty$, and $\int_E g^+\,d\nu<\infty$, but
\begin{align*}
\int_E g^-\,d\nu=\infty,
\end{align*}
then $\int_E g\,d\nu=-\infty$, so this competitor also cannot exceed $\log Z$.
It remains to handle the case where $\int_E g^+\,d\nu<\infty$, $\int_E g^-\,d\nu<\infty$, and $D(\nu\|\mu)<\infty$. Let $h:E\to[0,\infty]$ be a Radon-Nikodym density of $\nu$ with respect to $\mu$. Since the density of $\nu_g$ with respect to $\mu$ is strictly positive, $\nu\ll\nu_g$, and the density of $\nu$ with respect to $\nu_g$ is
\begin{align*}
\frac{d\nu}{d\nu_g}=\frac{hZ}{e^g}.
\end{align*}
The assumed integrability prevents any undefined $\infty-\infty$ subtraction, so expanding the relative entropy gives
\begin{align*}
D(\nu\|\nu_g)=D(\nu\|\mu)-\int_E g\,d\nu+\log Z.
\end{align*}
Equivalently,
\begin{align*}
\int_E g\,d\nu-D(\nu\|\mu)=\log Z-D(\nu\|\nu_g).
\end{align*}
The entropy inequality gives $D(\nu\|\nu_g)\geq 0$, so no admissible competitor exceeds $\log Z$. For $\nu=\nu_g$, the loss term is $D(\nu_g\|\nu_g)=0$, and the assumed finiteness makes the original expression well-defined. Thus the supremum is attained by $\nu_g$.
If the tilted expression is not finite-valued, the preceding limiting construction still proves the identity. The two-sided truncations give the upper bound, and the measures supported on $A_m=\{|g|\leq m\}$ give admissible values equal to $\log Z_m$, with $Z_m\uparrow Z$. Hence the variational values converge upward to $\log Z$, which is the asserted extended-value case.
[/guided]
Combining the upper and lower bounds proves
\begin{align*}
\log\int_E e^g\,d\mu=\sup_{\nu\in\mathcal P(E)}\left\{\int_E g\,d\nu-D(\nu\|\mu)\right\}.
\end{align*}
Assume now that the variational expression at $\nu_g$ is finite-valued, where
\begin{align*}
\frac{d\nu_g}{d\mu}=\frac{e^g}{Z}.
\end{align*}
Equivalently, $\nu_g$ has density
\begin{align*}
E\to(0,\infty),\qquad x\mapsto \frac{e^{g(x)}}{Z}
\end{align*}
with respect to $\mu$. Let $\nu\in\mathcal P(E)$. If $\nu\not\ll\mu$, or $D(\nu\|\mu)=+\infty$, or $\int_E g^+\,d\nu=\infty$, then the convention in the statement makes the variational expression equal to $-\infty$, so this competitor cannot exceed $\log Z$. If $\nu\ll\mu$, $D(\nu\|\mu)<\infty$, and $\int_E g^+\,d\nu<\infty$, but $\int_E g^-\,d\nu=\infty$, then $\int_E g\,d\nu=-\infty$, and again the competitor cannot exceed $\log Z$.
It remains only to consider the case where $\int_E g^+\,d\nu<\infty$, $\int_E g^-\,d\nu<\infty$, and $D(\nu\|\mu)<\infty$. Let $h:E\to[0,\infty]$ be a Radon-Nikodym density of $\nu$ with respect to $\mu$. Since the density of $\nu_g$ with respect to $\mu$ is strictly positive, $\nu\ll\nu_g$, and the density of $\nu$ with respect to $\nu_g$ is
\begin{align*}
\frac{d\nu}{d\nu_g}=\frac{hZ}{e^g}.
\end{align*}
The integrability assumptions make the following expansion an equality in $(-\infty,\infty]$ without any undefined $\infty-\infty$ subtraction:
\begin{align*}
D(\nu\|\nu_g)=D(\nu\|\mu)-\int_E g\,d\nu+\log Z.
\end{align*}
Equivalently,
\begin{align*}
\int_E g\,d\nu-D(\nu\|\mu)=\log Z-D(\nu\|\nu_g).
\end{align*}
The entropy inequality gives $D(\nu\|\nu_g)\geq 0$, so no competitor exceeds $\log Z$. For $\nu=\nu_g$, the loss term is $D(\nu_g\|\nu_g)=0$, and the assumed finiteness ensures the expression is well-defined. Therefore the supremum is attained by $\nu_g$.
If that tilted expression is not finite-valued, the identity has already been proved by the limiting argument: the two-sided truncations give the upper bound, and the bounded-support tilted measures $\nu_m$ give admissible variational values increasing to $\log Z$. This is precisely the asserted extended-value interpretation.
[/step]