[proofplan]
We prove the estimate by tilting $\mu$ with density proportional to $e^{\alpha g}$ and comparing $\nu$ to this tilted probability measure. If $D(\nu\|\mu)=\infty$, or if
\begin{align*}
\int_E g(x)\,d\nu(x)=-\infty,
\end{align*}
the inequality is immediate. Thus the only substantive case is when $\nu$ has finite relative entropy and $g$ is $\nu$-integrable. In that case, non-negativity of relative entropy for the tilted measure gives exactly the desired inequality after expanding the logarithm of the Radon-Nikodym density.
[/proofplan]
[step:Discard the immediate infinite cases]
Define the normalizing constant $Z\in(0,\infty)$ by
\begin{align*}
Z:=\int_E e^{\alpha g(x)}\,d\mu(x).
\end{align*}
We use the following convention for relative entropy. If $\nu\ll\mu$ and $r:E\to[0,\infty)$ is a Radon-Nikodym density of $\nu$ with respect to $\mu$, then
\begin{align*}
D(\nu\|\mu):=\int_E r(x)\log r(x)\,d\mu(x),
\end{align*}
with $0\log 0=0$; if $\nu$ is not absolutely continuous with respect to $\mu$, then $D(\nu\|\mu):=\infty$.
The hypothesis
\begin{align*}
\int_E g^+(x)\,d\nu(x)<\infty
\end{align*}
implies that the extended integral
\begin{align*}
\int_E g(x)\,d\nu(x)
\end{align*}
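is well-defined in $[-\infty,\infty)$. If $D(\nu\|\mu)=\infty$, then the right-hand side is $+\infty$ because $\alpha>0$ and $\log Z$ is finite, so the asserted inequality holds. If
\begin{align*}
\int_E g(x)\,d\nu(x)=-\infty,
\end{align*}
then the left-hand side is $-\infty$, while the right-hand side exceeds $-\infty$ because $D(\nu\|\mu)>-\infty$ (the integrand $r\log r$ is bounded below by $-e^{-1}$) and $\log Z$ is finite; the asserted inequality therefore holds in this case as well. Hence it remains to prove the inequality under the assumptions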
\begin{align*}
D(\nu\|\mu)<\infty
\end{align*}
and
\begin{align*}
-\infty<\int_E g(x)\,d\nu(x)<\infty.
\end{align*}
By the convention above, finite relative entropy implies $\nu\ll\mu$. Since $\mu$ and $\nu$ are probability measures, both are finite and hence $\sigma$-finite. Thus the $\sigma$-finiteness hypotheses of the [Radon-Nikodym theorem](/page/Absolutely%20Continuous%20Measures) are satisfied for the pair $(\nu,\mu)$. Let $h:E\to[0,\infty)$ be a Radon-Nikodym density of $\nu$ with respect to $\mu$, so that $\nu(A)=\int_A h(x)\,d\mu(x)$ for every $A\in\mathcal{E}$, and
\begin{align*}
D(\nu\|\mu)=\int_E h(x)\log h(x)\,d\mu(x).
\end{align*}
[/step]
[step:Tilt $\mu$ by $e^{\alpha g}$ to create a comparison probability measure]
Define the function $\rho:E\to(0,\infty)$ by
\begin{align*}
\rho(x):=\frac{e^{\alpha g(x)}}{Z}.
\end{align*}
Since $g$ is $\mathcal{E}$-measurable, $\rho$ is $\mathcal{E}$-measurable. Also,
\begin{align*}
\int_E \rho(x)\,d\mu(x)=\frac{1}{Z}\int_E e^{\alpha g(x)}\,d\mu(x)=1.
\end{align*}
Therefore the set function $\mu_\alpha:\mathcal{E}\to[0,1]$ defined by
\begin{align*}
\mu_\alpha(A):=\int_A \rho(x)\,d\mu(x)
\end{align*}
is a probability measure on $(E,\mathcal{E})$. Since $\rho(x)>0$ for every $x\in E$, any $A\in\mathcal{E}$ with $\mu_\alpha(A)=\int_A\rho\,d\mu=0$ satisfies $\mu(A)=0$, so $\mu\ll\mu_\alpha$. Combined with $\nu\ll\mu$, this gives $\nu\ll\mu_\alpha$.
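As an illustration of the tilt, consider a Gaussian example that is not part of the statement: assume $E=\mathbb{R}$ with its Borel $\sigma$-algebra, $\mu$ the standard normal distribution, and $g(x)=x$. Under these assumptions the normalizing constant and the density work out to
\begin{align*}
Z=\int_{\mathbb{R}} e^{\alpha x}\,\frac{e^{-x^2/2}}{\sqrt{2\pi}}\,dx=e^{\alpha^2/2},\qquad \rho(x)=e^{\alpha x-\alpha^2/2},
\end{align*}
and completing the square gives
\begin{align*}
\rho(x)\,\frac{e^{-x^2/2}}{\sqrt{2\pi}}=\frac{e^{-(x-\alpha)^2/2}}{\sqrt{2\pi}}.
\end{align*}
In this example $\mu_\alpha$ is therefore the normal distribution with mean $\alpha$ and unit variance: tilting by $e^{\alpha g}$ shifts mass toward larger values of $g$.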
[/step]
[step:Use non-negativity of entropy relative to the tilted measure]
Let $k:E\to[0,\infty)$ be a Radon-Nikodym density of $\nu$ with respect to $\mu_\alpha$. Since $d\mu_\alpha=\rho\,d\mu$ and $d\nu=h\,d\mu$, we have $\int_A (h/\rho)\,d\mu_\alpha=\int_A h\,d\mu=\nu(A)$ for every $A\in\mathcal{E}$, so $k(x)=h(x)/\rho(x)$ for $\mu_\alpha$-almost every $x\in E$.
We use the scalar inequality
\begin{align*}
t\log t-t+1\geq 0
\end{align*}
for every $t\in[0,\infty)$, with the convention $0\log 0=0$. Indeed, the function $t\mapsto t\log t-t+1$ has derivative $\log t$ on $(0,\infty)$, attains its minimum $0$ at $t=1$, and has value $1$ at $t=0$ by the stated convention. Since $t\log t\geq -e^{-1}$ for $t\in[0,\infty)$, the negative part of $k\log k$ is bounded by $e^{-1}$ and is therefore $\mu_\alpha$-integrable. Applying this inequality to $k$ and integrating with respect to $\mu_\alpha$ gives
\begin{align*}
\int_E k(x)\log k(x)\,d\mu_\alpha(x)\geq \int_E k(x)\,d\mu_\alpha(x)-\int_E 1\,d\mu_\alpha(x)=1-1=0.
\end{align*}
Thus
\begin{align*}
D(\nu\|\mu_\alpha)=\int_E k(x)\log k(x)\,d\mu_\alpha(x)\geq 0.
\end{align*}
[guided]
Recall the objects already constructed in the proof. The constant $Z\in(0,\infty)$ is defined by
\begin{align*}
Z:=\int_E e^{\alpha g(x)}\,d\mu(x).
\end{align*}
The function $\rho:E\to(0,\infty)$ is defined by
\begin{align*}
\rho(x):=\frac{e^{\alpha g(x)}}{Z}.
\end{align*}
The probability measure $\mu_\alpha:\mathcal E\to[0,1]$ is defined by
\begin{align*}
\mu_\alpha(A):=\int_A \rho(x)\,d\mu(x)
\end{align*}
for $A\in\mathcal E$. Also $h:E\to[0,\infty)$ is a Radon-Nikodym density of $\nu$ with respect to $\mu$, so $d\nu=h\,d\mu$.
The tilted measure $\mu_\alpha$ is designed so that its density contains exactly the exponential term in the desired estimate. Because $d\mu_\alpha=\rho\,d\mu$ and $d\nu=h\,d\mu$, the density of $\nu$ with respect to $\mu_\alpha$ must satisfy $k(x)\rho(x)=h(x)$ for $\mu$-almost every $x\in E$. Hence $k(x)=h(x)/\rho(x)$ for $\mu_\alpha$-almost every $x\in E$.
Now we prove the only entropy fact needed here. Define the scalar function $\varphi:[0,\infty)\to\mathbb{R}$ by $\varphi(t)=t\log t-t+1$ for $t>0$ and $\varphi(0)=1$. On $(0,\infty)$ we have $\varphi'(t)=\log t$, so $\varphi$ decreases on $(0,1]$ and increases on $[1,\infty)$. Since $\varphi(1)=0$ and $\varphi(0)=1$, the minimum of $\varphi$ on $[0,\infty)$ is $0$. Equivalently,
\begin{align*}
t\log t\geq t-1
\end{align*}
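for every $t\in[0,\infty)$, using $0\log 0=0$. Similarly, $t\mapsto t\log t$ has derivative $\log t+1$ on $(0,\infty)$ and is therefore minimized at $t=e^{-1}$, which gives $t\log t\geq -e^{-1}$ for all $t\in[0,\infty)$. Hence the negative part of $k\log k$ is bounded by $e^{-1}$ and is $\mu_\alpha$-integrable, so $\int_E k\log k\,d\mu_\alpha$ is well-defined in $(-\infty,\infty]$. Applying the pointwise inequality $t\log t\geq t-1$ to the measurable density $k$ and integrating with respect to the probability measure $\mu_\alpha$ yields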
\begin{align*}
\int_E k(x)\log k(x)\,d\mu_\alpha(x)\geq \int_E k(x)\,d\mu_\alpha(x)-\int_E 1\,d\mu_\alpha(x).
\end{align*}
The first integral on the right is $\nu(E)=1$, because $k=d\nu/d\mu_\alpha$, and the second is $\mu_\alpha(E)=1$, because $\mu_\alpha$ is a probability measure. Therefore
\begin{align*}
D(\nu\|\mu_\alpha)=\int_E k(x)\log k(x)\,d\mu_\alpha(x)\geq 0.
\end{align*}
This non-negativity is the mechanism that converts the comparison measure into an upper bound for $\int_E g\,d\nu$.
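For a concrete check of this non-negativity, take a two-point example chosen only for illustration: $E=\{0,1\}$, $\mu(\{0\})=\mu(\{1\})=\tfrac12$, $g(0)=0$, $g(1)=1$, and $\nu=\mu$, so that $h\equiv 1$. Under these assumptions,
\begin{align*}
Z=\frac{1+e^{\alpha}}{2},\qquad \rho(0)=\frac{2}{1+e^{\alpha}},\qquad \rho(1)=\frac{2e^{\alpha}}{1+e^{\alpha}},\qquad k=\frac{1}{\rho},
\end{align*}
and the entropy relative to the tilted measure is
\begin{align*}
D(\nu\|\mu_\alpha)=\frac12\log\frac{1+e^{\alpha}}{2}+\frac12\log\frac{1+e^{\alpha}}{2e^{\alpha}}=\log\frac{e^{\alpha/2}+e^{-\alpha/2}}{2}=\log\cosh\frac{\alpha}{2},
\end{align*}
which is strictly positive for every $\alpha>0$ because $\cosh t>1$ for $t\neq 0$.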
[/guided]
[/step]
[step:Expand the tilted entropy and rearrange]
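Using $k=h/\rho$ and $d\mu_\alpha=\rho\,d\mu$, we compute
\begin{align*}
D(\nu\|\mu_\alpha)=\int_E \frac{h(x)}{\rho(x)}\log\frac{h(x)}{\rho(x)}\,\rho(x)\,d\mu(x)=\int_E h(x)\log\frac{h(x)}{\rho(x)}\,d\mu(x).
\end{align*}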
Since $\rho(x)=e^{\alpha g(x)}/Z$, we have
\begin{align*}
\log\frac{h(x)}{\rho(x)}=\log h(x)-\alpha g(x)+\log Z
\end{align*}
where the expression is interpreted on the set $\{h>0\}$; on $\{h=0\}$ the integrand $h\log(h/\rho)$ vanishes by the convention $0\log 0=0$. Integrating this identity against $h\,d\mu=d\nu$ produces three terms: $\int_E h\log h\,d\mu=D(\nu\|\mu)$, which is a finite real number by the reduction in the first step; $\alpha\int_E g\,d\nu$, which is finite by the same reduction; and $\log Z\cdot\nu(E)=\log Z$, which is finite because $Z\in(0,\infty)$. Since all three are finite [real numbers](/page/Real%20Numbers), linearity of the integral gives
\begin{align*}
D(\nu\|\mu_\alpha)=D(\nu\|\mu)-\alpha\int_E g(x)\,d\nu(x)+\log Z.
\end{align*}
By the preceding step, $D(\nu\|\mu_\alpha)\geq 0$, hence
\begin{align*}
\alpha\int_E g(x)\,d\nu(x)\leq D(\nu\|\mu)+\log Z.
\end{align*}
Dividing by $\alpha>0$ and substituting the definition of $Z$ gives
\begin{align*}
\int_E g(x)\,d\nu(x)\leq \frac{1}{\alpha}\left(D(\nu\|\mu)+\log\int_E e^{\alpha g(x)}\,d\mu(x)\right).
\end{align*}
This is the asserted entropy inequality.
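A quick sanity check on sharpness, offered as a sketch under the additional assumption (not made in the statement) that $g$ is integrable with respect to the tilted measure $\mu_\alpha$: choosing $\nu=\mu_\alpha$ gives $h=\rho$ and $D(\nu\|\mu_\alpha)=0$, so the rearrangement above becomes an equality. Directly,
\begin{align*}
D(\mu_\alpha\|\mu)=\int_E \rho(x)\log\rho(x)\,d\mu(x)=\int_E\bigl(\alpha g(x)-\log Z\bigr)\,d\mu_\alpha(x)=\alpha\int_E g(x)\,d\mu_\alpha(x)-\log Z,
\end{align*}
and hence
\begin{align*}
\int_E g(x)\,d\mu_\alpha(x)=\frac{1}{\alpha}\left(D(\mu_\alpha\|\mu)+\log\int_E e^{\alpha g(x)}\,d\mu(x)\right).
\end{align*}
Under that assumption the inequality is attained at $\nu=\mu_\alpha$, so its right-hand side cannot be tightened uniformly in $\nu$.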
[/step]