Attributions & Verification

Track contributions and verify content correctness

Proof

custom_env admin

[step:Coarse-grain the relative entropy to the two sets $A_\varepsilon$ and $A_\varepsilon^c$]Since $P\ll Q$, the [Radon-Nikodym theorem](/page/Radon-Nikodym%20Theorem) provides a measurable function $r:\Omega\to[0,\infty]$ with $r(\omega)=\frac{dP}{dQ}(\omega)$ for $\omega\in\Omega$, the Radon-Nikodym derivative of $P$ with respect to $Q$. Since $P(\Omega)=1<\infty$, the derivative $r$ may be chosen finite $Q$-a.e.; changing $r$ on a $Q$-null set does not affect any integral below. Define the function $\varphi:[0,\infty)\to\mathbb{R}$ by $\varphi(t)=t\log t$ for $t>0$ and $\varphi(0)=0$. The function $\varphi$ is convex on $[0,\infty)$ because $\varphi''(t)=1/t>0$ for $t>0$ and $\varphi$ is the continuous extension at $0$ of this convex function. Moreover $\varphi\geq -1/e$, so every integral of $\varphi(r)$ against $Q$ below is well defined in $(-\infty,+\infty]$. Then \begin{align*} D_{\mathrm{KL}}(P\|Q)=\int_\Omega \varphi(r)\,dQ. \end{align*} We claim that \begin{align*} D_{\mathrm{KL}}(P\|Q) \geq a_\varepsilon\log\left(\frac{a_\varepsilon}{b_\varepsilon}\right) + (1-a_\varepsilon)\log\left(\frac{1-a_\varepsilon}{1-b_\varepsilon}\right), \end{align*} with the usual conventions $0\log(0/c)=0$ for $c\geq 0$ and $c\log(c/0)=+\infty$ for $c>0$. Indeed, on the measurable set $A_\varepsilon$, [Jensen's inequality](/page/Jensen%27s%20Inequality) for the probability measure $Q(\cdot \cap A_\varepsilon)/Q(A_\varepsilon)$ gives, when $b_\varepsilon>0$, \begin{align*} \int_{A_\varepsilon}\varphi(r)\,dQ \geq b_\varepsilon\, \varphi\left( \frac{1}{b_\varepsilon}\int_{A_\varepsilon} r\,dQ \right) = b_\varepsilon\, \varphi\left(\frac{a_\varepsilon}{b_\varepsilon}\right) = a_\varepsilon\log\left(\frac{a_\varepsilon}{b_\varepsilon}\right). \end{align*} If $b_\varepsilon=0$, then $P \ll Q$ implies $a_\varepsilon=0$, so both the integral over the $Q$-null set $A_\varepsilon$ and the term $a_\varepsilon\log(a_\varepsilon/b_\varepsilon)$ are $0$, and the inequality holds with equality. 
If $1-b_\varepsilon>0$, applying [Jensen's inequality](/page/Jensen%27s%20Inequality) to the probability measure $Q(\cdot\cap A_\varepsilon^c)/Q(A_\varepsilon^c)$, together with $\int_{A_\varepsilon^c} r\,dQ=P(A_\varepsilon^c)=1-a_\varepsilon$, gives \begin{align*} \int_{A_\varepsilon^c}\varphi(r)\,dQ \geq (1-a_\varepsilon)\log\left(\frac{1-a_\varepsilon}{1-b_\varepsilon}\right). \end{align*} If $1-b_\varepsilon=0$, then $Q(A_\varepsilon^c)=0$, and $P\ll Q$ implies $P(A_\varepsilon^c)=1-a_\varepsilon=0$; the integral over $A_\varepsilon^c$ and the complementary entropy term are therefore both $0$. Since $\int_\Omega\varphi(r)\,dQ=\int_{A_\varepsilon}\varphi(r)\,dQ+\int_{A_\varepsilon^c}\varphi(r)\,dQ$, adding the two estimates proves the claimed coarse-grained lower bound.[/step]
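
As a hedged illustration of the bound in this step, consider a small finite example. The three-point space, the two measures, and the choice of set below are assumptions made only for this illustration and do not come from the proof. Take $\Omega=\{1,2,3\}$, $P=(\tfrac12,\tfrac14,\tfrac14)$, $Q=(\tfrac14,\tfrac14,\tfrac12)$, and let $\{1\}$ play the role of $A_\varepsilon$, so that $a_\varepsilon=\tfrac12$ and $b_\varepsilon=\tfrac14$. Then \begin{align*} D_{\mathrm{KL}}(P\|Q) &= \tfrac12\log 2+\tfrac14\log 1+\tfrac14\log\tfrac12 = \tfrac14\log 2 \approx 0.173, \\ a_\varepsilon\log\left(\frac{a_\varepsilon}{b_\varepsilon}\right) + (1-a_\varepsilon)\log\left(\frac{1-a_\varepsilon}{1-b_\varepsilon}\right) &= \tfrac12\log 2+\tfrac12\log\tfrac23 = \tfrac12\log\tfrac43 \approx 0.144. \end{align*} The coarse-grained value is strictly smaller because the density $r$ takes the two different values $1$ and $\tfrac12$ on the complement $\{2,3\}$, so Jensen's inequality is strict there, while on the single point $\{1\}$ it holds with equality.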

custom_env admin

[guided]The purpose of this step is to replace the original probability space by the two events $A_\varepsilon$ and $A_\varepsilon^c$. This is useful because the total variation distance is determined by the gaps between $P(E)$ and $Q(E)$ over measurable sets $E$, and a single nearly extremal set contains enough information to prove the estimate. Since $P \ll Q$, the [Radon-Nikodym theorem](/page/Radon-Nikodym%20Theorem) gives a Radon-Nikodym derivative. Let $r:\Omega\to[0,\infty]$ be the measurable function given by $r(\omega)=\frac{dP}{dQ}(\omega)$ for $\omega\in\Omega$. Thus, for every measurable set $E \in \mathcal{F}$, \begin{align*} P(E)=\int_E r\,dQ. \end{align*} Because $P(\Omega)=1<\infty$, we may choose this derivative finite $Q$-a.e.; if necessary, redefine $r$ on the $Q$-null set where it is infinite. This redefinition preserves the displayed identity for every $E\in\mathcal{F}$ and preserves all integrals with respect to $Q$. We also define the function $\varphi:[0,\infty)\to\mathbb{R}$ by $\varphi(t)=t\log t$ for $t>0$ and $\varphi(0)=0$. The function $\varphi$ is convex on $[0,\infty)$: on $(0,\infty)$ it has second derivative $\varphi''(t)=1/t>0$, and the value $\varphi(0)=0$ is the continuous endpoint extension of that convex function. It is also bounded below by $-1/e$, so the integrals of $\varphi(r)$ against $Q$ are well defined in $(-\infty,+\infty]$. The relative entropy can be written as \begin{align*} D_{\mathrm{KL}}(P\|Q)=\int_\Omega \varphi(r)\,dQ. \end{align*} We now estimate the contribution from $A_\varepsilon$. If $b_\varepsilon=Q(A_\varepsilon)>0$, then $Q(\cdot \cap A_\varepsilon)/b_\varepsilon$ is a probability measure on $A_\varepsilon$. [Jensen's inequality](/page/Jensen%27s%20Inequality) applied to the convex function $\varphi$ gives \begin{align*} \frac{1}{b_\varepsilon}\int_{A_\varepsilon}\varphi(r)\,dQ \geq \varphi\left( \frac{1}{b_\varepsilon}\int_{A_\varepsilon}r\,dQ \right). \end{align*}
Multiplying by $b_\varepsilon$ and using $\int_{A_\varepsilon}r\,dQ=P(A_\varepsilon)=a_\varepsilon$, we get \begin{align*} \int_{A_\varepsilon}\varphi(r)\,dQ \geq b_\varepsilon\varphi\left(\frac{a_\varepsilon}{b_\varepsilon}\right) = a_\varepsilon\log\left(\frac{a_\varepsilon}{b_\varepsilon}\right). \end{align*} If $b_\varepsilon=0$, absolute continuity $P \ll Q$ forces $a_\varepsilon=P(A_\varepsilon)=0$. The integral over the $Q$-null set $A_\varepsilon$ is then $0$, the corresponding entropy term is interpreted as $0$ under the convention $0\log(0/0)=0$, and the inequality remains valid. We next apply the same Jensen mechanism to the complement $A_\varepsilon^c$, but we must first check the endpoint. Since $Q(A_\varepsilon^c)=1-b_\varepsilon$ and $P(A_\varepsilon^c)=1-a_\varepsilon$, if $1-b_\varepsilon>0$, then $Q(\cdot\cap A_\varepsilon^c)/(1-b_\varepsilon)$ is a probability measure on $A_\varepsilon^c$ and [Jensen's inequality](/page/Jensen%27s%20Inequality) gives \begin{align*} \int_{A_\varepsilon^c}\varphi(r)\,dQ \geq (1-a_\varepsilon)\log\left(\frac{1-a_\varepsilon}{1-b_\varepsilon}\right). \end{align*} If $1-b_\varepsilon=0$, then $Q(A_\varepsilon^c)=0$, and absolute continuity $P\ll Q$ gives $P(A_\varepsilon^c)=1-a_\varepsilon=0$, so the integral over $A_\varepsilon^c$ and the complementary entropy term are both $0$. Because $\Omega$ is the disjoint union of $A_\varepsilon$ and $A_\varepsilon^c$, adding the two estimates yields \begin{align*} D_{\mathrm{KL}}(P\|Q) \geq a_\varepsilon\log\left(\frac{a_\varepsilon}{b_\varepsilon}\right) + (1-a_\varepsilon)\log\left(\frac{1-a_\varepsilon}{1-b_\varepsilon}\right). \end{align*} This is exactly the relative entropy of the two-point distributions $(a_\varepsilon,1-a_\varepsilon)$ and $(b_\varepsilon,1-b_\varepsilon)$.[/guided]
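
The coarse-graining inequality can also be sanity-checked numerically on a finite sample space. The following is a minimal sketch, not part of the proof: the function names `kl_divergence` and `coarse_grained_bound`, the example distributions, and the random test loop are all hypothetical choices made for illustration. It compares the full relative entropy with the two-point relative entropy obtained by merging outcomes into an event and its complement.

```python
import math
import random


def kl_divergence(p, q):
    """Relative entropy D_KL(P||Q) of two finite distributions (natural log).

    Uses the conventions 0*log(0/c) = 0 and c*log(c/0) = +inf for c > 0.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0*log(0/qi) contributes nothing
        if qi == 0.0:
            return math.inf  # P is not absolutely continuous w.r.t. Q
        total += pi * math.log(pi / qi)
    return total


def coarse_grained_bound(p, q, event):
    """Two-point relative entropy after merging outcomes into event / complement.

    `event` is a set of indices playing the role of A_eps (illustrative choice).
    """
    complement = [i for i in range(len(p)) if i not in event]
    a, b = sum(p[i] for i in event), sum(q[i] for i in event)
    a_c, b_c = sum(p[i] for i in complement), sum(q[i] for i in complement)
    return kl_divergence([a, a_c], [b, b_c])


# Illustrative three-point example (assumed data, not from the source).
p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]
full = kl_divergence(p, q)
bound = coarse_grained_bound(p, q, {0})
print(f"full relative entropy: {full:.6f}")
print(f"coarse-grained bound:  {bound:.6f}")

# Randomized check: the bound should never exceed the full relative entropy.
rng = random.Random(0)
violations = 0
for _ in range(1000):
    n = rng.randint(2, 6)
    raw_p = [rng.random() + 1e-3 for _ in range(n)]
    raw_q = [rng.random() + 1e-3 for _ in range(n)]
    rp = [x / sum(raw_p) for x in raw_p]
    rq = [x / sum(raw_q) for x in raw_q]
    ev = {i for i in range(n) if rng.random() < 0.5}
    if kl_divergence(rp, rq) < coarse_grained_bound(rp, rq, ev) - 1e-12:
        violations += 1
print("violations:", violations)
```

The complement masses are summed directly rather than computed as `1 - a`, which avoids tiny negative values from floating-point rounding; the tolerance `1e-12` in the random check is likewise only a guard against rounding.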

custom_env admin

Verification Progress

6 Total Blocks

0 Verified

0% verified

Contributors

admin 6 blocks (0 verified)

Who Can Verify

Areas: Probability & Statistics

Viktor Miykov Admin

Max Vassiliev Global Reviewer

Horia Neagu Global Reviewer

강현욱 Global Reviewer

Demo Testing Global Reviewer

Archie Pennycook Global Reviewer
