Theorems Joint Conditioning Inequality Attributions

Attributions & Verification

Track contributions and verify content correctness

Proof

[proofplan] We expand the joint entropy $H(X, Y, Z)$ in two ways using the [Chain Rule for Entropy](/theorems/???): once peeling off $Z$ last, and once peeling off $X$ last. Equating the two expansions produces an identity relating $H(X \mid Y)$, $H(X \mid Y, Z)$, and the conditional entropies of $Z$. Using the non-negativity of conditional entropy and the fact that conditioning reduces entropy, both $H(Z \mid X, Y) \geq 0$ and $H(Z \mid Y) \leq H(Z)$, we bound the difference by $H(Z)$. [/proofplan]

custom_env Unknown

[step:Expand $H(X, Y, Z)$ peeling off $Z$ first] Applying the [Chain Rule for Entropy](/theorems/???) repeatedly, grouping $(X, Y)$ first and then peeling off $Y$, \begin{align*} H(X, Y, Z) &= H(X, Y) + H(Z \mid X, Y) \\ &= H(Y) + H(X \mid Y) + H(Z \mid X, Y). \end{align*} Each application of the chain rule requires only that the relevant joint distributions be well defined, which holds since $X$, $Y$, $Z$ are defined on a common probability space with finite alphabets. [/step]

custom_env Unknown

[step:Expand $H(X, Y, Z)$ peeling off $X$ first] Again by the [Chain Rule for Entropy](/theorems/???), this time grouping $(Y, Z)$ first and then peeling off $Y$, \begin{align*} H(X, Y, Z) &= H(Y, Z) + H(X \mid Y, Z) \\ &= H(Y) + H(Z \mid Y) + H(X \mid Y, Z). \end{align*} [/step]

custom_env Unknown

[step:Equate the two expansions and cancel $H(Y)$]Setting the two expressions for $H(X, Y, Z)$ equal, \begin{align*} H(Y) + H(X \mid Y) + H(Z \mid X, Y) = H(Y) + H(Z \mid Y) + H(X \mid Y, Z). \end{align*} Subtracting $H(Y)$ (a finite real number, since the alphabets are finite) from both sides, \begin{align*} H(X \mid Y) + H(Z \mid X, Y) = H(Z \mid Y) + H(X \mid Y, Z). \end{align*} Rearranging, \begin{align*} H(X \mid Y) - H(X \mid Y, Z) = H(Z \mid Y) - H(Z \mid X, Y). \end{align*}[/step]

custom_env Unknown

[guided]The two chain-rule expansions both decompose $H(X, Y, Z)$, so equating them is a bookkeeping identity that must hold. The value of this manoeuvre is that it produces, on the two sides, exactly the quantities we need to compare: $H(X \mid Y)$ and $H(X \mid Y, Z)$ on the left, and $H(Z \mid Y)$ and $H(Z \mid X, Y)$ on the right. The symmetry between the two triples is the key idea — the identity converts a statement about the $X$-side into a statement about the $Z$-side, which turns out to be easier to bound. Cancelling $H(Y)$ from both sides is valid because $H(Y)$ is a finite non-negative real number (finite because $\mathcal{Y}$ is finite and the Shannon entropy of a random variable with finite support is finite). Rearranging so that both "$X$-quantities" are on the left and both "$Z$-quantities" are on the right yields \begin{align*} H(X \mid Y) - H(X \mid Y, Z) = H(Z \mid Y) - H(Z \mid X, Y). \end{align*} This identity is the core of the proof: it reduces the desired inequality to bounding the right-hand side.[/guided]

custom_env Unknown

[step:Bound the right-hand side by $H(Z)$]By [non-negativity of conditional entropy](/theorems/???), $H(Z \mid X, Y) \geq 0$. Hence \begin{align*} H(Z \mid Y) - H(Z \mid X, Y) \leq H(Z \mid Y). \end{align*} By [Conditioning Reduces Entropy](/theorems/1652) applied to the pair $(Z, Y)$, \begin{align*} H(Z \mid Y) \leq H(Z). \end{align*} Chaining these bounds, \begin{align*} H(Z \mid Y) - H(Z \mid X, Y) \leq H(Z \mid Y) \leq H(Z). \end{align*}[/step]

custom_env Unknown

[guided]The right-hand side of the identity established above is $H(Z \mid Y) - H(Z \mid X, Y)$. Two facts deliver the bound we need. First, conditional entropy is non-negative: for any random variables $U, V$ on finite alphabets, \begin{align*} H(U \mid V) = \sum_v p_V(v) \, H(U \mid V = v) \geq 0, \end{align*} since each $H(U \mid V = v) \geq 0$ (entropy of a probability distribution on a finite set is non-negative) and $p_V(v) \geq 0$. Applying this to $(U, V) = (Z, (X, Y))$, \begin{align*} H(Z \mid X, Y) \geq 0, \end{align*} so dropping this term can only increase the right-hand side: \begin{align*} H(Z \mid Y) - H(Z \mid X, Y) \leq H(Z \mid Y). \end{align*} Second, [Conditioning Reduces Entropy](/theorems/1652) states: for any discrete random variables $U, V$ on finite alphabets, $H(U \mid V) \leq H(U)$. Applying this with $(U, V) = (Z, Y)$, \begin{align*} H(Z \mid Y) \leq H(Z). \end{align*} Both hypotheses of theorem 1652 are satisfied: $Z$ and $Y$ are discrete random variables on finite alphabets defined on the same probability space, by the hypotheses of the theorem being proved. Chaining, \begin{align*} H(Z \mid Y) - H(Z \mid X, Y) \leq H(Z \mid Y) \leq H(Z). \end{align*}[/guided]

custom_env Unknown

[step:Combine the identity and the bound to conclude] Combining the identity from the third step, \begin{align*} H(X \mid Y) - H(X \mid Y, Z) = H(Z \mid Y) - H(Z \mid X, Y), \end{align*} with the bound from the fourth step, \begin{align*} H(Z \mid Y) - H(Z \mid X, Y) \leq H(Z), \end{align*} we obtain \begin{align*} H(X \mid Y) - H(X \mid Y, Z) \leq H(Z), \end{align*} equivalently \begin{align*} H(X \mid Y) \leq H(X \mid Y, Z) + H(Z), \end{align*} which is the statement of the theorem. [/step]

custom_env Unknown

Verification Progress

8 Total Blocks

0 Verified

0% verified

Contributors

Unknown 8 blocks (0 verified)

Who Can Verify

Areas: Probability & Statistics
Subareas: Probability Theory

Viktor Miykov Admin

Max Vassiliev Global Reviewer

Horia Neagu Global Reviewer

강현욱 Global Reviewer

Demo Testing Global Reviewer

Archie Pennycook Global Reviewer

Quick Actions

Edit Theorem

Raw Attribution Data

Loading attribution data...

What brings you to Androma?

Start with a route through the knowledge graph.

Attributions & Verification

Proof

Verification Progress

Contributors

Who Can Verify

Quick Actions

Sign in to Androma

Check your inbox

One last step

Attributions & Verification

Proof

Verification Progress

Contributors

Who Can Verify

Quick Actions

Raw Attribution Data