McMillan's Theorem — Statement & Proof

McMillan's Theorem (Theorem # 1627)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

No discussion available for this theorem.

Proof

[proofplan] The strategy is to bound a high power of the Kraft sum. Expanding $\left(\sum_i a^{-l_i}\right)^R$ and grouping terms by total length yields a sum over $l$ of $b_l\, a^{-l}$, where $b_l$ counts tuples of $R$ codewords with lengths summing to $l$. Decipherability means that distinct tuples of codewords produce distinct concatenations in $B^*$, so the map from such tuples to length-$l$ strings is injective; hence $b_l \le a^l$. This collapses the sum to at most $Rs$, giving $(\sum_i a^{-l_i})^R \le Rs$. Taking $R$-th roots and sending $R \to \infty$ yields $\sum_i a^{-l_i} \le 1$. [/proofplan] [step:Set up notation and the combinatorial identity for the $R$-th power] Let $c : A \to B^*$ be a decipherable $a$-ary code with $\#A = m$, word lengths $l_1, \ldots, l_m$, and $s := \max_{1 \le i \le m} l_i$. Fix $R \in \mathbb{N}$ and define, for $l \in \{R, R+1, \ldots, Rs\}$, \begin{align*} b_l := \#\!\left\{(i_1, \ldots, i_R) \in \{1, \ldots, m\}^R : l_{i_1} + l_{i_2} + \cdots + l_{i_R} = l\right\}. \end{align*} Note $b_l \ge 0$ and $b_l = 0$ for $l < R$ and $l > Rs$ (since each $l_{i_j}$ is between $1$ and $s$). Expanding the power and grouping by the total length of the selected tuple, \begin{align*} \left(\sum_{i=1}^m a^{-l_i}\right)^{\!R} = \sum_{i_1=1}^m \cdots \sum_{i_R=1}^m a^{-(l_{i_1} + \cdots + l_{i_R})} = \sum_{l=R}^{Rs} b_l \, a^{-l}. \end{align*} [guided] The $R$-th power of a sum is the sum over all $R$-tuples of indices. Each tuple $(i_1, \ldots, i_R)$ contributes the product of the corresponding terms, which is $a^{-l_{i_1}} \cdots a^{-l_{i_R}} = a^{-(l_{i_1} + \cdots + l_{i_R})}$. Grouping the tuples by the common value of the length sum gives a sum indexed by that total length $l$, with the number of tuples having sum $l$ recorded as $b_l$. The range $R \le l \le Rs$ is forced by the bounds $1 \le l_{i_j} \le s$. [/guided] [/step] [step:Bound $b_l$ by $a^l$ using decipherability] Define the concatenation map \begin{align*} \Phi : \{1, \ldots, m\}^R &\to B^*, \\ (i_1, \ldots, i_R) &\mapsto c(x_{i_1})\, c(x_{i_2}) \cdots c(x_{i_R}), \end{align*} where $A = \{x_1, \ldots, x_m\}$ enumerates the source alphabet. Decipherability of $c$ (extended coordinatewise to $c : A^* \to B^*$, $a_1 \ldots a_R \mapsto c(a_1) \cdots c(a_R)$) means precisely that this extension is injective on $A^R$; hence $\Phi$ is injective on $\{1, \ldots, m\}^R$. If $(i_1, \ldots, i_R)$ contributes to $b_l$, then $|\Phi(i_1, \ldots, i_R)| = l_{i_1} + \cdots + l_{i_R} = l$, so $\Phi$ restricts to an injection \begin{align*} \Phi\big|_{\{\text{tuples with length sum } l\}} : \{\text{tuples with length sum } l\} \hookrightarrow B^l. \end{align*} Taking cardinalities, \begin{align*} b_l \le \#B^l = a^l. \end{align*} [guided] Decipherability is exactly the statement that the concatenation of codewords has an unambiguous inverse — the map "source string $\mapsto$ concatenated codeword string" is injective. Via the bijection $A \leftrightarrow \{1, \ldots, m\}$, this translates to $\Phi$ being injective on index tuples. Now, when the length sum equals $l$, the output lives in $B^l$, so an injection into $B^l$ gives the bound $b_l \le \#B^l = a^l$. This is the only place where decipherability is used. [/guided] [/step] [step:Collapse the sum to a polynomial bound in $R$] Substituting $b_l \le a^l$ into the identity of Step 1, \begin{align*} \left(\sum_{i=1}^m a^{-l_i}\right)^{\!R} = \sum_{l=R}^{Rs} b_l \, a^{-l} \le \sum_{l=R}^{Rs} a^l \cdot a^{-l} = \sum_{l=R}^{Rs} 1 = Rs - R + 1 \le Rs. \end{align*} [/step] [step:Take $R$-th roots and pass to the limit $R \to \infty$] Let $K := \sum_{i=1}^m a^{-l_i} \ge 0$. The previous step gives $K^R \le Rs$ for every $R \in \mathbb{N}$, i.e., \begin{align*} K \le (Rs)^{1/R} \quad \text{for all } R \in \mathbb{N}. \end{align*} Since $s$ is a fixed positive integer, $(Rs)^{1/R} = \exp\!\left(\tfrac{\log(Rs)}{R}\right)$ and \begin{align*} \lim_{R \to \infty} \frac{\log(Rs)}{R} = \lim_{R \to \infty} \frac{\log R + \log s}{R} = 0. \end{align*} By continuity of $\exp$ at $0$, $\lim_{R \to \infty} (Rs)^{1/R} = e^0 = 1$. Hence $K \le 1$, which is Kraft's inequality. [guided] The point of taking the $R$-th power was to extract polynomial-in-$R$ information from an $R$-fold product. After taking the $R$-th root, the right-hand side $(Rs)^{1/R}$ converges to $1$ — this is the standard limit $\lim_{R \to \infty} R^{1/R} = 1$, adjusted by the bounded factor $s^{1/R} \to 1$. Since $K \le (Rs)^{1/R}$ for every $R$, passing to the infimum over $R$ gives $K \le 1$. No bound weaker than $K \le 1$ would have sufficed: Kraft's inequality is exactly the assertion $K \le 1$. [/guided] [/step]

Explore Further

Kraft's Inequality Characterises Prefix-Free Codes Information Theory LLN for BSC Errors Information Theory Prefix-Free Codes Are No Worse Than Decipherable Codes Information Theory Maximum Entropy Information Theory Properties of Hamming Codes Information Theory DSA Correctness Information Theory Capacity of the nth Extension Information Theory BCH Bound Information Theory

What brings you to Androma?

Start with a route through the knowledge graph.

McMillan's Theorem (Theorem # 1627)

Discussion

Proof

Explore Further

Sign in to Androma

Check your inbox

One last step

McMillan's Theorem (Theorem # 1627)

Discussion

Proof

Explore Further