Attributions & Verification

Track contributions and verify content correctness

Proof


[guided]The purpose of the decoder is to turn an arbitrary estimator $\hat\theta$ taking values in $\Theta$ into an estimator $\hat v$ of the hidden vertex $v \in \{0,1\}^m$. For each $w \in \{0,1\}^m$, define $r_w:\mathcal X\to[0,\infty)$ by
\begin{align*} r_w(x)=\rho(\hat{\theta}(x),\psi(w)). \end{align*}
Let $\mathcal B(\Theta)$ denote the Borel $\sigma$-algebra generated by the open sets of the metric $\rho$ on $\Theta$. Each $r_w$ is measurable because $\hat\theta:(\mathcal X,\mathcal A)\to(\Theta,\mathcal B(\Theta))$ is measurable, while $\theta\mapsto\rho(\theta,\psi(w))$ is continuous.

Since the cube is finite, the minimum of $w\mapsto r_w(x)$ over $w\in\{0,1\}^m$ exists for each observation $x\in\mathcal X$. We define $\hat v(x)$ to be the lexicographically first minimizer, which makes the choice single-valued.

We also need the decoder to be measurable, because later we will take probabilities of events involving $\hat v_j$. For a fixed $w\in\{0,1\}^m$, the event $\{x\in\mathcal X:\hat v(x)=w\}$ is described by finitely many comparisons between the [measurable functions](/page/Measurable%20Functions) $r_w$ and $r_u$: if $u$ is lexicographically after $w$, require $r_w\le r_u$, and if $u$ is lexicographically before $w$, require $r_w<r_u$. Sets of the form $\{r_w\le r_u\}$ and $\{r_w<r_u\}$ are measurable, and a finite intersection of measurable sets is measurable, so each fibre of $\hat v$ is measurable. Therefore $\hat v:(\mathcal X,\mathcal A)\to\{0,1\}^m$ is measurable for the discrete $\sigma$-algebra on the finite cube.

By definition of this nearest-neighbour choice, for every fixed true vertex $v \in \{0,1\}^m$,
\begin{align*} \rho(\hat{\theta}(x),\psi(\hat{v}(x))) \leq \rho(\hat{\theta}(x),\psi(v)). \end{align*}
Now we compare the two cube vertices $\psi(\hat{v}(x))$ and $\psi(v)$. The triangle inequality in the metric space $(\Theta,\rho)$ gives
\begin{align*} \rho(\psi(\hat{v}(x)),\psi(v))\leq \rho(\psi(\hat{v}(x)),\hat{\theta}(x))+\rho(\hat{\theta}(x),\psi(v)). \end{align*}
By symmetry of $\rho$, the first term equals $\rho(\hat{\theta}(x),\psi(\hat{v}(x)))$. The nearest-neighbour inequality then gives
\begin{align*} \rho(\psi(\hat{v}(x)),\psi(v))\leq 2\rho(\hat{\theta}(x),\psi(v)). \end{align*}
The hypercube separation hypothesis says that metric distance between embedded vertices dominates Hamming distance:
\begin{align*} \rho(\psi(\hat{v}(x)),\psi(v)) \geq 2s\,d_H(\hat{v}(x),v). \end{align*}
Combining the last two displays and dividing by $2$ gives
\begin{align*} \rho(\hat{\theta}(x),\psi(v)) \geq s\,d_H(\hat{v}(x),v). \end{align*}
Thus every unit error in the decoded Hamming distance costs at least $s$ in the original metric loss.[/guided]
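
The decoder step can be checked numerically on a toy instance. The sketch below is illustrative only and rests on assumptions that are not part of the proof: $\Theta=\mathbb R^m$ with the $\ell_1$ metric, and the hypothetical embedding $\psi(w)=2s\,w$, for which $\rho(\psi(u),\psi(w))=2s\,d_H(u,w)$ holds with equality. It builds the lexicographically first nearest-neighbour decoder and tests the final inequality $\rho(\hat\theta,\psi(v))\ge s\,d_H(\hat v,v)$ for random values of $\hat\theta$ and every vertex $v$.

```python
import itertools
import random


def hamming(u, w):
    """Hamming distance between two 0/1 tuples of equal length."""
    return sum(a != b for a, b in zip(u, w))


def rho(a, b):
    """Toy metric (assumption): the l1 distance on R^m."""
    return sum(abs(x - y) for x, y in zip(a, b))


def psi(w, s):
    """Toy embedding (assumption): psi(w) = 2*s*w, so rho(psi(u), psi(w)) = 2*s*d_H(u, w)."""
    return tuple(2.0 * s * b for b in w)


def decode(theta_hat, m, s):
    """Nearest-neighbour decoder with lexicographic tie-breaking.

    itertools.product enumerates the cube in lexicographic order and
    min() returns the first minimiser it meets, so ties go to the
    lexicographically first vertex.
    """
    cube = itertools.product((0, 1), repeat=m)
    return min(cube, key=lambda w: rho(theta_hat, psi(w, s)))


def check_bound(m=3, s=0.5, trials=200, seed=0):
    """Return True if rho(theta_hat, psi(v)) >= s * d_H(v_hat, v) in every trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        theta_hat = tuple(rng.uniform(-1.0, 2.0) for _ in range(m))
        v_hat = decode(theta_hat, m, s)
        for v in itertools.product((0, 1), repeat=m):
            # Small tolerance guards against floating-point rounding only.
            if rho(theta_hat, psi(v, s)) < s * hamming(v_hat, v) - 1e-12:
                return False
    return True


if __name__ == "__main__":
    print(check_bound())
```

The tie-breaking rule is visible in the one-dimensional case: with `s = 0.5`, the point `theta_hat = (0.5,)` is equidistant from $\psi(0)=0$ and $\psi(1)=1$, and `decode` returns `(0,)`.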


[guided]Fix a coordinate $j\in\{1,\dots,m\}$. We compare vertices in pairs that differ only in the $j$th coordinate, writing $v^{(j)}$ for the vertex obtained from $v$ by flipping its $j$th coordinate. Let
\begin{align*} \mathcal V_j^0 := \{v \in \{0,1\}^m : v_j = 0\} \end{align*}
be the lower half of the cube in coordinate $j$, and define the decision event
\begin{align*} A_j := \{x \in \mathcal X : \hat v_j(x)=1\}. \end{align*}
The set $A_j$ lies in $\mathcal A$ because $\hat v$ is measurable and the coordinate projection $w\mapsto w_j$ from the finite cube to $\{0,1\}$ is measurable.

For $v\in\mathcal V_j^0$ we have $v_j=0$ and $v^{(j)}_j=1$, so the estimator makes a $j$th-coordinate error under $P_v$ exactly on $A_j$, while it makes a $j$th-coordinate error under $P_{v^{(j)}}$ exactly on $A_j^c$. Therefore
\begin{align*} P_v(\hat{v}_j \neq v_j)+P_{v^{(j)}}(\hat{v}_j \neq v^{(j)}_j) = P_v(A_j)+P_{v^{(j)}}(A_j^c). \end{align*}
Using $P_{v^{(j)}}(A_j^c)=1-P_{v^{(j)}}(A_j)$, we compute
\begin{align*} P_v(A_j)+P_{v^{(j)}}(A_j^c)=1-\bigl(P_{v^{(j)}}(A_j)-P_v(A_j)\bigr). \end{align*}
By the definition of [total variation distance](/page/Total%20Variation),
\begin{align*} P_{v^{(j)}}(A_j)-P_v(A_j) \leq \operatorname{TV}(P_v,P_{v^{(j)}}), \end{align*}
so
\begin{align*} P_v(A_j)+P_{v^{(j)}}(A_j^c) \geq 1-\operatorname{TV}(P_v,P_{v^{(j)}}). \end{align*}
This is the binary testing lower bound in the only form needed here, derived directly from the definition of total variation.

Now average over all pairs with $v_j=0$. Each vertex of the cube appears exactly once in the paired sum, either as $v$ or as $v^{(j)}$, so
\begin{align*} 2^{-m}\sum_{v \in \{0,1\}^m}P_v(\hat{v}_j \neq v_j) = 2^{-m}\sum_{v \in \mathcal V_j^0} \left[ P_v(\hat{v}_j \neq v_j)+P_{v^{(j)}}(\hat{v}_j \neq v^{(j)}_j) \right]. \end{align*}
Applying the pairwise bound gives
\begin{align*} 2^{-m}\sum_{v \in \{0,1\}^m}P_v(\hat{v}_j \neq v_j) \geq 2^{-m}\sum_{v \in \mathcal V_j^0} \left[1-\operatorname{TV}(P_v,P_{v^{(j)}})\right]. \end{align*}
Since $|\mathcal V_j^0|=2^{m-1}$, the constant part contributes $2^{-m}2^{m-1}=1/2$. Also, the map $v\mapsto v^{(j)}$ pairs the cube into unordered pairs and total variation is symmetric, so summing over $\mathcal V_j^0$ counts each pair once, half as often as summing over the whole cube:
\begin{align*} 2^{-m}\sum_{v \in \mathcal V_j^0}\operatorname{TV}(P_v,P_{v^{(j)}}) = \frac{1}{2}\alpha_j. \end{align*}
Consequently
\begin{align*} 2^{-m}\sum_{v \in \{0,1\}^m}P_v(\hat{v}_j \neq v_j) \geq \frac{1-\alpha_j}{2}. \end{align*}[/guided]
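
The averaged testing bound can also be verified by brute force on a small example. The sketch below makes several assumptions that do not come from the proof: the sample space is $\{0,1\}^m$, $P_v$ is a hypothetical product of Bernoulli laws with success probability $1/2\pm\delta$ depending on $v_i$, and $\alpha_j$ is taken to be the cube average $2^{-m}\sum_{v}\operatorname{TV}(P_v,P_{v^{(j)}})$, which is the reading consistent with the display above. It enumerates every possible decision rule $\hat v_j:\mathcal X\to\{0,1\}$ and compares the smallest average error with $(1-\alpha_j)/2$.

```python
import itertools


def bernoulli_product(v, delta):
    """Toy law P_v on {0,1}^m (assumption): coordinate i equals 1 with
    probability 1/2 + delta if v_i = 1 and 1/2 - delta if v_i = 0."""
    dist = {}
    for x in itertools.product((0, 1), repeat=len(v)):
        p = 1.0
        for xi, vi in zip(x, v):
            q = 0.5 + delta if vi == 1 else 0.5 - delta
            p *= q if xi == 1 else 1.0 - q
        dist[x] = p
    return dist


def tv(p, q):
    """Total variation distance between two laws on the same finite set."""
    return 0.5 * sum(abs(p[x] - q[x]) for x in p)


def flip(v, j):
    """Return v with coordinate j flipped (0-based index)."""
    return v[:j] + (1 - v[j],) + v[j + 1:]


def check_coordinate_bound(m=2, delta=0.2, j=0):
    """Return (smallest average j-th coordinate error, (1 - alpha_j) / 2)."""
    cube = list(itertools.product((0, 1), repeat=m))
    laws = {v: bernoulli_product(v, delta) for v in cube}
    # alpha_j as the cube average of TV(P_v, P_{v^(j)}) (assumed definition).
    alpha_j = sum(tv(laws[v], laws[flip(v, j)]) for v in cube) / len(cube)
    bound = (1.0 - alpha_j) / 2.0
    worst = 1.0
    # Enumerate every map from the sample space to {0,1}.
    for bits in itertools.product((0, 1), repeat=len(cube)):
        rule = dict(zip(cube, bits))  # rule[x] is the decoded j-th coordinate
        avg_err = sum(
            sum(p for x, p in laws[v].items() if rule[x] != v[j])
            for v in cube
        ) / len(cube)
        worst = min(worst, avg_err)
    return worst, bound


if __name__ == "__main__":
    worst, bound = check_coordinate_bound()
    print(worst, bound)
```

Exhaustive enumeration is only feasible for very small $m$, since there are $2^{2^m}$ decision rules; it is used here because it checks the bound against every rule rather than a sample of them.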


Verification Progress

11 Total Blocks

0 Verified

0% verified

Contributors

admin 11 blocks (0 verified)

Who Can Verify

Areas: Probability & Statistics

Viktor Miykov Admin

Max Vassiliev Global Reviewer

Horia Neagu Global Reviewer

강현욱 Global Reviewer

Demo Testing Global Reviewer

Archie Pennycook Global Reviewer
