RIP Sufficient Condition for Exact Basis Pursuit Recovery

RIP Sufficient Condition for Exact Basis Pursuit Recovery (Theorem # 5917)

Theorem

Edit Issues Pull Requests Attributions Admin

Discussion

Proof

[proofplan] We prove the result with the explicit universal choice \begin{align*} \delta_* = \frac{1}{3}. \end{align*} The main point is to derive the null-space inequality \begin{align*} \|h_S\|_1 < \|h_{S^c}\|_1 \end{align*} for every non-zero $h \in \ker A$ and every index set $S \subset \{1,\dots,N\}$ with $|S| \leq k$. This is obtained by adjoining to $S$ the largest block $S_1 \subset S^c$, applying the restricted isometry property to the combined set $S \cup S_1$, and bounding only the remaining tail blocks by the $\ell^1$ norm on $S^c$. Once this strict null-space inequality is known, exact recovery follows by comparing any feasible competitor $z = x+h$ with the original sparse vector $x$. [/proofplan] [step:Derive an RIP orthogonality estimate for disjoint sparse vectors] Let $\delta := \delta_{2k}(A)$, and assume \begin{align*} \delta \leq \frac{1}{3}. \end{align*} If $k=0$, then $\Sigma_0=\{0\}$. For $x=0$ we have $y=Ax=0$, and the basis pursuit objective $z \mapsto \|z\|_1$ has the unique global minimizer $z=0$ among all feasible vectors. Hence the assertion is immediate when $k=0$, and for the rest of the proof we assume $k\geq 1$. By definition of the restricted isometry constant, every vector $v \in \mathbb{R}^N$ with at most $2k$ non-zero coordinates satisfies \begin{align*} (1-\delta)\|v\|_2^2 \leq \|Av\|_2^2 \leq (1+\delta)\|v\|_2^2. \end{align*} We first record the consequence that if $u,v \in \mathbb{R}^N$ have disjoint supports and each support has cardinality at most $k$, then \begin{align*} |\langle Au, Av\rangle| \leq \delta \|u\|_2\|v\|_2. \end{align*} Indeed, the vectors $u+v$ and $u-v$ are both supported on $\operatorname{supp}(u)\cup \operatorname{supp}(v)$, which has cardinality at most $2k$. Since the supports are disjoint, we have \begin{align*} \|u+v\|_2^2 = \|u\|_2^2+\|v\|_2^2. \end{align*} The same disjoint-support computation gives \begin{align*} \|u-v\|_2^2 = \|u\|_2^2+\|v\|_2^2. \end{align*} Using the polarization identity in $\mathbb{R}^m$ gives \begin{align*} 4\langle Au,Av\rangle = \|A(u+v)\|_2^2-\|A(u-v)\|_2^2. \end{align*} Applying the RIP upper bound to $u+v$ and the RIP lower bound to $u-v$ gives \begin{align*} 4\langle Au,Av\rangle \leq (1+\delta)(\|u\|_2^2+\|v\|_2^2) - (1-\delta)(\|u\|_2^2+\|v\|_2^2) = 2\delta(\|u\|_2^2+\|v\|_2^2). \end{align*} Applying the same argument to $-v$ gives the corresponding lower bound. Replacing $u$ by $tu$ and $v$ by $t^{-1}v$ for $t>0$, then choosing $t^2=\|v\|_2/\|u\|_2$ when both vectors are non-zero, yields \begin{align*} |\langle Au,Av\rangle| \leq \delta \|u\|_2\|v\|_2. \end{align*} The estimate is immediate if $u=0$ or $v=0$. [guided] The RIP controls norms of sparse vectors, but we need to control inner products between images of vectors on disjoint supports. The bridge is polarization. Let $u,v \in \mathbb{R}^N$ have disjoint supports, with $|\operatorname{supp}(u)| \leq k$ and $|\operatorname{supp}(v)| \leq k$. Then $u+v$ and $u-v$ are both $2k$-sparse, so the RIP hypothesis applies to both. Because the supports are disjoint, the Euclidean cross term vanishes. Hence \begin{align*} \|u+v\|_2^2 = \|u\|_2^2+\|v\|_2^2. \end{align*} The same computation with $u-v$ gives \begin{align*} \|u-v\|_2^2 = \|u\|_2^2+\|v\|_2^2. \end{align*} In the measurement space $\mathbb{R}^m$, polarization gives \begin{align*} 4\langle Au,Av\rangle = \|A(u+v)\|_2^2-\|A(u-v)\|_2^2. \end{align*} Now we use the RIP upper estimate on the first term and the RIP lower estimate on the second: \begin{align*} 4\langle Au,Av\rangle \leq (1+\delta)\|u+v\|_2^2-(1-\delta)\|u-v\|_2^2. \end{align*} Substituting the two disjoint-support norm identities yields \begin{align*} 4\langle Au,Av\rangle \leq (1+\delta)(\|u\|_2^2+\|v\|_2^2) - (1-\delta)(\|u\|_2^2+\|v\|_2^2) = 2\delta(\|u\|_2^2+\|v\|_2^2). \end{align*} Replacing $v$ by $-v$ gives the same upper bound for $-\langle Au,Av\rangle$, hence \begin{align*} |\langle Au,Av\rangle| \leq \frac{\delta}{2}(\|u\|_2^2+\|v\|_2^2). \end{align*} This estimate is homogeneous, so we improve it to the product form by scaling. For $t>0$, apply the last inequality to $tu$ and $t^{-1}v$: \begin{align*} |\langle Au,Av\rangle| \leq \frac{\delta}{2}(t^2\|u\|_2^2+t^{-2}\|v\|_2^2). \end{align*} If $u$ and $v$ are non-zero, choose $t^2=\|v\|_2/\|u\|_2$. Then the right-hand side becomes $\delta\|u\|_2\|v\|_2$. If either vector is zero, the same conclusion is immediate. Therefore \begin{align*} |\langle Au,Av\rangle| \leq \delta \|u\|_2\|v\|_2. \end{align*} [/guided] [/step] [step:Partition the complement into decreasing blocks] Fix a non-zero vector $h \in \ker A$ and a set $S \subset \{1,\dots,N\}$ with $|S|\leq k$. For any set $E \subset \{1,\dots,N\}$, define the coordinate restriction map $R_E: \mathbb{R}^N \to \mathbb{R}^N$ by declaring that, for $w \in \mathbb{R}^N$, the vector $R_Ew \in \mathbb{R}^N$ has coordinates \begin{align*} (R_Ew)_i = w_i \text{ for } i \in E, \quad \text{and} \quad (R_Ew)_i = 0 \text{ for } i \notin E. \end{align*} We write $h_E := R_Eh$. Partition $S^c$ into disjoint sets $S_1,S_2,\dots,S_r$ as follows. The set $S_1$ contains the indices of the largest $k$ values of $|h_i|$ on $S^c$, or all remaining indices if fewer than $k$ remain. Having chosen $S_1,\dots,S_{j-1}$, the set $S_j$ contains the indices of the next largest $k$ values of $|h_i|$ among the remaining coordinates. Thus $|S_j|\leq k$ for every $j$, the sets $S_j$ are pairwise disjoint, and \begin{align*} h_{S^c} = \sum_{j=1}^r h_{S_j}. \end{align*} [/step] [step:Control the tail blocks by the complement $\ell^1$ norm] For every $j \geq 2$ and every $i \in S_j$, the construction of the blocks gives \begin{align*} |h_i| \leq \frac{1}{k}\|h_{S_{j-1}}\|_1. \end{align*} Therefore \begin{align*} \|h_{S_j}\|_2 \leq k^{1/2}\max_{i\in S_j}|h_i| \leq k^{-1/2}\|h_{S_{j-1}}\|_1. \end{align*} Summing over $j \geq 2$ yields \begin{align*} \sum_{j=2}^r \|h_{S_j}\|_2 \leq k^{-1/2}\sum_{j=1}^{r-1}\|h_{S_j}\|_1 \leq k^{-1/2}\|h_{S^c}\|_1. \end{align*} [/step] [step:Use the kernel equation to estimate $\|h_S\|_1$ through the tail] Let $T := S \cup S_1$. Since $S$ and $S_1$ are disjoint and each has cardinality at most $k$, the set $T$ has cardinality at most $2k$. If $r=1$, then $h=h_T$. Since $h\in\ker A$, this gives $Ah_T=0$, while the RIP lower bound gives \begin{align*} (1-\delta)\|h_T\|_2^2 \leq \|Ah_T\|_2^2=0. \end{align*} Because $1-\delta>0$, it follows that $h_T=0$, hence $h=0$, contradicting the choice of $h$. Therefore $r\geq 2$. Since $h \in \ker A$ and $h = h_T + \sum_{j=2}^r h_{S_j}$, we have \begin{align*} Ah_T = -\sum_{j=2}^r Ah_{S_j}. \end{align*} Taking the [inner product](/page/Inner%20Product) with $Ah_T$ in $\mathbb{R}^m$ gives \begin{align*} \|Ah_T\|_2^2 = -\sum_{j=2}^r \langle Ah_T,Ah_{S_j}\rangle. \end{align*} The supports of $h_T$ and $h_{S_j}$ are disjoint. Moreover $h_T$ is supported on at most $2k$ coordinates and $h_{S_j}$ is supported on at most $k$ coordinates. To use the RIP orthogonality estimate with order $2k$, decompose $h_T=h_S+h_{S_1}$ and apply the estimate separately to the pairs $(h_S,h_{S_j})$ and $(h_{S_1},h_{S_j})$. By the triangle inequality in $\mathbb{R}$, \begin{align*} |\langle Ah_T,Ah_{S_j}\rangle| \leq \delta\|h_S\|_2\|h_{S_j}\|_2+ \delta\|h_{S_1}\|_2\|h_{S_j}\|_2. \end{align*} The triangle inequality for the Euclidean norm gives $\|h_S\|_2+\|h_{S_1}\|_2 \leq \sqrt{2}\|h_T\|_2$, because $h_S$ and $h_{S_1}$ have disjoint supports. Hence \begin{align*} |\langle Ah_T,Ah_{S_j}\rangle| \leq \sqrt{2}\delta\|h_T\|_2\|h_{S_j}\|_2. \end{align*} Since $h_T$ is $2k$-sparse, the RIP lower bound gives \begin{align*} (1-\delta)\|h_T\|_2^2 \leq \|Ah_T\|_2^2. \end{align*} Combining these inequalities yields \begin{align*} (1-\delta)\|h_T\|_2^2 \leq \sqrt{2}\delta\|h_T\|_2\sum_{j=2}^r\|h_{S_j}\|_2. \end{align*} If $h_T=0$, then $h_S=0$ and the strict null-space inequality follows unless $h_{S^c}=0$, which would force $h=0$, contrary to the choice of $h$. If $h_T\neq 0$, divide by $\|h_T\|_2$ and use the tail estimate from the previous step: \begin{align*} \|h_T\|_2 \leq \frac{\sqrt{2}\delta}{1-\delta} \sum_{j=2}^r\|h_{S_j}\|_2 \leq \frac{\sqrt{2}\delta}{1-\delta}k^{-1/2}\|h_{S^c}\|_1. \end{align*} Since $\delta \leq \frac{1}{3}$, we have $\frac{\sqrt{2}\delta}{1-\delta} \leq \frac{1}{\sqrt{2}} < 1$. Therefore \begin{align*} \|h_S\|_1 \leq k^{1/2}\|h_S\|_2 \leq k^{1/2}\|h_T\|_2 \leq \frac{1}{\sqrt{2}}\|h_{S^c}\|_1 < \|h_{S^c}\|_1. \end{align*} [guided] We now turn the equation $Ah=0$ into the strict null-space inequality. Define \begin{align*} T := S \cup S_1. \end{align*} The sets $S$ and $S_1$ are disjoint, and each has cardinality at most $k$, so $|T|\leq 2k$. First we remove the degenerate possibility that there is no tail. If $r=1$, then the block decomposition gives $h=h_T$. Since $h\in\ker A$, we have $Ah_T=0$. The RIP lower bound applies because $h_T$ is supported on at most $2k$ coordinates, and gives \begin{align*} (1-\delta)\|h_T\|_2^2 \leq \|Ah_T\|_2^2=0. \end{align*} Since $\delta\leq 1/3$, the coefficient $1-\delta$ is positive, so $h_T=0$. This would imply $h=0$, contradicting the choice of $h$. Therefore $r\geq 2$, and the tail sum is genuinely present. Because $h=h_T+\sum_{j=2}^r h_{S_j}$ and $Ah=0$, linearity of $A$ gives \begin{align*} Ah_T=-\sum_{j=2}^r Ah_{S_j}. \end{align*} We take the Euclidean inner product in $\mathbb{R}^m$ with $Ah_T$ to compare the size of the main block with the tail: \begin{align*} \|Ah_T\|_2^2 = -\sum_{j=2}^r \langle Ah_T,Ah_{S_j}\rangle. \end{align*} For each $j\geq 2$, the support of $h_{S_j}$ is disjoint from both $S$ and $S_1$. The orthogonality estimate from the first step only applies to pairs whose supports each have size at most $k$, so we split $h_T=h_S+h_{S_1}$ and apply it separately to $(h_S,h_{S_j})$ and $(h_{S_1},h_{S_j})$. This yields \begin{align*} |\langle Ah_T,Ah_{S_j}\rangle| \leq \delta\|h_S\|_2\|h_{S_j}\|_2+ \delta\|h_{S_1}\|_2\|h_{S_j}\|_2. \end{align*} Since $h_S$ and $h_{S_1}$ have disjoint supports, the Euclidean Pythagorean identity gives \begin{align*} \|h_T\|_2^2=\|h_S\|_2^2+\|h_{S_1}\|_2^2. \end{align*} Therefore \begin{align*} \|h_S\|_2+\|h_{S_1}\|_2 \leq \sqrt{2}\|h_T\|_2, \end{align*} by the [Cauchy-Schwarz inequality](/theorems/432) in $\mathbb{R}^2$. Hence \begin{align*} |\langle Ah_T,Ah_{S_j}\rangle| \leq \sqrt{2}\delta\|h_T\|_2\|h_{S_j}\|_2. \end{align*} The RIP lower bound again applies to $h_T$ because $|T|\leq 2k$, so \begin{align*} (1-\delta)\|h_T\|_2^2 \leq \|Ah_T\|_2^2. \end{align*} Combining this lower bound with the preceding inner-product estimate gives \begin{align*} (1-\delta)\|h_T\|_2^2 \leq \sqrt{2}\delta\|h_T\|_2\sum_{j=2}^r\|h_{S_j}\|_2. \end{align*} If $h_T=0$, then $h_S=0$. Since $h\neq 0$, we must have $h_{S^c}\neq 0$, and therefore \begin{align*} \|h_S\|_1=0<\|h_{S^c}\|_1. \end{align*} If $h_T\neq 0$, divide by $\|h_T\|_2$ and use the tail estimate proved in the previous step: \begin{align*} \|h_T\|_2 \leq \frac{\sqrt{2}\delta}{1-\delta} \sum_{j=2}^r\|h_{S_j}\|_2 \leq \frac{\sqrt{2}\delta}{1-\delta}k^{-1/2}\|h_{S^c}\|_1. \end{align*} Because $\delta\leq 1/3$, we have \begin{align*} \frac{\sqrt{2}\delta}{1-\delta} \leq \frac{\sqrt{2}/3}{2/3} = \frac{1}{\sqrt{2}}. \end{align*} Finally, since $k\geq 1$ and $h_S$ is supported on at most $k$ coordinates, the Cauchy-Schwarz inequality gives $\|h_S\|_1\leq k^{1/2}\|h_S\|_2$. Also $\|h_S\|_2\leq\|h_T\|_2$ because $S\subset T$. Thus \begin{align*} \|h_S\|_1 \leq k^{1/2}\|h_S\|_2 \leq k^{1/2}\|h_T\|_2 \leq \frac{1}{\sqrt{2}}\|h_{S^c}\|_1 < \|h_{S^c}\|_1. \end{align*} This proves the strict null-space inequality. [/guided] [/step] [step:Compare every feasible competitor with the sparse vector] Let $x \in \Sigma_k$, and let $S := \operatorname{supp}(x)$, so $|S|\leq k$. Let $z \in \mathbb{R}^N$ be any feasible vector for basis pursuit with $Az=Ax$. Define $h := z-x$. Then $Ah=0$, so $h\in \ker A$. If $h=0$, then $z=x$. If $h\neq 0$, the strict null-space inequality from the previous step gives \begin{align*} \|h_S\|_1 < \|h_{S^c}\|_1. \end{align*} Since $x_{S^c}=0$, the triangle inequality gives \begin{align*} \|z\|_1 = \|x+h\|_1 = \|x_S+h_S\|_1+\|h_{S^c}\|_1. \end{align*} Applying the [reverse triangle inequality](/theorems/2300) on the coordinates in $S$ yields \begin{align*} \|z\|_1 \geq \|x_S\|_1-\|h_S\|_1+\|h_{S^c}\|_1. \end{align*} Using $\|h_S\|_1 < \|h_{S^c}\|_1$, we obtain \begin{align*} \|z\|_1 > \|x_S\|_1 = \|x\|_1. \end{align*} Thus every feasible $z\neq x$ has strictly larger $\ell^1$ norm than $x$. Therefore $x$ is the unique minimizer of the basis pursuit problem, and basis pursuit exactly recovers every $k$-sparse vector from $y=Ax$ whenever $\delta_{2k}(A)\leq \frac{1}{3}$. This proves the theorem with the universal constant $\delta_*=\frac{1}{3}$. [/step]

Prerequisites (0/8 completed)

Prerequisites Graph

Interactive dependency map showing how this theorem builds on foundational concepts

Loading dependency graph...

Theorems

Definitions & Concepts

Explore Further

Inner Product Definition Orthogonality Definition Matrix Definition Set Definition Cauchy-Schwarz Inequality Theorem #432 Cauchy-Schwarz Inequality for Covariance Theorem #5018 Triangle Inequality For Inner Product Spaces Theorem #433 Pythagorean Identity for Orthonormal Families Theorem #4883 Tower Property of Conditional Expectation Conditional Expectation Stable Embedding Lower Bound for Sparse Vectors Probability & Statistics Continuity of Probability Probability Theory Martingale Prediction Principle Probability Theory Karush-Kuhn-Tucker Conditions for the Lasso Active Set Probability & Statistics Reflection Principle for Brownian Motion Brownian Motion Properties of Conditional Expectation Probability Theory Effective-Rank Sample Covariance Concentration Theorem Probability & Statistics Probability & Statistics Area

What brings you to Androma?

Start with a route through the knowledge graph.