[proofplan]
We prove the result with the explicit universal choice
\begin{align*}
\delta_* = \frac{1}{3}.
\end{align*}
The main point is to derive the null-space inequality
\begin{align*}
\|h_S\|_1 < \|h_{S^c}\|_1
\end{align*}
for every non-zero $h \in \ker A$ and every index set $S \subset \{1,\dots,N\}$ with $|S| \leq k$. This is obtained by adjoining to $S$ the largest block $S_1 \subset S^c$, applying the restricted isometry property to the combined set $S \cup S_1$, and bounding only the remaining tail blocks by the $\ell^1$ norm on $S^c$. Once this strict null-space inequality is known, exact recovery follows by comparing any feasible competitor $z = x+h$ with the original sparse vector $x$.
[/proofplan]
[step:Derive an RIP orthogonality estimate for disjoint sparse vectors]
Let $\delta := \delta_{2k}(A)$, and assume
\begin{align*}
\delta \leq \frac{1}{3}.
\end{align*}
If $k=0$, then $\Sigma_0=\{0\}$. For $x=0$ we have $y=Ax=0$, and the basis pursuit objective $z \mapsto \|z\|_1$ has the unique global minimizer $z=0$ among all feasible vectors. Hence the assertion is immediate when $k=0$, and for the rest of the proof we assume $k\geq 1$.
By definition of the restricted isometry constant, every vector $v \in \mathbb{R}^N$ with at most $2k$ non-zero coordinates satisfies
\begin{align*}
(1-\delta)\|v\|_2^2 \leq \|Av\|_2^2 \leq (1+\delta)\|v\|_2^2.
\end{align*}
We first record the consequence that if $u,v \in \mathbb{R}^N$ have disjoint supports and each support has cardinality at most $k$, then
\begin{align*}
|\langle Au, Av\rangle| \leq \delta \|u\|_2\|v\|_2.
\end{align*}
Indeed, the vectors $u+v$ and $u-v$ are both supported on $\operatorname{supp}(u)\cup \operatorname{supp}(v)$, which has cardinality at most $2k$. Since the supports are disjoint, we have
\begin{align*}
\|u+v\|_2^2 = \|u\|_2^2+\|v\|_2^2.
\end{align*}
The same disjoint-support computation gives
\begin{align*}
\|u-v\|_2^2 = \|u\|_2^2+\|v\|_2^2.
\end{align*}
Using the polarization identity in $\mathbb{R}^m$ gives
\begin{align*}
4\langle Au,Av\rangle
= \|A(u+v)\|_2^2-\|A(u-v)\|_2^2.
\end{align*}
Applying the RIP upper bound to $u+v$ and the RIP lower bound to $u-v$ gives
\begin{align*}
4\langle Au,Av\rangle
\leq
(1+\delta)(\|u\|_2^2+\|v\|_2^2)
-
(1-\delta)(\|u\|_2^2+\|v\|_2^2)
=
2\delta(\|u\|_2^2+\|v\|_2^2).
\end{align*}
Applying the same argument to $-v$ gives the corresponding lower bound. Replacing $u$ by $tu$ and $v$ by $t^{-1}v$ for $t>0$, then choosing $t^2=\|v\|_2/\|u\|_2$ when both vectors are non-zero, yields
\begin{align*}
|\langle Au,Av\rangle| \leq \delta \|u\|_2\|v\|_2.
\end{align*}
The estimate is immediate if $u=0$ or $v=0$.
[guided]
The RIP controls norms of sparse vectors, but we need to control inner products between images of vectors on disjoint supports. The bridge is polarization. Let $u,v \in \mathbb{R}^N$ have disjoint supports, with $|\operatorname{supp}(u)| \leq k$ and $|\operatorname{supp}(v)| \leq k$. Then $u+v$ and $u-v$ are both $2k$-sparse, so the RIP hypothesis applies to both.
Because the supports are disjoint, the Euclidean cross term vanishes. Hence
\begin{align*}
\|u+v\|_2^2 = \|u\|_2^2+\|v\|_2^2.
\end{align*}
The same computation with $u-v$ gives
\begin{align*}
\|u-v\|_2^2 = \|u\|_2^2+\|v\|_2^2.
\end{align*}
In the measurement space $\mathbb{R}^m$, polarization gives
\begin{align*}
4\langle Au,Av\rangle
=
\|A(u+v)\|_2^2-\|A(u-v)\|_2^2.
\end{align*}
Now we use the RIP upper estimate on the first term and the RIP lower estimate on the second:
\begin{align*}
4\langle Au,Av\rangle
\leq
(1+\delta)\|u+v\|_2^2-(1-\delta)\|u-v\|_2^2.
\end{align*}
Substituting the two disjoint-support norm identities yields
\begin{align*}
4\langle Au,Av\rangle
\leq
(1+\delta)(\|u\|_2^2+\|v\|_2^2)
-
(1-\delta)(\|u\|_2^2+\|v\|_2^2)
=
2\delta(\|u\|_2^2+\|v\|_2^2).
\end{align*}
Replacing $v$ by $-v$ gives the same upper bound for $-\langle Au,Av\rangle$, hence
\begin{align*}
|\langle Au,Av\rangle|
\leq
\frac{\delta}{2}(\|u\|_2^2+\|v\|_2^2).
\end{align*}
This estimate is homogeneous, so we improve it to the product form by scaling. For $t>0$, apply the last inequality to $tu$ and $t^{-1}v$:
\begin{align*}
|\langle Au,Av\rangle|
\leq
\frac{\delta}{2}(t^2\|u\|_2^2+t^{-2}\|v\|_2^2).
\end{align*}
If $u$ and $v$ are non-zero, choose $t^2=\|v\|_2/\|u\|_2$. Then the right-hand side becomes $\delta\|u\|_2\|v\|_2$. If either vector is zero, the same conclusion is immediate. Therefore
\begin{align*}
|\langle Au,Av\rangle| \leq \delta \|u\|_2\|v\|_2.
\end{align*}
[/guided]
[/step]
[step:Partition the complement into decreasing blocks]
Fix a non-zero vector $h \in \ker A$ and a set $S \subset \{1,\dots,N\}$ with $|S|\leq k$. For any set $E \subset \{1,\dots,N\}$, define the coordinate restriction map $R_E: \mathbb{R}^N \to \mathbb{R}^N$ by declaring that, for $w \in \mathbb{R}^N$, the vector $R_Ew \in \mathbb{R}^N$ has coordinates
\begin{align*}
(R_Ew)_i = w_i \text{ for } i \in E, \quad \text{and} \quad (R_Ew)_i = 0 \text{ for } i \notin E.
\end{align*}
We write $h_E := R_Eh$.
Partition $S^c$ into disjoint sets $S_1,S_2,\dots,S_r$ as follows. The set $S_1$ contains the indices of the largest $k$ values of $|h_i|$ on $S^c$, or all remaining indices if fewer than $k$ remain. Having chosen $S_1,\dots,S_{j-1}$, the set $S_j$ contains the indices of the next largest $k$ values of $|h_i|$ among the remaining coordinates. Thus $|S_j|\leq k$ for every $j$, the sets $S_j$ are pairwise disjoint, and
\begin{align*}
h_{S^c} = \sum_{j=1}^r h_{S_j}.
\end{align*}
[/step]
[step:Control the tail blocks by the complement $\ell^1$ norm]
For every $j \geq 2$ and every $i \in S_j$, the construction of the blocks gives
\begin{align*}
|h_i| \leq \frac{1}{k}\|h_{S_{j-1}}\|_1.
\end{align*}
Therefore
\begin{align*}
\|h_{S_j}\|_2
\leq
k^{1/2}\max_{i\in S_j}|h_i|
\leq
k^{-1/2}\|h_{S_{j-1}}\|_1.
\end{align*}
Summing over $j \geq 2$ yields
\begin{align*}
\sum_{j=2}^r \|h_{S_j}\|_2
\leq
k^{-1/2}\sum_{j=1}^{r-1}\|h_{S_j}\|_1
\leq
k^{-1/2}\|h_{S^c}\|_1.
\end{align*}
[/step]
[step:Use the kernel equation to estimate $\|h_S\|_1$ through the tail]
Let $T := S \cup S_1$. Since $S$ and $S_1$ are disjoint and each has cardinality at most $k$, the set $T$ has cardinality at most $2k$. If $r=1$, then $h=h_T$. Since $h\in\ker A$, this gives $Ah_T=0$, while the RIP lower bound gives
\begin{align*}
(1-\delta)\|h_T\|_2^2 \leq \|Ah_T\|_2^2=0.
\end{align*}
Because $1-\delta>0$, it follows that $h_T=0$, hence $h=0$, contradicting the choice of $h$. Therefore $r\geq 2$. Since $h \in \ker A$ and $h = h_T + \sum_{j=2}^r h_{S_j}$, we have
\begin{align*}
Ah_T = -\sum_{j=2}^r Ah_{S_j}.
\end{align*}
Taking the [inner product](/page/Inner%20Product) with $Ah_T$ in $\mathbb{R}^m$ gives
\begin{align*}
\|Ah_T\|_2^2
=
-\sum_{j=2}^r \langle Ah_T,Ah_{S_j}\rangle.
\end{align*}
The supports of $h_T$ and $h_{S_j}$ are disjoint. Moreover $h_T$ is supported on at most $2k$ coordinates and $h_{S_j}$ is supported on at most $k$ coordinates. To use the RIP orthogonality estimate with order $2k$, decompose $h_T=h_S+h_{S_1}$ and apply the estimate separately to the pairs $(h_S,h_{S_j})$ and $(h_{S_1},h_{S_j})$. By the triangle inequality in $\mathbb{R}$,
\begin{align*}
|\langle Ah_T,Ah_{S_j}\rangle|
\leq
\delta\|h_S\|_2\|h_{S_j}\|_2+
\delta\|h_{S_1}\|_2\|h_{S_j}\|_2.
\end{align*}
The triangle inequality for the Euclidean norm gives $\|h_S\|_2+\|h_{S_1}\|_2 \leq \sqrt{2}\|h_T\|_2$, because $h_S$ and $h_{S_1}$ have disjoint supports. Hence
\begin{align*}
|\langle Ah_T,Ah_{S_j}\rangle|
\leq
\sqrt{2}\delta\|h_T\|_2\|h_{S_j}\|_2.
\end{align*}
Since $h_T$ is $2k$-sparse, the RIP lower bound gives
\begin{align*}
(1-\delta)\|h_T\|_2^2 \leq \|Ah_T\|_2^2.
\end{align*}
Combining these inequalities yields
\begin{align*}
(1-\delta)\|h_T\|_2^2
\leq
\sqrt{2}\delta\|h_T\|_2\sum_{j=2}^r\|h_{S_j}\|_2.
\end{align*}
If $h_T=0$, then $h_S=0$ and the strict null-space inequality follows unless $h_{S^c}=0$, which would force $h=0$, contrary to the choice of $h$. If $h_T\neq 0$, divide by $\|h_T\|_2$ and use the tail estimate from the previous step:
\begin{align*}
\|h_T\|_2
\leq
\frac{\sqrt{2}\delta}{1-\delta}
\sum_{j=2}^r\|h_{S_j}\|_2
\leq
\frac{\sqrt{2}\delta}{1-\delta}k^{-1/2}\|h_{S^c}\|_1.
\end{align*}
Since $\delta \leq \frac{1}{3}$, we have $\frac{\sqrt{2}\delta}{1-\delta} \leq \frac{1}{\sqrt{2}} < 1$. Therefore
\begin{align*}
\|h_S\|_1
\leq
k^{1/2}\|h_S\|_2
\leq
k^{1/2}\|h_T\|_2
\leq
\frac{1}{\sqrt{2}}\|h_{S^c}\|_1
<
\|h_{S^c}\|_1.
\end{align*}
[guided]
We now turn the equation $Ah=0$ into the strict null-space inequality. Define
\begin{align*}
T := S \cup S_1.
\end{align*}
The sets $S$ and $S_1$ are disjoint, and each has cardinality at most $k$, so $|T|\leq 2k$. First we remove the degenerate possibility that there is no tail. If $r=1$, then the block decomposition gives $h=h_T$. Since $h\in\ker A$, we have $Ah_T=0$. The RIP lower bound applies because $h_T$ is supported on at most $2k$ coordinates, and gives
\begin{align*}
(1-\delta)\|h_T\|_2^2 \leq \|Ah_T\|_2^2=0.
\end{align*}
Since $\delta\leq 1/3$, the coefficient $1-\delta$ is positive, so $h_T=0$. This would imply $h=0$, contradicting the choice of $h$. Therefore $r\geq 2$, and the tail sum is genuinely present.
Because $h=h_T+\sum_{j=2}^r h_{S_j}$ and $Ah=0$, linearity of $A$ gives
\begin{align*}
Ah_T=-\sum_{j=2}^r Ah_{S_j}.
\end{align*}
We take the Euclidean inner product in $\mathbb{R}^m$ with $Ah_T$ to compare the size of the main block with the tail:
\begin{align*}
\|Ah_T\|_2^2
=
-\sum_{j=2}^r \langle Ah_T,Ah_{S_j}\rangle.
\end{align*}
For each $j\geq 2$, the support of $h_{S_j}$ is disjoint from both $S$ and $S_1$. The orthogonality estimate from the first step only applies to pairs whose supports each have size at most $k$, so we split $h_T=h_S+h_{S_1}$ and apply it separately to $(h_S,h_{S_j})$ and $(h_{S_1},h_{S_j})$. This yields
\begin{align*}
|\langle Ah_T,Ah_{S_j}\rangle|
\leq
\delta\|h_S\|_2\|h_{S_j}\|_2+
\delta\|h_{S_1}\|_2\|h_{S_j}\|_2.
\end{align*}
Since $h_S$ and $h_{S_1}$ have disjoint supports, the Euclidean Pythagorean identity gives
\begin{align*}
\|h_T\|_2^2=\|h_S\|_2^2+\|h_{S_1}\|_2^2.
\end{align*}
Therefore
\begin{align*}
\|h_S\|_2+\|h_{S_1}\|_2
\leq
\sqrt{2}\|h_T\|_2,
\end{align*}
by the [Cauchy-Schwarz inequality](/theorems/432) in $\mathbb{R}^2$. Hence
\begin{align*}
|\langle Ah_T,Ah_{S_j}\rangle|
\leq
\sqrt{2}\delta\|h_T\|_2\|h_{S_j}\|_2.
\end{align*}
The RIP lower bound again applies to $h_T$ because $|T|\leq 2k$, so
\begin{align*}
(1-\delta)\|h_T\|_2^2 \leq \|Ah_T\|_2^2.
\end{align*}
Combining this lower bound with the preceding inner-product estimate gives
\begin{align*}
(1-\delta)\|h_T\|_2^2
\leq
\sqrt{2}\delta\|h_T\|_2\sum_{j=2}^r\|h_{S_j}\|_2.
\end{align*}
If $h_T=0$, then $h_S=0$. Since $h\neq 0$, we must have $h_{S^c}\neq 0$, and therefore
\begin{align*}
\|h_S\|_1=0<\|h_{S^c}\|_1.
\end{align*}
If $h_T\neq 0$, divide by $\|h_T\|_2$ and use the tail estimate proved in the previous step:
\begin{align*}
\|h_T\|_2
\leq
\frac{\sqrt{2}\delta}{1-\delta}
\sum_{j=2}^r\|h_{S_j}\|_2
\leq
\frac{\sqrt{2}\delta}{1-\delta}k^{-1/2}\|h_{S^c}\|_1.
\end{align*}
Because $\delta\leq 1/3$, we have
\begin{align*}
\frac{\sqrt{2}\delta}{1-\delta}
\leq
\frac{\sqrt{2}/3}{2/3}
=
\frac{1}{\sqrt{2}}.
\end{align*}
Finally, since $k\geq 1$ and $h_S$ is supported on at most $k$ coordinates, the Cauchy-Schwarz inequality gives $\|h_S\|_1\leq k^{1/2}\|h_S\|_2$. Also $\|h_S\|_2\leq\|h_T\|_2$ because $S\subset T$. Thus
\begin{align*}
\|h_S\|_1
\leq
k^{1/2}\|h_S\|_2
\leq
k^{1/2}\|h_T\|_2
\leq
\frac{1}{\sqrt{2}}\|h_{S^c}\|_1
<
\|h_{S^c}\|_1.
\end{align*}
This proves the strict null-space inequality.
[/guided]
[/step]
[step:Compare every feasible competitor with the sparse vector]
Let $x \in \Sigma_k$, and let $S := \operatorname{supp}(x)$, so $|S|\leq k$. Let $z \in \mathbb{R}^N$ be any feasible vector for basis pursuit with $Az=Ax$. Define $h := z-x$. Then $Ah=0$, so $h\in \ker A$.
If $h=0$, then $z=x$. If $h\neq 0$, the strict null-space inequality from the previous step gives
\begin{align*}
\|h_S\|_1 < \|h_{S^c}\|_1.
\end{align*}
Since $x_{S^c}=0$, the triangle inequality gives
\begin{align*}
\|z\|_1
=
\|x+h\|_1
=
\|x_S+h_S\|_1+\|h_{S^c}\|_1.
\end{align*}
Applying the [reverse triangle inequality](/theorems/2300) on the coordinates in $S$ yields
\begin{align*}
\|z\|_1
\geq
\|x_S\|_1-\|h_S\|_1+\|h_{S^c}\|_1.
\end{align*}
Using $\|h_S\|_1 < \|h_{S^c}\|_1$, we obtain
\begin{align*}
\|z\|_1 > \|x_S\|_1 = \|x\|_1.
\end{align*}
Thus every feasible $z\neq x$ has strictly larger $\ell^1$ norm than $x$. Therefore $x$ is the unique minimizer of the basis pursuit problem, and basis pursuit exactly recovers every $k$-sparse vector from $y=Ax$ whenever $\delta_{2k}(A)\leq \frac{1}{3}$. This proves the theorem with the universal constant $\delta_*=\frac{1}{3}$.
[/step]