[proofplan]
We prove membership in NP by using the chosen subset as a certificate and checking the corresponding sum in polynomial time. For NP-hardness we give a polynomial-time many-one reduction from $\mathrm{3}\text{-}\mathrm{SAT}$, the decision problem whose instances are Boolean formulas in conjunctive normal form with exactly three literal occurrences per clause and whose yes-instances are the satisfiable formulas. The reduction uses decimal columns to encode the choice of one truth value per variable and the satisfaction count of each clause. The base is chosen so that no carries can occur, so equality of integers is equivalent to equality in every column. Slack numbers then make a clause column reach the target digit exactly when the chosen assignment satisfies that clause.
[/proofplan]
[step:Verify that subset choices can be checked in polynomial time]
An instance of $\mathrm{SUBSET}\text{-}\mathrm{SUM}$ consists of integers $a_1,\dots,a_N,T \in \mathbb{N} \cup \{0\}$ written in binary. A certificate is a subset $I \subseteq \{1,\dots,N\}$, equivalently a binary vector
\begin{align*}
\varepsilon: \{1,\dots,N\} \to \{0,1\}
\end{align*}
with $\varepsilon(i)=1$ precisely when $i \in I$.
Given $\varepsilon$, compute
\begin{align*}
S_\varepsilon := \sum_{i=1}^{N} \varepsilon(i)a_i.
\end{align*}
Binary addition of $N$ integers whose bit lengths are bounded by the input size takes polynomial time, and then one compares $S_\varepsilon$ with $T$. Thus $\mathrm{SUBSET}\text{-}\mathrm{SUM}$ belongs to NP.
[/step]
[step:Construct the subset sum instance from a $3$-CNF formula]
The problem $\mathrm{3}\text{-}\mathrm{SAT}$ is the decision problem whose input is a Boolean formula in conjunctive normal form with exactly three literal occurrences in each clause, and whose output is yes precisely when the formula has a satisfying truth assignment. We reduce from $\mathrm{3}\text{-}\mathrm{SAT}$, using the theorem that [3-SAT Is NP-Complete](/theorems/31).
Let $\Phi$ be a $3$-CNF formula with variables $x_1,\dots,x_n$ and clauses $C_1,\dots,C_m$, each clause containing exactly three literal occurrences. We define a polynomial-time map from $\Phi$ to a $\mathrm{SUBSET}\text{-}\mathrm{SUM}$ instance.
Set the base to be $B := 10$. The constructed integers will have $n+m$ base-$B$ columns. Columns $1,\dots,n$ are variable columns, and columns $n+1,\dots,n+m$ are clause columns. For $i \in \{1,\dots,n\}$ and $j \in \{1,\dots,m\}$, using the convention that the truth value $1$ means true and the truth value $0$ means false, define $c_{i,1,j} \in \{0,1,2,3\}$ to be the number of occurrences of the literal $x_i$ in clause $C_j$, and define $c_{i,0,j} \in \{0,1,2,3\}$ to be the number of occurrences of the literal $\neg x_i$ in clause $C_j$.
For each variable $x_i$ and each truth value $\tau \in \{0,1\}$, define the truth-choice integer
\begin{align*}
A_{i,\tau} := B^{i-1} + \sum_{j=1}^{m} c_{i,\tau,j}B^{n+j-1}.
\end{align*}
The integer $A_{i,1}$ represents setting $x_i$ to true, and $A_{i,0}$ represents setting $x_i$ to false.
For each clause $C_j$, define two slack integers
\begin{align*}
U_j := B^{n+j-1}
\end{align*}
and
\begin{align*}
V_j := 2B^{n+j-1}.
\end{align*}
Finally define the target integer
\begin{align*}
T_\Phi := \sum_{i=1}^{n} B^{i-1} + 4\sum_{j=1}^{m} B^{n+j-1}.
\end{align*}
Define the many-one reduction map $f$ from $3$-CNF formulas to subset-sum instances by sending $\Phi$ to the instance whose $2n+2m$ integers are
\begin{align*}
\{A_{i,\tau} : 1 \leq i \leq n,\ \tau \in \{0,1\}\} \cup \{U_j,V_j : 1 \leq j \leq m\}
\end{align*}
and whose target is $T_\Phi$.
[guided]
The construction uses one digit column for each constraint we want to enforce. The variable column for $x_i$ will force the subset to choose exactly one of $A_{i,0}$ and $A_{i,1}$, because both contribute $1$ in column $i$ and the target digit in that column is $1$. The clause column for $C_j$ records how many selected literal choices satisfy that clause, counted with multiplicity of literal occurrences.
We choose the base $B := 10$ so that base-$B$ addition has no carries in any possible subset of the constructed integers. In a variable column, only $A_{i,0}$ and $A_{i,1}$ contribute, so the total possible contribution is at most $2$. In a clause column $n+j$, the total contribution from selected truth-choice integers is at most the number of literal occurrences in $C_j$, namely $3$, because only the selected truth-choice integer for a variable can contribute to each literal occurrence. The slack integers $U_j$ and $V_j$ contribute at most $1+2=3$ more. Hence every column sum in any subset is at most $6$, which is strictly less than $B=10$. Therefore equality of the resulting integers is equivalent to equality of all their base-$B$ digits column by column.
For each $i \in \{1,\dots,n\}$ and $\tau \in \{0,1\}$, the truth-choice integer is
\begin{align*}
A_{i,\tau} := B^{i-1} + \sum_{j=1}^{m} c_{i,\tau,j}B^{n+j-1}.
\end{align*}
The term $B^{i-1}$ places a $1$ in the variable column for $x_i$. The coefficient $c_{i,\tau,j}$ places in the clause column for $C_j$ the number of literal occurrences in $C_j$ made true by assigning $x_i$ the value $\tau$. The slack integers
\begin{align*}
U_j := B^{n+j-1}
\end{align*}
and
\begin{align*}
V_j := 2B^{n+j-1}
\end{align*}
affect only the clause column for $C_j$. The target
\begin{align*}
T_\Phi := \sum_{i=1}^{n} B^{i-1} + 4\sum_{j=1}^{m} B^{n+j-1}
\end{align*}
therefore demands digit $1$ in every variable column and digit $4$ in every clause column.
We now verify, within the construction, why these digits encode satisfiability. Suppose first that $\alpha: \{x_1,\dots,x_n\} \to \{0,1\}$ is a satisfying truth assignment. Select $A_{i,\alpha(x_i)}$ for every $i$. The variable columns then contribute exactly $1$. For a clause $C_j$, let
\begin{align*}
s_j := \sum_{i=1}^{n} c_{i,\alpha(x_i),j}.
\end{align*}
Since $\alpha$ satisfies $C_j$, the integer $s_j$ lies in $\{1,2,3\}$. If $s_j=1$, select both $U_j$ and $V_j$; if $s_j=2$, select $V_j$; if $s_j=3$, select $U_j$. These choices add respectively $3$, $2$, and $1$ in clause column $n+j$, so the clause digit becomes $4$.
Conversely, suppose a selected subset sums to $T_\Phi$. Because every possible column sum is less than $B$, equality of integers forces equality in each column. In variable column $i$, only $A_{i,0}$ and $A_{i,1}$ contribute, each by digit $1$, and the target digit is $1$; hence exactly one of them is selected. Define $\alpha: \{x_1,\dots,x_n\} \to \{0,1\}$ by $\alpha(x_i)=\tau$ when $A_{i,\tau}$ is selected. For a clause $C_j$, define again
\begin{align*}
s_j := \sum_{i=1}^{n} c_{i,\alpha(x_i),j}.
\end{align*}
The slack contribution in clause column $n+j$ can only be $0$, $1$, $2$, or $3$. Since the target digit is $4$, the equality $s_j+r_j=4$ for some slack contribution $r_j \in \{0,1,2,3\}$ rules out $s_j=0$. Thus $s_j \geq 1$, so every clause has a true literal occurrence under $\alpha$.
[/guided]
[/step]
[step:Show that a satisfying assignment gives a subset summing to the target]
Assume $\Phi$ is satisfiable. Let
\begin{align*}
\alpha: \{x_1,\dots,x_n\} \to \{0,1\}
\end{align*}
be a satisfying truth assignment, where $\alpha(x_i)=1$ means $x_i$ is true and $\alpha(x_i)=0$ means $x_i$ is false.
Choose the truth-choice integer $A_{i,\alpha(x_i)}$ for each $i \in \{1,\dots,n\}$. In each variable column $i$, the chosen truth-choice integers contribute exactly $1$, matching the target digit.
Fix a clause index $j \in \{1,\dots,m\}$. Define the clause contribution
\begin{align*}
s_j := \sum_{i=1}^{n} c_{i,\alpha(x_i),j}.
\end{align*}
This is the number of literal occurrences in $C_j$ made true by $\alpha$. Since $\alpha$ satisfies $\Phi$, every clause is satisfied, so $s_j \in \{1,2,3\}$.
If $s_j=1$, choose both $U_j$ and $V_j$, whose clause-column contribution is $1+2=3$. If $s_j=2$, choose $V_j$, whose contribution is $2$. If $s_j=3$, choose $U_j$, whose contribution is $1$. In each case the total contribution in clause column $n+j$ is $4$.
Because no column sum reaches $B=10$, no carries occur. Thus the selected integers have base-$B$ digits equal to those of $T_\Phi$ in every column, so their sum is $T_\Phi$. Hence the constructed subset-sum instance is a yes-instance.
[/step]
[step:Show that a subset summing to the target gives a satisfying assignment]
Assume that a subset of the constructed integers sums to $T_\Phi$. Since no column sum can reach $B=10$, equality with $T_\Phi$ implies equality in every base-$B$ column.
For each $i \in \{1,\dots,n\}$, only the two integers $A_{i,0}$ and $A_{i,1}$ contribute to variable column $i$. The target digit in column $i$ is $1$, and each of $A_{i,0}$ and $A_{i,1}$ contributes exactly $1$ there. Therefore exactly one of $A_{i,0}$ and $A_{i,1}$ is selected. Define
\begin{align*}
\alpha: \{x_1,\dots,x_n\} \to \{0,1\}
\end{align*}
by setting $\alpha(x_i)=\tau$ when $A_{i,\tau}$ is the selected truth-choice integer.
Fix $j \in \{1,\dots,m\}$. Define
\begin{align*}
s_j := \sum_{i=1}^{n} c_{i,\alpha(x_i),j}.
\end{align*}
The value $s_j$ is the number of true literal occurrences in $C_j$ under $\alpha$, so $s_j \in \{0,1,2,3\}$. The only possible slack contributions in clause column $n+j$ are $0$, $1$, $2$, and $3$, obtained from choosing no slack number, choosing $U_j$, choosing $V_j$, or choosing both. Since the target digit in clause column $n+j$ is $4$, there must be a slack contribution $r_j \in \{0,1,2,3\}$ such that
\begin{align*}
s_j + r_j = 4.
\end{align*}
This is impossible when $s_j=0$. Therefore $s_j \geq 1$, so clause $C_j$ has at least one true literal occurrence under $\alpha$. Since $j$ was arbitrary, $\alpha$ satisfies every clause of $\Phi$.
[/step]
[step:Conclude polynomial-time NP-hardness and NP-completeness]
The construction produces $2n+2m$ integers. Each integer has at most $n+m$ base-$10$ digits, hence bit length $O(n+m)$. The coefficients $c_{i,\tau,j}$ are obtained by scanning the literal occurrences of $\Phi$, so the whole transformation is computable in time polynomial in the size of $\Phi$.
The preceding two steps prove that
\begin{align*}
\Phi \in \mathrm{3}\text{-}\mathrm{SAT} \iff f(\Phi) \in \mathrm{SUBSET}\text{-}\mathrm{SUM},
\end{align*}
where $f$ denotes the constructed many-one reduction. Since [3-SAT Is NP-Complete](/theorems/31), $\mathrm{SUBSET}\text{-}\mathrm{SUM}$ is NP-hard. Together with $\mathrm{SUBSET}\text{-}\mathrm{SUM}$ belonging to NP, this proves that $\mathrm{SUBSET}\text{-}\mathrm{SUM}$ is NP-complete.
[/step]