next up previous contents
Next: Previous Empirical Methods Up: Solving Subset Sum Problems Previous: Introduction

  
Theoretical Bounds on Solving Subset Sum Problems

The majority of attacks on knapsack-based cryptosystems exploit the specific constructions of the cryptosystems. Two algorithms have been proposed, however, which depend only on the properties of the subset sum problem and not on any specific method of construction. These algorithms, one by Brickell [6] and one by Lagarias and Odlyzko [26], show that almost all low-density subset sum problem may be solved in polynomial time. The density d of a set of weights $a_1,\ldots{},a_n$ is defined by

 \begin{displaymath}d = \frac{n}{{\displaystyle \log_2 \max\begin{Sb}1 \leq i \leq n\end{Sb}
a_i}}.
\end{displaymath} (3.2)

For d > 1 there will in general be many subsets of weight with the same sum s, so from an encryption-decryption point of view we are interested in subset sum problem instances with $d \leq 1$. The Brickell and Lagarias-Odlyzko algorithms show that it is possible to solve almost all subset sum problems with d sufficiently small. The Brickell and Lagarias-Odlyzko attacks reduce the subset sum problem to the problem of finding the Euclidean-norm shortest nonzero vector in a lattice. As was mentioned in Section 1.1 above, finding short vectors in lattices may be very hard in general. The theoretical worst-case bounds for the LLL algorithm and its variants are not encouraging, and no bound currently exists for Seysen's algorithm. However, these techniques tend to perform much better in practice than in theory; the performance of Seysen's algorithm on the Korkin-Zolotarev test lattice with $\theta = 0.4$ (Section 2.2.3) is one such empirical example. Thus, it seems important to separate the efficiency of lattice reduction and finding short nonzero vectors from the difficulty in reducing subset sum problems to lattice reduction questions. We consider a Euclidean-norm lattice oracle (or lattice oracle for short) that, when given a lattice basis as its input, with high probability finds in polynomial time the Euclidean-norm shortest nonzero vector in the given lattice. We do not know how to construct such an oracle, but it might be possible to do so. Data provided in [26,33] show that at low densities the LLL algorithm behaves as a lattice oracle, and our results in Section 3.5 below show that this is also the case for a combination of the Seysen and LLL algorithms, even for significantly larger and denser subset sum problems. Given the existence of a lattice oracle, the analysis in [26] shows that it is possible to solve almost all subset sum problems of density $d < 0.6463\ldots{}$ in polynomial time. Recently, Coster, LaMacchia, Odlyzko and Schnorr [12] and Joux and Stern [19] independently demonstrated via different techniques that this bound could be improved to d < 0.9408. In fact, if we assume the existence of a sup-norm lattice oracle instead of a Euclidean-norm lattice oracle, [12] showed that the density bound then becomes d < 1. The Lagarias-Odlyzko attack proceeds as follows. Let $\{a_1,\ldots{},a_n\}$ be a set of weights with $0 \leq a_i \leq A$ for some positive integer A and for all $1 \leq i \leq n$. Let $\bold{e} = (e_1,\ldots{},e_n) \in \{0,1\}^n,
\bold{e} \neq (0,0,\ldots{},0)$ be fixed and depend only on n. Then

\begin{displaymath}s = \sum_{i=1}^n e_ia_i, \quad\text{for } e_i \in \{0,1\},
\end{displaymath}

is the sum of the subset of weights, where ai is in the subset if and only if ei = 1. Now, define basis vectors $\bold{b_1},\ldots{},\bold{b_{n+1}}$ as follows:
\begin{align*}\bold{b_1} & = (1,0,\ldots{},0,Na_1), \\
\bold{b_2} & = (0,1,\ld...
...0,0,\ldots{},1,Na_n), \\
\bold{b_{n+1}} & = (0,0,\ldots{},0,Ns),
\end{align*}
where N is a positive integer $> \sqrt{n}$. Let L be the lattice defined by the basis vectors $\bold{b_1},\ldots{},\bold{b_{n+1}}$. That is,

\begin{displaymath}L = \left\{ \sum_{i=1}^{n+1} z_i\bold{b_i} \colon z_i \in {\Bbb Z},\quad\text{for } 1
\leq i \leq n+1 \right\}.
\end{displaymath}

Notice that lattice L contains the vector $\bold{\hat{e}} =
(e_1,e_2,\ldots{},e_n,0)$, the solution vector to the subset sum problem, since

\begin{displaymath}\bold{\hat{e}} = \sum_{i=1}^n e_i\bold{b_i} - \bold{b_{n+1}}.
\end{displaymath}

Let P denote the probability that there exists another vector $\bold{\hat{x}} \in L$ such that $\Vert\bold{\hat{x}}\Vert \leq \Vert\bold{\hat{e}}\Vert$ and $\bold{\hat{x}} \not\in \{\bold{0},\bold{\hat{e}},-\bold{\hat{e}}\}$. The simplified analysis of the Lagarias-Odlyzko attack presented in [14] shows that this probability is bounded:

 \begin{displaymath}P \leq n \left(2n\sqrt{\tfrac{1}{2}n}+1\right)\frac{2^{c_0n}}{A}, \quad\text{for }
c_0 = 1.54724\ldots{}
\end{displaymath} (3.3)

Thus, if the bound on the size of the weights A = 2cn with c > c0, ${\displaystyle
\lim_{n\rightarrow \infty} P = 0}$. If the density of a subset sum problem is less than $0.6463\ldots{}$, then
\begin{align*}\frac{n}{{\displaystyle \log_2\max\begin{Sb}1 \leq i \leq n\end{Sb...
...b} a_i >
2^{n/0.6463\ldots{}} \\ & \Longrightarrow A > 2^{c_0n}.
\end{align*}
Thus, all subset sum problems with density $< 0.6463\ldots{}$ could be solved in polynomial time, given the existence of a lattice oracle. Recently, two independent improvements [12,19] to the Lagarias-Odlyzko attack have been developed, both of which increase the density bound to $d < 0.9408\ldots{}$. The modification suggested in [12] is to replace the $\bold{b_{n+1}}$ basis vector in L with

\begin{displaymath}\bold{b_{n+1}'} = (\tfrac{1}{2},\tfrac{1}{2},\ldots{},\tfrac{1}{2},Ns).
\end{displaymath}

Let L' be the lattice spanned by the vectors $\bold{b_1},\ldots{},\bold{b_n},\bold{b_{n+1}'}$. Lattice L' does not contain the solution vector $\bold{\hat{e}}$, but it contains a similar vector $\bold{\hat{e}'}$:

\begin{displaymath}\bold{\hat{e}'} = (e_1',\ldots{},e_n',0), \quad\text{where } e_i' = e_i-\tfrac{1}{2}.
\end{displaymath}

We know that $e_i' \in \{-\tfrac{1}{2},\tfrac{1}{2}\}$ for $1 \leq i \leq n$ since $e_i \in \{0,1\}$ for $1 \leq i \leq n$. Then $\Vert\bold{\hat{e}'}\Vert^2 \leq
\tfrac{1}{4}n$ independent of the number of ei's which are equal to 1. Using lattice L' we are now interested in the probability P' that there exists a vector $\bold{\hat{x}}'\in L'$ such that:

 \begin{displaymath}\begin{split}
\Vert\bold{\hat{x}}'\Vert & \leq \Vert\bold{\h...
...n \{\bold{0},\bold{\hat{e}}',-\bold{\hat{e}}'\}.
\end{split}
\end{displaymath} (3.4)

Utilizing similar techniques to those in [14,26,28], [12] shows that the probability P' is bounded above by:

 \begin{displaymath}P' \leq n \left(4n\sqrt{n}+1\right)\frac{2^{c_0'n}}{A}\quad\text{for }
c_0' = 1.0628\ldots{}.
\end{displaymath} (3.5)

This bound is similar to that in Equation 3.3 above. Since $1/c_0' = 0.9408\ldots{}$, any subset sum problem with density $d < 0.9408\ldots{}$ may be solved in polynomial time, given the existence of a lattice oracle.
next up previous contents
Next: Previous Empirical Methods Up: Solving Subset Sum Problems Previous: Introduction
Brian A. LaMacchia
1999-10-30