Joe Ganley: NP-completeness I: It's not like porn

Recently Jeff Atwood posted about NP-completeness. I and a few other commenters complained that he made a bit of a mess of it. It's easy to be a critic, so I thought I'd give it a go myself. I intend to write a series of posts; this first one will define NP-completeness (correctly), and subsequent posts will talk about various ways of attacking NP-hard problems.

The executive summary is that there is a large class of problems that are NP-complete, all of which probably cannot be solved on general inputs more quickly than in time exponential in the input size. There is no known polynomial-time algorithm for any of them, and smart money says none exists. This is what is meant by the "P=NP" debate; if a polynomial-time algorithm for any NP-complete problem were found, that would imply that P (the class of polynomial-time solvable problems) and NP (the class of problems whose solutions can be verified in polynomial time, which we'll define in a moment) are the same. This almost certainly isn't true, though, which in practice means that none of these problems can be solved in polynomial time on general inputs, and every known exact algorithm for them takes superpolynomial (typically exponential) time in the worst case. For a few reasons, this isn't quite as grim as it sounds, as will become clear in subsequent installments.

For a problem to be in the class NP, it must be possible to verify the validity of a solution to the problem in polynomial time. Mostly in NP we see the decision versions of the optimization problems we actually care about: "Does there exist a traveling salesman tour for these points whose length is less than 100?" rather than "Find me the shortest traveling salesman tour for these points." You can easily see that the optimization problem can be used to solve the decision problem, and is thus at least as hard. Often the decision problem is in NP (given a set of points and a tour, it's trivial to determine whether the tour is shorter than the given bound), while the optimization problem is not (there's no polynomial-time way to verify that a tour is the shortest possible, unless P=NP).
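To make the verification idea concrete, here is a minimal Python sketch, assuming points in the plane with straight-line distances (the post doesn't specify a distance measure). The function names `tour_length`, `verify_tour`, `shortest_tour` and `decide` are made up for illustration. Checking a proposed tour against a bound takes polynomial time, while the brute-force optimizer tries every ordering and is only usable on tiny inputs.

```python
import math
from itertools import permutations


def tour_length(points, tour):
    """Length of the closed tour that visits the points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def verify_tour(points, tour, bound):
    """Polynomial-time check of a proposed solution to the decision problem."""
    visits_each_once = sorted(tour) == list(range(len(points)))
    return visits_each_once and tour_length(points, tour) < bound


def shortest_tour(points):
    """Brute-force optimization: tries (n-1)! orderings starting from point 0."""
    rest = range(1, len(points))
    best = min(
        ((0,) + p for p in permutations(rest)),
        key=lambda t: tour_length(points, t),
    )
    return list(best)


def decide(points, bound):
    """Answer the decision problem by solving the optimization problem."""
    return tour_length(points, shortest_tour(points)) < bound


points = [(0, 0), (0, 3), (4, 3), (4, 0)]      # corners of a 4-by-3 rectangle
print(verify_tour(points, [0, 1, 2, 3], 15))   # True: the perimeter is 14
print(verify_tour(points, [0, 2, 1, 3], 15))   # False: this tour has length 18
print(decide(points, 15))                      # True
```

Note that `decide` is built from `shortest_tour`, which is the sense in which the optimization problem is at least as hard as the decision problem.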

To be NP-complete, a problem must be in NP, and it must also be possible to transform, in polynomial time, any instance of some known NP-complete problem into an instance of our problem, and to transform a solution to our problem back into a solution to that one. This is where we get the all-or-nothing nature of the P=NP conundrum: If you had a polynomial-time algorithm for any NP-complete problem, you could use it to solve every other NP-complete problem via these transformations. Devising such a transformation is at the heart of proving a problem NP-complete.
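As a small illustration of such a transformation, the Python sketch below uses a textbook pair that this post doesn't otherwise discuss: independent set (is there a set of at least k vertices with no edge between any two of them?) plays the known NP-complete problem, and vertex cover plays "our problem". The function names are hypothetical, the vertex cover bound is taken as "at most" rather than "fewer than", and `solve_vertex_cover` is a brute-force stand-in for whatever solver you might have.

```python
from itertools import combinations


def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in edges)


def solve_vertex_cover(vertices, edges, k):
    """Stand-in solver: brute force over subsets of at most k vertices."""
    for size in range(k + 1):
        for subset in combinations(vertices, size):
            if is_vertex_cover(edges, set(subset)):
                return set(subset)
    return None


def independent_set_to_vertex_cover(vertices, edges, k):
    """Transform the instance: same graph, and the bound becomes |V| - k."""
    return vertices, edges, len(vertices) - k


def cover_to_independent_set(vertices, cover):
    """Transform a solution back: the complement of a cover is independent."""
    return set(vertices) - cover


def solve_independent_set(vertices, edges, k):
    """Solve independent set using only the vertex cover solver."""
    v, e, bound = independent_set_to_vertex_cover(vertices, edges, k)
    cover = solve_vertex_cover(v, e, bound)
    return None if cover is None else cover_to_independent_set(vertices, cover)


# A path on four vertices: 0 - 1 - 2 - 3
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3)]
print(solve_independent_set(vertices, edges, 2))   # {1, 3}
print(solve_independent_set(vertices, edges, 3))   # None
```

Both transformations here are trivially polynomial. If `solve_vertex_cover` ran in polynomial time, `solve_independent_set` would too.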

A problem to which an NP-complete problem can be transformed, but which isn't (or isn't known to be) in NP, is referred to as NP-hard rather than NP-complete. The optimization versions of many of the NP-complete decision problems fall into this category.

After you've seen a number of NP-hard problems, you start to get an intuition for whether a new problem is likely to be NP-hard or not. Be careful, though; sometimes problems that 'feel' NP-hard actually aren't, and vice versa.

A few of the classic NP-complete problems are:

Satisfiability (aka SAT): Given a boolean formula, is there a set of assignments to its variables that makes the formula true?

Vertex cover: Given a graph and a number n, is there a set of fewer than n vertices such that at least one endpoint of every edge is in the set?

Bin packing: Given a set of objects with sizes and a set of bins with capacities, is it possible to pack every object into a bin without exceeding any bin's capacity?

See Wikipedia for a longer list, though many of them aren't defined. I may have to fix that.
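For the three problems above, a proposed solution can be checked quickly even though finding one appears to be hard. The Python sketch below is illustrative only, and its encodings are assumptions rather than anything standard to this post: SAT formulas are restricted to conjunctive normal form with integer literals, all bins share a single capacity, and the function names are invented. The brute-force SAT search at the end shows where the exponential cost comes from, since it may try all 2^n assignments.

```python
from itertools import product


def verify_sat(clauses, assignment):
    """CNF formula: literal i means variable i is true, -i means it is false."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def verify_vertex_cover(edges, cover, n):
    """Fewer than n vertices, and every edge has an endpoint in the cover."""
    return len(cover) < n and all(u in cover or v in cover for u, v in edges)


def verify_bin_packing(sizes, capacity, num_bins, packing):
    """packing[i] is the index of the bin that object i is placed in."""
    if len(packing) != len(sizes):
        return False
    if any(not 0 <= b < num_bins for b in packing):
        return False
    loads = [0] * num_bins
    for size, b in zip(sizes, packing):
        loads[b] += size
    return all(load <= capacity for load in loads)


def brute_force_sat(clauses, num_vars):
    """Exhaustive search: up to 2**num_vars assignments, each verified quickly."""
    for values in product([False, True], repeat=num_vars):
        assignment = dict(enumerate(values, start=1))
        if verify_sat(clauses, assignment):
            return assignment
    return None


# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
formula = [[1, -2], [2, 3], [-1, -3]]
print(brute_force_sat(formula, 3))          # {1: False, 2: False, 3: True}
print(brute_force_sat([[1], [-1]], 1))      # None: x1 and not x1 is unsatisfiable
```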

Garey and Johnson's Computers and Intractability: A Guide to the Theory of NP-Completeness is the book to read if you want more; it's one of the better-written CS books I've ever encountered. (I believe somewhere my wife has a photo of me sitting on the beach reading this book. I'm such a nerd.) David Johnson also wrote a column on NP-completeness for Journal of Algorithms for a number of years; it is also excellent reading, and is available online!

Labels: algorithms, np-completeness, theory

Joe Ganley	I make software and sometimes other things.
