The executive summary is that there is a large class of problems, the NP-complete problems, which probably cannot be solved on general inputs more quickly than in time exponential in the input size. There is no known polynomial-time algorithm for any of them, and smart money says none exists. This is what is meant by the "P=NP" debate: if a polynomial-time algorithm for any NP-complete problem were found, that would imply that P (the class of polynomial-time solvable problems) and NP (which we'll define in a moment) are the same. This almost certainly isn't true, though, which in practice means that none of these problems can be solved in time polynomial in the size of the input. For a few reasons, this isn't quite as grim as it sounds, as will become clear in subsequent installments.

For a problem to be in the class NP, it must be possible to verify the validity of a solution to the problem in polynomial time. Mostly in NP we see the *decision* versions of the optimization problems we actually care about: "Does there exist a traveling salesman tour for these points whose length is less than 100?" rather than "Find me the shortest traveling salesman tour for these points." You can easily see that the optimization problem can be used to solve the decision problem, and is thus at least as hard. Often the decision problem is in NP (given a set of points and a tour, it's trivial to determine whether the tour is shorter than the given bound), while the optimization problem is not (there's no polynomial-time way to verify that a tour is the shortest possible, unless P=NP).
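To make the "verify in polynomial time" idea concrete, here's a minimal sketch of a checker for the traveling salesman decision problem. It assumes a hypothetical representation: points as (x, y) pairs and a tour as a permutation of their indices; the function names are mine, not from any standard library.

```python
import math

def tour_length(points, tour):
    """Total Euclidean length of the closed tour that visits the points
    in the order given. points is a list of (x, y) pairs; tour is a
    permutation of their indices."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def verify_tsp_certificate(points, tour, bound):
    """Polynomial-time verification for the decision problem: is the
    proposed tour a valid tour (visits every point exactly once) whose
    length is less than the bound?"""
    return sorted(tour) == list(range(len(points))) and \
           tour_length(points, tour) < bound

# A 3-4-5 right triangle: the only tour has length 3 + 5 + 4 = 12.
points = [(0, 0), (0, 3), (4, 0)]
print(verify_tsp_certificate(points, [0, 1, 2], 13))  # True: 12 < 13
```

The check is a single pass over the tour, so it runs in polynomial time; nothing here says anything about how hard it was to *find* the tour.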

To be *NP-complete*, a problem must be in NP, and it must also be possible to transform any instance of some known NP-complete problem into an instance of our problem, and to transform a solution to our problem back into a solution to that one. This is where we get the all-or-nothing nature of the P=NP conundrum: If you had a polynomial-time algorithm for any NP-complete problem, you could use it to solve every other NP-complete problem via these transformations. Devising such a transformation is at the heart of proving a problem NP-complete.

A problem to which an NP-complete problem can be transformed, but which isn't (or isn't known to be) in NP, is referred to as *NP-hard* rather than NP-complete. The optimization versions of many of the NP-complete decision problems fall into this category.

After you've seen a number of NP-hard problems, you start to get an intuition for whether a new problem is likely to be NP-hard or not. Be careful, though; sometimes problems that 'feel' NP-hard actually aren't, and vice-versa.

A few of the classic NP-complete problems are:

- *Satisfiability* (aka *SAT*): Given a boolean formula, is there an assignment to its variables that makes the formula true?
- *Vertex cover*: Given a graph and a number *n*, is there a set of fewer than *n* vertices such that at least one endpoint of every edge is in the set?
- *Bin packing*: Given a set of objects with sizes and a set of bins with capacities, is it possible to pack all of the objects into the bins?
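SAT makes a nice illustration of why these decision problems are in NP: checking a proposed assignment is easy even though finding one isn't. A minimal sketch, assuming a hypothetical CNF representation where each clause is a list of (variable, wanted value) literals:

```python
def verify_sat_certificate(clauses, assignment):
    """Polynomial-time check of a proposed assignment against a CNF
    formula. The formula is satisfied iff every clause contains at
    least one literal that the assignment makes true."""
    return all(any(assignment[var] == wanted for var, wanted in clause)
               for clause in clauses)

# (x or not y) and (y or z)
formula = [[('x', True), ('y', False)], [('y', True), ('z', True)]]
print(verify_sat_certificate(formula, {'x': True, 'y': False, 'z': True}))  # True
```

The verifier scans each literal once, so it's linear in the formula size; the only known general way to *find* a satisfying assignment is still, in the worst case, to try exponentially many of them.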

See Wikipedia for a longer list, though many of the problems there aren't defined. I may have to fix that.

Garey and Johnson's *Computers and Intractability: A Guide to the Theory of NP-Completeness* is the book to read if you want more; it's one of the better-written CS books I've ever encountered. (I believe somewhere my wife has a photo of me sitting on the beach reading this book. I'm such a nerd.) David Johnson also wrote a column on NP-completeness for *Journal of Algorithms* for a number of years; it is also excellent reading, and is available online!

Labels: algorithms, np-completeness, theory