No, The TSP Isn't NP Complete
Two recent blog posts discussing the Traveling Saleman Problem (TSP) led me to write this post. The two blog posts are What The mistake is to say that the TSP is NP complete. This is wrong as far as we know. Truth is that TSP is NP hard. Granted, this mistake is quite minor and technical (NP complete vs NP hard), and fixing it will not change the world. Nor does it mean that the TSP isn't a very exemplar problem, worth studying. But it is a mistake anyway. Let us define what we are discussing here. The TSP is a problem defined by a set of cities and the distances between each city pair. The problem is to find a circuit that goes through each city once and that ends where it starts. This in itself isn't difficult. What makes the problem interesting is to find the shortest circuit among all those that are possible. Solving this problem is quite simple. One merely need to compute the length of all possible circuits, then keep the shortest one. Issue is that the number of such circuits grows very quickly with the number of cities. If there are n cities then this number is factorial of n1 = (n1)(n2)...3.2 Indeed, we can select arbitrarily one city to start from (the starting point doesn't matter much given one must end at the same place). Then we have n1 different choices for the second city to be visited, n2 choices for the third city, and so on. For 16 cities there are more than a trillion circuits. For 100 cities there are approximately 10^{156} circuits. For 10,000 cities this number is about10^{35,657} There is no way one can compute the shortest circuit for 10,000 cities this way, even using the fastest available computing grid for a century. Yet, scientific advances made by OR researchers led to quite effective methods able to solve TSPs with large number of cities. For instance, the figure below shows the optimal solution for a tour of all 13,509 cities and towns in the US that have more than 500 residents. Figure credit: David Applegate, Robert Bixby, Vasek Chvatal and William Cook. Let us now define NP completeness. A problem is NP if one can easily (in polynomial time) check that a proposed solution is indeed a solution. A problem is NP hard if is is as difficult as any NP problem. A problem is NP complete if it is both NP and NP hard. Is the TSP NP complete or not? We must first check that the TSP is NP. For this we must be able to check that a proposed tour is a solution to the TSP. For this we merely have to check that each city is visited once. This can be done in polynomial time, hence the TSP is NP. Second, we must check that the TSP is NP hard. I won't explain it here, but this has been proved true. Being both NP and NP hard, the TSP is NP complete! Wait a minute, didn't I just make the same mistake as others? Here is the trick. In order to check that a proposed tour is a solution of the TSP we need to check two things, namely
We didn't check the second condition! The second condition is what makes the problem difficult to solve. As of today, no one has found a way to check condition 2 in polynomial time. It means that the TSP isn't in NP, as far as we know. Therefore, TSP isn't NP complete as far as we know. We can only say that TSP is NP hard.
Readers interested in learning more can read the following, or Rod Hilton's Trav A more formal point of view In all fairness, I don't think that computer science experts have a confused view. When they write that TSP is NP complete, they mean that the following decision problem (yes/no question) is NP complete: TSPDECISION : Given a number L, a set of cities, and distance between all city pairs, is there a tour visiting each city exactly once of length less than L? This problem is indeed NP complete, as it is easy (polynomial time) to check that a given tour leads to a yes answer to TSPDECISION.
I have elaborated on this mapping between NP complete decision problems and optimization problems in my NP O I am fine with the shortcut about TSP being NP complete if people understand that it is a statement about TSPDECISION. What I find wrong is when people think that the following decision problem is NP complete TSPMINDECISION : Given a tour T, a set of cities, and distance between all city pairs, is T visiting each city exactly once and is T of minimal length? Truth is that no one has found a polynomial time way to check that a tour is of minimal length. And if someone does it, then it would prove P = NP, which would be a tremendous achievement. Solving the TSP, i.e. finding a minimal length tour, is at least as hard as checking that a tour is of minimal length. Therefore it cannot be done in polynomial time as far as we know. Let me thank Matthew Saltzman who helped me nail down this argument (any error or misconception are mine). Update on January 2 Rewritten the end to address the most common criticism of this post. Update on January 3 Added link to Paul Rubin's post. Update on January 9: Lance Fortnow reacted in writing his own followup blog post..
