Comments (11)
  • Add a Comment
  • Edit
  • More Actions v
  • Quarantine this Entry

1 cumthsc commented Permalink

As long as we are trying to nail down precise ideas here, I think you slightly misunderstand the distinction between decision and optimization problems. A decision problem poses a yes-no question about a property of the instance, not about a solution. The form of problem is: Given an instance of the structure under investigation and a property of the structure, is the answer "yes" (the instance possesses the property or "no" (it doesn't)? If the answer is yes, a solution that exhibits the property is a "certificate". If the fact that the solution illustrates the property can be verified in polynomial time, the problem is in NP. So for the TSP, the decision formulation is: Given a graph G=(V,E) with weights c_j for j in E and a number L, is there a tour with total weight L or less. A set of edges is a certificate for a yes answer if it forms a tour and it has total edge weight L or less. Verifying that a set of edges forms a tour and computing its total weight can be done in polynomial time, so TSPDECISION is in NP.

 
It's not clear how you could pose a decision form of TSPMINIMUM. The closest I can think of is: Given a graph G=(V,E) and a set of edge weights c_j for j in E and a tour T with total length L, is T a minum-length tour? But that's just CoTSPDECISION: Given G, c, and L (the length of the tour T included with your instance), is there *no* tour with length less than L? We don't know any polynomial certificate for any CoNP problem. If P=NP, there clearly is one, so NP=CoNP. ISTR that the converse is true as well (i.e., a polynomial certificate for a CoNP problem implies P=NP), but I can't reconstruct a proof off the top of my head. By providing the tour T, you guarantee that there is in fact a tour with length L, but that's just a slight refinement.
 
An instance of TSPMINIMUM actually looks like this: Given a graph G=(V, E) and edge weights c_j for j in E, identify an L such that (a) there is a tour of length L and (b) there is no tour of length less than L. Condition (a) is TSPDECISION and condition (b) is CoTSPDECISION.. But the answer is L, not "yes" or "no", so this is not a decision problem.
 
I think when discussing NP-completeness of TSP (for example), we really just implicitly convert optimization to decision problems. So to say "TSP is NP-complete" informally just means. "TSPDECISION is NP-complete.". TSPOPTIMIZATION is NP-hard because every instance of HAMILTONCYCLE is an instance of TSPOPTIMIZATION with c_j = 0 (in which case any tour proves L=0), but it's not in NP because in general, it can't be formulated as a yes/no question.

2 JeanFrancoisPuget commented Permalink

Mathew,

 
thank you for the comment. You make good and valid points but I still stand by what I wrote in this post. Let me try to explain it better here.
You are right, NP applies to decision problems. Therefore, as you point out, when one states that the TSP is NP complete, one implictly maps the TSP to a decision problem.
 
Therefore, I see (at least) 3 ways to refute the statement that TSP is NP complete.
 
1. One can just say that the TSP isn't a decision problem, hence the definition of NP completeness does not apply. That's true from a formal point of view, but I doubt it helps people understand where the mistake is.
 
2. One can define a set of decision problems of the form: "is there a tour of length less than k?". This is the classical view, which I described in my "NP Or Not NP? That Is the Question" post. It is also what you call TSPDECISION. While it is a very interesting construction, leading to the class NPO, I don't think it correctly capture the confusion people make. Indeed, these decision problems are NP complete. Therefore this mapping adds to the confusion, as one could say: TSP isn't NP complete but the "equivalent" decision problem is NP complete. If that was true, then the confusion would just be about subtle computer science wording. But the confusion is deeper. It is about forgetting that a solution of the TSP must be of minimal length. More formally, the solutions (certificates) of TSPDECISION are NOT the solutions of the original TSP. Indeed, a solution to the TSP is a minimal length tour. It is not any tour with length smaller than k, for a given k.
 
3. I think what matters is a decision problem whose "solutions" are the solutions to the original TSP. We can define a decision problem using the original definition of NP, as follows. The decision problem is to answer this question: "Is this tour a minimal length tour for this set of cities?". The tours for which the answer is yes are the same as the solutions of the TSP. Therefore, this decision problem is "equivalent" to the TSP. Let's call this decision problem TSPMINDECISION. Note that we cannot define TSPMINDECISION in term of TSPDECISION. They are fundamentally different, and one is not reducible to the other. As you write, TSPMINDECISION is related to TSPDECISION and coTSPDECISION.
Then,TSPMINDECISION is NP if and only if we can answer its question in polynomial time with a non deterministic Turing machine. Given no one has yet found a way to answer that question in polynomial time, this decision problem is not known to be NP.
We can now explain why TSP is not NP. It is because the equivalent decision problem is not in NP.

3 cumthsc commented Permalink

JeanFrancois, do we agree that an instance of a "decision problem" consists of some set of data and a yes/no question about that data? If so, then when you pose TSPMINIMUM as a decision problem, you need to say what data is provided and what the question is. As I read your post and your comment, the data that you provide for an instance of TSPMINDECISION is the graph, edge weights, and a tour, and the decision question is: Is the given tour of minimum length among tours?

 
This is what I was getting at in my second paragraph above. An instance of TSPMINDECISION with graph G, weights c, and tour T (with length L) is trivially an instance of CoTSPDECISION with graph G, weights c and length L. The decision question in CoTSPDECISION is: Is there *no* tour with length L or less? The answer is yes iff the answer to the instance of TSPMINDECISION is yes (iff the answer to TSPDECISION with graph G, weights c and length L is no). This is indeed a decision problem, but it is in CoNP, not in NP (unless NP = CoNP). The only difference between TSPMINDECISION and CoTSPDECISION is that in the former case, L is guaranteed to be the length of an actual tour, T, whereas that isn't a requirement of CoTSPDECISION.
 
(CoNP is the set of complements of problems in NP, i.e., the answer is yes iff the answer to the NP problem is no. The thing about CoNP is, you still want a certificate in case the answer is yes, but it's not apparent how to produce one that is polynomially verifiable. That is, what the nondeterministic Turing machine can do in the case of TSPDECISION is generate a tour, compute its length, and answer yes if the tour has length L or less (obviously doable in polynomial time). In the case of TSPMINDECISION or CoTSPDECISION, it's not apparent what the NDTM should do in order to be able to answer yes. Complements of NP-complete problems are not in NP--that is, there is nothing that a NDTM could do in polynomial time to answer yes for a CoNP problem--unless NP = CoNP, so TSPMINDECISION and CoTSPDECISION are probably not in NP. Note that P is in the intersection of NP and CoNP, so if P = NP, then NP = CoNP.)
 
Also, TSPMINDECISION is not equivalent to the optimization problem TSPMINIMUM, because the data provided is different. An instance of TSPMINIMUM consists of a graph G and weights c and the question: What is the length of the shortest tour? An instance of TSPMINDECISION consists of G, c, and tour T, and only asks if the supplied T (with length L) is the shortest tour. If the answer to TSPMINDECISION is no, then you haven't solved TSPMINIMUM because you still don't know the length of the shortest tour. To answer TSPMINIMUM (what is the length of the shortest tour?), then, you would need to answer TSPMINDECISION (is the length of *this* tour shortest?) with potentially every possible tour.
 
Hope that clarifies my point.

4 cumthsc commented Permalink

So to your title point, TSPMINIMIZE in NP-hard in the sense that if I had a polynomial algorithm for producing the minimum tour length, I'd have a polynomial algorithm for answering TSPDECISION (and CoTSPDECISION), but it's not (known to be) in NP, if you like, because it's not a decision problem with a yes/no answer. If you prefer, it's because there's not (known to be) a polynomial process that a nondeterministic Turing machine could carry out to produce the verified length of the shortest tour.

5 JeanFrancoisPuget commented Permalink

Your point was crystal clear.

 
My point is that the TSPDECISION problem is not capturing the confusiton people make.
 
TSPMINDECISION is a yes/no problem that captures the difficulty of the TSP. It is not solvable in polynomial time as far as we know;
 
I am not saying that TSPMINDECISION brings any mathematical insight.

6 JeanFrancoisPuget commented Permalink

Lance Fortnow pointed me to the Euclidian TSP that was "proven" to be NP complete. There are two seminal papers for this

 
The first paper is "Some NP-complete Geometric Problems" by Garey et al. In this paper, the TSP is defined as follows
TRAVELING SALESMAN PROBLEM: Given a set S of integer coordinate points in the plane and an integer L, does there exist a circuit passing through all the points of S which, with edge lengths l measured by L 1 (L2) has total length at most L?
 
This is precisely the TSPDECISION problem introduced by Matthew in his comment, and not the original TSP which is an optimization problem.
 
The second paper is "The Euclidean traveling salesman problem is NP-complete" by Papadimitriou. This paper shows that the Euclidian path TSP is NP hard because exact cover can be reduced to it, but it does not prove that it is in NP
 
I therefore stand to my claim which is that TSP, even in the Euclidian case, is not known to be NP complete.

7 pjcpjcpjc commented Permalink

In general, any optimization problem can be reformulated as a decision problem by moving the objective into the constraint set. (Matthew describes this nicely for TSP). If the derived decision problem is NP-complete, then I think it is reasonable to refer to the original optimization problem as NP-complete.

 
My thinking is as follows. Experts will understand that when another expert refers to an optimization problem as NP-complete it is a simply as a shorthand for saying the derived decision problem is NP-complete. That is to say, when I hear Lance Fortnow say "TSP is NP-complete" I know he really means "the decision that asks whether a tour exists with total distance less than X is NP-complete".
 
Moreover, I think this usage of NP-complete is best for lay readers. In this case, the key insight is that the optimization problem has the property of being hard to optimize but easy to validate. We want Bruce the cab dispatcher to know that the TSP engine he purchased can be relied on to provide only feasible solutions, even though these solutions might not always be optimal. If one of Bruce's customers is stranded, then Bruce almost always has cause for complaint. This is not necessarily the case If one of Bruce's drivers happens to know a better route than that provided by the engine.
 
So I am going to continue to refer to TSP as being NP-complete when discussing optimization. I would not want to imply that it is difficult to validate feasibility and compute total cost for a given tour, and I believe Bruce would make exactly that inference when hearing that TSP is NP-hard but not NP-complete. On the other hand, I don't think Lance is bothered when he hears me call TSP NP-complete - he understands I am actually referring to the derived decision problem.
 
The issue here seems to be a choice between following the terminology in a strict way, or being a little loose with the terminology for the sake of preserving "truthiness". I tend to lean towards the latter, but I respect those who (perhaps following the French tradition of mathematical excellence) prefer the former.

8 JeanFrancoisPuget commented Permalink

Pete,

 
thank you for your comment. Have you considered a diplomatic career (thinking of what you write about French mathematical excellence...) ?
 
I respect and understand when experts say "TSP is NP complete" as a shortcut for "the decision problem of deciding if a tour of length less than L exists is NP complete". I hope I didn't convey the idea that computer science experts have a confused view here. I am more concerned about layman reading such experts, then wrongly concluding that it is easy to show that a feasible solution to a combinatorial optimization problem is optimal.
 
I sincerely think that calling TSP NP Complete is misleading non experts. Said differently, I strongly disagree with what you say: " the key insight is that the optimization problem has the property of being hard to optimize but easy to validate." Indeed, validating that a tour is of minimal length is far from being easy. Finding a polynomial way to do it would mean P= NP, which everyone agrees would be a tremendous achievement. That's why I wrote this post. I would not have bothered writing if it was merely about an abuse of mathematical language.

9 cumthsc commented Permalink

One can read the statement "Optimization problem X is NP-complete" either as shorthand for the (correct) statement "The decision form of problem X is NP-complete" or as a misstatement of "Optimization problem X is NP-hard." We who know the language generally give others who know the language the benefit of the doubt and assume they mean the former. I tend to lapse into that form in class after a while because the constant repetition of the phrase "the decision form of" gets tiresome.

 
I think Jean Francois is concerned that non-experts will make the latter reading and take as their lesson the erroneous interpretation. I'm not so sure that, for most non-experts, it makes much difference. For the non-student of complexity, I think "NP-complete" is mostly just a high-sounding way of saying "hard". For the new student of complexity, getting these concepts clear is what they are there for, and instructors do need to make sure that the full expression has been drummed into students' minds before introducing the shorthand. I admit, when I was first exposed to these ideas as related to optimization, I found the "NP-easy" part for optimization problems a little confusing. "What does a polynomial certificate for shortest tour even look like?" (I also admit that once or twice, I got the direction of the reduction for the "NP-hard" part backward. Making either of those mistakes on a homework problem quickly leads to a sudden moment of clarity!) It has made me a more careful instructor to remember those moments.

10 pjcpjcpjc commented Permalink

JFP

 
TSP is easy to validate in the following sense. Given a sequence of moves, it is easy to determine if this sequence is a valid tour, and if so, it is also easy to compute the total cost of this tour.
 
But, yes, it is computationally hard to determine if this total cost is the minimum possible cost. I did not mean to imply otherwise.
 
I find topics like this fascinating. I think it's interesting that there is genuine disagreement about how to discuss these problems with lay-readers. I will write a blog post on this topic myself, and let you know when it's live.

11 JeanFrancoisPuget commented Permalink

Matthew, Pete, thank you for your comments. i've rewritten the end of my post to take some of them into account.