Topic
  • 16 replies
  • Latest Post - ‏2013-08-12T15:54:15Z by azak
azak
azak
8 Posts

Pinned topic No proof of optimality

‏2013-07-19T15:13:47Z |

Hello!

I've just started using the studio, modeling and solving an MIP problem derived from graph colouring. It is solved fast (within seconds) when reaching minimization objective of 0. However, when I change one of the bounds (reduce number of available colours) which causes the objective function be greater than 0, it poses problems and finding the minimum objective takes much longer (for problems with 5000 constraints it takes about a minute, not possible when dealing with 28000 of constraints, 24hours is not enough). A few feasible solutions are found after exploring 30000 nodes but there is no proof of optimality (the last feasible solution with the gap 100%), the solution does not improve and the node exploration continues for hours (the number of nodes to explore keeps increasing). I can only stop by setting a time or node exploration limit. I tried different MIP strategies, as well as probing level 3 and aggressive cuts generation but it didn't help. I did also some tuning, but it returned default as the best strategy. I did profiling and time values are

ROOT • CPLEX MIP Optimization 99%
ROOT • CPLEX MIP Optimization • CPLEX Branch and Bound 93%
The rest is not more than 10% in terms of time and memory.

Are there any other ways to explore and help CPLEX prove optimality? I would be grateful for any suggestions, thank you! 

Regards,
Anna 

  • DanielJunglas
    DanielJunglas
    1554 Posts

    Re: No proof of optimality

    ‏2013-07-29T08:31:37Z  

    Could you post the first and last say 100 lines of the log file? That would give us a slightly better idea of what is going on. I guess the gap of 100% is displayed because the dual bound always stays at 0?

  • azak
    azak
    8 Posts

    Re: No proof of optimality

    ‏2013-07-29T09:49:50Z  

    Could you post the first and last say 100 lines of the log file? That would give us a slightly better idea of what is going on. I guess the gap of 100% is displayed because the dual bound always stays at 0?

    Hi Daniel, Yes, the bound stays at 0, parts of the log file attached. Looking forward to your reply.

    Attachments

  • DanielJunglas
    DanielJunglas
    1554 Posts

    Re: No proof of optimality

    ‏2013-07-29T12:39:38Z  
    • azak
    • ‏2013-07-29T09:49:50Z

    Hi Daniel, Yes, the bound stays at 0, parts of the log file attached. Looking forward to your reply.

    Do you have any idea what the optimal solution in your case is?

    From the log we can see that CPLEX is not able to raise the dual bound above 0. As long as the dual bound is zero, CPLEX will always show a gap of 100%. You could try to disable heuristics to increase the node throughput but I'm afraid that will not help much (if it will help at all).

    You could try to improve your model formulation by adding additional valid inequalities. How closely is your problem related to graph coloring? You may be able to use some of the known bounds on the chromatic number/index to calculate a better dual bound than 0.

  • azak
    azak
    8 Posts

    Re: No proof of optimality

    ‏2013-07-31T14:54:38Z  

    Do you have any idea what the optimal solution in your case is?

    From the log we can see that CPLEX is not able to raise the dual bound above 0. As long as the dual bound is zero, CPLEX will always show a gap of 100%. You could try to disable heuristics to increase the node throughput but I'm afraid that will not help much (if it will help at all).

    You could try to improve your model formulation by adding additional valid inequalities. How closely is your problem related to graph coloring? You may be able to use some of the known bounds on the chromatic number/index to calculate a better dual bound than 0.

    Thank you for your comments. I don't know where the solution lies, can rely only on heuristics run aside but that would be the upper bound. 
     
    The problematic graph has nearly 150 vertices. It's a vertex colouring problem with some additional constraints on the colours used within groups of nodes. I also solve it for a weighted graph and the objective is to minimize the overall weight (or combined with the number of colours, then the gap stops at around 80% ). I added a few valid inequalities, so that more cuts are generated initially but it doesn't help the bound. 
     
    Concerning reformulation, I have a more OPL related question: How to efficiently use the opl command maxl, when the set is defined by forall? None of the tries worked and I used two sets of constraints as follows:
     
      forall (i, j, k)
        c1:
           a[i]>=b[i][j][k];
            
       forall (i)
        c2:
           a[i]<=sum(i, j, k)b[i][j][k]
     
    Both variables are binary and variable a is used in the objective and I have a feeling this approach to simply a[i]=max(b[i][j][k]) may be one of the bottlenecks. 
    Updated on 2013-07-31T14:55:22Z at 2013-07-31T14:55:22Z by azak
  • DanielJunglas
    DanielJunglas
    1554 Posts

    Re: No proof of optimality

    ‏2013-08-01T05:57:07Z  
    • azak
    • ‏2013-07-31T14:54:38Z
    Thank you for your comments. I don't know where the solution lies, can rely only on heuristics run aside but that would be the upper bound. 
     
    The problematic graph has nearly 150 vertices. It's a vertex colouring problem with some additional constraints on the colours used within groups of nodes. I also solve it for a weighted graph and the objective is to minimize the overall weight (or combined with the number of colours, then the gap stops at around 80% ). I added a few valid inequalities, so that more cuts are generated initially but it doesn't help the bound. 
     
    Concerning reformulation, I have a more OPL related question: How to efficiently use the opl command maxl, when the set is defined by forall? None of the tries worked and I used two sets of constraints as follows:
     
      forall (i, j, k)
        c1:
           a[i]>=b[i][j][k];
            
       forall (i)
        c2:
           a[i]<=sum(i, j, k)b[i][j][k]
     
    Both variables are binary and variable a is used in the objective and I have a feeling this approach to simply a[i]=max(b[i][j][k]) may be one of the bottlenecks. 

    > The problematic graph has nearly 150 vertices. It's a vertex colouring problem with some additional constraints on the colours used within
    > groups of nodes. I also solve it for a weighted graph and the objective is to minimize the overall weight (or combined with the number of
    > colours, then the gap stops at around 80% ). I added a few valid inequalities, so that more cuts are generated initially but it doesn't help
    > the bound.
    >

    I'm afraid I don't understand. If you problem is vertex coloring (with potential additional constraints) and your objective is to minimize the number of colors, then how can the dual bound be 0? You will need at least one color to color the graph. If the graph has an edge, you will need at least two colors. If the graph has a clique of size N then you need at least N colors. Doesn't exploiting this sort of information give a better dual bound than 0? Or are there colors that have cost 0?

    > Concerning reformulation, I have a more OPL related question: How to efficiently use the opl command maxl, when the set is defined by
    > forall? None of the tries worked and I used two sets of constraints as follows:
    >  forall (i, j, k)
    >    c1:
    >       a[i]>=b[i][j][k];     
    >   forall (i)
    >    c2:
    >       a[i]<=sum(i, j, k)b[i][j][k]
    > Both variables are binary and variable a is used in the objective and I have a feeling this approach to simply a[i]=max(b[i][j][k]) may be
    > one of the bottlenecks.
    >
    If I understand correctly then your objective is 'min a[i]'? In that case c2 would be redundant and you can just drop it. As long as a[i] has a positive coefficient in the objective function it will never be >0 in an optimal solution unless one of the b[i][j][k] forces it to >0 via c1.

    Vertex coloring (in particular the unweighted version) has lots of symmetries (given a feasible solution you can just renumber colors to get another feasible solution with the same objective function value) which may slow down MIP search. It sometimes helps to explicitly break those symmetries or explicitly handle them during the search. What version of CPLEX do you use? Recent versions of CPLEX improved handling of symmetries in models so you may want to try 12.5.1 if you are not already using that). Other quick ideas to help with symmetries in vertex coloring that come to mind are

    1. Do some preprocessing: Find a large/maximum clique and fix colors for the vertices on this clique (every vertex in the clique needs a different color). Depending on the number of vertices the clique has this can dramatically reduce the size of the problem that has to be solved by CPLEX.
    2. There are formulations of the vertex coloring problem that avoid any symmetry. See for example this paper:
      @article{
      DBLP:journals/ipl/CampeloCF04,
        author    = {Manoel B. Camp{\^e}lo and
                     Ricardo C. Corr{\^e}a and
                     Yuri Frota},
        title     = {Cliques, holes and the vertex coloring polytope},
        journal   = {Inf. Process. Lett.},
        volume    = {89},
        number    = {4},
        year      = {2004},
        pages     = {159-164},
        ee        = {http://dx.doi.org/10.1016/j.ipl.2003.11.005},
        bibsource = {DBLP, 
      http://dblp.uni-trier.de}
      }
      
      Maybe similar techniques can help in your case as well.
  • EdKlotz
    EdKlotz
    244 Posts

    Re: No proof of optimality

    ‏2013-08-01T17:13:42Z  

    > The problematic graph has nearly 150 vertices. It's a vertex colouring problem with some additional constraints on the colours used within
    > groups of nodes. I also solve it for a weighted graph and the objective is to minimize the overall weight (or combined with the number of
    > colours, then the gap stops at around 80% ). I added a few valid inequalities, so that more cuts are generated initially but it doesn't help
    > the bound.
    >

    I'm afraid I don't understand. If you problem is vertex coloring (with potential additional constraints) and your objective is to minimize the number of colors, then how can the dual bound be 0? You will need at least one color to color the graph. If the graph has an edge, you will need at least two colors. If the graph has a clique of size N then you need at least N colors. Doesn't exploiting this sort of information give a better dual bound than 0? Or are there colors that have cost 0?

    > Concerning reformulation, I have a more OPL related question: How to efficiently use the opl command maxl, when the set is defined by
    > forall? None of the tries worked and I used two sets of constraints as follows:
    >  forall (i, j, k)
    >    c1:
    >       a[i]>=b[i][j][k];     
    >   forall (i)
    >    c2:
    >       a[i]<=sum(i, j, k)b[i][j][k]
    > Both variables are binary and variable a is used in the objective and I have a feeling this approach to simply a[i]=max(b[i][j][k]) may be
    > one of the bottlenecks.
    >
    If I understand correctly then your objective is 'min a[i]'? In that case c2 would be redundant and you can just drop it. As long as a[i] has a positive coefficient in the objective function it will never be >0 in an optimal solution unless one of the b[i][j][k] forces it to >0 via c1.

    Vertex coloring (in particular the unweighted version) has lots of symmetries (given a feasible solution you can just renumber colors to get another feasible solution with the same objective function value) which may slow down MIP search. It sometimes helps to explicitly break those symmetries or explicitly handle them during the search. What version of CPLEX do you use? Recent versions of CPLEX improved handling of symmetries in models so you may want to try 12.5.1 if you are not already using that). Other quick ideas to help with symmetries in vertex coloring that come to mind are

    1. Do some preprocessing: Find a large/maximum clique and fix colors for the vertices on this clique (every vertex in the clique needs a different color). Depending on the number of vertices the clique has this can dramatically reduce the size of the problem that has to be solved by CPLEX.
    2. There are formulations of the vertex coloring problem that avoid any symmetry. See for example this paper:
      @article{
      DBLP:journals/ipl/CampeloCF04,
        author    = {Manoel B. Camp{\^e}lo and
                     Ricardo C. Corr{\^e}a and
                     Yuri Frota},
        title     = {Cliques, holes and the vertex coloring polytope},
        journal   = {Inf. Process. Lett.},
        volume    = {89},
        number    = {4},
        year      = {2004},
        pages     = {159-164},
        ee        = {http://dx.doi.org/10.1016/j.ipl.2003.11.005},
        bibsource = {DBLP, 
      http://dblp.uni-trier.de}
      }
      
      Maybe similar techniques can help in your case as well.

    Can you upload the LP file of the model?   Overall I agree with Daniel's comments.   You might be able to deduce additional information by adding a constraint that the MIP objective function is 0 to the model.   That will presumably be infeasible.   Then run CPLEX's conflict refiner to obtain an explanation of the infeasibility.   The constraints in the conflict may shed light on additional cuts besides the ones Daniel mentioned. Also, set CPLEX's symmetry detection parameter to its most aggressive setting of 5. 

    Also, I recommend that you find out whether the challenging part of this model involves the vertex coloring constraints or the additional constraints on the colors mentioned above, or both.   Try commenting out the OPL statements that specify the additional constraints.   Does the model remain challenging?   If so, focus on the coloring constraints in terms of tightening the formulation.   Otherwise, have a look at the additional constraints, perhaps filtering them out more individually to find the hard ones.   The constraints that make the model hard are usually the ones you want to examine when trying to tighten the formulation.

    Based on the log you sent, I doubt you will be able to prove optimality in a reasonable amount of time just by parameter tuning.   You will need to tighten the formulation.   To test this hypothesis, try setting CPLEX's MIP emphasis parameter to 3, symmetry detection parameter to 5, and variableselect parameter to 3.     Those settings will really emphasize moving the dual bound, with much less.  If they don't improve the dual bound from 0, I doubt any other parameter settings will either.

  • EdKlotz
    EdKlotz
    244 Posts

    Re: No proof of optimality

    ‏2013-08-02T01:30:14Z  
    • EdKlotz
    • ‏2013-08-01T17:13:42Z

    Can you upload the LP file of the model?   Overall I agree with Daniel's comments.   You might be able to deduce additional information by adding a constraint that the MIP objective function is 0 to the model.   That will presumably be infeasible.   Then run CPLEX's conflict refiner to obtain an explanation of the infeasibility.   The constraints in the conflict may shed light on additional cuts besides the ones Daniel mentioned. Also, set CPLEX's symmetry detection parameter to its most aggressive setting of 5. 

    Also, I recommend that you find out whether the challenging part of this model involves the vertex coloring constraints or the additional constraints on the colors mentioned above, or both.   Try commenting out the OPL statements that specify the additional constraints.   Does the model remain challenging?   If so, focus on the coloring constraints in terms of tightening the formulation.   Otherwise, have a look at the additional constraints, perhaps filtering them out more individually to find the hard ones.   The constraints that make the model hard are usually the ones you want to examine when trying to tighten the formulation.

    Based on the log you sent, I doubt you will be able to prove optimality in a reasonable amount of time just by parameter tuning.   You will need to tighten the formulation.   To test this hypothesis, try setting CPLEX's MIP emphasis parameter to 3, symmetry detection parameter to 5, and variableselect parameter to 3.     Those settings will really emphasize moving the dual bound, with much less.  If they don't improve the dual bound from 0, I doubt any other parameter settings will either.

    And one other thing.   Try to provide a reasonable upper bound on the variables associated with the assignment of the color to the vertex.   For example, if you use different integer values to identify different colors, and your graph has 150 vertices, then a bound of 150 is available for these variables since the worst you can do is assign a distinct color to each vertex.   While this is obvious on the context of the graph theory problem, I doubt it is so clear to to CPLEX based on the equations in the model you provide.    Such bounds could help tighten the formulation, which in turn will yield more progress in the dual bound.

  • azak
    azak
    8 Posts

    Re: No proof of optimality

    ‏2013-08-02T14:44:32Z  
    • EdKlotz
    • ‏2013-08-02T01:30:14Z

    And one other thing.   Try to provide a reasonable upper bound on the variables associated with the assignment of the color to the vertex.   For example, if you use different integer values to identify different colors, and your graph has 150 vertices, then a bound of 150 is available for these variables since the worst you can do is assign a distinct color to each vertex.   While this is obvious on the context of the graph theory problem, I doubt it is so clear to to CPLEX based on the equations in the model you provide.    Such bounds could help tighten the formulation, which in turn will yield more progress in the dual bound.

    Thank you for the comments. Replying to your questions, the problem is a vertex coloring problem, where the vertices have certain weights <0;1> (some of them may have 0 indeed, I have tens of these graphs to analyze). Within the graph, there is also a number of vertex groups (vertices within a group are not always connected, so it's not a clique) and the colors should not be reused within a group. There are two problems that I look at:
     
    1) In the first approach the objective is to find chromatic number X with the grouping constraints, this is solved easily.
    2) The upper bound on the number of available colors is lower than X found in 1) and the objective is to minimize  the sum of weights of those vertices that are colored with the same color in a group.  
     
    I also enclose the model below (you're right Daniel, I dropped C2 from the previous post). I use CPLEX 12.5, tried the settings suggested by Ed and as he suspected, it didn't help.
     
    -------------------------------------------------
     
     
    -------------------------------------------------
     
    I can try tightening the bound by adding the following valid inequality: 
     
       forall (i in Vertices, g in Grouping: GroupSize>NbColors, k in Colors)
        groupReuse_validineq:
          sum(i in Vertices)a[i]*(VertexWeight[i])>= sum (g) of (GroupSize-NbColors) lowest VertexWeight
     
    but there is still a chance, the sum on the right side is 0.
    Updated on 2013-09-24T15:12:33Z at 2013-09-24T15:12:33Z by azak
  • azak
    azak
    8 Posts

    Re: No proof of optimality

    ‏2013-08-02T14:59:42Z  
    • EdKlotz
    • ‏2013-08-02T01:30:14Z

    And one other thing.   Try to provide a reasonable upper bound on the variables associated with the assignment of the color to the vertex.   For example, if you use different integer values to identify different colors, and your graph has 150 vertices, then a bound of 150 is available for these variables since the worst you can do is assign a distinct color to each vertex.   While this is obvious on the context of the graph theory problem, I doubt it is so clear to to CPLEX based on the equations in the model you provide.    Such bounds could help tighten the formulation, which in turn will yield more progress in the dual bound.

    Thank you for the comments. Replying to your questions, the problem is a vertex coloring problem, where the vertices have certain weights <0;1> (some of them may have 0 indeed, I have tens of these graphs to analyze). Within the graph, there is also a number of vertex groups (vertices within a group are not always connected, so it's not a clique) and the colors should not be reused within a group. There are two problems that I look at:
     
    1) In the first approach the objective is to find chromatic number X with the grouping constraints, this is solved easily.
    2) The upper bound on the number of available colors is lower than X found in 1) and the objective is to minimize  the sum of weights of those vertices that are colored with the same color in a group.  
     
    I also enclose the model below (you're right Daniel, I dropped C2 from the previous post). I use CPLEX 12.5, tried the settings suggested by Ed and as he suspected, it didn't help.
     
    -------------------------------------------------
     
     
    -------------------------------------------------
     
    I can try tightening the bound by adding the following valid inequality: 
     
       forall (i in Vertices, g in Grouping: GroupSize>NbColors, k in Colors)
        groupReuse_validineq:
          sum(i in Vertices)a[i]*(VertexWeight[i])>= sum (g) of (GroupSize-NbColors) lowest VertexWeight
     
    but there is still a chance, the sum on the right side is 0.
    Updated on 2013-09-24T15:13:01Z at 2013-09-24T15:13:01Z by azak
  • DanielJunglas
    DanielJunglas
    1554 Posts

    Re: No proof of optimality

    ‏2013-08-03T13:22:35Z  
    • azak
    • ‏2013-08-02T14:44:32Z
    Thank you for the comments. Replying to your questions, the problem is a vertex coloring problem, where the vertices have certain weights <0;1> (some of them may have 0 indeed, I have tens of these graphs to analyze). Within the graph, there is also a number of vertex groups (vertices within a group are not always connected, so it's not a clique) and the colors should not be reused within a group. There are two problems that I look at:
     
    1) In the first approach the objective is to find chromatic number X with the grouping constraints, this is solved easily.
    2) The upper bound on the number of available colors is lower than X found in 1) and the objective is to minimize  the sum of weights of those vertices that are colored with the same color in a group.  
     
    I also enclose the model below (you're right Daniel, I dropped C2 from the previous post). I use CPLEX 12.5, tried the settings suggested by Ed and as he suspected, it didn't help.
     
    -------------------------------------------------
     
     
    -------------------------------------------------
     
    I can try tightening the bound by adding the following valid inequality: 
     
       forall (i in Vertices, g in Grouping: GroupSize>NbColors, k in Colors)
        groupReuse_validineq:
          sum(i in Vertices)a[i]*(VertexWeight[i])>= sum (g) of (GroupSize-NbColors) lowest VertexWeight
     
    but there is still a chance, the sum on the right side is 0.

    Some comments that came to my mind concerning the second case (minimize number of vertices in a group that get the same color):

    1. diffNeighColor_validineq is redundant, isn't it? It is implied by diffNeighColor.
    2. groupReuse2 and groupReuse3 also look redundant. If VertexWeight[i]>0 then a[i] will only be >0 in an optimal solution if b[i][j][k]>0 and that will only be >0 if a[i][k]>0 and a[j][k]>0 (otherwise the solution is not optimal). If VertexWeight[i]==0 then the value of a[i] does not matter anyway.
    3. I think in the second case you can do better than constraint maxColors: Since there is no difference between colors you can just set w[0..NbColors-]=1 and w[0..Colors]=0. This will allow CPLEX presolve to eliminate the w variables from your model. Not sure if CPLEX would always pick up this reduction otherwise.
    4. Not sure if this will help but I think the following is a valid inequality: Assume G is a group and i is a vertex in that group. If i is colored k then a[i] must be one if at least one other vertex in G is colored k. So for each color k we have (|G|-1)*a[i] >= sum(j in G && j != i) x[j][k] - (|G|-1)*(1 - x[i][k]).
      You could look for other combinatorial properties like this that give rise to valid inequalities.

    Vertex coloring using branch&bound/cut is a well studied problem (see for example this paper). There are some valid inequalities/facets that are not too hard to find. You could try adding some of them to your formulation.

  • EdKlotz
    EdKlotz
    244 Posts

    Re: No proof of optimality

    ‏2013-08-06T21:04:52Z  
    • azak
    • ‏2013-08-02T14:59:42Z
    Thank you for the comments. Replying to your questions, the problem is a vertex coloring problem, where the vertices have certain weights <0;1> (some of them may have 0 indeed, I have tens of these graphs to analyze). Within the graph, there is also a number of vertex groups (vertices within a group are not always connected, so it's not a clique) and the colors should not be reused within a group. There are two problems that I look at:
     
    1) In the first approach the objective is to find chromatic number X with the grouping constraints, this is solved easily.
    2) The upper bound on the number of available colors is lower than X found in 1) and the objective is to minimize  the sum of weights of those vertices that are colored with the same color in a group.  
     
    I also enclose the model below (you're right Daniel, I dropped C2 from the previous post). I use CPLEX 12.5, tried the settings suggested by Ed and as he suspected, it didn't help.
     
    -------------------------------------------------
     
     
    -------------------------------------------------
     
    I can try tightening the bound by adding the following valid inequality: 
     
       forall (i in Vertices, g in Grouping: GroupSize>NbColors, k in Colors)
        groupReuse_validineq:
          sum(i in Vertices)a[i]*(VertexWeight[i])>= sum (g) of (GroupSize-NbColors) lowest VertexWeight
     
    but there is still a chance, the sum on the right side is 0.

    After looking at the model description, I am a bit puzzled as to how the best node value could be 0 in the log you sent, even at the root node.  The simplest representation of the vertex coloring model would consist of minimizing the sum of the w variables, subject to the diffNeighColor and oneColor constraints.   If I just look at that objective and those two groups of constraints, I believe they already imply a positive objective at the root node relaxation.  

    Specifically, let's follow on Daniel's point that vertices in a clique require different colors.   So, suppose I consider a graph consisting of 3 vertices with edges connecting each pair.   If we create this minimal vertex coloring formulation involving just the diffNeighColor and oneColor constraints, we can add up all of the constraints and show that the sum of the w variables is at least 2.    This does not make use of the integrality of the variables at all, so it applies to the LP relaxation as well as the MIP.   This holds regardless of the number of colors available.    It also holds regardless of the specific edge set of the graph; as soon as you have a single edge in the graph, the LP relaxation objective must be at least 2.

    So, how did the node log you attached have a zero objective, not only for the root, but for many subsequent nodes in the optimization?   Were you using your second objective?  If so, can you post a log for a challenging model for the first objective?   I'm not completely clear on the second objective; is it trying to minimize the sum of reused colors in a group that allows violations of the basic unique color requirement for edges?   

    If I correctly understand your second objective, I am not particularly surprised that the model with the second objective is more difficult to solve for a branch and cut solver like CPLEX than the first.   You are introducing some soft constraints with the second objective.   Soft constraints can weaken a formulation in the sense that certain infeasibilities that previously could be used to derive cuts no longer are available.   Think about it this way; if you had a knapsack constraint, you initially might be able to deduce numerous cover cuts.   But, if you soften it by subtracting a surplus variable, you remove those cover cuts.   The same thing applies for your model with clique cuts.

    So, are most of your performance difficulties occurring with the second objective, or do you have them with the first objective as well?    I'd really like to see a node log from a problematic run using the first objective, as I now have concluded the log you sent above involves the second objective, not the first.

    Regarding models with the first objective, try setting CPLEX's clique cut parameter to its most aggressive setting of 3.   That might help performance.

    Regarding models with the second objective, try solving an auxiliary model where you minimize just the sum of the reuse variables.   Also, try fixing all the reuse variables to 0 and solving the MIP.   If it is infeasible, you immediately can add the cut that the sum of the reuse variables is >= 1.   Also, you could run CPLEX's conflict refiner if the model is infeasible, and the resulting conflict may shed light on some tighter cuts.  

     

  • azak
    azak
    8 Posts

    Re: No proof of optimality

    ‏2013-08-08T13:33:33Z  
    • EdKlotz
    • ‏2013-08-06T21:04:52Z

    After looking at the model description, I am a bit puzzled as to how the best node value could be 0 in the log you sent, even at the root node.  The simplest representation of the vertex coloring model would consist of minimizing the sum of the w variables, subject to the diffNeighColor and oneColor constraints.   If I just look at that objective and those two groups of constraints, I believe they already imply a positive objective at the root node relaxation.  

    Specifically, let's follow on Daniel's point that vertices in a clique require different colors.   So, suppose I consider a graph consisting of 3 vertices with edges connecting each pair.   If we create this minimal vertex coloring formulation involving just the diffNeighColor and oneColor constraints, we can add up all of the constraints and show that the sum of the w variables is at least 2.    This does not make use of the integrality of the variables at all, so it applies to the LP relaxation as well as the MIP.   This holds regardless of the number of colors available.    It also holds regardless of the specific edge set of the graph; as soon as you have a single edge in the graph, the LP relaxation objective must be at least 2.

    So, how did the node log you attached have a zero objective, not only for the root, but for many subsequent nodes in the optimization?   Were you using your second objective?  If so, can you post a log for a challenging model for the first objective?   I'm not completely clear on the second objective; is it trying to minimize the sum of reused colors in a group that allows violations of the basic unique color requirement for edges?   

    If I correctly understand your second objective, I am not particularly surprised that the model with the second objective is more difficult to solve for a branch and cut solver like CPLEX than the first.   You are introducing some soft constraints with the second objective.   Soft constraints can weaken a formulation in the sense that certain infeasibilities that previously could be used to derive cuts no longer are available.   Think about it this way; if you had a knapsack constraint, you initially might be able to deduce numerous cover cuts.   But, if you soften it by subtracting a surplus variable, you remove those cover cuts.   The same thing applies for your model with clique cuts.

    So, are most of your performance difficulties occurring with the second objective, or do you have them with the first objective as well?    I'd really like to see a node log from a problematic run using the first objective, as I now have concluded the log you sent above involves the second objective, not the first.

    Regarding models with the first objective, try setting CPLEX's clique cut parameter to its most aggressive setting of 3.   That might help performance.

    Regarding models with the second objective, try solving an auxiliary model where you minimize just the sum of the reuse variables.   Also, try fixing all the reuse variables to 0 and solving the MIP.   If it is infeasible, you immediately can add the cut that the sum of the reuse variables is >= 1.   Also, you could run CPLEX's conflict refiner if the model is infeasible, and the resulting conflict may shed light on some tighter cuts.  

     

    Thank you for the comments, I'm sorry if it was unclear, I'll try to explain in more details.

     

    >>So, how did the node log you attached have a zero objective, not only for the root, but for many subsequent nodes in the optimization?   Were you using your second objective?

    Yes, that is for the second objective. In general the problem is easily solvable if we have high K, which results in either the optimal number of colors (obj1) or 0 (obj2) as all the a variables are set to 0. In the example that I use for analysis here, the optimal number of colors to satisfy regular and group constraints is 10 (solving obj 1- log_bigK_o1).

    The challenge start when we decrease the number of colors in such a way that diffNeighColor is still satisfied but some a(i) are forced to 1. Hence, the obj1 should be at least NbColors whereas obj2>0.

     

    >>If so, can you post a log for a challenging model for the first objective?

    Yes, attached as log_smallK_o1. The optimal number of colors is 10, I decreased to 9 (the bound could be 8 if some vertex weights are 0). Even more, I know that by reducing by 1 color, some of the groups will reuse a color. Summing the min weights of vertices in such groups I get 2.5518, so min value of obj1 is 10.5518 (2.5518 for obj2).  

     

    >>I'm not completely clear on the second objective; is it trying to minimize the sum of reused colors in a group that allows violations of the basic unique color requirement for edges?  

    It allows violations and reuse a color within the group (among the vertices that are not connected), so it still keeps the basic unique requirement for edges (diffNeighColor).

     

    >>If I correctly understand your second objective, I am not particularly surprised that the model with the second objective is more difficult to solve for a branch and cut solver like CPLEX than the first.   You are introducing some soft constraints with the second objective.   Soft constraints can weaken a formulation in the sense that certain infeasibilities that previously could be used to derive cuts no longer are available.   Think about it this way; if you had a knapsack constraint, you initially might be able to deduce numerous cover cuts.   But, if you soften it by subtracting a surplus variable, you remove those cover cuts.   The same thing applies for your model with clique cuts.

    >>So, are most of your performance difficulties occurring with the second objective, or do you have them with the first objective as well?  I'd really like to see a node log from a problematic run using the first objective, as I now have concluded the log you sent above involves the second objective, not the first.

    Yes, the log attached before is with obj2. I have the problem with obj1 objective as soon as any of a[i] is set to 1 (previously mentioned log log_smallK_o1)

     

    >>Regarding models with the second objective, try solving an auxiliary model where you minimize just the sum of the reuse variables. 

    I changed the second objective also to

    O3: sum a(i) (log_smallK_o3) the bound doesn't move from 0 and I know that the optimal value is 28.

     

    >>Also, try fixing all the reuse variables to 0 and solving the MIP.   If it is infeasible, you immediately can add the cut that the sum of the reuse variables is >= 1.  

    I added then constraint sum a(i)>=1, and the bound is stuck at 1 (log_smallK_o3_v2).

    Similarly no move if I use obj 2. The bound shows 0, even though I know that for this particular graph there would be 28 repeated colors and if those are applied to the nodes with the lowest weights within groups the objective should be at least 2.5518. I really don't understand this behavior.

     

    >>Regarding models with the first objective, try setting CPLEX's clique cut parameter to its most aggressive setting of 3.   That might help performance.

    I tried that, as well as all other aggressive cuts, the bound doesn't move.

     

    >>Also, you could run CPLEX's conflict refiner if the model is infeasible, and the resulting conflict may shed light on some tighter cuts.

    Can you give me some more hints on how to use the conflict refiner efficiently?

    Updated on 2013-08-08T15:52:03Z at 2013-08-08T15:52:03Z by azak
  • azak
    azak
    8 Posts

    Re: No proof of optimality

    ‏2013-08-08T15:42:40Z  

    Some comments that came to my mind concerning the second case (minimize number of vertices in a group that get the same color):

    1. diffNeighColor_validineq is redundant, isn't it? It is implied by diffNeighColor.
    2. groupReuse2 and groupReuse3 also look redundant. If VertexWeight[i]>0 then a[i] will only be >0 in an optimal solution if b[i][j][k]>0 and that will only be >0 if a[i][k]>0 and a[j][k]>0 (otherwise the solution is not optimal). If VertexWeight[i]==0 then the value of a[i] does not matter anyway.
    3. I think in the second case you can do better than constraint maxColors: Since there is no difference between colors you can just set w[0..NbColors-]=1 and w[0..Colors]=0. This will allow CPLEX presolve to eliminate the w variables from your model. Not sure if CPLEX would always pick up this reduction otherwise.
    4. Not sure if this will help but I think the following is a valid inequality: Assume G is a group and i is a vertex in that group. If i is colored k then a[i] must be one if at least one other vertex in G is colored k. So for each color k we have (|G|-1)*a[i] >= sum(j in G && j != i) x[j][k] - (|G|-1)*(1 - x[i][k]).
      You could look for other combinatorial properties like this that give rise to valid inequalities.

    Vertex coloring using branch&bound/cut is a well studied problem (see for example this paper). There are some valid inequalities/facets that are not too hard to find. You could try adding some of them to your formulation.

    Thank you for the insights!

    1. Yes, it's redundant but I noticed shorter solving time (when the runs are not the problematic ones).

    2. This is a way to eliminate nonlinearity  x[i][k]*x[j][k]=1, but instead of using (x[i][k]+x[j][k])/2 >= c[i][j][k] I split it into 2 inequalities hoping that this will make it tighter. 

    3. True, this already limited by the NbColors set. 

    4. You are right, this is a valid inequality. I added this and precalculated min sum of a(i) as well as weight of the vertices with reused colors (as in the reply to Ed) yet still no effect on the bound.

    I'm going through the papers you recommended looking for cut inspirations. The formulation deals well with the basic coloring constraints, just added one small extension and yet this poses so many problems. Will also have a look into soft constraints as suggested by Ed.

     I'm quite new in this area and there is plenty of things to learn so I greatly appreciate your tips! 

  • EdKlotz
    EdKlotz
    244 Posts

    Re: No proof of optimality

    ‏2013-08-08T16:26:12Z  
    • azak
    • ‏2013-08-08T13:33:33Z

    Thank you for the comments, I'm sorry if it was unclear, I'll try to explain in more details.

     

    >>So, how did the node log you attached have a zero objective, not only for the root, but for many subsequent nodes in the optimization?   Were you using your second objective?

    Yes, that is for the second objective. In general the problem is easily solvable if we have high K, which results in either the optimal number of colors (obj1) or 0 (obj2) as all the a variables are set to 0. In the example that I use for analysis here, the optimal number of colors to satisfy regular and group constraints is 10 (solving obj 1- log_bigK_o1).

    The challenge start when we decrease the number of colors in such a way that diffNeighColor is still satisfied but some a(i) are forced to 1. Hence, the obj1 should be at least NbColors whereas obj2>0.

     

    >>If so, can you post a log for a challenging model for the first objective?

    Yes, attached as log_smallK_o1. The optimal number of colors is 10, I decreased to 9 (the bound could be 8 if some vertex weights are 0). Even more, I know that by reducing by 1 color, some of the groups will reuse a color. Summing the min weights of vertices in such groups I get 2.5518, so min value of obj1 is 10.5518 (2.5518 for obj2).  

     

    >>I'm not completely clear on the second objective; is it trying to minimize the sum of reused colors in a group that allows violations of the basic unique color requirement for edges?  

    It allows violations and reuse a color within the group (among the vertices that are not connected), so it still keeps the basic unique requirement for edges (diffNeighColor).

     

    >>If I correctly understand your second objective, I am not particularly surprised that the model with the second objective is more difficult to solve for a branch and cut solver like CPLEX than the first.   You are introducing some soft constraints with the second objective.   Soft constraints can weaken a formulation in the sense that certain infeasibilities that previously could be used to derive cuts no longer are available.   Think about it this way; if you had a knapsack constraint, you initially might be able to deduce numerous cover cuts.   But, if you soften it by subtracting a surplus variable, you remove those cover cuts.   The same thing applies for your model with clique cuts.

    >>So, are most of your performance difficulties occurring with the second objective, or do you have them with the first objective as well?  I'd really like to see a node log from a problematic run using the first objective, as I now have concluded the log you sent above involves the second objective, not the first.

    Yes, the log attached before is with obj2. I have the problem with obj1 objective as soon as any of a[i] is set to 1 (previously mentioned log log_smallK_o1)

     

    >>Regarding models with the second objective, try solving an auxiliary model where you minimize just the sum of the reuse variables. 

    I changed the second objective also to

    O3: sum a(i) (log_smallK_o3) the bound doesn't move from 0 and I know that the optimal value is 28.

     

    >>Also, try fixing all the reuse variables to 0 and solving the MIP.   If it is infeasible, you immediately can add the cut that the sum of the reuse variables is >= 1.  

    I added then constraint sum a(i)>=1, and the bound is stuck at 1 (log_smallK_o3_v2).

    Similarly no move if I use obj 2. The bound shows 0, even though I know that for this particular graph there would be 28 repeated colors and if those are applied to the nodes with the lowest weights within groups the objective should be at least 2.5518. I really don't understand this behavior.

     

    >>Regarding models with the first objective, try setting CPLEX's clique cut parameter to its most aggressive setting of 3.   That might help performance.

    I tried that, as well as all other aggressive cuts, the bound doesn't move.

     

    >>Also, you could run CPLEX's conflict refiner if the model is infeasible, and the resulting conflict may shed light on some tighter cuts.

    Can you give me some more hints on how to use the conflict refiner efficiently?

    That clarifies things a bit.    I'll try to have a closer look at the formulation involving the second objective and reusing colors.   Meanwhile, based on your description,   you might be able to find additional solutions by running CPLEX's RINS heuristic every 100 nodes or so.   This is a local search heuristic, which makes sense for a model like this.   It certainly seems possible that you can find better solutions by adjusting the colors on fairly small subsets of nodes in the graph, and local search heuristics like RINS can detect such improvements.    However, while that may find better solutions, it probably won't help proving optimality in this case.   In that regard, I recommend

    • Read the paper Daniel mentioned about formulating this problem without symmetry.   Even if you decide that you don't want to use that formulation (e.g. because it doesn't extend to the color reuse activities), it may shed light on how to apply symmetry breaking constraints to your existing formulation.
    • Look for other opportunities to tighten the formulation.   Daniel already mentioned cuts involving cliques in your graph, since each vertex in a clique must have a separate color.   Look for other subgraphs that might yield something useful.   In some ways, it seems like vertex coloring is well suited to some sort of decomposition scheme.   After all, if you can show that a particular subgraph needs n colors, connecting that subgraph to other vertices in the original graph can only increase the colors needed.
    • Regarding parameter settings, at least for the first objective, try setting CPLEX's clique cut generation to its most aggressive setting of 3.  
    Updated on 2013-08-08T16:29:58Z at 2013-08-08T16:29:58Z by EdKlotz
  • DanielJunglas
    DanielJunglas
    1554 Posts

    Re: No proof of optimality

    ‏2013-08-12T07:16:37Z  
    • azak
    • ‏2013-08-08T15:42:40Z

    Thank you for the insights!

    1. Yes, it's redundant but I noticed shorter solving time (when the runs are not the problematic ones).

    2. This is a way to eliminate nonlinearity  x[i][k]*x[j][k]=1, but instead of using (x[i][k]+x[j][k])/2 >= c[i][j][k] I split it into 2 inequalities hoping that this will make it tighter. 

    3. True, this already limited by the NbColors set. 

    4. You are right, this is a valid inequality. I added this and precalculated min sum of a(i) as well as weight of the vertices with reused colors (as in the reply to Ed) yet still no effect on the bound.

    I'm going through the papers you recommended looking for cut inspirations. The formulation deals well with the basic coloring constraints, just added one small extension and yet this poses so many problems. Will also have a look into soft constraints as suggested by Ed.

     I'm quite new in this area and there is plenty of things to learn so I greatly appreciate your tips! 

    Would you be willing to post a .mod and .dat file for the problematic graph here? Then we could try to take a quick look at what CPLEX is doing (or what if fails to do) with this model and check if there is something obvious to improve.

    If you don't want to post the files in public you can also send them directly to me: daniel(dot)junglas(at)de(dot)ibm(dot)com.

  • azak
    azak
    8 Posts

    Re: No proof of optimality

    ‏2013-08-12T15:54:15Z  

    Would you be willing to post a .mod and .dat file for the problematic graph here? Then we could try to take a quick look at what CPLEX is doing (or what if fails to do) with this model and check if there is something obvious to improve.

    If you don't want to post the files in public you can also send them directly to me: daniel(dot)junglas(at)de(dot)ibm(dot)com.

    Thank you very much, Daniel! I have just sent you the files and will be looking forward to your reply.