Dear all, I am modeling a Dynamic Job shop scheduling problem. The problem specification can be found in the attachment. Average problem instance has around 1000 cars. Following are the concerns when comparing CP vs MIP:
1. Is the order of the input data and the values that the decision variables take have an influence on the solver performance? The problem has values in the order of 1E6?
2. How well CP scales in comparison to MIP? Is MIP better choice if a linear formulation for the problem is derived?
Any remarks, pointers to relevant material are helpful. Thanks in advance.
Topic

Re: CP VS MIP for Dynamic Job Shop scheduling using CPLEX
20130403T12:52:51ZThis is the accepted answer. This is the accepted answer.Hello,
About your two general questions:
1. The order of the input data will not have, in average, impact on the solver performance. I say "in average" because as the engine needs to break some ties when selecting variables, it may be that changing the order of data and of variables and constraints in the model results in different search path in the engine. But this should not change the performance in average.
2. The question of the comparison of the performances of CP and MIP on the problem will depend a lot on the number of values you have for your variables. I understand from the problem description that the main variables are the start and end time of activities. If the size of the domain of temporal variables is large and if you cannot easily bucketize them using a reasonable number of values then CP will probably be better than MIP on this type of scheduling problem. The reason is that when solving a scheduling model using interval variables in CP, the complexity in general does not depend on the size of the temporal domain but only on the number of interval variables (activities).
Here is an illustrative example of a small CP Optimizer scheduling model inspired from the description of your problem. Activities like washing, drying are modeled as interval variables; time spent in the queue is also modeled by interval variables. The model assumes that you cannot have two cars at the same time in the machine (noOverlap constraint) and the number of cars in the queue at every instant t is modeled by a cumul function (queue) with limited maximal value (queue <= MaxQueueSize).
Philippe
using CP; int n = 1000; range Cars = 1..n; int MaxQueueSize = 20; int MaxQueueingTime = 1000; int WashTime = 12; int DryTime = 10; int entryDate[c in Cars] = ...; dvar interval queueForWash[c in Cars] size 1..MaxQueueingTime; dvar interval wash[c in Cars] size WashTime; dvar interval queueForDry[c in Cars] size 1..MaxQueueingTime; dvar interval dry [c in Cars] size DryTime; cumulFunction queue = sum(c in Cars) (pulse(queueForWash[c], 1) + pulse(queueForDry[c], 1)); minimize max(c in Cars) endOf(dry[c]); subject to { forall(c in Cars) { startOf(queueForWash[c]) == entryDate[c]; endAtStart(queueForWash[c], wash[c]); endAtStart(wash[c], queueForDry[c]); endAtStart(queueForDry[c], dry[c]); } noOverlap(append(wash, dry)); queue <= MaxQueueSize; }

Re: CP VS MIP for Dynamic Job Shop scheduling using CPLEX
20130404T07:42:45ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20130403T12:52:51Z
Hello,
About your two general questions:
1. The order of the input data will not have, in average, impact on the solver performance. I say "in average" because as the engine needs to break some ties when selecting variables, it may be that changing the order of data and of variables and constraints in the model results in different search path in the engine. But this should not change the performance in average.
2. The question of the comparison of the performances of CP and MIP on the problem will depend a lot on the number of values you have for your variables. I understand from the problem description that the main variables are the start and end time of activities. If the size of the domain of temporal variables is large and if you cannot easily bucketize them using a reasonable number of values then CP will probably be better than MIP on this type of scheduling problem. The reason is that when solving a scheduling model using interval variables in CP, the complexity in general does not depend on the size of the temporal domain but only on the number of interval variables (activities).
Here is an illustrative example of a small CP Optimizer scheduling model inspired from the description of your problem. Activities like washing, drying are modeled as interval variables; time spent in the queue is also modeled by interval variables. The model assumes that you cannot have two cars at the same time in the machine (noOverlap constraint) and the number of cars in the queue at every instant t is modeled by a cumul function (queue) with limited maximal value (queue <= MaxQueueSize).
Philippe
<pre class="jivepre">using CP; int n = 1000; range Cars = 1..n; int MaxQueueSize = 20; int MaxQueueingTime = 1000; int WashTime = 12; int DryTime = 10; int entryDate[c in Cars] = ...; dvar interval queueForWash[c in Cars] size 1..MaxQueueingTime; dvar interval wash[c in Cars] size WashTime; dvar interval queueForDry[c in Cars] size 1..MaxQueueingTime; dvar interval dry [c in Cars] size DryTime; cumulFunction queue = sum(c in Cars) (pulse(queueForWash[c], 1) + pulse(queueForDry[c], 1)); minimize max(c in Cars) endOf(dry[c]); subject to { forall(c in Cars) { startOf(queueForWash[c]) == entryDate[c]; endAtStart(queueForWash[c], wash[c]); endAtStart(wash[c], queueForDry[c]); endAtStart(queueForDry[c], dry[c]); } noOverlap(append(wash, dry)); queue <= MaxQueueSize; }
</pre>
First of all thanks a lot for an elaborate answer.
The CP model you posted is quite similar to the one I have (I didn't posted it). Despite the fact that CP models (yours and mine) can be further optimized, at this point I would like to select between CP and MIP by comparing respective performances. The suitable optimizer needs to be embedded in an optimal controller where the total time taken by the optimizer is restricted by a deadline of 400 ms. An optimizer that solves (or determines that It cannot be done) in real time is required. Does such a solver exist? The point is I am starting this project and it is crucial to select a suitable optimizer.
What is 'bucketization'? I may consider to do that. When using MIP/LP, all constraints can be formalized as linear constraints. Furthermore, tight upper and lower bounds can be determined for the decision variables. The question now is which one is better. I have started implementing an LP model for the same problem and lets see how it turns out in terms of performance. 
Re: CP VS MIP for Dynamic Job Shop scheduling using CPLEX
20130404T09:22:02ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20130404T07:42:45Z
Hello Philippe,
First of all thanks a lot for an elaborate answer.
The CP model you posted is quite similar to the one I have (I didn't posted it). Despite the fact that CP models (yours and mine) can be further optimized, at this point I would like to select between CP and MIP by comparing respective performances. The suitable optimizer needs to be embedded in an optimal controller where the total time taken by the optimizer is restricted by a deadline of 400 ms. An optimizer that solves (or determines that It cannot be done) in real time is required. Does such a solver exist? The point is I am starting this project and it is crucial to select a suitable optimizer.
What is 'bucketization'? I may consider to do that. When using MIP/LP, all constraints can be formalized as linear constraints. Furthermore, tight upper and lower bounds can be determined for the decision variables. The question now is which one is better. I have started implementing an LP model for the same problem and lets see how it turns out in terms of performance.
For the selection between CP and MIP, I think the best is to try the two models, they should not be very complex in terms of model definition. But a 400ms time limit is a priori very challenging for such a problem with several thousand activities (you mentioned one thousand cars). Of course, it also depends on the actual definition of the problem, I suppose you do not reschedule 1000 cars from scratch in 400ms and, as you mention, providing tight lower and upper bounds for the activities may help. So it will be very important to compare the approaches on some data that is as close as possible from the one in the context of the real application. If the timewindows for activities are very tight, maybe the problem can be decomposed along the time axis.
About bucketization. A problem with MIP on scheduling problems is that the size of the model often depends on the number of possible values for time variables (start, end of activities). Typically, you will have one boolean variable x_{i,t} saying that an activity a_i starts at time value t (or is executing at value time t). In general, it means that you need to find a suitable time granularity for your problem and, in the MIP model, t does not represent a precise date in time but a time bucket. If in your problem the time windows of activities are very tight, you may consider a MIP model indeed (although tight lower and upper bounds for activities will help CP too).
Philippe