Hello :)
I use Cplex from Java and running a simulation. I basically run each instance problem about one hundred times and keep track of the cplex time for solving the optimal solution for it. For smallmodel size in terms of number of variables, the captured time is quite close to each other. However, for the bigmodel size, the variance of solution times are quite big and different. I'm wondering how theoretically it could be justified? I appreciate your response
Topic

Re: solution times
20121118T22:49:09ZThis is the accepted answer. This is the accepted answer.As far as I know, cplex does not preserve order. In fact, two different runs might yield two different solutions (with the same objective). cplex deploys several heuristics which might be the cause of time differencse. Alternatively, I suppose that whenever it has to make a branching decision, it sometimes happens that two variables are equally promising to branch on, so a random choice is made. This however could turn out to be a bad branching decision in the end. 
Re: solution times
20121119T13:24:04ZThis is the accepted answer. This is the accepted answer. JorisK
 20121118T22:49:09Z
As far as I know, cplex does not preserve order. In fact, two different runs might yield two different solutions (with the same objective). cplex deploys several heuristics which might be the cause of time differencse. Alternatively, I suppose that whenever it has to make a branching decision, it sometimes happens that two variables are equally promising to branch on, so a random choice is made. This however could turn out to be a bad branching decision in the end.
>
You are quite wrong here. In default settings CPLEX is deterministic. That is, running the exact same problem with the exact same parameter settings on the exact same machine will produce the exact same results. Unless there are lots of jobs running concurrently even solution times should be pretty much the same.
You have to explicitly change parameters to make CPLEX nondeterministic. 
Re: solution times
20121119T13:26:35ZThis is the accepted answer. This is the accepted answer.Not sure if I got that right: Do you solve the exact same problem 100 times or are there (small) differences in the problems you solve? Are these MIPs or pure LPs? Do you solve all problems in the same IloCplex instance or do you create a new IloCplex instance for each problem? 
Re: solution times
20121119T22:48:25ZThis is the accepted answer. This is the accepted answer.Do you turn off advanced starts (on by default)? If not, and if you do not create a new instance of IloCplex for each solve of the same problem (see Daniel's question), the first crack at each problem should take considerably longer than all subsequent attempts (inflating variance).
Paul
Mathematicians are like Frenchmen: whenever you say something to them, they translate it into their own language, and at once it is something entirely different. (Goethe) 
Re: solution times
20121125T02:21:46ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20121119T13:26:35Z
Not sure if I got that right: Do you solve the exact same problem 100 times or are there (small) differences in the problems you solve? Are these MIPs or pure LPs? Do you solve all problems in the same IloCplex instance or do you create a new IloCplex instance for each problem? 
Re: solution times
20121125T02:37:41ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20121119T22:48:25Z
Do you turn off advanced starts (on by default)? If not, and if you do not create a new instance of IloCplex for each solve of the same problem (see Daniel's question), the first crack at each problem should take considerably longer than all subsequent attempts (inflating variance).
Paul
Mathematicians are like Frenchmen: whenever you say something to them, they translate it into their own language, and at once it is something entirely different. (Goethe) 
Re: solution times
20121126T07:21:04ZThis is the accepted answer. This is the accepted answer. Bryanr
 20121125T02:21:46Z
Thanks a lot Daniel. I solve the exact 01 LP problem 100 times by running a command line program multiple times in Java with the exact same parameter settings on the exact same dedicated machine. So I'm not using the same IloCplex instance because the program is run again. However, for big model size the variance of solution time is still quite considerable and different! Do you have any thoughts?
Also, could you export your parameter settings by calling
cplex.writeParam("params.prm");
right before calling cplex.solve() and send us the created file?
What is the version of CPLEX you use?Updated on 20140324T22:45:32Z at 20140324T22:45:32Z by ironman 
Re: solution times
20121127T16:12:37ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20121126T07:21:04Z
That indeed sounds weird. Could you send us the log output for two runs that are dramatically different?
Also, could you export your parameter settings by calling
<pre class="java dw" dataeditorlang="java" datapbcklang="java" dir="ltr">cplex.writeParam("params.prm"); </pre>
right before calling cplex.solve() and send us the created file?
What is the version of CPLEX you use?Attachments

Re: solution times
20121127T16:44:56ZThis is the accepted answer. This is the accepted answer. Bryanr
 20121127T16:12:37Z
Thanks Daniel! I have attached logs of two runs, namely 'min' and 'max', as well as the parameter settings log. I use cplex V12.4.
max.txt:
... Aggregator did 1567 substitutions. Reduced MIP has 1530 rows, 47356 columns, and 115729 nonzeros. Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators. ...
min.txt:
... Aggregator did 1570 substitutions. Reduced MIP has 1527 rows, 47353 columns, and 115714 nonzeros. Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators. ...
As you can see, the reduced models are different and from that point one cannot expect the two solves to be the same.
Can you export the two models to a SAV file using cplex.exportModel("model.sav") before calling cplex.solve() and attach the SAV files here? I suspect there is some difference in those files.Updated on 20140324T22:44:56Z at 20140324T22:44:56Z by ironman 
Re: solution times
20121128T08:41:57ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20121127T16:44:56Z
How certain are you that you indeed solve the exact same problem? The two log files show different solution trees. Moreover, they also show different presolve results:
max.txt:
<pre class="java dw" dataeditorlang="java" datapbcklang="java" dir="ltr">... Aggregator did 1567 substitutions. Reduced MIP has 1530 rows, 47356 columns, and 115729 nonzeros. Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators. ... </pre>
min.txt:
<pre class="java dw" dataeditorlang="java" datapbcklang="java" dir="ltr">... Aggregator did 1570 substitutions. Reduced MIP has 1527 rows, 47353 columns, and 115714 nonzeros. Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators. ... </pre>
As you can see, the reduced models are different and from that point one cannot expect the two solves to be the same.
Can you export the two models to a SAV file using cplex.exportModel("model.sav") before calling cplex.solve() and attach the SAV files here? I suspect there is some difference in those files.Attachments

 attachment_14913636_logs.zip
 2.5 MB

Re: solution times
20121128T09:05:48ZThis is the accepted answer. This is the accepted answer. Bryanr
 20121128T08:41:57Z
I realized but I dont know how it is possible! I run another 100 runs for the same problem and exported models. I've attached logs including two first minimum solution times and also for the two maximum instances. Thanks a lot
How do you generate these models before you call cplex.exportModel()? Are you sure that in each iteration you create the models in the exact same way? Could you show us the (pseudo) code that you use to generate the model? Do you use any data structures that may result in nondeterministic ordering of things (like hashmaps that use memory addresses as keys)? 
Re: solution times
20121128T10:36:02ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20121128T09:05:48Z
Do I understand correctly: The models min1.sav and max1.sav in the attachment are supposed to be the same? Well, they are not. At least the 'y' variables in these models have different names and a completely different order.
How do you generate these models before you call cplex.exportModel()? Are you sure that in each iteration you create the models in the exact same way? Could you show us the (pseudo) code that you use to generate the model? Do you use any data structures that may result in nondeterministic ordering of things (like hashmaps that use memory addresses as keys)?Attachments

Re: solution times
20121128T12:18:45ZThis is the accepted answer. This is the accepted answer. Bryanr
 20121128T10:36:02Z
Thanks for your attention. Actually you are right. Both models min1.sav and max1.sav are should be the same at the end. However, the generated 'variables names' can have different order. I used hashmaps for mapping cplex variables to my data structure. I also attached the code it may be helpful.
I did not go through all your code and I am not sure whether your use of HashMap and HashSet is a problem here. However, if the Activity and Service classes have something like a unique id (that does not change between two runs) then I would change to TreeMap and TreeSet, using this unique id as key.
I don't know where all you data comes from but you also need to make sure that the "order" of the data is the same for each run. For example, if the order of services or activities changes between two runs then the resulting model will have variables/constraints in a different order. So the resulting model will be different and solution times may again differ.
Unless your code produces the exact same model for each run there will always be differences in the solution times and they may be significant. 
Re: solution times
20121128T16:53:36ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20121128T12:18:45Z
OK, I think by now it is clear what you need to do: You need to change the code that sets up the problem so that it always creates the exact same problem (in this case the created SAV files will be identical).
I did not go through all your code and I am not sure whether your use of HashMap and HashSet is a problem here. However, if the Activity and Service classes have something like a unique id (that does not change between two runs) then I would change to TreeMap and TreeSet, using this unique id as key.
I don't know where all you data comes from but you also need to make sure that the "order" of the data is the same for each run. For example, if the order of services or activities changes between two runs then the resulting model will have variables/constraints in a different order. So the resulting model will be different and solution times may again differ.
Unless your code produces the exact same model for each run there will always be differences in the solution times and they may be significant. 
Re: solution times
20121128T22:31:55ZThis is the accepted answer. This is the accepted answer. Bryanr
 20121128T16:53:36Z
Thanks Daniel! I will change the code and give it a shot with TreeMaps and keep you updated. However, I'm wondering why and how the order of data changes the solution times for different runs as long as the number of variables and constraints should be exactly the same in generated model for the same problem ?! I attached the input synthesized dataset sample which I feed.
And then there are many other places where the order matters, like presolving, etc.
Laci 
Re: solution times
20121129T10:15:26ZThis is the accepted answer. This is the accepted answer. Laci Ladanyi
 20121128T22:31:55Z
Just an example how ordering your variables and/or constraints can influence the solution time: When you solve an LP relaxation these objects will be considered in different order when decisions need to be made. An example is deciding which variable leaves and which one enters the basis. If you consider the variables in different order (especially if the problem is degenerate) then you'll most likely end up at a different extreme point as the optimal solution. Then of course, the MIP solution process will take different turns and your solution time could be very different.
And then there are many other places where the order matters, like presolving, etc.
Laci
The point is just that CPLEX takes many heuristic decisions during the search (which pivot to perform next in simplex, which variable to branch on in MIP, and so on). Pretty much all models have degeneracy, which means that for some of the decisions multiple variables look exactly the same. Thus, in order to select one of those variables, the only criterion that remains is the index of the variable. So, depending on the decision, CPLEX would typically pick the first or the last of those identically looking variables. This is where the order of the variables (and also rows) comes into play.
In other words, the ordering of the variables and rows has a similar effect as changing the random seed parameter, which is available since CPLEX 12.5.
Tobias 
Re: solution times
20121129T20:12:51ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20121127T16:44:56Z
How certain are you that you indeed solve the exact same problem? The two log files show different solution trees. Moreover, they also show different presolve results:
max.txt:
<pre class="java dw" dataeditorlang="java" datapbcklang="java" dir="ltr">... Aggregator did 1567 substitutions. Reduced MIP has 1530 rows, 47356 columns, and 115729 nonzeros. Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators. ... </pre>
min.txt:
<pre class="java dw" dataeditorlang="java" datapbcklang="java" dir="ltr">... Aggregator did 1570 substitutions. Reduced MIP has 1527 rows, 47353 columns, and 115714 nonzeros. Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators. ... </pre>
As you can see, the reduced models are different and from that point one cannot expect the two solves to be the same.
Can you export the two models to a SAV file using cplex.exportModel("model.sav") before calling cplex.solve() and attach the SAV files here? I suspect there is some difference in those files.
min
.... MIP Presolve eliminated 1049 rows and 3101 columns. Aggregator did 1542 substitutions. Reduced MIP has 1555 rows, 47381 columns, and 115039 nonzeros. Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators. ... Total (root+branch&cut) = 18.75 sec. (4200.35 ticks) Solution status = Optimal max
.....
MIP Presolve eliminated 1049 rows and 3101 columns.
Aggregator did 1542 substitutions.
Reduced MIP has 1555 rows, 47381 columns, and 115039 nonzeros.
Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators.
.....
Total (root+branch&cut) = 62.29 sec. (15598.11 ticks)
Solution status = Optimal
{code}
As we can see the reduced models are the same; however, it leads to different solution times. I'm still subliming around :(Attachments

 attachment_14914598_sav.zip
 1.2 MB
Updated on 20140324T22:44:37Z at 20140324T22:44:37Z by ironman 
Re: solution times
20121129T20:25:16ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20121129T10:15:26Z
Just to avoid confusion: the order of the variables will not have any systematic influence, i.e., there is no "best" or "worst" order, and you cannot "optimize" the order of the variables in your model.
The point is just that CPLEX takes many heuristic decisions during the search (which pivot to perform next in simplex, which variable to branch on in MIP, and so on). Pretty much all models have degeneracy, which means that for some of the decisions multiple variables look exactly the same. Thus, in order to select one of those variables, the only criterion that remains is the index of the variable. So, depending on the decision, CPLEX would typically pick the first or the last of those identically looking variables. This is where the order of the variables (and also rows) comes into play.
In other words, the ordering of the variables and rows has a similar effect as changing the random seed parameter, which is available since CPLEX 12.5.
Tobias 
Re: solution times
20121203T07:27:04ZThis is the accepted answer. This is the accepted answer. Bryanr
 20121129T20:12:51Z
I walked through the codes and had some changes to make sure about the model. And I'm pretty sure about the generated model after testing. So I had another run for the same problem with exact same settings and captured two solutions times for max and min. I also attached the sav format files
min
<pre class="java dw" dataeditorlang="java" datapbcklang="java" dir="ltr">.... MIP Presolve eliminated 1049 rows and 3101 columns. Aggregator did 1542 substitutions. Reduced MIP has 1555 rows, 47381 columns, and 115039 nonzeros. Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators. ... Total (root+branch&cut) = 18.75 sec. (4200.35 ticks) Solution status = Optimal max </pre>
.....
MIP Presolve eliminated 1049 rows and 3101 columns.
Aggregator did 1542 substitutions.
Reduced MIP has 1555 rows, 47381 columns, and 115039 nonzeros.
Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators.
.....
Total (root+branch&cut) = 62.29 sec. (15598.11 ticks)
Solution status = Optimal
{code}
As we can see the reduced models are the same; however, it leads to different solution times. I'm still subliming around :(
You can see that for yourself by export to LP files instead of SAV files.
LP files are human readable and the LP files generated by your code should be exactly the same. 
Re: solution times
20130213T04:07:50ZThis is the accepted answer. This is the accepted answer. SystemAdmin
 20121203T07:27:04Z
The two SAV files still contain different problems. I checked the objective function and that is quite different between the two files.
You can see that for yourself by export to LP files instead of SAV files.
LP files are human readable and the LP files generated by your code should be exactly the same.