Hello :)
I use CPLEX from Java and am running a simulation. I basically run each problem instance about one hundred times and record the CPLEX time for solving it to optimality. For small model sizes (in terms of the number of variables), the captured times are quite close to each other. However, for big model sizes, the variance of the solution times is quite large. I'm wondering how this could be justified theoretically? I appreciate your response.
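For reference, the bookkeeping described here can be sketched as follows; this is a minimal illustration, and the class name and the sample times are hypothetical, not taken from the actual runs:

```java
import java.util.Arrays;

public class SolveTimeStats {
    // Mean of the recorded solve times (seconds).
    static double mean(double[] times) {
        return Arrays.stream(times).average().orElse(0.0);
    }

    // Sample variance of the recorded solve times.
    static double variance(double[] times) {
        double m = mean(times);
        return Arrays.stream(times)
                     .map(t -> (t - m) * (t - m))
                     .sum() / (times.length - 1);
    }

    public static void main(String[] args) {
        // Hypothetical captured times for a small and a big model.
        double[] small = {1.02, 1.01, 1.03, 1.02};
        double[] big   = {18.75, 62.29, 25.40, 41.10};
        System.out.printf("small: mean=%.2f var=%.4f%n", mean(small), variance(small));
        System.out.printf("big:   mean=%.2f var=%.4f%n", mean(big), variance(big));
    }
}
```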
Pinned topic: solution times
2012-11-17T07:28:20Z
This topic has been locked.
20 replies
Latest post: 2013-02-13T04:07:50Z by Bryanr
This question has been answered.
Updated on 2013-02-13T04:07:50Z by Bryanr
Re: solution times
2012-11-18T22:49:09Z, in response to Bryanr
As far as I know, CPLEX does not preserve order. In fact, two different runs might yield two different solutions (with the same objective value). CPLEX employs several heuristics, which might be the cause of the time differences. Alternatively, I suppose that whenever it has to make a branching decision, it sometimes happens that two variables are equally promising to branch on, so a random choice is made. This could, however, turn out to be a bad branching decision in the end.
Re: solution times
2012-11-19T13:24:04Z, in response to JorisK
> As far as I know, CPLEX does not preserve order. In fact, two different runs might yield two different solutions (with the same objective value). CPLEX employs several heuristics, which might be the cause of the time differences. Alternatively, I suppose that whenever it has to make a branching decision, it sometimes happens that two variables are equally promising to branch on, so a random choice is made. This could, however, turn out to be a bad branching decision in the end.

You are quite wrong here. With default settings, CPLEX is deterministic. That is, running the exact same problem with the exact same parameter settings on the exact same machine will produce the exact same results. Unless there are lots of jobs running concurrently, even the solution times should be pretty much the same.
You have to explicitly change parameters to make CPLEX nondeterministic.


Re: solution times
2012-11-19T13:26:35Z, in response to Bryanr
Not sure if I got that right: do you solve the exact same problem 100 times, or are there (small) differences in the problems you solve? Are these MIPs or pure LPs? Do you solve all problems in the same IloCplex instance, or do you create a new IloCplex instance for each problem?
Re: solution times
2012-11-25T02:21:46Z, in response to SystemAdmin
Thanks a lot, Daniel. I solve the exact same 0-1 LP problem 100 times by running a command-line Java program multiple times with the exact same parameter settings on the exact same dedicated machine. So I'm not using the same IloCplex instance, because the program is started anew each time. However, for the big model size the variance of the solution times is still quite considerable! Do you have any thoughts?
Re: solution times
2012-11-26T07:21:04Z, in response to Bryanr
That indeed sounds weird. Could you send us the log output for two runs that are dramatically different?
Also, could you export your parameter settings by calling
cplex.writeParam("params.prm");
right before calling cplex.solve() and send us the created file?
What is the version of CPLEX you use?
Updated on 2014-03-24T22:45:32Z by ironman
Re: solution times
2012-11-27T16:12:37Z, in response to SystemAdmin
Thanks, Daniel! I have attached the logs of two runs, namely 'min' and 'max', as well as the parameter settings file. I use CPLEX V12.4.
Attachments

Re: solution times
2012-11-27T16:44:56Z, in response to Bryanr
How certain are you that you indeed solve the exact same problem? The two log files show different solution trees. Moreover, they also show different presolve results:
max.txt:
...
Aggregator did 1567 substitutions.
Reduced MIP has 1530 rows, 47356 columns, and 115729 nonzeros.
Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators.
...
min.txt:
...
Aggregator did 1570 substitutions.
Reduced MIP has 1527 rows, 47353 columns, and 115714 nonzeros.
Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators.
...
As you can see, the reduced models are different, and from that point on one cannot expect the two solves to be the same.
Can you export the two models to SAV files using cplex.exportModel("model.sav") before calling cplex.solve() and attach the SAV files here? I suspect there is some difference in those files.
Updated on 2014-03-24T22:44:56Z by ironman
Re: solution times
2012-11-28T08:41:57Z, in response to SystemAdmin
I realized that, but I don't know how it is possible! I ran another 100 runs for the same problem and exported the models. I've attached logs including the two minimum solution times and also the two maximum ones. Thanks a lot.
Attachments

 attachment_14913636_logs.zip
 2.5 MB

Re: solution times
2012-11-28T09:05:48Z, in response to Bryanr
Do I understand correctly: the models min1.sav and max1.sav in the attachment are supposed to be the same? Well, they are not. At least the 'y' variables in these models have different names and a completely different order.
How do you generate these models before you call cplex.exportModel()? Are you sure that in each iteration you create the models in the exact same way? Could you show us the (pseudo) code that you use to generate the model? Do you use any data structures that may result in a nondeterministic ordering of things (like hashmaps that use memory addresses as keys)?
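To illustrate the hazard Daniel is pointing at: a class without equals()/hashCode() falls back to identity hash codes, which depend on object addresses and hence on the JVM run, so HashMap iteration order over such keys can differ between runs. A minimal sketch (the Activity class here is a hypothetical stand-in, not the code from the attachment):

```java
import java.util.HashMap;
import java.util.Map;

public class IdentityKeyDemo {
    // No equals()/hashCode() overrides, so HashMap uses identity hashes,
    // which vary from one JVM run to the next.
    static class Activity {
        final String id;
        Activity(String id) { this.id = id; }
    }

    public static void main(String[] args) {
        Map<Activity, Integer> vars = new HashMap<>();
        vars.put(new Activity("a1"), 0);
        vars.put(new Activity("a1"), 1); // same logical activity, different key!
        // Two entries, and their iteration order is not stable across runs.
        System.out.println(vars.size()); // prints 2
    }
}
```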
Re: solution times
2012-11-28T10:36:02Z, in response to SystemAdmin
Thanks for your attention. Actually, you are right. Both models min1.sav and max1.sav should be the same in the end. However, the generated variable names can have a different order. I used hashmaps for mapping CPLEX variables to my data structures. I have also attached the code; it may be helpful.
Attachments

Re: solution times
2012-11-28T12:18:45Z, in response to Bryanr
OK, I think by now it is clear what you need to do: you need to change the code that sets up the problem so that it always creates the exact same problem (in which case the created SAV files will be identical).
I did not go through all your code, and I am not sure whether your use of HashMap and HashSet is a problem here. However, if the Activity and Service classes have something like a unique id (that does not change between two runs), then I would change to TreeMap and TreeSet, using this unique id as the key.
I don't know where all your data comes from, but you also need to make sure that the "order" of the data is the same for each run. For example, if the order of services or activities changes between two runs, then the resulting model will have its variables/constraints in a different order. So the resulting model will be different, and solution times may again differ.
Unless your code produces the exact same model for each run, there will always be differences in the solution times, and they may be significant.
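The TreeMap fix Daniel suggests can be sketched as follows; the ids are made up for illustration. Keying on a unique, run-independent id makes the iteration order reproducible regardless of insertion order:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class DeterministicOrderDemo {
    public static void main(String[] args) {
        // TreeMap iterates its keys in sorted order, so variables keyed by a
        // stable unique id are visited identically on every run.
        TreeMap<String, Integer> varIndexById = new TreeMap<>();
        varIndexById.put("svc-2", 1);
        varIndexById.put("svc-10", 2);
        varIndexById.put("svc-1", 0);

        List<String> order = new ArrayList<>(varIndexById.keySet());
        // String keys sort lexicographically: "svc-10" comes before "svc-2".
        System.out.println(order); // prints [svc-1, svc-10, svc-2]
    }
}
```

Note the lexicographic caveat: if the ids are numeric, use an Integer key or a custom Comparator so that "10" does not sort before "2".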
Re: solution times
2012-11-28T16:53:36Z, in response to SystemAdmin
Thanks, Daniel! I will change the code, give it a shot with TreeMaps, and keep you updated. However, I'm wondering why and how the order of the data changes the solution times across runs, as long as the number of variables and constraints is exactly the same in the generated model for the same problem?! I attached a sample of the synthesized input dataset that I feed in.
Re: solution times
2012-11-28T22:31:55Z, in response to Bryanr
Just an example of how the ordering of your variables and/or constraints can influence the solution time: when you solve an LP relaxation, these objects will be considered in a different order whenever decisions need to be made. An example is deciding which variable leaves and which one enters the basis. If you consider the variables in a different order (especially if the problem is degenerate), then you'll most likely end up at a different extreme point as the optimal solution. Then, of course, the MIP solution process will take different turns, and your solution time could be very different.
And then there are many other places where the order matters, like presolving, etc.
Laci
Re: solution times
2012-11-29T10:15:26Z, in response to Laci Ladanyi
Just to avoid confusion: the order of the variables will not have any systematic influence, i.e., there is no "best" or "worst" order, and you cannot "optimize" the order of the variables in your model.
The point is just that CPLEX makes many heuristic decisions during the search (which pivot to perform next in the simplex method, which variable to branch on in MIP, and so on). Pretty much all models have degeneracy, which means that for some of the decisions multiple variables look exactly the same. Thus, in order to select one of those variables, the only criterion that remains is the index of the variable. So, depending on the decision, CPLEX would typically pick the first or the last of those identical-looking variables. This is where the order of the variables (and also of the rows) comes into play.
In other words, the ordering of the variables and rows has a similar effect as changing the random seed parameter, which is available since CPLEX 12.5.
Tobias
Re: solution times
2012-11-29T20:25:16Z, in response to SystemAdmin
Thanks for your responses here, Tobias and Laci! It was thoughtful of you. So we can conclude that this is the main reason for the different solution times for the same problem!
Re: solution times
2012-11-29T20:12:51Z, in response to SystemAdmin
I walked through the code and made some changes to make sure the model is always the same, and after testing I'm pretty sure about the generated model. So I had another run of the same problem with the exact same settings and captured the two solution times for max and min. I also attached the files in SAV format.
min:
...
MIP Presolve eliminated 1049 rows and 3101 columns.
Aggregator did 1542 substitutions.
Reduced MIP has 1555 rows, 47381 columns, and 115039 nonzeros.
Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators.
...
Total (root+branch&cut) = 18.75 sec. (4200.35 ticks)
Solution status = Optimal
max:
...
MIP Presolve eliminated 1049 rows and 3101 columns.
Aggregator did 1542 substitutions.
Reduced MIP has 1555 rows, 47381 columns, and 115039 nonzeros.
Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators.
...
Total (root+branch&cut) = 62.29 sec. (15598.11 ticks)
Solution status = Optimal
As we can see, the reduced models are the same; however, the solution times differ. I'm still puzzled :(
Attachments
 attachment_14914598_sav.zip
 1.2 MB
Updated on 2014-03-24T22:44:37Z by ironman
Re: solution times
2012-12-03T07:27:04Z, in response to Bryanr
The two SAV files still contain different problems. I checked the objective function, and it is quite different between the two files.
You can see this for yourself by exporting to LP files instead of SAV files. LP files are human-readable, and the LP files generated by your code should be exactly the same.
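Comparing the two exported LP files can be automated; a minimal sketch under the assumption that both files exist on disk (the class name and file handling here are illustrative, not from the thread):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class LpDiff {
    // Returns the 1-based number of the first line where the two files
    // differ, or -1 if the files are identical.
    static int firstDifference(Path a, Path b) throws IOException {
        List<String> la = Files.readAllLines(a);
        List<String> lb = Files.readAllLines(b);
        int n = Math.min(la.size(), lb.size());
        for (int i = 0; i < n; i++) {
            if (!la.get(i).equals(lb.get(i))) return i + 1;
        }
        // Identical prefix: differ only if one file has extra lines.
        return la.size() == lb.size() ? -1 : n + 1;
    }
}
```

If firstDifference reports anything other than -1 for the min/max exports, that line pinpoints where the two "identical" models diverge.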
Re: solution times
2013-02-13T04:07:50Z, in response to SystemAdmin
Thanks a lot, Daniel! Sorry for the late reply; I had missed your last message here. I have it fixed now.
Re: solution times
2012-11-19T22:48:25Z, in response to Bryanr
Do you turn off advanced starts (on by default)? If not, and if you do not create a new instance of IloCplex for each solve of the same problem (see Daniel's question), the first crack at each problem should take considerably longer than all subsequent attempts (inflating the variance).
Paul
Mathematicians are like Frenchmen: whenever you say something to them, they translate it into their own language, and at once it is something entirely different. (Goethe)
Re: solution times
2012-11-25T02:37:41Z, in response to SystemAdmin
Many thanks for your response here, Paul. I just got the chance to reply to Daniel here. It is actually not the case that I create a new instance within one process: I run a command-line Java program multiple times with the exact same arguments for the same problem. But once the size of the problem increases in terms of the number of variables (e.g., for a large model), running the same problem multiple times results in different solution times, and the variance of the captured solution times is big. The machine is also the same, and there are not lots of jobs running concurrently!