Topic
20 replies Latest Post - ‏2013-02-13T04:07:50Z by Bryanr
Bryanr
Bryanr
18 Posts
ACCEPTED ANSWER

Pinned topic solution times

‏2012-11-17T07:28:20Z |
Hello :)

I use Cplex from Java and running a simulation. I basically run each instance problem about one hundred times and keep track of the cplex time for solving the optimal solution for it. For small-model size in terms of number of variables, the captured time is quite close to each other. However, for the big-model size, the variance of solution times are quite big and different. I'm wondering how theoretically it could be justified? I appreciate your response
Updated on 2013-02-13T04:07:50Z at 2013-02-13T04:07:50Z by Bryanr
  • JorisK
    JorisK
    42 Posts
    ACCEPTED ANSWER

    Re: solution times

    ‏2012-11-18T22:49:09Z  in response to Bryanr
    As far as I know, cplex does not preserve order. In fact, two different runs might yield two different solutions (with the same objective). cplex deploys several heuristics which might be the cause of time differencse. Alternatively, I suppose that whenever it has to make a branching decision, it sometimes happens that two variables are equally promising to branch on, so a random choice is made. This however could turn out to be a bad branching decision in the end.
    • SystemAdmin
      SystemAdmin
      7929 Posts
      ACCEPTED ANSWER

      Re: solution times

      ‏2012-11-19T13:24:04Z  in response to JorisK
      > As far as I know, cplex does not preserve order. In fact, two different runs might yield two different solutions (with the same objective). cplex deploys several heuristics which might be the cause of time differencse. Alternatively, I suppose that whenever it has to make a branching decision, it sometimes happens that two variables are equally promising to branch on, so a random choice is made. This however could turn out to be a bad branching decision in the end.
      >
      You are quite wrong here. In default settings CPLEX is deterministic. That is, running the exact same problem with the exact same parameter settings on the exact same machine will produce the exact same results. Unless there are lots of jobs running concurrently even solution times should be pretty much the same.
      You have to explicitly change parameters to make CPLEX non-deterministic.
  • SystemAdmin
    SystemAdmin
    7929 Posts
    ACCEPTED ANSWER

    Re: solution times

    ‏2012-11-19T13:26:35Z  in response to Bryanr
    Not sure if I got that right: Do you solve the exact same problem 100 times or are there (small) differences in the problems you solve? Are these MIPs or pure LPs? Do you solve all problems in the same IloCplex instance or do you create a new IloCplex instance for each problem?
    • Bryanr
      Bryanr
      18 Posts
      ACCEPTED ANSWER

      Re: solution times

      ‏2012-11-25T02:21:46Z  in response to SystemAdmin
      Thanks a lot Daniel. I solve the exact 0-1 LP problem 100 times by running a command line program multiple times in Java with the exact same parameter settings on the exact same dedicated machine. So I'm not using the same IloCplex instance because the program is run again. However, for big model size the variance of solution time is still quite considerable and different! Do you have any thoughts?
      • SystemAdmin
        SystemAdmin
        7929 Posts
        ACCEPTED ANSWER

        Re: solution times

        ‏2012-11-26T07:21:04Z  in response to Bryanr
        That indeed sounds weird. Could you send us the log output for two runs that are dramatically different?
        Also, could you export your parameter settings by calling
        cplex.writeParam("params.prm");
        

        right before calling cplex.solve() and send us the created file?
        What is the version of CPLEX you use?
        Updated on 2014-03-24T22:45:32Z at 2014-03-24T22:45:32Z by iron-man
        • Bryanr
          Bryanr
          18 Posts
          ACCEPTED ANSWER

          Re: solution times

          ‏2012-11-27T16:12:37Z  in response to SystemAdmin
          Thanks Daniel! I have attached logs of two runs, namely 'min' and 'max', as well as the parameter settings log. I use cplex V12.4.

          Attachments

          • SystemAdmin
            SystemAdmin
            7929 Posts
            ACCEPTED ANSWER

            Re: solution times

            ‏2012-11-27T16:44:56Z  in response to Bryanr
            How certain are you that you indeed solve the exact same problem? The two log files show different solution trees. Moreover, they also show different presolve results:
            max.txt:
            ...
            Aggregator did 1567 substitutions.
            Reduced MIP has 1530 rows, 47356 columns, and 115729 nonzeros.
            Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators.
            ...
            

            min.txt:
            ...
            Aggregator did 1570 substitutions.
            Reduced MIP has 1527 rows, 47353 columns, and 115714 nonzeros.
            Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators.
            ...
            

            As you can see, the reduced models are different and from that point one cannot expect the two solves to be the same.
            Can you export the two models to a SAV file using cplex.exportModel("model.sav") before calling cplex.solve() and attach the SAV files here? I suspect there is some difference in those files.
            Updated on 2014-03-24T22:44:56Z at 2014-03-24T22:44:56Z by iron-man
            • Bryanr
              Bryanr
              18 Posts
              ACCEPTED ANSWER

              Re: solution times

              ‏2012-11-28T08:41:57Z  in response to SystemAdmin
              I realized but I dont know how it is possible! I run another 100 runs for the same problem and exported models. I've attached logs including two first minimum solution times and also for the two maximum instances. Thanks a lot

              Attachments

              • SystemAdmin
                SystemAdmin
                7929 Posts
                ACCEPTED ANSWER

                Re: solution times

                ‏2012-11-28T09:05:48Z  in response to Bryanr
                Do I understand correctly: The models min1.sav and max1.sav in the attachment are supposed to be the same? Well, they are not. At least the 'y' variables in these models have different names and a completely different order.
                How do you generate these models before you call cplex.exportModel()? Are you sure that in each iteration you create the models in the exact same way? Could you show us the (pseudo) code that you use to generate the model? Do you use any data structures that may result in non-deterministic ordering of things (like hashmaps that use memory addresses as keys)?
                • Bryanr
                  Bryanr
                  18 Posts
                  ACCEPTED ANSWER

                  Re: solution times

                  ‏2012-11-28T10:36:02Z  in response to SystemAdmin
                  Thanks for your attention. Actually you are right. Both models min1.sav and max1.sav are should be the same at the end. However, the generated 'variables names' can have different order. I used hashmaps for mapping cplex variables to my data structure. I also attached the code it may be helpful.

                  Attachments

                  • SystemAdmin
                    SystemAdmin
                    7929 Posts
                    ACCEPTED ANSWER

                    Re: solution times

                    ‏2012-11-28T12:18:45Z  in response to Bryanr
                    OK, I think by now it is clear what you need to do: You need to change the code that sets up the problem so that it always creates the exact same problem (in this case the created SAV files will be identical).
                    I did not go through all your code and I am not sure whether your use of HashMap and HashSet is a problem here. However, if the Activity and Service classes have something like a unique id (that does not change between two runs) then I would change to TreeMap and TreeSet, using this unique id as key.
                    I don't know where all you data comes from but you also need to make sure that the "order" of the data is the same for each run. For example, if the order of services or activities changes between two runs then the resulting model will have variables/constraints in a different order. So the resulting model will be different and solution times may again differ.
                    Unless your code produces the exact same model for each run there will always be differences in the solution times and they may be significant.
                    • Bryanr
                      Bryanr
                      18 Posts
                      ACCEPTED ANSWER

                      Re: solution times

                      ‏2012-11-28T16:53:36Z  in response to SystemAdmin
                      Thanks Daniel! I will change the code and give it a shot with TreeMaps and keep you updated. However, I'm wondering why and how the order of data changes the solution times for different runs as long as the number of variables and constraints should be exactly the same in generated model for the same problem ?! I attached the input synthesized dataset sample which I feed.
                      • Laci Ladanyi
                        Laci Ladanyi
                        79 Posts
                        ACCEPTED ANSWER

                        Re: solution times

                        ‏2012-11-28T22:31:55Z  in response to Bryanr
                        Just an example how ordering your variables and/or constraints can influence the solution time: When you solve an LP relaxation these objects will be considered in different order when decisions need to be made. An example is deciding which variable leaves and which one enters the basis. If you consider the variables in different order (especially if the problem is degenerate) then you'll most likely end up at a different extreme point as the optimal solution. Then of course, the MIP solution process will take different turns and your solution time could be very different.
                        And then there are many other places where the order matters, like presolving, etc.

                        --Laci
                        • SystemAdmin
                          SystemAdmin
                          7929 Posts
                          ACCEPTED ANSWER

                          Re: solution times

                          ‏2012-11-29T10:15:26Z  in response to Laci Ladanyi
                          Just to avoid confusion: the order of the variables will not have any systematic influence, i.e., there is no "best" or "worst" order, and you cannot "optimize" the order of the variables in your model.
                          The point is just that CPLEX takes many heuristic decisions during the search (which pivot to perform next in simplex, which variable to branch on in MIP, and so on). Pretty much all models have degeneracy, which means that for some of the decisions multiple variables look exactly the same. Thus, in order to select one of those variables, the only criterion that remains is the index of the variable. So, depending on the decision, CPLEX would typically pick the first or the last of those identically looking variables. This is where the order of the variables (and also rows) comes into play.

                          In other words, the ordering of the variables and rows has a similar effect as changing the random seed parameter, which is available since CPLEX 12.5.

                          Tobias
                          • Bryanr
                            Bryanr
                            18 Posts
                            ACCEPTED ANSWER

                            Re: solution times

                            ‏2012-11-29T20:25:16Z  in response to SystemAdmin
                            Thanks for your responses here, Tobias and Laci! It was thoughtful of you. So we can draw a conclusion that it is the main reason that results in different solution times for the same problem!
            • Bryanr
              Bryanr
              18 Posts
              ACCEPTED ANSWER

              Re: solution times

              ‏2012-11-29T20:12:51Z  in response to SystemAdmin
              I walked through the codes and had some changes to make sure about the model. And I'm pretty sure about the generated model after testing. So I had another run for the same problem with exact same settings and captured two solutions times for max and min. I also attached the sav format files

              min
              ....
              MIP Presolve eliminated 1049 rows and 3101 columns.
              Aggregator did 1542 substitutions.
              Reduced MIP has 1555 rows, 47381 columns, and 115039 nonzeros.
              Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators.
              ...
              Total (root+branch&cut) =   18.75 sec. (4200.35 ticks)
              Solution status = Optimal
               
               
              max
              

              .....
              MIP Presolve eliminated 1049 rows and 3101 columns.
              Aggregator did 1542 substitutions.
              Reduced MIP has 1555 rows, 47381 columns, and 115039 nonzeros.
              Reduced MIP has 46325 binaries, 0 generals, 0 SOSs, and 880 indicators.
              .....
              Total (root+branch&cut) = 62.29 sec. (15598.11 ticks)
              Solution status = Optimal
              {code}

              As we can see the reduced models are the same; however, it leads to different solution times. I'm still subliming around :(

              Attachments

              Updated on 2014-03-24T22:44:37Z at 2014-03-24T22:44:37Z by iron-man
              • SystemAdmin
                SystemAdmin
                7929 Posts
                ACCEPTED ANSWER

                Re: solution times

                ‏2012-12-03T07:27:04Z  in response to Bryanr
                The two SAV files still contain different problems. I checked the objective function and that is quite different between the two files.
                You can see that for yourself by export to LP files instead of SAV files.
                LP files are human readable and the LP files generated by your code should be exactly the same.
                • Bryanr
                  Bryanr
                  18 Posts
                  ACCEPTED ANSWER

                  Re: solution times

                  ‏2013-02-13T04:07:50Z  in response to SystemAdmin
                  Thanks a lot Daniel! Sorry for late reply as I was missing your last message here. I had it fixed
  • SystemAdmin
    SystemAdmin
    7929 Posts
    ACCEPTED ANSWER

    Re: solution times

    ‏2012-11-19T22:48:25Z  in response to Bryanr
    Do you turn off advanced starts (on by default)? If not, and if you do not create a new instance of IloCplex for each solve of the same problem (see Daniel's question), the first crack at each problem should take considerably longer than all subsequent attempts (inflating variance).

    Paul

    Mathematicians are like Frenchmen: whenever you say something to them, they translate it into their own language, and at once it is something entirely different. (Goethe)
    • Bryanr
      Bryanr
      18 Posts
      ACCEPTED ANSWER

      Re: solution times

      ‏2012-11-25T02:37:41Z  in response to SystemAdmin
      Many thanks for your response here, Paul. I just had chance to reply back Daniel here. Actually it is not the case that I create a new instance because I run a command line java multiple times with the exact same argument for the same problem. But once the size of the same problem in terms of number of variables get increased (e.g., for a large model) running multiple times of the same problem result in different solution time, however, after the variance of captured solution times is big. The machine is also the same, and there are not lots of jobs running concurrently!