Comments (4)

1 msoos commented Permalink

Also, there are more parameters than just time. You hinted at this with the "first solution" example, where it may be better to give a good enough first solution quickly rather than a much better one a bit later. I don't have this problem (with SAT), but memory, for example, is very important, and many algorithms do eat quite a big chunk of it. Thus, comparing times becomes rather useless when the memory limit of the user is reached long before any solution can be found. I've found this particularly problematic, as small but hard problems would greatly benefit from the algorithms storing their data differently, but then larger problems would not fit. This sometimes means the algorithm has to be written twice, or in a polymorphic/templated way, which makes it difficult to debug, etc.
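The size-dependent representation choice described above can be sketched roughly as follows. This is an illustrative toy, not code from any real solver: it stores SAT clauses either as one list per clause (fast to manipulate, pointer-heavy) or as a single flat array with offsets (compact), and the threshold for switching is an arbitrary assumption:

```python
import array

class PointerClauses:
    """Fast but memory-hungry: one Python list per clause."""
    def __init__(self, clauses):
        self.clauses = [list(c) for c in clauses]

    def literals(self, i):
        return self.clauses[i]

class FlatClauses:
    """Compact: all literals in one flat int array, plus clause offsets."""
    def __init__(self, clauses):
        self.starts = array.array('l', [0])
        self.lits = array.array('l')
        for c in clauses:
            self.lits.extend(c)
            self.starts.append(len(self.lits))

    def literals(self, i):
        return list(self.lits[self.starts[i]:self.starts[i + 1]])

def make_clause_db(clauses, large_threshold=1_000_000):
    """Pick a representation based on total problem size
    (the threshold value is a made-up example)."""
    n_lits = sum(len(c) for c in clauses)
    if n_lits >= large_threshold:
        return FlatClauses(clauses)
    return PointerClauses(clauses)

db = make_clause_db([[1, -2], [2, 3, -4]])
print(db.literals(1))  # → [2, 3, -4]
```

A factory function like `make_clause_db` is one way to avoid writing the algorithm twice: the rest of the solver codes against the shared `literals` interface, at the cost of the debugging indirection mentioned above.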

2 JeanFrancoisPuget commented Permalink

Right, there are many possible definitions for f, and memory use could be one. I've documented the ones we use, given that they suit our needs. Other definitions may make more sense in another context. Note that the trap I discuss is relevant whatever definition of f is used.

 
Regarding memory specifically, it was a blocking factor with 32-bit operating systems. With a 64-bit OS, memory isn't really a limit in general.

3 Marc-AndréCarle commented Permalink

You seem to have a very systematic approach to testing and benchmarking; that's great, but it also requires significant computational resources to benchmark each potential change. In some projects I wanted to do this, but testing would have been too slow to keep pace with code development. I considered bundling small changes together (changes that are as independent as possible from one another) for the purpose of testing; the alternative was to use smaller test samples. I ended up doing a bit of both. Is that something you have to do sometimes?

4 JeanFrancoisPuget commented Permalink

You are right, this requires significant compute resources. We have several farms of identical machines that help us expedite testing. The key is to have strictly identical machines, in which case you can run performance tests in parallel.
That said, performing a full performance test for each change could indeed be too much. What we can do is test using a subset as a filter. Proposed changes that improve on the subset are then lumped together, and the overall change is tested against the full test set. The point is that no commit is permanent before a full test is done.
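The two-stage process described here might be sketched as follows. The scoring callables are hypothetical stand-ins for "run the benchmark subset" and "run the full test set"; only the gating logic is the point:

```python
def gate_changes(candidates, baseline_subset, baseline_full,
                 subset_score, full_score):
    """Stage 1: keep only changes that improve on the cheap subset.
    Stage 2: accept the bundle only if the full test confirms it.
    subset_score(change) and full_score(changes) are placeholders for
    real benchmark runs; higher scores are better."""
    promising = [c for c in candidates
                 if subset_score(c) > baseline_subset]
    if promising and full_score(promising) > baseline_full:
        return promising  # safe to commit the whole bundle
    return []             # no commit before a full test passes

# Toy usage with fake per-change scores:
scores = {"A": 1.2, "B": 0.8, "C": 1.1}
accepted = gate_changes(
    ["A", "B", "C"],
    baseline_subset=1.0,
    baseline_full=1.0,
    subset_score=lambda c: scores[c],
    full_score=lambda cs: sum(scores[c] for c in cs) / len(cs),
)
print(accepted)  # → ['A', 'C']
```

Note how the subset acts purely as a filter: a change can look good on the subset yet still be rejected when the bundle fails the full test, which is what keeps the cheaper stage from polluting the committed baseline.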