Known issues and limitations for IBM® Spectrum LSF Simulator
Known issues and limitations for LSF Simulator.
Known issues
Known issues are issues discovered during testing that can be addressed in future releases. Currently, there are no known issues for LSF Simulator.
Limitations
- One-time advance reservations
- LSF Simulator does not support one-time advance reservations, only recurring advance reservations.
- Threshold scheduling
- LSF Simulator does not support threshold scheduling.
- Memory or CPU limits
- LSF Simulator does not support memory limits (MEMLIMIT) or CPU limits (CPULIMIT).
- Pre-execution, post-execution, or job starter processing
- LSF Simulator does not support pre-execution (pre-exec) and post-execution (post-exec) processing, or job starter (job_starter) commands.
- Dynamic resources and elim scripts
- LSF Simulator does not support elim scripts. Dynamic resources that are reported from the elim process are converted to fixed resources when loading the LSF configuration.
- Dynamic hosts
- If the LSF production cluster that you are going to simulate contains dynamic hosts, these hosts are not simulated as dynamic hosts. LSF Simulator attempts to convert the dynamic hosts into static hosts in the collected data so these hosts show up as static hosts in the simulated cluster. The conversion adds these hosts to the lsf.cluster.cluster_name file in the conf directory in the collected data and adds the dynamic hosts into the hostlist line. This adds the hosts as static hosts that are recognized by LSF Simulator.
- LSF resource connector
- LSF Simulator does not support the LSF resource connector, because the resource connector requires communicating with cloud APIs and using dynamic hosts.
- LSF multicluster capability
- LSF Simulator does not support the LSF multicluster capability. LSF Simulator only works with single LSF clusters.
- LSF Data Manager
- LSF Simulator does not support LSF Data Manager.
- Time compression factor
- LSF Simulator time compression is primarily intended for low-throughput environments in which most jobs are long running (for example, longer than an hour). Compression works by internally simulating a faster clock to reduce the real time used to simulate running a job. For example, a time compression factor of 2 means that a job that took one hour to complete in reality takes only half an hour in the simulation.
Note that compression typically does not reduce the real scheduling cycle time when running a simulation. Instead, the LSF scheduling time increases by the compression factor with respect to simulation time. For example, if a scheduling cycle takes ten seconds to complete in real time, then with a compression factor of 2 the scheduling cycle takes 20 seconds of simulation time.
If the scheduling cycle time in the simulation is large relative to job lengths, this can seriously affect the accuracy of the simulation. As a best practice, set the compression value so that the average scheduling cycle time in the simulation is only a few percent of the average job length.
Moreover, the scheduling efficiency metrics in the badmin perfmon view command output indicate the degree to which scheduling overhead is reducing the resources used in the cluster. Setting the RELAX_JOB_DISPATCH_ORDER parameter in the lsb.params file with a large maximum reuse time can also help to reduce the effect of a large effective scheduling cycle time during simulations.
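The compression arithmetic described above can be sketched in a few lines of Python. This is only an illustrative calculation, not part of LSF Simulator: the function names are hypothetical, and the 5% threshold stands in for "a few percent" as an assumption that you should adjust for your own workload.

```python
def simulated_job_seconds(real_job_seconds: float, compression: float) -> float:
    """Real (wall-clock) time needed to simulate one job at the given factor."""
    return real_job_seconds / compression


def effective_cycle_seconds(real_cycle_seconds: float, compression: float) -> float:
    """Scheduling cycle length expressed in simulation time."""
    return real_cycle_seconds * compression


def cycle_to_job_ratio(real_cycle_seconds: float, avg_job_seconds: float,
                       compression: float) -> float:
    """Effective scheduling cycle as a fraction of the average job length."""
    return effective_cycle_seconds(real_cycle_seconds, compression) / avg_job_seconds


def max_compression(real_cycle_seconds: float, avg_job_seconds: float,
                    max_ratio: float = 0.05) -> float:
    """Largest factor that keeps the effective cycle within max_ratio of the
    average job length. max_ratio=0.05 is an assumed reading of 'a few percent'."""
    return max_ratio * avg_job_seconds / real_cycle_seconds


if __name__ == "__main__":
    # Figures from the text: a one-hour job, a ten-second cycle, a factor of 2.
    print(simulated_job_seconds(3600, 2))       # real seconds spent simulating the job
    print(effective_cycle_seconds(10, 2))       # cycle length in simulation seconds
    print(cycle_to_job_ratio(10, 3600, 2))      # fraction of the average job length
    print(max_compression(10, 3600))            # upper bound under the assumed 5% rule
```

With the figures used in the text, the effective cycle of 20 seconds is well under one percent of a one-hour job, so a factor of 2 stays comfortably inside the guideline. Shorter jobs or longer scheduling cycles lower the usable factor proportionally.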