Decreasing GPU power consumption when a GPU is not in use

A GPU consumes significant power even when it is idle. LSF provides configuration parameters to reduce the power that a GPU consumes when it has not been in use for a specified time. By default, LSF does not power off a GPU even when it is idle.

Set the LSB_GPU_POWEROFF_DURATION parameter in the lsf.conf file to specify the minimum number of seconds that a GPU must be idle before LSF can power it off. When the LSB_GPU_POWEROFF_DURATION parameter is set, LSF first tries to allocate GPUs that are not running in "MIN power limit" mode. If not enough GPUs are in "MAX power limit" mode, LSF also allocates GPUs that are in "MIN power limit" mode and switches those GPUs to run in "MAX power limit" mode.

If the LSB_GPU_POWEROFF_DURATION=0 parameter is set, LSF powers off GPUs immediately after the job finishes.
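
As an illustration, an lsf.conf entry might look like the following fragment. The value of 300 seconds is an arbitrary example, not a recommended setting; choose a duration that suits how often GPU jobs arrive on your hosts.

```
# lsf.conf (illustrative fragment; the value 300 is an assumption)
# Allow LSF to power off a GPU after it has been idle for at least 300 seconds.
LSB_GPU_POWEROFF_DURATION=300

# Alternative: power off GPUs immediately after the job finishes.
# LSB_GPU_POWEROFF_DURATION=0
```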

LSF uses the following criteria when it allocates GPUs:
  • All GPUs are in the same PCI.
  • LSF checks whether the "MAX power limit" mode GPUs meet the job requirements. If they do, LSF allocates those GPUs in preference to the "MIN power limit" mode GPUs. If they do not meet the requirements, LSF allocates all the GPUs to the job, including both "MAX power limit" and "MIN power limit" mode GPUs.
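
The preference for "MAX power limit" mode GPUs can be sketched in Python as follows. This is a simplified model for illustration only, not LSF's actual implementation: the Gpu class, the allocate_gpus function, and the "MAX"/"MIN" labels are hypothetical names, the job requirement is reduced to a GPU count, and the PCI criterion is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    gpu_id: int
    mode: str  # hypothetical labels: "MAX" or "MIN" power limit mode


def allocate_gpus(gpus, needed):
    """Pick `needed` GPUs, preferring those already in MAX power limit mode.

    Returns the chosen GPUs, or an empty list if the host has too few GPUs.
    """
    max_mode = [g for g in gpus if g.mode == "MAX"]
    min_mode = [g for g in gpus if g.mode == "MIN"]

    # Enough MAX-mode GPUs: leave the MIN-mode GPUs untouched.
    if len(max_mode) >= needed:
        return max_mode[:needed]

    # Otherwise fall back to MIN-mode GPUs to cover the shortfall.
    shortfall = needed - len(max_mode)
    if shortfall > len(min_mode):
        return []

    woken = min_mode[:shortfall]
    for g in woken:
        g.mode = "MAX"  # allocated MIN-mode GPUs are switched to MAX mode
    return max_mode + woken
```

In this model, a job that needs two GPUs on a host with two "MAX" and two "MIN" GPUs receives only the "MAX" ones, while a job that needs three also receives one "MIN" GPU, which is switched to "MAX" mode.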

If the sbatchd daemon is restarted, the GPU idle time is recalculated.

NVIDIA K80 hardware supports switching power limits. The NVML library must be Version 6.340 or newer.