
Understanding mbatchd performance metrics

LSF mbatchd performance metrics help administrators identify the root causes of mbatchd performance issues when they occur, so that they can take appropriate corrective action. These metrics are particularly useful when the cause of the issue lies in the cluster environment; for example, the performance of shared storage or a network connection.

You can enable mbatchd performance metrics in two ways:

  • Set performance metric parameters in lsf.conf
  • Dynamically enable performance metrics with the badmin perflog subcommand

 

Configuring performance metrics in lsf.conf

Set the following parameters in the lsf.conf file, then run badmin mbdrestart to make them take effect.

  • LSB_ENABLE_PERF_METRICS_LOG=Y    Enables logging of mbatchd performance metrics. Within a sample period, data that is unlikely to indicate a performance issue is not logged. The performance metric data is logged periodically, at the interval set in LSB_PERF_METRICS_SAMPLE_PERIOD.
  • LSB_PERF_METRICS_LOGDIR=directory    Sets the directory in which mbatchd performance metric data is logged. The primary owner of this directory is the LSF administrator. The default value is LSF_LOGDIR.
  • LSB_PERF_METRICS_SAMPLE_PERIOD=minutes    Sets the sampling period, in minutes, over which mbatchd performance metric data is collected. The default value is 5 minutes.
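As a rough illustration of how these three parameters combine, the following Python sketch reads lsf.conf-style text and reports the effective settings. The helper name read_perf_metric_settings and the example path are hypothetical. Two assumptions go beyond the descriptions above: that logging is off when LSB_ENABLE_PERF_METRICS_LOG is absent, and that lines starting with # are comments.

```python
# Default sampling period, taken from the parameter description above.
DEFAULT_SAMPLE_PERIOD_MIN = 5


def read_perf_metric_settings(conf_text):
    """Return the effective performance-metric settings from lsf.conf text.

    Hypothetical helper for illustration only; it is not part of LSF.
    """
    params = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' carry no settings.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip().strip('"')

    # Assumption: logging is treated as disabled when the parameter is absent.
    enabled = params.get("LSB_ENABLE_PERF_METRICS_LOG", "N").upper() == "Y"
    # The log directory falls back to LSF_LOGDIR, as documented above.
    logdir = params.get("LSB_PERF_METRICS_LOGDIR") or params.get("LSF_LOGDIR")
    period = int(params.get("LSB_PERF_METRICS_SAMPLE_PERIOD",
                            DEFAULT_SAMPLE_PERIOD_MIN))
    return {"enabled": enabled, "logdir": logdir, "sample_period_min": period}


# Example input; the directory is a made-up value.
EXAMPLE_CONF = """\
LSF_LOGDIR=/var/log/lsf
LSB_ENABLE_PERF_METRICS_LOG=Y
LSB_PERF_METRICS_SAMPLE_PERIOD=10
"""

settings = read_perf_metric_settings(EXAMPLE_CONF)
```

With the example text, the log directory resolves to the LSF_LOGDIR value because LSB_PERF_METRICS_LOGDIR is not set.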

 

Enabling performance metrics with badmin perflog

badmin perflog  [-t sample_period] [-d duration] [-f log_file_name] [-o] 

  • -t sample_period    Sampling period, in minutes, for mbatchd performance metric collection. The default value is 5 minutes.
  • -d duration    Duration, in minutes, for which performance metric data is logged. mbatchd stops logging once the duration is over. By default, logging continues indefinitely until you stop it manually, restart mbatchd, or reconfigure mbatchd.
  • -f log_file_name    Name of the log file where performance metric information is saved. Specify either a file name or a full path to a file. If you do not specify a path, the default path is used. The default log file is mbatchd.perflog.host_name under LSF_LOGDIR.
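The options above can be assembled programmatically. The following is a minimal Python sketch that only builds the argument list and does not run the command. The function name build_perflog_command and the example file path are hypothetical; only the -t, -d, and -f options described above are used.

```python
def build_perflog_command(sample_period=None, duration=None, log_file=None):
    """Build an argument list for 'badmin perflog' (illustrative helper).

    sample_period and duration are in minutes. Options left as None are
    omitted so that the documented defaults apply.
    """
    cmd = ["badmin", "perflog"]
    for flag, minutes in (("-t", sample_period), ("-d", duration)):
        if minutes is None:
            continue
        if int(minutes) <= 0:
            raise ValueError(f"{flag} expects a positive number of minutes")
        cmd += [flag, str(int(minutes))]
    if log_file is not None:
        cmd += ["-f", str(log_file)]
    return cmd


# Sample every 5 minutes for one hour, writing to a made-up file path.
command = build_perflog_command(sample_period=5, duration=60,
                                log_file="/tmp/mbatchd.perflog.test")
```

The resulting list could be passed to subprocess.run on a host where badmin is available; that step is left out here because it depends on the cluster environment.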

Once the feature is enabled, mbatchd periodically writes performance data to the performance log file.

Here is an example of performance output:

Oct  7 08:57:11 2011 8036 6 8.0.1 sample period: 300 307
Oct  7 08:57:11 2011 8036 6 8.0.1 job_submission_log_jobfile logJobInfo: 2741 0 104 0 632 0 10 0 10 0 10 0 80
Oct  7 08:57:11 2011 8036 6 8.0.1 job_submission do_submitReq: 3023 0 166 0 2753 0 10 0 480 0 10 0 230
Oct  7 08:57:11 2011 8036 6 8.0.1 job_status_update statusJob: 0 0 0 0 0 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 job_rusage_update rusageJob: 0 0 0 0 0 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 sched_get_new_job doSchedGetJobReq:  59 45 143 93 5 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 sched_get_resource doSchedGetRsrcReq: 59 129 3483 382 22 0 10000 169 10 0 0 0  0
Oct  7 08:57:11 2011 8036 6 8.0.1 job_dispatch_read_jobfile readLogJobInfo: 0 0 0 0 0 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 job_dispatch EM_executeJobCtrlDecsn: 0 0 0 0 0 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 sched_publish_decision doSchedPublishDecision: 0 0 0 0 0 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 sched_publish_pending_reason doSchedPublishReason: 0 0 0 0 0 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 mbd_call_sbd call_server: 0 0 0 0 0 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 mbd_update_load RB_updateLoad: 18 1880 3836 2613 47 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 mbd_query_job fork: 263 353 2596 1206 317 0 0 0 0 0 10000 228 60
Oct  7 08:57:11 2011 8036 6 8.0.1 mbd_clean_job clean: 14 1 3 2 0 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 mbd_event_switch switchEvent: 0 0 0 0 0 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 job_dependency_eval checkJgrpDep: 59 12 2313 0 0 0 0 0 0 0 0 0
Oct  7 08:57:11 2011 8036 6 8.0.1 mbd_channel chanSelect/chanPoll: 2375 3 2012 299484 0  10000 4 10 0 10000 4 10

Each line contains the following fields:

  • Metric category name
  • Function name
  • Count: Total number of calls to the function in this sample period
  • rt_min: Minimum runtime of one call to the function in this sample period
  • rt_max: Maximum runtime of one call to the function in this sample period
  • rt_avg: Average runtime of the calls to the function in this sample period
  • rt_total: Total runtime of all the calls to the function in this sample period
  • ut_min: Minimum user mode CPU-time of one call to the function in this sample period
  • ut_max: Maximum user mode CPU-time of one call to the function in this sample period
  • ut_avg: Average user mode CPU-time of the calls to the function in this sample period
  • ut_total: Total user mode CPU-time of all the calls to the function in this sample period
  • st_min: Minimum system mode CPU-time of one call to the function in this sample period
  • st_max: Maximum system mode CPU-time of one call to the function in this sample period
  • st_avg: Average system mode CPU-time of the calls to the function in this sample period
  • st_total: Total system mode CPU-time of all the calls to the function in this sample period
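To make the field layout concrete, here is a minimal Python sketch that splits one log line into the fields listed above. It assumes the layout seen in the example output: the category and function name are the last two tokens before the first ": ", and the numeric fields follow in the listed order. The name parse_perflog_line is hypothetical. Names are assigned only when exactly 13 numbers are present; other lines, such as the "sample period" header, keep their raw values.

```python
FIELD_NAMES = (
    "count",
    "rt_min", "rt_max", "rt_avg", "rt_total",
    "ut_min", "ut_max", "ut_avg", "ut_total",
    "st_min", "st_max", "st_avg", "st_total",
)


def parse_perflog_line(line):
    """Parse one performance log line into a dict, or return None."""
    # The timestamp contains ':' but never ': ', so the first ': '
    # is the one that follows the function name.
    head, sep, tail = line.partition(": ")
    tokens = head.split()
    if not sep or len(tokens) < 2:
        return None
    record = {
        "category": tokens[-2],
        "function": tokens[-1],
        "values": [int(v) for v in tail.split()],
    }
    # Attach field names only when the count matches the documented list.
    if len(record["values"]) == len(FIELD_NAMES):
        record.update(zip(FIELD_NAMES, record["values"]))
    return record


sample = ("Oct  7 08:57:11 2011 8036 6 8.0.1 job_submission do_submitReq: "
          "3023 0 166 0 2753 0 10 0 480 0 10 0 230")
record = parse_perflog_line(sample)
```

For the sample line, record["count"] is 3023, record["rt_max"] is 166, and record["rt_total"] is 2753.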
     

A record is logged only if it exceeds the internal threshold for its category, as described below.

job_submission        Handles job submission requests. The job_submission item includes the job_submission_log_jobfile item. If the maximum runtime of one call is greater than 50 ms, or the total runtime is longer than 15% of the logging period, mbatchd logs this item.

job_submission_log_jobfile        LSF creates a job file under the LSF working directory at job submission. The job_submission_log_jobfile item is a sub item of job_submission. Typically, administrators put the LSF working directory on a shared file system. If the network or the shared file system has performance issues, a large amount of time is spent on this item. A single job_submission_log_jobfile call should normally take no more than a few ms.

job_status_update  Job status updates from sbatchd. The job_status_update item writes one event into lsb.events. The lsb.events file is kept open. If the maximum runtime of one call is greater than 50 ms, or the total runtime is longer than 15% of the logging period, mbatchd logs this item. In a large high throughput computing cluster, a large number of invocations is normal.

job_rusage_update  Resource usage updates for running jobs from sbatchd. If the maximum runtime of one call is greater than 50 ms, or the total runtime is longer than 15% of the logging period, mbatchd logs this item. In a large high throughput computing cluster, a large number of invocations is normal.

job_dependency_eval          Evaluates job dependency conditions. If the maximum runtime of one call is greater than 2 seconds, mbatchd logs this item.

sched_get_new_job  mbschd requests to get new jobs from mbatchd. If the maximum runtime of one call is greater than 2 seconds, mbatchd logs this item.

sched_get_resource            mbschd requests to get resources from mbatchd. If the maximum runtime of one call is greater than 2 seconds, mbatchd logs this item.

sched_publish_decision    Publish the job scheduling decision from mbschd to mbatchd. The sched_publish_decision item includes the job_dispatch and job_dispatch_read_jobfile sub items. If the maximum runtime of one call is greater than 2 seconds, mbatchd logs this item.

job_dispatch            Dispatches one job. Job dispatch can be slow if there is a network issue. The job_dispatch item includes the job_dispatch_read_jobfile sub item. If the maximum runtime of one call is greater than 2 seconds, mbatchd logs this item.

job_dispatch_read_jobfile          mbatchd reads the job file when dispatching a job. Reading the job file can be slow if the network or the shared file system has performance issues. If the maximum runtime of one call is greater than 50 ms, mbatchd logs this item.

sched_publish_pending_reason    Publish the job pending reason from mbschd to mbatchd. If the maximum runtime of one call is greater than 2 seconds, mbatchd logs this item.

mbd_clean_job          Cleans jobs from mbatchd memory. This operation triggers removal of the job file. If the maximum runtime of one call is greater than 2 seconds, mbatchd logs this item.

mbd_event_switch    lsb.events file switching operation. If the maximum runtime of one call is greater than 2 seconds, mbatchd logs this item.

mbd_update_load      mbatchd calls LIM to update load. If the maximum runtime of one call is greater than 2 seconds, mbatchd logs this item. A reasonable time should be within a few hundred ms.

mbd_query_xxx          Client queries trigger mbatchd to fork, which can be slow if mbatchd has a very large memory footprint. If the maximum runtime of one call is greater than 50 ms, or the total runtime is longer than 15% of the logging period, mbatchd logs this item.

mbd_channel  mbatchd processes requests from the client, sbatchd or scheduler. This is the mbatchd idle time. mbatchd always logs this item.
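The thresholds described above can be summarized in a small lookup table. The following Python sketch is one way to express them and is not how mbatchd implements the check. It assumes that runtimes are given in milliseconds and that the "logging period" is the sample period in seconds. The name passes_threshold is hypothetical. Categories with no stated threshold, such as job_submission_log_jobfile, return None.

```python
TOTAL_FRACTION = 0.15  # the "15% of the logging period" rule

# category -> (max single-call runtime in ms, whether the 15% rule applies)
THRESHOLDS = {
    "job_submission": (50, True),
    "job_status_update": (50, True),
    "job_rusage_update": (50, True),
    "job_dispatch_read_jobfile": (50, False),
    "job_dependency_eval": (2000, False),
    "sched_get_new_job": (2000, False),
    "sched_get_resource": (2000, False),
    "sched_publish_decision": (2000, False),
    "sched_publish_pending_reason": (2000, False),
    "job_dispatch": (2000, False),
    "mbd_clean_job": (2000, False),
    "mbd_event_switch": (2000, False),
    "mbd_update_load": (2000, False),
}


def passes_threshold(category, rt_max_ms, rt_total_ms, period_s):
    """Return True/False per the documented thresholds, or None if none is stated."""
    if category == "mbd_channel":
        return True  # always logged
    if category.startswith("mbd_query_"):
        rule = (50, True)  # covers every mbd_query_xxx category
    else:
        rule = THRESHOLDS.get(category)
    if rule is None:
        return None
    max_ms, use_total = rule
    if rt_max_ms > max_ms:
        return True
    # Assumption: total runtime is compared with 15% of the period in ms.
    return use_total and rt_total_ms > TOTAL_FRACTION * period_s * 1000


# A maximum runtime of 166 ms in a 300-second period exceeds the 50 ms limit.
result = passes_threshold("job_submission", 166, 2753, 300)
```

Keeping the rules in a table makes it easy to compare a parsed record against the documented limit for its category.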