Monitoring job throughput with benchmark jobs
You can configure a benchmark job to measure job throughput in your cluster. A benchmark job is any job that you configure to submit to the LSF cluster at a specific interval. You can then view graphs that show how time values for the job, such as the communication time between RTM and LSF and how long the job took to start, change over time.
Configure a benchmark job to monitor job throughput
About this task
Procedure
Using the Benchmark Stats graph to analyze benchmark results
After your benchmark job has run, you can access a graphical view of the results by clicking the Graph icon in the Actions column.
When the Benchmark Stats graph is displayed, click inside the graph to show daily, weekly, monthly, and yearly averages for the benchmark job.
The time values displayed for a benchmark job can indicate how fast your cluster is processing jobs that are sent by clients and can highlight issues in your cluster.
The following table illustrates the time values used for the calculations of durations that are displayed in the graph.
Time Duration Displayed on Graph | Description | Use the value to identify |
---|---|---|
Submit Time | The average time that it took for the job to be submitted from RTM to LSF. The value is calculated as an average for all jobs in the time period. | Communication duration from a client to LSF for job submission. |
Seen Time | The average time from when RTM submitted the job until RTM recognized the job as submitted. The value is calculated as an average for all jobs in the time period. | Full duration of communication from a client to LSF and from LSF to a client for job submission. |
Start Time | The average time that it took for the job to start, from the time it was submitted by RTM to the time it was recognized by RTM as started in LSF. The value is calculated as an average for all jobs in the time period. | How long the job took to start, from the time it was submitted until the client identified it as started. |
Run Time | The average time that it took for a job to run to completion. This value is the actual LSF run time. | Actual time the job took to run. |
Done Time | The average time that it took for the job to finish, from the moment it was submitted from RTM until it finished in LSF. The value is calculated as an average for all jobs in the time period. | Duration of time from job submission from the client until the job finished in LSF. |
Seen Done Time | The average time that it took for the job to finish, from the moment it was submitted from RTM until the moment it was identified by RTM as finished in LSF. The value is calculated as an average for all jobs in the time period. | Full duration from job submission from the client until the job is recognized as finished by the client. Use Seen Done Time - Done Time to evaluate the communication duration between LSF and RTM after the job finished. |
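The following Python sketch shows one way that averages like the ones in the table could be derived from per-job time stamps. It is an illustration only, not RTM's implementation: the `BenchmarkRun` class, its field names, and the exact pairing of time stamps for each duration are assumptions made for this example, since the table describes each duration only in words.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class BenchmarkRun:
    """Time stamps (epoch seconds) for one benchmark job run.

    All field names are hypothetical; they are not RTM column names.
    """
    submit_begin: float    # RTM begins submitting the job
    submit_ack: float      # the submission to LSF completes
    seen_submitted: float  # RTM recognizes the job as submitted
    seen_started: float    # RTM recognizes the job as started in LSF
    lsf_start: float       # job start time recorded by LSF
    lsf_end: float         # job finish time recorded by LSF
    seen_done: float       # RTM recognizes the job as finished


def average_durations(runs):
    """Average each duration over all runs in the chosen time period."""
    return {
        "Submit Time": mean(r.submit_ack - r.submit_begin for r in runs),
        "Seen Time": mean(r.seen_submitted - r.submit_begin for r in runs),
        "Start Time": mean(r.seen_started - r.submit_begin for r in runs),
        "Run Time": mean(r.lsf_end - r.lsf_start for r in runs),
        "Done Time": mean(r.lsf_end - r.submit_begin for r in runs),
        "Seen Done Time": mean(r.seen_done - r.submit_begin for r in runs),
    }


def post_finish_lag(averages):
    """Seen Done Time - Done Time: LSF-to-RTM delay after the job finished."""
    return averages["Seen Done Time"] - averages["Done Time"]


# Two made-up runs, used only to exercise the functions.
sample_runs = [
    BenchmarkRun(0, 1, 3, 10, 8, 20, 24),
    BenchmarkRun(100, 103, 105, 114, 110, 130, 132),
]
sample_averages = average_durations(sample_runs)
sample_lag = post_finish_lag(sample_averages)
```

A large gap between Seen Done Time and Done Time in a calculation like this points at the communication between LSF and RTM rather than at the job itself, which is the comparison that the last table row suggests.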
Viewing benchmark job results over a specific time period
About this task
Procedure
Viewing benchmark jobs that exceed thresholds
About this task
Procedure
- Look at the Benchmark Jobs Exceptions section.
- Click the value in the Benchmark Name column to view more details about the benchmark job and the Benchmark Submission Stats graph.