Question & Answer
Question
What does the 'GRID - Cluster/Host Load Average' graph show?
Answer
GRID - Cluster/Host Load Average
This graph shows measured LSF load averages as collected from the Platform LSF LIMs running on each monitored batch host. The graph reports three load averages (15 seconds, 1 minute and 15 minutes).
These Platform LSF load indices are actually reporting the average run-queue length over the measured period, so busy systems will show higher values.
Load indices collected by LIM:
| Index | Measures | Units | Direction | Averaged over | Update Interval |
| r15s | run queue length | processes | Increasing | 15 seconds | 15 seconds |
| r1m | run queue length | processes | Increasing | 1 minute | 15 seconds |
| r15m | run queue length | processes | Increasing | 15 minutes | 15 seconds |
Because Platform RTM averages these reported results again, this graph is an “average of averages”. There is a graph for every LSF batch host and for each tracked metric.
/usr/bin/rrdtool graph - \
--title='rbplsf913_Summary - GRID Load Average' \
DEF:a='/opt/IBM/cacti/rra/rbplsf913_summary_r1m_7.rrd':'r15s':AVERAGE \
DEF:b='/opt/IBM/cacti/rra/rbplsf913_summary_r1m_7.rrd':'r1m':AVERAGE \
DEF:c='/opt/IBM/cacti/rra/rbplsf913_summary_r1m_7.rrd':'r15m':AVERAGE \
The attached graph shows how the load averages vary over time and how the 15-second, 1-minute and 15-minute averages differ. When interpreting this data, keep in mind that Platform RTM samples these metrics only at the polling interval, so short-lived changes between polls may not appear in the graph.
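The "average of averages" effect can be illustrated with a minimal Python sketch. It is not RTM or rrdtool code: the function name `consolidate`, the sample values, and the bucket size of three polls are all assumptions made for illustration. It only shows how averaging already-averaged readings smooths out a short spike.

```python
from statistics import mean


def consolidate(samples, per_bucket):
    """Average consecutive groups of polled values.

    Hypothetical helper: similar in spirit to an AVERAGE consolidation,
    not the actual RTM or rrdtool implementation.
    """
    return [
        mean(samples[i:i + per_bucket])
        for i in range(0, len(samples), per_bucket)
    ]


# Invented r1m readings, one per polling cycle. Each reading is already
# a 1-minute average reported by LIM.
r1m_polled = [0.5, 1.0, 6.0, 0.5, 0.25, 0.75]

# Each graphed point averages three polled values, so the 6.0 spike
# is flattened in the plotted series.
graphed = consolidate(r1m_polled, 3)
print(graphed)
```

In this sketch the single reading of 6.0 contributes to a plotted value well below 6.0, which is why peaks on the graph can look lower than what `lsload` showed at the moment of the spike.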

One RTM setting controls which form of these values is collected: Console > Grid Settings > Poller > CPU Run Queue Length Load Indices Type, which can be DEFAULT, EFFECTIVE or NORMALIZED.
For Effective, LSF scales the run queue value on multiprocessor systems to make the CPU load of uniprocessors and multiprocessors comparable (lsload -E). For Normalized, LSF also adjusts the CPU run queue based on the relative speeds of the processors (CPU Factor, lsload -N). The default is raw data (lsload -l).
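To make the difference between the three types concrete, here is a rough Python sketch. The formulas are a simplifying assumption (divide by the processor count for Effective, and additionally by the CPU factor for Normalized); they are not taken from LSF and may not match how LIM actually computes these values. The function names and the example numbers are hypothetical.

```python
def effective_run_queue(raw, ncpus):
    """Assumed model: scale the raw run queue by the processor count."""
    return raw / ncpus


def normalized_run_queue(raw, ncpus, cpu_factor):
    """Assumed model: also scale by the host's relative CPU speed."""
    return raw / (ncpus * cpu_factor)


# Invented example: a 4-processor host with CPU factor 2.0 and a raw
# run queue length of 8 processes.
raw = 8.0
print("raw:       ", raw)
print("effective: ", effective_run_queue(raw, ncpus=4))
print("normalized:", normalized_run_queue(raw, ncpus=4, cpu_factor=2.0))
```

Under these assumptions, the same host plots very different values depending on the selected type, so graphs are only comparable across hosts or clusters when the setting is the same.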
Document Information
Modified date:
17 June 2018
UID
isg3T1026279