Monitoring service class load

When work is partitioned into different service classes, it can be helpful to monitor the activity load over time to determine which service classes are busy, and which are being impacted by queuing due to lack of available resources.

The following monitor elements can be used to observe activity load in service classes over time:
ADM_RUNNING_ACT_LOAD
The load of activities that are controlled by the adaptive workload manager admission control. The load is computed as the total runtime of activities controlled by the adaptive workload manager divided by the interval of measurement (statistics interval). For example, if N activities controlled by the adaptive workload manager were executing constantly throughout the interval, the load would be N.
ADM_QUEUED_ACT_LOAD
The load of activities queued by the adaptive workload manager admission control. The load is computed as the total activity queue time divided by the interval of measurement (statistics interval). For example, if N activities were queued by the adaptive workload manager throughout the interval, the load would be N.
ADM_BYPASSED_ACT_LOAD
The load of activities that bypass the adaptive workload manager admission control. The load is computed as the total runtime of activities that bypassed the adaptive workload manager divided by the interval of measurement (statistics interval). For example, if N activities that bypassed the adaptive workload manager were executing constantly throughout the interval, the load would be N.
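
The load definitions above share one formula: total activity time within the interval divided by the interval length. The following Python sketch illustrates that arithmetic only. The `Activity` type, the `activity_load` function, and the clipping of activities that start or end outside the interval are assumptions made for illustration; they are not part of any Db2 interface, and the database manager's internal accounting might differ.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """Hypothetical record of when an activity ran (or waited), in seconds."""
    start: float
    end: float


def activity_load(activities, interval_start, interval_end):
    """Return total activity time inside the interval divided by its length."""
    length = interval_end - interval_start
    if length <= 0:
        raise ValueError("interval must have positive length")
    total = 0.0
    for act in activities:
        # Assumption: only the part of the activity inside the interval counts.
        overlap = min(act.end, interval_end) - max(act.start, interval_start)
        if overlap > 0:
            total += overlap
    return total / length


# Three activities executing constantly through a 60-second interval: load 3.0
print(activity_load([Activity(0, 60), Activity(0, 60), Activity(0, 60)], 0, 60))
# Two activities that each cover half of the interval: load 1.0
print(activity_load([Activity(0, 30), Activity(30, 60)], 0, 60))
```

The same function applies to queue time: passing the periods that activities spent queued instead of running yields the queued load.
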
These monitor elements are reported by the service class statistics interfaces, which include the following:
  • Statistics event monitor (superclassstats and scstats logical data groups)
  • MON_GET_SERVICE_SUBCLASS_STATS table function
  • MON_GET_SERVICE_SUPERCLASS_STATS table function

Use these monitor elements to determine in which service classes queries are running and where queuing is occurring. These metrics can help explain why a service class is not achieving its resource entitlement. For example, if a service class is entitled to 50% of the resources but is using only 30%, the reason might be that there is not enough work running in the service class to use the full 50% entitlement. The load metrics show whether work is running in a service class over time and whether any queuing is occurring; queuing indicates that work in the service class could make use of additional resources. The bypass metrics show whether the work running in the service class is controlled by the adaptive workload manager. If necessary, work that bypasses the adaptive workload manager can be controlled by other means (for example, by a CPU limit).
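
The interpretation described above can be sketched as a small helper that takes the three load values for one statistics interval. The function name `summarize_load` and the keys of the returned dictionary are hypothetical and chosen only for illustration. The sketch uses no thresholds: it reports only whether work ran, whether any queuing occurred, and what share of the executing load bypassed admission control.

```python
def summarize_load(running, queued, bypassed):
    """Summarize one interval's load values for a service class.

    running, queued, bypassed correspond to ADM_RUNNING_ACT_LOAD,
    ADM_QUEUED_ACT_LOAD and ADM_BYPASSED_ACT_LOAD for the interval.
    """
    executing = running + bypassed
    return {
        # Was any work executing in the service class during the interval?
        "has_running_work": executing > 0,
        # Queuing suggests the class could make use of additional resources.
        "is_queuing": queued > 0,
        # Fraction of executing load not controlled by admission control.
        "bypass_share": bypassed / executing if executing > 0 else 0.0,
    }


summary = summarize_load(running=3.0, queued=0.0, bypassed=1.0)
print(summary)
```

In this hypothetical interval, work is running and nothing is queued, so a service class that falls short of its entitlement is more likely short of work than short of resources. A quarter of the executing load bypasses the adaptive workload manager.
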

Example 1: Extract and graph the running and queued load metrics over time from a statistics event monitor for a service superclass named SSCU1_1.
-- LOADMAX: peak combined load in the time window, used to scale each bar to 40 characters
WITH LOADMAX (LOADMAX) AS (
  SELECT MAX(ADM_RUNNING_ACT_LOAD + ADM_QUEUED_ACT_LOAD + ADM_BYPASSED_ACT_LOAD)
  FROM SUPERCLASSSTATS_EVMONSTATISTICSU1
  WHERE STATISTICS_TIMESTAMP > '2019-05-01-14.50.56.299277' AND
        STATISTICS_TIMESTAMP < '2019-05-01-17.50.38.19156' AND
        SERVICE_SUPERCLASS_NAME = 'SSCU1_1')
-- One row per statistics timestamp: 'R' for running load, '-' for queued load
SELECT STATISTICS_TIMESTAMP,
       SUBSTR(CONCAT(REPEAT('R', CAST((MAX(ADM_RUNNING_ACT_LOAD) * 40 / LOADMAX) AS INTEGER)),
                     REPEAT('-', CAST((MAX(ADM_QUEUED_ACT_LOAD) * 40 / LOADMAX) AS INTEGER))), 1, 40) AS LOAD_GRAPH
FROM SUPERCLASSSTATS_EVMONSTATISTICSU1, LOADMAX
WHERE STATISTICS_TIMESTAMP > '2019-05-01-14.50.56.299277' AND
      STATISTICS_TIMESTAMP < '2019-05-01-17.50.38.19156' AND
      SERVICE_SUPERCLASS_NAME = 'SSCU1_1'
GROUP BY STATISTICS_TIMESTAMP, SERVICE_SUPERCLASS_NAME, LOADMAX
ORDER BY STATISTICS_TIMESTAMP ASC
In the following output, the executing load is denoted in the graph with the letter R, and the queued load is denoted with a dash (-). The graph shows that the service superclass SSCU1_1 is busy throughout the monitored time interval, with activities both executing and queuing. The service class might be able to run additional queries if it had a higher resource entitlement.
STATISTICS_TIMESTAMP       LOAD_GRAPH                              
-------------------------- ----------------------------------------
..
2019-05-01-17.07.58.569627 RRRRRRRRRRRRRRRRRRRRR------------------ 
2019-05-01-17.08.33.617516 RRRRRRRRRRRRRRRRRRRRRRR---------------- 
2019-05-01-17.09.08.130650 RRRRRRRRRRRRRRRRRRRRRR----------------  
2019-05-01-17.09.42.748035 RRRRRRRRRRRRRRRRRRR-------------------- 
2019-05-01-17.10.17.868458 RRRRRRRRRRRRRRRRRR--------------------- 
2019-05-01-17.10.51.869760 RRRRRRRRRRRRRRRRRRR-------------------- 
2019-05-01-17.11.26.568063 RRRRRRRRRRRRRRRRRR--------------------- 
2019-05-01-17.12.01.457030 RRRRRRRRRRRRRRRR----------------------- 
2019-05-01-17.12.37.545589 RRRRRRRRRRRRRRRRRRRRR------------------ 
2019-05-01-17.13.12.487246 RRRRRRRRRRRRRRRRR---------------------- 
2019-05-01-17.13.46.452150 RRRRRRRRRRRRRRRR----------------------- 
2019-05-01-17.14.20.848864 RRRRRRRRRRRRRRRRRRRRR------------------ 
2019-05-01-17.14.55.600267 RRRRRRRRRRRRRRRRRR--------------------- 
2019-05-01-17.15.30.287768 RRRRRRRRRRRRR-------------------------- 
2019-05-01-17.16.06.004835 RRRRRRRRRRRRRRRRRRRRR------------------ 
2019-05-01-17.16.41.061820 RRRRRRRRRRRRRRR------------------------
…
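
The scaling that the query in Example 1 applies to build each LOAD_GRAPH value can be sketched in Python as follows. The function name `load_graph` and the input values are hypothetical. The sketch mirrors the REPEAT, CAST, and SUBSTR logic of the query: each load is scaled against the peak combined load, truncated to a whole number of characters, and the result is capped at 40 characters. The guard for a zero peak load is an addition that the query itself does not have.

```python
def load_graph(running, queued, load_max, width=40):
    """Build a text bar: 'R' for running load, '-' for queued load."""
    if load_max <= 0:
        # Assumption: return an empty bar rather than divide by zero.
        return ""
    # int() truncates toward zero, like casting a positive value to INTEGER.
    r_chars = int(running * width / load_max)
    q_chars = int(queued * width / load_max)
    # Cap the bar at the chosen width, as SUBSTR(..., 1, 40) does in the query.
    return ("R" * r_chars + "-" * q_chars)[:width]


# Hypothetical interval: running load 21, queued load 18, peak combined load 40
bar = load_graph(21.0, 18.0, 40.0)
print(bar)
print(len(bar))
```

Because the peak value in the query also includes the bypassed load, a bar can be shorter than 40 characters even in the busiest interval.
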