DB2 10.5 for Linux, UNIX, and Windows

Exercise 5: Using histograms for service classes

This exercise demonstrates how to use the COLLECT AGGREGATE ACTIVITY DATA BASE option on a service class to produce histograms of coordinator activity lifetimes, coordinator activity execution times, and coordinator activity queue times.

Estimated time: 25-30 minutes

These three histograms are useful for knowing more than just the average lifetime, execution time, or queue time of the activities run on the system, since they can be used to calculate standard deviations and can reveal outliers. For more information on histograms, see Histograms in workload management.

Histograms are accessed through the statistics event monitor. This exercise reuses the statistics event monitor created in Exercise 1 Step 1.

Additional Information: The statistics event monitor is a write-to-table event monitor and contains logical data groups.  The first is the control logical data group, which every event monitor has, and then there are the logical data groups that are specific to the statistics event monitor type. The specific logical data groups are:

Step 1: Create views for viewing histogram statistics

Create several views to make querying the HISTOGRAMBIN_DB2STATISTICS table easier. The first view lists all of the histogram types available. This exercise reports just the three basic types:lifetime, execution time and queue time.

CREATE VIEW HISTOGRAMTYPES AS
  SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) AS HISTOGRAM_TYPE
  FROM HISTOGRAMBIN_DB2STATISTICS

A second view makes it easier to find out which service classes are having histograms collected for them. The HISTOGRAMBIN_DB2STATISTICS table identifies the service classes for which histograms are being collected using the service class ID. Joining this table with the SERVICECLASSES catalog table permits the service class information to be presented with the service super class name and service subclass name instead of the service class ID.

CREATE VIEW HISTOGRAMSERVICECLASSES AS
  SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) AS HISTOGRAM_TYPE,
         SUBSTR(PARENTSERVICECLASSNAME,1,24) AS SERVICE_SUPERCLASS,
         SUBSTR(SERVICECLASSNAME,1,24) AS SERVICE_SUBCLASS
  FROM HISTOGRAMBIN_DB2STATISTICS AS H,
       SYSCAT.SERVICECLASSES AS S
  WHERE H.SERVICE_CLASS_ID = S.SERVICECLASSID

The third view lists all of the times that a histogram of a given type was collected for a given service class. Such as the histogramserviceclasses view, it also joins the HISTOGRAMBIN_DB2STATISTICS table with the SERVICECLASSES catalog table. The difference is that it includes the STATISTICS_TIMESTAMP column as one of the columns in the view.

CREATE VIEW HISTOGRAMTIMES AS
  SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) AS HISTOGRAM_TYPE,
         SUBSTR(PARENTSERVICECLASSNAME,1,24) AS SERVICE_SUPERCLASS,
         SUBSTR(SERVICECLASSNAME,1,24) AS SERVICE_SUBCLASS,
         STATISTICS_TIMESTAMP AS TIMESTAMP
  FROM HISTOGRAMBIN_DB2STATISTICS AS H,
       SYSCAT.SERVICECLASSES AS S
  WHERE H.SERVICE_CLASS_ID = S.SERVICECLASSID 

The fourth and final view will be used to show the histograms themselves. It also demonstrates something that one often needs to do when dealing with histograms, which is to aggregate them over time. This view shows the top of each bin and the number of activities that were counted towards each bin. For the three histograms in this exercise, the BIN_TOP field measures the number of milliseconds in the activity lifetime, execution time or queue time. When BIN_TOP is, say 3000 milliseconds and the BIN_TOP of the previous bin is 2000 milliseconds and the NUMBER_IN_BIN is ten for a lifetime histogram, you know that ten activities had a lifetime that was between 2 and 3 seconds.

CREATE VIEW HISTOGRAMS(HISTOGRAM_TYPE,
                       SERVICE_SUPERCLASS,
                       SERVICE_SUBCLASS,
                       BIN_TOP,
                       NUMBER_IN_BIN) AS
  SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) AS HISTOGRAM_TYPE,
         SUBSTR(PARENTSERVICECLASSNAME,1,24) AS SERVICE_SUPERCLASS,
         SUBSTR(SERVICECLASSNAME,1,24) AS SERVICE_SUBCLASS,
         TOP AS BIN_TOP,
         SUM(NUMBER_IN_BIN) AS NUMBER_IN_BIN
  FROM HISTOGRAMBIN_DB2STATISTICS AS H,
       SYSCAT.SERVICECLASSES AS S
  WHERE H.SERVICE_CLASS_ID = S.SERVICECLASSID
  GROUP BY HISTOGRAM_TYPE, PARENTSERVICECLASSNAME, SERVICECLASSNAME, TOP

Step 2: Turn on the collection of histograms

The activity lifetime, queue time, and execution time histograms are collected for a service subclass when the base collect aggregate activity data option is enabled for the subclass. Enable the base aggregate activity data collection for the default subclass under the default user super class using the COLLECT AGGREGATE ACTIVITY DATA clause.

Note that all activities will be run in the default user service class since all the user defined workloads were disabled at the end of the previous exercise.

ALTER SERVICE CLASS SYSDEFAULTSUBCLASS
  UNDER SYSDEFAULTUSERCLASS
  COLLECT AGGREGATE ACTIVITY DATA BASE

Step 3: Activate the statistics event monitor

Activate the statistics event monitor that was created earlier so that it may receive the aggregate data whenever it is collected.

SET EVENT MONITOR DB2STATISTICS STATE 1

Step 4: Run activities and send statistics to the statistics event monitor

Now some activities can be run. After the activities have finished, the WLM_COLLECT_STATS stored procedure is called to send the statistics (including the activity lifetime, execution time and queue time histograms for the default user service class) to the active statistics event monitor. These histograms contain data about all activities that executed in the default user service class since aggregate activity statistics were enabled. Calling this stored procedure also resets the statistics. To show changes in database activity over time, three collection intervals are created. In the first interval, run two scripts, work1.db2 and work2.db2, and then collect and reset the statistics.

db2 -o -tvf work1.db2
db2 -o -tvf work2.db2

CONNECT TO SAMPLE

CALL WLM_COLLECT_STATS()

In the second interval, only run the work1.db2 script once and then collect and reset the statistics.

db2 -o -tvf work1.db2

CONNECT TO SAMPLE

CALL WLM_COLLECT_STATS()

In the third interval, run work1.db2 twice and run work2.db2 script once and then collect and reset the statistics.

db2 -o -tvf work1.db2
db2 -o -tvf work2.db2
db2 -o -tvf work1.db2

CONNECT TO SAMPLE

CALL WLM_COLLECT_STATS()

Collecting data periodically such as this permits you to watch how work on your system changes over time.

Additional Information: Collecting data periodically does not need to be a manual operation. Using the WLM_COLLECT_INT database configuration parameter, one can set the interval in minutes after which statistics collection and reset automatically occurs.

Step 5: Query views to view statistics

Now that statistics have been collected, the views created earlier can be used to look at the statistics. The HISTOGRAMTYPES view just returns the types of histograms available.

SELECT * FROM HISTOGRAMTYPES

HISTOGRAM_TYPE
------------------------
CoordActExecTime
CoordActLifetime
CoordActQueueTime

  3 record(s) selected.

Since the BASE option was used when altering the service class, there are three histograms: lifetime, exectime and queuetime. The HISTOGRAMSERVICECLASSES view permits you to see the service classes for which a histogram was collected. The following example restricts the output to that of the CoordActLifetime histogram only. Since aggregate activity collection was only turned on for the default user service class's default subclass, only that class is shown when selecting from the HISTOGRAMSERVICECLASSES view.

SELECT * FROM HISTOGRAMSERVICECLASSES
  WHERE HISTOGRAM_TYPE = 'CoordActLifetime'
  ORDER BY SERVICE_SUPERCLASS, SERVICE_SUBCLASS

HISTOGRAM_TYPE           SERVICE_SUPERCLASS       SERVICE_SUBCLASS
------------------------ ------------------------ -----------------------
-
CoordActLifetime         SYSDEFAULTUSERCLASS      SYSDEFAULTSUBCLASS

  1 record(s) selected.

The HISTOGRAMTIMES view shows the times when histograms were collected. Since the WLM_COLLECT_STATS procedure was run three times, there are three timestamps for the lifetime histogram shown.

SELECT * FROM HISTOGRAMTIMES
  WHERE HISTOGRAM_TYPE = 'CoordActLifetime'
    AND SERVICE_SUPERCLASS = 'SYSDEFAULTUSERCLASS'
    AND SERVICE_SUBCLASS = 'SYSDEFAULTSUBCLASS'
  ORDER BY TIMESTAMP

HISTOGRAM_TYPE   SERVICE_SUPERCLASS  SERVICE_SUBCLASS   TIMESTAMP
---------------- ------------------- ------------------ -----------------
---------
CoordActLifetime SYSDEFAULTUSERCLASS SYSDEFAULTSUBCLASS 2007-08-05-
20.44.51.519380
CoordActLifetime SYSDEFAULTUSERCLASS SYSDEFAULTSUBCLASS 2007-08-05-
21.04.27.131281
CoordActLifetime SYSDEFAULTUSERCLASS SYSDEFAULTSUBCLASS 2007-08-05-
21.08.27.474168 

  3 record(s) selected.

The last view, HISTOGRAMS, is for looking at the histograms themselves. Unlike the HISTOGRAMTIMES view that lists each collection interval as its own row, this view aggregates histogram data across multiple intervals to produce a single histogram of a given type for a given service class.

SELECT BIN_TOP, NUMBER_IN_BIN FROM HISTOGRAMS
  WHERE HISTOGRAM_TYPE = 'CoordActLifetime'
    AND SERVICE_SUPERCLASS = 'SYSDEFAULTUSERCLASS'
    AND SERVICE_SUBCLASS = 'SYSDEFAULTSUBCLASS'
  ORDER BY BIN_TOP

BIN_TOP              NUMBER_IN_BIN
-------------------- --------------------
                  -1                    0
                   1                   88
                   2                    0
                   3                    0
                   5                    2
                   8                    6
                  12                    7
                  19                    8
                  29                    12
                  44                    13
                  68                    23
                 103                    11
                 158                    2
                 241                    5
                 369                    0
                 562                    0
                 858                    0
                1309                    0
                1997                    0
                3046                    0
                4647                    0
                7089                    0
               10813                    0
               16493                    0
               25157                    0
               38373                    0
               58532                    0
               89280                    0
              136181                    0
              207720                    0
              316840                    0
              483283                    0
              737162                    0
             1124409                    0
             1715085                    0
             2616055                    0
             3990325                    0
             6086529                    0
             9283913                    0
            14160950                    0
            21600000                    0

  41 record(s) selected. 

The output from the histograms can then be used as input into a graphing tool to generate a graph. The following diagram shows a graph that was created using a Ruby Graphing Library called Gruff Graphs.

Lifetime histogram for SYSDEFAULTUSERCLASS (CoordActLifetime):

Lifetime histogram for SYSDEFAULTUSERCLASS (CoordActLifetime)

Running the query of this view, should produce output that will not be exactly the same as what is shown in the preceding lifetime histogram graph since activity lifetimes depend on the performance of the system. In the previous output, there are 41 bins and all of the largest bins are empty. At the top, there is a bin whose BIN_TOP is -1. This bin represents all of those activities whose lifetime was too large to fit in the histogram. Seeing a NUMBER_OF_BIN greater than zero when the BIN_TOP is -1 indicates that you should probably increase the high bin value of your histogram. In the previous output, the NUMBER_IN_BIN is 0, so there is no need to make such a change. A large number of activities, 88 in this case, were counted in the bin with a BIN_TOP of 1. This is the lowest bin and it means that 88 activities had a lifetime between 0 and 1 milliseconds. Another piece of information that can be extracted from the histogram is that, since the largest BIN_TOP for which there is a corresponding non-zero NUMBER_IN_BIN is 241, the largest lifetime of any activity in the workloads collected in this histogram was between 158 milliseconds and 241 milliseconds. The COORD_ACT_LIFETIME_TOP column in the SCSTATS_DB2STATISTICS table gives a more precise measurement of the lifetime of the activity with the largest lifetime.

The same query can be repeated with a histogram_type of CoordActExecTime instead of CoordActLifetime. The execution time histogram is expected to be similar but not identical to the lifetime histogram. The reason they are different, even when there is no queuing, is that execution time does not include initialization time or cursor idle time, while lifetime does.

SELECT BIN_TOP, NUMBER_IN_BIN FROM HISTOGRAMS
  WHERE HISTOGRAM_TYPE = 'CoordActExecTime'
    AND SERVICE_SUPERCLASS = 'SYSDEFAULTUSERCLASS'
    AND SERVICE_SUBCLASS = 'SYSDEFAULTSUBCLASS'
  ORDER BY BIN_TOP

BIN_TOP              NUMBER_IN_BIN
-------------------- --------------------
                  -1                    0
                   1                  112
                   2                    0
                   3                    0
                   5                    0
                   8                    5
                  12                    7
                  19                    5
                  29                   12
                  44                    7
                  68                   11
                 103                   11
                 158                    2
                 241                    5
                 369                    0
                 562                    0
                 858                    0
                1309                    0
                1997                    0
                3046                    0
                4647                    0
                7089                    0
               10813                    0
               16493                    0
               25157                    0
               38373                    0
               58532                    0
               89280                    0
              136181                    0
              207720                    0
              316840                    0
              483283                    0
              737162                    0
             1124409                    0
             1715085                    0
             2616055                    0
             3990325                    0
             6086529                    0
             9283913                    0
            14160950                    0
            21600000                    0

  41 record(s) selected.

Execution time histogram for SYSDEFAULTUSERCLASS (CoordActExecTime):

Execution time histogram for SYSDEFAULTUSERCLASS (CoordActExecTime)
Once again, a large number of activities are counted in the first bin and the highest execution time of any activity is at most 241 milliseconds.

Finally, the HISTOGRAMS view will be used to look at the CoordActQueueTime histogram. This is the simplest histogram because there is no queuing, since no queuing thresholds were created or enabled in this exercise.

SELECT BIN_TOP, NUMBER_IN_BIN FROM HISTOGRAMS
  WHERE HISTOGRAM_TYPE = 'CoordActQueueTime'
    AND SERVICE_SUPERCLASS = 'SYSDEFAULTUSERCLASS'
    AND SERVICE_SUBCLASS = 'SYSDEFAULTSUBCLASS'
  ORDER BY BIN_TOP 

BIN_TOP              NUMBER_IN_BIN
-------------------- --------------------
                  -1                    0
                   1                  177
                   2                    0
                   3                    0
                   5                    0
                   8                    0
                  12                    0
                  19                    0
                  29                    0
                  44                    0
                  68                    0
                 103                    0
                 158                    0
                 241                    0
                 369                    0
                 562                    0
                 858                    0
                1309                    0
                1997                    0
                3046                    0
                4647                    0
                7089                    0
               10813                    0
               16493                    0
               25157                    0
               38373                    0
               58532                    0
               89280                    0
              136181                    0
              207720                    0
              316840                    0
              483283                    0
              737162                    0
             1124409                    0
             1715085                    0
             2616055                    0
             3990325                    0
             6086529                    0
             9283913                    0
            14160950                    0
            21600000                    0

  41 record(s) selected.

Queue time histogram for SYSDEFAULTUSERCLASS (CoordActQueueTime):

Queue time histogram for SYSDEFAULTUSERCLASS (CoordActQueueTime)
Every activity was counted in the 0 to 1 millisecond bin because every activity spent zero milliseconds queuing.

The last several queries looked at activity lifetimes, execution times and queue times broken down into bins but aggregated across multiple intervals. The following query presents the same information from a different perspective. It shows averages instead of histograms and, rather than combining the intervals, it shows each interval individually. It also reports a count of the number of completed activities which shows how many activities completed in each interval. It uses the SCSTATS_DB2STATISTICS table instead of the HISTOGRAMBIN_DB2STATISTICS table.

SELECT STATISTICS_TIMESTAMP,
       COORD_ACT_LIFETIME_AVG AS LIFETIMEAVG,
       COORD_ACT_EXEC_TIME_AVG AS EXECTIMEAVG,
       COORD_ACT_QUEUE_TIME_AVG AS QUEUETIMEAVG,
       COORD_ACT_COMPLETED_TOTAL AS COMPLETED_TOTAL
FROM SCSTATS_DB2STATISTICS
WHERE SERVICE_SUPERCLASS_NAME = 'SYSDEFAULTUSERCLASS'
  AND SERVICE_SUBCLASS_NAME = 'SYSDEFAULTSUBCLASS'
ORDER BY STATISTICS_TIMESTAMP

STATISTICS_TIMESTAMP       LIFETIMEAVG EXECTIMEAVG QUEUETIMEAVG
COMPLETED_TOTAL
-------------------------- ----------- ----------- ------------ ---------
------
2007-08-07-14.07.44.511153         508         475            0             
77
2007-08-07-14.07.46.537777         513         508            0             
39
2007-08-07-14.07.51.882173         314         253            0            
113

  3 record(s) selected. 

The result shows that average lifetimes are slightly higher than average execution times for each interval and all three are just over a half a second or less. The average queue time, as expected, is zero. The counts of the number of completed activities in each interval is as expected because workloads 1 and 2 were run in the first interval which resulted in 77 activities collected, workload 1 ran alone in the second interval which resulted in 39 activities, and workload 1 ran twice and workload 2 ran once in the third interval, which resulted in 113 activities.

Step 6: Reset for the next exercise

The final step is to turn off collection of aggregate activities on the default user service class and drop the views and delete the information in the statistics tables.

ALTER SERVICE CLASS SYSDEFAULTSUBCLASS
  UNDER SYSDEFAULTUSERCLASS
  COLLECT AGGREGATE ACTIVITY DATA NONE

DROP VIEW histograms
DROP VIEW histogramtimes
DROP VIEW histogramserviceclasses
DROP VIEW histogramtypes

SET EVENT MONITOR DB2STATISTICS STATE 0

DELETE FROM HISTOGRAMBIN_DB2STATISTICS
DELETE FROM SCSTATS_DB2STATISTICS