DB2 10.5 for Linux, UNIX, and Windows

Exercise 11: Using extended aggregates for service classes

This exercise demonstrates how to use the COLLECT AGGREGATE ACTIVITY DATA EXTENDED option on a service class to produce histograms of coordinator activity inter-arrival times and estimated costs.

Estimated time: 25-30 minutes

The inter-arrival time is the time interval between the arrival of one activity into the system and the arrival of the next activity. The estimated cost of an activity represents an estimate by the SQL compiler of the system resources that will be used during the execution of the activity, and only applies to DML activities.

Inter-arrival time histograms or estimated cost histograms can be correlated with one another, with lifetime histograms, or with other lifetime statistics to determine whether changes in lifetime histograms or lifetime statistics could be due to one of the following events:
  • A change in the complexity of the workload, such as a change in the distribution of the estimated costs
  • A change in the activity arrival rate, determined from the inter-arrival time distribution
  • A change in the system itself, such as the introduction of a new threshold, a change in the priority given to a service class, or a change in hardware

For more information on histograms, see Histograms in workload management.

Histograms are accessed through the statistics event monitor.  This exercise reuses the statistics event monitor created in Exercise 1 Step 1.

Step 1: Create views for viewing histogram statistics

Create several views to make querying the HISTOGRAMBIN_DB2STATISTICS table easier. The first view lists all of the histogram types available. In this exercise, it reports just the three basic types: lifetime, execution time and queue time.

CREATE VIEW HISTOGRAMTYPES AS
  SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) HISTOGRAM_TYPE
  FROM HISTOGRAMBIN_DB2STATISTICS

A second view makes it easier to find out what service classes are having histograms collected for them. The HISTOGRAMBIN_DB2STATISTICS table reports the service class for which the histogram is being collected by giving the service class ID. Joining this table with the SERVICECLASSES catalog table will permit the service class information to be presented with the service super class name and service subclass name instead of the service class ID.

CREATE VIEW HISTOGRAMSERVICECLASSES AS
  SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) HISTOGRAM_TYPE,
         SUBSTR(PARENTSERVICECLASSNAME,1,24) SERVICE_SUPERCLASS,
         SUBSTR(SERVICECLASSNAME,1,24) SERVICE_SUBCLASS
  FROM HISTOGRAMBIN_DB2STATISTICS H,
       SYSCAT.SERVICECLASSES S
  WHERE H.SERVICE_CLASS_ID = S.SERVICECLASSID

The third view lists all of the times that a histogram of a given type was collected for a given service class. Such as the HISTOGRAMSERVICECLASSES view, it joins the HISTOGRAMBIN_DB2STATISTICS table with the SERVICECLASSES catalog table. The difference is in the STATISTICS_TIMESTAMP column which is included as one of the columns in this view.

CREATE VIEW HISTOGRAMTIMES AS
  SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) HISTOGRAM_TYPE,
         SUBSTR(PARENTSERVICECLASSNAME,1,24) SERVICE_SUPERCLASS,
         SUBSTR(SERVICECLASSNAME,1,24) SERVICE_SUBCLASS,
         STATISTICS_TIMESTAMP TIMESTAMP
  FROM HISTOGRAMBIN_DB2STATISTICS H,
       SYSCAT.SERVICECLASSES S
  WHERE H.SERVICE_CLASS_ID = S.SERVICECLASSID

The fourth and final view will be used to show the histograms themselves. It also demonstrates a common task when dealing with histograms, which is to aggregate them over time. This view shows the top of each bin and the number of activities that were counted towards each bin. Of the two following histograms, the BIN_TOP field measures the number of milliseconds in the activity inter-arrival time and the number of timerons in the estimated cost. When BIN_TOP is, 3000 milliseconds and the BIN_TOP of the previous bin is 2000 milliseconds and the NUMBER_IN_BIN is ten for an inter-arrival time histogram you know that there were ten activities which each arrived into the system between 2 and 3 seconds after the arrival of the previous activity, for example.

CREATE VIEW HISTOGRAMS(HISTOGRAM_TYPE, SERVICE_SUPERCLASS,
  SERVICE_SUBCLASS, BIN_TOP, NUMBER_IN_BIN) AS
  SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) HISTOGRAM_TYPE,
         SUBSTR(PARENTSERVICECLASSNAME,1,24) SERVICE_SUPERCLASS,
         SUBSTR(SERVICECLASSNAME,1,24) SERVICE_SUBCLASS,
         TOP AS BIN_TOP,
         SUM(NUMBER_IN_BIN) AS NUMBER_IN_BIN
  FROM HISTOGRAMBIN_DB2STATISTICS H,
       SYSCAT.SERVICECLASSES S
  WHERE H.SERVICE_CLASS_ID = S.SERVICECLASSID
  GROUP BY HISTOGRAM_TYPE, PARENTSERVICECLASSNAME, SERVICECLASSNAME, TOP

Step 2: Turn on collection of histograms

Turning on the collection of histograms is done for the default user service class by altering its default subclass to collect aggregate activity data with the EXTENDED option. This provides both the three histograms available in the BASE option (lifetime, execution time, and queue time) and the two histograms available only when using the EXTENDED option (inter-arrival time and estimated cost).

ALTER SERVICE CLASS SYSDEFAULTSUBCLASS UNDER SYSDEFAULTUSERCLASS
   COLLECT AGGREGATE ACTIVITY DATA EXTENDED 

Step 3: Activate statistics event monitor

If not already active, activate the event monitor that was created earlier so that it can receive aggregate data whenever it is collected.

SET EVENT MONITOR DB2STATISTICS STATE 1

Step 4: Run activities and send statistics to statistics event monitor

First run some activities; after the activities have finished the WLM_COLLECT_STATS stored procedure is called to send the service class statistics to the active statistics event monitor (including the activity lifetime, execution time, queue time, inter-arrival time and estimated cost histograms for the default user service class). These histograms contain data about all activities that executed in the default user service class since aggregate activity statistics were enabled. Calling the stored procedure also resets the statistics. To show changes in database activity over time, three collection intervals are created.

In the first interval, run two scripts, work1.db2 and work2.db2, then collect and reset the statistics.

db2 -o- -tvf work1.db2
db2 -o- -tvf work2.db2

CONNECT TO SAMPLE

CALL WLM_COLLECT_STATS

In the second interval, run only the work1.db2 script once, then collect and reset the statistics.

db2 -o- -tvf work1.db2

CONNECT TO SAMPLE

CALL WLM_COLLECT_STATS

In the third interval, run the work1.db2 script twice and the work2.db2 script once, then collect and reset the statistics.

db2 -o- -tvf work1.db2
db2 -o- -tvf work2.db2
db2 -o- -tvf work1.db2

CONNECT TO SAMPLE

CALL WLM_COLLECT_STATS

Collecting data periodically such as this permits you to watch how work on your system changes over time.

Additional Information: Collecting data periodically does not need to be a manual operation. Using the WLM_COLLECT_INT database configuration parameter, you can set the interval in minutes after which statistics collection and reset occurs.

Step 5: Query views to view statistics

Now that statistics have been collected, the views created earlier can be used to look at the statistics. The HISTOGRAMTYPES view just returns the types of histograms available.

SELECT * FROM HISTOGRAMTYPES

HISTOGRAM_TYPE
------------------------
CoordActEstCost
CoordActExecTime
CoordActInterArrivalTime
CoordActLifetime
CoordActQueueTime

  3 record(s) selected.

Since the EXTENDED option was used when altering the service class, there are five histograms.

The HISTOGRAMSERVICECLASSES view permits you to see the service classes for which a histogram was collected. The following example restricts the output to that of the CoordActInterArrivalTime histogram only. Since aggregate activity collection was turned on only for the default user service class's default subclass, only that class is shown when selecting from the HISTOGRAMSERVICECLASSES view.

SELECT * FROM HISTOGRAMSERVICECLASSES
  WHERE HISTOGRAM_TYPE = 'CoordActInterArrivalTime'
  ORDER BY SERVICE_SUPERCLASS, SERVICE_SUBCLASS

HISTOGRAM_TYPE           SERVICE_SUPERCLASS       SERVICE_SUBCLASS
------------------------ ------------------------ -----------------------
-
CoordActInterArrivalTime SYSDEFAULTUSERCLASS      SYSDEFAULTSUBCLASS

  1 record(s) selected.

The HISTOGRAMTIMES view shows the times when histograms were collected. Since the WLM_COLLECT_STATS procedure was run three times, there are three timestamps for the inter-arrival time histogram shown.

SELECT * FROM HISTOGRAMTIMES
  WHERE HISTOGRAM_TYPE = 'CoordActInterArrivalTime'
    AND SERVICE_SUPERCLASS = 'SYSDEFAULTUSERCLASS'
    AND SERVICE_SUBCLASS = 'SYSDEFAULTSUBCLASS'
  ORDER BY TIMESTAMP

HISTOGRAM_TYPE           SERVICE_SUPERCLASS       SERVICE_SUBCLASS        
TIMESTAMP
------------------------ ------------------------ -----------------------
- --------------------------
CoordActInterArrivalTime SYSDEFAULTUSERCLASS      SYSDEFAULTSUBCLASS      
2007-08-08-13.41.38.870298
CoordActInterArrivalTime SYSDEFAULTUSERCLASS      SYSDEFAULTSUBCLASS      
2007-08-08-13.41.42.802855
CoordActInterArrivalTime SYSDEFAULTUSERCLASS      SYSDEFAULTSUBCLASS      
2007-08-08-13.41.53.577835

The last view, HISTOGRAMS, is for looking at the histograms themselves. Unlike the HISTOGRAMTIMES view that lists each collection interval as its own row, this view aggregates histogram data across multiple intervals to produce a single histogram of a given type for a given service class.

SELECT BIN_TOP, NUMBER_IN_BIN FROM HISTOGRAMS
  WHERE HISTOGRAM_TYPE = 'CoordActInterArrivalTime'
    AND SERVICE_SUPERCLASS = 'SYSDEFAULTUSERCLASS'
    AND SERVICE_SUBCLASS = 'SYSDEFAULTSUBCLASS'
  ORDER BY BIN_TOP

BIN_TOP              NUMBER_IN_BIN
-------------------- --------------------
                  -1                    0
                   1                   10
                   2                    6
                   3                    7
                   5                   14
                   8                    7
                  12                   32
                  19                    2
                  29                    9
                  44                   24
                  68                   11
                 103                    8
                 158                    8
                 241                    9
                 369                    1
                 562                   10
                 858                    5
                1309                    5
                1997                    0
                3046                    0
                4647                    0
                7089                    0
               10813                    2
               16493                    2
               25157                    0
               38373                    0
               58532                    0
               89280                    0
              136181                    0
              207720                    0
              316840                    0
              483283                    0
              737162                    0
             1124409                    0
             1715085                    0
             2616055                    0
             3990325                    0
             6086529                    0
             9283913                    0
            14160950                    0
            21600000                    0

  41 record(s) selected.

Running this query produces output than will not be exactly the same as what is shown in the preceding example since activity inter-arrival times depend on the performance of the system. In the previous output, there are 41 bins and all of the largest bins are empty. At the top, there is a bin whose BIN_TOP is -1. This bin represents all of those activities whose inter-arrival time was too large to fit in the histogram. Seeing a NUMBER_OF_BIN greater than zero when the BIN_TOP is -1 indicates that you should probably increase the high bin value of your histogram. In the previous output, the NUMBER_IN_BIN is 0, so there is no need to make such a change. The majority of activities arrived less than 1309 milliseconds apart from each other. Four activities arrived between 7089 milliseconds and 16493 milliseconds apart.

The same query can be repeated with a histogram_type of CoordActEstCost instead of CoordActInterArrivalTime.

SELECT BIN_TOP, NUMBER_IN_BIN FROM HISTOGRAMS
  WHERE HISTOGRAM_TYPE = 'CoordActEstCost'
    AND SERVICE_SUPERCLASS = 'SYSDEFAULTUSERCLASS'
    AND SERVICE_SUBCLASS = 'SYSDEFAULTSUBCLASS'
  ORDER BY BIN_TOP

BIN_TOP              NUMBER_IN_BIN
-------------------- --------------------
                  -1                    0
                   1                   39
                   2                    0
                   3                    0
                   5                    0
                   8                   30
                  12                    0
                  19                   30
                  29                    0
                  44                    0
                  68                    0
                 103                    0
                 158                    0
                 241                    0
                 369                    0
                 562                    0
                 858                    0
                1309                    0
                1997                    0
                3046                    0
                4647                    0
                7089                    0
               10813                    0
               16493                    0
               25157                    0
               38373                    0
               58532                    0
               89280                    0
              136181                    0
              207720                    0
              316840                    0
              483283                    0
              737162                    0
             1124409                    0
             1715085                    0
             2616055                    0
             3990325                    0
             6086529                    0
             9283913                    0
            14160950                    0
            21600000                    0

  41 record(s) selected.

A histogram such as this is typical for a small workload. With a small workload, there is not much variation in the size of activities, so there are only three different bins that had activities counted towards them. Slightly more than 60% of the activities had a cost estimate between 5 and 19 timerons, with the rest having cost estimates of less than 1 timeron.

Step 6: Reset for other exercises

The final step is to turn off collection of aggregate activities on the default user service class, to drop the views and to delete the information in the statistics tables.

ALTER SERVICE CLASS SYSDEFAULTSUBCLASS UNDER SYSDEFAULTUSERCLASS
   COLLECT AGGREGATE ACTIVITY DATA NONE

DROP VIEW HISTOGRAMS
DROP VIEW HISTOGRAMTIMES
DROP VIEW HISTOGRAMSERVICECLASSES
DROP VIEW HISTOGRAMTYPES

SET EVENT MONITOR DB2STATISTICS STATE 0

DELETE FROM HISTOGRAMBIN_DB2STATISTICS
DELETE FROM SCSTATS_DB2STATISTICS