Exercise 11: Using extended aggregates for service classes
This exercise demonstrates how to use the COLLECT AGGREGATE ACTIVITY DATA EXTENDED option on a service class to produce histograms of coordinator activity inter-arrival times and estimated costs.
Estimated time: 25-30 minutes
The inter-arrival time is the time interval between the arrival of one activity into the system and the arrival of the next activity. The estimated cost of an activity represents an estimate by the SQL compiler of the system resources that will be used during the execution of the activity, and only applies to DML activities.
- A change in the complexity of the workload, such as a change in the distribution of the estimated costs
- A change in the activity arrival rate, determined from the inter-arrival time distribution
- A change in the system itself, such as the introduction of a new threshold, a change in the priority given to a service class, or a change in hardware
For more information on histograms, see Histograms in workload management.
Histograms are accessed through the statistics event monitor. This exercise reuses the statistics event monitor created in Exercise 1 Step 1.
Step 1: Create views for viewing histogram statistics
Create several views to make querying the HISTOGRAMBIN_DB2STATISTICS table easier. The first view lists all of the histogram types available. In this exercise, it reports just the three basic types: lifetime, execution time and queue time.
CREATE VIEW HISTOGRAMTYPES AS
SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) HISTOGRAM_TYPE
FROM HISTOGRAMBIN_DB2STATISTICS
A second view makes it easier to find out what service classes are having histograms collected for them. The HISTOGRAMBIN_DB2STATISTICS table reports the service class for which the histogram is being collected by giving the service class ID. Joining this table with the SERVICECLASSES catalog table will permit the service class information to be presented with the service super class name and service subclass name instead of the service class ID.
CREATE VIEW HISTOGRAMSERVICECLASSES AS
SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) HISTOGRAM_TYPE,
SUBSTR(PARENTSERVICECLASSNAME,1,24) SERVICE_SUPERCLASS,
SUBSTR(SERVICECLASSNAME,1,24) SERVICE_SUBCLASS
FROM HISTOGRAMBIN_DB2STATISTICS H,
SYSCAT.SERVICECLASSES S
WHERE H.SERVICE_CLASS_ID = S.SERVICECLASSID
The third view lists all of the times that a histogram of a given type was collected for a given service class. Such as the HISTOGRAMSERVICECLASSES view, it joins the HISTOGRAMBIN_DB2STATISTICS table with the SERVICECLASSES catalog table. The difference is in the STATISTICS_TIMESTAMP column which is included as one of the columns in this view.
CREATE VIEW HISTOGRAMTIMES AS
SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) HISTOGRAM_TYPE,
SUBSTR(PARENTSERVICECLASSNAME,1,24) SERVICE_SUPERCLASS,
SUBSTR(SERVICECLASSNAME,1,24) SERVICE_SUBCLASS,
STATISTICS_TIMESTAMP TIMESTAMP
FROM HISTOGRAMBIN_DB2STATISTICS H,
SYSCAT.SERVICECLASSES S
WHERE H.SERVICE_CLASS_ID = S.SERVICECLASSID
The fourth and final view will be used to show the histograms themselves. It also demonstrates a common task when dealing with histograms, which is to aggregate them over time. This view shows the top of each bin and the number of activities that were counted towards each bin. Of the two following histograms, the BIN_TOP field measures the number of milliseconds in the activity inter-arrival time and the number of timerons in the estimated cost. When BIN_TOP is, 3000 milliseconds and the BIN_TOP of the previous bin is 2000 milliseconds and the NUMBER_IN_BIN is ten for an inter-arrival time histogram you know that there were ten activities which each arrived into the system between 2 and 3 seconds after the arrival of the previous activity, for example.
CREATE VIEW HISTOGRAMS(HISTOGRAM_TYPE, SERVICE_SUPERCLASS,
SERVICE_SUBCLASS, BIN_TOP, NUMBER_IN_BIN) AS
SELECT DISTINCT SUBSTR(HISTOGRAM_TYPE,1,24) HISTOGRAM_TYPE,
SUBSTR(PARENTSERVICECLASSNAME,1,24) SERVICE_SUPERCLASS,
SUBSTR(SERVICECLASSNAME,1,24) SERVICE_SUBCLASS,
TOP AS BIN_TOP,
SUM(NUMBER_IN_BIN) AS NUMBER_IN_BIN
FROM HISTOGRAMBIN_DB2STATISTICS H,
SYSCAT.SERVICECLASSES S
WHERE H.SERVICE_CLASS_ID = S.SERVICECLASSID
GROUP BY HISTOGRAM_TYPE, PARENTSERVICECLASSNAME, SERVICECLASSNAME, TOP
Step 2: Turn on collection of histograms
Turning on the collection of histograms is done for the default user service class by altering its default subclass to collect aggregate activity data with the EXTENDED option. This provides both the three histograms available in the BASE option (lifetime, execution time, and queue time) and the two histograms available only when using the EXTENDED option (inter-arrival time and estimated cost).
ALTER SERVICE CLASS SYSDEFAULTSUBCLASS UNDER SYSDEFAULTUSERCLASS
COLLECT AGGREGATE ACTIVITY DATA EXTENDED
Step 3: Activate statistics event monitor
If not already active, activate the event monitor that was created earlier so that it can receive aggregate data whenever it is collected.
SET EVENT MONITOR DB2STATISTICS STATE 1
Step 4: Run activities and send statistics to statistics event monitor
First run some activities; after the activities have finished the WLM_COLLECT_STATS stored procedure is called to send the service class statistics to the active statistics event monitor (including the activity lifetime, execution time, queue time, inter-arrival time and estimated cost histograms for the default user service class). These histograms contain data about all activities that executed in the default user service class since aggregate activity statistics were enabled. Calling the stored procedure also resets the statistics. To show changes in database activity over time, three collection intervals are created.
In the first interval, run two scripts, work1.db2 and work2.db2, then collect and reset the statistics.
db2 -o- -tvf work1.db2
db2 -o- -tvf work2.db2
CONNECT TO SAMPLE
CALL WLM_COLLECT_STATS
In the second interval, run only the work1.db2 script once, then collect and reset the statistics.
db2 -o- -tvf work1.db2
CONNECT TO SAMPLE
CALL WLM_COLLECT_STATS
In the third interval, run the work1.db2 script twice and the work2.db2 script once, then collect and reset the statistics.
db2 -o- -tvf work1.db2
db2 -o- -tvf work2.db2
db2 -o- -tvf work1.db2
CONNECT TO SAMPLE
CALL WLM_COLLECT_STATS
Collecting data periodically such as this permits you to watch how work on your system changes over time.
Additional Information: Collecting data periodically does not need to be a manual operation. Using the WLM_COLLECT_INT database configuration parameter, you can set the interval in minutes after which statistics collection and reset occurs.
Step 5: Query views to view statistics
Now that statistics have been collected, the views created earlier can be used to look at the statistics. The HISTOGRAMTYPES view just returns the types of histograms available.
SELECT * FROM HISTOGRAMTYPES
HISTOGRAM_TYPE
------------------------
CoordActEstCost
CoordActExecTime
CoordActInterArrivalTime
CoordActLifetime
CoordActQueueTime
3 record(s) selected.
Since the EXTENDED option was used when altering the service class, there are five histograms.
The HISTOGRAMSERVICECLASSES view permits you to see the service classes for which a histogram was collected. The following example restricts the output to that of the CoordActInterArrivalTime histogram only. Since aggregate activity collection was turned on only for the default user service class's default subclass, only that class is shown when selecting from the HISTOGRAMSERVICECLASSES view.
SELECT * FROM HISTOGRAMSERVICECLASSES
WHERE HISTOGRAM_TYPE = 'CoordActInterArrivalTime'
ORDER BY SERVICE_SUPERCLASS, SERVICE_SUBCLASS
HISTOGRAM_TYPE SERVICE_SUPERCLASS SERVICE_SUBCLASS
------------------------ ------------------------ -----------------------
-
CoordActInterArrivalTime SYSDEFAULTUSERCLASS SYSDEFAULTSUBCLASS
1 record(s) selected.
The HISTOGRAMTIMES view shows the times when histograms were collected. Since the WLM_COLLECT_STATS procedure was run three times, there are three timestamps for the inter-arrival time histogram shown.
SELECT * FROM HISTOGRAMTIMES
WHERE HISTOGRAM_TYPE = 'CoordActInterArrivalTime'
AND SERVICE_SUPERCLASS = 'SYSDEFAULTUSERCLASS'
AND SERVICE_SUBCLASS = 'SYSDEFAULTSUBCLASS'
ORDER BY TIMESTAMP
HISTOGRAM_TYPE SERVICE_SUPERCLASS SERVICE_SUBCLASS
TIMESTAMP
------------------------ ------------------------ -----------------------
- --------------------------
CoordActInterArrivalTime SYSDEFAULTUSERCLASS SYSDEFAULTSUBCLASS
2007-08-08-13.41.38.870298
CoordActInterArrivalTime SYSDEFAULTUSERCLASS SYSDEFAULTSUBCLASS
2007-08-08-13.41.42.802855
CoordActInterArrivalTime SYSDEFAULTUSERCLASS SYSDEFAULTSUBCLASS
2007-08-08-13.41.53.577835
The last view, HISTOGRAMS, is for looking at the histograms themselves. Unlike the HISTOGRAMTIMES view that lists each collection interval as its own row, this view aggregates histogram data across multiple intervals to produce a single histogram of a given type for a given service class.
SELECT BIN_TOP, NUMBER_IN_BIN FROM HISTOGRAMS
WHERE HISTOGRAM_TYPE = 'CoordActInterArrivalTime'
AND SERVICE_SUPERCLASS = 'SYSDEFAULTUSERCLASS'
AND SERVICE_SUBCLASS = 'SYSDEFAULTSUBCLASS'
ORDER BY BIN_TOP
BIN_TOP NUMBER_IN_BIN
-------------------- --------------------
-1 0
1 10
2 6
3 7
5 14
8 7
12 32
19 2
29 9
44 24
68 11
103 8
158 8
241 9
369 1
562 10
858 5
1309 5
1997 0
3046 0
4647 0
7089 0
10813 2
16493 2
25157 0
38373 0
58532 0
89280 0
136181 0
207720 0
316840 0
483283 0
737162 0
1124409 0
1715085 0
2616055 0
3990325 0
6086529 0
9283913 0
14160950 0
21600000 0
41 record(s) selected.
Running this query produces output than will not be exactly the same as what is shown in the preceding example since activity inter-arrival times depend on the performance of the system. In the previous output, there are 41 bins and all of the largest bins are empty. At the top, there is a bin whose BIN_TOP is -1. This bin represents all of those activities whose inter-arrival time was too large to fit in the histogram. Seeing a NUMBER_OF_BIN greater than zero when the BIN_TOP is -1 indicates that you should probably increase the high bin value of your histogram. In the previous output, the NUMBER_IN_BIN is 0, so there is no need to make such a change. The majority of activities arrived less than 1309 milliseconds apart from each other. Four activities arrived between 7089 milliseconds and 16493 milliseconds apart.
The same query can be repeated with a histogram_type of CoordActEstCost instead of CoordActInterArrivalTime.
SELECT BIN_TOP, NUMBER_IN_BIN FROM HISTOGRAMS
WHERE HISTOGRAM_TYPE = 'CoordActEstCost'
AND SERVICE_SUPERCLASS = 'SYSDEFAULTUSERCLASS'
AND SERVICE_SUBCLASS = 'SYSDEFAULTSUBCLASS'
ORDER BY BIN_TOP
BIN_TOP NUMBER_IN_BIN
-------------------- --------------------
-1 0
1 39
2 0
3 0
5 0
8 30
12 0
19 30
29 0
44 0
68 0
103 0
158 0
241 0
369 0
562 0
858 0
1309 0
1997 0
3046 0
4647 0
7089 0
10813 0
16493 0
25157 0
38373 0
58532 0
89280 0
136181 0
207720 0
316840 0
483283 0
737162 0
1124409 0
1715085 0
2616055 0
3990325 0
6086529 0
9283913 0
14160950 0
21600000 0
41 record(s) selected.
A histogram such as this is typical for a small workload. With a small workload, there is not much variation in the size of activities, so there are only three different bins that had activities counted towards them. Slightly more than 60% of the activities had a cost estimate between 5 and 19 timerons, with the rest having cost estimates of less than 1 timeron.
Step 6: Reset for other exercises
The final step is to turn off collection of aggregate activities on the default user service class, to drop the views and to delete the information in the statistics tables.
ALTER SERVICE CLASS SYSDEFAULTSUBCLASS UNDER SYSDEFAULTUSERCLASS
COLLECT AGGREGATE ACTIVITY DATA NONE
DROP VIEW HISTOGRAMS
DROP VIEW HISTOGRAMTIMES
DROP VIEW HISTOGRAMSERVICECLASSES
DROP VIEW HISTOGRAMTYPES
SET EVENT MONITOR DB2STATISTICS STATE 0
DELETE FROM HISTOGRAMBIN_DB2STATISTICS
DELETE FROM SCSTATS_DB2STATISTICS