Preliminary remarks about the accuracy of summary and detail reports

Summary and detail reports partly show identical information, for example, the number of Getpage requests during the time data was collected.

If you work with summary and detail reports that cover identical time frames, you expect identical counters to report identical numbers, but you might find that these numbers are sometimes not equal. This topic explains the technical causes and helps you to understand the accuracy of both reports.

As described in Collecting data, different data types, identified by IFCIDs, are used for summary and detail reports. Summary reports use buffer pool and data set statistics data from Db2, whereas detail reports use activity data from Db2. Both data types are continuously provided by Db2. Tools like Buffer Pool Analyzer collect this data for specified time frames that you want to analyze.

Activity data is purely event based. Db2 keeps a record of every single event. When Buffer Pool Analyzer collects activity data from Db2 for a duration or time frame that you specify, it obtains precise information about every single activity during that time frame. Together with the known data collection start time and end time, this information allows precise totaling and calculations, which results in exact numbers in detail reports.

However, statistics data is recorded by Db2 at intervals, and the interval length can vary depending on the initial system settings. When Buffer Pool Analyzer collects statistics data from Db2 for a duration or time frame that you specify, it obtains only the interval recordings that fall within that time frame. Worse, the collection start time and end time rarely match the system's interval recordings exactly. As a result, any calculations and the numbers in summary reports are based on the time frame between the first and last interval recording that is covered by the specified start and end times. Partial intervals at the beginning and end of the collection time remain uncovered.

Example:
Figure 1. Example of how the statistics interval influences the accuracy
 |------|------|------|------|------|------|------|------|-->  Time line
t0     t1     t2     t3     t4     t5     t6     t7     t8    

 |  4   |  10  |  3   |  7   |  4   |  8   |   2  |   5  |     Activities between time slots


 |-------------|-------------|-------------|-------------|-->  Statistics intervals

              14            24            36            43     Statistics counts at intervals
                                                               (accumulative)

        |-----------------------------------------|            Data collection time

        |  10  |  3   |  7   |  4   |  8   |   2  |            34 actual activities

        |     14            24            36      |            36 - 14 = 22 calculated activities

This example uses fictitious numbers and an unrealistically short data collection time to emphasize the cause of inaccuracies. The data collection starts at t1 and ends at t7. A detail report would show precisely 34 activities during this time, simply by counting the events that Db2 has recorded for every single activity. In contrast, summary reports rely on the statistics counter, which is updated (incremented in this case) at times t2, t4, t6, t8, and so on. Here, only the counter values at times t2, t4, and t6 are covered by the data collection time. The number of activities (22) is calculated as the difference between the smallest (14) and the greatest (36) covered counter value. No attempt is made to estimate how the smallest value (14) developed between t1 and t2. Likewise, the time between t6 and t7 is not considered.
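The arithmetic in Figure 1 can be retraced with a short Python sketch. This is only an illustration of the two counting methods, not how Buffer Pool Analyzer or Db2 is implemented. The names activities, stat_times, detail_count, and summary_count are hypothetical and introduced here for illustration, and the numbers are the fictitious values from the figure.

```python
from itertools import accumulate

# Fictitious numbers from Figure 1: activities in the time slots
# t0-t1, t1-t2, ..., t7-t8.
activities = [4, 10, 3, 7, 4, 8, 2, 5]

# Accumulative counter value at each time t0..t8 (the counter is 0 at t0).
cumulative = [0, *accumulate(activities)]

# As in the figure, the statistics counter is recorded at t2, t4, t6, t8.
stat_times = [2, 4, 6, 8]

def detail_count(start, end):
    """Event-based method: total every activity between start and end."""
    return sum(activities[start:end])

def summary_count(start, end):
    """Interval-based method: difference between the last and the first
    counter recording that fall inside the collection time."""
    covered = [t for t in stat_times if start <= t <= end]
    if len(covered) < 2:
        return 0  # fewer than two recordings: no difference can be formed
    return cumulative[covered[-1]] - cumulative[covered[0]]

# Data collection from t1 to t7, as in Figure 1.
print(detail_count(1, 7))   # 34 actual activities
print(summary_count(1, 7))  # 36 - 14 = 22 calculated activities
```

If the collection time happens to start and end exactly on interval recordings (for example, from t2 to t6), both functions return the same value (22), because no partial intervals remain uncovered.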

In practice, these inaccuracies are marginal, if they are visible at all. They do not reduce the usefulness of summary reports. Keep in mind that two different methods are used, and do not try to match counter values in both reports down to a single digit.
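Why the inaccuracy tends to become marginal can be sketched as follows. At most one partial interval remains uncovered at each end of the collection time, so the uncovered share shrinks as the collection time grows. The sketch assumes, purely for illustration, that interval recordings occur at whole multiples of a fixed interval length. The function covered_fraction is hypothetical and reports covered time, which equals the covered share of activities only if activity is roughly steady.

```python
def covered_fraction(start, end, interval):
    """Fraction of the collection time [start, end] that lies between the
    first and the last interval recording inside it.

    Assumption for illustration: recordings occur at whole multiples of
    `interval`; all arguments are integers in the same time unit.
    """
    first = -(-start // interval) * interval  # first recording at or after start
    last = (end // interval) * interval       # last recording at or before end
    if last <= first:
        return 0.0  # fewer than two recordings covered
    return (last - first) / (end - start)

# Collection time of Figure 1 (t1 to t7, recordings every 2 time slots):
print(covered_fraction(1, 7, 2))    # 4 of 6 time slots are covered

# The same interval length with a much longer collection time:
print(covered_fraction(1, 601, 2))  # 598 of 600 time slots are covered
```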

If you want to know more about Db2 statistics data, different counter types (watermarks, snapshots, accumulative counters), and different processing modes (regular, interval, delta processing), see, for example, Monitoring Performance from the Workstation.