Historical Package Cache SQL monitoring in DSM requires a repository database, and thus is a feature supported only for the database which has DSM Enterprise capability such as AESE, etc.

SQL is defined as interesting based on any of the following:

Executing a large number of times
Running a long time
Consuming large amounts of resources (CPU, IO, memory)
Waiting a long time
Reading a lot of data
Returning a lot of rows
Escalating a lot of locks

The level of “interest” is relative to other queries currently in the package cache.

“Uninteresting” queires will not be captured.

Those not in the top N of any key metrics are defined as not interesting
In a busy system most queries will not be interesting and thus not captured
This approach greatly improves performance, scale-up, and focus over capturing everything

SQL Monitoring Cycle

The periodic monitoring cycle for historical SQL monitoring performs the following steps each iteration:

Get metrics for all SQLs that had execution completions in the past M minutes from the package cache, where M is the period of the monitoring cycles plus 1 (e.g. if the monitoring is done every 5 minutes, M = 6)
Cache the new metrics in memory and discard the prior set
Determine the “top N” most interesting set of SQLs to capture
Retrieve each of these SQLs from the package cache if still available
If plan capture is also enabled, send the executable_ids of the retrieved SQLs to the plan capture component to have their plans captured asynchronously
If SQL normalization is enabled (the default), normalize the SQL texts (see “SQL Normalization” below) and consolidate metrics for duplicates
Persist the metric deltas (differences between current and prior metric values for this SQL) for each retrieved SQL to the IBMOTS.SQL_FACT table (which has foreign key column sql_hash_id into the IBMOTS.SQL_DIM table to point to the SQL text itself)
If not already captured, also persist the SQL text to the IBMOTS.SQL_DIM table (primary key = sql_hash_id column)

SQL Normalization

To limit the amount of resources consumed, especially disk space in the repository database, DSM attempts to avoid monitoring data with “infinite” cardinality. Because unnormalized dynamic SQL statements may have literal parameter values in them, the number of unique unnormalized SQLs from a application may be essentially unbounded. Thus SQL normalization enabled by default.

SQL normalization replaces literal values with question mark (?) tokens to collapse the cardinality of unique SQL statements back to a (typically small) finite set. For example,

select a, b, c from tab1 where x = 1 and y = 2

becomes

select a, b, c from tab1 where x = ? and y = ?

The number of normalized SQLs from an application is usually small.

SQL Hash IDs

SQL entries in DB2’s package cache each have an executable_id as an identifier. However, these IDs are not stable over time because when a SQL falls out of the package cache, its executable_id is lost. When the same SQL re-enters the package cache, it gets a new executable_id. Thus DSM does not track executable_ids, only SQL hash IDs, which are stable over time. This is another way DSM collapses a potentially unlimited number of apparently different SQL statements back down to a small, finite number.

Determining the Most Interesting SQL

As mentioned earlier DSM attempts to capture only the most “interesting” SQL statements. At this point you are probably wondering how “interesting” is defined. The answer is that it is based on a SQL statement scoring in the top N on one or more of the following 14 package cache metrics (from the mon_get_pkg_cache_stmt() table function):

num_exec_with_metrics
stmt_exec_time
total_cpu_time
rows_read
rows_returned
total_act_wait_time
lock_wait_time
sort_overflows
logical_reads
physical_reads
temp_reads
pool_data_l_reads
pool_index_l_reads
lock_escals

The metric magnitudes used for ranking are the differences (deltas) between the values in the current monitoring cycle and the prior one. In this way the interest level is based on recent activity of the various SQL statements, rather than the entire life history of their entries in the package cache, because frequently executed SQL might stay in the package cache for very long periods, even months.

The “interesting” SQLs are selected by the following approach:

Retrieve the 14 numeric metrics above and their executable_ids for SQLs that ran in past M minutes
Compute metric deltas against prior values cached in memory (matched on executable_id) – only the values from the most recent prior cycle are cached
Sort the deltas of each metric
List the executable_ids of the top N (default 20) deltas of each metric
Create a set union of the 14 lists (i.e. remove duplicates); this union can have more than N entries (as many as 14 x N, but in practice there is significant overlap of SQLs scoring high on several metrics, so the cardinality of the union set is usually less than this)
Retrieve the SQL text of each executable_id in unioned set from the package cache

Note that if an executable_id of a SQL that is in the package cache during the current monitoring cycle was not there in the prior cycle, DSM will not be able to compute the metric deltas. It may still be able to capture such a SQL if there is either definite or heuristic evidence that its package cache entry is “new,” meaning the values of the metrics can be considered to have been zero in the prior monitoring cycle.

SQL History Repository Pruning

Pruning of historical SQL data from the DSM repository database is based on an age cut-off, which is user-settable per monitored database, with a default of one week.

IBMOTS.SQL_FACT rows (metrics) are pruned when they age past the cutoff
IBMOTS.SQL_DIM rows (SQL text) are pruned when they are no longer referenced by any other table. In addition to historical SQL monitoring, this table is also referenced by other features of DSM such as Alerts and Query Tuning

Historical Package Cache SQL Monitoring Configuration Parameters

This section describes the configuration parameters a DSM administrator can set in Monitoring Profiles via the Set Up / Monitoring Profiles menu choice, selecting a monitroing profile, and clicking the pencil icon to edit it. The settings applicable to this feature are shown below in Figure 1.

Figure 1. Historical Package Cache SQL Monitoring configuration parameters.

The meanings of the parameters are as follows:

SQL statement execution data checkbox [default ON] – whether to collect historical SQL data (requires Repository Persistence to be ON)
Package cache data checkbox [default ON] – whether to collect historical Package cache data
Collect and persist data every [default 5 minutes]– number of minutes between historical SQL monitoring cycles
Number of SQL statements with the highest values for a metric to capture [default 20] – this is the “N” described above in selecting the “top N most interesting” SQLs for each of 14 metrics
Keep data for [default 7 days] – number of days to keep the historical SQL data before pruning it from the repository database
Normalize captured SQL statements checkbox [default ON] – whether to normalize SQL by replacing any literal values with question mark (?) tokens to collapse the number of unique SQL statements observed from a potentially unlimited number to a likely small, finite number

IBM Support

Tips

Package Cache SQL Monitoring in History mode

Question & Answer

Question

Answer

SQL Monitoring Cycle

SQL Normalization

SQL Hash IDs

Determining the Most Interesting SQL

SQL History Repository Pruning

Historical Package Cache SQL Monitoring Configuration Parameters

Was this topic helpful?

Document Information

UID

Share your feedback

Need support?