Mining your package cache for problem SQL in DB2 for Linux, UNIX, and Windows

In both OLTP and DSS database environments, DBAs can use information stored in the package cache to find problematic SQL statements either before or after they begin to negatively impact performance. While many third-party tools are available to identify and analyze problematic SQL, DBAs can locate such SQL by executing a few simple queries, without having to use one of those tools. This article shows you how to mine package cache data using some of those queries.


Ember Crooks (ember.crooks@gmail.com), Application Engineering Architect, Rosetta

Ember Crooks writes a DB2-related blog, which can be seen at http://db2commerce.com. She works for Rosetta, a consulting-centered interactive agency engineered to transform marketing for the connected world. Ember has been a DB2 DBA for 11 years and holds a number of DB2 certifications, including IBM Certified Database Administrator for DB2 10.1 for Linux, UNIX, and Windows, and IBM Certified Advanced Database Administrator for DB2 9.7 for Linux, UNIX, and Windows. She was a finalist in the DB2Night Show's DB2's Got Talent 2012 competition and has been a guest speaker on the DB2Night Show.



21 November 2012


What is the package cache?

DB2® for Linux®, UNIX®, and Windows® uses many different memory areas to cache critical information. These memory areas are very powerful tools in controlling database performance. The package cache is one such area. This is where DB2 stores static and dynamic queries, access plans, and information about the execution of those queries. Tuning your package cache is an important part of DB2 performance tuning. However, if all you do with the package cache is ensure that package cache hit ratios are good, you're missing the true power of this important memory area.

What information is in the package cache?

Like a gold mine, your package cache holds a mountain of information that you must sort through to find the nuggets that will make you rich (in terms of database performance). That is because there is a variety of interesting information available in this memory area. A few of the more interesting pieces of information available are:

  • Query statements — full text of dynamic queries executed against the database, with parameter markers if they were used
  • Number of executions — the number of times a particular query has been executed
  • Execution time — the total time spent on all executions of a particular query
  • Lock wait time — the total time a particular query spent waiting on locks
  • Sort time — the total time a particular query spent organizing and ordering data
  • Rows read — the total number of rows a particular query has examined
  • Rows returned — the total number of rows a query has returned in result sets
  • CPU time — the total CPU time consumed by a particular query

All information in the package cache is cumulative since the last time a particular query was placed in the package cache.

What information is NOT available in the package cache?

While the package cache contains a lot of useful information, it does not replace a SQL statement event monitor (a minimal sketch of one follows this list). The package cache cannot tell you:

  • When a particular query was executed
  • Some details about static SQL (particularly if you are using GET SNAPSHOT or are not on DB2 9.7 or later)
  • Specific values used in execution of a statement when that statement uses parameter markers
  • Literal values used during execution (if STMT_CONC is used)
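
If you need that per-execution detail, a statement event monitor is the right tool. As a minimal sketch, a write-to-table statement event monitor can be created and enabled as follows (STMT_MON is a placeholder name):

  -- STMT_MON is a hypothetical monitor name; FOR STATEMENTS captures each
  -- statement execution, including timestamps, into tables.
  CREATE EVENT MONITOR STMT_MON FOR STATEMENTS WRITE TO TABLE;
  SET EVENT MONITOR STMT_MON STATE 1;

  -- Statement event monitors are expensive on busy systems, so turn the
  -- monitor off as soon as you have captured what you need.
  SET EVENT MONITOR STMT_MON STATE 0;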

How long does information stay in the package cache?

DB2 moves individual statements in and out of the package cache to keep the most frequently executed statements in memory. Consequently, a statement might be in the package cache for a while before it is overwritten.

You can purge the contents of the package cache at any time by executing the following statement:

  FLUSH PACKAGE CACHE DYNAMIC

It is important to note that executing this statement can affect performance on a running system, since it causes the access plans for all dynamic SQL to be regenerated.

Deactivating a database will also cause the package cache to be emptied. This includes explicit and automatic database deactivations, DB2 instance stop/start operations, and server reboots. However, the resetting of monitor switches does not flush the package cache, nor does it zero the counters for statements that are in the package cache.

You cannot count on the statistics in the package cache being all-inclusive since your last reboot or explicit flush, especially if you have a smaller package cache.

Finding problem SQL in the package cache

What is problem SQL?

In my experience, problem SQL is a major cause of database performance problems. I've seen outage-level issues caused by problem SQL, especially during peak periods. I'm going to use a broad definition of "problem SQL" here: any SQL that causes a slowdown or degradation in performance. With this broad definition, there are a number of categories we can place SQL in. The top categories are:

  • Rows read — A high number of rows read (especially if the number of rows selected is small) can indicate that table scans are occurring.
  • Execution time — Simply put, queries that take a long time to execute are more likely to be problems than queries with shorter execution times, when summed across all executions.
  • CPU time — Reducing the CPU cycles needed to process a query can be a major improvement, especially if your system is CPU-bound.
  • Sort time — Particularly in OLTP databases, time spent sorting adds directly to query execution time.

In each of these situations, the number of times an SQL statement is executed must be taken into consideration. You can spend an hour reducing a 30-second query to 20 seconds, but if that query is executed only once an hour, your time might be better spent shaving 100 milliseconds off a query that is executed 200 times a minute.
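
To put numbers on that comparison: the once-an-hour improvement saves 10 seconds of execution time per hour, while the 100-millisecond improvement saves 0.1 x 200 x 60 = 1,200 seconds of execution time per hour.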

I choose to pay the most attention to categories I know I can do something about. I most often work with a vended application for which I don't have much ability to change the SQL that runs against my databases. But I do have the ability to add or remove indexes, which predisposes me to look at rows read and read efficiency.

Accessing information in the package cache

There are a number of methods that can be used to access the information found in the package cache. These methods are summarized in Table 1.

Table 1. Methods used to access information stored in the package cache
Method                         | Type                    | Basic information | Extended information | Static SQL | Version                  | Enabling collection
GET SNAPSHOT FOR DYNAMIC SQL   | Flat File               | Yes               | No                   | No         | All                      | DFT_MON_STMT, STATEMENT monitor switch
SNAPSHOT_DYN_SQL               | SQL/Table Function      | Yes               | No                   | No         | 8.2 and later            | DFT_MON_STMT
SQLCACHE_SNAPSHOT              | SQL/Table Function      | Yes               | No                   | No         | 8.2 and later            | DFT_MON_STMT
SNAP_GET_DYN_SQL               | SQL/Table Function      | Yes               | No                   | No         | 8.2 and later            | DFT_MON_STMT
SNAP_GET_DYN_SQL_V91           | SQL/Table Function      | Yes               | No                   | No         | 9.1 and later            | DFT_MON_STMT
SNAP_GET_DYN_SQL_V95           | SQL/Table Function      | Yes               | No                   | No         | 9.5 and later            | DFT_MON_STMT
SNAPDYN_SQL                    | SQL/Administrative View | Yes               | No                   | No         | 9.5 and later            | DFT_MON_STMT
LONG_RUNNING_SQL               | SQL/Administrative View | Partial           | No                   | No         | 9.7 and later            | DFT_MON_STMT
QUERY_PREP_COST                | SQL/Administrative View | Partial           | No                   | No         | 9.7 and later            | DFT_MON_STMT
TOP_DYNAMIC_SQL                | SQL/Administrative View | Partial           | No                   | No         | 9.7 and later            | DFT_MON_STMT
MON_PKG_CACHE_SUMMARY          | SQL/Administrative View | Yes               | No                   | Yes        | 9.7 Fix Pack 1 and later | MON_ACT_METRICS, COLLECT ACTIVITY METRICS on workload
MON_GET_PKG_CACHE_STMT         | SQL/Table Function      | Yes               | Yes                  | Yes        | 9.7 and later            | MON_ACT_METRICS, COLLECT ACTIVITY METRICS on workload
MON_GET_PKG_CACHE_STMT_DETAILS | SQL/Table Function/XML  | Yes               | Yes                  | Yes        | 9.7 Fix Pack 1 and later | MON_ACT_METRICS, COLLECT ACTIVITY METRICS on workload

Note: DFT_MON_STMT is a database manager configuration parameter, while MON_ACT_METRICS is a database configuration parameter.
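
If you need to enable collection, the corresponding configuration commands look like this (SAMPLE stands in for your database name; MON_ACT_METRICS defaults to BASE in DB2 9.7):

  UPDATE DBM CFG USING DFT_MON_STMT ON
  UPDATE DB CFG FOR SAMPLE USING MON_ACT_METRICS BASE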

The traditional way of accessing information stored in the package cache is to use a GET SNAPSHOT command that looks like this:

 GET SNAPSHOT FOR DYNAMIC SQL ON [DBAlias]

where DBAlias is the alias of the database for which package cache information is to be collected.

When this command is executed, the output produced looks something like this:

Listing 1. Snapshot output
      Dynamic SQL Snapshot Result

 Database name                      = SAMPLE

 Database path                      = /db_data/db2inst1/NODE0000/SQL00002/

 Number of executions               = 80
 Number of compilations             = 1
 Worst preparation time (ms)        = 12
 Best preparation time (ms)         = 12
 Internal rows deleted              = 0
 Internal rows inserted             = 0
 Rows read                          = 80
 Internal rows updated              = 0
 Rows written                       = 0
 Statement sorts                    = 0
 Statement sort overflows           = 0
 Total sort time                    = 0
 Buffer pool data logical reads     = 80
 Buffer pool data physical reads    = 0
 Buffer pool temporary data logical reads   = 0
 Buffer pool temporary data physical reads  = 0
 Buffer pool index logical reads    = 240
 Buffer pool index physical reads   = 0
 Buffer pool temporary index logical reads  = 0
 Buffer pool temporary index physical reads = 0
 Buffer pool xda logical reads      = 0
 Buffer pool xda physical reads     = 0
 Buffer pool temporary xda logical reads    = 0
 Buffer pool temporary xda physical reads   = 0
 Total execution time (sec.microsec)= 0.019997
 Total user cpu time (sec.microsec) = 0.010496
 Total system cpu time (sec.microsec)= 0.000000
 Total statistic fabrication time (milliseconds) = 0
 Total synchronous runstats time  (milliseconds) = 0
 Statement text                     = SELECT T1.PARTNUMBER, T1.FIELD5, T1.URL, 
T1.CATENTRY_ID, T1.BASEITEM_ID, T1.FIELD4, T1.OID, T1.ONAUCTION, T1.MFNAME, 
T1.ONSPECIAL, T1.BUYABLE, T1.FIELD2, T1.MFPARTNUMBER, T1.MEMBER_ID, T1.FIELD1, T1.STATE, 
T1.FIELD3, T1.ITEMSPC_ID, T1.LASTUPDATE, T1.MARKFORDELETE, T1.STARTDATE, T1.ENDDATE, 
T1.AVAILABILITYDATE, T1.LASTORDERDATE, T1.ENDOFSERVICEDATE, T1.DISCONTINUEDATE, 
T1.OPTCOUNTER, T1.CATENTTYPE_ID FROM CATENTRY  T1 WHERE T1.CATENTTYPE_ID = 'ItemBean'
 AND T1.CATENTRY_ID = ?
...

The most important information you could ask for is here, and in this case it is fairly easy to read. The problem is that this is the information for a single query; a dynamic SQL snapshot taken from a production system is likely to contain information for thousands of queries, and a flat file of that size is much more difficult to parse to find the problem SQL statements. Also, because this command has been around since the beginning of DB2, it returns only the information that was available in earlier versions: information on locks, the number of rows selected, and a whole host of other data is not available with this command.

My preferred method of accessing package cache information is to query the table function MON_GET_PKG_CACHE_STMT. I choose this method because it is lighter weight and has a lower impact on the monitored database than the other methods available, and it returns information in a SQL-accessible format. You can even use the MON_GET_PKG_CACHE_STMT function to obtain information about static SQL statements, if desired. Just be aware that this function is only available with DB2 9.7 and later.
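
For example, the first parameter of MON_GET_PKG_CACHE_STMT selects the section type: 'D' returns dynamic SQL, 'S' returns static SQL, and NULL returns both. A minimal sketch that pulls the top static statements by execution time:

  -- 'S' restricts the result to static SQL; -2 means all database members.
  SELECT SECTION_TYPE,
         NUM_EXECUTIONS,
         STMT_EXEC_TIME,
         SUBSTR(STMT_TEXT,1,40) AS STATEMENT
      FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'S', NULL, NULL, -2)) AS T
      ORDER BY STMT_EXEC_TIME DESC FETCH FIRST 5 ROWS ONLY WITH UR;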

Using the MON_GET_PKG_CACHE_STMT function, I have constructed a couple of master queries that I frequently use both to look at the most critical performance metrics and to rank those metrics against all other dynamic queries currently in the package cache. These queries are as follows:

Listing 2. Master query #1 using MON_GET_PKG_CACHE_STMT
WITH SUM_TAB (SUM_RR, SUM_CPU, SUM_EXEC, SUM_SORT, SUM_NUM_EXEC) AS (
        SELECT  FLOAT(SUM(ROWS_READ)),
                FLOAT(SUM(TOTAL_CPU_TIME)),
                FLOAT(SUM(STMT_EXEC_TIME)),
                FLOAT(SUM(TOTAL_SECTION_SORT_TIME)),
                FLOAT(SUM(NUM_EXECUTIONS))
            FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T
        )
SELECT
        SUBSTR(STMT_TEXT,1,10) as STATEMENT,
        ROWS_READ,
        DECIMAL(100*(FLOAT(ROWS_READ)/SUM_TAB.SUM_RR),5,2) AS PCT_TOT_RR,
        TOTAL_CPU_TIME,
        DECIMAL(100*(FLOAT(TOTAL_CPU_TIME)/SUM_TAB.SUM_CPU),5,2) AS PCT_TOT_CPU,
        STMT_EXEC_TIME,
        DECIMAL(100*(FLOAT(STMT_EXEC_TIME)/SUM_TAB.SUM_EXEC),5,2) AS PCT_TOT_EXEC
    FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T, SUM_TAB
    ORDER BY ROWS_READ DESC FETCH FIRST 5 ROWS ONLY WITH UR;
Listing 3. Master query #2 using MON_GET_PKG_CACHE_STMT
WITH SUM_TAB (SUM_RR, SUM_CPU, SUM_EXEC, SUM_SORT, SUM_NUM_EXEC) AS (
        SELECT  FLOAT(SUM(ROWS_READ)),
                FLOAT(SUM(TOTAL_CPU_TIME)),
                FLOAT(SUM(STMT_EXEC_TIME)),
                FLOAT(SUM(TOTAL_SECTION_SORT_TIME)),
                FLOAT(SUM(NUM_EXECUTIONS))
            FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T)
SELECT
        SUBSTR(stmt_text,1,10) as STATEMENT,
        TOTAL_SECTION_SORT_TIME,
        DECIMAL(100*(FLOAT(TOTAL_SECTION_SORT_TIME)/SUM_TAB.SUM_SORT),5,2) AS PCT_TOT_SRT,
        NUM_EXECUTIONS,
        DECIMAL(100*(FLOAT(NUM_EXECUTIONS)/SUM_TAB.SUM_NUM_EXEC),5,2) AS PCT_TOT_EXEC,
        CASE
            WHEN NUM_EXECUTIONS > 0 THEN
                DECIMAL(FLOAT(STMT_EXEC_TIME)/FLOAT(NUM_EXECUTIONS),5,2)
            ELSE -1
        END AS AVG_EXEC_TIME
    FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T, SUM_TAB
    ORDER BY ROWS_READ DESC FETCH FIRST 5 ROWS ONLY WITH UR;

When the first of these queries is executed, the following output is produced:

Listing 4. Master query #1 output
STATEMENT  ROWS_READ  PCT_TOT_RR TOTAL_CPU_TIME  PCT_TOT_CPU STMT_EXEC_TIME  PCT_TOT_EXEC
---------- ---------- ---------- --------------- ----------- --------------- ------------
select * f      91195      90.94         1013901        8.87            6699        14.33
SELECT SCS       4064       4.05          456660        3.99             450         0.96
SELECT T1.        537       0.53           75457        0.66              81         0.17
SELECT T1.        530       0.52          241887        2.11             104         0.22
UPDATE SCH        504       0.50           42339        0.37              55         0.11

  5 record(s) selected.

When the second of these queries is executed, this output is produced instead:

Listing 5. Master query #2 output
STATEMENT  TOTAL_SECTION_SORT_TIME PCT_TOT_SRT  NUM_EXECUTIONS  PCT_TOT_EXEC AVG_EXEC_TIME
---------- ----------------------- ------------ --------------- ------------ -------------
select * f                       0         0.00              13         0.22        515.30
SELECT SCS                      58        92.06             463         7.88          0.97
SELECT T1.                       0         0.00             537         9.14          0.15
SELECT T1.                       0         0.00             265         4.51          0.39
UPDATE SCH                       0         0.00             504         8.58          0.10

  5 record(s) selected.

In this example, we can see one query that is executed only 13 times but accounts for over 90% of the rows read. Depending on how often that query is actually executed, it may be desirable to create an index that can improve its performance. Another query is executed 463 times and accounts for about 4% of the rows read; it, too, is probably an excellent candidate for further investigation.

One of the things I like about these two master queries is that I can look for problem queries in various categories by changing the ORDER BY clause. In the previous example, I am using rows read, but I can also see that the second query uses 3.99% of the total CPU cycles being consumed by queries and 92% of the sort time. If I were specifically looking for queries consuming CPU time, I would order the results by TOTAL_CPU_TIME; if I were looking for queries consuming sort time, I would order the results by TOTAL_SECTION_SORT_TIME.

For ease of reading, these queries only return the first few characters of each SQL statement, but by removing the SUBSTR() function, you can easily get the full query syntax.
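
For example, to pull the complete text of the single worst statement by rows read (STMT_TEXT is a CLOB, so some tools may truncate very long statements on display):

  SELECT STMT_TEXT,
         ROWS_READ,
         NUM_EXECUTIONS
      FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T
      ORDER BY ROWS_READ DESC FETCH FIRST 1 ROW ONLY WITH UR;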

If you really are only concerned about one area of performance, you can do a more focused query. One of my favorites focuses on rows read and the index read efficiency on a query-by-query basis. This query looks like this:

Listing 6. Query focused on rows read using MON_GET_PKG_CACHE_STMT
WITH SUM_TAB (SUM_RR) AS (
        SELECT FLOAT(SUM(ROWS_READ))
        FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T)
SELECT
        SUBSTR(STMT_TEXT,1,20) AS STATEMENT,
        ROWS_READ,
        DECIMAL(100*(FLOAT(ROWS_READ)/SUM_TAB.SUM_RR),5,2) AS PCT_TOT_RR,
        ROWS_RETURNED,
        CASE
            WHEN ROWS_RETURNED > 0 THEN
                DECIMAL(FLOAT(ROWS_READ)/FLOAT(ROWS_RETURNED),10,2)
            ELSE -1
        END AS READ_EFFICIENCY,
        NUM_EXECUTIONS
    FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T, SUM_TAB
    ORDER BY ROWS_READ DESC FETCH FIRST 5 ROWS ONLY WITH UR;

And when executed this query produces output that looks like this:

Listing 7. Query results
STATEMENT            ROWS_READ  PCT_TOT_RR ROWS_RETURNED  READ_EFFICIENCY NUM_EXECUTIONS
-------------------- ---------- ---------- -------------- --------------- ---------------
select * from schsta     105240      91.20             15         7016.00              15
SELECT SCSINSTREFNUM       4510       3.90            295           15.28             514
SELECT T1.SCCHOST, T        597       0.51            597            1.00             597
SELECT T1.STATE, T1.        590       0.51            295            2.00             295
UPDATE SCHSTATUS  SE        560       0.48              0           -1.00             560

  5 record(s) selected.

Using this output, we can look at the Read Efficiency for each query, which tells us how well our database is indexed for that particular query. This, in turn, helps us identify queries where the addition of an index might improve performance. For OLTP databases, I consider a query with a Read Efficiency over 10 a potential problem. Keep in mind that the number of executions must be taken into account — is taking action really worth the cost in both your time and the time DB2 spends maintaining indexes? If the answer is yes, then by all means create the appropriate index(es).
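
A related sanity check is the read efficiency of the cached workload as a whole, which gives a single indexing-health number for everything currently in the package cache. A sketch (NULLIF avoids division by zero when no cached statement has returned rows yet):

  SELECT DECIMAL(FLOAT(SUM(ROWS_READ)) /
                 FLOAT(NULLIF(SUM(ROWS_RETURNED),0)),10,2) AS OVERALL_READ_EFFICIENCY
      FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T
      WITH UR;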

If CPU time is a major concern, the following query works well:

Listing 8. Query focused on CPU cycles using MON_GET_PKG_CACHE_STMT
WITH SUM_TAB (SUM_CPU) AS (
        SELECT FLOAT(SUM(TOTAL_CPU_TIME))
        FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T)
SELECT
        SUBSTR(STMT_TEXT,1,20) AS STATEMENT,
        TOTAL_CPU_TIME,
        DECIMAL(100*(FLOAT(TOTAL_CPU_TIME)/SUM_TAB.SUM_CPU),5,2) AS PCT_TOT_CPU,
        CASE
            WHEN NUM_EXECUTIONS > 0 THEN
                DECIMAL(FLOAT(TOTAL_CPU_TIME)/FLOAT(NUM_EXECUTIONS),10,2)
            ELSE -1
        END AS AVG_CPU_TIME_PER_EXE,
        NUM_EXECUTIONS
    FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T, SUM_TAB
    ORDER BY TOTAL_CPU_TIME DESC FETCH FIRST 5 ROWS ONLY WITH UR;

When this query is executed, the results produced look like this:

Listing 9. Query results
STATEMENT            TOTAL_CPU_TIME  PCT_TOT_CPU AVG_CPU_TIME_PER_EXE NUM_EXECUTIONS
-------------------- --------------- ----------- -------------------- ---------------
SELECT ORDERS.ORDERS        10073382       25.80            100733.82             100
(((SELECT T2.STATE,          4289268       10.98              2331.12            1840
SELECT T1.LASTNAME,          2977586        7.62              1586.35            1877
SELECT T1.SALT, T1.P         2863354        7.33              1315.87            2176
select  q1."EDPORDER         1916547        4.91               261.18            7338

  5 record(s) selected.

In this example, a couple of potential problem queries now appear in our data set — queries that did not come up when we were looking at queries ordered by rows read. It is interesting to look not just at the total CPU time over all executions, but also the average CPU time per execution.

What to do about problem SQL

There are a number of possible actions we can take to improve the performance of problem SQL. My favorite approach is to run Explain and then the Index/Design Advisor on a problem query to find indexes that might be helpful. (Indexing is a separate art that is outside the scope of this article.) I also discuss problem queries with developers — sometimes making specific syntax recommendations, or sometimes discussing how the frequency of execution of a particular query can be reduced through application-level caching or other techniques.
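
For example, once the full statement text has been extracted from the package cache into a file, the Design Advisor can be run against it from the command line (SAMPLE and problem_query.sql are placeholders; -t caps the advisor's run time in minutes):

  db2advis -d SAMPLE -i problem_query.sql -t 5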

How often should you perform an analysis?

The frequency with which you examine the data in your package cache depends on many factors. If you have the space, I recommend periodically writing the top problem queries off to history tables; for OLTP databases, an hourly interval works well. This allows you to go back and examine particular time periods based on reported performance problems or spikes in performance problem indicators.

One of my favorite developerWorks articles in recent years, Monitoring in DB2 9.7, Part 1: Emulating data reset with the new DB2 9.7 monitoring table functions, sets forth a general methodology for emulating the "reset monitor switches" functionality. The same methodology can be used with the MON_GET_PKG_CACHE_STMT function to get the monitoring benefits of emptying the package cache without the negative performance impact of performing a package cache flush. Taking this methodology one step further, you can write data off to a history table that only includes package cache information from a specified time period. This is particularly useful for investigating performance issues after they have occurred or for performing a detailed analysis of a particular problem time period.
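
A minimal sketch of that history-table approach follows; the table name and column list are illustrative, not a definitive schema:

  -- One-time setup: a capture timestamp plus the metrics of interest
  -- (these column types match what MON_GET_PKG_CACHE_STMT returns).
  CREATE TABLE DBA.PKG_CACHE_HIST (
      CAPTURE_TIME            TIMESTAMP,
      STMT_TEXT               CLOB(2M),
      NUM_EXECUTIONS          BIGINT,
      ROWS_READ               BIGINT,
      ROWS_RETURNED           BIGINT,
      TOTAL_CPU_TIME          BIGINT,
      STMT_EXEC_TIME          BIGINT,
      TOTAL_SECTION_SORT_TIME BIGINT
  );

  -- Scheduled hourly: keep the current top offenders by rows read.
  INSERT INTO DBA.PKG_CACHE_HIST
      SELECT CURRENT TIMESTAMP,
             STMT_TEXT, NUM_EXECUTIONS, ROWS_READ, ROWS_RETURNED,
             TOTAL_CPU_TIME, STMT_EXEC_TIME, TOTAL_SECTION_SORT_TIME
          FROM TABLE(MON_GET_PKG_CACHE_STMT ( 'D', NULL, NULL, -2)) AS T
          ORDER BY ROWS_READ DESC FETCH FIRST 25 ROWS ONLY;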

Even if you don't track this data over time, at a minimum you should be looking at information stored in the package cache at least quarterly and before peak periods. If you are working with an OLTP database, the focus of queries can change as different features are used or as different searches become more common — different promotions can drive user traffic in different directions. If you're working with data warehouses and decision support systems (DSS), you may have different types of queries that are run at different times of the year or queries that change as users change.


The Big WHY

The queries presented here can give you a good starting point for mining valuable data from your package cache. Even so, mining your package cache can be a lot of work. So, why should you analyze data from your package cache and how should you justify the time you spend doing it? The answers are simple — to improve performance and to prevent those outage-level performance problems that make you lose sleep. Overall, if you keep an eye on your package cache and take action appropriately, you can reduce the hardware needed for your database and increase performance at the same time.
