IBM Support

DB2: How to troubleshoot a query with parameter markers using section actuals?

Technical Blog Post


Abstract

DB2: How to troubleshoot a query with parameter markers using section actuals?

Body

Quite often we face a situation where a query runs slower for some specific predicate values, but only when parameter markers are used. Analyzing the execution metrics of an application running on a live system can be inconvenient, but doing the same from the command line in an isolated connection is usually much easier.

Let's consider a simple query on the SAMPLE database:

SELECT e.empno,
       e.lastname,
       d.deptname
FROM   employee e,
       department d
WHERE  e.workdept = d.deptno
       AND mgrno = ?;


We can generate an explain for such a query with a parameter marker (saved as myquery.sql) from the command line:
$ db2 "set current explain mode explain"
$ db2 -tvf myquery.sql
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604
$ db2exfmt -d sample -1 -t
<cut>
Access Plan:
-----------
        Total Cost:             13.6284
        Query Degree:           1

                           Rows
                          RETURN
                          (   1)
                           Cost
                            I/O
                            |
                             3
                          NLJOIN
                          (   2)
                          13.6284
                             2
                /-----------+------------\
               1                            3
            FETCH                        FETCH
            (   3)                       (   5)
            6.81343                      6.81494
               1                            1
         /----+----\                  /----+----\
        1            14              3            42
     IXSCAN    TABLE: DB2V105     IXSCAN    TABLE: DB2V105
     (   4)      DEPARTMENT       (   6)       EMPLOYEE
   0.00633371        Q1         0.00758325        Q2
        0                            0
       |                            |
       14                           42
 INDEX: DB2V105               INDEX: DB2V105
     XDEPT2                        XEMP2
       Q1                           Q2


but we cannot execute such a query from CLP.
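As an aside, the statement can of course be run from the CLP if we substitute a literal for the marker. Note, however, that the optimizer then sees the value at compile time and may pick a different access plan than for the parameter-marker version, which is often precisely the difference we are trying to investigate:

```sql
-- Runnable directly from the CLP, but compiled with the literal visible
-- to the optimizer, so the plan may differ from the parameter-marker case.
SELECT e.empno,
       e.lastname,
       d.deptname
FROM   employee e,
       department d
WHERE  e.workdept = d.deptno
       AND mgrno = '000010';
```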

 

In such a situation db2batch can be useful: using the #PARAM directive, we can pass parameter values to the query we want to execute. After the #PARAM keyword, a list of parameter values to be used for a given predicate is provided. Moreover, we can execute the same query multiple times with different parameters using the #BGBLK directive. An example will make this clearer.

To have our query executed with 2 different values of the parameter marker, we can use the following input file:
$ cat db2batch.1.in
--#BGBLK 2
--#PARAM '000060' '000010'
SELECT e.empno,
       e.lastname,
       d.deptname
FROM   employee e,
       department d
WHERE  e.workdept = d.deptno
       AND mgrno = ?;
--#EOBLK

The block is enclosed in the #BGBLK/#EOBLK directives; the number after #BGBLK, 2, says how many times we want the block to be executed. For each execution, one of the provided parameter values will be used ('000060' and '000010' for the input file above). We execute it as follows:
$ db2batch -d sample -f db2batch.1.in -i complete -o r 0 p 5 -r db2batch.1.out

gives us:
-i complete: breakdown of prepare, execute, and fetch times
-o r 0: print zero actual rows in the output (we don't really want to see them; the fetch will still happen, though)
-o p 5: collect all types of snapshots


In the output file we will have the metrics for the execution with each parameter:

$ cat db2batch.1.out

<cut>
* List of 1 supplied parameter(s):
*   '000010'

EMPNO  LASTNAME        DEPTNAME                            
------ --------------- ------------------------------------

* 5 row(s) fetched, 0 row(s) output.

* Prepare Time is:       0.000205 seconds
* Execute Time is:       0.000278 seconds
* Fetch Time is:         0.000138 seconds
* Elapsed Time is:       0.000621 seconds (complete)

* Block Iteration 2 of 2:
---------------------------------------------

* SQL Statement Number 1:

select e.empno, e.lastname, d.deptname from employee e, department d where e.workdept = d.deptno and mgrno = ? ;

* List of 1 supplied parameter(s):
*   '000060'

EMPNO  LASTNAME        DEPTNAME                            
------ --------------- ------------------------------------

* 11 row(s) fetched, 0 row(s) output.

* Prepare Time is:       0.000000 seconds
* Execute Time is:       0.000116 seconds
* Fetch Time is:         0.000066 seconds
* Elapsed Time is:       0.000182 seconds (complete)


Obviously, this query is simple and there is not much difference, but we can see, for example, that for the first execution the query had to be compiled (prepare time is non-zero), which was not the case for the second execution (the statement was already in the package cache).
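With many block iterations, comparing the timing sections by eye gets tedious. Here is a minimal Python sketch that scrapes the "* ... Time is:" lines in the format shown above; the parsing logic is my own and not part of db2batch:

```python
import re

# Matches the per-phase timing lines that db2batch prints per statement,
# in the format shown in the output above.
TIME_RE = re.compile(
    r"\*\s+(Prepare|Execute|Fetch|Elapsed) Time is:\s+([0-9.]+) seconds")

def parse_timings(report: str):
    """Return a list of dicts, one per statement execution, mapping
    phase name -> seconds. A new execution starts at each Prepare line."""
    runs = []
    for phase, secs in TIME_RE.findall(report):
        if phase == "Prepare":
            runs.append({})
        if runs:
            runs[-1][phase] = float(secs)
    return runs

# Sample input: the two iterations captured in the output file above.
sample = """
* Prepare Time is:       0.000205 seconds
* Execute Time is:       0.000278 seconds
* Fetch Time is:         0.000138 seconds
* Elapsed Time is:       0.000621 seconds (complete)

* Prepare Time is:       0.000000 seconds
* Execute Time is:       0.000116 seconds
* Fetch Time is:         0.000066 seconds
* Elapsed Time is:       0.000182 seconds (complete)
"""

for i, run in enumerate(parse_timings(sample), 1):
    cached = "package cache hit" if run["Prepare"] == 0.0 else "compiled"
    print(f"iteration {i}: {cached}, elapsed {run['Elapsed']:.6f}s")
```

A zero prepare time is used here as a simple indicator that the statement came from the package cache, mirroring the observation above.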


We can use two parameter markers in a single query by using two #PARAM directives, e.g.:

$ cat db2batch.2.in
--#BGBLK 2
--#PARAM '000060' '000010'
--#PARAM 50000 60000 70000
SELECT e.empno,
       e.lastname,
       d.deptname
FROM   employee e,
       department d
WHERE  e.workdept = d.deptno
       AND d.mgrno = ?
       AND e.salary > ?;
--#EOBLK

- for the salary predicate, one of the 3 provided values will be used. For integer and decimal parameters, ranges of input values can be generated automatically; details can be found in the db2batch documentation:
https://www.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0002043.html
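As a sketch only: `min:max` is my reading of the range syntax described in the linked db2batch documentation, and the file name db2batch.4.in is made up, so please verify the exact form there before using it. With a generated range, the salary predicate could be driven like this:

```sql
--#BGBLK 4
--#PARAM '000060' '000010'
--#PARAM 40000:90000
SELECT e.empno,
       e.lastname,
       d.deptname
FROM   employee e,
       department d
WHERE  e.workdept = d.deptno
       AND d.mgrno = ?
       AND e.salary > ?;
--#EOBLK
```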

There are also interesting examples on the blog of our colleagues from the WebSphere Commerce team:
https://www.ibm.com/developerworks/community/blogs/wcs/entry/Tips_for_troubleshooting_slow_queries_DB2?lang=en


db2batch can automatically collect snapshots (the "-o p" option), but what can we do if we need more details?

The obvious answer is event monitors, but they can be quite tricky if you have never used them before.
One of the interesting features of event monitors is the ability to collect "section actuals": the actual number of rows "flowing" through each access plan operator. This can give us a lot of insight into query execution, most importantly whether the estimates the optimizer worked with were accurate, and this is what we'll try to do here.


There are a couple of ways to set up activity details monitoring, but three things are needed:
1. Section actuals have to be enabled (the SECTION_ACTUALS database configuration parameter):
$ db2 connect to sample
$ db2 update database configuration using section_actuals base

2. An activity event monitor is needed. If you don't have one yet, you can create it as follows:
$ db2 create event monitor actuals_mon for activities write to table
(the name will be "ACTUALS_MON")
- this will create 5 new tables, where the event monitor infrastructure will store our data:
ACTIVITYMETRICS_ACTUALS_MON
ACTIVITYSTMT_ACTUALS_MON
ACTIVITYVALS_ACTUALS_MON
ACTIVITY_ACTUALS_MON
CONTROL_ACTUALS_MON

3. The monitor has to be activated to capture the data:
$ db2 set event monitor actuals_mon state 1

 

Once we have the event monitor ready, metrics and actuals collection has to be enabled, either explicitly from a given connection:
$ db2 "call wlm_set_conn_env(NULL,'<collectactdata>WITH DETAILS, SECTION</collectactdata><collectactpartition>ALL</collectactpartition>')"
or implicitly via workloads. We might not want to have it enabled for all applications running in the database (which likely fall into the default workload SYSDEFAULTUSERWORKLOAD) to avoid overhead, so a new workload should be created. To assign "our" workload (the one in which we will execute our query), I tend to use assignment based on the client accounting string:
$ db2 "create workload actuals current client_acctng('actuals') collect activity data with details,section"
$ db2 "grant usage on workload actuals to public"

From now on, all connections that have the client accounting string set to 'actuals', which can be done via the CLI ClientAcctStr keyword or by calling the proper WLM procedure:
CALL SYSPROC.WLM_SET_CLIENT_INFO(NULL, NULL, NULL, 'actuals', NULL);
will have both activity metrics and section actuals collected (as long as the event monitor is active).

Another option is to assign the workload based on e.g. the user ID or the name of the application from which the query is run. For example, to have all db2batch executions assigned automatically to that workload, we can do it as follows:

$ db2 "create workload actuals_mon applname('db2batch') collect activity data with details,section"


Let's stick with the assignment based on the client accounting string. I'll just add one line to our input file:
$ cat db2batch.3.in
CALL SYSPROC.WLM_SET_CLIENT_INFO(NULL, NULL, NULL, 'actuals', NULL);
--#BGBLK 2
--#PARAM '000060' '000010'
select e.empno, e.lastname, d.deptname from employee e, department d where e.workdept = d.deptno and mgrno = ? ;
--#EOBLK

and execute it:
$ db2batch -d sample -f db2batch.3.in -i complete -o r 0 p 5 -r db2batch.3.out

- this will execute our query the same way as before, but this time additional details will be captured by the event monitor. Now we can disable the event monitor:
$ db2 set event monitor actuals_mon state 0
and check what was stored in the tables listed before:
ACTIVITYMETRICS_ACTUALS_MON <- this one has the query execution metrics
ACTIVITYSTMT_ACTUALS_MON    <- this one has the executed statement text
ACTIVITYVALS_ACTUALS_MON    <- this one would have the parameter values, but our workload definition didn't include the "AND VALUES" clause
ACTIVITY_ACTUALS_MON        <- this one has the "actuals" we wanted to capture.

 

The unique key for all those tables consists of 3 columns: APPL_ID, UOW_ID and ACTIVITY_ID, so to match the details for the same activity we should join on all 3 of them. The best way to confirm that actuals were captured properly is to check whether the SECTION_ACTUALS BLOB was actually stored in our table (ACTIVITY_ACTUALS_MON):
$ db2 "SELECT a.time_completed,
       Substr(appl_name, 1, 20)   appl_name,
       Substr(a.appl_id, 1, 28)   appl_id,
       a.uow_id,
       a.activity_id,
       Length(a.section_actuals)  act_len,
       Substr(s.stmt_text, 1, 50) stmt
FROM   activity_actuals_mon a,
       activitystmt_actuals_mon s
WHERE  a.appl_id = s.appl_id
       AND a.uow_id = s.uow_id
       AND a.activity_id = s.activity_id"

TIME_COMPLETED   APPL_NAME APPL_ID                     UOW_ID ACTIVITY_ID ACT_LEN  STMT                             
---------------- --------- --------------------------- ------ ----------- -------- ---------------------------------
2016-08-19-09.14 db2bp     *LOCAL.db2v105.160818174004     27           1        0 VALUES CURRENT SQLID INTO :H00133
2016-08-19-09.14 db2bp     *LOCAL.db2v105.160818174004     28           1        0 CALL WLM_SET_CONN_ENV(?,?)       
2016-08-19-09.14 db2bp     *LOCAL.db2v105.160818174004     30           1        0 select e.empno, e.lastname, d.dep
2016-08-19-09.40 db2bp     *LOCAL.db2v105.160819073936      4           1        0 select e.empno, e.lastname, d.dep
2016-08-19-12.42 db2batch  *LOCAL.db2v105.160819104221      2           1     2480 select e.empno, e.lastname, d.dep
2016-08-19-12.42 db2batch  *LOCAL.db2v105.160819104221      3           1     2480 select e.empno, e.lastname, d.dep

- here we can see that the first 4 activities stored in our table (run from the DB2 CLP) do not have actuals stored (length = 0), but for the runs from db2batch the data is there.


To generate an explain that includes actuals, EXPLAIN_FROM_ACTIVITY should be used; we need to pass the APPL_ID, UOW_ID, ACTIVITY_ID and the name of the event monitor, e.g.:
$ db2 "CALL EXPLAIN_FROM_ACTIVITY( '*LOCAL.db2v105.160819104221', 3, 1, 'ACTUALS_MON', '', ?, ?, ?, ?, ? )"
$ db2exfmt -d sample -1 -t
<cut>
Explain level:    Explain from section

Access Plan:
-----------
        Total Cost:             13.6284
        Query Degree:           0

                           Rows
                        Rows Actual
                          RETURN
                          (   1)
                           Cost
                            I/O
                            |
                             3
                             5
                          NLJOIN
                          (   2)
                          13.6284
                            NA
                /-----------+------------\
               1                            3
               1                            5
            FETCH                        FETCH
            (   3)                       (   5)
            6.81343                      6.81494
              NA                           NA
         /----+----\                  /----+----\
        1            14              3            42
        1            NA              5            NA
     IXSCAN    TABLE: DB2V105     IXSCAN    TABLE: DB2V105
     (   4)      DEPARTMENT       (   6)       EMPLOYEE
   0.00633371        Q1         0.00758325        Q2
       NA                           NA
       |                            |
       14                           42
       NA                           NA
 INDEX: DB2V105               INDEX: DB2V105
     XDEPT2                        XEMP2
       Q1                           Q2

- the generated explain contains both the estimates the optimizer used to pick the plan and the actual number of rows observed during execution. For example, we can see that IXSCAN (6) was expected to return 3 rows, but the actual number of rows returned was 5. In this example that is perfectly fine, but if the numbers differed by orders of magnitude, further analysis would be required: quite likely additional indexes or statistical views are needed to help the optimizer find a better access plan.
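To make the "orders of magnitude" rule of thumb concrete, here is a small Python sketch. The helper name and the threshold of one order of magnitude are my own choices, and the estimate/actual pairs are typed in from the plan above rather than parsed from it:

```python
import math

def misestimation(estimated: float, actual: float) -> float:
    """Return the mis-estimation factor in orders of magnitude
    (log10 of the larger/smaller row-count ratio); 0 means a perfect
    estimate. Row counts below 1 are clamped to 1 to keep the ratio
    defined for empty results."""
    est = max(estimated, 1.0)
    act = max(actual, 1.0)
    return abs(math.log10(act / est))

# Estimated vs actual rows per operator, typed in from the explain above.
operators = {
    "NLJOIN(2)": (3, 5),
    "FETCH(3)":  (1, 1),
    "FETCH(5)":  (3, 5),
}

for op, (est, act) in operators.items():
    factor = misestimation(est, act)
    # Flag anything off by an order of magnitude (10x) or more.
    flag = "  <-- investigate" if factor >= 1.0 else ""
    print(f"{op:10s} estimated={est:>4} actual={act:>4}{flag}")
```

For the plan above nothing gets flagged, matching the conclusion that these small discrepancies are fine; a 3-estimated vs 20000-actual operator would be flagged immediately.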

The ACTIVITYMETRICS_ACTUALS_MON table can be used to query various metrics collected during this specific execution (not the aggregates that e.g. snapshots or MON_GET_PKG_CACHE_STMT provide). E.g. to get the number of rows read and the time spent reading the data, we can use the following query:
$ db2 "select rows_read, pool_read_time, pool_index_l_reads from ACTIVITYMETRICS_ACTUALS_MON where appl_id = '*LOCAL.db2v105.160819104221' and uow_id = 3 and activity_id = 1"

ROWS_READ            POOL_READ_TIME       POOL_INDEX_L_READS  
-------------------- -------------------- --------------------
                   6                    0                    3

- not that useful for such a simple query, as the pool read time must have been below 1 ms, but it should help determine where the time is spent for more complex queries.

P.S. If we do not need parameter markers, section actuals for a specific query can easily be obtained using the db2caem tool; example usage can be found in the DB2 Knowledge Center: http://www.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0057282.html

