IBM Support

DB2: How to troubleshoot a query with parameter markers using section actuals?

Technical Blog Post


Abstract

DB2: How to troubleshoot a query with parameter markers using section actuals?

Body

Quite often we face a situation where a query runs slower for some specific predicate values, but only when parameter markers are used. Analyzing the execution metrics of an application running on a live system can be inconvenient, but doing the same from the command line in an isolated connection is usually much easier.

Let's consider a simple query on the SAMPLE database:

SELECT e.empno,
       e.lastname,
       d.deptname
FROM   employee e,
       department d
WHERE  e.workdept = d.deptno
       AND mgrno = ?;


We can generate an explain for such a query with a parameter marker (saved as myquery.sql) from the command line:
$ db2 "set current explain mode explain"
$ db2 -tvf myquery.sql
SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604
$ db2exfmt -d sample -1 -t
<cut>
Access Plan:
-----------
        Total Cost:             13.6284
        Query Degree:           1

                           Rows
                          RETURN
                          (   1)
                           Cost
                            I/O
                            |
                             3
                          NLJOIN
                          (   2)
                          13.6284
                             2
                /-----------+------------\
               1                            3
            FETCH                        FETCH
            (   3)                       (   5)
            6.81343                      6.81494
               1                            1
         /----+----\                  /----+----\
        1            14              3            42
     IXSCAN    TABLE: DB2V105     IXSCAN    TABLE: DB2V105
     (   4)      DEPARTMENT       (   6)       EMPLOYEE
   0.00633371        Q1         0.00758325        Q2
        0                            0
       |                            |
       14                           42
 INDEX: DB2V105               INDEX: DB2V105
     XDEPT2                        XEMP2
       Q1                           Q2


but we cannot execute such a query from CLP.
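As an aside, the statement can of course be run from the CLP if we substitute a literal for the marker. Note, however, that the optimizer then sees the value at compile time and may pick a different access plan than for the parameter-marker version, which is often precisely the difference we are trying to investigate:

```sql
-- Runnable directly from the CLP, but compiled with the literal visible
-- to the optimizer, so the plan may differ from the parameter-marker case.
SELECT e.empno,
       e.lastname,
       d.deptname
FROM   employee e,
       department d
WHERE  e.workdept = d.deptno
       AND mgrno = '000010';
```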

 

In such a situation db2batch can be useful: using the #PARAM directive, we can pass parameter values to the query we want to execute. After the #PARAM keyword, a list of parameter values to be used for a given predicate is provided. Moreover, we can execute the same query multiple times with different parameters using the #BGBLK directive. An example will make this clearer.

To have our query executed with 2 different values of the parameter marker, we can use the following input file:
$ cat db2batch.1.in
--#BGBLK 2
--#PARAM '000060' '000010'
SELECT e.empno,
       e.lastname,
       d.deptname
FROM   employee e,
       department d
WHERE  e.workdept = d.deptno
       AND mgrno = ?;
--#EOBLK

The block is enclosed in the #BGBLK/#EOBLK directives; the number after #BGBLK, 2, says how many times we want the block to be executed. For each execution, one of the provided parameter values will be used ('000060' and '000010' for the input file above). We execute it as follows:
$ db2batch -d sample -f db2batch.1.in -i complete -o r 0 p 5 -r db2batch.1.out

gives us:
-i complete: breakdown of prepare, execute, and fetch times
-o r 0: print zero actual rows in the output (we don't really want to see them; the fetch will still happen, though)
-o p 5: collect all types of snapshots


In the output file we will have the metrics for the execution with each parameter:

$ cat db2batch.1.out

<cut>
* List of 1 supplied parameter(s):
*   '000010'

EMPNO  LASTNAME        DEPTNAME                            
------ --------------- ------------------------------------

* 5 row(s) fetched, 0 row(s) output.

* Prepare Time is:       0.000205 seconds
* Execute Time is:       0.000278 seconds
* Fetch Time is:         0.000138 seconds
* Elapsed Time is:       0.000621 seconds (complete)

* Block Iteration 2 of 2:
---------------------------------------------

* SQL Statement Number 1:

select e.empno, e.lastname, d.deptname from employee e, department d where e.workdept = d.deptno and mgrno = ? ;

* List of 1 supplied parameter(s):
*   '000060'

EMPNO  LASTNAME        DEPTNAME                            
------ --------------- ------------------------------------

* 11 row(s) fetched, 0 row(s) output.

* Prepare Time is:       0.000000 seconds
* Execute Time is:       0.000116 seconds
* Fetch Time is:         0.000066 seconds
* Elapsed Time is:       0.000182 seconds (complete)


Obviously, this query is simple and there is not much difference, but we can see, for example, that for the first execution the query had to be compiled (prepare time is non-zero), which was not the case for the second execution (the statement was already in the package cache).
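With many block iterations, comparing the timing sections by eye gets tedious. Here is a minimal Python sketch that scrapes the "* ... Time is:" lines in the format shown above; the parsing logic is my own and not part of db2batch:

```python
import re

# Matches the per-phase timing lines that db2batch prints per statement,
# in the format shown in the output above.
TIME_RE = re.compile(
    r"\*\s+(Prepare|Execute|Fetch|Elapsed) Time is:\s+([0-9.]+) seconds")

def parse_timings(report: str):
    """Return a list of dicts, one per statement execution, mapping
    phase name -> seconds. A new execution starts at each Prepare line."""
    runs = []
    for phase, secs in TIME_RE.findall(report):
        if phase == "Prepare":
            runs.append({})
        if runs:
            runs[-1][phase] = float(secs)
    return runs

# Sample input: the two iterations captured in the output file above.
sample = """
* Prepare Time is:       0.000205 seconds
* Execute Time is:       0.000278 seconds
* Fetch Time is:         0.000138 seconds
* Elapsed Time is:       0.000621 seconds (complete)

* Prepare Time is:       0.000000 seconds
* Execute Time is:       0.000116 seconds
* Fetch Time is:         0.000066 seconds
* Elapsed Time is:       0.000182 seconds (complete)
"""

for i, run in enumerate(parse_timings(sample), 1):
    cached = "package cache hit" if run["Prepare"] == 0.0 else "compiled"
    print(f"iteration {i}: {cached}, elapsed {run['Elapsed']:.6f}s")
```

A zero prepare time is used here as a simple indicator that the statement came from the package cache, mirroring the observation above.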


We can use two parameter markers in a single query by using two #PARAM directives, e.g.:

$ cat db2batch.2.in
--#BGBLK 2
--#PARAM '000060' '000010'
--#PARAM 50000 60000 70000
SELECT e.empno,
       e.lastname,
       d.deptname
FROM   employee e,
       department d
WHERE  e.workdept = d.deptno
       AND d.mgrno = ?
       AND e.salary > ?;
--#EOBLK

- for the salary predicate, one of the 3 provided values will be used. For integer and decimal parameters, ranges of input values can be generated automatically; details can be found in the db2batch documentation:
https://www.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0002043.html
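As a sketch only: `min:max` is my reading of the range syntax described in the linked db2batch documentation, and the file name db2batch.4.in is made up, so please verify the exact form there before using it. With a generated range, the salary predicate could be driven like this:

```sql
--#BGBLK 4
--#PARAM '000060' '000010'
--#PARAM 40000:90000
SELECT e.empno,
       e.lastname,
       d.deptname
FROM   employee e,
       department d
WHERE  e.workdept = d.deptno
       AND d.mgrno = ?
       AND e.salary > ?;
--#EOBLK
```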

There are also interesting examples on the blog of our colleagues from the WebSphere Commerce team:
https://www.ibm.com/developerworks/community/blogs/wcs/entry/Tips_for_troubleshooting_slow_queries_DB2?lang=en


db2batch can automatically collect snapshots (the "-o p" option), but what can we do if we need more details?

The obvious answer is event monitors, but they can be quite tricky if you have never used them before.
One of the interesting features of event monitors is the ability to collect "section actuals": the actual number of rows "flowing" through each access plan operator. This can give us a lot of insight into query execution, most importantly whether the estimates the optimizer worked with were accurate, and this is what we'll try to do here.


There are a couple of ways to set up activity details monitoring, but three things are needed:
1. Section actuals have to be enabled (the SECTION_ACTUALS database configuration parameter):
$ db2 connect to sample
$ db2 update database configuration using section_actuals base

2. An activity event monitor is needed. If you don't have one yet, you can create it as follows:
$ db2 create event monitor actuals_mon for activities write to table
(the name will be "ACTUALS_MON")
- this will create 5 new tables, where the event monitor infrastructure will store our data:
ACTIVITYMETRICS_ACTUALS_MON
ACTIVITYSTMT_ACTUALS_MON
ACTIVITYVALS_ACTUALS_MON
ACTIVITY_ACTUALS_MON
CONTROL_ACTUALS_MON

3. The monitor has to be activated to capture the data:
$ db2 set event monitor actuals_mon state 1

 

Once we have the event monitor ready, metrics and actuals collection has to be enabled, either explicitly from a given connection:
$ db2 "call wlm_set_conn_env(NULL,'<collectactdata>WITH DETAILS, SECTION</collectactdata><collectactpartition>ALL</collectactpartition>')"
or implicitly via workloads. We might not want to have it enabled for all applications running in the database (which likely fall into the default workload SYSDEFAULTUSERWORKLOAD) to avoid overhead, so a new workload should be created. To assign "our" workload (the one in which we will execute our query), I tend to use assignment based on the client accounting string:
$ db2 "create workload actuals current client_acctng('actuals') collect activity data with details,section"
$ db2 "grant usage on workload actuals to public"

From now on, all connections that have the client accounting string set to 'actuals', which can be done via the CLI ClientAcctStr keyword or by calling the proper WLM procedure:
CALL SYSPROC.WLM_SET_CLIENT_INFO(NULL, NULL, NULL, 'actuals', NULL);
will have both activity metrics and section actuals collected (as long as the event monitor is active).

Another option is to assign the workload based on e.g. the user ID or the name of the application from which the query is run. For example, to have all db2batch executions assigned automatically to that workload, we can do it as follows:

$ db2 "create workload actuals_mon applname('db2batch') collect activity data with details,section"


Let's stick with the assignment based on the client accounting string. I'll just add one line to our input file:
$ cat db2batch.3.in
CALL SYSPROC.WLM_SET_CLIENT_INFO(NULL, NULL, NULL, 'actuals', NULL);
--#BGBLK 2
--#PARAM '000060' '000010'
select e.empno, e.lastname, d.deptname from employee e, department d where e.workdept = d.deptno and mgrno = ? ;
--#EOBLK

and execute it:
$ db2batch -d sample -f db2batch.3.in -i complete -o r 0 p 5 -r db2batch.3.out

- this will execute our query the same way as before, but this time additional details will be captured by the event monitor. Now we can disable the event monitor:
$ db2 set event monitor actuals_mon state 0
and check what was stored in the tables listed before:
ACTIVITYMETRICS_ACTUALS_MON <- this one has the query execution metrics
ACTIVITYSTMT_ACTUALS_MON    <- this one has the executed statement text
ACTIVITYVALS_ACTUALS_MON    <- this one would have the parameter values, but our workload definition didn't include the "AND VALUES" clause
ACTIVITY_ACTUALS_MON        <- this one has the "actuals" we wanted to capture.

 

The unique key for all those tables consists of 3 columns: APPL_ID, UOW_ID and ACTIVITY_ID, so to match the details for the same activity we should join on all 3 of them. The best way to confirm that actuals were captured properly is to check whether the SECTION_ACTUALS BLOB was actually stored in our table (ACTIVITY_ACTUALS_MON):
$ db2 "SELECT a.time_completed,
       Substr(appl_name, 1, 20)   appl_name,
       Substr(a.appl_id, 1, 28)   appl_id,
       a.uow_id,
       a.activity_id,
       Length(a.section_actuals)  act_len,
       Substr(s.stmt_text, 1, 50) stmt
FROM   activity_actuals_mon a,
       activitystmt_actuals_mon s
WHERE  a.appl_id = s.appl_id
       AND a.uow_id = s.uow_id
       AND a.activity_id = s.activity_id"

TIME_COMPLETED   APPL_NAME APPL_ID                     UOW_ID ACTIVITY_ID ACT_LEN  STMT                             
---------------- --------- --------------------------- ------ ----------- -------- ---------------------------------
2016-08-19-09.14 db2bp     *LOCAL.db2v105.160818174004     27           1        0 VALUES CURRENT SQLID INTO :H00133
2016-08-19-09.14 db2bp     *LOCAL.db2v105.160818174004     28           1        0 CALL WLM_SET_CONN_ENV(?,?)       
2016-08-19-09.14 db2bp     *LOCAL.db2v105.160818174004     30           1        0 select e.empno, e.lastname, d.dep
2016-08-19-09.40 db2bp     *LOCAL.db2v105.160819073936      4           1        0 select e.empno, e.lastname, d.dep
2016-08-19-12.42 db2batch  *LOCAL.db2v105.160819104221      2           1     2480 select e.empno, e.lastname, d.dep
2016-08-19-12.42 db2batch  *LOCAL.db2v105.160819104221      3           1     2480 select e.empno, e.lastname, d.dep

- here we can see that the first 4 activities stored in our table (run from the DB2 CLP) do not have actuals stored (length = 0), but for the runs from db2batch the data is there.


To generate an explain that includes actuals, EXPLAIN_FROM_ACTIVITY should be used; we need to pass the APPL_ID, UOW_ID, ACTIVITY_ID and the name of the event monitor, e.g.:
$ db2 "CALL EXPLAIN_FROM_ACTIVITY( '*LOCAL.db2v105.160819104221', 3, 1, 'ACTUALS_MON', '', ?, ?, ?, ?, ? )"
$ db2exfmt -d sample -1 -t
<cut>
Explain level:    Explain from section

Access Plan:
-----------
        Total Cost:             13.6284
        Query Degree:           0

                           Rows
                        Rows Actual
                          RETURN
                          (   1)
                           Cost
                            I/O
                            |
                             3
                             5
                          NLJOIN
                          (   2)
                          13.6284
                            NA
                /-----------+------------\
               1                            3
               1                            5
            FETCH                        FETCH
            (   3)                       (   5)
            6.81343                      6.81494
              NA                           NA
         /----+----\                  /----+----\
        1            14              3            42
        1            NA              5            NA
     IXSCAN    TABLE: DB2V105     IXSCAN    TABLE: DB2V105
     (   4)      DEPARTMENT       (   6)       EMPLOYEE
   0.00633371        Q1         0.00758325        Q2
       NA                           NA
       |                            |
       14                           42
       NA                           NA
 INDEX: DB2V105               INDEX: DB2V105
     XDEPT2                        XEMP2
       Q1                           Q2

- the generated explain contains both the estimates the optimizer used to pick the plan and the actual number of rows observed during execution. For example, we can see that IXSCAN (6) was expected to return 3 rows, but the actual number of rows returned was 5. In this example that is perfectly fine, but if the numbers differed by orders of magnitude, further analysis would be required: quite likely additional indexes or statistical views are needed to help the optimizer find a better access plan.
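To make the "orders of magnitude" rule of thumb concrete, here is a small Python sketch. The helper name and the threshold of one order of magnitude are my own choices, and the estimate/actual pairs are typed in from the plan above rather than parsed from it:

```python
import math

def misestimation(estimated: float, actual: float) -> float:
    """Return the mis-estimation factor in orders of magnitude
    (log10 of the larger/smaller row-count ratio); 0 means a perfect
    estimate. Row counts below 1 are clamped to 1 to keep the ratio
    defined for empty results."""
    est = max(estimated, 1.0)
    act = max(actual, 1.0)
    return abs(math.log10(act / est))

# Estimated vs actual rows per operator, typed in from the explain above.
operators = {
    "NLJOIN(2)": (3, 5),
    "FETCH(3)":  (1, 1),
    "FETCH(5)":  (3, 5),
}

for op, (est, act) in operators.items():
    factor = misestimation(est, act)
    # Flag anything off by an order of magnitude (10x) or more.
    flag = "  <-- investigate" if factor >= 1.0 else ""
    print(f"{op:10s} estimated={est:>4} actual={act:>4}{flag}")
```

For the plan above nothing gets flagged, matching the conclusion that these small discrepancies are fine; a 3-estimated vs 20000-actual operator would be flagged immediately.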

The ACTIVITYMETRICS_ACTUALS_MON table can be used to query various metrics collected during this specific execution (not the aggregates that e.g. snapshots or MON_GET_PKG_CACHE_STMT provide). E.g. to get the number of rows read and the time spent reading the data, we can use the following query:
$ db2 "select rows_read, pool_read_time, pool_index_l_reads from ACTIVITYMETRICS_ACTUALS_MON where appl_id = '*LOCAL.db2v105.160819104221' and uow_id = 3 and activity_id = 1"

ROWS_READ            POOL_READ_TIME       POOL_INDEX_L_READS  
-------------------- -------------------- --------------------
                   6                    0                    3

- not that useful for such a simple query, as the pool read time must have been below 1 ms, but it should help determine where the time is spent for more complex queries.

P.S. If we do not need parameter markers, section actuals for a specific query can easily be obtained using the db2caem tool; example usage can be found in the DB2 Knowledge Center: http://www.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0057282.html

