IBM Support

Why does event monitor for statistics only work for workload at position 1 and not for all the other workloads ?

Technical Blog Post


Abstract

Why does event monitor for statistics only work for workload at position 1 and not for all the other workloads ?

Body

Here is an example of the problem description:

 

Create Workload:
--

db2 "SET WORKLOAD TO SYSDEFAULTADMWORKLOAD"  
db2 "CREATE WORKLOAD WL_A SESSION_USER('DB2V10')"
db2 "CREATE WORKLOAD WL_B SESSION_USER('JKGERA')" 
db2 "ALTER WORKLOAD WL_A COLLECT AGGREGATE ACTIVITY DATA BASE"  
db2 "ALTER WORKLOAD WL_B COLLECT AGGREGATE ACTIVITY DATA BASE"  
db2 "GRANT USAGE ON WORKLOAD WL_A TO PUBLIC"  
db2 "GRANT USAGE ON WORKLOAD WL_B TO PUBLIC"  
db2 "SET WORKLOAD TO AUTOMATIC" 


Create an event monitor :
---

db2 "create event monitor mon_wlm_statistics 
    for statistics  
    write to table  
        qstats    (table db.mon_statistics_qstats_pmr    in userspace1),  
        scmetrics (table db.mon_statistics_scmetrics_pmr in userspace1),  
        wcstats   (table db.mon_statistics_wcstats_pmr   in userspace1),  
        wlmetrics (table db.mon_statistics_wlmetrics_pmr in userspace1),  
        scstats   (table db.mon_statistics_scstats_pmr   in userspace1),  
        wlstats   (table db.mon_statistics_wlstats_pmr   in userspace1),  
        control   (table db.mon_statistics_control_pmr   in userspace1)  
    nonblocked autostart" 

db2 "set event monitor  mon_wlm_statistics state 1"
db2 "UPDATE DATABASE CONFIGURATION USING WLM_COLLECT_INT 5 IMMEDIATE" 


Run workload:
--
As both users 'DB2V10' and 'JKGERA' running:

db2 "select count(*) from syscat.tables, syscat.tables, syscat.tables"

1
-----------
  267089984

  1 record(s) selected.


We can see all the sessions that are available:


$ db2 "select substr(workload_name,1,10) as workload_name, system_auth_id from table (wlm_get_service_class_workload_occurrences('', '',-2)) as scinfo"

WORKLOAD_NAME SYSTEM_AUTH_ID
------------- -----------------
WL_A          DB2V10
WL_A          DB2V10
WL_B          JKGERA



But when we query the event monitor, only the workload at the lowest position shows up:

$ db2 "select statistics_timestamp, substr(workload_name,1,10) as workload_name, total_cpu_time from db.mon_statistics_wlstats_pmr order by statistics_timestamp desc"

STATISTICS_TIMESTAMP       WORKLOAD_NAME TOTAL_CPU_TIME
-------------------------- ------------- --------------------
2016-02-15-15.50.00.949172 WL_A                        128297

  1 record(s) selected.

>><<

$ db2 "select statistics_timestamp, substr(workload_name,1,10) as workload_name, total_cpu_time from db.mon_statistics_wlstats_pmr order by statistics_timestamp desc"

STATISTICS_TIMESTAMP       WORKLOAD_NAME TOTAL_CPU_TIME
-------------------------- ------------- --------------------
2016-02-15-15.55.01.023037 WL_A                     202158290
2016-02-15-15.50.00.949172 WL_A                        128297

  2 record(s) selected.


Now if we alter the position for WL_B to 1:

$ db2 "alter workload WL_B position at 1"
$ db2 "select statistics_timestamp, substr(workload_name,1,10) as workload_name, total_cpu_time from db.mon_statistics_wlstats_pmr order by statistics_timestamp desc"

STATISTICS_TIMESTAMP       WORKLOAD_NAME TOTAL_CPU_TIME
-------------------------- ------------- --------------------
2016-02-15-16.00.00.077792 WL_B                      76213905
2016-02-15-15.55.01.023037 WL_A                     202158290
2016-02-15-15.50.00.949172 WL_A                        128297


If I alter the Positions as WL_A as 10 and WL_B as 20:


db2 "alter workload WL_A position at 10"
DB20000I  The SQL command completed successfully.

db2 "alter workload WL_B position at 20"
DB20000I  The SQL command completed successfully.


The event monitor only lists the one at the lower position:

$ db2 "select statistics_timestamp, substr(workload_name,1,10) as workload_name, total_cpu_time from db.mon_statistics_wlstats_pmr order by statistics_timestamp desc"

STATISTICS_TIMESTAMP       WORKLOAD_NAME TOTAL_CPU_TIME
-------------------------- ------------- --------------------
2016-02-15-16.05.00.133662 WL_A                         16678
2016-02-15-16.00.00.077792 WL_B                      76213905
2016-02-15-15.55.01.023037 WL_A                     202158290
2016-02-15-15.50.00.949172 WL_A                        128297

 

=====

 

This is because the event monitor is created with "nonblocked" option which can loss record if the buffer is not big enough to hold all the records.

db2 "create event monitor mon_wlm_statistics
    for statistics 
    write to table 
        qstats    (table db.mon_statistics_qstats_pmr    in userspace1), 
        scmetrics (table db.mon_statistics_scmetrics_pmr in userspace1), 
        wcstats   (table db.mon_statistics_wcstats_pmr   in userspace1), 
        wlmetrics (table db.mon_statistics_wlmetrics_pmr in userspace1), 
        scstats   (table db.mon_statistics_scstats_pmr   in userspace1), 
        wlstats   (table db.mon_statistics_wlstats_pmr   in userspace1), 
        control   (table db.mon_statistics_control_pmr   in userspace1) 
    nonblocked autostart"

 

If you look at the event monitor control  table, you should see the  overflows message in it.


$ db2 "select * from  db.mon_statistics_control_pmr"

PARTITION_KEY EVENT_MONITOR_NAME                                                                                                               MESSAGE                                                                                                MESSAGE_TIME               PARTITION_NUMBER
------------- -------------------------------------------------------------------------------------------------------------------------------- ---------------------------------- ---------------------------------------------------------------------------------------------- -------------------------- ----------------
            0 MON_WLM_STATISTICS                                                                                                               FIRST_CONNECT                                                                                                2016-02-25-15.26.47.388091                0
            0 MON_WLM_STATISTICS                                                                                                               EVMON_START                                                                                                2016-02-25-15.33.26.881491                0
            0 MON_WLM_STATISTICS                                                                                                               OVERFLOWS:129                                                                                                2016-02-25-15.35.00.226871                0

 


In v105, the statistics event monitor was changed to use fast writer  to write out the monitor record, which is always blocking so the "nonblocked" option will be ignore in v10.5

 

Changing the event monitor to 'blocked" will resolve the issue and we should be able to get all the records.

 

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

UID

ibm11140874