IBM Support

IC82237: HIGH SYS CPU, PERFORMANCE IMPACT, LATCHING ISSUES WHEN USING DB2 V97 NEW DEADLOCK EVENT MONITORING WITH HISTORY

Subscribe

You can track all active APARs for this component.

 

APAR status

  • Closed as program error.

Error description

  • When new deadlock event monitoring is used with history on busy
    systems, it may cause high SYS CPU, latching and performance
    impact to running transactions.
    
    These are the steps used to enable new deadlock event
    monitoring:
    
    1. db2 connect to <dbname>
    
    2. db2 update db cfg using mon_deadlock HISTORY
    
    3. db2 update db cfg using MON_LCK_MSG_LVL 2
    
    4. db2 "create event monitor <eventmon_name> for locking write
    to unformatted event table (table <schema_name>.<table_name> in
    <tablespace_name> PCTDEACTIVATE 90)"
    
    5. db2 set event monitor  <eventmon_name> STATE 1
    
    Stack dumps collected during the issue look like this:
    
    sqloSpinLockReleaseConflict + 0xBC
    
    sqloSpinLockReleaseConflict@glue74 + 0x78
    
    releaseLatch__12SMemBasePoolFv + 0xC
    
    sqlofmblkEx + 0x2A0
    
    sqlrr_cleanup_past_activity__FPP19sqlrr_past_activity@glue1381 +
    0x1D8
    sqlrr_cleanup_past_activity_list__FP8sqlrr_cb + 0x70
    
    sqlrr_cleanup_tran_before_DPS__FP8sqlrr_cbiN62PiT9b +
    
    sqlrrcom__FP8sqlrr_cbiT2 + 0x374
    
    sqlrr_commit__FP14db2UCinterface + 0xC4
    
    sqljs_ddm_rdbcmm__FP14db2UCinterfaceP13sqljDDMObject
    
    
    
    thread_waitlock@glue833 + 0x98
    sqloXlatchConflict + 0x374
    sqloXlatchConflict@glue1A2 + 0x78
    captureLatch__12SMemBasePoolFv + 0x54
    sqlogmblkEx + 0x270
    sqlrr_activity_allocate_monheap__FP8sqlrr_cbiPPv +
    sqlriGetStmtHistoryInputData__FP8sqlrr_cbbT2 + 0x2F0
    sqlriGetStmtHistoryInputData__FP8sqlrr_cbbT2@glueDE4
    sqlrr_process_execute_request__FP8sqlrr_cbi + 0x254
    sqlrr_execute__FP14db2UCinterfaceP9UCstpInfo + 0x218
    executeSection__10pvmPackageFP5sqlcaUib + 0x83C
    executeQuery__3PVMFUib + 0x1A8
    run__3PVMFv + 0xD1C
    pvm_entry + 0x464
    
    
    db2pd -latches output will show waiters for the following latch:
    
    SQLO_LT_SMemPool__MemLatchType__latch
    
    vmstats output:
    
    
    
     kthr          memory                         page
    
    faults           cpu       time
    
    ------- ---------------------
    ------------------------------------
    ------------------ ----------- --------
    
      r   b        avm               fre             re    pi    po
    fr
    sr    cy    in     sy    cs                  us sy id wa hr mi
    se
      5   0   44227427     232329     0     0     0     0      0
    0 19136
    447226 146613 27 70  1  1 14:54:32
    
      3   0   44237843     220164     0     0     0     0      0
    0 19642
    637742 150012 30 69  0  1 14:54:34
    
      3   0   44224064     231840     0     0     0     0      0
    0 19592
    429623 149365 28 70  1  1 14:54:36
    
      2   0   44229123     225170     0     0     0     0      0
    0 19951
    392838 145386 26 73  0  1 14:54:38
    
      1   0   44236053     217649     0     0     0     0      0
    0 18337
    370967 143991 25 73  1  1 14:54:40
    
      2   0   44224091     227491     0     0     0     0      0
    0 20176
    434307 145266 30 69  0  1 14:54:42
    
      2   0   44229157     220808     0     0     0     0      0
    0 19545
    441719 142319 27 71  1  1 14:54:44
    
      2   0   44224866     223328     0     0     0     0      0
    0 19633
    406209 144418 28 71  0  0 14:54:46
    
      3   0   44234431     220387     0     0     0  4357   7462
    0 19146
    410787 144249 26 73  0  1 14:54:48
    
      1   0   44229943     224622     0     0     0     0      0
    0 18742
    407306 134669 24 75  1  1 14:54:50
    
      1   0   44225729     227109     0     0     0     0      0
    0 18179
    530986 139523 25 74  0  0 14:54:52
    
      4   0   44226324     224252     0     0     0     0      0
    0 19299
    452771 141628 29 70  0  0 14:54:54
    
      5   0   44230404     218497     0     0     0     0      0
    0 18960
    434550 141010 28 71  1  1 14:54:56
    
      3   0   44230367     216868     0     0     0     0      0
    0 18586
    404354 145387 25 73  1  1 14:54:58
    
      2   0   44221461     228618     0     0     0  2501   3895
    0 19523
    449963 149634 28 70  1  1 14:55:00
    
      1   0   44235237     213978     0     0     0     0      0
    0 18594
    413282 142301 26 72  1  1 14:55:02
    
      4   0   44226901     221169     0     0     0     0      0
    0 19663
    392744 145358 28 71  0  1 14:55:04
    
      3   0   44224207     222234     0     0     0     0      0
    0 18897
    436398 151382 27 70  1  2 14:55:06
    
      1   0   44226700     216937     0     0     0     0      0
    0 18513
    332716 137340 26 72  1  1 14:55:08
    
      2   0   44242060     198595     0     0     0     0      0
    0 18739
    342144 142550 25 73  1  1 14:55:10
    
    and tprof ouput will show the following code paths as high CPU
    consumers:
    
                         --------------------
    
     Count  Total Time  % sys Avg Time Min Time Max Time  SVC
    (Address)
                (msec)   time   (msec)   (msec)   (msec)
    
    ====== =========== ====== ======== ======== ========
    ================
    349058 130109.4667 39.98%   0.3727   0.0007   1.4563
    thread_unlock(2.
    410691  99536.5384 30.59%   0.2424   0.0013   1.4235
    thread_waitlock.
     61949   6575.3293  2.02%   0.1061   0.0050   2.3843
    kpread(2b4e510)
    334922   2179.7238  0.67%   0.0065   0.0000   0.4175
    kwrite(2b4e4e0)
       825   1108.8897  0.34%   1.3441   0.0186   5.1333
    execve(2b43e60)
    

Local fix

  • As a temporary workaround, collect deadlock event monitoring
    without history.
    
    This can be accomplished by updating db cfg mon_deadlock with
    WITHOUT_HIST:
    
    db2 update db cfg using mon_deadlock WITHOUT_HIST
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All users on DB2 V97 Fp6 and ealrier                         *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to DB2 V97 Fp7                                       *
    ****************************************************************
    

Problem conclusion

  • First Fixed in DB2 V97 FP7
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC82237

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    970

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-03-23

  • Closed date

    2012-10-30

  • Last modified date

    2012-10-30

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IC88619

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • R970 PSY

       UP

[{"Business Unit":{"code":"BU048","label":"IBM Software"},"Product":{"code":"SSEPGG","label":"DB2 for Linux, UNIX and Windows"},"Component":"","ARM Category":[],"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"9.7","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
30 October 2012