IBM Support

IC90673: SNAPSHOTS UNDER OUT OF MEMORY OR LOW MEMORY CONDITIONS MAY TRIGGER A HANG IN DB2

APAR status

  • Closed as program error.

Error description

  • Under out-of-memory or low-memory conditions, a snapshot
    operation may lead to a 'dead latch' that hangs DB2. The
    'dead latch' is triggered by a 'Memory Allocation Error' in
    sqm_get_next_dbcb(), which is recorded in db2diag.log as
    follows:
    ===========================================
    2013-01-01-01.01.01.123456+789 I12345678901234    LEVEL: Error
    PID     : 12345678             TID  : 12345       PROC : db2sysc 3
    INSTANCE: rdminst              NODE : 003
    APPHDL  : 0-1234
    EDUID   : 12345                EDUNAME: db2agent (idle) 3
    FUNCTION: DB2 UDB, database monitor, sqm___sqm_get_next_dbcb, probe:60
    MESSAGE : Memory Allocation Error
    ===========================================
    
    The hang occurs because EDUs wait indefinitely on the latch
    SQLO_LT_sqeDBMgr__dbMgrLatch.
    
    For latch waiters, one of the following stack traces appears in
    the stack files:
    ===========================================
    sqloXlatchConflict
    sqloXlatchConflict
    sqm_get_next_dbcb
    sqlmonssagnt
    sqlmPdbRequestRouter
    
    Or
    
    sqloXlatchConflict
    sqloXlatchConflict
    StartUsingLocalDatabase
    AppStartUsing
    ===========================================
    
    The latch holder is stuck in the following stack trace:
    ===========================================
    sqloXlatchConflict
    sqloXlatchConflict
    sqm_get_next_dbcb
    turn_off_switches
    update_switches
    ===========================================
    In the "LatchInformation" section of the stack file, the holder
    holds SQLO_LT_sqeDBMgr__dbMgrLatch while also waiting on that
    same latch. This is a 'dead latch' situation that can only be
    resolved by killing the DB2 instance. Here is an example:
    ===========================================
    <LatchInformation>
    
    Waiting on latch type: (SQLO_LT_sqeDBMgr__dbMgrLatch) - Address:
    (780000000212121), Line: 278, File: sqlmutil.C
    
    Holding Latch type: (SQLO_LT_sqeDBMgr__dbMgrLatch) - Address:
    (780000000212121), Line: 278, File: sqlmutil.C HoldCount: 1
    </LatchInformation>
    ===========================================
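
The 'dead latch' signature described above (one EDU holding and waiting on the same latch at the same address) can be checked for mechanically. The following is a minimal Python sketch, not an IBM-supplied tool. It assumes the stack file text has already been read into a string and that its LatchInformation section looks like the excerpt above; the function name find_self_waits and the regular expressions are illustrative assumptions, and real stack files may be laid out differently.

```python
import re

# Patterns mirror the <LatchInformation> excerpt quoted in this APAR.
# Whitespace (including line wraps) is matched loosely; this layout is an
# assumption and may not hold for every DB2 level.
_SECTION = re.compile(r"<LatchInformation>(.*?)</LatchInformation>", re.S)
_WAITING = re.compile(
    r"Waiting on latch type:\s*\((\w+)\)\s*-\s*Address:\s*\((\w+)\)")
_HOLDING = re.compile(
    r"Holding Latch type:\s*\((\w+)\)\s*-\s*Address:\s*\((\w+)\)")


def find_self_waits(stack_text):
    """Return (latch_type, address) pairs that are both held and waited on
    within the same LatchInformation section."""
    hits = []
    for section in _SECTION.findall(stack_text):
        waiting = set(_WAITING.findall(section))
        holding = set(_HOLDING.findall(section))
        hits.extend(sorted(waiting & holding))
    return hits


# Sample text copied from the excerpt in this APAR.
SAMPLE = """<LatchInformation>

Waiting on latch type: (SQLO_LT_sqeDBMgr__dbMgrLatch) - Address:
(780000000212121), Line: 278, File: sqlmutil.C

Holding Latch type: (SQLO_LT_sqeDBMgr__dbMgrLatch) - Address:
(780000000212121), Line: 278, File: sqlmutil.C HoldCount: 1
</LatchInformation>
"""

print(find_self_waits(SAMPLE))
```

A non-empty result for a stack file means that EDU matches the holder pattern shown above. An empty result does not rule out the problem, since the latch holder may be a different EDU than the one whose stack file was inspected.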
    

Local fix

  • 1. Determine the cause of the 'Memory Allocation Error' and
    resolve it so that the error no longer occurs.
    2. Avoid taking snapshots under out-of-memory conditions.
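
As a starting point for step 1, the db2diag.log can be searched for records that match the signature in the Error description. The sketch below is a hedged illustration rather than a supported utility. It assumes each record begins with a timestamp of the form shown in the excerpt and that the log text is already in memory; the function name find_alloc_errors is hypothetical.

```python
import re

# A record is assumed to start with a timestamp such as
# 2013-01-01-01.01.01.123456 at the beginning of a line (leading blanks
# are tolerated because the APAR excerpt is indented).
_RECORD_START = re.compile(
    r"^[ \t]*(\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6})", re.M)


def find_alloc_errors(diag_text):
    """Return the timestamps of records that mention both
    sqm_get_next_dbcb and 'Memory Allocation Error'."""
    starts = list(_RECORD_START.finditer(diag_text))
    hits = []
    for i, match in enumerate(starts):
        # A record runs until the next timestamp line or the end of text.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(diag_text)
        record = diag_text[match.start():end]
        if "sqm_get_next_dbcb" in record and "Memory Allocation Error" in record:
            hits.append(match.group(1))
    return hits


# First record follows the excerpt in this APAR; the second is a made-up
# non-matching record used only to show that it is skipped.
DIAG_SAMPLE = """2013-01-01-01.01.01.123456+789 I12345678901234    LEVEL: Error
PID     : 12345678             TID  : 12345       PROC : db2sysc 3
FUNCTION: DB2 UDB, database monitor, sqm___sqm_get_next_dbcb, probe:60
MESSAGE : Memory Allocation Error

2013-01-01-01.02.03.000001+789 I12345678901235    LEVEL: Warning
FUNCTION: DB2 UDB, hypothetical component, hypothetical_function, probe:10
MESSAGE : Unrelated message
"""

print(find_alloc_errors(DIAG_SAMPLE))
```

The returned timestamps show when the allocation failures occurred, which can then be correlated with memory usage on the system at those times.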
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All users of version 9.7 on Linux, Unix and Windows          *
    * platforms.                                                   *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Update to DB2 LUW Version 9.7 Fix Pack 9 or higher levels.   *
    ****************************************************************
    

Problem conclusion

  • First fixed in DB2 LUW Version 9.7 Fix Pack 9.
    

Temporary fix

  • Determine the cause of the 'Memory Allocation Error' and
    resolve it so that the error no longer occurs.
    

Comments

APAR Information

  • APAR number

    IC90673

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    970

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2013-03-06

  • Closed date

    2014-05-09

  • Last modified date

    2014-05-09

  • APAR is sysrouted FROM one or more of the following:

    IC90657

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • R970 PSN

       UP


Document Information

Modified date:
09 May 2014