IBM Support

IT26163: IN DPF, DB2 MAY CRASH WHEN TRYING TO UNLATCH A LATCH THAT IS NOT HELD

APAR status

  • Closed as program error.

Error description

  • The db2diag.log file reports the following errors:
    
    2018-07-31-16.43.57.183345+480 I333558187A369    LEVEL: Severe
    PID    : 15139118            TID : 1543       PROC : db2sysc 60
    INSTANCE: db2inst1            NODE : 060
    EDUID  : 1543                EDUNAME: db2fcmr 60
    FUNCTION: DB2 UDB, fast comm manager, sqkfChannel::DeliverInboundBuffer, probe:15
    MESSAGE : Invalid Sequence No. Detected = 1. Expected No. = 2
    
    2018-07-31-16.43.57.183328+480 I333558557A1565   LEVEL: Severe
    PID    : 42009450            TID : 1543       PROC : db2sysc 59
    INSTANCE: db2inst1            NODE : 059
    EDUID  : 1543                EDUNAME: db2fcmr 59
    FUNCTION: DB2 UDB, SQO Latch Tracing, sqlo_xlatch::releaseConflict, probe:10
    DATA #1 : String, 27 bytes
    unlocking an unlatched lock
    DATA #2 : Pointer, 8 bytes
    0x07800000b0dd4ca0
    DATA #3 : String, 117 bytes
    {
      lock         = { 0x00000000 [ unlocked ] }
      identity     = sqkfRecvLockManager::sqkfRecvLockManager (103)
    }
    DATA #4 : Hexdump, 8 bytes
    0x07800000B0DD4CA0 : 0000 0000 0067 0000    .....g..
    CALLSTCK: (Static functions may not be resolved correctly, as they are resolved to the nearest symbol)
     [0] 0x090000001EB05C6C pdLog + 0xF8
     [1] 0x090000001B0A8904 pdLog@glue415 + 0x12C
     [2] 0x090000001B05321C sqloSpinLockReleaseConflict + 0x5C
     [3] 0x090000001E9162AC sqloSpinLockReleaseConflict@glue73 + 0x78
     [4] 0x090000001BE64F34 DeliverInboundBuffer__11sqkfChannelFP10sqkfBufferP17SQLKF_SESSION_HDLP18SQLZ_PDB_UNIQUE_IDP15sql_static_data + 0x4A4
     [5] 0x090000001B638BEC DeliverBufferToTargetChannel__19sqkfFastCommManagerFP10sqkfBufferiN2217SQLKF_CHANNEL_PRIP17SQLKF_SESSION_HDLP15sqkfSendConduit + 0x84
     [6] 0x090000001AA59B78 RouteInboundBuffer__19sqkfFastCommManagerFRP10sqkfBufferP17SQLKF_SESSION_HDLiT3 + 0x6A0
     [7] 0x090000001BAC2D98 HandleDataEvent__15sqkfRecvConduitFUl + 0xCA0
     [8] 0x090000001BAC1BC0 RunEDU__15sqkfRecvConduitFv + 0xEC0
     [9] 0x090000001E82D9BC EDUDriver__9sqzEDUObjFv + 0xE4
    
    In certain cases, one FastCommManager claims a channel and
    places it in its receive table while the channel still points
    to another FastCommManager. A possible cause is that the
    channel was reused before it was actually closed. The problem
    may also be accompanied by a double free of a channel.
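
To check whether an existing db2diag.log contains this signature, the records can be scanned for the latch message shown above. The following is a minimal sketch, not an IBM tool: the function names `split_records` and `find_unlatch_records` are hypothetical, and it assumes that every record starts with a timestamp header line in the format shown in the excerpt.

```python
import re
from typing import Iterable, Iterator, List

# Header line of a db2diag.log record, for example:
# "2018-07-31-16.43.57.183328+480 I333558557A1565   LEVEL: Severe"
# Assumption: every record begins with a timestamp in this format.
RECORD_START = re.compile(
    r"^\s*\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d+\s"
)

# Both strings appear in the failing record quoted in this APAR.
SIGNATURES = ("unlocking an unlatched lock", "sqkfRecvLockManager")


def split_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group log lines into records; blank lines are skipped."""
    record: List[str] = []
    for line in lines:
        if not line.strip():
            continue
        if RECORD_START.match(line) and record:
            yield record
            record = []
        record.append(line.rstrip("\n"))
    if record:
        yield record


def find_unlatch_records(lines: Iterable[str]) -> List[List[str]]:
    """Return the records that contain every string in SIGNATURES."""
    matches = []
    for record in split_records(lines):
        text = "\n".join(record)
        if all(signature in text for signature in SIGNATURES):
            matches.append(record)
    return matches
```

A typical use would be to open the log with `open(path, errors="replace")` and pass the file object to `find_unlatch_records`; the first line of each returned record carries the timestamp, and the NODE field identifies the affected partition.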
    

Local fix

  • Disable shared-memory communication between logical nodes
    (MLN) by setting the registry variable DB2_FORCE_FCM_BP to NO.
    With this setting, FCM resources are created per logical node
    and are not shared with other nodes on the same host.
    db2set DB2_FORCE_FCM_BP=NO
    Then recycle the instance (db2stop, then db2start) so that the
    change takes effect.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * all                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to db2_v111m4fp6 or later                            *
    ****************************************************************
    

Problem conclusion

  • The problem is fixed in db2_v111m4fp6 (Version 11.1 Mod Pack 4
    Fix Pack 6). Upgrade to that level or later.
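
When auditing several instances, the recommendation above can be turned into a simple level comparison. The following is a rough sketch under stated assumptions: the helper names `parse_level` and `has_fix` are hypothetical, and the code assumes level strings always follow the `db2_v<release>m<mod>fp<fixpack>` pattern used in this APAR. Levels from a different release stream are reported as unknown rather than guessed.

```python
import re
from typing import Optional, Tuple

# Assumed pattern, modelled on the "db2_v111m4fp6" string in this APAR.
LEVEL_PATTERN = re.compile(r"^db2_v(\d+)m(\d+)fp(\d+)$")

# First level that contains the fix, as stated in the APAR.
FIXED_IN = "db2_v111m4fp6"


def parse_level(level: str) -> Optional[Tuple[int, int, int]]:
    """Split a level string into (release, mod, fixpack), or None."""
    match = LEVEL_PATTERN.match(level.strip())
    if match is None:
        return None
    release, mod, fixpack = (int(group) for group in match.groups())
    return (release, mod, fixpack)


def has_fix(level: str) -> Optional[bool]:
    """True or False within the v111 stream; None when undetermined."""
    parsed = parse_level(level)
    fixed = parse_level(FIXED_IN)
    if parsed is None or fixed is None or parsed[0] != fixed[0]:
        return None
    # Compare (mod, fixpack) pairs in order.
    return parsed[1:] >= fixed[1:]
```

For example, `has_fix("db2_v111m4fp5")` evaluates to `False` and `has_fix("db2_v111m4fp6")` to `True`. A result of `None` means the level must be checked manually against the fix list for that release.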
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT26163

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    970

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-09-03

  • Closed date

    2021-03-15

  • Last modified date

    2021-03-15

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • RB10 PSN UP

Document Information

Modified date:
04 May 2022