IBM Support

IC69276: DB2 V9.5, V9.7 MAY GET SIGSEGV WHEN HITTING TIMING HOLE BETWEEN CONCURRENT CONNECTION REQUESTS


APAR status

  • Closed as program error.

Error description

  • A SIGSEGV may occur because of a timing issue when concurrent
    connection requests attempt to acquire and release latches.
    The issue is caused by a timing hole between a connection that
    is terminating and a new connection that is coming in. It can
    also occur when attempting a quiesce database operation.
    
    This impacts users on DB2 Version 9.5 and 9.7.
    
    One way to determine whether you have encountered this issue is
    to search db2diag.log for a latch conflict message that contains
    both "unlocking an unlatched lock" and "NO_IDENTITY". See the
    sample output below.
    
    2010-04-01-20.59.42.595251+000 I5796499244A1497   LEVEL: Severe
    PID     : 111111               TID  : 23138       PROC : db2sysc 0
    INSTANCE: db2inst1             NODE : 000         DB   : DBROQC6
    APPHDL  : 0-55171              APPID: 10.1.1.1.51364.10041820575
    AUTHID  : db2inst1
    EDUID   : 23138                EDUNAME: db2agent (idle) 0
    FUNCTION: DB2 UDB, SQO Latch Tracing, sqlo_xlatch::releaseConflict, probe:10
    DATA #1 : String, 27 bytes
    unlocking an unlatched lock
    DATA #2 : Pointer, 8 bytes
    0x0780000001f766e8
    DATA #3 : String, 86 bytes
    {
       lock          = { 0x00000000 [ unlocked ] }
       identity      = NO_IDENTITY (0)
    }
    DATA #4 : Hexdump, 8 bytes
    0x0780000001F766E8 : 0000 0000 0001 0000    ........
    CALLSTCK:
      [0] 0x09000000081F2634 pdLog@glue32E + 0x128
      [1] 0x090000000678B57C sqloSpinLockReleaseConflict + 0x174
      [2] 0x0900000006AF5500 sqloSpinLockReleaseConflict@glue7A + 0x74
      [3] 0x0900000007977D48 TermDbConnect__16sqeLocalDatabaseFP8sqeAgentP5sqlcai + 0x238
      [4] 0x09000000072563D4 HandleStartUsingError__14sqeApplicationFP8sqeAgentP8SQLE_BWAP5sqlcaP22SQLESRSU_STATUS_VECTOR + 0x384
      [5] 0x0900000007979B70 AppStartUsing__14sqeApplicationFP8SQLE_BWAP8sqeAgentcT3P5sqlcaPc + 0xFC
      [6] 0x090000000697425C AppLocalStart__14sqeApplicationFP14db2UCinterface + 0x390
      [7] 0x0900000006A6FCF0 sqlelostWrp__FP14db2UCinterface + 0x34
      [8] 0x0900000006A6FD6C sqleUCengnInit__FP14db2UCinterfaceUs + 0x24
      [9] 0x090000000693AF28 sqleUCagentConnect + 0x27C
    
    The stack trace for the failing thread looks similar to:
    
    <StackTrace>
    -------Frame------ ------Function + Offset------
    0x090000000062B81C pthread_kill + 0x88
    0x090000000612F52C sqloDumpEDU + 0x48
    0x0900000006573974 sqle_panic__Fv + 0x4C
    0x09000000064F9E0C sqle_panic__Fv@glue4F1 + 0x74
    0x090000000678B584 sqloSpinLockReleaseConflict + 0x17C
    0x0900000006AF5500 sqloSpinLockReleaseConflict@glue7A + 0x74
    0x0900000007977D48 TermDbConnect__16sqeLocalDatabaseFP8sqeAgentP5sqlcai + 0x238
    0x09000000072563D4 ...........
    </StackTrace>
    
    
    Additional note:
    In another, rarer scenario with this timing issue, the lock
    identity may already be defined (instead of "NO_IDENTITY").
    The db2diag.log message then contains the following keywords:
    "unlocking an unlatched lock" and
    "sqeLocalDatabase::dblatch (1)"
    
    FUNCTION: DB2 UDB, SQO Latch Tracing, sqlo_xlatch::releaseConflict, probe:10
    DATA #1 : String, 27 bytes
    unlocking an unlatched lock
    DATA #2 : Pointer, 8 bytes
    0x07800000014966e8
    DATA #3 : String, 100 bytes
    {
       lock          = { 0x00000000 [ unlocked ] }
       identity      = sqeLocalDatabase::dblatch (1)
    }
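
The db2diag.log check described above can be automated. The following is a minimal sketch, not an IBM-provided tool. It assumes that every db2diag.log record begins with a timestamp line in the format shown in the sample output, and it reports records that contain "unlocking an unlatched lock" together with either "NO_IDENTITY" or "sqeLocalDatabase::dblatch". The function names and the script name are hypothetical.

```python
import re
import sys

# Assumption: each db2diag.log record starts with a timestamp line such as
# "2010-04-01-20.59.42.595251+000 ...", as in the sample output.
RECORD_START = re.compile(r"^\s*\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}")

MESSAGE = "unlocking an unlatched lock"
IDENTITIES = ("NO_IDENTITY", "sqeLocalDatabase::dblatch")


def split_records(lines):
    """Group log lines into records, starting a new one at each timestamp line."""
    record = []
    for line in lines:
        if RECORD_START.match(line) and record:
            yield "".join(record)
            record = []
        record.append(line)
    if record:
        yield "".join(record)


def find_suspect_records(lines):
    """Return records with the latch message and one of the two identities."""
    return [
        rec for rec in split_records(lines)
        if MESSAGE in rec and any(ident in rec for ident in IDENTITIES)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name):
    #   python find_ic69276.py /path/to/db2diag.log
    with open(sys.argv[1], errors="replace") as log:
        hits = find_suspect_records(log)
    for rec in hits:
        print(rec.rstrip())
        print("-" * 60)
    print(f"{len(hits)} matching record(s) found")
```

A match only indicates that the log contains the signature described in this APAR. Compare the call stack of each reported record with the samples above before concluding that this issue was hit.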
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * Refer to Error description                                   *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to v9.7 FP3                                          *
    ****************************************************************
    

Problem conclusion

  • Refer to local fix
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC69276

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    970

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2010-06-16

  • Closed date

    2010-09-30

  • Last modified date

    2010-09-30

  • APAR is sysrouted FROM one or more of the following:

    IC69252

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • R950 PSY

       UP

  • R970 PSY

       UP


Document Information

Modified date:
30 September 2010