IBM Support

IC98574: DB2STOP FORCE MAY HANG ON PAGE LATCHES DUE TO AN INSTANCE TRAP WHICH DID NOT TERMINATE THE INSTANCE

APAR status

  • Closed as program error.

Error description

  • This problem is specific to DB2 on Windows operating system
    platforms.
    
    A "db2stop force" command might hang. While the instance is
    hung, running "db2fodc -db <database name> -hang full" reveals
    pool read latches that are held indefinitely:
    
     Stack:
     ======================
    
      0000000180012E26 <SQLO_SLATCH_CAS32::getConflictComplex>
    <sqloLatchCAS32.C:718>
      00000001800128D7 <SQLO_SLATCH_CAS32::getConflict>
    <sqloLatchCAS32.C:1005>
      0000000001D66504 <sqlbFindPageInBPOrSim>
    <E:\db2_v97fp6\ntx64\s120629\engn\sqb\inc\sqlbslat.h:720>
      0000000001D58FE4 <sqlbPurgeOrFlushAllPagesInSmallRange>
    <sqlbbuffers.C:4071>
      0000000001D57C7E <sqlbPurgeObject> <sqlbbuffers.C:5514>
      0000000001D430FE <sqlbSMSDeleteObject> <sqlbfiles.C:2063>
      0000000001EF12CA <sqldDropObj> <sqldmdrp.C:1262>
      0000000001EF0975 <sqldDropTable> <sqldmdrp.C:946>
      0000000001DD24D7 <sqlbPFPrefetcherEntryPoint>
    <sqlbpfchr.C:1961>
      0000000001DD1AFD <sqbPrefetcherEdu::RunEDU> <sqlbpfchr.C:7699>
      0000000003AA32C5 <sqlzRunEDU> <sqlz_edu_obj.C:35>
      00000001800E69FE <sqloEDUEntry> <sqloedu.C:3454>
    
    
     Summary:
     Found in 3 stacks of a total of 46 stacks ( 6.52% ) in 1 files
     Found in:
            92644.019.866.stack1.txt -- Pid:3660 Tid:7032 --
    2013-12-09-09.41.30.813000
            92644.019.866.stack1.txt -- Pid:3660 Tid:6072 --
    2013-12-09-09.41.30.860000
            92644.019.866.stack1.txt -- Pid:3660 Tid:3788 --
    2013-12-09-09.41.30.860000
    
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ~~~~~~~~
    
    
    The output also lists the latches being held:
    Holding latch type: (SQLO_LT_SQLB_POOL_CB__readLatch) - Address:
    (000000001DD5F868), Line: 2455, File: sqlbpacc.C HoldCount: 9
    Holding latch type: (SQLO_LT_SQLB_POOL_CB__readLatch) - Address:
    (000000001DD5F868), Line: 2455, File: sqlbpacc.C HoldCount: 18
    Holding latch type: (SQLO_LT_SQLB_POOL_CB__readLatch) - Address:
    (000000001DD5F868), Line: 2455, File: sqlbpacc.C HoldCount: 19
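
    When many stack files have to be checked, the "Holding latch type" entries can be pulled out programmatically. The following is a minimal Python sketch, assuming the entries have exactly the layout shown above (possibly wrapped across two lines). The function name parse_held_latches and the dictionary keys are illustrative and are not part of any DB2 tool.

```python
import re

# Pattern for one "Holding latch type" entry, matching the layout quoted above.
LATCH_RE = re.compile(
    r"Holding latch type: \((?P<type>\w+)\) - Address: \((?P<address>[0-9A-Fa-f]+)\), "
    r"Line: (?P<line>\d+), File: (?P<file>\S+) HoldCount: (?P<hold_count>\d+)"
)


def parse_held_latches(text):
    """Return one dict per 'Holding latch type' entry found in text."""
    # Collapse line wraps so an entry split across two lines matches as one.
    flat = " ".join(text.split())
    return [
        {
            "type": m.group("type"),
            "address": m.group("address"),
            "line": int(m.group("line")),
            "file": m.group("file"),
            "hold_count": int(m.group("hold_count")),
        }
        for m in LATCH_RE.finditer(flat)
    ]


# Two of the entries quoted in this APAR, wrapped as they appear above.
sample = """
Holding latch type: (SQLO_LT_SQLB_POOL_CB__readLatch) - Address:
(000000001DD5F868), Line: 2455, File: sqlbpacc.C HoldCount: 9
Holding latch type: (SQLO_LT_SQLB_POOL_CB__readLatch) - Address:
(000000001DD5F868), Line: 2455, File: sqlbpacc.C HoldCount: 18
"""

latches = parse_held_latches(sample)
for latch in latches:
    print(latch["type"], latch["address"], latch["hold_count"])
```

    Entries that share one latch type and address, as all three do here, point at the same pool read latch being held by several EDUs.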
    
    
    The page cleaners also share a common stack and are waiting
    in the sqloLioAIOCollect function:
     Stack:
     ======================
    
      00000001800E992A <sqloWaitInterrupt> <sqloedu.C:634>
      000000018001C74F <sqloWaitIPCWaitPost> <sqlowaitpost.C:1979>
      00000001800F3A9D <SQLO_LIO_HANDLE_DATA::sqloLioAIOCollect>
    <sqlolio.C:3578>
      00000001800F3678 <sqloLioCollectNBlocks> <sqlolio.C:4631>
      000000000415F73E <sqlbClnrFindWork> <sqlbclnr_core.C:2633>
      000000000415E981 <sqlbClnrEntryPoint> <sqlbclnr_core.C:3427>
      000000000415E8BF <sqbPgClnrEdu::RunEDU> <sqlbclnr_core.C:4358>
      0000000003AA32C5 <sqlzRunEDU> <sqlz_edu_obj.C:35>
      00000001800E69FE <sqloEDUEntry> <sqloedu.C:3454>
    
    
    This hang can occur if the AIO code previously encountered a
    severe error that requires bringing down the instance to avoid
    further issues. In this case DB2 might go on processing instead
    of performing a "panic" emergency stop of the instance and
    generate a trap. The condition can be identified by the
    following message in db2diag.log:
    
    2013-12-06-14.09.28.492000+000 I1723226F516       LEVEL: Severe
    (OS)
    PID     : 3660                 TID  : 2052        PROC :
    db2syscs.exe
    INSTANCE: DB2                  NODE : 000     EDUID   : 2052
    EDUNAME: db2aiothr
    FUNCTION: DB2 UDB, oper system services,
    sqloAIOCollectorEDUEntry, probe:100
    MESSAGE : ZRC=0x83000070=-2097151888
    CALLED  : OS, -, GetQueuedCompletionStatus
    OSERR   : 112 "There is not enough space on the disk."
    DATA #1 : String, 37 bytes
    Failed in getting completion status.
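
    The MESSAGE line prints the same return code in hexadecimal and in decimal. The short Python sketch below checks that arithmetic, assuming the decimal form is the 32-bit two's-complement reading of the hex value. That assumption is consistent with the numbers shown but is not stated in this APAR. The helper name to_signed32 is illustrative.

```python
def to_signed32(value):
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


zrc = 0x83000070
signed_zrc = to_signed32(zrc)
low_byte = zrc & 0xFF  # low-order byte of the ZRC value

print(signed_zrc)  # -2097151888, the decimal value in the MESSAGE line
print(low_byte)    # 112
```

    In this record the low-order byte, 112, equals the OSERR value reported two lines later. That is an observation about this one message only and is not offered as a general rule for decoding ZRC values.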
    
    This message should trigger an instance "panic", but in this
    case it does not. This APAR makes the instance "panic" in this
    specific case, as it should.
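
    To check whether an instance has already logged this condition, db2diag.log can be searched for the record quoted above. The following is a minimal Python sketch, assuming each record starts with a timestamp in the format shown and that matching on the function name, the called OS routine and the message string is enough to identify it. The names RECORD_START and find_aio_collect_failures are illustrative, and the second record in the sample is a made-up placeholder.

```python
import re

# Assumption: every db2diag.log record begins with a timestamp such as
# 2013-12-06-14.09.28.492000+000 at the start of a line.
RECORD_START = re.compile(
    r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d{3}\s", re.M
)


def find_aio_collect_failures(log_text):
    """Return the timestamps of records matching the message in this APAR."""
    starts = [m.start() for m in RECORD_START.finditer(log_text)]
    ends = starts[1:] + [len(log_text)]
    hits = []
    for begin, end in zip(starts, ends):
        # Collapse wraps so fields split across lines still match.
        flat = " ".join(log_text[begin:end].split())
        if ("sqloAIOCollectorEDUEntry" in flat
                and "GetQueuedCompletionStatus" in flat
                and "Failed in getting completion status" in flat):
            hits.append(flat.split(" ", 1)[0])
    return hits


# First record: copied from this APAR. Second record: hypothetical filler.
sample = """\
2013-12-06-14.09.28.492000+000 I1723226F516       LEVEL: Severe
(OS)
PID     : 3660                 TID  : 2052        PROC :
db2syscs.exe
FUNCTION: DB2 UDB, oper system services,
sqloAIOCollectorEDUEntry, probe:100
CALLED  : OS, -, GetQueuedCompletionStatus
OSERR   : 112 "There is not enough space on the disk."
DATA #1 : String, 37 bytes
Failed in getting completion status.

2013-12-06-14.10.00.000000+000 I1723227F300       LEVEL: Info
MESSAGE : hypothetical unrelated record
"""

matches = find_aio_collect_failures(sample)
print(matches)
```

    For a real check, read the contents of the instance's db2diag.log into a string and pass it to the function. The location of that file depends on the instance configuration.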
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * DB2 on Windows platform only                                 *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to version 9.7 fix pack 10                           *
    ****************************************************************
    

Problem conclusion

  • First fixed in version 9.7 fix pack 10.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC98574

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    970

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2014-01-07

  • Closed date

    2014-11-10

  • Last modified date

    2014-11-10

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IT03829 IT05180

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • R970 PSY

       UP

Document Information

Modified date:
10 November 2014