IBM Support

JR28152: DATABASE HANG DURING HADR SOFT CHECK POINT REQUEST

APAR status

  • Closed as program error.

Error description

  • If HADR requests a soft checkpoint while another operation
    (e.g. a backup) is requesting log truncation, the database
    may hang with the following symptoms:
    .
    1) BACKUP in progress
    =====================
    The backup EDU will have the following call stack:
    .
    sqloWaitInterrupt + 0x74 (sqloedu.C:850)
    sqloWaitIPCWaitPost + 0x4D (sqlowaitpost.C:1857)
    sqlpgPostLoggrWithoutLatching + 0x9F (sqlpgpst.C:148)
    sqlpgPostLoggr + 0x4B (sqlpiPostLogger.h:127)
    sqlpWriteToLog + 0xB2A (sqlpwlg.C:1250)
    sqlpWriteLR + 0x1F5 (sqlpwlg.C:2542)
    sqlpWriteLRSingularTran + 0xF2 (sqlpwlg.C:2732)
    sqlpWriteLRSingularTran + 0x27 (sqlplog.h:430)
    sqlubWriteEndLogRecord + 0x14E (sqlubsfp.C:1707)
    sqlubUpdateLFH + 0x62 (sqlubsfp.C:1579)
    sqlubBMResponse + 0xF9 (sqlubaPoll.C:313)
    sqlubPollMsg + 0x12B (sqlubaPoll.C:167)
    sqlubcka + 0x74C (sqlubcka.C:981)
    sqlubcka_route_in_DA + 0x399 (sqlubaFirewall.C:534)
    sqlerKnownProcedure + 0x2FD (sqlerKnownProcs.C:770)
    .
    The backup EDU has posted the DB2 Log Reader (db2loggr) to
    truncate the active log file. While posting, it acquired the
    dbcb->IOTaskSem latch, which synchronizes I/O requests for
    the database, and it is now waiting for db2loggr's response.
    Any other EDU performing an operation that needs
    dbcb->IOTaskSem (e.g. logging) will be seen waiting, too.
    .
    2) DB2 Log Reader
    =================
    The following is the call stack for the DB2 Log Reader:
    .
    sqloSpinLockConflict + 0x130 (sqloltch.C:348)
    sqloxltc_track + 0x50 (sqloXlatch.h:491)
    sqlpgPrePostLoggw + 0x33 (sqlpgasn.C:268)
    sqlpgTruncateLogRuntime + 0x2E (sqlpgForceTruncateLog.C:380)
    sqlpgasn + 0x4C4 (NotFound:-1)
    sqloEDUEntry + 0x287 (sqloedu.C:2868)
    .
    This is the main routine for log truncation. The EDU is about
    to post the DB2 Log Writer (db2loggw) to perform the actual
    log truncation. Before posting, it must acquire the
    dbcb->wTaskSem latch, which synchronizes jobs at the database
    level. This latch is currently held by another EDU, so
    db2loggr cannot post db2loggw yet.
    .
    3) HADR requesting a soft checkpoint
    ====================================
    The holder of the dbcb->wTaskSem is HADR:
    .
    sqloxconflict_LONG + 0x21D (NT/sqloltch.C:1057)
    sqlpPreForceLogArchive + 0x200 (NotFound:-1)
    hdrWritePersist + 0x5A (hdr.C:1786)
    hdrSetHdrState + 0xF3 (hdr.C:350)
    hdrDrainRequestsAndSetStopping + 0x6A9 (hdrEdu.C:3018)
    hdrCloseLogFiles + 0x92E (hdrEdu.C:5139)
    sqloEDUEntry + 0x287 (sqloedu.C:2868)
    .
    This EDU is holding the dbcb->wTaskSem, and at the same
    time it is waiting for dbcb->IOTaskSem, i.e. the latch
    held by (1). This results in the following three-way
    deadlatch:
    .
    * (1) is waiting for (2)
    * (2) is waiting for (3)
    * (3) is waiting for (1)
    .
    Workaround:
    ===========
    None.
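
The three-way wait described above can be modeled as a small wait-for graph. The following is a minimal sketch, not DB2 code: the labels "backup", "db2loggr" and "hadr" and the helper find_cycle are illustrative names chosen here, and each edge simply restates one of the "(n) is waiting for (m)" lines from the description.

```python
# Wait-for graph built from the three call stacks in the APAR description.
# Keys and values are informal labels (assumptions), not DB2 identifiers.
WAITS_FOR = {
    "backup": "db2loggr",  # (1) holds dbcb->IOTaskSem, waits for db2loggr's response
    "db2loggr": "hadr",    # (2) waits for dbcb->wTaskSem, held by HADR
    "hadr": "backup",      # (3) holds dbcb->wTaskSem, waits for dbcb->IOTaskSem
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path = []   # nodes visited so far, in order
    seen = {}   # node -> its index in path
    node = start
    while node is not None:
        if node in seen:
            # We came back to a node already on the path: that slice is the cycle.
            return path[seen[node]:]
        seen[node] = len(path)
        path.append(node)
        node = waits_for.get(node)  # None when the node is not waiting on anyone
    return None


print(find_cycle(WAITS_FOR, "backup"))
```

Each participant waits on exactly one other, so following the edges from any of the three returns to the starting point; removing any single edge (for example, HADR not waiting on dbcb->IOTaskSem while holding dbcb->wTaskSem) makes find_cycle return None.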
    

Local fix

Problem summary

  • see APAR description
    

Problem conclusion

  • First fixed in DB2 UDB Version 9.5, FixPak 2
    

Temporary fix

  • see APAR description
    

Comments

APAR Information

  • APAR number

    JR28152

  • Reported component name

    DB2 UDB ESE WIN

  • Reported component ID

    5765F4101

  • Reported release

    950

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2007-12-18

  • Closed date

    2008-11-03

  • Last modified date

    2008-11-03

  • APAR is sysrouted FROM one or more of the following:

    JR28150

  • APAR is sysrouted TO one or more of the following:

Modules/Macros

  • ENGN_HDR
    

Fix information

  • Fixed component name

    DB2 UDB ESE WIN

  • Fixed component ID

    5765F4101

Applicable component levels

  • R950 PSN

       UP

Document Information

Modified date:
03 November 2008