IBM Support

LI74126: DB2 INSTANCE MAY CRASH BECAUSE OF THE INVALID TIMEOUT VALUE WHEN ALTERNATE PAGE CLEANING IS USED.

APAR status

  • Closed as program error.

Error description

  • With alternate page cleaning enabled, DB2 UDB may panic when an
    invalid timeout value is detected during a wait. The invalid
    value is the result of an integer overflow, so it is usually
    reported as a very large negative number.
    
    db2diag.log entries and stack trace look like this:
    
    2008-12-18-15.07.41.620931-120 I7866516G1466      LEVEL: Severe
    (OS)
    PID     : 677                  TID  : 3068128160  PROC : db2sysc
    INSTANCE: db2inst1             NODE : 000
    EDUID   : 28                   EDUNAME: db2pclnr (AUTOM02)
    FUNCTION: DB2 UDB, oper system services, sqloWaitEDUWaitPost,
    probe:100
    MESSAGE : ZRC=0x83000016=-2097151978
    CALLED  : OS, -, semop                            OSERR: EINVAL
    (22)
    DATA #1 : timeout value, 4 bytes
    -825177850
    DATA #2 : Bitmask, 4 bytes
    0x00000000
    DATA #3 : Hex, PD_TYPE_INTERNAL_12, 44 bytes
    0x10010ACC : 1800 6216 0200 0700 A0EB DFB6 1C00 0000
    ..b.............
    0x10010ADC : 0000 0000 0000 0000 0000 0000 B003 37A8
    ..............7.
    0x10010AEC : A0FE 4510 E023 B910 0000 0000
    ..E..#......
    DATA #4 : String, 0 bytes
    Object not dumped: Address: 0xB6DFDDE8 Size: 0 Reason:
    Zero-length data
    DATA #5 : edu waitpost, PD_TYPE_SQLO_EDUWAITPOST, 24 bytes
    0xA83703B0 : 0000 0000 01CC 0600 0000 0000 FEAB 0000
    ................
    0xA83703C0 : 0100 0000 CC0A 0110                        ........
    CALLSTCK:
      [0] 0x0255BC42 sqloWaitEDUWaitPost + 0x3E6
      [1] 0x015B37D9 _Z16sqlbClnrFindWorkP12SQLB_CLNR_CB + 0x5B7
      [2] 0x015B30CE _Z18sqlbClnrEntryPointPhj + 0x70
      [3] 0x025DCB3A sqloEDUEntry + 0x21A
      [4] 0x00AC8341 /lib/tls/libpthread.so.0 + 0x5341
      [5] 0x009E56FE __clone + 0x5E
      [6] 0x00000000 ?unknown + 0x0
      [7] 0x00000000 ?unknown + 0x0
      [8] 0x00000000 ?unknown + 0x0
      [9] 0x00000000 ?unknown + 0x0
    
    2008-12-18-15.07.41.659288-120 I7867983G381       LEVEL: Severe
    PID     : 677                  TID  : 3068128160  PROC : db2sysc
    INSTANCE: db2inst1             NODE : 000
    EDUID   : 28                   EDUNAME: db2pclnr (AUTOM02)
    FUNCTION: DB2 UDB, buffer pool services,
    sqlbWaitOnWpWithTimeout, probe:10
    MESSAGE : ZRC=0x83000016=-2097151978
    DATA #1 : String, 11 bytes
    sqlowait rc
    
    2008-12-18-15.07.41.644923-120 I7868365G1394      LEVEL: Severe
    (OS)
    PID     : 677                  TID  : 1353706400  PROC : db2sysc
    INSTANCE: db2inst1             NODE : 000
    EDUID   : 3384                 EDUNAME: db2pclnr (SAMPLE)
    FUNCTION: DB2 UDB, oper system services, sqloWaitEDUWaitPost,
    probe:100
    MESSAGE : ZRC=0x83000016=-2097151978
    CALLED  : OS, -, semop                            OSERR: EINVAL
    (22)
    DATA #1 : timeout value, 4 bytes
    -825177850
    DATA #2 : Bitmask, 4 bytes
    0x00000000
    DATA #3 : Hex, PD_TYPE_INTERNAL_12, 44 bytes
    0x10034B9C : 5000 3D1E 0200 0700 A0EB AF50 380D 0000
    P.=........P8...
    0x10034BAC : 0000 0000 0000 0000 0000 0000 B05A DE83
    .............Z..
    0x10034BBC : 00FA D010 6000 D110 0000 0000
    ....`.......
    DATA #4 : String, 1 bytes
    .
    DATA #5 : edu waitpost, PD_TYPE_SQLO_EDUWAITPOST, 24 bytes
    0x83DE5AB0 : 0000 0000 01CC 0600 0000 0000 FEAB 0000
    ................
    0x83DE5AC0 : 0100 0000 9C4B 0310                        .....K..
    CALLSTCK:
      [0] 0x0255BC42 sqloWaitEDUWaitPost + 0x3E6
      [1] 0x015B37D9 _Z16sqlbClnrFindWorkP12SQLB_CLNR_CB + 0x5B7
      [2] 0x015B30CE _Z18sqlbClnrEntryPointPhj + 0x70
      [3] 0x025DCB3A sqloEDUEntry + 0x21A
      [4] 0x00AC8341 /lib/tls/libpthread.so.0 + 0x5341
      [5] 0x009E56FE __clone + 0x5E
      [6] 0x00000000 ?unknown + 0x0
      [7] 0x00000000 ?unknown + 0x0
      [8] 0x00000000 ?unknown + 0x0
      [9] 0x00000000 ?unknown + 0x0
    
    2008-12-18-15.07.41.697677-120 I7869760G379       LEVEL: Severe
    PID     : 677                  TID  : 1353706400  PROC : db2sysc
    INSTANCE: db2inst1             NODE : 000
    EDUID   : 3384                 EDUNAME: db2pclnr (SAMPLE)
    FUNCTION: DB2 UDB, buffer pool services,
    sqlbWaitOnWpWithTimeout, probe:10
    MESSAGE : ZRC=0x83000016=-2097151978
    DATA #1 : String, 11 bytes
    sqlowait rc
    
    2008-12-18-15.07.41.698027-120 E7870140G822       LEVEL: Critical
    PID     : 677                  TID  : 1353706400  PROC : db2sysc
    INSTANCE: db2inst1             NODE : 000
    EDUID   : 3384                 EDUNAME: db2pclnr (SAMPLE)
    FUNCTION: DB2 UDB, RAS/PD component, pdStartFODC, probe:10
    MESSAGE : ADM14001C  An unexpected and critical error has occurred: "Panic".
              The instance may have been shutdown as a result. "Automatic" FODC
              (First Occurrence Data Capture) has been invoked and diagnostic
              information has been recorded in directory
              "/home/db2inst1/sqllib/db2dump/FODC_Panic_2008-12-18-15.07.41.670513/
              ". Please look in this directory for detailed evidence about what
              happened and contact IBM support if necessary to diagnose the
              problem.
    
    
    The timeout value logged here is -825177850. A valid timeout
    value is either a positive number or -1 (NO_TIMEOUT).
    
    This happens only when alternate page cleaning is used:
    
    DB2_USE_ALTERNATE_PAGE_CLEANING=ON
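
The validity rule stated above (a timeout is either positive or -1) can be turned into a quick check of a db2diag.log. The following is a minimal sketch, not an IBM-supplied tool: the function name `find_invalid_timeouts` is hypothetical, and it assumes the value appears on the line directly after the `DATA #n : timeout value` header, as in the entries shown above. It reports only negative values other than -1.

```shell
#!/bin/sh
# Hypothetical helper: print every negative timeout value other than -1
# (NO_TIMEOUT) that follows a "timeout value" DATA header in the given log.
find_invalid_timeouts() {
    awk '
        /DATA #[0-9]+ : timeout value/ {
            # The value is assumed to be on the next line of the log.
            if ((getline v) > 0) {
                gsub(/[ \t\r]/, "", v)
                if (v ~ /^-?[0-9]+$/ && v + 0 < 0 && v + 0 != -1)
                    print v
            }
        }
    ' "$1"
}

# Example (the path is an assumption; use the instance's diagnostic path):
# find_invalid_timeouts "$HOME/sqllib/db2dump/db2diag.log"
```

Any value printed by this check matches the symptom described in this APAR only if the surrounding entry also comes from a db2pclnr EDU.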
    

Local fix

  • Disable alternate page cleaning:
    
    db2set DB2_USE_ALTERNATE_PAGE_CLEANING=OFF
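
A minimal sketch of applying this workaround from a shell, run as the instance owner. The function name `disable_alt_page_cleaning` and the `DB2SET_CMD` override are hypothetical and exist only so the step can be dry-run; the restart in the usage comment is an assumption based on registry variable changes typically taking effect only after the instance is restarted.

```shell
#!/bin/sh
# Hypothetical wrapper around the documented workaround.
# DB2SET_CMD is an illustrative override (for example DB2SET_CMD=echo
# for a dry run); by default the real db2set command is invoked.
disable_alt_page_cleaning() {
    "${DB2SET_CMD:-db2set}" DB2_USE_ALTERNATE_PAGE_CLEANING=OFF
}

# Usage (assumed sequence; plan the restart for a maintenance window):
# disable_alt_page_cleaning && db2stop && db2start
# db2set -all    # confirm the registry now shows the variable as OFF
```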
    

Problem summary

  • With alternate page cleaning enabled, DB2 UDB may panic when an
    invalid timeout value is detected during a wait. The invalid
    value is the result of an integer overflow, so it is usually
    reported as a very large negative number.
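
To illustrate why an overflow shows up as a large negative number, the sketch below reinterprets a value as a signed 32-bit integer, matching the 4-byte timeout field in the log. The function name `to_int32` is hypothetical, the shell is assumed to use 64-bit arithmetic, and the input 3469789446 is simply one value that maps to the logged number; the APAR does not describe the actual internal computation.

```shell
#!/bin/sh
# Reinterpret a non-negative value as a signed 32-bit integer
# (two's complement), as happens when a result exceeds 2147483647.
to_int32() {
    v=$(( $1 & 0xFFFFFFFF ))          # keep the low 32 bits
    if [ "$v" -ge 2147483648 ]; then  # high bit set: value is negative
        v=$(( v - 4294967296 ))
    fi
    echo "$v"
}

to_int32 3469789446   # prints -825177850, the value seen in db2diag.log
```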
    

Problem conclusion

  • The issue can arise only when alternate page cleaning is used:
    
    DB2_USE_ALTERNATE_PAGE_CLEANING=ON
    
    The local fix is to disable alternate page cleaning:
    
    db2set DB2_USE_ALTERNATE_PAGE_CLEANING=OFF
    
    The APAR fix will be included in DB2 V9.5 FP4.
    

Temporary fix

Comments

APAR Information

  • APAR number

    LI74126

  • Reported component name

    DB2 UDB WSE LIN

  • Reported component ID

    5765F3504

  • Reported release

    950

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2009-01-13

  • Closed date

    2009-06-01

  • Last modified date

    2009-06-01

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 UDB WSE LIN

  • Fixed component ID

    5765F3504

Applicable component levels

  • R910 PSY

       UP

  • R950 PSY

       UP

Document Information

Modified date:
01 June 2009