IBM Support

LI73470: ERROR HANDLING BETWEEN PRIMARY AND STANDBY WHILE ONE CONNECTION IS CLOSED CAUSES INSTANCE CRASH.


APAR status

  • Closed as program error.

Error description

  • In an HADR system, the instance on one server crashes with
    Signal #11 after reporting "Communication with HADR partner was
    lost". The crash is caused by mishandling of an error message
    while the connection is closed.
    
    The db2diag log may show error messages like the following
    before the instance shutdown message.
    
    2008-01-12-22.09.11.357881-300 I877091G404        LEVEL: Severe
    PID     : 7917                 TID  : 4142896832  PROC : db2hadrs (CATUIT) 0
    INSTANCE: db2inst4             NODE : 000         DB   : CATUIT
    FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduAcceptEvent, probe:20215
    RETCODE : ZRC=0x8280001B=-2105540581=HDR_ZRC_COMM_CLOSED
              "Communication with HADR partner was lost"
    
    . . . . . . . . . . . .
    
    2008-01-12-22.09.11.429787-300 I878266G430        LEVEL: Warning
    PID     : 7917                 TID  : 4142896832  PROC : db2hadrs (CATUIT) 0
    INSTANCE: db2inst4             NODE : 000         DB   : CATUIT
    FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSendMsg, probe:30040
    MESSAGE : TCP/IP send error. Closing HADR connection.
    DATA #1 : Hexdump, 4 bytes
    0xFEEA9660 : 3D01 8087                                  =...
    
    
    <Stacktrace>
    --Frame--- ------Function + Offset------
    0xFEEA99E8 _Z17hdrEduAcceptEventP8HDR_DBCBP14HDR_EVENT_TYPEPP11HDR_HDR_MSGPP8HDR_RQST + 0x0e5b
      (/home/db2inst4/sqllib/lib/libdb2e.so.1)
    0xFEEA9E78 address: 0x015B4761 ; dladdress: 0x00434000 ; offset in lib: 0x01180761 ;
      (/home/db2inst4/sqllib/lib/libdb2e.so.1)
    0xFEEAA400 address: 0x015B5A54 ; dladdress: 0x00434000 ; offset in lib: 0x01181A54 ;
      (/home/db2inst4/sqllib/lib/libdb2e.so.1)
    0xFEEAA45C _Z13sqloCreateEDUPFvPcmES_jP13SQLO_EDU_INFOPl + 0x028e
      (/home/db2inst4/sqllib/lib/libdb2e.so.1)
    0xFEEAA728 address: 0x00D4273E ; dladdress: 0x00434000 ; offset in lib: 0x0090E73E ;
      (/home/db2inst4/sqllib/lib/libdb2e.so.1)
    
    While one connection was closed, an error message being
    transferred from the active end to the passive end ended up
    with an invalid address, which caused the instance to crash
    with Signal #11.
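
To check whether a db2diag log shows the same sequence, the records can be scanned for an HDR_ZRC_COMM_CLOSED entry from hdrEduAcceptEvent followed by a TCP/IP send error from hdrSendMsg. The following is a minimal sketch, not an IBM tool. It assumes each record begins with a timestamp in the format quoted above; the helper names (split_records, zrc_signed, matches_signature) are hypothetical, and SAMPLE is an abridged copy of the two records shown in this APAR. It also confirms that the hexadecimal and decimal forms of the ZRC value agree when the hex value is read as a signed 32-bit integer.

```python
import re

# Abridged copy of the two db2diag records quoted in this APAR.
SAMPLE = """\
2008-01-12-22.09.11.357881-300 I877091G404        LEVEL: Severe
FUNCTION: DB2 UDB, High Availability Disaster Recovery,
hdrEduAcceptEvent, probe:20215
RETCODE : ZRC=0x8280001B=-2105540581=HDR_ZRC_COMM_CLOSED
          "Communication with HADR partner was lost"

2008-01-12-22.09.11.429787-300 I878266G430        LEVEL: Warning
FUNCTION: DB2 UDB, High Availability Disaster Recovery,
hdrSendMsg, probe:30040
MESSAGE : TCP/IP send error. Closing HADR connection.
"""

# Assumption: every record starts with a timestamp such as
# 2008-01-12-22.09.11.357881-300 at the beginning of a line.
RECORD_START = re.compile(
    r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d{3}", re.M
)
# Matches "ZRC=0x<8 hex digits>=<decimal>=<symbolic name>".
ZRC = re.compile(r"ZRC=0x([0-9A-Fa-f]{8})=(-?\d+)=(\w+)")


def split_records(text):
    """Split log text into records at each timestamp line."""
    starts = [m.start() for m in RECORD_START.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[a:b].strip() for a, b in zip(starts, ends)]


def zrc_signed(hex_digits):
    """Interpret 8 hex digits as a signed 32-bit integer."""
    value = int(hex_digits, 16)
    return value - (1 << 32) if value & 0x80000000 else value


def matches_signature(text):
    """True if a COMM_CLOSED record is later followed by a send error."""
    seen_closed = False
    for rec in split_records(text):
        if "HDR_ZRC_COMM_CLOSED" in rec and "hdrEduAcceptEvent" in rec:
            seen_closed = True
        elif seen_closed and "hdrSendMsg" in rec and "TCP/IP send error" in rec:
            return True
    return False


if __name__ == "__main__":
    match = ZRC.search(SAMPLE)
    print(match.group(3), zrc_signed(match.group(1)))
    print("signature found:", matches_signature(SAMPLE))
```

A match only indicates that the log resembles the one in this APAR; the stack trace and the installed fix pack level still need to be compared before concluding that the same defect is involved.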
    

Local fix

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    Any user of HADR
    ****************************************************************
    PROBLEM DESCRIPTION:
    In rare cases, a lost connection between Primary and Standby
    will cause the Primary instance to crash.
    ****************************************************************
    RECOMMENDATION:
    Upgrade to V9.5 fixpack 2 or higher.
    ****************************************************************
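
The recommendation above can be expressed as a simple level check. This is a sketch under stated assumptions: the function name has_li73470_fix is hypothetical, the release and fix pack numbers are supplied by the caller (how they are obtained from an installation is not covered here), and the function deliberately gives no answer for releases other than V9.5, because this APAR only states the fix level for that release.

```python
FIXED_RELEASE = (9, 5)  # from this APAR: Version 9.5
FIXED_FIXPACK = 2       # from this APAR: first fixed in fixpack 2


def has_li73470_fix(release, fixpack):
    """Return True or False for V9.5 levels, None for other releases.

    release is a (major, minor) pair such as (9, 5); fixpack is an int.
    """
    if tuple(release) != FIXED_RELEASE:
        return None  # this APAR does not state the fix level here
    return fixpack >= FIXED_FIXPACK


if __name__ == "__main__":
    for level in [((9, 5), 1), ((9, 5), 2), ((9, 5), 3)]:
        print(level, has_li73470_fix(*level))
```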
    

Problem conclusion

  • First fixed in DB2 UDB Version 9.5 fixpack 2
    

Temporary fix

Comments

APAR Information

  • APAR number

    LI73470

  • Reported component name

    DB2 UDE ESE LIN

  • Reported component ID

    5765F4104

  • Reported release

    950

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2008-05-20

  • Closed date

    2008-11-25

  • Last modified date

    2008-11-25

  • APAR is sysrouted FROM one or more of the following:

    LI73210

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 UDE ESE LIN

  • Fixed component ID

    5765F4104

Applicable component levels

  • R950 PSN

       UP


Document Information

Modified date:
25 November 2008