IBM Support

IT27252: NOT ABLE TO FORM HADR CONNECTION AFTER TAKEOVER WHEN PURESCALE HADR STANDBY MEMBERS ON SAME HOST HAVE THE SAME HADR_LOCAL_SVC


APAR status

  • Closed as program error.

Error description

  • When HADR is used in a pureScale environment, the standby
    cluster can be defined with multiple members running on the
    same host. In such a configuration, the HADR_LOCAL_HOST database
    configuration parameter is the same for these members, so their
    HADR_LOCAL_SVC database configuration parameter must be set to
    different port values to avoid a conflict.
    
    When HADR_LOCAL_SVC is set to the same value, the incorrect
    configuration is not detected when the database is activated as
    a standby, because only one member of the standby cluster is
    activated. A standby misconfigured in this way can still function,
    e.g. reach peer state. However, after an HADR takeover, only one
    member of the new primary can listen on the configured port on
    that host. That member rejects the connection request sent by
    the new standby when it detects that the request is for a
    different member. This unexpected rejection causes the new
    standby database to terminate. Additionally, when the database
    is activated on all members of the new primary cluster, some
    members fail to activate because of the port conflict.
    
    It is more desirable to detect the incorrect configuration and
    prevent the database from being activated as a standby.
    
    The following message in the db2diag.log of the new primary
    confirms that the new primary is rejecting the standby
    because the connection request reached a different member
    than the one the standby intended to connect to.
    2018-12-03-03.11.32.289579-300 I92473E622            LEVEL: Error
    PID     : 15302                TID : 140287603107584 PROC : db2sysc 0
    INSTANCE: hsjiang              NODE : 000            DB   : HADRDB
    HOSTNAME: hotellnx113
    EDUID   : 112                  EDUNAME: db2hadrp.0.1 (HADRDB) 0
    FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrVerifyMembersMatch, probe:15595
    MESSAGE : ZRC=0x87800140=-2021654208=HDR_ZRC_CONFIGURATION_ERROR
              "One or both databases of the HADR pair is configured incorrectly"
    DATA #1 : <preformatted>
    The local HADR log stream id 0 does not match the remote log stream id 2
    
    The following messages in the db2diag.log of the new standby
    confirm that the rejection causes the standby database to
    terminate.
    2018-12-03-03.11.32.281133-300 I175790E491           LEVEL: Info
    PID     : 5264                 TID : 140135265986304 PROC : db2sysc 0
    INSTANCE: hsjiang              NODE : 000            DB   : HADRDB
    HOSTNAME: hotellnx112
    EDUID   : 483                  EDUNAME: db2hadrs.2.0 (HADRDB) 0
    FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrHandleHsAck, probe:43900
    DATA #1 : <preformatted>
    Handshake HDR_MSG_HDRREJECT message is received from hotellnx113:32601 (9.26.121.209:32601)
    
    2018-12-03-03.11.32.282571-300 I176812E609           LEVEL: Error
    PID     : 5264                 TID : 140135265986304 PROC : db2sysc 0
    INSTANCE: hsjiang              NODE : 000            DB   : HADRDB
    HOSTNAME: hotellnx112
    EDUID   : 483                  EDUNAME: db2hadrs.2.0 (HADRDB) 0
    FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrHandleHsAck, probe:43901
    MESSAGE : ZRC=0x87800140=-2021654208=HDR_ZRC_CONFIGURATION_ERROR
              "One or both databases of the HADR pair is configured incorrectly"
    DATA #1 : <preformatted>
    HADR handshake with hotellnx113:32601 (9.26.121.209:32601) failed.
    
    2018-12-03-03.11.32.298613-300 E182259E1365          LEVEL: Severe
    PID     : 5264                 TID : 140135333095168 PROC : db2sysc 0
    INSTANCE: hsjiang              NODE : 000            DB   : HADRDB
    APPHDL  : 0-115                APPID: *N0.DB2.181203081132
    HOSTNAME: hotellnx112
    EDUID   : 489                  EDUNAME: db2agent (HADRDB) 0
    FUNCTION: DB2 UDB, data protection services, SQLP_DBCB::setLogState, probe:5000
    DATA #1 : <preformatted>
    Database error has been detected.  As a result, for precautionary reasons
    all logging services have been stopped.
    
    2018-12-03-03.11.32.307417-300 I184040E572           LEVEL: Severe
    PID     : 5264                 TID : 140135333095168 PROC : db2sysc 0
    INSTANCE: hsjiang              NODE : 000            DB   : HADRDB
    APPHDL  : 0-115                APPID: *N0.DB2.181203081132
    HOSTNAME: hotellnx112
    EDUID   : 489                  EDUNAME: db2agent (HADRDB) 0
    FUNCTION: DB2 UDB, base sys utilities, sqeApplication::AppStopUsing, probe:7876
    MESSAGE : ZRC=0xFFFFFBF6=-1034
              SQL1034C  The database was damaged, so all applications processing the database were stopped.
    
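To check a db2diag.log for the signature described above, a short script can search for the log stream mismatch text. The following is a minimal sketch, not an IBM-provided tool: the function name find_stream_mismatches is hypothetical, and the pattern is taken only from the message shown in this APAR, so other Db2 levels may word it differently.

```python
import re

# Pattern based on the DATA #1 text quoted in this APAR. \s+ tolerates the
# line wrapping that db2diag.log excerpts often have.
MISMATCH = re.compile(
    r"local HADR log stream id\s+(\d+)\s+"
    r"does not match the remote log\s+stream id\s+(\d+)"
)


def find_stream_mismatches(diag_text):
    """Return (local_id, remote_id) pairs for each mismatch message found."""
    return [(int(local), int(remote)) for local, remote in MISMATCH.findall(diag_text)]


# Sample taken from the new primary's db2diag.log entry above.
sample = """\
DATA #1 : <preformatted>
The local HADR log stream id 0 does not match the remote log
stream id 2
"""

print(find_stream_mismatches(sample))  # [(0, 2)]
```

A non-empty result on the new primary suggests that the standby's connection request was answered by a different member than intended, which is consistent with a shared HADR_LOCAL_SVC on that host.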

Local fix

  • Configure a different port (HADR_LOCAL_SVC) for each member
    that runs on the same host.
    
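Before a takeover, the member settings can be checked for this conflict. The sketch below assumes the HADR_LOCAL_HOST and HADR_LOCAL_SVC values have already been collected for each member by some other means; the function name find_port_conflicts, the host names and the port numbers are all hypothetical. Values are compared literally, so a service name and a port number that resolve to the same port would not be caught.

```python
from collections import defaultdict


def find_port_conflicts(members):
    """Return {(host, svc): [member ids]} for endpoints shared by several members.

    members: iterable of (member_id, hadr_local_host, hadr_local_svc) tuples.
    """
    by_endpoint = defaultdict(list)
    for member_id, host, svc in members:
        # Host names are case-insensitive; the service value is compared as text.
        by_endpoint[(host.lower(), str(svc))].append(member_id)
    return {ep: ids for ep, ids in by_endpoint.items() if len(ids) > 1}


# Illustrative values only: members 0 and 2 run on the same host with the same port.
members = [
    (0, "hostA", 4001),
    (1, "hostB", 4001),
    (2, "hostA", 4001),
]

for (host, svc), ids in find_port_conflicts(members).items():
    print(f"members {ids} share {host}:{svc}; set a distinct HADR_LOCAL_SVC for each")
```

Member 1 is not reported because it runs on a different host, where reusing the port does not cause a conflict.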

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * ALL                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to Db2 11.1 Mod 4 Fixpack 5 or higher                *
    ****************************************************************
    

Problem conclusion

  • First fixed in Db2 11.1 Mod 4 Fixpack 5
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT27252

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    B10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-12-09

  • Closed date

    2020-01-16

  • Last modified date

    2020-01-16

  • APAR is sysrouted FROM one or more of the following:

    IT27157

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • RB10 PSN

       UP


Document Information

Modified date:
16 January 2020