IBM Support

IC72534: SDS NODES IN SDS CLUSTER WILL BE IN INCORRECT STATUS AFTER NETWORK WAS BROKEN FOR A WHILE

APAR status

  • Closed as program error.

Error description

  • The SDS cluster contains two database servers, DB server A and
    DB server B.
    
    DB server A resides on box A; DB server B resides on box B.
    
    Two CM (Connection Manager) servers form a CM cluster: CM1
    resides on the same box as DB server A, and CM2 resides on the
    same box as DB server B.
    
    Initially, DB server A is the primary server in the SDS cluster
    and CM1 is the only CM arbitrator. DB server B is in SDS
    read-only status, and all client connection requests are
    redirected to DB server A.
    
    To simulate a network outage, the network on box A, where DB
    server A resides, is broken (for example, by unplugging the
    network cable).
    
    After the time specified by EVENT_TIMEOUT and the FOC timeout
    has elapsed, CM2 detects the break in the cluster and issues the
    failover command. DB server B then becomes the new primary
    server and CM2 becomes the new CM arbitrator.
    
    From this point on, all new client connection requests are
    redirected to DB server B. All old connections are stuck.
    
    After a while, the network on box A recovers. DB servers A and B
    are now both in primary status, and CM1 and CM2 are both CM
    arbitrators. The DB servers no longer communicate with each
    other.
    
    New client connection requests are redirected to DB server A
    again, and data operations can be performed on DB server A.
    However, changes made on DB server A cannot be written to disk
    and are lost. Changes made on DB server B are retained.
    
    After DB server A is restarted, it switches to SDS secondary.
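
The sequence above can be illustrated with a small model. The following Python sketch is not Informix code and does not use any Informix API. The `Node` class, the `visible_primaries` helper, and the `simulate` function are hypothetical names introduced only for this illustration. It assumes, as the description reports, that the failover promotes DB server B without demoting the unreachable DB server A.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a database server in the SDS cluster."""
    name: str
    role: str               # "primary" or "secondary"
    reachable: bool = True  # False while the box has no network


def visible_primaries(nodes):
    """Names of the nodes that clients can reach and that act as primary."""
    return [n.name for n in nodes if n.reachable and n.role == "primary"]


def simulate():
    """Replay the reported sequence and record the visible primaries."""
    a = Node("A", "primary")
    b = Node("B", "secondary")
    cluster = [a, b]
    history = [("initial", visible_primaries(cluster))]

    # The network on box A breaks. Server A keeps its primary role locally.
    a.reachable = False

    # After the timeouts, the surviving arbitrator promotes B.
    # Assumption: nothing demotes A, because A cannot be contacted.
    b.role = "primary"
    history.append(("after failover", visible_primaries(cluster)))

    # The network recovers. A still believes it is primary: two primaries.
    a.reachable = True
    history.append(("after recovery", visible_primaries(cluster)))

    # Restarting A makes it rejoin as an SDS secondary.
    a.role = "secondary"
    history.append(("after restart of A", visible_primaries(cluster)))
    return history


if __name__ == "__main__":
    for step, names in simulate():
        print(f"{step}: primaries = {names}")
```

The "after recovery" step is the faulty state: both servers are reachable and both act as primary until DB server A is restarted.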
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * Users of SDS cluster in MACH11 environment.                  *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * Due to loss of network on the primary server in an SDS       *
    * cluster, the connection manager will attempt to perform a    *
    * failover on the SDS node. After the primary network          *
    * connection is re-established, the SDS cluster can have 2     *
    * primary nodes, each potentially corrupting the shared disks. *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to 11.50xC9 when available.                          *
    ****************************************************************
    

Problem conclusion

  • Problem fixed in 11.50xC9. See CQ218767 remarks for additional
    information on the fix approach.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC72534

  • Reported component name

    IBM IDS ENTRP E

  • Reported component ID

    5724L2304

  • Reported release

    B15

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2010-11-11

  • Closed date

    2011-09-27

  • Last modified date

    2011-09-27

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    IBM IDS ENTRP E

  • Fixed component ID

    5724L2304

Applicable component levels

  • RB15 PSY

       UP

Document Information

Modified date:
27 September 2011