IBM Support

IJ07893: SPURIOUS RG MOVE DURING SHUTDOWN -F PROCESSING

APAR status

  • Closed as program error.

Error description

  • When "shutdown -F" is issued on a node, the documented and
    expected behavior of PowerHA is to do a forced down (that
    is, unmanage the resource groups) on that node, and do a
    graceful down on other nodes.  The intent was to quickly
    cease processing on the node being shut down, but also to
    clean up NFS cross mounts or other replication mechanisms
    on surviving nodes.
    However, what actually happens is that the node that is
    being shut down does a forced down, as expected, and then
    attempts to release the resource groups held on that
    node.
    If the actual stop_server processing takes longer than a
    few seconds, AIX will kill the event processing in the
    middle - since AIX allows only a short time for other
    parts of the system to respond to a "shutdown -F".  This
    can leave applications in odd states, as reflected in the
    shared storage.
    

Local fix

Problem summary

  • When AIX is shut down with PowerHA cluster services active, the
    expected behavior is that the node being shut down will run a
    "node_down forced" event and the surviving nodes will run a
    "node_down" graceful event. In other words, the node being
    shut down will make no attempt to release cluster resources and
    the surviving nodes will make no attempt to take over resources.
    Depending on the timing of the shutdown and the performance of
    user-supplied scripts, there may be a scenario where resources
    are taken over when they should not be. This may leave the
    resource group in ERROR state, or there may be an event script
    error that needs recovery.
    

Problem conclusion

  • The solution is to run the node_down event directly on the node
    being shut down without involving the remote nodes. This ensures
    the PowerHA stack is brought down as soon as possible and no
    attempt is made at takeover.
    

Temporary fix

  • Best practice is to stop PowerHA cluster services before
    shutting down AIX.
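
    The best practice above could be scripted roughly as follows. This is a
    minimal sketch, not an IBM-supplied procedure: the clmgr option values
    (WHEN=now, MANAGE=offline), the wait for the cluster manager to report
    ST_INIT, and the run/stop_then_shutdown helpers and DRY_RUN switch are
    all assumptions made for illustration. Check the clmgr documentation
    for your PowerHA level before using anything like it.

```shell
#!/bin/sh
# Sketch: stop PowerHA cluster services on the local node, then shut down AIX.
# Assumptions (not from the APAR): the clmgr options, the ST_INIT check,
# and the DRY_RUN switch are illustrative only.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = execute them

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

stop_then_shutdown() {
    node="$1"   # hypothetical node name supplied by the caller

    # Bring resource groups offline and stop cluster services on this node.
    run clmgr stop node "$node" WHEN=now MANAGE=offline || return 1

    # When really executing, wait until the cluster manager reports ST_INIT,
    # which is assumed here to mean cluster services are fully stopped.
    if [ "$DRY_RUN" != "1" ]; then
        until lssrc -ls clstrmgrES | grep -q "ST_INIT"; do
            sleep 5
        done
    fi

    # Only now shut down AIX, so no event processing is killed mid-run.
    run shutdown -F
}
```

    With DRY_RUN left at its default of 1, calling "stop_then_shutdown nodeA"
    only prints the two commands it would run, which makes it possible to
    review the sequence before setting DRY_RUN=0 on a real node.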
    

Comments

APAR Information

  • APAR number

    IJ07893

  • Reported component name

    POWERHA SYSMIR

  • Reported component ID

    5765H3900

  • Reported release

    721

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Submitted date

    2018-07-18

  • Closed date

    2018-11-27

  • Last modified date

    2018-11-27

  • APAR is sysrouted FROM one or more of the following:

    IJ07892

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    POWERHA SYSMIR

  • Fixed component ID

    5765H3900

Applicable component levels

Document Information

Modified date:
19 October 2021