IBM Support

IJ29444: MMHEALTH SHOW DEGRADED STATUS DUE TO RECONNECT_START

APAR status

  • Closed as program error.

Error description

  • GPFS stays in the DEGRADED state if reconnect_start is
    initiated on one IP address but the reconnection is
    established using another IP address.
    
    The issue can occur when there are multiple connections
    between two nodes that use different IP addresses. If one
    IP connection fails, a reconnect_start event is triggered
    and GPFS enters the DEGRADED state. The state should later
    change to HEALTHY after a reconnect_done or
    reconnection_aborted event is triggered. However, if the
    reconnection is established using another IP address, no
    reconnect_done or reconnection_aborted event is triggered,
    so GPFS stays in the DEGRADED state.
    
    Reported In:
    Spectrum Scale 5.0.4.3
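
The event sequence described above can be illustrated with a small Python model. This is a sketch only, not the GPFS or system health monitor implementation: the class ConnectionHealth, its methods, the clear_by_node flag, and the node name and IP addresses are all hypothetical. The clear_by_node option shows one conceivable way to avoid the stale state and is not necessarily how the APAR fix works.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionHealth:
    """Toy model of connection health tracking (hypothetical, not GPFS code)."""

    # Assumption: when True, a successful connection to a node clears every
    # pending reconnect for that node, regardless of the IP address used.
    clear_by_node: bool = False
    # Pending reconnects, stored as (node, ip) pairs.
    pending: set = field(default_factory=set)

    def reconnect_start(self, node, ip):
        """A connection on this IP failed; a reconnect is now pending."""
        self.pending.add((node, ip))

    def connection_established(self, node, ip):
        """A connection to the node was (re)established on this IP."""
        if self.clear_by_node:
            self.pending = {p for p in self.pending if p[0] != node}
        else:
            # Only clears the pending entry for the exact same IP.
            self.pending.discard((node, ip))

    @property
    def state(self):
        return "DEGRADED" if self.pending else "HEALTHY"


def simulate(clear_by_node):
    """Fail on one IP, then reconnect on a different IP of the same node."""
    health = ConnectionHealth(clear_by_node=clear_by_node)
    health.reconnect_start("node-b", "10.0.0.2")
    health.connection_established("node-b", "10.0.1.2")
    return health.state


if __name__ == "__main__":
    print(simulate(clear_by_node=False))  # pending entry is never cleared
    print(simulate(clear_by_node=True))   # pending entry cleared per node
```

With matching strictly by IP address, the pending reconnect recorded for the first address is never cleared, which mirrors the stale DEGRADED state reported here.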
    

Local fix

  • On the node reporting the issue, run:
    mmsysmoncontrol restart
    

Problem summary

  • After Node-B successfully reestablishes
    a broken connection to Node-A,
    Node-A still shows the reconnect_start
    state (DEGRADED).
    

Problem conclusion

  • Benefits of the solution:
    The node connection status is no longer reported incorrectly.
    
    Workaround:
    Restart the system health monitor (mmsysmoncontrol restart).
    
    Problem trigger:
    Reconnecting broken connections
    
    Symptom:
    Error output/messages
    
    Platforms affected:
    ALL Operating System environments
    
    Functional Area affected:
    System Health
    
    Customer Impact:
    Low
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ29444

  • Reported component name

    SPEC SCALE ADV

  • Reported component ID

    5737F35AP

  • Reported release

    504

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-11-24

  • Closed date

    2021-02-03

  • Last modified date

    2021-02-03

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IJ30684

Fix information

  • Fixed component name

    SPEC SCALE ADV

  • Fixed component ID

    5737F35AP

Applicable component levels

  • Product: IBM Spectrum Scale
  • Version: 504
  • Platform: Platform Independent
  • Line of Business: Storage
  • Business Unit: IBM Infrastructure w/TPS

Document Information

Modified date:
04 February 2021