IBM Support

IJ08826: CLIENT NODE STUCK ON PROBECLUSTERTHREAD AFTER REBOOT

APAR status

  • Closed as program error.

Error description

  • During a full system reboot, many nodes hung in the GPFS
    arbitrating state and failed to recover.
    
    Reported In:
    Spectrum Scale 5.0.1.0 on ppc64le
    
    Known Impact:
    Nodes in arbitrating state
    === mmdiag: waiters ===
    Waiting 6.2707 sec since 11:39:36, monitored, thread
    63507 ProbeClusterThread: on ThCond 0x7FFBD40A50E0
    (TcpConnectCondvar), reason 'wait for outbound connect'
    
    === mmdiag: waiters ===
    Waiting 15.9439 sec since 11:40:26, ignored, thread 63507
    ProbeClusterThread: delaying for 14.056008000 more
    seconds, reason: pause before retry join cluster
    
    Recovery action:
    Restart GPFS (mmshutdown; mmstartup)
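
To check whether a node shows the symptom above, the waiter entries can be scanned for ProbeClusterThread. The following is a minimal sketch, assuming the text was captured from `mmdiag --waiters` and has the same shape as the excerpt in this APAR. The names `Waiter`, `parse_waiters`, and `probe_cluster_waiters` are hypothetical helpers and not part of any Spectrum Scale tooling.

```python
import re
from typing import List, NamedTuple


class Waiter(NamedTuple):
    seconds: float      # how long the thread has been waiting
    since: str          # wall-clock time the wait started (HH:MM:SS)
    thread_id: int
    thread_name: str


# Pattern inferred from the two waiter entries quoted in this APAR;
# other waiter formats may exist and would not match.
WAITER_RE = re.compile(
    r"Waiting\s+(\d+(?:\.\d+)?)\s+sec\s+since\s+(\d{2}:\d{2}:\d{2}),"
    r"\s+\w+,\s+thread\s+(\d+)\s+(\w+):"
)


def parse_waiters(text: str) -> List[Waiter]:
    """Extract waiter entries, tolerating entries wrapped across lines."""
    flat = " ".join(text.split())  # collapse line breaks and runs of spaces
    return [
        Waiter(float(m.group(1)), m.group(2), int(m.group(3)), m.group(4))
        for m in WAITER_RE.finditer(flat)
    ]


def probe_cluster_waiters(text: str) -> List[Waiter]:
    """Return only the waiters that belong to ProbeClusterThread."""
    return [w for w in parse_waiters(text) if w.thread_name == "ProbeClusterThread"]


# Sample input copied from the waiter excerpt in this APAR.
SAMPLE = """
=== mmdiag: waiters ===
Waiting 6.2707 sec since 11:39:36, monitored, thread
63507 ProbeClusterThread: on ThCond 0x7FFBD40A50E0
(TcpConnectCondvar), reason 'wait for outbound connect'

=== mmdiag: waiters ===
Waiting 15.9439 sec since 11:40:26, ignored, thread 63507
ProbeClusterThread: delaying for 14.056008000 more
seconds, reason: pause before retry join cluster
"""

if __name__ == "__main__":
    for w in probe_cluster_waiters(SAMPLE):
        print(f"thread {w.thread_id} {w.thread_name}: waiting {w.seconds} sec since {w.since}")
```

A ProbeClusterThread waiter that keeps reappearing across several samples is consistent with the arbitrating state described here; a single occurrence during normal startup may be harmless.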
    

Local fix

Problem summary

  • GPFS stays in the arbitrating state after a node reboot.
    

Problem conclusion

  • The fix adds more debugging information to mmfs.log for cases
    where a node has trouble connecting to the quorum nodes.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ08826

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    501

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-08-28

  • Closed date

    2019-02-19

  • Last modified date

    2019-02-19

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IJ15272

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP

Applicable component levels


Document Information

Modified date:
19 February 2019