APAR status
Closed as program error.
Error description
During a full system reboot, many nodes hung while arbitrating with GPFS and failed to recover.
Reported In: Spectrum Scale 5.0.1.0 on ppc64le
Known Impact: Nodes in arbitrating state
=== mmdiag: waiters ===
Waiting 6.2707 sec since 11:39:36, monitored, thread 63507 ProbeClusterThread: on ThCond 0x7FFBD40A50E0 (TcpConnectCondvar), reason 'wait for outbound connect'
=== mmdiag: waiters ===
Waiting 15.9439 sec since 11:40:26, ignored, thread 63507 ProbeClusterThread: delaying for 14.056008000 more seconds, reason: pause before retry join cluster
Recovery action: Restart GPFS (mmshutdown;mmstartup)
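The two waiter lines quoted in the error description are the signature of this problem: ProbeClusterThread alternates between waiting for an outbound connect and pausing before it retries joining the cluster. The following is a minimal sketch, not an IBM-provided tool, of how such lines could be picked out of captured waiter output (for example, text saved from `mmdiag --waiters`). The regular expression is inferred only from the two lines in this APAR, and the names `WAITER_RE`, `parse_waiters`, and `probe_cluster_waiters` are hypothetical; other waiter formats may not match.

```python
import re

# Pattern inferred from the two waiter lines quoted in this APAR (an assumption,
# not a documented format).
WAITER_RE = re.compile(
    r"Waiting (?P<seconds>\d+(?:\.\d+)?) sec since (?P<since>\d{2}:\d{2}:\d{2}), "
    r"(?P<state>\w+), thread (?P<thread>\d+) (?P<name>\w+): (?P<detail>.*)"
)

def parse_waiters(text):
    """Return one dict per waiter line found in the captured text."""
    waiters = []
    for line in text.splitlines():
        match = WAITER_RE.search(line)
        if match:
            waiter = match.groupdict()
            waiter["seconds"] = float(waiter["seconds"])
            waiter["thread"] = int(waiter["thread"])
            waiters.append(waiter)
    return waiters

def probe_cluster_waiters(waiters):
    """Keep only waiters owned by ProbeClusterThread, the thread seen in this APAR."""
    return [w for w in waiters if w["name"] == "ProbeClusterThread"]

# Sample text copied from the error description.
SAMPLE = """\
=== mmdiag: waiters ===
Waiting 6.2707 sec since 11:39:36, monitored, thread 63507 ProbeClusterThread: on ThCond 0x7FFBD40A50E0 (TcpConnectCondvar), reason 'wait for outbound connect'
=== mmdiag: waiters ===
Waiting 15.9439 sec since 11:40:26, ignored, thread 63507 ProbeClusterThread: delaying for 14.056008000 more seconds, reason: pause before retry join cluster
"""

if __name__ == "__main__":
    for w in probe_cluster_waiters(parse_waiters(SAMPLE)):
        print(w["since"], w["state"], w["detail"])
```

A node that keeps showing these ProbeClusterThread waiters after the reboot is a candidate for the recovery action given above.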
Local fix
Problem summary
GPFS stays in arbitrating state after node reboot
Problem conclusion
The fix provides more debugging information in mmfs.log for cases where a node has trouble connecting to the quorum nodes.
Temporary fix
Comments
APAR Information
APAR number
IJ08826
Reported component name
SPEC SCALE STD
Reported component ID
5737F33AP
Reported release
501
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2018-08-28
Closed date
2019-02-19
Last modified date
2019-02-19
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE STD
Fixed component ID
5737F33AP
Applicable component levels
Document Information
Modified date:
19 February 2019