APAR status
Closed as program error.
Error description
If the system clock jumps forward by more than 160 seconds, the internal command "tsctl nQstatus -Y" returns a status of "unresponsive". This triggers the gpfs_unresponsive event and causes CES to fail over to other nodes.
Local fix
Problem summary
If the system clock jumps forward by more than 160 seconds, the internal command "tsctl nQstatus -Y" returns a status of "unresponsive". This triggers the gpfs_unresponsive event and causes CES to fail over to other nodes.
Problem conclusion
The solution uses TickTime (based on ticks since boot) instead of HiResTime to track the 'lastHealthUpdate' time. This prevents a node from being transiently flagged as unresponsive when the system clock is changed (via the 'date' command or by NTP).
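The difference between the two time sources can be illustrated with a small simulation. This is a minimal sketch, not the product's code: HealthTracker, FakeHost, and simulate_clock_jump are hypothetical names introduced here, and only the 160-second threshold is taken from the error description. It assumes the health check simply compares "now" against the last recorded update.

```python
UNRESPONSIVE_AFTER_S = 160.0  # threshold quoted in the error description


class HealthTracker:
    """Hypothetical tracker; `clock` is any zero-argument callable returning seconds."""

    def __init__(self, clock):
        self._clock = clock
        self._last_health_update = clock()

    def record_update(self):
        # Remember when the node last reported itself healthy.
        self._last_health_update = self._clock()

    def is_unresponsive(self):
        # Flag the node when the last update looks older than the threshold.
        return self._clock() - self._last_health_update > UNRESPONSIVE_AFTER_S


class FakeHost:
    """Simulated host with a settable wall clock and a ticks-since-boot counter."""

    def __init__(self):
        self.wall = 1_560_000_000.0  # arbitrary epoch seconds
        self.ticks = 0.0             # seconds since (simulated) boot

    def advance(self, seconds):
        # Real elapsed time moves both clocks.
        self.wall += seconds
        self.ticks += seconds

    def set_clock_forward(self, seconds):
        # A 'date' command or an NTP step changes only the wall clock.
        self.wall += seconds


def simulate_clock_jump(jump_s):
    """Return (wall_clock_flagged, tick_clock_flagged) after a forward clock jump."""
    host = FakeHost()
    wall_tracker = HealthTracker(lambda: host.wall)
    tick_tracker = HealthTracker(lambda: host.ticks)
    host.advance(5.0)              # 5 seconds really pass
    host.set_clock_forward(jump_s)  # then the system clock is stepped forward
    return wall_tracker.is_unresponsive(), tick_tracker.is_unresponsive()


if __name__ == "__main__":
    print(simulate_clock_jump(200.0))
```

In this simulation only the wall-clock tracker reports the node as unresponsive after a 200-second jump; the tick-based tracker sees just the 5 seconds that actually elapsed. In ordinary Python code, time.time() corresponds to the wall clock and time.monotonic() to a clock that is not affected by system clock changes.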
Temporary fix
Comments
APAR Information
APAR number
IJ16283
Reported component name
SPEC SCALE ADV
Reported component ID
5737F35AP
Reported release
502
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2019-05-20
Closed date
2019-06-12
Last modified date
2019-06-12
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE ADV
Fixed component ID
5737F35AP
Applicable component levels
Product
IBM Spectrum Scale
Version
502
Platform
Platform Independent
Business Unit
IBM Infrastructure w/TPS
Line of Business
Storage
Document Information
Modified date:
12 June 2019