APAR status
Closed as program error.
Error description
Due to a change in Spectrum Scale 5.0.5.1, an OpenDevice call was not followed by a matching CloseDevice call, which left unused file descriptors open. When the number of open file descriptors exceeds 64k, mmfsd asserts while opening a socket or a file with: [X] logAssertFailed: isValidSocket(sock). The assert causes the node to recycle and temporarily lose access to the file system.
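The leak described above can be observed by counting a process's open file descriptors before the limit is reached. The following is a minimal sketch, not part of the APAR: it assumes a Linux node where /proc/<pid>/fd is readable (typically as root), and the function names and the warning threshold of 50000 are hypothetical values chosen for illustration.

```shell
#!/bin/sh
# Sketch only: Linux-specific, relies on /proc/<pid>/fd being readable.

# Hypothetical warning threshold, chosen below the 64k limit named in the APAR.
FD_WARN_THRESHOLD=50000

# Print the number of open file descriptors for the given PID.
# Prints 0 if the process does not exist or its fd directory is unreadable.
count_open_fds() {
    pid="$1"
    # Each entry in /proc/<pid>/fd corresponds to one open descriptor.
    ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Report the descriptor count for a PID; return 1 if it reaches the threshold.
check_fd_usage() {
    pid="$1"
    count=$(count_open_fds "$pid")
    if [ "$count" -ge "$FD_WARN_THRESHOLD" ]; then
        echo "WARNING: pid $pid has $count open file descriptors"
        return 1
    fi
    echo "OK: pid $pid has $count open file descriptors"
    return 0
}
```

On an affected node this could be run as check_fd_usage "$(pidof mmfsd)", for example from a periodic job. A count that keeps growing over time would be consistent with the leak, although the count alone does not identify its cause.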
Local fix
One cause of the file descriptor usage is the periodic diskIOcheck run, which occurs every 300 seconds by default. Disabling it stops these checks from adding to the leak:
# mmchconfig diskIOCheckInterval=0 -i
Problem summary
logAssertFailed:isValidSocket(sock) line 2661 of file thread.C
Problem conclusion
Benefits of the solution: No more assert
Work Around: No stable workaround exists, but avoiding file system unmounts and keeping the NSD server nodes and disks stable could help.
Problem trigger: File system unmount operations and disk path rediscovery or NSD disk stats changes.
Symptom: mmfsd crash
Platforms affected: All Operating Systems
Functional Area affected: All Scale users
Customer Impact: Critical
Changed Externals: None
Temporary fix
Comments
APAR Information
APAR number
IJ28321
Reported component name
SPEC SCALE STD
Reported component ID
5737F33AP
Reported release
505
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-09-23
Closed date
2020-09-28
Last modified date
2020-09-28
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE STD
Fixed component ID
5737F33AP
Applicable component levels
Document Information
Modified date:
29 September 2020