APAR status
Closed as program error.
Error description
On an AIX node, under some conditions, including the /var file system becoming full, mmfsd is unable to run child processes. The resulting failures differ depending on which process mmfsd attempts to run. Operations that have been seen to fail include:
- mmadddisk
- mmauth
Once the problem is triggered, it persists until the mmfsd daemon is restarted. If the problem was initiated by the /var file system filling up, freeing space on that file system is not enough to resolve it.
One indication that the problem is occurring is the output of the command
/usr/lpp/mmfs/bin/tslsfs nonexistent_FS
(that is, passing the name of a nonexistent file system as the parameter).
On a system where the problem is occurring, the output is:
mmcommon getEFOptions nonexistent_FS failed. Return code 1.
On a system without the problem, the output is:
mmcommon: File system nonexistent_FS is not known to the GPFS cluster.
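The tslsfs check described above can be scripted. The following is a minimal sketch, not part of the APAR: it assumes the two message texts quoted above are matched as substrings, and the function names classify_tslsfs_output and probe_node are hypothetical. Any other output is reported as "unknown" rather than guessed at.

```python
import subprocess

TSLSFS = "/usr/lpp/mmfs/bin/tslsfs"  # command path quoted in the APAR text
PROBE_FS = "nonexistent_FS"          # must NOT be the name of a real file system


def classify_tslsfs_output(output: str, fs_name: str = PROBE_FS) -> str:
    """Classify captured tslsfs output as 'affected', 'healthy', or 'unknown'."""
    # Message reported by the APAR on a node where mmfsd cannot run child processes.
    if f"getEFOptions {fs_name} failed" in output:
        return "affected"
    # Message reported by the APAR on a node without the problem.
    if f"File system {fs_name} is not known to the GPFS cluster" in output:
        return "healthy"
    return "unknown"


def probe_node() -> str:
    """Run the probe on the local node (hypothetical helper; needs GPFS installed)."""
    result = subprocess.run(
        [TSLSFS, PROBE_FS],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the message may be written to either stream
        text=True,
        check=False,               # a non-zero exit is expected for a bogus name
    )
    return classify_tslsfs_output(result.stdout)
```

probe_node() raises FileNotFoundError on a node where the tslsfs binary is not installed, so it is only meaningful on a GPFS node.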
Local fix
Once the issue in /var is resolved, restart the mmfsd daemon.
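The order of the local fix (resolve /var first, then restart) can be sketched as follows. This is an illustration under assumptions, not a procedure taken from the APAR: the APAR does not name the restart commands, so the use of mmshutdown and mmstartup with -N is an assumption based on the usual administration commands, and the 64 MiB threshold, the function names, and the node name are hypothetical. The sketch only builds the command list and does not execute anything.

```python
import shutil
from typing import List

MMFS_BIN = "/usr/lpp/mmfs/bin"
MIN_FREE_BYTES = 64 * 1024 * 1024  # hypothetical threshold, not from the APAR


def var_free_bytes(path: str = "/var") -> int:
    """Return the free space, in bytes, of the file system holding `path`."""
    return shutil.disk_usage(path).free


def plan_workaround(node: str, free_bytes: int,
                    min_free_bytes: int = MIN_FREE_BYTES) -> List[List[str]]:
    """Return the restart commands for `node`, or [] if /var still looks full.

    Restarting while /var is still full would not help, because the APAR
    states that the underlying /var issue must be resolved first.
    """
    if free_bytes < min_free_bytes:
        return []
    return [
        [f"{MMFS_BIN}/mmshutdown", "-N", node],  # assumed stop command
        [f"{MMFS_BIN}/mmstartup", "-N", node],   # assumed start command
    ]
```

A caller would pass var_free_bytes() as free_bytes and review the returned commands before running them, since restarting mmfsd unmounts the GPFS file systems on that node.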
Problem summary
On an AIX node, under some conditions, including the /var file system becoming full, mmfsd is unable to run child processes. The resulting failures differ depending on which process mmfsd attempts to run. Operations that have been seen to fail include:
- mmadddisk
- mmauth
Once the problem is triggered, it persists until the mmfsd daemon is restarted. If the problem was initiated by the /var file system filling up, freeing space on that file system is not enough to resolve it.
One indication that the problem is occurring is the output of the command
/usr/lpp/mmfs/bin/tslsfs nonexistent_FS
(that is, passing the name of a nonexistent file system as the parameter).
On a system where the problem is occurring, the output is:
mmcommon getEFOptions nonexistent_FS failed. Return code 1.
On a system without the problem, the output is:
mmcommon: File system nonexistent_FS is not known to the GPFS cluster.
Problem conclusion
The problem is fixed in 5.0.5 PTF 8.
Benefits of the solution: With the fix, the problem may still occur while the issue in /var (most likely a lack of space) is present, but once that issue is resolved, mmfsd resumes working properly. Operations that failed while the issue in /var was present can be retried.
Work Around: Once the issue in /var is resolved, restart the mmfsd daemon.
Problem trigger: A likely trigger is the /var file system filling up, possibly around the time an operation is taking place that writes information to the mmfs.log file.
Symptom: Unexpected Results/Behavior
Platforms affected: AIX/Power only
Functional Area affected: All Scale Users
Customer Impact: High
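Whether a given installed level already contains the fix can be checked with a simple version comparison. The sketch below assumes that "5.0.5 PTF 8" corresponds to the dotted level 5.0.5.8, which the APAR does not state explicitly, and the function names are hypothetical. Because the APAR only names the 5.0.5 stream, levels from any other release return None (unknown) instead of a guess.

```python
from typing import Optional, Tuple

FIXED_IN = (5, 0, 5, 8)  # assumed dotted form of "5.0.5 PTF 8"


def parse_level(level: str) -> Tuple[int, ...]:
    """Turn a dotted level such as '5.0.5.8' into a tuple of integers."""
    return tuple(int(part) for part in level.strip().split("."))


def has_fix(level: str) -> Optional[bool]:
    """Return True/False for 5.0.5.x levels, None for any other release."""
    parsed = parse_level(level)
    if parsed[:3] != FIXED_IN[:3]:
        return None  # the APAR says nothing about other release streams
    # Pad a bare '5.0.5' to four components (PTF 0) before comparing.
    padded = (parsed + (0, 0, 0, 0))[:4]
    return padded >= FIXED_IN
```

Comparing integer tuples rather than strings keeps a level such as 5.0.5.10 ordered after 5.0.5.8.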
Temporary fix
Comments
APAR Information
APAR number
IJ32892
Reported component name
SPEC SCALE STD
Reported component ID
5737F33AP
Reported release
505
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-06-04
Closed date
2021-06-04
Last modified date
2021-06-04
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE STD
Fixed component ID
5737F33AP
Applicable component levels
Business Unit: IBM Infrastructure w/TPS
Product: IBM Spectrum Scale
Platform: Platform Independent
Version: 505
Line of Business: Storage
Document Information
Modified date:
05 June 2021