APAR status
Closed as program error.
Error description
In a replicated file system (-r 2), when one of the disks is suspended (more precisely, when one of the two failure groups is suspended), file writes still succeed by allocating disk space in the available failure group (disk), but only one replica per logical block is allocated. Quota counts only the allocated replica, but the quota share is not reclaimed or updated. As a result, the quota in_doubt value keeps rising erroneously.
Local fix
Problem summary
In a replicated file system (-r 2), when the disks of a failure group are not available, e.g. one of the two failure groups is suspended, file writes succeed by allocating disk space in the available failure group, but only one replica per logical block is allocated, so the file is ill-replicated. In this scenario, quota does not correctly handle the partially successful block allocation because the GetLocalQuota and FixLocalQuota routines are out of sync. As a result, some quota shares (in-doubt) become unreclaimable, and in-doubt values increase over time.
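The accounting drift described above can be illustrated with a toy model. This is a minimal sketch, not the Spectrum Scale implementation: the QuotaLedger class, the write_blocks and simulate functions, the reclaim_unused flag, and the block counts are all hypothetical names and values chosen for illustration. It assumes a share is requested for both replicas and that, in the faulty case, the part of the share that was never allocated is not handed back.

```python
from dataclasses import dataclass


@dataclass
class QuotaLedger:
    """Toy ledger (hypothetical): confirmed usage plus outstanding in-doubt shares."""
    usage: int = 0     # blocks confirmed as allocated
    in_doubt: int = 0  # blocks handed out as shares and not yet reconciled

    def grant_share(self, blocks: int) -> int:
        # A granted share is in doubt until it is reconciled.
        self.in_doubt += blocks
        return blocks

    def reconcile(self, allocated: int, returned: int) -> None:
        # Allocated blocks become usage; returned blocks go back to the pool.
        self.usage += allocated
        self.in_doubt -= allocated + returned


def write_blocks(ledger, logical_blocks, replicas_wanted,
                 replicas_available, reclaim_unused):
    # The share is sized for full replication (assumption of this model).
    share = ledger.grant_share(logical_blocks * replicas_wanted)
    # Only the replicas that can actually be placed are allocated.
    allocated = logical_blocks * min(replicas_wanted, replicas_available)
    unused = share - allocated
    # Faulty behavior: the unused part of the share is never returned.
    ledger.reconcile(allocated, unused if reclaim_unused else 0)


def simulate(writes, reclaim_unused):
    ledger = QuotaLedger()
    for _ in range(writes):
        # 100 logical blocks, -r 2, but only one failure group available.
        write_blocks(ledger, 100, 2, 1, reclaim_unused)
    return ledger


if __name__ == "__main__":
    print("without reclaim:", simulate(5, reclaim_unused=False))
    print("with reclaim:   ", simulate(5, reclaim_unused=True))
```

In this model, every partially replicated write leaves half of its share stranded when the unused portion is not returned, so the in-doubt value grows with the workload and stays there after the workload stops.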
Problem conclusion
This problem is fixed in 5.1.7.1.
To see all Spectrum Scale APARs and their respective fix solutions, refer to: https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
Benefits of the solution: Fix quota accounting in data replicated file systems.
Work Around: Run mmcheckquota to correct outstanding in-doubt values.
Problem trigger: Unavailability of disks in an entire failure group in a replicated file system with two failure groups.
Symptom: Quota in_doubt does not decrease after the workload has ceased.
Platforms affected: ALL Operating System environments
Functional Area affected: Quotas
Customer Impact: High Importance
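The workaround can be scripted on systems that have not yet been upgraded. The following is a minimal sketch that only builds and optionally runs the mmcheckquota command for one file system. The helper names, the dry_run parameter, and the device name fs1 are assumptions for illustration. The script must run on a cluster node with the Spectrum Scale commands in PATH and sufficient privileges.

```python
import shlex
import subprocess
from typing import List


def mmcheckquota_command(device: str) -> List[str]:
    """Build the argument list for checking quota on one file system."""
    # Reject empty names and anything that would be parsed as an option.
    if not device or device.startswith("-"):
        raise ValueError("device must be a file system name, e.g. 'fs1'")
    return ["mmcheckquota", device]


def run_quota_check(device: str, dry_run: bool = True) -> str:
    """Return the command line; execute it only when dry_run is False."""
    cmd = mmcheckquota_command(device)
    if not dry_run:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)


if __name__ == "__main__":
    # 'fs1' is a hypothetical device name; dry run only prints the command.
    print(run_quota_check("fs1"))
```

Keeping the dry run as the default avoids starting a quota check by accident, since the check can take a long time and add I/O load on large file systems.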
Temporary fix
Comments
APAR Information
APAR number
IJ45706
Reported component name
SPEC SCALE STD
Reported component ID
5737F33AP
Reported release
516
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2023-03-06
Closed date
2023-04-13
Last modified date
2023-04-13
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE STD
Fixed component ID
5737F33AP
Applicable component levels
Document Information
Modified date:
14 April 2023