IBM Support

IJ45706: QUOTA IN_DOUBT RAISING WHEN SUSPENDING ONE OF TWO FAILURE GROUPS


APAR status

  • Closed as program error.

Error description

  • In a replicated file system (-r 2), when one of the two
    failure groups is suspended, file writes still succeed by
    allocating disk space on the available failure group, but
    only one replica per logical block is allocated.
    Quota counts only the allocated replica, but the unused
    quota share is never reclaimed or updated. As a result,
    the quota in_doubt value keeps rising erroneously.
    

Local fix

Problem summary

  • In a replicated file system (-r 2), when the disks of a failure
    group are not available, e.g. one of the two failure groups is
    suspended, file writes still succeed by allocating disk space on
    the available failure group, but only one replica per logical
    block is allocated, so the file is ill-replicated. In this
    scenario, quota does not correctly handle the partially
    successful block allocation because the GetLocalQuota and
    FixLocalQuota routines are out of sync. As a result, some quota
    shares (in-doubt) cannot be reclaimed, and the in-doubt values
    increase over time.
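
The arithmetic behind the growing in-doubt value can be pictured with a small toy model. This is only a sketch under stated assumptions, not the GPFS implementation: the class `ToyQuotaEntry`, the `reclaim_unused` switch, and the idea that a share covering both replicas is reserved before allocation are all hypothetical and chosen purely for illustration. The real GetLocalQuota and FixLocalQuota routines are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class ToyQuotaEntry:
    """Toy model of one quota entry; NOT the real GPFS data structure."""
    replicas: int = 2      # mirrors the -r 2 setting described above
    usage_kb: int = 0      # space actually charged for allocated replicas
    in_doubt_kb: int = 0   # share handed out but never reconciled

    def write_block(self, block_kb, available_failure_groups, reclaim_unused):
        # Assumption: a share covering every replica is reserved up front.
        reserved = block_kb * self.replicas
        # Only as many replicas as there are usable failure groups get allocated.
        allocated = block_kb * min(self.replicas, available_failure_groups)
        self.usage_kb += allocated
        unused = reserved - allocated
        if not reclaim_unused:
            # Modelled defect: the unused part of the share is never returned.
            self.in_doubt_kb += unused


def simulate(writes, block_kb, available_failure_groups, reclaim_unused):
    """Run a number of identical block writes against a fresh toy entry."""
    entry = ToyQuotaEntry()
    for _ in range(writes):
        entry.write_block(block_kb, available_failure_groups, reclaim_unused)
    return entry


# One of two failure groups is unavailable in both runs.
before_fix = simulate(10, 256, available_failure_groups=1, reclaim_unused=False)
after_fix = simulate(10, 256, available_failure_groups=1, reclaim_unused=True)
print(before_fix.usage_kb, before_fix.in_doubt_kb)  # 2560 2560
print(after_fix.usage_kb, after_fix.in_doubt_kb)    # 2560 0
```

In the toy model, usage is identical in both runs; only the unreconciled share differs, which matches the described symptom of in-doubt values that keep growing while writes continue.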
    

Problem conclusion

  • This problem is fixed in 5.1.7.1.
    To see all Spectrum Scale APARs and their respective
    fix solutions, refer to:
    https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
    
    Benefits of the solution:
    Fixes quota accounting in data-replicated file systems.
    
    Work Around:
    Run mmcheckquota to correct outstanding in-doubt values.
    
    Problem trigger:
    Unavailability of disks in an entire failure group in a
    replicated file system with two failure groups.
    
    Symptom:
    Quota in_doubt does not decrease after the workload has ceased.
    
    Platforms affected:
    ALL Operating System environments
    
    Functional Area affected:
    Quotas
    
    Customer Impact:
    High Importance
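
The symptom check and the workaround above could be scripted roughly as follows. This is a hedged sketch, not an IBM-provided tool. It assumes the commands live under /usr/lpp/mmfs/bin, that `mmrepquota -Y` prints colon-delimited rows preceded by a HEADER row, and that the header contains a column named `blockInDoubt`; verify all three against your release before relying on it. The helper names and the `SAMPLE` text are hypothetical and purely illustrative, not captured from a real cluster.

```python
import subprocess

# Assumed install location of the Spectrum Scale commands; adjust if needed.
MM_BIN = "/usr/lpp/mmfs/bin"

# Illustrative, hand-written input in the assumed -Y layout (shortened header).
SAMPLE = """\
mmrepquota::HEADER:version:reserved:reserved:filesystemName:quotaType:id:name:blockUsage:blockQuota:blockLimit:blockInDoubt:
mmrepquota::0:1:::fs1:USR:1001:alice:2560:0:0:2560:
mmrepquota::0:1:::fs1:USR:1002:bob:512:0:0:0:
"""


def parse_in_doubt(report_text):
    """Return (quotaType, name, blockInDoubt) for rows with non-zero in-doubt."""
    header = None
    flagged = []
    for line in report_text.splitlines():
        fields = line.split(":")
        if len(fields) < 3:
            continue
        if fields[2] == "HEADER":
            header = fields  # column names come from the report itself
            continue
        if header is None:
            continue
        row = dict(zip(header, fields))
        in_doubt = int(row.get("blockInDoubt") or 0)
        if in_doubt > 0:
            flagged.append((row.get("quotaType"), row.get("name"), in_doubt))
    return flagged


def checkquota_command(device=None):
    """Build the workaround command for one file system, or all with -a."""
    return [f"{MM_BIN}/mmcheckquota", device if device else "-a"]


def check_and_repair(device):
    """Run the report, then mmcheckquota only if in-doubt is outstanding."""
    report = subprocess.run(
        [f"{MM_BIN}/mmrepquota", "-Y", device],
        capture_output=True, text=True, check=True,
    ).stdout
    flagged = parse_in_doubt(report)
    if flagged:
        subprocess.run(checkquota_command(device), check=True)
    return flagged


print(parse_in_doubt(SAMPLE))
```

Calling `check_and_repair("fs1")` on a cluster node would run the report for a file system named `fs1` (a placeholder) and invoke mmcheckquota only when outstanding in-doubt values are found. Because in-doubt values are normally non-zero while a workload is active, such a check is meaningful only after the workload has ceased.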
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ45706

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    516

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2023-03-06

  • Closed date

    2023-04-13

  • Last modified date

    2023-04-13

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP

Applicable component levels

  • Business unit

    IBM Infrastructure w/TPS (BU058)

  • Product code

    STXKQY

  • Platform

    Platform Independent (PF025)

  • Version

    516

  • Line of business

    Storage (LOB26)

Document Information

Modified date:
14 April 2023