IBM Support

IJ42283: LOG GROUP CAN BE UNAVAILABLE WHEN RG FAILS TO RECOVER AND PDISK IS RECOVERED FROM MISSING STATE

APAR status

  • Closed as program error.

Error description

  • A race between recovery group (RG) recovery and the broadcast
    of pdisk state updates can make a pdisk appear to be missing,
    which prevents log group recovery and blocks I/O.
    

Local fix

  • Restart the GPFS daemons on the GNR nodes that manage the
    affected recovery group.
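As a rough illustration of the restart described above, the following sketch only builds the command lines and prints them; it does not run anything. It assumes the standard Spectrum Scale administration commands mmshutdown and mmstartup with their -N node-list option. The node names ess-io1 and ess-io2 and the helper restart_commands are hypothetical placeholders, so substitute the servers of the affected recovery group and confirm the procedure against the IBM documentation for your release.

```python
import shlex
from typing import List, Sequence


def restart_commands(nodes: Sequence[str]) -> List[List[str]]:
    """Build (but do not run) the commands that restart GPFS on the given nodes.

    Hypothetical helper: 'nodes' must be the GNR servers of the affected
    recovery group; the names used below are placeholders only.
    """
    if not nodes:
        raise ValueError("at least one node name is required")
    node_list = ",".join(nodes)
    return [
        ["mmshutdown", "-N", node_list],  # stop the GPFS daemon on those nodes
        ["mmstartup", "-N", node_list],   # start it again
    ]


if __name__ == "__main__":
    # Dry run: print the commands for review instead of executing them.
    for cmd in restart_commands(["ess-io1", "ess-io2"]):
        print(shlex.join(cmd))
```

Printing the commands first allows the node list to be reviewed before anything is stopped on a production system.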
    

Problem summary

  • A race between recovery group (RG) recovery and the broadcast
    of pdisk state updates can make a pdisk appear to be missing,
    which prevents log group recovery and blocks I/O.
    

Problem conclusion

  • This problem is fixed in 5.1.5 PTF 1.
    To see all Spectrum Scale APARs and their respective fix
    solutions, refer to:
    https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
    
    Benefits of the solution:
    The pdisk state version still increments as it always has,
    but any state update broadcast is now delayed until any
    in-progress RG recovery completes.
    
    Workaround: Restart the GPFS daemons on the GNR nodes that
    manage the affected recovery group.
    Problem trigger: Recovery group and log group failure from too
    many missing pdisks, which could be caused by bad disks, nodes,
    or network.
    Symptom: Stuck I/O
    Platforms affected: Linux only
    Functional area affected: ESS/GNR
    Customer impact: High importance
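
The fix described under "Benefits of the solution" can be pictured with a small toy model. This is a sketch of the general technique only (increment the version immediately, defer the broadcast while recovery is running), not IBM's implementation. The class PdiskStateBroadcaster, its methods, the send callback, and the pdisk name e1s01 are all hypothetical, and queuing every deferred update in arrival order is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class PdiskStateBroadcaster:
    """Toy model: versions always advance; broadcasts wait for RG recovery."""

    send: Callable[[str, str, int], None]  # hypothetical broadcast callback
    recovery_in_progress: bool = False
    versions: Dict[str, int] = field(default_factory=dict)
    pending: List[Tuple[str, str, int]] = field(default_factory=list)

    def begin_recovery(self) -> None:
        self.recovery_in_progress = True

    def update_state(self, pdisk: str, state: str) -> int:
        # The state version increments unconditionally, as before the fix.
        version = self.versions.get(pdisk, 0) + 1
        self.versions[pdisk] = version
        if self.recovery_in_progress:
            # Defer the broadcast instead of racing with RG recovery.
            self.pending.append((pdisk, state, version))
        else:
            self.send(pdisk, state, version)
        return version

    def end_recovery(self) -> None:
        # Recovery is complete: flush deferred broadcasts in arrival order.
        self.recovery_in_progress = False
        deferred, self.pending = self.pending, []
        for pdisk, state, version in deferred:
            self.send(pdisk, state, version)


if __name__ == "__main__":
    sent: List[Tuple[str, str, int]] = []
    b = PdiskStateBroadcaster(send=lambda p, s, v: sent.append((p, s, v)))
    b.begin_recovery()
    b.update_state("e1s01", "missing")
    b.update_state("e1s01", "ok")
    print(sent)  # nothing has been broadcast while recovery is in progress
    b.end_recovery()
    print(sent)  # both updates are delivered after recovery completes
```

In this model no peer sees a pdisk state update while recovery is still running, which is the ordering the fix text describes.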
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ42283

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    515

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2022-09-08

  • Closed date

    2022-09-08

  • Last modified date

    2022-09-08

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP

Applicable component levels


Document Information

Modified date:
08 September 2022