IBM Support

IJ22372: DEADLOCK RIGHT AFTER USER SCRIPT GPFSREADY FAIL

APAR status

  • Closed as program error.

Error description

  • A failure of the gpfsready user script caused mmfsd to go down.
    However, the shutdown then hit a deadlock that delayed it by
    5 minutes.
    
    Reported in:
    Spectrum Scale 5.0.2.3
    
    Known Impact:
    Spectrum Scale takes more than 5 minutes to complete a reboot
    because of the deadlock hang.
    

Local fix

Problem summary

  • When use of the gpfsready script is configured (the
    'verifyGpfsReady' configuration variable must be set to 'true'),
    this user script is called during GPFS startup. If the
    user-specific checks in the script fail and the script returns a
    non-zero exit code, GPFS goes down. During the GPFS shutdown that
    follows, the cleanup thread performing the shutdown in the GPFS
    mmfsd daemon waits for a mutex that another thread has acquired
    but not yet released. The cleanup thread itself has sent that
    other thread to a particular handler routine before it had a
    chance to release the mutex. As a result, the cleanup thread
    cannot make progress, and the other thread waits in the handler
    routine for 5 minutes before it exits.
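
The locking pattern described above can be illustrated with a small, self-contained Python model. This is only a sketch of the general pattern, not mmfsd code: the names simulate, worker, handler and HANDLER_WAIT are hypothetical, and the 5-minute wait is scaled down to a fraction of a second so the example finishes quickly.

```python
import threading
import time

# Hypothetical stand-in for the 5-minute wait in the handler routine,
# scaled down so the model runs quickly.
HANDLER_WAIT = 0.5


def simulate(handler_wait=HANDLER_WAIT):
    """Model a cleanup thread blocked on a mutex whose holder has been
    diverted into a waiting handler before it could release the mutex."""
    mutex = threading.Lock()
    holding = threading.Event()   # set once the worker owns the mutex
    diverted = threading.Event()  # set when cleanup diverts the worker

    def handler():
        # The diverted thread waits here while still owning the mutex.
        time.sleep(handler_wait)

    def worker():
        with mutex:
            holding.set()
            diverted.wait()
            handler()  # entered before the mutex could be released
        # The mutex is released only after the handler wait is over.

    thread = threading.Thread(target=worker)
    thread.start()
    holding.wait()

    # From here on, the calling thread plays the cleanup thread.
    start = time.monotonic()
    diverted.set()  # send the other thread to the handler
    got_quickly = mutex.acquire(timeout=handler_wait / 5)
    if not got_quickly:
        mutex.acquire()  # blocks until the worker leaves the handler
    blocked_for = time.monotonic() - start
    mutex.release()
    thread.join()
    return got_quickly, blocked_for


if __name__ == "__main__":
    quick, seconds = simulate()
    print(f"acquired without waiting: {quick}, blocked for {seconds:.2f}s")
```

In this model the cleanup side cannot obtain the mutex until the handler wait has elapsed, which corresponds to the delay the APAR reports. The model does not show how the actual fix was implemented.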
    

Problem conclusion

  • Benefits of the solution:
    The 5-minute delay during GPFS shutdown no longer occurs
    when the gpfsready script fails during GPFS startup.
    Work around:
    Not available.
    Problem trigger:
    gpfsready user script fails during GPFS startup.
    Symptom:
    Hang
    Platforms affected:
    Seen only on x86_64-linux so far; other platforms may be affected
    Functional Area affected:
    GPFS mmfsd startup
    Customer Impact:
    High Importance
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ22372

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    502

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-01-29

  • Closed date

    2020-02-21

  • Last modified date

    2020-02-21

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IJ22939

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP

Applicable component levels

  • Product: IBM Spectrum Scale (STXKQY)
  • Version: 502
  • Platform: Platform Independent (PF025)
  • Business Unit: IBM Infrastructure w/TPS (BU058)
  • Line of Business: Storage (LOB26)

Document Information

Modified date:
21 February 2020