IBM Support

IJ33000: AFM: DEADLOCK AFTER ABNORMAL EXIT OF AFM EVICT


APAR status

  • Closed as program error.

Error description

  • After an abnormal exit of AFM evict (the program crashes or is
    killed), Spectrum Scale might report a deadlock such as the following:
    
    Waiting 41491.0316 sec since 23:19:16, monitored, thread
    954231 dmRequestRightHandlerThread: on ThCond
    0x183214B2370 (LkObjCondvar), reason 'waiting for XW
    lock'
    
    Reported In:
    Spectrum Scale 5.1.0.1
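
To check collected waiter output for this symptom, a small parser can help. The following is a minimal sketch that assumes waiter lines follow the exact layout shown above; other releases may format them differently. The names `parse_waiter` and `is_dmapi_xw_waiter` and the `min_seconds` threshold are hypothetical and are not part of Spectrum Scale.

```python
import re

# Waiter text as quoted in this APAR (wrapped across lines, as in the report).
SAMPLE = """Waiting 41491.0316 sec since 23:19:16, monitored, thread
954231 dmRequestRightHandlerThread: on ThCond
0x183214B2370 (LkObjCondvar), reason 'waiting for XW
lock'"""

# Pattern derived only from the sample above; treat it as an assumption.
WAITER_RE = re.compile(
    r"Waiting (?P<seconds>\d+(?:\.\d+)?) sec since (?P<since>\d{2}:\d{2}:\d{2}), "
    r"(?:monitored, )?thread (?P<tid>\d+) (?P<name>\S+): "
    r"on ThCond (?P<addr>0x[0-9A-Fa-f]+) \((?P<cond>[^)]+)\), "
    r"reason '(?P<reason>[^']*)'"
)


def parse_waiter(text):
    """Return a dict for the first waiter found in text, or None."""
    # Output may be wrapped, so collapse all whitespace runs to single spaces.
    flat = " ".join(text.split())
    m = WAITER_RE.search(flat)
    if m is None:
        return None
    return {
        "seconds": float(m.group("seconds")),
        "since": m.group("since"),
        "thread_id": int(m.group("tid")),
        "thread_name": m.group("name"),
        "condvar": m.group("cond"),
        "reason": m.group("reason"),
    }


def is_dmapi_xw_waiter(waiter, min_seconds=300.0):
    """True if the waiter resembles the symptom in this APAR.

    min_seconds is an arbitrary illustrative threshold, not an IBM value.
    """
    return (
        waiter is not None
        and waiter["thread_name"] == "dmRequestRightHandlerThread"
        and "XW lock" in waiter["reason"]
        and waiter["seconds"] >= min_seconds
    )


if __name__ == "__main__":
    print(is_dmapi_xw_waiter(parse_waiter(SAMPLE)))
```

A match only indicates that the waiter looks like the one reported here; it does not confirm that an evict process was killed.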
    

Local fix

  • Restart Spectrum Scale on the gateway node
    

Problem summary

  • In the current implementation of eviction, the eviction program
     first acquires a DMAPI lock on the file and then punches a hole
     in it.
    The program can be terminated at any point before the DMAPI lock
     is released, which leaks the lock. A later attempt to acquire a
     DMAPI lock on the same file can then deadlock, and the only way
     to recover is to restart the mmfsd daemon.
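
The lock leak described above can be illustrated with a toy model. This sketch does not use the DMAPI interface and does not reproduce IBM's fix, which this APAR does not describe. `ToyLockTable`, `evict`, `Killed`, and the `release_all` cleanup are all hypothetical names, and the cleanup step shows only one generic way a lock manager could recover locks from a holder that has gone away.

```python
class ToyLockTable:
    """Toy exclusive-lock table keyed by file name (not the DMAPI interface)."""

    def __init__(self):
        self._owner = {}  # file name -> id of the session holding the lock

    def try_acquire(self, name, session):
        """Take the lock if it is free or already ours; never blocks."""
        holder = self._owner.get(name)
        if holder is not None and holder != session:
            return False
        self._owner[name] = session
        return True

    def release(self, name, session):
        """Release the lock only if this session holds it."""
        if self._owner.get(name) == session:
            del self._owner[name]

    def release_all(self, session):
        """Hypothetical cleanup: drop every lock held by a dead session."""
        stale = [n for n, s in self._owner.items() if s == session]
        for n in stale:
            del self._owner[n]
        return stale


class Killed(Exception):
    """Stands in for the evict program being terminated."""


def evict(table, name, session, die_midway=False):
    """Acquire the lock, 'punch the hole', release. May die before release."""
    if not table.try_acquire(name, session):
        raise RuntimeError("lock on %s is held by another session" % name)
    if die_midway:
        # Models a kill between acquire and release: release never runs.
        raise Killed(name)
    # The hole punch would happen here in a real eviction.
    table.release(name, session)


if __name__ == "__main__":
    table = ToyLockTable()
    try:
        evict(table, "file1", "evict-1", die_midway=True)
    except Killed:
        pass
    # The lock is leaked, so a later session cannot acquire it.
    print(table.try_acquire("file1", "evict-2"))
    # After cleanup of the dead session, the lock is available again.
    table.release_all("evict-1")
    print(table.try_acquire("file1", "evict-2"))
```

In the real product a blocked acquirer waits rather than returning immediately, which is why the leak shows up as a long waiter instead of an error.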
    

Problem conclusion

  • The problem is fixed in 5.0.5 PTF8.
    Benefits of the solution:
    No more deadlock
    
    Work around:
    None
    
    Problem trigger:
    Evicting a file or a list of files when the eviction is
     killed midway through.
    
    Symptom:
    Deadlock
    
    Platforms affected:
    ALL Linux and AIX OS environments
    
    Functional Area affected:
    AFM
    
    Customer Impact:
    High Importance
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ33000

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    505

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-06-04

  • Closed date

    2021-06-04

  • Last modified date

    2021-06-04

  • APAR is sysrouted FROM one or more of the following:

    IJ32512

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP

Applicable component levels

  • Business Unit: IBM Infrastructure w/TPS (BU058)
  • Product: IBM Spectrum Scale (STXKQY)
  • Platform: Platform Independent (PF025)
  • Version: 505
  • Line of Business: Storage (LOB26)

Document Information

Modified date:
05 June 2021