IBM Support

IJ49701: PROCESSES HANG DUE TO DEADLOCKS IN OUR STORAGE SCALE CLUSTER

APAR status

  • Closed as program error.

Error description

  • Processes hang due to deadlocks in our Storage Scale cluster.
    There are deadlock notifications on multiple nodes, which were
    triggered by 'long waiter' events on those nodes.
    

Local fix

Problem summary

  • Processes hang due to deadlocks in our Storage Scale cluster.
    There are deadlock notifications on multiple nodes, which were
    triggered by 'long waiter' events on those nodes.
    

Problem conclusion

  • This problem is fixed in 5.1.9.2.
    To see all Spectrum Scale APARs and their respective
    fix solutions, refer to:
    https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
    
    Benefits of the solution:
    Newer Linux kernels try to determine whether performing a readahead
    would be beneficial and trigger it if so. This creates
    challenges for distributed file systems and often results in
    locking-related issues. The solution implements the readahead
    VFS operation to handle this issue.
    
    Work Around:
    None
    
    Problem trigger:
    A single large file is read sequentially from one node (which
    causes readahead to be performed on the file, or readahead is
    forced with a posix_fadvise call) while the same file is
    truncated or deleted from another node at the same time.
    
    Symptom:
    Client processes hang and the system deadlocks.
    
    Platforms affected:
    Linux Only
    
    Functional Area affected:
    Regular file read flow in kernel version >= 5.14
    
    Customer Impact:
    High Importance
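
The read side of the trigger described above can be illustrated with a short Python sketch. It is not a reproducer: the hang also needs a concurrent truncate or delete of the same file from another node, which is not shown. The function name and chunk size are hypothetical, and the readahead hints are treated as best-effort advice that the kernel may ignore.

```python
import os


def read_sequentially_with_readahead(path, chunk_size=1 << 20):
    """Read `path` front to back after hinting readahead; return bytes read.

    Hypothetical helper for illustration only. It shows the access pattern
    named in the problem trigger: a sequential read of one file, with
    readahead requested explicitly through posix_fadvise.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if hasattr(os, "posix_fadvise"):
            try:
                # Declare a sequential access pattern for the whole file
                # (a length of 0 means "through end of file").
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
                # Ask the kernel to start reading the file into the page cache.
                os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
            except OSError:
                # The hints are advisory; carry on with a plain read.
                pass
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

On an affected node, this pattern becomes a problem only when another node truncates or deletes the file while the read is in progress.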
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ49701

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    519

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2024-01-10

  • Closed date

    2024-01-10

  • Last modified date

    2024-01-10

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP
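
Whether a node falls in the range this APAR describes (Linux kernel 5.14 or later, Storage Scale below 5.1.9.2) can be checked with a small Python sketch. The helper names are hypothetical, the version strings are supplied by the caller, and the check assumes that any level numerically below 5.1.9.2 lacks the fix, which may not hold for other service streams.

```python
import re

FIXED_LEVEL = (5, 1, 9, 2)        # fix level stated in this APAR
MIN_AFFECTED_KERNEL = (5, 14)     # kernel version stated in this APAR


def leading_numbers(text, count):
    """Return the first `count` dot-separated integers at the start of `text`.

    Missing trailing components are padded with zeros, so "5.1.9" becomes
    (5, 1, 9, 0) when count is 4.
    """
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if not match:
        raise ValueError(f"no version number found in {text!r}")
    nums = [int(part) for part in match.group(0).split(".")]
    nums += [0] * (count - len(nums))
    return tuple(nums[:count])


def may_be_exposed(kernel_release, scale_version):
    """Return True if the kernel is >= 5.14 and the level is below 5.1.9.2.

    Assumption: a simple numeric comparison is enough to tell whether the
    fix is present. Confirm against the official fix list before acting.
    """
    kernel = leading_numbers(kernel_release, 2)
    level = leading_numbers(scale_version, 4)
    return kernel >= MIN_AFFECTED_KERNEL and level < FIXED_LEVEL
```

For example, `may_be_exposed(platform.release(), "5.1.9.1")` evaluates the running kernel (after `import platform`) against a level the caller has looked up separately.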

Applicable component levels

Document Information

Modified date:
11 January 2024