IBM Support

IJ47004: DEADLOCK ON DEALLOCHELPERTHREAD PENDING ON ALLOCMSGTYPEREQUESTOWNERSHIP RPC MESSAGE

APAR status

  • Closed as program error.

Error description

  • Two client nodes perform block deallocations on the same two
    regions. Each node owns one of the regions and is flushing
    it. At the same time, the DeallocHelperThread on each node
    requests ownership of the region owned by the other node.
    Neither revoke-ownership request can complete, because each
    region is in the flushing state while its owner is itself
    waiting on an ownership request to the other node. The two
    nodes therefore deadlock.
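
The circular wait described above can be illustrated with a small wait-for graph. This is a simplified model for illustration only, not GPFS code: the node names, region names, and the functions `build_wait_for` and `find_cycle` are all hypothetical, and the rule "a requester waits on the owner while the region is flushing" is an assumption drawn from the description.

```python
from typing import Dict, List, Optional, Set


def build_wait_for(owners: Dict[str, str],
                   requests: Dict[str, str],
                   flushing: Set[str]) -> Dict[str, str]:
    """Map each requesting node to the node it is blocked on.

    Assumption (simplified model): a request for a region blocks on the
    region's owner for as long as that region is in the flushing state.
    """
    waits_for: Dict[str, str] = {}
    for requester, region in requests.items():
        owner = owners[region]
        if owner != requester and region in flushing:
            waits_for[requester] = owner
    return waits_for


def find_cycle(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return one cycle of nodes waiting on each other, or None."""
    for start in waits_for:
        path: List[str] = []
        seen: Set[str] = set()
        node = start
        # Follow the chain of waits until it ends or revisits a node.
        while node in waits_for and node not in seen:
            seen.add(node)
            path.append(node)
            node = waits_for[node]
        if node in seen:
            return path[path.index(node):]
    return None


# Hypothetical scenario: each node owns and flushes one region while its
# DeallocHelperThread requests the region owned by the other node.
owners = {"region1": "nodeA", "region2": "nodeB"}
requests = {"nodeA": "region2", "nodeB": "region1"}
flushing = {"region1", "region2"}

cycle = find_cycle(build_wait_for(owners, requests, flushing))
print(cycle)  # ['nodeA', 'nodeB']
```

In this model, removing either pending request or either flush breaks the cycle, which is consistent with restarting GPFS on one affected node clearing the condition.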
    

Local fix

  • Restart GPFS on the client node that shows a long waiter on
    the allocMsgTypeRequestOwnership RPC message from
    DeallocHelperThread.
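
As a rough aid for spotting the affected node, the sketch below filters waiter text for lines that mention both names from this APAR. It assumes, without confirmation from the APAR, that a relevant waiter line contains both strings; the function name `suspect_waiters` is hypothetical and no particular waiter output format is implied.

```python
from typing import Iterable, List

# Names taken from this APAR.
THREAD_NAME = "DeallocHelperThread"
RPC_NAME = "allocMsgTypeRequestOwnership"


def suspect_waiters(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention both the thread and the RPC message.

    Assumption: both names appear on the same line of the waiter text.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if THREAD_NAME in line and RPC_NAME in line
    ]
```

The input could be waiter text saved from each client node, for example the output of `mmdiag --waiters`. Whether a waiter counts as long still has to be judged from the wait time reported in that text.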
    

Problem summary

  • Two client nodes perform block deallocations on the same two
    regions. Each node owns one of the regions and is flushing
    it. At the same time, the DeallocHelperThread on each node
    requests ownership of the region owned by the other node.
    Neither revoke-ownership request can complete, because each
    region is in the flushing state while its owner is itself
    waiting on an ownership request to the other node. The two
    nodes therefore deadlock.
    

Problem conclusion

  • This problem is fixed in 5.1.2.12.
    For all Spectrum Scale APARs and their respective fixes,
    refer to:
    https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
    
    Benefits of the solution:
    Avoids a deadlock during block deallocation and flushing.
    
    Work around:
    Restart GPFS on the client node that shows a long waiter on
    the allocMsgTypeRequestOwnership RPC message from
    DeallocHelperThread.
    
    Problem trigger:
    Deallocation of user file data blocks from at least two
    different client nodes.
    
    Symptom:
    Deadlock
    
    Platforms affected:
    All Operating Systems
    
    Functional Area affected:
    All Scale Users
    
    Customer Impact:
    High Importance
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ47004

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    512

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2023-05-25

  • Closed date

    2023-07-19

  • Last modified date

    2023-07-19

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP

Applicable component levels

  • Business unit

    IBM Infrastructure w/TPS (BU058)

  • Product code

    STXKQY

  • Platform

    Platform Independent (PF025)

  • Version

    512

  • Line of business

    Storage (LOB26)

Document Information

Modified date:
20 July 2023