IBM Support

IJ51787: OPTIMIZE HANDLING OF SECURITY CONTEXTS TO ELIMINATE UNNECESSARY RESOURCE UTILIZATION

APAR status

  • Closed as program error.

Error description

  • When a large number of secure connections are created at the
    same time between the mmfsd daemon instances in a Scale cluster,
    some of the secure connections may fail as a result of
    timeouts, resulting in unstable cluster operations.
    

Local fix

  • Stage the rebooting of nodes in large Scale clusters such that
    they don't reboot at the same time.
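
    The staging described above can be planned with a small script.
    The following is a minimal sketch, not an IBM-provided tool: the
    function name plan_reboot_batches, the batch size, and the node
    names are hypothetical. It only computes the order of the batches;
    the reboot itself, and the check that each batch has rejoined the
    cluster before the next one starts, are left to the site's own
    procedures.

```python
from typing import Iterator, List, Sequence


def plan_reboot_batches(nodes: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive groups of at most batch_size nodes (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(nodes), batch_size):
        yield list(nodes[start:start + batch_size])


if __name__ == "__main__":
    # Hypothetical node names; substitute the real cluster node list.
    cluster_nodes = [f"node{i:03d}" for i in range(1, 11)]
    for number, batch in enumerate(plan_reboot_batches(cluster_nodes, 4), start=1):
        # Reboot this batch and wait for it to rejoin before starting the next one.
        print(f"batch {number}: {', '.join(batch)}")
```

    A suitable batch size depends on the cluster and is not stated in
    this APAR; smaller batches mean fewer secure connections are set up
    at once.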
    

Problem summary

  • When a large number of secure connections are created at the
    same time between the mmfsd daemon instances in a Scale cluster,
    some of the secure connections may fail as a result of
    timeouts, resulting in unstable cluster operations.
    

Problem conclusion

  • This problem is fixed in 5.1.9.5.
    To see all Spectrum Scale APARs and their respective
    fix solutions, refer to:
    https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
    
    Benefits of the solution:
    By optimizing the use of security contexts by the mmfsd daemon,
    the number of secure connections between mmfsd daemon instances
    that can be established concurrently increases substantially,
    allowing all necessary secure connections to be established
    successfully. This is particularly beneficial when the nodes of
    a large Scale cluster are rebooted at the same time and all
    mmfsd daemon instances running on the cluster nodes attempt to
    create secure connections with each other at the same time.
    
    Work Around:
    Stage the rebooting of nodes in large Scale clusters such that
    they don't reboot at the same time.
    
    Problem trigger:
    Rebooting all nodes of a large Scale cluster at the same time.
    
    Symptom:
    Unexpected Results/Behavior
    
    Platforms affected:
    ALL
    
    Functional Area affected:
    GPFS Core
    
    Customer Impact:
    High Importance
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ51787

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    519

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2024-07-11

  • Closed date

    2024-07-29

  • Last modified date

    2024-07-29

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP

Applicable component levels


Document Information

Modified date:
30 July 2024