IBM Support

IJ53799: PMCOLLECTOR SERVICES CRASHES WITH SEGFAULT IN LOGSTORE::READANDPROCESS()

APAR status

  • Closed as program error.

Error description

  • The pmcollector service can segfault in LogStore::readAndProcess(),
    after which the service restarts. The cause is a data race between
    two parallel threads, observed while the pmcollector aggregation
    (which runs every 6 hours) was running and the node was under load.
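
The class of defect described above can be illustrated with a minimal C++ sketch. It is not the pmcollector source: SampleStore, append(), read_and_process() and run_demo() are hypothetical names chosen for illustration, and the actual fix in 5.1.9.8 may use a different mechanism. The sketch assumes one thread appends samples while a second thread periodically reads and clears them, and shows one common way to make that safe, a single mutex held on both paths.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a store that one thread fills while another
// thread aggregates it. This is an illustration, not pmcollector code.
class SampleStore {
public:
    // Writer path: append one sample while holding the lock.
    void append(long value) {
        std::lock_guard<std::mutex> guard(mutex_);
        samples_.push_back(value);
    }

    // Reader path: sum and clear all pending samples under the same lock.
    // Without the lock, push_back() could reallocate the vector while this
    // loop iterates over it, which is undefined behavior and can end in a
    // segmentation fault.
    long read_and_process() {
        std::lock_guard<std::mutex> guard(mutex_);
        long sum = 0;
        for (long v : samples_) {
            sum += v;
        }
        samples_.clear();
        return sum;
    }

private:
    std::mutex mutex_;
    std::vector<long> samples_;
};

// Runs one writer and one aggregator concurrently. Returns the sum of all
// samples processed, including whatever remained after both threads ended.
long run_demo(int sample_count) {
    SampleStore store;
    long total = 0;
    std::thread writer([&] {
        for (int i = 0; i < sample_count; ++i) {
            store.append(1);
        }
    });
    std::thread aggregator([&] {
        for (int i = 0; i < 100; ++i) {
            total += store.read_and_process();
        }
    });
    writer.join();
    aggregator.join();
    total += store.read_and_process();  // drain the remainder
    return total;
}
```

Calling run_demo(10000) should return 10000 on every run, because each sample is added and consumed under the lock. Removing either lock_guard reintroduces the race, which tends to surface only under load, consistent with the trigger reported in this APAR.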
    

Local fix

Problem summary

  • The pmcollector service can segfault in LogStore::readAndProcess(),
    after which the service restarts. The cause is a data race between
    two parallel threads, observed while the pmcollector aggregation
    (which runs every 6 hours) was running and the node was under load.
    

Problem conclusion

  • This problem is fixed in 5.1.9.8.
    To see all Spectrum Scale APARs and their respective
    fixes, refer to:
    https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
    
    
    Benefits of the solution:
    A restart of the pmcollector service causes a loss of performance
    metrics for the duration of the restart and could trigger error
    events in the GUI and mmhealth. Fixing the data race segfault
    prevents such downtime.
    
    Work Around:
    None
    
    Problem trigger:
    
    High system CPU utilization while pmcollector is running
    its data aggregation.
    
    Symptom:
    Abend/Crash
    
    Platforms affected:
    ALL Linux OS environments
    
    Functional Area affected:
    perfmon (Zimon)
    
    Customer Impact:
    Suggested
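
To check an installed level against the fix level named above (5.1.9.8), a dotted-version comparison can be sketched as follows. parse_level() and at_or_above() are hypothetical helper names, not part of any IBM tool. The comparison is only meaningful within the 5.1.9 stream that this APAR reports against; other release streams may deliver the fix at a different level, as listed on the APAR page referenced above.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits a dotted level such as "5.1.9.8" into numeric fields.
// std::stoi throws std::invalid_argument on an empty or non-numeric field.
std::vector<int> parse_level(const std::string& level) {
    std::vector<int> fields;
    std::istringstream in(level);
    std::string part;
    while (std::getline(in, part, '.')) {
        fields.push_back(std::stoi(part));
    }
    return fields;
}

// Returns true when `installed` is at or above `fixed`.
// Missing trailing fields are treated as 0, so "5.1.9" equals "5.1.9.0".
bool at_or_above(const std::string& installed, const std::string& fixed) {
    std::vector<int> a = parse_level(installed);
    std::vector<int> b = parse_level(fixed);
    const std::size_t width = std::max(a.size(), b.size());
    a.resize(width, 0);
    b.resize(width, 0);
    return a >= b;  // std::vector compares lexicographically, field by field
}
```

For example, at_or_above("5.1.9.7", "5.1.9.8") is false and at_or_above("5.1.9.10", "5.1.9.8") is true, because fields are compared as numbers and not as text.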
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ53799

  • Reported component name

    SPEC SCALE STD

  • Reported component ID

    5737F33AP

  • Reported release

    519

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2025-03-04

  • Closed date

    2025-03-04

  • Last modified date

    2025-03-04

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE STD

  • Fixed component ID

    5737F33AP

Applicable component levels

  • Product code STXKQY, version 519, Platform Independent
    (Business Unit: IBM Software; Line of Business: Storage TPS)

Document Information

Modified date:
05 March 2025