IBM Support

IJ40726: AUDITP_MSGQ_UNSUPPORTED CAUSED MMFSD THREADS KEEP INCREASING


APAR status

  • Closed as program error.

Error description

  • After a node was updated from 5.0.5 to 5.1.2 without first
    disabling the message queue, mmhealth showed
    auditp_msgq_unsupported, and the number of mmfsd threads kept
    increasing until a crash caused by a memory allocation failure.
    # ps -eT | grep "$(pidof mmfsd)" | awk '{print $5}' | sort | uniq -c
      349 mmfsd
       81 rdk:broker-1
      243 rdk:broker10000
       81 rdk:main
    
    A few hours later, the rdk:* thread counts, rdk:broker10000
    in particular, had kept growing:
    
    # ps -eT | grep "$(pidof mmfsd)" | awk '{print $5}' | sort | uniq -c
      373 mmfsd
      666 rdk:broker-1
     1998 rdk:broker10000
      666 rdk:main
    
    Reproduced on nodes updated from 5.0.5 to 5.1.2.4 or 5.1.3.1.
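
The per-name thread counts above can also be gathered with a small helper. This is a minimal sketch, not part of the APAR: the function name count_threads_by_name is hypothetical, and it assumes the default `ps -eT` column layout (PID SPID TTY TIME CMD). It matches the PID exactly in the first column, so another process whose PID or thread ID merely contains the same digits is not counted.

```shell
#!/bin/sh
# Hypothetical helper (not an IBM tool): count threads per thread name
# for one PID. Reads `ps -eT` style lines (PID SPID TTY TIME CMD) from
# standard input and prints "<count> <thread name>" lines.
count_threads_by_name() {
    target_pid="$1"
    # Keep only rows whose PID column equals the target, then tally
    # the thread name column the same way the transcript above does.
    awk -v pid="$target_pid" '$1 == pid { print $5 }' | sort | uniq -c
}

# Example use on a live node (assumes pidof returns a single mmfsd PID):
#   ps -eT | count_threads_by_name "$(pidof mmfsd)"
```

Running the example command at intervals and comparing the rdk:* counts shows whether the thread growth described in this APAR is occurring on a node.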
    

Local fix

Problem summary

  • A problem was identified when running in a mixed-level
    cluster where some nodes support msgqueue and others
    do not. On the 5.1.2+ nodes, librdkafka threads are
    created for each I/O event, resulting in thread
    exhaustion on that particular node.
    

Problem conclusion

  • This problem is fixed in 5.1.2 PTF 6.
    To see all Spectrum Scale APARs and their respective fix
    solutions, refer to
    https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
    
    Benefits of the solution:
    The fix prevents these librdkafka threads from being
    created and notifies the user that msgqueue is
    unsupported. A 5.1.2+ node will be unable to generate
    audit events until all nodes are upgraded and moved off
    the deprecated msgqueue infrastructure. Nodes below the
    5.1.2 level will continue to generate audit events.
    
    Work around: None
    Problem trigger:
    Running a cluster where msgqueue is supported.
    Upgrading a node to 5.1.2+ where msgqueue is no longer
    supported. Running IO to the 5.1.2+ node.
    Symptom: Hang/Deadlock/Unresponsiveness/Long Waiters
    Platforms affected:
    ALL Linux OS environments supported by
    Clustered Watch Folder / File Audit Logging
    Functional Area affected: Watch Folder / File audit logging
    Customer Impact: High Importance
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ40726

  • Reported component name

    SPEC SCALE ADV

  • Reported component ID

    5737F35AP

  • Reported release

    512

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2022-06-17

  • Closed date

    2022-06-30

  • Last modified date

    2022-06-30

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE ADV

  • Fixed component ID

    5737F35AP

Applicable component levels


Document Information

Modified date:
01 July 2022