IBM Support

IJ35907: AGENT HAS HIGH CPU WHEN COLLECTING LINUX FILE INFORMATION DATA WHEN RESOURCE ERROR OCCURS


APAR status

  • Closed as program error.

Error description

  • Problem Description:
    When a situation or historical collection gathers data for the
    Linux File Information attribute group without a filename
    specified, and the agent encounters a pthread_create error, the
    agent loops while retrying the collection, resulting in high
    CPU usage for the agent process.
    
    (6177F69A.40B6-6:klz20agt.cpp,1424,"executeSystemCallInSeparate
    Thread")WARNING: pthread_create failed with error code 11 (Resou
    temporarily unavailable)
    
    The error above occurs when there are insufficient resources to
    create another thread. A system-imposed limit on the number of
    threads was encountered.
    
    
    A technote has been written that details a SLES 12 environment
    setting that causes the resource issue. See the technote for
    more details.
    
    High CPU running IBM Tivoli Monitoring Linux OS Agent on SLES 12
     after upgrade to 6.30 FP7 SP7 (or later):
    https://www.ibm.com/support/pages/node/6552616
    
    
    
    Related Files and Output:
    ------------------------------------
    In the agent RAS1 log file
    <hostname>_lz_klzagent_<timestamp>-<nn>.log, with tracing set to
    (UNIT:klz20agt ALL), the following WARNING messages are
    displayed, followed by the
    executeSystemCallInSeparateThread(READDIR) call for the same
    path over and over.
    
    (6177F69A.40B2-6:klz20agt.cpp,762,"getFileInfo") calling
    executeSystemCallInSeparateThread(READDIR) for
    path=/opt/IBM/ITM/bin/
    (6177F69A.40B3-6:klz20agt.cpp,1324,"executeSystemCallInSeparate
    Thread")Entry
    (6177F69A.40B4-6:klz20agt.cpp,1337,"executeSystemCallInSeparate
    Thread")name passed in:/opt/IBM/ITM/bin/
    (6177F69A.40B5-6:klz20agt.cpp,1384,"executeSystemCallInSeparate
    Thread")Thread for file  "/opt/IBM/ITM/bin/" found in cache
    (6177F69A.40B6-6:klz20agt.cpp,1424,"executeSystemCallInSeparate
    Thread")WARNING: pthread_create failed with error code 11 (Resou
    temporarily unavailable)
    (6177F69A.40B7-6:klz20agt.cpp,1425,"executeSystemCallInSeparate
    Thread")Exit: 0xFFFFFFFF
    (6177F69A.40B8-6:klz20agt.cpp,801,"getFileInfo") readdir failed
    for path /opt/IBM/ITM/bin/. file:itmCANDLEDATA_xx.sh errno = 11;
    Resource temporarily unavailable
    (6177F69A.40B9-6:klz20agt.cpp,762,"getFileInfo") calling
    executeSystemCallInSeparateThread(READDIR) for
    path=/opt/IBM/ITM/bin/
    (6177F69A.40BA-6:klz20agt.cpp,1324,"executeSystemCallInSeparate
    Thread")Entry
    (6177F69A.40BB-6:klz20agt.cpp,1337,"executeSystemCallInSeparate
    Thread")name passed in:/opt/IBM/ITM/bin/
    (6177F69A.40BC-6:klz20agt.cpp,1384,"executeSystemCallInSeparate
    Thread")Thread for file  "/opt/IBM/ITM/bin/" found in cache
    (6177F69A.40BD-6:klz20agt.cpp,1396,"executeSystemCallInSeparate
    Thread")The thread for file "/opt/IBM/ITM/bin/" is probably in h
    (6177F69A.40BE-6:klz20agt.cpp,1397,"executeSystemCallInSeparate
    Thread")Exit: 0xFFFFFFFE
    (6177F69A.40BF-6:klz20agt.cpp,793,"getFileInfo") readdir timed
    out for path:/opt/IBM/ITM/bin/ file:itmCANDLEDATA_xx.sh
    situation:CC_Linux_LZ_Chk777W
    (6177F69A.40C0-6:klz20agt.cpp,762,"getFileInfo") calling
    executeSystemCallInSeparateThread(READDIR) for
    path=/opt/IBM/ITM/bin/
    (6177F69A.40C1-6:klz20agt.cpp,1324,"executeSystemCallInSeparate
    Thread")Entry
    (6177F69A.40C2-6:klz20agt.cpp,1337,"executeSystemCallInSeparate
    Thread")name passed in:/opt/IBM/ITM/bin/
    (6177F69A.40C3-6:klz20agt.cpp,1384,"executeSystemCallInSeparate
    Thread")Thread for file  "/opt/IBM/ITM/bin/" found in cache
    (6177F69A.40C4-6:klz20agt.cpp,1396,"executeSystemCallInSeparate
    Thread")The thread for file "/opt/IBM/ITM/bin/" is probably in h
    (6177F69A.40C5-6:klz20agt.cpp,1397,"executeSystemCallInSeparate
    Thread")Exit: 0xFFFFFFFE
    (6177F69A.40C6-6:klz20agt.cpp,793,"getFileInfo") readdir timed
    out for path:/opt/IBM/ITM/bin/ file:itmCANDLEDATA_xx.sh
    situation:CC_Linux_LZ_Chk777W
    (6177F69A.40C7-6:klz20agt.cpp,762,"getFileInfo") calling
    executeSystemCallInSeparateThread(READDIR) for
    path=/opt/IBM/ITM/bin/
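
To check a RAS1 log for the looping pattern shown in the excerpt above, a short script can count the pthread_create warnings and the READDIR attempts per path. This is a minimal sketch, not an IBM tool: the function name summarize_klz_log is hypothetical, and it assumes the messages in the log use the same wording as the excerpt.

```python
import re
from collections import Counter

# Message fragments taken from the trace lines quoted in this APAR.
EAGAIN_WARNING = re.compile(r"pthread_create failed with error code 11")
READDIR_CALL = re.compile(
    r"executeSystemCallInSeparateThread\(READDIR\) for\s+path=(\S+)"
)


def summarize_klz_log(text):
    """Count EAGAIN warnings and READDIR attempts per path in trace text.

    Hypothetical helper: it only matches the message wording shown in
    the excerpt and does not interpret timestamps or thread IDs.
    """
    return {
        "eagain_warnings": len(EAGAIN_WARNING.findall(text)),
        "readdir_calls": Counter(READDIR_CALL.findall(text)),
    }
```

A large READDIR count for a single path in a short log window, together with at least one EAGAIN warning, would be consistent with the loop described in this APAR.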
    

Local fix

  • 1. Turn off any situations for Linux File Information (LNXFILE)
    where the filename is not specified (e.g. only the path is
    specified). The looping problem does not occur when both the
    path and the filename are specified.
    
    2. The pthread_create documentation lists the following possible
    causes for the error encountered (EAGAIN). Check whether these
    settings are too low and adjust them if so.
    
    A system-imposed limit on the number of threads was
    encountered.  There are a number of limits that may trigger this
    error: the RLIMIT_NPROC soft resource limit (set via
    setrlimit(2)), which limits the number of processes and threads
    for a real user ID, was reached; the kernel's system-wide limit
    on the number of processes and threads,
    /proc/sys/kernel/threads-max, was reached (see proc(5)); or the
    maximum number of PIDs, /proc/sys/kernel/pid_max, was reached
    (see proc(5)).
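
The three limits named above can be read in one pass. The following is a minimal sketch that assumes a Linux host with Python available; the helper names read_kernel_limit and thread_limits are hypothetical. The RLIMIT_NPROC values reported are those of the user running the script, so it should be run as the same user as the agent.

```python
import resource


def read_kernel_limit(path):
    """Return the integer stored in a /proc/sys file, or None if unreadable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def thread_limits():
    """Collect the limits that can make pthread_create fail with EAGAIN."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "rlimit_nproc_soft": soft,  # resource.RLIM_INFINITY means unlimited
        "rlimit_nproc_hard": hard,
        "threads_max": read_kernel_limit("/proc/sys/kernel/threads-max"),
        "pid_max": read_kernel_limit("/proc/sys/kernel/pid_max"),
    }


for name, value in thread_limits().items():
    print(f"{name}: {value}")
```

Comparing these values with the number of processes and threads currently running for the agent's user shows which limit, if any, is close to being reached.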
    

Problem summary

  • High CPU when collecting Linux File Information data when a
    resource error occurs.
    
    
    When a situation or historical collection gathers data for the
    Linux File Information attribute group without a filename
    specified, and the agent encounters a pthread_create error, the
    agent loops while retrying the collection, resulting in high
    CPU usage for the agent process.
    
    (6177F69A.40B6-6:klz20agt.cpp,1424,"executeSystemCallInSeparateT
    hread")WARNING: pthread_create failed with error code 11
    (Resource temporarily unavailable)
    
    The error above occurs when there are insufficient resources to
    create another thread.  A system-imposed limit on the number of
    threads was encountered.
    

Problem conclusion

  • The code has been updated to handle the error condition so that
    it no longer causes the agent to loop.
    
    
    The fix for this APAR is contained in the following maintenance
    packages:
    
       | service pack | 6.3.0.7-TIV-ITM-SP0012
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ35907

  • Reported component name

    ITM AGENT LINUX

  • Reported component ID

    5724C04LN

  • Reported release

    630

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-11-02

  • Closed date

    2022-05-02

  • Last modified date

    2022-05-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    ITM AGENT LINUX

  • Fixed component ID

    5724C04LN

Applicable component levels

  • Business Unit

    IBM Software w/o TPS (BU059)

  • Product

    Tivoli Monitoring (SSTFXA)

  • Platform

    Platform Independent (PF025)

  • Version

    630

  • Line of Business

    Automation (LOB45)

Document Information

Modified date:
08 March 2023