IBM Support

IC89270: cimprovagt segfaults reported when trace file name rolls over

 

APAR status

  • Closed as fixed if next.

Error description

  • When Director Platform Agent cimserver logging is enabled, the
    trace file keeps growing and may reach the 2-gigabyte size
    limit. At that point the file name rolls over with a suffix
    such as ".1", and segfaults for cimprovagt may then be
    reported.
    
    Impact: The Common Information Model (CIM) provider module
    that experiences this cimprovagt segfault crashes and cannot
    be restarted.
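
A minimal sketch of how an administrator might check whether a trace file is close to the rollover point described above. It is not part of the product. The names rollover_headroom, is_near_rollover and check_trace_file are hypothetical, the 100 MiB margin is arbitrary, and "2 Gigabyte" is assumed to mean 2 GiB (2**31 bytes). The APAR does not give the trace file location, so the caller must supply the path.

```python
import os

# Assumption: "2 Gigabyte" is read here as 2 GiB (2**31 bytes).
ROLLOVER_LIMIT_BYTES = 2**31
# Arbitrary warning margin chosen for this sketch (100 MiB).
DEFAULT_MARGIN_BYTES = 100 * 1024 * 1024


def rollover_headroom(size_bytes, limit=ROLLOVER_LIMIT_BYTES):
    """Return how many bytes remain before the assumed rollover limit."""
    return max(limit - size_bytes, 0)


def is_near_rollover(size_bytes, margin_bytes=DEFAULT_MARGIN_BYTES,
                     limit=ROLLOVER_LIMIT_BYTES):
    """Return True if the size is within margin_bytes of the assumed limit."""
    return rollover_headroom(size_bytes, limit) <= margin_bytes


def check_trace_file(path, margin_bytes=DEFAULT_MARGIN_BYTES):
    """Check a trace file on disk; the path is supplied by the caller."""
    return is_near_rollover(os.path.getsize(path), margin_bytes)
```

A check like this could run periodically so that the trace files are cleaned up before the rollover happens.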
    

Local fix

  • To resolve this issue, stop the CIM server by running
    "/etc/init.d/cimserverd stop", remove the trace files, and then
    start the CIM server by running
    "/etc/init.d/cimserverd start".
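
The local fix above can be sketched as a small script. This is an illustration under stated assumptions, not an IBM-supplied tool. The init script path and its stop and start arguments come from the text above. The function names apply_local_fix and run_command are hypothetical, and the trace file paths must be passed in by the caller because the APAR does not say where the trace files are stored.

```python
import os
import subprocess

# Init script path taken from the local fix text.
CIMSERVER_INIT = "/etc/init.d/cimserverd"


def run_command(argv):
    """Run one command and raise if it exits with a non-zero status."""
    subprocess.run(argv, check=True)


def apply_local_fix(trace_files, run=run_command, remove=os.remove):
    """Stop the CIM server, delete the given trace files, start it again.

    trace_files is a caller-supplied list of paths; no location is
    guessed here. Returns the list of actions performed, in order.
    """
    actions = []
    run([CIMSERVER_INIT, "stop"])
    actions.append("stop")
    try:
        for path in trace_files:
            remove(path)
            actions.append("remove " + path)
    finally:
        # Start the server again even if a removal failed.
        run([CIMSERVER_INIT, "start"])
        actions.append("start")
    return actions
```

Passing run=print and remove=print gives a dry run that only prints what would be executed. Running it for real requires the same privileges as running the init script by hand.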
    

Problem summary

  • When Director Platform Agent cimserver logging is enabled, the
    trace file keeps growing and may reach the 2-gigabyte size
    limit. At that point the file name rolls over with a suffix
    such as ".1", but the memory allocated for the trace file name
    does not account for this extra suffix. The result is a buffer
    overrun, after which segfaults for cimprovagt may be reported.
    
    This issue is generally reported for the provider module
    OSBase_MetricValueProvider, but any other provider module may
    also be affected. When it occurs in the
    OSBase_MetricValueProvider module, the functional ping that the
    Platform Agent Watchdog daemon uses to check the status of the
    Gatherer services returns errors, so the Watchdog considers the
    Gatherer services not functioning and tries to restart them.
    As a result, the Linux messages file shows the Gatherer
    services being restarted after four cimprovagt segfaults. A
    typical Linux messages log for this issue follows.
    
    Jan  9 11:09:08 xrng26 cimserver[27288]: A failure was detected
    in provider module OSBase_MetricValueProvider.
    Jan  9 11:09:10 xrng26 kernel: cimprovagt[29096]: segfault at
    000000007261763b rip 0000000000af8175 rsp 00000000ffcc66cc error
    4
    Jan  9 11:09:40 xrng26 kernel: cimprovagt[29672]: segfault at
    000000007261763b rip 0000000000af8175 rsp 00000000ffa3cc1c error
    4
    Jan  9 11:10:40 xrng26 kernel: cimprovagt[31002]: segfault at
    000000007261763b rip 0000000000af8175 rsp 00000000ffa778dc error
    4
    Jan  9 11:11:40 xrng26 kernel: cimprovagt[32224]: segfault at
    000000007261763b rip 0000000000af8175 rsp 00000000fff94c5c error
    4
    Jan  9 11:12:41 xrng26 gatherd[1074]: Gatherd is starting up.
    Jan  9 11:12:41 xrng26 reposd[1078]: Reposd is starting up.
    Jan  9 11:12:41 xrng26 reposd[1079]: Remote reposd is starting
    up.
    Jan  9 11:12:47 xrng26 kernel: cimprovagt[1208]: segfault at
    000000007261763b rip 0000000000af8175 rsp 00000000ffebb4fc error
    4
    Jan  9 11:13:17 xrng26 kernel: cimprovagt[1892]: segfault at
    000000007261763b rip 0000000000af8175 rsp 00000000ffbb6f1c error
    4
    Jan  9 11:14:17 xrng26 kernel: cimprovagt[3171]: segfault at
    000000007261763b rip 0000000000af8175 rsp 00000000ffcfb87c error
    4
    Jan  9 11:15:17 xrng26 kernel: cimprovagt[4492]: segfault at
    000000007261763b rip 0000000000af8175 rsp 00000000ffec4b9c error
    4
    Jan  9 11:16:17 xrng26 gatherd[5757]: Gatherd is starting up.
    Jan  9 11:16:17 xrng26 reposd[5765]: Reposd is starting up.
    Jan  9 11:16:17 xrng26 reposd[5768]: Remote reposd is starting
    up.
    
    This issue will be fixed in a future release of the IBM Systems
    Director Platform Agent.
    
    Platforms: All Linux Platforms that are supported by IBM Systems
    Director
    Versions:  IBM Systems Director Platform Agent v6.3, v6.3.1,
    v6.3.2
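
The log pattern described above (a run of cimprovagt segfaults followed by a Gatherd startup) can be detected with a short script. The following is a rough sketch, not an IBM diagnostic tool. The function name segfaults_before_restarts and the regular expressions are assumptions derived from the sample log in this APAR, and each log entry is assumed to sit on a single line, as it would in the messages file itself.

```python
import re

# Patterns derived from the sample log lines in this APAR.
SEGFAULT_RE = re.compile(r"kernel: cimprovagt\[\d+\]: segfault at")
GATHERD_START_RE = re.compile(r"gatherd\[\d+\]: Gatherd is starting up")


def segfaults_before_restarts(lines):
    """Count the cimprovagt segfaults seen before each Gatherd startup."""
    counts = []
    pending = 0
    for line in lines:
        if SEGFAULT_RE.search(line):
            pending += 1
        elif GATHERD_START_RE.search(line):
            counts.append(pending)
            pending = 0
    return counts


# Entries copied from the sample log, each joined onto one line.
sample = [
    "Jan  9 11:09:08 xrng26 cimserver[27288]: A failure was detected "
    "in provider module OSBase_MetricValueProvider.",
    "Jan  9 11:09:10 xrng26 kernel: cimprovagt[29096]: segfault at "
    "000000007261763b rip 0000000000af8175 rsp 00000000ffcc66cc error 4",
    "Jan  9 11:09:40 xrng26 kernel: cimprovagt[29672]: segfault at "
    "000000007261763b rip 0000000000af8175 rsp 00000000ffa3cc1c error 4",
    "Jan  9 11:10:40 xrng26 kernel: cimprovagt[31002]: segfault at "
    "000000007261763b rip 0000000000af8175 rsp 00000000ffa778dc error 4",
    "Jan  9 11:11:40 xrng26 kernel: cimprovagt[32224]: segfault at "
    "000000007261763b rip 0000000000af8175 rsp 00000000fff94c5c error 4",
    "Jan  9 11:12:41 xrng26 gatherd[1074]: Gatherd is starting up.",
    "Jan  9 11:12:41 xrng26 reposd[1078]: Reposd is starting up.",
]

print(segfaults_before_restarts(sample))  # prints [4]
```

Repeated counts of 4 across the messages file would match the restart cycle described in this APAR. Other counts do not rule the issue out, since the patterns here are based only on the one sample log.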
    

Problem conclusion

  • The fix for this APAR is available in IBM Systems Director
    release 6.3.3.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC89270

  • Reported component name

    IBM DIR AGT XLI

  • Reported component ID

    5765DRXLA

  • Reported release

    631

  • Status

    CLOSED FIN

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-12-26

  • Closed date

    2013-01-10

  • Last modified date

    2013-06-20

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

Applicable component levels

  • R610 PSN

       UP

  • R611 PSN

       UP

  • R612 PSN

       UP

  • R620 PSN

       UP

  • R621 PSN

       UP

  • R630 PSN

       UP

  • R631 PSN

       UP

  • R632 PSN

       UP

Document Information

Modified date:
22 August 2022