IBM Support

IZ55766: EVENTS DROPPED BY KPX DURING AN EVENT STORM

 

APAR status

  • Closed as program error.

Error description

  • Severity:         2
    Approver:        DK
    Reported Release:621
    Compid:          5724C04TE Tivoli Enterprise Management Agent
    Abstract:        Events dropped by kpx during an event storm
    
    Environment:
      HUB TEMS: ITM 6.2.1 IF0003 on AIX 5.3/6.1 64-bit
    
    Problem Description:
      During an event storm in which an agent sends 2000 rows of data
    to the TEMS at once, 1-2 events are dropped even though each row
    has Display Item set (a unique value in each row).
    
      L3's root-cause analysis in PMR 24853,694,760:
    The incoming row that kpx misses is appended to the RequestData
    buffer by a code path from IRA_NCS_Sample, which receives all
    incoming data.  At the same time, KPXLOC_TakeSample, which
    processes the rows and forwards them to the data server, is
    clearing the same buffer on another thread.  As a result, the row
    is lost even though the TEMS has received it.
    
    Detailed Recreation Procedure:
    [1] Extract test.mdl, test-2K.sh and test_c_1.xml from
        24853.694.760.20090521c.tar.Z.
    [2] Install and configure TEMS, TEPS and UA on an AIX platform.
    [3] Configure UA to use File Data Provider.
    [4] Start TEMS, TEPS and UA.
    [5] Import test.mdl into UA.
    [6] Import test_c_1 situation from test_c_1.xml.
    [7] Run test-2K.sh, which adds 2,000 lines to /tmp/test3.log; each
        line causes a "test_c_1" pure event to be sent from UA to the
        TEMS.
    [8] Count the "test_c_1* is true" lines in the TEMS operation log.
    [9] If the number of "test_c_1* is true" lines is less than 2,000,
        the problem is recreated.
    
    Please note that this problem occurs intermittently.  Repeat
    steps [7] through [9] if the problem is not recreated.
    
    Related Files and Output:
      TEMS traces were collected using the debug module with
    'KBB_RAS1=ERROR (UNIT:kpx ALL)'.  In this run, the TEMS lost two
    pure events, with display items <07:15:12.062> and <07:15:13.660>.
    Directory: /ecurep/pmr/2/4/24853,694,760
    File Name: 24853.694.760.20090619a.tar.Z and
               24853.694.760.20090521c.tar.Z
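
Steps [8] and [9] of the recreation procedure amount to counting
matching log lines and comparing the count with the number of rows
sent. A minimal Python sketch of that check follows. The helper names
`count_true_events` and `check_log` are hypothetical, the log path is
left to the caller, and the pattern simply treats the `*` in
"test_c_1* is true" as "any characters"; the real operation log format
may call for a stricter match.

```python
import re


def count_true_events(lines, situation="test_c_1"):
    """Count lines that match '<situation>* is true' (assumed pattern)."""
    pattern = re.compile(re.escape(situation) + r".* is true")
    return sum(1 for line in lines if pattern.search(line))


def check_log(path, expected=2000):
    """Return (events found, events missing) for a log file at `path`."""
    with open(path, encoding="utf-8", errors="replace") as log:
        found = count_true_events(log)
    return found, expected - found
```

If the second value returned by `check_log` is greater than zero, the
run dropped events and the problem has been recreated.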
    

Local fix

Problem summary

  • During an event storm in which an agent sends 2000 rows of data
    to the management server at once, 1-2 events are dropped even
    though each row has Display Item set (a unique value in each
    row).
    

Problem conclusion

  • The missing row of data is received by the TEMS. However, it is
    erased by another thread that processes the data into an event.
    
    
    The fix for this APAR is contained in the following maintenance
    packages:
    
       | fix pack | 6.2.1-TIV-ITM-FP0001
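
The conclusion above describes a lost update: a row is appended to a
shared buffer while another thread is clearing it. The Python sketch
below illustrates that pattern and one common remedy, which is to take
the batch and reset the buffer under a single lock. It is an
illustration only. The APAR does not describe how the fix was
implemented, and `unsafe_take_sample`, `RequestBuffer`, and `run_storm`
are hypothetical names, not kpx code.

```python
import threading


def unsafe_take_sample(rows, arrives_midway=None):
    """Copy-then-clear with no locking; shows how a row can vanish."""
    batch = list(rows)               # step 1: copy the rows to forward
    if arrives_midway is not None:
        rows.append(arrives_midway)  # simulated append by the receiver thread
    rows.clear()                     # step 2: clear, discarding the late row
    return batch


class RequestBuffer:
    """Hypothetical shared buffer guarded by one lock."""

    def __init__(self):
        self._rows = []
        self._lock = threading.Lock()

    def append(self, row):
        # Receiver path: add one incoming row.
        with self._lock:
            self._rows.append(row)

    def drain(self):
        # Processor path: take all pending rows and reset the buffer as one
        # atomic step, so a concurrent append lands in this batch or the next.
        with self._lock:
            rows, self._rows = self._rows, []
        return rows


def run_storm(total_rows=2000):
    """Send total_rows rows through the locked buffer and return them."""
    buf = RequestBuffer()
    forwarded = []
    done = threading.Event()

    def receiver():
        for i in range(total_rows):
            buf.append(f"row-{i}")
        done.set()

    def processor():
        while not done.is_set():
            forwarded.extend(buf.drain())
        forwarded.extend(buf.drain())  # final pass after the receiver ends

    threads = [threading.Thread(target=receiver),
               threading.Thread(target=processor)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return forwarded
```

Calling `unsafe_take_sample(rows, "late-row")` returns only the rows
that were present before the simulated append, and "late-row" is gone
from `rows` afterwards. With the locked buffer, `run_storm(2000)`
returns all 2000 rows, because the swap in `drain` never overlaps an
append.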
    

Temporary fix

Comments

APAR Information

  • APAR number

    IZ55766

  • Reported component name

    TEMS

  • Reported component ID

    5724C04MS

  • Reported release

    621

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2009-07-22

  • Closed date

    2009-09-29

  • Last modified date

    2009-12-09

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    OA31005 OA31035

Fix information

  • Fixed component name

    TEMS

  • Fixed component ID

    5724C04MS

Applicable component levels

  • R621 PSY

       UP


Document Information

Modified date:
30 December 2022