Sammy1984
46 Posts

Pinned topic Mttrapd probe issue

‏2012-01-11T19:16:55Z |
Hello Netcool Experts,

The Mttrapd probe is behaving in a weird way. The probe is fetching "Link Up" events from Juniper routers, but they are not showing in the AEL. However, it does fetch "Link Down" events from the same devices, and those do show in the AEL. We are able to see the "Link Up" events in TCR reports with the help of the ODBC Gateway.

Please see the attachment for a better understanding of the issue.

This is very critical for our environment.

I humbly request all Netcool experts to guide me in fixing this issue.

Awaiting your reply.

Regards,
Sammy
  • rkrzywicki
    133 Posts

    Re: Mttrapd probe issue

    ‏2012-01-13T14:02:21Z  
    I don't have a RAR extraction tool to look at your attachment but I will take a stab at it anyway.

    Resolution events such as "link up" are used to clear the associated problem event, in this case the "Link Down".
    You should see the effect of the Link Up in the AEL as the Link Down event severity should change to 0 (green).
    The resolution event is usually deleted from the ObjectServer after clearing the problem, which is probably why you do not see it in the AEL.
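
    The clearing behaviour described above can be sketched as a small Python model. This is only an illustration, not the actual ObjectServer trigger code: the Event fields, the matching on node and alert key only, the interface name "ge-0/0/1", the severity values, and the assumption that the resolution event is cleared whether or not it matched a problem are all simplifications made for the example. The real deletion also waits for a period before removing cleared events, which the sketch skips.

```python
from dataclasses import dataclass

PROBLEM, RESOLUTION = 1, 2  # Type values as referred to in this thread


@dataclass
class Event:
    node: str
    alert_key: str
    type: int
    severity: int
    last_occurrence: int  # whole seconds since the epoch


def clear_pass(events):
    """Simplified stand-in for one run of a generic clear automation."""
    for res in events:
        if res.type != RESOLUTION or res.severity == 0:
            continue
        for prob in events:
            if (prob.type == PROBLEM
                    and prob.node == res.node
                    and prob.alert_key == res.alert_key
                    and prob.last_occurrence < res.last_occurrence):
                prob.severity = 0  # problem turns green in the event list
        res.severity = 0  # assumed: the resolution itself is always cleared


def delete_cleared(events):
    """Simplified stand-in for the automation that deletes cleared events."""
    return [e for e in events if e.severity > 0]


# Hypothetical Link Down followed five seconds later by a Link Up.
events = [
    Event("router1", "ge-0/0/1", PROBLEM, 5, 1000),
    Event("router1", "ge-0/0/1", RESOLUTION, 1, 1005),
]
clear_pass(events)
remaining = delete_cleared(events)
print(remaining)
```

    In this model both events end up at severity 0 and are then removed, so neither the Link Up nor the Link Down stays in the list for long.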
  • SystemAdmin
    1283 Posts

    Re: Mttrapd probe issue

    ‏2012-01-13T15:27:22Z  
    Off topic: 7-Zip is a useful tool for decompressing a wide range of formats, including .rar, if your IT policy allows the use of open source tools.
  • Sammy1984
    46 Posts

    Re: Mttrapd probe issue

    ‏2012-01-13T16:55:29Z  
    I get your point, but in my case the "Link Down" event still shows in the AEL. This problem is not limited to one device; it affects many devices in the network. If event correlation were happening, it would clear both the problem and the resolution event, which is not happening in my case.

    What I observed is that the "Link Up" event never reaches the ObjectServer for correlation to happen, yet it shows up in TCR reports. This is a very weird problem I am facing with the mttrapd probe.

    I am attaching the files in .zip format. I hope you will be able to look at the attachment this time.

    Thank you.
  • rkrzywicki
    133 Posts

    Re: Mttrapd probe issue

    ‏2012-01-13T20:17:31Z  
    I think you are most likely experiencing the same-second or sub-second event issue.
    ObjectServer time fields are stored as the number of seconds since the epoch, more commonly known as Unix time. Because of this, it is possible for multiple events to have the same LastOccurrence time value. IBM actually created a new field called ProbeSubSecondId to distinguish the order in which events were received by the probe when they occur within the same second. Unfortunately they do not seem to consider it within the default generic_clear automation. There is a technote that discusses the issue and offers a solution at http://www-01.ibm.com/support/docview.wss?uid=swg21329027.

    You could try a simple fix to the problem by modifying a line of the generic_clear automation. Under the Action tab look for the line

    update alerts.problem_events set Resolved = true where LastOccurrence < resolution.LastOccurrence and ...

    Change the < to <= and click OK.

    That should take care of it.
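
    To illustrate why the comparison operator matters, here is a minimal Python sketch of the time check only. The function name resolves and the timestamp value are hypothetical; the real automation also matches on other fields, which are left out here.

```python
def resolves(problem_last_occurrence, resolution_last_occurrence, inclusive=False):
    """Return True if the resolution is considered to clear the problem.

    Timestamps are whole seconds since the epoch, so two events that
    arrive within the same second carry identical values.
    """
    if inclusive:
        return problem_last_occurrence <= resolution_last_occurrence
    return problem_last_occurrence < resolution_last_occurrence


# Hypothetical Link Down and Link Up received within the same second.
link_down = 1326460649
link_up = 1326460649

strict_result = resolves(link_down, link_up)                     # original "<"
inclusive_result = resolves(link_down, link_up, inclusive=True)  # modified "<="
print(strict_result, inclusive_result)
```

    With the strict comparison the same-second Link Down is not cleared; with the inclusive one it is. One trade-off to keep in mind: with <=, a Link Down that arrives after a Link Up but within the same second would also be cleared, because the whole-second timestamp alone cannot tell the two orderings apart.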
  • Sammy1984
    46 Posts

    Re: Mttrapd probe issue

    ‏2012-01-16T12:53:11Z  
    Hi Raymond,

    Thank you very much for the update and valuable information.

    I have the questions below. I would be thankful if you could help me understand them:

    1) Why are "Link Up" events for some of the network devices not showing in the AEL, even though we can see them in TCR reports?
    2) As per the technote, if a Problem event is raised within the same second after a Resolution event, the Problem event remains as a Resolution event. In my case, the Problem event (Link Down) still shows as a Problem, since it never got correlated with the Resolution event (Link Up).

    Awaiting your reply.

    Thank you.
  • rkrzywicki
    133 Posts

    Re: Mttrapd probe issue

    ‏2012-01-17T14:48:24Z  
    I will try to answer your questions.

    1. I don't know what filter you have defined for the AEL screenshot you posted (NCP_VIEW_22015083). Is it possible that the filter is excluding Type 2 or Severity 1 events? Another possibility is that the Type 2 events are being deleted before you see them in the AEL. Turn off the DeleteClears automation and see if they show in the AEL after verifying that the filter is not excluding them.

    2. The situation being addressed in the technote reveals flaws in the deduplication and generic_clear triggers. In your situation, you are only seeing the effects of the generic_clear flaw.
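
    As a rough illustration of the filter check suggested in point 1, the Python sketch below tests a Link Up event against a few example filters. The filter definitions and the severity values are hypothetical and are not taken from the posted view; they only show how a condition on Type or Severity could hide resolution events.

```python
# Hypothetical events; the Type and Severity values are assumptions.
link_up = {"Summary": "Link Up", "Type": 2, "Severity": 1}
link_down = {"Summary": "Link Down", "Type": 1, "Severity": 5}

# Hypothetical filters, each written as a predicate over one event.
filters = {
    "problems only (Type = 1)": lambda e: e["Type"] == 1,
    "above Severity 1 (Severity > 1)": lambda e: e["Severity"] > 1,
    "no restriction": lambda e: True,
}

# Names of the filters that would hide the Link Up event.
hides_link_up = [name for name, keep in filters.items() if not keep(link_up)]
# Names of the filters that would hide the Link Down event.
hides_link_down = [name for name, keep in filters.items() if not keep(link_down)]

print(hides_link_up)
print(hides_link_down)
```

    In this example the first two filters hide the Link Up while still showing the Link Down, which would match the symptom of seeing only Link Down events in the AEL.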