The event search tools help you find the root cause of problems that
are generating large numbers of events in your environment. The tools
detect patterns in the event data and can, for example, identify the
root cause events behind an event storm. They save the time that you
would otherwise spend manually looking for the event that is causing
problems, so that you can quickly pinpoint the most important events
and issues.
The tools are built into the Web GUI event lists (AEL and Event
Viewer). They run searches against the event data, based on default
criteria, filtered over specific time periods. You can search against
large numbers of events. You can change the search criteria and
specify different time filters. When run, the tools start the
Operations Analytics - Log Analysis product, where the search results
are displayed.
Procedure
- To start using the event search tools, select one or more
events from an event list and right-click. From the right-click menu,
click Event Search, click a tool, and click
a time filter.
The tools are as follows:
Tool | Description |
Show event dashboard by node | Searches for all events that originate from the same host name, service name, or IP address, which is equivalent to the Node field of the ObjectServer alerts.status table. |
Search for similar events | Searches for all events that have the same failure type, type, and severity as the selected events. The failure type equates to the AlertGroup field of the alerts.status table. The type equates to the Type field. The severity equates to the Severity field. |
Search for events by node | Searches for all events that originate from the same source, that is, host name, service name, or IP address. This is equivalent to the Node field of the alerts.status table. The results are displayed in a list in the Operations Analytics - Log Analysis GUI. |
Show keywords and event count | Extracts a list of keywords from the text of the event summary, event source, and failure type. The event summary text equates to the Summary field of the alerts.status table. The event source equates to the Node field. The failure type equates to the AlertGroup field. |
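The field mappings in the table can be read as match criteria. The following Python fragment is only a sketch of that reading, not how the tools are implemented: the dictionary layout, the helper names, and the sample values are assumptions made for illustration, and only the field names (Node, AlertGroup, Type, Severity) come from the alerts.status table.

```python
# Field names come from the alerts.status table; everything else here
# (dict layout, helper names, sample values) is a hypothetical illustration.
SIMILAR_FIELDS = ("AlertGroup", "Type", "Severity")


def matches_node(event, selected):
    """Criterion behind the two 'by node' tools: same Node value."""
    return any(event["Node"] == s["Node"] for s in selected)


def matches_similar(event, selected):
    """Criterion behind 'Search for similar events':
    same AlertGroup, Type, and Severity as a selected event."""
    return any(
        all(event[field] == s[field] for field in SIMILAR_FIELDS)
        for s in selected
    )


# Invented sample data: one selected event and one candidate event.
selected = [
    {"Node": "swt0001", "AlertGroup": "LinkStatus", "Type": 1, "Severity": 5}
]
candidate = {"Node": "db01", "AlertGroup": "LinkStatus", "Type": 1, "Severity": 5}

by_node = matches_node(candidate, selected)
similar = matches_similar(candidate, selected)
```

In this sketch the candidate comes from a different node, so it would not match a search by node, but it shares the failure type, type, and severity of the selected event, so it would match a search for similar events.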
The time filters are calculated from the time stamp of the selected
event or events. The Operations Analytics - Log Analysis time stamp is
equivalent to the FirstOccurrence field of the ObjectServer
alerts.status table. If you click Custom, specify an integer and a
unit of time, such as 15 weeks. The default time filters are as
follows:
- 15 minutes before event
- 1 hour before event
- 1 day before event
- 1 week before event
- 1 month before event
- 1 year before event
- Custom ...
If a single event is selected that has the
time stamp 8 January 2014 08:15:26 AM, and you click the 1 hour before
event time filter, the result is filtered on
the following time range: (8 January 2014 07:15:26 AM) to (8 January
2014 08:15:26 AM).
If multiple events are selected,
the time filter is applied from the earliest to the most recent time
stamp. For three events that have the time stamps 1 January 2014 08:28:46
AM, 7 January 2014 08:23:20 AM, and 8 January 2014 08:15:26 AM, the
1 week before event time filter returns matching events in
the following time range: (25 December 2013 08:28:46 AM) to (8 January
2014 08:15:26 AM).
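The arithmetic in the two examples above can be sketched in Python. The function name and the unit table are assumptions for illustration. Month and year units are left out because this topic does not say how calendar-length units are computed.

```python
from datetime import datetime, timedelta

# Hypothetical unit table. Months and years are omitted because the
# documentation does not state how calendar-length units are calculated.
UNITS = {
    "minutes": timedelta(minutes=1),
    "hours": timedelta(hours=1),
    "days": timedelta(days=1),
    "weeks": timedelta(weeks=1),
}


def time_range(first_occurrences, amount, unit):
    """Return (start, end) for an 'amount unit before event' filter.

    The start is the earliest FirstOccurrence minus the offset;
    the end is the most recent FirstOccurrence.
    """
    earliest = min(first_occurrences)
    latest = max(first_occurrences)
    return earliest - amount * UNITS[unit], latest


# Single event, 1 hour before event.
single = time_range([datetime(2014, 1, 8, 8, 15, 26)], 1, "hours")

# Three events, 1 week before event.
multi = time_range(
    [
        datetime(2014, 1, 1, 8, 28, 46),
        datetime(2014, 1, 7, 8, 23, 20),
        datetime(2014, 1, 8, 8, 15, 26),
    ],
    1,
    "weeks",
)
```

Under these assumptions, `single` spans 8 January 2014 07:15:26 to 08:15:26, and `multi` spans 25 December 2013 08:28:46 to 8 January 2014 08:15:26, which matches the ranges given in the examples.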
Restriction: The Web GUI and Operations Analytics - Log
Analysis process
time stamps differently. The Web GUI recognizes
hours, minutes, and seconds but Operations Analytics - Log
Analysis ignores
seconds. This problem affects the Show event dashboard
by node and Search for events by node tools.
If the time stamp 8 January 2014 07:15:26 AM is passed, Operations Analytics - Log
Analysis interprets
this time stamp as 8 January 2014 07:15 AM. So, the results of subsequent
searches might differ from the search that was originally run.
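The effect of the dropped seconds can be illustrated with a short Python sketch. It assumes that ignoring seconds means truncating the time stamp to the minute, which is consistent with the example above. The helper name and the sample event time are hypothetical.

```python
from datetime import datetime


def drop_seconds(ts):
    """Assumed model of how seconds are ignored: truncate to the minute."""
    return ts.replace(second=0, microsecond=0)


passed = datetime(2014, 1, 8, 7, 15, 26)   # start of range sent by the Web GUI
interpreted = drop_seconds(passed)          # 8 January 2014 07:15 AM

# Hypothetical event that occurred 16 seconds before the original start time.
event_time = datetime(2014, 1, 8, 7, 15, 10)

in_original_range = event_time >= passed          # outside the original range
in_interpreted_range = event_time >= interpreted  # inside the truncated range
```

An event that falls in the gap between the truncated and the original start time is included in one search but not the other, which is one way that result counts can differ.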
The results are displayed differently depending on the
tool. The time filter has no effect on how the results are displayed.
Tool | How search results are displayed |
Show event dashboard by node | A dashboard is opened for the OMNIbus Static Dashboard custom app that shows the following information about the distribution of the matching events:
- Event Trend by Severity
- Event Storm by AlertGroup
- Event Storm by Node
- Hotspot by Node and AlertGroup
- Severity Distribution
- Top 5 AlertGroups Distribution
- Top 5 Nodes Distribution
- Hotspot by AlertGroup and Severity
For more information about the OMNIbus Static Dashboard custom app, see Netcool/OMNIbus Insight Pack. |
Search for similar events and Search for events by node | The results are displayed in the search timeline, which shows the distribution of matching events over the specified time period. After the timeline, the list of results is displayed. Click Table View or List View to change how the results are formatted. Click > or < to move forward and back in the pages of results. Keywords that occur multiple times in the search results are displayed in the Common Patterns area of the navigation pane, with the number of occurrences in parentheses (). |
Show keywords and event count | The keywords are displayed in the Configured Patterns area of the Operations Analytics - Log Analysis GUI. Each occurrence of the keyword over the time period is counted and displayed in parentheses () next to the keyword. |
- After the results are displayed, you can refine them by performing further searches on
the results in the search workspace.
For example, click a keyword from the Configured Patterns list to
add it to the Search field.
Important: Because the two products handle seconds differently, if you run a further search
against the keyword counts that result from the Show keywords and event count tool, the
count that was returned for a keyword under Configured Patterns might differ from the
count that is returned by the search that you run in the search workspace.
Before the Search field, a sequence of breadcrumbs is displayed
to indicate the progression of your search. Click any of the breadcrumb items to return the results
of that search.
Example
The Show keywords and event count tool
can examine what happened before a problematic event in your environment.
Assume that high numbers of critical events are being generated in
an event storm. A possible workflow is as follows:
- You select a number of critical events and click Event Search > Show keywords and event count >
1 hour before event so that you can identify any similarities between critical
events that occurred in the last hour.
- The most recent time stamp (FirstOccurrence) of an event is 1
January 2014 8:28:00 AM. In the Operations Analytics - Log
Analysis GUI,
the search results show all keywords from the Summary, Node, and AlertGroup
fields and the number of occurrences.
- You notice that the string swt0001, which is the host name of a switch in your
environment, has a high number of occurrences. You click swt0001 and run a
further search, which reduces the number of results to only the events that contain swt0001.
- From this pared-down results list, you quickly notice that one
event shows that the switch is misconfigured, and that this
misconfiguration is causing problems downstream in the environment.
You can then return to the event list in the Web GUI and
take action against this single event.
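The keyword count and drill-down steps in this workflow can be approximated in Python. This is a sketch only. The whitespace tokenization, the helper names, and the sample events are invented for illustration and do not describe how Operations Analytics - Log Analysis extracts keywords; only the field names (Summary, Node, AlertGroup) come from the alerts.status table.

```python
from collections import Counter

# Fields that the Show keywords and event count tool draws keywords from.
KEYWORD_FIELDS = ("Summary", "Node", "AlertGroup")

# Invented sample events for illustration only.
events = [
    {"Summary": "Link down on swt0001 port 4", "Node": "swt0001", "AlertGroup": "LinkStatus"},
    {"Summary": "Ping fail for host db01 via swt0001", "Node": "db01", "AlertGroup": "Ping"},
    {"Summary": "Ping fail for host web02", "Node": "web02", "AlertGroup": "Ping"},
]


def keywords(event):
    """Split the three fields on whitespace (an assumed tokenization)."""
    words = []
    for field in KEYWORD_FIELDS:
        words.extend(event[field].lower().split())
    return words


# Count every keyword occurrence across the selected time period.
counts = Counter(word for event in events for word in keywords(event))

# Drill down: keep only the events that contain the chosen keyword.
drill_down = [event for event in events if "swt0001" in keywords(event)]
```

With this sample data, `counts["swt0001"]` is 3 and `drill_down` holds the two events that mention the switch, which mirrors the step of clicking a keyword to narrow the results.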