Incident overview

The incident Overview page summarizes the information that you need to choose a course of action: the top probable cause alerts, similar past resolution tickets, automations that can remediate the alerts, and other supporting details.

Figure. Incident overview

From a ChatOps integration (Slack, Microsoft Teams, or ServiceNow), you can launch into an incident's overview from the incident title link in the notification. From within Cloud Pak for AIOps, complete the following steps to navigate to an incident's overview page.

Procedure

  1. Click the navigation icon at the upper-left corner of the screen to go to the main navigation menu.
  2. In the main navigation menu, click Operate > Incidents.
  3. From the Incidents tab, select an incident from the list and click the text in the Title column.

The Noise reduction indicator (upper right) shows how Cloud Pak for AIOps reduces the number of IT events that your operations staff must evaluate. The Noise reduction percentage shows the reduction in the number of events that are correlated and grouped into alerts that are included in each incident.
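As a rough illustration of the arithmetic behind such a percentage, the following sketch assumes that noise reduction is the share of raw events that no longer need individual evaluation after they are grouped into alerts. The function name and the formula are assumptions made for this example; the product might calculate the figure differently.

```python
def noise_reduction_percent(event_count: int, alert_count: int) -> float:
    """Hypothetical formula: percentage of events absorbed by grouping into alerts."""
    if event_count <= 0:
        # No events means there is nothing to reduce.
        return 0.0
    return 100.0 * (event_count - alert_count) / event_count


# 200 raw events that were correlated into 10 alerts
print(noise_reduction_percent(200, 10))  # 95.0
```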

Figure. Noise reduction

Next to Noise reduction, you can click View raw if you want to view the incident data in raw JSON format.

The following sections are presented on the incident Overview page:

For more information about the Alerts tab and Topology tab, see Alert Viewer and Viewing a topology.

Probable cause alerts

Based on the topology that is associated with an incident and the classification of its alerts, this section presents the alerts that are most likely to have caused the incident. The top three alerts are shown, ranked by their probability of being the cause.

Figure. Probable cause alerts

The following information is displayed for each probable cause alert:

  • Probable cause ranking (Rank 1, Rank 2, Rank 3)

    Probable cause assigns a ranking to all of the alerts in the incident. Alerts are ranked in order of likelihood of being the cause of the incident.

  • Severity indicator

    Indicates the severity of the alert. The following severity icons are used:

    • Critical

    • Major

    • Minor

    • Warning

    • Informational

    • Indeterminate

  • Trigger alert

    Denotes an alert that is defined as a trigger alert: an alert that either caused the incident to be created, or would have caused its creation if an incident had not already existed.

  • Summary

    A description of the alert.

Note: Where a probable cause cannot be identified, the Top Alerts section appears instead. It lists the top three alerts for the incident, ranked by severity, then first occurrence, then trigger alert status, and finally by the availability of associated automations.
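The fallback ordering can be sketched as a multi-key sort. The `Alert` class and its field names are hypothetical, not the product's data model. The sketch also assumes that earlier first occurrences, trigger alerts, and alerts with automations rank higher, which the page does not state explicitly.

```python
from dataclasses import dataclass

# Severity names in the order this page lists them; a lower index is more severe.
SEVERITY_ORDER = ["Critical", "Major", "Minor", "Warning", "Informational", "Indeterminate"]


@dataclass
class Alert:  # hypothetical shape, for illustration only
    summary: str
    severity: str
    first_occurrence: int  # epoch seconds
    is_trigger: bool
    has_automation: bool


def top_alerts(alerts, limit=3):
    """Rank by severity, first occurrence, trigger status, then automation availability."""
    def sort_key(alert):
        return (
            SEVERITY_ORDER.index(alert.severity),
            alert.first_occurrence,    # assumption: earlier occurrences rank first
            not alert.is_trigger,      # assumption: trigger alerts rank first
            not alert.has_automation,  # assumption: alerts with automations rank first
        )
    return sorted(alerts, key=sort_key)[:limit]


alerts = [
    Alert("disk latency", "Major", 100, False, True),
    Alert("pod crash", "Critical", 200, True, False),
    Alert("cpu high", "Critical", 150, False, False),
    Alert("cert expiring", "Warning", 50, False, False),
]
print([a.summary for a in top_alerts(alerts)])
```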

Select any of the probable cause or top alerts to see topology information for that alert.

To get more details about a probable cause alert, click View alert below the topology diagram. This brings you to the Alert Viewer, with the relevant alert selected and the Alert details displayed.

Alert topology

Select a probable cause alert (or top alert) from the list. If the resource on which the alert occurred can be located in the network topology system, a one-hop topology map is displayed for the alert, centered on that resource.

Restriction: For an incident topology to be generated from the alert, at least one topology group that is enabled for correlation must have been added to the alert.
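To make "one hop" concrete, the following sketch extracts the one-hop neighborhood of a resource from a list of undirected edges. The edge-list representation and the resource names are assumptions made for this example; they do not reflect how the topology service stores its data.

```python
def one_hop_neighborhood(edges, center):
    """Return the nodes and edges that are within one hop of `center`."""
    # Keep only the edges that touch the center resource.
    kept_edges = [(a, b) for a, b in edges if center in (a, b)]
    nodes = {center}
    for a, b in kept_edges:
        nodes.update((a, b))
    return nodes, kept_edges


edges = [("web", "app"), ("app", "db"), ("db", "storage"), ("app", "cache")]
nodes, kept = one_hop_neighborhood(edges, "app")
print(sorted(nodes))  # ['app', 'cache', 'db', 'web']
```

In this example, "storage" is two hops from "app", so it is left out of the map.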

Figure. Alert topology

To investigate the incident topology, click View topology below the topology diagram. If the topology group condition is met, then this action opens the Topology tab, which provides a holistic view of the relationships between different infrastructure and application components.

For each resource associated with the incident, you can see the number of alerts, severity indicator, and probable cause ranking. This information can be used to rapidly identify the root cause and the impact of the issue on your environment before acting.

Side panel

The side panel is displayed by default on the Incident overview page. You can click Edit in the side panel to change the incident name or description.

Figure. Incident Side panel

The following incident information is displayed in the Overview side panel:

  • Incident ID - A unique number that is assigned to the incident.

  • Time opened - How long the incident has been open.

  • Source - The IBM Cloud Pak for AIOps policy that created the incident. Click the link to locate that policy on the Policies UI.

  • Notifications - When available, the linked ServiceNow incident number.

  • Priority - The level of impact that the incident has on the customer. Priority 1 is the highest and Priority 5 is the lowest. You can change an incident's priority level from the drop-down menu in the side panel.

  • Status - Unassigned, In progress, On hold, or Resolved. You can change an incident's status from the drop-down menu in the side panel.

The overview side panel also contains the following sections:

  • Assignees

    If the incident is assigned, the Group and Owner of the incident are listed under Assignees. If the incident is not yet assigned, expand Assignees to assign a group or owner.

  • Impacted applications

    Applications that are impacted by the alerts in the incident are listed under Impacted applications. A number indicates how many applications are impacted. Click the Expand arrow to expand the list of applications. A Business criticality indicator on the right of the side panel shows the importance of each impacted application to the business. The number of Active incidents that impact the application is also displayed, categorized by Priority level. Applications are sorted by priority. Click an application name to open the application in Resource management. If there are more than five applications, click View all to open a modal that contains the complete list. To sort the applications in the modal table, click a column header. To filter the list, start typing the name of the application that you are looking for in the Search box within the modal. As you type, the list displays only applications that contain the typed text. As in the side panel, clicking an application's name opens that application.

  • Related incidents

    Any incidents that share contextual alerts are listed in the Related incidents section of the side panel. Related incidents can provide more context when you are resolving an incident. A number indicates how many related incidents were found. Click the Expand arrow to expand the list of incidents. The priority, status, and number of associated alerts are displayed for each incident. Related incidents are sorted by priority. Click an incident's title text to go to that incident's Overview page. If there are more than five related incidents, click View all to open a modal that contains the complete list. You can sort the incidents in the table by priority, status, and alert count. To filter the list, start typing the name of the incident that you are looking for in the Search box within the modal. As you type, the list displays only incidents that contain the typed text. As in the side panel, clicking the title text opens that incident's Overview page.

  • Notes and activity

    A timeline of major events that have occurred in the lifecycle of an incident. Click View all notes and activity to go to the Notes and activity tab.

To collapse the side panel, click the Close arrow at the top of the panel beside the incident name.
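The type-to-filter behavior of the Search box in the View all modals can be sketched as a substring match. The function name is hypothetical, and case-insensitive matching is an assumption; the page says only that the list shows entries that contain the typed text.

```python
def filter_by_name(names, query):
    """Keep the names that contain the typed text (case-insensitivity is assumed)."""
    needle = query.casefold()
    return [name for name in names if needle in name.casefold()]


apps = ["Payments API", "Inventory", "payment-gateway", "Billing"]
print(filter_by_name(apps, "pay"))  # ['Payments API', 'payment-gateway']
```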

Probable alert automations

The overview page for an incident displays the top three probable alerts for an incident. Runbooks that are associated with those alerts are visible in a table below the selected alert's topology. Select an alert to display the associated runbooks.

Similar past resolution tickets

Reviewing the details of similar past resolution tickets can help you determine the best course of action. Click Add tickets to incident to open the Add tickets to incident window.

Figure. Similar past resolution tickets window

When the Add tickets to incident window opens, you can search for past tickets, select the ones that you want, and click Add to incident to add them to this incident. The following figure shows how to add tickets to an incident.

Figure. Add tickets to the incident by using the Add to incident button

The following prerequisites must be met for the data to show in the ticketing insights window.

  1. Set up ticketing integrations. For more information about Integrations, see Integrations.
  2. Run Similar tickets training. For more information about setting up Similar tickets training, see Setting up training for similar tickets.
  3. Enable the default story query similar incidents service policy. To enable this policy, go to Automations > Policies. Set the state of this policy to enabled.
  4. Have at least five qualifying tickets. To qualify, a ticket must be closed. For more information about the required attributes for similar tickets, see Qualifying tickets.
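The ticket-count prerequisite can be checked with a short sketch such as the following one. The ticket dictionaries and the `state` field are hypothetical, and the sketch tests only the closed-state requirement; the Qualifying tickets page lists the other required attributes.

```python
MIN_QUALIFYING_TICKETS = 5  # minimum stated in the prerequisites


def has_enough_qualifying_tickets(tickets):
    """Return True when at least five tickets are closed (hypothetical `state` field)."""
    closed = [ticket for ticket in tickets if ticket.get("state") == "closed"]
    return len(closed) >= MIN_QUALIFYING_TICKETS


tickets = [{"state": "closed"}] * 5 + [{"state": "open"}]
print(has_enough_qualifying_tickets(tickets))  # True
```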

Insights display any similar past resolution tickets that are associated with the incident, together with a description of each ticket. To remove a past resolution ticket from an incident, click the dismiss (-) button. The following figure shows how to remove tickets from an incident.

Figure. Remove tickets from the incident by using the dismiss button

Click the title of a past-resolution ticket to open the linked ticket. For example, the ticket can be a ServiceNow ticket.

You can navigate to the following tabs from the incident overview:

Alerts

Click the Alerts tab for a list of all the alerts that are associated with the incident.

Affected resources

Shows the relationships of impacted resources. Note: This view is populated only when an incident has an associated topology.

Associated automations

Shows an aggregate of the automations that are associated with alerts within the incident. For each automation, the table displays various details, and you can navigate to the associated policy or alert, or run the automation.

The following information is displayed for each recommended runbook:

  • Status

    Successfully executed, Failed, Cancelled, In progress, or Completed.

  • Type

    Indicates the Runbook type: Manual or Automated.

  • Success rate

    The success rate is calculated by using the number of successful and unsuccessful executions of the runbook.

  • Policy

    Shows the Cloud Pak for AIOps policy that assigned the runbook to the selected alert. Click the policy name to locate that policy on the Policies UI.

  • Average rating

    Operations analysts can provide feedback about the quality of a runbook including a rating, which is displayed here (five stars is the top rating).

  • Associated alert

    Click to view the alert that is associated with the runbook in the Alert Viewer.

  • Execute

    Click Run to run the runbook.
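As an illustration of the Success rate field, the following sketch assumes that the rate is the percentage of successful executions out of all completed executions. The function name and the handling of a runbook that has never run are assumptions made for this example.

```python
from typing import Optional


def success_rate(successful: int, unsuccessful: int) -> Optional[float]:
    """Percentage of successful executions; None if the runbook has not run yet."""
    total = successful + unsuccessful
    if total == 0:
        return None
    return 100.0 * successful / total


print(success_rate(3, 1))  # 75.0
```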

Notes and activity

A timeline of major events that have occurred in the lifecycle of an incident. The following events are logged to the Activity timeline:

  • An incident is created.
  • The incident is assigned to a new owner or team.
  • An alert is associated with the incident.
  • The incident state changes. For example, the incident is set to On hold from In progress, or marked as Resolved.
  • The incident title or description changes.
  • A user adds a comment.

To add a comment to the incident timeline, type a comment in the field that is provided and click + Add comment. A maximum of 250 characters is supported. The comment is stored in the timeline in chronological order with the other entries.
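If you prepare comments in a script before you paste them in, a length check like the following sketch can catch text that is too long. The function name is hypothetical, and rejecting empty comments is an assumption; the page states only the 250-character maximum.

```python
MAX_COMMENT_LENGTH = 250  # maximum stated on this page


def is_valid_comment(text: str) -> bool:
    """Return True when the comment is non-empty and within the character limit."""
    return 0 < len(text) <= MAX_COMMENT_LENGTH


print(is_valid_comment("Restarted the payment service; monitoring for recurrence."))  # True
```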

For more information, see the following pages: