Viewing metric anomaly details

In Incidents and alerts, alerts that indicate a detected metric anomaly can appear in the Alert Viewer table. To view more detailed information about these alerts, along with a Key Performance Indicator (KPI) timeseries chart, open the Alert details panel.

KPI: A KPI is made up of a resource, which is a physical entity such as a network interface card, and a metric. For example, if a server has 2 network interface cards and you want to analyze data for 10 metrics on each network card in the server, the total number of KPIs is 20. The terms KPI and metric are often used interchangeably, so a metric can refer either to a measurement that is taken for a type of device (for example, CPU_Context_Switches) or to an instance of that measurement on a specific resource (for example, CPU_Context_Switches on server ABC.acme.com).
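
The counting in this example can be sketched in Python as follows. The resource and metric names are hypothetical placeholders and are not taken from the product; the only point is that each (resource, metric) pair counts as one KPI.

```python
from itertools import product

# Hypothetical names for illustration only.
resources = ["nic0", "nic1"]                      # 2 network interface cards
metrics = [f"metric_{i}" for i in range(1, 11)]   # 10 metrics per card

# A KPI is one (resource, metric) pair.
kpis = list(product(resources, metrics))

print(len(kpis))  # 20
```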

Alert count metric anomalies: Default alert expectations are calculated by the metric manager capability based on the input data. If an anomalous alert rate is detected that is either higher or lower than expected, an alert is raised and displayed as a Metric anomaly in the Alerts table. When the alert is selected, the Alert details panel shows the expected values against the actual values. If required, an administrator can disable this specific functionality by editing the feature-flags-configmap.yaml file and setting the IBM_IR_AI_ALERT_COUNT_METRICS_ENABLED flag to false.
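
As a rough illustration of comparing an actual alert count against expected values, the following Python sketch flags a count that is higher or lower than an expected range. How the metric manager actually calculates expectations is not described here, so the sketch assumes a simple band of the mean plus or minus three standard deviations of past counts. That band, the function names, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, stdev

def expected_range(history, k=3.0):
    """Return (low, high) as mean +/- k standard deviations of past counts.

    Assumption: this simple band stands in for whatever the product learns
    from the input data; it is not the product's actual algorithm.
    """
    m = mean(history)
    s = stdev(history)
    return max(0.0, m - k * s), m + k * s

def is_anomalous(count, history, k=3.0):
    """True if the alert count is higher or lower than the expected range."""
    low, high = expected_range(history, k)
    return count < low or count > high

# Hypothetical alert counts per interval.
history = [10, 12, 11, 9, 10, 11, 12, 10]
print(is_anomalous(11, history))  # False: within the band
print(is_anomalous(40, history))  # True: far above the band
print(is_anomalous(1, history))   # True: far below the band
```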

Procedure

  1. In the Alert Viewer (Incidents and alerts > Alerts tab), click anywhere on the row of a metric anomaly alert (identified by ANOMALY in the Type column of the table).

  2. An Alert details side panel opens. By default, the Metric anomaly details chart displays, showing the properties of the alert and associated values.

  3. The chart plots the activity of the anomaly over a 2-day period (the two days up to the date in the date field).

  4. You can adjust how the chart is displayed within the panel:

    • Click the 3 vertical dots in the gray tab on the left side of the panel. Then, drag the tab to the left.
    • Select a Graph start date and Graph end date to view data for the selected date range.
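
The 2-day period described in step 3 is simple date arithmetic. A minimal sketch, assuming the date field holds the end of the window; the function name and the sample date are hypothetical:

```python
from datetime import datetime, timedelta

def chart_window(end, days=2):
    """Return (start, end) covering `days` days up to `end`."""
    return end - timedelta(days=days), end

end = datetime(2024, 5, 10, 12, 0)   # hypothetical value of the date field
start, _ = chart_window(end)
print(start)  # 2024-05-08 12:00:00
```

Passing days=7 gives the start of a 7-day window in the same way.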

Understanding the chart

Figure. 2-day chart

  • The unbroken black line, which is labeled with the name of the plotted metric resource, represents the values of the selected metric over time.

  • A green-shaded area (labeled Baseline) represents the acceptable operating range of that metric.

    Note: The default KPI Baseline is always that of the primary metric. To change to another Baseline, from related alerts or additional metrics, click the radio button (blank circle) attached to another KPI. Making this Baseline adjustment also changes the KPI that is forecast. If you select multiple metrics, you can choose to view only one Baseline from all of the KPIs that are in the related alerts and additional metrics tabs.

  • A red-shaded area or red vertical line (labeled Anomaly) indicates where the metric is anomalous: one or more of the anomaly detection algorithms determined the metric was not behaving in accordance with the learned normal behavior.

  • A purple area in the chart (labeled Expected range) with a dashed line (labeled Forecast) indicates forecasted metric anomalies. For more information, see Metric anomaly forecasts.

    Note: Sometimes the purple range is not shown, or is only partially shown. This happens when the range becomes too wide to display. The dashed Forecast line is still shown unless the forecast cannot be produced, for example when the timeseries has too many gaps or data availability is too low.

Tooltip: Hover the cursor over any part of a time-series graph to view tooltip information that is related to multiple anomalies at a particular point in time.

Figure. Tooltip

Chart Icons

Icon | Name | Description
Copy | Copy link | Copies the URL link of the chart to the clipboard.
Menu | Show as table | Shows a tabular representation of the anomaly, including occurrence times and values, which can be downloaded as a CSV or JSON file.
Maximize | Make fullscreen | Maximizes the chart to fit the full page.
Export | More options | Exports the chart as a CSV, JSON, PNG, or JPG file. The PNG and JPG options are available only when you use the Chrome browser.

Zooming in on the timeline

On the main chart, place and click the "+" cursor at the beginning of the time period you want to focus on, then drag it along the timeline to your chosen endpoint. The main chart adjusts to the selected timeline.

For a more controlled zoom, use the zoom scale (miniaturized chart on top of the main chart) to focus on a particular date range. Select and move the black vertical slider bars left or right:

Figure. Metric anomaly chart zoom
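
Conceptually, zooming narrows the plotted series to the points that fall inside the selected range. The following Python sketch shows that idea on hypothetical (timestamp, value) pairs; it is an illustration only and says nothing about how the chart is implemented.

```python
from datetime import datetime

def zoom(points, start, end):
    """Keep only the (timestamp, value) points with start <= timestamp <= end."""
    return [(t, v) for t, v in points if start <= t <= end]

# Hypothetical samples taken every 6 hours on one day.
points = [(datetime(2024, 5, 9, h), float(h)) for h in range(0, 24, 6)]
selected = zoom(points, datetime(2024, 5, 9, 6), datetime(2024, 5, 9, 12))
print([v for _, v in selected])  # [6.0, 12.0]
```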

Expanding the timeline of the chart

  • For metric anomaly detection alerts, click View expanded chart to change the 2-day timeline in the side panel to a larger, 7-day timeline chart.
  • For Log Anomaly - Golden Signals metric anomalies, click View expanded chart and log messages.

The Metric anomaly details page opens.

Figure. The metric anomaly expanded chart with the Related alerts and Additional metrics tabs

Metric anomaly graph

From the Metric anomaly details page for Log Anomaly - Golden Signals metric anomalies, the Metric anomaly graph section includes a Graph view and a Logs view.

  • Both the Graph view and Logs view include filters to specify the date and time you want to display.
    • Select a Graph start date and Graph end date to view data for the selected date range.
    • Select a Logs pinned start date and a Logs pinned start time to specify a pinned time in the graph. Click Pin logs start time or Unpin logs start time to pin or unpin the specified time. Pinning can help you see how the logs in the Logs view correspond to the graph in the Graph view.
  • The Graph view is displayed by default and shows the timeline. Baseline data is omitted where the value on the graph is zero.
  • Click the Logs view to view a table of log messages that correspond to the alert and template ID.
    • To filter the logs that are shown, specify the date and time or search for keywords.
    • Download the logs by clicking Export to CSV file.
    • If the alert was derived from live logs that can be found in the log aggregator, you can click Launch log aggregator to open a log aggregator at the pinned start time.

Related alerts table: If alerts related to the primary metric are detected, then a tab for Related alerts appears in the expanded view of the chart. When this tab is selected, a table of related alerts appears. Up to 4 related alert metrics can be added to the chart as KPIs, in addition to the main metric KPI. For more information, see Related alerts table.

Additional metrics

Use the Additional metrics tab to add up to 3 more KPIs to the Metric anomaly details chart.

  1. Click Add metric.
  2. Select an item from each field: Metric group, Metric, and Resource.
  3. When selection is complete, the timeline of the KPI is added to the chart.
  4. Repeat steps 1-3 up to 2 more times.

To modify a selection, remove the current item by clicking the close icon in the Metric group, Metric, or Resource fields. Then, click inside the empty field, and a menu displays all available items for that field.

To delete a KPI, click the close icon in the Metric group field or the subtract icon.

Note: Entering part of a string in a field filters the menu so that it displays only the items that contain that string. For example, in the Resource field, entering 02 brings up all resources that contain the string 02.
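
The filtering behavior in this note amounts to a substring match. A minimal Python sketch, using hypothetical resource names and assuming a case-sensitive match (the source does not say whether the field is case-sensitive):

```python
def filter_items(items, text):
    """Return the items whose name contains `text` as a substring."""
    return [item for item in items if text in item]

# Hypothetical resource names for illustration.
resources = ["server01", "server02", "db-020", "router-1"]
print(filter_items(resources, "02"))  # ['server02', 'db-020']
```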