Analyzing Incidents

Here is an example of an incident. A service suddenly responds more slowly than usual; we call this a sudden increase in average latency. The incident is automatically marked in yellow as a warning, and it keeps this colour for as long as it is active. Once the incident is resolved, the colour changes to grey and the incident remains available for drill-down. Clicking the incident's row in the table opens the incident detail view.

The incident detail view is organized into three parts:

  1. The header contains the key facts of the incident:

    • Start time;
    • End time (the current time if the incident is still ongoing);
    • The number of events that are still active;
    • The number of changes involved;
    • The number of affected entities.


  2. The second section provides a visual representation of how the incident developed over time. The chart shows the complete time frame, from start to end, and all events, sorted by start time. When collapsed, the view is limited to seven events. If your incident contains more than seven events, press the expand button to see the full view. Clicking any of the bars opens the detail view for that event.

  3. The third section contains the details for the chart in section 2: a list of all events, sorted by start time. Click an event to expand it and see all available information for it.

The details help you understand the event and are followed by charts that plot the corresponding metrics. If an event is still active, the charts continue to render incoming metric values. Two flags may be shown: one indicates that the event affects a service, the other that the event triggered the incident. When present, the flags appear at the top of the event in the list.

When you focus on an event, the detail section provides the same information as the incident's event list described in point 3.
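
The header facts and the collapsed chart described above can be modelled in a few lines. The following is only an illustrative sketch, not Instana's data model or API: the `Event` class, its fields (`entity`, `is_change`), and the helpers `summarize` and `timeline` are hypothetical names, and it assumes the incident start is the earliest event start.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

COLLAPSED_LIMIT = 7  # the collapsed chart shows at most seven events


@dataclass
class Event:
    title: str
    entity: str                     # hypothetical: name of the affected entity
    start: datetime
    end: Optional[datetime] = None  # None means the event is still active
    is_change: bool = False         # hypothetical marker for change events


def summarize(events: List[Event], now: datetime) -> dict:
    """Derive the five header facts from an incident's events."""
    active = [e for e in events if e.end is None]
    return {
        # Assumption: the incident starts with its earliest event.
        "start": min(e.start for e in events),
        # While the incident is ongoing, the end time is the current time.
        "end": now if active else max(e.end for e in events),
        "active_events": len(active),
        "changes": sum(1 for e in events if e.is_change),
        "affected_entities": len({e.entity for e in events}),
    }


def timeline(events: List[Event], expanded: bool = False) -> List[Event]:
    """Return events sorted by start time, cut to seven when collapsed."""
    ordered = sorted(events, key=lambda e: e.start)
    return ordered if expanded else ordered[:COLLAPSED_LIMIT]
```

Calling `timeline(events)` gives the collapsed view, and `timeline(events, expanded=True)` corresponds to pressing the expand button.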

Search Capabilities - Finding an Incident

Searching through events discovered by Instana relies on the Dynamic Focus feature. When you click one bar or select multiple bars in the events bar chart at the top, the events table lists only the events included in the selected bars. This allows detailed inspection of events without changing the current time interval.
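
The effect of selecting bars can be pictured as a filter over the events table. The sketch below is not how Instana implements it; it assumes that each bar covers a fixed-width slice of the current time interval and that an event belongs to the bar containing its start time. The function name and the dictionary keys are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Set


def events_in_selected_bars(
    events: List[Dict],
    window_start: datetime,
    bar_width: timedelta,
    selected_bars: Set[int],
) -> List[Dict]:
    """Keep only the events whose start time falls into a selected bar."""
    return [
        e for e in events
        # Floor division of two timedeltas yields the zero-based bar index.
        if (e["start"] - window_start) // bar_width in selected_bars
    ]
```

The time interval itself (`window_start` and the number of bars) stays unchanged; only the list of rows shrinks.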

In addition, you can use the search box to find specific items by the data shown in the “Title” or “On” columns of the overview table (“On” is the name of the service on which the incident occurred). In this example, the search query is event.text:"Error rate". The result is a list of all events whose title contains the phrase "Error rate".
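
To make the semantics of such a query concrete, here is a minimal sketch that parses a single `field:"phrase"` term and applies it to a list of events. It is not Instana's query parser: the regular expression, the function names, the mapping of `event.text` to a `title` key, and the case-insensitive comparison are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple

# Matches one term of the form field:"phrase", e.g. event.text:"Error rate".
QUERY_PATTERN = re.compile(r'^(?P<field>[\w.]+):"(?P<phrase>[^"]*)"$')


def parse_query(query: str) -> Tuple[str, str]:
    """Split a query term into its field name and quoted phrase."""
    match = QUERY_PATTERN.match(query.strip())
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    return match.group("field"), match.group("phrase")


def search_events(events: List[Dict], query: str) -> List[Dict]:
    """Return the events whose title contains the queried phrase."""
    field, phrase = parse_query(query)
    if field != "event.text":
        raise ValueError(f"this sketch only handles event.text, got {field!r}")
    needle = phrase.lower()  # assumption: matching ignores case
    return [e for e in events if needle in e["title"].lower()]
```

With this sketch, `search_events(events, 'event.text:"Error rate"')` keeps an event titled "Sudden increase in Error rate" and drops one titled "Sudden increase in average latency".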