Smart Alerts

Smart Alerts provide you with automatically generated alerting configurations so you can receive alerts based on out-of-the-box blueprints such as website slowness, JavaScript errors, and HTTP status codes.

Select the blueprint that you want to be alerted about, choose a scope (for example, by geolocation, browser, or operating system), and the system automatically creates a tailored alert for you.

Adding an alert

  1. On the sidebar, click Websites.
  2. Click the name of your website and then click Add Alert.

Simple mode

By default, you create an alert in the simple mode, which involves the following steps:

  1. Select an alert.
  2. Confirm your scope.
  3. Select an alert channel.

Simple mode enables you to select alerts with zero configuration so that you don't need to create queries or define thresholds.

To create an alert in advanced mode, which allows you to investigate and modify any automatically configured alert setting, click Switch to Advanced Mode.

Select an alert

Select one of the following predefined blueprints that you want to create an alert for.

| Blueprint | Description |
| --- | --- |
| JS Errors | Click Select JS Error to select an existing JavaScript error message that you want to be alerted about. Alternatively, you can specify the error message by providing a JavaScript message pattern that equals, contains, starts with, or ends with a defined string. |
| Slowness | Select Slowness to receive alerts when the onLoad time exceeds expectations that are derived from historical data. The onLoad time metric exists for each page load and models the time until navigation is complete, for example, when the loading indicator of the browser stops. For more information about onLoad time and related metrics, see the [Website Monitoring FAQ](../website_monitoring/faq.html#what-do-the-website-metrics-mean). |
| HTTP Status Codes | Select a specific HTTP status code that you want to be alerted about when it occurs more often than usual. |
| Throughput | Select Unexpectedly Low Number of Page Loads or Unexpectedly High Number of Page Loads to receive alerts when the number of page loads of your website differs significantly from the number that is expected based on the available historical data. A page load is defined by the retrieval of the initial HTML document and all subsequent actions until the next navigation in the browser. |
| Custom Events | Select a specific Custom Event to receive an alert when it occurs more or less often than it should. |
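
The four message pattern operators of the JS Errors blueprint can be pictured as plain string checks. The following is a minimal sketch, not Instana's implementation: the operator names, the function name, and the case-sensitive comparison are assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical operator names that mirror the four pattern options.
# Case-sensitive matching is an assumption of this sketch.
OPERATORS: Dict[str, Callable[[str, str], bool]] = {
    "equals": lambda message, pattern: message == pattern,
    "contains": lambda message, pattern: pattern in message,
    "starts_with": lambda message, pattern: message.startswith(pattern),
    "ends_with": lambda message, pattern: message.endswith(pattern),
}


def matches_js_error(message: str, operator: str, pattern: str) -> bool:
    """Return True when the error message satisfies the given pattern."""
    return OPERATORS[operator](message, pattern)
```

For example, `matches_js_error("TypeError: x is undefined", "starts_with", "TypeError")` matches, while the `equals` operator with the same pattern does not.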

Confirm your scope

Your currently open website is automatically selected as the scope. By applying Unbounded Analytics queries, you can further scope the alert to a specific subset of website traffic, for example, by geolocation, browser, or user.

Query filters are combined with the logical AND operator by default, so a website beacon must match all the applied filters. You can filter by the following:

  • Specific pages.
  • Browser types.
  • Operating systems.
  • Countries.
  • Meta: additional metadata that annotates page loads. Select one of the available keys, a predefined value, and then an operator.
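
The AND semantics of the scope filters can be sketched as follows. This is an illustration only: the beacon field names (`page`, `browser`, `os`, `country`) and the function name are hypothetical, and only equality filters are modeled.

```python
from typing import Dict


def beacon_in_scope(beacon: Dict[str, str], filters: Dict[str, str]) -> bool:
    """AND semantics: the beacon is in scope only if every filter matches."""
    return all(beacon.get(key) == expected for key, expected in filters.items())


# Hypothetical filters and beacons for illustration.
filters = {"browser": "Firefox", "country": "DE"}
beacon_a = {"page": "/checkout", "browser": "Firefox", "os": "Linux", "country": "DE"}
beacon_b = {"page": "/checkout", "browser": "Firefox", "os": "Linux", "country": "FR"}
```

Here `beacon_a` is in scope because it matches both filters, whereas `beacon_b` is excluded because its country differs. With no filters at all, every beacon of the website is in scope.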

Add alert channels

Click Select Alert Channel, and from the list of preconfigured channels, select the channels to receive the alerts. For information about creating channels, see Alert Channels.

Advanced mode

Advanced mode gives you full insight into and control over your alerts: you can inspect the configuration of each preconfigured alert and modify it if needed. In addition to the selections available in simple mode, advanced mode offers the following options.

Trigger

Select one of the following predefined blueprints that you want to be alerted for.

JS errors

The same configuration options are available as in simple mode. For more information, see Select an alert.

In addition, you can select which metric is used for alert evaluation: error rate or error count. Whichever metric you select, Instana automatically derives a threshold value from the past 4 weeks of data. You can modify this value.

[Image: Alerts JS Errors]

Slowness

The same configuration options are available as in simple mode. For more information, see Select an alert.

Furthermore, you can select which percentile metric is used for alert evaluation. An alert is triggered when the percentage of website page loads with an onLoad time below the threshold is less than the corresponding percentile number. For example, with the 90th percentile and a threshold of 3 seconds, an alert is triggered when fewer than 90% of page loads have an onLoad time below 3 seconds. Depending on the amount of available historical data, Instana suggests a static or a dynamic baseline. Additionally, you can choose between a daily or weekly seasonality baseline when sufficient data is available.

The static baseline value can be modified directly. Daily or weekly seasonality can be tuned by using the sensitivity parameter, which defines how far the metric can deviate from the expected value before the deviation is considered a violation.
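
The percentile condition for Slowness can be expressed as a short check. The following is a minimal sketch under stated assumptions: the function name and the millisecond unit are hypothetical, and treating an empty window as "no violation" is an assumption rather than documented behavior.

```python
from typing import Sequence


def slowness_violated(onload_ms: Sequence[float], threshold_ms: float, percentile: float) -> bool:
    """Return True when the share of page loads faster than the threshold
    is below the chosen percentile number (for example, 90)."""
    if not onload_ms:
        return False  # Assumption: no page loads means nothing to evaluate.
    fast_loads = sum(1 for value in onload_ms if value < threshold_ms)
    share_fast = 100.0 * fast_loads / len(onload_ms)
    return share_fast < percentile
```

With eight page loads at 1000 ms and two at 5000 ms, 80% of page loads are below a 3000 ms threshold, so the condition is violated for the 90th percentile but not for the 75th.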

[Image: Alerts Slowness]

HTTP status codes

The same configuration options are available as in simple mode. For more information, see Select an alert.

Additionally, you can select which metric is used for alert evaluation: status code count or status code rate. Whichever metric you select, Instana recommends a threshold value based on historical data. You can modify this value.

[Image: Alerts HTTP Status Codes]

Throughput

In contrast to the use-case-specific options in simple mode that are described in Select an alert, advanced mode allows you to define a more generic Smart Alert based on any metric related to page views.

In addition, you can select which metric is used for alert evaluation, such as Page Loads or Page Transitions. For more details about the available metrics, see the Website Monitoring FAQ. In the example configuration in the image, an alert is triggered when the number of page transitions is higher than usual. Depending on the amount of available historical data, Instana suggests a static or a dynamic baseline. Additionally, you can choose between a daily or weekly seasonality baseline when sufficient data is available.

The static baseline value can be modified directly. Daily or weekly seasonality can be tuned by using the sensitivity parameter, which defines how far the metric can deviate from the expected value before the deviation is considered a violation.

[Image: Alerts Page Transitions]

Custom Events

The same configuration options are available as in simple mode. For more information, see Select an alert.

[Image: Alerts Custom Events]

Type of Threshold

When you set up a Smart Alert, you can choose to use static or adaptive thresholds.

[Image: Threshold type]

Static

Static thresholds do not change after the Smart Alert is created. The threshold can be either a simple constant value or a value that accounts for the seasonal variations that were present in the historic data when the Smart Alert configuration was created. You can think of the latter as a lookup table, precomputed once from historic data, that holds a threshold for every point in time of the day or week.

The threshold might no longer be relevant after the underlying metric changes significantly. In that case, you can manually adjust or recalculate the threshold at any time.
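
The lookup-table picture of a static seasonal threshold can be sketched as follows. This is an illustration, not the way Instana computes its thresholds: the hourly resolution, the `margin` factor, and both function names are assumptions.

```python
from datetime import datetime
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def build_daily_lookup(history: Iterable[Tuple[datetime, float]], margin: float = 1.5) -> Dict[int, float]:
    """Precompute one threshold per hour of the day from historic samples.

    The threshold is the historic mean of that hour times a margin
    (both the hourly resolution and the margin are assumptions).
    """
    by_hour: Dict[int, List[float]] = {}
    for timestamp, value in history:
        by_hour.setdefault(timestamp.hour, []).append(value)
    return {hour: mean(values) * margin for hour, values in by_hour.items()}


def threshold_at(lookup: Dict[int, float], timestamp: datetime) -> float:
    """Look up the precomputed threshold; the table is never updated afterwards."""
    return lookup[timestamp.hour]  # Raises KeyError if the hour had no history.
```

Because the table is computed once, a later shift in the metric is not reflected until the table is rebuilt, which corresponds to recalculating the threshold manually.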

When to use static threshold

Static thresholds work best for blueprints like Slowness or JS Errors in the following situations:

  • Irrespective of any seasonality of the underlying metric, it is undesirable for the metric to rise above or fall below a constant value.
  • The underlying metric is seasonal, so different thresholds apply depending on the time of day or week. However, these thresholds themselves don't change over time, and gradual changes to them over long periods are undesirable.

Adaptive (public preview)

Adaptive thresholds continuously evolve and adjust themselves with new data that is observed by Instana. This means that the threshold continuously accounts for seasonal changes to the underlying metric without any human intervention.

Adaptive baselines are classified as a public preview feature.

When to use adaptive threshold

Adaptive thresholds work best for blueprints such as Throughput and, more generally, in the following situations:

  • The underlying metric is not seasonal. The threshold is expected to gradually change over time, but any sudden deviation from this trend is undesirable.
  • The underlying metric is seasonal and different thresholds exist for different times of the day or week. The thresholds themselves are expected to gradually change over time, but any sudden deviation from this trend is undesirable.
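
To picture how a threshold can follow a gradual trend while still flagging sudden deviations, consider the following sketch. It uses an exponentially weighted moving average, which is a stand-in chosen for illustration; this document does not describe the model that Instana actually uses, and the class name, `alpha`, and `tolerance` are hypothetical.

```python
class AdaptiveThreshold:
    """Illustrative adaptive upper threshold based on a moving baseline."""

    def __init__(self, initial_baseline: float, alpha: float = 0.1, tolerance: float = 0.5) -> None:
        self.baseline = initial_baseline
        self.alpha = alpha          # How quickly the baseline follows new data.
        self.tolerance = tolerance  # Allowed relative deviation above the baseline.

    def upper_bound(self) -> float:
        return self.baseline * (1.0 + self.tolerance)

    def observe(self, value: float) -> bool:
        """Check the value against the current bound, then adapt the baseline."""
        violated = value > self.upper_bound()
        self.baseline = (1.0 - self.alpha) * self.baseline + self.alpha * value
        return violated
```

A slow rise in the metric moves the baseline along with it, so no violation is reported, whereas a sudden jump exceeds the current bound before the baseline has had time to adapt.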

Scope

The same configuration options are available as in simple mode. For more information, see Confirm your scope.

Alert threshold

In this section, you can configure the alert threshold of the Smart Alert. The underlying metric is an aggregation of beacons that relate to the given Website. When the alert threshold of the Smart Alert is configured, the alert preview of the dialog shows the metric, threshold, and the violations on historic data for the last 24 hours or 7 days.

Choose a metric

This step is relevant when you choose the Slowness blueprint. The following metrics are available: arithmetic mean, minimum, and maximum, along with the 25th, 50th, 75th, 90th, 95th, 98th, and 99th percentiles.

The metric is calculated for website beacons with a timestamp within the evaluation granularity, which is chosen as part of the time threshold.
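
Calculating the metric per evaluation granularity amounts to grouping beacons into fixed time windows and aggregating each window. The following minimal sketch uses the arithmetic mean; the tuple layout `(epoch_seconds, onload_ms)` and the function name are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def aggregate_by_window(beacons: Iterable[Tuple[int, float]], granularity_s: int) -> Dict[int, float]:
    """Group (epoch_seconds, onload_ms) beacons into fixed windows and
    return the arithmetic mean of each window, keyed by window start."""
    windows: Dict[int, List[float]] = defaultdict(list)
    for timestamp, onload_ms in beacons:
        window_start = timestamp - timestamp % granularity_s
        windows[window_start].append(onload_ms)
    return {start: mean(values) for start, values in sorted(windows.items())}
```

With a granularity of 600 seconds, beacons at 0 s and 300 s fall into the same window, and a beacon at 600 s starts the next one. Any of the other listed aggregations could replace `mean` in the same structure.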

Choose threshold operator

Depending on the chosen blueprint, you can choose among the following operators: <, <=, >, and >=.
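
The effect of the threshold operator can be sketched with the standard `operator` module. The function name and the example values are hypothetical.

```python
import operator
from typing import Callable, Dict

# Map each threshold operator symbol to its comparison function.
OPS: Dict[str, Callable[[float, float], bool]] = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def is_violation(value: float, op: str, threshold: float) -> bool:
    """Return True when the metric value violates the threshold."""
    return OPS[op](value, threshold)
```

The difference between the strict and non-strict operators matters only at the boundary: a value equal to the threshold violates `>=` but not `>`.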

Choose threshold type

Here you can choose among the following static threshold types:

  • Static Threshold: Takes a constant value as threshold.
  • Static Daily Seasonality: Uses a threshold that captures the daily repeating patterns of the metric, where every day behaves roughly the same but the metric varies throughout the day. For example, a website has more traffic during the day than in the evening.
  • Static Weekly Seasonality: Uses a threshold that captures the weekly repeating patterns of the metric, where every week behaves roughly the same but the metric varies throughout the week. For example, a website has more traffic on workdays than on the weekend.

For Static Daily Seasonality, at least 5 days of continuous metric data is required, but 7 days of data is recommended. For Static Weekly Seasonality, at least 2 weeks of continuous historic metric data is required. The Smart Alert cannot be created when these requirements are not met.

For Adaptive Threshold, at least 5 days of continuous metric data is required. If this requirement is not fulfilled, the Smart Alert can still be created. Issue detection and alerting start working as soon as enough data is available to initialize the model.
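
The data requirements can be summarized in a small check. This is a sketch only: the threshold type identifiers and function names are hypothetical, and counting whole days ignores the requirement that the data be continuous.

```python
# Minimum days of historic data per threshold type, as stated above.
# The identifiers are hypothetical names chosen for this sketch.
MIN_DAYS = {
    "static_daily_seasonality": 5,
    "static_weekly_seasonality": 14,
    "adaptive": 5,
}


def can_create_alert(threshold_type: str, days_of_data: int) -> bool:
    """Static seasonal types block creation; other types can always be created."""
    if threshold_type in ("static_daily_seasonality", "static_weekly_seasonality"):
        return days_of_data >= MIN_DAYS[threshold_type]
    return True


def detection_active(threshold_type: str, days_of_data: int) -> bool:
    """Detection starts once the data requirement of the type is met."""
    return days_of_data >= MIN_DAYS.get(threshold_type, 0)
```

An adaptive Smart Alert with one day of data can be created, but `detection_active` stays false until five days of data are available.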

Choose threshold value or sensitivity

If you choose Static Threshold, you can either use the suggested threshold value or define the value manually.

If you choose a daily or weekly seasonality threshold, you tune the sensitivity instead. Increase the sensitivity to narrow the upper and lower anomaly detection boundaries; as a consequence, you receive more alerts. To receive fewer alert notifications, decrease the sensitivity, which widens the detection boundaries that define the expected value range of the metric. Depending on the threshold operator used, a metric value that crosses either the upper or the lower detection boundary is considered a violation that could cause an alert.
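
The relationship between sensitivity and the detection boundaries can be pictured as follows. The formula is purely an assumption for illustration (the band half-width shrinks in inverse proportion to the sensitivity); this document does not specify how Instana maps sensitivity to boundaries.

```python
from typing import Tuple


def detection_bounds(expected: float, spread: float, sensitivity: float) -> Tuple[float, float]:
    """Return (lower, upper) boundaries around the expected value.

    Assumption: a higher sensitivity yields a narrower band.
    """
    if sensitivity <= 0:
        raise ValueError("sensitivity must be positive")
    half_width = spread / sensitivity
    return expected - half_width, expected + half_width
```

With an expected value of 100 and a spread of 20, a sensitivity of 1 gives the band 80 to 120, and a sensitivity of 2 narrows it to 90 to 110. A metric value of 115 is therefore a violation only at the higher sensitivity.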

Time threshold

The time threshold lets you impose additional conditions on how the defined metric threshold must be violated before the alert is triggered.

The following typical conditions, often used in practice, are offered:

  • Persistence of time: Select a time window. When the metric violates a defined threshold over the defined time window, you are alerted.

  • The number of violations over time: Select a time window and the number of violations. When the metric violates the threshold a specified number of times during the time window, you are alerted.

  • User impact:

    Besides the threshold condition on the selected metric, you can define a secondary criterion for the minimum number of impacted users. This way, you receive alerts only when a significant number or ratio of users is impacted by the defined problem.

    • User Impact Evaluation method: Defines how the user impact is measured when the primary metric is violated for the defined number of evaluation windows. You can select one of the following methods:
      • Aggregate across all evaluation windows, which measures the user impact as a single aggregate across the defined time window. This value needs to be exceeded to receive an alert.
      • Calculate for each evaluation window, which measures user impact for each evaluation window individually, like any other metric. To receive an alert, the defined number of evaluation windows need to be violated in sequence for both the primary metric and the user impact.
    • Number of impacted users or percentage of impacted users: Specify either the absolute number of users impacted, the percentage of users impacted, or both. In the latter case, you get alerted only when both limits are reached during the defined time window.

    The user impact metric requires Instana's Users API to identify authenticated users and the Session API to approximate other users based on their session. Depending on how these APIs are integrated, the user ID is used if it is provided; otherwise, the session ID is used as a fallback.
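
The first two time threshold conditions, persistence and number of violations over time, can be sketched over a sequence of per-window violation flags. The function names are hypothetical, and the sketch assumes that the time window is expressed as a number of evaluation windows of at least 1.

```python
from typing import Sequence


def persistent_violation(violations: Sequence[bool], windows: int) -> bool:
    """True when each of the most recent `windows` evaluation windows was violated."""
    recent = violations[-windows:]
    return len(recent) == windows and all(recent)


def violation_count_reached(violations: Sequence[bool], windows: int, min_count: int) -> bool:
    """True when at least `min_count` of the most recent `windows` windows were violated."""
    return sum(violations[-windows:]) >= min_count
```

For the history `[False, True, True, True]`, the persistence condition holds for a time window of three evaluation windows but not for four. The count-based condition tolerates interruptions: `[True, False, True, False]` reaches two violations within four windows.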

When a metric that is not SUM aggregated, such as latency or error rate, has gaps, Instana preserves the current alert state until the next metric value is seen. This is especially helpful when a Smart Alert is defined for a website that receives only infrequent traffic but suffers from a persistent problem: periods without any website traffic do not cause repetitive alerts. However, if no website beacon is received for more than 24 hours, any active alert is closed.
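
The gap handling can be sketched as a small state transition. This is a simplification for illustration: the function name and parameters are hypothetical, and the alert state here follows the latest violation flag directly, without any time threshold.

```python
from typing import Optional

MAX_GAP_SECONDS = 24 * 60 * 60  # An alert is closed after 24 hours without beacons.


def next_alert_state(alert_open: bool, violated: Optional[bool], seconds_since_last_beacon: int) -> bool:
    """Return whether the alert is open after this evaluation.

    `violated` is None when the evaluation window contains no metric value.
    """
    if seconds_since_last_beacon > MAX_GAP_SECONDS:
        return False  # No beacon for more than 24 hours: close in any case.
    if violated is None:
        return alert_open  # Gap: preserve the current state.
    return violated
```

An open alert therefore stays open across an hour without traffic, but is closed once the gap exceeds 24 hours.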

The following image depicts an example of a configuration for a time threshold based on user impact. Using a metric evaluation granularity of 10 minutes, an alert is triggered when at least 20% of the users are impacted within the last 10 minutes.

[Image: Time Threshold User Impact]
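
The user impact criterion from this example can be sketched as follows. The function names and the set-based representation of users are assumptions; the fallback from user ID to session ID and the rule that both limits must be reached when both are specified follow the description of user impact.

```python
from typing import Optional, Set


def user_key(user_id: Optional[str], session_id: str) -> str:
    """Use the user ID if it is provided, otherwise fall back to the session ID."""
    return user_id if user_id else session_id


def user_impact_reached(
    all_users: Set[str],
    impacted_users: Set[str],
    min_users: Optional[int] = None,
    min_percent: Optional[float] = None,
) -> bool:
    """Check the absolute and the relative limit; every specified limit must be reached."""
    count = len(impacted_users)
    percent = 100.0 * count / len(all_users) if all_users else 0.0
    if min_users is not None and count < min_users:
        return False
    if min_percent is not None and percent < min_percent:
        return False
    return True
```

If 2 of 10 users seen in the last 10 minutes are impacted, a limit of 20% is reached. Adding an absolute limit of 3 users on top of that would suppress the alert.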

Alert channels

Click Select Alert Channel, and from the list of preconfigured channels, select the channels to receive the alerts. For more information, see Alert Channels.

Alert properties

Adding more alert properties is optional, but it lets you tailor the alert configuration to your needs. Along with editing the current title and description of the alert, you can define the alert level (warning or critical) and select whether the alert triggers an incident. For more information, see Alerting.

Custom payloads

To include an extra payload in the alert notifications that Instana sends for a specific alert configuration, click Add Row in the Custom Payloads section.

For more information, see Configure Custom Payload Globally.

Both global custom payload and alert-specific custom payload will be included in alert notifications if applicable, but the alert-specific configuration has priority over the global configuration. As a result, in the case of using the same key, the value of the global custom payload field will be overridden by the alert-specific one.
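
The precedence rule can be sketched as a dictionary merge in which alert-specific fields are applied last. The function name and the example keys are hypothetical.

```python
from typing import Dict


def merge_payloads(global_payload: Dict[str, str], alert_payload: Dict[str, str]) -> Dict[str, str]:
    """Combine both payloads; on a key conflict the alert-specific value wins."""
    return {**global_payload, **alert_payload}


# Hypothetical payload fields for illustration.
merged = merge_payloads({"team": "web", "env": "prod"}, {"team": "checkout"})
```

In this example, `merged` contains `env` from the global payload and the alert-specific value `checkout` for the shared key `team`.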

You can see the globally defined custom payloads that are effectively used in the alert configuration as follows:

[Image: Read only global custom payload]

Dynamic custom payload fields in alert-specific configuration are also supported.

Select Dynamic Tag as follows:

[Image: Dynamic Custom Payload]

You can use the suggestions to select the right key for the selected dynamic tag or add it manually.

[Image: Dynamic Custom Payload Suggestions]