Setting up alerting rules

You can enable alerts for critical and warning events and define when an alert is forwarded to the user. You can also set a throttle (snooze) time so that users are not flooded with alerts while an event persists. To change the default alerting rules, you must use the Alerting APIs.

For more information, see Configure alert rules.

About this task

The default alerting rules are set as follows:

  • For critical events, you are alerted if 3 critical events are recorded during the last 3 monitor runs. After you’re alerted, the alert is snoozed for 12 hours.
  • For warning events, you are alerted if 5 warning events are recorded during the last 20 monitor runs. After you’re alerted, the alert is snoozed for 24 hours.
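The default rules above can be read as a sliding-window check followed by a snooze period. The following Python sketch illustrates that reading only. The AlertRule class and the record_run method are hypothetical names, and the sketch assumes that each monitor run records at most one event of the given severity. It is not the product's implementation.

```python
from collections import deque


class AlertRule:
    """Hypothetical model of one severity rule (illustration only)."""

    def __init__(self, alert_count, alert_over_count, snooze_time):
        self.alert_count = alert_count                # events needed to alert
        self.snooze_seconds = snooze_time * 3600      # snooze_time is in hours
        self.window = deque(maxlen=alert_over_count)  # most recent monitor runs
        self.snoozed_until = None

    def record_run(self, event_seen, now):
        """Record one monitor run at time `now` (seconds); return True if an alert fires."""
        self.window.append(bool(event_seen))
        if self.snoozed_until is not None and now < self.snoozed_until:
            return False  # still snoozed after the previous alert
        if sum(self.window) >= self.alert_count:
            self.snoozed_until = now + self.snooze_seconds
            return True
        return False


# Default critical rule: 3 events in the last 3 runs, then snooze for 12 hours.
critical = AlertRule(alert_count=3, alert_over_count=3, snooze_time=12)
print([critical.record_run(True, t) for t in (0, 600, 1200)])  # [False, False, True]
print(critical.record_run(True, 1800))                         # False (snoozed)
```

The default warning rule would correspond to AlertRule(alert_count=5, alert_over_count=20, snooze_time=24) under the same assumptions.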

You can set the following parameters:

severity
  The severity to be set. Can be one of the following options:
    • critical
    • warning
  You can't configure the alert rules for informational alerts.

trigger_type
  Determines how to trigger alerts. Can be one of the following options:
    • immediate
    • custom: This option is used together with alert_count and alert_over_count.

alert_count
  The number of events of the specified severity that must be recorded to trigger an alert.

alert_over_count
  The total number of most recent events (monitor runs) to consider when counting.

snooze_time
  The number of hours to snooze the alert after it is sent, so that a persisting event does not trigger repeated alerts.

notify_when_condition_clears
  Determines whether to send an alert when the condition clears. This alert is sent with an alert_type of info.
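Before you submit a rule, it can help to check it against the parameter descriptions above. The following Python sketch shows one way to do that. The validate_rule function is a hypothetical helper, not part of the Alerting APIs, and the check that alert_count does not exceed alert_over_count is an assumption based on how the default rules are described.

```python
ALLOWED_SEVERITIES = {"critical", "warning"}
ALLOWED_TRIGGER_TYPES = {"immediate", "custom"}


def validate_rule(severity, rule):
    """Return a list of problems in one rule; an empty list means none were found."""
    problems = []
    if severity not in ALLOWED_SEVERITIES:
        problems.append(f"severity {severity!r} is not configurable")
    trigger = rule.get("trigger_type")
    if trigger not in ALLOWED_TRIGGER_TYPES:
        problems.append(f"unknown trigger_type {trigger!r}")
    if trigger == "custom":
        count = rule.get("alert_count")
        over = rule.get("alert_over_count")
        if not isinstance(count, int) or not isinstance(over, int):
            problems.append("custom rules need integer alert_count and alert_over_count")
        elif not 1 <= count <= over:
            # Assumption: a count larger than the window could never fire.
            problems.append("alert_count must be between 1 and alert_over_count")
    snooze = rule.get("snooze_time", 0)
    if not isinstance(snooze, (int, float)) or snooze < 0:
        problems.append("snooze_time must be a non-negative number of hours")
    if not isinstance(rule.get("notify_when_condition_clears", False), bool):
        problems.append("notify_when_condition_clears must be true or false")
    return problems


default_critical = {
    "trigger_type": "custom",
    "alert_count": 3,
    "alert_over_count": 3,
    "snooze_time": 12,
    "notify_when_condition_clears": True,
}
print(validate_rule("critical", default_critical))  # []
```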

Procedure

Define an alert extension to monitor Kubernetes resources. Alerting rules are defined through zen_alert_type extensions.

Each alert type defines a set of rules for each event severity: critical, warning, and info. For example, the following alert extension defines an alert to monitor Kubernetes resources daily.

  extensions: |
    [
      {
        "extension_point_id": "zen_alert_type",
        "extension_name": "zen_alert_type_platform",
        "display_name": "Platform alert type",
        "details": {
          "name": "platform",
          "description": "defines rules for alerting on diagnostics monitors",
          "rules": {
            "critical": { 
              "trigger_type": "custom",
              "alert_count": 3,
              "alert_over_count": 3,
              "snooze_time": 12,
              "notify_when_condition_clears": true
            }, 
            "warning": { 
              "trigger_type": "custom",
              "alert_count": 5,
              "alert_over_count": 20,
              "snooze_time": 24
            }
          }
        }
      }
    ]
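Because the extensions value is a JSON array embedded in a text field, a malformed payload is easy to miss. The following Python sketch parses a copy of the example above and prints a one-line summary for each rule. The summarize_rules function is a hypothetical helper, and the sketch assumes that every rule uses the custom trigger type, so that alert_count and alert_over_count are present.

```python
import json

# Copy of the example extension payload shown above.
EXTENSIONS = """
[
  {
    "extension_point_id": "zen_alert_type",
    "extension_name": "zen_alert_type_platform",
    "display_name": "Platform alert type",
    "details": {
      "name": "platform",
      "description": "defines rules for alerting on diagnostics monitors",
      "rules": {
        "critical": {"trigger_type": "custom", "alert_count": 3,
                     "alert_over_count": 3, "snooze_time": 12,
                     "notify_when_condition_clears": true},
        "warning": {"trigger_type": "custom", "alert_count": 5,
                    "alert_over_count": 20, "snooze_time": 24}
      }
    }
  }
]
"""


def summarize_rules(payload):
    """Parse an extensions payload and return one summary line per severity rule."""
    lines = []
    for ext in json.loads(payload):  # raises json.JSONDecodeError if malformed
        if ext.get("extension_point_id") != "zen_alert_type":
            continue  # skip other extension points
        name = ext["details"]["name"]
        for severity, rule in ext["details"]["rules"].items():
            lines.append(
                f"{name}/{severity}: {rule['alert_count']} of last "
                f"{rule['alert_over_count']} runs, snooze {rule['snooze_time']}h"
            )
    return lines


for line in summarize_rules(EXTENSIONS):
    print(line)
# platform/critical: 3 of last 3 runs, snooze 12h
# platform/warning: 5 of last 20 runs, snooze 24h
```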