Alerts
Operations Dashboard lets you define proactive alerts that are triggered when specified conditions are met, so that you can respond to issues quickly. Alerts are based on queries against the tracing data that is collected by Operations Dashboard. The Web Console includes several alerts that are automatically generated.
To allow Operations Dashboard to send emails or invoke a Web Service, configure the SMTP system parameters. More settings, including Syslog destination server details, are available in Reports and alerts parameters.
Alerts list
The Alerts page lists all available alerts. From this page you can create new alerts, or test, edit, delete and duplicate existing alerts.
Figure 1. Alerts list
Adding and editing alerts
Click on Add new alert to add a new alert, or click Edit in an existing alert row to edit an existing alert. For preloaded alerts, editing certain fields, such as the alert query, is not allowed. It is recommended that you duplicate the preloaded alert (by clicking Duplicate in an existing alert row) and then edit the duplicate alert.
General attributes
Figure 2. General attributes
Table 1. General attributes
Attribute | Description |
---|---|
Name | A unique name for the alert. |
Description | A more detailed description of the alert. |
Cron scheduling expression | Used for scheduling the alert. See below for more details. |
API reference | A unique identifier (reserved for future use; not currently used). |
Alert destination | Send to Syslog - Output alerts to a Syslog destination server. Invoke email web service - Output alerts to a Web Service endpoint. Send email - Output alerts via SMTP. Where applicable, specify the email addresses that will receive the alerts. If no alert destinations are checked, the alert result can still be viewed in the Alerts history page (discussed later). |
Scheduling
The Cron scheduling expression field is used for specifying when the alert should be executed automatically. Leave this field empty to disable automatic scheduling.
The scheduling expression format is: second minute hour day month weekday year. Each value (except for the seconds) accepts * (an asterisk) for specifying “All”.
Click the three dots button to open a wizard for generating common expressions.
Figure 3. Scheduling wizard
Table 2. A few examples of valid Cron scheduling expressions
Expression | Description |
---|---|
0 30 16 * * * * | Every day at 16:30. |
0 0 0 * 11 * 2020 | At midnight, every day of November 2020. |
0 */15 * 10 * * * | Every 15 minutes, on the 10th of every month. |
0 15 */12 * * 2 * | Every 12 hours, at minute 15 (e.g. 00:15, 12:15), only on Tuesdays. |
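To make the seven-field layout concrete, the following is a minimal Python sketch that splits an expression into the named fields listed above. It is only an illustration: the function name `split_cron_expression` is hypothetical, it is not part of Operations Dashboard, and it checks only the field count and the "no asterisk in seconds" rule stated above, not the full syntax of each field.

```python
# Field order as given in the scheduling expression format above.
FIELD_NAMES = ("second", "minute", "hour", "day", "month", "weekday", "year")


def split_cron_expression(expression: str) -> dict:
    """Split a seven-field scheduling expression into named fields.

    Hypothetical helper for illustration; it does not validate
    the individual field values.
    """
    parts = expression.split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} fields, got {len(parts)}"
        )
    fields = dict(zip(FIELD_NAMES, parts))
    # Every value except the seconds accepts "*" for "All".
    if fields["second"] == "*":
        raise ValueError("the seconds field does not accept '*'")
    return fields


# "Every day at 16:30" from Table 2.
fields = split_cron_expression("0 30 16 * * * *")
print(fields["hour"], fields["minute"])
```

Reading an expression field by field in this way is a quick check that the value intended for hours, for example, has not been placed in the minutes position.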
Definition attributes
Each alert’s definition consists of an alert type, time period, an operator and a threshold. When the alert query (discussed later) is executed, the result is tested against the alert type, operator and threshold parameters to determine if conditions are met to trigger the alert.
Figure 4. Definition attributes
Table 3. Alert types
Alert type | Description | Example |
---|---|---|
Any | The condition will be met if any results are returned for the query. | All spans that contain an MQ Return code of 2003 (each span will trigger an alert). |
Flatline | The condition will be met if there is a value greater than a certain threshold. | At least one APP C span duration was more than 10 seconds (one alert is issued for all spans). |
Frequency | The condition will be met if there were X events in the checked time. | More than 5 spans with errors seen in the last 10 minutes (one alert is issued for all spans). |
List | The condition will be met if a result is in/not-in a pre-defined list of values. | An error code is in the list of specified errors. |
Table 4. More definition attributes
Attribute | Description |
---|---|
Operator | Equals, Not equals, Greater than, Greater than or equals, Less than or Less than or equals. |
Alert query period | The time frame for the alert query. |
Alert error threshold | Applicable only to the Flatline and Frequency alert types. A numeric value that is tested against the query result. |
Field name | Applicable only to the List alert type. The name of the field whose value is checked against the value list. |
Value list | Applicable only to the List alert type. The list of values, separated by the delimiter specified in the "delimiter" field. |
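The way an operator, a threshold and a value list combine into a condition can be sketched in Python as follows. This is an assumption-laden illustration, not the product's implementation: the function names `condition_met` and `in_value_list` are hypothetical, and the sample values are made up for the example.

```python
import operator

# Operator names as listed in Table 4, mapped to Python comparisons.
OPERATORS = {
    "Equals": operator.eq,
    "Not equals": operator.ne,
    "Greater than": operator.gt,
    "Greater than or equals": operator.ge,
    "Less than": operator.lt,
    "Less than or equals": operator.le,
}


def condition_met(result_value, operator_name, threshold):
    """Test a numeric query result against the threshold (hypothetical)."""
    return OPERATORS[operator_name](result_value, threshold)


def in_value_list(value, value_list, delimiter=","):
    """Check whether a field value appears in a delimited value list
    (hypothetical helper for the List alert type)."""
    values = [item.strip() for item in value_list.split(delimiter)]
    return value in values


# Frequency example: more than 5 spans with errors in the query period.
print(condition_met(7, "Greater than", 5))

# List example: an error code is in the list of specified errors
# (the codes here are illustrative values only).
print(in_value_list("2003", "2003,2009,2033", delimiter=","))
```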
Query attributes
The Query section contains attributes of the actual query that should be executed against the tracing data that is collected by Operations Dashboard.
Figure 5. Query attributes
Table 5. Query attributes
Attribute | Description |
---|---|
Index set | Which Elasticsearch index set to query. |
Document type | Which Elasticsearch document type to query. |
Query | An Elasticsearch JSON query. |
Parameters | A JSON string containing key-value pairs to replace placeholders in the query. For example, you may use the placeholder $minElapsedTime in the query, and then enter {"minElapsedTime":"100"} in the parameters field. The value will be replaced during the query execution. |
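The placeholder replacement described for the Parameters attribute can be approximated with Python's `string.Template`, which uses the same `$name` placeholder style. This is a sketch under assumptions: the function `render_query` and the field name `elapsedTime` are hypothetical, the query body is only an example, and the exact substitution rules that Operations Dashboard applies (for instance, quoting) may differ.

```python
import json
from string import Template


def render_query(query_template: str, parameters_json: str) -> dict:
    """Replace $placeholders in a query using a JSON parameters string.

    Hypothetical helper; values are inserted as plain text, so
    "100" becomes the JSON number 100 in this example.
    """
    parameters = json.loads(parameters_json)
    rendered = Template(query_template).substitute(parameters)
    # Parsing the result confirms the rendered query is valid JSON.
    return json.loads(rendered)


# Example query; "elapsedTime" is an assumed field name.
query = '{"query": {"range": {"elapsedTime": {"gte": $minElapsedTime}}}}'
parameters = '{"minElapsedTime":"100"}'

print(json.dumps(render_query(query, parameters)))
```

Rendering and parsing the query locally before saving the alert is one way to catch a misspelled placeholder, because `substitute` raises `KeyError` when a placeholder has no matching parameter.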
Filters
The filters section is used to specify which filters will be applied to the query during its execution. For example: Appliance name=prod1.
Note that the Time filter is missing from the filters list. The time frame for the query is set in the Alert query period field.
Figure 6. Filters
Testing alerts
To manually test an alert, click on Test in an existing alert row in the alerts list page.
Alerts results
When alerts are triggered, they are sent to the specified destination (e.g. a Syslog server), and can also be downloaded from the Alerts history page, which shows all recent alert executions, both scheduled and manually tested alerts.
Figure 7. Alerts history
The list includes the name of the alert, the testing user (or “Scheduler” if the alert was run by the scheduler) and the status (such as OK, Executing and Error). A Download button appears only if alerts have been triggered; use it to download a JSON file describing the alerts.