This policy monitors a set of run-time performance conditions for an API, and sends alerts to a
specified destination when the performance conditions are violated. This policy enables you to
monitor run-time performance for one or more specified applications. You can configure this policy
to define a Service Level Agreement (SLA), which is a set of conditions that defines the level of
performance that an application should expect from an API. You can use this policy to identify
whether the API threshold rules are met or exceeded. For example, you might define an agreement with
a particular application that sends an alert to the application if responses are not sent within a
certain maximum response time. You can configure SLAs for each API or application combination.
Parameters such as success count, fault count, and total request count are immediate monitoring
parameters: they are evaluated as soon as the limit is breached. The remaining
parameters are aggregated monitoring parameters, which are evaluated once the configured
interval is over. If any of the parameters is breached, an event notification (Monitor
event) is sent to the configured destination. In a single policy, multiple action configurations
behave as an AND condition. To achieve an OR condition, configure multiple policies.
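The AND and OR behavior described above can be illustrated with a minimal sketch. It assumes that "AND" means an alert is raised only when every action configuration in the policy is violated. The class and function names (`ActionConfiguration`, `policy_breached`, `any_policy_breached`) are hypothetical and are not part of the API Gateway product; only the operator names come from this policy's Operator property.

```python
import operator
from dataclasses import dataclass

# Operator names as listed for this policy; the mapping itself is illustrative.
OPERATORS = {
    "Greater Than": operator.gt,
    "Less Than": operator.lt,
    "Equals To": operator.eq,
}


@dataclass
class ActionConfiguration:
    """Hypothetical model of one action configuration (metric, operator, value)."""
    metric: str
    op: str
    value: float

    def is_breached(self, metrics):
        # Compare the observed metric value against the configured alert value.
        return OPERATORS[self.op](metrics[self.metric], self.value)


def policy_breached(configs, metrics):
    # Assumption: within one policy, all configurations must be breached (AND).
    return all(c.is_breached(metrics) for c in configs)


def any_policy_breached(policies, metrics):
    # Separate policies are evaluated independently, which gives OR behavior.
    return any(policy_breached(p, metrics) for p in policies)


observed = {"Fault Count": 12, "Average Response Time": 20}
combined = [
    ActionConfiguration("Fault Count", "Greater Than", 10),
    ActionConfiguration("Average Response Time", "Greater Than", 30),
]
faults_only = [ActionConfiguration("Fault Count", "Greater Than", 10)]

and_result = policy_breached(combined, observed)
or_result = any_policy_breached([combined, faults_only], observed)
```

In this sketch the combined policy does not fire because only one of its two conditions is violated, while splitting the conditions across two policies lets the fault count condition fire on its own.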
The table lists the properties that you can specify for this policy:
Parameter | Value |
Action Configuration. Specifies the type of action to be configured. |
|
Name |
Specifies the name of the metric to be monitored. You can select one of the available
metrics:
- Availability. Indicates whether the native API is available to the
clients in the current interval. webMethods API Gateway calculates the availability of the native
API over the specified alert interval, starting from the instant the API is activated.
The availability of the API is calculated as (time for which the native API is up /
total interval of time) x 100. This value is measured in %.
For example, if you set
Availability as less than 90, then whenever the availability of the native API falls below 90% in
the specified time interval, webMethods API Gateway generates an alert. Suppose the alert interval is set to 1
minute (60 seconds) and there are 7 API invocations at various times in that minute, with a
combination of up and down states as shown in the table. The availability is then calculated as
follows.
Request# | Invocation time (seconds) | Service status | Up time (seconds) |
1 | 5 | Up | 5 (from start to now) |
2 | 15 | Up | 10 (between 1 and 2) |
3 | 30 | Down | 15 (between 2 and 3) |
4 | 40 | Down | 0 (since last state is down) |
5 | 45 | Up | 0 |
6 | 50 | Down | 5 (between 5 and 6) |
7 | 55 | Up | 0 |
 | | | 5 (remaining 5 seconds considered as Up in line with last state) |
 | Total | | 40 (Availability is 67%) |
Because the calculated availability of the native API is 66.67%, which is below 90%, API
Gateway generates an alert. For an ongoing request, the API is considered to be down when API
Gateway receives a connection-related error from the native API in the outbound call. If the API
does not respond with an HTTP response, it is considered to be down.
- Average Response Time. Indicates the average time taken by the service to
complete all invocations in the current interval. The average is calculated from the instant the API
activation takes place for the configured interval.
For example, if you set an alert for Average
Response Time greater than 30 ms with an interval of 1 minute, the monitoring
interval starts on API activation, and the average response time of all runtime invocations for this API in that
minute is calculated. If the average is greater than 30 ms, a monitor event is generated. If this metric is
configured under the Monitor SLA policy, which lets you specify applications for
application-specific SLA monitoring, the average response time is monitored only
for the specified application.
- Fault Count. Indicates the number of faults returned in the current
interval. Responses returned from webMethods API Gateway with an HTTP status code greater than or equal to 400 are
considered fault transactions. This includes downtime errors.
- Maximum Response Time. Indicates the maximum time in milliseconds (ms) to
respond to a request in the current interval.
- Minimum Response Time. Indicates the minimum time in milliseconds (ms) to
respond to a request in the current interval.
- Success Count. Indicates the number of successful requests in the current
interval.
- Total Request Count. Indicates the total number of requests (successful
and unsuccessful) in the current interval.
|
Operator |
Specifies the operator applicable to the selected metric. Select one of the available
operators: Greater Than, Less Than, Equals To.
|
Value |
Specifies the alert value for which the monitoring is applied. |
Destination |
Specifies the destination where the alert is to be logged. Select the required
options:
- webMethods API Gateway
- Developer Portal
- CentraSite
Note: This option is applicable only for the APIs published from CentraSite to API
Gateway.
- Elasticsearch
- Email. You can add multiple email addresses.
Note: If an email alias is available,
you can type the email alias in the Email Address field, enclosed in
braces. For example, if test is the email alias, type {test}.
- Local Log. You can select the severity of the messages to be logged (logging level) from the Log
Level list. The available log levels are ERROR, INFO, and WARN.
For details on publishing to custom destinations, see Custom destination.
|
Alert Interval |
Specifies the time period over which performance is monitored before an alert is sent if a condition
is violated.
The timer starts once the API is activated and resets after the configured time interval. If an
API is deactivated, the interval is reset, and it starts afresh when the API is activated again.
|
Unit |
Specifies the unit of measurement of the Alert Interval configured, to
monitor performance, before sending an alert. You can provide one of the following units.
|
Alert Frequency |
Specifies how frequently to issue alerts for the counter-based metrics (Total Request
Count, Success Count, Fault Count). Select one of the options:
- Only Once. Triggers an alert only the first time one of the specified
conditions is violated.
- Every Time. Triggers an alert every time one of the specified conditions
is violated.
|
Alert Message |
Specifies the text to be included in the alert. |
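
The availability calculation described for the Availability metric can be reproduced with a short sketch. It is an illustration of the worked example in the table only, not the product's implementation. The function name `availability_percent` is hypothetical, and the sketch assumes that the API is treated as up at the start of the interval and that the time between two invocations is credited according to the state observed at the earlier invocation.

```python
def availability_percent(invocations, interval_seconds):
    """Return availability in % for one alert interval.

    invocations: list of (time_in_seconds, is_up) tuples, sorted by time.
    """
    up_time = 0
    last_time = 0
    last_up = True  # assumption: the API counts as up when the interval starts

    for time, is_up in invocations:
        # Credit the elapsed segment only if the previous observed state was up.
        if last_up:
            up_time += time - last_time
        last_time, last_up = time, is_up

    # The remainder of the interval follows the last observed state.
    if last_up:
        up_time += interval_seconds - last_time

    return up_time / interval_seconds * 100


# The seven invocations from the example table (time in seconds, up or down).
example = [
    (5, True), (15, True), (30, False), (40, False),
    (45, True), (50, False), (55, True),
]
availability = availability_percent(example, 60)
print(round(availability, 2))  # 66.67

# With the condition "Availability Less Than 90", this interval raises an alert.
alert = availability < 90
```

The up time adds up to 40 seconds out of 60, which matches the total in the table.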