Defining custom alerts for resources

You can define alerts that are triggered when two or more changes occur in the attributes, capacity, and performance of resources.

About this task

To define a custom alert, select the general attributes, capacity, and performance metrics that you want to combine to trigger an alert and specify their conditions and threshold values. You can combine conditions for the resource and its internal resources into a custom alert. The alert is triggered when the conditions for the attributes and capacity of the resource are met, and the performance of the resource falls outside the threshold values.

For example, you can create a custom alert that notifies you when the overall response time for the volumes on a SAN Volume Controller system is worse than 20 milliseconds per operation and the system CPU utilization on the nodes on the system is greater than 70%. The Overall Response Time is a metric that measures the average number of milliseconds that it takes to service each I/O operation on a volume. The System CPU Utilization is a metric that measures the average percentage of time that the processors on nodes are busy doing system I/O tasks.
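
The way the conditions in this example combine can be sketched in a few lines of Python. This is only an illustration of the logic, not the IBM Spectrum Control API: the `Condition` class, the `alert_triggered` function, and the sample values are hypothetical names and data chosen for the sketch. It assumes that a custom alert is triggered only when every one of its conditions is met.

```python
from dataclasses import dataclass
import operator

# Hypothetical model of a custom alert; none of these names come from the product.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


@dataclass(frozen=True)
class Condition:
    metric: str       # metric name, used as a key into a sample
    op: str           # one of the keys in OPERATORS
    threshold: float  # threshold value in the metric's own unit

    def is_met(self, sample: dict) -> bool:
        """Compare the sampled metric value with the threshold."""
        return OPERATORS[self.op](sample[self.metric], self.threshold)


def alert_triggered(conditions, sample) -> bool:
    """Assumption: the alert fires only when all conditions are met."""
    return all(condition.is_met(sample) for condition in conditions)


# The two conditions from the example above.
conditions = [
    Condition("Overall Response Time (ms/op)", ">", 20.0),
    Condition("System CPU Utilization (%)", ">", 70.0),
]

# Made-up samples: one violates both conditions, one violates only the first.
busy = {"Overall Response Time (ms/op)": 27.5, "System CPU Utilization (%)": 82.0}
slow_only = {"Overall Response Time (ms/op)": 27.5, "System CPU Utilization (%)": 40.0}

print(alert_triggered(conditions, busy))       # True
print(alert_triggered(conditions, slow_only))  # False
```

In the second sample the response time is worse than 20 milliseconds per operation, but the CPU utilization is below 70%, so the combined alert is not triggered.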

Procedure

  1. To define alerts for resources, choose one of the following options:
    • Define alerts for a policy:
      1. Go to Settings > Alert Policies.
      2. Double-click the policy.
      3. Click Edit Alert Definitions on the Alert Definitions tab.
    • Define alerts for a resource that is not managed by a policy:
      1. Go to the resource list page for the resource. For example, to define alerts for a block storage system, go to Storage > Block Storage Systems. To define alerts for a switch, go to Network > Switches.
      2. Right-click the resource for which you want to define alerts, then click View Alert Definitions.
      3. Click Edit Alert Definitions.
  2. Click Custom.
  3. Click the create alert icon, then enter a name for the alert.
  4. Assign a severity to the alert.
    Assigning a severity can help you more quickly identify and address the critical conditions that are detected on resources. The severity that you assign depends on the guidelines and procedures within your organization.
    Critical
    Assign this severity to alerts that are critical and need to be resolved. For example, assign a critical severity to alerts that notify you when the Port Send Bandwidth Percentage is greater than or equal to 85%. The default severity for custom alerts is critical.
    Warning
    Assign this severity to alerts that are not critical, but represent potential problems. For example, assign a warning severity to alerts that notify you when the Port Send Bandwidth Percentage is greater than or equal to 75% but less than 85%.
    Informational
    Assign this severity to alerts that might not require any action to resolve and are primarily for informational purposes.
  5. Select a component, category, and group for the alert.
    For example, select Storage System, Capacity, and Available Capacity.
  6. To generate an alert for a general or capacity attribute, specify the conditions for the alert.
    Conditions can include operators such as greater than or equal to, or less than or equal to. Conditions can also include storage values and time values.
    For example, for a capacity attribute such as Available Capacity, you can specify that an alert is generated when the amount of available capacity on a resource's pools is less than or equal to 50 GiB.
    Tips:
    • Not all attributes require conditions to generate an alert. For example, you can enable an alert for the Deleted Volume attribute, but you don't need to specify any conditions.
    • Some attributes use operators such as is, is not, contains, and changes. For example, for the Firmware attribute for a DS8000® you can select the operator Contains and enter R5 in the value field. An alert is triggered if the firmware is at the R5 level rather than at a later version such as R6.1, R6.2, or R6.3. You can use this alert definition if you want to be notified when the firmware for a storage system is reverted to a previous version.
  7. To generate an alert for a performance metric, specify the conditions for the alert.
    Conditions include an operator and a threshold value.
    1. Select an operator.
      The operator determines whether the alert is triggered when the performance of the resource is greater than or equal to the threshold value, or when it is less than or equal to the threshold value.
    2. Enter a threshold value.
      For example, to trigger an alert if the Total I/O Rate for a storage system is greater than or equal to 500 ops/s, enter the value 500.
      Tips for threshold values:
      • IBM Spectrum Control provides recommended values for threshold values that do not vary much between environments. For example, the default threshold values for Port Send Bandwidth Percentage are greater than or equal to 75% for warning alerts, and greater than or equal to 85% for critical alerts.

        However, for metrics that measure throughput and response times, thresholds can vary because of workload, model of hardware, amount of cache memory, and other factors. In these cases, there are no recommended values. To help determine threshold values for a resource, collect performance data over time to establish a baseline of the normal and expected performance behavior for that resource. After you determine a set of baseline values, define alerts to trigger if the measured performance behavior falls outside the normally expected range.

      • For some metrics, lower values might indicate more stress and higher values might indicate idle behavior. For example, a lower value for the Cache Holding Time Threshold metric might indicate a performance problem.
  8. Optional: Click View Performance to view a chart of the performance of the resource. Use the chart to evaluate the current and historical performance of a resource to help determine the threshold value for an alert.
    The chart uses colored lines to represent the different threshold values and severities that can be defined for an alert:
    • Critical alert: red
    • Warning alert: orange
    • Informational alert: blue
    To customize the chart, click Top 10 or Bottom 10 to show resources according to their performance, click a time period, and change the start and end dates for the data that is displayed.
  9. Click the duplicate alert icon to duplicate the alert.
    Use this action to add a second condition, and any subsequent conditions, to the alert.
  10. Repeat steps 5 - 9 to add more conditions to the alert.
  11. Optional: If you want to send email notifications of alert violations to contacts other than the policy contacts or global alert notification addresses, enter the email addresses in the Email Override field.
    Tip: If you enter an email address in the Email Override field, only that email address receives notifications for the alert. The following contacts do not receive notifications:
    • Any email addresses that are specified as policy contacts, if the alert is in an alert policy.
    • Any global email addresses that are specified for alert notifications. To view the global alert notification addresses, go to Settings > Notification Settings.
  12. Optional: Click View Additional Options to specify how frequently you are notified of alerts.
    Use these settings to avoid triggering too many alerts for some conditions.
  13. Optional: Click View Additional Options to specify that the following actions are taken when alert conditions are detected on monitored resources:
    Run script
    Run a script when an alert is triggered for the condition. Use a script to call external programs or run commands that take action as the result of an alert. By using a script, you can automatically address potential storage issues when they are detected to avoid unplanned downtime or performance bottlenecks.
    Netcool® / OMNIbus
    Send alert notifications to a Netcool server or OMNIbus EIF probe server within your environment that was configured to receive IBM Spectrum Control alerts.
    SNMP
    Send SNMP trap messages to any network management station (NMS), console, or terminal when an alert condition is detected. System administrators must set up their SNMP trap receiver with the provided management information base (MIB) files to receive SNMP traps from the product.
    Windows event log or UNIX syslog
    Write alert messages to the OS log. If you already have an administrator monitoring OS logs, this method is a way to centralize your priority messages for quick notification and viewing.
  14. Click Save Changes.
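
For metrics that have no recommended threshold values, the baseline approach described in step 7 can be sketched in Python as follows. This is one possible way to turn collected performance data into threshold values, not a method that the product prescribes. The function names, the sample data, and the choice of placing the warning threshold two standard deviations and the critical threshold three standard deviations above the baseline average are all assumptions made for the illustration.

```python
from statistics import mean, stdev


def baseline_thresholds(samples, warning_sigmas=2.0, critical_sigmas=3.0):
    """Return (warning, critical) thresholds derived from baseline samples.

    Assumption: the normal range is the baseline average plus a multiple
    of the sample standard deviation. Other definitions are equally valid.
    """
    if len(samples) < 2:
        raise ValueError("need at least two baseline samples")
    average = mean(samples)
    spread = stdev(samples)
    return (average + warning_sigmas * spread,
            average + critical_sigmas * spread)


def classify(value, warning, critical):
    """Map a measurement to a severity with 'greater than or equal to'."""
    if value >= critical:
        return "Critical"
    if value >= warning:
        return "Warning"
    return None  # within the expected range, so no alert


# Hypothetical response times (ms/op) collected during normal operation.
baseline = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0, 5.0]
warning, critical = baseline_thresholds(baseline)
print(f"warning >= {warning:.2f} ms/op, critical >= {critical:.2f} ms/op")

# The same classification with the recommended values for
# Port Send Bandwidth Percentage (75% warning, 85% critical).
print(classify(80.0, 75.0, 85.0))  # Warning
print(classify(85.0, 75.0, 85.0))  # Critical
print(classify(60.0, 75.0, 85.0))  # None
```

The values that the sketch produces are only a starting point for the threshold values that you enter in step 7. Use the performance chart from step 8 to check them against the historical behavior of the resource. For metrics where lower values indicate more stress, subtract the spread from the average and use a less than or equal to operator instead.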

Results

To view all the alerts generated by IBM Spectrum Control, go to Home > Alerts in the GUI.