How alerts work

Alerting functions examine the attributes, capacity, and performance of resources. If the conditions that are defined for an alert are met, the actions that are specified for that alert are taken. Typically, the actions include sending a notification. For example, if the status of a SAN Volume Controller storage system changes to Error, an alert is displayed on the Alerts page in the GUI, and an email might be sent to a storage administrator. For alerting on switches, fabrics, hosts, and virtual machines, asset, configuration, status, and performance information is examined.
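
The relationship between a condition and its actions can be sketched in a few lines of Python. This is an illustration only, not the product's implementation: the AlertDefinition class, the evaluate function, the two action functions, and the resource name svc-prod-01 are all hypothetical, and the actions only record text in a list instead of sending anything.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AlertDefinition:
    """Hypothetical alert: a condition plus the actions to take when it is met."""
    name: str
    condition: Callable[[dict], bool]
    actions: list = field(default_factory=list)


def evaluate(alert: AlertDefinition, resource: dict) -> bool:
    """Run every action of the alert if its condition is met for the resource."""
    if not alert.condition(resource):
        return False
    for action in alert.actions:
        action(alert.name, resource)
    return True


# Illustrative actions: they only record a line of text in this list.
notifications = []


def show_on_alerts_page(alert_name: str, resource: dict) -> None:
    notifications.append(f"GUI: {alert_name} on {resource['name']}")


def send_email(alert_name: str, resource: dict) -> None:
    notifications.append(f"email: {alert_name} on {resource['name']}")


status_error = AlertDefinition(
    name="Status is Error",
    condition=lambda r: r["status"] == "Error",
    actions=[show_on_alerts_page, send_email],
)

# Hypothetical resource record, as it might look after data collection.
svc = {"name": "svc-prod-01", "type": "SAN Volume Controller", "status": "Error"}
triggered = evaluate(status_error, svc)
```

In this sketch, a resource whose status is anything other than Error leaves the notifications list unchanged.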

Triggering conditions for alerts

The conditions that trigger alert notifications depend on the type of resource that you are monitoring. In general, the following types of conditions can trigger alerts:
  • An attribute or configuration of a resource changed
  • The capacity of a resource fell outside a specified range
  • The performance of a resource fell outside a specified range
  • The storage infrastructure changed, for example, a resource was added or removed
  • Data is not being collected for a resource

Learn more about the conditions that can trigger alerts for each type of resource.

Event processing

Conditions that generate alerts are detected when data is collected from storage systems and during event processing. By default, the metadata that is collected from storage systems is refreshed every 24 hours. For some storage systems, such as IBM Storage Accelerate and XIV, events are polled from the resource every minute. For IBM Storage Scale, status change events are polled frequently, typically within minutes. For other resources, events are subscription-based: the resource itself, or a data source such as a CIM agent, sends the events to IBM Storage Insights Pro when conditions change on the resource.

Examples of storage systems that use subscription-based event processing include SAN Volume Controller, Storwize V7000, Storwize V7000 Unified, and IBM Storage FlashSystem V9000. For these storage systems, a probe is automatically run when many events are received from the storage system in a short time period. To avoid performance bottlenecks, these probes are run no more often than once every 20 minutes.
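
The following Python sketch shows one way that a burst of events could trigger a probe while keeping probes at least 20 minutes apart. It is an assumption about the general technique, not a description of how the product schedules probes. The ProbeScheduler class is hypothetical, and the burst settings (5 events within 60 seconds) are invented for the example; only the 20-minute limit comes from the text.

```python
from collections import deque


class ProbeScheduler:
    """Hypothetical scheduler: probe on an event burst, at most once per interval."""

    MIN_PROBE_INTERVAL = 20 * 60  # seconds; the 20-minute limit from the text

    def __init__(self, burst_size: int = 5, burst_window: float = 60.0):
        # burst_size and burst_window are assumed values, not product settings.
        self.burst_size = burst_size
        self.burst_window = burst_window
        self._recent = deque()    # timestamps of recently received events
        self._last_probe = None   # time of the most recent probe, if any

    def on_event(self, now: float) -> bool:
        """Record an event at time `now` (seconds); return True if a probe starts."""
        self._recent.append(now)
        # Forget events that are older than the burst window.
        while self._recent and now - self._recent[0] > self.burst_window:
            self._recent.popleft()
        if len(self._recent) < self.burst_size:
            return False  # not enough events in a short time
        if (self._last_probe is not None
                and now - self._last_probe < self.MIN_PROBE_INTERVAL):
            return False  # a probe ran too recently
        self._last_probe = now
        self._recent.clear()
        return True


scheduler = ProbeScheduler()
first_burst = [scheduler.on_event(t) for t in range(5)]           # starts at 0 s
second_burst = [scheduler.on_event(600 + t) for t in range(5)]    # 10 minutes later
third_burst = [scheduler.on_event(1300 + t) for t in range(5)]    # past 20 minutes
```

In this sketch, the first and third bursts each start a probe on their fifth event, and the second burst is ignored because it arrives less than 20 minutes after the previous probe.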

Determining which type of alert to use

To determine whether to define alerts in alert policies, for individual resources, or for the set of resources that are included in an application or general group, follow these guidelines:
Which type of alerts to use, by scenario:
Alerts defined in alert policies

You want to manage alert conditions and notification settings for a group of resources of the same type. For example, if you have several SAN Volume Controller storage systems in your environment, you can create an alert policy so that the alert definitions are the same for all of the SAN Volume Controller systems.

If you have some SAN Volume Controller systems in a test environment, and some in a production environment, you can use one alert policy for the test environment, and another for the production environment.

Resource alerts

You want to receive alert notifications about changes for a specific resource, or its internal resources. For example, for a storage system, you can alert on the attributes of the system itself, and on the attributes of its volumes, pools, ports, and other internal resources.

If you define an alert for a resource, for example, a performance alert for the ports on a storage system, the alert threshold value applies to all of the ports on the storage system. You cannot apply different alert thresholds to internal resources of the same type on a resource.

Application alerts

Use application alerts in the following scenarios:
  • You want to receive alert notifications for all the resources of a certain type in an application. For example, if your application uses multiple storage systems, you can define the storage system alerts once for the application and the alerts apply to all the storage systems. If you later add more storage systems to the application, the existing application alerts apply to those storage systems also.
  • You want to apply different thresholds to internal resources of the same type on a storage system. For example, you have production applications and test applications that use volumes on a SAN Volume Controller. The production applications require response times of 6 milliseconds or less, while the test applications can tolerate response times up to 30 milliseconds. You can use application alerts to set separate response time thresholds for the volumes that are used by the different applications, according to the needs of each application.
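
The second scenario can be illustrated with a short Python sketch. The 6 and 30 millisecond limits come from the example above. Everything else, including the Volume class, the violations function, the volume names, and the measured response times, is made up for illustration and does not reflect how the product stores or evaluates alert definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    """Hypothetical volume record on a single storage system."""
    name: str
    application: str
    response_time_ms: float


# Response time limits per application, taken from the example in the text.
RESPONSE_TIME_LIMIT_MS = {"production": 6.0, "test": 30.0}


def violations(volumes):
    """Return the names of volumes that exceed the limit of their application."""
    return [
        v.name
        for v in volumes
        if v.response_time_ms > RESPONSE_TIME_LIMIT_MS[v.application]
    ]


# Invented measurements: all four volumes are on the same storage system.
volumes = [
    Volume("vol_prod_1", "production", 4.2),
    Volume("vol_prod_2", "production", 9.5),
    Volume("vol_test_1", "test", 9.5),
    Volume("vol_test_2", "test", 41.0),
]
slow = violations(volumes)
```

Here the same response time of 9.5 milliseconds violates the production limit but not the test limit, which a single resource-level threshold for all volumes could not express.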
General group alerts

Use general group alerts in the following scenarios:
  • You want to receive alert notifications about changes for a subset of the resources of a particular type. For example, you can detect when the ports that are used for replication on your SAN Volume Controller have insufficient buffer-to-buffer credit. Alert notifications are not generated for ports that are not used for replication.
  • You want to receive alert notifications about changes for a group of resources that are logically related. You can group all the storage systems at a specific location or all the servers that use a particular operating system. For example, you can receive alert notifications when the used capacity of any of your Linux® servers exceeds 80%.
Tip: If a resource is in both an alert policy and a general group, the alert definitions for both the policy and the group are applied.
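
The tip can be pictured with the following Python sketch, in which a server that belongs to both an alert policy and a general group is checked against the alert definitions of both. The sketch makes assumptions throughout: the Definition class, the function names, the server records, and the membership sets are hypothetical, and the 80% capacity threshold is borrowed from the Linux server example above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Definition:
    """Hypothetical alert definition: a name and a condition on a server record."""
    name: str
    condition: Callable[[dict], bool]


# Definitions from an alert policy and from a general group (both invented).
policy_alerts = [Definition("Status is Error", lambda s: s["status"] == "Error")]
group_alerts = [
    Definition("Used capacity > 80%", lambda s: s["used_capacity_pct"] > 80)
]


def definitions_for(server, policy_members, group_members):
    """Collect the definitions from the policy and the group that include the server."""
    definitions = []
    if server["name"] in policy_members:
        definitions += policy_alerts
    if server["name"] in group_members:
        definitions += group_alerts
    return definitions


def triggered_alerts(server, policy_members, group_members):
    """Return the names of the applicable definitions whose condition is met."""
    applicable = definitions_for(server, policy_members, group_members)
    return [d.name for d in applicable if d.condition(server)]


policy_members = {"linux-01", "win-01"}   # servers that the policy manages
linux_group = {"linux-01", "linux-02"}    # general group of Linux servers

linux_01 = {"name": "linux-01", "status": "Error", "used_capacity_pct": 91}
win_01 = {"name": "win-01", "status": "Normal", "used_capacity_pct": 95}

both = triggered_alerts(linux_01, policy_members, linux_group)
policy_only = triggered_alerts(win_01, policy_members, linux_group)
```

In this sketch, linux-01 is evaluated against both sets of definitions, while win-01 is evaluated against the policy only, so its high used capacity does not generate the group alert.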