Pre-defined health alerts
Your IBM Cloud Pak for AIOps deployment is automatically configured with self-monitoring capabilities that detect and alert on critical issues, such as low storage capacity. These alerts are sent automatically from the Alertmanager in Red Hat OpenShift Container Platform to Cloud Pak for AIOps.
Prerequisites
- User workload monitoring must be enabled to collect metrics from your Cloud Pak for AIOps deployment. To enable user workload monitoring, see Configuring and enabling OpenShift Container Platform monitoring.
Monitoring
Cloud Pak for AIOps includes the following automated monitoring capabilities:
Storage monitoring
Prometheus rules that are provided by default monitor all Cloud Pak for AIOps persistent volume claims (PVCs) and generate alerts when storage thresholds are exceeded:
- Warning alerts are generated when PVC usage reaches 75% (25% available space remaining)
- Critical alerts are generated when PVC usage reaches 85% (15% available space remaining)
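The two thresholds above can be illustrated with a minimal Python sketch. It is not the Prometheus rule that ships with the product; the function name `classify_pvc_usage` and its byte-count parameters are hypothetical, and the sketch assumes that an alert level applies as soon as usage reaches the stated percentage.

```python
# Thresholds taken from the list above (percent of PVC capacity used).
WARNING_THRESHOLD = 75.0
CRITICAL_THRESHOLD = 85.0


def classify_pvc_usage(used_bytes: int, capacity_bytes: int) -> str:
    """Return 'critical', 'warning', or 'ok' for a PVC's current usage.

    Hypothetical helper for illustration only; the shipped Prometheus
    rules may evaluate usage differently.
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    used_percent = 100.0 * used_bytes / capacity_bytes
    # Check the higher threshold first so that 85% or more is never
    # reported as a mere warning.
    if used_percent >= CRITICAL_THRESHOLD:
        return "critical"
    if used_percent >= WARNING_THRESHOLD:
        return "warning"
    return "ok"
```

For example, a 100 GiB PVC with 80 GiB used falls into the warning band, and the same PVC with 90 GiB used falls into the critical band.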
PVCs for the following components are monitored:
- Kafka
- Zookeeper
- Cassandra
- CouchDB
- EDB Postgres
- Elasticsearch
- MinIO
- Redis
- Zen Metastore EDB
Alert processing
Alerts generated by this automation can be identified by alert.sender.name = Prometheus. Create policies to handle these alerts as needed. To learn more, see Creating policies.
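As a rough illustration of that condition, the following Python sketch selects alerts whose sender name is Prometheus. It is not a policy definition. It assumes that each alert is available as a dictionary with a nested `sender` object that has a `name` field, mirroring the `alert.sender.name` property; the helper name and the sample alerts are hypothetical.

```python
from typing import Any


def is_self_monitoring_alert(alert: dict[str, Any]) -> bool:
    """Return True when the alert's sender name is 'Prometheus'.

    Assumes a nested 'sender' object with a 'name' field, mirroring the
    alert.sender.name property (an assumption about the data shape).
    """
    sender = alert.get("sender") or {}
    return sender.get("name") == "Prometheus"


# Hypothetical sample alerts, for illustration only.
alerts = [
    {"id": "a1", "sender": {"name": "Prometheus"}},
    {"id": "a2", "sender": {"name": "Other"}},
    {"id": "a3"},  # no sender information at all
]

matching_ids = [a["id"] for a in alerts if is_self_monitoring_alert(a)]
print(matching_ids)
```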
The following example policy shows a scope-based grouping:
Automated webhook configuration
A webhook instance called selfmonitoring-webhook is automatically created during installation to:
- Receive alerts from OpenShift Container Platform Alertmanager
- Forward critical alerts to the Cloud Pak for AIOps Alert Viewer
- Filter alerts based on severity and namespace
This webhook is hidden from the Cloud Pak for AIOps UI to prevent accidental modification or deletion.
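The kind of filtering described above can be sketched in Python against the standard Alertmanager webhook payload, which carries a list of `alerts`, each with a `labels` map. This is not the implementation of selfmonitoring-webhook, whose exact filter criteria are not described here. The function `select_alerts`, the namespace `example-aiops-namespace`, and the alert names are hypothetical.

```python
from typing import Any


def select_alerts(
    payload: dict[str, Any],
    severities: set[str],
    namespaces: set[str],
) -> list[dict[str, Any]]:
    """Keep alerts whose severity and namespace labels are both allowed."""
    selected = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        if (
            labels.get("severity") in severities
            and labels.get("namespace") in namespaces
        ):
            selected.append(alert)
    return selected


# Hypothetical payload in the Alertmanager webhook format.
payload = {
    "version": "4",
    "status": "firing",
    "alerts": [
        {"status": "firing",
         "labels": {"alertname": "ExamplePVCCritical", "severity": "critical",
                    "namespace": "example-aiops-namespace"}},
        {"status": "firing",
         "labels": {"alertname": "ExamplePVCWarning", "severity": "warning",
                    "namespace": "example-aiops-namespace"}},
        {"status": "firing",
         "labels": {"alertname": "UnrelatedCritical", "severity": "critical",
                    "namespace": "another-namespace"}},
    ],
}

kept = select_alerts(payload, {"critical"}, {"example-aiops-namespace"})
print([a["labels"]["alertname"] for a in kept])
```

With these inputs, only the critical alert from the chosen namespace is kept. The warning alert and the alert from the other namespace are dropped.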