Performance considerations for metric data collection
When you configure an integration that collects metric data in IBM Cloud Pak for AIOps, such as an AppDynamics, AWS CloudWatch, Dynatrace, Instana, New Relic, Splunk, or Zabbix integration, consider the amount of data on the target system when you select metrics.
If the selected target system holds an exceptionally large amount of data, collecting data from it uses more resources than collecting from a system with less data.
Filtering capabilities table
The types of filtering that are available for each integration are shown in the following table. Because no target system supports all filtering options, the available filters differ by integration.
Metric integration name | Metrics filtering | Host filtering | Namespace filtering | Application filtering | Technology filtering | Index filtering |
---|---|---|---|---|---|---|
AppDynamics | X | X | | | | |
AWS CloudWatch | X | X | | | | |
Dynatrace | X | X | | | | |
Instana | X | X | | | | |
New Relic | X | X | | | | |
Splunk | X | X | X | | | |
Zabbix | X | X | X | | | |
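The following minimal sketch (in Python) mirrors the filtering capabilities table as a simple lookup, which can be useful if you script the preparation of integration configurations and want to verify a filter type before applying it. The dictionary values are taken directly from the table above; the function name is illustrative.

```python
# Filtering capabilities from the table above, expressed as a lookup.
SUPPORTED_FILTERS = {
    "AppDynamics": {"metrics", "host"},
    "AWS CloudWatch": {"metrics", "host"},
    "Dynatrace": {"metrics", "host"},
    "Instana": {"metrics", "host"},
    "New Relic": {"metrics", "host"},
    "Splunk": {"metrics", "host", "namespace"},
    "Zabbix": {"metrics", "host", "namespace"},
}

def supports_filter(integration: str, filter_type: str) -> bool:
    """Return True if the given filter type is available for the integration."""
    return filter_type in SUPPORTED_FILTERS.get(integration, set())

print(supports_filter("Splunk", "namespace"))     # True
print(supports_filter("Dynatrace", "namespace"))  # False
```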
For target systems with a high amount of data, consider the following actions to ensure your integration works as expected:
- Do not select all metrics. Instead, review the documentation for the target system that you are pulling data from and select only the metrics that are useful for your use case. Not all metrics suit every use case.
- Do not select all hosts for a single integration. If you have thousands of hosts, it is often better to create multiple integrations than one integration that covers all your hosts (see the host-partitioning sketch after this list).
- Do not create more than 10 integrations on a small cluster or 20 integrations on a large cluster at any one time. The more integrations that you have running simultaneously, the more computing resources are needed.
- If the Kafka persistent volume claims (PVCs) are near capacity, increase each PVC by an extra 60 GB. For more information about adjusting the size of the PVC, see Increasing Kafka PVC. A sketch of this adjustment follows this list.
Note: For production deployments, the default Kafka PVC is 30 GB, but a minimum of 90 GB is recommended. This adjustment must be made after installing the cluster.
- If these options are available in the console for the integration that you are configuring, you can reduce the length of time for historical data collection, reduce the base parallelism, or select a smaller span value to reduce the resource usage of the integration.
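The following sketch illustrates the host-partitioning suggestion from the list above: splitting a large host inventory into fixed-size batches so that each batch can be assigned to its own integration. The batch size and host names are hypothetical; choose values that fit your environment and cluster sizing.

```python
from typing import Iterable, List

def batch_hosts(hosts: Iterable[str], batch_size: int = 500) -> List[List[str]]:
    """Split a host inventory into fixed-size batches, one batch per integration."""
    host_list = list(hosts)
    return [host_list[i:i + batch_size] for i in range(0, len(host_list), batch_size)]

# Hypothetical inventory of 2,000 hosts split across 4 integrations.
all_hosts = [f"host-{n:04d}.example.com" for n in range(2000)]
for index, batch in enumerate(batch_hosts(all_hosts), start=1):
    print(f"integration-{index}: {len(batch)} hosts")
```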
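The next sketch shows one way to request the Kafka PVC expansion with the Kubernetes Python client, assuming the storage class allows volume expansion. The namespace and PVC names are placeholders, not the actual resource names in your cluster; follow the Increasing Kafka PVC documentation for the supported procedure.

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in the cluster
core = client.CoreV1Api()

namespace = "cp4aiops"                                         # placeholder namespace
kafka_pvcs = ["data-kafka-0", "data-kafka-1", "data-kafka-2"]  # placeholder PVC names

# Raise each PVC's storage request to the recommended 90 GB minimum.
for pvc_name in kafka_pvcs:
    patch = {"spec": {"resources": {"requests": {"storage": "90Gi"}}}}
    core.patch_namespaced_persistent_volume_claim(name=pvc_name, namespace=namespace, body=patch)
    print(f"Requested expansion of {pvc_name} to 90Gi")
```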
Examples of performance considerations when collecting data:
- A Dynatrace integration takes too long to collect historical data, with unstable and inconsistent collection time frames; for example, it takes 4 days to collect 5 minutes of data.
- An AWS CloudWatch integration collects more data than the metric anomaly detection capacity of the system, for example 100,000 KPIs per 5 minutes with only 5 metrics selected, while the cluster sizing supports only 30,000 KPIs per 5 minutes. The solution is to scale the cluster sizing to accommodate more metric data (a rough capacity check is sketched after these examples).
- A Zabbix integration pod restarts on a small Red Hat OpenShift Container Platform environment while collecting historical metric data and shows an "Unknown" status. The Kafka PVC on this production cluster has only 30 GB of storage and is at maximum capacity, and the iaf-kafka pod is also restarting. The solution is to increase the size of the Kafka PVC per the preceding recommendations and to restart the connector-bridge, iaf-kafka, and integration pods.
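The AWS CloudWatch example above can be turned into a rough pre-flight check: estimate how many KPIs a configuration will produce per collection window and compare the result with the KPI rate that the cluster sizing supports. The resource and metric counts below are hypothetical; 30,000 KPIs per 5 minutes is the supported rate quoted in the example.

```python
def estimated_kpis(resource_count: int, metrics_per_resource: int) -> int:
    """Estimate KPIs produced per collection window (one KPI per resource/metric pair)."""
    return resource_count * metrics_per_resource

SUPPORTED_KPIS_PER_WINDOW = 30_000  # cluster sizing from the example above (per 5 minutes)

# Hypothetical configuration: 20,000 monitored resources with 5 metrics selected.
estimate = estimated_kpis(resource_count=20_000, metrics_per_resource=5)
if estimate > SUPPORTED_KPIS_PER_WINDOW:
    print(f"Estimated {estimate} KPIs per window exceeds the supported {SUPPORTED_KPIS_PER_WINDOW}.")
    print("Reduce the selected metrics or hosts, or scale the cluster sizing.")
```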