Resizing storage (IBM Cloud Pak for AIOps on OpenShift)

Learn about increasing the persistent volume claim (PVC) sizes for IBM Cloud Pak for AIOps.

The original PVC sizes in Persistent storage sizing provide adequate space to begin working with IBM Cloud Pak for AIOps. Depending on your actual usage and needs, you might need to increase the PVC sizes to handle the workload after IBM Cloud Pak for AIOps is installed and running. In particular, CouchDB, Elasticsearch, and Kafka might need significantly more disk space. The following sections provide details about custom sizing for these PVCs.

Cassandra

Cassandra is used primarily to store Topology Management resources, as well as incoming metric and alert data. By default, 50 GB is allocated for each of the three Cassandra instances (PVCs). This default configuration targets environments with fewer than 500,000 topology resources, and provides ample storage for topology resources, metric data (120K metrics per 5-minute KPI window), and Cassandra's internal data management activities. For environments with larger topology resource requirements, allow an additional 70 GB for each additional 1M topology resources, and an additional 15 GB of metric storage for each additional 100K metrics per 5-minute KPI window.

In comparison to a default production environment, a larger environment requires additional storage. For example:

Example 1: An environment with 1.5M topology resources and 120,000 metrics per 5-minute KPI window requires an additional 70 GB of storage for topology resources and no additional storage for metric processing, for a total of 120 GB per Cassandra storage PVC.

Example 2: An environment with 3M topology resources and 1.2M metrics per 5-minute KPI window requires an additional 210 GB for topology storage and an additional 135 GB for metric storage, raising the total allocation per Cassandra storage PVC from the default of 50 GB to 395 GB.
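
For a quick estimate, you can apply these sizing rules in a short shell sketch. The variable names are illustrative and the round-up behavior is an assumption; where the sketch and the worked examples disagree, follow the examples.

    # Hypothetical sizing sketch for one Cassandra PVC: 50 GB base,
    # plus 70 GB per additional 1M topology resources beyond 500K,
    # plus 15 GB per additional 100K metrics (per 5-minute KPI window)
    # beyond the 120K included in the default configuration.
    RESOURCES=1500000   # expected topology resources
    METRICS=120000      # expected metrics per 5-minute KPI window
    awk -v res="$RESOURCES" -v met="$METRICS" 'BEGIN {
      extra_res = (res > 500000) ? res - 500000 : 0
      extra_met = (met > 120000) ? met - 120000 : 0
      size = 50
      size += int((extra_res + 999999) / 1000000) * 70   # round up to 1M units
      size += int((extra_met + 99999) / 100000) * 15     # round up to 100K units
      printf "%d GB per Cassandra PVC\n", size
    }'   # prints "120 GB per Cassandra PVC", matching Example 1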

Increasing the size of the Cassandra PVCs

You can increase the configured Cassandra storage size if your environment exceeds what the default configuration supports. You can also add disk storage incrementally as resource counts or metric rates grow over time. To add disk storage, use the Red Hat OpenShift (oc) CLI to scale your PVC storage:

  1. Export environment variables and switch to your project.

    export PROJECT=<project>
    export INSTALLATION_NAME="$(oc get installation -n ${PROJECT} -o jsonpath="{.items[0].metadata.name}")"
    export SIZE=<size>
    
    oc project ${PROJECT}
    

    Where

    • <project> is the project that your IBM Cloud Pak for AIOps installation is deployed in.
    • <size> is the storage size that you require for the Cassandra PVCs, as a number of gibibytes without a unit (for example, 120). The command in step 2 appends the Gi suffix.
  2. Run the following command to increase the size of the Cassandra PVCs.

    oc patch installation ${INSTALLATION_NAME} --type merge -p '{"spec":{"storage":{"data-aiops-topology-cassandra":{"size":"'${SIZE}Gi'"}}}}'
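
    After the change is reconciled, you can confirm the new size. The following check is a sketch; it assumes that the Cassandra PVC names contain data-aiops-topology-cassandra, matching the storage key in the patch command:

    oc get pvc -n ${PROJECT} | grep data-aiops-topology-cassandra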
    

CouchDB

CouchDB is used to store the results of manually run and fully automated runbook activities. CouchDB storage is uncapped, so you retain access to historical activities. Each CouchDB PVC is set with a default value of 20 GB of storage.

As an example, in a small environment with an average of 60 runbook executions per hour (one per minute, 1440 per day), budget about 20 MB of storage for CouchDB per day. In a Large (high availability) deployment with the same rate of runbook executions, budget about 60 MB of storage per day. With these example rates, the default CouchDB storage sizes give IBM Cloud Pak for AIOps about 1000 days of retention.
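
A minimal sketch of this retention arithmetic, using the small-environment figures from the example (the values are illustrative):

    # Hypothetical retention estimate: the default 20 GB CouchDB PVC divided
    # by roughly 20 MB of runbook history growth per day.
    PVC_MB=$((20 * 1024))   # default 20 GB PVC, in MB
    DAILY_MB=20             # approximate growth per day
    echo "~$((PVC_MB / DAILY_MB)) days of retention"   # ~1024, about 1000 days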

A sample runbook is provided to reduce the historical storage footprint when needed. For more information about sample runbooks, see Load and reset example runbooks.

EDB Postgres

Increasing the size of the Postgres PVCs

  1. Export environment variables and switch to your project.

    export PROJECT=<project>
    export INSTALLATION_NAME="$(oc get installation -n ${PROJECT} -o jsonpath="{.items[0].metadata.name}")"
    export SIZE=<size>
    
    oc project ${PROJECT}
    

    Where

    • <project> is the project that your IBM Cloud Pak for AIOps installation is deployed in.
    • <size> is the storage size that you require for the Postgres PVCs, as a number of gibibytes without a unit (for example, 40). The command in step 2 appends the Gi suffix.
  2. Run the following command to increase the size of the EDB Postgres PVC.

    oc patch installation ${INSTALLATION_NAME} --type merge -p '{"spec":{"storage":{"ibm-cp-aiops-edb-postgres":{"size":"'${SIZE}Gi'"}}}}'
    

    Important: You must run the preceding command as it is, even if the name of your Postgres PVC does not start with ibm-cp-aiops.
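
    After the change is reconciled, you can confirm the new size by listing the PVCs. The grep filter here is an assumption; adjust it to the actual name of your Postgres PVC:

    oc get pvc -n ${PROJECT} | grep postgres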

Elasticsearch

Elasticsearch is used to store various data sets in IBM Cloud Pak for AIOps, the largest of which is log data for training and analysis. If you have large volumes of log data, you might need to increase the storage provision for Elasticsearch. Follow the instructions for Storage class requirements in the IBM Cloud Pak for AIOps only section of the hardware requirements. The exact amount of Elasticsearch data, and therefore the amount of storage that is needed, varies greatly from case to case.

For estimation purposes, plan for around 10Gi of disk space usage per replica per day when running at 1000 log messages per second.

One replica of the data is stored in Elasticsearch on a starter deployment of IBM Cloud Pak for AIOps, and three copies of the data are stored in Elasticsearch for a production IBM Cloud Pak for AIOps deployment.

The default retention period for Elasticsearch data is 14 days, so plan for 15 days of storage.

The total estimated space must be available across all of the Elasticsearch instances in the cluster combined. A starter deployment of IBM Cloud Pak for AIOps has one Elasticsearch node, and a production IBM Cloud Pak for AIOps deployment has five Elasticsearch nodes.

Plan for a minimum of a 20% buffer to allow for variations in log data size.

Example 1: Running at 4000 logs per second on a starter deployment for 14+ days.

  • This would be roughly 40Gi x 15 days x 1 replica = 600Gi of Elasticsearch PVC usage.
  • Multiply this by a buffer of 1.2.
  • This results in a recommendation of 720Gi in total for the one Elasticsearch instance deployed by default in a starter deployment.
  • Set the storage size for the Elasticsearch PVC to 720Gi.

Example 2: Running at 8000 logs per second on a production deployment for 14+ days.

  • This would be roughly 80Gi x 15 days x 3 replicas = 3600Gi of Elasticsearch PVC usage.
  • Multiply this by a buffer of 1.2.
  • This results in a recommendation of 4,320Gi total across all five Elasticsearch instances, or roughly 864Gi per instance.
  • Set the storage size for the Elasticsearch PVC to 864Gi.
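
The following shell sketch reproduces this arithmetic. The variable names are illustrative, and with these inputs the output matches Example 2.

    # Hypothetical Elasticsearch sizing sketch: roughly 10Gi per 1000 log
    # messages per second, per replica, per day; 15 days of storage; a 1.2x
    # buffer; divided across the Elasticsearch instances in the cluster.
    LOGS_PER_SEC=8000   # expected log message rate
    REPLICAS=3          # 1 for a starter deployment, 3 for production
    NODES=5             # 1 for a starter deployment, 5 for production
    awk -v rate="$LOGS_PER_SEC" -v rep="$REPLICAS" -v nodes="$NODES" 'BEGIN {
      daily = rate / 1000 * 10          # Gi per replica per day
      total = daily * 15 * rep * 1.2    # 15 days of storage plus a 20% buffer
      printf "total: %dGi, per instance: %dGi\n", total, total / nodes
    }'   # prints "total: 4320Gi, per instance: 864Gi"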

Increasing the size of the Elasticsearch PVCs

  1. Export environment variables and switch to your project.

    export PROJECT=<project>
    export INSTALLATION_NAME="$(oc get installation -n ${PROJECT} -o jsonpath="{.items[0].metadata.name}")"
    export SIZE=<size>
    
    oc project ${PROJECT}
    

    Where

    • <project> is the project that your IBM Cloud Pak for AIOps installation is deployed in.
    • <size> is the storage size that you require for the Elasticsearch PVCs, as a number of gibibytes without a unit (for example, 864). The command in step 2 appends the Gi suffix.
  2. Run the following command to increase the size of the Elasticsearch PVCs.

    oc patch installation ${INSTALLATION_NAME} --type merge -p '{"spec":{"storage":{"data-aiops-ibm-elasticsearch-es-server-all":{"size":"'${SIZE}Gi'"}}}}'
    

    Important: If you have a multizone environment, the name of your PVC might end with the zone name instead of all, but you must run the preceding code block as it is, with data-aiops-ibm-elasticsearch-es-server-all.
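
    After the change is reconciled, you can confirm the new size. The grep filter here is an assumption about the PVC naming; adjust it if your PVC names differ:

    oc get pvc -n ${PROJECT} | grep elasticsearch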

Kafka

Kafka's storage requirements are primarily driven by Kafka log integrations, and also by other integrations such as Topology observers and Metric integrations, including custom Metric integrations.

The amount of storage that Kafka needs varies depending on your usage, and you might need to increase the storage provision for Kafka. Data collection from a metric data source commonly starts with historical data collection. For this reason, it is good practice to increase the PVC size for Kafka.

Increasing the size of the Kafka PVCs

You can increase Kafka's PVC size with the following commands. The Kafka pods will then restart with the resized PVCs.

  1. Export environment variables and switch to your project.

    export PROJECT=<project>
    export INSTALLATION_NAME="$(oc get installation -n ${PROJECT} -o jsonpath="{.items[0].metadata.name}")"
    export SIZE=<size>
    
    oc project ${PROJECT}
    

    Where

    • <project> is the project that your IBM Cloud Pak for AIOps installation is deployed in.
    • <size> is the storage size that you require for the Kafka PVCs, as a number of gibibytes without a unit (for example, 78). The command in step 2 appends the Gi suffix.
  2. Run the following command to increase the size of the Kafka PVCs.

    oc patch installation ${INSTALLATION_NAME} --type merge -p '{"spec":{"storage":{"data-iaf-system-kafka":{"size":"'${SIZE}Gi'"}}}}'
    

    Notes:

    • For Metric Anomaly Detection, it is recommended to increase the persistent storage for Kafka by 2Gi for every 10,000 KPIs being processed.
    • For Log Anomaly Detection, it is recommended to increase the Kafka PVC space based on the size of the logs, plus roughly an additional 20% buffer due to overheads, such as the key strings that are added to the messages.
    • For example, if you plan on uploading 40Gi of log data then add around 48Gi (40Gi plus a 20% buffer) to the existing 30Gi for a total of 78Gi for the Kafka PVC.
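
    After the Kafka pods restart with the resized PVCs, you can confirm the new size. The grep filter here is an assumption that the PVC names contain kafka, matching the storage key in the patch command:

    oc get pvc -n ${PROJECT} | grep kafka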

MinIO

Increasing the size of the MinIO PVCs

  1. Switch to your IBM Cloud Pak for AIOps project.

    oc project <project>
    

    Where <project> is the project that your IBM Cloud Pak for AIOps installation is deployed in.

  2. Run the following command to increase the size of the MinIO PVCs, where <size> is the storage size that you require for the MinIO PVCs.

    for p in $(oc get pvc -l app.kubernetes.io/name=ibm-minio -o jsonpath='{.items[*].metadata.name}'); do oc patch pvc $p --type merge -p '{"spec":{"resources":{"requests":{"storage":"<size>Gi"}}}}' ; done
    

    For example, to change the MinIO PVC size to 80Gi:

    for p in $(oc get pvc -l app.kubernetes.io/name=ibm-minio -o jsonpath='{.items[*].metadata.name}'); do oc patch pvc $p --type merge -p '{"spec":{"resources":{"requests":{"storage":"80Gi"}}}}' ; done
    

  3. Delete the aimanager-ibm-minio StatefulSet without deleting its pods. The --cascade=orphan option orphans the pods so that they keep running.

    oc delete sts --cascade=orphan aimanager-ibm-minio
    
  4. Run the following command to update the aimanager custom resource with the new MinIO storage size, where <size> is the storage size that you set for the MinIO PVCs.

    oc patch aimanager aimanager --type='json' -p="[{'op': 'replace', 'path': '/spec/minio/storage/storageSize', 'value': '<size>Gi'}]"
    

    For example, if the MinIO PVC size is updated to 80Gi:

    oc patch aimanager aimanager --type='json' -p="[{'op': 'replace', 'path': '/spec/minio/storage/storageSize', 'value': '80Gi'}]"
    
  5. Verify that the aimanager-ibm-minio StatefulSet has the correct storage size.

    oc get statefulset aimanager-ibm-minio -o jsonpath='{.spec.volumeClaimTemplates[0].spec.resources.requests.storage}'
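
  6. Optionally, confirm the new PVC sizes by reusing the label selector from step 2.

    oc get pvc -l app.kubernetes.io/name=ibm-minio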