Managing persistent volume sizes for Watson OpenScale

Manage persistent volumes to control data storage for Watson OpenScale.

Manage etcd persistent volumes that support resizing

To manage the size of Watson OpenScale etcd persistent volumes that support resizing, you must complete the following steps:
  1. Log in to Red Hat OpenShift Container Platform with the following command:
    oc login <OpenShift_URL>:<port>
  2. Pause the Watson OpenScale operator reconciliation of the Watson OpenScale custom resource:
    instanceProjectName='cpd-instance'
    
    oc patch WOService aiopenscale -n ${instanceProjectName} --type merge --patch '{"spec": {"ignoreForMaintenance": true}}'

    If you did not install Cloud Pak for Data in the cpd-instance project, set the instanceProjectName variable to the name of the project where it is installed.

  3. Delete the etcd StatefulSet cluster with the following command:
    oc delete sts aiopenscale-ibm-aios-etcd -n ${instanceProjectName}
  4. Customize the size of the etcd persistent volumes with the following command:
    targetPVCSize=8Gi
    
    oc patch WOServiceConfig aiopenscale -n ${instanceProjectName} --type merge --patch '{"spec": {"aios_etcd": {"pvc": {"size": "'${targetPVCSize}'"}}}}'

    The default size value is 4Gi.

  5. Change the size of the etcd persistent volume claim (PVC) objects with the following commands:
    oc patch pvc data-aiopenscale-ibm-aios-etcd-0 -n ${instanceProjectName} --type merge --patch  '{"spec":{"resources":{"requests":{"storage":"'${targetPVCSize}'"}}}}'
    oc patch pvc data-aiopenscale-ibm-aios-etcd-1 -n ${instanceProjectName} --type merge --patch  '{"spec":{"resources":{"requests":{"storage":"'${targetPVCSize}'"}}}}'
    oc patch pvc data-aiopenscale-ibm-aios-etcd-2 -n ${instanceProjectName} --type merge --patch  '{"spec":{"resources":{"requests":{"storage":"'${targetPVCSize}'"}}}}'
  6. Resume the Watson OpenScale operator reconciliation of the Watson OpenScale custom resource:
    oc patch WOService aiopenscale -n ${instanceProjectName} --type merge --patch '{"spec": {"ignoreForMaintenance": false}}'
  7. Check the status of the reconciliation with the following command:
    oc get WOService aiopenscale -n ${instanceProjectName} -o jsonpath='{.status.wosStatus} {"\n"}'

    The status of the custom resource changes to Completed when the reconciliation finishes successfully.

  8. After the reconciliation completes, verify that the etcd PVCs are resized with the following command:
    oc get pvc -n ${instanceProjectName} | grep data-aiopenscale-ibm-aios-etcd
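
Whether this procedure applies depends on whether the storage class behind the etcd PVCs allows volume expansion. The following is a minimal sketch of one way to check, not a documented part of the product: the function name etcd_storage_supports_resize is hypothetical, and the sketch assumes that all three etcd PVCs use the same storage class and that the storage class reports the standard Kubernetes allowVolumeExpansion field.

```shell
# Hypothetical helper (not part of the product): succeeds when the storage
# class that backs the first etcd PVC allows volume expansion.
# Assumption: all three etcd PVCs use the same storage class.
etcd_storage_supports_resize() {
  project="$1"
  # Look up the storage class name from the first etcd PVC.
  storage_class=$(oc get pvc data-aiopenscale-ibm-aios-etcd-0 -n "$project" \
    -o jsonpath='{.spec.storageClassName}')
  # allowVolumeExpansion is a standard StorageClass field; the output is
  # empty when the field is not set.
  allow=$(oc get storageclass "$storage_class" -o jsonpath='{.allowVolumeExpansion}')
  [ "$allow" = "true" ]
}
```

For example, running etcd_storage_supports_resize "${instanceProjectName}" && echo "resizing supported" prints the message only when expansion is enabled for the storage class.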

Manage etcd persistent volumes that do not support resizing

Note: If your persistent volumes don't support resizing, it is recommended that you ask your administrator to configure the storage provisioner to enable persistent volume resizing.
To manage the size of Watson OpenScale etcd persistent volumes that don't support resizing, you must complete the following steps:
  1. Log in to Red Hat OpenShift Container Platform with the following command:
    oc login <OpenShift_URL>:<port>
  2. Pause the Watson OpenScale operator reconciliation of the Watson OpenScale custom resource:
    instanceProjectName='cpd-instance'
    
    oc patch WOService aiopenscale -n ${instanceProjectName} --type merge --patch '{"spec": {"ignoreForMaintenance": true}}'

    If you did not install Cloud Pak for Data in the cpd-instance project, set the instanceProjectName variable to the name of the project where it is installed.

  3. Scale down the etcd StatefulSet cluster:
    oc scale sts aiopenscale-ibm-aios-etcd -n ${instanceProjectName} --replicas=0
  4. Change the persistentVolumeReclaimPolicy specification to Retain in all of the persistent volume (PV) objects that are bound to the etcd persistent volume claim (PVC) objects with the following commands:
    etcd_0_pv=`oc get pvc data-aiopenscale-ibm-aios-etcd-0 -n ${instanceProjectName} -o jsonpath='{.spec.volumeName}'`
    oc patch pv ${etcd_0_pv} -n ${instanceProjectName} -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
    
    etcd_1_pv=`oc get pvc data-aiopenscale-ibm-aios-etcd-1 -n ${instanceProjectName} -o jsonpath='{.spec.volumeName}'`
    oc patch pv ${etcd_1_pv} -n ${instanceProjectName} -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
    
    etcd_2_pv=`oc get pvc data-aiopenscale-ibm-aios-etcd-2 -n ${instanceProjectName} -o jsonpath='{.spec.volumeName}'`
    oc patch pv ${etcd_2_pv} -n ${instanceProjectName} -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

    These commands prevent the underlying PV objects from being deleted when the etcd PVC objects are deleted.

  5. Customize the etcd persistent volume sizes:
    targetPVCSize=8Gi
    
    oc patch WOServiceConfig aiopenscale -n ${instanceProjectName} --type merge --patch '{"spec": {"aios_etcd": {"pvc": {"size": "'${targetPVCSize}'"}}}}'

    The default size value is 4Gi.

  6. Create temporary backup PVC objects for the etcd data with the following commands:
    etcdStorageClass=`oc get pvc data-aiopenscale-ibm-aios-etcd-0 -n ${instanceProjectName} -o jsonpath='{.spec.storageClassName}'`
    
    cat <<EOF |oc apply -f -
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: aiopenscale-etcd-backup-pvc-0
      namespace: ${instanceProjectName}
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: ${targetPVCSize}
      storageClassName: ${etcdStorageClass}
      volumeMode: Filesystem
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: aiopenscale-etcd-backup-pvc-1
      namespace: ${instanceProjectName}
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: ${targetPVCSize}
      storageClassName: ${etcdStorageClass}
      volumeMode: Filesystem
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: aiopenscale-etcd-backup-pvc-2
      namespace: ${instanceProjectName}
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: ${targetPVCSize}
      storageClassName: ${etcdStorageClass}
      volumeMode: Filesystem
    EOF
    

    The backup objects provide intermediate storage to complete resizing.

  7. Create a job to copy the original etcd data to the backup PVC objects with the following command:
    operatorProjectName='cpd-operator'
    
    operatorPod=`oc get pod -n ${operatorProjectName} | grep wos | awk '{ printf("%s", $1) }'`
    
    aiosKubectlImageDigest=`oc exec ${operatorPod} -n ${operatorProjectName} -- cat config-vars/images/images-x86_64.yaml | grep -A 1 aios_kubectl | grep digest | awk '{ printf("%s", $2) }' | tr -d '"'`
    
    cat <<EOF | oc apply -n ${instanceProjectName} -f -
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: backup-etcd-data-job
    spec:
      template:
        metadata:
          name: backup-etcd-data-job
          namespace: ${instanceProjectName}
        spec:
          serviceAccountName: zen-norbac-sa
          hostNetwork: false
          hostPID: false
          hostIPC: false
          securityContext:
            runAsNonRoot: true
            runAsGroup: 1000321421
          restartPolicy: Never
          containers:
            - name: backup-etcd-data-job
              image: cp.icr.io/cp/cpd/aios-kubectl@${aiosKubectlImageDigest}
              command: 
                - "/bin/sh"
                - "-ec"
                - |
                  echo "COPY FILES FROM ETCD DATA PVC TO BACKUP PVC"
                  cp -r /data-0/* /backup-0
                  cp -r /data-1/* /backup-1
                  cp -r /data-2/* /backup-2
                  echo
                  echo "POST COPY - PRINT LIST OF FILES IN ETCD BACKUP PVC"
                  echo "backup-0 ..."
                  ls -ltrR /backup-0
                  echo "backup-1 ..."
                  ls -ltrR /backup-1
                  echo "backup-2 ..."
                  ls -ltrR /backup-2
    
              volumeMounts:
              - name: backup-0
                mountPath: "/backup-0"
              - name: data-0
                mountPath: "/data-0"
              - name: backup-1
                mountPath: "/backup-1"
              - name: data-1
                mountPath: "/data-1"
              - name: backup-2
                mountPath: "/backup-2"
              - name: data-2
                mountPath: "/data-2"
          volumes:
          - name: backup-0
            persistentVolumeClaim:
                claimName: aiopenscale-etcd-backup-pvc-0
          - name: data-0
            persistentVolumeClaim:
                claimName: data-aiopenscale-ibm-aios-etcd-0
          - name: backup-1
            persistentVolumeClaim:
                claimName: aiopenscale-etcd-backup-pvc-1
          - name: data-1
            persistentVolumeClaim:
                claimName: data-aiopenscale-ibm-aios-etcd-1
          - name: backup-2
            persistentVolumeClaim:
                claimName: aiopenscale-etcd-backup-pvc-2
          - name: data-2
            persistentVolumeClaim:
                claimName: data-aiopenscale-ibm-aios-etcd-2
          
    EOF
    

    If you did not install Cloud Pak for Data in the cpd-instance project, set the instanceProjectName variable to the name of the project where it is installed.

  8. Verify that the etcd data is backed up by inspecting the backup-etcd-data-job job pod logs with the following command:
    backupEtcdDataJobPod=`oc get pod -n ${instanceProjectName} -l job-name=backup-etcd-data-job | awk 'NR>1 { printf("%s", $1) }'`
    oc logs -f pod/${backupEtcdDataJobPod} -n ${instanceProjectName}
  9. Delete the backup job with the following command:
    oc delete job -l job-name=backup-etcd-data-job -n ${instanceProjectName}
  10. Delete the original etcd data PVC objects with the following command:
    oc delete pvc data-aiopenscale-ibm-aios-etcd-0 -n ${instanceProjectName}
    oc patch pv ${etcd_0_pv} --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'
    
    oc delete pvc data-aiopenscale-ibm-aios-etcd-1 -n ${instanceProjectName}
    oc patch pv ${etcd_1_pv} --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'
    
    oc delete pvc data-aiopenscale-ibm-aios-etcd-2 -n ${instanceProjectName}
    oc patch pv ${etcd_2_pv} --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'
    

    These commands do not delete the underlying PV objects because their reclaim policy was set to Retain in step 4.

  11. Recreate the resized etcd data PVC objects with the following command:
    cat <<EOF |oc apply -f -
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: data-aiopenscale-ibm-aios-etcd-0
      namespace: ${instanceProjectName}
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: ${targetPVCSize}
      storageClassName: ${etcdStorageClass}
      volumeMode: Filesystem
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: data-aiopenscale-ibm-aios-etcd-1
      namespace: ${instanceProjectName}
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: ${targetPVCSize}
      storageClassName: ${etcdStorageClass}
      volumeMode: Filesystem
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: data-aiopenscale-ibm-aios-etcd-2
      namespace: ${instanceProjectName}
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: ${targetPVCSize}
      storageClassName: ${etcdStorageClass}
      volumeMode: Filesystem
    EOF
    
    
  12. Create a job to restore etcd data from backup PVC objects to resized etcd PVC objects with the following command:
    cat <<EOF | oc apply -n ${instanceProjectName} -f -
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: restore-etcd-data-job
    spec:
      template:
        metadata:
          name: restore-etcd-data-job
          namespace: ${instanceProjectName}
        spec:
          serviceAccountName: zen-norbac-sa
          hostNetwork: false
          hostPID: false
          hostIPC: false
          securityContext:
            runAsNonRoot: true
            runAsGroup: 1000321421
          restartPolicy: Never
          containers:
            - name: restore-etcd-data-job
              image: cp.icr.io/cp/cpd/aios-kubectl@${aiosKubectlImageDigest}
              command: 
                - "/bin/sh"
                - "-ec"
                - |
                  echo "COPY FILES FROM ETCD BACKUP PVC TO DATA PVC"
                  cp -r /backup-0/* /data-0
                  cp -r /backup-1/* /data-1
                  cp -r /backup-2/* /data-2
                  echo
                  echo "POST COPY - PRINT LIST OF FILES IN ETCD DATA PVC"
                  echo "data-0 ..."
                  ls -ltrR /data-0
                  echo "data-1 ..."
                  ls -ltrR /data-1
                  echo "data-2 ..."
                  ls -ltrR /data-2
    
              volumeMounts:
              - name: backup-0
                mountPath: "/backup-0"
              - name: data-0
                mountPath: "/data-0"
              - name: backup-1
                mountPath: "/backup-1"
              - name: data-1
                mountPath: "/data-1"
              - name: backup-2
                mountPath: "/backup-2"
              - name: data-2
                mountPath: "/data-2"
          volumes:
          - name: backup-0
            persistentVolumeClaim:
                claimName: aiopenscale-etcd-backup-pvc-0
          - name: data-0
            persistentVolumeClaim:
                claimName: data-aiopenscale-ibm-aios-etcd-0
          - name: backup-1
            persistentVolumeClaim:
                claimName: aiopenscale-etcd-backup-pvc-1
          - name: data-1
            persistentVolumeClaim:
                claimName: data-aiopenscale-ibm-aios-etcd-1
          - name: backup-2
            persistentVolumeClaim:
                claimName: aiopenscale-etcd-backup-pvc-2
          - name: data-2
            persistentVolumeClaim:
                claimName: data-aiopenscale-ibm-aios-etcd-2
    EOF
    
  13. Verify that the etcd data is restored by inspecting the restore-etcd-data-job job pod logs with the following command:
    restoreEtcdDataJobPod=`oc get pod -n ${instanceProjectName} -l job-name=restore-etcd-data-job | awk 'NR>1 { printf("%s", $1) }'`
    
    oc logs -f pod/${restoreEtcdDataJobPod} -n ${instanceProjectName}
  14. Delete the restore job with the following command:
    oc delete job -l job-name=restore-etcd-data-job -n ${instanceProjectName}
  15. Delete the etcd StatefulSet cluster with the following command:
    oc delete sts aiopenscale-ibm-aios-etcd -n ${instanceProjectName}
  16. Resume the Watson OpenScale operator reconciliation of the Watson OpenScale custom resource:
    oc patch WOService aiopenscale -n ${instanceProjectName} --type merge --patch '{"spec": {"ignoreForMaintenance": false}}'
  17. Check the status of the reconciliation with the following command:
    oc get WOService aiopenscale -n ${instanceProjectName} -o jsonpath='{.status.wosStatus} {"\n"}'

    The status of the custom resource changes to Completed when the reconciliation finishes successfully.

  18. After the reconciliation completes, verify that the etcd PVCs are resized with the following command:
    oc get pvc -n ${instanceProjectName} | grep data-aiopenscale-ibm-aios-etcd
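
Steps 8 and 13 rely on reading the job logs to confirm that each copy finished. Because step 10 deletes the original PVC objects, it can help to also confirm that the job itself reports completion before you continue. The following is a minimal sketch that uses oc wait for that purpose. The function name wait_for_etcd_job and the default timeout of 300s are assumptions for illustration, not values from the product documentation.

```shell
# Hypothetical helper: block until the named job reports the Complete
# condition, or fail when the timeout expires.
# Arguments: job name, project, optional timeout (assumed default: 300s).
wait_for_etcd_job() {
  job_name="$1"
  project="$2"
  timeout="${3:-300s}"
  # oc wait returns a non-zero exit code if the condition is not met in time.
  oc wait --for=condition=complete "job/${job_name}" -n "$project" --timeout="$timeout"
}
```

For example, running wait_for_etcd_job backup-etcd-data-job "${instanceProjectName}" after step 7 returns successfully only after the backup job completes, and the same call with restore-etcd-data-job can follow step 12.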

Increase Kafka persistent volumes

To increase the size of Kafka persistent volumes, you must complete the following steps:
  1. Move the Watson OpenScale custom resource to maintenance with the following command:
    oc patch WOService aiopenscale --type merge --patch '{"spec": {"ignoreForMaintenance": true}}'
  2. Scale down the Kafka StatefulSet cluster with the following command:
    oc scale sts aiopenscale-ibm-aios-kafka --replicas=0
  3. Delete all of the Kafka persistent volume claim (PVC) objects:
    oc delete pvc data-aiopenscale-ibm-aios-kafka-0 data-aiopenscale-ibm-aios-kafka-1 data-aiopenscale-ibm-aios-kafka-2
  4. Delete the Kafka StatefulSet cluster with the following command:
    oc delete sts aiopenscale-ibm-aios-kafka
  5. Specify a new size for the Kafka persistent volume storage with the following command:
    oc patch WOServiceConfig aiopenscale --type merge --patch '{"spec": {"aios_kafka": {"pvc": {"size": "4Gi"}}}}'

    The default size is 1Gi.

  6. Scale down all of the Watson OpenScale microservice deployments with the following command:
    oc scale deployment -l "component in (aios-bias,aios-bkpi,aios-common,aios-configuration,aios-dashboard,aios-datamart,aios-drift,aios-explainability,aios-fast,aios-feedback,aios-ml,aios-mrm,aios-notification,aios-payload,aios-scheduling)" --replicas=0
  7. Move the Watson OpenScale custom resource out of maintenance with the following command:
    oc patch WOService aiopenscale --type merge --patch '{"spec": {"ignoreForMaintenance": false}}'
  8. Check the status of the reconciliation with the following command:
    oc get WOService aiopenscale

    The status of the custom resource changes to Completed when the reconciliation finishes successfully.
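
The final step usually means rerunning the status command until the status reads Completed. The following is a minimal polling sketch that reuses the .status.wosStatus JSONPath expression from the etcd procedures. The function name wait_for_wos_completed, the 30-second interval, and the limit of 40 checks are assumptions for illustration, not product guidance.

```shell
# Hypothetical helper: poll the WOService status until it reads "Completed".
# Arguments: project, seconds between checks (assumed default: 30),
# maximum number of checks (assumed default: 40).
wait_for_wos_completed() {
  project="$1"
  interval="${2:-30}"
  tries="${3:-40}"
  while [ "$tries" -gt 0 ]; do
    status=$(oc get WOService aiopenscale -n "$project" -o jsonpath='{.status.wosStatus}')
    if [ "$status" = "Completed" ]; then
      echo "Reconciliation completed"
      return 0
    fi
    # Report the intermediate status, then wait before the next check.
    echo "Current status: ${status:-unknown}"
    tries=$((tries - 1))
    sleep "$interval"
  done
  echo "Timed out waiting for reconciliation" >&2
  return 1
}
```

For example, wait_for_wos_completed cpd-instance checks the custom resource in the cpd-instance project with the default interval and limit.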

Reset Kafka persistent volumes

To reset Kafka persistent volumes, you must complete the following steps:
  1. Move the Watson OpenScale custom resource to maintenance with the following command:
    oc patch WOService aiopenscale --type merge --patch '{"spec": {"ignoreForMaintenance": true}}'
  2. Scale down the Kafka StatefulSet cluster with the following command:
    oc scale sts aiopenscale-ibm-aios-kafka --replicas=0
  3. Delete all of the Kafka persistent volume claim (PVC) objects:
    oc delete pvc data-aiopenscale-ibm-aios-kafka-0 data-aiopenscale-ibm-aios-kafka-1 data-aiopenscale-ibm-aios-kafka-2
  4. Scale down all of the Watson OpenScale microservice deployments with the following command:
    oc scale deployment -l "component in (aios-bias,aios-bkpi,aios-common,aios-configuration,aios-dashboard,aios-datamart,aios-drift,aios-explainability,aios-fast,aios-feedback,aios-ml,aios-mrm,aios-notification,aios-payload,aios-scheduling)" --replicas=0
  5. Move the Watson OpenScale custom resource out of maintenance with the following command:
    oc patch WOService aiopenscale --type merge --patch '{"spec": {"ignoreForMaintenance": false}}'
  6. Check the status of the reconciliation with the following command:
    oc get WOService aiopenscale

    The status of the custom resource changes to Completed when the reconciliation finishes successfully.
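
The label selector in the scale-down step of both Kafka procedures is long and easy to mistype. As a sketch only, the selector can be assembled from a list of component names. The function name aios_component_selector is hypothetical, and the component names that you pass to it must match the list in the scale-down command.

```shell
# Hypothetical helper: join component names into a set-based label selector,
# for example "component in (aios-bias,aios-bkpi)".
aios_component_selector() {
  joined=""
  for name in "$@"; do
    # Add a comma before every name except the first.
    joined="${joined:+${joined},}${name}"
  done
  printf 'component in (%s)\n' "$joined"
}
```

The result can be passed to the -l option, for example oc scale deployment -l "$(aios_component_selector aios-bias aios-bkpi)" --replicas=0, with the full list of components in place of the two shown here.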