Backup with IBM Fusion fails at pre-backup stage

Creating a backup with IBM Fusion fails at the pre-backup stage.

Symptoms

The backup sequence stops at the Hook: br-service-hooks/pre-backup stage. The IBM Fusion transaction manager log file contains a message similar to the following example:
time=<timestamp> level=info msg=waiting for job to complete: elasticsearch-master-ibm-elasticsearch-cpdbr-backup func=cpdbr-oadp/pkg/kube.KubeAPI.WaitForJobCompletion.func1 file=/go/src/cpdbr-oadp/pkg/kube/job.go:288

Causes

The Elasticsearch backup job does not finish: the elasticsearch-master-ibm-elasticsearch-cpdbr-backup pod stays in the Running state for an excessively long time, so the pre-backup hook keeps waiting for the job to complete.
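
To confirm this cause, you can check how long the backup pod has been running. The following is a minimal sketch, not part of the documented procedure. The function name backup_pod_state is hypothetical, and the sketch assumes the default column layout of oc get pods (NAME, READY, STATUS, RESTARTS, AGE) and that the pod name starts with the job name shown in the log message.

```shell
# Hypothetical helper: print the name, status, and age of the
# Elasticsearch backup job pod.
# Input on stdin: the output of `oc get pods`.
# Assumption: STATUS is the third column and AGE is the last column.
backup_pod_state() {
  awk '$1 ~ /^elasticsearch-master-ibm-elasticsearch-cpdbr-backup/ { print $1, $3, $NF }'
}

# Usage (assumes you are logged in and in the instance project):
#   oc get pods | backup_pod_state
```

A pod that reports Running with an age of several hours matches the cause that is described here.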

Resolving the problem

Complete the following steps:

  1. Log in to Red Hat® OpenShift® Container Platform as a cluster administrator.
    ${OC_LOGIN}
    Remember: OC_LOGIN is an alias for the oc login command.
  2. Change to the project where the IBM® Software Hub instance is installed:
    oc project ${PROJECT_CPD_INST_OPERANDS}
  3. Quiesce the cluster:
    oc patch elasticsearchcluster elasticsearch-master --type merge --patch '{"spec": {"quiesce": true}}'
  4. Before you proceed to the next step, check that all Elasticsearch pods are terminated by running the following command:
    oc get pods | grep elasticsea
  5. Unquiesce the cluster:
    oc patch elasticsearchcluster elasticsearch-master --type merge --patch '{"spec": {"quiesce": false}}'
  6. Before you proceed to the next step, check that three Elasticsearch pods are running:
    oc get pods | grep elasticsea
  7. Retrieve all current Elasticsearch snapshots:
    oc exec elasticsea-0ac3-ib-6fb9-es-server-esnodes-2 -c elasticsearch -- curl --request GET --url 'http://localhost:19200/_cat/snapshots/cloudpak?h=id' --header 'content-type: application/json'
  8. Remove all snapshots in the list that the previous step returns, except for the first snapshot.

    In the following command, replace <ID1>,<ID2>,... with the IDs of the snapshots to remove.

    oc exec elasticsea-0ac3-ib-6fb9-es-server-esnodes-0 -c elasticsearch -- curl --request DELETE --url 'http://localhost:19200/_snapshot/cloudpak/<ID1>,<ID2>,...' --header 'content-type: application/json'
  9. Take a new snapshot:
    oc exec elasticsea-0ac3-ib-6fb9-es-server-esnodes-0 -c elasticsearch -- curl --request PUT --url 'http://localhost:19200/_snapshot/cloudpak/cloudpak_snapshot_recovery?wait_for_completion=false' --data '{"indices": "-backup_in_progress,*","ignore_unavailable": true,"include_global_state": false,"metadata": {"taken_by": "cloudpak","taken_because": "backup recovery"}}' --header 'content-type: application/json'
  10. Verify that the snapshot completed:
    oc exec elasticsea-0ac3-ib-6fb9-es-server-esnodes-2 -c elasticsearch -- curl --request GET --url 'http://localhost:19200/_cat/snapshots/cloudpak' --header 'content-type: application/json'

    The snapshot is complete when the status of cloudpak_snapshot_recovery is SUCCESS.

    Note: Depending on the amount of data, it might take up to one hour for the snapshot to complete.
  11. Retry the backup.
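
The checks in steps 4, 6, 8, and 10 can be scripted. The following is a minimal sketch under stated assumptions, not part of the documented procedure. All three function names are hypothetical. The functions only parse text on standard input, so they do not change the cluster. The pod filter assumes that the Elasticsearch server pod names contain es-server-esnodes, as in the commands above, and the status helper assumes that you request the id and status columns explicitly with h=id,status.

```shell
# Hypothetical helpers for the procedure above. They read command
# output on stdin and print a result; none of them calls oc itself.

# Steps 4 and 6: count Elasticsearch server pods in the Running state.
# Input: the output of `oc get pods` (STATUS is the third column).
# Assumption: server pod names contain "es-server-esnodes".
count_running_es_pods() {
  awk '$1 ~ /es-server-esnodes/ && $3 == "Running" { n++ } END { print n + 0 }'
}

# Step 8: build the comma-separated list of snapshot IDs to remove,
# that is, every ID except the first one.
# Input: one snapshot ID per line, as returned in step 7.
snapshots_to_delete() {
  awk 'NF { ids[n++] = $1 }
       END { for (i = 1; i < n; i++) printf "%s%s", ids[i], (i < n - 1 ? "," : "\n") }'
}

# Step 10: print the status of a single snapshot.
# Input: "<id> <status>" lines, for example from a request to
#   http://localhost:19200/_cat/snapshots/cloudpak?h=id,status
snapshot_status() {
  awk -v id="$1" '$1 == id { print $2 }'
}

# Usage (assumes you are logged in and in the instance project):
#   oc get pods | count_running_es_pods
#   <command from step 7> | snapshots_to_delete
#   <command from step 10 with ?h=id,status> | snapshot_status cloudpak_snapshot_recovery
```

Review the list that snapshots_to_delete prints before you pass it to the DELETE request in step 8. If only one snapshot exists, the function prints nothing and there is nothing to remove.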