
Roll back or sync Common Service DB and Zen data to a specified backup

Before you begin

Set the Common Service DB data to the desired state

  1. Determine the velero backup to roll back to.

     velero backup get
    
  2. Verify whether the backup was successful and check the details to see if all resources are saved.

      velero backup describe <__BACKUP_NAME__> --details
    
  3. Edit the restore-mongo-data.yaml file and replace __BACKUP_NAME__ with the name of the backup that you identified in the previous steps.

     vi restore-mongo-data.yaml
    

    Note: If the cluster has multiple instances of common services and/or MongoDB and you do not want to roll back all of them, specify the namespaces to roll back by replacing the '*' under includedNamespaces with the target namespaces. Each namespace must be on its own line, as in the following example:

         ```cmd
         includedNamespaces:
         - '<cs namespace1>'
         - '<cs namespace2>'
         ```
    
  4. Clean up existing MongoDB restore resources.

    • Remove the mongodb-backup deployment if present:
        oc delete deploy mongodb-backup -n <target namespace>
      
    • Remove the cs-mongodump PVC if present:
        oc delete pvc cs-mongodump -n <target namespace>
      
    • Remove the velero restore object if present:
        velero restore delete restore-mongo-data
      
  5. Restore the MongoDB data.

     oc apply -f restore-mongo-data.yaml
    
  6. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

     velero restore get
    
     velero restore describe <__RESTORE_NAME__> --details
    
  7. Verify that the restore completed successfully.

    Check the velero restore logs to confirm that the restore ran. Search for the following message: "Failed: error connecting to db server: no reachable servers"

    If this message is present, follow these instructions:

    1. Get the mongo-restore.sh file.

       wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/mongoDB/mongo-restore.sh
      
    2. Make the restore script executable:

       chmod +x mongo-restore.sh
      
    3. Delete the existing mongodb-backup deployment:

       oc delete deploy mongodb-backup -n <cs namespace>
      
    4. Run the script:

       ./mongo-restore.sh <cs namespace>
      

      Troubleshooting: If the mongodb-restore pod is stuck in ContainerCreating:

      • Delete the mongodb-backup deployment.
      • Make sure that the mongodb-backup pod is fully deleted (not Terminating).
      • Delete the mongodb-restore job and make sure that its pod is fully deleted (not Terminating).
      • Rerun the mongo-restore.sh script.

Note: Run the secondary steps that are listed here only if the restore logs indicate that the restore did not run. Messages such as "duplicate key error collection" are expected and do not indicate a need to run the secondary steps.
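
Steps 6 and 7 above can be scripted. The following is a minimal sketch, not part of the IBM scripts: the function names wait_for_restore and mongo_restore_did_not_run are hypothetical, and the namespace where Velero is installed (velero here) is an assumption that you must adjust for your cluster. It reads the phase of the Velero Restore resource, then searches the restore log text for the connection failure message quoted in step 7.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and default values are illustrative assumptions.

# wait_for_restore RESTORE_NAME [VELERO_NAMESPACE] [INTERVAL_SECONDS]
# Polls the Restore resource until it reaches a terminal phase.
# Returns 0 on Completed, 1 on Failed, PartiallyFailed, or FailedValidation.
# Any other phase (including an empty one) is treated as still in progress.
wait_for_restore() {
  local name="$1" ns="${2:-velero}" interval="${3:-10}" phase
  while true; do
    phase=$(oc get restores.velero.io "$name" -n "$ns" -o jsonpath='{.status.phase}')
    case "$phase" in
      Completed) return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "restore $name ended in phase $phase" >&2
        return 1 ;;
      *) sleep "$interval" ;;
    esac
  done
}

# mongo_restore_did_not_run: reads restore log text on stdin.
# Returns 0 only when the "no reachable servers" failure is present.
# Duplicate key messages are expected and are deliberately ignored.
mongo_restore_did_not_run() {
  grep -qF 'Failed: error connecting to db server: no reachable servers'
}

# Example (requires cluster access):
#   wait_for_restore restore-mongo-data velero &&
#     velero restore logs restore-mongo-data | mongo_restore_did_not_run &&
#     echo "Run the secondary steps with mongo-restore.sh"
```

The log check is kept as a stdin filter so that it can be run against saved log output as well as a live cluster.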

Set the Zen 5 data to the desired state

  1. Determine the velero backup to roll back to.

     velero backup get
    
  2. Verify whether the backup was successful and check the details to see if all resources are saved.

      velero backup describe <__BACKUP_NAME__> --details
    
  3. Edit the restore-zen5-data.yaml file and replace __BACKUP_NAME__ with the name of the backup that you want to roll back to.

     vi restore-zen5-data.yaml
    

    Note: If the cluster has multiple instances of common services and/or MongoDB and you do not want to roll back all of them, specify the namespaces to roll back by replacing the '*' under includedNamespaces with the target namespaces. Each namespace must be on its own line, as in the following example:

     ```cmd
     includedNamespaces:
     - '<cs namespace1>'
     - '<cs namespace2>'
     ```
    
  4. Grant the Zen 5 backup the necessary permissions.

    • For each namespace with a zenservice to back up, create a service account. Replace the <zenservice namespace> value before applying.

        oc apply -f zen5-sa.yaml
      
    • Once per zenservice namespace, apply the Role for the zen backup. Replace the <zenservice namespace> value before applying.

        oc apply -f zen5-role.yaml
      
    • Create the RoleBinding to connect the ServiceAccount to the Role.

      1. Edit the zen5-rolebinding.yaml file to add the ServiceAccount created earlier and replace the <zenservice namespace> value.

         vi zen5-rolebinding.yaml
        
      2. Apply the zen5-rolebinding.yaml file

         oc apply -f zen5-rolebinding.yaml
        
  5. Clean up existing Zen 5 restore resources.

    • Remove the zen5-backup deployment if present:
        oc delete deploy zen5-backup -n <target namespace>
      
    • Remove the zen5-backup-pvc PVC if present:
        oc delete pvc zen5-backup-pvc -n <target namespace>
      
    • Remove the velero restore object if present:
        velero restore delete restore-zen5-data
      
  6. Restore the Zen data.

     oc apply -f restore-zen5-data.yaml
    
  7. Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.

     velero restore get
    
     velero restore describe <__RESTORE_NAME__> --details
    
  8. Check the logs of the velero restore to verify that the data was restored.

     velero restore logs restore-zen5-data
    
    • Search for restore_zen5 to find relevant logs. If it is not present, the restore did not run. If the logs or the data indicate that the restore was not successful, the following steps can be taken as a workaround:

      1. Get the zen5-restore-job file:

         wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/zen/zen5-restore-job.yaml
        
      2. Delete the existing zen5-backup deployment

         oc delete deploy zen5-backup -n <target namespace>
        
      3. Wait for the zen5-backup pods to fully delete (fully gone, not Terminating)

      4. Edit the zen5-restore-job.yaml file and update every field that is enclosed in <>. These fields are the target restore namespace (<zenservice namespace>) and the target zenservice (<zenservice name>).

      5. Apply the zen5-restore-job.yaml file

         oc apply -f zen5-restore-job.yaml
        
      6. Wait for the job to complete, then check the logs of the zen5-restore-job pod to verify that the restore is completed.

      7. Repeat as needed for each namespace with a zenservice instance installed.

  9. Wait for the zenservice instances to become ready. An instance is ready when its Progress field reaches 100%. The following command continuously outputs the progress percentage of every zenservice on the cluster.

     oc get zenservice -A -w -o yaml | grep Progress:
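
The checks in steps 8 and 9 can be expressed as small filters over command output. This is a sketch under assumptions: the function names zen_restore_ran and all_zenservices_ready are hypothetical, and the second function assumes that each zenservice reports a line of the form "Progress: 100%", as the grep in step 9 suggests.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names are illustrative assumptions.

# zen_restore_ran: reads restore log text on stdin.
# Returns 0 when at least one restore_zen5 entry is present.
zen_restore_ran() {
  grep -q 'restore_zen5'
}

# all_zenservices_ready: reads zenservice YAML on stdin.
# Returns 0 only when at least one Progress line exists
# and every Progress line reports 100%.
all_zenservices_ready() {
  local progress
  progress=$(grep 'Progress:') || return 1
  ! printf '%s\n' "$progress" | grep -qv '100%'
}

# Example (requires cluster access):
#   velero restore logs restore-zen5-data | zen_restore_ran ||
#     echo "restore_zen5 not found; consider the workaround in step 8"
#   oc get zenservice -A -o yaml | all_zenservices_ready &&
#     echo "All zenservice instances are ready"
```

The readiness check reads a single snapshot without the -w flag, so it can be called repeatedly from a loop instead of watching the output by eye.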
    

Troubleshooting: