Restoring a backup of Guardium Insights

Restore your backup of Guardium Insights to recover any lost data.

Before you begin

  • Verify that the target system is in normal running status because the restore process can't recover a broken cluster.
  • Verify that you have permission to access the backup files before you attempt to restore. To grant yourself full access to the backup directory, run the following command:
    chmod -R 777 <backup_directory>
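    If you first want to confirm access without changing any permissions, a quick read check similar to the following can help (the directory is a placeholder):
    # List the backup directory recursively; permission errors indicate missing access.
    ls -lR <backup_directory>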
Important: You cannot restore a backup from version 3.2.0 to version 3.2.x. Instead, restore to version 3.2.0. Then, update from version 3.2.0 to version 3.2.x.

Procedure

  1. If the backup was taken when Db2 was deployed as a db2cluster and you have since switched to a db2uinstance, complete steps 2 and 3. Otherwise, skip to step 4.
    You can verify whether your Db2 type is db2uinstance by running the following command:
    oc get db2uinstance
    If db2uinstance is installed, the output is similar to the following example:
    
    NAME          STATE   MAINTENANCESTATE   AGE
    staging-db2   Ready   None               42h 
    If db2uinstance is not installed, the following message displays:
    No resources found in <namespace> namespace
    In this message, <namespace> is your namespace (for example, staging).
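    To make this decision scriptable, a minimal sketch that checks for a db2uinstance resource in the current namespace follows (the logic mirrors the description above):
    # If a db2uinstance resource exists, steps 2 and 3 apply when the backup
    # was taken under db2cluster; otherwise, skip to step 4.
    if oc get db2uinstance --no-headers 2>/dev/null | grep -q .; then
      echo "db2uinstance detected: complete steps 2 and 3 if the backup was taken under db2cluster"
    else
      echo "No db2uinstance found: skip to step 4"
    fi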
  2. Get the Db2 pods by running the following command:
    oc get pod | grep db2-db2
    The output is similar to:
    
    c-staging-db2-db2u-0                                              1/1     Running     0          42h  
    c-staging-db2-db2u-1                                              1/1     Running     0          42h

    You might have a different number of pods based on the size of your environment.

  3. Run the following commands in each of the Db2 pods.
    # Open a remote shell in the Db2 pod (repeat for each pod).
    oc rsh c-staging-db2-db2u-0

    # Create the active log directory and give db2inst1 ownership of it.
    sudo mkdir -p /mnt/logs/active/
    sudo chmod -R 777 /mnt/logs/active/
    sudo chown -R db2inst1:db2iadm1 /mnt/logs/active/
    # Link the database directory and the archive log location to the expected paths.
    ln -s /mnt/blumeta0/db2/databases/db2inst1 /mnt/logs/active/BLUDB
    sudo ln -s /mnt/logs/archive/ /mnt/bludata0/db2/archive_log
    # Switch to the Db2 instance user and query the log configuration.
    su - db2inst1
    db2 connect to bludb
    db2 get db config | grep LOGARCHMETH1
    The output is a path similar to:
    First log archive method         (LOGARCHMETH1) = DISK:/mnt/logs/archive/
    Copy this path for future use, and then run the following command:
    db2 get db config | grep "Path to log files"
    The output is a path similar to:
    Path to log files                         = /mnt/bludata0/db2/databases/db2inst1/NODE0000/SQL00001/NODE0000/LOGSTREAM0000/
    Copy this path for future use.
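    If you prefer to capture both values in shell variables instead of copying them manually, a minimal sketch follows (run as db2inst1 inside the pod; the variable names are illustrative):
    # Extract the archive method and the active log path from the database configuration.
    ARCHIVE_METHOD=$(db2 get db cfg for bludb | awk -F'= ' '/LOGARCHMETH1/{print $2}')
    LOG_PATH=$(db2 get db cfg for bludb | awk -F'= ' '/Path to log files/{print $2}')
    echo "LOGARCHMETH1: $ARCHIVE_METHOD"
    echo "Path to log files: $LOG_PATH"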
  4. Verify that you are logged in to the IBM Cloud® Private command-line interface (CLI). Logging in to this CLI also authenticates your account to use the OpenShift® CLI. To log in, run the following command:
    cloudctl login -a <ICP_hostname> -u <openshift_username> -p <openshift_password> -n staging --skip-ssl-validation
    • <ICP_hostname> is your Cloud Private server, for example https://cp-console.apps.myserver.com
    • <openshift_username> is your OpenShift username.
    • <openshift_password> is your OpenShift password.
  5. Prepare a custom resource file that is named gi-restore.yaml by following the examples in Options for Guardium Insights restore custom resource files.
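    A minimal skeleton, for orientation only, follows. The API group and the resource name match the other examples in this topic, but the API version and the spec contents are assumptions: take the actual spec fields from that topic.
    apiVersion: gi.ds.isc.ibm.com/v1   # the v1 version is an assumption; confirm it in the referenced topic
    kind: Restore
    metadata:
      name: insights
    spec: {}   # replace with the restore options that are documented for your backup type and location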
  6. Create the restore resource:
    oc apply -f gi-restore.yaml

    The expected results are similar to the following example:

    restore.gi.ds.isc.ibm.com/insights created
  7. Verify that the custom resource was created:
    oc get restore

    The expected results are similar to the following example:

    NAME       AGE
    insights   10s
  8. Confirm that the job was created.
    1. Run the following command:
      oc get job|grep restore

      The expected results are similar to the following example:

      insights-restore                                             0/1           35s        35s
    2. If the job is not created after one minute, the most likely cause is that the operator cache is still holding the historical backup resource. To correct this issue, restart the operator:
      oc delete pod $(oc get pod |awk '/guardiuminsights-controller-manager/{print $1;}')

      The expected results are similar to the following example:

      pod "guardiuminsights-controller-manager-756b55dff9-zgz5g" deleted
    3. If needed, delete the restore resource:
      oc delete restore insights

      The expected results are similar to the following example:

      restore.gi.ds.isc.ibm.com "insights" deleted
    4. Then, re-create the restore resource:
      oc apply -f gi-restore.yaml

      The expected results are similar to the following example:

      restore.gi.ds.isc.ibm.com/insights created
    5. Now, restart the operator:
      oc delete pod $(oc get pod |awk '/guardiuminsights-controller-manager/{print $1;}')

      The expected results are similar to the following example:

      pod "guardiuminsights-controller-manager-756b55dff9-zgz5g" deleted
    6. Check again to see whether the job exists. If it does not, repeat the preceding steps.
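      To avoid checking manually, you can poll for the job for a short period; a minimal sketch (the job name follows the examples above):
      # Wait up to 60 seconds for the restore job to appear.
      for i in $(seq 1 12); do
        oc get job insights-restore >/dev/null 2>&1 && { echo "Restore job found"; break; }
        sleep 5
      done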
  9. Wait for the restore pod to appear:
    oc get pod |grep restore

    The expected results are similar to the following example:

    insights-restore-n7rgm            0/1       Pending     0          60s

    The job and its pod start in a few seconds.

  10. Confirm that the status of the pod is Running.
    1. Run the following command:
      oc get pod |grep restore
    2. If the status shows Pending, as in the following example, the backup persistent volume (PV) might still reference its previous persistent volume claim (PVC), which prevents the new claim from binding.
      insights-restore-n7rgm            0/1       Pending     0          60s
    3. To determine the status of the PV, run the following command:
      oc get pv|grep backup

      If the PV still references the old PVC, its status is Released and the results are similar to the following example:

      pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8   50Gi       RWO            Retain           Released   staging/backup            rook-ceph-block             2d21h
    4. To manually release the PV, get its name from the results of the previous step (in this example, it is pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8), and then run the following command. A combined sketch for substeps 3 and 4 is also shown at the end of this step.
      oc patch pv pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8 -p '{"spec":{"claimRef": null}}'

      The expected results are similar to the following example:

      persistentvolume/pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8 patched
    5. To check the status of the pod, run the following command:
      oc get pod |grep restore

      The expected results show the Running status:

      insights-restore-n7rgm         1/1       Running     0          6m29s
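    Substeps 3 and 4 can also be combined into a single command sequence; a minimal sketch (the grep pattern follows the example above):
    # Find the released backup PV and clear its claim reference so that it can be re-bound.
    PV=$(oc get pv | awk '/backup/ && /Released/{print $1}')
    [ -n "$PV" ] && oc patch pv "$PV" -p '{"spec":{"claimRef": null}}'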
  11. Watch the pod logs:
    oc logs --follow insights-restore-n7rgm
    . . . . .
  12. Confirm that all services are accessible and that data is available.
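    A quick way to list any pods that are not in the Running phase (the namespace follows the examples in this topic) is a check similar to the following:
    # Completed job pods (phase Succeeded) are also listed and can be ignored.
    oc get pods -n staging --field-selector=status.phase!=Running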
  13. If you completed steps 2 and 3, run the following commands on each Db2 pod:
    su - db2inst1
    db2 connect to bludb
    # For NEWLOGPATH, use the "Path to log files" value that you saved in step 3, without the trailing LOGSTREAM0000/ segment. For example:
    db2 update db cfg for bludb using NEWLOGPATH /mnt/bludata0/db2/databases/db2inst1/NODE0000/SQL00001/NODE0000/
    db2 update db cfg for bludb using LOGARCHMETH1 DISK:/mnt/logs/archive/
    db2stop force
    db2 deactivate db bludb
    db2 activate db bludb
    db2start
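    To derive the NEWLOGPATH value programmatically from the path that you saved in step 3, a minimal sketch follows (the saved path shown is the example from step 3):
    # Strip the trailing LOGSTREAM0000/ segment from the saved "Path to log files" value.
    SAVED_LOG_PATH=/mnt/bludata0/db2/databases/db2inst1/NODE0000/SQL00001/NODE0000/LOGSTREAM0000/
    NEWLOGPATH=${SAVED_LOG_PATH%LOGSTREAM*}
    echo "$NEWLOGPATH"   # /mnt/bludata0/db2/databases/db2inst1/NODE0000/SQL00001/NODE0000/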

What to do next

Use one of these methods to check the log files:

  • To check one pod, issue this command: oc logs --follow <pod>
  • To check the backup and restore logs, see <gi-backup-xxxx>/backup-<timestamp>.log and <gi-backup-xxxx>/restore-<timestamp>.log. These logs are stored in the PV, under each full backup directory.