Backing up Guardium Data Security Center

Use the following procedure to back up Guardium Data Security Center.

Before you begin

If you are backing up to a remote destination, review the requirements for External storage allocation for backups.

Procedure

  1. Verify that you are logged in to the IBM Cloud® Private command-line interface (CLI). Logging in to this CLI also authenticates your account to use the OpenShift® CLI. To log in, run the following command:
    oc login -u <openshift_username> -p <openshift_password> --server=https://<ICP_hostname>:6443
    • <ICP_hostname> is the hostname of your Cloud Private server, for example cp-console.apps.myserver.com.
    • <openshift_username> is your OpenShift username.
    • <openshift_password> is your OpenShift password.
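    For example, with hypothetical credentials and the example hostname from the previous list (substitute your own values):

    oc login -u ocadmin -p mypassword --server=https://cp-console.apps.myserver.com:6443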
  2. Prepare a custom resource file that is named gdsc-backup.yaml by following the examples in Creating a custom resource file for backups.
  3. Create the backup resource:
    oc apply -f gdsc-backup.yaml
  4. Confirm that the resource is created:
    oc get backup

    The expected results are similar to the following example:

    NAME                                 AGE
    guardiumdatasecuritycenter-backup    20m
  5. Confirm that the cronjob is created:
    1. Run the following command:
      oc get cronjob|grep backup

      The expected results are similar to the following example:

      guardiumdatasecuritycenter-backup          * */1 * * *   False     0         4m39s           21m
    2. If the cronjob is not created after 1 minute, restart the operator:
      oc delete pod $(oc get pod |awk '/guardiumdatasecuritycenter-controller-manager/{print $1;}')

      The expected results are similar to the following example:

      pod "guardiumdatasecuritycenter-controller-manager-756b55dff9-zgz5g" deleted
    3. If needed, remove the backup resource:
      oc delete backup guardiumdatasecuritycenter
    4. Re-create the backup resource:
      oc apply -f gdsc-backup.yaml
    5. Restart the operator:
      oc delete pod $(oc get pod |awk '/guardiumdatasecuritycenter-controller-manager/{print $1;}')

      The expected results are similar to the following example:

      pod "guardiumdatasecuritycenter-controller-manager-756b55dff9-zgz5g" deleted
    6. Check again to see whether the cronjob is created. If it has not been created, repeat steps 5.a to 5.e.
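      Optionally, instead of rechecking manually, you can watch for the cronjob to appear (a quick sketch that uses the watch flag of oc get; press Ctrl+C to stop):

      oc get cronjob --watch | grep backup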
  6. At the next scheduled time, a backup job is created. Check for the job and the backup pod until they appear:

    To check the job, issue this command:

    oc get job |grep backup

    The expected results are similar to the following example:

    guardiumdatasecuritycenter-backup-1636136400                  0/1           39s        39s

    To check the pod, issue this command:

    oc get pod |grep backup

    The expected results are similar to the following example:

    guardiumdatasecuritycenter-backup-1636136400-8kj6d            0/1       Pending     0          49s

    The job and its pod are created, as defined in the gdsc-backup.yaml file.

  7. Confirm that the status of the pod is Running.
    1. Run the following command:
      oc get pod |grep backup
    2. If the status shows Pending, similar to this example, the backup PV might still be bound to a previous claim:
      guardiumdatasecuritycenter-backup-1636136400-8kj6d            0/1       Pending     0          49s
    3. To determine the status of the PV, run the following command:
      oc get pv|grep backup

      If the PV is still bound to a previous claim, the expected results show a Released status, similar to the following example:

      pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8   50Gi       RWO            Retain           Released   staging/backup            rook-ceph-block             2d20h
    4. To manually release the PV, get its name from the results (in this example, it is pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8), and then run the following command:
      oc patch pv pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8 -p '{"spec":{"claimRef": null}}'

      The expected results are similar to the following example:

      persistentvolume/pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8 patched
    5. To verify that the status of the pod is Running, run the following command:
      oc get pod |grep backup

      The expected results show the Running status:

      guardiumdatasecuritycenter-backup-1636136400-8kj6d   1/1       Running           0          6s
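    If you prefer not to copy the PV name manually, the release step can be scripted in the same style as the other commands in this topic. The following sketch assumes that exactly one PV claim contains the string backup:

      oc patch pv $(oc get pv | awk '/backup/{print $1;}') -p '{"spec":{"claimRef": null}}'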
  8. Watch the pod logs:
    oc logs --follow guardiumdatasecuritycenter-backup-1636136400-8kj6d
    . . . . .
    
    . . . . .
    
    . . . . .

    Each run creates a new pod. Wait for the number of runs that are set in your gdsc-backup.yaml to complete. The first run produces a full backup; subsequent runs produce incremental (delta) Db2 backups, with full backups at the frequency that you configure.
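    If you want to follow progress across several scheduled runs, a simple polling loop (an optional sketch) lists the backup jobs and pods as they are created and complete:

    # Poll every 60 seconds; press Ctrl+C to stop.
    while true; do
      date
      oc get job | grep backup
      oc get pod | grep backup
      sleep 60
    done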

Finding your backup files

Verify that your files are backed up by checking your persistent volume claim.

Backup files are stored in the persistent volume claim (PVC) that is mounted on the pod under /opt/data/backup/<folder_name>, where <folder_name> is a cluster data backup with this format: gdsc-backup-<guardium_data_security_center_host>-<guardium_data_security_center_namespace>-<guardium_data_security_center_instance>-<timestamp>. For example:
gdsc-backup-sys-qa-rtp02-staging-staging-2022-01-20-2325

File naming conventions are outlined in this table:

Table 1. File naming conventions
Directories and files Description
Info.yaml Process information, including the date, version, and number of modules that are backed up.
data Application data
data/DB2 Db2 data
data/MongoDB MongoDB memory dump
meta Metadata for the restore process
meta/backupInternalCRs.yaml Internal CR
meta/configurations-backup GUC-manager file system
meta/guc-config-backup GUC-manager file system
meta/DB2_Label_<gdsc-version>_<instance-name>.info Db2 label information
meta/DB2_Tables_<gdsc-version>_<instance-name>.info Db2 tables information
meta/Tenant_IDs_<gdsc-version>_<instance-name>.info Db2 tenant information
meta/LDAP_Config_<gdsc-version>_<instance-name>.info LDAP information
meta/Mongo_Tables_<gdsc-version>_<instance-name>.info MongoDB tenant information
meta/<cluster-name> Cluster data directory
meta/<cluster-name>/DB2/keystore_<gdsc-version>_ORG Directory that contains the keystores
meta/<cluster-name>/DB2/keystore_<gdsc-version>_ORG/keystore/keystore.p12 Original Db2 keystore for the cluster
meta/<cluster-name>/DB2/keystore_<gdsc-version>_ORG/keystore/keystore.sth Original Db2 keystore stash file for the cluster
meta/<cluster-name>/DB2/tmpkeystore Directory that holds the temporary keystore
meta/<cluster-name>/DB2/tmpkeystore/temp_keystore.p12 Temporary keystore that holds the master key

The following example shows backup files in a PVC:

gdsc-backup-sys-gdsc-rtp02-staging-staging-2022-09-29-1955
├── [  12K] backup-2022-09-29-1955.log
├── [ 6.0K] backup-2022-09-29-2003.log
├── [ 6.0K] backup-2022-09-29-2006.log
├── [ 1.1K] backup_info.yaml
├── [ 4.0K] data
│   ├── [ 4.0K] DB2
│   │   ├── [  13M] DB2.delta.20220929200405.tar.gz
│   │   ├── [  13M] DB2.delta.20220929200714.tar.gz
│   │   └── [ 969M] DB2.full.20220929200009.tar.gz
│   ├── [ 4.0K] Mongodb
│   │   ├── [ 9.5K] Mongodb.20220929195747.gz
│   │   ├── [ 9.5K] Mongodb.20220929200501.gz
│   │   └── [ 9.5K] Mongodb.20220929200812.gz
│   └── [ 4.0K] PG
│       ├── [ 3.6M] riskmanager_backup_20220929_195843.tar
│       ├── [ 3.6M] riskmanager_backup_20220929_200601.tar
│       └── [ 3.6M] riskmanager_backup_20220929_200910.tar
└── [ 4.0K] meta
    ├── [  760] backupInternalCRs.yaml
    ├── [ 4.0K] datamart-backup
    │   ├── [ 4.0K] TNT_SH83ZQZVNYFKDHJBPRW76I
    │   └── [    0] watchedFiles
    ├── [   56] DB2_Label_v3.2.0_staging.info
    ├── [ 140K] DB2_Tables_v3.2.0_staging.info
    ├── [ 4.0K] guc-config-backup
    ├── [ 2.7K] insightsMigrationSecrets.csv
    ├── [  462] LDAP_Config_v3.2.0_staging.info
    ├── [ 2.6K] Mongo_Tables_v3.2.0_staging.info
    ├── [ 4.0K] staging
    │   └── [ 4.0K] DB2
    │       ├── [ 4.0K] keystore_3.0_ORG
    │       │   ├── [ 3.4K] keystore.p12
    │       │   └── [  193] keystore.sth
    │       └── [ 4.0K] tmpkeystore
    │           └── [ 3.4K] temp_keystore.p12
    └── [  103] Tenant_IDs_v3.2.0_staging.info
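If no backup pod is currently running, one way to browse the PVC is to start a temporary pod that mounts it. The following browse-backup.yaml file is a hypothetical example, not part of the product; adjust the image and the claimName (backup-pvc-support, the PVC name that is used later in this topic) to match your environment:

# browse-backup.yaml: temporary pod that mounts the backup PVC for inspection
apiVersion: v1
kind: Pod
metadata:
  name: browse-backup
spec:
  restartPolicy: Never
  containers:
  - name: browse
    image: registry.access.redhat.com/ubi8/ubi-minimal # any small image with a shell
    command: ["sleep", "3600"]
    volumeMounts:
    - name: backup
      mountPath: /opt/data/backup
  volumes:
  - name: backup
    persistentVolumeClaim:
      claimName: backup-pvc-support

Apply the file, list the backup files, and then delete the pod when you are done:

oc apply -f browse-backup.yaml
oc rsh browse-backup ls -lR /opt/data/backup
oc delete pod browse-backup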

Stopping the backup process

The backup process is a cronjob: unless you explicitly stop it by deleting the backup resource, it continues to run on its schedule. Before you stop the backup process, verify that the backup data on the PV and PVC can remain after the cronjob is removed. The system recycles dynamically allocated PVs and PVCs, and any data on them is lost.

Procedure

  1. To ensure that the system does not recycle the PV, verify that the RECLAIM POLICY is set to Retain.
    1. Run the following command:
      oc get pv

      This command returns the PV:

      NAME                             CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                                           STORAGECLASS      REASON    AGE
      pvc-04c319c4-b1f3-4662-92f2-02b6e278d8d9   50Gi       RWO            Delete           Bound     staging/data-staging-kafka-0                                    rook-cephfs                 2d22h
      pvc-573b36c9-2250-4bb3-bd02-e761962c5c17   20Gi       RWO            Delete           Bound     staging/data-c-staging-redis-m-2                                rook-cephfs                 2d22h
      pvc-674ce393-1bd5-4c9c-bd1b-572629a23821   100Gi      RWO            Delete           Bound     staging/logs-volume-staging-mongodb-0                           rook-cephfs                 2d22h
      pvc-6c3bbd9f-b65c-4814-822f-9be8a42f6e1b   10Gi       RWO            Delete           Bound     staging/data-staging-zookeeper-0                                rook-cephfs                 2d22h
      pvc-7313e511-2e54-4090-986e-4a083e500d0f   100Gi      RWX            Delete           Bound     staging/c-staging-db2-meta                                      rook-cephfs                 2d22h
      pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8   50Gi       RWO            Delete           Bound     staging/backup                                                  rook-ceph-block             2d21h
    2. In this example, the backup PV is pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8 and its RECLAIM POLICY is Delete. Change this policy to Retain:
      oc patch pv pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8 -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
    3. To verify that the RECLAIM POLICY was changed, run the following command:
      oc get pv |grep backup

      The expected results reflect the Retain reclaim policy:

      pvc-7f8c3bb4-5a2c-4408-ad25-fe4f20b604f8   50Gi       RWO            Retain           Bound     staging/backup                                                  rook-ceph-block             2d22h
  2. You can now safely stop the cronjob. To stop the backup process, run one of the following commands:
    oc delete backup guardiumdatasecuritycenter

    Or

    oc delete -f gdsc-backup.yaml
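    To confirm that the cronjob is removed, you can run the following optional check, which should return no results after the backup resource is deleted:

    oc get cronjob | grep backup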

Creating a custom resource file for backups

Before you install Guardium Data Security Center, enable backups with a Network File System (NFS) storage class by using a YAML custom resource (CR) file. To configure backups, you must provision your own Persistent Volume (PV) and Persistent Volume Claim (PVC).

The backup process

Backups for Guardium Data Security Center run as CronJobs and can be activated by using a YAML file.

The YAML file defines the frequency, schedule, and retention period (in days) for backups. A pod that contains the backup scripts is created, and the top-level script runs. Unless it is configured differently, the backup directory is located in the backup pod.

During a full backup, the configuration files (such as LDAP information and oc secrets) and internal databases (Db2, MongoDB, and PostgreSQL) are backed up. During an incremental backup, the configuration files are not backed up.

The YAML file is applied by running the following command:
oc apply -f backup.yaml

YAML CR file definition

Create a YAML CR file by using the code in the following example:

apiVersion: gi.ds.isc.ibm.com/v1
kind: Backup
metadata:
  name: guardiumdatasecuritycenter
spec:
  gdsc-backup:
    cronjob:
      # schedule of the backup jobs
      schedule: "0 23 * * *"
    insightsEnv:
      #Number of days to keep full backups. (Default 0: never remove)
      RETENTION_FULL_BACKUP_IN_DAYS: 30
      #How frequently (in days) a full backup runs. (Default 7)
      FREQUENCY_FULL_BACKUP_IN_DAYS: 15
      #After this many Db2 incremental backups, the next job run
      #performs a full system backup. (Default 6)
      FREQUENCY_FULL_BACKUP_IN_INC_COUNT: 14
      #Resume a full backup from the point of failure if the
      #previous full backup job failed. (Default true)
      RESUME_FULL_BACKUP_ON_FAILURE: true
    persistentVolumesClaims:
      backup: 
        name: gdsc-custom-named-pvc
        size: 500Gi 
        storageClassName: <storage class on your system>
  targetGIInstance: gdsc-sample
Table 2. Descriptions of the definitions in the YAML CR file
Attribute Description
schedule: "0 23 * * *" "0 23 * * *" is the schedule of the CronJob, which runs every day at 23:00 (Coordinated Universal Time). You can customize the schedule based on your needs.

Note: If a backup runs longer than anticipated, the next run might start later than scheduled.

insightsEnv The settings of your environment. They can be customized to suit your needs.
name: gdsc-custom-named-pvc The name of the PVC for your NFS.
size: 500Gi The minimum size of the PV for your NFS.
storageClassName

The storage class on your system. Select an RWX (ReadWriteMany) file storage type.

For more information, see Validated storage options.

Based on the required frequency of full and incremental backups, define your cronjob by using the following examples:

Table 3. cronjob examples
Schedule name Aggressive Frequent Common Historical
Full backup schedule Once daily Once weekly (7 days) Once every 2 weeks (15 days) Once every 30 days
Incremental backup schedule 1 incremental backup 12 hours after the full backup 6 incremental backups in between 14 incremental backups in between 29 incremental backups in between
CRON schedule "0 0-23/12 * * *" "0 23 * * *" "0 23 * * *" "0 23 * * *"
FREQUENCY_FULL_BACKUP_IN_DAYS 1 7 15 30
FREQUENCY_FULL_BACKUP_IN_INC_COUNT 1 6 14 29
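For example, the Aggressive schedule in Table 3 maps to the following CR settings (a partial sketch that shows only the fields that change from the earlier YAML CR example):

spec:
  gdsc-backup:
    cronjob:
      schedule: "0 0-23/12 * * *"
    insightsEnv:
      FREQUENCY_FULL_BACKUP_IN_DAYS: 1
      FREQUENCY_FULL_BACKUP_IN_INC_COUNT: 1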

Configuring backup after Guardium Data Security Center installation

Create a PVC so that the installation can run backups successfully.

Before you begin

When you apply the patch, the "claimName" value in the oc patch command must match the name of the PVC that you create.

Procedure

  1. Deploy a Network File System (NFS) to your Guardium Data Security Center cluster. You can deploy an NFS in multiple ways.
    For example, you can clone the repo in your terminal by running the following command:
    git clone https://github.com/kubernetes-incubator/external-storage.git kubernetes-incubator

    For this example, use the kubernetes-incubator-staging folder. This folder contains rbac.yaml and deployment.yaml with the staging namespace already configured.

    1. Change the PROVISIONER_NAME value from value:fuseim.pri.ifs to value:storage.io/nfs.
    2. Update class.yaml so that the provisioner name matches the one from the previous step.
    3. Deploy the modifications.
      oc create -f deploy/class.yaml
      oc create -f deploy/deployment.yaml
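    To confirm that the provisioner deployed correctly, you can check its pod and storage class. The pod and storage class names in the following commands are assumptions that are based on this example's deployment.yaml and class.yaml; adjust them if you changed those files:

      oc get pod | grep nfs-client-provisioner
      oc get storageclass managed-nfs-storage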
  2. Create a persistent volume (PV) and persistent volume claim (PVC) in accordance with the NFS from step 1. The following examples show you how to create the PV and PVC, but you might need to adjust them according to your needs.
    1. Use the yaml file backuppv.yaml.
      Replace <nfs server ip> with the IP of your NFS server.
      # This yaml file is to be used to create a PV based on the existing NFS:
      apiVersion: v1
      kind: PersistentVolume
      metadata:
        annotations:
          pv.kubernetes.io/provisioned-by: storage.io/nfs
        name: i-am-nfs-v320-backup
      spec:
        accessModes:
        - ReadWriteMany
        capacity:
          storage: 500Gi
        nfs:
          path: /data/guardiumdatasecuritycenter
          server: <nfs server ip>
        persistentVolumeReclaimPolicy: Retain
        storageClassName: managed-nfs-storage
        volumeMode: Filesystem 
    2. To create and apply the PV, run the following commands:
      oc project staging
      oc apply -f backuppv.yaml
      The staging value is the namespace where Guardium Data Security Center is installed.
    3. Create a PVC yaml file and apply it in the same manner as the PV.
      The following example shows a sample PVC yaml file:
      # This yaml file is to be used to create a PVC based on the existing PV:
      kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: backup-pvc-support # This name is defined by the customer and is passed to the oc patch commands as the claimName value.
        annotations:
          volume.beta.kubernetes.io/storage-class: "managed-nfs-storage"
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 500Gi # Size of the storage that the PVC will obtain from the PV
        volumeName: i-am-nfs-v320-backup # Name of the PV that was created with backuppv.yaml. The PVC binds to that specific PV through spec.volumeName.
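      To apply the PVC and verify that it binds to the PV, you can run the following commands. The file name backuppvc.yaml is an example; use the name that you gave your PVC file:

      oc apply -f backuppvc.yaml
      oc get pvc backup-pvc-support

      The PVC should report a Bound status.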
  3. Edit the Guardium Data Security Center custom resource (CR) with backup values by using the code from the following example:
    The following list contains the name values:
    • Postgres

      name:gdsc-postgres-backup

    • MongoDB

      name:gdsc-backup-support-mount

    • Db2

      name:gdsc-backup-support-mount

    oc patch guardiumdatasecuritycenter $(oc get guardiumdatasecuritycenter -o jsonpath='{range .items[*]}{.metadata.name}') --type merge -p '{"spec":{"guardiumdatasecuritycenterGlobal":{"backupsupport":{"enabled":"true","name":"backup-pvc-support"}}}}'
    Note:
    • If the PVC is mounted automatically, its "storageClassName" value is "rook-cephfs". If the value is "managed-nfs-storage", run the patch command in step 4.
    • The PVC must be specified in the Guardium Data Security Center CR under the guardiumdatasecuritycenterGlobal.backupsupport.name section when guardiumdatasecuritycenterGlobal.backupsupport.enabled is set to true.
  4. Mount Postgres to the NFS PV from step 2.
    oc patch statefulset $(oc get guardiumdatasecuritycenter -o jsonpath='{range .items[*]}{.metadata.name}')-postgres-keeper --type='json' -p \
    '[{"op":"add","path":"/spec/template/spec/volumes/2","value":{"name":"gi-postgres-backup",
    "persistentVolumeClaim":{"claimName":"backup-pvc-support"}}},{"op":"add","path":"/spec/template/spec/containers/0/volumeMounts/3",
    "value":{"mountPath":"/opt/data/backup","name":"gi-postgres-backup"}}]'
  5. Mount the MongoDB Community container to the NFS PV from step 2.
    oc patch MongoDBCommunity $(oc get mongodbcommunity -oname) --type='json' -p '[{"op":"add","path":"/spec/statefulSet/spec/template/spec/containers/1","value":
    {"name":"mongod","volumeMounts":[{"name":"gdsc-backup-support-mount","mountPath":"/opt/data/backup"}]}},{"op":"add","path":"/spec/statefulSet/spec/template/spec/volumes","value":
    [{"name":"gdsc-backup-support-mount","persistentVolumeClaim":{"claimName":"$BACKUP_PVC_NAME"}}]}]'
  6. Mount the db2ucluster to the NFS PV from step 2.
    oc patch db2ucluster $(oc get guardiumdatasecuritycenter -o jsonpath='{range .items[*]}{.metadata.name}')-db2 --type='json' -p \
    '[{"op":"add","path":"/spec/storage/3","value":{"name":"backup","claimName":"backup-pvc-support",
    "spec":{"resources":{}},"type":"existing"}}]'
    Tip: The claimName for all three databases is backup-pvc-support.
  7. Verify the mounting of Postgres, MongoDB Community, and Db2:
    oc describe pvc <pvc_name>
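    For example, with the PVC name that is used in this topic:

    oc describe pvc backup-pvc-support

    In the output, check that the field that lists the pods that use the PVC (Used By or Mounted By, depending on your client version) includes the Postgres, MongoDB, and Db2 pods.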