Backup and restore for cloud native analytics

The backup and restore function supports backing up cloud native analytics policies to an external location by using Secure Copy Protocol (SCP). You can then use those backups to restore the policies on another deployment, which might be on a different cluster or on the same cluster.

The backup and restore function is disabled by default. If you did not enable it during your deployment, then you must first enable it in the NOI custom resource (CR) YAML file under the spec section, as in the following example:
spec:
  backupRestore:
    enableAnalyticsBackups: true
Note: You can edit the NOI CR YAML file in one of the following two ways.
  • Edit the file from the command line with the oc edit noi command for a full cloud deployment, or the oc edit noihybrid command for a hybrid deployment.
  • Edit the deployment from the Red Hat® OpenShift® Container Platform Operator Lifecycle Manager (OLM) console: Operators > Installed Operators > IBM Cloud Pak for AIOps Event Manager. Click the NOI or NOIHybrid tab and select your deployment. Then, click the YAML tab to edit and save the YAML file. Your changes are deployed automatically. A non-interactive alternative to oc edit is sketched after this note.
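If you prefer a non-interactive update, the same field can be set with a single patch command. The following is a minimal sketch for a full cloud deployment; substitute noihybrid for noi on a hybrid deployment, and replace <noi_instance_name> and <namespace> with your own values.
oc patch noi <noi_instance_name> -n <namespace> --type merge \
  -p '{"spec":{"backupRestore":{"enableAnalyticsBackups":true}}}'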
If you need to send backups to an external system by using SCP, complete the following steps:
  • Create a public/private key pair on the primary cluster by using ssh-keygen. For more information, see How to Use ssh-keygen to Generate a New SSH Key in the SSH documentation. When you generate the keys, two files are created, as in the following examples.
    $HOME/.ssh/id_rsa
    $HOME/.ssh/id_rsa.pub
    The public file is the file that you share with the backup cluster.
  • Add the public key to the $HOME/.ssh/authorized_keys file on the cluster that you plan to send the backups to. Copy the contents of the file on the primary cluster that contains the public key (for example, $HOME/.ssh/id_rsa.pub) and add it to the $HOME/.ssh/authorized_keys file on the target (backup) cluster; a consolidated key-setup sketch follows this list. To test whether the update is successful, use ssh from the source system to log in to the backup cluster with the private key. For example, from the primary cluster, run the following command.
    ssh -i $HOME/.ssh/id_rsa root@api.bkupcluster.xyz.com
    If the transfer of the key is successful, you are not prompted for a password.
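The key generation, distribution, and verification can also be run as a short command sequence. This is a minimal sketch that assumes the example backup host api.bkupcluster.xyz.com and a root login on that host; ssh-copy-id is used here as a convenience instead of editing authorized_keys by hand.
# Generate the key pair on the primary cluster (no passphrase in this sketch).
ssh-keygen -t rsa -b 4096 -f $HOME/.ssh/id_rsa -N ""

# Append the public key to authorized_keys on the backup cluster.
ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@api.bkupcluster.xyz.com

# Verify that the login succeeds without a password prompt.
ssh -i $HOME/.ssh/id_rsa root@api.bkupcluster.xyz.com hostname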

The private key is placed in a Kubernetes secret for use by the cron job to connect to the target system.

Backup

The following table describes the configuration parameters that are used for backups in the IBM® Netcool® Operations Insight® custom resource definition.
Section name | Property name | Description | Default value
backupRestore | enableAnalyticsBackups | If set to true, the cron job that takes the backups is activated. | false
helmValuesNOI | ibm-noi-bkuprestore.noibackuprestore.backupDestination.hostname | (Optional) The destination hostname of the machine that the backups are copied to. | false
helmValuesNOI | ibm-noi-bkuprestore.noibackuprestore.backupDestination.username | (Optional) The username on the destination hostname that performs the SCP copy. | false
helmValuesNOI | ibm-noi-bkuprestore.noibackuprestore.backupDestination.directory | (Optional) The directory on the destination hostname that receives the backups. | false
helmValuesNOI | ibm-noi-bkuprestore.noibackuprestore.backupDestination.secretName | (Optional) The name of the Kubernetes secret that contains the private SSH key that is used for the SCP copy. The secret key privatekey must be used to store the SSH private key. If you want to use SCP, this secret must be set up before the installation of Netcool Operations Insight. | false
helmValuesNOI | ibm-noi-bkuprestore.noibackuprestore.schedule | The cron schedule that determines how often backups are taken. For more information about the schedule format, see the cron documentation. | Every 3 minutes
helmValuesNOI | ibm-noi-bkuprestore.noibackuprestore.claimName | (Optional) The PVC claim name that is used to store the backups. An empty value means that Kubernetes persistent storage is not used. Valid for the primary deployment only. This property must be specified before the NOI deployment if Kubernetes persistent storage is needed. | false
helmValuesNOI | ibm-noi-bkuprestore.noibackuprestore.maxbackups | (Optional) The maximum number of historic policy backups to keep on the persistent volume, to preserve storage space. | 10
Create and send a backup by using SCP
To create the secret that is used for the backup, use the ssh private key that was created earlier. Run the following command on the primary cluster to create the secret.
oc create secret generic <secret_key_name> --from-file=privatekey=<home_directory of your_user_id>/.ssh/<generated_private_key> --namespace <namespace>
Where:
  • <secret_key_name> is a name of your choice, for example: evtmanager-backuprestore-secret
  • <home_directory of your_user_id> is the home directory of your user, for example: /root
  • <generated_private_key> is the private key that you generated, for example: id_rsa
  • <namespace> is the namespace where IBM Netcool Operations Insight on Red Hat OpenShift is deployed.
Example:
oc create secret generic ocd318-backup-key
        --from-file=privatekey=/root/.ssh/id_rsa --namespace noicase318
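Optionally, confirm that the secret stores the key under the required privatekey entry by decoding it and checking its first line. This is a quick verification sketch that reuses the example secret name and namespace; the exact header depends on the key format that ssh-keygen produced (for example, -----BEGIN OPENSSH PRIVATE KEY----- or -----BEGIN RSA PRIVATE KEY-----).
oc get secret ocd318-backup-key -n noicase318 \
  -o jsonpath='{.data.privatekey}' | base64 --decode | head -n 1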
The secret is then used in the deployment YAML on the primary cluster.
An example of basic configuration of backup on the primary cluster:
helmValuesNOI:
    ibm-noi-bkuprestore.noibackuprestore.backupDestination.directory: <home_directory of your_user_id>/tmp/backups
    ibm-noi-bkuprestore.noibackuprestore.backupDestination.hostname: <hostname>.xyz.com
    ibm-noi-bkuprestore.noibackuprestore.backupDestination.secretName: <secret_key_name>
    ibm-noi-bkuprestore.noibackuprestore.backupDestination.username: <your_user_id>
    ibm-noi-bkuprestore.noibackuprestore.schedule: '*/3 * * * *'
With this configuration, the cron job copies the backups to the target cluster (for example, hadr-inf.xyz.com) every three minutes and places the backup files in the <home_directory of your_user_id>/tmp/backups directory. The directory must exist on the target cluster. The indentation of the helmValuesNOI fields is important. If your NOI deployment already has a helmValuesNOI section, add the new fields to it. Example with the backup values added:
  spec:
    helmValuesNOI:
      ibm-noi-bkuprestore.noibackuprestore.backupDestination.directory: /tmp/backups
      ibm-noi-bkuprestore.noibackuprestore.backupDestination.hostname: api.bkupcluster.xyz.com
      ibm-noi-bkuprestore.noibackuprestore.backupDestination.secretName: ocd318-backup-key
      ibm-noi-bkuprestore.noibackuprestore.backupDestination.username: root
      ibm-noi-bkuprestore.noibackuprestore.schedule: '*/3 * * * *'
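After you save the updated deployment YAML, you can check that the backup cron job is scheduled and that archives arrive on the target host. The following checks are a sketch that assumes the example backup host api.bkupcluster.xyz.com and the /tmp/backups target directory from the example above.
# On the primary cluster: the backup cron job and its periodic jobs are listed.
oc get cronjobs | grep bkuprestore
oc get jobs | grep bkuprestore

# On the backup host: new cneapolicies-*.tar.gz archives appear every few minutes.
ssh -i $HOME/.ssh/id_rsa root@api.bkupcluster.xyz.com 'ls -lt /tmp/backups | head'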

Restore

  1. Install podman on your system (yum install podman), then log in to the IBM Entitled Registry (cp.icr.io) with your entitlement registry key, by running the following command:
    podman login -u cp -p <entitlement_key> cp.icr.io
  2. You can restore the backup file on the target deployment by using the noi-backuprestore-service container image. To find the exact image to use for your deployment, locate the backup and restore cron job. Run the oc get cronjobs | grep bkuprestore command from the primary cluster, where you are triggering the backups.
    Example:
    [root@api.primcluster.cp.fyre.ibm.com ~]# oc get cronjobs | grep bkuprestore
    evtmgr0-ibm-noi-bkuprestore-noibckup           */1 * * * *     False     0        25s             29h
    The backup and restore image is specified in the cron job YAML:
    oc get cronjob <release name>-ibm-noi-bkuprestore-noibckup -o yaml | grep icr
    Example:
    [root@api.primcluster.cp.fyre.ibm.com ~]# oc get cronjob evtmgr0-ibm-noi-bkuprestore-noibckup -o yaml | grep icr
                image: cp.icr.io/cp/noi/noi-backuprestore-service@sha256:a28fb6c0cbdadda6f378a756ab3e8d8a629a3fd749c6d9343066545c1e374881
    Pull this image on the backup cluster:
    # podman pull cp.icr.io/cp/noi/noi-backuprestore-service@sha256:a28fb6c0cbdadda6f378a756ab3e8d8a629a3fd749c6d9343066545c1e374881
    Trying to pull cp.icr.io/cp/noi/noi-backuprestore-service@sha256:a28fb6c0cbdadda6f378a756ab3e8d8a629a3fd749c6d9343066545c1e374881...
    Getting image source signatures
    Checking if image destination supports signatures
    Copying blob 8a81fb36b007 skipped: already exists  
    Copying blob 9d59a1377311 skipped: already exists  
    Copying blob d4cecfd56161 skipped: already exists  
    Copying blob 7562fe716fa4 skipped: already exists  
    Copying blob 0c10cd59e10e skipped: already exists  
    Copying blob 8a0eb7365b1a skipped: already exists  
    Copying blob 256777fbf05c skipped: already exists  
    Copying blob c8158b20c85a skipped: already exists  
    Copying blob 1b658ca76caf skipped: already exists  
    Copying blob 1e48335a1994 skipped: already exists  
    Copying blob a2f93eeba1ac skipped: already exists  
    Copying blob 47e0bdc406b5 skipped: already exists  
    Copying blob 66cf77cb242d skipped: already exists  
    Copying blob 4f4fb700ef54 skipped: already exists  
    Copying config d2436a7745 done  
    Writing manifest to image destination
    Storing signatures
    d2436a77456b2a230f3f603c9c42fa712c64408ae97a065b184b7d78ca866e89
    
  3. Create a directory to contain the configuration and policies that you want to upload to the target NOI deployment (for example, /root/tmp/restore). Create a file called target.env in the restore directory. This file contains the credentials of the system that you want to restore to. (Steps 3 to 6 are also consolidated into a single sketch after this procedure.)
    Note: The target.env file must be located in the restore directory, which is the parent directory of the policies subdirectory.
    Example target.env file:
    export username=system 
    export password=<NOI deployment system auth password> 
    export tenantid=cfd95b7e-3bc7-4006-a4a8-a73a79c71255 
    export policysvcurl=https://<openshift_noi_deployment_route_endpoint>
    export inputdir=/input/policies 
    The username is a fixed value. The password value for <NOI deployment system auth password> can be obtained by using the following command.
    oc get secret <name>-systemauth-secret -o jsonpath --template '{.data.password}' | base64 --decode; echo
    The tenantid is a fixed value. The inputdir value is fixed and must be /input/policies. The policysvcurl value is https:// followed by the fully qualified hostname of the OpenShift route endpoint, which can be obtained by running the following command:
    oc get routes -o=jsonpath='{range .items[*]}{.spec.host}{"\n"}{end}' | sort -u | grep netcool
  4. Copy the policies backup file into the <restoredir>. The file that is generated by the backup has the format cneapolicies-yyyy-MM-dd-mm:ss:SS:Z.tar.gz.
  5. Create a directory called policies in the <restoredir> directory. The <restoredir> is the directory where the target.env file resides. Extract the policy backup file in the <restoredir> by running the following command:
    tar xvf <'your backup tar gzip'> --force-local --directory policies 
    Note: The policy file name might need to be enclosed in single quotation marks (').
  6. Restore by running the following podman command:
    podman run -t -i --env LICENSE=accept --network host --user <your_user_id> --privileged -v <restoredir>:/input:ro <backuprestoreimage> /app/scripts/run.sh
    Where:
    • <your_user_id> is the user owning the <restoredir> and the backup images.
    • <backuprestoreimage> is the image from the bkuprestore cronjob.
    Note: Before you run the command, you must be logged in with podman to the registry that hosts the backup and restore image (see step 1) so that the image can be pulled.
    When you run the command, a stream of policy data is output to the terminal, showing that the individual policies are being written to the target system. On successful completion, the following message is output for each analytics policy type.

    Successfully activated policy batch in destination PRS: groupid = <analytics_type>

    Example:
    # podman run -t -i --env LICENSE=accept --network host --user root --privileged -v `pwd`:/input:ro cp.icr.io/cp/noi/noi-backuprestore-service@sha256:a28fb6c0cbdadda6f378a756ab3e8d8a629a3fd749c6d9343066545c1e374881 /app/scripts/run.sh
    Running extraction script
    Restoring policies
    Note: Using npm emulator
    {
      tenantid: 'cfd95b7e-3bc7-4006-a4a8-a73a79c71255',
      policysvcurl: 'https://netcool-evtmgr0.apps.kcaiops42.xyz.com'
    } Running in policy load
    <lines deleted>
    Successfully activated policy batch in destination PRS:  groupid = topological-correlation
    Successfully activated policy batch in destination PRS:  groupid = scope
    Successfully activated policy batch in destination PRS:  groupid = topological-enrichment
    Successfully activated policy batch in destination PRS:  groupid = self_monitoring
  7. In the manage policies UI, confirm that the policies were added.
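Restore steps 3 to 6 can also be combined into one short script that runs on the machine where podman is installed. The following is a sketch only, under these assumptions: oc is logged in to the cluster that hosts the target NOI deployment, the release name and namespace match the examples in this topic (evtmgr0 and noicase318), the backup archive path is passed as the first argument, and RESTORE_IMAGE is set to the digest reference that you found in step 2. Adjust the variables for your environment.
#!/bin/bash
# Sketch of restore steps 3 to 6; the names below come from the examples in
# this topic and must be adjusted for your environment.
set -euo pipefail

RELEASE=evtmgr0                        # example release name
NAMESPACE=noicase318                   # example namespace
RESTORE_IMAGE='<backuprestoreimage>'   # digest reference found in step 2
BACKUP_ARCHIVE=$1                      # path to the cneapolicies-*.tar.gz file
RESTOREDIR=/root/tmp/restore

mkdir -p "$RESTOREDIR/policies"

# Step 3: build target.env from the live target deployment.
PASSWORD=$(oc -n "$NAMESPACE" get secret "${RELEASE}-systemauth-secret" \
  -o jsonpath='{.data.password}' | base64 --decode)
ROUTE=$(oc -n "$NAMESPACE" get routes \
  -o jsonpath='{range .items[*]}{.spec.host}{"\n"}{end}' | sort -u | grep netcool)

cat > "$RESTOREDIR/target.env" <<EOF
export username=system
export password=$PASSWORD
export tenantid=cfd95b7e-3bc7-4006-a4a8-a73a79c71255
export policysvcurl=https://$ROUTE
export inputdir=/input/policies
EOF

# Steps 4 and 5: copy the backup into the restore directory and extract it
# into the policies subdirectory.
cp "$BACKUP_ARCHIVE" "$RESTOREDIR/"
tar xvf "$RESTOREDIR/$(basename "$BACKUP_ARCHIVE")" --force-local \
  --directory "$RESTOREDIR/policies"

# Step 6: run the restore container against the prepared directory.
podman run -t -i --env LICENSE=accept --network host --user root --privileged \
  -v "$RESTOREDIR":/input:ro "$RESTORE_IMAGE" /app/scripts/run.sh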