Restoring all components of IBM Cloud Pak® for Multicloud Management
Follow the steps to restore all components of IBM Cloud Pak® for Multicloud Management:
Prerequisites
-
Install the watch, kubectl, oc, python, velero, Helm, and cloudctl CLIs on the workstation from which you access the OpenShift cluster and initiate and monitor the restoration of IBM Cloud Pak® for Multicloud Management.
-
If your environment has no Internet access, upload the Ubuntu image to all the worker nodes by following Uploading the Ubuntu image in an air gap environment. The Ubuntu container is used to restore MongoDB that is running in the ibm-common-services namespace.
-
All required storage classes must be created before the restoration, and must have the same names as in the backup cluster.
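The CLI prerequisite can be checked before you start. The following is a minimal sketch, not part of the cp4mcm-samples repository: the function names missing_clis and check_prerequisites are hypothetical, and helm is assumed to be the executable name of the Helm CLI.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_clis() {
  for cli in "$@"; do
    command -v "$cli" >/dev/null 2>&1 || printf '%s\n' "$cli"
  done
}

# Check the CLIs that are listed in the prerequisites.
# Assumption: the Helm CLI executable is named "helm".
check_prerequisites() {
  missing=$(missing_clis watch kubectl oc python velero helm cloudctl)
  if [ -n "$missing" ]; then
    printf 'Missing CLIs:\n%s\n' "$missing" >&2
    return 1
  fi
  return 0
}

# Usage on the workstation:
#   check_prerequisites && echo "All required CLIs are installed."
```

The check only confirms that each executable is found on PATH. It does not verify CLI versions or cluster access.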
Before you begin
You need to check the following points before you restore IBM Cloud Pak® for Multicloud Management:
- The restored cluster needs to be in the same region as the backed-up cluster.
- It is recommended to have the same OpenShift version in both the backed-up and restored clusters.
- If Monitoring needs to be restored, you need to keep the backed-up and restored cluster domain name the same. Otherwise, Monitoring agents might not be able to connect to IBM Cloud Pak® for Multicloud Management after restoration.
- The following steps in the Procedure section are to restore IBM Cloud Pak® for Multicloud Management in a new cluster.
- The backup also includes keys and certificates from the previous clusters. Ensure that the restored data is accessible in the new deployment. This restoration procedure works with the backup procedure in Backing up IBM Cloud Pak® for Multicloud Management. You cannot run the restoration without a backup that was created by that procedure.
- It is important to restore the backed-up data first for different components like Common Services, Monitoring, GRC, Vulnerability Advisor (VA), Mutation Advisor (MA), and Managed Services, and then deploy Common Services and IBM Cloud Pak® for Multicloud Management operators. Otherwise, the restoration might not work.
- It is highly recommended that the versions of Common Services, Red Hat Advanced Cluster Management, and IBM Cloud Pak® for Multicloud Management in the restored cluster be the same as in the backup cluster.
-
The backup and restoration of Red Hat Advanced Cluster Management is managed independently of IBM Cloud Pak® for Multicloud Management. See Red Hat Advanced Cluster Management documentation for backing up and restoring Red Hat Advanced Cluster Management observability service. When you install the Red Hat Advanced Cluster Management observability service during restoration, it is recommended to use the same S3 bucket that is used during the installation of observability service in the backup cluster.
Procedure
-
Clone the GitHub repository by running the following command:
git clone https://github.com/IBM/cp4mcm-samples.git
-
Log in to the OpenShift cluster by running the following command:
oc login --token=<TOKEN> --server=<URL>
Where:
<TOKEN> is the token that you use to log in to the OpenShift cluster. <URL> is the OpenShift server URL.
-
Install Velero in the OpenShift cluster.
-
If your environment has no access to Internet, you can follow the steps in Installing Velero in an air gap environment.
-
If your environment has access to Internet, you can follow the steps in Installing Velero in an online environment.
-
Restore Common Services, Monitoring, GRC, Vulnerability Advisor (VA), Mutation Advisor (MA), and Managed Services.
-
Change the following values in the restore-data.json file to match your environment. The restore-data.json file is available in the <Path of cp4mcm-samples>/bcdr/restore/scripts directory, where <Path of cp4mcm-samples> is the path where you cloned the cp4mcm-samples GitHub repository.
"airGap": "<Indicates whether the install is online or offline. Set the value to true to install offline and false to install online>",
"backupName": "<backup name>",
"ingressSubdomain": "<ingress subdomain of cluster>",
"grcCrNamespace": "<namespace name where all the grc policies are created>",
"imRestoreLabelKey": "<label key which is added for Infrastructure Management backup and restore>",
"imRestoreLabelValue": "<label value which is added for Infrastructure Management backup and restore>",
"monitoringRestoreLabelKey": "<label key which is added for Monitoring backup and restore>",
"monitoringRestoreLabelValue": "<label value which is added for Monitoring backup and restore>"
See the following example:
"airGap": "false",
"backupName": "cp4mcm-backup-373383393",
"ingressSubdomain": "apps.cp4mcm-restore.multicloud-apps.io",
"grcCrNamespace": "default",
"imRestoreLabelKey": "imbackup",
"imRestoreLabelValue": "test",
"monitoringRestoreLabelKey": "appbackup",
"monitoringRestoreLabelValue": "monitoring"
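Before you start the restoration, you might want to confirm that every expected key is present in restore-data.json. The following sketch is an illustration only: the function name missing_restore_keys is hypothetical, and the check is a plain text search for each key name, not a full JSON validation.

```shell
# Print every expected key that does not appear (as "key":) in the given file.
missing_restore_keys() {
  file=$1
  for key in airGap backupName ingressSubdomain grcCrNamespace \
             imRestoreLabelKey imRestoreLabelValue \
             monitoringRestoreLabelKey monitoringRestoreLabelValue; do
    grep -q "\"$key\"[[:space:]]*:" "$file" || printf '%s\n' "$key"
  done
}

# Usage from the bcdr/restore/scripts directory:
#   missing_restore_keys restore-data.json
# No output means that all eight keys were found.
```

The sketch does not detect placeholder values that were left unchanged, so review the values manually as well.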
-
Restore Common Services, Monitoring, GRC, VA/MA, and Managed Services.
-
Go to the <Path of cp4mcm-samples>/bcdr/restore/scripts directory by running the following command, where <Path of cp4mcm-samples> is the path where you cloned the cp4mcm-samples GitHub repository.
cd <Path of cp4mcm-samples>/bcdr/restore/scripts
-
Start the restoration process by running either of the following commands:
nohup bash restore.sh -a > restore.log &
or
nohup bash restore.sh --all-restore > restore.log &
-
Install Common Services and IBM Cloud Pak® for Multicloud Management.
-
Install Red Hat Advanced Cluster Management, and enable the observability feature. For more information, see Installing Red Hat Advanced Cluster Management and Enabling the observability service in Red Hat Advanced Cluster Management.
-
Create the installer catalog sources. For more information, see Create the installer catalog sources.
-
Install the Common Services operator.
-
Install the IBM Cloud Pak for Multicloud Management operator and create its CR, enabling the components that you need. For example, enable Infrastructure Management, Managed Services, Service Library, GRC, Vulnerability Advisor (VA), and Mutation Advisor (MA), but do not enable Monitoring. For Managed Services, specify the existing claim name details as follows:
- enabled: true
  name: ibm-management-cam-install
  spec:
    manageservice:
      camLogsPV:
        name: cam-logs-pv
        persistence:
          accessMode: ReadWriteMany
          enabled: true
          existingClaimName: "cam-logs-pv"
          existingDynamicVolume: false
          size: 10Gi
          storageClassName: "<your storage class name>"
          useDynamicProvisioning: true
      camMongoPV:
        name: cam-mongo-pv
        persistence:
          accessMode: ReadWriteMany
          enabled: true
          existingClaimName: "cam-mongo-pv"
          existingDynamicVolume: false
          size: 15Gi
          useDynamicProvisioning: true
          storageClassName: "<your storage class name>"
      camTerraformPV:
        name: cam-terraform-pv
        persistence:
          accessMode: ReadWriteMany
          enabled: true
          existingClaimName: "cam-terraform-pv"
          existingDynamicVolume: false
          size: 15Gi
          storageClassName: "<your storage class name>"
          useDynamicProvisioning: true
For more information, see Installing the IBM Cloud Pak® for Multicloud Management.
-
Wait until the IBM Cloud Pak for Multicloud Management installation is complete, and all pods in the ibm-common-services namespace are running.
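One way to see which pods are still starting is to filter the pod list by status. The following sketch is an assumption-based illustration: the function name pods_not_ready is hypothetical, and the filter relies on the default column layout of oc get pods (NAME, READY, STATUS, and so on).

```shell
# List pods in a namespace whose STATUS is neither Running nor Completed.
# Assumption: STATUS is the third column of `oc get pods --no-headers`.
pods_not_ready() {
  oc get pods -n "$1" --no-headers \
    | awk '$3 != "Running" && $3 != "Completed" { print $1 " " $3 }'
}

# Usage:
#   pods_not_ready ibm-common-services
# No output means that every pod reports Running or Completed.
```

A pod in Running status might still have containers that are not ready, so also check the READY column if a component does not respond.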
-
Restore the IBM Common Services database.
-
Change the image value in the mongo-restore-dbdump.yaml file. The file is available in the <Path of cp4mcm-samples>/bcdr/restore/scripts/cs folder, where <Path of cp4mcm-samples> is the path where the cp4mcm-samples GitHub repository is cloned. The image value must be equal to the value of the mongoDBDumpImage Helm variable that was used for taking the backup. Get the image value by running the following command:
kubectl get configmap backup-metadata -n backup -o jsonpath='{.data.mongoDBDumpImage}'
-
Run the mongo-restore-dbdump job to restore the Common Services database:
oc apply -f mongo-restore-dbdump.yaml
Wait until the mongo-restore-dbdump job is in Completed status. You can run the following command to check the mongo-restore-dbdump job status:
oc get pod -n ibm-common-services | grep -i icp-mongodb-restore
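Instead of rerunning the status command manually, you can poll until the restore pod completes. The following is a minimal sketch under stated assumptions: the function name wait_for_mongo_restore, the default of 60 polls, and the 10-second interval are all hypothetical choices, and the pod name is matched with the same icp-mongodb-restore pattern that the status command uses.

```shell
# Poll until a pod whose name contains icp-mongodb-restore reports Completed.
# $1: maximum number of polls (default 60); $2: seconds between polls (default 10).
# Returns 0 when the pod is Completed, 1 if the polls run out first.
wait_for_mongo_restore() {
  max_tries=${1:-60}
  interval=${2:-10}
  try=0
  while [ "$try" -lt "$max_tries" ]; do
    if oc get pod -n ibm-common-services --no-headers \
        | grep -i icp-mongodb-restore | grep -q Completed; then
      return 0
    fi
    try=$((try + 1))
    sleep "$interval"
  done
  return 1
}

# Usage:
#   wait_for_mongo_restore && echo "MongoDB restore completed."
```

If the function returns a nonzero status, inspect the pod logs before you continue with the next step.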
-
Enable the Monitoring operator (ibm-management-monitoring) by running the following command:
oc patch installations.orchestrator.management.ibm.com ibm-management -n <namespace in which IBM Cloud Pak for Multicloud Management is installed> --type='json' -p='[{"op": "replace", "path": "/spec/pakModules/1/enabled", "value": true }]'
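The patch command addresses the Monitoring module by its position (index 1) in the pakModules list. If you want to confirm the position in your installation first, the following sketch prints the zero-based index of a module. It is an illustration only: the function name pak_module_index is hypothetical, and the sketch assumes that each pakModules entry has a name field that holds the module name, such as ibm-management-monitoring.

```shell
# Print the zero-based index of a module in .spec.pakModules of the
# ibm-management installation. Prints nothing if the module is not found.
# $1: namespace of the installation; $2: module name.
# Assumption: each pakModules entry has a "name" field.
pak_module_index() {
  namespace=$1
  module=$2
  oc get installations.orchestrator.management.ibm.com ibm-management \
    -n "$namespace" \
    -o jsonpath='{range .spec.pakModules[*]}{.name}{"\n"}{end}' \
    | awk -v m="$module" '$0 == m { print NR - 1; exit }'
}

# Usage (replace the namespace placeholder with your own value):
#   pak_module_index "<namespace>" ibm-management-monitoring
```

If the printed index differs from 1, adjust the index in the path of the patch command accordingly.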
-
Restore Infrastructure Management.
Note: Because Infrastructure Management restoration requires its CRD to be present before restoration, you need to perform Infrastructure Management restoration after Common Services and IBM Cloud Pak® for Multicloud Management installation.
-
Configure LDAP, and ensure that the LDAP group name is the same as the one that is defined in the backed-up Infrastructure Management CR.
-
Restore Infrastructure Management by running either of the following commands:
nohup bash restore.sh -im > im-restore.log &
or
nohup bash restore.sh --im-restore > im-restore.log &
-
Restore Managed Clusters and applications.
This restoration of managed clusters and applications needs to be done after Red Hat Advanced Cluster Management and IBM Cloud Pak® for Multicloud Management installation.
-
To break the connection between the managed clusters and the old Hub Cluster that you backed up, delete the klusterlet from each managed cluster by running the following shell script in the managed cluster:
https://github.com/open-cluster-management/deploy/blob/master/hack/cleanup-managed-cluster.sh
-
Reimport the managed cluster in the new restored Hub Cluster after the klusterlet deletion is completed.
-
After the reimport is completed, restore all the required resources by running the following command:
velero restore create <MANAGED_CLUSTER_RESTORE> \
  --from-backup <BACKUP_NAME> \
  --include-namespaces open-cluster-management,<MANAGED_CLUSTER_NAMESPACE>,<DEPLOYED_APPLICATION_NAMESPACE>
Note:
- If the restoration command fails with errors, run the same command again.
- Replace <MANAGED_CLUSTER_NAMESPACE> with the namespace where the managed cluster is imported. If there are multiple managed cluster namespaces, add each namespace, and separate them by commas. For example, open-cluster-management,managed-cluster1-ns,app1-ns.
- Replace <DEPLOYED_APPLICATION_NAMESPACE> with the namespace where managed cluster applications are deployed. If there are multiple application-deployed namespaces, add each namespace, and separate them by commas. For example, app1-ns,app2-ns,app3-ns.
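When there are many managed cluster and application namespaces, building the comma-separated value by hand is error-prone. The following sketch joins the namespaces for the --include-namespaces option. The function name include_namespaces is hypothetical and is not part of Velero or the cp4mcm-samples repository.

```shell
# Join open-cluster-management and every namespace that is passed as an
# argument into one comma-separated value for --include-namespaces.
include_namespaces() {
  list=open-cluster-management
  for ns in "$@"; do
    list="$list,$ns"
  done
  printf '%s\n' "$list"
}

# Usage:
#   include_namespaces managed-cluster1-ns app1-ns
# The output can be passed to velero as:
#   --include-namespaces "$(include_namespaces managed-cluster1-ns app1-ns)"
```

The function does not check that the namespaces exist in the backup, so verify the names against your backed-up cluster.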
Troubleshooting
1. LDAP user login is not working after restoration
Follow the steps to solve the problem:
- Log in to the IBM Cloud Pak console by using the default admin account.
- Click Administer > Identity and access on the IBM Cloud Pak console.
- Select the LDAP connection, and click Edit connection. Edit the LDAP connection with the correct information.
- Click Test connection.
- Click Save.
- Log in to the IBM Cloud Pak console as an LDAP user to verify the fix.