IBM Cloud Pak foundational services backup and restore for clusters with a single instance of foundational services
You can schedule backup and restore of foundational services by using the Red Hat OpenShift API for Data Protection (OADP) operator. Make sure that you use the stable-1.3 channel of the OADP operator.
Prerequisites
- Set up any Amazon S3-compatible storage. For example, you can create a bucket in IBM Cloud Object Storage. For more information, see IBM Cloud Object Storage.
- When you add a service credential to the bucket, include the hash-based message authentication code (HMAC). For more information, see Service credentials. From the Cloud Object Storage navigation menu, gather the following information:
  - access key id, which can be found on the Service credentials page that is associated with the bucket.
  - secret access key, which can be found on the Service credentials page that is associated with the bucket.
  - bucket name, which can be found on the Buckets page.
  - bucket region, which can be found on the Buckets page.
  - root directory name, which is the path in the bucket where you want to store the backups.
  - s3 URL, which is the endpoint URL of the bucket.
    Note: The endpoint URL must start with http:// or https://.
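The endpoint URL rule in the preceding list can be checked before the value is used anywhere else. The following is a minimal sketch, not part of OADP or the IBM scripts; the function name check_s3_url and the sample endpoint are hypothetical.

```shell
# Sketch: verify that an S3 endpoint URL starts with http:// or https://.
# check_s3_url is a hypothetical helper name, not an IBM or OADP tool.
check_s3_url() {
  case "$1" in
    http://?*|https://?*)
      return 0 ;;
    *)
      echo "endpoint URL must start with http:// or https://: $1" >&2
      return 1 ;;
  esac
}

# Example call (the endpoint value is a placeholder):
# check_s3_url "https://s3.example.com" && echo "endpoint looks valid"
```

The check looks only at the URL prefix. It does not confirm that the endpoint is reachable or that the bucket exists.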
Note: If the cluster that is being backed up or restored to uses the s390x architecture, run any velero CLI commands on an alternate cluster that does not use s390x and that has oc access to the original cluster (usually by using oc login). The Velero CLI does not yet support s390x.
Backing up foundational services
Complete the following steps to back up the installed foundational services.
Create the backup resources
You need the following resources for completing the backup procedures.
- Log in to your OpenShift cluster command-line interface (CLI) by using the oc login command.
- Create a namespace for Velero objects. The following example creates the velero namespace. For more information about Velero, see Velero documentation.
    oc new-project velero
- Install the Red Hat OADP operator in the velero namespace. For more information, see About installing OADP.
- Create a secret named cloud-credentials with the access key id and secret access key credentials.
  - Open any editor and place the following credentials in a file named credentials-velero.
      vi credentials-velero
  - Insert the following content in the file:
      [default]
      aws_access_key_id=<access_key_id>
      aws_secret_access_key=<secret_access_key>
  - Create the secret.
      oc create secret generic cloud-credentials -n velero --from-file cloud=credentials-velero
- From your OpenShift cluster console OperatorHub page, install the OADP operator from the stable-1.3 channel, which provides the Velero 1.9 API. The API is needed for foundational services backup and restore. For more information, see OpenShift Container Platform documentation.
- Create a DataProtectionApplication object.
  Note: The provider is aws even if you are not using AWS Object Storage.
    apiVersion: oadp.openshift.io/v1alpha1
    kind: DataProtectionApplication
    metadata:
      name: <resource_name>
      namespace: velero
      annotations:
        argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
        argocd.argoproj.io/sync-wave: '20'
    spec:
      backupLocations:
        - velero:
            config:
              profile: default
              region: <bucket_region>
              s3ForcePathStyle: 'true'
              s3Url: <s3_URL>
            credential:
              key: cloud
              name: cloud-credentials
            default: true
            objectStorage:
              bucket: <bucket_name>
              prefix: <root_directory_name>
            provider: aws
      configuration:
        restic:
          enable: true
        velero:
          defaultPlugins:
            - openshift
            - aws
          podConfig:
            resourceAllocations:
              limits:
                cpu: '1'
                memory: 1Gi
              requests:
                cpu: 500m
                memory: 512Mi
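For the credentials step in the preceding procedure, the credentials-velero file can also be written without opening an editor. The following is a minimal sketch under that assumption; write_velero_credentials is a hypothetical helper name, and the two key values are the HMAC credentials gathered in the prerequisites.

```shell
# Sketch: write the credentials-velero file without an editor.
# write_velero_credentials is a hypothetical helper, not part of OADP.
# The body runs in a subshell so that umask and variables do not leak out.
write_velero_credentials() (
  # $1 = access key id, $2 = secret access key, $3 = output file (optional)
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: write_velero_credentials <access_key_id> <secret_access_key> [file]" >&2
    exit 1
  fi
  out="${3:-credentials-velero}"
  umask 077   # create the file readable by the current user only
  printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' "$1" "$2" > "$out"
)

# Example call, followed by the secret creation command from the procedure:
# write_velero_credentials "<access_key_id>" "<secret_access_key>"
# oc create secret generic cloud-credentials -n velero --from-file cloud=credentials-velero
```

The file contains secret material, so consider deleting it after the secret is created.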
Add labels to resources
You can add labels to resources automatically by running the script or by manually adding labels. Complete one of the following procedures.
Labelling the resources automatically by running the script
- Run the following commands to download the env.properties file and the label-common-service.sh script, and save them in the same folder.
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/common-service/label-common-service.sh
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/common-service/env.properties
- Open the env.properties file, edit the required variables, and save the changes.
  Note: The OPERATOR_NS="" variable must be properly set for the script to work. Other variables have default values. You can change these values to fit your environment.
    vi env.properties
  The env.properties file contains the following variables:
    # Change the following values to fit your environment.
    OPERATOR_NS="" # Set the parameter to the namespace where the foundational services operator is installed.
    # Pass the namespace where foundational services are installed.
    # Leave the value of this parameter empty if the services are installed in the same namespace as the foundational services operator.
    SERVICES_NS=""
    # Pass the control namespace if it is needed to be backed up.
    CONTROL_NS=""
    # Change to the namespace where cert-manager, License Service and License Service Reporter are installed if they are installed in custom namespaces.
    CERT_MANAGER_NAMESPACE="ibm-cert-manager"
    LICENSING_NAMESPACE="ibm-licensing"
    LSR_NAMESPACE="ibm-lsr"
    # Change to 1 to enable the private catalog if required.
    ENABLE_PRIVATE_CATALOG=0
    # Add additional CatalogSources without the ".spec.publisher: IBM" parameter. Separate the CatalogSources with a comma.
    # For example: "my-catalog,my-catalog2,my-catalog3"
    ADDITIONAL_SOURCES=""
- Use the following command to run the label-common-service.sh script.
    ./label-common-service.sh
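Because the script depends on OPERATOR_NS, it can help to confirm that the variable is set before the script runs. The following is a minimal sketch; check_operator_ns is a hypothetical helper, and it assumes that env.properties uses plain shell variable syntax, as the listing above suggests.

```shell
# Sketch: confirm that OPERATOR_NS is set in env.properties.
# check_operator_ns is a hypothetical helper, not part of the IBM scripts.
# The body runs in a subshell so sourced variables do not leak into the caller.
check_operator_ns() (
  file="${1:-env.properties}"
  # A bare file name is looked up in PATH by ".", so force a path.
  case "$file" in
    */*) ;;
    *) file="./$file" ;;
  esac
  if [ ! -r "$file" ]; then
    echo "cannot read $file" >&2
    exit 1
  fi
  unset OPERATOR_NS   # ignore any value inherited from the environment
  . "$file"           # assumption: the file is valid shell syntax
  if [ -z "${OPERATOR_NS:-}" ]; then
    echo "OPERATOR_NS is empty in $file" >&2
    exit 1
  fi
  echo "OPERATOR_NS=$OPERATOR_NS"
)

# Example call:
# check_operator_ns env.properties && ./label-common-service.sh
```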
Manually adding labels to resources
Before you begin, set the namespace where you installed foundational services as the default namespace.
oc project <namespace-where-foundational services-are-installed>
You need to label the currently installed resources to identify them during restoration.
- Add labels to the Licensing service configmaps:
  - Find the licensing namespace:
      oc get pods -A | grep licensing
  - Get the label-licensing-configmaps.sh file.
      wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/licensing/label-licensing-configmaps.sh
  - Add executable permission.
      chmod +x label-licensing-configmaps.sh
  - Run the script.
      ./label-licensing-configmaps.sh <namespace from previous step>
- Add labels to the Cert Manager resources:
  - Get the label-cert-manager.sh file.
      wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/cert-manager/label-cert-manager.sh
  - Add executable permission.
      chmod +x label-cert-manager.sh
  - Run the script.
      ./label-cert-manager.sh
    Note: The label-cert-manager.sh script searches all namespaces for cert manager resources. The search might require elevated privileges to run.
- If you use IBM License Service Reporter, see Backing up the License Service Reporter instance.
- Add a label to the catalog sources. The following are common catalog sources that are used by IBM Cloud Paks:
    oc label catalogsource ibm-operator-catalog foundationservices.cloudpak.ibm.com=catalog -n openshift-marketplace --overwrite=true
    oc label catalogsource opencloud-operators foundationservices.cloudpak.ibm.com=catalog -n openshift-marketplace --overwrite=true
    oc label catalogsource ibm-cert-manager-catalog foundationservices.cloudpak.ibm.com=catalog -n openshift-marketplace --overwrite=true
    oc label catalogsource ibm-licensing-catalog foundationservices.cloudpak.ibm.com=catalog -n openshift-marketplace --overwrite=true
    oc label catalogsource cloud-native-postgresql-catalog foundationservices.cloudpak.ibm.com=catalog -n openshift-marketplace --overwrite=true
  A running cluster might have different catalog sources than those listed in the preceding commands. In that case, the following instructions can help determine which other catalog sources to label:
  - List all available catalog sources:
      oc get catalogsource -A
  - Determine which catalog sources to label with foundationservices.cloudpak.ibm.com=catalog. Label any catalog source with IBM under the PUBLISHER column.
    Note: Some of the preceding catalog sources might not be in use. Catalog sources might be located in a namespace other than openshift-marketplace, in which case the namespace parameter needs to be updated in the preceding commands.
- Add a label to the common-service-maps configmap in the kube-public namespace (if present):
    oc label configmap common-service-maps -n kube-public foundationservices.cloudpak.ibm.com=configmap --overwrite=true
- If using a custom hostname, custom TLS secret, or both, label the cs-onprem-tenant-config configmap in each namespace where it is present:
    oc label configmap cs-onprem-tenant-config foundationservices.cloudpak.ibm.com=configmap --overwrite=true -n <namespace present>
  This configmap can be found by using the command:
    oc get cm -A | grep cs-onprem-tenant-config
- Add a label to the namespaces where you installed foundational services, the namespace where IBM Cert Manager is installed (default namespace is ibm-cert-manager), the namespace where IBM Licensing is installed (default namespace is ibm-licensing), and workload namespaces that use foundational services. Multiple namespaces might have foundational services installed. Make sure to label each of them.
  - Find the namespaces where the services are installed.
    - Find the namespace where IBM Cert Manager is installed:
        oc get pods -A | grep cert-manager
    - Find the namespace where IBM Licensing is installed:
        oc get pods -A | grep licensing
    - Determine the workload namespaces. First, check whether the common-service-maps configmap exists:
        oc get cm common-service-maps -n kube-public
      If the common-service-maps configmap exists, make sure to label each namespace listed in the requested-from-namespace and controlNamespace values:
        oc get cm common-service-maps -n kube-public -o yaml
      If the common-service-maps configmap is not present, label each namespace that is using the common service instance for a smoother restoration process.
  - Label the namespaces.
      oc label namespace <namespace-where-foundational services-is-installed> foundationservices.cloudpak.ibm.com=namespace --overwrite=true
      oc label namespace <namespace-where-cert-manager-is-installed> foundationservices.cloudpak.ibm.com=namespace --overwrite=true
      oc label namespace <namespace-where-licensing-is-installed> foundationservices.cloudpak.ibm.com=namespace --overwrite=true
      oc label namespace <requested-from-namespace> foundationservices.cloudpak.ibm.com=namespace --overwrite=true
      oc label namespace <controlNamespace> foundationservices.cloudpak.ibm.com=namespace --overwrite=true
- Add a label to the operator group:
  - Get the names of the operator groups:
      oc get operatorgroup -A
  - Add a label to each operator group labeled common-service, ibm-cert-manager-operator, or ibm-licensing-operator-app.
      oc label operatorgroup <operatorgroup-name> foundationservices.cloudpak.ibm.com=operatorgroup --overwrite=true -n <namespace>
    There might be more IBM Cloud Pak specific operator groups to label as well.
- Add a label to the IBM Common Service Operator subscription. There is one subscription in each namespace where the ibm-common-service-operator pod is deployed:
  - Determine the subscription name:
      oc get subscription -n <namespace> | grep ibm-common-service-operator
  - Label the subscription:
      oc label subscriptions.operators.coreos.com <ibm common service operator subscription name> foundationservices.cloudpak.ibm.com=subscription --overwrite=true -n <namespace>
- Add the label to the IBM Cert Manager Operator subscription:
  - Check the certificate manager service that is installed in the cluster. If IBM Cert Manager is installed in your cluster, the pod name has ibm-cert-manager-operator in it.
      oc get pods -A | grep cert-manager
    Note: If the cert manager operator pod is named something other than ibm-cert-manager-operator-<alphanumeric characters>, it means that a third-party certificate manager service is installed. Install this third-party cert manager on the target restore cluster before you restore subscriptions.
  - Label the subscription:
      oc label subscriptions.operators.coreos.com ibm-cert-manager-operator foundationservices.cloudpak.ibm.com=singleton-subscription --overwrite=true -n <namespace where cert manager is deployed>
- Add a label to the IBM Licensing Operator subscription:
  - To find the namespace, refer to the Licensing configmaps labeling step earlier in this procedure or use the following command:
      oc get pods -A | grep ibm-licensing
  - Determine the subscription name:
      oc get subscriptions.operators.coreos.com -n <namespace> | grep ibm-licensing-operator
  - Label the subscription:
      oc label subscriptions.operators.coreos.com <IBM licensing subscription name> foundationservices.cloudpak.ibm.com=singleton-subscription --overwrite=true -n <namespace>
- Add a label to the common-service custom resource (CR):
    oc label commonservices common-service foundationservices.cloudpak.ibm.com=commonservice --overwrite=true
  Note: If your cluster has more than one common-service CR (that is, in SOD scenarios), labeling each of them does not negatively impact the restore process.
- Add a label to the commonservices.operator.ibm.com customresourcedefinition (CRD):
    oc label customresourcedefinition commonservices.operator.ibm.com foundationservices.cloudpak.ibm.com=crd --overwrite=true
- Add a label to the entitlement secret, if you have one in your cluster:
  - Find all entitlement keys on the cluster:
      oc get secret -A | grep ibm-entitlement-key
  - Label each entitlement key:
      oc label secret ibm-entitlement-key foundationservices.cloudpak.ibm.com=entitlementkey --overwrite=true -n <namespace>
- Add a label to the global pull secret, if you have one in your cluster:
    oc label secret pull-secret -n openshift-config foundationservices.cloudpak.ibm.com=pull-secret --overwrite=true
- Add a label to the OperandRequests:
  - Find operand requests to label:
      oc get operandrequests -A
  - Label each OperandRequest:
      oc label operandrequests <operand request name> foundationservices.cloudpak.ibm.com=operand --overwrite=true -n <namespace>
    Note: Typically, foundational services OperandRequests are named common-service, so label any OperandRequest with this name. However, Cloud Paks might name their OperandRequests differently, so there might be more OperandRequests to label. Some OperandRequests, such as ibm-iam-request, do not need to be labeled because they are created automatically when certain services are requested. There is no harm in labeling these requests.
- Label the namespacescope CRD and CR:
  Note: The following steps provide instructions for labelling the namespacescope resources, such as the namespacescope CR, subscription, service account, and ConfigMap. If you have more than one instance of foundational services installed on the cluster, label these resources for each operator.
    oc label namespacescope common-service -n <operator namespace> foundationservices.cloudpak.ibm.com=nss --overwrite=true
    oc label customresourcedefinition namespacescopes.operator.ibm.com foundationservices.cloudpak.ibm.com=nss --overwrite=true
- Label the namespacescope subscription:
    oc label subscriptions.operators.coreos.com ibm-namespace-scope-operator -n <operator namespace> foundationservices.cloudpak.ibm.com=nss --overwrite=true
- Label the namespacescope ConfigMap:
    oc label configmap namespace-scope -n <operator namespace> foundationservices.cloudpak.ibm.com=nss --overwrite=true
- Label the namespacescope service account:
    oc label serviceaccount ibm-namespace-scope-operator -n <operator namespace> foundationservices.cloudpak.ibm.com=nss --overwrite=true
- Label the namespacescope roles across namespaces:
  - Run the following command to find all roles:
      oc get role -A | grep nss-managed-role-from
  - Label each role that is returned:
      oc label role <role name> -n <namespace where role is present> foundationservices.cloudpak.ibm.com=nss --overwrite=true
- Label the namespacescope role bindings across namespaces:
  - Run the following command to find all role bindings:
      oc get rolebinding -A | grep nss-managed-role-from
  - Label each role binding that is returned:
      oc label rolebinding <rolebinding name> -n <namespace where rolebinding is present> foundationservices.cloudpak.ibm.com=nss --overwrite=true
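The catalog source step in the preceding procedure asks you to label every catalog source that shows IBM in the PUBLISHER column. The following is a minimal sketch of how that selection could be scripted. Both function names are hypothetical, and the sketch assumes that the publisher is stored in .spec.publisher, as the env.properties comments indicate.

```shell
# Sketch: select the catalog sources whose publisher is IBM and label them.
# filter_ibm_catalogsources and label_ibm_catalogsources are hypothetical names.

# Reads "NAMESPACE NAME PUBLISHER" lines on stdin and prints "NAMESPACE NAME"
# for each line whose publisher column is exactly IBM.
filter_ibm_catalogsources() {
  awk 'NF == 3 && $3 == "IBM" { print $1, $2 }'
}

# Lists catalog sources in all namespaces and applies the catalog label
# to the ones that the filter selects.
label_ibm_catalogsources() {
  oc get catalogsource -A --no-headers \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,PUBLISHER:.spec.publisher |
    filter_ibm_catalogsources |
    while read -r ns name; do
      oc label catalogsource "$name" -n "$ns" \
        foundationservices.cloudpak.ibm.com=catalog --overwrite=true
    done
}

# Example call (requires a logged-in oc session):
# label_ibm_catalogsources
```

Catalog sources without the IBM publisher value, such as the ones that ADDITIONAL_SOURCES covers in the automatic procedure, are skipped by this filter and still need to be labeled by name.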
Back up common-service-db
- Get the common-service-db backup resources.
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-backup-deployment.yaml
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-backup-pvc.yaml
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-br-script-cm.yaml
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-role.yaml
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-rolebinding.yaml
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-sa.yaml
- Update the backup files.
  - Replace <cs-db namespace> with the namespace where the common-service-db instance is running.
  - Replace the <storage class> with the storage class that the current IM deployment uses.
- Add the PVC to the cluster.
    oc apply -f cs-db-backup-pvc.yaml
- Add the cs-db-br-script-cm.yaml file to the correct namespace.
    oc apply -f cs-db-br-script-cm.yaml
- Give the common-service-db backup the necessary permissions.
    oc apply -f cs-db-sa.yaml
    oc apply -f cs-db-role.yaml
    oc apply -f cs-db-rolebinding.yaml
- Add the deployment to the cluster.
    oc apply -f cs-db-backup-deployment.yaml
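The placeholder replacement in the update step can be done in one pass instead of editing each file by hand. The following is a minimal sketch; fill_cs_db_placeholders is a hypothetical helper, and the namespace and storage class in the example call are placeholders for your own values.

```shell
# Sketch: replace <cs-db namespace> and <storage class> in the downloaded files.
# fill_cs_db_placeholders is a hypothetical helper, not part of the IBM scripts.
# Usage: fill_cs_db_placeholders <namespace> <storage class> <file>...
fill_cs_db_placeholders() (
  ns="$1"
  sc="$2"
  shift 2
  for f in "$@"; do
    # Write to a temporary file first so a failed sed leaves the original intact.
    sed -e "s|<cs-db namespace>|$ns|g" -e "s|<storage class>|$sc|g" "$f" > "$f.tmp" &&
      mv "$f.tmp" "$f" || exit 1
  done
)

# Example call (both values are placeholders):
# fill_cs_db_placeholders my-cs-namespace my-storage-class cs-db-*.yaml
```

The sketch relies on namespace and storage class names not containing the | or & characters, which holds for valid Kubernetes names.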
Back up Zen
- Locate zenservice instances.
    oc get zenservice -A
- Label each zenservice.
    oc label zenservice <zenservice name> foundationservices.cloudpak.ibm.com=zen --overwrite=true -n <namespace>
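When several zenservice instances exist, the two preceding steps can be combined into a loop. The following is a minimal sketch; label_all_zenservices is a hypothetical helper name, and it applies the same label command that the procedure shows to every instance that oc returns.

```shell
# Sketch: label every zenservice instance in every namespace.
# label_all_zenservices is a hypothetical helper, not part of the IBM scripts.
label_all_zenservices() {
  oc get zenservice -A --no-headers \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name |
    while read -r ns name; do
      oc label zenservice "$name" -n "$ns" \
        foundationservices.cloudpak.ibm.com=zen --overwrite=true
    done
}

# Example call (requires a logged-in oc session):
# label_all_zenservices
```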
Back up Zen MetastoreDB
Note: Repeat this step for each namespace where a zenservice instance is installed.
- Get the Zen 5 backup resources.
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-backup-deployment.yaml
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-backup-pvc.yaml
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-br-scripts-cm.yaml
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-sa.yaml
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-role.yaml
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-rolebinding.yaml
- Update the backup files.
  In the zen5-backup-pvc.yaml file, replace the following parameters:
  - Replace <zenservice namespace> with the namespace where the zenservice instance is running.
  - Replace the <storage class> with either the storage class that the common-service-db deployment uses, or with any storage class that has the Retain ReclaimPolicy.
  In the zen5-backup-deployment.yaml file, replace all instances of <zenservice namespace> with the namespace where the zenservice instance is running. There are four instances; two are parameters for the velero backup and restore commands.
  By default, the backup and restore commands (represented by .spec.template.metadata.annotations.pre.hook.backup.velero.io/command and .spec.template.metadata.annotations.post.hook.restore.velero.io/command) pass <zenservice namespace> as the first parameter to the scripts that the commands call. Edit the first parameter value of both commands to match the namespace that the deployment is created in.
  By default, the restore command (represented by .spec.template.metadata.annotations.post.hook.restore.velero.io/command) is set to run against the zenservice named <zenservice name>. Update the second parameter to match the name of the zenservice in the target namespace.
  In the zen5-br-scripts-cm.yaml and zen5-sa.yaml files, make sure to replace the namespace value <zenservice namespace> with the namespace of each zenservice instance in use.
- Add the PVC to the cluster.
    oc apply -f zen5-backup-pvc.yaml
- Add the zen5-br-scripts-cm.yaml file to the correct namespace.
    oc apply -f zen5-br-scripts-cm.yaml
- Give the Zen 5 backup the necessary permissions.
  - For each namespace with a zenservice to back up, create a service account. Replace the <zenservice namespace> value before applying.
      oc apply -f zen5-sa.yaml
  - Once per zenservice namespace, apply the Role for the zen backup. Replace the <zenservice namespace> value before applying.
      oc apply -f zen5-role.yaml
  - Create the RoleBinding to connect the ServiceAccount to the Role.
    - Edit the zen5-rolebinding.yaml file to add the ServiceAccount created earlier and replace the <zenservice namespace> value.
        vi zen5-rolebinding.yaml
    - Apply the zen5-rolebinding.yaml file.
        oc apply -f zen5-rolebinding.yaml
- Add the deployment to the cluster.
    oc apply -f zen5-backup-deployment.yaml
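Because the Zen files contain several placeholders, including the ones inside the hook command annotations, a quick scan before running oc apply can catch any that were missed. The following is a minimal sketch; find_unreplaced_placeholders is a hypothetical helper that searches for the three placeholder strings named in the preceding steps.

```shell
# Sketch: report placeholders that are still present in the edited Zen files.
# find_unreplaced_placeholders is a hypothetical helper, not part of the IBM scripts.
# Returns 0 if no placeholder is left, 1 if any is found, 2 if grep fails.
find_unreplaced_placeholders() {
  placeholder_status=0
  # /dev/null is added so that grep always prints the file name with each match.
  grep -n -e '<zenservice namespace>' -e '<zenservice name>' -e '<storage class>' \
    "$@" /dev/null || placeholder_status=$?
  case "$placeholder_status" in
    0) return 1 ;;   # at least one placeholder is left
    1) return 0 ;;   # nothing found
    *) return 2 ;;   # grep itself failed, for example because a file is missing
  esac
}

# Example call:
# find_unreplaced_placeholders zen5-*.yaml && echo "no placeholders left"
```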
Create a backup resource
Create a backup resource for the velero namespace.
- Get the schedule-common-services.yaml file.
    wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/schedule-common-services.yaml
- Update the schedule-common-services.yaml file based on your backup requirements. For more information, see Velero Schedule API Type. By default, the backup runs once a day and is deleted 48 hours later.
  The following configurations in the schedule-common-services.yaml file are important:
  - schedule:, which is a CRON expression. CRON uses the server time, which is usually the Coordinated Universal Time unless configured to be something else.
  - ttl, which is the time to live for the backup.
  - storageLocation, which is the same storage location that you used when you set up OADP. The command oc get backupstoragelocations.velero.io -n <velero namespace> can be used to get the name.
  - velero, which is the namespace where you installed OADP.
- Create the resource.
    oc apply -f schedule-common-services.yaml
- Verify whether the backup schedule was created.
    velero schedule get
  After the first scheduled time passes, you can verify whether the backup ran. Look for a schedule name and timestamp.
    velero backup get
- Verify whether the backup was successful and check the details to see if all resources are saved.
    velero backup describe <__BACKUP_NAME__> --details
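A malformed schedule value is easy to introduce when editing the file by hand. The following is a minimal sketch of a sanity check for the plain five-field CRON form (minute, hour, day of month, month, day of week); is_five_field_cron is a hypothetical helper. It only counts fields, so it does not validate field ranges and it rejects shorthand forms such as @daily.

```shell
# Sketch: check that a schedule string has exactly five whitespace-separated fields.
# is_five_field_cron is a hypothetical helper, not part of Velero or OADP.
# The body runs in a subshell so that "set -f" does not affect the caller.
is_five_field_cron() (
  set -f        # keep * from being expanded as a file glob
  set -- $1     # split the expression on whitespace
  [ "$#" -eq 5 ]
)

# Example call: "0 1 * * *" means 01:00 every day in server time.
# The value is an illustration, not necessarily the one in the downloaded file.
# is_five_field_cron "0 1 * * *" && echo "schedule has five fields"
```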
Restoring foundational services
Complete the following steps to restore foundational services.
Before you restore foundational services, set up Velero on the new cluster. Follow the instructions in the Create the backup resources section.
For troubleshooting issues that may arise during restore, see IBM Cloud Pak foundational services Installation Troubleshooting.
Download the necessary files for restoring different resources:
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-namespace.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-entitlementkey.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-pull-secret.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-catalog.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-operatorgroup.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-configmap.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-crd.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-commonservice.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-subscriptions.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-licensing.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cert-manager.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-operands.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cs-db.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-zen5-data.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-singleton-subscriptions.yaml
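Each restore step in this section proceeds only after the restore status shows as Completed. The following is a minimal sketch of a polling helper for that wait. wait_for_restore is a hypothetical name, and the sketch assumes that the Velero Restore resource reports its state in .status.phase and that OADP is installed in the velero namespace.

```shell
# Sketch: poll a Velero restore until it completes, fails, or times out.
# wait_for_restore is a hypothetical helper, not part of Velero or OADP.
# Usage: wait_for_restore <restore name> [namespace] [attempts] [delay seconds]
# Returns 0 on Completed, 1 on a failed phase, 2 on timeout.
wait_for_restore() (
  name="$1"
  ns="${2:-velero}"
  tries="${3:-60}"
  delay="${4:-10}"
  phase=""
  while [ "$tries" -gt 0 ]; do
    phase="$(oc get restores.velero.io "$name" -n "$ns" \
      -o jsonpath='{.status.phase}' 2>/dev/null)"
    case "$phase" in
      Completed)
        echo "restore $name: Completed"
        exit 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "restore $name: $phase" >&2
        exit 1 ;;
    esac
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "restore $name: timed out, last phase: ${phase:-unknown}" >&2
  exit 2
)

# Example call (the restore name is a placeholder; velero restore get lists the names):
# wait_for_restore <restore name> velero && echo "continue with the next step"
```

On a non-zero return, velero restore describe <__RESTORE_NAME__> --details shows what went wrong.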
- Restore the foundational services namespaces by using the restore-namespace.yaml file.
  - Get the name of the Velero backup that you plan to use for restoring.
      velero backup get
    Replace __BACKUP_NAME__ in the following commands with the Velero backup name.
    Verify whether the backup was successful and check the details to see if all resources are saved.
      velero backup describe <__BACKUP_NAME__> --details
  - Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
      vi restore-namespace.yaml
  - Restore the namespace.
      oc apply -f restore-namespace.yaml
    You can check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
      velero restore get
      velero restore describe <__RESTORE_NAME__> --details
  - Verify whether the namespace is restored. Your namespace must be listed in the command output.
      oc get namespace
    Proceed with the next step after the namespace is restored.
  - Change the default project to the restored common service namespace.
      oc project <namespace-where-foundational services-are-installed>
- Restore the entitlement key.
  - Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
      vi restore-entitlementkey.yaml
  - Restore the entitlement key.
      oc apply -f restore-entitlementkey.yaml
  - Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
      velero restore get
      velero restore describe <__RESTORE_NAME__> --details
  - Verify whether the entitlement key is restored.
      oc get secret
- Restore the pull secret.
  - Save the current pull secret.
      oc get secret pull-secret -n openshift-config -o yaml > original-pull-secret.yaml
  - Delete the current pull secret from the openshift-config namespace.
      oc delete secret pull-secret -n openshift-config
  - Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
      vi restore-pull-secret.yaml
  - Restore the pull secret.
      oc apply -f restore-pull-secret.yaml
  - Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
      velero restore get
      velero restore describe <__RESTORE_NAME__> --details
  - Verify whether the pull secret is restored.
      oc get secret -n openshift-config | grep pull
- Restore the catalog.
  - Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
      vi restore-catalog.yaml
  - Restore the catalog.
      oc apply -f restore-catalog.yaml
  - Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
      velero restore get
      velero restore describe <__RESTORE_NAME__> --details
  - Verify whether the catalog source is restored.
      oc get catalogsource -n openshift-marketplace | grep ibm
  - Verify whether the ibm-operator-catalog pod is running.
      oc get pod -n openshift-marketplace -w
    Note: If using the IBM Cert Manager, IBM Licensing, or cloud-native-postgresql-catalog catalog source, verify that their pods are Running as well.
    If the pods are running, proceed with the next step.
- Restore the operator group.
  - Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
      vi restore-operatorgroup.yaml
  - Restore the operator group.
      oc apply -f restore-operatorgroup.yaml
  - Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
      velero restore get
      velero restore describe <__RESTORE_NAME__> --details
  - Verify whether the operator group is restored.
      oc get operatorgroup
- Restore the common-service-maps configmap.
  - Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
      vi restore-configmap.yaml
  - Restore the configmap.
      oc apply -f restore-configmap.yaml
  - Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
      velero restore get
      velero restore describe <__RESTORE_NAME__> --details
  - Verify whether the configmap is restored.
      oc get configmap common-service-maps -n kube-public
- Restore the commonservices.operator.ibm.com customresourcedefinition (CRD).
  - Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
      vi restore-crd.yaml
  - Restore the CRD.
      oc apply -f restore-crd.yaml
  - Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
      velero restore get
      velero restore describe <__RESTORE_NAME__> --details
  - Verify whether the CRD is restored.
      oc get customresourcedefinition | grep commonservices.operator.ibm.com
- Restore the common-service CR.
  - Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
      vi restore-commonservice.yaml
  - Restore the CR.
      oc apply -f restore-commonservice.yaml
  - Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
      velero restore get
      velero restore describe <__RESTORE_NAME__> --details
  - Verify whether the commonservice is restored.
      oc get commonservice
    If the foundational services are not restored, delete the restore resource and apply it again:
    - Delete the resource.
        oc delete -f restore-commonservice.yaml
    - Restore the CR.
        oc apply -f restore-commonservice.yaml
      Wait for 30 seconds and check again for the CommonService resource.
- Restore the singleton subscriptions.
  - Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
      vi restore-singleton-subscriptions.yaml
  - Restore the subscriptions.
      oc apply -f restore-singleton-subscriptions.yaml
  - Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
      velero restore get
      velero restore describe <__RESTORE_NAME__> --details
  - Watch the namespaces where Cert Manager and License Service are deployed for the Cert Manager and Licensing operators to be running. By default, Cert Manager and License Service are deployed in the ibm-cert-manager and ibm-licensing namespaces.
      oc get pod -n <cs namespace> -w
-
Restore cert manager resource.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.vi restore-cert-manager.yaml
-
Restore the cert manager resource.
oc apply -f restore-cert-manager.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the certificates are restored.
oc get certificates
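The progress check that is repeated in these restore steps can be automated with a small polling loop. This is a sketch under assumptions: the function name `wait_for_restore` is hypothetical, and it assumes that the Velero `Restore` objects can be read with `oc` from the namespace that you pass in and that they expose the phase in `.status.phase`.

```shell
# Hypothetical helper: poll a Velero Restore object until it reaches a final phase.
# Arguments: restore name, namespace of the Restore objects,
# optional number of attempts (default 60), optional delay in seconds (default 10).
wait_for_restore() {
  name="$1"
  ns="$2"
  tries="${3:-60}"
  delay="${4:-10}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    phase=$(oc get restores.velero.io "$name" -n "$ns" -o jsonpath='{.status.phase}' 2>/dev/null)
    case "$phase" in
      Completed)
        echo "$name: Completed"
        return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "$name: $phase" >&2
        return 1 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$name: timed out in phase '${phase:-unknown}'" >&2
  return 1
}
```

A call such as `wait_for_restore <__RESTORE_NAME__> <namespace of the Restore objects>` returns 0 only when the phase is `Completed`, so it can gate the next step in a script. Use `velero restore describe` for the details if it returns a failure.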
-
-
Restore the subscriptions.
-
Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
vi restore-subscriptions.yaml
-
Restore the subscriptions.
oc apply -f restore-subscriptions.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Watch the foundational services namespace until the operand-deployment-lifecycle-manager is running:
oc get pod -n <cs namespace> -w
See the following notes:
- If you are not using IBM Cert Manager, the IBM Common Service Operator deployment fails unless a third-party Cert Manager is installed on the cluster beforehand.
- If you are using SOD, the ibm-common-service-operator is likely not to become ready after you restore the subscriptions, and it subsequently does not deploy ODLM. This is expected and resolves after you run the next step.
Troubleshooting: In case of issues with generating new installation plans for updates or new installations, see OLM is unable to generate new install plans.
-
-
Run
setup_tenant.sh
to set up cluster topology.
-
Get the setup_tenant.sh and utils.sh scripts by running the following commands:
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/cp3pt0-deployment/setup_tenant.sh
mkdir common && cd common
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/cp3pt0-deployment/common/utils.sh
cd ..
Note: This script needs to be run for each instance of foundational services. Each instance should have different namespace values from each other instance (that is, no namespace should be used in two different executions).
-
Run the following commands to make the scripts executable:
chmod +x setup_tenant.sh
chmod +x common/utils.sh
-
Gather the values to run the script. Operator and Services namespaces:
oc get commonservice common-service -o yaml
Locate the values .spec.operatorNamespace and .spec.servicesNamespace.
Note: These values always match unless you are using SOD.
Size:
oc get commonservice common-service -o yaml
Locate the value .spec.size.
Tethered namespaces:
oc get cm common-service-maps -o yaml -n kube-public
Make note of the namespaces under requested-from-namespace. If this configmap does not exist, this value consists of whichever namespaces are going to use this common service instance.
-
Run the script.
Note: If the services and operator namespaces are the same, you must still specify both parameters when you run setup_tenant.sh. Use the same namespace for each. Use the optional parameters -s and -n if you use a catalog source other than opencloud-operators or if the catalog source is in a different namespace, respectively. If everything is deployed to the same namespace (CS operators, CS operands, and Cloud Pak workload), you do not need to use the setup_tenant.sh script and can move on to the next step.
./setup_tenant.sh --operator-namespace <operator namespace> --services-namespace <services namespace> --tethered-namespaces <comma delimited (no spaces) list of Cloud Pak workload namespaces that use this foundational services instance> --license-accept -c v<foundational services version number in use, for example 4.0, 4.1, 4.2> -p <.spec.size value from commonservice cr> -i <install mode, either Manual or Automatic>
-
Wait for script to complete successfully. For more information, see Installing foundational services by using a script.
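The operator namespace, services namespace, and size values that this step gathers for `setup_tenant.sh` can also be read directly instead of being located in the YAML output. This is a minimal sketch: the function name `print_tenant_values` is hypothetical, and it assumes that the `CommonService` CR is named `common-service` and is in the namespace that you pass in.

```shell
# Hypothetical helper: print the CommonService values that setup_tenant.sh needs.
# Argument: the namespace that contains the common-service CommonService CR.
print_tenant_values() {
  ns="$1"
  for field in operatorNamespace servicesNamespace size; do
    # Read one field of .spec at a time with a JSONPath expression.
    value=$(oc get commonservice common-service -n "$ns" -o jsonpath="{.spec.${field}}")
    echo "${field}=${value}"
  done
}
```

The output is one `name=value` line per field. The tethered namespaces still need to be taken from the `common-service-maps` configmap as described in the step.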
-
-
Restore Licensing service configmap.
-
Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
vi restore-licensing.yaml
-
Restore the configmap.
oc apply -f restore-licensing.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the configmap is restored.
oc get configmap | grep licensing
-
-
If you use IBM License Service Reporter, see Backing up the License Service Reporter instance.
-
Restore the OperandRequests and OperandConfigs.
-
If you are restoring an OperandConfig, delete the existing one first.
oc delete operandconfig common-service
-
Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
vi restore-operands.yaml
-
Restore the operands.
oc apply -f restore-operands.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the operands are restored.
oc get operandrequest
oc get operandconfig
-
Verify whether operand requests are reconciled.
Give ODLM time to reconcile the restored operand requests. New operators and their operands should start deploying shortly after the restore completes. Check the status fields of the operand requests and the ODLM logs for any issues.
-
If using a custom hostname, TLS secret, or both, wait for the platform-identity pods to come ready:
-
Verify that the cs-onprem-tenant-config configmap is present:
oc get cm -n <namespace where hostname is changed or custom TLS secret used> | grep cs-onprem-tenant-config
-
Wait for the platform-identity-management, platform-identity-provider, and platform-auth-service pods to come ready in the same namespace.
- Make sure to update the custom hostname to reflect a change in cluster if necessary. For example, the structure of the route is <route name>.cluster1.com. If you are no longer on cluster1 but now on cluster2, the route needs to be updated from <route name>.cluster1.com to <route name>.cluster2.com.
- If using a custom TLS secret, it is best to re-create this secret on the new cluster by using the same name. In this case, if the secret was carried over to the new cluster, it would need to be replaced.
- Follow the instructions here https://www.ibm.com/docs/en/cloud-paks/foundational-services/4.3?topic=cc-updating-custom-hostname-tls-secret-by-using-configmap.
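The hostname change in the example above, from `<route name>.cluster1.com` to `<route name>.cluster2.com`, amounts to swapping the cluster domain suffix. The following sketch only computes the new hostname and does not change any cluster resource. The function name `rewrite_route_host` and the sample domains are hypothetical.

```shell
# Hypothetical helper: replace the cluster domain suffix of a route hostname.
# Arguments: current hostname, old cluster domain, new cluster domain.
rewrite_route_host() {
  host="$1"
  old_domain="$2"
  new_domain="$3"
  case "$host" in
    *".${old_domain}")
      # Strip the old suffix and append the new one.
      echo "${host%".${old_domain}"}.${new_domain}" ;;
    *)
      echo "$host does not end in .${old_domain}" >&2
      return 1 ;;
  esac
}
```

The resulting hostname is the value to use when you follow the linked instructions for updating the custom hostname.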
-
-
-
Restore common-service-db
-
Get the restore object
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cs-db.yaml
-
Substitute the __BACKUP_NAME__ with the name of the backup resource that you created previously.
vi restore-cs-db.yaml
-
Restore the cs-db data.
oc apply -f restore-cs-db.yaml
-
Check restore progress. Proceed with the next step after restore is complete.
velero restore get
-
Check logs of the velero restore to verify that the data was restored
velero restore logs restore-cs-db-data
Troubleshooting: If the logs or the data indicate that the restore was not successful, apply the following workaround:
-
Get the restore job:
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/common-service-db/cs-db-restore-job.yaml
-
Replace <cs-db namespace> with the namespace where the common-service-db instance is running.
-
Delete the existing cs-db-backup deployment and cs-db-backup pod.
oc delete deploy cs-db-backup -n <namespace>
-
Run the restore job.
oc apply -f cs-db-restore-job.yaml
Note: The secondary steps that are listed here must be run only if the restore logs indicate that the restore was not run. If the storage class used on the backup cluster does not match the storage class that is used on the target cluster, the restore fails. Adapting to different storage classes across clusters is a current limitation of velero.
-
-
-
Restore Zen and Zen data.
-
Restore zenservice instances.
-
Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
vi restore-zen.yaml
-
Restore the zenservice instances.
oc apply -f restore-zen.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Wait for the zenservice instances to come ready. Once the Progress field is 100%, the instance is ready. The following command continuously outputs the progress percentage of all the zenservices on the cluster.
oc get zenservice -A -w -o yaml | grep Progress:
Note: If the restored zenservice contains fields to configure zenCustomRoute, do the following:
- Verify that the secret used (if the field exists) is present in the zenservice namespace in the target cluster.
- Update the value in the zenservice CR for the route. For example, the structure of the route is <route name>.cluster1.com. If you are no longer on cluster1 but now on cluster2, the route needs to be updated from <route name>.cluster1.com to <route name>.cluster2.com.
-
-
Restore zen data.
-
Substitute the __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
vi restore-zen5-data.yaml
-
Restore the Zen data.
oc apply -f restore-zen5-data.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as Completed.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Check logs of the velero restore to verify that the data was restored
velero restore logs restore-zen5-data
-
Search for restore_zen5 to find relevant logs. If it is not present, the restore did not run. If the logs or the data indicate that the restore was not successful, the following steps can be taken as a workaround:
-
Get the Zen 5 restore job resource.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/zen/zen5-restore-job.yaml
-
Delete the existing zen5-backup deployment.
oc delete deploy zen5-backup -n <namespace>
-
Wait for the zen5-backup pods to be fully deleted (fully gone, not Terminating).
-
Give the zen5 backup necessary permissions if the necessary ServiceAccount, Role, and RoleBinding are not already present.
-
Check if permissions exist:
oc get sa -n <zenservice namespace> | grep zen5
oc get role -n <zenservice namespace> | grep zen5
oc get rolebinding -n <zenservice namespace> | grep zen5
-
Get the zen5-sa.yaml, zen5-role.yaml, and zen5-rolebinding.yaml files.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-sa.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-role.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-rolebinding.yaml
-
For each namespace with a zenservice to back up, edit the service account file zen5-sa.yaml to deploy in the corresponding namespace, then apply it.
oc apply -f zen5-sa.yaml
-
Once per zenservice namespace, apply the zen5-role.yaml file to create the Role for the zen backup. Replace the <zenservice namespace> value before applying.
oc apply -f zen5-role.yaml
-
Create the RoleBinding to connect the ServiceAccounts to the Role.
-
Edit the zen5-rolebinding.yaml file to add each ServiceAccount created earlier. Replace the <zenservice namespace> value before applying.
vi zen5-rolebinding.yaml
-
Apply the zen5-rolebinding.yaml file.
oc apply -f zen5-rolebinding.yaml
-
-
-
Edit the zen5-restore-job.yaml file. The default namespace is set to zen. The parameters for the underlying restore_zen5.sh default to the zen namespace and the test-zen zenservice name. Update both of these parameters to reflect the proper namespace and zenservice respectively.
-
Apply the zen5-restore-job.yaml file.
oc apply -f zen5-restore-job.yaml
-
Wait for the job to complete, then check the logs of the zen5-restore-job pod to verify that the restore completed.
-
Repeat as needed for each namespace with a zenservice instance installed.
-
-
-
Wait for the zenservice instances to come ready. Once the Progress field is 100%, the instance is ready. The following command continuously outputs the progress percentage of all the zenservices on the cluster.
oc get zenservice -A -w -o yaml | grep Progress:
See the following troubleshooting tips:
- Make sure that there is only one zen5-backup or one zen5-restore-job pod in a namespace at any given time as they compete for the same PVC.
- If the zen5-restore-job pod is stuck in ContainerCreating:
  - Delete the zen5-backup deployment.
  - Make sure that the zen5-backup pod is fully deleted (not Terminating).
  - Delete the zen5-restore-job job and its pod (not Terminating).
  - Ensure that the configmap zen5-br-configmap, PVC zen5-backup-pvc, role zen5-backup-role, rolebinding zen5-backup-rolebinding, and service account zen5-backup-sa are present in the namespace.
  - Reapply the zen5-restore-job yaml.
-
If the configmap zen5-br-configmap is not present, it can be downloaded with the following command:
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-br-scripts-cm.yaml
Make sure to edit the namespace field before applying with the following command:
oc apply -f zen5-br-scripts-cm.yaml
-
Velero restore is less predictable than backup when restoring databases. There is no harm in deleting a Completed velero restore object (that is, restore-cs-db-data or restore-zen5-data), deleting the accompanying deployment and PVC, waiting for these items to be fully deleted, then re-creating the velero restore object to try again. If this still does not work, the secondary instructions that use the cs-db-restore-job.yaml and zen5 restore job can be used on an individual namespace basis. There is no harm in running the restore in a namespace that has already been restored.
-
-
-
If you use a custom route for the restored zenservice and you are restoring to a new cluster, update the value of zenCustomRoute in the zenservice CR to reflect the new hostname and re-trigger the iam-config job. Run the following commands:
oc -n <zenservice namespace> patch zenservice <zenservice name> --type='merge' -p '{"spec":{"zenCustomRoute":{"route_host":"<updated route>"}}}'
oc -n <zenservice namespace> patch zenservice <zenservice name> --type='merge' -p '{"spec":{"reconcile":true}}'
oc get job -n <zenservice namespace> iam-config-job -o json | jq 'del(.spec.selector)' | jq 'del(.spec.template.metadata.labels)' | oc replace --force -f -
All restoration tasks are completed.
Verify whether foundational services are properly restored.
-
Verify the pods:
oc get pods
All pods must be running.
-
Verify the subscriptions:
oc get subscriptions
Subscriptions of all installed services must be listed.
-
Verify that the Identity and Access section of the
cp-console
shows the users and teams that your organization added in the original cluster.
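The pod check in this verification list can be narrowed to the pods that still need attention. This is a sketch under assumptions: the function name `not_ready_pods` is hypothetical, it relies on STATUS being the third column of `oc get pods --no-headers`, and it treats `Completed` pods (for example, finished jobs) as acceptable, which the procedure itself does not state.

```shell
# Hypothetical helper: list pods whose STATUS is neither Running nor Completed.
# Argument: the namespace to check.
not_ready_pods() {
  ns="$1"
  # Columns of `oc get pods --no-headers`: NAME READY STATUS RESTARTS AGE
  oc get pods -n "$ns" --no-headers |
    awk '$3 != "Running" && $3 != "Completed" { print $1 " " $3 }'
}
```

Empty output for each foundational services namespace suggests that no pod is stuck. Any line that is printed names a pod and its current status.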
For backing up and restoring Identity Management (IM) components, see Identity management backup and restore.
For migrating existing OIDC and SAML configurations, see Migrating identity management.
General Troubleshooting:
If a restore process is stuck in the New phase when you view it with velero restore get, restart the velero pod in the namespace where OADP is installed. After the velero pod restarts, the status of the restore process should change to InProgress.