IBM Cloud Pak foundational services backup and restore for coexistence scenario
A coexistence scenario involves multiple instances of foundational services on one cluster with at least one instance on version 3.23.x or 3.19.9 or later, and at least one instance on version 4.0 or later. You can schedule backup and restore of
foundational services by using the Red Hat OpenShift API for Data Protection (OADP) operator. Make sure that you use the stable-1.3
channel of the OADP operator.
Prerequisites
-
Set up any Amazon S3-compatible storage. For example, you can create a bucket in IBM Cloud Object Storage. For more information, see IBM Cloud Object Storage .
-
When you add a service credential to the bucket, include the hash-based message authentication code (HMAC). For more information, see Service credentials . From the Cloud Object Storage navigation menu, gather the following information:
- access key id, which can be found on the Service credentials page that is associated with the bucket.
- secret access key, which can be found on the Service credentials page that is associated with the bucket.
- bucket name, which can be found on the Buckets page.
- bucket region, which can be found on the Buckets page.
- root directory name, which is the path in the bucket where you want to store the backups.
- s3 URL, which is the endpoint URL of the bucket.
  Note: The endpoint URL must start with http:// or https://.
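The endpoint URL rule can be checked before the value is used in any configuration. The following is a minimal sketch; validate_s3_url is a hypothetical helper name and is not part of oc or velero.

```shell
#!/usr/bin/env bash
# Return 0 only when the endpoint URL begins with http:// or https://.
# validate_s3_url is a hypothetical helper, not an oc or velero command.
validate_s3_url() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) echo "endpoint URL must start with http:// or https://: $1" >&2
       return 1 ;;
  esac
}
```

For example, validate_s3_url "https://s3.example.com" succeeds, while a bare host name such as s3.example.com is rejected. The host names here are placeholders.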
Note: If the cluster that is being backed up or restored to uses the s390x architecture, run any velero CLI commands on an alternate cluster that does not use s390x and that has oc access to the original cluster (usually by using oc login). The Velero CLI does not yet support s390x.
Backing up foundational services
Complete the following steps to back up the installed foundational services.
Create the backup resources
You need the following resources for completing the backup procedures.
-
Log in to your OpenShift cluster command-line interface (CLI) by using the oc login command.
-
Create a namespace for Velero objects. The following example creates the velero namespace. For more information about Velero, see Velero documentation.
oc new-project velero
-
Install the Red Hat OADP operator in the velero namespace. For more information, see About installing OADP.
-
Create a secret named cloud-credentials with the access key id and secret access key credentials.
-
Open any editor and place the following credentials in a file named credentials-velero.
vi credentials-velero
-
Insert the following content in the file:
[default]
aws_access_key_id=<access_key_id>
aws_secret_access_key=<secret_access_key>
-
Create the secret.
oc create secret generic cloud-credentials -n velero --from-file cloud=credentials-velero
-
-
From your OpenShift cluster console OperatorHub page, install the OADP operator from the stable-1.3 channel, which provides the Velero 1.12 API. The API is needed for foundational services backup and restore. For more information, see OpenShift Container Platform documentation.
-
Create a DataProtectionApplication object.
Note: The provider is aws even if you are not using AWS Object Storage.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: <resource_name>
  namespace: velero
  annotations:
    argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
    argocd.argoproj.io/sync-wave: '20'
spec:
  backupLocations:
    - velero:
        config:
          profile: default
          region: <bucket_region>
          s3ForcePathStyle: 'true'
          s3Url: <s3_URL>
        credential:
          key: cloud
          name: cloud-credentials
        default: true
        objectStorage:
          bucket: <bucket_name>
          prefix: <root_directory_name>
        provider: aws
  configuration:
    restic:
      enable: true
    velero:
      defaultPlugins:
        - openshift
        - aws
      podConfig:
        resourceAllocations:
          limits:
            cpu: '1'
            memory: 1Gi
          requests:
            cpu: 500m
            memory: 512Mi
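The credentials-velero file from the secret step can also be generated without an interactive editor. The following is a minimal sketch under that assumption; write_velero_credentials is a hypothetical helper name, and the file layout is the [default] profile shown in the secret step.

```shell
#!/usr/bin/env bash
# Write the Velero credentials file with a [default] profile.
# Arguments: <access_key_id> <secret_access_key> [output file]
# write_velero_credentials is a hypothetical helper, not an oc or velero command.
write_velero_credentials() {
  local key_id="$1" secret="$2" out="${3:-credentials-velero}"
  (
    umask 077   # keep the key material readable by the current user only
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
      "$key_id" "$secret" > "$out"
  )
}
```

After the file is written, the oc create secret generic cloud-credentials command from the procedure can be run against it unchanged.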
Add labels to resources
You can add labels to resources automatically by running the script or by manually adding labels. Complete one of the following procedures.
Labelling the resources automatically by running the script
-
Run the following commands to download the env.properties file and the label-common-service.sh script, and save them in the same folder.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/common-service/label-common-service.sh
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/common-service/env.properties
-
Open the env.properties file, edit the required variables, and save the changes.
Note: The OPERATOR_NS="" variable must be properly set for the script to work. Other variables have default values. You can change these values to fit your environment.
vi env.properties
The env.properties file contains the following variables:
# Change the following values to fit your environment.
OPERATOR_NS="" # Set the parameter to the namespace where the foundational services operator is installed.
# Pass the namespace where foundational services are installed.
# Leave the value of this parameter empty if the services are installed in the same namespace as the foundational services operator.
SERVICES_NS=""
# Pass the control namespace if it is needed to be backed up.
CONTROL_NS=""
# Change to the namespace where cert-manager, License Service and License Service Reporter are installed if they are installed in custom namespaces.
CERT_MANAGER_NAMESPACE="ibm-cert-manager"
LICENSING_NAMESPACE="ibm-licensing"
LSR_NAMESPACE="ibm-lsr"
# Change to 1 to enable the private catalog if required.
ENABLE_PRIVATE_CATALOG=0
# Add additional CatalogSources without the ".spec.publisher: IBM" parameter. Separate the CatalogSources with a comma.
# For example: "my-catalog,my-catalog2,my-catalog3"
ADDITIONAL_SOURCES=""
-
Use the following command to run the label-common-service.sh script.
./label-common-service.sh
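Because the script depends on OPERATOR_NS being set, a quick check before running it can catch an empty value. The following is a minimal sketch; operator_ns_is_set is a hypothetical helper name, and the parsing assumes the one-assignment-per-line layout of env.properties.

```shell
#!/usr/bin/env bash
# Succeed only if OPERATOR_NS in the properties file has a non-empty value.
# Argument: [properties file] (defaults to env.properties)
# operator_ns_is_set is a hypothetical helper, not part of the IBM scripts.
operator_ns_is_set() {
  local file="${1:-env.properties}" value
  # Take the last OPERATOR_NS= assignment, drop any trailing comment and quotes.
  value="$(sed -n 's/^OPERATOR_NS=//p' "$file" | tail -n 1 \
    | sed -e 's/[[:space:]]*#.*$//' -e 's/"//g')"
  [ -n "$value" ]
}
```

A possible use is operator_ns_is_set env.properties && ./label-common-service.sh, so that the labelling script runs only when the variable is filled in.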
Manually adding labels to resources
Before you begin, set the namespace where you installed foundational services as the default namespace.
oc project <namespace-where-foundational services-are-installed>
You must label the currently installed resources to identify them during restoration.
-
Add labels to the Licensing service configmaps:
-
Find the licensing namespace:
oc get pods -A | grep licensing
-
Get the label-licensing-configmaps.sh file.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/licensing/label-licensing-configmaps.sh
-
Add executable permission.
chmod +x label-licensing-configmaps.sh
-
Run the script.
./label-licensing-configmaps.sh <namespace from previous step>
-
-
Add labels to the Cert Manager resources:
-
Get the label-cert-manager.sh file.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/backup/cert-manager/label-cert-manager.sh
-
Add executable permission.
chmod +x label-cert-manager.sh
-
Run the script.
./label-cert-manager.sh
Note: The label-cert-manager.sh script searches all namespaces for cert manager resources. The search might require elevated privileges to run.
-
-
If you use IBM License Service Reporter, see Backing up the License Service Reporter instance.
-
Add a label to the catalog sources. The following catalog sources are commonly used by IBM Cloud Paks:
oc label catalogsource ibm-operator-catalog foundationservices.cloudpak.ibm.com=catalog -n openshift-marketplace --overwrite=true
oc label catalogsource opencloud-operators foundationservices.cloudpak.ibm.com=catalog -n openshift-marketplace --overwrite=true
oc label catalogsource ibm-cert-manager-catalog foundationservices.cloudpak.ibm.com=catalog -n openshift-marketplace --overwrite=true
oc label catalogsource ibm-licensing-catalog foundationservices.cloudpak.ibm.com=catalog -n openshift-marketplace --overwrite=true
oc label catalogsource cloud-native-postgresql-catalog foundationservices.cloudpak.ibm.com=catalog -n openshift-marketplace --overwrite=true
A running cluster might have different catalog sources than those listed in the preceding command. In that case, the following instructions can help determine which other catalog sources to label:
- List all available catalog sources:
oc get catalogsource -A
-
Determine which catalog sources to label with foundationservices.cloudpak.ibm.com=catalog. Label any catalog source with IBM under the PUBLISHER column.
Note: Some of the preceding catalog sources might not be in use. Catalog sources might be located in a namespace other than openshift-marketplace, in which case the namespace parameter must be updated in the preceding command.
-
Add a label to the common-service-maps configmap in the kube-public namespace:
oc label configmap common-service-maps -n kube-public foundationservices.cloudpak.ibm.com=configmap --overwrite=true
-
If you use a custom hostname, a custom TLS secret, or both, label the cs-onprem-tenant-config configmap in each namespace where it is present:
oc label configmap cs-onprem-tenant-config foundationservices.cloudpak.ibm.com=configmap --overwrite=true -n <namespace present>
This configmap can be found by using the command:
oc get cm -A | grep cs-onprem-tenant-config
-
Add a label to the namespaces where you installed foundational services, the namespace where IBM Cert Manager is installed (default namespace is ibm-cert-manager), the namespace where IBM Licensing is installed (default namespace is ibm-licensing), and workload namespaces that use foundational services. Foundational services might be installed in multiple namespaces. Make sure to label each of them.
-
Find the namespaces where the services are installed.
-
Find the namespace where IBM Cert Manager is installed:
oc get pods -A | grep cert-manager
-
Find the namespace where IBM Licensing is installed:
oc get pods -A | grep licensing
-
Make sure to label each namespace listed in the requested-from-namespace, map-to-common-service-namespace, and controlNamespace values:
oc get cm common-service-maps -n kube-public -o yaml
-
-
Label the namespaces.
oc label namespace <namespace-where-foundational services-is-installed> foundationservices.cloudpak.ibm.com=namespace --overwrite=true
oc label namespace <namespace-where-cert-manager-is-installed> foundationservices.cloudpak.ibm.com=namespace --overwrite=true
oc label namespace <namespace-where-licensing-is-installed> foundationservices.cloudpak.ibm.com=namespace --overwrite=true
oc label namespace <requested-from-namespace> foundationservices.cloudpak.ibm.com=namespace --overwrite=true
oc label namespace <controlNamespace> foundationservices.cloudpak.ibm.com=namespace --overwrite=true
-
-
Add a label to the operator groups:
-
Get the names of the operator groups:
oc get operatorgroup -A
-
Add a label to each operator group named common-service, ibm-cert-manager-operator, or ibm-licensing-operator-app.
oc label operatorgroup <operatorgroup-name> foundationservices.cloudpak.ibm.com=operatorgroup --overwrite=true -n <namespace>
There might be more Cloud Pak specific operator groups to label as well.
-
-
Add a label to the IBM Common Service Operator subscription. There is one in each namespace where the ibm-common-service-operator pod is deployed:
-
Determine the subscription name:
oc get subscriptions.operators.coreos.com -n <namespace> | grep ibm-common-service-operator
-
Label the subscription:
oc label subscriptions.operators.coreos.com <ibm common service operator subscription name> foundationservices.cloudpak.ibm.com=subscription --overwrite=true -n <namespace>
-
-
Add a label to the IBM Cert Manager Operator subscription:
-
Check the certificate manager service that is installed in the cluster. If IBM Cert Manager is installed in your cluster, the pod name has ibm-cert-manager-operator in it.
oc get pods -A | grep cert-manager
Note: In a coexistence scenario, there are likely two ibm-cert-manager-operator pods: one is 4.x, and the other is 3.23.x/3.19.x and is always deployed to the control namespace. The subscription to back up is not in the control namespace; it is in the other namespace (default is ibm-cert-manager). If the cert manager operator pod is named something other than ibm-cert-manager-operator-<alphanumeric characters>, a third-party certificate manager service is installed. Install this third-party cert manager on the target restore cluster before you restore subscriptions.
-
Label the subscription:
oc label subscriptions.operators.coreos.com ibm-cert-manager-operator foundationservices.cloudpak.ibm.com=singleton-subscription --overwrite=true -n <namespace where cert manager is deployed>
-
-
Add a label to the IBM Licensing Operator subscription:
-
To find the namespace, refer to the earlier step that labels the Licensing service configmaps, or use the following command:
oc get pods -A | grep ibm-licensing
-
Determine the subscription name:
oc get subscriptions.operators.coreos.com -n <namespace> | grep ibm-licensing-operator
-
Label the subscription:
oc label subscriptions.operators.coreos.com <IBM licensing subscription name> foundationservices.cloudpak.ibm.com=singleton-subscription --overwrite=true -n <namespace>
-
-
Add a label to the common-service custom resource (CR):
oc label commonservices common-service foundationservices.cloudpak.ibm.com=commonservice --overwrite=true -n <namespace>
Note: This needs to be done for each instance of the CommonService CR on the cluster. There is one for each instance of foundational services. There are two if you use SOD: one in the operator namespace and one in the services namespace. Label both custom resources.
-
Add a label to the commonservices.operator.ibm.com customresourcedefinition (CRD):
oc label customresourcedefinition commonservices.operator.ibm.com foundationservices.cloudpak.ibm.com=crd --overwrite=true
-
Add a label to the entitlement secret, if you have one in your cluster:
- Find all entitlement keys on the cluster:
oc get secret -A | grep ibm-entitlement-key
- Label each entitlement key:
oc label secret ibm-entitlement-key foundationservices.cloudpak.ibm.com=entitlementkey --overwrite=true -n <namespace>
-
Add a label to the global pull secret, if you have one in your cluster:
oc label secret pull-secret -n openshift-config foundationservices.cloudpak.ibm.com=pull-secret --overwrite=true
-
Add a label to the OperandRequests:
-
Find operand requests to label:
oc get operandrequests -A
-
Label each OperandRequest:
oc label operandrequests <operand request name> foundationservices.cloudpak.ibm.com=operand --overwrite=true -n <namespace>
Note: Typically, foundational services OperandRequests are named common-service, so label any OperandRequest with this name. However, there might be more OperandRequests to label because Cloud Paks might name their OperandRequests differently. Some OperandRequests, such as ibm-iam-request, do not need to be labeled because they are created automatically when certain services are requested. There is no harm in labeling these requests.
-
-
Label the namespacescope CRD and CR:
Note: The following steps provide instructions for labelling the namespacescope resources, such as the namespacescope CR, subscription, service account, and ConfigMap. If you have more than one instance of foundational services installed on the cluster, label these resources for each operator.
oc label namespacescope common-service -n <operator namespace> foundationservices.cloudpak.ibm.com=nss --overwrite=true
oc label customresourcedefinition namespacescopes.operator.ibm.com foundationservices.cloudpak.ibm.com=nss --overwrite=true
-
Label the namespacescope subscription:
oc label subscriptions.operators.coreos.com ibm-namespace-scope-operator -n <operator namespace> foundationservices.cloudpak.ibm.com=nss --overwrite=true
-
Label the namespacescope ConfigMap:
oc label configmap namespace-scope -n <operator namespace> foundationservices.cloudpak.ibm.com=nss --overwrite=true
-
Label the namespacescope service account:
oc label serviceaccount ibm-namespace-scope-operator -n <operator namespace> foundationservices.cloudpak.ibm.com=nss --overwrite=true
-
Label the namespacescope roles across namespaces:
-
Run the following command to find all roles:
oc get role -A | grep nss-managed-role-from
-
Label each role that is returned:
oc label role <role name> -n <namespace where role is present> foundationservices.cloudpak.ibm.com=nss --overwrite=true
-
-
Label the namespacescope role bindings across namespaces:
-
Run the following command to find all role bindings:
oc get rolebinding -A | grep nss-managed-role-from
-
Label each role binding that is returned:
oc label rolebinding <rolebinding name> -n <namespace where rolebinding is present> foundationservices.cloudpak.ibm.com=nss --overwrite=true
-
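Several of the manual steps above repeat one oc label command per listed resource. The following is a minimal sketch that turns listing output into the corresponding commands; it only prints the commands so that they can be reviewed before they are run. The function names are hypothetical, and the custom-columns listing in the comment is an assumption about how the input is produced.

```shell
#!/usr/bin/env bash
# Print (do not run) label commands for catalog sources whose publisher is IBM.
# Input on stdin: "<namespace> <name> <publisher>" rows, for example from
#   oc get catalogsource -A --no-headers \
#     -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,PUB:.spec.publisher
ibm_catalog_label_cmds() {
  local ns name publisher
  while read -r ns name publisher; do
    [ "$publisher" = "IBM" ] || continue
    echo "oc label catalogsource $name foundationservices.cloudpak.ibm.com=catalog -n $ns --overwrite=true"
  done
}

# Print label commands for namespacescope-managed roles or role bindings.
# Argument: kind (role or rolebinding).
# Input on stdin: rows of "oc get <kind> -A --no-headers"
# (namespace in column 1, name in column 2).
nss_label_cmds() {
  local kind="$1" ns name _rest
  while read -r ns name _rest; do
    case "$name" in
      *nss-managed-role-from*)
        echo "oc label $kind $name -n $ns foundationservices.cloudpak.ibm.com=nss --overwrite=true" ;;
    esac
  done
}
```

For example, oc get role -A --no-headers | nss_label_cmds role prints one command per matching role. Piping the printed commands to sh runs them.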
Back up MongoDB (CS v3.19.x to v4.5.x)
Set up the MongoDB backup deployment. The deployment triggers and holds the MongoDB database backup. During restore, the deployment also triggers the restore of the MongoDB database.
Note: Repeat this step for each namespace where foundational services is installed.
- Get the mongodb-backup-pvc.yaml and mongodb-backup-deployment.yaml files. Place them in the same directory.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/mongodb-backup-pvc.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/mongodb-backup-deployment.yaml
-
Update the backup files.
In the mongodb-backup-pvc.yaml file, replace the following parameters:
- Replace <mongo namespace> with the namespace where the MongoDB deployment that foundational services uses is running.
- Replace <storage class> with either the storage class that the MongoDB deployment uses, or with any storage class that has the Retain ReclaimPolicy.
In the mongodb-backup-deployment.yaml file, replace both instances of <mongo namespace> with the namespace where the MongoDB deployment that foundational services uses is running.
-
Add the PVC to the cluster.
oc apply -f mongodb-backup-pvc.yaml
-
Add the deployment to the cluster.
oc apply -f mongodb-backup-deployment.yaml
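The placeholder replacement in the MongoDB backup files can be scripted instead of edited by hand, which helps when the step is repeated for several namespaces. The following is a minimal sketch; fill_mongo_placeholders is a hypothetical helper name, and it prints the result instead of editing the file in place.

```shell
#!/usr/bin/env bash
# Print a copy of a backup manifest with its placeholders filled in.
# Arguments: <file> <mongo namespace> <storage class>
# Assumes the values contain no "|" or "&" characters, which holds for
# valid Kubernetes namespace and storage class names.
fill_mongo_placeholders() {
  local file="$1" ns="$2" sc="$3"
  sed -e "s|<mongo namespace>|${ns}|g" \
      -e "s|<storage class>|${sc}|g" "$file"
}
```

A possible use is fill_mongo_placeholders mongodb-backup-pvc.yaml <namespace> <storage class> | oc apply -f -, which applies the filled-in manifest without changing the downloaded file.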
Back up common-service-db (CS v4.6 and newer)
-
Get the common-service-db backup resources.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-backup-deployment.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-backup-pvc.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-br-script-cm.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-role.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-rolebinding.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/velero/schedule/common-service-db/cs-db-sa.yaml
-
Update the backup files.
- Replace <cs-db namespace> with the namespace where the common-service-db instance is running.
- Replace <storage class> with the storage class that the current IM deployment uses.
-
Add the PVC to the cluster.
oc apply -f cs-db-backup-pvc.yaml
-
Add the cs-db-br-script-cm.yaml file to the correct namespace.
oc apply -f cs-db-br-script-cm.yaml
-
Give the common-service-db backup the necessary permissions.
oc apply -f cs-db-sa.yaml
oc apply -f cs-db-role.yaml
oc apply -f cs-db-rolebinding.yaml
-
Add the deployment to the cluster.
oc apply -f cs-db-backup-deployment.yaml
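The common-service-db manifests are applied in a fixed order: PVC, script ConfigMap, service account, role, role binding, then the deployment. The following is a minimal sketch that applies them in that order and stops at the first failure. The function name apply_cs_db_backup is hypothetical, and the sketch assumes that the files are in the current directory with their placeholders already replaced.

```shell
#!/usr/bin/env bash
# Apply the common-service-db backup manifests in the documented order,
# stopping at the first failure.
# apply_cs_db_backup is a hypothetical helper, not an oc subcommand.
apply_cs_db_backup() {
  local f
  for f in cs-db-backup-pvc.yaml cs-db-br-script-cm.yaml cs-db-sa.yaml \
           cs-db-role.yaml cs-db-rolebinding.yaml cs-db-backup-deployment.yaml; do
    oc apply -f "$f" || return 1
  done
}
```

Stopping early keeps the deployment from being created before the service account and role that it depends on exist.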
Back up Zen v4 (CS v3.23.x or 3.19.x)
Note: The following instructions must be used only for Zen instances that use foundational services version v3.23.x or 3.19.x.
-
If Zen is installed, add the foundationservices.cloudpak.ibm.com=zen label to the ZenService and Zen OperandRequest.
-
Find the Zen service and namespace.
oc get zenservice -A
-
Verify the version of Zen:
oc get zenservice -o jsonpath='{.items[*].status.currentVersion}' -n <zenservice namespace>
If the version is earlier than 5, continue with the following steps. If the version is 5 or later, follow the instructions under Back up Zen v5 (CS v4.x).
-
Find the Zen OperandRequest. The | grep zen filter is used in the following command on the assumption that your Zen OperandRequest name has the word zen in it.
oc get operandrequest -A | grep zen
Note: Repeat this step for each namespace where a zenservice that uses foundational services version 3.23.x or 3.19.x is installed.
-
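The version check above decides which Zen procedure applies: versions earlier than 5 use the Zen v4 steps, and version 5 or later uses the Zen v5 steps. The following is a minimal sketch of that decision; zen_backup_path is a hypothetical helper name, and it assumes that the reported version is a plain dotted number such as 4.8.2.

```shell
#!/usr/bin/env bash
# Map a Zen version string (for example "4.8.2") to the procedure to follow.
# Prints "zen-v4" or "zen-v5"; fails for anything that is not a dotted number.
# zen_backup_path is a hypothetical helper name.
zen_backup_path() {
  local major="${1%%.*}"   # text before the first dot
  case "$major" in
    ''|*[!0-9]*) echo "unrecognized version: $1" >&2; return 1 ;;
  esac
  if [ "$major" -lt 5 ]; then
    echo "zen-v4"
  else
    echo "zen-v5"
  fi
}
```

When a cluster hosts several zenservice instances, the check is run once per instance because the versions can differ.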
Back up Zen v4 Data (CS 3.23.x/3.19.x)
The procedure is applicable for Zen deployments with Zen v4 or earlier. If you use Zen v5 or later, see Back up Zen v5 (CS v4.x). The Zen version might differ between zenservice instances if multiple instances are present on the same cluster.
-
Get the necessary files for zen backup:
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen-backup-pvc.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen-backup-deployment.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-br-scripts.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-sa.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-rolebinding.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-role.yaml
-
Update the backup files.
In the zen-backup-pvc.yaml file, replace the following parameters:
- Replace <zenservice namespace> with the namespace where the zenservice instance is running.
- Replace <storage class> with the storage class of the common service db or MongoDB deployment, or with a storage class that has the Retain ReclaimPolicy.
In the zen-backup-deployment.yaml file, replace all instances of <zenservice namespace> with the namespace where the zenservice instance is deployed. There are two parameters for the velero backup and restore commands.
By default, the backup and restore commands (.spec.template.metadata.annotations.pre.hook.backup.velero.io/command and .spec.template.metadata.annotations.post.hook.restore.velero.io/command) pass <zenservice namespace> as the first parameter to the scripts that they call. Edit the first parameter values to match the namespace where the zenservice is deployed.
By default, the restore command (.spec.template.metadata.annotations.post.hook.restore.velero.io/command) is set to run against the zenservice named <zenservice name>. Update the second parameter to match the name of the zenservice in the target namespace.
In zen4-br-scripts.yaml and zen4-sa.yaml, make sure to replace <zenservice namespace> with the namespace where each instance of zenservice is deployed.
-
Add the PVC to the cluster.
oc apply -f zen-backup-pvc.yaml
-
Add zen4-br-scripts.yaml to the correct namespace.
oc apply -f zen4-br-scripts.yaml
-
Give the necessary permissions for the Zen 4 backup.
-
For each namespace with a zenservice to back up, create a service account. Replace <zenservice namespace> with the namespace where you deployed the zenservice.
oc apply -f zen4-sa.yaml
-
Apply the Role for the Zen backup for each zenservice namespace. Replace <zenservice namespace> with the namespace where you deployed the zenservice.
oc apply -f zen4-role.yaml
-
Create the RoleBinding to connect the ServiceAccount to the Role.
-
Edit the zen4-rolebinding.yaml file to add the ServiceAccount created earlier and replace <zenservice namespace> with the namespace where you deployed the zenservice.
vi zen4-rolebinding.yaml
-
-
Apply the zen4-rolebinding.yaml file.
oc apply -f zen4-rolebinding.yaml
-
-
Add the deployment to the cluster.
oc apply -f zen-backup-deployment.yaml
-
Repeat the previous steps for each namespace with a zenservice instance.
Back up Zen v5 (CS v4.x)
Note: The following instructions must only be used for Zen instances that use foundational services version v4.x.
-
Locate zenservice instances.
oc get zenservice -A
-
Label each zenservice.
oc label zenservice <zenservice name> foundationservices.cloudpak.ibm.com=zen --overwrite=true -n <namespace>
Back up Zen MetastoreDB v5
Note: Repeat this step for each namespace where a zenservice instance is installed.
-
Get the Zen 5 backup resource.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-backup-deployment.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-backup-pvc.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-br-scripts-cm.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-sa.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-role.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-rolebinding.yaml
-
Update the backup files.
In the zen5-backup-pvc.yaml file, replace the following parameters:
- Replace <zenservice namespace> with the namespace where the zenservice instance is running.
- Replace <storage class> with either the storage class that the common service db or MongoDB deployment uses, or with any storage class that has the Retain ReclaimPolicy.
In the zen5-backup-deployment.yaml file, replace all instances of <zenservice namespace> with the namespace where the zenservice instance is running. There are four instances; two are parameters for the velero backup and restore commands.
By default, the backup and restore commands (.spec.template.metadata.annotations.pre.hook.backup.velero.io/command and .spec.template.metadata.annotations.post.hook.restore.velero.io/command) pass <zenservice namespace> as the first parameter to the scripts that they call. Edit the first parameter value of both commands to match the namespace that the deployment is created in.
By default, the restore command (.spec.template.metadata.annotations.post.hook.restore.velero.io/command) is set to run against the zenservice named <zenservice name>. Update the second parameter to match the name of the zenservice in the target namespace.
In zen5-br-scripts-cm.yaml and zen5-sa.yaml, make sure to replace <zenservice namespace> with the namespace of each zenservice instance in use.
-
Add the PVC to the cluster.
oc apply -f zen5-backup-pvc.yaml
-
Add the zen5-br-scripts-cm.yaml file to the correct namespace.
oc apply -f zen5-br-scripts-cm.yaml
-
Give the Zen 5 backup the necessary permissions.
-
For each namespace with a zenservice to back up, create a service account. Replace the <zenservice namespace> value before applying.
oc apply -f zen5-sa.yaml
-
Once per zenservice namespace, apply the Role for the Zen backup. Replace the <zenservice namespace> value before applying.
oc apply -f zen5-role.yaml
-
Create the RoleBinding to connect the ServiceAccount to the Role.
-
Edit the zen5-rolebinding.yaml file to add the ServiceAccount created earlier and replace the <zenservice namespace> value.
vi zen5-rolebinding.yaml
-
Apply the zen5-rolebinding.yaml file.
oc apply -f zen5-rolebinding.yaml
-
-
-
Add the deployment to the cluster.
oc apply -f zen5-backup-deployment.yaml
Create a backup resource
Create a backup resource for the velero namespace.
-
Get the schedule-common-services.yaml file.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/schedule-common-services.yaml
-
Update the schedule-common-services.yaml file based on your backup requirements. For more information, see Velero Schedule API Type. By default, the backup runs once a day and is deleted 48 hours later.
The following configurations in the schedule-common-services.yaml file are important:
- schedule, which is a CRON expression. CRON uses the server time, which is usually Coordinated Universal Time (UTC) unless configured otherwise.
- ttl, which is the time to live for the backup.
- storageLocation, which is the same storage location that you used when you set up OADP. You can get the name by using the oc get backupstoragelocations.velero.io -n <velero namespace> command.
- velero, which is the namespace where you installed OADP.
-
Create the resource.
oc apply -f schedule-common-services.yaml
-
Verify whether the backup schedule was created.
velero schedule get
After the first scheduled time passes, you can verify whether the backup ran. Look for a schedule name and timestamp.
velero backup get
-
Verify whether the backup was successful and check the details to see if all resources are saved.
velero backup describe <__BACKUP_NAME__> --details
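On clusters where the velero CLI is not available (for example, s390x), the backup result can also be read from the Backup custom resource with oc. The following is a minimal sketch; backup_completed is a hypothetical helper name, and the velero namespace default and the VELERO_NS override variable are assumptions.

```shell
#!/usr/bin/env bash
# Succeed only when the named Velero backup reports phase "Completed".
# Argument: <backup name>
# Assumes Velero objects live in the "velero" namespace unless VELERO_NS is set.
# backup_completed is a hypothetical helper, not a velero subcommand.
backup_completed() {
  local name="$1" ns="${VELERO_NS:-velero}" phase
  phase="$(oc get backups.velero.io "$name" -n "$ns" \
    -o jsonpath='{.status.phase}')" || return 1
  echo "backup $name phase: ${phase:-unknown}"
  [ "$phase" = "Completed" ]
}
```

A phase such as PartiallyFailed or Failed makes the function return a nonzero status, which is a cue to inspect the backup details before relying on it for a restore.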
Restoring foundational services
Complete the following steps to restore foundational services.
Before you restore foundational services, set up Velero on the new cluster. Follow the instructions in the Create the backup resources section.
For troubleshooting issues that may arise during restore, see IBM Cloud Pak foundational services Installation Troubleshooting.
Download the necessary files for restoring different resources:
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-namespace.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-entitlementkey.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-pull-secret.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-catalog.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-operatorgroup.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-configmap.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-crd.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-commonservice.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-subscriptions.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-licensing.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cert-manager.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-operands.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-mongo-data.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cs-db.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-zen5-data.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-zen-data.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-singleton-subscriptions.yaml
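Each restore step that follows has the same shape: substitute __BACKUP_NAME__ in a restore file, apply the file, and wait for the restore to show Completed. The following is a minimal sketch of the substitute and wait parts. The function names, the velero namespace default, the VELERO_NS override variable, and the polling interval are assumptions.

```shell
#!/usr/bin/env bash
# Print a restore manifest with __BACKUP_NAME__ replaced by the backup name.
# Arguments: <restore file> <backup name>
render_restore() {
  sed "s|__BACKUP_NAME__|$2|g" "$1"
}

# Poll a Velero restore until it reports "Completed". Return nonzero on a
# failed phase or when the attempts run out.
# Arguments: <restore name> [attempts, default 60] [seconds between attempts, default 10]
wait_for_restore() {
  local name="$1" ns="${VELERO_NS:-velero}" tries="${2:-60}" delay="${3:-10}" phase
  while [ "$tries" -gt 0 ]; do
    phase="$(oc get restores.velero.io "$name" -n "$ns" \
      -o jsonpath='{.status.phase}' 2>/dev/null)"
    case "$phase" in
      Completed) return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "restore $name: $phase" >&2; return 1 ;;
    esac
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "restore $name: timed out" >&2
  return 1
}
```

A possible use is render_restore restore-namespace.yaml <backup name> | oc apply -f -, followed by wait_for_restore with the restore name that velero restore get reports.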
-
Restore the foundational services namespaces by using the restore-namespace.yaml file.
-
Get the name of the Velero backup that you plan to use for restoring.
velero backup get
Replace __BACKUP_NAME__ in the following commands with the Velero backup name.
Verify whether the backup was successful and check the details to see if all resources are saved.
velero backup describe <__BACKUP_NAME__> --details
-
Substitute __BACKUP_NAME__ with the name of the backup resource that you created in a previous step.
vi restore-namespace.yaml
-
Restore the namespaces.
oc apply -f restore-namespace.yaml
You can check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the namespaces are restored. Your namespaces must be listed in the command output.
oc get namespace
Proceed with the next step after the namespaces are restored.
-
Change the default project to the restored common service namespace.
oc project <namespace-where-foundational-services-are-installed>
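The placeholder substitution that is done in vi repeats for every restore file in this procedure. The following is a minimal sketch of a non-interactive alternative. The function name `set_backup_name` is hypothetical, and the sketch assumes that the backup name contains only lowercase letters, digits, hyphens, and periods, so that it is safe inside a sed expression.

```shell
# Replace every __BACKUP_NAME__ placeholder in a restore file.
# Usage: set_backup_name <restore file> <velero backup name>
# Assumption: the backup name has no "/" or "&" characters.
set_backup_name() {
  file="$1"
  backup="$2"
  if [ ! -f "$file" ] || [ -z "$backup" ]; then
    echo "usage: set_backup_name <restore file> <backup name>" >&2
    return 1
  fi
  tmp="$(mktemp)" || return 1
  rc=0
  # Render into a temporary file first so that a failed sed cannot truncate the original.
  sed "s/__BACKUP_NAME__/${backup}/g" "$file" > "$tmp" && cat "$tmp" > "$file" || rc=1
  rm -f "$tmp"
  return "$rc"
}
```

For example, `set_backup_name restore-entitlementkey.yaml <__BACKUP_NAME__>` updates the file in place, after which you can apply it with `oc apply -f`.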
-
-
Restore the entitlement key.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-entitlementkey.yaml
-
Restore the entitlement key.
oc apply -f restore-entitlementkey.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the entitlement key is restored.
oc get secret
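Every restore step in this procedure waits for the restore status to show as Completed before it continues. The following sketch automates that wait by polling the Restore object instead of rerunning `velero restore get` by hand. It assumes that OADP is installed in the `openshift-adp` namespace and that the Restore objects can be read as `restores.velero.io`; the function names are hypothetical.

```shell
# Print the phase of a Velero Restore object (for example InProgress or Completed).
# Assumption: OADP is installed in the namespace that is passed as $2.
get_restore_phase() {
  oc get restores.velero.io "$1" -n "$2" -o jsonpath='{.status.phase}'
}

# Poll until the restore is Completed.
# Usage: wait_for_restore <restore name> [namespace] [max attempts] [seconds between attempts]
# Returns 0 on Completed, 1 on a failure phase, 2 if the attempts run out.
wait_for_restore() {
  name="$1"
  ns="${2:-openshift-adp}"
  attempts="${3:-60}"
  delay="${4:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    phase="$(get_restore_phase "$name" "$ns")"
    case "$phase" in
      Completed)
        return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "restore $name ended in phase $phase" >&2
        return 1 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  echo "restore $name did not complete after $attempts checks" >&2
  return 2
}
```

Run `wait_for_restore <__RESTORE_NAME__>` after each `oc apply -f` of a restore file, and continue only when it returns 0. If it returns 1, inspect the restore with `velero restore describe <__RESTORE_NAME__> --details`.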
-
-
Restore the pull secret.
-
Save the current pull secret.
oc get secret pull-secret -n openshift-config -o yaml > original-pull-secret.yaml
-
Delete the current pull secret from the
openshift-config
namespace.
oc delete secret pull-secret -n openshift-config
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-pull-secret.yaml
-
Restore the pull secret.
oc apply -f restore-pull-secret.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the pull secret is restored.
oc get secret -n openshift-config | grep pull
-
-
Restore the catalog.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-catalog.yaml
-
Restore the catalog.
oc apply -f restore-catalog.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the catalog source is restored.
oc get catalogsource -n openshift-marketplace | grep ibm
-
Verify whether the
ibm-operator-catalog
pod is running.
oc get pod -n openshift-marketplace -w
Note: If you use IBM Cert Manager, IBM Licensing, or the
cloud-native-postgresql-catalog
catalog source, verify that their pods are
Running
as well.
If the pods are running, proceed with the next step.
-
-
Restore the operator groups.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-operatorgroup.yaml
-
Restore the operator groups.
oc apply -f restore-operatorgroup.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the operator groups are restored.
oc get operatorgroup -A
-
-
Restore the
common-service-maps
configmap.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-configmap.yaml
-
Restore the configmap.
oc apply -f restore-configmap.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the configmap is restored.
oc get configmap common-service-maps -n kube-public
-
-
Restore the
commonservices.operator.ibm.com
customresourcedefinition (CRD).
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-crd.yaml
-
Restore the CRD.
oc apply -f restore-crd.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the CRD is restored.
oc get customresourcedefinition | grep commonservices.operator.ibm.com
-
-
Restore the
CommonService
CRs.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-commonservice.yaml
-
Restore the CRs.
oc apply -f restore-commonservice.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the
CommonService
CRs are restored.
oc get commonservice -A
If a
CommonService
CR is not restored, delete the restore resource and apply it again:
-
Delete the resource.
```cmd
oc delete -f restore-commonservice.yaml
```
-
Restore the CR.
```cmd
oc apply -f restore-commonservice.yaml
```
Wait for 30 seconds and check again for the
CommonService
resource.
-
-
Restore the singleton subscriptions.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-singleton-subscriptions.yaml
-
Restore the subscriptions.
oc apply -f restore-singleton-subscriptions.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Watch the namespaces where Cert Manager and License Service are deployed until the Cert Manager and Licensing operators are running. By default, Cert Manager and License Service are deployed in the
ibm-cert-manager
and
ibm-licensing
namespaces.
oc get pod -n <cs namespace> -w
-
-
Restore cert manager resources.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-cert-manager.yaml
-
Restore the cert manager resource.
oc apply -f restore-cert-manager.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the certificates are restored.
oc get certificates
-
-
Restore the subscriptions.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-subscriptions.yaml
-
Restore the subscriptions.
oc apply -f restore-subscriptions.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Watch the foundational services namespace until the
operand-deployment-lifecycle-manager
pod is running:
oc get pod -n <cs namespace> -w
See the following notes:
- If you do not use IBM Cert Manager, the IBM Common Service Operator deployment fails unless a third-party Cert Manager is installed on the cluster beforehand.
- If you use SOD, the
ibm-common-service-operator
pod is likely not to become ready after the subscriptions are restored, and as a result it does not deploy ODLM. This is expected and resolves after you run the next step.
Troubleshooting: If you have issues with generating new installation plans for updates or new installations, see OLM is unable to generate new install plans.
-
-
Run
setup_tenant.sh
to set up the cluster topology.
-
Get the
setup_tenant.sh
and
utils.sh
scripts by running the following commands:
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/cp3pt0-deployment/setup_tenant.sh
mkdir common && cd common
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts-adopter/cp3pt0-deployment/common/utils.sh
cd ..
Note: Run this script for each instance of foundational services. Each instance should use namespace values that differ from every other instance (that is, no namespace should be used in two different executions).
-
Run the following commands to make the scripts executable:
chmod +x setup_tenant.sh
chmod +x common/utils.sh
-
Gather the values that you need to run the script.
Operator and services namespaces:
oc get commonservice common-service -o yaml
Locate the values
.spec.operatorNamespace
and
.spec.servicesNamespace
.
Note: These values always match unless you use SOD.
Size:
oc get commonservice common-service -o yaml
Locate the value
.spec.size
.
Tethered namespaces:
oc get cm common-service-maps -o yaml -n kube-public
Make note of the namespaces under
requested-from-namespace
.
-
Run the script.
Note: If the services and operator namespaces are the same, you must still specify both parameters when you run
setup_tenant.sh
. In this case, use the same namespace for each parameter. Use the optional parameters
-s
and
-n
if you use a catalog source other than
opencloud-operators
or if the catalog source is in a different namespace, respectively. If everything is deployed to the same namespace (CS operators, CS operands, and Cloud Pak workload), you do not need to use the
setup_tenant.sh
script and can move on to the next step.
./setup_tenant.sh --operator-namespace <operator namespace> --services-namespace <services namespace> --tethered-namespaces <comma delimited (no spaces) list of Cloud Pak workload namespaces that use this foundational services instance> --license-accept -c v<foundational services version number in use, for example 4.0, 4.1, 4.2> -p <.spec.size value from `CommonService` CR> -i <install mode, either Manual or Automatic>
-
Wait for the script to complete successfully. For more information, see Installing foundational services by using a script.
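Because the script runs once per foundational services instance, it can help to assemble the command line from the gathered values before you run it. The following is a minimal sketch, not part of the IBM scripts: `join_namespaces` and `print_setup_tenant_cmd` are hypothetical helper names, and the flags are the ones that are shown in the preceding command.

```shell
# Join namespaces into the comma-delimited list (no spaces) that
# --tethered-namespaces expects. The subshell keeps the IFS change local.
join_namespaces() {
  ( IFS=,; printf '%s\n' "$*" )
}

# Print (do not run) the setup_tenant.sh command for one instance.
# Usage: print_setup_tenant_cmd <operator ns> <services ns> <tethered list> <version> <size> <install mode>
print_setup_tenant_cmd() {
  printf './setup_tenant.sh --operator-namespace %s --services-namespace %s --tethered-namespaces %s --license-accept -c v%s -p %s -i %s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6"
}
```

Printing the command first lets you review the values for each instance and confirm that no namespace appears in two different executions before you run anything.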
-
-
Restore Licensing service configmap.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-licensing.yaml
-
Restore the configmap.
oc apply -f restore-licensing.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the configmap is restored.
oc get configmap | grep licensing
-
-
If you use IBM License Service Reporter, see Backing up the License Service Reporter instance.
-
Restore the OperandRequests and OperandConfigs.
-
If you are restoring an OperandConfig, delete the existing one first.
oc delete operandconfig common-service -n <namespace where operandconfig resides>
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-operands.yaml
-
Restore the operands.
oc apply -f restore-operands.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify whether the operands are restored.
oc get operandrequest
oc get operandconfig
-
Verify whether operand requests are reconciled.
Note: Give ODLM time to reconcile the restored operand requests. New operators and their operands should deploy shortly after the restore completes. Check the status fields of the operand requests and the ODLM logs for any issues.
-
If you use a custom hostname, TLS secret, or both, wait for the
platform-identity
pods to become ready:
-
Verify that the
cs-onprem-tenant-config
configmap is present:
oc get cm -n <namespace where hostname is changed or custom TLS secret used> | grep cs-onprem-tenant-config
-
Wait for the
platform-identity-management
,
platform-identity-provider
, and
platform-auth-service
pods to become ready in the same namespace.
- Update the custom hostname to reflect a change of cluster if necessary. For example, the structure of the route is
<route name>.cluster1.com
. If you are no longer on
cluster1
but now on
cluster2
, update the route from
<route name>.cluster1.com
to
<route name>.cluster2.com
.
- If you use a custom TLS secret, it is best to re-create this secret on the new cluster by using the same name. If the secret was carried over to the new cluster, replace it.
- Follow the instructions here https://www.ibm.com/docs/en/cloud-paks/foundational-services/4.3?topic=cc-updating-custom-hostname-tls-secret-by-using-configmap.
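The hostname change from one cluster domain to another can be computed rather than typed by hand. The following is a small sketch that illustrates the rewrite that is described for the custom hostname; `retarget_host` is a hypothetical helper, and `cluster1.com` and `cluster2.com` are the example domains from the text.

```shell
# Rewrite a route hostname from the old cluster domain to the new one.
# Usage: retarget_host <hostname> <old domain> <new domain>
# A hostname that does not end in ".<old domain>" is printed unchanged.
retarget_host() {
  host="$1"
  old="$2"
  new="$3"
  case "$host" in
    *".$old") printf '%s.%s\n' "${host%".$old"}" "$new" ;;
    *) printf '%s\n' "$host" ;;
  esac
}
```

The result is only the new hostname string. Apply it by following the linked instructions for updating the custom hostname.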
-
-
-
Restore MongoDB. (CS v3.19.x to 4.5.x)
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-mongo-data.yaml
-
Restore the MongoDB data.
oc apply -f restore-mongo-data.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Verify that the restore completed successfully.
Check the logs for the velero restore to ensure that the restore went through. Search for the following log:
"Failed: error connecting to db server: no reachable servers"
If this message is present, follow these instructions:
-
Get the
mongo-restore.sh
file.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/mongoDB/mongo-restore.sh
-
Make the restore script executable:
chmod +x mongo-restore.sh
-
Delete the existing
mongodb-backup
deployment in the failed restore namespace:
oc delete deploy mongodb-backup -n <failed restore namespace>
-
Run the script:
./mongo-restore.sh <cs namespace>
Troubleshooting: If the
mongodb-restore
pod is stuck in
ContainerCreating
:
- Delete the deployment
mongodb-backup-deployment
- Make sure that the
mongodb-backup
pod is fully deleted (not
Terminating
)
- Delete the
mongodb-restore
job and its pod (not
Terminating
)
- Rerun the
mongo-restore.sh
script
Note: Run the secondary steps that are listed here only if the restore logs indicate that the restore did not run. When you restore multiple namespaces, some restores might succeed and some might fail. Specify the namespace for each run, whether its restore passed or failed; there is no harm in running these secondary steps after a successful restore. If the storage class that is used on the backup cluster does not match the storage class that is used on the target cluster, the restore fails. Adapting to different storage classes across clusters is a current limitation of velero.
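The log check that decides whether the secondary MongoDB steps are needed can be scripted. The following is a minimal sketch that searches restore logs on standard input for the failure message that is quoted in this step; `mongo_restore_unreachable` is a hypothetical name.

```shell
# Read Velero restore logs on stdin.
# Exit 0 if the "no reachable servers" failure message is present, 1 otherwise.
mongo_restore_unreachable() {
  grep -q 'error connecting to db server: no reachable servers'
}
```

For example, if `velero restore logs restore-mongo-data | mongo_restore_unreachable` succeeds, the message is present and the steps that use `mongo-restore.sh` apply.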
-
-
-
Restore common-service-db (CS v4.6.x and newer)
-
Get the restore object
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/restore-cs-db.yaml
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created previously.
vi restore-cs-db.yaml
-
Restore the cs-db data.
oc apply -f restore-cs-db.yaml
-
Check restore progress. Proceed with the next step after restore is complete.
velero restore get
-
Check the logs of the velero restore to verify that the data was restored.
velero restore logs restore-cs-db-data
Troubleshooting: If the logs or the data indicate that the restore was not successful, apply the following workaround:
-
Get the restore job:
wget https://raw.githubusercontent.com/qpdpQ/ibm-common-service-operator/IM-ZEN-workaround/velero/restore/common-service-db/cs-db-restore-job.yaml
-
Replace
<cs-db namespace>
with the namespace where the common-service-db instance is running.
-
Delete the existing cs-db-backup deployment and cs-db-backup pod.
oc delete deploy cs-db-backup -n <namespace>
-
Run the restore job.
oc apply -f cs-db-restore-job.yaml
-
-
Restore Platform UI (Zen) resources.
Note: If Zen is installed in a namespace other than the namespace where foundational services are installed, then create that namespace first (if not already restored).
oc new-project <namespace-where-Zen-is-installed>
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-zen.yaml
-
Restore the
zenservice
instances.
oc apply -f restore-zen.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Wait for the
zenservice
instances to become ready. When the
Progress
field is 100%, the instance is ready. The following command continuously outputs the progress percentage of all the
zenservices
on the cluster.
oc get zenservice -A -w -o yaml | grep Progress:
Note: If the restored zenservice contains fields to configure
zenCustomRoute
, do the following:
- Verify that the secret that is used (if the field exists) is present in the zenservice namespace in the target cluster.
- Update the value in the zenservice CR for the route. For example, the structure of the route is
<route name>.cluster1.com
. If you are no longer on
cluster1
but now on
cluster2
, update the route from
<route name>.cluster1.com
to
<route name>.cluster2.com
.
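The watch command runs until you interrupt it. For a one-time readiness check, the following sketch reads the `Progress:` lines and reports whether every instance is at 100%. It assumes that each line has the form `Progress: 100%`; `zen_all_ready` is a hypothetical name.

```shell
# Read lines such as "    Progress: 100%" on stdin.
# Exit 0 only if at least one Progress line is seen and all of them show 100%.
zen_all_ready() {
  awk '
    /Progress:/ {
      seen++
      if ($0 !~ /Progress:[[:space:]]*"?100%/) pending++
    }
    END {
      if (seen > 0 && pending + 0 == 0) exit 0
      exit 1
    }
  '
}
```

For example, `oc get zenservice -A -o yaml | grep Progress: | zen_all_ready` (without `-w`, so that the command ends) returns 0 when all instances are ready.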
-
-
Restore Zen data for Zen v4 (CS 3.23.x/3.19.x)
This step applies to Zen deployments with Zen v4 or earlier. If you use Zen v5 or later, proceed with the next step. The Zen version might differ between
zenservice
instances if multiple instances are present on the same cluster.
-
Replace
__BACKUP_NAME__
with the name of the backup resource that you created in the previous step.
vi restore-zen-data.yaml
-
Restore the Zen data.
oc apply -f restore-zen-data.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Check the logs of the velero restore to verify that the data was restored.
velero restore logs restore-zen-data
-
Search for
zen4-br.sh
to find the relevant logs. If
zen4-br.sh
does not appear in the logs, the restore process did not complete. If the logs or the data indicate that the restore was not successful, complete the following steps:
-
Get the Zen restore job resource.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/zen/zen-restore-job.yaml
-
Delete the existing
zen-backup
deployment.
oc delete deploy zen-backup -n <namespace>
-
Wait until the
zen-backup
pods are fully deleted (fully gone, not
Terminating
).
-
Give the Zen backup the necessary permissions if the required ServiceAccount, Role, and RoleBinding are not available.
-
Check whether the permissions exist:
oc get sa -n <zenservice namespace> | grep zen4
oc get role -n <zenservice namespace> | grep zen4
oc get rolebinding -n <zenservice namespace> | grep zen4
-
Get the
zen4-sa.yaml
,
zen4-role.yaml
, and
zen4-rolebinding.yaml
files.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-sa.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-role.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-rolebinding.yaml
-
For each namespace with a
zenservice
to restore, edit the service account file
zen4-sa.yaml
to deploy in the corresponding namespace.
oc apply -f zen4-sa.yaml
-
Apply the
zen4-role.yaml
file for each
zenservice
namespace to create the Role for the Zen backup. Replace
<zenservice namespace>
with the namespace where you deployed the zenservice.
oc apply -f zen4-role.yaml
-
Create the RoleBinding to connect the ServiceAccounts to the Role.
-
Edit the
zen4-rolebinding.yaml
file to add each ServiceAccount that you created earlier. Replace
<zenservice namespace>
with the namespace where you deployed the zenservice.
vi zen4-rolebinding.yaml
-
Apply the
zen4-rolebinding.yaml
file.
oc apply -f zen4-rolebinding.yaml
-
-
-
Edit the
zen-restore-job.yaml
file. Replace the
<zenservice namespace>
parameter for the underlying
zen4-br.sh
script to reflect the proper
zenservice
namespace.
-
Apply the
zen-restore-job.yaml
file.
oc apply -f zen-restore-job.yaml
-
Wait for the job to complete and check the logs of the
zen4-restore-job
pod to verify that the restore is completed.
-
Repeat the procedure for each namespace with a
zenservice
instance installed.
-
-
-
Wait for the
zenservice
instances to become ready. The instance is ready when the
Progress
field is 100%. The following command continuously outputs the progress percentage of all the
zenservices
on the cluster.
oc get zenservice -A -w -o yaml | grep Progress:
Troubleshooting:
- Make sure that there is only one
zen-backup
or
zen4-restore-job
pod in a namespace at any given time because they compete for the same PVC.
- If the
zen4-restore-job
pod is stuck in
ContainerCreating
:
- Delete the deployment
zen-backup
- Make sure that the
zen-backup
pod is fully deleted (not
Terminating
)
- Delete the
zen4-restore-job
job and its pod (not
Terminating
)
- Ensure that the
zen4-br-configmap
configmap,
zen-backup-pvc
pvc,
zen4-backup-role
role,
zen4-backup-rolebinding
rolebinding, and
zen4-backup-sa
service account are present in the namespace
- Reapply the
zen4-restore-job
yaml
-
If the configmap
zen4-br-configmap
is not present, you can download it with the following command:
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen4-br-scripts.yaml
Make sure to edit the namespace field before applying with the following command:
oc apply -f zen4-br-scripts.yaml
-
Velero restore is less predictable than backup when restoring databases. There is no harm in deleting a velero restore object (that is,
restore-cs-db-data
,
restore-mongo-data
, or
restore-zen-data
), deleting the accompanying deployment and pvc, waiting for these items to be fully deleted, and then re-creating the velero restore object to try again. If this still does not work, you can use the secondary instructions that use the
cs-db-restore-job.yaml
file, the mongo script, or the zen4 restore job on an individual namespace basis. There is no harm in running the restore in a namespace that has already been restored.
-
-
Restore Zen data v5.
-
Substitute the
__BACKUP_NAME__
with the name of the backup resource that you created in a previous step.
vi restore-zen5-data.yaml
-
Restore the Zen data.
oc apply -f restore-zen5-data.yaml
-
Check the progress and the details of the restore by using the following commands. Proceed with the next step after the status shows as
Completed
.
velero restore get
velero restore describe <__RESTORE_NAME__> --details
-
Check the logs of the velero restore to verify that the data was restored.
velero restore logs restore-zen5-data
-
Search for
restore_zen5
to find the relevant logs. If it is not present, the restore did not run. If the logs or the data indicate that the restore was not successful, complete the following steps as a workaround:
-
Get the Zen 5 restore job resource.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/restore/zen/zen5-restore-job.yaml
-
Delete the existing
zen5-backup
deployment.
oc delete deploy zen5-backup -n <namespace>
-
Wait for the
zen5-backup
pods to be fully deleted (fully gone, not
Terminating
).
-
Give the zen5 backup the necessary permissions if the required ServiceAccount, Role, and RoleBinding are not already present.
-
Check whether the permissions exist:
oc get sa -n <zenservice namespace> | grep zen5
oc get role -n <zenservice namespace> | grep zen5
oc get rolebinding -n <zenservice namespace> | grep zen5
-
Get the
zen5-sa.yaml
,
zen5-role.yaml
, and
zen5-rolebinding.yaml
files.
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-sa.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-role.yaml
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-rolebinding.yaml
-
For each namespace with a
zenservice
to restore, edit the service account file
zen5-sa.yaml
to deploy in the corresponding namespace.
oc apply -f zen5-sa.yaml
-
Once per
zenservice
namespace, apply the
zen5-role.yaml
file to create the Role for the Zen backup. Replace the
<zenservice namespace>
value before applying.
oc apply -f zen5-role.yaml
-
Create the RoleBinding to connect the ServiceAccounts to the Role.
-
Edit the
zen5-rolebinding.yaml
file to add each ServiceAccount that you created earlier. Replace the
<zenservice namespace>
value before applying.
vi zen5-rolebinding.yaml
-
Apply the
zen5-rolebinding.yaml
file.
oc apply -f zen5-rolebinding.yaml
-
-
-
Edit the
zen5-restore-job.yaml
file. The default namespace is set to
zen
. The parameters for the underlying
restore_zen5.sh
script default to the
zen
namespace and the
test-zen
zenservice
name. Update both of these parameters to reflect the proper namespace and
zenservice
name, respectively.
-
Apply the
zen5-restore-job.yaml
file.
oc apply -f zen5-restore-job.yaml
-
Wait for the job to complete, then check the logs of the
zen5-restore-job
pod to verify that the restore completed.
-
Repeat as needed for each namespace with a
zenservice
instance installed.
-
-
-
Wait for the
zenservice
instances to become ready. When the
Progress
field is 100%, the instance is ready. The following command continuously outputs the progress percentage of all the
zenservices
on the cluster.
oc get zenservice -A -w -o yaml | grep Progress:
Troubleshooting:
- Make sure that there is only one
zen5-backup
or one
zen5-restore-job
pod in a namespace at any given time because they compete for the same PVC.
- If the
zen5-restore-job
pod is stuck in
ContainerCreating
:
- Delete the deployment
zen5-backup
- Make sure that the
zen5-backup
pod is fully deleted (not
Terminating
)
- Delete the
zen5-restore-job
job and its pod (not
Terminating
)
- Ensure that the configmap
zen5-br-configmap
, pvc
zen5-backup-pvc
, role
zen5-backup-role
, rolebinding
zen5-backup-rolebinding
, and service account
zen5-backup-sa
are present in the namespace
- Reapply the
zen5-restore-job
yaml
-
If the configmap
zen5-br-configmap
is not present, you can download it with the following command:
wget https://raw.githubusercontent.com/IBM/ibm-common-service-operator/scripts/velero/schedule/zen5-br-scripts-cm.yaml
Make sure to edit the namespace field before applying with the following command:
oc apply -f zen5-br-scripts-cm.yaml
-
Velero restore is less predictable than backup when restoring databases. There is no harm in deleting a velero restore object (that is,
restore-cs-db-data
,
restore-mongo-data
, or
restore-zen5-data
), deleting the accompanying deployment and pvc, waiting for these items to be fully deleted, and then re-creating the velero restore object to try again. If this still does not work, you can use the secondary instructions that use the
cs-db-restore-job.yaml
file, the mongo script, or the zen5 restore job on an individual namespace basis. There is no harm in running the restore in a namespace that has already been restored.
-
-
If you use a custom route for the restored
zenservice
and you are restoring to a new cluster, update the value of
zenCustomRoute
in the
zenservice
CR to reflect the new hostname and re-trigger the
iam-config
job. Run the following commands:
oc -n <zenservice namespace> patch zenservice <zenservice name> --type='merge' -p '{"spec":{"zenCustomRoute":{"route_host":"<updated route>"}}}'
oc -n <zenservice namespace> patch zenservice <zenservice name> --type='merge' -p '{"spec":{"reconcile":true}}'
oc get job -n <zenservice namespace> iam-config-job -o json | jq 'del(.spec.selector)' | jq 'del(.spec.template.metadata.labels)' | oc replace --force -f -
All restoration tasks are completed.
Verify whether foundational services are properly restored.
-
Verify the pods:
oc get pods
All pods must be running.
-
Verify the subscriptions:
oc get subscriptions
Subscriptions of all installed services must be listed.
-
Verify that the Identity and Access section of the
cp-console
shows the users and teams that your organization added in the original cluster.
For more information about backing up and restoring Identity Management (IM) components, see Identity management backup and restore.
For migrating existing OIDC and SAML configurations, see Migrating identity management.
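The pod verification can be narrowed to the pods that need attention. The following sketch reads `oc get pods --no-headers` output and lists every pod whose status is neither Running nor Completed. Treating Completed as acceptable is an assumption that is made here for pods that belong to finished jobs; `list_unhealthy_pods` is a hypothetical name.

```shell
# Read `oc get pods --no-headers` output on stdin (columns: NAME READY STATUS RESTARTS AGE).
# Print "<pod name> <status>" for each pod that is not Running or Completed.
# Exit 1 if any such pod is found, 0 otherwise.
list_unhealthy_pods() {
  awk '
    $3 != "Running" && $3 != "Completed" { print $1 " " $3; bad = 1 }
    END { exit (bad ? 1 : 0) }
  '
}
```

For example, `oc get pods --no-headers | list_unhealthy_pods` prints nothing and returns 0 when every pod in the current project is in an expected state.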
General Troubleshooting:
If a restore process is stuck in the New
phase when you view it with velero restore get
, restart the velero pod in the namespace where OADP is installed. After the velero pod restarts, the status of the restore process must
change to InProgress
.