Relocate disaster recovery protected discovered application
Before you begin
- Ensure that the application namespace is created in both managed clusters (for example, busybox-discovered).
About this task
This section guides you through relocating a discovered application that is disaster recovery protected.
Procedure
- Disable fencing on the Hub cluster.
- Edit the DRCluster resource for this cluster, replacing <drcluster_name> with a unique name.
oc edit drcluster <drcluster_name>
apiVersion: ramendr.openshift.io/v1alpha1
kind: DRCluster
metadata:
[...]
spec:
  cidrs:
  [...]
  ## Modify this line
  clusterFence: Unfenced
  [...]
[...]
Example output:
drcluster.ramendr.openshift.io/ocp4perf1 edited
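For scripted runs, the same change could be applied without opening an editor. The following is a minimal sketch, assuming the spec.clusterFence field shown above and that a merge patch is acceptable in your environment. The helper names fence_patch and unfence_drcluster are hypothetical and are not part of the product.

```shell
# Hypothetical helper: build the merge patch that sets spec.clusterFence.
fence_patch() {
  printf '{"spec":{"clusterFence":"%s"}}' "$1"
}

# Hypothetical helper: apply the patch to a DRCluster on the Hub cluster.
# $1 is the DRCluster name (the <drcluster_name> placeholder used above).
unfence_drcluster() {
  oc patch drcluster "$1" --type merge -p "$(fence_patch Unfenced)"
}
```

Calling unfence_drcluster <drcluster_name> on the Hub cluster is intended to have the same effect as the manual edit. Building the patch in a separate function keeps the JSON easy to inspect before it is sent.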
Note: Once the managed cluster is fenced, all communication from applications to the OpenShift Data Foundation external storage cluster will fail and some Pods will be in an unhealthy state (for example: CreateContainerError, CrashLoopBackOff) on the cluster that is now fenced.
- Gracefully reboot OpenShift Container Platform nodes that were Fenced. A reboot is required to resume the I/O operations after unfencing to avoid any further recovery orchestration failures. Reboot all nodes of the cluster by following the steps in the procedure, Rebooting a node gracefully.
Note: Make sure that all the nodes are initially cordoned and drained before you reboot and perform uncordon operations on the nodes.
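The cordon, drain, reboot, and uncordon order described in the note can be sketched as a dry-run helper that only prints the commands for one node. This is an illustration under assumptions, not the referenced procedure itself: the function name print_reboot_plan is hypothetical, and the drain flags shown are common choices that may need adjusting for your workloads.

```shell
# Hypothetical helper: print (do not run) the graceful reboot sequence
# for a single node. $1 is the node name.
print_reboot_plan() {
  node="$1"
  printf '%s\n' \
    "oc adm cordon ${node}" \
    "oc adm drain ${node} --ignore-daemonsets --delete-emptydir-data --force" \
    "oc debug node/${node} -- chroot /host systemctl reboot" \
    "oc adm uncordon ${node}"
}
```

Run the printed uncordon command only after the node has come back and reports Ready. Printing the plan first makes it possible to review the sequence per node before executing anything.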
- After all OpenShift nodes are rebooted and are in a Ready status, verify that all Pods are in a healthy state by running this command on the Primary managed cluster (or whichever cluster has been Unfenced).
oc get pods -A | egrep -v 'Running|Completed'
Example output:
NAMESPACE NAME READY STATUS RESTARTS AGE
The output for this query should be zero Pods before proceeding to the next step.
Important: If there are Pods still in an unhealthy status because of severed storage communication, troubleshoot and resolve before continuing. Because the storage cluster is external to OpenShift, it also has to be properly recovered after a site outage for OpenShift applications to be healthy.
Alternatively, you can use the OpenShift Web Console dashboards and Overview tab to assess the health of applications and the external ODF storage cluster. The detailed Fusion Data Foundation dashboard is found by navigating to Storage > Data Foundation.
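In a script, the "zero Pods" condition can be checked numerically instead of by eye. The sketch below assumes the default column layout of oc get pods -A, where STATUS is the fourth column. The helper name count_unhealthy_pods is hypothetical.

```shell
# Hypothetical helper: read `oc get pods -A` output on stdin and print the
# number of Pods whose STATUS column is neither Running nor Completed.
# The header row (line 1) is skipped.
count_unhealthy_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { n++ } END { print n + 0 }'
}
```

A possible use is oc get pods -A | count_unhealthy_pods, proceeding only when the printed value is 0. Matching on the STATUS column, rather than anywhere in the line, avoids hiding a Pod whose name happens to contain the word Running.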
- Verify that the Unfenced cluster is in a healthy state. Validate the fencing status in the Hub cluster for the Primary managed cluster, replacing <drcluster_name> with a unique name.
oc get drcluster.ramendr.openshift.io <drcluster_name> -o jsonpath='{.status.phase}{"\n"}'
Example output:
Unfenced
- Log in to your Ceph cluster and verify that the IPs that belong to the OpenShift Container Platform cluster nodes are no longer in the blocklist.
ceph osd blocklist ls
Ensure that you do not see the IPs added during fencing.
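When there are many nodes, comparing the blocklist against a list of node IPs by hand is error-prone. The following sketch assumes IPv4 addresses and that each blocklist entry starts with <ip>:<port>/<nonce>. Range-style entries or other layouts printed by your Ceph release are not handled. The helper name blocklisted_ips is hypothetical.

```shell
# Hypothetical helper: read `ceph osd blocklist ls` output on stdin and print
# every IP given as an argument that still appears as a blocklist entry.
# Assumes IPv4 entries of the form <ip>:<port>/<nonce> in the first column.
blocklisted_ips() {
  awk -v ips="$*" '
    BEGIN { n = split(ips, want, " ") }
    { split($1, part, ":"); seen[part[1]] = 1 }
    END { for (i = 1; i <= n; i++) if (want[i] in seen) print want[i] }
  '
}
```

For example, ceph osd blocklist ls | blocklisted_ips 10.0.0.5 10.0.0.6 (both addresses are placeholders) should print nothing once unfencing has removed the entries. The comparison is on the whole address, so 10.0.0.5 does not match an entry for 10.0.0.50.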
- In the RHACM console, navigate to Disaster Recovery > Protected applications tab.
- At the end of the application row, click on the Actions menu and choose to initiate Relocate.
- In the Relocate application modal window, review the status of the application and the target cluster.
- Click Initiate.
- Check the progression status of Relocate until the result is WaitOnUserToCleanup. The DRPC name can be identified by the unique Name configured in prior steps (for example, busybox-rbd).
oc get drpc {drpc_name} -n openshift-dr-ops -o jsonpath='{.status.progression}{"\n"}'
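Rather than rerunning the command by hand, the check can be wrapped in a polling loop. This is a minimal sketch that reuses the oc get drpc query above. The function name wait_for_cleanup_prompt and the default attempt count and delay are assumptions chosen for illustration.

```shell
# Hypothetical helper: poll the DRPC progression until it reports
# WaitOnUserToCleanup, or give up after a number of attempts.
# $1 = DRPC name, $2 = attempts (default 60), $3 = seconds between polls (default 10)
wait_for_cleanup_prompt() {
  drpc="$1"; attempts="${2:-60}"; delay="${3:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    progression="$(oc get drpc "$drpc" -n openshift-dr-ops \
      -o jsonpath='{.status.progression}')"
    if [ "$progression" = "WaitOnUserToCleanup" ]; then
      echo "$progression"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out; last progression: ${progression:-unknown}" >&2
  return 1
}
```

For example, wait_for_cleanup_prompt busybox-rbd returns success as soon as the expected progression appears and returns a non-zero status on timeout, so a wrapper script can stop before the cleanup step.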
- Remove the busybox application from the Secondary managed cluster so that the Relocate to the Primary managed cluster can complete.
- Navigate to the cloned repository for busybox and run the following commands on the Secondary managed cluster where you relocated from. Use the same directory that was used to create the application (for example, odr-metro-rbd).
cd ~/ocm-ramen-samples/
git branch
Example output:
* main
oc delete -k workloads/deployment/odr-metro-rbd -n busybox-discovered
Example output:
persistentvolumeclaim "busybox-pvc" deleted
deployment.apps "busybox" deleted
- After deleting the application, navigate to the Protected applications tab and verify that the busybox resources are both in Healthy status.
- Verify that the busybox application is running on the Primary managed cluster.
oc get pods,pvc -n busybox-discovered
Example output:
NAME                           READY   STATUS    RESTARTS   AGE
pod/busybox-796fccbb95-qmxjf   1/1     Running   0          2m46s

NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  VOLUMEATTRIBUTESCLASS   AGE
persistentvolumeclaim/busybox-pvc   Bound    pvc-b20e4129-902d-47c7-b962-040ad64130c4   1Gi        RWO            ocs-storagecluster-ceph-rbd   <unset>                 2m57s
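The same verification can be expressed as a pass or fail check for automation. The sketch below assumes the busybox-discovered namespace from the example and a 300 second timeout picked for illustration. The helper names all_pvcs_bound and verify_relocated_app are hypothetical.

```shell
# Hypothetical helper: read PVC phases on stdin (one per line) and succeed
# only if at least one PVC exists and every phase is Bound.
all_pvcs_bound() {
  awk 'NF { total++; if ($1 != "Bound") bad++ } END { exit !(total > 0 && bad == 0) }'
}

# Hypothetical wrapper: wait for all Pods in the namespace to be Ready,
# then confirm that every PVC is Bound. $1 = namespace.
verify_relocated_app() {
  ns="${1:-busybox-discovered}"
  oc wait --for=condition=Ready pod --all -n "$ns" --timeout=300s &&
    oc get pvc -n "$ns" \
      -o jsonpath='{range .items[*]}{.status.phase}{"\n"}{end}' | all_pvcs_bound
}
```

Running verify_relocated_app busybox-discovered on the Primary managed cluster returns a zero exit status only when both conditions hold, which makes it usable as a gate in a larger relocation script.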