Kubernetes Resource Protection
Kubernetes applications rely on Kubernetes API resources to function within a cluster. For example, Deployments specify the program components of an application, ConfigMaps customize how the application runs, and custom resources control the overall operation of the application through application operators. Protecting Kubernetes resources against data loss and disasters in an application-independent manner simplifies development for applications that require backup and restore services.
Selective protection and recovery of Kubernetes resources for disaster recovery
To minimize recovery time, an application can pre-deploy some of its Kubernetes resources on the recovery cluster. Such an application needs a way to avoid recovering those resources a second time, so that the benefit of pre-deployment is preserved. Other Kubernetes resources are created dynamically by Kubernetes itself, and an application has no need to preserve their history, so protecting them is unnecessary. Kubernetes events are a prime example, and applications might need a way to exclude them from protection and recovery. The RamenDR Recipe custom resource for disaster recovery provides a flexible mechanism that supports these cases and generalizes to other use cases.
The general technique for selective protection and recovery is to filter Kubernetes resources by kind and by label. To select resources by kind, the VRG provides an include and exclude mechanism. In addition, the standard Kubernetes label selector mechanism can be used to protect specific resources.
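The filtering described above can be sketched with the Recipe fields that appear in the sample on this page. This is a minimal, illustrative fragment and not a complete Recipe: the group names, the `app=my-app` label, and the `event` resource-type spelling are assumptions chosen for illustration, and the Deployments are assumed to be pre-deployed on the recovery cluster.

```yaml
# Illustrative fragment of a Recipe spec (not a complete resource).
# Group names, the label, and the `event` type spelling are assumptions.
spec:
  groups:
    # Select by kind and by label: only labeled ConfigMaps and Secrets.
    - name: labeled-config
      backupRef: labeled-config
      type: resource
      labelSelector: app=my-app
      includedResourceTypes:
        - configmap
        - secret
    # Capture Deployments on their own so that recovery can skip them.
    - name: deployments
      backupRef: deployments
      type: resource
      includedResourceTypes:
        - deployment
    # Capture the remaining kinds, leaving out events and the kinds above.
    - name: everything-else
      backupRef: everything-else
      type: resource
      excludedResourceTypes:
        - configmap
        - secret
        - deployment
        - event
  workflows:
    - name: capture
      sequence:
        - group: labeled-config
        - group: deployments
        - group: everything-else
    # Deployments are assumed to be pre-deployed on the recovery cluster,
    # so the recover workflow omits that group.
    - name: recover
      sequence:
        - group: labeled-config
        - group: everything-else
```

In this sketch, events are never captured, and the pre-deployed Deployments are captured but not recovered a second time.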
Kubernetes resource protection and recovery order
Kubernetes applications are designed around a declared desired state, which the system continuously attempts to achieve and then maintains over time. It is up to the application to handle the asynchronous behavior that this declare, attempt, achieve, and maintain architecture requires. However, this level of asynchrony is hard to sustain in highly complex applications with stringent user expectations. First, asynchrony prevents programmers from predicting every possible sequence of events. Second, asynchrony can be a source of failed dependencies, which result in back-off retry loops that can violate the application's Recovery Time Objective (RTO). Restoring resources in a prescribed order avoids both problems. Therefore, the Ramen VRG provides a mechanism for capturing and restoring resources in a prescribed order.
An example of Kubernetes resource protection specification
apiVersion: ramendr.openshift.io/v1alpha1
kind: Recipe
metadata:
  name: recipe-sample
  namespace: my-app-ns
spec:
  appType: demo-app # required, but not currently used
  groups:
    - name: volumes
      type: volume
      labelSelector: app=my-app
    - name: config
      backupRef: config
      type: resource
      includedResourceTypes:
        - configmap
        - secret
    - name: deployments
      backupRef: deployments
      type: resource
      includedResourceTypes:
        - deployment
    - name: instance-resources
      backupRef: instance-resources
      type: resource
      excludedResourceTypes:
        - configmap
        - secret
        - deployment
  hooks:
    - name: service-hooks
      namespace: my-app-ns
      labelSelector: shouldRunHook=true
      ops:
        - name: pre-backup
          container: main
          command: ["/scripts/pre_backup.sh"] # must exist in 'main' container
          timeout: 1800
        - name: post-restore
          container: main
          command: ["/scripts/post_restore.sh"] # must exist in 'main' container
          timeout: 3600
  workflows:
    - name: capture # referenced in VRG
      sequence:
        - group: config
        - group: deployments
        - hook: service-hooks/pre-backup
        - group: instance-resources
    - name: recover # referenced in VRG
      sequence:
        - group: config
        - group: deployments
        - group: instance-resources
        - hook: service-hooks/post-restore
VRG sample that uses this Recipe
apiVersion: ramendr.openshift.io/v1alpha1
kind: VolumeReplicationGroup
metadata:
  name: vrg-sample
  namespace: my-app-ns # same Namespace as Recipe
spec:
  kubeObjectProtection:
    recipeRef:
      name: recipe-sample
      captureWorkflowName: capture
      recoverWorkflowName: recover
      volumeGroupName: volumes
Explanation of the capture and recovery specifications in VRG
The scope of VRG disaster protection is a single Kubernetes namespace. The VRG protects the persistent volumes that are associated with the namespace and optionally protects the Kubernetes resources in the namespace. This documentation covers the product preview for protecting Kubernetes resources, so it does not explain persistent volume disaster protection.
The VRG allows Kubernetes resources to be captured (backed up) and restored as part of disaster recovery. This is achieved through the RamenDR Recipe. If a RamenDR Recipe specification is not included in the VRG, Kubernetes resources are not protected as part of VRG disaster protection.
The RamenDR Recipe has two workflows, capture and recover. The capture workflow provides instructions on how to capture a namespace's Kubernetes resources. The recover workflow provides instructions on how to recover a namespace's Kubernetes resources after a disaster. Because the two workflows are specified separately, the capture order and the recovery order can differ and need not include the same resources. Each capture and recover workflow contains a list of resource instructions. Each item in the list is acted upon, even if it duplicates work that is done by other items in the list. Take care to avoid duplication in the capture and recover lists so that the best RPO and RTO are achieved.
- The name of each item in the captureOrder list must be unique.
- The backupName of each item in the recoverOrder list must match a name in the captureOrder list.
- A labelSelector in a list item only applies to that item in the list.
- If a list item contains multiple labelSelectors, then any resource that matches at least one of the label selectors is operated upon.
- An includeClusterResources setting in a list item applies only to that item in the list.
- Each list item can contain either an includedResources section or an excludedResources section, but not both.
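The rules above can be illustrated with a short fragment. It is a sketch that uses the Recipe field names from the sample on this page, and it assumes that a group's `name` and `backupRef` play the roles that the list describes as the name and the backupName. The group names and the label are hypothetical.

```yaml
# Illustrative fragment only; group names and the label are hypothetical.
spec:
  groups:
    # Each name is unique within the list.
    - name: config
      backupRef: config            # matches the name of a captured group
      type: resource
      labelSelector: app=my-app    # applies to this item only
      includedResourceTypes:       # include list only, no exclude list
        - configmap
        - secret
    - name: everything-else
      backupRef: everything-else
      type: resource
      excludedResourceTypes:       # exclude list only, no include list
        - configmap
        - secret
```

Because the second group excludes the kinds that the first group includes, no ConfigMap or Secret is handled by both groups. Note that in this sketch ConfigMaps and Secrets without the label are handled by neither group.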