Backup & restore hub performance and scaling

Use this topic to understand how many spoke clusters you can connect to a single hub. It outlines the tested capacities and the scalability considerations for efficient backup and restore operations.

The Backup & restore hub service is designed to handle large numbers of concurrent backup and restore jobs across clusters. The system has been tested successfully with up to 1000 concurrent jobs. This figure is a reference point, not a limit; the hub can scale further with appropriate resource allocation.

Sizing blueprints for varying workloads

The following table provides the recommended configurations for each required service when handling different numbers of concurrent jobs. As the number of concurrent jobs increases, you can scale out (increase the replicas) or scale up (increase the resource limits) for the required services as outlined:

Table 1. Sizing blueprints
Clusters	Concurrent Jobs	Applicationsvc Pods	Backup-service Pods	Mongodb CPU and Memory Limit
10	100	1	1	Default
20	200	1	1	Default
25	250	1	1	Default
30	300	1	1	1 CPU and 1 GiB
40	400	1	2	1 CPU and 1 GiB
50	500	1	3	2 CPU and 2 GiB
60	600	1	4	2 CPU and 2 GiB
70	700	1	5	3 CPU and 3 GiB
80	800	2	6	3 CPU and 3 GiB
90	900	2	7	4 CPU and 4 GiB
100	1000	2	8	4 CPU and 4 GiB

Services that are not listed in the blueprint table do not need to be scaled. They can handle the load of up to 1000 concurrent jobs without any changes.

Services like mongodb or any operator-control-manager pods in the ibm-backup-restore namespace cannot be scaled out (that is, you cannot increase their replicas). For these services, increase the CPU and memory limits instead so that they continue to perform well as the number of concurrent jobs increases.

The guardian-bridge service automatically scales to handle the additional load.
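As a rough illustration of how Table 1 can be read, the following shell sketch maps a target number of concurrent jobs to the matching blueprint row. The function name blueprint_for_jobs is hypothetical, and rounding a job count up to the next listed row is an assumption made for this example; the table itself only lists multiples of 100 (plus 250). Memory values use the Kubernetes quantity notation (1Gi for 1 GiB).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the product: looks up the Table 1 row
# for a given number of concurrent jobs.
# Output format: <applicationsvc pods> <backup-service pods> <mongodb limits>
# Assumption: job counts between rows are rounded up to the next listed row.
blueprint_for_jobs() {
  local jobs=$1
  if   [ "$jobs" -le 250 ];  then echo "1 1 default"
  elif [ "$jobs" -le 300 ];  then echo "1 1 cpu=1,memory=1Gi"
  elif [ "$jobs" -le 400 ];  then echo "1 2 cpu=1,memory=1Gi"
  elif [ "$jobs" -le 500 ];  then echo "1 3 cpu=2,memory=2Gi"
  elif [ "$jobs" -le 600 ];  then echo "1 4 cpu=2,memory=2Gi"
  elif [ "$jobs" -le 700 ];  then echo "1 5 cpu=3,memory=3Gi"
  elif [ "$jobs" -le 800 ];  then echo "2 6 cpu=3,memory=3Gi"
  elif [ "$jobs" -le 900 ];  then echo "2 7 cpu=4,memory=4Gi"
  elif [ "$jobs" -le 1000 ]; then echo "2 8 cpu=4,memory=4Gi"
  else
    # Table 1 stops at 1000 concurrent jobs; no recommendation is given beyond it.
    echo "no blueprint listed for more than 1000 concurrent jobs" >&2
    return 1
  fi
}

blueprint_for_jobs 450   # prints: 1 3 cpu=2,memory=2Gi
```

The lookup only restates the table. Values beyond 1000 concurrent jobs are deliberately rejected because the table gives no sizing for them.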

Scaling services

Use the following commands to scale out the required services to the desired number of replicas:

oc scale deployment <deployment-name> --replicas=<desired-replicas> -n <backup-restore-namespace>

oc scale sts <statefulset-name> --replicas=<desired-replicas> -n <backup-restore-namespace>

Scale backup-service Pods:

oc scale deployment backup-service --replicas=<desired-replicas> -n <backup-restore-namespace>

Scale applicationsvc Pods:

oc scale deployment applicationsvc --replicas=<desired-replicas> -n <backup-restore-namespace>
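As a worked example, the scale-out commands above can be filled in with the Table 1 values for 500 concurrent jobs (1 applicationsvc pod and 3 backup-service pods). The sketch below only prints the commands so that they can be reviewed before they are run. The helper name scale_out_cmds is hypothetical, and the default namespace ibm-backup-restore is an assumption; set NS to the namespace used in your installation.

```shell
#!/usr/bin/env bash
# Assumption: the hub services run in the ibm-backup-restore namespace.
# Override by exporting NS before running this script.
NS="${NS:-ibm-backup-restore}"

# Hypothetical helper: prints (does not execute) the oc scale commands
# for the two deployments that Table 1 scales out.
scale_out_cmds() {
  local app_replicas=$1 backup_replicas=$2
  echo "oc scale deployment applicationsvc --replicas=${app_replicas} -n ${NS}"
  echo "oc scale deployment backup-service --replicas=${backup_replicas} -n ${NS}"
}

# Table 1 row for 500 concurrent jobs: 1 applicationsvc pod, 3 backup-service pods.
scale_out_cmds 1 3
```

Printing the commands first is a design choice for this example: it lets you check the replica counts and namespace before any change is applied to the cluster.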

Use the following commands to scale up the resource limits of the required services:

oc set resources deployment <deployment-name> --limits=cpu=<desired-cpu-limit>,memory=<desired-memory-limit> -n <backup-restore-namespace>

oc set resources sts <statefulset-name> --limits=cpu=<desired-cpu-limit>,memory=<desired-memory-limit> -n <backup-restore-namespace>

Scale up mongodb resource limits:

oc set resources sts mongodb --limits=cpu=<desired-cpu-limit>,memory=<desired-memory-limit> --containers=mongodb -n <backup-restore-namespace>
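For illustration, the mongodb command above can be filled in with the Table 1 limits for 500 concurrent jobs (2 CPU and 2 GiB, written as cpu=2 and memory=2Gi in Kubernetes quantity notation). The helper name mongodb_limits_cmd is hypothetical, and the default namespace ibm-backup-restore is an assumption. The sketch prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Assumption: the hub services run in the ibm-backup-restore namespace.
# Override by exporting NS before running this script.
NS="${NS:-ibm-backup-restore}"

# Hypothetical helper: prints (does not execute) the oc set resources command
# that raises the limits of the mongodb container in the mongodb StatefulSet.
mongodb_limits_cmd() {
  local cpu=$1 memory=$2
  echo "oc set resources sts mongodb --limits=cpu=${cpu},memory=${memory} --containers=mongodb -n ${NS}"
}

# Table 1 row for 500 concurrent jobs: 2 CPU and 2 GiB.
mongodb_limits_cmd 2 2Gi
```

Changing the limits modifies the StatefulSet pod template, so the mongodb pod is typically restarted to pick up the new values. It may be preferable to apply the change when no backup or restore jobs are running.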