Backup & restore hub performance and scaling
Use this topic to understand how many spoke clusters can be connected to a single hub and how to scale the hub services as that number grows. It outlines the tested capacities and the scalability considerations for efficient backup and restore operations.
The Backup & restore hub service is designed to handle large numbers of concurrent backup and restore jobs across clusters. The system has been tested successfully with up to 1000 concurrent jobs. This figure is a reference point and not a limitation; the hub can scale further with appropriate resource allocation.
Sizing blueprints for varying workloads
The following table provides the recommended configurations for each required service when handling different numbers of concurrent jobs. As the number of concurrent jobs increases, you can scale out (increase the replicas) or scale up (increase the resource limits) for the required services as outlined:
Clusters | Concurrent Jobs | Applicationsvc Pods | Backup-service Pods | Mongodb CPU and Memory Limit | guardian-bridge Memory Request and Limit | guardian-bridge CPU Request and Limit | guardian-bridge JVM Xms and Xmx |
10 | 100 | 1 | 1 | Default | 1 GiB and 2 GiB (default) | 0.5 and 1 (default) | 1 G and 1 G |
20 | 200 | 1 | 1 | Default | 2 GiB and 4 GiB | 0.5 and 1 | 2 G and 2 G |
25 | 250 | 1 | 1 | Default | 2 GiB and 4 GiB | 0.5 and 1 | 2 G and 2 G |
30 | 300 | 1 | 1 | 1 CPU and 1 GiB | 3 GiB and 6 GiB | 1 and 2 | 4 G and 4 G |
40 | 400 | 1 | 2 | 1 CPU and 1 GiB | 4 GiB and 7 GiB | 1 and 2 | 5 G and 5 G |
50 | 500 | 1 | 3 | 2 CPU and 2 GiB | 5 GiB and 8 GiB | 1 and 3 | 6 G and 6 G |
60 | 600 | 1 | 4 | 2 CPU and 2 GiB | 6 GiB and 10 GiB | 1 and 3 | 8 G and 8 G |
70 | 700 | 1 | 5 | 3 CPU and 3 GiB | 8 GiB and 12 GiB | 2 and 4 | 9 G and 9 G |
80 | 800 | 2 | 6 | 3 CPU and 3 GiB | 8 GiB and 12 GiB | 2 and 4 | 10 G and 10 G |
90 | 900 | 2 | 7 | 4 CPU and 4 GiB | 10 GiB and 15 GiB | 3 and 5 | 11 G and 11 G |
100 | 1000 | 2 | 8 | 4 CPU and 4 GiB | 10 GiB and 15 GiB | 3 and 5 | 12 G and 12 G |
Services not listed in the blueprint table do not need to be scaled. They can handle the load of up to 1000 concurrent jobs without any changes.
Services like mongodb or any operator-control-manager pods in the ibm-backup-restore namespace cannot be scaled out (increased replicas). For these services, increase the CPU and memory resources to ensure they perform well as the number of concurrent jobs increases.
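The replica columns of the blueprint table can be encoded as a small lookup, which may be convenient in automation. The following is a minimal sketch, not part of the product: the function names are hypothetical, and rounding a job count up to the next tested tier is an assumption rather than documented behavior.

```shell
#!/bin/sh
# Hypothetical helpers that mirror the blueprint table above.
# Assumption: a job count between two tiers uses the next higher tier.

# Print the recommended number of backup-service pods.
backup_service_replicas() {
  n=$1
  if   [ "$n" -le 300 ];  then echo 1
  elif [ "$n" -le 400 ];  then echo 2
  elif [ "$n" -le 500 ];  then echo 3
  elif [ "$n" -le 600 ];  then echo 4
  elif [ "$n" -le 700 ];  then echo 5
  elif [ "$n" -le 800 ];  then echo 6
  elif [ "$n" -le 900 ];  then echo 7
  elif [ "$n" -le 1000 ]; then echo 8
  else
    # The table stops at 1000 concurrent jobs; do not guess beyond it.
    echo "no tested blueprint above 1000 concurrent jobs" >&2
    return 1
  fi
}

# Print the recommended number of applicationsvc pods.
applicationsvc_replicas() {
  n=$1
  if   [ "$n" -le 700 ];  then echo 1
  elif [ "$n" -le 1000 ]; then echo 2
  else
    echo "no tested blueprint above 1000 concurrent jobs" >&2
    return 1
  fi
}
```

For example, `backup_service_replicas 450` prints 3, the value from the 500-job row.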
Scaling services
To scale out a service that runs as a Deployment, run:
oc scale deployment <deployment-name> --replicas=<desired-replicas> -n <backup-restore-namespace>
Or, for a service that runs as a StatefulSet:
oc scale sts <statefulset-name> --replicas=<desired-replicas> -n <backup-restore-namespace>
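After a scale command it is usually worth waiting for the rollout to finish before starting more jobs. The sketch below wraps the scale command and `oc rollout status` in one function. The function name, the `OC` variable (which lets you substitute `echo` for a dry run), and the 300-second timeout are assumptions of this example, not product settings.

```shell
#!/bin/sh
# Assumption: OC can be overridden (for example OC=echo) to preview commands.
OC="${OC:-oc}"

# Hypothetical helper: scale a workload, then wait until the rollout completes.
# Arguments: <kind> <name> <replicas> <namespace>
scale_and_wait() {
  kind=$1
  name=$2
  replicas=$3
  ns=$4
  $OC scale "$kind" "$name" --replicas="$replicas" -n "$ns" &&
    $OC rollout status "$kind/$name" -n "$ns" --timeout=300s
}
```

For example, `OC=echo; scale_and_wait deployment backup-service 3 ibm-backup-restore` prints the two commands without running them.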
- Scale backup-service Pods:
oc scale deployment backup-service --replicas=<desired-replicas> -n <backup-restore-namespace>
- Scale applicationsvc Pods:
oc scale deployment applicationsvc --replicas=<desired-replicas> -n <backup-restore-namespace>
- Scale up mongodb resource limits:
oc set resources sts mongodb --limits=cpu=<desired-cpu-limit>,memory=<desired-memory-limit> --containers=mongodb -n <backup-restore-namespace>
- Scale up guardian-bridge resource limits:
oc patch kafkabridge guardian-bridge \
  -n ibm-backup-restore \
  --type=merge \
  -p '{
    "spec": {
      "resources": {
        "limits": { "cpu": "<cpu limit>", "memory": "<memory limit>" },
        "requests": { "cpu": "<cpu request>", "memory": "<memory request>" }
      },
      "jvmOptions": { "-Xms": "<min heap>", "-Xmx": "<max heap>" }
    }
  }'
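Because the guardian-bridge patch carries six values, generating the JSON from arguments can reduce quoting mistakes. The following is a minimal sketch: the function name is hypothetical, and the unit spellings in the example (`5Gi`, `6g`) are an assumed rendering of the table's "5 GiB" and "6 G" values.

```shell
#!/bin/sh
# Hypothetical helper: print the merge patch for the guardian-bridge resource.
# Arguments: <cpu-request> <cpu-limit> <mem-request> <mem-limit> <min-heap> <max-heap>
bridge_patch() {
  printf '{"spec":{"resources":{"limits":{"cpu":"%s","memory":"%s"},"requests":{"cpu":"%s","memory":"%s"}},"jvmOptions":{"-Xms":"%s","-Xmx":"%s"}}}\n' \
    "$2" "$4" "$1" "$3" "$5" "$6"
}

# Example for the 500-job row of the blueprint table (units are assumptions):
#   oc patch kafkabridge guardian-bridge -n ibm-backup-restore --type=merge \
#     -p "$(bridge_patch 1 3 5Gi 8Gi 6g 6g)"
```

Printing the patch first with `bridge_patch 1 3 5Gi 8Gi 6g 6g` lets you review the JSON before you apply it.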