Troubleshooting Backup & Restore service issues

List of known Backup & Restore issues in IBM Fusion.

Common issues

This section outlines common issues and their resolutions across several subcategories, such as backup, restore, and service protection.
Rack shutdown in a multi-rack setup
Problem statement
A rack shutdown in a multi-rack HA setup can cause the Backup & Restore service to go to an unknown state.
Resolution
If a rack shuts down and a graceful shutdown does not work, or the node is in a non-recoverable state, you can manually add an out-of-service taint to the node. For more information about how the taint works, see Kubernetes Blogs.

Remove the taint from the affected node after the node recovers.
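
The add and remove steps can be sketched as a small Python helper that builds the corresponding kubectl commands. This is an illustrative sketch, not an IBM Fusion procedure. It assumes that kubectl is on the PATH and logged in to the cluster, and that the taint is the upstream Kubernetes non-graceful shutdown taint, node.kubernetes.io/out-of-service with the value nodeshutdown and the effect NoExecute. The function names and the node name in the usage comment are hypothetical.

```python
import subprocess

# Assumption: the upstream Kubernetes out-of-service taint for
# non-graceful node shutdown.
TAINT_KEY = "node.kubernetes.io/out-of-service"
TAINT_VALUE = "nodeshutdown"
TAINT_EFFECT = "NoExecute"


def taint_command(node: str, remove: bool = False) -> list[str]:
    """Build the kubectl arguments that add or remove the taint on a node."""
    if remove:
        # A trailing "-" tells kubectl to remove the taint with this key and effect.
        taint = f"{TAINT_KEY}:{TAINT_EFFECT}-"
    else:
        taint = f"{TAINT_KEY}={TAINT_VALUE}:{TAINT_EFFECT}"
    return ["kubectl", "taint", "nodes", node, taint]


def run_taint(node: str, remove: bool = False) -> None:
    """Run the command; raises CalledProcessError if kubectl reports a failure."""
    subprocess.run(taint_command(node, remove), check=True)


# Usage with a hypothetical node name:
#   run_taint("worker-3")               # after the rack shutdown
#   run_taint("worker-3", remove=True)  # after the node recovers
```

Keeping the command construction separate from its execution makes it possible to review or log the exact command before it is run against the cluster.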

Backup or restore fails because of PVCs
Problem statement
Backup & Restore fails with the following error message:
error to create persistentvolumeclaims "<claim name>" is forbidden: exceeded quota
Cause
During backup or restore operations, the Backup & Restore service temporarily allocates additional storage through PersistentVolumeClaims to stage application data. The requested storage can be as large as the application that is backed up or restored.
Resolution
  • Scenario 1: Failure due to ResourceQuota:

    Adjust or remove the storage.capacity field in the ResourceQuota object so that the Backup & Restore service can allocate staging PersistentVolumeClaims as needed. Set the quota to at least the total size of the largest application's PersistentVolumeClaims, plus an additional 10Gi for local in-place policy storage. A higher quota may be required when you run parallel operations, because every application that is backed up or restored at the same time needs staging storage.

  • Scenario 2: Failure due to ClusterResourceQuota:

    The staging PersistentVolumeClaims that the Backup & Restore service creates during backup or restore operations also count against a ClusterResourceQuota. To ensure successful backup and restore, either exclude the Backup & Restore namespace from the quota or increase the quota sufficiently. Set the quota to at least the total size of the largest application's PersistentVolumeClaims, plus an additional 10Gi for local in-place policy storage. A higher quota may be necessary when you run parallel operations, because every application that is backed up or restored at the same time needs staging storage.
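
The sizing rule in both scenarios can be worked through as a short calculation. The following Python sketch is illustrative only. The function names, application names, and PersistentVolumeClaim sizes are hypothetical, and it handles only whole-number quantities with binary suffixes (Ki, Mi, Gi, Ti). It covers the single-operation case. For parallel operations, the result is a lower bound.

```python
# Hypothetical helpers that apply the rule above:
# largest application's total PVC size + 10Gi for local in-place policy storage.
BINARY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(quantity: str) -> int:
    """Convert a quantity such as '50Gi' to bytes (binary suffixes only)."""
    for suffix, factor in BINARY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: treat the value as a plain byte count


def minimum_quota_bytes(apps: dict[str, list[str]], overhead: str = "10Gi") -> int:
    """Return the total PVC size of the largest application plus the overhead."""
    largest = max(
        sum(parse_quantity(size) for size in pvc_sizes)
        for pvc_sizes in apps.values()
    )
    return largest + parse_quantity(overhead)


def format_gi(num_bytes: int) -> str:
    """Express a byte count in Gi, rounded up to the next whole unit."""
    return f"{-(-num_bytes // 2**30)}Gi"


# Example with made-up applications and PVC sizes.
apps = {"shop": ["20Gi", "30Gi"], "blog": ["5Gi"]}
print(format_gi(minimum_quota_bytes(apps)))
```

In this example, the largest application totals 50Gi, so the minimum quota for a single operation is 60Gi.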

For more information about resource quotas, see Red Hat Documentation.