Restore with IBM Storage Fusion fails with no space left on device error
Restore with IBM® Storage Fusion fails because a PVC on the source cluster is out of space.
Symptoms
The restore fails at the Volume group: cpd-volumes step in the restore sequence. You see the following error message:
Failed restore PVCs
BMYBR0009 There was an error when processing the job in the Transaction Manager service.
The underlying error was: "Execution of workflow restore of recipe ibmcpd-tenant completed.
Number of failed commmands: 1, last failed command: "VolumeGroup/cpd-volumes" "An
unexpected error occurred during the volume group backup.
RestorePvcsFailedException(\'NoSpaceLeftOnDevice\')"
Causes
Because a PVC in the source cluster is out of space, the PVC fails to restore on the target cluster.
Diagnosing the problem
In the guardian-dm-controller-manager logs on the target cluster, you see messages like the following example:
<timestamp> ERROR: <timestamp> resticrepository.go:523: No space left on device
<timestamp> {"level":"error","ts":"<timestamp>","logger":"datamover","msg":"Failed to create block volume restore.","error":"No space left on device","stacktrace":"main.restoreWithProgress.func2\n\t/workspace/main.go:788"}
<timestamp> {"level":"error","ts":"<timestamp>","logger":"datamover","msg":"Restore failed with error","copyID":"a6e16fe7553c210313cfc55dd3621a61e774a5a696fec4e9ef9ed916c16a2069","pvc path":"/data/archivelogs-c-db2oltp-<xxxxxxxxxxxxxxxx>-db2u-0","jobName":"archivelogs-c-db2oltp-<xxxxxxxxxxxxxxxx>-db2u-0","reason":"NoSpaceLeftOnDevice","error":"No space left on device","stacktrace":"main.restorePVC\n\t/workspace/main.go:711\nmain.handleRestoreJob\n\t/workspace/main.go:634\nmain.handleJob\n\t/workspace/main.go:174\nmain.main\n\t/workspace/main.go:114\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:271"}
<timestamp> {"level":"info","ts":"<timestamp>","logger":"datamover","msg":"PVC restore completed","job":"archivelogs-c-db2oltp-<xxxxxxxxxxxxxxxx>-db2u-0","PVC":"archivelogs-c-db2oltp-<xxxxxxxxxxxxxxxx>-db2u-0","status":"Failed"}
In this example, the PVC that is out of space is archivelogs-c-db2oltp-<xxxxxxxxxxxxxxxx>-db2u-0.
Also, the OpenShift® console shows a PersistentVolumeUsageCritical warning for that PVC.
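If the log is long, a short script can pull out the affected PVC names. The following Python sketch is illustrative only. It assumes each log line has the shape shown in the example (an optional timestamp prefix followed by a JSON object) and that the jobName field matches the PVC name, as it does in the example. The function name, the sample timestamp, and the sample PVC name are hypothetical.

```python
import json


def failed_pvcs(log_lines):
    """Return PVC names whose restore failed with NoSpaceLeftOnDevice.

    Assumes the JSON part of each line starts at the first "{" and that
    "jobName" carries the PVC name, as in the example log above.
    """
    names = []
    for line in log_lines:
        start = line.find("{")
        if start == -1:
            continue  # plain-text line, for example the resticrepository.go message
        try:
            entry = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # not a well-formed JSON log entry
        if not isinstance(entry, dict):
            continue
        if entry.get("reason") == "NoSpaceLeftOnDevice":
            name = entry.get("jobName")
            if name and name not in names:
                names.append(name)
    return names


# Made-up sample lines that mimic the format of the example log.
SAMPLE = [
    "2024-01-01T00:00:00Z ERROR: resticrepository.go:523: No space left on device",
    '2024-01-01T00:00:00Z {"level":"error","logger":"datamover",'
    '"msg":"Restore failed with error",'
    '"jobName":"archivelogs-c-db2oltp-example-db2u-0",'
    '"reason":"NoSpaceLeftOnDevice","error":"No space left on device"}',
    '2024-01-01T00:00:00Z {"level":"info","logger":"datamover",'
    '"msg":"PVC restore completed","status":"Failed"}',
]

print(failed_pvcs(SAMPLE))
```

To use it on a real log, save the guardian-dm-controller-manager output to a file and pass the file's lines to failed_pvcs.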
Resolving the problem
Do the following steps:
- On the source cluster, increase the size of the PVC.
- Retake the backup.
- Clean up the target cluster and retry the restore.
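As a rough illustration of the first step, the following Python sketch builds an oc patch command that requests a larger size for a PVC. It is a sketch under assumptions, not the documented procedure: the namespace, PVC name, and size are hypothetical. Resizing this way works only if the storage class of the PVC allows volume expansion, and service-managed volumes such as Db2 volumes might need to be resized through a service-specific procedure instead.

```python
import json
import shlex


def build_pvc_resize_command(pvc_name, namespace, new_size):
    """Build an `oc patch` command that requests a larger PVC size.

    The patch sets spec.resources.requests.storage, the standard
    Kubernetes field for the requested capacity of a PVC.
    """
    patch = {"spec": {"resources": {"requests": {"storage": new_size}}}}
    return [
        "oc", "patch", "pvc", pvc_name,
        "-n", namespace,
        "--type", "merge",
        "-p", json.dumps(patch),
    ]


# Hypothetical values: replace them with the PVC reported in the logs,
# the project of the instance on the source cluster, and the target size.
cmd = build_pvc_resize_command(
    "archivelogs-c-db2oltp-example-db2u-0", "cpd-instance", "200Gi"
)

# Print the command for review instead of running it.
print(shlex.join(cmd))
```

The script only prints the command so that you can review it before you run it on the source cluster.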
Tip: You can check whether PVCs in a Cloud Pak for Data instance are running out of space with the
Volume usage status check monitor. For more information about this monitor, see Installing privileged monitors.