IBM Support

RWO PVC fails to Mount with the Error "Multi-Attach error" or "is still being used"

Troubleshooting


Problem

The RWO PVC is not in use, but it cannot be mounted and fails with a "Multi-Attach error"

Symptom

In the pod events, the following message is observed:
pvc-729bf7c3-b449-49d5-97bf-41169d240b7c is still being used

Cause

A ReadWriteOnce (RWO) PVC can be mounted as read-write by only a single node. See Access Modes.

This issue can occur when the VolumeAttachment is not released from the previous node, preventing the RWO PVC from being attached to a different node.
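To see which node still holds the attachment, the VolumeAttachment objects can be listed together with their PV and node names. A minimal sketch, assuming standard oc/kubectl custom-columns support; on a live cluster the table comes from the commented command, and the node name "worker-1" in the sample is illustrative only:

```shell
#!/bin/sh
# On a live cluster, the table would come from:
#   oc get volumeattachment -o custom-columns=NAME:.metadata.name,PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName
# The sample below stands in for that output; "worker-1" is a hypothetical node name.
sample='NAME                                                                   PV                                         NODE
csi-4b4df1d85a06d41f515d20b19f7263df446425ea70cd52fc8037b3ddf8bf4d35   pvc-729bf7c3-b449-49d5-97bf-41169d240b7c   worker-1'

# Print the node that still holds the attachment for the PV in question.
node=$(printf '%s\n' "$sample" \
  | awk '$2 == "pvc-729bf7c3-b449-49d5-97bf-41169d240b7c" {print $3}')
echo "$node"
```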

Environment

IBM Storage Fusion Data Foundation (FDF) 4.x
Red Hat OpenShift Data Foundation (ODF) 4.x

Diagnosing The Problem

  • When the pod fails to mount the PVC, check the pod's describe output:
Type     Reason              Age    From                      Message
----     ------              ----   ----                      -------
Normal   Scheduled           XXXX   default-scheduler         Successfully assigned XXXX
Warning  FailedAttachVolume  XXXX   attachdetach-controller   Multi-Attach error for volume "XXXX" Volume is already exclusively attached to one node and can't be attached to another.
  • Check each OCP node for the existence of directories named after the PV, to confirm whether any node still holds an old reference to this PV:
# for i in $(oc get node -l cluster.ocs.openshift.io/openshift-storage= -o jsonpath='{ .items[*].metadata.name }'); do oc debug node/${i} -- chroot /host  find /var/lib/kubelet/pods/pvc-729bf7c3-b449-49d5-97bf-41169d240b7c* ; done
  • grep for the volumeHandle in the node describe output:
    For example, PV pvc-729bf7c3-b449-49d5-97bf-41169d240b7c has subvolumeName: csi-vol-459f0536-3b4b-11ed-9291-0a580a820725; check whether it is still mounted:
$ oc get node -o yaml | grep 459f0536-3b4b-11ed-9291-0a580a820725
name: kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com^0001-0011-openshift-storage-0000000000000001-459f0536-3b4b-11ed-9291-0a580a820725
- kubernetes.io/csi/openshift-storage.cephfs.csi.ceph.com^0001-0011-openshift-storage-0000000000000001-459f0536-3b4b-11ed-9291-0a580a820725
  • Check each OCP node for the existence of the directory with the volumeHandle name:
# for i in $(oc get node -l cluster.ocs.openshift.io/openshift-storage= -o jsonpath='{ .items[*].metadata.name }'); do oc debug node/${i} -- chroot /host find /var/lib/kubelet/plugins/kubernetes.io/csi/openshift-storage.rbd.csi.ceph.com -name 0001-0011-openshift-storage-0000000000000001-3d2aea19-0f9e-11ee-8e11-0a580a810211 ; done
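The subvolume ID used in the checks above can be pulled out of a PV's volumeHandle mechanically instead of by eye. A minimal sketch; on a live cluster the handle would come from the commented `oc get pv` command, while here the handle from the example above is used as sample input:

```shell
#!/bin/sh
# On a live cluster, the handle would come from:
#   oc get pv <pv-name> -o jsonpath='{.spec.csi.volumeHandle}'
# Sample input: the volumeHandle from the example above.
volume_handle='0001-0011-openshift-storage-0000000000000001-459f0536-3b4b-11ed-9291-0a580a820725'

# Extract the UUID-shaped subvolume ID (8-4-4-4-12 hex groups) from the handle.
subvol_uuid=$(printf '%s\n' "$volume_handle" \
  | grep -oE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}')
echo "$subvol_uuid"
```

The extracted ID is what the `oc get node -o yaml | grep` step above searches for.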

Resolving The Problem

NOTE: The resolution below uses the NooBaa application as a reference. The db-noobaa-db-pg-0 PVC uses accessMode: RWO and can occasionally experience this issue.

  • Find the Volume name:

Syntax: $ oc get pvc -n <namespace> | grep <pvc-name>

$ oc get pvc -n openshift-storage | grep db-noobaa-db-pg-0

NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   <omitted-for-space>
db-noobaa-db-pg-0   Bound    pvc-729bf7c3-b449-49d5-97bf-41169d240b7c   50Gi       RWO            <omitted-for-space>
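The VOLUME column can also be captured programmatically rather than copied by hand. A minimal sketch; on a live cluster the same value comes directly from the commented jsonpath query, while here the sample table shown above is parsed:

```shell
#!/bin/sh
# On a live cluster, the volume name comes directly from:
#   oc get pvc db-noobaa-db-pg-0 -n openshift-storage -o jsonpath='{.spec.volumeName}'
# Sample input: the table output shown above.
sample='NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES
db-noobaa-db-pg-0   Bound    pvc-729bf7c3-b449-49d5-97bf-41169d240b7c   50Gi       RWO'

# Pick the VOLUME column for the PVC of interest.
volume=$(printf '%s\n' "$sample" | awk '$1 == "db-noobaa-db-pg-0" {print $3}')
echo "$volume"
```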
  • Find the volumeattachment associated with the VOLUME name:

Syntax: $ oc get volumeattachment -n <namespace> | grep <volume-name>

$ oc get volumeattachment -n openshift-storage | grep pvc-729bf7c3-b449-49d5-97bf-41169d240b7c

NAME                                                                  <omitted-for-space>   PV                         
csi-4b4df1d85a06d41f515d20b19f7263df446425ea70cd52fc8037b3ddf8bf4d35  <omitted-for-space>   pvc-729bf7c3-b449-49d5-97bf-41169d240b7c

NOTE: csi-4b4df1d85a06d41f515d20b19f7263df446425ea70cd52fc8037b3ddf8bf4d35 is the volumeattachment name.
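Before deleting anything, it can help to sanity-check that the string being passed on is really an attachment name and not the PV name. VolumeAttachment names generated by the CSI attach/detach controller typically look like "csi-" followed by 64 hex characters; that convention is an assumption here, and the sketch simply validates the example name against it:

```shell
#!/bin/sh
# Sanity check (assumed convention): CSI-generated attachment names are
# "csi-" followed by 64 hex characters, unlike PV names ("pvc-<uuid>").
va='csi-4b4df1d85a06d41f515d20b19f7263df446425ea70cd52fc8037b3ddf8bf4d35'
if printf '%s\n' "$va" | grep -qE '^csi-[0-9a-f]{64}$'; then
  valid=yes
else
  valid=no
fi
echo "$valid"
```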

  • Scale the workload associated with the PV/PVC down to --replicas=0 (make note of the replicas prior to scaling):

Syntax: $ oc get <deployment/sts> -n <namespace>, followed by a scale command. Make sure all pods are deleted successfully and are not stuck in a terminating state.

$ oc get sts -n openshift-storage
NAME           READY   AGE
noobaa-core    1/1     3d
noobaa-db-pg   1/1     3d


$ oc scale sts -n openshift-storage noobaa-db-pg --replicas=0
statefulset.apps/noobaa-db-pg scaled
  • Delete the VolumeAttachment:

Syntax: $ oc delete volumeattachment -n <namespace> <volumeattachment-name>

$ oc delete volumeattachment -n openshift-storage csi-4b4df1d85a06d41f515d20b19f7263df446425ea70cd52fc8037b3ddf8bf4d35
volumeattachment.storage.k8s.io "csi-4b4df1d85a06d41f515d20b19f7263df446425ea70cd52fc8037b3ddf8bf4d35" deleted

Note: If the delete fails with the following message, the volumeattachment has already been released and the workload can be scaled back up:

Error from server (NotFound): volumeattachments.storage.k8s.io "csi-4b4df1d85a06d41f515d20b19f7263df446425ea70cd52fc8037b3ddf8bf4d35" not found
  • Scale up workload:
$ oc scale sts -n openshift-storage noobaa-db-pg --replicas=1
statefulset.apps/noobaa-db-pg scaled
  • If the mount still fails, reboot the node where the stale volumeattachment was present. This clears the leftover mount points on that node.
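The resolution steps above can be collected into a single dry-run script. This is a sketch only: it prints the oc commands it would run rather than executing them (drop the final echo lines and run the variables directly to execute); the namespace, statefulset, and attachment names are taken from the example, and REPLICAS is the count noted before scaling down:

```shell
#!/bin/sh
# Dry-run sketch of the resolution: scale down, delete the stale
# VolumeAttachment, scale back up. Names are from the NooBaa example above.
NS=openshift-storage
STS=noobaa-db-pg
VA=csi-4b4df1d85a06d41f515d20b19f7263df446425ea70cd52fc8037b3ddf8bf4d35
REPLICAS=1   # replica count noted before scaling down

scale_down="oc scale sts -n $NS $STS --replicas=0"
delete_va="oc delete volumeattachment $VA"
scale_up="oc scale sts -n $NS $STS --replicas=$REPLICAS"

# Print the commands instead of running them.
echo "$scale_down"
echo "$delete_va"
echo "$scale_up"
```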

Document Location

Worldwide

[{"Type":"MASTER","Line of Business":{"code":"LOB66","label":"Technology Lifecycle Services"},"Business Unit":{"code":"BU070","label":"IBM Infrastructure"},"Product":{"code":"SSSEWFV","label":"Storage Fusion Data Foundation"},"ARM Category":[{"code":"a8m3p000000UoIPAA0","label":"Support Reference Guide"}],"ARM Case Number":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions"}]

Document Information

Modified date:
21 May 2025

UID

ibm17173867