Bricks that make up a GlusterFS volume fail to start

You cannot access a GlusterFS volume because the bricks that make up that volume are offline.

Symptoms

A GlusterFS volume might not be accessible because some of the bricks that make up the volume are offline. Starting the GlusterFS volume does not fix the problem; the volume fails to start because it cannot locate the mount point of a particular brick. Each brick in a GlusterFS volume corresponds to a logical volume on the node.
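
You can see which bricks are offline by checking the volume status. For example, where <volume_name> is a placeholder for your volume name:

    # gluster volume status <volume_name>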

Causes

For unknown reasons, the symbolic links to the logical volume devices under /dev/mapper disappeared. GlusterFS also removed the mount point of the brick.
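
To confirm the cause, you can list the device-mapper entries on the affected node and check whether the link for the brick's logical volume is missing. For example, where <vg_name> is a placeholder for the volume group name:

    # ls -l /dev/mapper | grep <vg_name>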

Resolving the problem

Complete the following steps to restore the logical volume device and mount the logical volume at the specified mount point.

  1. Run the vgscan --mknodes command to re-create the missing logical volume device.
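
    For example:

    # vgscan --mknodes

    You can then list /dev/mapper to confirm that the device links for the brick logical volumes are back; the grep pattern here assumes the vg_ naming that appears in the examples below:

    # ls /dev/mapper | grep vg_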

  2. Log in to the GlusterFS pod where the problem brick was created. Locate the mount table in the /var/lib/heketi/fstab file, and check its content to find the mount point that corresponds to the brick:

    # cat /var/lib/heketi/fstab
    /dev/mapper/vg_286e1330866da4dc3306818adf5e2145-brick_ab7f5ff1f7cf0b4fd71517736ba22598 /var/lib/heketi/mounts/vg_286e1330866da4dc3306818adf5e2145/brick_ab7f5ff1f7cf0b4fd71517736ba22598 xfs rw,inode64,noatime,nouuid 1 2
    /dev/mapper/vg_286e1330866da4dc3306818adf5e2145-brick_08cf49b637ac08716083b767c8636500 /var/lib/heketi/mounts/vg_286e1330866da4dc3306818adf5e2145/brick_08cf49b637ac08716083b767c8636500 xfs rw,inode64,noatime,nouuid 1 2
    
  3. Manually re-create the mount point for the brick. For example:

     mkdir -p /var/lib/heketi/mounts/vg_286e1330866da4dc3306818adf5e2145/brick_08cf49b637ac08716083b767c8636500/brick/.glusterfs/indices
    
  4. Run the following command to mount the logical volume at the mount point:

    mount -a --fstab /var/lib/heketi/fstab
    

    Verify that the mount command was successful:

    mount | grep brick_08cf49b637ac08716083b767c8636500
    
  5. Start the volume.

    # gluster volume start icp_default_mysql-pv-claim_2a13e46b-f3f3-11ea-9bc7-00000a0b181b force
    volume start: icp_default_mysql-pv-claim_2a13e46b-f3f3-11ea-9bc7-00000a0b181b: success
    

    Check the volume status to confirm that the brick is online. Also, check the brick processes inside the container and the application data under the mount point.
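
    For example, you can check the brick status with the following command; the volume name matches the volume that was started in the previous command:

    # gluster volume status icp_default_mysql-pv-claim_2a13e46b-f3f3-11ea-9bc7-00000a0b181b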

  6. Trigger the self-heal daemon to heal files on the replicate volume.

    • Run the following command from the GlusterFS pod where the brick was created:
      # gluster volume heal icp_default_wp-pv-claim_7a832227-f3f3-11ea-9bc7-00000a0b181b full
      Launching heal operation to perform full self heal on volume icp_default_wp-pv-claim_7a832227-f3f3-11ea-9bc7-00000a0b181b has been successful
      Use heal info commands to check status.
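
      For example, you can check the heal status with the heal info command:

      # gluster volume heal icp_default_wp-pv-claim_7a832227-f3f3-11ea-9bc7-00000a0b181b info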