Use this information to manually restore the monitor pods in Fusion Data Foundation, when necessary.
About this task
Restore the monitor pods if all three of them go down and Fusion Data Foundation is not able to recover them automatically.
Note: This is a disaster recovery procedure and must be performed under the guidance of the IBM
support team. Contact the IBM support team.
- Scale down the rook-ceph-operator and ocs-operator deployments.
oc scale deployment rook-ceph-operator --replicas=0 -n openshift-storage
oc scale deployment ocs-operator --replicas=0 -n openshift-storage
- Create a backup of all deployments in the openshift-storage namespace.
mkdir backup
cd backup
oc project openshift-storage
for d in $(oc get deployment|awk -F' ' '{print $1}'|grep -v NAME); do echo $d;oc get deployment $d -o yaml > oc_get_deployment.${d}.yaml; done
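Because the backup files are used later to revert the patched deployments, it can be worth confirming that they were written correctly before changing anything. The following is a minimal sketch, not part of the documented procedure; the function name verify_backups is hypothetical, and it assumes the file naming used by the loop above.

```shell
# Hypothetical helper: check that every backup file in a directory is a
# Deployment manifest. An empty or truncated file fails the check.
verify_backups() {
  dir=${1:-.}
  found=0
  for f in "$dir"/oc_get_deployment.*.yaml; do
    [ -e "$f" ] || continue   # the glob matched nothing
    found=1
    if ! grep -q '^kind: Deployment' "$f"; then
      echo "suspect backup: $f" >&2
      return 1
    fi
  done
  if [ "$found" -ne 1 ]; then
    echo "no backup files found in $dir" >&2
    return 1
  fi
}
```

Run it from inside the backup directory as `verify_backups .` and continue only if it returns successfully.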
- Patch the OSD deployments to remove the livenessProbe parameter and
run them with the command parameter as sleep.
for i in $(oc get deployment -l app=rook-ceph-osd -oname);do oc patch ${i} -n openshift-storage --type='json' -p '[{"op":"remove", "path":"/spec/template/spec/containers/0/livenessProbe"}]' ; oc patch ${i} -n openshift-storage -p '{"spec": {"template": {"spec": {"containers": [{"name": "osd", "command": ["sleep", "infinity"], "args": []}]}}}}' ; done
- Retrieve the monstore
cluster map from all the OSDs.
- Create the script.
#!/bin/bash
# Working directory for the recovered monitor store, locally and in each pod.
ms=/tmp/monstore

rm -rf $ms
mkdir $ms

for osd_pod in $(oc get po -l app=rook-ceph-osd -oname -n openshift-storage); do
  echo "Starting with pod: $osd_pod"
  podname=$(echo $osd_pod|sed 's/pod\///g')
  # Push the store collected so far into the pod.
  oc exec $osd_pod -- rm -rf $ms
  oc cp $ms $podname:$ms
  rm -rf $ms
  mkdir $ms
  echo "pod in loop: $osd_pod ; done deleting local dirs"
  # Add this OSD's cluster map data to the store.
  oc exec $osd_pod -- ceph-objectstore-tool --type bluestore --data-path /var/lib/ceph/osd/ceph-$(oc get $osd_pod -ojsonpath='{ .metadata.labels.ceph_daemon_id }') --op update-mon-db --no-mon-config --mon-store-path $ms
  echo "Done with COT on pod: $osd_pod"
  # Pull the updated store back for the next OSD.
  oc cp $podname:$ms $ms
  echo "Finished pulling COT data from pod: $osd_pod"
done
- Run the script, where <script-file> is the file in which you saved the script.
chmod +x <script-file>
./<script-file>
- Patch the MON deployments, and run them with the command parameter as sleep.
- Edit the MON deployments.
for i in $(oc get deployment -l app=rook-ceph-mon -oname);do oc patch ${i} -n openshift-storage -p '{"spec": {"template": {"spec": {"containers": [{"name": "mon", "command": ["sleep", "infinity"], "args": []}]}}}}'; done
- Patch the MON deployments to increase the initialDelaySeconds value.
oc get deployment rook-ceph-mon-a -o yaml | sed "s/initialDelaySeconds: 10/initialDelaySeconds: 2000/g" | oc replace -f -
oc get deployment rook-ceph-mon-b -o yaml | sed "s/initialDelaySeconds: 10/initialDelaySeconds: 2000/g" | oc replace -f -
oc get deployment rook-ceph-mon-c -o yaml | sed "s/initialDelaySeconds: 10/initialDelaySeconds: 2000/g" | oc replace -f -
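The sed expression in these commands is not anchored, so a value such as 100 would also be rewritten. As a hedged sketch only, the substitution can be isolated in a small filter that matches the value 10 exactly; the function name raise_initial_delay is hypothetical and not part of the documented procedure.

```shell
# Hypothetical filter: read a deployment manifest on stdin and raise every
# "initialDelaySeconds: 10" to 2000. The trailing $ anchor keeps values
# such as 100 or 105 untouched.
raise_initial_delay() {
  sed 's/initialDelaySeconds: 10$/initialDelaySeconds: 2000/'
}
```

It slots into the same pipeline, for example `oc get deployment rook-ceph-mon-a -o yaml | raise_initial_delay | oc replace -f -`, repeated for mon-b and mon-c.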
- Copy the previously retrieved monstore
to the mon-a pod.
oc cp /tmp/monstore/ $(oc get po -l app=rook-ceph-mon,mon=a -oname |sed 's/pod\///g'):/tmp/
- Navigate into the MON pod and change the ownership of the retrieved monstore.
oc rsh $(oc get po -l app=rook-ceph-mon,mon=a -oname)
chown -R ceph:ceph /tmp/monstore
- Copy the keyring template file before rebuilding the mon db.
oc rsh $(oc get po -l app=rook-ceph-mon,mon=a -oname)
cp /etc/ceph/keyring-store/keyring /tmp/keyring
cat /tmp/keyring
key = AQCleqldWqm5IhAAgZQbEzoShkZV42RiQVffnA==
caps mon = "allow *"
key = AQCmAKld8J05KxAArOWeRAw63gAwwZO5o75ZNQ==
auid = 0
caps mds = "allow *"
caps mgr = "allow *"
caps mon = "allow *"
caps osd = "allow *"
- Identify the keyring of all other Ceph daemons (MGR, MDS, RGW, Crash, CSI and CSI
provisioners) from their respective secrets.
oc get secret rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-keyring -ojson | jq .data.keyring | xargs echo | base64 -d
key = AQB3r8VgAtr6OhAAVhhXpNKqRTuEVdRoxG4uRA==
caps mon = "allow profile mds"
caps osd = "allow *"
caps mds = "allow"
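Where jq is not installed, the same field can be read with the jsonpath output of oc. This is a minimal sketch under the assumption, taken from the command above, that the secret stores the keyring under the data key keyring; the function name get_keyring is hypothetical.

```shell
# Hypothetical helper: print the decoded keyring held in a secret.
# $1 is the secret name; $2 is the namespace (defaults to openshift-storage).
get_keyring() {
  secret=$1
  ns=${2:-openshift-storage}
  oc get secret "$secret" -n "$ns" -o jsonpath='{.data.keyring}' | base64 -d
}
```

For example, `get_keyring rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-keyring` prints the same keyring as the jq pipeline.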
Example keyring file:
key = AQDxTF1hNgLTNxAAi51cCojs01b4I5E6v2H8Uw==
caps mon = "allow *"
key = AQDxTF1hpzguOxAA0sS8nN4udoO35OEbt3bqMQ==
caps mds = "allow *"
caps mgr = "allow *"
caps mon = "allow *"
caps osd = "allow *"
key = AQCKTV1horgjARAA8aF/BDh/4+eG4RCNBCl+aw==
caps mds = "allow"
caps mon = "allow profile mds"
caps osd = "allow *"
key = AQCKTV1hN4gKLBAA5emIVq3ncV7AMEM1c1RmGA==
caps mds = "allow"
caps mon = "allow profile mds"
caps osd = "allow *"
key = AQCOkdBixmpiAxAA4X7zjn6SGTI9c1MBflszYA==
caps mon = "allow rw"
caps osd = "allow rwx"
key = AQBOTV1hGYOEORAA87471+eIZLZtptfkcHvTRg==
caps mds = "allow *"
caps mon = "allow profile mgr"
caps osd = "allow *"
key = AQBOTV1htO1aGRAAe2MPYcGdiAT+Oo4CNPSF1g==
caps mgr = "allow rw"
caps mon = "allow profile crash"
key = AQBOTV1hiAtuBBAAaPPBVgh1AqZJlDeHWdoFLw==
caps mds = "allow rw"
caps mgr = "allow rw"
caps mon = "allow r"
caps osd = "allow rw tag cephfs *=*"
key = AQBNTV1hHu6wMBAAzNXZv36aZJuE1iz7S7GfeQ==
caps mgr = "allow rw"
caps mon = "allow r"
caps osd = "allow rw tag cephfs metadata=*"
caps mgr = "allow rw"
caps mon = "profile rbd"
caps osd = "profile rbd"
key = AQBNTV1hMNcsExAAvA3gHB2qaY33LOdWCvHG/A==
caps mgr = "allow rw"
caps mon = "profile rbd"
caps osd = "profile rbd"
- For CSI-related keyrings, refer to the previous keyring file output and
add the default caps
after fetching the key from the respective Fusion Data Foundation secret.
- The OSD keyring is added automatically after recovery.
- Navigate into the mon-a pod, and verify that the monstore has a monmap.
- Navigate into the mon-a pod.
oc rsh $(oc get po -l app=rook-ceph-mon,mon=a -oname)
- Verify that the monstore
has a monmap.
ceph-monstore-tool /tmp/monstore get monmap -- --out /tmp/monmap
monmaptool /tmp/monmap --print
- Optional: If the monmap
is missing, create a new monmap.
monmaptool --create --add <mon-a-id> <mon-a-ip> --add <mon-b-id> <mon-b-ip> --add <mon-c-id> <mon-c-ip> --enable-all-features --clobber /root/monmap --fsid <fsid>
- <mon-a-id> is the ID of the mon-a pod.
- <mon-a-ip> is the IP address of the mon-a pod.
- <mon-b-id> is the ID of the mon-b pod.
- <mon-b-ip> is the IP address of the mon-b pod.
- <mon-c-id> is the ID of the mon-c pod.
- <mon-c-ip> is the IP address of the mon-c pod.
- <fsid> is the file system ID.
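The --add options are easy to mistype when assembled by hand. As an illustration only, the sketch below turns a comma-separated list of id=address pairs into the matching --add arguments. The function name monmap_add_args and the input format are assumptions, not part of the documented procedure. Rook-based clusters typically keep a string of this shape in the rook-ceph-mon-endpoints ConfigMap, but confirm that in your cluster before relying on it.

```shell
# Hypothetical helper: build monmaptool "--add <id> <address>" arguments.
# $1 is a comma-separated list such as "a=10.0.0.1:3300,b=10.0.0.2:3300".
monmap_add_args() {
  out=""
  old_ifs=$IFS
  IFS=,
  for pair in $1; do
    id=${pair%%=*}      # text before the first "="
    addr=${pair#*=}     # text after the first "="
    out="$out --add $id $addr"
  done
  IFS=$old_ifs
  printf '%s\n' "${out# }"
}
```

The output can then be placed into the command, for example `monmaptool --create $(monmap_add_args "<id=address list>") --enable-all-features --clobber /root/monmap --fsid <fsid>`.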
- Verify the monmap.
monmaptool /root/monmap --print
- Import the monmap.
Important: Use the previously created keyring file.
ceph-monstore-tool /tmp/monstore rebuild -- --keyring /tmp/keyring --monmap /root/monmap
chown -R ceph:ceph /tmp/monstore
- Create a backup of the old store.db file.
mv /var/lib/ceph/mon/ceph-a/store.db /var/lib/ceph/mon/ceph-a/store.db.corrupted
mv /var/lib/ceph/mon/ceph-b/store.db /var/lib/ceph/mon/ceph-b/store.db.corrupted
mv /var/lib/ceph/mon/ceph-c/store.db /var/lib/ceph/mon/ceph-c/store.db.corrupted
- Copy the rebuilt store.db file to the monstore directory.
mv /tmp/monstore/store.db /var/lib/ceph/mon/ceph-a/store.db
chown -R ceph:ceph /var/lib/ceph/mon/ceph-a/store.db
- After rebuilding the monstore directory, copy the
store.db file from local to the rest of the MON pods, where
<id> is the ID of the MON pod.
oc cp $(oc get po -l app=rook-ceph-mon,mon=a -oname | sed 's/pod\///g'):/var/lib/ceph/mon/ceph-a/store.db /tmp/store.db
oc cp /tmp/store.db $(oc get po -l app=rook-ceph-mon,mon=<id> -oname | sed 's/pod\///g'):/var/lib/ceph/mon/ceph-<id>
- Navigate into the rest of the MON pods and change the ownership of the copied
store.db file, where <id> is the ID of the MON pod.
oc rsh $(oc get po -l app=rook-ceph-mon,mon=<id> -oname)
chown -R ceph:ceph /var/lib/ceph/mon/ceph-<id>/store.db
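The copy and ownership steps have to be repeated for each remaining MON pod. The sketch below combines them for a single MON ID, assuming that /tmp/store.db was already pulled from mon-a as shown above. The function name push_store_db is hypothetical, and it uses oc exec in place of an interactive oc rsh session.

```shell
# Hypothetical helper: copy the local /tmp/store.db into one MON pod and
# hand ownership to the ceph user. $1 is the MON ID, for example b or c.
push_store_db() {
  id=$1
  pod=$(oc get po -n openshift-storage -l app=rook-ceph-mon,mon="$id" -oname | sed 's/pod\///g')
  if [ -z "$pod" ]; then
    echo "no MON pod found for id $id" >&2
    return 1
  fi
  oc cp -n openshift-storage /tmp/store.db "$pod":/var/lib/ceph/mon/ceph-"$id" &&
    oc exec -n openshift-storage "$pod" -- chown -R ceph:ceph /var/lib/ceph/mon/ceph-"$id"/store.db
}
```

With three monitors, `push_store_db b && push_store_db c` covers the pods other than mon-a.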
- Revert the patched changes.
Important: Ensure that the MON, MGR, and OSD pods are up and running.
- For MON deployments
- Use the following command, where <mon-deployment.yaml> is the MON
deployment yaml
oc replace --force -f <mon-deployment.yaml>
- For OSD deployments:
- Use the following command, where <osd-deployment.yaml> is the OSD
deployment yaml
oc replace --force -f <osd-deployment.yaml>
- For MGR deployments
- Use the following command, where <mgr-deployment.yaml> is the MGR
deployment yaml
oc replace --force -f <mgr-deployment.yaml>
- Scale up the rook-ceph-operator
and ocs-operator deployments.
oc -n openshift-storage scale deployment ocs-operator --replicas=1
What to do next
- Check the Ceph status to confirm that CephFS is running.
ceph -s
id: f111402f-84d1-4e06-9fdb-c27607676e55
health: HEALTH_ERR
1 filesystem is offline
1 filesystem is online with fewer MDS than max_mds
3 daemons have recently crashed
mon: 3 daemons, quorum b,c,a (age 15m)
mgr: a(active, since 14m)
mds: ocs-storagecluster-cephfilesystem:0
osd: 3 osds: 3 up (since 15m), 3 in (since 2h)
pools: 3 pools, 96 pgs
objects: 500 objects, 1.1 GiB
usage: 5.5 GiB used, 295 GiB / 300 GiB avail
pgs: 96 active+clean
Important: If the filesystem is offline or MDS
service is missing, you need to restore the CephFS. For more information, see
Restoring the CephFS.
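When the status check is repeated while the cluster settles, it can help to extract only the health value. The following is a minimal sketch that reads ceph -s output on standard input; the function name ceph_health is hypothetical and not part of the product tooling.

```shell
# Hypothetical helper: print the value of the "health:" line from
# "ceph -s" output on stdin. Fails if no health line is present.
ceph_health() {
  awk '$1 == "health:" { print $2; found = 1; exit }
       END { if (!found) exit 1 }'
}
```

Pipe the output of ceph -s into ceph_health. Any value other than HEALTH_OK means the detailed messages below the health line need to be reviewed.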
- Check the Multicloud Object Gateway (MCG) status. It should be active, and the backingstore and
bucketclass should be in the Ready state.
noobaa status -n openshift-storage
Note: If the MCG is not in the active
state, or the backingstore and bucketclass are not in the
Ready state, you need to
restart all the MCG related pods. For more information, see
Restoring the Multicloud Object