Commonly required logs for troubleshooting

Gather commonly required logs for troubleshooting.

The following list covers logs that are commonly needed when troubleshooting Fusion Data Foundation, along with the commands to generate them.

  • Generating logs for a specific pod:
    oc logs <pod-name> -n <namespace>
  • Generating logs for Ceph or Fusion Data Foundation cluster:
    oc logs rook-ceph-operator-<ID> -n openshift-storage
    Important: Currently, the rook-ceph-operator logs do not provide any information about the failure, which limits their usefulness for troubleshooting. To get more detail, see Enabling debug logs for rook-ceph-operator and Disabling debug logs for rook-ceph-operator.
  • Generating logs for the CSI plugin pods, such as cephfs or rbd, to detect problems with the PVC mount in the application pod:
    oc logs csi-cephfsplugin-<ID> -n openshift-storage -c csi-cephfsplugin
    oc logs csi-rbdplugin-<ID> -n openshift-storage -c csi-rbdplugin
    • To generate logs for all the containers in the CSI pod:
      oc logs csi-cephfsplugin-<ID> -n openshift-storage --all-containers
      oc logs csi-rbdplugin-<ID> -n openshift-storage --all-containers
  • Generating logs for the cephfs or rbd provisioner pods to detect problems when a PVC is not in the Bound state:
    oc logs csi-cephfsplugin-provisioner-<ID> -n openshift-storage -c csi-cephfsplugin
    oc logs csi-rbdplugin-provisioner-<ID> -n openshift-storage -c csi-rbdplugin
    • To generate logs for all the containers in the CSI provisioner pod:
      oc logs csi-cephfsplugin-provisioner-<ID> -n openshift-storage --all-containers
      oc logs csi-rbdplugin-provisioner-<ID> -n openshift-storage --all-containers
  • Generating Fusion Data Foundation logs using the cluster-info command:
    oc cluster-info dump -n openshift-storage --output-directory=<directory-name>
  • Generating logs using the cluster-info command when the Local Storage Operator is in use:
    oc cluster-info dump -n openshift-local-storage --output-directory=<directory-name>
  • Check the Fusion Data Foundation operator logs and events.

    • To check the operator logs:
      oc logs <ocs-operator> -n openshift-storage
      where <ocs-operator> is the name of the ocs-operator pod, which you can obtain by running:
      oc get pods -n openshift-storage | grep -i "ocs-operator" | awk '{print $1}'
    • To check the operator events:
      oc get events --sort-by=metadata.creationTimestamp -n openshift-storage
  • Get the Fusion Data Foundation operator version and channel.
    oc get csv -n openshift-storage
    Example output:
    NAME                              DISPLAY                       VERSION   REPLACES   PHASE
    mcg-operator.v4.12.0              NooBaa Operator               4.12.0               Succeeded
    ocs-operator.v4.12.0              OpenShift Container Storage   4.12.0               Succeeded
    odf-csi-addons-operator.v4.12.0   CSI Addons                    4.12.0               Succeeded
    odf-operator.v4.12.0              Fusion Data Foundation        4.12.0               Succeeded
    oc get subs -n openshift-storage
    Example output:
    NAME                                                              PACKAGE                   SOURCE             CHANNEL
    mcg-operator-stable-4.12-redhat-operators-openshift-marketplace   mcg-operator              redhat-operators   stable-4.12
    ocs-operator-stable-4.12-redhat-operators-openshift-marketplace   ocs-operator              redhat-operators   stable-4.12
    odf-csi-addons-operator                                           odf-csi-addons-operator   redhat-operators   stable-4.12
    odf-operator                                                      odf-operator              redhat-operators   stable-4.12
  • Confirm that the installplan is created.
    oc get installplan -n openshift-storage
  • Verify the images of the components after updating Fusion Data Foundation.
    • Identify the node that runs the pod of the component whose image you want to verify.
      oc get pods -n openshift-storage -o wide | grep <component-name>
      For example:
      oc get pods -n openshift-storage -o wide | grep rook-ceph-operator
      Example output, where dell-r440-12.gsslab.pnq2.redhat.com is the node name:
      rook-ceph-operator-566cc677fd-bjqnb 1/1 Running 20 4h6m 10.128.2.5 dell-r440-12.gsslab.pnq2.redhat.com <none> <none>
    • Check the image ID on that node, where <node-name> is the name of the node that runs the pod of the component whose image you want to verify.
      oc debug node/<node-name>
      chroot /host
      crictl images | grep <component>
      For example:
      crictl images | grep rook-ceph

      Note the IMAGE ID and map it to the Digest ID on the Rook Ceph Operator page.
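
When several pods are involved, the per-pod oc logs commands in the list above can be wrapped in a small loop. The following is a minimal sketch, not part of the product: the function names pods_matching and collect_logs, the NAMESPACE variable, and the one-file-per-pod output layout are illustrative assumptions. It relies only on oc get pods -o name and oc logs --all-containers.

```shell
#!/usr/bin/env bash
# Sketch: save the logs of every pod whose name starts with a given prefix.
# NAMESPACE and both function names are hypothetical, chosen for illustration.
NAMESPACE="${NAMESPACE:-openshift-storage}"

# Print the names of pods in $NAMESPACE that start with the prefix in $1.
# "oc get pods -o name" prints entries as "pod/<name>", so strip "pod/".
pods_matching() {
  local prefix="$1"
  oc get pods -n "$NAMESPACE" -o name \
    | sed 's|^pod/||' \
    | grep -- "^${prefix}" || true
}

# Write the logs of all containers in each matching pod to <outdir>/<pod>.log.
collect_logs() {
  local prefix="$1" outdir="$2" pod
  mkdir -p "$outdir"
  for pod in $(pods_matching "$prefix"); do
    oc logs "$pod" -n "$NAMESPACE" --all-containers > "${outdir}/${pod}.log"
  done
}
```

For example, after sourcing these functions, running collect_logs csi-rbdplugin ./odf-logs would write one log file per rbd pod. Because the match is a name prefix, csi-rbdplugin covers both the plugin pods and the csi-rbdplugin-provisioner pods; pass a longer prefix to narrow the selection.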

For more information, see Using must-gather.