Performing routine cluster monitoring

Establish a schedule for monitoring your IBM® Software Hub deployments on Red Hat® OpenShift® Container Platform.

The health of your cluster can significantly affect the health of your IBM Software Hub deployments.

Who should perform this task?
Cluster administrator: A cluster administrator must perform this task.
How frequently should you perform this task?

Perform this task at least once per day or once per shift.

However, if the number of concurrent users or jobs varies widely, perform this task more frequently during peak activity.

Your routine should include the following tasks:

  1. If your storage is remote, ensure that your network is running at 10 Gbps or greater.
  2. Run the storage performance validation playbook to confirm that there are no underlying performance issues with your persistent storage.
  3. Review the monitoring data from the OpenShift Container Platform web console.
    Important: Ensure that you enable monitoring for the user-defined projects where IBM Software Hub software is installed.
    OpenShift Version Resources
    Version 4.12
    Version 4.14
    Version 4.15
    Version 4.16
    Version 4.17
    Version 4.18
    Version 4.19

    Review the following dashboards:

    • API Performance
    • Kubernetes / Compute Resources / Cluster
    • Kubernetes / Compute Resources / Node (Pods)
    • Kubernetes / Compute Resources / Namespace (Pods)
  4. Check the status of the Operand Deployment Lifecycle Manager objects on the cluster:
    1. Confirm that the catalog sources on the cluster are Ready:
      oc get catalogsource -A \
      -o jsonpath="{range .items[*]}{.metadata.name}{': '}{.status.connectionState.lastObservedState}{'\n'}{end}"
    2. Get information about the IBM Software Hub operator subscriptions to determine the channel and to confirm that the current CSV is the same as the installed CSV:
      oc get subscription -n ${PROJECT_CPD_INST_OPERATORS} \
      -o jsonpath="{range .items[*]}{.metadata.name}{' - channel: '}{.spec.channel}{', installedCSV: '}{.status.installedCSV}{', currentCSV: '}{.status.currentCSV}{'\n'}{end}"
    3. Confirm that the operator deployments are ready and have available replicas:
      oc get deploy -n ${PROJECT_CPD_INST_OPERATORS}
    4. Check the status of the operator pods and determine whether any of the pods have been restarted:
      oc get pods -n ${PROJECT_CPD_INST_OPERATORS}
  5. If you installed the privileged monitoring service, review the Monitoring > Alerts and events page for Node imbalance status check events.
  6. If the Node imbalance status check event reports that the cluster is imbalanced, run the descheduler to rebalance the nodes on the cluster. For more information, see Evicting pods using the descheduler in the Red Hat OpenShift Container Platform documentation.
    Important: Run the descheduler during a maintenance window to prevent disruptions.
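
The checks in step 4 produce output that is easy to skim past when everything looks similar. The following is a minimal sketch of shell filters that print only the entries that might need attention. The function names are hypothetical and are not part of IBM Software Hub or the oc CLI. The sketch assumes that READY is the healthy catalog source state, that the input uses the same line formats as the jsonpath commands in step 4, and that the pod listing uses the default oc get pods column layout.

```shell
#!/bin/sh
# Hypothetical helper filters; each one reads command output on stdin.

# Print catalog sources whose last observed state is not READY.
# Expected input: "name: STATE" lines from the catalog source command.
not_ready_catalogs() {
  awk -F': ' '$2 != "READY"'
}

# Print the names of subscriptions whose installed CSV differs from
# the current CSV.
# Expected input: "name - channel: c, installedCSV: x, currentCSV: y".
csv_mismatches() {
  awk -F', ' '{
    sub(/^installedCSV: /, "", $2)
    sub(/^currentCSV: /, "", $3)
    if ($2 != $3) {
      split($1, parts, " - ")
      print parts[1]
    }
  }'
}

# Print the name and restart count of pods that have restarted.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
restarted_pods() {
  awk 'NR > 1 && $4 + 0 > 0 { print $1, $4 }'
}
```

For example, piping the catalog source command from step 4 into not_ready_catalogs, the subscription command into csv_mismatches, or the oc get pods command into restarted_pods prints nothing when that check passes. Treat any output as a prompt to investigate further, not as a diagnosis.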