New metrics are not shown in the Infrastructure monitoring page after IBM Cloud Pak foundational services is upgraded to version 3.12 or later
After IBM Cloud Pak foundational services is upgraded to version 3.12 or later, you might find that new metrics are not shown in the Infrastructure monitoring page on the IBM Cloud Pak console.
Symptoms
-
In the log of the monitoring-metricsvc-collector-kafka pod in the management-monitoring namespace, you can see that a 403 or 409 status code is returned from calls to the Red Hat® Advanced Cluster Management for Kubernetes Observability service APIs.
[INFO] [collector.WaitForObservatoriumAPI] [14] [waiting for observatorium write capability to be up and running - 0. Response Code: 403. Error: error issuing remote write request. status code: 403, response: ]
[ERROR] [process.SendToRemoteReceive] [78] [error writing time series: error issuing remote write request. status code: 403, response: , status code: 403]
[WARN] [process.ObservationChannelREST] [6930] [Unable to send 100 TimeSeries from 100 cache entries to Thanos. Multiple attempts made. Data will be discarded. Status code: 409]
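To check for this symptom from the command line, the collector logs can be filtered for the two status codes. The following is a minimal sketch rather than an official procedure. The function names filter_write_errors and check_collector_logs are hypothetical, and the sketch assumes that you are already logged in with oc and that the collector pod names contain metricsvc-collector-kafka.

```shell
# Keep only log lines that report a 403 or 409 status code.
# Matches both "status code: 403" and "Status code: 409" as they appear in the logs.
filter_write_errors() {
  grep -E '[Ss]tatus code: (403|409)'
}

# Print the matching log lines for every collector pod in the namespace.
# Assumption: pod names contain "metricsvc-collector-kafka".
check_collector_logs() {
  ns="management-monitoring"
  for pod in $(oc get pods -n "$ns" -o name | grep metricsvc-collector-kafka); do
    echo "== $pod"
    oc logs -n "$ns" "$pod" --all-containers=true | filter_write_errors
  done
}
```

Running check_collector_logs prints one header per collector pod, followed by any log lines that report a 403 or 409 status code. No lines under a header means that the pod did not log either code.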
Cause
The Infrastructure monitoring metric service stores resource metrics by using the Red Hat® Advanced Cluster Management for Kubernetes Observability service API. The API access credentials that are currently in use by the Infrastructure monitoring metric service are not valid.
Solution
Log in to the cluster by using the Red Hat OpenShift oc command line utility, and then delete the existing metric write credential.
-
If Red Hat® Advanced Cluster Management for Kubernetes Observability version 2.2 is installed, follow these steps:
-
Scale the IBM Cloud Foundation certificate manager operator to zero by running the following command:
oc scale deploy ibm-cert-manager-operator -n ibm-common-services --replicas=0
-
Wait for the associated pod to terminate. The following watch command should show no active pods.
watch 'oc get pods -n ibm-common-services | grep ibm-cert'
-
Delete the metric service write certificate, and wait for the metricsvc-collector-kafka pods to be recreated.
oc delete secret rhacm-observability-write-cert -n management-monitoring
watch 'oc get pods -n management-monitoring | grep metricsvc-collector-kafka'
-
Verify that the new metricsvc-collector-kafka pods remain in the Running state and become ready. If the pods still crash, verify that the ibm-cert-manager-operator is not running (see step 2). Repeat steps 1 through 3 if needed.
-
Scale the IBM Cloud Foundation certificate manager operator to one by running the following command:
oc scale deploy ibm-cert-manager-operator -n ibm-common-services --replicas=1
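The version 2.2 steps above can be wrapped in small shell functions so that they run in the documented order. This is a sketch under stated assumptions, not a supported script. The function names are hypothetical, the oc commands are the ones listed in the steps, and the polling loop with its 5-minute limit is an illustrative replacement for the interactive watch command.

```shell
# Steps 1 and 2: scale the certificate manager operator to zero and wait
# until no pod whose name contains "ibm-cert" is listed anymore.
stop_cert_manager() {
  oc scale deploy ibm-cert-manager-operator -n ibm-common-services --replicas=0 || return 1
  tries=0
  while oc get pods -n ibm-common-services --no-headers 2>/dev/null | grep -q ibm-cert; do
    tries=$((tries + 1))
    if [ "$tries" -ge 60 ]; then
      # Assumed limit: 60 polls of 5 seconds each.
      echo "operator pod is still present after 5 minutes" >&2
      return 1
    fi
    sleep 5
  done
}

# Step 3: delete the metric service write certificate.
delete_write_cert() {
  oc delete secret rhacm-observability-write-cert -n management-monitoring
}

# Step 5: scale the certificate manager operator back to one.
start_cert_manager() {
  oc scale deploy ibm-cert-manager-operator -n ibm-common-services --replicas=1
}
```

A possible sequence is to run stop_cert_manager && delete_write_cert, confirm manually that the metricsvc-collector-kafka pods stay running and become ready (step 4), and only then run start_cert_manager.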
-
If Red Hat® Advanced Cluster Management for Kubernetes Observability version 2.3 is installed, run the following command:
oc delete secret observability-controller-open-cluster-management.io-observability-signer-client-cert -n management-monitoring
After you run the command, a new write credential is created, and a new instance of the monitoring-metricsvc-collector-kafka pod is started. It might take about 10 minutes for the new metrics to appear in the Infrastructure monitoring page on the IBM Cloud Pak console.
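To confirm that the new collector pod has come up, its status can be polled instead of checked by eye. The following is a minimal sketch with hypothetical function names. It assumes the default column layout of oc get pods (NAME, READY, STATUS), and it treats the pods as ready only when every metricsvc-collector-kafka pod is Running with all of its containers ready. The 10-second interval and 60 attempts are illustrative values.

```shell
# Reads `oc get pods --no-headers` output on stdin.
# Succeeds only if at least one metricsvc-collector-kafka pod is listed and
# every such pod has STATUS "Running" and a READY value of the form n/n.
collector_pods_ready() {
  awk '
    $1 ~ /metricsvc-collector-kafka/ {
      seen = 1
      split($2, ready, "/")
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END {
      if (seen && !bad) exit 0
      exit 1
    }
  '
}

# Poll every 10 seconds, up to 60 times, until the collector pods are ready.
wait_for_collector() {
  i=0
  while [ "$i" -lt 60 ]; do
    if oc get pods -n management-monitoring --no-headers 2>/dev/null | collector_pods_ready; then
      echo "collector pods are ready"
      return 0
    fi
    i=$((i + 1))
    sleep 10
  done
  echo "collector pods are not ready yet" >&2
  return 1
}
```

A successful wait_for_collector run shows only that the pod is running and ready. The metrics themselves can still take several more minutes to appear in the console.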