Prometheus does not find the target

Prometheus does not retrieve metrics because it cannot find the target that exposes the metrics.

Symptoms

Prometheus does not collect metrics from the Data Gate instance, so you cannot retrieve any Data Gate instance metrics.

Causes

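In most cases, the cause is a misconfiguration, such as a missing or incorrect label on the namespace or the ServiceMonitor element, missing RBAC permissions, or an incorrect service name.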

Resolving the problem

  1. Open a terminal window and log on to the OpenShift® server where Prometheus is installed.
  2. Make sure that the following environment variables are set correctly for your terminal window or shell:
    
    export DATAGATE_NAMESPACE="<datagate-namespace>"
    export DATAGATE_INSTANCE="<datagate-instance>"

    where

    <datagate-namespace>
    Is your Data Gate namespace.
    <datagate-instance>
    Is the name of your Data Gate instance. To find the name, run the oc get dginstance command.
  3. Run the following checks from the command line:
    Check for a namespace label
    Run the following command to check for a namespace label from your terminal window:
    oc get namespace "${DATAGATE_NAMESPACE}" --show-labels | grep user-monitoring

    If the command returns no output, the label is missing. Add the label with the following command:

    oc label namespace "${DATAGATE_NAMESPACE}" openshift.io/user-monitoring=true

    Check the label of the ServiceMonitor element
    Run the following command to check the label of the ServiceMonitor element:
    oc get servicemonitor datagate-table-sync-metrics \
      -n "${DATAGATE_NAMESPACE}" --show-labels

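    The label must be openshift.io/user-monitoring: "true". If it is different, run the following commands to list the ServiceMonitor elements in the namespace and to inspect the definition, so that you can correct the label: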

    oc get servicemonitor -n "${DATAGATE_NAMESPACE}"
    oc describe servicemonitor datagate-table-sync-metrics -n "${DATAGATE_NAMESPACE}"

    Check RBAC permissions
    Make sure that Prometheus can read the bearer token secret by running the following commands:
    oc get role prometheus-bearer-token-reader -n "${DATAGATE_NAMESPACE}"
    oc get rolebinding prometheus-bearer-token-reader -n "${DATAGATE_NAMESPACE}"

    Check the service name
    Make sure that the service name is correct. To display the service name, run:
    oc get svc "${DATAGATE_INSTANCE}-data-gate-db2z-api-svc" \
      -n "${DATAGATE_NAMESPACE}" --show-labels

    Verify that the output includes the label icpdsupport/app=dg-instance-db2z-api.

    Check the Prometheus operator log
    Display the Prometheus operator log to look for errors or other unusual messages. The following command displays the last 50 log entries on the screen:
    oc logs -n openshift-user-workload-monitoring \
      -l app.kubernetes.io/name=prometheus-operator --tail=50
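
The label, RBAC, and service checks in step 3 can also be run in one pass. The following is a minimal sketch and not part of the product: the function names check, has_label, and run_datagate_checks are hypothetical. It assumes that you are logged in with oc, that the variables from step 2 are exported, and that a correct label appears as openshift.io/user-monitoring=true in the --show-labels output.

```shell
# Sketch only: helper names are hypothetical, resource names are taken
# from the checks in step 3.

check() {
  # check <description> <command...>
  # Runs the command quietly and prints OK or FAILED for it.
  description="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "OK      ${description}"
  else
    echo "FAILED  ${description}"
    return 1
  fi
}

has_label() {
  # has_label <label-text> <arguments for "oc get"...>
  # Succeeds if the --show-labels output contains the label text.
  label="$1"
  shift
  oc get "$@" --show-labels 2>/dev/null | grep -q -- "${label}"
}

run_datagate_checks() {
  if [ -z "${DATAGATE_NAMESPACE:-}" ] || [ -z "${DATAGATE_INSTANCE:-}" ]; then
    echo "DATAGATE_NAMESPACE and DATAGATE_INSTANCE must be set" >&2
    return 2
  fi
  ns="${DATAGATE_NAMESPACE}"
  svc="${DATAGATE_INSTANCE}-data-gate-db2z-api-svc"
  failures=0

  check "namespace label" \
    has_label "openshift.io/user-monitoring=true" namespace "${ns}" \
    || failures=$((failures + 1))
  check "ServiceMonitor label" \
    has_label "openshift.io/user-monitoring=true" \
    servicemonitor datagate-table-sync-metrics -n "${ns}" \
    || failures=$((failures + 1))
  check "role prometheus-bearer-token-reader" \
    oc get role prometheus-bearer-token-reader -n "${ns}" \
    || failures=$((failures + 1))
  check "rolebinding prometheus-bearer-token-reader" \
    oc get rolebinding prometheus-bearer-token-reader -n "${ns}" \
    || failures=$((failures + 1))
  check "service label" \
    has_label "icpdsupport/app=dg-instance-db2z-api" svc "${svc}" -n "${ns}" \
    || failures=$((failures + 1))

  echo "${failures} check(s) failed"
  [ "${failures}" -eq 0 ]
}
```

To try it, paste the functions into the shell where the variables are set and run run_datagate_checks. A FAILED line only indicates which check to repeat manually with the commands in step 3; the Prometheus operator log still has to be read by hand.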