Refreshing bearer tokens for Prometheus authentication

Bearer tokens obtained from Cloud Pak for Data have a limited lifetime. Therefore, you must refresh the authentication token to keep Prometheus monitoring operational.

About this task

This is what happens when tokens expire:
  • Prometheus receives 401 Unauthorized or 403 Forbidden responses when it tries to collect metrics (called "scraping").
  • The ServiceMonitor element reports a state of "Down" in the Prometheus UI.
  • Metrics are not collected until the token is refreshed.

Procedure

To refresh your Prometheus authentication token:

  1. Open a terminal window and log in to your OpenShift® server.
  2. Set the following environment variables for your terminal session:
    export CPD_URL="https://cpd-cpd-instance.apps.example.com"
    export DATAGATE_NAMESPACE="<datagate-namespace>"
    export CPD_USERNAME="<cpd-user>"
    export CPD_PASSWORD="<cpd-password>"

    where

    https://cpd-cpd-instance.apps.example.com
    Is an example of a web address of a Cloud Pak for Data instance. Replace it with the valid address of your Cloud Pak for Data instance.
    <datagate-namespace>
    Is your Data Gate namespace.
    <cpd-user>
    Is the username of a Cloud Pak for Data administrator.
    <cpd-password>
    Is the password of that Cloud Pak for Data administrator.
  3. Obtain a new token by entering the following command:
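    export TOKEN=$(
      # Build the JSON body with jq so that quotes or backslashes in the credentials are escaped.
      jq -cn --arg username "${CPD_USERNAME}" --arg password "${CPD_PASSWORD}" \
        '{username: $username, password: $password}' \
      | curl -k -s -X POST "${CPD_URL}/icp4d-api/v1/authorize" \
        -H "Content-Type: application/json" \
        -d @- \
      | jq -r '.token'
    )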
  4. Verify that the token was retrieved:
    echo "Token obtained: ${TOKEN:0:30}..."
  5. Update the Prometheus authentication secret:
    oc create secret generic prometheus-bearer-token \
      --from-literal=token="${TOKEN}" \
      -n "${DATAGATE_NAMESPACE}" \
      --dry-run=client -o yaml | oc apply -f -
    
  6. Wait until the Prometheus pods have restarted. This takes about one minute.
    To watch the pod status, enter the following command:
    oc get pods -n openshift-user-workload-monitoring \
      -l app.kubernetes.io/name=prometheus -w
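
The echo command in step 4 prints output even when the token request failed, because jq -r prints the literal string null when the response contains no token field. The following is a minimal sketch of a stricter check that could run between steps 3 and 5. The function name is_usable_token is hypothetical and not part of any product tooling.

```shell
#!/usr/bin/env bash

# Returns 0 only if the argument looks like a real token:
# non-empty, and not the literal string "null" that jq -r prints
# when the response has no .token field.
is_usable_token() {
  local token="$1"
  [[ -n "${token}" && "${token}" != "null" ]]
}

# Example guard (assumes TOKEN was set as in step 3):
#   is_usable_token "${TOKEN}" || { echo "Token request failed" >&2; exit 1; }
```

With such a guard in place, a failed login stops the procedure before an invalid value is written to the prometheus-bearer-token secret.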

What to do next

To find out whether the bearer token secret must be updated, you can do the following:

  • Check whether metric collection has failed. Failures are a good indicator of an expired bearer token. Run the following Prometheus query:
    
    increase(datagate_table_sync_metrics_refresh_failures_total[10m]) > 0
  • See if Prometheus user workload monitoring is enabled and running:
    oc get route -n openshift-user-workload-monitoring
  • Open the Prometheus UI and select Status > Targets on the sidebar. Check the target status that is displayed when you select and run the datagate-table-sync-metrics query.
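
These checks can be complemented by a direct probe with the stored token, because an expired token results in 401 or 403 responses. The following is a rough sketch under stated assumptions: the function names are hypothetical, and the URL to probe is a placeholder that you must supply, because this topic does not name the metrics endpoint.

```shell
#!/usr/bin/env bash

# Succeeds (returns 0) if the HTTP status code indicates a rejected token.
# 401 and 403 are the codes that Prometheus receives when the token has expired.
token_needs_refresh() {
  case "$1" in
    401|403) return 0 ;;
    *)       return 1 ;;
  esac
}

# Requests the given URL with the given bearer token and prints only the
# HTTP status code. The URL is a placeholder supplied by the caller.
probe_status() {
  local url="$1" token="$2"
  curl -k -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${token}" "${url}"
}

# Example usage (assumes the secret created in step 5 of the procedure):
#   STORED=$(oc get secret prometheus-bearer-token -n "${DATAGATE_NAMESPACE}" \
#     -o jsonpath='{.data.token}' | base64 -d)
#   STATUS=$(probe_status "<metrics-endpoint-url>" "${STORED}")
#   token_needs_refresh "${STATUS}" && echo "Token rejected, refresh required"
```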

In addition, you can take the following measures to automate bearer token secret updates:

  • Create a cron job on the OpenShift server that refreshes the token and restarts the Prometheus pods.
  • Use an external secrets operator for centralized secrets management.
  • Make the Data Gate pods use a sidecar container that manages the token lifecycle.
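
For the cron job option, a scheduled refresh should tolerate short outages of the authorization API so that one failed attempt does not leave an expired token in place. The following is a minimal sketch of a retry helper that such a job could use. The names retry and refresh_token_once, the script path, and the schedule are all assumptions for illustration. The schedule must be shorter than the actual token lifetime of your instance.

```shell
#!/usr/bin/env bash

# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have been made.
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "Giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "Attempt ${attempt} failed, retrying in ${delay}s" >&2
    sleep "${delay}"
    attempt=$(( attempt + 1 ))
  done
}

# Hypothetical usage in a script that is run by cron, where
# refresh_token_once is a function that wraps steps 3 to 5 of the procedure:
#   retry 3 30 refresh_token_once
#
# Hypothetical crontab entry that runs the script every six hours:
#   0 */6 * * * /usr/local/bin/refresh-prometheus-token.sh
```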