Setting up Grafana for Data Gate instance monitoring (optional)

Grafana, developed by Grafana Labs, is a graphical, browser-based tool for monitoring application metrics. You can set it up to display the metrics that Prometheus captures. To make Grafana work, you must complete a few extra configuration steps after you have set up Prometheus.

Before you begin

Make sure that Prometheus has been set up properly for the Data Gate instances you want to monitor.

To create a Grafana dashboard, you need a JSON definition file (named datagate_table_sync_grafana_dashboard.json in this topic). Contact IBM Support to request this file.

Procedure

  1. Log in to the OpenShift® server where the Data Gate instance is installed.
  2. Deploy Grafana by entering the following command:
    oc apply -f - <<'EOF' 
    apiVersion: v1 
    kind: Namespace 
    metadata: 
       name: datagate-monitoring 
    ---
    apiVersion: apps/v1 
    kind: Deployment 
    metadata: 
       name: grafana 
       namespace: datagate-monitoring 
    spec: 
       replicas: 1 
       selector: 
          matchLabels: 
             app: grafana 
       template: 
          metadata: 
            labels: 
               app: grafana 
          spec: 
             containers: 
               - name: grafana 
                 image: grafana/grafana:latest 
                 ports: 
                   - containerPort: 3000 
                 env: 
                   - name: GF_SECURITY_ADMIN_PASSWORD 
                     value: "admin"
    ---
    apiVersion: v1 
    kind: Service 
    metadata: 
       name: grafana 
       namespace: datagate-monitoring 
    spec: 
       selector: 
          app: grafana 
       ports: 
         - name: http 
           port: 3000 
           targetPort: 3000 
    ---
    apiVersion: route.openshift.io/v1 
    kind: Route 
    metadata: 
       name: grafana 
       namespace: datagate-monitoring 
    spec: 
      to: 
        kind: Service 
        name: grafana 
      port: 
         targetPort: http 
      tls: 
         termination: edge 
    EOF
    The following confirmation message is displayed:
    namespace/datagate-monitoring created 
    deployment.apps/grafana created 
    service/grafana created 
    route.route.openshift.io/grafana created
  3. To verify the deployment, check if the Grafana pod is running. Enter the following command:
    oc get pods -n datagate-monitoring
    The output is similar to the following example:
    NAME                      READY     STATUS     RESTARTS    AGE
    grafana-5c696d6c4b-m7zk8  1/1       Running    0           2m20s
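  4. Create a Grafana service account. The commands in steps 4 to 6 read the target namespace from the DATAGATE_NAMESPACE environment variable, so make sure that this variable is set in your shell. Enter the following command: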
    kubectl create sa grafana -n "${DATAGATE_NAMESPACE}"
    You see the following screen output:
    serviceaccount/grafana created
  5. Grant view permissions for cluster monitoring:
    kubectl create clusterrolebinding grafana-cluster-monitoring-view \
    --clusterrole=cluster-monitoring-view \
    --serviceaccount="${DATAGATE_NAMESPACE}:grafana"
    You see the following confirmation message on the screen:
    clusterrolebinding.rbac.authorization.k8s.io/grafana-cluster-monitoring-view created
  6. Create a service account token with a validity of 2 years (17520 hours):
    kubectl create token grafana --duration=17520h -n "${DATAGATE_NAMESPACE}"
    The token is displayed on the screen. Select it and copy it to the clipboard. You need it in a later step.
  7. Display the host name of the Grafana route by entering the following command:
    oc get route grafana -n datagate-monitoring -o jsonpath='{.spec.host}{"\n"}'
    The host name of the route is displayed on the screen.
  8. Copy the host name and paste it into the address bar of a browser, prefixed with https://.
    This starts the Grafana UI in the web browser:
    Figure 1. The Grafana welcome page
  9. Log in with your username and password.
    The default credentials after the deployment are:
    Email or username: admin
    Password: admin
  10. Change the password when prompted.
    After you log in, you see the following page:
    Figure 2. The Grafana start page after your login
  11. On the sidebar on the left, select Connections > Data sources.
  12. Click Add data source and select Prometheus.
    1. Provide the following values:
      Field or control                   Value
      Prometheus server URL              https://thanos-querier.openshift-monitoring.svc:9091
      Access                             Server
      Authentication method              No authentication
      Skip TLS certificate validation    Select check box to enable
      Important: The Thanos Querier endpoint (https://thanos-querier.openshift-monitoring.svc:9091) is the recommended endpoint for querying OpenShift user workload monitoring metrics. It provides a unified query interface and handles authentication properly.
    2. Click Add header.
    3. Provide the following values:
      Field or control    Value
      Header              Authorization
      Value               Bearer <token>

      where <token> is the token you created in step 6.

      Note: If the token expires, or you need to regenerate it, run the kubectl create token command in step 6 again and update the Authorization header value in Grafana.
  13. Click Save and test.
    You should see the message Data source is working.
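
The command-line values used in steps 6 to 8 and in step 12 can be derived with a few small shell helpers. The following is a minimal sketch and not part of the product: the function names years_to_hours, route_url, and bearer_value are hypothetical, and the 365-day year is an assumption chosen because it matches the 17520h value in step 6.

```shell
# Hypothetical helpers for illustration; none of these names come from the product.

# Convert a validity period in years to the hour string that
# 'kubectl create token --duration' expects (assumes 365-day years).
years_to_hours() {
  echo "$(( $1 * 365 * 24 ))h"
}

# Turn the route host name printed in step 7 into a browser URL.
# The route uses edge TLS termination, so the scheme is https.
route_url() {
  printf 'https://%s\n' "$1"
}

# Build the value of the Authorization header for step 12.
bearer_value() {
  printf 'Bearer %s\n' "$1"
}

years_to_hours 2   # prints 17520h
```

For example, kubectl create token grafana --duration="$(years_to_hours 2)" -n "${DATAGATE_NAMESPACE}" is equivalent to the command in step 6.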

Creating a Grafana dashboard

  1. To create a Grafana dashboard:
    1. On the sidebar, select Dashboards.
    2. Click New.
    3. Select Import.
    4. Select the datagate_table_sync_grafana_dashboard.json file for uploading.
    5. Select the Prometheus data source you created in step 12.
    6. Click Import.

What to do next

You might find the following queries useful:
max(datagate_table_sync_state{state="running"}) != 1
To see if synchronization is out of order, that is, not in the running state.
increase(datagate_table_sync_metrics_refresh_failures_total[10m]) > 0
To see if metrics retrieval failures have occurred within the last 10 minutes.
datagate_table_sync_latency_milliseconds > 60000
To report only synchronization latency values greater than 60000 milliseconds (60 seconds). Adjust the threshold value as needed.
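
Outside Grafana, queries like these can also be sent to the Prometheus HTTP API with curl. The following is a minimal sketch under several assumptions: the function name run_query and the variables THANOS_URL and TOKEN are hypothetical, the endpoint is the one entered for the data source and is typically reachable only from inside the cluster, and the token is the one created for the grafana service account.

```shell
# Assumed values, not taken from the product:
#   THANOS_URL  query endpoint used for the Grafana data source
#   TOKEN       bearer token of the grafana service account
THANOS_URL="${THANOS_URL:-https://thanos-querier.openshift-monitoring.svc:9091}"

# Run one instant query through the Prometheus HTTP API (/api/v1/query).
# -k skips TLS certificate validation, mirroring the data source setting.
# The function stops with an error message if TOKEN is not set.
run_query() {
  curl -sk -G "${THANOS_URL}/api/v1/query" \
    -H "Authorization: Bearer ${TOKEN:?Set TOKEN to the service account token}" \
    --data-urlencode "query=$1"
}

# Example call (run from a host that can resolve the endpoint):
#   TOKEN="<token>" run_query 'datagate_table_sync_latency_milliseconds > 60000'
```

The response is JSON. An empty result list means that no time series currently matches the condition.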