Example: Creating a PVC custom monitor

Important: IBM Cloud Pak® for Data Version 4.8 will reach end of support (EOS) on 31 July, 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.8 reaches end of support. For more information, see Upgrading from IBM Cloud Pak for Data Version 4.8 to IBM Software Hub Version 5.1.

You can use this tutorial to build a custom monitor that tracks persistent volume claims and reports events based on the status.

If a PVC is bound, an info event is recorded for the PVC. If a PVC is in an unbound or failed state, a critical event is recorded. This monitor is based on Python 3.7 and uses the Kubernetes Python SDK to interact with the cluster.

About this task

The following conditions apply to the example:

  • The "monitor_type" in the monitor event body must match the details.name of the monitor in the ConfigMap. In the example, it is sample-monitor.
  • The "event_type" in the monitor event body must match the details.event_types[i].name in the ConfigMap. In the example, it is check-pvc-bound.
  • The "metadata" in the monitor event body is a comma-separated list of <key>=<value> pairs. Each key must match a variable name denoted by angle brackets (< >) in details.event_types[i].long_description. In this example, it is Phase=<actual value>, such as Phase=Bound.
Example of a monitor event body:
data = {"monitor_type":monitor_type, "event_type":event_type, "severity":severity, "metadata":metadata, "reference":pvc.metadata.name, "namespace": pvc.metadata.namespace}
Example monitor extension ConfigMap:
  "details": {
    "name":"sample-monitor",
    "description": "sample monitor description",
    "image": "image-registry.openshift-image-registry.svc:5000/zen/pvc-monitor:latest",
    "schedule": "*/10 * * * *",
    "event_types": [
      {
        "name": "check-pvc-bound",
        "simple_name": "PVC bound check",
        "alert_type": "platform",
        "short_description": "A monitor that checks whether a PVC is bound.",
        "long_description": "PVC status phase: <Phase>"
      }
    ]
  }
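
The three matching rules above can be checked before an event is posted. The following is a minimal sketch, not part of the product: the function name check_event_against_details is hypothetical, and the check covers only the three rules listed in this section.

```python
import re


def check_event_against_details(event, details):
    """Return a list of mismatches between an event body and the monitor details."""
    problems = []

    # Rule 1: monitor_type must equal details.name
    if event.get("monitor_type") != details.get("name"):
        problems.append("monitor_type does not match details.name")

    # Rule 2: event_type must equal one of details.event_types[i].name
    event_types = {et["name"]: et for et in details.get("event_types", [])}
    event_type = event_types.get(event.get("event_type"))
    if event_type is None:
        problems.append("event_type not found in details.event_types")
        return problems

    # Rule 3: every <Variable> in long_description needs a key in metadata
    expected = set(re.findall(r"<([^<>]+)>", event_type.get("long_description", "")))
    supplied = set()
    for pair in event.get("metadata", "").split(","):
        key, sep, _value = pair.partition("=")
        if sep:
            supplied.add(key.strip())
    missing = expected - supplied
    if missing:
        problems.append("metadata is missing keys: " + ", ".join(sorted(missing)))
    return problems


details = {
    "name": "sample-monitor",
    "event_types": [
        {"name": "check-pvc-bound", "long_description": "PVC status phase: <Phase>"}
    ],
}
event = {
    "monitor_type": "sample-monitor",
    "event_type": "check-pvc-bound",
    "severity": "info",
    "metadata": "Phase=Bound",
}
print(check_event_against_details(event, details))  # prints []
```

An empty list means that the event body is consistent with the ConfigMap details. Each string in a non-empty list names one rule that the event violates.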

Step 1: Create the Python script to monitor and record PVC status

The following script uses the in-cluster config to authenticate to Kubernetes and access the resources. By default, you have access to the following volume:

  • zen-service-broker-secret

The following environment variables are made available as part of the cron job initialization:

  • ICPD_CONTROLPLANE_NAMESPACE: The control plane namespace.

The following Python script (pvc-monitor.py) lists PVCs and generates events based on their state. All non-bound PVCs are recorded with critical severity. The events are sent as a JSON array to the POST events endpoint, authorized with the service broker token.

import os
import requests
import json
from kubernetes import client, config

def main():
    # setup the namespace
    ns = os.environ.get('ICPD_CONTROLPLANE_NAMESPACE')
    if ns is None:
        ns = ""
    monitor_type = "sample-monitor"
    event_type = "check-pvc-bound"

    # configure client
    config.load_incluster_config()
    api = client.CoreV1Api()

    # configure post request and set secret headers
    url = 'https://zen-watchdog-svc:4444/zen-watchdog/v1/monitoring/events'
    with open('/var/run/sharedsecrets/token', 'r') as file:
        secret_header = file.read().replace('\n', '')
    headers = {'Content-type': 'application/json', 'secret': secret_header}

    # List the PVCs and print each one; set severity to critical for any PVC that is not bound
    pvcs = api.list_namespaced_persistent_volume_claim(namespace=ns, watch=False)
    events = []
    for pvc in pvcs.items:
        severity = "info"
        print("pvc {}: {}".format(pvc.metadata.name, pvc.status.phase))
        if pvc.status.phase != 'Bound':
            severity = "critical"
        metadata = "{}={}".format("Phase", str(pvc.status.phase))
        data = {"monitor_type":monitor_type, "event_type":event_type, "severity":severity, "metadata":metadata, "reference":pvc.metadata.name, "namespace": pvc.metadata.namespace}
        if pvc.metadata.labels is not None and "icpdsupport/addOnId" in pvc.metadata.labels:
            data["addon_id"] = pvc.metadata.labels["icpdsupport/addOnId"]
        events.append(data)
    json_string = json.dumps(events)
    print("events: {}".format(json_string))

    # post call to zen-watchdog to record events
    # verify=False skips TLS certificate verification for the in-cluster service endpoint
    r = requests.post(url, headers=headers, data=json_string, verify=False)
    print("status code: {}".format(r.status_code))

if __name__ == '__main__': 
    main()
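
The severity and metadata logic in the script can be exercised without a cluster by moving it into a small function. The following is a minimal sketch under that assumption: the build_event helper, the PVC names, and the sample-addon label value are hypothetical and do not appear in the script above.

```python
import json

MONITOR_TYPE = "sample-monitor"
EVENT_TYPE = "check-pvc-bound"


def build_event(name, namespace, phase, labels=None):
    """Build one monitor event body from the basic facts about a PVC."""
    # Any phase other than Bound is reported as critical, as in the script
    severity = "info" if phase == "Bound" else "critical"
    event = {
        "monitor_type": MONITOR_TYPE,
        "event_type": EVENT_TYPE,
        "severity": severity,
        "metadata": "Phase={}".format(phase),
        "reference": name,
        "namespace": namespace,
    }
    # The add-on ID is attached only when the PVC carries the label
    if labels and "icpdsupport/addOnId" in labels:
        event["addon_id"] = labels["icpdsupport/addOnId"]
    return event


# Hypothetical PVCs used only to illustrate the two severities
events = [
    build_event("data-pvc", "zen", "Bound"),
    build_event("scratch-pvc", "zen", "Pending", {"icpdsupport/addOnId": "sample-addon"}),
]
print(json.dumps(events, indent=2))
```

The first event has severity info and the second has severity critical with an addon_id field. The printed JSON array has the same shape as the payload that the script sends to the events endpoint.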

Step 2: Create and run the Dockerfile

Prerequisites

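For the Python script to run in a containerized environment, you need access to a Python executable with the Kubernetes Python SDK installed. The script also imports requests, which pip installs as a dependency of the kubernetes package.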

  1. You need the following requirements.txt:

    kubernetes==11.0.0
  2. You need the following Dockerfile:

    # set base image (host OS)
    FROM python:3.8
    RUN mkdir /pvc-monitor
    # set the working directory in the container
    WORKDIR /pvc-monitor
    ADD . /pvc-monitor
    # install dependencies
    RUN pip install -r requirements.txt
    # command to run on container start
    CMD [ "python", "./pvc-monitor.py" ]
    
  3. Place the script, the requirements.txt file, and the Dockerfile in one project directory. The requirements.txt file holds all the required packages for the image. The final structure of the project will be similar to the following example:

    -----Python app to monitor and alert for PVCs
    pvc-monitor/
      pvc-monitor.py
      requirements.txt
      Dockerfile
    

Procedure

When the prerequisites are in place, run the following commands.

  1. Build the Docker image by using the following command:
    docker build -f Dockerfile -t pvc-monitor:latest .

    The build output includes the image ID (<docker-image-id>) that you need for the docker tag command in the next step.

  2. Tag and push the Docker image to your private container registry so that it can be accessed by the alerting cron job.
    docker tag <docker-image-id> <docker-registry>/<namespace>/pvc-monitor:latest
    docker push <docker-registry>/<namespace>/pvc-monitor:latest
    

    The default internal service address of the OpenShift image registry is image-registry.openshift-image-registry.svc:5000.

    Note: You can also push the Docker image to the OpenShift® registry. If you choose to push the image to the OpenShift registry, you must expose the registry. For more information, see Exposing the registry in the OpenShift documentation.
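
The image reference that you push must be the same string that the extension ConfigMap names in its image field. The following is a minimal sketch that composes the reference and the matching docker commands. The helper names and the registry.example.com host are hypothetical. The sketch passes the local name pvc-monitor:latest to docker tag, which accepts an image name as well as an image ID.

```python
def image_reference(registry, namespace, name="pvc-monitor", tag="latest"):
    """Compose <docker-registry>/<namespace>/<name>:<tag>."""
    return "{}/{}/{}:{}".format(registry.rstrip("/"), namespace, name, tag)


def tag_and_push_commands(source, registry, namespace):
    """Return the docker tag and docker push commands as argument lists."""
    target = image_reference(registry, namespace)
    return [
        ["docker", "tag", source, target],
        ["docker", "push", target],
    ]


# registry.example.com is a placeholder for your private registry
for argv in tag_and_push_commands("pvc-monitor:latest", "registry.example.com", "zen"):
    print(" ".join(argv))
```

The commands are only printed, not run. For the internal registry address and the zen namespace, image_reference returns the same value that the sample ConfigMap uses for its image field.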

Step 3: Set up the extension ConfigMap for the monitor

After you push the Docker image to the registry, you can create an extension ConfigMap that points to the image. This step ensures that the alert manager picks up the image and creates a cron job so that the script is run at scheduled intervals.

You can use the following sample ConfigMap:

apiVersion: v1 
kind: ConfigMap 
metadata: 
  name: sample-monitor-extension
  labels: 
    icpdata_addon: "true" 
    icpdata_addon_version: "1.0.0" 
data: 
  extensions: |
      [
        {
          "extension_point_id": "zen_alert_monitor",
          "extension_name": "zen_alert_monitor_sample",
          "display_name": "Sample alert monitor",
          "details": {
            "name":"sample-monitor",
            "description": "sample monitor description",
            "image": "image-registry.openshift-image-registry.svc:5000/zen/pvc-monitor:latest",
            "schedule": "*/10 * * * *",
            "event_types": [
              {
                "name": "check-pvc-bound",
                "simple_name": "PVC bound check",
                "alert_type": "platform",
                "short_description": "A monitor that checks whether a PVC is bound.",
                "long_description": "PVC status phase: <Phase>"
              }
            ]
          }
        }
      ]
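
Because the extensions value is a JSON array stored as a string, a syntax error in it is easy to miss until the monitor fails to appear. The following is a minimal sketch of a pre-flight check. The check_extensions function is hypothetical, and the list of required fields and the five-field schedule rule are assumptions based on the sample above rather than a documented schema.

```python
import json

# The same JSON array that the sample ConfigMap stores under data.extensions
EXTENSIONS = """
[
  {
    "extension_point_id": "zen_alert_monitor",
    "extension_name": "zen_alert_monitor_sample",
    "display_name": "Sample alert monitor",
    "details": {
      "name": "sample-monitor",
      "description": "sample monitor description",
      "image": "image-registry.openshift-image-registry.svc:5000/zen/pvc-monitor:latest",
      "schedule": "*/10 * * * *",
      "event_types": [
        {
          "name": "check-pvc-bound",
          "simple_name": "PVC bound check",
          "alert_type": "platform",
          "short_description": "A monitor that checks whether a PVC is bound.",
          "long_description": "PVC status phase: <Phase>"
        }
      ]
    }
  }
]
"""


def check_extensions(text):
    """Parse the extensions string and return a list of problems found."""
    problems = []
    # json.loads raises ValueError if the string is not valid JSON
    for ext in json.loads(text):
        details = ext.get("details", {})
        label = ext.get("extension_name", "<unnamed>")
        # Assumed required fields, taken from the sample ConfigMap
        for field in ("name", "image", "schedule", "event_types"):
            if not details.get(field):
                problems.append("{}: details.{} is missing or empty".format(label, field))
        # Assumption: the schedule is a standard five-field cron expression
        schedule = details.get("schedule", "")
        if schedule and len(schedule.split()) != 5:
            problems.append("{}: schedule should have five cron fields".format(label))
    return problems


print(check_extensions(EXTENSIONS))  # prints []
```

An empty list means that the string parses as JSON and that each extension has the fields shown in the sample.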

What to do next

Verify that your custom monitor was deployed successfully. For more information, see Monitoring the platform.

To troubleshoot your custom monitor, see the Service monitor development guidelines.