Tutorial: Creating a custom monitor

Important: IBM Cloud Pak® for Data Version 4.7 will reach end of support (EOS) on 31 July, 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.7 reaches end of support. For more information, see Upgrading IBM Software Hub in the IBM Software Hub Version 5.1 documentation.

You can use this tutorial to build a custom monitor that tracks persistent volume claims and reports events based on the status.

If a PVC is bound, an info event is recorded for the PVC. If a PVC is in an unbound or failed state, a critical event is recorded. This monitor is written in Python 3 (the sample Dockerfile in step 2 uses the python:3.8 base image) and uses the Kubernetes Python SDK to interact with the cluster.

Step 1: Create the Python script to monitor and record PVC status

The following script uses the in-cluster config to authenticate to Kubernetes and access the resources. By default, you have access to the following volume:

  • zen-service-broker-secret

The following environment variables are made available as part of the cron job initialization:

  • ICPD_CONTROLPLANE_NAMESPACE: The control plane namespace.
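
As a minimal sketch of how a monitor might read these two inputs defensively, the following helper takes the namespace from ICPD_CONTROLPLANE_NAMESPACE and the token from a file. The function name load_monitor_context is hypothetical and not part of the product. The default token path is the one that the tutorial script reads; this sketch assumes that the secret volume is mounted at that path.

```python
import os


def load_monitor_context(env=None, token_path="/var/run/sharedsecrets/token"):
    """Return (namespace, token) for the monitor (hypothetical helper)."""
    env = os.environ if env is None else env
    # Fall back to an empty string when the variable is not set,
    # matching the behavior of the tutorial script.
    namespace = env.get("ICPD_CONTROLPLANE_NAMESPACE") or ""
    with open(token_path, "r") as handle:
        # Strip surrounding whitespace, including the trailing newline
        # that secret files often carry.
        token = handle.read().strip()
    if not token:
        raise ValueError("service broker token file is empty: " + token_path)
    return namespace, token
```

Failing early on an empty token makes a misconfigured mount visible in the job log instead of surfacing later as a rejected POST request.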

The following Python script (pvc-monitor.py) lists PVCs and generates events based on their state. All non-bound PVCs are recorded with critical severity. The events are sent as a JSON array to the POST events endpoint, authorized with the service broker token.

import os
import requests
import json
from kubernetes import client, config


def main():
    # setup the namespace
    ns = os.environ.get('ICPD_CONTROLPLANE_NAMESPACE')
    if ns is None:
        ns = ""
    monitor_type = "sample-monitor"
    event_type = "check-pvc-bound"

    # configure client
    config.load_incluster_config()
    api = client.CoreV1Api()

    # configure post request and set secret headers
    url = 'https://zen-watchdog-svc:4444/zen-watchdog/v1/monitoring/events'
    with open('/var/run/sharedsecrets/token', 'r') as file:
        secret_header = file.read().replace('\n', '')
    headers = {'Content-type': 'application/json', 'secret': secret_header}

    # Print PVC list, set status as critical for unbound or failed pvc
    pvcs = api.list_namespaced_persistent_volume_claim(namespace=ns, watch=False)
    events = []
    for pvc in pvcs.items:
        severity = "info"
        print("pvc {}: {}".format(pvc.metadata.name, pvc.status.phase))
        if pvc.status.phase != 'Bound':
            severity = "critical"
        metadata = "{}={}".format("Phase", str(pvc.status.phase))
        data = {
            "monitor_type": monitor_type,
            "event_type": event_type,
            "severity": severity,
            "metadata": metadata,
            "reference": pvc.metadata.name,
            "namespace": pvc.metadata.namespace,
        }
        if pvc.metadata.labels is not None and "icpdsupport/addOnId" in pvc.metadata.labels:
            data["addon_id"] = pvc.metadata.labels["icpdsupport/addOnId"]
        events.append(data)
    json_string = json.dumps(events)
    print("events: {}".format(json_string))

    # post call to zen-watchdog to record events
    r = requests.post(url, headers=headers, data=json_string, verify=False)
    print("status code: {}".format(r.status_code))

if __name__ == '__main__': 
    main()
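
The severity rule in the script can be isolated into a small function that is easy to check without a cluster. The following sketch is an illustration, not part of the product: build_event is a hypothetical helper that takes plain values instead of Kubernetes objects and produces the same dictionary that the script appends to the events list. The PVC names and the add-on ID in the usage section are made-up sample values.

```python
import json


def build_event(name, namespace, phase, labels=None,
                monitor_type="sample-monitor", event_type="check-pvc-bound"):
    """Build one event dictionary for a PVC (hypothetical helper)."""
    # Only a Bound PVC is healthy; any other phase is reported as critical.
    severity = "info" if phase == "Bound" else "critical"
    event = {
        "monitor_type": monitor_type,
        "event_type": event_type,
        "severity": severity,
        "metadata": "Phase={}".format(phase),
        "reference": name,
        "namespace": namespace,
    }
    # Attach the add-on ID only when the PVC carries the label.
    if labels and "icpdsupport/addOnId" in labels:
        event["addon_id"] = labels["icpdsupport/addOnId"]
    return event


if __name__ == "__main__":
    # Made-up sample PVCs: one bound, one pending with an add-on label.
    sample = [
        build_event("data-pvc", "zen", "Bound"),
        build_event("logs-pvc", "zen", "Pending",
                    {"icpdsupport/addOnId": "example-addon"}),
    ]
    print(json.dumps(sample, indent=2))
```

Keeping the rule in a pure function means the mapping from phase to severity can be changed or extended in one place, for example to treat additional phases differently.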

Step 2: Create the Dockerfile and build the image

For the Python script to run in a containerized environment, the image must provide a Python executable and the Kubernetes Python SDK.

You need the following requirements.txt:

kubernetes==11.0.0
requests

You need the following Dockerfile:

# set base image (host OS)
FROM python:3.8
RUN mkdir /pvc-monitor
# set the working directory in the container
WORKDIR /pvc-monitor
ADD . /pvc-monitor
# install dependencies
RUN pip install -r requirements.txt
# command to run on container start
CMD [ "python", "./pvc-monitor.py" ]

The requirements.txt file lists the packages that the image requires. The final structure of the project will be similar to the following example:

-----Python app to monitor and alert for PVCs
pvc-monitor/
  pvc-monitor.py
  requirements.txt
  Dockerfile

When this structure is in place, you can build the Docker image by using the following command:

docker build -f Dockerfile -t pvc-monitor:latest .

Finally, tag and push the Docker image into the OpenShift® registry so that it can be accessed by the alerting cron job.

docker tag <docker-image-id> <docker-registry>/<namespace>/pvc-monitor:latest
docker push <docker-registry>/<namespace>/pvc-monitor:latest

Step 3: Set up the extension configmap for the monitor

After you push the Docker image to the OpenShift registry, create an extension configmap that points to the image. The alert manager picks up the image from the configmap and creates a cron job that runs the script at the scheduled intervals.

You can use the following sample configmap:

apiVersion: v1 
kind: ConfigMap 
metadata: 
  name: sample-monitor-extension
  labels: 
    icpdata_addon: "true" 
    icpdata_addon_version: "1.0.0" 
data: 
  extensions: |
      [
        {
          "extension_point_id": "zen_alert_monitor",
          "extension_name": "zen_alert_monitor_sample",
          "display_name": "Sample alert monitor",
          "details": {
            "name":"sample-monitor",
            "description": "sample monitor description",
            "image": "image-registry.openshift-image-registry.svc:5000/zen/pvc-monitor:latest",
            "schedule": "*/10 * * * *",
            "event_types": [
              {
                "name": "check-pvc-bound",
                "simple_name": "PVC bound check",
                "alert_type": "platform",
                "short_description": "A monitor that checks whether a PVC is bound.",
                "long_description": "PVC status phase: <Phase>"
              }
            ]
          }
        }
      ]
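
Because the extensions value is a JSON string embedded in YAML, a small syntax error can be hard to spot. The following sketch, which is an illustration rather than part of the product, generates that JSON with json.dumps and checks that the schedule has five cron fields. The function name build_extensions is hypothetical; the field names and values come from the sample configmap. The sample schedule, */10 * * * *, runs the job every 10 minutes.

```python
import json


def build_extensions(image, schedule="*/10 * * * *", event_type="check-pvc-bound"):
    """Return the JSON text for the configmap 'extensions' key (hypothetical helper)."""
    # A standard cron expression has five whitespace-separated fields.
    if len(schedule.split()) != 5:
        raise ValueError("schedule must have five cron fields: " + repr(schedule))
    extension = {
        "extension_point_id": "zen_alert_monitor",
        "extension_name": "zen_alert_monitor_sample",
        "display_name": "Sample alert monitor",
        "details": {
            "name": "sample-monitor",
            "description": "sample monitor description",
            "image": image,
            "schedule": schedule,
            "event_types": [
                {
                    # Must match the event_type that the script posts.
                    "name": event_type,
                    "simple_name": "PVC bound check",
                    "alert_type": "platform",
                    "short_description": "A monitor that checks whether a PVC is bound.",
                    "long_description": "PVC status phase: <Phase>",
                }
            ],
        },
    }
    return json.dumps([extension], indent=2)


if __name__ == "__main__":
    print(build_extensions(
        "image-registry.openshift-image-registry.svc:5000/zen/pvc-monitor:latest"))
```

The printed text can be pasted under the extensions key of the configmap, indented to match the YAML block.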