Tutorial: Creating a custom monitor
You can use this tutorial to build a custom monitor that tracks persistent volume claims and reports events based on the status.
If a PVC is bound, an info event is recorded for the PVC. If a PVC is in an unbound or failed state, a critical event is recorded. This monitor is based on Python 3.7 and uses the Kubernetes Python SDK to interact with the cluster.
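The severity rule described above can be sketched as a small pure function. The name `severity_for_phase` is hypothetical and is not part of the tutorial's script; the phase strings are the standard Kubernetes PVC phases.

```python
def severity_for_phase(phase):
    """Map a PVC phase to an event severity: only 'Bound' counts as healthy."""
    return "info" if phase == "Bound" else "critical"


if __name__ == "__main__":
    # Bound, Pending, and Lost are the PVC phases that Kubernetes reports.
    for phase in ("Bound", "Pending", "Lost"):
        print(phase, "->", severity_for_phase(phase))
```

Keeping the rule in one function makes it easy to change later, for example to treat a specific phase as a warning instead of critical.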
Step 1: Create the Python script to monitor and record PVC status
The following script uses the in-cluster config to authenticate to Kubernetes and access the resources. By default, you have access to the following volume:
zen-service-broker-secret
The following environment variables are made available as part of the cron job initialization:
ICPD_CONTROLPLANE_NAMESPACE: The control plane namespace.
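As a minimal sketch, the two inputs listed above could be read with small helpers like the following. The helper names are hypothetical. The token path is the one that the tutorial script reads; that it is where the zen-service-broker-secret volume is mounted is an assumption.

```python
import os

# Path read by the tutorial script; assumed to be where the secret volume is mounted.
TOKEN_PATH = "/var/run/sharedsecrets/token"


def read_namespace(environ=os.environ):
    """Return the control plane namespace, or an empty string if it is unset."""
    return environ.get("ICPD_CONTROLPLANE_NAMESPACE", "")


def read_token(path=TOKEN_PATH):
    """Read the service broker token and drop newlines, as the tutorial script does."""
    with open(path, "r") as token_file:
        return token_file.read().replace("\n", "")


def build_headers(token):
    """Build the request headers, passing the token in the 'secret' header."""
    return {"Content-type": "application/json", "secret": token}
```

Passing the environment and the path as parameters lets the helpers be exercised outside the cluster, where neither the variable nor the mounted file exists.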
The following Python script (pvc-monitor.py) lists PVCs and generates events based on their state. All non-bound PVCs are recorded with critical severity. The events are sent as a JSON array to the POST events endpoint, authorized with the service broker token.
import os
import requests
import json
from kubernetes import client, config, watch

def main():
    # setup the namespace
    ns = os.environ.get('ICPD_CONTROLPLANE_NAMESPACE')
    if ns is None:
        ns = ""
    monitor_type = "sample-monitor"
    event_type = "check-pvc-bound"

    # configure client
    config.load_incluster_config()
    api = client.CoreV1Api()

    # configure post request and set secret headers
    url = 'https://zen-watchdog-svc:4444/zen-watchdog/v1/monitoring/events'
    with open('/var/run/sharedsecrets/token', 'r') as file:
        secret_header = file.read().replace('\n', '')
    headers = {'Content-type': 'application/json', 'secret': secret_header}

    # Print PVC list, set status as critical for unbound or failed pvc
    pvcs = api.list_namespaced_persistent_volume_claim(namespace=ns, watch=False)
    events = []
    for pvc in pvcs.items:
        severity = "info"
        print("pvc {}: {}".format(pvc.metadata.name, pvc.status.phase))
        if pvc.status.phase != 'Bound':
            severity = "critical"
        metadata = "{}={}".format("Phase", str(pvc.status.phase))
        data = {
            "monitor_type": monitor_type,
            "event_type": event_type,
            "severity": severity,
            "metadata": metadata,
            "reference": pvc.metadata.name,
            "namespace": pvc.metadata.namespace,
        }
        if pvc.metadata.labels is not None and "icpdsupport/addOnId" in pvc.metadata.labels:
            data["addon_id"] = pvc.metadata.labels["icpdsupport/addOnId"]
        events.append(data)

    json_string = json.dumps(events)
    print("events: {}".format(json_string))

    # post call to zen-watchdog to record events
    r = requests.post(url, headers=headers, data=json_string, verify=False)
    print("status code: {}".format(r.status_code))

if __name__ == '__main__':
    main()
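To check the event payload without a cluster, the event-building part of the script can be isolated and fed stand-in objects. The following is a sketch under that assumption: `build_events` and `fake_pvc` are hypothetical names, and `fake_pvc` only imitates the few attributes of a PVC object that the script reads.

```python
import json
from types import SimpleNamespace


def build_events(pvcs, monitor_type="sample-monitor", event_type="check-pvc-bound"):
    """Build one event dict per PVC-like object, mirroring the script's fields."""
    events = []
    for pvc in pvcs:
        phase = pvc.status.phase
        event = {
            "monitor_type": monitor_type,
            "event_type": event_type,
            "severity": "info" if phase == "Bound" else "critical",
            "metadata": "Phase={}".format(phase),
            "reference": pvc.metadata.name,
            "namespace": pvc.metadata.namespace,
        }
        # The add-on ID is optional and is only set when the label is present.
        labels = pvc.metadata.labels or {}
        if "icpdsupport/addOnId" in labels:
            event["addon_id"] = labels["icpdsupport/addOnId"]
        events.append(event)
    return events


def fake_pvc(name, phase, namespace="zen", labels=None):
    """Hypothetical stand-in for a PVC object, for local testing only."""
    return SimpleNamespace(
        metadata=SimpleNamespace(name=name, namespace=namespace, labels=labels),
        status=SimpleNamespace(phase=phase),
    )


if __name__ == "__main__":
    sample = [
        fake_pvc("data-pvc", "Bound", labels={"icpdsupport/addOnId": "sample-addon"}),
        fake_pvc("scratch-pvc", "Pending"),
    ]
    # Prints the JSON array that would be sent to the events endpoint.
    print(json.dumps(build_events(sample), indent=2))
```

The PVC names, the `zen` namespace, and the `sample-addon` label value in the usage example are made up for illustration.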
Step 2: Create and run the Dockerfile
For the Python script to run in a containerized environment, you need an image that provides a Python executable with the Kubernetes Python SDK installed.
You need the following requirements.txt:
kubernetes==11.0.0
You need the following Dockerfile:
# set base image (host OS)
FROM python:3.8
RUN mkdir /pvc-monitor
# set the working directory in the container
WORKDIR /pvc-monitor
ADD . /pvc-monitor
# install dependencies
RUN pip install -r requirements.txt
# command to run on container start
CMD [ "python", "./pvc-monitor.py" ]
The requirements.txt file lists the packages that the image requires. The final structure of the project will be similar to the following example:
-----Python app to monitor and alert for PVCs
pvc-monitor/
    pvc-monitor.py
    requirements.txt
    Dockerfile
When this structure is in place, you can build the Docker image by using the following command:
docker build -f Dockerfile -t pvc-monitor:latest .
Finally, tag and push the Docker image into the OpenShift® registry so that it can be accessed by the alerting cron job.
docker tag <docker-image-id> <docker-registry>/<namespace>/pvc-monitor:latest
docker push <docker-registry>/<namespace>/pvc-monitor:latest
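The placeholders in the tag and push commands can be composed as in the following sketch. The helper names are hypothetical, and the example values are taken from the image field of the sample configmap in Step 3. The registry address that you push to from outside the cluster may differ from the internal service address, so treat the values as assumptions.

```python
def image_reference(registry, namespace, name="pvc-monitor", tag="latest"):
    """Join the parts of an image reference as <registry>/<namespace>/<name>:<tag>."""
    parts = {"registry": registry, "namespace": namespace, "name": name, "tag": tag}
    missing = [key for key, value in parts.items() if not value]
    if missing:
        raise ValueError("missing image reference parts: {}".format(", ".join(missing)))
    return "{}/{}/{}:{}".format(registry.rstrip("/"), namespace, name, tag)


def push_commands(local_image, reference):
    """Return the docker tag and docker push commands as argument lists."""
    return [
        ["docker", "tag", local_image, reference],
        ["docker", "push", reference],
    ]


if __name__ == "__main__":
    ref = image_reference("image-registry.openshift-image-registry.svc:5000", "zen")
    for command in push_commands("pvc-monitor:latest", ref):
        print(" ".join(command))
```

Building the reference in one place helps keep the pushed image name identical to the one that the extension configmap points to.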
Step 3: Set up the extension configmap for the monitor
After you push the Docker image to the OpenShift registry, you can create an extension configmap that points to the image. The alert manager picks up the image from this configmap and creates a cron job that runs the script at scheduled intervals.
You can use the following sample configmap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-monitor-extension
  labels:
    icpdata_addon: "true"
    icpdata_addon_version: "1.0.0"
data:
  extensions: |
    [
      {
        "extension_point_id": "zen_alert_monitor",
        "extension_name": "zen_alert_monitor_sample",
        "display_name": "Sample alert monitor",
        "details": {
          "name":"sample-monitor",
          "description": "sample monitor description",
          "image": "image-registry.openshift-image-registry.svc:5000/zen/pvc-monitor:latest",
          "schedule": "*/10 * * * *",
          "event_types": [
            {
              "name": "check-pvc-bound",
              "simple_name": "PVC bound check",
              "alert_type": "platform",
              "short_description": "A monitor that checks whether a PVC is bound.",
              "long_description": "PVC status phase: <Phase>"
            }
          ]
        }
      }
    ]
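The sample configmap uses the same monitor name (sample-monitor) and event type name (check-pvc-bound) as the Python script in Step 1. Assuming that these names are meant to match, which the sample suggests but does not state, a quick local check of the extensions JSON might look like the following sketch. The function name `check_extension` is hypothetical, and the JSON is abbreviated to the fields that the check reads.

```python
import json

# Abbreviated copy of the "extensions" value from the sample configmap.
EXTENSIONS = """
[
  {
    "extension_point_id": "zen_alert_monitor",
    "extension_name": "zen_alert_monitor_sample",
    "details": {
      "name": "sample-monitor",
      "image": "image-registry.openshift-image-registry.svc:5000/zen/pvc-monitor:latest",
      "schedule": "*/10 * * * *",
      "event_types": [
        {"name": "check-pvc-bound", "alert_type": "platform"}
      ]
    }
  }
]
"""


def check_extension(extensions_json, monitor_type, event_type):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    for extension in json.loads(extensions_json):
        details = extension.get("details", {})
        if details.get("name") != monitor_type:
            problems.append("monitor name does not match the script's monitor_type")
        # A standard cron expression has five whitespace-separated fields.
        if len(details.get("schedule", "").split()) != 5:
            problems.append("schedule is not a five-field cron expression")
        names = [item.get("name") for item in details.get("event_types", [])]
        if event_type not in names:
            problems.append("event type is not declared in event_types")
    return problems


if __name__ == "__main__":
    print(check_extension(EXTENSIONS, "sample-monitor", "check-pvc-bound"))
```

The schedule value `*/10 * * * *` in the sample is a cron expression that runs the job every 10 minutes.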