Example: Creating a PVC custom monitor
You can use this tutorial to build a custom monitor that tracks persistent volume claims and reports events based on the status.
If a PVC is bound, an info event is recorded for the PVC. If a PVC is unbound or in a failed state, a critical event is recorded. This monitor is based on Python 3.7 and uses the Kubernetes Python SDK to interact with the cluster.
About this task
In the following example, the following conditions apply:
- The "monitor_type" in the monitor event body must match the details.name of the monitor in the ConfigMap. In the example, it is sample-monitor.
- The "event_type" in the monitor event body must match the details.event_types[i].name in the ConfigMap. In the example, it is check-pvc-bound.
- The "metadata" in the monitor event body is a comma-separated, <key>=<value> pair that must match the variable names denoted by angle brackets (< >) in details.event_types[i].long_description. In this example, it is Phase=<actual value>.
data = {"monitor_type":monitor_type, "event_type":event_type, "severity":severity, "metadata":metadata, "reference":pvc.metadata.name, "namespace": pvc.metadata.namespace}
"details": {
    "name":"sample-monitor",
    "description": "sample monitor description",
    "image": "image-registry.openshift-image-registry.svc:5000/zen/pvc-monitor:latest",
    "schedule": "*/10 * * * *",
    "event_types": [
        {
            "name": "check-pvc-bound",
            "simple_name": "PVC bound check",
            "alert_type": "platform",
            "short_description": "A monitor that checks whether a PVC is bound.",
            "long_description": "PVC status phase: <Phase>"
        }
    ]
}
Step 1: Create the Python script to monitor and record PVC status
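Before writing the script, it can help to confirm that an event body satisfies the three matching conditions listed under About this task. The following is a minimal sketch and not part of the product: validate_event is a hypothetical helper, and the details dictionary only mirrors the sample values shown in this tutorial. It checks that every variable named in long_description has a key in metadata; it does not call any platform API.

```python
import re


def validate_event(event, details):
    """Return a list of mismatches between an event body and the monitor details."""
    problems = []
    if event.get("monitor_type") != details.get("name"):
        problems.append("monitor_type does not match details.name")

    event_types = {et["name"]: et for et in details.get("event_types", [])}
    event_type = event_types.get(event.get("event_type"))
    if event_type is None:
        problems.append("event_type does not match any details.event_types[i].name")
        return problems

    # Variables in long_description are written in angle brackets, for example <Phase>.
    expected = set(re.findall(r"<([^<>]+)>", event_type.get("long_description", "")))

    # metadata is a comma-separated list of key=value pairs.
    provided = set()
    for pair in event.get("metadata", "").split(","):
        key, sep, _value = pair.partition("=")
        if sep:
            provided.add(key.strip())

    missing = expected - provided
    if missing:
        problems.append("metadata is missing keys: " + ", ".join(sorted(missing)))
    return problems


# Values copied from the sample ConfigMap and event body in this tutorial.
details = {
    "name": "sample-monitor",
    "event_types": [
        {"name": "check-pvc-bound", "long_description": "PVC status phase: <Phase>"}
    ],
}
event = {
    "monitor_type": "sample-monitor",
    "event_type": "check-pvc-bound",
    "severity": "info",
    "metadata": "Phase=Bound",
}

print(validate_event(event, details))  # prints []
```

An empty list means that the event body and the ConfigMap details agree on the monitor name, the event type name, and the metadata keys.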
The following script uses the in-cluster config to authenticate to Kubernetes and access the resources. By default, you have access to the following volume:
zen-service-broker-secret
The following environment variables are made available as part of the cron job initialization:
ICPD_CONTROLPLANE_NAMESPACE: The control plane namespace.
The following Python script (pvc-monitor.py) lists PVCs and generates events based on their state. All non-bound PVCs are recorded with critical severity. The events are sent as a JSON array to the POST events endpoint, authorized with the service broker token.
import os
import requests
import json
from kubernetes import client, config, watch

def main():
    # setup the namespace
    ns = os.environ.get('ICPD_CONTROLPLANE_NAMESPACE')
    if ns is None:
        ns = ""
    monitor_type = "sample-monitor"
    event_type = "check-pvc-bound"

    # configure client
    config.load_incluster_config()
    api = client.CoreV1Api()

    # configure post request and set secret headers
    url = 'https://zen-watchdog-svc:4444/zen-watchdog/v1/monitoring/events'
    with open('/var/run/sharedsecrets/token', 'r') as file:
        secret_header = file.read().replace('\n', '')
    headers = {'Content-type': 'application/json', 'secret': secret_header}

    # Print PVC list, set status as critical for unbound or failed pvc
    pvcs = api.list_namespaced_persistent_volume_claim(namespace=ns, watch=False)
    events = []
    for pvc in pvcs.items:
        severity = "info"
        print("pvc {}: {}".format(pvc.metadata.name, pvc.status.phase))
        if pvc.status.phase != 'Bound':
            severity = "critical"
        metadata = "{}={}".format("Phase", str(pvc.status.phase))
        data = {"monitor_type":monitor_type, "event_type":event_type, "severity":severity, "metadata":metadata, "reference":pvc.metadata.name, "namespace": pvc.metadata.namespace}
        if pvc.metadata.labels is not None and "icpdsupport/addOnId" in pvc.metadata.labels:
            data["addon_id"] = pvc.metadata.labels["icpdsupport/addOnId"]
        events.append(data)

    json_string = json.dumps(events)
    print("events: {}".format(json_string))

    # post call to zen-watchdog to record events
    r = requests.post(url, headers=headers, data=json_string, verify=False)
    print("status code: {}".format(r.status_code))

if __name__ == '__main__':
    main()
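The script above needs a running cluster and the service broker token, which makes the event logic hard to try out locally. One possible approach, shown here only as a sketch, is to move the event construction into a pure function that accepts any objects with the same metadata and status attributes. The names build_events and fake_pvc are hypothetical and do not appear in the product; fake_pvc is a stand-in for the PVC objects that the Kubernetes Python SDK returns.

```python
from types import SimpleNamespace


def build_events(pvcs, monitor_type="sample-monitor", event_type="check-pvc-bound"):
    """Build one event body per PVC; any phase other than Bound is critical."""
    events = []
    for pvc in pvcs:
        phase = pvc.status.phase
        severity = "info" if phase == "Bound" else "critical"
        event = {
            "monitor_type": monitor_type,
            "event_type": event_type,
            "severity": severity,
            "metadata": "Phase={}".format(phase),
            "reference": pvc.metadata.name,
            "namespace": pvc.metadata.namespace,
        }
        labels = pvc.metadata.labels or {}
        if "icpdsupport/addOnId" in labels:
            event["addon_id"] = labels["icpdsupport/addOnId"]
        events.append(event)
    return events


def fake_pvc(name, phase, labels=None, namespace="zen"):
    """Hypothetical stand-in for a PVC object, for local testing only."""
    return SimpleNamespace(
        metadata=SimpleNamespace(name=name, namespace=namespace, labels=labels),
        status=SimpleNamespace(phase=phase),
    )


sample = [
    fake_pvc("data-pvc", "Bound"),
    fake_pvc("logs-pvc", "Pending", labels={"icpdsupport/addOnId": "demo"}),
]
for event in build_events(sample):
    print(event["reference"], event["severity"])
# prints:
# data-pvc info
# logs-pvc critical
```

In the real script, the same function could be called with pvcs.items, and the returned list passed to json.dumps before the POST request.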
Step 2: Create and run the Dockerfile
Prerequisites
For the Python script to run in a containerized environment, you need access to a Python executable with the Kubernetes Python SDK included.
- You need the following requirements.txt:
  kubernetes==11.0.0
- You need the following Dockerfile:
  # set base image (host OS)
  FROM python:3.8
  RUN mkdir /pvc-monitor
  # set the working directory in the container
  WORKDIR /pvc-monitor
  ADD . /pvc-monitor
  # install dependencies
  RUN pip install -r requirements.txt
  # command to run on container start
  CMD [ "python", "./pvc-monitor.py" ]
- The requirements.txt file holds all the required packages for the image. The final structure of the project will be similar to the following example:
  Python app to monitor and alert for PVCs
  pvc-monitor/
      pvc-monitor.py
      requirements.txt
      Dockerfile
Procedure
When the prerequisites are in place, run the following commands.
- Build the Docker image by using the following command:
  docker build -f Dockerfile -t pvc-monitor:latest .
  This command returns the <docker-image-id> that you need for the docker tag <docker-image-id> statement in the next step.
- Tag and push the Docker image to your private container registry so that it can be accessed by the alerting cron job.
  docker tag <docker-image-id> <docker-registry>/<namespace>/pvc-monitor:latest
  docker push <docker-registry>/<namespace>/pvc-monitor:latest
  The default route for the internal image registry is image-registry.openshift-image-registry.svc:5000.
Note: You can also push the Docker image to the OpenShift® registry. If you choose to push the image to the OpenShift registry, you must expose the registry. For more information, see Exposing the registry in the OpenShift documentation.
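The image reference that you tag and push must be the same string that the monitor's "image" field uses in the ConfigMap. As a small illustration, the following sketch composes that reference and the two docker commands from their parts. The helper names image_reference and tag_and_push_commands are hypothetical, and the registry and namespace values are the ones from the sample ConfigMap; substitute your own.

```python
def image_reference(registry, namespace, name="pvc-monitor", tag="latest"):
    """Compose <docker-registry>/<namespace>/<name>:<tag>."""
    return "{}/{}/{}:{}".format(registry.rstrip("/"), namespace, name, tag)


def tag_and_push_commands(image_id, registry, namespace):
    """Return the docker tag and docker push commands as argument lists."""
    target = image_reference(registry, namespace)
    return [
        ["docker", "tag", image_id, target],
        ["docker", "push", target],
    ]


# Assumed values taken from the sample ConfigMap in this tutorial.
commands = tag_and_push_commands(
    "<docker-image-id>",
    "image-registry.openshift-image-registry.svc:5000",
    "zen",
)
for cmd in commands:
    print(" ".join(cmd))
```

The sketch only prints the commands. Each argument list could also be passed to subprocess.run(cmd, check=True) if you prefer to run the steps from Python.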
Step 3: Set up the extension ConfigMap for the monitor
After you push the Docker image to the registry, you can create an extension ConfigMap that points to the image. This step ensures that the alert manager picks up the image and creates a cron job so that the script is run at scheduled intervals.
You can use the following sample ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-monitor-extension
  labels:
    icpdata_addon: "true"
    icpdata_addon_version: "1.0.0"
data:
  extensions: |
    [
      {
        "extension_point_id": "zen_alert_monitor",
        "extension_name": "zen_alert_monitor_sample",
        "display_name": "Sample alert monitor",
        "details": {
          "name":"sample-monitor",
          "description": "sample monitor description",
          "image": "image-registry.openshift-image-registry.svc:5000/zen/pvc-monitor:latest",
          "schedule": "*/10 * * * *",
          "event_types": [
            {
              "name": "check-pvc-bound",
              "simple_name": "PVC bound check",
              "alert_type": "platform",
              "short_description": "A monitor that checks whether a PVC is bound.",
              "long_description": "PVC status phase: <Phase>"
            }
          ]
        }
      }
    ]
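Because the extensions value is a JSON document embedded in a string, a stray comma or quote is easy to miss when the manifest is edited by hand. One way to reduce that risk, sketched here under the assumption that you generate the manifest rather than type it, is to build the structure in Python and let json.dumps serialize it. The field values are copied from the sample ConfigMap; the variable names are illustrative only.

```python
import json

# Monitor definition, copied from the sample ConfigMap in this tutorial.
details = {
    "name": "sample-monitor",
    "description": "sample monitor description",
    "image": "image-registry.openshift-image-registry.svc:5000/zen/pvc-monitor:latest",
    "schedule": "*/10 * * * *",  # cron syntax: every 10 minutes
    "event_types": [
        {
            "name": "check-pvc-bound",
            "simple_name": "PVC bound check",
            "alert_type": "platform",
            "short_description": "A monitor that checks whether a PVC is bound.",
            "long_description": "PVC status phase: <Phase>",
        }
    ],
}

extensions = [
    {
        "extension_point_id": "zen_alert_monitor",
        "extension_name": "zen_alert_monitor_sample",
        "display_name": "Sample alert monitor",
        "details": details,
    }
]

config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {
        "name": "sample-monitor-extension",
        "labels": {"icpdata_addon": "true", "icpdata_addon_version": "1.0.0"},
    },
    # The extensions value is a string that itself contains JSON.
    "data": {"extensions": json.dumps(extensions, indent=2)},
}

manifest = json.dumps(config_map, indent=2)

# Round-trip check: the embedded string must parse back to the same structure.
parsed = json.loads(json.loads(manifest)["data"]["extensions"])
print(parsed[0]["details"]["name"])  # prints sample-monitor
```

Kubernetes manifests can be written in JSON as well as YAML, so the generated manifest string could be saved to a file and applied with the same tooling that you use for the YAML version.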
What to do next
Verify that your custom monitor was deployed successfully. For more information, see Monitoring the platform.
To troubleshoot your custom monitor, see the Service monitor development guidelines.