IBM® Cloud Private monitoring service
You can deploy more instances of the monitoring service that is provided with IBM Cloud Private to monitor the status of your applications.
Deploy the monitoring service from the Catalog
You can deploy the monitoring service with customized configurations from the Catalog in the IBM Cloud Private management console.
- From the Catalog page, click the ibm-icpmonitoring Helm chart to configure and deploy it.
- Provide values for the required parameters:
  - Helm release name: "monitoring"
  - Target namespace: "kube-system"
  - Mode of deployment: "Managed"
  - Cluster access address: Specify the Domain Name Service (DNS) name or IP address that is used to access the IBM Cloud Private console.
  - Cluster access port: Specify the port that is used to access the IBM Cloud Private console. The default port is 8443.
  - Persistent volume: If you want to use a persistent volume for Prometheus, Grafana, or Alertmanager, select the required check box.
  - Name of the storageClass for the persistentVolume: Specify the name of the storage class that you are using to provision the volume.
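The same parameters can also be expressed as Helm chart values. The following sketch of an equivalent values fragment uses a placeholder address; the three keys shown match the --set flags used in the CLI install:

```yaml
# Illustrative chart values for the parameters above
# (clusterAddress is a placeholder for your console DNS name or IP):
mode: managed
clusterAddress: 9.30.123.123
clusterPort: 8443
```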
Deploy the monitoring service from the CLI
- Install the Kubernetes command line (kubectl). See Accessing your IBM Cloud Private cluster by using the kubectl CLI.
- Install the Helm command line interface (CLI). See Setting up the Helm CLI.
- Install the ibm-icpmonitoring Helm chart. Run the following command:

  helm install -n monitoring --namespace kube-system --set mode=managed --set clusterAddress=<IP_address> --set clusterPort=<port> ibm-charts/ibm-icpmonitoring

  <IP_address> is the DNS name or IP address that is used to access the IBM Cloud Private console; <port> is the port that is used to access the IBM Cloud Private console.
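The install command can also be parameterized in a small script. This is a sketch with placeholder values for the address and port; it only builds and prints the command so that you can review it before running it against a cluster:

```shell
IP_ADDRESS=9.30.123.123   # placeholder: your console DNS name or IP
PORT=8443                 # placeholder: your console port

# Build the helm install command from the variables above.
CMD="helm install -n monitoring --namespace kube-system --set mode=managed --set clusterAddress=$IP_ADDRESS --set clusterPort=$PORT ibm-charts/ibm-icpmonitoring"

# Uncomment to run against a live cluster:
# $CMD
echo "$CMD"
```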
Access monitoring dashboards
First, log in to the IBM Cloud Private management console.
Then, access the monitoring dashboards by using the following URLs:
- URL to access the Prometheus dashboard: https://<IP_address>:<port>/prometheus
- URL to access the Grafana dashboard: https://<IP_address>:<port>/grafana
- URL to access the Alertmanager dashboard: https://<IP_address>:<port>/alertmanager

<IP_address> is the DNS name or IP address that is used to access the IBM Cloud Private console; <port> is the port that is used to access the IBM Cloud Private console.
Change the user name and password for Grafana
To change the user name and password for Grafana, which default to admin/admin, create a file named update.yaml. Include the following content in the file, where the username and password fields contain base64-encoded values:
apiVersion: v1
kind: Secret
metadata:
  labels:
    app: monitoring-grafana
    chart: ibm-icpmonitoring-1.1.1
    component: grafana
    release: monitoring
    heritage: Tiller
  name: monitoring-grafana-secret
type: Opaque
data:
  username: ${BASE64 encoded username}
  password: ${BASE64 encoded password}
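The values under data must be base64-encoded. For example, from the shell (the credentials here are placeholders, not recommended values):

```shell
# Encode the credentials; printf avoids encoding a trailing newline,
# which 'echo' would otherwise include.
USERNAME_B64=$(printf '%s' 'admin' | base64)
PASSWORD_B64=$(printf '%s' 'newpassword' | base64)
echo "username: $USERNAME_B64"
echo "password: $PASSWORD_B64"
```

Paste the printed values into the username and password fields of update.yaml.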
For the kube-system namespace, complete the following steps, where $NAMESPACE is kube-system:

- Enter the following command:

  kubectl replace -f update.yaml -n $NAMESPACE

- Get the pod name for Grafana with the following command:

  kubectl get pods -n $NAMESPACE | grep grafana

- Delete that pod with the following command, where $POD is the name of the pod:

  kubectl delete pod $POD -n $NAMESPACE

- Wait for the pod to be fully back up. You can check its status by running the following command and waiting for the READY column to read 2/2:

  kubectl get pods -n $NAMESPACE --no-headers | grep grafana | grep -v Terminating
Now you can check whether you have to hook the data source back up by checking the following URL in the console:

https://{CLUSTER_IP}:8443/grafana/api/datasources

The updated user name and password are automatically used by the backend system. If the return value is [], you need to hook the data source back up.
For a namespace other than kube-system, complete the following steps. These examples use the default namespace:
- Enter the following command:

  kubectl replace -f update.yaml -n default

- Get the pod name for Grafana with the following command:

  kubectl get pods -n default | grep grafana

- Delete that pod with the following command, where $POD is the name of the pod:

  kubectl delete pod $POD -n default

- Wait for the pod to be fully back up. You can check its status by running the following command and waiting for the READY column to read 1/1:

  kubectl get pods -n default --no-headers | grep grafana | grep -v Terminating
At this point you can check whether you have to hook the data source back up. Get the node port for the deployment by entering the following command:

kubectl get --namespace default -o jsonpath="{.spec.ports[0].nodePort}" services monitoring-grafana

Then run the following command; if the return value is [], you need to hook the data source back up:

curl -k http://$USERNAME:$PASSWORD@$CLUSTER_IP:$NODEPORT/api/datasources
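The check in the last step can be scripted. The following sketch uses a stand-in value for the curl response so that the decision logic is visible; in practice, RESPONSE would be set from the curl command above:

```shell
# Stand-in for the API response; on a live cluster, replace with:
# RESPONSE=$(curl -sk "http://$USERNAME:$PASSWORD@$CLUSTER_IP:$NODEPORT/api/datasources")
RESPONSE='[]'

# An empty JSON array means Grafana has no data sources configured.
if [ "$RESPONSE" = "[]" ]; then
  RESULT="datasource missing: hook the Prometheus data source back up"
else
  RESULT="datasource present"
fi
echo "$RESULT"
```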
Configure applications to use the monitoring service
- Modify the application to expose the metrics.
  - For applications that have a metrics endpoint, you must define the metrics endpoint as a Kubernetes service by using the annotation prometheus.io/scrape: 'true'. The service definition resembles the following code:

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
      labels:
        app: liberty
      name: liberty
    spec:
      ports:
      - name: metrics
        targetPort: 5556
        port: 5556
        protocol: TCP
      selector:
        app: liberty
      type: ClusterIP
Note: For more information about configuring the metrics endpoint for Prometheus, see CLIENT LIBRARIES in the Prometheus documentation.
  - For applications that use collectd and depend on collectd-exporter to expose metrics, you update the collectd configuration file within the application container. In this configuration file, you must add the network plug-in and point to the collectd exporter. Add the following text to the configuration file:

    LoadPlugin network
    <Plugin network>
      Server "monitoring-prometheus-collectdexporter.kube-system" "25826"
    </Plugin>
Logs and metrics management for Prometheus
You can modify the time period for metric retention by updating the storage.tsdb.retention parameter in the config.yaml file. By default, this value is set to 24h, which means that metrics are kept for 24 hours and then purged. See Configuring the monitoring service.
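For example, to keep metrics for three days, the retention value would change from 24h to 72h. The fragment below is an illustrative sketch only: the surrounding key nesting is an assumption, so confirm the exact location of the parameter in Configuring the monitoring service before editing config.yaml:

```yaml
# Illustrative config.yaml fragment (the monitoring/prometheus nesting
# is an assumption; only the parameter name comes from this document):
monitoring:
  prometheus:
    storage.tsdb.retention: 72h   # keep metrics for 72 hours instead of the default 24h
```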
However, if you need to manually remove this data from the system, you can use the REST API that is provided by the Prometheus component.
- To delete metrics data, see Delete Series.
- To remove the deleted data from the disk, and clean up the disk space, see Clean Tombstones.
Note: The target URL must have the format https://<IP_address>:8443/prometheus
, where <IP_address>
is the IP address that is used to access the management console.
- The URL to delete metrics data resembles the following:

  https://<IP_address>:8443/prometheus/api/v1/admin/tsdb/delete_series?*******

- The URL to remove deleted data and clean up the disk resembles the following:

  https://<IP_address>:8443/prometheus/api/v1/admin/tsdb/clean_tombstones
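A cleanup pass can be sketched as a small script. The match[] selector below uses a hypothetical job label as the series filter, the address is a placeholder, and the live calls (commented out) assume the Prometheus admin API is enabled; this sketch only builds and prints the URLs:

```shell
IP_ADDRESS=9.30.123.123   # placeholder: your console DNS name or IP
BASE="https://$IP_ADDRESS:8443/prometheus/api/v1/admin/tsdb"

# Build the two admin API URLs; job="liberty" is a hypothetical selector.
DELETE_URL="$BASE/delete_series?match[]={job=\"liberty\"}"
CLEAN_URL="$BASE/clean_tombstones"

# Uncomment to run against a live cluster:
# curl -k -X POST "$DELETE_URL"
# curl -k -X POST "$CLEAN_URL"
echo "$DELETE_URL"
echo "$CLEAN_URL"
```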