Configuring the monitoring service
You can customize the monitoring service during your product installation.
Add the following lines of code to the config.yaml file that is located in the /<installation_directory>/cluster folder. Customize the parameters as required. See Customize the parameters. Then, save and exit the file.
monitoring:
  prometheus:
    scrapeInterval: 1m
    evaluationInterval: 1m
    retention: 24h
    persistentVolume:
      enabled: false
      storageClass: "-"
    resources:
      limits:
        cpu: 500m
        memory: 2048Mi
      requests:
        cpu: 100m
        memory: 128Mi
  alertmanager:
    persistentVolume:
      enabled: false
      storageClass: "-"
    resources:
      limits:
        cpu: 200m
        memory: 256Mi
      requests:
        cpu: 10m
        memory: 64Mi
  grafana:
    persistentVolume:
      enabled: false
      storageClass: "-"
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 100m
        memory: 128Mi
Customize the parameters
You can customize the values of the parameters, as required.
- The monitoring.prometheus section has the following parameters:
  - prometheus.scrapeInterval is the frequency at which Prometheus scrapes targets.
  - prometheus.evaluationInterval is the frequency at which Prometheus evaluates rules.
  - prometheus.retention is the length of time to retain the monitoring data.
  - prometheus.persistentVolume.enabled is a flag that you set to use a persistent volume for Prometheus. The value false means that you do not use a persistent volume.
  - prometheus.persistentVolume.storageClass is the storage class to be used by Prometheus. See Storage class parameter.
  - prometheus.resources.limits.cpu is the CPU limit that you set for the Prometheus container. The default value is 500m (500 millicpu).
  - prometheus.resources.limits.memory is the memory limit that you set for the Prometheus container. The default value is 512Mi (512 mebibytes).
- The monitoring.alertmanager section has the following parameters:
  - alertmanager.persistentVolume.enabled is a flag that you set to use a persistent volume for Alertmanager. The value false means that you do not use a persistent volume.
  - alertmanager.persistentVolume.storageClass is the storage class to be used by Alertmanager. See Storage class parameter.
  - alertmanager.resources.limits.cpu is the CPU limit that you set for the Alertmanager container. The default value is 200m (200 millicpu).
  - alertmanager.resources.limits.memory is the memory limit that you set for the Alertmanager container. The default value is 256Mi (256 mebibytes).
- The monitoring.grafana section has the following parameters:
  - grafana.user is the user name that you use to access Grafana.
  - grafana.password is the password of the user who is specified in the grafana.user parameter.
  - grafana.persistentVolume.enabled is a flag that you set to use a persistent volume for Grafana. The value false means that you do not use a persistent volume.
  - grafana.persistentVolume.storageClass is the storage class to be used by Grafana. See Storage class parameter.
  - grafana.resources.limits.cpu is the CPU limit that you set for the Grafana container. The default value is 500m (500 millicpu).
  - grafana.resources.limits.memory is the memory limit that you set for the Grafana container. The default value is 512Mi (512 mebibytes).
For all the available parameters, see Parameters.
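As an illustration of the parameters that are described in this section, the following fragment is a minimal sketch of a customized monitoring section in config.yaml. It assumes that grafana.user and grafana.password sit at the same nesting level as the other Grafana keys. The intervals, retention period, and password are placeholder values, not recommendations.

```yaml
monitoring:
  prometheus:
    scrapeInterval: 30s          # placeholder: scrape targets every 30 seconds
    evaluationInterval: 30s      # placeholder: evaluate rules every 30 seconds
    retention: 48h               # placeholder: retain monitoring data for 48 hours
  grafana:
    user: "admin"                # the documented default user name
    password: "<your_password>"  # placeholder: replace with your own value
```

Any parameter that you omit from the fragment is expected to keep its default value.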
Storage class parameter
The storageClass parameter value is the name of the storage class that the monitoring service uses.
- Enter "-" to not use a storage class. The data is stored within the container file system, and all data is lost if the container crashes.
- Enter the name of a storage class, such as glusterfs, to use shared storage. If you use shared storage, your data is preserved if the container crashes. To use this option, you must configure the network storage provider.
You can specify any valid Kubernetes storage class. See Storage Classes in the Kubernetes documentation.
NOTE: To enable persistent volumes for the monitoring service during your product installation, you must use a storage provider, such as GlusterFS, that supports dynamic storage provisioning. If you choose a provider, such as NFS, that does not support dynamic storage provisioning, you must install the monitoring service after you install your product. For more information, see Installing monitoring service in your product.
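The following fragment sketches how persistent storage might be enabled for all three components. It assumes that a storage class named glusterfs already exists in your cluster and supports dynamic provisioning; substitute the name of your own storage class.

```yaml
monitoring:
  prometheus:
    persistentVolume:
      enabled: true
      storageClass: "glusterfs"  # assumed name of an existing storage class
  alertmanager:
    persistentVolume:
      enabled: true
      storageClass: "glusterfs"
  grafana:
    persistentVolume:
      enabled: true
      storageClass: "glusterfs"
```

With enabled set to false and storageClass set to "-", as in the sample earlier on this page, the data stays in the container file system and does not survive a container crash.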
Parameters
The following table lists the monitoring service parameters and their default values. You can configure these parameters, as required.
Parameter | Description | Default value |
---|---|---|
environment | Target environment of deployment. Valid options are openshift and non-openshift. | non-openshift |
mode | Deployment mode. Valid options are managed and standard. | standard |
tls.enabled | Enable security for the chart | false |
tls.issuer | Name of the certificate issuer | icp-ca-issuer |
tls.issuerKind | Type of certificate issuer. Valid options are Issuer and ClusterIssuer. | ClusterIssuer |
tls.ca.secretName | Secret of the CA certificate | cluster-ca-cert |
tls.ca.certFieldName | Name of the CA certificate that is used in the secret | tls.crt |
tls.server.existingSecretName | Existing secret of the server certificate | "" |
tls.server.certFieldName | Name of the server certificate that is used in the secret | tls.crt |
tls.server.keyFieldName | Name of the server key that is used in the secret | tls.key |
tls.exporter.existingSecretName | Existing secret of the exporter certificate | "" |
tls.exporter.certFieldName | Name of the exporter certificate that is used in the secret | tls.crt |
tls.exporter.keyFieldName | Name of the exporter key that is used in the secret | tls.key |
tls.client.existingSecretName | Existing secret of the client certificate | "" |
tls.client.certFieldName | Name of the client certificate that is used in the secret | tls.crt |
tls.client.keyFieldName | Name of the client key that is used in the secret | tls.key |
imagePullPolicy | Policy to pull the deployed images | IfNotPresent |
imagePullSecrets | Image secret that is used to pull images from a private repository | "" |
clusterAddress | IP address or DNS name that is used to access the cluster | 127.0.0.1 |
clusterPort | Port that is used to access the cluster | 8443 |
clusterDomain | Domain name of the cluster | cluster.local |
clusterName | Name of the target cluster | mycluster |
prometheus.image.repository | Image name of the Prometheus server container | ibmcom/prometheus |
prometheus.image.tag | Image tag of the Prometheus server container | v2.0.0 |
prometheus.port | Port number of the Prometheus server service | 80 |
prometheus.scrapeInterval | Interval to scrape metrics | 1m |
prometheus.evaluationInterval | Evaluation interval for alert rules | 1m |
prometheus.retention | Prometheus storage retention time | 24h |
prometheus.args | Arguments for the Prometheus container | {} |
prometheus.persistentVolume.enabled | Set to true if you want to create a volume to store data | false |
prometheus.persistentVolume.useDynamicProvisioning | Set to true if you want to dynamically provision a persistent volume | true |
prometheus.persistentVolume.size | Capacity of the persistent volume claim | 10Gi |
prometheus.persistentVolume.storageClass | Storage class for the Prometheus persistent volume | "" |
prometheus.persistentVolume.existingClaimName | Specify the name if you want to use an existing persistent volume claim | "" |
prometheus.persistentVolume.selector.label | If you want to use a particular volume, specify the name of the label | "" |
prometheus.persistentVolume.selector.value | If you want to use a particular volume, specify the value of the label | "" |
prometheus.probe.enabled | Set to true if you want to enable the health probe for Prometheus | true |
prometheus.probe.readiness.args | Arguments for readiness probe | {} |
prometheus.probe.liveness.args | Arguments for liveness probe | {} |
prometheus.resources.limits.cpu | Prometheus CPU limits | 500m |
prometheus.resources.limits.memory | Prometheus memory limits | 512Mi |
prometheus.resources.requests.cpu | Prometheus CPU requests | 100m |
prometheus.resources.requests.memory | Prometheus memory requests | 128Mi |
prometheus.alertRuleFiles | Prometheus alert rules template | alertRules |
prometheus.configFiles | Prometheus configurations template | prometheusConfig |
prometheus.rbacRoleCreation | Set to true if you want to create a role-based access control (RBAC) role and role binding | true |
prometheus.ingress.enabled | Set to true if you want to create Prometheus ingress | false |
prometheus.ingress.annotations | Annotation for Prometheus ingress | {} |
prometheus.service.type | Type of Prometheus service | NodePort |
prometheus.etcdTarget.enabled | Adds an etcd scrape target in the Prometheus configuration, if set to true | false |
prometheus.etcdTarget.etcdAddress | etcd server list | ["127.0.0.1"] |
prometheus.etcdTarget.etcdPort | etcd server port | 4001 |
prometheus.etcdTarget.secret | Secret that is used to access the etcd metrics endpoint | etcd-secret |
prometheus.etcdTarget.tlsConfig | TLS configuration for etcd scrape configuration | {} |
alertmanager.image.repository | Alertmanager container image name | ibmcom/alertmanager |
alertmanager.image.tag | Alertmanager container image tag | v0.13.0 |
alertmanager.port | Alertmanager service port | 80 |
alertmanager.persistentVolume.enabled | Creates a volume to store data, if set to true | false |
alertmanager.persistentVolume.useDynamicProvisioning | Dynamically provisions a persistent volume, if set to true | true |
alertmanager.persistentVolume.size | Size of the persistent volume claim | 1Gi |
alertmanager.persistentVolume.storageClass | Storage class for Alertmanager persistent volume | "" |
alertmanager.persistentVolume.existingClaimName | Specify the name if you want to use an existing persistent volume claim | "" |
alertmanager.persistentVolume.selector.label | If you want to use a particular volume, specify the name of the label | "" |
alertmanager.persistentVolume.selector.value | If you want to use a particular volume, specify the value of the label | "" |
alertmanager.probe.enabled | Enables health probe for Alertmanager, if set to true | true |
alertmanager.probe.readiness.args | Arguments for readiness probe | {} |
alertmanager.probe.liveness.args | Arguments for liveness probe | {} |
alertmanager.resources.limits.cpu | Alertmanager CPU limits | 200m |
alertmanager.resources.limits.memory | Alertmanager memory limits | 256Mi |
alertmanager.resources.requests.cpu | Alertmanager CPU requests | 10m |
alertmanager.resources.requests.memory | Alertmanager memory requests | 64Mi |
alertmanager.configFiles | Alertmanager configurations file name | alermanagerConfig |
alertmanager.ingress.enabled | Creates Alertmanager ingress, if set to true | false |
alertmanager.ingress.annotations | Annotation for Alertmanager ingress | {} |
alertmanager.service.type | Type of Alertmanager service | NodePort |
kubeStateMetrics.enabled | Installs Kubernetes metrics exporter, if set to true | false |
kubeStateMetrics.image.repository | kube-state-metrics container image name | ibmcom/kube-state-metrics |
kubeStateMetrics.image.tag | kube-state-metrics container image tag | v1.2.0 |
kubeStateMetrics.port | kube-state-metrics service port | 80 |
kubeStateMetrics.probe.enabled | Enables health probe for kubeStateMetrics, if set to true | true |
kubeStateMetrics.probe.readiness.args | Arguments for readiness probe | {} |
kubeStateMetrics.probe.liveness.args | Arguments for liveness probe | {} |
nodeExporter.enabled | Installs node exporter, if set to true | false |
nodeExporter.image.repository | node-exporter container image name | ibmcom/node-exporter |
nodeExporter.image.tag | node-exporter container image tag | v0.15.2 |
nodeExporter.port | node-exporter service port | 9100 |
nodeExporter.probe.enabled | Enables health probe for nodeExporter, if set to true | true |
nodeExporter.probe.readiness.args | Arguments for readiness probe | {} |
nodeExporter.probe.liveness.args | Arguments for liveness probe | {} |
grafana.image.repository | Grafana Docker image name | ibmcom/grafana |
grafana.image.tag | Grafana Docker image tag | 4.6.3 |
grafana.port | Grafana container exposed port | 3000 |
grafana.user | Grafana user name | "admin" |
grafana.password | Grafana user password | "" |
grafana.persistentVolume.enabled | Creates a volume to store data, if set to true | false |
grafana.persistentVolume.useDynamicProvisioning | Dynamically provisions a persistent volume, if set to true | true |
grafana.persistentVolume.size | Size of the persistent volume claim | 1Gi |
grafana.persistentVolume.storageClass | Storage class for persistent volume | "" |
grafana.persistentVolume.existingClaimName | Specify the name if you want to use an existing persistent volume claim | "" |
grafana.persistentVolume.selector.label | If you want to use a particular volume, specify the name of the label | "" |
grafana.persistentVolume.selector.value | If you want to use a particular volume, specify the value of the label | "" |
grafana.probe.enabled | Enables health probe for Grafana, if set to true | true |
grafana.probe.readiness.args | Arguments for readiness probe | {} |
grafana.probe.liveness.args | Arguments for liveness probe | {} |
grafana.resources.limits.cpu | Grafana CPU limits | 500m |
grafana.resources.limits.memory | Grafana memory limits | 512Mi |
grafana.resources.requests.cpu | Grafana CPU requests | 100m |
grafana.resources.requests.memory | Grafana memory requests | 128Mi |
grafana.configFiles | Grafana configurations file | grafanaConfig |
grafana.ingress.enabled | Creates Grafana ingress, if set to true | false |
grafana.ingress.annotations | Annotation for Grafana ingress | {} |
grafana.service.type | Type of Grafana service | NodePort |
grafana.elasticsearchDash.enabled | Adds Elasticsearch dashboard, if set to true | false |
collectdExporter.enabled | Installs collectd exporter, if set to true | false |
collectdExporter.image.repository | Collectd exporter image name | ibmcom/collectd-exporter |
collectdExporter.image.tag | Collectd exporter image tag | 0.3.1 |
collectdExporter.service.serviceMetricsPort | Metrics service exposed port | 9103 |
collectdExporter.service.serviceCollectorPort | Collector service exposed port | 25826 |
collectdExporter.probe.enabled | Enables health probe for collectd exporter, if set to true | true |
collectdExporter.probe.readiness.args | Arguments for readiness probe | {} |
collectdExporter.probe.liveness.args | Arguments for liveness probe | {} |
configmapReload.image.repository | configmapReload Docker image name | ibmcom/configmap-reload |
configmapReload.image.tag | configmapReload Docker image tag | v0.1 |
router.image.repository | Router Docker image name | ibmcom/icp-router |
router.image.tag | Router Docker image tag | 2.2.0 |
router.subjectAlt | Subject alternative DNS name or IP address for the SSL key | 127.0.0.1 |
elasticsearchExporter.enabled | Installs Elasticsearch exporter, if set to true | false |
elasticsearchExporter.image.repository | Elasticsearch exporter Docker image name | ibmcom/elasticsearch_exporter |
elasticsearchExporter.image.tag | Elasticsearch exporter Docker image tag | 1.0.2 |
elasticsearchExporter.esUri | Elasticsearch URL | https://elasticsearch:9200 |
elasticsearchExporter.tls.enabled | Enables TLS for exporter to request Elasticsearch endpoint | true |
elasticsearchExporter.tls.ca.secretName | Secret for CA certificate | cluster-ca-cert |
elasticsearchExporter.tls.ca.certFieldName | Field name for CA certificate in secret | tls.crt |
elasticsearchExporter.tls.client.existingSecretName | Existing secret for client certificate | "" |
elasticsearchExporter.tls.client.certFieldName | Field name for client certificate in secret | tls.crt |
elasticsearchExporter.tls.client.keyFieldName | Field name for client key in secret | tls.key |
elasticsearchExporter.port | Elasticsearch exporter exposed port | 9108 |
elasticsearchExporter.probe.enabled | Enables health probe for Elasticsearch exporter, if set to true | true |
elasticsearchExporter.probe.readiness.args | Arguments for readiness probe | {} |
elasticsearchExporter.probe.liveness.args | Arguments for liveness probe | {} |
curl.image.repository | curl Docker image name | ibmcom/curl |
curl.image.tag | curl Docker image tag | 4.0.0 |
certGen.image.repository | Docker image name to generate certificate | ibmcom/icp-cert-gen |
certGen.image.tag | Docker image tag to generate certificate | 1.0.0 |
init.image.repository | init Docker image name | ibmcom/icp-cert-gen |
init.image.tag | init Docker image tag | 1.0.0 |
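To show how the dotted parameter names in the table could translate into configuration, the following fragment is a hypothetical sketch that enables two exporters and the etcd scrape target. It assumes that each dot in a parameter name corresponds to one level of YAML nesting, as is the case for the parameters in the config.yaml sample. Where such a fragment is supplied depends on how you install the monitoring service, which this page does not specify.

```yaml
# Hypothetical override fragment; the key nesting is inferred from the dotted parameter names.
nodeExporter:
  enabled: true                 # default is false
kubeStateMetrics:
  enabled: true                 # default is false
prometheus:
  etcdTarget:
    enabled: true               # default is false
    etcdAddress: ["127.0.0.1"]  # table default; replace with your etcd server list
    etcdPort: 4001              # table default
    secret: etcd-secret         # table default
```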