If you are using local storage, you must first create a storage class and then set up the persistent volumes.
You must create at least one storage class and at least six persistent volumes when you are using local storage. Preferably, create six storage classes, one for each of the statefulsets that use these persistent volumes. The statefulsets are: Cassandra, CouchDB, Datalayer, Zookeeper, Kafka, and Elasticsearch.
Review the storage guidance in Choosing your storage solution for Monitoring.
Review the procedure for Configuring drives for local storage.
Review the procedure for Setting readahead on a drive or volume for local storage.
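The readahead procedure typically comes down to one blockdev call per drive. The following is a minimal sketch that only prints the commands to run; the device paths and the 4 MiB readahead target are assumptions, so substitute your own values and then run the printed commands as root on each node:

```shell
#!/bin/bash
# Sketch: print the blockdev commands that would set readahead on each
# local-storage drive. The device paths and the 4 MiB target are
# assumptions; substitute your own values before running the output as root.
target_kib=4096               # desired readahead in KiB
sectors=$((target_kib * 2))   # blockdev --setra expects 512-byte sectors
for dev in /dev/sdb /dev/sdc; do
  echo "blockdev --setra ${sectors} ${dev}"
done
```

You can confirm the applied value afterward with `blockdev --getra <device>`.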
Complete the following steps to set up local storage:
Create a storage class for each statefulset by running the following bash script:
#!/bin/bash
# Create one storage class per statefulset.
classes=('monitoring-cassandra' 'monitoring-couchdb' 'monitoring-datalayer' 'monitoring-elasticsearch' 'monitoring-kafka' 'monitoring-zookeeper')
template='{"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"name":"<SC_NAME>","labels":{"release":"monitoring"}},"provisioner":"kubernetes.io/no-provisioner","reclaimPolicy":"Delete","volumeBindingMode":"WaitForFirstConsumer"}'
for c in "${classes[@]}"; do
  echo "$template" | sed "s/<SC_NAME>/$c/" | oc apply -f -
done
Run the following command to check that the storage classes were created successfully: oc get sc
This example output shows that six storage classes were created - one for each statefulset:
NAME PROVISIONER AGE
monitoring-cassandra kubernetes.io/no-provisioner 84s
monitoring-couchdb kubernetes.io/no-provisioner 83s
monitoring-datalayer kubernetes.io/no-provisioner 82s
monitoring-elasticsearch kubernetes.io/no-provisioner 81s
monitoring-kafka kubernetes.io/no-provisioner 80s
monitoring-zookeeper kubernetes.io/no-provisioner 2m38s
Determine where each statefulset pod will reside. In a standard configuration, each statefulset has one pod.
In a high availability configuration, each statefulset has three pods.
Note: Cassandra has high RAM requirements (6 GB in a size0 environment, 16 GB in a size1 environment). High availability environments require a size1 environment.
On standard OpenShift clusters, only worker nodes can be used.
Run the following command to see which nodes are available: oc get node
Modify the following bash script for your environment, and then run it to create the backing directories and persistent volumes:
Note: For non-HA environments, a single node is sufficient. If you are using network attached drives, then make sure to mount them to the created folders (or modify the script to point at the drives).
#!/bin/bash
username=core  # Standard username on OpenShift environments
base_dir='/var/home/core/local-storage'  # Location where persistent volume data is stored
max_size='500Gi'  # Maximum size for any persistent volume (only affects which persistent volume claims can bind, not actual disk usage)
cassandra_nodes=('worker6.magnolia3.os.fyre.ibm.com' 'worker7.magnolia3.os.fyre.ibm.com' 'worker8.magnolia3.os.fyre.ibm.com')
couchdb_nodes=('worker0.magnolia3.os.fyre.ibm.com' 'worker1.magnolia3.os.fyre.ibm.com' 'worker2.magnolia3.os.fyre.ibm.com')
datalayer_nodes=('worker0.magnolia3.os.fyre.ibm.com' 'worker1.magnolia3.os.fyre.ibm.com' 'worker2.magnolia3.os.fyre.ibm.com')
elasticsearch_nodes=('worker0.magnolia3.os.fyre.ibm.com' 'worker1.magnolia3.os.fyre.ibm.com' 'worker2.magnolia3.os.fyre.ibm.com')
kafka_nodes=('worker3.magnolia3.os.fyre.ibm.com' 'worker4.magnolia3.os.fyre.ibm.com' 'worker5.magnolia3.os.fyre.ibm.com')
zookeeper_nodes=('worker3.magnolia3.os.fyre.ibm.com' 'worker4.magnolia3.os.fyre.ibm.com' 'worker5.magnolia3.os.fyre.ibm.com')
pv_template='{"apiVersion":"v1","kind":"PersistentVolume","metadata":{"name":"<NAME>","labels":{"release":"monitoring"}},"spec":{"accessModes":["ReadWriteOnce"],"capacity":{"storage":"<MAX_SIZE>"},"local":{"path":"<PATH>"},"nodeAffinity":{"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"kubernetes.io/hostname","operator":"In","values":["<NODE>"]}]}]}},"persistentVolumeReclaimPolicy":"Retain","storageClassName":"<SC_NAME>","volumeMode":"Filesystem"}}'
# Create the backing directory on each node, then create a persistent volume that points at it.
function setup_storage() {
  sc=$1
  counter=0
  for node in "${nodes[@]}"; do
    ssh "${username}@${node}" mkdir -p "${base_dir}/${sc}-${counter}"
    echo "$pv_template" | sed "s/<NAME>/monitoring-${sc}-${counter}/" | sed "s/<MAX_SIZE>/${max_size}/" | sed "s|<PATH>|${base_dir}/${sc}-${counter}|" | sed "s/<NODE>/$node/" | sed "s/<SC_NAME>/monitoring-${sc}/" | oc apply -f -
    counter=$((counter+1))
  done
}
nodes=("${cassandra_nodes[@]}")
setup_storage cassandra
nodes=("${couchdb_nodes[@]}")
setup_storage couchdb
nodes=("${datalayer_nodes[@]}")
setup_storage datalayer
nodes=("${elasticsearch_nodes[@]}")
setup_storage elasticsearch
nodes=("${kafka_nodes[@]}")
setup_storage kafka
nodes=("${zookeeper_nodes[@]}")
setup_storage zookeeper
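Before applying anything to the cluster, it can help to render one manifest locally and inspect it. The following is a minimal sketch of that dry run; the volume name, path, and storage class below are placeholder values, and the template is pared down to the fields being substituted. Pipe the result to `oc apply -f -` once it looks right.

```shell
#!/bin/bash
# Sketch: render a single PersistentVolume manifest locally (no oc apply)
# so it can be inspected first. The name, size, path, and storage class
# are placeholders for illustration.
pv_template='{"apiVersion":"v1","kind":"PersistentVolume","metadata":{"name":"<NAME>"},"spec":{"accessModes":["ReadWriteOnce"],"capacity":{"storage":"<MAX_SIZE>"},"local":{"path":"<PATH>"},"storageClassName":"<SC_NAME>","volumeMode":"Filesystem"}}'
manifest=$(echo "$pv_template" \
  | sed 's/<NAME>/monitoring-cassandra-0/' \
  | sed 's/<MAX_SIZE>/500Gi/' \
  | sed 's|<PATH>|/var/home/core/local-storage/cassandra-0|' \
  | sed 's/<SC_NAME>/monitoring-cassandra/')
echo "$manifest"
```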
Verify that the persistent volumes were created by running: oc get pv
For example,
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
monitoring-cassandra-0 500Gi RWO Retain Available monitoring-cassandra 1m
monitoring-cassandra-1 500Gi RWO Retain Available monitoring-cassandra 1m
monitoring-cassandra-2 500Gi RWO Retain Available monitoring-cassandra 1m
monitoring-couchdb-0 500Gi RWO Retain Available monitoring-couchdb 1m
monitoring-couchdb-1 500Gi RWO Retain Available monitoring-couchdb 1m
monitoring-couchdb-2 500Gi RWO Retain Available monitoring-couchdb 1m
monitoring-datalayer-0 500Gi RWO Retain Available monitoring-datalayer 1m
monitoring-datalayer-1 500Gi RWO Retain Available monitoring-datalayer 1m
monitoring-datalayer-2 500Gi RWO Retain Available monitoring-datalayer 1m
monitoring-elasticsearch-0 500Gi RWO Retain Available monitoring-elasticsearch 1m
monitoring-elasticsearch-1 500Gi RWO Retain Available monitoring-elasticsearch 1m
monitoring-elasticsearch-2 500Gi RWO Retain Available monitoring-elasticsearch 1m
monitoring-kafka-0 500Gi RWO Retain Available monitoring-kafka 1m
monitoring-kafka-1 500Gi RWO Retain Available monitoring-kafka 1m
monitoring-kafka-2 500Gi RWO Retain Available monitoring-kafka 1m
monitoring-zookeeper-0 500Gi RWO Retain Available monitoring-zookeeper 1m
monitoring-zookeeper-1 500Gi RWO Retain Available monitoring-zookeeper 1m
monitoring-zookeeper-2 500Gi RWO Retain Available monitoring-zookeeper 1m
For more information, see Persistentvolumeclaims for statefulsets and binding.
The following is an example of a custom storage class and persistentvolume for Cassandra. In this example, the user decided that they want to use local-storage-cassandra as the value for the monitoringDeploy.global.persistence.storageClassOption.cassandradata parameter, and 500Gi as the value for the monitoringDeploy.global.persistence.storageSize.cassandradata parameter.
The Cassandra statefulset creates a persistentvolumeclaim that successfully binds to this persistentvolume. The same configuration can be applied to the other statefulsets: if the values that are provided for these parameters in the Monitoring custom resource correspond to the persistentvolumes that are created, the persistentvolumeclaims that the statefulsets create can successfully bind to them.
StorageClass:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage-cassandra
  labels:
    release: ibmcloudappmgmt
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
PersistentVolume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ibmcloudappmgmt-cassandra0
  labels:
    release: ibmcloudappmgmt
spec:
  capacity:
    storage: 500Gi
  storageClassName: local-storage-cassandra
  local:
    path: /data/k8s/cassandra0
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values: ["10.10.10.1"]
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
When Monitoring is installed, it creates persistentvolumeclaims for each of the statefulsets. Each of these persistentvolumeclaims needs a persistentvolume to bind to. IBM Cloud Pak for Multicloud Management defaults to one replica for each statefulset, and therefore one persistentvolumeclaim for each statefulset. If the number of replicas for any of the statefulsets is increased, more persistentvolumeclaims per statefulset are created, and more persistentvolumes must be created for the persistentvolumeclaims to bind to.
Each of the persistentvolumeclaims searches for a persistentvolume matching its requirements to bind to. The following example provides a persistentvolumeclaim, and the persistentvolume created for it to bind to.
For more detail about statefulsets and planning a high availability Monitoring environment, see Planning for a high availability installation.
The following persistentvolumeclaim was created when Monitoring was installed. The keys to focus on are:
spec.accessModes, which has a value of ReadWriteOnce
spec.resources.requests.storage, which has a value of 50Gi
spec.storageClassName, which has a value of rook-ceph-block-internal
The values for these keys are the values that the persistentvolume must match to ensure that it is bound to the persistentvolumeclaim.
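These fields can be pulled out of the claim's JSON programmatically. The following sketch inlines a trimmed copy of the claim's spec for illustration; in practice you would fetch the full object with `oc get pvc <name> -o json`:

```shell
#!/bin/bash
# Sketch: extract the three binding-relevant fields from a PVC's JSON.
# The JSON is inlined here for illustration; fetch the real object with
# oc get pvc <name> -o json.
pvc_json='{"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"50Gi"}},"storageClassName":"rook-ceph-block-internal"}}'
summary=$(echo "$pvc_json" | python3 -c '
import json, sys
spec = json.load(sys.stdin)["spec"]
print(spec["accessModes"][0], spec["resources"]["requests"]["storage"], spec["storageClassName"])
')
echo "$summary"
```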
[root@cp4mcm-installer-chicharron-inf ~]# oc get pvc data-ibmcloudappmgmt-cassandra-0 -o json
{
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {
        "annotations": {
            "pv.kubernetes.io/bind-completed": "yes",
            "pv.kubernetes.io/bound-by-controller": "yes",
            "volume.beta.kubernetes.io/storage-provisioner": "rook-ceph.rbd.csi.ceph.com"
        },
        "creationTimestamp": "2020-03-13T21:20:27Z",
        "finalizers": [
            "kubernetes.io/pvc-protection"
        ],
        "labels": {
            "app": "cassandra",
            "chart": "cassandra",
            "heritage": "Tiller",
            "release": "ibmcloudappmgmt"
        },
        "name": "data-ibmcloudappmgmt-cassandra-0",
        "namespace": "kube-system",
        "resourceVersion": "1911203",
        "selfLink": "/api/v1/namespaces/kube-system/persistentvolumeclaims/data-ibmcloudappmgmt-cassandra-0",
        "uid": "e34209f8-dd78-4c34-956c-651a5eb7adaa"
    },
    "spec": {
        "accessModes": [
            "ReadWriteOnce"
        ],
        "resources": {
            "requests": {
                "storage": "50Gi"
            }
        },
        "storageClassName": "rook-ceph-block-internal",
        "volumeMode": "Filesystem",
        "volumeName": "pvc-e34209f8-dd78-4c34-956c-651a5eb7adaa"
    },
    "status": {
        "accessModes": [
            "ReadWriteOnce"
        ],
        "capacity": {
            "storage": "50Gi"
        },
        "phase": "Bound"
    }
}
This is the persistentvolume that the persistentvolumeclaim is bound to:
{
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {
        "annotations": {
            "pv.kubernetes.io/provisioned-by": "rook-ceph.rbd.csi.ceph.com"
        },
        "creationTimestamp": "2020-03-13T21:20:29Z",
        "finalizers": [
            "kubernetes.io/pv-protection"
        ],
        "name": "pvc-e34209f8-dd78-4c34-956c-651a5eb7adaa",
        "resourceVersion": "1911201",
        "selfLink": "/api/v1/persistentvolumes/pvc-e34209f8-dd78-4c34-956c-651a5eb7adaa",
        "uid": "f29e1f09-f07a-4107-9037-3c3919fc359c"
    },
    "spec": {
        "accessModes": [
            "ReadWriteOnce"
        ],
        "capacity": {
            "storage": "50Gi"
        },
        "claimRef": {
            "apiVersion": "v1",
            "kind": "PersistentVolumeClaim",
            "name": "data-ibmcloudappmgmt-cassandra-0",
            "namespace": "kube-system",
            "resourceVersion": "1910984",
            "uid": "e34209f8-dd78-4c34-956c-651a5eb7adaa"
        },
        "csi": {
            "driver": "rook-ceph.rbd.csi.ceph.com",
            "fsType": "ext4",
            "nodeStageSecretRef": {
                "name": "rook-csi-rbd-node",
                "namespace": "rook-ceph"
            },
            "volumeAttributes": {
                "clusterID": "rook-ceph",
                "imageFeatures": "layering",
                "imageFormat": "2",
                "pool": "rbd",
                "storage.kubernetes.io/csiProvisionerIdentity": "1583863004786-8081-rook-ceph.rbd.csi.ceph.com"
            },
            "volumeHandle": "0001-0009-rook-ceph-0000000000000001-763700a9-6570-11ea-bdc2-0a580afe080f"
        },
        "persistentVolumeReclaimPolicy": "Delete",
        "storageClassName": "rook-ceph-block-internal",
        "volumeMode": "Filesystem"
    },
    "status": {
        "phase": "Bound"
    }
}
The persistentvolume that rook-ceph created has values that correspond to the persistentvolumeclaim:
spec.accessModes, which has a value of ReadWriteOnce
spec.capacity.storage, which has a value of 50Gi
spec.storageClassName, which has a value of rook-ceph-block-internal
The values match; therefore, the persistentvolumeclaim is successfully bound to the persistentvolume.
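This match can be checked mechanically. The following sketch hard-codes the values from the example above for illustration; in practice you would read each field from the live objects with `oc get pvc`/`oc get pv` and `-o jsonpath`:

```shell
#!/bin/bash
# Sketch: verify that a persistentvolume's fields line up with a
# persistentvolumeclaim's requirements. Values are hard-coded from the
# example for illustration.
pvc_mode='ReadWriteOnce';          pv_mode='ReadWriteOnce'
pvc_storage='50Gi';                pv_storage='50Gi'
pvc_sc='rook-ceph-block-internal'; pv_sc='rook-ceph-block-internal'
if [ "$pvc_mode" = "$pv_mode" ] && [ "$pvc_storage" = "$pv_storage" ] \
    && [ "$pvc_sc" = "$pv_sc" ]; then
  result="match: the claim can bind to this volume"
else
  result="mismatch: the claim stays Pending"
fi
echo "$result"
```

Note that the equality check on storage is a simplification: for binding, the persistentvolume's capacity only needs to be at least the claim's request.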
Important: If IBM Cloud Pak® for Multicloud Management is installed on IBM® Cloud, it might be using dynamically provisioned storage such as ibmc-file-gold, which is always available. You can also use ibmc-file-gold for your storage solution when you are installing Monitoring. For more information, see the Storage section in Preparing to install the IBM Cloud Pak® for Multicloud Management. If IBM Cloud Pak® for Multicloud Management is not installed on IBM® Cloud, you must choose another storage solution for your Monitoring installation.