Configuring performance monitoring tool sensors
The IBM Storage Scale System fabric hospital uses custom performance monitoring tool sensors to relay path error information back to the performance monitoring tool collector node. By default, this node is the EMS node.
About this task
Due to current scaling concerns, it is recommended to install and configure only the performance monitoring tool pmsensors on 5-10 IBM Storage Scale System building blocks, where each building block is an IBM Storage Scale System server pair.
Procedure
- Identify the building blocks on which to install the sensors. The building blocks are identified by passing a list of IBM Storage Scale System nodes or a node class that contains the nodes to be monitored.
- Run the following commands from the EMS or I/O node for each recovery group that is defined on the building blocks that are being monitored. This step is necessary to prevent spurious RAS events and to discard data that existed before the performance monitoring tool sensor was installed.
RGNAMES="rg1 rg2 rg3"
for rg in $RGNAMES
do
    tsgnrgethospdata $rg --reap --all-node-path
done
sleep 151
mmperfmon config add --sensors /opt/IBM/zimon/defaults/ZIMonSensors_GPFSFabricHospital.cfg
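The commands in this step can be reviewed before they are run by printing them first. The following is a minimal dry-run sketch, not the shipped sample script: the function name plan_sensor_install is hypothetical and the recovery group names are placeholders, while the commands, the 151-second sleep, and the configuration file path are taken from this step.

```shell
# Hypothetical helper: prints the commands of this step for review (dry run only).
# Arguments: one or more recovery group names.
plan_sensor_install() (
    cfg=/opt/IBM/zimon/defaults/ZIMonSensors_GPFSFabricHospital.cfg
    if [ "$#" -eq 0 ]; then
        echo "usage: plan_sensor_install RG [RG...]" >&2
        exit 1
    fi
    for rg in "$@"; do
        # One reap command per recovery group, discarding older data.
        echo "tsgnrgethospdata $rg --reap --all-node-path"
    done
    echo "sleep 151"
    echo "mmperfmon config add --sensors $cfg"
)

# Placeholder recovery group names.
plan_sensor_install rg1 rg2 rg3
```

Because the function only prints, its output can be inspected and then run deliberately, for example by redirecting it to a file and executing that file on the EMS node.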
- Instead of using the code block in Step 2, it is recommended to use the sample script that is provided in /usr/lpp/mmfs/samples/vdisk/install_essfabrichospital_sensor.
install_essfabrichospital_sensor [-h] [-f FILE] [--override-sleep OVERRIDE_SLEEP] node_class_list
where:
- node_class_list
- Specifies a comma-separated list of node classes, one per building block.
- The node_class_list argument is different from the restrict argument that is provided in the /opt/IBM/zimon/defaults/ZIMonSensors_GPFSFabricHospital.cfg file. The node_class_list argument is used to determine a sleep period that synchronizes the sensors.
- -h, --help
- Shows the help message and exits.
- -f FILE, --file FILE
- Specifies the sensor configuration file (default /opt/IBM/zimon/defaults/ZIMonSensors_GPFSFabricHospital.cfg).
- --override-sleep OVERRIDE_SLEEP
- Overrides the sleep delay that is used when the sensor is installed (for debugging).
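As an illustration of the node_class_list argument, the following sketch joins node class names with commas and prints the resulting command line without running it. It is an assumption-laden example: the function name build_install_cmd and the node class names are hypothetical, the path is assumed to be the script itself, and the limit of 10 reflects the building block recommendation in this topic rather than a check that the script is known to perform.

```shell
# Hypothetical helper: builds the install_essfabrichospital_sensor command line
# from node class names (one per building block) and prints it.
build_install_cmd() (
    script=/usr/lpp/mmfs/samples/vdisk/install_essfabrichospital_sensor
    max_blocks=10   # upper bound recommended in this topic (assumed as a guard)
    if [ "$#" -eq 0 ] || [ "$#" -gt "$max_blocks" ]; then
        echo "expected 1 to $max_blocks node classes, got $#" >&2
        exit 1
    fi
    # Join the arguments with commas to form node_class_list.
    list=$(IFS=,; echo "$*")
    echo "$script $list"
)

# Placeholder node class names, one per building block.
build_install_cmd bb1_nodes bb2_nodes
```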
The provided sensor file, /opt/IBM/zimon/defaults/ZIMonSensors_GPFSFabricHospital.cfg, has the following contents:
sensors = {
    name = "GPFSFabricHospital"
    # This sensor should be activated only when ECE/GNR is configured
    # Only supported value is 900 seconds (15 minutes)
    period = 900
    restrict = "nsdNodes"
    type = "Generic"
}
Important: The only supported value for period is 900 seconds (15 minutes).
- Verify the new sensor.
# mmperfmon config show | grep GPFSFabricHospital -A3
A sample output is as follows:
name = "GPFSFabricHospital"
period = 900
restrict = "FabricHospital10BB"
type = "Generic"
Note: In its current form, it is recommended to limit the sensor configuration to 10 IBM Storage Scale System building block node pairs. The system-provided node class in the sensor file is "nsdNodes", which contains all IBM Storage Scale System server nodes by default. If your cluster has more than 10 IBM Storage Scale System building block node pairs, define a custom node class that contains up to 10 node pairs. For example, assume that the "FabricHospital10BB" node class is created to contain the nodes that you want to monitor. The restrict value in the performance monitoring tool sensor file would be changed to that node class, and the configured sensor would then be shown as follows:
# mmperfmon config show | grep GPFSFabricHospital -A3
name = "GPFSFabricHospital"
period = 900
restrict = "FabricHospital10BB"
type = "Generic"
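The verification can also be scripted. The sketch below reads mmperfmon config show output on standard input and succeeds only when the GPFSFabricHospital sensor is present with the supported period of 900. The function name check_fabric_sensor is hypothetical, and the parsing assumes the layout of the sample output, with the name line preceding the period and restrict lines of the same sensor.

```shell
# Hypothetical helper: reads `mmperfmon config show` output on stdin and
# succeeds only if the GPFSFabricHospital sensor is present with period = 900.
check_fabric_sensor() {
    awk '
        /name = "GPFSFabricHospital"/ { found = 1; next }
        # Stop collecting at the next sensor definition.
        found && /name = /            { found = 0 }
        found && /period = /          { period = $3 }
        found && /restrict = /        { restrict = $3 }
        END {
            if (period != 900) {
                print "sensor missing or unsupported period"
                exit 1
            }
            print "ok: period=900 restrict=" restrict
        }
    '
}

# Example with the sample output from this topic.
check_fabric_sensor <<'EOF'
name = "GPFSFabricHospital"
period = 900
restrict = "FabricHospital10BB"
type = "Generic"
EOF
```

On a live system, the same function could be fed directly, as in mmperfmon config show | check_fabric_sensor.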
- Ensure that the performance monitoring component is healthy.
# mmhealth node show perfmon
A sample output is as follows:
Node name:      c145f11san06a-ib0.gpfs.net

Component      Status        Status Change     Reasons & Notices
-------------------------------------------------------------------------------
PERFMON        HEALTHY       1 day ago         -
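For automated checks, the health status can be extracted from this output. The following sketch is an illustration only: the function name perfmon_is_healthy is hypothetical, and it assumes the column layout of the sample output, where the component name is the first field and the status is the second.

```shell
# Hypothetical helper: reads `mmhealth node show perfmon` output on stdin and
# succeeds only if the PERFMON component reports HEALTHY.
perfmon_is_healthy() {
    awk '
        $1 == "PERFMON" { status = $2 }
        END {
            if (status == "HEALTHY") exit 0
            exit 1
        }
    '
}

# Example with the sample output from this topic.
if perfmon_is_healthy <<'EOF'
Component      Status        Status Change     Reasons & Notices
PERFMON        HEALTHY       1 day ago         -
EOF
then
    echo "perfmon is healthy"
else
    echo "perfmon is not healthy"
fi
```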
Note: Sensors can be configured only once. If you want to reinstall the sensor from scratch to change its behavior, run the following command to remove the sensor so that it can be added again:
# mmperfmon config delete {--all | --sensors Sensor[,Sensor...]}