mmperfmon command
Configures the Performance Monitoring tool and lists the performance metrics.
Synopsis
mmperfmon config generate --collectors CollectorNode[,CollectorNode...]
[ --config-file InputFile ]
or
mmperfmon config add --sensors SensorFile
or
mmperfmon config update { [--collectors CollectorNode[,CollectorNode...] ]
[ --config-file InputFile ] [ Attribute=value ... ] }
or
mmperfmon config delete {--all |--sensors Sensor[,Sensor...] }
or
mmperfmon config show [--config-file OutputFile]
or
mmperfmon query Metric[,Metric...] | Key[,Key...] | NamedQuery
[StartTime EndTime | Duration]
[Options]
or
mmperfmon query compareNodes ComparisonMetric
[StartTime EndTime | Duration]
[Options]
or
mmperfmon delete {--expiredKeys |--key Key[,Key...] }
Availability
Available on all IBM Spectrum Scale™ editions.
The protocol functions provided in this command, or any similar command, are generally referred to as CES (Cluster Export Services). For example, protocol node and CES node are functionally equivalent terms.
Description
mmperfmon config modifies the performance monitoring tool by updating the configuration stored in IBM Spectrum Scale. It can be used to generate an initial configuration, to update the reporting periods of different sensors, or to restrict sensors to a given set of nodes.
mmperfmon query is used to query metrics in a cluster from the performance metrics collector. Output can be delivered in raw format, as a formatted table, or as a CSV export.
In addition to metrics known by the performance collector, the mmperfmon query command can also run predefined named queries or use predefined computed metrics. You can specify a bucket size, in seconds, for each returned record, and the number of buckets to retrieve. You can also specify a duration or a time range over which the query runs.
Parameters
- config
- generate
- Generates the configuration of the performance monitoring tool.
Note: After the configuration has been generated, remember to turn on monitoring through the mmchnode command.
--collectors CollectorNode[,CollectorNode...] specifies the set of collectors to which the sensors report their performance measurements. The number of collectors that each sensor reports to can be specified through the colRedundancy parameter in the template sensor configuration file (see --config-file). Federated collectors are automatically configured between these collectors. For more information on federated collectors, see Configuring multiple collectors.
--config-file InputFile specifies the template sensor configuration file to use. If this option is not provided, the /opt/IBM/zimon/defaults/ZIMonSensors.cfg file is used.
- add
- Adds a new sensor to the performance monitoring tool.
--sensors SensorFile adds the sensors specified in SensorFile to the sensor configuration. Multiple sensors in the configuration file must be separated by a comma. The following is a sample SensorFile:
sensors = {
    name = "MySensor"
    # sensor disabled by default
    period = 0
    type = "Generic"
}
The generic sensor and a sensor-specific configuration file must be installed on all the nodes where the generic sensor is to be activated.
- update
- Updates the existing configuration.
--collectors CollectorNode[,CollectorNode...] updates the collectors to be used by the sensors and for federation (see config generate for details).
--config-file InputFile specifies a template sensor configuration file to use. This overwrites the currently used configuration with the configuration specified in InputFile.
Attribute=value ... specifies a list of attribute-value assignments. Each assignment sets the value of attribute Attribute to value.
- delete
- Removes the configuration of the performance monitoring tool or the specified sensors.
--sensors Sensor[,Sensor...] removes the sensors with the specified names from the performance monitoring configuration.
--all removes the entire performance monitoring configuration from IBM Spectrum Scale.
- show
- Displays the currently active performance monitoring configuration. Supports the following option:
--config-file OutputFile saves the output to OutputFile. This option is optional.
- query
Metric[,Metric...] specifies a comma-separated list of metrics to display in the output.
Key[,Key...] specifies a key consisting of a node name, a sensor group, optional additional filters, and a metric, separated by the pipe symbol (|). For example: "cluster1.ibm.com|CTDBStats|locking|db_hop_count_bucket_00"
NamedQuery specifies the name of a predefined query.
compareNodes compares the specified metrics for all nodes in the system. The query creates one column per existing node and only one metric can be compared.
ComparisonMetric specifies the name of the metric to be compared when using the compareNodes query.
StartTime specifies the start timestamp for query in the YYYY-MM-DD-hh:mm:ss format.
EndTime specifies the end timestamp for query in the YYYY-MM-DD-hh:mm:ss format. If it is not specified, the query will return results until the present time.
Duration specifies the number of seconds into the past from present time or EndTime.
Options specifies the following options:- -N or --Node NODENAME specifies the node from which the metrics should be retrieved.
For general information on how to specify node names, see Specifying nodes as input to GPFS commands.
- --bucket-size BUCKET_SIZE specifies the bucket size in seconds; the default is 1.
- --number-buckets NUMBER_BUCKETS specifies the number of buckets (records) to display; the default is 10.
- --filter FILTER specifies the filter criteria for the query. To see the list of available filters on the node, use the mmperfmon query --list filters command.
- --format FORMAT specifies a common format for all columns.
- --csv provides the output in the CSV format.
- --raw provides the output in a raw format rather than a tabular format.
- --short displays the column names in a short form when there are too many to fit into a row.
- --nice displays the column headers in the output in a bold and underlined typeface.
- --resolve displays the resolved computed metrics and metrics that are used.
- --list {computed | metrics | keys | filters | queries | expiredKeys | all} lists the following information:
- computed displays the computed metrics.
- metrics displays the metrics.
- keys lists the keys.
- filters lists the filters.
- queries lists the available predefined queries.
- expiredKeys lists the group keys for the entities that have not returned any metrics values within the default retention period of 14 days.
- all displays the computed metrics, metrics, keys, filters, and queries.
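The StartTime and EndTime arguments use the YYYY-MM-DD-hh:mm:ss format described above, and the window a query covers is the bucket size multiplied by the number of buckets. A minimal Python sketch of that arithmetic (the helper names query_window and buckets_needed are illustrative, not part of mmperfmon):

```python
from datetime import datetime, timedelta

TS_FMT = "%Y-%m-%d-%H:%M:%S"  # mmperfmon timestamp format: YYYY-MM-DD-hh:mm:ss

def query_window(end, minutes):
    """Return (StartTime, EndTime) strings covering the last `minutes` minutes."""
    start = end - timedelta(minutes=minutes)
    return start.strftime(TS_FMT), end.strftime(TS_FMT)

def buckets_needed(window_seconds, bucket_size):
    """Number of buckets (-n) needed to cover a window at a given -b bucket size."""
    return -(-window_seconds // bucket_size)  # ceiling division

start, end = query_window(datetime(2015, 5, 11, 13, 50, 0), 20)
# A 20-minute window at 60-second buckets needs 20 buckets ("-n 20 -b 60").
print(start, end, buckets_needed(20 * 60, 60))
```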
- delete
- Removes expired keys from the performance monitoring tool database.
--key Key[,Key...] specifies the key or list of keys, as a comma-separated string, to remove from the performance monitoring tool database. Only expired keys are removed.
--expiredKeys specifies that all expired keys are to be removed from the performance monitoring tool database.
Note: A group key is the part of the metric key string that represents a base entity. For example, for the keys:
gpfsgui-cluster-1.novalocal|GPFSInodeCap|nfs_shareFS|gpfs_fs_inode_alloc
gpfsgui-cluster-1.novalocal|GPFSInodeCap|nfs_shareFS|gpfs_fs_inode_free
gpfsgui-cluster-1.novalocal|GPFSInodeCap|nfs_shareFS|gpfs_fs_inode_max
gpfsgui-cluster-1.novalocal|GPFSInodeCap|nfs_shareFS|gpfs_fs_inode_used
the group key would be gpfsgui-cluster-1.novalocal|GPFSInodeCap|nfs_shareFS.
Expired keys are group keys that have been detected by the IBM Spectrum Scale monitoring tool but are not found in the current cluster configuration and have not returned metrics for at least the default retention period of 14 days.
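Dropping the final metric component of a pipe-separated key yields its group key. The following minimal Python sketch illustrates this relationship (the helper name group_key is illustrative and is not part of mmperfmon):

```python
def group_key(metric_key):
    """Drop the trailing metric name from a pipe-separated metric key,
    leaving the group key that identifies the base entity."""
    return metric_key.rsplit("|", 1)[0]

keys = [
    "gpfsgui-cluster-1.novalocal|GPFSInodeCap|nfs_shareFS|gpfs_fs_inode_alloc",
    "gpfsgui-cluster-1.novalocal|GPFSInodeCap|nfs_shareFS|gpfs_fs_inode_free",
    "gpfsgui-cluster-1.novalocal|GPFSInodeCap|nfs_shareFS|gpfs_fs_inode_max",
    "gpfsgui-cluster-1.novalocal|GPFSInodeCap|nfs_shareFS|gpfs_fs_inode_used",
]
# All four metric keys reduce to a single group key.
print({group_key(k) for k in keys})
# -> {'gpfsgui-cluster-1.novalocal|GPFSInodeCap|nfs_shareFS'}
```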
Exit status
- 0
- Successful completion.
- 1
- Invalid arguments given.
- 2
- Invalid option.
- 3
- No node found with a running performance collector.
- 4
- The performance collector backend signaled a bad query; for example, no data exists for this query.
Security
You must have root authority to run the mmperfmon command.
The node on which the command is issued must be able to execute remote shell commands on any other node in the cluster without the use of a password and without producing any extraneous messages. For more information, see Requirements for administering a GPFS file system.
Examples
- To generate a configuration for the c89f8v03 collector node, issue the command:
mmperfmon config generate --collectors c89f8v03
The system displays output similar to this:
mmperfmon: Propagating the cluster configuration data to all affected nodes.
  This is an asynchronous process.
Tue Oct 27 20:40:07 EDT 2015: mmcommon pushSdr_async: mmsdrfs propagation started
- To add the /tmp/SensorFile sensor to the performance monitoring tool, issue the command:
mmperfmon config add --sensors /tmp/SensorFile
The system displays output similar to this:
mmperfmon: Propagating the cluster configuration data to all affected nodes.
  This is an asynchronous process.
Tue Oct 27 20:44:33 EDT 2015: mmcommon pushSdr_async: mmsdrfs propagation started
# mmperfmon config show | tail -12
{
  name = "NFSIO"
  period = 0
  proxyCmd = "/opt/IBM/zimon/GaneshaProxy"
  restrict = "cesNodes"
  type = "Generic"
},
{
  name = "TestAdd"
  period = 4
}
smbstat = ""
- To update the NFSIO.period value to 5, issue the command:
# mmperfmon config update NFSIO.period=5
The system displays output similar to this:
mmperfmon: Propagating the cluster configuration data to all affected nodes.
  This is an asynchronous process.
Tue Oct 27 20:47:53 EDT 2015: mmcommon pushSdr_async: mmsdrfs propagation started
# mmperfmon config show | tail -9
},
{
  name = "NFSIO"
  period = 5
  proxyCmd = "/opt/IBM/zimon/GaneshaProxy"
  restrict = "cesNodes"
  type = "Generic"
}
smbstat = ""
- To remove the TestAdd sensor, issue the following command:
# mmperfmon config delete --sensors TestAdd
The system displays output similar to this:
mmperfmon: Propagating the cluster configuration data to all affected nodes.
  This is an asynchronous process.
Tue Oct 27 20:46:23 EDT 2015: mmcommon pushSdr_async: mmsdrfs propagation started
Tue Oct 27 20:46:28 EDT 2015: mmcommon pushSdr_async: mmsdrfs propagation completed; mmdsh rc=0
# mmperfmon config show | tail -12
{
  name = "GPFSDiskCap"
  period = 0
},
{
  name = "NFSIO"
  period = 0
  proxyCmd = "/opt/IBM/zimon/GaneshaProxy"
  restrict = "cesNodes"
  type = "Generic"
}
smbstat = ""
- To display the currently active performance monitoring configuration, issue the command:
# mmperfmon config show
The system displays output similar to this:
cephMon = "/opt/IBM/zimon/CephMonProxy"
cephRados = "/opt/IBM/zimon/CephRadosProxy"
colCandidates = "c89f8v03"
colRedundancy = 1
collectors = { host = "" port = "4739" }
config = "/opt/IBM/zimon/ZIMonSensors.cfg"
ctdbstat = ""
daemonize = T
hostname = ""
ipfixinterface = "0.0.0.0"
logfile = "/var/log/zimon/ZIMonSensors.log"
loglevel = "info"
mmcmd = "/opt/IBM/zimon/MMCmdProxy"
mmdfcmd = "/opt/IBM/zimon/MMDFProxy"
mmpmon = "/opt/IBM/zimon/MmpmonSockProxy"
piddir = "/var/run"
release = "4.2.0-0"
sensors = { name = "CPU" period = 1 },
{ name = "Load" period = 1 },
{ name = "Memory" period = 1 },
{ name = "Network" period = 1 },
{ name = "Netstat" period = 0 },
{ name = "Diskstat" period = 0 },
{ name = "DiskFree" period = 600 },
{ name = "GPFSDisk" period = 0 },
{ name = "GPFSFilesystem" period = 1 },
{ name = "GPFSNSDDisk" period = 1 restrict = "nsdNodes" },
{ name = "GPFSPoolIO" period = 0 },
{ name = "GPFSVFS" period = 1 },
{ name = "GPFSIOC" period = 0 },
{ name = "GPFSVIO" period = 0 },
{ name = "GPFSPDDisk" period = 1 restrict = "nsdNodes" },
{ name = "GPFSvFLUSH" period = 0 },
{ name = "GPFSNode" period = 1 },
{ name = "GPFSNodeAPI" period = 1 },
{ name = "GPFSFilesystemAPI" period = 1 },
{ name = "GPFSLROC" period = 0 },
{ name = "GPFSCHMS" period = 0 },
{ name = "GPFSAFM" period = 0 },
{ name = "GPFSAFMFS" period = 0 },
{ name = "GPFSAFMFSET" period = 0 },
{ name = "GPFSRPCS" period = 0 },
{ name = "GPFSFilesetQuota" period = 3600 },
{ name = "GPFSDiskCap" period = 0 },
{ name = "NFSIO" period = 0 proxyCmd = "/opt/IBM/zimon/GaneshaProxy" restrict = "cesNodes" type = "Generic" },
{ name = "SwiftAccount" period = 1 restrict = "cesNodes" type = "generic" },
{ name = "SwiftContainer" period = 1 restrict = "cesNodes" type = "generic" },
{ name = "SwiftObject" period = 1 restrict = "cesNodes" type = "generic" },
{ name = "SwiftProxy" period = 1 restrict = "cesNodes" type = "generic" }
smbstat = ""
- To list metrics by key, for a given node, sensor group, and metric, issue this command:
mmperfmon query "cluster1.ibm.com|CTDBDBStats|locking|db_hop_count_bucket_00"
The system displays output similar to this:
Row           Timestamp   db_hop_count_bucket_00
  1 2015-04-08-12:54:53   0
  2 2015-04-08-12:54:54   0
  3 2015-04-08-12:54:55   0
  4 2015-04-08-12:54:56   0
  5 2015-04-08-12:54:57   0
  6 2015-04-08-12:54:58   0
  7 2015-04-08-12:54:59   0
  8 2015-04-08-12:55:00   0
  9 2015-04-08-12:55:01   0
 10 2015-04-08-12:55:02   0
- To list the two metrics nfs_read_lat and nfs_write_lat for a specific time range, filtered by an export and NFS version, with 60-second buckets (one record represents 60 seconds), issue this command:
mmperfmon query nfs_read_lat,nfs_write_lat 2014-12-19-11:15:00 2014-12-19-11:20:00 --filter export=/ibm/gpfs/nfsexport,nfs_ver=NFSv3 -b 60
The system displays output similar to this:
Row           Timestamp   nfs_read_lat   nfs_write_lat
  1 2015-04-10-09:24:00   0              0
  2 2015-04-10-09:24:10   0              0
  3 2015-04-10-09:24:20   0              0
  4 2015-04-10-09:24:30   0              0
  5 2015-04-10-09:24:40   0              0
  6 2015-04-10-09:24:50   0              0
  7 2015-04-10-09:25:00   0              0
  8 2015-04-10-09:25:10   0              0
  9 2015-04-10-09:25:20   0              0
 10 2015-04-10-09:25:30   0              0
 11 2015-04-10-09:25:40   0              0
 12 2015-04-10-09:25:50   45025738       1882453623
 13 2015-04-10-09:26:00   0              0
 14 2015-04-10-09:26:10   0              0
 15 2015-04-10-09:26:20   0              0
 16 2015-04-10-09:26:30   0              0
 17 2015-04-10-09:26:40   0              0
 18 2015-04-10-09:26:50   0              0
- To list all available filters, issue this command:
mmperfmon query --list filters
The system displays output similar to this:
Available Filters:
node gpfs-21.localnet.com gpfs-22.localnet.com
protocol smb2
db_name account_policy autorid brlock ctdb dbwrap_watchers g_lock group_mapping leases locking netlogon_creds_cli notify_index passdb registry secrets serverid share_info smbXsrv_open_global smbXsrv_session_global smbXsrv_tcon_global smbXsrv_version_global
gpfs_fs_name fs0 gpfs0
gpfs_cluster_name gpfs-cluster-2.localnet.com
mountPoint / /boot /dev /dev/shm /gpfs/fs0 /mnt/gpfs0 /run /sys/fs/cgroup
operation break cancel close create find flush getinfo ioctl keepalive lock logoff negprot notify read sesssetup setinfo tcon tdis write
sensor CPU CTDBDBStats CTDBStats DiskFree GPFSFilesystemAPI GPFSVFS Load Memory Network SMBGlobalStats SMBStats
netdev_name eth0 lo
- To run a named query for export /ibm/gpfs/nfsexport and nfs_ver NFSv3, using the default bucket size of 1 second and showing the last 10 buckets, issue this command:
mmperfmon query nfsIOrate --filter export=/ibm/gpfs/nfsexport,nfs_ver=NFSv3,node=cluster1.ibm.com
The system displays output similar to this:
Legend:
 1: cluster1.ibm.com|NFSIO|/ibm/gpfs/nfsexport|NFSv3|nfs_read_ops
 2: cluster2.ibm.com|NFSIO|/ibm/gpfs/nfsexport|NFSv3|nfs_write_ops
Row           Timestamp   nfs_read_ops   nfs_write_ops
  1 2015-05-11-13:32:57    0              0
  2 2015-05-11-13:32:58   90             90
  3 2015-05-11-13:32:59   90             90
  4 2015-05-11-13:33:00   90             91
  5 2015-05-11-13:33:01   91             90
  6 2015-05-11-13:33:02   91             92
  7 2015-05-11-13:33:03   89             88
  8 2015-05-11-13:33:04   91             92
  9 2015-05-11-13:33:05   93             92
 10 2015-05-11-13:33:06   89             89
- To run a named query for export /ibm/gpfs/nfsexport and nfs_ver NFSv3, using a bucket size of 1 minute and showing the last 20 buckets (= 20 minutes), issue this command:
mmperfmon query nfsIOrate --filter export=/ibm/gpfs/nfsexport,nfs_ver=NFSv3,node=cluster1.ibm.com -n 20 -b 60
The system displays output similar to this:
Legend:
 1: cluster1.ibm.com|NFSIO|/ibm/gpfs/nfsexport|NFSv3|nfs_read_ops
 2: cluster2.ibm.com|NFSIO|/ibm/gpfs/nfsexport|NFSv3|nfs_write_ops
Row           Timestamp   nfs_read_ops   nfs_write_ops
  1 2015-05-11-13:31:00      0              0
  2 2015-05-11-13:32:00    280            280
  3 2015-05-11-13:33:00    820            820
  4 2015-05-11-13:34:00      0              0
  5 2015-05-11-13:35:00      0              0
  6 2015-05-11-13:36:00      0              0
  7 2015-05-11-13:37:00      0              0
  8 2015-05-11-13:38:00      0              0
  9 2015-05-11-13:39:00   1000           1000
 10 2015-05-11-13:40:00   1000           1000
 11 2015-05-11-13:41:00      0              0
 12 2015-05-11-13:42:00      0              0
 13 2015-05-11-13:43:00      0              0
 14 2015-05-11-13:44:00   2000           2000
 15 2015-05-11-13:45:00      0              0
 16 2015-05-11-13:46:00      0              0
 17 2015-05-11-13:47:00   1000           1000
 18 2015-05-11-13:48:00   1000           1000
 19 2015-05-11-13:49:00      0              0
 20 2015-05-11-13:50:00      0              0
- To run a compareNodes query for the cpu_user metric, issue this command:
mmperfmon query compareNodes cpu_user
The system displays output similar to this:
Legend:
 1: cluster1.ibm.com|CPU|cpu_user
 2: cluster2.ibm.com|CPU|cpu_user
Row           Timestamp   cluster1   cluster2
  1 2015-05-11-13:53:54   0.5        0.25
  2 2015-05-11-13:53:55   0.5        0.25
  3 2015-05-11-13:53:56   0.25       0.25
  4 2015-05-11-13:53:57   0.5        0.25
  5 2015-05-11-13:53:58   0.25       0.75
  6 2015-05-11-13:53:59   0.5        0.25
  7 2015-05-11-13:54:00   0.25       0.25
  8 2015-05-11-13:54:01   0.5        0.25
  9 2015-05-11-13:54:02   0.25       0.25
 10 2015-05-11-13:54:03   0.5        0.25
- To run an object query, issue the following command:
mmperfmon query objObj 2016-09-28-09:56:39 2016-09-28-09:56:43
The system displays output similar to this:
 1: cluster1.ibm.com|SwiftObject|object_auditor_time
 2: cluster1.ibm.com|SwiftObject|object_expirer_time
 3: cluster1.ibm.com|SwiftObject|object_replication_partition_delete_time
 4: cluster1.ibm.com|SwiftObject|object_replication_partition_update_time
 5: cluster1.ibm.com|SwiftObject|object_DEL_time
 6: cluster1.ibm.com|SwiftObject|object_DEL_err_time
 7: cluster1.ibm.com|SwiftObject|object_GET_time
 8: cluster1.ibm.com|SwiftObject|object_GET_err_time
 9: cluster1.ibm.com|SwiftObject|object_HEAD_time
10: cluster1.ibm.com|SwiftObject|object_HEAD_err_time
11: cluster1.ibm.com|SwiftObject|object_POST_time
12: cluster1.ibm.com|SwiftObject|object_POST_err_time
13: cluster1.ibm.com|SwiftObject|object_PUT_time
14: cluster1.ibm.com|SwiftObject|object_PUT_err_time
15: cluster1.ibm.com|SwiftObject|object_REPLICATE_time
16: cluster1.ibm.com|SwiftObject|object_REPLICATE_err_time
17: cluster1.ibm.com|SwiftObject|object_updater_time
Row object_auditor_time object_expirer_time object_replication_partition_delete_time object_replication_partition_update_time object_DEL_time object_DEL_err_time object_GET_time object_GET_err_time object_HEAD_time object_HEAD_err_time object_POST_time object_POST_err_time object_PUT_time object_PUT_err_time object_REPLICATE_time object_REPLICATE_err_time object_updater_time
1 2016-09-28 09:56:39 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.855923 0.000000 0.000000 0.000000 45.337915 0.000000 0.000000 0.000000 0.000000
2 2016-09-28 09:56:40 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
3 2016-09-28 09:56:41 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.931925 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
4 2016-09-28 09:56:42 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.855923 0.000000 0.000000 0.000000 516.280890 0.000000 0.000000 0.000000 0.000000
object_DEL_total_time = 0.0
object_PUT_total_time = 561.618805
object_GET_total_time = 0.0
object_POST_total_time = 0.0
object_HEAD_total_time = 1.786948
object_PUT_max_time = 516.28089
object_POST_max_time = 0.0
object_GET_max_time = 0.0
object_HEAD_max_time = 0.931025
object_DEL_max_time = 0.0
object_GET_avg_time = 0.0
object_DEL_avg_time = 0.0
object_PUT_avg_time = 280.809402
object_POST_avg_time = 0.0
object_HEAD_avg_time = 0.893474
object_DEL_time_count = 0.0
object_POST_time_count = 0
object_PUT_time_count = 2
object_HEAD_time_count = 2
object_GET_time_count = 0
object_DEL_min_time = 0.0
object_PUT_min_time = 45.337915
object_GET_min_time = 0.0
object_POST_min_time = 0.0
object_HEAD_min_time = 0.855923
- To view expired keys, issue the following command:
mmperfmon query --list expiredKeys
The system displays output similar to this:
Found expired keys:
test_nodename|GPFSFilesystem|gpfsgui-cluster-2.novalocal|fs1
test_nodename|GPFSFilesystem|gpfsgui-cluster-2.novalocal|fs2
test_nodename|GPFSFilesystemAPI|gpfsgui-cluster-2.novalocal|fs2
test_nodename|GPFSFilesystemAPI|gpfsgui-cluster-2.novalocal|fs1
test_nodename|GPFSFilesystem|gpfsgui-cluster-2.novalocal|gpfs0
test_nodename|DiskFree|/mnt/gpfs0
test_nodename|Netstat
test_nodename|GPFSFilesystem|gpfsgui-cluster-2.novalocal|objfs
test_nodename|GPFSVFS
test_nodename|GPFSNode
test_nodename|GPFSFilesystemAPI|gpfsgui-cluster-2.novalocal|gpfs0
test_nodename|GPFSFilesystemAPI|gpfsgui-cluster-2.novalocal|objfs
test_nodename|DiskFree|/gpfs/fs2
test_nodename|DiskFree|/gpfs/fs1
test_nodename|GPFSRPCS
test_nodename|CPU
test_nodename|GPFSNodeAPI
test_nodename|Load
test_nodename|DiskFree|/mnt/objfs
test_nodename|Memory
- To delete an expired key, issue the following command:
mmperfmon delete --key 'test_nodename|DiskFree|/mnt/gpfs0'
The system displays output similar to this:
Check expired keys completed. Successfully 3 keys deleted.
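Query results such as the tables above lend themselves to post-processing once exported with --csv. A minimal Python sketch, assuming the CSV export carries a header row followed by comma-separated rows matching the tabular layout; the sample data here is illustrative, not captured from a cluster:

```python
import csv
import io

# Illustrative sample only; real data would come from something like:
#   mmperfmon query nfs_read_ops,nfs_write_ops --csv ...
sample = """Row,Timestamp,nfs_read_ops,nfs_write_ops
1,2015-05-11-13:32:57,0,0
2,2015-05-11-13:32:58,90,90
3,2015-05-11-13:32:59,90,90
"""

rows = list(csv.DictReader(io.StringIO(sample)))
# Average the read-operations column across all returned buckets.
avg_read = sum(int(r["nfs_read_ops"]) for r in rows) / len(rows)
print(f"average nfs_read_ops over {len(rows)} buckets: {avg_read:.1f}")
```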
See also
mmdumpperfdata command