Statistics

Use the Statistics page to monitor the performance and capacity information of system resources and file and object storage. Performance of the system can be monitored by using various pre-defined charts. You can select the required charts and monitor the performance based on the filter criteria.

The pre-defined performance widgets and metrics help in investigating every node or any particular node that is collecting the metrics. You can select the pre-defined performance widgets by clicking the down-arrow in the Search box.

Configuring performance data collection

The mmperfmon command can be used to configure performance data collection and query performance data. The GUI displays only a subset of the metrics that can be configured for collection. Performance data collection is done by the following two distinct entities:
  • Collector: The collector can run on any node in the system. However, to display the performance data that is gathered from the associated sensors, collectors must run on the GUI nodes. The metrics are stored in a database on the collector node for future review.
  • Sensor: The sensors run on nodes that are required to collect metrics.

For more information on how to configure collectors and sensors, see the sections Performance monitoring and Configuring performance monitoring tool in IBM Spectrum Scale: Administration Guide.

Note: Metrics for SMB, NFS, Object, AFM, and transparent cloud tiering can be collected only on the nodes where those services are active.

Display options in performance charts

The charting section displays the performance details based on various aspects. The GUI provides a rich set of controls to view performance charts. You can use these controls to perform the following actions on the charts that are displayed on the page:
  • Zoom the chart by using the mouse wheel or by resizing the timeline control. The y-axis can be automatically adjusted during zooming.
  • Pan the chart by clicking and dragging the chart or the timeline control at the bottom. The y-axis can be automatically adjusted during panning.
  • Compare charts side by side. You can synchronize the y-axis and bind the x-axis. To modify the x and y axes of the chart, click the configuration symbol next to the title Statistics and select the required options.
  • Link the timelines of the two charts together by using the display options that are available.
  • Use the Dashboard to access all single-graph charts, whether they are predefined or custom-created favorites.

Selecting performance and capacity metrics

To monitor the performance of the system, you need to select the appropriate metrics to be displayed in the performance charts. Metrics are grouped by resource type and aggregation level. The resource type determines the area from which the data is taken to create the performance analysis, and the aggregation level determines the level at which the data is aggregated. The aggregation levels that are available for selection vary based on the resource type.

Sensors are configured against each resource type. The following table provides a mapping between resource types and sensors under the Performance category.
Table 1. Sensors available for each resource type
Resource type              Sensor name          Candidate nodes
Network                    Network              All
System Resources           CPU                  All
                           Load
                           Memory
NSD Server                 GPFSNSDDisk          NSD Server nodes
IBM Spectrum Scale Client  GPFSFilesystem       IBM Spectrum Scale Client nodes
                           GPFSVFS
                           GPFSFilesystemAPI
NFS                        NFSIO                Protocol nodes running NFS service
SMB                        SMBStats             Protocol nodes running SMB service
                           SMBGlobalStats
Waiters                    GPFSWaiters          All nodes
CTDB                       CTDBStats            Protocol nodes running SMB service
Object                     SwiftAccount         Protocol nodes running Object service
                           SwiftContainer
                           SwiftObject
                           SwiftProxy
AFM                        GPFSAFM              All nodes
                           GPFSAFMFS
                           GPFSAFMFSET
Transparent Cloud Tiering  MCStoreGPFSStats     Cloud gateway nodes
                           MCStoreIcstoreStats
                           MCStoreLWEStats
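
The mapping in Table 1 can also be held as a simple lookup structure, for example when a script needs to work out which resource type a given sensor feeds. The following is a minimal sketch only: the dictionary contents are transcribed from Table 1, while the names SENSORS_BY_RESOURCE_TYPE and resource_type_for_sensor are hypothetical and are not part of the product.

```python
# Mapping transcribed from Table 1. The variable and function names are
# hypothetical helpers for illustration; they are not a product API.
SENSORS_BY_RESOURCE_TYPE = {
    "Network": ["Network"],
    "System Resources": ["CPU", "Load", "Memory"],
    "NSD Server": ["GPFSNSDDisk"],
    "IBM Spectrum Scale Client": ["GPFSFilesystem", "GPFSVFS", "GPFSFilesystemAPI"],
    "NFS": ["NFSIO"],
    "SMB": ["SMBStats", "SMBGlobalStats"],
    "Waiters": ["GPFSWaiters"],
    "CTDB": ["CTDBStats"],
    "Object": ["SwiftAccount", "SwiftContainer", "SwiftObject", "SwiftProxy"],
    "AFM": ["GPFSAFM", "GPFSAFMFS", "GPFSAFMFSET"],
    "Transparent Cloud Tiering": [
        "MCStoreGPFSStats",
        "MCStoreIcstoreStats",
        "MCStoreLWEStats",
    ],
}


def resource_type_for_sensor(sensor_name):
    """Return the resource type that lists the sensor in Table 1, or None."""
    for resource_type, sensors in SENSORS_BY_RESOURCE_TYPE.items():
        if sensor_name in sensors:
            return resource_type
    return None
```

For example, resource_type_for_sensor("SMBGlobalStats") returns "SMB", and a sensor that is not listed under the Performance category, such as DiskFree, returns None.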

The Waiters resource type is used to monitor long-running file system threads. Waiters are characterized by the purpose of the corresponding file system threads, for example, an RPC call waiter that is waiting for Network I/O threads, or a waiter that is waiting for a local disk I/O file system operation. Each waiter has an associated wait time, which indicates how long the waiter has already been waiting. With some exceptions, long waiters typically indicate that something in the system is not healthy.

The Waiters performance chart shows the aggregated count of waiters, across all nodes in the cluster, whose wait time is above a certain threshold. Thresholds from 100 milliseconds to 60 seconds can be selected in the list below the aggregation level. By default, the value shown in the graph is the sum of the number of waiters that exceed the threshold on all nodes of the cluster at that point in time. The filter functionality can be used to display waiter data only for selected nodes or file systems. Furthermore, there are separate metrics for different waiter types such as Local Disk I/O, Network I/O, ThCond, ThMutex, Delay, and Syscall.
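
The aggregation that the Waiters chart describes can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the GUI's implementation: the function count_long_waiters, the sample data, and the node names are all hypothetical, and wait times are assumed to be expressed in seconds.

```python
def count_long_waiters(waiters_by_node, threshold_seconds, nodes=None):
    """Sum, over the selected nodes, the waiters that exceed the threshold.

    waiters_by_node maps a node name to the wait times (in seconds) of the
    waiters observed on that node at one point in time. If nodes is None,
    every node is included, which mirrors the default chart behavior.
    """
    total = 0
    for node, wait_times in waiters_by_node.items():
        if nodes is not None and node not in nodes:
            continue  # node excluded by the filter
        total += sum(1 for wait in wait_times if wait > threshold_seconds)
    return total


# Hypothetical snapshot of waiters at a single point in time.
sample = {
    "node1": [0.05, 0.3, 2.0],
    "node2": [0.15, 61.0],
    "node3": [],
}

cluster_100ms = count_long_waiters(sample, 0.1)             # all nodes, 100 ms
cluster_60s = count_long_waiters(sample, 60)                # all nodes, 60 s
node1_100ms = count_long_waiters(sample, 0.1, {"node1"})    # filtered to node1
```

With this sample, four waiters exceed 100 milliseconds across the cluster, one exceeds 60 seconds, and two of the waiters above 100 milliseconds are on node1.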

You can also monitor the capacity details that are aggregated at the following levels:
  • NSD
  • Node
  • File system
  • Pool
  • Fileset
  • Cluster

The following table lists the sensors that are used for capturing the capacity details.

Table 2. Sensors available to capture capacity details
Sensor name       Candidate nodes
DiskFree          All nodes
GPFSFilesetQuota  Only a single node
GPFSDiskCap       Only a single node
GPFSPool          Only a single node where all GPFS file systems are mounted. The GUI does not display any values based on this sensor but it displays warnings or errors due to thresholds based on this sensor.
GPFSFileset       Only a single node. The GUI does not display any values based on this sensor but it displays warnings or errors due to thresholds based on this sensor.

You can edit an existing chart by clicking the icon that is available on the upper right corner of the performance chart and selecting Edit to modify the metrics selections. Do the following to drill down to the metric that you are interested in:
  1. Select the cluster to be monitored from the Cluster field. You can either select the local cluster or the remote cluster.
  2. Select Resource type. This is the area from which the data is taken to create the performance analysis.
  3. Select Aggregation level. The aggregation level determines the level at which the data is aggregated. The aggregation levels that are available for selection vary based on the resource type.
  4. Select the entities that need to be graphed. The table lists all entities that are available for the chosen resource type and aggregation level. When a metric is selected, you can also see the selected metrics in the same grid and use methods like sorting, filtering, or adjusting the time frame to find the entities that you want to graph.
  5. Select Metrics. Metrics are the types of data to be included in the performance chart. The list of metrics that is available for selection varies based on the resource type and aggregation level.
  6. Use the filter option to further narrow the selection of objects and metrics. Depending on the selected object category and aggregation level, the Filter section can be displayed underneath the aggregation level, allowing one or more filters to be set. Filters are specified as regular expressions, as shown in the following examples:
    • As a single entity:

      node1

      eth0

    • Filter metrics applicable to multiple nodes as shown in the following examples:
      • To select a range of nodes such as node1, node2 and node3:

        node1|node2|node3

        node[1-3]

      • To filter based on a string of text. For example, all nodes starting with 'nod' or ending with 'int':

        nod.+|.+int

      • To filter network interfaces eth0 through eth6, bond0 and eno0 through eno6:

        eth[0-6]|bond0|eno[0-6]

      • To filter nodes starting with 'strg' or 'int' and ending with 'nx':

        (strg|int).+nx
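
The filter expressions above can be tried out before they are entered in the GUI. The following sketch uses Python's re module and assumes that a filter must match the whole entity name; the GUI's exact matching semantics are not stated here, so treat that as an assumption. The helper select_entities and the sample node and interface names are hypothetical. The last pattern groups the two prefixes as (strg|int) so that the 'nx' ending applies to both of them.

```python
import re


def select_entities(pattern, names):
    """Return the names that the filter pattern matches in full.

    Full-string matching is an assumption made for this sketch; it is not
    confirmed behavior of the GUI filter.
    """
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Hypothetical entity names for illustration.
nodes = ["node1", "node2", "node3", "node4",
         "strg01nx", "int7nx", "strg02", "mgmt-int"]
interfaces = ["eth0", "eth7", "bond0", "eno3", "lo"]

single = select_entities("node1", nodes)
node_range = select_entities("node[1-3]", nodes)
node_list = select_entities("node1|node2|node3", nodes)
by_text = select_entities("nod.+|.+int", nodes)
by_interface = select_entities("eth[0-6]|bond0|eno[0-6]", interfaces)
by_prefix_suffix = select_entities("(strg|int).+nx", nodes)
```

In this sample, node[1-3] and node1|node2|node3 select the same three nodes, and (strg|int).+nx selects strg01nx and int7nx but not strg02.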

Creating favorite charts

Favorite charts are customized versions of the predefined charts. Favorite charts, along with the predefined charts, are available for selection when you add widgets on the Dashboard page.

To create favorite charts, click the ‘star’ symbol that is placed next to the chart title and enter the label.

Monitoring performance of the remote cluster

You can monitor the performance of the remote cluster with the help of performance monitoring tools that are configured in both the remote and local clusters. The performance details that are collected in the remote cluster are shared with the local cluster by using REST APIs.

After establishing the connection with the remote cluster by using the Cluster > Remote Connections page, you can access the performance details of the remote cluster from the following GUI pages:

  • Monitoring > Statistics
  • Monitoring > Dashboard
  • Files > File Systems

To monitor the performance details of the remote cluster on the Statistics page, create customized performance charts by performing the following steps:
  1. Access the edit mode by clicking the icon that is available on the upper right corner of the performance chart and selecting Edit.
  2. In the edit mode, select the remote cluster to be monitored from the Cluster field. You can either select the local cluster or remote cluster from this field.
  3. Select Resource type. This is the area from which the data is taken to create the performance analysis.
  4. Select Aggregation level. The aggregation level determines the level at which the data is aggregated. The aggregation levels that are available for selection vary based on the resource type.
  5. Select the entities that need to be graphed. The table lists all entities that are available for the chosen resource type and aggregation level. When a metric is selected, you can also see the selected metrics in the same grid and use methods like sorting, filtering, or adjusting the time frame to find the entities that you want to graph.
  6. Select Metrics. Metrics are the types of data to be included in the performance chart. The list of metrics that is available for selection varies based on the resource type and aggregation level.
  7. Click Apply to create the customized chart.

After creating the customized performance chart, you can mark it as a favorite chart to display it on the Dashboard page.

If a file system is mounted on the remote cluster nodes, the performance details of those nodes are available in the Remote Nodes tab of the detailed view of file systems in the Files > File Systems page.