Configuring performance metrics and display options in the Statistics page of the GUI

Use the Monitoring > Statistics page to monitor the performance of system resources and of file and object storage. The page provides various predefined charts for this purpose. You can select the charts that you need and narrow the displayed data by using filter criteria.

The predefined performance charts and metrics help you investigate all nodes, or any particular node, that collects the metrics. The following figure shows the configuration options that are available in the Statistics page of the management GUI.
Figure 1. Statistics page in the IBM Spectrum Scale management GUI

You can select charts from the list of predefined charts. You can display up to two charts at a time.

Display options in performance charts

The charting section displays the performance details based on various aspects. The GUI provides a rich set of controls to view performance charts. You can use these controls to perform the following actions on the charts that are displayed on the page:
  • Zoom the chart by using the mouse wheel or by resizing the timeline control. The y-axis can be automatically adjusted during zooming.
  • Pan the chart by clicking and dragging the chart or the timeline control at the bottom. The y-axis can be automatically adjusted during panning.
  • Compare charts side by side. You can synchronize the y-axis and bind the x-axis. To modify the x and y axes of the chart, click the configuration symbol next to the title Statistics and select the required options.
  • Link the timelines of the two charts together by using the display options that are available.
  • Use the Dashboard to access all single-graph charts, whether they are predefined charts or custom-created favorites.

Selecting performance and capacity metrics

To monitor the performance of the system, you need to select the appropriate metrics to be displayed in the performance charts. Metrics are grouped by the combination of resource type and aggregation level. The resource type determines the area from which the data is taken to create the performance analysis, and the aggregation level determines the level at which the data is aggregated. The aggregation levels that are available for selection vary based on the resource type.

Sensors are configured against each resource type. The following table provides a mapping between resource types and sensors under the Performance category.
Table 1. Sensors available for each resource type
Resource type               Sensor name          Candidate nodes
Network                     Network              All
System Resources            CPU                  All
                            Load
                            Memory
NSD Server                  GPFSNSDDisk          NSD Server nodes
IBM Spectrum Scale™ Client  GPFSFilesystem       IBM Spectrum Scale Client nodes
                            GPFSVFS
                            GPFSFilesystemAPI
NFS                         NFSIO                Protocol nodes running NFS service
SMB                         SMBStats             Protocol nodes running SMB service
                            SMBGlobalStats
Waiters                     GPFSWaiters          All nodes
CTDB                        CTDBStats            Protocol nodes running SMB service
Object                      SwiftAccount         Protocol nodes running Object service
                            SwiftContainer
                            SwiftObject
                            SwiftProxy
AFM                         GPFSAFM              All nodes
                            GPFSAFMFS
                            GPFSAFMFSET
Transparent Cloud Tiering   MCStoreGPFSStats     Cloud gateway nodes
                            MCStoreIcstoreStats
                            MCStoreLWEStats
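
The mapping in Table 1 can also be expressed as a small lookup, for example to check which of the sensors listed for a resource type are absent from a set of sensors that you know to be active. The following Python sketch is illustrative only: the dictionary is transcribed from the table, while the function name and the active_sensors input are hypothetical and are not part of the product.

```python
# Mapping transcribed from Table 1. The names used here are illustrative,
# not part of any IBM Spectrum Scale API.
SENSORS_BY_RESOURCE_TYPE = {
    "Network": ["Network"],
    "System Resources": ["CPU", "Load", "Memory"],
    "NSD Server": ["GPFSNSDDisk"],
    "IBM Spectrum Scale Client": ["GPFSFilesystem", "GPFSVFS", "GPFSFilesystemAPI"],
    "NFS": ["NFSIO"],
    "SMB": ["SMBStats", "SMBGlobalStats"],
    "Waiters": ["GPFSWaiters"],
    "CTDB": ["CTDBStats"],
    "Object": ["SwiftAccount", "SwiftContainer", "SwiftObject", "SwiftProxy"],
    "AFM": ["GPFSAFM", "GPFSAFMFS", "GPFSAFMFSET"],
    "Transparent Cloud Tiering": [
        "MCStoreGPFSStats",
        "MCStoreIcstoreStats",
        "MCStoreLWEStats",
    ],
}


def sensors_not_active(resource_type, active_sensors):
    """Return the sensors listed for resource_type that are not in active_sensors.

    active_sensors is a hypothetical set of sensor names that you have
    collected yourself, for example from your performance monitoring
    configuration.
    """
    listed = SENSORS_BY_RESOURCE_TYPE[resource_type]
    return [name for name in listed if name not in active_sensors]


if __name__ == "__main__":
    # Example: only one of the two SMB sensors is active.
    print(sensors_not_active("SMB", {"SMBStats", "CPU"}))
```

A lookup like this can help narrow down why a chart for a given resource type shows no data, under the assumption that the chart depends on the sensors listed for that resource type.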

The Waiters resource type is used to monitor long-running file system threads. Waiters are characterized by the purpose of the corresponding file system threads. Examples are an RPC call waiter that is waiting on network I/O, or a waiter that is waiting for a local disk I/O file system operation. Each waiter has an associated wait time, which indicates how long the waiter has already been waiting. With some exceptions, long waiters typically indicate that something in the system is not healthy.

The Waiters performance chart shows the total count of waiters, across all nodes in the cluster, whose wait time is above a certain threshold. You can select thresholds from 100 milliseconds to 60 seconds in the list below the aggregation level. By default, the value shown in the graph is the sum of the number of waiters that exceed the threshold on all nodes of the cluster at that point in time. You can use the filter functionality to display waiters data only for selected nodes or file systems. In addition, there are separate metrics for different waiter types such as Local Disk I/O, Network I/O, ThCond, ThMutex, Delay, and Syscall.
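
The aggregation that the Waiters chart performs can be sketched as follows. This is a minimal Python illustration of the counting logic described above, not the GUI's implementation: the sample data, the node names, and the function name are hypothetical.

```python
# Hypothetical snapshot: node name -> wait times, in seconds, of the
# waiters that exist on that node at one point in time.
waiters_by_node = {
    "node1": [0.05, 0.3, 2.0],
    "node2": [0.15, 61.0],
    "node3": [],
}


def count_waiters_above(samples, threshold_s, nodes=None):
    """Sum the number of waiters whose wait time exceeds threshold_s.

    If nodes is given, only those nodes are counted, which mimics
    restricting the chart to selected nodes with a filter.
    """
    if nodes is None:
        selected = samples
    else:
        selected = {name: samples[name] for name in nodes if name in samples}
    return sum(
        1
        for wait_times in selected.values()
        for wait_time in wait_times
        if wait_time > threshold_s
    )


if __name__ == "__main__":
    print(count_waiters_above(waiters_by_node, 0.1))    # 100 ms threshold
    print(count_waiters_above(waiters_by_node, 60.0))   # 60 s threshold
    print(count_waiters_above(waiters_by_node, 0.1, nodes=["node1"]))
```

With the sample data, four waiters exceed the 100 millisecond threshold across the cluster, one exceeds 60 seconds, and two exceed 100 milliseconds on node1 alone.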

You can also monitor the capacity details that are aggregated at the following levels:
  • NSD
  • Node
  • File system
  • Pool
  • Fileset
  • Cluster

The following table lists the sensors that are used for capturing the capacity details.

Table 2. Sensors available to capture capacity details
Sensor name       Candidate nodes
DiskFree          All nodes
GPFSFilesetQuota  Only a single node
GPFSDiskCap       Only a single node
GPFSPool          Only a single node where all GPFS file systems are mounted. The GUI does not display any values based on this sensor but it displays warnings or errors due to thresholds based on this sensor.
GPFSFileset       Only a single node. The GUI does not display any values based on this sensor but it displays warnings or errors due to thresholds based on this sensor.

You can edit an existing chart by clicking the icon in the upper-right corner of the performance chart and selecting Edit to modify the metrics selections. Do the following to drill down to the metric that you are interested in:
  1. Select the cluster to be monitored from the Cluster field. You can either select the local cluster or the remote cluster.
  2. Select Resource type. This is the area from which the data is taken to create the performance analysis.
  3. Select Aggregation level. The aggregation level determines the level at which the data is aggregated. The aggregation levels that are available for selection vary based on the resource type.
  4. Select the entities that need to be graphed. The table lists all entities that are available for the chosen resource type and aggregation level. When a metric is selected, the same grid also shows the selected metrics, and you can sort, filter, or adjust the time frame to find the entities that you want to select.
  5. Select Metrics. Metrics are the types of data that need to be included in the performance chart. The list of metrics that is available for selection varies based on the resource type and aggregation level.
  6. Use filters to narrow down the data further, in addition to the objects and metrics selection. Depending on the selected object category and aggregation level, the Filter section can be displayed underneath the aggregation level, allowing one or more filters to be set. Filters are specified as regular expressions as shown in the following examples:
    • As a single entity:

      node1

      eth0

    • Filter metrics applicable to multiple nodes as shown in the following examples:
      • To select a range of nodes such as node1, node2 and node3:

        node1|node2|node3

        node[1-3]

      • To filter based on a string of text. For example, all nodes starting with 'nod' or ending with 'int':

        nod.+|.+int

      • To filter network interfaces eth0 through eth6, bond0 and eno0 through eno6:

        eth[0-6]|bond0|eno[0-6]

      • To filter nodes starting with 'strg' or 'int' and ending with 'nx':

        (strg|int).+nx
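
Before you enter a filter in the GUI, it can be useful to try the expression against a list of known names. The following Python sketch does this with the standard re module. It assumes that a filter must match the whole entity name, and that the GUI's regular expression dialect behaves like Python's for these simple patterns; neither assumption is confirmed here. The node and interface names are hypothetical.

```python
import re


def select_entities(pattern, names):
    """Return the names that the filter pattern matches in full.

    Whole-name matching is an assumption made for this sketch.
    """
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Hypothetical entity names.
nodes = ["node1", "node2", "node3", "node4", "strg01nx", "int02nx", "mgmt-int"]
interfaces = ["eth0", "eth7", "bond0", "eno3", "lo"]

if __name__ == "__main__":
    # A range of nodes, written as a character class.
    print(select_entities("node[1-3]", nodes))
    # Names starting with 'nod' or ending with 'int'.
    print(select_entities("nod.+|.+int", nodes))
    # Several interface families in one expression.
    print(select_entities("eth[0-6]|bond0|eno[0-6]", interfaces))
    # The alternation operator | has the lowest precedence, so alternative
    # prefixes that share a suffix must be grouped together.
    print(select_entities("(strg|int).+nx", nodes))
```

Grouping matters in the last pattern: without the shared parentheses, the suffix would apply only to the second alternative.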

Creating favorite charts

Favorite charts are customized versions of the predefined charts. Favorite charts, along with the predefined charts, are available for selection when you add widgets in the Dashboard page.

To create a favorite chart, click the star symbol next to the chart title and enter a label.