Use case 2: Observe the file system capacity usage by using default threshold rules
This use case demonstrates the use of the mmhealth thresholds list command to monitor a file system capacity event by using the default threshold rules.
Because the file system capacity-related thresholds, such as DataCapUtil_Rule, MetaDataCapUtil_Rule, and InodeCapUtil_Rule, are not node-specific, these thresholds are reported on the node that has the active threshold monitor role.
- Issue the following command to view the node that has the active threshold monitor role and the predefined threshold rules DataCapUtil_Rule, MetaDataCapUtil_Rule, and InodeCapUtil_Rule that are enabled in the cluster:
mmhealth thresholds list

The command shows output similar to the following:

active_thresholds_monitor: scale-12.vmlocal

### Threshold Rules ###
rule_name             metric                    error  warn  direction  filterBy  groupBy                                            sensitivity
------------------------------------------------------------------------------------------------------------------------------------------------
MemFree_Rule          MemoryAvailable_percent   None   5.0   low                  node                                               300-min
DataCapUtil_Rule      DataPool_capUtil          90.0   80.0  high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name  300
MetaDataCapUtil_Rule  MetaDataPool_capUtil      90.0   80.0  high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name  300
InodeCapUtil_Rule     Fileset_inode             90.0   80.0  high                 gpfs_cluster_name,gpfs_fs_name,gpfs_fset_name      300
SMBConnPerNode_Rule   current_connections       3000   None  high                 node                                               300
SMBConnTotal_Rule     current_connections       20000  None  high                                                                    300
AFMInQueue_Rule       AFMInQueueMemory_percent  90.0   80.0  high                 node                                               300
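If you need the name of the active threshold monitor node in a script, you can parse it out of the preceding output. The following sketch is one possible approach and assumes the output format that is shown above, which can vary between releases:

mmhealth thresholds list | awk -F': ' '/active_thresholds_monitor/ {print $2}'

In this example, the command prints scale-12.vmlocal.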
- Use ssh to switch to the node that has the active threshold monitor role:
[root@scale-12 ~]# ssh scale-12.vmlocal
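Optionally, you can verify that the threshold monitoring services are active on this node by querying the THRESHOLD component. This is a sketch that assumes the THRESHOLD component name of recent mmhealth releases; check the mmhealth node show help on your system:

[root@scale-12 ~]# mmhealth node show threshold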
- Issue the following command to review the file system events:

[root@scale-12 ~]# mmhealth node show filesystem -v

The command gives output similar to the following:
Node name: scale-11.vmlocal

Component        Status    Status Change        Reasons & Notices
--------------------------------------------------------------------------
FILESYSTEM       HEALTHY   2022-12-07 21:12:03  -
  cesSharedRoot  HEALTHY   2022-12-07 10:38:55  -
  localFS        DEGRADED  2022-12-07 21:12:03  pool-metadata_high_warn
  remote-fs      HEALTHY   2022-12-15 14:23:24  -

Event                    Parameter      Severity  Active Since         Event Message
-------------------------------------------------------------------------------------------------------------------------------------------------
...
inode_normal             cesSharedRoot  INFO      2022-12-07 10:39:25  The inode usage of fileset root in file system cesSharedRoot reached a normal level.
inode_normal             localFS        INFO      2022-12-07 21:34:03  The inode usage of fileset myFset1 in file system localFS reached a normal level.
inode_normal             localFS        INFO      2022-12-07 21:34:03  The inode usage of fileset root in file system localFS reached a normal level.
...
pool-data_normal         cesSharedRoot  INFO      2022-12-07 10:38:55  The pool data of file system cesSharedRoot has reached a normal data level.
pool-data_normal         cesSharedRoot  INFO      2022-12-07 10:38:55  The pool system of file system cesSharedRoot has reached a normal data level.
pool-data_normal         localFS        INFO      2022-12-07 21:34:03  The pool system of file system localFS has reached a normal data level.
pool-metadata_normal     cesSharedRoot  INFO      2022-12-07 10:38:55  The pool data of file system cesSharedRoot has reached a normal metadata level.
pool-metadata_normal     cesSharedRoot  INFO      2022-12-07 10:38:55  The pool system of file system cesSharedRoot has reached a normal metadata level.
pool-metadata_high_warn  localFS        WARNING   2022-12-07 21:34:03  The pool system of file system localFS has reached a warning level for metadata. 80.0

As you can see in the preceding file system example output, everything looks correct except the "pool-metadata_high_warn" event.
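If many file systems are mounted, you can limit the output to the components that need attention. The following sketch assumes that your mmhealth release supports the --unhealthy option of the mmhealth node show command:

[root@scale-12 ~]# mmhealth node show filesystem --unhealthy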
- Issue the following command to get the "pool-metadata_high_warn" warning details:

[root@scale-12 ~]# mmhealth event show pool-metadata_high_warn

The command shows the warning details similar to the following:
Event Name:   pool-metadata_high_warn
Description:  The pool has reached a warning level.
Cause:        The pool has reached a warning level.
User Action:  Add more capacity to pool or move to different pool or delete data and/or snapshots.
Severity:     WARNING
State:        DEGRADED
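If the default 80.0 warn boundary does not fit your environment, you can replace the default rule with a custom one. The following sketch assumes the mmhealth thresholds add and delete syntax of recent releases, and the rule name MetaDataCapUtil_Custom is an arbitrary example; verify the exact options on your system before use:

mmhealth thresholds delete MetaDataCapUtil_Rule
mmhealth thresholds add MetaDataPool_capUtil --errorlevel 95.0 --warnlevel 85.0 --name MetaDataCapUtil_Custom --groupby gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name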
Tip: See File system events to get the complete list of all the possible file system events.
- Compare the metadata capacity values that are reported by MetaDataCapUtil_Rule for the system pool of the localFS file system with the mmlspool command output:
[root@scale-11 ~]# mmlspool localFS

The command shows the storage pools in the file system at '/gpfs/localFS' similar to the following:

Name    Id  BlkSize  Data  Meta  Total Data in (KB)  Free Data in (KB)  Total Meta in (KB)  Free Meta in (KB)
system   0  4 MB     yes   yes   16777216            13320192 ( 79%)    16777216            2515582 ( 15%)
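You can also cross-check the pool capacity with the mmdf command, which reports free space per disk and per pool. This sketch assumes that your release supports the -P option to restrict the report to a single pool:

[root@scale-11 ~]# mmdf localFS -P system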
In the mmlspool output, you can see that the system pool has only 15% free space for metadata.
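To see why the rule raised a warning rather than an error, you can derive the metadata pool utilization from the mmlspool values. This is a rough check; the monitor computes the MetaDataPool_capUtil metric from its own performance data, so the numbers can differ slightly:

MetaDataPool_capUtil ≈ (Total Meta - Free Meta) / Total Meta
                     = (16777216 - 2515582) / 16777216
                     ≈ 0.85

A utilization of about 85% is above the 80.0 warn boundary of MetaDataCapUtil_Rule but below the 90.0 error boundary, so the rule raises the pool-metadata_high_warn event instead of an error.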