Removing non-detectable resource identifiers from the performance monitoring tool database

Every metric value stored in the database is associated with a metric name and a resource identifier (a single entity). However, the performance monitoring tool does not check whether stored entities are still detectable, so the identifiers of deleted and renamed resources remain in the database indefinitely. Identifiers that have returned no values over the 14-day retention period can be reviewed and deleted by using the mmperfmon command.

To avoid deleting identifiers that are only temporarily missing, every entity that is not detectable is retained for 14 days. When the retention period has expired and the undetectable entity has returned no values for more than 14 days, it is listed in the expiredKeys list and can be deleted.
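The retention rule can be illustrated with a small shell check. This is only a sketch of the rule as described above, not the way the performance monitoring tool implements it: the function name is_expired, the variable RETENTION_DAYS, and the use of epoch timestamps are assumptions made for the illustration.

```shell
# Sketch of the 14-day retention rule (illustrative only; the names and the
# use of epoch seconds are assumptions, not part of the product).
RETENTION_DAYS=14

# is_expired LAST_VALUE_EPOCH NOW_EPOCH
# Succeeds (exit status 0) only if the last value is MORE than 14 days old.
is_expired() {
    [ $(( $2 - $1 )) -gt $(( RETENTION_DAYS * 86400 )) ]
}
```

For example, is_expired "$last_seen" "$(date +%s)" succeeds only when the hypothetical last_seen timestamp is more than 14 days in the past. An entity that has been silent for exactly 14 days is still retained.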

Follow the given steps to clean up the performance monitoring tool database:

  1. To view expired keys, issue the following command:
    mmperfmon query --list=expiredKeys
    The system displays output similar to this:
    
    Found expired keys:
    test_nodename|GPFSFilesystem|gpfsgui-cluster-2.novalocal|fs1
    test_nodename|GPFSFilesystem|gpfsgui-cluster-2.novalocal|fs2
    test_nodename|GPFSFilesystemAPI|gpfsgui-cluster-2.novalocal|fs2
    test_nodename|GPFSFilesystemAPI|gpfsgui-cluster-2.novalocal|fs1
    test_nodename|GPFSFilesystem|gpfsgui-cluster-2.novalocal|gpfs0
    test_nodename|DiskFree|/mnt/gpfs0
    test_nodename|Netstat
    test_nodename|GPFSFilesystem|gpfsgui-cluster-2.novalocal|objfs
    test_nodename|GPFSVFS
    test_nodename|GPFSNode
    test_nodename|GPFSFilesystemAPI|gpfsgui-cluster-2.novalocal|gpfs0
    test_nodename|GPFSFilesystemAPI|gpfsgui-cluster-2.novalocal|objfs
    test_nodename|DiskFree|/gpfs/fs2
    test_nodename|DiskFree|/gpfs/fs1
    test_nodename|GPFSRPCS
    test_nodename|CPU
    test_nodename|GPFSNodeAPI
    test_nodename|Load
    test_nodename|DiskFree|/mnt/objfs
    test_nodename|Memory
    
  2. To delete an expired key, issue the following command:
     mmperfmon delete --key 'test_nodename|DiskFree|/mnt/gpfs0'
    The system displays output similar to this:
    Check expired keys completed. Successfully 3 keys deleted.
  3. To delete the performance monitoring keys of non-detectable resources automatically, enable the automated clean-up job (PerfKeyThread). The job is disabled by default, but it can be enabled to run every 7 days on the cluster manager node, which acts as the cluster state manager (CSM). Complete the following steps to enable or disable the automatic deletion of performance monitoring keys for non-detectable resources:
    1. To enable the automatic deletion of performance monitoring keys every 7 days, issue the following command:
       mmchconfig mmhealth-retention-perfkeythread_days=7 --force
      To restart the performance monitoring on the CSM node, issue the following command:
      # systemctl restart mmsysmon
    2. To disable the automatic deletion of expired keys, issue the following command:
       mmchconfig mmhealth-retention-perfkeythread_days=0 --force
      To restart the performance monitoring on the CSM node, issue the following command:
      # systemctl restart mmsysmon
    Important: It is recommended to enable the PerfKeyThread job on storage clusters that are used for IBM Storage Scale Container Native Storage Access or for IBM Storage Scale Container Storage Interface driver remote mounts. Kubernetes persistent volumes (PVs) and the resulting IBM Storage Scale Container Storage Interface driver filesets exist only temporarily on these clusters, but their historical data remains stored in the pmcollector database of the performance monitoring tool.
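
Steps 1 and 2 of the procedure above can be combined for manual review with a short shell sketch. It assumes that the query output has the form shown in step 1 (a header line followed by one key per line, with fields separated by '|'), that the second field is the sensor name, and that keys contain no single quotes. The function names extract_expired_keys and print_delete_commands are hypothetical helpers. The sketch only prints the delete commands and does not run them.

```shell
# Sketch: build one "mmperfmon delete --key" command per expired key.
# Assumptions: output format as in step 1; keys contain no single quotes.

# Read the query output on stdin and print only the key lines,
# with surrounding whitespace removed. The header line has no '|'.
extract_expired_keys() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' | grep -F '|' || true
}

# Read keys on stdin and print (not run) a delete command for each one.
# Optional argument: keep only keys whose second field equals this sensor.
print_delete_commands() {
    pdc_sensor="${1:-}"
    while IFS= read -r pdc_key; do
        if [ -n "$pdc_sensor" ] &&
           [ "$(printf '%s\n' "$pdc_key" | cut -d'|' -f2)" != "$pdc_sensor" ]; then
            continue
        fi
        printf "mmperfmon delete --key '%s'\n" "$pdc_key"
    done
}
```

A possible use is to pipe the query into both helpers, for example mmperfmon query --list=expiredKeys | extract_expired_keys | print_delete_commands DiskFree, and to review the printed commands before running any of them. Printing instead of executing is a deliberate choice here, because a deleted key cannot be checked afterward.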

The following table lists the resource types and the responsible sensors that are included in the detectability validation procedure.

Table 1. Resource types and the sensors responsible for them
Resource type             Responsible sensors
Filesets data             GPFSFileset, GPFSFilesetQuota
Filesystem inodes data    GPFSInodeCap
Pools data                GPFSPool, GPFSPoolCap
Filesystem mounts data    DiskFree, GPFSFilesystem, GPFSFilesystemAPI
Disks and NSD data        GPFSDiskCap, GPFSNSDDisk
Nodes data                CPU, GPFSNode, GPFSNodeAPI, GPFSRPCS, GPFSVFS, Load, Memory, Netstat

Note: The identifiers from Network, Protocols, TCT, and CTDB sensor data are not included in the detectability validation and cleanup procedure.
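
When reviewing a long list of expired keys, it can help to group them by the resource types in Table 1. The following shell sketch maps a key to its resource type. It assumes that the second '|'-separated field of a key is the sensor name, which is inferred from the sample output and not stated by the product. The function name resource_type_for_key is a hypothetical helper.

```shell
# Sketch: map an expired key to the resource type listed in Table 1.
# Assumption: the second '|'-separated field of the key is the sensor name.
resource_type_for_key() {
    rt_sensor=$(printf '%s\n' "$1" | cut -d'|' -f2)
    case "$rt_sensor" in
        GPFSFileset|GPFSFilesetQuota)              echo "Filesets data" ;;
        GPFSInodeCap)                              echo "Filesystem inodes data" ;;
        GPFSPool|GPFSPoolCap)                      echo "Pools data" ;;
        DiskFree|GPFSFilesystem|GPFSFilesystemAPI) echo "Filesystem mounts data" ;;
        GPFSDiskCap|GPFSNSDDisk)                   echo "Disks and NSD data" ;;
        CPU|GPFSNode|GPFSNodeAPI|GPFSRPCS|GPFSVFS|Load|Memory|Netstat)
                                                   echo "Nodes data" ;;
        # Sensors outside Table 1 are not part of the detectability validation.
        *)                                         echo "not covered" ;;
    esac
}
```

For example, resource_type_for_key 'test_nodename|DiskFree|/mnt/gpfs0' prints Filesystem mounts data.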