mmhealth command
Monitors health status of nodes.
Synopsis
mmhealth node show [ GPFS | NETWORK [ UserDefinedSubComponent ]
| FILESYSTEM [UserDefinedSubComponent ] | DISK | CES | AUTH | AUTH_OBJ
| BLOCK | CESNETWORK | NFS | OBJECT | SMB | CLOUDGATEWAY | GUI
| PERFMON ] [-N {Node[,Node...] | NodeFile | NodeClass}]
[--verbose] [--unhealthy]
or
mmhealth node eventlog [[--hour | --day | --week | --month] | [--verbose]]
Availability
Available with IBM Spectrum Scale™ Express Edition or higher.
Description
Use the mmhealth command to monitor the health of the node and services hosted on the node in IBM Spectrum Scale.
By using this command, the IBM Spectrum Scale administrator can monitor the health of each node and the services hosted on that node. The command also shows the events that are responsible for the unhealthy status of those services, which can help in analyzing why a node is unhealthy. The mmhealth command therefore acts as a problem determination tool: it identifies which services on a node are unhealthy and which events are responsible for that status.
For more information about the system monitoring feature, see IBM Spectrum Scale: Administration Guide.
Parameters
- node
- Displays the health status at the node level.
- show
- Displays the health status of the specified component with:
- GPFS™ | NETWORK | FILESYSTEM | DISK | CES | AUTH | AUTH_OBJ | BLOCK | CESNETWORK | NFS | OBJECT | SMB | CLOUDGATEWAY | GUI | PERFMON
- Displays the detailed health status of the specified component.
- UserDefinedSubComponent
- Displays services that are named by the customer, categorized by one of the other hosted services. For example, a file system named gpfs0 is a sub-component of the FILESYSTEM component.
- -N
- Allows the system to make remote calls to the other nodes in the cluster for:
- Node[,Node...]
- Specifies the node or list of nodes that must be monitored for the health status.
- NodeFile
- Specifies a file, containing a list of node descriptors, one per line, to be monitored for health status.
- NodeClass
- Specifies a node class that must be monitored for the health status.
- --verbose
- Shows the detailed health status of a node, including its sub-components.
- --unhealthy
- Displays the unhealthy components only.
- eventlog
- Shows the event history for a specified period of time. If no time period is specified, it displays all the events by default:
- [--hour | --day | --week | --month]
- Displays the event history for the specified time period.
- [--verbose]
- Displays additional information about each event, such as the component name and the event ID.
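The parameters above combine into a single command line. As a minimal sketch of how a script might assemble one, the hypothetical Python helper below (build_show_command is an illustrative name, not part of IBM Spectrum Scale) builds the argument list for mmhealth node show from the options listed in this topic. It assumes that node names are supplied as a list and joined with commas, it writes component names in the upper-case form used in the synopsis, and it only builds the command without running it.

```python
from typing import List, Optional, Sequence

# Component names as spelled in the synopsis of this topic.
COMPONENTS = frozenset({
    "GPFS", "NETWORK", "FILESYSTEM", "DISK", "CES", "AUTH", "AUTH_OBJ",
    "BLOCK", "CESNETWORK", "NFS", "OBJECT", "SMB", "CLOUDGATEWAY", "GUI",
    "PERFMON",
})
# In the synopsis, only these components take a UserDefinedSubComponent.
SUBCOMPONENT_PARENTS = frozenset({"NETWORK", "FILESYSTEM"})


def build_show_command(
    component: Optional[str] = None,
    subcomponent: Optional[str] = None,
    nodes: Optional[Sequence[str]] = None,
    verbose: bool = False,
    unhealthy: bool = False,
) -> List[str]:
    """Hypothetical helper: build the argv list for `mmhealth node show`."""
    argv = ["mmhealth", "node", "show"]
    if component is None:
        if subcomponent is not None:
            raise ValueError("a subcomponent requires a component")
    else:
        name = component.upper()
        if name not in COMPONENTS:
            raise ValueError(f"unknown component: {component}")
        argv.append(name)
        if subcomponent is not None:
            if name not in SUBCOMPONENT_PARENTS:
                raise ValueError(f"{name} does not take a subcomponent")
            argv.append(subcomponent)
    if nodes:
        # -N takes a comma-separated node list (a NodeFile or NodeClass
        # could be passed as a single-element list instead).
        argv += ["-N", ",".join(nodes)]
    if verbose:
        argv.append("--verbose")
    if unhealthy:
        argv.append("--unhealthy")
    return argv
```

For example, build_show_command("FILESYSTEM", "gpfs0", verbose=True) yields the arguments for mmhealth node show FILESYSTEM gpfs0 --verbose. Returning a list rather than a single string keeps each argument separate, so the result can be handed to a process launcher without shell quoting.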
Exit status
- 0
- Successful completion.
- nonzero
- A failure has occurred.
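A calling script can branch on the exit status described above. The following Python sketch is one possible way to do that; command_succeeded is a hypothetical helper name, and the function simply runs whatever argument list it is given and treats 0 as success and any other value as failure.

```python
import subprocess
from typing import Sequence


def command_succeeded(argv: Sequence[str]) -> bool:
    """Run argv and report whether it exited with status 0.

    Output is discarded; only the exit status is inspected.
    """
    completed = subprocess.run(
        list(argv),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # do not raise on a nonzero status, just report it
    )
    return completed.returncode == 0
```

On a cluster node, a call such as command_succeeded(["mmhealth", "node", "show"]) would report whether the command completed. As documented here, the exit status describes completion of the command, so it is safer not to read a zero status as meaning that the node itself is healthy; the displayed status should be checked for that.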
Security
You must have root authority to run the mmhealth command.
The node on which the command is issued must be able to execute remote shell commands on any other node in the cluster without the use of a password and without producing any extraneous messages. See the information about the requirements for administering a GPFS system in the IBM Spectrum Scale: Administration Guide.
Examples
- To show the health status of the current node, issue this command:
mmhealth node show
The system displays output similar to this:
Node name:      test_node
Node status:    HEALTHY
Status Change:  39 min. ago

Component     Status    Reasons
-----------------------------------------
GPFS          HEALTHY   -
NETWORK       HEALTHY   -
FILESYSTEM    HEALTHY   -
DISK          HEALTHY   -
CES           HEALTHY   -
PERFMON       HEALTHY   -
- To view the health status of a specific node, issue this command:
mmhealth node show -N test_node2
The system displays output similar to this:
Node name:      test_node2
Node status:    CHECKING
Status Change:  Now

Component     Status     Status Change   Reasons
-------------------------------------------------------------------
GPFS          CHECKING   Now             -
NETWORK       HEALTHY    Now             -
FILESYSTEM    CHECKING   Now             -
DISK          CHECKING   Now             -
CES           CHECKING   Now             -
PERFMON       HEALTHY    Now             -
- To view the health status of all the nodes, issue this command:
mmhealth node show -N all
The system displays output similar to this:
Node name:    test_node
Node status:  DEGRADED

Component     Status    Status Change   Reasons
-------------------------------------------------------------
GPFS          HEALTHY   Now             -
CES           FAILED    Now             smbd_down
FileSystem    HEALTHY   Now             -

Node name:    test_node2
Node status:  HEALTHY

Component     Status    Status Change   Reasons
------------------------------------------------------------
GPFS          HEALTHY   Now             -
CES           HEALTHY   Now             -
FileSystem    HEALTHY   Now             -
- To view the detailed health status of a component and its sub-components, issue this command:
mmhealth node show ces
The system displays output similar to this:
Node name:    test_node

Component     Status    Reasons
-----------------------------------------------
CES           FAILED    smbd_down
  AUTH        HEALTHY   -
  AUTH_OBJ    HEALTHY   -
  NFS         HEALTHY   -
  OBJ         HEALTHY   -
  SMB         FAILED    smbd_down

Event         Parameter   Severity   Description
-----------------------------------------------------------------
smbd_down     SMB         ERROR      SMBD process not running
- To view the health status of only unhealthy components, issue this command:
mmhealth node show --unhealthy
The system displays output similar to this:
Node name:    test_node
Node status:  DEGRADED

Component     Status    Reasons
-----------------------------------------------
CES           FAILED    smbd_down
- To view the health status of each component of a node together with its sub-components, issue this command:
mmhealth node show --verbose
The system displays output similar to this:
Node name:    test_node
Node status:  DEGRADED

Component     Status    Reasons
-----------------------------------------------
GPFS          HEALTHY   -
CES           FAILED    smbd_down
  AUTH        HEALTHY   -
  AUTH_OBJ    HEALTHY   -
  NFS         HEALTHY   -
  OBJ         HEALTHY   -
  SMB         FAILED    smbd_down
FILESYSTEM    HEALTHY   -
  gpfs0       HEALTHY   -
  FSII        HEALTHY   -
- To view the eventlog history of the node for the last hour, issue this command:
mmhealth node eventlog --hour
The system displays output similar to this:
Timestamp                        Event Name    Severity   Details
2016-04-07 10:12:24.394569 CEST  quorum_warn   WARNING    GPFS quorum monitoring returned unknown result
2016-04-07 10:12:39.366279 CEST  quorum_warn   WARNING    GPFS quorum monitoring returned unknown result
2016-04-07 10:12:54.356577 CEST  quorum_warn   WARNING    GPFS quorum monitoring returned unknown result
- To view the eventlog history of the node for the last hour with additional details, such as the component name and event ID, issue this command:
mmhealth node eventlog --hour --verbose
The system displays output similar to this:
Timestamp                        Component   Event Name    Event ID   Severity   Details
2016-04-07 10:12:54.356577 CEST  gpfs        quorum_warn   999291     WARNING    GPFS quorum monitoring returned unknown result
2016-04-07 10:13:09.359602 CEST  gpfs        quorum_warn   999291     WARNING    GPFS quorum monitoring returned unknown result
2016-04-07 10:13:24.425680 CEST  gpfs        quorum_warn   999291     WARNING    GPFS quorum monitoring returned unknown result
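Eventlog output like the samples above can be turned into records for further filtering. The Python sketch below is one possible approach, not a documented interface: parse_eventlog is a hypothetical name, and the column layout is an assumption inferred from the sample output in this topic (a three-token timestamp of date, time, and time zone, followed by single-word fields, with free text in the last column). Real output might differ between releases.

```python
import re
from typing import Dict, List

# Event lines in the samples start with a date such as 2016-04-07.
_TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} ")

# Two event lines copied from the --hour example in this topic.
SAMPLE = """\
Timestamp                        Event Name    Severity   Details
2016-04-07 10:12:24.394569 CEST  quorum_warn   WARNING    GPFS quorum monitoring returned unknown result
2016-04-07 10:12:39.366279 CEST  quorum_warn   WARNING    GPFS quorum monitoring returned unknown result
"""


def parse_eventlog(text: str, verbose: bool = False) -> List[Dict[str, str]]:
    """Hypothetical parser for `mmhealth node eventlog` output.

    Assumes the column order seen in the sample output; set verbose=True
    for output produced with --verbose.
    """
    if verbose:
        columns = ["component", "event", "event_id", "severity", "details"]
    else:
        columns = ["event", "severity", "details"]
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not _TIMESTAMP.match(line):
            continue  # skip the header row and blank lines
        # Three timestamp tokens, then the columns; the last column
        # (details) keeps its internal spaces.
        fields = line.split(None, 2 + len(columns))
        if len(fields) != 3 + len(columns):
            continue  # line does not match the assumed layout
        event = {"timestamp": " ".join(fields[:3])}
        event.update(zip(columns, fields[3:]))
        events.append(event)
    return events
```

Calling parse_eventlog(SAMPLE) returns one dictionary per event line, so that, for instance, events with the severity WARNING can be selected with an ordinary list comprehension. Lines that do not fit the assumed layout are skipped rather than guessed at.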