mmdiag command

Displays diagnostic information about the internal GPFS™ state on the current node.

Synopsis

mmdiag [--all] [--version] [--waiters] [--deadlock] [--threads]
       [--memory] [--network] [--config] [--trace] [--assert]
       [--iohist] [--tokenmgr] [--commands] [--lroc]
       [--dmapi [session|event|token|disposition|all]]
       [--rpc [node[=name]|size|message|all|nn{S|s|M|m|H|h|D|d}]]
       [--stats]

Availability

Available on all IBM Spectrum Scale™ editions.

Description

Use the mmdiag command to query various aspects of the GPFS internal state for troubleshooting and tuning purposes. The mmdiag command displays information about the state of GPFS on the node where it is executed. The command obtains the required information by querying the GPFS daemon process (mmfsd), and thus will only function when the GPFS daemon is running.

Results

The mmdiag command displays the requested information and returns 0 if successful.

Parameters

--all
Displays all available information. This is the same as specifying all of the mmdiag parameters.
--assert
Displays the current dynamic assert status and levels.
--commands
Displays all the commands currently running on the local node.
--config
Displays configuration parameters and their settings. The list of configuration parameters shown here consists of the configuration parameters known to mmfsd. Note that some configuration parameters (for example, trace settings) are used only by the layers of code above mmfsd; those parameters appear in mmlsconfig output, but not here.

On the other hand, while mmlsconfig only displays a subset of configuration parameters (generally those that have nondefault settings), the list here shows a larger parameter set. All of the documented mmfsd configuration parameters are shown, plus some of the undocumented parameters (generally those that are likely to be helpful in tuning and troubleshooting).

Note that parameter values shown here are those currently in effect (as opposed to the values shown in mmlsconfig output, which may show the settings that will become effective on the next GPFS restart).
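
For example, to verify the value of a single parameter currently in effect and compare it with the mmlsconfig view, the output can be filtered with standard tools (a sketch; the exact output layout may vary between releases):

  mmdiag --config | grep -i pagepool
  mmlsconfig pagepool

If the two values differ, the mmlsconfig setting typically takes effect on the next GPFS restart.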

--deadlock
Displays the longest waiters exceeding the deadlock detection thresholds.
If a deadlock situation occurs, administrators can use this information from all nodes in a cluster to help decide how to break up the deadlock.
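For example, to collect this report from every node in the cluster (a minimal sketch; the node names are placeholders and any remote shell mechanism can be used):
  for node in node01 node02 node03; do
      echo "=== $node ==="
      ssh "$node" /usr/lpp/mmfs/bin/mmdiag --deadlock
  done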
--dmapi
Displays various DMAPI information. If no other options are specified, summary information is displayed for sessions, pending events, cached tokens, stripe groups, and events waiting for reply. The --dmapi parameter accepts the following options:
session
Displays a list of sessions.
event
Displays a list of pending events.
token
Displays a list of cached tokens, stripe groups, and events waiting for reply.
disposition
Displays the DMAPI disposition for events.
all
Displays all of the session, event, token, and disposition information with additional details.
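For example, to list only the DMAPI sessions, or to display the full detail view:
  mmdiag --dmapi session
  mmdiag --dmapi all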
--iohist
Displays recent IO history. The information about IO requests recently submitted by GPFS code is shown here. It can provide some insight into various aspects of GPFS IO, such as the type of data or metadata being read or written, the distribution of IO sizes, and IO completion times for individual IOs. This information can be very useful in performance tuning and troubleshooting.
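For example, to capture periodic samples of the IO history while a performance problem is being reproduced (a simple sketch; the interval and log path are arbitrary):
  while true; do
      date
      mmdiag --iohist
      sleep 10
  done >> /tmp/mmdiag-iohist.log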
--lroc
Displays status and statistics for local read-only cache (LROC) devices.
--memory
Displays information about mmfsd memory usage. There are several distinct memory regions that mmfsd allocates and uses, and it can be important to know the memory usage situation for each one.
Heap memory allocated by mmfsd
This area is managed by the OS and does not have a preset limit enforced by GPFS.
Memory pools 1 and 2
Both of these refer to a single memory area, also known as the shared segment. It is used to cache various kinds of internal GPFS metadata, as well as for many other internal uses. This memory area is allocated using a special, platform-specific mechanism and is shared between user space and kernel code. The preset limit on the maximum shared segment size, current usage, and some prior usage information are shown here.
Memory pool 3
This area is also known as the token manager pool. This memory area is used to store the token state on token manager servers. The preset limit on the maximum memory pool size, current usage, and some prior-usage information are shown here.

This information can be useful when troubleshooting ENOMEM errors returned by GPFS to a user application, as well as memory allocation failures reported in a GPFS log file.
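
For example, to focus on current usage versus the preset limits, the report can be filtered on the line labels shown in the sample output under Examples (assuming those labels are present in your release):

  mmdiag --memory | grep -E 'in use|hard limit'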

--network
Displays information about mmfsd network connections and pending Remote Procedure Calls (RPCs). Basic information and statistics about all existing mmfsd network connections to other nodes is displayed, including information about broken connections. If there are currently any RPCs pending (that is, sent but not yet replied to), the information about each one is shown, including the list of RPC destinations and the status of the request for each destination. This information can be very helpful in following a multinode chain of dependencies during a deadlock or performance-problem troubleshooting.
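For example, during deadlock troubleshooting the network report is often captured together with the waiter list on the same node, so that pending RPCs can be matched to waiting threads (a sketch; the file names are arbitrary):
  mmdiag --network > /tmp/mmdiag-network.$(hostname).out
  mmdiag --waiters > /tmp/mmdiag-waiters.$(hostname).out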
--rpc
Displays RPC performance statistics. The --rpc parameter accepts the following options:
node[=name]
Displays all per node statistics (channel wait, send time TCP, send time verbs, receive time TCP, latency TCP, latency verbs, and latency mixed). If name is specified, all per node statistics for just the specified node are displayed.
size
Displays per size range statistics.
message
Displays per message type RPC execution time.
all
Displays everything.
nn{S|s|M|m|H|h|D|d}
Displays per node RPC latency statistics for the latest number of intervals, specified by nn, for the interval specified by one of the following characters:
S|s
Displays second intervals only.
M|m
Displays the second intervals since the last minute boundary, followed by the minute intervals.
H|h
Displays the second and minute intervals since the last minute and hour boundaries, followed by the hour intervals.
D|d
Displays the second, minute, and hour intervals since the last minute, hour, and day boundaries, followed by the day intervals.
Averages are displayed as a number of milliseconds with three decimal places (1 microsecond granularity).
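For example, to display per node RPC latency for the latest 10 one-minute intervals (preceded by the second intervals since the last minute boundary), and then the per message type execution times:
  mmdiag --rpc 10m
  mmdiag --rpc message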
--stats
Displays some general GPFS statistics.
GPFS uses a diverse array of objects to maintain the file system state and cache various types of metadata. The statistics about some of the more important object types are shown here.
OpenFile
This object is needed to access an inode. The target maximum number of cached OpenFile objects is governed by the maxFilesToCache configuration parameter. Note that more OpenFile objects may be cached, depending on workload.
CompactOpenFile
These objects contain an abbreviated form of an OpenFile, and are collectively known as stat cache. The target maximum number of cached CompactOpenFile objects is governed by the maxStatCache configuration parameter.
OpenInstance
This object is created for each open file instance (file or directory opened by a distinct process).
BufferDesc
This object is used to manage buffers in the GPFS pagepool.
indBlockDesc
This object is used to cache indirect block data.

All of these objects use the shared segment memory. For each object type, there is a preset target, derived from a combination of configuration parameters and the memory available in the shared segment. The information about current object usage can be helpful in performance tuning.
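
For example, the current OpenFile object usage can be compared against its configured target (a sketch; it assumes that the object names appear literally in the output and that mmlsconfig accepts an attribute name as an argument):

  mmdiag --stats | grep -i openfile
  mmlsconfig maxFilesToCache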

--threads
Displays mmfsd thread statistics and the list of active threads. For each thread, its type and kernel thread ID are shown. All non-idle mmfsd threads are shown. For those threads that are currently waiting for an event, the wait reason and wait time in seconds are shown. This information provides more detail than the data displayed by mmdiag --waiters.
--tokenmgr
Displays information about token management. For each mounted GPFS file system, one or more token manager nodes will be appointed. The first token manager is always colocated with the file system manager, while additional token managers may be appointed from the pool of nodes with the manager designation. The information shown here includes the list of currently appointed token manager nodes and, if the current node is serving as a token manager, some statistics about prior token transactions.
--trace
Displays the current trace status and trace levels. During GPFS troubleshooting, it is often necessary to use the trace subsystem to obtain the debug data needed to understand the problem (see the topic about the GPFS trace facility in the IBM Spectrum Scale: Problem Determination Guide). It is very important to have trace levels set correctly, per the instructions provided by the IBM® Support Center. The information shown here allows you to check the state of tracing and to see the trace levels currently in effect.
--version
Displays information about the GPFS build currently running on this node. This helps in troubleshooting installation problems. The information displayed here may be more comprehensive than version information available via the OS package management infrastructure, in particular when an e-fix is installed.
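For example, the build reported by the running daemon can be compared with the installed packages (the package query shown applies to RPM-based distributions only):
  mmdiag --version
  rpm -qa | grep -i gpfs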
--waiters
Displays mmfsd threads waiting for events. This information can be very helpful in troubleshooting deadlocks and performance problems. For each thread, the thread name, wait time in seconds, and wait reason are typically shown. Only non-idle threads currently waiting for some event to occur are displayed. Note that only mmfsd threads are shown; any application IO threads that might be waiting in GPFS kernel code would not be present here.

Exit status

0
Successful completion.
nonzero
A failure has occurred.

Security

You must have root authority to run the mmdiag command.

Examples

  1. To display a list of waiters, issue this command:
    mmdiag --waiters
    The system displays output similar to the following:
    === mmdiag: waiters ===
    0x11DA520 waiting 0.001147000 seconds, InodePrefetchWorker:
     for I/O completion
    0x2AAAAAB02830 waiting 0.002152000 seconds, InodePrefetchWorker:
     for I/O completion
    0x2AAAAB103990 waiting 0.000593000 seconds, InodePrefetchWorker:
     for I/O completion
    0x11F51E0 waiting 0.000612000 seconds, InodePrefetchWorker:
     for I/O completion
    0x11EDE60 waiting 0.005736500 seconds, InodePrefetchWorker:
     on ThMutex 0x100073ABC8 (0xFFFFC2000073ABC8) 
     (CacheReplacementListMutex)

    In this example, all waiters have a very short wait duration and represent a typical snapshot of normal GPFS operation.
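
    When many waiters are present, it can help to sort them by wait duration. The following is a minimal sketch based on the output format shown above, where the wait time is the third whitespace-separated field:
    mmdiag --waiters | grep ' waiting ' | sort -k3,3 -nr | head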

  2. To display information about GPFS memory utilization, issue this command:
    mmdiag --memory
    The system displays output similar to the following:
    mmfsd heap size: 1503232 bytes
    
    current mmfsd heap bytes in use: 1919624 total 1867672 payload
    
    Statistics for MemoryPool id 1 ("Shared Segment (EPHEMERAL)")
             128 bytes in use
       557721725 hard limit on memory usage
         1048576 bytes committed to regions
               1 allocations
               1 frees
               0 allocation failures
    
    
    Statistics for MemoryPool id 2 ("Shared Segment")
         8355904 bytes in use
       557721725 hard limit on memory usage
         8785920 bytes committed to regions
         1297534 allocations
         1296595 frees
               0 allocation failures
    
    
    Statistics for MemoryPool id 3 ("Token Manager")
          496184 bytes in use
       510027355 hard limit on memory usage
          524288 bytes committed to regions
            1309 allocations
             130 frees
               0 allocation failures

    In this example, a typical memory usage picture is shown. None of the memory pools are close to being full, and there are no prior allocation failures.

Location

/usr/lpp/mmfs/bin