gpfs.snap command

Creates an informational system snapshot at a single point in time. This system snapshot consists of information such as cluster configuration, disk configuration, network configuration, network status, GPFS™ logs, dumps, and traces.

Synopsis

gpfs.snap [-d OutputDirectory] [-m | -z]
          [-a | -N {Node[,Node...] | NodeFile | NodeClass}]
          [--check-space | --no-check-space | --check-space-only]
          [--cloud-gateway {BASIC | FULL}] [--full-collection] [--deadlock [--quick] |
           --limit-large-files {YYYY:MM:DD:HH:MM | NumberOfDaysBack | latest}]
          [--exclude-aix-disk-attr] [--exclude-aix-lvm] [--exclude-merge-logs]
          [--exclude-net] [--gather-logs] [--mmdf] [--performance] [--prefix]
          [--protocol ProtocolType[,ProtocolType,...]] [--timeout Seconds]
          [--purge-files KeepNumberOfDaysBack] [--hadoop]

Availability

Available on all IBM Spectrum Scale™ editions.

Description

Use the gpfs.snap command as the main tool to gather data when a GPFS problem is encountered, such as a hung file system, a hung GPFS command, or a daemon assert.

The gpfs.snap command gathers information (for example, GPFS internal dumps, traces, and kernel thread dumps) to solve a GPFS problem.

Note: By default, large debug files are collected as a delta: only files that are new since the previous run of gpfs.snap are collected. To override this default behavior, use either the --limit-large-files or the --full-collection option.
Note: This is a service tool and options might change dynamically. The tool impacts performance and occupies disk space when it runs.

Parameters

-d OutputDirectory
Specifies the output directory. The default is /tmp/gpfs.snapOut.
-m
Specifying this option is equivalent to specifying --exclude-merge-logs with -N.
-z
Collects gpfs.snap data only from the node on which the command is invoked. No master data is collected.
-a
Directs gpfs.snap to collect data from all nodes in the cluster. This is the default.
-N {Node[,Node...] | NodeFile | NodeClass}
Specifies the nodes from which to collect gpfs.snap data. This option supports all defined node classes. For general information on how to specify node names, see Specifying nodes as input to GPFS commands.
--check-space
Specifies that space checking is performed before collecting data.
--no-check-space
Specifies that no space checking is performed. This is the default.
--check-space-only
Specifies that only space checking is performed. No data is collected.
--cloud-gateway {BASIC | FULL}
With the BASIC option, when the Transparent cloud tiering service is enabled, gpfs.snap collects information such as logs, traces, and Java™ cores, along with minimal system and IBM Spectrum Scale cluster information specific to Transparent cloud tiering. No customer-sensitive information is collected.
Note: The default behavior of the gpfs.snap command includes basic information about Transparent cloud tiering, in addition to the GPFS information.
With the FULL option, extra details such as a Java heap dump are collected, along with the information captured with the BASIC option.
--full-collection
Specifies that all large debug files are collected instead of the default behavior that only collects new files since the previous run of gpfs.snap.
--deadlock
Collects only the minimum amount of data necessary to debug a deadlock problem. Part of the data collected is the output of the mmfsadm dump all command. This option ignores all other options except for -a, -N, -d, and --prefix.
--quick
Collects less data when specified along with the --deadlock option. The output includes mmfsadm dump most, mmfsadm dump kthreads, and 10 seconds of trace in addition to the usual gpfs.snap output.
--limit-large-files {YYYY:MM:DD:HH:MM | NumberOfDaysBack | latest}
Specifies a time limit to reduce the number of large files collected.
--exclude-aix-disk-attr
Specifies that data about AIX® disk attributes will not be collected. Collecting data about AIX disk attributes on an AIX node that has a large number of disks could be very time-consuming, so using this option could help improve performance.
--exclude-aix-lvm
Specifies that data about the AIX Logical Volume Manager (LVM) will not be collected.
--exclude-merge-logs
Specifies that merge logs and waiters will not be collected.
--exclude-net
Specifies that network-related information will not be collected.
--gather-logs
Gathers, merges, and chronologically sorts all of the mmfs.log files. The results are stored in the directory specified with the -d option.
--mmdf
Specifies that mmdf output will be collected.
--performance
Specifies that performance data is to be gathered.
Note: The performance script can take up to 30 minutes to run; therefore, it is not included when all other types of protocol information are gathered by default. Specifying this option is the only way to turn on the gathering of performance data.
--prefix
Specifies that the prefix name gpfs.snap will be added to the tar file.
--protocol ProtocolType[,ProtocolType,...]
Specifies the type (or types) of protocol information to be gathered. By default, whenever any protocol is enabled on a file system, information is gathered for all types of protocol information (except for performance data; see the --performance option). However, when the --protocol option is specified, the automatic gathering of all protocol information is turned off, and only the specified type of protocol information will be gathered. The following values for ProtocolType are accepted:
  • smb
  • nfs
  • object
  • authentication
  • ces
  • core
  • none
--timeout Seconds
Specifies the timeout value, in seconds, for all commands.
--purge-files KeepNumberOfDaysBack
Specifies that large debug files will be deleted from the cluster nodes based on the KeepNumberOfDaysBack value. If 0 is specified, all of the large debug files will be deleted. If a value greater than 0 is specified, large debug files that are older than the number of days specified will be deleted. For example, if the value 2 is specified, the previous two days of large debug files are retained.
This option is not compatible with many of the gpfs.snap options because it only removes files and does not collect any gpfs.snap data.
--hadoop
Specifies that Hadoop data is to be gathered.
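
The accepted ProtocolType values listed for --protocol can be checked before a snapshot is started. The following is a minimal sketch, not part of gpfs.snap: validate_protocols and VALID_PROTOCOLS are hypothetical names, and the script only prints the command line instead of running it.

```shell
# Accepted ProtocolType values, as listed for the --protocol parameter.
VALID_PROTOCOLS="smb nfs object authentication ces core none"

# validate_protocols LIST   (hypothetical helper, not a GPFS command)
# Returns 0 if every comma-separated entry in LIST is an accepted value.
validate_protocols() {
    # Reject an empty list, empty entries ("smb,,nfs", "smb,"), and any
    # character other than lowercase letters and commas.
    case "$1" in
        ""|,*|*,|*,,*|*[!a-z,]*) return 1 ;;
    esac
    old_ifs=$IFS
    IFS=,
    set -- $1            # split the list on commas
    IFS=$old_ifs
    for proto in "$@"; do
        case " $VALID_PROTOCOLS " in
            *" $proto "*) ;;
            *) echo "unknown protocol type: $proto" >&2; return 1 ;;
        esac
    done
    return 0
}

protocols="smb,nfs"
if validate_protocols "$protocols"; then
    # Print the command rather than run it.
    echo "gpfs.snap --protocol $protocols"
fi
```

Replacing the echo with the command itself would start the collection; the check only guards against typing mistakes in the list.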

Use the -z option to generate a non-master snapshot. This is useful if there are many nodes on which to take a snapshot, and only one master snapshot is needed. For a GPFS problem within a large cluster (hundreds or thousands of nodes), one strategy might call for a single master snapshot (one invocation of gpfs.snap with no options), and multiple non-master snapshots (multiple invocations of gpfs.snap with the -z option).
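
The strategy in the preceding paragraph can be written down as a per-node plan. This is a sketch only: plan_snapshots is a hypothetical helper that prints which command to issue on which node and runs nothing, and the node names are taken from the examples in this topic.

```shell
# plan_snapshots MASTER NODE...   (hypothetical helper)
# Prints one line per node: a master snapshot on MASTER and a
# non-master snapshot (-z) on every other node. Nothing is executed.
plan_snapshots() {
    master=$1
    shift
    # Per the text above, the master snapshot is gpfs.snap with no options.
    echo "$master: gpfs.snap"
    for node in "$@"; do
        echo "$node: gpfs.snap -z"
    done
}

plan_snapshots c34f2n03 c13c1apv7 c6f2bc4n8
```

How each command is delivered to its node (for example, through the configured remote shell) is left open here.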

Use the -N option to obtain gpfs.snap data from multiple nodes in the cluster. When the -N option is used, the gpfs.snap command takes non-master snapshots of all the nodes specified with this option and a master snapshot of the node on which it was invoked.

Exit status

0
Successful completion.
nonzero
A failure has occurred.
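
A calling script can act on the exit status as follows. This is a minimal sketch: run_checked is a hypothetical wrapper, and the messages it prints are not gpfs.snap output.

```shell
# run_checked COMMAND [ARG...]   (hypothetical wrapper)
# Runs the command, reports success or failure, and returns its exit status.
run_checked() {
    status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "completed successfully"
    else
        echo "failed with exit status $status" >&2
    fi
    return "$status"
}

# Example (not run here):
#   run_checked gpfs.snap -d /tmp/gpfs.snapOut
```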

Security

You must have root authority to run the gpfs.snap command.

The node on which the command is issued must be able to execute remote shell commands on any other node in the cluster without the use of a password and without producing any extraneous messages. For more information, see Requirements for administering a GPFS file system.
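
Both requirements can be checked before gpfs.snap is started. The sketch below assumes that ssh is the remote shell in use, which might not match the cluster configuration; is_root, quiet_remote_shell, and REMOTE_SHELL are hypothetical names.

```shell
# Remote shell command to test. ssh is an assumption; BatchMode=yes makes
# ssh fail instead of prompting for a password.
REMOTE_SHELL=${REMOTE_SHELL:-"ssh -o BatchMode=yes -o ConnectTimeout=5"}

# is_root: succeeds only when the effective user ID is 0.
is_root() {
    [ "$(id -u)" -eq 0 ]
}

# quiet_remote_shell NODE
# Succeeds only if a trivial remote command runs on NODE without a password
# prompt and without printing anything (no banners or other messages).
quiet_remote_shell() {
    out=$($REMOTE_SHELL "$1" true 2>&1) || return 1
    [ -z "$out" ]
}

# Example (not run here):
#   is_root || echo "gpfs.snap requires root authority" >&2
#   quiet_remote_shell c13c1apv7 || echo "remote shell check failed" >&2
```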

Examples

  1. To collect the default gpfs.snap data from all nodes, issue the command:
    (09:25:47) c34f2n03:~ # gpfs.snap
    gpfs.snap started at Mon Feb  8 09:25:54 EST 2016.
    Gathering common data.
    Gathering Linux specific data...
    Gathering trace reports and internal dumps...
    gpfs.snap:  Spawning remote gpfs.snap calls. Master is c34f2n03.
    This may take a while.
    
    Copying file 
    /tmp/gpfs.snapOut/18720/gpfs.snap.c13c1apv7_0208092648.out.tar.gz from c13c1apv7.gpfs.net ...
    gpfs.snap.c13c1apv7_0208092648.out.tar.gz    100%  592KB 592.2KB/s   00:00
    Successfully copied file 
    /tmp/gpfs.snapOut/18720/gpfs.snap.c13c1apv7_0208092648.out.tar.gz from c13c1apv7.gpfs.net.
    
    Copying file 
    /tmp/gpfs.snapOut/18720/gpfs.snap.c6f2bc4n8_0208092705.out.tar.gz from c6f2bc4n8.gpfs.net ...
    gpfs.snap.c6f2bc4n8_0208092705.out.tar.gz  100%  928KB 927.9KB/s   00:00
    Successfully copied file 
    /tmp/gpfs.snapOut/18720/gpfs.snap.c6f2bc4n8_0208092705.out.tar.gz from c6f2bc4n8.gpfs.net.
    Gathering cluster wide protocol data
    Packaging master node data.
    Writing * to file 
    /tmp/gpfs.snapOut/18720/collect/gpfs.snap.c34f2n03_master_0208092554.out.tar.gz
    Packaging all data.
    Writing . to file /tmp/gpfs.snapOut/18720/all.0208092554.tar
    gpfs.snap completed at Mon Feb  8 09:26:45 EST 2016
    ###############################################################################
    Send file /tmp/gpfs.snapOut/18720/all.0208092554.tar to IBM Service
    Examine previous messages to determine additional required data.
    ###############################################################################

    After the command completes, send the tar file named in the final message (in this example, /tmp/gpfs.snapOut/18720/all.0208092554.tar) to IBM® Service, as the message instructs.

  2. To collect gpfs.snap data from specific nodes, issue the command:
    (09:32:38) c34f2n03:~ # gpfs.snap -N c34f2n03,c13c1apv7
    gpfs.snap started at Mon Feb  8 09:32:48 EST 2016.
    Gathering common data.
    Gathering Linux specific data...
    Gathering trace reports and internal dumps...
    gpfs.snap:  Spawning remote gpfs.snap calls. Master is c34f2n03.
    This may take a while.
    
    Copying file 
    /tmp/gpfs.snapOut/23453/gpfs.snap.c13c1apv7_0208093340.out.tar.gz from c13c1apv7.gpfs.net ...
    gpfs.snap.c13c1apv7_0208093340.out.tar.gz      100%  583KB 583.1KB/s   00:00
    Successfully copied file 
    /tmp/gpfs.snapOut/23453/gpfs.snap.c13c1apv7_0208093340.out.tar.gz from c13c1apv7.gpfs.net.
    Gathering cluster wide protocol data
    Packaging master node data.
    Writing * to file /tmp/gpfs.snapOut/23453/collect/gpfs.snap.c34f2n03_master_0208093248.out.tar.gz
    Packaging all data.
    Writing . to file /tmp/gpfs.snapOut/23453/all.0208093248.tar
    gpfs.snap completed at Mon Feb  8 09:33:34 EST 2016
    ###############################################################################
    Send file /tmp/gpfs.snapOut/23453/all.0208093248.tar to IBM Service
    Examine previous messages to determine additional required data.
    ###############################################################################
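
In both examples, the final message names the tar file to send. A script that captures the gpfs.snap output could extract that path as sketched below. snap_archive_path is a hypothetical helper, and the sketch assumes that the message has the same wording as in the examples above, which might differ between releases.

```shell
# snap_archive_path   (hypothetical helper)
# Reads gpfs.snap output on standard input and prints the path named in the
# "Send file ... to IBM Service" line, if that line is present.
snap_archive_path() {
    sed -n 's/^[[:space:]]*Send file \(.*\) to IBM Service[[:space:]]*$/\1/p'
}

# Two lines copied from example 1 serve as sample input.
sample='gpfs.snap completed at Mon Feb  8 09:26:45 EST 2016
Send file /tmp/gpfs.snapOut/18720/all.0208092554.tar to IBM Service'

printf '%s\n' "$sample" | snap_archive_path
# prints /tmp/gpfs.snapOut/18720/all.0208092554.tar
```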

Location

/usr/lpp/mmfs/bin