Delays and deadlocks

The first item to check when a file system appears hung is the condition of the networks, including the network used to access the disks.

Look for increasing numbers of dropped packets on all nodes by issuing:
  • The netstat -D command on an AIX® node.
  • The ifconfig interfacename command, where interfacename is the name of the interface being used by GPFS for communication.
When subnets are used (see the Using remote access with multiple network definitions topic), different interfaces may be in use for intra-cluster and inter-cluster communication. A hang or dropped packet condition indicates a network support issue that should be pursued first. Contact your local network administrator for problem determination for your specific network configuration.
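For example, one quick check is to capture the interface counters twice, a few minutes apart, and compare them for a growing drop count. This is only a sketch: the interface name eth1 is a placeholder, and you should substitute the interface that GPFS actually uses for daemon communication.
  # AIX: review the Idrops/Odrops columns in the device statistics
  netstat -D

  # Linux: review the RX/TX "dropped" counters for the GPFS interface
  # (eth1 is a placeholder interface name)
  ifconfig eth1 | grep -i dropped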

If file system processes appear to stop making progress, there may be a system resource problem or an internal deadlock within GPFS.

Note: A deadlock can occur if user exit scripts that will be called by the mmaddcallback facility are placed in a GPFS file system. Place the scripts in a local file system so that they remain accessible even when the network fails.
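As an illustration only, a callback registration might point at a script kept under /usr/local/bin (a local file system on every node) rather than under a GPFS mount point. The callback identifier, script path, event, and parameters below are assumptions chosen for the sketch; adjust them to your own callback.
  # The script resides on local storage on each node, so it remains reachable
  # even if the GPFS file systems or the network are unavailable.
  mmaddcallback lowSpaceAlert --command /usr/local/bin/lowSpaceAlert.sh \
                --event lowDiskSpace --parms "%eventName %fsName"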

To debug a deadlock, do the following:

  1. Check how full your file system is by issuing the mmdf command. If the mmdf command does not respond, contact the IBM® Support Center. Otherwise, the system displays information similar to:
    disk                disk size  failure holds    holds              free KB             free KB
    name                    in KB    group metadata data        in full blocks        in fragments
    --------------- ------------- -------- -------- ----- -------------------- -------------------
    Disks in storage pool: system (Maximum disk size allowed is 1.1 TB)
    dm2                 140095488        1 yes      yes       136434304 ( 97%)        278232 ( 0%)
    dm4                 140095488        1 yes      yes       136318016 ( 97%)        287442 ( 0%)
    dm5                 140095488     4000 yes      yes       133382400 ( 95%)        386018 ( 0%)
    dm0nsd              140095488     4005 yes      yes       134701696 ( 96%)        456188 ( 0%)
    dm1nsd              140095488     4006 yes      yes       133650560 ( 95%)        492698 ( 0%)
    dm15                140095488     4006 yes      yes       140093376 (100%)            62 ( 0%)
                    -------------                         -------------------- -------------------
    (pool total)        840572928                             814580352 ( 97%)       1900640 ( 0%)
    
                    =============                         ==================== ===================
    (total)             840572928                             814580352 ( 97%)       1900640 ( 0%)
    
    Inode Information
    -----------------
    Number of used inodes:            4244
    Number of free inodes:          157036
    Number of allocated inodes:     161280
    Maximum number of inodes:       512000
    GPFS operations that involve allocation of data and metadata blocks (that is, file creation and writes) slow down significantly if the number of free blocks drops below 5% of the total. Free up some space by deleting files or snapshots (keep in mind that deleting a file does not necessarily free disk space while snapshots are present). Another possible cause of performance loss is a lack of free inodes. Issue the mmchfs command to increase the number of inodes for the file system so that at least 5% are free (a hedged mmchfs example follows this procedure). If the file system is approaching these limits, you may notice the following error messages:
    6027-533 [W]
    Inode space inodeSpace in file system fileSystem is approaching the limit for the maximum number of inodes.
    operating system error log entry
    Jul 19 12:51:49 node1 mmfs: Error=MMFS_SYSTEM_WARNING, ID=0x4DC797C6, Tag=3690419: File system warning. Volume fs1. Reason: File system fs1 is approaching the limit for the maximum number of inodes/files.
  2. If automated deadlock detection and deadlock data collection are enabled, look in the latest GPFS log file to determine whether the system detected the deadlock and collected the appropriate debug data. Look in /var/adm/ras/mmfs.log.latest for messages similar to the following:
    Thu Feb 13 14:58:09.524 2014: [A] Deadlock detected: 2014-02-13 14:52:59: waiting 309.888 seconds on node 
    p7fbn12: SyncHandlerThread 65327: on LkObjCondvar, reason 'waiting for RO lock'
    Thu Feb 13 14:58:09.525 2014: [I] Forwarding debug data collection request to cluster manager p7fbn11 of 
    cluster cluster1.gpfs.net
    Thu Feb 13 14:58:09.524 2014: [I] Calling User Exit Script gpfsDebugDataCollection: event deadlockDebugData, 
    Async command /usr/lpp/mmfs/bin/mmcommon.
    Thu Feb 13 14:58:10.625 2014: [N] sdrServ: Received deadlock notification from 192.168.117.21
    Thu Feb 13 14:58:10.626 2014: [N] GPFS will attempt to collect debug data on this node.
    mmtrace: move /tmp/mmfs/lxtrace.trc.p7fbn12.recycle.cpu0 
    /tmp/mmfs/trcfile.140213.14.58.10.deadlock.p7fbn12.recycle.cpu0
    mmtrace: formatting /tmp/mmfs/trcfile.140213.14.58.10.deadlock.p7fbn12.recycle to 
    /tmp/mmfs/trcrpt.140213.14.58.10.deadlock.p7fbn12.gz
    This example shows that deadlock debug data was automatically collected in /tmp/mmfs. If deadlock debug data was not automatically collected, it must be collected manually.

    To determine which nodes have the longest waiting threads, issue this command on each node (a sketch that runs this check across several nodes follows this procedure):

    /usr/lpp/mmfs/bin/mmdiag --waiters waitTimeInSeconds
    For all nodes that have threads waiting longer than waitTimeInSeconds seconds, issue:
    mmfsadm dump all
    Notes:
    1. Each node can potentially dump more than 200 MB of data.
    2. Run the mmfsadm dump all command only on nodes where you are sure that the threads are really hung. An mmfsadm dump all command can follow pointers that are changing and cause the node to crash.
  3. If the deadlock situation cannot be corrected, follow the instructions in Additional information to collect for delays and deadlocks, then contact the IBM Support Center.
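
The following sketch shows one way the free-inode check from step 1 might be acted on. The file system name fs1 and the new inode limit are illustrative assumptions; choose the actual limit from the Inode Information section of the mmdf output for your file system.
  # Re-check block and inode usage (fs1 is a placeholder file system name)
  mmdf fs1

  # Raise the maximum number of inodes so that at least 5% remain free;
  # 600000 is an illustrative value, not a recommendation
  mmchfs fs1 --inode-limit 600000
Deleting unneeded files or snapshots, as described in step 1, remains the first remedy when the shortage is in free blocks rather than inodes.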
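
To run the waiters check from step 2 across more than one node, a small loop such as the following could be used. The node names and the use of ssh are assumptions made for the sketch; substitute your own nodes and remote shell.
  # Show the current waiters on each node; append the waitTimeInSeconds
  # value from step 2 to filter for long waiters. Node names are placeholders.
  for node in node1 node2 node3; do
      echo "=== $node ==="
      ssh "$node" /usr/lpp/mmfs/bin/mmdiag --waiters
  done

  # Only on a node where you are sure the threads are really hung, capture
  # the dump to a file; the output can exceed 200 MB per node.
  /usr/lpp/mmfs/bin/mmfsadm dump all > /tmp/mmfs/dump.all.$(hostname)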