Automated deadlock detection
Automated deadlock detection flags unexpected long waiters as potential deadlocks. Effective deadlock detection thresholds are self-tuned to reduce false positive detection. You can register a user program for the deadlockDetected event to receive automatic notification.
GPFS code uses waiters to track what a thread is waiting for and how long it is waiting. Many deadlocks involve long waiters. In a real deadlock, long waiters do not disappear naturally as the deadlock prevents the threads from getting what they are waiting for. With some exceptions, long waiters typically indicate that something in the system is not healthy. A deadlock might be in progress, some disk might be failing, or the entire system might be overloaded.
Automated deadlock detection monitors waiters to detect potential deadlocks. Some waiters can become long legitimately under normal operating conditions and such waiters are ignored by automated deadlock detection. Such waiters appear in the mmdiag --waiters output but never in the mmdiag --deadlock output. From now on in this topic, the word waiters refers only to those waiters that are monitored by automated deadlock detection.
Sat Jul 18 09:52:04.626 2015: [A] Unexpected long waiter detected: Waiting 905.9380 sec since
2015-07-18 09:36:58, on node c33f2in01,
SharedHashTabFetchHandlerThread 8397: on MsgRecordCondvar,
reason 'RPC wait' for tmMsgTellAcquire1
The /var/log/messages file on Linux® and the error log on AIX® also log an entry for the deadlock detection, but the mmfs.log file contains the most detail.
The deadlockDetected event is triggered on "Unexpected long waiter detected" and any user program that is registered for the event is invoked. The user program can be used for recording and notification purposes. See /usr/lpp/mmfs/samples/deadlockdetected.sample for an example and more information.
Sat Jul 18 10:00:05.705 2015: [N] The unexpected long waiter on thread 8397 has disappeared in 1386 seconds.
The mmdiag --deadlock command shows the flagged waiter and possibly other waiters close behind it that have also passed the threshold for deadlock detection.
If the flagged waiter disappears on its own, without any deadlock breakup actions, then the flagged waiter is not a real deadlock, and the detection is a false positive. A reasonable threshold needs to be established to reduce false positive deadlock detection. Consider the trade-off: a threshold that is too high delays detection, while a threshold that is too low causes false positive detections.
A false positive deadlock detection and debug data collection are not necessarily a waste of resources. A long waiter, even if it eventually disappears on its own, likely indicates that something is not working well, and is worth looking into.
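As a minimal sketch, the detection and disappearance messages shown in this topic can be paired up to see which flagged waiters later cleared. The regular expressions are modeled only on the two sample messages here and are an assumption; other releases may word the messages differently. The function name summarize_waiters is hypothetical, and matching on thread ID alone assumes that IDs are not reused within the log excerpt.

```python
import re

# Patterns modeled on the two sample mmfs.log messages in this topic.
# They are assumptions, not a documented message format.
DETECTED = re.compile(
    r"Unexpected long waiter detected: Waiting (?P<secs>[\d.]+) sec since\s+"
    r"(?P<since>\S+ \S+), on node (?P<node>[^,\s]+),\s+"
    r"(?P<thread>\S+) (?P<tid>\d+): on (?P<obj>[^,\s]+),\s+"
    r"reason '(?P<reason>[^']*)'"
)
DISAPPEARED = re.compile(
    r"The unexpected long waiter on thread (?P<tid>\d+) "
    r"has disappeared in (?P<secs>\d+) seconds"
)


def summarize_waiters(log_text):
    """Return flagged waiters keyed by thread ID, noting which ones cleared."""
    waiters = {}
    for m in DETECTED.finditer(log_text):
        waiters[int(m["tid"])] = {
            "node": m["node"],
            "thread": m["thread"],
            "reason": m["reason"],
            "flagged_after_sec": float(m["secs"]),
            "cleared_after_sec": None,  # stays None while the waiter persists
        }
    for m in DISAPPEARED.finditer(log_text):
        tid = int(m["tid"])
        if tid in waiters:
            waiters[tid]["cleared_after_sec"] = int(m["secs"])
    return waiters


SAMPLE_LOG = """\
Sat Jul 18 09:52:04.626 2015: [A] Unexpected long waiter detected: Waiting 905.9380 sec since
2015-07-18 09:36:58, on node c33f2in01,
SharedHashTabFetchHandlerThread 8397: on MsgRecordCondvar,
reason 'RPC wait' for tmMsgTellAcquire1
Sat Jul 18 10:00:05.705 2015: [N] The unexpected long waiter on thread 8397 has disappeared in 1386 seconds.
"""

for tid, info in summarize_waiters(SAMPLE_LOG).items():
    state = "cleared" if info["cleared_after_sec"] is not None else "still waiting"
    print(tid, info["node"], info["reason"], state)
```

A waiter that cleared without any deadlock breakup action is a false positive in the sense described above, but it can still be worth investigating.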
The configuration parameter deadlockDetectionThreshold is used to specify the initial threshold for deadlock detection. GPFS code adjusts the threshold on each node based on conditions on the node and in the cluster. The adjusted threshold is the effective threshold used in automated deadlock detection.
Effective deadlock detection threshold on c37f2n04 is 1000 seconds
Effective deadlock detection threshold on c37f2n04 is 430 seconds for short waiters
Cluster my.cluster is overloaded. The overload index on c40bbc2xn2 is 1.14547
Certain waiters, including most of the mutex waiters, are considered short waiters that should not be long. If deadlockDetectionThresholdForShortWaiters is positive, and it is by default, these short waiters are monitored separately, with a shorter effective deadlock detection threshold that is also self-tuned separately.
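The effective threshold messages shown in this topic can be collected per node with a short script. This is a sketch under assumptions: the pattern is modeled only on the sample messages here, and the function name effective_thresholds and the "regular" and "short" labels are hypothetical.

```python
import re

# Pattern modeled on the sample messages in this topic; an assumption,
# not a documented message format.
THRESHOLD = re.compile(
    r"Effective deadlock detection threshold on (?P<node>\S+) is "
    r"(?P<secs>\d+) seconds(?P<short> for short waiters)?"
)


def effective_thresholds(log_text):
    """Map node -> {"regular": secs, "short": secs}, keeping the latest value."""
    result = {}
    for m in THRESHOLD.finditer(log_text):
        kind = "short" if m["short"] else "regular"
        result.setdefault(m["node"], {})[kind] = int(m["secs"])
    return result


SAMPLE = """\
Effective deadlock detection threshold on c37f2n04 is 1000 seconds
Effective deadlock detection threshold on c37f2n04 is 430 seconds for short waiters
"""
print(effective_thresholds(SAMPLE))
```

Because later messages overwrite earlier ones, the result reflects the most recent self-tuned value for each node in the text that was scanned.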
The overload index is the weighted average duration of all I/Os completed over a long time. Recent I/O durations count more than the ones in the past. The cluster overload detection affects deadlock amelioration functions only. The determination by GPFS that a cluster is overloaded is not necessarily the same as the determination by a customer. But customers might use the determination by GPFS as a reference and check the workload, hardware, and network of the cluster to see whether anything needs correction or adjustment. An overloaded cluster with a workload far exceeding its resource capability is neither healthy nor productive.
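To illustrate the idea of an average in which recent I/O durations count more than older ones, the following sketch uses exponential weighting. This is an assumption chosen for illustration only: this topic does not state which weighting GPFS uses, how the result is scaled into the overload index, or at what value a cluster is reported as overloaded. The function name and the alpha parameter are hypothetical.

```python
def weighted_average_duration(durations, alpha=0.2):
    """Exponentially weighted average of I/O durations, oldest first.

    alpha is a hypothetical smoothing factor: larger values make recent
    I/Os count more. The weighting that GPFS uses is not described here.
    """
    average = None
    for d in durations:
        if average is None:
            average = d  # the first sample seeds the average
        else:
            average = alpha * d + (1 - alpha) * average
    return average


steady = [0.010] * 50            # fifty I/Os of 10 ms each
degraded = steady + [0.500] * 5  # followed by five recent slow I/Os

print(weighted_average_duration(steady))
print(weighted_average_duration(degraded))
print(sum(degraded) / len(degraded))  # unweighted mean, for comparison
```

With this weighting, a handful of recent slow I/Os moves the average far more than it moves the unweighted mean, which is the behavior the description of the overload index suggests.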
If the existing effective deadlock detection threshold value is no longer appropriate for the workload, run the mmfsadm resetstats command to restart the local adjustment.
mmlsconfig deadlockDetectionThreshold
mmlsconfig deadlockDetectionThresholdForShortWaiters
deadlockDetectionThreshold 300
deadlockDetectionThresholdForShortWaiters 60
To disable automated deadlock detection, specify a value of 0 for deadlockDetectionThreshold. All deadlock amelioration functions, not just deadlock detection, are disabled by specifying 0 for deadlockDetectionThreshold. A positive value must be specified for deadlockDetectionThreshold to enable any part of the deadlock amelioration functions.
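The rules for the two parameters can be summarized in a small helper. This sketch assumes the two-column name and value layout shown in the sample output above; the function names and the keys of the returned dictionary are hypothetical, and the helper only interprets text that you pass to it rather than running any GPFS command.

```python
def parse_mmlsconfig(output):
    """Parse 'name value' lines, such as the sample output above, into a dict."""
    settings = {}
    for line in output.splitlines():
        parts = line.split()
        # Keep only lines of the form "<name> <non-negative integer>".
        if len(parts) == 2 and parts[1].isdigit():
            settings[parts[0]] = int(parts[1])
    return settings


def deadlock_detection_status(settings):
    """Apply the rules in this topic to the two threshold parameters."""
    threshold = settings["deadlockDetectionThreshold"]
    short = settings["deadlockDetectionThresholdForShortWaiters"]
    # A value of 0 disables all deadlock amelioration functions.
    enabled = threshold > 0
    return {
        "amelioration_enabled": enabled,
        # Short waiters are monitored separately only if their threshold
        # is positive and amelioration is enabled at all.
        "short_waiters_monitored_separately": enabled and short > 0,
    }


SAMPLE_OUTPUT = """\
deadlockDetectionThreshold 300
deadlockDetectionThresholdForShortWaiters 60
"""
print(deadlock_detection_status(parse_mmlsconfig(SAMPLE_OUTPUT)))
```

The helper raises KeyError if either parameter is missing from the parsed text, which avoids guessing at a value that was not reported.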