Relationship between local and global deadlock detection

The four XCF messages represent one global detection cycle, which usually takes two to four x-second intervals to complete (where x is the deadlock time value, in seconds).

Four XCF messages are required to gather and communicate the latest information from the local deadlock detectors:
  1. The local deadlock detector sends its information about lock waiters to the global deadlock manager.
  2. The global deadlock manager takes that information from all local deadlock detectors and sends messages to each of the IRLMs in the group. (Because the global deadlock manager is also a local deadlock detector, it receives the same information, although somewhat sooner than the rest of the IRLMs.)
  3. Each local deadlock detector checks the global view of resources and determines if it has blockers for other waiters. It passes that information along to the global deadlock manager with its list of waiters.
  4. The global deadlock manager, from the information it receives from the local deadlock detectors, determines if a global deadlock or timeout situation exists. If a global deadlock situation exists, Db2 chooses a candidate for the deadlock. The global deadlock manager also determines if any timeout candidate is blocked by an incompatible waiter or holder and, if so, presents that candidate to the owning IRLM, along with any deadlock candidates belonging to that IRLM. When Db2 receives this information, it determines if it should request that IRLM reject any given timeout candidate waiter.
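The reason a global view is needed in step 4 can be illustrated with a wait-for graph. The following is a minimal sketch only, not IRLM's actual algorithm: it assumes each local detector reports hypothetical (waiter, blocker) pairs, merges them, and searches the merged graph for a cycle. The names `merge_wait_for`, `find_deadlock`, `IRLM_A`, `IRLM_B`, `T1`, and `T2` are illustrative assumptions and do not appear in Db2 or IRLM.

```python
from collections import defaultdict

def merge_wait_for(local_reports):
    """Combine per-IRLM (waiter, blocker) pairs into one wait-for graph."""
    graph = defaultdict(set)
    for edges in local_reports.values():
        for waiter, blocker in edges:
            graph[waiter].add(blocker)
    return graph

def find_deadlock(graph):
    """Return one cycle of the wait-for graph as a list, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = defaultdict(int)
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            if color[nxt] == GRAY:
                # Reached a node already on the current path: a cycle.
                return path[path.index(nxt):]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for start in sorted(graph):
        if color[start] == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None

# Hypothetical reports: each IRLM sees only part of the picture.
reports = {
    "IRLM_A": [("T1", "T2")],   # T1 waits for a lock held by T2
    "IRLM_B": [("T2", "T1")],   # T2 waits for a lock held by T1
}
local_only = find_deadlock(merge_wait_for({"IRLM_A": reports["IRLM_A"]}))
cycle = find_deadlock(merge_wait_for(reports))
```

In this sketch, neither member's report contains a cycle on its own (`local_only` is `None`); the cycle between `T1` and `T2` appears only after the reports are merged, which is the role the text assigns to the global deadlock manager.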

The following figure illustrates an example in which the deadlock time value is set to 5 seconds.

Figure 1. Global deadlock detection cycle
Figure summary: In 10 seconds, the global deadlock manager processes waiters and chooses deadlock and timeout victims.
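As a rough worked example of the timing, the sketch below computes the typical duration of one global detection cycle. It assumes that each interval lasts one deadlock time value and that a cycle takes two to four such intervals, as stated earlier; the function name `global_cycle_seconds` is hypothetical.

```python
def global_cycle_seconds(deadlock_time_s):
    """Return the (shortest, longest) typical duration of one global cycle.

    Assumes a cycle spans two to four intervals of deadlock_time_s seconds.
    """
    return 2 * deadlock_time_s, 4 * deadlock_time_s

# Deadlock time value from the figure example: 5 seconds.
low, high = global_cycle_seconds(5)
print(f"typical global cycle: {low} to {high} seconds")
```

With a deadlock time value of 5 seconds, the lower bound of 10 seconds matches the figure. Actual durations can be longer under the delay conditions listed below.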

Deadlock detection might be delayed if any of the IRLMs in the group encounter any of the following conditions:

  • XCF signaling delays
  • IRLM latch contention (which can occur in systems with extremely high IRLM locking activity)
  • A large number of global waiters