Debugging hangs

A hang is caused either by a wait (also known as a deadlock) or by a loop (also known as a livelock). A deadlock can occur when a thread waits on a lock or monitor. A loop can occur for a similar reason, or because an algorithm makes little or no progress towards completion.

A wait can be caused either by a timing error that leads to a missed notification, or by two threads deadlocking on resources.
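
The missed-notification case can be illustrated with a minimal sketch. The class, field, and method names here (GuardedWait, ready, awaitReady, signalReady) are hypothetical and are not taken from any product code. The sketch assumes the usual guarded-wait idiom: the waiting thread tests a condition in a loop while it holds the monitor, so a notification that is sent before the wait begins is not lost.

```java
// Hypothetical illustration of a guarded wait; names are assumptions.
class GuardedWait {
    private final Object lock = new Object();
    private boolean ready = false; // condition, only accessed while holding lock

    // Blocks until ready is true. Because the condition is tested under the
    // monitor before waiting, an earlier call to signalReady() is not missed.
    void awaitReady() throws InterruptedException {
        synchronized (lock) {
            while (!ready) {
                lock.wait();
            }
        }
    }

    // Sets the condition and wakes any waiting threads.
    void signalReady() {
        synchronized (lock) {
            ready = true;
            lock.notifyAll();
        }
    }

    // Sends the notification BEFORE the waiter starts, then reports whether
    // the waiter still completed within two seconds.
    static boolean demo() {
        GuardedWait g = new GuardedWait();
        g.signalReady();
        Thread waiter = new Thread(() -> {
            try {
                g.awaitReady();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.setDaemon(true);
        waiter.start();
        try {
            waiter.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("waiter completed: " + demo());
    }
}
```

If awaitReady() called lock.wait() without first testing ready, the same sequence of events would leave the waiter blocked indefinitely, which is the kind of hang that this section describes.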

For an explanation of deadlocks and diagnosing them using a Javadump, see LOCKS.
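
As a complement to a Javadump, the standard java.lang.management API can report monitor deadlocks from inside a running JVM. The following sketch is an illustration only, and the class and method names (DeadlockProbe, countDeadlockedThreads, lockInOrder) are hypothetical. It deliberately creates two daemon threads that acquire two monitors in opposite order, then polls ThreadMXBean.findMonitorDeadlockedThreads() until the deadlock is reported.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration; names are assumptions.
class DeadlockProbe {
    // Returns the number of threads currently deadlocked on object monitors.
    static int countDeadlockedThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findMonitorDeadlockedThreads(); // null when none found
        return ids == null ? 0 : ids.length;
    }

    // Takes the first monitor, waits until both threads hold one monitor
    // each, then tries to take the second monitor.
    private static void lockInOrder(Object first, Object second, CountDownLatch bothHold) {
        synchronized (first) {
            bothHold.countDown();
            try {
                bothHold.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Never reached in the demo: the other thread holds this monitor.
            }
        }
    }

    // Creates a deliberate deadlock between two daemon threads and returns
    // the number of deadlocked threads reported, or 0 if none is seen in time.
    static int demo() {
        final Object a = new Object();
        final Object b = new Object();
        CountDownLatch bothHold = new CountDownLatch(2);
        Thread t1 = new Thread(() -> lockInOrder(a, b, bothHold));
        Thread t2 = new Thread(() -> lockInOrder(b, a, bothHold));
        t1.setDaemon(true); // daemon threads do not keep the JVM alive
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            for (int i = 0; i < 100; i++) {
                int n = countDeadlockedThreads();
                if (n > 0) {
                    return n;
                }
                Thread.sleep(20);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + demo());
    }
}
```

A check like this covers only deadlocks on object monitors. A Javadump remains the more complete source of monitor and lock information.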

A loop is caused by a thread failing to exit a loop in a timely manner. The problem might occur because the thread calculated the wrong limit value, or missed a flag that was intended to exit the loop. If the problem occurs only on multiprocessor workstations, the failure can usually be traced to:
  • A failure to make the flag volatile.
  • A failure to access the flag while holding an appropriate monitor.
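
The first of these failures can be illustrated with a minimal sketch. The names (StopFlag, stop, requestStop) are hypothetical. The sketch assumes a worker thread that spins until another thread sets a flag. Declaring the flag volatile guarantees that the worker sees the write.

```java
// Hypothetical illustration of a loop exit flag; names are assumptions.
class StopFlag {
    // volatile makes a write by one thread visible to the looping thread.
    // Without it, the loop below might never observe the change.
    private volatile boolean stop = false;
    private long iterations = 0; // touched only by the worker thread

    // Loops until another thread calls requestStop().
    void run() {
        while (!stop) {
            iterations++;
        }
    }

    void requestStop() {
        stop = true;
    }

    // Starts a worker, requests a stop, and reports whether the worker
    // left its loop within two seconds.
    static boolean demo() {
        StopFlag flag = new StopFlag();
        Thread worker = new Thread(flag::run);
        worker.setDaemon(true);
        worker.start();
        try {
            Thread.sleep(50);
            flag.requestStop();
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker exited: " + demo());
    }
}
```

An alternative that addresses the second failure is to read and write the flag only inside synchronized blocks on the same monitor, which gives the same visibility guarantee.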
The following approaches are useful for resolving waits and loops:
  • Monitor the process and system state (as described in MustGather information for Linux).
  • Use Javadumps, which give monitor and lock information. You can trigger a Javadump during a hang by using the kill -QUIT <PID> command.
  • Use -verbose:gc output, which can indicate:
    • Excessive garbage collection, caused by a lack of Java™ heap space, which makes the system seem to be in livelock.
    • Garbage collection causing a hang or memory corruption that later causes hangs.