Debugging hangs
A hang is caused by a wait (also known as a deadlock) or a loop (also known as a livelock). A deadlock can occur when a thread waits on a lock or monitor. A loop can arise in a similar way, or because an algorithm makes little or no progress toward completion.
A wait can be caused either by a timing error that leads to a missed notification, or by two threads deadlocking on resources.
For an explanation of deadlocks and diagnosing them using a Javadump, see LOCKS.
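To illustrate the missed-notification case, the following is a minimal sketch of a guarded wait. The class and method names (GuardedWait, signalReady, awaitReady) are hypothetical and are not part of any product API. The waiting thread checks a condition flag while holding the monitor, so a notification that arrives before the wait starts is not lost, and the timeout prevents an indefinite hang if the notification never arrives.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical example class: a guarded wait that tolerates early notifications.
public class GuardedWait {
    private final Object lock = new Object();
    private boolean ready = false; // only read or written while holding 'lock'

    /** Records that the event happened and wakes any waiting threads. */
    public void signalReady() {
        synchronized (lock) {
            ready = true;
            lock.notifyAll();
        }
    }

    /**
     * Waits until signalReady() has been called or the timeout expires.
     * Returns true if the event was observed, false on timeout or interruption.
     */
    public boolean awaitReady(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            // Re-check the flag in a loop: this handles both spurious wakeups
            // and a notification that was sent before this thread began waiting.
            while (!ready) {
                long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMillis <= 0) {
                    return false; // timed out without seeing the event
                }
                try {
                    lock.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
            return true;
        }
    }
}
```

A bare call to wait() with no condition flag would block forever if signalReady() ran first, which is the timing error described above.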
A loop is caused by a thread failing to exit a loop in a timely
manner. The problem might occur because the thread calculated the
wrong limit value, or missed a flag that was intended to exit the
loop. If the problem occurs only on multiprocessor workstations, the
failure can usually be traced to:
- A failure to make the flag volatile.
- A failure to access the flag while holding an appropriate monitor.
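As a minimal sketch of the first point, the following example uses a volatile flag to end a loop that runs on another thread. The class and method names (StoppableWorker, requestStop, stopsWithin) are hypothetical. Without the volatile modifier, the Java memory model does not guarantee that the looping thread ever observes the write, so the loop might not exit.

```java
// Hypothetical example class: a worker loop that exits when a volatile flag is set.
public class StoppableWorker implements Runnable {
    // volatile makes a write by one thread visible to the thread running the loop.
    private volatile boolean stopRequested = false;
    private long iterations = 0; // touched only by the worker thread

    public void requestStop() {
        stopRequested = true;
    }

    @Override
    public void run() {
        while (!stopRequested) {
            iterations++; // stand-in for real work
        }
    }

    /**
     * Starts a worker, requests a stop, and reports whether the worker
     * thread finished within the given positive timeout.
     */
    public static boolean stopsWithin(long joinTimeoutMillis) {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker, "stoppable-worker");
        thread.setDaemon(true); // do not keep the JVM alive if the loop never exits
        thread.start();
        worker.requestStop();
        try {
            thread.join(joinTimeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
        return !thread.isAlive();
    }
}
```

An alternative that corresponds to the second point is to keep the flag non-volatile and read and write it only inside synchronized blocks on the same monitor.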
The following approaches are useful to resolve waits and loops:
- Monitoring process and system state (as described in MustGather information for Linux).
- Javadumps give monitor and lock information. You can trigger a Javadump during a hang by using the kill -QUIT <PID> command.
- -verbose:gc information is useful. It indicates:
  - Excessive garbage collection, caused by a lack of Java™ heap space, which makes the system seem to be in livelock.
  - Garbage collection causing a hang or memory corruption which later causes hangs.
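Some of this information can also be sampled from inside the application. The following is a rough sketch that uses the standard java.lang.management API to count deadlocked threads and to total the time spent in garbage collection. The class name HangProbe and its methods are hypothetical, and this sketch is a supplement to Javadumps and -verbose:gc output, not a replacement for them.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical example class: in-process checks for deadlocks and GC time.
public class HangProbe {

    /** Returns the number of threads that the JVM reports as deadlocked. */
    public static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads() also covers java.util.concurrent locks, but it
        // needs synchronizer monitoring support; otherwise check monitors only.
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();
        return ids == null ? 0 : ids.length; // null means no deadlock was found
    }

    /** Returns the approximate accumulated GC time, in milliseconds, for all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }
}
```

Sampling totalGcMillis() at intervals and comparing the increase with the elapsed wall-clock time gives a rough indication of whether the application is spending most of its time in garbage collection.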