Resolving deadlock problems
After diagnosing a deadlock problem, the next step is to attempt to resolve the deadlock issue resulting between two concurrently running applications each of which have locked a resource the other application needs. The guidelines provided here can help you to resolve the deadlock problem you are experiencing and help you to prevent such future incidents.
Before you begin
Confirm that you are experiencing a deadlock problem by taking the necessary diagnostic steps for locking problems outlined in Diagnosing and resolving locking problems.
About this task
The guidelines provided here can help you to resolve the deadlock problem you are experiencing and help you to prevent such future incidents.
Use the following steps to diagnose the cause of the unacceptable deadlock problem and to apply a remedy:
- Obtain information from the lock event monitor or administration notification log about all tables where agents are experiencing deadlocks.
- Use the information in the administration notification
log to decide how to resolve the deadlock problem. There are a number of guidelines that help to reduce lock contention and lock wait time. Consider the following options:
- Each application connection should process its own set of rows to avoid lock waits.
- Deadlock frequency can sometimes be reduced by ensuring that all applications access their common data in the same order - meaning, for example, that they access (and therefore lock) rows in Table A, followed by Table B, followed by Table C, and so on. If two applications take incompatible locks on the same objects in different order, they run a much larger risk of deadlocking.
- A lock timeout is not much better than a deadlock, because both cause a transaction to be rolled back, but if you must minimize the number of deadlocks, you can do it by ensuring that a lock timeout will usually occur before a potential related deadlock can be detected. To do this, set the value of the locktimeout database configuration parameter (units of seconds) to be much lower than the value of the dlchktime database configuration parameter (units of milliseconds). Otherwise, if locktimeout is longer than the dlchktime interval, the deadlock detector could wake up just after the deadlock situation began, and detect the deadlock before the lock timeout occurs.
- Avoid concurrent DDL operations if possible. For example, DROP TABLE statements can result in a large number of catalog updates as rows might have to be deleted for the table indexes, primary keys, check constraints, and so on, in addition to the table itself. If other DDL operations are dropping or creating objects, there can be lock conflicts and even occasional deadlocks.
- It is best practice to commit the following actions as soon as
- Write actions such as delete, insert, and update
- Data definition language (DDL) statements, such as ALTER, CREATE, and DROP
- BIND and REBIND commands
The deadlock detector is unable to know about and resolve the following situation, so your
application design must guard against this.
An application, particularly a multithreaded one, can have a deadlock involving a Db2® lock wait and a wait for a resource outside of the Db2 software, such as a semaphore. For example, connection A can be waiting for a lock held by connection B, and B can be waiting for a semaphore held by A.
What to do next
Rerun the application or applications to ensure that the locking problem has been eliminated by checking the administration notification log for lock-related entries.