DB2 Universal Database concurrency
Maximize access to relational data
The purpose of this article is to provide IBM® business partners with important information about DB2® Universal Database™ (UDB) data concurrency in a z/OS® environment. This article attempts to consolidate a significant amount of material from multiple sources, and then present that information as efficiently as possible. It is not intended to be comprehensive in scope or exhaustive in detail.
I intend to discuss the things that most commonly lead to the detection and resolution of concurrency problems. Neither every possible situation nor every potential consideration can be anticipated, let alone covered within the intended scope. I hope that this article provides you general guidance in order to assist DB2 UDB customers in achieving high levels of data concurrency in their environment. This article should be a good starting point for addressing concurrency at any given installation.
I assume that you have a basic understanding of DB2 UDB in a z/OS environment. The first few pages of this article discuss concepts and definitions in order to level set. My recommendations tend to be somewhat general in nature, and I don't always spell out technical descriptions and syntactical specifications in great detail. For more detailed information on these matters, I advise you to check the most recent IBM documentation for the release of DB2 UDB that is installed at the customer location.
The general design point for this article is DB2 UDB V7. Although DB2 UDB V8 has been announced, it will probably be months before the majority of IBM customers will implement DB2 UDB V8 NFM (New Function Mode) for their production systems. There is another factor to consider here as well. Although every new release of DB2 UDB undergoes extensive testing in both IBM and customer environments before being made generally available, customers will most likely have more confidence in general recommendations, Rules-of-Thumb, etc., based on earlier versions of DB2 UDB (because of the length and breadth of actual experience to validate the conclusions) than for a version that is not yet in broad, general usage.
DISCLAIMER: The information contained in this document has not been submitted to any formal IBM test and is distributed AS IS. The use of this information or the implementation of any of these techniques is a user responsibility and depends on the user's ability to evaluate and integrate them into the customer's unique operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Users attempting to adapt these techniques to their own environments do so at their own risk.
Conceptual background and terminology
In the realm of database management systems (DBMSs), the term concurrency is used to denote the ability of more than one application to access the same data at essentially (from the user's perspective) the same time.
Since one of the main benefits of a DBMS is the ability to share data among multiple users and multiple applications, the database system should provide a means for managing concurrent access to the data. The DBMS must ensure that the data will be maintained in a consistent state and that the integrity of the data will be preserved.
One method of accomplishing this is to enforce a serial-only mode of processing database requests. That is, each transaction waits until another transaction (either higher priority or initiated earlier) has completed its work. However, this type of processing results in performance levels that are simply unacceptable for today's online systems and customer expectations.
As an alternative, the DBMS can manage the access to data from multiple applications through the means of locks. A lock is a software mechanism that is used in order to allow as much throughput (by maximizing concurrent access to data) as possible while maintaining the integrity and consistency of the data.
Importance of concurrency control
Without an effective means of controlling concurrency, the integrity and consistency of the data could be compromised. The DBMS must protect the database and prevent conditions such as the following:
- Lost updates - Suppose that application A and application B both read the same row from the database, and both calculate a new value for one of its columns. If application A updates the row with its new value first, and subsequently application B then updates the same row, the first update (by application A) is lost.
- Unrepeatable reads - Certain application processes may require the following sequence of events: Program A reads a particular row from a table and then proceeds to other SQL requests. At a later time, program A reads the first row again and must find the same values in all columns as when it read that row the first time. Lacking proper concurrency control, another application could have changed the row between the two read operations.
- Access to uncommitted data - Application A updates some column values in a row, and before that change is committed, application B reads in the new (update) values for that row. If application A then 'backs out' the updated values (either by an SQL ROLLBACK statement in the program logic, or an automatic rollback by DB2 UDB because of an error condition), application B's processing of that row is then based upon uncommitted (and therefore potentially incorrect) data.
The ability to provide simultaneous access to data from multiple applications while maintaining data integrity is known as concurrency control.
A lock is a software mechanism that is used by DB2 UDB to accomplish concurrency control. A lock is essentially a control block that associates a DB2 UDB object or resource with an application in regards to how other applications can access that same object or resource. The application associated with the DB2 UDB resource is said to "hold" or "own" the lock.
By using locks, DB2 UDB (which manages the database) can prevent the kinds of problems mentioned above from occurring. DB2 UDB works in conjunction with another MVS address space, IRLM, which manages these locks. The locks and their owners are tracked by IRLM in order to determine whether or not a DB2 UDB resource that is being requested by an application is available for that type of work. Depending upon the kind of processing being done by the "holder" of the current lock on the resource, and the kind of processing anticipated by the requesting application, the requested resource may either be shared with the requester or remain locked against it.
The two most common lock modes are share and exclusive. Share locks are associated with read-only operations, which means that the application holding the lock is able to read the data, and the data also remains available for other applications to read. Exclusive locks are associated with write operations, which means that the application holding the lock is eligible to update the data, and that the data is unavailable to other applications until the update is complete (the change is committed to the database) and the lock is released by the lock owner.
DB2 UDB and IRLM use other types and subtypes of lock modes in their implementation of locking and concurrency control. You can find a more detailed description of these in the DB2 UDB Administration manual.
In addition to using various lock modes, DB2 UDB also provides different levels of locking in order to control the scope of the data that is being locked. These different levels indicate the locking granularity used by DB2 UDB, which can range from a single row to the entire table space. DB2 UDB uses different lock modes, depending upon the locking granularity.
The reason for having multiple levels is straightforward. Certain applications may request access to read or update a broad scope of data, while other applications may request only a much narrower scope. If only one level of locking were available, overall system performance could suffer. For instance, locking too much data all at once forces other applications to wait unnecessarily, or DB2 UDB might use excessive system resources servicing additional requests for locks on additional data resources. Having the ability to implement multiple levels of locking granularity can significantly improve concurrency levels.
An application process is said to be suspended when it requests a lock that is already owned by another application process, and that lock cannot be shared. The requesting application is suspended; that is, it temporarily stops running. The order of precedence for lock requests is as follows: Incoming lock requests are queued in the order received. Requests by applications already holding locks, and requests for lock promotion, are both serviced before requests by new applications. Within those groupings, the request order is "first in, first out" (FIFO).
When an application has been in a suspended state (see above) for longer than a preset interval of time, that application is terminated. The application is said to have timed out. The application receives an appropriate error message in the SQLCA before being terminated. The SQLCA (SQL Communications Area) is a fixed amount of storage set aside in any SQL application program for the purpose of passing condition codes and other information from DB2 UDB to the program.
Certain operations, such as COMMIT and ROLLBACK, cannot be timed out. The preset interval that determines how long an application process can wait for a resource is discussed below under the sub-heading RESOURCE TIMEOUT.
A deadlock is a situation that occurs when two or more applications each hold locks on resources that the other application needs, and without which those applications cannot proceed and complete their work.
Here is a simple deadlock scenario:
- Application A accesses table T and requests an exclusive (non-shareable) lock on page X.
- Application B accesses table T and requests an exclusive lock on page Y.
- Application A then requests a lock on page Y, while still holding the exclusive lock on page X. Application A is suspended, because of application B's exclusive lock on page Y.
- Application B then requests a lock on page X, while still holding the exclusive lock on page Y. Application B is suspended, because of application A's exclusive lock on page X.
- The result is a stalemate: neither application A nor B can proceed.
After a preset time interval (please see the discussion below entitled DEADLOCK TIME), DB2 UDB terminates the current unit of work for one of the applications involved in the deadlock situation (usually the application that has done the least amount of work). That frees the locks held by the terminating program and allows the remaining applications to continue. DB2 UDB sends a descriptive error message and information to the SQLCA of the terminated application.
Concurrency mechanisms for utilities and commands
While SQL applications use transaction locks to control concurrent access to DB2 UDB objects, DB2 UDB utilities and commands can obtain access to DB2 UDB objects through other means, namely claims, drains, and compatibility rules.
A claim is a notification to DB2 UDB that a particular object is currently being accessed. Claims usually do not continue to exist beyond the commit point. In order to access the DB2 UDB object within the next unit of work, an application needs to make a new claim. Claims notify DB2 UDB that there is current interest in or activity on a DB2 UDB object. As long as there are any claims on a DB2 UDB object, no drains may be taken on the object until those claims are released.
A drain is the action of obtaining access to a DB2 UDB object, by:
- Preventing any new claims against the object.
- Waiting for all existing claims on the object to be released.
A drain on a DB2 UDB object causes DB2 UDB to quiesce all applications currently claiming that resource, by allowing them to reach a commit point but preventing them (or any other application process) from making a new claim. A drain lock also prevents conflicting processes from trying to drain the same object at the same time.
DB2 UDB typically controls the concurrent operation of utilities by a set of compatibility rules. Two utilities are considered "compatible" if they do not need to access the same DB2 UDB object at the same time in incompatible modes. When a utility job begins, DB2 UDB checks the system for any other utilities currently processing the same DB2 UDB object. The utility can only continue if the object is either not currently being accessed by another utility, or if the other executing utility is compatible.
Database design considerations
Achieving high concurrency should be one of the goals that guide the initial design of a DB2 UDB database. Many features that affect concurrency can be put into place at the time of creating the tables. You can add some of these features after implementation (such as by ALTERing the DB2 UDB objects), but others cannot be added, or at the very least require significant re-work and/or disruption to the current implementation. Therefore, it is advantageous to consider these issues up front.
I highly recommend that you create your larger tables as partitioned tables. A partitioned table space contains only a single, partitioned table. The table is divided into partitions based on the key ranges of the partitioning index. Each partition is created as a separate data set and can be processed by DB2 UDB as a separate entity.
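As a brief sketch (the database, table, column, and key-range names below are all hypothetical, chosen only for illustration), a four-partition table space and its partitioning index might be defined like this using DB2 UDB V7 syntax:

```sql
-- Hypothetical objects; NUMPARTS creates one data set per partition.
CREATE TABLESPACE ORDERTS IN SALESDB
  NUMPARTS 4
  LOCKPART YES          -- allow selective partition locking
  LOCKSIZE ANY;

CREATE TABLE ORDERS
  (ORDER_NO   CHAR(8)      NOT NULL,
   CUST_NO    CHAR(6)      NOT NULL,
   ORDER_AMT  DECIMAL(9,2))
  IN SALESDB.ORDERTS;

-- In V7, the partitioning index defines the key range of each partition.
CREATE INDEX ORDERIX ON ORDERS (ORDER_NO)
  CLUSTER (PART 1 VALUES('19999999'),
           PART 2 VALUES('39999999'),
           PART 3 VALUES('59999999'),
           PART 4 VALUES('99999999'));
```

With a structure like this, a large batch job can be split into four smaller jobs, each restricted by its WHERE clause to one partition's key range, so that the jobs can run concurrently.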
For heavy volume batch operations (INSERT, UPDATE, DELETE), you can take advantage of this partitioned table structure by breaking up a large job into smaller jobs. These numerous, smaller jobs can run concurrently (against different partitions) in order to reduce the elapsed time for the entire batch operation, and thereby make access to the table from other application processes available sooner.
In a similar manner, partitioned tables allow a utility job to work on a selected partition only, thus allowing concurrent access by other jobs or application processes to the other partitions in the table. In many cases this ability to take advantage of partitioned tables is inherent within the DB2 UDB utility program itself, while in other cases it simply offers the user the option of designing the workload to allow for higher levels of concurrency of DB2 UDB data.
DB2 UDB uses an escalation technique in order to balance concurrency needs with locking performance overhead. When an application process holds a large number of page, row or LOB locks on a single table or tablespace, DB2 UDB acquires a table or table space lock on that resource and then releases all of the previous page, row and LOB locks on that resource. That process is referred to as lock escalation.
If lock escalation occurs for a table using partition locking (CREATEd or ALTERed with LOCKPART YES), then only the partitions that are currently locked are escalated, and unlocked partitions remain unlocked. Once lock escalation has occurred for a table space, any unlocked partitions that are subsequently accessed are then locked with the table space lock.
When an application is executed, DB2 UDB starts by using page or row locks, and continues doing so as long as the process accesses relatively few pages or rows. When the application accesses many pages or rows, DB2 UDB will change to using table, table space or partition locks. Precisely when lock escalation is invoked is determined by the values of LOCKSIZE and LOCKMAX.
LOCKSIZE is an option of the CREATE/ALTER TABLESPACE statements that controls what type of lock that DB2 UDB acquires for an application process when accessing tables in the table space (i.e. it determines the "size" of the lock, which is sometimes also referred to as the lock granularity). The choices for this option include LOB, TABLESPACE, TABLE, PAGE, ROW and ANY.
The default for the LOCKSIZE parameter of the CREATE TABLESPACE statement is ANY. LOCKSIZE ANY allows DB2 UDB to choose the lock size. Normally DB2 UDB uses LOCKSIZE PAGE for non-LOB tables and LOCKSIZE TABLESPACE for LOB tables.
I recommend using this default when creating table spaces until you have a reason to do otherwise. If you choose to change the LOCKSIZE, base the decision upon the results of monitoring the performance and concurrency characteristics of the applications that use the table space.
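To illustrate (object names are hypothetical), you might start with the default and move to row-level locking only after monitoring justifies it:

```sql
-- Start with the default: DB2 chooses the granularity (normally PAGE).
CREATE TABLESPACE APPTS IN APPDB
  LOCKSIZE ANY;

-- Later, if monitoring shows heavy contention on data pages,
-- switch this table space to row-level locks.
ALTER TABLESPACE APPDB.APPTS
  LOCKSIZE ROW;
```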
What size lock to use?
With DB2 V4, row level locking became available. Prior to that, the data page was the smallest locking unit. Many in the I/T industry assumed that row locking was a panacea for concurrency issues, but in reality it does not solve every concurrency problem. It may provide a significant improvement in an environment that is experiencing numerous lock waits, deadlocks and timeouts. But in other circumstances DB2 UDB can spend resources on acquiring more locks without realizing a proportional improvement in concurrency.
Since the processing consumed by IRLM to acquire, maintain and release a row lock is approximately the same as that required for a page lock, the decision about the size of lock to specify is a matter of the tradeoff between higher locking overhead and potentially improved concurrency.
So the question of whether to use page or row locks depends upon the characteristics of your data and applications. If you are observing significant contention on the data pages of a table space that uses page-level locking, then consider using row locking. By locking at the row level rather than the page level, you may be able to significantly reduce the contention with other application processes, especially if access is random.
However, if multiple applications are updating the same rows of a page, and in a different sequence, then row locking may even cause more contention than page locking. That is because with page locking, the second and subsequent applications must wait for the first application to finish before accessing that page, and they may time out. With row locking, multiple applications can access rows on the same page simultaneously, and might deadlock if they attempt to access the same set of rows.
Before using LOCKSIZE TABLESPACE or LOCKSIZE TABLE, the user must be reasonably sure that no other important application processes need concurrent access to that object. Before specifying LOCKSIZE ROW on any table, it is highly advisable to estimate whether or not the increase in locking overhead is worth the benefit of improved concurrency.
If a DB2 UDB table has both characteristics of being small and having high usage, consider using LOCKSIZE ROW, especially if there are many rows per page. Furthermore, that locking granularity should not have a large performance impact on locking overhead, since the table is small.
Grouping of DB2 UDB objects and authorizations
Make it a general practice to group tables that are logically related to the same application into one database. Ideally, each application process will reference as few DB2 UDB databases as possible. Also, try to give users different authorization ID's for working with different DB2 UDB databases. In effect, this increases the number of possible application processes, while decreasing the number of databases that each application process can access. So in general, "keep like things together", and "keep unlike things apart".
Application design considerations
Just as with initial database design, it is important to consider concurrency issues early on whenever you are developing a new DB2 UDB application. It is better to incorporate principles of concurrency before implementation into production. However, sometimes improvements to concurrency may depend upon new functionality in newer releases of DB2 UDB and the other systems with which it interfaces. In such cases, some changes to existing applications may need to be made in order to achieve the increased levels of concurrency. At other times concurrency problems may develop and grow over time to a point that substantial application modification may become necessary. Below are some items to evaluate whenever developing new or modifying existing DB2 UDB applications.
Accessing critical resources as late as possible
For instance, design and code applications so that DB2 UDB data is acquired and updated by the SQL statements as close as possible to the commit point. Taking care to do this reduces the amount of time that the data is locked, and therefore unavailable to other application processes.
Committing work ASAP
It is generally a good SQL programming technique to issue a COMMIT statement as soon as possible after reaching a point of data consistency. This practice will help to avoid unnecessary lock contentions, and should be done even in read-only applications. In a similar manner, it is recommended to issue a ROLLBACK statement as soon as an SQL failure is detected in the application. This prevents unsuccessful SQL statements from holding locks on DB2 UDB resources longer than necessary.
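As a sketch of this practice (table and column names are hypothetical), a unit of work that transfers an amount between two rows reaches its point of consistency after the second UPDATE, and that is exactly where the COMMIT belongs:

```sql
-- Hypothetical unit of work: both updates must succeed together.
UPDATE ACCOUNTS SET BALANCE = BALANCE - 100.00 WHERE ACCT_NO = 'A0000001';
UPDATE ACCOUNTS SET BALANCE = BALANCE + 100.00 WHERE ACCT_NO = 'A0000002';
COMMIT;      -- point of consistency reached; release the locks now

-- If either UPDATE returns a nonzero SQLCODE, the program logic
-- should instead back out immediately:
-- ROLLBACK;
```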
Closing cursors ASAP
Closing your cursors ASAP is another programming practice that can help to improve concurrency. By issuing the CLOSE CURSOR statement as soon as possible, the application program will release those locks and free the DB2 UDB resources that they are holding.
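An embedded SQL sketch of this pattern follows (cursor, table, and host-variable names are hypothetical); the point is simply that CLOSE is issued as soon as the result set has been processed, rather than at program end:

```sql
-- Hypothetical embedded SQL, shown without the host-language framing.
DECLARE ORDCSR CURSOR FOR
  SELECT ORDER_NO, ORDER_AMT
    FROM ORDERS
   WHERE CUST_NO = :CUSTNO;

OPEN ORDCSR;
FETCH ORDCSR INTO :ORDNO, :ORDAMT;
-- ... FETCH in a loop and process the rows ...
CLOSE ORDCSR;   -- close as soon as the result set is no longer needed
COMMIT;         -- then commit to release the locks
```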
Application retry logic
I recommend that application programs include retry logic. That is, when the program receives an indication of a deadlock or timeout situation, the application code should retry the operation, perhaps even multiple times. This makes it possible for the application to recover from the situation without any outside intervention or assistance from operations personnel. The field SQLERRD(3) in the SQLCA returns a reason code that indicates if a deadlock or timeout has occurred.
Accessing data in a consistent order
In most environments, there will be times and situations when different applications will need to access the same data. Make it a standard in your environment for applications to access data in the same sequence. For example, require that all applications access rows of table XYZ in ascending sequence. In that case, the first application to access table XYZ may delay other applications, but will not cause a deadlock situation. In similar fashion, all applications accessing tables A, B and C should do so in alphabetical order.
Increasing online concurrency
One means of reducing deadlocks and timeouts is to access DB2 UDB data in a consistent order throughout your system, as described above. A secondary approach is to use the WHERE CURRENT OF cursor option on UPDATEs and DELETEs, rather than the independent UPDATE and DELETE statements. Trying to plan batch activity for online tables during inactive or off-peak OLTP periods is a third option to consider when striving for maximum concurrency of your online DB2 UDB data.
The use of referential constraints in DB2 UDB databases can have a performance impact. A DELETE statement on a row in the parent table needs to acquire locks on the dependent tables (or at least on their indexes). Those dependent tables then have a decreased availability for concurrent use because of those locks. This impact can be significant for a cascading delete, since a single delete operation accessing a single row could require a scan of multiple rows in multiple tables.
Oftentimes you can reduce the performance impact by creating an index on the foreign key. Although it is not mandatory to have an index on the foreign key, if it doesn't exist, then when a row is deleted from the parent table, DB2 UDB must scan the entire dependent table.
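For example (object names are hypothetical), if ORDERS carries a foreign key CUST_NO referencing a parent CUSTOMER table, a single index on that column spares DB2 UDB a scan of the entire dependent table whenever a parent row is deleted:

```sql
-- Hypothetical foreign-key index: without it, deleting a CUSTOMER row
-- forces DB2 UDB to scan all of ORDERS to find dependent rows.
CREATE INDEX ORDCUSTIX ON ORDERS (CUST_NO);
```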
Another action that can potentially improve this situation has to do with grouping related tables together. If rows in a parent table are frequently deleted, consider placing all of the dependent tables (and their dependents, etc.) in the same table space.
The process of converting "source" SQL (actually a prepared DBRM) into executable code is called binding, and is in some ways similar to the compile/link-edit process for procedural languages such as COBOL. During the binding of the SQL code, there are a number of options that can be specified that can significantly affect the degree of concurrency that the application will experience. Following is a discussion of some of the most important options that influence concurrency.
The ACQUIRE parameter determines the timing DB2 UDB uses for obtaining locks on the data requested by an application. ACQUIRE(ALLOCATE) instructs DB2 UDB to acquire all required table space locks at the time the application plan is allocated, but with ACQUIRE(USE) DB2 UDB will wait until the application first issues an SQL statement for the table before acquiring those locks.
Normally you should bind plans with the ACQUIRE(USE) option (which is the default for package binds). That is usually the best choice for concurrency. The ACQUIRE(ALLOCATE) option can give better protection against deadlocks for a very high-priority job. For the occasional appropriate use of that option, you may want to consider binding all DBRM's directly to the plan.
The time that DB2 UDB releases the locks that it holds is determined by the RELEASE option. With RELEASE(COMMIT), DB2 UDB instructs IRLM to release the locks when a COMMIT point is reached, whereas with RELEASE(DEALLOCATE), DB2 UDB will hold all locks until the program is terminated and the plan is de-allocated.
While it does not necessarily hold true in all situations or circumstances, the general recommendation is to use RELEASE(COMMIT) for OLTP and other applications where the number of locks requested is relatively low and the elapsed time of the application process is fairly short. For batch and other application processes that acquire large numbers of locks and/or may execute for a longer time, RELEASE(DEALLOCATE) is normally the better performance option.
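These two recommendations might be expressed as BIND subcommands like the following sketch (plan and DBRM names are hypothetical; the subcommands would typically be issued through the DSN command processor in TSO batch):

```sql
-- Short OLTP transaction: release locks at each commit point.
BIND PLAN(OLTPPLAN) MEMBER(OLTPPGM) ACQUIRE(USE) RELEASE(COMMIT)

-- Long-running batch job holding many locks: keep them until deallocation.
BIND PLAN(BATPLAN) MEMBER(BATPGM) ACQUIRE(USE) RELEASE(DEALLOCATE)
```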
The general recommendation for the ACQUIRE/RELEASE options of the BIND is to choose a combination of values based upon the characteristics of the application. One combination, namely ACQUIRE(ALLOCATE)/RELEASE(COMMIT) is disallowed. The DB2 BIND process will flag this specification as an error. The other three possible combinations all have their proper use, and are briefly described below.
ACQUIRE(ALLOCATE)/RELEASE(DEALLOCATE) may be useful for a batch application that runs for hours and accesses various tables, because this combination effectively prevents a deadlock situation from wasting all that processing. All needed resources are locked by the application as soon as the program starts to run.
The disadvantage of this combination is that it can seriously reduce concurrency. Resources that are potentially high in demand may be locked for longer than actually needed. Also, be aware that ACQUIRE(ALLOCATE) disables selective partition locking, so that if you are accessing a table space defined with LOCKPART YES, all partitions are locked.
The ACQUIRE(USE)/RELEASE(DEALLOCATE) combination often results in the most efficient use of processing time. DB2 UDB resources are only locked if and when they are needed by the application. Locking overhead is reduced, because the locks that are acquired are then held until the application terminates.
The disadvantage of this combination is that it may result in an increased frequency of deadlock situations when compared to ACQUIRE(ALLOCATE). Locks are acquired in a sequence that is predictable only during an actual application run. Therefore, more deadlocks may occur.
ACQUIRE(USE)/RELEASE(COMMIT) happens to be the default combination for BIND, and provides the best concurrency. DB2 UDB resources are locked only when needed by the application, which can be very important if the application contains many SQL statements that are seldom used.
However, the tradeoff is that there may be an increase in processing time if the application commits fairly frequently, and then the application has to re-acquire locks again. And as with the above combination ACQUIRE(USE)/RELEASE(DEALLOCATE), deadlocks may increase, since all locks are acquired in a sequence that is predictable only during an actual program execution.
The ISOLATION parameter determines the degree of protection or isolation that DB2 UDB maintains for the data used by an application program against other applications within the DB2 UDB system. The four options, listed in order of increasing preference from a strictly concurrency perspective, are Repeatable Read (RR), Read Stability (RS), Cursor Stability (CS), and Uncommitted Read (UR).
With Repeatable Read (RR), a row or page lock is held for all rows accessed by the application (whether qualifying or not) at least until the next commit point is reached. DB2 UDB ensures that:
- Your application does not read a row that another application process has changed until that process has released that row, and
- Other application processes do not change a row that your application reads until that application commits or terminates.
With Read Stability (RS), a row or page lock is held for all qualifying rows returned to the application, at least until the next commit point. DB2 UDB ensures that:
- Your application does not read a row that another application process has changed until that process has released that row, and
- Other application processes do not change a row that satisfies your application's search criteria (as determined by the WHERE clause) until that application commits or terminates. It does allow other application processes to either INSERT a row or UPDATE an existing row that did not originally satisfy the search criteria.
With Cursor Stability (CS), a row or page lock is held only until the cursor moves to another row or page. Therefore DB2 UDB:
- Ensures that your application does not read a row that another application process has changed until that process has released that row, but
- Does allow other application processes to change a row that your application reads before your application commits or terminates.
With Uncommitted Read (UR), DB2 UDB does not acquire any row or page locks (the exception being for LOB data), and can run concurrently with most other operations. This is commonly referred to as "dirty read" within the I/T industry. With this isolation level, DB2 UDB allows:
- Other application processes to change any row that your application may read during a unit of work.
- Your application to read any row that another application process has changed, even if that change has not been committed.
The CURRENTDATA option applies only to ISOLATION(CS). For read-only and ambiguous cursors in applications bound with Cursor Stability, this option determines whether or not data currency is required. In other words, it tells DB2 UDB whether or not the data upon which the cursor is positioned must remain identical to (or "current with") the data in the base table. And to further clarify, an ambiguous cursor is one for which DB2 UDB cannot determine whether it will be used for update or read-only purposes.
CURRENTDATA(YES) specifies that data currency is required, so that DB2 UDB will acquire page or row locks to ensure this. CURRENTDATA(NO) specifies that data currency is not required, so DB2 UDB will not acquire those locks.
Overriding isolation (the WITH clause)
DB2 UDB provides the capability for an SQL statement within an application to specify the isolation level with which the plan or package is bound. This effectively overrides the isolation level for the plan or package, but only for the statement in which it appears. The WITH clause can be used in the following types of SQL DML statements:
- SELECTs (including the SELECT of a cursor declaration)
- "searched" UPDATEs and DELETEs
- INSERTs from subselects
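For instance (table name hypothetical), a reporting query that can tolerate uncommitted data might override the plan's isolation level for that one statement:

```sql
-- Approximate row count for a dashboard: dirty read is acceptable here,
-- so override the plan's isolation for this statement only.
SELECT COUNT(*)
  FROM ORDERS
  WITH UR;
```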
The proper ISOLATION option needs to be determined by the business logic and data integrity requirements of the application. This decision will require a basic understanding of the application needs and associated business rules.
In general, you should use the least restrictive isolation that still maintains the data integrity requirements of the application. The less restrictive the isolation level, the greater the data concurrency, and therefore the better the performance when accessing DB2 UDB data. More often than not, ISOLATION(CS) with CURRENTDATA(NO) will be a good starting point. ISOLATION(CS) allows DB2 UDB to release acquired locks as soon as possible, and CURRENTDATA(NO) allows DB2 UDB to avoid acquiring locks as often as possible.
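That starting point might look like the following BIND subcommand sketch (collection and DBRM names are hypothetical):

```sql
-- Hypothetical package bind: a common starting point for concurrency.
BIND PACKAGE(SALESCOL) MEMBER(ORDPGM) ISOLATION(CS) CURRENTDATA(NO)
```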
Uncommitted Read (UR) is very efficient and causes little or no contention, but it should be used with extreme caution, since it enables DB2 UDB to read uncommitted data. Using ISOLATION(UR) is not recommended until it is determined that the applications and end users can accept the logical data inconsistencies that UR allows.
Lock avoidance and isolation
Lock avoidance is a concurrency enhancement introduced in DB2 V3. It enables DB2 UDB to read a page without taking a lock, after determining that the data in that page is committed. Whether or not lock avoidance can be used is dependent upon the degree of isolation that the application is using in its SQL calls.
ISOLATION(RR) and ISOLATION(RS) disallow any lock avoidance. Any cursor with ISOLATION(CS) uses lock avoidance for non-qualifying pages or rows. Ambiguous and read-only cursors with ISOLATION(CS) and CURRENTDATA(NO) use lock avoidance for both qualifying and non-qualifying pages or rows.
Limiting locks on a table space
LOCKMAX is an option that you can specify when a tablespace is CREATEd or ALTERed. This specifies to DB2 UDB the maximum number of page or row locks that a single application can hold on the table space before the locks are escalated. This option can be put into effect for both user data and the DB2 UDB catalog.
If n represents the maximum number of page or row locks before lock escalation occurs, then specify LOCKMAX n. Lock escalation is disabled by specifying LOCKMAX 0. Specifying LOCKMAX SYSTEM sets the maximum equal to the system default, which is set by the option LOCKS PER TABLE(SPACE) on installation panel DSNTIPJ.
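For illustration, LOCKMAX can be set when the table space is created or altered (the database and table space names below are hypothetical):

```sql
-- Escalate this application's page/row locks to a
-- table space lock after 1000 locks are held
ALTER TABLESPACE MYDB.MYTS LOCKMAX 1000;

-- Disable lock escalation for this table space
ALTER TABLESPACE MYDB.MYTS LOCKMAX 0;

-- Defer to the system-wide LOCKS PER TABLE(SPACE) value
ALTER TABLESPACE MYDB.MYTS LOCKMAX SYSTEM;
```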
Limiting locks acquired by an application
DB2 UDB also provides the ability to set a maximum on the number of page or row locks that can be held by a single application. This option, LOCKS PER USER, is also on installation panel DSNTIPJ. If an application process that already holds the maximum number of locks requests another page or row lock from the IRLM, DB2 UDB returns an error code in the application's SQLCA. The requested lock cannot be acquired until some of the existing locks are released.
The default LOCKS PER USER is 10,000. This number is normally adequate for most installations for the majority of workloads when using page locks. For row-level locking on very large tables, or for LOBs, it may be necessary to specify a higher value. Alternatively, you may want to review these kinds of applications that may require a higher value, and analyze whether or not they could use tablespace locks rather than page, row, or LOB locks.
System and installation considerations
In addition to the suggestions made above regarding database and application design, here are some recommendations and considerations for installation options and the systems environment:
DB2 UDB catalog contention
Certain SQL DDL (data definition language), SQL DML (data manipulation language), and SQL DCL (data control language) statements acquire locks on the DB2 UDB catalog. If multiple application processes are issuing these statements, catalog contention becomes a real possibility.
I recommend that you try to minimize the concurrent use of statements that update the same DB2 UDB catalog table spaces. In general, set standards as to when such SQL should be submitted, and/or limit the number of authorized user ID's that can update the DB2 UDB catalog. For a detailed description of the DDL, DML and DCL that may cause catalog contention, refer to the DB2 UDB Administration Guide.
The IRLM should be given a very high MVS dispatching priority, so that it can service lock requests as soon as possible. IBM generally recommends that IRLM's priority should be just under that of VTAM, but higher than that of the other DB2-related address spaces.
The CROSS MEMORY entry on panel DSNTIPJ determines whether IRLM stores its lock control blocks in ECSA (extended common service area) or within its own private address space.
Specifying NO (PC=NO, which means the cross-address-space program call is not used) places the lock structure into ECSA and requires less CPU time, but it can reduce the amount of storage available to other MVS address spaces that use ECSA. With this option, another parameter comes into play as well: MAXIMUM ECSA, which determines the maximum amount of ECSA that IRLM can use for its lock structure. It is important to set this value high enough to prevent IRLM from reaching the limit. IRLM obtains storage only as needed, so it is better to specify a larger value than you estimate will be required.
By specifying YES, the lock control block structure is stored in the IRLM private address space, and the program call instruction (PC=YES) is used to address the structure. With PC=YES, the MAXIMUM ECSA option is ignored.
Current experience with DB2 UDB V7 has shown that PC=YES is the recommended setting, because the impact of virtual address space constraint is deemed to be more significant than the additional CPU cycles consumed by choosing this option. As of DB2 UDB V8, the PC and MAXIMUM ECSA keywords are no longer used (V8 automatically implements PC=YES).
MVS system load and swapping
If your MVS system resources are strained because of excessive workload, the increased contention for CPU cycles, I/O processing and storage can cause a task to go into a wait state until the required resources are available.
It is also advisable to reduce the amount of "swapping" in your MVS system as much as practical. When MVS paging rates become excessive, an address space may be "swapped out"; that is, all of the virtual storage in the address space is moved out to disk. Whenever a task is swapped out or waiting for other system resources, and its unit of work has not been committed, that task continues to hold locks on DB2 UDB objects, which impacts concurrency.
The DEADLOCK TIME parameter specifies the length of the interval that DB2 UDB uses to scan the system for deadlocked application processes. This is specified on installation panel DSNTIPJ, and the default is five seconds.
As would be expected, the determination of the optimum value for deadlock detection is dependent upon the characteristics and magnitude of your DB2 UDB application workload. Since deadlock detection can cause latch suspensions, consider using the default value if you are not experiencing deadlocks in your system. Having deadlock detection run less frequently means a little less processing time taken by DB2 UDB and IRLM. But if you are experiencing deadlocks to some degree, you should lower this value to one second (you want to discover a deadlock situation as soon as possible).
The RESOURCE TIMEOUT parameter is specified on installation panel DSNTIPI, and stipulates the minimum number of seconds before DB2 UDB can time out a suspended request. The default value is 60 seconds, which is probably a good starting point for a new DB2 UDB installation. As the performance characteristics of your applications develop and change over time, you will want to evaluate whether the value of this parameter is effective in maintaining good concurrency. A smaller value may cause a larger number of timeouts. With a larger value, suspended applications will more often resume normally, but they remain inactive for longer periods.
Furthermore, here is a hypothetical scenario that demonstrates how reducing this value may actually reduce timeouts, even though the idea is initially counter-intuitive. Suppose the RESOURCE TIMEOUT is 60 seconds.
- Transaction A gets a lock on page 1 but then goes into a wait state because it requests page 2, which is currently locked by transaction X.
- Transaction B starts 1 second later, gets a lock on page 3, but then goes into a wait state because it requests page 1.
- Transaction C starts 1 second later, gets a lock on page 4, but then goes into a wait state because it requests page 1 as well.
- One second later transactions D and E request pages 3 and 4 respectively, and go into a wait state.
- Transaction A waits for 59 seconds, then finally gets access to page 2, but then continues to hold its lock on page 1 for 5 more seconds before it finishes processing.
- As a result, all four other transactions (B through E) time out.
If the RESOURCE TIMEOUT was 30 seconds instead, transaction A would have timed out before ever getting page 2, but that would then have freed up page 1 so that transactions B through E could execute. So instead of having four transactions time out, you would have only one.
Disabling inactive applications
On installation panel DSNTIPR there is an IDLE THREAD TIMEOUT option. This specifies the period of time for which an active DB2 UDB distributed thread can hold locks without doing any processing. After that period the DB2 UDB scanning process detects that the thread has been idle for the specified time, so the thread is cancelled by DB2 UDB.
The RELEASE LOCKS parameter is located on installation panel DSNTIP4, and controls whether or not page and row locks are released at commit time for a cursor defined as WITH HOLD. The default is YES, which means that DB2 UDB releases the page or row lock after a COMMIT is issued. You should use the default as much as possible, as this will improve concurrency.
If the user specifies NO, then DB2 UDB will hold the page or row locks for WITH HOLD cursors after the COMMIT is issued. This option is provided so that existing applications that rely on this lock can continue to work as before.
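A cursor of the kind affected by RELEASE LOCKS looks like the following sketch (table and column names are hypothetical); with the default RELEASE LOCKS YES, its page or row locks are freed at commit even though the cursor position is maintained:

```sql
DECLARE CUR1 CURSOR WITH HOLD FOR
  SELECT COL1, COL2
    FROM MYTABLE
   WHERE COL2 > 100
   FOR UPDATE OF COL1;
```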
In addition to the numerous items mentioned above concerning database and application design, there are some things related to daily DB2 UDB operations that can impact data concurrency levels, especially in the area of DB2 UDB utilities.
Improved Concurrency for Utilities
A number of DB2 UDB utilities benefit from improved data concurrency among partitions, made possible by partition independence for NPIs (non-partitioned indexes). To realize this improvement, you need to run the utilities in parallel by partition, which may require new operational and organizational considerations because of the concurrently executing utilities.
Both the LOAD and REORG utilities can benefit significantly by using this technique. However, partition independence does not benefit the COPY and MODIFY utilities, since they both work at the data level.
Prior to Version 5 of DB2 UDB, access to data being processed by the REORG utility was much more restricted. During the unload phase of REORG, other applications were limited to read-only access to the table space, and during the reload phase, no access at all was possible. With the REORG utility enhancements that became available with DB2 UDB V5, data availability was significantly improved through the specification of the SHRLEVEL option. This is referred to as "online" REORG.
By specifying SHRLEVEL REFERENCE in the REORG utility job, other applications can have read-only access to the data for almost the entire duration of the job. Specifying SHRLEVEL CHANGE in the REORG allows other applications to have read/write access to the data for almost the entire process of reorganization.
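As a sketch, an online REORG allowing read-only access during most of the job could be invoked with a control statement like this (object names are hypothetical):

```
REORG TABLESPACE MYDB.MYTS
  SHRLEVEL REFERENCE
```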
Just before the completion of the utility, there is a very brief period where data access is denied. Most customers exploit the online REORG regularly in order to make critical business data concurrently available to other application processes while at the same time performing necessary data maintenance. With the online REORG utility DB2 UDB has provided users with a significant advance toward virtually continuous availability.
Reducing deadlocks during REORG
When invoking online REORG, you might experience deadlocks. If that happens, you should run the REORG utility with the DRAIN ALL option. The default, DRAIN WRITERS, drains only the writers during the log phase. DRAIN ALL instructs DB2 UDB to drain both writers and readers as soon as the MAXRO threshold is reached. This approach should especially be considered in environments that experience heavy update activity during the log phase. When DRAIN ALL is specified, there is no need for a subsequent drain during the switch phase of online REORG.
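As a sketch (object names are hypothetical; in DB2 UDB V7, SHRLEVEL CHANGE also requires a mapping table), the following control statement drains both readers and writers once the remaining log can be applied within 30 seconds:

```
REORG TABLESPACE MYDB.MYTS
  SHRLEVEL CHANGE
  MAPPINGTABLE MYID.MAPTBL
  MAXRO 30
  DRAIN ALL
```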
You should have an awareness of the level of data concurrency that your DB2 UDB system is experiencing. Flaws in application or database design, constrained hardware resources, increased transaction volumes, database growth and other factors can contribute to degradations in performance and concurrency.
First establish a "baseline" - in other words, what is normal for your applications and databases in your environment. Monitoring volumes as well as exception or error conditions becomes the key to identifying and resolving concurrency problems.
Displaying real-time lock information
By including the LOCKS option when issuing the DISPLAY DATABASE command, DB2 UDB will display much useful information about what locks are being held or waiting at a particular moment in time. For the DB2 UDB database (or portion thereof) that is being DISPLAYed, this information includes:
- The transaction locks for all tables, tablespaces, partitions and indexspaces, and the mode and duration of those locks
- The applications and DB2 subsystems that are currently holding, waiting for, or retaining locks
- The authorization ID's, correlation ID's and connection ID's for all applications allocated to the database
- Drain lock information about logical partitions and other resources that are held by utility jobs currently executing
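For example, the following command (with a hypothetical database name) displays lock information for all table spaces and index spaces in the database, up to a limit of 100 messages:

```
-DISPLAY DATABASE(MYDB) SPACENAM(*) LOCKS LIMIT(100)
```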
Determining lock modes
If you issue the EXPLAIN statement, or if you include the EXPLAIN option in the BIND and REBIND commands, you can determine the lock mode (Share, eXclusive, etc.) of table and table space locks that will be initially assigned by DB2 UDB for a particular SQL statement. The output of an EXPLAIN is stored in a PLAN_TABLE, which can then be queried by the user. The initial lock mode for the table space is contained in the TSLOCKMODE column of the PLAN_TABLE.
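As a sketch (the query number and table names are arbitrary), you could capture and inspect the lock mode like this:

```sql
EXPLAIN PLAN SET QUERYNO = 100 FOR
  SELECT COL1 FROM MYTABLE WHERE COL2 = 5;

SELECT QUERYNO, TNAME, TSLOCKMODE
  FROM PLAN_TABLE
 WHERE QUERYNO = 100;
```

The PLAN_TABLE queried here is the one owned by the authorization ID that ran the EXPLAIN.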
Tracking lock information
In order to maximize data concurrency in your environment, answering the following questions can be helpful in identifying patterns, trends, and problems:
- Which applications contend for resources?
- For what tables and/or tablespaces is there contention?
- What are the types and frequency of the contentions?
- What is the impact of those contentions?
You should run the Statistics and Accounting traces in order to record important information about the performance characteristics of DB2 UDB applications, as well as the overall DB2 UDB system. These traces should run continuously, so that you may periodically create reports in order to view and analyze the data. This reporting function can be done with IBM's DB2 Performance Expert (DB2/PE), or a similar program.
The Statistics trace records the number of lock and unlock requests, claim and drain requests, lock suspensions, deadlocks, timeouts and lock escalations that happened during the reported interval. The Accounting trace tracks the same kind of information at an application (rather than system) level. Additionally, the traces show the maximum number of concurrent page locks held and acquired during the time interval.
From a concurrency standpoint, you should review these reports to help identify the following conditions:
- Lock escalations are normally considered to be undesirable and are typically caused by applications that hold large numbers of page, row and LOB locks. For such cases, it may be possible to improve concurrency and overall system performance by using table or tablespace locks initially, and thereby avoid the escalation condition
- Timeouts can be caused by a RESOURCE TIMEOUT value that is non-optimal for your environment. If there is a high number of timeouts, it may be that the value is too low. A higher RESOURCE TIMEOUT value may give more applications a better chance to complete their work. But closely monitor any change that you make, since some conditions may be improved by reducing this value (see the discussion above under the sub-heading RESOURCE TIMEOUT).
- Deadlock conditions can be a major hindrance to performance and concurrency. Usually these should be investigated further, in order to determine both what may be causing them and the possible solutions.
- Some applications consistently hold a large number of locks. Review these applications (and the underlying database design) in order to see if this number can be lowered.
Ideally an installation strives to have "zero" counts for deadlocks and timeouts, and at most a small number of lock escalations. If any of these three symptoms grow past a limited handful of incidents, it is probably time to tune your locking.
In regards to lock escalations, one Rule-of-Thumb claims that if more than one fourth of your lock escalations ultimately result in deadlocks or timeouts, then escalation is not working well in your environment (see the discussion of Lock Escalation above). Consider increasing LOCKMAX or LOCKS PER TABLE(SPACE). Another possible solution is to have the affected applications issue a LOCK TABLE when they initialize.
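For example, an application that would otherwise escalate could acquire the table space lock up front (the table name is hypothetical):

```sql
-- Take a single exclusive lock instead of many page/row locks
LOCK TABLE MYTABLE IN EXCLUSIVE MODE;

-- Or, if the application only reads the table:
LOCK TABLE MYTABLE IN SHARE MODE;
```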
Also, monitor the total numbers of lock/unlock requests, suspensions and claim/drain requests. Watching those volumes for trends (expected or otherwise) is key to maintaining high levels of data concurrency in a DB2 UDB environment.
The probability that a particular page or row is locked can vary greatly, depending upon the database design, application code, transaction volumes, available system resources such as CPU, memory, I/O, etc., and many other factors. As a general Rule-of-Thumb, a page or row is considered "hot" if it is locked more than 10% of the time. Therefore 10% can be viewed as a reasonable alarm limit. Less than 10% would generally be considered acceptable, while greater than 10% could imply excessive contention and should probably be investigated.
In most environments the average duration of a lock is less than one second. If your installation is experiencing average lock durations greater than this, it probably merits further analysis.
Data sharing considerations
The following paragraphs highlight information that should be reviewed if you are running a DB2 UDB Data Sharing environment. Some of the discussion is pertinent only to data sharing, while some of the material may be applicable to DB2 UDB in general (and may have even been mentioned before), but is especially important to consider in a data sharing environment.
Sequential batch INSERT planning
Any batch application that performs INSERTs in sequential key order can cause heavy contention on the space map pages of the table space. For such applications, consider using the MEMBER CLUSTER parameter when creating the table spaces involved. With this option, DB2 UDB does not consider the clustering index when assigning space for SQL INSERT statements. Instead, DB2 UDB places the inserted rows based upon available space, which means that many different space map pages will be accessed rather than one or just a few. That should result in a considerable reduction in contention for the space map pages.
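As a sketch (the object names and partition count are hypothetical), MEMBER CLUSTER is specified at table space creation:

```sql
CREATE TABLESPACE MYTS IN MYDB
  NUMPARTS 4
  MEMBER CLUSTER
  LOCKSIZE PAGE;
```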
Selective partition locking
DB2 UDB V5 introduced selective partition locking, which means that only the partitions being accessed are locked. This is accomplished by specifying the LOCKPART YES option in the CREATE or ALTER TABLESPACE statements.
The main usage of this feature is to allow DB2 to support explicit hierarchy locking at the partition level in a data sharing environment. Selective partition locking provides true partition independence for L-locks, thus avoiding IDRWI (inter-DB2 read/write interest) when each member of a data sharing group is processing a different partition. Before DB2 UDB V5, parent L-locking for table spaces was at the table space level, even for partitioned table spaces.
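For example (with hypothetical object names), selective partition locking is enabled with:

```sql
ALTER TABLESPACE MYDB.MYTS LOCKPART YES;
```

Note that LOCKPART YES cannot be combined with LOCKSIZE TABLESPACE, since locking the entire table space defeats the purpose of locking individual partitions.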
In a DB2 data sharing group, the following practices are recommended in order to maximize lock avoidance and thus improve concurrency:
- Frequent commits in all application processes running in any of the members in the group.
- Quiescing of inactive members within the group.
- Quick restart of failed members in order to resolve in-doubt transactions.
- Using CURRENTDATA(NO) for plans and packages whenever possible.
For data sharing systems, LOCKSIZE PAGE is highly recommended. Before using LOCKSIZE ROW the user should carefully analyze the database design and the nature of the application processes that will be accessing that data. Data sharing DB2 UDB systems incur additional overhead with row locking because of page physical lock (P-lock) contention and negotiation. At high transaction rates, applications tend to serialize behind the page P-lock. So use page locking as the design default.
There are three different kinds of contention in a DB2 data-sharing environment:
False contention - An individual DB2 UDB member calculates a hash value, which is used to determine to which lock table entry a resource will be assigned. The lock table is stored in the coupling facility. This hashing value is calculated by using the name of the resource and other relevant information that uniquely defines the resource to be locked. Although a particular resource will always yield one (and only one) value to determine its place in the lock table, it is very possible that more than one resource will hash to the same lock table entry. When multiple resources have their lock requests hash to the same lock table entry, that condition is called false contention.
XES contention - The lock manager in the coupling facility and XES recognize only two types of lock modes - share (S) mode and exclusive (X) mode. However, IRLM supports a number of other lock modes, such as intent-share (IS), and intent-exclusive (IX) and share with intent-exclusive (SIX). This situation causes the lock mode to be abbreviated when it is passed from IRLM to XES. If XES then tries to register a lock that is incompatible with an existing lock, XES must check with IRLM in order to determine more precisely the type of lock required, and whether or not the lock request is indeed compatible. If this investigation determines that the locks are actually compatible, then it is only XES contention. If the locks are indeed incompatible, then the contention is real.
Real contention - This is the "traditional" contention that is caused by normal IRLM lock incompatibility between two applications, which also happens in non-data sharing environments.
A commonly recommended rule of thumb suggests that less than 2% of all lock requests should result in contention (real or false). Application processes causing greater than 2% contention should be investigated further, in order to determine if the contentions are real or false.
The size of the lock structure can influence the amount of false contention that a data sharing environment experiences. If there are more lock table entries, then there is less chance of two different resources "hashing" to the same entry. Therefore false contention can usually be reduced by increasing the size of the lock structure in the coupling facility. Expanding the lock structure is recommended if more than half of your total contention is false contention.
If most of the contention is "real", then the familiar principles of database and application design should be evaluated for their applicability to the situation. The following actions are often helpful in reducing real contention:
- Using a consistent order for table accesses and updates
- Optimizing transactions in order to reduce overall elapsed time
- Increasing PCTFREE and running the REORG utility frequently
- Delaying updates until just before committing
- Evaluating whether or not to use LOCKSIZE ROW