When you have lots of concurrent activity on your database, there are many bottlenecks that could slow the performance of the database engine. The most common bottlenecks that I see are:
- Locking and serialization conflicts.
- A "hot" disk drive; that is, when the most popular rows accessed by applications happen to all be physically stored on the same disk drive.
The purpose of this article, however, is to examine a less common source for bottlenecks: storage allocation for tables.
A case for preallocating storage
One of the attributes that has made DB2® UDB for iSeries® easy to use and manage over the years is that DB2 automatically handles all of the low-level storage allocation for the DB2 objects. All you have to do is create the table, and let DB2 automatically handle allocating disk storage and spreading the table data evenly over the disk drives. This automatic storage allocation and management works fine for the majority of iSeries and AS/400e® customers. However, there is one environment in which performance can be improved by preallocating the storage for a DB2 table.
An example of an environment that can benefit from preallocated storage is when you have a table in which new rows are being inserted at high rates by multiples connections and jobs -- especially in batch processing environments. Consider preallocating storage for the rows in that table when you have a rough idea on what the maximum row count for the table will be. The reason for preallocating the table storage is so that DB2 can insert new rows into the table during heavy, concurrent loads without having to periodically allocate storage for new rows, which interruprts the insertion process. If many connections and jobs are allocating new space for a DB2 table at the same time, then the queue for storage allocation can quickly become a bottleneck.
After you've identified a table that has a high-velocity insert rate in concurrent environment, and after you determine approximately what the maximum row count will be, then you can use the CHGPF OS/400® system command to preallocate storage for the table. The good news is that there's a command for preallocating the disk storage; the bad news for SQL-based applications is that there is not an SQL interface available for allocating the table storage. However, Figure 1 shows an SQL stored procedure, which I call allocate_rows, that can be invoked from an SQL script after you've created the table to preallocate row storage for the newly created table or to allocate additional row storage for an existing table.
The allocate_rows stored procedure
The allocate_rows stored procedure requires three input parameter names:
- tablename, which is the name of the table for which storage is to be allocated.
- libname, which is the schema or collection name.
- rowcount, which is the number of rows for which storage is to be allocated.
The stored procedure uses these parameters to dynamically construct a CHGPF request for the specified tables. Here's an example of the command that is constructed:
CHGPF FILE(MYSCHEMA/MYTABLE) SIZE(50000 1000 3) ALLOCATE(*YES) |
This command allocates storage for 50000 rows in the specified table, MYTABLE. The SIZE parameter contains 3 different numeric values:
- The first numeric value (50000) is the number of rows initially allocated for the table.
- The second parameter contains the number of rows that will automatically be added to the table if the initial row allocation is exceeded.
- The third value is the number of times that additional rows will be allocated.
In the example shown above, storage for 50000 rows is allocated initially, and the table can support a maximum of 53000 rows (50000 + (3 x 1000)), with 1000 rows being allocated each time the row allocation limit is reached, up to 3 times. This is the main reason why you can only preallocate table storage effectively for those tables in which there is some idea of the maximum number of rows.
The values for the number of additional rows and the number of times that additional rows can be allocated are set to 1000 and 3 in the stored procedure shown in Figure 1, but you can change those values to meet your requirements.
The allocate_rows stored procedure uses a system-provided stored procedure, QCMDEXC, to execute the CHGPF system command from an SQL procedure. The QCMDEXC stored procedure requires two input parameters:
- A character string that contains the system command.
- The character length of that system command.
After the CHGPF command has been executed via the QCMDEXC stored procedure, then further action is required to actually activate the specified change. The stored procedure uses either the RGZPFM or CLRPFM system command on the specified table, depending on whether the table already contains rows. The CLRPFM command deletes all of the rows in the specified table, which is why the stored procedure first runs a query to check if there are any rows in the input table. If the table already contains data, then the RGZPFM command is used to reorganize the table to activate the change in allocation. The RGZPFM command can take a very long time to run for an existing table with a large number of rows, so use this stored procedure carefully with tables that already contain large amounts of data.
Figure 1. SQL procedure to preallocate storage
create procedure
allocate_rows (tablename varchar(256), libname char(10), rowcnt integer )
language sql
begin
DECLARE qualified_name VARCHAR(20);
DECLARE current_count INTEGER;
DECLARE count_stmt VARCHAR(64);
DECLARE change_cmd VARCHAR(128);
DECLARE activate_tblchg_cmd VARCHAR(64) ;
DECLARE cmd_length NUMERIC(15,5);
DECLARE c1 CURSOR FOR s1;
SELECT strip(system_table_schema) || '/' || strip(system_table_name)
INTO qualified_name
FROM qsys2.systables
WHERE table_name=tablename and table_schema=libname;
/* Execute the CHGPF command to change the table allocation */
SET change_cmd = 'CHGPF FILE('|| qualified_name || ') ' ||
'SIZE(' || CHAR(rowcnt) || '1000 3) ALLOCATE(*YES)';
SET cmd_length=LENGTH(change_cmd);
Call QSYS.QCMDEXC (change_cmd,cmd_length);
/* Need to determine the number of rows in the table to activate
this new table allocation */
SET count_stmt = 'SELECT COUNT(*) FROM ' || libname || '.' || tablename ;
PREPARE S1 FROM count_stmt;
OPEN c1;
FETCH c1 INTO current_count;
CLOSE c1;
/* Use the current number of rows to determine the "activation"command */
IF current_count=0 THEN
/* No rows in the table means a clear operation can be used to get the
new ALLOCATE settings to take effect */
SET activate_tblchg_cmd='CLRPFM ' || qualified_name;
ELSE
/* If there are rows, they cannot be deleted so reorganize instead */
SET activate_tblchg_cmd='RGZPFM ' || qualified_name;
END IF;
SET cmd_length = LENGTH (activate_tblchg_cmd);
/* Activate the new table allocation values by issuing a clear or reorganize */
Call QSYS.QCMDEXC (activate_tblchg_cmd,cmd_length);
END;
|
Changing node size for indexes and keyed logical files
Another low-level change that can improve performance in high-concurrency environment is to change the node size in the underlying tree used by DB2 to implement indexes and keyed logical files. By default, SQL indexes are created with the larger node size that improves concurrency. However, there are many older keyed logical files with the smaller node size. Again, this change cannot be done with SQL; the change must be done with the CHGLF command. This same stored procedure design could be reused to build a stored procedure that issues the CHGLF command to change the node structure for an existing index. Here's an example of the command needed to update an index to use the larger node size that's better suited for concurrent environments.
CHGLF FILE(MYSCHEMA/MYINDEX) FRCRBDAP(*YES) ACCPTHSIZ(*MAX1TB) |
I've described the situation in which performance can be improved by preallocating storage, namely, high-velocity inserts in a concurrent environment. I've also shown you how you can create an SQL stored procedure to do this task when an OS/400 command interface cannot be used.

Kent Milligan is a DB2 UDB Technology Specialist in PartnerWorld for Developers, iSeries. Kent spent the first eight years of his IBM career as a member of the DB2 development group in Rochester. He speaks and writes regularly on various iSeries and AS/400e relational database topics. You can reach him at kmill@us.ibm.com.




