A buffered insert is an insert statement that takes advantage of
table queues to buffer the rows being inserted, thereby gaining a
significant performance improvement. To use a buffered insert, an
application must be prepared or bound with the INSERT BUF option.
Buffered inserts can result in substantial performance improvement
in applications that perform inserts. Typically, you can use a buffered
insert in applications where a single insert statement (and no other
database modification statement) is used within a loop to insert many
rows and where the source of the data is a VALUES clause in the INSERT
statement. Typically the INSERT statement is referencing one or more
host variables that change their values during successive executions
of the loop. The VALUES clause can specify a single row or multiple
rows.
Typical decision support applications require the loading and periodic
insertion of new data. This data could be hundreds of thousands of
rows. You can prepare and bind applications to use buffered inserts
when loading tables.
To cause an application to use buffered inserts, use the PREP command
to process the application program source file, or use the BIND command
on the resulting bind file. In both situations, you must specify the
INSERT BUF option.
Note: Buffered inserts cause the following steps to occur:
- The database
manager opens
one 4 KB buffer for each database partition on which the table resides.
- The INSERT statement with the VALUES clause issued by the application
causes the row (or rows) to be placed into the appropriate buffer
(or buffers).
- The database
manager returns
control to the application.
- The rows in the buffer are sent to the partition when the buffer
becomes full, or an event occurs that causes the rows in a partially
filled buffer to be sent. A partially filled buffer is flushed when
one of the following occurs:
- The application issues a COMMIT (implicitly or explicitly through
application termination) or ROLLBACK.
- The application issues another statement that causes a savepoint
to be taken. OPEN, FETCH, and CLOSE cursor statements do not cause
a savepoint to be taken, nor do they close an open buffered insert.
The following SQL statements will close an open buffered insert:
- BEGIN COMPOUND SQL
- COMMIT
- DDL
- DELETE
- END COMPOUND SQL
- EXECUTE IMMEDIATE
- GRANT
- INSERT to a different table
- OPEN CURSOR for a full-select of a data change statement
- PREPARE of the same dynamic statement (by name) doing buffered
inserts
- REDISTRIBUTE DATABASE PARTITION GROUP
- RELEASE SAVEPOINT
- REORG
- REVOKE
- ROLLBACK
- ROLLBACK TO SAVEPOINT
- RUNSTATS
- SAVEPOINT
- SELECT INTO
- UPDATE
- Execution of any other statement, but not another (looping) execution
of the buffered INSERT
- End of application
The following APIs will close an open buffered insert:
- BIND (API)
- REBIND (API)
- RUNSTATS (API)
- REORG (API)
- REDISTRIBUTE (API)
In any of these situations where another statement closes
the buffered insert, the coordinator partition waits until every database
partition receives the buffers and the rows are inserted. It then
executes the other statement (the one closing the buffered insert),
provided all the rows were successfully inserted.
The standard interface in a partitioned database environment, (without
a buffered insert) loads one row at a time doing the following steps
(assuming that the application is running locally on one of the database
partitions):
- The coordinator partition passes the row to the database
manager that
is on the same partition.
- The database
manager uses
indirect hashing to determine the database partition where the row
should be placed:
- The target partition receives the row.
- The target partition inserts the row locally.
- The target partition sends a response to the coordinator partition.
- The coordinator partition receives the response from the target
partition.
- The coordinator partition gives the response to the application.
The
insertion is not committed until the application issues a COMMIT.
- Any INSERT statement containing the VALUES clause is a candidate
for buffered insert, regardless of the number of rows or the type
of elements in the rows. That is, the elements can be constants, special
registers, host variables, expressions, functions and so on.
For a given INSERT statement with the VALUES clause, the DB2® SQL
compiler might not buffer the insert based on semantic, performance,
or implementation considerations. If you prepare or bind your application
with the INSERT BUF option, ensure that it is not dependent on a buffered
insert. This means:
- Errors can be reported asynchronously for buffered inserts, or
synchronously for regular inserts. If reported asynchronously, an
insert error might be reported on a subsequent insert within the buffer,
or on the other statement that closes the buffer. The statement
that reports the error is not executed. For example, consider using
a COMMIT statement to close a buffered insert loop. The commit reports
an SQLCODE -803 (SQLSTATE 23505) due to a duplicate key from an earlier
insert. In this scenario, the commit is not executed. If you want
your application to really commit, for example, some updates that
are performed before it enters the buffered insert loop, you must
reissue the COMMIT statement.
- Rows inserted can be immediately visible through a SELECT statement
using a cursor without a buffered insert. With a buffered insert,
the rows will not be immediately visible. Do not write your application
to depend on these cursor-selected rows if you precompile or bind
it with the INSERT BUF option.
Buffered inserts result in the following performance advantages:
- Only one message is sent from the target partition to the coordinator
partition for each buffer received by the target partition.
- A buffer can contain a large number of rows, especially if the
rows are small.
- Parallel processing occurs as insertions are being done across
partitions while the coordinator partition is receiving new rows.
An application that is bound with INSERT BUF should be written
so that the same INSERT statement with VALUES clause is iterated repeatedly
before any statement or API that closes a buffered insert is issued.
Note: You should do periodic commits to prevent the buffered inserts
from filling the transaction log.