The LOAD utility is another method to
populate tables with data. Formatted pages are written directly into
the database. This mechanism allows more efficient data movement than
the IMPORT utility. However, some
operations, such as referential or table constraint checking and
trigger invocation, are not performed by the LOAD utility.
Following is the core of the LOAD command syntax;
other options and modifiers are supported and will be introduced later
in this section. To successfully execute this command, you must have
SYSADM, DBADM, or LOAD authority, or INSERT and/or DELETE privileges
on the table involved in the load. To load data into a table that has
protected columns, you must have LBAC credentials that allow write
access to all protected columns in the table. To load data into a
table that has protected rows, you must have been granted a security
label for write access that is part of the security policy protecting
the table.
LOAD FROM input_source OF input_type MESSAGES message_file
  [ INSERT | REPLACE | TERMINATE | RESTART ] INTO target_tablename
An example of using CURSOR as the load input is shown here:
DECLARE mycursor CURSOR FOR SELECT col1, col2, col3 FROM tab1;
LOAD FROM mycursor OF CURSOR INSERT INTO newtab;
The load target must exist before the utility starts. It can be a
table, a typed table, or a table alias. Loading into tables with XML
columns, system tables, or temporary tables is not supported. Use the
MESSAGES option to capture any errors,
warnings, and informational messages during the load.
LOAD can be executed in four modes:
- INSERT mode adds input data to a table without changing the existing table data.
- REPLACE mode deletes all existing data from the table and populates it with input data.
- TERMINATE mode terminates a load operation and rolls back to the point in time at which it started. One exception is that if REPLACE was specified, the table will be truncated.
- RESTART mode is used to restart a previously interrupted load. It will automatically continue from the last consistency point. To use this mode, specify the same options as in the previous LOAD command but with RESTART. This allows the utility to find all necessary temporary files generated during the load processing. Therefore, it is important not to manually remove any temporary files generated from a load unless you are sure they are not required. Once the load completes without error, the temporary files are automatically removed. By default they are created in the current working directory; you can specify the directory where temporary files are stored with the TEMPFILES PATH option.
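As a hedged sketch (the input file staff.del, the table STAFF, and the path /db2/loadtmp are illustrative, not taken from the examples in this section), the modes and the temporary-file path might be combined like this:

LOAD FROM staff.del OF DEL MESSAGES msg.out REPLACE INTO staff
LOAD FROM staff.del OF DEL TEMPFILES PATH /db2/loadtmp MESSAGES msg.out INSERT INTO staff
LOAD FROM staff.del OF DEL TEMPFILES PATH /db2/loadtmp MESSAGES msg.out RESTART INTO staff

Note that the RESTART command repeats exactly the options of the interrupted INSERT load, including the TEMPFILES PATH, so the utility can locate its temporary files.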
A complete load process has four distinct phases:
- Load phase:
  - Loads data into the table.
  - Collects index keys and table statistics.
  - Records consistency points.
  - Places invalid data into dump files and records messages in the message file. When rows of data do not comply with the definition of the table, they are considered invalid and are rejected (not loaded into the table). Use the dumpfile modifier to specify the name and location of a file to record any rejected rows.
- Build phase:
  - Creates indices based on keys collected during the load phase.
  - If STATISTICS USE PROFILE is specified, statistics are also collected according to the profile defined for the target table. This profile must be created before the load is executed; otherwise a warning is returned and no statistics are collected.
- Delete phase:
  - Deletes rows that caused unique key violations and places them in the exception table. Beyond rows that simply do not match the definition of the target table as described above, there may be rows that pass the load phase but violate a unique constraint defined on the table. Note that only rows violating unique keys are considered bad data here; other constraints are not checked at this time. Since such rows are already loaded into the table, the LOAD utility deletes the offending rows in this phase. An exception table can be used to store the deleted rows so you can decide what to do with them after the load operation completes. If no exception table is specified, the offending rows are deleted without a trace. The exception table is discussed in more detail below.
  - Records messages in the message file.
- Index copy phase:
  - If ALLOW READ ACCESS is specified with USE TABLESPACE, index data is copied from the system temporary tablespace to the tablespace where the index should reside.
An exception table is a user-defined table that must have the same column definitions as the target table being loaded. If at least one of the columns is missing from the exception table, the offending rows are discarded. Only two additional columns can be added to the end of the table: a TIMESTAMP column to record when a row is inserted and a CLOB column to store the reason (or message) why the row is considered bad.
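For illustration (the names ACTEXP2, LOADTS, and LOADMSG are hypothetical), an exception table for the ACT table used below, including the two optional extra columns, could be created like this:

CREATE TABLE actexp2 (
    actno   SMALLINT NOT NULL,
    actkwd  CHAR(6) NOT NULL,
    actdesc VARCHAR(20) NOT NULL,
    loadts  TIMESTAMP,
    loadmsg CLOB(32K) )

The TIMESTAMP and CLOB columns must come last and in this order; the utility fills them in when it moves a rejected row into the exception table.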
Let's look at an example to illustrate the steps involved in a load process.
Listing 18. Load process example
LOAD FROM act.del OF DEL MODIFIED BY DUMPFILE=/home/db2inst1/act.dmp MESSAGES msg.out INSERT INTO act FOR EXCEPTION actexp
The ACT table definition is as follows:
Listing 19. ACT table definition
CREATE TABLE "DB2INST1"."ACT" (
    "ACTNO" SMALLINT NOT NULL,
    "ACTKWD" CHAR(6) NOT NULL,
    "ACTDESC" VARCHAR(20) NOT NULL,
    PRIMARY KEY (ACTNO) )
The table is currently populated with one row.
Figure 2. ACT table content before the load
The load input data file act.del contains three rows; the last row duplicates the row already in the ACT table and thus violates the primary key constraint.
Figure 3. act.del file content
An exception table, ACTEXP, is created with the same definition as the ACT table.
CREATE TABLE ACTEXP LIKE ACT
In the load phase, all the data from the input file is loaded into the ACT table.
In the delete phase, the duplicate row (ACTNO 100) is deleted from ACT because it violates the primary key constraint, and it is inserted into the exception table ACTEXP.
Please notice that any row in violation of a unique index or a primary key index is copied to the exception table and is not written to the dump file. The dump file, on the other hand, contains rows that cannot be loaded because they are invalid or have syntax errors.
At the end of the load, you should examine the message file, the dump file, and the exception table, then decide how to deal with the rejected rows.
Listing 20. ACT table content after the load operation is done
ACTNO  ACTKWD ACTDESC
------ ------ --------------------
    10 MANAGE MANAGE/ADVISE
    20 ECOST  ESTIMATE COST
   100 TEACH  TEACH CLASSES
And the ACTEXP table:
ACTNO  ACTKWD ACTDESC
------ ------ --------------------
   100 TEACH  TEACH CLASSES
After the load is done, the tablespace containing the loaded table will be in BACKUP PENDING state, and the table will be in SET INTEGRITY PENDING state.
Some load options and file type modifiers were introduced previously. A few more are discussed here:
- ROWCOUNT n: Allows users to load only the first n records in the input file.
- SAVECOUNT n: Establishes consistency points after every n rows are loaded. Messages are generated and recorded in the message file to indicate how many input rows were successfully loaded at the time of the save point. This option is not supported when the input source type is CURSOR.
- WARNINGCOUNT n: Stops the load after n warnings have been raised.
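A sketch combining these options (the file and table names are illustrative):

LOAD FROM staff.del OF DEL SAVECOUNT 1000 WARNINGCOUNT 50 MESSAGES msg.out INSERT INTO staff

Here a consistency point is taken every 1,000 rows, and the load stops if 50 warnings are raised.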
- INDEXING MODE [ REBUILD | INCREMENTAL | AUTOSELECT | DEFERRED ]: Indices are built during the build phase. This option specifies whether the LOAD utility is to rebuild indices or to extend them incrementally. Four modes are supported:
- REBUILD forces all indices to be rebuilt.
- INCREMENTAL extends indices with new data only.
- AUTOSELECT allows the utility to choose between REBUILD and INCREMENTAL.
- DEFERRED means index creation does not happen during the load. The indices involved are marked as needing a refresh. They will be rebuilt when the database is restarted or at the first access to the indices.
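For example (a hedged sketch; names are illustrative), to extend the existing indices with the new keys only rather than rebuilding them:

LOAD FROM staff.del OF DEL MESSAGES msg.out INSERT INTO staff INDEXING MODE INCREMENTAL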
- STATISTICS USE PROFILE: After a load is performed, previous statistics of the target table are most likely no longer valid, as a lot more data has been added. You can choose to collect statistics in the build phase according to the profile defined for the target table.
File type modifiers are specified with the MODIFIED BY clause. Here are a few you may find useful:
- fastparse — Syntax checking on loaded data is reduced to enhance performance.
- identityignore, identitymissing, and identityoverride — Used to ignore, indicate missing, or override identity column data, respectively.
- indexfreespace n, pagefreespace n, and totalfreespace n — Leaves the specified amount of free space in index and data pages.
- norowwarnings — Suppresses row warnings.
- lobsinfile — Indicates that LOB data is stored in separate files; the LOBS FROM option specifies the LOB path.
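A sketch using two of these modifiers together (names are illustrative):

LOAD FROM staff.del OF DEL MODIFIED BY fastparse norowwarnings MESSAGES msg.out INSERT INTO staff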
While a table is being loaded, it is locked by the LOAD utility with an
exclusive lock. No other access is allowed until the load completes.
This is the default behavior of the ALLOW NO ACCESS option. During
such a load, the table is in the state of LOAD IN PROGRESS. There is a
handy command that checks the status of a load operation and also
returns the table state:
LOAD QUERY TABLE table_name
You may have guessed that there is an option to allow table access:
ALLOW READ ACCESS. This option causes the table to be locked in share
mode. Readers may access the data that already exists in the table but
not the new portion. Data that is being loaded is not available until
the load is complete. This option puts the loading table in LOAD IN
PROGRESS and READ ACCESS states.
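A hedged example (names are illustrative) of a load that keeps the existing data readable:

LOAD FROM staff.del OF DEL MESSAGES msg.out INSERT INTO staff ALLOW READ ACCESS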
As mentioned, a full index can be rebuilt, or an index can be extended
with the new data during the build phase. With
ALLOW READ ACCESS, if a full index is being
rebuilt, a shadow copy of it is created. When the
LOAD utility gets to the index copy phase
(see Four phases of a load process), the target table is taken
offline, and the new index is copied into the target tablespace.
Regardless of which table access option is specified, various locks are
required for the load to process. If the target table is already
locked by some application, the LOAD utility will have to wait until
the locks are released. Instead of waiting for a lock, you can use the
LOCK WITH FORCE option in the LOAD command to force off other
applications that hold conflicting locks.
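A sketch (names are illustrative) that forces off conflicting applications instead of waiting:

LOAD FROM staff.del OF DEL MESSAGES msg.out REPLACE INTO staff ALLOW NO ACCESS LOCK WITH FORCE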
So far, we know that input data that does not comply with the target
table definition is not loaded into the table. Such data is rejected
and recorded in the message file at the load phase. In the delete
phase, the LOAD utility deletes rows that
violated any unique constraints. The offending rows are inserted into
an exception table if specified. What about other constraints that the
table might have defined, such as referential integrity and check
constraints? These constraints are not checked by the
LOAD utility. The table will be placed in
SET INTEGRITY PENDING state, which forces you to manually check data
integrity before the table can be accessed. Table state can be queried
with the LOAD QUERY command as discussed.
The column CONST_CHECKED in the system catalog table SYSCAT.TABLES
also indicates the status of each constraint defined in the table.
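To inspect those flags directly (a sketch; the ACT table from the earlier example is assumed):

SELECT tabname, status, const_checked FROM syscat.tables WHERE tabname = 'ACT'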
To manually turn off integrity checking for one or more tables, use the SET INTEGRITY command.
Examples are presented here to demonstrate some options. To check
integrity for the appended portion of the tables EMPLOYEE and STAFF, use:
SET INTEGRITY FOR employee, staff IMMEDIATE CHECKED INCREMENTAL
To bypass foreign key checking on table EMPLOYEE with the
IMMEDIATE UNCHECKED option, use:
SET INTEGRITY FOR employee FOREIGN KEY IMMEDIATE UNCHECKED
In some cases, you may want to place the target table as well as its
descendent tables with foreign key relationship in SET INTEGRITY
PENDING state after the load completes. This ensures that none of
these tables can be accessed until a manual integrity check
is performed. The load option is
SET INTEGRITY PENDING CASCADE IMMEDIATE,
which indicates that the check pending state for foreign key
constraints is immediately extended to all descendent foreign key
tables. By default, only the loaded table will be placed in check
pending state.
Because the LOAD utility writes formatted
pages into the database directly, no database logging is performed to
record the new data being loaded. If you have a recoverable database
(i.e., with LOGRETAIN and/or USEREXIT turned on), DB2 needs to ensure
that the database is still recoverable after the load completes. In
order to enforce recoverability, the tablespace where the table is
stored will be placed in BACKUP PENDING mode. This means that the
tablespace must be backed up before it can be accessed.
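One way to clear the BACKUP PENDING state is a tablespace-level backup; the database and tablespace names below are illustrative:

BACKUP DATABASE sample TABLESPACE (userspace1)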
This is the default way to make the tablespace accessible after a load
operation completes. Another method is to back up the loaded data
while the load is running with the option
COPY YES. A backup file will be created at
the end of the load.
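A hedged sketch (names and path are illustrative):

LOAD FROM staff.del OF DEL MESSAGES msg.out INSERT INTO staff COPY YES TO /db2/backups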
There is another option you may consider to avoid backing up the
tablespace right after the load completes.
NONRECOVERABLE marks the table being loaded
as unrecoverable. The associated tablespace is fully accessible after
load completes. DB2 does not stop you from querying and modifying the
table data. However, if you need to eventually restore the tablespace
and roll forward to a time that passes the
NONRECOVERABLE load operation, the loaded
table is not recoverable. The recovery process skips all the logs
associated with the table; you can only drop and recreate the table.
Therefore, it is still recommended that you back up the tablespace at
a convenient time so existing and loaded data is saved in a backup.
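A hedged sketch of such a load (names are illustrative):

LOAD FROM staff.del OF DEL MESSAGES msg.out INSERT INTO staff NONRECOVERABLE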
Data Studio provides an easy-to-use graphical interface for performing load operations. All the load options and file modifiers discussed are also available in this interface.