You can use the LOAD utility to load one or more partitions
of a partitioned table space. To improve performance when loading
more than one partition, consider enabling partition parallelism.
About this task
Partition parallelism can reduce the elapsed time that is required for loading large amounts of
data into partitioned table spaces.
If you are loading a partitioned
table space that is created with DEFINE NO, the load operation might take longer. If a partitioned
table space is created with DEFINE NO, all partitions are also implicitly defined with DEFINE NO.
The first data row that is inserted by the LOAD utility defines all data sets in the partitioned
table space. If this process takes a long time, expect timeouts on the database descriptor (DBD).
Restriction: You
cannot load data at the partition level of a partition-by-growth table
space.
Procedure
To load partitions:
-
If you want to load only certain partitions of a partitioned table, use the PART
clause of the INTO TABLE option. If you omit the PART clause, the entire table is loaded.
Restriction: The following restrictions exist for identity columns:
- When index-based partitioning is used, LOAD INTO TABLE PART integer is
not allowed if an identity column is part of the partitioning index.
- When table-based partitioning is used, LOAD INTO TABLE PART integer is
not allowed if an identity column is used in a partitioning clause of the CREATE TABLE or ALTER
TABLE statement.
To
override these restrictions, specify the OVERRIDE(IDENTITY) option in the LOAD statement.
- If you want partitions to be processed in parallel, take
one of the following actions:
- If
you have a single input data set and the partitioned table space is partitioned (non-UTS) or
partition-by-range , specify the PARALLEL keyword. This keyword enables LOAD to use multiple
parallel subtasks. When determining the degree of parallelism to specify on the PARALLEL keyword,
consider that a high degree of parallelism can result in increased processor time.
Recommendation: Specify PARALLEL(0) or PARALLEL without a number so that
Db2 can determine the optimal degree of
parallelism.
If one or more
nonpartitioned secondary indexes exists on the partitioned table space, and you have a separate
input data set for each partition, use load partition parallelism. Partition parallelism loads all
partitions in a single job. To invoke partition parallelism, for each partition that you want to
load, specify the INTO TABLE PART clause with one of the following keywords:
- INDDN
- INCURSOR
- DISCARDDN if DISCARDS n is specified
If the table space is created with DEFINE NO, specifying SHRLEVEL CHANGE on your LOAD
statement and enabling partition parallelism is equivalent to concurrent, independent insert jobs.
For example, in a large partitioned table space that is created with DEFINE NO, the LOAD utility
starts three tasks. The first task tries to insert the first row, which causes an update to the DBD.
The other two tasks time out while they wait to access the DBD. The first task holds the lock on the
DBD while the data sets are defined for the table space.

- If the only indexes are the partitioned indexes, use multiple
jobs to run LOAD concurrently against separate partitions. This method
also requires that you have a separate input data set for each partition.
- If you use the INTO TABLE PART clause, take the following
actions as appropriate:
- If you specify the REPLACE or RESUME options, specify them separately
by partition. If you specify these options before the INTO TABLE PART
clause, LOAD serializes the load operation for the entire table space
and does not process the partitions concurrently.
- To load columns in an order that is different than the order of
the columns in the CREATE TABLE statement, code field specifications
for each INTO TABLE PART clause.
- Make
sure that you specify the LOAD options in the appropriate place in the utility statement. Some LOAD
options, such as INDDN, DISCARDDN, RESUME, and REPLACE, can have a different scope depending on
their placement in the LOAD statement. For example, if you specify INDDN before INTO TABLE, the
specified input data set is used to load the entire table. However, if you specify INDDN after INTO
TABLE, in a PART clause, the specified input data set is used to load only the specified partition.
-
If you want Db2 to save copies of
records that are not loaded, use the DISCARDDN option to specify a discard data set as
follows:
- If you want a single discard data set for all partitions, specify DISCARDDN before INTO TABLE.
If you use a template for this data set, and the TEMPLATE utility statement contains the variable
$PA. or $PART., that variable is substituted with 0 or the partition number in the first INTO TABLE
PART clause.
- If you want one discard data set for a partition, specify DISCARDDN in an INTO TABLE PART
clause. If you use a template for this data set, and the TEMPLATE utility statement contains the
variable $PA. or $PART., that variable is substituted with the partition number. If you specify
DISCARDDN in more than one INTO TABLE PART clause, you must ensure that the data set names are
unique. One way to ensure unique names is to use templates and the variable $PA. or $PART. in the
TEMPLATE statement.
Examples
- Example of loading certain records into certain partitions
- The control statement in the following example specifies that Db2 is to load data into the first and
second partitions of the employee table. Records with '0' in column 1 replace the
contents of partition 1; records with '1' in column 1 are added to partition 2; all
other records are ignored. This example control statement, which is simplified to
illustrate the point, does not list field specifications for all columns of the
table.
LOAD DATA CONTINUEIF(72:72)='X'
INTO TABLE DSN8C10.EMP PART 1 REPLACE WHEN (1) = '0'
( EMPNO POSITION (1:6) CHAR(6),
FIRSTNME POSITION (7:18) CHAR(12),
...
)
INTO TABLE DSN8C10.EMP PART 2 RESUME YES WHEN (1) = '1'
( EMPNO POSITION (1:6) CHAR(6),
FIRSTNME POSITION (7:18) CHAR(12),
...
)
- Example of loading partitions from separate input data sets
- The following example LOAD statements
specify that partitions 1 and 2 of the EMP table are to be loaded from the EMPLDS1 and EMPLDS2 data
sets. This example assumes that your data is in separate input data sets and already sorted by
partition. Therefore, you do not need to use the WHEN clause of INTO TABLE. Placing the RESUME YES
option before the PART option inhibits concurrent partition processing. If you want LOAD to process
other partitions concurrently, specify the RESUME option after the PART
option.
LOAD DATA INDDN EMPLDS1 CONTINUEIF(72:72)='X'
RESUME YES
INTO TABLE DSN8C10.EMP REPLACE PART 1
LOAD DATA INDDN EMPLDS2 CONTINUEIF(72:72)='X'
RESUME YES
INTO TABLE DSN8C10.EMP REPLACE PART 2
- Example of loading partitions independently
- In
the following example, partition 1 and partition 2 are loaded concurrently.
LOAD DATA INDDN SYSREC LOG NO
INTO TABLE DSN8C10.EMP PART 1 REPLACE
LOAD DATA INDDN SYSREC2 LOG NO
INTO TABLE DSN8C10.EMP PART 2 REPLACE
- Example of specifying separate discard data sets for each partition
- If you specify multiple discard data sets, you must ensure that the data set names are unique.
In the following example, the LOAD statement specifies that two discard data sets are to be
allocated, one for each partition. Those discard data sets both use the DT template. The TEMPLATE
statement for DT includes the &PA. variable (partition number) to ensure that the data set name
is different for each partition.
TEMPLATE DT UNIT(SYSDA)
DSN(JUOSU339.&TS..P&PA..DISCARD)
SPACE(50,10) TRK
LOAD DATA
INTO TABLE DSN8B10.EMP PART 1 INDDN D1 DISCARDDN(DT) REPLACE NO RESUME
INTO TABLE DSN8B10.EMP PART 2 INDDN D2 DISCARDDN(DT) REPLACE NO RESUME