Loading data through external tables
Deprecated use of external tables in Bulk Load write mode
Before the Db2 connector fully supported external tables, it was only possible to load data using external tables. As set of properties to configure this mode was located under Usage > Load Control > Use external tables section and they were only enabled in case of Write mode set to Bulk load. Now it is possible to configure the external tables by using the properties located under the Usage > Session section. However, it is still possible to use previously designed jobs. When the connector is running and external tables are configured within Bulk load mode, the following warning is logged: 'Bulk load' write mode using external tables is deprecated and might be removed in the future release. Use Insert write mode with external tables instead.
When the warning is visible in the job log, do not use the external tables in the Bulk load mode. All configuration properties related to the usage of external tables in this mode are now prefixed with [deprecated]. Modify the stage configuration by setting the Write mode to Insert and the Usage > Session > Use external tables property to Yes.
Data load from Sequential File or Named Pipe using External Table is supported on some versions of Db2 database. If the Db2 connector detects that current database does not have the support for External Table, a message IIS-CONN-DB2-00969E "External Tables are not supported in the current DB2 version: " is reported and the job aborts.
The successful execution of the load operation results in the log entry with the number of
records loaded: Number of records loaded =
. In addition to log entry visible in
the job log, the load operation creates dedicated .log file in the location
specified.
The load through the External Table is possible when connector is used as a target (for data write). The Write mode property must be set to Bulk Load. In the Load Control properties section, you can select Use External Table = Yes
The configuration properties for external table data load:
Load Control
- Use External Table = Yes/No
- Indicates whether the External Tables are used during load
- Compress file data (with GZIP) = Yes/No
- Indicates whether the External Table file data is compressed using the GZIP compression algorithm before the data is transferred, and is decompressed after it is received. This improves overall performance when a large amount of compressible data is being transferred.
- Maximum reject count (SQL option: MAXERRORS) = Default: 1
- The number of rejected records at which the system stops processing and immediately rolls back the load. The default is 1 (that is, a single rejected record results in a rollback)
- Skip rows (SQL option: SKIPROWS) = Default: 0
- For a load operation, the number of rows to skip before beginning to load the data. The default is 0. Because skipped rows are processed before they are skipped, a skipped row is still capable of causing a processing error.
- Other options (string, custom option)
- Additional options to be passed to the External Table statement. Parameters reference
- Directory for log files (String, SQL option: LOGDIR) =
- Specifies the directory for the log and bad files.
- Statistics on columns (String (whitespace separated), multi-select columns list) = Default: empty (all)
- Generates statistics on the columns. If no column specified, statistics are collected on all columns by default.
Existing common bulkload properties to load data through external tables
- Load method
- Use of Named Pipe(s) or Sequential Files.
- File(s) only
- Indicates whether input files should be created without executing the load operation.
- Directory for data and command files also as default for the log directory when not specified
- Remove intermediate data file
- Select Yes to remove the intermediate data file after completing the load operation.
- Row Count (SQL option: MAXROWS)
- The number of physical records to be loaded. Allows a user to load only the first "row count" rows in a file.
- Statistics
- Specifies the type of statistics that are generated for the table.
- Enable quoted identifiers
- Specifies whether or not to enclose database object names in quotes when generating DDL and DML.