DB2 Version 10.1 for Linux, UNIX, and Windows

Ingest utility

The ingest utility (sometimes referred to as continuous data ingest, or CDI) is a high-speed client-side DB2® utility that streams data from files and pipes into DB2 target tables. Because the ingest utility can move large amounts of real-time data without locking the target table, you do not need to choose between the data currency and availability.

The ingest utility ingests pre-processed data directly or from files output by ETL tools or other means. It can run continually and thus it can process a continuous data stream through pipes. The data is ingested at speeds that are high enough to populate even large databases in partitioned database environments.

An INGEST command updates the target table with low latency in a single step. The ingest utility uses row locking, so it has minimal interference with other user activities on the same table.

With this utility, you can perform DML operations on a table using a SQL-like interface without locking the target table. These ingest operations support the following SQL statements: INSERT, UPDATE, MERGE, REPLACE, and DELETE. The ingest utility also supports the use of SQL expressions to build individual column values from more than one data field.

Other important features of the ingest utility include:
The INGEST command supports the following input data formats:
In addition to regular tables and nicknames, the INGEST command supports the following table types:
A single INGEST command goes through three major phases:
1. Transport
The transporters read from the data source and put records on the formatter queues. For INSERT and MERGE operations, there is one transporter thread for each input source (for example, one thread for each input file). For UPDATE and DELETE operations, there is only one transporter thread.
2. Format
The formatters parse each record, convert the data into the format that DB2 database systems require, and put each formatted record on one of the flusher queues for that record's partition. The number of formatter threads is specified by the num_formatters configuration parameter. The default is (number of logical CPUs)/2.
3. Flush
The flushers issue the SQL statements to perform the operations on the DB2 tables. The number of flushers for each partition is specified by the num_flushers_per_partition configuration parameter. The default is max( 1, ((number of logical CPUs)/2)/(number of partitions) ).