IBM Db2
The IBM Db2 source reads data from an IBM Db2 database. For information about supported versions, see Supported systems and versions.
The IBM Db2 source can use multiple threads to process tables in parallel.
When you configure the IBM Db2 source, you specify the database name, host name, and port number. In addition, you can enable the source to use SSL. You can also use a connection to configure the source.
The source generates record header attributes that enable you to use the sources of a record in flow processing.
Before you use the IBM Db2 source, you must complete some prerequisite tasks.
Prerequisites
About this task
Before you configure the IBM Db2 source, complete the following prerequisite tasks.
Procedure
Multithreaded processing
The IBM Db2 source uses multiple concurrent threads to process data based on the Number of Threads property. Each table is processed by only one thread a time, which ensures that the order of batches and records from each table is preserved.
As the flow runs, each thread connects to the source system, creates a batch of data, and passes the batch to an available flow runner. A flow runner is a sourceless flow instance - an instance of the flow that includes all of the processors, executors, and targets in the flow and handles all flow processing after the source.
Each flow runner processes one batch at a time, just like a flow that runs on a single thread. When the flow of data slows, the flow runners wait idly until they are needed, generating an empty batch at regular intervals. You can configure the Runner Idle Time flow property to specify the interval or to opt out of empty batch generation.
Multithreaded flows preserve the order of records within each batch, just like a single-threaded flow. But since batches are processed by different flow runners, the order that batches are written to targets is not ensured.
For more information about multithreaded flows, see Multithreaded flow overview.
Record header attributes
The IBM Db2 source creates record header attributes that include information about the originating file for the record.
You can use the record:attribute or
record:attributeOrDefault functions to access the information
in the attributes. For more information about working with record header attributes,
see Working with header attributes.
- schemaName - Name of the schema containing the table the record comes from.
- tableName - Name of the table the record comes from.
Field attributes
The IBM Db2 source generates field attributes for columns converted certain data types in Data Collector. The attributes provide additional information about each field.
| Data Collector Data Type | Generated Field Attribute | Description |
|---|---|---|
| Date | unit | Original unit in which the date was obtained from IBM Db2. |
| Datetime | unit | Original unit in which the date was obtained from IBM Db2. |
| Decimal | precision | Precision of the value obtained from IBM Db2. |
| Decimal | bit_width | Original bit width in which the value was obtained from IBM Db2. |
| Decimal | scale | Scale of the value obtained from IBM Db2. |
| Decimal and String | flight_type | Original flight type. |
| Double and Float | precision | Original precision obtained from IBM Db2. |
| Integer, Long, and Short | bit_width | Original bit width in which the value was obtained from IBM Db2. |
| Integer, Long, and Short | signed | Whether the original value obtained from IBM Db2 was signed. |
| Time | unit | Original unit in which the date was obtained from IBM Db2. |
IBM Db2 data types
The IBM Db2 source converts IBM Db2 data types into Data Collector data types before passing them to the flow.
| IBM Db2 Data Type | Data Collector Data Type |
|---|---|
| BigInt | Long |
| Binary | Byte Array |
| Binary_View | Byte Array |
| Boolean | Boolean |
| Char | String |
| Clob | String |
| Date | Date |
| Dbclob | String |
| DecFloat | String |
| Decimal | Decimal |
| Double | Double |
| Duration | Long |
| Fixed_Size_Binary | Byte Array |
| Fixed_Size_List | List |
| Graphic | String |
| Integer | Integer |
| Large_Binary | Byte Array |
| Large_List | List |
| Large_List_View | List |
| List | List |
| List_View | List |
| Nchar | String |
| Numeric | Decimal |
| Nvarchar | String |
| Real | Float |
| SmallInt | Short |
| Struct | Map |
| Time | Time |
| Timestamp | Datetime |
| Varchar | String |
| Vargraphic | String |
| XML | String |
Key column data types
The source supports the following Data Collector data types as key columns:
- Boolean
- Char
- Date
- Datetime
- Decimal
- Double
- Float
- Integer
- Long
- Short
- String
- Time
- Zoned Datetime
Troubleshooting IBM Connectivity Service
The source
connects to the database through IBM Connectivity Service. IBM Connectivity Service and all stages that use the service are bundled into
the IBM Connectivity Service stage library,
streamsets-datacollector-ibm-connectivity-service-lib. By default, Data Collector runs IBM Connectivity Service when the IBM
Connectivity Service stage library is installed.
| Call | CURL request |
|---|---|
| Get IBM Connectivity Service state |
curl --location 'https://<engine url>/rest/v1/flight/state' --header 'Authorization: Bearer <IBM Cloud API key> |
| Restart IBM Connectivity Service |
curl --location 'https://<engine url>//rest/v1/flight/restart' --header 'Authorization: Bearer <IBM Cloud API key> |
| Get IBM Connectivity Service logs |
curl --location 'https://<engine url>//rest/v1/flight/state' --header 'Authorization: Bearer <IBM Cloud API key> |
Configuring an IBM Db2 source
About this task
Configure an IBM Db2 source to read data from an IBM Db2 database.