Configuring sparse lookup operations

Data that is read by a database stage can serve as reference data to a Lookup stage. By default, this reference data is loaded into memory like any other reference link. When directly connected as the reference link to a Lookup stage, you can configure the Lookup Type property of the Db2 connector to Sparse and send individual SQL statements to the database for each incoming Lookup row.

Before you begin

You must create a lookup operation job.

About this task

If the number of input rows to a stage is significantly smaller than the number of reference rows (1:100 or more) in a database table, you can configure the Db2 connector to perform a sparse lookup operation and send individual SQL statements to the database for each incoming lookup row. In the lookup operation job, the connector receives the records from the input stage, and then the connector performs the lookup operation directly on the external resource. The connector then generates the output records.

You can use the sparse lookup method only in parallel jobs.

Procedure

  1. Double-click on the Db2 Connector stage to open the link properties window.
  2. From Lookup Type list, select Sparse.
  3. Click the Columns tab and define the columns that you want to use from the database to which the connector is connected.
  4. Configure the properties on the Properties tab.
    1. Define and test your connection properties in the Connection section.
    2. In the Usage section, you can specify if you want auto-generated SQL statements or user-defined SQL statements. Specify Generate SQL as Yes or No.
      • If you specify Generate SQL as Yes, specify the Table name and the Key columns details in the Columns tab.
      • If you specify Generate SQL as No, specify the Select statement property. In the select part of the SELECT statement, the asterisk wildcard (*) does not work in a sparse lookup. Therefore, specify every column in the database and delimit the columns by commas. You must specify all columns in the Columns tab, even if you do not use them in this lookup. You can then delete the fields in the columns that you do not need. The following syntax is an example of the first part of the SELECT statement: select Field001,Field002,Field003. Also you need to include a WHERE clause to perform the lookup. Key columns that follow the WHERE clause must have the word ORCHESTRATE and a period added to the beginning of the column name. ORCHESTRATE can be capitalized or lowercase letters, like ORCHESTRATE.Field001. The following SELECT statement is an example of the correct syntax of the WHERE clause: select Field001,Field002,Field003 from MY_TABLE where ORCHESTRATE.Field001 = Field001.
    3. Optional: Configure any other properties on the Properties tab.
  5. Click OK to save the changes.
  6. To map the input links to the output link, double-click the Lookup stage to open the stage editor.
    1. Drag or copy the columns from your input link to your output link to add the columns to the output link
    2. Define any conditions for a lookup failure by clicking the Constraint icon in the menu.
    3. Select the appropriate value for the Lookup Failure column and click OK. If you select Reject, you must have a reject link and target stage in your job configuration to capture these records.
  7. Click OK.
  8. Save, compile, and run the job.