Configuring sparse lookup operations

You can configure a Greenplum Connector stage to perform a sparse (direct) lookup operation on a Greenplum table.

Before you begin

  • Set up the column definitions on a link to specify the format of the records that the Greenplum Connector stage reads from a Greenplum server.
  • Configure the Greenplum Connector stage as a source for the reference data.
  • Add a Lookup stage to the job design canvas, and then create a reference link from the Greenplum Connector stage to the Lookup stage.

About this task

In a sparse lookup, the connector runs the specified SELECT statement one time for each parameter set that arrives in the form of a record on the input link to the Lookup stage. The specified input parameters in the statement must have corresponding columns defined on the reference link. Each input record includes a set of parameter values that are represented by key columns. The Greenplum Connector stage sets the parameter values on the bind variables in the SELECT statement, and then the Greenplum Connector stage runs the statement.

The result of the lookup is routed as one or more records through the reference link from the Greenplum Connector stage back to the Lookup stage and from the Lookup stage to the output link of the Lookup stage. A sparse lookup is also known as a direct lookup because the lookup is performed directly on the data source.

You can use the sparse lookup method when the target table is too large to fit in memory.

Procedure

  1. Double-click on the Greenplum Connector stage.
  2. From theLookup Type list, select Sparse.
  3. On the Columns tab, define the key columns to use from the database to which the connector is connected.
  4. On the Properties page, configure the properties on the Properties tab.
    1. If you set the Generate SQL to Yes, specify the Table name and the Key columns details in the Columns page.
    2. If you set the Generate SQL at runtime to No when you configured the connector as a source, specify a value for the Select statement property. In the select part of the SELECT statement, list the columns to return to the Lookup stage. Ensure that the columns in the select list match the columns on the reference link. Each column name in the WHERE clause of the statement must begin with the word ORCHESTRATE and a period. The word ORCHESTRATE can be all uppercase or all lowercase letters.
      For example, you can specify the following SELECT statement: select Field002,Field003 from MY_TABLE where Field001 = ORCHESTRATE.Field001. The column names that follow the word ORCHESTRATE should match the key columns on the reference link.
  5. Click OK to save the changes.
  6. Map the input link and reference link columns to the output link columns and specify conditions for a lookup failure:
    1. Drag or copy the columns from the input link and reference link to your output link.
    2. To define conditions for a lookup failure, click the Constraints icon in the menu.
    3. In the Lookup Failure column, select a value, and then click OK. If you select Reject, you must have a reject link from the Lookup stage and a target stage in your job configuration to capture the rejected records.
    4. Click OK.
  7. Save, compile, and run the job.