You can configure a Greenplum Connector stage to perform
a sparse (direct) lookup operation on a Greenplum table.
Before you begin
- Set up the column definitions on a link to specify the format
of the records that the Greenplum Connector stage reads from a Greenplum
server.
- Configure the Greenplum Connector stage as a source for the reference
data.
- Add a Lookup stage to the job design canvas, and then create a
reference link from the Greenplum Connector stage to the Lookup stage.
About this task
In a sparse lookup, the connector runs the specified
SELECT statement one time for each parameter set that arrives in the
form of a record on the input link to the Lookup stage. The specified
input parameters in the statement must have corresponding columns
defined on the reference link. Each input record includes a set of
parameter values that are represented by key columns. The Greenplum
Connector stage sets the parameter values on the bind variables in
the SELECT statement, and then the Greenplum Connector stage runs
the statement.
The result of the lookup is routed as one or
more records through the reference link from the Greenplum Connector
stage back to the Lookup stage and from the Lookup stage to the output
link of the Lookup stage. A sparse lookup is also known as a direct
lookup because the lookup is performed directly on the data source.
You
can use the sparse lookup method when the target table is too large
to fit in memory.
Procedure
- Double-click on the Greenplum Connector stage.
- From theLookup Type list, select Sparse.
- On the Columns tab, define the key columns to use from
the database to which the connector is connected.
- On the Properties page, configure the properties on the Properties tab.
- If you set the Generate SQL to Yes,
specify the Table name and the Key columns
details in the Columns page.
- If you set the Generate SQL at runtime to No when
you configured the connector as a source, specify a value for the Select
statement property. In the select part of the SELECT statement,
list the columns to return to the Lookup stage. Ensure that the columns
in the select list match the columns on the reference link. Each column
name in the WHERE clause of the statement must begin with the word ORCHESTRATE and
a period. The word ORCHESTRATE can be all uppercase or all
lowercase letters.
For example, you can specify the
following SELECT statement: select Field002,Field003 from
MY_TABLE where Field001 = ORCHESTRATE.Field001. The column
names that follow the word ORCHESTRATE should match the key
columns on the reference link.
- Click OK to save the changes.
- Map the input link and reference link columns to the output
link columns and specify conditions for a lookup failure:
- Drag or copy the columns from the input link and reference
link to your output link.
- To define conditions for a lookup failure, click the Constraints icon
in the menu.
- In the Lookup Failure column, select a value, and then
click OK. If you select Reject,
you must have a reject link from the Lookup stage and a target stage
in your job configuration to capture the rejected records.
- Click OK.
- Save, compile, and run the job.