Bloom Filter stage in DataStage

Use the Bloom Filter stage to perform efficient lookups on keys.

You can use the Bloom Filter stage to look up incoming keys against previously seen values more efficiently. The Bloom Filter stage can generate false positives but never generates false negatives in your output data set. Use the Bloom Filter stage only when a small number of false positives is acceptable in your output data set. This stage takes a single input data set, and can generate multiple output data sets depending on the operating mode. The Bloom Filter stage manages bloom filter file sets, and adds or deletes files from the file set based on the options that are specified for the stage.
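To illustrate why a bloom filter can report false positives but never false negatives, the following Python code is a minimal sketch of the general technique. It is not the implementation that the Bloom Filter stage uses. The class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the sample keys are assumptions that are chosen for illustration only.

```python
import hashlib


class BloomFilter:
    """Illustrative bloom filter; not the DataStage implementation."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # num_bits and num_hashes are hypothetical tuning values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with a seed byte.
        data = key.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        # Set every bit position that the key hashes to.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # True only if all bits for the key are set. Bits are never cleared,
        # so a key that was added always returns True (no false negatives).
        # A key that was never added can return True if other keys happened
        # to set the same bits (a false positive).
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


# Record previously seen keys, then test incoming keys against the filter.
seen = BloomFilter()
for key in ["cust-001", "cust-002", "cust-003"]:
    seen.add(key)
```

In this sketch, `seen.might_contain("cust-002")` is always true because the key was added. A result of true for a key that was never added is a false positive, and its likelihood grows as more keys are added to a filter of fixed size.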

Input tab

The Columns section specifies the column definitions of incoming data. The Advanced section allows you to change the default buffering settings for the input link.

Output tab

The stage can have any number of output links. Choose the one that you want to work on from the drop-down list. The Columns section specifies the column definitions of outgoing data. Click Edit at the bottom of the Columns section to specify mapping information. Mapping specifies the relationship between the processed data that the Bloom Filter stage produces and the output columns. The Advanced section allows you to change the default buffering settings for the output link.