Data search performance best practices
There are many ways to develop IBM® Transformation Extender maps that search through data. Each option strikes a different balance between functionality and performance.
The functions available to the map developer to do data searches are:
- EXTRACT()
- LOOKUP()
- SEARCHUP() or SEARCHDOWN()
- CHOOSE() - see POSITIONAL INDEXING in the table below
The following table summarizes when to use each function, along with its advantages and disadvantages.
Function | Use | Advantage | Disadvantage |
---|---|---|---|
EXTRACT() | When expected results are multiple objects. | One of the most powerful ways of searching through a data file. | Not the most efficient way to locate objects if the expected result set is only a single object. |
LOOKUP() | When the expected result is a single object. Scans a data file object by object, looking for an object that matches the specified criteria. | More efficient than EXTRACT(). | The average search time is proportional to the number of objects in the data file. |
SEARCHUP() or SEARCHDOWN() | When the set of items to be searched is sorted. | Increased performance. Takes advantage of the sorted order of the data file by traversing the data file as a binary tree. Search time is proportional to log2(N), where N is the number of data items. | Cannot be used for unsorted data files. The smaller the input data file, the smaller the performance benefit. |
POSITIONAL INDEXING | When used with CHOOSE() and INDEX(), locates objects within a data file by position. | Executes repeated CHOOSE() requests and maintains a cache of position information for the indexed data file. The cache continues to be used as long as repeated calls to CHOOSE() reference the same type, and indexing is fast. | When repeated calls to CHOOSE() do not reference the same type, the cache is not used, and indexing is not as fast. |
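
The cost difference between the three search styles in the table can be illustrated with a small sketch. This is Python, not Transformation Extender map rule syntax, and the function names (`extract_all`, `lookup_first`, `search_sorted`) are hypothetical stand-ins chosen for illustration. It assumes that an EXTRACT()-style search examines every object, a LOOKUP()-style search stops at the first match, and a SEARCHUP()/SEARCHDOWN()-style search behaves like a binary search over sorted keys.

```python
def extract_all(records, predicate):
    """EXTRACT()-style analogue: examine every record, return all matches."""
    return [r for r in records if predicate(r)]


def lookup_first(records, predicate):
    """LOOKUP()-style analogue: stop at the first match.

    Returns (match or None, number of records examined).
    """
    for examined, record in enumerate(records, start=1):
        if predicate(record):
            return record, examined
    return None, len(records)


def search_sorted(keys, target):
    """SEARCHUP()/SEARCHDOWN()-style analogue: binary search on sorted keys.

    Returns (index or None, number of halving steps).
    """
    lo, hi, steps = 0, len(keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo if lo < len(keys) and keys[lo] == target else None
    return found, steps


# Hypothetical data: one million sorted integer keys.
keys = list(range(1_000_000))
target = 750_000

_, linear_steps = lookup_first(keys, lambda k: k == target)
index, binary_steps = search_sorted(keys, target)

print("linear scan examined:", linear_steps)
print("binary search steps:", binary_steps)
```

With these assumptions, the linear scan examines a number of records that grows with the position of the match, while the binary search needs at most about log2(N) steps (no more than 20 for one million keys). For small inputs the two counts are close, which is consistent with the note that sorted searches benefit small data files less.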