IBM Support

Performance best practice with QualityStage Address Verification Interface

Question & Answer


Question

How can I achieve optimal performance with InfoSphere QualityStage Address Verification Interface with batch processing?

Cause

Performance of the Address Verification Interface (AVI) is affected by the input data, the stage property settings, and the parallel engine configuration.

Answer

Experiments with USA address data sets indicate that the following steps could help improve the performance of the QS AVI Validation function:

1. Include the PostCode input field to avoid most addresses being flagged as unverified.

2. Sort the input data by the PostCode input field. In-house experiments show an improvement of approximately 44% when the input is sorted by PostCode.

3. If possible, always include the Country input field. As much as a 74% degradation in performance was observed when the Country field was omitted.

4. If possible, avoid using unfielded input. Unfielded input is all address data in one column, with no differentiation between the address line, postal code, and other address data. Unfielded input can degrade performance by as much as 10%.

5. Avoid using the 'Validation' processing type with 'Suggestion' mode for batch processing. 'Suggestion' mode is not designed for batch processing.

6. Increase the parallel engine processing node count in the APT_CONFIG_FILE if your computer has CPU resources available. AVI throughput scales linearly as the node count is increased.
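As an illustration of step 6, the following Python sketch generates the text of a parallel engine configuration file with a chosen number of logical nodes on a single host. The host name, the dataset and scratch directories, and the helper name build_apt_config are hypothetical placeholders. Substitute the values from your existing configuration file, and treat the result as a starting point rather than a tuned configuration.

```python
def build_apt_config(node_count, fastname, disk_dir, scratch_dir):
    """Return the text of a parallel engine configuration file.

    All arguments are placeholders supplied by the caller; nothing here
    is read from an actual Information Server installation.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")

    lines = ["{"]
    for i in range(1, node_count + 1):
        # One logical node per iteration, all on the same host.
        lines += [
            f'    node "node{i}"',
            "    {",
            f'        fastname "{fastname}"',
            '        pools ""',
            f'        resource disk "{disk_dir}" {{pools ""}}',
            f'        resource scratchdisk "{scratch_dir}" {{pools ""}}',
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical host and paths, shown only for illustration.
    print(build_apt_config(4, "etl-host", "/data/datasets", "/data/scratch"))
```

Save the output to a file and set APT_CONFIG_FILE to that file for the job. Increase the node count gradually and compare run times, because additional nodes help only while spare CPU capacity remains.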

Performance results vary depending on the operating system you run on and other system variables. The percentages provided here are only for your reference.
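Because the percentages are reference values only, it can help to measure the effect of each change in your own environment. The Python sketch below computes the relative change in throughput between a baseline run and a tuned run, using row counts and elapsed times that you record yourself. The function names and sample numbers are hypothetical. This document does not state whether its percentages refer to throughput or to elapsed time, so the use of throughput here is an assumption.

```python
def throughput(rows, seconds):
    """Rows processed per second for one job run."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return rows / seconds


def percent_change(baseline, tuned):
    """Relative change of tuned versus baseline throughput, in percent.

    Positive values mean the tuned run processed more rows per second.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (tuned - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Made-up measurements for illustration only.
    base = throughput(rows=1_000_000, seconds=500)   # e.g. unsorted input
    tuned = throughput(rows=1_000_000, seconds=400)  # e.g. sorted by PostCode
    print(f"{percent_change(base, tuned):.1f}% change in throughput")
```

Change one setting at a time between runs so that each measured difference can be attributed to a single step.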



Note: These best practices do not apply to real-time or Address Verification Interface (AVI) processes that are deployed by InfoSphere Information Services Director.


Document Information

Modified date:
16 June 2018

UID

swg21625670