D_USPS comparison

Compares an alphanumeric house number from a data source to two alphanumeric house number intervals from a reference source by using a left-right interval comparison. Control columns indicating the odd-even parity of the reference intervals are required.

The D_USPS comparison requires the column names for the house number (generally on the data source), two intervals for house number ranges on the reference source, and control columns that indicate the parity of the house number range.

This match comparison does not take frequency information into account. However, a two-source match still requires four input streams, so if you use this comparison in a two-source match stage job, supply two dummy file inputs in place of the files that would contain frequency information.

Required columns

The following data source and reference source columns are required:

  • Data. A column from the data source that contains numeric or nonnumeric values.
  • Reference. (1) The reference column that contains the beginning value of the first interval (such as the left side of the street) from the reference source.
  • Reference. (2) The reference column that contains the ending value of the first interval from the reference source.
  • Reference. (3) The reference column that contains the beginning value of the second interval (such as the right side of the street) from the reference source.
  • Reference. (4) The reference column that contains the ending value of the second interval from the reference source.
  • Reference. (Control1) The odd/even parity for the range defined with reference columns (1) and (2).
  • Reference. (Control2) The odd/even parity for the range defined with reference columns (3) and (4).

The control information from the USPS ZIP + 4 code is:

  • O. The range represents only odd house numbers.
  • E. The range represents only even house numbers.
  • B. The range represents all numbers (both odd and even) in the interval.
  • U. The parity of the range is unknown.
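The four codes can be read as a simple parity filter. The following is a minimal sketch only, not the product's implementation: the function name parity_allows is hypothetical, house numbers are assumed to be plain integers, and treating U (unknown) as imposing no constraint is an assumption made for illustration.

```python
def parity_allows(house_number: int, control: str) -> bool:
    """Return True if the house number's parity is compatible with the control code."""
    is_even = house_number % 2 == 0
    if control == "O":
        # Range holds only odd house numbers.
        return not is_even
    if control == "E":
        # Range holds only even house numbers.
        return is_even
    if control in ("B", "U"):
        # B covers both parities. Assumption: U (unknown) is treated
        # the same way, that is, it rules nothing out.
        return True
    raise ValueError(f"unexpected control code: {control!r}")
```

For example, parity_allows(101, "O") is true and parity_allows(101, "E") is false.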

Example

A house number on the data source is first compared to the interval defined by reference source columns (1) and (2). If the parity of the house number agrees with the code in Control1 and with the parity of the house number in reference source column (1), and the intervals overlap, it is considered a match. If not, the house number on the data source is next compared to the interval defined by reference source columns (3) and (4), using Control2 in the same way.
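The left-then-right order of evaluation can be sketched as follows. This is an illustration under stated assumptions, not the actual comparison: the names d_usps_side and _parity_ok are hypothetical, house numbers are simplified to integers (the real comparison accepts alphanumeric values), "overlap" is taken to mean that the house number lies within the interval inclusive, and the handling of B and U is assumed.

```python
from typing import Optional, Tuple

# (start, end, control) for one reference interval; hypothetical layout.
Interval = Tuple[int, int, str]


def _parity_ok(number: int, control: str, range_start: int) -> bool:
    """Check the house number's parity against the control code and range start."""
    same_as_start = number % 2 == range_start % 2
    if control == "O":
        return number % 2 == 1 and same_as_start
    if control == "E":
        return number % 2 == 0 and same_as_start
    if control == "B":
        # Assumption: B admits both parities, so no parity check applies.
        return True
    if control == "U":
        # Assumption: with unknown parity, fall back to the range start's parity.
        return same_as_start
    raise ValueError(f"unexpected control code: {control!r}")


def d_usps_side(house: int, first: Interval, second: Interval) -> Optional[str]:
    """Return "first" or "second" for the interval that matches, else None."""
    for side, (start, end, control) in (("first", first), ("second", second)):
        low, high = sorted((start, end))
        # Assumption: "overlap" means the house number is inside the range.
        if low <= house <= high and _parity_ok(house, control, start):
            return side
    return None
```

With first = (101, 199, "O") and second = (100, 198, "E"), house number 101 matches the first interval, 104 fails the parity check on the first interval and matches the second, and 300 matches neither.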