Parsing, validating, and transliterating address data

You can configure the Address Verification stage to parse, validate, and transliterate address data.

You can parse the addresses that come from your input stage. When you parse, you assign each address component, such as the street or postal code, to a column.

You can validate the addresses against postal reference files to check the correctness of the data. The validation process assesses address deliverability and provides a status such as very likely, fair chance, or unlikely to be deliverable.

You can also generate geographic location information and a summary report as part of address validation. A Validation Summary report shows the following items:
  • Total number of records processed
  • Number and percentage of records that the stage passed, failed, validated, corrected, or suggested another address for
  • Number and percentage of records that the stage failed because of postal code, city, street, country, or region
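As a rough illustration of how the counts and percentages in such a report relate to each other, the following minimal Python sketch tallies a list of per-record outcomes. The outcome labels and the summarize function are hypothetical and exist only for this example. They are not output of the Address Verification stage.

```python
from collections import Counter

# Hypothetical per-record outcomes; the labels are illustrative only.
outcomes = ["validated", "corrected", "failed", "validated", "suggested"]

def summarize(outcomes):
    """Return the total record count and {outcome: (count, percentage)}."""
    total = len(outcomes)
    counts = Counter(outcomes)
    # An empty input yields an empty summary, so no division by zero occurs.
    summary = {
        name: (count, round(100.0 * count / total, 1))
        for name, count in counts.items()
    }
    return total, summary

total, summary = summarize(outcomes)
print(f"Total records processed: {total}")
for name, (count, pct) in sorted(summary.items()):
    print(f"{name}: {count} ({pct}%)")
```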

If you choose the validation processing type, ensure that you have access to current postal validation reference files. For more information, see Installing and updating address and geo reference files for the QualityStage Address Verification stage.

Transliteration is performed on the address data after the parse or validation processing. Transliteration converts addresses from one script to another. You can transliterate addresses from non-Latin scripts, such as Greek or Hebrew, to the Latin character set, or from Latin to a native, non-Latin character set. Use transliterated addresses to store data consistently in one common writing system.
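To make the idea of script conversion concrete, the following Python sketch transliterates a Greek city name to Latin characters with a deliberately tiny, hand-written mapping table. The table and the transliterate function are assumptions made for this example. They do not represent the rules that the Address Verification stage applies, and real transliteration standards cover many more characters and special cases.

```python
import unicodedata

# Deliberately tiny, simplified Greek-to-Latin table (illustrative only).
GREEK_TO_LATIN = str.maketrans({
    "Α": "A",
    "α": "a",
    "θ": "th",
    "η": "i",
    "ν": "n",
})

def transliterate(text):
    """Strip accent marks, then map Greek letters to Latin equivalents."""
    # Decompose accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks so that only base letters remain.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Characters that are not in the table pass through unchanged.
    return stripped.translate(GREEK_TO_LATIN)

print(transliterate("Αθήνα"))
```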

To configure the Address Verification stage, double-click its icon on the canvas, or right-click the icon and select Open. On the Stage tab, open the Processing section to select the parse or validation processing type. From the Stage tab, you can also set options, advanced options, and link ordering.

Configuring the Address Verification stage to process address data

You specify the type of process that you want and define the detail level of the output information and the format of the output address data.

Before you begin

Create a job that includes a stage that contains input address data, the Address Verification stage, and a stage to receive the output data. You can use a file stage, a database stage, or a processing stage to contain the input or output data. For example, you might use a database stage for the input data and a sequential file stage for the output data.

To improve performance of the Address Verification stage, sort the input data before you add the data to the job. Data that is sorted at a granular level improves job performance more than data that is sorted only at the country level. For example, to ensure that the job runs as quickly as possible, you might sort input data in the following order:
  1. By country
  2. By region or province
  3. By city
  4. By postal code
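If you prepare the input data outside the job, a multi-level sort of this kind can be sketched in Python as follows. The records and the field names (country, region, city, postal_code) are hypothetical. Substitute the column names from your own data.

```python
from operator import itemgetter

# Hypothetical input records; the field names are assumptions for illustration.
records = [
    {"country": "US", "region": "NY", "city": "Albany", "postal_code": "12207"},
    {"country": "DE", "region": "BY", "city": "Munich", "postal_code": "80331"},
    {"country": "US", "region": "CA", "city": "Fresno", "postal_code": "93721"},
    {"country": "DE", "region": "BE", "city": "Berlin", "postal_code": "10115"},
]

# Sort by country first, then region, then city, then postal code.
sort_key = itemgetter("country", "region", "city", "postal_code")
sorted_records = sorted(records, key=sort_key)

for record in sorted_records:
    print(record["country"], record["region"], record["city"], record["postal_code"])
```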

Procedure

  1. Double-click the Address Verification stage icon on the canvas.
    The details pane opens to the Stage tab.
  2. In the Processing section, complete the required fields.
  3. In the Options section, specify how detailed you want the output information to be and the format of the output address data.
  4. Click the Input tab to open it, then set an input name, edit address columns, or set advanced options.
  5. Click the Output tab to open it, then set an output name, edit address columns, or set advanced options.
  6. Click Save.

What to do next

Assign input columns to address fields.

Assigning input columns to address fields

You assign input columns to address fields on the Input tab of the stage details pane. For example, you can assign the column for a house number and the column for a street name.

Before you begin

Select and configure a processing type on the Stage tab.

About this task

You do not have to assign every input column. For example, the input data might contain a column, such as Customer Since 1988, that you do not want to use in an address.
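Conceptually, the assignment is a mapping from input column names to address fields, and unassigned columns are left out. The following Python sketch shows that idea only. The column names, the field names, and the to_address_fields function are hypothetical and are not part of the stage.

```python
# Hypothetical mapping of input column names to address fields.
column_assignments = {
    "HOUSE_NO": "house_number",
    "STREET_NM": "street",
    "CITY_NM": "city",
    "ZIP": "postal_code",
}

def to_address_fields(row, assignments):
    """Keep only the assigned columns, renamed to their address fields."""
    return {field: row[col] for col, field in assignments.items() if col in row}

# CUSTOMER_SINCE has no assignment, so it is not used in the address.
row = {
    "HOUSE_NO": "12",
    "STREET_NM": "Main St",
    "CITY_NM": "Albany",
    "ZIP": "12207",
    "CUSTOMER_SINCE": "1988",
}
address = to_address_fields(row, column_assignments)
print(address)
```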

Procedure

  1. On the Input tab of the stage details pane, open the Address Columns section, then click Edit.
    The Edit address columns page opens.
  2. If your address data contains fewer or more address lines than the number of address fields that are shown in the column list, set the Number of address lines to assign field to match the number of lines in your data.
  3. Select Show additional address fields to see more address fields that are available, including "Sub-building," "Dependent street," "Dependent locality," and others.
  4. Hover over a row in the list, then click the pencil icon to edit the row.
    The details pane for the row opens.
  5. Select the number of columns to assign, then select the column name or names.
  6. Click the up or down arrow to reorder the columns.
  7. Click Apply to finish editing the row. Then click Apply and return to go back to the stage details pane, and click Save.