CSV file format for lookup table definitions

When you import lookup table definitions in the Standardization Rules Designer, the import file must be a text file in which values are separated by commas (a CSV file). Each lookup table definition in the file must be formatted correctly.

CSV file requirements

The entire CSV file must meet the following requirements:
  • Files must have only one definition per line.
  • Files must use UTF-8 character encoding.
  • If leading or trailing white space must be preserved for an individual value, the entire value must be enclosed in double quotation marks.
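
When a script generates the file, the file-level requirements can be applied as each line is written. The following Python sketch shows one possible approach and is not part of the product. The function names format_field, format_definition, and write_definitions are hypothetical. The sketch assumes that values contain no commas or double quotation marks, because this topic does not describe how those characters are handled. It encloses any value that contains white space in double quotation marks, which covers the leading and trailing case.

```python
def format_field(field):
    """Enclose a field in double quotation marks if it contains white space."""
    text = str(field)
    # Assumption: quoting every value that contains white space is acceptable;
    # the requirement itself only names leading and trailing white space.
    if any(ch.isspace() for ch in text):
        return '"' + text + '"'
    return text


def format_definition(value, returned_value="", threshold=""):
    """Build one line: value,returned value,similarity threshold."""
    return ",".join(format_field(f) for f in (value, returned_value, threshold))


def write_definitions(path, definitions):
    """Write one definition per line, using UTF-8 character encoding."""
    with open(path, "w", encoding="utf-8") as handle:
        for definition in definitions:
            handle.write(format_definition(*definition) + "\n")
```

For example, write_definitions("lookup.csv", [("METER",), ("FREIGHT TON", "IMPERIAL", 800)]) writes two lines to a file named lookup.csv, a file name chosen only for illustration.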

Definition requirements

Each lookup table definition can include a maximum of three columns. The following table shows the three columns in the order that they must be specified and lists requirements for each column.
Table 1. Columns in a lookup table definition
| Column | Required | Requirements |
|---|---|---|
| Value | Yes | The maximum length is 600 characters. |
| Returned value | No | If a returned value is not specified, it is the same as the value. The maximum length is 600 characters. |
| Similarity threshold | No | If a similarity threshold is not specified, a default of 900 is assigned. The similarity threshold must be an integer in the range 700 - 900. |
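
The column rules can be checked before a file is imported. The following Python sketch is an illustration only and does not reproduce how the Standardization Rules Designer parses files. The function name parse_definition and the constants are hypothetical. The sketch assumes that the Python csv module handles quotation marks in a compatible way, and that a line with fewer than three columns is treated as if the missing columns were empty.

```python
import csv

MAX_LENGTH = 600         # maximum length of the value and the returned value
DEFAULT_THRESHOLD = 900  # assigned when no similarity threshold is specified
MIN_THRESHOLD = 700
MAX_THRESHOLD = 900


def parse_definition(line):
    """Return (value, returned_value, similarity_threshold) for one CSV line."""
    fields = next(csv.reader([line]), [])
    if not 1 <= len(fields) <= 3:
        raise ValueError("a definition must have one to three columns")
    # Assumption: omitted trailing columns are treated like empty columns.
    fields = fields + [""] * (3 - len(fields))
    value, returned_value, threshold_text = fields

    if value == "":
        raise ValueError("the value column is required")
    if returned_value == "":
        returned_value = value  # default: same as the value
    for name, text in (("value", value), ("returned value", returned_value)):
        if len(text) > MAX_LENGTH:
            raise ValueError(f"the {name} exceeds {MAX_LENGTH} characters")

    if threshold_text.strip() == "":
        threshold = DEFAULT_THRESHOLD
    else:
        threshold = int(threshold_text)  # raises ValueError if not an integer
        if not MIN_THRESHOLD <= threshold <= MAX_THRESHOLD:
            raise ValueError("the similarity threshold must be in the range 700 - 900")
    return value, returned_value, threshold
```

In this sketch, a line that contains only a value followed by two commas yields the value, the same text as the returned value, and a similarity threshold of 900.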

Examples

In this example, the returned value and similarity threshold are not specified.

METER,,

As a result, the returned value is the same as the value, and the default value is assigned for the similarity threshold. The following table shows the lookup table definition as it appears in the Standardization Rules Designer.

Table 2. Definition that includes defaults for the returned value and similarity threshold

| Value | Returned value | Similarity threshold |
|---|---|---|
| METER | METER | 900 |

In this example, values are specified for all of the columns. Because the value contains white space, it is enclosed in double quotation marks.

"FREIGHT TON",IMPERIAL,800

The following table shows the lookup table definition as it appears in the Standardization Rules Designer.

Table 3. Definition that includes values for all columns

| Value | Returned value | Similarity threshold |
|---|---|---|
| FREIGHT TON | IMPERIAL | 800 |