Predefined rule sets
You can use predefined rule sets that are country- or region-specific or that can be applied internationally.
Country or region rule sets
In the Designer client repository view, expand the Standardization Rules folder and select the country or region that you want. The folder for the country or region contains the provided rule sets.
You can install additional rule sets from asset interchange and then import the rule sets into your project.
Country or region | Rule sets | Location of rule set files |
---|---|---|
Argentina | | InfoSphere® Information Server file directory |
Australia | | Designer client repository view |
Brazil | | InfoSphere Information Server file directory |
Canada | | Designer client repository view |
Chile | | InfoSphere Information Server file directory |
France | | Designer client repository view |
Germany | | Designer client repository view |
Hong Kong Special Administrative Region of the People's Republic of China - Chinese characters | Note: The Name domain includes data typically found in the Area domain. The Prep domain is not needed. | Designer client repository view |
Hong Kong Special Administrative Region of the People's Republic of China - Latin characters | Note: The Name domain includes data typically found in the Area domain. The Prep domain is not needed. | Designer client repository view |
India | Note: The IndiaAddressSharedContainer shared container is imported with the Indian address rule sets. The shared container can be used in a job that standardizes Indian address and area data. | InfoSphere Information Server file directory |
Ireland | | InfoSphere Information Server file directory |
Italy | | Designer client repository view |
Japan | | Designer client repository view |
Japan | JPKANA converts data from Katakana to Kanji. | InfoSphere Information Server file directory |
Korea | | InfoSphere Information Server file directory |
Mexico | | InfoSphere Information Server file directory |
Netherlands | | InfoSphere Information Server file directory |
People's Republic of China | | Designer client repository view |
Peru | | InfoSphere Information Server file directory |
Russia | The RUADDRL rule set standardizes address and area information. To use these rule sets, you must set the Windows code page to 1251 and set the regional settings for your operating system to Russian. | InfoSphere Information Server file directory |
Spain | | Designer client repository view |
Thailand | The THADDRL rule set standardizes address and area information. To use these rule sets, you must set the Windows code page to 874 and set the regional settings for your operating system to Thai. In the job properties, set the NLS parameter to TIS-620. | InfoSphere Information Server file directory |
United Kingdom | | Designer client repository view |
United States | The USTAXID rules validate United States tax ID format, including the following characteristics: | Designer client repository view |
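Because the Russian and Thai rule sets depend on specific Windows code pages, it can help to confirm beforehand that sample input is representable in the relevant code page. The following Python sketch is a pre-flight aid that runs outside the product; the function name and the rule-set-to-codec mapping are hypothetical, and `cp1251` and `cp874` are Python's standard codec names for the Windows Cyrillic and Thai code pages.

```python
# Hypothetical pre-flight check; not part of InfoSphere Information Server.
# Maps a rule set name to the Python codec for the Windows code page
# that the rule set is assumed to require.
REQUIRED_CODECS = {
    "RUADDRL": "cp1251",  # Windows Cyrillic code page
    "THADDRL": "cp874",   # Windows Thai code page
}


def unencodable_chars(text, rule_set):
    """Return the characters in text that the rule set's code page cannot represent."""
    codec = REQUIRED_CODECS[rule_set]
    bad = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad
```

An empty result suggests that the text can be stored in that code page. Any returned characters would need to be cleaned or transliterated before standardization.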
Rule sets in the other category
Scope | Rule sets | Location of rule set files |
---|---|---|
International | COUNTRY: Determines the country or region of origin for incoming data. This rule set looks at area information, such as city, state or province, locality, and postal code. The rule set attempts to identify the country or region to which the information belongs and to assign the ISO territory code. Expected input: multinational address information with a default two-byte country code delimiter that is enclosed in the characters ZQ, for example: ZQUSZQ. | Designer client repository view |
United States-specific Expanded Company Name | EXPCOM: Parses company name information into up to eight match words (excluding words such as "the" and "and") and extracts distinguishing information such as: Expected input: a valid company name. | Designer client repository view |
Validation | VDATE: Verifies date formats, for example: | Designer client repository view |
Validation | VEMAIL: Verifies email formats, for example: | Designer client repository view |
Validation | VPHONE: Verifies United States, Canada, and Caribbean phone formats, for example: Note: A number sequence that begins with 1 is considered invalid. | Designer client repository view |
Validation | VTAXID: Provides rules that validate country-specific tax ID formats. Intended only for the United States. | Designer client repository view |
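The COUNTRY rule set expects a default country code that is enclosed in the characters ZQ, such as ZQUSZQ. The following Python sketch shows one way to build that delimiter outside the product. The function names are hypothetical, and placing the delimiter in front of the address text is an assumption made for illustration only.

```python
# Hypothetical helpers; not part of InfoSphere Information Server.

def country_delimiter(iso_code):
    """Wrap a two-character country code in ZQ, for example ZQUSZQ."""
    code = iso_code.strip().upper()
    # The source describes a two-byte code, so accept only two ASCII letters.
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError("expected a two-letter country code, got %r" % iso_code)
    return "ZQ" + code + "ZQ"


def with_default_country(address, iso_code):
    """Prefix address text with the default country delimiter.

    Assumption: the delimiter is placed before the address text.
    """
    return country_delimiter(iso_code) + " " + address
```

For example, `country_delimiter("us")` returns `ZQUSZQ`, which matches the format shown in the table.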
Rule sets in the sample category
Rule sets | Location of rule set files |
---|---|
GENPROD: Demonstrates how product description data can be processed through standardization. | InfoSphere Information Server file directory |
PHPROD: Demonstrates how pharmaceutical data can be processed through standardization. | InfoSphere Information Server file directory |
Rule set development package
- Standardization rules templates
  - Includes templates for the address, area, and name domains.
- Standardization rules development kit
  - Consists of a series of jobs that produce reports to assist in custom rules development.
- Standardization quality assessment kit
  - Consists of a series of jobs that produce reports to assist in evaluating standardization results.