Classifications

Classifications strengthen the contextual information that patterns provide by identifying that the underlying values belong to particular categories. Each rule set contains its own set of categories, which are called classes.

In IBM® InfoSphere® QualityStage®, records are represented as patterns. In the same way that a record consists of one or more values, patterns consist of one or more abstract characters, each of which represents a class. For example, a set of address data might include the record 123 N CHERRY HILL ROAD, which is represented by the pattern ^D++T. The following table shows the contextual information that each class in the pattern ^D++T provides.

Table 1. Example of a standard address pattern with the contextual information that each class provides

Input record    Class label    Contextual information that the class provides
123             ^              Value that includes only numbers
N               D              Street direction
Cherry          +              Value that includes only letters
Hill            +              Value that includes only letters
Road            T              Street type

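
The mapping from values to class labels in Table 1 can be sketched in a few lines of Python. This is only an illustration of the idea, not how QualityStage builds patterns internally: the CLASSIFICATIONS dictionary and the classify_value and build_pattern functions are hypothetical names, and the sketch handles only the two default classes shown in the table (^ and +).

```python
# Hypothetical classification table: value -> class label.
# D = street direction, T = street type (labels taken from Table 1).
CLASSIFICATIONS = {
    "N": "D", "NW": "D", "S": "D",
    "AVE": "T", "ST": "T", "RD": "T", "ROAD": "T",
}


def classify_value(value: str) -> str:
    """Return the class label for a single value."""
    key = value.upper()
    if key in CLASSIFICATIONS:
        return CLASSIFICATIONS[key]  # classified value, for example D or T
    if key.isdigit():
        return "^"                   # value that includes only numbers
    if key.isalpha():
        return "+"                   # value that includes only letters
    # Mixed values are outside the scope of this sketch.
    raise ValueError(f"no class in this sketch for value: {value!r}")


def build_pattern(record: str) -> str:
    """Concatenate the class label of every whitespace-separated value."""
    return "".join(classify_value(v) for v in record.split())


print(build_pattern("123 N CHERRY HILL ROAD"))  # ^D++T
```

Values that appear in the table receive their specific class label, and every other value falls back to a class that describes only its character content.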
Patterns contain the following types of classes:

Rule sets use classifications to identify and classify key values. For example, a rule set for address data might use classifications to categorize values that are street types (AVE, ST, RD) or directions (N, NW, S) by providing the following information:

Classifications are added and modified by editing the classifications table (previously called the .CLS file), by enhancing a rule set in the Standardization Rules Designer, or by using the user classification override.
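
As a rough mental model, a user override can be pictured as a second lookup table layered on top of the rule set's classifications table. The Python sketch below uses hypothetical names (BASE_TABLE, USER_OVERRIDES, lookup_class) and assumes that override entries are consulted before the base table. That precedence is an assumption made for illustration, not documented product behavior.

```python
from collections import ChainMap

# Hypothetical entries from a rule set's classifications table.
BASE_TABLE = {
    "AVE": "T", "ST": "T", "RD": "T",  # street types
    "N": "D", "NW": "D", "S": "D",     # directions
}

# Hypothetical user override that classifies one more street type.
USER_OVERRIDES = {"BLVD": "T"}

# Assumption: overrides are searched first, then the base table.
EFFECTIVE = ChainMap(USER_OVERRIDES, BASE_TABLE)


def lookup_class(value: str):
    """Return the class label for a value, or None if it is unclassified."""
    return EFFECTIVE.get(value.upper())


print(lookup_class("blvd"))    # T
print(lookup_class("St"))      # T
print(lookup_class("Cherry"))  # None
```

Keeping overrides in a separate layer leaves the base table untouched, so removing an override restores the original classification.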