Rule set definitions and data rule definitions

You can use data rule definitions or rule set definitions to create rule logic to evaluate your data.

When you build rule logic, you can create individual data rule definitions, or you can create rule set definitions that group several data rule definitions together.

Rule set definitions: Collection of data rule definitions.
  • Running a rule set is typically much faster than running the same rules one by one, because rule sets are executed by using unique processing algorithms on a parallel execution framework. Because processing and statistics are aggregated during execution, no additional steps are needed to aggregate exception information after the individual rules are processed.
  • Rule sets provide a broader range of statistical information. You can see multiple views of the output, such as which records break certain combinations of rules and which combinations of rules are broken together. With this additional level of detail, you can focus corrective attention on problem records and all of their associated issues at once, rather than responding sporadically to isolated rule exceptions. You can also look for patterns of issues that might not be apparent when you look at a single rule in isolation.
  • When you create a rule set definition, you add data rule definitions that are already created and defined. The individual data rule definitions cannot be modified in the rule set workspace. After you create a rule set definition, you generate a rule set. When you run the rule set, all the data rule definitions that are part of the rule set are executed at the same time and produce a comprehensive set of analysis results.
  • When you use rule set definitions, you do not need to include all of the logic within a single definition. Instead, you can divide the logic over multiple data rule definitions, each of which checks for a specific condition, and then group those definitions in a rule set definition.
The process of creating a rule set definition out of data rule definitions is shown in the following figure:
Figure 1. The process of creating a rule set definition out of data rule definitions
Data rule definitions: Individual definitions in which you construct the rule logic that analyzes your data.
  • When it is time to generate data rules, you generate a data rule from each individual data rule definition, and then run the data rules that you want, one rule at a time. The data that you run the rule against either passes or fails the check.

A major difference between a rule set and individual data rules is the results that they produce when they are executed. When you execute a rule set, all the rules in the rule set are evaluated against the data at the same time. Instead of a single pass or fail outcome, the data passes or fails a percentage of the rules in the rule set, and the results of all the rules are aggregated into rule set statistics. When you run a series of data rules one by one, the data either passes or fails each rule that you execute.
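The difference in results can be sketched in plain Python. This is only an illustration of the idea, not the product's API or execution engine: the rule names, the `run_rule` and `run_rule_set` helpers, and the sample records are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Rule = Callable[[Record], bool]

# Hypothetical rule definitions: each one checks a single condition.
rules: Dict[str, Rule] = {
    "division_code_exists": lambda r: bool(r.get("division", "").strip()),
    "description_not_blank": lambda r: bool(r.get("description", "").strip()),
}


def run_rule(rule: Rule, records: List[Record]) -> List[bool]:
    """Run one rule on its own: each record simply passes or fails."""
    return [rule(r) for r in records]


def run_rule_set(rule_set: Dict[str, Rule], records: List[Record]) -> dict:
    """Evaluate every rule against each record in one pass and aggregate."""
    percent_rules_met: List[float] = []
    failures_by_rule = {name: 0 for name in rule_set}
    for record in records:
        outcomes = {name: rule(record) for name, rule in rule_set.items()}
        for name, passed in outcomes.items():
            if not passed:
                failures_by_rule[name] += 1
        # Share of the rules in the set that this record satisfies.
        percent_rules_met.append(100.0 * sum(outcomes.values()) / len(outcomes))
    return {
        "percent_rules_met": percent_rules_met,
        "failures_by_rule": failures_by_rule,
    }


records = [
    {"division": "D1", "description": "Widget"},
    {"division": "", "description": "Gadget"},
    {"division": "", "description": " "},
]

single_rule_result = run_rule(rules["description_not_blank"], records)
rule_set_stats = run_rule_set(rules, records)
```

In this sketch, the single rule yields one pass or fail value per record, while the rule set yields a percentage of rules met per record together with failure counts per rule, without a separate aggregation step afterward.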

For example, you are maintaining a master product catalog and are monitoring a number of rules for that catalog. You have individual rules to ensure that the division and supplier codes exist and that they are part of their respective standard reference lists. You have rules to ensure that the product description is not blank and does not include an excessive string of random consonants. You also have a rule to ensure that a valid hazardous material flag exists. When you run the valid hazardous material flag rule in isolation, you find a small percentage of exceptions and can identify the records involved. If you run that rule in context with the other rules, you find that all records where the hazardous material flag does not exist (the exception records) are also records that fail the rule for a supplier code in the standard reference list and the rule for a product description without too many random consonants. Looking at the detailed output for the entire record, you recognize that the supplier code represents a test code. You can then take corrective action to ensure that such test records do not enter the system, and resolve several issues simultaneously.
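The catalog scenario can be mimicked with a small Python sketch that counts which combinations of rules fail together. Everything here is assumed for illustration: the reference lists, the consonant-run threshold of five, the field names, and the sample records are invented, and the code does not reflect how the product computes its statistics.

```python
from collections import Counter

# Hypothetical reference data and thresholds.
VALID_SUPPLIERS = {"S100", "S200"}
VALID_HAZMAT_FLAGS = {"Y", "N"}
MAX_CONSONANT_RUN = 5


def has_long_consonant_run(text: str, limit: int = MAX_CONSONANT_RUN) -> bool:
    """Return True if the text contains `limit` or more consecutive consonants."""
    run = 0
    for ch in text.lower():
        if ch.isalpha() and ch not in "aeiou":
            run += 1
            if run >= limit:
                return True
        else:
            run = 0
    return False


# Each rule checks one specific condition on a catalog record.
rules = {
    "supplier_in_reference_list": lambda r: r["supplier"] in VALID_SUPPLIERS,
    "description_readable": lambda r: bool(r["description"].strip())
    and not has_long_consonant_run(r["description"]),
    "hazmat_flag_valid": lambda r: r["hazmat"] in VALID_HAZMAT_FLAGS,
}

catalog = [
    {"id": 1, "supplier": "S100", "description": "Steel bolt", "hazmat": "N"},
    {"id": 2, "supplier": "TEST", "description": "xkcdqrtz", "hazmat": ""},
    {"id": 3, "supplier": "TEST", "description": "zzqwrt pl", "hazmat": ""},
    {"id": 4, "supplier": "S200", "description": "Paint thinner", "hazmat": "Y"},
]


def failed_rules(record: dict) -> frozenset:
    """Names of all rules that the record breaks."""
    return frozenset(name for name, rule in rules.items() if not rule(record))


# View 1: the hazardous material rule in isolation.
hazmat_exceptions = [r["id"] for r in catalog if not rules["hazmat_flag_valid"](r)]

# View 2: combinations of rules that are broken together.
broken_together = Counter(
    failed for failed in (failed_rules(r) for r in catalog) if failed
)
```

In isolation, the hazardous material rule only reports records 2 and 3 as exceptions. The combination view shows that those same records also break the supplier and description rules, which is the pattern that points to test records.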