Defining benchmarks

Benchmarks represent the threshold, or tolerance for error, associated with a specific condition such as the validation of a data rule. If a benchmark is not set at either the rule definition level or the data rule executable level, the generated statistics simply reflect how many records met or did not meet the rule, with no indication of whether that result is acceptable.

By establishing a benchmark, you indicate the point at which errors in the data should trigger an alert condition signaling a problem in the data. These marks are visible in the data rule output, along with additional indications of variance that can be used for subsequent evaluation.

Set benchmarks based on how you want to track the resulting statistics. If a data rule tests the valid business condition, the benchmark can reflect either the percent or count of records that meet the data rule, or the percent or count of records that do not.

For example, the business expects the Hazardous Material flag to be valid, with a tolerance for error of 0.01%. This can be expressed either as:
Benchmark:  did not meet % < 0.01%
or
Benchmark:  met % > 99.99%
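
The following sketch shows why the two forms are interchangeable: both measures are derived from the same rule output, so a result within tolerance on one is within tolerance on the other. The function and record counts here are illustrative assumptions, not product syntax.

# Minimal sketch (Python, illustrative names and values only):
# evaluating a data rule result against a 0.01% error tolerance,
# expressed either as "did not meet %" or as "met %".

def check_benchmark(total_records: int, records_met: int) -> str:
    """Return a status string for a rule result checked against the benchmark."""
    met_pct = 100.0 * records_met / total_records
    not_met_pct = 100.0 - met_pct

    # The two forms of the benchmark are equivalent:
    #   did not meet % < 0.01%   <=>   met % > 99.99%
    within_tolerance = not_met_pct < 0.01

    status = "within benchmark" if within_tolerance else "alert: benchmark exceeded"
    return f"met: {met_pct:.4f}%  did not meet: {not_met_pct:.4f}%  -> {status}"

# Example: 1,000,000 records, 150 of which fail the rule (0.015% > 0.01% tolerance)
print(check_benchmark(1_000_000, 1_000_000 - 150))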

As with naming standards, it is best to establish a standard for benchmark measurement to help ensure a consistent experience when monitoring ongoing data quality. Using a positive measure (met %) for the benchmark has the advantage that users see results expressed as achievement against the target (for example, 99.99% met).