Naming standards

All rule and rule set definitions, all executable data rules and rule sets, all metrics, and all global variables require a name. A name can either facilitate or hinder reuse and sharing of these components. Names can be freeform and quickly created when you are simply evaluating and testing conditions, but it is critical to identify an effective naming standard for ongoing quality monitoring.

Naming standards are intended to:

Consistency and clarity are the two key balancing factors in naming standards. It is easy to get wide variation in names, as shown in the following figure.

Figure 1. An example of simple naming standards

It is also easy to adopt a strong set of conventions, but rigid naming conventions reduce clarity. If you cannot easily interpret the meaning of a rule definition, you will most likely create something new instead of reusing a valid one.

Figure 2. An example of more rigid naming standards

Avoid embedding items that can be stored elsewhere (for example, your initials). Other fields, such as Description and Created By, can store these types of references, facilitate clarity and organization, and can be used for sorting and filtering.

A common naming approach is to use a structure like Prefix – Name – Suffix.

Prefix values can be used to facilitate sorting and organization of rules. Use a prefix that expresses the type or level of the rule. For example, the following is a breakdown of different types or levels of rules:
  • Completeness (Level 0), use CMP, L0
  • Value Validity (Level 1), use VAL, L1
  • Structural Consistency (Level 2), use STR, PK, FMT, L2
  • Conditional Validity (Level 3), use RUL, CON, L3
  • Cross-source or Transformational Consistency (Level 4), use XSC, TRN, L4

The use of a scheme like L0 to L4 allows easy sorting, but might be too cryptic. Abbreviations are clearer and still sort alphabetically, but they do not necessarily sort in level order.
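The sorting trade-off can be checked with a short sketch. It pairs each level code with one of the abbreviations listed above (the choice of one abbreviation per level is an assumption made for illustration) and shows the order that an alphabetical sort on each prefix style produces.

```python
# Prefix pairs taken from the list above; one abbreviation per level
# is picked arbitrarily for illustration.
PREFIXES = [("L0", "CMP"), ("L1", "VAL"), ("L2", "STR"), ("L3", "RUL"), ("L4", "XSC")]

def sorted_levels(style):
    """Return the rule levels (0-4) in the order an alphabetical sort
    on the chosen prefix style ("level" or "abbr") would list them."""
    index = 0 if style == "level" else 1
    ordered = sorted(PREFIXES, key=lambda pair: pair[index])
    return [int(pair[0][1:]) for pair in ordered]

by_level_code = sorted_levels("level")   # levels stay in sequence
by_abbreviation = sorted_levels("abbr")  # alphabetical, not level order
```

With these particular abbreviations, the level codes keep the sequence 0 through 4, while the abbreviations sort as CMP, RUL, STR, VAL, XSC, which places Level 3 ahead of Levels 2 and 1.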

Name Values help to identify the type of field and the type of rule (for example, SSN_Exists, SSN_Format, AcctBalance_InRange). The choice of a name will typically be based on the type of object (for example, Rule Definition, Rule Set, and so on).

Rule definitions
  • Type of field evaluated provides for better understanding
  • Names can range from a generic ‘template’ to specific
    • Data Exists could be a template for common reuse
    • SSN_InRange is specific to typical data
Rule set definitions
These typically range from a group of rules for the same data to a group of rules for some larger object.
  • SSN_Validation would include rule definitions for existence, completeness, format, valid range, and other tests specific to the field type
  • Customer_Validation would include rule or rule set definitions for all fields related to the customer such as name, SSN, date of birth, and other values
Rules
  • Including the table and column evaluated provides for better understanding. It might be necessary to include the schema, database, or file name as well.
  • Include the definition test type to readily identify how the rule is applied:
    • AcctTable.Name_Exists
    • CustTable.SSN_InRange
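One way to keep rule names consistent is to check them against a pattern. The following is a minimal sketch that assumes a Table.Column_TestType layout like the two examples above, with no underscores inside the column or test type. The pattern and the function name `parse_rule_name` are hypothetical and are not part of any product.

```python
import re

# Assumed layout: <Table>.<Column>_<TestType>. Illustrative convention only;
# adjust the pattern if your names also carry a schema or file name.
RULE_NAME = re.compile(
    r"(?P<table>\w+)\.(?P<column>[A-Za-z0-9]+)_(?P<test>[A-Za-z0-9]+)"
)

def parse_rule_name(name):
    """Split a rule name into table, column, and test type,
    or return None if the name does not fit the assumed layout."""
    match = RULE_NAME.fullmatch(name)
    return match.groupdict() if match else None

parsed = parse_rule_name("CustTable.SSN_InRange")
unparsed = parse_rule_name("test rule 1")  # freeform name, does not match
```

A check like this could be run over an exported list of rule names to flag those that drifted from the agreed standard.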
Rule sets
Including the schema and table, possibly with the column evaluated, provides a better understanding of the data source.
  • CustTable.SSN_Validation identifies the relevant table and column
  • CustDB.CustTable_Validation identifies only the relevant schema and table, because it evaluates multiple columns
Metrics
  • Can be applied to single or multiple rules or rule sets or other metrics
  • Include the type of measure evaluated
  • Include the measure interval (for example, day, week, month) if relevant
  • Where applied to a single rule or rule set, including the name of the rule or rule set helps understanding
    • AcctTable.Name_Exists_CostFactor_EOM identifies that there is a cost applied to this rule at end of month
    • CustDB.CustTable_Validation_DailyVariance_EOD identifies that there is a test of variance at end of day against prior day
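The metric names above follow a rule-or-rule-set name, then the measure, then the interval. A small helper can assemble them the same way. This is a sketch only: the function `metric_name` and the underscore separator are assumptions inferred from the two examples.

```python
def metric_name(target, measure, interval=None):
    """Join the rule or rule set name, the type of measure, and an
    optional measure interval with underscores (assumed separator)."""
    parts = [target, measure]
    if interval:
        parts.append(interval)
    return "_".join(parts)

monthly_cost = metric_name("AcctTable.Name_Exists", "CostFactor", "EOM")
daily_variance = metric_name("CustDB.CustTable_Validation", "DailyVariance", "EOD")
```

These two calls reproduce the example names AcctTable.Name_Exists_CostFactor_EOM and CustDB.CustTable_Validation_DailyVariance_EOD.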
Global variables
  • Are individual instances of variables used in multiple rule definitions
  • Distinguish these from specific local variables by including a prefix that helps users who are reading the rule definition understand that this is a global variable (for example, GLBL_)
  • Include the type of value or reference (for example, Balance, StateCode)
  • Include an identifier that conveys what information is associated with the value or reference (for example, Minimum_Balance, Master_StateCode)

Suffix values can help with filtering, clarify the type of rule, or establish iterative versions.
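The Prefix, Name, Suffix structure can be sketched as a small helper. The function `compose_name`, the underscore separator, and the version suffix V2 are all hypothetical and chosen only to illustrate the idea; the prefix FMT and the name SSN_Format come from the examples earlier in this topic.

```python
def compose_name(name, prefix=None, suffix=None, sep="_"):
    """Build a component name from an optional prefix, a required name,
    and an optional suffix, skipping any part that is not supplied."""
    parts = [part for part in (prefix, name, suffix) if part]
    return sep.join(parts)

# "V2" is an assumed suffix marking an iterative version.
versioned = compose_name("SSN_Format", prefix="FMT", suffix="V2")
plain = compose_name("SSN_Format")
```

Keeping the prefix and suffix optional lets quick test rules stay freeform while monitored rules follow the full structure.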