When I get dropped into a large codebase, I like to use a software metrics tool to take a macro look at how the code is organized. Over many years the object-oriented programming community and others have come up with a suite of metrics that can be computed for most programming languages. Architects and developers can use these metrics to detect bad "code smells", find code that would benefit from refactoring, or perform an impact analysis before a major refactoring gets underway. Many of these are well described in Agile Software Development, Principles, Patterns, and Practices by Robert C. Martin.
There has been relatively little research into defining metrics for rule-based systems. Here are the only two papers I could dig up; perhaps you know of others?
- Coupling and cohesion metrics for knowledge-based systems using frames and rules
- Applying metrics to rule-based systems
The traditional software metrics I find most useful are those that deal with code responsibility and dependencies (afferent and efferent coupling). These metrics are very useful when trying to understand the potential scope of a refactoring, for example. Determining code dependencies is relatively straightforward: a Java method that invokes 10 methods has a dependency on those 10 methods. Things are trickier in the rules world, because a rule may have 10 rules that must fire to set up its conditions (preconditions) and then 5 downstream rules that will fire once its actions are executed (postconditions). Through Static Rule Analysis, however, we may be able to detect some of these dependencies.
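One way to sketch this kind of analysis: if we can extract, for each rule, the set of facts its conditions test and the set of facts its actions write, then matching one rule's writes against another's reads gives a candidate precondition/postcondition graph. The `Rule` representation and the loan-approval rule names below are hypothetical, purely for illustration:

```python
from dataclasses import dataclass, field

# Hypothetical flattened view of a rule: the facts its conditions test
# (the "when" part) and the facts its actions assert or modify (the "then" part).
@dataclass
class Rule:
    name: str
    condition_facts: set = field(default_factory=set)
    action_facts: set = field(default_factory=set)

def rule_dependencies(rules):
    """Return {rule name: set of upstream rule names} by matching each
    rule's condition facts against every other rule's action facts."""
    deps = {r.name: set() for r in rules}
    for downstream in rules:
        for upstream in rules:
            if upstream is downstream:
                continue
            # If an upstream action writes a fact that this rule's
            # conditions test, the upstream rule may enable this one.
            if upstream.action_facts & downstream.condition_facts:
                deps[downstream.name].add(upstream.name)
    return deps

rules = [
    Rule("score-applicant", condition_facts={"application"}, action_facts={"score"}),
    Rule("approve-loan",    condition_facts={"score"},       action_facts={"approval"}),
    Rule("notify-customer", condition_facts={"approval"},    action_facts={"notice"}),
]
deps = rule_dependencies(rules)
# approve-loan depends on score-applicant; notify-customer on approve-loan
```

This is a static over-approximation: it reports rules that *may* enable each other, not rules that necessarily will at runtime.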
Rules also have dependencies on the Business Object Model (BOM), parameters, variables and the Vocabulary (VOC), and these are much easier to determine because they are explicit in each rule.
Below are some ideas for metrics that might be useful for rule-based systems.
Afferent Coupling (Responsibility): Ca
- The number of rules that will fire when a rule's actions are executed
- The number of rules that reference a VOC phrase
- The number of rules that reference a BOM member
Efferent Coupling (Independence): Ce
- The number of rules that are required to fire for a rule's conditions to be true
- The number of BOM/VOC references per rule
Instability is defined as the ratio of efferent coupling to total coupling. Of course it is easy to create a rule with an instability index of zero -- just ensure the rule can never be fired! Instability is therefore a tradeoff between responsibility and independence.
I = Ce / (Ce + Ca)
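The formula is trivial to compute once Ce and Ca are counted; the only wrinkle is a rule with no couplings at all, which I've chosen (as an assumption, not a standard) to treat as maximally stable:

```python
def instability(ce, ca):
    """I = Ce / (Ce + Ca).

    ce: efferent coupling (rules this rule depends on)
    ca: afferent coupling (rules that depend on this rule)
    A rule with no couplings at all is treated here as maximally
    stable (I = 0), which is a convention rather than a definition.
    """
    total = ce + ca
    return ce / total if total else 0.0

# A rule that depends on 5 upstream rules and enables 10 downstream
# rules leans toward the "responsible" (stable) end of the scale.
i = instability(5, 10)  # 5 / 15
```

Values near 0 indicate a rule many others depend on (change it with care); values near 1 indicate a dependent rule that is comparatively safe to change.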
Large or Complex Rules
The presence of many large or complex rules may indicate a "rule smell". A few ideas:
- Rules that reference many VOC elements
- Rules that require many bindings
- Number of characters
- Number of conditions
- Number of actions
- Number of variable or parameter references
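These size measures could be rolled into a single weighted score and checked against a threshold. The `RuleStats` shape, the weights, and the threshold below are all illustrative assumptions, not calibrated values; in a real BRMS the counts would come from the rule's parsed representation:

```python
from dataclasses import dataclass

# Hypothetical per-rule counts, as they might be extracted from a rule's
# parsed representation.
@dataclass
class RuleStats:
    name: str
    conditions: int
    actions: int
    voc_references: int
    bindings: int

def complexity_score(r, weights=(1.0, 1.0, 0.5, 2.0)):
    """Weighted size measure; the weights are illustrative, not calibrated.
    Bindings are weighted highest on the guess that each binding adds the
    most cognitive load."""
    w_cond, w_act, w_voc, w_bind = weights
    return (w_cond * r.conditions + w_act * r.actions
            + w_voc * r.voc_references + w_bind * r.bindings)

def rule_smells(stats, threshold=20.0):
    """Flag rules whose weighted complexity exceeds a tunable threshold."""
    return [r.name for r in stats if complexity_score(r) > threshold]

stats = [
    RuleStats("monster-rule", conditions=12, actions=6, voc_references=10, bindings=4),
    RuleStats("tidy-rule",    conditions=2,  actions=1, voc_references=2,  bindings=0),
]
flagged = rule_smells(stats)  # only monster-rule exceeds the threshold
```

A raw character count could be added as another weighted term, but the structural counts above are probably better proxies for how hard a rule is to understand.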
Packages, Ruleflows and Rule Projects
Zooming out from the individual rules themselves, many of these metrics can also be applied to packages (an average measure for the rules within a package), ruleflows (because they also have pre- and postconditions) and Rule Projects (average measures as well as reporting dependencies).
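Rolling a per-rule metric up to the package level is then just an aggregation step. The package and rule names below are made up, and instability is used as the example metric, but any of the per-rule measures above would aggregate the same way:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-rule metric records: (package, rule name, instability).
metrics = [
    ("pricing",     "base-rate", 0.25),
    ("pricing",     "discount",  0.75),
    ("eligibility", "age-check", 0.0),
]

def package_averages(records):
    """Average a per-rule metric over each package."""
    by_pkg = defaultdict(list)
    for package, _rule, value in records:
        by_pkg[package].append(value)
    return {pkg: mean(values) for pkg, values in by_pkg.items()}

averages = package_averages(metrics)  # pricing averages 0.5
```

An average can of course hide one monster rule among many tidy ones, so reporting a maximum (or a distribution) alongside the mean is probably worthwhile.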
I've outlined a few ideas for rule metrics, but what do you think? Is this approach applicable to rule-based systems? Do you have any techniques you use to detect bad "rule smells"?