Rule filter constraints
To limit the rules that are included in a model, you can apply rule filters in an Associations training run or in a Sequence Rules training run. The following types of constraints are available:
- Count constraints
- Item constraints
- Range constraints
Count constraints
If you have specified an upper limit for the number of rules to be included in a generated model, you can specify count constraints to determine which rules are selected as the 'best' rules. You can identify each count constraint by its integer ID or by its string keyword.
Count constraint | Integer | String |
---|---|---|
Accessible volume | 28 | accessibleVolume |
Weight of the rule body | 22 | bodyWeight |
Confidence | 2 | confidence |
Weight of the rule head | 21 | headWeight |
Weight of each item in rules or item sets | 24 | itemWeight |
Lift | 3 | lift |
Number of items | 5 | nbOfItems |
Number of item sets | 6 | nbOfItemsets |
Weight of the rule | 20 | ruleWeight |
Support | 1 | support |
Support*confidence | 4 | suppTimesConf |
Support times weight of the rule | 23 | suppTimesRuleWeight |
Elapsed time between adjacent parts of the rule | 10 | stepWindow |
Total elapsed time from the beginning to the end of the rule | 9 | totalWindow |
Average weight of the transactions (Associations) or transaction groups (Sequences) that support the rule | 29 | transactionWeight |
Example of count constraints
This example specifies that the model must not contain more than 100 rules. If more rules are found, the 100 'best' rules are included in the model. The best rules are determined by the following criteria, in this order:
- Number of items in rules: The best rules are the shortest rules, that is, the rules that include the lowest number of items.
- Largest lift value: If the previous criterion does not produce a clear ranking of the rules, for example, if there are 46 rules with 2 items and 2375 rules with 3 items, the lift value decides. Among the rules that have the same number of items, the rules with the largest lift value are selected as best rules.
IDMMX.DM_RuleFilter()
..DM_setMaxNumRules(100)
..DM_addCountConstr(1,'nbOfItems','ascending')
..DM_addCountConstr(2,'lift','descending')
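The ranking that this filter describes can be illustrated outside the database. The following Python sketch is not part of the IDMMX API; the `Rule` type, the `select_best_rules` function, and the sample rules are hypothetical. It assumes that the count constraints act as sort keys in priority order and that the rule limit then keeps the top entries.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical stand-in for a mined rule; not an IDMMX type."""
    name: str
    nb_of_items: int
    lift: float


def select_best_rules(rules, max_num_rules):
    # Priority 1: number of items, ascending (shortest rules first).
    # Priority 2: lift, descending (negated so that larger lift sorts first).
    ranked = sorted(rules, key=lambda r: (r.nb_of_items, -r.lift))
    return ranked[:max_num_rules]


rules = [
    Rule("A => B", 2, 1.4),
    Rule("C => D", 2, 2.1),
    Rule("A + B => C", 3, 3.0),
    Rule("A + C => D", 3, 1.2),
]
best = select_best_rules(rules, max_num_rules=3)
print([r.name for r in best])
```

With a limit of 3, both two-item rules are kept and the remaining place goes to the three-item rule with the larger lift value.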
Item constraints
With item constraints, you can limit the number of rules to be generated in a model. You can include or exclude specified items in the rule body, in the rule head, or in the complete rule.
If you want to include the specified items, the rules that contain at least one of the specified items are generated. Including specified items is called a positive item constraint.
If you want to exclude the specified items, the rules that contain any of the specified items are discarded from the results. Excluding specified items is called a negative item constraint.
You can combine positive item constraints and negative item constraints that apply to the rule body, the rule head, or to the complete rule by using the boolean predicates AND or OR. Item constraints that are connected by the predicate AND must be matched simultaneously. If you apply item constraints that are connected by the predicate OR, at least one of the item constraints must be satisfied.
Item constraints also work with the categories of a taxonomy. Taxonomies are classes or categories of items. For example, if Drink is a taxonomy parent of the items Juice and Milk, and the item Juice is a taxonomy parent of the item Orange juice, you can specify one of the available item constraints for the taxonomy parent Drink. This constraint affects any occurrence of the items Orange juice, Juice, Milk, or Drink in the data.
If you want to include the specified categories, only rules that contain at least one of the specified categories or at least one of their direct or indirect members are generated. If you want to exclude the specified categories, the rules that contain the specified categories or their direct or indirect members are discarded from the result.
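How a category constraint reaches direct and indirect members can be sketched in a few lines of Python. This is an illustration only, not the product's implementation. The `TAXONOMY` mapping reuses the Drink example above, and the function names are hypothetical.

```python
# Hypothetical taxonomy: each parent maps to its direct children.
TAXONOMY = {
    "Drink": ["Juice", "Milk"],
    "Juice": ["Orange juice"],
}


def category_members(category, taxonomy):
    """Return the category name plus all of its direct and indirect members."""
    members = {category}
    stack = [category]
    while stack:
        for child in taxonomy.get(stack.pop(), []):
            if child not in members:
                members.add(child)
                stack.append(child)
    return members


def rule_touches_category(rule_items, category, taxonomy):
    """True if the rule contains the category or any member of it."""
    return not category_members(category, taxonomy).isdisjoint(rule_items)


print(sorted(category_members("Drink", TAXONOMY)))
print(rule_touches_category({"Bread", "Orange juice"}, "Drink", TAXONOMY))
```

A positive category constraint keeps the rules for which `rule_touches_category` is true. A negative category constraint discards them.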
If you use name mappings, you can use the original item names or their name mapping to define item constraints.
Item constraint | Integer | String |
---|---|---|
The item must exist in the rule body. | 1 | isInBody |
The item must exist in the rule head. | 2 | isInHead |
The item must exist in the rule. | 3 | isInRule |
The item must not exist in the rule body. | -1 | notInBody |
The item must not exist in the rule head. | -2 | notInHead |
The item must not exist in the rule. | -3 | notInRule |
At least one item of this category, or the category name, must exist in the rule body. | 11 | categoryIsInBody |
At least one item of this category, or the category name, must exist in the rule head. | 12 | categoryIsInHead |
At least one item of this category, or the category name, must exist in the rule. | 13 | categoryIsInRule |
An item of this category must not exist in the rule body. | -11 | categoryNotInBody |
An item of this category must not exist in the rule head. | -12 | categoryNotInHead |
An item of this category must not exist in the rule. | -13 | categoryNotInRule |
Example of item constraints
Item constraints that include an identical third argument are concatenated by the logical OR. Constraint groups that include different third arguments are concatenated by the logical AND. Therefore, the boolean predicate that matches the rule filter below looks like this:
- ('Roquefort' is included in the rule head or 'Camembert' is included in the rule head)
AND
- (Any article of the class 'alcoholic drinks' or the class name itself is included in the rule head)
AND
- (Neither an article from the class 'alcoholic drinks' nor the class name itself is in the rule body)
AND
- (Neither an article from the class 'non food' nor the class name itself is anywhere in the rule).
IDMMX.DM_RuleFilter()
..DM_setItemConstr('Roquefort','isInHead',1)
..DM_setItemConstr('Camembert','isInHead',1)
..DM_setItemConstr('alcoholic drinks','categoryIsInHead',2)
..DM_setItemConstr('alcoholic drinks','categoryNotInBody',3)
..DM_setItemConstr('non food','categoryNotInRule',4)
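The grouping logic of this filter (OR within a group number, AND across group numbers) can be checked with a small Python sketch. Nothing here is IDMMX code. The category contents in `CATEGORIES` and the sample rules are invented for illustration, and the categories are assumed to be already expanded to include the category name and all of its members.

```python
from collections import defaultdict

# Hypothetical category contents, including the category name itself.
CATEGORIES = {
    "alcoholic drinks": {"alcoholic drinks", "Red wine", "Beer"},
    "non food": {"non food", "Napkins"},
}

# (item or category, constraint keyword, group number), as in the filter above.
CONSTRAINTS = [
    ("Roquefort", "isInHead", 1),
    ("Camembert", "isInHead", 1),
    ("alcoholic drinks", "categoryIsInHead", 2),
    ("alcoholic drinks", "categoryNotInBody", 3),
    ("non food", "categoryNotInRule", 4),
]


def constraint_holds(name, kind, body, head):
    # Category constraints match the category name or any of its members.
    targets = CATEGORIES[name] if kind.startswith("category") else {name}
    if kind.endswith("Body"):
        scope = body
    elif kind.endswith("Head"):
        scope = head
    else:
        scope = body | head
    present = not targets.isdisjoint(scope)
    negative = "notin" in kind.lower()
    return present != negative


def rule_passes(body, head, constraints):
    groups = defaultdict(list)
    for name, kind, group in constraints:
        groups[group].append(constraint_holds(name, kind, set(body), set(head)))
    # OR inside each group, AND across the groups.
    return all(any(results) for results in groups.values())


print(rule_passes({"Baguette"}, {"Roquefort", "Red wine"}, CONSTRAINTS))
print(rule_passes({"Beer"}, {"Camembert", "Red wine"}, CONSTRAINTS))
```

The first rule satisfies all four groups. The second rule is rejected because an article of the class 'alcoholic drinks' occurs in the rule body.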
Range constraints
With range constraints, you can limit the search space for new rules by specifying valid ranges for particular numeric properties of a rule, for example, the support value or the confidence value. You can identify each range constraint by its integer ID or by its string keyword.
Range constraint | Integer | String |
---|---|---|
Accessible volume | 28 | accessibleVolume |
Weight of the rule body | 22 | bodyWeight |
Confidence | 2 | confidence |
Weight of the rule head | 21 | headWeight |
Weight of each item in rules or item sets | 24 | itemWeight |
Lift | 3 | lift |
Number of items | 5 | nbOfItems |
Number of item sets | 6 | nbOfItemsets |
Number of items in the rule body | 7 | nbOfBodyItems |
Number of items in the rule head | 8 | nbOfHeadItems |
Weight of the rule | 20 | ruleWeight |
Support | 1 | support |
Support*confidence | 4 | suppTimesConf |
Support times weight of the rule | 23 | suppTimesRuleWeight |
Elapsed time between adjacent parts of the rule | 10 | stepWindow |
Total elapsed time from the beginning to the end of the rule | 9 | totalWindow |
Example of range constraints
This example specifies that the rules that meet the following criteria are included in the model:
- Support value of 2.7% or higher
- Confidence value of 50% or higher
- Lift value smaller than or equal to 0.5, or greater than or equal to 2.0
- Consists of 2 to 4 items (counting rule body and rule head)
- 1 head item (more than 0 and less than 2)
IDMMX.DM_RuleFilter()
..DM_addRangeConstr('support',2.7,100.0,'isInClosed')
..DM_addRangeConstr('confidence',50.0,100.0,'isInClosed')
..DM_addRangeConstr('lift',0.5,2.0,'notInOpen')
..DM_addRangeConstr('nbOfItems',2,4,'isInClosed')
..DM_addRangeConstr('nbOfHeadItems',0,2,'isInOpen')
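The three interval keywords used in this example can be read as follows. The Python sketch below is an interpretation based on the criteria listed above, not the product's implementation. It covers only the keywords that appear in the example, and it assumes that a rule must satisfy all range constraints at the same time. The function names and the sample property values are hypothetical.

```python
def in_range(value, low, high, mode):
    """Evaluate one range constraint for a numeric rule property."""
    if mode == "isInClosed":   # low <= value <= high
        return low <= value <= high
    if mode == "isInOpen":     # low < value < high
        return low < value < high
    if mode == "notInOpen":    # value <= low or value >= high
        return not (low < value < high)
    raise ValueError(f"keyword not covered by this sketch: {mode}")


# The constraints of the example above: (property, low, high, keyword).
RANGE_CONSTRAINTS = [
    ("support", 2.7, 100.0, "isInClosed"),
    ("confidence", 50.0, 100.0, "isInClosed"),
    ("lift", 0.5, 2.0, "notInOpen"),
    ("nbOfItems", 2, 4, "isInClosed"),
    ("nbOfHeadItems", 0, 2, "isInOpen"),
]


def rule_in_ranges(props, constraints):
    # Assumption: all range constraints must hold simultaneously.
    return all(in_range(props[name], low, high, mode)
               for name, low, high, mode in constraints)


rule = {"support": 3.0, "confidence": 60.0, "lift": 2.0,
        "nbOfItems": 3, "nbOfHeadItems": 1}
print(rule_in_ranges(rule, RANGE_CONSTRAINTS))
print(rule_in_ranges({**rule, "lift": 1.2}, RANGE_CONSTRAINTS))
```

A lift value of exactly 2.0 passes because the excluded interval is open, whereas a lift value of 1.2 falls inside the excluded interval and the rule is rejected.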