Usage of association rules
Usually, the market basket analysis runs on a large fact table or on a part of it.
Such a table contains at least two columns: a transaction ID column or time ID column, and an item ID column. Together, these columns record each transaction, that is, each purchase order, with its purchased items. To save disk space, both columns are often numeric. Additionally, the relation between the item ID and the name or description of the purchased item can be stored in a separate table.
By searching such a basic table, the market basket analysis can find frequent patterns, for example, groups of items that are purchased together in many transactions. The number of transactions that contain a pattern is called the support for this pattern.
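As a minimal sketch of how support can be counted from a two-column table, the following Python example groups rows by transaction and counts the transactions that contain a pattern. The sample rows and the function names `group_transactions` and `support` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical fact table: one (transaction_id, item_id) row per purchased item.
rows = [
    (1, 10), (1, 20), (1, 30),
    (2, 10), (2, 20),
    (3, 10), (3, 30),
    (4, 20), (4, 30),
    (5, 10), (5, 20), (5, 30),
]

def group_transactions(rows):
    """Collect the set of item IDs that belong to each transaction ID."""
    baskets = defaultdict(set)
    for transaction_id, item_id in rows:
        baskets[transaction_id].add(item_id)
    return dict(baskets)

def support(pattern, baskets):
    """Count the transactions that contain every item of the pattern."""
    pattern = set(pattern)
    return sum(1 for items in baskets.values() if pattern <= items)

baskets = group_transactions(rows)
print(support({10, 20}, baskets))      # transactions 1, 2, and 5 contain both items
print(support({10, 20, 30}, baskets))  # transactions 1 and 5 contain all three items
```

Here the support is an absolute transaction count, as in the definition above. Some tools report it as a fraction of all transactions instead.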
When a table is analyzed, a specified minimum support is used. Patterns whose support is lower than this minimum are excluded from the result.
The minimum support affects the result in the following ways:
- The number of frequent patterns that are detected decreases as the minimum support increases
- The number of frequent patterns that are detected can grow combinatorially if the minimum support is set too low
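The effect of the minimum support can be illustrated with a small sketch. The example below enumerates every candidate pattern by brute force, which is practical only for tiny inputs and is not the algorithm that a real market basket analysis uses. The sample baskets and the function name `frequent_patterns` are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical transactions, each given as the set of purchased items.
baskets = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]

def frequent_patterns(baskets, min_support):
    """Return {pattern: support} for patterns whose support is at least min_support.

    Brute-force enumeration of all item combinations, for illustration only.
    """
    items = sorted(set().union(*baskets))
    result = {}
    for size in range(1, len(items) + 1):
        for pattern in combinations(items, size):
            count = sum(1 for basket in baskets if set(pattern) <= basket)
            if count >= min_support:
                result[pattern] = count
    return result

# Count the frequent patterns for several minimum support values.
counts = {m: len(frequent_patterns(baskets, m)) for m in (1, 2, 3, 4, 5)}
print(counts)
```

With only three distinct items there are at most seven candidate patterns. With n distinct items there are 2^n - 1 candidates, which is why a minimum support that is too low can make the result unmanageably large.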
Rules are defined based on frequent patterns.
From a pattern (A B C), the following rules are implied:
- (A B)=>(C)
- (A C)=>(B)
- (B C)=>(A)
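The derivation of these rules from a pattern can be sketched as follows. The function name `rules_from_pattern` is hypothetical, and the sketch assumes that each rule has exactly one item on the right side, as in the list above. Other splits of a pattern are not covered.

```python
def rules_from_pattern(pattern):
    """Derive rules with a single item on the right side from a pattern.

    Each rule is returned as a (left_side, right_side) pair of tuples.
    """
    items = tuple(pattern)
    rules = []
    # Walk from the last item to the first so that (A B)=>(C) comes first.
    for i in range(len(items) - 1, -1, -1):
        left_side = items[:i] + items[i + 1:]
        right_side = (items[i],)
        rules.append((left_side, right_side))
    return rules

for left_side, right_side in rules_from_pattern(("A", "B", "C")):
    print(f"({' '.join(left_side)})=>({' '.join(right_side)})")
```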
A rule has the same support as the frequent pattern from which it is derived. A rule is also characterized by its confidence. Confidence is the probability that a transaction that contains the items on the left side of the rule also contains the items on the right side. To prevent the creation and storage of too many rules, you can specify a minimum confidence.
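As a rough sketch, confidence can be estimated as the support of the whole pattern divided by the support of the left side of the rule. The sample baskets, the function names `support`, `confidence`, and `filter_rules`, and the threshold values are assumptions made for illustration.

```python
# Hypothetical transactions, each given as the set of purchased items.
baskets = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]

def support(pattern, baskets):
    """Count the transactions that contain every item of the pattern."""
    pattern = set(pattern)
    return sum(1 for basket in baskets if pattern <= basket)

def confidence(left_side, right_side, baskets):
    """Share of the transactions containing left_side that also contain right_side."""
    left_support = support(left_side, baskets)
    if left_support == 0:
        return 0.0  # the left side never occurs, so the rule has no evidence
    return support(set(left_side) | set(right_side), baskets) / left_support

def filter_rules(rules, baskets, min_confidence):
    """Keep only the rules whose confidence reaches min_confidence."""
    return [
        (left, right)
        for left, right in rules
        if confidence(left, right, baskets) >= min_confidence
    ]

rules = [
    (("A", "B"), ("C",)),
    (("A", "C"), ("B",)),
    (("B", "C"), ("A",)),
]
print(confidence(("A", "B"), ("C",), baskets))  # 2 of the 3 baskets with A and B also hold C
print(len(filter_rules(rules, baskets, 0.6)))
print(len(filter_rules(rules, baskets, 0.7)))
```

In this sample, every rule has a confidence of 2/3, so a minimum confidence of 0.6 keeps all three rules and a minimum confidence of 0.7 keeps none.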