Sequence rules, sequences, and item sets have various characteristics.
Some of the characteristics are shared between the different views, for example,
Support. Other characteristics are specific, for example, Lift is specific
to the Sequence Rules View.
The following list covers the characteristics of sequence rules, sequences,
and item sets in alphabetical order. Table 1 shows
an overview of the characteristics and their appropriate views.
- Absolute support
- Depending on the
view, the absolute support value reflects the number of occurrences of a sequence
rule or the number of occurrences of a sequence in a model.
- Body
- A part of a sequence rule. In the following example, the sequence A >>>
B1, B2 represents the body of the sequence rule.
A >>> B1, B2 ==> C1, C2
- Confidence
- The confidence of a sequence
rule indicates its strength or reliability. The confidence is defined as the
percentage of transaction groups that support the sequence rule out of all
transaction groups that support the rule body. A transaction group supports
the rule body if it contains the item sets of the rule body.
In the following sequence rule, the confidence value is 60%. This means that in 60% of all transaction groups that contain swimsuits and beach towels, a later transaction contains sun glasses.
Sequence Rule:
[Swimsuits] + [Beach towels] ⇒ [Sun glasses]
Support=24% Confidence=60% Lift=2.0
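The confidence definition above can be sketched in a few lines of Python. This is only an illustration: the function name and the counts are hypothetical, chosen so that the result matches the values shown in the example rule.

```python
def confidence(groups_supporting_rule: int, groups_supporting_body: int) -> float:
    """Share of the body-supporting transaction groups that also support the whole rule."""
    if groups_supporting_body == 0:
        raise ValueError("no transaction group supports the rule body")
    return groups_supporting_rule / groups_supporting_body


# Hypothetical counts: 100 transaction groups in total,
# 40 of them support the rule body, 24 support the whole rule.
total_groups = 100
body_groups = 40
rule_groups_count = 24

print(f"Support    = {rule_groups_count / total_groups:.0%}")
print(f"Confidence = {confidence(rule_groups_count, body_groups):.0%}")
```

With these assumed counts, 24 of the 40 transaction groups that support the body also support the rule, which corresponds to the confidence of 60%.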
- Graphical representation
- The graphical representation of a sequence rule.
In the tabular form of the Sequence Rules View, you can include the graphical representation of the sequence rules. One graph represents one sequence rule. The item sets of a sequence rule are represented as nodes, and the steps are represented as arrows: sequence steps as thin arrows and rule steps as highlighted arrows. The graphs are static; you cannot move the item sets or the arrows.
- Item sets
- The nodes that represent the item sets are colored depending on the item
set color that you have specified in the Properties notebook.
You can specify
the characteristics that are used as item set labels. The item set labels
are displayed in the graphical representation of the sequence rules.
- Steps
- You can differentiate between the following steps:
- Sequence step
- The sequence step is indicated by the thin arrow. It represents the steps
from one item set to the next item set within the sequence. Sequence steps
occur in the rule body.
- Rule step
- The rule step is indicated by the highlighted arrow. It represents the step from the sequence to the item set. Rule steps lead from the rule body to the rule head.
By default, the steps are labeled with the measure Time
Mean. In the Properties notebook, you can replace this measure with the following
measures:
- Time Mean+-Standard Dev.
- Time Min.
- Time Max.
You can show or hide the step labels in the Sequence Rules View by clicking View => Labels for Steps on the menu bar of the Sequences Visualizer. If Labels for Steps is selected in the View menu, the step labels are displayed in the Sequence Rules View.
If you include the graphical representation of the sequence rules in the Sequence Rules View, the legend additionally shows the following information:
- The item set color that you selected in the Properties notebook
- The measure that you selected for the step
The following figure shows the Sequence Rules View with the graphical
representation of the sequence rules and the extended legend information.
Figure 1. The Sequence Rules View with the graphical
representation of the sequence rules
- Group
- Rule groups help you to distinguish sets of sequence rules that have no direct or indirect relationship to each other.
If two sequence rules, for example, R₁ and R₂, share at least one item set (regardless of whether the item set appears in the rule head or in the rule body), these sequence rules are directly related (R₁ ~ R₂). Therefore, they belong to the same rule group.
The
rule group of a given sequence rule R₁ consists of all sequence rules in the
model that are directly or indirectly related to R₁. This means that a rule
group contains all sequence rules that are connected via a chain of directly
related sequence rules in the model.
The figure below illustrates the
direct and the indirect relationship of sequence rules. Each of the sequence
rules has a direct relationship to another sequence rule because they contain
the same item set. For example, sequence rule number 1 and sequence rule number
2 contain the item set L.
Sequence rule number 2 and sequence rule number 4 are not directly related because they do not contain the same item set. However, they are indirectly related based on the direct relationship of sequence rule number 2 and sequence rule number 3 (both contain the item set D) and the direct relationship of sequence rule number 3 and sequence rule number 4 (both contain the item set S).
For example, there might
be a model that contains the following sequence rules that include the item
sets A, B, C, D, E, F, G, and H:
1. A >>> B ==> C
2. D ==> E
3. D ==> A
4. F >>> F ==> G
5. F ==> H
Sequence rules 1, 2, and 3 belong to the same rule group because sequence rule number 3 contains item sets that are also included in sequence rule number 1 (item set A) and sequence rule number 2 (item set D).
Sequence rules 4 and 5 belong to a different rule group because they share the item set F with each other but are not linked to any item set of sequence rule number 1, sequence rule number 2, or sequence rule number 3.
Item sets that belong to sequence rules in different rule groups are not typically contained in the same transaction group. For example, if you are looking at repair histories, you can interpret this as the breakdowns of the corresponding parts not depending on each other. However, if you sort the sequence rules by rule group, you can easily detect item sets (product parts) that typically break down in sequence.
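The grouping described above amounts to finding connected components: rules are nodes, and two rules are linked if they share an item set. The following Python sketch applies this idea to the five example rules. The function name and the data layout (a dictionary that maps a rule number to the item sets of its body and head) are assumptions made for illustration, not part of the product.

```python
from collections import defaultdict


def rule_groups(rules: dict[int, set[str]]) -> list[list[int]]:
    """Return the rule groups: rules linked directly or via a chain of shared item sets."""
    parent = {rule_id: rule_id for rule_id in rules}

    def find(rule_id: int) -> int:
        # Follow parent links to the representative of the group (with path halving).
        while parent[rule_id] != rule_id:
            parent[rule_id] = parent[parent[rule_id]]
            rule_id = parent[rule_id]
        return rule_id

    first_rule_with: dict[str, int] = {}
    for rule_id, item_sets in rules.items():
        for item_set in item_sets:
            if item_set in first_rule_with:
                # Direct relationship: both rules contain this item set.
                parent[find(rule_id)] = find(first_rule_with[item_set])
            else:
                first_rule_with[item_set] = rule_id

    groups = defaultdict(list)
    for rule_id in rules:
        groups[find(rule_id)].append(rule_id)
    return sorted(sorted(members) for members in groups.values())


# Item sets of each example rule, regardless of body or head position.
example_rules = {
    1: {"A", "B", "C"},  # A >>> B ==> C
    2: {"D", "E"},       # D ==> E
    3: {"D", "A"},       # D ==> A
    4: {"F", "G"},       # F >>> F ==> G
    5: {"F", "H"},       # F ==> H
}
print(rule_groups(example_rules))  # [[1, 2, 3], [4, 5]]
```

Rules 1 and 2 share no item set, but both are directly related to rule 3, so all three end up in one group.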
- Head
- A part of a sequence rule. In the following example, the item
set C1, C2 represents the head of the sequence rule.
A >>> B1, B2 ==> C1, C2
- ID
- The identification of a sequence rule, a sequence, or an item set.
- In Rules as Body
- The number of sequence rules that include a particular sequence in the
rule body.
- In Rules as Head
- The number of sequence rules that include a particular item set in the
rule head.
- Item Set
- An unordered set of items. An item set can include one or more items.
- Item sets in Rule Body
- The number of item sets that are included in the rule body.
- Item sets in Rule Head
- The number of item sets that are included in the rule head.
- Items in Set
- The number of items in an item set.
- Item Sets in Sequence
- The number of item sets in a sequence.
- Lift
- For sequence rules, the lift value shows
the difference between the sequence rule and the sum of the sequence lift
value and the item sets lift value.
For sequences, the lift value shows the difference between the sequence and the sum of the different parts of a sequence. This means that if the lift value of a sequence is greater than 1, previous item sets in a sequence are related to subsequent item sets. The occurrence of the previous item sets makes the occurrence of the subsequent item sets more likely.
For item sets, the lift value shows the difference between the item set and the sum of the different parts of the item set. This means that if the lift value of an item set is greater than 1, the items in the item set are related.
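This topic does not state a formula for lift. The following Python sketch therefore assumes the definition that is commonly used for association rules: the observed support of the rule divided by the support that you would expect if the body and the head occurred independently of each other. The product might compute the value differently. The support values are hypothetical and were chosen to be consistent with the example values Support=24%, Confidence=60%, and Lift=2.0.

```python
def lift(rule_support: float, body_support: float, head_support: float) -> float:
    """Observed rule support relative to the support expected under independence (assumed definition)."""
    return rule_support / (body_support * head_support)


# Hypothetical supports: the rule occurs in 24% of all transaction groups,
# its body in 40%, and its head in 30%.
value = lift(rule_support=0.24, body_support=0.40, head_support=0.30)
print(round(value, 2))
```

Under this assumed definition, a value greater than 1 means that the body and the head occur together more often than independent occurrence would suggest.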
- Number of Rules
- The number of sequence rules that include a particular item set in the
rule body or in the rule head.
- Sequence
- An ordered list of item sets. The item sets are ordered by time.
- Sequence rule
- A sequence rule consists of a sequence of item sets in the
rule body leading to an item set in the rule head.
The sequence of item
sets in the rule body influences the item set in the rule head.
A sequence
rule might look like this:
A >>> B1, B2 ==> C1, C2.
- Support
- For sequence rules, a transaction group supports a sequence rule if the transaction group contains the rule body and the rule head in this order. The support value is the ratio of the number of transaction groups that support the sequence rule to the total number of transaction groups within your database of transaction groups.
For example, in the following sequence
rule, there might be 24 transaction groups out of 100 transaction groups that
support the sequence rule:
[ignition distributor] >>> [air bag front right] + [fuse_15] ==> [air condition]
Support=24% Confidence=60% Lift=2.0
This means that 24 transaction groups out of 100 transaction
groups are made up of the following item sets that occur in the following
order:
- An item set that contains the item ignition distributor
- An item set that contains the items air bag front right and fuse_15
- An item set that contains the item air condition
These item sets might contain additional items, and the transaction
group might contain additional item sets.
For sequences, a transaction group supports a sequence if the transaction group contains all item sets of the sequence in the same order as in the sequence. The support value is the ratio of the number of transaction groups that support the sequence to the total number of transaction groups within your database of transaction groups.
For item sets, a transaction group supports an item set if the item set is a subset of at least one transaction within the transaction group. The support value is the ratio of the number of transaction groups that support the item set to the total number of transaction groups within your database of transaction groups.
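The support definitions above can be sketched in Python as follows. The sketch assumes that a transaction group is a time-ordered list of transactions, that each transaction is a set of items, and that successive item sets of a sequence must be found in strictly later transactions. The function names and the additional items (wiper, battery, relay) are hypothetical.

```python
def supports(transaction_group: list[set[str]], sequence: list[set[str]]) -> bool:
    """True if the item sets of the sequence occur, in order, in successive transactions."""
    position = 0  # index of the next item set that still has to be found
    for transaction in transaction_group:
        if position < len(sequence) and sequence[position] <= transaction:
            position += 1
    return position == len(sequence)


def support(transaction_groups: list[list[set[str]]], sequence: list[set[str]]) -> float:
    """Ratio of supporting transaction groups to all transaction groups."""
    matching = sum(supports(group, sequence) for group in transaction_groups)
    return matching / len(transaction_groups)


sequence = [{"ignition distributor"}, {"air bag front right", "fuse_15"}, {"air condition"}]
groups = [
    # Supports the sequence; additional items and transactions are allowed.
    [{"ignition distributor", "wiper"}, {"battery"},
     {"air bag front right", "fuse_15"}, {"air condition"}],
    # Contains all item sets, but not in the required order.
    [{"air condition"}, {"ignition distributor"}, {"air bag front right", "fuse_15"}],
    # The second item set is incomplete.
    [{"ignition distributor"}, {"fuse_15"}, {"air condition"}],
    # Supports the sequence.
    [{"ignition distributor"}, {"air bag front right", "fuse_15", "relay"}, {"air condition"}],
]
print(support(groups, sequence))  # 0.5
```

The support of a single item set follows from the same functions if you pass a sequence that consists of only that item set.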
- Support multiplied by Confidence
- The rule support multiplied by
rule confidence measure helps you to identify rules that might be
important for you. It takes the confidence value and the support value into
account. If the confidence value and the support value are high, the measure rule
support multiplied by rule confidence is also high.
- Time Standard Dev.
- For sequence rules, this value reflects the standard deviation of the time between the body, the sequence, and the head item set of the sequence.
For sequences, this value reflects the standard deviation of the time between the successive item sets of a sequence.
- Time Max.
- For sequence rules, this value reflects the maximum time between the body,
the sequence, and the head item set of a sequence.
For sequences, this
value reflects the maximum time between the successive item sets of a sequence.
- Time Mean
- The mean of the elapsed time from the beginning to the end of a sequence.
- Time Min.
- For
sequence rules, this value reflects the minimum time between the body, the
sequence, and the head item set of the sequence.
For sequences, this value
reflects the minimum time between the successive item sets of a sequence.
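As a rough illustration of how the four time measures relate to each other, the following Python sketch derives them from a list of elapsed times for one step. It assumes one elapsed time per supporting transaction group and uses the population standard deviation; this topic does not say which variant the product uses. The function name and the values (in days) are hypothetical.

```python
from statistics import mean, pstdev


def step_time_measures(elapsed_times: list[float]) -> dict[str, float]:
    """Summarize the elapsed times that were observed for one step."""
    return {
        "Time Min.": min(elapsed_times),
        "Time Max.": max(elapsed_times),
        "Time Mean": mean(elapsed_times),
        # Assumption: population standard deviation.
        "Time Standard Dev.": pstdev(elapsed_times),
    }


# Hypothetical elapsed times in days, one value per supporting transaction group.
measures = step_time_measures([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
for name, value in measures.items():
    print(f"{name}: {value}")
```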
- Weight Mean
- The mean weight of the rules. A weight is a value that represents, for example, a price.
- Weight StdDev
- The standard deviation of the weight distribution of rules.
- Weight Min.
- The minimum weight of rules that is included in training data.
- Weight Max.
- The maximum weight of rules that is included in training data.
- TAGrp Weight Mean
- The mean weight of all transaction groups that
support the rules that are included in the training data.
- TAGrp Weight StdDev
- The standard deviation of the weight of all transaction groups that support the rules that are included in the training data.
- TAGrp Weight Max.
- The maximum weight of all transaction groups
that support the rules that are included in the training data.
- TAGrp Weight Min.
- The minimum weight of all transaction groups
that support the rules that are included in the training data.
The following table shows the different views and the appropriate characteristics.
Table 1. Overview of the Sequences Visualizer views and the appropriate characteristics

| Characteristics | Sequence Rules | Sequences | Item Sets |
|---|---|---|---|
| Absolute Support | X | X | - |
| Body | X | - | - |
| Confidence | X | - | - |
| Graphical Representation | X | - | - |
| Group | X | - | X |
| Head | X | - | - |
| ID | X | X | X |
| In Rules as Body | - | X | - |
| In Rules as Head | - | - | X |
| Item Set | - | - | X |
| Item sets in Rule Body | X | - | - |
| Item sets in Rule Head | X | - | - |
| Items in Set | - | - | X |
| Item Sets in Sequence | - | X | - |
| Lift | X | - | - |
| Number of Rules | - | - | X |
| Sequence | - | X | - |
| Support | X | X | X |
| Support*Confidence | X | - | - |
| TAGrp Weight Mean | X | X | X |
| TAGrp Weight StdDev | X | X | X |
| TAGrp Weight Min. | X | X | X |
| TAGrp Weight Max. | X | X | X |
| Time Max. | X | X | - |
| Time Mean | X | X | - |
| Time Min. | X | X | - |
| Time Standard Dev. | X | X | - |
| Weight Mean | X | X | X |
| Weight StdDev | X | X | X |
| Weight Min. | X | X | X |
| Weight Max. | X | X | X |

Depending on the business question that you are interested in, you might
want to include particular characteristics in the views of the Sequences Visualizer.
For example:
- If you have broken parts and want to find out which other parts frequently
break afterwards, and how often this happens, you need the characteristics
Sequence and Absolute Support in the Sequences view.
- If, in addition, you want to find out how likely it is that other parts will break as a consequence, you need the characteristics Body, Head, and Confidence in the Sequence Rules view.