Rule Set Model Tab
The Model tab for a Rule Set nugget displays a list of rules extracted from the data by the algorithm.
Rules are broken down by consequent (predicted category) and are presented in the following format:
if antecedent_1
and antecedent_2
...
and antecedent_n
then predicted value
where consequent (the predicted value) and antecedent_1 through antecedent_n are all conditions. The rule is interpreted as "for records where antecedent_1 through antecedent_n are all true, consequent is also likely to be true."
If you click the Show Instances/Confidence button on the toolbar, each rule also shows the number of records to which the rule applies, that is, for which the antecedents are true (Instances), and the proportion of those records for which the entire rule is true (Confidence).
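As an illustration of how Instances and Confidence relate to a rule, the following minimal Python sketch counts them for one rule over a small set of records. The function name rule_stats, the field names (age, bp, drug), and the sample records are hypothetical and chosen only for this example; they are not part of the product.

```python
from typing import Callable, Mapping, Sequence, Tuple

Record = Mapping[str, object]
Condition = Callable[[Record], bool]


def rule_stats(
    records: Sequence[Record],
    antecedents: Sequence[Condition],
    consequent: Condition,
) -> Tuple[int, float]:
    """Return (instances, confidence) for a single rule.

    instances  = number of records for which every antecedent is true
    confidence = share of those records for which the consequent is also true
    """
    covered = [r for r in records if all(cond(r) for cond in antecedents)]
    instances = len(covered)
    if instances == 0:
        # The rule applies to no records, so no proportion can be computed.
        return 0, 0.0
    correct = sum(1 for r in covered if consequent(r))
    return instances, correct / instances


# Hypothetical sample data, invented for illustration only.
records = [
    {"age": 45, "bp": "HIGH", "drug": "drugA"},
    {"age": 52, "bp": "HIGH", "drug": "drugA"},
    {"age": 61, "bp": "HIGH", "drug": "drugB"},
    {"age": 30, "bp": "HIGH", "drug": "drugA"},
    {"age": 48, "bp": "LOW", "drug": "drugA"},
]

# Rule: if age > 40 and bp = HIGH then drug = drugA
antecedents = [lambda r: r["age"] > 40, lambda r: r["bp"] == "HIGH"]
consequent = lambda r: r["drug"] == "drugA"

instances, confidence = rule_stats(records, antecedents, consequent)
print(instances, round(confidence, 3))  # prints: 3 0.667
```

Here the antecedents hold for three records and the predicted value is correct for two of them, so Instances is 3 and Confidence is 2/3.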
Note that confidence is calculated somewhat differently for C5.0 rule sets. C5.0 uses the following formula for calculating the confidence of a rule:
(1 + number of records where rule is correct) / (2 + number of records for which the rule's antecedents are true)
This calculation of the confidence estimate adjusts for the process of generalizing rules from a decision tree (which is what C5.0 does when it creates a rule set).
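The effect of this adjustment can be seen in a short Python sketch that applies the formula above next to the plain proportion. The function names c50_rule_confidence and raw_confidence are hypothetical helpers written for this example, not functions exposed by C5.0 or the product.

```python
def raw_confidence(n_correct: int, n_covered: int) -> float:
    """Plain proportion: correct records / records covered by the antecedents."""
    if n_covered == 0:
        raise ValueError("rule covers no records")
    return n_correct / n_covered


def c50_rule_confidence(n_correct: int, n_covered: int) -> float:
    """Adjusted confidence as given in the text: (1 + correct) / (2 + covered)."""
    if n_correct < 0 or n_covered < n_correct:
        raise ValueError("expected 0 <= n_correct <= n_covered")
    return (1 + n_correct) / (2 + n_covered)


# A rule covering 3 records and correct on 2 of them:
print(round(raw_confidence(2, 3), 3))       # prints: 0.667
print(round(c50_rule_confidence(2, 3), 3))  # prints: 0.6

# A rule covering 100 records and correct on 98 of them:
print(round(raw_confidence(98, 100), 3))       # prints: 0.98
print(round(c50_rule_confidence(98, 100), 3))  # prints: 0.971
```

The adjusted value is pulled toward 0.5, and the pull is strongest for rules that cover few records, so a rule that is correct on a handful of records is not reported with a confidence of 1.0.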