Rule Set Model Tab

The Model tab for a Rule Set nugget displays a list of rules extracted from the data by the algorithm.

Rules are broken down by consequent (predicted category) and are presented in the following format:

if antecedent_1
and antecedent_2
...
and antecedent_n
then predicted value

where consequent (the predicted value) and antecedent_1 through antecedent_n are all conditions. The rule is interpreted as "for records where antecedent_1 through antecedent_n are all true, consequent is also likely to be true." If you click the Show Instances/Confidence button on the toolbar, each rule also shows two figures: Instances, the number of records to which the rule applies (that is, records for which all of the antecedents are true), and Confidence, the proportion of those records for which the entire rule is true.
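
To make the Instances and Confidence figures concrete, the following minimal Python sketch computes both for a single rule. The records, field names (age, bp, drug), and the rule itself are hypothetical and chosen only for illustration. The sketch shows the plain definitions given above and is not the product's internal implementation.

```python
# Hypothetical records; field names and values are illustrative only.
records = [
    {"age": 45, "bp": "HIGH", "drug": "drugA"},
    {"age": 52, "bp": "HIGH", "drug": "drugA"},
    {"age": 61, "bp": "HIGH", "drug": "drugB"},
    {"age": 30, "bp": "HIGH", "drug": "drugA"},
    {"age": 47, "bp": "NORMAL", "drug": "drugA"},
    {"age": 58, "bp": "HIGH", "drug": "drugA"},
]

# Hypothetical rule: if age > 40 and bp = HIGH then drug = drugA
antecedents = [
    lambda r: r["age"] > 40,
    lambda r: r["bp"] == "HIGH",
]
consequent = lambda r: r["drug"] == "drugA"


def instances_and_confidence(records, antecedents, consequent):
    """Return (instances, confidence) for one rule.

    instances:  number of records for which every antecedent is true
    confidence: proportion of those records for which the consequent
                is also true (None if the rule covers no records)
    """
    covered = [r for r in records if all(cond(r) for cond in antecedents)]
    instances = len(covered)
    correct = sum(1 for r in covered if consequent(r))
    confidence = correct / instances if instances else None
    return instances, confidence


instances, confidence = instances_and_confidence(records, antecedents, consequent)
print(instances, confidence)  # 4 0.75
```

In this example the rule applies to four records, and the consequent is true for three of them, so Instances is 4 and Confidence is 0.75.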

Note that confidence is calculated somewhat differently for C5.0 rule sets. C5.0 uses the following formula for calculating the confidence of a rule:

(1 + number of records where rule is correct)
/
(2 + number of records for which the rule's antecedents are true)

This calculation of the confidence estimate adjusts for the process of generalizing rules from a decision tree (which is what C5.0 does when it creates a rule set).
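
As an illustration, the C5.0 formula above can be written as a small Python function. The function name and argument names are hypothetical. The sketch only restates the formula and makes no claim about how C5.0 computes it internally.

```python
def c50_rule_confidence(n_correct, n_covered):
    """Adjusted confidence as given by the formula above.

    n_correct: number of records where the rule is correct
    n_covered: number of records for which the rule's antecedents are true
    """
    if n_correct < 0 or n_covered < n_correct:
        raise ValueError("expected 0 <= n_correct <= n_covered")
    return (1 + n_correct) / (2 + n_covered)


# A rule that is correct on 3 of the 4 records it covers:
raw = 3 / 4                             # unadjusted proportion, 0.75
adjusted = c50_rule_confidence(3, 4)    # (1 + 3) / (2 + 4)
print(raw, round(adjusted, 4))          # 0.75 0.6667
```

Because the formula adds 1 to the numerator and 2 to the denominator, the adjusted value is pulled toward 0.5. The effect is largest for rules that cover few records and becomes negligible as the number of covered records grows.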