The C5.0 node builds either a decision tree or a rule set. The model works by splitting the
sample based on the field that provides the maximum information gain at each level. The target field
must be categorical. Multiple splits into more than two subgroups are allowed.
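To make "maximum information gain" concrete, here is a minimal sketch in plain Python. It illustrates the splitting criterion only and is not the node's internal implementation. The toy records, the field names (`BP`, `Sex`, `Drug`) and the helper functions are assumptions made up for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, field, target):
    """Reduction in target entropy obtained by splitting rows on field."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        # One subgroup per distinct value, so a split may have more than two branches
        groups.setdefault(r[field], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical toy data with a categorical target field "Drug"
rows = [
    {"BP": "HIGH", "Sex": "F", "Drug": "drugA"},
    {"BP": "HIGH", "Sex": "M", "Drug": "drugA"},
    {"BP": "LOW", "Sex": "F", "Drug": "drugX"},
    {"BP": "LOW", "Sex": "M", "Drug": "drugX"},
    {"BP": "NORMAL", "Sex": "F", "Drug": "drugX"},
    {"BP": "NORMAL", "Sex": "M", "Drug": "drugA"},
]

gains = {f: information_gain(rows, f, "Drug") for f in ("BP", "Sex")}
best_field = max(gains, key=gains.get)  # field chosen for the first split
```

On this data the three-way split on `BP` separates the target far better than the two-way split on `Sex`, so `BP` would be chosen at the top level.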
Example
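# Get the current stream (available when the script runs inside SPSS Modeler)
stream = modeler.script.stream()
node = stream.create("c50", "My node")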
# "Model" tab
node.setPropertyValue("use_model_name", False)
node.setPropertyValue("model_name", "C5_Drug")
node.setPropertyValue("use_partitioned_data", True)
node.setPropertyValue("output_type", "DecisionTree")
node.setPropertyValue("use_xval", True)
node.setPropertyValue("xval_num_folds", 3)
node.setPropertyValue("mode", "Expert")
node.setPropertyValue("favor", "Generality")
node.setPropertyValue("min_child_records", 3)
# "Costs" tab
node.setPropertyValue("use_costs", True)
node.setPropertyValue("costs", [["drugA", "drugX", 2]])
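The example passes `costs` as a list of `[value, value, cost]` triples. The sketch below shows one way to build that list from a more readable mapping. The helper `build_costs` and the category `drugB` are hypothetical, and the sketch deliberately does not label which value is the actual category and which the predicted one, because that is not stated here.

```python
def build_costs(cost_map):
    """Convert {(first_value, second_value): cost} into [[first, second, cost], ...]."""
    # Sorting keeps the output order stable from run to run
    return [[first, second, cost] for (first, second), cost in sorted(cost_map.items())]

costs = build_costs({
    ("drugA", "drugX"): 2,
    ("drugB", "drugX"): 1.5,  # hypothetical second entry
})

# Inside SPSS Modeler, with node created as in the example above:
# node.setPropertyValue("use_costs", True)
# node.setPropertyValue("costs", costs)
```

With a single entry, `build_costs({("drugA", "drugX"): 2})` yields the same value that the example passes directly.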
Table 1. c50node properties
| c50node Properties | Values | Property description |
| --- | --- | --- |
| target | field | C5.0 models use a single target field and one or more input fields. A weight field can also be specified. See the topic Common modeling node properties for more information. |
| output_type | DecisionTree, RuleSet | |
| group_symbolics | flag | |
| use_boost | flag | |
| boost_num_trials | number | |
| use_xval | flag | |
| xval_num_folds | number | |
| mode | Simple, Expert | |
| favor | Accuracy, Generality | Favor accuracy or generality. |
| expected_noise | number | |
| min_child_records | number | |
| pruning_severity | number | |
| use_costs | flag | |
| costs | structured | This is a structured property. See the example for usage. |
| use_winnowing | flag | |
| use_global_pruning | flag | On (True) by default. |
| calculate_variable_importance | flag | |
| calculate_raw_propensities | flag | |
| calculate_adjusted_propensities | flag | |
| adjusted_propensity_partition | Test, Validation | |
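Several properties in Table 1 accept only a fixed set of values. As a rough sketch, a script could check those values before applying them, so that a typo is reported before any property is set. The helper `apply_c50_properties` and the `FakeNode` class are hypothetical; `FakeNode` exists only so the sketch runs outside SPSS Modeler, where a real node would come from `stream.create("c50", ...)`.

```python
# Allowed values copied from Table 1; properties not listed here are passed through unchecked.
ALLOWED = {
    "output_type": {"DecisionTree", "RuleSet"},
    "mode": {"Simple", "Expert"},
    "favor": {"Accuracy", "Generality"},
    "adjusted_propensity_partition": {"Test", "Validation"},
}

def apply_c50_properties(node, settings):
    """Validate enumerated values, then call setPropertyValue for every entry."""
    for name, value in settings.items():
        allowed = ALLOWED.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError("%s must be one of %s, got %r" % (name, sorted(allowed), value))
    # Nothing is applied unless every enumerated value passed the check
    for name, value in settings.items():
        node.setPropertyValue(name, value)

class FakeNode(object):
    """Hypothetical stand-in that records property values."""
    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value

node = FakeNode()
apply_c50_properties(node, {
    "output_type": "RuleSet",
    "use_boost": True,
    "boost_num_trials": 10,  # illustrative value
})
```

Validating everything first means a bad value such as `"favor": "Speed"` raises `ValueError` and leaves the node unchanged.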