Instance weights and class weights

By default, all input records and all classes are assumed to have equal relative importance. You can change this by assigning weights to individual records, to individual classes, or to both. Doing so might be useful, for example, if the categories in your training data are not represented in realistic proportions. Weights enable you to bias the model to compensate for categories that are underrepresented in the data. Increasing the weight for a target value should increase the percentage of correct predictions for that category.

In the Decision Tree modeling node, you can specify two types of weights. Instance weights assign a weight to each row of input data. The weights are typically specified as 1.0 for most cases, with higher or lower values given only to those cases that are more or less important than the majority, as shown in the following table.

Table 1. Instance weight example
Record ID   Target   Instance Weight
1           drugA    1.1
2           drugB    1.0
3           drugA    1.0
4           drugB    0.3

Class weights assign a weight to each category of the target field, as shown in the following table.

Table 2. Class weight example
Class   Class Weight
drugA   1.0
drugB   1.5

Both types of weights can be used at the same time, in which case each record's instance weight is multiplied by the class weight of its target category, and the product is used as the record's instance weight. Thus, if the two previous examples were used together, the algorithm would use the instance weights shown in the following table.

Table 3. Instance weight calculation example
Record ID   Calculation   Instance Weight
1           1.1*1.0       1.1
2           1.0*1.5       1.5
3           1.0*1.0       1.0
4           0.3*1.5       0.45
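
The calculation in Table 3 can be reproduced with a short Python sketch. This is an illustration only, not the modeling node's implementation: the names records, class_weights, and effective_weights are hypothetical, the data simply mirrors Tables 1 and 2, and the fallback weight of 1.0 for a class with no entry is an assumption made for the example.

```python
# Illustrative sketch only: names and data layout are hypothetical and
# simply mirror Tables 1 and 2. This is not the modeling node's own code.

# (record_id, target, instance_weight), as in Table 1
records = [
    (1, "drugA", 1.1),
    (2, "drugB", 1.0),
    (3, "drugA", 1.0),
    (4, "drugB", 0.3),
]

# class -> class weight, as in Table 2
class_weights = {"drugA": 1.0, "drugB": 1.5}


def effective_weights(records, class_weights, default_class_weight=1.0):
    """Multiply each record's instance weight by its target's class weight.

    Assumption: a class without an explicit weight falls back to
    default_class_weight (1.0), that is, equal importance.
    """
    combined = {}
    for record_id, target, instance_weight in records:
        class_weight = class_weights.get(target, default_class_weight)
        # Round to suppress floating-point noise in the product.
        combined[record_id] = round(instance_weight * class_weight, 10)
    return combined


combined = effective_weights(records, class_weights)
for record_id, weight in combined.items():
    print(record_id, weight)
```

The resulting values match the Instance Weight column of Table 3. The rounding step only tidies floating-point noise in the products and is not part of the weighting rule itself.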