Ensemble Node Settings

Target field for ensemble. Select a single field that is used as the target by two or more upstream models. The upstream models can use flag, nominal, or continuous targets, but at least two of the models must share the same target in order to combine scores.

Filter out fields generated by ensembled models. Removes from the output all of the additional fields generated by the individual models that feed into the Ensemble node. Select this check box if you are interested only in the combined score from all of the input models. Ensure that this option is deselected if, for example, you want to use an Analysis node or Evaluation node to compare the accuracy of the combined score with that of each of the individual input models.

Available settings depend on the measurement level of the field selected as the target.

Continuous Targets

For a continuous target, scores will be averaged. This is the only available method for combining scores.

When averaging scores or estimates, the Ensemble node uses a standard error calculation to estimate the difference between the measured or estimated values and the true values, and to show how closely those estimates match. Standard error calculations are generated by default for new models; however, you can deselect the check box for existing models, for example, if they are to be regenerated.
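As a rough illustration of averaging scores for a continuous target, the following Python sketch combines the scores that several models produce for a single record. The function name average_scores is hypothetical, and the standard error formula used here (sample standard deviation divided by the square root of the number of models) is an assumption made for illustration; this section does not state the exact calculation that the Ensemble node performs.

```python
import math
from statistics import mean, stdev

def average_scores(scores):
    """Combine one record's scores from several models (hypothetical helper).

    Returns the mean score and an assumed standard error of that mean:
    sample standard deviation / sqrt(number of models).
    """
    n = len(scores)
    avg = mean(scores)
    # A standard deviation needs at least two scores; report 0.0 otherwise.
    se = stdev(scores) / math.sqrt(n) if n > 1 else 0.0
    return avg, se

# Three upstream models score the same record.
avg, se = average_scores([10.0, 12.0, 14.0])
print(avg, se)
```

A smaller standard error in this sketch means that the individual model scores agree more closely with each other.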

Categorical Targets

For categorical targets, a number of methods are supported, including voting, which works by tallying the number of times each possible predicted value is chosen and selecting the value with the highest total. For example, if three out of five models predict yes and the other two predict no, then yes wins by a vote of 3 to 2. Alternatively, votes can be weighted based on the confidence or propensity value for each prediction. The weights are then summed, and the value with the highest total is again selected. The confidence for the final prediction is the sum of the weights for the winning value divided by the number of models included in the ensemble.
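The voting rules described above can be sketched in Python as follows. The function name vote and its arguments are hypothetical and are not part of the product. With no weights, each model contributes one vote; with weights (for example, confidence values), the weights are summed per predicted value. Ties are not handled in this sketch.

```python
from collections import defaultdict

def vote(predictions, weights=None):
    """Return (winning value, confidence) for one record.

    predictions: one predicted value per model.
    weights: optional per-model weights (such as confidences);
             when omitted, each model counts as one vote.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    totals = defaultdict(float)
    for value, weight in zip(predictions, weights):
        totals[value] += weight
    # Ties are not resolved here: max() keeps the first value it meets.
    winner = max(totals, key=totals.get)
    # Confidence: summed weights of the winner / number of models.
    confidence = totals[winner] / len(predictions)
    return winner, confidence

preds = ["yes", "yes", "yes", "no", "no"]
simple = vote(preds)                                # ("yes", 0.6)
weighted = vote(preds, [0.5, 0.5, 0.5, 0.9, 0.9])   # "no" wins, confidence about 0.36
```

In the weighted call, the two no predictions carry a total weight of 1.8 against 1.5 for yes, so the weighted result differs from the simple 3 to 2 vote.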

All categorical fields. For both flag and nominal fields, the following methods are supported:

  • Voting
  • Confidence-weighted voting
  • Highest confidence wins
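As a minimal sketch of the last method in this list, the following Python function assumes that "highest confidence wins" means that the single prediction with the highest confidence across all models is used as the combined prediction. The function name and the data layout are hypothetical.

```python
def highest_confidence_wins(predictions):
    """predictions: (predicted value, confidence) pairs, one per model.

    Returns the pair with the highest confidence (assumed behavior).
    """
    return max(predictions, key=lambda pair: pair[1])

# Two models predict yes with moderate confidence; one predicts no strongly.
result = highest_confidence_wins([("yes", 0.62), ("yes", 0.58), ("no", 0.91)])
```

Here the single no prediction is selected even though yes would win a simple vote.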

Flag fields only. For flag fields, a number of methods based on propensity are also available:

  • Raw propensity-weighted voting
  • Adjusted propensity-weighted voting
  • Average raw propensity
  • Average adjusted propensity

Voting ties. For voting methods, you can specify how ties are resolved.

  • Random selection. One of the tied values is chosen at random.
  • Highest confidence. The tied value that was predicted with the highest confidence wins. Note that this is not necessarily the same as the highest confidence of all predicted values.
  • Raw or adjusted propensity (flag fields only). The tied value that was predicted with the largest absolute propensity wins, where the absolute propensity is calculated as:

abs(0.5 - propensity) * 2

Or, in the case of adjusted propensity:

abs(0.5 - adjusted propensity) * 2
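
The tie-breaking rules above can be sketched in Python as follows. All function names and the data layout are hypothetical; only the absolute propensity formula is taken from this section. The same formula applies to raw and adjusted propensity, so a single helper is used for both.

```python
def absolute_propensity(propensity):
    """Distance of a propensity from 0.5, rescaled to the range 0 to 1."""
    return abs(0.5 - propensity) * 2

def break_tie_by_propensity(tied):
    """tied: (predicted value, propensity) pairs for the tied predictions."""
    return max(tied, key=lambda pair: absolute_propensity(pair[1]))[0]

def break_tie_by_confidence(predictions, tied_values):
    """predictions: (predicted value, confidence) pairs, one per model.

    Only predictions of a tied value are considered, so the winner need
    not have the highest confidence of all predicted values.
    """
    candidates = [p for p in predictions if p[0] in tied_values]
    return max(candidates, key=lambda pair: pair[1])[0]

# Flag target: a propensity of 0.1 is further from 0.5 than 0.7 is.
by_propensity = break_tie_by_propensity([("yes", 0.7), ("no", 0.1)])

# a and b tie with two votes each; c has the top confidence but is not tied.
models = [("a", 0.6), ("a", 0.5), ("b", 0.7), ("b", 0.4), ("c", 0.95)]
by_confidence = break_tie_by_confidence(models, {"a", "b"})
```

In the second example, b wins the tie with a confidence of 0.7, even though c was predicted with a higher confidence overall.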