xgboosttreenode Properties

XGBoost Tree© is an advanced implementation of a gradient boosting algorithm that uses a tree model as the base model. Boosting algorithms iteratively learn weak classifiers and then add them to a final strong classifier. XGBoost Tree is very flexible, and its many parameters can overwhelm most users, so the XGBoost Tree node in SPSS® Modeler exposes the core features and commonly used parameters. The node is implemented in Python.
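The boosting loop described above can be illustrated with a minimal sketch. It is not the XGBoost algorithm itself (which also uses regularization and second-order gradient information) and it is not the code the node runs. It is plain gradient boosting for squared error on a single numeric feature, with one-split decision stumps as the weak learners, so it shows a regression variant of the idea. All function names are hypothetical; `num_rounds` and `eta` play roles analogous to the boosting round and eta properties of the node.

```python
def fit_stump(xs, residuals):
    """Fit a one-split stump: pick the threshold that minimizes squared error."""
    best = None
    # Try every distinct feature value except the largest as a split point.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # Only one distinct feature value: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, num_rounds=10, eta=0.3):
    """Iteratively fit stumps to the residuals and add them to the model."""
    base = sum(ys) / len(ys)          # start from the mean of the target
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(num_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Shrink each weak learner's contribution by eta before adding it.
        predictions = [p + eta * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, eta, stumps


def predict(model, x):
    """The strong model is the base value plus the shrunken sum of all stumps."""
    base, eta, stumps = model
    return base + eta * sum(predict_stump(s, x) for s in stumps)
```

Each round fits a new stump to what the current model still gets wrong, which is why adding rounds lowers the training error, and why a smaller `eta` needs more rounds to reach the same fit.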
Table 1. xgboosttreenode properties
| xgboosttreenode properties | Data type | Property description |
| --- | --- | --- |
| TargetField (renamed to target starting with version 18.2.1.1) | field | The target fields. |
| InputFields (renamed to inputs starting with version 18.2.1.1) | field | The input fields. |
| treeMethod (renamed to tree_method starting with version 18.2.1.1) | string | The tree method for model building. Possible values are auto, exact, or approx. Default is auto. |
| numBoostRound (renamed to num_boost_round starting with version 18.2.1.1) | integer | The number of boosting rounds for model building. Specify a value between 1 and 1000. Default is 10. |
| maxDepth (renamed to max_depth starting with version 18.2.1.1) | integer | The maximum depth for tree growth. Specify a value of 1 or higher. Default is 6. |
| minChildWeight (renamed to min_child_weight starting with version 18.2.1.1) | double | The minimum child weight for tree growth. Specify a value of 0 or higher. Default is 1. |
| maxDeltaStep (renamed to max_delta_step starting with version 18.2.1.1) | double | The maximum delta step for tree growth. Specify a value of 0 or higher. Default is 0. |
| objectiveType (renamed to objective_type starting with version 18.2.1.1) | string | The objective type for the learning task. Possible values are reg:linear, reg:logistic, reg:gamma, reg:tweedie, count:poisson, rank:pairwise, binary:logistic, or multi. Note that for flag targets, only binary:logistic or multi can be used. If multi is used, the score result will show the multi:softmax and multi:softprob XGBoost objective types. |
| earlyStopping (renamed to early_stopping starting with version 18.2.1.1) | boolean | Whether to use the early stopping function. Default is False. |
| earlyStoppingRounds (renamed to early_stopping_rounds starting with version 18.2.1.1) | integer | The number of early stopping rounds. The validation error must decrease at least once within this many rounds for training to continue. Default is 10. |
| evaluationDataRatio (renamed to evaluation_data_ratio starting with version 18.2.1.1) | double | The ratio of input data used for computing validation errors. Default is 0.3. |
| random_seed | integer | The random number seed. Specify any number between 0 and 9999999. Default is 0. |
| sampleSize (renamed to sample_size starting with version 18.2.1.1) | double | The subsample ratio, used to control overfitting. Specify a value between 0.1 and 1.0. Default is 0.1. |
| eta | double | The eta value, used to control overfitting. Specify a value between 0 and 1. Default is 0.3. |
| gamma | double | The gamma value, used to control overfitting. Specify any number 0 or greater. Default is 6. |
| colsSampleRatio (renamed to col_sample_ratio starting with version 18.2.1.1) | double | The column sample ratio by tree, used to control overfitting. Specify a value between 0.01 and 1. Default is 1. |
| colsSampleLevel (renamed to col_sample_level starting with version 18.2.1.1) | double | The column sample ratio by level, used to control overfitting. Specify a value between 0.01 and 1. Default is 1. |
| lambda | double | The lambda value, used to control overfitting. Specify any number 0 or greater. Default is 1. |
| alpha | double | The alpha value, used to control overfitting. Specify any number 0 or greater. Default is 0. |
| scalePosWeight (renamed to scale_pos_weight starting with version 18.2.1.1) | double | The scale pos weight for handling imbalanced datasets. Default is 1. |
| use_HPO (added for version 18.2.1.1) | | |
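The renames and value ranges in Table 1 can be captured in a small helper that checks settings before they are applied to a node. This is a sketch under stated assumptions, not part of the product: the helper names (`normalize`, `validate`, `apply_settings`) and the `FakeNode` class are hypothetical, the "between" ranges are assumed to be inclusive, and the script is assumed to target version 18.2.1.1 or later, where the new property names apply. The only Modeler scripting call it relies on is `setPropertyValue`, which `FakeNode` imitates so the example runs outside Modeler.

```python
# Pre-18.2.1.1 property names mapped to their current names (from Table 1).
RENAMED = {
    "TargetField": "target",
    "InputFields": "inputs",
    "treeMethod": "tree_method",
    "numBoostRound": "num_boost_round",
    "maxDepth": "max_depth",
    "minChildWeight": "min_child_weight",
    "maxDeltaStep": "max_delta_step",
    "objectiveType": "objective_type",
    "earlyStopping": "early_stopping",
    "earlyStoppingRounds": "early_stopping_rounds",
    "evaluationDataRatio": "evaluation_data_ratio",
    "sampleSize": "sample_size",
    "colsSampleRatio": "col_sample_ratio",
    "colsSampleLevel": "col_sample_level",
    "scalePosWeight": "scale_pos_weight",
}

# (minimum, maximum) from Table 1; None means the table states no bound.
RANGES = {
    "num_boost_round": (1, 1000),
    "max_depth": (1, None),
    "min_child_weight": (0, None),
    "max_delta_step": (0, None),
    "random_seed": (0, 9999999),
    "sample_size": (0.1, 1.0),
    "eta": (0, 1),
    "gamma": (0, None),
    "col_sample_ratio": (0.01, 1),
    "col_sample_level": (0.01, 1),
    "lambda": (0, None),
    "alpha": (0, None),
}

TREE_METHODS = ("auto", "exact", "approx")


def normalize(settings):
    """Return a copy of settings with old property names replaced by new ones."""
    return {RENAMED.get(name, name): value for name, value in settings.items()}


def validate(settings):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for name, value in settings.items():
        if name in RANGES:
            low, high = RANGES[name]
            too_low = low is not None and value < low
            too_high = high is not None and value > high
            if too_low or too_high:
                problems.append("%s=%r is outside the documented range" % (name, value))
        elif name == "tree_method" and value not in TREE_METHODS:
            problems.append("tree_method=%r is not a documented value" % (value,))
    return problems


def apply_settings(node, settings):
    """Normalize names, validate values, then set each property on the node."""
    settings = normalize(settings)
    problems = validate(settings)
    if problems:
        raise ValueError("; ".join(problems))
    for name, value in settings.items():
        node.setPropertyValue(name, value)


class FakeNode(object):
    """Hypothetical stand-in for a Modeler node; records the properties set on it."""

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value


node = FakeNode()
apply_settings(node, {"numBoostRound": 50, "maxDepth": 4, "eta": 0.1})
```

In a Modeler script, the node object obtained from the stream would be passed in place of `FakeNode`. Validating before calling `setPropertyValue` means an out-of-range value is reported before any property is changed.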