xgboosttreenode properties

XGBoost Tree© is an advanced implementation of a gradient boosting algorithm that uses a tree model as the base model. Boosting algorithms iteratively learn weak classifiers and then add them to a final strong classifier. XGBoost Tree is very flexible and provides many parameters, which can be overwhelming to most users, so the XGBoost Tree node in SPSS Modeler exposes only the core features and commonly used parameters. The node is implemented in Python.
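The boosting idea described above can be illustrated with a minimal sketch. This is not how XGBoost or the node is implemented: it assumes squared-error loss, one-dimensional inputs, and depth-1 trees (stumps), and every name in it (fit_stump, predict, boost, mse) is hypothetical. The num_boost_round and eta arguments only mirror the roles of the similarly named properties in Table 1.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 tree: return (threshold, left_value, right_value).

    Assumes xs contains at least two distinct values.
    """
    best = None
    for t in sorted(set(xs))[:-1]:  # every split that leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]


def predict(model, x):
    """Strong model = base value + eta * sum of all weak learners."""
    base, eta, stumps = model
    return base + eta * sum(lv if x <= t else rv for t, lv, rv in stumps)


def boost(xs, ys, num_boost_round=10, eta=0.3):
    """Iteratively fit stumps to the current residuals and add them to the model."""
    model = (sum(ys) / len(ys), eta, [])
    for _ in range(num_boost_round):
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        model[2].append(fit_stump(xs, residuals))
    return model


def mse(model, xs, ys):
    """Mean squared error of the model on the given data."""
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.0]
one_round = boost(xs, ys, num_boost_round=1)
ten_rounds = boost(xs, ys, num_boost_round=10)
```

Comparing mse(one_round, xs, ys) with mse(ten_rounds, xs, ys) shows the training error shrinking as more weak learners are added. A smaller eta makes each added learner contribute less.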

Table 1. xgboosttreenode properties
xgboosttreenode properties Data type Property description
custom_fields boolean This option tells the node to use field information specified here instead of that given in any upstream Type node(s). After selecting this option, specify the fields as required.
target field The target field.
inputs field The input fields.
tree_method string The tree method for model building. Possible values are auto, exact, or approx. Default is auto.
num_boost_round integer The number of boosting rounds for model building. Specify a value between 1 and 1000. Default is 10.
max_depth integer The maximum depth for tree growth. Specify a value of 1 or higher. Default is 6.
min_child_weight double The minimum child weight for tree growth. Specify a value of 0 or higher. Default is 1.
max_delta_step double The maximum delta step for tree growth. Specify a value of 0 or higher. Default is 0.
objective_type string The objective type for the learning task. Possible values are reg:linear, reg:logistic, reg:gamma, reg:tweedie, count:poisson, rank:pairwise, binary:logistic, or multi. Note that for flag targets, only binary:logistic or multi can be used. If multi is used, the score result will show the multi:softmax and multi:softprob XGBoost objective types.
early_stopping boolean Whether to use the early stopping function. Default is False.
early_stopping_rounds integer The validation error must decrease at least once within this number of rounds for training to continue. Default is 10.
evaluation_data_ratio double The ratio of input data used to compute validation errors. Default is 0.3.
random_seed integer The random number seed. Specify any number between 0 and 9999999. Default is 0.
sample_size double The subsample value used to control overfitting. Specify a value between 0.1 and 1.0. Default is 0.1.
eta double The eta (learning rate) used to control overfitting. Specify a value between 0 and 1. Default is 0.3.
gamma double The gamma value used to control overfitting. Specify any number 0 or greater. Default is 6.
col_sample_ratio double The column sample ratio by tree, used to control overfitting. Specify a value between 0.01 and 1. Default is 1.
col_sample_level double The column sample ratio by level, used to control overfitting. Specify a value between 0.01 and 1. Default is 1.
lambda double The lambda value used to control overfitting. Specify any number 0 or greater. Default is 1.
alpha double The alpha value used to control overfitting. Specify any number 0 or greater. Default is 0.
scale_pos_weight double The scale pos weight value for handling imbalanced datasets. Default is 1.
use_HPO
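
The value ranges listed in Table 1 can be checked before a script sets them on the node. The following is a minimal sketch, not part of the SPSS Modeler scripting API: the names RANGES, TREE_METHODS, and validate are hypothetical, the bounds are transcribed from the table, and treating "between" as inclusive is an assumption.

```python
# Bounds transcribed from Table 1 as (minimum, maximum).
# None means the table states no bound on that side.
# Assumption: "between a and b" is treated as inclusive.
RANGES = {
    "num_boost_round": (1, 1000),
    "max_depth": (1, None),
    "min_child_weight": (0, None),
    "max_delta_step": (0, None),
    "random_seed": (0, 9999999),
    "sample_size": (0.1, 1.0),
    "eta": (0, 1),
    "gamma": (0, None),
    "col_sample_ratio": (0.01, 1),
    "col_sample_level": (0.01, 1),
    "lambda": (0, None),
    "alpha": (0, None),
}
TREE_METHODS = {"auto", "exact", "approx"}


def validate(settings):
    """Return a list of problems found in settings; an empty list means none."""
    problems = []
    for name, value in settings.items():
        if name == "tree_method":
            if value not in TREE_METHODS:
                problems.append("tree_method must be one of auto, exact, approx")
            continue
        if name not in RANGES:
            continue  # properties without a stated range are not checked
        low, high = RANGES[name]
        if low is not None and value < low:
            problems.append("%s must be at least %s" % (name, low))
        if high is not None and value > high:
            problems.append("%s must be at most %s" % (name, high))
    return problems


# Hypothetical settings for illustration.
settings = {"tree_method": "auto", "num_boost_round": 50, "eta": 0.1}
problems = validate(settings)
```

In a Modeler script, settings that pass the check would then be applied to the node with setPropertyValue calls, one per property. That step is omitted here because it only runs inside SPSS Modeler.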