xgboostasnode Properties

XGBoost is an advanced implementation of a gradient boosting algorithm. Boosting algorithms iteratively learn weak classifiers and then add them to a final strong classifier. XGBoost is very flexible and provides many parameters that can be overwhelming to most users, so the XGBoost-AS node in SPSS® Modeler exposes the core features and commonly used parameters. The XGBoost-AS node is implemented in Spark.
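The boosting idea described above can be illustrated with a minimal sketch. It is not the XGBoost algorithm itself: it assumes a single numeric input, a squared-error objective, and one-split trees (stumps) as the weak learners, and it omits regularization. The function names `fit_stump` and `boost` are hypothetical. The `num_rounds` and `eta` arguments play roles comparable to the numBoostRound and eta properties in the table that follows.

```python
def fit_stump(xs, residuals):
    """Fit a one-split tree (a weak learner) to the current residuals."""
    best = None
    # Try every threshold that leaves at least one point on each side.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x <= t else right_mean


def boost(xs, ys, num_rounds=10, eta=0.3):
    """Add one shrunken weak learner per round to build the final model."""
    learners = []
    preds = [0.0] * len(xs)
    for _ in range(num_rounds):
        # Each round fits the errors left by the learners added so far.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        learners.append(stump)
        # eta shrinks each learner's contribution to limit overfitting.
        preds = [p + eta * stump(x) for x, p in zip(xs, preds)]
    return lambda x: sum(eta * s(x) for s in learners)


xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, 1, 5, 5, 5]
model = boost(xs, ys, num_rounds=10, eta=0.3)
```

With more rounds, the combined prediction moves closer to the training targets; a smaller `eta` slows that convergence, which is why the two settings are usually tuned together.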
Table 1. xgboostasnode properties
xgboostasnode properties Data type Property description
target_field field List of the field names for the target.
input_fields field List of the field names for the inputs.
nWorkers integer The number of workers used to train the XGBoost model. Default is 1.
numThreadPerTask integer The number of threads used per worker. Default is 1.
useExternalMemory Boolean Whether to use external memory as cache. Default is false.
boosterType string The booster type to use. Available options are gbtree, gblinear, or dart. Default is gbtree.
numBoostRound integer The number of rounds for boosting. Specify a value of 0 or higher. Default is 10.
scalePosWeight Double Controls the balance of positive and negative weights. Default is 1.
randomseed integer The seed used by the random number generator. Default is 0.
objectiveType string The learning objective. Possible values are reg:linear, reg:logistic, reg:gamma, reg:tweedie, rank:pairwise, binary:logistic, or multi. Note that for flag targets, only binary:logistic or multi can be used. If multi is used, the score result will show the multi:softmax and multi:softprob XGBoost objective types. Default is reg:linear.
evalMetric string Evaluation metrics for validation data. A default metric will be assigned according to the objective. Possible values are rmse, mae, logloss, error, merror, mlogloss, auc, ndcg, map, or gamma-deviance. Default is rmse.
lambda Double L2 regularization term on weights. Increasing this value will make the model more conservative. Specify any number 0 or greater. Default is 1.
alpha Double L1 regularization term on weights. Increasing this value will make the model more conservative. Specify any number 0 or greater. Default is 0.
lambdaBias Double L2 regularization term on bias. If the gblinear booster type is used, this lambda bias linear booster parameter is available. Specify any number 0 or greater. Default is 0.
treeMethod string If the gbtree or dart booster type is used, this tree method parameter for tree growth (and the other tree parameters that follow) is available. It specifies the XGBoost tree construction algorithm to use. Available options are auto, exact, or approx. Default is auto.
maxDepth integer The maximum depth for trees. Specify a value of 2 or higher. Default is 6.
minChildWeight Double The minimum sum of instance weight (hessian) needed in a child. Specify a value of 0 or higher. Default is 1.
maxDeltaStep Double The maximum delta step to allow for each tree's weight estimation. Specify a value of 0 or higher. Default is 0.
sampleSize Double The subsample ratio of the training instances. Specify a value between 0.1 and 1.0. Default is 1.0.
eta Double The step size shrinkage used during the update step to prevent overfitting. Specify a value between 0 and 1. Default is 0.3.
gamma Double The minimum loss reduction required to make a further partition on a leaf node of the tree. Specify any number 0 or greater. Default is 6.
colsSampleRatio Double The subsample ratio of columns when constructing each tree. Specify a value between 0.01 and 1. Default is 1.
colsSampleLevel Double The subsample ratio of columns for each split, in each level. Specify a value between 0.01 and 1. Default is 1.
normalizeType string If the dart booster type is used, this dart parameter and the following three dart parameters are available. This parameter sets the normalization algorithm. Specify tree or forest. Default is tree.
sampleType string The sampling algorithm type. Specify uniform or weighted. Default is uniform.
rateDrop Double The dropout rate for the dart booster. Specify a value between 0.0 and 1.0. Default is 0.0.
skipDrop Double The probability of skipping dropout for the dart booster. Specify a value between 0.0 and 1.0. Default is 0.0.
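
The value ranges and options listed in Table 1 can be checked before they are applied in a script. The following is a minimal sketch, not part of the product: the `validate` and `apply_properties` helpers are hypothetical, only a subset of the properties is covered, and `node` is assumed to be an XGBoost-AS node object that exposes the `setPropertyValue(name, value)` method used in Modeler scripting.

```python
# Allowed options and numeric ranges copied from Table 1 (subset).
ALLOWED = {
    "boosterType": {"gbtree", "gblinear", "dart"},
    "treeMethod": {"auto", "exact", "approx"},
    "normalizeType": {"tree", "forest"},
    "sampleType": {"uniform", "weighted"},
}

# (minimum, maximum); None means no upper bound is documented.
RANGES = {
    "numBoostRound": (0, None),
    "lambda": (0, None),
    "alpha": (0, None),
    "lambdaBias": (0, None),
    "maxDepth": (2, None),
    "minChildWeight": (0, None),
    "maxDeltaStep": (0, None),
    "sampleSize": (0.1, 1.0),
    "eta": (0, 1),
    "gamma": (0, None),
    "colsSampleRatio": (0.01, 1),
    "colsSampleLevel": (0.01, 1),
    "rateDrop": (0.0, 1.0),
    "skipDrop": (0.0, 1.0),
}


def validate(props):
    """Return a list of problems found in a {property: value} dictionary."""
    problems = []
    for name, value in props.items():
        if name in ALLOWED and value not in ALLOWED[name]:
            problems.append("%s: %r is not an available option" % (name, value))
        if name in RANGES:
            low, high = RANGES[name]
            if value < low or (high is not None and value > high):
                problems.append("%s: %r is out of range" % (name, value))
    return problems


def apply_properties(node, props):
    """Validate the values, then set each one on the node."""
    problems = validate(props)
    if problems:
        raise ValueError("; ".join(problems))
    for name, value in props.items():
        node.setPropertyValue(name, value)
```

For example, `apply_properties(node, {"boosterType": "dart", "rateDrop": 0.1, "maxDepth": 4})` would set three properties, while a `maxDepth` of 1 would raise an error before anything is changed on the node.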