Spark nodes

SPSS® Modeler provides nodes for running Spark native algorithms. The Spark tab on the Nodes Palette contains the nodes described below. These nodes are supported on 64-bit Windows, Mac, and Linux. Note that these nodes don't support specifying an integer or double column as Flag or Nominal for building a model. To use such a column, you must first convert its values to 0/1 or to 0, 1, 2, 3, 4, and so on.
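
The value conversion described above can be illustrated outside SPSS Modeler. The following is a minimal sketch in plain Python, not a Modeler feature: the helper name to_index_codes and the sample values are hypothetical, and it assumes that each distinct value should receive a zero-based integer code.

```python
def to_index_codes(values):
    """Map each distinct value to a zero-based integer code (0, 1, 2, ...)."""
    # Sort the distinct values so the mapping is deterministic.
    mapping = {value: code for code, value in enumerate(sorted(set(values)))}
    return [mapping[value] for value in values], mapping


# Hypothetical column of doubles that should be treated as categories.
codes, mapping = to_index_codes([10.0, 30.0, 10.0, 20.0])
print(codes)    # [0, 2, 0, 1]
print(mapping)  # {10.0: 0, 20.0: 1, 30.0: 2}
```

A column with only two distinct values ends up with the codes 0 and 1, which corresponds to the 0/1 case.
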

Isotonic Regression belongs to the family of regression algorithms. The Isotonic-AS node in SPSS Modeler is implemented in Spark. For details about Isotonic Regression algorithms, see https://spark.apache.org/docs/2.2.0/mllib-isotonic-regression.html.
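
To show what an isotonic fit produces, the following sketch implements the pool adjacent violators idea on a single machine in plain Python. It is an illustration only, not the Spark or Isotonic-AS implementation; the function name isotonic_fit is hypothetical, and it assumes unit weights and inputs already sorted by the feature.

```python
def isotonic_fit(y):
    """Least-squares non-decreasing fit to y, assuming unit weights."""
    blocks = []  # each block holds [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each block back to one fitted value per input point.
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


print(isotonic_fit([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

The out-of-order pair 3, 2 is pooled into its mean, 2.5, so the fitted sequence never decreases.
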
XGBoost© is an advanced implementation of a gradient boosting algorithm. Boosting algorithms iteratively learn weak classifiers and then add them to a final strong classifier. XGBoost is very flexible, and its many parameters can be overwhelming to most users, so the XGBoost-AS node in SPSS Modeler exposes the core features and commonly used parameters. The XGBoost-AS node is implemented in Spark.
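
The boosting idea of adding weak learners one at a time can be sketched in a few lines of plain Python. This is a simplified illustration, not XGBoost or the XGBoost-AS node: it assumes a regression target with squared loss, uses one-split "stumps" on a single feature as the weak learners, and the names fit_stump and boost, the data, and the parameter values are all hypothetical.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must put points on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Fit stumps to the current residuals and add them to the ensemble."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, predictions


xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, 1, 5, 5, 5]
base, stumps, predictions = boost(xs, ys)
print([round(p, 3) for p in predictions])
```

Each round corrects part of the error left by the previous rounds, so the combined prediction moves steadily toward the targets.
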
K-Means is one of the most commonly used clustering algorithms. It clusters data points into a predefined number of clusters. The K-Means-AS node in SPSS Modeler is implemented in Spark. For details about K-Means algorithms, see https://spark.apache.org/docs/2.2.0/ml-clustering.html. Note that the K-Means-AS node performs one-hot encoding automatically for categorical variables.
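
The following sketch shows, in plain Python, how a categorical variable can be one-hot encoded and then clustered together with a numeric variable. It is an illustration of the technique, not the K-Means-AS implementation: the function names, the sample data, and the naive initialization (the first k points) are assumptions made for the example.

```python
def one_hot(values):
    """Encode each categorical value as a 0/1 vector, one slot per category."""
    categories = sorted(set(values))
    return [[1.0 if value == c else 0.0 for c in categories] for value in values]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Lloyd's algorithm with a naive deterministic start (first k points)."""
    centers = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


sizes = [1.0, 1.2, 8.0, 8.3]             # hypothetical numeric field
colors = ["red", "red", "blue", "blue"]  # hypothetical categorical field
points = [[s] + encoded for s, encoded in zip(sizes, one_hot(colors))]
labels, centers = kmeans(points, k=2)
print(labels)  # [0, 0, 1, 1]
```
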
Multilayer perceptron is a classifier based on the feedforward artificial neural network and consists of multiple layers. Each layer is fully connected to the next layer in the network. The MultiLayerPerceptron-AS node in SPSS Modeler is implemented in Spark. For details about the multilayer perceptron classifier (MLPC), see https://spark.apache.org/docs/latest/ml-classification-regression.html#multilayer-perceptron-classifier.
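
A forward pass through a small fully connected network can be sketched in plain Python as follows. This is an illustration only, not the MultiLayerPerceptron-AS implementation: the layer sizes and weight values are hypothetical, and the sketch assumes sigmoid activations in the hidden layer and softmax in the output layer, as the Spark MLPC documentation describes.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: every output node sees every input."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-z)) for z in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, layers):
    """Apply sigmoid on hidden layers and softmax on the final layer."""
    activation = x
    for index, (weights, biases) in enumerate(layers):
        z = dense(activation, weights, biases)
        is_output = index == len(layers) - 1
        activation = softmax(z) if is_output else sigmoid(z)
    return activation


# Hypothetical network: 2 inputs -> 3 hidden nodes -> 2 output classes.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2], [-0.6, 0.3, 0.9]], [0.05, -0.05]),
]
probabilities = forward([1.0, 2.0], layers)
print(probabilities)
```

The output is one probability per class, and the predicted class is the one with the largest probability. In practice the weights are learned during training rather than set by hand.
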