K-Means-AS node Build Options

Use the Build Options tab to specify build options for the K-Means-AS node: regular options for model building, initialization options for the cluster centers, and advanced options for the maximum number of iterations, the convergence tolerance, and the random seed. For more information, see the JavaDoc for K-Means on SparkML.¹

Regular

Model Name. The name of the field that is generated when records are scored to a specific cluster. Select Auto (default), or select Custom and type a name.

Number of Clusters. Specify the number of clusters to generate. The default is 5 and the minimum is 2.

Initialization

Initialization Mode. Specify the method for initializing the cluster centers: K-Means|| (the default) or Random. For details about these two methods, see Scalable K-Means++.²

Initialization Steps. If the K-Means|| initialization mode is selected, specify the number of initialization steps. 2 is the default.
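The effect of Initialization Steps can be illustrated outside SPSS Modeler. The following pure-Python sketch follows the K-Means|| idea described in Scalable K-Means++: each step oversamples candidate centers in proportion to their squared distance from the centers chosen so far, and the candidates are then reduced to k centers. It is an illustration under stated assumptions, not the Spark implementation: the function name kmeans_parallel_init, the oversampling factor of 2k, and the use of weighted k-means++ seeding for the final reduction are choices made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_sq_dist(point, centers):
    """Squared distance from a point to its closest center."""
    return min(sq_dist(point, c) for c in centers)


def kmeans_parallel_init(points, k, init_steps=2, seed=0):
    """Pick up to k initial centers in the style of K-Means|| (simplified)."""
    rng = random.Random(seed)
    oversample = 2 * k  # assumed oversampling factor for this sketch

    # Start from one point chosen uniformly at random.
    candidates = [rng.choice(points)]

    # Each initialization step samples every point independently, with
    # probability proportional to its squared distance to the candidates.
    for _ in range(init_steps):
        costs = [nearest_sq_dist(p, candidates) for p in points]
        total = sum(costs)
        if total == 0:
            break
        candidates.extend(
            p for p, c in zip(points, costs)
            if rng.random() < oversample * c / total
        )

    # Weight each candidate by the number of points closest to it.
    weights = [0] * len(candidates)
    for p in points:
        j = min(range(len(candidates)), key=lambda i: sq_dist(p, candidates[i]))
        weights[j] += 1

    # Reduce the candidates to k centers with weighted k-means++ seeding.
    first = rng.choices(range(len(candidates)), weights=weights)[0]
    chosen = [candidates[first]]
    while len(chosen) < k:
        scores = [w * nearest_sq_dist(c, chosen)
                  for c, w in zip(candidates, weights)]
        if sum(scores) == 0:  # no distinct candidates left
            break
        idx = rng.choices(range(len(candidates)), weights=scores)[0]
        chosen.append(candidates[idx])
    return chosen


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = kmeans_parallel_init(points, k=2, init_steps=2, seed=0)
print(centers)
```

More initialization steps give the sampler more rounds to collect candidates that are far from the existing ones, at the cost of extra passes over the data.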

Advanced

Advanced Settings. Select this option to set the following advanced options.

Max Iteration. Specify the maximum number of iterations to perform when searching cluster centers. 20 is the default.

Tolerance. Specify the convergence tolerance for iterative algorithms. 1.0E-4 is the default.

Set Random Seed. Select this option and click Generate to generate the seed used by the random number generator.
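How the three advanced options interact can be sketched with a small, self-contained k-means loop. The sketch below is not the code that the node or Spark runs. It assumes that the iteration stops either when the maximum number of iterations is reached or when no center moves farther than the tolerance, and that the seed controls only the random choice of starting centers. The function name lloyd and its parameter names are hypothetical.

```python
import math
import random


def lloyd(points, k, max_iter=20, tol=1e-4, seed=None):
    """Run a basic k-means loop; return the centers and iterations used."""
    rng = random.Random(seed)          # the same seed gives the same start
    centers = rng.sample(points, k)    # random start keeps the sketch short
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[j].append(p)

        # Update step: move each center to the mean of its group.
        # A center with an empty group stays where it is.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]

        # Convergence test: stop once the largest movement is within tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, iterations


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, used = lloyd(points, k=2, max_iter=20, tol=1e-4, seed=42)
print(centers, used)
```

A smaller tolerance or a larger iteration limit lets the centers settle more precisely but can lengthen the build. Fixing the seed makes repeated builds on the same data start from the same centers.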

Display

Display Graph. Select this option if you want a graph to be included in the output.

The following table shows the relationship between the settings in the SPSS® Modeler K-Means-AS node and the K-Means Spark parameters.
Table 1. Node properties mapped to Spark parameters
SPSS Modeler setting     Script name (property name)     K-Means SparkML parameter
Input Fields             features
Number of Clusters       clustersNum                     k
Initialization Mode      initMode                        initMode
Initialization Steps     initSteps                       initSteps
Max Iteration            maxIter                         maxIter
Tolerance                toleration                      tol
Random Seed              randomSeed                      seed
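The mapping in Table 1 can also be written as a small lookup. The following Python sketch is illustrative only: it covers the rows of Table 1 that list both a script name and a Spark parameter, fills in the defaults stated in this topic, and uses a hypothetical helper named to_spark_params. It does not call SPSS Modeler or Spark.

```python
# Script names and Spark parameter names copied from Table 1.
SCRIPT_TO_SPARK = {
    "clustersNum": "k",
    "initMode": "initMode",
    "initSteps": "initSteps",
    "maxIter": "maxIter",
    "toleration": "tol",
    "randomSeed": "seed",
}

# Defaults stated in this topic for the corresponding node settings.
NODE_DEFAULTS = {
    "clustersNum": 5,
    "initSteps": 2,
    "maxIter": 20,
    "toleration": 1.0e-4,
}


def to_spark_params(node_settings):
    """Translate node script names into Spark parameter names."""
    merged = {**NODE_DEFAULTS, **node_settings}
    unknown = set(merged) - set(SCRIPT_TO_SPARK)
    if unknown:
        raise KeyError(f"unknown script names: {sorted(unknown)}")
    return {SCRIPT_TO_SPARK[name]: value for name, value in merged.items()}


print(to_spark_params({"clustersNum": 3, "randomSeed": 7}))
```

Settings that are not supplied fall back to the documented defaults, and a name that is not in the table raises an error instead of being passed through silently.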

1 "Class KMeans." Apache Spark. JavaDoc. Web. 3 Oct 2017.

2 Bahmani, Moseley, et al. "Scalable K-Means++." 28 Feb 2012. http://theory.stanford.edu/%7Esergei/papers/vldb12-kmpar.pdf.