Examples for TwoStep clustering

This example shows how to build a TwoStep clustering model on the ADULT sample data set and how to assign clusters to new records.

First, you split the ADULT sample data set into a training data set and a validation data set.

CALL IDAX.SPLIT_DATA('intable=adult, traintable=adult_train, testtable=adult_test, id=id, fraction=0.35');
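As a quick sanity check after the split, you might count the rows in the two resulting tables. This is only a sketch: it assumes that the SPLIT_DATA call above completed and created the adult_train and adult_test tables in the current schema.

```sql
-- Compare the sizes of the two tables produced by the split.
-- The labels in the first column are illustrative only.
SELECT 'adult_train' AS table_name, COUNT(*) AS row_count FROM adult_train
UNION ALL
SELECT 'adult_test' AS table_name, COUNT(*) AS row_count FROM adult_test;
```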

The following call runs the algorithm on the adult_train data set and builds the TWOSTEP model.

CALL IDAX.TWOSTEP('model=adult_mdl, intable=adult_train, id=id');
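To look at the model that was just built, one option is the PRINT_MODEL stored procedure. This call is an assumption about your environment and is not part of the original example; check that the procedure is available in your installation before you rely on it.

```sql
-- Print a summary of the adult_mdl model
-- (assumes that IDAX.PRINT_MODEL is installed and that the model exists).
CALL IDAX.PRINT_MODEL('model=adult_mdl');
```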

The following call uses the PREDICT_TWOSTEP stored procedure to assign a cluster_id to each row of the adult_test data set, based on the adult_mdl model.

CALL IDAX.PREDICT_TWOSTEP('model=adult_mdl, intable=adult_test, id=id, outtable=adult_mdl_score');

The clusters that are assigned to the records of the input table are stored in an output table. The output table contains the id column and the cluster_id column.

The id column of the output table is identical to the id column of the input table. Each record of the input table is assigned to the cluster whose distance to the record is the smallest.
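The following queries sketch how the scoring output might be examined. They assume that the PREDICT_TWOSTEP call above created the adult_mdl_score table with the id and cluster_id columns as described, and they use only those two columns.

```sql
-- Number of scored records that fall into each cluster.
SELECT cluster_id, COUNT(*) AS record_count
FROM adult_mdl_score
GROUP BY cluster_id
ORDER BY cluster_id;

-- Attach the assigned cluster to the original input records
-- by joining on the shared id column.
SELECT t.*, s.cluster_id
FROM adult_test t
JOIN adult_mdl_score s ON t.id = s.id;
```

Because the id values in the output table match those in the input table, the join returns exactly one cluster_id for each scored record.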