IBM® Db2® for z/OS® Models - K-Means build options

By setting the build options, you can customize the build of the model for your own purposes.

If you want to build a model with the default options, click Run.

Distance measure. This parameter defines how the distance between data points is measured. Greater distances indicate greater dissimilarity. Select one of the following options:

  • Euclidean. The Euclidean measure is the straight-line distance between two data points.
  • Normalized Euclidean. The Normalized Euclidean measure is similar to the Euclidean measure, but it is normalized by the squared standard deviation (the variance). Unlike the Euclidean measure, the Normalized Euclidean measure is scale-invariant.
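
The difference between the two measures can be illustrated with a short Python sketch. It is not the product's implementation: the function names are hypothetical, and it assumes that each squared column difference is divided by that column's population variance (whether the product uses the population or the sample standard deviation is not stated here). The example rescales one column to show why only the normalized measure is scale-invariant.

```python
import math
from statistics import pvariance


def euclidean(a, b):
    """Straight-line distance between two data points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalized_euclidean(a, b, variances):
    """Euclidean distance with each squared difference divided by the
    column variance (assumption: population variance per column)."""
    return math.sqrt(
        sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances))
    )


def column_variances(rows):
    """Population variance of every column in a list of rows."""
    return [pvariance(col) for col in zip(*rows)]


# Hypothetical data: the second column is on a much larger scale.
rows = [(1.0, 100.0), (2.0, 300.0), (3.0, 200.0)]
# The same data with the second column expressed in a different unit.
scaled = [(x, y / 1000.0) for x, y in rows]

d_plain = euclidean(rows[0], rows[1])
d_plain_scaled = euclidean(scaled[0], scaled[1])

d_norm = normalized_euclidean(rows[0], rows[1], column_variances(rows))
d_norm_scaled = normalized_euclidean(
    scaled[0], scaled[1], column_variances(scaled)
)

print(d_plain, d_plain_scaled)  # changes when the column is rescaled
print(d_norm, d_norm_scaled)    # stays the same after rescaling
```

With the plain Euclidean measure, the column with the largest numeric range dominates the distance. Dividing by the variance puts all columns on a comparable footing.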

Number of clusters. This parameter defines the number of clusters to be created.

Maximum number of iterations. The algorithm repeats the same process over several iterations. This parameter defines the maximum number of iterations, after which model training stops.
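
To show how the number of clusters and the maximum number of iterations interact, here is a minimal Python sketch of a generic K-Means loop. It is an illustration only, not the algorithm as implemented in the product: the function name `kmeans`, its parameters, the random choice of initial centers, and the early stop when the assignments no longer change are all assumptions.

```python
import random


def kmeans(points, k, max_iter, seed=None):
    """Generic K-Means sketch: k clusters, at most max_iter iterations."""
    rng = random.Random(seed)
    # Assumption: the initial centers are k distinct data points.
    centers = rng.sample(points, k)
    assignments = []
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assign every point to its nearest center (squared Euclidean).
        new_assignments = [
            min(
                range(k),
                key=lambda c: sum(
                    (p - q) ** 2 for p, q in zip(pt, centers[c])
                ),
            )
            for pt in points
        ]
        # Assumption: stop early once the assignments are stable.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Move each center to the mean of its assigned points.
        for c in range(k):
            members = [pt for pt, a in zip(points, assignments) if a == c]
            if members:
                centers[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centers, assignments, iterations


# Hypothetical data: two loosely separated groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

centers, assignments, iterations = kmeans(points, k=2, max_iter=10, seed=1)
print(iterations, assignments)

# With max_iter=1 training stops after a single pass, converged or not.
_, _, one_pass = kmeans(points, k=2, max_iter=1, seed=1)
print(one_pass)
```

A low iteration limit shortens training but can stop before the clusters have settled. A high limit only costs extra time if the algorithm has not already converged.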

Statistics. This parameter defines how many statistics are included in the model. Select one of the following options:

  • All. All column-related statistics and all value-related statistics are included.
    Note: This option includes the maximum number of statistics and might therefore affect the performance of your system. If you do not want to view the model in graphical format, specify None.
  • Columns. Column-related statistics are included.
  • None. Only statistics that are required to score the model are included.

Replicate results. Select this check box if you want to set a random seed to replicate analyses. You can specify an integer, or you can create a pseudo-random integer by clicking Generate.
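
The effect of a fixed random seed can be sketched in Python as follows. How the product uses the seed internally is not described here. The sketch assumes that the seed drives the random choice of initial cluster centers, and the helpers `pick_initial_centers` and `generate_seed` are hypothetical stand-ins for that step and for the Generate button.

```python
import random


def pick_initial_centers(points, k, seed):
    """Choose k initial centers using a generator seeded with `seed`."""
    rng = random.Random(seed)
    return rng.sample(points, k)


def generate_seed():
    """Hypothetical counterpart of Generate: a pseudo-random integer."""
    return random.SystemRandom().randrange(2**31)


# Hypothetical data points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

seed = generate_seed()
first_run = pick_initial_centers(points, 2, seed)
second_run = pick_initial_centers(points, 2, seed)

# The same seed yields the same starting centers, so a deterministic
# training loop that follows would produce the same model.
print(first_run == second_run)
```

Recording the seed together with the other build options makes it possible to rebuild the same model later.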