IBM Support

How does the K-Means cluster node select initial records for clustering?

Question & Answer


Question

How does the K-Means cluster node select initial records for clustering? Is a pseudo-random number generator used, and if so, how is the seed value set to create the initial sample for the k-means algorithm? The documentation (IBM Modeler Algorithms' Guide for Version 18.1.1) says that the first record is used to create the initial cluster, but how are the subsequent clusters created? For instance, if k = 5 is specified, are the first 5 records used to create the centroids of the initial clusters?
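To make the question concrete, here is a minimal Python sketch of one deterministic scheme that starts from the first record, as the question describes: a maximin-style initialization, in which each further center is the record farthest from its nearest already-chosen center. This is an illustration under stated assumptions, not a confirmed description of what the K-Means node does. The function name `maximin_initial_centers`, the Euclidean distance, and the tie-breaking rule (earliest record wins) are all hypothetical choices made for the example.

```python
import math


def maximin_initial_centers(records, k):
    """Pick k initial centers without any random number generator.

    Assumed scheme (illustrative only): the first record becomes the first
    center; each later center is the record whose distance to its nearest
    existing center is largest.
    """
    if not 1 <= k <= len(records):
        raise ValueError("k must be between 1 and the number of records")

    centers = [records[0]]
    # min_dist[i] holds the distance from record i to its nearest center so far.
    min_dist = [math.dist(r, centers[0]) for r in records]

    while len(centers) < k:
        # max() returns the earliest index on ties, so the result depends
        # only on the data and its order, never on a seed.
        idx = max(range(len(records)), key=min_dist.__getitem__)
        centers.append(records[idx])
        min_dist = [
            min(d, math.dist(r, records[idx]))
            for d, r in zip(min_dist, records)
        ]
    return centers


records = [(0, 0), (1, 0), (10, 0), (0, 10), (5, 5)]
centers = maximin_initial_centers(records, 3)
```

Under a scheme like this, the result depends only on the records and their order, so no seed is involved, and the first k records are not simply taken as the centers. In the sample data the second record, (1, 0), is skipped in favor of more distant ones.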



Document Information

Modified date:
10 October 2022

UID

swg22015418