IBM Support

Clustering binary data with K-Means (should be avoided)

Troubleshooting


Problem

I have a number of variables containing binary (dichotomous) data, such as 0-1 or Yes-No responses. I would like to use K-Means Clustering to form clusters of similar cases, but I have heard that K-Means is inappropriate for binary-valued data. Is this true?
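As a small illustration of what K-Means does with 0-1 variables, the sketch below computes a cluster center for a handful of binary cases. It is not taken from IBM SPSS Statistics: the data and the helper names (`centroid`, `squared_euclidean`, `hamming`) are made up for this example. It shows two facts that hold for any 0-1 data: the cluster center is a vector of proportions rather than a valid 0-1 case, and the squared Euclidean distance between two binary cases is simply the count of variables on which they disagree.

```python
# Hypothetical toy data: four cases measured on four 0-1 variables.
cases = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
]


def centroid(rows):
    """Per-variable mean, which is how K-Means defines a cluster center."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def squared_euclidean(a, b):
    """Squared Euclidean distance, the quantity K-Means minimizes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hamming(a, b):
    """Number of variables on which two cases disagree."""
    return sum(x != y for x, y in zip(a, b))


center = centroid(cases)
print(center)  # [0.75, 0.25, 0.5, 1.0]

# For 0-1 data the two distances coincide for every pair of cases.
print(all(squared_euclidean(a, b) == hamming(a, b)
          for a in cases for b in cases))  # True
```

The center `[0.75, 0.25, 0.5, 1.0]` is the proportion of 1s on each variable, so it does not correspond to any possible response pattern. Whether that makes K-Means unsuitable for a given data set is a judgment this sketch does not settle.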

Product: IBM SPSS Statistics (SSLVMB)
Business Unit: IBM Software (BU048)
Component: Not Applicable
Platform: Platform Independent (PF025)
Version: Not Applicable
Line of Business: Data Platform (LOB76)

Historical Number

25479

Document Information

Modified date:
16 April 2020

UID

swg21477401