Selecting records based on cluster field values

By default, Cluster Analysis creates a new field that identifies the cluster group for each record. The default name of this field is ClusterGroupn, where n is an integer that forms a unique field name.

Figure 1. Cluster field added to dataset

To use the values of the cluster field to select records in specific clusters:

  1. From the menus choose:

    Data > Select Cases

    Figure 2. Select Cases dialog
  2. In the Select Cases dialog, select If condition is satisfied and then click If.
    Figure 3. Select Cases: If dialog
  3. Enter the selection condition.

    For example, ClusterGroup1 < 3 will select all records in clusters 1 and 2, and will exclude records in clusters 3 and higher.

  4. Click Continue.
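
The effect of the example condition can be sketched outside the dialog. The following is a minimal illustration in plain Python, not command syntax for the product; the sample records, the id field, and the condition function are hypothetical and exist only to show which records ClusterGroup1 < 3 would select.

```python
# Hypothetical sample data: each record carries the cluster field
# that Cluster Analysis added (assumed here to be named ClusterGroup1).
records = [
    {"id": 1, "ClusterGroup1": 1},
    {"id": 2, "ClusterGroup1": 3},
    {"id": 3, "ClusterGroup1": 2},
    {"id": 4, "ClusterGroup1": 4},
    {"id": 5, "ClusterGroup1": 2},
]


def condition(record):
    """Mirror of the example selection condition: ClusterGroup1 < 3."""
    return record["ClusterGroup1"] < 3


# Records in clusters 1 and 2 satisfy the condition and are selected;
# records in clusters 3 and higher are excluded.
selected_ids = [r["id"] for r in records if condition(r)]
excluded_ids = [r["id"] for r in records if not condition(r)]

print(selected_ids)  # [1, 3, 5]
print(excluded_ids)  # [2, 4]
```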

In the Select Cases dialog, there are several options for what to do with selected and unselected records:

Filter out unselected cases. This creates a new field that specifies a filter condition. Excluded records are not deleted from the dataset. They are retained with a filter status indicator, which is displayed as a diagonal slash through the record number in the Data Editor. This is equivalent to interactively selecting clusters in the Cluster Model Viewer.

Copy selected cases to a new dataset. This creates a new dataset in the current session that contains only the records that meet the filter condition. The original dataset is unaffected.

Delete unselected cases. Unselected records are deleted from the dataset. You can recover deleted records only by closing the file without saving any changes and then reopening it. Once you save the changes to the data file, the deletion is permanent.
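
The difference between the three options can be illustrated with a short sketch. This is plain Python under stated assumptions, not the product's implementation: a dataset is modeled as a list of dictionaries, and the function names, the filter_flag field name, and the sample data are all hypothetical.

```python
import copy


def filter_out_unselected(dataset, condition, flag="filter_flag"):
    """Keep every record and add a 0/1 field marking its selection status."""
    for record in dataset:
        record[flag] = 1 if condition(record) else 0
    return dataset


def copy_selected(dataset, condition):
    """Return a new dataset with copies of the selected records only.

    The original dataset is left unchanged.
    """
    return [copy.deepcopy(r) for r in dataset if condition(r)]


def delete_unselected(dataset, condition):
    """Remove unselected records from the dataset in place."""
    dataset[:] = [r for r in dataset if condition(r)]
    return dataset


def in_first_two_clusters(record):
    # Hypothetical condition matching the example ClusterGroup1 < 3.
    return record["ClusterGroup1"] < 3


def make_dataset():
    # Hypothetical records with cluster values 1, 3, 2, 4.
    return [{"id": i, "ClusterGroup1": c}
            for i, c in enumerate([1, 3, 2, 4], start=1)]


filtered = filter_out_unselected(make_dataset(), in_first_two_clusters)
source = make_dataset()
new_dataset = copy_selected(source, in_first_two_clusters)
deleted = delete_unselected(make_dataset(), in_first_two_clusters)

print(len(filtered), [r["filter_flag"] for r in filtered])  # 4 [1, 0, 1, 0]
print(len(source), [r["id"] for r in new_dataset])          # 4 [1, 3]
print([r["id"] for r in deleted])                           # [1, 3]
```

In this sketch only the delete variant shrinks the dataset it is given, which is why that option is the one that cannot be undone after the file is saved.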

The Select Cases dialog also has an option to use an existing variable as a filter variable (field). If you create a filter condition interactively in the Cluster Model Viewer and save the generated filter field with the dataset, you can use that field to filter records in subsequent sessions.
