Overview (PROXIMITIES command)

PROXIMITIES computes a variety of measures of similarity, dissimilarity, or distance between pairs of cases or pairs of variables for moderate-sized datasets (see “Limitations” below). The matrix that PROXIMITIES writes can be used as input to the ALSCAL, CLUSTER, and FACTOR procedures.

Options

Standardizing Data. With the STANDARDIZE subcommand, you can use several different methods to standardize the values for each variable or for each case.
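
As a minimal sketch of standardizing before computing proximities, the following assumes an active dataset with three numeric variables. The names A, B, and C are hypothetical and are not taken from this section. The example standardizes each variable to z scores and then computes the default distances between cases.

```spss
* Hypothetical variables A, B, and C are standardized to z scores by variable.
PROXIMITIES A B C
  /STANDARDIZE=VARIABLE Z.
```

Specifying CASE in place of VARIABLE would standardize within each case instead.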

Proximity Measures. You can use the MEASURE subcommand to compute a variety of similarity, dissimilarity, and distance measures. (Larger values of a similarity measure indicate greater similarity; larger values of a dissimilarity or distance measure indicate less similarity.) MEASURE can compute measures for interval data, frequency-count data, and binary data. Only one measure can be requested in any one PROXIMITIES procedure. With the VIEW subcommand, you can control whether proximities are computed between variables or between cases.
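
To illustrate how MEASURE and VIEW combine, the sketch below requests a single similarity measure (the correlation between vectors of values) and computes it between variables rather than between cases. The variable names A, B, and C are hypothetical.

```spss
* Hypothetical variables A, B, and C; proximities are computed between variables.
PROXIMITIES A B C
  /VIEW=VARIABLE
  /MEASURE=CORRELATION.
```

Because only one measure is allowed per procedure, a second measure would require a second PROXIMITIES command.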

Output. You can use the PRINT subcommand to display a computed matrix.

Matrix Input and Output. You can use the MATRIX subcommand to write a computed proximities matrix to IBM® SPSS® Statistics data files. This matrix can be used as input to procedures CLUSTER, ALSCAL, and FACTOR. You can also use MATRIX to read a similarity, dissimilarity, or distance matrix. This option lets you rescale or transform existing proximity matrices.
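
The following sketch shows one way the matrix output might be passed to another procedure. It assumes hypothetical variables A, B, and C, and it assumes that replacing the active dataset with the matrix is acceptable, because OUT(*) overwrites the active dataset. The CLUSTER command is shown with only its MATRIX subcommand, and other CLUSTER settings are left at their defaults.

```spss
* Write Euclidean distances between cases to the active dataset.
PROXIMITIES A B C
  /MEASURE=EUCLID
  /MATRIX=OUT(*).

* Read the matrix from the active dataset instead of raw data.
CLUSTER
  /MATRIX=IN(*).
```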

Basic Specification

The basic specification is a variable list, which obtains Euclidean distances between cases based on the values of each specified variable.
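
For example, assuming an active dataset that contains numeric variables named A, B, and C (hypothetical names), the basic specification is a single line:

```spss
* Computes Euclidean distances between cases from hypothetical variables A, B, and C.
PROXIMITIES A B C.
```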

Subcommand Order

  • The variable list must be first.
  • Subcommands can be named in any order.

Operations

  • PROXIMITIES ignores case weights when computing coefficients.

Limitations

  • PROXIMITIES keeps the raw data for the current split-file group in memory. Storage requirements increase rapidly with the number of cases and the number of items (cases or variables) for which PROXIMITIES computes coefficients, because a symmetric proximity matrix for n items contains n(n − 1)/2 distinct off-diagonal coefficients.