Overview (PROXIMITIES command)
PROXIMITIES computes a variety of measures of similarity, dissimilarity, or distance between pairs of cases or pairs of variables for moderate-sized datasets (see “Limitations” below). PROXIMITIES matrix output can be used as input to procedures ALSCAL, CLUSTER, and FACTOR.
Options
Standardizing Data. With the STANDARDIZE subcommand, you can use several different methods to standardize the values for each variable or for each case.
Proximity Measures. You can use the MEASURE subcommand to compute a variety of similarity, dissimilarity, and distance measures. (Similarity measures increase with greater similarity; dissimilarity and distance measures decrease.) MEASURE can compute measures for interval data, frequency-count data, and binary data. Only one measure can be requested in any one PROXIMITIES procedure. With the VIEW subcommand, you can control whether proximities are computed between variables or between cases.
Output. You can use the PRINT subcommand to display a computed matrix.
Matrix Input and Output. You can use the MATRIX subcommand to write a computed proximities matrix to IBM® SPSS® Statistics data files. This matrix can be used as input to procedures CLUSTER, ALSCAL, and FACTOR. You can also use MATRIX to read a similarity, dissimilarity, or distance matrix. This option lets you rescale or transform existing proximity matrices.
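The difference between standardizing "for each variable" and "for each case" under Standardizing Data can be illustrated outside SPSS. The following Python sketch is not the procedure's implementation. It assumes z-score standardization with a sample standard deviation (n - 1 denominator), and the names zscore and standardize are hypothetical helpers invented for this illustration.

```python
from statistics import mean, stdev


def zscore(values):
    """Return z scores for one sequence of numbers."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - m) / s for v in values]


def standardize(rows, by="variable"):
    """Standardize a cases-by-variables table.

    by="variable" standardizes each column; by="case" standardizes each row.
    """
    if by == "case":
        return [zscore(row) for row in rows]
    columns = [zscore(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# Three cases (rows) measured on three variables (columns); made-up values.
data = [[1.0, 10.0, 100.0],
        [2.0, 20.0, 300.0],
        [3.0, 60.0, 200.0]]

by_variable = standardize(data, by="variable")
by_case = standardize(data, by="case")
```

Standardizing by variable puts each column on a common scale, so no single variable dominates a distance computed across variables. Standardizing by case removes differences in level and spread between rows instead.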
Basic Specification
The basic specification is a variable list, which obtains Euclidean distances between cases based on the values of each specified variable.
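As an illustration of what a Euclidean distance between cases means, the computation can be sketched in Python. This is a minimal sketch with made-up data, not the procedure's own code. The function names euclidean and distance_matrix are hypothetical.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two cases given as equal-length lists."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(cases):
    """Full symmetric matrix of distances between every pair of cases."""
    return [[euclidean(a, b) for b in cases] for a in cases]


# Three cases measured on two variables; made-up values.
cases = [[0.0, 0.0],
         [3.0, 4.0],
         [6.0, 8.0]]

matrix = distance_matrix(cases)
for row in matrix:
    print(row)
```

The result is square and symmetric with zeros on the diagonal, because each case is at distance zero from itself.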
Subcommand Order
- The variable list must be first.
- Subcommands can be named in any order.
Operations
- PROXIMITIES ignores case weights when computing coefficients.
Limitations
- PROXIMITIES keeps the raw data for the current split-file group in memory. Storage requirements increase rapidly with the number of cases and the number of items (cases or variables) for which PROXIMITIES computes coefficients.
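One reason the requirements grow quickly is that the number of distinct pairs of items grows roughly with the square of the number of items. The Python sketch below counts pairs only. It makes no claim about how PROXIMITIES actually allocates memory, and pair_count is a hypothetical helper name.

```python
def pair_count(n_items):
    """Number of distinct unordered pairs among n_items items: n(n - 1) / 2."""
    return n_items * (n_items - 1) // 2


for n in (100, 1_000, 10_000):
    print(n, pair_count(n))
```

With 100 items there are 4,950 pairs, and with 10,000 items there are 49,995,000, so a hundredfold increase in items yields roughly a ten-thousandfold increase in coefficients.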