Measures for Interval Data (CLUSTER command)
For interval data, use any one of the following keywords on MEASURE:
SEUCLID. Squared Euclidean distance. The distance between two items, x and y, is the sum of the squared differences between the values for the items. SEUCLID is the measure commonly used with centroid, median, and Ward's methods of clustering. SEUCLID is the default and can also be requested with keyword DEFAULT.
EUCLID. Euclidean distance. The distance between two items, x and y, is the square root of the sum of the squared differences between the values for the items.
CORRELATION. Correlation between vectors of values. This is a pattern similarity measure.
COSINE. Cosine of vectors of values. This is a pattern similarity measure.
CHEBYCHEV. Chebychev distance metric. The distance between two items is the maximum absolute difference between the values for the items.
BLOCK. City-block or Manhattan distance. The distance between two items is the sum of the absolute differences between the values for the items.
MINKOWSKI(p). Distance in an absolute Minkowski power metric. The distance between two items is the pth root of the sum of the absolute differences to the pth power between the values for the items. Appropriate selection of the integer parameter p yields Euclidean and many other distance metrics.
POWER(p,r). Distance in an absolute power metric. The distance between two items is the rth root of the sum of the absolute differences to the pth power between the values for the items. Appropriate selection of the integer parameters p and r yields Euclidean, squared Euclidean, Minkowski, city-block, and many other distance metrics.
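The definitions above can be sketched in a few lines of Python. This is only an illustration of the formulas as worded here, not the procedure's own implementation. All function names are hypothetical, and treating CORRELATION as the Pearson correlation between the two vectors of values is an assumption.

```python
import math

# Hypothetical helper names; these illustrate the formulas only.

def power_distance(x, y, p, r):
    """POWER(p,r): rth root of the sum of absolute differences to the pth power."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / r)

def minkowski(x, y, p):
    """MINKOWSKI(p): the POWER metric with r equal to p."""
    return power_distance(x, y, p, p)

def euclid(x, y):
    """EUCLID: POWER(2,2)."""
    return power_distance(x, y, 2, 2)

def seuclid(x, y):
    """SEUCLID: POWER(2,1), so no root is taken."""
    return power_distance(x, y, 2, 1)

def block(x, y):
    """BLOCK: POWER(1,1), the sum of absolute differences."""
    return power_distance(x, y, 1, 1)

def chebychev(x, y):
    """CHEBYCHEV: maximum absolute difference."""
    return max(abs(a - b) for a, b in zip(x, y))

def cosine(x, y):
    """COSINE: cosine of the angle between the two vectors of values."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def correlation(x, y):
    """CORRELATION, assumed here to be Pearson: cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])

# Two items measured on three variables; the differences are 3, 4, and 0.
x = [1.0, 2.0, 3.0]
y = [4.0, 6.0, 3.0]

print(seuclid(x, y))    # 25.0
print(euclid(x, y))     # 5.0
print(block(x, y))      # 7.0
print(chebychev(x, y))  # 4.0
```

Expressing EUCLID, SEUCLID, and BLOCK through the single POWER formula mirrors the statement that appropriate choices of p and r yield those metrics. CORRELATION and COSINE are similarities rather than distances, so larger values indicate items that are more alike.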