Overview (DISTANCE CORRELATION command)
The basic specification for DISTANCE CORRELATION includes VARIABLES, which specifies a list of variables to be correlated. The procedure has the subcommands CRITERIA, PRINT, and PLOT. With the subcommands, you can customize the computation and output display options.
Options
- Variable list
- The variable list is required and allows only numeric variables. The procedure computes pairwise distance correlations between all combinations of variables within the list.
- The variable list must contain at least two numeric variables. If fewer than two variables are provided, an error message is issued.
- Each variable in the list is compared with every other variable. For example, if the list contains var1, var2, and var3, the procedure computes R(var1, var2), R(var1, var3), and R(var2, var3).
- ID Subcommand
- The ID subcommand specifies a variable whose values or value labels identify the casewise listing. By default, cases are labeled by their case number.
- The only specification is the name of a single variable that exists in the active dataset.
- Only the first eight characters of the variable’s value labels are used to label cases. If the variable has no value labels, the values are used.
- Only the first eight characters of a string variable are used to label cases.
- MISSING Subcommand
- The MISSING subcommand controls the treatment of missing values.
- LISTWISE: By default, missing values are excluded listwise. Cases with a missing value for any variable that is named in a list are excluded from the computation of all coefficients in the Correlations table. The number of used cases is displayed in a single annotation. Each variable list on a command is evaluated separately. This option decreases the amount of required memory and significantly decreases computational time.
- Criteria
- The CRITERIA subcommand provides control parameters for normalization and other key options for customizing the output.
- Print
- The PRINT subcommand provides options to display the essential measurements that are used in the computation, such as the number of observations, mean, standard deviation, minimum, and maximum values for each variable. The subcommand also displays distance correlation coefficients, distance covariance estimates, and the distance matrix.
- Plot
- The PLOT subcommand generates scatterplots that show the pairwise distances or correlations between the selected variables. The parameters XAXIS and YAXIS set the variables that are shown on the X-axis and Y-axis.
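
The pairwise computation described under Variable list, together with the listwise treatment of missing values, can be illustrated outside the procedure with a short Python sketch. This is not the procedure's implementation. It assumes the biased sample estimator of distance correlation based on double-centered distance matrices. Whether the procedure uses this estimator or a bias-corrected one is not stated here. The function names and the dictionary-based input are hypothetical and serve only this illustration.

```python
from itertools import combinations

import numpy as np


def _centered_distances(x):
    """Double-centered matrix of absolute differences for a 1-D sample."""
    d = np.abs(x[:, None] - x[None, :])
    row_means = d.mean(axis=1, keepdims=True)
    col_means = d.mean(axis=0, keepdims=True)
    return d - row_means - col_means + d.mean()


def distance_correlation(x, y):
    """Biased sample distance correlation R(x, y) (assumed estimator)."""
    a = _centered_distances(np.asarray(x, dtype=float))
    b = _centered_distances(np.asarray(y, dtype=float))
    dcov2 = (a * b).mean()          # squared distance covariance
    dvar_x = (a * a).mean()         # squared distance variance of x
    dvar_y = (b * b).mean()         # squared distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0                  # a constant variable has no distance variance
    # Guard against tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def pairwise_distance_correlations(data):
    """Return ({(name_i, name_j): R}, number_of_used_cases).

    `data` maps variable names to equal-length numeric sequences;
    missing values are represented as NaN (an assumption of this sketch).
    """
    names = list(data)
    if len(names) < 2:
        raise ValueError("at least two variables are required")
    matrix = np.column_stack([np.asarray(data[n], dtype=float) for n in names])
    # Listwise deletion: drop every case with a missing value in any variable.
    complete = ~np.isnan(matrix).any(axis=1)
    matrix = matrix[complete]
    result = {}
    for i, j in combinations(range(len(names)), 2):
        result[(names[i], names[j])] = distance_correlation(matrix[:, i], matrix[:, j])
    return result, int(complete.sum())
```

For a list of three variables, the returned dictionary holds the three pairs (var1, var2), (var1, var3), and (var2, var3), and the second return value corresponds to the number of used cases after listwise exclusion.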