Overview (DISTANCE CORRELATION command)

The basic specification for DISTANCE CORRELATION is VARIABLES, which lists the variables to be correlated. The CRITERIA, PRINT, and PLOT subcommands let you customize the computation and the output display.

Options

Variable list
The variable list is required and allows only numeric variables. The procedure computes pairwise distance correlations between all combinations of variables within the list.
  • The variable list must contain at least two numeric variables, so that at least one pair of distinct variables is available. If fewer than two variables are provided, an error message is issued.
  • Each variable in the list is paired with every other variable. For example, if the list contains var1, var2, and var3, the procedure computes R(var1, var2), R(var1, var3), and R(var2, var3).
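The pairwise rule can be illustrated outside the procedure. The following Python sketch is not the procedure's implementation: it assumes the standard sample (V-statistic) definition of distance correlation, and the function names and sample data are hypothetical. It enumerates every unordered pair of variables and computes one coefficient per pair.

```python
from itertools import combinations
from math import sqrt


def _centered_distances(x):
    """Double-centered matrix of absolute pairwise distances for one variable."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric lists."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2_xy = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(v * v for row in a for v in row) / n**2
    dvar2_y = sum(v * v for row in b for v in row) / n**2
    denom = sqrt(dvar2_x * dvar2_y)
    # max() guards against tiny negative values caused by rounding.
    return sqrt(max(dcov2_xy, 0.0) / denom) if denom > 0 else 0.0


def pairwise_distance_correlations(data):
    """data: dict mapping variable name -> list of numeric values."""
    if len(data) < 2:
        raise ValueError("at least two variables are required")
    return {(u, v): distance_correlation(data[u], data[v])
            for u, v in combinations(data, 2)}


# Hypothetical sample data: var2 is a linear function of var1,
# var3 is a nonlinear (quadratic) function of var1.
data = {
    "var1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "var2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "var3": [4.0, 1.0, 0.0, 1.0, 4.0],
}
result = pairwise_distance_correlations(data)
for pair, r in result.items():
    print(pair, round(r, 4))
```

For three variables the result holds three entries, one for each of the pairs (var1, var2), (var1, var3), and (var2, var3). A list of k variables produces k(k-1)/2 coefficients.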
ID Subcommand
The ID subcommand specifies a variable whose values or value labels identify the casewise listing. By default, cases are labeled by their case number.
  • The only specification is the name of a single variable that exists in the active dataset.
  • Only the first eight characters of the variable’s value labels are used to label cases. If the variable has no value labels, the values are used.
  • Only the first eight characters of a string variable are used to label cases.
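The labeling rules above can be sketched in Python as follows. The function name and the dictionary of value labels are hypothetical. How numeric values without labels are formatted is not stated in this section, so the sketch simply converts them to text.

```python
def case_label(value, value_labels=None, width=8):
    """Return the text used to label one case.

    value_labels: optional dict mapping values to value labels (hypothetical
    representation). Labels and string values are cut to `width` characters.
    """
    if value_labels and value in value_labels:
        return value_labels[value][:width]
    if isinstance(value, str):
        return value[:width]
    # Assumption: numeric values without labels are shown as-is.
    return str(value)


print(case_label(1, {1: "Northeastern region"}))  # value label, truncated
print(case_label("Massachusetts"))                # string value, truncated
print(case_label(42))                             # numeric value, no label
```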

MISSING controls the treatment of missing values.

LISTWISE

By default, missing values are excluded listwise. Cases with a missing value for any variable that is named in a list are excluded from the computation of all coefficients in the Correlations table. The number of cases used is displayed in a single annotation. Each variable list on a command is evaluated separately. This option decreases the amount of required memory and significantly decreases computational time.
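Listwise exclusion can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and representing missing values as None or NaN is an assumption made for the example. A case is kept only if it has a valid value for every variable in the list, and the number of cases kept is returned alongside the filtered data.

```python
from math import isnan


def listwise_complete(data):
    """Drop every case that has a missing value in any variable.

    data: dict mapping variable name -> list of values, all the same length.
    Missing values are assumed to be None or NaN.
    Returns (filtered_data, number_of_cases_used).
    """
    def is_missing(v):
        return v is None or (isinstance(v, float) and isnan(v))

    names = list(data)
    n_cases = len(data[names[0]])
    keep = [i for i in range(n_cases)
            if not any(is_missing(data[name][i]) for name in names)]
    filtered = {name: [data[name][i] for i in keep] for name in names}
    return filtered, len(keep)


# Hypothetical data: case 3 is missing var1, case 2 is missing var2.
data = {
    "var1": [1.0, 2.0, None, 4.0],
    "var2": [5.0, float("nan"), 7.0, 8.0],
    "var3": [9.0, 10.0, 11.0, 12.0],
}
complete, n_used = listwise_complete(data)
print(n_used, complete)
```

Because every coefficient is computed from the same set of complete cases, a single case count applies to the whole table.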

Criteria
The CRITERIA subcommand provides control parameters for normalization and other key options for customizing the output.
Print
The PRINT subcommand provides options to display the basic statistics that are used in the computation, such as the number of observations, mean, standard deviation, minimum, and maximum for each variable. It also displays the distance correlation coefficients, the distance covariance estimates, and the distance matrix.
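The per-variable statistics named above can be sketched in Python as follows. The function name and sample values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, because this section does not state which form the procedure reports.

```python
from statistics import mean, stdev


def describe(values):
    """Summary statistics for one numeric variable (list of numbers)."""
    return {
        "n": len(values),
        "mean": mean(values),
        "std_dev": stdev(values),  # sample standard deviation (assumption)
        "min": min(values),
        "max": max(values),
    }


summary = describe([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(summary)
```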
Plot
The PLOT subcommand generates scatterplots that show the pairwise distances or correlations between the selected variables. The XAXIS and YAXIS parameters set the variables that are shown on the x-axis and y-axis.