Overview (MULTIPLE CORRESPONDENCE command)

MULTIPLE CORRESPONDENCE (Multiple Correspondence Analysis, also known as homogeneity analysis) quantifies nominal (categorical) data by assigning numerical values to the cases (objects) and categories. The values are chosen so that, in the low-dimensional representation of the data, objects within the same category are close together and objects in different categories are far apart. Each object is placed as close as possible to the category points of the categories that apply to it. In this way, the categories divide the objects into homogeneous subgroups. Variables are considered homogeneous when they classify objects in the same categories into the same subgroups.

Basic Specification

The basic specification is the command MULTIPLE CORRESPONDENCE with the VARIABLES and ANALYSIS subcommands.
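
As a minimal sketch of the basic specification, the following command lists three variables and analyzes all of them. The variable names region, occupation, and education are hypothetical placeholders. No other subcommands are given, so the procedure's defaults apply.

```
* Hypothetical variable names; substitute variables from your active dataset.
MULTIPLE CORRESPONDENCE
  /VARIABLES = region occupation education
  /ANALYSIS = region occupation education.
```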

Syntax Rules

  • The VARIABLES and ANALYSIS subcommands must always appear.
  • All subcommands can appear in any order.
  • For the first subcommand after the procedure name, a slash is accepted, but not required.
  • Variables specified in the ANALYSIS subcommand must be found in the VARIABLES subcommand.
  • Variables specified in the SUPPLEMENTARY subcommand must be found in the ANALYSIS subcommand.
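
The two containment rules above can be sketched as follows. The variable names are hypothetical, and the VARIABLE(...) form of the SUPPLEMENTARY specification is assumed here; check it against the description of the SUPPLEMENTARY subcommand before use.

```
* Hypothetical variable names, for illustration only.
* Every variable in ANALYSIS also appears in VARIABLES.
* The supplementary variable, education, also appears in ANALYSIS.
MULTIPLE CORRESPONDENCE
  /VARIABLES = region occupation education
  /ANALYSIS = region occupation education
  /SUPPLEMENTARY = VARIABLE(education).
```

Naming a variable in SUPPLEMENTARY that is missing from ANALYSIS, or in ANALYSIS that is missing from VARIABLES, would violate these rules.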

Operations

  • Repeating a subcommand causes a syntax error, and the procedure terminates.

Limitations

  • MULTIPLE CORRESPONDENCE operates on category indicator variables, which should be positive integers. Use the DISCRETIZATION subcommand to convert fractional-value variables and string variables into positive integers. If DISCRETIZATION is not specified, fractional-value variables are automatically converted into positive integers by grouping them into seven categories (or into as many categories as the variable has distinct values, if that number is less than seven) with a close-to-normal distribution, and string variables are automatically converted into positive integers by ranking.
  • In addition to system-missing values and user-defined missing values, MULTIPLE CORRESPONDENCE treats category indicator values less than 1 as missing. If one of the values of a categorical variable is coded 0 or a negative value and you want to treat it as a valid category, use the COMPUTE command to add a constant to the values of that variable so that the lowest value becomes 1. You can also use the RANKING option of the DISCRETIZATION subcommand for this purpose, except for variables that you want to treat as numerical, because ranking does not maintain the spacing of the categories.
  • There must be at least three valid cases.
  • Split-File has no implications for MULTIPLE CORRESPONDENCE.
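
The recoding and discretization options described in the limitations above might be combined as in the following sketch. All variable names are hypothetical: answer is assumed to be coded 0 through 4, region is assumed to be a string variable, and income is assumed to hold fractional values. The keyword forms RANKING and GROUPING NCAT=5 are assumed from the DISCRETIZATION subcommand; consult its description for the full set of options.

```
* Hypothetical variables; adjust names and the added constant to your data.
* Shift answer so that its lowest value, 0, becomes 1 and is not treated as missing.
COMPUTE answer = answer + 1.

* Rank the string variable and group the fractional-value variable into five categories.
MULTIPLE CORRESPONDENCE
  /VARIABLES = answer region income
  /ANALYSIS = answer region income
  /DISCRETIZATION = region(RANKING) income(GROUPING NCAT=5).
```

If the DISCRETIZATION subcommand were omitted here, the automatic conversions described above would apply to region and income instead.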