Multiple Correspondence Analysis

Multiple Correspondence Analysis quantifies nominal (categorical) data by assigning numerical values to the cases (objects) and categories so that objects within the same category are close together and objects in different categories are far apart. Each object is placed as close as possible to the category points of the categories that apply to it. In this way, the categories divide the objects into homogeneous subgroups. Variables are considered homogeneous when they classify objects in the same categories into the same subgroups.
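
The quantification idea described above can be illustrated with a minimal one-dimensional sketch. It alternates two averaging steps: each category point becomes the mean score of the objects in that category, and each object score becomes the mean of the category points that apply to the object. This is an illustrative reciprocal-averaging scheme, not necessarily the algorithm the procedure itself uses; the function names and the toy data are hypothetical.

```python
import math

def standardize(values):
    """Center to mean 0 and scale to variance 1 (assumes non-constant input)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    scale = math.sqrt(sum(c * c for c in centered) / n)
    return [c / scale for c in centered]

def category_centroids(data, scores):
    """For each variable, map every category to the mean score of its objects."""
    centroids = []
    for j in range(len(data[0])):
        groups = {}
        for row, score in zip(data, scores):
            groups.setdefault(row[j], []).append(score)
        centroids.append({c: sum(g) / len(g) for c, g in groups.items()})
    return centroids

def quantify_one_dimension(data, iterations=100):
    """Return (object scores, category quantifications) for one dimension."""
    n_vars = len(data[0])
    # Arbitrary non-constant starting scores (an assumption of this sketch).
    scores = standardize([float(i) for i in range(len(data))])
    for _ in range(iterations):
        quants = category_centroids(data, scores)
        # Each object moves to the mean of the category points that apply to it.
        scores = standardize(
            [sum(quants[j][row[j]] for j in range(n_vars)) / n_vars for row in data]
        )
    return scores, category_centroids(data, scores)

# Hypothetical toy data: job category, minority classification, gender.
data = [
    ["clerical", "no", "f"],
    ["clerical", "no", "f"],
    ["manager", "yes", "m"],
    ["manager", "yes", "m"],
    ["clerical", "no", "m"],
    ["manager", "yes", "f"],
]
scores, quants = quantify_one_dimension(data)
```

In this toy data, objects with identical category profiles receive identical scores, and the clerical and manager objects end up on opposite sides of zero.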

Example. Multiple Correspondence Analysis could be used to graphically display the relationship between job category, minority classification, and gender. You might find that minority classification and gender discriminate between people but that job category does not. You might also find that the Latino and African-American categories are similar to each other.
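
To make the notion of a variable that "discriminates" concrete, the sketch below computes one common form of a discrimination measure: the share of the variance in the object scores that is explained by the category means of a variable. The text does not give a formula, so this definition, the function name, and the example scores are assumptions made for illustration.

```python
def discrimination(scores, categories):
    """Between-category variance of the scores divided by their total variance."""
    n = len(scores)
    mean = sum(scores) / n
    total = sum((s - mean) ** 2 for s in scores) / n
    groups = {}
    for score, category in zip(scores, categories):
        groups.setdefault(category, []).append(score)
    between = sum(
        len(g) * (sum(g) / len(g) - mean) ** 2 for g in groups.values()
    ) / n
    return between / total

# Hypothetical object scores on one dimension and two hypothetical variables.
object_scores = [-1.0, -1.0, 1.0, 1.0]
gender = ["f", "f", "m", "m"]   # separates the scores perfectly
job = ["a", "b", "a", "b"]      # does not separate the scores at all

gender_measure = discrimination(object_scores, gender)
job_measure = discrimination(object_scores, job)
```

A value near 1 indicates that the variable's categories separate the objects well on that dimension; a value near 0 indicates that they do not.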

Statistics and plots. Object scores, discrimination measures, iteration history, correlations of original and transformed variables, category quantifications, descriptive statistics, object points plots, biplots, category plots, joint category plots, transformation plots, and discrimination measures plots.

Multiple Correspondence Analysis Data Considerations

Data. String variable values are always converted into positive integers by ascending alphanumeric order. User-defined missing values, system-missing values, and values less than 1 are considered missing; you can recode or add a constant to variables with values less than 1 to make them nonmissing.
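
The data rules above can be sketched as follows. The helper names are hypothetical, None stands in for a missing value, and Python's default string sort is only an approximation of the procedure's alphanumeric ordering.

```python
def recode_strings(values):
    """Map distinct strings to 1, 2, 3, ... in ascending sort order."""
    categories = sorted({v for v in values if v is not None})
    codes = {c: i for i, c in enumerate(categories, start=1)}
    return [codes.get(v) for v in values], codes

def mark_missing(values):
    """Treat values less than 1 (and existing None entries) as missing."""
    return [v if v is not None and v >= 1 else None for v in values]

def shift_to_positive(values):
    """Add a constant so the smallest non-missing value becomes 1."""
    valid = [v for v in values if v is not None]
    offset = 1 - min(valid) if valid and min(valid) < 1 else 0
    return [None if v is None else v + offset for v in values]

recoded, mapping = recode_strings(["m", "f", "m", None])
as_missing = mark_missing([0, 1, -2, 3, None])
shifted = shift_to_positive([0, 1, 2, None])
```

Shifting before the analysis keeps categories coded 0 or below in the solution instead of dropping them as missing.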

Assumptions. All variables have the multiple nominal scaling level. The data must contain at least three valid cases. The analysis is based on positive integer data. The discretization option will automatically categorize a fractional-valued variable by grouping its values into categories with a close-to-normal distribution and will automatically convert values of string variables into positive integers. You can specify other discretization schemes.
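
One way to picture grouping fractional values into categories with a close-to-normal distribution is the rank-based sketch below. The text does not specify the procedure's actual discretization algorithm, so everything here is an assumption made for illustration: the function name, the choice of seven categories, and the cut points taken from equal-width intervals of a standard normal between -3 and 3.

```python
from statistics import NormalDist

def normal_grouping(values, n_categories=7):
    """Assign integer codes 1..n_categories with roughly normal frequencies."""
    n = len(values)
    normal = NormalDist()
    width = 6.0 / n_categories
    # Cumulative proportions at the boundaries between categories.
    cuts = [normal.cdf(-3.0 + width * c) for c in range(1, n_categories)]
    # Collect the rank positions of each distinct value so ties share a code.
    positions = {}
    for rank, value in enumerate(sorted(values)):
        positions.setdefault(value, []).append((rank + 0.5) / n)
    codes = []
    for value in values:
        p = sum(positions[value]) / len(positions[value])
        codes.append(1 + sum(p > cut for cut in cuts))
    return codes

# Hypothetical fractional-valued variable.
fractional = [i / 10 for i in range(100)]
codes = normal_grouping(fractional)
tied = normal_grouping([1.5, 1.5, 1.5, 9.0])
```

Because the assignment depends only on ranks, the resulting codes are positive integers whose frequencies peak in the middle category regardless of the original scale.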

Related procedures. For two variables, Multiple Correspondence Analysis is analogous to Correspondence Analysis. If you believe that variables possess ordinal or numerical properties, Categorical Principal Components Analysis should be used. If sets of variables are of interest, Nonlinear Canonical Correlation Analysis should be used.

To Obtain a Multiple Correspondence Analysis

This feature requires the Categories option.

  1. From the menus choose:

    Analyze > Dimension Reduction > Optimal Scaling...

  2. Select All variables multiple nominal.
  3. Select One set.
  4. Click Define.
  5. Select at least two analysis variables and specify the number of dimensions in the solution.
  6. Click OK.

You may optionally specify supplementary variables, which are fitted into the solution found, or labeling variables for the plots.

This procedure pastes MULTIPLE CORRESPONDENCE command syntax.