Analyze Patterns (Multiple Imputation)
Analyze Patterns provides descriptive measures of the patterns of missing values in the data, and can be useful as an exploratory step before imputation. This is a Multiple Imputation procedure.
Example. A telecommunications provider wants to better understand service usage patterns in its customer database. They have complete data for services used by their customers, but the demographic information collected by the company has a number of missing values. Analyzing the patterns of missing values can help determine next steps for imputation.
From the menus choose:
- Select at least two analysis variables. The procedure analyzes patterns of missing data for these variables.
Optional Settings
Analysis Weight. This variable contains analysis (regression or sampling) weights. The procedure incorporates analysis weights in summaries of missing values. Cases with a negative or zero analysis weight are excluded.
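As a rough illustration of how analysis weights can enter a missing-value summary, the sketch below computes a weighted percentage of missing values for one variable after dropping cases with a nonpositive weight. It is not the procedure's actual algorithm. The function name `weighted_missing_percent`, the use of `None` to mark a missing value, and the exclusion of cases with a missing weight are assumptions made for this example.

```python
def weighted_missing_percent(values, weights):
    """Weighted percent of missing values (None) for a single variable.

    Cases with a zero or negative weight are excluded, as described above.
    Assumption: cases with a missing (None) weight are excluded as well.
    """
    kept = [(v, w) for v, w in zip(values, weights) if w is not None and w > 0]
    total = sum(w for _, w in kept)
    if total == 0:
        return 0.0  # no usable cases remain
    missing = sum(w for v, w in kept if v is None)
    return 100.0 * missing / total


# Hypothetical data: five cases, two of which are excluded by their weight.
income = [52.0, None, 31.5, None, 44.0]
weight = [1.0, 2.0, 0.0, -1.0, 1.0]
print(weighted_missing_percent(income, weight))  # 50.0
```

Three cases remain here, with a total weight of 4, and the one missing case carries a weight of 2, so half of the weighted cases are missing.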
Output. The following optional output is available:
- Summary of missing values. This displays a paneled pie chart that shows the number and percent of analysis variables, cases, or individual data values that have one or more missing values.
- Patterns of missing values. This displays tabulated patterns of missing values. Each pattern corresponds to a group of cases with the same pattern of incomplete and complete data on analysis variables. You can use this output to determine whether the monotone imputation method can be used for your data, or if not, how closely your data approximate a monotone pattern. In a monotone pattern, the variables can be ordered so that a case with a missing value on one variable also has missing values on all subsequent variables. The procedure orders analysis variables to reveal or approximate a monotone pattern. If no nonmonotone pattern exists after reordering, you can conclude that the data have a monotone pattern when the analysis variables are ordered in this way.
- Variables with the highest frequency of missing values. This displays a table of analysis variables sorted by percent of missing values in decreasing order. The table includes descriptive statistics (mean and standard deviation) for scale variables.
You can control the maximum number of variables to display and the minimum percentage missing for a variable to be included in the display. Only variables that meet both criteria are displayed. For example, setting the maximum number of variables to 50 and the minimum percentage missing to 25 requests that the table display up to 50 variables that have at least 25% missing values. If there are 60 analysis variables but only 15 have 25% or more missing values, the output includes only 15 variables.
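The pattern tabulation, the monotone check, and the display filter described above can be sketched in a few lines of Python. This is an illustrative approximation under stated assumptions, not the procedure's implementation. Cases are represented as dictionaries with `None` for a missing value, and variables are reordered by increasing count of missing values, which recovers a monotone ordering whenever one exists. All function names are hypothetical.

```python
from collections import Counter


def n_missing(rows, var):
    """Number of cases with a missing (None) value on var."""
    return sum(row.get(var) is None for row in rows)


def order_for_monotone(rows, variables):
    """Assumed heuristic: order variables from fewest to most missing values."""
    return sorted(variables, key=lambda v: n_missing(rows, v))


def missing_patterns(rows, variables):
    """Count cases per missingness pattern; True marks a missing value."""
    return Counter(tuple(row.get(v) is None for v in variables) for row in rows)


def is_monotone(patterns):
    """True if no pattern has an observed value after a missing one."""
    return not any(
        earlier and not later
        for p in patterns
        for earlier, later in zip(p, p[1:])
    )


def top_missing(rows, variables, max_vars, min_pct):
    """Variables with at least min_pct percent missing, most missing first,
    truncated to max_vars entries."""
    if not rows:
        return []
    pct = {v: 100.0 * n_missing(rows, v) / len(rows) for v in variables}
    ranked = sorted((v for v in variables if pct[v] >= min_pct),
                    key=lambda v: pct[v], reverse=True)
    return [(v, pct[v]) for v in ranked[:max_vars]]


# Hypothetical demographic data with some missing values.
rows = [
    {"age": 34, "income": 52.0, "marital": 1},
    {"age": 41, "income": None, "marital": None},
    {"age": 29, "income": 31.5, "marital": None},
    {"age": 50, "income": 44.0, "marital": 0},
]
variables = ["marital", "income", "age"]

ordered = order_for_monotone(rows, variables)
patterns = missing_patterns(rows, ordered)
print(ordered)                                # ['age', 'income', 'marital']
print(is_monotone(patterns))                  # True
print(top_missing(rows, variables, 50, 25))   # [('marital', 50.0), ('income', 25.0)]
```

In this example `age` is excluded from the final table because none of its values are missing, which mirrors the case above where fewer variables meet the minimum percentage than the maximum number allows.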
This procedure pastes MULTIPLE IMPUTATION command syntax.