IMPUTE Subcommand (MULTIPLE IMPUTATION command)
The IMPUTE subcommand controls the imputation method and model. By default, the AUTO method is used to impute missing data values.
METHOD Keyword
The METHOD keyword specifies the imputation method. Specify one of the following options:
AUTO. Automatic method. This is the default. Chooses the best method based on a scan of the data. The monotone method is used when the data have a monotone pattern of missingness; otherwise, if the data have a nonmonotone pattern, FCS is used. The AUTO method internally sorts the analysis variables in order from the least missing to most missing to detect a monotone pattern of missingness if it exists; thus, order of variables in the variable list is ignored when AUTO is used.
MONOTONE. Monotone method. This is a noniterative method that can be used only when the data have a monotone pattern of missingness. A monotone pattern exists when you can order the variables such that, if a variable has a nonmissing value, all preceding variables also have nonmissing values. The monotone method imputes missing values sequentially in the order specified in the variable list. An error occurs if the input data do not have a monotone pattern for the order in which variables are specified.
FCS. Fully conditional specification method. This is an iterative Markov chain Monte Carlo (MCMC) method that can be used when the pattern of missing data is arbitrary (monotone or nonmonotone). In each iteration, the FCS method sequentially imputes missing values in the order specified in the variable list.
NONE. None. No imputation is performed. Choose this option if you want analyses of missingness only. A warning is issued if you turn off imputation and suppress all analyses of missing values.
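The monotone check and the AUTO ordering described above can be sketched in a few lines of Python. This is only an illustration of the stated definition, not the procedure's actual implementation; the function names and the use of None to mark a missing value are assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence

# One case (row) maps variable names to values; None marks a missing value (assumed convention).
Row = Dict[str, Optional[float]]


def is_monotone(rows: Sequence[Row], order: Sequence[str]) -> bool:
    """Return True if, for the given variable order, a nonmissing value
    never follows a missing value within any case."""
    for row in rows:
        seen_missing = False
        for name in order:
            if row[name] is None:
                seen_missing = True
            elif seen_missing:
                # Nonmissing value whose preceding variable is missing.
                return False
    return True


def order_least_to_most_missing(rows: Sequence[Row], names: Sequence[str]) -> List[str]:
    """Sort variables from least missing to most missing (stable for ties)."""
    counts = {n: sum(1 for r in rows if r[n] is None) for n in names}
    return sorted(names, key=lambda n: counts[n])


def choose_method(rows: Sequence[Row], names: Sequence[str]) -> str:
    """Mimic the AUTO decision: MONOTONE if a monotone ordering exists, else FCS."""
    order = order_least_to_most_missing(rows, names)
    return "MONOTONE" if is_monotone(rows, order) else "FCS"
```

Because the sketch re-sorts the variables before testing, the order in which names are passed to choose_method does not affect the result, whereas is_monotone depends on the order it is given, as the MONOTONE method does.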
NIMPUTATIONS Keyword
By default, five imputations are performed. To request a different number of imputations, specify a positive integer. As a rule of thumb, the higher the degree of missing information in the data, the more imputations are needed to obtain reasonable efficiency relative to an infinite number of imputations.
- NIMPUTATIONS is ignored if imputation is turned off.
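The rule of thumb can be made concrete with a widely cited approximation due to Rubin (1987), which is not stated in this section: with m imputations and a fraction of missing information gamma, the efficiency relative to infinitely many imputations is roughly 1 / (1 + gamma / m). The helper below is a hypothetical illustration of that formula only.

```python
def relative_efficiency(fraction_missing_info: float, n_imputations: int) -> float:
    """Rubin's approximation: efficiency of m imputations relative to
    an infinite number, given the fraction of missing information."""
    if not 0.0 <= fraction_missing_info <= 1.0:
        raise ValueError("fraction_missing_info must be between 0 and 1")
    if n_imputations < 1:
        raise ValueError("n_imputations must be a positive integer")
    return 1.0 / (1.0 + fraction_missing_info / n_imputations)


# Compare the default (5) with a larger number of imputations
# when half of the information is missing.
for m in (5, 20):
    print(m, round(relative_efficiency(0.5, m), 3))
```

Under this approximation the gain from additional imputations grows as the fraction of missing information increases.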
SCALEMODEL Keyword
The AUTO, FCS, and MONOTONE methods are multivariate methods that can be used when several variables have missing values. Each uses univariate models when imputing values of individual variables.
By default, the type of univariate model that is used depends on the measurement level of the variable whose missing values are to be imputed. Multinomial logistic regression is always used for categorical variables. For scale variables, linear regression (LINEAR) is used by default. Optionally you can use the predictive mean matching (PMM) method for scale variables. PMM is a variant of linear regression that ensures that the imputed values are plausible. For PMM the imputed value always matches an observed value (specifically, the observed value that is closest to the value drawn by the imputation model).
- SCALEMODEL is honored only when the FCS or MONOTONE method is chosen explicitly. Linear regression is always used for scale variables when the AUTO method is requested. A warning is issued if SCALEMODEL is specified for the AUTO method.
- SCALEMODEL is ignored if imputation is turned off.
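The matching step of PMM, as described above, amounts to replacing the value drawn by the model with the nearest observed value. The sketch below shows only that step; the function name is hypothetical, and the way the procedure actually draws values or breaks ties is not covered here.

```python
from typing import Sequence


def pmm_match(drawn_value: float, observed_values: Sequence[float]) -> float:
    """Return the observed value closest to the value drawn by the model.
    Ties resolve to the first closest value in observed_values (an assumption)."""
    if not observed_values:
        raise ValueError("at least one observed value is required")
    return min(observed_values, key=lambda v: abs(v - drawn_value))


# A draw that falls outside the observed range is pulled back to an observed value.
observed = [1.0, 4.0, 5.0, 9.0]
print(pmm_match(4.4, observed))
print(pmm_match(-3.0, observed))
```

Because the result is always one of the observed values, imputed values cannot fall outside the observed range, which is the sense in which PMM keeps them plausible.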
INTERACTIONS Keyword
By default, the imputation model for each variable includes a constant term and main effects for predictor variables. You can optionally include all possible two-way interactions among categorical predictor variables. Specify INTERACTIONS=TWOWAY. Interaction terms do not include scale predictors.
- INTERACTIONS is honored only when the FCS or MONOTONE method is chosen explicitly. Two-way interactions are not included when the AUTO method is requested. A warning is issued if INTERACTIONS is specified for the AUTO method.
- INTERACTIONS is ignored if imputation is turned off or if there are fewer than two categorical predictors.
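To illustrate which terms INTERACTIONS=TWOWAY adds, the following sketch lists the effects of one imputation model. The function name, the term labels, and the "a*b" notation for an interaction are assumptions made for the example, not output of the procedure.

```python
from itertools import combinations
from typing import List, Sequence


def model_terms(scale_predictors: Sequence[str],
                categorical_predictors: Sequence[str],
                interactions: str = "NONE") -> List[str]:
    """List a constant, all main effects and, for TWOWAY, every pairwise
    interaction among categorical predictors only."""
    terms = ["(constant)"] + list(scale_predictors) + list(categorical_predictors)
    # With fewer than two categorical predictors there is nothing to interact.
    if interactions == "TWOWAY" and len(categorical_predictors) >= 2:
        terms += [f"{a}*{b}" for a, b in combinations(categorical_predictors, 2)]
    return terms


print(model_terms(["age"], ["sex", "region", "edu"], "TWOWAY"))
```

With k categorical predictors, TWOWAY adds k(k-1)/2 interaction terms, so the model can grow quickly as categorical predictors are added.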
MAXPCTMISSING Keyword
By default, analysis variables are imputed and used as predictors without regard to how many missing values they have, provided they have at least enough data to estimate an imputation model. The optional MAXPCTMISSING keyword is used to exclude variables that have a high percentage of missing values. Specify the maximum allowable percentage of missing values as a positive number less than 100. For example, if you specify MAXPCTMISSING=50, analysis variables that have more than 50% missing values are not imputed, nor are they used as predictors in imputation models.
- MAXPCTMISSING is ignored if imputation is turned off.
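The exclusion rule can be sketched as a simple filter. This is an illustration under stated assumptions: None marks a missing value, and the function name is hypothetical.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, Optional[float]]


def eligible_variables(rows: Sequence[Row], names: Sequence[str],
                       max_pct_missing: Optional[float] = None) -> List[str]:
    """Keep variables whose percentage of missing values does not exceed
    max_pct_missing; None means no limit is applied."""
    if max_pct_missing is None or not rows:
        return list(names)
    if not 0 < max_pct_missing < 100:
        raise ValueError("MAXPCTMISSING must be a positive number less than 100")
    keep = []
    for name in names:
        pct = 100.0 * sum(1 for r in rows if r[name] is None) / len(rows)
        if pct <= max_pct_missing:  # exactly 50% is still allowed when the limit is 50
            keep.append(name)
    return keep
```

Variables dropped by the filter are neither imputed nor used as predictors, so the same list would feed both roles.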
Maximum Number of Draws
If minimum or maximum values are specified for imputed values of scale variables (see CONSTRAINTS subcommand), the procedure attempts to draw values for a case until it finds a set of values that are within the specified ranges. By default, a maximum of 50 sets of values are drawn per case. To override the maximum, specify a positive integer for MAXCASEDRAWS. If a set of values is not obtained within the specified number of draws per case, the procedure draws another set of model parameters and then repeats the case-drawing process. By default, a maximum of 2 sets of model parameters are drawn. To override the maximum, specify a positive integer for MAXPARAMDRAWS. An error occurs if a set of values within the ranges is not obtained within the specified number of case and parameter draws.
Note that increasing these values can increase the processing time. If the procedure is taking a long time (or is unable) to find suitable draws, check the minimum and maximum values specified on the CONSTRAINTS subcommand to ensure they are appropriate.
- MAXCASEDRAWS and MAXPARAMDRAWS are ignored when imputation is turned off, when predictive mean matching is used, and when categorical values are imputed.
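The nested retry logic described above can be sketched as two loops: an outer loop over parameter draws and an inner loop over case draws. The callables draw_params and draw_case are hypothetical stand-ins for the imputation model, and reading MAXPARAMDRAWS as the total number of parameter sets tried is an interpretation of the text, not a confirmed detail.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Bounds = Tuple[Optional[float], Optional[float]]  # (minimum, maximum); None = unbounded


def draw_within_bounds(draw_params: Callable[[], object],
                       draw_case: Callable[[object], List[float]],
                       bounds: Sequence[Bounds],
                       max_case_draws: int = 50,
                       max_param_draws: int = 2) -> List[float]:
    """Draw values for one case until every value lies within its range."""
    for _ in range(max_param_draws):
        params = draw_params()              # new set of model parameters
        for _ in range(max_case_draws):
            values = draw_case(params)      # new set of values for the case
            if all((lo is None or v >= lo) and (hi is None or v <= hi)
                   for v, (lo, hi) in zip(values, bounds)):
                return values
    raise RuntimeError("no draw within the specified ranges; "
                       "check the minimum and maximum constraints")
```

In the worst case the sketch performs max_case_draws times max_param_draws draws per case, which is why raising either limit can lengthen processing.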
MAXITER Keyword
Iteration stops when the maximum number of iterations is reached. To override the default number of iterations (10), specify a positive integer value for MAXITER.
- MAXITER is honored only if the FCS method is chosen explicitly. The keyword is ignored when the AUTO method is used.
- The MAXITER keyword is ignored if the monotone method is used or imputation is turned off.
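As a rough picture of what MAXITER bounds, the FCS method can be viewed as an outer loop over iterations and an inner loop over variables in list order. In the sketch below, impute_variable is a hypothetical callback standing in for the univariate imputation step; nothing here reflects the procedure's internals beyond that loop structure.

```python
from typing import Callable, Sequence


def run_fcs(variables: Sequence[str],
            impute_variable: Callable[[str, int], None],
            max_iter: int = 10) -> int:
    """Visit every variable, in list order, once per iteration and stop
    after max_iter iterations. Returns the number of iterations run."""
    if max_iter < 1:
        raise ValueError("MAXITER must be a positive integer")
    for iteration in range(1, max_iter + 1):
        for name in variables:
            impute_variable(name, iteration)
    return max_iter


# Record the order of visits for two variables and two iterations.
visits = []
run_fcs(["income", "age"], lambda name, it: visits.append((it, name)), max_iter=2)
print(visits)
```

The monotone method has no such outer loop because it is noniterative, which is consistent with MAXITER being ignored for it.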
SINGULAR Keyword
SINGULAR specifies the tolerance value used to test for singularity in univariate imputation models (linear regression, logistic regression, and predictive mean matching). The default value is 1E-12 (that is, 10^-12). Specify a positive value.
- SINGULAR is honored only if the FCS or MONOTONE method is chosen explicitly. The keyword is ignored and the default singularity tolerance is used when the AUTO method is requested.
- SINGULAR is ignored if imputation is turned off.
MAXMODELPARAM Keyword
MAXMODELPARAM specifies the maximum number of model parameters allowed when imputing any variable. If a model has more parameters than the specified limit, processing terminates with an error (no missing values are imputed). The default value is 100. Specify a positive integer value.
- MAXMODELPARAM is ignored if imputation is turned off.
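To get a feel for how quickly a model can approach the limit, the sketch below counts parameters under a simple textbook assumption: one parameter for the constant, one per scale predictor, and k-1 per categorical predictor with k categories, repeated for each of the J-1 equations when the imputed variable is categorical with J categories. How the procedure itself counts parameters is not documented in this section, so treat both functions as hypothetical.

```python
from typing import Optional, Sequence


def count_parameters(n_scale_predictors: int,
                     categorical_levels: Sequence[int],
                     target_categories: Optional[int] = None) -> int:
    """Main-effects parameter count under the assumptions stated above.
    target_categories is None for a scale (linear regression) target."""
    per_equation = 1 + n_scale_predictors + sum(k - 1 for k in categorical_levels)
    equations = 1 if target_categories is None else target_categories - 1
    return per_equation * equations


def check_model_size(n_params: int, max_model_param: int = 100) -> None:
    """Raise an error when the count exceeds the limit, as MAXMODELPARAM does."""
    if n_params > max_model_param:
        raise RuntimeError(
            f"model has {n_params} parameters; limit is {max_model_param}")


# Ten scale predictors and one 10-category predictor, imputing a 12-category variable.
print(count_parameters(10, [10], target_categories=12))
```

Under this counting, categorical targets and predictors with many categories are the usual reason a model exceeds the limit; adding two-way interactions would increase the count further.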