IMPUTE Subcommand (MULTIPLE IMPUTATION command)
The IMPUTE subcommand controls the imputation method and model. By default, the AUTO method is used to impute missing data values.
METHOD keyword
The METHOD keyword specifies the imputation method. Specify one of the following options:
- AUTO
- Automatic method. This is the default. The procedure scans the data and chooses the method: the monotone method is used when the data have a monotone pattern of missingness; otherwise, FCS is used. The AUTO method internally sorts the analysis variables from least missing to most missing to detect a monotone pattern of missingness if one exists; thus, the order of variables in the variable list is ignored when AUTO is used.
- MONOTONE
- Monotone method. This is a non-iterative method that can be used only when the data have a monotone pattern of missingness. A monotone pattern exists when you can order the variables such that, if a variable has a non-missing value, all preceding variables also have non-missing values. The monotone method imputes missing values sequentially in the order specified in the variable list. An error occurs if the input data do not have a monotone pattern for the order in which variables are specified.
- FCS
- Fully conditional specification method. This is an iterative Markov chain Monte Carlo (MCMC) method that can be used when the pattern of missing data is arbitrary (monotone or non-monotone). In each iteration, the FCS method sequentially imputes missing values in the order specified in the variable list.
- NONE
- None. No imputation is performed. Choose this option if you want analyses of missingness only. A warning is issued if you turn off imputation and suppress all analyses of missing values.
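The monotone check described for AUTO can be illustrated with a short Python sketch. This is not the procedure's actual implementation; the function names (missing_counts, is_monotone, choose_method) and the use of None for a missing value are assumptions made for illustration only.

```python
from typing import Optional, Sequence

Row = Sequence[Optional[float]]  # assumption: None marks a missing value


def missing_counts(rows: Sequence[Row], n_vars: int) -> list:
    """Count missing values per variable (column index)."""
    return [sum(1 for r in rows if r[j] is None) for j in range(n_vars)]


def is_monotone(rows: Sequence[Row], order: Sequence[int]) -> bool:
    """True if, for the given variable order, a non-missing value never
    follows a missing one within any case."""
    for r in rows:
        seen_missing = False
        for j in order:
            if r[j] is None:
                seen_missing = True
            elif seen_missing:
                return False
    return True


def choose_method(rows: Sequence[Row], n_vars: int) -> str:
    """Mimic the AUTO rule: sort variables from least to most missing,
    then pick MONOTONE if that order is monotone, otherwise FCS."""
    counts = missing_counts(rows, n_vars)
    order = sorted(range(n_vars), key=lambda j: counts[j])
    return "MONOTONE" if is_monotone(rows, order) else "FCS"
```

With rows such as (1.0, 2.0, 3.0), (1.0, 2.0, None), (1.0, None, None), the order 0, 1, 2 is monotone but the order 2, 1, 0 is not, which is why the explicit MONOTONE method depends on the variable list order while AUTO does not.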
NIMPUTATIONS keyword
By default, five imputations are performed. To request a different number of imputations, specify a positive integer for NIMPUTATIONS. As a rule of thumb, the higher the degree of missing information in the data, the more imputations are needed to obtain reasonable efficiency relative to an infinite number of imputations.
NIMPUTATIONS is ignored if imputation is turned off.
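The rule of thumb can be made concrete with Rubin's commonly cited approximation for relative efficiency, 1 / (1 + gamma / m), where gamma is the fraction of missing information and m is the number of imputations. The source text does not state this formula, so treat the sketch below as background rather than as what the procedure computes; the function name is hypothetical.

```python
def relative_efficiency(fraction_missing_info: float, n_imputations: int) -> float:
    """Rubin's approximation: efficiency of m imputations relative to
    an infinite number, on the variance scale."""
    if not 0.0 <= fraction_missing_info <= 1.0:
        raise ValueError("fraction_missing_info must be in [0, 1]")
    if n_imputations < 1:
        raise ValueError("n_imputations must be a positive integer")
    return 1.0 / (1.0 + fraction_missing_info / n_imputations)


if __name__ == "__main__":
    # Compare the default of 5 imputations with 20 at several missingness levels.
    for gamma in (0.1, 0.3, 0.5, 0.9):
        print(gamma, round(relative_efficiency(gamma, 5), 4),
              round(relative_efficiency(gamma, 20), 4))
```

Under this approximation, the gain from adding imputations is small when little information is missing and larger when much is missing.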
SCALEMODEL keyword
The AUTO, FCS, and MONOTONE methods are multivariate methods that can be used when several variables have missing values. Each uses univariate models when imputing values of individual variables.
By default, the type of univariate model that is used depends on the measurement level of the variable whose missing values are to be imputed. Multinomial logistic regression is always used for categorical variables. For scale variables, linear regression (LINEAR) is used by default.
Optionally, you can use the predictive mean matching (PMM) method for scale variables. PMM is a variant of linear regression that ensures that the imputed values are plausible. For PMM, the imputed value is taken from a complete case selected at random from the k cases whose predictions are closest to the prediction for the case being imputed, where k is a positive integer with a default value of 5.
SCALEMODEL is honored only when the FCS or MONOTONE method is chosen explicitly. Linear regression is always used for scale variables when the AUTO method is requested. A warning is issued if SCALEMODEL is specified for the AUTO method.
SCALEMODEL is ignored if imputation is turned off.
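As an illustration of the donor-selection step in predictive mean matching, the following Python sketch picks one of the k complete cases whose predicted values are closest to the prediction for the case being imputed and returns that case's observed value. It assumes the regression predictions are already available; pmm_impute and its parameters are hypothetical names, and the procedure's own algorithm may differ in detail.

```python
import random
from typing import Optional, Sequence


def pmm_impute(pred_missing: float,
               donor_preds: Sequence[float],
               donor_values: Sequence[float],
               k: int = 5,
               rng: Optional[random.Random] = None) -> float:
    """Return an observed value borrowed from one of the k donors whose
    predictions are closest to pred_missing."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not donor_preds or len(donor_preds) != len(donor_values):
        raise ValueError("donor_preds and donor_values must be non-empty and equal in length")
    rng = rng or random.Random()
    # Rank complete cases by distance between their prediction and the target prediction.
    ranked = sorted(range(len(donor_preds)),
                    key=lambda i: abs(donor_preds[i] - pred_missing))
    pool = ranked[:k]  # the k closest donors (fewer if there are not k donors)
    return donor_values[rng.choice(pool)]
```

Because the result is always a value that was actually observed, it cannot fall outside the observed range, which is the sense in which PMM keeps imputed values plausible.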
INTERACTIONS keyword
By default, the imputation model for each variable includes a constant term and main effects for predictor variables. You can optionally include all possible two-way interactions among categorical predictor variables. To do so, specify INTERACTIONS=TWOWAY. Interaction terms do not include scale predictors.
INTERACTIONS is honored only when the FCS or MONOTONE method is chosen explicitly. Two-way interactions are not included when the AUTO method is requested. A warning is issued if INTERACTIONS is specified for the AUTO method.
INTERACTIONS is ignored if imputation is turned off or if there are fewer than two categorical predictors.
MAXPCTMISSING keyword
By default, analysis variables are imputed and used as predictors without regard to how many missing values they have, provided they have at least sufficient data to estimate an imputation model. The optional MAXPCTMISSING keyword is used to exclude variables that have a high percentage of missing values. Specify the maximum allowable percentage of missing values as a positive number less than 100. For example, if you specify MAXPCTMISSING=50, analysis variables that have more than 50% missing values are not imputed, nor are they used as predictors in imputation models.
MAXPCTMISSING is ignored if imputation is turned off.
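The exclusion rule can be sketched in Python as follows. The function name, the dictionary layout, and the use of None for missing values are assumptions for illustration; only the rule itself (exclude variables with more than the stated percentage missing) comes from the text above.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def split_by_missing_pct(columns: Dict[str, Sequence[Optional[float]]],
                         max_pct_missing: float) -> Tuple[List[str], List[str]]:
    """Return (kept, excluded) variable names. A variable is excluded when
    its percentage of missing values is strictly greater than the limit."""
    if not 0 < max_pct_missing < 100:
        raise ValueError("max_pct_missing must be positive and less than 100")
    kept, excluded = [], []
    for name, values in columns.items():
        if len(values) == 0:
            raise ValueError("variable %r has no cases" % name)
        pct = 100.0 * sum(v is None for v in values) / len(values)
        (excluded if pct > max_pct_missing else kept).append(name)
    return kept, excluded
```

Note that a variable with exactly 50% missing values is kept under a limit of 50, because the rule excludes only variables with more than the specified percentage.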
Maximum Number of Draws
If minimum or maximum values are specified for imputed values of scale variables (see CONSTRAINTS subcommand), the procedure attempts to draw values for a case until it finds a set of values that are within the specified ranges. By default, a maximum of 50 sets of values are drawn per case. To override the maximum, specify a positive integer for MAXCASEDRAWS. If a set of values is not obtained within the specified number of draws per case, the procedure draws another set of model parameters and then repeats the case-drawing process. By default, a maximum of 2 sets of model parameters are drawn. To override the maximum, specify a positive integer for MAXPARAMDRAWS. An error occurs if a set of values within the ranges is not obtained within the specified number of case and parameter draws.
Note that increasing these values can increase the processing time. If the procedure is taking a long time (or is unable) to find suitable draws, check the minimum and maximum values specified on the CONSTRAINTS subcommand to ensure they are appropriate.
MAXCASEDRAWS and MAXPARAMDRAWS are ignored if imputation is turned off, when predictive mean matching is used, and when imputing categorical values.
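The nested draw limits can be pictured as two loops: an outer loop over model parameter draws and an inner loop over case draws. The Python sketch below assumes that the first parameter set counts toward the parameter limit, which the text does not state explicitly; draw_params, draw_value, and draw_within_range are hypothetical names, and a single value stands in for the "set of values" drawn per case.

```python
from typing import Any, Callable


def draw_within_range(draw_params: Callable[[], Any],
                      draw_value: Callable[[Any], float],
                      lo: float,
                      hi: float,
                      max_case_draws: int = 50,
                      max_param_draws: int = 2) -> float:
    """Redraw a value until it lies in [lo, hi]. After max_case_draws
    failures, draw new model parameters; give up with an error after
    max_param_draws parameter sets (assumption: the first set counts)."""
    for _ in range(max_param_draws):
        params = draw_params()            # new set of model parameters
        for _ in range(max_case_draws):
            value = draw_value(params)    # candidate imputed value for the case
            if lo <= value <= hi:
                return value
    raise RuntimeError("no value within the specified range was obtained "
                       "within the allowed case and parameter draws")
```

With the defaults, at most 2 x 50 = 100 candidate values are drawn for a case before the error, which shows why raising either limit can lengthen processing when the constraint range is hard to satisfy.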
MAXITER keyword
Iteration stops when the maximum number of iterations is reached. To override the default number of iterations (10), specify a positive integer value for MAXITER.
MAXITER is honored only if the FCS method is chosen explicitly. The keyword is ignored when the AUTO method is used.
The MAXITER keyword is ignored if the monotone method is used or imputation is turned off.
SINGULAR keyword
SINGULAR specifies the tolerance value used to test for singularity in univariate imputation models (linear regression, logistic regression, and predictive mean matching). The default value is 10e−12. Specify a positive value.
SINGULAR is honored only if the FCS or MONOTONE method is chosen explicitly. The keyword is ignored and the default singularity tolerance is used when the AUTO method is requested.
SINGULAR is ignored if imputation is turned off.
MAXMODELPARAM keyword
MAXMODELPARAM specifies the maximum number of model parameters allowed when imputing any variable. If a model has more parameters than the specified limit, processing terminates with an error (no missing values are imputed). The default value is 100. Specify a positive integer value.
MAXMODELPARAM is ignored if imputation is turned off.