IMPUTE Subcommand (MULTIPLE IMPUTATION command)
The IMPUTE subcommand controls
the imputation method and model. By default, the AUTO method is used to impute missing data values.
METHOD keyword
The METHOD keyword specifies the imputation method. Specify
one of the following options:
- AUTO
- Automatic method. This is the default. Chooses the best method based on a scan of the data. The monotone method is used when the data have a monotone pattern of missingness; otherwise, if the data have a non-monotone pattern, FCS is used. The AUTO method internally sorts the analysis variables in order from the least missing to most missing to detect a monotone pattern of missingness if it exists; thus, the order of variables in the variable list is ignored when AUTO is used.
- MONOTONE
- Monotone method. This is a non-iterative method that can be used only when the data have a monotone pattern of missingness. A monotone pattern exists when you can order the variables such that, if a variable has a non-missing value, all preceding variables also have non-missing values. The monotone method imputes missing values sequentially in the order specified in the variable list. An error occurs if the input data do not have a monotone pattern for the order in which variables are specified.
- FCS
- Fully conditional specification method. This is an iterative Markov chain Monte Carlo (MCMC) method that can be used when the pattern of missing data is arbitrary (monotone or non-monotone). In each iteration, the FCS method sequentially imputes missing values in the order specified in the variable list.
- NONE
- None. No imputation is performed. Choose this option if you want analyses of missingness only. A warning is issued if you turn off imputation and suppress all analyses of missing values.
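The monotone-pattern definition and the AUTO scan described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the procedure's actual implementation; the function names `is_monotone` and `auto_choose_method`, the use of `None` for a missing value, and the sample data are all assumptions made for the example.

```python
def is_monotone(rows, order):
    """Return True if missingness is monotone for the given variable order.

    rows is a list of dicts mapping variable name -> value (None = missing).
    Monotone means: if a variable is non-missing in a case, every variable
    that precedes it in `order` is also non-missing in that case.
    """
    for row in rows:
        seen_missing = False
        for var in order:
            if row[var] is None:
                seen_missing = True
            elif seen_missing:
                # A non-missing value follows a missing one: not monotone.
                return False
    return True


def auto_choose_method(rows, variables):
    """Hypothetical AUTO-style scan: sort least to most missing, then test."""
    n_missing = {v: sum(row[v] is None for row in rows) for v in variables}
    order = sorted(variables, key=lambda v: n_missing[v])
    method = "MONOTONE" if is_monotone(rows, order) else "FCS"
    return method, order


# Illustrative data only: 'debt' is missing most often, 'age' never.
data = [
    {"age": 34, "income": 51000, "debt": 1200},
    {"age": 51, "income": 62000, "debt": None},
    {"age": 29, "income": None, "debt": None},
]
method, order = auto_choose_method(data, ["debt", "income", "age"])
```

In this sketch the listed order `debt income age` is not monotone, which is the situation in which an explicit MONOTONE request would fail, while the sorted order `age income debt` is monotone.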
NIMPUTATIONS keyword
By default, five imputations are performed. To request a different number of imputations, specify a positive integer. As a rule of thumb, the higher the degree of missing information in the data, the more imputations are needed to obtain reasonable efficiency relative to an infinite number of imputations.
NIMPUTATIONS is ignored if imputation is turned off.
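One way to put a number on this rule of thumb is the commonly cited approximation attributed to Rubin, in which the efficiency of m imputations relative to an infinite number is 1 / (1 + γ/m), where γ is the fraction of missing information. The formula is not stated in this section, so treat the sketch below as background rather than as what the procedure computes; the function name is hypothetical.

```python
def relative_efficiency(fraction_missing_info, n_imputations):
    """Approximate efficiency of m imputations relative to infinitely many.

    Uses the approximation 1 / (1 + gamma / m); this is an assumption of
    the example, not something the procedure reports.
    """
    if not 0.0 <= fraction_missing_info <= 1.0:
        raise ValueError("fraction_missing_info must be between 0 and 1")
    if n_imputations < 1:
        raise ValueError("n_imputations must be a positive integer")
    return 1.0 / (1.0 + fraction_missing_info / n_imputations)


# Compare a few candidate NIMPUTATIONS values for 50% missing information.
for m in (3, 5, 10, 20):
    print(m, round(relative_efficiency(0.5, m), 3))
```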
SCALEMODEL keyword
The AUTO, FCS, and MONOTONE
methods are multivariate methods that can be used when several variables have missing values. Each
uses univariate models when imputing values of individual variables.
By default, the type of univariate model that is used depends on the
measurement level of the variable whose missing values are to be imputed. Multinomial logistic
regression is always used for categorical variables. For scale variables, linear regression
(LINEAR) is used by default.
Optionally, you can use the predictive mean matching (PMM)
method for scale variables. PMM is a variant of linear regression that ensures that
the imputed values are plausible. For PMM, the imputed value is taken from a complete case
that is selected at random from the k complete cases whose predictions are closest to the
prediction for the missing value, where k is a positive integer with a default value of 5.
SCALEMODEL is honored only when the FCS or MONOTONE method is chosen explicitly. Linear regression is always used for scale variables when the AUTO method is requested. A warning is issued if SCALEMODEL is specified for the AUTO method.
SCALEMODEL is ignored if imputation is turned off.
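The donor-selection step of predictive mean matching can be illustrated as follows. This is a minimal sketch that assumes the regression predictions are already available; it does not fit the linear model or draw model parameters, and the function name `pmm_impute` and the donor values are invented for the example.

```python
import random


def pmm_impute(pred_missing, donors, k=5, rng=None):
    """Pick an observed value from one of the k closest complete cases.

    donors is a list of (predicted, observed) pairs for complete cases;
    pred_missing is the prediction for the case being imputed.
    """
    if not donors:
        raise ValueError("at least one complete case is required")
    rng = rng or random.Random()
    # Keep the k donors whose predictions are closest to pred_missing.
    nearest = sorted(donors, key=lambda d: abs(d[0] - pred_missing))[:k]
    # Choose one of them at random and reuse its observed value.
    _, observed = rng.choice(nearest)
    return observed


# Illustrative (predicted, observed) pairs for seven complete cases.
donors = [(10.2, 11.0), (12.9, 12.0), (15.1, 16.0), (19.8, 21.0),
          (25.3, 24.0), (30.0, 33.0), (41.0, 40.0)]
imputed = pmm_impute(16.0, donors, k=3, rng=random.Random(1))
```

Because the imputed value is always an observed value from a complete case, it cannot fall outside the range of values that actually occur in the data, which is the sense in which PMM keeps imputations plausible.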
INTERACTIONS keyword
By default, the imputation model for each variable includes a constant term
and main effects for predictor variables. You can optionally include all possible two-way
interactions among categorical predictor variables. Specify INTERACTIONS=TWOWAY.
Interaction terms do not include scale predictors.
INTERACTIONS is honored only when the FCS or MONOTONE method is chosen explicitly. Two-way interactions are not included when the AUTO method is requested. A warning is issued if INTERACTIONS is specified for the AUTO method.
INTERACTIONS is ignored if imputation is turned off or if there are fewer than two categorical predictors.
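To see which terms INTERACTIONS=TWOWAY would add for a given set of predictors, the rule above (all pairs of categorical predictors, no scale predictors) can be sketched as below. The function name, the measurement-level labels, and the variable names are assumptions made for the example.

```python
from itertools import combinations


def twoway_interactions(predictors):
    """List the two-way interaction terms among categorical predictors.

    predictors maps variable name -> 'categorical' or 'scale'.
    Scale predictors never appear in an interaction term.
    """
    categorical = [name for name, level in predictors.items()
                   if level == "categorical"]
    # With fewer than two categorical predictors this list is empty.
    return list(combinations(categorical, 2))


terms = twoway_interactions({
    "region": "categorical",
    "gender": "categorical",
    "income": "scale",
    "educ": "categorical",
})
```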
MAXPCTMISSING keyword
By default, analysis variables are imputed and used as predictors without
regard to how many missing values they have, provided that they have sufficient data to estimate
an imputation model. The optional MAXPCTMISSING keyword excludes
variables that have a high percentage of missing values. Specify the maximum allowable percentage of
missing values as a positive number less than 100. For example, if you specify
MAXPCTMISSING=50, analysis variables that have more than 50% missing values are not
imputed, nor are they used as predictors in imputation models.
MAXPCTMISSING is ignored if imputation is turned off.
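The exclusion rule can be sketched as a simple filter. The example assumes that a variable sitting exactly at the limit is kept, following the "more than 50%" wording above; it does not model the separate check for sufficient data, and the function name and sample data are hypothetical.

```python
def eligible_variables(rows, variables, max_pct_missing=None):
    """Return the variables that stay in the imputation under the limit.

    rows is a list of dicts mapping variable name -> value (None = missing).
    max_pct_missing=None stands for "no limit" in this sketch.
    """
    if max_pct_missing is not None and not 0 < max_pct_missing < 100:
        raise ValueError("limit must be a positive number less than 100")
    kept = []
    for var in variables:
        pct = 100.0 * sum(row[var] is None for row in rows) / len(rows)
        # Only variables with MORE than the limit are excluded.
        if max_pct_missing is None or pct <= max_pct_missing:
            kept.append(var)
    return kept


# Illustrative data: a is complete, b is 50% missing, c is 75% missing.
sample = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 1, "b": 2, "c": None},
    {"a": 1, "b": None, "c": None},
    {"a": 1, "b": None, "c": None},
]
kept = eligible_variables(sample, ["a", "b", "c"], max_pct_missing=50)
```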
Maximum Number of Draws
If minimum or maximum values are specified for imputed values of scale
variables (see CONSTRAINTS subcommand), the procedure attempts to draw values for a
case until it finds a set of values that are within the specified ranges. By default, a maximum of
50 sets of values are drawn per case. To override the maximum, specify a positive integer for
MAXCASEDRAWS. If a set of values is not obtained within the specified number of
draws per case, the procedure draws another set of model parameters and then repeats the case-drawing
process. By default, a maximum of 2 sets of model parameters are drawn. To override the maximum,
specify a positive integer for MAXPARAMDRAWS. An error occurs if a set of values
within the ranges is not obtained within the specified number of case and parameter draws.
Note that increasing these values can increase the processing time. If the
procedure is taking a long time (or is unable) to find suitable draws, check the minimum and maximum
values specified on the CONSTRAINTS subcommand to ensure they are appropriate.
MAXCASEDRAWS and MAXPARAMDRAWS are ignored if imputation is turned off, when predictive mean matching is used, and when imputing categorical values.
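The interplay between the two limits can be pictured as a nested retry loop. The sketch below simplifies a "set of values" to a single value and assumes that MAXPARAMDRAWS counts the first parameter draw as well as any redraws; both are assumptions, as are the function name and the `draw_params` and `draw_value` callables, which stand in for the procedure's internal sampling.

```python
import random


def constrained_draw(draw_params, draw_value, lo, hi,
                     max_case_draws=50, max_param_draws=2):
    """Draw a value within [lo, hi], redrawing parameters when needed."""
    for _ in range(max_param_draws):
        params = draw_params()          # new set of model parameters
        for _ in range(max_case_draws):
            value = draw_value(params)  # candidate imputed value
            if lo <= value <= hi:
                return value
    # Neither limit produced an in-range value: report an error.
    raise RuntimeError("no value within the specified range was obtained")


# Hypothetical model: normal draws around a mean that is itself uncertain.
rng = random.Random(0)
value = constrained_draw(
    draw_params=lambda: (rng.gauss(50.0, 1.0), 10.0),
    draw_value=lambda p: rng.gauss(p[0], p[1]),
    lo=0.0,
    hi=100.0,
)
```

With the default limits, this loop makes at most 50 × 2 = 100 attempts per case, which is why raising either limit can lengthen processing when the constraints are hard to satisfy.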
MAXITER keyword
Iteration stops when the maximum number of iterations is reached. To override
the default number of iterations (10), specify a positive integer value for
MAXITER.
MAXITER is honored only if the FCS method is chosen explicitly. The keyword is ignored when the AUTO method is used.
The MAXITER keyword is ignored if the monotone method is used or imputation is turned off.
SINGULAR keyword
SINGULAR specifies the tolerance value used to test for
singularity in univariate imputation models (linear regression, logistic regression, and predictive
mean matching). The default value is 10e−12. Specify a positive value.
SINGULAR is honored only if the FCS or MONOTONE method is chosen explicitly. The keyword is ignored and the default singularity tolerance is used when the AUTO method is requested.
SINGULAR is ignored if imputation is turned off.
MAXMODELPARAM keyword
MAXMODELPARAM specifies the maximum number of model
parameters allowed when imputing any variable. If a model has more parameters than the specified
limit, processing terminates with an error (no missing values are imputed). The default value is
100. Specify a positive integer value.
MAXMODELPARAM is ignored if imputation is turned off.