Custom Imputation Model

To keep imputed values within a reasonable range for each variable, we'll specify a custom imputation model with constraints on the variables. In addition, Household income in thousands is highly right-skewed, and later analysis will likely use the logarithm of income, so it seems sensible to impute the log-income directly.
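
To illustrate why a log transformation helps with a right-skewed variable, here is a minimal Python sketch. It is not part of the SPSS procedure, and the sample is made up (log-normal draws standing in for household income in thousands); the names skewness, income, and lninc are chosen for illustration only.

```python
import math
import random


def skewness(values):
    """Skewness: mean cubed deviation divided by the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up, right-skewed stand-in for household income in thousands.
# The distribution parameters are assumptions, not values from the dataset.
rng = random.Random(0)
income = [rng.lognormvariate(3.5, 0.8) for _ in range(5000)]

# Same transformation as the numeric expression ln(income) used in the steps.
lninc = [math.log(v) for v in income]

raw_skew = skewness(income)
log_skew = skewness(lninc)
print(f"skewness of income:     {raw_skew:.2f}")
print(f"skewness of ln(income): {log_skew:.2f}")
```

In this sketch the raw values have a long right tail, while their logarithms are roughly symmetric, which is the property that makes the log scale a more comfortable target for a linear imputation model.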

  1. Make sure the original dataset is active.
  2. To create a log-income variable, from the menus choose:

    Transform > Compute Variable...

    Figure 1. Compute Variable dialog
  3. Type lninc as the target variable.
  4. Type ln(income) as the numeric expression.
  5. Click Type & Label...
    Figure 2. Type and Label dialog
  6. Type Log of income as the label.
  7. Click Continue.
  8. Click OK in the Compute Variable dialog.
    Figure 3. Variables tab with Log of income replacing Household income in thousands in the imputation model
  9. Recall the Impute Missing Data Values dialog and click the Variables tab.
  10. Deselect Household income in thousands [income] and select Log of income [lninc] as variables in the model.
  11. Click the Method tab.
    Figure 4. Alert for replacing existing dataset
  12. Click Yes in the alert that appears.
    Figure 5. Method tab
  13. Select Custom and leave Fully conditional specification selected as the imputation method.
  14. Click the Constraints tab.
    Figure 6. Constraints tab
  15. Click Scan Data.
  16. In the Define Constraints grid, type 1 as the minimum value for Months with service [tenure].
  17. Type 18 as the minimum value for Age in years [age].
  18. Type 0 as the minimum value for Years at current address [address].
  19. Type 0 as the minimum value for Years with current employer [employ].
  20. Type 1 as the minimum value and 1 as the level of rounding for Number of people in household [reside]. Although many of the other scale variables are reported as integers, it is sensible to say that someone has lived at their current address for 13.8 years, but not that 2.2 people live there.
  21. Type 0 as the minimum value for Log of income [lninc]. Because ln(1) = 0, this corresponds to a minimum income of 1 (in thousands).
  22. Click the Output tab.
    Figure 7. Output tab
  23. Select Create iteration history and type telcoFCS as the name of the new dataset.
  24. Click OK.
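
The constraints entered in the steps above can be sketched in Python as follows. This is an illustration of what a minimum value and a level of rounding mean for an imputed value, not a description of the procedure's internal algorithm. The function constrained_draw, its max_draws limit, and the normal draws in the usage lines are hypothetical; only the minimums and the rounding typed in the steps are taken from the text, and anything Scan Data fills in is omitted.

```python
import random

# Constraints typed in the steps above: variable -> (minimum, rounding or None).
CONSTRAINTS = {
    "tenure": (1, None),
    "age": (18, None),
    "address": (0, None),
    "employ": (0, None),
    "reside": (1, 1),      # whole numbers of people only
    "lninc": (0, None),    # ln(income) >= 0, i.e. income >= 1 (thousand)
}


def constrained_draw(draw, minimum, rounding=None, max_draws=50):
    """Call draw() until the (optionally rounded) value is >= minimum.

    Hypothetical helper: rejects out-of-range values and redraws,
    raising an error if max_draws attempts all fail.
    """
    for _ in range(max_draws):
        value = draw()
        if rounding is not None:
            # Round to the nearest multiple of the rounding increment.
            value = round(value / rounding) * rounding
        if value >= minimum:
            return value
    raise RuntimeError("no draw satisfied the constraint")


# Usage with made-up draws standing in for an imputation model.
rng = random.Random(1)
reside_value = constrained_draw(lambda: rng.gauss(2.0, 1.5), *CONSTRAINTS["reside"])
lninc_value = constrained_draw(lambda: rng.gauss(0.5, 1.0), *CONSTRAINTS["lninc"])
print(reside_value, lninc_value)
```

Rounding is applied only to reside, so a value such as 2.2 people never appears, while address and employ may still take fractional values such as 13.8 years.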
