Running a linear regression on factor component scores

Figure 1. Factor Analysis dialog box
Factor Analysis dialog with z-score variables selected

Using the Factor Analysis procedure, we can create a set of independent variables that are uncorrelated and that fit the dependent variable just as well as the original independent variables do.

  1. To run a Factor Analysis on the standardized variables, from the menus choose:

    Analyze > Dimension Reduction > Factor...

  2. Select Zscore: Vehicle type through Zscore: Fuel efficiency as analysis variables.
  3. Click Extraction.
    Figure 2. Extraction dialog box
    Extraction dialog box
  4. In the Extract group, select Fixed number of factors and type 10 as the number of factors to extract.
  5. Click Continue, then click Rotation in the Factor Analysis dialog box.
    Figure 3. Rotation dialog box
    Rotation dialog box
  6. In the Method group, select Varimax.
  7. Click Continue, then click Scores in the Factor Analysis dialog box.
    Figure 4. Factor Scores dialog box
    Factor Scores dialog box
  8. Select Save as variables.
  9. Click Continue, then click OK in the Factor Analysis dialog box.
    Figure 5. Linear Regression dialog box
    Linear Regression dialog with factor score variables selected as independent variables
  10. To run a Linear Regression on the factor scores, recall the Linear Regression dialog box.
  11. Deselect Zscore: Vehicle type through Zscore: Fuel efficiency as independent variables.
  12. Select REGR factor score 1 for analysis 1 [FAC1_1] through REGR factor score 10 for analysis 1 [FAC10_1] as independent variables.
  13. Click OK.
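For readers who want to see what the saved scores represent outside the dialog boxes, the following is a minimal NumPy sketch, not the SPSS implementation. It assumes principal components extraction (the steps above do not state the extraction method), uses synthetic data in place of the z-score variables, and substitutes a random orthogonal rotation for Varimax, since any orthogonal rotation preserves the property being illustrated. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 cases, 10 correlated predictors
# (stand-ins for the z-score variables; not the tutorial's data set).
n, p = 200, 10
latent = rng.normal(size=(n, 3))
X = latent @ rng.normal(size=(3, p)) + 0.5 * rng.normal(size=(n, p))

# Standardize each column to mean 0 and sample standard deviation 1.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Principal components of the correlation matrix; keep all p components.
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Standardized component scores: project, then scale by 1/sqrt(eigenvalue).
scores = Z @ eigvecs / np.sqrt(eigvals)

# Assumption: a random orthogonal matrix stands in for the Varimax rotation.
Q, _ = np.linalg.qr(rng.normal(size=(p, p)))
rotated = scores @ Q

# The rotated scores should still be mutually uncorrelated.
score_corr = np.corrcoef(rotated, rowvar=False)
max_offdiag = np.abs(score_corr - np.eye(p)).max()
print(f"largest off-diagonal correlation: {max_offdiag:.2e}")
```

Because all 10 components are kept, the scores span the same space as the original predictors, which is why the regression that follows can reproduce the original fit.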
Figure 6. ANOVA table
ANOVA table that shows sum of squares, degrees of freedom, mean square, F, and significance.

As expected, the model fit is the same for the model built using the factor scores as for the model using the original predictors.
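The equality of fit can be checked numerically. The sketch below is an illustration under stated assumptions rather than a reproduction of the SPSS output: it uses synthetic data, principal components extraction with all components retained, and a hypothetical helper `r_squared`. Rotation is omitted because an orthogonal rotation does not change the space spanned by the scores.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

rng = np.random.default_rng(1)

# Hypothetical correlated predictors and response (not the tutorial's data).
n, p = 150, 10
X = rng.normal(size=(n, 3)) @ rng.normal(size=(3, p)) + 0.5 * rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + rng.normal(size=n)

# Standardize, then compute scores on all p principal components.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
scores = Z @ eigvecs / np.sqrt(eigvals)

r2_original = r_squared(Z, y)
r2_scores = r_squared(scores, y)
print(f"R^2 with original predictors: {r2_original:.6f}")
print(f"R^2 with component scores:    {r2_scores:.6f}")
```

The two values agree up to floating-point error because the scores are an invertible linear transformation of the standardized predictors. Extracting fewer components than variables would generally lower the fit.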

Figure 7. Coefficients table
Coefficients table showing unstandardized and standardized coefficients (B and Beta), t, significance, zero-order, part, and partial correlations, tolerance, and VIF

Also as expected, the collinearity statistics show that the factor scores are uncorrelated. Because the variability of the coefficient estimates is not artificially inflated by collinearity, the coefficient estimates are larger, relative to their standard errors, in this model than in the original model. As a result, more of the factors are identified as statistically significant, which can affect your final results if you want to build a model that includes only significant effects.
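The collinearity statistics can be illustrated with a short sketch. It relies on the standard identity that the variance inflation factor of each predictor equals the corresponding diagonal element of the inverse correlation matrix, with tolerance as its reciprocal. The data are synthetic, the helper `vif` is hypothetical, and principal components extraction is assumed.

```python
import numpy as np

def vif(X):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    return np.diag(np.linalg.inv(np.corrcoef(X, rowvar=False)))

rng = np.random.default_rng(2)

# Hypothetical collinear predictors driven by 3 shared latent variables.
n, p = 150, 10
X = rng.normal(size=(n, 3)) @ rng.normal(size=(3, p)) + 0.5 * rng.normal(size=(n, p))

# Standardize, then compute scores on all p principal components.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
scores = Z @ eigvecs / np.sqrt(eigvals)

vif_original = vif(Z)        # inflated by collinearity
vif_scores = vif(scores)     # approximately 1 for every score

tolerance_scores = 1.0 / vif_scores
print("largest VIF, original predictors:", round(float(vif_original.max()), 2))
print("largest VIF, component scores:   ", round(float(vif_scores.max()), 2))
```

A VIF of 1 means the standard error of that coefficient is not inflated at all, which is consistent with the larger ratios of coefficient to standard error described above.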
