Effects of Transformations

Transforming the variables can turn a nonlinear relationship between the original response and the original predictors into a linear relationship between the transformed variables. However, when there are multiple predictors, pairwise relationships are confounded by the other variables in the model.

To focus your analysis on the relationship between Daily ozone level and Day of the year, begin by looking at a scatterplot. From the menus choose:

Graphs > Chart Builder...

Figure 1. Chart Builder dialog
  1. In the Chart Builder, select the Scatter/Dot gallery and choose Simple Scatter.
  2. Select Daily ozone level as the y-axis variable and Day of the year as the x-axis variable.
  3. Click OK.
    Figure 2. Scatterplot of Daily ozone level and Day of the year
    Scatterplot with Daily ozone level on the vertical axis and Day of year on the horizontal axis

    This figure illustrates the relationship between Daily ozone level and Day of the year. As Day of the year increases to approximately 200, Daily ozone level increases. However, for Day of the year values greater than 200, Daily ozone level decreases. This inverted U pattern suggests a quadratic relationship between the two variables. A linear regression cannot capture this relationship.

  4. To see a best-fit line overlaid on the points in the scatterplot, activate the graph by double-clicking on it.
  5. Select a point in the Chart Editor.
  6. Click the Add Fit Line at Total tool, and close the Chart Editor.
    Figure 3. Scatterplot showing best-fit line

    A linear regression of Daily ozone level on Day of the year yields an R² of 0.004. This fit suggests that Day of the year has no predictive value for Daily ozone level. This is not surprising, given the pattern in the figure. By using optimal scaling, however, you can linearize the quadratic relationship and use the transformed Day of the year to predict the response.
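The near-zero fit is a general property of symmetric inverted-U data, not something specific to this file. The following minimal sketch illustrates it in Python with NumPy. The data are synthetic stand-ins (an assumption, not the ozone data used in this example), and the name `linear_r2` is hypothetical.

```python
import numpy as np

# Synthetic stand-in for the ozone data (an assumption, not the real file):
# an inverted U that peaks near the middle of the year, plus random noise.
rng = np.random.default_rng(0)
day = np.arange(1, 366)
ozone = 30.0 - 0.0007 * (day - 183.0) ** 2 + rng.normal(0.0, 4.0, day.size)


def linear_r2(x, y):
    """Return R-squared of the least-squares straight line y = a + b*x."""
    slope, intercept = np.polyfit(x, y, 1)  # highest-degree coefficient first
    residuals = y - (intercept + slope * x)
    return 1.0 - residuals.var() / y.var()


r2_linear = linear_r2(day, ozone)
print(f"R-squared of the straight-line fit: {r2_linear:.3f}")
```

Because the rise before the peak and the fall after it cancel out, the straight line is almost flat and explains almost none of the variation, even though the response clearly depends on the day.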

    Figure 4. Categorical Regression dialog

    To obtain a categorical regression of Daily ozone level on Day of the year, recall the Categorical Regression dialog.

  7. In the Categorical Regression dialog, deselect Inversion base height through Temperature (degrees F) as independent variables.
  8. Select Day of the year as an independent variable.
  9. Click Define Scale.
    Figure 5. Define Scale dialog
    The Categorical Regression Define Scale dialog.
  10. In the Define Scale dialog, select Nominal as the optimal scaling level.
  11. Click Continue.
  12. Click Discretize in the Categorical Regression dialog.
    Figure 6. Discretization dialog
    The Categorical Regression Discretization dialog.
  13. In the Discretization dialog, select doy (Day of the year).
  14. Select Equal intervals.
  15. Type 10 as the interval length.
  16. Click Change.
  17. Click Continue.
  18. Click Plots in the Categorical Regression dialog.
    Figure 7. Plots dialog
    The Categorical Regression Plots dialog.
  19. In the Plots dialog, select doy for transformation plots.
  20. Click Continue.
  21. Click OK in the Categorical Regression dialog.
    Figure 8. Model summary for categorical regression of Daily ozone level on Day of the year
    Table showing multiple R, R-square, adjusted R-square, apparent prediction error

    The optimal scaling regression treats Daily ozone level as numerical and Day of the year as nominal. This results in an R² of 0.549. Although only 55% of the variation in Daily ozone level is accounted for by the categorical regression, this is a substantial improvement over the original regression. Transforming Day of the year allows for the prediction of Daily ozone level.

    Figure 9. Transformation plot of Day of the year (nominal)

    This figure displays the transformation plot of Day of the year. The extremes of Day of the year both receive negative quantifications, whereas the central values have positive quantifications. By applying this transformation, the low and high Day of the year values have similar effects on predicted Daily ozone level.
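The sign pattern of the quantifications can be reproduced outside the procedure with a simplified sketch. It assumes that, for a single nominal predictor and a numerical response, the least-squares quantification of each category is proportional to the mean response in that category. This is a stand-in for illustration, not the alternating least squares algorithm that Categorical Regression runs, and the data are synthetic (an assumption), so the R² will not match the 0.549 reported above.

```python
import numpy as np

# Synthetic inverted-U data (an assumption, not the real ozone file).
rng = np.random.default_rng(0)
day = np.arange(1, 366)
ozone = 30.0 - 0.0007 * (day - 183.0) ** 2 + rng.normal(0.0, 4.0, day.size)

# Discretize into equal intervals of length 10: days 1-10 -> 0, 11-20 -> 1, ...
category = (day - 1) // 10

# Treat the response as numerical: standardize it.
z = (ozone - ozone.mean()) / ozone.std()

# Quantify each category by the mean standardized response inside it.
counts = np.bincount(category)
cat_means = np.bincount(category, weights=z) / counts

# Standardize the quantified predictor over all cases.
per_case = cat_means[category]
quant_by_cat = (cat_means - per_case.mean()) / per_case.std()
quantified = quant_by_cat[category]

r2_nominal = np.corrcoef(quantified, z)[0, 1] ** 2
print(f"first interval:  {quant_by_cat[0]:+.2f}")
print(f"middle interval: {quant_by_cat[len(quant_by_cat) // 2]:+.2f}")
print(f"last interval:   {quant_by_cat[-1]:+.2f}")
print(f"R-squared with the quantified predictor: {r2_nominal:.3f}")
```

In this sketch the first and last intervals receive negative quantifications and the middle interval a positive one, which mirrors the shape of the transformation plot.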

    Figure 10. Chart Builder

    To see a scatterplot of the transformed variables, recall the Chart Builder, and click Reset to clear your previous selections.

  22. In the Chart Builder, select the Scatter/Dot gallery and choose Simple Scatter.
  23. Select Daily ozone level Quantification [TRA1_3] as the y-axis variable and Day of the year Quantification [TRA2_3] as the x-axis variable.
  24. Click OK.
Figure 11. Scatterplot of the transformed variables

This figure depicts the relationship between the transformed variables. An increasing trend replaces the inverted U. The regression line has a positive slope, indicating that as transformed Day of the year increases, predicted Daily ozone level increases. Optimal scaling linearizes the relationship and reveals a dependence that a linear regression on the original variables would miss.
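The change in slope can be checked numerically with a short sketch. It uses synthetic inverted-U data and category means as the nominal quantification, both of which are assumptions made for illustration rather than output of the Categorical Regression procedure. The helper `standardize` is a hypothetical name.

```python
import numpy as np

# Synthetic inverted-U data (an assumption, not the real ozone file).
rng = np.random.default_rng(0)
day = np.arange(1, 366)
ozone = 30.0 - 0.0007 * (day - 183.0) ** 2 + rng.normal(0.0, 4.0, day.size)


def standardize(v):
    """Center to mean 0 and scale to standard deviation 1."""
    return (v - v.mean()) / v.std()


# Transformed response: numerical scaling, taken here as standardization.
y_t = standardize(ozone)

# Transformed predictor: intervals of length 10, each quantified by the
# mean transformed response of its cases, then standardized.
category = (day - 1) // 10
cat_means = np.bincount(category, weights=y_t) / np.bincount(category)
x_t = standardize(cat_means[category])

# Slopes of the least-squares lines before and after the transformation.
slope_raw = np.polyfit(standardize(day.astype(float)), y_t, 1)[0]
slope_transformed = np.polyfit(x_t, y_t, 1)[0]
corr_transformed = np.corrcoef(x_t, y_t)[0, 1]

print(f"slope on standardized Day of the year: {slope_raw:+.3f}")
print(f"slope on quantified Day of the year:   {slope_transformed:+.3f}")
```

Because both transformed variables are standardized, the slope of the second line equals their correlation, so its square is the R² of the transformed regression.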