Example: Ozone Data

In this example, you will use a larger set of data to illustrate the selection and effects of optimal scaling transformations. The data include 330 observations on six meteorological variables previously analyzed by Breiman and Friedman [1] and Hastie and Tibshirani [2], among others. The following table describes the original variables. The categorical regression attempts to predict the ozone concentration from the remaining five variables. Previous researchers found nonlinearities among these variables, which hinder standard regression approaches.

Table 1. Original variables
Variable   Description
ozon       daily ozone level; categorized into one of 38 categories
ibh        inversion base height
dpg        pressure gradient (mm Hg)
vis        visibility (miles)
temp       temperature (degrees F)
doy        day of the year

This dataset can be found in ozone.sav. See the topic Sample Files for more information.
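
To show why nonlinear relationships hinder standard regression, here is a minimal Python sketch. It does not use the ozone data and does not reproduce the optimal scaling procedure. The values are synthetic, and the names (pearson, predictor, response, transformed) and the hand-picked quadratic transformation are assumptions made only for illustration. The sketch shows that a predictor can have almost no linear correlation with a response it determines completely, and that a suitable transformation of the predictor recovers the relationship.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic values (NOT the ozone data): the response peaks in the
# middle of the predictor's range, so the relationship is nonlinear.
predictor = list(range(-10, 11))
response = [100 - x * x for x in predictor]

# Linear correlation with the raw predictor is close to 0.
r_raw = pearson(predictor, response)

# A hand-picked transformation (an assumption for this sketch, not an
# optimal scaling result) makes the relationship linear: close to 1.
transformed = [-(x * x) for x in predictor]
r_transformed = pearson(transformed, response)

print(f"raw predictor:         r = {r_raw:.3f}")
print(f"transformed predictor: r = {r_transformed:.3f}")
```

In this sketch the transformation is chosen by hand because the shape of the relationship is known in advance. Optimal scaling instead estimates the transformations from the data.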


[1] Breiman, L., and J. H. Friedman. 1985. Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80, 580-598.
[2] Hastie, T., and R. Tibshirani. 1990. Generalized additive models. London: Chapman and Hall.