Example: Ozone Data
In this example, you will use a larger set of data to illustrate the selection and effects of optimal scaling transformations. The data include 330 observations on six meteorological variables previously analyzed by Breiman and Friedman [1] and Hastie and Tibshirani [2], among others. The following table describes the original variables. The categorical regression attempts to predict the ozone concentration from the remaining variables. Previous researchers found nonlinearities among these variables, which hinder standard regression approaches.
Variable | Description |
---|---|
ozon | daily ozone level; categorized into one of 38 categories |
ibh | inversion base height |
dpg | pressure gradient (mm Hg) |
vis | visibility (miles) |
temp | temperature (degrees F) |
doy | day of the year |
This dataset can be found in ozone.sav. See the topic Sample Files for more information.
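To see why nonlinearity hinders a standard linear regression, the following minimal sketch uses synthetic data, not the values in ozone.sav. It assumes a hypothetical response that peaks in the middle of the year and compares the fit of a straight line on the raw day of the year with the fit after a hand-picked transformation of that predictor. The function name `r_squared`, the shape of the response, and the transformation are all illustrative assumptions. Categorical regression estimates its transformations from the data through optimal scaling, which this sketch does not attempt to reproduce.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation; equals R^2 for a one-predictor linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic predictor: day of the year, 1 through 365.
days = list(range(1, 366))

# Hypothetical response that peaks mid-year, plus Gaussian noise (an assumption).
response = [20.0 - ((d - 183) / 40.0) ** 2 + rng.gauss(0.0, 1.0) for d in days]

# Hand-picked nonlinear transformation of the predictor (not optimal scaling).
transformed = [(d - 183) ** 2 for d in days]

r2_linear = r_squared(days, response)
r2_transformed = r_squared(transformed, response)

print(f"R^2 with raw day of year:         {r2_linear:.3f}")
print(f"R^2 with transformed day of year: {r2_transformed:.3f}")
```

Because the synthetic relationship is symmetric around mid-year, the straight-line fit explains almost none of the variation, while the transformed predictor explains most of it.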
[1] Breiman, L., and J. H. Friedman. 1985. Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80, 580-598.
[2] Hastie, T., and R. Tibshirani. 1990. Generalized additive models. London: Chapman and Hall.