Standard linear regression models assume that variance is constant within the population under study. When this is not the case (for example, when cases that are high on some attribute show more variability than cases that are low on that attribute), linear regression using ordinary least squares (OLS) no longer provides optimal model estimates. If the differences in variability can be predicted from another variable, the Weight Estimation procedure can compute the coefficients of a linear regression model using weighted least squares (WLS), such that the more precise observations (that is, those with less variability) are given greater weight in determining the regression coefficients. The Weight Estimation procedure tests a range of weight transformations and indicates which will give the best fit to the data.
Example. What are the effects of inflation and unemployment on changes in stock prices? Because stocks with higher share values often show more variability than those with low share values, ordinary least squares will not produce optimal estimates. Weight estimation allows you to account for the effect of share price on the variability of price changes in calculating the linear model.
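The weighting idea can be sketched outside the procedure in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure's own implementation: it handles a single independent variable, the function name wls_fit and all data values are hypothetical, and the power of 2 is chosen arbitrarily rather than estimated.

```python
def wls_fit(x, y, weights):
    """Fit y = b0 + b1 * x by weighted least squares (one predictor)."""
    total = sum(weights)
    # Weighted means of the predictor and the response.
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    # Weighted sums of squares and cross-products around those means.
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: the spread of price_change grows with share_price.
inflation = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
price_change = [2.1, 3.8, 6.5, 7.2, 11.0, 10.9]
share_price = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]

power = 2.0  # assumed value; the procedure searches a range of powers
# Weight = reciprocal of the weight source variable raised to the power.
weights = [1.0 / s ** power for s in share_price]

# Equal weights reduce the weighted fit to ordinary least squares.
ols = wls_fit(inflation, price_change, [1.0] * len(inflation))
wls = wls_fit(inflation, price_change, weights)
print("OLS intercept, slope:", ols)
print("WLS intercept, slope:", wls)
```

Cases with a low share price receive the largest weights here, so they pull the fitted line more strongly than the noisier high-priced cases.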
Statistics. Log-likelihood values for each power of the weight source variable tested, multiple R, R-squared, adjusted R-squared, ANOVA table for WLS model, unstandardized and standardized parameter estimates, and log-likelihood for the WLS model.
Weight Estimation data considerations
Data. The dependent and independent variables should be quantitative. Categorical variables, such as religion, major, or region of residence, need to be recoded to binary (dummy) variables or other types of contrast variables. The weight variable should be quantitative and should be related to the variability in the dependent variable.
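As an illustration of the recoding step, the following Python sketch turns one categorical variable into binary (dummy) variables, leaving one category out as the reference level. The function name dummy_code and the sample values are hypothetical, and indicator coding with an omitted reference level is only one of several possible contrast schemes.

```python
def dummy_code(values, reference=None):
    """Return one 0/1 indicator list per category except the reference."""
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]  # assumption: first level in sort order
    # One indicator per non-reference level; the reference is all zeros.
    return {
        level: [1 if v == level else 0 for v in values]
        for level in levels
        if level != reference
    }


region = ["north", "south", "west", "south"]  # hypothetical data
print(dummy_code(region))
```

A variable with k categories yields k - 1 indicators, which can then be entered as independent variables.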
Assumptions. For each value of the independent variable, the distribution of the dependent variable must be normal. The relationship between the dependent variable and each independent variable should be linear, and all observations should be independent. The variance of the dependent variable can vary across levels of the independent variable(s), but the differences must be predictable based on the weight variable.
Related procedures. The Explore procedure can be used to screen your data. Explore provides tests for normality and homogeneity of variance, as well as graphical displays. If your dependent variable seems to have equal variance across levels of independent variables, you can use the Linear Regression procedure. If your data appear to violate an assumption (such as normality), try transforming them. If your data are not related linearly and a transformation does not help, use an alternate model in the Curve Estimation procedure. If your dependent variable is dichotomous (for example, whether a particular sale is completed or whether an item is defective), use the Logistic Regression procedure. If your dependent variable is censored (for example, survival time after surgery), use Life Tables, Kaplan-Meier, or Cox Regression, available in Custom Tables and Advanced Statistics. If your data are not independent (for example, if you observe the same person under several conditions), use the Repeated Measures procedure, available in Custom Tables and Advanced Statistics.
Obtaining a Weight Estimation Analysis
This feature requires SPSS® Statistics Standard Edition or the Regression Option.
- From the menus choose:
- Select one dependent variable.
- Select one or more independent variables.
- Select the variable that is the source of heteroscedasticity as the weight variable.
Weight Variable. The data are weighted by the reciprocal of this variable raised to a power. The regression equation is calculated for each value in a specified range of powers, and the procedure indicates the power that maximizes the log-likelihood function.
Power Range. This is used in conjunction with the weight variable to compute weights. Several regression equations will be fit, one for each value in the power range. The values entered in the Power range text box and the through text box must be between -6.5 and 7.5, inclusive. The power values range from the low value to the high value, in increments determined by the value specified. The total number of values in the power range is limited to 150.
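The power search described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the procedure's actual algorithm: it handles one independent variable, assumes a positive weight source variable, and uses the standard Gaussian log-likelihood for a regression whose error variance is proportional to the source variable raised to the power. The function names and data are hypothetical.

```python
import math


def power_grid(low, high, step):
    """List the powers from low to high in increments of step."""
    if not (-6.5 <= low <= high <= 7.5):
        raise ValueError("powers must lie between -6.5 and 7.5, inclusive")
    if step <= 0:
        raise ValueError("step must be positive")
    # Small tolerance so that float rounding does not drop the last value.
    count = int(math.floor((high - low) / step + 1e-9)) + 1
    if count > 150:
        raise ValueError("the power range is limited to 150 values")
    return [round(low + i * step, 10) for i in range(count)]


def wls_loglik(x, y, source, power):
    """Gaussian log-likelihood of a one-predictor WLS fit (assumed form)."""
    n = len(y)
    w = [1.0 / s ** power for s in source]  # reciprocal of source ** power
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Maximum-likelihood estimate of the variance scale factor.
    sigma2 = sum(wi * r * r for wi, r in zip(w, resid)) / n
    return (-0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
            + 0.5 * sum(math.log(wi) for wi in w))


def best_power(x, y, source, powers):
    """Return the power in the grid with the largest log-likelihood."""
    return max(powers, key=lambda p: wls_loglik(x, y, source, p))


# Hypothetical data in which the spread of y grows with the source variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
source = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1 + 2 * xi + (-1) ** i * 0.3 * s
     for i, (xi, s) in enumerate(zip(x, source))]

powers = power_grid(0, 3, 0.5)
print("powers tested:", powers)
print("best power:", best_power(x, y, source, powers))
```

A power of 0 gives every case a weight of 1, so the first grid value in this example corresponds to an ordinary least squares fit.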
This procedure pastes WLS command syntax.