# Two-Stage Least-Squares Regression

Standard linear regression models assume that the errors in the dependent variable are uncorrelated with the independent variable(s). When this is not the case (for example, when relationships between variables are bidirectional), linear regression using ordinary least squares (OLS) no longer provides optimal model estimates: the coefficient estimates are biased and inconsistent. Two-stage least-squares regression uses instrumental variables that are uncorrelated with the error terms to compute estimated values of the problematic predictor(s) (the first stage), and then uses those computed values to estimate a linear regression model of the dependent variable (the second stage). Because the computed values are based on variables that are uncorrelated with the errors, the two-stage estimates are consistent, provided that the instruments are also correlated with the problematic predictor(s).
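The two stages can be illustrated with a minimal sketch for the simplest case of one problematic predictor and one instrument. This is not the procedure's implementation. The helper names (`simple_ols`, `two_stage_least_squares`, `simulate`) and the simulated data are assumptions made for illustration, with a true slope of 2 and an error term that deliberately shares a component with the predictor.

```python
import random


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous predictor x and one instrument z."""
    # Stage 1: regress the problematic predictor on the instrument.
    a1, b1 = simple_ols(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress the dependent variable on the fitted values.
    return simple_ols(x_hat, y)


def simulate(n=5000, seed=0):
    """Hypothetical data: x is correlated with the error in y through u."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)   # instrument, independent of the error
        u = rng.gauss(0, 1)    # shock shared by x and the error in y
        w = rng.gauss(0, 1)    # additional noise in y
        xi = zi + u
        yi = 1.0 + 2.0 * xi + u + 0.5 * w   # true slope is 2
        y.append(yi)
        x.append(xi)
        z.append(zi)
    return y, x, z


y, x, z = simulate()
_, ols_slope = simple_ols(x, y)
_, tsls_slope = two_stage_least_squares(y, x, z)
print(f"OLS slope:  {ols_slope:.3f}")
print(f"2SLS slope: {tsls_slope:.3f}")
```

In this setup the OLS slope should settle near 2.5 rather than 2, because the predictor and the error share the component `u`, while the two-stage slope should land close to 2. The sketch returns coefficients only. Standard errors taken from a plain second-stage regression would not be correct, because they ignore the fact that the fitted values were themselves estimated.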

**Example.** Is the demand for a commodity related to its price
and consumers' incomes? The difficulty in this model is that price
and demand have a reciprocal effect on each other. That is, price
can influence demand and demand can also influence price. A two-stage
least-squares regression model might use consumers' incomes and lagged
price to calculate a proxy for price that is uncorrelated with the
measurement errors in demand. This proxy is substituted for price
itself in the originally specified model, which is then estimated.
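Written out, the example corresponds to a pair of equations along the following lines. The symbols are assumptions introduced here for illustration: the subscript $t$ indexes observations over time (implied by the use of lagged price), the $\gamma$ and $\beta$ terms are regression coefficients, and $v_t$ and $\varepsilon_t$ are error terms.

```latex
\begin{aligned}
\text{Stage 1:}\quad & \text{price}_t = \gamma_0 + \gamma_1\,\text{income}_t + \gamma_2\,\text{price}_{t-1} + v_t \\
\text{Stage 2:}\quad & \text{demand}_t = \beta_0 + \beta_1\,\widehat{\text{price}}_t + \beta_2\,\text{income}_t + \varepsilon_t
\end{aligned}
```

Here $\widehat{\text{price}}_t$ is the fitted value from the first stage, that is, the proxy that replaces price in the originally specified model.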

**Statistics.** For each model: standardized and unstandardized
regression coefficients, multiple *R*, *R*²,
adjusted *R*², standard error of the estimate, analysis-of-variance
table, predicted values, and residuals. Also, 95% confidence intervals
for each regression coefficient, and correlation and covariance matrices
of parameter estimates.

## Two-Stage Least-Squares Regression Data Considerations

**Data.** The dependent and independent variables should be quantitative. Categorical variables,
such as religion, major, or region of residence, need to be recoded to binary (dummy) variables or
other types of contrast variables. Endogenous explanatory variables should be
quantitative (not categorical).

**Assumptions.** For each value of the independent variable, the distribution of the dependent
variable must be normal. The variance of the distribution of the dependent variable should be
constant for all values of the independent variable. The relationship between the dependent variable
and each independent variable should be linear.

**Related procedures.** If you believe that none of your predictor variables is correlated with
the errors in your dependent variable, you can use the Linear Regression procedure. If your data
appear to violate one of the assumptions (such as normality or constant variance), try transforming
them. If your data are not related linearly and a transformation does not help, use an alternative
model in the Curve Estimation procedure. If your dependent variable is dichotomous, such as whether
a particular sale is completed or not, use the Logistic Regression procedure. If your data are not
independent (for example, if you observe the same person under several conditions), use the Repeated
Measures procedure.

## Obtaining a Two-Stage Least-Squares Regression Analysis

This feature requires SPSS® Statistics Standard Edition or the Regression Option.

- From the menus choose:
- Select one dependent variable.
- Select one or more explanatory (predictor) variables.
- Select one or more instrumental variables.

**Instrumental.** These are the variables used to compute the predicted values for the endogenous variables in the first stage of two-stage least-squares analysis. The same variables may appear in both the Explanatory and Instrumental list boxes. There must be at least as many instrumental variables as explanatory variables. If all explanatory and instrumental variables listed are the same, the results are the same as results from the Linear Regression procedure.

Explanatory variables not specified as instrumental are considered endogenous. Normally, all of the exogenous variables in the Explanatory list are also specified as instrumental variables.
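The rules for the two list boxes can be summarized in a short sketch. The function `classify_variables` and the variable names are hypothetical and are not part of the product. The sketch applies only the rules stated above to a pair of lists.

```python
def classify_variables(explanatory, instrumental):
    """Apply the list-box rules to two lists of variable names.

    Hypothetical helper for illustration; it performs no estimation.
    """
    instrument_set = set(instrumental)
    # Explanatory variables not listed as instrumental are endogenous.
    endogenous = [v for v in explanatory if v not in instrument_set]
    # Explanatory variables that are also instrumental are exogenous.
    exogenous = [v for v in explanatory if v in instrument_set]
    return {
        "endogenous": endogenous,
        "exogenous": exogenous,
        # At least as many instrumental as explanatory variables.
        "enough_instruments": len(instrument_set) >= len(set(explanatory)),
        # Identical lists reduce the analysis to ordinary linear regression.
        "same_as_linear_regression": set(explanatory) == instrument_set,
    }


# Variable names follow the demand example and are illustrative only.
result = classify_variables(
    explanatory=["price", "income"],
    instrumental=["income", "lagged_price"],
)
print(result)
```

For the demand example, price is classified as endogenous, income as exogenous, and two instrumental variables are enough for two explanatory variables.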

This procedure pastes 2SLS command syntax.