R-Squared Statistics

Figure 1. Model Summary

In the linear regression model, the coefficient of determination, R², summarizes the proportion of variance in the dependent variable associated with the predictor (independent) variables. Larger R² values indicate that more of the variation is explained by the model, up to a maximum of 1. For regression models with a categorical dependent variable, it is not possible to compute a single R² statistic that has all of the characteristics of R² in the linear regression model, so approximations are computed instead. Three methods are used to estimate the coefficient of determination: Cox and Snell's R² (note 1), Nagelkerke's R² (note 2), and McFadden's R² (note 3).
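For reference, the three approximations cited in the notes at the end of this topic are commonly written as below. The notation is an assumption made for this sketch and does not come from the topic itself: \ell_0 is the log-likelihood of the intercept-only (baseline) model, \ell_M is the log-likelihood of the fitted model, and n is the number of cases.

```latex
\begin{align*}
R^2_{\mathrm{CS}}  &= 1 - \exp\!\left(\frac{2\,(\ell_0 - \ell_M)}{n}\right) \\
R^2_{\mathrm{N}}   &= \frac{R^2_{\mathrm{CS}}}{1 - \exp\!\left(2\,\ell_0 / n\right)} \\
R^2_{\mathrm{McF}} &= 1 - \frac{\ell_M}{\ell_0}
\end{align*}
```

With a categorical outcome, the Cox and Snell value cannot exceed 1 - exp(2 \ell_0 / n), which is less than 1. The Nagelkerke value divides by that bound so that the statistic covers the full range from 0 to 1.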

What constitutes a “good” R² value varies between different areas of application. While these statistics can be suggestive on their own, they are most useful when comparing competing models for the same data. The model with the largest R² statistic is “best” according to this measure.
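As a minimal sketch of such a comparison, the Python snippet below computes the three statistics from log-likelihoods for two competing models and picks the one with the largest value. The function name `pseudo_r2`, the model names, and every numeric value are hypothetical and chosen only for illustration. The formulas are the commonly used definitions and are not necessarily the exact computation performed by the software.

```python
import math


def pseudo_r2(ll_null, ll_model, n):
    """Return Cox and Snell, Nagelkerke, and McFadden pseudo R-squared values.

    ll_null  : log-likelihood of the intercept-only (baseline) model
    ll_model : log-likelihood of the fitted model
    n        : number of cases
    """
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Upper bound of the Cox and Snell statistic for these data
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    nagelkerke = cox_snell / max_cox_snell
    mcfadden = 1.0 - ll_model / ll_null
    return {
        "cox_snell": cox_snell,
        "nagelkerke": nagelkerke,
        "mcfadden": mcfadden,
    }


# Hypothetical log-likelihoods for two models fitted to the same 200 cases
n = 200
ll_null = -130.0
candidates = {"model_a": -110.0, "model_b": -95.0}

results = {name: pseudo_r2(ll_null, ll, n) for name, ll in candidates.items()}
for name, stats in results.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})

# The model with the largest statistic is "best" according to this measure
best = max(results, key=lambda name: results[name]["nagelkerke"])
print("largest Nagelkerke R-squared:", best)
```

Because both models share the same baseline log-likelihood and the same n, all three statistics rank them in the same order. Comparing values across different data sets is not meaningful in the same way.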


1 Cox, D. R., and E. J. Snell. 1989. The Analysis of Binary Data, 2nd ed. London: Chapman and Hall.
2 Nagelkerke, N. J. D. 1991. A note on the general definition of the coefficient of determination. Biometrika, 78:3, 691-692.
3 McFadden, D. 1974. Conditional logit analysis of qualitative choice behavior. In: Frontiers in Econometrics, P. Zarembka, ed. New York: Academic Press.