Choosing a procedure for Binary Logistic Regression

Binary logistic regression models can be fitted using either the Logistic Regression procedure or the Multinomial Logistic Regression procedure. Each procedure has options not available in the other. An important theoretical distinction is that the Logistic Regression procedure produces all predictions, residuals, influence statistics, and goodness-of-fit tests using data at the individual case level. It does so regardless of how the data are entered and whether or not the number of covariate patterns is smaller than the total number of cases. The Multinomial Logistic Regression procedure, by contrast, internally aggregates cases to form subpopulations with identical covariate patterns for the predictors, and produces predictions, residuals, and goodness-of-fit tests based on these subpopulations. If all predictors are categorical, or if any continuous predictors take on only a limited number of values, so that there are several cases at each distinct covariate pattern, the subpopulation approach can produce valid goodness-of-fit tests and informative residuals, while the individual case level approach cannot.
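The subpopulation idea can be illustrated with a minimal Python sketch. It is not the algorithm used by either procedure. It assumes a single numeric predictor, invented data (60 cases at 3 covariate patterns), and a hand-written Newton-Raphson fit, and all function and variable names are hypothetical. Cases are collapsed into subpopulations, a binary logit model is fitted, and Pearson and deviance chi-square statistics are computed over the subpopulations.

```python
import math
from collections import defaultdict


def aggregate(cases):
    """Collapse (x, y) cases into {x: (n_cases, n_events)} subpopulations."""
    groups = defaultdict(lambda: [0, 0])
    for x, y in cases:
        groups[x][0] += 1
        groups[x][1] += y
    return {x: (n, events) for x, (n, events) in sorted(groups.items())}


def fit_logit(groups, iterations=25):
    """Newton-Raphson fit of logit(p) = b0 + b1 * x on grouped data."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for x, (n, events) in groups.items():
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * p * (1.0 - p)          # binomial variance weight
            u0 += events - n * p           # score for the intercept
            u1 += x * (events - n * p)     # score for the slope
            i00 += w
            i01 += x * w
            i11 += x * x * w
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2x2 system (information * step = score)
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1


def pearson_and_deviance(groups, b0, b1):
    """Goodness-of-fit statistics computed over subpopulations."""
    pearson = deviance = 0.0
    for x, (n, events) in groups.items():
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        expected = n * p
        pearson += (events - expected) ** 2 / (n * p * (1.0 - p))
        # Deviance terms for events and non-events; 0 * log(0) counts as 0
        for obs, exp in ((events, expected), (n - events, n - expected)):
            if obs > 0:
                deviance += 2.0 * obs * math.log(obs / exp)
    return pearson, deviance


# Invented data: 20 cases at each of three covariate patterns
cases = ([(0, 1)] * 4 + [(0, 0)] * 16 +
         [(1, 1)] * 9 + [(1, 0)] * 11 +
         [(2, 1)] * 15 + [(2, 0)] * 5)

groups = aggregate(cases)
b0, b1 = fit_logit(groups)
pearson, deviance = pearson_and_deviance(groups, b0, b1)
df = len(groups) - 2  # subpopulations minus estimated parameters

print(groups)
print(round(pearson, 3), round(deviance, 3), df)
```

With 20 cases per subpopulation, the expected counts are large enough for the chi-square approximation to be reasonable. If every case formed its own subpopulation (n = 1), the same formulas could still be evaluated, but the statistics would no longer follow a chi-square distribution, which is the limitation of the case-level approach described above.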

Logistic Regression
Provides the following unique features:
  • Hosmer-Lemeshow test of goodness of fit for the model
  • Stepwise analyses
  • Contrasts to define model parameterization
  • Alternative cut points for classification
  • Classification plots
  • Application of a model fitted on one set of cases to a held-out set of cases
  • Saving of predictions, residuals, and influence statistics
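As a rough illustration of the Hosmer-Lemeshow idea listed above, the following Python sketch sorts cases by fitted probability, splits them into groups of roughly equal size, and compares observed and expected event counts within each group. It is a simplified sketch, not the procedure's implementation: the function name, the equal-size grouping rule, and the toy data are assumptions, and tie handling and group boundaries in a real procedure may differ.

```python
def hosmer_lemeshow(probs, outcomes, n_groups=10):
    """Return (statistic, degrees of freedom) for a simplified HL test.

    probs    -- fitted probabilities, each strictly between 0 and 1
    outcomes -- observed 0/1 responses, same length as probs
    """
    # Sort by fitted probability only (stable sort keeps tied cases in order)
    pairs = sorted(zip(probs, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed events in the group
        expected = sum(p for p, _ in chunk)   # expected events in the group
        stat += (observed - expected) ** 2 / (expected * (1.0 - expected / size))
    return stat, n_groups - 2


# Hypothetical, perfectly calibrated toy data: 40 cases, each with fitted
# probability 0.5, and outcomes alternating 0, 1.
toy_probs = [0.5] * 40
toy_outcomes = [0, 1] * 20
hl_stat, hl_df = hosmer_lemeshow(toy_probs, toy_outcomes, n_groups=4)
print(hl_stat, hl_df)
```

Because the groups are formed from the fitted probabilities rather than from covariate patterns, this kind of test remains usable when most cases have a unique covariate pattern.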

Multinomial Logistic Regression
Provides the following unique features:
  • Pearson and deviance chi-square tests for goodness of fit of the model
  • Specification of subpopulations for grouping of data for goodness-of-fit tests
  • Listing of counts, predicted counts, and residuals by subpopulations
  • Correction of variance estimates for over-dispersion
  • Covariance matrix of the parameter estimates
  • Tests of linear combinations of parameters
  • Explicit specification of nested models
  • Fitting of 1-1 matched conditional logistic regression models using differenced variables

Notes
  • Both of these procedures fit a model for binary data that is a generalized linear model with a binomial distribution and logit link function. If a different link function is more appropriate for your data, use the Generalized Linear Models procedure.
  • If you have repeated measurements of binary data, or records that are otherwise correlated, consider the Generalized Linear Mixed Models or Generalized Estimating Equations procedures.
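To make the link-function choice mentioned above more concrete, the following Python sketch compares the inverse of the logit link with two common alternatives for binary data, the probit and complementary log-log links. The function names are hypothetical and the snippet only evaluates the standard formulas. It does not fit a model.

```python
import math


def inv_logit(eta):
    """Inverse logit link: p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Inverse probit link: standard normal CDF evaluated at eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_cloglog(eta):
    """Inverse complementary log-log link: p = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


# Compare the probabilities implied by the same linear predictor values
for eta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(eta,
          round(inv_logit(eta), 4),
          round(inv_probit(eta), 4),
          round(inv_cloglog(eta), 4))
```

The logit and probit curves are symmetric around a probability of 0.5, while the complementary log-log curve is asymmetric, which is one reason a different link may suit some data better.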