Overview (DISCRIMINANT command)
DISCRIMINANT performs linear discriminant analysis for two or more groups. The
goal of discriminant analysis is to classify cases into one of several
mutually exclusive groups based on their values for a set of predictor
variables. In the analysis phase, a classification rule is developed
using cases for which group membership is known. In the classification
phase, the rule is used to classify cases for which group membership
is not known. The grouping variable must be categorical, and the independent
(predictor) variables must be interval or dichotomous, since they
will be used in a regression-type equation.
Options
Variable Selection Method. In addition
to the direct-entry method, you can specify any of several stepwise
methods for entering variables into the discriminant analysis using
the METHOD subcommand. You can
set the values for the statistical criteria used to enter variables
into, and remove them from, the equation using the TOLERANCE, FIN, PIN,
FOUT, POUT, and VIN subcommands, and you can specify inclusion levels
on the ANALYSIS subcommand. You can also specify
the maximum number of steps in a stepwise analysis using the MAXSTEPS subcommand.
Case Selection. You can
select a subset of cases for the analysis phase using the SELECT subcommand.
Prior Probabilities. You
can specify prior probabilities for membership in a group using the PRIORS subcommand. Prior probabilities are
used in classifying cases.
New Variables. You can add new variables
to the active dataset containing the predicted group membership, the
probability of membership in each group, and discriminant function
scores using the SAVE subcommand.
Classification Options. With the CLASSIFY subcommand, you can classify only those cases that were not selected
for inclusion in the discriminant analysis, or only those cases whose
value for the grouping variable was missing or fell outside the range
analyzed. In addition, you can classify cases based on the separate-group
covariance matrices of the functions instead of the pooled within-groups
covariance matrix.
Statistical Display. You can request
any of a variety of statistics on the STATISTICS subcommand. You can rotate the pattern or structure matrices using
the ROTATE subcommand. You can
compare actual with predicted group membership using a classification
results table requested with the STATISTICS subcommand, or examine any of
several types of plots or histograms requested with the PLOT subcommand.
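As an illustration, several of these options might be combined in a single command along the following lines. This is a sketch only: the variable names OUTCOME, V1 TO V7, and SAMPLE, and the saved-variable names PREDGRP, PRB, and SCR, are hypothetical and stand in for whatever the active dataset contains.

```spss
* Hypothetical data: OUTCOME holds groups 1 to 4, V1 TO V7 are predictors, SAMPLE flags analysis cases.
DISCRIMINANT GROUPS=OUTCOME(1,4)
  /VARIABLES=V1 TO V7
  /SELECT=SAMPLE(1)                          /* analysis phase uses only cases with SAMPLE = 1 */
  /METHOD=WILKS                              /* stepwise selection instead of direct entry */
  /MAXSTEPS=10                               /* cap on the number of stepwise steps */
  /PRIORS=SIZE                               /* prior probabilities taken from group sizes */
  /SAVE CLASS=PREDGRP PROBS=PRB SCORES=SCR   /* new variables added to the active dataset */
  /CLASSIFY=UNSELECTED                       /* classify only the cases not selected above */
  /STATISTICS=TABLE                          /* classification results table */
  /PLOT=COMBINED.
```

In this sketch the cases with SAMPLE equal to 1 are used to develop the classification rule, and the remaining cases are the ones classified, which mirrors the analysis and classification phases described in the overview.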
Basic Specification
The basic specification requires two subcommands:
- GROUPS specifies the variable used to group cases.
- VARIABLES specifies the predictor variables.
By default, DISCRIMINANT enters all variables simultaneously into the discriminant equation
(the DIRECT method), provided
that they are not so highly correlated that multicollinearity problems
arise. Default output includes an analysis case processing summary, valid
numbers of cases in group statistics, variables failing tolerance
test, a summary of canonical discriminant functions, standardized
canonical discriminant function coefficients, a structure matrix showing
pooled within-groups correlations between the discriminant functions
and the predictor variables, and functions at group centroids.
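A minimal command that meets the basic specification might look like the following. The grouping variable OUTCOME (assumed to be coded 1 through 4) and the predictors V1 TO V7 are hypothetical names used only for illustration.

```spss
* Hypothetical variables: OUTCOME is the grouping variable, V1 TO V7 are the predictors.
DISCRIMINANT GROUPS=OUTCOME(1,4)
  /VARIABLES=V1 TO V7.
```

Because no METHOD subcommand is given, the predictors are entered simultaneously (the DIRECT method), subject to the tolerance test, and only the default output is produced.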
Subcommand Order
- The GROUPS, VARIABLES, and SELECT subcommands must precede all other subcommands and may be entered in any order.
- The analysis block follows, which may include ANALYSIS, METHOD, TOLERANCE, MAXSTEPS, FIN, FOUT, PIN, POUT, VIN, FUNCTIONS, PRIORS, SAVE, and OUTFILE. Each analysis block performs a single analysis. To do multiple analyses, specify multiple analysis blocks.
- The keyword ANALYSIS is optional for the first analysis block. Each new analysis block must begin with an ANALYSIS subcommand. Remaining subcommands in the block may be used in any order and apply only to the analysis defined within the same block.
- No analysis block subcommands can be specified after any of the global subcommands, which apply to all analysis blocks. The global subcommands are MISSING, MATRIX, HISTORY, ROTATE, CLASSIFY, STATISTICS, and PLOT. If an analysis block subcommand appears after a global subcommand, the program displays a warning and ignores it.
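The ordering rules above can be sketched with a command that contains two analysis blocks followed by global subcommands. All variable names here (OUTCOME, V1 TO V7, SAMPLE) are hypothetical, and the particular choice of subcommands is only one possible arrangement.

```spss
* Hypothetical variables: OUTCOME (groups 1 to 4), V1 TO V7 (predictors), SAMPLE (selection flag).
DISCRIMINANT GROUPS=OUTCOME(1,4)
  /VARIABLES=V1 TO V7
  /SELECT=SAMPLE(1)
  /ANALYSIS=V1 TO V7        /* first analysis block (ANALYSIS is optional here) */
  /METHOD=DIRECT
  /ANALYSIS=V1 TO V4        /* second analysis block must begin with ANALYSIS */
  /METHOD=WILKS
  /PRIORS=SIZE
  /STATISTICS=TABLE         /* global subcommands come last */
  /CLASSIFY=UNSELECTED.
```

In this arrangement METHOD=DIRECT applies only to the first block, METHOD=WILKS and PRIORS=SIZE apply only to the second, and STATISTICS and CLASSIFY apply to both analyses.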
Syntax Rules
- Only one GROUPS, one SELECT, and one VARIABLES subcommand can be specified per DISCRIMINANT command.
Operations
- DISCRIMINANT first estimates one or more discriminant functions that best distinguish among the groups.
- Using these functions, DISCRIMINANT then classifies cases into groups (if classification output is requested).
- If more than one analysis block is specified, the above steps are repeated for each block.
- This procedure uses the multithreaded options specified by SET THREADS and SET MCACHE.
Limitations
- Pairwise deletion of missing data is not available.