CRITERIA Subcommand (OPTIMAL BINNING command)
The CRITERIA subcommand specifies bin creation
options.
PREPROCESS = EQUALFREQ (BINS=n) | NONE. Preprocessing
method when MDLP binning is used. PREPROCESS = EQUALFREQ creates
preliminary bins using the equal frequency method before performing
MDLP binning. These preliminary bins—rather than the original data
values of the binning input variables—are input to the MDLP binning
method.
- EQUALFREQ may be followed by parentheses containing the BINS keyword, an equals sign, and an integer greater than 1. The BINS value serves as a preprocessing threshold and specifies the number of bins to create. The default value is EQUALFREQ (BINS = 1000).
- If the number of distinct values in a binning input variable is greater than the BINS value, then the number of bins created is no more than the BINS value. Otherwise, no preprocessing is done for the input variable.
- NONE requests no preprocessing.
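
The threshold rule above can be sketched in a few lines of Python. This is only an illustration of the decision described in the text, not the procedure's implementation; the function name `preprocess_bin_cap` and its return convention are hypothetical.

```python
def preprocess_bin_cap(values, bins=1000):
    """Return the maximum number of preliminary bins for one input variable,
    or None when no preprocessing would be done.

    Hypothetical helper: mirrors PREPROCESS = EQUALFREQ (BINS=n) as described
    in the text, where BINS acts as a threshold on the distinct-value count.
    """
    if bins <= 1:
        raise ValueError("BINS must be an integer greater than 1")
    distinct = len(set(values))
    # Preprocess only when the variable has more distinct values than BINS.
    return bins if distinct > bins else None
```

For a variable with 5000 distinct values and the default threshold, the sketch returns 1000; for a variable with three distinct values it returns None, meaning the original values would go to MDLP binning unchanged.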
METHOD = MDLP | EQUALFREQ (BINS=n). Binning method. The MDLP option
performs supervised binning via the MDLP algorithm.
If METHOD = MDLP is specified, then a guide variable
must be specified on the VARIABLES subcommand.
- Alternatively, METHOD = EQUALFREQ performs unsupervised binning via the equal frequency algorithm. EQUALFREQ may be followed by parentheses containing the BINS keyword, an equals sign, and an integer greater than 1. The BINS value specifies the number of bins to create. The default value of the BINS argument is 10.
- If the number of distinct values in a binning input variable is greater than the BINS value, then the number of bins created is no more than the BINS value. Otherwise, BINS gives an upper bound on the number of bins created. Thus, for example, if BINS = 10 is specified but a binning input variable has at most 10 distinct values, then the number of bins created will equal the number of distinct values in the input variable.
- If EQUALFREQ is specified, then the VARIABLES subcommand GUIDE keyword and the CRITERIA subcommand PREPROCESS keyword are silently ignored.
- The default METHOD option depends on the presence of a GUIDE specification on the VARIABLES subcommand. If GUIDE is specified, then METHOD = MDLP is the default. If GUIDE is not specified, then METHOD = EQUALFREQ is the default.
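
As a rough illustration of the METHOD rules, the Python sketch below resolves the default method and computes equal frequency cut points. It is a simplified stand-in written for this section, not the algorithm the procedure actually uses; `default_method` and `equal_frequency_cuts` are hypothetical names, and the quantile rule for placing cut points is an assumption.

```python
def default_method(guide_specified):
    """Default METHOD: MDLP when a GUIDE variable is given, else EQUALFREQ."""
    return "MDLP" if guide_specified else "EQUALFREQ"


def equal_frequency_cuts(values, bins=10):
    """Return interior cut points; the number of bins is len(cuts) + 1.

    Assumed rule: with at most `bins` distinct values, each distinct value
    gets its own bin; otherwise cuts are taken at evenly spaced ranks.
    """
    if bins <= 1:
        raise ValueError("BINS must be an integer greater than 1")
    data = sorted(values)
    distinct = sorted(set(data))
    if len(distinct) <= bins:
        # One bin per distinct value: every value after the first starts a bin.
        return distinct[1:]
    n = len(data)
    cuts = []
    for k in range(1, bins):
        cut = data[(k * n) // bins]
        # Skip repeated cut points so ties never create empty bins.
        if cut > data[0] and (not cuts or cut > cuts[-1]):
            cuts.append(cut)
    return cuts
```

With the integers 1 through 100 and `bins=10`, the sketch yields nine cut points and therefore ten bins of ten cases each. With the values 1, 1, 2, 2, 3, 3 it yields two cut points, so the three bins match the three distinct values, as in the example in the text.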
LOWEREND = UNBOUNDED | OBSERVED. Specifies how the minimum
end point for each binning input variable is defined. Valid option
values are UNBOUNDED or OBSERVED.
If UNBOUNDED, then the minimum end point extends
to negative infinity. If OBSERVED, then the minimum
observed data value is used.
UPPEREND = UNBOUNDED | OBSERVED. Specifies how the maximum
end point for each binning input variable is defined. Valid option
values are UNBOUNDED or OBSERVED.
If UNBOUNDED, then the maximum end point extends
to positive infinity. If OBSERVED, then the maximum
of the observed data is used.
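
The two end point keywords can be pictured with a small Python sketch. The function `outer_end_points` is a hypothetical helper for illustration only; it simply maps each option to either an infinite bound or the observed extreme.

```python
import math


def outer_end_points(values, lowerend, upperend):
    """Return (minimum end point, maximum end point) for one input variable.

    Hypothetical helper: `lowerend` and `upperend` take the option values
    "UNBOUNDED" or "OBSERVED" described in the text.
    """
    options = ("UNBOUNDED", "OBSERVED")
    if lowerend not in options or upperend not in options:
        raise ValueError("option must be UNBOUNDED or OBSERVED")
    # UNBOUNDED extends to infinity; OBSERVED uses the data extreme.
    lo = -math.inf if lowerend == "UNBOUNDED" else min(values)
    hi = math.inf if upperend == "UNBOUNDED" else max(values)
    return lo, hi
```

For the values 18, 35, and 90, `outer_end_points(values, "OBSERVED", "UNBOUNDED")` gives a minimum end point of 18 and a maximum end point of positive infinity.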
LOWERLIMIT = INCLUSIVE | EXCLUSIVE. Specifies how the
lower limit of an interval is defined. Valid option values are INCLUSIVE or EXCLUSIVE.
Suppose the start and end points of an interval are p and q,
respectively. If LOWERLIMIT = INCLUSIVE, then the
interval contains values greater than or equal to p but less
than q. If LOWERLIMIT = EXCLUSIVE, then the
interval contains values greater than p and less than or equal
to q.
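
The membership rule for an interval with start point p and end point q can be written directly as a comparison. The following Python sketch is illustrative only, and the function name `in_interval` is hypothetical.

```python
def in_interval(x, p, q, lowerlimit):
    """Report whether x falls in the interval from p to q.

    `lowerlimit` is "INCLUSIVE" (p <= x < q) or "EXCLUSIVE" (p < x <= q),
    following the definitions in the text.
    """
    if lowerlimit == "INCLUSIVE":
        return p <= x < q
    if lowerlimit == "EXCLUSIVE":
        return p < x <= q
    raise ValueError("lowerlimit must be INCLUSIVE or EXCLUSIVE")
```

A value equal to p belongs to the interval only under INCLUSIVE, and a value equal to q belongs to it only under EXCLUSIVE, so adjacent intervals never claim the same boundary value twice.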
FORCEMERGE = value. Small bins threshold. Occasionally, the procedure may produce bins
with very few cases. A bin is merged if the ratio of its size (number of cases) to that of a
neighboring bin is smaller than the specified threshold. Larger thresholds tend to result in more
merging. The default value of FORCEMERGE is 0; by default, forced merging of very
small bins is not performed.
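
The merge test can be expressed as a single ratio comparison. The Python sketch below only illustrates that comparison; the name `should_merge` is hypothetical, and how the procedure picks the neighboring bin or repeats the test after a merge is not covered here.

```python
def should_merge(bin_size, neighbor_size, threshold=0.0):
    """Return True when a bin is small enough, relative to a neighboring bin,
    to be merged under the FORCEMERGE threshold.

    Sizes are case counts; the neighbor is assumed to be non-empty.
    """
    if neighbor_size <= 0:
        raise ValueError("neighbor_size must be positive")
    # Merge when the size ratio falls strictly below the threshold.
    return bin_size / neighbor_size < threshold
```

A bin of 5 cases next to a bin of 100 cases has a ratio of 0.05, so it would be merged with a threshold of 0.1 but not with the default of 0, since a ratio of case counts can never be smaller than 0.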