TRANSFORM Subcommand (ADP command)
The TRANSFORM subcommand is used to merge similar
categories of categorical inputs, bin values of continuous inputs,
and construct and select new input fields from continuous inputs using
principal components analysis.
MERGESUPERVISED Keyword
The MERGESUPERVISED keyword specifies how to
merge similar categories of a nominal or ordinal input in the presence
of a target.
- If there are no categorical inputs, MERGESUPERVISED is ignored.
- If there is no target specified on the FIELDS subcommand, MERGESUPERVISED is ignored.
YES(PVALUE=value). Supervised merge. Similar categories
are identified based upon the relationship between the input and the
target. Categories that are not significantly different (that is,
those with a p-value greater than the value of PVALUE)
are merged. Specify a value greater than 0 and less than or equal
to 1. The default is 0.05. YES is the default.
NO. Do not merge categories.
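As an illustration of a supervised merge, the following sketch lowers PVALUE so that only strongly distinct categories stay separate. The field names (churn, region, plan) and the output path are hypothetical, and the OUTFILE line is included on the assumption that ADP expects a destination for the prepared-data XML.

```spss
* Hypothetical fields: churn is the target, region and plan are categorical inputs.
* Categories whose p-value exceeds 0.01 are merged.
ADP
  /FIELDS TARGET=churn INPUT=region plan
  /TRANSFORM MERGESUPERVISED=YES(PVALUE=0.01)
  /OUTFILE PREPXML='/data/adp_merge.xml'.
```

Because categories are merged when their p-value is greater than PVALUE, a smaller PVALUE merges more categories and a larger one merges fewer.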
MERGEUNSUPERVISED Keyword
The MERGEUNSUPERVISED keyword specifies how to
merge similar categories of a nominal or ordinal input when there
is no target.
- If there are no categorical inputs, MERGEUNSUPERVISED is ignored.
- If there is a target specified on the FIELDS subcommand, MERGEUNSUPERVISED is ignored.
YES(ORDINAL|NOMINAL|MINPCT=value). Unsupervised merge. The
equal frequency method is used to merge categories with less than MINPCT percent of
the total number of records. Specify a value greater than or equal
to 0 and less than or equal to 100. The default is 10 if MINPCT is
not specified. If YES is specified without ORDINAL or NOMINAL,
then no merging is performed.
NO. Do not merge categories. NO is
the default.
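A minimal sketch of an unsupervised merge follows. The field names and output path are hypothetical, and the space-separated form of the options inside the parentheses is an assumption about how ORDINAL, NOMINAL, and MINPCT are combined. No TARGET is listed, because MERGEUNSUPERVISED is ignored when a target is present.

```spss
* Hypothetical inputs with no target specified.
* Categories holding less than 5 percent of the records are merged.
ADP
  /FIELDS INPUT=region plan
  /TRANSFORM MERGEUNSUPERVISED=YES(ORDINAL NOMINAL MINPCT=5)
  /OUTFILE PREPXML='/data/adp_unsup.xml'.
```

ORDINAL and NOMINAL are both listed here because YES without either of them performs no merging.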
BINNING Keyword
The BINNING keyword specifies how to discretize
continuous inputs in the presence of a categorical target.
SUPERVISED(PVALUE=value). Supervised binning. Bins
are created based upon the properties of "homogeneous subsets", which
are identified by the Scheffe method using PVALUE as
the alpha for the critical value for determining homogeneous subsets. Specify
a value greater than 0 and less than or equal to 1. The default is 0.05.
SUPERVISED is the default.
If there is no target specified on the FIELDS subcommand,
or the target is not categorical, or there are no continuous inputs,
then SUPERVISED is ignored.
NONE. Do not bin values of continuous inputs.
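The following sketch requests supervised binning with a non-default alpha. It assumes a categorical target (churn) and continuous inputs (age, income). These names and the output path are hypothetical.

```spss
* Hypothetical fields: churn is a categorical target, age and income are continuous inputs.
* PVALUE is the alpha used when identifying homogeneous subsets.
ADP
  /FIELDS TARGET=churn INPUT=age income
  /TRANSFORM BINNING=SUPERVISED(PVALUE=0.1)
  /OUTFILE PREPXML='/data/adp_bin.xml'.
```

To leave continuous inputs unbinned, specify BINNING=NONE instead.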
SELECTION Keyword
The SELECTION keyword specifies how to perform
feature selection for continuous inputs in the presence of a continuous
target.
YES(PVALUE=value). Perform feature selection. A
continuous input is removed from the analysis if the p-value
for its correlation with the target is greater than PVALUE. YES is
the default.
If there is no target specified on the FIELDS subcommand,
or the target is not continuous, or there are no continuous inputs,
then YES is ignored.
NO. Do not perform feature selection.
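As a sketch of feature selection, the example below assumes a continuous target (sales) and three continuous inputs. All field names and the output path are hypothetical.

```spss
* Hypothetical fields: sales is a continuous target, the inputs are continuous.
* An input is dropped when the p-value of its correlation with sales exceeds 0.01.
ADP
  /FIELDS TARGET=sales INPUT=price adspend footfall
  /TRANSFORM SELECTION=YES(PVALUE=0.01)
  /OUTFILE PREPXML='/data/adp_select.xml'.
```

A stricter (smaller) PVALUE removes more inputs, since fewer correlations fall at or below the threshold.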
CONSTRUCTION Keyword
The CONSTRUCTION keyword specifies how to perform
feature construction for continuous inputs in the presence of a continuous
target.
YES(ROOT=rootname). Perform feature construction. New
predictors are constructed from groups of "similar" predictors using
principal component analysis. Optionally specify the rootname for
constructed predictors using ROOT in parentheses; give the rootname
without quotes. The default is feature.
If there is no target specified on the FIELDS subcommand,
or the target is not continuous, or there are no continuous inputs,
then YES is ignored.
NO. Do not perform feature construction. NO is
the default.
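Feature construction is off by default, so it must be requested explicitly. The sketch below does so with a custom rootname. The field names, the rootname pc, and the output path are hypothetical.

```spss
* Hypothetical fields: sales is a continuous target, the inputs are continuous.
* Constructed predictors are named from the rootname pc, given without quotes.
ADP
  /FIELDS TARGET=sales INPUT=price adspend footfall
  /TRANSFORM CONSTRUCTION=YES(ROOT=pc)
  /OUTFILE PREPXML='/data/adp_construct.xml'.
```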