RESCALE Subcommand (MLP command)
The RESCALE subcommand is used to rescale covariates or scale dependent variables.
All rescaling is performed based on the training data, even if a testing or holdout sample is defined (see PARTITION Subcommand (MLP command)). That is, depending on the type of rescaling, the mean, standard deviation, minimum value, or maximum value of a covariate or dependent variable is computed using only the training data. It is important that these covariates or dependent variables have similar distributions across the training, testing, and holdout samples. If the data are partitioned by specifying percentages on the PARTITION subcommand, then the MLP procedure attempts to ensure this similarity by random assignment. However, if you use the PARTITION subcommand VARIABLE keyword to assign cases to the training, testing, and holdout samples, then we recommend that you confirm the distributions are similar across samples before running the MLP procedure.
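One way to confirm that the distributions are similar is to compare summary statistics for each sample before running the procedure. The following Python sketch is illustrative only and is not part of the MLP command. The function name summarize_by_sample, the sample labels, and the data values are hypothetical.

```python
from statistics import mean, stdev

def summarize_by_sample(values, samples):
    """Return {sample_label: (n, mean, sd, min, max)} for one variable."""
    groups = {}
    for x, label in zip(values, samples):
        groups.setdefault(label, []).append(x)
    summary = {}
    for label, xs in groups.items():
        # The standard deviation is undefined for a single case.
        sd = stdev(xs) if len(xs) > 1 else float("nan")
        summary[label] = (len(xs), mean(xs), sd, min(xs), max(xs))
    return summary

# Hypothetical covariate values and user-assigned sample labels.
income = [31.0, 45.0, 52.0, 38.0, 47.0, 90.0, 95.0]
sample = ["training", "training", "training",
          "testing", "testing", "holdout", "holdout"]

summary = summarize_by_sample(income, sample)
for label, (n, m, sd, lo, hi) in summary.items():
    print(f"{label:9s} n={n} mean={m:.1f} sd={sd:.1f} min={lo:.1f} max={hi:.1f}")
```

In this made-up example every holdout value lies above the training maximum, so statistics computed from the training data would describe the holdout sample poorly.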
COVARIATE Keyword
The COVARIATE keyword specifies the rescaling method to use for covariates specified following WITH on the command line. If no covariates are specified on the command line, then the COVARIATE keyword is ignored.
STANDARDIZED. Subtract the mean and divide by the standard deviation, (x−mean)/s. This is the default rescaling method for covariates.
NORMALIZED. Subtract the minimum and divide by the range, (x−min)/(max−min).
ADJNORMALIZED. Adjusted normalization: subtract the minimum, divide by the range, then multiply by 2 and subtract 1, [2*(x−min)/(max−min)]−1. Training values are mapped to the interval from −1 to 1.
NONE. No rescaling of covariates.
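The covariate formulas above can be sketched in Python as follows. This is a minimal illustration, not the procedure's implementation. The function make_rescaler and the data are hypothetical, and the use of the sample standard deviation (n − 1 divisor) for s is an assumption that this section does not state.

```python
from statistics import mean, stdev

def make_rescaler(method, train_values):
    """Return a function that rescales one value using training-only statistics."""
    if method == "NONE":
        return lambda x: x
    if method == "STANDARDIZED":
        m = mean(train_values)
        s = stdev(train_values)  # assumption: sample standard deviation (n - 1)
        return lambda x: (x - m) / s
    lo, hi = min(train_values), max(train_values)
    if method == "NORMALIZED":
        return lambda x: (x - lo) / (hi - lo)
    if method == "ADJNORMALIZED":
        return lambda x: 2 * (x - lo) / (hi - lo) - 1
    raise ValueError(f"unknown rescaling method: {method}")

train = [10.0, 20.0, 30.0, 40.0]  # hypothetical training values of one covariate
holdout_value = 50.0              # a holdout case outside the training range

for method in ("STANDARDIZED", "NORMALIZED", "ADJNORMALIZED", "NONE"):
    rescale = make_rescaler(method, train)
    print(method,
          [round(rescale(x), 3) for x in train],
          round(rescale(holdout_value), 3))
```

Because the minimum and maximum come from the training data only, the holdout value 50 is mapped above 1 under NORMALIZED. This is why similar distributions across samples matter.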
DEPENDENT Keyword
The DEPENDENT keyword specifies the rescaling method to use for scale dependent variables.
- This keyword applies only to scale dependent variables; that is, either MLEVEL=S is specified on the command line or the variable has a scale measurement level based on its data dictionary setting. If a dependent variable is not scale, then the DEPENDENT keyword is ignored for that variable.
- The availability of these rescaling methods for scale dependent variables depends on the output layer activation function in effect. See the OUTPUTFUNCTION keyword in ARCHITECTURE Subcommand (MLP command) for details about the activation functions.
- If the identity activation function is in effect, then any of the rescaling methods, including NONE, may be requested. If the sigmoid activation function is in effect, then NORMALIZED is required. If the hyperbolic tangent activation function is in effect, then ADJNORMALIZED is required.
- If automatic architecture selection is in effect (/ARCHITECTURE AUTOMATIC=YES), then the default output layer activation function (identity if there are any scale dependent variables) is always used. In this case, the default rescaling method (STANDARDIZED) is also used and the DEPENDENT keyword is ignored.
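The rules above can be summarized as a small lookup. The following Python sketch is illustrative only. The activation labels ("identity", "sigmoid", "tanh") and the function effective_dependent_rescaling are hypothetical names, and raising an error for a disallowed combination is an assumption about how such a request might be handled, not documented behavior.

```python
# Rescaling methods permitted for scale dependent variables,
# keyed by hypothetical labels for the output layer activation function.
ALLOWED_METHODS = {
    "identity": {"STANDARDIZED", "NORMALIZED", "ADJNORMALIZED", "NONE"},
    "sigmoid": {"NORMALIZED"},
    "tanh": {"ADJNORMALIZED"},
}

def effective_dependent_rescaling(requested, activation, automatic=False):
    """Return the rescaling method in effect for a scale dependent variable."""
    if automatic:
        # Automatic architecture selection: the DEPENDENT keyword is ignored
        # and the default method is used.
        return "STANDARDIZED"
    if requested not in ALLOWED_METHODS[activation]:
        raise ValueError(
            f"{requested} is not available with the {activation} activation function"
        )
    return requested

print(effective_dependent_rescaling("NORMALIZED", "sigmoid"))
print(effective_dependent_rescaling("NONE", "identity"))
print(effective_dependent_rescaling("ADJNORMALIZED", "tanh", automatic=True))
```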
STANDARDIZED. Subtract the mean and divide by the standard deviation, (x−mean)/s. This is the default rescaling method for scale dependent variables if the output layer uses the identity activation function. This rescaling method may not be specified if the output layer uses the sigmoid or hyperbolic tangent activation function.
NORMALIZED. Subtract the minimum and divide by the range, (x−min)/(max−min). This is the required rescaling method for scale dependent variables if the output layer uses the sigmoid activation function. This rescaling method may not be specified if the output layer uses the hyperbolic tangent activation function. The NORMALIZED keyword may be followed by the CORRECTION option, which specifies a number ε that is applied as a correction to the rescaling formula. In particular, the corrected formula is [x−(min−ε)]/[(max+ε)−(min−ε)]. This correction ensures that all rescaled dependent variable values will be within the range of the activation function. A real number greater than or equal to 0 must be specified. The default is 0.02.
ADJNORMALIZED. Adjusted normalization: subtract the minimum, divide by the range, then multiply by 2 and subtract 1, [2*(x−min)/(max−min)]−1. This is the required rescaling method for scale dependent variables if the output layer uses the hyperbolic tangent activation function. This rescaling method may not be specified if the output layer uses the sigmoid activation function. The ADJNORMALIZED keyword may be followed by the CORRECTION option, which specifies a number ε that is applied as a correction to the rescaling formula. In particular, the corrected formula is {2*[(x−(min−ε))/((max+ε)−(min−ε))]}−1. This correction ensures that all rescaled dependent variable values will be within the range of the activation function. A real number greater than or equal to 0 must be specified. The default is 0.02.
NONE. No rescaling of scale dependent variables.
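The effect of the CORRECTION option on the NORMALIZED and ADJNORMALIZED formulas can be sketched in Python as follows. This is an illustration of the formulas given above, not the procedure's implementation. The function names and data values are hypothetical.

```python
def normalized(x, lo, hi, eps=0.02):
    """Corrected NORMALIZED formula: [x-(min-eps)] / [(max+eps)-(min-eps)]."""
    return (x - (lo - eps)) / ((hi + eps) - (lo - eps))

def adjnormalized(x, lo, hi, eps=0.02):
    """Corrected ADJNORMALIZED formula: 2 * (corrected NORMALIZED value) - 1."""
    return 2 * normalized(x, lo, hi, eps) - 1

# Hypothetical training minimum and maximum of a scale dependent variable.
lo, hi = 0.0, 1.0

for eps in (0.0, 0.02):
    print("eps =", eps)
    print("  NORMALIZED   :", normalized(lo, lo, hi, eps), normalized(hi, lo, hi, eps))
    print("  ADJNORMALIZED:", adjnormalized(lo, lo, hi, eps), adjnormalized(hi, lo, hi, eps))
```

With ε = 0 the training minimum and maximum map exactly to the endpoints of the interval (0 and 1, or −1 and 1), which the sigmoid and hyperbolic tangent functions approach but never reach. With ε > 0 the rescaled training values fall strictly inside the interval.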