RESCALE Subcommand (MLP command)

The RESCALE subcommand specifies how covariates and scale dependent variables are rescaled.

All rescaling is based on the training data, even if a testing or holdout sample is defined (see PARTITION Subcommand (MLP command)). That is, depending on the type of rescaling, the mean, standard deviation, minimum value, or maximum value of a covariate or dependent variable is computed using only the training data. It is therefore important that these covariates or dependent variables have similar distributions across the training, testing, and holdout samples. If the data are partitioned by specifying percentages on the PARTITION subcommand, then the MLP procedure attempts to ensure this similarity by random assignment. However, if you use the VARIABLE keyword on the PARTITION subcommand to assign cases to the training, testing, and holdout samples, then confirm that the distributions are similar across samples before running the MLP procedure.
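The following Python sketch illustrates the idea of training-only rescaling statistics. It is not the MLP procedure's implementation: the function names and sample values are hypothetical, and it assumes that s is the sample standard deviation (n−1 denominator), which the text above does not state.

```python
from statistics import mean, stdev

def fit_standardizer(training_values):
    """Compute the rescaling statistics from the training sample only.

    Assumption: s is the sample standard deviation (n-1 denominator).
    """
    return mean(training_values), stdev(training_values)

def standardize(values, center, scale):
    """Apply (x - mean) / s using statistics fitted on the training sample."""
    return [(x - center) / scale for x in values]

# Hypothetical values for one covariate in two partitions.
training = [2.0, 4.0, 6.0, 8.0]
holdout = [5.0, 11.0]

center, scale = fit_standardizer(training)
training_rescaled = standardize(training, center, scale)
# The holdout cases reuse the training mean and s; nothing is refitted.
holdout_rescaled = standardize(holdout, center, scale)
```

In this sketch the holdout value 11.0 lies outside the training range, so its rescaled value is larger than any rescaled training value. This is the kind of mismatch that similar distributions across samples are meant to avoid.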

COVARIATE Keyword

The COVARIATE keyword specifies the rescaling method to use for covariates specified following WITH on the command line. If no covariates are specified on the command line, then the COVARIATE keyword is ignored.

STANDARDIZED. Subtract the mean and divide by the standard deviation, (x−mean)/s. This is the default rescaling method for covariates.

NORMALIZED. Subtract the minimum and divide by the range, (x−min)/(max−min).

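ADJNORMALIZED. Adjusted normalization: subtract the minimum, divide by the range, then multiply by 2 and subtract 1, [2*(x−min)/(max−min)]−1. The training minimum maps to −1 and the training maximum maps to 1.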

NONE. No rescaling of covariates.
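As an illustration only, the four covariate methods can be written as one small Python function. The function name and arguments are hypothetical, s is assumed to be the sample standard deviation, and a constant covariate (zero range or zero s) is not handled in this sketch.

```python
from statistics import mean, stdev

def rescale_covariate(values, training_values, method="STANDARDIZED"):
    """Rescale `values` using statistics computed from `training_values`."""
    if method == "NONE":
        return list(values)
    if method == "STANDARDIZED":
        m = mean(training_values)
        s = stdev(training_values)  # assumption: sample standard deviation
        return [(x - m) / s for x in values]
    lo, hi = min(training_values), max(training_values)
    if method == "NORMALIZED":
        return [(x - lo) / (hi - lo) for x in values]
    if method == "ADJNORMALIZED":
        return [2 * (x - lo) / (hi - lo) - 1 for x in values]
    raise ValueError(f"unknown rescaling method: {method}")

# Hypothetical training values for one covariate.
train = [10.0, 15.0, 20.0]
normalized = rescale_covariate(train, train, "NORMALIZED")        # spans 0 to 1
adjnormalized = rescale_covariate(train, train, "ADJNORMALIZED")  # spans -1 to 1
standardized = rescale_covariate(train, train, "STANDARDIZED")    # mean 0
```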

DEPENDENT Keyword

The DEPENDENT keyword specifies the rescaling method to use for scale dependent variables.

  • This keyword applies only to scale dependent variables—that is, either MLEVEL=S is specified on the command line or the variable has a scale measurement level based on its data dictionary setting. If a dependent variable is not scale, then the DEPENDENT keyword is ignored for that variable.
  • The availability of these rescaling methods for scale dependent variables depends on the output layer activation function in effect. See the OUTPUTFUNCTION keyword in ARCHITECTURE Subcommand (MLP command) for details about the activation functions.
  • If the identity activation function is in effect, then any of the rescaling methods, including NONE, may be requested. If the sigmoid activation function is in effect, then NORMALIZED is required. If the hyperbolic tangent activation function is in effect, then ADJNORMALIZED is required.
  • If automatic architecture selection is in effect (/ARCHITECTURE AUTOMATIC=YES), then the default output layer activation function (identity if there are any scale dependent variables) is always used. In this case, the default rescaling method (STANDARDIZED) is also used and the DEPENDENT keyword is ignored.
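The rules in the list above amount to a small lookup table. The Python sketch below encodes them for illustration; the function name and the lowercase activation labels are hypothetical and are not MLP keyword values.

```python
# Rescaling methods permitted for scale dependent variables, keyed by the
# output layer activation function. The labels are illustrative only.
ALLOWED_METHODS = {
    "identity": {"STANDARDIZED", "NORMALIZED", "ADJNORMALIZED", "NONE"},
    "sigmoid": {"NORMALIZED"},
    "tanh": {"ADJNORMALIZED"},
}

def effective_dependent_method(requested, output_function, automatic=False):
    """Return the rescaling method that would be in effect.

    With automatic architecture selection, the identity output function and
    STANDARDIZED are used and the requested method is ignored.
    """
    if automatic:
        return "STANDARDIZED"
    if requested not in ALLOWED_METHODS[output_function]:
        raise ValueError(
            f"{requested} is not available with the {output_function} "
            "output activation function"
        )
    return requested
```

For example, `effective_dependent_method("NONE", "identity")` returns `"NONE"`, while requesting `"STANDARDIZED"` with `"sigmoid"` raises an error in this sketch.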

STANDARDIZED. Subtract the mean and divide by the standard deviation, (x−mean)/s. This is the default rescaling method for scale dependent variables if the output layer uses the identity activation function. This rescaling method may not be specified if the output layer uses the sigmoid or hyperbolic tangent activation function.

NORMALIZED. Subtract the minimum and divide by the range, (x−min)/(max−min). This is the required rescaling method for scale dependent variables if the output layer uses the sigmoid activation function. This rescaling method may not be specified if the output layer uses the hyperbolic tangent activation function. The NORMALIZED keyword may be followed by the CORRECTION option, which specifies a number ε that is applied as a correction to the rescaling formula. The corrected formula is [x−(min−ε)]/[(max+ε)−(min−ε)]. This correction ensures that all rescaled dependent variable values are within the range of the activation function: with ε greater than 0, the training minimum and maximum map to values strictly between 0 and 1. The value of ε must be a real number greater than or equal to 0. The default is 0.02.

ADJNORMALIZED. Adjusted normalization: subtract the minimum, divide by the range, then multiply by 2 and subtract 1, [2*(x−min)/(max−min)]−1. This is the required rescaling method for scale dependent variables if the output layer uses the hyperbolic tangent activation function. This rescaling method may not be specified if the output layer uses the sigmoid activation function. The ADJNORMALIZED keyword may be followed by the CORRECTION option, which specifies a number ε that is applied as a correction to the rescaling formula. The corrected formula is {2*[(x−(min−ε))/((max+ε)−(min−ε))]}−1. This correction ensures that all rescaled dependent variable values are within the range of the activation function: with ε greater than 0, the training minimum and maximum map to values strictly between −1 and 1. The value of ε must be a real number greater than or equal to 0. The default is 0.02.

NONE. No rescaling of scale dependent variables.
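To make the effect of the CORRECTION option concrete, the sketch below evaluates the two corrected formulas in Python. The function names and sample values are hypothetical. In these formulas ε is added to and subtracted from the raw minimum and maximum, so its relative effect depends on the range of the variable.

```python
def normalized(x, lo, hi, eps=0.02):
    """Corrected NORMALIZED formula: [x-(min-eps)] / [(max+eps)-(min-eps)]."""
    return (x - (lo - eps)) / ((hi + eps) - (lo - eps))

def adjnormalized(x, lo, hi, eps=0.02):
    """Corrected ADJNORMALIZED formula: 2 * (corrected NORMALIZED value) - 1."""
    return 2 * normalized(x, lo, hi, eps) - 1

# Hypothetical training minimum and maximum of a scale dependent variable.
lo, hi = 0.0, 1.0

# With the default eps, the extremes stay strictly inside (0, 1) and (-1, 1).
n_min, n_max = normalized(lo, lo, hi), normalized(hi, lo, hi)
a_min, a_max = adjnormalized(lo, lo, hi), adjnormalized(hi, lo, hi)

# With eps = 0, the extremes land exactly on the interval endpoints.
n_min_uncorrected = normalized(lo, lo, hi, eps=0.0)
n_max_uncorrected = normalized(hi, lo, hi, eps=0.0)
```

The sigmoid and hyperbolic tangent functions approach their bounds without reaching them, which is why a target value exactly at 0, 1, or −1 cannot be matched by the output layer and why a small positive ε helps.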