Multilayer Perceptron

The Multilayer Perceptron (MLP) procedure produces a predictive model for one or more dependent (target) variables based on the values of the predictor variables.

Examples. Following are two scenarios using the MLP procedure:

A loan officer at a bank needs to be able to identify characteristics that are indicative of people who are likely to default on loans and use those characteristics to identify good and bad credit risks. Using a sample of past customers, she can train a multilayer perceptron, validate the analysis using a holdout sample of past customers, and then use the network to classify prospective customers as good or bad credit risks.

A hospital system is interested in tracking costs and lengths of stay for patients admitted for treatment of myocardial infarction (MI, or "heart attack"). Obtaining accurate estimates of these measures allows the administration to properly manage the available bed space as patients are treated. Using the treatment records of a sample of patients who received treatment for MI, the administrator can train a network to predict both cost and length of stay.

Data Considerations

Dependent variables. The dependent variables can be:

  • Nominal. A variable can be treated as nominal when its values represent categories with no intrinsic ranking (for example, the department of the company in which an employee works). Examples of nominal variables include region, postal code, and religious affiliation.
  • Ordinal. A variable can be treated as ordinal when its values represent categories with some intrinsic ranking (for example, levels of service satisfaction from highly dissatisfied to highly satisfied). Examples of ordinal variables include attitude scores representing degree of satisfaction or confidence and preference rating scores.
  • Scale. A variable can be treated as scale (continuous) when its values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Examples of scale variables include age in years and income in thousands of dollars.

    The procedure assumes that the appropriate measurement level has been assigned to all dependent variables; however, you can temporarily change the measurement level for a variable by right-clicking the variable in the source variable list and selecting a measurement level from the pop-up menu. To permanently change the level of measurement for a variable, see Variable Measurement Level.

An icon next to each variable in the variable list identifies the measurement level and data type:

Table 1. Measurement level icons

                     Numeric        String                Date                Time
Scale (Continuous)   Scale icon     n/a                   Scale Date icon     Scale Time icon
Ordinal              Ordinal icon   Ordinal String icon   Ordinal Date icon   Ordinal Time icon
Nominal              Nominal icon   Nominal String icon   Nominal Date icon   Nominal Time icon

Predictor variables. Predictors can be specified as factors (categorical) or covariates (scale).

Categorical variable coding. The procedure temporarily recodes categorical predictors and dependent variables using one-of-c coding for the duration of the procedure. If there are c categories of a variable, then the variable is stored as c vectors, with the first category denoted (1,0,...,0), the next category (0,1,0,...,0), ..., and the final category (0,0,...,0,1).
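The one-of-c scheme described above can be sketched in a few lines of Python. This is an illustration only, not the procedure's internal implementation; the function name `one_of_c` is hypothetical, and ordering the categories by sorted value is an assumption.

```python
def one_of_c(values):
    """Code a categorical variable as c indicator vectors (one-of-c coding).

    Assumption: categories are ordered by sorted value; the actual
    procedure may order categories differently.
    """
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    vectors = []
    for value in values:
        row = [0] * len(categories)   # (0, 0, ..., 0)
        row[position[value]] = 1      # a single 1 marks the category
        vectors.append(row)
    return categories, vectors


categories, vectors = one_of_c(["north", "south", "east", "north"])
print(categories)  # ['east', 'north', 'south']
print(vectors)     # [[0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]]
```

A variable with c categories contributes c input units, which is why predictors with many categories increase the number of synaptic weights.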

This coding scheme increases the number of synaptic weights and can result in slower training; however, more "compact" coding methods usually lead to poorly fit neural networks. If your network training is proceeding very slowly, try reducing the number of categories in your categorical predictors by combining similar categories or dropping cases that have extremely rare categories. For more information on recoding variables, see Recode into Same Variables or Recode into Different Variables.

All one-of-c coding is based on the training data, even if a testing or holdout sample is defined (see Partitions (Multilayer Perceptron)). Thus, if the testing or holdout samples contain cases with predictor categories that are not present in the training data, then those cases are not used by the procedure or in scoring. If the testing or holdout samples contain cases with dependent variable categories that are not present in the training data, then those cases are not used by the procedure, but they may be scored.
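As a rough sketch of this behavior for predictors, the following Python example learns the category set from the training sample only and skips holdout cases whose category was never seen in training. The helper names `fit_categories` and `encode_or_skip` are hypothetical, and the data values are made up for illustration.

```python
def fit_categories(training_values):
    """Learn the category set from the training sample only."""
    return sorted(set(training_values))


def encode_or_skip(value, categories):
    """Return the one-of-c vector, or None if the category is unseen in training."""
    if value not in categories:
        return None
    return [1 if value == category else 0 for category in categories]


training = ["A", "B", "A", "C"]
holdout = ["B", "D", "C"]   # "D" does not occur in the training sample

categories = fit_categories(training)
encoded = [encode_or_skip(value, categories) for value in holdout]
usable = [vector for vector in encoded if vector is not None]

print(encoded)  # [[0, 1, 0], None, [0, 0, 1]]
print(usable)   # [[0, 1, 0], [0, 0, 1]]
```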

Rescaling. Dependent variables with a scale measurement level, and covariates, are rescaled by default to improve network training. All rescaling is performed based on the training data, even if a testing or holdout sample is defined (see Partitions (Multilayer Perceptron)). That is, depending on the type of rescaling, the mean, standard deviation, minimum value, or maximum value of a covariate or dependent variable is computed using only the training data. If you specify a variable to define partitions, it is important that the covariates and dependent variables have similar distributions across the training, testing, and holdout samples. Use, for example, the Explore procedure to examine the distributions across partitions.
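The following Python sketch illustrates why similar distributions across partitions matter: the rescaling statistics come from the training sample alone and are then applied unchanged to the other samples. The function names and data are hypothetical, and the use of the sample standard deviation (n − 1 denominator, as computed by `statistics.stdev`) is an assumption rather than a documented detail of the procedure.

```python
from statistics import mean, stdev


def fit_standardizer(training_values):
    """Compute the rescaling statistics from the training sample only."""
    return mean(training_values), stdev(training_values)


def standardize(values, m, s):
    """Apply training-based statistics to any partition."""
    return [(x - m) / s for x in values]


training = [20.0, 30.0, 40.0]
testing = [25.0, 50.0]      # never used to compute the mean or deviation

m, s = fit_standardizer(training)
rescaled_testing = standardize(testing, m, s)
print(m, s)
print(rescaled_testing)
```

If the testing or holdout values fall far outside the training range, their rescaled values fall far outside the range the network saw during training.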

Frequency weights. Frequency weights are ignored by this procedure.

Replicating results. If you want to replicate your results exactly, use the same initialization value for the random number generator, the same data order, and the same variable order, in addition to using the same procedure settings. More details on this issue follow:

  • Random number generation. The procedure uses random number generation during random assignment of partitions, random subsampling for initialization of synaptic weights, random subsampling for automatic architecture selection, and the simulated annealing algorithm used in weight initialization and automatic architecture selection. To reproduce the same randomized results in the future, use the same initialization value for the random number generator before each run of the Multilayer Perceptron procedure. See the topic Random Number Generators for more information.
  • Case order. The Online and Mini-batch training methods (see Training (Multilayer Perceptron)) are explicitly dependent upon case order; however, even Batch training is dependent upon case order because initialization of synaptic weights involves subsampling from the dataset.

    To minimize order effects, randomly order the cases. To verify the stability of a given solution, you may want to obtain several different solutions with cases sorted in different random orders. In situations with extremely large file sizes, multiple runs can be performed with a sample of cases sorted in different random orders.

  • Variable order. Results may be influenced by the order of variables in the factor and covariate lists due to the different pattern of initial values assigned when the variable order is changed. As with case order effects, you might try different variable orders (simply drag and drop within the factor and covariate lists) to assess the stability of a given solution.
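The role of the random number seed and of case order can be illustrated with a small Python sketch. It does not reproduce the procedure's own random number generator; the function names, the 70% training fraction, and the seed values are all hypothetical.

```python
import random


def assign_partitions(n_cases, seed, training_fraction=0.7):
    """Randomly assign cases to partitions; the same seed gives the same assignment."""
    rng = random.Random(seed)
    return [
        "training" if rng.random() < training_fraction else "testing"
        for _ in range(n_cases)
    ]


def random_case_order(n_cases, seed):
    """Return a random permutation of case indices for checking order effects."""
    rng = random.Random(seed)
    order = list(range(n_cases))
    rng.shuffle(order)
    return order


# Same seed: the partition assignment is reproduced exactly.
first_run = assign_partitions(10, seed=2024)
second_run = assign_partitions(10, seed=2024)
print(first_run == second_run)  # True

# Several random case orders, one per stability check.
orders = [random_case_order(10, seed) for seed in (1, 2, 3)]
```

Training the network once per entry in `orders` and comparing the solutions is one way to assess stability, in the spirit of the advice above.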

Creating a Multilayer Perceptron Network

This feature requires the Neural Networks option.

From the menus choose:

Analyze > Neural Networks > Multilayer Perceptron...

  1. Select at least one dependent variable.
  2. Select at least one factor or covariate.

Optionally, on the Variables tab you can change the method for rescaling covariates. The choices are:

  • Standardized. Subtract the mean and divide by the standard deviation, (x−mean)/s.
  • Normalized. Subtract the minimum and divide by the range, (x−min)/(max−min). Normalized values fall between 0 and 1.
  • Adjusted Normalized. Adjusted version of subtracting the minimum and dividing by the range, [2*(x−min)/(max−min)]−1. Adjusted normalized values fall between −1 and 1.
  • None. No rescaling of covariates.
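The three rescaling formulas above translate directly into code. The Python functions below are a minimal sketch with hypothetical names; the mean, standard deviation, minimum, and maximum are passed in as arguments because they are assumed to have been computed beforehand.

```python
def standardized(x, mean, s):
    """(x - mean) / s"""
    return (x - mean) / s


def normalized(x, minimum, maximum):
    """(x - min) / (max - min); result lies between 0 and 1."""
    return (x - minimum) / (maximum - minimum)


def adjusted_normalized(x, minimum, maximum):
    """[2 * (x - min) / (max - min)] - 1; result lies between -1 and 1."""
    return 2 * (x - minimum) / (maximum - minimum) - 1


# Example with a covariate whose minimum is 10 and maximum is 50.
print(normalized(20, 10, 50))           # 0.25
print(adjusted_normalized(20, 10, 50))  # -0.5
print(adjusted_normalized(10, 10, 50))  # -1.0
print(adjusted_normalized(50, 10, 50))  # 1.0
```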

Fields with unknown measurement level

The Measurement Level alert is displayed when the measurement level for one or more variables (fields) in the dataset is unknown. Since measurement level affects the computation of results for this procedure, all variables must have a defined measurement level.

Scan Data. Reads the data in the active dataset and assigns a default measurement level to any fields with a currently unknown measurement level. If the dataset is large, this may take some time.

Assign Manually. Opens a dialog that lists all fields with an unknown measurement level. You can use this dialog to assign a measurement level to those fields. You can also assign a measurement level in Variable View of the Data Editor.

Since measurement level is important for this procedure, you cannot access the dialog to run this procedure until all fields have a defined measurement level.

This procedure pastes MLP command syntax.