Overview (RBF command)

Neural networks are a data mining tool for finding unknown patterns in databases. Neural networks can be used to make business decisions by forecasting demand for a product as a function of price and other variables or by categorizing customers based on buying habits and demographic characteristics. The RBF procedure fits a radial basis function neural network, which is a feedforward, supervised learning network with an input layer, a hidden layer called the radial basis function layer, and an output layer. The hidden layer transforms the input vectors into radial basis functions. Like the MLP (multilayer perceptron) procedure, the RBF procedure performs prediction and classification.

The RBF procedure trains the network in two stages:

  1. The procedure determines the radial basis functions using clustering methods. The center and width of each radial basis function are determined.
  2. The procedure estimates the synaptic weights given the radial basis functions. The sum-of-squares error function with identity activation function for the output layer is used for both prediction and classification. Ordinary least squares regression is used to minimize the sum-of-squares error.

Because of this two-stage training approach, the RBF network generally trains much faster than an MLP network.
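The two training stages can be sketched in Python. This is an illustrative sketch only, not the procedure's implementation: plain k-means stands in for the two-step clustering method, a Gaussian basis is assumed, and the root-mean-square width rule and function names are invented for illustration.

```python
import numpy as np

def rbf_design(X, centers, widths):
    # Hidden-layer design matrix: Gaussian basis values for each case,
    # plus an intercept column for the output layer's bias.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    H = np.exp(-d2 / (2 * widths ** 2))
    return np.column_stack([np.ones(len(X)), H])

def train_rbf(X, y, n_hidden, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)

    # Stage 1: determine centers by a few iterations of k-means (the
    # procedure itself uses a two-step clustering method) and widths
    # from the root-mean-square spread of each cluster (a common
    # heuristic, assumed here).
    centers = X[rng.choice(len(X), n_hidden, replace=False)].copy()
    for _ in range(n_iter):
        labels = np.argmin(
            ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for k in range(n_hidden):
            members = X[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    widths = np.ones(n_hidden)
    for k in range(n_hidden):
        members = X[labels == k]
        if len(members):
            w = np.sqrt(((members - centers[k]) ** 2).sum(-1).mean())
            if w > 0:
                widths[k] = w

    # Stage 2: with the basis functions fixed and an identity output
    # activation, minimizing the sum-of-squares error is ordinary
    # least squares on the design matrix.
    weights, *_ = np.linalg.lstsq(rbf_design(X, centers, widths), y,
                                  rcond=None)
    return centers, widths, weights
```

Because stage 2 reduces to a single linear least-squares solve once the basis is fixed, no iterative weight optimization is needed, which is the source of the speed advantage noted above.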

Options

Prediction or classification. One or more dependent variables may be specified, and they may be scale, categorical, or a combination. If a dependent variable has scale measurement level, then the neural network predicts continuous values that approximate the “true” value of some continuous function of the input data. If a dependent variable is categorical, then the neural network is used to classify cases into the “best” category based on the input predictors.

Rescaling. RBF optionally rescales covariates (predictors with scale measurement level) or scale dependent variables before training the neural network. There are three rescaling options: standardization, normalization, and adjusted normalization.
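The three rescaling options can be illustrated with their usual definitions, which are assumed here for the sketch: standardization subtracts the mean and divides by the standard deviation, normalization maps values onto [0, 1], and adjusted normalization maps values onto [-1, 1].

```python
import numpy as np

def standardize(x):
    # (x - mean) / standard deviation
    return (x - x.mean()) / x.std()

def normalize(x):
    # (x - min) / (max - min), mapping onto [0, 1]
    return (x - x.min()) / (x.max() - x.min())

def adjusted_normalize(x):
    # 2 * (x - min) / (max - min) - 1, mapping onto [-1, 1]
    return 2 * (x - x.min()) / (x.max() - x.min()) - 1
```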

Training, testing, and holdout data. RBF optionally divides the dataset into training, testing, and holdout data. The neural network is trained using the training data. The testing data can be used to determine the “best” number of hidden units for the network. The holdout data is completely excluded from the training process and is used for independent assessment of the final network.
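A random assignment of cases to the three samples along these lines can be sketched as follows. This is illustrative Python, not RBF syntax, and the proportions shown are hypothetical defaults chosen for the example.

```python
import random

def assign_partitions(n_cases, p_train=0.7, p_test=0.2, seed=None):
    """Randomly assign case indices to training/testing/holdout.
    Each case draws a uniform number; the remainder after the
    training and testing proportions goes to holdout."""
    rng = random.Random(seed)
    parts = {"training": [], "testing": [], "holdout": []}
    for i in range(n_cases):
        u = rng.random()
        if u < p_train:
            parts["training"].append(i)
        elif u < p_train + p_test:
            parts["testing"].append(i)
        else:
            parts["holdout"].append(i)
    return parts
```

Seeding the assignment makes the partition reproducible, which connects to the replication advice later in this section.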

Architecture selection. The RBF procedure creates a neural network with one hidden layer and can perform automatic architecture selection to find the “best” number of hidden units. By default, the procedure automatically computes a reasonable range and finds the “best” number within the range. However, you can override these computations by providing your own range or a specific number of hidden units.

Activation functions. Units in the hidden layer can use the normalized radial basis function or the ordinary radial basis function.
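One common formulation of the two choices, assumed here for illustration, uses a Gaussian basis for the ordinary radial basis function; the normalized variant divides each unit's activation by the sum of activations over all hidden units, so the activations sum to 1.

```python
import numpy as np

def ordinary_rbf(x, centers, widths):
    # Gaussian basis: exp(-||x - c_j||^2 / (2 * sigma_j^2)) per unit
    d2 = ((x - centers) ** 2).sum(axis=1)
    return np.exp(-d2 / (2 * widths ** 2))

def normalized_rbf(x, centers, widths):
    # Each unit's activation divided by the sum over all hidden
    # units, so the hidden-layer activations sum to 1.
    phi = ordinary_rbf(x, centers, widths)
    return phi / phi.sum()
```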

Missing values. The RBF procedure has an option for treating user-missing values of categorical variables as valid. User-missing values of scale variables are always treated as invalid.

Output. RBF displays pivot table output but offers an option for suppressing most such output. Graphical output includes a network diagram (default) and a number of optional charts: predicted by observed values, residual by predicted values, ROC (Receiver Operating Characteristic) curves, cumulative gains, lift, and independent variable importance. The procedure also optionally saves predicted values in the active dataset. Hidden unit center and width vectors and synaptic weight estimates can be saved in XML files.

Basic Specification

The basic specification is the RBF command followed by one or more dependent variables, the BY keyword and one or more factors, and the WITH keyword and one or more covariates.

By default, the RBF procedure standardizes covariates and scale dependent variables and selects a training sample before training the neural network. Automatic architecture selection is used to find the “best” neural network architecture. User-missing values are excluded and default pivot table output is displayed.

Note: Measurement level can affect the results. If any variables (fields) have an unknown measurement level, a data pass is performed to determine the measurement level before the analysis begins. For information on the determination criteria, see SET SCALEMIN.

Syntax Rules

  • All subcommands are optional.
  • Subcommands may be specified in any order.
  • Only a single instance of each subcommand is allowed.
  • An error occurs if a keyword is specified more than once within a subcommand.
  • Parentheses, equals signs, and slashes shown in the syntax chart are required.
  • The command name, subcommand names, and keywords must be spelled in full.
  • Empty subcommands are not allowed.
  • Any split variable defined on the SPLIT FILE command may not be used as a dependent variable, factor, covariate, or partition variable.

Limitations

Frequency weights specified on the WEIGHT command are ignored with a warning by the RBF procedure.

Categorical Variables

The RBF procedure temporarily recodes categorical predictors and dependent variables using one-of-c coding for the duration of the procedure. If there are c categories of a variable, then the variable is stored as c vectors, with the first category denoted (1,0,...,0), the next category, (0,1,0,...,0), ..., and the final category, (0,0,...,0,1).
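The recoding described above can be sketched as follows. This is an illustrative helper, not procedure code; category order here follows first appearance in the data, and the procedure's actual category ordering may differ.

```python
def one_of_c(values):
    """One-of-c recoding: a variable with c categories becomes a
    c-vector per case, with a single 1 marking that case's category."""
    cats = list(dict.fromkeys(values))  # unique categories, in order seen
    return [[1 if v == c else 0 for c in cats] for v in values]
```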

Because of the one-of-c coding, the total number of input units is the number of scale predictors plus the number of categories across all categorical predictors. However, unlike the multilayer perceptron (MLP), this coding scheme does not increase the number of synaptic weights for categorical predictors and hence should not significantly increase the training time.

All one-of-c coding is based on the training data, even if a testing or holdout sample is defined (see PARTITION Subcommand (RBF command)). Thus, if the testing or holdout samples contain cases with predictor categories that are not present in the training data, then those cases are not used by the procedure or in scoring. If the testing or holdout samples contain cases with dependent variable categories that are not present in the training data, then those cases are not used by the procedure but they may be scored.

Replicating Results

The RBF procedure uses random number generation during random assignment of partitions. To reproduce the same randomized results in the future, use the SET command to set the initialization value for the random number generator before each run of the RBF procedure.

RBF results are also dependent on data order because the two-step cluster algorithm is used to determine the radial basis functions. To minimize data order effects, randomly order the cases before running the RBF procedure. To verify the stability of a given solution, you may want to obtain several different solutions with cases sorted in different random orders. In situations with extremely large file sizes, multiple runs can be performed with a sample of cases sorted in different random orders.
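The "sort cases in different random orders" step can itself be made reproducible by seeding the shuffle, as in this illustrative sketch (Python, not SPSS syntax; the function name is invented):

```python
import random

def random_case_order(case_ids, seed):
    """Return the cases in a reproducible random order: the same seed
    always yields the same ordering."""
    order = list(case_ids)
    random.Random(seed).shuffle(order)
    return order
```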

In summary, if you want to exactly replicate RBF results in the future, use the same initialization value for the random number generator and the same data order, in addition to using the same RBF procedure settings.