Overview (RBF command)
Neural networks are a data mining tool for finding
unknown patterns in databases. Neural networks can be used to make
business decisions by forecasting demand for a product as a function
of price and other variables or by categorizing customers based on
buying habits and demographic characteristics. The RBF procedure fits
a radial basis function neural network, which is a feedforward,
supervised learning network with an input layer, a hidden layer called
the radial basis function layer, and an output layer. The hidden layer
transforms the input vectors into radial basis functions. Like the MLP
(multilayer perceptron) procedure, the RBF procedure performs
prediction and classification.
The RBF procedure trains the network in two stages:
- The procedure determines the radial basis functions using clustering methods. The center and width of each radial basis function are determined.
- The procedure estimates the synaptic weights given the radial basis functions. The sum-of-squares error function with identity activation function for the output layer is used for both prediction and classification. Ordinary least squares regression is used to minimize the sum-of-squares error.
Because of this two-stage training approach, the RBF network is in
general trained much faster than an MLP network.
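The two stages can be illustrated with a minimal Python sketch. It is not the procedure's actual implementation: plain k-means on one-dimensional inputs stands in for the clustering step, a Gaussian basis function and a bias term are assumed, and all function names and data values are hypothetical.

```python
import math

def kmeans_1d(xs, centers, iters=20):
    """Stage 1 stand-in: plain k-means on 1-D inputs (an assumption, not the procedure's clustering)."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in xs:
            nearest = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    widths = []
    for g, c in zip(groups, centers):
        var = sum((x - c) ** 2 for x in g) / len(g) if g else 0.0
        widths.append(max(math.sqrt(var), 0.1))  # floor keeps the width above zero
    return centers, widths

def hidden_outputs(x, centers, widths):
    """Constant bias term followed by one Gaussian activation per hidden unit."""
    return [1.0] + [math.exp(-((x - c) ** 2) / (2 * w ** 2)) for c, w in zip(centers, widths)]

def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (m[i][n] - sum(m[i][k] * w[k] for k in range(i + 1, n))) / m[i][i]
    return w

def fit_weights(xs, ys, centers, widths):
    """Stage 2: ordinary least squares through the normal equations (H'H) w = H'y."""
    h = [hidden_outputs(x, centers, widths) for x in xs]
    p = len(h[0])
    hth = [[sum(row[i] * row[j] for row in h) for j in range(p)] for i in range(p)]
    hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(p)]
    return solve(hth, hty)

def predict(x, centers, widths, weights):
    # Identity output activation: the prediction is the weighted sum itself.
    return sum(w * h for w, h in zip(weights, hidden_outputs(x, centers, widths)))

xs = [0.0, 0.2, 0.4, 2.0, 2.2, 2.4]   # hypothetical scale predictor
ys = [1.0, 1.1, 0.9, 3.0, 3.2, 2.9]   # hypothetical scale dependent variable
centers, widths = kmeans_1d(xs, [0.0, 2.4])
weights = fit_weights(xs, ys, centers, widths)
sse = sum((y - predict(x, centers, widths, weights)) ** 2 for x, y in zip(xs, ys))
```

Because the second stage is a single linear least-squares solve rather than an iterative optimization over all weights, the sketch also hints at why this kind of training tends to be fast.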
Options
Prediction or classification. One or more dependent variables may be specified, and they may be scale, categorical, or a combination. If a dependent variable has scale measurement level, then the neural network predicts continuous values that approximate the “true” value of some continuous function of the input data. If a dependent variable is categorical, then the neural network is used to classify cases into the “best” category based on the input predictors.
Rescaling. RBF optionally rescales covariates (predictors with scale measurement level) or scale dependent variables before training the neural network. There are three rescaling options: standardization, normalization, and adjusted normalization.
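The three rescaling options can be sketched in Python as follows. The formulas are the conventional definitions and are an assumption here, as is the use of the sample standard deviation; the function names and the `prices` data are hypothetical.

```python
import statistics

def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (an assumption)
    return [(v - mean) / sd for v in values]

def normalize(values):
    """Subtract the minimum and divide by the range, giving values in [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def adjusted_normalize(values):
    """Stretch the normalized values so that they fall in [-1, 1]."""
    return [2 * n - 1 for n in normalize(values)]

prices = [10.0, 20.0, 30.0, 40.0]  # hypothetical covariate
standardized = standardize(prices)
normalized = normalize(prices)
adjusted = adjusted_normalize(prices)
```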
Training, testing, and holdout data. RBF optionally divides the dataset into training, testing, and holdout data. The neural network is trained using the training data. The testing data can be used to determine the “best” number of hidden units for the network. The holdout data is completely excluded from the training process and is used for independent assessment of the final network.
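As a rough illustration of random partitioning, the Python sketch below assigns each case to a sample in proportion to relative weights. The function name, the weights 7, 3, and 0, and the seed are hypothetical illustration values, and the sketch does not claim to reproduce the procedure's own assignment scheme.

```python
import random

def assign_partitions(n_cases, training=7, testing=3, holdout=0, seed=12345):
    """Randomly label each case, with probability proportional to the relative weights."""
    rng = random.Random(seed)  # a private generator keeps the assignment reproducible
    labels = ["training", "testing", "holdout"]
    weights = [training, testing, holdout]
    return rng.choices(labels, weights=weights, k=n_cases)

partitions = assign_partitions(100)
n_training = partitions.count("training")
n_testing = partitions.count("testing")
```

With a holdout weight of zero, no case is assigned to the holdout sample, so every case is available for training or testing.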
Architecture selection. The RBF procedure creates a neural network with one hidden layer and can perform automatic architecture selection to find the “best” number of hidden units. By default, the procedure automatically computes a reasonable range and finds the “best” number within the range. However, you can override these computations by providing your own range or a specific number of hidden units.
Activation functions. Units in the hidden layer can use the normalized radial basis function or the ordinary radial basis function.
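The difference between the two hidden-layer activation functions can be sketched in Python. A Gaussian form is assumed for the ordinary radial basis function, and the normalized version simply divides each activation by the sum over all hidden units; the exact formulas used by the procedure may differ, and the names and values are hypothetical.

```python
import math

def ordinary_rbf(x, centers, widths):
    """Gaussian activation of each hidden unit for the input vector x."""
    out = []
    for c, w in zip(centers, widths):
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        out.append(math.exp(-sq_dist / (2 * w ** 2)))
    return out

def normalized_rbf(x, centers, widths):
    """Ordinary activations rescaled so that they sum to 1 across hidden units."""
    raw = ordinary_rbf(x, centers, widths)
    total = sum(raw)
    return [r / total for r in raw]

centers = [[0.0, 0.0], [1.0, 1.0]]  # hypothetical hidden unit centers
widths = [1.0, 1.0]                 # hypothetical hidden unit widths
ordinary = ordinary_rbf([0.0, 0.0], centers, widths)
normalized = normalized_rbf([0.0, 0.0], centers, widths)
```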
Missing values. The RBF procedure has an option for treating user-missing values of categorical variables as valid. User-missing values of scale variables are always treated as invalid.
Output. RBF displays pivot table output but offers an option for suppressing most such output. Graphical output includes a network diagram (default) and a number of optional charts: predicted by observed values, residual by predicted values, ROC (Receiver Operating Characteristic) curves, cumulative gains, lift, and independent variable importance. The procedure also optionally saves predicted values in the active dataset. Hidden unit center and width vectors and synaptic weight estimates can be saved in XML files.
Basic Specification
The basic specification is the RBF command followed by one or more
dependent variables, the BY keyword and one or more factors, and the
WITH keyword and one or more covariates.
By default, the RBF procedure standardizes covariates and scale dependent
variables and selects a training sample before training the neural
network. Automatic architecture selection is used to find the “best”
neural network architecture. User-missing values are excluded and
default pivot table output is displayed.
Syntax Rules
- All subcommands are optional.
- Subcommands may be specified in any order.
- Only a single instance of each subcommand is allowed.
- An error occurs if a keyword is specified more than once within a subcommand.
- Parentheses, equals signs, and slashes shown in the syntax chart are required.
- The command name, subcommand names, and keywords must be spelled in full.
- Empty subcommands are not allowed.
- Any split variable defined on the SPLIT FILE command may not be used as a dependent variable, factor, covariate, or partition variable.
Limitations
Frequency weights specified on the WEIGHT command are ignored with a
warning by the RBF procedure.
Categorical Variables
The RBF procedure temporarily recodes categorical predictors and dependent variables using one-of-c coding for the duration of the procedure.
If there are c categories of
a variable, then the variable is stored as c vectors, with the first
category denoted (1,0,...,0), the next category, (0,1,0,...,0), ...,
and the final category, (0,0,...,0,1).
Because of the one-of-c coding, the total number of input units is the number of scale predictors plus the number of categories across all categorical predictors. However, unlike the multilayer perceptron (MLP), this coding scheme does not increase the number of synaptic weights for categorical predictors and hence should not significantly increase the training time.
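A small Python sketch of one-of-c coding and the resulting input unit count follows. Sorting the categories to fix their order is an assumption made for the example, and the function names, variables, and data are hypothetical.

```python
def one_of_c(categories):
    """Map each distinct category to a c-element indicator vector."""
    ordered = sorted(set(categories))  # sorted order is an assumption for this sketch
    return {
        cat: [1 if i == j else 0 for j in range(len(ordered))]
        for i, cat in enumerate(ordered)
    }

def count_input_units(n_scale_predictors, categorical_predictors):
    """Scale predictors plus the number of categories across all categorical predictors."""
    return n_scale_predictors + sum(len(set(v)) for v in categorical_predictors.values())

region = ["north", "south", "east", "south"]  # hypothetical factor with 3 categories
owner = ["yes", "no", "no", "yes"]            # hypothetical factor with 2 categories
coding = one_of_c(region)
n_inputs = count_input_units(2, {"region": region, "owner": owner})
```

Here two scale predictors plus factors with three and two categories give seven input units.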
All one-of-c coding is based on the training data, even if a testing or holdout sample is defined (see PARTITION Subcommand (RBF command)). Thus, if the testing or holdout samples contain cases with predictor categories that are not present in the training data, then those cases are not used by the procedure or in scoring. If the testing or holdout samples contain cases with dependent variable categories that are not present in the training data, then those cases are not used by the procedure but they may be scored.
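The effect of basing the coding on the training data can be illustrated with a short Python sketch that separates testing or holdout cases by whether their predictor category was seen in training. The function name and data are hypothetical, and the sketch covers only the predictor case described above.

```python
def split_by_known_categories(training_values, other_values):
    """Separate cases whose category appears in the training data from those that do not."""
    known = set(training_values)
    usable = [v for v in other_values if v in known]
    dropped = [v for v in other_values if v not in known]
    return usable, dropped

training_region = ["north", "south", "south"]  # hypothetical training sample
testing_region = ["south", "west", "north"]    # "west" never appears in training
usable, dropped = split_by_known_categories(training_region, testing_region)
```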
Replicating Results
The RBF procedure uses random number generation during random
assignment of partitions. To reproduce the same randomized results in
the future, use the SET command to set the initialization value for the
random number generator before each run of the RBF procedure.
RBF results are also dependent on data order because the two-step
cluster algorithm is used to determine the radial basis functions. To
minimize data order effects, randomly order the cases before running
the RBF procedure. To verify the stability of a given solution, you may
want to obtain several different solutions with cases sorted in
different random orders. In situations with extremely large file sizes,
multiple runs can be performed with a sample of cases sorted in
different random orders.
In summary, if you want to exactly replicate RBF results in the future,
use the same initialization value for the random number generator and
the same data order, in addition to using the same RBF procedure
settings.