Creating the Stream

Figure 1. Sample stream to show SVM modeling
  1. Create a new stream and add a Var File source node pointing to cell_samples.data in the Demos folder of your IBM® SPSS® Modeler installation.

    Let's take a look at the data in the source file.

  2. Add a Table node to the stream.
  3. Attach the Table node to the Var File node and run the stream.
    Figure 2. Source data for SVM

    The ID field contains the patient identifiers. The characteristics of the cell samples from each patient are contained in the fields Clump through Mit. The values are graded from 1 to 10, with 1 being the closest to benign.

    The Class field contains the diagnosis, as confirmed by separate medical procedures: benign (value = 2) or malignant (value = 4).
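
The coding described above can be checked outside Modeler with a few lines of Python. This is a minimal sketch and not part of the tutorial stream: the names `decode_class` and `valid_grade` are hypothetical, and the code relies only on the coding stated in the text (Class is 2 or 4; characteristic fields are graded from 1 to 10).

```python
# Class codes as described in the tutorial text.
CLASS_LABELS = {2: "benign", 4: "malignant"}


def decode_class(code):
    """Return the diagnosis label for a Class code (2 or 4)."""
    try:
        return CLASS_LABELS[int(code)]
    except (KeyError, TypeError, ValueError):
        raise ValueError(f"unexpected Class code: {code!r}") from None


def valid_grade(value):
    """Return True if a characteristic value lies on the 1-10 grading scale."""
    try:
        return 1 <= int(value) <= 10
    except (TypeError, ValueError):
        return False
```

A check like this can be useful before modeling, because any value outside the documented coding would point to a data problem rather than a genuine grade or diagnosis.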

    Figure 3. Type node settings
  4. Add a Type node and attach it to the Var File node.
  5. Open the Type node.

    We want the model to predict the value of Class, that is, benign (=2) or malignant (=4). Because this field can take only two possible values, we need to change its measurement level to reflect this.

  6. In the Measurement column for the Class field (the last one in the list), click the value Continuous and change it to Flag.
  7. Click Read Values.
  8. In the Role column, set the role for ID (the patient identifier) to None, as this will not be used either as a predictor or a target for the model.
  9. Set the role for the target, Class, to Target and leave the role of all the other fields (the predictors) as Input.
  10. Click OK.

    The SVM node offers a choice of kernel functions for performing its processing. As there's no easy way of knowing which function performs best with any given dataset, we'll choose different functions in turn and compare the results. Let's start with the default, RBF (Radial Basis Function).
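
To make the idea of a kernel function concrete, here is a minimal Python sketch of the standard RBF kernel, K(x, y) = exp(-gamma * ||x - y||^2). It illustrates the general formula only and is not SPSS Modeler's implementation; the function name `rbf_kernel` and the default gamma of 0.1 are assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, y, gamma=0.1):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    # Sum of squared differences between corresponding coordinates.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

The kernel returns 1.0 for identical samples and decays toward 0 as samples move apart, so it acts as a similarity measure; gamma controls how quickly that similarity falls off with distance.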

    Figure 4. Model tab settings
  11. From the Modeling palette, attach an SVM node to the Type node.
  12. Open the SVM node. On the Model tab, click the Custom option for Model name and type class-rbf in the adjacent text field.
    Figure 5. Default Expert tab settings
  13. On the Expert tab, set the Mode to Expert so that the options are readable (they are all greyed out in Simple mode), but leave all the default values as they are. Note that Kernel type is set to RBF by default.
    Figure 6. Analyze tab settings
  14. On the Analyze tab, select the Calculate variable importance check box.
  15. Click Run. The model nugget is placed both in the stream and in the Models palette at the top right of the screen.
  16. Double-click the model nugget in the stream.
