Creating the Stream

- Create a new stream and add a Var File source node pointing to
cell_samples.data in the Demos folder of your IBM® SPSS® Modeler installation.
Let's take a look at the data in the source file.
- Add a Table node to the stream.
- Attach the Table node to the Var File node and run the stream.
Figure 2. Source data for SVM
The ID field contains the patient identifiers. The characteristics of the cell samples from each patient are contained in fields Clump to Mit. The values are graded from 1 to 10, with 1 being the closest to benign.
The Class field contains the diagnosis, as confirmed by separate medical procedures, as to whether the samples are benign (value = 2) or malignant (value = 4).
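To inspect the same kind of data outside the Table node, the field layout described above can be sketched in Python. This is only an illustration: the two rows are made up (they are not taken from cell_samples.data), only four of the columns are shown, and the comma delimiter is an assumption.

```python
import csv
import io

# Made-up rows for illustration only; NOT taken from cell_samples.data.
# Only a subset of the columns (ID, Clump, Mit, Class) is shown, and the
# comma delimiter is an assumption about the file layout.
SAMPLE = """ID,Clump,Mit,Class
1001,5,1,2
1002,8,3,4
"""

# Class codes as described in the text: 2 = benign, 4 = malignant.
CLASS_LABELS = {2: "benign", 4: "malignant"}


def load_rows(text):
    """Parse delimited text into a list of dicts with integer values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{name: int(value) for name, value in row.items()} for row in reader]


def describe(row):
    """Return (patient ID, diagnosis label) for one parsed row."""
    return row["ID"], CLASS_LABELS[row["Class"]]


rows = load_rows(SAMPLE)
for row in rows:
    print(describe(row))
```

To try it on a real file, replace `io.StringIO(text)` with an open file handle.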
Figure 3. Type node settings
- Add a Type node and attach it to the Var File node.
- Open the Type node.
We want the model to predict the value of Class (that is, benign (=2) or malignant (=4)). As this field can have one of only two possible values, we need to change its measurement level to reflect this.
- In the Measurement column for the Class field (the last one in the list), click the value Continuous and change it to Flag.
- Click Read Values.
- In the Role column, set the role for ID (the patient identifier) to None, as this will not be used either as a predictor or a target for the model.
- Set the role for the target, Class, to Target and leave the role of all the other fields (the predictors) as Input.
- Click OK.
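The Type node settings above amount to a small mapping from field names to roles, plus a check that Class really has only two values. The sketch below is plain Python, not the Modeler scripting API. The names `ROLES`, `split_fields`, and `is_flag` are hypothetical, and only the fields named in the text are listed.

```python
# Roles as set in the steps above. The predictors between Clump and Mit
# are omitted here; they would all carry the role "Input" as well.
ROLES = {
    "ID": "None",       # patient identifier: neither predictor nor target
    "Clump": "Input",
    "Mit": "Input",
    "Class": "Target",
}


def split_fields(roles):
    """Group field names by role, preserving their original order."""
    predictors = [name for name, role in roles.items() if role == "Input"]
    targets = [name for name, role in roles.items() if role == "Target"]
    ignored = [name for name, role in roles.items() if role == "None"]
    return predictors, targets, ignored


def is_flag(values):
    """A Flag field takes exactly two distinct values."""
    return len(set(values)) == 2


predictors, targets, ignored = split_fields(ROLES)
print(predictors, targets, ignored)
print(is_flag([2, 4, 2, 2, 4]))  # Class values: 2 = benign, 4 = malignant
```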
The SVM node offers a choice of kernel functions for performing its processing. As there's no easy way of knowing which function performs best with any given dataset, we'll choose different functions in turn and compare the results. Let's start with the default, RBF (Radial Basis Function).
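As a rough illustration of what a kernel function computes, the sketch below implements the standard RBF kernel, K(x, y) = exp(-gamma * ||x - y||^2), with a linear kernel for contrast. It is a generic textbook formulation, not the SVM node's internal code. The `gamma` value and the two sample vectors are illustrative assumptions.

```python
import math


def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.1):
    """Standard RBF kernel; gamma=0.1 is an illustrative value only."""
    return math.exp(-gamma * squared_distance(x, y))


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of the two vectors."""
    return sum(a * b for a, b in zip(x, y))


# Two made-up samples with three graded characteristics each (scale 1 to 10).
a = [5, 1, 1]
b = [8, 10, 10]

print(rbf_kernel(a, a))     # identical samples give the maximum similarity
print(rbf_kernel(a, b))     # dissimilar samples give a value close to 0
print(linear_kernel(a, b))
```

The RBF value depends only on the distance between the samples, whereas the linear kernel grows with the size of the values. This is one reason different kernels can rank the same cases differently.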
Figure 4. Model tab settings
- From the Modeling palette, attach an SVM node to the Type node.
- Open the SVM node. On the Model tab, click the Custom option for Model name and type class-rbf in the adjacent text field.
Figure 5. Default Expert tab settings
- On the Expert tab, set the Mode to Expert so that the options are readable, but leave all the default options as they are. Note that Kernel type is set to RBF by default. In Simple mode, all the options are greyed out.
Figure 6. Analyze tab settings
- On the Analyze tab, select the Calculate variable importance check box.
- Click Run. The model nugget is placed in the stream, and in the Models palette at the top right of the screen.
- Double-click the model nugget in the stream.