Building models with IBM® Db2 for z/OS
Each of the supported algorithms has a corresponding modeling node. You can access the Db2® for z/OS® modeling nodes from the Database Modeling tab on the nodes palette.
Data considerations
Fields in the data source can contain variables of various data types, depending on the modeling node. In SPSS® Modeler, data types are known as measurement levels. The Fields tab of the modeling node uses icons to indicate the permitted measurement level types for its input and target fields.
Target field. The target field is the field whose value you are trying to predict. Where a target can be specified, only one of the source data fields can be selected as the target field.
Record ID field. Specifies the field used to uniquely identify each case. For example, this might be an ID field, such as CustomerID. If the source data does not include an ID field, you can create one with a Derive node, as follows.
- Select the source node.
- From the Field Ops tab on the nodes palette, double-click the Derive node.
- Open the Derive node by double-clicking its icon on the canvas.
- In the Derive field field, type (for example) ID.
- In the Formula field, type @INDEX and click OK.
- Connect the Derive node to the rest of the stream.
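The effect of the Derive node in this procedure can be pictured outside SPSS Modeler with a minimal Python sketch. It assumes that @INDEX yields the 1-based position of each record in the stream. The function name add_record_ids and the sample records are hypothetical and are not part of SPSS Modeler.

```python
def add_record_ids(records, id_field="ID"):
    """Return copies of the records with a sequential ID field added.

    Mimics deriving a field from @INDEX, assuming the index starts at 1.
    The input records are not modified.
    """
    return [
        {**record, id_field: position}
        for position, record in enumerate(records, start=1)
    ]


# Hypothetical sample data with no identifying field.
customers = [
    {"Age": 34, "Income": 52000},
    {"Age": 51, "Income": 78000},
    {"Age": 27, "Income": 31000},
]

with_ids = add_record_ids(customers)
for row in with_ids:
    print(row)
```

Because the ID is taken from record position, it is stable only as long as the source returns records in the same order.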
Handling null values
If the input data contains null values, some Db2 for z/OS nodes may produce error messages or cause streams to run for a long time. We recommend removing records that contain null values before modeling, using the following method.
- Attach a Select node to the source node.
- Set the Mode option of the Select node to Discard.
- Enter the following in the Condition field:
@NULL(field1) [or @NULL(field2)[... or @NULL(fieldN)]]
Be sure to include an @NULL term for every input field.
- Connect the Select node to the rest of the stream.
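When there are many input fields, typing the condition by hand is error-prone. The following Python sketch builds the condition text from a list of field names so that it can be pasted into the Condition field. The function name build_null_condition and the sample field names are hypothetical. The sketch assumes simple field names that need no quoting in the expression.

```python
def build_null_condition(field_names):
    """Build a condition of the form '@NULL(a) or @NULL(b) or ...'.

    One @NULL term is produced per field, in the order given.
    """
    if not field_names:
        raise ValueError("at least one input field is required")
    return " or ".join(f"@NULL({name})" for name in field_names)


# Hypothetical input fields for a model.
input_fields = ["Age", "Income", "Region"]
condition = build_null_condition(input_fields)
print(condition)
```

With the Select node in Discard mode, a record is dropped when any one of the listed fields is null, which is why the terms are joined with or rather than and.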
Model output
It is possible for a stream containing a Db2 for z/OS modeling node to produce slightly different results each time it is run. This is because the order in which the node reads the source data is not always the same, as the data is read into temporary tables before model building. However, the differences produced by this effect are negligible.
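As a general illustration of how read order alone can change a numeric result, the following Python sketch adds the same three values in two different orders. It illustrates floating-point arithmetic in general and is not a description of how Db2 for z/OS builds models. The function name running_total is hypothetical.

```python
def running_total(values):
    """Add values one at a time, in the order given."""
    total = 0.0
    for value in values:
        total += value
    return total


values = [0.1, 0.2, 0.3]
forward = running_total(values)
backward = running_total(list(reversed(values)))

# The two totals are not bit-for-bit identical, but the gap is tiny.
print(forward == backward)
print(abs(forward - backward))
```

The difference is far below any level that would matter for a prediction, which is consistent with the differences between runs being negligible.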
General comments
- In SPSS Collaboration and Deployment Services, it is not possible to create scoring configurations using streams containing Db2 for z/OS modeling nodes.
- PMML export or import is not possible for models created by the Db2 for z/OS nodes.