Troubleshooting SPSS Modeler

This section provides troubleshooting information for issues you might encounter in SPSS Modeler.

Running multiple flows

We recommend that you don't run multiple flows at the same time under the same username in one project. If you must do this, make sure that the memory limit (8 GiB by default) is not exceeded. If too many flows run at the same time under the same username and project, SPSS Modeler may run out of memory and return an error message such as "Execution was interrupted". If you see this error message, complete the following steps:

1. Wait for one or more flow runs to complete.
2. Close the browser tabs that contain successfully completed flow runs.
3. Wait 15 minutes.
4. Click Run on the interrupted flow that returned the error. If you use caching in your flow, flush the cache before you click Run.

Unnamed fields in migrated streams

In SPSS Modeler desktop, unnamed data fields are named field1, field2, and so on by default. In SPSS Modeler in Cloud Pak for Data, they are named COLUMN1, COLUMN2, and so on. As a result, if you create a flow from a stream file (.str) that was created in SPSS Modeler desktop and contains such fields, the output will differ. As a workaround, add a script such as the following to the flow that you created from the imported stream:

# TODO: run this script once after importing the stream into CP4D
import modeler.api
stream = modeler.script.stream()

# Map "COLUMN" to "field" for data sources without field names (CSV without headers)
source_node = stream.findByID("...")  # TODO: provide ID of existing source node (CSV file without headers)
filter_node = stream.findByID("...")  # TODO: provide ID of existing filter node (where field names are provided)
new_node = stream.create("filter", "new node")  # new filter node to place between source and filter
stream.linkBetween(new_node, source_node, filter_node)

# Change field names from "COLUMN1" to "field1" etc.
for number in range(1, 1000):  # covers 1 to 999; change max value if necessary
    old_name = "COLUMN" + str(number)
    new_name = "field" + str(number)
    new_node.setKeyedPropertyValue("new_name", old_name, new_name)
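
To preview which fields the workaround would rename, the naming rule can be sketched in plain Python outside of SPSS Modeler. This is only an illustration: the helper name build_rename_map and the sample field list are hypothetical, and the sketch does not call the Modeler scripting API.

```python
import re

# Matches the Cloud Pak for Data default names described above, e.g. COLUMN1, COLUMN27.
_DEFAULT_NAME = re.compile(r"^COLUMN(\d+)$")


def build_rename_map(field_names):
    """Return {old_name: new_name} for every field named COLUMN<n> (hypothetical helper)."""
    rename_map = {}
    for name in field_names:
        match = _DEFAULT_NAME.match(name)
        if match:
            # Keep the same number, swap the prefix: COLUMN3 becomes field3.
            rename_map[name] = "field" + match.group(1)
    return rename_map


# Sample field list (made up for illustration); "Age" is already named and is left alone.
print(build_rename_map(["COLUMN1", "COLUMN2", "Age"]))
```

Unlike the fixed range in the script above, this variant only produces entries for names that are actually present, which can make it easier to check the mapping before you apply it in the filter node.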

KDE nodes with unsupported Python version

If your flow contains an old KDE node, you may receive an error when you run the flow, stating that the model uses a Python package that's no longer supported. In that case, remove the old KDE node and add a new one.

Differences in how lines without separators are handled

If there is no separator in a line of a data record, that line will be discarded in Cloud Pak for Data. In SPSS Modeler desktop, such lines are read as empty values.
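
Because such lines are dropped in Cloud Pak for Data, it can help to find them in the data file before you import it. The following is a minimal sketch in plain Python, not part of SPSS Modeler. It assumes that "separator" means the field delimiter (a comma here), and the function name find_lines_without_separator is hypothetical.

```python
def find_lines_without_separator(lines, sep=","):
    """Return (line_number, text) pairs for non-blank lines that do not contain `sep`."""
    flagged = []
    for line_number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\r\n")
        if not text.strip():
            continue  # assumption: completely blank lines are not reported
        if sep not in text:
            flagged.append((line_number, text))
    return flagged


# Made-up sample data; the third line has no comma.
sample = ["id,name\n", "1,Ann\n", "orphan\n", "\n", "2,Bob\n"]
for line_number, text in find_lines_without_separator(sample):
    print("line %d has no separator: %r" % (line_number, text))
```

To check a real file, pass an open file object instead of the sample list. Note that in a single-column file every line would be reported, so the check is only meaningful for data with two or more fields.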

Values for Predictor Importance can vary between SPSS Modeler flows and SPSS Modeler desktop streams

To avoid inconsistent results on different platforms, SPSS Modeler on Cloud Pak for Data uses a new random sampling method to compute Predictor Importance. As a result, the new Predictor Importance results can differ from the original Predictor Importance results in SPSS Modeler desktop if the data is not uniformly distributed. Random sampling is triggered when the number of records exceeds 200. SPSS Modeler desktop will be upgraded in a future version to match the results in SPSS Modeler on Cloud Pak for Data.

It's hard to tell the difference between models generated from Text Analytics

In the Text Analytics Workbench, when you click Generate new model, a new model nugget is created in your flow. If you generate multiple models, they all have the same name, so it may be difficult to differentiate them. One recommendation is to use annotations to help identify them (double-click a model nugget to open its properties, then go to Annotations).

Some generated model results may vary from prior versions

Beginning with version 4.6, the generated models for some algorithms may vary from previous versions because certain settings are now assigned dynamically based on the number of CPUs of the deployed pod. For example, KMeans-AS nodes and Random Trees nodes may produce slightly different generated model results. This change was made to make full use of your available computing resources. If you have a more powerful runtime, such as one with 8 CPUs, SPSS Modeler detects the extra CPUs and adjusts settings to take advantage of them.