Kohonen Node Model Options
Model name. You can generate the model name automatically based on the target or ID field (or model type in cases where no such field is specified) or specify a custom name.
Use partitioned data. If a partition field is defined, this option ensures that data from only the training partition is used to build the model.
Continue training existing model. By default, each time you execute a Kohonen node, a completely new network is created. If you select this option, training continues with the last net successfully produced by the node.
Show feedback graph. If this option is selected, a visual representation of the two-dimensional array is displayed during training. The strength of each node is represented by color. Red denotes a unit that is winning many records (a strong unit), and white denotes a unit that is winning few or no records (a weak unit). Feedback may not display if the model takes only a short time to build. Note that this feature can slow training; to speed up training, deselect this option.
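The idea of unit strength can be illustrated with a minimal sketch that counts how many records each unit wins. This is not the node's implementation: the Euclidean distance measure, the 2 x 2 map, and the names best_matching_unit and win_counts are assumptions chosen for illustration only.

```python
import math
from collections import Counter

def best_matching_unit(record, weights):
    """Return the (row, col) key of the unit whose weights are closest to record."""
    return min(weights, key=lambda unit: math.dist(record, weights[unit]))

def win_counts(records, weights):
    """Count how many records each unit wins (assumed measure of unit strength)."""
    counts = Counter({unit: 0 for unit in weights})
    for record in records:
        counts[best_matching_unit(record, weights)] += 1
    return counts

# Hypothetical 2 x 2 map with two-dimensional weight vectors.
weights = {
    (0, 0): (0.0, 0.0),
    (0, 1): (0.0, 1.0),
    (1, 0): (1.0, 0.0),
    (1, 1): (1.0, 1.0),
}
records = [(0.1, 0.1), (0.2, 0.0), (0.9, 0.1), (0.1, 0.2)]
counts = win_counts(records, weights)
```

In this sketch, a unit with a high count corresponds to a strong (red) unit in the feedback graph, and a unit with a count of zero corresponds to a weak (white) unit.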
Stop on. The default stopping criterion stops training based on internal parameters. Alternatively, you can specify time as the stopping criterion by entering the time (in minutes) for the network to train.
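A time-based stopping criterion can be sketched as a loop that keeps training until a deadline passes. The function train_with_time_limit and its train_one_cycle callback are hypothetical names used for illustration and do not describe how the node works internally.

```python
import time

def train_with_time_limit(train_one_cycle, minutes):
    """Call train_one_cycle repeatedly until the time budget is used up.

    Returns the number of completed cycles.
    """
    deadline = time.monotonic() + minutes * 60
    cycles = 0
    while time.monotonic() < deadline:
        train_one_cycle()
        cycles += 1
    return cycles
```

Because this sketch checks the deadline only between cycles, the last cycle may run slightly past the limit.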
Set random seed. If no random seed is set, the sequence of random values used to initialize the network weights will be different every time the node is executed. This can cause the node to create different models on different runs, even if the node settings and data values are exactly the same. By selecting this option, you can set the random seed to a specific value so the resulting model is exactly reproducible. A specific random seed always generates the same sequence of random values, in which case executing the node always yields the same generated model.
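The effect of a fixed seed on weight initialization can be shown with a small sketch based on the Python standard library. The function init_weights and its parameters are assumptions made for illustration; the node's actual initialization scheme is not documented here.

```python
import random

def init_weights(width, length, n_inputs, seed=None):
    """Create one random weight vector per unit of a width x length map."""
    rng = random.Random(seed)  # seed=None seeds the generator from the system
    return [[[rng.random() for _ in range(n_inputs)]
             for _ in range(length)]
            for _ in range(width)]

a = init_weights(3, 2, 4, seed=42)
b = init_weights(3, 2, 4, seed=42)  # same seed: identical starting weights
c = init_weights(3, 2, 4, seed=7)   # different seed: different starting weights
```

Here a and b are identical because both generators start from the same seed, whereas c starts from different weights.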
Note: When using the Set random seed option with records read from a database, a Sort node may be required upstream of this node in order to ensure the same result each time the node is executed. This is because the result produced with a given random seed depends on the order of records, which is not guaranteed to stay the same in a relational database.
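The dependence on record order can be seen in a minimal sketch of an online update, in which a single weight moves toward each record in turn. The update rule, the learning rate, and the (id, value) rows are assumptions for illustration, not the node's algorithm.

```python
def train_unit(records, learning_rate=0.5):
    """Move a single weight toward each record value in turn (an online update)."""
    weight = 0.0
    for _, value in records:
        weight += learning_rate * (value - weight)
    return weight

# The same (id, value) rows, returned by the database in two different orders.
rows_run1 = [(1, 2.0), (2, 4.0), (3, 8.0)]
rows_run2 = [(3, 8.0), (1, 2.0), (2, 4.0)]

unsorted_differs = train_unit(rows_run1) != train_unit(rows_run2)
sorted_matches = train_unit(sorted(rows_run1)) == train_unit(sorted(rows_run2))
```

Sorting on a unique key before training fixes the order of the rows, so both runs produce the same weight.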
Note: If you want to include nominal (set) fields in your model but are having memory problems in building the model, or the model is taking too long to build, consider recoding large set fields to reduce the number of values, or consider using a different field with fewer values as a proxy for the large set. For example, if you are having a problem with a product_id field containing values for individual products, you might consider removing it from the model and adding a less detailed product_category field instead.
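Both recoding strategies can be sketched in a few lines. The function names, the OTHER and UNKNOWN labels, and the sample product values below are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

def collapse_rare_values(values, max_categories, other_label="OTHER"):
    """Keep the most frequent values and recode the rest to other_label."""
    keep = {v for v, _ in Counter(values).most_common(max_categories - 1)}
    return [v if v in keep else other_label for v in values]

def recode_with_lookup(values, lookup, default="UNKNOWN"):
    """Replace each detailed value with a coarser proxy value from lookup."""
    return [lookup.get(v, default) for v in values]

# Hypothetical data: individual product IDs and a product-to-category lookup.
product_id = ["P1", "P1", "P2", "P3", "P1", "P2", "P4"]
category_of = {"P1": "toys", "P2": "toys", "P3": "books", "P4": "garden"}

collapsed = collapse_rare_values(product_id, max_categories=3)
product_category = recode_with_lookup(product_id, category_of)
```

In this sketch, collapse_rare_values reduces the number of distinct values in the field itself, while recode_with_lookup derives a less detailed proxy field such as product_category.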
Optimize. Select options designed to increase performance during model building based on your specific needs.
- Select Speed to instruct the algorithm to never use disk spilling in order to improve performance.
- Select Memory to instruct the algorithm to use disk spilling when appropriate at some sacrifice to speed. This option is selected by default.
Note: When running in distributed mode, this setting can be overridden by administrator options specified in options.cfg.
Append cluster label. This option is selected by default for new models but deselected for models loaded from earlier versions of IBM® SPSS® Modeler. It creates a single categorical score field of the same type that is created by both the K-Means and TwoStep nodes. This string field is used in the Auto Cluster node when calculating ranking measures for the different model types. See the topic Auto Cluster Node for more information.