SMOTE node Settings

Define the following settings on the SMOTE node's Settings tab.

Target Setting

Target Field. Select the target field. All flag, nominal, ordinal, and discrete measurement types are supported. If the Use partitioned data option is selected in the Partition section, only training data will be over-sampled.

Over Sample Ratio

Select Auto to automatically select an over-sample ratio, or select Set Ratio (minority over majority) to set a custom ratio value. The ratio is the number of samples in the minority class divided by the number of samples in the majority class. The value must be greater than 0 and less than or equal to 1.
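As a rough illustration of what a custom ratio implies, the following Python sketch computes how many synthetic minority rows would be needed to reach a given ratio. The function name `synthetic_samples_needed` is hypothetical, and rounding the target down to a whole row is an assumption; the node's own rounding behavior is not documented here.

```python
import math

def synthetic_samples_needed(n_minority: int, n_majority: int, ratio: float) -> int:
    """Number of synthetic minority rows needed to reach the target ratio."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be greater than 0 and less than or equal to 1")
    # Assumption: the target minority count is rounded down to a whole row.
    target_minority = math.floor(ratio * n_majority)
    # Over-sampling only adds rows; if the ratio is already met, add nothing.
    return max(0, target_minority - n_minority)

print(synthetic_samples_needed(100, 900, 0.5))  # 350
print(synthetic_samples_needed(100, 900, 1.0))  # 800
```

With 100 minority rows and 900 majority rows, a ratio of 0.5 corresponds to 450 minority rows after over-sampling, and a ratio of 1 fully balances the two classes.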

Random Seed

Set random seed. Select this option and click Generate to generate the seed used by the random number generator.

Methods

Algorithm Kind. Select the type of SMOTE algorithm you wish to use.

Samples Rules

K Neighbours. Specify the number of nearest neighbors to use for constructing synthetic samples.
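To show how the neighbor count enters the construction of a synthetic sample, here is a minimal Python sketch of the general SMOTE idea: pick a minority row, pick one of its K nearest minority neighbors, and interpolate between the two. This is an illustration under simplifying assumptions (Euclidean distance, numeric fields only), not the code the node runs; the helper names `k_nearest` and `smote_sample` and the data are hypothetical.

```python
import math
import random

def k_nearest(point, candidates, k):
    """Return the k candidates closest to point (Euclidean), excluding point itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point, c))[:k]

def smote_sample(minority, k, rng):
    """Create one synthetic row by interpolating toward a random near neighbor."""
    base = rng.choice(minority)
    neighbor = rng.choice(k_nearest(base, minority, k))
    gap = rng.random()  # position along the segment from base to neighbor, in [0, 1)
    return tuple(b + gap * (n - b) for b, n in zip(base, neighbor))

# Hypothetical minority-class rows with two numeric fields.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0)]
rng = random.Random(42)  # a fixed seed makes repeated runs produce the same rows
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(3)]
print(synthetic)
```

A larger K lets each synthetic row be drawn toward more distant neighbors, which spreads the new samples over a wider region of the minority class. The fixed seed plays the same role as the Random Seed setting: the same seed reproduces the same synthetic rows.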

M Neighbours. Specify the number of nearest neighbors to use for determining if a minority sample is in danger. This setting is used only if the Borderline1 or Borderline2 SMOTE algorithm type is selected.
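The "in danger" test can be sketched as follows. This Python example follows the rule described in the original Borderline-SMOTE paper: a minority sample is in danger when at least half, but not all, of its M nearest neighbors belong to the majority class. Whether the node applies exactly this rule is an assumption, and the function name `classify_minority_sample`, the labels, and the data are hypothetical.

```python
import math

def classify_minority_sample(sample, rows, labels, minority_label, m):
    """Label a minority sample 'noise', 'danger', or 'safe' from its m nearest neighbors."""
    # Rank every other row (minority and majority alike) by distance to the sample.
    others = [(math.dist(sample, r), lab) for r, lab in zip(rows, labels) if r is not sample]
    nearest = sorted(others, key=lambda pair: pair[0])[:m]
    n_majority = sum(1 for _, lab in nearest if lab != minority_label)
    if n_majority == m:
        return "noise"   # surrounded entirely by the majority class
    if n_majority >= m / 2:
        return "danger"  # on the border between the classes
    return "safe"

# Hypothetical data: three small clusters of two-field rows.
rows = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
        (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (0.9, 1.0),
        (3.0, 3.0), (3.1, 3.0), (3.0, 3.1), (2.9, 3.0)]
labels = ["min", "min", "min", "min",
          "min", "maj", "maj", "min",
          "min", "maj", "maj", "maj"]

for i in (0, 4, 8):
    print(rows[i], classify_minority_sample(rows[i], rows, labels, "min", m=3))
# (0.0, 0.0) safe
# (1.0, 1.0) danger
# (3.0, 3.0) noise
```

In the borderline variants, only the samples in danger are used as starting points for synthetic rows, so M controls how the border is detected while K controls how the new rows are built.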

Partition

Use partitioned data. Select this option if you only want training data to be over-sampled.

The SMOTE node requires the imbalanced-learn© Python library. The following table shows the relationship between the settings in the SPSS® Modeler SMOTE node dialog and the Python algorithm.

Table 1. Node properties mapped to Python library parameters
SPSS Modeler setting	Script name (property name)	Python API parameter name
Over sample ratio (number input control)	`sample_ratio_value`	`ratio`
Random seed	`random_seed`	`random_state`
K Neighbours	`k_neighbours`	`k`
M Neighbours	`m_neighbours`	`m`
Algorithm kind	`algorithm_kind`	`kind`
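The mapping in Table 1 can be expressed as a small lookup, which may help when comparing a node's scripted properties with the arguments passed to the Python library. The sketch below only renames keys according to the table; the function name `to_python_parameters` and the example values are hypothetical, and the sketch does not call SPSS Modeler or imbalanced-learn.

```python
# Mapping taken from Table 1: script property name -> Python API parameter name.
PROPERTY_TO_PARAMETER = {
    "sample_ratio_value": "ratio",
    "random_seed": "random_state",
    "k_neighbours": "k",
    "m_neighbours": "m",
    "algorithm_kind": "kind",
}

def to_python_parameters(node_properties):
    """Rename node properties to the Python parameter names listed in Table 1."""
    unknown = set(node_properties) - set(PROPERTY_TO_PARAMETER)
    if unknown:
        raise KeyError(f"No mapping in Table 1 for: {sorted(unknown)}")
    return {PROPERTY_TO_PARAMETER[name]: value for name, value in node_properties.items()}

# Hypothetical example values, not defaults documented for the node.
settings = {"sample_ratio_value": 0.5, "random_seed": 42, "k_neighbours": 5}
print(to_python_parameters(settings))
# {'ratio': 0.5, 'random_state': 42, 'k': 5}
```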