Tester operator

You can use the tester operator to apply the input test data on a classification or regression model and compute a test result that contains the quality information about the tested mining model.

Only the predictive (Classification and Regression) mining models can be tested. You use the tester operator to test the model, compute the test result, and write it into a database table. Testing a model means that the model is applied to data which already contains the target field values that the model can predict. The test compares the predicted and the actual target field values and uses this comparison to create a test result, which contains various model quality numbers and a lift or gains chart. The gains chart shows graphically how well the model’s predictions, when sorted by descending predicted values, reproduce the order of the test data records when sorted by descending actual target field values.

You can visually explore the test result using a visualizer operator or extract one of the quality measures using the quality extractor or gains extractor operators.

For an example of how to use the tester operator, refer to the tutorial Building and testing a prediction model.

The tester needs the same number and type of input columns that were originally used to train the prediction model with values in the target column.

A typical scenario for the tester operator is to split the historic data with known target values into a training data set and a test dataset and then validate the model computed on the training data set with the test data set.

The following figure shows a typical flow:

Figure 1. Example testing function flowExample testing function flow

Here the table BANKCUSTOMERS contains the historic data. The input data is split into 50% training and 50% test rows using the random split operator.

After the Predictor has computed a classification model for categorical target column NBR_YEARS_CLIENT, the tester validates the model and sends the test result to the visualizer operator.

Another typical scenario is that you can validate the prediction model from time to time with current data to ensure that the prediction scores computed by the model are still accurate.



Feedback