Scoring Wizard: Selecting scoring functions
The scoring functions are the types of "scores" available for the selected model, such as the predicted value of the target, the probability of the predicted value, or the probability of a selected target value.
Scoring function. The available scoring functions depend on the model. One or more of the following will be available in the list:
- Predicted value. The predicted value of the target outcome of interest. This is available for all models, except those that do not have a target.
- Probability of predicted value. The probability of the predicted value being the correct value, expressed as a proportion. This is available for most models with a categorical target.
- Probability of selected value. The probability of the selected value being the correct value, expressed as a proportion. Select a value from the drop-down list in the Value column. The available values are defined by the model. This is available for most models with a categorical target.
- Confidence. A probability measure associated with the predicted value of a categorical target. For Binary Logistic Regression, Multinomial Logistic Regression, and Naive Bayes models, the result is identical to the probability of the predicted value. For Tree and Ruleset models, the confidence can be interpreted as an adjusted probability of the predicted category and is always less than the probability of the predicted value. For these models, the confidence value is more reliable than the probability of the predicted value.
- Node number. The predicted terminal node number for Tree models.
- Standard error. The standard error of the predicted value. Available for Linear Regression models, General Linear models, and Generalized Linear models with a scale target. This is available only if the covariance matrix is saved in the model file.
- Cumulative Hazard. The estimated cumulative hazard function. The value indicates the probability of observing the event at or before the specified time, given the values of the predictors.
- Nearest neighbor. The ID of the nearest neighbor. The ID is the value of the case labels variable, if supplied, and otherwise the case number. Applies only to nearest neighbor models.
- Kth nearest neighbor. The ID of the kth nearest neighbor. Enter an integer for the value of k in the Value column. The ID is the value of the case labels variable, if supplied, and otherwise the case number. Applies only to nearest neighbor models.
- Distance to nearest neighbor. The distance to the nearest neighbor. Depending on the model, either Euclidean or City Block distance will be used. Applies only to nearest neighbor models.
- Distance to kth nearest neighbor. The distance to the kth nearest neighbor. Enter an integer for the value of k in the Value column. Depending on the model, either Euclidean or City Block distance will be used. Applies only to nearest neighbor models.
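As a rough illustration of how several of the functions above relate to one another, the following Python sketch derives the predicted value, the probability of the predicted value, and the probability of a selected value from one case's class probabilities, and finds the kth nearest neighbor under Euclidean or City Block distance. It is not the product's implementation. The function names, the dictionary layout, and the use of raw, unweighted features are all assumptions made for this example; an actual nearest neighbor model may normalize or weight features before computing distances.

```python
import math


def categorical_scores(class_probs, selected_value):
    """Derive three scores from one case's class probabilities.

    class_probs is a hypothetical dict mapping each target category to
    its estimated probability for this case.
    """
    # Predicted value: the category with the highest probability.
    predicted = max(class_probs, key=class_probs.get)
    return {
        "predicted_value": predicted,
        "prob_of_predicted": class_probs[predicted],
        "prob_of_selected": class_probs[selected_value],
    }


def kth_nearest_neighbor(case, training, k=1, metric="euclidean"):
    """Return (ID, distance) of the kth nearest training case.

    training is a hypothetical dict mapping a case ID (case label or
    case number) to a tuple of numeric feature values.
    """
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the number of training cases")

    def distance(a, b):
        if metric == "euclidean":
            return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
        if metric == "cityblock":
            return sum(abs(x - y) for x, y in zip(a, b))
        raise ValueError("metric must be 'euclidean' or 'cityblock'")

    # Rank training cases from nearest to farthest, then pick the kth.
    ranked = sorted(
        (distance(case, features), case_id)
        for case_id, features in training.items()
    )
    dist, case_id = ranked[k - 1]
    return case_id, dist
```

With k set to 1, the second function corresponds to the Nearest neighbor and Distance to nearest neighbor entries; larger values of k correspond to the kth nearest neighbor entries, where k is the integer entered in the Value column.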
Field Name. Each selected scoring function saves a new field (variable) in the active dataset. You can use the default names or enter new names. If fields with those names already exist in the active dataset, they will be replaced. For information on field naming rules, see Variable names.
Value. Some scoring functions require a Value setting, such as a target value or an integer for k. See the descriptions of those scoring functions for details.