
AI_SEMANTIC_CLUSTER scalar function
The AI_SEMANTIC_CLUSTER function computes a semantic clustering score of a member argument against a set of clustering arguments.
The schema is SYSIBM.
The AI_SEMANTIC_CLUSTER function computes a clustering score using the value returned by member-expr among the cluster formed by the values returned by clustering-expression arguments. The machine learning model and machine learning model column is determined by the column in member-expr, or you can specify it in the column-name in the USING MODEL COLUMN clause.
- member-expr
-
An expression that specifies the value to be scored against the rest of the cluster. The value must be a built-in data type that is not a binary data type, a LOB data type, XML, ROWID, CHAR FOR BIT DATA, or VARCHAR FOR BIT DATA.
The member-expr determines which model and model column is used for the function:- If the member-expr is a standalone column reference and no model column is explicitly specified with the USING MODEL COLUMN clause, the standalone column is the model column that is used to evaluate the function, and the model is the model table associated with the table that the column belongs to.
- The USING MODEL COLUMN clause can be used to specify a column for the member-expr . The column named in the USING MODEL COLUMN clause is the model column, and the model is the model table associated with the table that the model column belongs to.
- Db2 must be able to determine a model column from the member-expr, either because the expression is a standalone column reference, or because a column is specified in the USING MODEL COLUMN clause.
- AI must be enabled for the table or view that the model is associated with and the model must be trained. All model columns must be included in the model.
The model column, either specified as a standalone column reference or with the USING MODEL COLUMN clause, may be a qualified name. The qualifier must not be a synonym name or a correlation name of a table expression. The qualifier must refer to a table name or a view name, or to an alias to a table name or view name.
If the model column is a numeric column as indicated during the training of the model, the value of the expression and the values of each clustering-expression are cast to FLOAT during execution of the function. If the value is outside of the range of FLOAT, the result of the function is a null value.
- clustering-expression
-
The values returned by the set of clustering-expressions form a cluster against which the member-expr is scored. The values must each be a built-in data type that is not a binary data type, a LOB data type, XML, ROWID, CHAR FOR BIT DATA, or VARCHAR FOR BIT DATA.
The model and model column of the clustering-expression argument is inferred to be the same as the model and model column of the member-expr.
Up to three clustering-expression arguments may be specified for AI_SEMANTIC_CLUSTER.
The result is a double-precision floating point number (FLOAT) between -1.0 and 1.0 that is the semantic clustering score. A larger positive result indicates a better clustering of member-expr among the cluster formed by clustering-expressions than a lower result.
The result can be null; if any argument is null, the result is the null value. If the arguments to the function contain values that were not seen during model training, and the model column is trained as categorical, the result is the null value.
Notes
- Configuration requirement
- SQL Data Insights must be configured in Db2 to use this function.
- AI_SEMANTIC_CLUSTER must not be specified in a CREATE VIEW statement
- AI_SEMANTIC_CLUSTER must not be specified in a CREATE VIEW statement, and must not be specified in the fullselect of a materialized-query-definition of a CREATE TABLE statement.
- AI_SEMANTIC_CLUSTER is not deterministic
- The function reads data from the table associated with the model. The AI_SEMANTIC_CLUSTER function is not deterministic. Training of the model is also not deterministic. The model may change slightly, even if trained again using the same data, which can cause small differences in the semantic cluster scores produced by the function which can affect the ordering of results.
Example
Customers with IDs '0280_XJGEX', '6467_CHFZW' and '0093_XWZFY' have all churned. If we form a semantic cluster of those three customers, find the top 5 customers that would belong in that cluster.
SELECT AI_SEMANTIC_CLUSTER(CUSTOMERID, '0280_XJGEX', '6467_CHFZW', '0093_XWZFY'), CHURN.*
FROM CHURN
ORDER BY 1 DESC
FETCH FIRST 5 ROWS ONLY
