<model name>_COLUMN_STATISTICS table
This table contains one row for each combination of cluster and input column.
The following table lists the columns of this table:
Column name | Data type | Description |
---|---|---|
CLUSTERID (primary key column) | INTEGER | Index value of the cluster in the cluster model. If CLUSTERID is 0, the row applies to all input records. |
COLUMNNAME (primary key column) | VARCHAR(128) | Name of an input column |
CARDINALITY | BIGINT | Number of distinct values. If the column is continuous, the value is NULL. |
MODE | VARCHAR(16000) | Most frequent discrete value in the column for CLUSTERID |
MINIMUM | DOUBLE | Minimum value. If the column is not numeric, the value is NULL. |
MAXIMUM | DOUBLE | Maximum value. If the column is not numeric, the value is NULL. |
MEAN | DOUBLE | Mean value. If the column is not numeric, the value is NULL. |
VARIANCE | DOUBLE | Unbiased sample variance. If the column is not numeric, the value is NULL. |
VALIDFREQ | BIGINT | Number of valid values |
MISSINGFREQ | BIGINT | Number of missing values |
INVALIDFREQ | BIGINT | Number of invalid values |
IMPORTANCE | DOUBLE | Normalized chi-square value. The value indicates whether the column distribution in the cluster is different from the overall column distribution. The normalized chi-square value is the factor by which the chi-square value differs from the chi-square value that is sufficient for 99.99% significance (considering degrees of freedom). |
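
As an illustration of how this table might be queried, the following minimal sketch lists the columns whose distribution in a given cluster differs most from the overall distribution. It rests on several assumptions that are not part of this reference: the model name `CUSTSEG` is hypothetical, an in-memory SQLite database stands in for the real database, all sample rows are made up, and the threshold of 1.0 reads the IMPORTANCE definition as a ratio to the 99.99% significance value.

```python
import sqlite3

# Hypothetical model name; the real table is named <model name>_COLUMN_STATISTICS.
TABLE = "CUSTSEG_COLUMN_STATISTICS"

# In-memory SQLite is only a stand-in so that the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(f"""
    CREATE TABLE {TABLE} (
        CLUSTERID   INTEGER NOT NULL,
        COLUMNNAME  VARCHAR(128) NOT NULL,
        CARDINALITY BIGINT,
        MODE        VARCHAR(16000),
        MINIMUM     DOUBLE,
        MAXIMUM     DOUBLE,
        MEAN        DOUBLE,
        VARIANCE    DOUBLE,
        VALIDFREQ   BIGINT,
        MISSINGFREQ BIGINT,
        INVALIDFREQ BIGINT,
        IMPORTANCE  DOUBLE,
        PRIMARY KEY (CLUSTERID, COLUMNNAME)
    )
""")

# Made-up rows for illustration only; they do not reflect real model output.
# CLUSTERID 0 holds the statistics over all input records.
rows = [
    (0, "AGE",    None, None,    18.0, 90.0, 45.2, 210.5, 1000, 5, 0, None),
    (0, "REGION", 4,    "NORTH", None, None, None, None,  1000, 5, 0, None),
    (1, "AGE",    None, None,    18.0, 35.0, 26.1, 20.3,  300,  1, 0, 3.7),
    (1, "REGION", 4,    "NORTH", None, None, None, None,  298,  3, 0, 0.4),
    (2, "AGE",    None, None,    40.0, 90.0, 61.8, 95.0,  700,  4, 0, 2.1),
    (2, "REGION", 3,    "SOUTH", None, None, None, None,  702,  2, 0, 5.2),
]
conn.executemany(
    f"INSERT INTO {TABLE} VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
)


def distinguishing_columns(conn, cluster_id, threshold=1.0):
    """Return (COLUMNNAME, IMPORTANCE) pairs for one cluster whose IMPORTANCE
    exceeds the threshold, ordered from most to least important.

    The default threshold of 1.0 is an assumption, not a documented cutoff.
    """
    cur = conn.execute(
        f"SELECT COLUMNNAME, IMPORTANCE FROM {TABLE} "
        "WHERE CLUSTERID = ? AND IMPORTANCE > ? "
        "ORDER BY IMPORTANCE DESC",
        (cluster_id, threshold),
    )
    return cur.fetchall()


for cluster_id in (1, 2):
    print(cluster_id, distinguishing_columns(conn, cluster_id))
```

Against a real model, only the `SELECT` statement would carry over, with the table name replaced by the actual `<model name>_COLUMN_STATISTICS` table and the connection supplied by the database driver in use.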