IDAX.NAIVEBAYES - Build a Naive Bayes model

Use this stored procedure to build a probabilistic Naive Bayes classification model.

Authorities

The privileges held by the authorization ID of the statement must include the IDAX_USER role.

Syntax

IDAX.NAIVEBAYES(in parameter_string varchar(32672))

Parameter descriptions

parameter_string
Mandatory single-string parameter that contains <parameter>=<value> pairs that are separated by commas.
Data type: VARCHAR(32672)
The following list shows the parameter values:
model
Mandatory.
The name of the Naive Bayes model that is to be built.
Data type: VARCHAR(64)
intable
Mandatory.
The name of the input table.
Data type: VARCHAR(128)
id
Mandatory.
The column of the input table that identifies a unique instance ID.
Data type: VARCHAR(128)
target
Mandatory.
The column of the input table that represents the class.
Data type: VARCHAR(128)
incolumn
Optional.
The columns of the input table that have specific properties. The column entries are separated by semicolons (;).
Each column is followed by one or more of the following properties:
  • By type nominal (:nom) or by type continuous (:cont). By default, numerical types are continuous, and all other types are nominal.
  • By role :id, :target, :input, or :ignore.
If this parameter is not specified, all columns of the input table have default properties.
Default: none
Data type: VARCHAR(32000)
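The following Python sketch shows one way to assemble an incolumn value from a mapping of column names to properties. It is an illustration only: the helper name build_incolumn and the column names are hypothetical, and the sketch assumes that each property is appended to the column name with a colon and that several properties can be chained in that way.

```python
# Hypothetical helper; not part of the IDAX API.
ALLOWED_PROPERTIES = {"nom", "cont", "id", "target", "input", "ignore"}


def build_incolumn(column_properties):
    """Join {column: [properties]} into a string such as 'AGE:cont;NOTES:ignore'."""
    parts = []
    for column, properties in column_properties.items():
        if not properties:
            raise ValueError(f"column {column!r} needs at least one property")
        for prop in properties:
            if prop not in ALLOWED_PROPERTIES:
                raise ValueError(f"unsupported property {prop!r} for column {column!r}")
        # Assumption: properties are chained as column:prop1:prop2.
        parts.append(column + "".join(f":{prop}" for prop in properties))
    # Column entries are separated by semicolons.
    return ";".join(parts)


# Hypothetical column names.
print(build_incolumn({"AGE": ["cont"], "ZIPCODE": ["nom"], "NOTES": ["ignore"]}))
# prints AGE:cont;ZIPCODE:nom;NOTES:ignore
```

Validating the properties before the call is a convenience; the stored procedure itself decides which combinations it accepts.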
coldeftype
Optional.
The default type of the input table columns.
Allowed values are nom and cont.
If the parameter is not specified, numeric columns are continuous, and all other columns are nominal.
Default: none
Data type: VARCHAR(4)
coldefrole
Optional.
The default role of the input table columns.
Allowed values are input and ignore.
If the parameter is not specified, all columns are input columns.
Default: input
Data type: VARCHAR(6)
colPropertiesTable
Optional.
The table that stores the properties of the input table columns.
If this parameter is not specified, the properties of the input table columns are detected automatically.
Default: none
Data type: VARCHAR(128)
disc
Optional.
Determines the automatic discretization of all continuous attributes.
Allowed values are ef, em, ew, and ewn.
disc=ef
Equal-frequency discretization.
An unsupervised discretization algorithm that uses the equal frequency criterion for interval bound setting.
disc=em
Minimal entropy discretization.
A supervised discretization algorithm that uses the minimal entropy criterion, evaluated against the class values, for interval bound setting.
disc=ew
Default.
Equal-width discretization.
An unsupervised discretization algorithm that uses the equal width criterion for interval bound setting. It calculates the width w of the bins for a column as w = (vmax - vmin)/k, where:
vmax
The maximum value for the column.
vmin
The minimum value for the column.
k
The number of discretization bins that are requested for the column.
The discretization bin limits are placed at vmin + i * w, for i = 1, ..., k-1.
disc=ewn
Equal-width discretization with nice bin limits.
An unsupervised discretization algorithm that first calculates the width of the bins for a column as 6 * stddev / k, where:
stddev
The standard deviation for the column values.
k
The number of discretization bins that are requested for the column.
It then makes the width nice by replacing it with the nearest value of 1, 2, 2.5, or 5 times a power of ten. Finally, it sets the bin limits around the column mean value so that the bin limits become multiples of the nice width. Note that the resulting number of bins might differ from the requested number of bins.
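The arithmetic behind disc=ew and disc=ewn can be sketched in Python as follows. This is an illustration of the formulas above, not the procedure's implementation. The function names are hypothetical, "nearest" is assumed to mean the smallest absolute difference, and the placement of the nice limits within three standard deviations of the mean is an assumption, because the description states only that the limits are multiples of the nice width around the mean.

```python
import math


def equal_width_limits(vmin, vmax, k):
    """Inner bin limits vmin + i*w for i = 1..k-1, with w = (vmax - vmin)/k."""
    w = (vmax - vmin) / k
    return [vmin + i * w for i in range(1, k)]


def nice_width(stddev, k):
    """Replace 6*stddev/k by the nearest of 1, 2, 2.5, or 5 times a power of ten."""
    raw = 6 * stddev / k
    exponent = math.floor(math.log10(raw))
    # Candidates from the decade of the raw width and from the next decade.
    candidates = [
        m * 10.0 ** e for e in (exponent, exponent + 1) for m in (1, 2, 2.5, 5)
    ]
    # Assumption: "nearest" means the smallest absolute difference.
    return min(candidates, key=lambda c: abs(c - raw))


def nice_limits(mean, stddev, k):
    """Multiples of the nice width within mean +/- 3*stddev (assumed range)."""
    w = nice_width(stddev, k)
    first = math.ceil((mean - 3 * stddev) / w)
    last = math.floor((mean + 3 * stddev) / w)
    return [i * w for i in range(first, last + 1)]


print(equal_width_limits(0, 100, 4))  # prints [25.0, 50.0, 75.0]
print(nice_width(12, 10))             # raw width 7.2, prints 5.0
print(len(nice_limits(40, 12, 10)))   # prints 15
```

In the last call, 10 bins are requested but the nice width of 5 yields 15 limits, which illustrates why the resulting number of bins can differ from the requested number.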
bins
Optional.
The number of bins for numeric columns.
Default: 10
Data type: INTEGER
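As an illustration of how the parameters combine, the following Python sketch assembles a parameter string and the corresponding CALL statement. The helper names and the model, table, and column names are hypothetical. The sketch only builds the statement text and does not connect to a database.

```python
MAX_PARAMETER_LENGTH = 32672  # from the VARCHAR(32672) data type of parameter_string


def build_parameter_string(**params):
    """Join keyword arguments into comma-separated <parameter>=<value> pairs."""
    text = ",".join(f"{name}={value}" for name, value in params.items())
    if len(text) > MAX_PARAMETER_LENGTH:
        raise ValueError("parameter string exceeds VARCHAR(32672)")
    return text


def build_call(parameter_string):
    """Wrap the parameter string in a CALL statement for IDAX.NAIVEBAYES."""
    # Single quotes inside an SQL string literal are doubled.
    escaped = parameter_string.replace("'", "''")
    return f"CALL IDAX.NAIVEBAYES('{escaped}')"


# Hypothetical model, table, and column names.
parameter_string = build_parameter_string(
    model="CUSTOMER_CHURN_NB",
    intable="CUSTOMER_CHURN",
    id="CUST_ID",
    target="CHURN",
    disc="ew",
    bins=20,
)
print(build_call(parameter_string))
```

Only model, intable, id, and target are mandatory; disc and bins are included here to show how optional parameters are appended to the same string.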

Returned information

The numbers of column statistics, value statistics, and class statistics of the Naive Bayes model are returned as a result set.