Use this stored procedure to build a probabilistic Naive Bayes classification
model.
Authorities
The privileges held by the authorization ID of the statement must include the IDAX_USER role.
Syntax
IDAX.NAIVEBAYES(in parameter_string varchar(32672))
Parameter descriptions
- parameter_string
- Mandatory. A single string that contains <parameter>=<value> pairs that are separated by
commas.
- Data type: VARCHAR(32672)
- The following list shows the parameter values:
- model
- Mandatory.
- The name of the Naive Bayes model that is to be built.
- Data type: VARCHAR(64)
- intable
- Mandatory.
- The name of the input table.
- Data type: VARCHAR(128)
- id
- Mandatory.
- The column of the input table that identifies a unique instance ID.
- Data type: VARCHAR(128)
- target
- Mandatory.
- The column of the input table that represents the class.
- Data type: VARCHAR(128)
- incolumn
- Optional.
- The input table columns that have specific properties. The column entries are separated by a
semicolon (;).
- Each column is followed by one or more of the following properties:
- By type: nominal (:nom) or continuous (:cont). By default, numerical types are continuous, and all
other types are nominal.
- By role: :id, :target, :input, or :ignore.
- If this parameter is not specified, all columns of the input table have default properties.
- Default: none
- Data type: VARCHAR(32000)
- coldeftype
- Optional.
- The default type of the input table columns.
- Allowed values are nom and cont.
- If the parameter is not specified, numeric columns are continuous, and all other columns are
nominal.
- Default: none
- Data type: VARCHAR(4)
- coldefrole
- Optional.
- The default role of the input table columns.
- Allowed values are input and ignore.
- If the parameter is not specified, all columns are input columns.
- Default: input
- Data type: VARCHAR(6)
- colPropertiesTable
- Optional.
- The table that stores the properties of the input table columns.
- If this parameter is not specified, the column properties of the input table are detected
automatically.
- Default: none
- Data type: VARCHAR(128)
- disc
- Optional.
- Determines the automatic discretization of all continuous attributes.
- Allowed values are ef, em, ew, and ewn.
- disc=ef
- Equal-frequency discretization.
- An unsupervised discretization algorithm that uses the equal frequency criterion for interval
bound setting.
- disc=em
- Minimal entropy discretization.
- A supervised discretization algorithm that uses the minimal entropy criterion, evaluated against
the class column, for interval bound setting.
- disc=ew
- Default.
- Equal-width discretization.
- An unsupervised discretization algorithm that uses the equal width criterion for interval bound
setting. It calculates the width of the bins for a column as
w = (vmax - vmin)/k, where:
- vmax
- The maximum value for the column.
- vmin
- The minimum value for the column.
- k
- The number of discretization bins that are requested for the column.
The discretization bin limits are placed at vmin + i * w, where w is the bin width, for
i = 1, ..., k-1.
- disc=ewn
- Equal-width discretization with nice bin limits.
- An unsupervised discretization algorithm that first calculates the width of the bins for a
column as 6 * stddev / k, where:
- stddev
- The standard deviation for the column values.
- k
- The number of discretization bins that are requested for the column.
It then makes the width nice by replacing it with the nearest value of 1, 2, 2.5, or
5 times a power of ten. Finally, it sets the bin limits around the column mean value so that the bin
limits become multiples of the nice width. Note that the resulting number of bins might differ from
the requested number of bins.
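The two equal-width calculations described above can be sketched in a few lines of Python. This is an illustration only, not the procedure's implementation: the function names equal_width_limits and nice_width are hypothetical, "nearest" is assumed to mean the smallest absolute difference, and the final step of aligning the ewn bin limits around the column mean is not reproduced.

```python
import math


def equal_width_limits(vmin, vmax, k):
    """Interior bin limits for disc=ew: vmin + i * w for i = 1, ..., k-1."""
    if k < 1:
        raise ValueError("k must be at least 1")
    w = (vmax - vmin) / k  # bin width as given in the description
    return [vmin + i * w for i in range(1, k)]


def nice_width(stddev, k):
    """Bin width for disc=ewn: 6 * stddev / k, replaced by the nearest
    value of 1, 2, 2.5, or 5 times a power of ten.

    Assumption: "nearest" is measured by absolute difference.
    """
    if stddev <= 0 or k < 1:
        raise ValueError("stddev must be positive and k at least 1")
    raw = 6 * stddev / k
    exponent = math.floor(math.log10(raw))
    # Candidates in the decade of the raw width and in the next decade up.
    candidates = [
        m * 10.0 ** e
        for e in (exponent, exponent + 1)
        for m in (1, 2, 2.5, 5)
    ]
    return min(candidates, key=lambda c: abs(c - raw))


print(equal_width_limits(0, 100, 4))  # [25.0, 50.0, 75.0]
print(nice_width(10, 10))             # raw width 6.0 becomes 5.0
```

In the second call the raw width is 6.0, which lies closer to 5 than to 10, so the bins become 5 units wide. Because the width changes, the number of bins that cover the data can differ from the requested 10.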
- bins
- Optional.
- The number of bins for numeric columns.
- Default: 10
- Data type: INTEGER
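As an illustration of how the parameter string is assembled, the following Python sketch joins <parameter>=<value> pairs with commas and wraps them in a CALL statement. The helper build_naivebayes_call and all model, table, and column names are hypothetical; only the parameter names, the mandatory parameters, and the VARCHAR(32672) limit are taken from the descriptions above.

```python
MANDATORY = ("model", "intable", "id", "target")
MAX_LENGTH = 32672  # limit of the VARCHAR(32672) parameter_string


def build_naivebayes_call(params):
    """Return a CALL IDAX.NAIVEBAYES statement for the given parameters.

    Hypothetical helper: it only builds the SQL text and does not run it.
    """
    missing = [name for name in MANDATORY if name not in params]
    if missing:
        raise ValueError("missing mandatory parameters: " + ", ".join(missing))
    # Pairs of <parameter>=<value>, separated by a comma.
    parameter_string = ",".join(f"{name}={value}" for name, value in params.items())
    if len(parameter_string) > MAX_LENGTH:
        raise ValueError("parameter string exceeds 32672 characters")
    # Double any single quotes so the value is a valid SQL string literal.
    return "CALL IDAX.NAIVEBAYES('" + parameter_string.replace("'", "''") + "')"


# All names below are made up for the example.
example = {
    "model": "CUSTOMER_NB",
    "intable": "CUSTOMER_TRAIN",
    "id": "CUST_ID",
    "target": "CHURN",
    "incolumn": "AGE:cont;REGION:nom",
    "disc": "ew",
    "bins": 10,
}
print(build_naivebayes_call(example))
```

The incolumn value shows the semicolon-separated column list with a type property on each column. The optional parameters can be left out, in which case the defaults listed above apply.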
Returned information
The number of column statistics, value statistics, and class statistics of the Naive Bayes model
as a result set.