Use this stored procedure to calculate the limits of equal-frequency discretization bins
for numeric columns. Each bin contains approximately the same number of data records.
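To illustrate what equal-frequency limits mean, the following Python sketch computes cut points for a small in-memory list. It is not the algorithm that IDAX.EFDISC uses internally: the function names and the sample data are hypothetical, and the cut points come from the standard library's quantile function.

```python
from bisect import bisect_right
from statistics import quantiles


def equal_frequency_limits(values, bins):
    """Return bins - 1 cut points that split values into roughly equal-sized bins.

    Illustration only: uses statistics.quantiles, not the IDAX implementation.
    """
    return quantiles(values, n=bins, method="inclusive")


def records_per_bin(values, limits):
    """Count how many values fall into each bin defined by the cut points."""
    ordered = sorted(values)
    # Index of the first value above each limit, plus the end of the data.
    edges = [bisect_right(ordered, limit) for limit in limits] + [len(ordered)]
    return [hi - lo for lo, hi in zip([0] + edges, edges)]


values = list(range(1, 101))  # 100 hypothetical records
limits = equal_frequency_limits(values, 4)
counts = records_per_bin(values, limits)
print(limits)
print(counts)
```

With 100 distinct values and 4 bins, the three cut points leave 25 records in each bin. Real data with many duplicate values generally cannot be split this evenly, which is why the procedure accepts a tolerance.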
Authorization
The privileges held by the authorization ID of the statement must include the IDAX_USER role.
Syntax
IDAX.EFDISC(in parameter_string varchar(32672))
Parameter descriptions
- parameter_string
- Mandatory single-string parameter that contains
<parameter>=<value> pairs separated by commas.
- Data type: VARCHAR(32672)
- The following list shows the parameter values:
- intable
- Mandatory.
- The name of the input table.
- Data type: VARCHAR(ANY)
- outtable
- Mandatory.
- The name of the output table where the limits for the discretization bins are to be stored.
- Data type: VARCHAR(ANY)
- incolumn
- Mandatory.
- The columns of the input table, separated by semicolons (;).
- Each column name can be followed by a colon (:) and the number of discretization bins
to calculate for that column.
- Data type: VARCHAR(ANY)
- bins
- Optional.
- The default number of discretization bins to calculate for columns that do not specify their own number in incolumn.
- Default: 10
- Data type: INTEGER
- binprec
- Optional.
- The precision for an even distribution of data records in the calculated discretization bins.
- The number of data records in each bin must be within
[iw-<binprec>*iw, iw+<binprec>*iw], where iw is the size of
the input table divided by the number of requested limits for discretization bins.
- Min: 0
- Max: 1
- Default: 0.1
- Data type: DOUBLE
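The binprec interval can be worked through with a short Python sketch. The function name and the row count are hypothetical, and the sketch assumes that the divisor in the definition of iw equals the requested number of bins; the parameter description does not state this explicitly.

```python
def bin_size_bounds(table_rows, requested, binprec=0.1):
    """Return the (lower, upper) record count allowed per bin.

    Assumption: `requested` is the divisor that defines iw, taken here
    to be the requested number of bins for the column.
    """
    if not 0 <= binprec <= 1:
        raise ValueError("binprec must be between 0 and 1")
    iw = table_rows / requested  # ideal number of records per bin
    return iw - binprec * iw, iw + binprec * iw


# Hypothetical input table with 1000 rows, 8 bins, binprec=0.2
lower, upper = bin_size_bounds(1000, 8, binprec=0.2)
print(lower, upper)
```

Under these assumptions iw is 125, so each bin may hold between about 100 and 150 records. A binprec of 0 requires exactly iw records in every bin, and a binprec of 1 allows anything from 0 to 2*iw.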
Returned information
INTEGER: the number of columns for which the limits for the discretization bins are
calculated.
Example
call IDAX.EFDISC('intable=SAMPLES.CUSTOMER_CHURN, outtable=CUSTOMER_CHURN_BINS, incolumn=AVG_SPENT_RETAIN_PM:4;ANNUAL_REVENUE_MIL, bins=8, binprec=0.2');
select COLNAME, BREAK from CUSTOMER_CHURN_BINS;
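The example combines per-column bin counts with the bins default. The following Python sketch shows how the incolumn value would resolve under the parameter descriptions above. It is a reading of the documented syntax, not the procedure's own parser, and the function name is hypothetical.

```python
def resolve_bins(incolumn, default_bins=10):
    """Map each column in an incolumn value to its number of bins.

    Columns are separated by semicolons; an optional ":<n>" suffix
    overrides the default bin count for that column.
    """
    result = {}
    for entry in incolumn.split(";"):
        name, sep, count = entry.strip().partition(":")
        result[name] = int(count) if sep else default_bins
    return result


spec = resolve_bins("AVG_SPENT_RETAIN_PM:4;ANNUAL_REVENUE_MIL", default_bins=8)
print(spec)  # {'AVG_SPENT_RETAIN_PM': 4, 'ANNUAL_REVENUE_MIL': 8}
```

In the call above, AVG_SPENT_RETAIN_PM is therefore split into 4 bins, while ANNUAL_REVENUE_MIL falls back to the value of bins, which is 8.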