IDAX.EFDISC - Discretization bins of equal frequency

Use this stored procedure to calculate the limits of equal-frequency discretization bins for numeric columns. Each bin contains approximately the same number of data records.
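As an illustration of what equal-frequency limits are, the following minimal Python sketch computes cut points so that each bin holds about the same number of values. It uses the standard library function statistics.quantiles. The helper names equal_frequency_limits and bin_counts are hypothetical, and the sketch does not claim to reproduce the algorithm or the exact limits that IDAX.EFDISC calculates.

```python
import bisect
import statistics


def equal_frequency_limits(values, bins):
    """Return bins - 1 cut points that split values into equally filled bins.

    Illustrative only: IDAX.EFDISC may choose its limits differently.
    """
    return statistics.quantiles(values, n=bins)


def bin_counts(values, limits):
    """Count how many values fall into each bin defined by the sorted limits."""
    counts = [0] * (len(limits) + 1)
    for v in values:
        # bisect_right gives the index of the first limit greater than v,
        # which is the index of the bin that v belongs to.
        counts[bisect.bisect_right(limits, v)] += 1
    return counts


data = list(range(1, 101))            # 100 hypothetical numeric values
limits = equal_frequency_limits(data, 4)
counts = bin_counts(data, limits)     # each of the 4 bins receives 25 values
```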

Authorization

The privileges held by the authorization ID of the statement must include the IDAX_USER role.

Syntax

IDAX.EFDISC(in parameter_string varchar(32672))

Parameter descriptions

parameter_string
Mandatory single-string parameter that contains <parameter>=<value> pairs separated by commas.
Data type: VARCHAR(32672)
The following list describes the parameters:
intable
Mandatory.
The name of the input table.
Data type: VARCHAR(ANY)
outtable
Mandatory.
The name of the output table where the limits of the discretization bins are stored.
Data type: VARCHAR(ANY)
incolumn
Mandatory.
The columns of the input table for which limits are calculated, separated by semicolons (;).
Each column name can be followed by a colon (:) and the number of discretization bins that are to be calculated for this column.
Data type: VARCHAR(ANY)
bins
Optional.
The default number of discretization bins to calculate for columns that do not specify their own number of bins in incolumn.
Default: 10
Data type: INTEGER
binprec
Optional.
The precision for an even distribution of data records in the calculated discretization bins.
The number of data records in each bin must be within [iw - <binprec>*iw, iw + <binprec>*iw], where iw is the size of the input table divided by the number of requested limits for discretization bins.
Min: 0
Max: 1
Default: 0.1
Data type: DOUBLE
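The binprec interval can be checked with a short calculation. The following Python sketch evaluates [iw - binprec*iw, iw + binprec*iw] for a hypothetical table of 1000 rows. It assumes that the "number of requested limits" equals the number of requested bins. The function name bin_size_range and the row count are illustrative and are not part of the procedure.

```python
def bin_size_range(table_rows, bins=10, binprec=0.1):
    """Return the (low, high) record count allowed per bin.

    Assumption: iw = table_rows / bins, that is, the divisor in the
    documented formula is taken to be the number of requested bins.
    """
    if not 0 <= binprec <= 1:
        raise ValueError("binprec must be between 0 and 1")
    iw = table_rows / bins            # ideal number of records per bin
    return iw - binprec * iw, iw + binprec * iw


# Hypothetical input: 1000 rows, 8 bins, binprec=0.2 gives iw = 125,
# so each bin may hold roughly 100 to 150 records.
low, high = bin_size_range(1000, bins=8, binprec=0.2)
```

A smaller binprec narrows the interval and requires a more even distribution. With binprec=0, every bin would have to hold exactly iw records.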

Returned information

INTEGER: the number of columns for which the limits of the discretization bins are calculated.

Example

call IDAX.EFDISC('intable=SAMPLES.CUSTOMER_CHURN, outtable=CUSTOMER_CHURN_BINS, incolumn=AVG_SPENT_RETAIN_PM:4;ANNUAL_REVENUE_MIL, bins=8, binprec=0.2');
select COLNAME, BREAK from CUSTOMER_CHURN_BINS;
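
In this call, AVG_SPENT_RETAIN_PM requests 4 bins explicitly, whereas ANNUAL_REVENUE_MIL has no suffix and therefore falls back to bins=8. The following Python sketch shows this reading of the incolumn syntax. The function resolve_bins is hypothetical, is not the parser of the procedure, and does not handle quoted identifiers that contain a colon or semicolon.

```python
def resolve_bins(incolumn, default_bins=10):
    """Map each column in an incolumn string to its requested bin count.

    Entries are separated by ';'. An optional ':<n>' suffix overrides
    the default number of bins for that column.
    """
    requested = {}
    for entry in incolumn.split(";"):
        entry = entry.strip()
        if not entry:
            continue                   # ignore empty entries such as a trailing ';'
        name, sep, count = entry.partition(":")
        requested[name.strip()] = int(count) if sep else default_bins
    return requested


requested = resolve_bins("AVG_SPENT_RETAIN_PM:4;ANNUAL_REVENUE_MIL", default_bins=8)
```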