Use this stored procedure to
build a PCA model.
Authorization
The privileges held by the authorization ID of the statement must include the IDAX_USER role.
To create a PCA model, ensure that you meet the following prerequisites:
- The input columns must contain only continuous columns.
- The input columns must contain at least two continuous columns.
- The number of input columns must not be greater than 78.
- The input table must not be empty.
Syntax
IDAX.PCA(in parameter_string varchar(32672))
Parameter descriptions
- parameter_string
- Mandatory one-string parameter that contains pairs of
<parameter>=<value> entries that are separated by a comma.
- Data type: VARCHAR(32672)
- The following list shows the parameter values:
-
- model
- Mandatory.
- The name of the PCA model that is to be built.
- Data type: VARCHAR(ANY)
- intable
- Mandatory.
- The name of the input table.
- Data type: VARCHAR(ANY)
- id
- Mandatory.
- The column of the input table that identifies a sequence ID.
- Data type: VARCHAR(ANY)
- incolumn
- Optional.
- The columns of the input table that have specific properties, which are separated by a
semi-colon (;).
- If this parameter is not specified, all columns of the input table have default properties.
- Default: none
- Data type: VARCHAR(ANY)
- incolumn
- Optional.
- The columns of the input table that have specific properties, which are separated by a
semi-colon (;).
- Each column is succeeded by one or more of the following properties:
- By type nominal (":nom", ":nominal") or by type continuous (":cont",
":continuous"). By default, numerical types are continuous, and all other types are nominal.
- By type nominal (
:nom
) or by type continuous (:cont
). By
default, numerical types are continuous, and all other types are nominal.
- By role id (":id"), target (":target"), input (":active", ":in", ":input")
or ignore (":ignore", ":inactive").
- By role
:id
, :target
, :input
, or
:ignore
.
- If this parameter is not specified, all columns of the input table have default properties.
- Default: none
- Data type: VARCHAR(ANY)
- coldeftype
- Optional.
- The default type of the input table columns.
- Allowed values are
nom
and cont
.
- Allowed values are:
- "nom" and "nominal" for type nominal
- "cont" and "continuous" for type continuous
- If the parameter is not specified, all columns are continuous columns.
- Default: none
- Data type: VARCHAR(ANY)
- Data type: VARCHAR(10)
- coldefrole
- Optional.
- The default role of the input table columns.
- Allowed values are
input
and ignore
.
- Allowed values are:
- "active", "in", and "input" for role input
- "ignore" and "inactive" for role ignore
- If the parameter is not specified, all columns are input columns.
- Default: input
- Data type: VARCHAR(ANY)
- Data type: VARCHAR(8)
- colPropertiesTable
- Optional.
- The input table where properties of the columns of the input table are stored in the same format
as the output format of the IDAX.COLUMN_PROPERTIES() stored procedure.
- If this parameter is not specified, the column properties of the input table column properties
are detected automatically.
-
Restriction: The following tables are not supported:
- colPropertiesTable with "COLROLE" column with value 'objweight'
- colPropertiesTable with "COLWEIGHT" column with value '<wgt>'
- Default: none
- Data type: VARCHAR(ANY)
- forceeigensolve
- Optional.
- A flag that indicates whether the eigenvalue decomposition is to be
forced when the model is calculated which might be marginally faster that the standard singular
value decomposition.
- This parameter value is ignored for the Spark implementation of PCA. It is,
however, still accepted as a parameter to be compatible with the previous SQL-based version of
PCA.
- Allowed values are 'true' and 'false'.
- Setting this flag to true might be marginally faster than the standard
singular value decomposition. If the input table is a singular matrix, setting the flag to true does
not produce a result.
- Set this flag to true only if the modeling process for an input table
seems to be slow or seems to produce wrong results.
- Default: false
- Data type: VARCHAR(ANY)
- centerdata
- Optional.
- A flag that indicates whether the input values are to be shifted to zero-centured.
- Allowed values are 'true' and 'false'.
- Default: true
- Data type: VARCHAR(ANY)
- scaledata
- Optional.
- A flag that indicates whether unit variance scaling is to be automatically applied for the input
values.
- Allowed values are 'true' and 'false'.
- Default: false
- Data type: VARCHAR(ANY)
- savescores
- Optional.
- A flag that indicates whether the score on each PCA is to be calculated and stored in the PCA
model.
- Allowed values are 'true' and 'false'.
- Default: false
- Data type: VARCHAR(ANY)
Returned information
A result set that contains the string 't' (for 'true') if the PCA model was built successfully.
If the PCA model was not built successfully, an error message is shown.
Example
CALL IDAX.PCA('model=customer_churn_pca, intable=customer_churn_train, id=cust_id, coldefrole=ignore,
incolumn=DURATION;AVG_SPENT_RETAIN_PM;AVG_SQ_SPENT_RETAIN_PM;ANNUAL_REVENUE_MIL;TOTAL_EMPLOYEES;TOTAL_BUY;TOTAL_BUY_FREQ;TOTAL_BUY_FREQ_SQ,
centerData=true, scaleData=true, forceEigenSolve=false, saveScores=true');