Functions for PCA
The PCA algorithm is implemented in the PCA and PROJECT_PCA stored procedures. To print PCA models, use the PRINT_MODEL stored procedure.
The PCA algorithm transforms the input table, which contains observations in rows and predictors in columns, into a matrix A. Matrix A is then decomposed by using singular value decomposition (SVD) or eigenvalue decomposition to find its eigenvectors. These eigenvectors are stored in the corresponding PCA model.
If matrix A is ill-conditioned, for example, if it is singular, the eigenvalue decomposition does not return a result. In this case, use SVD for the computation instead.
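The relationship between the two decompositions can be sketched outside the database. The following Python example is only an illustration, not the implementation of the stored procedures. It assumes NumPy, and it assumes that matrix A is the covariance matrix of the centered input data, which is not stated here. The function names pca_eig and pca_svd are hypothetical.

```python
import numpy as np

def pca_eig(X):
    """Principal components from the eigenvalue decomposition of the covariance matrix."""
    Xc = X - X.mean(axis=0)                   # center each predictor column
    cov = Xc.T @ Xc / (X.shape[0] - 1)        # assumed form of matrix A
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # reorder so the largest variance comes first
    return eigvals[order], eigvecs[:, order]

def pca_svd(X):
    """The same components from the SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    # Squared singular values, scaled by n - 1, are the covariance eigenvalues.
    return s**2 / (X.shape[0] - 1), vt.T

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                  # 50 observations, 3 predictors
vals_eig, vecs_eig = pca_eig(X)
vals_svd, vecs_svd = pca_svd(X)
```

On well-conditioned data, both routes yield the same variances and the same eigenvectors up to sign. The SVD route works on the centered data directly and does not form the covariance matrix, which is one reason it is commonly preferred for ill-conditioned input.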
All stored procedures take a single mandatory string parameter that contains <parameter>=<value> entries. The entries are separated by commas. The data type of the parameter is VARCHAR(any).
Valid <parameter>=<value> entries are listed in the parameter descriptions for each stored procedure.
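The shape of the parameter string can be illustrated with a small Python sketch that splits such a string into its entries. This is not how the database parses the string. The function name parse_params is hypothetical, the parameter names and values in the example are placeholders, and the sketch assumes that values contain no commas or quotation marks.

```python
def parse_params(param_string):
    """Split 'name=value, name=value' into a dictionary (illustrative only)."""
    params = {}
    for entry in param_string.split(","):
        entry = entry.strip()
        if not entry:
            continue                          # tolerate a trailing comma
        name, sep, value = entry.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"malformed entry: {entry!r}")
        params[name.strip()] = value.strip()
    return params

# Placeholder names and values; see the parameter descriptions for valid entries.
example = parse_params("model=customer_pca, intable=customer_churn, id=cust_id")
```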
How NULL values are handled
The handling of NULL values depends on where they are encountered during the computation of the principal components.
NULL values in the input table are handled as described in the following list.
- NULL values in the id column
  - The id column of the input table is scanned for NULL values and duplicates. If any are found, the algorithm stops and an error message is shown.
  - This handling applies to the IDAX.PCA and IDAX.PROJECT_PCA stored procedures.
- NULL values in any of the input columns
  - NULL values in input columns are not valid. Any row in which one or more input columns contain a NULL value is ignored. No error message is shown.
  - For the IDAX.PCA and IDAX.PROJECT_PCA stored procedures, the input table is considered empty if every row of the input data set contains a NULL value. In this case, an error message is shown.
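The NULL handling rules can be summarized in a short Python sketch. It is a hypothetical illustration of the rules as described, not the code of the stored procedures. Rows are modeled as dictionaries, NULL is modeled as None, and the function name prepare_rows and the error texts are assumptions.

```python
def prepare_rows(rows, id_col, input_cols):
    """Apply the described NULL rules and return the rows that would be used."""
    ids = [row[id_col] for row in rows]
    # NULL values or duplicates in the id column stop the algorithm with an error.
    if any(value is None for value in ids):
        raise ValueError("id column contains NULL values")
    if len(set(ids)) != len(ids):
        raise ValueError("id column contains duplicates")
    # Rows with a NULL value in any input column are ignored without an error.
    kept = [row for row in rows
            if all(row[col] is not None for col in input_cols)]
    # If no row remains, the input table is treated as empty and an error is raised.
    if not kept:
        raise ValueError("input table is empty")
    return kept

rows = [
    {"id": 1, "x": 1.0, "y": 2.0},
    {"id": 2, "x": None, "y": 3.0},   # ignored: NULL in an input column
    {"id": 3, "x": 4.0, "y": 5.0},
]
kept = prepare_rows(rows, "id", ["x", "y"])
```

In this example, the rows with the ids 1 and 3 remain, and the row with the id 2 is dropped silently.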