Importing data with the CLI LOAD utility in CLI applications

The CLI LOAD functionality provides an interface to the IBM® Db2® LOAD utility from CLI. This functionality allows you to insert data in CLI using LOAD instead of array insert. This option can yield significant performance benefits when large amounts of data need to be inserted. Because this interface invokes LOAD, the same consideration given for using LOAD should also be taken into account when using the CLI LOAD interface.

Before you begin

Before importing data with the CLI LOAD utility, ensure you have initialized your CLI application.
Note: The CLI LOAD interface to the IBM Db2 LOAD utility is not supported when accessing an Informix® database server.

About this task

Note: Starting from Version 9.7, Fix Pack 4, this feature can also be used with the CLI async processing feature.
  • Unlike the IBM Db2 LOAD utility, the CLI LOAD utility does not load data directly from an input file. Instead, if required, the application should retrieve the data from the input file and insert it into the appropriate application parameters that correspond to the parameter markers in the prepared statement.
  • If the prepared SQL statement for inserting data contains a SELECT clause, parameter markers are not supported.
  • The prepared SQL statement for inserting data must include parameter markers for all columns in the target table, unless a fullselect is used instead of the VALUES clause in the INSERT statement.
  • The insertion of data is non-atomic because the load utility precludes atomicity. LOAD might not be able to successfully insert all the rows passed to it. For example, if a unique key constraint is violated by a row being inserted, LOAD will not insert this row but will continue loading the remaining rows.
  • A COMMIT will be issued by LOAD. Therefore, if the insertion of the data completes successfully, the LOAD and any other statements within the transaction cannot be rolled back.
  • The error reporting for the CLI LOAD interface differs from that of array insert. Non-severe errors or warnings, such as errors with specific rows, will only appear in the LOAD message file.

Procedure

To import data using the CLI LOAD utility:

  1. Specify the statement attribute SQL_ATTR_USE_LOAD_API in SQLSetStmtAttr() with one of the following supported values:
    SQL_USE_LOAD_INSERT
    Use the LOAD utility to append to existing data in the table in instance-based client and use External table interface in instance-less client.
    SQL_USE_LOAD_REPLACE
    Use the LOAD utility to replace existing data in the table in the instance-based client. The instance-less client does not support this value.
    SQL_USE_LOAD_WITH_ET
    Use External table interface to append to existing data in the table in both instance-based and instance-less client.
    For example, the following call indicates that the CLI LOAD utility will be used to add to the existing data in the table:
    SQLSetStmtAttr (hStmt, SQL_ATTR_USE_LOAD_API, 
                    (SQLPOINTER) SQL_USE_LOAD_INSERT, 0);
    Note: When SQL_USE_LOAD_INSERT or SQL_USE_LOAD_REPLACE is set, no other CLI functions except for the following CLI function can be called until SQL_USE_LOAD_OFF is set (see Step 3):
    • SQLBindParameter()
    • SQLExecute()
    • SQLExtendedBind()
    • SQLParamOptions()
    • SQLSetStmtAttr()
  2. Create a structure of type db2LoadStruct and specify the required load options through this structure. Set the SQL_ATTR_LOAD_INFO statement attribute to a pointer to this structure.
  3. Optional: The ANYORDER file type modifier option of the LOAD API can potentially increase the performance of the load. Set the statement attribute SQL_ATTR_LOAD_MODIFIED_BY in SQLSetStmtAttr() to specify the file type modifier option ANYORDER.
    For example, the following call specifies the anyorder file type modifier for the CLI LOAD:
    char *filemod="anyorder";
    SQLSetStmtAttr (hstmt, SQL_ATTR_LOAD_MODIFIED_BY,
                    (SQLPOINTER) filemod, SQL_NTS);
  4. Issue SQLExecute() on the prepared SQL statement for the data to be inserted.
    The INSERT SQL statement can be a fullselect which allows data to be loaded from a table using the SELECT statement. With a single execution of the INSERT statement, all of the data from the SELECT is loaded.
    The following example shows how a fullselect statement loads data from one table into another:
    SQLPrepare (hStmt,
                (SQLCHAR *) "INSERT INTO tableB SELECT * FROM tableA",
    	    SQL_NTS);
    SQLExecute (hStmt);
  5. Call SQLSetStmtAttr() with SQL_USE_LOAD_OFF.
    This ends the processing of data using the LOAD utility. Subsequently, regular CLI array insert will be in effect until SQL_ATTR_USE_LOAD_API is set again (see Step 1).
  6. Optional: After the CLI LOAD operation, you can query the number of rows that were affected by it by using the following statement attributes:
    • SQL_ATTR_LOAD_ROWS_COMMITTED_PTR: A pointer to an integer that represents the total number of rows processed. This value equals the number of rows successfully loaded and committed to the database, plus the number of skipped and rejected rows.
    • SQL_ATTR_LOAD_ROWS_DELETED_PTR: A pointer to an integer that represents the number of duplicate rows deleted.
    • SQL_ATTR_LOAD_ROWS_LOADED_PTR: A pointer to an integer that represents the number of rows loaded into the target table.
    • SQL_ATTR_LOAD_ROWS_READ_PTR: A pointer to an integer that represents the number of rows read.
    • SQL_ATTR_LOAD_ROWS_REJECTED_PTR: A pointer to an integer that represents the number of rows that could not be loaded.
    • SQL_ATTR_LOAD_ROWS_SKIPPED_PTR: A pointer to an integer that represents the number of rows skipped before the CLI LOAD operation began.

    To use the statement attributes to query the number of rows affected by the CLI LOAD, the application must call SQLSetStmtAttr before the CLI LOAD, and pass a pointer to the memory location where the value will be stored.

    For example, after you turn on CLI LOAD by calling SQLSetStmtAttr and specify the statement attribute SQL_ATTR_USE_LOAD_API as in step 1, before executing the INSERT to do the CLI LOAD, you can call SQLSetStmtAttr to pass a pointer to the memory location where the value will be stored.
    int *rowsLoaded;
    int *rowsDeleted;
    
    rowsLoaded = (int *)malloc(sizeof(int));
    if (rowsLoaded == NULL)
    {
      // Handle any memory allocation failure by malloc
    }      
    rc = SQLSetStmtAttr(hstmt, SQL_ATTR_LOAD_ROWS_LOADED_PTR, rowsLoaded, 
    SQL_IS_POINTER);
    
    rowsDeleted = (int *)malloc(sizeof(int));
    if (rowsLoaded == NULL)
    {
      // Handle any memory allocation failure by malloc
    }      
    rc = SQLSetStmtAttr(hstmt, SQL_ATTR_LOAD_ROWS_DELETED_PTR, rowsDeleted, 
    SQL_IS_POINTER);
    After the CLI LOAD, you can retrieve the statement attribute values as follows:
    printf("\n  Value of SQL_ATTR_LOAD_ROWS_LOADED_PTR is %d", *rowsLoaded);
    printf("\n  Value of SQL_ATTR_LOAD_ROWS_DELETED_PTR is %d", *rowsDeleted); 
    You can also retrieve the statement attribute values by calling SQLGetStmtAttr, as shown in the following example. Note that you must call SQLSetStmtAttr to pass a pointer to the memory location where the value will be stored before you issue the INSERT statement for the CLI LOAD.
    int *pStmtAttrValue;
    
    rc = SQLGetStmtAttr(hstmt,
                        SQL_ATTR_LOAD_ROWS_LOADED_PTR,
                        &pStmtAttrValue,
                        sizeof(pStmtAttrValue),
                        NULL); 
    printf("\n  Value of SQL_ATTR_LOAD_ROWS_LOADED_PTR  is %d", *pStmtAttrValue);
    
    rc = SQLGetStmtAttr(hstmt,
                        SQL_ATTR_LOAD_ROWS_DELETED_PTR,
                        &pStmtAttrValue,
                        sizeof(pStmtAttrValue),
                        NULL);
    printf("\n  Value of SQL_ATTR_LOAD_ROWS_DELETED_PTR is %d", *pStmtAttrValue);
    Note: With the release of Db2 Connect 12.1, parameter markers in CLI applications can bind with row-wise array input when running CLI LOAD operations.