Loading incremental updates to dynamic cubes
After you load the initial fact data, you can add new fact rows to the fact table at any time. After you start a dynamic cube and update the fact table with new rows, make the updates visible in the cube by performing an incremental load.
To identify the new rows to dynamic cubes, you must assign the rows a non-null TID value that is higher than the TID value of the previously inserted rows. For example, if a previous update to the fact data used a TID value of 2, the next update must use a TID value of 3 or higher.
You can load more than one increment at a time. For example, if you have updates for TID values 3, 4, and 5, you can load all of them at once. Alternatively, you can specify that you want to load incremental updates only up to TID value 4.
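As an illustration of the TID numbering rule, the following Python sketch checks that new TID values are valid and selects the increments to load up to a chosen TID value. The variable names and TID values are illustrative only.

```python
# Illustrative only: TID values that were already loaded into the cube and the
# TID values of rows that were added to the fact table since the last load.
loaded_tids = [1, 2]
pending_tids = [3, 4, 5]

# Each new TID value must be non-null and higher than any previously loaded TID.
highest_loaded = max(loaded_tids)
assert all(tid is not None and tid > highest_loaded for tid in pending_tids)

# Load every pending increment at once, or only the increments up to a chosen
# TID value (here, 4).
load_up_to = 4
tids_to_load = [tid for tid in pending_tids if tid <= load_up_to]
print(tids_to_load)  # [3, 4]
```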
While incremental updates to a dynamic cube are in progress, queries against the cube return values that are based on the most recently completed update. When the update is complete and the data caches are updated, new queries return values that are based on the latest incremental update.
An incremental update is a memory-intensive process that requires additional memory both for the duration of the incremental load and while the dynamic cube runs. The following examples illustrate how to estimate the amount of memory that is needed in each situation; a calculation sketch follows the list:
- Additional memory requirements during an incremental load.
You should plan for 500 bytes of memory for each new tuple. The number of tuples processed in the incremental update is calculated by using the following formula: the number of unique rows at the grain of the cube multiplied by the number of additive measures. For example, 10 million tuples require 5 GB of extra memory to load.
In this formula, the number of unique rows at the grain of the cube is the number of unique rows that are affected at the grain of the cube. A query uses this value to fetch the incremental values. It is always equal to or less than the number of inserted rows, and it might be less if the grain of the cube is higher than the grain of the inserted rows. For example, the cube might be modeled to the hour while rows are inserted at the minute level.
The number of inserted rows in this comparison is the number of rows that the incremental update adds to the fact table.
- Additional memory requirements during dynamic cube operation.
Tuples for the latest increment are saved after the incrementallyLoadCubes command finishes, at a cost of 100 bytes per tuple. For example, an increment of 10 million tuples requires an extra 1 GB of memory. This extra memory is required while the cube runs and applies only to the last set of loaded tuples. For example, if you incrementally load a cube ten times, the extra memory that is required after the load commands finish is 100 bytes multiplied by the number of tuples in the last load.
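The following Python sketch applies these two estimates. The per-tuple costs of 500 bytes and 100 bytes come from the preceding examples; the input values are hypothetical and are chosen only to reproduce the 10 million tuple example.

```python
# Estimate the extra memory that an incremental update needs during the load
# and afterward, while the cube runs, using the per-tuple costs described above.
# The input values are hypothetical.
unique_rows_at_grain = 10_000_000  # unique rows affected at the grain of the cube
additive_measures = 1              # number of additive measures in the cube

tuples = unique_rows_at_grain * additive_measures

load_memory_bytes = tuples * 500       # needed while the incremental load runs
operation_memory_bytes = tuples * 100  # retained for the last loaded increment

gb = 10 ** 9
print(f"Extra memory during the load: {load_memory_bytes / gb:.0f} GB")          # 5 GB
print(f"Extra memory while the cube runs: {operation_memory_bytes / gb:.0f} GB")  # 1 GB
```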
You can load updates to aggregate tables separately, and you can choose when to run these updates. For more information, see Incremental updates of aggregate tables.
You can load incremental updates to dynamic cubes by using the Incrementally update data action in IBM® Cognos® Administration. This method allows you to run the updates on a schedule or by a trigger. For more information, see Starting and managing dynamic cubes.
You can also load incremental updates to dynamic cubes by using the DCAdmin command-line tool, as shown in the following procedure.
Procedure
In the DCAdmin command-line tool, run the incrementallyLoadCubes command to load an incremental update, as shown in the sketch that follows.
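The exact DCAdmin syntax is not reproduced here. As a minimal sketch, and assuming a hypothetical installation path, argument order, server group, and cube name, the following Python snippet shows one way to script the incrementallyLoadCubes command. Check the DCAdmin help for the parameters that your installation expects.

```python
# Minimal sketch only: scripting the DCAdmin incrementallyLoadCubes command.
# The script path, argument order, and values are assumptions for illustration;
# consult the DCAdmin help for the documented syntax.
import subprocess

dcadmin = "/opt/ibm/cognos/bin64/dcadmin.sh"  # hypothetical installation path
command = [
    dcadmin,
    "incrementallyLoadCubes",  # command named in this topic
    "serverGroup1",            # hypothetical server group
    "SalesCube",               # hypothetical cube name
]

result = subprocess.run(command, capture_output=True, text=True)
print(result.stdout)
if result.returncode != 0:
    print(result.stderr)
```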
Results
When this update is complete and the data caches are updated, new queries return values that are based on the latest incremental update.