Making queries wait for incremental updates

To minimize the gap in data recency between the original Db2 for z/OS tables and incrementally updated accelerator-shadow tables (replication latency), you can delay the execution of queries until the latest incremental updates have been applied. This guarantees that an accelerated query against incrementally updated tables always "sees" the latest changes that have been committed in Db2 for z/OS.

This capability fulfills the criteria for Hybrid Transactional Analytical Processing (HTAP). It allows you to run transactional processes as well as analytical queries on a unified data store with a minimized time lag between transactional completion and analytical insight. Informed business decisions can be made much faster, on the basis of much more recent data.

To activate the delay, you must add a special register to your queries. You can also use a ZPARM or bind option to the same effect. The special register is:

SET CURRENT QUERY ACCELERATION WAITFORDATA = n.m.

where n.m. is a decimal number denoting seconds. The minimum is 0.0 (no delay). The maximum is 3600.0 (a delay of one hour).

Important: If you want to use a delay, specify at least 20 seconds. Lower values might work, but you must take the time into account that is needed to scan and read the Db2 for z/OS log and to report the findings. Experience has shown that realistic delays are around 20 seconds long.

Example

SET CURRENT QUERY ACCELERATION = ENABLE;
SET CURRENT QUERY ACCELERATION WAITFORDATA = 120.0;
SELECT * FROM XYZ WHERE [...]

The first special register in line 1 enables query acceleration. The second special register sets the delay to 120 seconds. What follows is the SQL code for the query. The execution of the query will be postponed for 120 seconds. During that time, the replication engine tries to apply all committed changes that have accumulated before the query started. If the replication engine cannot finish the job during that period, the query will not be accelerated. The further handling depends on the option set in the first special register and on the expiration behavior configured on the IBM Db2 Analytics Accelerator Console. The first special register is set to ENABLE. That means the query fails if one the conditions for query acceleration is not met. Let's assume that the conditions can be met; so normally, the query would run on the accelerator. However, the default expiration behavior is to abort a query if the incremental updates cannot be applied in time. Supposing the default behavior is set, the query would indeed fail if not all of the pending incremental updates could be applied in 120 seconds.

How to calculate the proper delay time

You certainly ask yourself how to determine the proper delay time (WAITFORDATA value) for your environment. There are two values by which an initial delay time or starting point can be calculated: the replication latency and the replication velocity. The replication latency is the time that the incremental update process is lagging behind. The replication velocity is a measure for the speed at which data changes could be applied in the past. You get these two values by running the following Db2 command:

-DIS ACCEL(<name>) DETAIL

where <name> is the name of the accelerator attached to the Db2 subsystem. If incremental updates have been configured for this pair of Db2 subsystem and accelerator, you find the two values in the command output. For example:

...
CURRENT REPLICATION LATENCY FOR THIS DB2 SYSTEM = 2000 MS
NUMBER OF SUCCESSFUL QUERY REQUESTS WITH DELAY PROTOCAL FOR ALL DB2 SYSTEMS = 0
NUMBER OF EXPIRED QUERY REQUESTS WITH DELAY PROTOCAL FOR ALL DB2 SYSTEMS = 1
REPLICATION VELOCITY = 0.04 DB2 LOG SECONDS APPLIED PER SECOND
...

In this example, The latency is 2000 milliseconds or 2 seconds, and the velocity is 0.04. You get the proper WAITFORDATA value by dividing latency by velocity. For the example, this gives you a WAITFORDATA value of 50 seconds to start with.

See the note on replication latency.

Conditions for the success or failure of a query

Whether a query fails depends on the setting of the SET CURRENT QUERY ACCELERATION special register. It also depends on the type of the table. For accelerator-only tables, the WAITFORDATA option is always ignored because it can be assumed that the data in these tables is always the most recent data available (there is no other data in a Db2 for z/OS source table).

If the WAITFORDATA option has been set for queries that reference incrementally updated tables as well as tables not enabled for incremental updates (loaded tables), the query might fail if loaded tables are to be excluded.

For information on how to configure the treatment of loaded tables, follow the link to Including or excluding loaded tables from WAITFORDATA queries at the end of this topic.

Table 1 lists the possible combinations of settings and their effect with regard to query processing. It is assumed that an incoming query satisfies the conditions for query acceleration.

Table 1. Settings that influence WAITFORDATA query processing
CURRENT QUERY ACCELERATION	WAITFORDATA	Loaded tables¹	Query processing
NONE	N/A	N/A	Db2 for z/OS
ENABLE / ELIGIBLE / ALL	`0.0`	Include	Accelerator
	`0.0`	Exclude	Query fails
	`> 0.0`	Include	Query fails if outstanding incremental updates cannot be a applied within WAITFORDATA period.
	`> 0.0`	Exclude	Query fails
ENABLE WITH FAILBACK	`0.0`	Include	Accelerator
	`0.0`	Exclude	Db2 for z/OS
	`> 0.0`	Include	Db2 for z/OS
	`> 0.0`	Exclude	Db2 for z/OS

Error codes in connection with WAITFORDATA processing

If an error occurs in connection with WAITFORDATA processing, an error message with SQLCODE = -904 is returned in the format:

WAITFORDATA rsn=<reason code>:<reason code text>

where

<reason code>: An integer (positive, negative, or 0)
<reason code text>: The message text

Example

SQLCODE = -904, ERROR:  UNSUCCESSFUL EXECUTION CAUSED BY AN UNAVAILABLE RESOURCE. 
REASON 00E7000E, TYPE OF RESOURCE 00001080, AND RESOURCE NAME WAITFORDATA 
rsn=1:Table(s) are in invalid state.

The message shows rsn=1 as the <reason code>, and Table(s) are in invalid state. as the <reason code text> .

Table of error codes

Table 2 lists the possible reason codes and message texts. It also provides a description of the most common causes.

Table 2. Reason codes and texts in WAITFORDATA error messages
RSN code	Message text	Common causes
`1`	`Table(s) are in invalid state.`	Not all tables to be processed are in the replicationInProgress state. This state, however, is a precondition for WAITFORDATA processing. Check the Last Load column in the Tables section of the Accelerator view in IBM Db2 Analytics Accelerator Studio.
`2`	`Subscription state is invalid.`	Incremental updates have not been started for the accelerator on which the tables are located. In IBM Db2 Analytics Accelerator Studio, check the Replication status in the header of the Accelerator view. The status must be Started. If necessary, click Start.
`3`	`Set of tables is empty or invalid.`	Some tables are empty or have not been enabled for incremental updates. Make sure that none of the tables in the set are empty. If you find empty tables, reload them, or exclude them from the set. Check the Replication Since column in the Tables section of the Accelerator view in IBM Db2 Analytics Accelerator Studio. The column must show a date for all the tables to be processed. If necessary, click Replication > Enable Replication.
`4`	`An unknown internal error occurred.`	A program thread checks whether pending incremental updates were applied within the specified WAITFORDATA delay. The thread might have been interrupted, but the exact reason cannot be determined. The WAITFORDATA query failed.
`5`	`Timeout for WAITFORDATA reached.`	Pending incremental updates could not be applied within the WAITFORDATA period.

Dependencies on other processing states, configurations, or product features

Other processing states or the use of certain configuration settings or other product features might affect query processing if WAITFORDATA is enabled (value > 0.0):

If one of the tables referenced by the query is currently being reloaded, or in the spilling or draining state, the query will fail or be processed by Db2 for z/OS (ENABLE WITH FAILBACK). This is because the end of these processes or table states cannot be predicted, and in most cases, they take much longer than the WAITFORDATA period.
If partitions of a table have been archived with the High Performance Storage Saver, WAITFORDATA processing can still be used for the active partitions of the table. The archived partitions do not receive incremental updates, so a WAITFORDATA setting will not have an effect.
If workload balancing is used in a high-availability setup, a WAITFORDATA setting > 0.0 might lead to successful or failing queries. The decisive factor is the replication latency of the accelerator that a query is sent to. If the latency is higher than the WAITFORDATA period, the query fails. The problem is that one does not know beforehand to which accelerator a query is sent. The accelerator is picked at random.
If the product software on the accelerator is back-level, and does not support the WAITFORDATA protocol, a query does not necessarily fail. If CURRENT QUERY ACCELERATION is set to ENABLE, ELIGIBLE, or ENABLE WITH FAILBACK, the query might be processed by Db2 for z/OS. If, on the other hand, the special register is set to ALL, the query will fail. The same applies to the equivalent ZPARM or bind option settings.
WAITFORDATA processing is not possible if federated access has been granted to any of the tables referenced by a query. This is because the reference tables (referencing accelerator-only tables), which contain metadata, cannot be updated by the same incremental update process, as they were created from a different, disconnected Db2 subsystem. The reference tables would inevitably be out-of-sync. Therefore, queries against such tables will fail.
You cannot change the distribution key or the organizing keys of a table while incremental updates are enabled. To enable the Alter Keys function, you must first disable replication, and that means that a setting of WAITFORDATA > 0.0 will have no effect.
A failure of the database engine on the accelerator has the effect that all queries with a WAITFORDATA setting will fail. However, these queries will be restarted and processed as usual after the database engine has recovered or after a replacement instance has been started.
Treatment of date, time, and timestamp values: An accelerated query that retrieves dates might give you the same values as Db2 for z/OS, but a query that selects time values or timestamps will probably produce different results. The reason is that the WAITFORDATA period and the difference between the z/OS system clock and the accelerator's system clock (=< 7 seconds) are added to these values while the incremental updates are being applied.

¹ This has an effect only if loaded tables are referenced by the query