Making queries wait for incremental updates
To minimize the gap in data recency between the original Db2 for z/OS tables and incrementally updated accelerator-shadow tables (replication latency), you can delay the execution of queries until the latest incremental updates have been applied. This guarantees that an accelerated query against incrementally updated tables always "sees" the latest changes that have been committed in Db2 for z/OS.
This capability fulfills the criteria for Hybrid Transactional Analytical Processing (HTAP). It allows you to run transactional processes as well as analytical queries on a unified data store with a minimized time lag between transactional completion and analytical insight. Informed business decisions can be made much faster, on the basis of much more recent data.
To activate the delay, you must add a special register to your queries. You can also use a ZPARM or bind option to the same effect. The special register is:
SET CURRENT QUERY ACCELERATION WAITFORDATA = n.m.
where n.m.
is a decimal number denoting seconds. The minimum
is 0.0
(no delay). The maximum is 3600.0
(a delay of one
hour).
Example
SET CURRENT QUERY ACCELERATION = ENABLE;
SET CURRENT QUERY ACCELERATION WAITFORDATA = 120.0;
SELECT * FROM XYZ WHERE [...]
The first special register in line 1 enables query acceleration. The second special register sets
the delay to 120 seconds. What follows is the SQL code for the query. The execution of the query
will be postponed for 120 seconds. During that time, the replication engine tries to apply all
committed changes that have accumulated before the query started. If the replication engine cannot
finish the job during that period, the query will not be accelerated. The further handling depends
on the option set in the first special register and on the expiration behavior configured on the
IBM Db2 Analytics Accelerator Console. The first
special register is set to ENABLE
. That means the query fails if one the conditions
for query acceleration is not met. Let's assume that the conditions can be met; so normally, the
query would run on the accelerator. However, the default expiration behavior is to abort a query if
the incremental updates cannot be applied in time. Supposing the default behavior is set, the query
would indeed fail if not all of the pending incremental updates can be applied in 120 seconds.
How to calculate the proper delay time
-DIS ACCEL(<name>) DETAIL
<name>
is the name of the accelerator attached to the Db2 subsystem. If
incremental updates have been configured for this pair of Db2 subsystem and accelerator, you find
the two values in the command output. For example:... CURRENT REPLICATION LATENCY FOR THIS DB2 SYSTEM = 2000 MS NUMBER OF SUCCESSFUL QUERY REQUESTS WITH DELAY PROTOCAL FOR ALL DB2 SYSTEMS = 0 NUMBER OF EXPIRED QUERY REQUESTS WITH DELAY PROTOCAL FOR ALL DB2 SYSTEMS = 1 REPLICATION VELOCITY = 0.04 DB2 LOG SECONDS APPLIED PER SECOND ...
In this example, The latency is 2000 milliseconds or 2 seconds, and the velocity is 0.04. You get
the proper WAITFORDATA value by dividing latency by velocity. For the example, this gives you a
WAITFORDATA value of 50
seconds to start with.
Conditions for the success or failure of a query
Whether a query fails depends on the setting of the SET CURRENT QUERY ACCELERATION special register and on the expiration behavior. It also depends on the type of the table. For accelerator-only tables, the WAITFORDATA option is always ignored because it can be assumed that the data in these tables is always the most recent data available (there is no other data in a Db2 for z/OS source table).
If the WAITFORDATA option has been set for queries that reference incrementally updated tables as
well as tables not enabled for incremental updates (loaded tables), the query might fail if loaded
tables are to be excluded and the expiration behavior is set to FAIL
.
For information on how to configure the expiration behavior and the treatment of loaded tables, follow the links to Configuring the expiration behavior of WAITFORDATA queries and Including or excluding loaded tables from WAITFORDATA queries at the end of this topic.
CURRENT QUERY ACCELERATION | WAITFORDATA | Expiration behavior | Loaded tables | Query processing |
---|---|---|---|---|
NONE | N/A | N/A | N/A | Db2 for z/OS |
ENABLE / ELIGIBLE / ALL | 0.0 |
fail |
Include | Accelerator |
Exclude | Query fails | |||
continue |
Include | Accelerator | ||
Exclude | Accelerator | |||
> 0.0 |
fail |
Include | Query fails if outstanding incremental updates cannot be a applied within WAITFORDATA period. | |
Exclude | Query fails | |||
continue |
Include | Accelerator. The query might return results that are based on obsolete data | ||
Exclude | Accelerator. The query might return results that are based on obsolete data | |||
ENABLE WITH FAILBACK | 0.0 |
fail |
Include | Accelerator |
Exclude | Db2 for z/OS | |||
continue |
Include | Accelerator | ||
Exclude | Accelerator. The query might return results that are based on obsolete data | |||
> 0.0 |
fail |
Include | Db2 for z/OS | |
Exclude | Db2 for z/OS | |||
continue |
Include | Accelerator. The query might return results that are based on obsolete data | ||
Exclude | Db2 for z/OS |
Dependencies on other processing states, configurations, or product features
Other processing states or the use of certain configuration settings or other product features might affect query processing if WAITFORDATA is enabled (value > 0.0):
- If continuous incremental updates are enabled, and one of the tables referenced in the query is currently reloaded, or in the spilling or draining state, the query will fail or be processed by Db2 for z/OS (ENABLE WITH FAILBACK). This is because the end of these processes or table states cannot be predicted, and in most cases, they take much longer than the WAITFORDATA period.
- If continuous incremental updates are not enabled, the reload of a table stops the incremental update process. The end of the reload process cannot be predicted. It can easily take several hours. The execution of the query would have to wait until the process has finished and incremental updates are re-enabled. This is probably much more time than a user who is waiting for the results of a query is willing to accept. Therefore, WAITFORDATA queries will fail.
- If partitions of a table have been archived with the High Performance Storage Saver, WAITFORDATA processing can still be used for the active partitions of the table. The archived partitions do not receive incremental updates, so a WAITFORDATA setting will not have an effect.
- A setting of WAITFORDATA > 0.0 is ignored if an external stored procedure is run for in-database analytics or predictive modeling (IBM Netezza® Analytics). The external stored procedures are executed immediately, processing whatever data is currently available.
- If workload balancing is used in a high-availability setup, a WAITFORDATA setting > 0.0 might lead to successful or failing queries. The decisive factor is the replication latency of the accelerator that a query is sent to. If the latency is higher than the WAITFORDATA period, the query fails. The problem is that one does not know beforehand to which accelerator a query is sent. The accelerator is picked at random.
- If the product software on the accelerator is back-level, and does not support the WAITFORDATA
protocol, a query does not necessarily fail. If
CURRENT QUERY ACCELERATION
is set toENABLE
,ELIGIBLE
, orENABLE WITH FAILBACK
, the query might be processed by Db2 for z/OS. If, on the other hand, the special register is set toALL
, the query will fail. The same applies to the equivalent ZPARM or bind option settings. - WAITFORDATA processing is not possible if federated access has been granted to any of the tables referenced by a query. This is because the reference tables (referencing accelerator-only tables), which represent snapshots of the underlying accelerator-shadow tables, cannot be updated by the same incremental update process, and will inevitably be out-of-sync. A query against such tables will fail.
- If a column has been added to a Db2 for z/OS table (add-column schema change), and the accelerator-shadow table of that table has been enabled for incremental updates, the success of a query with a setting of WAITFORDATA > 0.0 against this table depends on a successful schema synchronization within or before the start of the WAITFORDATA period. That is, the SYSPROC.ACCEL_SYNCHRONIZE_SCHEMA stored procedure or the equivalent function in your administration client must have been run successfully. If the stored procedure or client function has been started, but not yet completed, a successful run of an accelerated query against the table remains uncertain. It might or might not succeed, depending on the completion of the necessary changes to the catalog and the full reflection of the schema change in the accelerator-shadow table.
- You cannot change the distribution key or the organizing keys of a table while incremental updates are enabled. To enable the Alter Keys function, you must first disable replication, and that means that a setting of WAITFORDATA > 0.0 will have no effect.
- A failure of the active Netezza host has the effect that all queries with a WAITFORDATA setting will fail. However, these queries will be restarted and processed as usual after the passive host has taken over (failover).
- Treatment of date, time, and timestamp values: An accelerated query that retrieves dates might give you the same values as Db2 for z/OS, but a query that selects time values or timestamps will probably produce different results. The reason is that the WAITFORDATA period and the difference between the z/OS system clock and the accelerator's system clock (=< 7 seconds) are added to these values while the incremental updates are being applied.