SYSPROC.ACCEL_LOAD_TABLES
Loads data from the source tables in Db2 into the corresponding accelerator-shadow tables.
Details
- Load of a previously defined accelerator-shadow table in InitialLoadPending state
- Reload of table partitions or non-partitioned accelerator-shadow tables only if the Db2 tables have been modified since the last load
- Partial reload, that is, if partitions of a Db2 table have been moved with the High Performance Storage Saver, reload of those partitions that have not been moved.
- Complete update of the data in an already loaded accelerator-shadow table with more recent Db2 data
- Update of the data in selected partitions of range-partitioned accelerator-shadow tables, provided that the partitioning key and the partition boundaries have not changed
- Deletion of data in accelerator-shadow tables that originate from range-partitioned tables after a ROTATE PARTITION operation
- Addition of new partition data to existing accelerator-shadow tables after a ROTATE PARTITION operation or ADD PARTITION operation on range-partitioned tables in Db2
- Addition of data from partitions that were appended to partitioned-by-growth tables in Db2
- Addition of data from partitions that were appended to partitioned-by-growth tables in Db2 InitialLoadPendingInitialLoadPending
Queries against already loaded accelerator-shadow tables go against the old data until the load operation is completed.
- If table partitions have been moved by the High Performance Storage Saver, you can load (initial load) or reload only those partitions that have not been moved. Moved partitions are always ignored.
- For range-partitioned tables and tables partitioned by growth, you can reload (update) the data of selected partitions only.
- InitialLoadPending
- Error
- Loaded
If the stored procedure runs successfully, the state of all loaded accelerator-shadow tables is changed to Loaded.
During a load, IBM Db2 Analytics Accelerator automatically selects a suitable distribution key.
If the stored procedure fails, all table modifications are rolled back on the accelerator, and the accelerator-shadow tables return to their previous states (InitialLoadPending, Error, or Loaded) and the IBM Db2 Analytics Accelerator catalog remains unmodified.
A situation in which the tables are modified, but the timestamps of these tables do not change in the catalog can only occur if an error surfaces after the completion of operations on the accelerator, but before the end of the stored-procedure run. The diverging timestamps in the catalog and on the accelerator have no functional consequences; the accelerator-shadow tables will contain the most recent data and will be available for processing.
Parallel processing
Partitions of a partitioned table (partitioned by range or by growth) and non-partitioned tables can be loaded in parallel. By default, up to four entities of both types (table partitions, non-partitioned tables, or a mixture of both) are loaded in parallel. You can change the default by setting the AQT_MAX_UNLOAD_IN_PARALLEL environment variable to a different value.
The order in which partitions or non-partitioned tables are loaded can vary and is not predictable. The reason is that before each run, the accelerator calculates how to distribute the items most efficiently to the available parallel threads It is therefore very likely that the order will deviate from the order in which the entities are listed in the table_load_specification, which is the parameter used to specify the items to be processed.
<externalTool> element in connection with the
enforceTableLoadSequence attribute in XML input for the
table_load_specification parameter. See the following
example:<externalTool enforceTableLoadSequence="true" />During execution, the stored procedure can increase or decrease the degree of parallelism at any time to optimize the performance of the current and of future load jobs. If that happens, indicating messages (AQT10418I and AQT10419I) are written to the SYSLOG. These messages are also returned by the stored procedure when processing has finished.
SYSPROC.ACCEL_LOAD_TABLES calls the Db2-supplied SYSPROC.DSNUTILU stored procedure, which unloads the data from the Db2 tables. The Workload Manager (WLM) policies might prohibit the invocation of SYSPROC.DSNUTILU, due to specific service class definitions or a current system overload.
AQT_MAX_RETRIES_DSNUTILU=2AQT_SECONDS_BEFORE_RETRY_DSNUTILU=60
These settings instruct SYSPROC.ACCEL_LOAD_TABLES to make three attempts at starting DSNUTILU and to wait for 60 seconds between the retries.
Additionally, you can set the AQT_SKIP_UNLOAD_EMPTY_PARTS environment variable to improve the performance of the load process. If this variable is set, SYSPROC.ACCEL_LOAD_TABLES does not call the SYSPROC.DSNUTILU utility for empty partitions. The detection of empty tables and partitions relies on Db2 real-time statistics.
<externalTool> element is used as part of the
table_load_specification parameter for
SYSPROC.ACCEL_LOAD_TABLES, the Db2 UNLOAD utility is always
called, irrespective of the setting of this variable.An auto-cleanup mechanism is in place. If SYSPROC.ACCEL_LOAD_TABLES was ended prematurely or abnormally, for example by the -CAN THREAD Db2 command, Db2 UNLOAD utilities that have been started by the stored procedure remain in the STOPPED state and related resources are left in the /tmp/aqtpipes/V7/<user> directory, where <user> is the ID that was used to start the stored procedure. During the next run of the SYSPROC.ACCEL_LOAD_TABLES stored procedure with the same user ID (for any table), the Db2 UNLOAD processes are ended, and the leftovers are deleted.
Restrictions
- You cannot invoke the SYSPROC.ACCEL_LOAD_TABLES procedure more than once for the same accelerator-shadow table at the same time. While the procedure is running, the state of all specified accelerator-shadow tables changes to LoadInProgress, which prevents the start of another instance of SYSPROC.ACCEL_LOAD_TABLES that works on the same tables. The stored procedure can be invoked again after a successful completion of the previous run. Otherwise, an error message is issued.
- You can only use this stored procedure to load or update accelerator-only tables if an external tool is
employed for this task. If such a tool is not available, you must use the
INSERT ... INTO TABLE ... IN ACCELERATORSQL statement. For information on how to call external tools, see the example at the end of the description of the table_load_specification parameter. - If you change the partitioning type of the table space of a source table in Db2 for z/OS (for example, from segmented to partitioned by growth), you can no longer load the corresponding accelerator-shadow table. The operation will fail. Therefore, you must remove the table from the accelerator and redefine it after changing the partitioning type.
- Referencing accelerator-only tables, which have been created for federated access, cannot be loaded with an external tool because their content must, at any time, reflect the content of the underlying original table in the remote Db2 subsystem. If such a table were loaded by an external tool, it would be possible for the table content to diverge from the content of the underlying "master" table. Hence the access to referencing accelerator-only tables is blocked, and an attempt to load these with an external tool will fail.
- After a load, a replication-enabled accelerator-shadow table goes into Suspended
state if someone has stopped replication during the load.
Suspended means that the table is skipped when incremental updates are applied. To make this table part of the incremental update process again, you must reload it.
Multiple accelerators
If multiple accelerators are connected to the same Db2 subsystem and a table exists on more than one accelerator, you can load the various instances of this table in parallel, provided that the lock mode permits this (NONE and ROW do). Each accelerator contains a snapshot of Db2 data at the point of the last load operation. There is no mechanism that guarantees the consistency of the table data on all accelerators.
Syntax
CALL SYSPROC.ACCEL_LOAD_TABLES
(accelerator_name,
lock_mode,
table_load_specification,
message);
Options description
accelerator_name- The unique name of the accelerator. This accelerator must have been defined by the SYSPROC.ACCEL_ADD_ACCELERATOR2 stored procedure.
lock_mode- Controls the protection level while accelerator-shadow tables are being loaded. The protection level ensures a consistent state of the data with respect to the specified entity. Valid values:
table_load_specification- An XML input string that lists the accelerator-shadow tables to be loaded. The string must
conform to the structure of the tableSetForLoad element in the
SAQTSAMP(AQTSXSD1) data set. A table name must not occur more than once in the input string.
For range-partitioned tables and tables partitioned by growth, it is possible to update only the data in selected partitions. This is done by specifying a partition list in the XML input string. Other tables are always completely reloaded. An error is returned if a partition list is specified for a table that is neither range-partitioned, nor partitioned by growth.
You specify partitions by their logical or physical partition numbers, separated by commas (,). Ranges can be specified by using a colon (:) between the range boundaries. The range boundaries are inclusive. Using negative values, you can specify partitions by counting backwards from the upper end of the entire range of partition numbers. In this context,-1is used to identify the last partition of a table. In a sorted sequence of partition numbers, the position of a partition in the sequence can be determined according to the following formula:
where(number of last partition) + 1 - nnis the number of partitions to be deducted from the upper end.Table 1. Examples of specifications for partitions and partition ranges XML specification for partitions Description <partitions>1</partitions>Includes only the first partition <partitions>1:10</partitions>Includes the first 10 partitions of a table <partitions>-10:-1</partitions>Includes the last 10 partitions of a table <partitions>1:-11</partitions>Includes all partitions except for the last 10 <partitions>1,10:13,-1</partitions>Includes the partitions 1, partitions 10 through 13, and the last partition of a table Example A:<?xml version="1.0" encoding="UTF-8" ?> <dwa:tableSetForLoad xmlns:dwa="http://www.ibm.com/xmlns/prod/dwa/2011" usePhysicalPartitionNumbers="true" version="1.0"> <table name="SALES" schema="BCKE"> <!-- explicitly specified partition numbers --> <partitions>1,5:10,20</partitions> </table> <table name="CUSTOMER" schema="BCKE" > <!-- negative partition numbers count from the end, -1 is the last element example for "all partitions except last 10" --> <partitions>1:-11</partitions> </table> <table name="ORDER2009" schema="BCKE"> </table> </dwa:tableSetForLoad>The sample XML string includes the attribute setting
usePhysicalPartitionNumbers="true">, which means that the specified partition numbers are regarded as physical partition numbers.The sample XML string lists three accelerator-shadow tables to be loaded: SALES, CUSTOMER, and ORDER2009. For the SALES table, the partitions 1, 5 through 10, and 20 are loaded.
All partitions of the CUSTOMER table are loaded, except for the last 10 partitions. Assume that this table has 20 partitions. According to the formula, this gives you 20 + 1 - 11=10.
The behavior is different for ORDER2009: Because partitions are not specified, the stored procedure loads just the partitions that have changed in the original Db2 table (newly added partitions or changed partition boundaries).
Example B:<?xml version="1.0" encoding="UTF-8" ?> <aqttables:tableSetForLoad xmlns:aqttables="http://www.ibm.com/xmlns/prod/dwa/2011" version="1.1"> <table name="SALES" schema="BCKE" > <partitions>1,5:10,20</partitions> </table> <table name="CUSTOMER" schema="BCKE" detectChanges="PARTITIONS" /> <table name="PRODUCT" schema="BCKE" /> </aqttables:tableSetForLoad>The sample XML string lists three accelerator-shadow tables to be loaded or reloaded: SALES, CUSTOMER, and PRODUCT. Assume that these tables are partitioned by range.
The SALES table contains a
<partitions>element, which lists the partitions to be loaded. These are the partitions with the IDs 1, 5, 6, 7, 8, 9, 10, and 20. If partitions were added as the result of an ADD or ROTATE operation, then these partitions will also be loaded.For the CUSTOMER table, the
detectChangesattribute is set toPARTITION, but a partition list has not been specified, so only tables that were added by an ADD or ROTATE operation since the last load will be loaded.For the PRODUCT table, neither the
detectChangesattribute, nor a partition list has been specified, so all partitions are loaded or reloaded. message- For the description, follow the appropriate link under Related reference at the end of this topic.
Automatic detection of modified data
The SYSPROC.ACCEL_LOAD_TABLES stored procedure can detect modified data automatically and then reload only the tables or partitions that have changed since the last load. This is useful if you want to run an automated batch process repeatedly that refreshes the data in your accelerator-shadow tables at regular intervals, for example daily, and you do not know which data has changed in Db2. Through a special attribute in the XML input, you can synchronize your accelerator data and at the same time ensure that just a minimum amount of data is transferred during the process.
detectChanges to the following values:- detectChanges="DATA"
- Basically, this setting causes a reload of partitions or entire tables if changes were detected in the original partitions or tables since the last load.
- detectChanges="PARTITIONS"
- Basically, this setting causes a load or reload of partitions on the accelerator that were added or rotated in Db2 for z/OS since the last load.
However, the actual effect of these settings also depends on the table type and whether a partition list has been specified, so the outcome is not always straightforward. See the following table:
| Attribute setting | Partition list specified | Table type | Effect |
|---|---|---|---|
| Not set or empty value | No | Partitioned by range | Full table reload |
| Partitioned by growth | Full table reload | ||
| Non-partitioned | Full table reload | ||
| Yes | Partitioned by range | Reload of selected partitions plus added or rotated partitions | |
| Partitioned by growth | Reload of selected partitions plus added or rotated partitions | ||
| Non-partitioned | Error AQT10403E (invalid attribute combination) | ||
detectChanges= "DATA" |
No | Partitioned by range | Reload of changed partitions including added or rotated partitions |
| Partitioned by growth | Reload of changed partitions including added or rotated partitions | ||
| Non-partitioned | Full table reload if changes were detected | ||
| Yes | Partitioned by range | Reload of changed partitions including added or rotated partitions. Warning AQT10148W (partition list ignored) | |
| Partitioned by growth | Reload of changed partitions including added or rotated partitions. Warning AQT10148W (partition list ignored) | ||
| Non-partitioned | Error AQT10403E (invalid attribute combination) | ||
detectChanges= "PARTITION" |
No | Partitioned by range | (Re)load of just the added or rotated partitions |
| Partitioned by growth | (Re)load of just the added or rotated partitions | ||
| Non-partitioned | Error AQT10403E (invalid attribute combination) | ||
| Yes | Partitioned by range | Reload of selected partitions plus added or rotated partitions | |
| Partitioned by growth | Reload of selected partitions plus added or rotated partitions | ||
| Non-partitioned | Error AQT10403E (invalid attribute combination) |
- Accelerator-shadow tables in InitialLoadPending or Error state are always fully reloaded.
- You cannot specify a partition list for non-partitioned tables.
- Partitions that have been removed from the beginning of a range-partitioned table due to a ROTATE operation are automatically detected and deleted from the corresponding accelerator-shadow table.
- All partition changes in range-partitioned tables except for ADD and ROTATE lead to an error if you try to reload the corresponding accelerator-shadow tables.
- The
forceFullReloadattribute is deprecated becausedetectChanges="PARTITIONS".without partition list has the same effect asforceFullReload="false", and omitting thedetectChangesattribute and a partition list has the same effect asforceFullReload="true". - If you use lock mode
NONEanddetectChanges="DATA", it can happen that more changes are detected than actually transferred to the accelerator. This is caused by the SKIP LOCKED DATA option, which is used implicitly by the Db2 Unload Utility. If parts of the data were locked by other processes, then this data is ignored and not unloaded to the accelerator. To ensure that all changes are finally transferred, a partition in an accelerator-shadow table is always fully reloaded if it was previously loaded with lock modeNONEanddetectChanges="DATA". A reload is started even if the partition had no further changes since then.
Example
<?xml version="1.0" encoding="UTF-8" ?>
<aqttables:tableSetForLoad
xmlns:aqttables="http://www.ibm.com/xmlns/prod/dwa/2011" version="1.1">
<table name="UNPARTITIONED3" schema="TEST" detectChanges="DATA"/>
<table name="PARTITIONED4" schema="TEST" detectChanges="PARTITIONS"/>
</aqttables:tableSetForLoad>
Suppose that the table UNPARTITIONED3 is a non-partitioned table, and that
PARTITIONED4 is partitioned by range. The accelerator-shadow table of the non-partitioned table
will be fully reloaded if changes were detected since the last load. The accelerator-shadow table of the partitioned table will
be reloaded only if partitions were added or rotated since the last load. The table will not be
fully reloaded. Only the new partitions will be added.
If very short synchronization intervals are required, such as periods of minutes for example, consider using the incremental update function of IBM Db2 Analytics Accelerator.
detectChanges="DATA" setting
invokes the same change detection mechanism as the SYSPROC.ACCEL_GET_TABLES_DETAILS stored
procedure. That is, SYSPROC.ACCEL_LOAD_TABLES reloads all partitions or non-partitioned tables for
which SYSPROC.ACCEL_GET_TABLES_DETAILS returns a change category other than NONE. The following
restrictions apply: - Under specific conditions, for example if table data was modified by a non-IBM utility, table changes might not be correctly detected. In such a case, you must manually reload the affected partitions or the entire accelerator-shadow table.
- If the detection of data changes is not possible, that is, in cases where the
SYSPROC.ACCEL_GET_TABLES_DETAILS stored procedure returns the change category
UNKNOWN, tables or partitions are reloaded even if the accelerator data is still in sync with the original Db2 data.
Result sets
The stored procedure always returns two result sets:
- First result set (SP_TRACE_CURSOR)
- Depending on the trace configuration in the
messageinput parameter, this result set is empty or contains trace information about the stored procedure execution. It is identified by the cursor SP_TRACE_CURSOR and contains the following columns:- SEQID of type INTEGER
- TRACEDATA of type VARBINARY, with a maximum length of 32698 characters
The information in the TRACEDATA column is encoded in UTF-8. It is intended for analysis by IBM support. If the length of a value exceeds 32698 characters, it is split into multiple result set rows. The rows are concatenated in ascending order of SEQID.
- Second result set (MESSAGES_CURSOR)
- This result set contains an XML string that contains the same messages as
the MESSAGE output parameter. In contrast to the MESSAGE output parameter, the result set does not
have a 64 KB size limitation. Therefore, it always contains the whole set of
<message>elements (no truncation). The structure of the XML string conforms to that of themessageOutputelement in the SAQTSAMP(AQTSXSD1) data set. The result set is identified by a cursor named MESSAGES_CURSOR and contains the following columns:- SEQID of type INTEGER
- MESSAGES of type VARBINARY, with a maximum length of 32698 characters
The rows in the result set are concatenated in ascending order of the values in the SEQID column. The information in the MESSAGES column is encoded in UTF-8.
Prerequisites
- For the user ID that calls the stored procedures on z/OS, you must define an OMVS segment in the Resource Access Control Facility (RACF®).
- The
accelerator_nameparameter must specify an accelerator name that is listed in the catalog tables of the product. - The accelerator-shadow table
must exist on the accelerator that the
accelerator_nameparameter specifies. - Because SYSPROC.ACCEL_LOAD_TABLES starts one or more instances of SYSPROC.DSNUTILU, DSNUTILU must be installed and configured properly on z/OS. This includes a correct setup of the WLM application environment for Db2-supplied stored procedures and IBM Db2 Analytics Accelerator stored procedures. For more information, follow the appropriate links under Related tasks at the end of this topic.
- To avoid naming conflicts on z/OS, do not run Db2 utilities whose names start with AQT at the same time as this stored procedure.
Authorizations for z/OS
The required authorizations for this stored procedure are listed in the following section.
Therefore, do not let applications change this ID by use of a Db2 authorization exit, such as DSN3@ATH or DSN3@SGN.
The list of required authorizations:
- EXECUTE on the stored procedure
- MONITOR1 authorization
- Authorization to read the original Db2 base tables of the
accelerator-shadow tables that are listed in
the
table_load_specificationparameter - DISPLAY authorization (for calling -DIS GROUP)
- Authorization to run the ACCESS DB command on the databases that the tables reside in (needed to refresh Db2 real-time statistics)
- Authorization to execute the RUNSTATS utility on the databases that contain the tables to be loaded, such as the STATS authorization.
- Authorization to run ADMIN_INFO_SYSPARM so that ZPARMs can be retrieved.
- Read/write and execute access to the /tmp directory for the user who calls the stored procedure (UNIX System Services pipes are created in this directory)
- Privilege to connect to Db2 for z/OS via RRSAF.Important: If you want to grant the required Db2 privileges to secondary authorization IDs, for example groups, make sure that the Db2 SIGNON exit is correctly configured for the use of such IDs. The default SIGNON exit in Db2 for z/OS does not accept secondary authorization IDs. You must therefore use a sample SIGNON exit for this purpose, that is, DSN3SSGN or an equivalent. For more information, see Using secondary IDs for sign-on requests.
- RACF ACCESS(READ) on the data set that contains the AQTENV file in the started task procedure of the Workload Manager (WLM) environment.
- RACF ACCESS(READ) on the data set that contains the AQTDEF6 file in the started task procedure of the Workload Manager (WLM) environment.
- EXECUTE on the DSNADM.DSNADMIZ package to allow access to system parameters when the SYSPROC.ADMIN_INFO_SYSPARM stored procedure is called.
Accessed data and systems
The SYSPROC.ACCEL_LOAD_TABLES stored procedure locks Db2
objects as defined by the lock_mode parameter. After a
successful load of all accelerator-shadow tables, it changes the value of the REFRESH_TIME column for these tables in the
SYSACCEL.SYSACCELERATEDTABLES catalog table of Db2 for z/OS.
Messages that indicate a change in the degree of parallelism (AQT10418I and AQT10419I) are written to the SYSLOG. You can suppress the logging of these messages by commenting out the following line in the AQTENV data set:
AQT_DO_SYSLOGGING = SET
On the accelerator, the stored procedure updates the IBM Db2 Analytics Accelerator catalog and, in case of an initial load, creates corresponding accelerator-shadow tables.
The dynamic statement cache in Db2 is invalidated for initial loads. All plans that are affected by the cache invalidation must therefore be rebuilt by Db2. This might increase the preparation time of the affected queries slightly because a new plan must be prepared during the first execution of a query. This, however, is a onetime operation. For all subsequent preparations of the same query, the dynamic statement cache can be used again. The initial prolongation of the preparation time is in most cases more than outweighed by the reduction that sets in when successfully accelerated queries are submitted repeatedly because such queries are immediately routed to the accelerator.
Ist the following paragraph still true?