SYSPROC.ACCEL_COLLECT_TABLE_STATISTICS
Invokes the RUNSTATS utility for an accelerator-shadow table, an accelerator-only table, or a set of such tables on an accelerator.
Details
To run the RUNSTATS utility successfully on an accelerator table, the table must be in one of the following states:
Accelerator-shadow tables | Accelerator-only tables |
---|---|
|
|
If you run the stored procedure on an accelerator-shadow table that is in the process of being fully reloaded, the statistics are collected on the previous version of the table. That is, after the completion of the reload, the statistics of the newest version of the table might not be up-to-date. Therefore, complete a running reload before you run this stored procedure.
If you specify more than a single table as input for this stored procedure, the RUNSTATS utility is run in parallel. That is, a single execution thread is started for each table in the set. This might result in a high demand for CPU resources and system memory. To collect statistics on a large number of tables, it is therefore better to run the stored procedure several times in a sequence.
The SYSPROC.ACCEL_COLLECT_TABLE_STATISTICS stored procedure works synchronously. This means the procedure does not return results to the caller before all specified tables have been processed. This allows you to include this stored procedure in extract, transform, and load (ETL) processes. For example:
- Insert rows into an accelerator-only table A.
- Collect statistics on this table.
- Run a second INSERT statement that selects rows in accelerator-only table A in order to insert these rows into accelerator-only table B.
Important usage notes
In most cases, it is not necessary to run this stored procedure because the accelerator automatically collects statistics when certain events occur:
- early stats
- The collection starts immediately after an initial table load.
- copy stats
- Statistics are copied from the initially loaded table during a full table reload.
- auto stats
- The collection is initiated by a daemon of the database engine that runs every two hours to monitor table changes. New statistics are collected on tables with many changes.
- fabricated statistics
- These statistics are generated during query execution if statistics are missing.
Generally, statistics should only be collected manually if tables were modified considerably, that is, if many updates occurred or if a significant amount of data was deleted or inserted. See the following list of occasions on which a manual collection of statistics might be useful:
- After you changed an accelerator-only table by INSERT, UPDATE, or DELETE operations and before you reference the table in a query.
- If tables were updated by the IBM Integrated Synchronization function because many INSERT, UPDATE or DELETE operations had to be synchronized, and you want to collect new statistics before any of the affected tables is referenced in a query.
- If you have opened a service request and IBM support asks you to collect statistics.
- Do not start too many statistics collections at the same time, as these might take resources away from other processes, such as queries or table load operations.
- Start a statistics collection only if the current statistics are outdated and if the performance is unsatisfactory, that is, if it takes comparably long to obtain results from the affected tables when these are referenced in queries.
For more information, see Table Statistics and IBM Db2 Analytics Accelerator for z/OS.
Syntax
CALL SYSPROC.ACCEL_COLLECT_TABLE_STATISTICS
(accelerator_name,
table_statistics_specification,
message);
Options description
accelerator_name
- The unique name of the accelerator. This accelerator must have been defined by the SYSPROC.ACCEL_ADD_ACCELERATOR2 stored procedure.
table_statistics_specification
- An XML input string that lists the tables you want to collect statistics on. A table name must
not occur more than once in the input string.
The XML string must conform to the structure of the
tableSetForStatisticsCollection
element in the SAQTSAMP(AQTSXSD1) data set. For information on how to specify the input XML string, follow the link to the element content in the appendix at the end of this topic.Example:
<?xml version="1.0" encoding="UTF-8"?> <dwa:tableSetForStatisticsCollection xmlns:dwa="http://www.ibm.com/xmlns/prod/dwa/2011" version="1.0"> <table name="NATION" schema="TPCH"/> <table name="REGION" schema="TPCH"/> </dwa:tableSetForStatisticsCollection>
The
<passThrough>
element is an optional element you can specify inside a<table> </table>
block. You should only use it when IBM support advises it. The<passThrough>
element allows you to add command options to the RUNTSTATS statement that is finally executed.With these command options, you can focus on specific table columns or column groups. You can select the columns or column groups that you want to collect statistics or distribution metrics on. You can also determine the sampling type as well as set, update, or reference a statistics profile.
For details, see the IBM Documentation topic on the RUNSTATS command.
Example:
<table schema="TPCH" name="NATION"> <passThrough> WITH DISTRIBUTION AND SAMPLED DETAILED INDEXES ALL SET PROFILE ONLY </passThrough> </table>
This results in the execution of the following RUNSTATS command on the accelerator:
RUNSTATS ON TABLE "DWA<location>"."NATION..." WITH DISTRIBUTION AND SAMPLED DETAILED INDEXES ALL SET PROFILE ONLY
Note: The syntax of the command options used within the<passThrough>
element might change over time or vary with different product versions. A particular syntax you have used in the past might not be supported anymore in the future. message
- For the description, follow the appropriate link under Related reference at the end of this topic.
Result sets
The stored procedure always returns two result sets:
- First result set (SP_TRACE_CURSOR)
- Depending on the trace configuration in the
message
input parameter, this result set is empty or contains trace information about the stored procedure execution. It is identified by the cursor SP_TRACE_CURSOR and contains the following columns:- SEQID of type INTEGER
- TRACEDATA of type VARBINARY, with a maximum length of 32698 characters
The information in the TRACEDATA column is encoded in UTF-8. It is intended for analysis by IBM support. If the length of a value exceeds 32698 characters, it is split into multiple result set rows. The rows are concatenated in ascending order of SEQID.
- Second result set (MESSAGES_CURSOR)
- This result set contains an XML string that contains the same messages as
the MESSAGE output parameter. In contrast to the MESSAGE output parameter, the result set does not
have a 64 KB size limitation. Therefore, it always contains the whole set of
<message>
elements (no truncation). The structure of the XML string conforms to that of themessageOutput
element in the SAQTSAMP(AQTSXSD1) data set. The result set is identified by a cursor named MESSAGES_CURSOR and contains the following columns:- SEQID of type INTEGER
- MESSAGES of type VARBINARY, with a maximum length of 32698 characters
The rows in the result set are concatenated in ascending order of the values in the SEQID column. The information in the MESSAGES column is encoded in UTF-8.
Prerequisites
- For the user ID that calls the stored procedures on z/OS, you must define an OMVS segment in the Resource Access Control Facility (RACF®).
- The
accelerator_name
parameter must specify an accelerator name that is listed in the catalog tables of the product. - The tables must exist on the accelerator that the
accelerator_name
parameter specifies.
Authorizations for z/OS
On z/OS, the user ID under which this stored procedure is run must have the following privileges:
- EXECUTE on the stored procedure
- MONITOR1 (needed so that DSNWLIR can be used to run Db2 commands via the Db2 Instrumentation Facility Interface (IFI)).
- Authorization to run ADMIN_INFO_SYSPARM so that ZPARMs can be retrieved.
- RACF ACCESS(READ) on the data set that contains the AQTENV file in the started task procedure of the Workload Manager (WLM) environment.
- RACF ACCESS(READ) on the data set that contains the AQTDEF6 file in the started task procedure of the Workload Manager (WLM) environment.
- EXECUTE on the DSNADM.DSNADMIZ package to allow access to system parameters when the SYSPROC.ADMIN_INFO_SYSPARM stored procedure is called.