Static versus dynamic SQL

CDC Replication uses Static SQL to access and manage its Db2® metadata tables. CDC for Db2 for z/OS® uses Dynamic SQL when dealing with data in user tables. Dynamic SQL is used for user tables because it is built at execution time, when it is possible to determine how the user tables were created.

When CDC Replication is installed on the source system, Dynamic SQL is used to obtain data during a refresh. During mirroring, table data is obtained from the Db2 log, with the exception of LOB data, XML data, and data for %SELECT and %GETCOL source derived columns. XML data is always read from the source table using Dynamic SQL. LOB data is read using Dynamic SQL, except when inline LOBs are used in DB2 X and the entire LOB can be read from the log record. Apart from LOB columns, XML columns, and %SELECT and %GETCOL source derived columns, no SQL, Dynamic or Static, is used to obtain user data during mirroring.

When CDC Replication is installed on the target system, Dynamic SQL is used to apply changes to the user tables.

Dynamic SQL is generally much less efficient than Static SQL. Properly managed, however, Dynamic SQL can be made almost as efficient as Static SQL. To explain how this improvement is possible, consider how Dynamic and Static SQL differ in preparation and execution.

Processing of SQL by the DBMS can be broken down into three steps:

  1. Syntactic and semantic analysis: verifying that the SQL is coded according to the syntax rules, and determining what it is meant to achieve.
  2. Access permission and path analysis: determining if the resources accessed by the SQL are permitted for the requester, and if they are permitted, how that access will be provided (determining the access path).
  3. Execution and completion analysis: obtaining or changing the data according to the specification of the SQL and preparing returned status information about the outcome of the request.

Step 1 is relatively innocuous and absorbs CPU resources only. Step 2 is very resource intensive, as it requires the DBMS to access multiple tables to resolve the questions of security permissions and available access paths, and then to store the results in other tables as a temporary "plan". Step 3's efficiency depends on the type of processing specified by the SQL, the access path chosen, the server resources available, and so on.

In the case of Static SQL, Step 1 occurs during precompilation of the source code containing the Static SQL. Step 2 occurs during the BIND process that occurs after precompilation but before execution. Step 3 occurs at execution time. This is the reason that Static SQL is perceived to be efficient. Most of its one-time processing occurs earlier, before the statement needs to be executed. Only Step 3 occurs each time that the Static SQL statement is executed.

In the case of Dynamic SQL, all three steps occur at execution time. If a Dynamic SQL statement is simply built and executed, the three steps occur during processing of the EXECUTE request. The next time a similar SQL statement is required, all three steps are performed again. This is the reason that Dynamic SQL is so much less efficient than Static SQL. In this section, this type of Dynamic SQL is identified as naive Dynamic SQL.
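The naive pattern can be sketched in a few lines, using Python's sqlite3 module as a stand-in DBMS (the table name and values are purely illustrative). Because every iteration builds a brand-new statement text with literal values, the database must parse and plan each statement before executing it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

# Naive Dynamic SQL: a fresh statement string per row. Each distinct
# statement text forces the full prepare work (Steps 1 and 2) before
# execution (Step 3) can take place.
for i in range(1000):
    stmt = f"INSERT INTO orders (id, amount) VALUES ({i}, {i * 0.5})"
    conn.execute(stmt)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(row_count)  # 1000
```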

It is possible with Dynamic SQL, however, to provide some degree of separation between the three steps at execution time. The executing program can build the SQL statement and then prepare it into a buffer. The PREPARE statement encompasses Step 1 and Step 2. Subsequently, the program can execute the prepared SQL statement from the buffer. In this case, the EXECUTE statement performs only Step 3. Furthermore, the SQL statement in the prepared buffer can be executed as many times as it is needed, until a COMMIT (or ROLLBACK) is issued. The COMMIT will invalidate the contents of the prepared buffer. In this section, this type of Dynamic SQL is identified as astute Dynamic SQL.
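The astute pattern looks like this in the same sqlite3 sketch. sqlite3 has no explicit PREPARE verb, but reusing one parameterized statement text lets the library cache the compiled statement, which approximates preparing once and executing many times:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

# Astute Dynamic SQL: one parameterized statement, reused for every row.
# The prepare cost (Steps 1 and 2) is paid once for this text; each
# subsequent execute performs only Step 3 with new parameter values.
stmt = "INSERT INTO orders (id, amount) VALUES (?, ?)"
for i in range(1000):
    conn.execute(stmt, (i, i * 0.5))
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(row_count)  # 1000
```

Note that in real Db2 a COMMIT invalidates the prepared buffer, as described above, whereas sqlite3's statement cache survives commits; the sketch illustrates only the prepare-once, execute-many separation.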

The following section is a simple comparative analysis of the efficiency of the three types of SQL. In this analysis, CPU time and the elapsed time for completing the SQL requests were measured; I/O activity by the DBMS was not. The analysis consisted of three separate test runs:

  1. 1000 Static SQL INSERT statements.
  2. 1000 naive Dynamic SQL INSERT statements (that is, 1000 EXECUTE statements).
  3. 1000 astute Dynamic SQL INSERT statements (that is, 1 PREPARE and 1000 EXECUTE statements from the prepared buffer).

The results are provided in the following table:

Note: The timings presented here are examples, only for the purposes of illustrating the relative performance of dynamic and static SQL. They should not be considered as representative of the actual performance that CDC Replication will achieve in your working environment.
Test (1000 INSERTs)            CPU Time    SRB Time   Total CPU Time   Elapsed Time
Static SQL                     .95 sec.    .01 sec.   .96 sec.         .04 min.
EXECUTE only (naive)           14.49 sec.  .02 sec.   14.51 sec.       .30 min.
PREPARE and EXECUTE (astute)   .99 sec.    .02 sec.   1.01 sec.        .04 min.

The table above shows that combining a single PREPARE statement with many EXECUTE statements makes Dynamic SQL comparable to Static SQL. The naive Dynamic SQL uses about 15 times the CPU and about 7.5 times the elapsed time of Static SQL. The astute Dynamic SQL uses only about 1.052 times the CPU of Static SQL, with an indistinguishable elapsed time. The additional flexibility of astute Dynamic SQL is worth the slight increase (5.2%) in CPU utilization over the inflexible Static SQL.
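The ratios quoted above follow directly from the Total CPU Time column of the table:

```python
# Total CPU times, in seconds, taken from the example table above.
static_cpu, naive_cpu, astute_cpu = 0.96, 14.51, 1.01

naive_ratio = round(naive_cpu / static_cpu, 1)    # about 15x the CPU of Static SQL
astute_ratio = round(astute_cpu / static_cpu, 3)  # about 1.052x, i.e. a 5.2% increase

print(naive_ratio)   # 15.1
print(astute_ratio)  # 1.052
```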

CDC Replication uses astute Dynamic SQL when it accesses user table data. Recalling that each COMMIT purges the Dynamic SQL's prepared buffer, it can be seen that the efficiency of the astute Dynamic SQL is directly related to the number of times the prepared buffer can be reused before a COMMIT is issued.

On the source system, a cursor is opened to access the table's data during a Refresh operation, and each row is fetched individually. A COMMIT is issued (only to release locks) when there are no more rows to be fetched. Therefore, on the source system, for almost any table, astute Dynamic SQL is virtually as efficient as Static SQL.

On the target system, when the apply process is making changes to user tables, it commits those changes when it receives a COMMIT statement from the source system; the target system does not issue COMMIT statements on its own. The source system sends COMMIT statements to the target system as it obtains them from the DBMS Log. Therefore, on the target system, the efficiency of astute Dynamic SQL depends on how frequently COMMIT statements were issued by the user application that originally changed the replicated tables on the source system. Large commit groups are applied on the target system almost as efficiently as Static SQL. Small commit groups, such as those generated by online transactions, can be inefficient, approaching the inefficiency of naive Dynamic SQL.

The Transaction Grouping feature can be engaged by the customer to make CDC Replication collect commit groups received at the target system and commit them together as a single unit of work, based on customer-provided parameters. This has a positive effect on target-system performance, because it increases the efficiency of the astute Dynamic SQL used to apply the changes. CDC Replication also exploits Db2's Dynamic Statement Caching feature by specifying KEEPDYNAMIC(YES) on the BIND PLAN statement for the target apply process, which causes Db2 to retain cached statements across COMMITs (but not ROLLBACKs), if possible.
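The Transaction Grouping idea can be sketched as follows; the class and parameter names here are invented for illustration and do not reflect the actual implementation. Small source commit groups are held at the target and committed as one larger unit of work:

```python
# Hypothetical sketch of Transaction Grouping: the apply process holds
# incoming source commit groups and issues a single target COMMIT once
# `group_size` of them have accumulated, so the prepared (astute)
# Dynamic SQL buffers survive longer between COMMITs.
class TransactionGrouper:
    def __init__(self, group_size):
        self.group_size = group_size
        self.pending = 0          # source commit groups held so far
        self.commits_issued = 0   # target COMMITs actually issued

    def source_commit(self):
        # A COMMIT arrived from the source log; defer the target COMMIT
        # until enough commit groups have been collected.
        self.pending += 1
        if self.pending >= self.group_size:
            self.flush()

    def flush(self):
        # Issue one target COMMIT covering all pending commit groups.
        if self.pending:
            self.commits_issued += 1
            self.pending = 0

grouper = TransactionGrouper(group_size=10)
for _ in range(25):    # 25 small source commit groups arrive...
    grouper.source_commit()
grouper.flush()        # ...and are applied as only 3 target units of work
print(grouper.commits_issued)  # 3
```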

On the target system, CDC Replication reserves 50 PREPARE buffers per subscription for retaining prepared Dynamic SQL. For each mapped table in a subscription that is receiving changes, an INSERT, an UPDATE, or a DELETE change can be applied to the table. A subscription's PREPARE buffer is allocated to a table-and-change combination when the first such change for a specific table is received in the commit group. With 50 buffers per subscription and 3 change types per table, when more than 16 tables are mapped to a subscription, all 50 PREPARE buffers can be in use. If this happens, CDC Replication uses an LRU (Least Recently Used) algorithm to decide which PREPARE buffer is reused. This means that, even when commit groups are large, once a subscription has more than 16 tables mapped, the efficiency of the astute Dynamic SQL begins to drop.
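The buffer-pool behavior can be sketched with an LRU map; the class, the statement placeholders, and the eviction counter are illustrative, not the actual implementation. Once the distinct table-and-change combinations exceed the pool's capacity, buffers start being stolen and their statements lost:

```python
from collections import OrderedDict

# Hypothetical sketch of the 50-buffer PREPARE pool: prepared statements
# are keyed by (table, change type); on overflow the least recently used
# buffer is evicted, and its statement would have to be re-prepared.
class PrepareBufferPool:
    def __init__(self, capacity=50):
        self.capacity = capacity
        self.buffers = OrderedDict()  # (table, op) -> prepared statement
        self.evictions = 0            # prepared statements lost to reuse

    def get(self, table, op):
        key = (table, op)
        if key in self.buffers:
            self.buffers.move_to_end(key)      # mark as most recently used
        else:
            if len(self.buffers) >= self.capacity:
                self.buffers.popitem(last=False)  # evict the LRU buffer
                self.evictions += 1
            self.buffers[key] = f"prepared {op} for table {table}"
        return self.buffers[key]

pool = PrepareBufferPool()
# 17 tables x 3 change types = 51 combinations, one more than 50 buffers,
# so the pool overflows and eviction (hence re-preparation) begins.
for table in range(17):
    for op in ("INSERT", "UPDATE", "DELETE"):
        pool.get(table, op)
print(pool.evictions)  # 1
```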

One way to increase the efficiency of the apply process at the target system is to change user applications so that they contain appreciable numbers of changes in each commit group. This may be possible with batch jobs, but is impractical for commit groups generated by online transactions. With the Transaction Grouping feature, it is no longer necessary to modify application programs to increase the efficiency of replication; modifying them would, however, still improve the efficiency of the application programs themselves.

Another way to increase the efficiency of the apply process at the target system is to replace the astute Dynamic SQL that is not efficient with Static SQL coded in a user exit.

To summarize, these are the considerations for deciding whether CDC Replication's astute Dynamic SQL will become inefficient:

  • A "profile" of the DBMS Log's commit groups that contain data to be replicated is required. It is the Database Administrator's task to identify those replicated tables that are being changed in small commit groups, and to determine how many of these small commit groups will be created in a particular time period. Both the number of changes in each commit group and the number of such commit groups need to be considered to assess the efficiency of the apply process.
  • Subscriptions where more tables are selected than the configured number of PREPARE buffers can support will need to be examined. Is there a restriction on the types of changes (INSERT, UPDATE, and DELETE) that are being applied to a particular table? A table may receive a high proportion of UPDATE changes, but very few INSERT and DELETE changes within a subscription. This would tend to require only a single PREPARE buffer for that table in the commit group, increasing the efficiency of the subscription's astute Dynamic SQL. When the mix of tables combined with the types of changes to the tables exceeds the configured maximum, the efficiency begins to decrease, and this trend worsens as the table-and-change mix gets larger.
  • CDC Replication's Transaction Grouping feature can be engaged at the target to increase the minimum size of the logical units of work being applied. This will increase the efficiency of the astute Dynamic SQL.
  • The Db2 Dynamic Statement Cache should be turned on and have a sufficient size to cache up to the configured maximum statements per target subscription.