%GETCOL and %SELECT function calls and processing efficiency
%GETCOL or %SELECT function calls can be used in expressions and row filtering expressions. The %GETCOL function provides a mechanism to obtain current data from an application table, which is then used in the evaluation of the expression. %SELECT provides a similar capability to %GETCOL, but is more powerful and natural to use, due to its accepting an SQL SELECT statement to specify the current data to be accessed.
When processing a %GETCOL function call, CDC Replication prepares and executes SQL to obtain the data specified. Such ad hoc SQL can be very inefficient, depending on the access path that Db2® uses when the SQL is executed. To address this exposure, CDC Replication issues a Db2 EXPLAIN against the SQL at the time that the %GETCOL function is entered into or changed in an expression or row filtering expression. During installation, the CHCMDMUT sample job will have created a <CHCUserID>.PLAN_TABLE. Db2 stores the results of the EXPLAIN in this PLAN_TABLE, where it is accessed and examined by CDC Replication. If the ad hoc SQL uses an inefficient access path, then an appropriate warning message is issued in Management Console.
%GETCOL function call access path warning messages should not be ignored. For example, inefficient %GETCOL function calls used in row filtering expressions could slow data processing by the source environment to the extent that CDC Replication may be slower at obtaining data from the Db2 log than Db2 is in writing to it, producing ever-increasing replication latency. In addition to the failure to perform, %GETCOL function call access path problems can cause CDC Replication to absorb large amounts of CPU and I/O resources, negatively impacting other jobs running on the same server.
Some of the types of inefficiencies that can be indicated in the PLAN_TABLE are non-unique key and partial key accesses, index space scans, and table space scans. Non-unique and partial key accesses may or may not be acceptable, depending on the amount of data accessed by the partial key. Index space and table space scans should never be permitted. Assessment of these types of access path problem types is the normal responsibility of a Database Administrator. The DBA should be able to determine what is required to correct the access path situation. Usually, it can be corrected by building a new index with the required keys to allow a full key access to obtain the data. A different way to improve %GETCOL efficiency may be the use of different key columns or the addition of more key columns to the %GETCOL function call, if this is possible, causing the access path to change. This approach can avoid the building of another index, which would require Db2 buffer resources when it was being accessed.
Yet another approach would be to avoid the use of %GETCOL function calls entirely.
%SELECT is processed differently from %GETCOL. The SQL SELECT statement framework provided in a %SELECT function is not subjected to an EXPLAIN request, and no information about the efficiency of the SQL SELECT statement is passed back to the user. Of course, the dynamic SQL produced for the %SELECT statement can still employ an inefficient access path. It is the responsibility of the DBA to use the tools available (for example, EXPLAIN(YES) during binding and subsequent analysis of the PLAN_TABLE produced) to determine if the SELECT SQL statement created by the %SELECT statement executes efficiently.