Topic
2 replies Latest Post - ‏2013-02-11T22:06:10Z by SystemAdmin
SystemAdmin
SystemAdmin
5 Posts
ACCEPTED ANSWER

Pinned topic query optimization

‏2013-02-10T14:59:50Z |
I am curious as to how does query optimization take place in DB2 Temporal. Can someone explain or direct me to some resources?
Updated on 2013-02-11T22:06:10Z at 2013-02-11T22:06:10Z by SystemAdmin
  • MatthiasNicola
    MatthiasNicola
    7 Posts
    ACCEPTED ANSWER

    Re: query optimization

    ‏2013-02-10T23:32:45Z  in response to SystemAdmin
    Hi Maria,

    at a high level, the temporal constraints in a query, such as "FOR SYSTEM_TIME AS OF" or "FOR BUSINESS_TIME FROM ... TO ...", are compiled as predicates into the query execution plan. The DB2 optimizer then performs cost-based optimization of the plan, much like for any other SQL statements.

    You may want to use the db2exfmt tool to look at the execution plans of temporal queries, and compare the original vs. the optimized statement in the db2exfmt output.

    Thanks,

    Matthias


    Matthias Nicola
    http://www.tinyurl.com/pureXML
    http://nativexmldatabase.com/
  • SystemAdmin
    SystemAdmin
    5 Posts
    ACCEPTED ANSWER

    Re: query optimization

    ‏2013-02-11T22:06:10Z  in response to SystemAdmin
    Hi Maria,

    On DB2 10 for z/OS, the cost-based optimization is done as usual, there are some optimizations including following query transformations:

    1. Merge of BUSINESS_TIME table expression

    Assume --
    att is an application-period temporal table, and bus_start/bus_end are its BUSINESS_TIME
    period columns. rt is another table.

    Given query --
    SELECT ... FROM att FOR BUSINESS_TIME AS OF exp, rt where att.id = rt.id;

    It is equivalent to query --
    SELECT ... FROM (SELECT ... FROM att
    WHERE bus_start <= exp AND bus_end > exp) att, rt where att.id = rt.id;

    DB2 for z/OS merges the table expression and evaluates predicates bus_start <= exp AND bus_end > exp before join att to rt --
    SELECT ... FROM att, rt where att.id = rt.id;
    ==> the table expression is gone, internally the predicates are evaluated before the inner join

    2. Union distribution of SYSTEM_TIME table expression, followed by further table expression
    merge of each branch

    Assume --
    policy_info_stt is a system-period temporal table, and sys_start/sys_end are its SYSTEM_TIME
    period columns. policy_owner_id is a column of it too. stt_history is the history table of
    policy_info_stt. customer is another table.

    Given query --
    SELECT ...
    FROM customer, policy_info_stt FOR SYSTEM_TIME FROM exp1 TO exp2
    WHERE customer_id = policy_owner_id ;

    It is equlivalent to query --
    SELECT ...
    FROM customer, (SELECT ... FROM (SELECT ... FROM policy_info_stt
    UNION ALL
    SELECT ... FROM stt_history) policy_info_stt
    WHERE sys_start < exp2 AND
    sys_end > exp1) policy_info_stt
    WHERE customer_id = policy_owner_id ;

    Continue the query transformations --

    SELECT ...
    FROM customer, (SELECT ... FROM (SELECT ... FROM policy_info_stt) policy_info_stt
    WHERE sys_start < exp2 AND
    sys_end > exp1
    UNION ALL
    SELECT ... FROM (SELECT ... FROM stt_history) stt_history
    WHERE sys_start < exp2 AND
    sys_end > exp1) policy_info_stt
    WHERE customer_id = policy_owner_id ;

    SELECT ...
    FROM customer,(SELECT ... FROM policy_info_stt
    WHERE sys_start < exp2 AND
    sys_end > exp1
    UNION ALL
    SELECT ... FROM stt_history
    WHERE sys_start < exp2 AND
    sys_end > exp1) policy_info_stt
    WHERE customer_id = policy_owner_id ;
    SELECT ...
    FROM customer, policy_info_stt
    WHERE sys_start < exp2 AND
    sys_end > exp1 AND
    customer_id = policy_owner_id
    UNION ALL
    SELECT ...
    FROM customer, stt_history
    WHERE sys_start < exp2 AND
    sys_end > exp1 AND
    customer_id = policy_owner_id ;

    3. Predicate push down with materialized SYSTEM_TIME table expression

    Assume --
    policy_info_stt is a system-period temporal table, and sys_start/sys_end are its SYSTEM_TIME
    period columns. coverage and policy_owner_id are columns of it too. stt_history is the
    history table of policy_info_stt. customer is another table.

    Given query --
    SELECT DISTINCT coverage, ... ...
    FROM customer, policy_info_stt FOR SYSTEM_TIME FROM exp1 TO exp2
    WHERE customer_id = policy_owner_id AND coverage > 80000 ;

    Regarding this query, the union all table expression cannot be merged because of DISTINCT keyword. To improve filter factor, predicate coverage > 80000 is pushed down to the implicit union all table expression, same as the implicit SYSTEM_TIME temporal predicates --

    SELECT DISTINCT coverage, ... ...
    FROM customer, (SELECT ... FROM policy_info_stt
    WHERE sys_start < exp2 AND
    sys_end > exp1 AND
    coverage > 80000
    UNION ALL
    SELECT ... FROM stt_history
    WHERE sys_start < exp2 AND
    sys_end > exp1 AND
    coverage > 80000) policy_info_stt
    WHERE customer_id = policy_owner_id ;

    Thanks,
    Xiaohong