Explain information for column-organized tables
Explain information is captured to support column-organized tables. You can use this information to determine how your application performs when it uses this functionality.
The CTQ plan operator represents the transition between column-organized data processing and row-organized data processing.
- Set the EXPLAIN mode on by using the CURRENT EXPLAIN MODE special
register as follows:
db2 SET CURRENT EXPLAIN MODE YES
- Issue your query against column-organized tables.
- Issue the db2exfmt command to format the contents
of the explain tables and obtain the access plan. The following example
shows you how to use this command against the SAMPLE database:
db2exfmt -d sample -1 -o output.exfmt
Improving the performance of queries on column-organized tables by using indexes
Some queries that have selective search conditions might run faster when accessing column-organized tables if indexes are used. Unique indexes are implicitly created to support enforced primary and unique key constraints. Starting in Db2® version 11.1 Mod Pack 3 and Fix Pack 3 (126.96.36.199), indexes can also be explicitly created by using the CREATE INDEX statement. Before Db2 188.8.131.52, unique indexes were used for select, update, or delete operations that affect only one row in a column-organized table. An example is shown in the following index access plan:
Rows RETURN ( 1) Cost I/O | 1 CTQ ( 2) 41.3466 6 | 1 NLJOIN ( 3) 41.3449 6 /----+-----\ 1 1 CTQ TBSCAN ( 4) ( 6) 6.91242 34.4325 1 5 | | 1 98168 IXSCAN CO-TABLE: VICCHANG ( 5) /BIC/SZCCUST 6.91242 Q1 1 | 98168 INDEX: VICCHANG /BIC/SZCCUST~0 Q1
This plan is equivalent to a FETCH-IXSCAN combination that is used to access row-organized data. For index access to column-organized data, row-organized data processing retrieves the rowid from the index by using IXSCAN(5) and passes it to column-organized data processing using CTQ(4). CTQ(4) represents a column-organized table queue that passes data from row-organized data processing to column-organized data processing. TBSCAN(6) locates the columns that are identified by the rowid. TBSCAN(6) might apply additional predicates if necessary, or reapply the IXSCAN predicates in some situations. Specifically, if the table is being accessed under the UR isolation level, or the access is in support of an update or delete operation, the TBSCAN needs to apply only those predicates that were not already applied by the IXSCAN. Otherwise, the TBSCAN needs to reapply all of the IXSCAN predicates. NLJOIN(3) represents the process of retrieving the rowid from row-organized data processing and passing it to the column-organized TBSCAN.
- The result of the index access is joined to another column-organized table
- The index scan returns at most one row
- Removal of duplicate rows
- Sorting by using column-organized processing
The explain representation for a column-organized FETCH is similar to that of a row-organized FETCH, except for arguments that are not applicable to column-organized processing. The example that is shown previously for a query that uses an index before Db2 184.108.40.206 would appear as the following when a FETCH operator is used:
Rows RETURN ( 1) Cost I/O | 1 FETCH ( 2) 17.0787 1 /----+----\ 1 98168 IXSCAN CO-TABLE: VICCHANG ( 3) /BIC/SZCCUST 6.91242 Q1 1 | 98168 INDEX: VICCHANG /BIC/SZCCUST~0 Q1
- Jump scans
- Deferred fetch index plans (index ANDing, ORing, and list prefetch)
- Star join and zigzag join
- Scan Sharing
Intra-partition parallel index scans are not supported for column-organized tables.
Update and delete operations that use an index scan on a column-organized table are not supported by the FETCH operator. Update and delete operations that affect only a single row are supported by using either index-only access or the nested-loop fetch approach.
Common table expression
A common table expression defines the result of a table that you can specify in the FROM clause of an SQL Statement. Statements with common table expressions against column-organized tables can have more efficient execution plans.
The following sample execution plan correspond to that query.
WITH cse(c1,c2) AS (SELECT t1.c1, MAX(c2) FROM t1 GROUP BY t1.c1) SELECT a.c2 FROM cse a, cse b WHERE a.c1 = b.c2;
The execution plan includes the TEMP(6) operator, which materializes the results of the common table expression during column-organized data processing. Operators TBSCAN(5) and TBSCAN(9) scan the output of the TEMP(6) operator and send the data to the HSJN(4) operator. Afterward, the CTQ(3) operator sends the results of the join operation from column-organized data processing to row-organized data processing.
Rows RETURN ( 1) Cost I/O | 25 LTQ ( 2) 2078.07 30 | 25 CTQ ( 3) 2074 30 | 25 ^HSJOIN ( 4) 2073.79 30 /-+--\ 25 25 TBSCAN TBSCAN ( 5) ( 9) 1036.85 1036.85 15 15 | | 25 25 TEMP TEMP ( 6) ( 6) 1033.24 1033.24 15 15 | 25 GRPBY ( 7) 1032.45 15 | 50000 TBSCAN ( 8) 792.149 15 | 50000 CO-TABLE: URSU T1 Q1
SELECT * FROM tc1 ORDER BY c1,c2 SELECT rank() OVER (PARTITION BY c1 ORDER BY c2) AS rnk, c2 FROM tc1
A column-organized sort is typically executed in parallel using multiple database agents and can use different methods to distribute the data among the agents, depending on the semantics of the SQL statement. The type of parallel sorting method is indicated by the SORTTYPE argument of the SORT operator along with the sort key columns and the sort partitioning columns. The SORTTYPE argument can have the values GLOBAL, PARTITIONED, or MERGE for a column-organized SORT.
SELECT * FROM tc1 ORDER BY c1, c2
Rows RETURN ( 1) Cost I/O | 1000 LMTQ ( 2) 351.462 10 | 1000 CTQ ( 3) 289.471 10 | 1000 TBSCAN ( 4) 288.271 10 | 1000 SORT ( 5) 278.209 10 | 1000 TBSCAN ( 6) 85.5805 10 | 1000 CO-TABLE: DB2USER TC1 Q1
Arguments: --------- SORTKEY : (Sort Key column) 1: Q1.C1(A) 2: Q1.C2(A) SORTTYPE: (Intra-Partition parallelism sort type) GLOBAL
SELECT c1, c2, c3, MAX(c1) OVER (PARTITION BY c2), MAX(c1) OVER (PARTITION BY c2, c3) FROM tc1 ORDER BY c2
Rows RETURN ( 1) Cost I/O | 1000 LMTQ ( 2) 466.38 10 | 1000 CTQ ( 3) 411.188 10 | 1000 TBSCAN ( 4) 409.588 10 | 1000 SORT ( 5) 399.527 10 | 1000 TBSCAN ( 6) 288.271 10 | 1000 SORT ( 7) 278.209 10 | 1000 TBSCAN ( 8) 85.5805 10 | 1000 CO-TABLE: DB2USER TC1 Q1
Arguments: --------- PARTCOLS: (Table partitioning columns) 1: Q2.C2 SORTKEY : (Sort Key column) 1: Q2.C2(A) 2: Q2.C3(A) SORTTYPE: (Intra-Partition parallelism sort type) PARTITIONED
Arguments: --------- SORTKEY : (Sort Key column) 1: Q3.C2(A) SORTTYPE: (Intra-Partition parallelism sort type) MERGE
SELECT c1, c2, c3, MAX(c1) OVER (PARTITION BY c2), MAX(c1) OVER (PARTITION BY c2, c3) FROM tc1
Rows RETURN ( 1) Cost I/O | 1000 LTQ ( 2) 421.526 10 | 1000 CTQ ( 3) 383.134 10 | 1000 TBSCAN ( 4) 288.271 10 | 1000 SORT ( 5) 278.209 10 | 1000 TBSCAN ( 6) 85.5805 10 | 1000 CO-TABLE: DB2USER TC1 Q1
Arguments: --------- PARTCOLS: (Table partitioning columns) 1: Q2.C2 SORTKEY : (Sort Key column) 1: Q2.C2(R) 2: Q2.C3(A) SORTTYPE: (Intra-Partition parallelism sort type) PARTITIONED
|Agent 1||Agent 1||Agent 1||Agent 1|
Since the window specification for both OLAP functions is PARTITION BY rather than ORDER BY, and the outer sub-select doesn't have an ORDER BY clause, strict order on C2 and C3 does not need to be produced by the SORT. Distinct values of C2 just need to be processed by the same database agent. For example, all rows with values C2 = 5 must be processed by the same agent in order to properly determine the maximum value of C1 for that group of values. Limiting the sorting to values of C3 within distinct values of C2 reduces the memory required to perform the sort. Performance might also be improved because partitions can be emitted out of sequence, avoiding the synchronization overhead to emit them in order.
Since the data is not strictly ordered on C2, the order indicator is set to "R" indicating that it is randomly ordered.
A single SORT operation can provide the partitioning needed for multiple OLAP functions if the partitioning specifications are identical or are proper subsets of each other. When this is the case, the SORT will be partitioned on the the smallest set of columns that is a proper subset of all other OLAP partitioning specifications.