Nickname parallel fetching vs Nickname serial fetching

Executing SQL query in parallel or serial is significantly different from the access plan perspective. You can take a non-partitioned table and partitioned table as example. For query against non-partitioned table you don’t need to consider partition key, where the data are stored and how to transport and process data etc. Nickname has very similar attributes, when you are using serial fetching, the nickname data only get fetched from coordinator partition and normally you don’t need to consider how the nickname data is partitioned. While you turn on interpartition parallelism with nickname parallel fetching feature, the Db2® query compiler takes nickname as scatter partitioned table or replicate partitioned table, and query compiler generate parallel access plan accordingly.

Parallel access plan will be distributed to all the applicable partitions and then get executed, while a serial access plan will only be executed in coordinator partition. Let’s take SQL query “select c1 from n1 group by c1” as an example to see the difference of the two plans.

The following is the access plan of parallel fetching, from the access plan we can see there is an extra GRPBY operator above the SHIP, this doesn’t means the ‘group by’ is not pushed down if you examine the remote statement later you will find it is already pushed down to remote data source. The extra GRPBY operator is used to do final ‘group by’ for the data returned from different partitions. And there is also a MDTQ operator above the SHIP, this is the key difference between a parallel access plan and a serial access plan, a parallel access plan always has TQ operator above the SHIP. A table queue (TQ) is used to flow table data from one database partition to another when multiple database partitions are involved when processing a query. Multiple database partitions are used to process a query when parallelism is involved. In nickname parallel fetching it explores all applicable database partitions to process a query, so TQ is always needed.

            Rows
           RETURN
           (   1)
            Cost
             I/O
             |
              1
           GRPBY
           (   2)
           67.7084
              1
             |
             10
           MDTQ
           (   3)
           67.5984
              1
             |
             2.5
           SHIP
           (   4)
           64.0085
              1
             |
            1000
      NICKNM: NEWTON
             N1
             Q1

When you examine the plan, you can find the SHIP operator and check the remote statement, in parallel fetching, the remote statement is different from the one in serial fetching. In serial fetching you can copy the remote statement and execute it in remote data source directly. But in parallel fetching, the remote statement cannot be executed in remote data source. The remote statement is a SQL template which defines how the original SQL query will be split. While each partition receives the plan, every partition will populate the SQL template with correct conditions so that the SQL can be executed in remote data source and most importantly only fetch partial data.

The following is serial access plan:

            Rows 
           RETURN
           (   1)  
            Cost 
             I/O 
             |   
             10  
           SHIP  
           (   2)  
           121.908 
              1   
             |   
            1000 
      NICKNM: NEWTON 
             N1  
             Q1

From this access plan we can see there is no extra operators above the SHIP (In most cases, there will be operators above the SHIP as some operations cannot be pushed down). And if we check the remote statement, we can see the remote statement is basically same with the original SQL query. Here is the remote statement of this access plan: SELECT A0."C1" FROM "DB2INST1"."T1" A0 GROUP BY A0."C1" FOR READ ONLY.