Question & Answer
Question
How does the planner determine which tables to pre-broadcast?
Answer
What is pre-broadcasting?
Netezza Release 4.6 and later introduced a new phase of planning called pre-broadcasting. A pre-broadcast table is a table whose rows are replicated, as a broadcast copy, on all dataslices (active SPUs) in an NPS system. The presence of a broadcast copy speeds up queries (typically joins) that require rows of the table to be broadcast. The pre-broadcast phase examines the smaller dimension tables involved in a query and gathers additional information about their actual size.
The name pre-broadcast is a bit misleading: a table is usually broadcast when it is joined to another, larger table and is distributed differently. The pre-broadcast phase of planning may find that the dimension table is already co-located with the larger tables in the query, in which case it simply materializes the table on the swap partition using a saveTempNode (visible in the plan file).
What is the benefit of pre-broadcasting an object?
Before pre-broadcasting, transient table sizes were calculated using estimated row counts and the widths of the columns involved. For queries that relied on statistics (fixed or JIT), even slight inaccuracies could produce a different, less optimal plan, and a less optimal plan could manifest itself as poor query performance.
With pre-broadcasting active, the actual sizes of the tables being pre-broadcast are known by the Optimizer when putting together the query's execution plan, which results in far more efficient query plans.
Which tables are eligible?
In short, all tables are potentially eligible initially. The system calculates the estimated size of the table, taking into account the data types of the columns involved and the effect any restrictions might have on the total row count. It then compares that result to the prebroadcast_limit system setting, whose value is a percentage of the snippet memory cap. The default value is 10 (that is, 10%), which works out to a little more than 18 MB. Reducing this value to 5 (5%) would reduce the limit to roughly 9 MB.
If the planner finds a table to be small enough to pre-broadcast, an additional plan is scheduled to pre-broadcast the table.
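The eligibility test described above can be sketched as follows. This is an illustrative sketch only: the function names, the snippet_memory_cap_bytes parameter, and the cap value are assumptions rather than actual NPS internals; the cap here is chosen so that the default 10% budget comes to a little over 18 MB, matching the figure quoted above.

```python
# Illustrative sketch of the pre-broadcast eligibility test.
# Names and the cap value are assumptions, not actual NPS internals.

def estimated_table_bytes(row_count, projected_col_widths):
    """Estimated transient-table size: row count x width of projected columns."""
    return row_count * sum(projected_col_widths)

def is_prebroadcast_eligible(row_count, projected_col_widths,
                             snippet_memory_cap_bytes,
                             prebroadcast_limit_pct=10):
    """A table qualifies when its estimated size fits within the configured
    percentage of the snippet memory cap (default 10%)."""
    budget = snippet_memory_cap_bytes * prebroadcast_limit_pct // 100
    return estimated_table_bytes(row_count, projected_col_widths) <= budget

# Hypothetical cap chosen so that 10% of it is ~18.7 MB:
cap = 196_075_520
# A 1.36M-row table projecting two 4-byte columns is ~10.4 MB -> eligible.
print(is_prebroadcast_eligible(1_358_543, [4, 4], cap))  # True
```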
How does the planner determine how large a table is?
The planner uses one of three methods to determine table size, which ultimately determines whether or not a table is eligible for pre-broadcasting.
1. For tables with a base size smaller than the prebroadcast_limit: The planner uses the raw size of the transient table. This value is calculated by multiplying the width of the columns involved by the number of rows in the table (according to the system catalog).
2. For tables with a base size larger than the prebroadcast_limit but with fewer rows than jit_stats_min_rows (default 5M): The planner estimates the size of the table using fixed statistics. It then checks for any restrictions against the table and computes the estimated row count multiplied by the row width (only for columns involved in this query).
3. For tables larger than jit_stats_min_rows: The planner uses a JIT sample scan to produce the estimated row count, which it then multiplies by the row width to determine the transient table's size.
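The three cases above can be sketched in Python. All function and parameter names here are illustrative assumptions, not planner internals; only the branching logic follows the description above.

```python
# Illustrative sketch of the three-way size estimation described above.
# Function and parameter names are assumptions, not planner internals.

JIT_STATS_MIN_ROWS = 5_000_000  # jit_stats_min_rows default

def estimate_transient_size(catalog_rows, fixed_stats_rows, jit_sample_rows,
                            row_width, prebroadcast_limit_bytes):
    # 1. Base size already under the limit: trust the catalog row count.
    if catalog_rows * row_width < prebroadcast_limit_bytes:
        return catalog_rows * row_width
    # 2. Base size over the limit but fewer rows than jit_stats_min_rows:
    #    use fixed statistics (post-restriction row estimate).
    if catalog_rows < JIT_STATS_MIN_ROWS:
        return fixed_stats_rows * row_width
    # 3. Very large tables: estimate the row count with a JIT sample scan.
    return jit_sample_rows * row_width
```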
How does the pre-broadcast planner deal with incorrect estimates?
The planner takes an extra precautionary step to prevent pre-broadcasting very large tables. This step involves appending a limitNode to the scanNode of the table being considered for pre-broadcasting.
For example, if the query references only a single integer column from a given table, then the planner imposes a limit of n, where n is the pre-broadcast byte limit divided by 4 bytes (the size of an integer). The resulting value is the maximum number of rows that can be pre-broadcast. If, during the scan operation, the planner finds that more rows exist than this limit allows, then the pre-broadcast fails and the table is instead scanned during the main query execution plan.
The following plan snippet, projecting an integer and date column, illustrates the process:
SQL: Phase 1 Pre-Broadcast: DIM_DATE
1[00]: spu ScanNode table "MYDATABASE.ADMIN.DIM_DATE" 1358543 memoryMode=no flags=0x0 index=0 cost=24 (o)
1[01]: spu RestrictNode (non-NULL)
1[02]: spu ProjectNode, 2 cols, projectFlags=0x0
1:DATE_VAL 0:DATE_ID
1[03]: spu LimitNode limit=2450944 offset=%
1[04]: spu ReturnNode
If you were to run the same query but project only the DATE_ID column, you would halve the projected row width.
SQL: Phase 1 Pre-Broadcast: DIM_DATE
1[00]: spu ScanNode table "MYDATABASE.ADMIN.DIM_DATE" 1358543 memoryMode=no flags=0x0 index=0 cost=24 (o)
1[01]: spu RestrictNode (non-NULL)
1[02]: spu ProjectNode, 1 cols, projectFlags=0x0
0:DATE_ID
1[03]: spu LimitNode limit=4901888 offset=%
1[04]: spu ReturnNode
Halving the projected row width doubles the number of rows that fit into this reserved space. Likewise, increasing the prebroadcast_limit by a factor of two doubles the allowed row count. Using the previous example with the prebroadcast_limit doubled:
SQL: Phase 1 Pre-Broadcast: DIM_DATE
1[00]: spu ScanNode table "MYDATABASE.ADMIN.DIM_DATE" 1358543 memoryMode=no flags=0x0 index=0 cost=24 (o)
1[01]: spu RestrictNode (non-NULL)
1[02]: spu ProjectNode, 1 cols, projectFlags=0x0
0:DATE_ID
1[03]: spu LimitNode limit=9804032 offset=%
1[04]: spu ReturnNode
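The LimitNode values in these snippets are consistent with a single byte budget divided by the projected row width, assuming a 4-byte INTEGER and a 4-byte DATE. A quick arithmetic check in Python:

```python
# Row limit = pre-broadcast byte budget / projected row width.
# Assumed column widths: INTEGER = 4 bytes, DATE = 4 bytes.

budget = 2_450_944 * 8          # two 4-byte columns projected
assert budget == 4_901_888 * 4  # one 4-byte column: same budget, twice the rows
print(budget)                   # 19607552 bytes, a little over 18 MB

# With prebroadcast_limit doubled, the row limit roughly doubles as well
# (9,804,032 is within a few hundred rows of 2 * 4,901,888, so the planner
# appears to round the budget internally).
assert round(9_804_032 / 4_901_888) == 2
```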
Multi-phase planning is an important feature of Netezza; it is a shift from the more traditional approach of relying on fixed, point-in-time statistics to produce query execution plans. By using various sampling techniques and collecting real-time information about the tables involved, the planner produces far better plans, which results in significant performance improvements.
Historical Number
NZ776085
Document Information
Modified date:
17 October 2019
UID
swg21569663