Just in Time statistics

Transparently to users, the system automatically generates Just in Time (JIT) statistics on user tables to help the optimizer refine planning.

JIT statistics are not run on system tables, external tables, or virtual tables. JIT statistics improve selectivity estimations when a table contains data skew or when there are complex column/join restrictions. The system also uses JIT statistics to avoid broadcasting large tables that were estimated to be small based on available statistics. The overhead of these on-the-fly statistics is negligible when compared to the overall improved query performance and total query time.

JIT statistics use sampler scan functionality and zone map information to conditionally collect several pieces of information:

The number of rows that are scanned for the target table
The number of extents that are scanned for the target table
The number of maximum extents that are scanned for the target table on the data slices with the greatest skew
The number of rows that are scanned for the target table that apply to each join
The number of unique values for any target table column that is used in subsequent join or group by processing

This information is conditionally requested for and used in estimating the number of rows that result from a table scan, join, or “group by” operation.

Note: JIT statistics do not eliminate the need to run the GENERATE STATISTICS command. While JIT statistics help guide row estimation, there are situations where the catalog information calculated by GENERATE STATISTICS is used in subsequent calculations to complement the row estimations. Depending on table size, the GENERATE STATISTICS process does not collect dispersion because the JIT statistics scan estimates it on-the-fly as needed.

The system automatically runs JIT statistics for user tables when it detects the following conditions:

Tables that contain more than 5,000,000 records.
Queries that contain at least one column restriction.
Tables that participate in a join or have an associated materialized view. JIT statistics are integrated with materialized views to ensure that the exact number of extents is scanned.

The system runs JIT statistics even in EXPLAIN mode. To check whether JIT statistics were run, review the EXPLAIN VERBOSE output and look for cardinality estimations that are flagged with the label JIT.