In it's simplest terms, a Zone Map is a persistent table maintained by the Netezza system that contains information about the data in a table created by a user. A single row entry in the Zone map table would correspond to an extent (Storage) of the User table. By default, Netezza would record information about integeral and temporal columns from the user's table in it's corresponding zone map. For each qualified column, Netezza would maintain the maximum and minimum value of the column that is stored in an extent (3MB).
Other than integral and temporal types, Netezza also maintains character column info (first 8 bytes) in zone maps if that column is mentioned in "order by" clause of a materalized view. If your table has hundreds of columns, remember that Netezza would maintain zone map info only for the first 200 qualified columns (its configurable).
Netezza does not create zone map info for numeric types that are larger than 8 bytes (even if it's part of an "order by" clause of a materialized view). Check if the smaller numeric coulmns can be converted to integral types to exploit zone maps.
To check the columns of a table (that qualify for zone map) named "mytab" in "mydb" database,you can use nz_zonemap from Netezza toolkit (ask Support if you do not have this):
nz_zonemap mydb mytab
To see the zone map entries (ranges in an extent) for a column (say col1), you could use :
nz_zonemap mydb mytab col1
As new rows are inserted into the table or updated, qualified entries in the zone map are updated for the new ranges. An update in a zone map is an "in place" update, unlike in a user table which is a combination of a delete and an insert. Delete operations on the user table do not reflect in the zone maps but you can do groom or generate statistics to refresh it.
Zone Maps give best results if the table is ordered or nearly ordered on the restricting/filter columns (WHERE condition). This would lead to efficient I/O since the qualified rows would be localized within a few extents and the rest of the data need not be scanned. Temporal data loaded sequentially falls into an ordered pattern very well and can exploit zone maps. Also, Netezza does not bother creating zone maps for very small tables (< 10MB by default, this is configurable and may depend on which version of Netezza you are on).
At the other end, you could receive an error if the table exceeds 192GB on a dataslice. In such cases, you may want to:
- GROOM the table to remove deleted data.
- Redistribute the data if the high skew is causing excess data in a dataslice.
- break the table into smaller tables and create a view with a UNION ALL operation.
When investigating poor query performance, do remember to check for disabled zone maps or too may deleted rows or unordered data. Zone maps get disabled during "generate statistics". Interrupting this command may lead to disabled zone maps which would not be referred during scans. In that case, ensure that you run a full generate statistics for at least one column to re-enable the zone map for that table.
Unfortunately, a query plan would not reveal if a zone map was used or not. The ZoneMap Debug tool helps you log information about zone map usage. To enable this tool for a single session :
set enable_zmap_debug=1 in nzsql (or through jdbc/odbc client).
Set the flag to 0 to disable it. If you need to debug all the sessions in the system, use enableZMapDbgWrite switch through the "nzsystem set" command :
nzsystem set –args system.enableZMapDbgWrite=1
The debug info would be written into the following virtual tables which you could query (thanks to Kapil Maheshwari for guiding me to the right tables) :
For older systems, this info could be accessed through the following virtual tables :