For the system to create the best execution plan for a query, it evaluates what it knows about the database tables that it accesses. Without up-to-date statistics, the system uses internal, default values that are independent of the actual table and which result in suboptimal queries with long run times.
|Per table||Number of records|
- Based on the number of records, the number of distinct values for a column, and assuming uniform distribution between the min and max values, the optimizer estimates the number of relevant rows and determines which is the smaller of two join tables.
- Based on the min and max values, the optimizer determines what type of math needs to be done (for example, 64-bit or 128-bit).
- If there are nulls in the database tables, then during code generation the system must generate more code to test and check fields for null values. Extra code means extra CPU cycles during execution time.
The GENERATE STATISTICS command collects this information. If you have the GenStats privilege, you can run this command on a database, table, or individual columns. By default, the admin user can run the command on any database (to process all the tables in the database) or any individual table.
SYSTEM(ADMIN)=> GRANT LIST ON TABLE TO user1; DEV(ADMIN)=> GRANT GENSTATS ON TABLE TO user1;
For more information about the GenStats privilege, see Table 1.
The following table describes the nzsql command syntax for these cases.
|A database (all tables)||GENERATE STATISTICS;|
|A specific table (all columns)||GENERATE STATISTICS ON table_name;|
|Individual columns in a table||GENERATE STATISTICS ON my_table(name, address, zip);|
The GENERATE STATISTICS command reads every row in every table to determine dispersion values (no sampling). It provides the most accurate and best quality statistics.