Data redistribution

Deployment options: Netezza Performance Server for Cloud Pak for Data System

After the platform and software expansion, if you chose online redistribution of tables, you can now run the redistribution by using the nz_redistribute command.

Offline data redistribution

After the topology expansion step, the newly added data slices are empty. The nzredrexpand command initiates redistribution of table data across the expanded set of data slices. Netezza Performance Server database access is unavailable during this process.

The redistribution process requires a small amount of space on each data slice; only about 1% of free space is necessary. The redistribution tool validates that each slice has the necessary amount of free space before you initiate redistribution.
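The free-space check described above can be pictured with a minimal sketch. This is not the redistribution tool's actual implementation: the function name, the input format, and the reading of "about 1%" as a fraction of each slice's capacity are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumption: "about 1% of free space" is read as 1% of each slice's capacity.
REQUIRED_FREE_FRACTION = 0.01


def slices_lacking_space(free_fraction_by_slice: Dict[int, float],
                         required: float = REQUIRED_FREE_FRACTION) -> List[int]:
    """Return the IDs of data slices whose free fraction is below the requirement."""
    return sorted(
        slice_id
        for slice_id, free in free_fraction_by_slice.items()
        if free < required
    )


if __name__ == "__main__":
    # Hypothetical sample data: slice ID -> fraction of the slice that is free.
    sample = {1: 0.25, 2: 0.004, 3: 0.01, 4: 0.12}
    short = slices_lacking_space(sample)
    if short:
        print("Not enough free space on data slices:", short)
    else:
        print("All data slices have enough free space.")
```

In this sketch, redistribution would proceed only if the returned list is empty.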

The redistribute job runs a sample test to estimate the distribution rate. It then identifies the sizes of all the existing tables in the system. With this information, the redistribute job estimates the total time for redistribution.
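As a rough illustration of that estimate, the sketch below divides the total table size by the rate measured in a sample run. The function name and the simple linear model are assumptions; the actual job might weigh additional factors.

```python
from typing import Iterable


def estimate_redistribution_seconds(table_sizes_bytes: Iterable[int],
                                    sample_bytes: int,
                                    sample_seconds: float) -> float:
    """Estimate total redistribution time from a sample run.

    Assumes the rate observed in the sample holds for every table.
    """
    if sample_bytes <= 0 or sample_seconds <= 0:
        raise ValueError("sample size and duration must be positive")
    rate_bytes_per_second = sample_bytes / sample_seconds
    return sum(table_sizes_bytes) / rate_bytes_per_second


if __name__ == "__main__":
    # Hypothetical numbers: two tables, and a sample of 1 GB moved in 10 seconds.
    sizes = [4_000_000_000, 2_000_000_000]
    seconds = estimate_redistribution_seconds(sizes, 1_000_000_000, 10.0)
    print(f"Estimated redistribution time: {seconds:.0f} seconds")
```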

The redistribute job iterates over each table in the system, redistributing each in turn. Redistribution is performed incrementally, a few extents at a time. The process guarantees no loss of extents or duplication of extents, even if there are interruptions or hardware failures.
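The idea of incremental, restartable processing can be sketched with a toy model. It only illustrates why a checkpoint that advances together with each batch avoids lost or duplicated extents after an interruption; the function, its parameters, and the simulated failure are hypothetical and do not describe the product's internal mechanism.

```python
from typing import List, Optional


def redistribute_table(extents: List[str],
                       moved: List[str],
                       checkpoint: int,
                       batch_size: int = 2,
                       fail_after_batches: Optional[int] = None) -> int:
    """Move extents[checkpoint:] into `moved` a few at a time.

    Returns the new checkpoint (the index of the next extent to move).
    `fail_after_batches` simulates an interruption after that many batches.
    """
    batches_done = 0
    while checkpoint < len(extents):
        if fail_after_batches is not None and batches_done == fail_after_batches:
            return checkpoint  # simulated interruption; progress so far is kept
        batch = extents[checkpoint:checkpoint + batch_size]
        moved.extend(batch)
        checkpoint += len(batch)  # checkpoint advances together with the batch
        batches_done += 1
    return checkpoint


if __name__ == "__main__":
    extents = ["e0", "e1", "e2", "e3", "e4"]
    moved: List[str] = []
    cp = redistribute_table(extents, moved, 0, fail_after_batches=1)  # interrupted
    cp = redistribute_table(extents, moved, cp)                       # resumed
    print("All extents moved exactly once:", moved == extents)
```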

The redistribution execution plan for each table consists of scanning the table and redistributing records based on the table's distribution key, with the distribution keys rehashed according to the new data slice count. Tables that are distributed on random are distributed round-robin among the new data slices. Chunked random distribution mode is recognized when it is enabled.
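A minimal sketch of the two placement rules follows. The hash function (CRC32), the modulo mapping, and the zero-based slice indexes are stand-ins chosen for illustration; the hash that Netezza Performance Server actually uses is not described here.

```python
import zlib
from typing import List


def slice_for_key(distribution_key: str, slice_count: int) -> int:
    """Map a distribution key to a zero-based data slice index.

    CRC32 is an illustrative stand-in for the real hash function.
    """
    return zlib.crc32(distribution_key.encode("utf-8")) % slice_count


def round_robin_slices(row_count: int, slice_count: int) -> List[int]:
    """Assign rows of a randomly distributed table to slices in turn."""
    return [row % slice_count for row in range(row_count)]


if __name__ == "__main__":
    keys = ["cust-001", "cust-002", "cust-003"]
    old_count, new_count = 4, 8  # hypothetical slice counts before and after expansion
    for key in keys:
        print(key, slice_for_key(key, old_count), "->", slice_for_key(key, new_count))
    print(round_robin_slices(5, 3))
```

Because the slice index depends on the slice count, the same key can land on a different data slice after expansion, which is why every hash-distributed table has to be rescanned.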

If the redistribution step is interrupted for any reason, you can restart it by running nzredrexpand --resume. However, until the redistribution is complete, the system is unavailable for user queries. Once started, redistribution must run to completion, and it must be resumed if it is interrupted. You cannot go back to the old number of data slices after the process begins.

When the expansion process is completed, all host backups that were taken before the expansion can no longer be used.

Online data redistribution

Online data redistribution addresses the potentially long outage experienced by client applications. It opens up access to the data after the software expansion and allows data redistribution to proceed with workload management (WLM) resource limits of the user's choice.