Reordering data after migration
Learn how to order data after you migrated from PureData System for Analytics to Netezza Performance Server.
The PureData System for Analytics (TwinFin, Mako, Striper) and Netezza Performance Server boxes have different data slice configurations. To cut down on the amount of data that needed scanning, your PureData System for Analytcis might have relied on well ordered tables and zonemaps. However, when you migrate data from one generation of a system to another generation, the migrated data is not going to be migrated on the same data slices, nor in the same order. This includes nzrestore, nz_restore, nz_migrate and other migration commands.
Note that when data is moved, the natural order of the data is disrupted. It is important that the natural order of the migrated data is restored once the migration process is completed. This ensures that you don't encounter any issues after migration.
Checking data ordering
To check whether or not a query is (or isn't) making use of zonemaps, you can look at the
*.pln
file.
Detail 98620_21 table 158475/159304.76/160935 scan 18/19.75/23
This is an example of a good use of zonemaps. Only about 1/1000th of the whole data had to be scanned for this table/query.
Detail 2730534_81 table 33122/34768.89/37139 scan 30619/32226.61/34278
For this table/query, a full table scan had to be performed.
So this will tell you everything you need to know about THIS query. But if you want to compare "apples to oranges", then you really need a copy of that other plan as well. To see how much data it had to scan.
Data reordering
For this, select large tables, the ones with more than 10 million rows. The larger the table, the more important this is if you want to be able to reduce the amount of data that has to be scanned for any given query.
CBT tables
ORGANIZE ON
clause, run the GROOM
command:groom table <tableName> records all;
You might be already sorting your table as part of the ETL processing and you know what columns work best for you for organizing the data.
If not, you can run the
nz_sort_order script against the original source table. The script looks at the
existing zonemap information for the table and tries to compute a % sorted
for each
of the columns. That information can help you choose the proper sort key.
- Run the ALTER TABLE and ORGANIZE ON commands.
As a result, your table is now recognized as a clustered base table.
- Run the GROOM
command:
groom table <tableName> records all;
This command performs the actual sort/reorganization of the data in the table.
Alternatively, you can create a copy of the table and sort it on the fly by running the CREATE TABLE AS and ORDER BY commands. For more information about CTAS, see CREATE TABLE AS.