Preparing to expand Cloud Pak for Data System
Deployment options: Netezza Performance Server for Cloud Pak for Data System
This section outlines the preparatory procedures that are necessary before expansion.
Before you can expand your system, you must:
- complete multiple checks,
- prepare your data,
- ensure that you have enough space for the expansion, and
- choose a redistribution method.
Verify Netezza Performance Server and Cloud Pak for Data System versions
Ensure that your systems (Netezza Performance Server
and Cloud Pak for Data System) are on the following versions:
- For Cloud Pak for Data System: 1.0.8.0 or later.
- For Netezza Performance Server: 11.2.1.11 or later, but earlier than 11.2.2.0. This requirement applies only to online redistribution.
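The version rules above can be sketched as a small shell check. This is a minimal illustration, not an IBM-provided tool: the function names version_ge, nps_version_ok, and cpds_version_ok are hypothetical, and the version strings are assumed to be passed in by hand after you look them up on your system.

```shell
# Succeeds when dotted version $1 is greater than or equal to dotted version $2.
# Missing trailing components are treated as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Netezza Performance Server: 11.2.1.11 or later, but earlier than 11.2.2.0.
nps_version_ok() {
    version_ge "$1" 11.2.1.11 && ! version_ge "$1" 11.2.2.0
}

# Cloud Pak for Data System: 1.0.8.0 or later.
cpds_version_ok() {
    version_ge "$1" 1.0.8.0
}
```

For example, nps_version_ok 11.2.1.12 succeeds, while nps_version_ok 11.2.2.0 fails. The comparison is numeric per component, so 11.2.1.9 is correctly treated as older than 11.2.1.11.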
Pre-expansion checks
Note:
If the system is part of NRS:
- Find all the databases that are participating in NRS by running the nzdr list db command.
- Remove all those databases from replication by using the nzdr delete db command. To delete a database from NRS, see Deleting replication databases.
If the system is part of a replication setup:
- Remove the currently replicating database configuration from NRS.
- Set it up again from scratch to ensure that it works properly with the expanded system.
- Check system health.
- Complete the checks to ensure that the following conditions are met:
- System health is good.
- Check whether there are any issues:
nzds -issues
- Check whether the system needs to be rebalanced, which requires downtime, before the expansion:
nzds rebalance -check
- Check AP issues and ensure that there are no open alerts.
ap issues
- Check network health.
- Collect logs by running the following command. Ensure that there are no errors in the logs.
apdiag collect --components hw/switch/ network/
- Verify that the node model number was set correctly.
- Model numbers must be the same for all
nodes.
$ ap hw -d | grep -w node | awk -F '|' '{print $4 " "$7}'
Example:
$ ap hw -d | grep -w node | awk -F '|' '{print $4 " "$7}'
enclosure1.node1 7X21CTO1WW
enclosure1.node2 7X21CTO1WW
enclosure1.node3 7X21CTO1WW
enclosure1.node4 7X21CTO1WW
enclosure2.node1 7X21CTO1WW
enclosure2.node2 7X21CTO1WW
enclosure2.node3 7X21CTO1WW
enclosure2.node4 7X21CTO1WW
enclosure3.node1 7X21CTO1WW
enclosure3.node2 7X21CTO1WW
enclosure3.node3 7X21CTO1WW
enclosure3.node4 7X21CTO1WW
- Manually vacuum Netezza Performance Server a couple of days before the expansion.
- If you vacuum the system, you can shorten the redistribution time and, by extension, reduce the time that is needed to expand the system. Depending on the size of the catalog, the vacuum causes an extra system outage, for example, a 2-hour outage.
- Stop Netezza Performance Server:
nzstop
- Run a manual vacuum:
/nz/support/bin/nz_manual_vacuum
- Start Netezza Performance Server:
nzstart
- Check the speed of the disks.
- Run the following command and analyze its logs to identify slow disks. If you identify any slow disks, contact the Netezza Performance Server development team.
/nz/support/bin/nz_check_disk_scan_speeds -size 2 -cleanup
- After the command finishes, you can see the following
information:
Dropping table 'NZ_CHECK_DISK_SCAN_SPEEDS' now that the testing is complete.
DROP TABLE
Note: Certain actions may need to be repeated based on the outcomes of preceding
steps. For instance, if an evaluation reveals insufficient disk space to facilitate online
redistribution of all tables, consider optimizing disk space by cleaning databases, schemas, and
tables, followed by reassessing the available free disk space.
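The node model check from the system health steps can be automated rather than read by eye. The following is a minimal sketch under stated assumptions: the function name check_node_models is hypothetical, and it assumes the input is the two-column "node model" listing that the ap hw pipeline shown earlier prints.

```shell
# Reads "<node> <model>" pairs on stdin and verifies that exactly one
# distinct model number is present. Returns non-zero otherwise.
check_node_models() {
    # Keep only the model column and collapse it to distinct values.
    models=$(awk 'NF >= 2 {print $2}' | sort -u)
    # Count non-empty lines; "|| true" keeps grep's exit status from
    # aborting the script when there is no input at all.
    count=$(printf '%s\n' "$models" | grep -c . || true)
    if [ "$count" -eq 1 ]; then
        echo "OK: all nodes report model $models"
    else
        echo "MISMATCH: found $count distinct models" >&2
        return 1
    fi
}
```

A possible invocation pipes the documented command into the function:

ap hw -d | grep -w node | awk -F '|' '{print $4 " "$7}' | check_node_models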
Check RAID consistency
- Run the nzraidcheck command two days before the expansion during the system
idle time to detect bad pages or disk issues and validate primary and mirror data
consistency.
Contact IBM if there are any issues.
/nz/kit/bin/adm/tools/nzraidcheck -mode checkOnly
Unlock read-only databases
- The redistribution process needs exclusive write access to all databases. Unlock any locked
databases. First run the nz_redr_db_lock_info tool.
/nz/support/bin/nz_redr_db_lock_info -d <directory>
- If there are no locked databases, the output shows No database needs to be unlocked, and no further action is needed.
- If there are locked databases, the output shows Run the following before expansion starts, followed by nzsql commands. Run the first nzsql command to unlock the databases.
Note: When you unlock read-only databases that are part of incremental restores, they lose the ability to continue their incremental restore. It is recommended to drop such databases to reduce redistribution time.
Clean up unwanted databases, schemas, and tables
- Clean up unwanted databases, schemas, and tables (including grooming of tables that have many deleted rows). This reduces disk space consumption and data redistribution time.
Groom versioned tables
- Run the following command to groom versioned
tables:
/nz/support/bin/nz_altered_tables -groom
Prepare to preserve data order
- Prepare to preserve data order two days before expansion. During the expansion process, data is
redistributed across data slices. The natural order of data is changed and might impact query performance.
- Run the nz_sort_order tool against each database to obtain recommendations
for converting tables to CBTs (by adding organizing columns). For example:
/nz/support/bin/nz_sort_order <database Name> -recommend yes
- Save the output files with recommendations that add organizing columns and then groom the tables. The recommendations are needed after the expansion and redistribution.
Note: Preserving data order is not required if the system is expanding from x to 2x size. For example,
expanding from base+2 to base+4, base+4 to base+8, base+8 to base+16, or base+16 to base+32 does not require this
step. However, any other configuration, such as base+2 to base+6, base+4 to base+6, base+8 to
base+12, or base+32 to base+48, requires this step.
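The rule in the note above reduces to a single arithmetic test. The sketch below is illustrative only: needs_sort_order_prep is a hypothetical helper, and it assumes you pass the numeric part of the current and target configurations (for example, 4 and 6 for base+4 to base+6).

```shell
# Succeeds (exit 0) when the data-order preparation step is required,
# that is, when the target size is anything other than exactly double
# the current size.
needs_sort_order_prep() {
    current=$1
    target=$2
    [ "$target" -ne $((current * 2)) ]
}

# Example: base+4 to base+6 is not a doubling, so the step is required.
if needs_sort_order_prep 4 6; then
    echo "Run nz_sort_order against each database before expansion"
fi
```

With the inputs 2 and 4 the function fails, which matches the base+2 to base+4 case that the note lists as exempt.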
Backup databases
- Back up databases before expansion. If you perform regular backups, take the final increment before the Netezza Performance Server expansion.
Prepare for post-expansion data validation
- Prepare for post-expansion data validation. To capture row count, run the following
command:
nz_db_table_row_count
Tip: Select some reports or queries to run for data validation, performance comparison, and capture the results.
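One way to use the captured row counts is to save the command output to a file before the expansion, save it again afterward, and compare the two files. The sketch below assumes that the saved output is stable enough to compare line by line (for example, that it contains no timestamps); the function name compare_row_counts and the file names are hypothetical.

```shell
# Compares two saved row-count reports. Prints the differences and
# returns non-zero when the reports are not identical.
compare_row_counts() {
    before=$1
    after=$2
    if diff "$before" "$after" >/dev/null; then
        echo "Row counts match"
    else
        echo "Row counts differ:"
        # Show the differing lines; "|| true" because diff exits
        # non-zero whenever it finds a difference.
        diff "$before" "$after" || true
        return 1
    fi
}
```

A possible workflow: redirect the output of nz_db_table_row_count to row_counts_before.txt before the expansion and to row_counts_after.txt after redistribution, then run compare_row_counts row_counts_before.txt row_counts_after.txt.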
Ensure sufficient disk space
Note: This step is applicable only if online redistribution is chosen.
- Determine the number of data slices that the system will have after expansion. You can obtain
this from IBM or by multiplying the number of SPU enclosures by 96. For example, if expanding to a
Base+8 system, the new data slice count is 768.
Tip: To calculate the new data slice count: number of SPU enclosures * 96.
- Run the following command to determine whether all tables can be redistributed by using online
redistribution. Use the output when choosing a distribution method in Redistribution methods.
/nz/support/bin/nz_redistribute -SpaceEstimate <new data slice count>
- Output
- If there is sufficient disk space to redistribute all tables online:
nz_redistribute -SpaceEstimate <new data slice count>
# Of Dataslices
--------------------
Before Expansion: 120
After Expansion: 160
Dataslice Sizing
--------------------
Total Capacity: 195.31 GiB
USED (max): 133.64 GiB
FREE (min): 61.66 GiB
Largest Table
--------------------
Name: SAMPLE_DATABASE.ADMIN.CUSTOMER_ORDERS
DSlice AVG Storage: 21.07 GiB
DSlice MAX Storage: 27.95 GiB
Estimation Summation
--------------------
Total space needed (per dataslice) to do the online redistribution: 128.19 GiB
Should be adequate: 67.12 GiB to spare
- If the disk space is insufficient to redistribute all tables online:
nz_redistribute -SpaceEstimate <new data slice count>
# Of Dataslices
--------------------
Before Expansion: 120
After Expansion: 160
Dataslice Sizing
--------------------
Total Capacity: 195.31 GiB
USED (max): 181.02 GiB
FREE (min): 14.29 GiB
Largest Table
--------------------
Name: SAMPLE_DATABASE.ADMIN.WEB_HITS
DSlice AVG Storage: 50.83 GiB
DSlice MAX Storage: 61.56 GiB
Estimation Summation
--------------------
Total space needed (per dataslice) to do the online redistribution: 197.33 GiB
INSUFFICIENT BY: 2.02 GiB
Proceed to Redistribution methods, or try to free up more disk space to cover the INSUFFICIENT BY amount per data slice and then repeat the nz_redistribute -SpaceEstimate command before choosing the redistribution method.
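The slice-count arithmetic and the reading of the estimate can be sketched in shell. This is a hedged illustration: new_slice_count and space_estimate_ok are hypothetical helper names, and the second function assumes that the report contains either the "Should be adequate" or the "INSUFFICIENT BY" line exactly as in the sample outputs above.

```shell
# New data slice count = number of SPU enclosures * 96.
new_slice_count() {
    echo $(( $1 * 96 ))
}

# Reads an nz_redistribute -SpaceEstimate report on stdin.
# Returns 0 when space is adequate, 1 when it is insufficient,
# and 2 when neither summary line is found (for example, if the
# command failed and produced no report).
space_estimate_ok() {
    report=$(cat)
    case $report in
        *"INSUFFICIENT BY"*) return 1 ;;
        *"Should be adequate"*) return 0 ;;
        *) return 2 ;;
    esac
}
```

A possible invocation, using the 8-enclosure example from the text:

/nz/support/bin/nz_redistribute -SpaceEstimate "$(new_slice_count 8)" | space_estimate_ok

Treating a missing summary line as a separate exit code avoids mistaking an empty or failed report for a passing one.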