Expanding Netezza Performance Server for Cloud Pak for Data System
Expand your Netezza Performance Server for Cloud Pak for Data System to store more data and support increased parallelism for processing queries over many SPUs (Snippet Processing Units).
Expansion with Netezza Performance Server
- Hardware/Platform expansion
- Physically adding and connecting enclosures that hold additional SPUs and configuring these nodes.
- Software expansion
- Updating Netezza Performance Server system topology metadata to represent the additional SPUs, their attached NVMe disks, and the number and location of additional data slices.
Data redistribution - Overview
Data redistribution is the process of taking the existing rows and distributing them over the new set of data slices based on the distribution method of each table for correct query processing. Netezza Performance Server 11.2.1.11 supports both offline and online redistribution. Choose one of these two redistribution methods after expansion.
As noted in Distribution keys, performance of queries and workloads on Netezza Performance Server is to a very large extent affected by the distribution methods for various tables over the data slices that reside on the SPU disks. The distribution method for a given table is either random or hash, with the latter method hashing table rows on a set of up to four user-specified distribution key columns.
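The difference between the two distribution methods can be illustrated with a minimal Python sketch. It is not the product's implementation: the hash function (`zlib.crc32`), the helper names `hash_slice` and `random_slice`, the sample rows, and the count of 96 data slices are all assumptions chosen for illustration. It only shows that hashing on the distribution key columns sends rows with equal key values to the same data slice, while random distribution does not.

```python
import random
import zlib


def hash_slice(row, key_columns, num_slices):
    """Pick a data slice from a hash of the distribution key columns.

    zlib.crc32 is a stand-in hash for illustration only; the hash that
    Netezza Performance Server actually uses is not shown here.
    """
    if not 1 <= len(key_columns) <= 4:
        raise ValueError("a hash distribution uses one to four key columns")
    key = "|".join(str(row[column]) for column in key_columns)
    return zlib.crc32(key.encode("utf-8")) % num_slices


def random_slice(num_slices, rng):
    """Pick a data slice without looking at the row contents."""
    return rng.randrange(num_slices)


# Hypothetical rows; the first two share the same customer_id.
rows = [
    {"customer_id": 17, "order_id": 1},
    {"customer_id": 17, "order_id": 2},
    {"customer_id": 42, "order_id": 3},
]

NUM_SLICES = 96  # assumed slice count for the illustration
placements = [hash_slice(row, ["customer_id"], NUM_SLICES) for row in rows]

# Rows with equal distribution key values always share a data slice.
print(placements[0] == placements[1])
```

Because placement under hash distribution depends on the number of data slices, changing that number changes where a row belongs, which is why expansion is followed by redistribution.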
When Netezza Performance Server is expanded, existing data slices currently on the existing SPU disks will remain in place, and new data slices will be added on the new SPUs’ disks. The number of these new data slices will be 96 per enclosure (4 nodes) or 192 per pair of enclosures (8 nodes).
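The slice counts above follow from simple arithmetic, sketched here in Python. The function name `added_data_slices` is hypothetical, and the per-node figure is derived by dividing 96 by 4 on the assumption that data slices are spread evenly over the nodes of an enclosure.

```python
SLICES_PER_ENCLOSURE = 96  # stated above: 96 new data slices per enclosure
NODES_PER_ENCLOSURE = 4    # stated above: 4 nodes per enclosure

# Derived value, assuming an even spread of data slices over the nodes.
SLICES_PER_NODE = SLICES_PER_ENCLOSURE // NODES_PER_ENCLOSURE


def added_data_slices(enclosures):
    """Return the number of new data slices for the added enclosures."""
    if enclosures < 1:
        raise ValueError("at least one enclosure must be added")
    return enclosures * SLICES_PER_ENCLOSURE


for count in (1, 2):
    nodes = count * NODES_PER_ENCLOSURE
    print(f"{count} enclosure(s), {nodes} nodes: {added_data_slices(count)} new data slices")
```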
Before expansion, table rows are distributed over the original number of data slices according to each table's distribution method and keys. After hardware and software expansion with additional Snippet Processing Units (SPUs), the system has an increased number of data slices, but the rows of each table are still distributed over only the original data slices.
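The following Python sketch illustrates why rows must be redistributed after expansion. It is a simplified model under stated assumptions, not the product's algorithm: the stand-in hash (`zlib.crc32`), the helper `slice_for`, the 10,000 integer keys, and the scenario of a system growing from 96 to 192 data slices are all hypothetical.

```python
import zlib

ORIGINAL_SLICES = 96   # assumed slice count before expansion
EXPANDED_SLICES = 192  # assumed slice count after adding one enclosure


def slice_for(key, num_slices):
    """Map a distribution key to a data slice (illustrative hash only)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_slices


keys = range(10_000)  # hypothetical distribution key values

# Where each row sits today: placed when only the original slices existed.
before = {key: slice_for(key, ORIGINAL_SLICES) for key in keys}

# Where each row belongs once the new data slices are taken into account.
after = {key: slice_for(key, EXPANDED_SLICES) for key in keys}

# Until redistribution runs, no row sits on any of the new data slices.
rows_on_new_slices = sum(1 for s in before.values() if s >= ORIGINAL_SLICES)

# Rows whose current location differs from their target location.
moved = sum(1 for key in keys if before[key] != after[key])

print(f"rows on new slices before redistribution: {rows_on_new_slices}")
print(f"rows that redistribution must move: {moved} of {len(before)}")
```

In this model the new data slices stay empty until rows are rehashed over the expanded slice count, so the added SPUs contribute no parallelism to existing tables until redistribution completes.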
- Offline
- In Netezza Performance Server releases prior to 11.2.1.11, expansion was performed by an IBM Support engineer working with the customer. There are some pre-expansion and post-expansion steps, but the core software expansion and redistribution steps were integrated into a single program, nzredrexpand. This style of data redistribution (after software expansion) is termed offline because it is performed while Netezza Performance Server is unavailable to client applications. For details, see Offline data redistribution.
- Online
- Online data redistribution in Netezza Performance Server 11.2.1.11 addresses the extended downtime experienced by client applications. It does so by opening up access to the data after software expansion and allowing data redistribution to proceed asynchronously, at times and with WLM (workload management) resource limits of your choice.
The existing offline redistribution approach remains available for cases where the expected Netezza Performance Server downtime is acceptable. At expansion time, you choose whether to perform redistribution online yourself after hardware expansion or to proceed with offline redistribution automatically and immediately after expansion. For details, see Data redistribution - Online.
- Update Netezza Performance Server configuration and topology information to represent the additional SPUs, their attached NVMe disks, and the number and location of additional data slices.
- Optionally, if you chose offline redistribution, redistribute all tables while Netezza Performance Server is still unavailable for client workloads.