Data slices, data partitions, and disks

It is important to understand the relationship among SPUs (Snippet Processing Units), disks, data slices, and data partitions in the IBM® Netezza® appliance. Netezza uses these terms to identify hardware components for system management tasks and troubleshooting events.

Disks

A disk is a physical drive on which data resides. In a Netezza system, host servers have several disks that hold the Netezza software, host operating system, database metadata, and sometimes small user files. The Netezza system also has many more disks that hold the user databases and tables. Each disk has a unique hardware ID to identify it.

For the IBM PureData® System for Analytics N200x appliances, 24 disks reside in each disk enclosure, and full rack models have 12 enclosures per rack for a total of 288 disks per rack.

For IBM Netezza 1000 or IBM PureData System for Analytics N1001 systems, 48 disks reside in one storage array; a full-rack system has two storage arrays for a total of 96 disks.

For IBM PureData System for Analytics N3001-001 appliances, all disks are located on the two hosts. On each host, 16 of the 24 disks are used for storing data slices.
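The disk totals quoted above follow from simple multiplication. The following sketch reproduces that arithmetic; the dictionary layout and the names MODELS and total_disks are illustrative assumptions, not part of any Netezza tool.

```python
# Per-unit disk counts are the figures quoted in this topic; the
# structure and names below are hypothetical and for illustration only.
MODELS = {
    "N200x full rack": {"disks_per_unit": 24, "units": 12},               # 12 disk enclosures
    "Netezza 1000 / N1001 full rack": {"disks_per_unit": 48, "units": 2},  # 2 storage arrays
}


def total_disks(disks_per_unit, units):
    """Total disks = disks per enclosure (or array) x number of units."""
    return disks_per_unit * units


totals = {name: total_disks(**spec) for name, spec in MODELS.items()}

# N3001-001: two hosts, 24 disks each, 16 of which hold data slices.
n3001_all_disks = 24 * 2
n3001_data_slice_disks = 16 * 2

for name, count in totals.items():
    print(f"{name}: {count} disks")
print(f"N3001-001: {n3001_data_slice_disks} of {n3001_all_disks} disks hold data slices")
```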

Data slices

A data slice is a logical representation of the data that is saved on a disk. The data slice contains “pieces” of each user database and table. When users create tables and load their data, they distribute the data for the table across the data slices in the system by using a distribution key. An optimal distribution is one in which every data slice holds approximately the same amount of each user table. The Netezza system applies a hashing algorithm to the distribution key to assign the user data to the data slices in the system.
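To illustrate why the choice of distribution key matters, the following sketch assigns rows to data slices by hashing a key value and then measures how evenly the rows are spread. It is a minimal sketch only: this topic does not describe the hashing algorithm that Netezza uses, so CRC-32 stands in for it, and the names assign_slice, slice_counts, and skew are hypothetical.

```python
import zlib
from collections import Counter


def assign_slice(key_value, num_slices):
    """Map a distribution key value to a data slice ID in 1..num_slices.

    CRC-32 is a stand-in hash (an assumption); it is not the algorithm
    that the appliance uses.
    """
    digest = zlib.crc32(str(key_value).encode("utf-8"))
    return digest % num_slices + 1


def slice_counts(key_values, num_slices):
    """Count how many rows land on each data slice, including empty ones."""
    counts = Counter(assign_slice(v, num_slices) for v in key_values)
    return {s: counts.get(s, 0) for s in range(1, num_slices + 1)}


def skew(counts):
    """Ratio of the fullest slice to the mean; 1.0 is a perfectly even spread."""
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean if mean else 0.0


# A key with many distinct values spreads rows across the slices.
unique_keys = slice_counts(range(10_000), num_slices=8)

# A key with a single repeated value sends every row to one slice.
constant_key = slice_counts(["US"] * 10_000, num_slices=8)

print("distinct keys:", unique_keys, "skew", round(skew(unique_keys), 2))
print("constant key: ", constant_key, "skew", round(skew(constant_key), 2))
```

Because the same key value always hashes to the same data slice, a key with few distinct values concentrates rows on a few slices, whereas a key with many distinct values tends toward the even spread described above.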

Data partitions

A data partition is a logical representation of a data slice that is managed by a specific SPU. That is, each SPU owns one or more data partitions, which contain the user data that the SPU is responsible for processing during queries. For example, in the IBM PureData System for Analytics N200x appliances, each SPU typically owns 40 data partitions, although one or two may own 32 partitions. In IBM Netezza 1000 or IBM PureData System for Analytics N1001 systems, each SPU typically owns 8 data partitions, although one SPU has only 6 partitions. For a Netezza C1000 system, each SPU owns 9 data partitions by default. In IBM PureData System for Analytics N3001-001 appliances, each of the two virtual SPUs owns 14 data partitions. An SPU can own more than its default number of partitions: if an SPU fails, its data partitions are reassigned to the other active SPUs in the system.
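The reassignment of data partitions after an SPU failure can be pictured with the following sketch. It assumes a simple policy in which each orphaned partition goes to the surviving SPU that currently owns the fewest partitions; this topic does not describe the actual reassignment policy, and the function reassign_partitions and the three-SPU layout are hypothetical.

```python
def reassign_partitions(ownership, failed_spu):
    """Return a new SPU -> partitions map without failed_spu.

    Each partition of the failed SPU goes to the surviving SPU that
    currently owns the fewest partitions (an assumed policy, chosen
    only to keep the example balanced).
    """
    survivors = {
        spu: list(parts) for spu, parts in ownership.items() if spu != failed_spu
    }
    if not survivors:
        raise ValueError("no active SPUs left to take over partitions")
    for partition in ownership[failed_spu]:
        # Ties are broken by the lower SPU ID so the result is deterministic.
        target = min(survivors, key=lambda spu: (len(survivors[spu]), spu))
        survivors[target].append(partition)
    return survivors


# Hypothetical system: three SPUs that own 8 data partitions each.
ownership = {spu: list(range(spu * 8 - 7, spu * 8 + 1)) for spu in (1, 2, 3)}

after = reassign_partitions(ownership, failed_spu=2)

for spu, parts in sorted(after.items()):
    print(f"SPU {spu} owns {len(parts)} partitions: {parts}")
```

In this example the two surviving SPUs each end up with more than their default number of partitions, while every data partition still has exactly one owner.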