Disk considerations

Designing a proper storage infrastructure for your IBM Storage Scale file systems is key to achieving your performance and reliability goals. When deciding which disk configuration to use, consider two key areas: infrastructure and disk access method.

Infrastructure
  • Ensure that you have sufficient disks to meet the expected I/O load. In IBM Storage Scale terminology, a disk may be a physical disk or a RAID device.
  • Ensure that you have sufficient connectivity (adapters and buses) between disks and network shared disk (NSD) servers.
  • Determine whether you are within IBM Storage Scale limits. Starting with GPFS 3.1, the structural limit on the maximum number of disks in a file system is 2048, which is enforced by IBM Storage Scale. (However, the number of disks in your system may be constrained by products other than IBM Storage Scale.)
  • For a list of storage devices tested with IBM Storage Scale, see the IBM Storage Scale FAQ in IBM® Documentation.
  • For Linux® on Z, see the Storage topics DASD device driver and SCSI-over-Fibre Channel device driver in Device Drivers, Features, and Commands in the Linux on Z library overview.
Disk access method
  • Decide how your disks will be connected. Supported types of disk connectivity include the following configurations:
    1. All disks SAN-attached to all nodes in all clusters that access the file system

      In this configuration, every node sees the same disk simultaneously and has a corresponding disk device entry.

    2. Each disk connected to multiple NSD server nodes (up to eight servers), as specified on the server list

      In this configuration, a single node with connectivity to the disk ships data to all other nodes. This node is the first NSD server specified on the NSD server list. You can define additional NSD servers on the list; having more than one guards against the loss of a single NSD server. When you use multiple NSD servers, all of them must have connectivity to the same disks. Nodes that are not NSD servers receive their data over the local area network from the first NSD server on the list. If that server fails, the next available NSD server on the list takes over data distribution.

    3. A combination of SAN attachment and an NSD server configuration.
      Configuration considerations:
      • If the node has a physical attachment to the disk and that connection fails, the node switches to using a specified NSD server to perform I/O. For this reason, it is recommended that you define NSDs with multiple servers, even if all nodes have physical attachments to the disk (see the example stanza after this list).
      • Configuring IBM Storage Scale disks without an NSD server stops data from being served when the direct path to the disk is lost. This can be the preferable option for nodes that require the higher-speed data access provided through a SAN, as opposed to the lower-speed access provided by an NSD server over the network. Parallel jobs that use MPI often have this characteristic.
      • The -o useNSDserver file system mount option on the mmmount, mount, mmchfs, and mmremotefs commands can be used to control disk discovery and to limit or eliminate switching from local access to NSD server access, or the other way around.
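
    The following sketch illustrates an NSD defined with multiple servers and the use of the useNSDserver mount option. The device path, the NSD and server names, and the file system name fs1 are hypothetical placeholders; the stanza file is passed to the mmcrnsd command, and the useNSDserver value shown (asneeded) allows local disk access to be used first, with NSD server access as a fallback.

      # nsd.stanza -- one disk served by two NSD servers (hypothetical names)
      %nsd:
        device=/dev/dm-1
        nsd=nsd01
        servers=nsdsrv01,nsdsrv02
        usage=dataAndMetadata
        failureGroup=1
        pool=system

      # Create the NSD from the stanza file
      mmcrnsd -F nsd.stanza

      # Mount the file system, using NSD server access only when the
      # local path to the disk is unavailable
      mmmount fs1 -o useNSDserver=asneeded
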
  • Decide whether you will use storage pools to manage your disks.

    Storage pools allow you to manage your file system's storage in groups. You can partition your storage based on such factors as performance, locality, and reliability. Files are assigned to a storage pool based on defined policies.

    Policies provide for the following:
    • Placing files in a specific storage pool when the files are created
    • Migrating files from one storage pool to another
    • Deleting files based on file characteristics

    For more information, see Storage pools.
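
    As an illustration of such policies, the following sketch shows one possible policy file. The pool names (fast, capacity), file-name patterns, and thresholds are hypothetical; only the system pool exists by default, and other pools must be defined when you create your NSDs.

      /* Place new database files in the 'fast' pool; everything else in 'system'. */
      RULE 'dbPlacement' SET POOL 'fast' WHERE UPPER(NAME) LIKE '%.DB'
      RULE 'default' SET POOL 'system'

      /* When 'fast' is more than 90% full, migrate files that have not been
         accessed for 30 days to 'capacity' until utilization drops to 70%. */
      RULE 'cooldown' MIGRATE FROM POOL 'fast' THRESHOLD(90,70) TO POOL 'capacity'
        WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30

      /* Delete temporary files that have not been accessed for a year. */
      RULE 'expire' DELETE WHERE NAME LIKE '%.tmp'
        AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 365

    For example, mmchpolicy fs1 policy.rules installs the rules for the hypothetical file system fs1, and mmapplypolicy fs1 -P policy.rules evaluates the migration and deletion rules.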