IBM Spectrum Scale file systems and file sets

The IBM® Pattern for IBM Spectrum Scale supports the creation of both file systems and file sets.

IBM Spectrum Scale Servers can host multiple file systems. Each file system is supported by one or more network shared disks (NSDs). Within a file system, you define file sets, which are groupings of files that are treated as subdirectories of the file system.
Attention:
  • You can attach a maximum of 14 storage volumes to an IBM Spectrum Scale server deployment instance.
  • A file system includes one or more volumes, resulting in a maximum of 14 file systems.
  • The maximum size of a volume that can be attached to a file system after its creation depends on the sizes of the volumes that were used when the file system was created. The maximum size of any one disk that can be added is approximately the sum of the disk sizes at creation time. Plan ahead and make sure that the initial volumes are large enough to allow the volume sizes that you expect to add later. For more information, see Adding disks to a file system.
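
The limits above can be checked before deployment with a short planning script. The following is a minimal sketch, not part of the pattern: the function name check_volume_plan and its parameters are hypothetical, it assumes that all planned volumes belong to one file system in one deployment, and it treats the sum of the initial volume sizes as the ceiling even though the documented limit is only approximate.

```python
MAX_VOLUMES = 14  # volume limit stated above for one server deployment instance


def check_volume_plan(initial_sizes_gb, later_sizes_gb):
    """Return a list of problems found in a planned volume layout.

    initial_sizes_gb: sizes of the volumes used when the file system is created.
    later_sizes_gb: sizes of the volumes you expect to attach afterwards.
    """
    problems = []

    total = len(initial_sizes_gb) + len(later_sizes_gb)
    if total > MAX_VOLUMES:
        problems.append(f"{total} volumes planned, limit is {MAX_VOLUMES}")

    # Approximate ceiling for any single disk that is added later:
    # the sum of the disk sizes at file system creation.
    ceiling = sum(initial_sizes_gb)
    for size in later_sizes_gb:
        if size > ceiling:
            problems.append(
                f"{size} GB volume exceeds approximate ceiling of {ceiling} GB"
            )
    return problems
```

For example, a file system created from a single 10 GB volume cannot later accept a 20 GB volume under this rule, whereas one created from two 10 GB volumes can.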

Creating file systems

File systems are mounted on the IBM Spectrum Scale server as follows:
/gpfs/fileSystemName1
/gpfs/fileSystemName2
/gpfs/fileSystemName3
...
File systems are created in two ways:
  • During the deployment of an IBM Spectrum Scale Primary server, which creates the initial file system.
  • By running the Attach Storage Volumes operation and specifying a file system.

In both cases, the file systems are allocated block shared storage volumes, which you create in advance.

If you need more space on a file system, you can add storage volumes to the deployment by using the Attach Storage Volumes operation.

Adding shared volumes to a mirrored configuration

When you deploy a Primary configuration and attach Mirror and Tiebreaker deployed configurations, you must ensure that the space allocated on the Primary and Mirror configurations for a given shared file system is identical.

For example, do not allocate 10 GB of shared volume storage to a file system on the Primary configuration and only 5 GB to the same file system on the Mirror configuration, because replication no longer works when the size of the data exceeds 5 GB. Also keep in mind that this type of mirrored scenario can use twice the storage, because the data is replicated.
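
A planned allocation can be compared on both sides before any volumes are attached. The following is a minimal sketch: the function name find_mismatched_allocations and the dictionary layout (file system name mapped to allocated GB) are assumptions made for illustration, and the sizes must be collected by your own means.

```python
def find_mismatched_allocations(primary_gb, mirror_gb):
    """Compare per-file-system allocations (name -> GB) on both sides.

    Returns {name: (primary_size, mirror_size)} for every file system whose
    allocations differ. A file system missing on one side is reported as None.
    """
    mismatches = {}
    for name in sorted(set(primary_gb) | set(mirror_gb)):
        primary_size = primary_gb.get(name)
        mirror_size = mirror_gb.get(name)
        if primary_size != mirror_size:
            mismatches[name] = (primary_size, mirror_size)
    return mismatches
```

An empty result means that every file system has the same allocation on both configurations.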

Adding shared volumes to a passive configuration

When you deploy a Primary configuration and attach a Passive deployed configuration, you must ensure that the space allocated on both the Primary and Passive configurations is identical.

You can add volumes to both sides as needed by using the Attach Storage Volumes operation on both the Primary and Passive deployments. Attach any additional volumes to the Passive deployment prior to attaching additional volumes to the Primary deployment.

Creating file sets

File sets are created and linked under the file system, for example:
/gpfs/filesystem1/fileSet1
/gpfs/filesystem1/fileSet2
/gpfs/filesystem2/fileSet1
/gpfs/filesystem2/fileSet2
...
File set names must adhere to the following conventions:
  • Names are character strings and must be less than 256 characters in length.
  • Names must be unique within a file system.
  • Names must use only the following characters: a-z, A-Z, 0-9, -, _
  • The name root is reserved for the file set of the file system's root directory.
  • The name must contain no spaces (this is specific to the IBM Spectrum Scale pattern).
  • Quota values must end with the unit of measure (g or G for GB, m or M for MB, k or K for KB, t or T for TB).
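
The conventions above can be checked before a file set is requested. The following is a minimal sketch: the function names are hypothetical, and the quota check assumes that the part before the unit is a whole number, which the conventions above do not state.

```python
import re

# Allowed characters from the list above: a-z, A-Z, 0-9, -, _ (no spaces).
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")
# Assumption: a whole number followed by one unit character.
QUOTA_PATTERN = re.compile(r"[0-9]+[kKmMgGtT]")


def is_valid_fileset_name(name, existing_names=()):
    """Check a file set name against the naming conventions."""
    if not NAME_PATTERN.fullmatch(name):
        return False  # empty, or contains a space or another disallowed character
    if len(name) >= 256:
        return False  # must be less than 256 characters
    if name == "root":
        return False  # reserved for the file system's root directory
    return name not in existing_names  # must be unique within the file system


def is_valid_quota(quota):
    """Check that a quota string ends with a unit of measure."""
    return QUOTA_PATTERN.fullmatch(quota) is not None
```

Pass the names that already exist in the file system as existing_names to cover the uniqueness rule.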

File sets have maximum size quotas, and an error occurs if users try to store more data than the quota allows. These quotas are set by using IBM Spectrum Scale Client operations. Note, however, that quotas are enforced only for non-root users. For non-root users, directory access must be configured within the client virtual machines (and potentially server virtual machines) by using standard Linux® ACL and permission commands.

Some workloads create the file set on every client virtual machine. This practice is tolerated, and a warning message is generated if a second attempt is made to create the same file set. The quota limits of the second and subsequent attempts are ignored.

The warning message is displayed as follows:
WARNING: fileset <filesetName> already exists
You can check for existing file set names that are already in use by using either of the following methods:
  • Run the Status operation, which identifies file sets that are associated with a specified file system.
  • Run the following command to see which file sets exist on a virtual machine in the Server cluster:
    /usr/lpp/mmfs/bin/mmlsfileset <filesystemName>
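
The command check can also be scripted. The following is a minimal sketch that runs the command shown above and extracts the names from its output. The function names are hypothetical, and the parser assumes a column layout with a header row that starts with Name followed by one file set per line; verify this against the output on your own server before relying on it.

```python
import subprocess

MMLSFILESET = "/usr/lpp/mmfs/bin/mmlsfileset"  # path taken from the command above


def parse_fileset_names(mmlsfileset_output):
    """Extract file set names from mmlsfileset text output (assumed layout)."""
    names = []
    header_seen = False
    for line in mmlsfileset_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if not header_seen:
            # Skip everything up to and including the column header row.
            header_seen = fields[0] == "Name"
            continue
        names.append(fields[0])
    return names


def list_filesets(filesystem_name):
    """Run mmlsfileset on a Server cluster virtual machine and return the names."""
    result = subprocess.run(
        [MMLSFILESET, filesystem_name],
        capture_output=True, text=True, check=True,
    )
    return parse_fileset_names(result.stdout)
```

A call such as "fileSet1" in list_filesets("filesystem1") then tells you whether a name is already taken.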

As a best practice, check the logs after running the IBM Spectrum Scale Client Policy (or alternatively, after running the IBM Spectrum Scale Client script packages) to see if the warning message is written to the log.
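
The log check can be automated with a simple pattern match on the warning text shown earlier. The following is a minimal sketch: the function name is hypothetical, and because the log location is not given here, the function takes the log content as a string that you read from wherever your deployment writes it.

```python
import re

# Matches the warning text documented above and captures the file set name.
WARNING_PATTERN = re.compile(r"WARNING: fileset (\S+) already exists")


def find_duplicate_fileset_warnings(log_text):
    """Return the file set names reported as already existing, in log order."""
    return [match.group(1) for match in WARNING_PATTERN.finditer(log_text)]
```

A non-empty result means that at least one create request was ignored, along with any quota limits that it specified.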

Be careful to create file sets using unique names. If you enter the same name as another user, you might mistakenly overwrite their data, because you could both be linked to the same file set.

Isolating file sets

File sets serve as a shared folder between tenants of the file system. Within a single server deployment, this is considered a trusted tenant model, with no isolation between tenants.

For example, suppose a file system administrator creates file systems A and B. A pattern deployment is configured to use File System A with File set 001.

When the pattern is deployed, a connection to File System A is made, and File set 001 is created if it is not already present. Now, other deployments with this configuration can share data within that file set. If another system also had a deployment requesting File System A and File set 001, it could alter the data in the file set.

To control the second system and to isolate the data in File System A from the second system, the second deployment should be configured to use File System B on the same server.

As another option, you might deploy a second server with File System A. Then define a separate cloud group mapped to the new server deployment, to isolate the associated deployments.
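
To see which deployments share data under this model, you can group them by the server, file system, and file set that they target. The following is a minimal sketch: the function name, the deployment names, and the tuple layout are all hypothetical and serve only to illustrate the sharing rule described in this section.

```python
from collections import defaultdict


def find_shared_filesets(deployments):
    """Group deployments by target and return the targets used more than once.

    deployments: mapping of deployment name -> (server, file_system, file_set).
    Returns {(server, file_system, file_set): [deployment names]}.
    """
    users = defaultdict(list)
    for name, target in deployments.items():
        users[target].append(name)
    return {
        target: sorted(names)
        for target, names in users.items()
        if len(names) > 1
    }
```

Deployments that appear together in the result can read and alter each other's data. Moving one of them to a different file system or to a different server, as described above, removes it from the group.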

Removing shared volumes from the file system

Be careful when removing shared volumes from the file system. Take this action only after careful analysis of the problem that requires the volume to be removed.

There are two ways to remove a shared volume:
  • Use the Remove Network Shared Disk(s) operation to remove the disk by its NSD name.
  • Use the Detach Shared Volumes operation to remove the disk by its shared volume name.

In certain cases, you might need to unmount file systems and shut down nodes before disks can be removed successfully. You can either use the operations in the Administrative section, which should perform these tasks for you, or use specific IBM Spectrum Scale commands to perform the same steps.