Issues caused by the suboptimal setup or configuration of the IBM Spectrum Scale cluster
This section discusses issues caused by the suboptimal setup or configuration of the IBM Spectrum Scale cluster.
- Suboptimal performance due to unbalanced architecture and improper system-level settings

  System performance depends on the IBM Spectrum Scale cluster architecture components, such as servers, network, storage, disks, topology, and balance factor. It also depends on the performance of the low-level components, such as the network, node, and storage subsystems, that make up the IBM Spectrum Scale cluster.
- Suboptimal performance due to low values assigned to IBM Spectrum Scale configuration parameters

  Most GPFS configuration parameters have default values. For example, in IBM Spectrum Scale version 4.2 and later, the pagepool attribute defaults to either one-third of the physical memory on the node or 1 GiB, whichever is smaller; maxMBpS defaults to 2048; and maxFilesToCache defaults to 4000. However, if the user explicitly sets IBM Spectrum Scale configuration parameters to values lower than their defaults, I/O performance can be degraded.
- Suboptimal performance due to new nodes with default parameter values added to the cluster

  When new nodes are added to the IBM Spectrum Scale cluster, ensure that the GPFS configuration parameter values on the new nodes are not left at their default values, unless the user explicitly set them based on the GPFS node class. For optimal performance, the GPFS configuration parameter values on the new nodes must match the values on existing nodes of a similar type. The necessary system-level component settings on the new nodes, such as BIOS and network settings, must also match the system-level component settings of the existing nodes.
- Suboptimal performance due to low values assigned to QoSIO operation classes

  If the Quality of Service for I/O (QoSIO) feature is enabled on the file system, verify whether any of the storage pools are assigned low values for the other and maintenance classes. Assigning low values to the other and maintenance classes can degrade performance when I/O is performed on that specific storage pool.
- Suboptimal performance due to improper mapping of the file system NSDs to the NSD servers

  The NSDs in a file system must be optimally assigned to the NSD servers so that client I/O is distributed equally across all the NSD servers. For example, consider a file system with 10 NSDs and 2 NSD servers. The NSD-to-server mapping must be done so that each server acts as the primary server for 5 of the NSDs in the file system. An unbalanced NSD-to-server mapping can result in hot spots on one or more of the NSD servers, and the presence of hot spots within a system can cause performance degradation.
- Suboptimal performance due to incompatible file system block allocation type

  In some cases, a proof of concept (POC) is done on a smaller setup that consists of a cluster with eight or fewer nodes and a file system with eight or fewer disks. When the necessary performance requirements are met, the production file system is deployed on a larger cluster and storage setup. On the larger cluster, the file system performance per NSD might be lower than in the smaller POC setup, even if all the cluster and storage components are healthy and performing optimally. In such cases, it is likely that the file system was configured with the default cluster block allocation type during the smaller POC setup, while the larger file system setup is configured with the scatter block allocation type.
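As a sketch of how the configuration parameters named above can be checked and aligned across nodes, the following commands query and set them. The node class name `newNodes` and the values shown are placeholders, not recommendations; choose values that match your existing nodes of the same type:

```shell
# Display the current values of the parameters on the cluster,
# including any per-node or per-node-class overrides.
mmlsconfig pagepool
mmlsconfig maxMBpS
mmlsconfig maxFilesToCache

# Set the parameters on the newly added nodes (node class "newNodes"
# is a placeholder) so that they match the existing nodes of the
# same type. The values shown here are examples only.
mmchconfig pagepool=8G,maxMBpS=10000,maxFilesToCache=100000 -N newNodes
```

Note that some attributes, such as pagepool, take effect only after GPFS is restarted on the affected nodes unless an immediate-change option is used; check the mmchconfig documentation for your release.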
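For the QoSIO case, the allocations for the other and maintenance classes can be inspected and raised per storage pool. The file system name `gpfs0`, the pool name, and the IOPS values below are placeholders, and the exact option syntax can vary by release, so verify it against the mmchqos and mmlsqos documentation:

```shell
# Observe current QoS I/O activity on the file system (placeholder name).
mmlsqos gpfs0

# Re-enable QoS with higher allocations for the maintenance class on the
# "system" pool, leaving the other class unthrottled. Values are examples.
mmchqos gpfs0 --enable pool=system,maintenance=1000IOPS,other=unlimited
```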
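For the NSD-to-server mapping case, balance is achieved by rotating the server order in the NSD stanzas, because the first server listed for an NSD acts as its primary server. A minimal sketch for the 10-NSD, 2-server example, with hypothetical NSD and server names:

```shell
# Show the current NSD-to-server mapping for the cluster.
mmlsnsd

# Example stanza file (balance.stanza) that alternates the primary
# server between the two NSD servers; nsd1..nsd10 and nsdserver1/2
# are placeholder names. Pattern shown for the first two NSDs:
#   %nsd: nsd=nsd1 servers=nsdserver1,nsdserver2
#   %nsd: nsd=nsd2 servers=nsdserver2,nsdserver1
#   ...
# Apply the rebalanced server lists:
mmchnsd -F balance.stanza
```

With this pattern, each server is primary for 5 of the 10 NSDs, so client I/O is spread evenly instead of concentrating on one server.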
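For the block allocation case, the allocation type is chosen at file system creation time with the -j option of mmcrfs and cannot be changed afterward without re-creating the file system, so a production deployment intended to match a POC should set it explicitly. File system and stanza file names below are placeholders:

```shell
# Create the file system with an explicit block allocation type
# instead of relying on the size-dependent default.
# -j cluster : allocates blocks in clusters; typical default on
#              small setups (eight or fewer nodes and disks).
# -j scatter : scatters blocks across all disks; typical default
#              on larger setups.
mmcrfs gpfs1 -F nsd.stanza -j scatter
```

Checking which allocation type the POC file system actually used before sizing the production file system avoids comparing per-NSD performance across incompatible allocation types.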
Parent topic: Performance issues