Protocols support overview: Integration of protocol access methods with GPFS
IBM Storage Scale provides additional protocol access methods. Providing these file and object access methods and integrating them with GPFS offers several benefits: users can consolidate various sources of data efficiently in one global namespace, with a unified data management solution that uses space efficiently and avoids unnecessary data movement simply because access methods differ.
Protocol access methods that are integrated with GPFS are NFS, SMB, HDFS, and S3. While each of these server functions (NFS, SMB, HDFS, and S3) uses open source technologies, this integration adds value by providing scaling and high availability through the clustering technology in GPFS.
For information about HDFS protocol support, see CES HDFS in IBM Storage Scale Big Data and Analytics Support documentation.
- The integration of file and object serving with GPFS makes it possible to create NFS exports, SMB shares, and S3 buckets that have data in GPFS file systems for access by client systems that do not run GPFS.
- Some nodes (at least two are recommended) in the GPFS cluster must be designated as protocol nodes (also called CES nodes), from which non-GPFS clients can access data that resides in and is managed by GPFS by using the appropriate protocol artifacts (exports, shares, buckets, or containers).
- The protocol nodes need to have GPFS server license designations.
- The protocol nodes must be configured with external network addresses that are used to access the protocol artifacts from clients. These external network addresses are different from the GPFS cluster address that is used to add the protocol nodes to the GPFS cluster.
- The CES nodes allow SMB, NFS, HDFS, and S3 clients to access the file system data through the protocol exports on the configured public IP addresses. The CES framework allows network addresses that are associated with protocol nodes to fail over to other protocol nodes when a protocol node fails. A command sketch for setting up protocol nodes follows this list.
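As an illustration, the following sketch shows one way to designate protocol nodes, assign CES addresses, and enable protocol services. The node names (prot01, prot02) and IP addresses are placeholders, and exact option syntax can vary by release; check the command reference for your version.

```
# Designate two existing cluster nodes as protocol (CES) nodes.
mmchnode --ces-enable -N prot01,prot02

# Assign external (public) CES IP addresses that clients use;
# these are separate from the cluster addresses of the nodes.
mmces address add --ces-ip 192.0.2.100,192.0.2.101

# Enable the protocol services that the CES nodes should serve.
mmces service enable NFS
mmces service enable SMB

# Verify the CES configuration: nodes, addresses, and enabled services.
mmces node list
mmces address list
mmces service list -a
```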
All protocol nodes in a cluster must be running on the same operating system and they must be of the same CPU architecture. The other nodes in the GPFS cluster might be on other platforms and operating systems.
For information about supported operating systems for protocol nodes and their required minimum kernel levels, see IBM Storage Scale FAQ in IBM® Documentation.
Like GPFS, the protocol serving functionality is also delivered only as software.
- The CES framework provides access to data managed by GPFS through additional access methods.
- While the protocol function provides several aspects of NAS file serving, the delivery is not a NAS appliance.
- Role-based access control of the command line interface is not offered.
- Further, the types of workloads that are suited for this delivery continue to be workloads that require the scaling or consolidation aspects that are associated with traditional GPFS.

Note: Some NAS workloads might not be suited for delivery in the current release, for instance, extensive use of snapshots or support for many SMB users. For SMB limitations, see SMB limitations.
For more information, see IBM Storage Scale FAQ in IBM Documentation.
The protocols feature provides the following additional commands for administering the protocols, along with enhancements to existing commands:
- The commands for managing these functions include mmces, mmuserauth, mmnfs, mmsmb, mmobj, and mmperfmon (see the example sketch after this list).
- In addition, mmdumpperfdata and mmprotocoltrace are provided to help with data collection and tracing.
- Existing GPFS commands that are expanded with options for protocols include mmlscluster, mmchnode, and mmchconfig.
- The gpfs.snap command is enhanced to gather data about the protocols to help with problem determination.
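As a brief example of how some of these commands fit together, the following sketch creates an NFS export and an SMB share on a GPFS file system and then lists the CES configuration. The path /gpfs/fs1 and the export and share names are placeholders, and options such as the NFS client specification can differ between releases.

```
# Create an NFS export for an existing GPFS path (read-only for all clients).
mmnfs export add /gpfs/fs1/export1 --client "*(Access_Type=RO)"

# Create an SMB share for another GPFS path.
mmsmb export add share1 /gpfs/fs1/share1

# List the defined exports and shares.
mmnfs export list
mmsmb export list

# Show the CES-specific view of the cluster (protocol nodes and addresses).
mmlscluster --ces
```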
For information on the use of CES including administering and managing the protocols, see Implementing Cluster Export Services.
In addition to the installation toolkit, IBM Storage Scale also includes a performance monitoring toolkit. Sensors that collect performance information are installed on all protocol nodes, with one of these nodes designated as a collector node. The mmperfmon query command can be used to view the performance counters that are collected.
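For example, a collector can be configured and recent counters queried roughly as in the following sketch; the node name is a placeholder, and the named query and options are assumptions that might differ by release.

```
# Generate a performance monitoring configuration with prot01 as the
# collector node (prot01 is a placeholder).
mmperfmon config generate --collectors prot01

# Query collected counters; "usage" is assumed to be one of the
# predefined queries, limited here to the last 10 one-minute buckets.
mmperfmon query usage --bucket-size 60 --number-buckets 10
```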
The mmhealth command can be used to monitor the health of the node and the services hosted on that node.
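For instance, on a protocol node the overall node state and an individual protocol service can be checked as follows; component names such as NFS or SMB depend on which services are enabled.

```
# Show the health state of this node and the services it hosts.
mmhealth node show

# Show details for a single component, for example the NFS service.
mmhealth node show NFS

# Summarize the health of the whole cluster from one node.
mmhealth cluster show
```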