CES NFS support

For Network File System (NFS) support in Cluster Export Services (CES), you must consider the supported protocol versions, service and export configuration, NFS service monitoring, failover behavior, and client requirements.

NFS support levels

NFS versions 3 (NFSv3) and 4 (NFSv4.0, NFSv4.1, and NFSv4.2) are supported.
Note: NFSv4.2 is available only as a technology preview feature and cannot be used in production.

By default, NFSv4.0 and NFSv4.1 are enabled for new cluster installations of IBM Storage Scale 5.2.0. Clusters that are upgraded from any version earlier than IBM Storage Scale 5.2.0 retain their previous default value (only NFSv4.0) or the value that was set explicitly in the older cluster (that is, only 4.0, only 4.1, or both 4.0 and 4.1). To modify the minor version, use the mmnfs config change MINOR_VERSION=<minorversion> command. Valid values for the minor version are 0, 1, or 0,1. For more information, see mmnfs command.
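
For example, to enable both the NFSv4.0 and NFSv4.1 minor versions, a command of the following form can be used (a sketch based on the syntax above; confirm the exact syntax in the mmnfs man page for your release):

mmnfs config change MINOR_VERSION=0,1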

NFS monitoring

The NFS servers are monitored to verify that they are functioning correctly. If a problem is found, the CES addresses of the affected node are reassigned and the node is set to the failed state. When the problem is corrected, the node resumes normal operation.
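
To observe the state of the monitored NFS service on the CES nodes, commands such as the following can be used (a sketch; the available components and the output format depend on your release):

mmces state show NFS -a
mmces events active NFS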

NFS service configuration

Configuration options for the NFS service can be set with the mmnfs config command.

You can use the mmnfs config command to set and list default settings for NFS, such as the port number of the NFS service, the default access mode for exported file systems, the log level, and whether delegations are enabled. For a list of configurable attributes, see mmnfs command.
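
For example, the current NFS configuration can be listed, and an attribute such as the log level can be changed, with commands of the following form (the attribute name LOG_LEVEL and its value are given for illustration; verify the valid attributes and values with mmnfs config list and the mmnfs man page for your release):

mmnfs config list
mmnfs config change LOG_LEVEL=INFO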

Some of the attributes, such as the protocol version, can be overridden for a given export on a per-client basis. For example, the default settings might have NFS protocol versions 3 and 4 enabled, but the export for a particular client might be restricted to NFS version 4 only.

NFS export configuration

Exports can be added, removed, or changed with the mmnfs export command. Authentication must be set up before you define an export.

Exports can be declared for any directory in the GPFS file system, including a fileset junction. When exports are declared, these directories must already exist in GPFS. Only directories in the GPFS file system can be exported; directories that exist only locally on a server node cannot be exported because they cannot be used in a failover situation.

Export add and export remove operations can be applied while the NFS service is running. Export change operations also do not require a restart of the NFS service.
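
For example, a directory in the GPFS file system can be exported with read/write access and restricted to NFS version 4 for a particular client network with a command of the following form (the export path /gpfs/fs0/export1 and the client network are placeholders, and the exact client-option keywords can vary by release; see the mmnfs man page):

mmnfs export add /gpfs/fs0/export1 --client "10.0.0.0/24(Access_Type=RW,Protocols=4,Squash=root_squash)"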

NFS failover

When a CES node leaves the cluster, the CES addresses assigned to that node are redistributed among the remaining nodes. Remote clients that access the GPFS file system might see a pause in service while the internal state information is passed to the new servers.
Note: NFS clients are responsible for maintaining data integrity when a server reboots, crashes, or fails over. In the NFS protocol, the NFS client is responsible for tracking which data has been destaged, for detecting that a server crashed before all data was destaged, and for tracking which data must be rewritten to disk. Failover is transparent to most applications in NFS, with the following exception:
  • Client applications might experience -EEXIST or -ENOENT errors when creating or deleting file system objects.

NFS clients

When you work with NFS clients, consider the following points:
  • If you mount the same NFS export on one client from two different IBM Storage Scale NFS protocol nodes, data corruption might occur.
  • The NFS protocol version that a client operating system uses by default might differ from what you expect. If your client mounts NFSv3 by default and you want to mount NFSv4, you must explicitly specify NFSv4.0 or NFSv4.1 in the mount command; see the combined mount example after this list. For more information, see the mount command for your client operating system.
  • To prevent NFS clients from encountering data integrity issues during failover, ensure that NFS clients are mounted with the option -o hard.
  • A client can mount an NFS export by using one of the following methods:

    • The CES IP addresses of the protocol nodes.
    • An alias for the CES IPs, defined in the client's /etc/hosts file or in DNS (including DNS round-robin).
    • The NetBIOS name as an alias for the CES IPs, defined in the client's /etc/hosts file or in DNS (including DNS round-robin). The NetBIOS name can be found by using the mmuserauth service list command. This method is mandatory if Kerberized NFS shares are used.
    Note:
    • If a protocol node's hostname is used to mount NFS shares, high availability of the IP address might not work.
    • If a DNS round-robin (RR) entry name is used to mount an NFSv3 export, data unavailability might occur because of unreleased locks. The NFS lock manager on IBM Storage Scale is not cluster-aware. This limitation does not apply to NFSv4 exports.
  • If the client mounts an NFS export by using a CES IP address that is an IPv6 address, you might need to enclose the IPv6 address in square brackets. For example,
    mount [spectrumScaleCESIPv6IP]:/path/to/exportedDirectory /localMountPoint

    For more information about mounting with an IPv6 address at the NFS client, see the man page for 'nfs'.

  • Clients that are performing NFS mounts must use a retry timeout value that is marginally lower than the NFS server grace period.

    The CES NFS server enters a grace period after the daemon restarts or when an IP address is released or a new IP address is acquired. Previously connected clients reclaim their state (for example, file locks and opens) within the grace period. The default grace period is 90 seconds.

    The NFS client waits for a response from the NFS server for the period that is indicated by timeo before retrying a request. The timeo value can be specified as a mount option and is expressed in deciseconds (tenths of a second). Clients that perform NFS mounts with a retry timeout value close to the NFS server grace period might cause application failures such as I/O errors.

    For example, to set the retry timeout value to 40 seconds (overriding the Linux® client's default value of 60 seconds for TCP), specify mount -o timeo=400 spectrumScaleCESIP:/path/to/exportedDirectory /localMountPoint.
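
For example, on a Linux client the preceding recommendations can be combined in a single mount command (the CES IP address, export path, and mount point are placeholders):

mount -t nfs -o vers=4.1,hard,timeo=400 spectrumScaleCESIP:/path/to/exportedDirectory /localMountPoint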

Choosing between CNFS or CES

If you want to provide highly available NFS services on top of the GPFS file system, you can choose between clustered NFS (see Implementing a clustered NFS environment on Linux) and Cluster Export Services (see Implementing Cluster Export Services).

To help you choose one of these NFS offerings, consider the following points:
Multiprotocol support
If you plan to use other protocols (such as SMB or Object) in addition to NFS, you must choose CES. While CNFS provides support only for NFS, the CES infrastructure also adds support for SMB and Object. With CES, you can start with NFS and add (or remove) other protocols at any time.
Command support
While CNFS provides native GPFS command support for the creation and management of the CNFS cluster, it lacks commands to manage the NFS service and NFS exports. The CES infrastructure introduces native GPFS commands to manage the CES cluster, the supported protocol services, and the NFS exports. For example, with CES you do not need to adapt NFS configuration files individually on the protocol nodes; this work is done by the GPFS commands that are provided for CES.
Performance
CNFS is based on the kernel NFS server while NFS support in CES is based on the Ganesha NFS server operating in user space. Due to the different nature of these NFS I/O stacks, performance depends on system characteristics and NFS workload. Contact your IBM® representative to get help with sizing the required number of protocol nodes to support certain workload characteristics and protocol connection limits.
There is no general answer to which of the two NFS servers performs better because performance depends on many factors. Tests that were conducted with both NFS I/O stacks over various workloads show that the kernel-based NFS server (CNFS) performs better under metadata-intensive workloads, typically with many smaller files and structures, while the Ganesha NFS server provides better performance for data-intensive workloads such as video streaming.
Note: CES provides a different interface for obtaining performance metrics for NFS. CNFS uses the existing kernel interfaces to obtain NFS metrics (such as nfsstat or the /proc interface), whereas the CES framework provides the mmperfmon query command for Ganesha-based NFS statistics. For more information, see mmperfmon command.
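For example, NFS statistics can be retrieved with a predefined performance query similar to the following (the query name nfsIOrate is only an illustration; the available query names depend on the performance monitoring configuration and the release, so verify them in the mmperfmon documentation):
mmperfmon query nfsIOrate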
Migration of CNFS to CES
For information about migrating existing CNFS environments to CES, see Migration of CNFS clusters to CES clusters.