NFS misuses that affect performance

Many misuses of NFS occur because people do not realize that the files they are accessing are at the other end of an expensive communication path.

A few examples of this are as follows:

  • An application running on one system doing random updates of an NFS-mounted inventory file, supporting a real-time retail cash register application.
  • A development environment in which a source code directory on each system was NFS-mounted on all of the other systems in the environment, with developers logging onto arbitrary systems to do editing and compiles. This practically guaranteed that all of the compiles would be obtaining their source code from, and writing their output to, remote systems.
  • Running the ld command on one system to transform .o files in an NFS-mounted directory into an a.out file in the same directory.
  • Applications that issue writes that are not page-aligned, for example, 10 KB writes. With a 4 KB page size, a 10 KB write fills two pages completely but only half of a third; any write that fills less than a full 4 KB page forces that page to be read in first, and in the case of NFS, the page-in goes over the network.

It can be argued that these are valid uses of the transparency provided by NFS. Perhaps this is so, but these uses do cost processor time and LAN bandwidth and degrade response time. When a system configuration involves NFS access as part of the standard pattern of operation, the configuration designers should be prepared to defend the consequent costs with offsetting technical or business advantages, such as:

  • Placing all of the data or source code on a server, rather than on individual workstations, improves source-code control and simplifies centralized backups.
  • A number of different systems access the same data, making a dedicated server more efficient than one or more systems combining client and server roles.

Another type of application that should not be run across NFS file systems is one that issues hundreds of lockf() or flock() calls per second. On an NFS file system, all lockf() or flock() calls (and other file-locking calls) must go through the rpc.lockd daemon. This can severely degrade system performance, because the lock daemon may not be able to handle thousands of lock requests per second.

Regardless of the client and server performance capacity, all operations involving NFS file locking will probably seem unreasonably slow. There are several technical reasons for this, but they are all driven by the fact that if a file is being locked, both the read and write sides must handle it synchronously. This means the client can cache no file data at all, including file attributes; all file operations go to a fully synchronous, uncached mode. Suspect that an application is doing network file locking if it operates over NFS and shows unusually poor performance compared to other applications on the same client/server pair.