Packets dropped by the server
Several situations exist where servers drop packets under heavy loads.
- Network adapter driver
When an NFS server responds to a very large number of requests, the server sometimes overruns the interface driver output queue. You can observe this by looking at the statistics that are reported by the netstat -i command. Examine the Oerrs column and look for any nonzero counts. Each Oerrs value is a dropped packet. You can tune this by increasing the transmit queue size of the affected device driver. The idea behind configurable queues is that you do not want to make the transmit queue too long, because of the latencies incurred in processing the queue. But because NFS maintains the same port and XID for a retransmitted call, a second call can be satisfied by the reply to the first call. Additionally, queue-handling latencies are far less than the UDP retransmit latencies that NFS incurs if the packet is dropped.
- Socket buffers
The UDP socket buffer is another place where a server drops packets. These dropped packets are counted by the UDP layer and you can see the statistics by using the netstat -p udp command. Examine the socket buffer overflows statistic.
NFS packets are usually dropped at the socket buffer only when a server has a lot of NFS write traffic. The NFS server uses a UDP socket attached to NFS port 2049, and all incoming data is buffered on that UDP port. The default size of this buffer is 600,000 bytes. You can divide that number by the size of the default NFS Version 3 write packet (32786) to find that 18 write packets fit in the buffer, so it takes 19 simultaneous write packets to overflow it.
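The overflow arithmetic can be checked with a short sketch. It assumes a 600,000-byte socket buffer and 32,786-byte write packets, and the function name packets_to_overflow is only illustrative, not part of any NFS tool.

```python
def packets_to_overflow(buffer_bytes: int, packet_bytes: int) -> int:
    """Return how many simultaneous packets it takes to overflow the buffer.

    Assumption: every packet has the same size and nothing is drained
    from the buffer while the packets arrive.
    """
    # Whole packets that fit in the buffer, plus the one that overflows it.
    return buffer_bytes // packet_bytes + 1


# Assumed values: 600,000-byte buffer, 32,786-byte NFS Version 3 write packet.
print(packets_to_overflow(600_000, 32_786))  # 19
```

The same function can be used to estimate how many simultaneous writes a larger buffer would absorb before you change the socket buffer size.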
You might see cases where the server has been tuned and no dropped packets are being reported for either the socket buffer or the driver (Oerrs), but clients are still experiencing timeouts and retransmits. Again, this is a two-case scenario. If the server is heavily loaded, the server might simply be overloaded, and the backlog of work for the nfsd daemons on the server results in response times beyond the default timeout that is set on the client. The other possibility, and the most likely problem if the server is known to be otherwise idle, is that packets are being dropped on the network.
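The reasoning in this section can be summarized as a rough triage sketch. The function and its inputs are hypothetical. It assumes you have already read the Oerrs count from netstat -i and the socket buffer overflows count from netstat -p udp, and have judged separately whether the server is heavily loaded.

```python
def likely_cause(oerrs: int, socket_overflows: int, server_busy: bool) -> str:
    """Suggest where to look first when NFS clients see timeouts.

    Hypothetical helper: the inputs are counters and a judgment that
    the administrator supplies; nothing here queries the system.
    """
    if oerrs > 0:
        # The interface driver output queue is being overrun.
        return "driver transmit queue overruns"
    if socket_overflows > 0:
        # The UDP socket buffer on the NFS port is overflowing.
        return "UDP socket buffer overflows"
    if server_busy:
        # No drops on the server, but nfsd work is backing up.
        return "server overloaded: nfsd backlog exceeds client timeout"
    # No server-side drops and an otherwise idle server.
    return "packets dropped on the network"


print(likely_cause(oerrs=0, socket_overflows=0, server_busy=False))
```

The order of the checks mirrors the order of this section. It is a starting point for investigation, not a definitive diagnosis.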