IBM Support

Removing large NFS files may return "NO SUCH FILE" error

Troubleshooting


Problem

Removing a large file from a busy NFS server may return a "No such file" error.

Symptom

The problem can be observed when the "rm" command is used to remove a large file (about 5 GB) that resides on a busy remote NFS server. The "rm" command usually takes a long time to complete, typically 10 seconds or more, and may fail with errno 2 ("No such file"). The file is actually removed, but the error makes it look as if the file never existed.

Cause

The problem is caused by an internal race condition. When the first remove Call times out on the NFS client, the client retransmits the same Call to the busy NFS server. The first Call has already removed the file, but the server has not yet replied to it. The retransmitted Call then finds that the file no longer exists, and that error Reply is sent to the NFS client.
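
The effect of replaying a remove can be illustrated locally with a minimal sketch. It does not involve NFS or RPC: it simply issues the same "rm" twice against a scratch file, which is an assumption made purely for illustration. The function name demo_replayed_remove and the scratch file name are hypothetical.

```shell
# Local illustration only: no NFS or RPC is involved.
demo_replayed_remove() {
    tmpfile="${TMPDIR:-/tmp}/replay_demo.$$"
    : > "$tmpfile" || return 1          # create an empty scratch file

    # "First Call": the file exists, so the remove succeeds.
    if rm "$tmpfile" 2>/dev/null; then first=ok; else first=failed; fi

    # "Retransmitted Call": the same remove again. The file is already gone,
    # so rm fails with "No such file or directory" (errno 2, ENOENT).
    if rm "$tmpfile" 2>/dev/null; then second=ok; else second=failed; fi

    echo "first=$first second=$second"
}

demo_replayed_remove
```

The second "rm" fails even though the overall goal (the file is gone) was achieved, which mirrors what the NFS client reports when only the Reply to the retransmitted Call reaches it.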

Environment

This problem has been seen in AIX 7.1 and 7.2 NFSv3 servers.

Diagnosing The Problem

To diagnose this issue properly, collect an iptrace:
startsrc -s iptrace -a "/tmp/iptrace.cap"
# reproduce the issue
stopsrc -s iptrace
Then submit iptrace.cap to the IBM Support team.

Resolving The Problem

The problem is fixed by APAR IJ18845, or by any equivalent APAR or iFix. A possible workaround is to increase the NFS mount "timeo" timeout so that it covers the time the server needs to remove the file and reply; the client then does not retransmit the RPC. This can be done dynamically with:

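mount -o remount,timeo=new_value,other_options server:/share /mount_point

The mount options must be separated by commas only, with no spaces. "timeo" is expressed in tenths of a second; its default value is 100 (10 seconds). Try a larger value in the range of 150-600 (15 to 60 seconds).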

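As a rough aid for choosing a value, the sketch below converts a measured removal time in whole seconds into a "timeo" value in tenths of a second and prints a remount command for review without running it. The helper names timeo_for_seconds and print_remount, and the 5-second default margin, are assumptions for illustration and not IBM recommendations; server:/share and /mount_point are the same placeholders used above.

```shell
# Convert a measured removal time (whole seconds) into a "timeo" value.
# timeo is expressed in tenths of a second; the margin (default 5 seconds)
# is a hypothetical safety allowance.
timeo_for_seconds() {
    seconds=$1
    margin=${2:-5}
    echo $(( (seconds + margin) * 10 ))
}

# Print (do not run) the remount command so that it can be reviewed first.
print_remount() {
    echo "mount -o remount,timeo=$1 server:/share /mount_point"
}

# Example: the remove was observed to take about 25 seconds.
print_remount "$(timeo_for_seconds 25)"
# prints: mount -o remount,timeo=300 server:/share /mount_point
```

Any other mount options already in use would still need to be appended to the printed command, separated by commas.
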
Document Location

Worldwide

Document Information

Modified date:
15 April 2020

UID

ibm16187701