Slow access to SMB caused by contended access to files or directories
This topic describes why access through the SMB server can be slow and the troubleshooting steps to resolve it.
If access through the SMB server is slower than expected, the cause might be highly contended access to the same file or directory. The slowdown comes from the internal record keeping of the SMB server: the record for each open file or directory must be transferred between protocol nodes for every open and close operation, which can overload the SMB server. The delay is experienced only in extreme cases, where many clients open and close the same file or directory. Concurrent access to the same file or directory is still handled correctly in the SMB server and usually causes no problems.
The following procedure helps to identify the files or directories behind contended records by using the CTDB database statistics. When a "hot" record is detected, it is recorded in the database statistics and a message is printed to syslog.
When this message refers to the locking.tdb database, it can point to concurrent access to the same file or directory. The same hot record can be seen in the ctdb dbstatistics output for locking.tdb:
# ctdb dbstatistics locking.tdb
DB Statistics locking.tdb
db_ro_delegations 0
db_ro_revokes 0
locks
num_calls 15
num_current 0
num_pending 0
num_failed 0
db_ro_delegations 0
hop_count_buckets: 139 40 0 0 0 0 0 0 0 0 0 0 0 0 0 0
lock_buckets: 0 9 6 0 0 0 0 0 0 0 0 0 0 0 0 0
locks_latency MIN/AVG/MAX 0.002632/0.016132/0.061332 sec out of 15
vacuum_latency MIN/AVG/MAX 0.000408/0.003822/0.082142 sec out of 817
Num Hot Keys: 10
Count:1 Key: 6a4128e3ced4681b017c0600000000000000000000000000
Count:0 Key:
Count:0 Key:
Count:0 Key:
Count:0 Key:
Count:0 Key:
Count:0 Key:
Count:0 Key:
Count:0 Key:
Count:0 Key:
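The hot keys with a nonzero count can be filtered out of this output with a small helper. The following is a minimal sketch, assuming the "Count:<n> Key: <hex>" line format shown in the sample above; the function name hot_keys is hypothetical and not part of CTDB.

```shell
#!/bin/sh
# hot_keys (hypothetical helper): read "ctdb dbstatistics" output on stdin
# and print the key of every hot-key entry whose count is greater than zero.
# Assumes lines of the form "Count:<n> Key: <hex>" as in the sample output.
hot_keys() {
    awk '
        /^[ \t]*Count:/ {
            split($1, c, ":")                  # $1 is "Count:<n>"
            if (c[2] + 0 > 0 && NF >= 3) {
                print $3                       # $3 is the hex key
            }
        }
    '
}

# Usage on a protocol node:
#   ctdb dbstatistics locking.tdb | hot_keys
```

Entries with a count of zero or an empty key are skipped, so only records that were actually detected as hot are printed.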
When ctdb points to a hot record in locking.tdb, use the "net tdb locking" command to determine the file behind this record:
# /usr/lpp/mmfs/bin/net tdb locking 6a4128e3ced4681b017c0600000000000000000000000000
Share path: /ibm/fs1/smbexport
Name: testfile
Number of share modes: 2
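The share path and name can be joined into a single path for further inspection. The following is a minimal sketch, assuming the "Share path:" and "Name:" lines appear as in the sample above; the function name locked_path is hypothetical.

```shell
#!/bin/sh
# locked_path (hypothetical helper): read "net tdb locking <key>" output on
# stdin and print the share path joined with the name of the open file.
# Assumes the "Share path:" and "Name:" lines shown in the sample output.
locked_path() {
    awk '
        /^[ \t]*Share path:/ { sub(/^[ \t]*Share path:[ \t]*/, ""); share = $0 }
        /^[ \t]*Name:/       { sub(/^[ \t]*Name:[ \t]*/, "");       name = $0 }
        END { if (share != "") print share "/" name }
    '
}

# Usage on a protocol node, with the key reported by ctdb:
#   /usr/lpp/mmfs/bin/net tdb locking <key> | locked_path
```

For the sample output above, this prints /ibm/fs1/smbexport/testfile.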
If this happens on the root directory of an SMB export, a workaround is to exclude the root directory from cross-node locking:
mmsmb export change smbexport --option fileid:algorithm=fsname_norootdir

fsname_norootdir is set as default.
If this happens on files, the recommendation is to access that SMB export through only one CES IP address, so that the overhead of transferring the record between the nodes is avoided.
If the SMB export contains only subdirectories with home directories, where each subdirectory name matches a user name, the recommended configuration is an SMB export that uses the %U substitution to automatically map each user to the corresponding home directory:
mmsmb export add smbexport /ibm/fs1/%U
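The effect of the %U substitution can be pictured with a small illustration. The following is a sketch only: the SMB server performs the substitution internally, and the function name expand_user_path and the user name alice are hypothetical.

```shell
#!/bin/sh
# expand_user_path (hypothetical illustration): replace every %U in a path
# template with the given user name, mimicking the result of the
# substitution that the SMB server applies for each connecting user.
expand_user_path() {
    template=$1
    user=$2
    result=
    while :; do
        case $template in
            *%U*)
                # Append the text before the first %U, then the user name.
                result=$result${template%%\%U*}$user
                # Continue with the text after the first %U.
                template=${template#*%U}
                ;;
            *)
                break
                ;;
        esac
    done
    printf '%s\n' "$result$template"
}

# Example: the path that user "alice" would be mapped to.
expand_user_path /ibm/fs1/%U alice
```

With this mapping, each user opens a different directory as the export root, so the clients no longer contend for the record of one shared root directory.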