APAR status
Closed as program error.
Error description
This APAR addresses two issues related to NFS-Ganesha that can cause crashes.

Issue 1: NFS-Ganesha may crash with the following stack trace:

(gdb) bt
#0  0x00003fffa73e52e8 in raise () from /lib64/libpthread.so.0
#1  0x00003fffa7954628 in crash_handler (signo=6, info=0x3ffefac4a468, ctx=0x3ffefac496f0)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21.308708/MainNFSD/nfs_init.c:247
#2  <signal handler called>
#3  0x00003fffa717fcb0 in raise () from /lib64/libc.so.6
#4  0x00003fffa718200c in abort () from /lib64/libc.so.6
#5  0x00003fffa79b9fd4 in free_client_record (record=0x3fff200ed130)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21.308708/SAL/nfs4_clientid.c:1381
#6  0x00003fffa79ba3d8 in dec_client_record_ref (record=0x3fff200ed130)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21.308708/SAL/nfs4_clientid.c:1461
#7  0x00003fffa79b825c in nfs_client_id_expire (clientid=0x3fff200edbd0, make_stale=false)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21.308708/SAL/nfs4_clientid.c:914
#8  0x00003fffa79c7820 in reserve_lease_or_expire (clientid=0x3fff200edbd0, update=true)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21.308708/SAL/nfs4_lease.c:181
#9  0x00003fffa7a59db4 in nfs4_op_renew (op=0x3fff029152d0, data=0x3fff0320d9c0, resp=0x3ffee960cab0)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21.308708/Protocols/NFS/nfs4_op_renew.c:91
#10 0x00003fffa7a2ed80 in process_one_op (data=0x3fff0320d9c0, status=0x3ffefac4cfd0)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21.308708/Protocols/NFS/nfs4_Compound.c:920
#11 0x00003fffa7a30010 in nfs4_Compound (arg=0x3ffeeabd84a0, req=0x3ffeeabd7c90, res=0x3ffee9854f60)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21.308708/Protocols/NFS/nfs4_Compound.c:1327
#12 0x00003fffa794dae4 in nfs_rpc_process_request (reqdata=0x3ffeeabd7c90)

Issue 2: NFS-Ganesha may crash with the following stack trace:

#0  0x00007f27f0a984fb in raise () from /lib64/libpthread.so.0
#1  0x00007f27f2775d7b in crash_handler (signo=11, info=0x7f20e337e930, ctx=0x7f20e337e800)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21/MainNFSD/nfs_init.c:247
#2  <signal handler called>
#3  0x00007f27f28a3cf5 in nlm_granted_callback (obj=0x7f2430001378, lock_entry=0x7f2204302c20)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21/Protocols/NLM/nlm_util.c:609
#4  0x00007f27f27b133b in try_to_grant_lock (lock_entry=0x7f2204302c20)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21/SAL/state_lock.c:1732
#5  0x00007f27f27b177b in process_blocked_lock_upcall (block_data=0x7f2204305510)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21/SAL/state_lock.c:1780
#6  0x00007f27f27ac19c in state_blocked_lock_caller (ctx=0x7f21c8408650)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21/SAL/state_async.c:81
#7  0x00007f27f27f62bd in fridgethr_start_routine (arg=0x7f21c8408650)
    at /usr/src/debug/nfs-ganesha-3.5-ibm071.21/support/fridgethr.c:556
#8  0x00007f27f0a90ea5 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f27f018fb0d in clone () from /lib64/libc.so.6
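When comparing a core dump against the two traces above, it can help to reduce a backtrace to its function names and check for the frames that distinguish each issue. The following is a minimal sketch, not an IBM-provided diagnostic: the helper names and the choice of which frames count as a "signature" are assumptions made for illustration, using only function names that appear in the traces in this APAR.

```python
import re
from typing import List, Optional

# Distinguishing frames taken from the two backtraces in this APAR.
# Treating these pairs as signatures is an assumption for illustration only.
SIGNATURES = {
    "Issue 1": ("free_client_record", "nfs_client_id_expire"),
    "Issue 2": ("nlm_granted_callback", "try_to_grant_lock"),
}

# Matches gdb frames such as "#5 0x00003fff... in free_client_record (".
# Frames without a function name, like "#2 <signal handler called>", are skipped.
FRAME_RE = re.compile(r"#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\(")


def frame_functions(backtrace: str) -> List[str]:
    """Return the function names of all frames, in stack order."""
    return FRAME_RE.findall(backtrace)


def classify(backtrace: str) -> Optional[str]:
    """Return "Issue 1" or "Issue 2" if all signature frames are present, else None."""
    funcs = set(frame_functions(backtrace))
    for issue, required in SIGNATURES.items():
        if all(name in funcs for name in required):
            return issue
    return None
```

A match only indicates that the trace resembles one of the two documented crashes; it does not confirm the root cause.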
Local fix
Problem summary
This APAR addresses two issues related to NFS-Ganesha that can cause crashes. The stack traces for Issue 1 and Issue 2 are listed under Error description.
Problem conclusion
This problem is fixed in 5.1.2.12.
To see all Spectrum Scale APARs and their respective fix solutions, refer to: https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
Benefits of the solution: The code has been modified to address the crashes.
Workaround: None
Problem Trigger: For Issue 1, the crash is related to the NFSv4 lease period and can occur due to timing issues, such as delays in lease renewal or a heavily loaded server with multiple client requests. For Issue 2, the crash is related to blocking lock requests and lock upgrades on the same file by multiple threads, which can lead to timing issues.
Platforms Affected: Linux only
Functional Area Affected: NFS-Ganesha crash followed by CES-IP failover.
Customer Impact: Medium Importance
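To check whether a given installation level already includes this fix, the level can be compared numerically against 5.1.2.12. The sketch below is illustrative only: the function names are hypothetical, the installed level is assumed to be supplied by the caller as a dotted string, and the comparison is deliberately limited to the 5.1.2 stream because this APAR states the fix level only for that stream.

```python
from typing import Optional, Tuple

FIXED_LEVEL = "5.1.2.12"  # fix level stated in this APAR


def parse_level(level: str) -> Tuple[int, ...]:
    """Turn "5.1.2.12" into (5, 1, 2, 12), padding short levels with zeros."""
    parts = [int(p) for p in level.strip().split(".")]
    parts += [0] * (4 - len(parts))
    return tuple(parts)


def has_fix(installed: str, fixed: str = FIXED_LEVEL) -> Optional[bool]:
    """Return True/False within the same V.R.M stream, None for other streams.

    None means "unknown": this APAR does not say which levels of other
    streams contain the fix, so no answer is guessed here.
    """
    inst, fix = parse_level(installed), parse_level(fixed)
    if inst[:3] != fix[:3]:
        return None
    return inst >= fix
```

For levels outside the 5.1.2 stream, the Spectrum Scale APAR list referenced above is the place to confirm whether the fix is included.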
Temporary fix
Comments
APAR Information
APAR number
IJ47409
Reported component name
SPEC SCALE STD
Reported component ID
5737F33AP
Reported release
512
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2023-06-29
Closed date
2023-07-19
Last modified date
2023-07-19
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE STD
Fixed component ID
5737F33AP
Applicable component levels
[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"STXKQY"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"512","Line of Business":{"code":"LOB26","label":"Storage"}}]
Document Information
Modified date:
20 July 2023