A fix is available
APAR status
Closed as program error.
Error description
The NFS v4 server will crash when cleanup runs following a failed file open. The stack trace looks like this:

KDB(0)> f
pvthread+00AD00 STACK:
0001BF00 abend_trap+000000 ()
0486C558 find_stale_fr+000258 (??, ??)
04867B4C fr_gc+0001EC (??, ??)
048681D8 node_gc+0001D8 (??)
048641BC wi_do_job+00011C (??)
0486F230 sm4_cleanup+000170 (??, ??, ??)
0033E9B0 procentry+000010 (??, ??, ??, ??)

From the file record, you can get the p_fh and find the thread that allocated the p_fh. That thread should still be in the middle of the open. Its stack trace looks like this:

KDB(0)> f 4786
pvthread+12B200 STACK:
0052BE60 slock+000480 (00000000000D3770, 8000000000001032 ??)
00009558 .simple_lock+000058 ()
04884460 sm4_cleanup_fr+0000C0 (??, ??)
04882DAC end_open_no_or+00026C (??, ??, ??, ??, ??, ??, ??, ??)
0488CCA8 sm4_end_open+000408 (??, ??)
048AD6D0 open_nocreate+000450 (??, ??, ??, ??)
048AFA8C rfs4_open+00026C (??)
047F44CC rfs4_dispatch_i+00002C (??)
047F4CB8 rfs4_dispatch+000558 (??, ??)
04731388 svc_getreq+0008C8 (??)
0033F4DC threadentry+00005C (??, ??, ??, ??)

Thread pvthread+12B200 turned off all state but did not take the fr off the hash list. Meanwhile, the cleanup thread came in, found that fr via the hash list, failed the RAS check, and abended the machine.
Local fix
Problem summary
The NFS v4 server will crash when cleanup runs following a failed file open. The stack trace looks like this:

KDB(0)> f
pvthread+00AD00 STACK:
0001BF00 abend_trap+000000 ()
0486C558 find_stale_fr+000258 (??, ??)
04867B4C fr_gc+0001EC (??, ??)
048681D8 node_gc+0001D8 (??)
048641BC wi_do_job+00011C (??)
0486F230 sm4_cleanup+000170 (??, ??, ??)
0033E9B0 procentry+000010 (??, ??, ??, ??)

From the file record, you can get the p_fh and find the thread that allocated the p_fh. That thread should still be in the middle of the open. Its stack trace looks like this:

KDB(0)> f 4786
pvthread+12B200 STACK:
0052BE60 slock+000480 (00000000000D3770, 8000000000001032 ??)
00009558 .simple_lock+000058 ()
04884460 sm4_cleanup_fr+0000C0 (??, ??)
04882DAC end_open_no_or+00026C (??, ??, ??, ??, ??, ??, ??, ??)
0488CCA8 sm4_end_open+000408 (??, ??)
048AD6D0 open_nocreate+000450 (??, ??, ??, ??)
048AFA8C rfs4_open+00026C (??)
047F44CC rfs4_dispatch_i+00002C (??)
047F4CB8 rfs4_dispatch+000558 (??, ??)
04731388 svc_getreq+0008C8 (??)
0033F4DC threadentry+00005C (??, ??, ??, ??)

Thread pvthread+12B200 turned off all state but did not take the fr off the hash list. Meanwhile, the cleanup thread came in, found that fr via the hash list, failed the RAS check, and abended the machine.
Problem conclusion
Do not clear the flag until the fr is off the hash list.
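The ordering rule can be illustrated with a minimal, single-threaded C sketch. It is not the AIX source: the structure and function names (file_rec, hash_bucket, release_rec, count_inconsistent) are hypothetical, and the locking a real kernel would need around the list is left out. The sketch only shows that a record whose flag is cleared while it is still linked looks inconsistent to a scan, whereas unlinking first never exposes that state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified file record; not the real AIX structure. */
struct file_rec {
    bool in_use;            /* state flag that a cleanup scan checks */
    struct file_rec *next;  /* link in the hash chain */
};

/* Hypothetical hash bucket holding a singly linked chain of records. */
struct hash_bucket {
    struct file_rec *head;
};

/* Mark the record as in use and link it at the head of the chain. */
static void bucket_insert(struct hash_bucket *b, struct file_rec *fr)
{
    fr->in_use = true;
    fr->next = b->head;
    b->head = fr;
}

/* Unlink the record from the chain if it is present. */
static void bucket_remove(struct hash_bucket *b, struct file_rec *fr)
{
    for (struct file_rec **pp = &b->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == fr) {
            *pp = fr->next;
            fr->next = NULL;
            return;
        }
    }
}

/* Stand-in for the cleanup scan: count records that are still on the
 * chain although their flag is already cleared. */
static int count_inconsistent(const struct hash_bucket *b)
{
    int n = 0;
    for (const struct file_rec *fr = b->head; fr != NULL; fr = fr->next) {
        if (!fr->in_use)
            n++;
    }
    return n;
}

/* Ordering described in the conclusion: unlink first, then clear the
 * flag, so a scan of the chain never sees a cleared record. */
static void release_rec(struct hash_bucket *b, struct file_rec *fr)
{
    bucket_remove(b, fr);
    fr->in_use = false;
}
```

In this sketch, clearing in_use by hand on a record that is still linked makes count_inconsistent return a nonzero value, which corresponds to the state the cleanup thread tripped over; releasing the record through release_rec keeps the count at zero.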
Temporary fix
Comments
6100-04 - use AIX APAR IZ87676
6100-05 - use AIX APAR IZ87632
6100-06 - use AIX APAR IZ85172
7100-00 - use AIX APAR IZ85599
APAR Information
APAR number
IZ85172
Reported component name
AIX 610 STD EDI
Reported component ID
5765G6200
Reported release
610
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Submitted date
2010-09-16
Closed date
2010-09-16
Last modified date
2013-04-17
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
AIX 610 STD EDI
Fixed component ID
5765G6200
Applicable component levels
R610 PSY U835918
UP11/05/10 I 1000
PTF to Fileset Mapping
U835918 bos.net.nfs.client 6.1.6.15
Document Information
Modified date:
17 April 2013