APAR status
Closed as program error.
Error description
A rare race condition can cause a crash with a stack similar to:

(4)> f pvthread+042700
STACK:
[00308324]seCOWIOWait+000404 (F1000A003AE20000, F1000A0EE01CC090 [??])
[0030DCCC]seCOW+000B4C (??, ??)
[0030CC68]seCOD+0000A8 (??, ??, ??, ??)
[0032C03C]txFreeMap+00029C (??, ??, ??, ??)
[0035EFFC]xtFreeMap+00021C (??, ??, ??)
[0032A420]txUpdateMap+000240 (??, ??)
[0032D4D0]txCommit+0006D0 (??, ??, ??, ??)
[0034EE50]j2_remove+000470 (??, ??, ??, ??)
[0066DE78]vnop_remove+000438 (??, ??, ??, ??)
[006C0A60]kunlinkat+000460 (FFFFFFFEFFFFFFFE, 0000000111CAED11, 0000000000000000, 0000000000000000)
[00003938]syscall+000230 ()
[kdb_get_virtual_memory] no real storage @ FFFFFFFFFFFB840
[100047098]0000000100047098 ()
[kdb_read_mem] no real storage @ FFFFFFFFFFF8D60

The crash occurs in the seCOWIOWait function. A second thread named snapshot will also be seen, in which the snapshot command is being run to either delete a snapshot or create a new snapshot. The snapshot command deletes a snapshot resource needed by the thread that crashes.
Local fix
Problem summary
A rare race condition can cause a crash with a stack similar to:

(4)> f pvthread+042700
STACK:
[00308324]seCOWIOWait+000404 (F1000A003AE20000, F1000A0EE01CC090 [??])
[0030DCCC]seCOW+000B4C (??, ??)
[0030CC68]seCOD+0000A8 (??, ??, ??, ??)
[0032C03C]txFreeMap+00029C (??, ??, ??, ??)
[0035EFFC]xtFreeMap+00021C (??, ??, ??)
[0032A420]txUpdateMap+000240 (??, ??)
[0032D4D0]txCommit+0006D0 (??, ??, ??, ??)
[0034EE50]j2_remove+000470 (??, ??, ??, ??)
[0066DE78]vnop_remove+000438 (??, ??, ??, ??)
[006C0A60]kunlinkat+000460 (FFFFFFFEFFFFFFFE, 0000000111CAED11, 0000000000000000, 0000000000000000)
[00003938]syscall+000230 ()
[kdb_get_virtual_memory] no real storage @ FFFFFFFFFFFB840
[100047098]0000000100047098 ()
[kdb_read_mem] no real storage @ FFFFFFFFFFF8D60

The crash occurs in the seCOWIOWait function. A second thread named snapshot will also be seen, in which the snapshot command is being run to either delete a snapshot or create a new snapshot. The snapshot command deletes a snapshot resource needed by the thread that crashes.
Problem conclusion
Added a check to avoid the synchronization issue between the snapshot delete command and file removal, which prevents the system crash.
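The APAR does not publish the code change itself, so the following is only a generic illustration of this class of fix, not the AIX implementation. It is a minimal Python sketch in which every name (Snapshot, valid, cow_buffer, delete_snapshot, remove_file) is hypothetical: the file-remove path checks, under the same lock the delete path uses, that the snapshot resource still exists before touching it.

```python
import threading


class Snapshot:
    """Hypothetical stand-in for a snapshot resource shared by two threads."""

    def __init__(self):
        self.lock = threading.Lock()
        self.valid = True                 # cleared when the snapshot is deleted
        self.cow_buffer = bytearray(16)   # resource the remove path depends on


def delete_snapshot(snap):
    """Models the snapshot command tearing down the shared resource."""
    with snap.lock:
        snap.valid = False
        snap.cow_buffer = None


def remove_file(snap):
    """Models a file remove that would copy data into the snapshot."""
    with snap.lock:
        # The added check: do not touch a resource that has been deleted.
        if not snap.valid:
            return "skipped"
        snap.cow_buffer[0] = 1
        return "copied"


def run_demo():
    snap = Snapshot()
    first = remove_file(snap)             # snapshot still present
    deleter = threading.Thread(target=delete_snapshot, args=(snap,))
    deleter.start()
    deleter.join()
    second = remove_file(snap)            # snapshot already deleted
    return first, second
```

Without the validity check, the second remove_file call in this sketch would dereference a resource that no longer exists, which is the analogue of the crash described above.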
Temporary fix
Comments
APAR Information
APAR number
IV88438
Reported component name
AIX V7.1
Reported component ID
5765H4000
Reported release
710
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2016-08-19
Closed date
2016-08-19
Last modified date
2017-01-20
APAR is sysrouted FROM one or more of the following:
IV79798
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
AIX V7.1
Fixed component ID
5765H4000
Applicable component levels
R710 PSY U872346
UP17/01/19 I 1000
Document Information
Modified date:
20 April 2022