APAR status
Closed as program error.
Error description
Assert causing, in turn, kernel crash.

-- mmfs.log:
[D] logAssertFailed: Kernel thread 23567 pid 27230 file /project/sprelgpfs516/build/rgpfs516s001d/obj/x86_64-linux/avs/fs/mmfs/ts/fs/fetch-vfs-kx.C line 2520: nPrefetchedBuffers > 0
[D] Check vmcore and vmcore-dmesg.txt under /var/crash for relevant information.
[D] Stack for process 27230 kernel thread 23567 in /proc/27230/task/23567/stack:
[D] [<0>] cxiWaitEventWait+0x212/0x340 [mmfslinux]
[D] [<0>] _ZN6ThCond12internalWaitEP16KernelSynchStatejPv+0x58/0x250 [mmfs26]
runmmfs starting (32239)

-- vmcore.dmesg:
[418611.235966] GPFS logAssertFailed: nPrefetchedBuffers > 0 file /project/sprelgpfs516/build/rgpfs516s001d/obj/x86_64-linux/avs/fs/mmfs/ts/fs/fetch-vfs-kx.C line 2520
[418612.872051] CIFS PidTable: buckets 64
[418612.872059] CIFS BufTable: buckets 64
[418614.687996] Initializing the GPFS shared trace buffer: pid 31124 name lxtrace-5.14.21 completeInit 0
[418622.463287] <5>cxiPanic: thread 23567 pid 27230 forcePanic 0: fetch-vfs-kx.C:2520:0:177617:FFFFFFFFC15461E0::nPrefetchedBuffers > 0
[418622.463330] ------------[ cut here ]------------
[418622.463332] kernel BUG at /usr/lpp/mmfs/src/gpl-linux/cxiSystem.c:2619!
[418622.463341] invalid opcode: 0000 [#1] PREEMPT SMP PTI
...
[418622.463378] RIP: 0010:cxiPanic+0x52/0xb0 [mmfslinux]
...
[418622.463478] Call Trace:
[418622.463482]  <TASK>
[418622.463485]  logAssertFailed+0x2f2/0x370 [mmfs26 19f6b68fe542c235fb3247e98c37423ef15a2588]
[418622.463605]  ? _ZN12OpenInstance15doneWithBufferMEP10BufferDescxi+0x347/0x400 [mmfs26 19f6b68fe542c235fb3247e98c37423ef15a2588]
[418622.463677]  ? _Z9kSFSWriteP15KernelOperationP13gpfsVfsData_tP10gpfsNode_tP9MMFSVInfoiP8cxiUio_ti7FileUIDP8OpenFileP10ext_cred_tP10WhatLockedj+0x3c1f/0x5a70 [mmfs26 19f6b68fe542c235fb3247e98c37423ef15a2588]
[418622.463728]  ? _Z9gpfsWriteP13gpfsVfsData_tP15KernelOperationP9cxiNode_tiP8cxiUio_tP9MMFSVInfoP10cxiVattr_tSA_P10ext_cred_tP14cxiPageLists_tiP+0x2156/0x7090 [mmfs26 19f6b68fe542c235fb3247e98c37423ef15a2588]
[418622.463789]  ? up+0x12/0x60
[418622.463795]  ? down+0x1a/0x60
[418622.463797]  ? cxiBlockingMutexAcquire+0x18f/0x220 [mmfslinux ddd4b60f1e6f01da172ce0fa5c59a6c84411e2e8]
[418622.463819]  ? cxiBlockingMutexAcquire+0x18f/0x220 [mmfslinux ddd4b60f1e6f01da172ce0fa5c59a6c84411e2e8]
[418622.463839]  ? up+0x12/0x60
[418622.463842]  ? cxiBlockingMutexRelease+0xdc/0xe0 [mmfslinux ddd4b60f1e6f01da172ce0fa5c59a6c84411e2e8]
[418622.463862]  ? _ZN15KernelOperationD1Ev+0x56/0xb0 [mmfs26 19f6b68fe542c235fb3247e98c37423ef15a2588]
[418622.463910]  ? _Z33gpfsIsCifsBypassTraversalCheckingv+0xa8/0xc0 [mmfs26 19f6b68fe542c235fb3247e98c37423ef15a2588]
[418622.463966]  ? rdwrInternal+0x467/0x6b0 [mmfslinux ddd4b60f1e6f01da172ce0fa5c59a6c84411e2e8]
[418622.463985]  ? gpfs_f_aio_write_internal+0x180/0x3a0 [mmfslinux ddd4b60f1e6f01da172ce0fa5c59a6c84411e2e8]
[418622.464001]  ? select_task_rq_fair+0x17f/0x13c0
[418622.464006]  ? gpfs_f_aio_write_internal+0x130/0x3a0 [mmfslinux ddd4b60f1e6f01da172ce0fa5c59a6c84411e2e8]
[418622.464022]  ? select_task_rq_fair+0x17f/0x13c0
[418622.464025]  ? trace_raw_output_sched_kthread_work_execute_start+0x50/0x50
[418622.464031]  ? newidle_balance+0x2f0/0x430
[418622.464034]  ? dequeue_entity+0xf3/0x3d0
[418622.464037]  ? __update_idle_core+0x1b/0xb0
[418622.464041]  ? gpfs_f_rdwr_iter+0x1e7/0x350 [mmfslinux ddd4b60f1e6f01da172ce0fa5c59a6c84411e2e8]
[418622.464056]  ? __wake_up_common_lock+0x87/0xc0
[418622.464060]  ? gpfs_f_rdwr_iter+0x1e7/0x350 [mmfslinux ddd4b60f1e6f01da172ce0fa5c59a6c84411e2e8]
[418622.464075]  ? enqueue_hrtimer+0x2f/0x80
[418622.464079]  ? __remove_hrtimer+0x39/0x70
[418622.464081]  ? aa_file_perm+0x126/0x4f0
[418622.464085]  ? hrtimer_cancel+0x11/0x20
[418622.464088]  ? new_sync_write+0x11f/0x1c0
[418622.464092]  ? new_sync_write+0x11f/0x1c0
[418622.464096]  ? vfs_write+0x220/0x280
[418622.464099]  ? ksys_pwrite64+0x75/0x90
[418622.464103]  ? do_syscall_64+0x5b/0x80
[418622.464107]  ? exit_to_user_mode_prepare+0x1dc/0x230
[418622.464111]  ? syscall_exit_to_user_mode+0x18/0x40
[418622.464114]  ? do_syscall_64+0x67/0x80
[418622.464117]  ? do_syscall_64+0x67/0x80
[418622.464120]  ? do_syscall_64+0x67/0x80
[418622.464122]  ? entry_SYSCALL_64_after_hwframe+0x61/0xcb
Local fix
Problem summary
Kernel crash with assert: nPrefetchedBuffers > 0. This can happen when an application uses multiple threads to perform sequential reads or writes of more than 65535 blocks on the same open file. The problem occurs only when the starting offset of the read/write is not on a GPFS block boundary.
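The trigger conditions above can be expressed as a small check. This is only an illustrative sketch: the function names are hypothetical, the file system block size is a value the caller must supply, and the APAR does not state how GPFS counts blocks internally, so the block arithmetic here is an assumption.

```python
APAR_BLOCK_LIMIT = 65535  # threshold quoted in the APAR text


def blocks_touched(start_offset, total_bytes, block_size):
    """Count the blocks spanned by [start_offset, start_offset + total_bytes)."""
    if total_bytes <= 0:
        return 0
    first = start_offset // block_size
    last = (start_offset + total_bytes - 1) // block_size
    return last - first + 1


def may_hit_assert(start_offset, total_bytes, block_size, threads):
    """Return True when an I/O pattern matches all conditions named in the APAR.

    Assumption: "more than 65535 blocks" is read as the number of
    file system blocks spanned by the sequential I/O on one open file.
    """
    unaligned = start_offset % block_size != 0
    many_blocks = blocks_touched(start_offset, total_bytes, block_size) > APAR_BLOCK_LIMIT
    return threads > 1 and unaligned and many_blocks
```

A pattern that starts on a block boundary, or that uses a single thread, does not match the description and returns False.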
Problem conclusion
This problem is fixed in 5.1.8.1.
To see all Spectrum Scale APARs and their respective fix solutions, refer to: https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
Benefits of the solution: Prevents the unexpected kernel crash that could occur when multiple threads are used to perform sequential read/write on the same open file.
Work Around: Close and reopen the file before performing more than 65535 sequential reads/writes on the same file using multiple threads.
Problem trigger: Performing sequential read/write on the same file using multiple threads, where the starting offset of each read/write is not on a GPFS block boundary.
Symptom: Abend/Crash
Platforms affected: ALL Operating System environments
Functional Area affected: All Scale Users
Customer Impact: High Importance
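For systems that cannot yet move to a fixed level, the close/reopen workaround could be wrapped in application code along these lines. This is a minimal sketch, not IBM-provided code: the class name ReopeningWriter and the max_ops budget are hypothetical, and because the APAR does not say whether the limit is counted per call or per block, the sketch counts write calls and uses a budget below 65535 as a conservative assumption. It also serializes writes under one lock so that the descriptor is never closed while another thread is using it.

```python
import os
import threading


class ReopeningWriter:
    """Shared writer that closes and reopens its file before a call budget is exceeded."""

    def __init__(self, path, max_ops=60000):
        # max_ops is an assumed safety margin below the 65535 figure in the APAR.
        self._path = path
        self._max_ops = max_ops
        self._lock = threading.Lock()
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
        self._ops = 0
        self.reopens = 0

    def pwrite(self, data, offset):
        """Write data at offset, reopening the file first if the budget is used up."""
        with self._lock:
            if self._ops >= self._max_ops:
                os.close(self._fd)
                self._fd = os.open(self._path, os.O_WRONLY)
                self._ops = 0
                self.reopens += 1
            self._ops += 1
            return os.pwrite(self._fd, data, offset)

    def close(self):
        with self._lock:
            os.close(self._fd)
```

Worker threads would share one ReopeningWriter instance and call pwrite with their own offsets. The single lock trades write concurrency for safety, which may be acceptable only as a stopgap until the fix is installed.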
Temporary fix
Comments
APAR Information
APAR number
IJ46715
Reported component name
SPEC SCALE STD
Reported component ID
5737F33AP
Reported release
516
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2023-05-09
Closed date
2023-07-21
Last modified date
2023-07-21
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE STD
Fixed component ID
5737F33AP
Applicable component levels
Document Information
Modified date:
21 July 2023