IBM Support

IBM Spectrum Scale Alert: Releases 4.2.3.18 or later and 5.0.4.0 or later have issues resulting in kernel crashes on RHEL7.7 with kernel 3.10.0-1062.18.1 or higher and releases 4.2.3.12 or later and 5.0.2.2 or later on RHEL7.6 with kernel 3.10.0-95

Flashes (Alerts)


Abstract

IBM has identified an issue in IBM Spectrum Scale in which a node may encounter a kernel crash while performing operations on files in the file system. The issue affects versions that support RHEL7.7 (4.2.3.18 or later and 5.0.4.0 or later) on nodes running kernel 3.10.0-1062.18.1 or higher, and versions that support RHEL7.6 (4.2.3.12 or later and 5.0.2.2 or later) on nodes running kernel 3.10.0-957.47.1.el7 or higher. A fix is available.

Content

This issue affects IBM Spectrum Scale versions 4.2.3.18 or later and 5.0.4.0 or later running on RHEL7.7, and IBM Spectrum Scale versions 4.2.3.13 or later and 5.0.2.2 or later running on RHEL7.6. After the RHEL7.7 kernel is upgraded to 3.10.0-1062.18.1 or higher, or the RHEL7.6 kernel to 3.10.0-957.47.1.el7 or higher, a node may encounter a kernel crash while performing operations on files in the file system.
How to determine if system is affected:
Systems running RHEL7.7 with kernel 3.10.0-1062.18.1 or higher, or RHEL7.6 with kernel 3.10.0-957.47.1.el7 or higher, may encounter a kernel crash.
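The kernel part of this check can be sketched in a few lines of Python. This is a minimal illustration, not an IBM tool: the function name `kernel_in_affected_range` is hypothetical, the thresholds are copied from this alert, and the sketch assumes that RHEL7.6 kernels carry build number 957 and RHEL7.7 kernels carry build number 1062. It returns None for any other kernel series, which this sketch does not classify. The IBM Spectrum Scale version must still be checked separately.

```python
import platform
import re

# First kernel level named in this alert, keyed by kernel build number.
# Assumption: build 957 is RHEL7.6 and build 1062 is RHEL7.7.
THRESHOLDS = {
    957: (47, 1),    # 3.10.0-957.47.1.el7 or higher
    1062: (18, 1),   # 3.10.0-1062.18.1 or higher
}


def kernel_in_affected_range(release):
    """Return True or False for 957/1062 series kernels, None otherwise."""
    m = re.match(r"^3\.10\.0-(\d+)((?:\.\d+)*)", release)
    if not m:
        return None
    series = int(m.group(1))
    if series not in THRESHOLDS:
        return None
    # Numeric parts after the build number, e.g. ".18.1" -> (18, 1).
    rest = tuple(int(p) for p in m.group(2).split(".") if p)
    return rest >= THRESHOLDS[series]


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    print(platform.release(), kernel_in_affected_range(platform.release()))
```

A result of True only means the kernel is at a level named in this alert; whether the node is exposed also depends on the installed IBM Spectrum Scale level.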
The following are examples of kernel stack traces for the crash:
[ 2915.625015] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[ 2915.633770] IP: [<ffffffffc0e2cf90>] cxiDropSambaDCacheEntry+0x190/0x1b0 [mmfslinux]
[ 2915.914097]  [<ffffffffc0e3d28c>] gpfs_i_rmdir+0x29c/0x310 [mmfslinux]
[ 2915.921381]  [<ffffffffb9663130>] ? take_dentry_name_snapshot+0xf0/0xf0
[ 2915.928760]  [<ffffffffb9664f60>] ? shrink_dcache_parent+0x60/0x90
[ 2915.935656]  [<ffffffffb96577cc>] vfs_rmdir+0xdc/0x150
[ 2915.941388]  [<ffffffffb965cca1>] do_rmdir+0x1f1/0x220
[ 2915.947119]  [<ffffffffb964ce66>] ? __fput+0x186/0x260
[ 2915.952849]  [<ffffffffb964d02e>] ? ____fput+0xe/0x10
[ 2915.958484]  [<ffffffffb94c2e60>] ? task_work_run+0xc0/0xe0
[ 2915.964701]  [<ffffffffb965df05>] SyS_unlinkat+0x25/0x40
   
[1224278.495993] [<ffffffff88e63918>] __dentry_kill+0x128/0x190
[1224278.496678] [<ffffffff88e63a36>] dput+0xb6/0x1a0
[1224278.497378] [<ffffffff88e64116>] d_prune_aliases+0xb6/0xf0
[1224278.498083] [<ffffffffc0c2c0ea>] cxiPruneDCacheEntry+0x13a/0x1c0 [mmfslinux]
[1224278.498798] [<ffffffffc0eba608>] _ZN10gpfsNode_t16invalidateOSNodeEPS_Pvij+0x108/0x350 [mmfs26]
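To compare a saved kernel log against the example traces above, a simple text scan may be enough. The sketch below is only a heuristic: the helper name `find_crash_signatures` is hypothetical, and the two function names it searches for are taken from the example traces in this alert. A match suggests the log is worth a closer look; it does not prove that a crash was caused by this issue.

```python
# Function names that appear in the example stack traces in this alert.
SIGNATURES = ("cxiDropSambaDCacheEntry", "cxiPruneDCacheEntry")


def find_crash_signatures(log_text):
    """Return (symbol, line) pairs for log lines that mention a known symbol."""
    hits = []
    for line in log_text.splitlines():
        for symbol in SIGNATURES:
            if symbol in line:
                hits.append((symbol, line.strip()))
    return hits
```

For example, the text of a saved console or crash log can be read with `open(path).read()` and passed to the function; an empty list means neither symbol was found.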
Cause:
It appears that a retrofit of the following change into newer kernels has caused an inconsistency between the GPFS kernel portability layer and the kernel proper.

* Wed Feb 12 2020 Frantisek Hrbata <fhrbata@hrbata.com> [3.10.0-1062.18.1.el7]
...
- [fs] fix inode leaks on d_splice_alias() failure exits (Miklos Szeredi) [1781159 1749390]
Problem Determination:
Updating a node running Spectrum Scale V4.2.3.18 or later, or V5.0.4.0 or later, on RHEL7.7 to kernel version 3.10.0-1062.18.1 or higher, or a node running Spectrum Scale V4.2.3.13 or later, or V5.0.2.2 or later, on RHEL7.6 to kernel 3.10.0-957.47.1.el7 or higher, may result in a crash with one of the stack traces above.
Recommendations:
Upgrade the RHEL7.7 kernel to 3.10.0-1062.18.1 or higher (that is, apply Red Hat patch RHSA-2020:0834) only after applying the fix available in IBM Spectrum Scale.
For IBM Spectrum Scale V4.2.3.18 through V4.2.3.21, apply V4.2.3.22 or later, available from Fix Central.
Please contact IBM Service for an efix:
For IBM Spectrum Scale V5.0.4.0 through V5.0.4.3, reference APAR IJ24294
For IBM Spectrum Scale V4.2.3.18 through V4.2.3.21, reference APAR IJ24603
To contact IBM Service, see http://www.ibm.com/planetwide/
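The efix references above amount to a small lookup from installed version to APAR number. The sketch below is illustrative only: the helper name `apar_for` is hypothetical, the version ranges and APAR numbers are copied from the lines above, and the version string is assumed to be supplied by the administrator. A result of None means only that no efix APAR is listed here for that version, not that the version is unaffected.

```python
# Inclusive version ranges and the APAR listed for each in this alert.
EFIX_APARS = [
    ((5, 0, 4, 0), (5, 0, 4, 3), "IJ24294"),
    ((4, 2, 3, 18), (4, 2, 3, 21), "IJ24603"),
]


def apar_for(version):
    """Map a version such as '5.0.4.2' or 'V4.2.3.20' to its APAR, or None."""
    parts = tuple(int(p) for p in version.lstrip("Vv").split("."))
    for low, high, apar in EFIX_APARS:
        if low <= parts <= high:
            return apar
    return None
```

Quoting the returned APAR number when contacting IBM Service identifies the efix being requested.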


Document Information

Modified date:
02 October 2020

UID

ibm16193107