IBM Support

IBM Spectrum Scale support: Kernel crashes on Ubuntu 16.04.1 with kernel version 4.4.0-57 or later

Troubleshooting


Problem

On Ubuntu 16.04.1, after the kernel is upgraded to version 4.4.0-57 or later, the kernel may crash during operations that set or remove an extended file attribute, for example through the setxattr and removexattr system calls.

Symptom

You may encounter a "kernel:BUG: unable to handle kernel NULL pointer dereference" message and see the following stack backtrace from dmesg:

[ 1996.301280] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 1996.301417] IP: posix_acl_valid+0x12/0xf0
[ 1996.301555] PGD ba1eb067 PUD ba5c8067 PMD 0
[ 1996.301677] Oops: 0000 [#1] SMP
[ 1996.301796] Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) ppdev snd_hda_codec_generic snd_hda_intel snd_hda_codec joydev snd_hda_core
 input_leds snd_hwdep snd_pcm snd_timer snd soundcore serio_raw parport_pc pvpanic parport i2c_piix4 8250_fintek mac_hid binfmt_misc ib_iser
 rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov
 async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear qxl ttm drm_kms_helper syscopyarea sysfillrect
 sysimgblt fb_sys_fops drm psmouse virtio_scsi floppy pata_acpi [last unloaded: tracedev]
[ 1996.302662] CPU: 1 PID: 32740 Comm: cp Tainted: G           OE   4.4.0-62-generic #83-Ubuntu
[ 1996.302784] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
[ 1996.302923] task: ffff8800b8503fc0 ti: ffff8800ba468000 task.ti: ffff8800ba468000
[ 1996.303045] RIP: 0010:   posix_acl_valid+0x12/0xf0
[ 1996.303182] RSP: 0018:ffff8800ba46ba88  EFLAGS: 00010282
[...]

[ 1996.304795] Call Trace:
[ 1996.304983]   gpfs_set_posix_acl+0xbe/0x320 [mmfslinux]
[ 1996.305108]   ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[ 1996.305300]   ? gpfs_set_posix_acl+0x320/0x320 [mmfslinux]
[ 1996.305428]   ? up+0x32/0x50
[ 1996.305552]   ? shFind+0x92/0x230 [mmfslinux]
[ 1996.305682]   ? cxiTraceEntry+0x9c/0x400 [mmfslinux]
[ 1996.305804]   ? up+0x32/0x50
[ 1996.305931]   gpfs_i_setxattr+0x149/0x560 [mmfslinux]
[ 1996.306062]   ? gpfs_i_setattr_internal+0x147/0xa00 [mmfslinux]
[ 1996.306186]   ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[ 1996.306314]   ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[ 1996.306440]   ? evm_protected_xattr+0x44/0xa0
[ 1996.306555]   ? posix_xattr_acl+0x12/0x4a
[ 1996.306678]   __vfs_setxattr_noperm+0xac/0x1a0
[ 1996.306798]   ? security_inode_setxattr+0xbd/0xd0
[ 1996.306937]   vfs_setxattr+0xa7/0xb0
[ 1996.307052]   setxattr+0x128/0x200
[ 1996.307851]   ? cp_new_stat+0x153/0x180
[ 1996.308134]   ? SYSC_newfstat+0x34/0x60
[ 1996.308477]   ? percpu_down_read+0x12/0x50
[ 1996.308878]   SyS_fsetxattr+0xa0/0xd0
[ 1996.309261]   entry_SYSCALL_64_fastpath+0x16/0x71
[ 1996.309667] Code: 89 5a 26 c1 f8 03 66 89 42 1e 48 89 d0 5b 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55
 41 54 53 <8b> 46 10 48 8d 5e 14 4c 8d 3c c3 4c 39 fb 73 5b 44 0f b7 6e 16
[ 1996.310616] RIP   posix_acl_valid+0x12/0xf0
[ 1996.311033]  RSP <ffff8800ba46ba88>
[ 1996.311362] CR2: 0000000000000010

Cause

The crash is caused by the following upstream kernel commit, which was introduced in kernel version 4.8 and which Ubuntu backported to its kernel version 4.4.0-57:

commit 0d4d717f25834134bb6f43284f84c8ccee5bbf2a
Author: Eric W. Biederman <ebiederm@xmission.com>
Date: Mon Jun 27 16:04:06 2016 -0500

vfs: Verify acls are valid within superblock's s_user_ns.
Update posix_acl_valid to verify that an acl is within a user namespace.
[...]

Environment

Ubuntu 16.04.1 with kernel version 4.4.0-57 or later.
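
A minimal Python sketch of the version check described above. It assumes that Ubuntu kernel release strings have the form "4.4.0-62-generic" (as in the trace in the Symptom section), where the number after the dash is the Ubuntu build number. The function name is_affected_release is hypothetical, and the sketch deliberately makes no statement about kernels outside the 4.4.0 series.

```python
import re

# Assumption: release strings look like "4.4.0-62-generic".
_RELEASE_RE = re.compile(r"^4\.4\.0-(\d+)(?:-|$)")

# From this document: 4.4.0-57 is the first affected Ubuntu kernel.
FIRST_AFFECTED_BUILD = 57


def is_affected_release(release):
    """Classify a kernel release string.

    Returns True for 4.4.0-N with N >= 57, False for 4.4.0-N with N < 57,
    and None for any other release (this sketch makes no claim about it).
    """
    match = _RELEASE_RE.match(release.strip())
    if match is None:
        return None
    return int(match.group(1)) >= FIRST_AFFECTED_BUILD
```

On a running node, the current release string can be obtained with platform.release() from the Python standard library and passed to the function.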

Diagnosing The Problem

Check the dmesg output for a 'BUG: unable to handle kernel NULL pointer dereference' message, and check whether the node rebooted.
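
The check can be sketched in Python as a text scan. The function below is a hypothetical helper, not an IBM tool: it looks for the BUG message together with the two function names that appear in the backtrace in the Symptom section (posix_acl_valid and gpfs_set_posix_acl). Requiring both names is an assumption made here to reduce false matches against unrelated NULL pointer crashes.

```python
BUG_MESSAGE = "BUG: unable to handle kernel NULL pointer dereference"

# Function names taken from the backtrace shown in the Symptom section.
TRACE_MARKERS = ("posix_acl_valid", "gpfs_set_posix_acl")


def matches_known_crash(log_text):
    """Return True if log_text contains the BUG message and both trace markers."""
    if BUG_MESSAGE not in log_text:
        return False
    return all(marker in log_text for marker in TRACE_MARKERS)
```

The input can be the captured output of dmesg or the contents of a saved kernel log. Because the dmesg buffer starts empty after a reboot, a log that persists across reboots is needed if the node has already restarted.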

Product: IBM Spectrum Scale
Platform: Linux
Version: 4.1.1; 4.2.0; 4.2.1; 4.2.2

Document Information

Modified date:
01 August 2018

UID

ssg1S1010071