IBM Support

Db2 crashed with signal 11 in system call

Troubleshooting


Problem

A Db2 server running on Linux crashed while backing up a database. No FODC data and no core file were created. What could have caused this crash?

Symptom

Checking db2diag.log:

# online backup started on partition 16, db2sysc pid = 23850
2018-06-30-21.14.58.006081+480 E17937688E492         LEVEL: Info
PID     : 23850                TID : 140414715684608 PROC : db2sysc 16
INSTANCE: db2inst1             NODE : 016            DB   : SAMPLE
APPHDL  : 0-20747              APPID: *N0.db2inst1.180630131457
AUTHID  : DB2INST1             HOSTNAME: myhost1
EDUID   : 12898                EDUNAME: db2agntp (SAMPLE) 16
FUNCTION: DB2 UDB, database utilities, sqlubSetupJobControl, probe:1897
MESSAGE : Starting an online db backup.

# db2sysc 23850 trapped with signal 11
2018-06-30-21.56.55.506820+480 E17971971E471         LEVEL: Error
PID     : 23591                TID : 139658025494272 PROC : db2wdog 16 [db2inst1]
INSTANCE: db2inst1             NODE : 016
HOSTNAME: myhost1
EDUID   : 2                    EDUNAME: db2wdog 16 [db2inst1]
FUNCTION: DB2 UDB, base sys utilities, sqleWatchDog, probe:9064
DATA #1 : Process ID, 4 bytes
23850
DATA #2 : Hexdump, 8 bytes
0x00007F04AAFFD268 : 0101 0000 0B00 0000                        ........
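
The hexdump in DATA #2 can be decoded to confirm the signal number noted in the annotation. The following is a minimal sketch. It assumes the 8 bytes are two little-endian 32-bit integers, which is an assumption about the layout and not something the log states. The helper name `decode_hexdump_words` is hypothetical. Under that assumption the second word decodes to 11, which matches "signal 11" (SIGSEGV on Linux). The meaning of the first word (0x0101) is not given in this document.

```python
import signal
import struct


def decode_hexdump_words(hex_text):
    """Decode a db2diag-style hexdump payload as little-endian 32-bit integers.

    Assumption: the payload is a sequence of 4-byte little-endian words.
    """
    raw = bytes.fromhex(hex_text.replace(" ", ""))
    count = len(raw) // 4
    return struct.unpack("<" + "I" * count, raw[: count * 4])


# Payload copied from DATA #2 of the db2wdog entry.
words = decode_hexdump_words("0101 0000 0B00 0000")
print(words)

# Map the second word to a signal name (assumed to be the signal number).
print(signal.Signals(words[1]).name)
```

On Linux, signal number 11 is SIGSEGV, so the last line is one way to cross-check the value against the platform's signal table.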

Because no FODC data was created in the db2dump path, check the syslog (for example, /var/log/messages) for further clues. It shows the following:

<<<
Jun 30 21:50:40 myhost1 avahi-daemon[3407]: Withdrawing address record for fe80::42f2:e9ff:fe63:2218 on eno1.
Jun 30 21:51:07 myhost1 kernel: mm/memory.c:401: bad pmd ffff88f7747a54f8(80000071708000e7)
Jun 30 21:51:08 myhost1 abrt-hook-ccpp: Can't open 'core.23850' at '/': Permission denied
<skipped>
Jun 30 21:55:04 myhost1 abrt-hook-ccpp: /var/spool/abrt is 35203441407 bytes (more than 1279MiB), deleting 'oops-2018-06-19-19:15:33-30698-0'
Jun 30 21:55:04 myhost1 abrt-server: Executable '/home/db2inst1/sqllib/adm/db2sysc' doesn't belong to any package and ProcessUnpackaged is set to 'no'
Jun 30 21:55:04 myhost1 abrt-server: 'post-create' on '/var/spool/abrt/ccpp-2018-06-30-21:51:08-23850' exited with 1
Jun 30 21:55:04 myhost1 abrt-server: Deleting problem directory '/var/spool/abrt/ccpp-2018-06-30-21:51:08-23850'
Jun 30 21:55:05 myhost1 kernel: ------------[ cut here ]------------
Jun 30 21:55:05 myhost1 kernel: WARNING: at mm/mmap.c:2773 exit_mmap+0x196/0x1a0()
Jun 30 21:55:05 myhost1 kernel: Modules linked in: btrfs zlib_deflate raid6_pq xor msdos ext4 mbcache jbd2 tcp_lp bnep bluetooth rfkill vxglm(POE) dmpjbod(POE) dmpap(POE) dmpaa(POE) fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter vxspec(POE) vxio(POE) vxdmp(POE) vxcafs(POE) vxportal(POE) fdd(POE) vxfs(POE) veki(POE) xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ocrdma ib_core ib_addr iTCO_wdt iTCO_vendor_support ipmi_devintf intel_powerclamp coretemp kvm dm_service_time
Jun 30 21:55:05 myhost1 kernel: crc32_pclmul ghash_clmulni_intel aesni_intel vfat lrw fat gf128mul glue_helper ablk_helper cryptd pcspkr ipmi_ssif cdc_ether osst usbnet mii lpc_ich st ipmi_si ioatdma i2c_i801 sg i7core_edac ipmi_msghandler mfd_core wmi shpchp dca edac_core acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc dm_multipath ip_tables xfs libcrc32c sr_mod cdrom sd_mod ata_generic pata_acpi mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper lpfc ttm crc_t10dif be2net crct10dif_generic scsi_transport_fc ata_piix crct10dif_pclmul vxlan drm serio_raw crc32c_intel ip6_udp_tunnel scsi_tgt libata megaraid_sas bnx2 i2c_core udp_tunnel crct10dif_common dm_mirror dm_region_hash dm_log dm_mod
Jun 30 21:55:05 myhost1 kernel: CPU: 38 PID: 33840 Comm: db2sysc Tainted: P          IOE  ------------   3.10.0-327.el7.x86_64 #1
Jun 30 21:55:05 myhost1 kernel: Hardware name: IBM System x3850 X5 -[71437HM]-/Node 1, Processor Card, BIOS -[G0E179BUS-1.79]- 07/28/2013
Jun 30 21:55:05 myhost1 kernel: 0000000000000000 000000001bd3ce81 ffff88db79783bc8 ffffffff816351f1
Jun 30 21:55:05 myhost1 kernel: ffff88db79783c00 ffffffff8107b200 00000000000245c3 ffff88fc2ad05140
Jun 30 21:55:05 myhost1 kernel: ffff88fc2ad051b8 ffff8865a8fe6440 ffff88e9f3276780 ffff88db79783c10
Jun 30 21:55:05 myhost1 kernel: Call Trace:
Jun 30 21:55:05 myhost1 kernel: [<ffffffff816351f1>] dump_stack+0x19/0x1b
Jun 30 21:55:05 myhost1 kernel: [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
Jun 30 21:55:05 myhost1 kernel: [<ffffffff8107b34a>] warn_slowpath_null+0x1a/0x20
Jun 30 21:55:05 myhost1 kernel: [<ffffffff8119e916>] exit_mmap+0x196/0x1a0
Jun 30 21:55:05 myhost1 kernel: [<ffffffff810782b7>] mmput+0x67/0xf0
Jun 30 21:55:05 myhost1 kernel: [<ffffffff810815ac>] do_exit+0x28c/0xa60
Jun 30 21:55:05 myhost1 kernel: [<ffffffff810a6ae0>] ? wake_up_atomic_t+0x30/0x30
Jun 30 21:55:05 myhost1 kernel: [<ffffffff81081dff>] do_group_exit+0x3f/0xa0
Jun 30 21:55:05 myhost1 kernel: [<ffffffff81092c10>] get_signal_to_deliver+0x1d0/0x6d0
Jun 30 21:55:05 myhost1 kernel: [<ffffffff81014417>] do_signal+0x57/0x6c0
Jun 30 21:55:05 myhost1  kernel: [<ffffffff8162e498>] ? __bad_area_nosemaphore+0x1bd/0x1ca
Jun 30 21:55:05 myhost1 kernel: [<ffffffff8162e6c7>] ? bad_area+0x43/0x4a
Jun 30 21:55:05 myhost1 kernel: [<ffffffff81641035>] ? __do_page_fault+0x365/0x420
Jun 30 21:55:05 myhost1 kernel: [<ffffffff81014adf>] do_notify_resume+0x5f/0xb0
Jun 30 21:55:05 myhost1 kernel: [<ffffffff8163d1fc>] retint_signal+0x48/0x8c
Jun 30 21:55:05 myhost1 kernel: ---[ end trace 39f39ed52aac9358 ]---
Jun 30 21:55:05 myhost1 kernel: BUG: Bad rss-counter state mm:ffff88fc2ad05140 idx:1 val:512

>>>

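The syslog above shows that the kernel reported a "bad pmd" at mm/memory.c:401 at 21:51:07, one second before abrt-hook-ccpp tried to write core.23850. This suggests that the db2sysc process 23850 terminated abnormally in the kernel memory-management code (memory.c). The core file was not written because abrt-hook-ccpp could not open 'core.23850' at '/' (Permission denied), and abrt-server later deleted the problem directory.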
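
When the syslog is large, the relevant lines can be picked out with a small filter. The following is a minimal sketch, not an IBM-provided tool. The function name `crash_clues` and the regular expressions are hypothetical and are based only on the message formats seen in the excerpt above. They may need adjusting for other kernels or abrt versions.

```python
import re

# Hypothetical patterns derived from the messages in the excerpt above.
KERNEL_MM = re.compile(
    r"kernel: (mm/\S+: bad \w+|BUG: Bad rss-counter|WARNING: at mm/)"
)
ABRT = re.compile(r"abrt-(hook-ccpp|server):")


def crash_clues(lines, pid):
    """Return kernel memory-management complaints and abrt messages
    that mention the given process ID, in their original order."""
    # Match the PID only as a whole number, not as part of a longer number.
    pid_token = re.compile(r"(?<!\d)%d(?!\d)" % pid)
    clues = []
    for line in lines:
        if KERNEL_MM.search(line):
            clues.append(line)
        elif ABRT.search(line) and pid_token.search(line):
            clues.append(line)
    return clues


# Sample lines copied from the syslog excerpt above.
sample = [
    "Jun 30 21:50:40 myhost1 avahi-daemon[3407]: Withdrawing address record for fe80::42f2:e9ff:fe63:2218 on eno1.",
    "Jun 30 21:51:07 myhost1 kernel: mm/memory.c:401: bad pmd ffff88f7747a54f8(80000071708000e7)",
    "Jun 30 21:51:08 myhost1 abrt-hook-ccpp: Can't open 'core.23850' at '/': Permission denied",
    "Jun 30 21:55:04 myhost1 abrt-server: Executable '/home/db2inst1/sqllib/adm/db2sysc' doesn't belong to any package and ProcessUnpackaged is set to 'no'",
]

for clue in crash_clues(sample, 23850):
    print(clue)
```

To scan a real log, pass the lines of the syslog file (for example, /var/log/messages, which usually requires root access to read) together with the db2sysc PID reported in db2diag.log.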

Product: Db2 for Linux, UNIX and Windows
Platform: Linux
Version: v10.5, v11.1
Business Unit: IBM Infrastructure w/TPS
Line of Business: Data and AI

Document Information

Modified date:
01 May 2025

UID

ibm10718705