IBM Support

IBM Sterling B2B Integrator node crashed with: "java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11" due to nproc settings

Troubleshooting


Problem

An OutOfMemory (OOM) error occurred on node1 of a 2-node IBM Sterling B2B Integrator (SBI) cluster.

Symptom

The following error was noted in the SBI logs, e.g. opsServer.log, noapp.log (with a date and time stamp), and wf.log:

Caused by: java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11

The following error was noted in the noapp.log (without a date & time stamp):

JVMDUMP012E Error in System dump: insufficient system resources to generate dump, errno=11 "Resource temporarily unavailable"

Cause

The issue was at the OS level:
The required Red Hat 'nproc' values ('* hard nproc 16000' and '* soft nproc 16000') in the /etc/security/limits.conf file were being overridden by the /etc/security/limits.d/90-nproc.conf file, which reduced the soft limit to 1024. Once processing picked up, the overridden nproc value of 1024 was easily exceeded.

The thread dumps (javacore files) generated at the time of the OOM show the soft nproc value overridden to 1024:
1CIUSERLIMITS  User Limits (in bytes except for NOFILE and NPROC)
NULL           ------------------------------------------------------------------------
NULL           type                  soft limit      hard limit
2CIUSERLIMIT   RLIMIT_NPROC                1024           16000
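
The same limits can also be read from a running process without waiting for a dump. On Linux, /proc/<pid>/limits lists the limits actually in effect for that process. The following is a minimal sketch, not part of the original technote; the function name nproc_limits and the PID in the usage comment are hypothetical.

```shell
#!/bin/sh
# Print the soft and hard "Max processes" (RLIMIT_NPROC) values from a
# /proc/<pid>/limits style file. The file path is passed as $1.
# nproc_limits is a hypothetical helper name used for illustration only.
nproc_limits() {
    # Line layout: "Max processes  <soft>  <hard>  processes"
    awk '/^Max processes/ { print "soft=" $3, "hard=" $4 }' "$1"
}

# Usage (12345 is a placeholder for the PID of the SBI JVM):
#   nproc_limits /proc/12345/limits
```

For a process affected by the override described above, the soft value would be expected to read 1024 rather than 16000.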

Environment

SBI 5.2.4.2_Interim Fix 4, Red Hat Enterprise Linux 6.x

Diagnosing The Problem

  • The error "Caused by: java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11" means that this was not a JVM heap OutOfMemory error. The heap was not exhausted; errno 11 ("Resource temporarily unavailable") indicates that the OS refused to create another native thread.
  • Hardware was not a limiting factor: the servers had 128 GB RAM and dual 12-core processors (24 CPUs).

After the normal troubleshooting steps for diagnosing a native OOM were exhausted, monitoring was started at the OS level for the ulimit values nproc (number of processes) and nofile (number of open files):
  • nofile: lsof | wc -l
  • nproc: ps -eLf | grep <SBIuser> | wc -l

The monitoring led to the conclusion that the configured nproc value of 16000 was not being honored.
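
The nproc check above can be tightened slightly. The grep in the ps pipeline matches any line that contains the user name, whereas selecting by user with ps avoids stray matches. The following is a minimal sketch, assuming the procps version of ps shipped with Red Hat Enterprise Linux; the function names and the account name sbiuser are hypothetical.

```shell
#!/bin/sh
# Count the threads (light-weight processes) owned by one user.
# Assumes procps "ps": -L lists threads, -u selects by user,
# and "-o lwp=" prints one thread ID per line with no header.
count_user_threads() {
    ps -L -u "$1" -o lwp= | wc -l
}

# Report how close a thread count is to a soft nproc limit.
# $1 = current thread count, $2 = soft limit (integer)
nproc_usage() {
    echo "threads=$1 limit=$2 used=$(( $1 * 100 / $2 ))%"
}

# Usage ("sbiuser" is a placeholder account name; 1024 is the
# soft limit reported in the thread dump):
#   nproc_usage "$(count_user_threads sbiuser)" 1024
```

Running the check at intervals while load increases shows whether the thread count approaches the soft limit in effect, rather than the limit that was intended.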

Resolving The Problem

Make the necessary changes in the /etc/security/limits.d/90-nproc.conf file so that the hard and soft nproc settings of 16000 are honored.
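
As an illustration only, the change might look like the fragment below. This is a sketch that assumes the file still contains the stock Red Hat Enterprise Linux 6 wildcard entry setting the soft nproc limit to 1024; the actual contents of the file on a given system may differ and should be checked first.

```text
# /etc/security/limits.d/90-nproc.conf (sketch, assumed layout)
# Before: *          soft    nproc     1024
*          soft    nproc     16000
root       soft    nproc     unlimited
```

Limits from these files are applied when a session is opened, so the new value would normally take effect only after the SBI user logs in again and SBI is restarted from that session.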

Additional information on the subject matter:


Consult Red Hat Support with further questions or concerns on the matter.

Internal Use Only

This technote was generated by Technote Kickstart 1.1.0.83 based on Industry Solutions PMR 12482,999,000.
View the associated PMR's text via Wellspring at: http://eclient.lenexa.ibm.com:9082/DocFetcher/source/PMR/12482.999.000%20O14/10/18


Document Information

Modified date:
16 June 2018

UID

swg21689324