IBM Support

Fenced wrapper reports error -430/-1476 randomly

Troubleshooting


Problem

Fenced wrappers intermittently fail with SQL0430N and SQL1476N errors when you use IBM InfoSphere Federation Server.

Symptom

When a fenced wrapper accesses a remote data source through nicknames, your job is aborted and rolled back, and it cannot be restarted until the DB2 server is recycled. You get an error message like the following:

The current transaction was rolled back because of error "-430". SQLCODE=-1476, SQLSTATE=40506, DRIVER=3.51.90
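
When many job logs have to be checked for this symptom, the message can be matched programmatically. The following is a minimal sketch in Python, assuming the message has exactly the layout shown above; the function names are hypothetical, and other drivers or client versions may format the text differently.

```python
import re

# Pattern for the message layout shown above (an assumption; other drivers
# may word or order the fields differently).
MESSAGE_RE = re.compile(
    r'error "(?P<inner>-?\d+)"\.\s*'
    r'SQLCODE=(?P<sqlcode>-?\d+),\s*SQLSTATE=(?P<sqlstate>\w+)'
)

def parse_rollback_message(message):
    """Return (inner_sqlcode, sqlcode, sqlstate), or None if nothing matches."""
    match = MESSAGE_RE.search(message)
    if match is None:
        return None
    return int(match["inner"]), int(match["sqlcode"]), match["sqlstate"]

def is_fmp_rollback(message):
    """True when the message reports SQLCODE -1476 caused by error -430."""
    parsed = parse_rollback_message(message)
    return parsed is not None and parsed[:2] == (-430, -1476)

sample = ('The current transaction was rolled back because of error "-430". '
          'SQLCODE=-1476, SQLSTATE=40506, DRIVER=3.51.90')
print(parse_rollback_message(sample))
print(is_fmp_rollback(sample))
```

A match only confirms the symptom. The cause still has to be confirmed in db2diag.log.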

Cause

Because of an operating system limit or a resource shortage, the db2fmp process encountered an EAGAIN (11) error when it called pthread_create() to create a new db2fmp thread. The db2fmp process was then marked unstable, which causes the application to fail with SQL0430N and SQL1476N. SQL1476N means that the current transaction was rolled back. SQL0430N reports an abnormal termination; in this case the SQLCA in db2diag.log shows "FedFmp abnormal end".

This problem can happen to all supported fenced wrappers.

Diagnosing The Problem

In db2diag.log, the following error appears first:

2012-04-18-21.56.04.778621+480 E41980A395 LEVEL: Severe (OS)
PID : 12255296 TID : 1 PROC : db2fmp (C) 0
INSTANCE: db2inst1 NODE : 000
EDUID : 1 EDUNAME: db2fmp (C) 0
FUNCTION: DB2 UDB, oper system services, sqloCreateAppThread, probe:100
CALLED : OS, -, pthread_create
OSERR : EAGAIN (11) "Resource temporarily unavailable"
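
In a large db2diag.log, this entry can be located with a short script. The following Python sketch assumes that every record starts with a timestamp line in the format shown above. The function names are hypothetical, and the sample text is abridged from the entries in this document.

```python
import re

# A record is assumed to start with a timestamp such as
# 2012-04-18-21.56.04.778621+480 at the beginning of a line.
RECORD_START = re.compile(
    r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}", re.MULTILINE
)

def split_records(log_text):
    """Split log text into records, each beginning at a timestamp line."""
    starts = [m.start() for m in RECORD_START.finditer(log_text)]
    ends = starts[1:] + [len(log_text)]
    return [log_text[s:e].strip() for s, e in zip(starts, ends)]

def find_thread_create_failures(log_text):
    """Return timestamps of db2fmp records where pthread_create hit EAGAIN."""
    hits = []
    for record in split_records(log_text):
        if ("db2fmp" in record and "pthread_create" in record
                and "EAGAIN" in record):
            hits.append(record.split()[0])
    return hits

SAMPLE_LOG = """\
2012-04-18-21.56.04.778621+480 E41980A395 LEVEL: Severe (OS)
PID : 12255296 TID : 1 PROC : db2fmp (C) 0
FUNCTION: DB2 UDB, oper system services, sqloCreateAppThread, probe:100
CALLED : OS, -, pthread_create
OSERR : EAGAIN (11) "Resource temporarily unavailable"

2012-04-18-21.56.04.779637+480 I46531A449 LEVEL: Severe
PID : 13697180 TID : 35669 PROC : db2sysc 0
FUNCTION: DB2 UDB, routine_infrastructure, sqlerGetFmpThread, probe:20
RETCODE : ZRC=0xFFFFFBEE=-1042
"""

print(find_thread_create_failures(SAMPLE_LOG))
```

To scan a real log, read the file contents into a string and pass it to find_thread_create_failures(). The location of db2diag.log depends on the instance configuration.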

After that error, entries like the following appear:

2012-04-18-21.56.04.779388+480 E43164A3366 LEVEL: Severe
PID : 13697180 TID : 35669 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLEDB
APPHDL : 0-3620 APPID: 10.92.32.50.60820.120418135354
AUTHID : DB2INST1
EDUID : 35669 EDUNAME: db2agent (SAMPLEDB) 0
FUNCTION: DB2 UDB, routine_infrastructure, sqlerMasterThreadReq, probe:901
DATA #1 : String, 50 bytes
marking fmp as unstable, error reading IPC buffer:
DATA #2 : String, 8 bytes
Fmp TID:
.....


2012-04-18-21.56.04.779637+480 I46531A449 LEVEL: Severe
PID : 13697180 TID : 35669 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLEDB
APPHDL : 0-3620 APPID: 10.92.32.50.60820.120418135354
AUTHID : DB2INST1
EDUID : 35669 EDUNAME: db2agent (SAMPLEDB) 0
FUNCTION: DB2 UDB, routine_infrastructure, sqlerGetFmpThread, probe:20
RETCODE : ZRC=0xFFFFFBEE=-1042

2012-04-18-21.56.04.779773+480 I46981A465 LEVEL: Severe
PID : 13697180 TID : 35669 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLEDB
APPHDL : 0-3620 APPID: 10.92.32.50.60820.120418135354
AUTHID : DB2INST1
EDUID : 35669 EDUNAME: db2agent (SAMPLEDB) 0
FUNCTION: DB2 UDB, base sys utilities, sqleFedFMPManager::getFmpFromProcPool, probe:10
RETCODE : ZRC=0xFFFFFBEE=-1042

....

2012-04-18-21.56.04.790613+480 I73178A862 LEVEL: Warning
PID : 13697180 TID : 21463 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLEDB
APPHDL : 0-385 APPID: 10.92.32.39.34622.120418105855
AUTHID : DB2INST1
EDUID : 21463 EDUNAME: db2agent (SAMPLEDB) 0
FUNCTION: DB2 UDB, relation data serv, sqlrr_rollback_with_sqlcode, probe:150
DATA #1 : SQLCA, PD_DB2_TYPE_SQLCA, 136 bytes
sqlcaid : SQLCA sqlcabc: 136 sqlcode: -430 sqlerrml: 19
sqlerrmc: FedFmp abnormal end
sqlerrp : SQLRI212
sqlerrd : (1) 0x801A006D (2) 0x00000000 (3) 0x00000000
(4) 0x00000000 (5) 0xFFFFFF38 (6) 0x00000000
sqlwarn : (1) (2) (3) (4) (5) (6)
(7) (8) (9) (10) (11)
sqlstate:

2012-04-18-21.56.04.790888+480 I74041A460 LEVEL: Severe
PID : 13697180 TID : 21463 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLEDB
APPHDL : 0-385 APPID: 10.92.32.39.34622.120418105855
AUTHID : DB2INST1
EDUID : 21463 EDUNAME: db2agent (SAMPLEDB) 0
FUNCTION: DB2 UDB, base sys utilities, sqleFedFMPManager::getFmpFromPool, probe:100
RETCODE : ZRC=0xFFFFFB95=-1131

...

2012-04-18-21.56.04.791234+480 I75598A538 LEVEL: Severe
PID : 13697180 TID : 21463 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLEDB
APPHDL : 0-385 APPID: 10.92.32.39.34622.120418105855
AUTHID : DB2INST1
EDUID : 21463 EDUNAME: db2agent (SAMPLEDB) 0
FUNCTION: DB2 UDB, relation data serv, sqlrrbck_dps, probe:200
RETCODE : ZRC=0x8026006D=-2144993171=SQLQG_RC_CA_BUILT
"Error constant for gateway and the sqlca filled and trustable."

2012-04-18-21.56.04.791337+480 I76137A831 LEVEL: Severe
PID : 13697180 TID : 21463 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLEDB
APPHDL : 0-385 APPID: 10.92.32.39.34622.120418105855
AUTHID : DB2INST1
EDUID : 21463 EDUNAME: db2agent (SAMPLEDB) 0
FUNCTION: DB2 UDB, relation data serv, sqlrrbck_dps, probe:250
DATA #1 : SQLCA, PD_DB2_TYPE_SQLCA, 136 bytes
sqlcaid : SQLCA sqlcabc: 136 sqlcode: -1476 sqlerrml: 4
sqlerrmc: -430
sqlerrp : SQLRR0F9
sqlerrd : (1) 0x801A006D (2) 0x00000000 (3) 0x00000000
(4) 0x00000000 (5) 0xFFFFFF38 (6) 0x00000000
sqlwarn : (1) (2) (3) (4) (5) (6)
(7) (8) (9) (10) (11)
sqlstate: 40506
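
The RETCODE lines in these entries print each ZRC value twice: as a 32-bit hexadecimal number and as the same bits read as a signed integer. The following Python sketch reproduces that conversion for the three ZRC values above; the function name is hypothetical.

```python
def zrc_to_signed(zrc_hex):
    """Interpret a 32-bit hex string as a signed two's-complement integer."""
    value = int(zrc_hex, 16)
    if value >= 0x80000000:      # high bit set, so the value is negative
        value -= 0x100000000     # subtract 2**32
    return value

for zrc in ("0xFFFFFBEE", "0xFFFFFB95", "0x8026006D"):
    print(zrc, "=", zrc_to_signed(zrc))
```

The results match the decimal values printed in the log: -1042, -1131, and -2144993171.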

Resolving The Problem

1. Run the command "ulimit -a" to display the current user limits.

2. Check the following ulimit settings. Set each one to unlimited (except for stack), or increase its value.


Unix                   Linux                           Recommended
---------------------------------------------------------------------------
file(blocks)           file size(-f)                   unlimited
data(kbytes)           data seg size(-d)               unlimited
memory(kbytes)         max memory size(-m)             unlimited
nofiles(descriptors)   open files(-n)                  unlimited
processes(per user)    max user processes(nproc,-u)    unlimited
threads(per process)   N/A                             unlimited
stack(kbytes)          stack size(-s)

NOTE: Apply these ulimit settings not only to the DB2 instance user, but also to root, the DB2 fenced user, the DAS user, and any other user that issues "db2start".

3. After making these changes, recycle DB2 and monitor the system to check whether the error recurs.
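
The check in steps 1 and 2 can also be scripted. The following Python sketch uses the standard resource module to list which of the limits from the table are still finite for the user that runs it. The mapping of table rows to RLIMIT_* constants is an assumption, the function names are hypothetical, the stack and threads-per-process rows are not covered, and Python reports file and data sizes in bytes rather than in the units that ulimit prints.

```python
import resource

# Table rows mapped to Python's resource constants (an assumed mapping).
# RLIMIT_RSS and RLIMIT_NPROC are not defined on every platform, so the
# constants are looked up by name and skipped when absent.
CHECKED_LIMITS = {
    "file size (-f)": "RLIMIT_FSIZE",
    "data seg size (-d)": "RLIMIT_DATA",
    "max memory size (-m)": "RLIMIT_RSS",
    "open files (-n)": "RLIMIT_NOFILE",
    "max user processes (-u)": "RLIMIT_NPROC",
}

def describe(value):
    """Render a limit value the way ulimit does for infinite limits."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)

def limits_below_unlimited():
    """Return {label: (soft, hard)} for checked limits with a finite soft value."""
    finite = {}
    for label, name in CHECKED_LIMITS.items():
        constant = getattr(resource, name, None)
        if constant is None:
            continue  # this limit is not available on the current platform
        soft, hard = resource.getrlimit(constant)
        if soft != resource.RLIM_INFINITY:
            finite[label] = (describe(soft), describe(hard))
    return finite

if __name__ == "__main__":
    for label, (soft, hard) in limits_below_unlimited().items():
        print(f"{label}: soft={soft} hard={hard}")
```

A process inherits the limits of the user that started it, so the script has to be run once as each user named in the NOTE above.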

Product: InfoSphere Federation Server
Component: Federated Server
Platform: AIX, Linux, Solaris
Version: 9.5, 9.7

Document Information

Modified date:
16 June 2018

UID

swg21595086