APAR status
Closed as program error.
Error description
If auditing is turned on (at any level) and the instance hits an assertion failure that triggers the SYSALARMPROGRAM ($INFORMIXDIR/etc/evidence.sh by default), the instance may become unresponsive to new connection requests, or even get completely stuck, for 6 or more minutes.

When auditing is turned on, the onstat command sends its command line arguments to the onmode_mon thread in the server so that they are written into the audit trail. If the assertion failure occurs in a thread running on cpuvp 1, that cpuvp is blocked while it waits for SYSALARMPROGRAM to finish and cannot serve the onmode_mon thread, which is bound to it. As a result, the onmode_mon thread cannot accept the command line arguments sent by the onstat commands called from SYSALARMPROGRAM. Each onstat waits for the onmode_mon thread to become available; if that does not happen within 5 seconds, the onstat gives up and continues to print the requested output. Because the default SYSALARMPROGRAM calls onstat about 73 times, the script runs for at least 365 seconds in total.

During this time none of the threads bound to cpuvp 1 (onmode_mon, listeners, and others) can run. If only one cpuvp is configured, the whole instance is blocked, which can have adverse effects. For example, in a MACH11 cluster environment managed by a Connection Manager (CM), this may lead to a split-brain situation (two primaries in the cluster): the CM cannot reach the blocked old primary, so it initiates a failover and promotes one of the secondaries to a new primary without killing the old one.
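The timing figure in the description is simple arithmetic: each onstat call waits out the 5-second timeout, so the lower bound on the stall is the number of calls times the timeout. The sketch below only restates that calculation; the function and constant names are illustrative and are not part of any Informix tool.

```python
# Lower bound on how long SYSALARMPROGRAM keeps cpuvp 1 blocked,
# assuming every onstat call waits for the full 5-second timeout
# described in this APAR. Names here are hypothetical.

ONSTAT_TIMEOUT_SECONDS = 5  # per-call wait stated in the error description


def min_block_seconds(onstat_calls, timeout_seconds=ONSTAT_TIMEOUT_SECONDS):
    """Return the minimum time (seconds) spent waiting on onmode_mon."""
    if onstat_calls < 0 or timeout_seconds < 0:
        raise ValueError("calls and timeout must be non-negative")
    return onstat_calls * timeout_seconds


if __name__ == "__main__":
    # Default evidence.sh: about 73 onstat calls.
    print(min_block_seconds(73))  # 365
    # With the workaround from "Local fix": 8 onstat calls.
    print(min_block_seconds(8))   # 40
```

The actual run time of the script is somewhat longer, since this counts only the timeouts and not the time onstat needs to produce its output.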
Local fix
A partial workaround may be:
- Make sure you have at least 2 cpuvps configured.
- If you are using the default SYSALARMPROGRAM, find the "DO_ONSTAT_A=off" line in it and change it to "DO_ONSTAT_A=on". This reduces the number of onstat calls from 73 to 8, so the time needed to complete the script should drop from about 365 seconds to about 40 seconds.
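The second step of the workaround is a one-line text change, which could be scripted roughly as follows. This is a minimal sketch, not an IBM-supplied tool: the function name is hypothetical, and it assumes the setting appears at the start of a line as DO_ONSTAT_A=off, optionally quoted. Check the script shipped with your version before relying on it, and apply the change to a backup copy first.

```python
import re
from typing import Tuple

# Matches a line-leading DO_ONSTAT_A=off, with optional matching quotes
# around the value. The exact formatting in evidence.sh is an assumption.
_DO_ONSTAT_A_OFF = re.compile(
    r"^(\s*DO_ONSTAT_A=)([\"']?)off\2(?=\s|$)", re.MULTILINE
)


def enable_onstat_a(script_text: str) -> Tuple[str, int]:
    """Return (new_text, replacements) with DO_ONSTAT_A switched to on."""
    return _DO_ONSTAT_A_OFF.subn(r"\1\2on\2", script_text)


if __name__ == "__main__":
    # Made-up fragment standing in for the real alarm script.
    sample = "#!/bin/sh\nDO_ONSTAT_A=off\necho collecting evidence\n"
    patched, count = enable_onstat_a(sample)
    print(count)
    print(patched)
```

A replacement count of 0 means the expected line was not found, in which case the script should be edited by hand.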
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* All users                                                    *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Update to IDS-12.10.xC5                                      *
****************************************************************
Problem conclusion
Problem Fixed In IDS-12.10.xC5
Temporary fix
Comments
APAR Information
APAR number
IT04342
Reported component name
INFORMIX SERVER
Reported component ID
5725A3900
Reported release
C10
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2014-09-11
Closed date
2015-10-16
Last modified date
2015-10-16
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
INFORMIX SERVER
Fixed component ID
5725A3900
Applicable component levels
RC10 PSY
UP
Document Information
Modified date:
16 October 2015