IBM Support

IT04342: THE EVIDENCE.SH CAN CAUSE LIMITED CONNECTIVITY OR A COMPLETE INSTANCE BLOCK WHEN AUDITING IS TURNED ON

APAR status

  • Closed as program error.

Error description

  • If auditing is turned on (at any level) and your instance hits
    an assertion failure that triggers the SYSALARMPROGRAM
    ($INFORMIXDIR/etc/evidence.sh by default), the instance may
    become unresponsive to new connection requests, or even become
    completely blocked, for 6 or more minutes.
    When auditing is turned on, the onstat command sends its
    command-line arguments to the onmode_mon thread in the server to
    be written to the audit trail. If the assertion failure occurs
    in a thread running on cpuvp 1, that cpuvp is blocked while it
    waits for SYSALARMPROGRAM to finish and cannot run the
    onmode_mon thread, which is bound to it. As a result, the
    onmode_mon thread cannot accept the command-line arguments sent
    by the onstat commands called from SYSALARMPROGRAM.
    In this situation each onstat waits for the onmode_mon thread
    to become available. If the thread does not become available
    within 5 seconds, the onstat stops waiting and continues to
    print the requested output.
    Because the default SYSALARMPROGRAM calls onstat about 73
    times, the script runs for at least 365 seconds (73 calls x 5
    seconds). During this time none of the threads bound to cpuvp 1
    (onmode_mon, listeners, and others) can run. If only one cpuvp
    is configured, the whole instance is blocked, which can have
    serious side effects.
    For example, in a MACH11 cluster environment managed by a
    connection manager (CM), this can lead to a split-brain
    situation (two primaries in the cluster): the CM cannot reach
    the blocked old primary, so it initiates a failover and
    promotes one of the secondaries to be the new primary without
    stopping the old one.
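
The timing estimate in the description can be reproduced with a few lines of shell. This is only a sketch of the arithmetic: the function name min_stall_seconds is made up for illustration, and the inputs (about 73 onstat calls, a 5-second wait per call) are the figures quoted in this APAR, not values read from a running server.

```shell
#!/bin/sh
# Illustrative helper (not part of Informix): lower bound on how long
# SYSALARMPROGRAM runs when every onstat call waits out its full timeout.
min_stall_seconds() {
    onstat_calls=$1      # number of onstat invocations made by the script
    timeout_seconds=$2   # seconds each onstat waits for onmode_mon
    echo $((onstat_calls * timeout_seconds))
}

# Figures from the APAR: about 73 calls, 5 seconds each.
min_stall_seconds 73 5   # prints 365
```

The result is a minimum because it counts only the time spent waiting for onmode_mon, not the time each onstat needs to produce its output.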
    

Local fix

  • A partial workaround:
    - Make sure that at least 2 cpuvps are configured.
    - If you use the default SYSALARMPROGRAM, find the
      "DO_ONSTAT_A=off" line in it and change it to
      "DO_ONSTAT_A=on". This reduces the number of onstat calls
      from 73 to 8, so the time needed to complete the script
      should drop from 365 seconds to about 40 seconds.
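
The script edit described in the workaround could be applied along the following lines. This is a hedged sketch, not an IBM-supplied procedure: the function name enable_onstat_a is hypothetical, and it assumes the setting appears at the start of a line exactly as quoted above. Check the result by hand before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: change DO_ONSTAT_A=off to DO_ONSTAT_A=on in the
# given alarm script, keeping a backup copy next to it.
enable_onstat_a() {
    script=$1
    # Keep a backup so the original alarm program can be restored.
    cp -p "$script" "$script.bak" || return 1
    # Rewrite only lines that begin with the exact setting quoted in the APAR.
    sed 's/^DO_ONSTAT_A=off/DO_ONSTAT_A=on/' "$script.bak" > "$script" || return 1
    # Succeed only if the new setting is present afterwards.
    grep -q '^DO_ONSTAT_A=on' "$script"
}

# Example call, using the default path named in this APAR:
# enable_onstat_a "$INFORMIXDIR/etc/evidence.sh"
```

Writing the edited text back into the existing file, rather than moving a new file over it, leaves the script's ownership and permissions unchanged. The function returns a non-zero status if the expected line was not found.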
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All users                                                    *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Update to IDS-12.10.xC5                                      *
    ****************************************************************
    

Problem conclusion

  • Problem Fixed In IDS-12.10.xC5
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT04342

  • Reported component name

    INFORMIX SERVER

  • Reported component ID

    5725A3900

  • Reported release

    C10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2014-09-11

  • Closed date

    2015-10-16

  • Last modified date

    2015-10-16

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    INFORMIX SERVER

  • Fixed component ID

    5725A3900

Applicable component levels

  • RC10 PSY

       UP

Document Information

Modified date: 16 October 2015