IBM Support

JR43259: NORMAL TCP ERRORS CAUSE INFORMATION SERVER 8.5 TO CREATE MULTIPLE CORE FILES ON LINUX

APAR status

  • Closed as program error.

Error description

  • Problem Description:
    If an I/O error occurred, the PX section leader process
    immediately closed its side of the communication socket with the
    conductor process. This caused the conductor process to fail and
    dump core. Over thousands of job runs with MQ or z/OS stages,
    occasional communication errors would lead to hundreds of core
    files on the PX server machine, which filled up the file system.
    The fix checks for several common TCP errors so that PX shuts
    down normally without creating a core file.
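
The idea of checking for common TCP errors and shutting down normally can be sketched as follows. This is a minimal illustration only, not the PX engine code: the APAR does not say which errors the fix checks, so the errno set below is an assumption, and `EXPECTED_TCP_ERRNOS`, `is_expected_tcp_error` and `read_message` are hypothetical names.

```python
import errno

# Assumption: the APAR does not list the TCP errors covered by the fix.
# These are typical "peer went away" conditions chosen for illustration.
EXPECTED_TCP_ERRNOS = frozenset({
    errno.ECONNRESET,    # connection reset by peer
    errno.EPIPE,         # write on a socket the peer has closed
    errno.ETIMEDOUT,     # connection timed out
    errno.ECONNABORTED,  # connection aborted
})


def is_expected_tcp_error(exc):
    """Return True if the OSError is an ordinary lost-connection error."""
    return exc.errno in EXPECTED_TCP_ERRNOS


def read_message(sock, nbytes):
    """Read up to nbytes from sock.

    Returns the data, or None when the peer is gone and the caller
    should shut down cleanly. Unexpected errors are re-raised.
    """
    try:
        data = sock.recv(nbytes)
    except OSError as exc:
        if is_expected_tcp_error(exc):
            return None  # treat as a normal shutdown, not a crash
        raise
    # An empty read means the peer closed its side of the socket.
    return data or None
```

A caller would treat a `None` result as a signal to release resources and exit normally, while still letting genuinely unexpected errors surface.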
    

Local fix

Problem summary

  • Information Server is creating core files regularly and filling
    up disk space.
    
    The customer believes the core files are related to lost
    connections, either an MQ Server channel (as in these files) or
    a z/OS DB2 connection.
    
    Using gdb on a core file shows:
    
    Program terminated with signal 11, Segmentation fault.
    #0  0x00002b9d8f8646a9 in APT_PMMessagePort::makeMessage(char*, int) ()
       from /opt/IBM/InformationServer/Server/PXEngine/lib/liborchx86_64.so
    

Problem conclusion

  • This is a very low incidence problem.
    
    Solution:
    When an I/O error occurs, the section leader is deactivated and
    marked as not usable in this step. During this process, the
    control port (the conductor side of the section leader) was
    removed from the polling group, after which the process dumped
    core. Socket errors also set the msgport to a bad state, which
    caused the core dump.
    A check was added to avoid the above cases.
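
The kind of check described in the solution can be pictured with a small model. It is a sketch under stated assumptions and does not reflect the actual `APT_PMMessagePort` implementation: `PortState`, `ControlPort`, `PollingGroup` and the message framing are all hypothetical names and details invented for illustration.

```python
from enum import Enum


class PortState(Enum):
    GOOD = "good"
    BAD = "bad"


class ControlPort:
    """Hypothetical stand-in for the conductor-side control port."""

    def __init__(self, name):
        self.name = name
        self.state = PortState.GOOD

    def make_message(self, payload):
        # Guard: refuse to build a message on a port already marked bad,
        # instead of touching resources that may no longer be valid.
        if self.state is not PortState.GOOD:
            return None
        # Illustrative framing only: 4-byte big-endian length + payload.
        return len(payload).to_bytes(4, "big") + payload


class PollingGroup:
    """Hypothetical set of ports the conductor polls."""

    def __init__(self):
        self._ports = {}

    def add(self, port):
        self._ports[port.name] = port

    def deactivate(self, name):
        # Safe to call more than once for the same section leader.
        port = self._ports.pop(name, None)
        if port is not None:
            port.state = PortState.BAD

    def active_names(self):
        return sorted(self._ports)
```

In this model, deactivating a section leader removes its port from the polling group and marks it bad, and any later attempt to build a message on that port returns `None` rather than failing.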
    

Temporary fix

Comments

APAR Information

  • APAR number

    JR43259

  • Reported component name

    WIS DATASTAGE

  • Reported component ID

    5724Q36DS

  • Reported release

    850

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-06-28

  • Closed date

    2012-08-15

  • Last modified date

    2013-01-14

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WIS DATASTAGE

  • Fixed component ID

    5724Q36DS

Applicable component levels

  • R850 PSY

       UP

  • R870 PSY

       UP

Document Information

Modified date:
14 January 2013