IBM Support

IC72123: A file handle leak can cause an EMFILE error and a DBMarkedBad error

APAR status

  • Closed as program error.

Error description

  • The customer got a database marked bad error because DB2 could
    not drop a temporary file: a file open failed with EMFILE.
    The EMFILE error was caused by opening the /var/db2/global.reg
    file. When db2sysc attempted to access the global registry
    file, it could open the file but then got the following error.
    When DB2 gets this error, it does not close the file descriptor
    it opened, so a file handle leaks. When the error occurs
    repeatedly, the process runs out of file handles and gets
    EMFILE.
    
    From db2diag.log,
    2010-06-30-09.00.58.626926+540 I3582A1357         LEVEL: Error
    PID     : 454956               TID  : 6099        PROC : db2sysc 0
    INSTANCE: db2inst1             NODE : 000         DB   : SAMPLE
    APPHDL  : 3-44174              APPID: *N3.db2inst1.100707013023
    AUTHID  : db2inst1
    EDUID   : 6099                 EDUNAME: db2agntp (SAMPLE) 99
    FUNCTION: DB2 Common, Generic Registry, GenRegFile::OpenScan, probe:20
    MESSAGE : ECF=0x900001BF=-1879047745=ECF_GENREG_OPEN_INPUT_FILE_FAILED
              Failed to open the input registry
    CALLED  : OS, -, fopen
    RETCODE : ECF=0x9000002D=-1879048147=ECF_FILE_PROCESS_MAX
              The maximum number of file per process has already been reached
    DATA #1 : String, 19 bytes
    /var/db2/global.reg
    CALLSTCK:
      [0] 0x0900000002FEAEFC pdOSSeLoggingCallback + 0x34
      [1] 0x0900000000624424 oss_log__FP9OSSLogFacUiN32UlN26iPPc + 0x1C4
      [2] 0x0900000000624810 ossLogRC + 0xD0
      [3] 0x09000000010CEF6C OpenScan__10GenRegFileFv + 0x3CC
      [4] 0x09000000010E928C ossOpenInstanceList__FPcPPvCb + 0x8C
      [5] 0x0900000001811950 @71@EnvRegRefresh__FP12SEnvRegistry + 0x2C4
      [6] 0x0900000001811578 @71@EnvRegOpen__FPP12SEnvRegistry + 0x84
      [7] 0x09000000017A9160 @71@sqloPRegQueryDefaultValue__FiPcPCc + 0xC
      [8] 0x0900000001816420 @71@EnvGetDB2SysVar__FiPcUl + 0x110
      [9] 0x0900000001811118 @71@EnvQueryDB2SystemVariables__Fv + 0x80
    
    lsof then showed the following for db2sysc:
    COMMAND    PID     USER    FD  TYPE  DEVICE  SIZE/OFF  NODE NAME
    db2sysc 229436 db2inst1    3r  VREG    10,6         0   XXX /var (/dev/hd9var)
    ... snip ...
    db2sysc 229436 db2inst1   22r  VREG    10,6         0   XXX /var (/dev/hd9var)
    db2sysc 229436 db2inst1   23r  VREG    10,6         0   XXX /var (/dev/hd9var)
    ... snip ...
    db2sysc 229436 db2inst1  999r  VREG    10,6         0   XXX /var (/dev/hd9var)
    db2sysc 229436 db2inst1 1000r  VREG    10,6         0   XXX /var (/dev/hd9var)
    ... snip ...
    db2sysc 229436 db2inst1 *484r  VREG    10,6         0   XXX /var (/dev/hd9var)
    db2sysc 229436 db2inst1 *485r  VREG    10,6         0   XXX /var (/dev/hd9var)
    db2sysc 229436 db2inst1 *486r  VREG    10,6         0   XXX /var (/dev/hd9var)
    
    
    This problem also caused an instance crash, with the following
    error in db2diag.log:

    2010-07-01-05.31.23.892274+540 I23479979A1362     LEVEL: Severe
    PID     : 454956               TID  : 3503        PROC : db2sysc 99
    INSTANCE: db2inst1             NODE : 099         DB   : SAMPLE
    APPHDL  : 3-19801              APPID: 12.26.3.148.54848.100708075435
    AUTHID  : EDWETL
    EDUID   : 3503                 EDUNAME: db2agntp (SLSIYMS) 99
    FUNCTION: DB2 UDB, data management, sqldCriticalSectionEnd, probe:9323
    CALLED  : DB2 UDB, data management, sqldDropTable
    RETCODE : ZRC=0x85020087=-2063466361=SQLB_NO_HANDLES
              "SqlbFileTbl out of file handles."
    
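The leak described above (the open succeeds, a later step fails, and the descriptor is never closed) can be illustrated with a minimal Python sketch. This is not DB2 code: the function names, the LockError class, and the lock callback are all hypothetical and only model the pattern. The second function shows the usual remedy, closing the descriptor on the error path.

```python
import os


class LockError(Exception):
    """Hypothetical error standing in for a failed lock step after a successful open."""


def open_scan_leaky(path, lock):
    """Open path, then lock it. Leaks the descriptor if lock() raises."""
    fd = os.open(path, os.O_RDONLY)
    lock(fd)  # an exception here skips any close, so fd stays open
    return fd


def open_scan_fixed(path, lock):
    """Same steps, but the descriptor is closed when the lock step fails."""
    fd = os.open(path, os.O_RDONLY)
    try:
        lock(fd)
    except Exception:
        os.close(fd)  # release the descriptor before reporting the error
        raise
    return fd


def is_open(fd):
    """Return True if fd still refers to an open file in this process."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False
```

With a lock callback that always fails, every call to open_scan_leaky leaves one more descriptor open, which is the pattern that eventually exhausts the per-process limit and produces EMFILE. open_scan_fixed leaves none behind.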

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * ALL                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * A file handle leaks if GenRegFile::OpenScan, probe:20 is     *
    * recorded in db2diag.log.                                     *
    *                                                              *
    * From db2diag.log,                                            *
    *                                                              *
    * 2010-06-30-09.00.58.626926+540 I3582A1357        LEVEL:Error *
    * PID    : 454956              TID  : 6099        PROC         *
    * :db2sysc 0                                                   *
    * INSTANCE: db2inst1            NODE : 000        DB  : SAMPLE *
    * APPHDL  : 3-44174                                            *
    * APPID:*N3.db2inst1.100707013023                              *
    * AUTHID  : db2inst1                                           *
    * EDUID  : 6099                EDUNAME: db2agntp (SLSIYMS) 99  *
    * FUNCTION: DB2 Common, Generic Registry,                      *
    * GenRegFile::OpenScan,probe:20                                *
    *                                                              *
    * MESSAGE :                                                    *
    * ECF=0x900001BF=-1879047745=ECF_GENREG_OPEN_INPUT_FILE_FAILED *
    *                                                              *
    * Failed to open the input registry                            *
    * CALLED  : OS, -, fopen                                       *
    * RETCODE : ECF=0x9000002D=-1879048147=ECF_FILE_PROCESS_MAX    *
    * The maximum number of file per process has already been      *
    * reached                                                      *
    * DATA #1 : String, 19 bytes                                   *
    * /var/db2/global.reg                                          *
    *                                                              *
    * CALLSTCK:                                                    *
    *                                                              *
    * [0] 0x0900000002FEAEFC pdOSSeLoggingCallback + 0x34          *
    * [1] 0x0900000000624424 oss_log__FP9OSSLogFacUiN32UlN26iPPc   *
    * + 0x1C4                                                      *
    * [2] 0x0900000000624810 ossLogRC + 0xD0                       *
    * [3] 0x09000000010CEF6C OpenScan__10GenRegFileFv + 0x3CC      *
    * [4] 0x09000000010E928C ossOpenInstanceList__FPcPPvCb + 0x8C  *
    * [5] 0x0900000001811950 @71@EnvRegRefresh__FP12SEnvRegistry   *
    * + 0x2C4                                                      *
    * [6] 0x0900000001811578 @71@EnvRegOpen__FPP12SEnvRegistry +   *
    * 0x84                                                         *
    * [7] 0x09000000017A9160@71@sqloPRegQueryDefaultValue__FiPcPCc *
    * + 0xC                                                        *
    * [8] 0x0900000001816420 @71@EnvGetDB2SysVar__FiPcUl + 0x110   *
    * [9] 0x0900000001811118 @71@EnvQueryDB2SystemVariables__Fv    *
    * + 0x80                                                       *
    *                                                              *
    * In this case, the opened file descriptor was not closed.     *
    *                                                              *
    * lsof then showed the following:                              *
    * COMMAND    PID    USER  FD  TYPE            DEVICE           *
    * SIZE/OFF                NODE NAME                            *
    * db2sysc 229436 db2inst1    3r  VREG              10,6        *
    * 0 XXX /var (/dev/hd9var)                                     *
    * ... snip ...                                                 *
    * db2sysc 229436 db2inst1  22r  VREG              10,6         *
    * 0 XXX /var (/dev/hd9var)                                     *
    * db2sysc 229436 db2inst1  23r  VREG              10,6         *
    * 0 XXX /var (/dev/hd9var)                                     *
    * ... snip ...                                                 *
    * db2sysc 229436 db2inst1  999r  VREG              10,6        *
    * 0 XXX /var (/dev/hd9var)                                     *
    * db2sysc 229436 db2inst1 1000r  VREG              10,6        *
    * 0 XXX /var (/dev/hd9var)                                     *
    * ... snip ...                                                 *
    * db2sysc 229436 db2inst1 *484r  VREG              10,6        *
    * 0 XXX /var (/dev/hd9var)                                     *
    * db2sysc 229436 db2inst1 *485r  VREG              10,6        *
    * 0 XXX /var (/dev/hd9var)                                     *
    * db2sysc 229436 db2inst1 *486r  VREG              10,6        *
    * 0 XXX /var (/dev/hd9var)                                     *
    *                                                              *
    * db2sysc held too many open /var... files because of the      *
    * above: the open succeeded but locking the file failed, and   *
    * in that case the opened file descriptor was not closed.      *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to DB2 UDB Version 9.5 Fix Pack 8.                   *
    ****************************************************************
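
Because the summary ties each leaked handle to a GenRegFile::OpenScan, probe:20 record in db2diag.log, counting those records gives a rough indication of how many handles may have leaked. The following Python sketch is one way to do that, under stated assumptions: records are taken to start with a timestamp line like the ones shown above, and the function names are illustrative, not part of any DB2 tool.

```python
import re

# A db2diag.log record is assumed to start with a timestamp such as
# 2010-06-30-09.00.58.626926+540 followed by whitespace.
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d+\s")

# Matches the FUNCTION text with or without a space (or line wrap) before probe:20.
LEAK_MARKER = re.compile(r"GenRegFile::OpenScan,\s*probe:20\b")


def split_records(text):
    """Split log text into records, one per timestamp header line."""
    records, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if RECORD_START.match(stripped) and current:
            records.append("\n".join(current))
            current = []
        current.append(stripped)
    if current:
        records.append("\n".join(current))
    return records


def count_leak_records(text):
    """Count records that mention GenRegFile::OpenScan, probe:20."""
    return sum(
        1
        for record in split_records(text)
        # Collapse whitespace so a wrapped FUNCTION line still matches.
        if LEAK_MARKER.search(" ".join(record.split()))
    )
```

A typical use would be to read the instance's db2diag.log into a string and pass it to count_leak_records. The location of that file depends on the installation and is not given here.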
    

Problem conclusion

  • The problem was first fixed in DB2 UDB Version 9.5 Fix Pack 8.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC72123

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    950

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2010-10-21

  • Closed date

    2011-05-24

  • Last modified date

    2011-05-24

  • APAR is sysrouted FROM one or more of the following:

    IC69889

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • R950 PSN

       UP

Document Information

Modified date:
24 May 2011