IBM Support

IT29660: DB2STOP FAILS WITH SQL6037N, LEAVING [DB2SYSC]<DEFUNCT>.

APAR status

  • Closed as program error.

Error description

  • db2stop sometimes fails with SQL6037N, leaving a
    [db2sysc] <defunct> process behind.
    
    ps -elf output:
    ----------------------------------------------------------------
    F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
    *snip*
    4 S root 24200 1 0 99 19 - 300894 futex_ 3/21 ? 00:00:00 db2wdog 0 [db2inst1]
    4 Z db2inst1 24204 24200 1 99 19 - 0 exit 3/21 ? 00:19:57 [db2sysc] <defunct>
    5 S root 24219 24200 1 99 19 - 301534 msgrcv 3/21 ? 00:13:30 db2ckpwd 0
    5 S root 24220 24200 1 99 19 - 301534 msgrcv 3/21 ? 00:13:30 db2ckpwd 0
    5 S root 24221 24200 1 99 19 - 301534 msgrcv 3/21 ? 00:13:30 db2ckpwd 0
    4 S db2inst1 24308 24200 0 99 19 - 221615 SYSC_s 3/21 ? 00:00:07 db2acd 0 ,0,0,0,1,0,0,0,0000,1,0,995cf0,14,1e014,2,0,1,41fc0,0x210000000,0x210000000,1600000,3af7801b,2,4a4081da
    ----------------------------------------------------------------
    
    When you hit this problem, you will NOT see a message like the
    following, which normally appears before db2stop completes.
    ----------------------------------------------------------------
    2019-02-16-01.30.10.606355+540 I16527E867 LEVEL: Event
    PID : 27022 TID : 140128437659392 PROC : db2wdog 0 [db2inst1]
    INSTANCE: db2inst1 NODE : 000
    HOSTNAME: server1
    EDUID : 2 EDUNAME: db2wdog 0 [db2inst1]
    FUNCTION: DB2 UDB, oper system services, sqlossig, probe:10
    MESSAGE : Sending SIGKILL to the following process id
    DATA #1 : signed integer, 4 bytes
    27175
    CALLSTCK: (Static functions may not be resolved correctly, as they are resolved to the nearest symbol)
    [0] 0x00007F724FC1E027 sqlossig + 0x167
    [1] 0x000000000041DA8D _Z28sqleKillAllProcsFromWatchDogv + 0x4D
    [2] 0x000000000040F24A _Z12sqleWatchDogm + 0x2D0A
    [3] 0x000000000040C118 DB2main + 0x1668
    [4] 0x00007F724FC94FFF sqloEDUEntry + 0x87F
    [5] 0x00007F72573F3DC5 /lib64/libpthread.so.0 + 0x7DC5
    [6] 0x00007F724A8C773D clone + 0x6D
    ----------------------------------------------------------------
    
    The stack trace of db2wdog shows that it is waiting in msgrcv().
    
    gdb output:
    ----------------------------------------------------------------
    Thread 1
    #0 0x00007f6f76524ea3 in msgrcv () from /lib64/libc.so.6
    #1 0x00007f6f7b855e27 in sqlorqueInternal(SQLO_QUE_DESC*, SQLO_MSG_HDR*, int, int) () from /opt/ibm/db2/V10.5/lib64/libdb2e.so.1
    #2 0x00007f6f7b8559f1 in sqlorque2 () from /opt/ibm/db2/V10.5/lib64/libdb2e.so.1
    #3 0x000000000040ca12 in sqleWatchDog(unsigned long) ()
    #4 0x000000000040c118 in DB2main ()
    #5 0x00007f6f7b8f0fff in sqloEDUEntry () from /opt/ibm/db2/V10.5/lib64/libdb2e.so.1
    #6 0x00007f6f8304fdc5 in start_thread () from /lib64/libpthread.so.0
    #7 0x00007f6f7652373d in clone () from /lib64/libc.so.6
    ----------------------------------------------------------------
    
    This happens because of a timing hole in db2wdog. db2wdog checks
    a status flag that is set when SIGCHLD is caught. If SIGCHLD
    from db2sysc arrives just after that check, the flag is not yet
    set when db2wdog looks at it, so instead of proceeding with the
    shutdown procedure, db2wdog continues with normal processing and
    waits for new requests in msgrcv().
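
The timing hole described above is a check-then-block race. The following Python sketch is a toy model of it, not Db2 code: the class WatchdogModel, its step() method, and the queue standing in for the message queue read by msgrcv() are all hypothetical names introduced for illustration. A short timeout replaces the indefinite wait so the example terminates.

```python
import queue


class WatchdogModel:
    """Toy model of a check-then-block loop; hypothetical, not Db2 code."""

    def __init__(self):
        self.sigchld_seen = False      # flag set when SIGCHLD is caught
        self.requests = queue.Queue()  # stands in for the msgrcv() queue

    def sigchld_handler(self):
        self.sigchld_seen = True

    def step(self, signal_in_window=False, wait=0.05):
        """Run one loop iteration and report what the watchdog did."""
        # 1. Check the flag.
        if self.sigchld_seen:
            return "shutdown"
        # 2. The timing hole: SIGCHLD lands after the check but before
        #    the blocking call, so the flag is set too late to be seen.
        if signal_in_window:
            self.sigchld_handler()
        # 3. Block waiting for a request. A real msgrcv() would wait
        #    indefinitely; the timeout only keeps this sketch finite.
        try:
            self.requests.get(timeout=wait)
        except queue.Empty:
            return "blocked"
        return "request"


if __name__ == "__main__":
    wd = WatchdogModel()
    print(wd.step(signal_in_window=True))  # blocked: flag set after the check
    print(wd.step())                       # shutdown: flag seen on the next pass
```

In this model, the second call to step() plays the role of any later interruption that sends the loop back to the flag check. Until something does that, the watchdog stays blocked even though the flag is already set.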
    

Local fix

  • Run "db2pd -stack all" when db2stop hangs. The db2pd call
    interrupts db2wdog, which should then see the flag indicating
    that SIGCHLD from db2sysc has been caught and proceed with the
    shutdown process.
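
As a minimal sketch of how the symptom could be detected before applying the local fix, the Python below scans "ps -elf" text for a db2sysc process in state Z (defunct). The function names find_defunct_db2sysc and run_local_fix are hypothetical. The only Db2 command used is "db2pd -stack all", taken from the local fix above. The sketch assumes each ps entry is on a single line, as ps prints it, and that db2pd is on the PATH of the instance owner.

```python
import shutil
import subprocess


def find_defunct_db2sysc(ps_elf_output):
    """Return PIDs of zombie db2sysc processes found in `ps -elf` text."""
    pids = []
    for line in ps_elf_output.splitlines():
        fields = line.split()
        # ps -elf columns start with: F S UID PID PPID ...
        # State "Z" marks a defunct (zombie) process.
        if (len(fields) > 4 and fields[1] == "Z"
                and fields[3].isdigit() and "db2sysc" in line):
            pids.append(int(fields[3]))
    return pids


def run_local_fix():
    """Run the documented local fix; assumes db2pd is on PATH."""
    if shutil.which("db2pd") is None:
        raise RuntimeError("db2pd not found on PATH")
    return subprocess.run(["db2pd", "-stack", "all"], check=False)


if __name__ == "__main__":
    ps_text = subprocess.run(["ps", "-elf"], capture_output=True,
                             text=True, check=False).stdout
    print(find_defunct_db2sysc(ps_text))
```

Whether to call run_local_fix() automatically is a judgment call for each environment. The sketch only reports the PIDs and leaves that decision to the operator.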
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * ALL                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to Db2 11.1 Mod 4 Fixpack 5 or higher                *
    ****************************************************************
    

Problem conclusion

  • First fixed in Db2 11.1 Mod 4 Fixpack 5
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT29660

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    B10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2019-07-09

  • Closed date

    2020-01-16

  • Last modified date

    2020-01-16

  • APAR is sysrouted FROM one or more of the following:

    IT29054

  • APAR is sysrouted TO one or more of the following:

    IT31378

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • RB10 PSN

       UP

Document Information

Modified date:
16 January 2020