IBM Support

PH06246: BIGSQL INSTANCE HANGS DUE TO LOCK ON DB2DIAG.LOG

APAR status

  • Closed as fixed if next.

Error description

  • Under rare timing conditions, when the Db2 CLP front-end process
    (db2) receives a signal while it is terminating, it might hang
    with the sqlnlsMessage->sqlnlscmsg functions at the top of the
    stack, for example:
    .
    ossLockGetConflict
    sqlnlscmsg
    sqlnlsMessage
    sqlnlsgmsg
    sqlogmsg
    pdLoadMessage
    pdGetMessage
    pdLogInternal
    pdLogSysRC
    sqloLogAndMapQueError
    sqlodque
    clp_fp_exitlist
    clp_fp_sig
    <signal handler called>
    munmap
    _ossMemFree
    sqlnlsFreeMsgFileList
    .
    Typically the hang involves SIGHUP, which the operating system
    generates when the terminal from which the Db2 CLP was run is
    closed. As a result, the Db2 CLP process is stuck while holding
    a lock on db2diag.log and/or the administration notification
    log. Other processes that try to write to the log file then fail
    with the SQLO_SHAR error.
    .
    You may see errors in the notification log, bigsql.nfy, such as:
    .
    2018-11-20-19.04.10.724026 Instance:bigsql Node:000
    PID:90141(db2agent (BIGSQL) 0) TID:3728729856 Appid:none
    RAS/PD component pdDmpErrMsg Probe:17 Database:BIGSQL
    .
    ADM14000E The database manager is unable to open diagnostic log
    file "/var/ibm/bigsql/diag/DIAG0000/.db2diag.rotate.lck". Run the
    command "db2diag -rc "0x870f0016"" to find out more.
    .
    Running the recommended command db2diag -rc "0x870f0016" shows:
    .
    Input ZRC string '0x870f0016' parsed as 0x870F0016
    (-2029060074).
    .
    ZRC value to map: 0x870F0016 (-2029060074)
    V7 Equivalent ZRC value: 0xFFFFF616 (-2538)
    .
    ZRC class :
    Global Processing Error (Class Index: 7)
    Component:
    SQLO ; oper system services (Component Index: 15)
    Reason Code:
    22 (0x0016)
    .
    Identifer:
    SQLO_SHAR
    Identifer (without component):
    SQLZ_RC_SHAR
    .
    Description:
    File sharing violation.
    .
    Associated information:
    Sqlcode -902
    SQL0902C A system error occurred. Subsequent SQL statements
    cannot be
    processed. IBM software support reason code: "".
    .
    Number of sqlca tokens : 1
    Diaglog message number: 8519
    .
    lsof showed that the lock on db2diag.log was held by the CLP
    front-end process 'db2':
    .
    $ lsof .db2diag.rotate.lck
    COMMAND PID USER   FD  TYPE DEVICE SIZE/OFF NODE     NAME
    db2     363 bigsql 4uW REG  253,4  0        60819144 .db2diag.rotate.lck
    .
    $ lsof db2diag.3531.log
    COMMAND PID USER   FD  TYPE DEVICE SIZE/OFF NODE     NAME
    db2     363 bigsql 5uW REG  253,4  83586846 60819153 db2diag.3531.log
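
The numbers in the db2diag -rc output above can be cross-checked by hand. The following is a minimal sketch, not an IBM tool: the function name decode_zrc is hypothetical, and the field layout (reason code in the low 16 bits, component index in the next 8 bits, class index in the low 7 bits of the top byte) is an assumption inferred only from the single output quoted in this APAR, not from documented ZRC format.

```shell
#!/bin/sh
# decode_zrc: hypothetical helper that splits a 32-bit ZRC value into fields.
# ASSUMPTION: the bit layout below is inferred from the db2diag -rc output
# quoted in this APAR; it is not taken from official documentation.
decode_zrc() {
    zrc=$(( $1 ))                       # accepts 0x... hex or decimal input
    zrc_signed=$zrc
    # Reinterpret the value as a signed 32-bit integer (two's complement).
    if [ "$zrc" -ge 2147483648 ]; then
        zrc_signed=$(( zrc - 4294967296 ))
    fi
    zrc_class=$(( (zrc >> 24) & 127 ))      # assumed: low 7 bits of top byte
    zrc_component=$(( (zrc >> 16) & 255 ))  # assumed: second byte
    zrc_reason=$(( zrc & 65535 ))           # assumed: low 16 bits
    printf 'signed=%d class=%d component=%d reason=%d\n' \
        "$zrc_signed" "$zrc_class" "$zrc_component" "$zrc_reason"
}

decode_zrc 0x870F0016
```

For 0x870F0016 this prints signed=-2029060074 class=7 component=15 reason=22, which agrees with the class index, component index, and reason code reported by db2diag.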
    

Local fix

  • If SQLO_SHAR is reported for one of the log files, use OS tools
    such as fuser or lsof to check whether a process is holding the
    file. Once that process is identified, use OS tools such as
    procstack or gstack to generate a backtrace (stack) and verify
    whether it matches this APAR.
    .
    To resolve the problem, do one of the following:
    a) kill the stuck Db2 CLP front-end process:
         $ kill -9 <pid_of_db2_clp_front_end>
    b) rename the log file held by the stuck CLP.
    .
    To reduce the chance of hitting the problem, avoid closing the
    terminal while the Db2 CLP is still running. Terminate cleanly
    by issuing 'connect reset' and 'terminate' before closing the
    terminal.
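
The identification steps above can be sketched as a small shell helper. This is an illustration under stated assumptions, not part of the APAR: the function names matches_apar_stack and check_log_holder are hypothetical, the set of frames checked is one possible subset of the stack shown in the error description, and the sketch assumes a Linux system where lsof and gstack are installed and the caller is permitted to inspect the process.

```shell
#!/bin/sh
# matches_apar_stack: hypothetical helper. Reads a stack dump on stdin and
# succeeds only if every frame listed below appears in it. The frame names
# are copied from the stack in this APAR; matching is a plain substring test.
matches_apar_stack() {
    stack=$(cat)
    for frame in ossLockGetConflict sqlnlscmsg sqlnlsMessage clp_fp_sig; do
        case "$stack" in
            *"$frame"*) ;;        # frame found, check the next one
            *) return 1 ;;        # frame missing, not this APAR
        esac
    done
    return 0
}

# check_log_holder: hypothetical wrapper. Lists the PIDs that have the given
# file open (lsof -t prints PIDs only) and reports those whose stack matches.
# ASSUMPTION: lsof and gstack are available on the system.
check_log_holder() {
    file=$1
    for pid in $(lsof -t "$file" 2>/dev/null); do
        if gstack "$pid" 2>/dev/null | matches_apar_stack; then
            echo "PID $pid holds $file and matches the PH06246 stack"
        fi
    done
}
```

A possible invocation, using the lock file path from the notification log above, is check_log_holder /var/ibm/bigsql/diag/DIAG0000/.db2diag.rotate.lck. The sketch only reports candidates; whether to kill the reported PID or rename the log file is left to the operator, as described in this section.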
    

Problem summary

  • Please see the error description above.
    

Problem conclusion

Temporary fix

Comments

APAR Information

  • APAR number

    PH06246

  • Reported component name

    IBM BIG SQL

  • Reported component ID

    5737E7400

  • Reported release

    504

  • Status

    CLOSED FIN

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-12-06

  • Closed date

    2020-09-09

  • Last modified date

    2020-09-09

  • APAR is sysrouted FROM one or more of the following:

    IT23978

  • APAR is sysrouted TO one or more of the following:

Fix information

Applicable component levels


Document Information

Modified date:
10 September 2020