IBM Support

IT18995: CF MAY ENCOUNTER A HANG AND GET RESTARTED WITH THE FOLLOWING MESSAGE LOGGED IN DB2DIAG.LOG - PS GET SERVER ROLE FAILED. STATUS:

APAR status

  • Closed as program error.

Error description

  • In some cases, messages similar to the following appear in
    db2diag.log, indicating that the CF encountered a hang. As a
    result, the DB2 monitoring code receives a timeout and the CF
    is restarted.
    
    2016-12-29-10.10.26.664040-120 I1646A574            LEVEL: Error
    PID     : 17825896             TID : 1              PROC : db2rocme 128 [db2inst1]
    INSTANCE: db2inst1               NODE : 128
    HOSTNAME: test_host
    EDUID   : 1                    EDUNAME: db2rocme 128 [db2inst1]
    FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
    DATA #1 : <preformatted>
    Ps Get Server Role failed. status: 8006005a
    DATA #1 : <preformatted>
    If a CF return code is displayed above and you wish to get
    more information then please run the following command:
    
    db2diag -cfrc <CF_errcode>
    ....
    2016-12-29-10.10.26.669247-120 I4714A522            LEVEL: Error
    PID     : 17825896             TID : 1              PROC : db2rocme 128 [db2inst1]
    INSTANCE: db2inst1               NODE : 128
    HOSTNAME: test_host
    EDUID   : 1                    EDUNAME: db2rocme 128 [db2inst1]
    FUNCTION: DB2 UDB, high avail services, rocmCAMonitor, probe:761
    MESSAGE : ECF=0x94C6005A=-1798963110
    DATA #1 : String, 70 bytes
    CF has encountered a hang, registering a timeout and returning offline
    DATA #2 : ZRC, PD_TYPE_ZRC, 4 bytes
    0x00000000
    .....
    
    The CF stacks show one CF thread holding a lock while it
    traverses a list of elements to free, with all other threads
    blocked waiting on that lock. The cause is a timing window
    that exists when this list is modified, which can leave the
    list corrupted. The corrupted list is circular, so the CF
    thread traversing it never reaches the end, and the CF hangs
    indefinitely.
    
    This APAR addresses the above observed issue.
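
To illustrate why a circular list turns an ordinary traversal into a hang, the following is a minimal sketch in Python. It is not the CF code and does not show the actual fix; the `Node` class and the `build_list` and `has_cycle` helpers are hypothetical names introduced only for this illustration. The sketch builds a small list, links the last element back to the first to mimic the corruption, and uses Floyd's cycle detection to show that a walk relying on reaching the end of the list would never terminate.

```python
class Node:
    """Hypothetical list element; not a CF data structure."""

    def __init__(self, name):
        self.name = name
        self.next = None


def build_list(names):
    """Build a None-terminated singly linked list; return (head, tail)."""
    head = tail = None
    for name in names:
        node = Node(name)
        if head is None:
            head = node
        else:
            tail.next = node
        tail = node
    return head, tail


def has_cycle(head):
    """Floyd's tortoise-and-hare check: True if the list loops back on itself."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next          # advances one element per step
        fast = fast.next.next     # advances two elements per step
        if slow is fast:          # the pointers can only meet inside a loop
            return True
    return False


head, tail = build_list(["a", "b", "c"])
healthy = has_cycle(head)

# Mimic the corruption: the last element points back at the first, so a
# plain "while node is not None: node = node.next" walk would never end.
tail.next = head
corrupted = has_cycle(head)

print(healthy, corrupted)
```

A thread that runs the plain walk while holding a lock keeps that lock forever, which matches the symptom described above: every other thread blocks on the lock and the monitor eventually registers a timeout.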
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * ALL                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to DB2 11.1 Mod 2 Fix Pack 2 or higher               *
    ****************************************************************
    

Problem conclusion

  • First fixed in DB2 11.1 Mod 2 Fix Pack 2
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT18995

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    B10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-01-26

  • Closed date

    2017-06-28

  • Last modified date

    2017-06-28

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • Product

    Db2 for Linux, UNIX and Windows (SSEPGG)

  • Version

    11.1

  • Platform

    Platform Independent (PF025)

  • Business Unit

    Cloud & Data Platform (BU053)

  • Line of Business

    Data and AI (LOB10)

Document Information

Modified date:
29 June 2020