IBM Support

IT22805: CF STUCK IN CATCHUP STATE

APAR status

  • Closed as program error.

Error description

  • CF stuck in CATCHUP state
    
    When starting a DB2 pureScale instance, the primary CF starts
    successfully, but the secondary CF goes into the ERROR state and,
    after a while, becomes stuck in the CATCHUP state.
    
    [db2instd@psmemb01 ~]$ db2instance -list
    ID   TYPE    STATE    HOME_HOST  CURRENT_HOST  ALERT  PARTITION_NUMBER  LOGICAL_PORT  NETNAME
    --   ----    -----    ---------  ------------  -----  ----------------  ------------  -------
    0    MEMBER  ERROR    psmemb01   psmemb04      YES    0                 2             psmemb04
    1    MEMBER  ERROR    psmemb02   psmemb01      YES    0                 2             psmemb01
    2    MEMBER  ERROR    psmemb03   psmemb02      YES    0                 1             psmemb02
    3    MEMBER  ERROR    psmemb04   psmemb03      YES    0                 1             psmemb03
    128  CF      PRIMARY  psdbcf01   psdbcf01      NO     -                 0             psdbcf01
    129  CF      CATCHUP  psdbcf02   psdbcf02      NO     -                 0             psdbcf02

    HOSTNAME  STATE   INSTANCE_STOPPED  ALERT
    --------  -----   ----------------  -----
    psmemb04  ACTIVE  NO                YES
    psmemb03  ACTIVE  NO                YES
    psmemb02  ACTIVE  NO                YES
    psdbcf02  ACTIVE  NO                NO
    psdbcf01  ACTIVE  NO                NO
    psmemb01  ACTIVE  NO                YES
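
When checking many instances, the listing above can be scanned for a CF that is reported in CATCHUP. The following Python sketch is illustrative only: the function names are hypothetical, it assumes each row has already been unwrapped onto a single line, and it assumes the column order shown in the header above. CATCHUP is also a normal transient state, so a hit only means the CF is in CATCHUP at the moment of the listing; it is "stuck" only if the state persists across repeated checks.

```python
# Sample rows copied from the `db2instance -list` output in this APAR,
# with each wrapped row joined onto one line (an assumption of this sketch).
SAMPLE = """\
0    MEMBER  ERROR    psmemb01  psmemb04  YES  0  2  psmemb04
1    MEMBER  ERROR    psmemb02  psmemb01  YES  0  2  psmemb01
2    MEMBER  ERROR    psmemb03  psmemb02  YES  0  1  psmemb02
3    MEMBER  ERROR    psmemb04  psmemb03  YES  0  1  psmemb03
128  CF      PRIMARY  psdbcf01  psdbcf01  NO   -  0  psdbcf01
129  CF      CATCHUP  psdbcf02  psdbcf02  NO   -  0  psdbcf02
"""


def parse_instance_rows(listing):
    """Return one dict per MEMBER/CF row; header and host rows are ignored."""
    rows = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID followed by MEMBER or CF.
        if len(fields) >= 6 and fields[0].isdigit() and fields[1] in ("MEMBER", "CF"):
            rows.append({
                "id": int(fields[0]),
                "type": fields[1],
                "state": fields[2],
                "home_host": fields[3],
                "current_host": fields[4],
                "alert": fields[5],
            })
    return rows


def cfs_in_catchup(rows):
    """Return the CF rows whose STATE column reads CATCHUP."""
    return [r for r in rows if r["type"] == "CF" and r["state"] == "CATCHUP"]


if __name__ == "__main__":
    for cf in cfs_in_catchup(parse_instance_rows(SAMPLE)):
        print("CF %d on %s is in CATCHUP" % (cf["id"], cf["current_host"]))
```
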
    
    This can happen when the CF automatic memory tuner (CFMT) is ON and
    the secondary CF restarts: a member might not be able to allocate
    the group buffer pool (GBP) with the same structure size on the
    secondary CF.
    
    The following error will be observed:
    
    2017-11-09-02.55.07.208865-300 I7472119E3370         LEVEL: Severe
    PID     : 10031                TID : 139967275722496 PROC : db2sysc 0
    INSTANCE: user               NODE : 000            DB   : ROCMDB
    HOSTNAME: hotellnx114
    EDUID   : 63                   EDUNAME: db2castructevent GBP (ROCMDB) 0
    FUNCTION: DB2 UDB, base sys utilities, sqleCaSynchronousCacheAllocate, probe:3494
    MESSAGE : CA RC= 2147879427
    DATA #1 : CF Response Structure, PD_TYPE_SD_CFRESPONSE, 424 bytes
    API Response Code = 2147879427
    Model Code = 6
    Response Code = 3
    Exception Code = 0
    SID = 316
    API ID = 147
    Line Number = 0
    Description = API Function: 147 (PsCacheAllocate),
    Structure ID (SID): 316 (Unknown Name),
    Model Code: 6 (Management), Command Code: 10 (0xa) (Allocate Cache Structure)
    Return Code: Severity = 0x8 (Error), Component = 6 (svr_mgmnt [Management]),
    Command =  10 (0xa), (Allocate Cache Structure), Error = 3
    Error Name = Invalid target structure size (CA_MGMNT_CACHE_ALLOC_INVALID_TSS)
    API Return Code: 0x80060a03,
    CA Response Code: 3 (Invalid target structure size)
    Status Conditions: 0
    
    DATA #2 : String, 712 bytes
    Cache Allocation Request:
           sid = 316
           tdtdr_de = 80
           tdtdr_dae = 1
           monitor_frequency = 0
           aai = 0
           api = 0
           udfoqi = 1
           at = 0
           mdas = 16
           dtf = 0
           ditf = 0
           tcpi = 0
           trvi = 0
           bcpi = 0
           tss = 5888
           mxss = 272640
           daex = 4
           msc = 9
           mcc = 1024
           tgdec = 6000000
           tgdaec = 10000000
           ncm = 4095
           db2glc = 1
           sau =
           csau =
    Cache Allocation Response:
           sc[0] = 0
           sc[1] = 0
           sc[2] = 0
           reipi = 0
           ssci = 0
           pdtdr_de = 0
           pdtdr_dae = 0
           ss = 0
           mrss = 11673
           mass = 11776
           tdec = 0
           tdaec = 0
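
The decimal code in the MESSAGE line (2147879427) and the hexadecimal API Return Code (0x80060a03) in the entry above are the same 32-bit value. The Python sketch below splits that value into the severity, component, command and error fields that the log prints. The field layout used here (top 4 bits for severity, then one byte each for component, command and error) is an assumption inferred from this single log entry, not a documented format, and the function name is hypothetical.

```python
def decode_cf_return_code(rc):
    """Split a 32-bit CF API return code into the fields shown in the log.

    ASSUMPTION: the bit layout is inferred from the one db2diag entry in
    this APAR; bits 24-27 are not covered by it and are ignored here.
    """
    return {
        "hex": "0x%08x" % rc,
        "severity": (rc >> 28) & 0xF,    # 0x8 is labelled "Error" in the log
        "component": (rc >> 16) & 0xFF,  # 6 is labelled svr_mgmnt [Management]
        "command": (rc >> 8) & 0xFF,     # 10 (0xa) is Allocate Cache Structure
        "error": rc & 0xFF,              # 3 is Invalid target structure size
    }


if __name__ == "__main__":
    print(decode_cf_return_code(2147879427))
```

For the value in this entry, the sketch yields severity 8, component 6, command 10 and error 3, which matches the "Return Code" line of the log.
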
    
    The error shows that the requested target structure size
    (tss = 5888) is smaller than the minimum size returned by the CF
    (mrss = 11673), so the allocation is rejected with "Invalid target
    structure size".
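
The same comparison can be made mechanically from the "Cache Allocation Request" and "Cache Allocation Response" text of a db2diag entry. The following Python sketch is illustrative only: the helper names are hypothetical, the sample text is an excerpt of the fields logged above, and the only rule it applies is the one stated in this APAR (tss below mrss).

```python
import re

# Excerpts of the request and response fields from the db2diag entry above.
REQUEST = """
       sid = 316
       tss = 5888
       mxss = 272640
       sau =
"""
RESPONSE = """
       ss = 0
       mrss = 11673
       mass = 11776
"""

# Matches lines of the form `name = integer`; empty values are not matched.
FIELD = re.compile(r"^[ \t]*([\w\[\]]+)[ \t]*=[ \t]*(\d+)[ \t]*$", re.MULTILINE)


def parse_fields(block):
    """Collect `name = integer` pairs; lines such as `sau =` are skipped."""
    return {name: int(value) for name, value in FIELD.findall(block)}


def target_size_shortfall(request, response):
    """Return how far tss falls below mrss, or 0 if tss is large enough."""
    return max(0, response["mrss"] - request["tss"])


if __name__ == "__main__":
    req, resp = parse_fields(REQUEST), parse_fields(RESPONSE)
    gap = target_size_shortfall(req, resp)
    if gap:
        print("tss=%d is below mrss=%d by %d" % (req["tss"], resp["mrss"], gap))
```
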
    
    
    Workaround/Local fix:
    A full db2stop/db2start may resolve this issue.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * ALL                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to Db2 11.1 Mod 3 Fix Pack 3 or higher               *
    ****************************************************************
    

Problem conclusion

  • First fixed in Db2 11.1 Mod 3 Fix Pack 3
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT22805

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    B10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-10-17

  • Closed date

    2018-03-19

  • Last modified date

    2018-03-19

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • RB10 PSN

       UP

Document Information

Modified date:
19 March 2018