IBM Support

IT36716: SPECTRUM PROTECT SERVER MIGHT CRASH DUE TO ACTIVE LOG PINNING BY RUNNING LARGE OBJECTS TIERING TRANSACTIONS


APAR status

  • Closed as program error.

Error description

  • When large objects are tiered, the number of chunk updates in a
    single transaction can pin the active log and cause the log to
    fill. When this happens, performance degrades severely across
    the whole server, and all server sessions and operations can
    grind to a halt.
    
    L2/Customer diagnostics:
    
    - In the server activity log (actlog), the following message is
      encountered during tiering:
    
    ANR4537I The active log space used is 234355.04 megabytes, and
    the active log space available is 26508.96 megabytes. The ratio,
    0.90, exceeds the threshold 0.80
    
    - The db2diag.log can show the following error:
    
    2021-03-30-22.34.25.857239-420 I66080763A631        LEVEL: Error
    PID     : 7143836              TID : 252994         PROC : db2sysc 0
    INSTANCE: tsminst1             NODE : 000           DB   : TSMDB1
    APPHDL  : 0-39496              APPID: /..
    UOWID   : 2                    ACTID: 1
    AUTHID  : TSMINST1             HOSTNAME: ...
    EDUID   : 252994               EDUNAME: db2agent (TSMDB1) 0
    FUNCTION: DB2 UDB, data protection services, sqlpWriteLR, probe:6680
    MESSAGE : ZRC=0x85100009=-2062548983=SQLP_NOSPACE
              "Log File has reached its saturation point"
              DIA8309C Log file was full.
    
    - Verify that tiering threads are causing the log pinning by
      analyzing the servermon data, as follows:
    
    "db2.txt" shows the
    "Appl id holding the oldest transaction= 36875".
    
    "show.txt" shows which threads work with the above application
    ID:
    
    Tsn=0:260263216, Resurrected=False, InFlight=True, Distributed=False, Persistent=True, Addr 11a007c00
     Start ThreadId=243345, Timestamp=03/30/21 21:13:35, Creator=sddelete.c(5329)
     Last known in use by ThreadId=243345
     Participants=3, summaryVote=ReadOnly
     EndInFlight False, endThreadId 0, tmidx 0, processBatchCount 0, mustAbort False.
     Participant DB: voteReceived=False, ackReceived=False
     DB: Txn 119ed1200, ReadOnly(NO), connP=12e75c080, applHandle=36875, openTbls=7:
     DB: --> OpenP=13f60d560 for table=SC.Object.Tracker.
     DB: --> OpenP=14d382760 for table=SD.Non.Dedup.Refcount.Updates.
     DB: --> OpenP=14d3820e0 for table=SD.Refcount.Updates.
     DB: --> OpenP=12ec46a40 for table=SD.Pending.Deletions.
     DB: --> OpenP=14d2ede80 for table=SD.Chunk.Locations.
     DB: --> OpenP=12900aa20 for table=SD.Non.Dedup.Locations.
     DB: --> OpenP=12895c5e0 for table=SD.Recon.Order.
     DB: --> RegSqlId=0x0F000010 SELECT for table=SD.Recon.Order, executed(No).
     DB: --> RegSqlId=0x0F0000EE SELECT for table=SD.Recon.Order, executed(Yes).
     DB: --> RegSqlId=0x0F0000AA SELECT for table=SD.Chunk.Copies, executed(Yes).
     DB: --> RegSqlId=0x0F000103 UPDATE for table=SD.Non.Dedup.Locations, executed(Yes).
     DB: --> RegSqlId=0x0F000018 UPDATE for table=SD.Refcount.Updates, executed(Yes).
     Participant SC: voteReceived=False, ackReceived=False
     Participant SD: voteReceived=False, ackReceived=False
     Locks held by Tsn=0:260263216 :
     Type=20009(SD Object ID), NameSpace=4, SummMode=xLock, Mode=xLock, Key='585801473' File=sddelete.c Line=2898
     Type=19002(im node filespace), NameSpace=0, SummMode=sLock, Mode=sLock, Key='619.1' File=sddelete.c Line=5338
     Type=19001(im node), NameSpace=0, SummMode=isLock, Mode=isLock, Key='619' File=sddelete.c Line=5338
     Type=20009(SD Object ID), NameSpace=4, SummMode=xLock, Mode=xLock, Key='585801470' File=sddelete.c Line=2898
     Type=20009(SD Object ID), NameSpace=4, SummMode=xLock, Mode=xLock, Key='585801463' File=sddelete.c Line=2898
    
    
    The server batches transactions based on the number of files
    processed; the default batch size is 1000. If a batch contains
    very large files, the number of chunk updates in that
    transaction can pin the log for too long, which causes this
    issue.
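
The ANR4537I message quoted in the diagnostics above can be checked with a little arithmetic. The following is a minimal Python sketch, not an IBM tool: it extracts the figures from the message text and recomputes the ratio as used / (used + available). That definition is an assumption; it is used here because it yields about 0.898 from the numbers in the message, which is consistent with the reported 0.90. The function name `parse_anr4537i` and the dictionary keys are illustrative only.

```python
import re

# One decimal number such as 234355.04 or 0.80.
_NUM = r"(\d+(?:\.\d+)?)"

# Matches the numeric fields of an ANR4537I message once hard line
# wraps have been collapsed to single spaces.
ANR4537I = re.compile(
    r"ANR4537I The active log space used is " + _NUM + r" megabytes, "
    r"and the active log space available is " + _NUM + r" megabytes\. "
    r"The ratio, " + _NUM + r", exceeds the threshold " + _NUM
)


def parse_anr4537i(text):
    """Return the figures from an ANR4537I message, or None if absent."""
    flat = " ".join(text.split())  # undo line wrapping in the pasted log
    match = ANR4537I.search(flat)
    if match is None:
        return None
    used, available, reported, threshold = map(float, match.groups())
    total = used + available
    return {
        "used_mb": used,
        "available_mb": available,
        "total_mb": total,
        # Assumed definition of the ratio: used / (used + available).
        "computed_ratio": used / total,
        "reported_ratio": reported,
        "threshold": threshold,
    }


if __name__ == "__main__":
    sample = """ANR4537I The active log space used is 234355.04
    megabytes, and the active log space available is 26508.96
    megabytes. The ratio, 0.90, exceeds the
    threshold 0.80"""
    info = parse_anr4537i(sample)
    print(f"total active log : {info['total_mb']:.2f} MB")
    print(f"computed ratio   : {info['computed_ratio']:.2f}")
    print(f"over threshold   : {info['computed_ratio'] > info['threshold']}")
```

Running the same function over successive activity log extracts gives a simple way to see whether the active log is still filling while tiering runs.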
    
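
The cross-check between "db2.txt" and "show.txt" described above can also be scripted. The sketch below assumes that both files contain unwrapped lines in the shape of the excerpts in this APAR, that each transaction block in "show.txt" begins with a line starting with `Tsn=`, and that the handle appears as `applHandle=<number>`. The full layout of the servermon files is not documented here, so treat the patterns and the function names (`oldest_appl_handle`, `transactions_for_handle`) as hypothetical.

```python
import re


def oldest_appl_handle(db2_text):
    """Return the application handle holding the oldest transaction."""
    match = re.search(
        r"Appl id holding the oldest transaction\s*=\s*(\d+)", db2_text
    )
    return int(match.group(1)) if match else None


def transactions_for_handle(show_text, handle):
    """List the transactions in show.txt that use the given handle."""
    # Split before every line that starts a new "Tsn=" block. Lines such
    # as " Locks held by Tsn=..." do not start a block and stay attached.
    blocks = re.split(r"(?m)^(?=\s*Tsn=)", show_text)
    found = []
    for block in blocks:
        if not re.search(rf"applHandle={handle}\b", block):
            continue
        tsn = re.match(r"\s*Tsn=([^,\s]+)", block)
        thread = re.search(r"Start ThreadId=(\d+)", block)
        creator = re.search(r"Creator=(\S+)", block)
        found.append({
            "tsn": tsn.group(1) if tsn else None,
            "thread_id": int(thread.group(1)) if thread else None,
            "creator": creator.group(1) if creator else None,
        })
    return found


if __name__ == "__main__":
    # File names follow the servermon output named in this APAR.
    with open("db2.txt") as f:
        handle = oldest_appl_handle(f.read())
    with open("show.txt") as f:
        for txn in transactions_for_handle(f.read(), handle):
            print(txn)
```

In the excerpt above, the matching block reports `Creator=sddelete.c(5329)`; this APAR treats that transaction as evidence that the tiering threads are the ones pinning the log.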

Local fix

  • Reduce the number of files processed per transaction by
    lowering the PENDINGOBJDELBATCHSIZE and ASYNCSDOBJDELWORKERS
    server options:
    
    SETOPT ASYNCSDOBJDELWORKERS 3
    SETOPT PENDINGOBJDELBATCHSIZE 50
    
    If this issue is encountered, set the options above and then
    recycle (restart) the IBM Spectrum Protect server.
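
To illustrate why smaller values help, the following back-of-the-envelope Python sketch compares the number of chunk updates that one transaction can carry at the default batch size of 1000 and at the reduced value of 50. The figure of 20,000 chunks per object is purely hypothetical, and the in-flight estimate assumes that each worker holds at most one such transaction open at a time, which this APAR does not state. It is a rough model, not a description of the server's internals.

```python
def chunk_updates_per_transaction(batch_size, chunks_per_object):
    """Chunk updates in one transaction if every object has this many chunks."""
    return batch_size * chunks_per_object


def chunk_updates_in_flight(workers, batch_size, chunks_per_object):
    """Upper bound across workers, assuming one open transaction per worker."""
    return workers * chunk_updates_per_transaction(batch_size, chunks_per_object)


if __name__ == "__main__":
    CHUNKS_PER_OBJECT = 20_000  # hypothetical value for a very large object

    default_txn = chunk_updates_per_transaction(1000, CHUNKS_PER_OBJECT)
    reduced_txn = chunk_updates_per_transaction(50, CHUNKS_PER_OBJECT)
    print(f"batch size 1000: {default_txn:,} chunk updates per transaction")
    print(f"batch size 50  : {reduced_txn:,} chunk updates per transaction")

    # With ASYNCSDOBJDELWORKERS 3 and PENDINGOBJDELBATCHSIZE 50:
    print(f"in flight      : {chunk_updates_in_flight(3, 50, CHUNKS_PER_OBJECT):,}")
```

Under these assumptions, reducing the batch size from 1000 to 50 shrinks each transaction by a factor of 20, so each transaction can commit and release its hold on the active log sooner.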
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All IBM Spectrum Protect server users of container type      *
    * storage pools.                                               *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See error description.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in levels 8.1.12.200 and 8.1.13. Note  *
    * that this is subject to change at the discretion of IBM.     *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms for reported release:  AIX, Linux, and
    Windows.
    Platforms fixed:   AIX, Linux, Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT36716

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    81A

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-04-27

  • Closed date

    2021-07-30

  • Last modified date

    2021-07-30

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R81A PSY

       UP


Document Information

Modified date:
28 April 2022