IBM Support

IC85734: SERVER CAN CRASH FROM CANCEL PROCESS ACQUIRING MUTEX ON BFCANCELMEDIAWAITS

APAR status

  • Closed as program error.

Error description

  • The server crashed when a CANCEL PROCESS command tried to acquire a
    mutex in bfCancelMediaWaits.
    
    Customer/L2 Diagnostics (if applicable)
    Actlog
       07/06/12   12:29:54      ANR2017I Administrator TSMADMIN
    issued command: MOVE DATA 409AADL5 RECONSTRUCT=YES  (SESSION:
    1606)
    ...
     07/06/12   12:29:58      ANR0984I Process 11 for MOVE DATA
    started in the BACKGROUND at 12:29:58. (SESSION: 1606, PROCESS:
    11)
    ...
     07/06/12   12:45:24      ANR2750I Starting scheduled command
    MIGARCHIVESAN95 ( UPDSTG ARCHIVESAN Hi=95 Lo=40 ). (SESSION:
    1723)
     07/06/12   12:45:24      ANR2017I Administrator TCMH issued
    command: UPDATE STGPOOL ARCHIVESAN Hi=95 Lo=40  (SESSION: 1723)
     07/06/12   12:45:24      ANR2202I Storage pool ARCHIVESAN
    updated. (SESSION: 1723)
     07/06/12   12:45:24      ANR2753I (MIGARCHIVESAN95):ANR2202I
    Storage pool ARCHIVESAN updated. (SESSION: 1723)
     07/06/12   12:45:24      ANR2751I Scheduled command
    MIGARCHIVESAN95 completed successfully. (SESSION: 1723)
     07/06/12   12:45:29      ANR2017I Administrator TSMADMIN issued
    command: CANCEL PROCESS 11  (SESSION: 1716)
     07/06/12   12:45:29      ANR1143W Move data process terminated
    for volume 409AADL5 - process canceled. (SESSION: 1606, PROCESS:
    11)
     07/06/12   12:45:29      ANR0515I Process 11 closed volume
    409AADL5. (SESSION:1606, PROCESS: 11)
     07/06/12   12:45:29      ANR0515I Process 11 closed volume
    387AADL5. (SESSION:1606, PROCESS: 11)
     07/06/12   12:45:29      ANR0511I Session 1631 opened output
    volume 387AADL5.(SESSION: 1631)
     07/06/12   12:53:31      ANR4726I The NAS-NDMP support module
    has been loaded.
    
    DBX call stack from the core
    IOT/Abort trap in _pth_init_kgetsig at 0x900000000744650
    ($t3142)
    [untrusted: /usr/lib/libpthreads.a(shr_xpg5_64.o)]
    0x900000000744650 (_pth_init_kgetsig+0x70) 38840004        addi
    r4,0x4(r4)
    (dbx) where
    _pth_init_kgetsig() at 0x900000000744650 [untrusted:
    /usr/lib/libpthreads.a(shr_xpg5_64.o)]
    _p_sigtimedwait(??, ??, ??) at 0x900000000743ec8 [untrusted:
    /usr/lib/libpthreads.a(shr_xpg5_64.o)]
    raise.raise(??) at 0x90000000002bd2c [untrusted:
    /usr/lib/libc.a(shr_64.o)]
    tzload(??, ??, ??) at 0x900000000088504 [untrusted:
    /usr/lib/libc.a(shr_64.o)]
    PsAbortServer(??) at 0x1000209c0
    pkAbort(??) at 0x10001ab04             <---- this is the crash
    TrapSyncError(??) at 0x10000763c
    pkAcquireMutexTracked(??, ??, ??) at 0x10000779c
    bfCancelMediaWaits(??, ??) at 0x100211294
    AfMoveDataCancel(??, ??, ??) at 0x10066791c
    procCancelProcess(??, ??) at 0x10048bf94
    AdmCancelProcess(??) at 0x100b390f4
    AdmCommandLocal(??, ??, ??, ??, ??) at 0x10017ee90
    admCommand(??, ??, ??, ??, ??) at 0x10017d450
    SmAdminCommandThread(??) at 0x1008714f0
    StartThread(??) at 0x10001b684
    (dbx)
    
    
    *NOTE*
    This issue may be a variant of:
    APAR IC60842 - CANCEL PROCESS/SESSION MAY CRASH TSM SERVER WHEN
    PROCESS/SESSION IS IN A MOUNT WAIT STATE.
    ERROR DESCRIPTION:
     If a process or session is in a mediaW state and that process or
     session is cancelled, the TSM Server may crash. The following
     call stack was generated on AIX when a cancel process on a move
     data was issued:
    ...
    Because call stacks can vary, the presence of bfCancelMediaWaits in
    the stack is a good indication that this APAR applies.
    
    
    Platforms affected:
    TSM 6.2, 6.3 Unix/Linux/Windows
    
    Initial Impact: Medium
    
    Additional Keywords: ZZ62 ZZ63 crash abend core cancel proc
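
The stack check described in the note above can be sketched as a small script. This is a minimal illustration, not an IBM tool: the helper names `frames` and `looks_like_ic85734` are hypothetical, and the script assumes the `where` output from dbx has been saved as plain text in the layout shown in this APAR. Only the frame name bfCancelMediaWaits comes from the APAR itself.

```python
import re

# Assumption: each frame line starts with the function name followed by "(",
# as in the dbx `where` output quoted in this APAR. Wrapped continuation
# lines (library paths) start with "/" and are therefore skipped.
FRAME_RE = re.compile(r"^\s*([A-Za-z_][\w.]*)\(")


def frames(dbx_where_output):
    """Return the function names found in dbx `where` output, innermost first."""
    names = []
    for line in dbx_where_output.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def looks_like_ic85734(dbx_where_output):
    """Heuristic only: True if bfCancelMediaWaits appears anywhere in the stack."""
    return "bfCancelMediaWaits" in frames(dbx_where_output)
```

A match is only an indication that this APAR may apply; confirming it still requires comparing the server level against the fixing levels.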
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All Tivoli Storage Manager server users of   *
    *                 CANCEL PROCESS command.                      *
    ****************************************************************
    * PROBLEM DESCRIPTION: See ERROR DESCRIPTION.                  *
    ****************************************************************
    * RECOMMENDATION: Apply fixing level when available. This      *
    *                 problem is currently projected to be fixed   *
    *                 in levels 6.2.5, and 6.3.3. Note that this   *
    *                 is subject to change at the discretion of    *
    *                 IBM.                                         *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Solaris, Linux, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC85734

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    62A

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-08-03

  • Closed date

    2012-08-21

  • Last modified date

    2012-08-21

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R62A PSY

       UP

  • R62H PSY

       UP

  • R62L PSY

       UP

  • R62S PSY

       UP

  • R62W PSY

       UP

  • R63A PSY

       UP

  • R63H PSY

       UP

  • R63L PSY

       UP

  • R63S PSY

       UP

  • R63W PSY

       UP

Document Information

Modified date:
21 August 2012