IBM Support

IT36193: WHEN USER CANCELS AN IN-PROGRESS COPY BACKUP THERE IS WINDOW WHEN DATAMOVER WILL GET STUCK IN PENDING STATE

APAR status

  • Closed as program error.

Error description

  • The issue occurs intermittently when a running copy backup
    operation is canceled in IBM Spectrum Protect Plus. An SLA with
    many associated backups (more than 10) is more likely to hit
    the problem.
    The datamover pod then remains stuck in a Pending state
    indefinitely because it cannot start and therefore never exits.
    
    A kubectl get pod on the datamover will remain Pending, as in
    the following:
    NAME                READY   STATUS    RESTARTS   AGE
    pvc-backup-112342   0/1     Pending   0          5s
    
    An event will exist to show that scheduling has failed:
      Warning  FailedScheduling  <unknown>  default-scheduler  persistentvolumeclaim "testpvc" not found
    
    The problem is confirmed when the backup operation has been
    canceled and the job is no longer running, yet kubectl get pods
    shows that the deployment and its pods still exist on the
    Kubernetes cluster.
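
    The check described above can be sketched as a small shell
    filter. This is only an illustration: the function name
    pending_datamovers is hypothetical, and it assumes that the
    pod names carry the same resource-backup or pvc-backup
    prefixes as the deployments and that the input has the column
    layout of kubectl get pods --all-namespaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): reads the output of
# `kubectl get pods --all-namespaces` on stdin and prints namespace/name
# for every datamover-style pod whose STATUS column is Pending.
# Assumed columns: NAMESPACE NAME READY STATUS RESTARTS AGE
pending_datamovers() {
    awk '$4 == "Pending" && $2 ~ /resource-backup|pvc-backup/ {
        print $1 "/" $2
    }'
}

# Usage:
#   kubectl get pods --all-namespaces | pending_datamovers
```

    For each pod that is reported, kubectl describe pod -n
    <namespace> <pod name> lists its events, where the
    FailedScheduling warning would be expected to appear.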
    

Local fix

  • Wait for running backup jobs to complete or finish canceling.
    
    Obtain a list of the stuck Pending deployments with the
    following kubectl command:
    kubectl get deployment --all-namespaces | grep 'resource-backup\|pvc-backup'
    
    Manually delete these deployments with the following:
    kubectl delete deployment -n <namespace> <deployment name>
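
    The two steps above can be combined in a dry-run helper that
    only prints the delete commands for review. This is a minimal
    sketch, not an IBM-supplied tool: the function name
    stuck_delete_cmds is hypothetical, and it assumes the default
    column layout of kubectl get deployment --all-namespaces
    (namespace first, deployment name second). It does not check
    whether backup jobs are still running, so wait for them to
    complete or finish canceling first.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): reads the output of
# `kubectl get deployment --all-namespaces` on stdin and prints one
# `kubectl delete deployment` command per leftover datamover deployment.
# Nothing is deleted; the commands are printed for manual review.
stuck_delete_cmds() {
    awk '$2 ~ /resource-backup|pvc-backup/ {
        printf "kubectl delete deployment -n %s %s\n", $1, $2
    }'
}

# Usage:
#   kubectl get deployment --all-namespaces | stuck_delete_cmds
```

    Printing the commands instead of running them keeps the
    manual confirmation step, since the name pattern could also
    match deployments that belong to a backup that is still in
    progress.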
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * IBM Spectrum Protect Plus level 10.1.7                       *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in IBM Spectrum Protect Plus level     *
    * 10.1.8. Note that this is subject to change at the           *
    * discretion of IBM.                                           *
    ****************************************************************
    

Problem conclusion

  • The problem has been fixed so that the scheduler correctly
    checks whether a request from the Agent was completed or
    canceled. With the fix, the data mover is no longer created and
    left in a Pending state on the Kubernetes cluster.
    

Temporary fix

  • Wait for running backup jobs to complete or finish canceling.
    
    Obtain a list of the stuck Pending deployments with the
    following kubectl command:
    kubectl get deployment --all-namespaces | grep 'resource-backup\|pvc-backup'
    
    Manually delete these deployments with the following:
    kubectl delete deployment -n <namespace> <deployment name>
    

Comments

APAR Information

  • APAR number

    IT36193

  • Reported component name

    SP PLUS

  • Reported component ID

    5737SPLUS

  • Reported release

    A17

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-03-11

  • Closed date

    2021-03-18

  • Last modified date

    2021-03-18

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SP PLUS

  • Fixed component ID

    5737SPLUS

Applicable component levels


Document Information

Modified date:
31 January 2024