IBM Support

IT31918: VSNAP WITH DEDUPLICATION ENABLED HAS SLOW PERFORMANCE IN GENERAL AND IS ESPECIALLY SLOW AFTER A REBOOT


APAR status

  • Closed as program error.

Error description

  • When deduplication is enabled on a vSnap server and many
    terabytes of data have already been written into the vSnap
    storage pool, further I/O operations can be very slow. This
    problem is seen with various symptoms, such as:
    
    - All backup operations show very slow throughput.
    - Operations to create target volumes or NFS shares on vSnap
    fail with timeouts.
    - VM backup operations appear to hang/stall for hours.
    - SQL backup operations fail with errors seen in the job log:
    "Failed to mount for volume <name>, Volume id/name is not
    found."
    - SQL backup operations fail with errors seen in the job log:
    "The system cannot find the file specified, The device is not
    ready."
    
    These issues occur when enough data has previously been written
    into the pool that the dedup lookup table (DDT) has grown to a
    size of 50 gigabytes or larger.
    
    An additional major symptom is that when the vSnap server is
    rebooted, it shows especially poor performance immediately after
    the reboot. Depending on the size of the DDT, it takes several
    hours to over a day for the vSnap to return to peak performance.
    
    APAR IT30276 was previously opened to address these issues. The
    work done in that APAR yielded an incremental improvement in
    overall performance, but performance problems still persist.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * IBM Spectrum Protect Plus levels 10.1.0, 10.1.1, 10.1.2,     *
    * 10.1.3, 10.1.4 and 10.1.5.                                   *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in IBM Spectrum Protect Plus level     *
    * 10.1.5 patch1 and 10.1.6. Note that this is subject to       *
    * change at the discretion of IBM.                             *
    ****************************************************************
    

Problem conclusion

  • The fixes made in APAR IT30276 improved the utilization of
    memory caches to ensure more of the deduplication table (DDT)
    can be cached in memory. This yielded incremental improvements
    in performance, but problems persist because the DDT eventually
    grows too large for the system's RAM to cache in its entirety.
    Slowdowns result when portions of the DDT have to be repeatedly
    reloaded into the memory cache from disk.
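
The effect described above can be illustrated with a toy model. The following is a minimal sketch, not vSnap code: it assumes an LRU cache and uniformly random lookups (a plausible stand-in for checksum-keyed lookups), and the function name `hit_rate` and all sizes are hypothetical.

```python
import random
from collections import OrderedDict


def hit_rate(table_entries, cache_entries, lookups=20_000, seed=0):
    """Return the fraction of lookups served from an LRU cache.

    table_entries: number of distinct keys in the (simulated) table.
    cache_entries: number of keys the in-memory cache can hold.
    """
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(lookups):
        # Assumption: lookups are spread uniformly over the whole table.
        key = rng.randrange(table_entries)
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            # A miss stands in for reloading the entry from disk.
            cache[key] = True
            if len(cache) > cache_entries:
                cache.popitem(last=False)  # evict least recently used
    return hits / lookups


if __name__ == "__main__":
    print("table fits in cache:   ", hit_rate(1000, 1000))
    print("table is 10x the cache:", hit_rate(1000, 100))
```

When the table fits in the cache, only the first access to each key misses. When the table is much larger than the cache, most lookups miss, and each miss corresponds to a disk read in the real system.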
    
    Further improvements have been made in this APAR to manage the
    size of the DDT to prevent it from growing too large. vSnap now
    manages the size of the DDT on an ongoing basis to ensure that
    the number of DDT entries that track unique blocks (i.e. blocks
    for which no duplicates have been found yet) does not exceed
    configured limits. For vSnap systems upgraded from previous
    versions where a very large DDT already exists, a one-time
    operation is introduced that shrinks the size of the DDT by
    evicting unneeded entries.
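
The idea of capping the number of unique-block entries can be sketched as follows. This is an illustration under stated assumptions, not the vSnap implementation: the APAR does not say which entries are evicted first, so the oldest-first order, the `DdtEntry` fields, and the `prune_unique_entries` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DdtEntry:
    checksum: str
    refcount: int   # assumption: 1 means unique, >1 means duplicates exist
    last_seen: int  # hypothetical logical timestamp of the last reference


def prune_unique_entries(ddt, max_unique):
    """Evict unique entries (refcount == 1) until at most max_unique remain.

    ddt is a dict mapping checksum -> DdtEntry and is modified in place.
    Entries with duplicates are never evicted. Returns the evicted
    checksums, oldest first (the eviction order is an assumption).
    """
    unique = sorted(
        (e for e in ddt.values() if e.refcount == 1),
        key=lambda e: e.last_seen,
    )
    excess = max(len(unique) - max_unique, 0)
    evicted = []
    for entry in unique[:excess]:
        del ddt[entry.checksum]
        evicted.append(entry.checksum)
    return evicted
```

In this model, evicting a unique entry only means a later identical block will not be deduplicated against it, while entries that already track duplicates stay in the table. Running the same routine once against an oversized table corresponds to the one-time shrink described for upgraded systems.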
    
    When a vSnap system is rebooted, the DDT must be reloaded into
    memory caches from disk. In prior versions, this reloading
    occurred on an as-needed basis when new data was written into
    the pool. This caused the initial write operations after a
    reboot to be very slow. The problem has been resolved by
    introducing a new phase in the vSnap startup process that
    aggressively preloads the DDT into memory after a reboot. After
    a reboot, during the startup of vSnap services, the DDT is read
    into memory caches before starting the remaining services. If
    this preload of the DDT does not complete within 15 minutes, the
    vSnap services are started anyway while the rest of the preload
    continues in the background. This ensures that the vSnap server
    achieves more reasonable performance soon after a reboot and,
    in general, reaches peak performance much sooner than it did in
    prior versions.
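
The startup behavior described above (preload first, then start the remaining services once the preload finishes or a deadline passes) can be sketched with a worker thread and a bounded join. This is a minimal sketch, not the vSnap startup code: `start_with_preload`, `preload`, and `start_services` are hypothetical names, and only the 15-minute limit comes from the APAR text.

```python
import threading
import time


def start_with_preload(preload, start_services, timeout_s=15 * 60):
    """Run preload() in a background thread, then start services.

    Waits up to timeout_s seconds for the preload to finish. If it is
    still running at the deadline, services are started anyway and the
    preload keeps running in the background.

    Returns (finished_in_time, worker_thread).
    """
    worker = threading.Thread(target=preload, daemon=True)
    worker.start()
    worker.join(timeout=timeout_s)  # returns early if preload completes
    finished_in_time = not worker.is_alive()
    start_services()
    return finished_in_time, worker


if __name__ == "__main__":
    def slow_preload():
        time.sleep(0.3)  # stands in for reading the DDT from disk
        print("preload finished")

    done, worker = start_with_preload(
        slow_preload, lambda: print("services started"), timeout_s=0.05
    )
    print("preload finished before deadline:", done)
    worker.join()  # in this demo, wait so the background work completes
```

The bounded join is what keeps a very large table from delaying service startup indefinitely; the trade-off is that early writes may still be slow until the background preload completes.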
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT31918

  • Reported component name

    SP PLUS

  • Reported component ID

    5737SPLUS

  • Reported release

    A15

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-02-19

  • Closed date

    2020-02-20

  • Last modified date

    2020-02-20

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SP PLUS

  • Fixed component ID

    5737SPLUS

Applicable component levels


Document Information

Modified date:
31 January 2024