IBM Support

PH62082: WHEN USING NVME STORAGE FOR TEMPSPACE1, AN UPGRADE OR RESET OF THE ACCELERATOR COULD LEAD TO AN OUTAGE

APAR status

  • Closed as documentation error.

Error description

  • When the Accelerator is deployed on IBM Z in a multi-node
    cluster and uses NVMe (Non-Volatile Memory Express) storage
    for TEMPSPACE1, any of the following actions
    - a Reset (with or without wipe),
    - a Shutdown/Deactivate/Activate of the LPARs,
    - or an Accelerator upgrade
    can lead to an outage: the head node and the data nodes will
    be down.
    The issue can be accompanied by the following error messages
    (some of them appear only in the internal logs of the database
    engine):
    - AQTST030E
    - "SQL0290N Table space access is not allowed. SQLSTATE=55039"
    - "ADM6047W The table space "TEMPSPACE1" (ID "1") is in the
    DROP_PENDING state. The table space will be kept OFFLINE. The
    table space state is "0x0000C000". This table space is
    unusable and should be dropped."
    - "SQL1034C The database was damaged " - " WARNING: Failed to
    activate BLUDB database on some or all of the members ..."
    Manual intervention by IBM support is required to
    re-activate the Accelerator (see the Local fix section).
    
    Please note:
    Although one of these messages suggests that the database is
    damaged, the integrity of the data kept in the
    accelerator-shadow tables or accelerator-only tables is not
    harmed.
    The issue occurs during the startup of the Accelerator: the
    DB2wh engine tries to access the pool for temporary data defined
    on NVMe storage, but the tempspace on one or more nodes
    (LPARs) is not yet available for processing.
    
    A fix for this issue is delivered with Accelerator
    maintenance level 7.5.12.3.
    
    Additional keywords:
    TS016417509 TS016545740 TS017486846
    NVME AQTST030E SQL0290N ADM6047W
    SQL1034C TEMPSPACE1 DATABASE DAMAGED DT390652
    GH/Everest/customer-cases/issues/716
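
The symptom messages listed in the error description can be searched for in captured log text. The following is a minimal sketch only: the function name scan_symptoms is hypothetical, and the location of the relevant logs is not stated in this APAR, so the log text is read from standard input.

```shell
# Hypothetical helper: print each symptom message ID from this APAR
# that occurs in the log text supplied on standard input.
scan_symptoms() {
    input=$(cat)
    # Message IDs taken from the error description of PH62082.
    for id in AQTST030E SQL0290N ADM6047W SQL1034C; do
        if printf '%s\n' "$input" | grep -q "$id"; then
            echo "$id"
        fi
    done
}
```

A possible invocation is `scan_symptoms < some_captured_log.txt`, where the file name is a placeholder. A match only indicates that a message is present; it does not confirm that this APAR is the cause.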
    

Local fix

  • Run an online session with the IBM technical support team.

    As the db2inst1 user:

    -- Check the table space ID of TEMPSPACE1:
    db2pd -db bludb -tablespace all
    -- Check whether the table space state contains zeroes only or
    -- shows 0x0000C000 (which indicates DROP PENDING):
    db2pd -db bludb -tablespace 1 -member all |grep -A1 -i "Status"
    -- Check the activation state; db_activation_state does not show
    -- EXPLICIT:
    db2 "select member,db_conn_time,db_activation_state from table(mon_get_database(-2)) order by member"
    -- Restart the database with the drop pending table space. The key
    -- here is to use "db2_all" to apply the command on all nodes:
    db2_all 'db2 "restart database bludb drop pending tablespaces (TEMPSPACE1)"'

    As root (inside docker), run the script:
    /head/vidaa_scripts/restart_db2_with_broken_tempspace1.sh

    As db2inst1:
    db2 activate db bludb

    Via the Admin GUI:
    Perform a Reset without wipe to re-install docker. The
    DROP/CREATE of TEMPSPACE1 will now succeed, and TEMPSPACE1
    will be back on transient storage.
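
The state check in the steps above compares the table space state against two values: all zeroes, or 0x0000C000 for DROP PENDING. The following is a minimal sketch of that comparison, assuming the hexadecimal state has already been read from the db2pd output by hand. The function name classify_ts_state and its result strings are hypothetical, and any other state value is outside the scope of this APAR.

```shell
# Hypothetical helper: classify a table space state value.
# $1 is a hexadecimal state such as 0x00000000 or 0x0000C000.
classify_ts_state() {
    case "$1" in
        0x[0-9A-Fa-f]*) ;;                 # looks like a hex value
        *) echo "INVALID"; return 1 ;;
    esac
    state=$(( $1 ))
    if [ "$state" -eq 0 ]; then
        echo "OK"                          # zeroes only
    elif [ "$state" -eq $(( 0x0000C000 )) ]; then
        echo "DROP_PENDING"                # value named in this APAR
    else
        echo "OTHER"                       # not covered by this APAR
    fi
}
```

For example, `classify_ts_state 0x0000C000` reports the DROP PENDING case that calls for the restart command shown in the steps above.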
    

Problem summary

  • Problem Summary:
    See APAR Error description
    
    Users Affected:
    Customers running the Accelerator deployed on IBM Z in a
    multi-node cluster that uses NVMe (Non-Volatile Memory
    Express) storage for TEMPSPACE1.
    
    Problem Scenario:
    See APAR Error description.
    
    Problem Symptoms:
    See APAR Error description.
    

Problem conclusion

  • Conclusion:
    The issue has been fixed with Accelerator maintenance level
    7.5.12.3.
    
    Upgrade your Accelerator environments accordingly.
    

Temporary fix

Comments

APAR Information

  • APAR number

    PH62082

  • Reported component name

    ANYTCS ACCLTR Z

  • Reported component ID

    5697DA700

  • Reported release

    750

  • Status

    CLOSED DOC

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2024-06-27

  • Closed date

    2024-07-21

  • Last modified date

    2024-10-03

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

Applicable component levels

  • Business Unit: Systems - zSystems software (BU011)
  • Product: SG19M
  • Platform: z Systems (PF054)
  • Version: 750

Document Information

Modified date:
03 October 2024