IBM Support

PH58964: HIGH PERFORMANCE STORAGE SAVER: WITH MNT LEVEL 7.5.12, ARCHIVING OF TABLES OR TABLE PARTITIONS CAN RESULT IN AN EXCEPTION


APAR status

  • Closed as documentation error.

Error description

  • When using the High Performance Storage Saver with Accelerator
    maintenance level 7.5.12(.1), processing sequences such as
    a. ARCHIVE_TABLES -> RESTORE_TABLES -> ARCHIVE_TABLES
    or
    b. Partial reload of a table partition -> Archive partition
    can result in an exception (AssertionFailedException) if no full
    ReorgDaemon run has completed
    - prior to the second call of ARCHIVE_TABLES (for a.)
    - prior to the call of ARCHIVE_TABLES (for b.).
    
    The exception occurs
    - as the affected table partitions are in a state that does not
     allow for successful archiving: **PendingDelete**.
    - because of a timing issue. The housekeeping service
     (ReorgDaemon) did not finish its cleanup work in time before
     the final Archive process step started.
    
    The exception is accompanied by the following additional
    information:
    "(<date> <time>) (ERROR) (Task: nnnn; Archive Tables) (Thread:
    70363694099856) (Component: CATALOG) (EXCEPTION)
    /home/dwabuild/workspace/catalog/sources/operations/internal/PartitionHandler.cpp:312
    AssertionFailedException: An assertion '( false )' failed.
    Additional info: Partition Partition ID <p> (mapped to backend
    ID xxxxxx; ending at 'yyy') of table version <n> (<table
    identifier>) is already in state **PendingDelete** and cannot be
    set as invisible".
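
To check whether a captured log excerpt carries the message quoted above, a pattern match along the following lines may help. This is a minimal sketch, not an IBM tool: the function name `find_ph58964_signature` and the group names are made up for illustration, and the pattern assumes the message text appears exactly as quoted in this APAR (line wrapping aside).

```python
import re

# Assumed signature, pieced together from the message quoted in this APAR.
# Optional asterisks tolerate the emphasis markers used around PendingDelete
# in this document; a real log is not expected to contain them.
SIGNATURE = re.compile(
    r"AssertionFailedException: An assertion '\( false \)' failed\. "
    r"Additional info: Partition Partition ID (?P<partition>\S+) "
    r"\(mapped to backend ID (?P<backend>\S+); ending at '(?P<limit>[^']*)'\) "
    r"of table version (?P<version>\S+) \((?P<table>[^)]*)\) "
    r"is already in state \**PendingDelete\** and cannot be set as invisible"
)


def find_ph58964_signature(log_text):
    """Return the fields of the first matching message, or None if absent."""
    # Log lines may be wrapped at arbitrary points, so collapse all whitespace.
    collapsed = " ".join(log_text.split())
    match = SIGNATURE.search(collapsed)
    return match.groupdict() if match else None
```

A non-`None` result gives the partition ID, backend ID, and table version named in the message, which are the values needed for the ReorgDaemon checks described under "Local fix".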
    
    Additional keywords:
    HPSS GH/Everest/customer-cases/issues/625 #idaa7512 regression
    Archive_Tables Restore_Tables AssertionFailedException
    ReorgDaemon partition
    

Local fix

  • **Avoiding the occurrence of the AssertionFailedException
    issue:**
    Concerning scenario "a":
    Wait an hour (several hours under high workload) before you
    start the final archive step.
    
    Concerning scenario "b":
    A partial reload preceding the archive is not required, because
    the archive process initially refreshes the data of the partition
    to be archived by downloading the respective data from Db2 for
    z/OS.
    If your process flow requires the partial reload, wait an hour
    (several hours under high workload) before you start the
    archive process.
    
    **How to recover from the AssertionFailedException issue and
    achieve the desired successful archiving:**
    1. Wait an hour (in case of high workload several hours) after
    the AssertionFailedException occurred
    2. Start the final archive step once more.
    
    Attention:
    If the archive run in step 2 finishes successfully, the table
    partitions
    - will no longer be available for query acceleration AND
    - will not be available as archived table partitions.
    Steps 3 to 5 are therefore still required.
    
    3. Run RESTORE_ARCHIVE_TABLES
    4. Wait an hour (in case of high workload several hours)
    5. Run ARCHIVE_TABLES once more.
    If step 5 has finished successfully, the affected partitions
    will be available as archived table partitions.
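
The recovery steps above can be sketched as a small driver. This is only an illustration: `archive` and `restore` are hypothetical callables standing in for however your site invokes ARCHIVE_TABLES and RESTORE_ARCHIVE_TABLES, and the one-hour default mirrors the minimum wait named above (use several hours under high workload).

```python
import time


def recover_archiving(archive, restore, wait_seconds=3600, sleep=time.sleep):
    """Run recovery steps 1 to 5 in order.

    `archive` and `restore` are hypothetical, caller-supplied callables.
    Any exception they raise is passed on to the caller unchanged.
    """
    sleep(wait_seconds)  # step 1: give the ReorgDaemon time to clean up
    archive()            # step 2: repeat the final archive step
    restore()            # step 3: RESTORE_ARCHIVE_TABLES
    sleep(wait_seconds)  # step 4: wait again
    archive()            # step 5: ARCHIVE_TABLES once more
```

The `sleep` parameter exists only so that the wait can be replaced in a dry run; the sketch does not verify that the ReorgDaemon has actually finished.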
    
    IBM support information:
    How to determine whether a ReorgDaemon run has successfully
    processed a table partition that is to be archived:
    - on the customer system during a _Webex_:
     1. go into the `dashDB` container
     2. open `/head/dwa/var/log/profiling/ReorgDaemon` (if the
        first archiving was today in UTC time;
        otherwise, extract one of the earlier
        `ReorgDaemon.yyyy.mm.dd.n.zip` files using `unzip
        <file-name>`).
     3. check whether there is a `<ReorgJob task="nnn">` entry
        - with a `<DetectionTimestamp>` (in UTC) after the first
          archiving (sequence a) or after the partial reload
          (sequence b)
        - for the `<Table>` in question
        - with a `<TableVersion>` that does NOT contain the
          substring `-ARCHIVE_DATA-`
        - with an element `<Partitions numInReorgItem="nnn"
          numWithExpiredDeleteTickets="nnn"
          numDeletedInBackend="nnn"
          numRemovedFromCatalog="nnn"
          success="true" >`
        - where the `<Partitions>` element lists the partitions in
          question (each partition is listed as
          `client-partition-ID -> backend-partition-ID`, and the
          `client-partition-ID` is the number of that partition in
          Db2 for z/OS).
     4. if that is the case, then the `ReorgDaemon` removed the
        partitions in question.
    
    - in a trace archive from the accelerator:
     1. look into the `catalog.dump`
     2. find the `<Table>` element of the table in question
     3. find the `<LoadTableVersions number="x" active="y" >`
        element for that `<Table>` element
     4. Find the `<TableVersion version="x" ...>` element for that
        `<LoadTableVersions>` element where the `version` is equal
        to the `active` from the `<LoadTableVersions>` (so, it is
        the currently active load table version).
     5. Look through the `<TableVersion>` element for
        `<Partition>` elements with `<State>PendingDelete</State>`.
     6. If there are NO such `<Partition>` elements for any of the
        `<ClientPartitionID>`s in question (there can be multiple
        `<Partition>` elements with the same
        `<ClientPartitionID>` in the active load table version),
        then the `ReorgDaemon` removed the partitions in question.
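
Steps 3 to 6 of the `catalog.dump` check could be scripted roughly as follows. This is a sketch under assumptions that this APAR does not confirm: that the relevant part of `catalog.dump` parses as XML, that `<TableVersion>` elements are nested inside `<LoadTableVersions>`, and that `<ClientPartitionID>` and `<State>` are direct children of `<Partition>`. The function name is made up, and the caller is expected to have located the `<Table>` element (step 2) already.

```python
import xml.etree.ElementTree as ET


def pending_delete_partitions(table_elem, client_partition_ids):
    """Return the client partition IDs that are still PendingDelete
    in the active load table version of the given <Table> element."""
    wanted = {str(p) for p in client_partition_ids}
    still_pending = set()
    for versions in table_elem.iter("LoadTableVersions"):
        active = versions.get("active")
        for version in versions.iter("TableVersion"):
            # Only the currently active load table version matters (step 4).
            if version.get("version") != active:
                continue
            for partition in version.iter("Partition"):
                # Assumed layout: both values are direct child elements.
                client_id = (partition.findtext("ClientPartitionID") or "").strip()
                state = (partition.findtext("State") or "").strip()
                if client_id in wanted and state == "PendingDelete":
                    still_pending.add(client_id)
    return still_pending
```

An empty result corresponds to the outcome described in step 6. A non-empty result names the partitions for which the cleanup is still outstanding.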
    
    If the `ReorgDaemon` removed the partitions in question, then
    the following archiving will not run into the
    `AssertionFailedException` that is described by this APAR.
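
The criteria from step 3 of the profiling-log check (the first procedure above) can likewise be expressed as a predicate on a single `<ReorgJob>` element. Again a sketch only: it assumes the profiling file parses as XML, that `<DetectionTimestamp>`, `<Table>`, `<TableVersion>` and `<Partitions>` are direct children of `<ReorgJob>`, that the table name appears in the text of `<Table>`, that partition IDs are numeric, and that the timestamp is in ISO 8601 form. Pass a different `parse_timestamp` if the actual format differs. The function name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

# Matches "client-partition-ID -> backend-partition-ID"; keeps the client ID.
PAIR = re.compile(r"(\d+)\s*->\s*\d+")


def reorg_removed_partitions(job, table_name, partition_ids, not_before,
                             parse_timestamp=datetime.fromisoformat):
    """Return True if this <ReorgJob> meets every criterion from step 3."""
    detected = job.findtext("DetectionTimestamp")
    if detected is None or parse_timestamp(detected.strip()) <= not_before:
        return False  # job was detected before the archiving or partial reload
    if table_name not in (job.findtext("Table") or ""):
        return False  # job is for a different table
    if "-ARCHIVE_DATA-" in (job.findtext("TableVersion") or ""):
        return False  # table version must NOT contain this substring
    partitions = job.find("Partitions")
    if partitions is None or partitions.get("success") != "true":
        return False
    listed = set(PAIR.findall("".join(partitions.itertext())))
    return {str(p) for p in partition_ids} <= listed
```

Given a parsed file, the predicate would be applied to each element returned by `root.iter("ReorgJob")`. One job for which it returns `True` corresponds to the outcome described in step 4.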
    

Problem summary

  • Problem Summary:
    See APAR Error description.
    
    
    Users Affected:
    Users of the High Performance Storage Saver function.
    
    Problem Scenario:
    See APAR Error description.
    
    
    Problem Symptoms:
    See APAR Error description.
    

Problem conclusion

  • The issue has been fixed with Accelerator maintenance level
    7.5.12.2.
    
    Upgrade your Accelerator environment(s) accordingly.
    

Temporary fix

Comments

APAR Information

  • APAR number

    PH58964

  • Reported component name

    ANYTCS ACCLTR Z

  • Reported component ID

    5697DA700

  • Reported release

    750

  • Status

    CLOSED DOC

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2023-12-31

  • Closed date

    2024-09-17

  • Last modified date

    2024-09-17

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

Applicable component levels


Document Information

Modified date:
17 September 2024