APAR status
Closed as documentation error.
Error description
When using the High Performance Storage Saver with Accelerator maintenance level 7.5.12(.1), processing sequences such as

a. ARCHIVE_TABLES -> RESTORE_TABLES -> ARCHIVE_TABLES, or
b. Partial reload of a table partition -> Archive partition

can result in an exception (AssertionFailedException) if no full ReorgDaemon run has happened
- prior to the second call of ARCHIVE_TABLES (for a.)
- prior to the call of ARCHIVE_TABLES (for b.).

The exception occurs
- as the affected table partitions are in a state that does not allow for successful archiving: **PendingDelete**.
- because of a timing issue. The housekeeping service (ReorgDaemon) did not finish its cleanup work in time before the final Archive process step started.

The exception is accompanied by the following additional information:

"(<date> <time>) (ERROR) (Task: nnnn; Archive Tables) (Thread: 70363694099856) (Component: CATALOG) (EXCEPTION) /home/dwabuild/workspace/catalog/sources/operations/internal/PartitionHandler.cpp:312 AssertionFailedException: An assertion '( false )' failed. Additional info: Partition Partition ID <p> (mapped to backend ID xxxxxx; ending at 'yyy') of table version <n> (<table identifier>) is already in state **PendingDelete** and cannot be set as invisible".

Additional keywords: HPSS GH/Everest/customer-cases/issues/625 #idaa7512 regression Archive_Tables Restore_Tables AssertionFailedException ReorgDaemon partition
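When scanning accelerator logs for this symptom, it can help to pull the partition and table details out of the message automatically. The following is a minimal sketch, assuming the message text has the shape quoted above; the function name `parse_pending_delete_error` and the sample values in the usage note are hypothetical and not part of any IBM tooling.

```python
import re

# Pattern built from the message quoted in this APAR. The group names are
# illustrative labels, not official field names.
PENDING_DELETE_RE = re.compile(
    r"AssertionFailedException:.*?"
    r"Partition ID (?P<partition>\S+) "
    r"\(mapped to backend ID (?P<backend>[^;]+); ending at '(?P<limit>[^']*)'\) "
    r"of table version (?P<version>\S+) \((?P<table>.*?)\) "
    r"is already in state PendingDelete"
)


def parse_pending_delete_error(log_text):
    """Return the message details as a dict, or None if the text does not match."""
    # Collapse whitespace so a message wrapped across several lines still matches.
    flat = " ".join(log_text.split())
    match = PENDING_DELETE_RE.search(flat)
    return match.groupdict() if match else None
```

For example, a line ending in `AssertionFailedException: An assertion '( false )' failed. Additional info: Partition Partition ID 7 (mapped to backend ID 100007; ending at '2020-12-31') of table version 3 (SCHEMA.TAB) is already in state PendingDelete and cannot be set as invisible` (made-up values) yields a dict with `partition` set to `"7"` and `version` set to `"3"`.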
Local fix
**Avoiding the occurrence of the AssertionFailedException issue:**

Concerning scenario "a": Wait an hour (in case of high workload, several hours) before you start the final archive step.

Concerning scenario "b": A partial reload preceding the archive is not required, because the archive process initially refreshes the data of the partition to be archived by downloading the respective data from Db2 for z/OS. If your process flow requires the partial reload, wait an hour (in case of high workload, several hours) before you start the archive process.

**How to recover from the AssertionFailedException issue and achieve the desired successful archiving:**

1. Wait an hour (in case of high workload, several hours) after the AssertionFailedException occurred.
2. Start the final archive step once more.
   Attention: If this archive run finishes successfully, the table partitions
   - will no longer be available for query acceleration AND
   - will not be available as archived table partitions.
3. Run RESTORE_ARCHIVE_TABLES.
4. Wait an hour (in case of high workload, several hours).
5. Run ARCHIVE_TABLES once more.

If step 5 has finished successfully, the affected partitions will be available as archived table partitions.

IBM support information:

How to determine whether a ReorgDaemon run has successfully processed a table partition that is to be archived:

- on the customer system during a _Webex_:
  1. Go into the `dashDB` container.
  2. Open `/head/dwa/var/log/profiling/ReorgDaemon` (if the first archiving was today in UTC time; otherwise, extract one of the earlier `ReorgDaemon.yyyy.mm.dd.n.zip` files using `unzip <file-name>`).
  3. Check whether there is a `<ReorgJob task="nnn">` entry
     - with a `<DetectionTimestamp>` (in UTC) after the first archiving (sequence a) or after the partial reload (sequence b)
     - for the `<Table>` in question
     - with a `<TableVersion>` that does NOT contain the substring `-ARCHIVE_DATA-`
     - with an element `<Partitions numInReorgItem="nnn" numWithExpiredDeleteTickets="nnn" numDeletedInBackend="nnn" numRemovedFromCatalog="nnn" success="true" >`
     - where the `<Partitions>` element lists the partitions in question (each partition is listed as `client-partition-ID -> backend-partition-ID`, and the `client-partition-ID` is the number of that partition in Db2 for z/OS).
  4. If that is the case, then the `ReorgDaemon` removed the partitions in question.
- in a trace archive from the accelerator:
  1. Look into the `catalog.dump`.
  2. Find the `<Table>` element of the table in question.
  3. Find the `<LoadTableVersions number="x" active="y" >` element for that `<Table>` element.
  4. Find the `<TableVersion version="x" ...>` element for that `<LoadTableVersions>` element where the `version` is equal to the `active` value from the `<LoadTableVersions>` element (that is, the currently active load table version).
  5. Look through the `<TableVersion>` element for `<Partition>` elements with `<State>PendingDelete</State>`.
  6. If there are NO such `<Partition>` elements for any of the `<ClientPartitionID>`s in question (there can be multiple `<Partition>` elements with the same `<ClientPartitionID>` in the active load table version), then the `ReorgDaemon` removed the partitions in question.

If the `ReorgDaemon` removed the partitions in question, then the following archiving will not run into the `AssertionFailedException` that is described by this APAR.
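The trace-archive check above can be sketched as a small script. This is only an illustration under stated assumptions: that the relevant part of `catalog.dump` can be parsed as well-formed XML, that `<TableVersion>` elements are nested inside `<LoadTableVersions>`, and that `<State>` and `<ClientPartitionID>` are child elements of `<Partition>`. The function name `pending_delete_partitions` is hypothetical, and locating the right `<Table>` element is left to the caller because this APAR does not say how a table is identified in the dump.

```python
import xml.etree.ElementTree as ET


def pending_delete_partitions(table_elem, client_partition_ids):
    """Return the client partition IDs that still have a PendingDelete
    partition in the active load table version of the given <Table> element.

    An empty result corresponds to step 6: ReorgDaemon removed the partitions.
    """
    wanted = {str(p) for p in client_partition_ids}
    still_pending = set()
    for ltv in table_elem.iter("LoadTableVersions"):
        active = ltv.get("active")
        for tv in ltv.iter("TableVersion"):
            # Only the currently active load table version matters (step 4).
            if tv.get("version") != active:
                continue
            for part in tv.iter("Partition"):
                state = (part.findtext("State") or "").strip()
                cpid = (part.findtext("ClientPartitionID") or "").strip()
                if state == "PendingDelete" and cpid in wanted:
                    still_pending.add(cpid)
    return sorted(still_pending)


# Made-up fragment for illustration; "Loaded" is a placeholder state name.
SAMPLE = """
<Table>
  <LoadTableVersions number="2" active="2">
    <TableVersion version="1">
      <Partition><ClientPartitionID>4</ClientPartitionID><State>PendingDelete</State></Partition>
    </TableVersion>
    <TableVersion version="2">
      <Partition><ClientPartitionID>3</ClientPartitionID><State>PendingDelete</State></Partition>
      <Partition><ClientPartitionID>3</ClientPartitionID><State>Loaded</State></Partition>
      <Partition><ClientPartitionID>4</ClientPartitionID><State>Loaded</State></Partition>
    </TableVersion>
  </LoadTableVersions>
</Table>
"""

if __name__ == "__main__":
    table = ET.fromstring(SAMPLE.strip())
    print(pending_delete_partitions(table, {3, 4}))
```

In the sample, partition 3 is reported because one of its entries in the active version is still `PendingDelete`, while the `PendingDelete` entry for partition 4 is ignored because it belongs to an inactive table version.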
Problem summary
Problem Summary: See APAR Error description. Users Affected: Users of the High Performance Storage Saver function. Problem Scenario: See APAR Error description. Problem Symptoms: See APAR Error description.
Problem conclusion
The issue has been fixed with Accelerator maintenance level 7.5.12.2. Upgrade your Accelerator environment(s) accordingly.
Temporary fix
Comments
APAR Information
APAR number
PH58964
Reported component name
ANYTCS ACCLTR Z
Reported component ID
5697DA700
Reported release
750
Status
CLOSED DOC
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2023-12-31
Closed date
2024-09-17
Last modified date
2024-09-17
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Applicable component levels
Document Information
Modified date:
17 September 2024