There are two APARs to point out if you are using QRep or SQL Replication (now referred to as InfoSphere Replication Server) and are looking to upgrade DB2 for z/OS.
The first, PM96954, addresses several issues customers have seen, most notably high CPU overhead in Qapply when the same row is updated repeatedly, as well as Qapply ending with ASN0543E due to changes occurring in the DB2 catalog during an online migration.
PM96954 – Address QRep failures during DB2 10 CM and NFM migration
Remember that if you plan on going to DB2 11 you must be on InfoSphere Replication Server V10.2.1 prior to migrating to V11 CM mode, due to the 10-byte extended RBA values. The IFCID 306 record changes in CM mode, and any product that reads these records will need to be on the appropriate maintenance level.
There are two APARs for DB2 10 which could affect Class 3 wait time in DB2.
The first was a fix for an ISO(UR) application not returning recently updated rows. If updates to a GBP-dependent object create overflow records, the side effect of this APAR is that each overflow page results in a forced log write and a forced write of the overflow page to the coupling facility. If this happens many times in, for instance, a batch application, the Log Write I/O suspension time can become a noticeable performance degradation. The way to avoid this Log Write I/O is to avoid writing overflow records in the first place by increasing PCTFREE on the object. Overflow records can occur with compressed rows or with VARCHAR fields that change length on update, such that the row no longer fits back into its original place on the data page.
PM82279 – closed a gap for ISO(UR) readers when a table space is GBPDEP, but this will force the write of the overflow page and log record if the update causes the row to overflow
Here is a simple SELECT statement to determine the number of overflow records in an object. By determining the percentage of rows that have overflowed, you can decide how much free space should be preserved in the object to avoid them. In DB2 11 the PCTFREE FOR UPDATE clause can help with this issue and avoid the need for a REORG.
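The following is a sketch only; it assumes the real-time statistics counters REORGNEARINDREF and REORGFARINDREF (rows relocated near/far from their home page since the last REORG) as the measure of overflow records:

-- Overflow (indirect-reference) records per table space, from RTS
SELECT DBNAME, NAME, PARTITION,
       REORGNEARINDREF + REORGFARINDREF AS OVERFLOW_RECS,
       CASE WHEN TOTALROWS > 0
            THEN DEC((REORGNEARINDREF + REORGFARINDREF) * 100.0 / TOTALROWS, 7, 2)
            ELSE 0
       END AS PCT_OVERFLOW
  FROM SYSIBM.SYSTABLESPACESTATS
 WHERE REORGNEARINDREF + REORGFARINDREF > 0
 ORDER BY PCT_OVERFLOW DESC;

On DB2 11, a hypothetical mitigation (the object name is invented) would then be:

ALTER TABLESPACE DBTEST01.TSTEST01 PCTFREE 10 FOR UPDATE 10;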
This APAR addresses high Other Write I/O due to space map pages taking up space on the vertical deferred write queue (VDWQT) and causing more frequent writes. The space map pages themselves are not changed and will not be written out, but a page P-lock is taken against them, especially when MEMBER CLUSTER is used and the object is GBP dependent. The APAR ensures those space map pages are not left on the VDWQT queue.
PI07513 - VDWQT being hit frequently due to unchanged pages being left on the queue resulting in high Other Write I/O
If an application uses multi-row insert against a partition-by-growth universal table space, the getpage count can be unusually high. This APAR adjusts the space search algorithm as it pertains to multi-row insert.
"During the exhaustive search prior to the physical extend of the data set, the Multi-Row insert operation encounters a high get page count. In this case, insert operation fails to find available space to insert and will search the same set of space map pages or data pages for each insert operation within the same Muli-Row insert statement."
PI12398 – slow insert performance on PBG UTS with multi-row insert
Customers requested a new object state that would allow readers access to the data while eliminating the need for a -STOP or -START command to place the object in read-only status. This Persistent Read Only (PRO) status causes UPDATE/INSERT/DELETE statements to fail with SQLCODE -904 and reason code 00C90635. ALTER ROTATE PARTITION, as well as certain utilities, will fail if there are rows in that partition.
PM95731 – new PRO (Persistent Read Only) Status to keep an object in read only status
Recommended procedure for setting the PRO restricted status on a table space partition:
1. -STOP DB() SP() PART() - wait for the command to complete successfully, and verify that the object is in STOP status (not STOPP) with the -DIS DB command.
2. -START DB() SP() PART() ACCESS(UT)
3. Create two full image copies of the table space partition with COPY SHRLEVEL REFERENCE.
4. Use the REPAIR utility to turn on PRO status (see the sketch after this list).
5. -START DB() SP() PART() ACCESS(RW)
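As a sketch, the step-4 REPAIR control statement is expected to look something like the following (the database, table space, and partition names are placeholders; verify the exact keywords against the APAR text and the updated REPAIR utility documentation):

REPAIR SET TABLESPACE DBTEST01.TSTEST01 PART 5 PRO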
Along with the preceding APAR, the lab has modified the DSNACCOX stored procedure to handle the new PRO status and avoid unnecessary utility suggestions.
PI15366 – adjust DSNACCOX to avoid utilities on objects in persistent read only status
With the increased attention on auditing capabilities around the DB2 engine, this new feature adds the capability to do some simple auditing using the system-time capability of temporal tables, as well as some generated expression columns.
APAR text: " The PTFs for PM99683 (the preconditioning APAR), PI15298 (the enabling APAR), and PI15666 (the LOAD utility feature APAR) deliver integrated auditing support using non-deterministic generated expression columns to allow for automatic tracking of some audit information including: a. who modified the data in the table b. what SQL operation modified the data in the table."
PM99683 – new auditing capability in DB2 11 with profile tables
High Performance DBATs were introduced in DB2 10. To utilize this feature you must have the JCC packages (for example, SYSLN200) bound with RELEASE(DEALLOCATE), as well as the -MODIFY DDF PKGREL(BNDOPT) option in place.
This allows distributed requests to benefit from the performance advantages of not going through deallocation after each commit. Use caution when employing this option, as you would not want every distributed application coming in as RELEASE(DEALLOCATE) and using up all of the available DBATs. You can control this more granularly if you bind the dynamic JCC packages into an alternate collection and then allow specific applications to use it by having them specify that collection in CurrentPackageSet.
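As an illustration, a hypothetical alternate collection (JCCHIPERF is an invented name; NULLID.SYSLN200 is one of the standard JCC packages) could be bound like this, repeating for each of the JCC packages:

BIND PACKAGE(JCCHIPERF) COPY(NULLID.SYSLN200) ACTION(REPLACE) -
     RELEASE(DEALLOCATE)

Selected applications then pick up the RELEASE(DEALLOCATE) copies by setting the JCC currentPackageSet property (or issuing SET CURRENT PACKAGESET = 'JCCHIPERF'), while all other applications continue to use the default NULLID packages.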
APAR PI20352 was opened because increased DDF SRB time was sometimes seen when the thread was deallocated prior to the 200th use. To alleviate this, the code has been modified to allow the High Performance DBAT to be pooled even if it has not reached the 200-use mark. The POOLINAC timeout value can be used to limit the time the DBAT remains in the pool.
PI20352 – high performance DBATs increase DDF SRB time if not reused 200 times
This only affects new installations of DB2 11, not migrations.
After a new Version 11 installation of DB2 has created approximately 6534 archive log data set pairs, the offload task can begin failing with MSGDSNJ116I ERROR ATTEMPTING TO ADD ARCHIVE ENTRY TO BSDS. To bypass this problem, you can reduce the maximum number of archive log entries in the BSDS data sets by changing the DSN6LOGP MAXARCH value to 6500 in DSNZPARM.
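For illustration only, the relevant keywords in the DSN6LOGP macro invocation in your DSNTIJUZ job would look like the fragment below (all other keywords omitted); after the change, reassemble and relink the DSNZPARM module as usual:

DSN6LOGP MAXARCH=6500,TWOARCH=YES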
PI22934 – DSNJU102 not creating enough BSDS entries to support 10,000 archives
From the APAR text, here is how to determine if you are affected:
DSNJU102 does not create enough log data set records in the
BSDS data sets to support 10,000 archive log data set pairs
when the BSDS data sets are formatted under DB2 Version 11.1.
To verify that this problem has occurred, print the BSDS
control record (key = 04000001) with IDCAMS.
//STEP1 EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//BSDS DD DISP=SHR,DSN=DSNC000.DB2A.BSDS01
//SYSIN DD *
  PRINT INFILE(BSDS) FROMKEY(X'04000001') COUNT(1)
/*
From the output, the BCR1HIKY value is at offset x'18'
into this record. If the BCR1HIKY value is 040001E7, the BSDS
was not formatted correctly. If DSNZPARM specifies a DSN6LOGP
MAXARCH value over 6534 and TWOARCH=YES, the log offload task
could start failing with MSGDSNJ116I after approximately 6534
archive log pairs have been added to the BSDS data sets. If
the BCR1HIKY value is 040002F0 or 040002F1, then the BSDS
data sets will have enough data set records to support up to
the maximum of 10,000 archive log data set pairs and no
action is needed.
DB2 10 added a new column, REORGCLUSTERSENS, to the RTS table SYSIBM.SYSTABLESPACESTATS. It indicates the number of times data has been read by SQL statements that are sensitive to the clustering sequence since the last REORG or LOAD REPLACE, and is used by DSNACCOX to determine whether a REORG is required. Once in NFM, creating a table space or running REORG or LOAD REPLACE sets REORGCLUSTERSENS to zero. However, the ENFM process sets this column to NULL for existing table spaces, which means they will never be considered for REORG by DSNACCOX until you run a REORG or LOAD REPLACE to initially populate REORGCLUSTERSENS. The recommended work-around is to set REORGCLUSTERSENS for existing table spaces to a non-zero value after ENFM.
PI22121 – RTS column REORGCLUSTERSENS is set to NULL during NFM processing for existing tablespaces
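A minimal sketch of that work-around, assuming you have the authority to update the RTS tables and choosing 1 as an arbitrary non-zero value:

-- One-time fix-up after ENFM so DSNACCOX will again consider
-- existing table spaces for REORG
UPDATE SYSIBM.SYSTABLESPACESTATS
   SET REORGCLUSTERSENS = 1
 WHERE REORGCLUSTERSENS IS NULL;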
Many customers these days are utilizing DASD mirroring solutions as well as HyperSwap technology to automate fail-over to an alternate site or to local DASD hardware in the event of a failure or disaster. z/OS APAR OA31707 was put out to aid in the event of a fail-over by ensuring that any pages the system might need during the swap are not paged out to AUX.
From OA31707: "During a Hyperswap, it is possible for the system to require page fault resolution via page devices that may be part of the scope of devices being recovered by the Hyperswap. If this occurs, it is possible that a page fault will not be able to be resolved leading to deadlock and Hyperswap failures."
The downside of this is a massive amount of page-fixed storage, which includes the following:
31-bit common storage (both above and below the 16M line)
Address spaces that are defined as critical for paging
All data spaces associated with those address spaces that are critical for paging (unless CRITICALPAGING=NO was specified on the DSPSERV CREATE)
Pageable link pack area (PLPA)
All HVCOMMON objects
All HVSHARED objects
The purpose of this entry is to ensure customers are aware of the effect on real storage when this function is enabled, and to plan for it in advance. A system that is already running lean on REAL storage may see increased demand paging once this function is enabled, which can lead to DB2 entering DISCARD MODE (contraction) due to the REALSTORAGE_MANAGEMENT setting. Toggling in and out of DISCARD MODE frequently will increase CPU consumption on the LPAR and can result in parts of the DB2 address space (not protected by CRITICAL PAGING) being pushed out to AUX.
If you have page fixed your buffer pools then the vast majority of the DBM1 PRIVATE address space will never be paged to AUX either, so you could end up with a severe shortage of REAL storage on the LPAR.
You can issue the D XCF,COUPLE command to determine if the function is enabled.
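For example, assuming your sysplex couple data sets are formatted at a level that supports optional functions, the check and the sysplex-wide enablement look like this:

D XCF,COUPLE
SETXCF FUNCTIONS,ENABLE=CRITICALPAGING

In the D XCF,COUPLE output, look for CRITICALPAGING in the optional function status section.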
Further important information about the protection provided by this APAR and the service it introduces can be found as follows:
OA31707 – CRITICALPAGING introduced to harden page sets deemed important to HyperSwap
Some of you might have noticed that IBM Knowledge Center does not include topic footers that provide a link to the information in PDF format. However, the DB2 for z/OS information is still available in PDF format.
The DB2 for z/OS Information Center has been replaced by the DB2 11 for z/OS Knowledge Center as the source for all official DB2 for z/OS product information on the web. The IBM Knowledge Center also provides links to additional information resources and alternative documentation formats, such as the PDF manuals in the IBM Publications Center. You should update your bookmarks to ensure that you move from the Information Center to the better-performing Knowledge Center.
It is a common belief that DB2 10 and 11 for z/OS can only use 1MB pageable large pages (other than for buffers) if the CEC has Flash Express installed. For example, the text for APAR PM85944 implies just that. However, this is not true. The only requirement is that the CEC be SCM-capable (SCM = Storage-Class Memory); it does not matter whether Flash Express is actually installed. So if a customer is running on a zEC12 CEC without Flash Express, DB2 could request, and be given, a 1MB pageable large page, residing in a 1MB large page frame. However, if that page needs to be paged out for some reason, at that point it will be broken down into 4KB page frames. To put it another way, pageable large pages are available on an SCM-capable CEC such as the zEC12, with the caveat that if Flash Express is not installed, any of those pages that are paged out will be demoted to 4KB page frames. 1MB pageable large pages that are demoted to 4KB page frames as they are paged out will never be coalesced: that is, they will remain 4KB page frames for the remaining life of the IPL.
Watch out for open APAR PI18475. This describes a situation where IRLM V2.3 fails with ABEND0C4 in DXRRL770 after PI07853/UI14551 is applied.
The error occurs during P-lock processing. The full range of symptoms is unknown, but they include DB2 termination with reason code 00C90093 (indicating an invalid request to update an object allocated to a utility), preceded by message DSNT501I with reason code 00C202AA.
However, be aware of the warning implied in the APAR text indicating that "other symptoms may be possible". It is strongly recommended that you monitor this APAR and apply the corrective maintenance when available.
The last thing you want to deal with when in a recovery situation is a failed RECOVER, so keep an eye on APAR PI17986, which was closed very recently. This affects DB2 10 and DB2 11 customers who have APAR PM88455 (PTF UK97229/UK97230) applied. The APAR abstract indicates that this affects customers who use RECOVER TOCOPY, but closer examination of the APAR text reveals that it can also affect customers who use RECOVER TOLOGPOINT or TORBA; in other words, this can affect any PIT recovery.
The RECOVER can fail with RC 8, and something like the following message is issued:
DSNU556I - RECOVER CANNOT PROCEED FOR TABLESPACE db.ts DSNUM x
TOCOPY copy_data_set BECAUSE A SYSIBM.SYSCOPY RECORD HAS BEEN
ENCOUNTERED WHICH HAS DBNAME=db TSNAME=ts DSNUM=y ICTYPE=M
STYPE=R STARTRBA=X'nnnnnnnnnnnn' LOWDSNUM=0 HIGHDSNUM=0
Although the APAR is closed, the PTFs are not yet available. It is strongly advised that you monitor this APAR and apply the corrective maintenance when available.
Without this fix applied, when REORG FORCE cancels a thread, the cancel operation picks up a control block address without holding a latch, and when that control block address is used for thread cancellation, it might actually have been reused by a different thread. So there are two possibilities: the wrong thread may get cancelled; and, if that is the case, depending on what that cancelled thread is doing, DB2 may be brought down. The fix causes a latch to be acquired when picking up the control block address, ensuring that the right thread is cancelled and protecting you from a DB2 outage.
Many customers experience problems when allocating large data sets, because the data sets often end up being allocated in many extents. This can affect performance and availability through high-frequency extent allocation or, in extreme cases, lead to extent allocation failure. The following is a partial rework of information provided by John Campbell, IBM DB2 for z/OS Distinguished Engineer, in response to a real customer problem.
At a high level, the recommended best practice for managing large data sets is for customers to use both Space Constraint Relief and Extent Constraint Removal. That is the easy part of the answer. The difficult part is understanding the implications of setting PRIQTY=-1 for large objects.
In terms of volume maintenance, the vast majority of customers do not DEFRAG their volumes; sometimes this is just impossible because of 24x7 operations, but sometimes customers do not even know they need to. Consider the impact if you have fragmented volumes and DB2 is using the sliding scale of secondary allocation. DB2 might, for example, allocate 7 cylinders, and then work out that it needs to allocate 8 cylinders on the next allocation using the sliding scale, but the volume is so fragmented that another extent has to be found because the adjacent space does not hold 8 cylinders, therefore resulting in many extents.
If you find yourself in this situation, there are some questions you need to answer and considerations to take into account:
What 3390 model type are you emulating? You might be struggling to obtain a PRIQTY of 36000, for example, which might seem quite big but is only 500 cylinders and is not, in fact, very large. This opens up the possibility that you are still emulating 3390 Mod-3s, which are, by today's standards, very small. The larger the logical model size you are emulating, the less of an issue finding 500 cylinders should be.
What is the HIGH specification in ISMF for the SMS Storage Group? Many customers are using 99% and forcing datasets to go multi-volume, causing additional DB2 performance degradation because of the additional synchronous request to the ICF catalog for each new volume used.
Are there enough volumes in the SMS Storage Group? Many customers are running their systems lean and mean, and do not have enough logical volumes. Even if they understand that, they are often unwilling to add more volumes because of the cost.
The DSSIZE you use determines how many extents it will take to get to that 500-cylinder point via the gradual slide, or stepping stones, of the sliding scale.
For many customers there will be a 90/10 split between small and large data sets. If you assume 300 cylinders as the dividing line (an arbitrary value used for purposes of this discussion), then 90% of objects would be below 300 cylinders and 10% above. It would therefore make sense to provide two SMS Storage Groups: one for small objects and one for large objects. The 'large' pool should have some additional logical volumes as a 'buffer' for doing such things as parallel online REORGs. This way the small objects do not introduce fragmentation into the Storage Group for the large data sets, and therefore cause far fewer problems. The problem with PRIQTY -1, or a very low value, is that it will not trigger the Storage Group ACS routine to allocate the data sets in the 'large' pool, because the allocations are based on an initial space allocation which is too small. You could take individual data sets, or a set of them, and have the ACS routines place them specifically in the 'large' pool, but that would be a manual approach. Alternatively, if the installation allows DBAs to specify one of the SMS classes on CREATE/ALTER STOGROUP, the Storage Administrator can code the Storage Group ACS routine to assign the data sets to a special class and into the 'large' pool based on the class name provided. This takes away the manual effort and gives more control to DBAs.
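As a sketch of that last approach (all names here are invented; the DATACLAS/MGMTCLAS/STORCLAS keywords on CREATE STOGROUP require DB2 10 or later):

CREATE STOGROUP SGLARGE
       VOLUMES('*')
       VCAT DSNCAT
       STORCLAS SCLARGE;

The Storage Group ACS routine can then test for storage class SCLARGE and direct those data sets into the 'large' pool.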
When John was helping design sliding secondary allocation, he never envisaged PRIQTY=SECQTY=-1 as a 'one size fits all' approach to be used for all objects. If you know the allocation for an object will be large, a common approach is to allocate 1000 cylinders primary and 100 cylinders secondary, if there is enough room. This is true for online REORG of large objects as well, for those that do not use REUSE or ALTER the space before the REORG: in this case DB2 Data Space Manager goes through the sliding scale all over again, so a 2900+ cylinder 2GB data set slides from the beginning even though we know what the potential high-end size will be.
Be aware that every time a new extent is taken there is a synchronous request to the VTOC, plus the sliding scale calculation, and if the object goes multi-volume there is an additional synchronous request to the ICF catalog to add the new volume. Very few customers are aware of this performance impact.
So in summary, I will make the following recommendations:
Enable both Space Constraint Relief and Extent Constraint Removal. For more information, see z/OS DFSMSdfp Storage Administration, SC23-6860-01.
Have a segregated SMS Storage Group with additional volumes for large allocations.
Use larger emulated logical volumes.
Use a reasonable value for the ISMF Storage Group HIGH value. For more information, see z/OS DFSMS Implementing System-Managed Storage, SC23-6849-00.
This new informational APAR was opened to list recommended APARs and tuning to improve DB2 INSERT performance, based on recent observations from analysis and solutions for customers using SAP applications - but these recommendations should also apply to customers in general.
Migrating DB2 for z/OS with the business application set active
Increasingly, customers must keep their applications online and available all hours of every day, 24X7. Fortunately, the DB2 migration process has been designed specifically to allow the business application set to continue to run while you migrate. This article by Jay Yothers, IBM Senior Technical Staff Member, provides hints and tips to make your online migration successful.
This new IBM Redbooks publication discusses the performance and possible impacts of the most important functions in DB2 11 for z/OS. It includes performance measurements that were made in the laboratory and provides estimates of the improvements you can expect when moving from DB2 10.
APAR II13538: DEALING WITH HUNG COUPLING FACILITY CONNECTIONS IXLCONN REASON CODE 02010C27 0C27 02010C09 (APAR status: http://www-01.ibm.com/support/docview.wss?uid=isg1II13538)
When a DB2 member abnormally terminates, its connections to the coupling facility structures are put into a FAILING state by cross-system extended services for z/OS (XES). The FAILING DB2 member remains in this state until all surviving members of the group have responded to the XES Disconnected/Failed Connection (DiscFailConn) event for each structure. XES sends this event to each surviving member of the group so that the necessary recovery actions can be taken in response to the failed member.
For this reason it is important to perform the actions described in this informational APAR to recover from hung CF structure connections.
Other important informational APARs in this area are APAR II14016 (DB2 RA10 R910 R810 HANG WAIT SUSPEND LOOP PROBLEM SUMMARY) and, for HANG or WAIT problems in the DB2 distributed (DIST) address space, II08215 and II11164.
There are 56 Best Practices located at this site covering DB2 for z/OS, DB2 Tools, and QMF.
From the best practice pages on this site you can view the videos (streamed from YouTube), download a copy (MP4) to your local machine, download a transcript, and download slides when available.
DB2 for z/OS Best Practice: FlashCopy and DB2 for z/OS https://ibm.biz/BdRK4R
This Best Practice covers FlashCopy and its uses in conjunction with DB2 for z/OS. It gives you an overview of FlashCopy, how the DB2 utilities use it, and how you can affect the behavior in your environment. It also discusses how to use FlashCopy to take backups of data outside of DB2’s control, and how FlashCopy fits in with DASD-based replication solutions such as Metro Mirror (PPRC), z/OS Global Mirror (XRC), and Global Mirror.
In this Best Practice, Sheryl Larsen, the DB2 for z/OS worldwide evangelist, discusses structures and appliances that can be used in conjunction with DB2 to improve performance. She presents information on base table indexes, index on expression, Materialized Query Tables (MQTs), zIIPs, and Accelerated Query Tables (AQTs). Decisions and cases regarding when to use these structures are presented.
DB2 for z/OS Best Practice: Advanced SQL Performance Insights https://ibm.biz/BdRKr7
In this Best Practice, Sheryl Larsen, the DB2 for z/OS worldwide evangelist, shares many of the insights she has gained in writing efficient, well-performing SQL. She explains the stages of SQL processing and filtering and gives practical tips on which predicates might obtain faster results. The steps to review your SQL are outlined. Sheryl also reviews some of the analytic capabilities of SQL statements such as PACK, GROUP BY GROUPING SETS, GROUP BY ROLLUP, and GROUP BY CUBE.
RTS REORGLASTTIME is set to creation time without a REORG ever having been executed:
RTS tracks the last time a REORG utility was executed against a table space or partition object in SYSIBM.SYSTABLESPACESTATS.REORGLASTTIME.
However, for a newly created table space with no REORG ever executed against it, SYSIBM.SYSTABLESPACESTATS.REORGLASTTIME indicates the same timestamp as the creation time.
Why is that?
Actually, this behaviour is intended and works as designed. The rationale behind it is that a newly created, empty object is considered perfectly reorganized.
That behaviour is also documented in the DB2 documentation about SYSIBM.SYSTABLESPACESTATS.REORGLASTTIME (http://www-01.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/com.ibm.db2z11.doc.sqlref/src/tpc/db2z_sysibmsystablespacestatstable.dita?cp=SSEPEK_11.0.0%2F10-0-117&lang=en):
The timestamp the REORG utility was last run on the table space or partition, or when the REORG utility has not been run, the time when the table space or partition was created. A null value indicates that the timestamp is unknown.
A value of NULL would indicate an UNKNOWN point in time. One could argue that this is theoretically true: no REORG utility was ever executed against the table space object. However, the other rationale is that the DB2-supplied stored procedure DSNACCOX has to identify the need for a REORG based on the REORGINSERTS/REORGUPDATES/REORGDELETES values of the SYSIBM.SYSTABLESPACESTATS table. This is basically done by comparing the SYSIBM.SYSTABLESPACESTATS.UPDATESTATSTIME value and the REORGINSERTS/REORGDELETES/REORGUPDATES counters against SYSIBM.SYSTABLESPACESTATS.REORGLASTTIME. So if DB2 set SYSIBM.SYSTABLESPACESTATS.REORGLASTTIME to NULL at CREATE time, which also means UNKNOWN, DSNACCOX could not key on it. In the past we had to run REORG explicitly after create time to enable DSNACCOX.
With that said, there is also an exception. APAR PM37511 (http://www-01.ibm.com/support/docview.wss?uid=swg1PM37511) introduced a change where COPY or RUNSTATS triggers setting SYSIBM.SYSTABLESPACESTATS.REORGLASTTIME to NULL if the RTS row for the affected object did not exist before. This situation can occur when the RTS table space was offline or not accessible at CREATE TABLESPACE or CREATE INDEXSPACE time, or when the table space was created while RTS support was disabled during migration from DB2 UDB for z/OS Version 8.1.
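As an illustrative check (a sketch only; equality with SYSTABLESPACE.CREATEDTS is an approximation of "REORGLASTTIME still reflects creation time"):

-- Table spaces never REORGed since creation
SELECT ST.DBNAME, ST.NAME, ST.PARTITION, ST.REORGLASTTIME
  FROM SYSIBM.SYSTABLESPACESTATS ST
  JOIN SYSIBM.SYSTABLESPACE TS
    ON TS.DBNAME = ST.DBNAME
   AND TS.NAME   = ST.NAME
 WHERE ST.REORGLASTTIME = TS.CREATEDTS;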
A couple of recent DB2 Connect APARs have been identified as HIPER (highly pervasive) and can impact customers planning to move to DB2 Connect 10.5 Fix Pack 2 in preparation for migrating to DB2 11.
When discussing DB2 11 migration plans, please review the following DB2 Connect APARs:
APAR IC99419 CLI-BASED APPLICATIONS RECEIVE SQL0501N AGAINST DB2 Z/OS WHEN STORED PROCEDURE CALL HAS MULTIPLE CURSORS
APAR IC98222 COMPLETE SPECIAL REGISTER SUPPORT FOR DB2 Z/OS V11
APAR IC99735 GETJCCSPECIALREGISTERPROPERTIES() RETURNS SQLCODE -4743 AGAINST V11NFM WITH V10R1 APPLCOMPAT PACKAGES
If you are planning to deploy DB2 Connect 10.5 Fix Pack 2 and are susceptible to a reported problem, the recommendation is to either request a special build with the APAR fix or wait until the fix is delivered in a future fix pack.
Customers who are using VSAM log striping for the DB2 active log datasets with DB2 10 for z/OS or DB2 11 for z/OS should pay attention to recently opened APAR PI10353, which is now marked HIPER. There is no exposure for DB2 for z/OS customers running DB2 V9.1 or earlier. Neither is there any cause for alarm, as the exposure for DB2 10 or 11 is very small. There is no data loss involved and no loss of data integrity.
So what is the underlying problem? This concerns a concept called dependent writes, which means that, in the event of a DB2 restart after a crash, or when restarting DB2 from a DASD mirror or restored FlashCopy image, I/Os must be applied in the order in which they were initiated, and not in the order in which they completed. Unfortunately, VSAM striping does not guarantee the correct ordering of dependent writes for in-flight I/Os. This could mean, in some rare scenarios, a missing log CI stripe. Just to be clear: VSAM striping has always worked this way.
Why are DB2 10 for z/OS and DB2 11 for z/OS exposed, and not earlier releases? Forced log writes (for example, those scheduled as a result of an application commit) can mean that a partially full log CI is written to disk. That log CI has to be rewritten in place when the next log write I/O is scheduled. Prior to DB2 10 for z/OS, DB2 would perform the rewrite of the log CI to log copy 1 and then to log copy 2 serially - ensuring that the write to log copy 1 is complete before initiating the write to log copy 2. With DB2 9 for z/OS and earlier releases, DB2 restart automatically detected the hole in the log, truncated the log at that point and continued the restart process. However, a performance enhancement delivered with DB2 10 for z/OS caused DB2 to rewrite log CIs (those which had been partially full when previously written) to both log copy 1 and log copy 2 in parallel. Unfortunately, the existing logic in DB2 10 for z/OS and DB2 11 for z/OS cannot deal with the new situation, that is, a missing log CI stripe.
A possible consequence of this is that one or both copies of the current DB2 active log pair could be damaged at the end, causing a DB2 crash restart to fail and leading to advanced non-standard recovery, which has to be done under guidance of the IBM DB2 service team. This could theoretically happen in the following circumstances:
A local site crash, such as a CEC or LPAR failure, or an address space failure
A DB2 restart off an XRC mirror or PPRC mirror
A DB2 restart off a restored system level FlashCopy
No customer has ever hit the problem following a local site failure in over 3 years of DB2 10 for z/OS production experience. Only one customer has ever hit the problem, and only when performing disaster recovery restart testing from an XRC mirror. This happened on two occasions. The chance of hitting the problem during DB2 crash restart following a local site failure is tiny. The chance of hitting the problem when restarting from a DASD mirror or from a restored system level FlashCopy is marginally greater, because the timing window for this sort of event (a missing log CI stripe) is larger. APAR PI10353 will provide a fix (circumvention) such that DB2 will automatically detect this type of problem, repair the log and allow DB2 crash restart to proceed.
For further guidance, please see the problem relief as described in the APAR, but be cautious if you are considering the option of disabling VSAM striping, which is difficult to implement, and may well result in serious performance problems.
Access to IBM information centers will soon be redirected to IBM Knowledge Center, so take time now to explore your favorite IBM information in its new home at ibm.biz/IBMKCgo. If you've explored IBM Knowledge Center during its open beta, you have probably noticed some differences from the IBM information centers that you used in the past. With time and practice, using IBM Knowledge Center will become second nature. In the meantime, here are a few tips and techniques that you might be delighted to discover!
How do collections work?
As you can tell, IBM Knowledge Center includes many topics about many IBM products. You probably use a few IBM products regularly and maybe some more products on occasion. By using My Collections, you can define just the topics that you want to use on a regular basis. You can put all of the information for one or more products and versions in a single collection, or you can include individual topics from still other product collections. And once you set up your collection with the information you want, you can add to it, move topics up or down, change the name of the collection, and create a PDF of the topics.
Where did my search results go?
The left part of the IBM Knowledge Center interface shows three sections. They don't always stay in the same order, though. Whichever section is on top is the one that matches the displayed content on the right. When you visit IBM Knowledge Center, the Table of Contents is usually on top. If you do a search, Search Results will be on top and you will see the results on the right. If you look at My Collections, that section will be on top. To move from one section to another, simply toggle among them.
How do I narrow down my search to just the one or two products that I care about?
Fortunately, IBM Knowledge Center has powerful filtering capabilities. You define the products that you want to search in by typing the product name into the search field or by clicking Add Products in the top left of the IBM Knowledge Center interface, where you select the products that you want in your filter.
After you set up the products for your filter, your search will be restricted to that set of content.
Why does my filter keep changing?
IBM Knowledge Center offers automatic filtering. When you open a topic (from the Table of Contents or Search Results), IBM Knowledge Center automatically creates a filter for the product collection of that topic. For example, if you navigate to the topic about editing capabilities in DB2 QMF, the filter is set up automatically so that any search will be restricted to QMF 11.
If you want, you can then expand the filter by clicking Add Products or by selecting additional filters for the date range, task, or operating system.
Can I turn off the automatic filtering?
Yes. If you don't want the automated search filter to be added to your product filter as you navigate, you can turn it off completely. On the Search Results page, click Search Options and uncheck Automatically select a product based on the current content.
Once you do that, only the products that you select directly will appear as search filters, and navigating won't have any effect on your search results.
If I define a filter that I like, can I use it again, for future searches?
There are a few ways to do this. The easiest way is probably just to bookmark the IBM Knowledge Center address at the time you have your filter set up the way you want to use it in the future. You can also sign in to IBM Knowledge Center and use its Save Search option in the top right of the interface.
You can then use that same search filter when you sign in next time, even from a different computer or browser.
What if I still get too many results, even after I define a good filter?
Sometimes the keywords that you are using as your search string are found in a large number of topics, so you end up with more results than you want to review. Three techniques are useful here.
If your search term includes more than one word that you want to use together, place the entire search string in double quotation marks. IBM Knowledge Center will return only topics that have that exact string, rather than topics that contain one or more of the individual words in your search string.
Sometimes you get too many results simply because your search string is commonly used. In this situation, you can revise your search string to use the intitle operator, immediately before whatever word or phrase (in double quotation marks) you are most interested in. The intitle operator limits your search results to topics that have that word or phrase of your search string in the topic title itself. That way, you know that your search terms are of primary importance because they are mentioned in the topic title and not merely in the body of the topic. For example, if you really want to find troubleshooting topics about performance degradation, your search string might be: performance degradation intitle:troubleshooting
Another situation that sometimes causes too many results is when many of the resulting topics contain information that you just don't care about. If you see in your initial results that most of the topics contain a term that makes the topic be not of interest, use the minus sign (-) before that term in your revised search term. The minus sign acts as an exclude operator.
If a minus sign means to exclude a search term, how do I search for something that starts with a minus sign?
One example of just this situation is information about SQL codes, which generally begin with a minus sign (such as -803). If you are debugging a problem for which an SQL code is issued, you need to find the information, not to exclude it.
The easy ways around the behavior of the minus sign are to:
Put the minus sign and SQL code inside of double quotation marks so that IBM Knowledge Center looks for the exact phrase within the marks: "-803"
Prepend the SQL code with a text string, with no space before the minus sign, which basically negates the exclusion behavior of the minus sign: SQLcode-803
If you don't realize or forget that the minus sign excludes a term, IBM Knowledge Center will generally help you by offering a "Did you mean?" question to remind you that you might need to use double quotation marks around the term.
We hope that these tips are useful to you as you become more familiar with IBM Knowledge Center. More tips are available in the Help topics on the welcome page and on the IBM Knowledge Center Technical Content blog at https://ibm.biz/IBMKCTCBlog, which has additional information about how to get the most out of IBM Knowledge Center.
Remember that access to IBM information centers will soon be redirected to IBM Knowledge Center, so take time now to explore your favorite IBM information in its new home at ibm.biz/IBMKCgo.
IFCID 199 statistics are externalized on boundaries based on the statistics interval, and only for data sets with more than 1 I/O per second. Thus statistics for a given data set may not be externalized at every statistics interval boundary, which can cause some confusion when monitoring I/O for specific data sets.
From the APAR text:
New field QW0199SC has been added to the data section 3 of the
IFCID 199. This field contains the timestamp of the last time
the corresponding dataset statistics were externalized. If this
is the first time statistics for this dataset are externalized,
this value is a timestamp of when this dataset was opened.
PM90422 - Add timestamp to IFCID 199 for last interval it was externalized
The informational APAR II14660 lists a number of important APARs to keep track of. It also lists the FIXCAT categories that you can use within SMP/E to filter the APARs you are most interested in during your preparation for migration.
Two examples are IBM.Migrate-Fallback.DB2.V11 and IBM.Coexistence.DB2.SYSPLEXDataSharing.
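For example, an SMP/E job along these lines (the target zone name DB2TGT is a placeholder) reports any PTFs you are missing for those categories:

SET BDY(GLOBAL).
REPORT MISSINGFIX ZONES(DB2TGT)
       FIXCAT(IBM.Migrate-Fallback.DB2.V11,
              IBM.Coexistence.DB2.SYSPLEXDataSharing).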
II14660 – info APAR for V11 migration and fallback
PM85944 allowed the buffer pool PMBs (page manipulation blocks) to be backed by pageable 1MB frames. The only way for DB2 to utilize pageable 1MB large frames is if you have Flash Express installed on your zEC12 or zBC12 running z/OS 1.13 with FMID JBB778H applied. For clarification, if your buffer pools are page-fixed and backed by LFAREA (the amount of real storage set aside for 1MB frames) in DB2 10 or DB2 11, then the PMBs are already backed by 1MB frames, and the addition of Flash Express would not help. If, however, you cannot afford the real memory to page-fix the buffer pools and back them with LFAREA, then the addition of Flash Express could help with TLB misses (CPU performance). In this case the PMBs would be backed by non-LFAREA storage, but you should still have some spare LFAREA storage set aside: if non-LFAREA storage runs out, performance is still better falling back to LFAREA than actually paging to the Flash storage.
PM85944 – retrofit pageable 1MB frames for PMB buffer control blocks; benefits if currently backed by 4KB frames
For customers using Change Data Capture or Q Replication where the products use IFI 306 reads to capture log records, there is a DB2 11 enhancement that has been retrofitted to DB2 10. The enhancement requires V10.2.1 of CDC, so you will need to check whether your tools are at the correct level. With this APAR you can specify a subset of DBID and PSID pairs that your subscription should be tracking, instead of the filtering occurring after the records have already reached the replication product. This way, not all of the log records from all of the tables marked with DATA CAPTURE CHANGES get pushed to the replication product. The lab noticed a significant CPU reduction in the capture application with this fix applied.
The DSNWMSGS data set in the prefix.SDSNIVPD library of your DB2 9, DB2 10, or DB2 11 subsystem can be very helpful in determining which instrumentation facility field relates to which fields in your IBM or vendor monitoring tool. DDL is also included which can be used to create and populate DB2 tables with this information, making it searchable via SQL.
PM93550 – update to DSNWMSGS from the second half of 2013
One of the main differences in the DB2 11 statistics is the inclusion of the IRLM 31-bit and 64-bit private and common storage statistics.
The IRLM storage is not accounted for in the 'REAL Storage in Use - Summary' reported in IFCID 225 or in the Omegamon Statistics report, but it generally only constitutes single-digit MBs of storage. Since IRLM storage was not reported in previous versions of DB2, there is no DB2 baseline to compare it to unless you utilize RMF monitoring of the address spaces.
DB2 Sort is a separately chargeable product designed to help reduce the elapsed time and CPU time used by the IBM DB2 Utilities Suite by improving the utilities' sorting processes. Portions of DB2 Sort processing are zIIP-eligible, going beyond what might be eligible using DFSORT. For customers using DB2 Sort with DB2 9, DB2 10, or DB2 11, if zIIP-eligible DB2 Sort processing was dispatched to a general CP instead, there was previously no way of monitoring this; the IFCID 25 record for utility processing times did not include this information.
With the following APARs, the SMF type 30 records are interrogated to obtain the zIIP-eligible processing that occurred on a general CP, and it is now reported.
This APAR addresses an issue where a very large segmented table space (550 million rows) saw significant insert performance degradation due to excessive getpages. This was related to an issue with the calculation of the segment number where the row should be inserted.
PI04942 – insert performance degradation in segmented and UTS tables due to segment calculation
During DB2 restart, and in the RECOVER and RESTORE utilities, indexes were not taking advantage of the prefetch performance offered by FLA (fast log apply); hence table spaces would be recovered more quickly than the indexes. This applies to all supported versions of DB2. The storage set aside for fast log apply increased from 100MB to 510MB in DB2 10.
Q Rep is fast becoming an integral part of the High Availability options for z/OS environments, and is involved in the IBM Active-Active solution. This publication includes sections on latency analysis, managing Q Replication in the IBM DB2® for z/OS environment, and recovery procedures. These are topics of particular interest to clients who implement the Q Replication solution on the z/OS platform.
For the benefit of DB2 for z/OS Value Unit Edition (VUE) licensing environments, DB2 for z/OS APAR PM99492 (PTFs UI12638(V9), UI12636(V10), UI12637(V11)) made changes to the DB2 for z/OS product name information that is returned to remote JCC-based application environments, even for non-VUE environments. It was not anticipated that this information can be presented to application environments, and it has been discovered that the change in this ProductName information has adversely affected many remote application environments.
If APAR PM99492 has been installed and you are experiencing issues, remove it immediately to revert the change. The DB2 service class name is returned on a connection boundary and is not cached by JCC, so once the APAR is removed, new connections will see the old DB2 service class name.
If you are running with z/OS 1.13 or above, and using volume-level FlashCopy to backup your DB2 data (whether it is done outside of DB2's control or via the DB2 BACKUP SYSTEM), you should enable the DFSMSdss automatic UCB Refresh function.
By default, this function is disabled - which means that if the FlashCopy moves the VTOC on the target volume (source VTOC is in a different location than the target VTOC), then the target volume has to be varied offline/online before it can be accessed on systems other than the one that did the FlashCopy. This could result in failures for any processes that access the backup once it has been taken - an offload to tape for example.
The automatic UCB Refresh function can be enabled on z/OS 1.13 via the command /MODIFY DEVMAN,ENABLE(REFUCB) or by updating the PARMLIB member DEVSUPxx with ENABLE(REFUCB).
Disadvantages of the initial approach using IFCID 316:
• Cache evictions unpredictable
• How do I tell if something would have qualified?
• Point in time evaluation
Improved approach using Accelerator Modeling support provided via APAR PM90886:
• DB2 estimates the CPU cost and elapsed time for queries eligible for offload to IBM DB2 Analytics Accelerator
• A new online system configuration parameter ACCELMODEL to enable accelerator modeling (default is OFF)
• The estimated cost is captured in new fields contained in DB2 accounting record IFCID 3
• Additionally, EXPLAIN will populate DSN_QUERYINFO_TABLE, which shows if a query is eligible for offload or not
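For example, a sketch assuming accelerator modeling is enabled and the EXPLAIN tables (including DSN_QUERYINFO_TABLE) exist under your schema:

EXPLAIN ALL SET QUERYNO = 100 FOR
  SELECT COUNT(*) FROM SYSIBM.SYSTABLES;

-- REASON_CODE and QI_DATA describe whether, and if not why not,
-- the query is eligible for offload
SELECT QUERYNO, REASON_CODE, QI_DATA
  FROM DSN_QUERYINFO_TABLE
 WHERE QUERYNO = 100;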
The following publications were updated in October 2013:
DB2 10 for z/OS:
Administration Guide, SC19-2968-10
IRLM Messages and Codes for IMS and DB2 for z/OS, GC19-2666-08
pureXML Guide, SC19-2981-07
What's New?, GC19-2985-08
IBM Spatial Support for DB2 for z/OS User's Guide and Reference, GC19-2986-03
IRLM PE PM65217 (PTFs UK79710/UK79709) can cause a DB2 group-wide outage. The risk of hitting this issue is very low and data integrity is not at risk.
If multiple DB2 members are restarted at the same time (e.g. a group restart) and if an IRLM lock structure rebuild is triggered during this time, the lock structure rebuild can hang and result in all the DB2 members being abended. The most common reason for a lock structure rebuild (but not the only one) is when you have mixed versions of DB2 e.g. you are running V10 and V9 - you start a V10 member first and then a V9 member so the lock structure has to rebuild to be at the lower V9 level.
Things you can do to mitigate risk until the PTF for PM94539 (PTFs UK97980/UK97981) is applied:
1. If you are doing a group restart in a mixed-release coexistence environment - start one V9 member first, then start the other members in any version order and any number at one time (this will prevent a lock structure rebuild at the same time as multiple members starting).
2. If you are falling back to V9 after all your members are V10, start one member in V9 at a time.
The QMF Best Practices Web site has gone live! The initial launch contains 7 new how-to videos showcasing new features in QMF 11. These videos have been created by experts in the QMF Development Team. Each video includes a transcript and the option to stream or download to your machine. More videos and best practices will be added to this site in the coming weeks and months so you may want to bookmark it.
Nowadays it is increasingly important to deliver information to business users in a format that can be easily digested and used to make important business decisions—dashboards are key to modern, dynamic data delivery. With over 150 new analytic functions and an all-new graphical dashboard capability, QMF provides a palette upon which you can place multiple objects such as controls, charts, metrics and more—all of which can be derived from various data sources simultaneously and linked together to deliver robust interactive real-time data when you need it.
End users need ad-hoc capabilities beyond interactive dashboards. See how an end user can view individual graphs as a high-level summary, drill into more detailed views, work with interactive reports, and change prompts. End users can also make ad-hoc changes to table views in order to sort data, change formatting, and even transfer to another object type.
Using QMF, content developers can create visual dashboards and database visualization applications that present either interactive or persistent data. Visual dashboards are typically designed to run interactively (though they are not required to do so) and provide a dynamic means of reviewing critical enterprise data. The data that is presented in a visual dashboard can be obtained from querying multiple data sources across the enterprise and displaying it in a wide range of visual images, including graphs, maps, charts, and custom graphics.
Reports are an important facet of any business intelligence tool. Well designed reports help your team to analyze data efficiently. From standard tabular reports to visual reports, which allow you to chart, graph, and map your data, QMF challenges and extends the traditional notion of reporting, making your communication and presentation more precise, effective, and powerful.
Scheduled tasks allow you to create automated actions such as running queries and exporting query results without taking direct action from the user interface. Scheduled tasks are invaluable tools that help to automate business processes to run at specified dates and times. This is especially useful for actions that must be run on a regular basis, such as weekly sales reports that are supposed to be distributed to several different locations, or quarterly earnings reports.
The QMF Text Analytics feature includes a sample knowledge base for collecting, organizing, and interpreting vast amounts of unstructured text. The interpretive capabilities of Text Analytics gives the enterprise insight into what people are saying in social media data, surveys, call notes, internal communications, or even financial, legal, and medical data. The enterprise can use this newfound insight to drive business and operational decisions.
With Rocket CorVu you can view visual dashboards and visual applications on mobile devices such as the iPad. Rocket CorVu is a free companion product for QMF for WebSphere. With Rocket CorVu you can view visual projects from anywhere at anytime using your mobile device.
CHECK DATA AUXERROR INVALIDATE was ignored for LOB columns of DB2 directory objects.
CHECK DATA TABLESPACE DSNDB01.SPT01 AUXERROR INVALIDATE issued multiple DSNU809I messages indicating missing AUX index entries but did not invalidate the LOB columns. CHECK DATA SHRLEVEL REFERENCE AUXERROR INVALIDATE should invalidate LOB columns for directory objects, and CHECK DATA SHRLEVEL CHANGE AUXERROR INVALIDATE should generate REPAIR statements for invalid LOB columns for directory objects.
LOAD FORMAT INTERNAL using an input data set (SYSREC) that contained data that was not in internal format caused DB2 to crash. FORMAT INTERNAL is a performance option that bypasses all data validity checks to improve performance, and incorrect usage may result in numerous types of errors if the procedures outlined in the Utility Guide and Reference are not followed. To minimize problems caused by usage errors, LOAD FORMAT INTERNAL will be modified to validate that the actual length of the internal format record equals the length specified in the internal format record header.
An agent may become hung in CSECT DSNTABA if a -CANCEL THREAD command is issued for that agent while it is holding an IRLM lock. This can also impact a release migration.
When a -CANCEL THREAD command is issued for an agent that is holding an SKPT P-lock, EDM abort processing saves that SKPT P-lock on a global EDM chain to be cleaned up during the IRLM synchronization that occurs during general agent abort/deallocation processing. Any new agent that requires the same SKPT P-lock may also try to purge SKPT P-locks on the global EDM chain that are left over from previous threads.
The solution is to remove the race condition by having EDM cancel processing use an agent-specific chain to save the SKPT P-locks that need to be freed during abort/deallocation processing.