In general, the performance of an event which takes place once a year or once every two years is not something that requires a high-priority performance inquiry. However, when a customer reported that a particular DB2® for OS/390® V4 to V5 migration job had still not completed after 24 hours, you better believe it got our attention. We decided that we needed to accomplish two main objectives:
- First and foremost, dramatically reduce the time required for a catalog migration.
- Second, provide a technique to estimate how long a given migration job would take as a function of the catalog size. This second objective is quite important when a job keeps running on and on, as it would be a shame to cancel after n hours, if it is to finish in n hours and one minute.
Step 1: Finding a challenging catalog
The first step was to find a realistic and interesting catalog to use. Because we know that the size of the catalog has a direct effect on how long it takes to migrate it, we did a survey of customer catalogs to find the biggest one.
Step 2: Determining optimum buffer pool sizes
After we found our very large catalog, we needed to determine optimum size of buffer pools for the DB2 catalog and work files. The catalog buffer pool is obviously important for catalog migration performance. However, it turns out the work file buffer pool is also quite important because there are multiple large DB2 sorts that occur when the catalog is migrated. We changed both buffer pools from 1000 buffers to 20000 buffers to determine an optimum point. As the number of buffers was increased from 1000, the catalog migration performance continued to improve, but beyond 10000, there was not much improvement. Therefore, we decided that 10000 buffers is an optimum size, and we used it for both the catalog and work file buffer pools.
As you will see from the results in this report, there was more than ten times improvement in both the CPU time and elapsed time of catalog migration in V6 to V7 compared to V5 to V6. Because the catalog migration time for this large catalog was only 1.4 minutes, it is quite reasonable to assume that more average-sized catalogs can be migrated in less than a minute. For this reason, there seemed no reason in trying to meet the second objective, that of developing a technique to estimate how long it will take to migrate a catalog.
The dramatic performance improvement brought about is important especially for customers with large catalogs continually growing in size, because the time required for catalog migration is highly correlated with the catalog size.
Hardware and software configuration
Here is the configuration that we used for our native measurements:
- DB2 for OS/390 V5, V6 and V7.
- OS/390 release 2.7.
- 12-way processors 9672 (G6).
- Shark (ESS E20) DASDs.
The time it takes for catalog migration depends primarily on the size of the DB2 catalog. Catalog statistics were collected from sixteen large DB2 customers and the biggest catalog among them was chosen for this study. This catalog was the biggest by far at over 5GB, including 9000 databases, 10000 table spaces, 70000 indexes, and 260000 tables. Table 1 shows the number of pages in the DB2 catalog.
Table 1. Page counts in catalog table spaces
| Table spaces | Number of Pages |
| SYSCOPY | 42,120 |
| SYSDBASE | 274,680 |
| SYSDBAUT | 1,800 |
| SYSDDF | 180 |
| SYSGPAUT | 180 |
| SYSGROUP | 180 |
| SYSPKAGE | 931,808 |
| SYSPLAN | 17,280 |
| SYSSTATS | 1,980 |
| SYSSTR | 180 |
| SYSUSER | 180 |
| SYSVIEWS | 360 |
| Total | 1,270,928 |
Table 2 shows the massive numbers of rows that are in some of these catalog tables.
Table 2. Record counts in catalog tables
| Tables |
Number of Rows | Tables |
Number of Rows | Tables |
Number of Rows |
| SYSCOPY | 975431 | LOCATIONS | 0 | SYSDBRM | 15920 |
| SYSCOLUMN | 2107861 | LULIST | 0 | SYSPLAN | 1637 |
| SYSFIELDS | 0 | LUMODES | 0 | SYSPLANAUTH | 6735 |
| SYSFOREIGNKEYS | 180 | LUNAMES | 1 | SYSPLANDEP | 4217 |
| SYSINDEXES | 70268 | MODESELECT | 0 | SYSSTMT | 216743 |
| SYSINDEXPART | 74717 | USERNAMES | 0 | SYSCOLDIST | 28502 |
| SYSKEYS | 263148 | SYSRESAUTH | 4169 | SYSCOLDISTSTATS | 4363 |
| SYSRELS | 93 | SYSSTOGROUP | 6 | SYSCOLSTATS | 19364 |
| SYSSYNONYMS | 723 | SYSVOLUMES | 7 | SYSINDEXSTATS | 574 |
| SYSTABAUTH | 2365287 | SYSPACKAGE | 690411 | SYSTABSTATS | 744 |
| SYSTABLEPART | 14493 | SYSPACKAUTH | 799455 | SYSSTRINGS | 403 |
| SYSTABLES | 261262 | SYSPACKDEP | 46 | SYSCHECKS | 2479336 |
| SYSTABLESPACE | 10003 | SYSPROCEDURES | 1 | SYSCHECKDEP | 48 |
| SYSDATABASE | 9081 | SYSPACKLIST | 4085 | SYSUSERAUTH | 59 |
| SYSDBAUTH | 55023 | SYSPACKSTMT | 13448719 | SYSVIEWDEP | 162 |
| IPNAMES | 0 | SYSPLSYSTEM | 0 | SYSVIEWS | 266 |
The optimum bufferpool size of 10000 buffers was used for both the catalog and directory buffer pool (BP0) and the workfile buffer pool (BP4), resulting in the best migration CPU and elapsed time results in the performance measurements. This optimal buffer pool size was arrived at after experimenting with 1000, 10000, and 50000 buffers for each bufferpool. DB2 sort activity mostly disappeared in V6 to V7 migrations. As such, a 1000 buffer bufferpool for work files is sufficient for optimal performance in a V6 to V7 migration.
In general, the default values were used for buffer pool thresholds. The only exception was a 100% VPSEQT (% of buffers that can be used for sequential processing) for work files, as recommended, since more than 99% of the pages in the work files are accessed sequentially. No hiperpool was used because a very large set of buffers was not necessary for a satisfactory catalog migration performance.
We used the DB2 V5 customer catalog to measure migration performance in the following three scenarios, each of which is run in a data-sharing and non data-sharing environment:
Scenario 1: Migration from V5 to V6 with and without performance fixes applied
Scenario 2: Migration from V6 to V7 with and without performance fixes applied.
Scenario 3: Migration from V5 to V7 with and without performance fixes applied.
We present the results of these three scenarios both for a data-sharing and non data-sharing configuration.
Table 3 shows the elapsed time, cpu time, and the number of getpages that were performed for the DB2 catalog and the work file. Table 4 shows the same data after we applied two performance fix APARs.
Table 3. Results of non data-sharing measurement
| (seconds) | Elapsed time | CPU time | Catalog / Workfile getpages |
| V5 to V6 GA | 944 | 250 | 2.2 M / 516K |
| V6 to V7 GA | 117 | 24 | 535K / 24 |
| V5 to V7 GA | 925 | 226 | 2.4 M / 516K |
Table 4. Results of non data-sharing measurement with performance fixes applied
| (seconds) | Elapsed time | CPU time | Catalog / Workfile getpages |
| V5 to V6 with PQ38035 and PQ44985 | 834 | 192 | 1.6 M / 516K |
| V6 to V7 with PQ38035 and PQ44985 | 82 | 6 | 76K / 24 |
| V5 to V7 with PQ38035 and PQ44985 | 768 | 194 | 1.6 M / 516K |
Figure 1 is a graphical representation of the elapsed and CPU times shown in tables 3 and 4. For more detail, see Reading the detailed data.
Figure 1. Graphical representation of performance improvements
Table 5 breaks down the statistical data a little further so that you can see how we came up with our measurement results.
Table 5. Detailed statistics from non data-sharing measurements
| V5 to V6 GA | V5 to V6 (1) | V5 to V7 GA | V5 to V7 (1) | V6 to V7 GA | V6 to V7 (1) | |
| Elapsed time | 944sec | 834sec | 925sec | 768sec | 117sec | 82sec |
| Other read wait | 423sec | 379sec | 391sec | 411sec | 2sec | 2sec |
| #events | 51K | 46K | 47K | 50K | 179 | 173 |
| Ext/Del/Define wait | 160sec | 157sec | 180sec | 53sec | 21sec | 20sec |
| #events | 590 | 590 | 644 | 179 | 93 | 94 |
| Sync DB i/o wait | 48sec | 46sec | 49sec | 52sec | 46sec | 44sec |
| #events | 43K | 42K | 44K | 44K | 40K | 40K |
| CPU time | 250sec | 192sec | 226sec | 194sec | 24sec | 6sec |
|
Catalog and Directory | ||||||
| Getpages | 2.2M | 1.6M | 2.4M | 1.6M | 535K | 76K |
| Buffer updates | 149K | 149K | 157K | 156K | 8K | 7K |
| Pages written | 38K | 38K | 39K | 39K | 375 | 318 |
| Workfile | ||||||
| Getpages | 516K | 516K | 516K | 516K | 24 | 24 |
| Buffer updates | 531K | 531K | 531K | 531K | 20 | 20 |
| Pages written | 306K | 308K | 306K | 254K | 0 | 0 |
| Lock requests | 333K | 20K | 794K | 28K | 462K | 10K |
| Unlock requests | 318K | 5K | 775K | 10K | 457K | 5K |
| Log records created | 9K | 9K | 18K | 17K | 10K | 8K |
| Log MB created | 2.0MB | 2.0MB | 4.8MB | 4.6MB | 2.6MB | 2.3MB |
Now that we have the raw performance data, letÂs dig a little deeper and see where the action is taking place.
V5 to V6 GA (no performance fixes applied)
It took 944 seconds, or about 16 minutes, to migrate our catalog from V5 to V6 without any performance fixes. Out of the total 944 seconds elapsed time:
- 423 seconds were spent waiting for "other read I/Os"; that is, asynchronous prefetch read I/Os.
- 250 seconds were spent on CPU time.
- 160 seconds were spent on wait time for data set extensions, deletes and defines.
- 48 seconds were spent on wait time for synchronous database read I/O wait time.
In terms of getpage activity on the catalog, 2.2 million getpages were issued, and 149000 catalog entries were updated, resulting in 38000 catalog pages written.
In terms of getpage on the work files, there were 516000 getpages for sorting. Some of the large sorts invoked here include a 2.1 million row sort of SYSCOLUMNS, a 15000 row sort of SYSTABLEPART, a 75000 row sort of SYSINDEXPART, a 2.4 million row sort of SYSTABAUTH, and a 70000 row sort of SYSINDEXES.
The locking and logging activity is self-evident from Table 5. What is interesting is what happens to the locking activity after the performance fixes are applied.
V5 to V6 with performance fixes applied (PQ38035 and PQ44985)
With the fixes applied, the elapsed time dropped 12%, as a result of eliminating some catalog table space scans, as indicated by:
- About a 10% reduction in prefetch read I/O wait time and the number of waits.
- A 23% reduction in CPU time.
- A 27% reduction in the number of getpages for catalog tables.
- And a whopping 26 times reduction in the number of lock and unlock requests.
V6 to V7 GA (no performance fixes applied)
Compared with the V5 to V6 GA numbers, V6 to V7 showed an elapsed time improvement of 8 times and a CPU time improvement of 10 times.
V6 to V7 with performance fixes applied (PQ38035 and PQ44985)
If we compare the V6 to V7 migration with the performance fixes to that of V5 to V6 GA, there is more than a 10 times improvement in elapsed time and more than a 40 times CPU time improvement.
Three major areas of improvement are:
- More than 20 times reduction in catalog getpages and updates.
- Eliminating sort for catalog table rows.
- More than 40 times reduction in the number of lock and unlock requests.
Netting it out for V5 customers
Notice that a V5 to V7 catalog migration is faster than a V5 to V6 migration. If youÂve been thinking about migrating, consider migrating directly to V7 without fear of lengthy migration times.
Table 6 shows the elapsed time, cpu time, and the number of getpages that were performed for the DB2 catalog and the work file. Table 7 shows the same data after we applied the two performance fix APARs.
Table 6. Results of data-sharing measurement
| (seconds) | Elapsed time | CPU time | Catalog / Workfile getpages |
| V5 to V6 GA | 895 | 242 | 1.7M / 516K |
| V6 to V7 GA | 73 | 6 | 77K / 32 |
| V5 to V7 GA | 756 | 197 | 1.6 M / 516K |
Table 7. Results of data-sharing measurement with fixes
| (seconds) | Elapsed time | CPU time | Catalog / Workfile getpages |
| V5 to V6 with PQ38035 and PQ44985 | 812 | 191 | 1.6 M / 516K |
| V6 to V7 with PQ38035 and PQ44985 | 74 | 6 | 77K / 32 |
| V5 to V7 with PQ38035 and PQ44985 | 754 | 195 | 1.6 M / 516K |
Figure 2 is a graphical representation of the elapsed and CPU times shown in tables 31 and 42. For more detail, see Detailed performance statistics.
Figure 2. Graphical representation of performance improvements in a data-sharing environment
Table 8 breaks down the statistical data a little further so that you can see how we came up with our measurement results.
Table 8. Detailed statistics from data-sharing measurements
| V5 to V6 GA | V5 to V6 (1) | V5 to V7 GA | V5 to V7 (1) | V6 to V7 GA | V6 to V7 (1) | |
| Elapsed time | 895sec | 812sec | 756sec | 754sec | 73sec | 74sec |
| Other read wait | 410sec | 379sec | 381sec | 382sec | 1sec | 1sec |
| #events | 50K | 46K | 46K | 46K | 129 | 128 |
| Ext/Del/Define wait | 134sec | 134sec | 37sec | 37sec | 15sec | 14sec |
| #events | 594 | 594 | 183 | 183 | 98 | 98 |
| Sync DB i/o wait | 52sec | 50sec | 47sec | 47sec | 44sec | 44sec |
| #events | 42K | 42K | 43K | 43K | 40K | 40K |
| CPU time | 242sec | 191sec | 197sec | 195sec | 6sec | 6sec |
| Catalog / Directory | ||||||
| Getpages | 1.9M | 1.6M | 1.6M | 1.6M | 76K | 76K |
| Buffer updates | 149K | 149K | 156K | 156K | 7K | 7K |
| Pages written | 38K | 38K | 39K | 39K | 296 | 296 |
| Workfile | ||||||
| Getpages | 516K | 516K | 516K | 516K | 32 | 32 |
| Buffer updates | 531K | 531K | 531K | 531K | 28 | 28 |
| Pages written | 308K | 308K | 308K | 308K | 0 | 0 |
| Lock requests | 23K | 23K | 31K | 31K | 11K | 11K |
| Unlock requests | 7K | 8K | 13K | 13K | 6K | 6K |
| Log records created | 9K | 9K | 17K | 17K | 9K | 9K |
| Log MB created | 2.0MB | 2.0MB | 4.7MB | 4.8MB | 2.4MB | 2.4MB |
Just as with non data-sharing, a data-sharing V6 to V7 catalog migration with PQ38035 and PQ44985 had more than a 10 times improvement in elapsed time and more than a 40 times improvement in CPU time compared to a V5 to V6 GA migration.
Just as with non data-sharing, a V5 to V7 catalog migration is faster than a V5 to V6 migration.
We 've shown you a lot of numbers. Now we will tell you what this means for you in terms of expectations, and we will also give you some recommendations to consider. These conclusions all reflect our experience with a very large (over 5GB) catalog that contains information about 9000 databases, 10000 table spaces, 70000 indexes, and 260000 tables!
With the largest catalog we could find, the catalog migration in a non data-sharing environment from V5 to V6 without performance fixes took about 16 minutes. After the two V6 APARs PQ38035 and PQ44985 were applied, the elapsed time decreased to about 14 minutes. This compares with one hour or more of catalog migration time experienced by customers with very large catalogs in V4 to V5 migration.
The V6 to V7 catalog migration time was dramatically reduced, more than 10 times in elapsed time and more than 40 times in CPU time. This migration took a little over one minute for a very large catalog, primarily because of minimizing catalog updates that are made, mostly via table space scans. Most customers can probably expect migration times less than 1 minute.
A key point to remember is that migrating from V5 to V7 is actually faster than migrating from V5 to V6.
The same good story is seen in a data-sharing environment.
If you want to migrate directly from Version 5 to Version 7, there is no reason to avoid doing so if your concern is about the elapsed time of the migration process.
To make catalog migration go as quickly as possible, it makes sense to concentrate your efforts on minimizing I/O time, because the data show that CPU time is a small subset of the total elapsed time. There are two ways to minimize I/O time.
-
Ensure adequate buffer pool size
Our experiments show that 10000 buffers for the catalog buffer pool and 10000 buffers for the work file buffer pool appeared to be a good choice for catalog migration performance.
In a V6 to V7 migration, a minimum buffer pool size of 1000 buffers is acceptable for work file buffers because DB2 sort activity is almost completely eliminated.
You can use the default buffer pool thresholds, except you should set VPSEQT to 100% for the work file buffer pool. This is because more than 99% of work file accesses are sequential, and VPSEQT represents the percentage of buffers that can be used by sequential access. Use the command ALTER BUFFERPOOL to change both the buffer pool size and the VPSEQT threshold:- ALTER BUFFERPOOL work_file_buffer_pool_name VPSIZE(10000) VPSEQT(100)
-
Use ESS (Shark) with the parallel access volume feature.
If you canÂt provide a sufficiently large buffer pool as recommended above, perhaps because of other virtual storage constraints, then the parallel access volume (PAV) feature/ We strongly recommend the use of this feature for supporting concurrent multiple I/Os to the same volume at the same time. The PAV feature is particularly advantageous for data sets that can be accessed concurrently by multiple threads, such as DB2 work files and catalogs. Of course, ESS with PAV can still be useful even when there is a sufficiently large buffer pool to further minimize I/O time and thus elapsed time.
The authors would like to thank Jerry Heglar, the manager of the DB2 Performance department, and Susan Hom, the manager of RDS Development, for their management support in preparing this report.
Comments (Undergoing maintenance)





