Analyzing page movement
Excessive paging affects processor usage, because jobs in central storage that should be using processor time are instead waiting for data to be moved into central storage. The more jobs that are waiting on paging, the lower your processor usage. Drops in processor usage can indicate storage constraints.
Storage management runs at a higher priority (performance group 0) than applications. Therefore, this page movement can cause additional latent demand when the processor is near capacity and processor storage is also constrained.
To minimize the delay caused by a page-in, avoid creating contention on the I/O path to the page data sets.
IBM Z Performance and Capacity Analytics includes several predefined reports that show the activity of the paging subsystem. These vary from monthly overview reports to hourly trends of system storage paging rates. These reports can help you determine if too much time is being spent on paging.
Moving from central to expanded storage
The page rate from central to expanded storage is an indicator of how much page movement is occurring. To determine the impact, you must understand the cause of this page movement. Is it caused by paging, swapping, or hiperspace activity?
Note: Hiperspace™ activity that uses the MOVE PAGE instruction is not recorded.
Moving from expanded to central storage
Storage movement from expanded storage to central storage indicates contention for central storage. Pages are being moved out of central storage too quickly. The page rate from expanded to central storage indicates how serious the central storage constraint is.
Moving from expanded to auxiliary storage
Storage movement from expanded to auxiliary storage (pages migrated) indicates that you are using all of your expanded storage. You must find out how many of these pages are being paged back in as page faults (page-ins).
The migration rate indicates how effectively you are using expanded storage. If the page fault rate is low, this movement from expanded to auxiliary storage is not a problem. A low migration rate means that pages paged out to expanded storage are referenced again quickly enough to avoid migration to an external I/O device, so the system avoids a high paging overhead.
A high migration rate, in contrast, might mean contention for limited expanded storage or that the expanded storage criteria table does not match your workload.
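The relationship described above can be sketched as a quick check. This is a minimal illustration; the function name, parameters, and the 10% threshold are our assumptions for the sketch, not fields or criteria defined by z/OS or IBM Z Performance and Capacity Analytics:

```python
def expanded_storage_check(pages_migrated_per_sec, page_ins_per_sec):
    """Rough indicator of expanded storage effectiveness.

    A low migration rate means pages sent to expanded storage are
    re-referenced before they must move on to auxiliary storage.
    A high migration rate combined with a high demand page-in rate
    suggests contention for expanded storage (or a criteria table
    that does not match the workload).
    """
    if pages_migrated_per_sec == 0:
        return "expanded storage is absorbing the paging load"
    # Illustrative threshold: few migrated pages return as page faults.
    if page_ins_per_sec < pages_migrated_per_sec * 0.1:
        return "migration is occurring, but few pages return as faults"
    return "high migration and page-in rates: possible storage contention"
```

In practice, the input rates would come from the paging activity measurements that feed the predefined reports; the point of the sketch is only that migration becomes a concern when migrated pages are also being demanded back in.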
When z/OS moves pages from expanded to auxiliary storage, it must first move the data from expanded to central storage, and then to auxiliary storage. Page migration can therefore cause CPU delay.
Analyzing the demand page-in rate
If excessive paging is occurring, you must determine when it happens and which workloads are experiencing it. The MVSPM Page-ins by Workload Type, Hourly Trend report shows which workloads are experiencing the page faults.
The highest workload delay caused by storage contention occurs when the demand page-in rate is highest.
If high levels of demand paging exist, you might want to break this work out further by performance group to determine which PGN within a specific workload type is experiencing the demand page-ins.
You can obtain more information by looking at page I/O response time. To estimate the impact of paging on a particular workload type, multiply the I/O response time by the number of page faults per second for that workload type; the result shows the delay this paging causes each workload type.
For workloads that offer transaction reporting, the portion of the response time caused by paging can be estimated by calculating the number of page-ins per transaction and multiplying by the I/O response time. For example, if a workload did 40 page-ins per second and 10 transactions per second, and the average page response time was 30 milliseconds, then:
(40 page-ins/sec ÷ 10 transactions/sec) × 30 milliseconds = 120 milliseconds per transaction
That is, 120 milliseconds of that workload's response time was due to paging. If the response time was subsecond, the time caused by paging represents a minimum of 12% of the response time.
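The estimate above can be expressed as a small Python sketch. The function names and parameters are ours, chosen for illustration; they do not correspond to any IBM tooling:

```python
def paging_delay_per_transaction(page_ins_per_sec, transactions_per_sec,
                                 page_io_response_ms):
    """Estimated per-transaction response-time delay caused by paging:
    (page-ins per transaction) * (page I/O response time)."""
    page_ins_per_txn = page_ins_per_sec / transactions_per_sec
    return page_ins_per_txn * page_io_response_ms

def paging_impact_ms_per_sec(page_faults_per_sec, page_io_response_ms):
    """Paging delay accumulated per second by a workload type:
    (page faults per second) * (page I/O response time)."""
    return page_faults_per_sec * page_io_response_ms

# Worked example from the text: 40 page-ins/sec, 10 transactions/sec,
# 30 ms average page I/O response time.
delay_ms = paging_delay_per_transaction(40, 10, 30)  # 120.0 ms per transaction
```

The first function reproduces the worked example; the second is the per-workload impact calculation described earlier in this section.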
Analyzing page data sets
You can define two kinds of paging data sets in z/OS: page data sets and swap data sets. Page data sets are required; swap data sets are optional. Even if you have swappable workloads on your system, you can run with page data sets only. Alternatively, you can have swappable workloads use swap data sets to ensure that swap activity (usually numerous pages in a swap set) does not interfere with demand paging (usually one page at a time) from nonswappable online workloads.
The structure of page and swap data sets is different and so is the format of the requests to these data sets. The nomenclature is somewhat misleading: while all nonswap paging goes to page data sets, part of swap paging goes to swap data sets, and part of it goes to page data sets. Thus a one-to-one identification of paging category (that is, paging or swapping) and paging data set type (page or swap) is not possible. The functional division between swap and nonswap paging does not totally correspond to the division between swap and paging data sets.
The MVSPM Page Data Set Type Response Time, Hourly Trend report shows the I/O response time by the hour for each page data set type.
The I/O response time is the average for all devices being used for each page type. Many users have eliminated swap data sets and converted to local page data sets only. This conversion should not affect performance as long as central storage contention is low. If a high level of storage contention causes a high demand page rate, and swap activity is forced to the same local page data sets, the workloads experiencing page faults may see performance delays. If the swap activity is low, there is no impact on the local page data sets, and much of this activity is accommodated by expanded storage, if available.
Analyzing block paging
The block paging function of z/OS exploits the sequential or repeatable data reference patterns of many numerically intensive computing applications. It reduces repeated page faults by packaging together pages that are expected to be referenced together and, at page-fault time, loading them into central storage as a block rather than one at a time. This function can markedly improve the elongated elapsed times that numerically intensive computing applications suffer when their data spills to auxiliary paging devices.