Scale guests into memory overcommitment on z/VM 5.2

In these tests we started with one database server guest and then added four more database server guests, for a total of five guests. These five guests used all of the available physical memory. The number of workload driver users was adjusted to reach maximum throughput, and the users were distributed evenly among the guests. The guests were then scaled from five to ten, and we strove to maintain maximum throughput as guests were added.
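
To illustrate the even distribution of workload driver users, the following Python sketch divides a user count across a given number of guests; the user count is hypothetical and not taken from the measured configuration.

def distribute_users(total_users, num_guests):
    # Spread workload driver users as evenly as possible across the guests.
    base, remainder = divmod(total_users, num_guests)
    # The first 'remainder' guests receive one extra user each.
    return [base + 1 if i < remainder else base for i in range(num_guests)]

# Hypothetical example: 1250 users across ten database server guests.
print(distribute_users(1250, 10))   # [125, 125, ..., 125]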

Total throughput

Figure 1 shows the normalized transactional throughput from all guests when scaling the number of guests on z/VM® 5.2 into memory overcommitment.

Figure 1. Normalized transactional throughput for scaling the number of guests on z/VM 5.2

Observations

Figure 1 shows the effect of memory overcommitment on throughput. With five guests, all memory is in use. The addition of just one more guest caused a decrease in throughput of more than 50%. Adding further guests led to more moderate degradation.
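
As a minimal sketch of the normalization used for Figure 1, assuming throughput is simply expressed relative to the five-guest baseline (the raw rates below are invented; only their ratios reflect the reported behavior):

# Hypothetical raw transaction rates (transactions per second); the ratios
# are chosen to match the reported relative throughput, not measured values.
raw_throughput = {5: 10000.0, 6: 5200.0, 10: 3100.0}

baseline = raw_throughput[5]
normalized = {guests: rate / baseline for guests, rate in raw_throughput.items()}
print(normalized)   # {5: 1.0, 6: 0.52, 10: 0.31}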

CPU utilization

Figure 2 shows the CPU utilization of each guest and the total CPU utilization for the z/VM LPAR for the ten CPUs assigned.

Figure 2. Total CPU utilization per number of guests

Observations

Figure 2 shows the effects of memory overcommitment on CPU utilization. At five guests we were able to achieve more than 900% utilization of the ten available CPUs, and the z/VM overhead was small. As more guests were added, the total CPU utilization of the guests decreased significantly and the z/VM overhead increased.
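
Because more than 900% of ten CPUs was achievable, utilization is evidently summed per CPU, with 1000% as the maximum. A minimal sketch of this accounting, using hypothetical per-guest values:

# Hypothetical per-guest CPU utilization in percent of one CPU, plus z/VM
# overhead; these numbers are illustrative, not the measured values.
guest_utilization = [182.0, 181.0, 183.0, 180.0, 181.0]   # five guests
zvm_overhead = 15.0

total = sum(guest_utilization) + zvm_overhead
capacity = 10 * 100.0   # ten CPUs correspond to a maximum of 1000%
print(f"total: {total:.0f}% of {capacity:.0f}% ({total / capacity:.1%})")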

Page location

Figure 3 shows the location of the page frames associated with the guests. The values are the sums, over all guests, of the page frames in each location.

Figure 3. Page location - guest scaling on z/VM 5.2

Observations

Figure 3 shows the location of pages for the guests as the number of guests is scaled from five to ten. The most interesting curve is the utilization of the paging space on DASD: it increases steeply as guests are added, and at ten guests there are as many pages on DASD as in memory.
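
A minimal sketch of how the per-guest page counts could be summed per location to produce the curves in Figure 3; the per-guest numbers are hypothetical:

# Hypothetical page frame counts per guest and location (values are
# illustrative only); Figure 3 plots the per-location sums over all guests.
guests = [
    {"main": 2_000_000, "xstor": 150_000, "dasd": 1_100_000},
    {"main": 1_950_000, "xstor": 160_000, "dasd": 1_050_000},
]

totals = {loc: sum(g[loc] for g in guests) for loc in ("main", "xstor", "dasd")}
print(totals)   # {'main': 3950000, 'xstor': 310000, 'dasd': 2150000}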

Page movement

Figure 4 shows the effect of memory overcommitment on the page movement rates to the various destinations.

Figure 4. Page movement - guest scaling on z/VM 5.2

Observations

As the guests are scaled from five to ten, the page movement rates from main storage to XSTOR and from XSTOR to DASD increase. The maximum page movement rate was observed at nine guests; at ten guests the movement rate decreased, which corresponds to the decrease in CPU utilization observed at ten guests. The highest traffic is seen in the direction from main storage to expanded storage, closely followed by the direction from expanded storage to DASD.
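
A minimal sketch of how the movement rates per direction could be represented and compared; the rates are hypothetical and only mirror the qualitative pattern described above:

# Hypothetical page movement rates in pages per second for each direction;
# the three-level hierarchy (main storage, XSTOR, DASD) is as described above.
movement = {
    ("main", "xstor"): 9000,   # page-out from main storage to expanded storage
    ("xstor", "dasd"): 8500,   # migration from expanded storage to DASD
    ("xstor", "main"): 700,    # page-in from expanded storage
    ("dasd", "main"): 300,     # page-in from DASD
}

busiest = max(movement, key=movement.get)
page_out = movement[("main", "xstor")] + movement[("xstor", "dasd")]
page_in = movement[("xstor", "main")] + movement[("dasd", "main")]
print(busiest, page_in / page_out)   # highest-traffic direction, low read-back ratio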

Conclusion

Six guests and a memory overcommitment of only 35% resulted in a throughput reduction of 48%. Beyond six guests, after the initial effects of page movement to DASD, each additional guest has a lower impact on throughput.

The total system memory of 80 GB (20,971,520 pages) is overcommitted as follows:

Table 1. Overcommitted system memory - z/VM 5.2
Guests   Paged out to XSTOR   Paged out to DASD   Real overcommit %   Planned overcommit %   Throughput % of baseline
     5               267907                   0                   1                      0                   baseline
     6              1034590             6280409                  35                     20                         52
     7              1032323            13128704                  68                     40                         47
     8              1039500            18219008                  92                     60                         41
     9              1038750            20236288                 101                     80                         38
    10              1045785            20754432                 104                    100                         31

In this table, pages paged out to XSTOR and DASD are the pages not in memory. Real overcommit is defined as (pages used : memory) - 1 and planned overcommit as (virtual : physical) - 1; both are expressed in percent.
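
The two overcommitment columns can be reproduced from the six-guest row, assuming (consistent with the planned-overcommit values) that each guest is defined with 16 GB of virtual memory and that "pages used" means the real memory pages plus the paged-out pages:

PAGE_TOTAL = 20_971_520   # 80 GB of real memory in 4 KB pages

# Six-guest row from Table 1.
paged_out = 1_034_590 + 6_280_409      # pages not in memory: XSTOR + DASD
pages_used = PAGE_TOTAL + paged_out    # resident pages plus paged-out pages

real_overcommit = pages_used / PAGE_TOTAL - 1   # approximately 0.35 -> 35%
planned_overcommit = (6 * 16) / 80 - 1          # 96 GB virtual : 80 GB real -> 20%

print(f"real: {real_overcommit:.0%}, planned: {planned_overcommit:.0%}")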

Interestingly, z/VM has more pages in use than it should have, based on the planned overcommitment. This happens because, in some cases, z/VM keeps a page in memory as well as in the paging space to avoid moving the same page back and forth.

At six guests the throughput has decreased to 52% of the five guests' value. With the addition of more guests, throughput declines at a linear rate and at ten guests has decreased to 31%. As guests were added, z/VM overhead also increased and we were not able to achieve higher percentages of total CPU utilization.

The movement rates show that most of the pages were moved to DASD via XSTOR, and that the movement rate back to main storage was very low, indicating that z/VM had chosen the right pages to move out.