Scale guests into memory overcommitment on z/VM 5.2

In these tests we started with one database server guest and then added four more database server guests, for a total of five guests. These five guests used all of the available physical memory. The number of workload driver users was adjusted to reach maximum throughput, and the users were distributed evenly among the guests. The guests were then scaled from five to ten, and we strove to maintain maximum throughput as guests were added.
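
To illustrate the even distribution of workload driver users, the following Python sketch divides a user count across a given number of guests; the user count is hypothetical and not taken from the measured configuration.

def distribute_users(total_users, num_guests):
    # Spread workload driver users as evenly as possible across the guests.
    base, remainder = divmod(total_users, num_guests)
    # The first 'remainder' guests receive one extra user each.
    return [base + 1 if i < remainder else base for i in range(num_guests)]

# Hypothetical example: 1250 users across ten database server guests.
print(distribute_users(1250, 10))   # [125, 125, ..., 125]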

Total throughput

Figure 1 shows the normalized transactional throughput from all guests when scaling the number of guests on z/VM® 5.2 into memory overcommitment.

Figure 1. Normalized transactional throughput for scaling the number of guests on z/VM 5.2

Observations

Figure 1 shows the effect of memory overcommitment on throughput. With five guests, all memory is in use. The addition of just one more guest caused a decrease in throughput of more than 50%. Adding further guests led to more moderate degradation.
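
As a minimal sketch of the normalization used for Figure 1, assuming throughput is simply expressed relative to the five-guest baseline (the raw rates below are invented; only their ratios reflect the reported behavior):

# Hypothetical raw transaction rates (transactions per second); the ratios
# are chosen to match the reported relative throughput, not measured values.
raw_throughput = {5: 10000.0, 6: 5200.0, 10: 3100.0}

baseline = raw_throughput[5]
normalized = {guests: rate / baseline for guests, rate in raw_throughput.items()}
print(normalized)   # {5: 1.0, 6: 0.52, 10: 0.31}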

CPU utilization

Figure 2 shows the CPU utilization of each guest and the total CPU utilization for the z/VM LPAR for the ten CPUs assigned.

Figure 2. Total CPU utilization per number of guests

Observations

Figure 2 shows the effects of memory overcommitment on CPU utilization. At five guests we were able to achieve more than 900% utilization of the ten available CPUs, and the z/VM overhead was small. As more guests were added, the total CPU utilization of the guests decreased significantly and the z/VM overhead increased.
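
Because more than 900% of ten CPUs was achievable, utilization is evidently summed per CPU, with 1000% as the maximum. A minimal sketch of this accounting, using hypothetical per-guest values:

# Hypothetical per-guest CPU utilization in percent of one CPU, plus z/VM
# overhead; these numbers are illustrative, not the measured values.
guest_utilization = [182.0, 181.0, 183.0, 180.0, 181.0]   # five guests
zvm_overhead = 15.0

total = sum(guest_utilization) + zvm_overhead
capacity = 10 * 100.0   # ten CPUs correspond to a maximum of 1000%
print(f"total: {total:.0f}% of {capacity:.0f}% ({total / capacity:.1%})")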

Page location

Figure 3 shows the location of the page frames associated with the guests. The values are the sums, over all guests, of the page frames in each location.

Figure 3. Page location - guest scaling on z/VM 5.2

Observations

Figure 3 shows the location of pages for the guests as the number of guests is scaled from five to ten. The most interesting curve is the utilization of the paging space on DASD: it increases steeply as guests are added, and at ten guests there are as many pages on DASD as in memory.
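
A minimal sketch of how the per-guest page counts could be summed per location to produce the curves in Figure 3; the per-guest numbers are hypothetical:

# Hypothetical page frame counts per guest and location (values are
# illustrative only); Figure 3 plots the per-location sums over all guests.
guests = [
    {"main": 2_000_000, "xstor": 150_000, "dasd": 1_100_000},
    {"main": 1_950_000, "xstor": 160_000, "dasd": 1_050_000},
]

totals = {loc: sum(g[loc] for g in guests) for loc in ("main", "xstor", "dasd")}
print(totals)   # {'main': 3950000, 'xstor': 310000, 'dasd': 2150000}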

Page movement

Figure 4 shows the effect of memory overcommitment on the page movement rates to the various destinations.

Figure 4. Page movement - guest scaling on z/VM 5.2

Observations

As the guests are scaled from five to ten, the page movement rates from main storage to XSTOR and from XSTOR to DASD increase. The maximum page movement rate was observed at nine guests; at ten guests the movement rate decreased, which corresponds to the decrease in CPU utilization observed at ten guests. The highest traffic is seen in the direction from main storage to expanded storage, closely followed by the direction from expanded storage to DASD.
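
A minimal sketch of how the movement rates per direction could be represented and compared; the rates are hypothetical and only mirror the qualitative pattern described above:

# Hypothetical page movement rates in pages per second for each direction;
# the three-level hierarchy (main storage, XSTOR, DASD) is as described above.
movement = {
    ("main", "xstor"): 9000,   # page-out from main storage to expanded storage
    ("xstor", "dasd"): 8500,   # migration from expanded storage to DASD
    ("xstor", "main"): 700,    # page-in from expanded storage
    ("dasd", "main"): 300,     # page-in from DASD
}

busiest = max(movement, key=movement.get)
page_out = movement[("main", "xstor")] + movement[("xstor", "dasd")]
page_in = movement[("xstor", "main")] + movement[("dasd", "main")]
print(busiest, page_in / page_out)   # highest-traffic direction, low read-back ratio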

Conclusion

Six guests and a memory overcommitment of only 35% resulted in a throughput reduction of 48%. Beyond six guests, after the initial effects of page movement to DASD, each additional guest has a lower impact on throughput.

The total system memory of 80 GB (20,971,520 pages) is overcommitted as follows:

Table 1. Overcommitted system memory - z/VM 5.2
Guests   Paged out to XSTOR   Paged out to DASD   Real overcommit %   Planned overcommit %   Throughput % of baseline
     5               267907                   0                   1                      0                   baseline
     6              1034590             6280409                  35                     20                         52
     7              1032323            13128704                  68                     40                         47
     8              1039500            18219008                  92                     60                         41
     9              1038750            20236288                 101                     80                         38
    10              1045785            20754432                 104                    100                         31

In this table, pages paged out to XSTOR and DASD are the pages not in memory. Real overcommit is defined as (pages used : memory) - 1 and planned overcommit as (virtual : physical) - 1; both are expressed in percent.
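
The two overcommitment columns can be reproduced from the six-guest row, assuming (consistent with the planned-overcommit values) that each guest is defined with 16 GB of virtual memory and that "pages used" means the real memory pages plus the paged-out pages:

PAGE_TOTAL = 20_971_520   # 80 GB of real memory in 4 KB pages

# Six-guest row from Table 1.
paged_out = 1_034_590 + 6_280_409      # pages not in memory: XSTOR + DASD
pages_used = PAGE_TOTAL + paged_out    # resident pages plus paged-out pages

real_overcommit = pages_used / PAGE_TOTAL - 1   # approximately 0.35 -> 35%
planned_overcommit = (6 * 16) / 80 - 1          # 96 GB virtual : 80 GB real -> 20%

print(f"real: {real_overcommit:.0%}, planned: {planned_overcommit:.0%}")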

Interestingly, z/VM has more pages in use than it should have, based on the planned overcommitment. This happens because, in some cases, z/VM keeps a page in memory as well as in the paging space to avoid moving the same page back and forth.

At six guests the throughput has decreased to 52% of the five guests' value. With the addition of more guests, throughput declines at a linear rate and at ten guests has decreased to 31%. As guests were added, z/VM overhead also increased and we were not able to achieve higher percentages of total CPU utilization.

The movement rates show that most of the pages were moved to DASD via XSTOR, and that the movement rate back to main storage was very low, indicating that z/VM had chosen the right pages to move out.