Memory plugging
To vary the amount of memory in a system, the cpuplugd daemon uses a ballooning technique provided by the Linux® cmm module.
The cmm module manages a memory pool, called the CMM pool. Memory in that pool is 'in use' from the Linux operating system point of view, and therefore not available to Linux, but it is eligible for z/VM® in case memory pages are needed elsewhere. The cmm module notifies the z/VM hypervisor that these pages are disposable. To vary the amount of memory in the system, the cpuplugd daemon either assigns memory to or withdraws it from the CMM pool.
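To make the ballooning idea concrete, here is a minimal toy model (not the real cmm module, and all sizes are made-up sample values): growing the pool shrinks the memory Linux can use, while z/VM may reuse the pooled pages for other guests.

```python
PAGE_KIB = 4  # base page size used in the CMM pool accounting

class ToyBalloon:
    """Toy model of a CMM-style balloon; illustration only."""

    def __init__(self, total_pages):
        self.total_pages = total_pages
        self.cmm_pool = 0  # pages handed over to the hypervisor

    def plug(self, pages):
        """Return pages from the CMM pool to Linux (usable memory grows)."""
        pages = min(pages, self.cmm_pool)
        self.cmm_pool -= pages

    def unplug(self, pages):
        """Move pages from Linux into the CMM pool (usable memory shrinks)."""
        self.cmm_pool += pages

    def usable_kib(self):
        return (self.total_pages - self.cmm_pool) * PAGE_KIB

b = ToyBalloon(total_pages=262144)  # a 1 GiB guest
b.unplug(65536)                     # balloon up by 256 MiB
print(b.usable_kib())               # 786432 KiB usable
b.plug(65536)
print(b.usable_kib())               # 1048576 KiB, back to full size
```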
There are two mechanisms managing the memory: an asynchronous process, called the kswapd daemon, and a synchronous mechanism, called direct page scans. The kswapd daemon is triggered when the amount of free pages falls below certain high water marks. The synchronous mechanism is triggered by a memory request that could not be served, which delays the requester. We got very good results when using only the direct scans, as in the following calculations. If this causes systems that are too small, kswap scans as used in Memory plugging configuration 3 can be included. The direct page scans are counted across all memory zones:

```
pgscan_d="vmstat.pgscan_direct_dma[0] + vmstat.pgscan_direct_normal[0] + vmstat.pgscan_direct_movable[0]"
pgscan_d1="vmstat.pgscan_direct_dma[1] + vmstat.pgscan_direct_normal[1] + vmstat.pgscan_direct_movable[1]"
```
Only the current situation is considered. This is important if only direct scans are used as criteria, because the occurrence of direct page scans indicates that an application delay has already occurred. The scan rate is the difference between the two samples, normalized by the elapsed CPU ticks:

```
pgscanrate="(pgscan_d - pgscan_d1) / (cpustat.total_ticks[0] - cpustat.total_ticks[1])"
```
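The pgscanrate calculation can be sketched as follows; the counter values and tick counts are made-up samples, with index [0] as the current interval and [1] as the previous one, as in the cpuplugd rules:

```python
def pgscan_direct_total(sample):
    # Sum the direct-scan counters across the DMA, normal, and movable
    # zones, mirroring the pgscan_d / pgscan_d1 definitions.
    return (sample["pgscan_direct_dma"]
            + sample["pgscan_direct_normal"]
            + sample["pgscan_direct_movable"])

def pgscanrate(cur, prev, ticks_cur, ticks_prev):
    # Direct scans per CPU tick between the two samples.
    return ((pgscan_direct_total(cur) - pgscan_direct_total(prev))
            / (ticks_cur - ticks_prev))

cur  = {"pgscan_direct_dma": 0, "pgscan_direct_normal": 5200, "pgscan_direct_movable": 300}
prev = {"pgscan_direct_dma": 0, "pgscan_direct_normal": 1200, "pgscan_direct_movable": 100}
print(pgscanrate(cur, prev, ticks_cur=1200, ticks_prev=1000))  # 21.0
```

With these sample numbers the rate of 21.0 would just exceed the MEMPLUG threshold of 20 used below.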
The memory reported as cache consists mostly of page cache and shared memory. The shared memory is memory used by applications and should not be touched, whereas the page cache can roughly be considered as free memory. This is especially the case if there are no applications running which perform a high volume of disk I/O transfers through the page cache:

```
avail_cache="meminfo.Cached - meminfo.Shmem"
```
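The avail_cache value can be computed from /proc/meminfo-style text, where Cached and Shmem are both reported in KiB; the sample text below is made up for illustration:

```python
def meminfo_kib(text, key):
    # Return the KiB value of one /proc/meminfo field, e.g. "Cached".
    for line in text.splitlines():
        if line.startswith(key + ":"):
            return int(line.split()[1])
    raise KeyError(key)

sample = """\
MemTotal:        1048576 kB
MemFree:          204800 kB
Cached:           307200 kB
Shmem:             51200 kB
"""

# Page cache minus shared memory: the roughly reclaimable part of the cache.
avail_cache = meminfo_kib(sample, "Cached") - meminfo_kib(sample, "Shmem")
print(avail_cache)  # 256000 KiB
```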
```
CMM_MIN="0"
CMM_MAX="1245184"
```

CMM_MIN specifies the minimum size of the CMM pool in pages. A value of zero pages allows the full removal of the pool. As maximum value (CMM_MAX), a very large value of 1,245,184 pages (4,864 MiB) was used, which would stop the pool from growing only when less than 256 MiB of memory remain. In practice the pool never reached that limit, because the indicators for memory shortage were triggered earlier and stopped the pool from growing.
```
CMM_INC="meminfo.MemFree / 40"
CMM_DEC="meminfo.MemTotal / 40"
```

These values are specified in pages (4 KiB each), so KiB-based values such as the data in meminfo must be divided by a factor of 4; for example, 40 KiB is 10 pages. CMM_INC is defined as a percentage of the free memory (here, 10%). This causes the increment of the CMM pool to become smaller and smaller the closer the system comes to the 'ideal' configuration. CMM_DEC is defined as a percentage of the total system size (here, 10%). This leads to a relatively fast decrement of the CMM pool (that is, providing free memory to the system) whenever an indicator of a memory shortage is detected.
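The divisor of 40 combines both conversions described above: dividing by 4 converts KiB to pages, and dividing by 10 takes 10%. A short sketch with made-up meminfo values:

```python
# Sample /proc/meminfo values in KiB (made up for illustration).
mem_free_kib = 409600    # MemFree: 400 MiB free
mem_total_kib = 1048576  # MemTotal: 1 GiB total

# /40 == /4 (KiB -> 4 KiB pages) combined with /10 (take 10%).
cmm_inc = mem_free_kib // 40   # pool increment: 10% of free memory, in pages
cmm_dec = mem_total_kib // 40  # pool decrement: 10% of total memory, in pages

print(cmm_inc)  # 10240 pages == 40 MiB == 10% of the 400 MiB free
print(cmm_dec)  # 26214 pages, about 10% of the 1 GiB total
```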
Memory is moved from the CMM pool to the system (plugged) when the direct scan rate exceeds a small value. Memory is moved from the system to the CMM pool (unplugged) if more than 10% of the total memory is considered as unused, where 'unused' includes the page cache:

```
MEMPLUG="pgscanrate > 20"
MEMUNPLUG="(meminfo.MemFree + avail_cache) > (meminfo.MemTotal / 10)"
```
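The decision logic of these two rules can be sketched as below; the sample values are made up and in KiB (meminfo units), and checking MEMPLUG first is our own simplification rather than documented cpuplugd behavior:

```python
def decide(pgscanrate, mem_free, avail_cache, mem_total):
    if pgscanrate > 20:
        return "plug"    # direct scans signal memory pressure: grow the system
    if (mem_free + avail_cache) > mem_total / 10:
        return "unplug"  # >10% effectively unused: shrink the system
    return "hold"

# Pressure: direct scans occurring, little free memory.
print(decide(pgscanrate=35, mem_free=51200, avail_cache=20480, mem_total=1048576))   # plug
# Plenty of free memory plus cache (128000 KiB > 104857.6 KiB).
print(decide(pgscanrate=0, mem_free=102400, avail_cache=25600, mem_total=1048576))   # unplug
# Neither condition met.
print(decide(pgscanrate=0, mem_free=51200, avail_cache=20480, mem_total=1048576))    # hold
```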
Note:
- For small systems (<0.5 GB) the 10% limit might be reduced to 5%.
- The increments for the CMM pool are always smaller than the smallest value created here, to allow an iterative approach to reducing the volatile memory.
- In case a workload depends on page cache caching, such as a database performing normal file system I/O, an increase of the limit specified in the MEMUNPLUG rule could improve the performance significantly. For most application caching behavior, add twice the I/O throughput rate (read + write) in KiB as a start value to the recommended 10% in our rule. For example, for a total throughput of 200 MB/sec:

```
MEMUNPLUG="(meminfo.MemFree + avail_cache) > (400*1024 + meminfo.MemTotal / 10)"
```

In case of applications with a special demand on the page cache, even higher values might be required.
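The arithmetic behind the 400*1024 term can be checked as follows; the 4 GiB guest size is a made-up sample, while the 200 MB/sec throughput is the example from the text:

```python
mem_total_kib = 4 * 1024 * 1024  # sample 4 GiB guest, in KiB
throughput_mb_per_sec = 200      # total I/O throughput (read + write)

# Twice the throughput in KiB, as the rule recommends: 2 * 200 MB = 400*1024 KiB.
extra_kib = 2 * throughput_mb_per_sec * 1024

# Adjusted unplug threshold: throughput allowance plus 10% of total memory.
threshold_kib = extra_kib + mem_total_kib // 10

print(extra_kib)      # 409600
print(threshold_kib)  # 829030
```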
Installation of z/VM APAR VM65060 is a requirement when memory management via cpuplugd is planned. It reduces the amount of steal time significantly; more details are in Memory plugging and steal time. It is available for z/VM 5.4, z/VM 6.1, and z/VM 6.2.

Memory hotplug seems to be workload dependent. This paper gives a basis to start from; a server type-dependent approach for our scenario could look as follows:
| Server type | Memory size | CMM_INC | Unplug when |
|---|---|---|---|
| Web server | < 0.5 GB | free mem / 40 | (free mem + page cache) > 5% |
| Application server | < 2 GB | free mem / 40 | (free mem + page cache) > 5% |
| Database server | ≥ 0.5 GB | (free mem + page cache) / 40 | (free mem + page cache) > 5% |
| Combo | > 2 GB | free mem / 40 | (free mem + page cache) > 10% |