Memory plugging
To vary the amount of memory in a system, the cpuplugd daemon uses a ballooning technique provided by the Linux® cmm module.
The cmm module manages a memory pool, called the CMM pool. Memory in that pool is 'in use' from the Linux operating system point of view, and therefore not available to Linux, but it is eligible for z/VM® in case memory pages are needed elsewhere. The cmm module notifies the z/VM hypervisor that these pages are disposable. To vary the amount of memory in the system, the cpuplugd daemon either assigns memory to or withdraws it from the CMM pool.
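To make the ballooning idea concrete, here is a minimal toy model (not the real cmm module, and all sizes are made-up sample values): growing the pool shrinks the memory Linux can use, while z/VM may reuse the pooled pages for other guests.

```python
PAGE_KIB = 4  # base page size used in the CMM pool accounting

class ToyBalloon:
    """Toy model of a CMM-style balloon; illustration only."""

    def __init__(self, total_pages):
        self.total_pages = total_pages
        self.cmm_pool = 0  # pages handed over to the hypervisor

    def plug(self, pages):
        """Return pages from the CMM pool to Linux (usable memory grows)."""
        pages = min(pages, self.cmm_pool)
        self.cmm_pool -= pages

    def unplug(self, pages):
        """Move pages from Linux into the CMM pool (usable memory shrinks)."""
        self.cmm_pool += pages

    def usable_kib(self):
        return (self.total_pages - self.cmm_pool) * PAGE_KIB

b = ToyBalloon(total_pages=262144)  # a 1 GiB guest
b.unplug(65536)                     # balloon up by 256 MiB
print(b.usable_kib())               # 786432 KiB usable
b.plug(65536)
print(b.usable_kib())               # 1048576 KiB, back to full size
```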
There are two mechanisms managing the memory: an asynchronous process, called the kswapd daemon, and a synchronous mechanism, called direct page scans. The kswapd daemon is triggered when the amount of free pages falls below certain high water marks. The synchronous mechanism is triggered by a memory request that could not be served, which delays the requester. We got very good results when using only the direct scans, as in the following calculations. If this causes systems that are too small, kswap scans as used in Memory plugging configuration 3 can be included. The direct page scans are counted across all memory zones:

```
pgscan_d="vmstat.pgscan_direct_dma[0] + vmstat.pgscan_direct_normal[0] + vmstat.pgscan_direct_movable[0]"
pgscan_d1="vmstat.pgscan_direct_dma[1] + vmstat.pgscan_direct_normal[1] + vmstat.pgscan_direct_movable[1]"
```
Only the current situation is considered. This is important if only direct scans are used as criteria, because the occurrence of direct page scans indicates that an application delay has already occurred. The scan rate is the difference between the two samples, normalized by the elapsed CPU ticks:

```
pgscanrate="(pgscan_d - pgscan_d1) / (cpustat.total_ticks[0] - cpustat.total_ticks[1])"
```
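The pgscanrate calculation can be sketched as follows; the counter values and tick counts are made-up samples, with index [0] as the current interval and [1] as the previous one, as in the cpuplugd rules:

```python
def pgscan_direct_total(sample):
    # Sum the direct-scan counters across the DMA, normal, and movable
    # zones, mirroring the pgscan_d / pgscan_d1 definitions.
    return (sample["pgscan_direct_dma"]
            + sample["pgscan_direct_normal"]
            + sample["pgscan_direct_movable"])

def pgscanrate(cur, prev, ticks_cur, ticks_prev):
    # Direct scans per CPU tick between the two samples.
    return ((pgscan_direct_total(cur) - pgscan_direct_total(prev))
            / (ticks_cur - ticks_prev))

cur  = {"pgscan_direct_dma": 0, "pgscan_direct_normal": 5200, "pgscan_direct_movable": 300}
prev = {"pgscan_direct_dma": 0, "pgscan_direct_normal": 1200, "pgscan_direct_movable": 100}
print(pgscanrate(cur, prev, ticks_cur=1200, ticks_prev=1000))  # 21.0
```

With these sample numbers the rate of 21.0 would just exceed the MEMPLUG threshold of 20 used below.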
The memory reported as cache consists mostly of page cache and shared memory. The shared memory is memory used by applications and should not be touched, whereas the page cache can roughly be considered as free memory. This is especially the case if there are no applications running which perform a high volume of disk I/O transfers through the page cache:

```
avail_cache="meminfo.Cached - meminfo.Shmem"
```
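The avail_cache value can be computed from /proc/meminfo-style text, where Cached and Shmem are both reported in KiB; the sample text below is made up for illustration:

```python
def meminfo_kib(text, key):
    # Return the KiB value of one /proc/meminfo field, e.g. "Cached".
    for line in text.splitlines():
        if line.startswith(key + ":"):
            return int(line.split()[1])
    raise KeyError(key)

sample = """\
MemTotal:        1048576 kB
MemFree:          204800 kB
Cached:           307200 kB
Shmem:             51200 kB
"""

# Page cache minus shared memory: the roughly reclaimable part of the cache.
avail_cache = meminfo_kib(sample, "Cached") - meminfo_kib(sample, "Shmem")
print(avail_cache)  # 256000 KiB
```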
```
CMM_MIN="0"
CMM_MAX="1245184"
```

CMM_MIN specifies the minimum size of the CMM pool in pages. A value of zero pages allows the full removal of the pool. As maximum value (CMM_MAX), a very large value of 1,245,184 pages (4,864 MiB) was used, which would stop the pool from growing only when less than 256 MiB of memory remain. In practice the pool never reached that limit, because the indicators for memory shortage were triggered earlier and stopped the pool from growing.
```
CMM_INC="meminfo.MemFree / 40"
CMM_DEC="meminfo.MemTotal / 40"
```

These values are specified in pages (4 KiB each), so KiB-based values such as the data in meminfo must be divided by a factor of 4; for example, 40 KiB is 10 pages. CMM_INC is defined as a percentage of the free memory (here, 10%). This causes the increment of the CMM pool to become smaller and smaller the closer the system comes to the 'ideal' configuration. CMM_DEC is defined as a percentage of the total system size (here, 10%). This leads to a relatively fast decrement of the CMM pool (that is, providing free memory to the system) whenever an indicator of a memory shortage is detected.
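The divisor of 40 combines both conversions described above: dividing by 4 converts KiB to pages, and dividing by 10 takes 10%. A short sketch with made-up meminfo values:

```python
# Sample /proc/meminfo values in KiB (made up for illustration).
mem_free_kib = 409600    # MemFree: 400 MiB free
mem_total_kib = 1048576  # MemTotal: 1 GiB total

# /40 == /4 (KiB -> 4 KiB pages) combined with /10 (take 10%).
cmm_inc = mem_free_kib // 40   # pool increment: 10% of free memory, in pages
cmm_dec = mem_total_kib // 40  # pool decrement: 10% of total memory, in pages

print(cmm_inc)  # 10240 pages == 40 MiB == 10% of the 400 MiB free
print(cmm_dec)  # 26214 pages, about 10% of the 1 GiB total
```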
Memory is moved from the CMM pool to the system (plugged) when the direct scan rate exceeds a small value. Memory is moved from the system to the CMM pool (unplugged) if more than 10% of the total memory is considered as unused, where 'unused' includes the page cache:

```
MEMPLUG="pgscanrate > 20"
MEMUNPLUG="(meminfo.MemFree + avail_cache) > (meminfo.MemTotal / 10)"
```
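The decision logic of these two rules can be sketched as below; the sample values are made up and in KiB (meminfo units), and checking MEMPLUG first is our own simplification rather than documented cpuplugd behavior:

```python
def decide(pgscanrate, mem_free, avail_cache, mem_total):
    if pgscanrate > 20:
        return "plug"    # direct scans signal memory pressure: grow the system
    if (mem_free + avail_cache) > mem_total / 10:
        return "unplug"  # >10% effectively unused: shrink the system
    return "hold"

# Pressure: direct scans occurring, little free memory.
print(decide(pgscanrate=35, mem_free=51200, avail_cache=20480, mem_total=1048576))   # plug
# Plenty of free memory plus cache (128000 KiB > 104857.6 KiB).
print(decide(pgscanrate=0, mem_free=102400, avail_cache=25600, mem_total=1048576))   # unplug
# Neither condition met.
print(decide(pgscanrate=0, mem_free=51200, avail_cache=20480, mem_total=1048576))    # hold
```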
Note:
- For small systems (<0.5 GB) the 10% limit might be reduced to 5%.
- The increments for the CMM pool are always smaller than the smallest value created here, to allow an iterative approach to reducing the volatile memory.
- In case a workload depends on page cache caching, such as a database performing normal file system I/O, an increase of the limit specified in the MEMUNPLUG rule could improve the performance significantly. For most application caching behavior, add twice the I/O throughput rate (read + write) in KiB as a start value to the recommended 10% in our rule. For example, for a total throughput of 200 MB/sec:

```
MEMUNPLUG="(meminfo.MemFree + avail_cache) > (400*1024 + meminfo.MemTotal / 10)"
```

In case of applications with a special demand on the page cache, even higher values might be required.
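The arithmetic behind the 400*1024 term can be checked as follows; the 4 GiB guest size is a made-up sample, while the 200 MB/sec throughput is the example from the text:

```python
mem_total_kib = 4 * 1024 * 1024  # sample 4 GiB guest, in KiB
throughput_mb_per_sec = 200      # total I/O throughput (read + write)

# Twice the throughput in KiB, as the rule recommends: 2 * 200 MB = 400*1024 KiB.
extra_kib = 2 * throughput_mb_per_sec * 1024

# Adjusted unplug threshold: throughput allowance plus 10% of total memory.
threshold_kib = extra_kib + mem_total_kib // 10

print(extra_kib)      # 409600
print(threshold_kib)  # 829030
```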
Installation of z/VM APAR VM65060 is a requirement when memory management via cpuplugd is planned. It reduces the amount of steal time significantly; more details are in Memory plugging and steal time. It is available for z/VM 5.4, z/VM 6.1, and z/VM 6.2.

Memory hotplug seems to be workload dependent. This paper gives a basis to start from; a server type-dependent approach for our scenario could look as follows:
| Server type | Memory size | CMM_INC | Unplug when |
|---|---|---|---|
| Web server | < 0.5 GB | free mem / 40 | (free mem + page cache) > 5% |
| Application server | < 2 GB | free mem / 40 | (free mem + page cache) > 5% |
| Database server | ≥ 0.5 GB | (free mem + page cache) / 40 | (free mem + page cache) > 5% |
| Combo | > 2 GB | free mem / 40 | (free mem + page cache) > 10% |