Massive Memory Savings on your POWER6 and POWER7 machines
I was with a large UK-based insurance company last week. They, and their facilities management team, run many very large machines such as the POWER6 595 and POWER7 Power 770/780. Most machines have 64 cores and 2 TB of memory and run 160+ virtual machines (LPARs) each. These are large, complex and well-run machines - all of the AIX images and VIOS are pretty well up to date with current or N-1 releases. It was a very pleasant surprise and I was impressed.
They told me that even with 2 TB of memory they tend to run out of memory before CPU cycles. Each production service has 6 to 10 virtual machines for the various components, plus test, dev, training, etc., and for all the various services for the web apps. This means each one locks down memory but is then not actually active a lot of the time. CPU cycles can be "borrowed" in milliseconds by using Shared CPUs, but with so many virtual machines, moving memory around using dynamic changes is not practical. I was there to tell them the technical features and best practice for three technologies, which have all been available for 1 to 3 years. Together these can "fix" the memory problem.
Workload Partitions (WPAR) - out for 3 years
Pre-req: any AIX 6 or AIX 7 virtual machine. Running WPARs is a standard feature of AIX 6+, but Workload Partition Manager is recommended and gives you Live WPAR Mobility (Relocation to techies) and a neat GUI for Systems Director.
Here we use one AIX in memory (say about 1 GB of memory) to run many WPARs - which are groups of processes that behave like mini copies of AIX but share the one global AIX kernel. If you run 10 WPARs in 1 global AIX instead of 10 separate LPARs, that is 9 GB of memory saved. Then if you install the applications or middleware in the global AIX, the WPARs can share the binaries in memory too - another (say) 9 GB of memory. On this customer's systems this is a huge saving. OK, they have to rework their standard AIX image into a standard WPAR install, but they gain rapid deployment (like 90 seconds against an LPAR install of 20 minutes), plus global AIX super-duper user access to fix broken images (no more diag CD booting to fix a non-bootable LPAR), and lots of other benefits that you can see on this blog in previous entries.
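The WPAR arithmetic above can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: the function name and parameters are made up for illustration, the 1 GB kernel footprint is the "say about 1 GB" figure from the text, and the 1 GB of shared binaries per instance is simply what the "another (say) 9 GB" figure implies for 10 WPARs.

```python
def wpar_memory_saving_gb(instances, kernel_gb=1.0, shared_binaries_gb=0.0):
    """Estimate GB saved by running `instances` workloads as WPARs inside
    one global AIX instead of as `instances` separate LPARs.

    kernel_gb and shared_binaries_gb are illustrative per-copy footprints,
    not measured values.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    # Separate LPARs each hold their own copy; WPARs share a single copy,
    # so every copy beyond the first is saved.
    duplicated_copies = instances - 1
    return duplicated_copies * (kernel_gb + shared_binaries_gb)


# 10 WPARs sharing one 1 GB AIX kernel
print(wpar_memory_saving_gb(10))                          # 9.0
# ...and also sharing roughly 1 GB of application binaries
print(wpar_memory_saving_gb(10, shared_binaries_gb=1.0))  # 18.0
```

Real savings depend on how much of the application footprint is genuinely shareable, so treat the result as an upper-end planning number.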
Active Memory Expansion - available for 1 year
Pre-reqs: AIX 6 TL4 SP2+ or AIX 7, on POWER7. There is a 60 day trial and then a cost per machine via an activation code on the HMC. Virtual machines must be pure virtual = no physical adapters, and shared CPUs.
Here AIX, conceptually, compresses memory pages and sneaks them into a paging area within a RAM disk. The benefit varies with the compression ratio of the memory pages, which is impossible to know in advance, so there is an "amepat" command that (even if AME is not active) tells you the compression rates and the CPU cycles that will be used. There is a trade-off between CPU cycles and the compression achieved - if you demand too high an expansion it costs CPU time, but there is a sweet spot where, for very little CPU, you get a large extra chunk of apparent memory - very good for Java applications as they typically have poor memory management. AME is switched on/off on the HMC and needs a restart of the virtual machine. Then you can dynamically set the expansion target. A target of zero expansion is effectively off but ready to use. You can decide either to grow the apparent RAM in the virtual machine or to shrink its real memory and give up RAM for other virtual machines. In practice, I have found an Expansion Factor target of 1.5 to 2.2 works for me, but it is highly workload dependent (the amepat command lets you know yours). For this customer that may mean 2 TB looking like 3 TB or more - Wow! That is a serious saving.
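To put numbers on the expansion factor, here is a minimal Python sketch. It assumes the simple model that apparent memory is real memory multiplied by the Expansion Factor; the function names are hypothetical, and the CPU cost of compression (which amepat reports) is not modelled at all.

```python
def apparent_memory_tb(true_memory_tb, expansion_factor):
    """Memory the virtual machine appears to have for a given factor.

    Assumes apparent = real x factor, with factors below 1.0 rejected
    because they would shrink memory rather than expand it.
    """
    if expansion_factor < 1.0:
        raise ValueError("expansion_factor must be 1.0 or more")
    return true_memory_tb * expansion_factor


def true_memory_needed_tb(apparent_wanted_tb, expansion_factor):
    """Real RAM needed to present `apparent_wanted_tb` at a given factor."""
    if expansion_factor < 1.0:
        raise ValueError("expansion_factor must be 1.0 or more")
    return apparent_wanted_tb / expansion_factor


# The 2 TB machine from the text at a factor of 1.5
print(apparent_memory_tb(2.0, 1.5))  # 3.0

# Or keep the apparent size at 2 TB and see how much real RAM that needs
for factor in (1.5, 2.0, 2.2):
    print(factor, round(true_memory_needed_tb(2.0, factor), 2))
```

The second loop is the "shrink" option: the same apparent memory served from less real RAM, with the difference freed for other virtual machines.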
Active Memory Sharing - out for nearly 2 years
Pre-reqs are fairly extensive but have all been available since May 2009 - POWER6 or above, Firmware (P6) 342+, HMC 7.3.4 SP2+, VIOS 2.1.1+, AIX 6 TL03+, PowerVM Enterprise, activated on the HMC via an activation code. Again, pure virtual. Smart and pro-active customers will already be at these levels.
Here the memory of the machine is pooled together and virtual machines are allocated logical memory (not dedicated) from the pool (a similar concept to shared CPU pools). The LPARs then need restarting in AMS mode. The pool can be over-committed - so not everyone can get their full allocation at the same time. When memory gets short, the Hypervisor will request memory page loaning, so virtual machines can help each other out and get memory to the memory-hungry, high-demand one that is paging locally. If yet more memory is needed, the Hypervisor pages out old pages from the virtual machines to a friendly VIOS that has specially assigned paging devices for this use. Then the pages can be given to the other virtual machine. In practice, this flows memory to the virtual machine that is paging the most. Due to the need to page out, this will not throw GB/second of memory around - think more like a few tens of MB arriving per second on a large machine.
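A toy Python model can make the over-commit idea concrete. This is a sketch only: the function names, LPAR names and sizes are invented for illustration, it ignores page loaning between virtual machines, and it is not how the Hypervisor actually decides what to page out.

```python
def ams_pool_summary(pool_gb, logical_gb, demand_gb):
    """Summarise a shared memory pool.

    pool_gb    -- physical memory in the shared pool
    logical_gb -- dict of LPAR name -> logical memory allocated
    demand_gb  -- dict of LPAR name -> memory actually in use right now
    """
    total_logical = sum(logical_gb.values())
    # An LPAR can never use more than its logical allocation.
    total_demand = sum(min(demand_gb[name], logical_gb[name])
                       for name in logical_gb)
    return {
        "overcommit_ratio": total_logical / pool_gb,
        # Demand that does not fit in the pool has to live on VIOS paging devices.
        "paged_to_vios_gb": max(0.0, total_demand - pool_gb),
    }


def seconds_to_move(gb, mb_per_second):
    """Time for `gb` of memory to arrive at a given flow rate."""
    return gb * 1024 / mb_per_second


logical = {"web": 8, "app": 8, "db": 8}      # hypothetical LPARs, 24 GB logical
quiet = {"web": 2, "app": 3, "db": 8}        # most LPARs idle
busy = {"web": 8, "app": 8, "db": 4}         # demand exceeds the 16 GB pool

print(ams_pool_summary(16, logical, quiet))
print(ams_pool_summary(16, logical, busy))
print(seconds_to_move(1, 50))                # 1 GB at 50 MB/s
```

In the quiet case the 1.5x over-committed pool needs no paging at all; in the busy case 4 GB ends up on the VIOS paging devices, and at tens of MB per second that takes a noticeable time to flow, which is why AMS suits workloads whose peaks do not coincide.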
My customer decided to immediately try:
I would not be surprised if they save between 512 GB and 1 TB of memory, so they can make full use of the CPU power. They said they would update me on their tests, and I hope to let you know how they get on!
I hope this quick blog summary of memory saving options is useful. Thanks, Nigel Griffiths