POWER Relative Performance (rPerf) is often used as a way to approximate the expected difference in performance between two Power Systems servers. Although rPerf is a useful tool, it is important to understand its limitations when using it to estimate the performance of your specific workloads, in your particular environment, on a new server.
First, rPerf numbers, like any published benchmark, represent the best-case result, when the application, configuration and system resources are all optimized -- factors that are unlikely to all be optimized in your environment with your workloads. Additionally, the actual real-world performance of a given server might change over time through improvements or bugs in the firmware, AIX, or the application, while the official rPerf figure is seldom changed in the Power Systems Performance Report after the initial release of the server model.
Second, rPerf numbers are expressed in terms of throughput as opposed to speed. A newer server with twice the rPerf of an older server may be able to perform twice as many transactions as the older server, but not necessarily do the same number of transactions in half the time. This is because the doubling of transactions may come from doubling the number of threads of work rather than doubling the speed of each thread. This is especially important for older, single-threaded applications that cannot exploit the benefits of SMT.
Third, you need to understand that although Power Systems servers tend to scale fairly linearly, you cannot make the broad assumption that adding one core of extra capacity to an LPAR on a 32-core server will deliver exactly a further 1/32 of performance. Particularly as you move to larger servers, the topology of the cores and memory that are used can have a big effect on the cache and memory affinity of the LPAR, and thus on the actual performance. Similar caveats apply to LPARs smaller than the configurations behind the official rPerf numbers. The official rPerf numbers for current and older machines can be found here:
Having stated that, many people use rPerf as a simple way of preparing a guesstimate when planning new machines for existing POWER workloads, for both workload migration and server consolidation.
There is an excellent article in a recent IBM Systems Magazine on exactly this subject, so I refer you to that rather than duplicating it here. In outline, the approach is:
- Size by adding up the old boxes' rPerf numbers, scaling each rating by the number of CPU cores in the LPAR and scaling it down based on CPU utilisation
- Add guesstimate of new workloads
- Add guesstimate of growth
- Add a comfort factor, like the old 80:20 rule, to cover peaks in work
- Then match the rPerf requirement against a new machine or, if no single workload is very large, a pair of machines
- Then decide practical things like the number of CPUs and their GHz, memory to match, and then adapters to keep the data flowing in and out of the machine.
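The back-of-an-envelope arithmetic above can be sketched in a few lines of Python. All the rPerf ratings, core counts and utilisation figures below are invented examples for illustration, not real numbers from the Power Systems Performance Report:

```python
# Hedged sketch of the rPerf sizing guesstimate described above.
# All rPerf ratings, core counts and utilisation figures are invented
# example values, not taken from any IBM Facts and Features document.

def lpar_rperf(machine_rperf, machine_cores, lpar_cores, cpu_util):
    """Scale a machine's full rPerf rating down to one LPAR's share of the
    cores, then down again to the portion actually used (CPU utilisation)."""
    return machine_rperf * (lpar_cores / machine_cores) * cpu_util

# Old estate: (machine rPerf, machine cores, LPAR cores, peak utilisation)
old_lpars = [
    (45.0, 8, 4, 0.70),   # hypothetical 8-core box, 4-core LPAR, 70% busy
    (91.0, 16, 6, 0.55),  # hypothetical 16-core box, 6-core LPAR, 55% busy
]

required = sum(lpar_rperf(*lpar) for lpar in old_lpars)
required += 10.0          # guesstimate for new workloads
required *= 1.25          # guesstimate for growth (25%)
required /= 0.80          # 80:20 comfort factor: plan to run at most 80% busy

print(f"Target rPerf requirement: {required:.1f}")
```

The result is then compared against the published rPerf of candidate machines, remembering all the throughput-versus-speed caveats above.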
I have done this sort of exercise myself in the past. So let's talk about the things you should think about when using rPerf to make sizing or performance estimates. I call these my Ten Golden Rules: considerations to remember while using rPerf for sizing. These guidelines may help you achieve the performance that you expected.
- Highly threaded workloads - you need 2 to 3 times the total number of SMT threads in the LPAR as concurrently runnable work to make full use of the POWER7 processor. On a 5 CPU LPAR that is 20 SMT threads, so we need something like 40 or more concurrently active processes or process threads.
- Well tuned system - not out-of-the-box settings. This is normal tuning like disk queue depth, memory use and large network packets. Recent AIX versions are much better out of the box, but the basics still need to be monitored and checked. Tuning settings carried over from older machines and OS levels are unlikely to work perfectly.
- Full Spec RAM - all memory DIMM slots populated and plenty of memory. This is the configuration used in producing the rPerf numbers; unused DIMM slots or minimal memory will reduce performance. How much of a reduction? It is impossible to know without a benchmark.
- No Disk Issues - obviously a disk bottleneck is not overcome with a newer, faster CPU. Also no SAN bottlenecks.
- No Network Issues - as above.
- Current application, RDBMS, middleware and web server software levels - rPerf numbers for a new machine are measured with the latest software, and that contributes to performance. Running older software carried over from the old machines can cause a mismatch with your rPerf expectations.
- Latest AIX with Service Packs - like all benchmarks, the best operating system version is used to get those latest performance fixes and improvements.
- Large LPARs - rPerf numbers are not based on "micro-LPARs" (below a whole CPU) or small LPARs of just 1 or 2 CPUs. Check the Facts and Features document for the rPerf with the smallest number of CPU cores for your machine; below that number you are making assumptions, particularly as you go below boundaries in the machine like a drawer or a whole POWER7 chip, and below one CPU core.
- Firmware is Current - the firmware includes the Hypervisor, which contains many performance-enhancing tweaks, often based on field experience, that you need working for you. In particular, take an early opportunity to apply the first two updates after the initial release.
- Bug Free - users have to be willing to upgrade firmware, AIX and application software as necessary. Downtime needs to be built in to allow the removal of problems in the above; this is why PowerHA SystemMirror (HACMP) and Live Partition Mobility were developed. You have to be realistic and understand that you will need to move up to later levels of software and firmware to take advantage of improvements. IBM is currently completely out of "magic pixie dust" to fix problems without changes, nor can IBM promise that specific updates will fix all known problems in the universe.
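The arithmetic behind the first rule is simple enough to sketch. This assumes SMT-4 per core on POWER7 and the 2-to-3-times rule of thumb from the list above; the function name is just for illustration:

```python
# Rough sketch of rule 1: how much concurrently runnable work is needed
# to keep every SMT thread busy. SMT-4 per core is assumed (POWER7).

def threads_needed(lpar_cores, smt_per_core=4, multiplier=2):
    """Suggested minimum number of concurrently active processes or
    threads; use multiplier=3 for the upper end of the 2-3x rule."""
    return lpar_cores * smt_per_core * multiplier

# The 5-CPU LPAR example from the list: 20 SMT threads, so something
# like 40 to 60 runnable processes/threads to fully use the processor.
print(threads_needed(5))                 # lower end of the rule
print(threads_needed(5, multiplier=3))   # upper end
```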
If you are breaking a lot of these assumptions, then you may not get the performance you expected or hoped to achieve. If you break just one or two, and then only a little, then you never know - "you might get lucky". If you are somewhere between these options, then you may have to make a series of changes, upgrades or updates to achieve good performance on your new machine. Performance tuning is always an iterative process of removing one performance bottleneck in order to reveal the next; unleashing the full performance potential of the underlying computing platform requires systematic work.
But I hope this blog entry will help everyone to either get it right first time, or at least to set realistic expectations and outline some prime areas that will need investigation to realise the benefits. Thanks, Nigel