nmon CPU graphs - Why are the PCPU_ALL graphs lower?
nagger
I have been looking at some nmon data from an IBMer investigating a customer's machine. The virtual machine (LPAR) is running the Oracle RDBMS with 75 dedicated CPUs - on a POWER7 Power 795 at 4 GHz.
Anyway, back to the plot: the Summary CPU and Disk I/O graphs (see below) look good, with a CPU peak in the 13:100 to 14:30 period that is just under the 75 maximum. As these are dedicated CPUs, the CPU time in non-busy periods is just thrown away running a Wait loop and there is no option to borrow extra CPU cycles at the peaks, but that is a debate for another time. I suspect it would like a couple more CPUs. The Run Queue is roughly 140 through the peak - that is a little low, as we have 75 CPU cores each with four logical threads (SMT=4), so we could run at most 300 concurrent process threads.
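To make the thread arithmetic above concrete, here is a minimal sketch. The figures come from this post; the variable names are made up for illustration, and treating the Run Queue as a rough count of runnable threads is a simplifying assumption.

```python
# Figures quoted in the post; names are illustrative only.
cores = 75        # dedicated CPU cores in the LPAR
smt_threads = 4   # SMT=4: four logical threads per core
run_queue = 140   # approximate Run Queue through the peak

# Upper bound on concurrently running process threads.
logical_cpus = cores * smt_threads

# Assumption: the Run Queue approximates the number of runnable threads.
threads_per_core = run_queue / cores
thread_fill = run_queue / logical_cpus

print(f"logical CPUs: {logical_cpus}")
print(f"runnable threads per core: {threads_per_core:.2f}")
print(f"share of hardware threads with work: {thread_fill:.0%}")
```

On these numbers there are fewer than two runnable threads per core on average, which fits the remark that the Run Queue is a little low for an SMT=4 machine.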
But the PCPU_ALL graphs look different (see below). These graphs show the CPU Utilisation numbers on a Physical CPU basis. Here the peak only reaches roughly 60 - 62 CPU cores. So why is this lower than the other two graphs?
I actually get a lot of questions about the utilisation numbers so my second response was "oh no here we go again!".
* * * WARNING: BLUNTNESS ALERT ON * * *
By definition you can't go over 100% utilisation. We all laugh at the silly leaders demanding 150% effort on their projects. To stop downstream performance stats applications from breaking on utilisation numbers of more than 100%, it was decided to make fully using your CPU Entitlement = 100% utilisation and NOT report over 100% of CPU time - even when you are using much more CPU time. The theoretical maximum (assuming the highest possible virtual processor count of 10 times Entitlement) is 1000% (one thousand percent).
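The capping described above can be sketched in a few lines. This is only an illustration of the idea, not how nmon or AIX actually computes its numbers; the function names and the example Entitlement of 2.0 cores are hypothetical.

```python
def raw_utilisation(used_cores: float, entitlement: float) -> float:
    """CPU time used as a percentage of Entitlement, with no cap."""
    return 100.0 * used_cores / entitlement


def capped_utilisation(used_cores: float, entitlement: float) -> float:
    """The same figure clipped at 100%, as the reported numbers are."""
    return min(100.0, raw_utilisation(used_cores, entitlement))


# Hypothetical uncapped LPAR: Entitlement of 2.0 cores.
entitlement = 2.0
# Assumption from the post: virtual processors at most 10 times Entitlement.
max_virtual_processors = 10 * entitlement

for used in (1.0, 3.0, max_virtual_processors):
    print(f"used {used:4.1f} cores: "
          f"raw {raw_utilisation(used, entitlement):6.1f}%  "
          f"reported {capped_utilisation(used, entitlement):5.1f}%")
```

With every virtual processor busy, the raw figure reaches the 1000% theoretical maximum while the reported figure stays pinned at 100%.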
If I had to guess, the Utilisation numbers in our PCPU_ALL graph (above) have been scaled down from 75 cores to roughly 62 cores to "show" that some SMT threads are unused, so the CPU cores are not fully used (and given enough threads it could give you more performance, roughly 10 - 15% more). Now, in my humble opinion, this is totally the wrong way of doing this as it is just plain confusing. There is no clue in the 62 CPU cores that it is the lack of threads that is the bottleneck, and it is completely untrue that the 15 cores of difference (75 - 60) are not used - all 75 CPU cores are being used, just not to the full extent of having all four SMT threads pulling their weight. OK, you could reduce the Virtual Processor count to, say, 65 to force higher SMT use, but at the expense of some response time.
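For a sense of scale, this small sketch works out the gap between the PCPU_ALL peak and the 75 allocated cores. It is plain arithmetic on the figures quoted above, not a performance prediction; the helper name is hypothetical.

```python
def reported_share(peak_cores: float, allocated_cores: float) -> float:
    """Reported physical CPU peak as a percentage of allocated cores."""
    return 100.0 * peak_cores / allocated_cores


allocated = 75  # dedicated cores in the LPAR, from the post

for peak in (60, 62):  # approximate PCPU_ALL peak range, from the post
    gap = allocated - peak
    print(f"peak {peak} cores: {gap} cores below the allocation, "
          f"{reported_share(peak, allocated):.1f}% of it")
```

So the graph reports roughly 80% of the allocation at the peak, even though all 75 cores have work dispatched to them.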
In this case, we have dedicated CPUs and can forget the effects of going over Entitlement on Uncapped LPARs.
* * * WARNING: BLUNTNESS ALERT OFF * * *
For more information see - Unde