khos
15 Posts

Pinned topic HT number clarification

‏2013-12-13T22:46:37Z | cpu number of on top

We have an IBM System p 7 750.

The server has 32 cores (8 chips, 4 cores per chip, SMT=4). In an LPAR with 10 cores we see 80 CPUs in the output of "mpstat -d".

We are looking for help determining the number of CPUs (HT) so that we can interpret per-process CPU usage on the "TOP" pages. My understanding is that the TOP sheet/page shows each process's CPU usage per CPU. So if process A's utilization is 10% of CPU, then that process is taking about 10% of 40 CPUs, not 10% of the LPAR. In this LPAR's case, should we interpret a CPU% of 10% as 10% of 40 CPUs? Or is it 10% of 80 CPUs, according to the mpstat -d report?

Thanks in advance for any help, comments, or pointer to a document.

The mpstat -d output lists CPU 0 through CPU 79, while we were expecting to see CPU 0 to 39, for a total of 40 HT. Are we interpreting this data correctly?

Here is the "BBBL" page from nmon analyzer:

lparno 5

CPU in sys 32            
Virtual CPU 28            
Logical CPU 112            
Pool CPU 32            
smt threads 4            
capped 0            
min Virtual 4            
max Virtual 32            
min Logical 4            
max Logical 128            
min Capacity 4            
max Capacity 16            
Entitled Capacity 15.5            
Weight 128            
min Memory MB 40960            
max Memory MB 122880            
online Memory 49152            
pool id 0            
Flags LPARed DRable SMT Shared UnCapped Migratable Not-Donating AMSable.

 

  • nagger
    1731 Posts

    Re: HT number clarification

    ‏2013-12-16T12:20:14Z  

    Hi,

    You keep stating it is a 10-core LPAR/VM, but the lparstat -i output says the VM has an Entitlement of 15.5 (guaranteed CPU time) and is running with the Virtual CPU number set to 28. With SMT=4 you get 28 x 4 = 112 Logical CPUs.
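The virtual-to-logical CPU arithmetic quoted here can be sketched in a few lines of Python. The function name `logical_cpus` is made up for illustration; the inputs are the Virtual CPU and SMT values shown on the BBBL page.

```python
def logical_cpus(virtual_cpus: int, smt_threads: int) -> int:
    """Logical CPUs seen by the OS: one per SMT thread on each virtual CPU."""
    return virtual_cpus * smt_threads


# Values taken from the BBBL page in the first post.
print(logical_cpus(28, 4))  # 112, matching "Logical CPU 112"
print(logical_cpus(10, 4))  # 40, what a 10-core LPAR would have shown
```

The same multiplication also links "max Virtual 32" to "max Logical 128" on that page.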

    Next, you have to move away from Utilisation maths on Logical CPUs. It does not work, for two reasons.

    1) the four SMT threads are time-sharing units in the single physical CPU core, so you don't get 4 times the work done - it is more like 50% extra, or 1.5 times

    2) the Utilisation is relative to the Entitlement, BUT anything above the Entitlement on an unCapped LPAR/VM is still reported as near 100%, even though you could max out at 10 times the CPU time.
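As a rough illustration of point 2, here is a toy model in Python. It is an assumption made for illustration, not how AIX actually computes its figures: reported utilisation is taken as physical CPU consumed divided by Entitlement, clamped at 100%. The function name and the sample consumption values are hypothetical; only the Entitlement of 15.5 comes from this thread.

```python
def reported_utilisation(physical_used: float, entitlement: float) -> float:
    """Toy model: utilisation relative to entitlement, saturating at 100%."""
    return min(100.0, 100.0 * physical_used / entitlement)


ENTITLEMENT = 15.5  # from the lparstat -i / BBBL output in this thread

# Hypothetical amounts of physical CPU consumed by the LPAR.
for physical_used in (7.75, 15.5, 20.0, 28.0):
    print(physical_used, reported_utilisation(physical_used, ENTITLEMENT))
```

In this simplified model, 15.5, 20 and 28 cores consumed all report the same 100%, which is why the physical CPU figure is the more informative number on an uncapped LPAR.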

    I recommend you look at the Utilisation only for the ratio between User time and System time. For everything else, look at the Physical CPU time or, as nmon calls it on screen, CPU_Used.

    Please don't use "HT" to Power guys - that is some horrid Intel thing and SMT is very different and more efficient :-)

    I hope that helps, Nigel Griffiths

  • khos
    15 Posts

    Re: HT number clarification

    ‏2013-12-16T17:19:21Z  
    Nigel,

    Thank you. That helps my understanding of Logical CPUs, and I agree on the SMT thread CPU percentage weight/distribution.

    We generally archive NMON data into a file and use that file to help us determine the characteristics of an application.
    That application consists of 4-5 different processes. We need to use the NMON SCPU_ALL page to help determine the application's CPU usage as a whole.


    For example:

    SCPU
    Total       User    Sys     Wait    Idle    CPU%
    14:26:14    2.6     1.68    0       2.67    4.28    0.6634

    That shows our application was using about 0.6634 physical CPUs at the 14:26:14 snapshot, given that we have a CPU Entitlement of 15.5: (4.28% * 15.5 CPU) = 0.6634.
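The arithmetic in this paragraph can be reproduced with a short Python sketch. It assumes, as the post does, that CPU% is relative to the Entitlement; the thread does not confirm that assumption for SCPU data. The function name `cores_from_percent` is hypothetical.

```python
import math


def cores_from_percent(cpu_percent: float, entitlement: float) -> float:
    """Scale a CPU% figure by the entitlement to get a core count.

    Assumption: the percentage is relative to the entitlement.
    """
    return cpu_percent / 100.0 * entitlement


user_pct, sys_pct = 2.6, 1.68       # User and Sys values from the SCPU sample
cpu_pct = user_pct + sys_pct        # about 4.28, the CPU% column
cores = cores_from_percent(cpu_pct, 15.5)

print(round(cpu_pct, 2), round(cores, 4))
assert math.isclose(cores, 0.6634, rel_tol=1e-9)
```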

    We are also trying to determine the CPU usage of those 4-5 processes from test to test, using the "TOP" page/sheet and looking at %CPU, %Usr, and %Sys to see how their CPU usage varies.

    As an example:

    A process's CPU consumption is 5.79 (%CPU), with %Usr of 4.55 and %Sys of 1.24, giving a sys/user ratio of 27.25%.
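The ratio in this example can be checked with a small Python sketch. The function name is hypothetical; the inputs are the %Usr and %Sys figures quoted for the process.

```python
def sys_to_user_ratio(usr_pct: float, sys_pct: float) -> float:
    """System time expressed as a percentage of user time."""
    return 100.0 * sys_pct / usr_pct


usr, sys_ = 4.55, 1.24  # %Usr and %Sys from the TOP example

print(round(usr + sys_, 2))                    # total %CPU for the process
print(round(sys_to_user_ratio(usr, sys_), 2))  # sys/user ratio in percent
```

The two printed values match the 5.79 %CPU and the 27.25% ratio stated in the example.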

    Would you agree with my assumption?   Thanks in advance for your comments or a pointer.

    Of course, I have 112 Logical CPUs with 28 Virtual CPUs and SMT=4. Please excuse my use of incorrect terminology in the SMT note.

    Updated on 2013-12-16T17:21:14Z by khos
  • nagger
    1731 Posts

    Re: HT number clarification

    ‏2013-12-18T09:17:43Z  
    Khos,

    SCPU is based on logical CPUs (from my memory), so I would not do maths on them.

    The ratio is right but most guys use a user:system time ratio :-) 75% higher being good.

    I think you have to monitor the physical CPU use, but then look at the Top Processes stats as a completely different set of numbers - getting the two sets to add up and agree is largely pointless. Only in some low-end special cases (E=1, VP=1, capped, SMT=1) can you use maths to convert logical to physical stats.

    Cheers, Nigel Griffiths.