Moonlit Waters
3 Posts

Per-chip CPU power and temp?

‏2013-09-17T20:06:01Z |

One of the things we're doing with our 9125-F2C hardware is performing a variety of benchmarking experiments.  One measurement needed for at least one of the benchmarks is per-chip power and temperature monitoring for each physical CPU.

Imagine partitioning a 9125-F2C CEC into 32 LPARs, one per physical POWER7 chip in the drawer.  Right now we can get temperature and power statistics through several mechanisms, including the xCAT rvitals and renergy commands, but these are granular only down to the octant, i.e. the QCM (quad-chip module), which is a collection of 4 POWER7 chips.  So if we had four LPARs running on one QCM, we could measure environmental data only for the four LPARs in aggregate, not for each LPAR individually.
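
To make the granularity problem concrete, here is a minimal sketch that groups LPARs by the QCM they would share a reading with. It assumes chips are numbered 0-31 and that each run of four consecutive numbers sits on one QCM; that numbering, the LPAR names, and the function names are hypothetical and only for illustration, not the actual hardware numbering.

```python
from collections import defaultdict

CHIPS_PER_QCM = 4       # from the post: one QCM (octant) holds 4 POWER7 chips
CHIPS_PER_DRAWER = 32   # from the post: 32 POWER7 chips per 9125-F2C drawer


def octant_of(chip_index):
    """Return the octant (QCM) index for a chip.

    Assumes a hypothetical 0-based chip numbering in which chips
    0-3 are on octant 0, chips 4-7 on octant 1, and so on.
    """
    if not 0 <= chip_index < CHIPS_PER_DRAWER:
        raise ValueError(f"chip index out of range: {chip_index}")
    return chip_index // CHIPS_PER_QCM


def lpars_sharing_reading(lpar_to_chip):
    """Group LPAR names by octant; each group shares one power/temp reading."""
    groups = defaultdict(list)
    for lpar, chip in sorted(lpar_to_chip.items()):
        groups[octant_of(chip)].append(lpar)
    return dict(groups)


# Hypothetical layout: four LPARs on the first QCM, one on the second.
layout = {"lpar0": 0, "lpar1": 1, "lpar2": 2, "lpar3": 3, "lpar4": 4}
shared = lpars_sharing_reading(layout)
```

With this layout, lpar0 through lpar3 all fall into octant 0, so any per-octant reading covers the four of them together.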

So -- what we need is a mechanism to measure current temperature, voltage, and power (in watts) per physical POWER7 CPU (per core would be even better).  Does such a mechanism exist?  If so, how can we leverage it from RHEL 6.2 (EMS or in-CEC LPAR)?

  • Bill_Buros
    182 Posts
    ACCEPTED ANSWER

    Re: Per-chip CPU power and temp?

    ‏2013-09-18T21:27:44Z  

    and the answer ...

    The "rvitals" xCAT command provides a full set of power and thermal monitoring information for the entire rack.  However, there is no way to measure power per processor chip (one 8-core chip being the minimum LPAR size on a Power775), because a single voltage level powers all 4 processor chips in an octant.  As for temperature, the 4 processor chips of an octant will all be "close to the same" regardless of activity per chip, because they are on a common module.

  • Bill_Buros
    182 Posts

    Re: Per-chip CPU power and temp?

    ‏2013-09-18T11:44:38Z  

    I don't think measuring at that granularity is possible, but am working to confirm that with the engineers.

  • Moonlit Waters
    3 Posts

    Re: Per-chip CPU power and temp?

    ‏2013-09-18T11:54:17Z  

    Thanks -- looking forward to hearing what they have to say.

  • Moonlit Waters
    3 Posts

    Re: Per-chip CPU power and temp?

    ‏2013-09-20T11:56:42Z  

    Thanks for the quick turnaround.  This is consistent with everything we've seen, but it's good to have confirmation from engineers closer to the design that per-QCM monitoring is the best we can do.