Topics covered here
Introduction
Installing additional software on your system
When running Linux on Power, we recommend adding several components to the base installation:
- sysstat
- gcc/gfortran and their prerequisites
- oprofile
To extend what's available from the basic distro packages, we recommend downloading some additional packages from the web:
- nmon - latest version
- Look for version nmon 12a for Linux. nmon runs on SLES 9, SLES 10, SLES 11, RHEL 4, and RHEL 5.
- The trial IBM compilers - see this page for some pointers
- The latest IBM Java - download
- Java 6, Java 5, Java 1.4.2
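Once the packages are installed, a quick check can confirm that the tools are actually on the PATH. The sketch below is only an illustration: the mapping from package name to command name (`PACKAGE_COMMANDS`) is an assumption based on the commands these packages commonly provide, and the function name `find_installed` is hypothetical. Adjust both for your distribution.

```python
import shutil

# Assumed mapping from the packages named above to commands they
# commonly install; this is not an official list.
PACKAGE_COMMANDS = {
    "sysstat": ["sar", "iostat", "mpstat"],
    "gcc": ["gcc"],
    "gfortran": ["gfortran"],
    "oprofile": ["opcontrol", "operf"],
    "nmon": ["nmon"],
    "java": ["java"],
}


def find_installed(package_commands=PACKAGE_COMMANDS):
    """Return {package: first matching command found on PATH, or None}."""
    found = {}
    for package, commands in package_commands.items():
        found[package] = next(
            (cmd for cmd in commands if shutil.which(cmd)), None
        )
    return found


if __name__ == "__main__":
    for package, command in find_installed().items():
        status = f"found ({command})" if command else "not found on PATH"
        print(f"{package:10s} {status}")
```

A package reported as "not found on PATH" may still be installed under a different command name or outside the PATH, so treat the result as a hint rather than proof.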
Understand the system
There are three aspects to understanding your system:
- The operating system and Linux software
- The hardware system - processors, memory, disk, etc.
- The Power logical partition
Understand your operating system
In general, newer versions of the operating system and software stack have improved support for the latest Power hardware.
|| || Kernel || gcc level || Page size || 16MB pages || 16GB pages || SMT steal cycles ||
| RHEL 4.7 | 2.6.9-78.EL | gcc 3.4.6 | 4KB | | | |
| RHEL 5.1 | 2.6.18-53.el5 | gcc 4.1.2 (6/26/2007) | 64KB | Supported | Not avail | |
| RHEL 5.2 | 2.6.18-92.el5 | gcc 4.1.2 (11/24/2007) | 64KB | Supported | Not avail | |
| RHEL 5.3 | 2.6.18-128.el5 | gcc 4.1.2 (7/28/2008) | 64KB | Supported | Not avail | New form |
| RHEL 5.4 | 2.6.18-164.el5 | gcc 4.1.2 (7/04/2008) | 64KB | Supported | Not avail | New form |
| | | | | | | |
| SLES 10 sp1 | | | | | | |
| SLES 10 sp2 | 2.6.16.60-0.21 | gcc 4.1.2 (1/15/2007) | 4KB | Supported | Not avail | New form |
| SLES 11 | 2.6.27.19-5 | gcc 4.3.2 | 64KB | Supported | Supported | New form |
Examples are below.
Red Hat Enterprise Linux (RHEL)
For RHEL 5.2:
Novell SUSE Linux Enterprise Server (SLES)
For SLES 10 sp2:
For SLES 11:
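Independent of the distribution, the kernel level, gcc level, and page sizes discussed in this section can be collected with a short script. This is a minimal sketch, not the commands originally shown on this page. The function names are hypothetical, and it assumes that `gcc --version` prints the version on its first line and that `/proc/meminfo` reports the huge page size in a `Hugepagesize:` line (in kB), as on typical Linux kernels.

```python
import os
import platform
import shutil
import subprocess


def huge_page_size_kb(meminfo_path="/proc/meminfo"):
    """Return the Hugepagesize value in kB, or None if it is not reported."""
    try:
        with open(meminfo_path) as meminfo:
            for line in meminfo:
                if line.startswith("Hugepagesize:"):
                    return int(line.split()[1])
    except OSError:
        pass  # file missing or unreadable: treat as "unknown"
    return None


def describe_system():
    """Collect kernel release, base page size, huge page size, gcc version."""
    info = {
        "kernel": platform.release(),  # same value as `uname -r`
        "page_size_kb": os.sysconf("SC_PAGE_SIZE") // 1024,
        "huge_page_size_kb": huge_page_size_kb(),
        "gcc": None,
    }
    if shutil.which("gcc"):
        result = subprocess.run(
            ["gcc", "--version"], capture_output=True, text=True, check=False
        )
        lines = result.stdout.splitlines()
        info["gcc"] = lines[0] if lines else None
    return info


if __name__ == "__main__":
    for name, value in describe_system().items():
        print(f"{name}: {value}")
```

On a system with 64KB base pages, `page_size_kb` would be expected to report 64; a `huge_page_size_kb` of 16384 would correspond to 16MB pages.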
Understand your hardware system
Industry terminology for cores, processors, chips, sockets, and CPUs can be ambiguous.
- Even respected organizations like SPEC.org have worked over the years to clarify the terms used by marketing teams and technical teams. For example, the SPECmpi2007 group published run rules which describe these terms for use with SPECmpi workloads.
For Power systems, performance analysts generally follow these conventions:
- The controlling HMC refers to "processors", which are also often referred to as "cores" on Power
- Each Power 5 and 6 processor (or core) can support two simultaneous multithreading (SMT) hardware threads
- Power systems typically do not use the term "sockets", since that term can be ambiguous for the packaging approach used across the diverse Power systems.
- Linux sees "CPUs" which are individually controlled by the scheduler with one CPU per hardware SMT thread - up to two hardware threads per core
- The Linux scheduler for Power systems knows how to efficiently schedule the two SMT threads for each Power core
- If SMT is on, the Linux CPUs are numbered sequentially (0, 1, 2, 3, ...)
- Neither Linux CPU of a pair (i.e., 0 or 1) is the "real core"; both are hardware threads of the same core. This is a surprisingly common misconception.
- With SMT on, the processor core is kept busier, because one hardware thread can run while the other is waiting on something
- If SMT is off, the Linux CPUs are numbered with even numbers (0, 2, 4, ...)
We usually recommend running with SMT on.
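The numbering convention above can be turned into a quick check. The sketch below is an illustration under stated assumptions: it reads the kernel's CPU list format (for example `0-3` or `0,2,4,6`, as found in `/sys/devices/system/cpu/online`), assumes two hardware threads per core, and assumes that Linux CPUs 2k and 2k+1 belong to the same core. The function names are hypothetical.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3" or "0,2,4,6" into integers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def smt_appears_on(cpus):
    """Sequential numbering (any odd CPU online) suggests SMT is on;
    even-only numbering suggests SMT is off."""
    return any(cpu % 2 == 1 for cpu in cpus)


def cores_in_use(cpus, threads_per_core=2):
    """Count distinct cores, assuming CPUs 2k and 2k+1 share core k."""
    return len({cpu // threads_per_core for cpu in cpus})


if __name__ == "__main__":
    with open("/sys/devices/system/cpu/online") as online:
        cpus = parse_cpu_list(online.read())
    print("online CPUs:", cpus)
    print("SMT appears on:", smt_appears_on(cpus))
    print("cores in use:", cores_in_use(cpus))
```

For example, `0-3` gives four Linux CPUs on two cores with SMT on, while `0,2,4,6` gives four Linux CPUs on four cores with SMT off. A partition with a single online CPU cannot be classified this way.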
Understand your partition
Here's an example of a Power 6 partition running SLES 11. There are quite a few fields, all of which help describe the details of the defined partition. For performance purposes, a handful of fields are particularly important, and we look at those more closely.
The values from "lparcfg" simply reflect how the partition is defined. To change these values you'll need to modify the partition definition with the HMC or the IVM. For more details, check out the IBM Redbook [Virtualizing an Infrastructure with System p and Linux|http://www.redbooks.ibm.com/abstracts/sg247499.html?Open]. Once changed, the partition will need to be shut down and restarted from the HMC.
Key fields typically checked by performance teams:
- system_potential_processors=2: The number of physical cores on the underlying system
- system_active_processors=2: The number of physical cores which are active on the underlying system
- partition_potential_processors=2: The number of virtual processors (cores) defined in the partition
- partition_active_processors=2: The number of virtual processors (cores) that are active
- partition_max_entitled_capacity=200: 100 = 1 processor, so the maximum here is 2 processors
- partition_entitled_capacity=40: 40 = 40% of 1 processor
- Capped=0: 0 means the partition is uncapped, so it can use unused CPU cycles from other partitions, up to the partition_max_entitled_capacity for this partition
- shared_processor_mode=1: Unused CPU cycles are shared with other partitions on the system
So in this example, if SMT is on when Linux is running, this partition will see four CPUs, running on the two processor cores assigned to the partition. The CPU cycles not used in this partition are "shared" with other partitions, and this partition will get assigned at least 40% of 1 processor and can use up to 2 full processors if available.
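The arithmetic in this example can be reproduced with a small parser. This is a sketch, assuming the lparcfg output is plain `key=value` text (on Linux on Power it is typically read from `/proc/ppc64/lparcfg`) and that there are two SMT threads per core. The `SAMPLE` text contains only the key fields listed above, and the function names are hypothetical. Keys are lower-cased so that `Capped` and `capped` are treated alike.

```python
SAMPLE = """\
system_potential_processors=2
system_active_processors=2
partition_potential_processors=2
partition_active_processors=2
partition_max_entitled_capacity=200
partition_entitled_capacity=40
Capped=0
shared_processor_mode=1
"""


def parse_lparcfg(text):
    """Parse key=value lines into a dict, converting integer values."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines that are not key=value
        key, value = key.strip().lower(), value.strip()
        fields[key] = int(value) if value.lstrip("-").isdigit() else value
    return fields


def summarize(fields, smt_threads_per_core=2):
    """Derive the values discussed in the text from the key fields."""
    active = fields["partition_active_processors"]
    return {
        "virtual_processors": active,
        "linux_cpus_with_smt": active * smt_threads_per_core,
        "guaranteed_processors": fields["partition_entitled_capacity"] / 100,
        "max_processors": fields["partition_max_entitled_capacity"] / 100,
        "uncapped": fields["capped"] == 0,
        "shared": fields["shared_processor_mode"] == 1,
    }


if __name__ == "__main__":
    for name, value in summarize(parse_lparcfg(SAMPLE)).items():
        print(f"{name}: {value}")
```

For the sample values this yields 2 virtual processors, 4 Linux CPUs with SMT on, 0.4 guaranteed processors, and a maximum of 2.0 processors, matching the description above.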
 | For Performance Benchmarks
For consistent performance benchmarking, we recommend that shared_processor_mode be 0, capped be 1, and the capacity values match the processors assigned.
Later, for production use, for more effective use of the whole system, you should seriously consider sharing CPU cycles across partitions. |
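The benchmarking recommendation can be expressed as a simple check over the lparcfg fields. This is a sketch under one explicit assumption: "capacity values match the processors assigned" is interpreted here as partition_entitled_capacity being equal to 100 times partition_active_processors. The function name `benchmark_config_issues` is hypothetical, and the input is a dict of lower-cased field names with integer values.

```python
def benchmark_config_issues(fields):
    """Return a list of deviations from the benchmarking recommendation.

    An empty list means the partition matches the recommendation as
    interpreted in this sketch.
    """
    issues = []
    if fields.get("shared_processor_mode") != 0:
        issues.append("shared_processor_mode should be 0 (not shared)")
    if fields.get("capped") != 1:
        issues.append("capped should be 1")
    # Assumption: 100 units of entitled capacity per active processor.
    expected = 100 * fields.get("partition_active_processors", 0)
    if fields.get("partition_entitled_capacity") != expected:
        issues.append(
            f"partition_entitled_capacity should be {expected} "
            "to match the active processors"
        )
    return issues


if __name__ == "__main__":
    shared_example = {
        "shared_processor_mode": 1,
        "capped": 0,
        "partition_active_processors": 2,
        "partition_entitled_capacity": 40,
    }
    for issue in benchmark_config_issues(shared_example):
        print("-", issue)
```

Applied to a shared, uncapped partition with 2 active processors and an entitled capacity of 40, the check reports all three deviations.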
Understand your application