Topic
  • 7 replies
  • Latest Post - ‏2010-05-12T18:25:54Z by SystemAdmin
11 Posts

Pinned topic How do I handle the "holes" in the CPU numbering on POWER7 ?

‏2010-05-11T20:54:25Z |
How do I boot the operating system with no "holes" in the processor numbering?

With holes in the processor numbering, I am having trouble assigning OMP threads to processors. Are there any tricks to this?

Also, from watching "top", it looks like numactl might not do this properly?
Updated on 2010-05-12T18:25:54Z by SystemAdmin
  • wpeter
    5 Posts

    Re: How do I handle the "holes" in the CPU numbering on POWER7 ?

    ‏2010-05-11T22:13:17Z  
    On Power systems, there are holes in CPU numbering when some HW threads are not used, e.g., the ST mode (CPUs 0, 4, 8, ...) or the SMT2 mode (CPUs 0, 1, 4, 5, ...) with POWER7 processors. There are no holes in the SMT4 mode (CPUs 0, 1, 2, 3, 4, 5, 6, 7, ...) since every HW thread is used.

    We can run "ppc64_cpu --smt=0", "ppc64_cpu --smt=2" and "ppc64_cpu --smt=4" to set different modes dynamically.

    Even with holes on POWER7 systems, we can use the environment variable "XLSMPOPTS" to bind OpenMP threads to specific HW threads, assuming that the IBM XL compiler is being used.

    Depending on the SMT mode, XLSMPOPTS is set differently.

    For the ST mode (CPUs 0, 4, 8, 12, ...), set XLSMPOPTS=STARTPROC=0:STRIDE=4. It means the first OpenMP thread is bound to CPU0, the second to CPU4 (0+4), the third to CPU8 (4+4), and so on.

    For the SMT4 mode (CPUs 0, 1, 2, 3, 4, ...), set XLSMPOPTS=STARTPROC=0:STRIDE=1. It means the first OpenMP thread is bound to CPU0, the second to CPU1 (0+1), the third to CPU2 (1+1), and so on.

    However, we cannot use STRIDE=2 for the SMT2 mode (CPUs 0, 1, 4, 5, 8, 9, ...): the second thread would have to be bound to CPU2 (0+2), which is not enabled, and CPU1 would be skipped. The IBM XL beta compiler has a new option in XLSMPOPTS to handle this case.

    We can set XLSMPOPTS=proc='0,1,4,5,...'. The first OpenMP thread is bound to CPU0, the second to CPU1, the third to CPU4, the fourth to CPU5, and so on, just following the explicit list of CPU numbers.

    Of course, we can also use OMP_NUM_THREADS=N to create N OpenMP threads for each process involved.
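The explicit list described above can be generated rather than typed by hand. The following is a minimal sketch, not part of the original post. It assumes every core owns 4 consecutive CPU numbers (as on POWER7) and that the first N of them are enabled in SMT-N mode, with ST mode passed as 1. The function name `smt_cpu_list` is hypothetical, and the usage line uses the `procs` spelling confirmed later in this thread.

```shell
# Build a comma-separated CPU list for a given core count and SMT mode.
# Assumption: each core owns `slots` consecutive CPU numbers (4 on POWER7)
# and only the first `smt` of them are enabled.
# Usage: smt_cpu_list <cores> <smt> [slots]
smt_cpu_list() {
    cores=$1; smt=$2; slots=${3:-4}
    list=""
    core=0
    while [ "$core" -lt "$cores" ]; do
        thread=0
        while [ "$thread" -lt "$smt" ]; do
            # Append the next enabled CPU number, comma-separated
            list="${list:+$list,}$((core * slots + thread))"
            thread=$((thread + 1))
        done
        core=$((core + 1))
    done
    printf '%s\n' "$list"
}

# Example for two cores in SMT2 mode (./stream is the benchmark used
# later in this thread):
#   XLSMPOPTS="procs=$(smt_cpu_list 2 2)" OMP_NUM_THREADS=4 ./stream
```

For two cores this yields 0,4 in ST mode, 0,1,4,5 in SMT2 mode and 0 through 7 in SMT4 mode, matching the numbering described in the post.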
  • SystemAdmin
    706 Posts

    Re: How do I handle the "holes" in the CPU numbering on POWER7 ?

    ‏2010-05-12T13:48:11Z  
    So when using "top" (or mpstat) you should see the processes stick to the correct CPUs when you use the XLSMPOPTS as specified. If that's not happening, we would consider that a bug.

    Keep in mind there are other ways of binding processes and threads to CPUs which may leverage different mechanisms available in Linux. The example provided by Peter is an OMP method taking advantage of the IBM XL compilers.
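As one illustration of the Linux-level route, the kernel publishes the online CPUs as a range list in /sys/devices/system/cpu/online, and tools such as taskset -c or numactl --physcpubind accept that range syntax directly. The sketch below is an assumption-labelled helper, not something from the posts above: the name `expand_cpu_ranges` is hypothetical, and it expands a range list into the explicit comma-separated form used with XLSMPOPTS in this thread.

```shell
# Expand a kernel-style CPU range list such as "0-1,4-5" into "0,1,4,5".
# The body runs in a subshell so the IFS change does not leak to the caller.
expand_cpu_ranges() (
    out=""
    IFS=,
    for part in $1; do
        case $part in
            *-*) first=${part%-*}; last=${part#*-} ;;   # a range "a-b"
            *)   first=$part; last=$part ;;             # a single CPU
        esac
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            out="${out:+$out,}$cpu"
            cpu=$((cpu + 1))
        done
    done
    printf '%s\n' "$out"
)

# Possible usage (./stream is the benchmark used later in this thread):
#   online=$(cat /sys/devices/system/cpu/online)
#   taskset -c "$online" ./stream
#   XLSMPOPTS="procs=$(expand_cpu_ranges "$online")" ./stream
```

Note that taskset restricts the whole process to a set of CPUs, while the XLSMPOPTS form binds individual OpenMP threads, so the two are not interchangeable.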
  • null
    11 Posts

    Re: How do I handle the "holes" in the CPU numbering on POWER7 ?

    ‏2010-05-12T14:53:47Z  
    I am getting an error message with
    XLSMPOPTS=PROC.....

    The error message is
    1587-103 ..... The option strings for the SMP run-time ....... "PROC" contains unexpected or invalid text instead of an option name. All SMP run time .. has been set to default.
  • SystemAdmin
    706 Posts

    Re: How do I handle the "holes" in the CPU numbering on POWER7 ?

    ‏2010-05-12T14:58:06Z  
    Try PROCS=
  • wpeter
    5 Posts

    Re: How do I handle the "holes" in the CPU numbering on POWER7 ?

    ‏2010-05-12T15:18:19Z  
    Bill, you are right.

    I missed an 's'. It should be 'procs=...'.
  • SystemAdmin
    706 Posts

    Re: How do I handle the "holes" in the CPU numbering on POWER7 ?

    ‏2010-05-12T16:07:47Z  
    So let's be sure everything is correct.

    Confirm POWER7 mode
    
    # cat /proc/cpuinfo | grep -m 1 cpu
    cpu             : POWER7 (architected), altivec supported
    


    Confirm correct ppc64_cpu
    
    # ppc64_cpu | grep X
    ppc64_cpu --smt=X           # Set SMT state to X
    


    Confirm correct SMT mode
    
    # ppc64_cpu --smt
    SMT=2
    


    I have a two core system in SMT=2 mode. I confirm the CPU numbering:
    
    # cat /proc/cpuinfo | grep processor
    processor   : 0
    processor   : 1
    processor   : 4
    processor   : 5
    


    I have stream installed. I have the beta IBM compilers installed for POWER7.

    
    # export PATH=$PATH:/opt/ibmcmp/vac/11.1/bin
    # xlc -O5 -qsmp=omp -qthreaded stream.c -o stream
    # XLSMPOPTS=PROCS=0,1,4,5 ./stream
    


    I had modified stream to repeat itself many times so that I could watch "top" from another window:

    
    top - 16:01:35 up 22:35,  2 users,  load average: 0.06, 0.06, 0.07
    Tasks:  97 total,   1 running,  96 sleeping,   0 stopped,   0 zombie
    Cpu0  : 46.4%us,  0.4%sy,  0.0%ni,  2.8%id,  0.0%wa,  0.0%hi,  0.0%si, 50.4%st
    Cpu1  : 46.5%us,  0.4%sy,  0.0%ni,  3.1%id,  0.0%wa,  0.1%hi,  0.0%si, 49.9%st
    Cpu4  : 45.5%us,  0.5%sy,  0.0%ni,  4.5%id,  0.1%wa,  0.1%hi,  0.0%si, 49.3%st
    Cpu5  : 45.6%us,  0.5%sy,  0.0%ni,  4.5%id,  0.0%wa,  0.0%hi,  0.0%si, 49.5%st
    Mem:     10170M total,     3493M used,     6676M free,      325M buffers
    


    I confirmed that similar steps work with XLF (Fortran).

    Please be sure all of the setup steps are correct.
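Besides watching "top", the placement of individual threads can be checked with ps, whose psr column reports the CPU a thread last ran on. The following is a rough sketch and not part of the original steps: the helper name `threads_off_list` is hypothetical, and the usage line assumes the benchmark process is named stream and that pidof is available.

```shell
# Print the IDs of threads whose last-used CPU is not in the allowed list.
# Reads "TID PSR" pairs on stdin; $1 is a comma-separated CPU list.
threads_off_list() {
    awk -v allowed="$1" '
        BEGIN {
            # Turn the comma-separated list into a lookup table
            n = split(allowed, cpus, ",")
            for (i = 1; i <= n; i++) ok[cpus[i]] = 1
        }
        NF >= 2 && !($2 in ok) { print $1 }
    '
}

# Possible usage while the benchmark is running (empty output means every
# thread was last seen on one of the listed CPUs):
#   ps -L -o tid= -o psr= -p "$(pidof stream)" | threads_off_list 0,1,4,5
```

Because psr is only a snapshot of where each thread last ran, it is worth repeating the check a few times before drawing conclusions.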
  • SystemAdmin
    706 Posts

    Re: How do I handle the "holes" in the CPU numbering on POWER7 ?

    ‏2010-05-12T18:25:54Z  
    An SMT=off example:

    
    # ppc64_cpu --smt=off
    # ppc64_cpu --smt
    SMT is off
    


    Checking CPU numbering
    
    # cat /proc/cpuinfo | grep proc
    processor   : 0
    processor   : 4
    


    Running stream again
    
    # XLSMPOPTS=PROCS=0,4 ./stream
    


    Watching from "top"
    
    top - 18:23:16 up 1 day, 57 min,  2 users,  load average: 0.16, 0.05, 0.01
    Tasks:  77 total,   1 running,  76 sleeping,   0 stopped,   0 zombie
    Cpu0  : 59.9%us,  0.5%sy,  0.0%ni,  3.1%id,  0.0%wa,  0.1%hi,  0.0%si, 36.4%st
    Cpu4  : 59.1%us,  0.6%sy,  0.0%ni,  4.2%id,  0.0%wa,  0.2%hi,  0.0%si, 35.9%st
    


    Alternatively, in this mode you can specify startproc=0 with stride=4 to run on the CPUs that remain enabled when SMT is off.
    
    # XLSMPOPTS=startproc=0:stride=4 ./stream
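
Whether a given startproc/stride pair is safe depends on which CPUs are enabled. The sketch below is a hedged illustration with a hypothetical helper name, `stride_misses`. It assumes that thread i is bound to startproc + i*stride, as described earlier in this thread, and does not model any wrap-around the runtime might apply. It lists the CPUs that such a binding would target but that are not in the online list.

```shell
# List the CPUs that a startproc/stride binding would target but that are
# not online. Assumption: thread i goes to startproc + i * stride.
# Usage: stride_misses <startproc> <stride> <threads> <online-cpu-list>
stride_misses() {
    start=$1; stride=$2; threads=$3
    online=",$4,"          # pad with commas so ",N," matches whole numbers
    missing=""
    i=0
    while [ "$i" -lt "$threads" ]; do
        cpu=$((start + i * stride))
        case $online in
            *",$cpu,"*) ;;                                  # CPU is online
            *) missing="${missing:+$missing,}$cpu" ;;       # CPU is a hole
        esac
        i=$((i + 1))
    done
    printf '%s\n' "$missing"
}

# Examples:
#   stride_misses 0 4 2 0,4        # SMT off, two cores: prints an empty line
#   stride_misses 0 2 4 0,1,4,5    # SMT2, two cores: prints 2,6
```

An empty result suggests startproc/stride is usable for that configuration. A non-empty one, as in the SMT2 case, points to using the explicit procs list instead.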