7 replies Latest Post - ‏2012-08-22T18:44:32Z by NishAravamudan
djlawler
2 Posts

Pinned topic NUMA on Intel vs IBM memory design?

‏2012-08-09T15:52:27Z |
 Does anyone have a better technical answer on how Power compares to NUMA, which as far as I can tell is mostly an x86 design (with some others) for optimizing memory usage and speeding up performance? I found one small reference saying that Power uses a shared memory bus, different from NUMA-designed systems, and was hoping for more details. I believe caching data locally in memory closer to the processor is a main design characteristic of Power, but I am unsure of the specifics compared to NUMA. I recall a discussion at a customer briefing in Austin about Power 8 working on better memory usage, so hopefully someone can explain the differences. Thanks
  • Bill_Buros
    126 Posts

    Re: NUMA on Intel vs IBM memory design?

    ‏2012-08-09T18:10:51Z  in response to djlawler
    Power systems have excellent NUMA characteristics, and the operating systems all provide the standard NUMA controls to optimize process and memory placement (among other aspects). The Power systems in the marketplace today have numerous levels of NUMA distance. I'll hunt down some better references that cover these aspects.
     
    In general though, using the two-socket Power 730 system as an example, there is memory local to each 8-core socket and there is memory "next-door" on the other socket.
     
    For example...
    Checking the total system and free memory available on the Power 730 server.   128GB memory.
    #  grep Mem /proc/meminfo 
    MemTotal:       129753088 kB
    MemFree:        122301888 kB
     
    Then checking where that memory sits from the NUMA perspective.
    # cat /sys/devices/system/node/node*/meminfo | grep MemTotal  
    Node 0 MemTotal:       66060288 kB
    Node 1 MemTotal:       64487424 kB
     
    # cat /sys/devices/system/node/node*/meminfo | grep MemFree 
    Node 0 MemFree:        61253504 kB
    Node 1 MemFree:        61048384 kB
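
    As a rough sketch (not part of the original post), the per-node totals above can be converted to GiB and summed with a small awk helper. The function name summarize_nodes is made up for illustration, and the sample input is copied from the output above rather than read from a live system.

```shell
#!/bin/sh
# summarize_nodes is a hypothetical helper name, not a standard tool.
# It reads lines shaped like "Node 0 MemTotal:  66060288 kB" on stdin,
# prints each node's total in GiB, then the sum across nodes.
summarize_nodes() {
    awk '$1 == "Node" && $3 == "MemTotal:" {
             printf "node%s %.1f GiB\n", $2, $4 / 1048576   # kB -> GiB
             total += $4
         }
         END { printf "all %.1f GiB\n", total / 1048576 }'
}

# Sample input taken from the Power 730 output shown above.
summarize_nodes <<'EOF'
Node 0 MemTotal:       66060288 kB
Node 1 MemTotal:       64487424 kB
EOF
```

    On a live system the same function could be fed directly: cat /sys/devices/system/node/node*/meminfo | summarize_nodes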
     
    POWER7 systems are designed to scale up smoothly to 256-core system images.
     
    Future Power hardware processor changes and design points can't of course be discussed here.    :-)
     
    • Bill_Buros
      126 Posts

      Re: NUMA on Intel vs IBM memory design?

      ‏2012-08-09T18:41:32Z  in response to Bill_Buros
      For good reading material, every Power system has a Redbook that describes its technical architecture. For example, search for "Redbook Power 730".
       
      The Power 750 Redbook
      • djlawler
        2 Posts

        Re: NUMA on Intel vs IBM memory design?

        ‏2012-08-09T18:52:11Z  in response to Bill_Buros
        Thanks Bill, I will look over the Redbooks. I need to understand the key talking points on how Power differs from Intel systems, since there seems to be a lot of material on System x and NUMA. Thanks again!
        • Bill_Buros
          126 Posts

          Re: NUMA on Intel vs IBM memory design?

          ‏2012-08-09T19:11:02Z  in response to djlawler
          Sounds good. 
           
          Keep in mind that the bigger workhorse systems that IBM has (like the Power 770/780 and the Power 795 servers) have more complex system design points built specifically around the NUMA characteristics. The more straightforward single-drawer servers are basic servers with a couple of processor sockets and local/"remote" memory.
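
          One way to see how many NUMA levels a given system exposes under Linux is to look at the per-node distance rows in sysfs. The sketch below is an illustration only: count_distance_levels is a hypothetical helper, and the sample row is invented, not measured on any Power system.

```shell
#!/bin/sh
# count_distance_levels is a hypothetical helper name.
# It reads one or more distance rows (space-separated integers) on stdin
# and prints how many distinct distance values appear.
count_distance_levels() {
    awk '{ for (i = 1; i <= NF; i++) seen[$i] = 1 }
         END { n = 0; for (d in seen) n++; print n }'
}

# Invented row for node 0 of a four-node system: one local value, two remote levels.
echo "10 20 40 40" | count_distance_levels

# On a live Linux system, each node's row is in its sysfs "distance" file.
for f in /sys/devices/system/node/node*/distance; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$f" "$(cat "$f")"
    fi
done
```

          A simple two-socket server would typically show two distinct values per row (local and remote), while the larger multi-drawer systems may show more.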
           
          We can help with detailed, specific questions, both on what is done automatically to improve runtime NUMA behavior and on the more manual steps used to tune and optimize a system and its applications. The normal Linux numactl and taskset commands are often used as we tweak for optimized performance.
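
          A minimal sketch of the kind of binding that numactl and taskset allow. It only prints the command lines so they can be reviewed before running. The helper names, ./my_app, and the CPU range 0-7 are placeholders; which CPU numbers belong to which node varies by system and should be checked with numactl --hardware.

```shell
#!/bin/sh
# Both helpers are hypothetical names; they only print a command line.

# Keep a program's threads and memory allocations on one NUMA node.
numa_bind_cmd() {
    node=$1
    shift
    printf 'numactl --cpunodebind=%s --membind=%s %s\n' "$node" "$node" "$*"
}

# Restrict a program to a list of logical CPUs.
# taskset sets CPU affinity only; it does not set a memory policy.
cpu_pin_cmd() {
    cpus=$1
    shift
    printf 'taskset -c %s %s\n' "$cpus" "$*"
}

# ./my_app is a placeholder program name.
numa_bind_cmd 0 ./my_app
cpu_pin_cmd 0-7 ./my_app
```

          Binding both CPUs and memory to the same node keeps accesses local, which is usually the first thing to try when remote-memory traffic is suspected.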
           
          One last example is an SAP performance tuning paper done late last year which shows some of the more practical usage examples of NUMA.    That might help you picture what is typically done when needed.   
        • JayFurmanek
          91 Posts

          Re: NUMA on Intel vs IBM memory design?

          ‏2012-08-09T20:54:02Z  in response to djlawler
          NUMA is actually relatively new to x86. In fact, Nehalem was the first Intel chip with a NUMA-inspired architecture (2008). That may explain the difference in volume of documentation somewhat.
           
          Intel incorporated a NUMA style to get around the problem of slow memory and starved cores. Power goes beyond that and uses it to organize the system for software operation, particularly for the larger, multi-chassis machines, which essentially become clusters within themselves.
          • NishAravamudan
            4 Posts

            Re: NUMA on Intel vs IBM memory design?

            ‏2012-08-22T18:44:32Z  in response to JayFurmanek
             Some distinction should be made about "x86" here. Intel, yes; AMD, no. AMD has been NUMA (even on the same board) for some time, architecturally. My old two-socket Opteron board, for instance, had an uneven distribution of the 6 DIMM slots.
  • Bill_Buros
    126 Posts

    Re: NUMA on Intel vs IBM memory design?

    ‏2012-08-10T12:17:25Z  in response to djlawler
    One of the best places to see the POWER7 design points and methodology is the IBM Journal of Research and Development.     In May-June 2011, the focus was on the IBM POWER7 Technology and Systems.
     
    In recent years, the Journals have been hosted on the IEEE Xplore web site, so if you or your company have a subscription to that, you should have full access to the articles.   
     
    For example, the articles in that journal issue include:
    • IBM POWER7 multicore server processor
    • IBM POWER7 systems
    • IBM POWER7 performance modeling, verification, and evaluation
    • IBM POWER7 processor circuit design
    • Power optimization methodology for the IBM POWER7 microprocessor
    • Design methodology for the IBM POWER7 microprocessor
     
    The "IBM POWER7 Systems" article is probably the most relevant to this discussion. Here's a portion of the abstract.
    This paper describes the system architectures and designs of the IBM POWER7 servers. From the smallest single-processor socket blade to the largest 32-processor-socket 256-core enterprise rack server, each system is designed to fully exploit the performance and the scalability of the POWER7 processor. This paper describes the enhancements made to the memory and input/output subsystems to achieve balanced and scalable designs, the changes made to the power and cooling circuitry to manage energy consumption and power dissipation, and the enhancements made to reliability, availability, and serviceability. 
     
    The key aspect of the article is that each of the major systems in the Power line is specifically designed with a targeted focus, across sockets, cores, DIMM slots, maximum memory, PCI slots, GX slots, disk bays, and I/O expansion options.
     
    Let us know if you don't have access to these articles.   I think they'll prove invaluable in your research of the Power systems and designs.