IBM BladeCenter QS21 hardware performance glossary

Dual Cell/B.E. system performance numbers include latencies, throughputs, and optimization tips

Although extensive data has been published about the hardware performance features of a single Cell Broadband Engine™ (Cell/B.E.) processor, and about the performance of many applications ported to it, little is available on the specific hardware performance features of the IBM BladeCenter® QS21, which combines a coherent SMP node of two Cell/B.E. processors with an elaborate I/O subsystem. This glossary accompanies the article "Evaluating IBM BladeCenter QS21 hardware performance." In that article, the authors close that gap by providing basic latencies, throughputs, and relative execution times for some key computational benchmark kernels, such as Linpack and SPEC2000. The article also gives a basic architectural overview of the system and offers tips on how to optimize application performance.


Peter Altevogt (ALTEVOGT@de.ibm.com), Performance Architect, IBM

Dr. Peter Altevogt is a performance architect in the IBM Systems and Technology Group at the IBM Laboratory Boeblingen (Germany). He built the performance team for the IBM Blade computer using the Cell/B.E. processor. His other responsibilities include performance analysis and modeling of future IBM processors and systems. Dr. Altevogt holds degrees in Mathematics and Physics from the University of Heidelberg, and he holds a doctorate in theoretical physics from the University of Karlsruhe. He joined the IBM Scientific Center in Heidelberg in 1991, and he moved to the IBM Laboratory Boeblingen in 1998.



Hans Boettiger (h.boettiger@de.ibm.com), Performance Architect, IBM

Hans Boettiger works in IBM Systems and Technology Group at the IBM Germany Development Lab. He joined IBM in 1973. He has held various technical leadership positions in software, operating systems, and hardware development for mainframes, as well as in performance analysis for BI systems, compilers, and blade computers. He currently works as a performance architect on next generation systems.



Tibor Kiss (tibor.kiss@de.ibm.com), Performance Engineer, Contractor, IBM

Tibor Kiss is a performance engineer at the IBM Laboratory Boeblingen (Germany). Since 2005, he has been a member of the IBM Systems and Technology Group performance team, responsible for the performance of the IBM Blades using the Cell/B.E. processor. He holds a Bachelor of Science degree in Computer Engineering. His interests include performance analysis and modeling.



Zvonko Krnjajic (KRNJAJIC@de.ibm.com), Software Engineer, Contractor, IBM

Zvonko Krnjajic is a software engineer at the IBM Laboratory Boeblingen (Germany) working on performance analysis of Cell/B.E.-based blades. His other interests include graphics on the Cell/B.E. processor (he did his diploma thesis on implementing graphics algorithms on the Cell/B.E. processor at the IBM Laboratory Boeblingen). He holds a Bachelor's degree from the University of Esslingen, and he is currently working on his master's thesis in the area of Distributed Systems Engineering with a focus on general purpose computing on GPUs. He is also interested in High Performance Computing and Cryptography.



06 May 2008

This glossary goes with the article "Evaluating IBM BladeCenter QS21 hardware performance."

Glossary of terms
| Term | Definition |
| --- | --- |
| BE0, BE1 | Aliases for the two Cell/B.E. processors of the QS21. |
| BIF | The Cell/B.E. interface: A fully coherent protocol connecting the two Cell/B.E. processors of the QS21. |
| DCBB2 | Dual Cell-based blade configuration number 2: A special deliverable for IBM Global Engineering Services (GES). |
| DDR2 | Double data rate 2: A technology for high-speed memory. |
| DMA | Direct memory access: A technology to move data within a computer system without requiring services from the main processor. |
| EIB | Element Interconnect Bus: The communication path for commands and data between all processor elements and the on-chip memory and I/O controllers of the Cell/B.E. processor. |
| HS Connector | A high-speed 2x PCI-E 16x connector that supports only the DCBB2 blade deliverable. |
| HSDC | High-speed daughter cards, such as the InfiniBand Daughter Card. |
| IBDC | InfiniBand Daughter Card. |
| IOIF | The non-coherent I/O interface protocol of the Cell/B.E. system that is suitable for I/O devices. |
| mc0, mc1 | The memory controllers interfacing the Southbridges and the attached DDR2 memory. |
| MFC | Memory flow controller: Component of the SPE that transfers data between the local store of the SPU and the XDR DRAM and provides synchronization services. |
| MIC | Memory interface controller: Provides the interface between the EIB and the XDR DRAM. |
| MPI | Message passing interface: A specification of a message passing library. |
| MTU | Maximum transmission unit: Specifies the maximum packet size in bytes that can be transmitted over a network without being fragmented. |
| n1/2 | The message size at which the throughput reaches half of its maximum value. |
| PCI-E | PCI Express: A computer expansion card interface standard introduced to replace PCI-X. |
| PCI-X | Peripheral Component Interconnect Extended: A computer expansion card interface standard introduced to replace PCI. |
| PPE | The PowerPC® Processor Element of the Cell/B.E. processor: A general purpose, dual-threaded 64-bit RISC® processor core. |
| rDMA | Remote direct memory access: Allows data to move directly from the memory of one computer into that of another without involving either one's processor. |
| SIMD | Single Instruction Multiple Data: A classic technique to implement data parallelism; that is, to execute the same operation concurrently on a set of data. |
| SPE1, ..., SPE8 | The eight Synergistic Processor Elements constitute the computational core of the Cell/B.E. processor. Each SPE is a 128-bit RISC processor executing SIMD instructions. Its main units are the Synergistic Processing Unit (SPU), which contains the computational pipelines, and the memory flow controller (MFC), which implements the DMA operations. |
| SPU | Synergistic Processing Unit: Processor component of an SPE with two pipelines executing up to two instructions per cycle and an attached local store memory. |
| TLB | Translation lookaside buffer: Caches at the SPEs and PPEs that are used by the memory management hardware to improve the latency of virtual address translation. |
| XDR™ | eXtreme Data Rate Dynamic Random Access Memory: High-performance memory from Rambus, Inc. |
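
The n1/2 entry above can be made concrete with a small sketch. It assumes a simple latency-plus-bandwidth transfer model, which is not taken from the article; under that assumption n1/2 equals the startup latency multiplied by the peak throughput. The function names and the numbers used (1 microsecond latency, 1 GB/s peak) are hypothetical and are not QS21 measurements.

```python
def throughput(n_bytes, latency_s, peak_bytes_per_s):
    """Throughput of one transfer under an assumed model:
    time = startup latency + n_bytes / peak throughput."""
    return n_bytes / (latency_s + n_bytes / peak_bytes_per_s)


def n_half(samples):
    """Estimate n1/2 from (message_size, throughput) pairs sorted by size.

    The largest observed throughput is taken as the maximum, and the
    half-maximum crossing is found by linear interpolation between the
    two neighbouring samples. Returns None if no crossing is found.
    """
    peak = max(t for _, t in samples)
    target = peak / 2.0
    if samples[0][1] >= target:
        # Already at or above half of the maximum for the smallest message.
        return samples[0][0]
    for (n0, t0), (n1, t1) in zip(samples, samples[1:]):
        if t0 < target <= t1:
            return n0 + (target - t0) * (n1 - n0) / (t1 - t0)
    return None


if __name__ == "__main__":
    # Hypothetical link: 1 microsecond startup latency, 1 GB/s peak.
    latency_s, peak = 1e-6, 1e9
    sizes = [2 ** k for k in range(25)]  # 1 byte to 16 MiB
    samples = [(n, throughput(n, latency_s, peak)) for n in sizes]
    print("estimated n1/2 in bytes:", n_half(samples))
    print("model value latency * peak in bytes:", latency_s * peak)
```

With real measurements, the `samples` list would hold measured message sizes and throughputs instead of model values. A small n1/2 indicates that short messages already use the link efficiently.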
