In the beginning, each computer's central processing unit, or CPU, was unique. Each had its own instruction set, which was incompatible with any other. All of that changed back in the thermionic valve (or "vacuum tube") days with the introduction of the IBM S/360™ line of computers, in 1964. Suddenly, code didn't have to be thrown away and reimplemented every time you bought a new computer. Today's IBM mainframes still maintain backwards-compatibility with that revolutionary 1964 instruction set. And the same spirit of compatibility infuses IBM's other CPU lines.
At the user-mode level, the instruction set of the PowerPC® family of processors provides full application compatibility, from the lowliest automated traffic light to the powerful BladeCenter JS20 or the Apple Xserve G5. In addition, PowerPC microprocessors share a large common instruction set with IBM's other RISC processor lines, POWER™ and Star, which leads to "near" compatibility across all three families. In many cases, this equals binary compatibility; in some cases, it means that a simple recompilation is needed; in all cases, it means that porting is a breeze.
IBM's four families of processors -- the Power Architecture™, the PowerPC family of processors, the Star chips, and even the line of chips that power IBM mainframes -- all have a common ancestor: the IBM 801.
The fifth and newest processor family to join this illustrious group is the Cell Broadband Engine Architecture family. While its central processor, or PPU, is Power-like -- for instance, compatible enough to run PowerPC Linux -- it is surrounded by a number of SPUs (in the first iteration, eight) which use a completely different ISA. For this reason and because it is jointly designed and owned by IBM, Sony, and Toshiba rather than IBM alone, it is really a separate product line, rather than another PowerPC.
The IBM 801 started out attempting to solve the same problem as a lot of the computers in the 1970s: switching telephone calls. The design team's goal was to complete one instruction per clock cycle, and to accommodate 300 calls per second.
Most of the computers of the day, such as the IBM S/360 mainframe, had complex and redundant instruction sets known today as CISC (complex instruction set computer). The trend towards miniaturization in computing, which began with the 1947 invention of the transfer resistor (or "transistor") only exacerbated this. As integrated circuits grew smaller, designers took advantage of the extra space to cram even more instructions into the chip. All of this complexity meant that by the 1970s, computer chips could do really amazing things (like power increasingly complex digital watches). But it also meant that the chips needed more machine time to execute, making it impossible for the 801 team to achieve their performance goals.
IBM's John Cocke was no stranger to the battle against complexity. He had already worked on the IBM Stretch computer, a rival to the IBM 704 mainframe, and on Stretch successor ACS (Advanced Computing Systems), rival to the 704's successor, the S/360.
He sliced away at the redundancy in the instruction set and designed a machine with half the circuits of its contemporaries -- that ran twice as fast as they did. The fast core and fewer circuits led not only to greater performance, but also to lower power consumption and -- perhaps most important for many consumers today -- much lower costs. This architecture became known as RISC (reduced instruction set computer). Some prefer to call it "load-store," pointing out that the instruction set of a RISC computer can number as many as 100 instructions or more (as the Power Architecture does). Others counter that the RISC is not a reduced set of instructions, but rather a set of reduced instructions -- each of the complex instructions of the CISC is broken down into shorter basic building blocks that can then be combined.
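The "set of reduced instructions" view can be made concrete with a toy model. The sketch below is illustrative only: the opcode names (LOAD, ADD, STORE), the register names, and the helper functions are hypothetical and are not taken from the 801, POWER, or any real instruction set. It shows one CISC-style memory-to-memory add expanded into four simple steps, where only the loads and stores touch memory.

```python
# Toy illustration only: opcodes, registers, and helpers are hypothetical,
# not part of the 801, POWER, or any real instruction set.

def lower_mem_add(dst, src_a, src_b):
    """Expand one CISC-style 'ADD mem, mem -> mem' into load-store steps."""
    return [
        ("LOAD", "r1", src_a),       # r1 <- memory[src_a]
        ("LOAD", "r2", src_b),       # r2 <- memory[src_b]
        ("ADD", "r3", "r1", "r2"),   # r3 <- r1 + r2 (registers only)
        ("STORE", "r3", dst),        # memory[dst] <- r3
    ]

def run(program, memory):
    """Execute the simple instructions; only LOAD and STORE touch memory."""
    regs = {}
    for instr in program:
        op = instr[0]
        if op == "LOAD":
            _, reg, addr = instr
            regs[reg] = memory[addr]
        elif op == "ADD":
            _, dst, a, b = instr
            regs[dst] = regs[a] + regs[b]
        elif op == "STORE":
            _, reg, addr = instr
            memory[addr] = regs[reg]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory

memory = {"x": 2, "y": 40, "sum": 0}
program = lower_mem_add("sum", "x", "y")
run(program, memory)
print(len(program), memory["sum"])  # 4 42
```

In a real toolchain this kind of expansion is the compiler's job, which is one way to read the claim that the complexity moved from the chip into the compiler.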
In any case, the complexity that was removed from the CPU didn't just disappear; it was pawned off on the compiler. In order to do that gracefully, John Cocke became not only an expert in compilers, but especially in optimizing them. His work on RISC and optimizing compilers won him many awards, not the least of which was the 1987 Turing Award.
As for the IBM 801, it never did become a telephone switcher. Instead, it became the first RISC chip and powered many IBM hardware products -- for a time, it even did a stint as a minicontroller and processor in its rival, the IBM mainframe series.
The RISC architecture soon came to dominate the workstation and embedded markets, and John Cocke moved on to other projects. In the 1980s, he had a chance to refine his 801 design in a project that was code-named "America" and that would become the POWER series of chips. He even had a hand in the development of the PowerPC architecture a few years later. Like the 801, the PowerPC was designed to be a universal microprocessor that could run on any machine, from the smallest to the tallest.
Today, the RISC architecture is the single most common CPU type in use and is the basis for everything from workstations to cell phones, video game consoles to supercomputers, traffic lights to desktops, and broadband modems to automobile fuel-injection and collision avoidance systems. Even x86 chip manufacturers, which continued for quite a time to produce CISC chips, have based their 5th- and 6th-generation chips on RISC architectures (and translate x86 opcodes into RISC operations to make them backwards-compatible).
POWER stands for Performance Optimization With Enhanced RISC, and it is the main processor in many IBM servers, workstations, and supercomputers. Descended directly from the 801 CPU, it is a 2nd-generation RISC processor. Introduced in 1990 to power the RS/6000, or RISC System/6000, UNIX® workstations (now called the eServer™ pSeries®), POWER exists in iterations from POWER1 and POWER2™ through POWER5™ and beyond.
The 801 was a very simple design. But because all instructions completed in one clock cycle, it lacked floating-point and superscalar (parallel processing) ability. The POWER architecture set out to correct this -- some would say, perhaps, to overcorrect it. With more than 100 instructions, the POWER is a pretty complex RISC.
Highlights for each iteration follow; for comprehensive details, please see the links listed in Resources.
POWER1
Released in 1990: 800,000 transistors per chip
Unlike other RISC processors of the day, POWER1 was functionally partitioned. This gave it superscalar abilities beyond those of mortal chips. It also had separate floating-point registers and could scale from the low to the high end of the UNIX workstations it was built for. The very first POWER1 was actually several chips on a single motherboard; this was soon refined down to one RSC (RISC Single Chip) with more than a million transistors. The RSC implementation of the POWER1 microprocessor was used as the central processor for the Mars Pathfinder mission and is the chip from which the PowerPC line is descended.
POWER2
Released in 1993 and in use until 1998: 15 million transistors per chip
The POWER2 added a second floating-point unit (FPU) and more cache. The P2SC superchip, a single-chip implementation of the POWER2's eight-chip architecture, powered the 32-node IBM DEEP BLUE® supercomputer that beat world champion Garry Kasparov at chess in 1997.
POWER3
Released in 1998: 15 million transistors per chip
The first 64-bit symmetric multiprocessor (SMP), POWER3 is completely compatible with the original POWER instruction set -- and compatible with the PowerPC instruction set as well. The POWER3 was designed for work on scientific and technical computing applications from aerospace and pharma to weather prediction. It features a data prefetch engine, non-blocking interleaved data cache, dual floating point execution units, and many other goodies. The POWER3-II reimplemented POWER3 using copper interconnects, delivering double the performance at about the same price.
POWER4
Released in 2001: 174 million transistors per processor
A gigaprocessor incorporating 0.18-micron copper and SOI (Silicon-on-Insulator) technology, the POWER4 was the single most powerful chip on the market when it was introduced. It inherited all of the characteristics of the POWER3 -- including compatibility with the PowerPC instruction set -- but reinvented itself with a completely new design. Each processor has two 64-bit 1GHz+ PowerPC cores, making it the first server processor with a multicore design on a single die (also known as "SMP on a chip," or "system on a chip"). Each processor can execute as many as 200 instructions simultaneously. The POWER4 supersedes the Star family of processors and is the power behind the IBM Regatta servers as well as being the father of the PowerPC 970 processor (also known as the Apple G5). The POWER4+™ (also known as POWER4-II) does the same, but at higher frequencies and with less power consumption. It was the first to use the 130-nanometer copper/SOI process.
POWER5
Released in 2003: 276 million transistors per processor
Like the POWER3 and POWER4, the POWER5 unifies the POWER and PowerPC architectures. The POWER5 is also based on the 130-nanometer copper/SOI process, and features communications acceleration, chip multiprocessing, a larger L2 cache, a memory controller on the chip, simultaneous multithreading, advanced power management, eFuse (morphing) and hypervisor technology. IBM servers built with the POWER5 feature up to ten LPARs per processor and are capable of running up to 256 independent operating systems on the highest end. POWER5 processors can be found hanging about in iSeries and pSeries servers, as well as in the first IBM entry-level UNIX/Linux box, the OpenPower™ line. IBM introduced the POWER5+™ processors, which are built with a 90-nanometer process similar to that used with the Cell Broadband Engine, in 2005. POWER5+ ups the clockspeed significantly -- on a smaller die.
POWER6
Under wraps for the most part; but IBM is taking the next step in the POWER line's evolution by involving customers more in setting the requirements for the design of the POWER5+ and higher iterations. The POWER6 is said to be code-named Eclipz, is expected to be based on the 65-nanometer process, and is expected by many to debut in the 2006-2007 timeframe.
The PowerAS family first appeared in 1995, and the first chips using the RS64 name appeared in 1997. These chips are known within IBM as the Star family, because most of the code words for the various iterations contained the word "star" or something like it (the notable exceptions are the original RS64, code-named "Apache", and the PowerAS chips).
Descended from a modified PowerPC architecture, they also inherited a number of traits from the POWER line. From inception they were optimized for one thing only: commercial workloads. This degree of specialization put them at the top of the UNIX server game for roughly six years.
The RS64 family left things like branch prediction, exceptional floating-point powers, and hardware prefetch to its POWER3 cousin and focused instead on exceptional integer performance and large, sophisticated on- and off-chip caches. The RS64 family was 64-bit for its entire span, and introduced multithreading in the RS64 II iteration in 1998. The RS64 could scale to as many as 24 processor SMP in a single machine, and -- unlike its POWERful cousins -- consumed as few as 15 watts per processor.
These qualities made it ideal for things like on-line transaction processing (OLTP), business intelligence, enterprise resource planning (ERP), and other large and hyphenated, function-rich, database-enabled, multi-user, multi-tasking jobs with high cache miss rates -- including Web serving. RS64 chips shipped in the IBM eServer pSeries™ (RS series) and iSeries (AS series) only. The contrast with the (also highly specialized) Cell Broadband Engine Architecture design is interesting, and shows a very different set of design priorities.
PowerAS
Released in 1995: model number A10, code name "Cobra," and model number A30, code name "Muskie"
The A10 was a uniprocessor, and the A30 was 4-way SMP.
RS64
Released in 1997, model number A35, code name: Apache
The first RS64 and the world's first 64-bit PowerPC RISC. Both superscalar and scalable, it was more compatible with POWER1 than later RS64 chips would be. By focusing on commercial workloads, it was able to implement functions on one chip that had previously required seven. It was used in the AS/400® (then called A35) and RS/6000®.
RS64 II
Released in 1998, model number A50, code name: Northstar
The second iteration featured four processors per card and up to three cards per RS/6000 to create a 4-way, 8-way, or 12-way SMP system.
RS64 III
Released in 1999, code name: Pulsar
The first RS64 to use IBM copper and SOI (Silicon on Insulator), now with six processor cards scaling up to 24-way SMP.
RS64 IV
Released in 2000, code names: IStar, SStar
The first mass-market processor to implement multithreading, RS64 IV was faster and smaller than its predecessors.
Today, the convergence of commercial and scientific computing has created a need for a single processor to address both markets, and the Star family has merged into the POWER family, starting with the POWER4.
The PC in PowerPC stands for performance computing. Descended from the POWER architecture, it was introduced in 1993. Like the IBM 801, it was designed from the beginning to run on a broad range of machines, from battery-operated handhelds to supercomputers and mainframes. But it saw its first commercial use on the desktop, in the Power Macintosh 6100.
Born of an alliance between Apple, IBM, and Motorola (also known as the AIM alliance), the PowerPC was based on POWER, but with a number of differences. For instance, PowerPC is open-endian, supporting both big-endian and little-endian memory models, where POWER had been big-endian. The original PowerPC design also focused on floating-point performance and multiprocessing capabilities. Still, it did and still does include most of the POWER instructions. Many applications work on both, perhaps with a recompile to make the transition.
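The endianness difference is easy to see in a few lines of code. The following is a minimal sketch using Python's standard `struct` module; the value `0x12345678` is an arbitrary example, and nothing here is specific to POWER or PowerPC hardware. It packs the same 32-bit integer in both byte orders and shows what happens when bytes written one way are read the other way.

```python
import struct
import sys

value = 0x12345678  # arbitrary example value

big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# Reading bytes with the wrong byte order silently yields a different number.
misread = struct.unpack("<I", big)[0]
print(hex(misread))  # 0x78563412

# Byte order of the machine running this script ("big" or "little").
print(sys.byteorder)
```

This is why data files and network protocols shared between machines usually fix a byte order explicitly, and why a processor that supports both memory models is easier to fit into an existing system.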
Since 1993, the PowerPC ecosystem has of course evolved; Apple is no longer actively involved in PowerPC, and Motorola's PowerPC (and other microprocessor) development work has been spun off into the independent semiconductor company known as Freescale. In 2004, AMCC acquired the 4xx line of customizable embedded PowerPC cores. In 2005, HCL Technologies of India announced it would open the first non-IBM Power Architecture Design Center; the landmark agreement enables HCL to sublicense end-to-end Open SystemC models, core hardening and integration services, SoC prototyping, and other services around the IBM PowerPC 405 and PowerPC 440 embedded microprocessor cores without any involvement from IBM. The Power.org consortium launched at the end of 2004 to foster innovation in and growth around the Power Architecture ecosystem (including PowerPC). Top-of-the-line custom processors, such as the next-gen Microsoft® Xbox 360® processor, are based on the PowerPC architecture, and IBM remains firmly committed to the PowerPC family as an important component of its microprocessor lineup. (Note that Freescale hints the PowerPC might undergo a name change in 2006!)
While IBM, Freescale, and AMCC develop their chips separately, at the user level all PowerPC processors run the same core PowerPC instruction set, ensuring full ABI compatibility for the software products that run on them. Since 2000, Freescale (then still Motorola) and IBM PowerPC chips have followed the Book E spec, which provides additional enhancements to make PowerPC more attractive for embedded processor applications such as networking and storage equipment as well as for consumer devices.
Aside from compatibility, one of the best things about the PowerPC architecture is that it is open: it specifies an instruction set architecture (ISA) that allows anyone to design and fabricate PowerPC-compatible processors; and source code for software modules developed in support of PowerPC is freely available. Finally, the small size of the PowerPC core leaves a great deal of room on each die for additional components, from added cache to coprocessors, allowing for an amazing amount of design flexibility. The Xbox 360 processor and the Cell Broadband Engine processor both give excellent examples of this, as do the various System-on-Chip processors, such as the AMCC 405GPr.
Two of IBM's five server lines are based on the PowerPC architecture, as are the current generation of Apple Computer desktop and server lines, the Nintendo GameCube, many IBM BladeServers, and the IBM Blue Gene® supercomputer.
Today, the three main PowerPC families are the embedded PowerPC 400 series and the stand-alone PowerPC 700 and PowerPC 900 families. For historical perspective, we will also give highlights for the stand-alone PowerPC 600, because it was the first.
PowerPC 600 family
The PowerPC 601 was the first chip in the first PowerPC family. A sort of bridge between the POWER and PowerPC architectures, it maintained more compatibility with POWER1 than later PowerPCs (even those from the same family), as well as compatibility with the Motorola 88110 bus. The PowerPC 601 made its debut in the very first PowerMac 6100 in 1994, running at blazing speeds of up to 66 MHz. The next chip in the line was the PowerPC 603™, a low-end, low-power core that is the chip most often found in cars. Released at the same time as the 603, in its day the PowerPC 604™ was the most powerful high-volume chip in the industry. Both the 603 and 604 were rereleased in tweaked "e" versions (the 603e and 604e) with improved performance. Finally, the first 64-bit PowerPC, the very high-end PowerPC 620®, was released in 1995.
PowerPC 700 family
With a debut in 1998, the PowerPC 740 and PowerPC 750 were very similar to the 604e -- some people would say they are all members of the same 600/700 family. The PowerPC 750 was the world's first copper-based microprocessor, and when used in Apple computers is usually known as the G3. It was rather quickly eclipsed by the G4, or Motorola 7400. The 32-bit PowerPC 750FX wowed the industry with speeds of up to 1 GHz when it was released in 2002. IBM followed this in 2003 with the 750GX, which incorporates 1MB of L2 cache at speeds of 1GHz at around seven watts of power consumption. The 6xx processors have been retired, but the 7xx line is very much alive and kicking at the high end of the embedded space: IBM introduced the new RoHS-compliant, low-power PowerPC 750GL in 800MHz and 933MHz flavors in 2005.
PowerPC 900 family
The 64-bit PowerPC 970, a single-core version of the POWER4, can process 200 instructions at once at speeds of up to 2 GHz and beyond -- all while consuming just tens of watts of power. Its low power consumption makes it a favorite with notebooks and other portable applications on the one hand, and with large server and storage farms on the other. Its 64-bit capability and single instruction multiple data (SIMD) unit accelerate computationally intensive workloads such as multimedia and graphics. It is used in Apple desktops, Apple Xserve servers, imaging applications, and -- increasingly -- in networking applications. Apple's Xserve G5, launched in March of 2004, represented the first use of the new PowerPC 970FX -- the first chip made using both strained silicon and SOI technologies together, enabling the chip to run at even greater speeds with even less power consumption -- in an off-the-shelf system. IBM introduced the PowerPC 970MP, with a slew of intriguing power-saving features and capabilities, in 2005. Today the PowerPC 970 is still found in some Apple products, as well as in IBM BladeServers and in Terra Soft Solutions and Genesi systems, and it has even been known to make the occasional appearance in the embedded space.
PowerPC 400 family
This is the embedded family of PowerPC processors. The PowerPC's flexible architecture allows for a great deal of specialization, and that is nowhere so apparent as in the 4xx family, which is equally at home in applications ranging from set-top boxes to the IBM Blue Gene supercomputer. On one end of the spectrum, the PowerPC 405EP consumes just one watt of power to achieve speeds of up to 200 MHz, while the copper-based 800 MHz PowerPC 440 series offers the 4xx line's highest performance for an embedded processor. Each 4xx subfamily can be specialized as well; for instance, the PowerPC 440GX's dual Gigabit ethernet and TCP/IP off-load acceleration can decrease utilization for packet-intensive applications by more than 50%. A large array of products is built around highly modified PowerPC 400 family cores, not the least of which is the Blue Gene supercomputer with two PowerPC 440 processors and two FP (floating point) cores per chip. AMCC acquired the 4xx line in 2004, although IBM is still handling manufacturing for these processors. AMCC has introduced new designs, including the security-conscious 440GRx and 440EPx, the next-gen PowerNP™ family, and the RAID-enabled 440SP and 440SPe, in 2005.
Originally thought of as a desktop chip, the PowerPC's low power needs make it an excellent candidate for the embedded space, and its high performance makes it attractive for advanced applications. It is well suited for everything from video game consoles and multimedia entertainment systems, to personal digital assistants and cell phones, to base stations and PBX switches. It is at home in broadband modems, hubs and routers, automotive subsystems, printers, copiers, and faxes. And of course, server systems and workstations, too.
The Cell Broadband Engine™ (Cell BE) architecture is a new architecture which extends the 64-bit Power Architecture technology. Capable of massive floating point processing and ideal for compute-intensive tasks, the Cell BE processor is a single-chip multiprocessor no bigger than a fingernail, with nine processors operating on a shared, coherent memory. The Cell BE processor contains a Power Architecture-based control processor (PPU) augmented with eight (or more) SIMD Synergistic Processor Units (SPUs) and a rich set of DMA commands for efficient communications between them all.
While it was originally designed for use in the Sony® PlayStation® 3, Sony, Toshiba, and IBM (known collectively as STI) have wider-ranging plans for the new family than that. While some in the developer community fear the complexity of its dual-ISA design, others rightly point out that the chip world is going multiprocessor across the board, and you've got to start programming for them sometime. You can do that right now with the IBM Cell BE SDK (see Resources); and it is expected that you will be able to do it on real hardware -- including multiprocessor systems, blade systems, some special-purpose hardware, and at least one evaluation platform -- beginning in 2006.
So far, the new Cell Broadband Engine Architecture family has only one member:
Cell Broadband Engine processor
Expected release date: 2006; 234 million transistors
The Cell BE processor is manufactured with a 90nm process and crams 234 million transistors onto a 221-square-millimeter die. This vast amount of real estate is inhabited by the 64-bit Power-based VMX-enabled central core (the PPE) and eight synergistic cores (the SPEs, each of which has its own dedicated DMA engine, a register file of over one hundred 128-bit registers, and 256KB of Local Store). As well, the tiny die houses 512KB of L2 cache, the chip's specially designed Element Interconnect Bus (EIB), the shared Memory Interface Controller (MIC), and the Rambus XDR and FlexIO interfaces. The Cell Broadband Engine supports hyper-pipelining, resource allocation, locking caches, virtualization, power management, and just enough redundancy to make it really reliable!
You will remember that the 801 project was in great part a reaction to the complexity of CISC systems and specifically, the extreme CISC of the IBM mainframe. Nevertheless, IBM mainframes were also beneficiaries of the 801 project, and so are distantly related to IBM's three lines of RISC processors. IBM's "fourth family" of processors, the mainframe chips, have a very complicated family history of their own.
One of the reasons for this is that mainframes rely much less on the CPU and more on system architecture and I/O channels than other types of computers. The revolutionary S/360 family of mainframes that introduced compatibility to the industry were still powered by magnetic cores. With a name change to S/370™ in 1971, they became the first mainframes in the industry to switch to chips. Of course they used CISC chips; specifically, bipolar junction transistors with a CISC architecture. Some of the S/360 and S/370 systems adopted some RISC design techniques, implementing part of the instruction set in hardware, which actually improved performance! An even more significant change came when they began to use CMOS instead of bipolar transistors; the first generation (or G1) CMOS mainframe chips came out around 1994, and by 1997 IBM announced that henceforth all mainframes would ship only with CMOS and never again with bipolar transistors. And it isn't only mainframes that have made the switch to CMOS: while bipolar transistors ruled the early chipmaking world, most of the processors made today are CMOS.
So what are these CMOS chips, exactly? Well, CMOS (complementary metal-oxide semiconductor) chips use metal-oxide semiconductor field effect transistors (MOSFETs). These are fundamentally different from bipolar transistors. A few of the effects of those differences are highlighted here; see Resources for details.
Bipolar transistors are blazingly fast, but they consume a great deal of power, even in a standby or steady state. Meanwhile, a field-effect transistor (FET) is achingly slow, but consumes no power at all in a steady state. Thus, for applications where long battery life is crucial -- and performance isn't -- FETs are the way to go. So, in the days when computing was still so primitive that people thought that digital watches were a really neat idea, it was CMOS chips that powered them. They also powered other applications requiring little power and none-too-fast performance -- like housing a personal computer's BIOS.
Now, another big difference between bipolar and FET transistors is topology: bipolar transistors have a vertical layout, while FET-based chips are built on the horizontal. Thus, there is more room on a FET-based chip. Eventually, around the cusp of the 1980s and 90s, the relentless march of miniaturization approached sizes so small that the larger area of the slower FET-based chips could be filled with enough transistors to whomp the performance superiority of the bipolar model. FET-based chips have one last thing going for them, which is that they interfere electronically with their neighbors much less than bipolar transistors do. So, while bipolar transistors run up against a wall where making them any smaller leads to unacceptable levels of electrical interference, FET-based chips can be made even smaller than that, and so packed even more densely in their larger surface area. Thus, most of the latest advances in nano-scale chip processing have been on CMOS chips.
The other really interesting thing about mainframe chips is their level of redundancy. They are usually packaged together in Multi-Chip Modules (MCM) of 20 or 30 chips or more: fully one half of them are there as backups, ready to take over if an active chip fails. Further, mainframes process each instruction they receive twice, on separate chips, and check their answer before returning it. As we reach the milestone of one billion transistors on a single chip, we may find that kind of stability applied to consumer processors as well.
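The "process twice and compare" idea can be sketched in software. The example below is an assumption-laden illustration, not a description of how any IBM mainframe implements it: the function names (`checked_execute`, `healthy_unit`, `faulty_unit`) are hypothetical, and the injected fault is a simulated single-bit flip. It runs the same operation on two independent units and refuses to return an answer unless both agree.

```python
def checked_execute(operation, operands, unit_a, unit_b):
    """Run the same operation on two independent units and compare the answers."""
    result_a = unit_a(operation, operands)
    result_b = unit_b(operation, operands)
    if result_a != result_b:
        # A real system might retry or switch to a spare; this sketch just reports.
        raise RuntimeError(f"mismatch: {result_a!r} != {result_b!r}")
    return result_a

def healthy_unit(operation, operands):
    """Hypothetical execution unit that computes the correct result."""
    a, b = operands
    return {"add": a + b, "mul": a * b}[operation]

def faulty_unit(operation, operands):
    """Hypothetical unit with a simulated fault: the lowest result bit is flipped."""
    return healthy_unit(operation, operands) ^ 1

print(checked_execute("add", (40, 2), healthy_unit, healthy_unit))  # 42

try:
    checked_execute("add", (40, 2), healthy_unit, faulty_unit)
except RuntimeError as err:
    print("detected:", err)  # detected: mismatch: 42 != 43
```

The trade-off is plain from the sketch: every operation costs twice the work, in exchange for catching errors before a wrong answer leaves the processor.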
What do the Nintendo GameCube's Gekko, Cray's X1 supercomputer chips, NVIDIA's latest GeForce processor, and the next-generation Microsoft Xbox and Sony PlayStation all have in common? All of them use chip technology licensed from or manufactured by IBM. In the last few years, IBM has begun to open its foundries -- and its research -- to outside business like never before. IBM's Engineering & Technology Services (E&TS) division has well over a thousand engineers for hire, available to work on software, technology, and chip engineering for their clients.
E&TS did much of the work on the Xbox 360 processor. Between the Xbox 360 processor (a three-core PowerPC), the Cell Broadband Engine (a dual-threaded PowerPC with eight specialized mathematical processors), and the processor in the Nintendo Revolution (which is an IBM chip, though it is not confirmed yet whether it is or isn't Power Architecture technology), IBM semiconductor solutions has managed a sweep of next-generation gaming console hardware. Various System-on-Chip designs are also based on Power Architecture technology, and HCL Technologies in India is developing designs built around Power Architecture technology.
Figure 1. It's wafer-thin: 300mm wafers yield more chips
And of course one of the many reasons that the Power Architecture technology is so appealing is the new top-of-the-line IBM fab in Fishkill, New York. The Fishkill fab is so up-to-date that it is capable of producing chips with all of the latest acronyms, from copper CMOS technology to Silicon-on-Insulator (SOI) and low-k dielectrics -- all on 300mm wafers. The Fishkill fab is so with-it that the server room runs exclusively on Linux. And the Fishkill fab is so amazingly, mind-bogglingly hip that it won Semiconductor International's 2005 Top Fab award.
As well, IBM foundries are the world's leading supplier of ASICs (application-specific integrated circuits), from Customizable Control Processor (CCP) options -- where a large portion of the design is fixed, but there is plenty of room left for customization -- to IBM design expertise in tailoring an existing product to a new application, to support for other suppliers' processors and coprocessors. In short, they're ready for anything.
Just twenty years ago, chip components were measured in microns, or thousands of nanometers. Today, chips produced on 300mm wafers contain components with an average size measured in the tens of nanometers. You will of course recall that one nanometer is one millionth of a millimeter, and that a human hair has a thickness of about 100,000 nanometers. At this rate, we will soon be measuring components in Angstroms.
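For readers who like to check the arithmetic, here is a small worked example of the unit relationships in the paragraph above. It uses the hair thickness quoted in the text and, as an assumed example feature size, the 90-nanometer process mentioned earlier in this article; the variable names are purely illustrative.

```python
NM_PER_MICRON = 1_000      # one micron is a thousand nanometers
NM_PER_MM = 1_000_000      # one nanometer is a millionth of a millimeter
ANGSTROMS_PER_NM = 10      # one nanometer is ten angstroms

hair_nm = 100_000          # hair thickness quoted in the text
feature_nm = 90            # assumed example: the 90-nanometer process

print(hair_nm / NM_PER_MICRON)        # 100.0 (microns)
print(hair_nm / NM_PER_MM)            # 0.1 (millimeters)
print(hair_nm // feature_nm)          # 1111 (features side by side across one hair)
print(feature_nm * ANGSTROMS_PER_NM)  # 900 (angstroms)
```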
Inexpensive processors with a billion transistors per chip are just around the corner, and industry watchers suggest we will reach speeds of 100GHz by 2010. The Cell Broadband Engine (a cooperative effort of Sony, Toshiba, and IBM) is widely considered an effective exploratory leap in one of the directions that might get us there.
In the nearer-term, we can look forward to the release of the Sony PlayStation 3 and the Toshiba Cell BE development board -- and maybe even the POWER6 -- in 2006; as well as to another fabulous year at the IBM developerWorks Power Architecture technology zone.
Cell Broadband Engine is a trademark of Sony Computer Entertainment Inc.
- Learn more about John Cocke
- John Cocke's first assignment upon joining IBM in the late 1950s was to work on the Stretch computer. While it never delivered on its promise to outperform the IBM 704 mainframe by a factor of 100, it did outpace its rival by a factor of 30. It also pioneered lookahead, pipelining, branch prediction, multiprogramming, memory protection, generalized interrupts, and the 8-bit byte -- and more! All of these were later used in the IBM System/360 line, and have since trickled out to most chips on the market today.
- The successor to the 704 was known as Project X, and it competed internally with the successor to Stretch, Project Y. While Project X was to become the IBM S/360 family of mainframe computers, Project Y would become ACS (Advanced Computing Systems), IBM's first attempt at a supercomputer. ACS was the project John worked on after Stretch, and was the forefather of John's next assignment, the 801.
- A hacker in the true sense of the word, John Cocke changed chip design -- and the computing world -- forever. For this he received a number of industry and national awards, not the least of which are commendation from The Franklin Institute and the 1987 Turing Award.
- More mainframes
- The S/360 mainframe was released almost exactly 40 years ago, priced to move at a mere US$133,000 for a basic configuration. Read a copy of the press release dated April 7, 1964.
- For more information on how mainframe architecture and issues have influenced Power Architecture technology, take a look at the Big Iron series in the Power Architecture technology zone.
- The newest IBM mainframe was introduced in 2005. Representing a major systems strategy shift, the new z9 can eat its predecessor, the T-Rex, for lunch.
- More history
- In the mid 1980s, IBM released the first workstation to use a RISC chip. Named the RT (in keeping with IBM AT and XT machines of the era), it was not destined for greatness. For some interesting technical background on the ROMP processor, read "The IBM RT PC ROMP processor and memory management unit architecture." You'll also find more about the RT and the ROMP chip under RISC in Wikipedia.
- See the IBM Archives and their links page for more on the history of IBM, and see wiki for a History of computation. Already an expert in computing history? Then take this Pop Quiz.
- The original Star, or RS64 IV, architecture is described in "A multithreaded PowerPC processor for commercial servers" (IBM Journal of Research and Development, 2000).
- More background
- As Wikipedia, the free encyclopedia, will explain, bipolar junction transistors (BJTs) are not only doped sandwiches, but also something essentially contrary to CMOS. If we could only successfully apply the advanced processes used in CMOS manufacture today to bipolar chips, we would make a quantum leap forward to chips with absolutely undreamed-of levels of performance. This CMOS gates demonstration will take your understanding of CMOS to the next level.
- "Maintaining the benefits of CMOS scaling when scaling bogs down" by E. J. Nowak (IBM Journal of Research and Development, 2002) attempts, among other things, to answer the question "What happens when we get to 5 nanometers?", while "The future of CMOS technology" by R. D. Isaac (IBM Journal of Research and Development, 2000) offers great background on the challenges facing chip designers, and on how and why CMOS has displaced bipolar transistor designs (note especially Table 1!).
- IBM Journal of Research and Development
- Most of IBM Journal of Research and Development Volume 34, Issue 1 is devoted to the original POWER architecture, often referred to in those days as "the RS/6000 processor" because it powered RS/6000 machines. This issue includes an article on "The evolution of RISC technology at IBM" by John Cocke himself, with Victoria Markstein (IBM Journal of Research and Development, 1990).
- All of the 2005 issues of IBM JoRD have been devoted to chippy topics as well. See: Electrochemical Technology in Microelectronics (Volume 49, Number 1), Blue Gene (Volume 49, Number 2/3), POWER5 and Packaging (Volume 49, Number 4/5), IBM BladeCenter Systems (Volume 49, Number 6), and (coming soon) Spintronics (Volume 50, Number 1).
- More POWER to you
- You will also find many good reference works on the IBM POWER2 Architecture, the POWER3, the POWER4 System, and the POWER5 System at the IBM Web site.
- Servers based on the POWER5 include the iSeries, which can run AIX, Windows Server, and Linux, and the IBM pSeries, with models from the very low to the very high end; both they and the entry-level OpenPower can run either AIX or Linux.
- Super Powers
- The POWER5 is also the brains behind the ASC Purple supercomputer, which boasts more than 12,000 POWER5 processors in 197 refrigerator-sized nodes. Together they cover an area equivalent to two basketball courts, several times the footprint of the ENIAC, whose size is the butt of many a nerdish joke. ASC Purple was delivered to Lawrence Livermore in July of 2005 and is currently at No. 3 on the TOP500.
- The Blue Gene supercomputer is currently considered the most powerful computer in the world according to the TOP500 supercomputer list, having been at No. 1 for all of 2005, and at both No. 1 and No. 2 in the 25th and 26th lists (the 27th list will come out in June 2006).
- IBM Semiconductor solutions now
- IBM Semiconductor solutions has a great Photo Catalog, and a nice group of resources on its technology and innovation page, nifty new video presentations, and oodles of documentation.
- In addition, IBM Semiconductor solutions offers Custom chip solutions with the broadest architecture support in the business: Power Architecture as well as other cores, including other suppliers' cores.
- IBM Semiconductor solutions also offers Evaluation kits for PowerPC cores, which come with schematics, source code, design details, and a comprehensive selection of tools to enable development of PowerPC-based applications. You can download the IBM PowerPC 970FX Evaluation Kit and the IBM PowerPC 750GX-750FX Evaluation Kit from the developerWorks Power Architecture downloads page.
- AMCC announced a number of 4xx PowerPC cores in 2005, including the PowerPC 440GR and the low-cost, low-power, security-enabled PowerPC 440GRx and PowerPC 440EPx.
- And in India, HCL Enterprise can also help you design your own PowerPC core.
- Absolutely fabulous
- This year, IBM has won the 2004 National Medal of Technology for semiconductor innovation, and the Fishkill fab was awarded Top Fab of 2005. In 2000, IBM's Burlington fab shared the first of these awards with two other facilities.
- Learn more about the IBM/Chartered "common platform" of foundry process technology and the network of technology alliances it has spawned.
Keep abreast of the next new Power Architecture breakthrough as it happens: subscribe to the Power Architecture Community Newsletter.
The Power.org organization is an excellent starting point for exploring the Power Architecture community.
Become a developer-level member and join the conversation.
Find more resources for Power developers in the developerWorks Power
Take part in the IBM developerWorks Power Architecture Cell Broadband Engine discussion forum.
Send a letter to the editor.
The developerWorks Power Architecture editors welcome your comments on this article. E-mail them at firstname.lastname@example.org.