It's all very nice to have incredibly fast processors, ludicrous amounts of disk space, and so much RAM that a comparable amount of magnetic core would outweigh the planet. However, if you can't move data from the CPU to memory or from memory to your disk controller, a blown-out system doesn't do you a lot of good.
In the world of chip interconnects, some degree of standardization is absolutely essential -- you obviously can't function without a way to connect devices together. And as I've demonstrated in previous columns, a few standardized methods are usually better for the developer, the consumer, and the manufacturer than an uncountable host of competing, proprietary protocols.
As computer speeds and bandwidth needs have doubled, tripled, and more, the need for faster data transfer has led to a great deal of work in this field. Unfortunately for nearly everybody, a huge amount of that work has been proprietary. Still, some of the designs are becoming standardized now, and this is leading to a dramatic reduction in costs to developers and users alike. Five years ago, many hardware developers felt the need to craft custom interconnects. Now that seems like a silly task; there are plenty of options to choose from.
The current four big contenders are (in alphabetical order):
- HyperTransport (HT)
- Infiniband (IBA)
- PCI Express (PCIe)
- RapidIO
This article examines the ways in which these standards are competing, cooperating, and interoperating. Note that all of these are trade association standards first and foremost; they are based on the needs and technical input of companies which have invested in membership in industry consortiums. It is particularly interesting to note that some companies are involved with more than one of these standards.
What are we talking about, anyway?
All of these standards provide architectures for moving data from one chip to another -- for instance, from memory to CPU and back, or from a CPU to a video card. In essence, these standards solve the same kinds of problems that PCI solved (see Resources for the Standards and specs column on PCI that provides some history on that).
These specifications can cover both software and hardware layers. HyperTransport's FAQ (which, I must admit, struck me as perhaps slightly influenced by marketing decisions) points out that HyperTransport is software-compatible with PCI, even though its other technical specifications are, to put it lightly, a bit different.
All of these interconnect specifications offer much larger bandwidth than traditional PCI (and PCI Extended, or PCI-X). The canonical bandwidth of PCI is 133MBps (a 33MHz clock and 32-bit-wide transfers give you four bytes per cycle, for about 133MBps). Bandwidth numbers are hard to calculate, harder to measure, and still harder to get any useful information out of. Someone (it is hotly debated -- I think it was Mark Twain) once said that there are "lies, damn lies, and statistics"; benchmarks have generally been considered to be in a class of their own, one that makes conventional statistics seem reliable and informative by comparison. In practice, you may reasonably assume that all four are "much faster than PCI."
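The PCI figure is simple arithmetic, and a minimal Python sketch makes it explicit. The function name `peak_mbps` is made up for illustration, and the result is a theoretical peak under the stated assumptions, not a measured rate.

```python
def peak_mbps(clock_mhz: float, width_bits: int) -> float:
    """Theoretical peak of a parallel bus, in megabytes per second.

    Assumes one transfer per clock cycle and no protocol overhead.
    """
    bytes_per_transfer = width_bits / 8
    return clock_mhz * bytes_per_transfer


# Classic PCI: a nominal 33MHz clock (33.33MHz in practice) and a 32-bit bus.
print(f"PCI peak: about {peak_mbps(33.33, 32):.0f}MBps")
```

Real transfers fall short of this number because the same bus also carries addresses and commands, and because devices have to take turns.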
The PCI specification was a parallel specification; every device shared a single 32-bit-wide bus. HyperTransport is also a parallel interface, with links between 2 and 32 bits wide running at clock speeds between 200 and 1400MHz. Data can move on both edges of the clock, allowing up to 2.8 billion transfers per second at the top clock speed.
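To see what those ranges mean for one direction of a HyperTransport link, the sketch below tabulates peak rates at the top clock speed. The particular set of link widths (2, 4, 8, 16, and 32 bits) and the helper name are assumptions for illustration; the text above gives only the range.

```python
HT_WIDTHS_BITS = (2, 4, 8, 16, 32)  # assumed set of link widths


def ht_peak_gbytes_per_s(clock_mhz: float, width_bits: int) -> float:
    """Peak one-way rate of a double-pumped parallel link, in GB per second."""
    transfers_per_s = clock_mhz * 1e6 * 2  # data moves on both clock edges
    return transfers_per_s * (width_bits / 8) / 1e9


for width in HT_WIDTHS_BITS:
    rate = ht_peak_gbytes_per_s(1400, width)
    print(f"{width:2d}-bit link at 1400MHz: {rate:5.1f}GBps each way")
```

At 32 bits this works out to 11.2GBps each way. Counting both directions of a link gives 22.4GBps, which is where peak figures of about 22GBps for HyperTransport come from.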
By contrast, Infiniband, RapidIO, and PCI Express are serial interfaces, although some implementations might use multiple serial channels. You might think that a six-bit parallel interface and half a dozen single-bit serial interfaces are pretty much interchangeable, but there are key differences. One is that with the serial interface each signal contains its own clock. It doesn't matter if two adjacent serial channels have slightly different speeds or even wildly different speeds -- they are not required to be in sync with each other.
In all of these specs, commands and data are somewhat mixed together, so the theoretical bandwidth of the bus is higher than the bandwidth actually available for moving data. Although a HyperTransport bus can, in principle, move 22GBps, some significant portion of that traffic will be metadata, such as commands or addresses, rather than the actual bits being moved. PCI Express and Infiniband use 8b/10b encoding to keep signal clocks clear, sending every 8 bits of data as a 10-bit symbol. This has the side effect of imposing some overhead: out of 10GB of data transmitted over a PCI Express bus, 2GB will be encoding overhead. RapidIO also uses 8b/10b encoding for its serial implementation.
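The encoding overhead can be worked out directly. This is a minimal sketch, with a made-up function name, that covers only the line-coding cost; packet headers, commands, and addresses would come out of the remaining payload.

```python
def split_8b10b(raw_amount: float) -> tuple[float, float]:
    """Split raw line traffic into (payload, encoding overhead).

    With this encoding, every 10 bits on the wire carry 8 bits of data.
    """
    payload = raw_amount * 8 / 10
    return payload, raw_amount - payload


payload_gb, overhead_gb = split_8b10b(10)  # 10GB on the wire
print(f"payload: {payload_gb}GB, overhead: {overhead_gb}GB")
```

The same 20 percent applies whatever the unit, so a link rated at 10Gbps raw carries at most 8Gbps of data.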
Nominal top performance for Infiniband is 120Gbps, at 12 lanes (12X) and quad data rate. Most systems today are probably using 4X links at single or double data rate, giving a more sedate 10-20Gbps of raw bandwidth, or 8-16Gbps of real data transfer. PCI Express is built from single-bit bidirectional "lanes," each of which uses two wires each way; a link can combine anywhere from 1 to 32 lanes. So, in principle, an x16 slot (the kind used for most video cards) supports 80Gbps of raw data transfer: 40Gbps each way, which, at 10 bits of raw data per byte of real data, comes to about 4GBps each way. Future versions of PCIe might support signaling rates of 5 or 10GHz.
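The lane arithmetic for both serial interconnects follows one pattern, sketched here in Python with rates in gigabits per second. The 2.5Gbps base rate per lane is an assumption: it is not stated in the text, but it is the rate implied by 40Gbps each way across 16 lanes. The names are illustrative only.

```python
LANE_RAW_GBPS = 2.5  # assumed base signaling rate per lane, per direction


def link_gbps(lanes: int, rate_multiplier: int = 1) -> tuple[float, float]:
    """Return (raw, data) rates in gigabits per second for one direction."""
    raw = lanes * LANE_RAW_GBPS * rate_multiplier
    data = raw * 8 / 10  # 8 data bits for every 10 bits on the wire
    return raw, data


links = {
    "PCIe x16": link_gbps(16),
    "Infiniband 4X single data rate": link_gbps(4, 1),
    "Infiniband 4X double data rate": link_gbps(4, 2),
    "Infiniband 12X quad data rate": link_gbps(12, 4),
}
for name, (raw, data) in links.items():
    print(f"{name}: {raw:g}Gbps raw, {data:g}Gbps data ({data / 8:g}GBps)")
```

For the x16 slot this gives 40Gbps raw and 32Gbps of data each way, or 4GBps.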
The other specification that matters is latency. Latency is, of course, even harder to pin down with a single concrete number. The latency of fetching a single byte might be much lower than the latency of fetching a full 64-bit word, or they might be about the same. This is one area where HyperTransport seems to have an edge with its greater capability for parallelism.
In all cases, data generally has to be broken into chunks. No matter what width of HyperTransport bus you're using, packets come in 32-bit chunks; each chunk takes eight transfers over a 4-bit bus, but only one transfer on a 32-bit bus. With serial interfaces, data must be aggregated over a handful of cycles. On the other hand, serial interfaces might be able to manage higher clock speeds.
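A short Python sketch of the chunking arithmetic follows. The function names are hypothetical, and the serial case assumes a single lane using the 8b/10b encoding discussed earlier, so each byte costs 10 bit times.

```python
import math


def parallel_transfers(width_bits: int, chunk_bits: int = 32) -> int:
    """Bus transfers needed to move one chunk over a parallel link."""
    return math.ceil(chunk_bits / width_bits)


def serial_bit_times(chunk_bits: int = 32) -> int:
    """Bit times needed on one serial lane, assuming 10 line bits per byte."""
    return chunk_bits // 8 * 10


for width in (4, 8, 16, 32):
    print(f"{width:2d}-bit parallel link: {parallel_transfers(width)} transfer(s)")
print(f"single serial lane: {serial_bit_times()} bit times")
```

A count of transfers says nothing about how long each one takes, which is why a serial lane with a much higher clock can still compete.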
Interconnects get used all over the place. For instance, the connection between the northbridge chip (which handles CPU, memory, and high-speed bus interfaces, such as AGP or PCIe) and the southbridge chip (which handles devices such as a regular PCI bus and legacy system devices such as non-volatile RAM, real-time clock, and so forth) might be built using HyperTransport, even if the rest of the system doesn't use it. Bridges from one interface to another are not uncommon; in general, they have little effect on potential bandwidth, but a significant effect on latency. However, latency can have a huge impact on effective bandwidth if a protocol requires a great deal of back-and-forth communication.
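The interaction between latency and effective bandwidth can be illustrated with a deliberately simple model: the time to move a payload is its transfer time at peak rate plus one latency penalty per round trip the protocol requires. The link speed, latency, and payload size below are hypothetical numbers chosen for illustration, not measurements of any real interconnect.

```python
def effective_bytes_per_s(payload_bytes, peak_bytes_per_s, round_trips, latency_s):
    """Effective throughput when each round trip stalls the link for latency_s."""
    transfer_time = payload_bytes / peak_bytes_per_s
    return payload_bytes / (transfer_time + round_trips * latency_s)


PEAK = 1e9       # hypothetical 1GBps link
LATENCY = 1e-6   # hypothetical 1 microsecond per round trip

for trips in (0, 1, 4, 16):
    rate = effective_bytes_per_s(4096, PEAK, trips, LATENCY)
    print(f"{trips:2d} round trips per 4KB: {rate / 1e6:6.0f}MBps effective")
```

In this model, four round trips per 4KB transfer cut the effective rate roughly in half, even though the link's peak bandwidth has not changed.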
Not all systems need a northbridge chip anymore. Some AMD64 chips have an integrated memory controller, which dramatically reduces memory latency, and PCI Express systems have much less reason to include a dedicated AGP controller providing "fast" access to the CPU and main memory.
Infiniband has some focus on longer connections. While HyperTransport and PCIe connections are typically found within a motherboard, Infiniband is used for connecting external peripherals as well. For instance, multiple machines could share access to a disk server device using Infiniband (curiously reminiscent of college computer labs connecting several computers to a single SCSI disk, back when it was too expensive to have multiple hard disks!). It's also useful for improving the performance of clusters, giving machines a communications protocol with better bandwidth and latency than Ethernet. HyperTransport has been used for CPU-to-CPU communications, especially on multiprocessor AMD systems. By contrast, PCI Express and RapidIO are more focused on CPU-to-I/O device connections.
As is so often the case, protocols have layers. A system using HyperTransport might well use it to connect a PCI bus to a CPU. A system using PCI Express is very, very likely to have a regular PCI bus available for compatibility with old hardware. The overwhelming majority of "PCI Express" products you can buy in stores today are video cards targeted at systems that have a single PCI Express slot, so many systems offer only a single PCIe slot, but several old-fashioned PCI slots. (Annoyingly, they are often just 33MHz/32 bits, not even offering the enhancements of higher clock speeds or data widths.)
Protocol can matter a lot. The difference between performance in streaming gigabytes of writes to a memory or disk device, and performance with a more elaborate protocol, such as USB, can be stunning. It is often hard to explain why we need such huge bus bandwidth to get the full performance of devices with much lower-rated raw performance.
Tunneling -- using one protocol to move packets for another protocol -- likewise increases overhead. But the benefits can be significant since tunneling allows multiple interconnects to be used together on a system. In general, these interconnects are intended to provide transparent access to existing protocols. For instance, if you have a PCI bridge attached to a newer interconnect architecture, existing PCI driver code should work on it without any modifications.
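As a rough picture of why tunneling adds overhead while staying transparent, the following Python sketch wraps a packet from one protocol inside the header of another. The header contents and sizes are invented for illustration and do not correspond to any real interconnect's packet format.

```python
def encapsulate(header: bytes, packet: bytes) -> bytes:
    """Wrap a packet in another protocol's header."""
    return header + packet


def efficiency(payload: bytes, frame: bytes) -> float:
    """Fraction of the bytes on the wire that are payload."""
    return len(payload) / len(frame)


payload = bytes(64)                 # 64 bytes of data
inner_header = b"INNERHDR"          # hypothetical 8-byte header
outer_header = b"OUTER-HEADER-16B"  # hypothetical 16-byte header

native = encapsulate(inner_header, payload)
tunneled = encapsulate(outer_header, native)

print(f"native:   {len(native)} bytes, {efficiency(payload, native):.0%} payload")
print(f"tunneled: {len(tunneled)} bytes, {efficiency(payload, tunneled):.0%} payload")
```

The inner packet passes through unchanged (`tunneled[len(outer_header):] == native`), which is the property that lets existing driver code keep working on the far side of the tunnel.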
I can hardly talk about multiple specifications without putting in a section on who is winning, but really, the answer is that developers are the clear winners. These architectures coexist nicely. A machine might use HyperTransport to connect CPU and memory and also have a PCI Express video card. Protocols can be tunneled; in fact, there's even a protocol (HyperTunnel) for tunneling HyperTransport packets over Infiniband hardware.
People who want to build or use hardware are winning. Multiple competing specs exist for moving gigabytes of data around. What would have been a design challenge for a team of engineers with a large budget a few years back is now a question of deciding which off-the-shelf product to use.
As much as I hate to say it, it might be for the best that there are multiple standards in this field. While bridges between interconnects have some performance impact, it's quite clear from the continued investment in each of these standards that they are meeting real needs.
HyperTransport's low latency for very short distances serves a need, but so does Infiniband's ability to run a cable more than a foot long. Interoperability between these standards seems to be unusually good. You might speculate on two likely causes of this:
- First, these specifications are necessarily being worked on by people with an interest in interconnection and interoperability.
- Second, many of the same players are active on two or more of these standards.
Doing research for this article turned up various other articles talking about who was "winning." I don't think any two of these articles gave the same list. This provides a basis for an alternative strategy when you are trying to decide which of two competing standards to go with: Get involved with both, make sure they can talk to each other, and lower your risks.
As Andrew Tanenbaum once said, "The wonderful thing about standards is that there's so many of them to choose from."
Resources
Learn
- Wikipedia has articles on HyperTransport, PCI Express, and RapidIO.
- Check out the home pages for the RapidIO consortium, the HyperTransport consortium, and the PCI-SIG.
- The 8B/10B encoding specification was developed by IBM and was described in 1983 by Al Widmer and Peter Franaszek in the IBM Journal of Research and Development.
- You can connect HyperTransport systems over Infiniband, if you want.
- EE Times covered the ongoing state of interconnect standards.
- Anytime you need to find a fun quote (about, say, statistics), try QuoteGarden.
- The IBM CoreConnect™ bus architecture is detailed in this PDF whitepaper.
- "SoC design with CoreConnect: 128-bit PLB explained" (developerWorks, June 2005) is a good close-up of CoreConnect, which is based on three buses: the Device Control Register (DCR), the Processor Local Bus (PLB), and the On-chip Peripheral Bus (OPB).
- "Standards and specs: The PCI bus" (developerWorks, February 2005) looks at the war that PCI won and the possible permutations spawned by that win.
Get products and technologies
- Start the new year right by checking out the downloads available for the Cell Broadband Engine architecture.
- Experimenting with the emerging technologies from alphaWorks can make you an expert in all types of tomorrow's technologies.
Discuss
- Need to find out about how these specs can affect your PowerPC® processor project? Post your query on a developerWorks forum.






