Beat the heat in 3D chip stacks with ICECool

Editor’s note: This article is by Timothy Chainer, Pritish R. Parida, and Mark Schultz, IBM Research.

In the Moore’s Law race to keep improving computer performance, the IT industry has turned upward, stacking chips like nano-sized 3D skyscrapers. But those stacks, like the law they’re challenging, have their limits due to overheating. So in 2013 our team in New York, alongside colleagues in Zurich, received a contract from the Defense Advanced Research Projects Agency (DARPA), under its ICECool program, to tackle intra-chip cooling. For our part, we developed a new cooling technology to overcome the thermal barrier of stacking chips: an on-chip solution that could even help cool entire datacenters.

Today, chips are cooled by fans that push air through heatsinks sitting on top of the chips to carry away excess heat. Advanced water-cooling approaches, which are more effective than air cooling, replace the heatsink with a cold plate that sits closer to the chip. But because water is electrically conductive, this approach requires a barrier to protect the chip. ICECool uses a nonconductive fluid to take the next step of bringing the fluid into the chip (as shown in the image below), doing away with the need for a barrier between the chip and the fluid. It not only delivers a lower device junction temperature (Tj), but also reduces system size, weight, and power consumption (SWaP). Our tests on IBM Power 7+ chips demonstrated a junction temperature reduction of 25°C and a chip power reduction of 7 percent compared to traditional air cooling.

Cooling system technology trend toward ICECool
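
To put the junction-temperature comparison in rough numbers, here is a minimal back-of-the-envelope sketch using a simple one-dimensional thermal model, Tj = T_ambient + P × Rθ. The chip power and thermal resistances below are illustrative assumptions, not IBM’s measured values.

```python
# Illustrative junction-temperature comparison, not IBM's measured data.
# Simple one-dimensional model: Tj = T_ambient + P * R_theta, with hypothetical
# junction-to-ambient thermal resistances for each cooling approach.

CHIP_POWER_W = 150.0   # assumed chip power dissipation (W)
T_AMBIENT_C = 25.0     # assumed ambient temperature (C)

# Hypothetical thermal resistances, in C per W
R_THETA = {
    "air-cooled heatsink": 0.35,
    "water-cooled cold plate": 0.25,
    "embedded two-phase (ICECool-style)": 0.18,
}

for approach, r_theta in R_THETA.items():
    tj = T_AMBIENT_C + CHIP_POWER_W * r_theta
    print(f"{approach:36s} Tj = {tj:5.1f} C")
```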

Today’s chip stack “skyscrapers” are in reality more like chip stack “row houses.” A heatsink or cold plate limits how high chips can be stacked because it cannot cool chips in the middle and bottom of the stack. IBM’s ICECool technology circumvents that problem by pumping a heat-extracting dielectric fluid right into the microscopic gaps between the chips, some no thicker than a single strand of hair, at any level of the stack.

The dielectric fluid used in ICECool can come into contact with electrical connections, so it is not limited to one part of a chip or stack. This “go anywhere” ability benefits chip stacks in terms of materials and architecture, such as putting memory directly on the stack, which improves the speed of everything from graphics rendering to deep learning algorithms.

Cooling fluid in chip stack

ICECool works much like the coolant in a car’s air conditioning. It’s pumped into the chips, where it removes heat by boiling from liquid to vapor. It then re-condenses, dumping the heat to the ambient environment, and the process begins again. Cars, though, need a compressor to cool the air below the ambient temperature (because rolling down the window doesn’t help much in rush hour traffic). Chips, unlike humans, can operate at 85°C (185°F), so outdoor ambient temperatures are already cooler than the chips. Our ICECool process therefore doesn’t need a compressor, one of many factors that lower a datacenter’s energy use.
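
As a rough illustration of the two-phase principle, the sketch below estimates how little coolant flow is needed when heat is carried away by boiling (Q = ṁ × h_fg). The chip power, latent heat, and fluid density are assumed values typical of fluorocarbon dielectric coolants, not figures from the article.

```python
# Back-of-the-envelope flow-rate estimate for two-phase cooling.
# Heat absorbed by boiling: Q = m_dot * h_fg, so m_dot = Q / h_fg.
# The latent heat and density below are assumed values typical of
# fluorocarbon dielectric coolants, not figures from the article.

CHIP_POWER_W = 150.0     # heat to remove from the chip (W)
H_FG_J_PER_KG = 100e3    # assumed latent heat of vaporization (J/kg)
DENSITY_KG_PER_L = 1.6   # assumed liquid density of the dielectric fluid (kg/L)

m_dot_kg_s = CHIP_POWER_W / H_FG_J_PER_KG                 # liquid boiled off, kg/s
flow_ml_min = m_dot_kg_s / DENSITY_KG_PER_L * 1000 * 60   # volumetric flow, mL/min

print(f"Mass flow needed:  {m_dot_kg_s * 1000:.1f} g/s")
print(f"Volumetric flow:   {flow_ml_min:.0f} mL/min")
```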

Datacenters chill out with ICECool, too

Datacenters in the US – often nondescript buildings spanning millions of square feet, full of servers that, among many other things, power the internet – use about 70 million megawatt-hours (MWh) of electricity annually. That translates to about 2 percent of the country’s electricity. Two percent may not sound like much, but it’s more electricity than each of 29 states, as well as the District of Columbia, uses individually in a year.
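
A quick arithmetic check of that “about 2 percent” figure, assuming an approximate annual US electricity consumption of 3.9 billion MWh (an outside assumption, not a number from this article):

```python
# Quick sanity check of the "about 2 percent" figure. The US total below is an
# assumed approximation of annual US electricity consumption, not from the article.

DATACENTER_MWH = 70e6   # annual US datacenter electricity use cited above (MWh)
US_TOTAL_MWH = 3.9e9    # assumed annual US electricity consumption (MWh)

share = DATACENTER_MWH / US_TOTAL_MWH
print(f"Datacenter share of US electricity: {share:.1%}")  # roughly 1.8%, i.e. "about 2 percent"
```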

IBM Research teams have been hard at work reducing the heat produced by datacenters, which accounts for a third of those 70 million MWh. While most datacenters today are air cooled, IBM has developed warm-water cooling through projects such as a Department of Energy project (Economizer Based Data Center Liquid Cooling) and the SuperMUC hot-water-cooled datacenter in Munich. While water is an effective coolant and has been shown to provide significant cooling energy savings, it requires isolation from the electronics. Because ICECool uses a nonconductive dielectric fluid, it can come into direct contact with electronics and remove heat by converting from liquid to vapor as it flows through the electronics package.

CRAC (Computer Room Air Conditioning) and CRAH (Computer Room Air Handler) units, which are like “heatsinks” for today’s datacenters, blow chilled air across the rows and rows of servers. That chilled air is supplied by a compressor-based chiller (like a car’s AC). The chiller rejects the heat through a cooling tower on the exterior of the datacenter. Think of the tower as a giant radiator that dumps heat into the atmosphere, as shown at the top left of the diagram below. This is the loop that accounts for one-third of a datacenter’s energy use.

Datacenter cooling

ICECool has the potential to eliminate the chiller, the CRAC units, and most of the fans because it can be in direct contact with any and all electronic components. Based on our tests with IBM Power Systems, ICECool technology could reduce the cooling energy for a traditional air-cooled datacenter by more than 90 percent.
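
Combining the figures cited above gives a rough sense of the total-energy impact: if cooling is about one-third of a datacenter’s energy and ICECool removes more than 90 percent of that cooling energy, the overall saving is on the order of 30 percent. A minimal sketch of that arithmetic:

```python
# Rough estimate of the total-energy impact, combining figures cited above:
# cooling is about one-third of a datacenter's energy, and ICECool could cut
# that cooling energy by more than 90 percent.

COOLING_FRACTION = 1.0 / 3.0   # share of datacenter energy spent on cooling
COOLING_REDUCTION = 0.90       # fraction of cooling energy ICECool could eliminate

total_savings = COOLING_FRACTION * COOLING_REDUCTION
print(f"Potential reduction in total datacenter energy: {total_savings:.0%}")  # about 30%
```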

Read our IEEE Transactions on Components, Packaging and Manufacturing Technology paper, “Improving Data Center Energy Efficiency with Advanced Thermal Management,” to learn more about ICECool. IBM Research scientists will also present the results at the IEEC 29th Annual Electronics Packaging Symposium (EPS) in September and at SC17 in November.

Acknowledgement: This project was supported in part by the U.S. Defense Advanced Research Projects Agency Microsystems Technology Office ICECool Fundamentals Program under award number HR0011 13-C-0035 and ICECool Applications Program under award number FA8650-14-C-7466.

Disclaimer: The views, opinions, and/or findings contained in this article are those of the author(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.