
We’ve Reached the Summit

Introducing the world’s smartest, most powerful supercomputer

In 2014, the US Department of Energy (DOE) kicked off CORAL, a multi-year collaboration between Oak Ridge National Laboratory (ORNL), Argonne National Laboratory (ANL) and Lawrence Livermore National Laboratory (LLNL), and the next major phase in the DOE’s scientific computing roadmap and path to exascale computing.

They selected IBM to build two of their next-generation supercomputers, Summit at ORNL and Sierra at LLNL. Summit, online now,* is the world’s most powerful and smartest supercomputer, capable of an estimated 200 quadrillion calculations per second, or 200 petaflops. It will deliver 5 to 10 times the performance of its predecessor on DOE applications. It’s a million times more powerful than a high-end laptop, allowing it to sift through thousands of variables and build models and simulations that help find answers to the world’s most complex problems.

Summit is based on a fundamentally new architecture spearheaded by IBM, where compute power is embedded everywhere data resides. It combines some of the world’s most powerful central processing units (CPUs) with graphics processing units (GPUs) optimized for scientific and artificial intelligence (AI) workloads, all connected with lightning-fast networking. That means it can handle any type of high-performance computing (HPC) workload, from traditional modeling and simulation to data analytics and AI, at unprecedented scale and speed on massive data sets. Summit can make connections and predictions that will help advance cancer research, understand genetic factors that contribute to opioid addiction, simulate atoms to develop stronger, more energy-efficient materials, and understand the elements of supernovas, or exploding stars, to explore the universe in new ways.

And this technology isn’t just for the government. This new supercomputing technology is in our business product line with IBM AC922 and the family of new POWER9-based servers built for AI workloads. The result? Business computing that can help banks identify fraud in real time or pinpoint a breakdown in a company’s supply chain and fix it before it impacts customers.

The Summit journey begins

When my team at IBM Research started on the journey to architect and build Summit more than five years ago, we knew there would be a convergence of high-performance computing, big data, and analytics/AI, and that we would need to rewrite the rules to meet the demands of the future. We brought together the best and brightest across the technology industry to redefine supercomputing and deliver new innovations and discoveries. Summit is transformational: it does not simply make yesterday’s applications run faster; it also enables the incorporation of AI into HPC and other workloads to achieve results never before possible.

Back in 2013, our mission was to build something between 5 and 10 times more powerful than what the government had at the time. Not only that, they wanted to run much more complex codes than they had in the past: very large “scalable science” codes that can run on the entire system, as well as throughput, or ensemble, codes where multiple copies of the code run at the same time. These computers would have to manage impossibly complex workloads at scale with massive data sets.

Why couldn’t data be king?

In the traditional supercomputing model, the CPU was king. You had the CPU in one place and the data sitting separately in another place. Every time you wanted the CPU to operate on the data, the data had to travel to the CPU. Moving data to a place it can be operated on eats up time and energy. We began to ask, why should the processor be the king? The data should be king.  The data needs to be near the processing elements and we had to minimize data movement.

We realized that with the rate of improvement of silicon technology slowing down, we could get much more system performance by using accelerators (specialized processors like GPUs with their own memory nearby) coupled with more general-purpose CPUs with their own memory. The CPUs play a critical role in feeding the GPUs with data, orchestrating the workload, running the serial parts of the code that do not run efficiently on the GPUs, and managing all of the other services it takes for the computer to work. And the GPUs and CPUs could share what was in their memories. Combining CPUs and GPUs creates a heterogeneous architecture, and with it you get an enormous performance benefit.

Together with NVIDIA, our IBM Research team envisioned using NVLink between the CPUs and GPUs for Summit, a proprietary interface that allows CPUs and GPUs to share data up to 4X faster than x86-based systems. We also made the system more efficient at moving data by working with Mellanox to put extremely high-speed connections between the 4,608 individual CPU/GPU compute nodes.

That means that tasks that took weeks or months now take hours or days to complete. To harness Summit-scale capabilities for AI, we’ve worked for several years on software libraries and algorithms that exploit not only GPUs but also NVLink and high-performance networking. Our AI software, like Distributed Deep Learning (PowerAI DDL), can teach computers to identify objects in images, understand the contents of documents, forecast demand or assess risk. As an example, we showed that using supercomputing technology, we could train a model to identify objects in images at higher-than-human accuracy. Image recognition software has potential applications in a wide variety of fields, from identification of cancer in medical scans to assessing damage after a hurricane.

The computing power of a bleeding-edge machine like Summit can help train AI much faster than traditional computers. A much shorter time to solution ultimately means higher-quality AI that can be applied to more data types and used to create more solutions, because it becomes feasible to run many more studies. Our research team is working not only on Summit-scale AI, but also on systems for AI in the enterprise, taking mini versions of Summit technology (the AC922 POWER9 system and PowerAI software) into clients’ data centers to help them create AI on their own data.

Summit is the world’s smartest and most powerful computer.  IBM Research and IBM Systems worked together to integrate numerous advanced technologies and to architect, design, and build this incredible system.

Summit

Summit by the numbers

200: Petaflops (quadrillion calculations per second)

9,216: Number of IBM POWER9 CPUs.

27,648: Number of NVIDIA Volta GPUs.

4,608: Number of nodes.

305 days: If every person on Earth completed one calculation per second, that’s how long it would take to do what Summit can do in 1 second.

1 million: Approximate number of times faster Summit is than a high-end laptop.

340 tons: Summit’s cabinets, file system, and overhead infrastructure weigh more than a large commercial aircraft.

5,600 sq. ft.: Summit takes up the space of two tennis courts.

185 miles: Summit is connected by enough high-speed fiber optic cable to stretch from Knoxville to Nashville.

74 years: Summit’s file system can store 250 petabytes of data, or the equivalent of 74 years of high definition video.

4,000: Gallons of water flowing to cool Summit, per minute.

70: Temperature of the water, in degrees Fahrenheit.
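
For readers who want to check the arithmetic behind a few of these figures, here is a rough back-of-the-envelope sketch. The world population of about 7.6 billion, the use of decimal petabytes, and the 365-day year are assumptions made for this illustration; the post itself does not state them. Everything else comes from the list above.

```python
# Back-of-the-envelope checks on the "by the numbers" figures.
PETA = 10**15

summit_flops = 200 * PETA   # 200 petaflops
nodes = 4_608
cpus = 9_216
gpus = 27_648

# Per-node counts implied by the totals in the list.
cpus_per_node = cpus // nodes
gpus_per_node = gpus // nodes

# ASSUMPTION: world population of about 7.6 billion (not stated in the post).
world_population = 7_600_000_000
seconds_for_humanity = summit_flops / world_population
days_for_humanity = seconds_for_humanity / 86_400

# Laptop speed implied by "around a million times faster".
laptop_flops = summit_flops / 1_000_000

# Video data rate implied by "250 petabytes = 74 years of HD video".
# ASSUMPTION: decimal petabytes and 365-day years.
storage_bytes = 250 * PETA
video_seconds = 74 * 365 * 86_400
video_bytes_per_second = storage_bytes / video_seconds

print(f"CPUs per node: {cpus_per_node}, GPUs per node: {gpus_per_node}")
print(f"Days for humanity to match one Summit-second: {days_for_humanity:.0f}")
print(f"Implied laptop speed: {laptop_flops / 1e9:.0f} gigaflops")
print(f"Implied video rate: {video_bytes_per_second / 1e6:.0f} MB/s")
```

Under those assumptions, the totals work out to 2 CPUs and 6 GPUs per node, and the 305-day figure follows directly from dividing 200 quadrillion calculations by the number of people on Earth.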

* Sierra is expected to come online later this month.


Michael Rosenfield

Vice President, Data Centric Solutions, IBM Research