
PowerAI: The World’s Fastest Deep Learning Solution Among Leading Enterprise Servers


Over the past several weeks, my IBM colleagues have written about our progress porting and optimizing popular deep learning frameworks for the most advanced platform for accelerated computing in the enterprise, the IBM S822LC for HPC.

Today I am pleased to announce another major milestone: the creation of the world’s fastest deep learning solution among leading enterprise servers. This offering pairs the new IBM PowerAI software toolkit with NVIDIA NVLink technology and GPU deep learning (GPUDL) libraries optimized for the IBM Power architecture. We call it PowerAI.

Foundations of PowerAI


PowerAI brings together a collection of the most popular open source frameworks for deep learning, along with supporting software and libraries, all in a single installable package. Our design goal was to simplify the acquisition, installation and system optimization required to bring up a deep learning infrastructure, allowing users to spend less time on implementation and more time training neural networks for results. More about those results soon.

At the core of the PowerAI solution is the Power Systems S822LC for High Performance Computing (HPC) server, incorporating two POWER8 CPUs, up to four NVIDIA Tesla P100 GPUs, and high-bandwidth NVLink connectivity across the system, tying GPU to GPU and GPU to CPU with multiple point-to-point connections.

This architecture is designed for the compute-intensive requirements of deep learning software, providing high-bandwidth connections between GPU and system memory and between GPUs. With PowerAI and NVIDIA NVLink, deep learning workloads can use this bandwidth to move large training data sets from system memory to GPU memory; the intended outcome is a shorter training cycle and the ability to train with larger data sets for improved accuracy.
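To make that data path concrete, the sketch below times a single host-to-GPU copy of a 1 GiB buffer. It is only an illustration under assumptions: it uses PyCUDA, which is not part of PowerAI, and the buffer simply stands in for a training batch. On an NVLink-attached Tesla P100 the measured bandwidth should come out well above what a PCIe-attached GPU can sustain.

```python
import time

import numpy as np
import pycuda.autoinit          # creates a CUDA context on the default GPU
import pycuda.driver as cuda

# Stage a 1 GiB "training batch" in page-locked (pinned) host memory.
host_batch = cuda.pagelocked_empty(1 << 28, dtype=np.float32)   # 2^28 floats = 1 GiB
dev_batch = cuda.mem_alloc(host_batch.nbytes)

# Time one synchronous host-to-device copy of the whole batch.
start = time.time()
cuda.memcpy_htod(dev_batch, host_batch)
elapsed = time.time() - start

print("Host-to-GPU copy: %.1f GB/s" % (host_batch.nbytes / elapsed / 1e9))
```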

Optimizations and industry exclusives

Working closely with IBM Research in Tokyo, the PowerAI development team has integrated several performance enhancements into one of these frameworks. These optimizations, packaged in the IBM-Caffe binary, leverage NVIDIA NVLink bandwidth and reduce some of the redundant data movement within this deep learning framework. This optimization, along with the increased performance of the NVIDIA Tesla P100 GPUs, enables a four-GPU S822LC for HPC system to outperform an eight-GPU Intel Broadwell-based system running the VGGNet workload on the Caffe framework by 24 percent.[1]
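From a user’s perspective, IBM-Caffe is driven the same way as stock Caffe. As a minimal, hedged sketch (the solver path is a placeholder, and multi-GPU data-parallel runs are normally launched through the caffe command-line tool rather than the Python interface), a single-GPU training run might look like this:

```python
import caffe

# Run on the first GPU; multi-GPU data-parallel training is typically
# launched with the command-line tool, e.g. `caffe train ... -gpu all`.
caffe.set_mode_gpu()
caffe.set_device(0)

# Placeholder solver definition -- point this at your own network and solver.
solver = caffe.SGDSolver('models/vggnet/solver.prototxt')
solver.solve()
```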

Chart: S822LC for HPC with 4 Tesla P100 GPUs is 24 percent faster than 8 Tesla M40 GPUs

We’re extremely excited about the promise of this optimization and look forward to seeing how our clients and partners incorporate it into their deep learning workflows.

The toolkit also leverages GPUDL libraries from the NVIDIA SDK, including the CUDA Deep Neural Network library (cuDNN), the CUDA Basic Linear Algebra Subroutines library (cuBLAS) and the NVIDIA Collective Communications Library (NCCL), to deliver multi-GPU acceleration and optimized performance on IBM servers.

Over time, we intend to explore additional optimizations and unique capabilities integrated into future releases of PowerAI.

Getting started with PowerAI

The PowerAI packages are available now, linked from our PowerAI landing page. They install on an S822LC for HPC server running Ubuntu 16.04, NVIDIA CUDA 8 and NVIDIA cuDNN 5.1. Building this infrastructure from scratch could easily take days; our design point is to have you up and running in an hour or less.
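Once the packages are installed, a quick sanity check is to confirm that the expected CUDA and cuDNN levels are actually visible to applications. The snippet below is a hedged example that assumes the CUDA 8 runtime and cuDNN 5.1 shared libraries sit on the default library path; the exact version numbers printed depend on your install.

```python
import ctypes

# Query the cuDNN version (expects libcudnn.so.5 from cuDNN 5.1).
libcudnn = ctypes.cdll.LoadLibrary("libcudnn.so.5")
print("cuDNN version:", libcudnn.cudnnGetVersion())        # e.g. 5110 for 5.1.10

# Query the CUDA runtime version (expects libcudart.so.8.0 from CUDA 8).
libcudart = ctypes.cdll.LoadLibrary("libcudart.so.8.0")
version = ctypes.c_int()
libcudart.cudaRuntimeGetVersion(ctypes.byref(version))
print("CUDA runtime version:", version.value)              # e.g. 8000 for 8.0
```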

If you would like to evaluate this solution in the cloud, we are excited to announce that IBM’s Power HPC cloud partner, Nimbix, has made the IBM Caffe framework available on their S822LC for HPC infrastructure as a service; instead of an hour, you could be training in minutes.

We’re truly excited about this offering and would welcome the chance to hear from you. As you and your organization get started with PowerAI, please share your results and comments.

[1] Test system: IBM S822LC, 20 cores at 2.86 GHz, 512 GB memory / 4 NVIDIA Tesla P100 GPUs / Ubuntu 16.04 / CUDA 8.0.44 / cuDNN 5.1 / IBM Caffe 1.0.0-rc3 / ImageNet data

Competitive system: Intel Broadwell E5-2640 v4, 20 cores at 2.6 GHz, 512 GB memory / 8 NVIDIA Tesla M40 GPUs / Ubuntu 16.04 / CUDA 8.0.44 / cuDNN 5.1 / BVLC Caffe 1.0.0-rc3 / ImageNet data

Offering Manager, High Performance Computing and Deep Learning, IBM Systems

4 Comments



TARUNDEEP KALRA

Will this be available as a service sometime later?


Scott Soutter

Hi Tarundeep, the IBM Caffe framework is available right now as a service on our HPC cloud partner Nimbix at https://power.jarvice.com. Thanks for your question!


Roland Barnes

Hi there Scott,
Please don’t take this the wrong way, as I absolutely LOVE Power technology!
But why aren’t you showing 4 x P100 GPUs running on both servers, including the x86 server?
It’s not an apples-to-apples comparison, is it?
Apologies if I’ve missed a technical issue whereby this wouldn’t be possible on x86 (over PCIe).
Regards, Roland


Scott Soutter

Hi Roland, thanks for the comment.

We recently (26 January) announced an update to PowerAI incorporating TensorFlow 0.12. As part of that release we did some performance profiling, and we found that our S822LC for HPC with 4x NVLink-connected Tesla P100 GPUs outperformed a similarly configured x86 system with PCIe-attached Tesla P100 GPUs by 30 percent.

This was measuring single system performance, running the Inception v3 benchmark on TensorFlow.
