It’s been an exciting year, and one that I’ll look back on with much satisfaction for many years to come. In the space of a year, we’ve created PowerAI, turning Deep Learning from a promise into a technology that’s ready to use. For many years, Artificial Neural Networks, the technology behind Deep Learning, have held out the promise of real cognitive computing. Realizing that promise, however, took many innovations: more sophisticated network architectures, as well as system innovations such as programmable inference accelerators, invented in 1994, and, around 2000, numeric accelerators to speed up neural network training.
What a difference a year makes! This time last year, we were debugging our future workhorse for Deep Learning on Power, a PCIe-based multi-GPU enclosure. A few months earlier, we had concluded that our trusty Nvidia K80 numeric accelerators were running out of steam. These accelerators had been designed for HPC workloads in national labs and were the best in the industry – but they were starting to fall short for the most sophisticated cognitive applications. The 16-GPU enclosures gave us the workhorse to build a mature Deep Learning environment on Power, ready for even the most advanced enterprise users.
But creating a stable hardware platform for Deep Learning users was only part of the challenge. The Deep Learning systems in use in early 2016 had originated in academic and industrial research labs and were built for technological prowess rather than ease of use and deployment. While rapid innovation is a good model for incubating new technology, it did not address the needs of technology adopters. To fill this gap, we created the Power Machine Learning and Deep Learning software kit – a distro for cognitive applications that we first released in April 2016.
To accelerate the advancement of Deep Learning, we also contributed to the community our enhancements that take advantage of Power’s innovations, and published build recipes for early adopters who want to build the very newest versions of DL frameworks from source. In parallel with our work on Deep Learning frameworks, we expanded and optimized the Power software ecosystem for Deep Learning: we engaged with the open source community to optimize and release mathematics libraries for Power, such as OpenBLAS, ATLAS, FFTW, and numpy, and IBM’s Toronto Lab released MASS, the Mathematical Acceleration Subsystem for Linux, to accelerate Deep Learning applications. In addition, we incubated efforts to port and optimize for Power the dynamic scripting languages used to develop cognitive applications, such as Julia, Lua, and Python.
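As a practical aside, users can verify at runtime whether their numpy build is actually linked against an optimized BLAS such as OpenBLAS. The sketch below is illustrative, not specific to the Power distro; the exact library names reported vary by build:

```python
import numpy as np

# Print the BLAS/LAPACK build configuration numpy was linked against;
# on a well-configured install this lists an optimized library such as OpenBLAS.
np.show_config()

# A quick sanity check that the linked BLAS produces correct results:
a = np.ones((2, 2))
b = np.dot(a, a)   # 2x2 matrix product, dispatched to the BLAS GEMM path
print(b[0, 0])     # each entry of ones(2,2) @ ones(2,2) is 2.0
```

A large `np.dot` on a multi-core machine is also a quick way to confirm the threaded BLAS is active, since an unoptimized reference BLAS will be dramatically slower.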
In parallel, we worked with our colleagues in the system architecture team to apply our experience with Deep Learning stacks to the design of the first enterprise server optimized for Deep Learning, announced in September. Building on our experience with up to 16 PCIe-connected GPUs and with K40, K80, and M40 GPUs, we concluded that Power offered the best host architecture for a multi-GPU solution, and that a smaller number of stronger GPUs would offer significant scalability benefits. Drawing on the ongoing work on coherent processor/accelerator interfaces and on the world’s fastest supercomputers, we also recognized the benefits of a coherent high-performance accelerator interconnect. With these guidelines as a basis, a system based on POWER8+ with up to four of the newest Nvidia P100 GPUs connected by CPU/GPU NVLink offered an ideal design point.
Like clockwork, all the pieces fell into place to finish 2016 with exciting highlights: when we released PowerAI, our Deep Learning distro, for the new DL-optimized enterprise server S822LC for HPC, the new server completed training tasks in under an hour that had previously taken days or weeks. So, looking back, 2016 was an amazing year: we built an astonishing three generations of hardware systems and released three generations of our Deep Learning software distro in about a year, transforming IT and making Deep Learning a reality for the enterprise!
But we’re not done – as we look ahead to 2017, we have many exciting new ideas for transforming the industry with cognitive computing innovations! Let us know how cognitive computing will transform your business, and let’s innovate together!
Dr. Michael Gschwind is Chief Engineer for Machine Learning and Deep Learning for IBM Systems where he leads the development of hardware/software integrated products for cognitive computing. During his career, Dr. Gschwind has been a technical leader for IBM’s key transformational initiatives, leading the development of the OpenPOWER Hardware Architecture as well as the software interfaces of the OpenPOWER Software Ecosystem. In previous assignments, he was a chief architect for Blue Gene, POWER8, POWER7, and Cell BE. Dr. Gschwind is a Fellow of the IEEE, an IBM Master Inventor and a Member of the IBM Academy of Technology.