May 10, 2017 | Written by: Sam Ponedal
Categorized: OpenPOWER | Power servers | Power Systems
Day two of NVIDIA’s GPU Technology Conference has wrapped, and the excitement around deep learning and PowerAI continues to grow. In a lot of ways, the evolution of GTC (from data-intensive computing and HPC, to accelerated computing, to deep learning) mirrors that of IBM.
IBM earned its pedigree in HPC, building supercomputers for decades that pushed the boundaries of computing. We’re still pushing those boundaries today with Summit and Sierra, the two clusters being built at Oak Ridge National Laboratory and Lawrence Livermore National Laboratory after IBM won two of the three opportunities in the US Department of Energy’s CORAL procurement. Now we are looking to build on that accelerated-computing DNA to be at the forefront of machine learning, deep learning and AI.
A big part of that effort is our S822LC for HPC server, which we fondly call “Minsky”. With the first-ever NVIDIA NVLink GPU-to-CPU interface and four NVIDIA Tesla P100 GPUs, Minsky outperforms a competitor’s x86 system by 2.5 times! This has proved to be an attractive option for innovators looking to solve problems they couldn’t solve before. And just like our showcase of CORAL at GTC in 2015, and our work with Kinetica to deliver real-time data analytics insights at GTC in 2016, this year the IBM booth showcased Minsky and what people are building with it and PowerAI.
Deep learning finds a home at the IBM booth
At the booth, some PowerAI users showed off how they are helping to bring PowerAI to others. Our cloud partner Nimbix showcased how they are providing access to Minsky in the cloud, and signed up people who are ready to start implementing deep learning for their free trial of PowerAI in the cloud.
For those who are interested in deep learning, but not sure how to use the open-source deep learning distributions, one of our PowerAI users, YayBench, was showing people how they could get started with their online deep learning education courses. YayBench uses the Nimbix cloud not only to give learners lessons but also to let them try their new skills on a real Minsky instance in the cloud. IBM works to make developers’ jobs easier with PowerAI, and we’re excited that some of PowerAI’s first users are adopting that mantra as well to help spread deep learning across organizations.
Charting the future of deep learning with PowerAI
The big news of the day came in the session led by Sumit Gupta, IBM Cognitive Systems VP of HPC, Analytics, and AI, where he laid out the roadmap of new features and updates for PowerAI. As Sumit explained to a packed room, up to 80 percent of a deep learning developer’s time can be spent tuning their data sets to match their chosen deep learning framework. Well, in case you hadn’t noticed, PowerAI is all about making things simple. To that end, Sumit revealed key new features to help:
- AI Vision: A custom application development tool aimed at computer vision workloads. AI Vision helps application developers with little or no experience with deep learning to build a trained deep learning model for different input data sets.
- Apache Spark-based data extraction, transformation and preparation tool: We enhanced IBM Spectrum Conductor with Spark with a GUI-based set of tools that help the data scientist to create functions that transform an input data set to the format required by frameworks like TensorFlow or Caffe, making it a cinch to match your data set to your framework. For example, you can have a transform function that resizes images for Caffe, and then simply transform and load any input data set into the right format to use with Caffe. IBM Spectrum Conductor with Spark automatically launches a whole set of Spark jobs on the cluster, each of which resizes a portion of the input data set.
- DL Insight: Model tuning software that automatically tunes hyper-parameters for models based on input data sets using Spark-based distributed computing. We enhanced IBM Spectrum Conductor with Spark so that it automatically launches multiple model training runs with different hyper-parameters using a subset of the data. It then monitors the training progress and searches and identifies the best hyper-parameters using several different search methods such as random and Bayesian search. To improve usability, DL Insight comes with a powerful and intuitive GUI that visualizes training and provides continuous feedback to quickly create and optimize deep learning models.
- Distributed deep learning: To accelerate training time, we are adding methods to scale a single training job across a cluster of servers. We have both an MPI-based scaling approach, inspired by high-performance computing methods, and a Spark and HPC-converged distributed computing model, for clusters with either an Ethernet or InfiniBand network.
You can read more about these on Sumit’s LinkedIn post detailing the changes.
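To make the hyper-parameter tuning idea a little more concrete, here is a minimal sketch of random search in plain Python. It is not DL Insight's implementation and uses none of its APIs: the toy linear model, the synthetic data, and every function name (`make_data`, `train`, `random_search`) are assumptions made up for illustration. It also runs the trials one after another on a single machine, where the feature described above would launch them as Spark jobs across a cluster. What it does show is the general pattern: train on a subset of the data with randomly drawn hyper-parameters, score each run on held-out data, and keep the best.

```python
import random


def make_data(n, rng):
    # Synthetic data for illustration only: y = 3x + 1 plus a little noise.
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.05)))
    return data


def mse(w, b, data):
    # Mean squared error of the linear model y = w*x + b on the given data.
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(data, learning_rate, epochs):
    # Full-batch gradient descent on a one-feature linear model.
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def random_search(train_subset, validation, trials, rng):
    # Draw hyper-parameters at random, train on the subset, keep the best run.
    best = None
    for _ in range(trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-3, -0.5),  # log-uniform draw
            "epochs": rng.randint(10, 100),
        }
        w, b = train(train_subset, **params)
        loss = mse(w, b, validation)
        if best is None or loss < best["loss"]:
            best = {"loss": loss, "params": params, "model": (w, b)}
    return best


rng = random.Random(0)                       # fixed seed for repeatability
data = make_data(500, rng)
subset, validation = data[:100], data[400:]  # small training subset, held-out set
best = random_search(subset, validation, trials=20, rng=rng)
baseline = mse(0.0, 0.0, validation)         # loss of the untrained model
print(best["params"], best["loss"], baseline)
```

Because each trial is independent of the others, this kind of search parallelizes naturally, which is why farming the runs out to a cluster is a good fit. A Bayesian search would instead use the results of earlier trials to choose the next set of hyper-parameters.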
That’s all for today’s dispatch. Tune in tomorrow for our third and final post from GTC. Let us know what you’re excited about at GTC in the comments, and don’t forget to stop by our booth and say hello!
Read our post recapping day one here.