PowerAI started as a packaged distribution of many of the major deep learning frameworks for model training, such as TensorFlow, Caffe, Torch, and Theano, together with associated libraries such as cuDNN. The PowerAI software has always been optimized for performance on the NVLink-based Power server, the IBM Power 822LC for HPC. The AI stack foundation starts with the right hardware: servers with accelerators and the right storage. GPU accelerators are extremely well suited to the compute-intensive nature of deep learning training, and servers with the highest CPU-to-GPU bandwidth, like IBM’s NVLink server, enable the high-performance data transfer required for larger and more complex deep learning models. But all of this starts with getting the right data.
IBM has announced a new set of PowerAI tools to make deep learning easier to use for data scientists. The four major additions are as follows:
AI Vision: A custom application development tool aimed at computer vision workloads. AI Vision enables application developers with little or no experience with deep learning to build a trained deep learning model for different input data sets.
Apache Spark-based Data Extraction, Transformation and Preparation tool: We enhanced IBM Spectrum Conductor with Spark (https://ibm.co/2s5xZwR) by adding a GUI-based set of tools that enable data scientists to create functions that transform an input data set into the format required by frameworks like TensorFlow or Caffe, making it a cinch to match data sets to frameworks. For example, clients can define a transform function that resizes images for Caffe, and then simply transform and load any input data set into the right format to use with Caffe. IBM Spectrum Conductor with Spark automatically launches a set of Spark jobs on the cluster, each of which resizes a portion of the input data set.
DL Insight: Model tuning software that automatically tunes hyper-parameters for models based on input data sets using Spark-based distributed computing. We enhanced IBM Spectrum Conductor with Spark (https://ibm.co/2s5xZwR) so that it automatically launches multiple model training runs with different hyper-parameters using a subset of the data. It then monitors the training progress and searches for and identifies the best hyper-parameters using several different search methods, such as random and Bayesian search. To improve usability, DL Insight comes with a powerful and intuitive GUI that visualizes training and provides continuous feedback to quickly create and optimize deep learning models.
Distributed Deep Learning: To reduce training time, we are adding methods to scale a single training job across a cluster of servers. We have both an MPI-based scaling approach, inspired by high-performance computing methods, and a converged Spark and HPC distributed computing model, for clusters with either an Ethernet or InfiniBand network.
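As a rough illustration of the transform-and-load flow described for the data preparation tool, the sketch below splits an input data set into portions and applies a resize transform to each one. It is plain Python, not the IBM Spectrum Conductor with Spark API: the names resize_nearest, partition, and transform_dataset are hypothetical, images are modeled as nested lists of pixel values, and a sequential loop stands in for the Spark jobs that would process the portions in parallel on a cluster.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale image as rows of pixel values (assumption)


def resize_nearest(image: Image, width: int, height: int) -> Image:
    """Resize an image to width x height using nearest-neighbor sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def partition(items: list, parts: int) -> List[list]:
    """Split items into at most `parts` contiguous chunks, preserving order."""
    size = max(1, -(-len(items) // parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def transform_dataset(
    images: List[Image], transform: Callable[[Image], Image], parts: int = 4
) -> List[Image]:
    """Apply `transform` to every image, one portion of the data set at a time.

    Each chunk stands in for the work of one hypothetical cluster job.
    """
    result: List[Image] = []
    for chunk in partition(images, parts):
        result.extend(transform(img) for img in chunk)
    return result
```

Because the transform is passed in as a function, the same loading step could be reused for a different framework by swapping in a different transform.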
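The hyper-parameter search that DL Insight automates can be pictured with a minimal random search, one of the two search methods named above. This is a sketch under stated assumptions, not DL Insight's implementation: sample_hyperparameters, random_search, and toy_score are hypothetical names, the search space is invented for illustration, and toy_score is a stand-in for a short training run on a subset of the data.

```python
import math
import random


def sample_hyperparameters(rng: random.Random) -> dict:
    """Draw one candidate configuration from an illustrative search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform sample
        "batch_size": rng.choice([32, 64, 128, 256]),
    }


def random_search(train_and_score, trials: int, seed: int = 0):
    """Run `trials` scored runs and keep the best-scoring configuration."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(trials):
        params = sample_hyperparameters(rng)
        score = train_and_score(params)  # higher is better
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params: dict) -> float:
    """Stand-in for a training run: the score peaks at learning_rate = 1e-2."""
    return -abs(math.log10(params["learning_rate"]) + 2)


best_params, best_score = random_search(toy_score, trials=50)
```

In a real setting each call to the scoring function would be an independent training run, which is why the runs can be launched in parallel across a cluster.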
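The description of Distributed Deep Learning does not detail how either scaling approach works. One common pattern in MPI-style data-parallel training, shown here purely as an assumption, is for each worker to compute a gradient on its own shard of the data and then average the gradients across workers (an allreduce) before every update. The sketch simulates that pattern in a single Python process for a one-parameter linear model; local_gradient, allreduce_mean, and train are hypothetical names, and no MPI or Spark library is used.

```python
from typing import List, Tuple

Shard = List[Tuple[float, float]]  # (x, y) pairs held by one simulated worker


def local_gradient(w: float, shard: Shard) -> float:
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def allreduce_mean(values: List[float]) -> float:
    """Average one value per worker; stands in for an MPI allreduce."""
    return sum(values) / len(values)


def train(shards: List[Shard], steps: int = 100, lr: float = 0.01) -> float:
    """Synchronous data-parallel gradient descent over the simulated workers."""
    w = 0.0
    for _ in range(steps):
        grads = [local_gradient(w, shard) for shard in shards]  # one per worker
        w -= lr * allreduce_mean(grads)  # every worker applies the same update
    return w


# Eight points on the line y = 3x, split evenly across four simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i:i + 2] for i in range(0, len(data), 2)]
weight = train(shards)  # converges toward 3.0, the slope used to build the data
```

With equally sized shards, the averaged gradient equals the gradient over the full data set, so the result matches single-machine training while the per-step work is divided among the workers.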
For more information on these new PowerAI additions, see Sumit Gupta's blog: https://www.linkedin.com/pulse/new-powerai-developer-tools-make-deep-learning-easier-sumit-gupta.
For more information about IBM PowerAI, refer to the following website: https://www.ibm.com/us-en/marketplace/deep-learning-platform.
For more information about IBM Redbooks and the upcoming IBM PowerAI - Deep Learning residency, refer to the following website: http://www.ibm.com/redbooks.