Technical Blog Post
Abstract
Deep Learning on OpenPOWER: Building Caffe on OpenPOWER Linux Systems
Body
The Machine Learning and Deep Learning project in IBM Systems is a broad effort to build a co-optimized stack of hardware and software that makes IBM Power Systems the best platform to develop and deploy cognitive applications. As part of this project, IBM has developed new processors, systems, and a software stack optimized for AI applications.
The first offerings for this new era of cognitive computing are the S822LC, our first server designed from the ground up for cognitive computing, and the PowerAI distribution of AI tools and libraries for the Ubuntu and Red Hat Linux operating systems. Most data scientists and AI practitioners building cognitive solutions prefer to use the pre-built, pre-optimized deep learning frameworks of the PowerAI distribution.
In addition to creating the binary distribution of DL frameworks, we have also been working with the open source community so that the open source frameworks can be built directly from their repositories, letting Deep Learning users harness the power of the OpenPOWER ecosystem. With the introduction of little-endian OpenPOWER Linux, installing open source applications on Power has never been easier.
If you need to build Caffe from source, this blog provides instructions on building Caffe on (little-endian) OpenPOWER Linux distributions, such as Red Hat Enterprise Linux 7.1, SUSE Linux Enterprise Server 12, Ubuntu 14.04, and subsequent releases. Caffe may be built with support for CUDA 7.5 or CUDA 8 to exploit Nvidia numerical accelerators, such as the four Pascal P100 accelerators in the S822LC for HPC systems, or PCIe-attached accelerators in conjunction with POWER8™ systems. While this blog describes building a specific version of Caffe using a specific version of CUDA, the instructions may be adapted to build other versions of Caffe with the corresponding version of the CUDA runtime.
Caffe uses the BLAS basic linear algebra library interfaces. You may use a BLAS library already installed on your system, or build a new BLAS library based on ATLAS or OpenBLAS optimized for your POWER8™ system. For more instructions on building OpenBLAS on your POWER8™ Linux system, see my blog on Building Optimized Libraries for Deep Learning on OpenPOWER.
Installing prerequisites for building Caffe
To build Caffe, you will need at a minimum the following Linux packages:
liblapack-dev
libboost-all-dev
libopencv-dev
nvidia-cuda-toolkit
cuda
cmake
protobuf-compiler
libprotobuf-dev
libprotoc-dev
libatlas-base-dev
libhdf5-serial-dev
liblmdb-dev
libleveldb-dev
libsnappy-dev
libgoogle-glog-dev
libgflags-dev
build-essential
doxygen
g++
python-numpy
Most of these can be installed with the system installer directly from your Linux distribution, e.g., with the following command under Ubuntu:
$ sudo apt-get install -y liblapack-dev \
libboost-all-dev \
libopencv-dev \
cmake \
protobuf-compiler \
libprotobuf-dev \
libprotoc-dev \
libatlas-base-dev \
libhdf5-serial-dev \
liblmdb-dev \
libleveldb-dev \
libsnappy-dev \
libgoogle-glog-dev \
libgflags-dev \
build-essential \
doxygen \
g++ \
python-numpy
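Before running the installer, it can help to see which of the prerequisites are still missing. The following is a minimal sketch for Debian-based distributions such as Ubuntu. The helper names is_installed and missing_packages are illustrative and not part of any package, and the check relies on the status string reported by dpkg-query.

```shell
#!/bin/sh
# Illustrative helpers (names are assumptions, not part of Caffe or Ubuntu).

# is_installed PKG: succeeds if dpkg reports PKG as fully installed.
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q "install ok installed"
}

# missing_packages PKG...: print every named package that is not installed.
missing_packages() {
    for pkg in "$@"; do
        is_installed "$pkg" || echo "$pkg"
    done
}
```

With these functions defined in your shell, the following prints the subset of the named packages that still needs to be installed, one name per line:
$ missing_packages cmake protobuf-compiler libhdf5-serial-dev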
Installing CUDA
If you plan to build Caffe with support for Nvidia numerical accelerators, you must install CUDA before building Caffe. You can build Caffe with either CUDA 7.5 or CUDA 8. If CUDA is not already installed on your system, follow the instructions below for installing either CUDA 7.5 or CUDA 8.
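One way to check whether a CUDA toolkit is already installed, and which release it is, is to query the nvcc compiler driver. The sketch below assumes nvcc is on your PATH when CUDA is installed; the helper name cuda_release is illustrative. It extracts the release number from the text that nvcc --version prints.

```shell
#!/bin/sh
# cuda_release: read `nvcc --version` output on stdin and print the
# release number (for example 7.5 or 8.0). Prints nothing if none is found.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA release: $(nvcc --version | cuda_release)"
else
    echo "nvcc not found on PATH; CUDA may not be installed"
fi
```

A missing nvcc does not prove that CUDA is absent, since the toolkit may be installed without its bin directory on PATH.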
Installing CUDA 7.5
If CUDA 7.5 is already installed on your OpenPOWER system, you can skip to the next step, installing cuDNN 5.1. To install CUDA 7.5, download the CUDA distribution from https://developer.nvidia.com/cuda-downloads and follow the installation instructions.
For example, on Ubuntu 14.04, proceed as follows on the download page:
• Select Operating System: Linux
• Select Architecture: ppc64le
• Select Distribution: Ubuntu
• Select Version: 14.04
• Select the Installer Type that best fits your needs
• Follow the Linux installation instructions in the CUDA Quick Start Guide linked from the download page, including the steps describing how to set up the CUDA development environment by updating PATH and LD_LIBRARY_PATH.
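The environment setup mentioned in the last step typically amounts to two exports. This sketch assumes the toolkit was installed in the default location /usr/local/cuda-7.5; adjust CUDA_HOME (a variable name chosen here for illustration) if your installation lives elsewhere.

```shell
#!/bin/sh
# Assumption: default install prefix of the CUDA 7.5 packages.
CUDA_HOME=/usr/local/cuda-7.5

# Prepend the CUDA tools and libraries, keeping any existing entries.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Adding these lines to your shell startup file (for example ~/.bashrc) makes the setting persistent. The same pattern should apply to CUDA 8 with the corresponding install prefix.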
If cuDNN v5.1 is already installed on your OpenPOWER system, you can skip to the next step, building Caffe. Download NVIDIA cuDNN 5.1 for CUDA 7.5 Power8 from https://developer.nvidia.com/cudnn and follow Nvidia’s installation instructions. Registration in NVIDIA’s Accelerated Computing Developer Program is required to download cuDNN.
On Ubuntu 14.04, download the following packages:
- cuDNN v5.1 Runtime Library for Power8(Deb)
- cuDNN v5.1 Developer Library for Power8(Deb)
- cuDNN v5.1 Code Samples and User Guide Power8(Deb)
and install as follows:
$ sudo dpkg -i libcudnn5*deb
Installing CUDA 8
If CUDA 8 is already installed on your OpenPOWER system, you can skip to the next step, installing cuDNN 5.1. To install CUDA 8, download the CUDA distribution from https://developer.nvidia.com/cuda-downloads and follow the installation instructions.
For example, on Ubuntu 16.04, proceed as follows on the download page:
• Select Operating System: Linux
• Select Architecture: ppc64le
• Select Distribution: Ubuntu
• Select Version: 16.04
• Select the Installer Type that best fits your needs
• Follow the Linux installation instructions in the CUDA Quick Start Guide linked from the download page, including the steps describing how to set up the CUDA development environment by updating PATH and LD_LIBRARY_PATH.
If cuDNN v5.1 is already installed on your OpenPOWER system, you can skip to the next step, building Caffe. Download NVIDIA cuDNN 5.1 for CUDA 8 on POWER8 from https://developer.nvidia.com/cudnn and follow Nvidia’s installation instructions. Registration in NVIDIA’s Accelerated Computing Developer Program is required to download cuDNN.
On Ubuntu 16.04, download "cuDNN v5.1 Library for Power8" and install it as follows, substituting the path to the downloaded library archive for download-path (if /usr/local is not writable by your user, run the tar command with sudo):
$ cd /usr/local
$ tar -xvzf download-path/cudnn-8.0-linux-ppc64le-v5.1.tgz
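To confirm which cuDNN version ended up on the system, you can read the version macros from the cudnn.h header. This is a sketch only: the helper name cudnn_version is illustrative, and the location of cudnn.h depends on how the archive unpacked, so the header path is passed as an argument rather than hard-coded.

```shell
#!/bin/sh
# cudnn_version HEADER: print MAJOR.MINOR.PATCHLEVEL as defined in the
# given cudnn.h; exits non-zero if the macros are not found.
cudnn_version() {
    awk '
        /^#define CUDNN_MAJOR/      { major = $3 }
        /^#define CUDNN_MINOR/      { minor = $3 }
        /^#define CUDNN_PATCHLEVEL/ { patch = $3 }
        END {
            if (major == "") exit 1
            print major "." minor "." patch
        }
    ' "$1"
}
```

For example, locate the header first and then pass the path that find reports:
$ find /usr/local/cuda -name cudnn.h
$ cudnn_version path-to/cudnn.h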
Building Caffe
Caffe can be built either from the "master" repository maintained by the Berkeley Vision and Learning Center (henceforth "BVLC Caffe"), or from a fork maintained by Nvidia which contains additional optimizations for GPUs (henceforth "NV Caffe"). BVLC Caffe is the more recent code base, and some models that work with BVLC Caffe may not work with the Nvidia fork. On the other hand, NV Caffe typically offers higher performance when training models using Nvidia numerical accelerators. Both Caffe versions are built the same way on OpenPOWER Linux; the only difference is the commands used to check out one version or the other. If you want to build both versions, you can check out and build BVLC Caffe and NV Caffe in separate directories.
Checking out BVLC Caffe from GitHub
To check out BVLC Caffe from the repository use the following commands. (We currently use the IBM repository for BVLC Caffe on Power as we work with the BVLC Caffe maintainers to integrate code enhancements to support Power into the master Caffe repository.)
$ git clone https://github.com/ibmsoe/caffe
$ cd caffe
$ git checkout rc3-bvlc-ppc
At this point, you can proceed to build with either cmake or make. Make requires editing a configuration file but offers more control; cmake is the quicker route.
Checking out NV Caffe from GitHub
To check out NV Caffe from the repository use the following commands. (We currently use the IBM repository for NV Caffe on Power as we work with the NV Caffe maintainers to integrate code enhancements to support Power into the master Caffe repository.)
$ git clone https://github.com/ibmsoe/caffe
$ cd caffe
$ git checkout v0.14.5-nv-ppc
At this point, you can proceed to build with either cmake or make. Make requires editing a configuration file but offers more control; cmake is the quicker route.
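If you want both versions side by side, one approach is to clone the repository twice into differently named directories. The sketch below wraps the clone and checkout commands from the two sections above in a small function. The function name checkout_caffe, the CAFFE_REPO variable, and the directory names caffe-bvlc and caffe-nv are arbitrary choices for illustration.

```shell
#!/bin/sh
# Repository used in this post; override CAFFE_REPO to clone from elsewhere.
CAFFE_REPO="${CAFFE_REPO:-https://github.com/ibmsoe/caffe}"

# checkout_caffe DIR REF: clone the repository into DIR and check out REF.
checkout_caffe() {
    git clone "$CAFFE_REPO" "$1" && git -C "$1" checkout "$2"
}
```

The two versions can then be fetched into separate directories:
$ checkout_caffe caffe-bvlc rc3-bvlc-ppc
$ checkout_caffe caffe-nv v0.14.5-nv-ppc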
Building Caffe with cmake
If you are looking for more control in building Caffe, proceed to the section on Building Caffe with make. To build Caffe with cmake, create a build subdirectory in the caffe directory and then build Caffe with the following commands:
$ mkdir build
$ cd build
$ cmake ..
$ make all
$ make runtest
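On multi-core POWER8 systems, the build usually finishes sooner when make runs several jobs in parallel. The sketch below picks a job count from the number of online CPUs. The helper name build_jobs is illustrative, and nproc is assumed to come from GNU coreutils, with getconf as a fallback.

```shell
#!/bin/sh
# build_jobs: print the number of online CPUs, or 1 if it cannot be determined.
build_jobs() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}
```

The result can be passed to the -j option of make in place of the plain make all step:
$ make -j"$(build_jobs)" all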
Building Caffe with make
In the caffe directory, copy Makefile.config.example to Makefile.config and then edit Makefile.config to reflect your preferences:
$ cp Makefile.config.example Makefile.config
$ vi Makefile.config
Most options can be selected by removing the comment character # at the beginning of the line that gives a configuration switch a particular value. For example, to build with cuDNN, uncomment the line setting the USE_CUDNN variable to 1:
# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1
The possible flags and their meaning are described in comments above the commented switch value. By convention, for on/off switches, “1” reflects a setting being enabled, and “0” reflects a disabled setting.
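Uncommenting a switch can also be scripted when you need a repeatable configuration. This is a sketch assuming GNU sed (for the -i option) and a switch that appears in Makefile.config as a commented line of the form "# NAME := 1"; the function name enable_switch is illustrative.

```shell
#!/bin/sh
# enable_switch FILE NAME: uncomment the line "# NAME := 1" in FILE.
# Lines that are already uncommented, or that set other variables,
# are left alone.
enable_switch() {
    sed -i "s/^#[[:space:]]*\($2[[:space:]]*:=[[:space:]]*1\)[[:space:]]*\$/\1/" "$1"
}
```

For example, to enable cuDNN in the copied configuration file:
$ enable_switch Makefile.config USE_CUDNN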
Depending on your Linux version and the location of your header and library files, you may have to add paths to the INCLUDE_DIRS and LIBRARY_DIRS variables around line 102 of Makefile.config. For example, for Ubuntu 16.04, you may need to specify the location of the HDF5 header and library files as follows:
# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/powerpc64le-linux-gnu/hdf5/serial
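If you are unsure where your distribution keeps the HDF5 files, a quick search can produce the directories to add. The sketch below assumes find is available and that the header is named hdf5.h; the helper name dirs_containing is illustrative.

```shell
#!/bin/sh
# dirs_containing ROOT NAME: print each directory under ROOT that contains
# an entry called NAME, one directory per line.
dirs_containing() {
    find "$1" -name "$2" 2>/dev/null | while IFS= read -r path; do
        dirname "$path"
    done
}
```

The directories reported for the header and for the library are candidates for INCLUDE_DIRS and LIBRARY_DIRS respectively:
$ dirs_containing /usr/include hdf5.h
$ dirs_containing /usr/lib libhdf5.so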
After updating Makefile.config, build Caffe with the following command sequence:
$ make
$ make runtest
See what you can do with Caffe on OpenPOWER
A variety of GPU numerical accelerator configurations can be used to accelerate Caffe on OpenPOWER systems, including the recently announced IBM Power Systems S822LC for High Performance Computing server. You can learn more about these systems, and order them, by contacting your IBM Business Partner.
IBM invites GPU software developers to join the IBM-NVIDIA Acceleration Lab to be among the first to try these systems and see the benefits of the Tesla P100 GPU accelerator and the high-speed NVLink connection to the IBM POWER8 CPU.
I look forward to hearing about the performance you get from these systems. Share how you want to use Caffe on OpenPOWER and how Deep Learning on OpenPOWER will enable you to build the next generation of cognitive applications by posting in the comments section below.
Dr. Michael Gschwind is Chief Engineer for Machine Learning and Deep Learning for IBM Systems where he leads the development of hardware/software integrated products for cognitive computing. During his career, Dr. Gschwind has been a technical leader for IBM’s key transformational initiatives, leading the development of the OpenPOWER Hardware Architecture as well as the software interfaces of the OpenPOWER Software Ecosystem. In previous assignments, he was a chief architect for Blue Gene, POWER8, POWER7, and Cell BE. Dr. Gschwind is a Fellow of the IEEE, an IBM Master Inventor and a Member of the IBM Academy of Technology.
