IBM Support

Deep Learning on OpenPOWER: Building Torch on OpenPOWER Linux Systems

Technical Blog Post



 

The Machine Learning and Deep Learning project in IBM Systems is a broad effort to build a co-optimized stack of hardware and software that makes IBM Power Systems the best platform for developing and deploying cognitive applications.  As part of this project, IBM has developed new processors, systems, and a software stack uniquely optimized for AI applications.

The first offerings for this new era of cognitive computing are the S822LC, our first server designed from the ground up for cognitive computing, and the PowerAI distribution of AI tools and libraries for the Ubuntu and Red Hat Linux operating systems.   Most data scientists and AI practitioners building cognitive solutions prefer to use the pre-built, pre-optimized deep learning frameworks of the PowerAI distribution.

In addition to creating the binary distribution of deep learning frameworks, we have also been working with the open source community to enable the frameworks to be built directly from their repositories, so that deep learning users can harness the power of the OpenPOWER ecosystem.    With the introduction of little-endian OpenPOWER Linux, installing open source applications on Power has never been easier.

If you need to build Torch from source, this blog provides instructions for building Torch on (little-endian) OpenPOWER Linux, such as Red Hat Enterprise Linux 7.1, SUSE Linux Enterprise Server 12, Ubuntu 14.04, and subsequent releases.  Torch may be built with support for CUDA 7.5 or CUDA 8 to exploit NVIDIA numerical accelerators, such as the Pascal P100 accelerators in the S822LC for HPC system (which includes four P100 accelerators), or PCIe-attached accelerators used in conjunction with POWER8™ systems.

 

Torch uses the BLAS basic linear algebra library interfaces.  We recommend building a BLAS library based on OpenBLAS, optimized for your POWER8™ system.  The build scripts used in this tutorial include instructions to build OpenBLAS for use with Torch.

 

Prerequisites for building Torch

To build Torch, you will need at a minimum the following Linux packages:

cmake

curl

fftw-devel

gcc-c++

gcc-gfortran

git

gnuplot

GraphicsMagick-devel

ImageMagick

ipython / python-ipython

libfftw3-dev

libgraphicsmagick1-dev

libjpeg-turbo-devel

libpng-devel

libsox-dev

libsox-fmt-all

make

ncurses-devel

python-software-properties

qt-devel

qtwebkit-devel

readline-devel

software-properties-common (Ubuntu)

sox

sox-devel

unzip

zeromq / czmq-devel, czmq

 

These can be installed with a dependency installation script that is part of the Torch distribution, as described below.
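The list above mixes RPM and Debian package names. On an RPM-based distribution such as Red Hat Enterprise Linux, the prerequisites can also be installed directly; a minimal sketch using the yum package names from the list:

```shell
# Install the Torch build prerequisites (RHEL-family package names from the list above)
sudo yum install -y cmake curl fftw-devel gcc-c++ gcc-gfortran git gnuplot \
    GraphicsMagick-devel ImageMagick ipython libjpeg-turbo-devel libpng-devel \
    make ncurses-devel qt-devel qtwebkit-devel readline-devel sox sox-devel \
    unzip czmq-devel
```

On Debian-based distributions, use apt-get with the corresponding deb package names from the list (for example libfftw3-dev, libgraphicsmagick1-dev, libsox-dev).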

 

 
 

Building Torch with CUDA 7.5

If CUDA 7.5 is already installed on your OpenPOWER system, you can skip to the next step, installing cuDNN 5.1. To install CUDA 7.5, download the CUDA distribution from

https://developer.nvidia.com/cuda-downloads and follow the installation instructions.


For example, on Ubuntu 14.04, this is performed as follows:

Download and install NVIDIA CUDA 7.5 from

https://developer.nvidia.com/cuda-downloads

• Select Operating System: Linux

• Select Architecture: ppc64le

• Select Distribution: Ubuntu

• Select Version: 14.04

• Select the Installer Type that best fits your needs

• Follow the Linux installation instructions in the CUDA Quick Start Guide linked from the download page, including the steps describing how to set up the CUDA development environment by updating PATH and LD_LIBRARY_PATH.
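The environment update mentioned in the last step typically looks like the following, assuming the default CUDA 7.5 installation prefix of /usr/local/cuda-7.5:

```shell
# Make the CUDA 7.5 toolchain and runtime libraries visible to the build
export PATH=/usr/local/cuda-7.5/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
```

Append these lines to ~/.bashrc (or the equivalent shell profile) so the settings persist across sessions.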

 

If cuDNN v5.1 is already installed on your OpenPOWER system, you can skip to the next step, building Torch.  Download NVIDIA cuDNN 5.1 for CUDA 7.5 on POWER8 from https://developer.nvidia.com/cudnn and follow NVIDIA’s installation instructions.  Registration in NVIDIA’s Accelerated Computing Developer Program is required to download cuDNN.

 

On Ubuntu 14.04, download the following:

  • cuDNN v5.1 Runtime Library for Power8 (Deb)
  • cuDNN v5.1 Developer Library for Power8 (Deb)
  • cuDNN v5.1 Code Samples and User Guide for Power8 (Deb)

and install as follows:


$ sudo dpkg -i libcudnn5*deb

 

 

Building Torch with CUDA 8

If CUDA 8 is already installed on your OpenPOWER system, you can skip to the next step, installing cuDNN 5.1. To install CUDA 8, download the CUDA distribution from

https://developer.nvidia.com/cuda-downloads and follow the installation instructions.

 

For example, on Ubuntu 16.04, this is performed as follows:

Download and install NVIDIA CUDA 8 from https://developer.nvidia.com/cuda-downloads

• Select Operating System: Linux

• Select Architecture: ppc64le

• Select Distribution: Ubuntu

• Select Version: 16.04

• Select the Installer Type that best fits your needs

• Follow the Linux installation instructions in the CUDA Quick Start Guide linked from the download page, including the steps describing how to set up the CUDA development environment by updating PATH and LD_LIBRARY_PATH.

 

If cuDNN v5.1 is already installed on your OpenPOWER system, you can skip to the next step, building Torch.  Download NVIDIA cuDNN 5.1 for CUDA 8 on POWER8 from https://developer.nvidia.com/cudnn and follow Nvidia’s installation instructions.  Registration in NVIDIA’s Accelerated Computing Developer Program is required to download cuDNN.

 

On Ubuntu 16.04, download the following:

• cuDNN v5.1 Library for Power8

and install as follows (substituting the path to the downloaded library archive for download-path):

 

$ cd /usr/local

$ tar -xvzf download-path/cudnn-8.0-linux-ppc64le-v5.1.tgz
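The archive unpacks into a cuda/ subtree under /usr/local, so the header and libraries end up in the standard CUDA locations. A quick sanity check, assuming that layout:

```shell
# Confirm the cuDNN header and libraries landed under /usr/local/cuda
ls /usr/local/cuda/include/cudnn.h
ls /usr/local/cuda/lib64/libcudnn*
# Extend the loader path for the current session if the CUDA library
# directory is not already covered by your environment setup
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
```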

 

 

Building Torch 7.0

To build Torch, start by checking out the Torch source from the repository.  (We currently use a separate repository for Torch on Power while we work with the Torch maintainers to integrate the Power support enhancements into the master Torch repository.):

$ git clone https://github.com/PPC64/torch-distro

$ cd torch-distro

 

At this point, update the install-deps script to build the OpenBLAS library with POWER8 optimizations starting around line 15 by adding TARGET=POWER8:

if [ $(getconf _NPROCESSORS_ONLN) == 1 ]; then

  make NO_AFFINITY=1 USE_OPENMP=0 USE_THREAD=0 TARGET=POWER8

else

  make NO_AFFINITY=1 USE_OPENMP=1 TARGET=POWER8

fi
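If you prefer to script this change rather than edit the file by hand, the same edit can be applied with sed. This is a sketch that assumes the two make invocations appear exactly as in the stock install-deps script shown above:

```shell
# Append TARGET=POWER8 to the two OpenBLAS make invocations in install-deps
sed -i \
  -e 's/^\( *make NO_AFFINITY=1 USE_OPENMP=0 USE_THREAD=0\) *$/\1 TARGET=POWER8/' \
  -e 's/^\( *make NO_AFFINITY=1 USE_OPENMP=1\) *$/\1 TARGET=POWER8/' \
  install-deps

# Confirm both lines were updated (should print 2)
grep -c 'TARGET=POWER8' install-deps
```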

 

Optionally, enable MASS as described in Building Optimized Libraries for Deep Learning on OpenPOWER Linux Systems by adding the USE_MASS=1 option starting around line 15 of the install-deps script:

if [ $(getconf _NPROCESSORS_ONLN) == 1 ]; then

  make NO_AFFINITY=1 USE_OPENMP=0 USE_THREAD=0 USE_MASS=1 TARGET=POWER8

else

  make NO_AFFINITY=1 USE_OPENMP=1 USE_MASS=1 TARGET=POWER8

fi

 

Torch depends on the Lua scripting language.  To install both Lua and Torch, invoke the build script:

$ ./install.sh

 

By default, Torch will install LuaJIT 2.1. If you want to build with another version of Lua, clean the build directory with ./clean.sh and invoke the install.sh script with an alternate Lua version selected as follows:

$ TORCH_LUA_VERSION=LUA51 ./install.sh

or

$ TORCH_LUA_VERSION=LUA52 ./install.sh
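Once install.sh finishes, you can sanity-check the build. This sketch assumes the default install prefix (the install/ directory inside the torch-distro checkout) and the torch-activate helper the installer creates there:

```shell
# Load the Torch environment into the current shell session
. ./install/bin/torch-activate

# Run a one-line script to confirm the th interpreter and the torch package work
th -e "print(torch.Tensor(3, 3):zero())"
```

If the installer added the torch-activate line to your shell profile, new shell sessions will pick up the environment automatically.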

 

 

See what you can do with Torch on OpenPOWER

A variety of GPU numerical accelerator configurations can be used to accelerate Torch on OpenPOWER systems, including the new IBM Power Systems S822LC for High Performance Computing server.  You can learn more about and order these systems by contacting your IBM Business Partner.  

 

IBM invites GPU software developers to join the IBM-NVIDIA Acceleration Lab to be among the first to try these systems and see the benefits of the Tesla P100 GPU accelerator and the high-speed NVLink connection to the IBM POWER8 CPU.  

 

I look forward to hearing about the performance you get from these systems. Share how you want to use Torch on OpenPOWER and how Deep Learning on OpenPOWER will enable you to build the next generation of cognitive applications by posting in the comments section below.

 

 



Dr. Michael Gschwind is Chief Engineer for Machine Learning and Deep Learning for IBM Systems where he leads the development of hardware/software integrated products for cognitive computing. During his career, Dr. Gschwind has been a technical leader for IBM’s key transformational initiatives, leading the development of the OpenPOWER Hardware Architecture as well as the software interfaces of the OpenPOWER Software Ecosystem. In previous assignments, he was a chief architect for Blue Gene, POWER8, POWER7, and Cell BE. Dr. Gschwind is a Fellow of the IEEE, an IBM Master Inventor and a Member of the IBM Academy of Technology. 

  


UID

ibm16169839