Deep Learning on OpenPOWER: Building Theano on OpenPOWER Linux Systems
The Machine Learning and Deep Learning project in IBM Systems is a broad effort to build a co-optimized stack of hardware and software to make IBM Power Systems the best platform to develop and deploy cognitive applications. As part of this project, IBM has developed new processors, systems, and a co-optimized software stack uniquely optimized for AI applications.
The first offerings for this new era of cognitive computing are the S822LC, our first server designed from the ground up for cognitive computing, and the PowerAI distribution of AI tools and libraries for the Ubuntu and Red Hat Linux operating systems. Most data scientists and AI practitioners building cognitive solutions prefer to use the pre-built, pre-optimized deep learning frameworks of the PowerAI distribution.
In addition to creating the binary distribution of deep learning frameworks, we have been working with the open source community so that the frameworks can be built directly from their repositories, letting deep learning users harness the power of the OpenPOWER ecosystem. With the introduction of little-endian OpenPOWER Linux, installing open source applications on Power has never been easier.
If you need to build Theano from source, this blog provides instructions for building Theano on (little-endian) OpenPOWER Linux, such as Red Hat Enterprise Linux 7.1, SUSE Linux Enterprise Server 12, Ubuntu 14.04, and subsequent releases. Theano may be built with support for CUDA 7.5 or CUDA 8 to exploit NVIDIA numerical accelerators, such as the four Pascal P100 accelerators in the S822LC for HPC, or PCIe-attached accelerators used in conjunction with POWER8™ systems.
Theano uses the BLAS basic linear algebra library interfaces. We recommend building a BLAS library based on OpenBLAS that is optimized for your POWER8™ system. You can read more about building an optimized OpenBLAS library in Building Optimized Libraries for Deep Learning on OpenPOWER Linux Systems.
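Once an optimized BLAS is installed, Theano can be pointed at it through its blas.ldflags configuration flag. The lines below are a minimal sketch; the /opt/openblas prefix is an assumption and should be replaced with your actual OpenBLAS install location:
$ # /opt/openblas is a placeholder; use your actual OpenBLAS install prefix
$ export LD_LIBRARY_PATH=/opt/openblas/lib:$LD_LIBRARY_PATH
$ export THEANO_FLAGS='blas.ldflags=-L/opt/openblas/lib -lopenblas'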
Prerequisites for building Theano
To build Theano, you will need at a minimum the following Linux packages:
- Python 2 version >= 2.6, or Python 3 version >= 3.3
- python-pip
- python-dev
- Python NumPy version >= 1.7.1
- Python SciPy version >= 0.11
- a BLAS installation
In addition, Theano can be built to use the following optional packages
- nose version >= 1.3.0
- Sphinx version >= 0.5.1
- pygments
- pydot
On Ubuntu, you can install many of the packages above with the apt-get system installer:
$ sudo apt-get install python-numpy \
python-scipy \
python-dev \
python-pip \
python-nose \
g++ \
libblas-dev
To build your own optimized OpenBLAS library, see Building Optimized Libraries for Deep Learning on OpenPOWER Linux Systems. To build and install libgpuarray, see http://deeplearning.net/software/libgpuarray/installation.html.
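For reference, building libgpuarray generally follows the steps below, condensed from the installation page linked above. Treat this as a sketch rather than a definitive recipe; exact options may vary with your environment:
$ git clone https://github.com/Theano/libgpuarray.git
$ cd libgpuarray
$ mkdir Build && cd Build
$ cmake .. -DCMAKE_BUILD_TYPE=Release    # configure the C library
$ make && sudo make install              # build and install libgpuarray
$ cd ..
$ python setup.py build                  # build the pygpu Python bindings
$ sudo python setup.py install
$ sudo ldconfig                          # refresh the dynamic linker cache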
Building Theano with CUDA 7.5
If CUDA 7.5 is already installed on your OpenPOWER system, you can skip to the next step, installing cuDNN 5.1. To install CUDA 7.5, download the CUDA distribution from https://developer.nvidia.com/cuda-downloads and follow the installation instructions.
For example, on Ubuntu 14.04, this is performed as follows:
Download and install NVIDIA CUDA 7.5 from https://developer.nvidia.com/cuda-downloads
• Select Operating System: Linux
• Select Architecture: ppc64le
• Select Distribution: Ubuntu
• Select Version: 14.04
• Select the Installer Type that best fits your needs
• Follow the Linux installation instructions in the CUDA Quick Start Guide linked from the download page, including the steps describing how to set up the CUDA development environment by updating PATH and LD_LIBRARY_PATH (sketched below).
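For a default installation, that environment setup typically looks like the following; the /usr/local/cuda-7.5 prefix is an assumption and should match wherever the installer placed the toolkit on your system:
$ export PATH=/usr/local/cuda-7.5/bin:$PATH                            # nvcc and other CUDA tools
$ export LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64:$LD_LIBRARY_PATH    # CUDA runtime libraries
$ nvcc --version                                                       # quick sanity check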
If cuDNN v5.1 is already installed on your OpenPOWER system, you can skip to the next step, building Theano. Download NVIDIA cuDNN v5.1 for CUDA 7.5 on POWER8 from https://developer.nvidia.com/cudnn and follow NVIDIA's installation instructions. Registration in NVIDIA's Accelerated Computing Developer Program is required to download cuDNN.
On Ubuntu 14.04, download the following:
- cuDNN v5.1 Runtime Library for Power8 (Deb)
- cuDNN v5.1 Developer Library for Power8 (Deb)
- cuDNN v5.1 Code Samples and User Guide Power8 (Deb)
and install as follows:
$ sudo dpkg -i libcudnn5*deb
Building Theano with CUDA 8
If CUDA 8 is already installed on your OpenPOWER system, you can skip to the next step, installing cuDNN 5.1. To install CUDA 8, download the CUDA distribution from https://developer.nvidia.com/cuda-downloads and follow the installation instructions.
For example, on Ubuntu 16.04, this is performed as follows:
Download and install NVIDIA CUDA 8 from https://developer.nvidia.com/cuda-downloads
• Select Operating System: Linux
• Select Architecture: ppc64le
• Select Distribution: Ubuntu
• Select Version: 16.04
• Select the Installer Type that best fits your needs
• Follow the Linux installation instructions in the CUDA Quick Start Guide linked from the download page, including the steps describing how to set up the CUDA development environment by updating PATH and LD_LIBRARY_PATH.
If cuDNN v5.1 is already installed on your OpenPOWER system, you can skip to the next step, building Theano. Download NVIDIA cuDNN v5.1 for CUDA 8 on POWER8 from https://developer.nvidia.com/cudnn and follow NVIDIA's installation instructions. Registration in NVIDIA's Accelerated Computing Developer Program is required to download cuDNN.
On Ubuntu 16.04, download the "cuDNN v5.1 Library for Power8" and install as follows, substituting the path to the downloaded library archive where download-path appears below:
$ cd /usr/local
$ tar -xvzf download-path/cudnn-8.0-linux-ppc64le-v5.1.tgz
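The archive unpacks into a cuda/ subdirectory, so the cuDNN header and libraries normally end up alongside the CUDA 8 toolkit under /usr/local/cuda. A quick check along the following lines can confirm the files landed where they will be found at build and run time; the lib64 path is an assumption, as some archives use lib instead:
$ ls /usr/local/cuda/include/cudnn.h          # cuDNN header from the archive
$ ls /usr/local/cuda/lib64/libcudnn*          # cuDNN libraries (may be under lib/ instead)
$ sudo ldconfig /usr/local/cuda/lib64         # make the libraries visible to the dynamic linker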
Building and Testing Theano
Theano can be installed using the Python package installer pip as follows:
$ sudo pip install Theano
To use GPUs, you must then tell Theano where the CUDA root folder is. There are three ways to do this, and any one of them is enough (an example follows the list):
- Define a $CUDA_ROOT environment variable to equal the cuda root directory, as in CUDA_ROOT=/path/to/cuda/root, or
- add a cuda.root flag to THEANO_FLAGS, as in THEANO_FLAGS='cuda.root=/path/to/cuda/root', or
- add a [cuda] section to your .theanorc file containing the option root = /path/to/cuda/root.
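As an illustration, the second and third options might look like the following for a toolkit installed under /usr/local/cuda-8.0 (an assumed default location; use /usr/local/cuda-7.5 or your actual path as appropriate):
$ # Option 2: pass the flag in the environment
$ export THEANO_FLAGS='cuda.root=/usr/local/cuda-8.0'
$ # Option 3: persist the setting in ~/.theanorc
$ cat >> ~/.theanorc <<'EOF'
[cuda]
root = /usr/local/cuda-8.0
EOF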
You can test your Theano installation with the following command:
$ python -c "import theano; theano.test()"
You can use nose to test your installation with the following command:
$ nosetests theano
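To verify that the GPU is actually being used rather than silently falling back to the CPU, a short timing script along the lines of the example in the Theano documentation is handy. The sketch below writes check_gpu.py (a hypothetical file name), evaluates an elementwise exp() in a loop, and reports which device executed it:
$ cat > check_gpu.py <<'EOF'
# Minimal GPU sanity check, adapted from the example in the Theano documentation.
import time
import numpy
from theano import function, config, shared, tensor

vlen = 10 * 30 * 768          # arbitrary vector length
iters = 1000                  # number of function evaluations to time

x = shared(numpy.asarray(numpy.random.rand(vlen), config.floatX))
f = function([], tensor.exp(x))
t0 = time.time()
for _ in range(iters):
    f()
print("Looping %d times took %f seconds" % (iters, time.time() - t0))
# If any op in the compiled graph is a plain (CPU) Elemwise, the GPU was not used.
if numpy.any([isinstance(n.op, tensor.Elemwise) for n in f.maker.fgraph.toposort()]):
    print("Used the CPU")
else:
    print("Used the GPU")
EOF
$ THEANO_FLAGS=device=gpu,floatX=float32 python check_gpu.py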
You can find additional system-independent information about installing Theano at http://deeplearning.net/software/theano/install.html.
See what you can do with Theano on OpenPOWER
A variety of GPU numerical accelerator configurations can be used to accelerate Deep Learning on OpenPOWER systems, including the new IBM Power Systems S822LC for High Performance Computing server. You can learn more about and order these systems by contacting your IBM Business Partner.
IBM invites GPU software developers to join the IBM-NVIDIA Acceleration Lab to be among the first to try these systems and see the benefits of the Tesla P100 GPU accelerator and the high-speed NVLink connection to the IBM POWER8 CPU.
I look forward to hearing about the performance you get from these systems. Share how you want to use Theano on OpenPOWER and how Deep Learning on OpenPOWER will enable you to build the next generation of cognitive applications by posting in the comments section below.
Dr. Michael Gschwind is Chief Engineer for Machine Learning and Deep Learning for IBM Systems where he leads the development of hardware/software integrated products for cognitive computing. During his career, Dr. Gschwind has been a technical leader for IBM’s key transformational initiatives, leading the development of the OpenPOWER Hardware Architecture as well as the software interfaces of the OpenPOWER Software Ecosystem. In previous assignments, he was a chief architect for Blue Gene, POWER8, POWER7, and Cell BE. Dr. Gschwind is a Fellow of the IEEE, an IBM Master Inventor and a Member of the IBM Academy of Technology.