Hardware and software requirements

IBM Watson Machine Learning Accelerator requires the following hardware and software.

Hardware requirements

The following hardware is supported:
    • IBM® Power System IC922 with or without NVIDIA Tesla T4 GPUs
    • IBM Power System AC922 with or without NVIDIA Tesla V100 GPUs
    • IBM Power System S822LC with or without NVIDIA Tesla P100 GPUs
    • Inspur Power Systems FP5468G2 servers with NVIDIA Tesla V100 GPUs
    • Inspur Power Systems FP5280G2 servers with NVIDIA Tesla V100 GPUs
    • x86_64 systems with or without NVIDIA Tesla V100, P100, or T4 GPUs
Minimum system requirements

The following tables list the minimum system requirements for running IBM Watson Machine Learning Accelerator in a production environment. You might need additional resources (such as more CPU and RAM) depending on the Spark instance groups that will run on the hosts, especially for compute hosts that run workloads.

Table 1. Minimum hardware requirements
Requirement | Management hosts | Compute hosts | Notes
RAM | 64 GB | 32 GB | In general, the more memory your hosts have, the better the performance.
Disk space to extract install files from the WML Accelerator install package | 16 GB (first management host only) | NA |
Disk space to install IBM Spectrum Conductor™ | 12 GB | 12 GB |
Disk space to install IBM Spectrum Conductor Deep Learning Impact | 11 GB | 11 GB |
Additional disk space (for Spark instance group packages, logs, and so on) | Can be 30 GB for a large cluster | 1 GB × N slots + sum of service package sizes (including dependencies) | Disk space requirements depend on the number of Spark instance groups and the Spark applications that you run. Long-running applications, such as notebooks and streaming applications, can generate large amounts of data that are stored in Elasticsearch. What your applications log can also increase disk usage. Consider all of these factors when you estimate disk space requirements for your production cluster. For optimal performance, consider tuning how long application monitoring data is retained, based on your needs.
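
The compute-host formula in Table 1 (1 GB per slot plus the sum of the service package sizes) can be worked through in a few lines. The following Python sketch is for illustration only and is not part of the product. The function names, the sample slot count and package sizes, and the choice to add both install sizes to the total are assumptions.

```python
# Illustrative sketch only. The constants come from Table 1; the function
# names and the sample inputs are hypothetical.

CONDUCTOR_INSTALL_GB = 12   # disk space to install IBM Spectrum Conductor
DLI_INSTALL_GB = 11         # disk space to install Deep Learning Impact
MIN_RAM_GB = {"management": 64, "compute": 32}


def compute_host_additional_disk_gb(slots, service_package_sizes_gb):
    """Additional disk on a compute host: 1 GB * N slots + sum of package sizes."""
    if slots < 0:
        raise ValueError("slots must be non-negative")
    return 1 * slots + sum(service_package_sizes_gb)


def compute_host_total_disk_gb(slots, service_package_sizes_gb):
    """Install sizes plus additional disk.

    Assumes both products are installed on the same host.
    """
    additional = compute_host_additional_disk_gb(slots, service_package_sizes_gb)
    return CONDUCTOR_INSTALL_GB + DLI_INSTALL_GB + additional


def meets_min_ram(host_role, ram_gb):
    """True if the host has at least the minimum RAM for its role."""
    return ram_gb >= MIN_RAM_GB[host_role]


if __name__ == "__main__":
    # Hypothetical compute host: 16 slots, service packages of 2.5 GB and 1.5 GB.
    print(compute_host_additional_disk_gb(16, [2.5, 1.5]))  # 20.0
    print(compute_host_total_disk_gb(16, [2.5, 1.5]))       # 43.0
    print(meets_min_ram("compute", 32))                     # True
```

These figures are minimums. As the notes in Table 1 point out, log volume and monitoring data retained in Elasticsearch can push actual usage well above the estimate.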

Software requirements

The following software is required:

Table 2. Software requirements
Hardware | Operating system | GPU software
POWER8 | Red Hat® Enterprise Linux® (RHEL) 7.7 (ppc64le) |
  • CUDA Deep Neural Network (cuDNN) 7.6.5 library
  • NVIDIA CUDA 10.2
  • NVIDIA GPU driver 440.33.01
  • NVIDIA NCCL2 2.5.6
  • Anaconda 2019.10 (with conda 4.7.12)
POWER9 with this security fix: RHSA-2018:1374 - Security Advisory | RHEL 7.6 (ppc64le) |
  • CUDA Deep Neural Network (cuDNN) 7.6.5 library
  • NVIDIA CUDA 10.2
  • NVIDIA GPU driver 440.33.01
  • NVIDIA NCCL2 2.5.6
  • Anaconda 2019.10 (with conda 4.7.12)
x86 | RHEL 7.7 |
  • CUDA Deep Neural Network (cuDNN) 7.6.5 library
  • NVIDIA CUDA 10.2
  • NVIDIA GPU driver 440.33.01
  • NVIDIA NCCL2 2.5.6
  • Anaconda 2019.10
  • Supported GPUs: NVIDIA P100 and V100
  • Shared file system:
    • IBM Spectrum Scale 5.0.4-3, 5.0.3, 5.0.1, 4.2.3, 4.2.2, 4.2.1 or 4.1.1
    • Network file system (NFS) 2, 3, or 4
      Note: If you use NFSv4, only the framework plugin functionality of IBM Spectrum Conductor Deep Learning Impact is available. For the full functionality of IBM Spectrum Conductor Deep Learning Impact, including the cluster management console, use NFSv3 or NFSv2.
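
One way to compare a host against Table 2 is to collect the installed versions and check them against the listed values. The following Python sketch is a hypothetical helper, not a tool shipped with the product. It assumes that an exact version match is wanted (the table does not say whether later versions are accepted) and that you gather the installed versions yourself. The second function reads the `vers=` or `nfsvers=` option from an NFS mount option string, which relates to the NFSv4 note above.

```python
# Illustrative sketch only. Required versions are copied from Table 2;
# the function names and the sample inputs are hypothetical.

REQUIRED = {
    "cudnn": "7.6.5",
    "cuda": "10.2",
    "gpu_driver": "440.33.01",
    "nccl": "2.5.6",
    "anaconda": "2019.10",
}


def find_mismatches(installed):
    """Return {component: (required, found)} for each missing or different version.

    Assumption: versions must match the table exactly.
    """
    mismatches = {}
    for component, required in REQUIRED.items():
        found = installed.get(component)
        if found != required:
            mismatches[component] = (required, found)
    return mismatches


def nfs_version(mount_options):
    """Return the NFS version from options such as 'rw,vers=4.1', or None."""
    for option in mount_options.split(","):
        key, _, value = option.strip().partition("=")
        if key in ("vers", "nfsvers"):
            return value
    return None


def nfs_allows_full_functionality(mount_options):
    """True for NFSv2 or NFSv3, False for NFSv4, None if no version is given."""
    version = nfs_version(mount_options)
    if version is None:
        return None
    return version.split(".")[0] in ("2", "3")


if __name__ == "__main__":
    # Hypothetical host with an older GPU driver.
    installed = {"cudnn": "7.6.5", "cuda": "10.2", "gpu_driver": "418.67",
                 "nccl": "2.5.6", "anaconda": "2019.10"}
    print(find_mismatches(installed))  # {'gpu_driver': ('440.33.01', '418.67')}
    print(nfs_allows_full_functionality("rw,relatime,vers=4.1"))  # False
```

Keeping the comparison separate from version collection means the same check can run against output gathered from many hosts.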

Deep learning frameworks

By default, all of the frameworks included with WML CE are installed. At least one supported framework must be installed. However, it is recommended that you install both TensorFlow and IBM Caffe. If either framework is missing, the option for the missing framework will not work in the cluster management console.
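
A quick way to see whether both recommended frameworks are present is to test whether their Python modules can be found in the active environment. The following sketch assumes that TensorFlow and IBM Caffe are exposed as the Python modules `tensorflow` and `caffe`. Those module names and the helper function are assumptions, and a module being found does not confirm that the framework works in the cluster management console.

```python
import importlib.util

# Assumption: these are the importable module names in the active
# conda environment. Adjust them if your installation differs.
RECOMMENDED_MODULES = {"TensorFlow": "tensorflow", "IBM Caffe": "caffe"}


def missing_frameworks(modules=RECOMMENDED_MODULES):
    """Return the display names of frameworks whose module cannot be found."""
    return [name for name, module in modules.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_frameworks()
    if missing:
        print("Not found in this environment:", ", ".join(missing))
    else:
        print("Both recommended frameworks were found.")
```

Run the check with the Python interpreter of the conda environment in which the frameworks were installed, because the result reflects only that environment.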

To determine which frameworks are included with WML Accelerator, see What's included.

Required additional repositories

See the following topic: Red Hat Enterprise Linux operating system and repository setup.