POWER9: purpose-built systems accelerate AI in the IBM Cloud – open, paid beta

2 minute read | February 5, 2020

Cloud and AI strategies are not one-size-fits-all. From client engagements, we understand that each organization’s journey to cloud and AI is complex and unique. Cloud strategies can span many dimensions of consumption, including public and private, shared and dedicated, on premises and off premises.

The average enterprise uses multiple clouds. AI adds another layer of complexity, as organizations experimenting with AI struggle to find the right infrastructure for these compute-intensive workloads. Organizations need to understand how their cloud strategy supports their AI initiatives and what their AI strategy needs in the way of cloud-enabled resources. Adopting a generic strategy can be misguided and can put the overall success of the implementation at risk.

IBM Cloud Virtual Servers for VPC on POWER

To help organizations succeed in today’s hybrid multicloud world and start their journey to AI, IBM Cloud and IBM Power Systems have paired up to offer POWER9 processor-based virtual servers, with and without GPUs, inside the IBM Cloud Virtual Private Cloud (VPC). Currently, this offering is an open, paid beta.

Based on the IBM Power System AC922 server, the infrastructure that fuels the world’s two fastest supercomputers [1], IBM Cloud Virtual Servers for VPC on POWER bring together NVIDIA V100 Tensor Core GPUs and a POWER9 server in an on-demand, pay-as-you-go, next-generation cloud environment. This combination provides organizations with rapid and cost-effective access to the compute and processing power needed for AI and deep learning workloads, along with the enterprise-grade security and isolation of a private cloud. Delivering 2.85x faster throughput than comparable x86 infrastructure for AI, deep learning and visual analytics workloads [2], these POWER9 processor-based virtual servers offer an ideal platform for AI-hungry apps on Linux.

Infrastructure for AI

Many clients have already adopted the Power System AC922 as their AI training platform. With IBM Cloud Virtual Servers for VPC on POWER, they can extend their POWER9 deployment options and execute AI workloads on GPU-accelerated POWER9 technology in a logically isolated, public cloud environment. Organizations can get started with AI quickly and build, train, and test new AI applications, frameworks and libraries in the cloud with rapid self-service provisioning and flexible management.

Get started with IBM Cloud Virtual Servers for VPC on POWER with the open, paid beta

Interested in pricing and customization options for POWER9 in the cloud? Want to check out the open, paid beta? Get started by visiting the IBM Cloud Catalog. POWER9 VSIs on Linux are currently available in VPC for provisioning in the Dallas multizone region (MZR).
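For readers who prefer to script provisioning rather than use the catalog UI, a rough sketch with the IBM Cloud CLI and its VPC infrastructure plugin might look like the following. Note the caveats: every ID and the profile name below are hypothetical placeholders (the beta’s actual POWER9 profile names are not listed in this post), and the exact flags can vary by plugin version.

```shell
# Sketch under assumptions: IBM Cloud CLI with the VPC infrastructure
# plugin installed. All IDs and the profile name are hypothetical
# placeholders; look up real values in the IBM Cloud Catalog or console.
ibmcloud login
ibmcloud target -r us-south          # Dallas MZR, per the beta availability

# List available instance profiles to find the POWER9 (and GPU) options
ibmcloud is instance-profiles

# Create a virtual server instance: name, VPC ID, zone, profile, subnet ID
ibmcloud is instance-create my-power9-vsi VPC_ID us-south-1 PROFILE_NAME SUBNET_ID \
    --image-id LINUX_IMAGE_ID
```

Since these commands require an IBM Cloud account and live resource IDs, treat this purely as an orientation aid, not a copy-paste recipe.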

To learn more about Virtual Private Cloud, visit https://www.ibm.com/cloud/virtual-private-cloud.

To learn more about accelerated Power Systems visit us here.

[1] https://www.top500.org/lists/2019/11/

[2] Results are based on IBM internal measurements running 1,000 iterations of the Deeplabv3+ model (batch size = 1, resolution = 2100^2) on the PASCAL VOC 2012 dataset. (http://host.robots.ox.ac.uk/pascal/VOC/voc2012/htmldoc/index.html)

Power AC922 on IBM’s Virtual Private Cloud (NextGen); 28 cores (2x 14-core chips), POWER9 with NVLink 2.0; 3.8 GHz; 1 TB memory; 4x Tesla V100 32 GB GPUs; Ubuntu 18.04.3 LTS (4.15.0-64-generic) with CUDA 10.1.168 / cuDNN 7.5.1; nvidia-driver-418.87. Competitive stack: 2x Xeon E5-2698 (bare metal); 40 cores (2x 20-core chips); 2.40 GHz; 768 GB memory; 8x Tesla V100 16 GB GPUs; Ubuntu 16.04.5 with CUDA 10.1.168 / cuDNN 7.5.1; nvidia-driver-418.39

Software: IBM TFLMS – P9: TFLMSv2 – WML-CE 1.6.1, TensorFlow 1.14, tensorflow-large-model-support 2.0.1; Competitive stack: TFLMSv2 – WML-CE 1.6.1, TensorFlow 1.14, tensorflow-large-model-support 2.0.1; Deeplabv3+ – https://github.com/mldlppc/tensorflow-models (branch: powerai1.6.0)

TFLMSv2 parameters: swapout_threshold=1, swapin_ahead=1, swapin_groupby=0, sync_mode=0
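For readers reproducing a similar setup, the sketch below shows one plausible way the TFLMSv2 settings above would be supplied in code. The `tensorflow_large_model_support` import path and `LMS` class are assumptions based on IBM’s tensorflow-large-model-support 2.0.1 package in WML-CE 1.6.1 and are left commented out here; only the parameter values themselves come from this post.

```python
# The benchmark's TFLMSv2 settings, exactly as reported above.
lms_params = {
    "swapout_threshold": 1,
    "swapin_ahead": 1,
    "swapin_groupby": 0,
    "sync_mode": 0,
}

# On a WML-CE 1.6.1 install (assumed API, not reproduced here):
# from tensorflow_large_model_support import LMS
# lms = LMS(**lms_params)
# lms.run()  # rewrites the TF graph to swap tensors between GPU and host memory
```

Large Model Support matters for this benchmark because Deeplabv3+ at 2100^2 resolution exceeds GPU memory, so activations must be staged in host memory; the POWER9 NVLink 2.0 connection between CPU and GPU is what the 2.85x throughput claim leans on.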

Results can vary with different datasets and model parameters.