Accelerate and streamline AI and HPC workloads with new NVIDIA GPUs on IBM Cloud

Each day, data scientists around the world are using artificial intelligence (AI) and high performance computing (HPC) to solve complex challenges and create new business value from data.

Whether they are training a chatbot to provide better customer service, creating reservoir simulations for new oil fields or teaching autonomous cars to mind the rules of the road, these data-intensive workloads require fast, secure and cost-effective cloud infrastructure.

At IBM, we’re focused on delivering new AI capabilities both in the cloud and on premises to help enterprises not only gain critical insights from their data, but also create new value with that data. We’ve been working closely with NVIDIA to bring its latest GPU (graphics processing unit) technology, the NVIDIA Tesla V100, to the cloud, and we were the first to offer a comprehensive suite of GPUs including the P100, K80 and M60 on IBM Cloud bare metal and virtual servers. To power on-premises workloads, IBM also offers the industry’s only CPU-to-GPU NVIDIA NVLink connection on our latest POWER9 servers.

Building on this momentum, we’re excited to share that IBM is introducing the NVIDIA Tesla V100 GPU to support mission-critical AI, deep learning and HPC workloads on the cloud.

Starting today, you can equip individual IBM Cloud bare metal servers with up to two NVIDIA Tesla V100 PCIe GPU accelerators — NVIDIA’s latest, fastest and most advanced GPU architecture. The combination of IBM high-speed network connectivity and bare metal servers with the Tesla V100 GPUs provides the performance and speed that enterprises need. That means AI models that once needed weeks of computing resources can now be trained in just a few hours.

Performance and rapid provisioning are critical for AI and HPC workloads. Building on IBM bare metal support for the NVIDIA Tesla P100 GPU, IBM will also make the P100 GPU available on IBM Cloud virtual servers. This provides power and high performance for AI and deep learning workloads with the scalability and flexibility of IBM’s virtual servers. With the Tesla P100 GPU accelerator, you can leverage up to 65 percent more deep learning capabilities and 50 times the performance of its predecessor.

While performance and processing power are critical for AI and HPC workloads, data protection and resiliency also remain integral. To support these mission-critical workloads, IBM offers block and file storage that is resilient, security-rich and persistent beyond the lifecycle of your compute instance.

This new IBM Cloud service delivers near-instant access to the most powerful GPU technologies to date, enabling enterprises, data scientists and researchers from organizations including NASA Frontier Development Lab and SpectralMD to train deep learning models and create innovative cloud-native applications that can help address complex problems. IBM also collaborates with technology partners such as Rescale and Bitfusion to facilitate seamless access to GPUs on IBM Cloud.

For example, during the NASA Frontier Development Lab this past summer, a team of researchers and data scientists used machine learning techniques on the IBM Cloud to develop new processes for 3D modeling of asteroids from radar data. With an average of 35 new asteroids and near-Earth objects discovered each week, there is currently more data available than experts can keep up with, and existing 3D modeling processes can take several months. Using NVIDIA P100 GPUs on the IBM Cloud and IBM Cloud Object Storage, the team was able to generate asteroid shape images an average of five to six times faster than previous processes allowed.

SpectralMD, a clinical-research-stage medical device company, created the DeepView™ Wound Imaging System, which brings together deep learning techniques and medical imaging with the goal of improving outcomes for patients with chronic and burn wounds. Using NVIDIA Tesla P100 GPUs on IBM Cloud, SpectralMD trains and tests deep learning models on large clinical data sets of medical images to help clinicians determine the best treatment options for a wound. According to SpectralMD, having all the available resources of each GPU on IBM Cloud bare metal servers cuts the time needed to cross-validate deep learning models from weeks to hours and reduces issues such as memory bottlenecks.

“Whether they are accelerating drug discovery or creating virtual personal assistants that converse naturally, data scientists are using our GPU computing platform in the cloud to solve complex problems that were once considered unsolvable,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. “The new IBM Cloud offerings based on our Volta technology provide incredible processing speeds and the ability to scale up or down on demand for HPC and deep learning.”

We’re looking forward to seeing what you can create with NVIDIA GPUs on the IBM Cloud today.

 
