Introducing Spectrum Storage for AI with NVIDIA DGX for deep learning

Artificial intelligence, machine learning and deep learning projects are underway in every industry and in research teams across the globe. According to IDC[1], by 2019, 40 percent of digital transformation initiatives will use AI services; by 2021, 75 percent of commercial enterprise apps will use AI, over 90 percent of consumers will interact with customer support bots, and over 50 percent of new industrial robots will leverage AI.

Companies are investing in hiring data scientists and buying powerful GPU-based servers to develop AI/ML/DL models and drive significant business value from oceans of data. But at the heart of the work is the data itself and the efficiency of the AI data pipeline, from data preparation to insights.

The choice of storage is critical to their success. Data scientists need access to large, readily accessible quantities of data supported by a wide variety of data tools. High-performance, multi-protocol shared storage for the latest AI and data tools, like TensorFlow, PyTorch and Spark, gives data teams faster access to more data with less complexity, lower costs and greater reliability.

To drive AI development productivity and streamline the AI data pipeline, IBM is expanding on the successful release of the IBM Systems AI Infrastructure Reference Architecture with the release of IBM Spectrum Storage for AI with NVIDIA DGX. This reference architecture combines the industry-acclaimed software-defined storage scale-out file system, IBM Spectrum Scale on flash, with NVIDIA DGX-1. It provides the highest storage performance of any compared system[2], with the unique ability to support a growing data science practice.

“IBM’s strategy is to make AI/ML/DL more accessible and more performant,” says Ashish Nadkarni, Group Vice President, IDC Infrastructure Systems, Platforms and Technologies. “IBM Spectrum Storage for AI with NVIDIA DGX is designed to provide a tested and supported platform. For those who are choosing NVIDIA DGX servers for the open source frameworks and high-throughput GPU platforms, IBM Spectrum Scale can add intelligent, scalable, secured, metadata-rich, cloud-integrated, multiprotocol, high-performing and efficient storage in an easy-to-deploy solution from their top tier business partners.”

IBM Spectrum Storage for AI with NVIDIA DGX provides a foundation for an AI data infrastructure on which companies can build their data science services and deliver powerful business value from their AI applications. The proper storage can support the AI data pipeline from data preparation to training, inference and archive. Tested, tuned and delivered exclusively through the reseller channel, IBM Spectrum Storage for AI with NVIDIA DGX provides the reference architecture to ramp up quickly and grow confidently.

“As an IBM Platinum and NVIDIA Deep Learning partner with a full stack approach to partnering with clients around data-driven digital transformation, Mark III has seen firsthand how impactful AI can be in fundamentally transforming the user experience and discovering deeper insights from real-time and historical data than ever thought possible,” says Stan Wysocki, President of Mark III Systems. “As AI weaves its way into enterprises and institutions, we’re excited about the possibilities around the combination of IBM Spectrum Storage for AI with NVIDIA Deep Learning to help enable our clients to build, launch, and scale their AI-driven applications and services around a robust enterprise platform engineered for the future.”

Covering systems and software, IBM Spectrum Storage for AI with NVIDIA DGX is designed for data science productivity and IT simplicity.

IBM Spectrum Scale is an award-winning[3] choice for AI and HPC projects because it is readily deployed, delivers high performance and provides unique, intelligent scaling. In the SpectrumAI reference architecture, it provides maximum data flexibility with developer agility to drive productivity.

Software-defined IBM Spectrum Storage for AI with NVIDIA DGX can be configured to meet your current and growing business requirements. IBM Spectrum Scale can be deployed in configurations ranging from a single IBM Elastic Storage Server (ESS) supporting a few NVIDIA DGX-1 servers, to a rack of 9 servers with 72 Tesla V100 Tensor Core GPUs, to multi-rack configurations. Unlike traditional storage arrays, the highly parallel IBM Spectrum Scale scales practically linearly with random read data throughput requirements to feed multiple GPUs. The result is a solution that delivers AI workload performance from shared storage comparable to that of a local RAM disk. In a single reference architecture rack, IBM Spectrum Scale on three IBM NVMe arrays demonstrated 120GB/s of data throughput[4] to support multiple users and multiple models simultaneously.
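To put the single-rack figures in perspective, the short Python sketch below divides the quoted 120GB/s across the 9 DGX-1 servers and 72 GPUs. It assumes the throughput is shared evenly, which real training workloads will not do exactly, and the helper name per_unit_throughput is hypothetical, introduced only for this illustration.

```python
# Figures quoted in the text for the single-rack reference configuration.
RACK_THROUGHPUT_GB_S = 120.0  # aggregate data throughput for the rack
DGX1_SERVERS = 9              # NVIDIA DGX-1 servers in the rack
TOTAL_GPUS = 72               # Tesla V100 GPUs across those servers


def per_unit_throughput(total_gb_s: float, units: int) -> float:
    """Return the even share of total_gb_s for each of `units` consumers.

    Assumption: load is spread evenly; actual per-GPU rates depend on the
    workload, the model and the I/O pattern.
    """
    if units <= 0:
        raise ValueError("units must be a positive integer")
    return total_gb_s / units


gpus_per_server = TOTAL_GPUS // DGX1_SERVERS
per_server_gb_s = per_unit_throughput(RACK_THROUGHPUT_GB_S, DGX1_SERVERS)
per_gpu_gb_s = per_unit_throughput(RACK_THROUGHPUT_GB_S, TOTAL_GPUS)

print(f"GPUs per DGX-1 server: {gpus_per_server}")
print(f"Even share per server: {per_server_gb_s:.1f} GB/s")
print(f"Even share per GPU:    {per_gpu_gb_s:.2f} GB/s")
```

Under that even-split assumption, each DGX-1 server would see roughly 13.3 GB/s and each GPU roughly 1.67 GB/s. These are back-of-the-envelope numbers derived from the rack total, not measured per-device results.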

The NVIDIA DGX software stack is designed to maximize GPU-accelerated training performance and includes the new RAPIDS framework to accelerate data science workflows. Adoption of new AI frameworks is simplified by the container model, supported by NVIDIA’s NGC container repository of GPU-optimized applications.

“Our customers are looking to scale up their data science infrastructure to help solve the challenges of AI data deployments,” says Charlie Boyle, Senior Director of DGX Systems at NVIDIA. “The power of the GPU-accelerated NVIDIA DGX-1, combined with IBM Spectrum Scale’s storage technology, gives data scientists a turnkey solution that brings together industry-leading compute and high-performance storage with proven results.”

Updated for AI and flash, IBM Spectrum Scale provides the flexibility to fit into the extended data pipeline to help organizations meet their performance, economic and data governance requirements. Delivered in IBM Spectrum Storage for AI with NVIDIA DGX as an integrated all-flash solution, it can also provide storage services across different storage choices, including AWS public cloud. IBM Spectrum Scale can share data with IBM Cloud Object Storage and tape, with shared metadata services provided by IBM Spectrum Discover.

“The same storage innovations that are driving the fastest and smartest supercomputers in the world are now available to any data science team building high-performance AI model training infrastructure. Integrated by our joint channel partners, IBM Spectrum Storage for AI with NVIDIA DGX can get our clients started quickly and help to grow their AI, machine learning and deep learning projects,” says Sam Werner, VP Offering Management, IBM Software-Defined Storage.

AI data pipeline

IBM Storage has developed the reference architectures and converged systems that our clients need, with balanced configurations that deliver performance and efficiency. Today, data scientists may start small, but the data will always grow. By choosing the right starting point, organizations can build an AI data pipeline where value and efficiency grow with volumes of data.

To learn more about IBM SpectrumAI with NVIDIA DGX, visit this webpage and join us for the February 19th webinar “Building your AI Data Pipeline with IBM Spectrum Storage for AI with NVIDIA DGX.”

This is the latest in a series of IBM Systems infrastructure solutions for AI that span IBM Power AC922, IBM PowerAI, IBM Spectrum Computing and the IBM Spectrum Storage portfolio. To learn more about the IBM infrastructure solutions for AI, visit the IBM Systems AI webpages, or start with the IBM Systems AI Infrastructure Reference Architecture.

[1] Source: IDC Technology Spotlight, sponsored by IBM, #US43977818

[2] FIO data throughput testing of 120GB/s and AI workload performance compared to the self-reported results of other NVIDIA RA business partners: https://www.nvidia.com/en-us/data-center/dgx-reference-architecture/

[3] Winner of many awards, including the 2017 Flash Memory Summit “Most Innovative Flash Memory Business Application” award and the 2018 HPCwire Readers’ Choice Awards.

[4] IBM lab testing of IBM Spectrum Scale on 3 NVMe arrays using 1M random reads driven by FIO on 9 NVIDIA DGX-1 systems connected with Mellanox EDR InfiniBand.