Accelerating PyTorch Inference on IBM Z and IBM LinuxONE
18 November 2024
Authors
Tina Tarquinio, VP, Product Management, IBM Z and LinuxONE
Elpida Tzortzatos, IBM Fellow and CTO of z/OS and AI on IBM Z and LinuxONE
The rise of PyTorch: Transforming AI development

Over the past decade, we have seen a dramatic transformation in enterprises led by AI. The rise of big data and specialized hardware has made powerful AI models that were once limited to elite research teams at top-tier universities accessible to the masses. Deep neural networks have powered this democratization, and deep learning frameworks such as PyTorch and TensorFlow have aided the development of these models. PyTorch has become a key player in the AI landscape, offering unique advantages that have led to its widespread use and adoption.

PyTorch acceleration with IBM Z Telum on-chip accelerator for AI

Recent advancements in hardware AI accelerators have provided the power needed to use deep learning frameworks like PyTorch effectively. These hardware improvements speed up computation for more complex models on large datasets, which shortens both experimentation and deployment. With the latest continuous delivery update of AI Toolkit for IBM Z® and LinuxONE®, we are adding support for PyTorch through a new container: IBM Z Accelerated for PyTorch. The container provides a development and inference environment for PyTorch. It uses new inference acceleration capabilities that transparently target the IBM Integrated AI Accelerator and provide significant acceleration for traditional machine learning and deep learning models, as well as encoder-based LLMs. These capabilities help clients experiment through rapid proofs of concept (PoCs) and build AI solutions on IBM Z and LinuxONE.

What is PyTorch?

PyTorch is an open source machine learning framework that provides a flexible platform for building deep learning models. Released by Facebook's AI Research lab in 2016, PyTorch allows developers to create and modify models easily through its dynamic structure, which offers immediate feedback. This adaptability makes it particularly appealing for researchers and developers who want to experiment with new ideas.

The importance of PyTorch in today's AI landscape

PyTorch has gained widespread popularity in the AI ecosystem. Its user-friendly interface and powerful features have made it the framework of choice for both academic research and business applications. PyTorch has played a crucial role in advancing deep learning by providing tools that simplify the process of building and training complex models. Its flexibility allows developers to experiment with different architectures and techniques, leading to more innovative solutions. Features like automatic differentiation and intuitive tensor manipulation have made it easier to implement advanced algorithms, resulting in faster progress in research and application.
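
As a minimal sketch of the two features named above, the following snippet shows tensor manipulation and automatic differentiation by using the standard PyTorch API. The values are arbitrary and chosen only for illustration.

```python
import torch

# Tensor manipulation: build a 2x3 tensor, then multiply it by its transpose.
a = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
gram = a @ a.T  # resulting shape is (2, 2)

# Automatic differentiation: mark x as requiring gradients,
# compute y = sum(x^2), and let autograd derive dy/dx.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
y.backward()

# x.grad now holds dy/dx, which for this function equals 2 * x.
print(gram)
print(x.grad)
```

No gradient formula is written by hand here: PyTorch records the operations applied to x as they run and differentiates through them when backward() is called.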

One of the most significant areas where PyTorch has made an impact is in the development of large language models (LLMs). These models, which can understand and generate human-like text, have revolutionized natural language processing. Frameworks like PyTorch have facilitated the creation and fine-tuning of these models, enabling researchers to explore new architectures and training methods more efficiently.

Notably, many of the latest state-of-the-art language models, including those developed by major tech companies, are implemented in PyTorch. The framework's ability to handle vast amounts of data and its support for distributed training have made it possible to scale up models that can comprehend context and nuance in language.

With IBM Z Accelerated for PyTorch delivered through the AI Toolkit for IBM Z and LinuxONE, our clients can deploy PyTorch models with the reliability, availability and scalability of IBM Z, along with the inferencing acceleration capabilities of the Telum® on-chip accelerator. This acceleration is transparent to clients: the containers are designed to take advantage of the Neural Network Processing Assist (NNPA) instructions of Telum automatically.

Clients can now use this capability for high-value use cases like fraud detection, claims processing, natural language processing, image detection and more. These models can be deployed in the native PyTorch format or exported to formats like ONNX, which are highly optimized for inferencing.
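
To illustrate deployment in the native PyTorch format, here is a minimal sketch that saves a model's weights, restores them and scores a batch. The FraudScorer class, its layer sizes and the eight input features are hypothetical and stand in for a real trained model. Nothing in the sketch is specific to IBM Z; because the acceleration is described above as transparent, the assumption is that standard PyTorch code like this runs unchanged inside the container.

```python
import io

import torch
from torch import nn

torch.manual_seed(0)


class FraudScorer(nn.Module):
    """Hypothetical toy model: maps 8 transaction features to a score in [0, 1]."""

    def __init__(self, n_features: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 16),
            nn.ReLU(),
            nn.Linear(16, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.net(x)).squeeze(-1)


# Stand-in for a trained model (weights here are just the random initialization).
model = FraudScorer()

# Save the weights in the native PyTorch format.
# An in-memory buffer is used so the sketch does not touch the file system.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# Restore the weights into a fresh instance and switch to evaluation mode.
restored = FraudScorer()
restored.load_state_dict(torch.load(buffer, weights_only=True))
restored.eval()

# Score a batch of 4 synthetic transactions without tracking gradients.
batch = torch.randn(4, 8)
with torch.inference_mode():
    scores = restored(batch)

print(scores)
```

In a real deployment the buffer would be a file or model store, and the batch would come from the transaction data that sits alongside the model.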

Whether the PyTorch models are deployed on z/OS® or in a Linux on IBM Z environment, the colocation of these models with our clients' mission-critical data and applications helps them gain business insights at scale while continuing to meet even the most stringent service-level agreements.

AI Toolkit for IBM Z and LinuxONE

The AI Toolkit for IBM Z and IBM LinuxONE is designed to help our clients deploy popular open source AI frameworks on their z/OS and IBM LinuxONE platforms and to accelerate adoption of those frameworks. The AI Toolkit follows a rigorous IBM Secure Engineering process that vets and scans open source AI-serving frameworks and IBM-certified containers for security vulnerabilities and validates compliance with industry regulations. Clients can also purchase IBM Elite Support for AI Toolkit for IBM Z and LinuxONE.
