October 26, 2021 By Elpida Tzortzatos 3 min read

Nearly everyone recognizes the profound opportunity to bring new insights and better decisions to business workloads using AI and analytics. Enabling AI on IBM Z and LinuxONE is a key focus for IBM,  allowing clients to have a reliable, secured, and high performing environment for delivering critical business insights using Machine Learning and Deep Learning applications.

However, with this opportunity there are also challenges, especially those around deploying AI in a production environment. The use of AI in critical business workloads is a growing space, and as with other new technologies, the path to production can be challenging. Key challenges include the need to deploy data science assets without sacrificing production qualities of service (i.e., meet response time goals) in a consistent, repeatable manner.

That is where the Open Neural Network Exchange (ONNX) comes in. ONNX is an open-source format used to represent machine learning models and is one of the key ecosystem technologies enabling a “Build and Train Anywhere, Deploy on IBM Z” strategy. ONNX helps establish a streamlined path to take a project from inception to production. Models represented in a standard ONNX format can then be implemented by an ONNX backend (i.e., runtime or model compiler), such as on IBM Z.

This journey to production starts with the data scientist, who may use a preferred set of tools to understand a business problem and analyze data. When that data scientist creates and trains a model, they build assets that ultimately need to be deployed in production. Often, however, the deployment platform and production requirements aren’t considered heavily in these early stages. This is where utilizing ONNX in a deployment strategy really shines. Many of the most popular libraries and frameworks, including PyTorch and TensorFlow, support the ability to export or convert a trained model to an ONNX format.

Once a model has an ONNX representation, it can be deployed to run on any platform with an ONNX runtime. This provides several key benefits: the model is now portable, with no runtime dependencies on the libraries or framework it was trained on. For example, an ONNX model that was originally created and trained in TensorFlow can be served without the TensorFlow runtime. Additionally, ONNX allows vendors to create high performing model backends that can optimize and accelerate the model for a specific architecture.

For IBM Z and the mission critical workloads it typically hosts, this combination of portability and optimization makes IBM Z an optimal environment for deploying models. One key example of the use of ONNX is in Watson Machine Learning for z/OS (WMLz), which incorporates an ONNX model compiler technology based on the ONNX-MLIR project. The ONNX model compiler feature of WMLz is focused on deep learning models and produces an executable optimized to run on IBM Z. WMLz allows the user to easily deploy this compiled ONNX model for model serving.

As IBM Z continues to innovate in enterprise AI, ONNX is a key part of IBM’s AI strategy. It allows IBM to build a deployment strategy optimized for the IBM Z architecture, while staying closely aligned with the broader ecosystem.

In August, you may have read that IBM previewed Telum, the next generation IBM Z processor. IBM is now examining opportunities to exploit the on-chip AI accelerator with the ONNX model compiler.

ONNX is part of the Linux Foundation and has widespread support from numerous key vendors that recognize the value it delivers. IBM is an early adopter of ONNX and contributes upstream to the ONNX project.

Be on the lookout for additional updates on how you can leverage ONNX as part of your IBM Z AI story!  

>> To learn more, you can also read about ONNX and IBM Watson Machine Learning for z/OS here.

Was this article helpful?

More from Cloud

IBM Cloud Virtual Servers and Intel launch new custom cloud sandbox

4 min read - A new sandbox that use IBM Cloud Virtual Servers for VPC invites customers into a nonproduction environment to test the performance of 2nd Gen and 4th Gen Intel® Xeon® processors across various applications. Addressing performance concerns in a test environment Performance testing is crucial to understanding the efficiency of complex applications inside your cloud hosting environment. Yes, even in managed enterprise environments like IBM Cloud®. Although we can deliver the latest hardware and software across global data centers designed for…

10 industries that use distributed computing

6 min read - Distributed computing is a process that uses numerous computing resources in different operating locations to mimic the processes of a single computer. Distributed computing assembles different computers, servers and computer networks to accomplish computing tasks of widely varying sizes and purposes. Distributed computing even works in the cloud. And while it’s true that distributed cloud computing and cloud computing are essentially the same in theory, in practice, they differ in their global reach, with distributed cloud computing able to extend…

How a US bank modernized its mainframe applications with IBM Consulting and Microsoft Azure

9 min read - As organizations strive to stay ahead of the curve in today's fast-paced digital landscape, mainframe application modernization has emerged as a critical component of any digital transformation strategy. In this blog, we'll discuss the example of a US bank which embarked on a journey to modernize its mainframe applications. This strategic project has helped it to transform into a more modern, flexible and agile business. In looking at the ways in which it approached the problem, you’ll gain insights into…

IBM Newsletters

Get our newsletters and topic updates that deliver the latest thought leadership and insights on emerging trends.
Subscribe now More newsletters