October 26, 2021 By Elpida Tzortzatos 3 min read

Nearly everyone recognizes the profound opportunity to bring new insights and better decisions to business workloads using AI and analytics. Enabling AI on IBM Z and LinuxONE is a key focus for IBM, giving clients a reliable, secure, and high-performing environment for delivering critical business insights using machine learning and deep learning applications.

However, this opportunity comes with challenges, especially around deploying AI in a production environment. The use of AI in critical business workloads is a growing space, and as with other new technologies, the path to production can be difficult. A key challenge is deploying data science assets in a consistent, repeatable manner without sacrificing production qualities of service (e.g., meeting response time goals).

That is where the Open Neural Network Exchange (ONNX) comes in. ONNX is an open-source format used to represent machine learning models and is one of the key ecosystem technologies enabling a “Build and Train Anywhere, Deploy on IBM Z” strategy. ONNX helps establish a streamlined path to take a project from inception to production. Models represented in the standard ONNX format can then be executed by an ONNX backend (i.e., a runtime or model compiler), including backends that target IBM Z.

This journey to production starts with the data scientist, who may use a preferred set of tools to understand a business problem and analyze data. When that data scientist creates and trains a model, they build assets that ultimately need to be deployed in production. Often, however, the deployment platform and production requirements aren’t considered heavily in these early stages. This is where utilizing ONNX in a deployment strategy really shines. Many of the most popular libraries and frameworks, including PyTorch and TensorFlow, support the ability to export or convert a trained model to an ONNX format.

Once a model has an ONNX representation, it can be deployed to run on any platform with an ONNX runtime. This provides several key benefits. First, the model is now portable, with no runtime dependencies on the libraries or framework it was trained on. For example, an ONNX model that was originally created and trained in TensorFlow can be served without the TensorFlow runtime. Second, ONNX allows vendors to create high-performing model backends that can optimize and accelerate the model for a specific architecture.
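To make the portability idea concrete, here is a deliberately tiny sketch in plain Python. It is not the real ONNX file format or any ONNX API: the JSON layout, `MODEL_JSON`, `run_model`, and the weights are all hypothetical and chosen only for illustration. Only the operator names (MatMul, Add, Relu) are borrowed from the ONNX operator set. The point it shows is that once a model is written down as a framework-neutral graph of standard operators, any backend that implements those operators can run it, with no dependency on the framework that trained it.

```python
import json

# Toy, framework-neutral model description (an assumption for illustration).
# This is NOT the real ONNX protobuf format; only the operator names
# MatMul, Add, and Relu are borrowed from the ONNX operator set.
MODEL_JSON = json.dumps({
    "inputs": ["x"],
    "outputs": ["y"],
    "initializers": {                      # trained weights travel with the graph
        "W": [[1.0, -1.0], [2.0, 0.5]],
        "b": [[0.5, -4.0]],
    },
    "nodes": [                             # listed in execution order
        {"op": "MatMul", "inputs": ["x", "W"], "output": "xw"},
        {"op": "Add", "inputs": ["xw", "b"], "output": "z"},
        {"op": "Relu", "inputs": ["z"], "output": "y"},
    ],
})


def matmul(a, b):
    # Plain nested-list matrix product: (n x k) times (k x m).
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add(a, b):
    # b is a single row that is broadcast across every row of a.
    return [[v + b[0][j] for j, v in enumerate(row)] for row in a]


def relu(a):
    return [[max(0.0, v) for v in row] for row in a]


# A "backend" is just a set of operator implementations. A vendor could
# replace these with versions tuned for a specific architecture.
OPS = {"MatMul": matmul, "Add": add, "Relu": relu}


def run_model(model_json, feeds):
    """Evaluate the serialized graph node by node using the OPS table."""
    model = json.loads(model_json)
    values = dict(model["initializers"])
    values.update(feeds)
    for node in model["nodes"]:
        args = [values[name] for name in node["inputs"]]
        values[node["output"]] = OPS[node["op"]](*args)
    return [values[name] for name in model["outputs"]]


result = run_model(MODEL_JSON, {"x": [[1.0, 2.0]]})
print(result)  # [[[5.5, 0.0]]]
```

In this sketch, swapping the `OPS` table for a different set of implementations changes how the model runs without touching the model description itself, which is the same separation that lets an ONNX backend optimize for a particular platform.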

For IBM Z and the mission-critical workloads it typically hosts, this combination of portability and optimization makes IBM Z an optimal environment for deploying models. One key example of the use of ONNX is in Watson Machine Learning for z/OS (WMLz), which incorporates an ONNX model compiler technology based on the ONNX-MLIR project. The ONNX model compiler feature of WMLz is focused on deep learning models and produces an executable optimized to run on IBM Z. WMLz allows the user to easily deploy this compiled ONNX model for model serving.

As IBM Z continues to innovate in enterprise AI, ONNX is a key part of IBM’s AI strategy. It allows IBM to build a deployment strategy optimized for the IBM Z architecture, while staying closely aligned with the broader ecosystem.

In August, you may have read that IBM previewed Telum, the next-generation IBM Z processor. IBM is now examining opportunities to exploit the on-chip AI accelerator with the ONNX model compiler.

ONNX is part of the Linux Foundation and has widespread support from numerous key vendors that recognize the value it delivers. IBM is an early adopter of ONNX and contributes upstream to the ONNX project.

Be on the lookout for additional updates on how you can leverage ONNX as part of your IBM Z AI story!  

>> To learn more, you can also read about ONNX and IBM Watson Machine Learning for z/OS here.
