Artificial intelligence (AI) is a profoundly transformative technology because of its broad applicability to many use cases. It already impacts our personal lives, and it is changing the way we work and do business. In this blog we’ll examine AI and its role for clients running IBM Z and IBM LinuxONE workloads. We will cover principles of the IBM Z AI strategy and developments underway around IBM Z’s role as a world-class inference platform. We are developing a blog series to describe key elements of AI on IBM Z and how clients can tap into these capabilities for their next-generation AI applications.
Our mission is to provide a comprehensive and consumable AI experience for operationalizing AI on Z, and this includes the goal of building inference capabilities directly into the platform. In terms of the IBM AI ladder, inference falls on the Analyze and Infuse rungs. Inference refers to the point at which a model is deployed to production and is used by an application to make business predictions.
IBM’s design goal is to enable low-latency inference for time-sensitive work such as in-transaction inference and other real-time or near-real-time workloads. One example is fraud detection: for banks and financial markets, accurate fraud detection can result in significant savings. IBM is architecting optimizations in software and hardware to meet these low-latency goals and to enable clients to integrate AI tightly with IBM Z data and the core business applications that reside on Z. These technologies are designed to let clients embed AI in their applications with minimal application changes.
Target use cases are time-sensitive, with high transaction volumes and complex models that typically require deep learning. In these transactional use cases, a main objective is to reduce latency for faster response time, delivering inference results back to the caller at high volume and speed.
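To make the in-transaction pattern concrete, here is a minimal, illustrative sketch in Python. The model weights, feature names, and threshold are invented for the example (a real model would be trained offline and deployed to the platform); the point is that scoring sits inline in the transaction path and returns a decision with a single function call.

```python
import math

# Hypothetical pretrained logistic-regression parameters, for illustration only.
WEIGHTS = {"amount": 0.0008, "foreign_merchant": 1.9, "hour_of_day": -0.05}
BIAS = -4.0

def fraud_score(txn: dict) -> float:
    """Return a fraud probability for one transaction (sigmoid of a linear score)."""
    z = BIAS + sum(WEIGHTS[k] * txn[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def process_transaction(txn: dict) -> str:
    # Inference is embedded in the transaction flow: score, then route.
    return "REVIEW" if fraud_score(txn) > 0.5 else "APPROVE"

# Example: a large foreign-merchant transaction at 3 a.m. is flagged for review.
txn = {"amount": 12000.0, "foreign_merchant": 1, "hour_of_day": 3}
print(process_transaction(txn))  # → REVIEW
```

In a production deployment the hand-written scoring function would be replaced by an optimized inference runtime, but the application-side shape stays the same: one call per transaction, with latency low enough to sit on the critical path.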
Train anywhere and deploy on Z
IBM recognizes that the AI training landscape is quite different from the inference one. Training is the playground of data scientists, who are focused on improving model accuracy. Data scientists use platforms that may be ideal for training but are not necessarily efficient for deploying models. Our approach enables clients to build and train models on the platform of their choice (including on premises, on Z, or elsewhere in a hybrid cloud), leveraging any investments they have already made. They can then deploy those models to an environment that has transactional and data affinity to the use case – such as transaction processing on Z. That is the heart of our “train anywhere, deploy on Z” strategy.
To support this strategy, IBM is architecting solutions that enable model portability to Z without additional development effort at deployment time. We are investing in ONNX (Open Neural Network Exchange) technology, a standard format for representing AI models that allows a data scientist to build and train a model in the framework of choice without worrying about the downstream inference implications. To enable deployment of ONNX models, we provide an ONNX model compiler that is optimized for IBM Z. In addition, we are optimizing key open-source frameworks such as TensorFlow (and TensorFlow Serving) for use on IBM Z.
To summarize, our mission is to enable clients to easily deploy AI workloads on IBM Z and LinuxONE in order to deliver faster insights and drive more business value. We are enhancing IBM Z as a world-class inference platform. We aim to help clients accelerate deployment of AI on Z by investing in seamless model portability, in integration of AI into Z workloads, and in operationalizing AI with industry-leading solutions such as IBM Cloud Pak for Data for more flexibility and choice in hybrid cloud deployments. Future blog posts will explore several of these AI technologies, including open source, ONNX, TensorFlow, Cloud Pak for Data, and more. Stay tuned to our journey to AI in this blog series.