We have seen that the explosion in interest and adoption of AI has led IT leaders to revisit their capacity plans. They are seeing the need for increasing compute resources at a scale that has rarely been observed in the past.

As businesses look to adopt AI pervasively across their organization to both improve operational productivity and efficiency and to create new business value, they are having to look for new ways to meet the need for increased compute resources.

Many enterprises are realizing that to maximize the value of their AI investments, infrastructure needs to scale rapidly and efficiently to deliver AI insights where and when they need them. Also, the infrastructure must deliver the security and resiliency they need for mission-critical workloads.

In 2022, IBM introduced the groundbreaking IBM® z16™, featuring an on-chip AI inference accelerator designed to speed up AI model execution and deliver real-time insights for the most demanding business workloads. IBM z16 can process up to 3.5 million inference requests per second with 1 ms response time using a credit card fraud detection model.1
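As a back-of-envelope sketch (not an IBM calculation), the quoted per-second rate can be related to the batch size in benchmark note 1 and to a daily capacity figure:

```python
# Back-of-envelope check of the quoted throughput figures (assumptions:
# 3.5M requests/s and a batch size of 128, both from benchmark note 1).
REQ_PER_SEC = 3.5e6       # quoted inference requests per second
BATCH_SIZE = 128          # inference operations per batch in the benchmark
SECONDS_PER_DAY = 86_400

batches_per_sec = REQ_PER_SEC / BATCH_SIZE
req_per_day = REQ_PER_SEC * SECONDS_PER_DAY

print(f"{batches_per_sec:,.0f} batches/s")       # ~27,344 batches per second
print(f"{req_per_day / 1e9:.0f}B requests/day")  # ~302B per day
```

The daily figure is in line with the 300 billion requests per day cited for the clearing-and-settlement use case below.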

Today, we are announcing the AI Bundle for IBM Z® and LinuxONE. It builds on the success of IBM z16 and LinuxONE 4 by providing both new dedicated capacity for AI workloads and a highly optimized software stack.

What is the AI Bundle for IBM Z and LinuxONE?

The AI Bundle for IBM Z and LinuxONE is dedicated AI hardware infrastructure with an optimized core software stack, allowing clients to pursue their AI journey with streamlined AI deployment on IBM Z and LinuxONE. It brings the advantages of dedicated hardware: enterprises retain control over their infrastructure and data, manage processes within their own data center environment, and gain business insights.

Distinguishing features

With a curated suite of AI software (AI Toolkit for IBM Z and LinuxONE and IBM Cloud Pak® for Data on IBM Z and LinuxONE), clients can manage AI model lifecycles in one place, allowing for quick deployment of a wide range of use cases.

Leveraging the IBM Telum® processor with the Integrated Accelerator for AI, enterprises can run inferencing for high-volume workloads at scale. For digital currency transactions, fraud inferencing runs 85% faster when the application is colocated with Snap ML on IBM z16 than when inferencing runs remotely using Scikit-learn on a compared x86 server.2
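The colocation advantage comes from removing the network hop from each scoring call. A minimal, stdlib-only sketch of the two call patterns, where `model` and `endpoint` are stand-ins rather than the benchmarked Snap ML and Scikit-learn stack:

```python
# Illustrative sketch only: `model` and `endpoint` are hypothetical
# stand-ins, not the benchmarked Snap ML / Scikit-learn deployment.
import json
import urllib.request

def score_local(model, features):
    # Colocated: the model runs in the same address space as the
    # application, so each request costs only model execution time.
    return model(features)

def score_remote(endpoint, features):
    # Remote: each request additionally pays for serialization plus a
    # network round trip to the scoring server.
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(features).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# A trivial threshold function standing in for a real fraud classifier:
is_fraud = score_local(lambda f: sum(f) > 1.0, [0.6, 0.7])
print(is_fraud)  # True
```

At millions of requests per second, even sub-millisecond per-call network overhead dominates, which is why the in-process path scales so much better.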

A wide range of use cases can be implemented with the AI Bundle software components for the latest IBM hardware platforms:

  1. Claims fraud: A state government in the US realized that its process for identifying fraudulent claims was manual, labor-intensive, and could take up to 40 hours per case. In support of this use case, IBM has demonstrated that on IBM z16 with the Integrated Accelerator for AI, adding in-transaction fraud detection to OLTP workloads resulted in only 2 ms of additional response time.3
  2. Clearing and settlement: A card processor explored using AI to determine which trades or transactions carry high-risk exposure before settlement, to reduce liability, chargebacks and costly investigation. In support of this use case, IBM has validated that IBM z16 is designed to score business transactions at scale, delivering the capacity to process up to 300 billion deep inferencing requests per day with 1 ms of latency.4
  3. Anti-money laundering (AML): A large European bank needed to introduce AML screening into its instant payments operational flow because its existing end-of-day AML screening was no longer sufficient under stricter regulations. In support of this use case, IBM has demonstrated that IBM z16 with the Integrated Accelerator for AI delivers a 4x faster response time than a compared IBM z15® when both run equivalent OLTP workloads with batched fraud detection.5
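Benchmark note 5 below describes batching the 32 most recent transactions for a credit card before scoring. A minimal stdlib sketch of that sliding-window pattern (class and field names here are hypothetical, not from the benchmarked application):

```python
from collections import deque

WINDOW = 32  # most recent transactions batched per card (see note 5)

class TransactionWindow:
    """Sliding window of recent transactions per card, so each new
    transaction is scored together with its recent history."""

    def __init__(self, size=WINDOW):
        self.size = size
        self.history = {}  # card_id -> deque of recent transactions

    def add(self, card_id, txn):
        window = self.history.setdefault(card_id, deque(maxlen=self.size))
        window.append(txn)
        return list(window)  # the batch handed to the fraud model

win = TransactionWindow()
for amount in range(40):
    batch = win.add("card-123", {"amount": amount})
print(len(batch))  # 32: older transactions have aged out of the window
```

Using `deque(maxlen=...)` keeps the window capped automatically: appending the 33rd transaction silently evicts the oldest one.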

Get started today

The AI Bundle for IBM Z and LinuxONE will be generally available from IBM and certified Business Partners on 26 April 2024.

In addition, IBM offers a no-charge AI on IBM Z and LinuxONE Discovery Workshop. This workshop is a great starting point: it can help you evaluate potential use cases, define a project plan, and leverage the AI Bundle for IBM Z and LinuxONE effectively. To learn more or to schedule a workshop on any of the AI use cases or products, email us at aionz@us.ibm.com.


1 DISCLAIMER: The performance result is extrapolated from IBM internal tests running local inference operations in an IBM z16 LPAR with 48 IFLs and 128 GB memory on Ubuntu 20.04 (SMT mode) using a synthetic credit card fraud detection model exploiting the Integrated Accelerator for AI. The benchmark was running with 8 parallel threads each pinned to the first core of a different chip. The lscpu command was used to identify the core-chip topology. A batch size of 128 inference operations was used. Results were also reproduced using a z/OS® V2R4 LPAR with 24 CPs and 256 GB memory on IBM z16. The same credit card fraud detection model was used. The benchmark was executed with a single thread performing inference operations. A batch size of 128 inference operations was used. Results may vary.

2 DISCLAIMER: Performance results based on IBM internal tests doing inferencing using a Scikit-learn Random Forest model with Snap ML v1.9.0 (tech preview) backend on IBM z16 and with Scikit-learn v1.0.2 backend on compared x86 server. The model was trained on the following public data set: https://www.kaggle.com/datasets/ellipticco/elliptic-data-set. BentoML v0.13.1 was used on both platforms as a model serving framework. IBM z16 configuration: Ubuntu 20.04 in an LPAR with 2 dedicated IFLs, 256 GB memory. x86 configuration: Ubuntu 20.04 on 9 IceLake Intel® Xeon® Gold CPU @ 2.80 GHz with hyperthreading turned on, 1 TB memory.

3 DISCLAIMER: Performance results were extrapolated from IBM internal tests running an OLTP workload with credit card transactions using the credit card fraud detection model on IBM z16 vs running it without credit card fraud detection. IBM z16 configuration: Ubuntu 20.04 in an LPAR with 12 dedicated IFLs, 256 GB memory and IBM FlashSystem® 9200 storage. System utilization was above 70% in both cases. Results may vary.

4 DISCLAIMER: Performance result is extrapolated from IBM internal tests running local inference operations in an IBM z16 LPAR with 48 IFLs and 128 GB memory on Ubuntu 20.04 (SMT mode) using a synthetic credit card fraud detection model (https://github.com/IBM/ai-on-z-fraud-detection) exploiting the Integrated Accelerator for AI. The benchmark was running with 8 parallel threads each pinned to the first core of a different chip. The lscpu command was used to identify the core-chip topology. A batch size of 128 inference operations was used. Results were also reproduced using a z/OS V2R4 LPAR with 24 CPs and 256 GB memory on IBM z16. The same credit card fraud detection model was used. The benchmark was executed with a single thread performing inference operations. A batch size of 128 inference operations was used. Results may vary.

5 DISCLAIMER: Performance results based on IBM internal tests running online transaction processing (OLTP) credit card workloads with in-transaction fraud detection (https://github.com/IBM/ai-on-z-fraud-detection). On IBM z16 A01 and z15 T01, both systems ran z/OS® 2.4, had 4 central processors, 8 z Systems® Integrated Information Processors (zIIPs) with simultaneous multithreading 2, and 16 GB memory. Inferencing was done in IBM Watson® Machine Learning for z/OS Online Scoring Community Edition v1.0.0 in a single IBM z/OS Container Extensions (zCX) container. zCX was version V2R4 with APAR OA59865. The application ran in CICS v5.4 on WebSphere® Application Server v8.5 Liberty with Java 8.0.6.20 and IBM Enterprise COBOL for z/OS 6.2.0 P190522. The database for the application was a colocated Db2 for z/OS v12. The workload driver, JMeter, was based on an initial workload that targeted 10,000 transactions per second on z15 without fraud detection. This same driver configuration was then used with fraud detection on both systems, where the 32 most recent transactions for that credit card were batched client-side for fraud detection.
