While open source software has made AI more accessible, affordable and innovative, you need the right level of support to successfully implement these frameworks. With the introduction of an AI Toolkit for IBM Z and LinuxONE, you can use our verified support offering to deploy and accelerate the adoption of popular open source AI frameworks on your z/OS® and LinuxONE platforms.
The AI Toolkit consists of IBM Elite Support and IBM Secure Engineering, which vet and scan open source AI serving frameworks and IBM-certified containers for security vulnerabilities and validate compliance with industry regulations.
Explore the family of AI frameworks in the GitHub community
Use the premium support offered through IBM Elite Support to get expert guidance whenever you need it to successfully deploy open source AI and IBM non-warranted software.
Use the IBM Z Integrated Accelerator for AI to significantly improve the performance of open source and IBM non-warranted AI programs.
Capitalize on both deep learning (DL) and traditional machine learning (ML) approaches to build and serve AI models.
Deliver innovation through open source with the AI Toolkit for IBM Z and LinuxONE.
Run fraud inferencing on digital currency transactions 85% faster by colocating your application with Snap ML on IBM LinuxONE Emperor 4.¹
With IBM z16™ single frame, using the Integrated Accelerator for AI provides 6.8x more throughput for inferencing on biomedical image data with TensorFlow 2.9.1 compared to using IBM z16 single frame alone.²
With IBM z16 multi frame and LinuxONE Emperor 4, using the Integrated Accelerator for AI provides 2.5x more throughput for inferencing on biomedical image data with TensorFlow Serving versus a compared x86 system.³
Run credit card fraud detection with 7x lower response times using the ONNX-MLIR backend for NVIDIA Triton on IBM z16 multi frame and LinuxONE Emperor 4 versus using the ONNX Runtime backend for NVIDIA Triton on a compared x86 server.⁴
Run prediction of customer transactions 3.5x faster by colocating your application with the Snap ML library on IBM z16 multi frame and LinuxONE Emperor 4 versus running prediction remotely using the NVIDIA Forest Inference Library on a compared x86 server.⁵
Reduce costs and complexity while accelerating time to market with lightweight, free-to-download tools and runtime packages.
Seamlessly integrate TensorFlow into your workflow with IBM Z Accelerated for TensorFlow to develop and deploy ML models built on neural networks.
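As a minimal sketch of how this looks in practice: because IBM Z Accelerated for TensorFlow uses the standard TensorFlow API, ordinary Keras code runs unchanged. The example below assumes a TensorFlow build with the ibm-zdnn-plugin available (as in the IBM Z Accelerated for TensorFlow container image); the model, shapes and names are illustrative.

```python
# Standard Keras code, unchanged on IBM Z. Assumes the IBM Z Accelerated
# for TensorFlow container, where the ibm-zdnn-plugin routes eligible
# operations to the Integrated Accelerator for AI.
import numpy as np
import tensorflow as tf

# Toy data standing in for a real feature set (hypothetical shapes).
x_train = np.random.rand(1024, 32).astype("float32")
y_train = np.random.randint(0, 2, size=(1024, 1))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=64)

# Save in SavedModel format for serving with TensorFlow Serving or Triton.
model.save("fraud_model")
```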
Use IBM Z Accelerated for NVIDIA Triton Inference Server to streamline and standardize AI inference by deploying ML or DL models from any framework on any GPU- or CPU-based infrastructure.
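A minimal client sketch against a running Triton instance, using the tritonclient Python package: the model name "fraud_model", the tensor names and the default HTTP port 8000 are assumptions for illustration, not part of the product documentation.

```python
# Query a Triton Inference Server over HTTP. Assumes a model named
# "fraud_model" is already loaded and Triton listens on localhost:8000;
# tensor names and shapes are illustrative.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# One row of 32 float32 features (hypothetical input tensor name).
infer_input = httpclient.InferInput("input_1", [1, 32], "FP32")
infer_input.set_data_from_numpy(np.random.rand(1, 32).astype("float32"))

response = client.infer(model_name="fraud_model", inputs=[infer_input])
print(response.as_numpy("dense_1"))  # hypothetical output tensor name
```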
Harness the benefits of TensorFlow Serving, a flexible, high-performance serving system, with IBM Z Accelerated for TensorFlow Serving to help deploy ML models in production.
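TensorFlow Serving exposes a REST predict endpoint; here is a hedged sketch of calling it, assuming the SavedModel above is served under the name "fraud_model" on the default REST port 8501.

```python
# Call TensorFlow Serving's REST predict API. Host, port and model name
# are assumptions for illustration.
import json
import requests

payload = {"instances": [[0.1] * 32]}  # one row of 32 features
url = "http://localhost:8501/v1/models/fraud_model:predict"

response = requests.post(url, data=json.dumps(payload))
response.raise_for_status()
print(response.json()["predictions"])
```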
Convert ML models into code that can run on z/OS or LinuxONE with the help of the IBM Z Deep Learning Compiler (IBM zDLC).
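Once zDLC has compiled an ONNX model into a shared library, that library can be invoked from Python through the PyRuntime interface that ships with onnx-mlir, on which zDLC is built. A minimal sketch, assuming a compiled model.so and a matching input shape; note that the session class name has varied across onnx-mlir releases, so check the version you have installed.

```python
# Run inference against a zDLC-compiled model. Assumes "model.so" was
# produced by the IBM Z Deep Learning Compiler from an ONNX model and
# that onnx-mlir's PyRuntime module is on the Python path.
import numpy as np
from PyRuntime import OMExecutionSession  # older releases name this ExecutionSession

session = OMExecutionSession("model.so")

# Input dtype and shape must match what the ONNX model was compiled with.
inputs = [np.random.rand(1, 32).astype("float32")]
outputs = session.run(inputs)  # returns a list of numpy arrays
print(outputs[0])
```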
Use IBM Z Accelerated for Snap ML to build and deploy ML models with Snap ML, an IBM non-warranted program that optimizes the training and scoring of popular ML models.
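Snap ML mirrors the scikit-learn estimator API, so it can usually be swapped in with few code changes. A minimal sketch, assuming the snapml package is installed and using illustrative data shapes:

```python
# Train and score a Snap ML random forest via its scikit-learn-style API.
import numpy as np
from snapml import RandomForestClassifier

# Synthetic stand-in data (hypothetical shapes).
X = np.random.rand(1000, 20).astype(np.float32)
y = np.random.randint(0, 2, size=1000)

clf = RandomForestClassifier(n_estimators=100, n_jobs=4)  # n_jobs: CPU threads
clf.fit(X, y)
print(clf.predict(X[:5]))
```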
Find out how you can use a scalable and consistent AI solution to detect, prevent and address fraud.
Explore how to use AI applications to not only identify various money laundering patterns but also prevent them from happening in real time.
Discover how to get uncompromised model accuracy and very low latency when integrating inferencing into transaction processing and data serving applications where fast response times matter.
Learn how the AI Toolkit can help you deploy your AI models on z/OS for real-time business insights at scale.
Find out how the AI Toolkit provides a do-it-yourself (DIY) approach to model creation and serving on Linux® on Z and LinuxONE.
All links reside outside ibm.com
¹ DISCLAIMER: Performance results based on IBM internal tests doing inferencing using a Scikit-learn Random Forest model with the Snap ML v1.9.0 (tech preview) backend on IBM LinuxONE Emperor 4 and with the Scikit-learn v1.0.2 backend on a compared x86 server. The model was trained on the following public dataset: https://www.kaggle.com/datasets/ellipticco/elliptic-data-set. BentoML v0.13.1 (https://github.com/bentoml/BentoML) was used as the model serving framework on both platforms. IBM LinuxONE Emperor 4 configuration: Ubuntu 20.04 in an LPAR with 2 dedicated cores, 256 GB memory. x86 configuration: Ubuntu 20.04 on 9 Ice Lake Intel® Xeon® Gold 6342 CPU @ 2.80 GHz with Hyper-Threading turned on, 1 TB memory.
² DISCLAIMER: Performance results based on IBM internal tests running TensorFlow 2.9.1 with the IBM-zdnn-plugin (https://ibm.github.io/ibm-z-oss-hub/containers/index.html) for inferencing doing semantic segmentation on medical images (https://github.com/karolzak/keras-unet#usage-examples). Tests were run locally by sending 30 images at a time, comparing TensorFlow 2.9.1 running on 5 cores on a single chip versus running on 5 cores on a single chip while using the Integrated Accelerator for AI via the IBM-zdnn-plugin. IBM Machine Type 3932 configuration: 1 LPAR configured with 10 dedicated IFLs, 128 GB memory, Ubuntu 22.04. Results may vary.
³ DISCLAIMER: Performance results based on IBM internal tests running TensorFlow Serving 2.12.0 with the IBM-zdnn-plugin (https://ibm.github.io/ibm-z-oss-hub/containers/index.html) for inferencing doing semantic segmentation on medical images (https://github.com/karolzak/keras-unet#usage-examples). Tests were run remotely using the wrk workload driver (https://github.com/wg/wrk), sending single images against TensorFlow Serving 2.12.0. IBM Machine Type 3931 configuration: 1 LPAR configured with 12 dedicated IFLs, 128 GB memory, Ubuntu 22.04. x86 configuration: Ubuntu 22.04 on 12 Ice Lake Intel® Xeon® Gold CPU @ 2.80 GHz with Hyper-Threading turned on, 1 TB memory. Results may vary.
⁴ DISCLAIMER: Performance results based on IBM internal tests doing inferencing using NVIDIA Triton with the ONNX-MLIR backend (https://github.com/IBM/onnxmlir-triton-backend) on IBM Machine Type 3931 versus using the ONNX Runtime backend for NVIDIA Triton on a compared x86 server. The credit card fraud detection (CCFD) model was trained on a synthetic dataset. NVIDIA Triton 23.05 (https://github.com/triton-inference-server/server) was used as the model serving framework on both platforms and driven via the gRPC benchmarking tool ghz (https://github.com/bojand/ghz). IBM Machine Type 3931 configuration: Ubuntu 22.04 in an LPAR with 6 dedicated IFLs, 128 GB memory. x86 configuration: Ubuntu 22.04 on 2x 24 Ice Lake Intel® Xeon® Gold CPU @ 2.80 GHz with Hyper-Threading turned on, 1 TB memory.
⁵ DISCLAIMER: Performance results based on IBM internal tests doing inferencing using a Random Forest model with the Snap ML v1.12.0 backend, which uses the Integrated Accelerator for AI, on IBM Machine Type 3931 versus the NVIDIA Forest Inference Library backend (https://github.com/triton-inference-server/fil_backend) on a compared x86 server. The model was trained on the following public dataset: https://www.kaggle.com/c/santander-customer-transaction-prediction. NVIDIA Triton™ (https://github.com/triton-inference-server/server) was used as the model serving framework on both platforms. The workload was driven via the HTTP benchmarking tool hey (https://github.com/rakyll/hey). IBM Machine Type 3931 configuration: Ubuntu 22.04 in an LPAR with 6 dedicated IFLs, 256 GB memory. x86 configuration: Ubuntu 22.04 on 6 Ice Lake Intel® Xeon® Gold CPU @ 2.80 GHz with Hyper-Threading turned on, 1 TB memory.