While open source software has made AI more accessible, affordable and innovative, you need the right level of support to successfully implement these frameworks. With the introduction of an AI Toolkit for IBM Z and LinuxONE, you can use our verified support offering to deploy and accelerate the adoption of popular open source AI frameworks on your z/OS® and LinuxONE platforms.
The AI Toolkit consists of IBM Elite Support and IBM Secure Engineering, which vet and scan open source AI serving frameworks and IBM-certified containers for security vulnerabilities and validate compliance with industry regulations.
Explore the family of AI frameworks in the GitHub community
Use the premium support offered through IBM Elite Support to get expert guidance whenever you need it to successfully deploy open source AI and IBM non-warranted software.
Use the IBM Z Integrated Accelerator for AI to significantly improve the performance of open source and IBM non-warranted AI programs.
Capitalize on both deep learning (DL) and traditional machine learning (ML) approaches to build and serve AI models.
Deliver innovation through open source with the AI Toolkit for IBM Z and LinuxONE.
Run fraud inferencing on digital currency transactions 85% faster by colocating your application with Snap ML on IBM LinuxONE Emperor 4.¹
With IBM z16™ single frame, using the Integrated Accelerator for AI provides 6.8x more throughput for inferencing on biomedical image data with TensorFlow 2.9.1 compared to using IBM z16 single frame alone.²
With IBM z16 multi frame and LinuxONE Emperor 4, using the Integrated Accelerator for AI provides 2.5x more throughput for inferencing on biomedical image data with TensorFlow Serving versus a compared x86 system.³
Run credit card fraud detection with 7x lower response times using the ONNX-MLIR backend for NVIDIA Triton on IBM z16 multi frame and LinuxONE Emperor 4 versus using the ONNX Runtime backend for NVIDIA Triton on a compared x86 server.⁴
Run prediction of customer transactions 3.5x faster by colocating your application with the Snap ML library on IBM z16 multi frame and LinuxONE Emperor 4 versus running prediction remotely using the NVIDIA Forest Inference Library on a compared x86 server.⁵
Reduce costs and complexity while accelerating time to market with lightweight, free-to-download tools and runtime packages.
Seamlessly integrate TensorFlow into your workflow with IBM Z Accelerated for TensorFlow to develop and deploy ML models built on neural networks.
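As a minimal sketch of how this looks in practice: because IBM Z Accelerated for TensorFlow uses the standard TensorFlow API, ordinary Keras code runs unchanged. The example below assumes a TensorFlow build with the ibm-zdnn-plugin available (as in the IBM Z Accelerated for TensorFlow container image); the model, shapes and names are illustrative.

```python
# Standard Keras code, unchanged on IBM Z. Assumes the IBM Z Accelerated
# for TensorFlow container, where the ibm-zdnn-plugin routes eligible
# operations to the Integrated Accelerator for AI.
import numpy as np
import tensorflow as tf

# Toy data standing in for a real feature set (hypothetical shapes).
x_train = np.random.rand(1024, 32).astype("float32")
y_train = np.random.randint(0, 2, size=(1024, 1))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=64)

# Save in SavedModel format for serving with TensorFlow Serving or Triton.
model.save("fraud_model")
```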
Use IBM Z Accelerated for NVIDIA Triton Inference Server to streamline and standardize AI inference by deploying ML or DL models from any framework on any GPU- or CPU-based infrastructure.
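A minimal client sketch against a running Triton instance, using the tritonclient Python package: the model name "fraud_model", the tensor names and the default HTTP port 8000 are assumptions for illustration, not part of the product documentation.

```python
# Query a Triton Inference Server over HTTP. Assumes a model named
# "fraud_model" is already loaded and Triton listens on localhost:8000;
# tensor names and shapes are illustrative.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# One row of 32 float32 features (hypothetical input tensor name).
infer_input = httpclient.InferInput("input_1", [1, 32], "FP32")
infer_input.set_data_from_numpy(np.random.rand(1, 32).astype("float32"))

response = client.infer(model_name="fraud_model", inputs=[infer_input])
print(response.as_numpy("dense_1"))  # hypothetical output tensor name
```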
Harness the benefits of TensorFlow Serving, a flexible, high-performance serving system, with IBM Z Accelerated for TensorFlow Serving to help deploy ML models in production.
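TensorFlow Serving exposes a REST predict endpoint; here is a hedged sketch of calling it, assuming the SavedModel above is served under the name "fraud_model" on the default REST port 8501.

```python
# Call TensorFlow Serving's REST predict API. Host, port and model name
# are assumptions for illustration.
import json
import requests

payload = {"instances": [[0.1] * 32]}  # one row of 32 features
url = "http://localhost:8501/v1/models/fraud_model:predict"

response = requests.post(url, data=json.dumps(payload))
response.raise_for_status()
print(response.json()["predictions"])
```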
Convert ML models into code that can run on z/OS or LinuxONE with the help of the IBM Z Deep Learning Compiler (IBM zDLC).
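Once zDLC has compiled an ONNX model into a shared library, that library can be invoked from Python through the PyRuntime interface that ships with onnx-mlir, on which zDLC is built. A minimal sketch, assuming a compiled model.so and a matching input shape; note that the session class name has varied across onnx-mlir releases, so check the version you have installed.

```python
# Run inference against a zDLC-compiled model. Assumes "model.so" was
# produced by the IBM Z Deep Learning Compiler from an ONNX model and
# that onnx-mlir's PyRuntime module is on the Python path.
import numpy as np
from PyRuntime import OMExecutionSession  # older releases name this ExecutionSession

session = OMExecutionSession("model.so")

# Input dtype and shape must match what the ONNX model was compiled with.
inputs = [np.random.rand(1, 32).astype("float32")]
outputs = session.run(inputs)  # returns a list of numpy arrays
print(outputs[0])
```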
Use IBM Z Accelerated for Snap ML to build and deploy ML models with Snap ML, an IBM non-warranted program that optimizes the training and scoring of popular ML models.
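Snap ML mirrors the scikit-learn estimator API, so it can usually be swapped in with few code changes. A minimal sketch, assuming the snapml package is installed and using illustrative data shapes:

```python
# Train and score a Snap ML random forest via its scikit-learn-style API.
import numpy as np
from snapml import RandomForestClassifier

# Synthetic stand-in data (hypothetical shapes).
X = np.random.rand(1000, 20).astype(np.float32)
y = np.random.randint(0, 2, size=1000)

clf = RandomForestClassifier(n_estimators=100, n_jobs=4)  # n_jobs: CPU threads
clf.fit(X, y)
print(clf.predict(X[:5]))
```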
Find out how you can use a scalable and consistent AI solution to detect, prevent and address fraud.
Explore how to use AI applications to not only identify various money laundering patterns but also prevent them from happening in real time.
Discover how to get uncompromised model accuracy and very low latency when integrating inferencing into transaction processing and data serving applications where fast response times matter.
Learn how the AI Toolkit can help you deploy your AI models on z/OS for real-time business insights at scale.
Find out how the AI Toolkit provides a do-it-yourself (DIY) approach to model creation and serving on Linux® on Z and LinuxONE.
All links reside outside ibm.com
¹ DISCLAIMER: Performance results based on IBM internal tests doing inferencing using a Scikit-learn Random Forest model with the Snap ML v1.9.0 (tech preview) backend on IBM LinuxONE Emperor 4 and with the Scikit-learn v1.0.2 backend on a compared x86 server. The model was trained on the following public dataset: https://www.kaggle.com/datasets/ellipticco/elliptic-data-set. BentoML v0.13.1 (https://github.com/bentoml/BentoML) was used as the model serving framework on both platforms. IBM LinuxONE Emperor 4 configuration: Ubuntu 20.04 in an LPAR with 2 dedicated cores, 256 GB memory. x86 configuration: Ubuntu 20.04 on 9 Ice Lake Intel® Xeon® Gold 6342 CPU @ 2.80 GHz with Hyper-Threading turned on, 1 TB memory.
² DISCLAIMER: Performance results based on IBM internal tests running TensorFlow 2.9.1 with the IBM-zdnn-plugin (https://ibm.github.io/ibm-z-oss-hub/containers/index.html) for inferencing doing semantic segmentation on medical images (https://github.com/karolzak/keras-unet#usage-examples). Tests were run locally by sending 30 images at a time, comparing TensorFlow 2.9.1 running on 5 cores on a single chip versus running on 5 cores on a single chip while using the Integrated Accelerator for AI via the IBM-zdnn-plugin. IBM Machine Type 3932 configuration: 1 LPAR configured with 10 dedicated IFLs, 128 GB memory, Ubuntu 22.04. Results may vary.
³ DISCLAIMER: Performance results based on IBM internal tests running TensorFlow Serving 2.12.0 with the IBM-zdnn-plugin (https://ibm.github.io/ibm-z-oss-hub/containers/index.html) for inferencing doing semantic segmentation on medical images (https://github.com/karolzak/keras-unet#usage-examples). Tests were run remotely using the wrk workload driver (https://github.com/wg/wrk), sending single images against TensorFlow Serving 2.12.0. IBM Machine Type 3931 configuration: 1 LPAR configured with 12 dedicated IFLs, 128 GB memory, Ubuntu 22.04. x86 configuration: Ubuntu 22.04 on 12 Ice Lake Intel® Xeon® Gold CPU @ 2.80 GHz with Hyper-Threading turned on, 1 TB memory. Results may vary.
⁴ DISCLAIMER: Performance results based on IBM internal tests doing inferencing using NVIDIA Triton with the ONNX-MLIR backend (https://github.com/IBM/onnxmlir-triton-backend) on IBM Machine Type 3931 versus using the ONNX Runtime backend for NVIDIA Triton on a compared x86 server. The credit card fraud detection (CCFD) model was trained on a synthetic dataset. NVIDIA Triton 23.05 (https://github.com/triton-inference-server/server) was used as the model serving framework on both platforms and driven via the gRPC benchmarking tool ghz (https://github.com/bojand/ghz). IBM Machine Type 3931 configuration: Ubuntu 22.04 in an LPAR with 6 dedicated IFLs, 128 GB memory. x86 configuration: Ubuntu 22.04 on 2x 24 Ice Lake Intel® Xeon® Gold CPU @ 2.80 GHz with Hyper-Threading turned on, 1 TB memory.
⁵ DISCLAIMER: Performance results based on IBM internal tests doing inferencing using a Random Forest model with the Snap ML v1.12.0 backend, which uses the Integrated Accelerator for AI, on IBM Machine Type 3931 versus the NVIDIA Forest Inference Library backend (https://github.com/triton-inference-server/fil_backend) on a compared x86 server. The model was trained on the following public dataset: https://www.kaggle.com/c/santander-customer-transaction-prediction. NVIDIA Triton™ (https://github.com/triton-inference-server/server) was used as the model serving framework on both platforms. The workload was driven via the HTTP benchmarking tool hey (https://github.com/rakyll/hey). IBM Machine Type 3931 configuration: Ubuntu 22.04 in an LPAR with 6 dedicated IFLs, 256 GB memory. x86 configuration: Ubuntu 22.04 on 6 Ice Lake Intel® Xeon® Gold CPU @ 2.80 GHz with Hyper-Threading turned on, 1 TB memory.