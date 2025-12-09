Designed to Scale and Optimize GenAI Inferencing
IBM AI Optimizer for Z delivers high‑performance, policy‑driven AI inferencing directly on IBM Z, engineered to meet the demands of GenAI at enterprise scale. Powered by the IBM Spyre™ accelerator, it brings low‑latency, high‑throughput, and security‑rich model execution to the platform that runs the world’s most mission‑critical workloads.
As GenAI reshapes business strategy, organizations running on IBM Z face a clear mandate: scale AI efficiently, securely, and without runaway infrastructure cost. AI Optimizer for Z 2.1 addresses this by optimizing inferencing on IBM Z, where data and transactions already live, reducing time‑to‑value while eliminating inefficiencies that slow down AI adoption.
AI Optimizer for Z is available in two editions: Advanced Edition and Essentials Edition.
AI Optimizer for Z 2.1 Essentials Edition extends its automation capabilities to cover the installation of IBM watsonx Assistant for Z 3.1 and IBM Software Hub 5.2.
Gain full visibility into GenAI inferencing across IBM Z with enterprise‑grade observability. Built‑in Prometheus and Grafana dashboards provide deep insights into:
This transparency helps eliminate over‑provisioning, streamline capacity planning, and drive smarter infrastructure investment.
AI Optimizer for Z 2.1 introduces a staged caching model to accelerate GenAI inferencing:
AI Optimizer registers models running on Spyre for optimization. Users can configure their own routing strategies or rely on the built‑in intelligent router, which considers performance, availability, and usage patterns. Semantic tagging lets models be grouped for use‑case‑aligned routing, which gives more flexibility in how inferencing requests are directed.
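As a rough illustration of the tag-based routing idea described above, the following Python sketch picks a model from a small in-memory registry. Everything in it is an assumption made for illustration: the Model record, the route function, and the scoring formula are hypothetical and do not represent the AI Optimizer for Z API or its built‑in router.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """Hypothetical record for a registered model (not the product's schema)."""
    name: str
    tags: set[str] = field(default_factory=set)  # semantic tags, e.g. "chat"
    available: bool = True                        # availability signal
    avg_latency_ms: float = 0.0                   # performance signal
    active_requests: int = 0                      # usage signal


def route(models: list[Model], tag: str) -> Model:
    """Return the available model carrying `tag` with the lowest score."""
    candidates = [m for m in models if tag in m.tags and m.available]
    if not candidates:
        raise LookupError(f"no available model tagged {tag!r}")
    # Illustrative score only: average latency scaled by current load.
    return min(candidates, key=lambda m: m.avg_latency_ms * (1 + m.active_requests))


# Example registry with made-up names and numbers.
registry = [
    Model("model-a", {"summarization"}, True, 120.0, 3),
    Model("model-b", {"summarization", "chat"}, True, 200.0, 0),
    Model("model-c", {"summarization"}, False, 50.0, 0),
]

chosen = route(registry, "summarization")
print(chosen.name)  # model-b: model-a is busier, model-c is unavailable
```

In this sketch the tag narrows the candidate group first, and the performance, availability, and usage signals only decide among models within that group. A user-defined strategy could be modeled by swapping in a different scoring function.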
Models deployed outside IBM Z or LinuxONE can be registered, tagged, grouped, and monitored along with on‑platform models. This provides a unified operational view of GenAI inferencing across hybrid environments, ensuring consistency in governance and performance tracking.
AI Optimizer for Z automates the installation and configuration of key IBM Z GenAI components and products, such as IBM watsonx Assistant for Z, for fast and reliable setup. It also validates the infrastructure and provides a health dashboard for monitoring. Together, these reduce complexity and shorten time to production.
When AI Optimizer for Z is paired with IBM watsonx Assistant for Z on the IBM Spyre accelerator, enterprises get application optimization and inference optimization working together. AI Optimizer routes, caches, and scales every query, inference, and model call for efficiency, while the Assistant delivers natural, conversational engagement with customers and employees. Running on Spyre’s high-performance, energy-efficient architecture, the two together enable faster responses, lower latency, and end-to-end visibility, supporting AI-powered customer interactions at enterprise scale.
