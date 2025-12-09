AI Optimizer for Z provides real-time monitoring for gen AI workloads, using Prometheus for metric collection and Grafana for visualization. It tracks key metrics such as token throughput, latency per request, cache hit ratio, time-to-first-token and memory utilization. Hardware usage metrics, such as GPU/accelerator utilization, are planned.
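
To make the metrics above concrete, here is a minimal Python sketch of how token throughput, average latency per request and cache hit ratio could be derived from two successive scrapes of cumulative counters. The `Snapshot` fields and the `derive_metrics` function are hypothetical names chosen for illustration; they are not AI Optimizer's actual metric names or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    """One scrape of cumulative counters (field names are assumptions)."""
    timestamp_s: float      # scrape time, in seconds
    generated_tokens: int   # total tokens generated so far
    requests: int           # total requests completed so far
    latency_sum_s: float    # sum of all request latencies so far, in seconds
    cache_hits: int         # total cache hits so far
    cache_lookups: int      # total cache lookups so far


def derive_metrics(prev: Snapshot, curr: Snapshot) -> dict[str, Optional[float]]:
    """Turn two counter snapshots into rates and ratios for the interval."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("snapshots must be in increasing time order")

    # Counters only grow, so the difference is the activity in the interval.
    d_tokens = curr.generated_tokens - prev.generated_tokens
    d_requests = curr.requests - prev.requests
    d_latency = curr.latency_sum_s - prev.latency_sum_s
    d_hits = curr.cache_hits - prev.cache_hits
    d_lookups = curr.cache_lookups - prev.cache_lookups

    return {
        "tokens_per_second": d_tokens / elapsed,
        # None signals "no data in this interval" instead of dividing by zero.
        "avg_latency_s": d_latency / d_requests if d_requests else None,
        "cache_hit_ratio": d_hits / d_lookups if d_lookups else None,
    }


# Example: 12,000 tokens and 40 requests over a 60-second interval.
prev = Snapshot(0.0, 0, 0, 0.0, 0, 0)
curr = Snapshot(60.0, 12000, 40, 50.0, 30, 40)
metrics = derive_metrics(prev, curr)
print(metrics)
```

In a Prometheus and Grafana setup, this kind of interval arithmetic is typically done by the query layer rather than by hand; the sketch only illustrates what the resulting numbers mean.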

AI Optimizer can integrate with the OpenTelemetry (OTel) Collector when the collector is configured with a Prometheus receiver. This enables telemetry ingestion and interoperability for unified observability across hybrid environments. The resulting insights help organizations make informed decisions on capacity planning, workload routing, performance monitoring and infrastructure optimization, which helps avoid over-provisioning, reduce costs and improve overall performance.
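
As a rough illustration of the OTel integration, the Python sketch below builds the structure of a collector configuration that uses the Prometheus receiver to scrape a metrics endpoint. The `build_collector_config` helper, the job name and the `localhost:8000` target are assumptions for illustration only; the actual endpoint exposed by AI Optimizer is not stated here.

```python
import json


def build_collector_config(target: str, job_name: str = "ai-optimizer") -> dict:
    """Return an OTel Collector config layout with a Prometheus receiver.

    `target` is the host:port of the metrics endpoint to scrape
    (a hypothetical value in this sketch).
    """
    return {
        "receivers": {
            "prometheus": {
                # The receiver embeds a Prometheus-style scrape configuration.
                "config": {
                    "scrape_configs": [
                        {
                            "job_name": job_name,
                            "scrape_interval": "15s",
                            "static_configs": [{"targets": [target]}],
                        }
                    ]
                }
            }
        },
        # The debug exporter just logs received telemetry; a real deployment
        # would export to an observability backend instead.
        "exporters": {"debug": {}},
        "service": {
            "pipelines": {
                "metrics": {
                    "receivers": ["prometheus"],
                    "exporters": ["debug"],
                }
            }
        },
    }


config = build_collector_config("localhost:8000")
print(json.dumps(config, indent=2))
```

The collector normally reads this structure from a YAML file; the dictionary mirrors that layout so the relationship between the receiver, the exporter and the metrics pipeline is easy to see.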