A distributed platform enables hybrid computing
Updated March 2026
In Progress
2026
Accelerate the development and delivery of AI across the hybrid cloud
The shift in AI innovation from models to AI systems that combine models with traditional components, together with the applications built on top of these systems, represents a novel paradigm. Standards are emerging around these new concepts and interactions. We can view these trends as the beginning of a new application and protocol layer (layer 8) that drives the evolution of middleware and platforms. This creates a need to build an "operating system" for AI (AIOS) to manage and provide common services to this class of applications.
AI innovation also intensifies the need to operate such non-deterministic systems at scale while integrating them into existing ecosystems. A wave of new security threats is also emerging, stemming from cloud-specific agent indeterminacy in distributed environments. Together, these trends create the need for novel observability and security mechanisms.
The AIOS will support development, life-cycle management, and operations of AI workloads across the entire stack: models, tools, persistence, and applications. AIOS for enterprises will be implemented by extending and adapting an existing cloud platform rather than by building a new stack, as this allows the reuse of capabilities and best practices that provide reliability, security, and compliance.
We will deliver a hardened, enterprise-grade inference stack for OpenShift AI with a production-ready control plane, SLO-driven autoscaler, pluggable scheduler, fast model actuation, and multi-tiered KV-cache offloading for heterogeneous hardware. We will develop a simplified and consistent experience for connecting models to data, and we will scale AI consistently and flexibly across the hybrid cloud. Storage with AI capabilities (content-aware storage) will process data effectively for agentic AI applications. We will have an MCP platform with registry and life-cycle management; an MCP gateway for agent-tool discovery, connectivity, and access control; and MCP servers that let agents manage the hybrid cloud platform and infrastructure.
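To make the idea of an SLO-driven autoscaler concrete, here is a minimal sketch, assuming a single latency objective and a simple proportional rule. It is not the autoscaler described above: `SloPolicy`, `desired_replicas`, and all parameter values are hypothetical names chosen for illustration. The rule has the same shape as the proportional formula used by the Kubernetes Horizontal Pod Autoscaler (replicas scaled by the ratio of observed metric to target metric), and it treats latency as if it scaled linearly with replica count, which is a simplification.

```python
import math
from dataclasses import dataclass


@dataclass
class SloPolicy:
    """Hypothetical policy; every field is an illustration parameter."""
    target_p95_ms: float       # latency objective for the service
    min_replicas: int = 1
    max_replicas: int = 16
    tolerance: float = 0.1     # dead band around the target to avoid flapping


def desired_replicas(current: int, observed_p95_ms: float, policy: SloPolicy) -> int:
    """Scale the replica count by the ratio of observed to target latency."""
    ratio = observed_p95_ms / policy.target_p95_ms
    if abs(ratio - 1.0) <= policy.tolerance:
        return current  # close enough to the objective: keep the current size
    proposed = math.ceil(current * ratio)
    # Clamp to the configured bounds so a latency spike cannot scale without limit.
    return max(policy.min_replicas, min(policy.max_replicas, proposed))


if __name__ == "__main__":
    policy = SloPolicy(target_p95_ms=200.0)
    print(desired_replicas(current=4, observed_p95_ms=400.0, policy=policy))
```

The dead band keeps the replica count stable when latency hovers near the objective; a production autoscaler would also need smoothing over time and awareness of model load times, which this sketch leaves out.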
Focused AI agents will assist with system operation by generating recommendations and remediations that build on top of no/low-code techniques, supporting humans in the loop for change request approvals. AI agents will be able to define (code) development and operations guard-rails, observe their effectiveness in live systems autonomously, and fine-tune them as needed. The ability to continuously and intelligently assess applications and operations against benchmark expectations will yield richer visibility into operational risk and deliver higher trust and confidence in AI agents in operations.
We will deliver trusted agent identity support for agent-tool authorization flows based on existing and emerging standards. We will implement best-in-class security by integrating IBM enterprise security tools with AIOS to achieve identity management for agents; data protection; security posture management; and governance, risk, and compliance.
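The human-in-the-loop approval flow for agent-generated remediations can be sketched as a small state machine. This is an illustrative sketch only, assuming a change request must be explicitly approved before an agent may apply it; `ChangeRequest`, `Status`, and the `executor` callback are hypothetical names, not part of any product API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Tuple


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    APPLIED = "applied"


@dataclass
class ChangeRequest:
    """Hypothetical record of a remediation proposed by an agent."""
    summary: str
    action: str                                   # illustrative remediation identifier
    status: Status = Status.PENDING
    audit: List[Tuple[str, str]] = field(default_factory=list)

    def review(self, reviewer: str, approve: bool) -> None:
        """A human reviewer approves or rejects the pending request."""
        if self.status is not Status.PENDING:
            raise ValueError("change request already reviewed")
        self.status = Status.APPROVED if approve else Status.REJECTED
        self.audit.append((reviewer, self.status.value))

    def apply(self, executor: Callable[[str], str]) -> str:
        """The agent may run the remediation only after human approval."""
        if self.status is not Status.APPROVED:
            raise PermissionError("remediation requires human approval")
        result = executor(self.action)
        self.status = Status.APPLIED
        self.audit.append(("agent", self.status.value))
        return result


if __name__ == "__main__":
    request = ChangeRequest("Restart unhealthy pod", action="restart:pod-a")
    request.review("operator@example.com", approve=True)
    print(request.apply(lambda action: f"executed {action}"))
```

Keeping the audit trail on the request itself means every applied remediation records who approved it, which is the kind of evidence that trust in agent-driven operations depends on.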
Data-centric workloads will drive the need for batch inference using smaller-scale models to support various ingest and analytic flows in which millions of records or documents need to be processed. These scales also have significant cost implications, further driving the need for efficient bulk inference with bursty demand. Finally, the need for low-latency inference associated with data access will continue to drive the push of inference into data repositories.
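As a rough illustration of bulk inference over a large record stream, the sketch below groups records into fixed-size batches and passes each batch to a model call. It assumes a batch-capable inference function; `batched`, `bulk_infer`, and the stand-in model in the usage example are hypothetical and only show the batching pattern, not an actual inference API.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batched(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


def bulk_infer(
    records: Iterable[T],
    infer_batch: Callable[[Sequence[T]], Sequence[R]],
    batch_size: int = 32,
) -> Iterator[R]:
    """Stream results in input order while amortizing per-call overhead."""
    for batch in batched(records, batch_size):
        yield from infer_batch(batch)


if __name__ == "__main__":
    # Stand-in for a small model: returns the length of each document.
    def fake_model(batch: Sequence[str]) -> List[int]:
        return [len(text) for text in batch]

    documents = ["invoice", "contract text", "memo"]
    print(list(bulk_infer(documents, fake_model, batch_size=2)))
```

Because both functions are generators, records are consumed lazily, so the same pattern applies whether the input is a few documents or a stream of millions of records.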
Our integrated AI agent platform will provide agent life-cycle management and a layer-8 interconnect for security, safety, metering, quality enforcement, and observability, including seamless integration of humans into agent life-cycle and operations. The llm-d "inference brain" will provide core scheduling, KV-cache management, and autoscaling logic behind platform-agnostic APIs with key enterprise features for security and multi-tenancy.
It will integrate with AI hardware like AIU Spyre. KV-cache-optimized storage will improve inference performance and cost.
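The multi-tiered KV-cache offloading mentioned earlier can be pictured as a chain of least-recently-used caches, where entries evicted from a fast tier are demoted to a slower one instead of being discarded. The following is a minimal sketch under that assumption, not the llm-d design: `TieredCache` is a hypothetical class, tiers are plain in-memory maps standing in for accelerator memory, host memory, and storage, and capacities are counted in entries rather than bytes.

```python
from collections import OrderedDict
from typing import Any, Hashable, List, Optional, Sequence, Tuple


class TieredCache:
    """Hypothetical multi-tier LRU cache; tier 0 is the fastest."""

    def __init__(self, capacities: Sequence[int]) -> None:
        self.capacities: List[int] = list(capacities)
        self.tiers: List[OrderedDict] = [OrderedDict() for _ in self.capacities]

    def put(self, key: Hashable, value: Any) -> None:
        """Insert into the fastest tier, removing any stale copy elsewhere."""
        for store in self.tiers:
            store.pop(key, None)
        self._insert(key, value, tier=0)

    def _insert(self, key: Hashable, value: Any, tier: int) -> None:
        if tier >= len(self.tiers):
            return  # evicted from the slowest tier: the entry is dropped
        store = self.tiers[tier]
        store[key] = value
        if len(store) > self.capacities[tier]:
            # Demote the least recently used entry to the next tier.
            old_key, old_value = store.popitem(last=False)
            self._insert(old_key, old_value, tier + 1)

    def get(self, key: Hashable) -> Optional[Tuple[Any, int]]:
        """Return (value, tier it was found in) and promote it to tier 0."""
        for index, store in enumerate(self.tiers):
            if key in store:
                value = store[key]
                self.put(key, value)
                return value, index
        return None


if __name__ == "__main__":
    cache = TieredCache([2, 2])
    for name in ("a", "b", "c"):
        cache.put(name, name.upper())
    print(cache.get("a"))  # found in the slower tier, then promoted
```

Returning the tier index on a hit makes it easy to measure how often requests are served from slower tiers, which is the quantity a KV-cache-optimized storage layer would aim to make cheap.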