Compound AI systems are advanced configurations that combine multiple AI models, techniques or systems to solve complex problems more effectively than a single artificial intelligence (AI) model could. These systems integrate different components, each specialized in a particular task to work collaboratively or sequentially.
While much attention is paid to large language models (LLMs), these massive machine learning (ML) models face limitations. They’re expensive to develop and run, and they can be slow. They lack domain-specific expertise and aren’t always adept at handling complex tasks involving many steps across systems.
Because of these constraints, researchers have found that complementing monolithic models with other models and tools, each optimized for a specific role, can be a more effective approach.
A 2024 blog post from the Berkeley Artificial Intelligence Research (BAIR) Lab offered an early, clarifying vision of what compound AI systems would look like. The post proposed that better results might be obtained by building compound AI systems and that the future of AI would involve organizations bringing together LLMs, retrieval systems, AI agents and external tools, each optimized for specific tasks.
There are numerous benefits to the orchestration of multiple individual models and interacting components.
By dividing tasks among specialized models, compound systems reduce the cognitive load on individual AI components. For example, one model might focus on analyzing structured data while another interprets unstructured data such as images or text. This division of labor leads to improved performance and higher accuracy compared to single-model systems.
LLMs are impressive, and increasingly so when supplied with more computational resources, but they run up against performance ceilings and the diminishing returns described by scaling laws. Sometimes, rather than pouring more compute into an LLM, it might be preferable to delegate certain tasks to another model, agent or tool that isn’t so resource hungry.
Combining multiple models to perform a task can sometimes be faster than training a single LLM to do it. Moreover, compound systems can process different types of data simultaneously, enabling real-time decision-making. This is critical in AI applications such as fraud detection, where rapid responses are essential, or in edge applications, where latency must be minimized.
Compound systems are highly versatile and applicable across diverse use cases. This versatility makes compound AI systems a preferred choice for businesses aiming to optimize operations across multiple domains.
Organizations can benefit from system designs that use a combination of pretrained components, open source solutions and custom modules. Each component can be independently updated or replaced as technology evolves without overhauling the entire system. By distributing tasks across various models, compound systems benefit from adaptability and resilience to individual component failures.
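The resilience described above can be illustrated with a minimal sketch, assuming each component is a plain Python callable that maps a prompt to an answer. The names `primary_model`, `backup_model` and `run_with_fallback` are hypothetical stand-ins, not part of any particular framework.

```python
from typing import Callable, Sequence

# Hypothetical component type: any callable that maps a prompt to an answer.
Component = Callable[[str], str]


def primary_model(prompt: str) -> str:
    # Stand-in for a hosted model that happens to be unavailable.
    raise TimeoutError("primary model unavailable")


def backup_model(prompt: str) -> str:
    # Stand-in for a smaller model that can be swapped in independently.
    return f"[backup] answer to: {prompt}"


def run_with_fallback(prompt: str, components: Sequence[Component]) -> str:
    """Try each component in order and return the first successful answer."""
    errors = []
    for component in components:
        try:
            return component(prompt)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{component.__name__}: {exc}")
    raise RuntimeError("all components failed: " + "; ".join(errors))


answer = run_with_fallback("Where is my order?", [primary_model, backup_model])
print(answer)
```

Because the caller depends only on the callable interface, either component could be replaced without touching the rest of the system.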
Compound methods including retrieval augmented generation (RAG) extend the capabilities of LLMs by enabling them to access data sources outside of their initial training datasets. Combining different models enables developers to optimize for specific goals, such as speed or domain expertise.
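As a rough illustration of the RAG pattern, the sketch below retrieves the most relevant document and places it in the prompt that would be sent to an LLM. The document list, the word-overlap ranking and the function names are all assumptions made for the example; production systems typically use a search index or vector store instead.

```python
import re

# Hypothetical in-memory knowledge base, outside the model's training data.
DOCUMENTS = [
    "Refunds are processed within 5 business days.",
    "Standard shipping takes 3 to 7 days.",
    "Support is available by chat from 9am to 5pm.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, documents: list[str]) -> str:
    """Combine retrieved context with the question for a downstream LLM."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


prompt = build_prompt("How long do refunds take?", DOCUMENTS)
print(prompt)
```

The LLM call itself is omitted; the point is that the retriever and the generator are separate components that can be tuned or replaced independently.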
LLMs can be unwieldy and prone to hallucinations, and they make decisions that aren’t always readily explainable. A compound AI solution can help control inputs and filter outputs, resulting in more controlled behavior that promotes trust.
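One way to picture input control and output filtering is a thin wrapper around the model call. This is a minimal sketch under stated assumptions: `stub_llm` stands in for a real model, `BLOCKED_TERMS` is a hypothetical policy list, and the email pattern is deliberately simple rather than exhaustive.

```python
import re

# Hypothetical policy list; a real deployment would use a dedicated classifier.
BLOCKED_TERMS = {"password", "ssn"}


def stub_llm(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return f"Echo: {prompt}. Contact admin@example.com for help."


def check_input(prompt: str) -> bool:
    """Return True when the prompt contains no blocked term."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return not (words & BLOCKED_TERMS)


def filter_output(text: str) -> str:
    """Redact anything that looks like an email address."""
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[redacted email]", text)


def guarded_generate(prompt: str) -> str:
    """Run the input check, call the model, then filter its output."""
    if not check_input(prompt):
        return "Request declined by input filter."
    return filter_output(stub_llm(prompt))


print(guarded_generate("What is my password?"))
print(guarded_generate("Reset steps"))
```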
Compound AI systems are already being used in real-world use cases, such as:
Certain versions of chatbots including OpenAI’s ChatGPT and Microsoft’s Copilot are built on compound architectures. ChatGPT, for example, extends its utility through several tools and APIs for specific tasks.
It brings together an LLM, the DALL-E image generator and a code interpreter plug-in. It uses RAG to access external data sources and knowledge bases dynamically. Separate AI models are used to detect and filter harmful or inappropriate content before delivering a response.
Although this technology has yet to be brought to the mainstream, autonomous vehicle systems use computer vision models to detect and recognize objects in the car’s surroundings. Sensor fusion algorithms combine data from cameras, LiDAR, radar and ultrasonic sensors to create a comprehensive 3D map of the environment, enhancing situational awareness.
Reinforcement learning models handle decision-making, such as determining when to change lanes, adjust speed or stop at a traffic light, based on real-time conditions.
Also, natural language processing (NLP) enables the vehicle to interpret and respond to spoken commands from passengers. These components work together seamlessly to process vast amounts of data, make intelligent snap decisions and provide an intuitive experience.
A compound AI system in customer support combines several AI technologies to deliver efficient, personalized and responsive service. For instance, NLP models analyze customer inquiries to extract intent and key details, enabling the system to understand the issue accurately.
After the intent is identified, a chatbot powered by generative AI (gen AI) conversationally engages the customer, offering immediate assistance or clarifying additional details. At the same time, a recommendation system suggests relevant solutions, such as troubleshooting steps, FAQ articles or product recommendations tailored to the customer's needs.
To enhance the experience, a sentiment analysis model evaluates the customer’s tone and emotional state, helping prioritize urgent or dissatisfied cases for human intervention. This combination of components allows for fast, intelligent and empathetic customer support, reducing resolution time while maintaining high levels of customer satisfaction.
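The customer support flow described above can be sketched as a small pipeline. The keyword tables below are hypothetical stand-ins for trained NLP, recommendation and sentiment models, and the escalation rule is an assumption chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical lookup tables standing in for trained models.
INTENT_KEYWORDS = {"refund": "billing", "charge": "billing",
                   "login": "account", "password": "account"}
NEGATIVE_WORDS = {"angry", "terrible", "unacceptable", "worst"}
SUGGESTIONS = {"billing": "FAQ: How refunds work",
               "account": "FAQ: Resetting your password",
               "general": "FAQ: Getting started"}


@dataclass
class Ticket:
    intent: str
    suggestion: str
    negative_hits: int
    escalate: bool


def words_in(message: str) -> list[str]:
    """Lowercase the message and strip basic punctuation from each word."""
    return [word.strip(".,!?") for word in message.lower().split()]


def detect_intent(message: str) -> str:
    """Return the intent of the first known keyword, else 'general'."""
    for word in words_in(message):
        if word in INTENT_KEYWORDS:
            return INTENT_KEYWORDS[word]
    return "general"


def count_negative(message: str) -> int:
    """Count words from the negative-sentiment list."""
    return sum(word in NEGATIVE_WORDS for word in words_in(message))


def handle(message: str) -> Ticket:
    """Combine intent, recommendation and sentiment into one ticket."""
    intent = detect_intent(message)
    negative_hits = count_negative(message)
    return Ticket(intent, SUGGESTIONS[intent], negative_hits,
                  escalate=negative_hits > 0)


print(handle("This is unacceptable, I want a refund!"))
```

Each stage could later be replaced by a stronger model without changing the shape of the `Ticket` that downstream code consumes.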
A compound AI system in supply chains uses multiple AI components to optimize logistics, inventory management and overall efficiency. For instance, predictive analytics models forecast demand by analyzing historical sales data, seasonal trends and market variables, enabling precise inventory planning.
Computer vision systems monitor warehouse operations, identifying inefficiencies or errors in real time, such as misplaced items or damaged goods. Simultaneously, route optimization algorithms determine the most efficient delivery paths, considering factors such as traffic, weather and fuel consumption.
Also, NLP enables automated handling of supplier and customer communications, such as processing purchase orders or responding to inquiries. By integrating these components, the system improves supply chain responsiveness, reduces waste and helps ensure timely delivery, all while adapting dynamically to changes in demand and external conditions.
Designing compound AI systems involves integrating multiple AI models and components into cohesive frameworks capable of tackling complex tasks. These frameworks provide the infrastructure for combining diverse models and help to ensure seamless communication between them.
In a compound AI system, programmed control logic might call on a model, or an LLM might be “in charge,” depending on the goals of the system.
There are distinct advantages to both approaches, and the ways that models and other components can work together within an AI system are practically limitless. Designers therefore need to think critically about their approach and be willing to experiment with various architectures and combinations of components.
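The two control styles can be contrasted in a short sketch. Everything here is hypothetical: `calculator` and `search` are toy tools, and `stub_planner` stands in for an LLM that returns the name of the tool it wants to use.

```python
import re
from typing import Callable


def calculator(query: str) -> str:
    """Toy tool: add the non-negative integers found in the query."""
    return str(sum(int(n) for n in re.findall(r"\d+", query)))


def search(query: str) -> str:
    """Toy tool: pretend to look something up."""
    return f"search results for '{query}'"


TOOLS: dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "search": search,
}


def code_in_charge(query: str) -> str:
    """Programmed control logic: a fixed rule picks the tool."""
    tool = "calculator" if re.search(r"\d", query) else "search"
    return TOOLS[tool](query)


def stub_planner(query: str) -> str:
    # Stand-in for an LLM that replies with a tool name.
    return "calculator" if "add" in query.lower() else "search"


def model_in_charge(query: str,
                    planner: Callable[[str], str] = stub_planner) -> str:
    """Model-driven control: the planner picks the tool, code validates it."""
    choice = planner(query)
    if choice not in TOOLS:  # never trust a model's output blindly
        choice = "search"
    return TOOLS[choice](query)


print(code_in_charge("What is 12 + 30?"))
print(model_in_charge("Please add 12 and 30"))
```

The rule-based version is predictable and cheap, while the model-driven version is more flexible but needs validation of whatever the planner returns.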
Machine learning operations (MLOps) become trickier with compound workflows. For example, it’s difficult to apply consistent metrics across different types of tools and models. The BAIR researchers claim that a new phase of AI development arises alongside the shift to compound systems to help grapple with the challenges presented by monitoring, debugging and other operational concerns involved.
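One modest way to get comparable numbers across heterogeneous components is to wrap each one in the same instrumentation. The decorator below is a sketch, not a substitute for a full observability stack; `monitored`, `METRICS` and the two toy components are hypothetical names.

```python
import time
from collections import defaultdict
from functools import wraps

# Shared store: the same three measurements for every component.
METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "seconds": 0.0})


def monitored(name: str):
    """Record call count, error count and total latency under `name`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                METRICS[name]["errors"] += 1
                raise
            finally:
                METRICS[name]["calls"] += 1
                METRICS[name]["seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@monitored("retriever")
def retrieve(query: str) -> list[str]:
    return ["doc about " + query]


@monitored("generator")
def generate(query: str, docs: list[str]) -> str:
    if not docs:
        raise ValueError("no context")
    return f"answer from {len(docs)} document(s)"


generate("refunds", retrieve("refunds"))
try:
    generate("refunds", [])  # fails, and the failure is counted
except ValueError:
    pass

for name, stats in METRICS.items():
    print(name, stats["calls"], stats["errors"])
```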