Can AI learn to second-guess itself?


By Sascha Brodsky, Staff Writer, IBM


AI systems can ace tests and mimic experts, but they still don’t know when to raise a hand and admit they might be wrong.

Large language models (LLMs) can now translate text, summarize information, generate code and hold conversations with surprising fluency. They are fast and often convincing. But they don’t know when they’re guessing, and they can sound confident even when they’re confused or out of their depth.

What’s missing is a way to step back and reflect: a system that knows when to pause and reassess. In humans, this reflective layer is known as metacognition, and IBM researchers believe that a similar functionality could be the key to making AI more reliable and adaptable.

IBM Fellow and Global AI Ethics Lead Francesca Rossi and colleagues are championing a new architecture that separates fast heuristics from deliberate reasoning and layers a system of reflective oversight on top. The architecture, SOFAI (Slow and Fast AI), draws inspiration from psychology and cognitive science and translates those insights into AI engineering principles.

“We wanted to combine the strengths of both neural and symbolic approaches without forcing them into a single hybrid,” Rossi said in an interview with IBM Think. “That meant creating a structure that could govern which reasoning mode to use, and when.”

When to trust your gut, and when to escalate

The SOFAI system, as detailed in a recent paper published by the Association for Computing Machinery, thinks in layers. At the first level are fast solvers, including LLMs and more traditional machine learning models, which respond quickly and handle familiar problems with ease. These are the instinctual parts of the system, reacting without hesitation.

Then come the slow solvers, which take a more careful path. These engines often rely on symbolic reasoning and structured rules to work through problems step by step. They take longer, but their answers are more accurate. Watching over both is a third layer: metacognition. This part of SOFAI evaluates what the system is doing, determines whether the quick answer is satisfactory, given the context, and intervenes if a different approach is required. The metacognition layer acts as a kind of internal guide, choosing when to move fast and when to slow down.

“The fast solver would always go first,” Rossi said. “If it produced a plausible answer, and the stakes were low, the process would end there. But if the metacognition module found the answer lacking, it would send the problem to a more rigorous reasoning engine.”
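The fast-first control flow Rossi describes can be sketched in a few lines. This is an illustrative toy, not code from the SOFAI paper: the solver stubs, the confidence scores and the stakes-based threshold are all assumptions made here to show the shape of the dispatch logic.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    value: str
    confidence: float  # solver's self-reported confidence, 0..1 (illustrative)

def fast_solver(problem: str) -> Answer:
    # Stand-in for an LLM or other learned model: quick, approximate.
    return Answer(value=f"fast-answer({problem})", confidence=0.6)

def slow_solver(problem: str) -> Answer:
    # Stand-in for a symbolic, step-by-step reasoner: slower, more reliable.
    return Answer(value=f"slow-answer({problem})", confidence=0.95)

def metacognition(answer: Answer, stakes: float, threshold: float = 0.5) -> bool:
    # Decide whether the fast answer is good enough given the context:
    # higher stakes demand higher confidence before the answer is accepted.
    return answer.confidence >= threshold + 0.2 * stakes

def solve(problem: str, stakes: float) -> Answer:
    answer = fast_solver(problem)      # the fast solver always goes first
    if metacognition(answer, stakes):
        return answer                  # plausible answer, low stakes: stop here
    return slow_solver(problem)        # otherwise escalate to slow reasoning

print(solve("routine task", stakes=0.0).value)    # fast answer accepted
print(solve("high-risk task", stakes=1.0).value)  # escalated to the slow solver
```

The key design point is that escalation is a decision made by the metacognition layer, not by either solver: the solvers stay simple, and the policy for when to slow down lives in one place.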

This design mirrors how people think. For routine tasks like reading, driving or small talk, we respond quickly and automatically. But when we encounter uncertainty, risk or complexity, we slow down. We consider alternatives, check our reasoning or consult others. SOFAI is built to replicate that reflective process.

Rossi’s team found that even limited feedback loops between the fast solver and the metacognition module could significantly improve results without needing to escalate to slower systems. In SOFAI, the architecture also supports iterative correction: the fast solver can revise its output in response to guided feedback before the system resorts to the costlier slow solver.

“Looping once or twice is often enough to reach high-quality answers,” Rossi said. “That saves time, reduces energy use and preserves the option to escalate only when truly necessary.”
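The bounded revision loop described above might look like the sketch below. Everything here is hypothetical scaffolding: the quality scores, the `critique` heuristic and the loop budget are placeholders standing in for a real model and a real metacognitive evaluator.

```python
def fast_solve(problem, feedback=None):
    # Stand-in for a fast model; feedback nudges the next attempt.
    # (Simulated: a revised draft scores higher than a first draft.)
    quality = 0.5 if feedback is None else 0.8
    return {"answer": f"draft({problem})", "quality": quality}

def critique(result):
    # Metacognition scores the draft; if it is weak, return guidance.
    if result["quality"] >= 0.7:
        return None
    return "be more specific"

def slow_solve(problem):
    # Costly fallback: a deliberate, step-by-step reasoner.
    return {"answer": f"verified({problem})", "quality": 0.95}

def solve(problem, max_loops=2):
    feedback = None
    for _ in range(max_loops):          # "looping once or twice is often enough"
        result = fast_solve(problem, feedback)
        feedback = critique(result)
        if feedback is None:
            return result               # good enough: no escalation needed
    return slow_solve(problem)          # escalate only after the loop budget
```

With a budget of two loops, the revised fast draft is accepted; with a budget of one, the system escalates, which mirrors the cost-saving behavior Rossi describes.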

Steering language models with structured oversight

SOFAI’s implications are particularly powerful in the context of LLMs. These models power chatbots, content generators, customer support platforms and coding assistants. But their outputs can be inconsistent, unpredictable and misaligned with human values. Organizations using LLMs often struggle to control what the models say, particularly when context, legality or ethics are at stake.

“One of the challenges is that language models are trained on vast amounts of internet data, which includes noise, bias and incomplete perspectives,” Rossi explained. “You cannot assume that the model knows which values to apply in a given setting.”

SOFAI offers a practical approach to keeping AI systems on track without requiring retraining of the underlying language model. Instead, it introduces a layer of oversight that acts like a built-in review team, keeping the AI focused and consistent. This includes evaluators that review the model’s answers, comparing them with internal policies, ethical standards or specific rules for the task. If something seems off, the system can autonomously ask for a better version.

“You can think of it as teaching the system how to review itself,” Rossi said. “Not everything needs to be solved by retraining the model. Sometimes it’s enough to have an intelligent judge between input and output.”
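One way to picture this “intelligent judge between input and output” is a small review loop: named evaluators check a draft against policy rules, and the system requests a revision when a rule fails. The rules below and the `revise` callback are invented examples, not SOFAI’s actual evaluators.

```python
import re

# Hypothetical policy rules: each maps a name to a pass/fail check.
POLICY_RULES = {
    "no_personal_data": lambda text: not re.search(r"\b\d{3}-\d{2}-\d{4}\b", text),
    "required_disclaimer": lambda text: "not financial advice" in text.lower(),
}

def evaluate(draft: str) -> list[str]:
    """Return the names of every policy rule the draft violates."""
    return [name for name, ok in POLICY_RULES.items() if not ok(draft)]

def review(draft: str, revise, max_attempts: int = 3) -> str:
    # The oversight layer: accept the draft only once all evaluators pass,
    # otherwise autonomously ask for a better version.
    for _ in range(max_attempts):
        violations = evaluate(draft)
        if not violations:
            return draft
        draft = revise(draft, violations)
    raise RuntimeError("draft still violates policy after revision budget")

# Example: a trivial reviser that appends the missing disclaimer.
fixed = review(
    "Buy index funds.",
    revise=lambda draft, violations: draft + " (Not financial advice.)",
)
```

Because the rules live outside the model, swapping in a different rule set changes the system’s behavior without touching the solver, which is the modularity the next paragraph describes.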

This modularity enables SOFAI to adapt to diverse cultural, organizational and regulatory needs. Different evaluators can be defined for different jurisdictions or business units. If guidelines change, only the evaluators need to be updated. The core solvers remain intact.

“The beauty of keeping metacognition separate is that it puts alignment within reach,” Rossi said. “It gives users the tools to co-create AI behavior without needing to rebuild the system from scratch.”

A framework built for enterprise scale

Rossi is clear that SOFAI wasn’t built to live on paper. It was designed for messy, real-world use. As companies connect language models to tools like search engines, spreadsheets and scheduling apps, the result is often more powerful yet harder to control, she said. SOFAI helps manage this complexity by selecting the right systems for each task, deciding when to escalate, and keeping outputs aligned with policy as the AI environment becomes increasingly interconnected.

“In the future, enterprise AI will not look like a single monolithic model,” Rossi said. “It will be a composite—different agents, tools, APIs and reasoning systems. SOFAI is a framework for governing that ecosystem.”

In the scenarios Rossi envisions, SOFAI operates as a project manager for thinking. It monitors requests, evaluates the task’s complexity, selects the appropriate solver or combination of solvers, and then oversees the result. It also tracks patterns over time, learning which combinations produce reliable outcomes and adapting accordingly.
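That pattern-tracking role can be sketched as a simple router that records which solver succeeded on each task type and prefers the historically reliable one. The task types, solver names and success records here are all illustrative assumptions.

```python
from collections import defaultdict

class SolverRouter:
    """Toy 'project manager': route each task type to the solver with the
    best observed success rate, learning from recorded outcomes."""

    def __init__(self, solvers):
        self.solvers = solvers
        # task_type -> solver -> [successes, attempts]
        self.stats = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def choose(self, task_type):
        best, best_rate = self.solvers[0], -1.0
        for name in self.solvers:
            ok, total = self.stats[task_type][name]
            rate = ok / total if total else 0.0
            if rate > best_rate:
                best, best_rate = name, rate
        return best

    def record(self, task_type, solver, success):
        entry = self.stats[task_type][solver]
        entry[0] += int(success)
        entry[1] += 1

router = SolverRouter(["fast", "slow"])
router.record("scheduling", "fast", success=False)
router.record("scheduling", "slow", success=True)
print(router.choose("scheduling"))  # prints "slow": the reliable choice so far
```

Because every routing decision and outcome is recorded, the same structure supports the auditability Rossi mentions next: you can trace which solver handled which request and why.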

“You get both performance and explainability,” Rossi said. “You can audit decisions, trace which solvers were used and adjust metacognitive parameters if needed.”

This structure also offers environmental and operational advantages. Many large-scale reasoning systems are expensive to run, especially when they rely on massive inference graphs or real-time symbolic logic. By minimizing calls to these heavyweight solvers, SOFAI can reduce energy consumption while maintaining high accuracy.

“Efficiency is not only about speed,” Rossi said. “It’s about knowing when not to overthink. And that’s what metacognition allows.”

A model for responsible, adaptive AI

The rise of generative AI has forced organizations and policymakers to confront difficult questions. How can we ensure that systems behave appropriately? What does accountability look like when decisions are made autonomously? How can different values be reflected in models trained on global data?

SOFAI provides one possible answer. By incorporating a decision-making layer that mimics human-like metacognition, the architecture facilitates oversight that is both flexible and grounded in reality. It can scale across applications, adapt to local norms and provide users with real-time control over machine behavior.

“We are not trying to make AI more human,” Rossi said. “We are trying to make it more responsible. And that starts with knowing when to pause, reconsider and change course.”

According to Rossi, the architecture also anticipates the future of agent-based systems. In tomorrow’s AI ecosystems, she said, models will collaborate, hand off subtasks, query databases and iterate through steps to achieve goals. Without coordination, that complexity could become unmanageable. But SOFAI offers a lightweight, adaptable and principled governance structure for steering agents.

“Think of it as a conductor, not a dictator,” Rossi said. “You need to keep the orchestra in harmony, but you also need improvisation.”

Rossi believes that building metacognition into AI is not just a technical decision, but a responsible one. It reflects a commitment to transparency, adaptability and shared agency. The goal is not to impose control from above, but to give humans a meaningful way to collaborate with intelligent systems.

“We are building systems that learn, adapt and act in the world,” Rossi said. “They must also learn when to stop, reflect and ask: ‘Am I doing the right thing?’”
