Anthropic's new Claude 3.7 Sonnet can now turn its deep thinking mode on and off like a light switch, answering simple questions instantly while reserving the computational heavy lifting for complex problems that need it.
This hybrid reasoning approach marks a shift in artificial intelligence that experts say can both cut costs and boost capabilities, with IBM's Granite models also adopting similar toggling features based on task complexity. This evolution comes as organizations worldwide struggle with the financial realities of advanced AI, potentially making sophisticated reasoning more accessible while conserving valuable computing resources.
"The cost structure of thinking models matters; not all questions require a 32-second pause for the model to think through it," Maya Murad, Product Manager for AI at IBM Research, says during a recent episode of the Mixture of Experts podcast. "This capability allows enterprises to use resources intelligently, applying extensive computation only when the problem requires it, creating AI systems that better match how humans approach different cognitive tasks."
Hybrid reasoning signals a shift in the AI industry's focus from simply building more powerful systems to creating ones that are practical to use, Abraham Daniels, a Senior Program Manager with IBM Research, tells IBM Think. For businesses, this change could be crucial, as the cost of operating sophisticated AI has become a major consideration.
Models consume significantly more computational resources—and therefore cost more money—during deep reasoning than when providing simple responses. Hybrid reasoning lets companies optimize AI spending by matching computation levels to task complexity.
Anthropic recently launched Claude 3.7 Sonnet with "extended thinking mode," allowing users to request more thorough analysis when needed. IBM similarly equipped its Granite models with "toggling" capabilities, giving users control over when to activate intensive reasoning.
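In practice, the switch is an API parameter. A minimal sketch using Anthropic's Python SDK shows the same model handling a quick question without extended thinking and a harder one with it; the prompts and token budgets here are illustrative choices, not recommendations from either company.

```python
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

# Fast path: no extended thinking, suitable for a simple lookup-style question.
quick = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=512,
    messages=[{"role": "user", "content": "What year was the transistor invented?"}],
)

# Deep path: extended thinking enabled, with a budget of tokens the model may
# spend on visible reasoning before it writes the final answer.
deep = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=8192,
    thinking={"type": "enabled", "budget_tokens": 4096},
    messages=[{"role": "user", "content": "Outline a migration plan for moving a payments monolith to microservices."}],
)
```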
"We built hybrid reasoning with a different philosophy to other reasoning models on the market," an Anthropic spokesperson told IBM Think. "Our approach is based on how the human brain works. As humans, we don’t have two separate brains for fast versus deep thinking—and at Anthropic, we regard reasoning as something that needs to be deeply integrated into the capabilities of all of our models versus a separate feature. This approach is based on how we see Claude integrating with our customers across all applications. While some interactions require quick responses, like brainstorming marketing collateral, others, like complex financial analysis or industry research, require deeper, longer thinking. We wanted to make both of these functionalities as simple and cost-effective as possible for our customers to access and use."
The AI's thought process becomes more transparent with this approach. "The model itself is still a black box, but at least on the output, you can kind of see how the model came to that conclusion," Daniels says. This visibility can improve results and address explainability concerns, which is particularly important for regulated industries, he says.
Daniels and other experts see this development as addressing a practical need: eliminating unnecessary computational overhead for straightforward questions.
"You don't need a ton of reasoning for all tasks, and it gives you the ability, basically, when you have more complicated things, to pay more—both in terms of latency and cost," says Kate Soule, Director of Technical Product Management at IBM Research, on the podcast.
The inner workings of large language models (LLMs) have traditionally been opaque. A model would receive a prompt and generate a response, without revealing its internal reasoning steps.
Hybrid reasoning changes this dynamic by exposing a model’s step-by-step thinking process. When activated, systems like Granite 3.2 show their work, making the logical paths they follow visible.
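With Anthropic's API, that visible work arrives as dedicated "thinking" content blocks alongside the final answer; Granite 3.2 surfaces its reasoning in its own output format. A hedged sketch for the Anthropic case, assuming a response object from the Messages API:

```python
def show_reasoning(message) -> None:
    """Print the model's visible reasoning separately from its final answer."""
    for block in message.content:
        if block.type == "thinking":
            print("--- model reasoning ---")
            print(block.thinking)
        elif block.type == "text":
            print("--- final answer ---")
            print(block.text)
```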
"Our decision to make Claude’s reasoning process visible reflects consideration of multiple factors. One of those factors includes enhanced user experience and trust transparency in Claude’s reasoning process," the Anthropic spokesperson said. "This provides users with insight into how conclusions are reached, fostering appropriate levels of trust and understanding. Users generally trust outputs more when they can observe the chain of thought. We hope this visibility allows users to better evaluate the quality and thoroughness of Claude’s reasoning, and helps users better understand Claude’s capabilities. Furthermore, we hope users and developers can create better prompts by reading Claude’s thinking outputs and providing targeted feedback on specific reasoning steps."
"Being able to expose the actual thinking of the model is great for explainability," says Daniels. "Prior to being able to demonstrate the chain-of-thought (CoT) reasoning, it was really just the next token probability. So a little bit of a black box."
These technologies have business applications that extend across many industries. "Finance and legal are natural fits because they deal with structured documentation," says Daniels, adding that "any regulated industry stands to gain tremendous value" from these advanced thinking models.
But hybrid reasoning can be especially useful in domains requiring complex analysis.
"Math and code are really the two focus points that I've seen in terms of benchmarks for reasoning," says Daniels. For software development, the benefits could be substantial: "Using a thinking model would be able to frame out what the scope of the project should look like given the requirements that you've laid out," he says.
Standard LLMs generate responses by predicting the most likely next word based on patterns in their training data. This approach works well for many tasks, but these models can struggle with multi-step reasoning problems.
Hybrid reasoning models can switch into a computationally intensive mode, explicitly generating intermediate reasoning steps before providing a final answer. The model uses these steps to work through complex problems, much as a person writes out intermediate steps when solving a difficult math problem.
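The same idea can be approximated by prompting alone: chain-of-thought prompting asks an ordinary model to write its intermediate steps before the answer, which is roughly what hybrid models do natively when their thinking mode is on. The wording below is an illustrative assumption, not either vendor's recommended prompt.

```python
question = (
    "A warehouse ships 3 pallets of 48 boxes each, then 17 boxes are returned. "
    "How many boxes remain shipped?"
)

# Direct prompting: the model predicts an answer with no visible intermediate steps.
direct_prompt = question

# Chain-of-thought prompting: the model is asked to lay out the steps
# (3 x 48 = 144, then 144 - 17 = 127) before committing to a final answer.
cot_prompt = question + "\n\nWork through the problem step by step, then state the final count."
```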
The architecture enabling hybrid reasoning builds upon what researchers call "test-time compute," which involves dedicating computational resources during inference rather than only during training.
"A lot of times, traditionally, all your computing power would be used to train the model, and then inferencing the model would be relatively light in terms of computational requirements," Daniels says.
But as AI systems grow more complex, the challenge won’t just be processing power—it’ll be knowing when to use it efficiently. That’s why the next frontier for hybrid reasoning, Daniels says, will be smarter self-regulation: teaching AI when to activate its deeper thinking mode on its own, without humans telling it to do so.
"The next step in terms of reasoning models, or hybrid reasoning models, is how can we better understand or better triage inputs within the test-time compute, or within the thinking framework," he says.