The inner workings of large language models (LLMs) have traditionally been opaque. A model would receive a prompt and generate a response, without revealing its internal reasoning steps.

Hybrid reasoning changes this dynamic by exposing a model’s step-by-step thinking process. When activated, systems like Granite 3.2 show their work, making the logical paths they follow visible.

"Being able to expose the actual thinking of the model is great for explainability," says Daniels. "Prior to being able to demonstrate the chain-of-thought (CoT) reasoning, it was really just the next token probability. So a little bit of a black box."

These technologies have business applications that extend across many industries. "Finance and legal are natural fits because they deal with structured documentation," says Daniels, adding that "any regulated industry stands to gain tremendous value" from these advanced thinking models.

But hybrid reasoning can be especially useful in domains requiring complex analysis.

"Math and code are really the two focus points that I've seen in terms of benchmarks for reasoning," says Daniels. For software development, the benefits could be substantial: "Using a thinking model would be able to frame out what the scope of the project should look like given the requirements that you've laid out," he says.

Standard LLMs generate responses by predicting the most likely next token (roughly a word or word fragment) based on patterns in their training data. This approach works well for many tasks, but these models can struggle with problems that require several dependent reasoning steps.
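To make next-token prediction concrete, here is a minimal sketch in Python. It is a toy word-pair counter, not how a production LLM works: real models use neural networks over subword tokens, and the corpus and function names below are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(corpus: str) -> dict[str, Counter]:
    """Count which word follows which in a tiny training corpus."""
    counts: dict[str, Counter] = defaultdict(Counter)
    words = corpus.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def generate(counts: dict[str, Counter], start: str, max_words: int = 5) -> list[str]:
    """Greedy decoding: always append the most frequently seen next word."""
    out = [start]
    for _ in range(max_words):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last word was never followed by anything in training
        out.append(followers.most_common(1)[0][0])
    return out

# Hypothetical training text, chosen only for illustration.
corpus = "the model reads the prompt and the model writes the answer"
counts = train_bigram(corpus)
print(generate(counts, "the", max_words=1))
```

Each step looks only at what came before and picks a likely continuation. Nothing in the loop plans ahead or checks intermediate results, which hints at why multi-step problems can be hard for this style of generation.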

Hybrid reasoning models can switch into a more computationally intensive mode, explicitly generating intermediate reasoning steps before providing a final answer. The model uses these steps to work through a problem, much as a person writes out intermediate steps when solving a hard math problem.
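The idea of an optional thinking mode can be sketched as a simple toggle. The code below is a hypothetical stand-in, not Granite 3.2's actual interface: `toy_model`, `Response`, and the `thinking` flag are names assumed for illustration, and the "reasoning" is ordinary arithmetic rather than generated text.

```python
from dataclasses import dataclass, field

@dataclass
class Response:
    answer: str
    reasoning: list[str] = field(default_factory=list)  # empty when thinking is off

def toy_model(prices: list[float], discount: float, thinking: bool = False) -> Response:
    """Hypothetical stand-in for a hybrid reasoning model.

    With thinking=False it returns only the final answer. With
    thinking=True it also returns the intermediate steps it took.
    """
    subtotal = sum(prices)
    saved = subtotal * discount
    total = subtotal - saved
    answer = f"Total: {total:.2f}"
    if not thinking:
        return Response(answer)
    steps = [
        f"Add the prices: subtotal = {subtotal:.2f}",
        f"Apply the {discount:.0%} discount: saved = {saved:.2f}",
        f"Subtract the savings: total = {total:.2f}",
    ]
    return Response(answer, steps)

fast = toy_model([20.0, 30.0], 0.1)
slow = toy_model([20.0, 30.0], 0.1, thinking=True)
print(fast.answer)
for step in slow.reasoning:
    print(step)
```

In this toy the answer is identical either way. In a real model the visible steps are extra generated output, so the thinking mode costs more compute at inference time, and the hope is that it yields better answers on hard problems.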

The architecture enabling hybrid reasoning builds upon what researchers call "test-time compute," which involves dedicating computational resources during inference rather than only during training.

"A lot of times, traditionally, all your computing power would be used to train the model, and then inferencing the model would be relatively light in terms of computational requirements," Daniels says.

But as AI systems grow more complex, the challenge won’t just be processing power. It will be knowing when that power is worth spending. That’s why the next frontier for hybrid reasoning, Daniels says, will be smarter self-regulation: teaching AI when to activate its deeper thinking mode on its own, without humans telling it to do so.
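One way to picture this kind of triage is a router that inspects each prompt and decides whether the thinking mode is worth its cost. The sketch below uses hand-written heuristics purely for illustration. The hint list, the word threshold, and the function name `should_think` are all assumptions, and a real system would more plausibly learn this decision than hard-code it.

```python
import re

# Hypothetical phrases that suggest a prompt needs multi-step reasoning.
REASONING_HINTS = ("prove", "calculate", "step by step", "debug", "derive", "how many")

def should_think(prompt: str, max_simple_words: int = 40) -> bool:
    """Return True when a prompt looks like it would benefit from thinking mode."""
    text = prompt.lower()
    has_hint = any(hint in text for hint in REASONING_HINTS)
    # Something shaped like "17 * 23" or "x = 4 + 5" counts as arithmetic.
    has_math = re.search(r"\d+\s*[-+*/^=]\s*\d+", text) is not None
    is_long = len(text.split()) > max_simple_words
    return has_hint or has_math or is_long

for prompt in ("What is the capital of France?", "Calculate 17 * 23 and explain."):
    mode = "thinking" if should_think(prompt) else "standard"
    print(f"{mode}: {prompt}")
```

A router like this trades accuracy for cost: every false positive spends extra inference compute on an easy question, and every false negative risks a weaker answer on a hard one.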

"The next step in terms of reasoning models, or hybrid reasoning models, is how can we better understand or better triage inputs within the test-time compute, or within the thinking framework," he says.