Deep research on a dime


By Aili McConnon, Staff Writer, IBM

This article was featured in the Think newsletter.

Deep research may no longer require deep pockets. Chinese tech giant Alibaba’s new Tongyi DeepResearch model, which surged to the top of Hugging Face’s most downloaded model list last week, is an autonomous deep research agent that operates more quickly and cost-effectively than other agents capable of complex research, according to its developers. And it’s open source to boot.

“I think it’s a really cool step forward,” said Gabe Goodhart, Chief Architect of AI Open Innovation at IBM, on a recent episode of the Mixture of Experts podcast. He welcomed the news that Alibaba has developed an open LLM “that you can run on a personal workstation that measures up to a frontier research system.”

What are deep research agents for?

Deep research agents autonomously gather, synthesize and analyze information from multiple sources. Typically, they run on large proprietary frontier LLMs with hundreds of billions of parameters, and they take longer to reason (sometimes up to 30 minutes per query) because they consult many more sources than agents that respond immediately.
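The gather-synthesize-analyze loop described above can be sketched in a few lines. This is a hypothetical illustration, not Tongyi's API: `deep_research`, the planning step and the `search` tool are all stand-ins for what would be LLM calls and real retrieval tools.

```python
# Minimal sketch of a deep research agent loop (hypothetical, not Tongyi's API).
# The agent repeatedly plans a query, consults a source and accumulates notes,
# then synthesizes what it found into a report.

def deep_research(question, search, max_steps=5):
    """Run a plan -> search -> synthesize loop over a stubbed search tool."""
    notes = []
    for step in range(max_steps):
        query = f"{question} (refinement {step})"  # stand-in for LLM query planning
        result = search(query)                     # stand-in for a web/search tool call
        if result is None:                         # no more sources to consult
            break
        notes.append(result)
    return " ".join(notes)                         # stand-in for LLM synthesis

# Usage with a toy source that yields two findings and then runs dry:
findings = iter(["Finding A.", "Finding B."])
report = deep_research("What is X?", lambda q: next(findings, None))
```

The multi-step structure is also why such agents take minutes rather than seconds per query: each refinement round adds another retrieval and reasoning pass.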

But while Tongyi DeepResearch has 30 billion parameters, it activates only 3 billion of them at a time. This enables faster inference and lower costs than typical deep research agents while maintaining equivalent or better performance on several benchmarks, such as Humanity’s Last Exam.
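The arithmetic behind that efficiency claim is simple to sketch. In a sparse mixture-of-experts model, parameters live in many experts, but a router runs only a few of them per token. The layout below (10 experts of 3B parameters each, with one active) is an illustrative stand-in, not Tongyi DeepResearch's actual architecture.

```python
# Toy sketch of why a sparse mixture-of-experts model is cheap at inference:
# all experts are stored, but only top-k experts fire for any one token.
# Sizes are illustrative stand-ins, not Tongyi DeepResearch's real layout.

TOTAL_EXPERTS = 10
PARAMS_PER_EXPERT = 3_000_000_000  # 10 experts x 3B = 30B stored parameters

def active_params(k):
    """Parameters actually exercised for one token when top-k experts fire."""
    return k * PARAMS_PER_EXPERT

total = TOTAL_EXPERTS * PARAMS_PER_EXPERT  # 30B parameters held in memory
active = active_params(k=1)                # only 3B touched per token
```

Compute cost per token scales with the active count, not the total, which is how a 30B-parameter model can run with roughly the inference cost of a 3B dense one.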

Beyond benchmarks, the DeepResearch agent is already adept at real-world work: it can “autonomously [execute] complex, multi-step research tasks that mirror a junior attorney’s workflow,” wrote Tongyi Lab’s DeepResearch team in a GitHub blog post. According to this post, the Tongyi FaRui legal research agent received a higher accuracy score than OpenAI’s Deep Research agent and Anthropic’s Claude research agent when citing legal statutes and cases.

Small but mighty

Mihai Criveti, a Distinguished Engineer at IBM, stressed that small but mighty models that can complete targeted deep research tasks are particularly enticing for enterprises. 

“There are a lot of organizations that are looking for cheaper, faster and smaller models, preferably behind their firewall—especially if you have some of these models work with financial data, or HR data, or internal data,” he said on Mixture of Experts.

Large AI companies like OpenAI and Anthropic offer smaller models that can run more cost-efficiently. But most CIOs would not be comfortable sending their internal data to a large public model, Criveti said. “So, if there is a model that keeps data on-site and can run privately on my laptop or maybe in the future on my phone, that’s awesome,” he said.


A DeepSeek moment for Alibaba?

Tongyi DeepResearch’s appeal also stems from its ability to solve very specific problems, such as creating a detailed trip itinerary or a comprehensive legal research report, said Sandi Besen, an AI Research Engineer at IBM, on Mixture of Experts. The model “solves a very narrow piece of the puzzle,” she said. “I could very much see using this deep research agent as part of a broader agent team or broader agent architecture.”

Besen likened the arrival of the Alibaba agent to DeepSeek-R1, the model that rocked Silicon Valley and Wall Street in early 2025 because it met or surpassed many frontier models from OpenAI and Anthropic on certain benchmarks but reportedly cost a fraction of the price to build and use.

Efficiency was not the only reason Besen compared Tongyi DeepResearch to DeepSeek-R1: the earlier model also catapulted a particular training technique into the public eye.

“Distillation became a big deal after the DeepSeek paper came out. I wonder whether this paper will trigger some sort of trend in terms of the triathlon of training where you do continual pre-training, then fine-tuning, and then on-policy RL [reinforcement learning],” said Besen.

It is exactly this combination of techniques that the developers drew attention to in their paper about Tongyi DeepResearch. “Overall, this pipeline marks a breakthrough: it connects pre-training to deployment without silos, yielding agents that evolve through trial–and–error,” the researchers wrote.
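The three-stage "triathlon" Besen describes can be sketched as a pipeline where each stage hands its model to the next. Everything here is a hypothetical stand-in: the function names, data descriptions and the model-as-list representation are illustrative, not the pipeline from the Tongyi paper.

```python
# Hedged sketch of the three-stage pipeline the article describes:
# continual pre-training -> supervised fine-tuning -> on-policy RL.
# The model is represented as a list of training stages for illustration only.

def continual_pretrain(model, corpus):
    """Extend base pre-training on fresh, task-relevant data."""
    return model + [f"pretrained on {corpus}"]

def fine_tune(model, demos):
    """Supervised fine-tuning on demonstrations of the target behavior."""
    return model + [f"fine-tuned on {demos}"]

def on_policy_rl(model, env):
    """On-policy RL: the model generates its own rollouts and learns from their reward."""
    return model + [f"RL-trained in {env}"]

model = []
model = continual_pretrain(model, "agentic web data")
model = fine_tune(model, "research trajectories")
model = on_policy_rl(model, "live search environment")
```

The "without silos" claim in the quote refers to exactly this hand-off: each stage starts from the previous stage's weights rather than from a separately trained checkpoint.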

While it’s still too early to say what Tongyi DeepResearch’s impact will be, it’s possible its influence will extend beyond this specific model, said Besen. “Sometimes it’s not the first model that comes out that’s actually ‘the best,’ but it’s the trend that it drives that points us in a different direction.”
