DeepSeek goes global, AI goes local

Authors: Aili McConnon, Staff Writer, IBM; Anabelle Nicoud, Staff Writer, IBM

This article was featured in the Think newsletter.

When DeepSeek-R1 was released on January 20, the powerful but cost-efficient AI reasoning model electrified both Silicon Valley and Wall Street. Why? It could reason as well as top models from the likes of OpenAI and Anthropic, but reportedly used much less compute and cost a fraction as much to train and use. Last month alone, the model was downloaded more than 800,000 times on Hugging Face.

“It was a wake-up call,” remembered Larry Li, a Founder and Managing Partner at Palo Alto-based investing firm AMINO Capital, in a recent interview with IBM Think. New technologies are often “reverse-engineered,” he said. “But nobody expected it could be done so well.”

“It was just changing the narrative that the US is the only place in the world that you can innovate,” said Matthieu Soulé, the Head of Cathay Innovation’s C.Lab, a fund that invests in AI innovation across the EU and Asia, including China.

Many predicted DeepSeek’s success would revolutionize the industry and global AI race more broadly. Six months later, we wanted to check in and see: did this really happen?

We spoke with some of the same experts we interviewed in the hours after DeepSeek-R1 was released, as well as several others, to get a holistic picture.

What changed following DeepSeek-R1

In the days following the DeepSeek-R1 release, many raised concerns about whether the company had accurately tallied and reported the full costs—not just training the near-final model—and what components they had used from which companies. In other words, had they truly done something revolutionary, or was it more incremental progress?

Some, like Kaoutar El Maghraoui, an IBM Principal Research Scientist, feel the true innovation may have been what she calls “architectural efficiency,” or combining techniques including “the mixture of experts, a reinforcement learning strategy, hardware-software codesign and various other optimization tricks. It's mostly a clever and effective implementation of already existing techniques,” she said in a recent IBM Think interview.
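To make the “mixture of experts” idea concrete, here is a minimal sketch of top-k expert routing in plain Python. This is an illustration, not DeepSeek's implementation: the gate scores are hard-coded here, whereas in a real MoE model they come from a learned projection of the token, and the experts are neural sub-networks rather than toy functions.

```python
import math

def softmax(xs):
    """Convert raw gate scores into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token, experts, gate_scores, top_k=2):
    """Route a token through only the top-k experts.

    experts: list of callables (stand-ins for expert sub-networks)
    gate_scores: one score per expert for this token (hypothetical,
    hard-coded values; normally produced by a learned gating network)
    """
    probs = softmax(gate_scores)
    # Keep only the top-k experts; the rest are never evaluated,
    # which is where the compute savings of MoE come from.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    # Combine the selected experts' outputs, weighted by the gate.
    return sum((probs[i] / norm) * experts[i](token) for i in top)

# Four toy "experts", each just scaling its input by a constant.
experts = [lambda x, k=k: k * x for k in (1.0, 2.0, 3.0, 4.0)]
out = moe_forward(10.0, experts, gate_scores=[0.1, 0.2, 3.0, 2.0])
```

The key property is that compute per token scales with `top_k`, not with the total number of experts, which is how an MoE model can have a very large parameter count while keeping inference costs modest.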

Still, experts across the board agree that DeepSeek-R1 shifted the global AI landscape in a few key ways. For one, many took for granted that American AI companies had a “moat” or lead that would be near impossible to make up. DeepSeek debunked that assumption as it lowered the barrier for developers and smaller companies to access the tools to develop their own LLMs.

“Developers and users now have access to the same type of capabilities as OpenAI’s o1 for a fraction of the cost,” said Abraham Daniels, a Senior Technical Product Manager at IBM, in an interview.

The fact that DeepSeek open-sourced its models played a big part in increasing the accessibility. “We have seen an uptick of interest in open source since DeepSeek and contributing to the AI Alliance,” said Anthony Annunziata, Director of AI Open Strategy at IBM and the AI Alliance. The AI Alliance is an international network of companies and organizations working to create open and safe AI, founded by IBM and Meta.

“Across Europe, in Vietnam, India and Japan, you have all these regional AI companies that want to make sure they retain sovereign control of their artificial intelligence—that they can shape it the way they want to fit their cultural, societal and economic needs, which are different from the US and other places,” said Annunziata.

Protecting home-grown AI research is top of mind. “There is a real digital sovereignty push where governments are trying to figure out how they can avoid foreign AI influence,” said El Maghraoui.

Creating LLMs based on local languages motivates many entrepreneurs. “AI is heading in that direction where, as a utility, each country or region wants to have their own language model to at least have a say in terms of influencing the behavior,” said Li.

Japan, for example, recently enacted the AI Promotion Act to support the development and adoption of the technology. In late June, the AI Alliance launched a new chapter in Japan to focus on two areas of high interest to local entrepreneurs: AI sovereignty and AI in manufacturing. Language plays a large role in controlling one’s AI systems, so in late 2024, for example, a group of more than 1,500 researchers from academia and industry joined together to develop strong and open Japanese language models.

Many homegrown AI models and entrepreneurs also prioritize local economic interests. In the case of Japan, many of the companies that joined the AI Alliance, including Mitsubishi Electric and Panasonic, are developing AI models targeted at manufacturing and industrial applications, a particularly large segment of Japan’s economy.


On the flip side, there is also a growing appetite for local models, said Daisuke Okanohara, CTO and Co-Founder of Preferred Networks, a Japanese hardware and software company that develops advanced software using deep learning and AI. In May, Preferred Networks released the second version of PLaMo, a compact model that can run on premises and is trained in Japanese and English.

“Its performance is not as competitive as frontier models overall, but it excels in certain specific tasks,” Okanohara said during an interview with IBM Think. “In small model use cases—such as models with eight to 30 billion parameters—it outperforms CLANG, GPT-4o mini, and similar models in several Japanese-language tasks.”

Vietnam has also experienced a flurry of entrepreneurial LLM activity, and the AI Alliance launched a chapter there in June this year. In addition to developing a Vietnamese language model, entrepreneurs are focused on using AI models to develop new kinds of chips to power AI, said Annunziata.

Another reason DeepSeek inspired so many local entrepreneurs was the fact that various countries banned or restricted the use of DeepSeek-R1, citing security and privacy concerns. Italy, Australia, South Korea and Canada banned DeepSeek, and it was restricted in several US states too, particularly on government sites. This had an interesting ripple effect of motivating local entrepreneurs to use open-source tools to create more secure models that could be used in their specific geographies.

Tech entrepreneur and VC Kai-Fu Lee’s latest company, 01.AI, wants to explore the B2B market for enterprise AI—a notoriously difficult sector in China, where half of the companies are state-owned and larger private firms can fall under government influence as they scale. Lee previously launched Rhymes AI, a company that released several products last fall, including a search engine and Allegro, an open-source video generation model.

“We look at it with a pragmatic approach: the models are really, really good enough. However, it's still not easy to use for a lot of businesses and enterprises, and that's the problem that we are trying to tackle,” said Anita Huang, Co-Founder of 01.AI, in an interview with IBM Think. “We feel that the missing piece, especially for the Chinese enterprise market, is that middleware layer that becomes the Windows of large language models.” Currently, their enterprise platform uses models like DeepSeek and Alibaba’s Qwen.

What DeepSeek-R1 didn’t change

In the immediate aftermath of DeepSeek, many predicted that it had paved the way for chain-of-thought reasoning to dominate. Since then, however, the industry has shifted: new research has shown that reasoning models are cost- and resource-intensive, and unnecessary for many of the tasks these models are actually used for.

Perhaps the biggest area of overhype was enterprise adoption of DeepSeek, given its low licensing costs (it is released under the permissive MIT license).

“In reality, enterprise adoption remains very limited, mostly because of the lack of data privacy guarantees, lack of compliance, governance and security,” said El Maghraoui.

Most companies, in the US at least, stuck with vendors that offered managed or auditable solutions.

So, while it’s good that “people see that innovations come from surprising places,” Annunziata said, the overall AI industry and broader market have not shifted as some predicted. Instead, “the open-source companies have doubled down on open source, and the large proprietary players are focused on acquiring talent, even more focused on acquiring competitors or blunting competitors, and they are pouring more and more dollars into their models.”

Ultimately, DeepSeek’s greatest legacy may be in making the case for small, fit-for-purpose models, said Daniels.

“DeepSeek opened up the AI race and made small language models the new battleground,” he said. “Highly capable, small language models could be trained more efficiently than your larger models and could better address enterprise use cases.”

AI agents—autonomous AI systems that can reason, plan and execute tasks—have exploded across enterprises in 2025 and are one such use case. Smaller models are often better suited for agentic AI systems because they are more efficient, require fewer resources and can be tailored for specific tasks.

As IBM’s Distinguished Engineer Chris Hay put it on a recent Mixture of Experts episode: “When you want to run agents, you want your models to be small and fast and lean.”
